E-Science, the GRID and Statistical Modelling in Social Research
Rob Crouchley, Collaboratory for Quantitative e-Social Science, University of Lancaster
Contents
The Problem/Motivation: Some Background on Statistical Methods and Social Research;
A comprehensive model will allow us to disentangle the observable, direct effects of truancy on educational attainment from any effects that arise from correlation in the errors (unobserved effects).
Independent Errors (ep, et, eq)
- Probit for PT work,
- Ordered Probits for Truancy and
Try other methods for evaluating integrals such as Gibbs sampling and MCMC,
Laplace expansions with many terms
Pseudo and Quasi Likelihood Methods
Estimate fixed effects versions of the models;
Use Instruments for the endogenous covariates
All can be computationally demanding, and each approach has its own problems;
We do not yet have the computational power (on the GRID) to relax all the assumptions simultaneously in this model.
The data are administrative records covering employment durations in the workforce of a major Australian state government, used to investigate the determinants of quits and separations amongst permanent and temporary workers. Non-parametric (NP) baseline hazard, quadrature for the random effects (REs).
Using the previous example on our HPC we could have (in minutes)
An empirical analysis of vacancy duration using micro data from Lancashire Careers Service over the period 1985–1992. NP baseline hazard, quadrature for the REs.
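Both examples integrate the random effects out of the likelihood by quadrature. As a rough illustration of the idea only (not SABRE's own code, and not the hazard models above), the sketch below uses Gauss-Hermite quadrature to integrate a normally distributed random intercept out of a simple binary probit sequence. The function names, the single covariate and the parameter values are all assumptions made for the illustration.

```python
from math import erf, sqrt, pi

import numpy as np


def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def marginal_likelihood(ys, xs, beta, sigma, n_points=20):
    """Likelihood of one individual's binary responses, with the
    random intercept u ~ N(0, sigma^2) integrated out numerically.

    Hypothetical model for illustration: P(y = 1 | x, u) = Phi(beta * x + u).
    """
    # Nodes and weights for integrals of the form  int exp(-t^2) g(t) dt.
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    total = 0.0
    for t, w in zip(nodes, weights):
        u = sqrt(2.0) * sigma * t  # change of variable from t to u
        lik = 1.0
        for y, x in zip(ys, xs):
            p = norm_cdf(beta * x + u)
            lik *= p if y == 1 else 1.0 - p
        total += w * lik
    return total / sqrt(pi)  # normalising constant from the change of variable


# Example: three repeated binary responses for one individual.
example = marginal_likelihood(ys=[1, 0, 1], xs=[0.2, -0.5, 1.0], beta=0.8, sigma=1.0)
```

The cost grows with the number of quadrature points, and with several correlated random effects the points multiply across dimensions, which is one reason such models become computationally demanding.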
Software for the
Lancaster’s Statistical Software for e-Social Scientists
SABRE + R
SABRE is a program specifically designed for the analysis of binary, ordinal and count recurrent events, as are common in many surveys. SABRE’s dedicated software ensures fast response times.
Adding SABRE as a plug-in to R allows SABRE commands to be processed from the R user interface. Configuration of models and preparation of data are then undertaken using the extensive functionality of R.
Using GROWL Components, SABRE commands invoked in R are executed in parallel on the GRID, making SABRE an excellent e-Social Science tool.
The familiar R interface is maintained by using SABRE as a plug-in.
SABRE was originally developed by Lancaster University’s Centre for Applied Statistics; further development and use cases have been funded by the EPSRC and the ESRC as part of the NCeSS CQeSS node.
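Parallel execution suits this kind of model because the log-likelihood is a sum of contributions from independent cases, so each worker can evaluate its own share and the partial sums are then added. The sketch below shows that pattern for a plain probit model. It is an illustration under stated assumptions, not SABRE's or GROWL's algorithm: the function names are hypothetical, and local threads merely stand in for remote Grid workers (in CPython they show the structure rather than a real speed-up).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def probit_loglik_chunk(beta, chunk):
    """Sum of probit log-likelihood contributions for one chunk of (x, y) cases."""
    total = 0.0
    for x, y in chunk:
        p = 0.5 * (1.0 + math.erf(beta * x / math.sqrt(2.0)))  # Phi(beta * x)
        total += math.log(p if y == 1 else 1.0 - p)
    return total


def parallel_loglik(beta, data, n_workers=4):
    """Split the cases across workers and add up the partial log-likelihoods."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: probit_loglik_chunk(beta, c), chunks))
    return sum(parts)


# Made-up cases: (covariate, binary response).
cases = [(0.1 * i - 1.0, i % 2) for i in range(40)]
value = parallel_loglik(0.5, cases)
```

Only the parameter values and the partial sums need to travel between the client and the workers, which keeps communication small relative to the computation.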
Invoking a computationally intensive and parallelised method on a Grid
Sabre can be added as a library to R so that R is menu driven, rather than command driven. This makes R easier to use.
Componentised Parallel Algorithm
OGSA client invoked as a method call
Remote O/S, e.g. parallel computer
Grid Resources on Work Stations
GROWL employs a client/server architecture that hides the complexity of GRID middleware from the user. Client access to GROWL employs a secure (PKI/SSL) connection to a single port on the host system and clients are authenticated using the distinguished name extracted from their certificate. The use of a persistent server to access grid resources allows all of the service logic to be hosted by the server, making the client application, library or plugin extremely lightweight.
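The authentication step described above can be sketched roughly as follows. This is not GROWL's implementation: it only illustrates building a distinguished name from a peer certificate, in the dictionary form that Python's `ssl.SSLSocket.getpeercert()` returns, and checking it against a list of authorised users. The helper names, the short-name table and the example DN are all hypothetical.

```python
# Short forms for a few common attribute names (illustrative subset only).
SHORT_NAMES = {
    "countryName": "C",
    "organizationName": "O",
    "organizationalUnitName": "OU",
    "commonName": "CN",
}


def distinguished_name(peer_cert):
    """Build a slash-separated DN string from a certificate dictionary.

    `peer_cert` is assumed to have the shape returned by
    ssl.SSLSocket.getpeercert(): its "subject" entry is a tuple of RDNs,
    each RDN being a tuple of (attribute, value) pairs.
    """
    parts = []
    for rdn in peer_cert.get("subject", ()):
        for key, value in rdn:
            parts.append(f"{SHORT_NAMES.get(key, key)}={value}")
    return "/" + "/".join(parts)


def is_authorised(peer_cert, authorised_dns):
    """Accept the client only if its DN is in the server's authorised set."""
    return distinguished_name(peer_cert) in authorised_dns


# Hypothetical certificate subject and authorised list.
cert = {
    "subject": (
        (("countryName", "UK"),),
        (("organizationName", "eScience"),),
        (("commonName", "jane doe"),),
    )
}
allowed = {"/C=UK/O=eScience/CN=jane doe"}
ok = is_authorised(cert, allowed)
```

Keeping this check on the persistent server is consistent with the lightweight-client design: the client only presents its certificate over the SSL connection.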
Further information: http://www.sabre.lancs.ac.uk
Middleware for e-Social Science
Development of a parallel, multilevel, multi-process (OGSA) implementation of SABRE as an R object to enable social scientists to disentangle the full stochastic complexity of socio-economic processes.
SABRE and GROWL
GROWL provides a lightweight client-side library as a plug-in to R, giving easy, user-friendly access to Grid resources and computational power, providing
You can watch a more detailed presentation about GROWL by Dan Grose at the NCeSS conference online at http://redress.lancs.ac.uk/Workshops/Presentations.html