Scottish Social Survey Network: Master Class 1 Data Analysis with Stata. Dr Vernon Gayle and Dr Paul Lambert, 23rd January 2008, University of Stirling. The SSSN is funded under Phase II of the ESRC Research Development Initiative. Multilevel data and analysis with Stata (in 15 minutes).


Scottish Social Survey Network: Master Class 1 Data Analysis with Stata

Dr Vernon Gayle and Dr Paul Lambert

23rd January 2008, University of Stirling

The SSSN is funded under Phase II of the ESRC Research Development Initiative

Generalised linear model
  • Y = BX + e
  • Y = outcome variable(s)
  • X = explanatory variables
  • e = error term for each individual response

Generalised linear mixed models

    • Adding complexity to the GLM, such as by disaggregating the error structures
The work of statistical modelling
  • Yi = BXi + ei
  • Most of the time:
    • we have a single Y
    • we ignore e
    • we concentrate on what goes into B
Example
  • Data: British Household Panel Survey 2005 adult interviews (about 7,000 adults in work)
  • Y = GHQ scale score for adults in employment (General Health Questionnaire, higher = worse subjective well-being)
  • X = various possible measures, including gender, age, marital status, occupational advantage, education, partner’s GHQ
  • You can run this example; the files are at:
Some regression assumptions
  • All variables are measured without error
  • All relevant predictors of the dependent variable are included in the analysis
  • Expected value of the error is zero
  • Homoscedasticity of the error (constant error variance)
  • No autocorrelation (no relation between error terms for different cases)
  • [above using: Menard, S. 1995. Applied Logistic Regression Analysis, London: Sage.]
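The assumptions about the error term can be written compactly. This is a sketch in conventional notation, which is an assumption here and not taken from the slides: e_i is the error for case i, X_i its explanatory variables, and sigma squared a single constant error variance.

```latex
\begin{align*}
  \mathrm{E}[e_i \mid X_i] &= 0
    && \text{(expected value of the error is zero)} \\
  \mathrm{Var}(e_i \mid X_i) &= \sigma^2
    && \text{(the same error variance for every case)} \\
  \mathrm{Cov}(e_i, e_k) &= 0 \quad \text{for } i \neq k
    && \text{(no autocorrelation between cases)}
\end{align*}
```

Clustered data typically violate the third line: errors for two cases in the same cluster tend to be correlated.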
Multilevel modelling
  • What if there was some connection between some of the cases within the dataset?
    • This occurs by design in certain projects
      • e.g. educational research, sample includes multiple children from the same school
    • Some connections (‘hierarchical clusters’) are standard in most social surveys
How to account for hierarchy / clustering in individual data?
  • We could try a unique dummy variable for every cluster
    • Country: Y = BX + scot + wal + nir + e
    • ‘areg’ in Stata can absorb several hundred dummy variables like this
    • often called a ‘hierarchical fixed effect’
    • but many hierarchies have too many clusters for this to be satisfactory
  • We could use higher level explanatory variables
    • e.g. average unemployment rate in local authority district
    • these are also ‘hierarchical fixed effects’
  • We could try telling the model that we expect the error terms to be related
    • these are ‘hierarchical random effects’ = multilevel models
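The three options above could be tried in Stata roughly as follows. This is a minimal sketch on simulated data, not the BHPS files: the variable names `ghq`, `fem`, `age` and `ohid` are borrowed from the Stata examples in this presentation, `ohid` is assumed to be a household identifier, and `hh_age` and all numeric values are made up for illustration.

```stata
* Simulated stand-in data (hypothetical): 200 households, 3 adults each
clear
set seed 2008
set obs 200
generate ohid = _n
* u is a household-level effect shared by everyone in the household
generate u = rnormal(0, 2)
expand 3
generate fem = runiform() < 0.5
generate age = 20 + floor(40*runiform())
generate ghq = 10 + 0.8*fem + 0.02*age + u + rnormal(0, 4)

* (1) One dummy per cluster, absorbed instead of listed one by one
areg ghq fem age, absorb(ohid)

* (2) A higher-level explanatory variable: mean age within the household
bysort ohid: egen hh_age = mean(age)
regress ghq fem age hh_age

* (3) Random intercept for each household (a multilevel model)
xtmixed ghq fem age || ohid:
```

From Stata 13 onwards `xtmixed` is named `mixed`; the model specification is written the same way.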
Creating a multilevel model
  • Linear model:

Yi = BXi + ei

  • Multilevel model (‘random intercepts’)

Yij = BXij + uj + eij

  • Multilevel model (‘random coefficients’)

Yij = BXij + UBj + uj + eij
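The same three models can be written with explicit subscripts, where i indexes individuals and j indexes clusters. This is a sketch under stated assumptions: the normal distributions are the conventional ones and are not given on the slide, and the term written UBj above is read here as a cluster-specific deviation b_j in the coefficient on X.

```latex
\begin{align*}
  \text{Linear model:} \quad
    & y_i = \beta x_i + e_i \\
  \text{Random intercepts:} \quad
    & y_{ij} = \beta x_{ij} + u_j + e_{ij},
    \qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2) \\
  \text{Random coefficients:} \quad
    & y_{ij} = (\beta + b_j)\, x_{ij} + u_j + e_{ij} \\
  \text{Intra-cluster correlation (random intercepts):} \quad
    & \rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\end{align*}
```

The intra-cluster correlation is the share of the unexplained variance that lies between clusters; when it is zero, the random intercepts model reduces to the linear model.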

How to implement multilevel models?
  • In SPSS and Stata, the simplest random intercepts model can be specified through extensions to the standard regression commands
Stata examples
  • regress ghq fem age age2 cohab
  • regress ghq fem age age2 cohab, robust cluster(ohid)
  • xtmixed ghq fem age age2 cohab || ohid:
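To see how the three commands differ, they can be run side by side and their results tabulated. The sketch below uses simulated data as a stand-in for the BHPS extract: the data-generating values are invented, `ohid` is assumed to be a household identifier, and `vce(cluster ohid)` is the current spelling of the `robust cluster(ohid)` option shown above.

```stata
* Simulated stand-in for the BHPS extract (hypothetical values throughout)
clear
set seed 12345
set obs 300
generate ohid = _n
* u is a household-level effect shared by both adults in the household
generate u = rnormal(0, 2)
expand 2
generate fem   = runiform() < 0.5
generate cohab = runiform() < 0.7
generate age   = 18 + floor(47*runiform())
generate age2  = age^2
generate ghq   = 11 + fem + 0.05*age - 0.0005*age2 - 0.5*cohab + u + rnormal(0, 4)

* Ordinary regression, ignoring the household clustering
regress ghq fem age age2 cohab
estimates store ols

* Same coefficients, standard errors adjusted for clustering on ohid
regress ghq fem age age2 cohab, vce(cluster ohid)
estimates store clustered

* Random intercepts model with households as the higher level
xtmixed ghq fem age age2 cohab || ohid:
estimates store multilevel

* Coefficients and standard errors from the three models side by side
estimates table ols clustered multilevel, b(%9.3f) se(%9.3f)
```

The first two columns share identical coefficients and differ only in their standard errors; the third adds the estimated variance components as extra rows.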
Comments
  • Models which ignore clustering should be unbiased but inefficient
  • The simplest multilevel model:
    • Shouldn’t change the coefficient estimates (they were already unbiased)
    • Should change the confidence intervals (they were inefficient)
A controversial claim about Stata
  • Stata is the best package to use for multilevel modelling, because:
    • It is integrated with data management capacity: easy to change variables; change cases; add higher level explanatory variables; etc
    • It has a wide range of hierarchical model estimators
    • It allows easy comparison between long-standing hierarchical estimators (from economics) and new random effects models
  • By contrast:
    • Other mainstream packages don’t have an adequate range of model estimators
    • Specialist packages (e.g. MLwiN; HLM) do have more advanced modelling estimators, but they inhibit data manipulation / serious model building