1 / 41

‘Interpreting coefficients from longitudinal models’

‘Interpreting coefficients from longitudinal models’. Professor Vernon Gayle and Dr Paul Lambert (Stirling University) Wednesday 1st April 2009. Structure of this Session. Briefly Mention Change Score Models Transition (table etc) Repeated Cross-Sectional Data Duration Models

marli
Download Presentation

‘Interpreting coefficients from longitudinal models’

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ‘Interpreting coefficients from longitudinal models’ Professor Vernon Gayle and Dr Paul Lambert (Stirling University) Wednesday 1st April 2009

  2. Structure of this Session Briefly Mention • Change Score Models • Transition (table etc) • Repeated Cross-Sectional Data • Duration Models • Panel Models

  3. Change in Score (first difference model) Yi 2 - Yi 1 = b’(Xi2-Xi1) + (ei2 - ei1) Here theb’is simply a regression on the difference or change in scores The panel fixed effects linear model is a special case of the change score model This modelling approach identifies on switcher!

  4. Transitions • Historically, social mobility tables • Large literature on log-linear models • Essentially cross-sectional models are fitted • Care is required if b is essentially a lagged effect (association between mother & daughter) • In some circumstances this may swamp other effects

  5. Repeated Cross-Sectional Surveys • UK has a wealth of repeated cross-sectional data • Much of it is comparable • Often not considered longitudinal because there are no explicit repeated contacts • However, very useful for trend over time analyses • Cross-sectional models are employed • Be careful of the interpretation of b, and the int of b * time • Time is often survey year, but can be cohort (e.g. YCS)

  6. Duration Models • Modelling time to an event taking place • Duration is the outcome

  7. Simple approach accelerated life model This is a regression model b is the effect on the log duration When there are no (or a small number) of right censored cases this approach is suitable – it may be questioned by referees however! This model is a little old fashioned, but often results are very similar to hazard models (although in practice betas should be carefully compared to hazard models Loge ti = b0 + b1x1i+ei

  8. Duration Models • Duration models • Survival models • Cox regression • Failure time analysis • Event history models • Hazard models Cox, D.R. (1972) ‘Regression models and life tables’ JRSS,B, 34 pp.187-220. These are all the same thing – depending on your substantive discipline

  9. Hazard Models • Model time to an event • They do no model duration – they model the ‘Harzard’ Hazard: measure of the probability that an event occurs at time t conditional on it not having occurred before t • These models appropriately control for right-censored data

  10. Hazard Models • Hazard models are similar to logit models • b is estimated on the logit scale • b estimates the increase/decrease in the speed at which individuals (in the group) leave the risk set • b is about speed and not rate (as is commonly suggested)

  11. Alternative Types of Event History Analysis Describing sequences / trajectories: characterise progression through states into clusters / sequences / frameworks • Growing recent social science interest sequence analysis • Often analyse cluster membership as categorical factor A problem – neutrality of data, e.g. cluster 1= Men in full time employment

  12. Panel Models

  13. Orthodox Panel Data Structure Individuals 1 2345 1 2 3 41 2 3 41 2 3 4 51 2 3 4 51 2 Observations (t)

  14. Panel Regression Approach • xt suite in Stata • b can usually be interpreted relatively easily • Similarity to b in the multilevel modelling framework

  15. Standard Linear Model Slopes and Intercepts b0 is a constant intercept b1is a constant slope Constant slopes Constant intercept

  16. Possible Slopes and Intercepts Separate regression for each individual The fixed effects model Varying slopes Varying intercepts Constant slopes Varying intercepts b0j is not a constant intercept b1 is a constant slope b0j is not a constant intercept b1j is not a constant slope

  17. Regression Approach Fixed or Random effects estimators • Fierce debate • F.E. b will tend be consistent • R.E. standard errors will be efficient but b may not be consistent • R.E. assumes no correlation between observed X variables and unobserved characteristics

  18. xt Regression Approach Fixed or Random effects • Economists tend towards F.E. (attractive property of consistent b ) • With continuous Y – little problem, fit both F.E. and R.E. models and then Hausman test bf.e. / b r.e. (don’t be surprised if it points towards F.E. model) ( Steve Pudney’s suggestion)

  19. xt Regression Approach Fixed or Random effects estimators • Preference for Random Effects (RE) models in some areas (e.g. education studies) • Frequent criticism – A key assumption in RE models is than random effects are uncorrelated with the observed variables in the model • In practice this assumption goes untested and could potentially result in biased estimates (see Halaby 2004 Ann. Rev. Sociology 30)

  20. Which approaches in practice? • Some more general thoughts • banana skins • flies in the ointment

  21. The Hausman test is very sensitive and will usually lead to a preference for the FE model Substantively the RE may be better, the FE is more appropriate in relation to growth or individual level change

  22. Fixed or Random Effect Estimators? In our view R.E. is most appropriate when there are substantively important fixed in time X variables (which are not correlated with unobserved effects) F.E. can be especially misleading for variables that change little in time (e.g. trade union members) because they are “identified by changers” This may be compounded by measurement errors

  23. A further thought about fixed effects models….

  24. The Panel Model The F.E. panel model estimator is theoretically attractive in this situation F.E. is commonly used in economics, as the effect of education level is correlated with ability Time changing x vars Earnings (y) Education level (x) fixed in time Unobserved ability Remember that this rests on the (potentially strong) assumption that ability is fixed in time

  25. The Panel Model R.E. is commonly used in multilevel modelling, but the effect of education level may be correlated with ability Time changing x vars Earnings (y) Education level (x) fixed in time Correlation Unobserved ability Remember that this rests on the (potentially strong) assumption that ability is fixed in time

  26. The Panel Model Fixed Effects - econometrician Stephen Pudney makes this point The standard theoretical position (two slides back) is questionable if there is two-way causality Explanatory variable Unobserved

  27. Population Ave Model (Marginal Models) • Is a model that accounts for clustering between individuals all we need? logit y x1, cluster(id) • Becoming more popular (Pickles –preference in USA in public health) • Do we need ‘subject’ specific random/fixed effect? (is ‘frailty’ or unobserved heterogeneity important) • Time constant X variables might be analytically important • Marginal Modelling (GEE approaches) may be all we need (e.g. estimating a policy or ‘social group’ difference)

  28. Some further thoughts on comparing estimates between models……

  29. Binary Outcome Panel Models:An example Married women’s employment (SCELI Data) y is the woman working yes=1; no=0 x woman has child aged under 1 year I have contrived this illustration….

  30. Consistent b - smaller standard errors (double the sample size) but Stata thinks that there are 202 individuals and not 101 people surveyed in two waves!

  31. Consistent b - standard errors are now corrected – Stata knows that there are 101 individuals (i.e. repeated measures)

  32. Beware b and standard errors are no longer measured on the same scale Stata knows that there are 101 individuals (i.e. repeated measures)

  33. b in Binary Panel Models The b in a probit random effects model is scaled differently– Mark Stewart suggests br.e. * (Ö1-rho) compared with bpooled probit rho (is analogous to an icc) – proportion of the total variance contributed by the person level variance Panel logit models also have this issue!

  34. b in Binary Panel Models • Conceptually two types of b in a binary random effects model • X is time changing - b is the ‘effect’ for a woman of changing her value of X • X is fixed in time - b is analogous to the effect for two women (e.g. Chinese / Indian) with the same value of the random effect (e.g. ui=0) – For fixed in time X Fiona Steele suggests simulating to get more appropriate value of b

  35. Population Ave Model / Marginal Models • Motivation for thinking about these approaches: • Not really been adopted in British Sociology • Population average models/Marginal Modelling/GEE approaches are developing rapidly. They might be useful for estimating a policy or ‘social group’ differences • Population average models are becoming more popular (Pickles – preference in USA in public health) • Is a model that accounts for clustering between individual observations adequate? Simple pop. average model: regress y x1, cluster(id)

  36. Conclusion • Clustering is sometimes part of the substantive story • e.g. orthodox hierarchical (or multi-level) situation, pupils nested in schools • Explicitly modelling hierarchical structure may be desirable • Ironically, in some instances even with ‘highly’ clustered data we would tell a similar story which ever model we used (strength of coefficient, signs & significance)

  37. Conclusion • Population average models/Marginal Modelling/GEE might be useful for estimating a policy or ‘social group’ differences • Is the ‘average’ effect for a group the substantively more interesting or more important for informing policy or practice

  38. Conclusion • Some estimators (xtprobit) don’t have F.E. equivalents (xtlogit F.E. is not equivalent to R.E.) • Here population average approaches might be attractive since a key assumption in RE models is than random effects are uncorrelated with the observed variables in the model and this can’t be formally tested

More Related