
Fiona Steele


Multilevel Event History Models with Applications to the Analysis of Recurrent Employment Transitions

Fiona Steele

- The discrete-time approach
- Multilevel models and examples for:
  - Recurrent events
  - Multiple states

- Handling large datasets
- Examples of other applications
- Estimation/software

- Event times are often measured in discrete time units, e.g. months or years.
- It is straightforward to allow for, and test for, non-proportional hazards.
- We can use familiar models for discrete response data. For more complex data structures and processes, we can use existing estimation procedures for multilevel models.

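We can fit a logit regression model of the form:

$$\log\left(\frac{p_{tj}}{1 - p_{tj}}\right) = \alpha^T z_{tj} + \beta^T x_{tj}$$

where $p_{tj}$ is the probability that individual $j$ has an event during time interval $t$, given no event before $t$ (the discrete-time hazard).

The covariates $x_{tj}$ can be constant over time or time-varying.

$z_{tj}$ is a vector of functions of time (e.g. polynomials or dummy variables) and $\alpha^T z_{tj}$ is the logit of the baseline hazard function.

Other link functions are possible, e.g. complementary log-log (clog-log) or probit.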
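As a rough illustration of the logit hazard model described above, the sketch below evaluates the hazard for a few time intervals. It assumes a quadratic baseline in t and a single binary covariate; the coefficient values and the names `alpha`, `beta` and `hazard` are made up for the example and are not estimates from the study.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def hazard(t, x, alpha, beta):
    """Discrete-time hazard with an assumed quadratic baseline, z_t = (1, t, t^2)."""
    z = (1.0, float(t), float(t) ** 2)
    eta = sum(a * zk for a, zk in zip(alpha, z))
    eta += sum(b * xk for b, xk in zip(beta, x))
    return inv_logit(eta)


# Hypothetical coefficients, chosen only for illustration.
alpha = (-1.0, -0.3, 0.01)  # baseline logit-hazard: intercept, t, t^2
beta = (0.5,)               # log-odds effect of one binary covariate

for t in range(1, 6):
    h0 = hazard(t, (0,), alpha, beta)
    h1 = hazard(t, (1,), alpha, beta)
    print(t, round(h0, 3), round(h1, 3))
```

Because the covariate enters on the logit scale, the difference in log-odds between the two groups is the same at every t. Allowing non-proportional effects would amount to adding an interaction between the covariate and the functions of time.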

- Analyse duration of periods of continuous exposure (episodes), e.g. employment episodes, birth intervals, partnerships
- There may be unobserved individual-specific (i.e. time-invariant) factors which affect the probability of an event for all of an individual’s episodes
- referred to as unobserved heterogeneity or frailty

Repeated events lead to a two-level hierarchical structure:

Level 2: Individuals

Level 1: Episodes

$$\log\left(\frac{p_{tij}}{1 - p_{tij}}\right) = \alpha^T z_{tij} + \beta^T x_{tij} + u_j$$

- $p_{tij}$ is the probability of an event in time interval $t$ during episode $i$ of individual $j$
- $x_{tij}$ are covariates, which might be time-varying or defined at the episode or individual level
- $u_j$ is a random effect representing unobserved characteristics of individual $j$ (unobserved heterogeneity or frailty)

Assume $u_j \sim N(0, \sigma_u^2)$.
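To make the two-level structure concrete, here is a small simulation sketch of recurrent episodes that share an individual-level random effect. It assumes a constant baseline logit-hazard and no covariates, and censors every episode after `max_t` intervals; all parameter values and names are illustrative assumptions.

```python
import math
import random


def inv_logit(eta):
    """Inverse logit: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate(n_individuals, n_episodes, max_t, alpha0, sigma_u, seed=1):
    """Simulate discrete-time records for episodes nested within individuals.

    Each individual j gets one normal random effect u_j that is shared by
    all of that individual's episodes. An episode ends at its first event,
    or is censored after max_t intervals.
    """
    rng = random.Random(seed)
    records = []
    for j in range(n_individuals):
        u_j = rng.gauss(0.0, sigma_u)    # time-invariant frailty
        p = inv_logit(alpha0 + u_j)      # constant hazard (an assumption)
        for i in range(n_episodes):
            for t in range(1, max_t + 1):
                y = 1 if rng.random() < p else 0
                records.append({"j": j, "i": i, "t": t, "y": y})
                if y == 1:
                    break                # episode ends at the event
    return records


# Hypothetical settings, for illustration only.
records = simulate(n_individuals=50, n_episodes=2, max_t=10,
                   alpha0=-1.5, sigma_u=0.65)
print(len(records), "person-period records")
```

Individuals with a large positive u_j tend to have short durations in all of their episodes, which is the within-individual dependence the random effect is meant to capture.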

- Duration of non-employment spells; event is (re)entry into employment
- Data are a subsample from the British Household Panel Study: 1401 women, 2290 episodes and 15314 person-year records
- Employment, birth and union histories were collected retrospectively at wave 2. These were linked to subsequent panel data to form continuous histories
- Focus on effects of duration of non-employment and time-varying indicators of number and age of children, adjusting also for age and characteristics of previous job (if any)

Unobserved individual heterogeneity

- Estimated standard deviation of woman-level random effect is 0.65 (se=0.09)
- significant variation between women in log-odds of entering employment due to unmeasured time-invariant characteristics

- Failure to account for unobserved heterogeneity (UH) leads to overstatement of negative duration effects and understatement of positive duration effects
- After accounting for UH, effects of time-varying covariates (e.g. duration and number/age of children) are subject-specific, i.e. within-woman effects
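One way to read the estimated random-effect standard deviation of 0.65 is on the odds scale. The arithmetic below is a sketch that assumes normally distributed random effects; the variable names are illustrative.

```python
import math

sigma_u = 0.65  # estimated SD of the woman-level random effect (log-odds scale)

# Odds of entering employment for a woman one SD above the average,
# relative to an otherwise identical woman with u_j = 0.
odds_ratio_1sd = math.exp(sigma_u)

# Under normality, about 95% of women have u_j within +/- 1.96 SD of zero.
lower = math.exp(-1.96 * sigma_u)
upper = math.exp(1.96 * sigma_u)

print(round(odds_ratio_1sd, 2), round(lower, 2), round(upper, 2))
```

So unmeasured characteristics alone roughly double the odds for a woman one standard deviation above the mean, holding the observed covariates fixed.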


An individual may pass through various ‘states’, e.g. employment and non-employment.

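Suppose there are 2 states, and denote by $p_{stij}$ the probability of a transition from state $s$ ($s = 1, 2$) in time interval $t$ during episode $i$ of individual $j$.

$$\log\left(\frac{p_{stij}}{1 - p_{stij}}\right) = \alpha_s^T z_{stij} + \beta_s^T x_{stij} + u_{sj}, \quad s = 1, 2$$

where $(u_{1j}, u_{2j})$ ~ bivariate normal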

Note: Generalises to multinomial logit for > 2 states
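The two-state model gives each individual a pair of correlated random effects, one per transition type. The sketch below draws such pairs from a bivariate normal using a hand-written Cholesky step and turns them into state-specific transition probabilities. The standard deviations, correlation and intercepts are hypothetical values chosen for illustration, and the baseline is reduced to an intercept.

```python
import math
import random


def draw_random_effects(sigma1, sigma2, rho, rng):
    """Draw (u1, u2) from a bivariate normal with correlation rho."""
    e1 = rng.gauss(0.0, 1.0)
    e2 = rng.gauss(0.0, 1.0)
    u1 = sigma1 * e1
    u2 = sigma2 * (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
    return u1, u2


def transition_probs(u1, u2, intercept1, intercept2):
    """State-specific hazards: state 1 uses u1, state 2 uses u2."""
    p1 = 1.0 / (1.0 + math.exp(-(intercept1 + u1)))
    p2 = 1.0 / (1.0 + math.exp(-(intercept2 + u2)))
    return p1, p2


rng = random.Random(42)
sigma1, sigma2, rho = 0.6, 0.7, 0.5   # hypothetical values
draws = [draw_random_effects(sigma1, sigma2, rho, rng) for _ in range(20000)]

# Check that the sample correlation is close to rho.
n = len(draws)
m1 = sum(u1 for u1, _ in draws) / n
m2 = sum(u2 for _, u2 in draws) / n
cov = sum((u1 - m1) * (u2 - m2) for u1, u2 in draws)
var1 = sum((u1 - m1) ** 2 for u1, _ in draws)
var2 = sum((u2 - m2) ** 2 for _, u2 in draws)
r = cov / math.sqrt(var1 * var2)
print(round(r, 2))

# Hazards for one simulated individual, with hypothetical intercepts.
print(transition_probs(*draws[0], intercept1=-2.0, intercept2=-1.0))
```

A positive rho means that individuals with an above-average hazard of leaving one state also tend to have an above-average hazard of leaving the other.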

Start with an episode-based file, e.g.

States are employment (E) and non-employment (NE)

Notes: (i) $t$ in years; (ii) $\mathrm{EVENT}_{ij} = 1$ if uncensored, 0 if censored; (iii) age, in years, at start of episode.

Convert to discrete-time format:

$E_{ij}$ is a dummy for employment and $NE_{ij}$ is a dummy for non-employment.
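The conversion from an episode-based file to discrete-time format can be sketched as follows. The field names and the two example episodes are hypothetical, since the original example tables are not available here; the logic is the usual expansion to one record per year at risk, with the response equal to 1 only in the final year of an uncensored episode.

```python
def expand_episodes(episodes):
    """Expand one-row-per-episode data into one row per discrete time unit."""
    rows = []
    for ep in episodes:
        for t in range(1, ep["t"] + 1):
            is_last = (t == ep["t"])
            rows.append({
                "j": ep["j"],                      # individual
                "i": ep["i"],                      # episode
                "t": t,                            # year within episode
                # Event indicator: 1 only in the last year of an
                # uncensored episode, 0 in every other record.
                "y": 1 if (is_last and ep["event"] == 1) else 0,
                "E": 1 if ep["state"] == "E" else 0,    # employment dummy
                "NE": 1 if ep["state"] == "NE" else 0,  # non-employment dummy
                "age": ep["age"],                  # age at start of episode
            })
    return rows


# Hypothetical episode file: one individual with two episodes.
episodes = [
    {"j": 1, "i": 1, "state": "E", "t": 3, "event": 1, "age": 16},
    {"j": 1, "i": 2, "state": "NE", "t": 2, "event": 0, "age": 19},
]

rows = expand_episodes(episodes)
for row in rows:
    print(row)
```

The state dummies can then be interacted with the duration terms and covariates so that both transition equations are fitted from a single stacked file.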

- corr($u_{1j}$, $u_{2j}$) = 0.58 (se = 0.13), so there is a large positive residual correlation between E→NE and NE→E
- Women with a high (low) chance of entering E tend to have a high (low) chance of leaving E
- Positive correlation arises from two sub-groups: short spells of E and NE, and longer spells of both types

- BUT little impact on estimates for child indicators on (re)entry into employment

- Although flexible, a drawback of the discrete-time approach is that the analysis file can be very large. This is a particular problem when we wish to fit complex models with multiple correlated random effects.
- Two possible approaches:
- Group time intervals
- More efficient algorithms, e.g. reparameterisation in MCMC estimation (Browne et al. 2009)

Suppose we analyse 6-month rather than monthly intervals.

We need to allow for different lengths of exposure time. In any 6-month interval, some individuals will have the event or be censored after the 1st month while others will be exposed for the full 6 months.

Denote by $n_{tij}$ the exposure time in grouped interval $t$.

Estimate a binomial logit model with response $y_{tij}$ and denominator $n_{tij}$.

Note: intervals do not need to be the same width.

Suppose an individual is observed to have an event during the 17th month, and we wish to group durations into 6-month intervals ($t$).
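The grouping for this example can be worked through in a few lines. The sketch assumes that an event in the 17th month means 17 months of exposure in total, so the last grouped interval contributes only the months up to and including the event; the function name and the tuple layout (t, n, y) are illustrative.

```python
def group_duration(duration, event, width):
    """Split a duration into grouped intervals.

    Returns one (t, n, y) tuple per grouped interval, where n is the
    exposure time within interval t and y is 1 only in the interval
    in which the event occurs.
    """
    records = []
    remaining = duration
    t = 1
    while remaining > 0:
        n = min(width, remaining)    # exposure within this interval
        remaining -= n
        y = 1 if (remaining == 0 and event == 1) else 0
        records.append((t, n, y))
        t += 1
    return records


# Event in month 17, grouped into 6-month intervals.
records = group_duration(duration=17, event=1, width=6)
print(records)  # [(1, 6, 0), (2, 6, 0), (3, 5, 1)]
```

The monthly file would hold 17 records for this individual; the grouped file holds three, with y as the binomial response and n as the denominator.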

- Need to assume that hazard function is constant within the grouped intervals.
- Need to fix values of time-varying covariates within intervals, e.g. value at start.
- In practice, aggregation has little impact on estimated baseline hazard or effects of episode/individual-level covariates. But impact on coefficients of time-varying covariates can be substantial.

- Hospital admissions: length of stay or duration between admissions
- Repeated episodes nested within patients if multiple admissions
- Hospital and GP effects using cross-classified multilevel model (GPs refer to multiple hospitals, and hospitals take patients from multiple GPs)

- Area effects on mortality or fertility
- Repeated birth intervals (for fertility) for individuals nested within areas

- As in the employment example, set up a person-period file with multiple records per person, e.g. Kravdal (2006)
- Define a single binary response for each person and include the number of years of exposure as an offset in a Poisson regression, e.g. Tarkiainen et al. (2009). Could also treat as a binomial response (as for grouped time intervals).
- If there are only a few categorical covariates, apply Poisson regression to aggregated data (1 record for each combination of $t$ and covariate values)

- Suppose we want to estimate effect of age, sex and area characteristics on individual mortality risk
- Suppose we group age into four 5-year age categories. Then for each area define 8 cells, one for each age-sex combination
- For area $j$, denote by $y_{ij}$ the observed number of deaths in age-sex cell $i$
- Denote by $n_{ij}$ the total population at risk of mortality in cell $i$ of area $j$; alternatively, use the expected number of deaths $E_{ij}$
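Building the aggregated file from individual records might look like the sketch below. The individual-level records, field names and age bands are hypothetical. Using log(n) as the offset is the common choice for a Poisson model of counts with a population at risk, and is stated here as an assumption rather than as the approach of any cited study.

```python
import math
from collections import defaultdict


def aggregate(individuals):
    """Collapse individual records into area-by-age-by-sex cells.

    Each cell holds y (number of deaths) and n (population at risk).
    """
    cells = defaultdict(lambda: {"y": 0, "n": 0})
    for person in individuals:
        key = (person["area"], person["age_group"], person["sex"])
        cells[key]["n"] += 1
        cells[key]["y"] += person["died"]
    return dict(cells)


# Hypothetical individual-level data.
individuals = [
    {"area": 1, "age_group": "60-64", "sex": "F", "died": 0},
    {"area": 1, "age_group": "60-64", "sex": "F", "died": 1},
    {"area": 1, "age_group": "65-69", "sex": "M", "died": 1},
    {"area": 2, "age_group": "60-64", "sex": "F", "died": 0},
]

cells = aggregate(individuals)
for key, cell in sorted(cells.items()):
    offset = math.log(cell["n"])  # log exposure, used as a Poisson offset
    print(key, cell["y"], cell["n"], round(offset, 3))
```

Each printed row is one level-1 unit (a cell) nested within a level-2 unit (an area), ready for a 2-level Poisson model with age and sex dummies as cell-level predictors.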

- Analyse ($y_{ij}$, $n_{ij}$) using a 2-level Poisson model
- Define age and sex dummies characterising cells and include these and area-level variables as predictors
- Application to cancer mortality: Langford and Day (2001)
- Number of deaths for small areas (i) within regions (j) within EC nations (k). Covariates at regional level

- Application to teenage conception: Diamond et al. (2002)
- Number of conceptions for age-year cell (i) within electoral wards (j). Deprivation indicators at ward level

- Recurrent events and multiple states: any software for multilevel binary responses
- Binomial models for grouped intervals: GLLAMM, MLwiN, WinBUGS
- Simultaneous equations models for correlated processes: aML, GLLAMM, MLwiN, Sabre, WinBUGS. aML is the most general (mixed response types at different levels)

Browne, W. J., Steele, F., Golalizadeh, M. and Green, M. (2009) The use of simple reparameterisations in MCMC estimation of multilevel models with applications to discrete-time survival models. Journal of the Royal Statistical Society, Series A, 172: 579-598.

Diamond, I., Clements, S., Stone, N. and Ingham, R. (2002) Spatial variation in teenage conceptions in south and west England. Journal of the Royal Statistical Society, Series A, 162: 273-289.

Goldstein, H., Pan, H. and Bynner, J. (2004) A flexible procedure for analysing longitudinal event histories using a multilevel model. Understanding Statistics, 3: 85-99.

Kravdal, Ø. (2006) Does place matter for cancer survival in Norway? A multilevel analysis of the importance of hospital affiliation and municipality socio-economic resources. Health and Place, 12: 527-537.

Langford, I. H. and Day, R. J. (2001) Poisson Regression. In A. H. Leyland and H. Goldstein (eds) Multilevel Modelling of Health Statistics. London: Wiley. Chapter 4.

Steele, F., Goldstein, H. and Browne, W. (2004) A general multistate competing risks model for event history data, with an application to a study of contraceptive use dynamics. Statistical Modelling, 4: 145-159.

Steele, F. (2011) Multilevel discrete-time event history models with applications to the analysis of recurrent employment transitions (with discussion). Australian and New Zealand Journal of Statistics (to appear).

Tarkiainen, L., Martikainen, P., Laaksonen, M. and Leyland, A. H. (2009) Comparing the effects of neighbourhood characteristics on all-cause mortality using two hierarchical areal units in the capital region of Helsinki. Health and Place, 16: 409-412.

See also downloadable materials:

http://www.cmm.bris.ac.uk/MLwiN/tech-support/workshops/materials/models.shtml

http://www.cmm.bris.ac.uk/MLwiN/tech-support/workshops/materials/eha.shtml