Multilevel Event History Models with Applications to the Analysis of Recurrent Employment Transitions

Fiona Steele

  • The discrete-time approach
  • Multilevel models and examples for:
    • Recurrent events
    • Multiple states
  • Handling large datasets
  • Examples of other applications
  • Estimation/software
Why use discrete-time methods?
  • Event times are often measured in discrete time units, e.g. months or years.
  • Straightforward to allow and test for non-proportional hazards.
  • We can use familiar models for discrete response data. For more complex data structures and processes, we can use existing estimation procedures for multilevel models.
A simple discrete-time logit model

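We can fit a logit regression model of the form:

$$\operatorname{logit}(p_{tj}) = \log\left(\frac{p_{tj}}{1 - p_{tj}}\right) = \alpha^{T} z_{tj} + \beta^{T} x_{tj}$$

where $p_{tj}$ is the probability that individual j has an event in time interval t, given no event before t.

The covariates $x_{tj}$ can be constant over time or time-varying.

$z_{tj}$ is a vector of functions of time (e.g. polynomials or dummy variables) and $\alpha^{T} z_{tj}$ is the logit of the baseline hazard function.

Other link functions are possible, e.g. complementary log-log (clog-log) or probit.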
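
As a minimal sketch of how a discrete-time logit model translates into hazards and survival probabilities, the Python below evaluates a linear predictor of the form α'z + β'x on the logit scale and accumulates the probability of having no event through each interval. The coefficient values, the name `beta` for the covariate coefficients, and the choice of an intercept plus a linear duration term for z are assumptions made for illustration; they are not estimates from the presentation.

```python
import math

def expit(x: float) -> float:
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def hazard(alpha, z, beta, x):
    """Discrete-time hazard p_t = expit(alpha'z_t + beta'x_t)."""
    eta = sum(a * zi for a, zi in zip(alpha, z))
    eta += sum(b * xi for b, xi in zip(beta, x))
    return expit(eta)

# Hypothetical coefficients, for illustration only.
alpha = [-2.0, -0.1]   # intercept and linear duration term (baseline logit-hazard)
beta = [0.5]           # effect of a single binary covariate

def survival_curve(x_value, n_intervals):
    """Probability of no event through each interval t = 1..n_intervals."""
    surv, curve = 1.0, []
    for t in range(1, n_intervals + 1):
        p_t = hazard(alpha, [1.0, float(t)], beta, [x_value])
        surv *= 1.0 - p_t
        curve.append(surv)
    return curve

s0 = survival_curve(0.0, 5)   # covariate = 0
s1 = survival_curve(1.0, 5)   # covariate = 1, higher hazard, lower survival
```

Swapping `expit` for another inverse link (for example the inverse complementary log-log) would change only the `hazard` function.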

Recurrent events
  • Analyse duration of periods of continuous exposure (episodes), e.g. employment episodes, birth intervals, partnerships
  • There may be unobserved individual-specific (i.e. time-invariant) factors which affect the probability of an event for all of an individual’s episodes
    • referred to as unobserved heterogeneity or frailty
Hierarchical data structure

Repeated events lead to a two-level hierarchical structure

Level 2: Individuals

Level 1: Episodes

2-level model for recurrent events

$$\operatorname{logit}(p_{tij}) = \alpha^{T} z_{tij} + \beta^{T} x_{tij} + u_{j}$$

$p_{tij}$ is the probability of an event in time interval t during episode i of individual j.

$x_{tij}$ are covariates, which might be time-varying or defined at the episode or individual level.

$u_{j} \sim N(0, \sigma_{u}^{2})$ is a random effect representing unobserved characteristics of individual j (unobserved heterogeneity or frailty).
Example: women’s employment
  • Duration of non-employment spells; event is (re)entry into employment
  • Data are subsample from British Household Panel Study: 1401 women, 2290 episodes and 15314 person-year records
  • Employment, birth and union histories collected retrospectively at wave 2. These were linked to subsequent panel data to form continuous histories
  • Focus on effects of duration non-employed and time-varying indicators of number and age of children, but also adjust for age, characteristics of previous job (if any)

Unobserved individual heterogeneity

  • Estimated standard deviation of woman-level random effect is 0.65 (se=0.09)
    • significant variation between women in log-odds of entering employment due to unmeasured time-invariant characteristics
  • Failure to account for unobserved heterogeneity (UH) leads to overstatement of negative duration effects and understatement of positive duration effects
  • After accounting for UH, effects of time-varying covariates (e.g. duration and number/age children) are subject-specific, i.e. within-woman effects
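
The selection mechanism behind the duration bias can be illustrated with a small exact calculation. The sketch below is a deliberate simplification: it replaces the continuous random effect with two hypothetical groups whose hazards are constant over time, and computes the hazard that would be observed if the groups were pooled. The group hazards and weights are assumed values, not results from the employment data.

```python
def population_hazard(hazards, weights, n_intervals):
    """Hazard observed in a pooled population of groups with constant but different hazards."""
    at_risk = list(weights)  # share of each group still at risk at the start of interval t
    observed = []
    for _ in range(n_intervals):
        events = sum(r * h for r, h in zip(at_risk, hazards))
        observed.append(events / sum(at_risk))
        # High-hazard individuals leave the risk set faster than low-hazard ones.
        at_risk = [r * (1.0 - h) for r, h in zip(at_risk, hazards)]
    return observed

# Two hypothetical groups: low hazard (0.1) and high hazard (0.5), equal sizes.
pooled = population_hazard(hazards=[0.1, 0.5], weights=[0.5, 0.5], n_intervals=6)
```

Although neither group has any duration dependence, the pooled hazard starts at 0.3 and falls towards 0.1 as the high-hazard group is depleted, which is the kind of spurious negative duration effect that a random effect is intended to absorb.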
Modelling transitions between multiple states

An individual may pass through various ‘states’, e.g. employment and non-employment.

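Suppose there are 2 states, and denote by $p_{stij}$ the probability of a transition from state s (s = 1, 2) in time interval t during episode i of individual j.

$$\operatorname{logit}(p_{stij}) = \alpha_{s}^{T} z_{stij} + \beta_{s}^{T} x_{stij} + u_{sj}, \quad s = 1, 2$$

where $(u_{1j}, u_{2j})$ ~ bivariate normal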

Note: Generalises to multinomial logit for > 2 states

Multiple states: data structure (1)

Start with an episode-based file, e.g.

States are employment (E) and non-employment (NE)

Notes: (i) t in years; (ii) EVENTij = 1 if uncensored, 0 if censored; (iii) age, in years, at start of episode.

Multiple states: data structure (2)

Convert to discrete-time format:

Eij is a dummy for employment and NEij is a dummy for non-employment.
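
The conversion from an episode-based file to discrete-time format might look like the following sketch. The `Episode` fields mirror the quantities described in the notes (state, duration t in years, censoring indicator, age at start of episode), but the field names, the `expand` helper, and the two example episodes are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    person: int     # individual j
    episode: int    # episode i within the individual
    state: str      # "E" (employment) or "NE" (non-employment)
    duration: int   # t, in years
    event: int      # 1 if uncensored, 0 if censored
    age_start: int  # age in years at start of episode

def expand(episodes):
    """One record per year at risk; y = 1 only in the final year of an uncensored episode."""
    rows = []
    for ep in episodes:
        for t in range(1, ep.duration + 1):
            rows.append({
                "person": ep.person, "episode": ep.episode, "t": t,
                "y": int(ep.event == 1 and t == ep.duration),
                "E": int(ep.state == "E"), "NE": int(ep.state == "NE"),
                "age": ep.age_start,
            })
    return rows

# Hypothetical episodes: a 3-year employment spell ending in a transition,
# followed by a 2-year non-employment spell that is censored.
episodes = [Episode(1, 1, "E", 3, 1, 16), Episode(1, 2, "NE", 2, 0, 19)]
rows = expand(episodes)
```

With the data in this shape, the E and NE dummies can be interacted with duration and covariates so that a single binary response model carries state-specific coefficients; that reading of the dummies is an interpretation of the slide rather than something it states in full.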

Example: transitions between employment and non-employment
  • corr(u1j, u2j)=0.58, se=0.13, so large positive residual correlation between E→NE and NE→E
    • Women with high (low) chance of entering E tend to have a high (low) chance of leaving E
    • Positive correlation arises from two sub-groups: short spells of E and NE, and longer spells of both types
  • BUT little impact on estimates for child indicators on (re)entry into employment
Handling large datasets
  • Although flexible, a drawback of the discrete-time approach is that the analysis file can be very large. This is a particular problem when we wish to fit complex models with multiple correlated random effects.
  • Two possible approaches:
    • Group time intervals
    • More efficient algorithms, e.g. reparameterisation in MCMC estimation (Browne et al. 2009)
Grouped time intervals

Suppose we analyse 6-month rather than monthly intervals.

We need to allow for different lengths of exposure time. In any 6-month interval, some individuals will have the event or be censored after the 1st month, while others will be exposed for the full 6 months.

Denote by $n_{tij}$ the exposure time in grouped interval t.

Estimate a binomial logit model with response $y_{tij}$ and denominator $n_{tij}$.

Note: intervals do not need to be the same width.

Example of grouped time intervals

Suppose an individual is observed to have an event during the 17th month, and we wish to group durations into 6-month intervals (t).
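
The grouping for this individual can be worked through with a short sketch. The function name `group_exposure` is hypothetical, and the code assumes that the month in which the event occurs counts towards exposure; other conventions are possible.

```python
def group_exposure(event_month, width, event=1):
    """Return (t, n_t, y_t) for each grouped interval up to the month the episode ends."""
    records = []
    t, start = 1, 1
    while start <= event_month:
        end = min(start + width - 1, event_month)
        n_t = end - start + 1                             # months of exposure in interval t
        y_t = int(event == 1 and end == event_month)      # event indicator for interval t
        records.append((t, n_t, y_t))
        t, start = t + 1, start + width
    return records

records = group_exposure(event_month=17, width=6)
# records == [(1, 6, 0), (2, 6, 0), (3, 5, 1)]
```

The individual contributes two full intervals with no event and a third interval with 5 months of exposure in which the event occurs, so three grouped records replace 17 monthly ones.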

Implications of aggregation
  • Need to assume that hazard function is constant within the grouped intervals.
  • Need to fix values of time-varying covariates within intervals, e.g. value at start.
  • In practice, aggregation has little impact on estimated baseline hazard or effects of episode/individual-level covariates. But impact on coefficients of time-varying covariates can be substantial.
Examples of other applications
  • Hospital admissions: length of stay or duration between admissions
    • Repeated episodes nested within patients if multiple admissions
    • Hospital and GP effects using cross-classified multilevel model (GPs refer to multiple hospitals, and hospitals take patients from multiple GPs)
  • Area effects on mortality or fertility
    • Repeated birth intervals (for fertility) for individuals nested within areas
Area effects on mortality: alternative approaches
  • As in the employment example, set up a person-period file with multiple records per person, e.g. Kravdal (2006)
  • Define a single binary response for each person and include number of years of exposure as offset in a Poisson regression, e.g. Tarkiainen et al. (2009). Could also treat as binomial response (as for grouped time intervals).
  • If there are only a few categorical covariates, apply Poisson regression to aggregate data (1 record for each combination of t and covariate values)
Area effects on mortality: Multilevel Poisson modelling of aggregate data (1)
  • Suppose we want to estimate effect of age, sex and area characteristics on individual mortality risk
  • Suppose we group age into four 5-year age categories. Then for each area define 8 cells, one for each age-sex combination
  • For area j denote by yij the observed number of deaths for age-sex cell i
  • Denote the total population at risk of mortality in cell i of area j by nij; alternatively, the expected number of deaths Eij might be used
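
A minimal sketch of the aggregation step is given here, assuming individual-level records are available. The record layout, the area labels, and the age bands are hypothetical. The log of the population at risk is computed because it is the usual offset in a Poisson model for counts; the presentation itself does not show this step.

```python
from collections import defaultdict
import math

# Hypothetical individual records: (area, age_group, sex, died)
people = [
    ("A", "60-64", "F", 0), ("A", "60-64", "F", 1), ("A", "60-64", "M", 0),
    ("B", "60-64", "F", 0), ("B", "65-69", "M", 1), ("B", "65-69", "M", 0),
]

def aggregate(records):
    """Collapse individuals to one row per (area, age-sex cell) with deaths y and population n."""
    cells = defaultdict(lambda: {"y": 0, "n": 0})
    for area, age, sex, died in records:
        cell = cells[(area, age, sex)]
        cell["y"] += died
        cell["n"] += 1
    return dict(cells)

cells = aggregate(people)

# log(n) for each cell, to be used as the offset in a Poisson model of y.
offsets = {key: math.log(c["n"]) for key, c in cells.items()}
```

The number of rows then depends on the number of areas and age-sex cells rather than on the number of individuals, which is what makes this approach attractive for large populations.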
Area effects on mortality: Multilevel Poisson modelling of aggregate data (2)
  • Analyse (yij, nij) using 2-level Poisson model
  • Define age and sex dummies characterising cells and include these and area-level variables as predictors
  • Application to cancer mortality: Langford and Day (2001)
    • No. deaths for small areas (i) within regions (j) within EC nations (k). Covariates at regional level
  • Application to teenage conception: Diamond et al. (2002)
    • No. conceptions for age-year cell (i) within electoral wards (j). Deprivation indicators at ward level
Estimation/software
  • Recurrent events and multiple states: any software for multilevel binary responses
  • Binomial models for grouped intervals: GLLAMM, MLwiN, WinBUGS
  • Simultaneous equations models for correlated processes: aML, GLLAMM, MLwiN, Sabre, WinBUGS. aML is the most general (mixed response types at different levels)

References

Browne, W. J., Steele, F., Golalizadeh, M. and Green, M. (2009) The use of simple reparameterisations in MCMC estimation of multilevel models with applications to discrete-time survival models. Journal of the Royal Statistical Society, Series A, 172: 579-598.

Diamond, I., Clements, S., Stone, N. and Ingham, R. (2002) Spatial variation in teenage conceptions in south and west England. Journal of the Royal Statistical Society, Series A, 162: 273-289.

Goldstein, H., Pan, H. and Bynner, J. (2004) A flexible procedure for analysing longitudinal event histories using a multilevel model. Understanding Statistics, 3: 85-99.

Kravdal, Ø. (2006) Does place matter for cancer survival in Norway? A multilevel analysis of the importance of hospital affiliation and municipality socio-economic resources. Health and Place, 12: 527-537.

Langford, I. H. and Day, R. J. (2001) Poisson regression. In A. H. Leyland and H. Goldstein (eds) Multilevel Modelling of Health Statistics. London: Wiley. Chapter 4.

Steele, F., Goldstein, H. and Browne, W. (2004) A general multistate competing risks model for event history data, with an application to a study of contraceptive use dynamics. Statistical Modelling, 4: 145-159.

Steele, F. (2011) Multilevel discrete-time event history models with applications to the analysis of recurrent employment transitions (with discussion). Australian and New Zealand Journal of Statistics (to appear).

Tarkiainen, L., Martikainen, P., Laaksonen, M. and Leyland, A. H. (2009) Comparing the effects of neighbourhood characteristics on all-cause mortality using two hierarchical areal units in the capital region of Helsinki. Health and Place, 16: 409-412.

See also downloadable materials: