
### Section

Duration Data

• Sometimes we have data on the length of time a particular event lasts, or ‘spells’

• Time until death

• Time on unemployment

• Time to complete a PhD

• The techniques we will discuss were originally used to examine the lifespan of objects like light bulbs or machines. These models are often referred to as “time to failure” models.

• T is a random variable that indicates duration (time until death, time to find a new job, etc.)

• t is the realization of that variable

• f(t) is a PDF that describes the process that determines the time to failure

• F(t) is the CDF: the probability that the event will happen by time t, F(t) = Pr(T ≤ t)

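• Hazard function, h(t), also written λ(t)

• What is the probability the spell will end at time t, given that it has already lasted until t?

• What is the chance you find a new job in month 12, given that you’ve been unemployed for 12 months already?

• Mathematically, λ(t) = f(t) / [1 - F(t)]

• Duration dependence: does the hazard change as the spell gets longer? For example, the longer people are unemployed, the lower the chance they will exit unemployment (‘damaged goods’)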

• If dλ(t)/dt = 0, there is no duration dependence

• If dλ(t)/dt > 0, there is positive duration dependence: the probability the spell will end increases with time

• If dλ(t)/dt < 0, there is negative duration dependence: the probability the spell will end decreases over time
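As a rough numerical illustration of the definitions above, the sketch below computes the hazard as f(t) / (1 - F(t)) and classifies duration dependence from the sign of a finite-difference slope. The function names (`hazard`, `duration_dependence`), the three example distributions, and the evaluation points are assumptions chosen for illustration only.

```python
import math

def hazard(pdf, cdf, t):
    """Hazard at t: density of ending at t divided by the share of spells still ongoing."""
    return pdf(t) / (1.0 - cdf(t))

def duration_dependence(pdf, cdf, t, eps=1e-4, tol=1e-6):
    """Classify duration dependence at t from a central-difference slope of the hazard."""
    slope = (hazard(pdf, cdf, t + eps) - hazard(pdf, cdf, t - eps)) / (2 * eps)
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "none"

# Hypothetical example 1: constant exit rate of 0.1 per month.
rate = 0.1
exp_pdf = lambda t: rate * math.exp(-rate * t)
exp_cdf = lambda t: 1.0 - math.exp(-rate * t)

# Hypothetical example 2: F(t) = 1 - exp(-t^2), so the hazard is 2t (rising).
inc_pdf = lambda t: 2.0 * t * math.exp(-t ** 2)
inc_cdf = lambda t: 1.0 - math.exp(-t ** 2)

# Hypothetical example 3: F(t) = 1 - exp(-sqrt(t)), so the hazard is 0.5 / sqrt(t) (falling).
dec_pdf = lambda t: 0.5 * t ** -0.5 * math.exp(-math.sqrt(t))
dec_cdf = lambda t: 1.0 - math.exp(-math.sqrt(t))

for name, pdf, cdf, t in [("constant", exp_pdf, exp_cdf, 12.0),
                          ("rising", inc_pdf, inc_cdf, 1.0),
                          ("falling", dec_pdf, dec_cdf, 4.0)]:
    print(name, round(hazard(pdf, cdf, t), 4), duration_dependence(pdf, cdf, t))
```

The tolerance `tol` is there only because a numerical slope is never exactly zero; with an analytic hazard one would check the sign of the derivative directly.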

Different Functional Forms duration dependence

• Exponential

• λ(t)= λ

• Hazard is the same over time, a ‘memoryless’ process

• Weibull

• F(t) = 1 - exp(-γ t^α), where α, γ > 0

• λ(t) = α γ t^(α-1)

• if α>1, increasing hazard

• if α<1, decreasing hazard

• if α=1, exponential
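The three Weibull cases can be checked numerically. The sketch below evaluates the Weibull CDF and hazard at a few months for α below, equal to, and above 1. The parameter values (`gamma = 0.05`, the chosen α values, and the months) are hypothetical and were picked only to make the pattern visible.

```python
import math

def weibull_cdf(t, alpha, gamma):
    """F(t) = 1 - exp(-gamma * t**alpha), with alpha, gamma > 0."""
    return 1.0 - math.exp(-gamma * t ** alpha)

def weibull_hazard(t, alpha, gamma):
    """lambda(t) = alpha * gamma * t**(alpha - 1)."""
    return alpha * gamma * t ** (alpha - 1)

months = [1, 12, 24, 60]
gamma = 0.05  # hypothetical value, not estimated from any data

for alpha in (0.5, 1.0, 1.5):
    hazards = [round(weibull_hazard(t, alpha, gamma), 4) for t in months]
    print(f"alpha={alpha}: hazards at months {months} -> {hazards}")
```

With α = 1 the hazard equals γ at every month, which is the exponential case; with α = 0.5 the printed hazards fall over time and with α = 1.5 they rise.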

NHIS Multiple Cause of Death

• NHIS

• annual survey of 60K households

• Data on individuals

• Self-reported health, doctor visits, lost workdays, etc.

• MCOD

• Linked NHIS respondents from 1986-1994 to National Death Index through Dec 31, 1995

• Identified whether respondent died and of what cause

• Our sample

• Males, 50-70, who were married at the time of the survey

• 1987-1989 surveys

• Give everyone 5 years (60 months) of followup

Key Variables

• max_mths: maximum months in the survey

• diedin5: =1 if the respondent died during the 5 years of followup, =0 otherwise

• Note: if diedin5=0, then max_mths=60. diedin5 identifies whether an observation is censored or not.
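To make the censoring rule concrete, the sketch below builds the two variables from a hypothetical value of months from survey to death (`None` when no death is recorded). Only the 60-month window and the names `max_mths` and `diedin5` come from the text; the helper `make_duration_vars`, the input values, and the choice to count a death in month 60 as inside the window are assumptions.

```python
FOLLOWUP_MONTHS = 60  # 5 years of followup

def make_duration_vars(months_to_death):
    """Return (max_mths, diedin5) for one respondent.

    months_to_death: months from survey to death, or None if no death was recorded.
    Deaths after the followup window are treated as censored at 60 months.
    """
    if months_to_death is not None and months_to_death <= FOLLOWUP_MONTHS:
        return months_to_death, 1   # spell ends in failure inside the window
    return FOLLOWUP_MONTHS, 0       # censored: still alive at month 60

# Hypothetical respondents: died at month 14, died at month 72, no death recorded.
for m in (14, 72, None):
    print(m, make_duration_vars(m))
```

The respondent who dies at month 72 shows why censoring matters: the duration is known only to exceed 60 months, so it is recorded as 60 with diedin5=0.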

Identifying Duration Data in STATA

• Need to identify which is the duration data

stset length, failure(failvar)

• length = the duration variable

• failvar = 1 when a duration ends in failure, = 0 for censored values

• If all data is uncensored, omit failure(failvar)

• In our case

stset max_mths, failure(diedin5)

Getting Kaplan-Meier Curves

• Tabular presentation of results

sts list

• Graphical presentation

sts graph

• Results by subgroup

sts graph, by(income)
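For intuition about what a Kaplan-Meier survivor table contains, the sketch below computes the product-limit estimate by hand in Python. It is a simplified illustration, not Stata's implementation: the function name `kaplan_meier` and the eight toy observations are hypothetical, and it reports only the number at risk, the number of failures, and the survivor estimate at each failure time.

```python
def kaplan_meier(durations, failed):
    """Product-limit survivor estimate.

    durations: observed spell lengths (e.g. months)
    failed:    1 if the spell ended in failure, 0 if it was censored
    Returns a list of (time, at_risk, failures, survival), one per distinct failure time.
    """
    rows = []
    survival = 1.0
    failure_times = sorted({d for d, f in zip(durations, failed) if f == 1})
    for t in failure_times:
        # Everyone whose spell lasted at least until t is still at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        failures = sum(1 for d, f in zip(durations, failed) if d == t and f == 1)
        survival *= 1.0 - failures / at_risk
        rows.append((t, at_risk, failures, survival))
    return rows

# Hypothetical toy data: four deaths, four respondents censored at 60 months.
max_mths = [5, 12, 12, 30, 60, 60, 60, 60]
diedin5 = [1, 1, 1, 1, 0, 0, 0, 0]

for time, at_risk, failures, survival in kaplan_meier(max_mths, diedin5):
    print(f"month {time:>2}: at risk {at_risk}, failures {failures}, S(t) = {survival:.3f}")
```

Censored observations never trigger a drop in the survivor estimate, but they do count in the at-risk total for every failure time up to their censoring point.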