
Analysis of Survival Data



Presentation Transcript


  1. Analysis of Survival Data • Time-to-event outcomes • Censoring • Survival function • Point estimation • Kaplan-Meier

  2. Introduction to survival analysis • What makes it different? • Three main variable types • Continuous • Categorical • Time-to-event • Examples of each

  3. Example: Death Times of Psychiatric Patients (K&M 1.15) • Dataset reported on by Woolson (1981) • 26 psychiatric inpatients admitted to the University of Iowa between 1935 and 1948 • Part of a larger study • Variables included: • Age at first admission to hospital • Gender • Time from first admission to death (years)

  4. Data summary
     . tab gender
          gender |      Freq.     Percent        Cum.
     ------------+-----------------------------------
               0 |         11       42.31       42.31
               1 |         15       57.69      100.00
     ------------+-----------------------------------
           Total |         26      100.00
     gender   age   deathtime   death
          1    51           1       1
          1    58           1       1
          1    55           2       1
          1    28          22       1
          0    21          30       0
          0    19          28       1
          1    25          32       1
          1    48          11       1
          1    47          14       1
          1    25          36       0
          1    31          31       0
          0    24          33       0
          0    25          33       0
          1    30          37       0
          1    33          35       0
          0    36          25       1
          0    30          31       0
          0    41          22       1
          1    43          26       1
          1    45          24       1
          1    35          35       0
          0    29          34       0
          0    35          30       0
          0    32          35       1
          1    36          40       1
          0    32          39       0
     . sum age
         Variable |   Obs       Mean   Std. Dev.   Min   Max
     -------------+------------------------------------------
              age |    26   35.15385    10.47928    19    58

  5. Death time?
     . sum deathtime
         Variable |   Obs       Mean   Std. Dev.   Min   Max
     -------------+------------------------------------------
        deathtime |    26   26.42308    11.55915     1    40

  6. Does that make sense?
     . tab death
           death |      Freq.     Percent        Cum.
     ------------+-----------------------------------
               0 |         12       46.15       46.15
               1 |         14       53.85      100.00
     ------------+-----------------------------------
           Total |         26      100.00
     • Only 14 patients died • The rest were still alive at the end of the study • Does it make sense to estimate mean? Median? • How can we interpret the histogram? • What if all had died? • What if none had died?
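A minimal R sketch of why the naive summaries are suspect. The two vectors are transcribed from the listing on slide 4; the object names are illustrative choices, not part of the original deck. Both means below are biased downward: the first treats the 12 censored times as if they were death times, the second discards the patients who survived longest.

```r
# Follow-up times (years) and death indicators transcribed from slide 4.
deathtime <- c(1, 1, 2, 22, 30, 28, 32, 11, 14, 36, 31, 33, 33,
               37, 35, 25, 31, 22, 26, 24, 35, 34, 30, 35, 40, 39)
death     <- c(1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,
               0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0)

# Naive mean: treats every recorded time as a death time.
naive_mean <- mean(deathtime)

# Deaths-only mean: drops the censored patients entirely.
mean_deaths_only <- mean(deathtime[death == 1])

# Mean recorded time among censored patients (true death times are longer).
mean_censored <- mean(deathtime[death == 0])

c(naive = naive_mean, deaths_only = mean_deaths_only, censored = mean_censored)
table(death)
```

For a censored patient the recorded time is only a lower bound on the true time to death, so neither mean estimates the mean survival time.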

  7. CENSORING • Different types • Right • Left • Interval • Each leads to a different likelihood function • Most common is right censored

  8. Right censored data • “Type I censoring” • Event is observed if it occurs before some prespecified time • Mouse study • Clock starts: at first day of treatment • Clock ends: at death • Always be thinking about ‘the clock’

  9. Simple example: Type I censoring (figure; axis label: Time 0)

  10. Introduce “administrative” censoring (figure; axis labels: Time 0, STUDY END)

  11. Introduce “administrative” censoring (figure; axis labels: Time 0, STUDY END)

  12. More realistic: clinical trial “Generalized Type I censoring” (figure; axis labels: Time 0, STUDY END)

  13. More realistic: clinical trial “Generalized Type I censoring” (figure; axis labels: Time 0, STUDY END)

  14. Additional issues • Patient drop-out • Loss to follow-up

  15. Drop-out or LTFU (figure; axis labels: Time 0, STUDY END)

  16. How do we ‘treat’ the data? Shift everything so each patient’s time represents time on study, measured from that patient’s time of enrollment
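The shift to time on study can be sketched in R as follows. All dates and variable names here are hypothetical, invented only for illustration; the sketch assumes follow-up ends at the earliest of the event, the last contact, or the administrative study end.

```r
# Hypothetical calendar dates for four patients (illustrative, not from the slides).
enrolled   <- as.Date(c("2001-01-15", "2001-06-01", "2002-03-10", "2002-09-20"))
event_date <- as.Date(c("2002-02-01", NA, "2004-01-05", NA))  # NA = no event recorded
last_seen  <- as.Date(c("2002-02-01", "2003-12-31", "2004-01-05", "2003-05-01"))
study_end  <- as.Date("2003-12-31")

# Follow-up stops at the earliest of: event, last contact, or study end.
# Dates are converted to day counts so that pmin() works on plain numbers.
end_day <- pmin(as.numeric(event_date), as.numeric(last_seen),
                as.numeric(study_end), na.rm = TRUE)

# Shift each patient's clock so that time 0 is their own enrollment date.
time_on_study <- end_day - as.numeric(enrolled)  # days on study

# The event counts only if it was observed on or before the end of follow-up.
event <- as.integer(!is.na(event_date) & as.numeric(event_date) <= end_day)

data.frame(time_on_study, event)
```

In this made-up example the third patient's event falls after the study end, so that patient is administratively censored, and the fourth patient is censored at the last contact (loss to follow-up).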

  17. Another type of censoring: Competing Risks • Patient can have either the event of interest or another event prior to it • Event types ‘compete’ with one another • Example of competing events: • Death from lung cancer • Death from heart disease • Common issue not commonly addressed, but gaining more recognition

  18. Left Censoring • The event has occurred prior to the start of the study • OR the true survival time is less than the person’s observed survival time • We know the event occurred, but are unsure when prior to observation • In this kind of study, exact time would be known if it occurred after the study started • Example: • Survey question: when did you first smoke? • Alzheimer’s disease: onset generally hard to determine • HPV: infection time

  19. Interval censoring • Due to discrete observation times, actual times not observed • Example: progression-free survival • Progression of cancer defined by change in tumor size • Measure in 3-6 month intervals • If increase occurs, it is known to be within interval, but not exactly when. • Times are biased to longer values • Challenging issue when intervals are long

  20. Key components • Event: must have clear definition of what constitutes the ‘event’ • Death • Disease • Recurrence • Response • Need to know when the clock starts • Age at event? • Time from study initiation? • Time from randomization? • Time since response? • Can event occur more than once?

  21. Time to event outcomes • Modeled using “survival analysis” • Define T = time to event • T is a random variable • Realizations of T are denoted t • T ≥ 0 • Key characterizing functions: • Survival function • Hazard rate (or function)

  22. Survival Function • S(t) = The probability of an individual surviving to time t • Basic properties • Monotonic non-increasing • S(0)=1 • S(∞)=0* * debatable: cure-rate distributions allow plateau at some other value

  23. Example: exponential

  24. Weibull example
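The plots for the exponential and Weibull slides did not survive in this transcript. A minimal R sketch of the two survival functions follows, assuming the parametrization S(t) = exp(-λt) for the exponential and S(t) = exp(-λt^α) for the Weibull; the parameter values and function names are illustrative, not taken from the slides.

```r
# Survival functions under the assumed parametrizations.
surv_exponential <- function(t, lambda) exp(-lambda * t)
surv_weibull     <- function(t, lambda, alpha) exp(-lambda * t^alpha)

times <- seq(0, 40, by = 5)
s_exp <- surv_exponential(times, lambda = 0.05)
s_wei <- surv_weibull(times, lambda = 0.05, alpha = 1.5)

# Cross-check against base R's distribution functions.
# pweibull() uses shape/scale, where scale = lambda^(-1 / alpha).
s_exp_check <- pexp(times, rate = 0.05, lower.tail = FALSE)
s_wei_check <- pweibull(times, shape = 1.5, scale = 0.05^(-1 / 1.5),
                        lower.tail = FALSE)

round(cbind(times, s_exp, s_wei), 4)

# To draw the curves:
# plot(times, s_exp, type = "l", xlab = "t", ylab = "S(t)")
# lines(times, s_wei, lty = 2)
```

Both curves start at S(0) = 1 and are non-increasing, matching the basic properties listed on slide 22. With α = 1 the Weibull reduces to the exponential.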

  25. Applied example • Van Spall, H. G. C., A. Chong, et al. (2007). "Inpatient smoking-cessation counseling and all-cause mortality in patients with acute myocardial infarction." American Heart Journal 154(2): 213-220. • Background: Smoking cessation is associated with improved health outcomes, but the prevalence, predictors, and mortality benefit of inpatient smoking-cessation counseling after acute myocardial infarction (AMI) have not been described in detail. • Methods: The study was a retrospective, cohort analysis of a population-based clinical AMI database involving 9041 inpatients discharged from 83 hospital corporations in Ontario, Canada. The prevalence and predictors of inpatient smoking-cessation counseling were determined. • Results….. • Conclusions: Post-MI inpatient smoking-cessation counseling is an underused intervention, but is independently associated with a significant mortality benefit. Given the minimal cost and potential benefit of inpatient counseling, we recommend that it receive greater emphasis as a routine part of post-MI management.

  26. Applied example Adjusted 1-year survival curves of counseled smokers, noncounseled smokers, and never-smokers admitted with AMI (N = 3511). Survival curves have been adjusted for age, income quintile, Killip class, systolic blood pressure, heart rate, creatinine level, cardiac arrest, ST-segment deviation or elevated cardiac biomarkers, history of CHF; specialty of admitting physician; size of hospital of admission; hospital clustering; inhospital administration of aspirin and β-blockers; reperfusion during index hospitalization; and discharge medications.

  27. Hazard Function • A little harder to conceptualize • Instantaneous failure rate or conditional failure rate • Interpretation: h(t)Δt is the approximate probability that a person still event-free at time t experiences the event in the next short interval Δt • Only constraint: h(t) ≥ 0 • For continuous time,
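The formula that closed this slide is missing from the transcript. The standard continuous-time definition is given below as a likely reconstruction, not a verbatim copy of the slide; f(t) denotes the density of T, a symbol the slides do not introduce.

```latex
h(t) = \lim_{\Delta t \to 0}
       \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \ln S(t)
```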

  28. Hazard Function • Useful for conceptualizing how chance of event changes over time • That is, consider hazard ‘relative’ over time • Examples: • Treatment related mortality • Early on, high risk of death • Later on, risk of death decreases • Aging • Early on, low risk of death • Later on, higher risk of death

  29. Shapes of hazard functions • Increasing • Natural aging and wear • Decreasing • Early failures due to device or transplant failures • Bathtub • Populations followed from birth • Hump-shaped • Initial risk of event, followed by decreasing chance of event

  30. Examples
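The example plots are not in the transcript. As a hedged illustration, the Weibull hazard h(t) = λαt^(α-1) (assuming the parametrization S(t) = exp(-λt^α)) covers the increasing, constant, and decreasing shapes; bathtub and hump shapes need other distributions and are not shown. Parameter values are illustrative only.

```r
# Weibull hazard under the assumed parametrization S(t) = exp(-lambda * t^alpha).
hazard_weibull <- function(t, lambda, alpha) lambda * alpha * t^(alpha - 1)

times <- seq(1, 10, by = 1)

h_increasing <- hazard_weibull(times, lambda = 0.05, alpha = 2)    # alpha > 1
h_constant   <- hazard_weibull(times, lambda = 0.05, alpha = 1)    # exponential case
h_decreasing <- hazard_weibull(times, lambda = 0.05, alpha = 0.5)  # alpha < 1

round(cbind(times, h_increasing, h_constant, h_decreasing), 4)

# To draw the three shapes on one plot:
# matplot(times, cbind(h_increasing, h_constant, h_decreasing),
#         type = "l", xlab = "t", ylab = "h(t)")
```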

  31. Median • Very/most common way to express the ‘center’ of the distribution • Rarely see another quantile expressed • Find t such that S(t) = 0.5 • Complication: in some applications, median is not reached empirically • A median reported from a model is then an extrapolation • Often just state ‘median not reached’ and give an alternative point estimate.

  32. X-year survival rate • Many applications have ‘landmark’ times that have historically been used to quantify survival • Examples: • Breast cancer: 5 year relapse-free survival • Pancreatic cancer: 6 month survival • Acute myeloid leukemia (AML): 12 month relapse-free survival • Solve for S(t) given t
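The two summaries run in opposite directions: the median solves S(t) = 0.5 for t, while an x-year survival rate evaluates S(t) at a given t. A small R sketch under an assumed exponential model follows; the rate of 0.1 per year and all names are hypothetical.

```r
# Assumed model for illustration: exponential survival with a hypothetical rate.
surv_exponential <- function(t, lambda) exp(-lambda * t)
lambda <- 0.1  # events per person-year (made-up value)

# Median: find t such that S(t) = 0.5.
median_closed_form <- log(2) / lambda
median_numeric <- uniroot(function(t) surv_exponential(t, lambda) - 0.5,
                          interval = c(0, 100), tol = 1e-8)$root

# Landmark: evaluate S(t) at a fixed time, here 5 years.
five_year_survival <- surv_exponential(5, lambda)

c(median_closed_form = median_closed_form,
  median_numeric = median_numeric,
  five_year_survival = five_year_survival)
```

The numerical root-finding step is included because most survival models have no closed-form median; for the exponential the two answers should agree.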

  33. Competing Risks • Used to be somewhat ignored. • Not so much anymore • Idea: • Each subject can fail due to one of K causes (K>1) • Occurrence of one event precludes us from observing the other event. • Usually, quantity of interest is the cause-specific hazard • Overall hazard equals sum of the cause-specific hazards: h(t) = h_1(t) + h_2(t) + … + h_K(t)

  34. Example • Myeloablative Allogeneic Bone Marrow Transplant Using T Cell Depleted Allografts Followed by Post-Transplant GM-CSF in High Risk Myelodysplastic Syndromes • Interest is in RELAPSE • Need to account for treatment related mortality (TRM)? • Should we censor TRM? • No. That would make things look more optimistic • Should we exclude them? • No. That would also bias the results • Solution: • Treat it as a competing risk • Estimate the incidence of both

  35. Estimating the Survival Function • Most common approach abandons parametric assumptions • Why? • Not one ‘catch-all’ distribution • No central limit theorem for large samples

  36. Censoring • Assumption: • Potential censoring time is unrelated to the potential event time • Reasonable? • Estimation approaches are biased when this is violated • Violation examples • Sick patients tend to miss clinical visits more often • High school drop-out: kids who move may be more likely to drop out.

  37. Terminology • D distinct event times • t1 < t2 < t3 < … < tD • ties allowed • at time ti, there are di deaths • Yi is the number of individuals at risk at ti • Yi is all the people whose observed (event or censoring) times are ≥ ti • di/Yi is an estimate of the conditional probability of an event at ti, given survival to just before ti

  38. Kaplan-Meier estimation • AKA ‘product-limit’ estimator • Step-function • Size of steps depends on • Number of events at t • Pattern of censoring before t
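The slide names the estimator without writing it out. In the notation of slide 37 (event times t_i, deaths d_i, number at risk Y_i), the standard product-limit form is:

```latex
\hat{S}(t) =
\begin{cases}
  1, & t < t_1 \\[4pt]
  \displaystyle\prod_{t_i \le t} \left( 1 - \frac{d_i}{Y_i} \right), & t \ge t_1
\end{cases}
```

The estimate drops only at event times. Censored observations change it indirectly, by shrinking Y_i at later event times, which is why the step sizes depend on the pattern of censoring.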

  39. Kaplan-Meier estimation • Greenwood’s formula • Most common variance estimator • Point-wise
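Greenwood's formula is not shown in the transcript. Its usual form, again in the notation of slide 37, is:

```latex
\widehat{\mathrm{Var}}\left[ \hat{S}(t) \right]
  = \hat{S}(t)^{2} \sum_{t_i \le t} \frac{d_i}{Y_i \,(Y_i - d_i)}
```

It is a point-wise estimate: it gives the variance of the estimated survival probability at a single time t, not a simultaneous band for the whole curve.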

  40. Example: • Kim paper • Event = relapse (outcome is time to relapse) • Data (months; + marks a censored time): • 10, 20+, 35, 40+, 50+, 55, 70+, 71+, 80, 90+
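A by-hand version of the Kaplan-Meier calculation for these ten observations, written in base R as a sketch. The variable names are illustrative, and the standard error uses the usual Greenwood form.

```r
# Times in months; event = 1 for relapse, 0 for a censored ("+") time.
time  <- c(10, 20, 35, 40, 50, 55, 70, 71, 80, 90)
event <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 0)

# Distinct event times t_i.
event_times <- sort(unique(time[event == 1]))

# Y_i: number still at risk (observed time >= t_i); d_i: events at t_i.
at_risk  <- sapply(event_times, function(ti) sum(time >= ti))
n_events <- sapply(event_times, function(ti) sum(time == ti & event == 1))

# Product-limit estimate: cumulative product of (1 - d_i / Y_i).
surv <- cumprod(1 - n_events / at_risk)

# Greenwood standard error at each event time.
greenwood_se <- surv * sqrt(cumsum(n_events / (at_risk * (at_risk - n_events))))

km_table <- data.frame(time = event_times, at_risk, n_events, surv, greenwood_se)
km_table
```

With 10, 8, 5, and 2 patients at risk at months 10, 35, 55, and 80, the estimate steps down to 0.9, 0.7875, 0.63, and 0.315. The last step is the largest because only two patients remain at risk.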

  41. Plot it:

  42. Interpreting S(t) • General philosophy: bad to extrapolate • In survival: bad to put a lot of stock in estimates at late time points

  43. Fernandes et al.: “A Prospective Follow Up of Alcohol Septal Ablation For Symptomatic Hypertrophic Obstructive Cardiomyopathy: The Ten-Year Baylor and MUSC Experience (1996-2007)”

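  44. R for KM
      library(survival)          # provides Surv() and survfit()
      library(help = survival)   # lists the package contents
      t <- c(10, 20, 35, 40, 50, 55, 70, 71, 80, 90)  # follow-up times (months)
      d <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 0)            # 1 = relapse, 0 = censored
      cbind(t, d)
      st <- Surv(t, d)           # censored times print with a trailing "+"
      st
      help(survfit)
      fit.km <- survfit(st ~ 1)  # current versions of survfit() require a formula; ~ 1 fits one curve
      fit.km
      summary(fit.km)
      attributes(fit.km)
      plot(fit.km, conf.int = FALSE, xlab = "time to relapse (months)",
           ylab = "Survival Function", lwd = 2)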

  45. Kaplan-Meier Curve
