Statistics 262: Intermediate Biostatistics - PowerPoint PPT Presentation

Statistics 262 intermediate biostatistics l.jpg
1 / 15

  • Updated On :
  • Presentation posted in: General

Statistics 262: Intermediate Biostatistics. May 18, 2004: Cox Regression III: residuals and diagnostics, repeated events. Jonathan Taylor and Kristin Cobb. Residuals. Residuals are used to investigate the lack of fit of a model to a given subject.

Related searches for Statistics 262: Intermediate Biostatistics

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Statistics 262: Intermediate Biostatistics

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Statistics 262 intermediate biostatistics l.jpg

Statistics 262: Intermediate Biostatistics

May 18, 2004: Cox Regression III: residuals and diagnostics, repeated events

Jonathan Taylor and Kristin Cobb

Satistics 262

Residuals l.jpg


  • Residuals are used to investigate the lack of fit of a model to a given subject.

  • For Cox regression, there’s no easy analog to the usual “observed minus predicted” residual of linear regression

Satistics 262

Deviance residuals l.jpg

Deviance Residuals

  • Deviance residuals are based on martingale residuals: ci (1 if event, 0 if censored) minus the estimated cumulative hazard to ti (as a function of fitted model) for individual i:


    See Hosmer and Lemeshow for more discussion…

Satistics 262

Deviance residuals4 l.jpg

Deviance Residuals

  • Behave like residuals from ordinary linear regression

  • Should be symmetrically distributed around 0 and have standard deviation of 1.0.

  • Negative for observations with longer than expected observed survival times.

  • Plot deviance residuals against covariates to look for unusual patterns.

Satistics 262

Deviance residuals5 l.jpg

Deviance Residuals

  • In SAS, option on the output statement:

    Ouput out=outdata resdev=

Satistics 262

Schoenfeld residuals l.jpg

Schoenfeld residuals

  • Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages

    • Schoenfeld D. Residuals for the proportional hazards regresssion model. Biometrika, 1982, 69(1):239-241.

  • Instead of a single residual for each individual, there is a separate residual for each individual for each covariate

  • Based on the individual contributions to the derivative of the log partial likelihood (see chapter 6 in Hosmer and Lemeshow for more math details, p.198-199)

  • Note: Schoenfeld residuals are not defined for censored individuals.

Satistics 262

Schoenfeld residuals7 l.jpg

Schoenfeld residuals

Where K is the covariate of interest,

the Schoenfeld residual is the covariate-value, Xik, for the person (i) who actually died at time ti minus the expected value of the covariate for the risk set at ti (=a weighted-average of the covariate, weighted by each individual’s likelihood of dying at ti).

Plot Schoenfeld residuals against time to evaluate PH assumption

Satistics 262

Schoenfeld residuals8 l.jpg

Schoenfeld residuals


option on the output statement:


Satistics 262

Influence diagnostics l.jpg

Influence diagnostics

  • How would the result change if a particular observation is removed from the analysis?

Satistics 262

Influence statistics l.jpg

Influence statistics

  • Likelihood displacement (ld): measures influence of removing one individual on the model as a whole. What’s the change in the likelihood when this individual is omitted?

  • DFBETA-how much each coefficient will change by removal of a single observation

    • negative DFBETA indicates coefficient increases when the observation is removed

Satistics 262

Influence statistics11 l.jpg

Influence statistics


option on the output statement:

ld= dfbeta=

Satistics 262

Slide12 l.jpg

What about repeated events?

  • Death (presumably) can only happen once, but many outcomes could happen twice…

    • Fractures

    • Heart attacks

    • Pregnancy


Satistics 262

Slide13 l.jpg

Repeated events: 1

  • Strategy 1: run a second Cox regression (among those who had a first event) starting with first event time as the origin

  • Repeat for third, fourth, fifth, events, etc.

    • Problems: increasingly smaller and smaller sample sizes.

Satistics 262

Slide14 l.jpg

Repeated events:Strategy 2

  • Treat each interval as a distinct observation, such that someone who had 3 events, for example, gives 3 observations to the dataset

    • Major problem: dependence between the same individual

Satistics 262

Slide15 l.jpg

Strategy 3

  • Stratify by individual (“fixed effects partial likelihood”)

  • In PROC PHREG: strata id;

    • Problems:

    • does not work well with RCT data, however

    • requires that most individuals have at least 2 events

    • Can only estimate coefficients for those covariates that vary across successive spells for each individual; this excludes constant personal characteristics such as age, education, gender, ethnicity, genotype

Satistics 262

  • Login