Using martingale residuals to assess goodness of fit for sampled risk set data

Ørnulf Borgan, Department of Mathematics, University of Oslo. Based on joint work with Bryan Langholz.

Presentation Transcript


  1. Using martingale residuals to assess goodness of fit for sampled risk set data. Ørnulf Borgan, Department of Mathematics, University of Oslo. Based on joint work with Bryan Langholz.

  2. Outline: • Example: Uranium miners cohort • Cohort model, data and martingale residuals • Risk set sampling • Martingale residuals and goodness-of-fit tests for sampled risk set data • Concluding remarks

  3. Uranium miners cohort (e.g. Langholz & Goldstein, 1996): • 3347 uranium miners from the Colorado Plateau included in the study cohort 1950-60 • Followed up until the end of 1982 • 258 lung cancer deaths • Interested in the effect of radon and smoking exposure on the risk of lung cancer death • Have exposure information for the full cohort. Will sample from the risk sets for illustration

  4. Relative risk regression models. The hazard rate for individual i is the product of a relative risk and a baseline hazard. The relative risk for individual i depends on covariates x_i1, x_i2, …, x_ip (possibly time-dependent). Two choices of relative risk function are considered: Cox and excess relative risk.
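The formulas on this slide did not survive in the transcript. A possible reconstruction in standard notation is sketched below. The symbols are assumptions, not taken from the slide: \beta is the vector of regression coefficients, \alpha_0(t) the baseline hazard and c(\beta, x_i(t)) the relative risk function. The excess relative risk form shown is one common choice and may differ from the one used in the talk.

```latex
% Hazard rate = relative risk x baseline hazard
\alpha_i(t) = c\bigl(\beta, x_i(t)\bigr)\,\alpha_0(t)

% Cox (log-linear) relative risk
c\bigl(\beta, x_i(t)\bigr) = \exp\bigl(\beta_1 x_{i1}(t) + \dots + \beta_p x_{ip}(t)\bigr)

% Excess relative risk (one common multiplicative form)
c\bigl(\beta, x_i(t)\bigr) = \prod_{j=1}^{p} \bigl(1 + \beta_j x_{ij}(t)\bigr)
```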

  5. Cohort data. [Figure: individuals at risk followed over study time; arrows are censored observations]

  6. t_1 < t_2 < t_3 < … are the times of failures; i_j is the individual failing at t_j (the "case"). Each individual i has a counting process, with intensity process λ_i(t).

  7. The intensity process is the product of an at-risk indicator and the hazard rate. From it one defines the cumulative intensity processes, the martingales and the martingale residual processes.
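The defining formulas are missing from the transcript. A sketch of the usual counting process definitions follows, assuming no tied failure times. The notation is an assumption: Y_i(t) is the at-risk indicator, c(\beta, x_i(t)) the relative risk, \alpha_0(t) the baseline hazard, and \hat A_0 a Breslow-type estimator of the cumulative baseline hazard.

```latex
% Counting process: has individual i failed by time t?
N_i(t) = \sum_{j} I\{t_j \le t,\; i_j = i\}

% Intensity = at-risk indicator x hazard rate
\lambda_i(t) = Y_i(t)\,\alpha_i(t) = Y_i(t)\,c\bigl(\beta, x_i(t)\bigr)\,\alpha_0(t)

% Cumulative intensity and martingale
\Lambda_i(t) = \int_0^t \lambda_i(u)\,du, \qquad M_i(t) = N_i(t) - \Lambda_i(t)

% Martingale residual: plug in estimates
\hat M_i(t) = N_i(t) - \int_0^t Y_i(u)\,c\bigl(\hat\beta, x_i(u)\bigr)\,d\hat A_0(u),
\qquad
\hat A_0(t) = \sum_{t_j \le t} \frac{1}{\sum_{l} Y_l(t_j)\,c\bigl(\hat\beta, x_l(t_j)\bigr)}
```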

  8. Martingale residual processes may be used to assess goodness of fit: • Plot individual martingale residuals versus covariates (Therneau, Grambsch & Fleming, 1990) • Plot grouped martingale residual processes versus time (Aalen, 1993; Grønnesby & Borgan, 1996) The latter may be extended to sampled risk set data

  9. Risk set sampling • Cohort studies need information on covariates for all individuals at risk • Expensive to collect and check (!) this information for all individuals in large cohorts • With risk set sampling designs, covariate information is needed only for the cases and a few controls sampled at each failure time

  10. If a case occurs at time t, select m − 1 controls among the n(t) − 1 non-failures at risk, i.e. match on study time. [Figure: illustration for m = 2, marking each case and its control]

  11. A sampled risk set consists of the case i_j and its controls. A sampling design for the controls is described by its sampling distribution. A number of sampling designs are available. The classical nested case-control design: if individual i fails at time t, the probability of selecting the set r as the sampled risk set is 1 / C(n(t) − 1, m − 1), where C(a, b) is the binomial coefficient (we assume that r is a subset of the risk set, that r is of size m and that i is in r).
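The classical design amounts to simple random sampling of controls without replacement, which can be sketched as follows. This is a minimal illustration, not code from the talk: the function names, the individual labels and the example risk set are all hypothetical.

```python
import random
from itertools import combinations
from math import comb


def sample_risk_set(case, risk_set, m, rng):
    """Return the case plus m - 1 controls drawn without replacement
    from the other individuals at risk (classical nested case-control)."""
    non_failures = [i for i in risk_set if i != case]
    controls = rng.sample(non_failures, m - 1)
    return frozenset([case, *controls])


def selection_probability(r, case, risk_set, m):
    """Probability of selecting the set r given that `case` fails.
    Zero unless r is a size-m subset of the risk set containing the case."""
    risk = set(risk_set)
    if case not in r or len(r) != m or not set(r) <= risk:
        return 0.0
    return 1.0 / comb(len(risk) - 1, m - 1)


rng = random.Random(0)        # fixed seed, for reproducibility only
risk_set = [1, 2, 3, 4, 5]    # hypothetical individuals at risk at time t
case = 3                      # hypothetical failing individual
sampled = sample_risk_set(case, risk_set, m=2, rng=rng)

# The probabilities of all eligible sets add up to one.
total = sum(
    selection_probability(frozenset(s), case, risk_set, 2)
    for s in combinations(risk_set, 2)
)
```

With five individuals at risk and m = 2, each of the four eligible sets has probability 1/4.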

  12. Inference on the regression coefficients can be based on the partial likelihood. The partial likelihood enjoys the usual likelihood properties (Borgan, Goldstein & Langholz, 1995). For the classical nested case-control design, the sampling probabilities cancel and the partial likelihood simplifies.
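The likelihood formulas are not preserved in the transcript. The form usually given for sampled risk set data is sketched below. The notation is an assumption: R_j is the sampled risk set at t_j, \pi(r \mid t, i) is the sampling distribution of the design, and c(\beta, x) is the relative risk function.

```latex
% Partial likelihood for a general sampling design
L(\beta) = \prod_{t_j}
  \frac{c\bigl(\beta, x_{i_j}(t_j)\bigr)\,\pi(R_j \mid t_j, i_j)}
       {\sum_{l \in R_j} c\bigl(\beta, x_l(t_j)\bigr)\,\pi(R_j \mid t_j, l)}

% Classical nested case-control: \pi(R_j \mid t_j, l) is the same for
% every l in R_j, so it cancels
L(\beta) = \prod_{t_j}
  \frac{c\bigl(\beta, x_{i_j}(t_j)\bigr)}
       {\sum_{l \in R_j} c\bigl(\beta, x_l(t_j)\bigr)}
```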

  13. Martingale residuals and goodness-of-fit tests for sampled risk set data. Counting processes are introduced for the sampled risk set data, together with their intensity processes.

  14. From these one obtains the corresponding martingales and martingale residual processes. These are of little practical use on their own, but they may be aggregated over groups of individuals to produce useful plots.
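The formulas for these processes are missing from the transcript. The following is a tentative reconstruction in the style of the counting process formulation of nested case-control studies, not the slide's own notation. It assumes that N_{(i,r)}(t) counts failures of individual i with r selected as the sampled risk set, that \pi(r \mid t, i) is the sampling distribution, and that N_r(t) = \sum_{l \in r} N_{(l,r)}(t).

```latex
% Counting process for individual i failing with sampled risk set r
N_{(i,r)}(t) = \sum_{j} I\{t_j \le t,\; (i_j, R_j) = (i, r)\}

% Intensity: cohort intensity times the sampling probability
\lambda_{(i,r)}(t) = Y_i(t)\,c\bigl(\beta, x_i(t)\bigr)\,\alpha_0(t)\,\pi(r \mid t, i)

% Martingale
M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t \lambda_{(i,r)}(u)\,du

% Martingale residual process
\hat M_{(i,r)}(t) = N_{(i,r)}(t) - \int_0^t
  \frac{c\bigl(\hat\beta, x_i(u)\bigr)\,\pi(r \mid u, i)}
       {\sum_{l \in r} c\bigl(\hat\beta, x_l(u)\bigr)\,\pi(r \mid u, l)}\,dN_r(u)
```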

  15. For group g, the grouped martingale residual process may be interpreted as the "observed − expected" number of failures in group g. It simplifies for the classical nested case-control design. Its asymptotic distribution may be derived using counting process methods.
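For the classical nested case-control design, the "observed − expected" reading can be sketched in code. At each failure time, the expected contribution of a group is taken here to be its share of the total relative risk in the sampled risk set; this is an assumed simplified form, since the slide's formula is not preserved. The data layout, the function name and the example values are hypothetical, and the relative risks stand in for fitted values.

```python
def grouped_martingale_residuals(sampled_risk_sets):
    """Cumulative observed-minus-expected failures per group.

    sampled_risk_sets: list of (time, case_id, members), sorted by time,
    where members maps individual id -> (relative risk, group label),
    both evaluated at the failure time.
    Returns {group: [(time, cumulative residual), ...]}.
    """
    groups = sorted({g for _, _, members in sampled_risk_sets
                     for _, g in members.values()})
    running = {g: 0.0 for g in groups}
    paths = {g: [] for g in groups}
    for time, case_id, members in sampled_risk_sets:
        total = sum(rr for rr, _ in members.values())
        case_group = members[case_id][1]
        for g in groups:
            observed = 1.0 if g == case_group else 0.0
            # Expected: the group's share of relative risk in the sampled set.
            expected = sum(rr for rr, grp in members.values() if grp == g) / total
            running[g] += observed - expected
            paths[g].append((time, running[g]))
    return paths


# Hypothetical data: two failure times, one control per case (m = 2).
data = [
    (50.0, "a", {"a": (2.0, "high"), "b": (1.0, "low")}),
    (55.0, "c", {"c": (1.0, "low"), "d": (3.0, "high")}),
]
paths = grouped_martingale_residuals(data)
```

Because every failure is counted in exactly one group and the expected shares add up to one, the residuals summed over all groups are zero at every failure time.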

  16. Illustration: uranium miners cohort. Fit an excess relative risk model with x_i1 = cumulative radon (100 WLMs) and x_i2 = cumulative smoking (1000 packs), using classical nested case-control sampling with three controls per case.

  17. Aggregate martingale residual processes in three groups according to cumulative radon exposure: • I: < 500 WLMs • II: 500-1500 WLMs • III: > 1500 WLMs There are indications of an interaction between cumulative radon exposure and age

  18. Observed and expected number of failures in the groups for ages below and above 60 years. The chi-squared statistic with 2(3 − 1) = 4 df takes the value 10.5 (P-value 3.2%)
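The reported P-value can be checked from the statistic alone. The sketch below uses the closed-form upper tail of the chi-squared distribution, which holds for even degrees of freedom; the function name is hypothetical. Since the statistic is given rounded to 10.5, the result agrees with the reported 3.2% only up to rounding.

```python
from math import exp, factorial


def chi2_sf_even_df(x, df):
    """Upper tail probability P(X > x) for a chi-squared variable with an
    even number of degrees of freedom, using the closed form
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Statistic 10.5 on 2 * (3 - 1) = 4 degrees of freedom, as on the slide.
p_value = chi2_sf_even_df(10.5, 4)
```

The value comes out at roughly 0.033.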

  19. Concluding remarks. The counting process formulation of nested case-control studies: • Introduces a time aspect that is usually disregarded for sampled risk set data • Gives a similar model formulation as for cohort data and thereby opens the way for similar methodological developments as for cohort studies • Grouped martingale residual processes are one example of this. They allow one to check for time-dependent effects and other deviations from the model

  20. Questions and further developments of grouped martingale residual plots and related goodness-of-fit methods • How should the grouping be performed? • How do specific deviations from the model turn up in the plots? • Kolmogorov-Smirnov and Cramér-von Mises type tests? (Durbin's approximation, Lin et al.'s simulation trick)
