

  1. Marginal Structural Models and Causal Inference in Epidemiology: Longitudinal Data. Robins et al., 2000, Epidemiology 11(5): 550. Hernán et al., 2002, Stat. in Med. 21: 1689.

  2. Set-up of Data • Measurements made at regular times, 0, 1, 2, ..., K. • A(j) is the AZT dose on the jth day. • Y is a dichotomous outcome measured once, at the end of follow-up (day K+1). • Ā(j) = (A(0), A(1), ..., A(j)) is the history of treatment as measured through time j. • L̄(j) = (L(0), L(1), ..., L(j)) is the history of the potential confounders measured through time j, where L(j) is the vector of covariates at time j.
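
For concreteness, a tiny sketch of what one subject's record looks like under this set-up; the values and the choice K = 2 are made up purely for illustration:

## Illustrative only: one subject's record with treatment times 0, 1, 2 and final Y
subj <- data.frame(L0 = 350, A0 = 0,   # covariates, then AZT dose at time 0
                   L1 = 300, A1 = 1,   # covariates, then AZT dose at time 1
                   L2 = 320, A2 = 1,   # covariates, then AZT dose at time 2
                   Y  = 0)             # dichotomous outcome at time K + 1 = 3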

  3. Observed Data • O = (L(0), A(0), L(1), A(1), ..., L(K), A(K), Y), with Y = L(K+1), or equivalently O = (L̄(K), Ā(K), Y). • The temporal ordering implies that the future cannot cause the past: A(j) may be affected by L̄(j) and Ā(j−1), but not by later variables. • Full data: X = {L_ā : ā in A}, i.e., all covariate/outcome processes that would result from all possible treatment histories ā.

  4. The Usual (but more complicated) assumption • If Y_ā denotes the counterfactual outcome under treatment history ā, then the sequential randomization assumption (SRA, i.e., no unmeasured confounders) is: A(j) ⊥ Y_ā | Ā(j−1), L̄(j) for all j = 0, ..., K and all treatment histories ā.

  5. Factorization of Likelihood • SRA implies (defining Y = L(K+1)) that the likelihood of the observed data factorizes as: P(O) = ∏_{j=0..K+1} P(L(j) | L̄(j−1), Ā(j−1)) × ∏_{j=0..K} P(A(j) | Ā(j−1), L̄(j)), where the second product is the treatment mechanism. • ETA (experimental treatment assignment) assumption: every treatment history ā in A has positive probability given the observed past, P(A(j) = a(j) | Ā(j−1), L̄(j)) > 0; that doesn't mean A is all possible imaginable treatment histories (just all possible within the population of interest).

  6. The Counterfactuals of Interest • Must now define counterfactuals with respect to a whole vector of possible treatments: Y_ā is the outcome that would be observed under treatment history ā = (a(0), ..., a(K)). • If A is binary AZT (e.g., yes/no), there are 2^(K+1) possible treatment histories and hence 2^(K+1) counterfactuals, e.g., Y_(0,0,...,0), Y_(1,0,...,0), ..., Y_(1,1,...,1). • A saturated marginal structural model is not practical if K is large and/or the number of possible treatment levels is large.
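
As a quick illustration of how fast the number of regimes grows, the sketch below (not from the papers; K and the object names are arbitrary) enumerates all binary treatment histories for a short follow-up:

## Illustrative only: enumerate all binary treatment histories a(0), ..., a(K)
K <- 3                                         # hypothetical short follow-up
regimes <- expand.grid(rep(list(0:1), K + 1))  # one row per treatment history
names(regimes) <- paste0("a", 0:K)
nrow(regimes)                                  # 2^(K+1) = 16 possible histories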

  7. Defining a more parsimonious (unsaturated) model • Must choose a reasonable model that relates the counterfactual mean to the treatment history. • Examples include: logit P(Y_ā = 1) = β0 + β1 cum(ā), • where cum(ā) = a(0) + a(1) + ... + a(K) is the cumulative treatment (e.g., total number of days on AZT).

  8. Estimating (stabilized) weights for longitudinal treatment for single outcome (Y) at end of study • The stabilized weight for subject i is sw_i = ∏_{j=0..K} P(A(j) = a_i(j) | Ā(j−1) = ā_i(j−1)) / P(A(j) = a_i(j) | Ā(j−1) = ā_i(j−1), L̄(j) = l̄_i(j)). • Need to assume some model (unsaturated) for both the numerator and the denominator. • Only the model for the denominator needs to be correct for the MSM coefficients to be consistent.
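
A minimal sketch of turning fitted treatment probabilities into these weights, assuming a long-format data frame d (one row per subject-visit, sorted by visit within id) with the observed treatment A and fitted probabilities p.num and p.den from the numerator and denominator models; all of these names are hypothetical, not from the papers:

## Illustrative sketch: per-subject stabilized weight for a single end-of-study Y
num <- ifelse(d$A == 1, d$p.num, 1 - d$p.num)   # P(A(j) = observed a(j) | past A)
den <- ifelse(d$A == 1, d$p.den, 1 - d$p.den)   # P(A(j) = observed a(j) | past A, L)
sw  <- tapply(num / den, d$id, prod)            # product over j = 0, ..., K per subject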

  9. Estimation of Weights • Again, assume A(j) is dichotomous. • One possible model for the denominator is a pooled logistic regression, e.g.: logit P(A(j) = 1 | Ā(j−1), L̄(j)) = α0 + α1 A(j−1) + α2 L(j) + α3 L(j−1). • The choice of models can be large, even infinite, so "black box" procedures are convenient.
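
A sketch of fitting such pooled logistic models in R, assuming a long-format data frame d with columns A, A.lag, L, and L.lag (lagged values set to 0 at baseline); the variable names are illustrative only:

## Illustrative pooled logistic models for the treatment mechanism
den.fit <- glm(A ~ A.lag + L + L.lag, family = binomial, data = d)
d$p.den <- predict(den.fit, type = "response")  # fitted P(A(j) = 1 | past A, L)
## numerator model: same idea, dropping the time-varying covariates
num.fit <- glm(A ~ A.lag, family = binomial, data = d)
d$p.num <- predict(num.fit, type = "response")  # fitted P(A(j) = 1 | past A)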

  10. Y continuous, measured repeatedly • Hernán et al., 2002. • Question of interest: the causal effect of zidovudine (AZT) on mean CD4 count among HIV-positive men. • Data come from the Multicenter AIDS Cohort Study (MACS). • Between 1984 and 1991, MACS enrolled 5622 men with no prior AIDS-defining illness; participants returned approximately every 6 months and provided blood samples.

  11. Set-up (Repeated Outcome Measurements) • A_i(j) is the dose of AZT at the jth visit of the ith person (0 = no, 1 = yes). • V_i is a vector of time-independent baseline covariates (age, calendar year, CD4 count, CD8 count, WBC, RBC, platelets, etc.). • L_i(j) = vector of potentially time-varying variables of interest (e.g., confounders, outcomes). • Define the outcome of interest to be Y_i(j), the CD4 count at visit j (a component of L_i(j)); it occurs just before (temporally) the other components of L(j) at time j. • Consider 16 visits after a baseline measurement, so j = 0, 1, 2, ..., 16.

  12. Counterfactual Outcomes • We will assume that once a person is given AZT, he continues on AZT, so a regime is determined by the visit at which AZT is started – 17 possible treatment regimes. • So the counterfactuals of interest are Y_{ā,i}(j), the CD4 count subject i would have at visit j under treatment history ā, for each of these regimes.

  13. Treatment Effects • We define "no treatment effect" in terms of the counterfactual outcomes. • Specifically, if Y_{ā,i}(j) = Y_{ā',i}(j) for all treatment histories ā and ā', all visits j, and all persons i, then there is no causal effect of AZT on CD4. • The causal effect of AZT on CD4 for person i might be defined as Y_{ā,i}(j) − Y_{0,i}(j), where 0 denotes the never-treated history (0, 0, ..., 0).

  14. Confounding assumptions (same as SRA) • No confounding at all if, for all treatment histories ā and all j: A(j) ⊥ {Y_ā(k), k > j}; • that is, treatment assignment is statistically independent of all future counterfactual outcomes. • No unmeasured confounding (or SRA) if: A(j) ⊥ {Y_ā(k), k > j} | Ā(j−1), L̄(j) for all ā and all j.

  15. The Marginal Structural Model • Define the effect of changing the treatment history ā on the counterfactual mean using an assumed model: E[Y_ā(j)] = g(ā(j−1), j; β), where ā(j−1) = (a(0), ..., a(j−1)); that is, the current CD4 count can only be affected by treatment received before it is measured, and β is a vector of parameters.

  16. Specific Marginal Structural Model • In the paper, a model of the following form is used: E[Y_ā(j) | V] = β0(j) + β1 cum(ā(j−1)) + β2'V, where cum(ā(j−1)) = a(0) + ... + a(j−1) is the cumulative number of visits on AZT prior to visit j, and V is the vector of baseline covariates.
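
A sketch of building the cumulative-exposure regressor in R, assuming a long-format data frame d sorted by visit within id with an AZT indicator A at each visit (names are hypothetical); the weighted regression of the CD4 outcome on cumA.lag and the baseline covariates would then follow:

## Illustrative sketch: cumulative AZT exposure before visit j, per subject
d$cumA     <- ave(d$A, d$id, FUN = cumsum)                            # through visit j
d$cumA.lag <- ave(d$cumA, d$id, FUN = function(x) c(0, head(x, -1)))  # through visit j-1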

  17. Interpretation of β1 • So, β1 is the change in the mean CD4 count in a population where AZT exposure is increased by one additional visit.

  18. Estimating (stabilized) weights for longitudinal treatment and repeated Y's • For the outcome measured at time j, Y(j), the stabilized weight is sw_i(j) = ∏_{m=0..j−1} P(A(m) = a_i(m) | Ā(m−1) = ā_i(m−1), V_i) / P(A(m) = a_i(m) | Ā(m−1) = ā_i(m−1), L̄(m) = l̄_i(m)). • Again, need to assume some model (probably unsaturated) for both the numerator and the denominator. • Only the model for the denominator needs to be correct for the MSM coefficients to be consistent.
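
A sketch of computing these visit-specific weights, again assuming a long-format data frame d sorted by visit within id, with fitted probabilities p.num and p.den (hypothetical names) from the numerator and denominator models; the weight attached to Y(j) is the running product through the last treatment time preceding it:

## Illustrative sketch: visit-specific stabilized weights via a running product
num  <- ifelse(d$A == 1, d$p.num, 1 - d$p.num)
den  <- ifelse(d$A == 1, d$p.den, 1 - d$p.den)
d$sw <- ave(num / den, d$id, FUN = cumprod)   # product of ratios up to each visit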

  19. Missing Data • Not uncommonly, data on some time-dependent covariates or outcomes will be missing for some subjects at some times. • Fortunately, missing data can also be handled by weighting. • Combine the "treatment" weights with "missingness" weights.

  20. Missingness Weights • Take the case where the only missing data is Y at the end of the study. • Let C = 1 if Y is not missing, 0 otherwise. • Assumption (CAR, coarsening at random): P(C = 1 | Y, Ā(K), L̄(K)) = P(C = 1 | Ā(K), L̄(K)), i.e., C ⊥ Y | Ā(K), L̄(K). • In words, whether the outcome is missing is independent of the outcome itself, given what one observes about the past treatment and covariate history. • Given the CAR assumption, consistency can be recovered with another weight.

  21. Stabilized Missingness Weights (assuming everyone's baseline measurements are observed) • The total weight for time j is the treatment weight times a stabilized missingness weight: sw_i(j) × sw^C_i, where sw^C_i = P(C = 1 | Ā = ā_i) / P(C = 1 | Ā = ā_i, L̄ = l̄_i). • The regression is performed only on the non-missing observations, with these weights.

  22. Estimating Stabilized Missingness Weights • Need a model for the denominator P(C = 1 | Ā, L̄) and for the numerator P(C = 1 | Ā), e.g., logistic regressions of C on the observed treatment and covariate history.
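
A minimal sketch of estimating such missingness weights in R, assuming one row per subject with a missingness indicator C, observed history summarized by A and L, and a treatment weight sw already computed (all names here are hypothetical, not from the papers):

## Illustrative sketch: stabilized missingness weights, combined with treatment weights
c.den <- glm(C ~ A + L, family = binomial, data = d)   # denominator: P(C = 1 | A, L)
c.num <- glm(C ~ A,     family = binomial, data = d)   # numerator:   P(C = 1 | A)
d$swC  <- predict(c.num, type = "response") / predict(c.den, type = "response")
d$wtot <- d$sw * d$swC                                 # total weight used in the MSM fit
## the outcome regression then uses only rows with C == 1, weighted by wtot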

  23. Example in the Hernán Paper

  24. Example in the Hernán Paper

  25. Data (Comma-delimited)

  26. R Code to read in data and convert into (fake) individual-level data

#### Read in the data from a comma-delimited file with header
hernanexam <- read.csv("c:/hubbard/causal2003/datasets/hernanexam.csv", header = T)

## Make long data set
n <- length(hernanexam[, 1])
# fakehernan <- data.frame(A0=NULL, L1=NULL, A1=NULL, meany=NULL, y=NULL, id=NULL)
fakehernan <- NULL
ll <- 0
for (i in 1:n) {
  cat(" i = ", i, "\n")
  for (j in 1:hernanexam[i, "n"]) {
    cat(" j = ", j, "\n")
    ll <- ll + 1
    # generate Y's from Normal(meany, SD)
    fakehernan <- rbind(fakehernan,
      c(hernanexam[i, "A0"], hernanexam[i, "L1"], hernanexam[i, "A1"],
        hernanexam[i, "meany"], rnorm(1, hernanexam[i, "meany"], 0.02), ll))
  }
}
fakehernan <- data.frame(fakehernan)
names(fakehernan) <- c("A0", "L1", "A1", "meany", "y", "id")

  27. Data.frame

   A0 L1 A1 meany         y id
1   1  1  1   100 100.01391  1
2   1  1  1   100  99.99235  2
3   1  1  1   100 100.02064  3
4   1  1  1   100 100.00919  4
5   1  1  1   100 100.02007  5
6   1  1  1   100 100.00790  6
7   1  1  1   100  99.98325  7
8   1  1  1   100  99.99479  8
9   1  1  1   100 100.03135  9
10  1  1  1   100  99.97674 10
11  1  1  1   100  99.99353 11
12  1  1  1   100  99.99125 12
13  1  1  1   100  99.99596 13
14  1  1  1   100 100.00671 14
15  1  1  1   100  99.98297 15
16  1  1  0   100 100.00802 16
17  1  1  0   100  99.95835 17
18  1  1  0   100  99.98374 18
19  1  1  0   100  99.98043 19
20  1  1  0   100  99.99869 20
21  1  1  0   100 100.02029 21

  28. Making Weights • Need to estimate, for each subject, the treatment probabilities P(A0) and P(A1 | A0, L1) for the denominator, and P(A1 | A0) for the stabilized numerator. • Can use glm in R and S-Plus (logit in Stata, PROC GENMOD in SAS).

  29. Making Weights in R

## Weight function
## P(A0)
a0glm <- glm(factor(A0) ~ 1, family = binomial, data = fakehernan)
predict.a0 <- predict.glm(a0glm, type = "response")
attach(fakehernan)
predict.a0 <- (1 - A0) * (1 - predict.a0) + A0 * predict.a0

## P(A1 | A0, L1)
a1glm <- glm(A1 ~ A0 * L1, family = binomial, data = fakehernan)
predict.a1 <- predict.glm(a1glm, type = "response")
predict.a1 <- (1 - A1) * (1 - predict.a1) + A1 * predict.a1

## P(A1 | A0)
a1glm.sht <- glm(A1 ~ A0, family = binomial, data = fakehernan)
predict.a1.sht <- predict.glm(a1glm.sht, type = "response")
predict.a1.sht <- (1 - A1) * (1 - predict.a1.sht) + A1 * predict.a1.sht

## Unstabilized
wght <- 1 / (predict.a1 * predict.a0)
## Stabilized
swght <- (predict.a0 * predict.a1.sht) / (predict.a0 * predict.a1)
detach(2)

  30. Making Weights in R

    A0 A1 L1      wght     swght
1    1  1  1  4.000000 1.5599993
16   1  0  1  4.000000 0.4400007
31   1  1  0  2.222268 0.8666841
94   1  0  0 19.996292 2.1995957
101  0  1  1  4.000000 1.7199998
106  0  0  1  4.000000 0.2800002
111  0  1  0  2.222268 0.9555751
192  0  0  0 19.996292 1.3997413

  31. Marginal Structural Model in R • MSM: E(Y_ā) = β0 + β1 cum(ā), where cum(ā) = a(0) + a(1).

## Make MSM
attach(fakehernan)
cumA <- A0 + A1
detach(2)
fakehernan <- data.frame(fakehernan, cumA)
fakehernan[1, ]
rm(cumA)

## Simple linear
lm.sim <- lm(y ~ cumA, data = fakehernan)
## Stratified (simple adjustment)
lm.strat <- lm(y ~ cumA + L1, data = fakehernan)
## Non-stabilized weights
lm.iptw <- lm(y ~ cumA, data = fakehernan, weights = wght)
## Stabilized weights
lm.iptw.sw <- lm(y ~ cumA, data = fakehernan, weights = swght)

  32. Results

• Simple (unweighted): summary(lm.sim)

lm(formula = y ~ cumA, data = fakehernan)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   92.5191     0.6877 134.525   <2e-16 ***
cumA          -0.3944     0.4746  -0.831    0.407

• Simple adjusted (unweighted):

lm(formula = y ~ cumA + L1, data = fakehernan)
Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept) 90.002099   0.003328 27047.48   <2e-16 ***
cumA        -0.002915   0.002226    -1.31    0.192
L1          10.000714   0.003328  3005.00   <2e-16 ***

  33. Results, cont.

• Unstabilized weights:

lm(formula = y ~ cumA, data = fakehernan, weights = wght)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  90.9985     0.4847 187.744   <2e-16 ***
cumA          1.0000     0.3957   2.527   0.0123 *

• Stabilized weights:

lm(formula = y ~ cumA, data = fakehernan, weights = swght)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  90.3043     0.6763 133.532  < 2e-16 ***
cumA          1.2839     0.4667   2.751  0.00649 **
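
One caveat: the standard errors reported by lm() for a weighted fit do not account for the weighting, and Robins et al. (2000) recommend a robust ("sandwich") variance, which is conservative for the stabilized-weight estimator. A sketch of one way to get robust standard errors, using the geepack package (not part of the original code) with an independence working correlation and clustering on id:

## Illustrative sketch: same MSM fit with a robust (sandwich) variance
library(geepack)
fakehernan$swght <- swght
msm.gee <- geeglm(y ~ cumA, data = fakehernan, weights = swght,
                  id = id, corstr = "independence")
summary(msm.gee)   # robust SEs for beta0 and beta1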

  34. Homework • Take the data from the website (the data used to make the fakehernan data.frame). • Generate an individual-level data set as I have done. • Add to it non-random missing observations (just Y at the end), letting Y be missing according to a specified (non-random) missingness mechanism. • Estimate the MSM coefficients using stabilized weights that include the missingness weights.

  35. Homework, cont. • Compare the stabilized weighted regression model (including both the treatment and missingness weights) with the simple and simple-adjusted models that just eliminate the missing observations.
