Sukyeong Pi Larry Featherston Employment and Disability Institute Cornell University

www.edi.cornell.edu. Causal Inference Using Observational Data. Sukyeong Pi Larry Featherston Employment and Disability Institute Cornell University Feb. 21, 2009. Agenda. Randomized Controlled Trial Observational Studies Propensity Score Matching Example Limitations of PSM.

Presentation Transcript


  1. www.edi.cornell.edu Causal Inference Using Observational Data Sukyeong Pi Larry Featherston Employment and Disability Institute Cornell University Feb. 21, 2009

  2. Agenda: Randomized Controlled Trial; Observational Studies; Propensity Score Matching; Example; Limitations of PSM

  3. Randomized Controlled Trial (RCT): A research study in which participants are randomly assigned to groups so that different interventions can be compared objectively. The RCT is recognized as a sound scientific method and is the gold standard for making causal inferences and policy decisions. It controls for subject selection bias by minimizing subject differences between groups.

  4. Limitations of RCT. Philosophical/Ethical Issue: random assignment conflicts with the obligation to offer each student optimal treatment. Strategic Issues: requires time and specialized expertise; limited generalizability. Tactical Issues: treatment fidelity and integrity. Logistical Issues: difficulty finding adequate numbers of subjects; expensive, requiring substantial resources.

  5. Advantages of Observational Studies. Address the chief criticism of RCTs: generalizability. Offer advantages in availability, cost, and time. Serve as a rich source of descriptive information. Examine exposure in real life, making policy decisions possible. Large sample sizes permit investigation of exposures with smaller effect sizes.

  6. Observational Studies. Selection Bias: there is no control over group assignment (ignorability of treatment assignment is not guaranteed). Baseline characteristics of the comparison groups differ in ways that affect the outcome, due to observed or unobserved confounders. One approach to reducing this bias in nonrandomized experiments is propensity score matching. [Diagram: tx group with outcome DV A, ctl group with outcome DV B; is the difference between A and B due to treatment?]

  7. Propensity Score Matching. Definition: the propensity score (PS) is the conditional probability (0 to 1) of receiving a given exposure (treatment) given a vector of measured or observed covariates. Assumption of RCT: with two equally sized arms, the probability of being assigned to the treatment group is 0.5. The PS reduces baseline information to a single composite summary of the covariates, which helps minimize differences and improve comparability between two groups in observational research.
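In symbols, the definition on this slide can be written as below. The names $T_i$ (treatment indicator for person $i$) and $X_i$ (that person's covariate vector) are notation introduced here for illustration and do not appear in the original slides.

```latex
\mathrm{PS}(x_i) = \Pr(T_i = 1 \mid X_i = x_i), \qquad 0 \le \mathrm{PS}(x_i) \le 1
% In an RCT with two equally sized arms, assignment ignores the covariates:
\Pr(T_i = 1 \mid X_i = x_i) = \Pr(T_i = 1) = 0.5 \quad \text{for every } x_i
```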

  8. Procedures of Propensity Score Analysis. Step 1: Estimate the propensity for treatment given the covariates using logistic regression and save the predicted value. With linear predictor (logit) $z_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_n X_{ni}$, the Propensity Score is $e^{z_i} / (1 + e^{z_i})$. Step 2: Balance check: compare propensity scores between the Tx and Ctl groups. Step 3: Estimate the effect of treatment on the outcome using the PS: (a) regression model, (b) stratification, (c) matching.
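As a rough illustration of the first step in this procedure, the sketch below fits a logistic regression by plain gradient ascent and converts each fitted logit into a propensity score with the logistic transformation. It is a minimal sketch and not the presenters' code: the function names, the toy data, and the choice of gradient ascent (statistical packages typically use other fitting algorithms) are all assumptions made here.

```python
import math

def sigmoid(z):
    """Logistic transformation: maps a logit z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, n_iter=2000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    X: list of covariate rows (numbers); t: list of 0/1 treatment indicators.
    Returns coefficients [b0, b1, ..., bn] (intercept first).
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, ti in zip(X, t):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            err = ti - sigmoid(z)          # observed minus predicted
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    """Predicted probability of treatment for each covariate row."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            for row in X]

# Hypothetical toy data: two dummy-coded covariates per person.
X = [[0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1],
     [1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
t = [0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0]

beta = fit_logistic(X, t)
ps = propensity_scores(X, beta)
```

In this toy data the first covariate is associated with treatment, so rows with a 1 in that position receive higher propensity scores than rows with a 0.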

  9. EXAMPLE. Research Question: What is the effect of VR services? (LR found the top three services related to a successful VR outcome: On the Job Support, Rh Tech, Job Placement.) Data Source: 2006 RSA 911 data (including consumers closed after an IPE was developed; N=352,138). IVs: Gender, Race/Ethnicity, Level of Education, Work Status at Application, Primary Source of Support, SSI/DI, Type of Disability. Intervention (tx): Types of Services. Outcome: Type of Closure.

  10. Step 0: Data Set-up. Variable selection by cross-tabulation of covariates and type of closure (outcome). Covariates for this example (dummy variables; number of categories in parentheses): Gender (2); Race/Ethnicity (3): White non-Hispanic, African, Others; Education (3): <12 yrs, 12 yrs (incl. SE cert), >12 yrs; Work Status at App (3): Emp wo sup, Other emp, No emp; Source of Support (2): Personal Income, Others; SSI/DI (2): Y/N; Disability (5): Sensory, All Mental with SA, LD/ADHD, MR/Autism, Others.
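Dummy coding of a categorical covariate such as Education might look like the sketch below. The helper name `dummy_code`, the category labels, and the use of the first category as the reference level are assumptions for illustration; the slides do not say which reference categories were used.

```python
def dummy_code(values, categories):
    """Return k-1 dummy (0/1) columns per value.

    categories[0] is treated as the reference level (an assumption here),
    so it is represented by a row of all zeros.
    """
    return [[1 if v == c else 0 for c in categories[1:]] for v in values]

# Hypothetical records for the three-level Education covariate.
education_levels = ["<12 yrs", "12 yrs", ">12 yrs"]
education = ["<12 yrs", "12 yrs", ">12 yrs", "12 yrs"]

education_dummies = dummy_code(education, education_levels)
```

Each person is then described by two 0/1 columns for Education, and the same helper could be applied to the other categorical covariates before they enter the propensity score model.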

  11. Step 1: Propensity Scores. Goal: to include all variables that play a role in the selection process, including interactions and other nonlinear terms, as well as variables that show weak relations to the outcome (e.g., p<.10 or p<.25) (Rosenbaum & Rubin, 1984). “Unless a variable can be excluded because there is a consensus that it is unrelated to outcome or is not a proper covariate, it is advisable to include it in the propensity score model even if it is not statistically significant.” (Rubin & Thomas, 1984) In the example, all variables were included in the PS computation.

  12. Step 1: PS by Stepwise LR

  13. Step 2: Balance Check. Compare the two groups' propensity score distributions using descriptive statistics and t-tests. The box plot illustrates some overlap (a similar band of propensity scores) between the two groups. No overlap would suggest that differences in outcome arise from group differences (selection bias), not from the service effect (e.g., rehab tech services).
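The balance check described on this slide can be sketched with descriptive statistics, a t statistic, and the range of overlapping propensity scores. This is a minimal sketch under assumptions: the propensity scores are made up, the function names are hypothetical, and the Welch (unequal variance) form of the t statistic is a choice made here, since the slides do not say which t-test was used.

```python
import math
from statistics import mean, stdev

def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def common_support(ps_tx, ps_ctl):
    """Range of propensity scores covered by both groups, or None if
    the two distributions do not overlap at all."""
    lo = max(min(ps_tx), min(ps_ctl))
    hi = min(max(ps_tx), max(ps_ctl))
    return (lo, hi) if lo <= hi else None

# Hypothetical propensity scores for the two groups.
ps_tx = [0.35, 0.50, 0.62, 0.70, 0.81]
ps_ctl = [0.10, 0.22, 0.30, 0.45, 0.58]

t_stat = welch_t(ps_tx, ps_ctl)
overlap = common_support(ps_tx, ps_ctl)
```

A large t statistic together with a narrow or empty overlap range would be a warning sign that the groups are too different for the later analysis steps to be trusted.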

  14. Step 2: Check Distribution/Balance

  15. Step 2: Check Distribution/Balance

  16. Step 2: Balance Check

  17. Step 2: Check Distribution/Balance

  18. Step 2: Check Distribution/Balance

  19. Step 3: Analysis with PS Three techniques are commonly used to reduce selection bias and increase precision with PS - Regression (covariance) adjustment - Stratification - Matching

  20. Step 3: Analysis I - Regression. Treat the PS as an additional covariate in a multivariable regression model. As a composite of the confounders, the PS can reduce bias in the estimate of the treatment effect by adjusting for the pattern of observed confounders. The treatment effect estimate appears more efficient when the PS is used as a covariate within strata, after stratification.
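Regression adjustment with the PS as a covariate can be sketched as an ordinary least squares fit of the outcome on an intercept, the treatment indicator, and the PS. This is a simplified sketch: the outcome in the example above is a type of closure, for which a logistic outcome model would be the more natural choice, and the linear model, the helper names, and the toy data here are assumptions made for illustration.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(rows, y):
    """Least squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def ps_adjusted_effect(t, ps, y):
    """Coefficient on treatment from the model y ~ 1 + t + ps."""
    rows = [[1.0, float(ti), pi] for ti, pi in zip(t, ps)]
    return ols(rows, y)[1]

# Hypothetical data built so that the true treatment effect is 2.
t = [1, 0, 1, 0, 1, 0]
ps = [0.8, 0.7, 0.6, 0.4, 0.5, 0.2]
y = [1 + 2 * ti + 3 * pi for ti, pi in zip(t, ps)]

effect = ps_adjusted_effect(t, ps, y)
```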

  21. Step 3: Analysis II - Stratification. A solution to the dimensionality problem in making two groups comparable ($2^k$ subclasses are needed for k binary covariates). Because the PS is a scalar summary of all the observed background covariates, stratifying on it can balance the distributions of the covariates. Five strata based on the PS can remove over 90% of the bias in each of the covariates (Cochran, 1968).
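Stratification on the PS can be sketched as follows: sort cases by PS, cut them into five equally sized strata, take the treated-minus-control difference in mean outcome within each stratum, and average those differences weighted by stratum size. The function name, the weighting scheme, and the toy data are assumptions made here, not details taken from the slides.

```python
def stratified_effect(t, ps, y, n_strata=5):
    """Stratum-size-weighted average of within-stratum differences in
    mean outcome (treated minus control). Strata that lack either
    group are skipped because no comparison is possible there."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    weighted_sum, total_weight = 0.0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        tx = [y[i] for i in idx if t[i] == 1]
        ctl = [y[i] for i in idx if t[i] == 0]
        if not tx or not ctl:
            continue
        diff = sum(tx) / len(tx) - sum(ctl) / len(ctl)
        weighted_sum += diff * len(idx)
        total_weight += len(idx)
    return weighted_sum / total_weight if total_weight else None

# Hypothetical data: treatment is more common at high PS, and the
# outcome rises with PS, so the unadjusted comparison is confounded.
ps = [0.05, 0.10, 0.15, 0.25, 0.30, 0.35, 0.45, 0.50, 0.55,
      0.65, 0.70, 0.75, 0.85, 0.90, 0.95]
t = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
y = [0, 0, 2, 1, 1, 3, 2, 4, 4, 3, 5, 5, 4, 6, 6]  # stratum index + 2 * t

treated = [yi for yi, ti in zip(y, t) if ti == 1]
control = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
adjusted = stratified_effect(t, ps, y)
```

In this constructed data the built-in treatment effect is 2. The unadjusted difference in means overstates it, while the stratified estimate recovers it.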

  22. Step 3: Analysis II - Stratification

  23. Step 3: Analysis II - Stratification

  24. Step 3: Analysis – Stratification (26 closures)

  25. Step 3: Analysis III - Matching • Nearest available matching on the estimated PS • Mahalanobis metric matching including the PS: an equal percent bias reducing technique (bias = mean for the treated minus mean for the control); the PS is added to the other covariates in the calculation of the Mahalanobis distance • Nearest available Mahalanobis metric matching within calipers defined by the PS (caliper width: ¼ of the standard deviation of the propensity score)
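The simplest of these options, nearest available matching on the PS, can be sketched together with the ¼ standard deviation caliper. This sketch matches on the PS alone and does not compute Mahalanobis distances. The greedy one-to-one strategy without replacement, the use of the pooled standard deviation, the function name, and the data are all assumptions made here.

```python
from statistics import stdev

def caliper_match(ps_tx, ps_ctl, caliper_sd=0.25):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Each treated case, in order, takes the closest unused control,
    but only if that control lies within caliper_sd standard
    deviations of the pooled propensity scores.
    Returns a list of (treated_index, control_index) pairs.
    """
    caliper = caliper_sd * stdev(ps_tx + ps_ctl)
    available = set(range(len(ps_ctl)))
    pairs = []
    for i, p in enumerate(ps_tx):
        if not available:
            break
        j = min(sorted(available), key=lambda c: abs(ps_ctl[c] - p))
        if abs(ps_ctl[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)   # match without replacement
    return pairs

# Hypothetical propensity scores.
ps_tx = [0.30, 0.55, 0.90]
ps_ctl = [0.10, 0.32, 0.50, 0.58]

pairs = caliper_match(ps_tx, ps_ctl)
```

Here the third treated case (PS 0.90) has no control within the caliper and stays unmatched, which is how caliper matching discards cases outside the region of overlap.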

  26. Step 3: Analysis III - Matching. Using the PS as the key variable, matching was conducted (cases were matched on the same PS). Matched cases: N=114,790

  27. Interpretation. What do you think? Do you think the PS provides a better basis for making a causal inference?

  28. Limitations of PSM. Adjusts only for observed covariates; there is no control for unobserved ones (e.g., age in this example). The overlap between conditions must be inspected before matching or other techniques: group overlap must be substantial (e.g., rehab tech svcs). Works best with large samples.
