
Propensity Score Analysis A tool for causal inference in non-randomized studies

Summer Statistics Workshop 2010. Felix Thoemmes, Texas A&M University.


Presentation Transcript


  1. Propensity Score Analysis: A tool for causal inference in non-randomized studies. Summer Statistics Workshop 2010. Felix Thoemmes, Texas A&M University

  2. Agenda • Motivating examples • Definition of causal effects with potential outcomes • Definition of propensity scores • Applied example of propensity scores • Hands-on example in R • Advanced Topics

  3. Motivation • Randomized experiment is considered “gold standard” for causal inference • Randomization not always possible • Trials where treatment is offered to community at large • Participants do not permit randomization • Ethical or legal considerations • Naturally occurring phenomena • Broken randomized experiments (attrition, non-compliance, treatment diffusion)

  4. Broken randomized experiments

  5. Motivation • Non-random assignment leads to group imbalances at pretest • Selection bias • Confounding of treatment effects due to imbalanced covariates

  6. Hormone replacement therapy • 1968 “Feminine Forever” • 2002 Women’s Health Initiative trial on hormone replacement therapy

  7. Motivation • Adjustment methods • ANCOVA / Regression Adjustment • Matching • Stratification • Many covariates are needed to control for potentially confounding influences

  8. Motivation • Assumptions of classic ANCOVA model • linearity • no baseline by treatment interactions • Region of common support in multi-dimensional space hard to assess • extrapolation beyond data is sensitive to model adequacy

  9. Motivation – Key Issues • Non-randomized studies are necessary • Many covariates should be assessed to control for confounding influences • High-dimensional regression adjustment has strong assumptions and distributional overlap is hard to check

  10. Defining causal effects • Definition of causal effect is often lacking in applied social science • Parameter estimates from any model (ANOVA, regression, structural equation model) may or may not be causally interpretable

  11. Rubin Causal Model [Figure: TREATMENT vs. CONTROL, unit-level causal effect]

  12. Rubin Causal Model [Figure: TREATMENT vs. CONTROL, average causal effect]

  13. Rubin Causal Model [Figure: TREATMENT vs. CONTROL, estimate of the average causal effect]

  14. Rubin Causal Model [Figure: two panels, each showing a CONTROL and a TREATMENT group] • E(Yi1) = E(Yi1 | zi = 1), E(Yi0) = E(Yi0 | zi = 0) • E(Yi1) ≠ E(Yi1 | zi = 1), E(Yi0) ≠ E(Yi0 | zi = 0)

  15. Rubin Causal Model • τ = 11.33 - 12.83 = -1.5 • τ* = 11.33 - 13.33 = -2.0 • E(Yi1) = E(Yi1 | zi = 1), E(Yi0) = E(Yi0 | zi = 0) • Source: West and Thoemmes (2008)

  16. Rubin Causal Model • τ = 11.33 - 12.83 = -1.5 • τ* = 11.66 - 11.00 = .66 • E(Yi1) ≠ E(Yi1 | zi = 1), E(Yi0) ≠ E(Yi0 | zi = 0) • Source: West and Thoemmes (2008)
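The contrast between the true average causal effect τ and its estimate τ* can be sketched in a few lines. The potential outcomes below are made-up values for six hypothetical units, not the data behind the slides. The sketch averages the observed mean difference over every possible 3-versus-3 random assignment, and compares it with one assignment in which units with high control outcomes select into treatment.

```python
from itertools import combinations

# Hypothetical potential outcomes (illustrative only, not the slide data).
y0 = [10, 12, 10, 14, 8, 16]   # outcome of each unit under control
y1 = [8, 10, 8, 12, 6, 14]     # outcome of each unit under treatment
n = len(y0)

def mean(values):
    return sum(values) / len(values)

# True average causal effect: needs both potential outcomes of every unit.
tau = mean(y1) - mean(y0)

def observed_difference(treated_ids):
    """Mean difference from observed data: treated units reveal y1, controls reveal y0."""
    treated = [y1[i] for i in range(n) if i in treated_ids]
    control = [y0[i] for i in range(n) if i not in treated_ids]
    return mean(treated) - mean(control)

# Average of the estimate over all possible random assignments of 3 units to treatment.
random_estimates = [observed_difference(set(ids)) for ids in combinations(range(n), 3)]
avg_random = mean(random_estimates)

# Non-random assignment: the three units with the highest y0 end up treated.
tau_star_selected = observed_difference({1, 3, 5})

print(tau, avg_random, tau_star_selected)
```

Under randomization the estimate equals τ on average, even though any single assignment can miss it. The selected assignment yields an estimate with the wrong sign.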

  17. Obtaining unbiased estimates • Randomized experiment: E(Yi1) = E(Yi1 | zi = 1), E(Yi0) = E(Yi0 | zi = 0) • Non-randomized experiment: E(Yi1) ≠ E(Yi1 | zi = 1), E(Yi0) ≠ E(Yi0 | zi = 0) • Non-randomized experiment with unconfoundedness assumption: E(Yi1) = Ex{E(Yi1 | zi = 1, x)}, E(Yi0) = Ex{E(Yi0 | zi = 0, x)} • X contains all confounding covariates
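The third case, E(Yi1) = Ex{E(Yi1 | zi = 1, x)}, amounts to computing the treatment-control difference within each level of x and then averaging over the distribution of x. A minimal sketch follows, assuming a single binary confounder and a noise-free outcome rule invented for illustration (y = 10 + 4x - 2z, so the true effect is -2).

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical data: rows are (x, z, y). Units with x = 1 are treated far more often.
data = []
for x, n_treated, n_control in [(0, 2, 6), (1, 6, 2)]:
    for z, count in [(1, n_treated), (0, n_control)]:
        y = 10 + 4 * x - 2 * z          # assumed outcome rule, true effect = -2
        data.extend([(x, z, y)] * count)

def group_mean(z, x=None):
    """Mean outcome in treatment group z, optionally restricted to covariate level x."""
    return mean([y for (xi, zi, y) in data if zi == z and (x is None or xi == x)])

# Unadjusted comparison: confounded by x.
naive = group_mean(1) - group_mean(0)

# Adjusted comparison: within-x differences, averaged over p(x).
adjusted = 0.0
for x in (0, 1):
    p_x = sum(1 for (xi, _, _) in data if xi == x) / len(data)
    adjusted += p_x * (group_mean(1, x) - group_mean(0, x))

print(naive, adjusted)
```

In this constructed example the unadjusted difference is zero, while the adjusted difference recovers the assumed effect. This works only because x is the sole confounder here, which is exactly what the unconfoundedness assumption requires.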

  18. Randomization • Randomized experiment is gold standard for causal inference • Covariate balance ensures that confounders cannot bias treatment effect • Few assumptions • Compliance • No missing data • No hidden treatment variations • Independence of units (assignment of one unit does not influence outcome of another unit)

  19. Non-randomized trial • Lack of randomization can create imbalance PRIOR to treatment assignment • Confounding occurs due to imbalance and relationship with outcome • Bias can be corrected, but all confounders must be assessed: no confounder with a unique influence can be left out if the effect estimate is to be unbiased

  20. Increasing use of Propensity Scores Source: Web of Science

  21. Propensity scores • e(x) = p(z = 1 | x) • e(x) = propensity score • p = probability • z = treatment assignment (1 = treatment group, 0 = control group) • | = conditional on (controlled for) • x = vector of covariates

  22. Propensity scores A single number summary based on all available covariates that expresses the probability that a given subject is assigned to the treatment condition, based on the values of the set of observed covariates e(x) = p (z=1 | x)

  23. Propensity scores [Figure: two panels plotting actual assignment (control, treatment) against likelihood of receiving treatment]

  24. Example of balance property: original sample • e(x) = p(z=1 | x={0 0}) = .5 • e(x) = p(z=1 | x={1 0}) = .33 • e(x) = p(z=1 | x={0 1}) = .66 • e(x) = p(z=1 | x={1 1}) = 1 • p(a=1 | z=0) = .5, p(b=1 | z=0) = 1/4 • p(a=1 | z=1) = .5, p(b=1 | z=1) = .5

  25. Example of balance property: matched sample • p(z, x | e(x)) = p(z | e(x)) p(x | e(x)) • Example for z=1 and x={0 1}: p(z=1, x={0 1} | e(x)) = 1/6; p(z=1 | e(x)) = .5; p(x={0 1} | e(x)) = .33; p(z | e(x)) p(x | e(x)) = (.5)(.33) = 1/6 • p(a=1 | z=0) = .5, p(b=1 | z=0) = .5 • p(a=1 | z=1) = .5, p(b=1 | z=1) = .5

  26. Propensity scores • Balance on the propensity score implies, on average, balance on all observed covariates • Units in the treatment and the control group that have the same propensity score are, on average, similar on all observed covariates. They differ only in terms of treatment received
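The balancing property can be checked directly when the covariates are discrete, because e(x) is then just the proportion of treated units within each covariate pattern. The counts below are hypothetical (they are not the numbers from the balance example on the slides) and are chosen so that two different patterns share the same propensity score.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical counts per covariate pattern (a, b): (number treated, number control).
counts = {
    (0, 0): (2, 2),
    (1, 1): (3, 3),
    (1, 0): (1, 3),
    (0, 1): (3, 1),
}

# Propensity score e(x) = p(z = 1 | x): share of treated units within each pattern.
e = {x: Fraction(t, t + c) for x, (t, c) in counts.items()}

def prop_a1(patterns, z):
    """p(a = 1 | z) among units whose covariate pattern is in `patterns`."""
    idx = 0 if z == 1 else 1
    total = sum(counts[x][idx] for x in patterns)
    with_a1 = sum(counts[x][idx] for x in patterns if x[0] == 1)
    return Fraction(with_a1, total)

# Whole sample: covariate a is imbalanced between treated and control units.
overall = (prop_a1(counts, 1), prop_a1(counts, 0))

# Group patterns by propensity score and re-check balance within each group.
strata = defaultdict(list)
for x, score in e.items():
    strata[score].append(x)
within = {score: (prop_a1(p, 1), prop_a1(p, 0)) for score, p in strata.items()}

print(overall)
print(within)
```

Within each propensity score stratum, the treated and control groups have the same distribution of a, although the two groups differ on a in the full sample.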

  27. Propensity score • Propensity score models influence of confounders on treatment assignment • In comparison, ANCOVA models influence of confounders on outcome [Diagram: Confounder, Treatment, Outcome]

  28. Comparison

  29. Propensity Score Workflow Propensity score analysis is a multi-step process Researcher has choices at each step of the analysis

  30. Propensity Score Workflow • Select true confounders and covariates predictive of outcome [Diagram: Confounder predicting selection into Treatment and predicting Outcome]

  31. Propensity Score Workflow • Estimation of propensity scores can be achieved in numerous ways • Logistic regression • Discriminant analysis • (Boosted) regression trees

  32. Propensity Score Workflow • Logistic regression model • Outcome is treatment assignment • Predictors are covariates • can be overfitted to the sample, e.g. include interactions, higher order terms • only interest is prediction and covariate balance • logit(e(x)) = β0 + βiX
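A logistic regression propensity model of this form can be sketched without any statistics library. The example below is only an illustration: the data are simulated, the covariate names and true coefficients are assumptions, and the fitting routine is plain gradient ascent rather than the algorithm any particular package uses.

```python
import math
import random

random.seed(1)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Simulated sample: two hypothetical covariates drive treatment assignment z.
rows = []
for _ in range(500):
    x1 = random.gauss(0, 1)
    x2 = random.gauss(0, 1)
    true_e = sigmoid(-0.5 + 1.0 * x1 + 0.5 * x2)   # assumed selection model
    z = 1 if random.random() < true_e else 0
    rows.append(([1.0, x1, x2], z))                # leading 1.0 is the intercept term

def fit_logistic(rows, steps=1000, rate=0.5):
    """Maximize the logistic log-likelihood by gradient ascent; returns [b0, b1, b2]."""
    beta = [0.0] * len(rows[0][0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for features, z in rows:
            err = z - sigmoid(sum(b * f for b, f in zip(beta, features)))
            for j, f in enumerate(features):
                grad[j] += err * f
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

beta = fit_logistic(rows)

# Estimated propensity score for every unit: e(x) = p(z = 1 | x).
scores = [sigmoid(sum(b * f for b, f in zip(beta, features))) for features, _ in rows]
print(beta)
```

Interactions or polynomial terms would be added as extra entries in each feature list. Since the model is judged by the covariate balance it produces, the coefficients themselves are of little interest.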

  33. Propensity Score Workflow • Conditioning strategies • Matching • Weighting • Regression adjustment
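Of the conditioning strategies, matching is the easiest to illustrate. The sketch below implements one simple variant, greedy 1:1 nearest-neighbor matching without replacement and with a caliper on the propensity score. The unit labels, scores, and caliper width are hypothetical, and the code is not meant to mirror how MatchIt or any other package performs matching.

```python
def nearest_neighbor_match(treated, control, caliper=0.1):
    """Greedy 1:1 matching without replacement.

    treated, control: dicts mapping unit id -> propensity score.
    Returns a list of (treated_id, control_id) pairs; treated units with no
    control unit within the caliper stay unmatched.
    """
    available = dict(control)
    pairs = []
    # Process treated units from highest to lowest score (one common ordering).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]   # each control unit is used at most once
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.80, "t2": 0.62, "t3": 0.35}
control = {"c1": 0.60, "c2": 0.33, "c3": 0.20, "c4": 0.58}

pairs = nearest_neighbor_match(treated, control)
print(pairs)
```

Here t1 has no control unit within the caliper and is dropped, which illustrates how matching restricts the analysis to the region of common support.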

  34. Propensity Score Workflow • Check of covariate balance • t-test (not recommended) or standardized difference • graphical assessment (e.g. Q-Q plot) • Region of common support (distributional overlap) • graphical assessment (e.g. histograms)
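The standardized difference can be computed per covariate before and after conditioning. Several definitions are in use. The sketch below assumes the version that divides the mean difference by the square root of the average of the two group variances, and the covariate values are hypothetical.

```python
import math
from statistics import mean, variance

def standardized_difference(treated_values, control_values):
    """Mean difference divided by sqrt((var_treated + var_control) / 2)."""
    pooled_sd = math.sqrt((variance(treated_values) + variance(control_values)) / 2)
    return (mean(treated_values) - mean(control_values)) / pooled_sd

# Hypothetical covariate values.
treated = [5, 6, 7, 8]
control_all = [3, 4, 5, 6]        # all control units, before matching
control_matched = [5, 6, 7, 9]    # matched control units only

before = standardized_difference(treated, control_all)
after = standardized_difference(treated, control_matched)
print(round(before, 3), round(after, 3))
```

Unlike a t-test, this measure does not depend on sample size, so a drop in sample size after matching does not by itself make the groups look more balanced.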

  35. Propensity Score Workflow [Figure: Q-Q plots before and after matching; quantiles of the logit propensity score in the treatment group are plotted against those in the control group]

  36. Propensity Score Workflow Before Matching After Matching

  37. Propensity Score Workflow

  38. Propensity Score Workflow

  39. Propensity Score Workflow • Estimate of treatment effect • Mean difference • Standard error dependent on conditioning scheme
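For a 1:1 matched sample, one simple option is to treat the matched pairs as paired observations and compute the mean of the within-pair differences along with its standard error. This is only one of several possible choices, and other conditioning schemes (weighting, stratification) call for different standard errors. The outcome values below are hypothetical.

```python
import math
from statistics import mean, stdev

# Hypothetical outcomes for matched pairs: (treated outcome, control outcome).
pairs = [(12.0, 10.0), (9.0, 9.5), (14.0, 11.0), (11.0, 10.0), (13.0, 12.5)]

# Treatment effect estimate: mean of within-pair differences.
diffs = [t - c for t, c in pairs]
effect = mean(diffs)

# Paired standard error: standard deviation of the differences over sqrt(number of pairs).
se = stdev(diffs) / math.sqrt(len(diffs))

print(round(effect, 3), round(se, 3))
```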

  40. Applied Example: Braver, Thoemmes, Moser, & Baham (in progress) • Can random invitation designs yield the same results as randomized controlled trials? • Evaluation of a math treatment to teach rules of exponents, administered either as a randomized experiment or as a random invitation design • Currently in progress; pilot data available

  41. Design [Figure: overall sample divided into RCT (RCT - T, RCT - C) and RI (RI - T, RI - C) groups, with effect sizes d and d* and attrition indicated]

  42. Example • Pretest • General attitude towards math • Altruism scale • Available time • 19 covariates after forming factor scores

  43. Results [Figure: RCT - T, RCT - C, RI - T, and RI - C groups; linear regression adjustment and propensity score adjustment]

  44. Mechanics of propensity score analysis • Can all of this be done in SPSS / SAS? • Only parts of the analysis can be performed • Mostly based on self-written macros • R packages MatchIt and PSAgraphics offer best solutions • Some experience / learning of R required • Packages automate most of the analysis

  45. Estimation of propensity score • Put covariates in model that are • Theoretically important confounders • Significantly related with treatment selection (unbalanced) • Iterative process of including covariates and potentially higher order terms (interactions, polynomials)

  46. Estimation of propensity score • Estimate PS • Logistic regression • Generalized additive model • Classification tree / Regression tree / Recursive Partitioning

  47. Estimation of propensity score • Generalized additive model • Instead of a regular regression weight, a smoother is applied (imagine a lowess smoother for each coefficient) [Figure: graphic from SAS PROC GAM]
