How to use propensity scores in the analysis of nonrandomized designs

Patrick G. Arbogast

Department of Biostatistics

Vanderbilt University Medical Center

GCRC Research-Skills Workshop

- Randomized clinical trials: randomization guarantees that, on avg, there are no systematic differences in observed/unobserved covariates between tx groups.
- Observational studies: no control over tx assignments, and E+/E- groups may have large differences in observed covariates.
- Can adjust for this via study design (matching) or during estimation of tx effect (stratification/regression).

- With <10 events/variable (EPV), estimated reg coeffs may be biased & SEs may be incorrect (Peduzzi et al, 1996).
- Simulation study for logistic reg.

- Harrell et al (1985) also advocates min no. of EPV.
- A solution: propensity scores (Rosenbaum & Rubin, 1983).
- Propensity score: probability that patient receives E+ given risk factors.
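
The EPV rule of thumb above comes down to simple arithmetic. The sketch below is illustrative only: the function name and the cohort numbers are hypothetical, and it assumes the common convention that "events" means the less frequent of the two outcome categories.

```python
def events_per_variable(n_events: int, n_nonevents: int, n_variables: int) -> float:
    """EPV = size of the smaller outcome group / number of candidate predictors.

    Assumption: 'events' is taken as the less frequent outcome category.
    """
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return min(n_events, n_nonevents) / n_variables


# Hypothetical cohort: 60 deaths among 1,000 patients, 12 candidate confounders.
epv = events_per_variable(n_events=60, n_nonevents=940, n_variables=12)
print(f"EPV = {epv:.1f}")  # below 10, so direct regression adjustment is suspect
```

With an EPV this low, collapsing the 12 confounders into a single propensity score is one way to cut the number of terms in the outcome model.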

- Covariate is confounder only if its distribution differs between E+ & E- groups.
- Consider 1-factor matching: low-dose aspirin & mortality.
- Age, a strong confounder, can be controlled by matching.

- Can extend to many risk factors, but becomes cumbersome.
- Propensity scores provide a summary measure to control for multiple confounders simultaneously.

- Identify potential confounders.
- Current conventional wisdom: if uncertain whether covariate is confounder, include it.

- Model E+ (typically dichotomous) as function of covariates using entire cohort.
- E+ is outcome for propensity score estimation.
- Do not include D+.
- Logistic reg typically used.
- Propensity score = estimated Pr(E+|covariates).
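
The estimation step above can be sketched in a few lines. This is a minimal illustration, not the workshop's own code: the cohort is synthetic, the covariate names (`age`, `low_bp`) and the helper functions are hypothetical, and the logistic model is fitted by plain gradient ascent so the example needs nothing beyond the standard library. In practice a statistics package's logistic regression routine would be used.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity(X, exposed, n_iter=1000, lr=0.5):
    """Logistic regression of exposure (0/1) on covariates by gradient ascent.

    X: list of covariate rows (no intercept column). The outcome D+ is
    deliberately not an input. Returns [intercept, b1, ..., bp].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, e in zip(X, exposed):
            fitted = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            resid = e - fitted
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated Pr(E+ | covariates) for every subject in the cohort."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row))) for row in X]


# Synthetic cohort (illustration only): exposure depends on age and low BP.
random.seed(0)
X, exposed = [], []
for _ in range(400):
    age = random.gauss(0.0, 1.0)                     # standardized age
    low_bp = 1.0 if random.random() < 0.3 else 0.0   # indicator for low BP
    p_true = sigmoid(-0.5 + 0.8 * age + 1.0 * low_bp)
    X.append([age, low_bp])
    exposed.append(1 if random.random() < p_true else 0)

beta = fit_propensity(X, exposed)
ps = propensity_scores(X, beta)
print("coefficients:", [round(b, 2) for b in beta])
print("first five propensity scores:", [round(p, 2) for p in ps[:5]])
```

The fit uses the entire cohort and only covariates. Exposure is the dependent variable and the disease outcome never enters the model.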

- Natural question: why estimate probability that a patient receives E+ since we already know exposure status?
- Answer: adjusting observed E+ with probability of E+ (“propensity”) creates a “quasi-randomized” experiment.
- For E+ & E- patients with same propensity score, can imagine they were “randomly” assigned to each group.
- Subjects in E+/E- groups with equal (or nearly equal) propensity scores tend to have similar distribution in covariates used to estimate propensity.

- For given propensity score, one gets unbiased estimates of avg E+ effect, provided there are no unmeasured confounders.
- Can include large no. of covariates for propensity score estimation.
- In fact, original paper applied propensity score methodology to observational study comparing CABG to medical tx, adjusting for 74 covariates in propensity model.

- Matching.
- Regression adjustment/stratification.
- Weighting.

- Match on single summary measure.
- Useful for studies with limited no. of E+ patients and a larger (usually much larger) no. of E- patients & need to collect add’l measures (eg, blood samples).

- Nearest available matching on estimated propensity score.
- Select E+ subject.
- Find E- subject w/ closest propensity score.
- Repeat until all E+ subjects matched.
- Easiest in terms of computational considerations.

- Others:
- Mahalanobis metric matching.
- Nearest available Mahalanobis metric matching w/ propensity score-based calipers.
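
The nearest available matching steps listed above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions: matching is 1:1 without replacement, E+ subjects are processed in the order given (the result can depend on that order), and the optional `caliper` argument and all subject IDs and scores are hypothetical.

```python
def nearest_available_match(ps_exposed, ps_unexposed, caliper=None):
    """Greedy 1:1 nearest-available matching on the propensity score.

    ps_exposed / ps_unexposed: dicts mapping subject id -> propensity score.
    caliper: optional maximum allowed |PS difference| within a pair.
    Returns a list of (exposed_id, unexposed_id) pairs.
    """
    available = dict(ps_unexposed)  # E- subjects not yet used
    pairs = []
    for e_id, e_ps in ps_exposed.items():
        if not available:
            break
        # Closest remaining E- subject for this E+ subject.
        u_id = min(available, key=lambda k: abs(available[k] - e_ps))
        if caliper is not None and abs(available[u_id] - e_ps) > caliper:
            continue  # no acceptable match; this E+ subject stays unmatched
        pairs.append((e_id, u_id))
        del available[u_id]  # matching without replacement
    return pairs


# Hypothetical propensity scores.
exposed = {"E1": 0.62, "E2": 0.35, "E3": 0.90}
unexposed = {"U1": 0.30, "U2": 0.60, "U3": 0.66, "U4": 0.10, "U5": 0.37}

print(nearest_available_match(exposed, unexposed))
# [('E1', 'U2'), ('E2', 'U5'), ('E3', 'U3')]
print(nearest_available_match(exposed, unexposed, caliper=0.05))
# [('E1', 'U2'), ('E2', 'U5')]
```

Without a caliper every E+ subject is matched, even E3, whose nearest remaining E- subject is 0.24 away. With a caliper of 0.05, E3 is left unmatched rather than paired poorly.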

- Consider an HIV database:
- E+: patients receiving a new antiretroviral drug (N=500).
- E-: patients not receiving the drug (N=10,000).
- D+: mortality.

- Need to manually measure CD4.
- May be potential confounding by other HIV drugs as well as 10 prognostic factors, which are identified & stored in the database.

- Option 1:
- Collect blood samples from all 10,500 patients.
- Costly & impractical.

- Option 2:
- For all patients, estimate Pr(E+|other HIV drugs & prognostic factors).
- For each E+ patient, find E- patient with closest propensity score.
- Continue until all E+ patients match with E- patient.
- Collect blood sample from 500 propensity-matched pairs.

- Teaching hospitals:
- Beth Israel Hospital, Boston.
- Duke University Medical Center, Durham.
- MetroHealth Medical Center, Cleveland.
- St Joseph’s Hospital, Marshfield, WI.
- UCLA.

- Prespecified disease categories:
- Acute respiratory failure.
- COPD.
- CHF.
- Cirrhosis.
- Nontraumatic coma.
- Colon cancer metastatic to liver.
- Non-small cell cancer of lung.
- Multiorgan system failure with malignancy or sepsis.

- Decision to use RHC left to discretion of physician.
- Thus, tx selection may be confounded with patient factors related to outcome.
- eg, patients with low BP may be more likely to receive RHC, & such patients may also be more likely to die.

- Panel of 7 specialists in critical care specified variables related to decision to use RHC.
- Compute propensity score, Pr(RHC|covariates), via logistic regression.
- Covariates:
- age, sex, yrs of education, medical insurance, primary & secondary disease category, admission dx, ADL & DASI, DNR status, cancer, 2-month survival probability, acute physiology component of APACHE III score, Glasgow Coma Score, wt, temperature, BP, respiratory rate, heart rate, PaO2/FiO2, PaCO2, pH, WBC count, hematocrit, sodium, potassium, creatinine, bilirubin, albumin, urine output, comorbid illnesses.

- Adequacy of propensity score to adjust for effects of covariates assessed by testing for differences in individual covariates between RHC+/RHC- patients after stratifying by PS quintiles.
- Model each covariate as function of RHC & PS quintiles.
- Covariates balanced if not related to RHC after PS adjustment.
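
A simplified version of this balance check can be sketched as follows. Note the substitution: instead of regressing each covariate on RHC and PS quintile, the sketch just compares covariate means between treated and untreated patients inside each quintile. The data are synthetic, the single covariate (`severity`) is hypothetical, and the true propensity is used in place of an estimated one.

```python
import math
import random
from statistics import mean


def quintile_labels(scores):
    """Assign each score to a quintile 0..4 by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(scores)
    return labels


def within_quintile_differences(covariate, treated, scores):
    """Mean covariate difference (treated minus untreated) per PS quintile.

    Returns None for a quintile that lacks one of the two groups.
    """
    labels = quintile_labels(scores)
    diffs = []
    for q in range(5):
        t = [x for x, a, l in zip(covariate, treated, labels) if l == q and a == 1]
        c = [x for x, a, l in zip(covariate, treated, labels) if l == q and a == 0]
        diffs.append(mean(t) - mean(c) if t and c else None)
    return diffs


# Synthetic data (illustration only): sicker patients are more likely treated.
random.seed(1)
severity, treated, ps = [], [], []
for _ in range(1000):
    x = random.gauss(0.0, 1.0)          # standardized severity score
    p = 1.0 / (1.0 + math.exp(-x))      # true propensity, used here as the PS
    severity.append(x)
    ps.append(p)
    treated.append(1 if random.random() < p else 0)

crude = (mean(x for x, a in zip(severity, treated) if a == 1)
         - mean(x for x, a in zip(severity, treated) if a == 0))
diffs = within_quintile_differences(severity, treated, ps)
print(f"crude difference in mean severity: {crude:.2f}")
print("within-quintile differences:", [round(d, 2) for d in diffs])
```

If the propensity score is doing its job, the within-quintile differences should be much smaller than the crude difference. Large residual differences suggest the propensity model needs more terms or finer strata.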

- For each RHC+ patient, RHC- patient w/ same disease category & closest PS (within +/- 0.03) identified.
- Continued until all pairs identified.
- PS difference for each pair calculated. Each pair w/ positive difference matched with pair w/ negative difference closest in magnitude.
- Assure equal no.’s of pairs w/ positive & negative PS differences.

- Final matched set: 1008 matched pairs.

* Mean (25th, 50th, 75th %-tiles); ** Therapeutic Intervention Scoring System.

- Stratification on PS alone can balance distributions of covariates in E+/E- groups w/o exponential increase in no. of strata.
- Rosenbaum & Rubin (1983) showed that perfect stratification based on PS will produce strata where avg tx effect w/i strata is unbiased estimate of true tx effect.
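
A small worked example may help show what stratifying on PS buys. The sketch below is illustrative, not a definitive estimator: the function name is hypothetical, the effect measure is a simple risk difference averaged over strata (weighted by stratum size), and the toy cohort is constructed so that tx has no effect within either PS level even though the crude comparison suggests one.

```python
def stratified_risk_difference(outcome, treated, scores, n_strata=5):
    """Average within-stratum risk difference, weighted by stratum size.

    Strata are equal-sized groups formed by ranking subjects on the PS.
    Strata lacking treated or untreated subjects are skipped.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    total, used = 0.0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        t = [outcome[i] for i in idx if treated[i] == 1]
        c = [outcome[i] for i in idx if treated[i] == 0]
        if not t or not c:
            continue
        total += len(idx) * (sum(t) / len(t) - sum(c) / len(c))
        used += len(idx)
    if used == 0:
        raise ValueError("no stratum contains both treated and untreated subjects")
    return total / used


# Toy cohort, rows are (treated, died, PS). Risk of death is 25% at PS=0.2
# and 75% at PS=0.8, regardless of tx.
low = [(1, 1, 0.2)] * 1 + [(1, 0, 0.2)] * 3 + [(0, 1, 0.2)] * 4 + [(0, 0, 0.2)] * 12
high = [(1, 1, 0.8)] * 12 + [(1, 0, 0.8)] * 4 + [(0, 1, 0.8)] * 3 + [(0, 0, 0.8)] * 1
rows = low + high
treated = [r[0] for r in rows]
died = [r[1] for r in rows]
scores = [r[2] for r in rows]

risk_t = sum(d for d, a in zip(died, treated) if a == 1) / treated.count(1)
risk_c = sum(d for d, a in zip(died, treated) if a == 0) / treated.count(0)
crude = risk_t - risk_c
# Only two distinct PS values here, so two strata; real data would use quintiles.
adjusted = stratified_risk_difference(died, treated, scores, n_strata=2)
print(f"crude risk difference:      {crude:.2f}")     # 0.30
print(f"stratified risk difference: {adjusted:.2f}")  # 0.00
```

The crude 30-point excess risk among treated patients comes entirely from treated patients being concentrated in the high-PS (sicker) stratum.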

- Full cohort: N=5735.
- PH regression:
- Adjusted for PS, age, sex, no. of comorbid illnesses, ADL & DASI 2 wks prior to admission, 2-month prognosis, day 1 Acute Physiology Score, Glasgow Coma Score, & disease category.

- Question: why include covariates in main model in addition to PS (especially covariates already used to estimate PS)?

ARF – acute respiratory failure, MOSF – multiorgan system failure.

- Weight patient’s contribution to reg model.
- Inverse-probability-of-tx-weighted (IPTW) estimator (Robins et al, 2000):
- Estimates tx effect in pop whose distribution of risk factors equals that found in all study subjects.
- Wts: 1/PS(X) for E+ & 1/(1-PS(X)) for E-.

- Standardized mortality ratio (SMR)-weighted estimator (Sato et al, 2003):
- Estimates tx effect in pop whose distribution of risk factors equals that found in E+ subjects only.
- Wts: 1 for E+ & PS(X)/(1-PS(X)) for E-.
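
The two weighting schemes above translate directly into code. The sketch below is illustrative: the function names and the toy cohort are hypothetical, and the weighted risks are simple weighted proportions rather than a fitted regression model.

```python
def iptw_weight(exposed: int, ps: float) -> float:
    """IPTW: 1/PS for E+, 1/(1-PS) for E-. Target population: all subjects."""
    return 1.0 / ps if exposed else 1.0 / (1.0 - ps)


def smr_weight(exposed: int, ps: float) -> float:
    """SMR: 1 for E+, PS/(1-PS) for E-. Target population: E+ subjects."""
    return 1.0 if exposed else ps / (1.0 - ps)


def weighted_risk(rows, group, weight_fn):
    """Weighted proportion with the outcome among subjects with exposed == group."""
    num = sum(weight_fn(e, ps) * d for e, d, ps in rows if e == group)
    den = sum(weight_fn(e, ps) for e, d, ps in rows if e == group)
    return num / den


# Toy cohort, rows are (exposed, died, PS). Risk of death is 25% at PS=0.2
# and 75% at PS=0.8 in both exposure groups, so the true effect is null.
rows = (
    [(1, 1, 0.2)] * 1 + [(1, 0, 0.2)] * 3 + [(0, 1, 0.2)] * 4 + [(0, 0, 0.2)] * 12
    + [(1, 1, 0.8)] * 12 + [(1, 0, 0.8)] * 4 + [(0, 1, 0.8)] * 3 + [(0, 0, 0.8)] * 1
)

for name, fn in [("IPTW", iptw_weight), ("SMR", smr_weight)]:
    r1 = weighted_risk(rows, 1, fn)
    r0 = weighted_risk(rows, 0, fn)
    print(f"{name}: risk E+ = {r1:.2f}, risk E- = {r0:.2f}, difference = {r1 - r0:.2f}")
```

Both schemes recover the null difference, but the weighted risks differ (0.50 under IPTW, 0.65 under SMR) because each standardizes to a different population: the whole cohort versus the E+ subjects only.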

- Example: tissue plasminogen activator (t-PA) in 6269 ischemic stroke patients (Kurth et al, 2006):
- Multivariable logistic reg.
- Logistic reg after matching on PS +/- 0.05.
- Logistic reg adjusting for PS (linear term & deciles).
- IPTW.
- SMR.

- Matching on individual factors:
- Too cumbersome (eg, matching on 10 factors, each having 4 categories, gives 4^10 = 1,048,576, ie ~1,000,000, combinations of patient characteristics).

- Stratified analyses: same problem.
- Regression (Cepeda et al, 2003):
- ≤7 events/confounder – PS less biased, more robust, & more precise.
- 8+ events/confounder – multiple reg preferable:
- Bias from multiple reg goes away, but still present for PS analysis (eg, ~25-30% bias when OR=2.0).
- Coverage probability (% of 95% CI’s containing true OR) decreases for PS analysis.

- Useful when adjusting for large no. of risk factors & small no. of EPV.
- Useful for matched designs (saving time & money).
- Can be applied to exposure with 3+ levels (Rosenbaum, 2002).

- Can only adjust for observed covariates.
- Propensity score methods work better in larger samples to attain distributional balance of observed covariates.
- In small studies, imbalances may be unavoidable.

- Including irrelevant covariates in propensity model may reduce efficiency.
- Bias may occur.
- Non-uniform tx effect.

- E+: RHC use.
- swang1 (0=RHC-, 1=RHC+)

- D+: time-to-death, min(obs time, 30d).
- Events after 30d censored.
- RHC could not have a long-term effect.
- Such ill patients more affected by later tx decisions.
- Variables: time=t3d30, censor var=censor.
- N=5735 patients, N=1918 deaths w/i 30d.
- 38.0% RHC+ & 30.6% RHC- died w/i 30d.
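
The 30-day truncation of follow-up described above can be sketched as a small helper. This is an illustration only: the function name and example patients are hypothetical, the coding of the dataset's own `censor` variable is not stated in the slides and is not assumed here, and treating a death on exactly day 30 as an event is an assumption.

```python
def censor_at_horizon(days, died, horizon=30.0):
    """Truncate follow-up at `horizon` days.

    days: days from admission to death or last contact.
    died: True if the patient died at `days`, False if last seen alive.
    Returns (follow_up_time, event), where event is 1 only for deaths
    occurring on or before the horizon (assumption: day 30 counts).
    """
    follow_up = min(days, horizon)
    event = 1 if (died and days <= horizon) else 0
    return follow_up, event


# Hypothetical patients: (days observed, died?)
examples = [(12.0, True), (45.0, True), (20.0, False)]
for days, died in examples:
    print(days, died, "->", censor_at_horizon(days, died))
# 12.0 True -> (12.0, 1)    death within 30d: event
# 45.0 True -> (30.0, 0)    death after 30d: censored at day 30
# 20.0 False -> (20.0, 0)   lost to follow-up alive: censored at day 20
```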

- Logistic reg: RHC+/- dependent var.
- Adjusts for 50 risk factors.
- Propensity score distribution by RHC groups:

- Cepeda MS, Boston R, Farrar JT, Strom BL. Comparison of logistic regression versus propensity score when the number of events is low and there are multiple confounders. Am J Epidemiol 2003; 158: 280-287.
- Connors Jr AF, Speroff T, Dawson NV, et al. The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA 1996; 276: 889-897.
- D’Agostino Jr RB. Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med 1998; 17: 2265-2281.
- Gum PA, Thamilarasan M, Watanabe J, Blackstone EH, Lauer MS. Aspirin use and all-cause mortality among patients being evaluated for known or suspected coronary artery disease. JAMA 2001; 286: 1187-1194.
- Harrell FE, Lee KL, Matchar DB, Reichart TA. Regression models for prognostic prediction: advantages, problems, and suggested solutions. Cancer Treatment Reports 1985; 69: 1071-1077.
- Kurth T, Walker AM, Glynn RJ, Chan KA, Gaziano JM, Berger K, Robins JM. Results of multivariable logistic regression, propensity matching, propensity adjustment, and propensity-based weighting under conditions of nonuniform effect. Am J Epidemiol 2006; 163: 262-270.
- Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996; 49: 1373-1379.
- Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology 2000; 11: 550-560.
- Rosenbaum PR. Observational Studies. New York, NY: Springer-Verlag, 2002.
- Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983; 70: 41-55.
- Rubin DB. Estimating causal effects from large data sets using propensity scores. Annals of Internal Medicine 1997; 127: 757-763.
- Sato T, Matsuyama Y. Marginal structural models as a tool for standardization. Epidemiology 2003; 14: 680-686.
