Observational Studies. Based on Rosenbaum (2002). David Madigan. Rosenbaum, P.R. (2002). Observational Studies (2nd edition). Springer. Introduction. An empirical study in which: Examples: smoking and heart disease, vitamin C and cancer survival, DES and vaginal cancer.

Observational Studies

Based on Rosenbaum (2002)

David Madigan

Rosenbaum, P.R. (2002). Observational Studies (2nd edition). Springer

Introduction

  • An empirical study in which:

“The objective is to elucidate cause-and-effect relationships in which it is not feasible to use controlled experimentation”

  • Examples:
    • smoking and heart disease
    • vitamin C and cancer survival
    • DES and vaginal cancer
    • aspirin and mortality
    • cocaine and birthweight
    • diet and mortality
Asthma Study
  • Have data on 2,000 kids
  • What is the effect of tobacco experimentation on asthma?

[Figure: diagram with nodes African American, Smoking at Home, Tobacco Experimentation, Asthma Self-Diagnosis, Asthma ISAAC]
Cameron and Pauling Vitamin C
  • Gave Vitamin C to 100 terminally ill cancer patients
  • For each patient found 10 controls matched for age, gender, cancer site, and tumor type
  • Vitamin C patients survived four times longer than controls
  • Later randomized study found no effect of vitamin C
  • Turns out the control group was formed from patients already dead…

LESSONS:
  • Observational studies are tricky
  • A randomized study is the gold standard



The two groups are comparable at baseline

  • Could do a better job by manually matching patients on the 18 characteristics listed, but with no guarantees for other characteristics
  • Randomization did a good job without being told what the 18 characteristics were
  • Chance assignment could create some imbalances, but the statistical methods account for this properly
The Hypothesis of No Treatment Effect
  • In a randomized experiment, can test this hypothesis essentially without making any assumptions at all
  • “no effect” formally means for each patient the outcome would have been the same regardless of treatment assignment
  • Test statistic, e.g., proportion(D | TT) − proportion(D | PCI)
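
A minimal sketch of such a test, assuming a binary outcome and two arms of equal size. The data and the function names (`diff_in_proportions`, `randomization_p_value`) are hypothetical and not from the slides. Under the hypothesis of no effect the outcomes are fixed, so re-shuffling the treatment labels reproduces the null distribution of the difference in proportions without any modeling assumptions.

```python
import random

def diff_in_proportions(outcomes, treated):
    """Proportion with the outcome among treated minus proportion among controls."""
    t = [y for y, z in zip(outcomes, treated) if z == 1]
    c = [y for y, z in zip(outcomes, treated) if z == 0]
    return sum(t) / len(t) - sum(c) / len(c)

def randomization_p_value(outcomes, treated, n_draws=5000, seed=0):
    """Two-sided Monte Carlo randomization test of 'no effect for any unit'."""
    rng = random.Random(seed)
    observed = diff_in_proportions(outcomes, treated)
    labels = list(treated)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(labels)  # a fresh random assignment with the same group sizes
        # small tolerance guards against floating-point ties
        if abs(diff_in_proportions(outcomes, labels)) >= abs(observed) - 1e-12:
            extreme += 1
    # add-one correction keeps the Monte Carlo P-value strictly positive
    return observed, (extreme + 1) / (n_draws + 1)

# Hypothetical data: 1 = outcome occurred, 0 = it did not.
outcomes = [1] * 12 + [0] * 38 + [1] * 5 + [0] * 45
treated = [0] * 50 + [1] * 50  # first 50 units are controls, last 50 treated
observed, p = randomization_p_value(outcomes, treated)
print("difference in proportions:", observed)
print("randomization P-value:", p)
```

Only the known randomization mechanism is used here; with small samples the shuffling could be replaced by exact enumeration of all assignments.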



Estimates, etc.
  • Note: the probability distribution needed for the test is known, not assumed or modeled
  • Randomized experiment provides unbiased estimator of the average treatment effect
  • Internal versus external validity
  • Confidence intervals by inverting tests
  • Partially ordered outcomes, censoring, multivariate outcomes, etc.
Overt Bias in Observational Studies

“An observational study is biased if treatment and control groups differ prior to treatment in ways that matter for the outcome under study”

Overt bias: a bias that can be seen in the data

Hidden bias: involves factors not in the data

Can adjust for overt bias…

Overt Bias

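  • M units, j = 1, …, M
  • $x_j$: covariate vector for unit j
  • $Z_j$: treatment for unit j (assume binary, 0 or 1), with $\pi_j = \Pr(Z_j = 1)$

An observational study (OS) is free of hidden bias if the $\pi_j$'s are known to depend only on the $x_j$'s (i.e., $\pi_j = \lambda(x_j)$ for some unknown function $\lambda$),

so two units with the same x have the same probability of getting the treatment.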


Stratifying on x
  • Suppose we can group units into strata with identical x's. Then:
  • Conditional on $m_s$, the number of treated units in each stratum s, all treatment assignments $z$ are equally likely…just like in a uniform randomized experiment
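
A short derivation of why conditioning makes the assignment uniform. It is a sketch under two assumptions: there is no hidden bias, and units are assigned independently. The notation is introduced here for illustration: stratum s has $n_s$ units, $m_s$ of them treated, all sharing the same treatment probability $\pi_s$, and $z_s = (z_{s1}, \dots, z_{sn_s})$ is the assignment within the stratum.

```latex
% Probability of one particular assignment within stratum s:
\Pr(Z_s = z_s) = \prod_{i=1}^{n_s} \pi_s^{z_{si}} (1-\pi_s)^{1-z_{si}}
               = \pi_s^{m_s} (1-\pi_s)^{n_s - m_s}

% This depends on z_s only through m_s, and there are \binom{n_s}{m_s}
% assignments with exactly m_s treated units, so
\Pr(Z_s = z_s \mid m_s)
  = \frac{\pi_s^{m_s} (1-\pi_s)^{n_s - m_s}}
         {\binom{n_s}{m_s}\, \pi_s^{m_s} (1-\pi_s)^{n_s - m_s}}
  = \binom{n_s}{m_s}^{-1}
```

The unknown $\pi_s$ cancels, which is why the conditional distribution matches that of a randomized experiment with a fixed number of treated units per stratum.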
Stratifying on the Propensity Score
  • Exact matching on x is not always possible
  • Idea: form strata comprising units with the same $\pi$'s (i.e., could have $x_j \ne x_k$ but $\pi_j = \pi_k$)
  • Problem: don't know the $\pi$'s
  • Solution: estimate them (logistic regression, SVM, decision tree, etc.)
  • Form strata containing units with “similar” probability of treatment
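
The steps above can be sketched as follows. This is a hedged illustration on simulated data with a single covariate. The logistic model is fit by plain gradient ascent to keep the example dependency-free; the function names (`fit_logistic`, `stratify`) and the choice of five equal-sized strata are assumptions for illustration, not part of the slides.

```python
import math
import random

def fit_logistic(xs, zs, steps=2000, lr=0.1):
    """Fit Pr(Z=1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent
    on the mean log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, z in zip(xs, zs):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += z - p
            grad_b += (z - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def stratify(scores, n_strata=5):
    """Assign each unit to one of n_strata equal-sized strata by score rank."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    strata = [0] * len(scores)
    for rank, j in enumerate(order):
        strata[j] = rank * n_strata // len(scores)
    return strata

# Simulated (hypothetical) data: treatment is more likely for larger x.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(500)]
zs = [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in xs]

a, b = fit_logistic(xs, zs)
scores = [1 / (1 + math.exp(-(a + b * x))) for x in xs]  # estimated propensity scores
strata = stratify(scores)

for s in range(5):
    members = [j for j in range(len(xs)) if strata[j] == s]
    n_treated = sum(zs[j] for j in members)
    mean_score = sum(scores[j] for j in members) / len(members)
    print(s, len(members), n_treated, round(mean_score, 3))
```

Within each stratum the estimated treatment probabilities are similar, so treated and control units in the same stratum can be compared as if they came from a small randomized experiment, provided there is no hidden bias.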

Matched Analysis

Using a model with 29 covariates to predict VHA use, we were able to obtain an accuracy of 88 percent (receiver-operating-characteristic curve, 0.88) and to match 2265 (91.1 percent) of the VHA patients to Medicare patients. Before matching, 16 of the 29 covariates had a standardized difference larger than 10 percent, whereas after matching, all standardized differences were less than 5 percent.


Conclusions: VHA patients had more coexisting conditions than Medicare patients. Nevertheless, we found no significant difference in mortality between VHA and Medicare patients, a result that suggests a similar quality of care for acute myocardial infarction.
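
The quoted study reports covariate balance as standardized differences. A minimal sketch of one common definition follows: the difference in group means divided by the square root of the average of the two group variances, expressed in percent. Whether the study used exactly this formula is an assumption, and the ages below are made-up numbers for illustration.

```python
import math
import statistics

def standardized_difference(group_a, group_b):
    """100 * (mean_a - mean_b) / sqrt((var_a + var_b) / 2), in percent.
    Uses sample variances; assumes each group has at least two values."""
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return 100 * (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical ages in two patient groups.
age_group_a = [70, 72, 68, 75, 71]
age_group_b = [69, 73, 70, 74, 72]
d = standardized_difference(age_group_a, age_group_b)
print(round(d, 1))
```

Computing this for every covariate before and after matching gives the kind of balance check described in the quotation.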

What about hidden bias?
  • Sensitivity analysis!
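  • Consider two units j and k with the same x. Hidden bias ⇒ they may not have the same $\pi$
  • Consider this inequality, for a fixed $\Gamma \ge 1$:

$$\frac{1}{\Gamma} \le \frac{\pi_j(1-\pi_k)}{\pi_k(1-\pi_j)} \le \Gamma$$

  • Sensitivity analysis will consider various $\Gamma$'s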
An equivalent latent variable model

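The model:

$$\log\frac{\pi_j}{1-\pi_j} = \kappa(x_j) + \gamma u_j, \qquad 0 \le u_j \le 1$$

where $u_j$ is an unobserved covariate and $\kappa$ is an unknown function. For two units j and k with the same x:

$$\frac{\pi_j(1-\pi_k)}{\pi_k(1-\pi_j)} = \exp\{\gamma(u_j - u_k)\}$$

with $u_j - u_k$ between $-1$ and 1, so the model implies the previous inequality with $\Gamma = e^{\gamma}$ (taking $\gamma \ge 0$).

(The implication goes the other way too.)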

Matched Pairs
  • Strata of size 2, one gets the treatment, one doesn’t
  • If =0, every unit has the same chance of treatment
  • Standard test statistic for matched pairs is:


rank sum test

rank of

sum of the ranks for pairs in which treated unit > control unit
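
A minimal sketch of computing this statistic: rank the absolute within-pair differences, then sum the ranks of the pairs in which the treated unit's response exceeds the control's. The function name and data are hypothetical. Dropping pairs with zero difference and giving tied absolute differences their average rank are common conventions assumed here, not something the slides specify.

```python
def signed_rank_statistic(treated, control):
    """Return (T, ranks): T is the sum of the ranks of |treated - control|
    over pairs where treated > control. Zero differences are dropped and
    tied absolute differences share their average rank."""
    diffs = [t - c for t, c in zip(treated, control) if t != c]
    order = sorted(range(len(diffs)), key=lambda s: abs(diffs[s]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    T = sum(r for r, d in zip(ranks, diffs) if d > 0)
    return T, ranks

# Hypothetical responses for five matched pairs.
treated = [12, 15, 9, 20, 14]
control = [10, 16, 5, 13, 11]
T, ranks = signed_rank_statistic(treated, control)
print(T, ranks)
```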

More on Matched Pairs
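  • No hidden bias ⇒ we know the null distribution of T, because the sth pair contributes $d_s$ with prob ½ and 0 with prob ½
  • With hidden bias, the sth pair contributes $d_s$ with prob

$$p_s = \frac{c_{s1} e^{\gamma u_{s1}} + c_{s2} e^{\gamma u_{s2}}}{e^{\gamma u_{s1}} + e^{\gamma u_{s2}}}$$

and zero with prob $1 - p_s$, where $c_{si} = 1$ if unit i has the higher response in pair s (0 otherwise) and $u_{si}$ is its unobserved covariate

  • The u's are unobserved, so the null distribution of T is unknown…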
Even More on Matched Pairs
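  • Easy to see that:

$$\frac{1}{1+\Gamma} \le p_s \le \frac{\Gamma}{1+\Gamma}$$

  • The P-value we are after is $\Pr(T \ge t)$, where t is the observed value of T
  • Lower bound on P-value: $\Pr(T^- \ge t)$, where $T^-$ is the sum of S independent quantities, the sth one being $d_s$ with prob $1/(1+\Gamma)$ and 0 otherwise
  • Upper bound likewise using $T^+$, with prob $\Gamma/(1+\Gamma)$ in place of $1/(1+\Gamma)$
  • This directly provides bounds on P-values for fixed $\Gamma$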
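
A minimal sketch of these bounds for a fixed $\Gamma \ge 1$. It computes the exact distribution of a sum of independent terms, where the sth term equals its rank $d_s$ with probability p and 0 otherwise, then evaluates the upper tail at $p = 1/(1+\Gamma)$ and $p = \Gamma/(1+\Gamma)$. The function names and the ten-pair example are hypothetical, and the exact enumeration is only practical for small numbers of pairs.

```python
def upper_tail(ranks, p, t_obs):
    """Pr(sum_s d_s * B_s >= t_obs), with independent B_s ~ Bernoulli(p)."""
    dist = {0: 1.0}  # maps a partial sum to its probability
    for d in ranks:
        new = {}
        for total, prob in dist.items():
            new[total] = new.get(total, 0.0) + prob * (1 - p)   # pair contributes 0
            new[total + d] = new.get(total + d, 0.0) + prob * p  # pair contributes d
        dist = new
    return sum(prob for total, prob in dist.items() if total >= t_obs)

def p_value_bounds(ranks, t_obs, gamma):
    """Lower and upper bounds on the one-sided P-value for a given gamma >= 1."""
    lower = upper_tail(ranks, 1 / (1 + gamma), t_obs)
    upper = upper_tail(ranks, gamma / (1 + gamma), t_obs)
    return lower, upper

# Hypothetical example: 10 pairs with ranks 1..10 and observed statistic T = 50.
ranks = list(range(1, 11))
for gamma in (1, 1.5, 2, 3):
    lower, upper = p_value_bounds(ranks, 50, gamma)
    print(gamma, round(lower, 4), round(upper, 4))
```

At $\Gamma = 1$ the two bounds coincide with the usual signed rank P-value. Increasing $\Gamma$ until the upper bound crosses the significance level shows how much hidden bias would be needed to overturn the conclusion.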
Smoking & Lung Cancer Example
  • Hammond (1964) paired 36,975 heavy smokers to non-smokers. Matched on age, race, plus 16 other factors
Asthma Study
  • Need a $\Gamma$ of three to make the effect of tobacco experimentation on asthma become non-significant