Lecture 8: Selection Bias, Matching, & Control Selection

Lecture 8: Selection Bias, Matching, & Control Selection Matthew Fox Advanced Epidemiology

What is selection bias?

Which studies can have selection bias: cohort or case control?

Selection bias or confounding? • Comparison of MI mortality among office workers and longshoremen • Comparison is biased because those who self-select into longshore work are fitter, which leads to less MI • What is the bias?

In a case control study, can we match cases to controls based on exposure?

If we match, do we need to adjust for the matched factor?

What is overmatching?

Misclassification Summary I #1 Non-differential and independent misclassification of dichotomous exposure or disease (usually) creates an expectation that estimates of effect are biased towards the null. #2 Non-differential and independent misclassification of a covariate creates an expectation that the relative risk due to confounding is biased towards the null, yielding residual confounding.

Misclassification Summary II #3 Errors due to misclassification can be corrected algebraically #4 Differential misclassification yields an unpredictable bias of the estimates of effect (still correctable). #5 There are important exceptions to the mantra that “non-differential misclassification biases towards the null.”
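Point #3 can be illustrated with a minimal sketch of the algebraic correction for non-differential exposure misclassification. All counts, the sensitivity (0.8), and the specificity (0.9) are hypothetical values chosen for illustration; the function names are not from the lecture.

```python
import math

def misclassify(exposed, unexposed, se, sp):
    """Expected observed counts after exposure misclassification."""
    obs_exposed = se * exposed + (1 - sp) * unexposed
    obs_unexposed = (1 - se) * exposed + sp * unexposed
    return obs_exposed, obs_unexposed

def correct(obs_exposed, obs_unexposed, se, sp):
    """Back-calculate expected true counts from the observed counts."""
    total = obs_exposed + obs_unexposed
    exposed = (obs_exposed - (1 - sp) * total) / (se + sp - 1)
    return exposed, total - exposed

def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

SE, SP = 0.8, 0.9            # hypothetical, same in cases and controls
true_cases = (200, 800)      # hypothetical (exposed, unexposed)
true_controls = (100, 900)   # hypothetical (exposed, unexposed)

obs_cases = misclassify(*true_cases, SE, SP)
obs_controls = misclassify(*true_controls, SE, SP)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*obs_cases, *obs_controls)
corrected_or = odds_ratio(*correct(*obs_cases, SE, SP),
                          *correct(*obs_controls, SE, SP))

print(f"true OR      = {true_or:.2f}")
print(f"observed OR  = {observed_or:.2f}")
print(f"corrected OR = {corrected_or:.2f}")
```

In this example the observed OR lies between the null and the true OR, and the correction recovers the true OR because the same sensitivity and specificity are used in both directions.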

This Session • Selection bias • Definition & control • Matching • Cohort vs. Case-control studies • When to adjust, when not to adjust • Control selection • Adjustment • Is it possible?

Selection bias — definition • Distortions of the estimate of effect arising from procedures to select subjects and from factors that influence participation • Common element is that the exposure-disease relation is different among participants than among those theoretically eligible • Observed estimate of effect reflects a mixture of forces affecting participation and forces affecting disease occurrence

Separate from Confounding • Cohort studies don’t have selection bias at entry even if subjects self-select • Selection into the cohort can create confounding, but this can be undone by adjustment • Or it becomes an issue of generalizability • Cohort studies/RCTs can have selection bias at the end through differential loss to follow-up (LTFU) • Some can be undone if we know enough about the selection mechanism

Selection bias — Fallacy • Formerly, selection bias was often viewed as arising only from disease-dependent selection forces • Exposure-dependent selection forces were thought to be confounders or part of the population definition • Sometimes selection factors can be controlled as if they were confounders • For example, matched factors in case-control studies and two-stage studies • However, not all selection factors related to exposure can be so treated

Selection bias: Adjust for selection proportions

Selection bias — Simple method

Selection bias

Selection bias • OR = [50/4000] / [40/8000] = 2.5
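A minimal sketch of adjusting an observed odds ratio for selection proportions. The observed OR reproduces the arithmetic on the slide. The selection proportions are hypothetical (the extracted slides do not state them), and so is the reading of 40 as the unexposed cases remaining after selection.

```python
import math

def adjust_for_selection(or_obs, s_case_exp, s_case_unexp,
                         s_ctrl_exp, s_ctrl_unexp):
    """Divide out the selection bias factor.

    Each s_* is the probability that a person in that exposure/disease
    cell is selected into the study. Because
        OR_obs = OR_true * (s_case_exp * s_ctrl_unexp)
                         / (s_case_unexp * s_ctrl_exp),
    the adjusted OR multiplies by the inverse of that factor.
    """
    return or_obs * (s_case_unexp * s_ctrl_exp) / (s_case_exp * s_ctrl_unexp)

# Observed odds ratio, same arithmetic as the slide.
or_observed = (50 / 4000) / (40 / 8000)

# Hypothetical selection proportions: only unexposed cases are
# under-selected (40%); every other cell is fully selected.
or_adjusted = adjust_for_selection(
    or_observed,
    s_case_exp=1.0, s_case_unexp=0.4,
    s_ctrl_exp=1.0, s_ctrl_unexp=1.0,
)

print(f"observed OR = {or_observed:.2f}")
print(f"adjusted OR = {or_adjusted:.2f}")
```

Under these assumed proportions the adjusted OR is null, so the whole observed association would be attributable to selection.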

Selection bias

https://sites.google.com/site/biasanalysis/

Structure of Selection Bias

Selection forces don’t create bias if they are not related to both exposure and disease
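To make this concrete, here is a small sketch that computes the factor by which selection multiplies the true odds ratio. All selection probabilities are hypothetical and the function name is illustrative.

```python
import math

def selection_bias_factor(s):
    """Return OR_observed / OR_true.

    s maps (exposed, diseased) -> probability of being selected.
    """
    return (s[(1, 1)] * s[(0, 0)]) / (s[(0, 1)] * s[(1, 0)])

# Selection depends on disease only (cases 90%, non-cases 10%).
disease_only = {(1, 1): 0.9, (0, 1): 0.9, (1, 0): 0.1, (0, 0): 0.1}

# Selection depends on exposure only (exposed 60%, unexposed 20%).
exposure_only = {(1, 1): 0.6, (1, 0): 0.6, (0, 1): 0.2, (0, 0): 0.2}

# Selection depends jointly on exposure and disease:
# exposed cases are selected more often than unexposed cases.
both = {(1, 1): 0.9, (0, 1): 0.5, (1, 0): 0.1, (0, 0): 0.1}

for name, s in [("disease only", disease_only),
                ("exposure only", exposure_only),
                ("exposure and disease", both)]:
    print(f"{name}: bias factor = {selection_bias_factor(s):.2f}")
```

A factor of 1 means the observed odds ratio equals the true one. Only the third scenario, where selection varies with both exposure and disease, moves the factor away from 1.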

Selection bias — Simple method

Selection bias

Selection bias • OR = [50/4000] / [100/8000] = 1

Selection bias

Selection Bias Occurs When Selection is Related to Both the Exposure and the Outcome • Sounds like confounding, but this time E and D affect selection

Remember back to common causes and common effects (Hernán 2004)

Selection Bias in a Case Control Study: • Case-control study of the relationship between estrogens and myocardial infarction • Cases are those hospitalized for MI • Controls are those hospitalized for hip fracture • Could this cause selection bias?

Selection Bias in a Case Control Study: • E = estrogens, D = myocardial infarction • F = hip fracture, C = selection into study • Selection bias occurs because we condition on C, a common effect of both E (through F) and D
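The structure E → F → C ← D can be checked with an exact calculation. This is a sketch under stated assumptions: every probability below is hypothetical, estrogens are assumed to lower the risk of hip fracture, and E has no effect on D, so the true OR is 1.

```python
import math
from itertools import product

P_E = 0.3                          # hypothetical Pr(estrogen use)
P_D = 0.1                          # hypothetical Pr(MI), independent of E
P_F_GIVEN_E = {1: 0.05, 0: 0.15}   # hypothetical Pr(hip fracture | E)

def joint(e, d, f):
    """Pr(E=e, D=d, F=f) when D is independent of E and F depends on E."""
    p_e = P_E if e else 1 - P_E
    p_d = P_D if d else 1 - P_D
    p_f = P_F_GIVEN_E[e] if f else 1 - P_F_GIVEN_E[e]
    return p_e * p_d * p_f

def exposure_disease_or(selected_only):
    """E-D odds ratio in everyone, or only in those selected (C=1)."""
    cell = {(e, d): 0.0 for e in (0, 1) for d in (0, 1)}
    for e, d, f in product((0, 1), repeat=3):
        hospitalized = d == 1 or f == 1   # C=1: admitted for MI or fracture
        if selected_only and not hospitalized:
            continue
        cell[(e, d)] += joint(e, d, f)
    return (cell[(1, 1)] * cell[(0, 0)]) / (cell[(0, 1)] * cell[(1, 0)])

or_population = exposure_disease_or(selected_only=False)
or_selected = exposure_disease_or(selected_only=True)

print(f"OR in the source population = {or_population:.2f}")
print(f"OR among the hospitalized   = {or_selected:.2f}")
```

Restricting to hospitalized subjects conditions on C. With these numbers, estrogen users are under-represented among fracture controls, so a null effect appears as an elevated OR.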

Selection Bias in a Cohort Study: • Cohort study of relationship between HAART and progression to AIDS • LTFU occurs more among those with low CD4 • LTFU occurs more among those with AIDS • But now selection out occurs before AIDS • Could this cause selection bias?

Selection Bias in a Cohort Study: Differential LTFU • E = ART, D = AIDS, L = vector of symptoms • U = True immunosuppression (unmeasured) • C = Drop out (LTFU) • Selection bias occurs because we condition on C, a common effect of both E and U, where U is a common cause of C and D

Selection Bias in a Cohort Study: Differential LTFU • E = ART, D = AIDS, L = vector of symptoms • U = True immunosuppression (unmeasured) • C= Drop out

Selection Bias vs. Confounding • Bias is a systematic difference between the truth and the observed • $\Pr[Y^{a=1}=1] - \Pr[Y^{a=0}=1] \neq \Pr[Y=1 \mid a=1] - \Pr[Y=1 \mid a=0]$ • Separate from random error, which is not structural • Using DAGs we can see the common structures • Confounding = common causes (directly or through other mechanisms) • Selection bias = conditioning on common effects

To see the difference • Comparison of MI mortality among office workers and longshoremen • Comparison is biased because those who self-select into longshore work are fitter, which leads to less MI • What is the DAG? (Nodes: Occupation, MI, Fitness)

Adjustment for Selection Bias

Adjustment for loss to follow-up through weighting • Because selection bias means we are only looking at those included in the study, we can’t adjust through stratification • We don’t have the data on those not included • Can use weighting, because this does not require us to have outcome data on those missing • Inverse probability of censoring weighting (IPCW) • Assumes we have enough data to predict the dropout

Now we ask, what if the censored were not censored?

Further stratify IPC weights for predictors of censoring • As shown, the unstratified weights assume those lost are the same as those retained • Not likely to be true • Calculate weights within levels of predictors of censoring • Valid if we can produce conditional exchangeability between those lost and those not lost • Weights can be multiplied by inverse probability of treatment weights (IPTW) to simultaneously adjust for confounding
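A minimal sketch of stratified inverse probability of censoring weights. The cohort counts are hypothetical, L is a single binary predictor of censoring, and the function names are illustrative. The numbers are built so that risk depends only on L, which makes the true risk ratio 1.

```python
import math

# Hypothetical cohort. Key: (exposure E, censoring predictor L).
# Value: (n at baseline, n uncensored, events among the uncensored).
COHORT = {
    (1, 0): (1000, 900, 90),
    (1, 1): (1000, 500, 150),
    (0, 0): (1000, 900, 90),
    (0, 1): (1000, 800, 240),
}

def complete_case_risk(e):
    """Risk among the uncensored in exposure group e, ignoring censoring."""
    rows = [v for (ee, _), v in COHORT.items() if ee == e]
    return sum(ev for _, _, ev in rows) / sum(unc for _, unc, _ in rows)

def ipcw_risk(e):
    """Risk in exposure group e, weighting each uncensored subject by
    1 / Pr(uncensored | E, L), estimated within each (E, L) stratum."""
    weighted_events = 0.0
    weighted_total = 0.0
    for (ee, _), (n, unc, ev) in COHORT.items():
        if ee != e:
            continue
        weight = n / unc   # inverse of the observed proportion uncensored
        weighted_events += weight * ev
        weighted_total += weight * unc
    return weighted_events / weighted_total

rr_complete_case = complete_case_risk(1) / complete_case_risk(0)
rr_ipcw = ipcw_risk(1) / ipcw_risk(0)

print(f"complete-case RR = {rr_complete_case:.3f}")
print(f"IPCW RR          = {rr_ipcw:.3f}")
```

The weighted analysis rebuilds the baseline cohort as if no one had been censored. It is valid here only because, by construction, those lost and those retained have the same risk within levels of E and L.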

Matching
