
Confounding and Interaction: Part II


accalia

Presentation Transcript


  1. Confounding and Interaction: Part II • Methods to reduce confounding • during study design: • Randomization • Restriction • Matching • during study analysis: • Stratified analysis • (Mathematical regression) • Interaction • What is it? How to detect it? • Additive vs. multiplicative interaction • Comparison with confounding • Statistical testing for interaction • Implementation in Stata

  2. Confounding • Confounding occurs if there is a factor C that is a “Common Cause” of both E and D • C is the genesis of a “backdoor path” between E and D • Adjusting/controlling for C closes the backdoor paths and eliminates confounding (Diagram: C linked to both E and D; the E to D relationship is marked “?”)

  3. Methods to Prevent or Reduce Confounding • By prohibiting at least one segment of the exposure-confounder-disease path, confounding is precluded • C is the genesis of a “backdoor path” to E and D (Diagram: C linked to both E and D; the E to D relationship is marked “?”)

  4. Confounding and Interaction: Part II • Methods to reduce confounding • during study design: • Randomization • Restriction • Matching • during study analysis: • Stratified analysis • (Mathematical regression) • Interaction • What is it? How to detect it? • Additive vs. multiplicative interaction • Comparison with confounding • Statistical testing for interaction • Implementation in Stata

  5. Randomization to Reduce Confounding • Definition: random assignment of subjects to exposure (e.g., treatment) categories • (Diagram: All subjects, then Randomize, then Exposed (treatment) or Unexposed (no treatment)) • Distribution of any variable is theoretically the same in the exposed group as in the unexposed group • Theoretically, there can be no association between exposure and any other variable • Comes closest to the goal of “exchangeability” or the counterfactual ideal (although it still falls short) • One of the most important inventions of the 20th century!

  6. Randomization to Prevent Confounding • Blocking the path to exposure explains the exalted role of randomization in clinical research (Diagram: nodes E, C, and D; the E to D relationship is marked “?”)

  7. Randomization to Reduce Confounding • (Diagram: All subjects, then Randomize, then Exposed (treatment) or Unexposed (no treatment)) • Applicable only for ethically assignable exposures (i.e., interventions, experiments) • Not for naturally occurring exposures (e.g., air pollution) • Special strength of randomization is its ability to control the effect of confounding variables of which the investigator is unaware • Because the distribution of any variable is theoretically the same across randomization groups • Does not, however, always eliminate confounding! • By chance alone, there can be imbalance • Magnitude of the resulting bias is contained in the confidence interval • Less of a problem in large studies • Techniques exist to ensure balance of certain variables (e.g., blocked or stratified randomization)
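
The point that chance imbalance is "less of a problem in large studies" can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the covariate prevalence (0.3), the study sizes (40 and 4000), and the function names are all assumptions chosen for illustration.

```python
import random


def covariate_imbalance(n_subjects, prevalence=0.3, seed=0):
    """Randomize n_subjects 1:1 and return the absolute difference between
    arms in the proportion carrying a binary covariate."""
    rng = random.Random(seed)
    # Hypothetical binary covariate, e.g. a confounder nobody measured.
    covariate = [rng.random() < prevalence for _ in range(n_subjects)]
    # Simple randomization: shuffle subjects, first half become "exposed".
    order = list(range(n_subjects))
    rng.shuffle(order)
    half = n_subjects // 2
    exposed = [covariate[i] for i in order[:half]]
    unexposed = [covariate[i] for i in order[half:]]
    return abs(sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed))


def mean_imbalance(n_subjects, n_trials=200):
    """Average chance imbalance over repeated hypothetical trials."""
    total = sum(covariate_imbalance(n_subjects, seed=s) for s in range(n_trials))
    return total / n_trials


small_study = mean_imbalance(40)
large_study = mean_imbalance(4000)
print(f"mean imbalance, n=40:   {small_study:.3f}")
print(f"mean imbalance, n=4000: {large_study:.3f}")
```

The covariate is never used to assign exposure, so any difference between arms is due to chance alone; the average difference shrinks as the study grows.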

  8. But what if we cannot randomize? • Methods to reduce confounding • during study design: • Randomization • Restriction • Matching • during study analysis: • Stratified analysis • (Mathematical regression) • Interaction • What is it? How to detect it? • Additive vs. multiplicative interaction • Comparison with confounding • Statistical testing for interaction • Implementation in Stata

  9. Restriction to Prevent Confounding • AKA Specification • Definition: Restrict enrollment to only those subjects who have a specific value/range of the confounding variable • e.g., when diet is a confounder, restrict to persons with a certain diet (Diagram: nodes E, C, and D; the E to D relationship is marked “?”)

  10. Night lights and childhood myopia • RQ: Do night lights cause children to develop myopia? • Restrict to children whose parents do not have myopia (Diagram: nodes Night Lights, Parental Myopia, and Child’s Myopia; the Night Lights to Child’s Myopia relationship is marked “?”)

  11. Restriction to Prevent Confounding • Particularly useful when confounder is quantitative in scale but difficult to measure • e.g., RQ: Does practice of commercial sex result in acquisition of HHV-8 infection? • Issue: Confounding by unmeasured behavioral factors operating through injection drug use • Problem: degree of injection drug use is difficult to measure • Solution: restrict to subjects with no injection drug use, thereby precluding the need to measure degree of injection use • Cannon et al. NEJM 2001 • Restricted to persons denying injection drug use • e.g., Effect of HIV infection on pulmonary hypertension: confounding by IDU (Hsue et al. AIDS 2008) (Diagram: nodes Behavioral factors (unmeasured), Injection drug use, Commercial sex, and HHV-8; the Commercial sex to HHV-8 relationship is marked “?”)

  12. Restriction to Reduce Confounding • Advantages: • conceptually straightforward • handles difficult to quantitate variables • unlike matching, decisions can be made about individual subjects (include or not include) irrespective of other subjects • can also be used in analysis phase

  13. Restriction to Reduce Confounding • Disadvantages: • may limit number of eligible subjects • cost-inefficient to screen subjects, then not enroll • “residual confounding” may persist if restriction categories are not sufficiently narrow (e.g., a “20 to 30 years old” restriction in the Birth Order - Down syndrome question might be too broad) • limits generalizability, but • “Validity before generalizability” • Including small numbers of persons in a rare stratum of a confounder (e.g., race) and then finding an effect for an exposure/treatment does not mean the effect is operative in that rare group • Politics trumping science • not possible to evaluate the relationship of interest at different levels of the restricted variable (i.e., cannot assess statistical interaction) • Bottom Line • Restriction not used as much as it should be

  14. Methods to reduce confounding • during study design: • Randomization • Restriction • Matching • during study analysis: • Stratified analysis • (Mathematical regression) • Interaction • What is it? How to detect it? • Additive vs. multiplicative interaction • Comparison with confounding • Statistical testing for interaction • Implementation in Stata

  15. Matching to Reduce Confounding • Definition: unexposed (or non-case) subjects are enrolled only if they match subjects in the comparison group (exposed subjects or cases) on the confounder in question • Mechanics depend upon study design: • e.g., cohort study: unexposed subjects are “matched” to exposed subjects according to their values for the potential confounder • e.g., matching on race: one unexposed Latino enrolled for each exposed Latino; one unexposed Asian enrolled for each exposed Asian • e.g., case-control study: non-diseased controls are “matched” to diseased cases • e.g., matching on age: one control aged 50 enrolled for each case aged 50; one control aged 70 enrolled for each case aged 70 • can be in age ranges, e.g., +/- 2.5 years • Operationally, performed by “individual matching” (one-by-one) or frequency matching (e.g., select control group at the end to match distribution of confounding factor in case group)
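
Individual matching on age within a range can be sketched in a few lines. This is an illustrative sketch only: the greedy nearest-age rule, the function name `match_controls`, and the subject IDs and ages are assumptions, not a procedure described in the slides; only the +/- 2.5 year range comes from the text above.

```python
def match_controls(cases, candidates, caliper=2.5):
    """Greedy 1:1 individual matching on age.

    cases and candidates are lists of (subject_id, age) tuples.
    Returns (case_id, control_id) pairs; a case with no candidate within
    the caliper is left unmatched and omitted from the result.
    """
    available = list(candidates)
    pairs = []
    for case_id, case_age in cases:
        eligible = [c for c in available if abs(c[1] - case_age) <= caliper]
        if not eligible:
            continue
        # Take the closest-aged eligible control, then remove it from the
        # pool so that each control is used at most once.
        best = min(eligible, key=lambda c: abs(c[1] - case_age))
        pairs.append((case_id, best[0]))
        available.remove(best)
    return pairs


# Hypothetical subjects: two cases (ages 50 and 70) and five candidate controls.
cases = [("case1", 50), ("case2", 70)]
candidates = [("ctl1", 31), ("ctl2", 49), ("ctl3", 52), ("ctl4", 71.5), ("ctl5", 88)]
pairs = match_controls(cases, candidates)
print(pairs)
```

Greedy matching is order-dependent and is used here only because it is short; frequency matching would instead sample controls to reproduce the case group's age distribution.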

  16. Matching to Prevent Confounding • Cross-sectional/cohort study: uncommon in large cohort studies, typically because there is not just one exposure of interest; more common and can be valuable in smaller studies with a single focused exposure • Case-control study: more common use of matching; can be relevant for a variety of exposures (Diagrams: one E, C, D diagram per study design, each with the E to D relationship marked “?”)

  17. Advantages of Matching 1. Useful in preventing confounding by factors which would be nearly impossible or statistically inefficient to manage in analysis phase • e.g., “neighborhood” is a nominal variable with multiple values (complex nominal variable) • e.g., Case-control study of the effect of a BCG vaccine in preventing TB (Int J Tub Lung Dis. 2006) • Cases: newly diagnosed TB in Brazil • Controls: persons without TB • Exposure: receipt of a BCG vaccine • Potential confounder: neighborhood (village) of residence; related to ambient TB incidence and practices regarding BCG vaccine • Control sampling: Relying upon random sampling without attention to neighborhood may result in (especially in a small study) choosing no controls from some of the neighborhoods seen in the case group (i.e., cases and controls lack overlap) • Matching on neighborhood ensures overlap • Even if all neighborhoods seen in the case group were represented in the control group, adjusting for neighborhood with “analysis phase” strategies is problematic

  18. Neighborhood: If you choose to stratify to manage confounding, the number of strata is unwieldy • (Figure: crude table stratified into one table per neighborhood: Mission, Sunset, Richmond, Marina, Pacific Heights, Castro) • Matching avoids this dilemma in the analysis phase

  19. Advantages of Matching 2. Provides a way to ensure overlap between comparator groups (e.g., cases/controls) in the distribution of confounders other than complex nominal variables • e.g., Case-control study of prostate cancer: potential confounding by age • Cases will have many old individuals • Random sampling of controls, especially in smaller studies, is apt not to contain the oldest individuals • Matching the age distribution of controls to the age distribution of cases ensures complete overlap in age between cases and controls (Figure: age distributions of controls and cases)

  20. Advantages of Matching 3. By ensuring a balanced number of cases and controls (in a case-control study) or exposed/unexposed (in a cohort study) within the various strata of the confounding variable, statistical precision may be increased

  21. Smoking, Matches, and Lung Cancer • A. Random sample of controls • Crude: ORcrude = 8.8 • Stratified: Smokers: OR CF+ = ORsmokers = 1.0; Non-Smokers: OR CF- = ORnon-smokers = 1.0 • ORadj = 1.0 (0.31 to 3.2) • B. Controls matched on smoking • Stratified: Smokers: OR CF+ = ORsmokers = 1.0; Non-Smokers: OR CF- = ORnon-smokers = 1.0 • ORadj = 1.0 (0.40 to 2.5) • Matching facilitates statistically efficient stratification • Underappreciated benefit of matching: Improved precision

  22. Advantages of Matching 4. People find it easy to understand, likely because it comes close to fulfilling “exchangeability” objective. • So intuitive that it is often the first choice among the uninitiated (“let’s match on x, y, and z”) This is both good and bad

  23. Disadvantages of Matching 1. Finding appropriate matches may be difficult and expensive. Therefore, the gains in statistical efficiency can be offset by increases in overall costs. • Exacerbated when matching on more than one factor jointly 2. In a case-control study, the factor used to match subjects cannot itself be evaluated as a risk factor for the disease. In general, matching decreases the robustness of a study to address secondary questions. 3. In a case-control study, must still perform either stratification or regression in the analysis phase. • This is because matching artifactually induces cases and controls to look more similar regarding exposure • If this extra step is forgotten (out of ignorance, or because the matching aspect simply gets lost over time), the crude OR is biased

  24. More Disadvantages of Matching 4. Decisions are irrevocable • if you happened to match on an intermediary factor, you have lost the ability to evaluate the role of the exposure in question via that pathway • e.g., study of the effect of exercise on coronary artery disease: matching on HDL cholesterol precludes the ability to assess the total effect of exercise • Inadvertently matching on a collider permanently induces bias 5. If a potential confounding factor really isn’t a confounder, statistical precision can be worse than with no matching. Bottom line: Matching is very useful in certain situations but should not be done indiscriminately. Think carefully before you match, and seek advice

  25. Overmatching • Often used term, poorly understood • Two types of overmatching manifestations • Overmatching resulting in precision losses • In case-control studies, matching on factors which are truly not confounders will result in larger standard errors compared to not matching • Especially bad for factors associated with exposure but not disease • In case-control or cohort studies, matching on factors very strongly related to exposure results in collinearity • Not unique to matching; occurs with stratification or regression as well • Overmatching resulting in bias • Matching on intermediary factors • Matching on colliders

  26. Confounding and Interaction: Part II • Methods to reduce confounding • during study design: • Randomization • Restriction • Matching • during study analysis: • Stratified analysis • (Mathematical regression) • Interaction • What is it? How to detect it? • Additive vs. multiplicative interaction • Comparison with confounding • Statistical testing for interaction • Implementation in Stata

  27. Strategies in the analysis phase: Stratification to Reduce Confounding • Goal: evaluate the relationship between the exposure and outcome in strata homogeneous with respect to potentially confounding variables • Each stratum is a mini-example of restriction! • CF = confounding factor (Figure: crude table stratified into CF Level 1, CF Level 2, CF Level 3)

  28. Smoking, Matches, and Lung Cancer • Crude: ORcrude = 8.8 • Stratified by smoking; each stratum is unconfounded with respect to smoking • Smokers: OR CF+ = ORsmokers = 1.0 • Non-Smokers: OR CF- = ORnon-smokers = 1.0

  29. More than One Confounder • RQ: Does Chlamydia pneumoniae infection cause coronary artery disease (CAD)? (Diagram: nodes Chlamydia pneumoniae infection, Smoking, Age, and CAD; the Chlamydia pneumoniae infection to CAD relationship is marked “?”)

  30. Stratifying by Multiple Confounders • Potential Confounders: Age and Smoking • To control for multiple confounders simultaneously, must construct mutually exclusive and exhaustive strata (Figure: crude table)

  31. Stratifying by Multiple Potential Confounders • Crude table stratified into: <40 smokers; 40-60 smokers; >60 smokers; <40 non-smokers; 40-60 non-smokers; >60 non-smokers • Each of these strata is unconfounded by age and smoking
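
Building mutually exclusive and exhaustive strata from two confounders amounts to taking the cross product of their categories and assigning each subject to exactly one cell. A minimal sketch follows; the cut-points mirror the age bands on the slide, while the function names, the handling of the boundary ages (40 and 60 are placed in the middle band), and the example subjects are assumptions.

```python
from itertools import product

AGE_BANDS = ["<40", "40-60", ">60"]
SMOKING = ["smoker", "non-smoker"]

# Every combination of age band and smoking status: 3 x 2 = 6 strata.
ALL_STRATA = list(product(AGE_BANDS, SMOKING))


def age_band(age):
    """Map an age in years to exactly one band (boundary choice is assumed)."""
    if age < 40:
        return "<40"
    if age <= 60:
        return "40-60"
    return ">60"


def stratum(age, smokes):
    """Return the single stratum a subject belongs to."""
    return (age_band(age), "smoker" if smokes else "non-smoker")


# Hypothetical subjects as (age, smokes) pairs.
subjects = [(35, True), (45, False), (61, True), (60, False)]
counts = {s: 0 for s in ALL_STRATA}
for age, smokes in subjects:
    counts[stratum(age, smokes)] += 1
print(counts)
```

Because each subject maps to one and only one cell, the stratum counts always sum to the number of subjects; the number of cells grows multiplicatively as confounders are added.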

  32. Adjusted Estimate from the Stratified Analyses • After the strata have been formed, what next? • Process: Summarize the unconfounded estimates from the two (or more) strata to form a single overall unconfounded “adjusted” estimate • e.g., summarize the odds ratios from the smoking stratum and non-smoking stratum into one odds ratio

  33. Smoking, Matches, and Lung Cancer • Crude: ORcrude = 8.8 • Stratified: • Smokers: OR CF+ = ORsmokers = 1.0 • Non-Smokers: OR CF- = ORnon-smokers = 1.0 • ORadjusted = 1.0
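
One way to see how a crude OR of about 8.8 can coexist with stratum-specific ORs of 1.0 is to work through a pair of 2x2 tables. The counts below are hypothetical, chosen only so that they reproduce the ORs quoted on the slide; they are not the original data. The Mantel-Haenszel estimator is used as one common choice of summary; this part of the transcript does not name the summary method.

```python
def odds_ratio(a, b, c, d):
    """OR for one 2x2 table: a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts; exposure = carrying matches, outcome = lung cancer.
smokers = (810, 90, 270, 30)
non_smokers = (10, 90, 70, 630)
# Collapsing over smoking gives the crude table.
crude = tuple(x + y for x, y in zip(smokers, non_smokers))

or_crude = odds_ratio(*crude)
or_smokers = odds_ratio(*smokers)
or_non_smokers = odds_ratio(*non_smokers)
or_adjusted = mantel_haenszel_or([smokers, non_smokers])
print(round(or_crude, 1), or_smokers, or_non_smokers, or_adjusted)
```

In these made-up tables, most smokers carry matches and most cases are smokers, which is what inflates the crude OR even though matches have no association with lung cancer within either stratum.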

  34. Smoking, Caffeine Use and Delayed Conception (RR = risk ratio; Stanton and Gray. AJE 1995) • Crude: RRcrude = 1.7 • Stratified: • Heavy Caffeine Use: RRcaffeine use = 0.7 • No Caffeine Use: RRno caffeine use = 2.4 • Is it appropriate to summarize these two stratum-specific risk ratio estimates into a single number?

  35. Underlying Assumption Needed to Form a Summary of the Unconfounded Stratum-Specific Estimates • If the relationship between the exposure and the outcome varies meaningfully in a clinical/biologic sense across strata of a third variable: • it is not appropriate to create a single summary estimate of all of the strata • i.e. When you summarize across strata, the assumption is that no “statistical interaction” is present
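
Whether stratum-specific estimates differ "meaningfully" is mainly a clinical/biologic judgment, but a statistical test of homogeneity can serve as one input. The sketch below uses a simple Wald test on the difference of log risk ratios between two strata. It is an assumption-laden illustration: the test choice, the function names, and the counts (picked only to give RRs of 0.7 and 2.4) are not taken from the slides or from the cited study.

```python
import math


def log_rr_and_variance(exposed_events, exposed_total,
                        unexposed_events, unexposed_total):
    """Log risk ratio and its large-sample variance for one stratum.
    Assumes at least one event in each group."""
    rr = (exposed_events / exposed_total) / (unexposed_events / unexposed_total)
    variance = (1 / exposed_events - 1 / exposed_total
                + 1 / unexposed_events - 1 / unexposed_total)
    return math.log(rr), variance


def homogeneity_test(stratum_1, stratum_2):
    """Wald z statistic and two-sided p-value for equal RRs in two strata."""
    log_rr1, var1 = log_rr_and_variance(*stratum_1)
    log_rr2, var2 = log_rr_and_variance(*stratum_2)
    z = (log_rr1 - log_rr2) / math.sqrt(var1 + var2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p


# Hypothetical counts: (exposed events, exposed n, unexposed events, unexposed n).
heavy_caffeine = (14, 100, 20, 100)   # RR = 0.14 / 0.20 = 0.7
no_caffeine = (24, 100, 10, 100)      # RR = 0.24 / 0.10 = 2.4
z, p = homogeneity_test(heavy_caffeine, no_caffeine)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value argues against a single summary estimate, but such tests have low power in small studies, so a non-significant result does not by itself establish homogeneity.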

  36. Statistical Interaction • Definition • when the magnitude of a measure of association (between exposure and disease) meaningfully differs according to the value of some third variable • Synonyms • Effect-measure modification* • Effect modification • Heterogeneity of effect • Heterogeneity of measure • Nonuniformity of effect • Effect variation • Proper terminology • e.g., Smoking, caffeine use, delayed conception • Caffeine use modifies the effect of smoking on the risk for delayed conception.* • There is interaction between caffeine use and smoking in the risk for delayed conception. • Caffeine is an effect modifier in the relationship between smoking and delayed conception.

  37. (Figure, two panels) • RR = 3.0 and RR = 3.0: parallel lines mean no interaction • RR = 11.2 and RR = 3.0: non-parallel lines mean interaction

  38. RR = 2.4 RR = 0.7

  39. Interaction is everywhere • Susceptibility to infectious diseases • e.g., • exposure: sexual activity • disease: HIV infection • effect modifier: chemokine receptor phenotype • Susceptibility to non-infectious diseases • e.g., • exposure: smoking • disease: lung cancer • effect modifier: genetic susceptibility to smoke • Susceptibility to drugs (efficacy and side effects) • effect modifier: genetic susceptibility to drug • “personalized medicine” is an expression of interaction • But in practice to date, difficult to document • Genomics may change this

  40. Smoking, Caffeine Use and Delayed Conception: Additive vs Multiplicative Interaction • Crude: RRcrude = 1.7, RDcrude = 0.07 • Stratified: • Heavy Caffeine Use: RRcaffeine use = 0.7, RDcaffeine use = -0.06 • No Caffeine Use: RRno caffeine use = 2.4, RDno caffeine use = 0.12 • The RRs differ: multiplicative interaction • The RDs differ: additive interaction • RD = Risk Difference = Risk exposed - Risk unexposed

  41. Additive vs Multiplicative Interaction • Assessment of whether interaction is present depends upon the measure of association • ratio measure (multiplicative interaction) or difference measure (additive interaction) • Hence, the term effect-measure modification • Absence of multiplicative interaction implies presence of additive interaction (exception: no association) (Figure: strata with RR = 3.0, RD = 0.3 and RR = 3.0, RD = 0.1; multiplicative interaction absent, additive interaction present)

  42. Additive vs Multiplicative Interaction • Absence of additive interaction implies presence of multiplicative interaction (Figure: strata with RR = 1.7, RD = 0.1 and RR = 3.0, RD = 0.1; multiplicative interaction present, additive interaction absent)

  43. Additive vs Multiplicative Interaction • Presence of multiplicative interaction may or may not be accompanied by additive interaction (Figure, two panels) • Strata with RR = 2.0, RD = 0.1 and RR = 3.0, RD = 0.1: no additive interaction • Strata with RR = 3.0, RD = 0.4 and RR = 2.0, RD = 0.1: additive interaction present

  44. Additive vs Multiplicative Interaction • Presence of additive interaction may or may not be accompanied by multiplicative interaction (Figure, two panels) • Strata with RR = 3.0, RD = 0.4 and RR = 2.0, RD = 0.1: multiplicative interaction present • Strata with RR = 3.0, RD = 0.2 and RR = 3.0, RD = 0.1: multiplicative interaction absent
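
The scale dependence described in the preceding slides can be checked numerically by computing both the risk ratio and the risk difference in each stratum. In this sketch the stratum risks are hypothetical values back-calculated to give simple RR and RD combinations (for example RR = 3.0 with RD = 0.3 versus RR = 3.0 with RD = 0.1); the function names and the numerical tolerance are assumptions.

```python
def risk_ratio(risk_exposed, risk_unexposed):
    return risk_exposed / risk_unexposed


def risk_difference(risk_exposed, risk_unexposed):
    return risk_exposed - risk_unexposed


def interaction_scales(stratum_a, stratum_b, tol=1e-9):
    """Compare two strata, each given as (risk_exposed, risk_unexposed).

    Reports whether the RRs differ (multiplicative interaction) and whether
    the RDs differ (additive interaction), ignoring sampling error."""
    rr_gap = abs(risk_ratio(*stratum_a) - risk_ratio(*stratum_b))
    rd_gap = abs(risk_difference(*stratum_a) - risk_difference(*stratum_b))
    return {"multiplicative": rr_gap > tol, "additive": rd_gap > tol}


# Same RR (3.0) in both strata, different RD (0.3 vs 0.1).
same_ratio = interaction_scales((0.45, 0.15), (0.15, 0.05))
# Same RD (0.1) in both strata, different RR (about 1.7 vs 3.0).
same_difference = interaction_scales((0.25, 0.15), (0.15, 0.05))
print(same_ratio)
print(same_difference)
```

The comparison treats the risks as known quantities; with real data the stratum estimates carry sampling error, so equality would be judged with confidence intervals or a formal test rather than a fixed tolerance.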

  45. Additive vs Multiplicative Interaction • Presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction • Multiplicative and additive interaction both present • e.g., smoking, caffeine, delayed conception

  46. Additive vs Multiplicative Scales • Which do you want to use? • Multiplicative measures (e.g., risk ratio) • favored measure in etiologic research • not dependent upon background incidence of disease • Additive measures (e.g., risk difference): • readily translated into impact of an exposure (or intervention) in terms of absolute number of outcomes prevented • e.g. 1/risk difference = no. needed to treat to prevent (or avert) one case of disease • or no. of exposed persons one needs to take the exposure away from to avert one case of disease • very dependent upon background incidence of disease • gives “public health impact” of the exposure

  47. Additive vs Multiplicative Scales • Causally related but minor public health importance • Risk ratio = 2 • Risk difference = 0.0001 - 0.00005 = 0.00005 • Need to eliminate exposure in 20,000 persons to avert one case of disease • Causally related and major public health importance • RR = 2 • RD = 0.2 - 0.1 = 0.1 • Need to eliminate exposure in 10 persons to avert one case of disease
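
The arithmetic behind the two scenarios above is the reciprocal of the risk difference. A short sketch using the risks quoted on the slide (the function name is an assumption):

```python
def number_needed_to_avert_one_case(risk_exposed, risk_unexposed):
    """Number of persons from whom the exposure must be removed to avert
    one case: 1 / risk difference."""
    rd = risk_exposed - risk_unexposed
    if rd == 0:
        raise ValueError("risk difference is zero; the number is undefined")
    return 1 / rd


# Both scenarios have a risk ratio of 2 but very different risk differences.
rare_outcome = number_needed_to_avert_one_case(0.0001, 0.00005)
common_outcome = number_needed_to_avert_one_case(0.2, 0.1)
print(round(rare_outcome), round(common_outcome))
```

Rounding is applied only for display, because floating-point subtraction makes the raw results differ from 20,000 and 10 by a negligible amount.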

  48. Smoking, Family History and Cancer: Additive vs Multiplicative Interaction • Stratified: • Family History Present: Risk ratio = 2.0, RD = 0.20 • Family History Absent: Risk ratio = 2.0, RD = 0.05 • No multiplicative interaction but presence of additive interaction • If etiology is the goal, risk ratio is sufficient • If the goal is to define sub-groups of persons to target: • Rather than ignoring it, it is worth reporting that only 5 persons with a family history have to be prevented from smoking to avert one case of cancer

  49. Confounding vs Interaction • We discovered interaction by performing stratification as a means to evaluate for confounding • This is where the similarities between confounding and interaction end! • Confounding • An extraneous or nuisance pathway that an investigator hopes to prevent or rule out • Interaction (Effect-measure modification) • A more detailed description of the relationship between the exposure and disease • A richer description of the biologic or behavioral system under study • A finding to be reported, not a bias to be eliminated

  50. Smoking, Caffeine Use and Delayed Conception • Crude: RRcrude = 1.7 • Stratified: • Heavy Caffeine Use: RRcaffeine use = 0.7 • No Caffeine Use: RRno caffeine use = 2.4 • RRadjusted = 1.4 (95% CI = 0.9 to 2.1) • Is this the best “final” answer? • In etiologic research, adjustment here is contraindicated • When interaction is present, confounding becomes irrelevant! • (Exception: sometimes in public health research, the adjusted RR would be used to understand the net effect of the exposure across the population)
