
Graphical Exploration of Statistical Interactions

Nick Jackson, University of Southern California, Department of Psychology. 10/25/2013.


Presentation Transcript


  1. Graphical Exploration of Statistical Interactions Nick Jackson University of Southern California Department of Psychology 10/25/2013

  2. Overview • What is Interaction? • 2-Way Interactions • Categorical X Categorical • Continuous X Categorical • Continuous X Continuous • 3-Way Interactions • Categorical X Continuous X Continuous • Continuous X Continuous X Continuous • Time in a Three-Way Interaction • 4-Way and beyond

  3. What is an Interaction? • Equivalent Statements: • The relationship between X and Y depends on the level of a third variable Z. • Z modifies the effect of X on Y. • The relationship between X and Y is different at differing levels of Z. • Also called Moderation or Effect Modification. Moderation is a stupid term. • Moderation (n): The avoidance of excess or extremes. • Moderate (v): To make or become less extreme or intense. Those are kind of the opposite of what we mean when we say moderation in a statistical sense.

  4. What is an Interaction? As SEM diagrams: [Figure: two path diagrams relating X, Z, and Y, one of which includes an X*Z term.]

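  5. What is an Interaction? [Figure: two panels of Y against X. One panel, "Z modifies the effect of X on Y", shows separate lines for Z=0 and Z=1. The other panel shows the "Effect of X on Y if we ignore Z".]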
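In equation form, the idea on the last three slides can be sketched as follows. The coefficient names b_0 through b_3 are generic placeholders, not values from the presentation.

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 Z + b_3 (X \times Z) + \varepsilon \\
\frac{\partial Y}{\partial X} &= b_1 + b_3 Z
\end{aligned}
```

With a 0/1 moderator, the slope of X on Y is b_1 when Z=0 and b_1 + b_3 when Z=1, so b_3 is the amount by which Z modifies the effect of X. Ignoring Z amounts to fitting a single slope for everyone.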

  6. Types of Interaction • Quantitative Interaction Only: The difference between X(0) and X(1) is significantly different between Z(0) and Z(1), though these differences are not qualitatively different (visually they look about the same). This occurs as a result of substantial power. • Qualitative Interaction: The difference between X(0) and X(1) may or may not be significantly different between Z(0) and Z(1); however, these differences are qualitatively different (i.e., it really does look like an interaction). [Figure: two panels of Y for X=0 and X=1 at Z=0 and Z=1, one per interaction type, annotated "X*Z, p<0.05".]

  7. Graphing the Interaction • Why Graph? • Interpreting the interaction coefficient(s) is not always intuitive • Two ways to graph: • 1) Look at observed means/values • Represents your actual data • Very easy to do in any package • Does not represent the statistical model being used • 2) Look at marginal (predicted) means/values from regression equation • A direct representation of the statistical model you are using • For interactions with continuous variables, it allows you to see where the interaction is occurring.

  8. Graphing the Interaction More about marginal (predicted) means/values from the regression equation • The General Idea: • Take the regression equation and predict values for the different levels of your variables X and Z • For any covariates, use their mean levels • An Example: Find the predicted means: • Diabetes=1, Gender=1: 75 + 20.5(1) + 15(1) + 10.5(1*1) = 121 • Diabetes=0, Gender=1: 75 + 20.5(0) + 15(1) + 10.5(0*1) = 90 • Diabetes=1, Gender=0: 75 + 20.5(1) + 15(0) + 10.5(1*0) = 95.5 • Diabetes=0, Gender=0: 75 + 20.5(0) + 15(0) + 10.5(0*0) = 75 • Standard errors of the predictions can also be obtained, though this is a bit difficult.
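The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the coefficients are copied from the slide, while the function and constant names are made up for illustration.

```python
# Coefficients copied from the slide's example equation.
INTERCEPT = 75.0
B_DIABETES = 20.5
B_GENDER = 15.0
B_INTERACTION = 10.5


def predicted_mean(diabetes: int, gender: int) -> float:
    """Predicted outcome for one combination of the two 0/1 predictors."""
    return (INTERCEPT
            + B_DIABETES * diabetes
            + B_GENDER * gender
            + B_INTERACTION * diabetes * gender)


def all_cell_means() -> dict:
    """Predicted means for every Diabetes x Gender cell, keyed by (diabetes, gender)."""
    return {(d, g): predicted_mean(d, g) for d in (0, 1) for g in (0, 1)}


if __name__ == "__main__":
    for (d, g), mean in sorted(all_cell_means().items()):
        print(f"Diabetes={d}, Gender={g}: {mean}")
```

These four cell means are what a bar graph of the marginal estimates would display. If the model had covariates, each would contribute its coefficient times its mean to every cell.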

  9. Graphing the Interaction (Marginal Estimates) • Available in most Software Packages: • margins and marginsplot commands in Stata • lsmeans and effects packages in R; predict and predict.lm functions in R. • Some good ways to look at interactions in R: http://www.ats.ucla.edu/stat/r/faq/concon.htm • Least-Squares Means (LSMEANS), Slicing, Contrasts, Estimate in SAS. • SPSS GLM (emmeans), estimated marginal means

  10. Two-Way Interactions • Categorical X Categorical Interaction • Use Bar Graphs • 2 X 2: Below are equivalent representations of the same interaction…so which is it? [Figure: two bar graphs of Blood Pressure. One groups the Asian and White bars by gender (Male, Female); the other groups the Male and Female bars by race (Asian, White).] • Among males, Asians have a higher blood pressure than Whites. Among females, Asians have a lower blood pressure than Whites. • Among Whites, females have a higher blood pressure than males. Among Asians, females have a lower blood pressure than males.

  11. Two-Way Interactions • Continuous X Categorical Interaction • Could make the continuous variable categorical and use a bar graph. • Better idea: use Scatter Plots/Linear Predictions for each category. We can see that as BMI increases, blood pressure increases more sharply in Men than in Women. By looking at the Confidence Intervals we can start to get an idea about when the genders diverge (statistically) in their effects.

  12. Two-Way Interactions • Continuous X Categorical Interaction • Look at how the Slope of Gender (the difference between Men and Women) changes across varying levels of BMI. • We can use the 95% CI to see when these differences become significant. The differences in mean blood pressure between men and women become more pronounced at higher BMIs, such that women have a lower BP than men as BMI increases. These differences are statistically significant (95% CI of the difference does not include 0) past a BMI of around 35.
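One way to compute the quantity plotted on this slide is sketched below. It assumes a linear model with a BMI, a Male indicator, and their product, and it uses a normal-approximation 95% CI. Every number in `EXAMPLE` is invented for illustration and is not an estimate from the presentation.

```python
import math


def group_difference(bmi, b_group, b_inter, var_group, var_inter, cov_gi, z=1.96):
    """Group difference at a given BMI, with a normal-approximation CI.

    Assumes a model of the form
        BP = b0 + b_bmi*BMI + b_group*Male + b_inter*BMI*Male
    so the Male-minus-Female difference at a given BMI is
        b_group + b_inter*BMI.
    Returns (difference, lower bound, upper bound).
    """
    diff = b_group + b_inter * bmi
    # Variance of a linear combination of two estimated coefficients.
    variance = var_group + bmi ** 2 * var_inter + 2 * bmi * cov_gi
    se = math.sqrt(variance)
    return diff, diff - z * se, diff + z * se


# Hypothetical estimates, invented for illustration only (not from the slides).
EXAMPLE = dict(b_group=-20.0, b_inter=0.8,
               var_group=100.0, var_inter=0.1, cov_gi=-3.0)

if __name__ == "__main__":
    for bmi in range(20, 45, 5):
        diff, lo, hi = group_difference(bmi, **EXAMPLE)
        verdict = "excludes 0" if (lo > 0 or hi < 0) else "includes 0"
        print(f"BMI={bmi}: diff={diff:.1f}, 95% CI=({lo:.1f}, {hi:.1f}) {verdict}")
```

The BMI values at which the interval stops covering 0 are where the group difference becomes statistically significant. In practice the coefficients and their variance-covariance terms would come from the fitted model.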

  13. Two-Way Interactions • Continuous X Categorical Interaction • With a categorical variable that has more than two groups

  14. Two-Way Interactions • Continuous X Categorical Interaction • With a categorical variable that has more than two groups • Same as before, just plotting the differences relative to the reference group • Works the same with non-linear continuous variables.

  15. Two-Way Interactions • Continuous X Continuous Interaction • Traditional Methods • Discretize one of the continuous variables, making it categorical, and do the usual procedures for categorical X continuous interactions. • Usually +1 and -1 SD (this method sucks: it can miss where the interaction occurs) • Newer Method: Predict values at percentiles of the continuous variables • Generally avoid the extremes of the percentiles (<5 or >95), as the variability is greater at the extremes • Newer Method: Use 3-D Graphing (Surface/Mesh Plots) • Same idea as predicting values at the percentiles, but using 3D modeling software
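The percentile approach can be sketched in plain Python as below. The model form (two predictors plus their product) and the function names are assumptions for illustration. The coefficients passed in would come from your own fitted model.

```python
def percentile(values, p):
    """Percentile p (0-100), by linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def prediction_grid(x_values, z_values, coefs, percentiles=(5, 25, 50, 75, 95)):
    """Predicted Y at each pair of percentiles of X and Z.

    Assumes the model Y = b0 + b_x*X + b_z*Z + b_xz*X*Z, with
    coefs = (b0, b_x, b_z, b_xz). Returns a dict keyed by
    (x percentile, z percentile).
    """
    b0, b_x, b_z, b_xz = coefs
    grid = {}
    for pz in percentiles:
        z = percentile(z_values, pz)
        for px in percentiles:
            x = percentile(x_values, px)
            grid[(px, pz)] = b0 + b_x * x + b_z * z + b_xz * x * z
    return grid
```

Plotting one line per Z percentile against the X percentiles gives the kind of graph described here. The default percentiles stay within 5 to 95, in line with the advice to avoid the extremes.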

  16. Two-Way Interactions Continuous X Continuous Interaction: Predicted values at percentiles

  17. Two-Way Interactions Continuous X Continuous Interaction: Which way we graph it is fairly arbitrary • We can see that the nature of the relationship changes at around a BMI of 30. We could say that BMI has a positive association with Blood Pressure, and that this relationship is the strongest among those with high cholesterol. Those with low cholesterol do not see a relationship of BMI with Blood Pressure. • We can see that the nature of the relationship changes at around a cholesterol value of 3.5. We could say that Cholesterol has a positive association with Blood Pressure, and that this relationship is the strongest among those with high BMI. Those with low BMI have a negative or no relationship of Cholesterol with Blood Pressure.

  18. Two-Way Interactions Continuous X Continuous Interaction: Another way to interpret: 4-Corners Method • Low Chol, Low BMI = 133 • Low Chol, High BMI = 125 • High Chol, Low BMI = 130 • High Chol, High BMI = 155 The combination of being Obese (BMI >30) and having high cholesterol results in high BP.
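The four corner values can also be summarized numerically. The sketch below uses the predicted values from the slide. The difference-of-differences summary is an added illustration, not something computed in the presentation.

```python
# Predicted blood pressure at the four corners, keyed by (cholesterol, bmi).
# Values are taken from the slide.
CORNERS = {
    ("low", "low"): 133,
    ("low", "high"): 125,
    ("high", "low"): 130,
    ("high", "high"): 155,
}


def bmi_effect(chol: str) -> int:
    """Change in predicted BP from low to high BMI at one cholesterol level."""
    return CORNERS[(chol, "high")] - CORNERS[(chol, "low")]


def interaction_contrast() -> int:
    """How much the BMI effect differs between high and low cholesterol."""
    return bmi_effect("high") - bmi_effect("low")


if __name__ == "__main__":
    print("BMI effect at low cholesterol:", bmi_effect("low"))
    print("BMI effect at high cholesterol:", bmi_effect("high"))
    print("Difference of differences:", interaction_contrast())
```

A contrast far from zero is the numerical counterpart of non-parallel lines in the plot: here BMI barely matters (or slightly lowers predicted BP) at low cholesterol, but raises it substantially at high cholesterol.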

  19. Two-Way Interactions Continuous X Continuous Interaction: 3D Mesh Plots (Matlab, Sigma Plot, R) Same data as before, same interpretation. Use 4-Corners. [Figure: two 3D mesh plots, one of the Observed Data and one of the Marginal Estimates Data.] Why we generally don’t use observed data: it is not smooth.

  20. Two-Way Interactions Continuous X Continuous Interaction: Useful for Non-linear continuous interactions (Response Surface Model)

  21. Three-Way Interactions • Now things get complicated. • Variables W*X*Z used to predict Y. • The Interaction of X*Z is different at differing levels of W • Or X*W is different at differing levels of Z • Or Z*W is different at differing levels of X • Or the relationship of X and Y is different according to the levels of W and Z, etc. • Substantially easier when one of X, W, or Z is categorical

  22. Three-Way Interactions • Substantially easier when one of X, W, or Z is categorical…. • so we pick a small range of values to predict one of the variables over…treating it as semi-discrete (Quartiles?) • Often Time is the third variable • Interested in whether the interaction of X*Z changes over Time (W)
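The equivalent readings listed on this slide all follow from one regression equation. As a sketch, with generic coefficient names b_0 through b_7 that are not from the presentation:

```latex
\begin{aligned}
Y &= b_0 + b_1 X + b_2 W + b_3 Z + b_4 XW + b_5 XZ + b_6 WZ + b_7 XWZ + \varepsilon \\
\frac{\partial Y}{\partial X} &= b_1 + b_4 W + b_5 Z + b_7 WZ \\
\frac{\partial^2 Y}{\partial X \, \partial Z} &= b_5 + b_7 W
\end{aligned}
```

The second line says the slope of X on Y depends on both W and Z. The third says the X*Z interaction itself changes with W, and b_7 is the rate of that change. By symmetry, the same holds for X*W across Z and for Z*W across X.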

  23. Three-Way Interactions Categorical X Continuous X Continuous Interaction: Sleep Medication (Y/N) * BMI * Pulse: Stratify on the categorical variable (Sleep Meds). The interaction of BMI and Pulse exists only for those on Sleep Medications.

  24. Three-Way Interactions Another way to look at this is how the difference in Apnea between those on Sleep Medications and those not on them changes depending upon the relationship of pulse and BMI.

  25. Three-Way Interactions Continuous X Continuous X Continuous Interaction: Glucose Level * BMI * Pulse: Stratify on Glucose. Asks the question: How does the interaction of Pulse and BMI change across levels of glucose?

  26. Three-Way Interactions Continuous X Continuous X Continuous Interaction: Glucose Level * BMI * Pulse: Look at how the slopes of Glucose on Apnea change. Asks the question: How does the relationship of Glucose to Apnea change across levels of BMI and pulse?

  27. Three-Way Interactions • What if we have time as our third variable? • Same techniques, but perhaps in the future we won’t be limited to just static graphs. [Figure: Interaction of BMI and Pulse on Apnea Score across Time.]

  28. Presenting Data in Motion • Even better, let’s do some of this: • http://www.ted.com/talks/hans_rosling_reveals_new_insights_on_poverty.html

  29. Four-Way Interactions and Beyond • Understanding anything much more complex than a 3-way interaction is difficult without a good way to break down variables into categories • Classification Techniques/Machine Learning/Exploratory Data Mining • Can take high-dimensional data and find homogeneous groups based upon relationships of continuous/categorical variables.
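To give a feel for how a tree-based method finds homogeneous groups, here is a minimal sketch of the core step of a CART-style regression tree: searching one continuous predictor for the cut point that makes the outcome most homogeneous on each side. It is a simplified illustration with hypothetical function names, not the model used in the presentation. A full tree repeats this search over every predictor and within every resulting subgroup.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on x that minimizes within-group SSE of y.

    Returns (threshold, total_sse), or None if x has no two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [outcome for _, outcome in pairs[:i]]
        right = [outcome for _, outcome in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

Because each later split is searched only within the subgroup created by earlier splits, the effect of one variable is allowed to differ by the levels of the others, which is how such trees pick up high-order interactions without anyone specifying product terms.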

  30. Four-Way Interactions and Beyond CART Model: 4-Way Interaction of continuous variables on Apnea Severity. [Figure: regression tree with branches labeled Larger Structure and Smaller Structure. Split values: Lateral Walls 0.644, Soft Palate -1.845, Genioglossus -1.123, Mandibular Width -0.250. Node values: 50.9 ± 21.4, 19.0 ± 12.3, 42.2 ± 17.9, 41.2 ± 19.1, 27.8 ± 13.8.]

  31. Take Home Points • Test for interactions at the beginning of model building • Because they are interesting • Because they obscure your main effects • Interactions give us clues about underlying etiology (David Schwartz). It is not enough to detect them; we have to understand why the interaction exists. • We must search for the variable(s) that make interactions go away (mediated moderation) • Modern classification/Data Mining methods are great at detecting high-dimensional (numerous variables), non-linear interactions • Stata Versions 12 and 13 are amazing at doing these types of plots (margins plots). Also, check out “Interpreting and Visualizing Regression Models Using Stata” by Michael Mitchell
