
Multiple Regression


Presentation Transcript


  1. Multiple Regression: Last Part – Model Reduction

  2. Model Reduction • More of an art form than a science • As stated earlier, we’re trying to create a model predicting a DV that explains as much of the variance in that DV as possible, while at the same time the model: • Meets the assumptions of MLR • Best manages the other aforementioned issues – sample size, outliers, multicollinearity • Is parsimonious

  3. Model Reduction • The more variables, the higher the R2; conversely, our R2 will decrease every time we remove a variable from the model • So, if we’re reducing our R2, we want to make sure that we’re making progress relative to the assumptions, sample size, multicollinearity, & parsimony

  4. MLR Model Reduction Example • Let’s use the March Madness homework data • RQ: How effectively can we create a model to predict teams’ success in the NCAA Division I Men’s Basketball Tournament? • Alpha = .05 a priori for all tests

  5. MLR Model Reduction Example • Use SPSS to find Cook’s distance for the data

  6. MLR Model Reduction Example • Output from the Cook’s distance request: the largest Cook’s distance is smaller than 1, so there is no problem (a Cook’s distance > 1 signals an influential data point, which you should consider eliminating)

  7. MLR Model Reduction Example • Output from the Cook’s distance request (data file)

  8. MLR Model Reduction Example • Examine the correlation matrix to see which variables correlate with the DV and to check for multicollinearity among IV’s • Matrix on next slide • Correlations above .5 are somewhat concerning; those above .7, and particularly .8, are larger concerns • I count eight pairwise correlations (not involving the DV) that are .7+
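Outside SPSS, the same correlation-matrix screen can be sketched as follows. The variable names and values are hypothetical stand-ins, with wins, losses, and winning % deliberately constructed to be interrelated so the .7 threshold from the slide gets triggered:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical stand-ins for tournament predictors; 'losses' and 'win_pct'
# are built directly from 'wins', so the three are strongly interrelated.
n = 100
wins = rng.normal(20, 5, n)
losses = 31 - wins + rng.normal(0, 1, n)
win_pct = wins / (wins + losses)
rpi = rng.normal(size=n)                      # unrelated predictor
df = pd.DataFrame({"wins": wins, "losses": losses,
                   "win_pct": win_pct, "rpi": rpi})

corr = df.corr()

# List IV pairs whose |r| meets the .7 threshold flagged on the slide.
pairs = [(a, b, corr.loc[a, b])
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if abs(corr.loc[a, b]) >= 0.7]
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.2f}")
```

Scanning the matrix programmatically like this is just a convenience; the judgment calls about which of a correlated pair to keep remain the analyst’s.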

  9. (Correlation matrix image)

  10. MLR Model Reduction Example • What does this tell us?

  11. MLR Model Reduction Example • What does this tell us?

  12. (Image slide)

  13. Sample size concerns • Recall: • Tabachnick & Fidell (1996): n > 50 + 8k • Here k (# predictors) = 13 • n = 192 • 50 + (8 * 13) = 50 + 104 = 154 • So the inequality is satisfied here • Could still be improved by losing some predictors
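The rule of thumb above is simple enough to check in code; this sketch just reproduces the slide’s arithmetic:

```python
def meets_tf_rule(n, k):
    """Tabachnick & Fidell (1996) rule of thumb: need n > 50 + 8k cases."""
    return n > 50 + 8 * k

# The slide's numbers: k = 13 predictors, n = 192 cases.
threshold = 50 + 8 * 13
print(threshold)               # 154
print(meets_tf_rule(192, 13))  # True: 192 > 154
```

Dropping predictors lowers the threshold by 8 per variable removed, which is one more (minor) argument for reduction.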

  14. MLR Model Reduction Example • Am I satisfied with this model, or should I examine another model by reducing via IV elimination? • Because of some serious multicollinearity problems, it seems we can create a “better” model via reduction

  15. MLR Model Reduction Example • So, what variables do we drop? • In examining variables to drop, look at: • Pairwise correlation with the DV (higher is good) • Multicollinearity with other IV’s (lower is good) • Prediction strength in model (ideal to have no non-significant IV’s in the model) • Common sense – make your decisions based on BOTH statistical and practical grounds • This is an important slide, folks!

  16. MLR Model Reduction Example • Wins, losses, and winning % are all obviously highly correlated with one another • Of the three, wins has the highest pairwise correlation with the DV and the highest t-score in the model, so let’s keep it and drop the other two

  17. Example – Model #2 • So, let’s re-run the analysis without those two variables & see what we get…

  18. (Model #2 SPSS output)

  19. Example – Model #2 • Compare from one model to the next: • R2 • F-statistic • IV’s in model • So, how did we do? • Happy with this model?

  20. Example – Model #3 • Let’s try to clear up a couple of other multicollinearity problems: • Top 50 wins vs. Top 50 win % • Strength of schedule vs. RPI vs. conference membership • Let’s drop Top 50 win % & SOS • Also, let’s get rid of # of wins in last ten games and Top 50 losses, as they haven’t been significant anywhere

  21. Example – Model #3 • Model #3…

  22. (Model #3 SPSS output)

  23. Example – Model #3/4 • How did we do this time? • A fourth model should perhaps get rid of automatic bid & conference affiliation

  24. MLR Model Reduction • As you can see, this trial-and-error process can continue at some length • The goal is to create a highly predictive, parsimonious model with as few problems with assumptions & multicollinearity as possible • Finis…

  25. Some more notes on multicollinearity (MC): VIF & tolerance

  26. Some more notes on multicollinearity (MC): VIF & tolerance
