Detecting and Managing Influential Observations in Linear Regression Analysis

Learn the importance of detecting influential observations and leverage points in linear regression analysis. Discover measures of influence and how to handle influential observations.

falbo

Presentation Transcript


  1. Chapter 6: Diagnostics for Leverage and Influence. Linear Regression Analysis 5E, Montgomery, Peck and Vining

  2. 6.1 Importance of Detecting Influential Observations • Leverage point: an observation with an unusual x-value; it has very little effect on the regression coefficients.

  3. 6.1 Importance of Detecting Influential Observations • Influence point: an observation that is unusual in both x and y.

  4. 6.2 Leverage • The hat matrix is $H = X(X'X)^{-1}X'$. • The diagonal elements of the hat matrix are $h_{ii} = x_i'(X'X)^{-1}x_i$. • $h_{ii}$ is a standardized measure of the distance of the ith observation from the center of the x-space.

  5. 6.2 Leverage • The average size of the hat diagonal is p/n, where p is the number of model parameters and n is the number of observations. • Traditionally, any $h_{ii} > 2p/n$ indicates a leverage point. • An observation with a large $h_{ii}$ and a large residual is likely to be influential.
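
The leverage rule on this slide can be sketched in a few lines of NumPy. The data below are hypothetical (an intercept plus one regressor whose last value is remote), and `hat_diagonal` is an illustrative helper name, not something from the text.

```python
import numpy as np

def hat_diagonal(X):
    """Return the diagonal of H = X (X'X)^{-1} X' without forming H."""
    XtX_inv = np.linalg.inv(X.T @ X)
    # h_ii = x_i' (X'X)^{-1} x_i for every row x_i of X
    return np.einsum("ij,jk,ik->i", X, XtX_inv, X)

# Hypothetical data: the last x-value sits far from the others.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
X = np.column_stack([np.ones_like(x), x])   # model matrix with intercept
n, p = X.shape

h = hat_diagonal(X)
cutoff = 2 * p / n                          # traditional 2p/n rule
flagged = np.where(h > cutoff)[0]           # indices of leverage points

print("hat diagonals:", np.round(h, 3))
print("cutoff 2p/n:", cutoff, "flagged:", flagged)
```

The hat diagonals always sum to p, which is why their average is p/n.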

  7. Example 6.1 The Delivery Time Data • Examine Table 6.1. If some possibly influential points are removed, here is what happens to the coefficient estimates and model statistics:

  8. 6.3 Measures of Influence • The influence measures discussed here measure the effect of deleting the ith observation: • Cook's $D_i$, which measures the effect on the vector of coefficient estimates $\hat{\beta}$; • $\mathrm{DFBETAS}_{j,i}$, which measures the effect on the jth coefficient estimate $\hat{\beta}_j$; • $\mathrm{DFFITS}_i$, which measures the effect on the fitted value $\hat{y}_i$; • $\mathrm{COVRATIO}_i$, which measures the effect on the variance-covariance matrix of the parameter estimates.

  9. 6.3 Measures of Influence: Cook's D • What contributes to $D_i$: how well the model fits the ith observation, $y_i$, and how far that point is from the rest of the data. • Large values of $D_i$ indicate an influential point; the usual cutoff is $D_i > 1$.
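
The two contributions to Cook's D can be made concrete with the standard closed form $D_i = (r_i^2/p)\,h_{ii}/(1-h_{ii})$, where $r_i$ is the studentized residual (fit to $y_i$) and $h_{ii}/(1-h_{ii})$ reflects distance in x-space. The sketch below uses hypothetical data and illustrative function names, and cross-checks the closed form against explicit deletion of each observation.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's D_i from the closed form (r_i^2 / p) * h_ii / (1 - h_ii)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                               # ordinary residuals
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)    # hat diagonals
    ms_res = e @ e / (n - p)                       # residual mean square
    r = e / np.sqrt(ms_res * (1.0 - h))            # studentized residuals
    return (r**2 / p) * h / (1.0 - h)

def cooks_distance_by_deletion(X, y):
    """Same quantity, computed by refitting without each observation."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    ms_res = e @ e / (n - p)
    M = X.T @ X
    D = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        d = beta_i - beta
        D[i] = d @ M @ d / (p * ms_res)
    return D

# Hypothetical data: the last point is remote in x and off the trend in y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 5.6, 4.0])
X = np.column_stack([np.ones_like(x), x])

D = cooks_distance(X, y)
D_check = cooks_distance_by_deletion(X, y)
print("Cook's D:", np.round(D, 3))
print("points with D > 1:", np.where(D > 1)[0])
```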

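  11. 6.4 Measures of Influence: DFFITS and DFBETAS • DFBETAS measures how much the jth regression coefficient changes, in standard deviation units, if the ith observation is removed: $\mathrm{DFBETAS}_{j,i} = (\hat{\beta}_j - \hat{\beta}_{j(i)}) / \sqrt{S_{(i)}^2 C_{jj}}$, where $\hat{\beta}_{j(i)}$ is the estimate of the jth coefficient when the ith observation is removed, $S_{(i)}^2$ is the residual mean square computed without the ith observation, and $C_{jj}$ is the jth diagonal element of $(X'X)^{-1}$. • A large $|\mathrm{DFBETAS}_{j,i}|$ indicates that the ith observation has considerable influence on the jth coefficient. In general, $|\mathrm{DFBETAS}_{j,i}| > 2/\sqrt{n}$ warrants attention.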
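
A minimal sketch of DFBETAS computed directly from its deletion definition, assuming the usual scaling by $\sqrt{S_{(i)}^2 C_{jj}}$ with $C = (X'X)^{-1}$. The data are hypothetical and `dfbetas` is an illustrative name; the loop refits the model n times, which is fine for small examples but not how production software computes it.

```python
import numpy as np

def dfbetas(X, y):
    """Return an (n, p) array whose [i, j] entry is DFBETAS_{j,i}."""
    n, p = X.shape
    C = np.linalg.inv(X.T @ X)
    beta = C @ X.T @ y
    out = np.empty((n, p))
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta_i = np.linalg.lstsq(Xi, yi, rcond=None)[0]   # fit without obs i
        resid_i = yi - Xi @ beta_i
        s2_i = resid_i @ resid_i / (n - 1 - p)            # S^2_(i)
        out[i] = (beta - beta_i) / np.sqrt(s2_i * np.diag(C))
    return out

# Hypothetical data: the last point is remote in x and off the trend in y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 5.6, 4.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

B = dfbetas(X, y)
cutoff = 2 / np.sqrt(n)
rows, cols = np.where(np.abs(B) > cutoff)
print("cutoff 2/sqrt(n):", round(cutoff, 3))
print("(observation, coefficient) pairs over the cutoff:", list(zip(rows, cols)))
```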

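  12. 6.4 Measures of Influence: DFFITS and DFBETAS • DFFITS measures the influence of the ith observation on the fitted value, again in standard deviation units: $\mathrm{DFFITS}_i = (\hat{y}_i - \hat{y}_{(i)}) / \sqrt{S_{(i)}^2 h_{ii}}$, where $\hat{y}_{(i)}$ is the fitted value for the ith observation obtained without using that observation. • Cutoff: if $|\mathrm{DFFITS}_i| > 2\sqrt{p/n}$, the point is most likely influential.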
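
DFFITS can likewise be computed from its deletion definition. The sketch below uses hypothetical data and an illustrative function name; it refits the model once per observation and scales the change in fitted value by $\sqrt{S_{(i)}^2 h_{ii}}$.

```python
import numpy as np

def dffits(X, y):
    """Return DFFITS_i for every observation, by explicit deletion."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)           # hat diagonals
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta_i = np.linalg.lstsq(Xi, yi, rcond=None)[0]   # fit without obs i
        resid_i = yi - Xi @ beta_i
        s2_i = resid_i @ resid_i / (n - 1 - p)            # S^2_(i)
        # change in the ith fitted value, in standard deviation units
        out[i] = (X[i] @ beta - X[i] @ beta_i) / np.sqrt(s2_i * h[i])
    return out

# Hypothetical data: the last point is remote in x and off the trend in y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 5.6, 4.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

F = dffits(X, y)
cutoff = 2 * np.sqrt(p / n)
flagged = np.where(np.abs(F) > cutoff)[0]
print("DFFITS:", np.round(F, 3))
print("cutoff 2*sqrt(p/n):", cutoff, "flagged:", flagged)
```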

  13. 6.4 Measures of Influence: DFFITS and DFBETAS Equivalencies • See the computational equivalents of both DFBETAS and DFFITS (page 217); both are functions of R-student and $h_{ii}$.
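
For reference, the equivalences mentioned on this slide can be written out in standard notation. The symbols are supplied here rather than taken from the slide: $t_i$ denotes R-student, $e_i$ the ordinary residual, $S_{(i)}^2$ the residual mean square without the ith observation, and $r_{j,i}$ the (j, i) element of $R = (X'X)^{-1}X'$, with $r_j'$ its jth row.

```latex
\begin{align*}
t_i &= \frac{e_i}{\sqrt{S_{(i)}^2\,(1 - h_{ii})}}, \\
\mathrm{DFFITS}_i &= \left(\frac{h_{ii}}{1 - h_{ii}}\right)^{1/2} t_i, \\
\mathrm{DFBETAS}_{j,i} &= \frac{r_{j,i}}{\sqrt{r_j' r_j}} \cdot \frac{t_i}{\sqrt{1 - h_{ii}}}.
\end{align*}
```

Both follow from the deletion identity $\hat{\beta} - \hat{\beta}_{(i)} = (X'X)^{-1} x_i\, e_i / (1 - h_{ii})$, so neither diagnostic requires refitting the model n times.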

  15. 6.5 A Measure of Model Performance • Information about the overall precision of estimation can be obtained through another statistic, $\mathrm{COVRATIO}_i$.

  16. 6.5 A Measure of Model Performance: Cutoffs and Interpretation • If $\mathrm{COVRATIO}_i > 1$, the ith observation improves the precision of estimation. • If $\mathrm{COVRATIO}_i < 1$, the ith observation degrades the precision. • Cutoffs: consider the ith observation influential if $\mathrm{COVRATIO}_i > 1 + 3p/n$ or $\mathrm{COVRATIO}_i < 1 - 3p/n$ (the lower limit is only appropriate when n > 3p).
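
COVRATIO is conventionally defined as the ratio of generalized variances with and without the ith observation, $\mathrm{COVRATIO}_i = |(X_{(i)}'X_{(i)})^{-1} S_{(i)}^2| \,/\, |(X'X)^{-1} MS_{Res}|$. The sketch below evaluates that determinant ratio directly on hypothetical data; the function name is illustrative and the definition is stated here as an assumption, since the slide does not reproduce it.

```python
import numpy as np

def covratio(X, y):
    """COVRATIO_i as a ratio of determinants, by explicit deletion."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    ms_res = e @ e / (n - p)
    denom = np.linalg.det(XtX_inv * ms_res)       # full-data generalized variance
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        XtX_inv_i = np.linalg.inv(Xi.T @ Xi)
        beta_i = XtX_inv_i @ Xi.T @ yi
        resid_i = yi - Xi @ beta_i
        s2_i = resid_i @ resid_i / (n - 1 - p)    # S^2_(i)
        out[i] = np.linalg.det(XtX_inv_i * s2_i) / denom
    return out

# Hypothetical data: the last point is remote in x and off the trend in y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 5.6, 4.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

cr = covratio(X, y)
upper, lower = 1 + 3 * p / n, 1 - 3 * p / n       # lower bound needs n > 3p
flagged = np.where((cr > upper) | (cr < lower))[0]
print("COVRATIO:", np.round(cr, 4))
print("flagged:", flagged)
```

A value far below 1, as for a badly fitting remote point, says that including the observation inflates the generalized variance of the coefficient estimates.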

  18. 6.6 Detecting Groups of Influential Observations • The previous diagnostics were "single-observation" diagnostics. • It is possible that a group of points has high leverage or exerts undue influence on the regression model. • Multiple-observation deletion diagnostics can be implemented.

  19. 6.6 Detecting Groups of Influential Observations • Cook's D can be extended to incorporate multiple observations: $D_{\mathbf{i}} = (\hat{\beta}_{(\mathbf{i})} - \hat{\beta})' X'X (\hat{\beta}_{(\mathbf{i})} - \hat{\beta}) / (p\,MS_{Res})$, where $\mathbf{i}$ denotes the m × 1 vector of indices specifying the points to be deleted. • Large values of $D_{\mathbf{i}}$ indicate that the set of m points is influential.
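
A minimal sketch of the multiple-observation version of Cook's D, assuming the same quadratic form as the single-observation statistic with the deleted set given as a list of row indices. The data (two remote points that share the same departure from the trend) and the name `cooks_d_group` are hypothetical.

```python
import numpy as np

def cooks_d_group(X, y, idx):
    """Cook's D for deleting the set of rows in idx (the m indices)."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    ms_res = e @ e / (n - p)
    keep = np.ones(n, dtype=bool)
    keep[list(idx)] = False                       # drop the m chosen points
    beta_del = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    d = beta_del - beta
    return d @ (X.T @ X) @ d / (p * ms_res)

# Hypothetical data: two remote points that depart from the trend together.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 19.0, 20.0])
y = np.array([2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 4.0, 4.2])
X = np.column_stack([np.ones_like(x), x])

D_single = [cooks_d_group(X, y, [i]) for i in (6, 7)]
D_pair = cooks_d_group(X, y, [6, 7])
print("single deletions:", np.round(D_single, 3))
print("joint deletion of {6, 7}:", round(D_pair, 3))
```

Comparing the joint value with the single-deletion values shows why group diagnostics matter: neighbouring remote points can partly hide each other when deleted one at a time.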

  20. 6.7 Treatment of Influential Observations • Should an influential point be discarded? • Yes, if: there is an error in recording a measured value; the sample point is invalid; or the observation is not part of the population that was intended to be sampled. • No, if the influential point is a valid observation.

  21. 6.7 Treatment of Influential Observations • Robust estimation techniques • These techniques offer an alternative to deleting an influential observation. • Observations are retained but downweighted in proportion to residual magnitude or influence.
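
One way to illustrate downweighting by residual magnitude is a Huber M-estimator fitted by iteratively reweighted least squares. This is a sketch of one common robust technique, not a method named on the slide: the tuning constant 1.345, the MAD-based scale with the 0.6745 factor, the function name, and the data are all assumptions made for illustration.

```python
import numpy as np

def huber_irls(X, y, c=1.345, max_iter=100, tol=1e-10):
    """Huber M-estimate via IRLS; returns (coefficients, final weights)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # start from OLS
    w = np.ones(len(y))
    for _ in range(max_iter):
        e = y - X @ beta
        # robust scale: median absolute deviation, rescaled for normal errors
        scale = max(np.median(np.abs(e - np.median(e))) / 0.6745, 1e-12)
        u = np.abs(e) / scale
        w = c / np.maximum(u, c)                       # 1 if |u| <= c, else c/|u|
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, w

# Hypothetical data: a clean linear trend with one gross error in y.
x = np.arange(1.0, 11.0)
noise = 0.1 * (-1.0) ** np.arange(10)
y = 2.0 + 0.5 * x + noise
y[4] += 10.0                                           # contaminated observation
X = np.column_stack([np.ones_like(x), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_huber, w = huber_irls(X, y)
print("OLS coefficients:  ", np.round(beta_ols, 3))
print("Huber coefficients:", np.round(beta_huber, 3))
print("weights:", np.round(w, 3))
```

The contaminated observation stays in the fit but receives a small weight. Note that Huber weights respond to residual size only; a point with extreme leverage can still dominate, which is why some robust methods also downweight by influence.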
