
# LECTURE 14 OUTLIERS AND MULTICOLLINEARITY



## PowerPoint Slideshow about ' LECTURE 14 OUTLIERS AND MULTICOLLINEARITY' - leonard-rowe


Presentation Transcript
LECTURE 14 OUTLIERS AND MULTICOLLINEARITY

• OUTLIER ANALYSIS

• 1. VISUAL DISPLAY

• 2. INTERACTIVE INSPECTION:

http://www.stat.uiuc.edu/~stat100/java/guess/PPApplet.html

• LEVERAGE

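• $h_{ii} = \dfrac{1}{n} + \dfrac{(X_i - M_X)^2}{\sum x^2}$ (single predictor), where $X_i$ is the score of case i, $M_X$ is the predictor mean, and $\sum x^2 = \sum_j (X_j - M_X)^2$

Should be close to 1/n, its minimum; values well above 1/n mark cases far from the predictor mean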

• Centered: $h^*_{ii} = h_{ii} - 1/n$
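The single-predictor leverage and its centered version can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a regression with an intercept; the function name and the scores are hypothetical, and the hat-matrix diagonal is computed only as a cross-check.

```python
import numpy as np

def leverage_single_predictor(x):
    """Leverage h_ii for a one-predictor regression with an intercept."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()                      # deviation scores x = X - M_X
    return 1.0 / n + dev**2 / np.sum(dev**2)

# Hypothetical scores; the last case lies far from the mean of X.
scores = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
h = leverage_single_predictor(scores)
h_centered = h - 1.0 / scores.size          # centered leverage h*_ii

# Cross-check against the diagonal of the hat matrix H = X (X'X)^-1 X'.
X = np.column_stack([np.ones_like(scores), scores])
H = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.round(h, 2), np.allclose(h, np.diag(H)))
```

With these scores the first four cases have leverage between 0.20 and 0.38, while the extreme case reaches 0.92, far above 1/n = 0.20.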

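• Test: $t_i$ (case i deleted) $= \dfrac{e_i/(1-h_{ii})}{\sqrt{MS_{res(i)}/(1-h_{ii})}}$

• Where $e_i = Y_i - \hat{Y}_i$ is the residual of case i, $e_i/(1-h_{ii})$ is its residual from the regression with case i removed, and $MS_{res(i)}$ is the residual mean square with case i removed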

• SPSS: take case i out, run the analysis with SAVE
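The deleted-case t statistic can be obtained two ways: from the leverage-based shortcut, or by literally removing case i and rerunning the regression. The sketch below shows that the two routes agree. It assumes ordinary least squares with an intercept; the function names and the simulated data are hypothetical and not part of the lecture.

```python
import numpy as np

def externally_studentized(X, y):
    """t for each case with that case deleted; X must include the intercept column."""
    n, p = X.shape                                   # p = k + 1 coefficients
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)    # leverage h_ii
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b                                    # ordinary residuals
    # Residual mean square with case i deleted, obtained without refitting
    ms_deleted = (np.sum(e**2) - e**2 / (1 - h)) / (n - p - 1)
    return (e / (1 - h)) / np.sqrt(ms_deleted / (1 - h))

def refit_without_case(X, y, i):
    """The slow route: take case i out, rerun the analysis, then score case i."""
    keep = np.arange(len(y)) != i
    b = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    ms_deleted = np.sum((y[keep] - X[keep] @ b) ** 2) / (keep.sum() - X.shape[1])
    h_i = X[i] @ np.linalg.inv(X.T @ X) @ X[i]
    return (y[i] - X[i] @ b) / np.sqrt(ms_deleted / (1 - h_i))

# Hypothetical data: 20 cases, one predictor, the last case shifted upward.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=20)
y[-1] += 5.0
X = np.column_stack([np.ones_like(x), x])

t = externally_studentized(X, y)
print(np.isclose(t[-1], refit_without_case(X, y, 19)))
```

The shortcut avoids n separate refits, which is why software can save these values for every case in one pass.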

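• MAHALANOBIS distance: distance of a case's IV scores from the centroid of the IVs, scaled by the covariance matrix of the IVs (it equals Euclidean distance only when the IVs are uncorrelated with unit variance)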
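A minimal sketch of the squared Mahalanobis distance, assuming the IVs are stored as an array with one row per case and one column per IV. The function name and the simulated data are hypothetical. The final lines check the known link to leverage: the squared distance equals (n - 1) times the centered leverage.

```python
import numpy as np

def mahalanobis_squared(ivs):
    """Squared Mahalanobis distance of each case's IV scores from the IV centroid."""
    ivs = np.asarray(ivs, dtype=float)               # shape (n cases, k IVs)
    centered = ivs - ivs.mean(axis=0)
    cov_inv = np.linalg.inv(np.atleast_2d(np.cov(ivs, rowvar=False)))
    # Row-wise quadratic form: centered_i' * cov_inv * centered_i
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

# Hypothetical data: 30 cases on two correlated IVs.
rng = np.random.default_rng(1)
ivs = rng.normal(size=(30, 2))
ivs[:, 1] += 0.8 * ivs[:, 0]
d2 = mahalanobis_squared(ivs)

# Centered leverage from the hat matrix of the same IVs plus an intercept.
X = np.column_stack([np.ones(len(ivs)), ivs])
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
centered_leverage = h - 1.0 / len(ivs)
print(np.allclose(d2, (len(ivs) - 1) * centered_leverage))
```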

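• Cook’s D: $C_i = \sum_j (\hat{Y}_j - \hat{Y}_{j(i)})^2 / [(k+1) \cdot MS_{res}]$, where $\hat{Y}_{j(i)}$ is the predicted score for case j with case i removed and k is the number of IVs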

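• $DFFITS_i = (\hat{Y}_i - \hat{Y}_{i(i)}) / \sqrt{MS_{res(i)} \, h_{ii}}$, where $\hat{Y}_{i(i)}$ is the predicted score for case i with case i removed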

• SPSS: GENERAL LINEAR MODEL OPTIONS: ‘SAVE’

(check ‘Leverage Values’ and ‘Cooks’ to get $h_{ii}$ and C)

Plot C and h against the cases
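Outside SPSS, the same saved quantities can be computed directly. The sketch below is one way to get Cook's D, DFFITS, and leverage from a single fit, using closed-form identities instead of refitting for each case. It assumes least squares with an intercept and scales Cook's D by the number of coefficients (k + 1); the function name and the simulated data are hypothetical.

```python
import numpy as np

def influence_measures(X, y):
    """Cook's D, DFFITS and leverage for each case; X includes the intercept column."""
    n, p = X.shape                                   # p = k + 1 coefficients
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    ms_res = np.sum(e**2) / (n - p)
    ms_deleted = (np.sum(e**2) - e**2 / (1 - h)) / (n - p - 1)
    # Sum over j of (Yhat_j - Yhat_j(i))^2 equals e_i^2 h_ii / (1 - h_ii)^2
    cooks = e**2 * h / (p * ms_res * (1 - h) ** 2)
    # Yhat_i - Yhat_i(i) equals h_ii e_i / (1 - h_ii)
    dffits = e * np.sqrt(h) / (np.sqrt(ms_deleted) * (1 - h))
    return cooks, dffits, h

# Hypothetical data: 25 cases, two IVs; the last case is extreme on X and Y.
rng = np.random.default_rng(0)
ivs = rng.normal(size=(25, 2))
y = 1.0 + ivs[:, 0] - ivs[:, 1] + rng.normal(scale=0.5, size=25)
ivs[-1] = [3.0, 3.0]
y[-1] = 9.0
X = np.column_stack([np.ones(25), ivs])

cooks, dffits, h = influence_measures(X, y)
for case in np.argsort(cooks)[::-1][:3]:             # three most influential cases
    print(case, round(cooks[case], 3), round(h[case], 3))
```

Plotting `cooks` and `h` against the case number, as the slide recommends, makes isolated spikes easy to spot.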

• DELETE

• REVISE MODEL

• TRANSFORM VARIABLES (LOG, SQRT, LOGIT, ARCSIN, ETC.)

• ROBUST METHODS:

• LTS (LEAST TRIMMED SQUARES)

• VARIANT: WINSORIZE (REPLACE THE TOP 5% AND BOTTOM 5% OF SCORES WITH THE 95TH AND 5TH PERCENTILE VALUES; REMOVING THEM OUTRIGHT IS TRIMMING)

• M-estimation: weighted least squares in which each case is weighted according to its deviation from the regression line, so that cases with large residuals count less
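Two of the robust options can be illustrated briefly. The sketch below Winsorizes a variable at the 5th and 95th percentiles and fits an M-estimator by iteratively reweighted least squares. The Huber weight function, the tuning constant 1.345, the MAD scale estimate, and the fixed iteration count are all assumptions chosen for illustration, since the lecture does not specify them; the function names and data are hypothetical.

```python
import numpy as np

def winsorize(values, lower_pct=5, upper_pct=95):
    """Pull scores beyond the given percentiles in to those percentile values."""
    values = np.asarray(values, dtype=float)
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)

def huber_m_estimate(X, y, c=1.345, n_iter=50):
    """M-estimation by iteratively reweighted least squares with Huber weights."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]          # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ b                                 # deviations from the current line
        scale = np.median(np.abs(r - np.median(r))) / 0.6745   # robust scale (MAD)
        u = np.abs(r) / max(scale, 1e-12)
        w = np.minimum(1.0, c / np.maximum(u, 1e-12)) # weight 1 for small residuals
        sw = np.sqrt(w)
        b = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return b

# Hypothetical data: a line with slope 3 and one wild Y score at the end.
rng = np.random.default_rng(2)
x = np.arange(10.0)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=10)
y[-1] = 100.0
X = np.column_stack([np.ones_like(x), x])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_huber = huber_m_estimate(X, y)
print("OLS slope:", round(b_ols[1], 2), "Huber slope:", round(b_huber[1], 2))
```

Winsorizing keeps every case in the data set, while the M-estimator keeps the raw scores and reduces the influence of poorly fitting cases through the weights.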

• EXACT COLLINEARITY: One IV is predicted perfectly from one or more of the other IVs

• MULTICOLLINEARITY: high correlation between one IV and another IV or a set of other IVs

• VIF- Variance Inflation Factor

$VIF_i = 1 / [1 - R^2_{i.12 \ldots k}]$

where $R^2_{i.12 \ldots k}$ is the R-square from predicting IV i from all the rest of the predictors; it is calculated once for each predictor

• TOLERANCE

= 1 / VIF

• CONDITION INDEX

= $\sqrt{\lambda_{max} / \lambda_{min}}$

= square root of the largest eigenvalue over the smallest

• VIF- Variance Inflation Factor > 10

• TOLERANCE

= 1 / VIF < .10

• CONDITION INDEX > 30
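The three diagnostics and their rule-of-thumb cutoffs can be sketched as follows. VIF is computed exactly as defined, by regressing each IV on the others. For the condition index the sketch assumes the square root of the eigenvalue ratio taken from the correlation matrix of the IVs; programs differ in which matrix they decompose, so values may not match a given package. Function names and data are hypothetical.

```python
import numpy as np

def vif(ivs):
    """Variance inflation factor for each column of an (n cases, k IVs) array."""
    ivs = np.asarray(ivs, dtype=float)
    n, k = ivs.shape
    out = np.empty(k)
    for i in range(k):
        target = ivs[:, i]
        others = np.column_stack([np.ones(n), np.delete(ivs, i, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)
    return out

def condition_index(ivs):
    """sqrt(largest / smallest eigenvalue) of the IV correlation matrix (an assumption)."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(ivs, rowvar=False))
    return np.sqrt(eigvals.max() / eigvals.min())

# Hypothetical data: the second IV is nearly a copy of the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
ivs = np.column_stack([x1, x2, x3])

vifs = vif(ivs)
tolerance = 1.0 / vifs
print("VIF > 10:", vifs > 10)
print("Tolerance < .10:", tolerance < 0.10)
print("Condition index:", round(condition_index(ivs), 1))
```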

• REVISE MODEL

• NEW DATA

• RIDGE REGRESSION: SPSS Macro

• PRINCIPAL COMPONENTS REGRESSION

• STANDARDIZE PREDICTORS

• GET PRINCIPAL COMPONENT WEIGHTS

• CREATE NEW PRIN.COMP. SCORES, USE AS PREDICTORS
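The three steps of principal components regression listed above can be sketched as follows. This is a minimal illustration, assuming the component weights are the eigenvectors of the IV correlation matrix and that the analyst chooses how many components to keep; the function name and data are hypothetical.

```python
import numpy as np

def pcr_fit(ivs, y, n_components):
    """Principal components regression: returns coefficients, weights and scores."""
    ivs = np.asarray(ivs, dtype=float)
    # Step 1: standardize the predictors
    z = (ivs - ivs.mean(axis=0)) / ivs.std(axis=0, ddof=1)
    # Step 2: principal component weights, ordered from largest eigenvalue down
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    weights = eigvecs[:, order[:n_components]]
    # Step 3: component scores replace the original IVs as predictors
    scores = z @ weights
    design = np.column_stack([np.ones(len(y)), scores])
    coefs = np.linalg.lstsq(design, y, rcond=None)[0]
    return coefs, weights, scores

# Hypothetical data: x2 is nearly collinear with x1.
rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + 0.1 * rng.normal(size=100)
x3 = rng.normal(size=100)
ivs = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x3 + rng.normal(scale=0.5, size=100)

coefs, weights, scores = pcr_fit(ivs, y, n_components=2)
print(np.round(np.corrcoef(scores, rowvar=False), 6))   # components are uncorrelated
```

Because the component scores are uncorrelated, the collinearity among the original IVs no longer inflates the coefficient variances; the cost is that coefficients refer to components rather than to the original variables.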