Adjusted R², Residuals, and Review

  • Adjusted R²

  • Residual Analysis

  • Stata Regression Output revisited

    • The Overall Model

    • Analyzing Residuals

  • Review for Exam 2

Exercise Review

  • Use the caschool.dta dataset

  • Run a model in Stata using Average Income (avginc) to predict Average Test Scores (testscr)

  • Examine the univariate distributions of both variables and the residuals

    • Walk through the entire interpretation

    • Build a Stata do-file as you go
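The exercise steps above could be collected in a do-file along the following lines. This is a minimal sketch rather than the instructor's file: it assumes caschool.dta sits in the current working directory, and the names yhat and e for the saved predictions and residuals are borrowed from the listing shown later in the slides.

```stata
* exercise_review.do: a sketch of the exercise workflow
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear

* Univariate distributions of the outcome and the predictor
summarize testscr avginc, detail
histogram testscr, normal name(h_testscr, replace)
histogram avginc, normal name(h_avginc, replace)

* Bivariate model: average income predicting average test scores
regress testscr avginc

* Save fitted values and residuals, then inspect the residual distribution
predict yhat, xb
predict e, residuals
summarize e, detail
histogram e, normal name(h_e, replace)
```

Running the file top to bottom reproduces each step, which is the point of building it as you go.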

Adjusted R²: An Alternative “Goodness of Fit” Measure

  • Recall that R² is calculated as the “explained sum of squares” divided by the “total sum of squares”: R² = ESS / TSS

  • Hypothetically, as K approaches n, R² approaches one (why? consider “degrees of freedom”)

  • Adjusted R² compensates for that tendency

Calculating Adjusted R²

  • The bigger the sample size (n), the smaller the adjustment

  • The more complex the model (the bigger K is), the larger the adjustment

  • The bigger R² is, the smaller the adjustment
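The formula on this slide did not survive extraction. One common textbook form that is consistent with the three bullets above is sketched below. It assumes K counts the predictors and excludes the intercept, which the slides do not state.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - K - 1}
          = R^2 - \underbrace{\frac{K}{n - K - 1}\left(1 - R^2\right)}_{\text{adjustment}}
```

Read this way, the adjustment term shrinks as n grows, grows with K, and shrinks as R² approaches one. As K approaches n − 1 the denominator n − K − 1 approaches zero, which is the degrees-of-freedom problem the adjustment penalizes.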

Residual Analysis: Troubleshooting

  • Conceptual use of residuals

    • e, or what the model can’t explain

  • Visual Diagnostics

    • Ideal: a “Sneeze plot”

    • Diagnostics using Residual Plots:

      • Checking for heteroscedasticity

      • Checking for non-linearity

      • Checking for outliers

  • Saving and Analyzing Residuals in Stata

Review: Assumptions Necessary for Estimating Linear Models

1. Errors have identical distributions: zero mean and the same variance across the range of X

2. Errors are independent of X and of the other errors (e_i)

3. Errors are normally distributed
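In symbols, one common way to formalize these three assumptions is shown below. The notation (σ² for the common error variance, i and j for observation indices) is introduced here for illustration and does not come from the slides.

```latex
\begin{aligned}
&\text{1. Identical distributions:} && E[e_i \mid X] = 0, \quad \operatorname{Var}(e_i \mid X) = \sigma^2 \\
&\text{2. Independence:} && \operatorname{Cov}(e_i, e_j \mid X) = 0 \quad (i \neq j) \\
&\text{3. Normality:} && e_i \mid X \sim N(0, \sigma^2)
\end{aligned}
```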

The Ideal: Sneeze Splatter

[Figure: residual plot; horizontal axis: Predicted Y]

Problems: It is possible to “over-interpret” residual plots; it is also possible to miss patterns when there are large numbers of observations

Checking for Heteroscedasticity

[Figure: residual plot; horizontal axis: Predicted Y]

Problem: Error variance is not constant; the usual standard errors are biased, so hypothesis tests are invalid

Checking for Non-linearity

[Figure: residual plot; horizontal axis: Predicted Y]

Problem: Biased estimated coefficients, inefficient model

Checking for Outliers

[Figure: two residual plots against Predicted Y: “Residuals for model using all data,” with possible outliers marked, and “Residuals for model with outliers deleted”]

Problem: Under-specified model; measurement error

Stata Regression Model:

Regressing “testscr” on “avginc”
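The regression output itself was an image and is not reproduced in this transcript. A sketch of the command that would produce it, together with a hand check of adjusted R², follows. It assumes caschool.dta is in the working directory and uses the common formula 1 − (1 − R²)(n − 1)/(n − K − 1), with K counting predictors only. The scalar name adj_r2 is made up for this example; e(r2), e(r2_a), e(N), and e(df_m) are results Stata stores after regress.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* R-squared and adjusted R-squared as stored by regress
display "R-squared          = " e(r2)
display "Adjusted R-squared = " e(r2_a)

* Hand calculation; e(df_m) is the model degrees of freedom (K, here 1)
scalar adj_r2 = 1 - (1 - e(r2)) * (e(N) - 1) / (e(N) - e(df_m) - 1)
display "By hand            = " adj_r2
```

With a single predictor and a sample of this size, the two R² figures should differ only slightly.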

Regression Plot (again)

Residual Plot
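Both plots were images and are missing from this transcript. One way to draw comparable plots in Stata is sketched here; the graph names regplot and resplot are arbitrary labels chosen for this example, and caschool.dta is assumed to be in the working directory.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* Regression plot: scatter of the data with the fitted line overlaid
twoway (scatter testscr avginc) (lfit testscr avginc), name(regplot, replace)

* Residual plot: residuals against fitted (predicted) values,
* with a reference line at zero
rvfplot, yline(0) name(resplot, replace)
```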

Examination of Residuals

gsort -e (sorts from the largest residual down; use “gsort e” for ascending order)

. list observat testscr avginc yhat e in 1/5

     observat   testscr   avginc       yhat          e
  1.      393     683.4   13.567   650.8699   32.53016
  2.      386     681.6   14.177   652.0157    29.5842
  3.      419     672.2    9.952   644.0789   28.12111
  4.      366     675.7   11.834   647.6143   28.08568
  5.      371    676.95   12.934   649.6807   27.26921

Use the case ID number (observat) to find the relevant observation in the data set
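The listing relies on yhat and e already existing in the data set. A minimal sketch of how they might be created and sorted is given below, assuming caschool.dta is in the working directory and that the variable names match those in the listing.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* Save predicted values and residuals under the names used in the listing
predict yhat, xb
predict e, residuals

* Largest positive residuals first: cases scoring well above prediction
gsort -e
list observat testscr avginc yhat e in 1/5

* Largest negative residuals first: cases scoring well below prediction
gsort e
list observat testscr avginc yhat e in 1/5
```

Each residual is the observed score minus the predicted score, so the first row of the listing can be checked by hand: 683.4 − 650.8699 ≈ 32.53.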


Residuals v. Predicted Values

Using an “ocular test,” non-linearity seems probable, but heteroscedasticity is not obvious here. But should we trust our eyeballs?

Formal Test for Non-linearity: Omitted Variables

Tests whether adding 2nd, 3rd and 4th powers of X will improve the fit of the model:
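The test output was an image and is missing here. The description matches Ramsey's RESET, which Stata provides as estat ovtest after regress; treating that as the test on the slide is an assumption. By default estat ovtest adds powers of the fitted values, and the rhs option uses powers of the explanatory variables instead.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* Ramsey RESET using powers of the fitted values (the default)
estat ovtest

* Same idea using powers of the right-hand-side variable(s)
estat ovtest, rhs
```

The null hypothesis is that the model has no omitted variables, so a small p-value suggests that the added powers improve the fit.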


Formal Tests for Heteroscedasticity

Tests to see whether the squared standardized residuals are linearly related to the predicted value of Y:

std(e²) = b₀ + b₁(Predicted Y)
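This description resembles the Breusch-Pagan / Cook-Weisberg test, available in Stata as estat hettest after regress. That identification is an assumption, since the slide's output did not survive. A minimal sketch, assuming caschool.dta is in the working directory:

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* Breusch-Pagan / Cook-Weisberg test against the fitted values (the default)
estat hettest
```

The null hypothesis is constant error variance, so a small p-value is evidence of heteroscedasticity.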

Case-wise Influence Analysis

The Leverage versus Squared Residual Plot
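The plot itself is missing from this transcript. In Stata a plot of this kind can be drawn with lvr2plot after regress. The sketch below assumes caschool.dta is in the working directory; labeling points with observat and the variable name lev are choices made for this example.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear
regress testscr avginc

* Leverage against normalized squared residuals, points labeled by case ID
lvr2plot, mlabel(observat)

* Save the leverage values to look up individual cases
predict lev, leverage
gsort -lev
list observat testscr avginc lev in 1/5
```

Cases toward the upper right of such a plot combine high leverage with a large residual and deserve the closest look.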

What to Do?

  • Nonlinearity

    • Polynomial regression: try X and X²

    • Variable transformation: logged variables

    • Use non-OLS regression (curve fitting)

  • Heteroscedasticity

    • Re-specify model

      • Omitted variables?

      • Use non-OLS regression (WLS)

      • Use robust standard errors

  • Influential and Deviant Cases

    • Evaluate the cases

    • Run with controls (multivariate model)

    • Omit cases (last option)
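A few of these remedies can be sketched in Stata for the running example. This is an illustration under assumptions, not the course's prescribed specification: caschool.dta is assumed to be in the working directory, and the variable names avginc2 and ln_avginc are made up here. WLS is left out because it needs a weight variable the slides do not specify.

```stata
* Assumes caschool.dta is in the current working directory
use caschool.dta, clear

* Nonlinearity, option 1: polynomial regression with X and X-squared
generate avginc2 = avginc^2
regress testscr avginc avginc2

* Nonlinearity, option 2: log-transform the predictor
generate ln_avginc = ln(avginc)
regress testscr ln_avginc

* Heteroscedasticity: keep the OLS coefficients, use robust standard errors
regress testscr avginc, vce(robust)
```

Robust standard errors leave the coefficient estimates unchanged and alter only the standard errors and the tests built on them.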

Next Week

  • Review regression diagnostics

  • Introduction to Matrix Algebra

  • Review for Exam
