Multiple Regression Analysis

y = b0 + b1x1 + b2x2 + . . . + bkxk + u

4. Further Issues

Redefining Variables

  • Suppose we have a model with a variable like income measured in dollars on the left-hand-side. Now we re-define income to be measured in tens of thousands of dollars. What effect will this have on estimation and inference?

  • It will not affect the R2

  • Will such scaling have any effect on t-stats, F-stats and confidence intervals?

    • No, these will also have the same interpretation

  • Changing the scale of the y variable just leads to a corresponding change in the scale of the coefficients

Redefining Variables (cont)

  • Suppose we originally obtain

  • In this specification, house price is measured in dollars.

  • What happens if we re-estimate this with house price measured in thousands of dollars?

Redefining Variables (cont)

  • If we measure price in thousands of dollars, the new coefficient will be the old coefficient divided by 1000 (same estimated effect!)

  • The standard errors will be 1000 times smaller

  • t-stats etc. will be identical

Redefining Variables (cont)

  • Changing the scale of one of the x variables: What if we redefine square feet as thousands of square feet? Now all the b-hats have the same interpretation as before, with the exception of b1-hat, the coefficient on square feet

  • It will be 1000 times larger

    • Why? Because a 1 unit change in the rescaled variable is now the same as what previously was a 1000 unit change in square feet.

  • The standard error will also be 1000 times larger and t-stats etc. will have the same interpretation
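
The scaling results for both y and x can be checked numerically. The following is a minimal sketch on simulated housing data; the data-generating numbers, the variable names (`sqrft`, `bdrms`, `price`) and the helper `ols` are assumptions made for illustration, not taken from the slides.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and standard errors; X must already contain a constant column."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)              # unbiased error-variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated (hypothetical) data: price in dollars, size in square feet.
rng = np.random.default_rng(0)
n = 200
sqrft = rng.uniform(800, 3500, n)
bdrms = rng.integers(1, 6, n).astype(float)
price = 20000 + 120 * sqrft + 15000 * bdrms + rng.normal(0, 30000, n)

X = np.column_stack([np.ones(n), sqrft, bdrms])
b, se = ols(X, price)

# Rescale y: price measured in thousands of dollars.
b_y, se_y = ols(X, price / 1000)

# Rescale x1: size measured in thousands of square feet.
X_k = np.column_stack([np.ones(n), sqrft / 1000, bdrms])
b_x, se_x = ols(X_k, price)

print("t-stats (original):  ", b / se)
print("t-stats (y rescaled):", b_y / se_y)
print("t-stats (x rescaled):", b_x / se_x)
```

Rescaling y divides every coefficient and standard error by 1000; rescaling square feet multiplies only its own coefficient and standard error by 1000. In both cases the three rows of t-statistics printed above should agree.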

Functional Form

  • OLS can be used for relationships that are not strictly linear in x and y by using nonlinear functions of x and y – will still be linear in the parameters


    log(wage)= 0 +1(educ)+2(exper)+3 (exper)2

    In this particular specification we have an example of a log specification with a quadratic term--both are examples of nonlinearities that can be introduced into the standard linear regression model

Interpretation of Log Models

1. If the model is ln(y) = b0 + b1ln(x) + u, then b1 is an elasticity. e.g. if we obtained an estimate of 1.2, this would suggest that a 1 percent increase in x causes y to increase by 1.2 percent.

2. If the model is ln(y) = b0 + b1x + u, then b1*100 is the percent change in y resulting from a unit change in x. e.g. if we obtained an estimate of 0.05, this would suggest that a 1 unit increase in x causes a 5% increase in y.

3. If the model is y = b0 + b1ln(x) + u, then b1/100 is the unit change in y resulting from a 1 percent change in x. e.g. if we obtained an estimate of 20, this would suggest that a 1 percent increase in x causes a 0.2 unit increase in y.
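
The three interpretations above are approximations that work well for small changes. As a rough sketch, the exact implied changes can be computed directly from each functional form; the function names below are hypothetical, and the inputs are the example estimates quoted in the three cases.

```python
import math

def pct_change_log_log(b1, pct_x):
    """Exact % change in y for ln(y) = b0 + b1*ln(x) when x rises by pct_x percent."""
    return 100 * ((1 + pct_x / 100) ** b1 - 1)

def pct_change_log_level(b1, dx=1.0):
    """Exact % change in y for ln(y) = b0 + b1*x when x rises by dx units."""
    return 100 * (math.exp(b1 * dx) - 1)

def unit_change_level_log(b1, pct_x):
    """Exact unit change in y for y = b0 + b1*ln(x) when x rises by pct_x percent."""
    return b1 * math.log(1 + pct_x / 100)

print(pct_change_log_log(1.2, 1))     # close to the approximate 1.2 percent
print(pct_change_log_level(0.05))     # close to the approximate 5 percent
print(unit_change_level_log(20, 1))   # close to the approximate 0.2 units
```

The gap between the exact and approximate figures grows with the size of the coefficient or of the change in x, so the approximation is best reserved for small changes.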

Why use log models?

  • Log models are invariant to the scale of the variables since we’re measuring percent changes

  • They can give a direct estimate of elasticity

  • For models with y > 0, the conditional distribution is often heteroskedastic or skewed, while ln(y) is much less so

  • The distribution of ln(y) is more narrow, limiting the effect of outliers

Some Rules of Thumb

What types of variables are often used in log form?

*Variables in positive dollar amounts

*Variables measuring numbers of people

-school enrollments, population, # employees

*Variables subject to extreme outliers

What types of variables are often used in level form?

*Anything that takes on a negative or zero value

*Variables measured in years

Quadratic Models

  • Capture increasing or decreasing marginal effects

  • For a model of the form y = b0 + b1x + b2x^2 + u, we can’t interpret b1 alone as measuring the change in y with respect to x.

Now the effect of an extra unit of x on y depends in part on the value of x. Suppose b1 is positive. Then if b2 is positive, an extra unit of x has a larger impact on y when x is big than when x is small. If b2 is negative, an extra unit of x has a smaller impact on y when x is big than when x is small.


More on Quadratic Models

  • Suppose that the coefficient on x is positive and the coefficient on x^2 is negative

  • Then y is increasing in x at first, but will eventually turn around and be decreasing in x

  • We may want to know the turning point, x* = -b1/(2b2), where the effect of x on y switches sign

More on Quadratic Models

  • Suppose that the coefficient on x is negative and the coefficient on x^2 is positive

  • Then y is decreasing in x at first, but will eventually turn around and be increasing in x
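
For the quadratic model y = b0 + b1x + b2x^2 + u, the approximate marginal effect of x is b1 + 2·b2·x, and it is zero at x* = -b1/(2·b2). A small sketch of both calculations follows; the coefficient values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def marginal_effect(b1, b2, x):
    """Approximate effect on y of one more unit of x, evaluated at x."""
    return b1 + 2 * b2 * x

def turning_point(b1, b2):
    """Value of x where the quadratic in x stops rising (or falling)."""
    if b2 == 0:
        raise ValueError("no turning point: the model is linear in x")
    return -b1 / (2 * b2)

# Hypothetical estimates: positive coefficient on x, negative on x^2.
b1_hat, b2_hat = 0.3, -0.006

x_star = turning_point(b1_hat, b2_hat)
print("turning point:", x_star)
print("effect at x = 10:", marginal_effect(b1_hat, b2_hat, 10))
print("effect at x = 40:", marginal_effect(b1_hat, b2_hat, 40))
```

With these numbers the effect is positive below the turning point and negative above it. It is worth checking whether the turning point lies inside the range of x observed in the sample before reading much into it.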

Interaction Terms

  • We might think that the marginal effect of one RHS variable depends on another RHS variable

    Example: suppose the model can be written: y = b0 + b1x1 + b2x2 + b3x1x2 + u

  • Where y is house price, x1 is the number of square feet and x2 is the number of bedrooms.

  • So the effect of an extra bedroom on price is b2 + b3x1

Interaction Terms

  • If b3>0, this tells us that an extra bedroom boosts the price of a house more, if the square footage of the house is higher.

    • This shouldn’t be surprising. After all, an extra bedroom in a small house is likely to be small compared with an extra bedroom in a large house. So we would expect an extra bedroom in a big house to be worth more.

  • Note that this makes interpretation of b2 a bit less straightforward.

    • Technically, b2 tells us how much an extra bedroom is worth in a house with zero square feet.

    • It may be useful to report on the value of b2+b3x1 for the mean value of x1 .
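
To make the last point concrete, here is a short sketch that evaluates b2 + b3x1 at zero square feet and at the sample mean. The coefficient values and the square-footage sample are hypothetical, not estimates from the slides.

```python
import numpy as np

def bedroom_effect(b2, b3, sqrft):
    """Effect of one more bedroom on price at a given size: b2 + b3 * sqrft."""
    return b2 + b3 * np.asarray(sqrft, dtype=float)

# Hypothetical estimates (price in dollars) and a hypothetical sample of house sizes.
b2_hat, b3_hat = -5000.0, 10.0
sqrft_sample = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])

effect_at_zero = bedroom_effect(b2_hat, b3_hat, 0.0)                  # what b2 alone measures
effect_at_mean = bedroom_effect(b2_hat, b3_hat, sqrft_sample.mean())  # more useful to report

print("effect at zero square feet:", effect_at_zero)
print("effect at mean square feet:", effect_at_mean)
```

In this made-up case b2 alone is negative, yet the effect of a bedroom in a house of average size is positive, which is why b2 should not be read in isolation.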

More on Goodness-of-Fit: Adjusted R-Squared

  • Recall that the R2 will never decrease, and will usually increase, as more variables are added to the model

  • The adjusted R2 takes into account the number of variables in a model, and may decrease

  • The usual R2 can be written: $R^2 = 1 - \dfrac{SSR/n}{SST/n}$

Adjusted R-Squared (cont)

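  • We can define the “population R-squared” as $1 - \sigma_u^2 / \sigma_y^2$

  • We can use SSR/(n-k-1) as an unbiased estimate of $\sigma_u^2$

  • Similarly, we can use SST/(n-1) as an unbiased estimate of $\sigma_y^2$

  • Therefore, adjusted R2 (“R-bar squared”) is: $\bar{R}^2 = 1 - \dfrac{SSR/(n-k-1)}{SST/(n-1)}$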

Adjusted R-Squared (cont)

  • Notice that R-bar squared can go up or down when a variable is added, unlike the regular R-squared which never goes down

  • R-bar squared is not necessarily “better”- the ratio of 2 unbiased estimators isn’t necessarily unbiased

  • Better to treat it as an alternative way of summarizing goodness of fit

    • If you add a variable to the RHS and the R-bar squared doesn’t rise, this is likely (though not surely) an indication it shouldn’t be included in the model
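
The behaviour of the two measures can be illustrated with a short simulation. This is a sketch on made-up data: the helper `fit_r2` and the irrelevant regressor `noise` are assumptions for illustration.

```python
import numpy as np

def fit_r2(X, y):
    """Return (R-squared, adjusted R-squared) for OLS of y on X; X includes a constant."""
    n, p = X.shape                                   # p = k + 1 estimated parameters
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    ssr = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ssr / sst
    adj_r2 = 1 - (ssr / (n - p)) / (sst / (n - 1))   # n - p equals n - k - 1
    return r2, adj_r2

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
noise = rng.normal(size=n)          # regressor unrelated to y by construction
y = 1.0 + 2.0 * x + rng.normal(size=n)

r2_small, adj_small = fit_r2(np.column_stack([np.ones(n), x]), y)
r2_big, adj_big = fit_r2(np.column_stack([np.ones(n), x, noise]), y)

print("R2:     ", r2_small, "->", r2_big)
print("adj. R2:", adj_small, "->", adj_big)
```

R2 cannot fall when `noise` is added, while the adjusted R2 is free to move in either direction because the degrees-of-freedom penalty offsets the small reduction in SSR.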

Comparing Nested Models

  • Suppose you wanted to compare the following two models:

    1. y=0+1x+u

    2. y=0+1x+2x2+ u

    We say that (1) is nested in (2); alternatively, (1) is a special case of (2). With a t-test on 2 we can choose between these two models (if reject null of 2=0, we pick model 2). For multiple exclusion restrictions can use F-test.
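
As a sketch of how the two tests relate, the F statistic for the exclusion restriction can be built from the restricted and unrestricted sums of squared residuals, F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n-k-1)]. With a single restriction it equals the square of the t statistic on b2. The data below are simulated and the helper `fit` is hypothetical.

```python
import numpy as np

def fit(X, y):
    """Return OLS coefficients and the sum of squared residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return beta, resid @ resid

rng = np.random.default_rng(2)
n = 150
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x - 0.15 * x**2 + rng.normal(0, 1.0, n)

X_r = np.column_stack([np.ones(n), x])           # model (1), restricted
X_ur = np.column_stack([np.ones(n), x, x**2])    # model (2), unrestricted

_, ssr_r = fit(X_r, y)
beta_ur, ssr_ur = fit(X_ur, y)

q = 1                          # number of exclusion restrictions
df = n - X_ur.shape[1]         # n - k - 1 in the unrestricted model
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df)

# t statistic on b2 from the unrestricted model
sigma2_hat = ssr_ur / df
se_b2 = np.sqrt(sigma2_hat * np.linalg.inv(X_ur.T @ X_ur)[2, 2])
t_b2 = beta_ur[2] / se_b2

print("F:", F, " t squared:", t_b2**2)
```

The same F formula carries over to several exclusion restrictions by setting q to the number of excluded variables; critical values would come from the F(q, n-k-1) distribution.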

Comparing Non-Nested Models

  • Suppose you wanted to compare the following two models:

    1. y=0+1log(x)+

    2. y=0+1x+ 2x2+ 

  • One is not nested in the other, so t-test or F-test cannot be used to compare.

  • Here R-bar-squared can be useful. We can simply choose the model with the higher R-bar-squared.

    • Note that a simple comparison of regular R-squared would tend to lead us to choose the model with more explanatory variables.

    • Note that if the LHS variable takes a different form between (1) and (2) we cannot compare using R-bar-squared (or R-squared).

Goodness of Fit

  • Important not to fixate too much on adj-R2 and lose sight of theory and common sense

  • If economic theory clearly predicts a variable belongs, generally leave it in

  • Don’t want to exclude a variable if leaving it out prohibits a sensible interpretation of the variable of interest

  • Remember the ceteris paribus interpretation of multiple regression

Residual Analysis

  • Sometimes looking at the residuals (i.e. observed - predicted) provides useful information

  • Example: Regress price of cars on characteristics

    • Engine size, efficiency, luxury amenities, roominess, fuel efficiency, etc.

  • Then the residual = actual price - predicted price

    • By picking the car with the lowest (most negative) residual, you would be choosing the most underpriced car (assuming you’re controlling for all relevant characteristics)
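
The car example can be sketched in a few lines. The data are simulated, and the two characteristics used here (`engine` in litres and `mpg`) are assumptions standing in for the fuller list of characteristics above.

```python
import numpy as np

# Simulated (hypothetical) car data.
rng = np.random.default_rng(3)
n = 50
engine = rng.uniform(1.0, 4.0, n)
mpg = rng.uniform(18, 45, n)
price = 8000 + 6000 * engine + 150 * mpg + rng.normal(0, 2000, n)

X = np.column_stack([np.ones(n), engine, mpg])
beta = np.linalg.lstsq(X, price, rcond=None)[0]

residuals = price - X @ beta            # actual price minus predicted price
best_deal = int(np.argmin(residuals))   # most negative residual

print("car index:", best_deal)
print("priced below prediction by:", -residuals[best_deal])
```

The ranking is only as good as the set of controls: a car can look underpriced simply because a characteristic buyers dislike was left out of the regression.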

Standard Errors for Predictions

  • Suppose we want to use our estimates to obtain a specific prediction.

  • Such predictions are subject to sampling variation because they are functions of estimated parameters

  • First, suppose that we estimate:

    $\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \dots + \hat{b}_k x_k$

    and we want to obtain a prediction of y for specific values of the x’s. In general we can obtain predictions of y by plugging values of the actual x’s into our fitted model.

Standard Errors for Predictions

  • Let c1, c2, …, ck denote particular values of the x variables, for which we want to obtain a prediction of y.

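  • We can think of estimating a parameter $q_0 = b_0 + b_1 c_1 + \dots + b_k c_k = E(y \mid x_1 = c_1, \dots, x_k = c_k)$

  • A good estimator is $\hat{q}_0 = \hat{b}_0 + \hat{b}_1 c_1 + \dots + \hat{b}_k c_k$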

Predictions (cont)

  • To get a confidence interval for our estimate of q0 we need its standard error

  • Like testing a linear combination of parameters, the difficulty is getting this standard error

  • Can rewrite as b0 = q0 – b1c1 – … – bkck and plug this into y = b0 + b1x1 + … + bkxk + u to get y = q0 + b1(x1 – c1) + … + bk(xk – ck) + u

  • If we regress y on a constant and on (x1– c1),…, (xk– ck) we will obtain an estimate of q0 and the standard error of that estimate. This can be used to construct a confidence interval.
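
The recentering trick can be verified on simulated data. In this sketch the helper `ols`, the data and the values in `c` are all hypothetical; the check compares the intercept of the recentered regression with the direct calculation from the original fit and its estimated covariance matrix.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, standard errors and the estimated covariance matrix."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    cov = (resid @ resid / (n - p)) * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov)), cov

rng = np.random.default_rng(4)
n = 120
x1 = rng.normal(5, 2, n)
x2 = rng.normal(10, 3, n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(0, 1.0, n)

c = np.array([4.0, 12.0])     # values of x1, x2 at which we want E(y | x)

# Original regression, then the direct calculation of q0-hat and its standard error.
b, _, cov = ols(np.column_stack([np.ones(n), x1, x2]), y)
c_full = np.array([1.0, c[0], c[1]])
q0_direct = c_full @ b
se_direct = np.sqrt(c_full @ cov @ c_full)

# Recentered regression: the intercept is q0-hat and its standard error is se(q0-hat).
b_c, se_c, _ = ols(np.column_stack([np.ones(n), x1 - c[0], x2 - c[1]]), y)
q0_hat, se_q0 = b_c[0], se_c[0]

print("q0-hat:", q0_hat, " se:", se_q0)
```

The recentered regression is convenient because any regression package reports the intercept's standard error, so no covariance matrix manipulation is needed.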

Predictions (cont)

  • The standard error we obtain here is for the expected value of y given particular x values. This can be thought of as the standard error of the average y value for the sub-population that has those exact x characteristics. It is not the same as a standard error for a prediction about a particular individual from the population.

  • In order to form a confidence interval for a particular individual we need to also take into account the variance in the unobserved error.

  • Let y0 = some outcome for which we want a confidence interval (for some individual in the population about whom you wish to make predictions), x0 = new values of the independent variables, and u0 be the unobserved error. Then, $y^0 = b_0 + b_1 x_1^0 + \dots + b_k x_k^0 + u^0$

Predictions (cont)

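  • The best prediction of y0 is: $\hat{y}^0 = \hat{b}_0 + \hat{b}_1 x_1^0 + \dots + \hat{b}_k x_k^0$

  • The prediction error is: $\hat{e}^0 = y^0 - \hat{y}^0 = (b_0 + b_1 x_1^0 + \dots + b_k x_k^0) + u^0 - \hat{y}^0$

  • We know that $E(\hat{e}^0) = 0$ (because our estimators are unbiased and u0 has zero mean)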

Predictions (cont)

  • There are 2 sources of variation:

    1. Variance due to the sampling error in prediction (because yhat based on estimated coefficients)

    2. Variance in the error of the population

Prediction interval

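  • We can estimate the standard error of prediction: $se(\hat{e}^0) = \{[se(\hat{y}^0)]^2 + \hat{\sigma}^2\}^{1/2}$, which gives the 95% prediction interval $\hat{y}^0 \pm t_{.025} \cdot se(\hat{e}^0)$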
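
A minimal sketch of the calculation on simulated data follows. The data, the new point `x0`, and the use of 1.96 as a large-sample stand-in for the t critical value are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 2.0, n)

X = np.column_stack([np.ones(n), x])
xtx_inv = np.linalg.inv(X.T @ X)
beta = xtx_inv @ X.T @ y
resid = y - X @ beta
sigma2_hat = resid @ resid / (n - X.shape[1])    # estimate of the error variance

x0 = np.array([1.0, 4.0])                        # constant plus the new x value
y0_hat = x0 @ beta

# Source 1: sampling error in y0_hat. Source 2: variance of the unobserved error.
se_mean = np.sqrt(sigma2_hat * (x0 @ xtx_inv @ x0))
se_pred = np.sqrt(se_mean**2 + sigma2_hat)

t_crit = 1.96    # assumption: large-sample approximation to the t critical value
lower, upper = y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred

print("se for E(y | x0):     ", se_mean)
print("se for individual y0: ", se_pred)
print("95% prediction interval:", (lower, upper))
```

As n grows, `se_mean` shrinks toward zero but `se_pred` cannot fall below the estimated standard deviation of the error, which is why prediction intervals for individuals stay wide even in large samples.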

Predicting y in a log model

  • For the prediction

  • Simple exponentiation of the predicted ln(y) will underestimate the expected value of y

  • Use caution when making predictions in a model with ln(y) on LHS (see text pages 219-221).
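
The size of the underestimate can be sketched with a standard textbook adjustment. The formulas below are not shown on the slide; the first rests on the added assumption that u is normally distributed with variance $\sigma^2$ and independent of the x’s, and the last line gives an alternative scale factor that drops the normality assumption.

```latex
% Under normal u with variance \sigma^2, independent of the x's:
E(y \mid x) = \exp(\sigma^2 / 2) \, \exp(b_0 + b_1 x_1 + \dots + b_k x_k)

% Adjusted prediction, using the estimated error variance from the log regression:
\hat{y} = \exp(\hat{\sigma}^2 / 2) \, \exp\bigl(\widehat{\ln y}\bigr)

% Without normality, the scale factor can be estimated from the residuals instead:
\hat{\alpha}_0 = \frac{1}{n} \sum_{i=1}^{n} \exp(\hat{u}_i),
\qquad
\hat{y} = \hat{\alpha}_0 \, \exp\bigl(\widehat{\ln y}\bigr)
```

Because $\exp(\hat{\sigma}^2/2) > 1$ whenever $\hat{\sigma}^2 > 0$, the adjusted prediction always exceeds the simple exponentiated one, which is the sense in which plain exponentiation underestimates the expected value of y.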
