
Regression Algebra and a Fit Measure




Presentation Transcript


  1. Regression Algebra and a Fit Measure Based on Greene’s Note 5

  2. The Sum of Squared Residuals b minimizes e'e = (y - Xb)'(y - Xb). Algebraic equivalences at the solution: b = (X'X)⁻¹X'y; e'e = y'e (why? e' = y' - b'X'); e'e = y'y - y'Xb = y'y - b'X'y = e'y, since X'e = 0. (This is the F.O.C. for least squares.)
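A quick numerical check of these identities is sketched below, using NumPy on simulated data (the data, seed, and dimensions are illustrative assumptions, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # constant plus two regressors
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)      # simulated dependent variable

b = np.linalg.solve(X.T @ X, X.T @ y)    # b = (X'X)^(-1) X'y
e = y - X @ b                            # least squares residuals

print(np.allclose(X.T @ e, 0.0))         # F.O.C.: X'e = 0
print(np.isclose(e @ e, y @ e))          # e'e = y'e at the solution
```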

  3. Minimizing e'e Any other coefficient vector has a larger sum of squares. A quick proof: let d = any coefficient vector other than b, and u = y - Xd. Then u'u = (y - Xd)'(y - Xd) = [y - Xb - X(d - b)]'[y - Xb - X(d - b)] = [e - X(d - b)]'[e - X(d - b)]. Expanding, the cross terms vanish because X'e = 0, so u'u = e'e + (d - b)'X'X(d - b) > e'e whenever d ≠ b (with X of full column rank).
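The same point can be checked numerically; in this sketch (simulated data, arbitrary perturbation) the increase in the sum of squares is exactly (d - b)'X'X(d - b):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

d = b + rng.normal(scale=0.1, size=b.shape)   # any other coefficient vector
u = y - X @ d

# u'u = e'e + (d - b)'X'X(d - b), which exceeds e'e whenever d != b
print(np.isclose(u @ u, e @ e + (d - b) @ (X.T @ X) @ (d - b)))
print(u @ u > e @ e)
```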

  4. Dropping a Variable An important special case. Suppose [b,c] = the regression coefficients in a regression of y on [X,z], and d is the same but computed to force the coefficient on z to be 0. This removes z from the regression. (We'll discuss how this is done shortly.) So, we are comparing the results that we get with and without the variable z in the equation. Results which we can show: Dropping a variable(s) cannot improve the fit - that is, it cannot reduce the sum of squares. Adding a variable(s) cannot degrade the fit - that is, it cannot increase the sum of squares. The algebraic result is on text page 34. Where u = the residual in the regression of y on [X,z] and e = the residual in the regression of y on X alone, u'u = e'e - c²(z*'z*) ≤ e'e, where z* = MXz (z with X partialed out). This result forms the basis of the Neyman-Pearson class of tests of the regression model.
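The identity u'u = e'e - c²(z*'z*) can be verified directly; the sketch below uses simulated data and illustrative variable names (X, z, and y here are assumptions for the example only):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
z = rng.normal(size=n)
y = X @ np.array([1.0, 0.5, -2.0]) + 0.7 * z + rng.normal(size=n)

# Long regression of y on [X, z]: coefficients [b, c], residuals u
W = np.column_stack([X, z])
bc = np.linalg.solve(W.T @ W, W.T @ y)
c = bc[-1]
u = y - W @ bc

# Short regression of y on X alone: residuals e
e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)

# z* = M_X z, the part of z orthogonal to the columns of X
z_star = z - X @ np.linalg.solve(X.T @ X, X.T @ z)

# u'u = e'e - c^2 (z*'z*) <= e'e: dropping z cannot reduce the sum of squares
print(np.isclose(u @ u, e @ e - c**2 * (z_star @ z_star)))
```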

  5. The Fit of the Regression • “Variation:” In the context of the “model” we speak of variation of a variable as movement of the variable, usually associated with (not necessarily caused by) movement of another variable. • Total variation = Σi (yi - ȳ)² = y'M0y. • M0 = I - i(i'i)⁻¹i' = the M matrix for X = i, a column of ones.

  6. Decomposing the Variation of y Decomposition: y = Xb + e, so M0y = M0Xb + M0e = M0Xb + e. (Deviations from means. Why is M0e = e?) y'M0y = b'(X'M0)(M0X)b + e'e = b'X'M0Xb + e'e. (M0 is idempotent and e'M0X = e'X = 0.) Note that the results above using M0 assume that one of the columns in X is i. (Constant term.) Total sum of squares = Regression Sum of Squares (SSR) + Residual Sum of Squares (SSE).
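A minimal sketch of the decomposition, assuming a constant term in X (simulated data; M0 is constructed from its definition on the previous slide):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # includes the constant
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

M0 = np.eye(n) - np.ones((n, n)) / n      # M0 = I - i(i'i)^(-1)i', the demeaning matrix

SST = y @ M0 @ y                          # total sum of squares
SSR = b @ X.T @ M0 @ X @ b                # regression sum of squares, b'X'M0Xb
SSE = e @ e                               # residual sum of squares

print(np.allclose(M0 @ e, e))             # M0 e = e because the residuals sum to zero
print(np.isclose(SST, SSR + SSE))         # requires the constant term in X
```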

  7. Decomposing the Variation Recall the decomposition: Var[y] = Var[E[y|x]] + E[Var[y|x]] = variation of the conditional mean around the overall mean + variation around the conditional mean function.

  8. A Fit Measure R² = b'X'M0Xb / y'M0y = 1 - e'e / y'M0y. (Very important result.) R² is bounded by zero and one only if: (a) there is a constant term in X, and (b) the line is computed by linear least squares.
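Computed directly (again on simulated data), the two forms of R² agree and the value lies between zero and one:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
M0 = np.eye(n) - np.ones((n, n)) / n

R2 = (b @ X.T @ M0 @ X @ b) / (y @ M0 @ y)   # R^2 = b'X'M0Xb / y'M0y
R2_alt = 1.0 - (e @ e) / (y @ M0 @ y)        # equivalent form 1 - e'e / y'M0y
print(R2, np.isclose(R2, R2_alt), 0.0 <= R2 <= 1.0)
```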

  9. Adding Variables • R² never falls when a variable z is added to the regression. • A useful general result. • Partial correlation is a difference in R²s.
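A small sketch of the first claim: even an irrelevant regressor z (pure noise here) cannot lower R². The helper function and data are illustrative assumptions:

```python
import numpy as np

def r_squared(y, X):
    """R^2 from least squares of y on X (X is assumed to contain a constant)."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    dev = y - y.mean()
    return 1.0 - (e @ e) / (dev @ dev)

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
z = rng.normal(size=n)                       # irrelevant regressor, pure noise
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

# R^2 with z can only be >= R^2 without z
print(r_squared(y, np.column_stack([X, z])) >= r_squared(y, X))
```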

  10. Example • U.S. Gasoline Market Regression of G on a constant, PG, and Y. Then, what would happen if PNC, PUC, and YEAR were added to the regression – each one, one at a time?

  11. Comparing fits of regressions • Make sure the denominator in R² is the same - i.e., the same left-hand-side variable. • Example: linear vs. loglinear. Loglinear will almost always appear to fit better because taking logs reduces variation.

  12. (Linearly) Transformed Data • How does linear transformation affect the results of least squares? • Based on X, b = (X'X)⁻¹X'y. • You can show (just multiply it out) that the coefficients when y is regressed on Z = XP are c = P⁻¹b. • “Fitted value” is Zc = XPP⁻¹b = Xb. The same!! • Residuals from using Z are y - Zc = y - Xb (we just proved this). The same!! • Sum of squared residuals must be identical, as y - Xb = e = y - Zc. • R² must also be identical, as R² = 1 - e'e/y'M0y (!!).
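The invariance is easy to check numerically; in this sketch P is an arbitrary nonsingular matrix and the data are simulated:

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

P = rng.normal(size=(K, K))                # any nonsingular K x K transformation
Z = X @ P                                  # transformed regressors

b = np.linalg.solve(X.T @ X, X.T @ y)
c = np.linalg.solve(Z.T @ Z, Z.T @ y)

print(np.allclose(c, np.linalg.solve(P, b)))    # c = P^(-1) b
print(np.allclose(Z @ c, X @ b))                # identical fitted values
print(np.isclose((y - Z @ c) @ (y - Z @ c),
                 (y - X @ b) @ (y - X @ b)))    # identical sum of squared residuals
```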

  13. Linear Transformation • Xb is the projection of y into the column space of X. Zc is the projection of y into the column space of Z = XP. But, since the columns of Z = XP are just linear combinations of those of X, the column space of Z must be identical to that of X. Therefore, the projection of y into the former must be the same as the projection into the latter, which produces the other results. • What are the practical implications of this result? • Linear transformation does not affect the fit of a model to a body of data. • Linear transformation does affect the “estimates.” If b is an estimate of something (β), then c cannot be an estimate of β - it must be an estimate of P⁻¹β, which might have no meaning at all.

  14. Adjusted R Squared • Adjusted R² (for degrees of freedom?) = 1 - [(n-1)/(n-K)](1 - R²). • The “degrees of freedom” adjustment assumes something about “unbiasedness.” • Adjusted R² includes a penalty for variables that don’t add much fit, and it can fall when a variable is added to the equation.

  15. Adjusted R² What is being adjusted? The penalty for using up degrees of freedom. Adjusted R² = 1 - [e'e/(n - K)]/[y'M0y/(n - 1)] uses the ratio of two ‘unbiased’ estimators. Is the ratio unbiased? Equivalently, adjusted R² = 1 - [(n-1)/(n-K)](1 - R²). Will it rise when a variable is added to the regression? Adjusted R² is higher with z than without z if and only if the t ratio on z is larger than one in absolute value. (Proof?)
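The |t| > 1 rule can be checked on simulated data; the fit helper below is an illustrative sketch, not a library routine:

```python
import numpy as np

def fit(y, X):
    """Return (R2, adjusted R2, residuals, b); X is assumed to contain a constant."""
    n, K = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    dev = y - y.mean()
    R2 = 1.0 - (e @ e) / (dev @ dev)
    R2_bar = 1.0 - (n - 1) / (n - K) * (1.0 - R2)
    return R2, R2_bar, e, b

rng = np.random.default_rng(7)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = rng.normal(size=n)                          # candidate extra regressor
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

Xz = np.column_stack([X, z])
_, r2bar_without, _, _ = fit(y, X)
_, r2bar_with, e_long, b_long = fit(y, Xz)

# t ratio on z in the longer regression
s2 = (e_long @ e_long) / (n - Xz.shape[1])
t_z = b_long[-1] / np.sqrt(s2 * np.linalg.inv(Xz.T @ Xz)[-1, -1])

# Adjusted R^2 rises with z if and only if |t_z| > 1
print((r2bar_with > r2bar_without) == (abs(t_z) > 1.0))
```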

  16. Fit Measures Other fit measures that incorporate degrees of freedom penalties. • Amemiya: [e'e/(n - K)] × (1 + K/n) • Akaike: log(e'e/n) + 2K/n (based on the likelihood => no degrees of freedom correction)
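Both measures are one-liners once e'e is in hand; this sketch again uses simulated data:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)
K = X.shape[1]

b = np.linalg.lstsq(X, y, rcond=None)[0]
ee = (y - X @ b) @ (y - X @ b)               # e'e

amemiya = ee / (n - K) * (1 + K / n)         # Amemiya prediction criterion
akaike = np.log(ee / n) + 2 * K / n          # Akaike information criterion
print(amemiya, akaike)
```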

  17. Example • U.S. Gasoline Market Regression of G on a constant, PG, Y, PNC, PUC, PPT and YEAR.
