Econometrics i
This presentation is the property of its rightful owner.
Sponsored Links
1 / 53

Econometrics I PowerPoint PPT Presentation


  • 157 Views
  • Uploaded on
  • Presentation posted in: General

Econometrics I. Professor William Greene Stern School of Business Department of Economics. Econometrics I. Part 7 – Estimating the Variance of b. Context.

Download Presentation

Econometrics I

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Econometrics i

Econometrics I

Professor William Greene

Stern School of Business

Department of Economics


Econometrics i1

Econometrics I

Part 7 – Estimating the Variance of b


Context

Context

The true variance of b|X is 2(XX)-1 . We consider how to use the sample data to estimate this matrix. The ultimate objectives are to form interval estimates for regression slopes and to test hypotheses about them. Both require estimates of the variability of the distribution. We then examine a factor which affects how "large" this variance is, multicollinearity.


Estimating 2

Estimating 2

Using the residuals instead of the disturbances:

The natural estimator: ee/N as a sample surrogate for /n

Imperfect observation of i, ei = i - ( - b)xi

Downward bias of ee/N. We obtain the result E[ee|X] = (N-K)2


Expectation of e e

Expectation of ee


Method 1

Method 1:


Estimating 21

Estimating σ2

The unbiased estimator is s2 = ee/(N-K).

“Degrees of freedom correction”

Therefore, the unbiased estimator of 2 is

s2 = ee/(N-K)


Method 2 some matrix algebra

Method 2: Some Matrix Algebra


Decomposing m

Decomposing M


Example characteristic roots of a correlation matrix

Example: Characteristic Roots of a Correlation Matrix


Gasoline data

Gasoline Data


X x and its roots

X’X and its Roots


Var b x

Var[b|X]

Estimating the Covariance Matrix for b|X

The true covariance matrix is 2 (X’X)-1

The natural estimator is s2(X’X)-1

“Standard errors” of the individual coefficients are the square roots of the diagonal elements.


Econometrics i

X’X

(X’X)-1

s2(X’X)-1


Standard regression results

Standard Regression Results

----------------------------------------------------------------------

Ordinary least squares regression ........

LHS=G Mean = 226.09444

Standard deviation = 50.59182

Number of observs. = 36

Model size Parameters = 7

Degrees of freedom = 29

Residuals Sum of squares = 778.70227

Standard error of e = 5.18187 <= sqr[778.70227/(36 – 7)]

Fit R-squared = .99131

Adjusted R-squared = .98951

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error t-ratio P[|T|>t] Mean of X

--------+-------------------------------------------------------------

Constant| -7.73975 49.95915 -.155 .8780

PG| -15.3008*** 2.42171 -6.318 .0000 2.31661

Y| .02365*** .00779 3.037 .0050 9232.86

TREND| 4.14359** 1.91513 2.164 .0389 17.5000

PNC| 15.4387 15.21899 1.014 .3188 1.67078

PUC| -5.63438 5.02666 -1.121 .2715 2.34364

PPT| -12.4378** 5.20697 -2.389 .0236 2.74486

--------+-------------------------------------------------------------


Bootstrapping

Bootstrapping

Some assumptions that underlie it - the sampling mechanism

Method:

1. Estimate using full sample: --> b

2. Repeat R times:

Draw N observations from the n, with replacement

Estimate  with b(r).

3. Estimate variance with

V = (1/R)r [b(r) - b][b(r) - b]’


Bootstrap application

Bootstrap Application

matr;bboot=init(3,21,0.)$ Store results here

name;x=one,y,pg$ Define X

regr;lhs=g;rhs=x$ Compute b

calc;i=0$ Counter

Proc Define procedure

regr;lhs=g;rhs=x;quietly$ … Regression

matr;{i=i+1};bboot(*,i)=b$... Store b(r)

Endproc Ends procedure

exec;n=20;bootstrap=b$ 20 bootstrap reps

matr;list;bboot' $ Display results


Results of bootstrap procedure

Results of Bootstrap Procedure

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error t-ratio P[|T|>t] Mean of X

--------+-------------------------------------------------------------

Constant| -79.7535*** 8.67255 -9.196 .0000

Y| .03692*** .00132 28.022 .0000 9232.86

PG| -15.1224*** 1.88034 -8.042 .0000 2.31661

--------+-------------------------------------------------------------

Completed 20 bootstrap iterations.

----------------------------------------------------------------------

Results of bootstrap estimation of model.

Model has been reestimated 20 times.

Means shown below are the means of the

bootstrap estimates. Coefficients shown

below are the original estimates based

on the full sample.

bootstrap samples have 36 observations.

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X

--------+-------------------------------------------------------------

B001| -79.7535*** 8.35512 -9.545 .0000 -79.5329

B002| .03692*** .00133 27.773 .0000 .03682

B003| -15.1224*** 2.03503 -7.431 .0000 -14.7654

--------+-------------------------------------------------------------


Bootstrap replications

Bootstrap Replications

Full sample result

Bootstrapped sample results


Ols vs least absolute deviations

OLS vs. Least Absolute Deviations

----------------------------------------------------------------------

Least absolute deviations estimator...............

Residuals Sum of squares = 1537.58603

Standard error of e = 6.82594

Fit R-squared = .98284

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X

--------+-------------------------------------------------------------

|Covariance matrix based on 50 replications.

Constant| -84.0258*** 16.08614 -5.223 .0000

Y| .03784*** .00271 13.952 .0000 9232.86

PG| -17.0990*** 4.37160 -3.911 .0001 2.31661

--------+-------------------------------------------------------------

Ordinary least squares regression ............

Residuals Sum of squares = 1472.79834

Standard error of e = 6.68059 Standard errors are based on

Fit R-squared = .98356 50 bootstrap replications

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error t-ratio P[|T|>t] Mean of X

--------+-------------------------------------------------------------

Constant| -79.7535*** 8.67255 -9.196 .0000

Y| .03692*** .00132 28.022 .0000 9232.86

PG| -15.1224*** 1.88034 -8.042 .0000 2.31661

--------+-------------------------------------------------------------


Econometrics i

Quantile Regression:

Application of Bootstrap Estimation


Quantile regression

Quantile Regression

  • Q(y|x,) = x,  = quantile

  • Estimated by linear programming

  • Q(y|x,.50) = x, .50  median regression

  • Median regression estimated by LAD (estimates same parameters as mean regression if symmetric conditional distribution)

  • Why use quantile (median) regression?

    • Semiparametric

    • Robust to some extensions (heteroscedasticity?)

    • Complete characterization of conditional distribution


Estimated variance for quantile regression

Estimated Variance for Quantile Regression

  • Asymptotic Theory

  • Bootstrap – an ideal application


Econometrics i

 = .25

 = .50

 = .75


Multicollinearity

Multicollinearity

Not “short rank,” which is a deficiency in the model. A characteristic of the data set which affects the covariance matrix.

Regardless, is unbiased. Consider one of the unbiased coefficient estimators of k. E[bk] = k

Var[b] = 2(X’X)-1 .

The variance of bk is the kth diagonal element of 2(X’X)-1 .

We can isolate this with the result in your text.

Let [X,z] be [Other xs, xk] = [X1,x2]

(a convenient notation for the results in the text).

We need the residual maker, MX.

The general result is that the diagonal element we seek is

[zM1z]-1 , which we know is the reciprocal of the sum of squared residuals in the regression of z on X.


Variance of least squares

Variance of Least Squares


Multicollinearity1

Multicollinearity


Gasoline market

Gasoline Market

Regression Analysis: logG versus logIncome, logPG

The regression equation is

logG = - 0.468 + 0.966 logIncome - 0.169 logPG

Predictor Coef SE Coef T P

Constant -0.46772 0.08649 -5.41 0.000

logIncome 0.96595 0.07529 12.83 0.000

logPG -0.16949 0.03865 -4.38 0.000

S = 0.0614287 R-Sq = 93.6% R-Sq(adj) = 93.4%

Analysis of Variance

Source DF SS MS F P

Regression 2 2.7237 1.3618 360.90 0.000

Residual Error 49 0.1849 0.0038

Total 51 2.9086


Gasoline market1

Gasoline Market

Regression Analysis: logG versus logIncome, logPG, ...

The regression equation is

logG = - 0.558 + 1.29 logIncome - 0.0280 logPG

- 0.156 logPNC + 0.029 logPUC - 0.183 logPPT

Predictor Coef SE Coef T P

Constant -0.5579 0.5808 -0.96 0.342

logIncome 1.2861 0.1457 8.83 0.000

logPG -0.02797 0.04338 -0.64 0.522

logPNC -0.1558 0.2100 -0.74 0.462

logPUC 0.0285 0.1020 0.28 0.781

logPPT -0.1828 0.1191 -1.54 0.132

S = 0.0499953 R-Sq = 96.0% R-Sq(adj) = 95.6%

Analysis of Variance

Source DF SS MS F P

Regression 5 2.79360 0.55872 223.53 0.000

Residual Error 46 0.11498 0.00250

Total 51 2.90858

The standard error on logIncome doubles when the three variables are added to the equation.


Econometrics i

Condition Number andVariance Inflation Factors

Condition number larger than 30 is ‘large.’

What does this mean?


Econometrics i

The Longley Data


Econometrics i

NIST Longley Solution


Econometrics i

Excel Longley Solution


Econometrics i

The NIST Filipelli Problem


Certified filipelli results

Certified Filipelli Results


Minitab filipelli results

Minitab Filipelli Results


Stata filipelli results

Stata Filipelli Results


Econometrics i

Even after dropping two (random columns), results are only correct to 1 or 2 digits.


Regression of x2 on all other variables

Regression of x2 on all other variables


Using qr decomposition

Using QR Decomposition


Multicollinearity2

Multicollinearity

There is no “cure” for collinearity. Estimating something else is not helpful (principal components, for example).

There are “measures” of multicollinearity, such as the condition number of Xand the variance inflation factor.

Best approach: Be cognizant of it. Understand its implications for estimation.

What is better: Include a variable that causes collinearity, or drop the variable and suffer from a biased estimator?

Mean squared error would be the basis for comparison.

Some generalities. Assuming X has full rank, regardless of the condition,

b is still unbiased

Gauss-Markov still holds


Specification and functional form nonlinearity

Specification and Functional Form:Nonlinearity


Log income equation

Log Income Equation

----------------------------------------------------------------------

Ordinary least squares regression ............

LHS=LOGY Mean = -1.15746 Estimated Cov[b1,b2]

Standard deviation = .49149

Number of observs. = 27322

Model size Parameters = 7

Degrees of freedom = 27315

Residuals Sum of squares = 5462.03686

Standard error of e = .44717

Fit R-squared = .17237

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X

--------+-------------------------------------------------------------

AGE| .06225*** .00213 29.189 .0000 43.5272

AGESQ| -.00074*** .242482D-04 -30.576 .0000 2022.99

Constant| -3.19130*** .04567 -69.884 .0000

MARRIED| .32153*** .00703 45.767 .0000 .75869

HHKIDS| -.11134*** .00655 -17.002 .0000 .40272

FEMALE| -.00491 .00552 -.889 .3739 .47881

EDUC| .05542*** .00120 46.050 .0000 11.3202

--------+-------------------------------------------------------------

Average Age = 43.5272. Estimated Partial effect = .066225 – 2(.00074)43.5272 = .00018.

Estimated Variance 4.54799e-6 + 4(43.5272)2(5.87973e-10) + 4(43.5272)(-5.1285e-8)

= 7.4755086e-08. Estimated standard error = .00027341.


Specification and functional form interaction effect

Specification and Functional Form:Interaction Effect


Interaction effect

Interaction Effect

----------------------------------------------------------------------

Ordinary least squares regression ............

LHS=LOGY Mean = -1.15746

Standard deviation = .49149

Number of observs. = 27322

Model size Parameters = 4

Degrees of freedom = 27318

Residuals Sum of squares = 6540.45988

Standard error of e = .48931

Fit R-squared = .00896

Adjusted R-squared = .00885

Model test F[ 3, 27318] (prob) = 82.4(.0000)

--------+-------------------------------------------------------------

Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] Mean of X

--------+-------------------------------------------------------------

Constant| -1.22592*** .01605 -76.376 .0000

AGE| .00227*** .00036 6.240 .0000 43.5272

FEMALE| .21239*** .02363 8.987 .0000 .47881

AGE_FEM| -.00620*** .00052 -11.819 .0000 21.2960

--------+-------------------------------------------------------------

Do women earn more than men (in this sample?) The +.21239 coefficient on FEMALE would

suggest so. But, the female “difference” is +.21239 - .00620*Age. At average Age, the

effect is .21239 - .00620(43.5272) = -.05748.


  • Login