
# Economics 105: Statistics - PowerPoint PPT Presentation

Economics 105: Statistics. Go over GH 24. Risks in Model Building. Including irrelevant X’s: increases complexity, reduces adjusted R², increases model variability across samples. Omitting relevant X’s: fails to capture fit, can bias other estimated coefficients.


## PowerPoint Slideshow about 'Economics 105: Statistics' - nola

Presentation Transcript

### Economics 105: Statistics

Go over GH 24

• Including irrelevant X’s

• Increases complexity

• Increases model variability across samples

• Omitting relevant X’s

• Fails to capture fit

• Can bias other estimated coefficients

• Where omitted X is related to both other X’s and to the dependent variable (Y)

• Remember: we are using sample data

• About 5% of the time (at the 5% significance level), random sampling alone will produce observations of X’s whose betahat’s pass classical hypothesis tests even though the true beta’s are zero (a Type I error)

• Or the beta’s may be important, but the sample will happen to include observations of X whose betahat’s fail the statistical tests (a Type II error)

• That’s why we rely on theory, prior hypotheses, and replication
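The "about 5% of the time" point can be checked with a small simulation. The sketch below is only an illustration under assumed settings (50 observations per sample, standard normal X and errors, a true slope of zero); none of these numbers come from the slides.

```python
import numpy as np

N_OBS = 50            # assumed sample size (illustrative)
T_CRIT = 2.0106       # approx. two-sided 5% critical value of t with N_OBS - 2 = 48 d.f.

def false_positive_rate(n_samples=2000, seed=0):
    """Share of samples in which an irrelevant X looks significant at the 5% level."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_samples):
        x = rng.normal(size=N_OBS)
        y = 1.0 + rng.normal(size=N_OBS)          # true slope on x is zero
        X = np.column_stack([np.ones(N_OBS), x])
        beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta_hat
        s2 = resid @ resid / (N_OBS - 2)          # estimate of the error variance
        se_slope = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
        if abs(beta_hat[1] / se_slope) > T_CRIT:
            rejections += 1
    return rejections / n_samples

rate = false_positive_rate()
print(f"share of 'significant' slopes when beta = 0: {rate:.3f}")
```

The printed share should land near 0.05, which is the sense in which a single significant betahat is weak evidence without theory or replication behind it.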

Holy endogeneity, Batman!

Violations of GM Assumptions

| Assumption | Violation |
|---|---|
| “Well-specified model” (1) & (5) | Wrong functional form; omit relevant variable (include irrelevant variable); errors in variables; sample selection bias, simultaneity bias |
| Zero conditional mean of errors (2) | Constant, nonzero mean due to systematically positive or negative measurement error in Y; can only assess theoretically |
| Homoskedastic errors (3) | Heteroskedastic errors |
| No serial correlation in errors (4) | There exists serial correlation in errors |

### Multiple Regression

Assumptions

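(1) $Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_K X_{iK} + \varepsilon_i$

Linear function in the parameters, plus error

Variation in Y is caused by $\varepsilon$, the error (as well as X)

(2) $E(\varepsilon_i \mid X_{i1}, \dots, X_{iK}) = 0$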

Sources of error

Idiosyncratic, “white noise”

Measurement error on Y

Omitted relevant explanatory variables

If (2) holds, we have exogenous explanatory vars

If some Xj is correlated with error term for some reason, then that Xj is an endogenous explanatory var
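To make the exogenous/endogenous distinction concrete, here is a minimal simulation sketch. The design is entirely assumed for illustration: a true slope of 2, and an endogenous regressor built by adding the error itself to an otherwise exogenous variable, so that Cov(X, error)/Var(X) = 0.5 and the OLS slope should drift toward 2.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_slope = 2.0

e = rng.normal(size=n)            # error term
v = rng.normal(size=n)            # variation in X unrelated to the error

x_exog = v                        # uncorrelated with e: exogenous
x_endog = v + e                   # shares a component with e: endogenous

def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

slope_exog = ols_slope(x_exog, 1.0 + true_slope * x_exog + e)
slope_endog = ols_slope(x_endog, 1.0 + true_slope * x_endog + e)

print(f"exogenous X:  slope estimate = {slope_exog:.3f}")
print(f"endogenous X: slope estimate = {slope_endog:.3f}")
```

With a large sample the first estimate sits close to the true slope, while the second stays away from it no matter how many observations are added.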

### Multiple Regression

Assumptions

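(3) $\operatorname{Var}(\varepsilon_i) = \sigma^2$ for all $i$

Homoskedasticity

(4) $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$

No autocorrelation

(5) $\operatorname{Cov}(\varepsilon_i, X_{ij}) = 0$ for each $j$

Errors and the explanatory variables are uncorrelated

(6) $\varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2)$

Errors are i.i.d. normal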

### Multiple Regression

Assumption (7) No perfect multicollinearity

no explanatory variable is an exact linear function of other X’s

Venn diagram

Other implicit assumptions

data are a random sample of n observations from proper population

n > K, and ideally n much greater than K

the little xij’s are either fixed numbers (the same in repeated samples) or realizations of random variables, Xij, that are independent of the error term; in the second case, inference is done CONDITIONAL on the observed values of the xij’s
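Assumption (7) can be checked numerically: if one explanatory variable is an exact linear function of the others, the design matrix loses rank and X'X cannot be inverted. The sketch below uses made-up data and a hypothetical variable `x3` constructed as an exact combination of `x1` and `x2`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 3.0 * x2          # exact linear function of x1 and x2

X_ok = np.column_stack([np.ones(n), x1, x2])        # constant + 2 regressors
X_bad = np.column_stack([np.ones(n), x1, x2, x3])   # adds the redundant column

rank_ok = np.linalg.matrix_rank(X_ok)
rank_bad = np.linalg.matrix_rank(X_bad)

# Full column rank means rank == number of columns.
print(f"without x3: rank {rank_ok} of {X_ok.shape[1]} columns")
print(f"with x3:    rank {rank_bad} of {X_bad.shape[1]} columns")
```

When the rank falls short of the number of columns, the individual coefficients are not separately identified, so one of the collinear variables has to be dropped.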

• Violation of Assumptions (1 & 5): well-specified model

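• true model is (A): $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$

• but we run (B): $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$

• Including an irrelevant variable

• $\hat\beta_1$ is an unbiased estimator of $\beta_1$

• $\operatorname{Var}(\hat\beta_1)$ is larger; less efficient

• estimator of $\sigma^2$, $s^2$, is unbiased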

• t & F tests are valid
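The claims about including an irrelevant variable (still unbiased, but less efficient) can be illustrated with a Monte Carlo sketch. Everything here is an assumption chosen for illustration: 40 observations, a true slope of 2 on `x1`, and an irrelevant `x2` built to be highly correlated with `x1`. The X's are held fixed across replications and only the errors are redrawn.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 40, 5000
beta0, beta1 = 1.0, 2.0

x1 = rng.normal(size=n)
# x2 has no effect on Y, but it is correlated with x1 (population correlation 0.9).
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)

X_true = np.column_stack([np.ones(n), x1])        # correctly specified regression
X_over = np.column_stack([np.ones(n), x1, x2])    # adds the irrelevant x2

# One column of Y per replication; the X's stay the same in every sample.
eps = rng.normal(size=(n, reps))
Y = (beta0 + beta1 * x1)[:, None] + eps

# lstsq accepts a matrix of right-hand sides; row 1 holds the slope on x1.
b_true = np.linalg.lstsq(X_true, Y, rcond=None)[0][1]
b_over = np.linalg.lstsq(X_over, Y, rcond=None)[0][1]

print(f"mean of slope on x1: {b_true.mean():.3f} (true model), {b_over.mean():.3f} (with x2)")
print(f"variance of slope:   {b_true.var():.4f} (true model), {b_over.var():.4f} (with x2)")
```

Both averages should sit near the true slope, while the variance is visibly larger once `x2` is included. The efficiency loss shrinks as the irrelevant variable becomes less correlated with `x1`.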

### Specification Bias

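• Violation of Assumptions (1 & 5): well-specified model

• true model is (C): $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$

• but we run (D): $Y_i = \beta_0 + \beta_1 X_{i1} + u_i$

• Omitting a relevant variable

• $\hat\beta_1$ is a biased estimator of $\beta_1$ (when the omitted X is correlated with the included X)

• $\operatorname{Var}(\hat\beta_1)$ is actually smaller; more efficient

• estimator of $\sigma^2$, $s^2$, is now biased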

• t & F tests are incorrect
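The bias from omitting a relevant variable has a simple algebraic form: in any sample, the slope from the short regression equals the long-regression slope on the included X plus the long-regression slope on the omitted X times the slope from regressing the omitted X on the included X. The sketch below checks that identity on simulated data. All parameter values (slopes of 2 and 1.5, a 0.6 link between the two X's) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                   # omitted variable, related to x1
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)   # x2 also affects y

ones = np.ones(n)

def ols(X, target):
    """OLS coefficients from regressing target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

b_long = ols(np.column_stack([ones, x1, x2]), y)     # includes both X's
b_short = ols(np.column_stack([ones, x1]), y)        # omits x2
delta = ols(np.column_stack([ones, x1]), x2)         # auxiliary regression of x2 on x1

# Omitted variable bias identity: short slope = long slope + (effect of x2) * (x2-on-x1 slope)
implied = b_long[1] + b_long[2] * delta[1]

print(f"long regression slope on x1:  {b_long[1]:.3f}")
print(f"short regression slope on x1: {b_short[1]:.3f}")
print(f"slope implied by the identity: {implied:.3f}")
```

The short-regression slope absorbs part of the effect of `x2`, so it overstates the effect of `x1` here. The bias would vanish if `x2` were unrelated to either `x1` or `y`.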

### Omitted Variable Bias

Subscript c indexes 64 countries

Descriptive statistics

### Omitted Variable Bias

… approximately equal