Economics 105: Statistics


Go over GH 24.


PowerPoint Slideshow about ' Economics 105: Statistics' - nola

Presentation Transcript
Risks in Model Building
  • Including irrelevant X’s
    • Increases complexity
    • Reduces adjusted R²
    • Increases model variability across samples
  • Omitting relevant X’s
    • Fails to capture fit
    • Can bias other estimated coefficients
      • Where omitted X is related to both other X’s and to the dependent variable (Y)
More Risks: Samples Can Mislead
  • Remember: we are using sample data
    • About 5% of the time (at the 5% significance level), a sample will by chance contain observations of an irrelevant X whose betahat passes the classical hypothesis test, i.e., looks statistically significant (a Type I error)
    • Or a beta may truly be important, but the sample will by chance contain observations of X whose betahat fails the statistical test (a Type II error)
  • That’s why we rely on theory, prior hypotheses, and replication
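
The "about 5% of the time" claim can be checked by simulation. The sketch below is only illustrative: the sample size, the number of repetitions, the data-generating process, and the helper names `t_stat_slope` and `false_positive_rate` are all assumptions, not part of the slides. It regresses Y on an X that truly has no effect and counts how often the slope looks significant at the large-sample 5% critical value of 1.96.

```python
import numpy as np

def t_stat_slope(x, y):
    """t-statistic for the slope in a simple OLS regression of y on x."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)            # unbiased estimate of the error variance
    cov = s2 * np.linalg.inv(X.T @ X)       # estimated covariance matrix of beta-hat
    return beta[1] / np.sqrt(cov[1, 1])

def false_positive_rate(n_samples=2000, n=200, seed=0):
    """Share of samples in which an irrelevant X appears significant."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_samples):
        x = rng.normal(size=n)
        y = 1.0 + rng.normal(size=n)        # the true slope on x is zero
        if abs(t_stat_slope(x, y)) > 1.96:  # large-sample 5% two-sided critical value
            rejections += 1
    return rejections / n_samples

rate = false_positive_rate()
print(f"share of samples with a 'significant' irrelevant X: {rate:.3f}")
```

The printed share should land near 0.05, which is why a single significant coefficient is weak evidence without theory or replication behind it.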
Violations of GM Assumptions

Assumption (violations listed beneath)

  • “Well-specified model” (1) & (5)
    • Wrong functional form
    • Omit relevant variable (include irrelevant variable)
    • Errors in variables
    • Sample selection bias, simultaneity bias
  • Zero conditional mean of errors (2)
    • Constant, nonzero mean due to systematic +/- measurement error in Y
    • Can only assess theoretically
  • Homoskedastic errors (3)
    • Heteroskedastic errors
  • No serial correlation in errors (4)
    • There exists serial correlation in errors

I know! We can save the model, but not until Eco205.

Holy endogeneity, Batman!

Multiple Regression

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$

Linear function in the parameters, plus error

Variation in Y is caused by $\varepsilon$, the error (as well as X)


Sources of error

Idiosyncratic, “white noise”

Measurement error on Y

Omitted relevant explanatory variables

If (2) holds, we have exogenous explanatory vars

If some Xj is correlated with error term for some reason, then that Xj is an endogenous explanatory var

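Multiple Regression

No autocorrelation: $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$

Errors and the explanatory variables are uncorrelated: $\mathrm{Cov}(X_{ji}, \varepsilon_i) = 0$ for every $j$

Errors are i.i.d. normal: $\varepsilon_i \sim N(0, \sigma^2)$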


Multiple Regression

Assumption (7) No perfect multicollinearity

no explanatory variable is an exact linear function of other X’s

Venn diagram
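
A small numerical sketch of why assumption (7) matters. The variables `x1`, `x2`, `x3` and the coefficients 2 and -3 are made up for illustration. When one regressor is an exact linear function of the others, the design matrix loses rank, so X'X cannot be inverted and the OLS coefficients are not uniquely determined.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 3.0 * x2                # exact linear function of x1 and x2

X_ok = np.column_stack([np.ones(n), x1, x2])        # constant, x1, x2
X_bad = np.column_stack([np.ones(n), x1, x2, x3])   # adds the redundant x3

rank_ok = np.linalg.matrix_rank(X_ok)
rank_bad = np.linalg.matrix_rank(X_bad)

print("columns:", X_ok.shape[1], "rank:", rank_ok)    # full column rank
print("columns:", X_bad.shape[1], "rank:", rank_bad)  # rank falls short of the column count
```

High but imperfect correlation among the X's does not reduce the rank; it only makes the estimates less precise.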

Other implicit assumptions

data are a random sample of n observations from proper population

n > K, and ideally n much greater than K

the $x_{ij}$’s are either fixed numbers (the same in repeated samples) or realizations of random variables, $X_{ij}$, that are independent of the error term, in which case inference is done CONDITIONAL on the observed values of the $x_{ij}$’s

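Specification Bias

Violation of Assumptions (1 & 5): well-specified model

  • true model is (A): $Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$
  • but we run (B): $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
  • Including an irrelevant variable
    • $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$
    • $\mathrm{Var}(\hat{\beta}_1)$ is larger; less efficient
    • estimator of $\sigma^2$, $s^2$, is unbiased
      • t & F tests are valid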
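
A Monte Carlo sketch of the irrelevant-variable case. It assumes a hypothetical true model Y = 1 + 2·X1 + error and an irrelevant X2 that is correlated with X1; all numbers and the helper `ols` are illustrative. Both regressions should recover the slope on X1 on average, but the one that also includes X2 should spread more widely across samples.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n, reps = 100, 2000
beta0, beta1 = 1.0, 2.0

b1_correct = np.empty(reps)   # slope on x1 when only x1 is included
b1_extra = np.empty(reps)     # slope on x1 when irrelevant x2 is also included
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # correlated with x1, no effect on y
    y = beta0 + beta1 * x1 + rng.normal(size=n)
    ones = np.ones(n)
    b1_correct[r] = ols(np.column_stack([ones, x1]), y)[1]
    b1_extra[r] = ols(np.column_stack([ones, x1, x2]), y)[1]

print("mean of slope, correct model:   ", b1_correct.mean())
print("mean of slope, with extra X:    ", b1_extra.mean())
print("std. dev. of slope, correct:    ", b1_correct.std())
print("std. dev. of slope, with extra X:", b1_extra.std())
```

The efficiency loss grows with the correlation between the irrelevant and the included variable; if the two were uncorrelated, the spreads would be nearly the same.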

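Specification Bias

Violation of Assumptions (1 & 5): well-specified model

  • true model is (C): $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
  • but we run (D): $Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$
  • Omitting a relevant variable
    • $\hat{\beta}_1$ is a biased estimator of $\beta_1$
    • $\mathrm{Var}(\hat{\beta}_1)$ is actually smaller; more efficient
    • estimator of $\sigma^2$, $s^2$, is now biased
      • t & F tests are incorrect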
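
A Monte Carlo sketch of the omitted-variable case. It assumes a hypothetical true model Y = 1 + 2·X1 + 3·X2 + error with error variance 1, where X2 = 0.5·X1 + noise; every number here is illustrative. Regressing Y on X1 alone should give a slope centered well away from 2 and an error-variance estimate far above 1.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 2000
beta1, beta2, sigma2 = 2.0, 3.0, 1.0

b1_short = np.empty(reps)   # slope on x1 when x2 is omitted
s2_short = np.empty(reps)   # estimated error variance when x2 is omitted
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)            # relevant and correlated with x1
    y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(scale=np.sqrt(sigma2), size=n)
    X = np.column_stack([np.ones(n), x1])         # x2 left out of the regression
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    b1_short[r] = b[1]
    s2_short[r] = resid @ resid / (n - 2)

print("true beta1:", beta1, " mean estimated slope:", b1_short.mean())
print("true sigma^2:", sigma2, " mean estimated s^2:", s2_short.mean())
```

In this setup the slope centers near 2 + 3 × 0.5 = 3.5, because X1 picks up the part of X2's effect that moves with it.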

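Omitted Variable Bias

When is $\hat{\beta}_1$ an unbiased estimator of $\beta_1$?

  • $E[\hat{\beta}_1] = \beta_1 + \beta_2 b_{21}$, where $\beta_2$ is the coefficient on the excluded variable
  • $b_{21}$ is the slope coefficient from a regression of the EXCLUDED variable on the INCLUDED variable
  • so $\hat{\beta}_1$ is unbiased only if $\beta_2 = 0$ or $b_{21} = 0$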
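
The role of b21 can be seen in a single sample. In the sketch below the data-generating numbers and the helper `ols` are assumptions for illustration. The slope from the regression that excludes X2 equals the slope on X1 from the full regression plus the full-regression coefficient on X2 times b21. This holds exactly in any sample, as an algebraic property of least squares.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)                      # INCLUDED variable
x2 = 0.5 * x1 + rng.normal(size=n)           # EXCLUDED variable, related to x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
ones = np.ones(n)

b_long = ols(np.column_stack([ones, x1, x2]), y)   # regression with both variables
b_short = ols(np.column_stack([ones, x1]), y)      # regression omitting x2
b21 = ols(np.column_stack([ones, x1]), x2)[1]      # slope of excluded on included

implied = b_long[1] + b_long[2] * b21
print("slope with x2 omitted:       ", b_short[1])
print("long slope + coef(x2) * b21: ", implied)
```

If X2 were unrelated to X1 in the sample (b21 = 0), or had no effect on Y, the two regressions would give the same slope on X1.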


Omitted Variable Bias

Subscript c indexes 64 countries

Descriptive statistics


Omitted Variable Bias

… approximately equal