# Welcome to Econ 420 Applied Regression Analysis

Welcome to Econ 420 Applied Regression Analysis. Study Guide Week Six. The F-Test of Overall Significance of Equation. Testing to see if, in general, our equation is any good at all. Step 1: State the null and alternative hypotheses.

## PowerPoint Slideshow about 'welcome to econ 420 applied regression analysis' - Thomas

### Welcome to Econ 420 Applied Regression Analysis

Study Guide

Week Six

The F-Test of Overall Significance of Equation

Testing to see if, in general, our equation is any good at all

Step 1: State the null and alternative hypotheses

Step 2: Choose the level of significance; find the critical F (pages 316-319; d.f. of numerator = k and d.f. of denominator = n-k-1); state the decision rule

Step 3: Estimate the regression; find the F-Stat (formula on page 56; EViews calculates the F-Stat automatically).

Step 4: Apply the decision rule

• If F-Stat > critical F → reject the null hypothesis
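As a rough illustration of Steps 3 and 4, the sketch below computes the F-Stat from the sums of squares with the standard formula F = (ESS/k) / (RSS/(n-k-1)), which is assumed here to match the textbook formula cited above. The numbers are made up for illustration, and the critical value is the 5% entry for 1 and 3 degrees of freedom as read from a standard F table.

```python
def f_stat(ess, rss, n, k):
    """F = (ESS / k) / (RSS / (n - k - 1)) for a model with k independent variables."""
    return (ess / k) / (rss / (n - k - 1))

# Hypothetical values, not taken from the course data set.
ess, rss = 3.6, 2.4   # explained and residual sums of squares
n, k = 5, 1           # number of observations and independent variables
critical_f = 10.13    # 5% critical F with 1 and 3 d.f., read from an F table

f = f_stat(ess, rss, n, k)
reject_null = f > critical_f   # the decision rule from Step 4
print(f"F-Stat = {f:.2f}, reject null: {reject_null}")
```

With these numbers the F-Stat is about 4.5, which is below the critical value, so the null hypothesis is not rejected.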

The overall fit of the estimated model

• Graph of total, explained, and residual sums of squares

• TSS = RSS + ESS

• Divide both sides by TSS

• 1 = RSS/TSS + ESS/TSS

• The coefficient of determination (R²)

• R² = ESS/TSS, or

• Definition: Percentage of total variation of the dependent variable around its mean that is explained by the independent variables

R²

• R² = 1 - RSS/TSS

• the smaller the sum of squared residuals the _______ the R²

• Under what condition is R² = 1?

• Under what condition is R² = 0?

• In the presence of an intercept → 1 ≥ R² ≥ 0

• Suppose we got an R² = 0.7. What does this number mean?
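The decomposition above can be checked numerically. The following is a minimal sketch on made-up data for a one-variable regression (the data and variable names are illustrative only); it computes the three sums of squares and confirms that both expressions for the coefficient of determination agree.

```python
# Hypothetical data for a regression of Y on a single X.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for Y = b0 + b1*X.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained sum of squares
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares

r2 = ess / tss
print(f"TSS={tss:.2f} ESS={ess:.2f} RSS={rss:.2f} R2={r2:.2f}")
```

Here TSS = 6, ESS = 3.6 and RSS = 2.4, so the coefficient of determination is 0.6: the regression explains 60 percent of the variation of Y around its mean.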

Problem of R²

• Remember our height-weight example

• Suppose R² = 0.7

• Now suppose we add another independent variable to our model: pairs of shoes each individual owns

• Does R² go up?

• Maybe

• Should it go up?

• No

• Why?

• If there is no correlation between the added variable and dependent variable, then the estimated coefficient will be zero and RSS does not change

2. Sometimes the addition of an irrelevant independent variable to the model increases R²

• Why?

• There may (accidentally) be a correlation between the weight and pairs of shoes. This diminishes the sum of squared residuals
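To see this effect in numbers, the sketch below fits the weight equation with and without a pairs-of-shoes variable. All data values are invented for illustration, and `ols_r2` is a hypothetical helper that solves the OLS normal equations directly; it is not how EViews does the estimation.

```python
def ols_r2(y, columns):
    """R-squared from an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    b = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - rss / tss

# Hypothetical sample: height (inches), weight (pounds), pairs of shoes owned.
height = [60, 62, 64, 66, 68, 70, 72, 74]
weight = [118, 120, 135, 138, 155, 152, 172, 176]
shoes = [5, 3, 8, 2, 6, 4, 7, 3]

r2_height_only = ols_r2(weight, [height])
r2_with_shoes = ols_r2(weight, [height, shoes])
print(r2_height_only, r2_with_shoes)
```

Because OLS minimizes the sum of squared residuals, the second R² cannot be lower than the first, even though pairs of shoes has no theoretical connection to weight.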

R Bar Squared (adjusts R squared for degrees of freedom)

Adjusted R Squared
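The formula from this slide is not in the transcript. The standard definition, assumed here and consistent with the slide's references to (n-k-1) and to "the term in the bracket", is:

```latex
\bar{R}^2 = 1 - \left[ \frac{RSS / (n - k - 1)}{TSS / (n - 1)} \right]
```

where RSS is the sum of squared residuals, TSS is the total sum of squares, n is the number of observations, and k is the number of independent variables.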

• As K goes up what happens to R bar squared?

• The sum of squared residuals may go down.

• What does this do to R bar squared?

• R bar squared may go up

• (n-k-1) goes down → the term in the bracket goes up

• R bar squared goes down

• R bar squared goes up if the first effect is stronger than the second effect.

• This is more likely to happen if the added independent variable is a relevant variable

• Note: High R² or R bar squared is not the only sign of a good fit.

• EViews reports both R² and R bar squared
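The two competing effects can be illustrated with a small calculation. This sketch uses the equivalent form R bar squared = 1 - (1 - R²)(n - 1)/(n - k - 1); the sample size and R² values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """R bar squared = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 20  # hypothetical sample size

before = adjusted_r2(0.700, n, k=1)   # original model with one independent variable
weak = adjusted_r2(0.705, n, k=2)     # added variable barely raises R^2
strong = adjusted_r2(0.800, n, k=2)   # added variable raises R^2 a lot

print(f"before={before:.3f} weak={weak:.3f} strong={strong:.3f}")
```

When the added variable barely reduces the sum of squared residuals, the loss of a degree of freedom dominates and R bar squared falls (about 0.683 to 0.670 here). When the reduction is large, R bar squared rises (to about 0.776).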

Steps in Applied Regression Analysis (Chapter 4)

1. Identify the question

2. Review the literature

a) Theoretical literature will help you to

• Specify the model

• Dependent and Independent Variables

• Real/nominal variables

• Omitted variables

• Extra variables

• Functional form

• Hypothesize the expected signs of coefficients

• A perfect but useless regression (cause and effect rather than equality)

Effects of Omitted Variables

• Example

• True equation is Y = f (X1,X2)

• Where

• Y = GPA

• X1 = hours of study

• X2 = IQ score

• We fail to include X2 in our model

• Does this violate any assumptions?

• Go back and study the assumptions to answer this question

• Violates assumption 1.  Why?

• May violate assumption 3.  Why?

Effects of Omitted Variables

• What if X1 and X2 are correlated?

• Does this violate any assumptions?

• OLS is not BLUE

• The estimated coefficient of X1 (that is, B^1) is biased

• Bias depends on the correlation between X1 & X2 and the coefficient of X2 in true regression line.

Direction of Bias

The sign (direction) of Bias

• Bias is zero either

• if X2 does not affect Y (Bomitted is zero), or

• if X2 is not correlated with X1

• How do you expect IQ (X2) to affect GPA (Y)?

• How are IQ (X2) and Hours of study (X1) correlated?

• What is the direction of bias in our example?

• Will B^1 be bigger or smaller than it actually should be?
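The logic of the bias can be sketched with a small numerical example. It relies on the standard result that the coefficient from the regression omitting X2 equals the true coefficient on X1 plus (coefficient of X2) × (slope from regressing X2 on X1). The data and coefficient values below are hypothetical, and the error term is left out so that the arithmetic is exact.

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data in which X1 and X2 are positively correlated.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]

# Hypothetical true equation: Y = 1 + 2*X1 + 3*X2 (no error term).
b1_true, b2_true = 2.0, 3.0
y = [1.0 + b1_true * a + b2_true * b for a, b in zip(x1, x2)]

b1_short = slope(x1, y)    # estimated coefficient on X1 when X2 is omitted
alpha1 = slope(x1, x2)     # slope from regressing X2 on X1
bias = b2_true * alpha1    # coefficient of omitted variable times that slope

print(b1_short, b1_true + bias)
```

Both printed values are 5 (up to rounding), compared with a true coefficient of 2. The bias has the sign of the product of the omitted variable's coefficient and its correlation with the included variable, and it vanishes if either one is zero.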

The Variance of the Estimated Coefficient

• Fact:

• When we omit a relevant independent variable that is correlated with other independent variables, the variance of the estimated coefficients of the included independent variables goes down → t-statistic goes up → the t-test may yield a significant coefficient when it should not

When should we suspect the omitted variable problem?

• The adjusted R squared is low

• The magnitude or the sign of the estimated coefficients is not as expected

• The unimportant variables end up being highly significant

Correction for Omitted Variables

• Study the theoretical literature again

• Include the omitted variable based on the Expected bias analysis

Irrelevant Variable Problem

• Suppose the true regression model: GPA = f (Hours of study), but

• Our version of the true model: GPA = f (hours of study, and weight of the person)

• Does our model violate assumption 1?

• Are any other assumptions violated?

• Is our estimator biased?

• Not necessarily: if the expected value of the error term is zero, the expected value of Bhat on hours of study = B

• Does our estimator have the minimum variance?

• No, our estimator does not have the smallest variance (not the most efficient)

• How does this affect the t-test?

• The variance of the estimated coefficient of hours of study goes up → t-statistic goes down → the t-test may not yield a significant coefficient on hours of study when it should.
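One way to see why the variance goes up, assuming a model with just two independent variables and the classical assumptions, is the usual expression for the variance of the estimated coefficient on X1. The symbols σ² (error variance) and r₁₂ (sample correlation between X1 and X2) are introduced here for illustration and do not appear in the slides.

```latex
\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_i (X_{1i} - \bar{X}_1)^2 \, (1 - r_{12}^2)}
```

If weight is left out of the equation, the factor (1 - r₁₂²) drops out of the denominator. Including weight when it happens to be correlated with hours of study makes the denominator smaller and the variance larger.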

Should we include X in the set of our independent variables?

• Yes, if

• Theory calls for its inclusion (the most important criterion)

• t-test: the estimated coefficient of X is significant in the right direction (Note: this does not mean that if the estimated coefficient is insignificant you have to drop the variable from your model.)

• As you include X, the adjusted R squared goes up.

• As you include X, the other variables’ coefficients change significantly.

b) Empirical literature will help you to

• See what others have done

• Their variables

• Their functional forms

• Their data sets

• Their findings

3. Choose a sample & collect data

• Cross Sectional/ Time Series

• Degrees of freedom

4. Estimate and evaluate the equation

a) Overall Quality of estimation

• Adjusted R squared

• F-test

b) Test your hypotheses

5. Document the results

• Predictions

• Policy recommendations

Assignment 5 (5 questions for 10 points each, total = 50 points; due before 10 PM on Friday, October 5)

1. Use the data set in the dvd4 file to

• run an F test of the overall significance of the equation.

• test the significance of all of the estimated coefficients at 1% level. Make sure to not skip any of the 4 steps in hypothesis testing. Attach your EViews output.

• construct a 95% confidence interval for the coefficient on income.

Assignment 5 (continued)

2. #17, Page 63

3. #4, PP 81-82

4. #5, Page 82

5. #6, Page 83