Marietta College Spring 2011 Econ 420: Applied Regression Analysis Dr. Jacqueline Khorassani Week 4
Corrections • Exam 1 • Tuesday, February 8 • Exam 2 • Tuesday, March 22 • Exam 3 • Monday, April 25, 12-2:30PM
Tuesday, February 1 • My home page has moved to http://mcis.marietta.edu/community/khorassj/ • Who went to the lecture on Sunday night? • Summaries are due before 5 pm this Friday • Send them to me as an email attachment • If you asked her a question, write it down and give it to me today.
Return and discuss Asst 4 • Use EViews to estimate the coefficients in our classroom height-weight example; attach • the estimation output • This is the first thing you get after you estimate your model (it reports the estimated coefficients) • the graph of actual weight, predicted weight, and the residuals • Is Jackie above the estimated line or below it? Why? • Above, because her residual is positive
#5, Page 26
(a) The coefficients are not the same because the samples are different
(b) Equation 1.21 has the steeper slope (6.38 > 4.03). Equation 1.24 has the greater intercept (125.1 > 103.4). They intersect at 9.23 inches above 5 feet (162.3 pounds)
(c) Equation 1.24 misguesses by more than ten pounds on exactly the same 3 observations as Equation 1.21. But overall, 1.21 does better (smaller residuals) predicting inside its own sample (no surprise!)
(d) Options: • Use an average of the two equations • Combine the samples • Use 1.24 because of its bigger sample
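The intersection point in (b) can be checked by setting the two fitted lines equal and solving for X. A minimal sketch, using only the intercepts and slopes quoted above; the function name `intersection` is ours, not the textbook's:

```python
# Coefficients as quoted in answer (b): (intercept, slope).
b0_121, b1_121 = 103.4, 6.38   # Equation 1.21
b0_124, b1_124 = 125.1, 4.03   # Equation 1.24

def intersection(a0, a1, c0, c1):
    """Return (x, y) where the lines a0 + a1*x and c0 + c1*x cross."""
    x = (c0 - a0) / (a1 - c1)   # solve a0 + a1*x = c0 + c1*x
    return x, a0 + a1 * x

x_star, y_star = intersection(b0_121, b1_121, b0_124, b1_124)
print(f"{x_star:.2f} inches above 5 feet, {y_star:.1f} pounds")
```

Below the crossing point Equation 1.24 predicts the higher weight; above it Equation 1.21 does, because its slope is steeper.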
# 6, Page 26
(a) The chance of making the putt decreases by 4.1 percentage points for each additional foot of putt length (ignoring the effect of all other relevant variables)
(b) The equations are identical since ei = Pi – P̂i
(c) 42.6 percent, yes; 79.5 percent, no (too low); -18.9 percent, no (negative!) • The relationship seems to be nonlinear
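The three predictions in (c) can be reproduced from a straight line with slope -4.1. A rough sketch: the intercept of 83.6 and the putt lengths of 10, 1, and 25 feet are not stated on this slide; they are assumptions backed out from the three answers, so check them against the textbook.

```python
def predicted_pct(length_ft, intercept=83.6, slope=-4.1):
    """Predicted percent of putts made at a given length (assumed linear model)."""
    return intercept + slope * length_ft

predictions = {}
for length in (10, 1, 25):                 # assumed putt lengths in feet
    p = predicted_pct(length)
    predictions[length] = p
    in_range = 0 <= p <= 100               # a percentage must lie in [0, 100]
    print(f"{length:>2} ft: {p:6.1f} percent, within 0-100: {in_range}")
```

The range check only catches the impossible negative prediction. Judging the short-putt prediction as "too low" takes outside knowledge of golf, which is the point of the nonlinearity remark.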
Return Asst 5: estimate Ŷi = β̂0 + β̂1 Xi + β̂2 Gi • Did R squared go up as you added G to your model?
Collect Asst 6 # 4 , Page 59 # 6, Page 60 #7, Page 61
Chapter 3: Steps in Applied Regression Analysis (Exactly what some of you will do in Econ 421) 1. Identify the question • What are you interested in? • We are looking for cause and effect relationships • An example of a perfect but useless regression • Give me an example please!!!
2. Review the literature a) Theoretical literature will help you to • Specify the model • Dependent and independent variables • How are they measured? • Real/nominal variables? • Any dummy variables? • Don’t want any omitted variables • Don’t want extra (not needed) variables • Functional form • Linear/non-linear • Hypothesize the expected signs of coefficients
b) Review of empirical literature will help you to • See what others have done • Their variables • Their functional forms • Their data sources • Their estimation methods • Their findings
3. Specify the model • Based on the review of literature • Not based on what increases adjusted R squared (R̄²)
4. Hypothesize the expected signs of coefficients • Based on • Theoretical /empirical literature • Note: You do this before you estimate the equation and see the estimated coefficients
5. Collect the data • Cross sectional or time series? • Degrees of freedom • Inspect the data • What are the units of measurement? • Any outliers? • Any values that don’t make sense? • Clean the data
6. Choose the method of estimation OLS? Or something else? Test for underlying econometric problems
7. Estimate and evaluate the equation a) Overall quality of estimation • Adjusted R squared • Other test: F- test b) Test your hypotheses • T-test • F-test • Other tests
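Adjusted R squared penalizes R squared for the degrees of freedom used up by extra regressors. A minimal sketch of the standard formula, R̄² = 1 – (1 – R²)(n – 1)/(n – K – 1); the sample size and number of regressors in the example call are hypothetical, chosen only for illustration:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k slope coefficients
    (k excludes the intercept), so degrees of freedom are n - k - 1."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical example: R squared of 0.33 with 30 observations, 2 regressors.
print(round(adjusted_r2(0.33, n=30, k=2), 4))
```

Unlike R squared, this measure can fall when a variable that adds little explanatory power is included.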
8. Document the results • Meanings & interpretations • Predictions • Policy recommendations • Shortcomings of the study • What should the future studies do?
Asst 7: Due Thursday in class • #3, Page 85 • #5, Page 87
Thursday, February 3 • Summaries of the Sunday night lecture are due before 5 pm this Friday via an email attachment • If you asked her a question, write it down and give it to me today. • My home page has moved to http://mcis.marietta.edu/community/khorassj/ • Exam 1 • Next Tuesday • Covers Chapters 1 through 3 • Closed book and notes
Data source for Exam 1 • http://pearsonhighered.com/studenmund/ • Using Econometrics: A Practical Guide, 6/e Studenmund • Data Sets • Chapter 2 • Eviews • FINAID • Save the data set in a workfile on your computer and bring your computer to class.
Return and discuss Asst 6 # 4 , Page 59 # 6, Page 60 #7, Page 61
# 4, Page 59
• The residual sum of squares (RSS) is minimized with respect to the coefficient estimates (beta hats)
• R² = 1 – RSS/TSS or R² = ESS/TSS. If R² = 0, then RSS = TSS and ESS = 0: your line predicts the value of the dependent variable neither better nor worse than the mean of the dependent variable
• R² < 0 is possible only if the constant term (intercept) is omitted. That means that RSS > TSS: your line predicts the value of the dependent variable worse than the mean of the dependent variable • Graph
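The three cases above (R² close to 1, R² = 0, R² < 0) can be seen on a small made-up data set. A sketch in plain Python: the x and y values are hypothetical, and the regression through the origin stands in for "omitting the constant term".

```python
def r_squared(y, y_hat):
    """R squared computed as 1 - RSS/TSS."""
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - rss / tss

# Hypothetical data with a large intercept and a negative slope.
x = [1, 2, 3, 4, 5]
y = [10.0, 9.0, 8.5, 8.0, 7.0]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# OLS with an intercept.
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
r2_with = r_squared(y, [b0 + b1 * xi for xi in x])

# Predicting the mean for every observation: RSS equals TSS.
r2_mean = r_squared(y, [mean_y] * n)

# OLS forced through the origin (no constant term).
b_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
r2_origin = r_squared(y, [b_origin * xi for xi in x])

print(r2_with, r2_mean, r2_origin)
```

With these numbers the line through the origin fits far worse than the sample mean, so RSS > TSS and R² is negative.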
Positive • We prefer Model T because it includes more relevant variables • Note: A higher R² does not automatically mean that an equation is preferred.
# 6, Page 60
• Positive & Positive
• Maybe!
• One more hour of lecture = 1/25 = 0.04 (4% of classes), so one more hour of lecture will increase the grade by 0.04 × 1.74 = 0.07. One more hour of problem sets = 1/50 = 0.02 (2% of problem sets), so one more hour of problem sets increases the grade by 0.02 × 0.60 = 0.01. So going to class pays off more.
• 0.02 × 1.74 < 0.10 × 0.60, so doing problem sets pays off more. Since the units of variables can differ dramatically, coefficient size does not measure importance. If all variables are measured identically, then the size of the coefficient is a measure of importance.
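The payoff comparison is just coefficient times the share of the total that one extra hour represents. A small sketch using the coefficients (1.74 and 0.60) and shares quoted above; the function and variable names are ours:

```python
COEF_LECTURES = 1.74       # coefficient on lecture attendance, from the slide
COEF_PROBLEM_SETS = 0.60   # coefficient on problem sets, from the slide

def grade_gain(coef, share_per_hour):
    """Grade change from one more hour, given the share that hour represents."""
    return coef * share_per_hour

# First scenario: one hour is 1/25 of lectures and 1/50 of problem sets.
lecture_gain = grade_gain(COEF_LECTURES, 1 / 25)
problem_set_gain = grade_gain(COEF_PROBLEM_SETS, 1 / 50)
print(round(lecture_gain, 2), round(problem_set_gain, 2))

# Second scenario: shares of 0.02 and 0.10, as quoted in the last bullet.
lecture_gain_2 = grade_gain(COEF_LECTURES, 0.02)
problem_set_gain_2 = grade_gain(COEF_PROBLEM_SETS, 0.10)
print(lecture_gain_2 < problem_set_gain_2)
```

The ranking flips between the two scenarios even though the coefficients never change, which is why coefficient size alone does not measure importance.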
e. An R² of 0.33 means that a third of the variation of student grades around their mean can be explained by attendance at lectures and the completion of problem sets. This is about right! • The most likely variable to add is a measure of student ability (IQ score, ACT score). We’d expect both R² and adjusted R² (R̄²) to rise.
#7, Page 61
a. Equation B is better. X4 is a theoretically sound variable for a campus track, while X3 seems poorly specified because an especially hot or cold day would discourage fitness runners.
b. The coefficient of an independent variable tells us the impact of a one-unit increase in that variable on the dependent variable, holding constant the other explanatory variables in the equation. If we change the other variables in the equation, we’re holding different variables constant, and so the coefficient has a different meaning.
Collect and discuss Asst 7 • #3, Page 85 • #5, Page 87
#3, Page 85 • A male professor in this sample earns $817 more than a female professor, holding constant the other independent variables in the equation. • A key point here is not to change expectations based solely on this result. • R is not a dummy variable because it takes on more than two values. For each additional year in rank, the ith professor’s salary will rise by $406, holding constant the other independent variables in the equation. d. Yes, based on this sample. f. A measure of the quality of the professor
#5, Page 87 • Positive • The best equation includes the actual traffic data. But, since the traffic dummy variable is correlated with the actual traffic variable, it seems slightly better than Equation 3.5. • No! The theoretical underpinnings of the model are much more important. Of course, the higher adjusted R² (R̄²) is certainly a plus.