Week 4 Bivariate Regression, Least Squares and Hypothesis Testing

Lecture Outline • Method of Least Squares • Assumptions • Normality assumption • Goodness of fit • Confidence Intervals • Tests of Significance • alpha versus p IS 620 Spring 2006

Recall . . . • Regression curve as “line connecting the mean values” of y for a given x • No necessary reason for such a construction to be a line • Need more information to define a function

Method of Least Squares • Goal: describe the functional relationship between y and x • Assume linearity (in the parameters) • What is the best line to explain the relationship? • Intuition: the line that is “closest” to, or “fits best”, the data

Why sum of squares? • Sum of residuals may be zero even for a poorly fitting line, because positive and negative residuals cancel • Squaring emphasizes residuals that are far away from the regression line • Better describes spread of residuals
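
The point about cancelling residuals can be sketched in a few lines of Python. This is only an illustration: the data values and the helper name ols_fit are made up for the sketch and do not come from the lecture. It fits the least-squares line using the usual closed-form bivariate formulas, then compares it with a flat line through the mean of y. Both lines have residuals that sum to (numerically) zero, but only the sum of squares tells them apart.

```python
# Illustrative data (invented for this sketch, not taken from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def ols_fit(x, y):
    """Return (intercept, slope) of the bivariate least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = ols_fit(x, y)

# Residuals from the least-squares line.
ols_resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Residuals from a deliberately poor line: flat, through the mean of y.
y_bar = sum(y) / len(y)
flat_resid = [yi - y_bar for yi in y]

sum_ols = sum(ols_resid)
sum_flat = sum(flat_resid)
ss_ols = sum(e ** 2 for e in ols_resid)
ss_flat = sum(e ** 2 for e in flat_resid)

print(f"sum of residuals:  OLS {sum_ols:.6f}   flat line {sum_flat:.6f}")
print(f"sum of squares:    OLS {ss_ols:.4f}   flat line {ss_flat:.4f}")
```

The raw sums are both essentially zero, so they cannot rank the two lines; the sums of squares differ by orders of magnitude.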

Gauss-Markov Theorem • Least-squares method produces best linear unbiased estimators (BLUE) • “Best” means most efficient (minimum variance) among linear unbiased estimators • Provided the classical assumptions hold

Classical Assumptions • Focus on #3, #4, and #5 in Gujarati • Implications of violations for the estimators • Skim over #1, #2, #6 through #10

#3: Zero mean value of u_i • Residuals are randomly distributed around the regression line • Expected value is zero for any given observation of x • NOTE: Equivalent to assuming the model is fully specified

Violation of #3 • Estimated betas will be • Biased in general (if the disturbance mean is a nonzero constant, only the intercept is affected) • Inconsistent • Inefficient • May arise from • Systematic measurement error • Nonlinear relationships (Phillips curve)

#4: Homoscedasticity • The variance of the residuals is the same for all observations, irrespective of the value of x • “Equal variance” • NOTE: #3 and #4 together imply that the disturbances have mean zero and constant variance sigma squared (see “Normality Assumption”)

Violations of #5 (no autocorrelation among the disturbances) • Estimated betas will be • Unbiased • Consistent • Inefficient • May arise from • Time-series data • Spatial correlation

Other Assumptions (1) • Assumption 6: zero covariance between x_i and u_i • Violations are a cause of heteroscedasticity • Hence violates #4 • Assumption 9: model correctly specified • Violations may violate #1 (linearity) • May also violate #3: omitted variables?

Other Assumptions (2) • #7: n must be greater than number of parameters to be estimated • Key in multivariate regression • King, Keohane and Verba’s (1994) critique of small n designs

Normality Assumption • Distribution of disturbance is unknown • Necessary for hypothesis testing of independent variables (I.V.s) • Estimates are a function of u_i • Assumption of normality is necessary for inference • Equivalent to assuming model is completely specified

Normality Assumption • Central Limit Theorem: M&Ms • A linear transformation of a normal variable is itself normal • Simple distribution: fully described by two parameters (mu, sigma) • Small samples
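
One common reading of the Central Limit Theorem argument is that the disturbance collects many small, independent, omitted influences, so their sum is approximately normal whatever the shape of each piece. The sketch below illustrates that reading only; the number of influences, their uniform distribution, the seed and the name disturbance are all assumptions made for the example, not part of the lecture.

```python
import random

rng = random.Random(4)   # fixed seed so the run is repeatable
n_influences = 30        # assumed number of small omitted influences
n_draws = 5000           # number of simulated disturbances


def disturbance():
    """One simulated disturbance: a sum of small, independent, non-normal shocks."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n_influences))


draws = [disturbance() for _ in range(n_draws)]

mean_u = sum(draws) / n_draws
sd_u = (sum((u - mean_u) ** 2 for u in draws) / (n_draws - 1)) ** 0.5

# For a normal variable, about 95% of values lie within 1.96 s.d. of the mean.
within = sum(1 for u in draws if abs(u - mean_u) <= 1.96 * sd_u) / n_draws

print(f"mean {mean_u:.3f}, s.d. {sd_u:.3f}, share within 1.96 s.d. {within:.3f}")
```

Each individual shock is uniform, not normal, yet the share of simulated disturbances within 1.96 standard deviations of the mean should come out close to the normal benchmark of 0.95.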

Assumptions, Distilled • Linearity • DV is continuous, interval-level • Non-stochastic: No correlation between independent variables • Residuals are independently and identically distributed (iid) • Mean of zero • Constant variance

If so, . . . • Least-squares estimators are BLUE

Goodness of Fit • How “well” the least-squares regression line fits the observed data • Alternatively: how well the function describes the effect of x on y • How much of the observed variation in y have we explained?

Coefficient of determination • Commonly referred to as “r²” • Simply, the ratio of explained variation in y to the total variation in y

Components of variation • TSS: total sum of squares • ESS: explained sum of squares • RSS: residual sum of squares
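
The three components can be computed directly, which also makes the coefficient of determination concrete. The following is a minimal sketch on invented data (the values of x and y are assumptions for illustration). For a least-squares line with an intercept, TSS = ESS + RSS, so the ratio ESS / TSS equals 1 - RSS / TSS.

```python
# Illustrative data (invented for this sketch, not taken from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form bivariate least-squares estimates.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)               # total variation in y
ess = sum((fi - y_bar) ** 2 for fi in fitted)          # variation explained by the line
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # variation left in the residuals

r2 = ess / tss

print(f"TSS = {tss:.4f}, ESS = {ess:.4f}, RSS = {rss:.4f}")
print(f"ESS + RSS = {ess + rss:.4f}")
print(f"r-squared = {r2:.4f}")
```

On these data nearly all of the variation in y is explained, so r² is close to 1; noisier data would push it toward 0.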

Hypothesis Testing • Confidence Intervals • Tests of significance • ANOVA • Alpha versus p-value

Confidence Intervals • Two components • Estimate • Expression of uncertainty • Interpretation: • Gujarati, p. 121: “The probability of constructing an interval that contains Beta is 1-alpha” • NOT: “The probability that Beta is in the interval is 1-alpha”
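
The interpretation is a statement about the procedure over repeated samples, and a small simulation can make that visible. The sketch below is hypothetical throughout: the “true” parameters, the seed, the x values and the name slope_estimate are invented, and the disturbance standard deviation is treated as known so that a normal critical value can be used, which is a simplification. It draws many samples from the same model, builds a 95% interval for the slope each time, and counts how often the interval contains the true slope.

```python
import random
from statistics import NormalDist

rng = random.Random(620)                 # fixed seed so the run is repeatable
true_intercept, true_slope = 1.0, 2.0    # hypothetical "true" parameters
sigma = 1.0                              # disturbance s.d., treated as known (simplification)

x = [float(i) for i in range(1, 11)]     # x held fixed across repeated samples
x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

se_slope = sigma / sxx ** 0.5            # standard error of the slope when sigma is known
z = NormalDist().inv_cdf(0.975)          # two-sided 95% normal critical value


def slope_estimate(y):
    """Least-squares slope for the fixed x values."""
    y_bar = sum(y) / len(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


replications = 2000
covered = 0
for _ in range(replications):
    y = [true_intercept + true_slope * xi + rng.gauss(0.0, sigma) for xi in x]
    b = slope_estimate(y)
    # The interval is random; the true slope is fixed.
    if b - z * se_slope <= true_slope <= b + z * se_slope:
        covered += 1

coverage = covered / replications
print(f"share of intervals containing the true slope: {coverage:.3f}")
```

The true slope never moves; what varies from sample to sample is the interval. The coverage share should land near 0.95, which is what “1-alpha” refers to.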

C.I.s for regression • Depend upon our knowledge or assumption about the sampling distribution • Width of interval proportional to standard error of the estimators • Typically we assume • The t distribution for Betas • The chi-square distribution for variances • Because the true standard error is unknown
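
A t-based interval for the slope might be computed as follows. This is a sketch under stated assumptions: the data are invented, and the critical value 3.182 is the standard t-table entry for a two-sided 95% interval with 3 degrees of freedom (n - 2 with n = 5), hard-coded here to keep the example free of third-party libraries. The disturbance variance is estimated from the residuals, which is why the t distribution replaces the normal.

```python
# Illustrative data (invented for this sketch, not taken from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Closed-form bivariate least-squares estimates.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

# Estimate the disturbance variance from the residuals, with n - 2 degrees of freedom.
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
df = n - 2
sigma2_hat = rss / df

# Estimated standard error of the slope.
se_slope = (sigma2_hat / sxx) ** 0.5

# Two-sided 95% critical value of the t distribution with 3 d.f. (table value).
t_crit = 3.182

lower = slope - t_crit * se_slope
upper = slope + t_crit * se_slope

print(f"slope = {slope:.3f}, s.e. = {se_slope:.4f}")
print(f"95% C.I. for the slope: ({lower:.3f}, {upper:.3f})")
```

The interval's width is 2 × t_crit × se_slope, so it scales directly with the standard error, and the small sample makes the t critical value noticeably larger than the normal 1.96.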