  1. Econometrics Econ. 405, Chapter 5: The Simple Regression Model

  2. I. Understanding the definition of the Simple Linear Regression Model • There are two types of regression models (simple vs. multiple regression models). • The simple regression model can be used to study the relationship between any two variables. • The simple regression model is appropriate as an empirical tool in its own right. • It is also good practice for studying the multiple regression model (covered in the next chapters).

  3. The analysis of applied econometrics begins with the following: • Y and X are two variables representing some population. • We are interested in “explaining Y in terms of X”. • Or, similarly, in studying “how much Y varies with changes in X”.

  4. Recall: Explaining the variable y in terms of the variable x: y = β0 + β1x + u, where • β0 is the intercept and β1 is the slope parameter; • y is the dependent variable (also called the explained variable, response variable, predicted variable, or regressand); • u is the error term (also called the disturbance, or the unobservables); • x is the independent variable (also called the explanatory variable, control variable, predictor variable, or regressor).
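
As a concrete illustration (not part of the original slides), here is a minimal Python sketch that generates a sample from such a model; the parameter values β0 = 1 and β1 = 0.5 are made up for the example:

import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the example is reproducible
n = 100                          # sample size
beta0, beta1 = 1.0, 0.5          # hypothetical population intercept and slope
x = rng.uniform(0, 10, n)        # explanatory variable
u = rng.normal(0, 1, n)          # error term (disturbance), mean zero
y = beta0 + beta1 * x + u        # dependent variable generated by the model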

  5. The Simple Regression Model • In order to estimate the regression model, data is needed: a random sample of n observations {(xi, yi): i = 1, …, n}, where for the i-th observation (first, second, third, …, n-th) yi is the value of the dependent variable and xi is the value of the explanatory variable.

  6. Revisit the regression equations: the population model y = β0 + β1x + u and, for each sampled observation, yi = β0 + β1xi + ui.

  7. The Simple Regression Model • Fit a regression line through the data points as well as possible: the fitted regression line is ŷ = β̂0 + β̂1x.

  8. II. Ordinary Least Squares Technique • Regression analysis refers to techniques that allow you to estimate economic relationships using data. • There are mainly three techniques for estimating a regression function: Generalized Method of Moments (GMM), Maximum Likelihood (ML), and Ordinary Least Squares (OLS). • The method used most frequently is Ordinary Least Squares (OLS). • Although the OLS technique is popular and relatively simple, its application can become more complicated when you start adding more independent variables to your regression model.

  9. Justifying the Least Squares Principle • When estimating a Sample Regression Function (SRF), the most common econometric method is OLS. • The method of OLS uses the least squares principle to fit a pre-specified regression. • The least squares principle states that the SRF should be constructed (with its constant and slope) so that the sum of squared distances between the observed values of Y and the values estimated from your SRF is minimized (the smallest possible value).
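
Stated symbolically (a restatement of the principle above, using the β̂0, β̂1 notation from the earlier slides), OLS chooses the intercept and slope estimates to solve

\min_{\hat{\beta}_0,\ \hat{\beta}_1} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2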

  10. Reasons for OLS Popularity A) OLS is easier than the alternatives: although other methods can be used to estimate the same regression functions, they require more mathematical sophistication.

  11. B) OLS is sensible: by using squared residuals, you avoid positive and negative residuals canceling each other out, and you find a regression line that is as close as possible to the observed data points. How?

  12. The numerical properties of estimators obtained by the method of OLS: • The OLS estimators are expressed only in terms of the observable quantities (i.e., X and Y). Therefore, they can be easily computed. • They are point estimators; that is, given the sample, each estimator provides only a single (point, not interval) value of the relevant population parameter. • Once the OLS estimates are obtained from the sample data, the sample regression line can be easily drawn.
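
To make “expressed only in terms of the observable quantities” concrete: the standard closed-form solutions are β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β̂0 = ȳ − β̂1x̄. A short Python sketch, reusing the simulated x and y from the earlier snippet:

x_bar, y_bar = x.mean(), y.mean()
# OLS slope and intercept computed directly from the observed data
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar
print(beta0_hat, beta1_hat)      # should land near the true values 1.0 and 0.5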

  13. Recall: Algebraic properties of the OLS regression: • Fitted or predicted values: ŷi = β̂0 + β̂1xi. • Deviations from the regression line (= residuals): ûi = yi − ŷi. • Deviations from the regression line sum up to zero: Σ ûi = 0. • The correlation between the deviations and the regressor is zero: Σ xiûi = 0. • The sample averages of y and x lie on the regression line: ȳ = β̂0 + β̂1x̄.
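
These algebraic properties can be verified numerically; a minimal check, with all names continuing from the earlier snippets:

y_hat = beta0_hat + beta1_hat * x                  # fitted values
u_hat = y - y_hat                                  # residuals
print(np.isclose(u_hat.sum(), 0.0))                # residuals sum to zero
print(np.isclose(np.sum(x * u_hat), 0.0))          # residuals orthogonal to the regressor
print(np.isclose(y_bar, beta0_hat + beta1_hat * x_bar))  # sample means lie on the line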

  14. Accordingly: • The goodness of fit measures how the variation in the dependent variable is divided up, to show “How well does the explanatory variable explain the dependent variable?” • TSS = total sum of squares • ESS = explained sum of squares • RSS = residual sum of squares

  15. • Total sum of squares, TSS = Σ (yi − ȳ)², represents the total variation in the dependent variable. • Explained sum of squares, ESS = Σ (ŷi − ȳ)², represents the variation explained by the regression. • Residual sum of squares, RSS = Σ ûi², represents the variation not explained by the regression.

  16. Total variation = explained part + unexplained part: TSS = ESS + RSS. R-squared, R² = ESS/TSS = 1 − RSS/TSS, measures the fraction of the total variation that is explained by the regression.
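
A quick numerical check of the decomposition and of R², again reusing the simulated sample and fitted values from the snippets above:

tss = np.sum((y - y_bar) ** 2)       # total sum of squares
ess = np.sum((y_hat - y_bar) ** 2)   # explained sum of squares
rss = np.sum(u_hat ** 2)             # residual sum of squares
print(np.isclose(tss, ess + rss))    # TSS = ESS + RSS
print(ess / tss)                     # R-squared: fraction of variation explained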

  17. III. OLS Assumptions: the Classical Linear Regression Model (CLRM) • In regression analysis our objective is not only to obtain β̂1 and β̂2 but also to draw inferences about the true β1 and β2. For example, we would like to know how close β̂1 and β̂2 are to their counterparts in the population, or how close Ŷi is to the true E(Y | Xi). • The econometric model shows that Yi depends on both Xi and ui. The assumptions made about the Xi variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates. • When deciding whether OLS is the best technique for your estimation problem, some requirements must be met. • These are called the OLS assumptions, or the classical linear regression model (CLRM).

  18. First Assumption: the model is linear in the parameters, Yi = β1 + β2Xi + ui. Keep in mind that the regressand Y and the regressor X themselves may be nonlinear.

  19. Second Assumption: the values of X are fixed in repeated sampling. This means the regression analysis is conditional on the given values of the regressor(s) X.

  20. Third Assumption: zero mean value of the disturbance, E(ui | Xi) = 0. • Revisit property (3): each Y population corresponding to a given X is distributed around its mean value, with some Y values above the mean and some below it. The mean value of these deviations corresponding to any given X should be zero. Note that the assumption E(ui | Xi) = 0 implies that E(Yi | Xi) = β1 + β2Xi.

  21. Fourth Assumption: homoscedasticity, var(ui | Xi) = σ² (a constant). In this situation, the variation around the regression line (which is the line of the average relationship between Y and X) is the same across the X values; it neither increases nor decreases as X varies.

  22. When the conditional variance of the Y population varies with X, the situation is known as heteroscedasticity, or unequal spread (unequal variance). Symbolically, in this situation Assumption (4) can be written as var(ui | Xi) = σi², with, for example, var(u | X1) < var(u | X2) < … < var(u | Xi). Therefore, the likelihood is that the Y observations coming from the population with X = X1 will be closer to the PRF than those coming from the populations corresponding to X = X2, X = X3, and so on. In short, not all Y values corresponding to the various X’s will be equally reliable, reliability being judged by how closely or how distantly the Y values are distributed around their means, that is, around the points on the PRF.
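
A small simulation makes the contrast visible. In this sketch (illustrative only; the variance function σi = 0.2·xi is made up, and rng, beta0, beta1 continue from the earlier snippets), the spread of the disturbances grows with x, so Y values at large x are less reliable:

x_h = rng.uniform(1, 10, 500)
u_h = rng.normal(0.0, 0.2 * x_h)     # error std. dev. grows with x: heteroscedasticity
y_h = beta0 + beta1 * x_h + u_h
# disturbances are much more spread out at large x than at small x
print(u_h[x_h < 3].std(), u_h[x_h > 8].std())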

  23. Fifth Assumption: no autocorrelation (serial correlation) between the disturbances. According to the figures, when the disturbances (deviations) follow systematic patterns: • In Figure (a), the u’s are positively correlated: a positive u is followed by a positive u, or a negative u by a negative u. • In Figure (b), the u’s are negatively correlated: a positive u is followed by a negative u, and vice versa. In both cases there is auto- or serial correlation. • Figure (c) shows no systematic pattern in the u’s, indicating zero correlation. In other words, auto- or serial correlation is absent.
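
Since the original figures are not reproduced here, a minimal sketch can stand in for Figure (a): it generates disturbances with an assumed AR(1) structure ui = ρ·ui−1 + εi (the value ρ = 0.8 is made up), so that a positive u tends to be followed by a positive u:

rho = 0.8                                 # positive serial correlation, as in Figure (a)
eps = rng.normal(0, 1, 200)
u_ar = np.zeros(200)
for t in range(1, 200):
    u_ar[t] = rho * u_ar[t - 1] + eps[t]  # each disturbance inherits part of its predecessor
print(np.corrcoef(u_ar[1:], u_ar[:-1])[0, 1])   # strongly positive lag-1 correlation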

  24. Sixth Assumption: the disturbance u and the explanatory variable X are uncorrelated. • It is assumed that X and u (which may represent the influence of all the omitted variables) have separate (and additive) influences on Y. • But if X and u are correlated, it is not possible to assess their individual effects on Y. If X and u are positively correlated, X increases when u increases and decreases when u decreases. Similarly, if X and u are negatively correlated, X increases when u decreases and decreases when u increases. In either case, it is difficult to isolate the influence of X and of u on Y.
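
The difficulty can be demonstrated with a simulation. In this sketch (illustrative assumptions: X and u are built from a shared random component, making them positively correlated; rng, beta0, beta1 continue from the earlier snippets), the OLS slope drifts well away from the true β1 = 0.5 because the influences of X and u on Y are mixed together:

common = rng.normal(0, 1, 1000)              # component shared by X and u
x_c = rng.normal(0, 1, 1000) + common        # regressor correlated with the error
u_c = rng.normal(0, 1, 1000) + common
y_c = beta0 + beta1 * x_c + u_c
b1 = np.sum((x_c - x_c.mean()) * (y_c - y_c.mean())) / np.sum((x_c - x_c.mean()) ** 2)
print(b1)    # noticeably above the true 0.5: X is credited with part of u's effect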

  25. Seventh Assumption: the number of observations n must be greater than the number of parameters to be estimated. • Note: n > # of X’s, n > # of β’s.

  26. Eighth Assumption: variability in X values; the X values in a given sample must not all be the same.

  27. Ninth Assumption: the regression model is correctly specified. (1) What variables should be included in the model? (2) What is the functional form of the model? Is it linear in the parameters, in the variables, or in both? (3) What probabilistic assumptions are made about the Yi, the Xi, and the ui entering the model?

  28. Tenth Assumption: no perfect multicollinearity among the regressors. • This applies to models beyond the two-variable model, i.e., models containing several regressors (discussed in the next chapter).

  29. IV. Gauss–Markov Theorem: given the assumptions of the classical linear regression model, the OLS estimators have minimum variance in the class of linear unbiased estimators; that is, they are BLUE (best linear unbiased estimators).
