
OLS Regression

OLS Regression. What is it? Closely allied with correlation: interested in the strength of the linear relationship between two variables. One variable is specified as the dependent variable. The other variable is the independent (or explanatory) variable. Regression Model: Y = a + bx + e

ila-ross

Presentation Transcript


  1. OLS Regression • What is it? • Closely allied with correlation – interested in the strength of the linear relationship between two variables • One variable is specified as the dependent variable • The other variable is the independent (or explanatory) variable

  2. Regression Model • Y = a + bx + e • What is Y? • What is a? • What is b? • What is x? • What is e? • What is Y-hat?

  3. Elements of the Regression Line • a = Y intercept (what Y is predicted to equal when X = 0) • b = Slope (indicates the change in Y associated with a unit increase in X) • e = error (the difference between the observed Y and the predicted Y (Y hat))
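The three elements can be illustrated with a minimal Python sketch. The coefficient values and the single observation below are hypothetical, picked only to make the arithmetic easy to follow; they do not come from the slides.

```python
# Hypothetical coefficients and data, chosen only for illustration.
a = 2.0   # Y intercept: predicted Y when X = 0
b = 0.5   # slope: change in predicted Y per unit increase in X


def predict(x):
    """Return Y-hat, the value of Y predicted by the line a + b*x."""
    return a + b * x


x_obs, y_obs = 4, 5.0      # one observed (X, Y) pair
y_hat = predict(x_obs)     # predicted Y for that X
e = y_obs - y_hat          # error: observed Y minus predicted Y

print("Y-hat:", y_hat)
print("error:", e)
```

Here the error is taken as observed minus predicted, which is the sign convention implied by writing the model as Y = a + bx + e.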

  4. Regression • Has the ability to quantify precisely the relative importance of a variable • Has the ability to quantify how much variance is explained by one or more variables • Used more often than any other statistical technique

  5. The Regression Line • Y = a + bx + e • Y = sentence length • X = prior convictions • Each point represents the number of priors (X) and sentence length (Y) of a particular defendant • The regression line is the best fit line through the overall scatter of points

  6. X and Y are observed. We need to estimate a & b

  7. Calculus 101 • Least Squares Method and differential calculus • Differentiation is a very powerful tool that is used extensively in model estimation. • Practical examples of differentiation are usually in the form of minimization/optimization problems or rate of change problems.

  8. Calculus 101: Calculating the rate of change or slope of a line • For a straight line it is relatively simple to calculate the slope

  9. Calculating the rate of change or slope of a line for a curve is a bit harder • Differential Calculus: We have a curve describing the variable Y as some function of the variable X: y = x²

  10. It is possible to find a general expression involving the function f(x) that describes the slopes of the approximating sequence of secant lines: [f(x0 + h) – f(x0)] / h, where h = x1 – x0 (a small difference from the point of interest)

  11. Let's take a cost curve example: C(x) = x². What is the derivative at x = 3? • [C(3+h) – C(3)] / h = [(3+h)² – 3²] / h = [(9 + 6h + h²) – 9] / h = (6h + h²) / h = 6 + h, which approaches 6 as h approaches 0 • ∆y/∆x = 6
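The limit in the cost curve example can be checked numerically. The sketch below evaluates the secant slope at x = 3 for shrinking values of h; the function names are hypothetical helpers, not part of the presentation.

```python
def C(x):
    """Cost curve from the example: C(x) = x squared."""
    return x ** 2


def difference_quotient(f, x0, h):
    """Slope of the secant line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


# As h shrinks, the secant slope 6 + h closes in on the derivative, 6.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(C, 3, h))
```

Apart from floating-point rounding, each printed slope equals 6 + h, which matches the algebra in the example.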

  12. How does this relate to our Regression model that is a straight line?

  13. How do you draw a line when the line can be drawn in almost any direction? • The Method of Least Squares: draw the line that minimizes the sum of the squared distances between the observed points and the line (Σe²) • This is a minimization problem and therefore we can use differential calculus to estimate this line.
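To make the quantity being minimized concrete, here is a small Python sketch that computes Σe² for several candidate lines. The three data points are hypothetical and were invented for this illustration.

```python
# Hypothetical data, invented for illustration (not from the slides).
xs = [1, 2, 3]
ys = [2, 3, 5]


def sum_squared_errors(a, b):
    """Sum of squared vertical distances from the points to the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Three candidate lines; the last is the least squares line for these points.
candidates = [(0, 1), (0, 1.5), (1 / 3, 1.5)]
for a, b in candidates:
    print(f"a={a:.3f}, b={b:.3f}, sum of squared errors={sum_squared_errors(a, b):.4f}")
```

Each candidate line gets a single score, and the least squares line is the one with the smallest score among all possible choices of a and b.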

  14. X and Y are observed. We need to estimate a & b

  15. Least Squares Method

  16. Summing the squares of the deviations yields: • f(a, b) = 55 – 30a + 5a² – 78b + 20ab + 30b² • Calculate the first order partial derivatives of f(a, b) • fb = -78 + 20a + 60b and fa = -30 + 10a + 20b

  17. Set each partial derivative to zero: Manipulate fa: • 0 = -30 + 10a + 20b • 10a = 30 - 20b • a= 3 - 2b

  18. Substitute (3-2b) into fb: • 0 = -78 + 20a + 60b = -78 + 20(3-2b) + 60b • = -78 + 60 - 40b + 60b • = -18 + 20b • 20b = 18 • b = 0.9 • Slope = 0.9

  19. Substituting this value of b back into fa to obtain a: • 10a = 30 - 20(0.9) • 10a = 30 - 18 • 10a = 12 • a = 1.2 • Y-intercept = 1.2
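The worked solution can be verified with a short Python sketch. It treats the two first-order conditions as a pair of linear equations and solves them with Cramer's rule, a technique swapped in here in place of the substitution used on the slides. The variable names are illustrative.

```python
def f(a, b):
    """Sum of squared deviations from the worked example."""
    return 55 - 30 * a + 5 * a ** 2 - 78 * b + 20 * a * b + 30 * b ** 2


# Setting both partial derivatives to zero gives two linear equations:
#   fa = 0  ->  10a + 20b = 30
#   fb = 0  ->  20a + 60b = 78
a11, a12, c1 = 10, 20, 30
a21, a22, c2 = 20, 60, 78

det = a11 * a22 - a12 * a21          # determinant of the coefficient matrix
a = (c1 * a22 - a12 * c2) / det      # Cramer's rule for a
b = (a11 * c2 - c1 * a21) / det      # Cramer's rule for b

print("a =", a, "b =", b)
print("f at the solution:", f(a, b))
print("f slightly away from it:", f(a + 0.1, b), f(a, b + 0.1))
```

The solution agrees with a = 1.2 and b = 0.9, and moving either coefficient away from it increases f, as expected at a minimum.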

  20. Estimating the model (the easy way) Calculating the slope (b)

  21. Sum of Squares for X • Sum of Squares for Y • Sum of Products

  22. Calculating the Y-intercept (a) • Calculating the error term (e) • Y hat = predicted value of Y • e will be different for every observation. It is a measure of how much we are off in our prediction.
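The formulas for this shortcut are not preserved in the transcript, so the sketch below uses the standard textbook versions: slope b = (sum of products) / (sum of squares for X), and intercept a = mean of Y minus b times mean of X. The data are hypothetical, chosen so that the fitted line matches the a = 1.2 and b = 0.9 of the calculus example; the slides' own data points are not available.

```python
# Hypothetical data (an assumption): picked so the fit gives a = 1.2, b = 0.9.
xs = [0, 1, 2, 3, 4]
ys = [1, 2, 3, 5, 4]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares and sum of products, measured as deviations from the means.
ss_x = sum((x - x_bar) ** 2 for x in xs)
ss_y = sum((y - y_bar) ** 2 for y in ys)
sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b = sp / ss_x            # slope
a = y_bar - b * x_bar    # Y-intercept

# Predicted values and one error term per observation.
y_hats = [a + b * x for x in xs]
errors = [y - y_hat for y, y_hat in zip(ys, y_hats)]

print("slope b:", round(b, 4), "intercept a:", round(a, 4))
for x, y, y_hat, e in zip(xs, ys, y_hats, errors):
    print(f"x={x} y={y} y_hat={y_hat:.2f} e={e:+.2f}")
```

The error differs from one observation to the next, and with an intercept in the model the errors sum to zero up to rounding.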

  23. Regression is strongly related to Correlation
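One way to see the connection is that the slope and Pearson's r are built from the same three sums. The Python sketch below uses hypothetical data and the standard definitions r = SP / sqrt(SSx × SSy) and b = SP / SSx; none of the numbers come from the slides.

```python
import math

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ss_x = sum((x - x_bar) ** 2 for x in xs)
ss_y = sum((y - y_bar) ** 2 for y in ys)
sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b = sp / ss_x                      # regression slope
a = y_bar - b * x_bar              # regression intercept
r = sp / math.sqrt(ss_x * ss_y)    # Pearson correlation coefficient

# Sum of squared errors around the fitted line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

print("b:", round(b, 4), "r:", round(r, 4))
print("b from r:", round(r * math.sqrt(ss_y / ss_x), 4))
print("r squared:", round(r ** 2, 4), "explained share:", round(1 - sse / ss_y, 4))
```

The slope is r rescaled by the ratio of the spreads of Y and X, and r squared equals the share of the variance in Y explained by the line.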
