
Linear Regression Analysis



Presentation Transcript


  1. Linear Regression Analysis • Additional Reference: • Applied Linear Regression Models – Neter, Kutner, Nachtsheim, Wasserman • The lecture notes of Dr. Thomas Wehrly and Dr. Fred Dahm aided in the preparation of the chapter 12 material

  2. Linear Regression Analysis (ch. 12) Observe a response Y and one or more predictors x, and formulate a model that relates the mean response E(Y) to x. • Y – Dependent Variable • x – Independent Variable

  3. Deterministic Model • Y = f(x); once we know the value of x, the value of Y is completely determined • Simplest (Straight-Line) Model: Y = β₀ + β₁x • β₁ = Slope of the Line • β₀ = Y-intercept of the Line

  4. Probabilistic Model • Y = f(x) + ε; the value of Y is a R.V. • Model for Simple Linear Regression: Yᵢ = β₀ + β₁xᵢ + εᵢ, i = 1,…,n • Y₁,…,Yₙ – Observed Values of the Response • x₁,…,xₙ – Observed Values of the Predictor • β₀, β₁ – Unknown Parameters to be Estimated from the Data • ε₁,…,εₙ – Unknown Random Error Terms – Usually iid N(0, σ²) Random Variables
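As an illustrative sketch, the simple linear regression model can be simulated directly; every parameter value below is hypothetical, chosen only for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters (for illustration only)
beta0, beta1, sigma = 2.0, 0.5, 1.0

x = np.linspace(0, 10, 20)                  # fixed predictor values x1,...,xn
eps = rng.normal(0.0, sigma, size=x.size)   # iid N(0, sigma^2) error terms
y = beta0 + beta1 * x + eps                 # observed responses Y1,...,Yn
```

Each Yᵢ scatters above or below the line β₀ + β₁xᵢ by its error term εᵢ, exactly as the model states.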

  5. Interpretation of Model For each value of x, the observed Y will fall above or below the line Y = β₀ + β₁x according to the error term ε. For each fixed x, Y ~ N(β₀ + β₁x, σ²).

  6. Questions • How do we estimate β₀, β₁, and σ²? • Does the proposed model fit the data well? • Are the assumptions satisfied?

  7. Plotting the Data A scatter plot of the data is a useful first step for checking whether a linear relationship is plausible.
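A minimal plotting sketch with matplotlib; the (x, y) pairs here are invented purely for illustration, not data from the examples that follow:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical (x, y) pairs, made up for this sketch
x = np.array([3, 8, 10, 11, 13, 16, 27, 30])
y = np.array([4, 7, 8, 9, 10, 12, 16, 26])

fig, ax = plt.subplots()
ax.scatter(x, y)                 # one point per (x_i, y_i) pair
ax.set_xlabel("x (predictor)")
ax.set_ylabel("y (response)")
ax.set_title("Scatter plot of the data")
fig.savefig("scatter.png")
```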

  8. Example (12.4) A study to assess the capability of subsurface flow wetland systems to remove biochemical oxygen demand (BOD) and various other chemical constituents resulted in the following scatter plot of the data, where x = BOD mass loading and y = BOD mass removal. Does the plot suggest a linear relationship?

  9. Example (12.5) An experiment conducted to investigate how the stretchability of mozzarella cheese varies with temperature resulted in the following scatter plot, where x = temperature and y = % elongation at failure. Does the scatter plot suggest a linear relationship?

  10. Estimating β₀ and β₁ Consider an arbitrary line y = b0 + b1x drawn through a scatter plot. We want the line to be as close to the points in the scatter plot as possible. The vertical distance from (x, y) to the corresponding point on the line (x, b0 + b1x) is y − (b0 + b1x).

  11. Possible Estimation Criteria • Eyeball Method • L1 Estimation – Choose β₀, β₁ to minimize Σ|yᵢ − β₀ − β₁xᵢ| • Least Squares Estimation – Choose β₀, β₁ to minimize Σ(yᵢ − β₀ − β₁xᵢ)² • *We use least squares estimation in practice since the other options are difficult to manipulate mathematically*

  12. Least Squares Estimation Take derivatives with respect to b0 and b1, and set equal to zero. This results in the “normal equations” (based on right angles – not the Normal distribution)
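Written out, setting the two partial derivatives of Σ(yᵢ − b₀ − b₁xᵢ)² to zero gives the standard normal equations:

```latex
\begin{aligned}
nb_0 + b_1 \sum_{i=1}^{n} x_i &= \sum_{i=1}^{n} y_i \\
b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2 &= \sum_{i=1}^{n} x_i y_i
\end{aligned}
```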

  13. Formulas for Least Squares Estimates Solving for b0 and b1 results in the L.S. estimates
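The standard closed-form solution is b₁ = Sxy/Sxx and b₀ = ȳ − b₁x̄; a minimal sketch of computing it (the function name is my own, not from the text):

```python
import numpy as np

def least_squares(x, y):
    """Closed-form least squares estimates (b0, b1) for y = b0 + b1*x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar, ybar = x.mean(), y.mean()
    Sxy = np.sum((x - xbar) * (y - ybar))   # sum of cross-deviations
    Sxx = np.sum((x - xbar) ** 2)           # sum of squared x-deviations
    b1 = Sxy / Sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Points lying exactly on y = 2 + 3x recover that line exactly
b0, b1 = least_squares([0, 1, 2, 3], [2, 5, 8, 11])
```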

  14. Example (12.12) Refer to the previous example (12.4). Obtain the expression for the least squares line.

  15. Estimating σ² Residual = Observed – Predicted. Recall the definition of the sample variance.

  16. Estimating σ² Cont’d • The minimum value of the sum of squared deviations, attained at the least squares estimates b₀, b₁, is • D = Σ(yᵢ − b₀ − b₁xᵢ)² = Σ(yᵢ − ŷᵢ)² = SSE • Divide the SSE by its degrees of freedom (n − 2) to estimate σ²
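A sketch of the SSE and σ² computation on made-up data; the fitted line b₀ = b₁ = 0.9 is the least squares fit of these four hypothetical points:

```python
import numpy as np

def sse_and_s2(x, y, b0, b1):
    """Return SSE = sum of squared residuals and s^2 = SSE/(n-2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    residuals = y - (b0 + b1 * x)        # observed - predicted
    sse = float(np.sum(residuals ** 2))
    return sse, sse / (len(y) - 2)

# Hypothetical data; y = 0.9 + 0.9x is their least squares line
sse, s2 = sse_and_s2([0, 1, 2, 3], [1, 2, 2, 4], b0=0.9, b1=0.9)
```

Dividing by n − 2 rather than n reflects the two degrees of freedom spent estimating b₀ and b₁.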

  17. Example (12.12) Cont’d Predict the value of BOD mass removal when BOD loading is 35. Calculate the residual. Calculate the SSE and a point estimate of σ².
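The actual numbers require the Example 12.4 data, which are not reproduced here; with hypothetical estimates the calculation pattern looks like:

```python
# Hypothetical least squares estimates; the real b0, b1 come from the
# Example 12.4 data, which are not shown in this transcript
b0, b1 = -2.0, 0.8

x_new = 35.0                  # BOD mass loading
y_hat = b0 + b1 * x_new       # predicted BOD mass removal
y_obs = 25.0                  # hypothetical observed removal at x = 35
residual = y_obs - y_hat      # residual = observed - predicted
```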
