
Chapter 11



Presentation Transcript


  1. Chapter 11 Correlation and Regression

  2. Outline • 11-1 Introduction • 11-2 Scatter Plots • 11-3 Correlation • 11-4 Regression

  3. Outline (continued) • 11-5 Coefficient of Determination and Standard Error of Estimate

  4. Objectives • Draw a scatter plot for a set of ordered pairs. • Find the correlation coefficient. • Test the hypothesis H0: ρ = 0. • Find the equation of the regression line.

  5. Objectives (continued) • Find the coefficient of determination. • Find the standard error of estimate. • Find a prediction interval.

  6. 11-2 Scatter Plots • A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.

  7. 11-2 Scatter Plots - Example • Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects. • The data are given on the next slide.

  8. 11-2 Scatter Plots - Example

  9. 11-2 Scatter Plots - Example: Positive Relationship

  10. 11-2 Scatter Plots - Other Examples: Negative Relationship

  11. 11-2 Scatter Plots - Other Examples: No Relationship (scatter plot of y against x)

  12. 11-3 Correlation Coefficient • The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables. • Sample correlation coefficient: r. • Population correlation coefficient: ρ.

  13. 11-3 Range of Values for the Correlation Coefficient • –1: strong negative relationship • 0: no linear relationship • +1: strong positive relationship

  14. 11-3 Formula for the Correlation Coefficient r • $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$, where n is the number of data pairs.
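The formula for r can be sketched in a few lines of Python. The age and blood pressure table did not survive in this transcript, so this sketch uses the copy-machine data listed later in the deck (machines A through F). The function name `correlation_coefficient` is illustrative, not something the slides define.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Copy-machine data from the prediction interval example:
# age in years (x) and monthly maintenance cost in dollars (y).
ages = [1, 2, 3, 4, 4, 6]
costs = [62, 78, 70, 90, 93, 103]

r = correlation_coefficient(ages, costs)
print(f"r = {r:.3f}")
```

A value of r close to +1 here is consistent with the deck's later statement that machine age and maintenance cost are significantly related.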

  15. 11-3 Correlation Coefficient - Example (Verify) • Compute the correlation coefficient for the age and blood pressure data.

  16. 11-3 The Significance of the Correlation Coefficient • The population correlation coefficient, ρ, is the correlation between all possible pairs of data values (x, y) taken from a population.

  17. 11-3 The Significance of the Correlation Coefficient • H0: ρ = 0 H1: ρ ≠ 0 • This tests for a significant correlation between the variables in the population.

  18. 11-3 Formula for the t Test for the Correlation Coefficient • $t = r\sqrt{\dfrac{n-2}{1-r^2}}$ with d.f. = n – 2

  19. 11-3 Example • Test the significance of the correlation coefficient for the age and blood pressure data. Use α = 0.05 and r = 0.897. • Step 1: State the hypotheses. • H0: ρ = 0 H1: ρ ≠ 0

  20. 11-3 Example • Step 2: Find the critical values. Since α = 0.05 and there are 6 – 2 = 4 degrees of freedom, the critical values are t = +2.776 and t = –2.776. • Step 3: Compute the test value. t = 4.059 (verify).

  21. 11-3 Example • Step 4: Make the decision. Reject the null hypothesis, since the test value falls in the critical region (4.059 > 2.776). • Step 5: Summarize the results. There is a significant relationship between the variables of age and blood pressure.
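The "verify" in Step 3 can be checked with a short Python sketch. It uses r = 0.897, n = 6, and the critical value 2.776 exactly as given on the slides. The function name `t_statistic` is an illustrative choice, and the critical value is hard-coded from the slide rather than looked up from a t distribution.

```python
import math

def t_statistic(r, n):
    """Test value for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.897, 6
critical_t = 2.776  # two-tailed, alpha = 0.05, d.f. = 4 (value from the slide)

t = t_statistic(r, n)
reject_null = abs(t) > critical_t  # two-tailed decision rule

print(f"t = {t:.3f}")
print("Reject H0" if reject_null else "Do not reject H0")
```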

  22. 11-4 Regression • The scatter plot for the age and blood pressure data displays a linear pattern. • We can model this relationship with a straight line. • This line is called the line of best fit or the regression line. • The equation of the line is y′ = a + bx.

  23. 11-4 Formulas for the Regression Line y′ = a + bx • $a = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2}$ • $b = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$ • Here a is the y intercept and b is the slope of the line.
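As a sketch of how the two formulas are applied, the Python below computes a and b for the copy-machine data listed later in the deck. That data is used because the age and blood pressure table is not present in this transcript. The function name `regression_line` is illustrative.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2  # shared by both formulas
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b

# Copy-machine data: age in years (x), monthly maintenance cost in dollars (y).
ages = [1, 2, 3, 4, 4, 6]
costs = [62, 78, 70, 90, 93, 103]

a, b = regression_line(ages, costs)
print(f"y' = {a:.2f} + {b:.2f}x")
```

Rounded to two decimals, the result agrees with the regression equation y′ = 55.57 + 8.13x that the deck quotes for this data.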

  24. 11-4 Example • Find the equation of the regression line for the age and blood pressure data. • Substituting into the formulas gives a = 81.048 and b = 0.964 (verify). • Hence, y′ = 81.048 + 0.964x. • Note that a represents the intercept and b the slope of the line.

  25. 11-4 Example • y′ = 81.048 + 0.964x

  26. 11-4 Using the Regression Line to Predict • The regression line can be used to predict a value for the dependent variable (y) for a given value of the independent variable (x). • Caution: Use x values within the experimental region when predicting y values.

  27. 11-4 Example • Use the equation of the regression line to predict the blood pressure for a person who is 50 years old. • Since y′ = 81.048 + 0.964x, then y′ = 81.048 + 0.964(50) = 129.248 ≈ 129. • Note that the value of 50 is within the range of x values.
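The prediction and the caution about the experimental region can be sketched together in Python. The coefficients come from the slide. The bounds `x_min = 40` and `x_max = 75` are placeholder assumptions, since the observed ages are not in this transcript, and should be replaced with the smallest and largest x in the actual data.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y' = a + b*x, refusing to extrapolate outside [x_min, x_max]."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the experimental region [{x_min}, {x_max}]"
        )
    return a + b * x

a, b = 81.048, 0.964   # regression coefficients from the slide
x_min, x_max = 40, 75  # ASSUMED age range, not taken from the source

pressure = predict(a, b, 50, x_min, x_max)
print(f"Predicted blood pressure at age 50: {pressure:.3f}")
```

Raising an error is one design choice. A warning would also work if extrapolated values are still wanted for exploration.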

  28. 11-5 Coefficient of Determination and Standard Error of Estimate • The coefficient of determination, denoted by r², is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.

  29. 11-5 Coefficient of Determination and Standard Error of Estimate • r² is the square of the correlation coefficient. • The coefficient of nondetermination is (1 – r²). • Example: If r = 0.90, then r² = 0.81.

  30. 11-5 Coefficient of Determination and Standard Error of Estimate • The standard error of estimate, denoted by s_est, is the standard deviation of the observed y values about the predicted y′ values. • The formula is given on the next slide.

  31. 11-5 Formula for the Standard Error of Estimate • $s_{est} = \sqrt{\dfrac{\sum (y - y')^2}{n - 2}}$ or $s_{est} = \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$

  32. 11-5 Standard Error of Estimate - Example • From the regression equation, y′ = 55.57 + 8.13x and n = 6, find s_est. • Here, a = 55.57, b = 8.13, and n = 6. • Substituting into the formula gives s_est = 6.48 (verify).
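Both forms of the s_est formula can be checked against the copy-machine data listed later in the deck. The Python sketch below uses the rounded coefficients a = 55.57 and b = 8.13 from the slide. The function names are illustrative. By this calculation, the shortcut form reproduces the slide's 6.48, while the residual form gives about 6.51. The likely cause is that the shortcut is exact only for unrounded a and b, so it is more sensitive to the rounding.

```python
import math

def s_est_from_residuals(xs, ys, a, b):
    """Standard error of estimate from the residuals y - y'."""
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

def s_est_shortcut(xs, ys, a, b):
    """Shortcut form: sqrt((sum y^2 - a*sum y - b*sum xy) / (n - 2))."""
    n = len(xs)
    sum_y = sum(ys)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

# Copy-machine data: age in years (x), monthly maintenance cost in dollars (y).
ages = [1, 2, 3, 4, 4, 6]
costs = [62, 78, 70, 90, 93, 103]
a, b = 55.57, 8.13  # rounded coefficients as given on the slide

s_residual = s_est_from_residuals(ages, costs, a, b)
s_shortcut = s_est_shortcut(ages, costs, a, b)
print(f"residual form: {s_residual:.2f}, shortcut form: {s_shortcut:.2f}")
```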

  33. 11-5 Prediction Interval • A prediction interval is an interval constructed about a predicted y value, y′, for a specified x value.

  34. 11-5 Prediction Interval • For a given α value, we can state with (1 – α)100% confidence that the interval will contain the actual mean of the y values that correspond to the given value of x.

  35. 11-5 Formula for the Prediction Interval about a Value y′ • $y' - t_{\alpha/2}\, s_{est}\sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} < y < y' + t_{\alpha/2}\, s_{est}\sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}}$ with d.f. = n – 2

  36. 11-5 Prediction Interval - Example • A researcher collects the data shown on the next slide and determines that there is a significant relationship between the age of a copy machine and its monthly maintenance cost. The regression equation is y′ = 55.57 + 8.13x. Find the 95% prediction interval for the monthly maintenance cost of a machine that is 3 years old.

  37. 11-5 Prediction Interval - Example • Machine, age x (years), monthly cost y • A, 1, $62 • B, 2, $78 • C, 3, $70 • D, 4, $90 • E, 4, $93 • F, 6, $103

  38. 11-5 Prediction Interval - Example • Step 1: Find Σx, Σx², and X̄. Σx = 20, Σx² = 82, X̄ = 20/6 ≈ 3.3. • Step 2: Find y′ for x = 3. y′ = 55.57 + 8.13(3) = 79.96. • Step 3: Find s_est. s_est = 6.48, as shown in the previous example.

  39. 11-5 Prediction Interval - Example • Step 4: Substitute in the formula and solve. For 95% confidence with d.f. = 6 – 2 = 4, t_{α/2} = 2.776, which gives 60.53 < y < 99.39 (verify). • Hence, one can be 95% confident that the interval 60.53 < y < 99.39 contains the actual value of y.
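Step 4 can be verified with a short Python sketch that plugs the slide values (y′ = 79.96, s_est = 6.48, t = 2.776) into the prediction interval formula. The function name `prediction_interval` is illustrative. Without intermediate rounding, the endpoints come out within a few hundredths of the slide's 60.53 and 99.39. The small gap is likely explained by rounding of intermediate values in the hand calculation.

```python
import math

def prediction_interval(xs, x_new, y_pred, s_est, t_crit):
    """Prediction interval about y_pred for a new observation at x_new."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    x_bar = sum_x / n
    spread = math.sqrt(
        1 + 1 / n + n * (x_new - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    margin = t_crit * s_est * spread
    return y_pred - margin, y_pred + margin

ages = [1, 2, 3, 4, 4, 6]    # machine ages from the data slide
y_pred = 55.57 + 8.13 * 3    # predicted cost for a 3-year-old machine
low, high = prediction_interval(ages, 3, y_pred, s_est=6.48, t_crit=2.776)
print(f"{low:.2f} < y < {high:.2f}")
```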
