
Understanding Regression and Correlation: Scatter Diagram, Linear Correlation, and Prediction

This chapter provides an overview of regression and correlation, including scatter diagrams, linear correlation, the least squares line, prediction, correlation coefficient, coefficient of determination, and testing the correlation coefficient.

langlin

Presentation Transcript


  1. Understandable Statistics, Seventh Edition, by Brase and Brase. Prepared by: Lynn Smith, Gloucester County College. Chapter Ten: Regression and Correlation

  2. Scatter Diagram a plot of paired data to determine or show a relationship between two variables

  3. Paired Data

  4. Scatter Diagram

  5. Linear Correlation The general trend of the points seems to follow a straight line segment.

  6. Linear Correlation

  7. Non-Linear Correlation

  8. No Linear Correlation

  9. High Linear Correlation Points lie close to a straight line.

  10. High Linear Correlation

  11. Moderate Linear Correlation

  12. Low Linear Correlation

  13. Perfect Linear Correlation

  14. Questions Arising • Can we find a relationship between x and y? • How strong is the relationship?

  15. When there appears to be a linear relationship between x and y: attempt to “fit” a line to the scatter diagram.

  16. When using x values to predict y values: • Call x the explanatory variable • Call y the response variable

  17. The Least Squares Line The sum of the squares of the vertical distances from the points to the line is made as small as possible.

  18. Least Squares Criterion The sum of the squares of the vertical distances from the points to the line is made as small as possible.
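
The criterion can be illustrated numerically. The sketch below uses a small made-up data set (the slides do not list the paired data) and a hypothetical helper, sum_squared_vertical_distances, to compare the least squares line for that data, y = 2.2 + 0.6x, with a nearby alternative line.

```python
# Hypothetical paired data, invented for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_squared_vertical_distances(a, b, xs, ys):
    """Sum of squared vertical distances from each point to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# y = 2.2 + 0.6x is the least squares line for this made-up data.
best = sum_squared_vertical_distances(2.2, 0.6, xs, ys)
# A nearby alternative line, chosen arbitrarily for comparison.
other = sum_squared_vertical_distances(2.0, 0.7, xs, ys)

print(best < other)  # the least squares line gives the smaller sum
```

Any other choice of intercept and slope would also give a sum at least as large as the one for the least squares line.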

  19. Equation of the Least Squares Line y = a + bx a = the y-intercept b = the slope
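
A minimal sketch of how a and b might be computed from paired data, assuming the standard least squares formulas (slope = sum of cross-deviations divided by sum of squared x-deviations; intercept = mean of y minus slope times mean of x). These formulas are not shown on the slides, and the data and the function name least_squares_line are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b


# Hypothetical paired data, invented for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y = {a:.1f} + {b:.1f}x")
```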

  20. The equation of the least squares line is: y = a + bx y = 2.8 + 1.7x

  21. The following point will always be on the least squares line: (x̄, ȳ), the average of the x’s and the average of the y’s
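
The point in question is the pair of sample means, (x̄, ȳ). One way to see why it always lies on the line, assuming the standard least squares formula for the intercept (not stated on these slides), is to substitute x = x̄ into the equation:

```latex
a = \bar{y} - b\,\bar{x}
\quad\Longrightarrow\quad
a + b\,\bar{x} = \bar{y} - b\,\bar{x} + b\,\bar{x} = \bar{y}
```

With the values used in this chapter, 2.8 + 1.7(8.3) = 16.91, which matches the average of the y’s, 16.9, up to rounding.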

  22. Graphing the least squares line • Using two values in the range of x, compute two corresponding y values. • Plot these points. • Join the points with a straight line.

  23. Graphing y = 2.8 + 1.7x • Use (8.3, 16.9) (the average of the x’s, the average of the y’s) • Try x = 5. Compute y: y = 2.8 + 1.7(5) = 11.3

  24. Sketching the Line Using the Points (8.3, 16.9) and (5, 11.3)

  25. Using the Equation of the Least Squares Line to Make Predictions • Choose a value for x (within the range of x values). • Substitute the selected x in the least squares equation. • Determine corresponding value of y.

  26. Predict the time to make a trip of 14 miles • Equation of least squares line: y = 2.8 + 1.7x • Substitute x = 14: y = 2.8 + 1.7 (14) y = 26.6 • According to the least squares equation, a trip of 14 miles would take 26.6 minutes.
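
The substitution step can be written as a short function. This is only a sketch: the coefficients 2.8 and 1.7 come from the slides, while the function name predict_minutes is made up here.

```python
def predict_minutes(miles, a=2.8, b=1.7):
    """Predicted trip time from the least squares line y = a + b*x."""
    return a + b * miles


# Trip of 14 miles, as in the slide; rounded to one decimal place.
print(round(predict_minutes(14), 1))
```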

  27. Interpolation Using the least squares line to predict y values for x values that fall between the points in the scatter diagram

  28. Extrapolation Prediction beyond the range of observations

  29. The least squares line and prediction, yp: • y = a + bx • y = 2.8 + 1.7x • For x = 8, yp = 2.8 + 1.7(8) = 16.4

  30. Try not to use the least squares line to predict y values for x values beyond the data extremes of the sample x distribution.
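
One possible way to guard against accidental extrapolation is to check x against the smallest and largest observed x values before predicting. The range used below (2 to 20 miles) is hypothetical, since the slides do not list the data, and predict_within_range is an illustrative name.

```python
def predict_within_range(x, x_min, x_max, a=2.8, b=1.7):
    """Predict y = a + b*x, refusing x values outside the observed range."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]"
        )
    return a + b * x


# Hypothetical observed range of x (miles); not taken from the slides.
X_MIN, X_MAX = 2, 20

inside = predict_within_range(14, X_MIN, X_MAX)  # interpolation: allowed

try:
    predict_within_range(50, X_MIN, X_MAX)       # extrapolation: refused
    refused = False
except ValueError:
    refused = True

print(round(inside, 1), refused)
```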

  31. The Linear Correlation Coefficient, r • A measurement of the strength of the linear association between two variables • Also called the Pearson product-moment correlation coefficient

  32. Positive Correlation [scatter diagram: y against x]

  33. Negative Correlation [scatter diagram: y against x]

  34. Little or No Linear Correlation [scatter diagram: y against x]

  35. What type of correlation is expected? • Height and weight • Mileage on tires and remaining tread • IQ and height • Years of driving experience and insurance rates

  36. Linear correlation coefficient –1 ≤ r ≤ +1

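  37. If r = 0, the scatter diagram might look like: [scatter diagram: y against x]

  38. If r = +1, all points lie on the least squares line [scatter diagram: y against x]

  39. If r = –1, all points lie on the least squares line [scatter diagram: y against x]

  40. –1 < r < 0 [scatter diagram: y against x]

  41. 0 < r < 1 [scatter diagram: y against x]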

  42. The Correlation Coefficient, r = 0.9753643, so r ≈ 0.98
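
The slides give the value of r without the computation. A minimal sketch of the Pearson product-moment formula follows, using made-up data (the chapter's miles and minutes data are not listed here) and a hypothetical function name, correlation_coefficient.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson product-moment correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired data, invented for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_coefficient(xs, ys)

# Points that lie exactly on a rising line should give r = +1 (up to rounding).
r_perfect = correlation_coefficient(xs, [2.8 + 1.7 * x for x in xs])

print(round(r, 4), round(r_perfect, 4))
```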

  43. A statistic related to r: the coefficient of determination = r²

  44. Coefficient of Determination a measure of the proportion of the variation in y that is explained by the regression line using x as the predicting variable

  45. Interpretation of r² • If r = 0.9753643, then what percent of the variation in minutes (y) is explained by the linear relationship with x, miles traveled? • What percent is explained by other causes?

  46. Interpretation of r² • If r = 0.9753643, then r² = 0.9513355 • Approximately 95 percent of the variation in minutes (y) is explained by the linear relationship with x, miles traveled. • Less than five percent is explained by other causes.
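
The arithmetic behind these percentages can be checked in a few lines. The value of r is taken from the slides; the variable names are illustrative.

```python
r = 0.9753643                       # correlation coefficient from the slides
r_squared = r ** 2                  # coefficient of determination

explained_pct = 100 * r_squared     # variation in y explained by the line
unexplained_pct = 100 - explained_pct

print(f"r^2 = {r_squared:.7f}")
print(f"explained: {explained_pct:.1f}%, unexplained: {unexplained_pct:.1f}%")
```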

  47. Warning • The correlation coefficient (r) measures the strength of the linear relationship between two variables. • Just because two variables are related does not imply that there is a cause-and-effect relationship between them.

  48. Testing the Correlation Coefficient Determining whether a value of the sample correlation coefficient, r, is far enough from zero to indicate correlation in the population.

  49. The Population Correlation Coefficient ρ = Greek letter “rho”

  50. Hypotheses to Test Rho • Assume that both variables x and y are normally distributed. • To test if the (x, y) values are correlated in the population, set up the null hypothesis that they are not correlated: H0: x and y are not correlated, so ρ = 0.
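
One common way to test the null hypothesis that the population correlation is zero (not necessarily the procedure used on the slides that follow) is the t statistic t = r·sqrt(n − 2) / sqrt(1 − r²), compared against a Student's t distribution with n − 2 degrees of freedom. The sketch below assumes a hypothetical sample size of n = 7, since the slides do not state it; t_statistic is an illustrative name.

```python
import math


def t_statistic(r, n):
    """t statistic for testing zero population correlation, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 data pairs")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.9753643   # sample correlation coefficient from the slides
n = 7           # hypothetical sample size (not given in the slides)

t = t_statistic(r, n)
print(f"t = {t:.2f} with {n - 2} degrees of freedom")
```

A value of t far from zero, judged against the critical value for the chosen significance level, would lead to rejecting H0.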
