
Chapter 15 Correlation and Regression

Chapter 15 Correlation and Regression. PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J. Gravetter and Larry B. Wallnau. Chapter 15 Learning Outcomes. Concepts to review. Sum of squares (SS) (Chapter 4) Computational formula


Presentation Transcript


  1. Chapter 15: Correlation and Regression • PowerPoint Lecture Slides • Essentials of Statistics for the Behavioral Sciences, Seventh Edition, by Frederick J. Gravetter and Larry B. Wallnau

  2. Chapter 15 Learning Outcomes

  3. Concepts to review • Sum of squares (SS) (Chapter 4) • Computational formula • Definitional formula • z-Scores (Chapter 5) • Hypothesis testing (Chapter 8)

  4. 15.1 Introduction to Correlation and Regression • A correlation measures and describes a relationship between two variables. • Characteristics of relationships • Direction (negative or positive) • Form (linear is most common) • Strength

  5. Figure 15.1 Scatterplot for correlational data

  6. Figure 15.2 Examples of positive and negative relationships

  7. Figure 15.3 Examples of different values for linear relationships

  8. 15.2 The Pearson Correlation • Measures the degree and the direction of the linear relationship between two variables • Perfect linear relationship • Every change in X has a corresponding change in Y • Correlation will be –1.00 or +1.00

  9. Sum of Products (SP) • Similar to SS (sum of squared deviations) • Measures the amount of covariability between two variables

  10. SP – Computational formula • Definitional formula emphasizes SP as the sum of the products of two deviation scores • Computational formula results in easier calculations
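As a rough illustration of the two SP formulas, the Python sketch below assumes the standard definitional form SP = Σ(X − M_X)(Y − M_Y) and the computational form SP = ΣXY − (ΣX)(ΣY)/n. The function names and the four data pairs are made up for this example and do not come from the slides.

```python
def sp_definitional(xs, ys):
    """Sum of products of deviations: sum((X - M_X) * (Y - M_Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def sp_computational(xs, ys):
    """Same quantity without deviations: sum(XY) - sum(X) * sum(Y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


# Hypothetical scores, not taken from the slides.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
print(sp_definitional(xs, ys))   # 6.0
print(sp_computational(xs, ys))  # 6.0
```

Both routes give the same SP. The computational form avoids deviation scores, which is convenient when the means are not whole numbers.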

  11. Calculation of the Pearson correlation • Ratio comparing the covariability of X and Y (numerator) with the variability of X and Y separately (denominator)
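The ratio described above can be sketched in Python, assuming the usual form r = SP / √(SS_X · SS_Y). The function name `pearson_r` and the data are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariability (SP) over separate variability."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical scores: SP = 6, SS_X = 10, SS_Y = 10.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
print(round(pearson_r(xs, ys), 2))  # 0.6
```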

  12. Figure 15.4 Example 15.3 Scatterplot

  13. Pearson Correlation and z-scores • Pearson correlation formula can be expressed as a relationship of z-scores.
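The slide's equation did not survive in the transcript. The standard z-score forms are given here as an assumption about what the slide showed: each individual's z-scores for X and Y are multiplied, and the products are averaged.

```latex
r = \frac{\sum z_X z_Y}{n - 1}
\quad \text{(sample; } z \text{ computed with the sample standard deviation)}
\qquad
\rho = \frac{\sum z_X z_Y}{N}
\quad \text{(population)}
```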

  14. Learning Check • A scatterplot shows a set of data points that are clustered loosely around a line that slopes down to the right. Which of the following values would be closest to the correlation for these data?

  15. Learning Check • A scatterplot shows a set of data points that are clustered loosely around a line that slopes down to the right. Which of the following values would be closest to the correlation for these data?

  16. Learning Check TF • Decide if each of the following statements is True or False.

  17. Answer TF

  18. 15.3 Using and Interpreting the Pearson Correlation • Correlations used for prediction • Validity • Reliability • Theory verification

  19. Figure 15.5 Number of churches and number of serious crimes

  20. Interpreting correlations • Correlation does not demonstrate causation • Value of correlation is affected by the range of scores in the data • Extreme points – outliers – have an impact • Correlation cannot be interpreted as a proportion. • To show the shared variability, need to square the correlation

  21. Figure 15.6 Restricted range and correlation

  22. Figure 15.7 Influence of outlier on correlation

  23. Coefficient of determination • Coefficient of determination measures the proportion of variability in one variable that can be determined from the relationship with the other variable.
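One way to see r² as a proportion of variability is to compare it with the share of SS_Y that a least-squares line accounts for. The sketch below assumes the standard least-squares slope and intercept, and it uses made-up data.

```python
import math

# Hypothetical scores, not taken from the slides.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)

r = sp / math.sqrt(ss_x * ss_y)

# Least-squares line (assumed standard formulas) and its squared errors.
b = sp / ss_x
a = mean_y - b * mean_x
ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
proportion_predicted = 1 - ss_residual / ss_y

print(round(r ** 2, 2))               # 0.36
print(round(proportion_predicted, 2)) # 0.36
```

With r = 0.60, only 36% of the variability in Y is predictable from X, which is why r itself should not be read as a proportion.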

  24. Figure 15.8 Three degrees of linear relationship

  25. 15.4 Hypothesis Testing with the Pearson Correlation • Pearson correlation is usually computed for sample data, but used to test hypotheses about the relationship in the population. • Population correlation shown by Greek letter rho (ρ) • Nondirectional: H0: ρ = 0 and H1: ρ ≠ 0 • Directional: H0: ρ ≤ 0 and H1: ρ > 0

  26. Figure 15.9 Correlation of sample and population

  27. Hypothesis Test for Correlations • Sample correlation used to test population ρ • Degrees of freedom (df) = n – 2 • Hypothesis test can be computed using either t or F. • Critical Values have been computed • See Table B.6 • A sample correlation beyond ± Critical Value is very unlikely • A sample correlation beyond ± Critical Value leads to rejecting the null hypothesis.
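A minimal sketch of the t and F routes, assuming the standard statistic t = r·√((n − 2)/(1 − r²)) with df = n − 2, and F = t² with df = (1, n − 2). The function name and the sample values are hypothetical. No critical values are computed here, since those come from a table such as Table B.6.

```python
import math


def t_for_correlation(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    return t, df


# Hypothetical sample: r = 0.50 from n = 27 individuals.
t, df = t_for_correlation(0.50, 27)
f_ratio = t ** 2  # equivalent F statistic with df = (1, n - 2)

print(df)                 # 25
print(round(t, 3))        # 2.887
print(round(f_ratio, 3))  # 8.333
```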

  28. Partial correlation • A partial correlation measures the relationship between two variables while controlling the influence of a third variable by holding it constant
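The transcript does not include the formula. The sketch below assumes the standard first-order partial correlation, r_XY·Z = (r_XY − r_XZ·r_YZ) / √((1 − r_XZ²)(1 − r_YZ²)). The function name and the three input correlations are illustrative only.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical correlations among three variables.
print(round(partial_correlation(0.5, 0.5, 0.5), 3))  # 0.333
```

If Z is unrelated to both X and Y (r_XZ = r_YZ = 0), the partial correlation reduces to the ordinary r_XY.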

  29. Figure 15.10 Controlling the impact of a third variable

  30. 15.5 Alternative to the Pearson Correlation • Pearson correlation has been developed • for linear relationships • for interval or ratio data • Other correlations have been developed for • non-linear data • other types of data

  31. Spearman correlation • Pearson correlation formula is used with data from an ordinal scale (ranks) • Used when both variables are measured on an ordinal scale • Used when relationship is consistently directional but may not be linear

  32. Figure 15.11 Consistent nonlinear positive relationship

  33. Figure 15.12 Scatterplot showing scores and ranks

  34. Ranking tied scores • Tied scores need ranks for Spearman correlation • Method for assigning rank • List scores in order from smallest to largest • Assign a rank to each position in the list • When two (or more) scores are tied, compute the mean of their ranked positions, and assign this mean value as the final rank for each score.
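The ranking procedure can be sketched as follows. The function name `rank_with_ties` and the example scores are hypothetical.

```python
def rank_with_ties(scores):
    """Rank scores from smallest to largest; tied scores share the mean rank."""
    # Indices of the scores, ordered from the smallest score to the largest.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` to cover every score tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # Positions are 1-based, so the tied block spans pos+1 .. end+1.
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


print(rank_with_ties([3, 3, 5, 6, 6, 6, 12]))
# [1.5, 1.5, 3.0, 5.0, 5.0, 5.0, 7.0]
```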

  35. Special formula for the Spearman correlation • The ranks for the scores are simply integers • Calculations can be simplified • Use D as the difference between the X rank and the Y rank for each individual to compute the r_s statistic
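A sketch of the simplified calculation, assuming the standard special formula r_s = 1 − 6ΣD² / (n(n² − 1)). This formula is exact only when there are no tied ranks; with ties, applying the Pearson formula to the ranks is the safer route. The function name and ranks below are made up.

```python
def spearman_from_ranks(x_ranks, y_ranks):
    """Spearman r_s from two lists of untied ranks, using rank differences D."""
    n = len(x_ranks)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(x_ranks, y_ranks))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks for five individuals: D = -1, 1, -1, 1, 0, so sum(D^2) = 4.
x_ranks = [1, 2, 3, 4, 5]
y_ranks = [2, 1, 4, 3, 5]
print(round(spearman_from_ranks(x_ranks, y_ranks), 2))  # 0.8
```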

  36. Point-Biserial Correlation • Measures relationship between two variables • One variable has only two values (dichotomous variable) • Same situation as the independent samples t-test in Chapter 10 • Point-biserial r² has same value as the r² computed from t-statistic • t-statistic evaluates the significance • r statistic measures its strength
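The link between the two statistics can be sketched with the usual effect-size relation r² = t² / (t² + df), assumed here because the transcript does not show the formula. The function name and the t and df values are hypothetical.

```python
def r_squared_from_t(t, df):
    """Proportion of variance accounted for, computed from a t statistic."""
    return t ** 2 / (t ** 2 + df)


# Hypothetical independent-samples result: t = 2.00 with df = 16.
print(r_squared_from_t(2.0, 16))  # 0.2
```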

  37. Phi Coefficient • Both variables (X and Y) are dichotomous • Both variables are re-coded to values 0 and 1 • The regular Pearson formula is used
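The recoding step and the Pearson calculation can be sketched together. The category labels, variable names, and data below are invented for illustration.

```python
import math


def phi_coefficient(x_codes, y_codes):
    """Pearson r applied to two dichotomous variables coded 0 and 1."""
    n = len(x_codes)
    mean_x = sum(x_codes) / n
    mean_y = sum(y_codes) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_codes, y_codes))
    ss_x = sum((x - mean_x) ** 2 for x in x_codes)
    ss_y = sum((y - mean_y) ** 2 for y in y_codes)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical dichotomous data for four individuals.
group = ["control", "control", "treatment", "treatment"]
passed = ["no", "yes", "yes", "yes"]

# Re-code each variable to 0 and 1 (which category gets 1 is arbitrary).
x_codes = [0 if g == "control" else 1 for g in group]
y_codes = [0 if p == "no" else 1 for p in passed]

print(round(phi_coefficient(x_codes, y_codes), 3))  # 0.577
```

Swapping which category is coded 1 for one variable flips the sign of phi but not its magnitude.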

  38. Learning Check • Participants were classified as “morning people” or “evening people” then measured on a 50-point conscientiousness scale. Which correlation should be used to measure the relationship?

  39. Learning Check - Answer • Participants were classified as “morning people” or “evening people” then measured on a 50-point conscientiousness scale. Which correlation should be used to measure the relationship?

  40. Learning Check • Decide if each of the following statements is True or False.

  41. Answer

  42. 15.6 Introduction to Linear Equations and Regression • The Pearson correlation measures a linear relationship between two variables • The line through the data • Makes the relationship easier to see • Shows the central tendency of the relationship • Can be used for prediction

  43. Figure 15.13 Regression line

  44. Linear equations • General equation for a line • Equation: Y = bX + a • X and Y are variables • a and b are fixed constants

  45. Figure 15.14 Graph of a linear equation

  46. Regression • Regression is the method for determining the best-fitting line through a set of data • The line is called the regression line • Ŷ is the value of Y predicted by the regression equation for each value of X • (Y- Ŷ) is the distance of each data point from the regression line: the error of prediction • Regression minimizes total squared error

  47. Figure 15.15 Distance between data point & the predicted point

  48. Regression equations • Regression line equation: Ŷ = bX + a • The slope of the line, b, can be calculated as b = SP/SS_X • The line goes through (M_X, M_Y), so the intercept is a = M_Y – bM_X
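A minimal sketch of fitting the regression line, assuming the standard least-squares solution: slope b = SP/SS_X, and intercept a = M_Y − b·M_X (which follows from the line passing through the point of means). The function name and data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (b, a) for the least-squares line Y-hat = b*X + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x             # slope
    a = mean_y - b * mean_x   # intercept: line passes through (M_X, M_Y)
    return b, a


# Hypothetical scores: M_X = 3, M_Y = 5, SP = 6, SS_X = 10.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
b, a = regression_line(xs, ys)
print(round(b, 2), round(a, 2))  # 0.6 3.2

# Predicted Y for a new X value, and the total squared error of the fit.
y_hat = b * 3 + a
total_squared_error = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
print(round(y_hat, 2))                # 5.0
print(round(total_squared_error, 2))  # 6.4
```

Any other choice of b and a for these data would give a total squared error larger than 6.4, which is the sense in which the regression line is the best fit.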

  49. Figure 15.16 X and Y points and regression line

  50. Figure 15.17 Perfectly fit regression line and regression line for Example 15.13
