
Correlation, Reliability and Regression





  1. Correlation, Reliability and Regression Chapter 7

  2. Correlation • Statistic that describes the relationship between two sets of scores (Pearson r). • The number is the correlation coefficient. • Ranges between +1.00 and –1.00. • Positive is a direct relationship. • Negative is an inverse relationship. • .00 is no relationship. • Does not imply cause and effect. • Can be computed from Z scores. • Generally looking for values greater than .5

  3. Reliability • Statistic used to determine repeatability • Number ranges between 0 and 1 • Always positive • Closer to one is greater reliability • Closer to 0 is less reliability • Generally looking for values greater than .8

  4. Scattergram or Scatterplot • Designate one variable X and one Y. • Draw and label axes. • Lowest scores are bottom left. • Plot each pair of scores. • Positive means high on both scores. • Negative means high on one and low on the other. • IQ and GPA? • ~ 0.68

  5. Example (high correlation with systematic bias)

       Trial 1  Trial 2
  S1   10       12
  S2    9       11
  S3   12       14
  S4   11       13
  S5   13       15
  S6    8       10

  6. Positive Data

  7. Positive Plot r=.93

  8. Negative Data

  9. Negative Plot r=-.92

  10. Null Data

  11. Null Plot (orthogonal) r=.34 r=-.00

  12. Pearson (Interclass Correlation) • Ignores the systematic bias • Has agreement (rank) but not correspondence (raw score) • The order and the SD of the scores remain the same • The mean may be different between the two tests • The r can still be high (i.e. close to 1.0)

  13. Calculation of r
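The slide titles the calculation without showing it. One common form computes r as the mean product of paired Z scores, r = Σ(ZxZy) / N. A minimal sketch (plain Python, population standard deviations) applied to the Trial 1 / Trial 2 example data from slide 5:

```python
# Pearson r via the z-score formula: r = sum(Zx * Zy) / N
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((v - mx) ** 2 for v in x) / n)  # population SD of x
    sy = sqrt(sum((v - my) ** 2 for v in y) / n)  # population SD of y
    return sum(((u - mx) / sx) * ((v - my) / sy) for u, v in zip(x, y)) / n

trial1 = [10, 9, 12, 11, 13, 8]
trial2 = [12, 11, 14, 13, 15, 10]  # every score shifted up by 2 (systematic bias)
r = pearson_r(trial1, trial2)
print(round(r, 2))  # 1.0 — Pearson ignores the systematic shift
```

Because Trial 2 is just Trial 1 plus a constant, the rank order and SD are unchanged and r is perfect, illustrating the interclass-correlation point on slide 12.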

  14. ICC (Intraclass Correlation) • Addresses both correspondence and agreement • R ranges from 0.0 to 1.0 • 1.0 is perfect reliability • 0.0 is no reliability

  15. ICC • Advantages • Handles more than two variables (ratings, raters, etc.) • Detects systematic bias • Works with interval, ratio or ordinal data

  16. Example

       Trial 1  Trial 2
  S1   10       12
  S2    9       11
  S3   12       14
  S4   11       13
  S5   13       15
  S6    8       10

  17. ICC

  ICC = (BMS − EMS) / [ BMS + (k − 1) EMS ]

  18. Trial 1 & 2 Pearson r = 0.91

  ICC = (BMS − EMS) / [ BMS + (k − 1) EMS + k (TMS − EMS) / n ]

  ICC = (92.13 − 3.96) / [ 92.13 + (2 − 1) 3.96 + 2 (70.53 − 3.96) / 15 ] = 0.84
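The arithmetic on this slide can be checked directly by plugging its mean squares into the ICC formula it uses:

```python
# ICC from ANOVA mean squares, per the slide's formula:
# ICC = (BMS - EMS) / (BMS + (k-1)*EMS + k*(TMS - EMS)/n)
def icc(bms, ems, tms, k, n):
    return (bms - ems) / (bms + (k - 1) * ems + k * (tms - ems) / n)

value = icc(bms=92.13, ems=3.96, tms=70.53, k=2, n=15)
print(round(value, 2))  # 0.84
```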

  19. BMS, EMS, TMS Example

       Trial 1  Trial 2
  S1   10       12
  S2    9       11
  S3   12       14
  S4   11       13
  S5   13       15
  S6    8       10

  20. What is a Mean Square? • Sum of squared deviations divided by the degrees of freedom (df = values free to vary when the sum is set) • SSx = sum of squared deviations about the mean; dividing SS by df gives a variance estimate
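As a concrete check, computing a mean square for the Trial 1 scores from the earlier example (mean 10.5, df = n − 1):

```python
# Mean square = sum of squared deviations / degrees of freedom
def mean_square(scores):
    n = len(scores)
    m = sum(scores) / n
    ss = sum((v - m) ** 2 for v in scores)  # sum of squared deviations (SS)
    return ss / (n - 1)                     # df = n - 1 for a sample

print(mean_square([10, 9, 12, 11, 13, 8]))  # 3.5
```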

  21. Running ICC on SPSS • Analyze, scale, reliability analysis • Choose two or more variables • Click statistics, check ICC at bottom • Two-way mixed, consistency • Use single measures on output

  22. Pearson vs. ICC

  23. Interpretation (positive or negative) • < .20 very low • .20-.39 low • .40-.59 moderate • .60-.79 high • .80-1.00 very high • Must also consider the p-value

  24. Correlation Conclusion Statement • Always past tense • Include interpretation • Include ‘significant’ • Include p or alpha value • Include direction • Include r value • Use variable names • Example: There was a high, significant (p < 0.05) positive correlation (r = .78) between X and Y.

  25. Pearson vs. ICC

  26. Curvilinear • Scores curve around line of best fit. • Also called a trend analysis. • More complex statistics.

  27. Coefficient of Determination • Represents the common variance between scores. • Square of the r value. • % of variance explained. • How much variance one variable shares with the other.
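For example, squaring the r = .93 from the positive-plot slide:

```python
r = 0.93                    # correlation from the positive-plot example
r_squared = r ** 2          # coefficient of determination
print(round(r_squared, 2))  # 0.86 — about 86% of the variance is shared
```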

  28. R² • The proportion of variance that two measures have in common (overlap). • The overlap determines the relationship. • Also called explained variance.

  29. Partial Correlation the degree of relationship between two variables while ruling out that degree of correlation attributable to other variables
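The slide describes the concept; one standard way to compute a first-order partial correlation from the three pairwise r values is r_xy·z = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)). A minimal sketch with hypothetical correlation values:

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    # Correlation of x and y with the influence of z partialled out
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

# hypothetical pairwise correlations, for illustration only
print(round(partial_r(0.50, 0.50, 0.50), 2))  # 0.33
```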

  30. Simple Linear Regression • Predict one variable from another • Useful if measurement on one variable is difficult or missing • Prediction is not perfect but contains error • High reliability if error is low and R is high

  31. Residual • The vertical distance of a point from the line of best fit (its predicted value). • Positive and negative distances balance out. • Plot shown with r = .93.

  32. Prediction • Y = (bx) + c • Y is the predicted value • b is the slope of the line • x is the raw value of the predictor • c is the Y intercept (Y when x = zero) • Y is vertical / X is horizontal
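The slope and intercept come from a least-squares fit: b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², c = ȳ − b·x̄. A minimal sketch on toy data (not the slide's dataset):

```python
# Least-squares slope and intercept for Y = (b*x) + c
def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    b = sxy / sxx        # slope
    c = my - b * mx      # intercept
    return b, c

b, c = fit_line([1, 2, 3], [3, 5, 7])  # toy data lying exactly on y = 2x + 1
print(b, c)  # 2.0 1.0
```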

  33. SPSS

  34. SPSS Printout Z-scores

  35. Prediction

  HT      WT
  185.00  80.00
  185.00  87.00
  152.50  52.00
  155.00  64.10
  172.00  66.00
  179.00  81.00
  160.00  67.72
  174.00  76.00
  154.00  60.00
  165.00  70.00

  • Yp = (bx) + c • HTp = (.85 × 80) + 108.58 • HTp = 68 + 108.58 • HTp = 176.58 • Residual (error) = difference between predicted and actual • Subject must come from that population!
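The worked example on this slide, as arithmetic (slope and intercept taken from the slide):

```python
# HTp = (.85 * 80) + 108.58, using the slide's regression coefficients
b, c = 0.85, 108.58   # slope and intercept from the slide
wt = 80.0             # subject's weight (the predictor)
ht_pred = b * wt + c  # predicted height
print(round(ht_pred, 2))  # 176.58
```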

  36. Standard Error of the Estimate (SEE) • The standard deviation of the distribution of residual scores • Error associated with the predicted value • Read like a SD or SEM value (68%, 95%, 99%) • SEE × 2, added to and subtracted from the predicted score, gives the 95% CI of the predicted score

  SEE = √( Σ(residual²) / number of pairs )

  37. Prediction • Yp = (bx) + c • HTp = (.85 × 80) + 108.58 • HTp = 68 + 108.58 • HTp = 176.58 • SEE × 2 = 19.5 • 95% CI = 157.08 – 196.08

  HT      WT
  185.00  80.00
  185.00  87.00
  152.50  52.00
  155.00  64.10
  172.00  66.00
  179.00  81.00
  160.00  67.72
  174.00  76.00
  154.00  60.00
  165.00  70.00
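The confidence-interval arithmetic above, as a quick check (predicted score ± SEE × 2, per the previous slide):

```python
pred = 176.58   # predicted height from the slide
see_x2 = 19.5   # SEE * 2, from the slide
low, high = pred - see_x2, pred + see_x2
print(round(low, 2), round(high, 2))  # 157.08 196.08
```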

  38. Multiple Regression • Uses multiple X variables to predict Y • Results in beta weights for each X variable • Y=(b1x1)+ (b2x2) + (b3x3) … + c

  39. SPSS - R Prediction Y=(WT x .40) - (skinfold x 1.04) + 156.45

  40. Equation

  HT      WT     Skin
  185.00  80.00  11.00
  185.00  87.00  12.00
  152.50  52.00  23.00
  155.00  64.10  25.00
  172.00  66.00  10.00
  179.00  81.00  11.00
  160.00  67.72  20.00
  174.00  76.00  13.00
  154.00  60.00  22.00
  165.00  70.00   9.00

  • Y = (WT × .40) − (skinfold × 1.04) + 156.45 • Y = (80 × .40) − (11 × 1.04) + 156.45 • Y = 32 − 11.44 + 156.45 • Y = 177.01 • SEE = 8.67 (× 2 = 17.34) • CI = 159.67 – 194.35
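Plugging the first subject's values into the slide's multiple-regression equation, with the CI built the same way as in the simple-regression example:

```python
# Y = (WT * .40) - (skinfold * 1.04) + 156.45, per the slide
def predict_ht(wt, skinfold):
    return wt * 0.40 - skinfold * 1.04 + 156.45

y = predict_ht(80, 11)  # first subject: WT = 80, skinfold = 11
see = 8.67              # SEE from the slide
print(round(y, 2))                                   # 177.01
print(round(y - 2 * see, 2), round(y + 2 * see, 2))  # 159.67 194.35
```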

  41. Next Class • Chapter 8 & 13 t-tests and chi-square.

  42. Homework • Make a scatterplot with trendline, r and r² for two ratio variables. • Run Pearson r on four different variables and hand-draw a scatterplot for two. • Run ICC between VJ1 and VJ2. • Run linear regression on standing long jump and predict stair-up time. Work out the equation and CI for subject #2. • Run multiple regression for subject #2, adding vjump running, circumference and weight to the predictors. Also work out the equation and CI.
