Correlation

Correlation Chapter 6

Assumptions for Pearson r
• X and Y should be measured on an interval or ratio scale.
• X and Y should each be normally distributed.
• Each X should be independent of the other X’s.
• Each Y should be independent of the other Y’s.
• Scores on Y should be linearly related to X.
• The X, Y scores should be bivariate normal.

In a bivariate normal distribution, when you look at a single X value (for example, X = 0), most of the Y points are clustered around the blue regression line, and the further you move from the blue line, the fewer Y data points there are at that X.

If there is a relationship between two variables, then as one variable deviates from its mean, the other variable deviates from its mean in the same way or in the directly opposite way. Covariance is the averaged sum of the combined deviations (the cross-products of each pair of deviations). When each variable’s deviation from its mean is of about the same magnitude and in the same direction, the covariance is high.
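The averaging of combined deviations described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package: the function name `covariance`, the data values, and the choice of N - 1 as the divisor (the usual sample covariance) are assumptions made for the example.

```python
def covariance(x, y):
    """Sample covariance: sum of cross-product deviations divided by N - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Multiply each X deviation by the matching Y deviation, then add them up.
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (len(x) - 1)


# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

cov_xy = covariance(x_scores, y_scores)
print("covariance:", cov_xy)
```

Reversing the direction of one variable (for example, negating every Y score) flips the sign of the covariance but not its size.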

Correlation
• The correlation (r) is computed by dividing the covariance by the product of the standard deviations of X and Y.
• The value of r ranges from −1 to +1.
• The correlation coefficient is a standard measure of effect size: ±.1 is a small effect, ±.3 is a medium effect, ±.5 is a large effect.

Correlation and Cause?
There is a high r between shoe size and math performance for grade school children. Both increase with age, so the high r does not show that one causes the other.
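The definition of r given above (covariance divided by the product of the two standard deviations) can be written out directly. The sketch below assumes sample statistics with an N - 1 divisor; the function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson r: covariance of X and Y divided by (SD of X times SD of Y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)


# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

r = pearson_r(x_scores, y_scores)
print(f"r = {r:.3f}")
```

Because the covariance is divided by both standard deviations, r is unit-free and always falls between -1 and +1.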

Two Types of Correlation
• There are two types of correlation: bivariate and partial.
• Bivariate correlation is the correlation between two variables.
• Partial correlation is the correlation between two variables when controlling for the effect of one or more additional variables.

Pearson’s Product Moment Correlation
• Correlation measures the association between two variables.
• Correlation quantifies the extent to which the mean, variation and direction of one variable are related to another variable.
• r ranges from +1 to -1.
• Correlation can be used for prediction.
• Correlation does not indicate the cause of a relationship.

Scatter Plot
A scatter plot gives a visual description of the relationship between two variables. The line of best fit is defined as the line that minimizes the sum of the squared vertical deviations from each data point up to or down to the line.

Line of Best Fit Minimizes Squared Deviations from a Data Point to the Line
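As a rough illustration of the least-squares idea, the slope and intercept that minimize the summed squared vertical deviations can be computed in closed form. The function names and the data below are hypothetical; a real analysis would normally use SPSS or a statistics library.

```python
def best_fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Proportion of the variance in Y accounted for by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

slope, intercept = best_fit_line(x_scores, y_scores)
fit_quality = r_squared(x_scores, y_scores, slope, intercept)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {fit_quality:.4f}")
```

The printed equation has the same form as the trend-line labels a spreadsheet places on a scatter plot.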

Always do a Scatter Plot to Check the Shape of the Relationship

Will a Linear Fit Work?

Will a Linear Fit Work? y = 0.5246x - 2.2473, R^2 = 0.4259

6th Order Fit? y = 0.0341x^6 - 0.6358x^5 + 4.3835x^4 - 13.609x^3 + 18.224x^2 - 7.3526x - 2.0039, R^2 = 0.9337

Will Linear Fit Work?

Linear Fit: y = 0.0012x - 1.0767, R^2 = 0.0035

Evaluating the Strength of a Correlation
For predictions, an absolute value of r < .7 may produce unacceptably large errors, especially if the SDs of either or both X and Y are large. As a general rule:
• An absolute value of r greater than or equal to .9 is good.
• An absolute value of r from .7 to .8 is moderate.
• An absolute value of r from .5 to .7 is low.
• Values of r below .5 give R^2 below .25 (less than 25% of the variance explained); they are poor, and thus not useful for predicting.
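The rule of thumb above can be expressed as a small lookup. This is only a sketch: the function name `prediction_strength` is hypothetical, the rule does not label values between .8 and .9, and a value of exactly .7 is assigned to "moderate" here as an assumption.

```python
def prediction_strength(r):
    """Label the absolute value of r using the rule-of-thumb bands for prediction."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.9:
        return "good"
    if size > 0.8:
        # Assumption: the rule of thumb leaves the band between .8 and .9 unlabelled.
        return "between moderate and good"
    if size >= 0.7:
        # Assumption: a value of exactly .7 is treated as moderate rather than low.
        return "moderate"
    if size >= 0.5:
        return "low"
    return "poor"


for value in (0.95, -0.75, 0.6, 0.205):
    print(value, prediction_strength(value), f"R^2 = {value ** 2:.3f}")
```

The sign of r is ignored because the bands describe how well X predicts Y, not the direction of the relationship.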

Significant Correlation?
If N is large (N = 90), then a correlation as small as about .205 is statistically significant at roughly the .05 level. ALWAYS THINK ABOUT R^2: how much of the variance in Y is X accounting for? With r = .205, R^2 = .042, so X accounts for only 4.2% of the variance in Y. This will lead to poor predictions. A 95% confidence interval will also show how poor the prediction is.
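The arithmetic behind this slide can be checked with a short script. It squares r to get the proportion of variance explained and converts r to the t statistic commonly used to test a correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2), which is then compared against a critical t value with N - 2 degrees of freedom. The function names are hypothetical and the t test is an assumption about how the significance was assessed; the slide does not say.

```python
import math


def variance_explained(r):
    """Proportion of the variance in Y accounted for by X (R^2)."""
    return r ** 2


def t_statistic(r, n):
    """t statistic for testing whether a Pearson r differs from zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.205, 90  # values taken from the slide
r_squared = variance_explained(r)
t_value = t_statistic(r, n)

print(f"R^2 = {r_squared:.3f} ({r_squared * 100:.1f}% of the variance in Y)")
print(f"t({n - 2}) = {t_value:.2f}")
```

The same r with a much smaller N gives a much smaller t, which is why a weak correlation can be significant in a large sample while still explaining little variance.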

The Venn diagram shows R^2, the amount of variance in Y that is explained by X. Here R^2 = .64, so 64% of the variance in Y is explained by X, and the unexplained variance in Y is (1 - R^2) = .36, or 36%.

A partial correlation is used to remove the effects of Revision Time on both Exam Anxiety and Exam Performance. Then the unique contribution of Exam Anxiety to Exam Performance can be analyzed.
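For one control variable, the partial correlation can be obtained from the three bivariate correlations using the standard first-order formula r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below applies it with X = Exam Anxiety, Y = Exam Performance and Z = Revision Time. The three input correlations are hypothetical values chosen for illustration, not results from the slides.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with the effect of Z removed from both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical bivariate correlations, invented purely for illustration.
r_anxiety_performance = -0.44
r_anxiety_revision = -0.71
r_performance_revision = 0.40

r_partial = partial_correlation(
    r_anxiety_performance, r_anxiety_revision, r_performance_revision
)
print(f"bivariate r = {r_anxiety_performance:.3f}, partial r = {r_partial:.3f}")
```

With these illustrative inputs the partial correlation is weaker than the bivariate one, which is the pattern expected when the control variable is related to both of the other variables.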

A coach ranked athletes based on jumping ability and sprinting ability. The ranks for each athlete are presented in the table below. Compute the correlation between the rankings.

Level of Measurement is ordinal

Kendall’s tau is better if the data have several ties; if not, Spearman is fine.
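Both rank correlations can be computed by hand for a small set of ranks. The sketch below uses hypothetical jump and sprint ranks for six athletes (the table from the slides is not reproduced here) and assumes there are no tied ranks, so it uses the shortcut Spearman formula and the simple tau-a form of Kendall's tau. Data with ties would call for the tie-corrected versions that statistics packages provide.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rho for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


def kendall_tau_a(rank_x, rank_y):
    """Kendall tau-a for untied ranks: (concordant - discordant) / number of pairs."""
    n = len(rank_x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            direction = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
            if direction > 0:
                concordant += 1
            elif direction < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical ranks for six athletes, invented purely for illustration.
jump_rank = [1, 2, 3, 4, 5, 6]
sprint_rank = [2, 1, 4, 3, 6, 5]

rho = spearman_rho(jump_rank, sprint_rank)
tau = kendall_tau_a(jump_rank, sprint_rank)
print(f"Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
```

Kendall's tau is usually smaller in absolute value than Spearman's rho for the same data, so the two coefficients should not be compared against the same benchmarks.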

Test-Retest Reliability (ICC) and Day to Day Variation

Consistency of Measurements
Reliability refers to the consistency of a test or measurement. A test cannot be considered valid if it is not reliable. You should know the day to day variation in your dependent variable. How much does 1RM bench press change from day to day? How much does VO2 max change from day to day?

Test-Retest Reliability
The 1RM bench press strength was measured on 8 subjects on two separate days. In this experiment we want to measure the reliability of our measurements. Here is the day to day variation in 1RM bench press strength. In this experiment, we plan to average trials, so we will use the Averaged Measures ICC.

Enter the between day variables.

Click Statistics

Click OK to run

SPSS Output

There is no significant difference between trials, F(1,7) = .530, p = .49. If there is a difference between trials, you may have a learning effect or a fatigue effect, and you must modify your methods to control for learning and/or fatigue.

ICC = .973 [High Degree of Reliability] Using Mixed Model, Type Consistency, Averaged Measures.
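For readers who want to see what sits behind the SPSS output, the consistency-type, averaged-measures ICC for a two-way model can be computed from the ANOVA mean squares as (MS_subjects - MS_error) / MS_subjects. This corresponds to ICC(3,k) in Shrout and Fleiss's notation. The sketch below is an illustration under that assumption; the function name and the four-subject data set are hypothetical and are not the data behind the ICC of .973.

```python
def icc_consistency_average(scores):
    """Averaged-measures, consistency-type ICC from a subjects-by-days table.

    scores: list of rows, one row per subject, one column per testing day.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of days (repeated measures)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    day_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Between-subjects sum of squares.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    # Residual sum of squares after removing subject and day effects.
    ss_error = sum(
        (scores[i][j] - subject_means[i] - day_means[j] + grand_mean) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / ms_subjects


# Hypothetical 1RM values in kg (day 1, day 2), invented purely for illustration.
one_rm = [
    [100, 102],
    [110, 108],
    [120, 123],
    [130, 131],
]

icc = icc_consistency_average(one_rm)
print(f"ICC (consistency, averaged measures) = {icc:.3f}")
```

Because the consistency type removes the day effect before computing the error term, a uniform shift between days (every subject improving by the same amount) does not lower this ICC.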

Day to Day Variation
• If you train subjects for 8 weeks and they improve their strength by 2.8 kg, have you actually done anything?
• Is the program effective?
• Here is the day to day variation in 1RM bench press strength.
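One plausible way to quantify day to day variation is to take the difference between the two testing days for each subject and report the mean and standard deviation of those differences. The slides do not state exactly how the variation was computed, so this is an assumption, and the 1RM values below are hypothetical.

```python
import statistics

# Hypothetical 1RM values in kg for the same subjects on two days.
day_1 = [100, 110, 120, 130]
day_2 = [102, 108, 123, 131]

# Difference for each subject: day 2 minus day 1.
differences = [b - a for a, b in zip(day_1, day_2)]
mean_difference = statistics.mean(differences)
sd_difference = statistics.stdev(differences)  # sample SD (N - 1 divisor)

print(f"day to day variation = {mean_difference:.2f} ± {sd_difference:.2f} kg")
```

A training gain can then be judged against this baseline: an improvement that is no larger than the normal day to day variation is hard to attribute to the program.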

Manuscript Methods & Results Sections

METHODS
Measurement of Reliability
Prior to initiating the study, 8 subjects (not in the experiment) participated in a test-retest assessment of measurement reliability. One RM bench press strength was measured on two separate days for subjects in the reliability study. Reproducibility of bench press strength was analyzed using SPSS (18.0 for Windows) to compute the intraclass correlation coefficient (ICC) using a two factor mixed effects model and type consistency (McGraw and Wong, 1996; Shrout and Fleiss, 1979).

RESULTS
A high degree of reliability was found between bench press strength measurements. The average measure ICC was .973, with a 95% confidence interval from .864 to .995. The mean between day variation for 1RM strength was 2.75 ± 10.7 kg.

REFERENCES
McGraw KO and Wong SP. Forming inferences about some intraclass correlation coefficients. Psychological Methods 1: 30-46, 1996.
Shrout PE and Fleiss JL. Intraclass correlations: Uses in assessing reliability. Psychol Bull 86: 420-428, 1979.
