CHAPTER 4. CORRELATION
4.2 How to measure relationships
4.2.1 Covariance
Adverts watched and packets bought
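Variance: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{N - 1}$
Covariance: $\mathrm{Cov}(x, y) = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$
The numerator, $\sum (x_i - \bar{x})(y_i - \bar{y})$, is the sum of cross-product deviations. Averaged over $N - 1$, it gives a covariance of 4.25 for this example.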
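The two formulas can be checked by hand with a few lines of Python. This is a minimal sketch: the data values below are an assumption (they do not appear in these notes) and were chosen so that the covariance comes out at 4.25, matching the figure quoted for this example. The function names are illustrative.

```python
# Illustrative data only (an assumption, not taken from the notes):
# adverts watched and packets bought for five people.
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Sample variance: sum of squared deviations divided by N - 1."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def covariance(xs, ys):
    """Sample covariance: sum of cross-product deviations divided by N - 1."""
    mx, my = mean(xs), mean(ys)
    cross_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross_products / (len(xs) - 1)


print(round(variance(adverts), 2))             # 2.8
print(round(covariance(adverts, packets), 2))  # 4.25
```

A positive covariance means that when one variable deviates from its mean, the other tends to deviate in the same direction.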
Figure 4.1 Graphical display of the difference between observed data and means of the two variables
To overcome the problem of dependence on the measurement scale, we convert the covariance into a standard set of units.
By standardising we end up with a value that lies between -1 and +1.
+1 means variables are perfectly positively related.
This correlation coefficient is called the Pearson product-moment correlation coefficient, or simply the Pearson correlation coefficient.
$r = \dfrac{\mathrm{Cov}(x, y)}{s_x s_y} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}$
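A short Python sketch of the standardisation step: divide the covariance by the product of the two standard deviations. The data are illustrative (an assumption, not taken from these notes), and `pearson_r` is a hypothetical helper name. The second call rescales one variable to show that r, unlike the covariance, does not depend on the measurement scale.

```python
import math


def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)


# Illustrative data only (an assumption, not taken from the notes).
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]

r = pearson_r(adverts, packets)

# Multiplying one variable by a constant changes the covariance
# but leaves r unchanged.
r_rescaled = pearson_r([x * 60 for x in adverts], packets)

print(round(r, 3))           # 0.871
print(round(r_rescaled, 3))  # 0.871
```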
Figure 4.6 – 3D Scatter Plot
After a preliminary glance at the data, we can conduct the correlation analysis.
Access the File Advert.sav
SPSS: Analyze > Correlate > Bivariate
4.5.1 Pearson's Correlation Coefficient
Reload the file ExamAnxiety.sav
4.5.2 A word of warning about interpretation: Causality
The third variable problem (There may be other measured or unmeasured variables affecting the results)
Direction of causality (The correlation coefficients say nothing about which variable causes the other to change)
We can go a step further by squaring r
The correlation coefficient squared (coefficient of determination) is a measure of the amount of variability in one variable that is explained by the other.
$R^2 = 0.194$, i.e. 19.4%
Exam anxiety accounts for 19.4% of the variability in exam performance.
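The arithmetic is just r squared, expressed as a percentage. In the sketch below the value of r is an assumption: these notes do not state it, and it was chosen so that the result reproduces the 19.4% figure. The sign is also assumed; squaring removes it either way.

```python
def coefficient_of_determination(r):
    """Proportion of variability in one variable shared with the other."""
    return r ** 2


r = -0.44  # hypothetical value, chosen to reproduce the 19.4% figure
r_squared = coefficient_of_determination(r)

print(f"R^2 = {r_squared:.3f} ({r_squared * 100:.1f}%)")  # R^2 = 0.194 (19.4%)
```

The remaining 80.6% of the variability in exam performance is left unaccounted for by exam anxiety.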
4.5.4 Spearman's correlation coefficient
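Spearman's coefficient is Pearson's r applied to the ranks of the data rather than to the raw scores. A minimal sketch, assuming tied scores receive the average of the ranks they span; the helper names and the example data are illustrative, not taken from these notes.

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranked data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but non-linear relationship (illustrative data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because only the ordering matters, a perfectly monotonic relationship gives a rho of 1 even when the raw scores are not linearly related.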
4.5.5 Kendall's tau (non-parametric)
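Kendall's tau compares every pair of cases and counts how often the two variables order the pair the same way (concordant) or the opposite way (discordant). The sketch below implements the simplest variant, tau-a, which makes no adjustment for ties; statistical packages often report a tie-corrected variant instead, so results can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie on x or y; tau-a counts it as neither.
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Illustrative data: one pair out of three is ordered differently.
print(round(kendall_tau_a([1, 2, 3], [1, 3, 2]), 3))  # 0.333
```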
The shaded area of X1, as a proportion of the area of Y, represents the partial correlation of X1 with Y given X2. This shaded area, as a proportion of Y, denotes the incremental variance in Y explained by X1, given that X2 is already in the equation.
The unique predictive effect due to a single independent variable among a set of independent variables.
a = Variance of Y uniquely explained by X1
b = Variance of Y uniquely explained by X2
c = Variance of Y explained jointly by X1 and X2
d = Variance of Y not explained by X1 or X2
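The four areas can be turned into proportions with a little arithmetic. Under the usual reading of this kind of diagram, the squared semi-partial (part) correlation of X1 is a as a share of all the variance in Y, while the squared partial correlation is a as a share of the variance in Y that X2 leaves unexplained (a + d). The area values below are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are illustrative.

```python
import math

# Hypothetical areas, expressed as proportions of the total variance in Y.
a = 0.10  # uniquely explained by X1
b = 0.20  # uniquely explained by X2
c = 0.30  # explained jointly by X1 and X2
d = 0.40  # not explained by X1 or X2


def squared_semipartial_x1(a, b, c, d):
    """Unique contribution of X1 as a share of ALL the variance in Y."""
    return a / (a + b + c + d)


def squared_partial_x1(a, d):
    """Unique contribution of X1 as a share of what X2 leaves unexplained."""
    return a / (a + d)


total_r_squared = (a + b + c) / (a + b + c + d)  # explained by X1 and X2 together

print(round(squared_semipartial_x1(a, b, c, d), 3))
print(round(squared_partial_x1(a, d), 3))
print(round(total_r_squared, 3))
```

With these values X1 uniquely accounts for 10% of all the variance in Y, but 20% of the variance that X2 does not already explain.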