Correlation & the Coefficient of Determination (Session 04)
Learning Objectives
At the end of this session, you will be able to understand the meaning and limitations of Pearson’s coefficient of correlation (r).
The term “correlation” refers to a measure of the strength of association between two variables.
If the two variables increase or decrease together, they have a positive correlation.
If increases in one variable are associated with decreases in the other, they have a negative correlation.
For two quantitative variables X and Y, for which n pairs of measurements (xi, yi) are available, Pearson’s correlation coefficient (r) gives a measure of the linear association between X and Y.
The formula is given below for reference.
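In its usual textbook form, with x̄ and ȳ denoting the sample means of the xi and yi (notation assumed here, not taken from the slides), r is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```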
If X and Y are perfectly positively correlated, r = +1
If there is absolutely no linear association, r = 0
If X and Y are perfectly negatively correlated, r = -1
Thus -1 ≤ r ≤ +1
The closer r is to +1 or -1, the greater is the strength of the association.
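The limiting values of r can be checked with a short Python sketch. The function name pearson_r and the small data sets are made up for illustration; they are not part of the session material. The third case shows a limitation of r: Y depends exactly on X, but the relationship is not linear, so r is 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2 * v + 1 for v in x])    # points on a rising straight line
r_neg = pearson_r(x, [10 - 3 * v for v in x])   # points on a falling straight line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2, not linear

print(r_pos, r_neg, r_curve)
```

Real data rarely fall exactly on a line, so values strictly between -1 and +1 are the norm.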
It is often difficult to interpret r without some familiarity with the values of r typically expected in the application area.
A more appropriate measure to use when interest lies in the dependence of Y on X is the Coefficient of Determination, R².
It measures the proportion of variation in Y that is explained by X, and is often expressed as a percentage.
The ANOVA (for 93 rural female-headed households) of log consumption expenditure versus number of persons per sleeping room gives:
R² = Regression S.S. / Total S.S. = 4.89 / 25.23 = 0.194
From above, we can say that 19.4% of the variability in the income poverty proxy measure is accounted for by the number of persons per sleeping room.
Clearly there are many other factors that influence the poverty proxy since over 80% of the variability is left unexplained!
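The arithmetic behind these percentages can be reproduced in a few lines of Python. Only the two sums of squares quoted above are used; the variable names are chosen here for illustration.

```python
# Sums of squares quoted in the worked example
regression_ss = 4.89
total_ss = 25.23

r_squared = regression_ss / total_ss   # proportion of variation in Y explained by X
unexplained = 1 - r_squared            # proportion left to other factors

print(f"R^2 = {r_squared:.3f} ({r_squared:.1%} explained)")  # R^2 = 0.194 (19.4% explained)
print(f"Unexplained = {unexplained:.1%}")                    # Unexplained = 80.6%
```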
When there is just one explanatory variable being considered (as in the above example), the squared value of r equals R².
In the above example, r = -√0.194 = -0.44
The negative value is used when taking the square root because the graph indicates a negative relationship (see next slide).
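A short Python sketch of this step follows. The first part uses the sums of squares from the example, with the negative sign supplied by hand because the square root alone cannot recover it. The second part uses a small made-up data set (not the household data) to check that, with one explanatory variable, r squared equals R² and the sign of r matches the sign of the fitted slope.

```python
import math

# Part 1: the worked example. The sign comes from the direction of the relationship.
r_squared = 4.89 / 25.23
r = -math.sqrt(r_squared)
print(f"{r:.2f}")  # -0.44

# Part 2: made-up data with a downward trend (hypothetical, for illustration only)
x = [1, 2, 3, 4, 5, 6]
y = [9.1, 8.4, 8.6, 7.2, 7.5, 6.3]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)   # this is the Total S.S.

# Least-squares line and its regression sum of squares
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * a for a in x]
regression_ss = sum((f - mean_y) ** 2 for f in fitted)

r2_from_anova = regression_ss / syy
r_direct = sxy / math.sqrt(sxx * syy)
# Take the square root of R^2 and attach the sign of the slope
r_from_r2 = math.copysign(math.sqrt(r2_from_anova), slope)

print(r_direct, r_from_r2)
```

This identity holds only for a single explanatory variable; with several explanatory variables R² is no longer the square of one pairwise correlation.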