Correlation and Causation Visiting Associate Professor Giddings Math/Econ 108
Outline • A Brief History of Correlation • What is Causation? • What is Correlation? • Spurious Correlations • Simpson’s Paradox • How scientists determine causation
What is Causation? • When changes in one variable (X) affect changes in another variable (Y), we say that X causes Y. • Examples: • The Sun Rises → Rooster Crows (unidirectional) • Education → Higher Wages (bidirectional?)
Important Questions • What causes poverty? • Will Obama’s tax cuts cause the economy to expand? • Does immigration cause lower wages? • Does Prozac cause suicide? • Does the burning of fossil fuel cause global warming? • Does teenage pregnancy lead an individual to drop out of high school? • Do married men work harder and thus earn higher wages? • Does gay marriage cause the dissolution of heterosexual marriage?
The Importance of Causality • I would rather discover one causal law than be the King of Persia. • Democritus (460-370 B.C.)
How do we determine Causation? • Correlation • Controlled Experiments • Theory
What is Correlation? • When two variables move together, we say they are correlated.
Sir Francis Galton A brilliant but quirky dude. • Sir Francis Galton FRS (16 February 1822 – 17 January 1911), cousin of Sir Douglas Galton, half-cousin of Charles Darwin, was an English Victorian polymath, anthropologist, eugenicist, tropical explorer, geographer, inventor, meteorologist, proto-geneticist, psychometrician, and statistician. He was knighted in 1909. • Galton had a prolific intellect, and produced over 340 papers and books throughout his lifetime. He also created the statistical concept of correlation and widely promoted regression toward the mean. He was the first to apply statistical methods to the study of human differences and inheritance of intelligence, and introduced the use of questionnaires and surveys for collecting data on human communities, which he needed for genealogical and biographical works and for his anthropometric studies. He was a pioneer in eugenics, coining the very term itself and the phrase "nature versus nurture."
Karl Pearson 1857-1936 • Protégé of Sir Francis Galton, he founded the world’s first statistics department at University College London. • Main contributions: • Linear regression and correlation • Classification of distributions • Pearson’s chi-square test • Coefficient of correlation
Pearson’s r • Should be +1 if all points lie on a line with a positive slope • Should be -1 if all points lie on a line with a negative slope • Should be 0 if all points lie on a horizontal or vertical line
Should be unchanged if the same constant is added to all x-values or all y-values. Put center of graph at
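The properties listed for Pearson’s r can be checked numerically. The sketch below is a minimal illustration, not the course’s own code: the function name `pearson_r` and the sample data are assumptions made for the example. It uses the usual formula (sum of products of deviations from the means, divided by the square root of the product of the summed squared deviations). One choice to flag: for a horizontal or vertical line that formula gives 0/0, so the sketch raises an error there instead of returning 0; returning 0 would be another convention.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations from the means: subtracting the mean is the centering step,
    # and it is why adding a constant to every x or every y cannot change r.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        # All points on a horizontal or vertical line: the formula is 0/0.
        raise ValueError("r is undefined when x or y has no spread")
    return cross / spread


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # points on a rising line
r_down = pearson_r(xs, [-3 * x + 7 for x in xs])  # points on a falling line
r_base = pearson_r(xs, ys)                        # scattered points
r_shifted = pearson_r([x + 100 for x in xs], [y - 50 for y in ys])
```

Here `r_up` and `r_down` come out at +1 and -1 up to rounding, and `r_shifted` matches `r_base` because the shifts cancel when the means are subtracted.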
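Here’s a different way of seeing it • If r = 1, every X equals its Y once both are standardized (mean subtracted, then divided by the standard deviation). • So, Pearson’s correlation coefficient essentially measures how far the standardized Xs and Ys are from each other: the larger the gaps, the further r falls below 1.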
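The distance reading of r can be made concrete. With z-scores computed using the population standard deviation (dividing by n), r equals 1 minus half the average squared gap between each standardized X and its standardized Y. The sketch below checks that identity on made-up data; the helper names `z_scores` and `r_from_z_distance` are hypothetical, chosen only for this example.

```python
import math


def z_scores(values):
    """Standardize values: subtract the mean, divide by the population SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def r_from_z_distance(xs, ys):
    """Recover Pearson's r from the gaps between standardized xs and ys."""
    zx, zy = z_scores(xs), z_scores(ys)
    mean_sq_gap = sum((a - b) ** 2 for a, b in zip(zx, zy)) / len(xs)
    # Zero gap gives r = 1; the largest possible average gap (4) gives r = -1.
    return 1 - mean_sq_gap / 2


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

r_gap = r_from_z_distance(xs, ys)                          # scattered points
r_line = r_from_z_distance(xs, [10 * x + 3 for x in xs])   # rising line
r_flip = r_from_z_distance(xs, [-x for x in xs])           # falling line
```

Note that the raw values on the rising line are far apart (y = 10x + 3), yet their z-scores coincide, which is why the statement only holds after standardizing.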
So, the Crucial Point is… • Correlation does not necessarily imply causality
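One way to see why correlation need not imply causality is a small simulation with a hidden common cause. In the sketch below, which is an illustration with invented variable names and settings (sample size, noise levels, and seed are all assumptions), X and Y never influence each other; both are driven by a third variable Z. They still come out clearly correlated, and the correlation disappears once Z’s contribution is removed.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


random.seed(108)  # fixed seed so the run is repeatable
n = 2000

# Z is the hidden common cause. X and Y each depend on Z plus their own
# independent noise; neither one appears in the other's equation.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)  # sizeable, although X does not cause Y

# Subtract Z's contribution (known here only because we built the data).
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)  # near zero
```

With these settings the theoretical correlation between X and Y is 0.5, so `r_xy` should land near that value, while `r_rest` should be close to 0. In real data the common cause is usually not observed, which is why controlled experiments and theory are needed alongside correlation.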