Unit 2: Correlation and Causality

The S-030 roadmap: Where’s this unit in the big picture?

  • Building a solid foundation

    • Unit 1: Introduction to simple linear regression

    • Unit 2: Correlation and causality

    • Unit 3: Inference for the regression model

  • Unit 4: Regression assumptions: Evaluating their tenability

  • Unit 5: Transformations to achieve linearity

  • Adding additional predictors

    • Unit 6: The basics of multiple regression

    • Unit 7: Statistical control in depth: Correlation and collinearity

  • Generalizing to other types of predictors and effects

    • Unit 8: Categorical predictors I: Dichotomies

    • Unit 9: Categorical predictors II: Polychotomies

    • Unit 10: Interaction and quadratic effects

  • […] it all

    • Unit 11: Regression modeling in practice

In this unit, we’re going to learn about…

  • Developing a heuristic understanding of the correlation coefficient (r)

    • Understanding correlation as regression on standardized variables

    • The relationship between r and R²: how large is large?

  • From correlation to causality

    • Randomized experiments: The “gold standard” for establishing causality

    • What can you do when randomized experiments aren’t possible?

    • When might an observed correlation not indicate a causal relationship?

    • Spurious correlation, confounding, Simpson’s paradox, reciprocal causation and ecological correlation

    • Conditions for establishing causality

RQ: Is there a link between TV exposure and attention deficit problems?

Read the Pediatrics article

Listen to the NPR Interview

Developing a measure of “co-relation”: Meet Karl Pearson (27 March 1857 – 27 April 1936)

Heredity: relationships between siblings and spouses (Pearson & Lee, 1903, On the laws of inheritance in man, Biometrika)

  • Pearson, Galton’s advisee and the first Galton Professor of Eugenics at University College, London (shown here with Galton at right)

  • Fun fact: Born Carl (with a C), he changed his name to Karl (with a K) after Karl Marx

  • Developed, or named, many of the basic tools of modern statistics, including standard deviation, χ² goodness of fit, and correlation

  • Pearson’s “problems” to solve:

    • Neither variable is an “outcome” or a “predictor”

    • The measure of correlation should be dimensionless (e.g., applicable for inches or feet, digit span or stature)

  • His solution: Re-express (transform) both variables on new “standard” scales that essentially eliminate the particular metrics of the original scales

Learn more about Karl Pearson

Transformation and standardization: Re-expressing a variable’s scale


Standardization: A particular transformation that yields a new variable with mean = 0 and sd = 1

OwnIQ: Mean = 97.36, sd = 14.69

FostIQ: Mean = 98.11, sd = 15.21

[Figure: OwnIQ and FostIQ scales, each marked at ±1 sd and ±2 sd from its mean]

ID    OwnIQ    SOwnIQ    FostIQ    SFostIQ
 1      68    -1.9985       63     -2.3080
 2      71    -1.7943       76     -1.4535
. . .
25      95    -0.1606       96     -0.1389
26      96    -0.0925       93     -0.3361
. . .
52     129     2.1539      117      1.2415
53     131     2.2900      132      2.2274

  • Standardization...

  • Forces the sample mean of the new variable to be 0 and its sd to be 1

  • The new values measure an individual’s distance from the sample mean in sd units

  • Doesn’t change anyone’s relative rank

  • Doesn’t create a normally distributed variable
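As a worked equation for the bullets above, standardization can be written as follows. The symbols z, x-bar and s are notation introduced here, not taken from the slide. The numbers plug in the first twin's OwnIQ of 68, assuming the OwnIQ mean and sd are 97.36 and 14.69 (the pairing that reproduces the tabled value of -1.9985 up to rounding).

```latex
\[
z_i = \frac{x_i - \bar{x}}{s_x}
\qquad\text{for example}\qquad
z_1 = \frac{68 - 97.36}{14.69} = \frac{-29.36}{14.69} \approx -2.00
\]
```

Read this way, a standardized score of about -2 says the first twin's OwnIQ sits roughly two sample standard deviations below the sample mean. Because every score is shifted and rescaled by the same two constants, ranks and the shape of the distribution are unchanged.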


Transformation: Any re-expression of a variable’s scale

Using a regression on standardized variables to understand correlation

Slope of the standardized regression line assesses the estimated difference in FostIQ (measured in standard deviation units) per standard deviation in OwnIQ

Standardized regression line goes precisely through (0,0)

At average X (SOwnIQ=0), we predict average Y (SFostIQ=0)


Pearson product-moment coefficient, r

Does 0.8767 seem familiar?
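A short derivation may help show why the standardized slope and r coincide. It uses the standard least-squares formulas for simple regression. The symbols b0, b1, s and z are notation introduced here, and reading 0.8767 as the standardized slope (and hence r) is an assumption based on the slide's wording.

```latex
\[
\hat{Y} = b_0 + b_1 X, \qquad
b_1 = r\,\frac{s_Y}{s_X}, \qquad
b_0 = \bar{Y} - b_1 \bar{X}
\]
% After standardizing, both means are 0 and both sds are 1, so:
\[
b_1^{*} = r \cdot \frac{1}{1} = r, \qquad
b_0^{*} = 0 - r \cdot 0 = 0
\quad\Longrightarrow\quad
\hat{z}_Y = r\, z_X
\]
% With the slide's value, and sds of 15.21 (FostIQ) and 14.69 (OwnIQ):
\[
\hat{z}_{\mathit{FostIQ}} = 0.8767\, z_{\mathit{OwnIQ}}, \qquad
b_1 = 0.8767 \times \frac{15.21}{14.69} \approx 0.908
\]
```

The zero intercept is why the standardized line passes through (0,0). The last line is only an implied raw-metric slope under the sd pairing assumed here, not a figure quoted from the slides.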

How do we interpret r?

Plots to help develop your intuition for interpreting r

Cool interactive applet for learning more about correlation

Understanding the relationship between r and R² (and their use as measures of “effect size”)

Not uncommon in the social sciences, but when r < .2, you have very little explanatory power (R² < 4%)

Covers most “statistically significant” correlations in the social sciences, but even when r = .5, you’re only explaining 25% of the variance in Y

Rare in the social sciences, and even when r = .7, you’re still explaining less than ½ the variance in Y

Extremely rare in the social sciences, unless you have aggregate data or a coding problem(!)

Cohen’s guidelines

  • Small: r = .10

  • Medium: r = .30

  • Large: r = .50

Another way of thinking about r is as a measure of effect size
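The percentages quoted on this slide follow from squaring r. As a quick worked check, which holds for simple regression with a single predictor (the setting of this unit), the benchmark values of r translate into variance explained as follows:

```latex
\[
R^2 = r^2
\qquad
\begin{array}{c|ccccc}
r   & .10  & .20  & .30  & .50   & .70   \\ \hline
R^2 & 1\%  & 4\%  & 9\%  & 25\%  & 49\%
\end{array}
\]
```

So a correlation that counts as “large” by Cohen’s guidelines (r = .50) still leaves 75% of the variance in Y unexplained.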

From correlation to causality

Identified mechanism: You have a sound theory to explain how a change in the predictor produces a change in the outcome

You find the same result in other populations, with different characteristics, at different times

What do we really mean when we say:

“Associated with”

“Related to”

“Explained by”

“Varies with”

“Covaries with”

“I interpreted…Galton…to mean that there was a category broader than causation, namely correlation…and that this new conception of correlation brought psychology, anthropology, medicine, and sociology in large parts into the field of mathematical treatment. It was Galton who first freed me from the prejudice that sound mathematics could only be applied to natural phenomenon under the category of causation. Here for the first time was a possibility, I will not say a certainty, of reaching knowledge—as valid as physical knowledge was then thought to be—in the field of living forms and above all in the field of human conduct.”

Karl Pearson, 1889

Four criteria for establishing causality

Responsiveness: You demonstrate that a change in the predictor actually produces a change in the outcome

No plausible alternative explanation: There’s no rival predictor that can explain away the observed correlation

Highest priorities for design and analysis, and often the hardest to establish

Counterfactual reasoning provides a powerful lens for thinking about these questions

You’d like to know what outcome values these individuals would have had if they had received a “different treatment” (i.e., if they had different predictor values)

Why randomized experiments are the “gold standard”

Narrative development in bilingual kindergarteners: Can Arthur help? Yuuko Uchikoshi (2005), Developmental Psychology

RQ: Can narrative skills be ‘taught’ via TV to English Language Learners?

Four important attributes of randomized experiments

The researcher actively intervened in the system, actually changing X (the treatment) and seeing what happens to Y

Because of random assignment, groups are guaranteed to be initially equivalent, on average, on all observable (and unobservable) characteristics

The control group provides the ideal counterfactual—our best estimate of what the treatment group would have looked like if it didn’t receive the treatment

Any difference found in Y must be due to the changing of X (the treatment) because there’s no other plausible explanation

There will always be studies where researchers have the burden of proof

“You can’t fix by analysis what you bungled by design…”

Light, Singer and Willett (1990)

  • Ethics: Morally, there are some treatments to which you can’t expose people

    • Does radiation cause cancer?

  • Feasibility: Logistically, there are some treatments to which you can’t assign people

    • Does education cause increased income?

  • Time: Practically, some information is better than no information

    • Does quality child care cause better life outcomes?

  • Availability of data: With so much data, shouldn’t we analyze it?

  • How might you try to establish responsiveness? The key question is: How are predictor values assigned?

    • They’re not: they’re immutable characteristics of people (many would argue that these can’t be “causes”)

    • Participants choose them

    • Researchers assign them (but not randomly)

    • When participants (or even researchers) choose, the conclusions are weaker because they’re subject to selection bias

    • Outside forces inadvertently change them (natural experiments)

    • External raters assign them using a ranking criterion, e.g., identifying those above a cut-score (regression discontinuity)

  • How might you eliminate alternative explanations? The key question is: Can the findings be explained away?

    • Can you establish that the groups were equivalent initially? (Matching; propensity score matching is especially popular now)

    • Can you isolate that portion of the variation in X that’s exogenous (not subject to selection bias)? (Instrumental variables)

    • Can you rule out other explanations for the observed association?

Let’s think about how you might go about doing this

Non-experimental data: Might the correlation be causal?

US Committee on Gov’t Reform: “When forced to take legally binding positions, the tobacco industry still does not accept scientific consensus … that…cigarettes cause disease in smokers [and] that environmental tobacco smoke causes disease in nonsmokers.”

Read the Waxman (2002) report: Tobacco industry statements in the Department of Justice Lawsuit

But just because we haven’t done an experiment doesn’t mean the correlation isn’t causal

Sample Tobacco Industry Statements

  • “[the causes of diseases] are complex, and the mechanism of causation, as well as the possible role of any cigarette smoke constituent in causation, have not been scientifically established”

  • “[At] least two standards for establishing causation exist. An epidemiological standard of causation, based primarily on statistical evidence, … [and] the more rigorous traditional scientific standard…[which] requires, among other things… well-designed and conducted … experiments.”

Even experiments aren’t foolproof: The MRFIT trial

17 September 1982

Heart Attack Study Finds Men Heeding Health Advice Better

A federally financed study of 12,866 men -- half exhorted to improve their health habits and half getting only "usual care" from their doctors -- has produced an unexpected result: Both groups had the same rate of heart attacks, but it was only one-fourth the rate of the general population of the same age.

What happened [is that] almost all Americans were reading and hearing advice to smoke less, eat fewer fats and lower their cholesterol level and blood pressure. Exhorted or not, most of the men in the study and their doctors apparently got the same message, and did even better than the average American.

Find Article on LexisNexis

Spurious Correlation: Common Response to a Third Variable

[Figure label: Soft Drink]

But not all spurious correlations are nonsense

Pigou (1899)

SES is often the “3rd variable”

It is easy to prove that the wearing of tall hats and the carrying of umbrellas enlarges the chest, prolongs life, and confers comparative immunity from disease… A university degree, a daily bath, the owning of thirty pairs of trousers, a knowledge of Wagner’s music, a pew in church, anything, in short, that implies more means and better nurture…can be statistically palmed off as a magic spell conferring all sorts of privileges…The mathematician whose correlations would fill a Newton with admiration, may, in collecting and accepting data and drawing conclusions from them, fall into quite crude errors by just such popular oversights --George Bernard Shaw (1906)


There’s a 3rd variable, Z, which causes changes in X and may—or may not—also cause change in Y

Yule (1899) An investigation into the causes of changes in pauperism in England

Poorhouses

Yule’s footnote 25: “Strictly speaking, for ‘due to’ read ‘associated with.’”

Confounding: A “confusion of effects”: A third variable may (or may not) explain away (or reduce) the correlation

25 Feb 1993

Crack cocaine study faulted on race factor

A study carried out four years ago has created the false perception that crack cocaine smoking is more common among blacks and Hispanics than among white Americans, say scientists who reanalyzed the findings in a new report.

The 1988 National Household Survey on Drug Abuse said that rates of crack use among blacks and Hispanics were twice as high as among whites. But the study failed to take into account social factors such as where the users lived and how easily the drug could be obtained, according to researchers writing in yesterday's issue of the Journal of the American Medical Association. The authors, from Johns Hopkins University, said that when adjusted for those factors, the study found equivalent use of crack among blacks, Hispanics and whites.

"Researchers have the responsibility to go beyond the reporting of racial and ethnic differences" because such findings "are often presented as if a person's race has intrinsic explanatory power," the authors wrote.





There’s a 3rd variable, Z, which is correlated with X and which causes changes in Y, but we don’t know if this explains away the correlation between X & Y



Find Article on LexisNexis

Simpson’s paradox: A third variable may reverse (!) the correlation

Some confounders don’t just ‘explain away’ the association, they reveal a reversal in the direction of the effect

r = -0.56

Sex bias in graduate admissions: UC Berkeley (1973)

Learn more about Simpson’s paradox
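To make the reversal concrete, here is a minimal SAS sketch using made-up numbers. These are not the Berkeley admissions data, and the dataset name simpson and the variables group, x and y are hypothetical. Within each group y rises with x, yet pooling the two groups gives a negative correlation (about -0.81 for these six points, versus +1 within each group).

```sas
/* Hypothetical data: two groups, each with a perfectly positive x-y trend */
data simpson;
  input group $ x y;
  datalines;
A 1 6
A 2 7
A 3 8
B 6 1
B 7 2
B 8 3
;
run;

/* Pooled correlation, ignoring the third variable (group) */
proc corr data=simpson;
  var x y;
run;

/* Correlation within each level of the third variable */
proc sort data=simpson;
  by group;
run;

proc corr data=simpson;
  by group;
  var x y;
run;
```

The data are deliberately artificial so the arithmetic is easy to check by hand. Real examples are noisier, but the logic is the same: the sign of the pooled association is driven by differences between the groups, not by the relationship inside them.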

Reciprocal causation: Do happy mothers make happy babies? Or is it the other way around?





X may cause Y or Y may cause X—with the data we have, we just can’t tell

  • Cross-sectional observational studies are particularly susceptible to questions of reciprocal causation

  • Special education placements and reading scores: the more segregated the placement, the lower the reading scores

  • Motherhood and suicide risk: married women with children are at lower risk of suicide than unmarried women; the more children the lower the risk

  • Depression and smoking: teens who are daily smokers are more likely to be seriously depressed

Ecological correlation: Aggregate and individual correlations may differ

Robinson, W.S. (1950). Ecological Correlations and the Behavior of Individuals. American Sociological Review 15: 351–357.

Correlations of Foreign Born with Illiteracy, by unit of analysis:

  • 97,272 individuals

  • 48 states

  • 9 regions

(Rural, low foreign born, but lots of illiterates)

(Urban, lots of foreign born, but also lots of very literate folks)

Aggregate data describe aggregate relationships, not individual-level relationships





Where to go to learn more about establishing causality

In recent years, there has been an explosion of interest in the conditions necessary for establishing causal inferences. Different disciplines use different standards and approaches, and there is much to learn by reading broadly. Here are some sources that I find particularly interesting and insightful:

  • Discussions focused on education

    • Shavelson, RJ & Towne, L eds (2002). Scientific Research in Education. Washington, DC: National Academy Press.

    • Angrist, JD (2004). American education research changes tack. Oxford Review of Economic Policy, 20(2), 198-212.

    • Gamse, BC & Singer, JD (2005). Lessons from the Red Sox Playbook. Harvard Education Letter, 21(2), 7-8.

    • Cook, TD (2001). Sciencephobia: Why education researchers reject randomized experiments. Education Next, Fall, 63-68.

  • Discussions focused on psychology

    • Rutter, M (2007). Proceeding from Observed Correlation to Causal Inference: The Use of Natural Experiments. Perspectives on Psychological Science, 2(4), 377-395.

  • Discussions focused on epidemiology

    • Rothman, KJ & Greenland, S (2005). Causation and causal inference in epidemiology. American Journal of Public Health, Supplement 1, 95(1), S144-S150.

    • Maldonado, G & Greenland, S (2002). Estimating causal effects. International Journal of Epidemiology, 31, 422-429.

    • Krieger, N (1994). Epidemiology and the web of causation: Has anyone seen the spider? Social Science and Medicine, 39(7), 887-903.

  • General overviews (warning: some of these are very technical)

    • Shadish, WR, Cook, TD & Campbell, DT (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, MA: Houghton Mifflin.

    • Holland, PW (1986). Statistics and causal inference (with discussion). Journal of the American Statistical Association, 81, 945-970.

    • Freedman, DA (2004). Graphical models for causation and the identification problem. Evaluation Review, 28(4), 267-293.

    • Rubin, DB (2005). Causal inference using potential outcomes: Design, Modeling, Decisions. Journal of the American Statistical Association, 100, 322-331.

  • Discussions focused on sociology

    • Winship, C & Morgan, S (1999). The estimation of causal effects from observational data. Annual Review of Sociology, 25, 659-707.

    • Morgan, SL & Winship, C (2007). Counterfactuals and Causal Inference: Methods and Principles for Social Research. NY: Cambridge U Press.

What’s the big takeaway from this unit?

  • Correlation coefficients are nifty tools when used correctly

    • Having a scale-free measure of association is a powerful concept; you can develop your intuition about the meaning of correlations and that intuition will carry across all types of variables

    • The size of a correlation tells you about the strength of a relationship, not its magnitude. For the magnitude, you need the slope

  • Correlation ≠ Causality

    • Randomized experiments are the gold standard for establishing causality, but even with them there can be limits to your inferences

    • There are many reasons why you might find a correlation between an outcome and a predictor; learn how to think about alternative explanations and evaluate whether they’re likely to hold in any given context

    • When analyzing data, consider the steps involved in going from correlation to causality and decide how far your inferences can go

  • There are many other issues involved in moving from correlation to causality

    • But before being able to tackle these more technical treatments, you need to know much more about the basic regression approach

    • We offer entire classes on these issues: A-164: Program evaluation and S-290: Quantitative methods for improving causal inference

Appendix 1: Annotated PC SAS code for Unit 2, Burt data

Don’t forget the semicolon at the end of every statement.

/* Set output options, title and footnote */
options nodate nocenter nonumber;
title1 "Unit 2: IQs of Cyril Burt's identical twins";
footnote1 "Program: Unit 2--Burt analysis.sas";

/* Be sure to update the infile reference to the file's
   location on your computer */

/* Input Burt data and name variables in dataset */
data one;
  infile 'm:\datasets\Burt.txt';
  input ID 1-2 OwnIQ 4-6 FostIQ 8-10;
run;

/* Estimate bivariate correlation between owniq & fostiq
   (Pearson correlation coefficient) */
proc corr data=one;
  var owniq fostiq;
run;

Don’t forget to specify the location of the raw data, and check that you are indicating the appropriate drive.

proc corr estimates bivariate correlations between the variables you specify. Its var statement syntax is var1 var2 var3 … varn (note that it has neither an * (like proc gplot) nor an = (like proc reg)).
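One possible extension of the program above, offered as a sketch and not as part of the original handout: standardize both IQ scores and regress one on the other, so the estimated slope can be compared with the Pearson r that proc corr reports. The dataset name stdone is an assumption introduced here. The dataset one and the variables owniq and fostiq come from Appendix 1.

```sas
/* Sketch only: assumes dataset ONE from the program above already exists. */
/* STDONE is a hypothetical name for the standardized copy of the data.    */

/* Re-express both variables so each has mean 0 and sd 1 */
proc standard data=one mean=0 std=1 out=stdone;
  var owniq fostiq;
run;

/* Regress standardized FostIQ on standardized OwnIQ.          */
/* The slope should equal r and the intercept should be near 0 */
proc reg data=stdone;
  model fostiq = owniq;
run;
quit;
```

Note that proc standard overwrites the values of owniq and fostiq in the output dataset and does not create new variables, so in stdone these two names play the role of SOwnIQ and SFostIQ from the slides.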

Appendix 2: My $0.02 on the need for randomized trials in education

Glossary terms included in Unit 2

  • Aggregate data

  • Causality

  • Confounder

  • Correlation

  • Reciprocal causation

  • Spurious

  • Standardization

  • Transformation
