Introduction to Causal Inference in the Social Sciences

Yu Xie

The University of Michigan

Causal Questions

  • Example A: Is UM’s affirmative action policy educationally beneficial to its students?

  • Example B: Did the war in Iraq help or harm world peace in the long run?

  • A causal question is a simple question involving the relationship between two theoretical concepts: a cause and an effect.

  • Cause => Effect?

  • Or, X => Y?

Centrality of Causality in Social Science

  • Establishing causality has been a primary aim of all sciences (from Aristotle to modern genetics).

  • Understanding of causal relationships leads to accurate predictions of the future.

  • It provides the scientific basis for policy intervention.

  • It advances our theoretical knowledge of the world.

Evaluation Research

  • In high demand by policy makers.

  • Definition: Evaluation research, or program evaluation, refers to the kind of applied social research that attempts to evaluate the effectiveness of social programs.

  • Key to all evaluation research is causal inference, i.e., evaluating the effectiveness of programs.

Simple Comparisons

  • One simple way is to compare units of analysis affected by the program to those unaffected by the program.

  • Say in a community, N1 children attended Head Start and N2 did not. 27 years later, measure the educational attainment of the two groups: y1 (mean outcome among those who attended Head Start) and y2 (mean outcome among those who did not).

Simple Comparisons (Continued)

  • We compute the difference in means: y1 - y2 = 13 - 14 = -1.

  • Should we conclude from this that Head Start has a negative effect on educational attainment?

  • The Westinghouse report of the late 1960s reached a similarly negative conclusion.

  • The appropriate comparison is not between observed y1 and observed y2.

What Might be Wrong with Simple Comparison?

  • The observed, bivariate relationship between Head Start participation and educational outcomes may be negative.






[Diagram: Head Start participation and educational outcomes]

UC-Berkeley Graduate Admission Data

  • Two-way table of sex and admission outcome: in the aggregate, men were admitted at a noticeably higher rate than women. [Table not preserved in transcript.]

What is Going On?




What Should We Conclude from the UC-Berkeley Data?

  • There is a strong segregation of major by sex.

  • Admission rates vary greatly by major: low for majors to which women disproportionately applied, high for majors to which men disproportionately applied.

  • If there is anything, women seem to have an advantage in major A.
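The pattern in the UC-Berkeley discussion is Simpson's paradox. A small sketch with made-up counts (not the actual Berkeley data) reproduces it: within each major, women are admitted at an equal or higher rate, yet the aggregate comparison favors men, because women apply disproportionately to the major with the low admission rate.

```python
# Hypothetical admissions counts (made up for illustration, not the actual
# UC-Berkeley data): within each major women are admitted at an equal or
# higher rate, yet the aggregate rate favors men, because women apply
# disproportionately to the major with the low admission rate.
applicants = {
    # major: {sex: (admitted, applied)}
    "A (high admission rate)": {"men": (80, 100), "women": (18, 20)},
    "B (low admission rate)": {"men": (10, 50), "women": (30, 150)},
}

totals = {"men": [0, 0], "women": [0, 0]}
for major, by_sex in applicants.items():
    for sex, (adm, app) in by_sex.items():
        totals[sex][0] += adm
        totals[sex][1] += app
        print(f"major {major}, {sex}: {adm / app:.0%} admitted")

for sex, (adm, app) in totals.items():
    print(f"overall, {sex}: {adm / app:.0%} admitted")
```

Controlling for major (the confounder) reverses the aggregate conclusion, which is exactly the lesson of the Berkeley example.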

More Examples

  • Does cohabitation decrease or increase the likelihood of divorce?

  • Is it better to have more siblings or fewer siblings for educational attainment?

  • What is the earnings return to college education?

Causal Effect as a Counter-Factual Question

  • For causal inference, one should ask the counter-factual question: for those who received “treatment,” what would have happened to them if they had not been treated?

    • Or, y1t - y1c (t denoting treatment; c denoting control)

    • Note that y1t is observed, but y1c is not.

Causal Effect as a Counter-Factual Question (Continued)

  • For those who did not receive treatment, what would have happened to them if they had been treated?

    • Or, y2t - y2c (t denoting treatment; c denoting control)

    • Note that y2c is observed, but y2t is not.

  • The problem is one of missing data.
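The missing-data structure can be made concrete with a few hypothetical units: each has two potential outcomes, but only the one matching its actual treatment status is ever observed.

```python
# Potential-outcomes bookkeeping for four hypothetical units: each unit has
# both a treated outcome y_t and a control outcome y_c, but only the one
# matching its actual treatment status D is ever observed.
units = [
    # (D, y_t, y_c) -- made-up values
    (1, 14.0, 13.0),
    (1, 12.0, 12.5),
    (0, 15.0, 13.0),
    (0, 11.0, 11.0),
]

for d, y_t, y_c in units:
    observed = y_t if d == 1 else y_c
    missing = "y_c" if d == 1 else "y_t"
    print(f"D={d}: observed Y = {observed}, counterfactual {missing} missing")
```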

Assumption for Simple Comparison

  • If subjects who are treated are, on average, “comparable” to subjects who are untreated (which can be achieved by randomization), we can assume away the problem by averaging:

  • E(y1c) = E(y2c), E(y1t) = E(y2t)

  • In that case,

  • E(y1t - y1c) = E(y2t - y2c) = E(y1t - y2c)

  • I.e., the simple comparison is valid.
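A quick simulation (hypothetical data-generating process) illustrates why: under randomized assignment, the simple difference in observed group means recovers the true average effect.

```python
import random

random.seed(0)

# Sketch: under randomized assignment (hypothetical data-generating process),
# each subject has potential outcomes y_c and y_t = y_c + true_effect; a coin
# flip decides which one is observed. The simple difference in observed group
# means then recovers the true average effect.
true_effect = 2.0
sums = {"t": [0.0, 0], "c": [0.0, 0]}
for _ in range(100_000):
    y_c = random.gauss(10.0, 3.0)                  # control potential outcome
    y_t = y_c + true_effect                        # treated potential outcome
    group = "t" if random.random() < 0.5 else "c"  # randomization
    sums[group][0] += y_t if group == "t" else y_c
    sums[group][1] += 1

naive = sums["t"][0] / sums["t"][1] - sums["c"][0] / sums["c"][1]
print(f"simple-comparison estimate: {naive:.2f} (true effect {true_effect})")
```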

Observable Selectivity Bias

  • If subjects who receive treatment and those who do not are different only in observed characteristics, this type of selectivity is called observable selectivity.

  • This problem can be handled by statistical controls in multivariate analysis to make the two groups comparable (or, differences between the two groups are “ignorable” conditional on covariates).

  • Failure to control for such observed differences is often called “omitted variable bias.”

  • This is the basis for multivariate analysis.

Conditions for Omitted Variable Bias

  • (1) Correlation Condition: The omitted variable is correlated with the independent variable of primary interest;

  • (2) Relevance Condition: The omitted variable affects the dependent variable.

  • If either of the two conditions is not met, an omitted variable does not introduce a bias.

  • E.g., wedding expenses have been found to have a positive effect on marital stability. Could this be due to omitted variable biases?
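The two conditions can be checked by simulation. In the sketch below (all coefficients hypothetical), Y is regressed on X while Z is omitted; the estimated slope is biased away from the true value of 1.0 only when Z both correlates with X and affects Y.

```python
import random

random.seed(1)

# Sketch (hypothetical coefficients): Y is regressed on X while Z is
# omitted. The true coefficient on X is 1.0; the estimate is biased only
# when Z is correlated with X AND Z affects Y.

def ols_slope(xs, ys):
    # Bivariate OLS slope: cov(x, y) / var(x).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sum((x - mx) ** 2 for x in xs)

def simulate(corr_with_x, affects_y, n=50_000):
    xs, ys = [], []
    for _ in range(n):
        z = random.gauss(0, 1)                                  # omitted variable
        x = (z if corr_with_x else 0.0) + random.gauss(0, 1)    # correlation condition
        y = x + (z if affects_y else 0.0) + random.gauss(0, 1)  # relevance condition
        xs.append(x)
        ys.append(y)
    return ols_slope(xs, ys)

print(simulate(corr_with_x=True, affects_y=True))    # biased (well above 1)
print(simulate(corr_with_x=True, affects_y=False))   # close to 1: relevance fails
print(simulate(corr_with_x=False, affects_y=True))   # close to 1: correlation fails
```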

Unobservable Selectivity

  • The more difficult problem is to deal with selectivity in unmeasured characteristics.

  • Two situations:

    • (1) “Heterogeneity” Bias: the two groups are systematically different due to predetermined unobservables. E.g., ability in human capital models.

    • (2) “Endogeneity” Bias: the effects of program participation are different between the two groups. E.g., self-selection.

  • Difficult to handle. Statistical models require strong and implausible assumptions.

Review

  • The population P is divided into two subpopulations: P1 if Di = 1, P0 if Di = 0.

  • Use the following notations:

    • q = proportion of P0 in P

    • E(Y1T) = E(YT|D=1) , E(Y1C) = E(YC|D=1)

    • E(Y0T) = E(YT|D=0) , E(Y0C) = E(YC|D=0)

  • By total expectation rule:

    • E(YT - YC) = E(Y1T - Y1C)(1 - q) + E(Y0T - Y0C)q = E(Y1T - Y0C) - E(Y1C - Y0C) - (d1 - d0)q,

      where d1 = E(Y1T - Y1C), d0 = E(Y0T - Y0C).

  • Or:

    • E(Y1T - Y0C) = E(YT - YC) + E(Y1C - Y0C) + (d1 - d0)q.
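The decomposition can be verified numerically with arbitrary hypothetical subgroup means:

```python
# Numeric check of the decomposition above, using arbitrary hypothetical
# subgroup means and q = proportion of P0 in P.
q = 0.4
E_Y1T, E_Y1C = 12.0, 10.0   # treated subpopulation (D = 1)
E_Y0T, E_Y0C = 9.0, 8.0     # untreated subpopulation (D = 0)

d1 = E_Y1T - E_Y1C          # effect among the treated
d0 = E_Y0T - E_Y0C          # effect among the untreated

# Total expectation rule: E(YT - YC) averages the subgroup effects.
lhs = (1 - q) * d1 + q * d0

# Identity from the slide.
rhs = (E_Y1T - E_Y0C) - (E_Y1C - E_Y0C) - (d1 - d0) * q
print(lhs, rhs)
```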

Experimental Approach

  • Experimental design eliminates both types of problems.

  • Example: High/Scope Perry Preschool study conducted in Ypsilanti.

  • Manski and Garfinkel (1992): experimental designs suffer from shortcomings that are often overlooked.

  • Manski and Garfinkel refer to experimental approach as “reduced-form.”

Shortcomings of Experimental Approach

  • We cannot always extrapolate results from an experimental setting to a natural setting.

  • Thus, Manski and Garfinkel openly criticize experimental designs: "In fact, reduced-form experimental evaluation actually requires that a highly specific and suspect structural assumption hold: Individuals and organizations must respond in the same way to the experimental version of a program as they would to the actual version." (p. 17)

  • I.e., experiments may lack “external validity.”

Structural Approach

  • Manski and Garfinkel propose the "structural" approach as an alternative.

  • Definition: structural approach refers to statistical methods that model causal processes based on observational data.

  • Head Start example: control for SES, parental involvement, etc.

  • Requires strong social science theories.

Structural vs. Reduced-Form Equations

  • 1. Structural Equations: theoretically derived equations that often have endogenous variables as independent variables.

  • 2. Reduced-Form Equations: equations in which all independent variables are exogenous. I.e., in reduced-form equations, we purposely ignore intermediate variables.
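The distinction can be illustrated with a toy two-equation system (hypothetical coefficients): substituting the structural equation for the intermediate variable M yields the reduced form, in which Y depends only on the exogenous X.

```python
# Toy system with hypothetical coefficients. Structural equations:
#   M = a * X          (intermediate, endogenous variable)
#   Y = b * X + c * M  (M appears as an independent variable)
# Substituting the first into the second gives the reduced form, which
# purposely ignores the intermediate variable:
#   Y = (b + c * a) * X
a, b, c = 0.5, 1.0, 2.0

def structural(x):
    m = a * x
    return b * x + c * m

def reduced_form(x):
    return (b + c * a) * x

for x in (1.0, 3.0, -2.0):
    assert structural(x) == reduced_form(x)
print("structural and reduced-form predictions agree")
```

The reduced form gives the same total effect of X on Y but says nothing about the mechanism running through M, which is the trade-off the slides describe.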

Comparison of the two Approaches

Advantages of Structural Approach:

  • Since it is conducted in a natural setting, its findings are directly relevant to the whole population. In contrast, results from an experimental design need to be extrapolated.

  • It is less costly. In contrast, experimental research is very expensive.

  • It builds upon and contributes to theory. In contrast, the reduced-form approach only yields simple answers to simple questions.

Advantages of Reduced-form Approach

  • Biases due to unobservables can be eliminated through randomization.

  • It requires fewer assumptions.

  • It does not require complicated statistical models that the public and government officials have difficulty understanding.

Research Design Approaches

  • Quasi-Experiment

    • Utilizing spatial variation

    • Utilizing temporal variation

  • Clustering Design

    • Fixed effects model

  • Instrumental-Variable Estimation

    • Special type of structural approach

Example: Quasi-Experiment Design Utilizing Spatial Variation

  • Certain policies are introduced in State A but not in State B.

    • States A and B are otherwise comparable.

    • Observe how outcome Y differs between State A and State B.

  • The pace of economic reforms in China differs greatly by region.

    • Associate regional variation in returns to education with regional variation in the depth of economic reforms.

Example: Quasi-Experiment Design Utilizing Temporal Variation

  • Declining significance of race?

    • Examine temporal changes in SES differences by race

    • Hope to see a narrowing of racial gaps, particularly after the civil rights movement.

  • Effect of a new instructional method: compare student outcomes before and after its introduction.

Propensity Score

  • The propensity score P(z) is the probability of treatment given a vector of observed covariates z; it is a balancing score that summarizes group differences on all covariates in a single number.

  • We can estimate P(z) through a logit model:

  • logit(P) = b’z.

  • Under the assumption of no other relevant factors, group T and group C are comparable within levels of the estimated propensity score.

  • Different uses of propensity score: stratification, matching, regression covariates.
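A sketch of the stratification use (hypothetical data-generating process): treatment assignment and the outcome both depend on a covariate z, so the naive comparison is confounded, but averaging within-stratum differences across propensity strata removes the observable bias.

```python
import math
import random

random.seed(2)

# Sketch of propensity-score stratification (hypothetical data-generating
# process): treatment assignment and the outcome both depend on covariate z,
# so the naive comparison is confounded.
true_effect = 1.0
data = []
for _ in range(200_000):
    z = random.gauss(0, 1)
    p = 1 / (1 + math.exp(-z))           # propensity score: logit(P) = z
    d = 1 if random.random() < p else 0  # treatment assignment
    y = 2.0 * z + true_effect * d + random.gauss(0, 1)
    data.append((p, d, y))

def mean(xs):
    return sum(xs) / len(xs)

naive = (mean([y for p, d, y in data if d == 1])
         - mean([y for p, d, y in data if d == 0]))

# Stratify on the propensity score; within strata, T and C are comparable.
effects, weights = [], []
for s in range(10):
    lo, hi = s / 10, (s + 1) / 10
    t = [y for p, d, y in data if lo <= p < hi and d == 1]
    c = [y for p, d, y in data if lo <= p < hi and d == 0]
    if t and c:
        effects.append(mean(t) - mean(c))
        weights.append(len(t) + len(c))

stratified = sum(e * w for e, w in zip(effects, weights)) / sum(weights)
print(f"naive: {naive:.2f}, stratified: {stratified:.2f} (true {true_effect})")
```

In practice P(z) is estimated (e.g., by the logit model on the slide) rather than known; the sketch uses the true score to isolate the stratification idea.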

Instrumental-Variable Approach

  • Condition: IV Z does not affect Y except through X, meaning:

    • Z affects Y only indirectly, through X, and not directly (the “exclusion restriction”).

    • Z is correlated with X, but not perfectly.

  • It’s very hard to find a good Z.
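A sketch of the simplest IV estimator, the Wald estimator with a binary instrument (hypothetical model): an unobserved confounder U biases the naive regression of Y on X, but Z satisfies the exclusion restriction, so the ratio of the reduced-form effect of Z on Y to the first-stage effect of Z on X recovers the causal coefficient.

```python
import random

random.seed(3)

# Sketch of the Wald (instrumental-variable) estimator with a binary
# instrument Z (hypothetical model): unobserved U confounds X and Y, so a
# naive regression of Y on X is biased, but Z shifts X and enters Y only
# through X (the exclusion restriction).
true_effect = 1.0
xs, ys, zs = [], [], []
for _ in range(200_000):
    u = random.gauss(0, 1)        # unobserved confounder
    z = random.choice((0, 1))     # instrument
    x = 0.5 * z + u + random.gauss(0, 1)
    y = true_effect * x + u + random.gauss(0, 1)
    xs.append(x)
    ys.append(y)
    zs.append(z)

def mean(v):
    return sum(v) / len(v)

# Wald estimator: reduced-form effect of Z on Y over first-stage effect of Z on X.
y_diff = (mean([y for y, z in zip(ys, zs) if z == 1])
          - mean([y for y, z in zip(ys, zs) if z == 0]))
x_diff = (mean([x for x, z in zip(xs, zs) if z == 1])
          - mean([x for x, z in zip(xs, zs) if z == 0]))
wald = y_diff / x_diff
print(f"IV (Wald) estimate: {wald:.2f} (true {true_effect})")
```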





Example: Fixed Effects Model

  • Sibling models

    • Family SES, environment are shared

      • Yi1 = b0 + b1 Xi1 + ai + ei1

      • Yi2 = b0 + b1 Xi2 + ai + ei2

    • Take the difference between the two equations:

      • Yi2 - Yi1 = b1 (Xi2 - Xi1) + (ei2 - ei1)

      • Resulting in a more robust equation: the constant b0 and the shared family term ai drop out.

    • Properties of the fixed effects approach:

      • All fixed characteristics are controlled

      • It wastes a lot of information

      • Unobserved heterogeneity is controlled at the group level (fixed effects)
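The sibling first-difference estimator can be sketched as follows (hypothetical coefficients): the pooled comparison is biased because the shared family component ai is correlated with X, while the differenced equation recovers b1.

```python
import random

random.seed(4)

# Sketch of the sibling fixed-effects (first-difference) estimator with
# hypothetical coefficients: the shared family component a_i is correlated
# with X, so pooled OLS is biased, but differencing the two siblings'
# equations removes b0 and a_i.
b0, b1 = 5.0, 1.0
dx, dy = [], []
pooled_x, pooled_y = [], []
for _ in range(100_000):
    a = random.gauss(0, 2)           # shared family effect a_i
    x1 = a + random.gauss(0, 1)      # sibling X correlated with a_i
    x2 = a + random.gauss(0, 1)
    y1 = b0 + b1 * x1 + a + random.gauss(0, 1)
    y2 = b0 + b1 * x2 + a + random.gauss(0, 1)
    dx.append(x2 - x1)
    dy.append(y2 - y1)
    pooled_x += [x1, x2]
    pooled_y += [y1, y2]

def slope(xs, ys):
    # Bivariate OLS slope: cov(x, y) / var(x).
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sum((x - mx) ** 2 for x in xs)

print(f"pooled OLS slope: {slope(pooled_x, pooled_y):.2f}")        # biased upward
print(f"first-difference slope: {slope(dx, dy):.2f} (true {b1})")
```

The "wastes a lot of information" point is visible here: differencing discards all between-family variation in X and uses only within-family contrasts.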

Heckman’s Selection Model

Latent rule (standard setup): D* = g’z + u; D = 1 if D* > 0, else D = 0. Selection bias arises when u is correlated with the error of the outcome equation.

Different Quantities of Interest

  • Necessary because of the “variability principle.”

  • Treatment effects differ across population elements and thus can vary across subgroups.

Different Quantities of Interest (Continued)

  • ATE=average treatment effect:

    • E(Yt-Yc)

  • ATT = average treatment effect on the treated (Heckman):

    • E(Yt-Yc|D=1)

  • LATE = local average treatment effect (in an experiment):

    • E(Yt-Yc|compliance=1)
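A small simulation (hypothetical selection rule) shows how these quantities diverge when effects are heterogeneous and units select on their gains: the ATT exceeds the ATE.

```python
import random

random.seed(5)

# Sketch: heterogeneous individual effects with selection on anticipated
# gains (hypothetical rule): units whose individual effect delta exceeds
# the mean are far more likely to take the treatment, so ATT > ATE.
deltas, treated = [], []
for _ in range(200_000):
    delta = random.gauss(1.0, 1.0)   # individual treatment effect
    p = 0.8 if delta > 1.0 else 0.2  # selection on anticipated gains
    treated.append(1 if random.random() < p else 0)
    deltas.append(delta)

ate = sum(deltas) / len(deltas)  # E(Yt - Yc)
att = (sum(d_i for d_i, t in zip(deltas, treated) if t == 1)
       / sum(treated))           # E(Yt - Yc | D = 1)
print(f"ATE: {ate:.2f}, ATT: {att:.2f}")
```

This is the same endogeneity logic as in the Xie and Wu conclusions below: when individuals select their treatment based on the anticipated outcome, the effect among the treated is not the effect for the whole population.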

Example of Xie and Wu (2005)

  • The research question: what is the causal effect of entry into market sector on earnings?

  • Two causal questions:

[Figure 1. Flow Chart of Labor Market Transitions in China, 1978 - 1996. Labels recoverable from the figure: State Sector (Experienced Workers, 1197; New Entrants to the State Sector, 522), Market Sector, Early Birds, Later Entrants.]

Figure 2. Market Treatment Effect on Earnings by Propensity Strata: Later Entrants vs. Stayers

Conclusions of Xie and Wu (2005)

  • No generic market effect on earnings.

    • First, only late transition into the market sector is associated with higher earnings.

    • Even among later entrants, the benefit of working in the market sector sharply decreases with the propensity of having made the transition.

  • These results illustrate endogeneity: individuals select their “treatment” based on the anticipated outcome, which is not homogeneous across workers.