
Richard Scheines Carnegie Mellon University

Searching for Statistical Causal Models: Theory and Practice. Richard Scheines, Carnegie Mellon University. Goals. Policy, Law, and Science: How can we use data to answer subjunctive questions (effects of future policy interventions), or


Presentation Transcript


  1. Searching for Statistical Causal Models: Theory and Practice. Richard Scheines, Carnegie Mellon University

  2. Goals • Policy, Law, and Science: How can we use data to answer: • subjunctive questions (effects of future policy interventions), • counterfactual questions (what would have happened had things been done differently; law), or • scientific questions (what mechanisms run the world)? • Rumsfeld Problem: Do we know what we don’t know? Can we tell when there is or is not enough information in the data to answer causal questions?

  3. Early Progenitors: Charles Spearman (1904). Statistical Constraints → Causal Structure. Variables: g, m1, m2, r1, r2. Tetrad constraint: ρ(m1,r1) · ρ(m2,r2) = ρ(m1,r2) · ρ(m2,r1)
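A minimal sketch of why a single common factor implies tetrad constraints, assuming standardized variables and a linear one-factor model. The loadings and the helper names (`loadings`, `implied_corr`, `tetrad_differences`) are hypothetical, not taken from the slide. Under these assumptions the correlation between two measures is the product of their loadings on g, so every tetrad difference vanishes.

```python
# Hypothetical standardized loadings of four measures on the common factor g.
loadings = {"m1": 0.8, "m2": 0.7, "r1": 0.6, "r2": 0.5}

def implied_corr(a, b):
    """Correlation implied by a one-factor model: product of the two loadings."""
    return loadings[a] * loadings[b]

def tetrad_differences(w, x, y, z):
    """Return the three tetrad differences among four measured variables."""
    r = implied_corr
    return (
        r(w, x) * r(y, z) - r(w, y) * r(x, z),
        r(w, x) * r(y, z) - r(w, z) * r(x, y),
        r(w, y) * r(x, z) - r(w, z) * r(x, y),
    )

diffs = tetrad_differences("m1", "m2", "r1", "r2")
print(diffs)  # all three are zero up to floating-point rounding
```

With sample correlations the differences would only be approximately zero, so a statistical test of the vanishing tetrad would be needed in practice.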

  4. Early Progenitors: Sewall Wright (1920s, 1930s). Graphical Model → Causal & Statistical Interpretation. Wright, S. (1934). The Method of Path Coefficients. Annals of Mathematical Statistics, 5, 161-215.

  5. Social Sciences: 1940s to 1970s • Factor Analysis • Instrumental Variable Estimators • Structural Equation Models • Simultaneous Equation Models • Economics: • Cowles Commission • Franklin Fisher • Art Goldberger • Clive Granger • Herb Simon • Haavelmo • R. Strotz • H. Wold • Sociology, Psychometrics, etc. • Hubert Blalock • Herb Costner • Otis Dudley Duncan • David Heise • David Kenny • Ken Bollen

  6. Population level Counterfactuals

  7. 1970s & 1980s: Graphical Models & Independence Structures • S. Lauritzen • J. Darroch • T. Speed • H. Kiiveri • N. Wermuth • D. Hausman • D. Papineau • P. Dawid • D. Cox • J. Robins • J. Whittaker • Judea Pearl 1988

  8. 1988 to 1993: Axioms, Intervention, and Latent Variable Model Search • P. Spirtes, C. Glymour, and R. Scheines • Causal Markov Axiom • Full model of interventions, both surgical and non-surgical • Equivalence classes for latent variable models, with search

  9. Modern Non-Parametric Theory of Statistical Causal Models. Diagram labels: Intervention & Manipulation; Graphical Models; Counterfactuals; Constraints (Independence); Causal Bayes Nets

  10. Semantics of SCMs • Choice 1: Take direct causation as primitive and axiomatize: Causal systems over V → Probabilistic Independence Relations in P(V) • Choice 2: Define direct causation in terms of intervention, i.e., (hypothetical) treatment

  11. Choice 1: Causal Markov Axiom. If G is a causal graph, and P a probability distribution over the variables in G, then <G,P> satisfies the Markov Axiom iff: every variable V is independent of its non-effects, conditional on its immediate causes.
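The axiom can be read off a graph mechanically. The sketch below is an illustration under assumptions, not the author's code: it uses the Education/Income/Longevity graph that appears later in the deck, encodes it as a hypothetical child-to-parents map, and lists for each variable the non-effects it must be independent of given its immediate causes.

```python
# Causal graph as a child -> immediate-causes map (encoding is illustrative).
parents = {
    "Education": [],
    "Income": ["Education"],
    "Longevity": ["Education"],
}

def descendants(node):
    """All effects of `node`, direct or indirect."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for child, pars in parents.items():
            if current in pars and child not in found:
                found.add(child)
                frontier.append(child)
    return found

def markov_independencies():
    """Map V -> (non-effects other than immediate causes, immediate causes)."""
    claims = {}
    for v in parents:
        non_effects = set(parents) - descendants(v) - {v} - set(parents[v])
        if non_effects:
            claims[v] = (sorted(non_effects), sorted(parents[v]))
    return claims

for v, (others, given) in markov_independencies().items():
    print(f"{v} _||_ {others} | {given}")
```

Immediate causes are removed from the left-hand side because conditioning on them makes that part of the claim trivial.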

  12. Causal Graphs. Causal Graph G = {V,E}. Each edge X → Y represents a direct causal claim: X is a direct cause of Y relative to V. Example graph: Chicken Pox

  13. Causal Graphs. Figure panels: Not Cause Complete; Common Cause Complete

  14. Interventions & Causal Graphs. Model an intervention by adding an “intervention” variable outside the original system as a direct cause of its target. Example: intervene on Income. Figure panels: Pre-intervention graph; “Soft” Intervention; “Hard” Intervention

  15. Structural Equation Models. Causal Graph + Statistical Causal Model: 1. Structural Equations 2. Statistical Constraints

  16. Structural Equation Models (Causal Graph) • Structural Equations: one assignment equation for each variable V: V := f(parents(V), error_V); for a linear SEM (linear regression), f is a linear function • Statistical Constraints: joint distribution over the error terms

  17. Structural Equation Models. Causal Graph; SEM Graph (path diagram). Equations: Education := ε_ed; Income := β1·Education + ε_Income; Longevity := β2·Education + ε_Longevity. Statistical Constraints: (ε_ed, ε_Income, ε_Longevity) ~ N(0, Σ²), Σ² diagonal, no variance is zero
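A minimal simulation sketch of this linear SEM, assuming unit-variance Gaussian errors and hypothetical coefficient values (`b_income`, `b_longevity` and the function names are illustrative). It shows that Income and Longevity end up correlated only through their common cause, Education.

```python
import random

def simulate_sem(n, b_income=0.5, b_longevity=0.3, seed=0):
    """Draw n cases by running the assignment equations in causal order."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        education = rng.gauss(0, 1)                            # Education := e_ed
        income = b_income * education + rng.gauss(0, 1)        # independent error
        longevity = b_longevity * education + rng.gauss(0, 1)  # independent error
        rows.append((education, income, longevity))
    return rows

def covariance(xs, ys):
    """Unbiased sample covariance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

rows = simulate_sem(20000)
education, income, longevity = zip(*rows)
cov_income_longevity = covariance(income, longevity)
# Implied value under these assumptions: b_income * b_longevity * Var(Education) = 0.15
print(cov_income_longevity)
```

The diagonal error covariance is what makes the product rule hold; correlated errors would add a second source of association.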

  18. Two Routes to the Causal Markov Condition • Assumption 1: Weak Causal Markov Assumption: V1, V2 causally disconnected ⇒ V1 _||_ V2 • Assumption 2b: Determinism, e.g., Structural Equations: for each Vi ∈ V, Vi := f(parents(Vi)) • Assumption 2a: Causal Markov Axiom

  19. Choice 2: Define Direct Causation from Intervention. X is a direct cause of Y relative to S iff ∃ z, x1 ≠ x2 such that P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z), where Z = S - {X,Y}. X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2)

  20. Modularity of Intervention/Manipulation. Causal Graph. Structural Equations: Education := ε_ed; Longevity := f1(Education) + ε_Longevity; Income := f2(Education) + ε_Income. Manipulated Causal Graph (with M1). Manipulated Structural Equations: Education := ε_ed; Longevity := f1(Education) + ε_Longevity; Income := f3(M1)
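Modularity can be sketched as replacing exactly one assignment equation and leaving the rest untouched. Everything concrete below is an assumption for illustration: the linear forms standing in for f1 and f2, the choice of M1 as a coin flip with f3 as the identity, and the helper names `manipulate`, `sample`, and `covariance`.

```python
import random

# Hypothetical linear mechanisms standing in for the slide's f1 and f2.
structural = {
    "Education": lambda v, rng: rng.gauss(0, 1),
    "Longevity": lambda v, rng: 0.3 * v["Education"] + rng.gauss(0, 1),
    "Income": lambda v, rng: 0.5 * v["Education"] + rng.gauss(0, 1),
}
ORDER = ["Education", "Longevity", "Income"]  # causes before effects

def manipulate(equations, target, new_equation):
    """Return a copy in which only the target's equation is replaced."""
    manipulated = dict(equations)
    manipulated[target] = new_equation
    return manipulated

def sample(equations, n, seed=0):
    """Generate n cases by evaluating the equations in causal order."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        values = {}
        for name in ORDER:
            values[name] = equations[name](values, rng)
        data.append(values)
    return data

def covariance(data, a, b):
    """Unbiased sample covariance of two named variables."""
    ma = sum(d[a] for d in data) / len(data)
    mb = sum(d[b] for d in data) / len(data)
    return sum((d[a] - ma) * (d[b] - mb) for d in data) / (len(data) - 1)

# Hard intervention: Income is set by M1 alone (here a fair coin over -1 and 1).
manipulated = manipulate(structural, "Income", lambda v, rng: rng.choice([-1.0, 1.0]))
before = covariance(sample(structural, 20000), "Education", "Income")
after = covariance(sample(manipulated, 20000), "Education", "Income")
print(before, after)
```

The Education-Income association disappears after the manipulation, while the Longevity equation is the same object in both models.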

  21. Manipulation --> Causal Markov • Manipulation conception of causation and Modularity --> weak version of CMA • Zhang, Jiji and Spirtes, Peter (2007). Detection of Unfaithfulness and Robust Causal Inference. In LSE-Pitt Conference: Confirmation, Induction and Science (London, 8 - 10 March, 2007). http://philsci-archive.pitt.edu/archive/00003188/01/Detection_of_Unfaithfulness_and_Robust_Causal_Inference.pdf

  22. SCM Search. Diagram labels: Statistical Data; Background Knowledge (e.g., X2 before X3); Statistical Inference; Causal Structure

  23. Faithfulness • Constraints on a probability distribution P generated by a causal structure G hold for all parameterizations of G. Example: Revenues = a·Rate + c·Economy + ε_Rev; Economy = b·Rate + ε_Econ. Faithfulness: a ≠ -bc
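A worked check of the faithfulness condition for these two equations, assuming Rate is exogenous and independent of both error terms. Substituting the Economy equation into the Revenues equation gives Cov(Rate, Revenues) = (a + bc)·Var(Rate), so the association vanishes exactly when a = -bc. The function name and the parameter values are illustrative.

```python
def cov_rate_revenues(a, b, c, var_rate=1.0):
    """Covariance of Rate and Revenues implied by the two structural equations."""
    # Direct path contributes a, the path through Economy contributes b * c.
    return (a + b * c) * var_rate

faithful = cov_rate_revenues(a=0.5, b=-0.4, c=0.5)    # a != -bc
unfaithful = cov_rate_revenues(a=0.2, b=-0.4, c=0.5)  # a == -bc, paths cancel
print(faithful, unfaithful)
```

In the second case Rate is a cause of Revenues yet the two are uncorrelated, which is the situation the faithfulness assumption rules out.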

  24. Equivalence Classes • Equivalence: • Independence (M1 ⊨ X _||_ Y | Z ⇔ M2 ⊨ X _||_ Y | Z) • Distribution (∀θ1 ∃θ2: M1(θ1) = M2(θ2)) • Independence (d-separation equivalence) • DAGs: Patterns • PAGs: Latent variable models • Intervention Equivalence Classes • Measurement Model Equivalence Classes • Linear Non-Gaussian Model Equivalence Classes
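For DAGs without latent variables, d-separation equivalence can be checked with the standard graphical criterion: two DAGs over the same variables are equivalent iff they have the same adjacencies and the same unshielded colliders. The sketch below assumes edge lists of (cause, effect) pairs; the function names and the three example graphs are illustrative.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected adjacencies of a DAG given as (cause, effect) pairs."""
    return {frozenset(edge) for edge in edges}

def unshielded_colliders(edges):
    """Triples (a, child, b) with a -> child <- b and a, b not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    colliders = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                colliders.add((a, child, b))
    return colliders

def d_separation_equivalent(edges1, edges2):
    """Same skeleton and same unshielded colliders."""
    return (skeleton(edges1) == skeleton(edges2)
            and unshielded_colliders(edges1) == unshielded_colliders(edges2))

chain = [("X", "Y"), ("Y", "Z")]
fork = [("Y", "X"), ("Y", "Z")]
collider = [("X", "Y"), ("Z", "Y")]
print(d_separation_equivalent(chain, fork))      # True
print(d_separation_equivalent(chain, collider))  # False
```

The chain and the fork belong to one pattern; the collider is alone in its class, which is why its orientation can be learned from independence data.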

  25. Patterns

  26. Patterns: What the Edges Mean

  27. PAGs: Partial Ancestral Graphs

  28. PAGs: Partial Ancestral Graphs What PAG edges mean.

  29. Overview of Search Methods • Constraint-Based Searches: TETRAD (SGS, PC, FCI); very fast (max N ~ 1,000); pointwise consistent • Scoring Searches: scores (BIC, AIC, etc.); search strategies (hill climbing, genetic algorithms, simulated annealing); difficult to extend to latent variable models; Meek and Chickering Greedy Equivalence Search (GES); very slow (max N ~ 30-40)

  30. Tetrad Demo http://www.phil.cmu.edu/projects/tetrad_download/

  31. Case Study 1: Foreign Investment. Does Foreign Investment in 3rd World Countries cause Repression? Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49, 141-146. N = 72. Variables: PO = degree of political exclusivity; CV = lack of civil liberties; EN = energy consumption per capita (economic development); FI = level of foreign investment

  32. Case Study 1: Foreign Investment. Correlations:
            po      fi      en
      fi   -.175
      en   -.480   0.330
      cv   0.868   -.391   -.430

  33. Case Study 1: Foreign Investment. Regression Results: po = .227*fi - .176*en + .880*cv
            fi       en       cv
      SE   (.058)   (.059)   (.060)
      t    3.941    -2.99    14.6
      Interpretation: an increase in foreign investment increases political exclusion
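The regression coefficients can be recovered from the correlation table on the previous slide, assuming they are standardized coefficients: solve R_xx · β = r_xy, where R_xx holds the correlations among fi, en, cv and r_xy their correlations with po. The small `solve` helper is a generic Gauss-Jordan routine written for this sketch, not part of any library.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Correlations among the predictors, in the order fi, en, cv.
r_xx = [
    [1.0, 0.330, -0.391],
    [0.330, 1.0, -0.430],
    [-0.391, -0.430, 1.0],
]
# Correlations of each predictor with po.
r_xy = [-0.175, -0.480, 0.868]

beta_fi, beta_en, beta_cv = solve(r_xx, r_xy)
print(beta_fi, beta_en, beta_cv)  # should land close to the slide's .227, -.176, .880
```

Note that fi and po are negatively correlated marginally (-.175) while the fi coefficient is positive, which is why the causal reading of the regression deserves scrutiny.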

  34. Case Study 1: Foreign Investment Alternatives No model with testable constraint (df > 0) in which FI has a positive effect on PO

  35. Case Study 2: Welfare Reform. Single Mothers’ Self-Efficacy, Parenting in the Home Environment, and Children’s Development in a Two-Wave Study (Social Work Research, 29, 1, 7-20). Aurora Jackson, Richard Scheines

  36. Case Study 2: Welfare Reform Sampling Scheme • Longitudinal Data • Time 1: 1996-97 (N = 188) • Time 2: 1998-99 (N = 178) • Single black mothers in NYC • Current and former welfare recipients • With a child who was 3 – 5 at time 1, and 6 to 8 at time 2

  37. Case Study 2: Welfare Reform Constructs/Scales/Measures • Employment Status • Perceived Self-efficacy • Depressive Symptoms • Quality of Mother/Father Relationship • Father/Child Contact • Quality of Home Environment • Behavior Problems • Cognitive Development

  38. Case Study 2: Welfare Reform Background Knowledge • Tier 1: • Employment Status • Tier 2: • Depression • Self-efficacy • Mother/Father Relationship • Father/Child Contact • Mother’s Parenting/HOME • Tier 3: • Negative Behaviors • Cognitive Development Over 22 million path models consistent with these constraints

  39. Case Study 2: Welfare Reform. Conceptual Model: χ² = 22.3, df = 20, p = .32. Tetrad Equivalence Class: χ² = 18.87, df = 19, p = .46
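The p-values reported with these fit statistics are upper-tail chi-square probabilities; a large p means the model's constraints are not rejected by the data. The sketch below computes them with the standard finite-series formulas for integer degrees of freedom, using only the standard library. The function name `chi2_sf` is illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return tail + 2 * density * total

print(chi2_sf(22.3, 20))   # conceptual model
print(chi2_sf(18.87, 19))  # Tetrad equivalence class
```

The same function applies to any chi-square fit statistic with integer degrees of freedom.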

  40. Case Study 2: Welfare Reform. Points of Agreement: • Mother’s Self-Efficacy mediates the effect of Employment on all other variables. • HOME environment mediates the effect of all other factors on outcomes: Cog. Develop and Prob. Behaviors. Points of Disagreement: • Depression: key cause (Conceptual Model) vs. only an effect (Tetrad)

  41. Case Study 3: Online Courseware Online Course in Causal & Statistical Reasoning

  42. Case Study 3: Online Courseware Variables • Pre-test (%) • Print-outs (% modules printed) • Quiz Scores (avg. %) • Voluntary Exercises (% completed) • Final Exam (%) • 9 other variables

  43. Case Study 3: Online Courseware. Printing and Voluntary Comprehension Checks: 2002 --> 2003 (figure panels: 2002, 2003)

  44. Case Study 4: Charitable Giving Variables • Tangibility/Concreteness (Exp manipulation) • Imaginability (likert 1-7) • Impact (avg. of 2 likerts) • Sympathy (likert) • Donation ($)

  45. Case Study 4: Charitable Giving. Theoretical Model: study 1 (N = 94): df = 5, χ² = 52.0, p = 0.0000; study 2 (N = 115): df = 5, χ² = 62.6, p = 0.0000

  46. Case Study 4: Charitable Giving. GES Outputs: study 1: df = 5, χ² = 5.88, p = 0.32; study 2: df = 5, χ² = 8.23, p = 0.14; study 1: df = 5, χ² = 3.99, p = 0.55; study 2: df = 5, χ² = 7.48, p = 0.18

  47. The Causal Theory Formation Problem for Latent Variable Models. Given observations on a number of variables, identify the latent variables that underlie these variables and the causal relations among these latent concepts. Example: Spectral measurements of solar radiation intensities. Variables are intensities at each measured frequency. Example: Quality of a Child’s Home Environment, Cumulative Exposure to Lead, Cognitive Functioning

  48. The Most Common Automatic Solution: Exploratory Factor Analysis • Chooses “factors” to account linearly for as much of the variance/covariance of the measured variables as possible. • Great for dimensionality reduction • Factor rotations are arbitrary • Gives no information about the statistical and thus the causal dependencies among any real underlying factors. • No general theory of the reliability of the procedure
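The point that factor rotations are arbitrary can be illustrated numerically. With orthogonal factors, the common-factor part of the covariance is Λ·Λᵀ, and rotating the loading matrix Λ by any angle leaves that product unchanged, so the data cannot distinguish the two solutions. The loadings and function names below are hypothetical.

```python
import math

def implied_covariance(loadings):
    """Common-factor part of the covariance, Lambda * Lambda^T."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in loadings]
            for row_i in loadings]

def rotate(loadings, angle):
    """Rotate a two-factor loading matrix by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[l1 * c - l2 * s, l1 * s + l2 * c] for l1, l2 in loadings]

# Hypothetical loadings of four measured variables on two factors.
loadings = [[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]]
rotated = rotate(loadings, math.radians(30))

original_cov = implied_covariance(loadings)
rotated_cov = implied_covariance(rotated)
max_gap = max(abs(a - b)
              for row_a, row_b in zip(original_cov, rotated_cov)
              for a, b in zip(row_a, row_b))
print(max_gap)  # zero up to floating-point rounding
```

The loadings themselves change substantially under rotation, so any interpretation of individual factors depends on a choice the covariance matrix does not determine.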

  49. Other Solutions • Independent Components, etc • Background Theory • Scales

  50. Other Solutions: Background Theory. Specified Model. Key Causal Question (figure). Thus, key statistical question: Lead _||_ Cog | Home?
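For linear Gaussian variables, a question like Lead _||_ Cog | Home is commonly tested with a partial correlation and Fisher's z transform. The sketch below is a generic version of that test, not the procedure used in the talk; the correlations and the sample size of 200 are made-up illustrative values.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def fisher_z_pvalue(r, n, n_conditioning=1):
    """Two-sided p-value for H0: partial correlation = 0 (normal approximation)."""
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = abs(z) * math.sqrt(n - n_conditioning - 3)
    return math.erfc(stat / math.sqrt(2))

# Hypothetical correlations: Lead-Home = -0.4, Home-Cog = 0.5.
# If Home screens off Lead from Cog, then Lead-Cog = (-0.4) * 0.5 = -0.2.
screened = partial_corr(r_xy=-0.2, r_xz=-0.4, r_yz=0.5)
# A stronger Lead-Cog correlation leaves association that Home does not explain.
residual = partial_corr(r_xy=-0.35, r_xz=-0.4, r_yz=0.5)

print(screened, fisher_z_pvalue(screened, n=200))
print(residual, fisher_z_pvalue(residual, n=200))
```

A small p-value in the second case would count against Lead _||_ Cog | Home; with latent variables and measurement error the test would need to be applied at the level of the latents.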
