1 / 61

Otis Dudley Duncan Memorial Lecture: THE RESURRECTION OF DUNCANISM

Otis Dudley Duncan Memorial Lecture: THE RESURRECTION OF DUNCANISM. Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/). OUTLINE. Duncanism = Causally Assertive SEM History: Oppression, Distortion, and Resurrection The Old-New Logic of SEM New Tools

marly
Download Presentation

Otis Dudley Duncan Memorial Lecture: THE RESURRECTION OF DUNCANISM

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Otis Dudley Duncan Memorial Lecture: THE RESURRECTION OF DUNCANISM Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/)

  2. OUTLINE • Duncanism = Causally Assertive SEM • History: Oppression, Distortion, and Resurrection • The Old-New Logic of SEM • New Tools • 4.1 Local testing • 4.2 Non-parametric identification • 4.3 Logic of counterfactuals • 4.3 Non linear mediation analysis

  3. Duncan book

  4. Duncan book

  5. FINDING INSTRUMENTAL VARIABLES Can you find a n instrument for identifying b34? (Duncan, 1975) By inspection: X2d-separate X1 from V Therefore X1 is a valid instrument

  6. DUNCANISM = ASSERTIVE SEM SEM – A tool for deriving causal conclusion from data and assumptions: • [y = bx + u] may be read as "a change in x or u produces a change in y” ... or “x and u are the causes of y” (Duncan, 1975, p. 1) • “[the disturbance]u stands for all other sources of variation in y” (ibid) • “doing the model consists largely in thinking about what kind of model one wants and can justify.” • (ibid, p. viii) • Assuming a model for sake of argument, we can express its properties in terms of correlations and (sometimes) find one or more conditions that must hold if the model is true“ (ibid, p. 20).

  7. HISTORY: BIRTH, OPPRESSION, DISTORTION, RESURRECTION • Birth and Development (1920 - 1980) • Sewell Wright (1921), Haavelmo (1943), Simon (1950), • Marschak (1950), Koopmans (1953), Wold and Strotz (1963), • Goldberger (1973), Blalock (1964), and Duncan (1969, 1975) • The regressional assault (1970-1990) • Causality is meaningless, therefore, to be meaningful, SEM • must be a regression technique, not “causal modeling.” • Richard (1980) Cliff (1983), Dempster (1990), Wermuth (1992), • and Muthen (1987)     • The Potential-outcome assault (1985-present) • Causality is meaningful but, since SEM is a regression • technique, it could not possibly have causal interpretation. • Rubin (2004, 2010), Holland (1986) Imbens (2009), and • Sobel (1996, 2008)

  8.  The mothers of all questions: Q. When would b equal a? A. When (uY X), read from the diagram REGRESSION VS. STRUCTURAL EQUATIONS (THE CONFUSION OF THE CENTURY) Regression (claimless, nonfalsifiable): Y = ax + Y Structural (empirical, falsifiable): Y = bx + uY Claim: (regardless of distributions): E(Y | do(x)) = E(Y | do(x), do(z)) = bx Q. When is b a partial regression? b = YX •  A. Shown in the diagram, Slide 40.

  9. THE POTENTIAL-OUTCOME ASSAULT (1985-PRESENT) • “I am speaking, of course, about the equation: • What does it mean? The only meaning I have ever • determined for such an equation is that it is a shorthand way • of describing the conditional distribution of {y}given {x}.” • (Holland 1995) • “The use of complicated causal-modeling software • [read SEM] rarely yields any results that have any • interpretation as causal effects.” • (Wilkinson and Task Force 1999 on Statistical Methods in • Psychology Journals: Guidelines and Explanations)

  10. THE POTENTIAL-OUTCOME ASSAULT (1985-PRESENT) (Cont) • In general (even in randomized studies), the structural • and causal parameters are not equal, implying that the • structural parameters should not be interpreted as effect.” • (Sobel 2008) • “Using the observed outcome notation entangles the • science...Bad! Yet this is exactly what regression • approaches, path analyses, directed acyclic graphs, • and so forth essentially compel one to do.” • (Rubin 1010)

  11. WHY SEM INTERPRETATION IS “SELF-CONTRADICTORY” D. Freedman JES (1987), p. 114, Fig. 3 "Now try the direct effect of Z on Y:  We intervene by fixing W and X but increasing Z by one unit; this should increase Y by d units.   However, this hypothetical intervention is self-contradictory, because fixing W and increasing Z causes an increase in X.   The oversight: Fixing XDISABLES equation (7.1);

  12. SEM REACTION TO FREEDMAN CRITICS Total Surrender: • It would be very healthy if more researchers abandoned • thinking of and using terms such as cause and effect • (Muthen, 1987). • “Causal modeling” is an outdated misnomer (Kelloway, 1998). • Causality-free, politically-safe vocabulary: “covariance • structure” “regression analysis,” or “simultaneous equations.” • [Causal Modeling] may be somewhat dated, however, as it • seems to appear less often in the literature nowadays” • (Kline, 2004, p. 9)

  13. SEM REACTION TO THE STRUCTURE-PHOBIC   ASSAULT NONE TO SPEAK OF! Galles, Pearl, Halpern (1998 - Logical equivalence) Heckman-Sobel Morgan Winship (1997) Gelman Blog

  14. THE RESURRECTION Why non-parametric perspective? What are the parameters all about and why we labor to identify them?Can we do without them?Consider: Only he who lost the parameters and needs to find substitutes can begin to ask:Do I really need them? What do they really mean? What role do the play?

  15. THE LOGIC OF SEM CAUSAL MODEL (MA) CAUSAL MODEL (MA) A - CAUSAL ASSUMPTIONS A* - Logical implications of A Causal inference Q Queries of interest Q(P) - Identified estimands T(MA) - Testable implications Statistical inference Data (D) Q - Estimates of Q(P) Goodness of fit Conditional claims Model testing

  16. P Joint Distribution Q(P) (Aspects of P) Data Inference TRADITIONAL STATISTICAL INFERENCE PARADIGM e.g., Infer whether customers who bought product A would also buy product B. Q = P(B | A)

  17. FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES Probability and statistics deal with static relations P Joint Distribution P Joint Distribution Q(P) (Aspects of P) Data change Inference What happens when P changes? e.g., Infer whether customers who bought product A would still buy Aif we were to double the price.

  18. FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES What remains invariant when Pchanges say, to satisfy P(price=2)=1 P Joint Distribution P Joint Distribution Q(P) (Aspects of P) Data change Inference Note: P(v)P (v | price = 2) P does not tell us how it ought to change Causal knowledge: what remains invariant

  19. Causal and statistical concepts do not mix. CAUSAL Spurious correlation Randomization / Intervention Confounding / Effect Instrumental variable Exogeneity / Ignorability Mediation STATISTICAL Regression Association / Independence “Controlling for” / Conditioning Odd and risk ratios Collapsibility / Granger causality Propensity score FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT)

  20. Causal and statistical concepts do not mix. CAUSAL Spurious correlation Randomization / Intervention Confounding / Effect Instrumental variable Exogeneity / Ignorability Mediation STATISTICAL Regression Association / Independence “Controlling for” / Conditioning Odd and risk ratios Collapsibility / Granger causality Propensity score • No causes in – no causes out (Cartwright, 1989) } statistical assumptions + data causal assumptions  causal conclusions FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS • Causal assumptions cannot be expressed in the mathematical language of standard statistics.

  21. Causal and statistical concepts do not mix. CAUSAL Spurious correlation Randomization / Intervention Confounding / Effect Instrumental variable Exogeneity / Ignorability Mediation STATISTICAL Regression Association / Independence “Controlling for” / Conditioning Odd and risk ratios Collapsibility / Granger causality Propensity score • No causes in – no causes out (Cartwright, 1989) } statistical assumptions + data causal assumptions  causal conclusions • Non-standard mathematics: • Structural equation models (Wright, 1920; Simon, 1960) • Counterfactuals (Neyman-Rubin (Yx), Lewis (xY)) FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS • Causal assumptions cannot be expressed in the mathematical language of standard statistics.

  22. THE STRUCTURAL MODEL PARADIGM Joint Distribution Data Generating Model Q(M) (Aspects of M) Data M Inference • M – Invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis. “Think Nature, not experiment!”

  23. e.g., STRUCTURAL CAUSAL MODELS • Definition: A structural causal model is a 4-tuple • V,U, F, P(u), where • V = {V1,...,Vn} are endogenous variables • U={U1,...,Um} are background variables • F = {f1,...,fn} are functions determining V, • vi = fi(v, u) • P(u) is a distribution over U • P(u) and F induce a distribution P(v) over observable variables

  24. U U X (u) Y (u) X = x YX (u) M Mx CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: “Y would be y (in unit u), had X beenx,” denoted Yx(u) = y, means: The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  25. The Fundamental Equation of Counterfactuals: CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: “Y would be y (in unit u), had X beenx,” denoted Yx(u) = y, means: The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  26. Joint probabilities of counterfactuals: In particular: CAUSAL MODELS AND COUNTERFACTUALS Definition: The sentence: “Y would be y (in situation u), had X beenx,” denoted Yx(u) = y, means: The solution for Y in a mutilated model Mx, (i.e., the equations for X replaced by X = x) with input U=u, is equal to y.

  27. 3-LEVEL HIERARCHY OF CAUSAL MODELS • Probabilistic Knowledge P(y | x) • Bayesian networks, graphical models • Interventional Knowledge P(y | do(x)) • Causal Bayesian Networks (CBN) • (Agnostic graphs, manipulation graphs) • Counterfactual Knowledge P(Yx = y,Yx=y) • Structural equation models, physics, • functional graphs, “Treatment assignment • mechanism”

  28. TWO PARADIGMS FOR CAUSAL INFERENCE Observed: P(X, Y, Z,...) Conclusions needed: P(Yx=y), P(Xy=x | Z=z)... How do we connect observables, X,Y,Z,… to counterfactuals Yx, Xz, Zy,… ? N-R model Counterfactuals are primitives, new variables Super-distribution Structural model Counterfactuals are derived quantities Subscripts modify the model and distribution

  29. ARE THE TWO PARADIGMS EQUIVALENT? • Yes (Galles and Pearl, 1998; Halpern 1998) • In the N-R paradigm, Yx is defined by consistency: • In SCM, consistency is a theorem. • Moreover, a theorem in one approach is a theorem in the other.

  30. THE FIVE NECESSARY STEPS OF CAUSAL ANALYSIS Define: Assume: Identify: Estimate: Test: Express the target quantity Q as a function Q(M) that can be computed from any model M. Formulate causal assumptions Ausing some formal language. Determine if Q is identifiable given A. Estimate Q if it is identifiable; approximate it, if it is not. Test the testable implications of A (if any).

  31. THE FIVE NECESSARY STEPS FOR EFFECT ESTIMATION Define: Assume: Identify: Estimate: Test: Express the target quantity Q as a function Q(M) that can be computed from any model M. Formulate causal assumptions Ausing some formal language. Determine if Q is identifiable given A. Estimate Q if it is identifiable; approximate it, if it is not. Test the testable implications of A (if any).

  32. 2. Counterfactuals: 3. Structural: X Z Y FORMULATING ASSUMPTIONS THREE LANGUAGES 1. English: Smoking (X), Cancer (Y), Tar (Z), Genotypes (U) Not too friendly: consistent? complete? redundant? arguable?

  33. IDENTIFYING CAUSAL EFFECTS IN POTENTIAL-OUTCOME FRAMEWORK Define: Assume: Identify: Estimate: Express the target quantity Q as a counterfactual formula, e.g., E(Y(1) – Y(0)) Formulate causal assumptions using the distribution: Determine if Q is identifiable using P* and Y=x Y (1) + (1 – x) Y (0). Estimate Q if it is identifiable; approximate it, if it is not.

  34. GRAPHICAL – COUNTERFACTUALS SYMBIOSIS Every causal graph expresses counterfactuals assumptions, e.g., X  Y Z • Missing arrows Y Z 2. Missing arcs Y Z • consistent, and readable from the graph. • Express assumption in graphs • Derive estimands by graphical or algebraic methods

  35. G IDENTIFICATION IN SCM Find the effect ofXonY, P(y|do(x)), given the causal assumptions shown inG, whereZ1,..., Zk are auxiliary variables. Z1 Z2 Z3 Z4 Z5 X Z6 Y CanP(y|do(x)) be estimated if only a subset,Z, can be measured?

  36. G Gx Moreover, (“adjusting” for Z) Ignorability ELIMINATING CONFOUNDING BIAS THE BACK-DOOR CRITERION P(y | do(x)) is estimable if there is a set Z of variables such thatZd-separates X from YinGx. Z1 Z1 Z2 Z2 Z Z3 Z3 Z4 Z5 Z5 Z4 X X Z6 Y Y Z6

  37. Z1 Z2 W1 Z3 W2 X W3 Y IDENTIFYING TESTABLE IMPLICATIONS Assumptions advertized in the missing edges are Z1–Z2, Z1– Y Implying: The missing edges imply: z = 0, b1 = 0, and c1 = 0. Software routines for automatic detection of all such tests reported in Kyono (2010)

  38. SEPARATION EQUIVALENCE  MODEL EQUIVALENCE

  39. FINDING INSTRUMENTAL VARIABLES Can you find a n instrument for identifying b34? (Duncan, 1975) By inspection: X2d-separate X1 from V Therefore X1 is a valid instrument

  40. L T Z ? Z T CONFOUNDING EQUIVALENCE WHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE W1 W4 W2 W3 V1 V2 X Y

  41. CONFOUNDING EQUIVALENCE WHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE • Definition: • T and Z are c-equivalent if • Definition (Markov boundary): • Markov boundary Sm of S (relative to X) is the minimal subset of S that d-separates X from all other members of S. • Theorem (Pearl and Paz, 2009) • Z and T are c-equivalent iff • Zm=Tm, or • Z and T are admissible (i.e., satisfy the back-door • condition)

  42. L T Z CONFOUNDING EQUIVALENCE WHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE W1 W4 Z T W2 W3 V1 V2 X Y

  43. T Z Z T CONFOUNDING EQUIVALENCE WHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE W1 W4 W2 W3 V1 V2 X Y

  44. Z T CONFOUNDING EQUIVALENCE WHEN TWO MEASUREMENTS ARE EQUALLY VALUABLE W1 W4 Z T W2 W3 V1 V2 X Y

  45. W1 {W1, W2} BIAS AMPLIFICATION BY INSTRUMENTAL VARIABLES W2 W1 U X Y • Adding W2 to Propensity Score increases bias (if such exists) • (Wooldridge, 2009) • In linear systems – always • In non-linear systems – almost always (Pearl, 2010) • Outcome predictors are safer than treatment predictors

  46. EFFECT DECOMPOSITION (direct vs. indirect effects) • Why decompose effects? • What is the definition of direct and indirect effects? • What are the policy implications of direct and indirect effects? • When can direct and indirect effect be estimated consistently from experimental and nonexperimental data?

  47. WHY DECOMPOSE EFFECTS? • To understand how Nature works • To comply with legal requirements • To predict the effects of new type of interventions: • Signal routing, rather than variable fixing

  48. Adjust for Z? No! No! • LEGAL IMPLICATIONS • OF DIRECT EFFECT Can data prove an employer guilty of hiring discrimination? X Z (Gender) (Qualifications) Y (Hiring) What is the direct effect of X on Y ? (averaged over z)

  49. NATURAL INTERPRETATION OF AVERAGE DIRECT EFFECTS Robins and Greenland (1992) – “Pure” X Z z = f (x, u) y = g (x, z, u) Y Natural Direct Effect of X on Y: The expected change in Y, when we change X from x0 to x1 and, for each u, we keep Z constant at whatever value it attained before the change. In linear models, DE = Controlled Direct Effect

  50. DEFINITION OF INDIRECT EFFECTS X Z z = f (x, u) y = g (x, z, u) Y Indirect Effect of X on Y: The expected change in Y when we keep X constant, say at x0, and let Z change to whatever value it would have attained had X changed to x1. In linear models, IE = TE - DE

More Related