slide1 n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Causal Discovery PowerPoint Presentation
Download Presentation
Causal Discovery

Loading in 2 Seconds...

play fullscreen
1 / 50

Causal Discovery - PowerPoint PPT Presentation


  • 154 Views
  • Uploaded on

Causal Discovery. Richard Scheines Peter Spirtes, Clark Glymour, and many others Dept. of Philosophy & CALD Carnegie Mellon. Outline. Motivation Representation Connecting Causation to Probability ( Independence ) Searching for Causal Models

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Causal Discovery' - lok


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

Causal Discovery

Richard Scheines

Peter Spirtes, Clark Glymour,

and many others

Dept. of Philosophy & CALD

Carnegie Mellon

Nov. 13th, 2003

outline
Outline
  • Motivation
  • Representation
  • Connecting Causation to Probability (Independence)
  • Searching for Causal Models
  • Improving on Regression for Causal Inference

Nov. 13th, 2003

1 motivation
1. Motivation

Non-experimental Evidence

TypicalPredictiveQuestions

  • Can we predict aggressiveness from the amount of violent TV watched
  • Can we predict crime rates from abortion rates 20 years ago

Causal Questions:

  • Does watching violent TV cause Aggression?
  • I.e., if we change TV watching, will the level of Aggression change?

Nov. 13th, 2003

causal estimation
Causal Estimation

Manipulated Probability P(Y | X set= x, Z=z)

from

Unmanipulated Probability P(Y | X = x, Z=z)

When and how can we use non-experimental data to tell us about the effect of an intervention?

Nov. 13th, 2003

2 representation
2. Representation
  • Association & causal structure -qualitatively
  • Interventions
  • StatisticalCausal Models
    • Bayes Networks
    • Structural Equation Models

Nov. 13th, 2003

causation association
Causation & Association

X and Y are associated (X _||_ Y) iff

x1  x2 P(Y | X = x1)  P(Y | X = x2)

Association is symmetric: X _||_ Y  Y _||_ X

X is a cause of Y iff

x1  x2 P(Y | X set= x1)  P(Y | X set= x2)

Causation is asymmetric: X Y  Y X

Nov. 13th, 2003

direct causation
Direct Causation

X is a direct cause of Y relative to S, iff

z,x1  x2 P(Y | X set= x1 , Zset=z)

 P(Y | X set= x2 , Zset=z)

where Z = S - {X,Y}

Nov. 13th, 2003

causal graphs
Causal Graphs

Causal Graph G = {V,E}

Each edge X  Y represents a direct causal claim:

X is a direct cause of Y relative to V

Chicken Pox

Nov. 13th, 2003

causal graphs1
Causal Graphs

Not Cause Complete

Common Cause Complete

Nov. 13th, 2003

slide10

Modeling Ideal Interventions

  • Ideal Interventions (on a variable X):
  • Completely determine the value or distribution of a variable X
  • Directly Target only X
    • (no “fat hand”)
    • E.g., Variables: Confidence, Athletic Performance
    • Intervention 1: hypnosis for confidence
    • Intervention 2: anti-anxiety drug (also muscle relaxer)

Nov. 13th, 2003

slide11

Modeling Ideal Interventions

Interventions on the Effect

Post

Pre-experimental System

Room Temperature

Sweaters

On

Nov. 13th, 2003

modeling ideal interventions
Modeling Ideal Interventions

Interventions on the Cause

Post

Pre-experimental System

Sweaters

On

Room Temperature

Nov. 13th, 2003

interventions causal graphs
Interventions & Causal Graphs
  • Model an ideal intervention by adding an “intervention” variable outside the original system
  • Erase all arrows pointing into the variable intervened upon

Intervene to change Inf

Post-intervention graph?

Pre-intervention graph

Nov. 13th, 2003

conditioning vs intervening
Conditioningvs. Intervening

P(Y | X = x1)vs. P(Y | X set= x1)Teeth Slides

Nov. 13th, 2003

causal bayes networks
Causal Bayes Networks

The Joint Distribution Factors

According to the Causal Graph,

i.e., for all X in V

P(V) = P(X|Immediate Causes of(X))

P(S = 0) = .7

P(S = 1) = .3

P(YF = 0 | S = 0) = .99 P(LC = 0 | S = 0) = .95

P(YF = 1 | S = 0) = .01 P(LC = 1 | S = 0) = .05

P(YF = 0 | S = 1) = .20 P(LC = 0 | S = 1) = .80

P(YF = 1 | S = 1) = .80 P(LC = 1 | S = 1) = .20

P(S,YF, L) = P(S) P(YF | S) P(LC | S)

Nov. 13th, 2003

structural equation models
Structural Equation Models

1. Structural Equations

2. Statistical Constraints

Causal Graph

Statistical Model

Nov. 13th, 2003

structural equation models1
Structural Equation Models
  • Structural Equations:

One Equation for each variable V in the graph:

V = f(parents(V), errorV)

for SEM (linear regression) f is a linear function

  • Statistical Constraints:

Joint Distribution over the Error terms

Causal Graph

Nov. 13th, 2003

structural equation models2

Causal Graph

SEM Graph

(path diagram)

Structural Equation Models

Equations:

Education = ed

Income =Educationincome

Longevity =EducationLongevity

Statistical Constraints:

(ed, Income,Income ) ~N(0,2)

2diagonal

- no variance is zero

Nov. 13th, 2003

slide20

The Markov Condition

Causal

Structure

Statistical Predictions

Causal Markov Axiom

Independence

X _||_ Z | Y

i.e.,

P(X | Y) = P(X | Y, Z)

Causal Graphs

Nov. 13th, 2003

causal markov axiom
Causal Markov Axiom

If G is a causal graph, and P a probability distribution over the variables in G, then in P:

every variable V is independent of its non-effects, conditional on its immediate causes.

Nov. 13th, 2003

causal markov condition
Causal Markov Condition

Two Intuitions:

1) Immediate causes make effectsindependent of remote causes (Markov).

2) Common causes make their effectsindependent (Salmon).

Nov. 13th, 2003

causal markov condition1
Causal Markov Condition

1) Immediate causes make effectsindependent of remote causes (Markov).

E = Exposure to Chicken Pox

I = Infected

S = Symptoms

Markov Cond.

E || S | I

Nov. 13th, 2003

causal markov condition2
Causal Markov Condition

2) Effects are independent conditional on their commoncauses.

YF || LC | S

Markov Cond.

Nov. 13th, 2003

causal markov axiom1
Causal Markov Axiom

In SEMs, d-separation follows from assuming independence among error terms that have no connection in the path diagram - i.e., assuming that the model is common cause complete.

Nov. 13th, 2003

causal markov and d separation
Causal Markov and D-Separation
  • In acyclic graphs: equivalent
  • Cyclic Linear SEMs with uncorrelated errors:
    • D-separation correct
    • Markov condition incorrect
  • Cyclic Discrete Variable Bayes Nets:
    • If equilibrium --> d-separation correct
    • Markov incorrect

Nov. 13th, 2003

d separation conditioning vs intervening

P(X3 | X2)  P(X3 | X2, X1)

X3 _||_ X1 | X2

P(X3 | X2 set= ) = P(X3 | X2 set=, X1)

X3 _||_ X1 | X2set=

D-separation: Conditioning vs. Intervening

Nov. 13th, 2003

slide30

Statistical Inference

Background Knowledge

- X2 before X3

- no unmeasured common causes

Causal DiscoveryStatistical DataCausal Structure

Nov. 13th, 2003

representations of d separation equivalence classes
Representations ofD-separation Equivalence Classes

We want the representations to:

  • Characterize the Independence Relations Entailed by the Equivalence Class
  • Represent causal features that are shared by every member of the equivalence class

Nov. 13th, 2003

patterns pags
Patterns & PAGs
  • Patterns(Verma and Pearl, 1990): graphical representation of an acyclic d-separation equivalence - no latent variables.
  • PAGs: (Richardson 1994) graphical representation of an equivalence class including latent variable models and sample selection bias that are d-separation equivalent over a set of measured variables X

Nov. 13th, 2003

patterns
Patterns

Nov. 13th, 2003

patterns1
Patterns

Nov. 13th, 2003

pags partial ancestral graphs
PAGs: Partial Ancestral Graphs

What PAG edges mean.

Nov. 13th, 2003

tetrad 4 demo
Tetrad 4 Demo

www.phil.cmu.edu/projects/tetrad_download/

Nov. 13th, 2003

overview of search methods
Overview of Search Methods
  • Constraint Based Searches
    • TETRAD
  • Scoring Searches
    • Scores: BIC, AIC, etc.
    • Search: Hill Climb, Genetic Alg., Simulated Annealing
    • Very difficult to extend to latent variable models

Heckerman, Meek and Cooper (1999). “A Bayesian Approach to Causal Discovery” chp. 4 in Computation, Causation, and Discovery, ed. by Glymour and Cooper, MIT Press, pp. 141-166

Nov. 13th, 2003

regression to estimate causal influence
Regression to estimate Causal Influence
  • Let V = {X,Y,T}, where

- Y : measured outcome

- measured regressors: X = {X1, X2, …, Xn}- latent common causes of pairs in X U Y: T = {T1, …, Tk}

  • Let the true causal model over V be a Structural Equation Model in which each V  V is a linear combination of its direct causes and independent, Gaussian noise.

Nov. 13th, 2003

regression to estimate causal influence1
Regression to estimate Causal Influence
  • Consider the regression equation:

Y = b0 + b1X1 + b2X2 + ..…bnXn

  • Let the OLS regression estimate bi be the estimatedcausal influence of Xi on Y.
  • That is, holding X/Xi experimentally constant, bi is an estimate of the change in E(Y) that results from an intervention that changes Xi by 1 unit.
  • Let the realCausalInfluence Xi Y = bi
  • When is the OLS estimate bi an unbiased estimate of bi ?

Nov. 13th, 2003

linear regression
Linear Regression

Let the other regressors O = {X1, X2,....,Xi-1, Xi+1,...,Xn}

bi = 0 if and only if Xi,Y.O = 0

In a multivariate normal distribuion,

Xi,Y.O = 0if and only if Xi _||_ Y | O

Nov. 13th, 2003

linear regression1
Linear Regression

So in regression:

bi = 0 Xi _||_ Y | O

But provably :

bi = 0 S  O, Xi _||_ Y | S

So S  O, Xi _||_ Y | S  bi = 0

~ S  O, Xi _||_ Y | S  don’t know(unless we’re lucky)

Nov. 13th, 2003

regression example

X1 _||_ Y | X2

Regression Example

b1 0

b2 = 0

X2 _||_ Y | X1

Don’t know

~S  {X2} X1 _||_ Y | S

b2 = 0

S  {X1} X2 _||_ Y | {X1}

Nov. 13th, 2003

regression example1

X1 _||_ Y | {X2,X3}

X2 _||_ Y | {X1,X3}

X3 _||_ Y | {X1,X2}

Regression Example

b1 0

b2 0

b3 0

DK

~S  {X2,X3}, X1 _||_ Y | S

b2 = 0

S  {X1,X3}, X2 _||_ Y | {X1}

DK

~S  {X1,X2}, X3 _||_ Y | S

Nov. 13th, 2003

regression example2

X2

X1

X3

Y

PAG

Regression Example

Nov. 13th, 2003

regression bias
Regression Bias

If

  • Xi is d-separated from Y conditional on X/Xi in the true graph after removing Xi Y, and
  • X contains no descendant of Y, then:

biis an unbiased estimate ofbi

See “Using Path Diagrams ….”

Nov. 13th, 2003

applications
Applications
  • Rock Classification
  • Spartina Grass
  • College Plans
  • Political Exclusion
  • Satellite Calibration
  • Naval Readiness
  • Parenting among Single, Black Mothers
  • Pneumonia
  • Photosynthesis
  • Lead - IQ
  • College Retention
  • Corn Exports

Nov. 13th, 2003

references
Causation, Prediction, and Search, 2nd Edition, (2000), by P. Spirtes, C. Glymour, and R. Scheines ( MIT Press)

Causality: Models, Reasoning, and Inference, (2000), Judea Pearl, Cambridge Univ. Press

Computation, Causation, & Discovery (1999), edited by C. Glymour and G. Cooper, MIT Press

Causality in Crisis?, (1997) V. McKim and S. Turner (eds.), Univ. of Notre Dame Press.

TETRAD IV: www.phil.cmu.edu/projects/tetrad

Web Course on Causal and Statistical Reasoning : www.phil.cmu.edu/projects/csr/

References

Nov. 13th, 2003