Bayes Net Perspectives on Causation and Causal Inference

Peter Spirtes

Presentation Transcript
Example Problems
  • Genetic regulatory networks
    • Yeast – ~5000 genes, ~2,500,000 potential edges

A gene regulatory network in mouse embryonic stem cells http://www.pnas.org/content/104/42/16438/F3.expansion.html

Causal Models → Predictions
  • Probabilistic – Among the cells that have active Oct4, what percentage have active Rcor2?
  • Causal – If I experimentally set cells to have active Oct4, what percentage will have active Rcor2?
  • Counterfactual – Among the cells that did not have active Oct4 at t-1, what percentage would have active Rcor2 if I had experimentally set them to have active Oct4 at t-1?
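
The gap between the probabilistic and the causal question can be made concrete with a small calculation. The sketch below is not from the slides: it assumes a hypothetical unobserved common cause H of Oct4 and Rcor2, and every number in it is invented for illustration. It computes the observational conditional probability and the probability under an experimental setting of Oct4 by exact enumeration.

```python
from itertools import product

# Hypothetical model (not from the slides): H is an unobserved common cause.
P_H = {0: 0.5, 1: 0.5}                       # P(H = h)
P_OCT4_GIVEN_H = {0: 0.2, 1: 0.8}            # P(Oct4 = 1 | H = h)
P_RCOR2_GIVEN = {(0, 0): 0.1, (0, 1): 0.5,   # P(Rcor2 = 1 | Oct4 = o, H = h),
                 (1, 0): 0.3, (1, 1): 0.7}   # keyed by (o, h)


def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(h, o, r):
    """P(H = h, Oct4 = o, Rcor2 = r) under the hypothetical model."""
    return P_H[h] * bern(P_OCT4_GIVEN_H[h], o) * bern(P_RCOR2_GIVEN[(o, h)], r)


def p_rcor2_given_oct4(o):
    """Probabilistic question: P(Rcor2 = 1 | Oct4 = o) in the observed population."""
    num = sum(joint(h, o, 1) for h in (0, 1))
    den = sum(joint(h, o, r) for h, r in product((0, 1), repeat=2))
    return num / den


def p_rcor2_do_oct4(o):
    """Causal question: P(Rcor2 = 1) when Oct4 is set to o by experiment.

    Setting Oct4 removes the influence of H on Oct4, so H keeps its marginal.
    """
    return sum(P_H[h] * P_RCOR2_GIVEN[(o, h)] for h in (0, 1))


print(f"observational: {p_rcor2_given_oct4(1):.2f}")
print(f"experimental:  {p_rcor2_do_oct4(1):.2f}")
```

With these made-up numbers the observational answer is 0.62 and the experimental answer is 0.50: the two questions differ whenever a common cause is present.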
Data → Causal Models
  • Large number of variables
  • Small observed sample size
  • Overlapping variables
  • Small number of experiments
  • Feedback
  • Hidden common causes
  • Selection bias
  • Many kinds of entities causally interacting
Outline
  • Bayesian Networks
  • Search
  • Limitations and Extensions of Bayesian Networks
    • Dynamic
    • Relational
    • Cycles
    • Counterfactual
Directed Acyclic Graph (DAG)

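[Figure: DAG G over SES, SEX, IQ, PE, CP, with edges SES → IQ, SES → PE, SEX → PE, IQ → PE, SES → CP, IQ → CP, PE → CP]

SES – Socioeconomic Status
PE – Parental Encouragement
CP – College Plans
IQ – Intelligence Quotient
SEX – Sex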

  • The vertices are random variables.
  • All edges are directed.
  • There are no directed cycles.
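
As a minimal sketch of the three conditions above, a DAG can be stored as a map from each vertex to its set of parents and checked for directed cycles. The parent sets below are read off the factorization P(SEX)P(SES)P(IQ|SES)P(PE|SES,SEX,IQ)P(CP|PE,SES,IQ) used in these slides; the function name `is_acyclic` is a hypothetical helper, and the check relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Each vertex maps to the set of its parents (every edge is directed parent -> child).
PARENTS = {
    "SES": set(),
    "SEX": set(),
    "IQ": {"SES"},
    "PE": {"SES", "SEX", "IQ"},
    "CP": {"PE", "SES", "IQ"},
}


def is_acyclic(parents):
    """Return True if the directed graph has no directed cycle."""
    try:
        # static_order() raises CycleError when no topological order exists.
        tuple(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# A hypothetical variant with an extra edge CP -> PE, which creates PE -> CP -> PE.
CYCLIC = {**PARENTS, "PE": {"SES", "SEX", "IQ", "CP"}}

print(is_acyclic(PARENTS))
print(is_acyclic(CYCLIC))
```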
Population

[Figure: three copies of DAG G, one for each unit in the population]

Independent, identically distributed

P Factoring According to G

  • P(SES,SEX,PE,IQ,CP) = P(SEX) P(SES) P(IQ|SES) P(PE|SES,SEX,IQ) P(CP|PE,SES,IQ)
  • If P(V) equals the product, over every variable X in V, of P(X | Parents(X) in G), then P factors according to G
  • G represents all of the distributions that factor according to G
Conditional Independence
  • X is independent of Y conditional on Z (denoted I_P(X,Y|Z)) iff P(X|Y,Z) = P(X|Z).
  • I_P(CP,SEX|{SES,IQ,PE}) iff P(CP|{SES,IQ,PE,SEX}) = P(CP|{SES,IQ,PE})
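
The definition can be checked numerically on a small example. The sketch below assumes all five variables are binary and uses invented conditional probabilities (none of these numbers come from the slides); the joint is built from the factorization P(SEX)P(SES)P(IQ|SES)P(PE|SES,SEX,IQ)P(CP|PE,SES,IQ). Independence is tested in the equivalent product form P(x,y,z)P(z) = P(x,z)P(y,z), which avoids dividing by zero.

```python
from itertools import product

VARS = ("SES", "SEX", "IQ", "PE", "CP")


def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(ses, sex, iq, pe, cp):
    """Joint probability built from the factorization; all numbers are hypothetical."""
    p = bern(0.4, ses) * bern(0.5, sex)
    p *= bern(0.3 + 0.4 * ses, iq)                            # P(IQ | SES)
    p *= bern(0.1 + 0.3 * ses + 0.2 * sex + 0.3 * iq, pe)     # P(PE | SES, SEX, IQ)
    p *= bern(0.05 + 0.4 * pe + 0.2 * ses + 0.3 * iq, cp)     # P(CP | PE, SES, IQ)
    return p


def marginal(assignment):
    """Sum the joint over every variable not fixed in `assignment`."""
    total = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        full = dict(zip(VARS, values))
        if all(full[name] == value for name, value in assignment.items()):
            total += joint(*values)
    return total


def independent(x, y, z, tol=1e-12):
    """True if x is independent of y conditional on the variables in z."""
    names = (x, y) + tuple(z)
    for values in product((0, 1), repeat=len(names)):
        a = dict(zip(names, values))
        z_part = {name: a[name] for name in z}
        lhs = marginal(a) * marginal(z_part)
        rhs = marginal({x: a[x], **z_part}) * marginal({y: a[y], **z_part})
        if abs(lhs - rhs) > tol:
            return False
    return True


print(independent("CP", "SEX", ("SES", "IQ", "PE")))
print(independent("CP", "SEX", ()))
```

With these parameters CP is independent of SEX once SES, IQ and PE are given, but not marginally, because SEX influences CP through PE.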
Graphical Entailment

  • If for every P that factors according to G, I_P(X,Y|Z) holds, then G entails I(X,Y|Z).
  • Examples: G entails
    • I(IQ,SEX|∅)
    • I(IQ,SEX|SES)
  • Can read entailments off of the graph through d-separation
D-separation and D-connection

  • X is d-separated from Y conditional on Z in G iff G entails that X is independent of Y conditional on Z
  • D-separation between X and Y conditional on Z holds when certain kinds of paths do not exist between X and Y
  • D-connection (the negation of d-separation) between X and Y conditional on Z holds when certain kinds of paths do exist between X and Y
Definition of D-connection

  • A node X is active on a path U conditional on Z iff
    • X is a collider (→ X ←) and there is a directed path from X to a member of Z or X is in Z; or
    • X is not a collider and X is not in Z.
  • SES → IQ → PE ← SEX is a path U.
  • PE is active on U conditional on {CP, IQ}.
  • IQ is inactive on U conditional on {CP, IQ}.
  • A path U is active conditional on Z iff every vertex on U is active relative to Z.
  • X is d-connected to Y conditional on Z iff there is an active path between X and Y conditional on Z.
  • SES → IQ → PE ← SEX is inactive conditional on {CP, IQ}.
  • SES is d-connected to SEX conditional on {CP, IQ} because SES → PE ← SEX is active conditional on {CP, IQ}.
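
The definition of d-connection can be turned into a direct, if inefficient, check: enumerate every simple path between X and Y and test whether each interior vertex is active. The sketch below is an illustration under stated assumptions, not an algorithm from the slides. The parent sets are read off the factorization used in these slides, all function names are hypothetical, X and Y are assumed not to be in Z, and path enumeration is only practical for small graphs.

```python
# Each vertex maps to the set of its parents in the example graph G.
PARENTS = {
    "SES": set(),
    "SEX": set(),
    "IQ": {"SES"},
    "PE": {"SES", "SEX", "IQ"},
    "CP": {"PE", "SES", "IQ"},
}


def descendants(node):
    """All vertices reachable from `node` by a directed path of length >= 1."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child, parents in PARENTS.items():
            if current in parents and child not in found:
                found.add(child)
                stack.append(child)
    return found


def neighbors(node):
    """Vertices adjacent to `node`, ignoring edge direction."""
    return PARENTS[node] | {c for c, parents in PARENTS.items() if node in parents}


def simple_paths(x, y, path=None):
    """Yield every path from x to y that visits no vertex twice."""
    path = [x] if path is None else path
    if path[-1] == y:
        yield path
        return
    for nxt in neighbors(path[-1]):
        if nxt not in path:
            yield from simple_paths(x, y, path + [nxt])


def node_active(prev, node, nxt, z):
    """Apply the collider / non-collider rule to an interior vertex of a path."""
    collider = prev in PARENTS[node] and nxt in PARENTS[node]
    if collider:
        return node in z or bool(descendants(node) & z)
    return node not in z


def d_connected(x, y, z):
    """True if some path between x and y is active conditional on z."""
    z = set(z)
    return any(
        all(node_active(p[i - 1], p[i], p[i + 1], z) for i in range(1, len(p) - 1))
        for p in simple_paths(x, y)
    )


print(d_connected("SES", "SEX", {"CP", "IQ"}))   # the example from the slide
print(d_connected("IQ", "SEX", set()))
print(d_connected("IQ", "SEX", {"SES"}))
```

The first call reproduces the slide's example of d-connection; the other two correspond to the entailed independencies I(IQ,SEX|∅) and I(IQ,SEX|SES).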
If I is Not Entailed by G
  • If conditional independence relation I is not entailed by G, then I may hold in some (but not every) distribution P that factors according to G.

  • Example: There are P and P’ that factor according to G such that ~I_P(SES,CP|∅) and I_P’(SES,CP|∅). P’ is said to be unfaithful to G.
Manipulations
  • An ideal manipulation assigns a density to a set X of properties (random variables) as a function of the values of a set Z of properties (random variables)
    • Directly affects only the variables in X
    • Successful
  • Example – randomized experiment
Manipulations and Causal Graph

  • There is an edge SES → CP in G because there are two ways of manipulating {SES,SEX,IQ,PE} that differ only in the value they assign to SES and that yield different probabilities for CP.

Stable Unit Treatment Value Assumption

Causal Sufficiency

  • A set S of variables is causally sufficient if there are no variables not in S that are direct causes of more than one variable in S.
  • S = {SES,IQ} is causally sufficient.
  • S = {SES,PE,CP} is not causally sufficient.
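
The two examples can be verified mechanically. The sketch below applies the definition literally, treating "direct cause" as "parent in the full graph G"; that reading, the parent sets (read off the factorization used in these slides), and the function name are assumptions made for illustration.

```python
# Each vertex maps to the set of its parents (direct causes) in G.
PARENTS = {
    "SES": set(),
    "SEX": set(),
    "IQ": {"SES"},
    "PE": {"SES", "SEX", "IQ"},
    "CP": {"PE", "SES", "IQ"},
}


def is_causally_sufficient(s, parents):
    """True if no variable outside `s` is a parent of more than one member of `s`."""
    s = set(s)
    for cause in parents.keys() - s:
        children_in_s = {v for v in s if cause in parents[v]}
        if len(children_in_s) > 1:
            return False
    return True


print(is_causally_sufficient({"SES", "IQ"}, PARENTS))
print(is_causally_sufficient({"SES", "PE", "CP"}, PARENTS))
```

The second set fails because IQ, which is left out, is a direct cause of both PE and CP.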
Causal Markov Assumption

  • In a population Pop with distribution P and causal graph G, if V is causally sufficient, P(V) factors according to G.
  • P(SES,SEX,PE,IQ,CP) = P(SEX) P(SES) P(IQ|SES) P(PE|SES,SEX,IQ) P(CP|PE,SES,IQ)

Representation of Manipulation

P(SES,SEX,PE=1,IQ,CP || PE=1)
  = P(SEX) P(SES) P(IQ|SES) * 1 * P(CP|PE=1,SES,IQ)
  = P(SES,SEX,PE=1,IQ,CP) / P(PE=1|SES,SEX,IQ)
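
The identity above can be checked numerically. The sketch below assumes binary variables and invented conditional probabilities (none of the numbers come from the slides). It replaces the factor for PE by 1, confirms that the result equals the pre-manipulation joint divided by P(PE=1|SES,SEX,IQ), and compares the manipulated probability of CP with the ordinary conditional probability.

```python
from itertools import product


def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


# Hypothetical conditional probabilities of the value 1.
def p_iq(ses):
    return 0.3 + 0.4 * ses


def p_pe(ses, sex, iq):
    return 0.1 + 0.3 * ses + 0.2 * sex + 0.3 * iq


def p_cp(pe, ses, iq):
    return 0.05 + 0.4 * pe + 0.2 * ses + 0.3 * iq


def joint(ses, sex, iq, pe, cp):
    """Pre-manipulation joint, following the factorization according to G."""
    return (bern(0.4, ses) * bern(0.5, sex) * bern(p_iq(ses), iq)
            * bern(p_pe(ses, sex, iq), pe) * bern(p_cp(pe, ses, iq), cp))


def manipulated(ses, sex, iq, cp):
    """P(SES,SEX,PE=1,IQ,CP || PE=1): the factor for PE is replaced by 1."""
    return (bern(0.4, ses) * bern(0.5, sex) * bern(p_iq(ses), iq)
            * 1.0 * bern(p_cp(1, ses, iq), cp))


states = list(product((0, 1), repeat=3))  # values of (SES, SEX, IQ)

# Probability of CP = 1 after manipulating PE to 1.
p_cp_manipulated = sum(manipulated(ses, sex, iq, 1) for ses, sex, iq in states)

# Ordinary conditional probability P(CP = 1 | PE = 1).
p_cp_and_pe = sum(joint(ses, sex, iq, 1, 1) for ses, sex, iq in states)
p_pe_one = sum(joint(ses, sex, iq, 1, cp) for ses, sex, iq in states for cp in (0, 1))
p_cp_conditional = p_cp_and_pe / p_pe_one

print(f"manipulated: {p_cp_manipulated:.3f}")
print(f"conditional: {p_cp_conditional:.3f}")
```

In this made-up example the conditional probability exceeds the manipulated one, because observing PE = 1 also carries information about SES and IQ, whereas setting PE = 1 does not.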

FCI Algorithm
  • Looks for the set of DAGs (possibly with latent variables and selection bias) that entail all and only the conditional independence relations that hold in the data according to statistical tests.
Markov Equivalence
  • Two DAGs G1 and G2 are Markov equivalent when they contain the same variables, and for all disjoint X, Y, Z, X is entailed to be independent from Y conditional on Z in G1 if and only if X is entailed to be independent from Y conditional on Z in G2
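
For DAGs without latent variables, a standard graphical characterization (not stated in the slides) is that two DAGs are Markov equivalent exactly when they have the same adjacencies and the same unshielded colliders, that is, the same triples A → C ← B with A and B not adjacent. The sketch below implements that test; the graphs and function names are hypothetical illustrations.

```python
from itertools import combinations


def skeleton(parents):
    """Set of adjacencies, ignoring edge direction."""
    return {frozenset((child, p)) for child, ps in parents.items() for p in ps}


def unshielded_colliders(parents):
    """Triples (a, c, b) with a -> c <- b and a, b not adjacent."""
    skel = skeleton(parents)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(g1, g2):
    """Same variables, same adjacencies, same unshielded colliders."""
    return (g1.keys() == g2.keys()
            and skeleton(g1) == skeleton(g2)
            and unshielded_colliders(g1) == unshielded_colliders(g2))


# Each graph maps a vertex to its set of parents.
CHAIN = {"X": set(), "Y": {"X"}, "Z": {"Y"}}            # X -> Y -> Z
REVERSED_CHAIN = {"Z": set(), "Y": {"Z"}, "X": {"Y"}}   # X <- Y <- Z
COLLIDER = {"X": set(), "Z": set(), "Y": {"X", "Z"}}    # X -> Y <- Z

print(markov_equivalent(CHAIN, REVERSED_CHAIN))
print(markov_equivalent(CHAIN, COLLIDER))
```

The two chains entail the same single independence, X independent of Z given Y, while the collider entails that X and Z are independent given the empty set, so it falls in a different equivalence class.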
Markov Equivalence Class

[Figure: two DAGs over SES, SEX, PE, IQ, CP, labeled DAG G and DAG G’]

Causal Faithfulness Assumption

  • In a population Pop with causal graph G and distribution P(V), if V is causally sufficient, I_P(X,Y|Z) only if G entails I(X,Y|Z).
  • ~I_P(SES,CP|∅) because I(SES,CP|∅) is not entailed by G
  • +…
  • Causal Faithfulness is too strong because
    • consistency can be proved with assumptions about fewer conditional independencies
    • it is unlikely to hold, especially when there are many variables.
  • Causal Faithfulness is too weak because it is not sufficient to prove uniform consistency (that is, to put error bounds at finite sample sizes).
Good Features of FCI Algorithm
  • Is pointwise consistent: As sample size → ∞, P(error in output pattern) → 0.
  • Can be applied to distributions where tests of conditional independence are known
  • Can be applied to hidden variable models (and selection bias models)
Bad Features of FCI Algorithm
  • There is no reliable way to set error bounds on the pattern without making stronger assumptions.
  • Can only get a set of Markov equivalent DAGs, not a single DAG
  • Doesn’t allow for comparing how much better one model is than another
  • Need to assume some version of Causal Faithfulness Assumption
Non Independence Constraints
  • Depending on the parametric family, a DAG can entail constraints that are not conditional independence constraints
    • Assuming linearity and non-Gaussian error terms, if a distribution is compatible with X → Y it is not compatible with X ← Y, even though they are Markov equivalent.
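
The asymmetry can be seen in a simulation. The sketch below is a crude illustration, not a published algorithm and not a proper independence test: it generates Y = X + E with uniform (hence non-Gaussian) X and E, regresses in both directions, and uses the correlation between squared regressor and squared residual as a rough proxy for dependence. The sample size, seed, and function names are arbitrary assumptions.

```python
import random


def ols_residuals(xs, ys):
    """Residuals from a least-squares regression of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def corr(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def dependence(regressor, residuals):
    """Crude proxy: |correlation| between squared centered regressor and squared residual."""
    m = sum(regressor) / len(regressor)
    return abs(corr([(v - m) ** 2 for v in regressor], [r * r for r in residuals]))


rng = random.Random(0)
n = 20000
xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x + rng.uniform(-1.0, 1.0) for x in xs]   # true model: X -> Y

forward = dependence(xs, ols_residuals(xs, ys))    # regress Y on X
backward = dependence(ys, ols_residuals(ys, xs))   # regress X on Y

print(f"X -> Y residual dependence: {forward:.3f}")
print(f"Y -> X residual dependence: {backward:.3f}")
```

In the true direction the residual is (up to estimation error) the independent noise term, so the proxy is near zero; in the reverse direction the residual is uncorrelated with the regressor but not independent of it, and the proxy is clearly nonzero. With Gaussian X and E both directions would look alike.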
Score-Based Search Strategy
  • Assign score to graph and sample based on
    • maximum likelihood of data given graph
    • simplicity of model
  • Do search over graph space for highest score
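
As a minimal sketch of a score that combines maximized likelihood with a simplicity penalty, the code below uses the Bayesian information criterion (BIC) for two binary variables. The slides do not name a particular score, so BIC is an assumption here, and the counts are invented. It compares the empty graph with a graph containing a single edge.

```python
from math import log


def n_log_ratio(count, total):
    """count * log(count / total), with the convention 0 * log(0) = 0."""
    return count * log(count / total) if count > 0 else 0.0


def bic_empty(counts):
    """BIC of the graph with no edge: P(X) P(Y), 2 free parameters."""
    n = sum(counts.values())
    log_lik = 0.0
    for axis in (0, 1):
        for value in (0, 1):
            c = sum(v for key, v in counts.items() if key[axis] == value)
            log_lik += n_log_ratio(c, n)
    return log_lik - (2 / 2) * log(n)


def bic_edge(counts, cause=0):
    """BIC of the graph cause -> effect: P(cause) P(effect | cause), 3 free parameters."""
    effect = 1 - cause
    n = sum(counts.values())
    log_lik = 0.0
    for cv in (0, 1):
        n_c = sum(v for key, v in counts.items() if key[cause] == cv)
        log_lik += n_log_ratio(n_c, n)
        for ev in (0, 1):
            c = sum(v for key, v in counts.items()
                    if key[cause] == cv and key[effect] == ev)
            log_lik += n_log_ratio(c, n_c)
    return log_lik - (3 / 2) * log(n)


# Hypothetical counts keyed by (x, y).
DEPENDENT = {(0, 0): 40, (0, 1): 10, (1, 0): 10, (1, 1): 40}
INDEPENDENT = {(0, 0): 25, (0, 1): 25, (1, 0): 25, (1, 1): 25}

for name, data in (("dependent", DEPENDENT), ("independent", INDEPENDENT)):
    print(name, round(bic_empty(data), 2), round(bic_edge(data), 2))
```

On the dependent counts the graph with an edge scores higher; on the independent counts the extra parameter is penalized and the empty graph wins. The score is the same for X → Y and X ← Y, which is consistent with the two graphs being Markov equivalent.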
Advantages of Score-Based Search Strategy
  • Get more information about graph
    • Additive noise models, unique DAG
  • Doesn’t rely on binary decisions
  • Local mistakes don’t propagate
Disadvantages of Score-Based Search Strategy
  • The score is often slower to calculate, or it is not known how to calculate it exactly, if the model includes
    • unmeasured variables
    • selection bias
    • unusual distributions
  • Search over graph space is often heuristic
Dynamic Bayes Nets
  • If the same variable is measured at different times, then the samples from the variable are not i.i.d.
  • Solution: index each variable by time (time series)
  • Make a template for the causal structure that can be filled in with actual times

[Figure: template over X_{t-2}, X_{t-1}, X_t and Y_{t-2}, Y_{t-1}, Y_t]

  • Continuous time or differential equations?
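
The idea of a template that is filled in with actual times can be sketched as follows. The template edges below are hypothetical (the transcript does not preserve the edges of the figure), and `unroll` is an illustrative helper: each template entry says that a source variable at lag k is a direct cause of a target variable at the current time.

```python
# Hypothetical template: (source variable, lag, target variable).
TEMPLATE = [
    ("X", 1, "X"),   # X at t-1 -> X at t
    ("Y", 1, "Y"),   # Y at t-1 -> Y at t
    ("X", 1, "Y"),   # X at t-1 -> Y at t
]


def unroll(template, times):
    """Instantiate the template at every time whose lagged time is also present."""
    times = list(times)
    available = set(times)
    edges = []
    for t in times:
        for source, lag, target in template:
            if t - lag in available:
                edges.append(((source, t - lag), (target, t)))
    return edges


for edge in unroll(TEMPLATE, range(3)):
    print(edge)
```

Each time-indexed variable such as ("X", 1) is then an ordinary vertex, so the unrolled graph is a DAG to which the earlier definitions apply.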
Population

[Figure: copies of DAG G, one for each unit, with parent-of relations linking units]

  • Not i.i.d. distribution
  • Violations of SUTVA
  • Causal relations between relations (e.g. sibling causes rivalry)
Extended Manipulation Specification
  • A manipulation assigns a density to
    • a set of properties or relations
    • at a set of times (measurable set of times T)
    • for a set of units
  • as a function of the values of
    • a set of properties or relations
    • at a set of times (measurable set of times T)
    • for a set of units
Extended Factorization Assumption

[Figure: units Alice&Jim, Sue, and Bob, with two parent-of relations; Alice&Jim.SES takes the place of SES in DAG G]

P(Alice&Jim.SES, Sue.SEX, Sue.PE, Sue.IQ, Sue.CP,
  Alice&Jim.SES, Bob.SEX, Bob.PE, Bob.IQ, Bob.CP) =

P(Sue.SEX) P(Alice&Jim.SES) P(Sue.IQ|Alice&Jim.SES) P(Sue.PE|Alice&Jim.SES, Sue.SEX, Sue.IQ) P(Sue.CP|Sue.PE, Alice&Jim.SES, Sue.IQ)

P(Bob.SEX) P(Alice&Jim.SES) P(Bob.IQ|Alice&Jim.SES) P(Bob.PE|Alice&Jim.SES, Bob.SEX, Bob.IQ) P(Bob.CP|Bob.PE, Alice&Jim.SES, Bob.IQ)

3 Interpretations of Cycles: PE ⇆ CP
  • Equilibrium values of PE and CP cause each other.
  • Averages of the values of PE and CP while reaching equilibrium influence each other.
  • Mixture of PE → CP and CP → PE