Learning Causality


Presentation Transcript


  1. Learning Causality Some slides are from Judea Pearl’s class lecture http://bayes.cs.ucla.edu/BOOK-2K/viewgraphs.html

  2. Example: a causal model (figure: Rain → Mud ← Other causes of mud) • The statement ‘rain causes mud’ implies an asymmetric relationship: rain will create mud, but mud will not create rain; • We use ‘→’ to denote such a causal relationship; • The absence of an arrow between ‘rain’ and ‘other causes of mud’ means that there is no direct causal relationship between them.
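
To make the asymmetry concrete, here is a minimal sketch of the rain/mud model as structural equations in Python; the probabilities are illustrative assumptions, not values from the slides.

import random

def sample_world():
    rain = random.random() < 0.3        # exogenous: it rains or it does not
    other = random.random() < 0.1       # other causes of mud, independent of rain
    mud = rain or other                 # mud is determined by its causes
    return rain, other, mud

samples = [sample_world() for _ in range(10_000)]
muddy = [s for s in samples if s[2]]
print(sum(r for r, _, _ in samples) / len(samples))   # P(rain)
print(sum(r for r, _, _ in muddy) / len(muddy))       # P(rain | mud is observed): higher
# Forcing mud to appear (setting mud = True by hand) would not change rain at all;
# that asymmetry under intervention is exactly what the arrow rain -> mud encodes.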

  3. Directed (causal) Graphs (figure: a DAG over A, B, C, D, E, F) • A and B are causally independent; • C, D, E, and F are causally dependent on A and B; • A and B are direct causes of C; • A and B are indirect causes of D, E, and F; • If C is prevented from changing with A and B, then A and B will no longer cause changes in D, E, and F.
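
A linear sketch of the last bullet, with invented coefficients: D (like E and F) depends on A and B only through C, so holding C fixed removes A's observable influence on D.

import random, statistics

def simulate(n=5000, fix_c=None):
    rows = []
    for _ in range(n):
        a, b = random.gauss(0, 1), random.gauss(0, 1)
        c = fix_c if fix_c is not None else 0.8 * a + 0.5 * b + random.gauss(0, 0.1)
        d = 1.2 * c + random.gauss(0, 0.1)            # D is caused by C alone
        rows.append((a, d))
    return rows

obs, held = simulate(), simulate(fix_c=0.0)
print(statistics.correlation([a for a, _ in obs], [d for _, d in obs]))    # strongly correlated
print(statistics.correlation([a for a, _ in held], [d for _, d in held]))  # near zero: C is blocked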

  4. Conditional Independence

  5. Conditional Independence

  6. Conditional Independence (Notation)

  7. Causal Structure

  8. Causal Structure (cont’d) • A causal structure serves as a blueprint for forming a “causal model” – a precise specification of how each variable is influenced by its parents in the DAG; • We assume that Nature is at liberty to impose arbitrary functional relationships between each effect and its causes and then to perturb these relationships by introducing arbitrary disturbances; • These disturbances reflect “hidden” or unmeasurable conditions.

  9. Causal Model

  10. Causal Model (Cont’d) • Once a causal model M is formed, it defines a joint probability distribution P(M) over the variables in the system; • This distribution reflects some features of the causal structure: for example, each variable must be independent of its grandparents, given the values of its parents; • We may be allowed to inspect a selected subset O ⊆ V of “observed” variables and ask questions about P[o], the probability distribution over the observations; • We may then recover the topology D of the DAG from features of the probability distribution P[o].
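
As a sketch of the "independent of its grandparents, given its parents" feature, take a simple chain x → y → z with made-up conditional probability tables: the causal model induces the factorization P(x, y, z) = P(x) P(y | x) P(z | y), and z is independent of its grandparent x once its parent y is given.

P_x = {0: 0.6, 1: 0.4}
P_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_z_given_y = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}

def joint(x, y, z):
    # Markov factorization induced by the causal model x -> y -> z
    return P_x[x] * P_y_given_x[x][y] * P_z_given_y[y][z]

for y in (0, 1):
    for z in (0, 1):
        # P(z | x, y) does not depend on x, i.e. z is independent of its
        # grandparent x given its parent y.
        p0 = joint(0, y, z) / sum(joint(0, y, zz) for zz in (0, 1))
        p1 = joint(1, y, z) / sum(joint(1, y, zz) for zz in (0, 1))
        assert abs(p0 - p1) < 1e-12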

  11. Inferred Causation

  12. Latent Structure

  13. Structure Preference

  14. Structure Preference (Cont’d) • The set of independencies entailed by a causal structure imposes limits on its power to mimic other structures; • L1 cannot be preferred to L2 if there is even one observable dependency that is permitted by L1 and forbidden by L2; • Equivalently, L1 can be preferred to L2 only if every independency of L2 is also an independency of L1; • Thus, tests for preference and equivalence can sometimes be reduced to tests of dependencies, which can be determined from the topology of the DAGs alone, without regard to the parameters.

  15. Minimality

  16. Consistency

  17. Inferred Causation

  18. Example • The observed variables {a,b,c,d} reveal two independencies: • a is independent of b; • d is independent of {a,b} given c; • Assume further that the data reveal no other independencies; • a = having a cold; • b = having hay fever; • c = having to sneeze; • d = having to wipe one’s nose.

  19. Example (Cont’d) • {a,b,c,d} reveal two independencies: • a is independent of b; • d is independent of {a,b} given c. • (figure: three candidate structures) • The structure labeled minimal allows arbitrary relations between a and b; • A second structure is not minimal: it fails to impose the conditional independence between d and {a,b}; • A third structure is not consistent with the data: it imposes marginal independence between d and {a,b}.
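
One structure that fits the two stated independencies is a → c ← b together with c → d (cold and hay fever both cause sneezing, which causes nose-wiping). A minimal parameterization sketch, with invented numbers, shows which independencies survive and which do not:

from itertools import product

P_a = {1: 0.2, 0: 0.8}                                             # cold
P_b = {1: 0.1, 0: 0.9}                                             # hay fever
P_c1 = {(0, 0): 0.05, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95}      # P(sneeze | cold, hay fever)
P_d1 = {1: 0.9, 0: 0.05}                                           # P(wipe nose | sneeze)

def joint(a, b, c, d):
    pc = P_c1[(a, b)] if c else 1 - P_c1[(a, b)]
    pd = P_d1[c] if d else 1 - P_d1[c]
    return P_a[a] * P_b[b] * pc * pd

def prob(event, given=lambda a, b, c, d: True):
    worlds = list(product((0, 1), repeat=4))
    den = sum(joint(*w) for w in worlds if given(*w))
    num = sum(joint(*w) for w in worlds if given(*w) and event(*w))
    return num / den

# a is independent of b: observing hay fever says nothing about cold.
print(prob(lambda a, b, c, d: a == 1), prob(lambda a, b, c, d: a == 1, lambda a, b, c, d: b == 1))
# d is independent of {a,b} given c: once sneezing is known, its cause is irrelevant to nose-wiping.
print(prob(lambda a, b, c, d: d == 1, lambda a, b, c, d: c == 1),
      prob(lambda a, b, c, d: d == 1, lambda a, b, c, d: c == 1 and a == 1))
# But a and b become dependent once c is observed ("explaining away"),
# which is the signature of the collider a -> c <- b.
print(prob(lambda a, b, c, d: a == 1, lambda a, b, c, d: c == 1),
      prob(lambda a, b, c, d: a == 1, lambda a, b, c, d: c == 1 and b == 1))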

  20. Stability • The stability condition states that, as we vary the parameters from θ to θ′, no independence in P can be destroyed. In other words, if an independency exists, it will always exist.

  21. Stable distribution • A probability distribution P is a faithful/stable distribution if there exists a directed acyclic graph (DAG) D such that every conditional independence relationship in P is implied by D, and vice versa.
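
A sketch of what an unstable (unfaithful) parameterization looks like: in the contrived linear model below, the direct path a → c exactly cancels the indirect path a → b → c, so the data show a ╨ c even though the DAG does not entail it; perturbing the parameters destroys that accidental independence, which is what stability rules out. The coefficients are invented for illustration.

import random, statistics

def sample(n=20000, beta=-0.4):                  # -0.4 cancels 0.5 * 0.8 exactly
    rows = []
    for _ in range(n):
        a = random.gauss(0, 1)
        b = 0.5 * a + random.gauss(0, 0.1)
        c = 0.8 * b + beta * a + random.gauss(0, 0.1)
        rows.append((a, c))
    return rows

cancel, perturbed = sample(), sample(beta=-0.3)
print(statistics.correlation([a for a, _ in cancel], [c for _, c in cancel]))        # about 0
print(statistics.correlation([a for a, _ in perturbed], [c for _, c in perturbed]))  # clearly nonzero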

  22. IC algorithm (Inductive Causation) • IC algorithm (Pearl); • Based on variable dependencies: • Find all pairs of variables that are dependent on each other (applying standard statistical methods to the database); • Eliminate (as far as possible) indirect dependencies; • Determine the directions of the dependencies.

  23. Comparing abduction, deduction and induction • Deduction (A => B; A; therefore B): major premise: All balls in the box are black; minor premise: These balls are from the box; conclusion: These balls are black. • Abduction (A => B; B; therefore possibly A): rule: All balls in the box are black; observation: These balls are black; explanation: These balls are from the box. • Induction (whenever A then B, but not vice versa; therefore possibly A => B): case: These balls are from the box; observation: These balls are black; hypothesized rule: All balls in the box are black. • Induction goes from specific cases to general rules; abduction and deduction both go from one part of a specific case to another part of the case, using general rules (in different ways). • Source: http://www.csee.umbc.edu/~ypeng/F02671/lecture-notes/Ch15.ppt

  24. IC Algorithm (Cont’d) • Input: P – a stable distribution on a set V of variables; • Output: a pattern H(P) compatible with P. • A pattern is a partially directed DAG: • some edges are directed and • some edges are undirected.

  25. IC Algorithm: Step 1 (figure: if some Sab renders a ╨ b, a and b are left unconnected; if no such Sab exists, a — b) • For each pair of variables a and b in V, search for a set Sab such that (a ╨ b | Sab) holds in P – in other words, a and b should be independent in P, conditioned on Sab. • Construct an undirected graph G such that vertices a and b are connected with an edge if and only if no set Sab can be found.
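
A sketch of step 1 in Python, assuming an oracle function independent(a, b, S) that answers conditional-independence queries about P (in practice this would be a statistical test on sampled data):

from itertools import combinations

def ic_step1(variables, independent):
    """Build the undirected skeleton and record the separating sets Sab."""
    edges, sepset = set(), {}
    for a, b in combinations(variables, 2):
        others = [v for v in variables if v not in (a, b)]
        found = None
        for k in range(len(others) + 1):                 # try conditioning sets of growing size
            for S in combinations(others, k):
                if independent(a, b, set(S)):
                    found = set(S)
                    break
            if found is not None:
                break
        if found is None:
            edges.add(frozenset((a, b)))                 # no Sab exists: connect a - b
        else:
            sepset[frozenset((a, b))] = found            # remember Sab for step 2
    return edges, sepset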

  26. IC Algorithm: Step 2 • For each pair of nonadjacent variables a and b with a common neighbor c, check if c ∈ Sab. • If it is, then continue; • else add arrowheads at c, i.e. a → c ← b.
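
Continuing the sketch, step 2 orients the colliders; edges and sepset are the outputs of the step-1 sketch above, and directed edges are stored as (x, y) pairs meaning x → y.

from itertools import combinations

def ic_step2(variables, edges, sepset):
    arrows = set()                                       # (x, y) means x -> y
    for c in variables:
        neighbours = [v for v in variables if frozenset((v, c)) in edges]
        for a, b in combinations(neighbours, 2):
            if frozenset((a, b)) in edges:
                continue                                 # a and b are adjacent: not a candidate
            if c not in sepset.get(frozenset((a, b)), set()):
                arrows.update({(a, c), (b, c)})          # orient the v-structure a -> c <- b
    return arrows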

  27. Example (figure: the rain/mud model, with step 2 orienting Rain → Mud ← Other causes of mud)

  28. IC Algorithm: Step 3 • In the partially directed graph that results, orient as many of the undirected edges as possible, subject to two conditions: • The orientation should not create a new v-structure; • The orientation should not create a directed cycle.

  29. Rules required to obtain a maximally oriented pattern • R1: Orient b — c into b → c whenever there is an arrow a → b such that a and c are nonadjacent.

  30. Rules required to obtain a maximally oriented pattern • R2: Orient a — b into a → b whenever there is a chain a → c → b.

  31. Rules required to obtain a maximally oriented pattern • R3: Orient a — b into a → b whenever there are two chains a — c → b and a — d → b such that c and d are nonadjacent.

  32. Rules required to obtain a maximally oriented pattern • R4: Orient a — b into a → b whenever there are two chains a — c → d and c → d → b such that c and b are nonadjacent.
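
The sketch below implements R1 and R2 over the partially directed graph produced by the earlier steps (R3 and R4 follow the same pattern); undirected holds frozenset pairs and arrows holds (x, y) pairs meaning x → y, as in the step-2 sketch.

from itertools import permutations

def apply_r1_r2(nodes, arrows, undirected):
    """One pass of R1 and R2; the caller repeats until no rule fires."""
    def adjacent(x, y):
        return frozenset((x, y)) in undirected or (x, y) in arrows or (y, x) in arrows

    changed = False
    for b, c in permutations(nodes, 2):
        if frozenset((b, c)) not in undirected:
            continue
        # R1: some arrow a -> b exists with a and c nonadjacent, so orient b -> c
        r1 = any((a, b) in arrows and not adjacent(a, c) for a in nodes if a not in (b, c))
        # R2: a directed chain b -> x -> c exists, so orient b -> c
        r2 = any((b, x) in arrows and (x, c) in arrows for x in nodes if x not in (b, c))
        if r1 or r2:
            undirected.discard(frozenset((b, c)))
            arrows.add((b, c))
            changed = True
    return changed

Repeating this pass, together with analogous checks for R3 and R4, until no rule fires yields the maximally oriented pattern.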

  33. IC* Algorithm • Input: • P, a sampled distribution; • Output: • core(P), a marked pattern;

  34. Marked Pattern: Four types of edges

  35. IC* Algorithm: Step 1 For each pair of variables a and b, search for a set Sab such that a and b are independent in P, conditioned on Sab. If there is no such Sab, place an undirected link between the two variables, a – b.

  36. IC* Algorithm: Step 2 • For each pair of nonadjacent variables a and b with a common neighbor c, check if c ∈ Sab. • If it is, then continue; • If it is not, then add arrowheads pointing at c (i.e. a → c ← b). • In the partially directed graph that results, add (recursively) as many arrowheads as possible, and mark as many edges as possible, according to the following two rules:

  37. IC* Algorithm: Rule 1 • R1: For each pair of nonadjacent nodes a and b with a common neighbor c, if the link between a and c has an arrowhead into c and the link between c and b has no arrowhead into c, then add an arrowhead on the link between c and b pointing at b and mark that link to obtain c →* b.

  38. IC* Algorithm: Rule 2 • R2: If a and b are adjacent and there is a directed path (composed strictly of marked links) from a to b, then add an arrowhead pointing toward b on the link between a and b;
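
The "directed path composed strictly of marked links" in R2 is just a reachability check; a sketch, assuming marked links are stored as (x, y) pairs meaning x →* y:

def reachable_via_marked(a, b, marked_links):
    """True if b can be reached from a by following marked directed links only."""
    stack, seen = [a], {a}
    while stack:
        x = stack.pop()
        if x == b:
            return True
        for u, v in marked_links:
            if u == x and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

# If a and b are adjacent and reachable_via_marked(a, b, marked_links) holds,
# IC* Rule 2 adds an arrowhead pointing at b on the a - b link.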
