
Learning Equivalence Classes of Bayesian-Network Structures

David M. Chickering. Presented by Dmitry Zinenko.


Presentation Transcript


  1. Learning Equivalence Classes of Bayesian-Network Structures • David M. Chickering • Presented by Dmitry Zinenko

  2. Heuristic Search • We are looking for the best state in the search space. Naïvely: • state = a particular DAG • search space = all possible DAGs over our variables • Move between related states using search operators. Naïvely: • Edge addition/removal/reversal
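The naive operator set above can be sketched as a neighbor generator. This is only an illustration: the edge-set representation and the names `is_acyclic` and `dag_neighbors` are assumptions, not taken from the slides.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; a cycle leaves nodes behind."""
    remaining = set(nodes)
    edges = set(edges)
    while remaining:
        sources = {n for n in remaining if not any(v == n for (_, v) in edges)}
        if not sources:
            return False  # every remaining node has a parent, so there is a cycle
        remaining -= sources
        edges = {(u, v) for (u, v) in edges if u not in sources}
    return True

def dag_neighbors(nodes, edges):
    """Yield (operator, new_edges) for each single-edge change that keeps a DAG."""
    edges = frozenset(edges)
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            yield ("remove", u, v), edges - {(u, v)}
            flipped = (edges - {(u, v)}) | {(v, u)}
            if is_acyclic(nodes, flipped):
                yield ("reverse", u, v), flipped
        elif (v, u) not in edges:
            added = edges | {(u, v)}
            if is_acyclic(nodes, added):
                yield ("add", u, v), added

# Chain X -> Y -> Z: two removals, two reversals, and one legal addition (X -> Z).
ops = [op for op, _ in dag_neighbors(["X", "Y", "Z"], {("X", "Y"), ("Y", "Z")})]
```

A greedy search would score each yielded graph and move to the best one.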

  3. Heuristic Search Challenges • Search space graph should be well-connected • To reach good states quickly • To avoid local maxima • Search space graph should not be too dense • Computationally efficient scoring and transformations

  4. Equivalence • G1 and G2 are equivalent if the sets of distributions they can represent are identical • Equivalence is an equivalence relation! [Figure: example graphs over X and Y]

  5. Score Equivalence • If all we care about is the probability distribution, all we need is the equivalence class • The scoring function should give equal scores to structures from the same class • Called score equivalent • Why prefer one representation of the class to another?

  6. Equivalence Classes Are Good For You • We are ultimately looking for a probability representation, not a particular DAG • Searching individual DAGs is bad: • Some operators lead to the same class • Efficiency • Bad state connectivity for greedy search

  7. Theorem 1 (Verma & Pearl 1990) • Two DAGs are equivalent if and only if they have the same skeleton and the same v-structures [Figure: four example DAGs over X, Y, Z]
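Theorem 1 translates directly into a test on edge lists. The sketch below assumes both DAGs are over the same node set and are given as lists of `(parent, child)` pairs; the function names are illustrative.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the graph: each edge as an unordered pair."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def equivalent(edges1, edges2):
    """Same skeleton and same v-structures (Theorem 1)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

chain = [("X", "Y"), ("Y", "Z")]
fork = [("Y", "X"), ("Y", "Z")]
collider = [("X", "Y"), ("Z", "Y")]
```

Here `chain` and `fork` are equivalent, while `collider` shares their skeleton but adds the v-structure X → Y ← Z.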

  8. Partially Directed Acyclic Graph • A directed edge is called compelled in G if, for every G’ equivalent to G, that edge has the same direction • Otherwise we call it reversible • Partially Directed Acyclic Graph (PDAG) • Contains both directed and undirected edges • Does not contain any directed cycles • Theorem 1 extends naturally to PDAGs • A DAG is also a PDAG

  9. CPDAG and Consistent Extension • Completed PDAG for Class(G) contains • directed edges for the compelled edges of G • undirected edges for the reversible edges of G • A DAG G is a consistent extension of P if • G has the same skeleton and v-structures as P • Every directed edge in P has the same orientation in G [Figure: example graphs over X, Y, Z, W]

  10. CPDAGs And Equivalence • Every consistent extension of P is in Class(P) • If Pc is a completed PDAG, then every DAG G in Class(Pc) is a consistent extension of Pc • If P1 and P2 are completed PDAGs that admit consistent extensions, then P1=P2 if and only if Class(P1)=Class(P2) • A completed PDAG uniquely represents its equivalence class

  11. DAG to CPDAG (Meek 1995) • Undirect all edges except those that participate in v-structures • Direct (mark as compelled) undirected edges that match particular patterns [Figure: edge-orientation patterns over X, Y, Z, W]

  12. Constructing Consistent Extension (I) • “Theorem 26”: The undirected components of a CPDAG are chordal • Any acyclic orientation of a chordless cycle of length >3 must contain a v-structure! • Let {Ki} be the set of undirected components of a completed PDAG Pc, and let {Gi} be consistent extensions of {Ki} • The graph G that results from replacing each reversible edge in Ki with the directed edge from the corresponding Gi is a consistent extension of Pc

  13. Constructing Consistent Extension (II) • Use decreasing maximum cardinality search to direct edges in each one of the chordal components • Property of dMCS: Every path between any pair of non-adjacent x, y contains a node numbered higher than x or y • Resulting graph is a consistent extension of Pc • Works only on completed PDAGs
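A rough sketch of orienting one chordal component with maximum cardinality search follows. It uses the plain visit order rather than the slide's decreasing numbering, and the direction convention (earlier-visited node points to later-visited node) is an assumption of this sketch. On a chordal graph, the already-visited neighbors of each newly visited node form a clique, so no v-structure is created.

```python
def mcs_orient(nodes, undirected):
    """Orient the edges of a chordal undirected graph without creating v-structures.

    `undirected` is an iterable of (a, b) pairs; the result is a set of
    (parent, child) pairs. The input is assumed to be chordal.
    """
    adj = {n: set() for n in nodes}
    for a, b in undirected:
        adj[a].add(b)
        adj[b].add(a)
    order, visited = [], set()
    while len(order) < len(adj):
        # Visit the unvisited node with the most visited neighbors (ties by name).
        candidates = sorted(set(adj) - visited)
        nxt = max(candidates, key=lambda n: len(adj[n] & visited))
        order.append(nxt)
        visited.add(nxt)
    rank = {n: i for i, n in enumerate(order)}
    return {(a, b) if rank[a] < rank[b] else (b, a) for a, b in undirected}

# Triangle A-B-C with a pendant edge C-D.
oriented = mcs_orient("ABCD", [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
```

In the example, C ends up with parents A and B, which are adjacent, so the orientation introduces no v-structure.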

  14. PDAG-to-DAG (Dor & Tarsi 1992) • Select a node x in P s.t. • x has no outgoing directed edges • Every vertex joined to x by an undirected edge is adjacent to all other vertices adjacent to x • Direct all edges (x―y) toward x • x becomes a sink • Remove x from P • Works on any PDAG
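The PDAG-to-DAG procedure can be sketched as below. The selection rule coded here is the Dor & Tarsi condition as usually stated: x has no outgoing directed edge, and every undirected neighbor of x is adjacent to all other vertices adjacent to x. The representation (a set of directed pairs plus a set of undirected pairs) and the function names are assumptions of this sketch.

```python
def _adjacent(a, b, directed, undirected):
    return (frozenset((a, b)) in undirected
            or (a, b) in directed or (b, a) in directed)

def pdag_to_dag(nodes, directed, undirected):
    """Return the edges of a consistent extension, or None if no node qualifies."""
    nodes = set(nodes)
    directed = set(directed)                        # (u, v) means u -> v
    undirected = {frozenset(e) for e in undirected}
    result = set(directed)
    while nodes:
        for x in sorted(nodes):
            if any(u == x for (u, _) in directed):
                continue                            # x is not a sink
            und_nbrs = {n for e in undirected if x in e for n in e if n != x}
            adj = und_nbrs | {u for (u, v) in directed if v == x}
            if all(_adjacent(y, z, directed, undirected)
                   for y in und_nbrs for z in adj if z != y):
                break                               # x can safely become a sink
        else:
            return None                             # no consistent extension exists
        result |= {(y, x) for y in und_nbrs}        # direct x's undirected edges toward x
        directed = {(u, v) for (u, v) in directed if x not in (u, v)}
        undirected = {e for e in undirected if x not in e}
        nodes.remove(x)
    return result

chain = pdag_to_dag("XYZ", set(), [("X", "Y"), ("Y", "Z")])
square = pdag_to_dag("ABCD", set(), [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])
```

The undirected chain extends to Z → Y → X, while the undirected 4-cycle has no consistent extension, because any acyclic orientation of it creates a v-structure.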

  15. Applying the Operators

  16. Operators • The set of operators should: • Ensure global connectivity (completeness) and good connectivity in general • Be easy to check for applicability (validity) • Avoid redundancy • Allow for efficient scoring • Local scoring: local changes in G cause “local” changes in score(G)

  17. Score Decomposability • A scoring function S is decomposable if it is a product (or sum) of factors s, each depending only on one node and its parents [Figure: example graphs over X, Y, Z]
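As a small illustration of decomposability in its additive form, the sketch below sums one term per node. The `toy_local_score` function is a made-up stand-in for a real family score and is not from the slides.

```python
def total_score(parents, local_score):
    """Sum of per-node terms; `parents` maps each node to its set of parents."""
    return sum(local_score(node, frozenset(pa)) for node, pa in parents.items())

def toy_local_score(node, pa):
    # Hypothetical family score: penalize large parent sets, reward having X as a parent.
    return -1.0 * len(pa) ** 2 + (0.5 if "X" in pa else 0.0)

g = {"X": set(), "Y": {"X"}, "Z": {"Y"}}
g_plus = {"X": set(), "Y": {"X"}, "Z": {"X", "Y"}}   # g with the edge X -> Z added

full_difference = total_score(g_plus, toy_local_score) - total_score(g, toy_local_score)
local_difference = (toy_local_score("Z", frozenset({"X", "Y"}))
                    - toy_local_score("Z", frozenset({"Y"})))
```

Adding X → Z changes only Z's parent set, so the difference between the two full scores equals the difference between the two terms for Z.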

  18. Used Operators

  19. Operator Scoring • Chickering 1996a • Apply the operator and score the consistent extension (DAG) • Drawbacks: • Need to apply PDAG-to-DAG for every operator • Local operators may cause non-local changes when applied to CPDAG • Cannot benefit from local scoring

  20. Local Operator Scoring

  21. InsertU Operator – “Theorem 34” • Let Pc be any completed PDAG for which nodes x and y are not adjacent. • If, after adding an edge between x and y, Pc admits a consistent extension, then • The edge x―y is reversible if and only if x and y have exactly the same parents in the original PDAG

  22. InsertU Operator – “Theorem 6” • The insertion of the undirected edge x―y in a CPDAG Pc is valid if and only if: • x and y have the same parents in Pc • every undirected path between x and y contains at least one of their common neighbors • Only if (+Theorem 34): • Take the shortest undirected path from x to y in Pc that does not include any common neighbor of x and y • It has length at least 3 and has no chord • After adding x―y, it becomes a chordless cycle of length at least 4, so the undirected component is no longer chordal
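The two conditions of Theorem 6 can be checked with a graph search, sketched here. The sketch assumes the CPDAG is given as a set of directed pairs plus a set of undirected pairs, that x and y are not adjacent, and that "common neighbors" means nodes joined to both x and y by undirected edges. The function name is illustrative.

```python
from collections import deque

def insert_u_valid(x, y, directed, undirected):
    """True if x and y share parents and every undirected x..y path hits a common neighbor."""
    undirected = {frozenset(e) for e in undirected}

    def parents(n):
        return {u for (u, v) in directed if v == n}

    def neighbors(n):
        return {m for e in undirected if n in e for m in e if m != n}

    if parents(x) != parents(y):
        return False
    blocked = neighbors(x) & neighbors(y)           # common neighbors of x and y
    seen, queue = {x}, deque([x])
    while queue:                                    # BFS over undirected edges, skipping blocked nodes
        cur = queue.popleft()
        for nxt in neighbors(cur) - blocked - seen:
            if nxt == y:
                return False                        # found a path avoiding all common neighbors
            seen.add(nxt)
            queue.append(nxt)
    return True

ok = insert_u_valid("X", "Y", set(), [("X", "A"), ("A", "Y")])
bad = insert_u_valid("X", "Y", set(), [("X", "A"), ("A", "B"), ("B", "Y")])
```

In the first case the new edge closes a triangle through the common neighbor A. In the second it would close a chordless 4-cycle, so the insertion is rejected.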

  23. InsertU Operator – “Lemma 32” • Let Pc be any completed PDAG, and let x and y be any pair of nodes that are not adjacent. • There exists a consistent extension of Pc in which • all the reversible edges adjacent to x are directed away from x • all the reversible edges between y and the common neighbors of x and y are directed toward y • all the other reversible edges adjacent to y are directed away from y • If and only if every undirected path between x and y passes through a common neighbor of x and y

  24. InsertU Operator – Theorem 6: “If” proof outline • Use consistent extension from Lemma 32 as G • Add a directed edge x→y to G to get G’ (the other direction is symmetric) • Show that G’ is a consistent extension of P’ (P with the addition of the undirected edge x―y) • G’ is acyclic • Same skeleton • Same v-structures

  25. InsertU Operator – Theorem 6: G’ is a DAG • Assume by contradiction that there is a directed path from y to x in G • All the reversible edges are directed away from x, so the last edge in that path, w→x, is compelled • Then w is a parent of x in P, and it must also be a parent of y • In G there is then a cycle y→…→w→y [Figure: nodes W, X, Y]

  26. InsertU Operator – “Lemma 24” • Let Pc be a completed PDAG, and let P’ denote a PDAG that results from adding a single edge between x and y to Pc • Consider any consistent extension G of Pc, and G’ that results by inserting a directed edge between x and y in G • Then any v-structure in G’ but not in P’, or any v-structure in P’ but not in G’ must include the edge between x and y

  27. InsertU Operator – Theorem 6: G’ is a consistent extension of P’ • By Lemma 24, any v-structure different between G’ and P’ must include the edge x―y • The v-structure must be in G’, because in P’ this edge is undirected • The other edge in the v-structure cannot be reversible in G’ • x does not have reversible parents • y’s reversible parents are adjacent to x • But any compelled parent of x or y is a parent of both. Q.E.D.

  28. Local Operator Evaluation • Since the only difference between G and G’ is the edge x→y, we can use score decomposability to compute the score of P’ in O(1) time • $s(P') = s(P^c) + s(y,\, N_{x,y} \cup \{x\} \cup \mathrm{Pa}_y) - s(y,\, N_{x,y} \cup \mathrm{Pa}_y)$, where $N_{x,y}$ is the set of common neighbors of x and y and $\mathrm{Pa}_y$ is the parent set of y in Pc • In general we do not need to transform the CPDAG to compute neighbor scores: • Calculate scores for all the neighbor states (locally!) • Check operator validity (efficiently!) starting from the highest score
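The evaluation order described above (score everything locally, then test validity from the best score down) might look like the following sketch. The score change is the difference of two family terms for y, taken over the common neighbors of x and y together with y's parents, with and without x. All names, and the toy score, are illustrative assumptions.

```python
def insert_u_delta(local_score, x, y, common_neighbors, parents_y):
    """Score change from inserting the undirected edge between x and y: two local terms for y."""
    base = frozenset(common_neighbors) | frozenset(parents_y)
    return local_score(y, base | {x}) - local_score(y, base)

def best_valid(candidates, delta, is_valid):
    """Try candidates from highest delta down; return the first valid one, else None."""
    for move in sorted(candidates, key=delta, reverse=True):
        if is_valid(move):
            return move
    return None

def toy_local_score(node, pa):
    # Hypothetical family score used only to exercise the functions.
    return -float(len(pa)) + (2.0 if "x" in pa else 0.0)

gain = insert_u_delta(toy_local_score, "x", "y", {"n"}, {"p"})

deltas = {"m1": 3.0, "m2": 2.0, "m3": 1.0}          # precomputed local deltas per move
chosen = best_valid(deltas, deltas.get, {"m2", "m3"}.__contains__)
```

Validity is checked lazily, so the more expensive test runs only on moves that could actually be selected.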
