## Learning Equivalence Classes of Bayesian-Network Structures

**Learning Equivalence Classes of Bayesian-Network Structures**
David M. Chickering
Presented by Dmitry Zinenko

**Heuristic Search**
• We are looking for the best state in the search space. Naively:
  • state = a particular DAG
  • search space = all possible DAGs over our variables
• Move between related states using search operators. Naively:
  • Edge addition/removal/reversal

**Heuristic Search Challenges**
• Search space graph should be well-connected
  • To reach good states quickly
  • To avoid local maxima
• Search space graph should not be too dense
• Computationally efficient scoring and transformations

**Equivalence**
• G1 and G2 are equivalent if the set of distributions that can be represented by them is identical
• Equivalence is an equivalence relation!
[Figure: example graphs over nodes X and Y]

**Score Equivalence**
• If all we care about is the probability distribution, all we need is the equivalence class
• The scoring function should give equal scores to structures from the same class
  • Such a scoring function is called score equivalent
• Why prefer one representation of the class to another?

**Equivalence Classes Are Good For You**
• We are ultimately looking for a probability representation, not a particular DAG
• Searching individual DAGs is bad:
  • Some operators lead to the same class
  • Efficiency
  • Bad state connectivity for greedy search

**Theorem 1 (Verma & Pearl 1990)**
• Two DAGs are equivalent if and only if they have the same skeletons and the same v-structures
[Figure: example DAGs over nodes X, Y, Z]

**Partially Directed Acyclic Graph**
• A directed edge is called compelled in G if, for every G’ equivalent to G, that edge has the same direction
• Otherwise we call it reversible
• Partially Directed Acyclic Graph (PDAG)
  • Contains both directed and undirected edges
  • Does not contain any directed cycles
• Theorem 1 extends naturally to PDAGs
• A DAG is also a PDAG

**CPDAG and Consistent Extension**
• The completed PDAG (CPDAG) for Class(G) contains
  • directed edges for the compelled edges of G
  • undirected edges for the reversible edges of G
• A DAG G is a consistent extension of a PDAG P if
  • G has the same skeleton and v-structures as P
  • Every directed edge in P has the same orientation in G
[Figure: example graphs over nodes X, Y, Z and X, Y, Z, W]

**CPDAGs And Equivalence**
• Every consistent extension of P is in Class(P)
• If Pc is a completed PDAG, then every DAG G in Class(Pc) is a consistent extension of Pc
• If P1 and P2 are completed PDAGs that admit consistent extensions, then P1 = P2 if and only if Class(P1) = Class(P2)
• A completed PDAG uniquely represents its equivalence class

**DAG to CPDAG (Meek 1995)**
• Undirect all edges except those that are in the v-structures
• Direct (mark as compelled) undirected edges that match particular patterns
[Figure: orientation patterns over nodes X, Y, Z, W]

**Constructing Consistent Extension (I)**
• “Theorem 26”: The undirected components of a CPDAG are chordal
  • In any chordless cycle of length >3 in a DAG, there must be a v-structure!
• Let {Ki} be the set of undirected components of a completed PDAG Pc, and let {Gi} be consistent extensions of {Ki}
• A graph G that results from replacing each reversible edge in Ki with the directed edge from the corresponding Gi is a consistent extension of Pc

**Constructing Consistent Extension (II)**
• Use decreasing maximum cardinality search (dMCS) to direct edges in each one of the chordal components
• Property of dMCS: every path between any pair of non-adjacent x, y contains a node numbered higher than x or y
• Resulting graph is a consistent extension of Pc
• Works only on completed PDAGs

**PDAG-to-DAG (Dor & Tarsi 1992)**
• Select a node x in P s.t.
  • x has no outgoing edges
  • Every vertex joined to x by an undirected edge is adjacent to all other vertices adjacent to x
• Direct all edges (x―y) toward x
  • x becomes a sink
• Remove x from P
• Repeat until P is empty; if no such x can be found, P admits no consistent extension
• Works on any PDAG

**Operators**
• The set of operators should:
  • Ensure global connectivity (completeness) and good connectivity in general
  • Be easy to check for applicability (validity)
  • Avoid redundancy
  • Allow for efficient scoring
• Local scoring: local changes in G cause “local” changes in score(G)

**Score Decomposability**
• A scoring function S is decomposable if it is a product (or sum) of factors s, each depending only on one node and its parents
• For example: [Figure: example graphs over nodes X, Y, Z; the accompanying formula is not recoverable]

**Operator Scoring**
• Chickering 1996a
  • Apply the operator and score the consistent extension (DAG)
• Drawbacks:
  • Need to apply PDAG-to-DAG for every operator
  • Local operators may cause non-local changes when applied to CPDAG
  • Cannot benefit from local scoring

**InsertU Operator – “Theorem 34”**
• Let Pc be any completed PDAG for which nodes x and y are not adjacent.
• If the PDAG obtained by adding an edge between x and y to Pc admits a consistent extension, then
  • The edge x―y is reversible if and only if x and y have exactly the same parents in the original PDAG

**InsertU Operator – “Theorem 6”**
• The insertion of the undirected edge x―y in a CPDAG Pc is valid if and only if:
  • x and y have the same parents in Pc
  • every undirected path between x and y contains at least one of their common neighbors
• Only if (+Theorem 34):
  • Take the shortest undirected path from x to y in Pc that does not include any common neighbor of x and y
  • It has length at least 3 and has no chord
  • After adding x―y it becomes a cycle of length at least 4

**InsertU Operator – “Lemma 32”**
• Let Pc be any completed PDAG, and let x and y be any pair of nodes that are not adjacent.
• There exists a consistent extension of Pc in which
  • all the reversible edges adjacent to x are directed away from x
  • all the reversible edges between y and the common neighbors of x and y are directed toward y
  • all the other reversible edges adjacent to y are directed away from y
• If and only if every undirected path between x and y passes through a common neighbor of x and y

**InsertU Operator – Theorem 6: “If” proof outline**
• Use the consistent extension from Lemma 32 as G
• Add a directed edge x→y to G to get G’ (the other direction is symmetric)
• Show that G’ is a consistent extension of P’ (P with the addition of the undirected edge x―y)
  • G’ is acyclic
  • Same skeleton
  • Same v-structures

**InsertU Operator – Theorem 6: G’ is a DAG**
• Assume by contradiction that there is a directed path from y to x in G
• All the reversible edges are directed away from x, so the last edge in that path, w→x, is compelled
• Then w is a parent of x in P, and it must also be a parent of y
• So G contains a directed cycle y→…→w→y, contradicting the fact that G is a DAG
[Figure: nodes W, X, Y]

**InsertU Operator – “Lemma 24”**
• Let Pc be a completed PDAG, and let P’ denote a PDAG that results from adding a single edge between x and y to Pc
• Consider any consistent extension G of Pc, and G’ that results by inserting a directed edge between x and y in G
• Then any v-structure in G’ but not in P’, or any v-structure in P’ but not in G’, must include the edge between x and y

**InsertU Operator – Theorem 6: G’ is a consistent extension of P’**
• By Lemma 24, any v-structure different between G’ and P’ must include the edge x―y
• The v-structure must be in G’, because in P’ this edge is undirected
• The other edge in the v-structure cannot be reversible in G’
  • x does not have reversible parents
  • y’s reversible parents are adjacent to x
• But any compelled parent of x or y is a parent of both. Q.E.D.

**Local Operator Evaluation**
• Since the only difference between G and G’ is the edge x→y, we can use score decomposability to compute the score of P’ in O(1) time
• $s(P') = s(P^c) + s(y,\; N_{x,y} \cup \{x\} \cup \Pi_y) - s(y,\; N_{x,y} \cup \Pi_y)$, where $N_{x,y}$ is the set of common neighbors of x and y and $\Pi_y$ is the set of parents of y in Pc
• In general we do not need to transform the CPDAG to compute neighbor scores:
  • Calculate scores for all the neighbor states (locally!)
  • Check operator validity (efficiently!) starting from the highest score
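
The equivalence test in Theorem 1 (Verma & Pearl 1990) can be checked mechanically. The following is a minimal sketch, not code from the presentation: it assumes a DAG is stored as a dictionary mapping each node to the set of its parents, and the function names `skeleton`, `v_structures` and `equivalent` are illustrative.

```python
from itertools import combinations


def skeleton(dag):
    """Undirected edge set of a DAG given as {node: set_of_parents}."""
    return {frozenset((p, c)) for c, parents in dag.items() for p in parents}


def v_structures(dag):
    """All triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(dag)
    found = set()
    for c, parents in dag.items():
        for a, b in combinations(sorted(parents), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, c, b))
    return found


def equivalent(g1, g2):
    """Theorem 1: same nodes, same skeleton, same v-structures."""
    return (set(g1) == set(g2)
            and skeleton(g1) == skeleton(g2)
            and v_structures(g1) == v_structures(g2))


# Three DAGs over X, Y, Z that share the skeleton X - Y - Z.
chain = {"X": set(), "Y": {"X"}, "Z": {"Y"}}       # X -> Y -> Z
fork = {"X": {"Y"}, "Y": set(), "Z": {"Y"}}        # X <- Y -> Z
collider = {"X": set(), "Y": {"X", "Z"}, "Z": set()}  # X -> Y <- Z

print(equivalent(chain, fork))      # True: no v-structures in either
print(equivalent(chain, collider))  # False: the collider has a v-structure at Y
```

The chain and the fork fall into the same equivalence class, while the collider forms a class of its own, which is why a search over individual DAGs can move between states without changing the represented distributions.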
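
The PDAG-to-DAG procedure (Dor & Tarsi 1992) can be sketched as a loop that repeatedly peels off a sink. This is an illustrative sketch under stated assumptions: the graph representation (a set of `(u, v)` tuples for directed edges u→v and a set of two-element collections for undirected edges) and the function name `pdag_to_dag` are hypothetical, and the selection test used is that x has no outgoing edges and every undirected neighbor of x is adjacent to all other neighbors of x.

```python
def pdag_to_dag(nodes, directed, undirected):
    """Return the directed edges of a consistent extension, or None if none exists."""
    nodes = set(nodes)
    directed = set(directed)
    undirected = {frozenset(e) for e in undirected}
    result = set(directed)  # directed edges of the PDAG are kept as they are

    def adjacent(a, b):
        return ((a, b) in directed or (b, a) in directed
                or frozenset((a, b)) in undirected)

    while nodes:
        for x in sorted(nodes):
            if any(u == x for u, _ in directed):
                continue  # x has an outgoing edge, so it cannot become a sink
            und_nbrs = {next(iter(e - {x})) for e in undirected if x in e}
            all_nbrs = und_nbrs | {u for u, v in directed if v == x}
            # Orienting y -> x must not create a new v-structure at x.
            if all(adjacent(y, z) for y in und_nbrs for z in all_nbrs if z != y):
                break
        else:
            return None  # no node qualifies: no consistent extension
        result |= {(y, x) for y in und_nbrs}  # direct undirected edges toward x
        directed = {(u, v) for u, v in directed if x not in (u, v)}
        undirected = {e for e in undirected if x not in e}
        nodes.remove(x)
    return result


# A -> C <- B with an undirected edge between C and D.
dag_edges = pdag_to_dag("ABCD", {("A", "C"), ("B", "C")}, [("C", "D")])

# A chordless undirected 4-cycle has no consistent extension.
no_extension = pdag_to_dag("ABCD", set(),
                           [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])
```

In the first example the undirected edge is oriented C→D, because orienting it toward C would add v-structures with A and B. In the second, any acyclic orientation of the 4-cycle creates a v-structure, so the function returns `None`.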
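
The local evaluation of InsertU relies on score decomposability: when G’ differs from G only by the edge x→y, only the factor for y changes. The sketch below illustrates this with a made-up integer-valued local score; `toy_local_score`, `total_score` and `insert_u_delta` are hypothetical names, and the toy score is a placeholder rather than a real Bayesian or BIC score.

```python
def toy_local_score(node, parents):
    """Hypothetical local score s(node, parents); a stand-in for a real factor."""
    return sum(ord(p) for p in parents) % 7 - 2 * len(parents)


def total_score(dag, local_score):
    """Decomposable score: a sum of one factor per node and its parent set."""
    return sum(local_score(node, frozenset(parents))
               for node, parents in dag.items())


def insert_u_delta(x, y, common_neighbors, parents_y, local_score):
    """Score change for inserting x - y, using only the factor for y."""
    base = frozenset(common_neighbors) | frozenset(parents_y)
    return local_score(y, base | {x}) - local_score(y, base)


# CPDAG with undirected edges x - n and n - y; n is the common neighbor.
# Consistent extension in the style of Lemma 32: x -> n -> y.
G = {"x": set(), "n": {"x"}, "y": {"n"}}
# G' adds the directed edge x -> y.
G_prime = {"x": set(), "n": {"x"}, "y": {"n", "x"}}

delta = insert_u_delta("x", "y", common_neighbors={"n"}, parents_y=set(),
                       local_score=toy_local_score)
full_difference = (total_score(G_prime, toy_local_score)
                   - total_score(G, toy_local_score))
print(delta == full_difference)  # True: only the factor for y changed
```

Computing `delta` touches a single factor, whereas `total_score` visits every node, which is the saving that makes it practical to score all neighboring states before checking validity.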