
Exact Inference Continued



  1. Exact Inference Continued

  2. A Graph-Theoretic View. Eliminating vertex v from an undirected graph G – the process of making N_G(v) a clique and then removing v and its incident edges from G. N_G(v) is the set of vertices adjacent to v in G. An elimination sequence of G is an ordering of all its vertices.

  3. Treewidth. The width w_s of an elimination sequence s is the size of the largest clique (minus 1) formed during the elimination process, namely w_s = max_v |N_Gi(v)|, where the maximum is taken over the residual graphs G_i along the sequence. The treewidth tw of a graph G is the minimum width over all elimination sequences, namely tw = min_s w_s. Examples: all trees have tw = 1, all graphs with isolated cycles have tw = 2, and cliques of size n have tw = n-1.
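
To make the definition concrete, here is a minimal sketch, assuming Python and an adjacency-set representation of the graph, that computes the width of a given elimination sequence by simulating the elimination process; the example graph at the bottom is illustrative.

```python
# A minimal sketch (not from the slides): computing the width of a given
# elimination sequence on an undirected graph stored as adjacency sets.
from itertools import combinations

def elimination_width(adjacency, order):
    """Return the width of `order`, i.e. max |N_Gi(v)| over the residual graphs."""
    g = {v: set(nbrs) for v, nbrs in adjacency.items()}  # work on a copy
    width = 0
    for v in order:
        nbrs = g[v]
        width = max(width, len(nbrs))
        # Make the neighborhood of v a clique ...
        for a, b in combinations(nbrs, 2):
            g[a].add(b)
            g[b].add(a)
        # ... then remove v and its incident edges.
        for u in nbrs:
            g[u].discard(v)
        del g[v]
    return width

# Example: a triangle {b, c, d} with a pendant vertex a; this order achieves width 2.
graph = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
print(elimination_width(graph, ["a", "b", "c", "d"]))  # -> 2
```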

  4. Another Example. [Figure: a graph on vertices x1, …, x9, drawn twice, once per elimination order.] Order 1 ("corners first"): largest clique formed has size 4, so the width is 3. Order 2 (x2, x5, …): largest clique formed has size 5, so the width is 4. Exercise: build the corresponding clique trees.

  5. Observations. Theorem: computing a posteriori probability in a Markov graph G (or in a Bayesian network for which G is the moral graph) has complexity |V| · k^w, where w is the width of the elimination sequence used and k is the largest domain size. Theorem: computing a posteriori probability in chordal graphs is polynomial in the size of the input (which includes a table for the largest clique). Justification: a chordal graph has treewidth equal to the size of its largest clique minus 1.

  6. Observations. Theorem: finding an elimination sequence that achieves the treewidth, or more precisely even deciding whether tw ≤ c, is NP-hard. Simple heuristic: at each step eliminate a vertex v that produces the smallest clique, namely one that minimizes |N_G(v)| in the current residual graph.
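
The heuristic can be stated in a few lines. Below is a minimal sketch, again assuming Python and adjacency sets; it implements the greedy rule from the slide and gives no optimality guarantee.

```python
# A minimal sketch of the greedy heuristic: repeatedly eliminate a vertex with
# the fewest neighbors in the current residual graph.
from itertools import combinations

def greedy_min_neighbors_order(adjacency):
    """Return an elimination order chosen greedily by smallest |N_Gi(v)|."""
    g = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while g:
        # Pick the vertex whose elimination forms the smallest clique.
        v = min(g, key=lambda u: len(g[u]))
        order.append(v)
        nbrs = g[v]
        for a, b in combinations(nbrs, 2):   # connect the neighborhood
            g[a].add(b)
            g[b].add(a)
        for u in nbrs:                        # remove v and its edges
            g[u].discard(v)
        del g[v]
    return order
```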

  7. Results about treewidth. Theorem(s): there are several algorithms that approximate the treewidth tw to within a small constant factor in time Poly(n) · c^tw, where c is a constant and n is the number of vertices. Main idea: find a minimum vertex (A,B)-cutset S (up to some approximation factor), make S a clique, and solve G[A,S] and G[B,S] recursively, namely make them chordal; the union graph is then chordal. Observation: the theorem above is "practical" if the approximation factor and the constant c are low enough, because computing posterior beliefs also requires complexity of at most Poly(n) · k^tw, where k is the size of the largest variable domain.

  8. Elimination Sequence with Weights. There is a need for cost functions that take the number of states into account when optimizing time complexity.
  • Elimination sequence of a weighted graph G – an ordering of the vertices of G, written as X_α = (X_α(1), …, X_α(n)), where α is a permutation on {1, …, n}.
  • The cost of eliminating vertex v from a graph G_i is the product of the weights of the vertices in N_Gi(v).

  9. Elimination Sequence with Weights (cont.)
  • The residual graph G_i is the graph obtained from G_{i-1} by eliminating vertex X_α(i-1) (with G_1 ≡ G).
  • The cost of an elimination sequence X_α is the sum of the costs of eliminating X_α(i) from G_i, for all i.
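
As a sketch of the two definitions above (assuming Python, with vertex weights given as domain sizes), the cost of a sequence can be accumulated while simulating the eliminations; `elimination_cost` is an illustrative name.

```python
# A minimal sketch of the weighted cost defined on the last two slides:
# eliminating v from the residual graph G_i costs the product of the weights of
# its current neighbors, and the cost of a sequence is the sum of these steps.
from itertools import combinations
from math import prod

def elimination_cost(adjacency, weights, order):
    g = {v: set(nbrs) for v, nbrs in adjacency.items()}
    total = 0
    for v in order:
        nbrs = g[v]
        total += prod(weights[u] for u in nbrs)   # cost of this elimination step
        for a, b in combinations(nbrs, 2):        # make N(v) a clique
            g[a].add(b)
            g[b].add(a)
        for u in nbrs:                            # remove v and its edges
            g[u].discard(v)
        del g[v]
    return total
```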

  10. Example. [Figure: the original Bayes network on vertices V, S, L, T, A, B, D, X, shown next to its undirected representation.]
  • Weights of vertices (# of states): yellow nodes w = 2, blue nodes w = 4.
  • The undirected representation, called the moral graph, is an I-map of the original network.

  11. Example (cont.). Suppose the elimination sequence is X_α = (V, B, S, …). [Figure: the residual graphs G_1, G_2, G_3 obtained along this sequence.]

  12. Finding a Good Elimination Order
  • Optimal elimination sequence: one with minimal cost. Finding it is NP-complete.
  • Heuristic search, repeated until the graph becomes empty:
  1. Compute the elimination cost of each variable in the current graph.
  2. Choose a vertex v among the k lowest-cost candidates "at random" (with probabilities determined by their current elimination costs).
  3. Eliminate vertex v from the graph (make its neighbors a clique).
  • Restart these steps until 5% of the estimated time needed to solve the inference problem with the best order found so far has been used (estimated by the sum of the state-space sizes of all cliques). A sketch of this search appears after this slide.
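
Below is a minimal sketch of this randomized search, assuming Python and the `elimination_cost` helper sketched earlier. The pool size k, the restart count, and the interpretation of the coin flip as favoring cheaper vertices are assumptions, and the 5%-of-estimated-solve-time stopping rule is simplified to a fixed number of restarts.

```python
# A minimal sketch (not the authors' implementation) of the randomized greedy
# search above; it relies on the elimination_cost helper sketched earlier.
import random
from itertools import combinations
from math import prod

def random_greedy_order(adjacency, weights, k=4, seed=None):
    rng = random.Random(seed)
    g = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while g:
        # Cost of eliminating each remaining vertex in the current residual graph.
        costs = {v: prod(weights[u] for u in g[v]) for v in g}
        pool = sorted(costs, key=costs.get)[:k]          # k cheapest candidates
        # Pick among them at random; here cheaper vertices are made more likely.
        inv = [1.0 / (1 + costs[v]) for v in pool]
        v = rng.choices(pool, weights=inv, k=1)[0]
        order.append(v)
        nbrs = g[v]
        for a, b in combinations(nbrs, 2):               # make N(v) a clique
            g[a].add(b)
            g[b].add(a)
        for u in nbrs:                                   # remove v and its edges
            g[u].discard(v)
        del g[v]
    return order

def best_of_restarts(adjacency, weights, restarts=20):
    # Keep the cheapest order found over several randomized restarts; the slide's
    # time-budget stopping rule is replaced here by a fixed restart count.
    return min((random_greedy_order(adjacency, weights, seed=i) for i in range(restarts)),
               key=lambda o: elimination_cost(adjacency, weights, o))
```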

  13. Global Conditioning. [Figure: a network on vertices A, B, C, D, E, I, J, K, L, M, shown before and after fixing the values of A and B (replaced by their instantiations a and b).] This transformation yields an I-map of P(a, b, C, D, …) for fixed values of A and B. Fixing values at the beginning of the summation can decrease the size of the tables formed by variable elimination; this way space is traded for time. Special case: choose to fix a set of nodes that "breaks all loops". This method is called cutset conditioning.
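
The conditioning idea can be sketched generically (assuming Python): enumerate assignments to the fixed variables, solve the simplified problem for each, and sum. The routine `eliminate_given` below is a hypothetical stand-in for variable elimination run on the network with the conditioning variables clamped.

```python
# A minimal sketch of conditioning: sum over joint assignments to the fixed
# variables and combine the per-assignment answers.
from itertools import product

def conditioned_query(query_var, cond_vars, domains, eliminate_given):
    """P(query_var) obtained by summing over all assignments to cond_vars."""
    totals = None
    for assignment in product(*(domains[v] for v in cond_vars)):
        clamped = dict(zip(cond_vars, assignment))
        # Unnormalized P(query_var, cond_vars = clamped) from the simpler network.
        partial = eliminate_given(query_var, clamped)
        totals = partial if totals is None else [t + p for t, p in zip(totals, partial)]
    z = sum(totals)
    return [t / z for t in totals]
```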

  14. Cutset Conditioning. [Figure: the same network on vertices A, B, C, D, E, I, J, K, L, M.] Fixing the values of A, B and L breaks all loops; we are left with solving a tree. But can we choose fewer variables to break all loops? Are some variables better choices than others? This optimization question translates to the well-known Weighted Vertex Feedback Set (WVFS) problem: choose a set of vertices of least weight that intersects every cycle of a given weighted undirected graph G.

  15. Optimization. [Figure: the same network on vertices A, B, C, D, E, I, J, K, L, M.] The weight of a node v is defined by w(v) = log(|Dom(v)|). The problem is to minimize the sum of w(v) over all v in the selected cutset. Solution idea (factor 2): remove a vertex with minimum w(v)/d(v); update each neighboring weight to w(u) − w(u)/d(u); repeat until all cycles are gone; finally make the set minimal. A sketch appears below.
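
Here is a minimal sketch of that greedy loop, assuming Python and adjacency sets. It follows the slide's phrasing directly (pick the vertex with minimum w(v)/d(v), update neighbor weights, prune to a minimal set) and should be read as an illustration rather than a verified factor-2 implementation.

```python
# A minimal sketch of the greedy feedback-vertex heuristic stated on the slide.
def has_cycle(adj):
    """True if the undirected graph (dict of adjacency sets) contains a cycle."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        stack = [(start, None)]
        while stack:
            v, parent = stack.pop()
            if v in seen:          # reached via two distinct visited neighbors
                return True
            seen.add(v)
            for u in adj[v]:
                if u != parent:
                    stack.append((u, v))
    return False

def greedy_wvfs(adjacency, weight):
    g = {v: set(n) for v, n in adjacency.items()}
    w = dict(weight)
    cutset = []
    while has_cycle(g):
        # Only vertices of degree >= 2 can lie on a cycle.
        candidates = [v for v in g if len(g[v]) >= 2]
        v = min(candidates, key=lambda u: w[u] / len(g[u]))
        cutset.append(v)
        for u in g[v]:                        # the slide's neighbor-weight update
            w[u] -= w[u] / len(g[u])
        for u in g[v]:                        # remove v and its edges
            g[u].discard(v)
        del g[v]
    # Make the set minimal: drop any vertex not needed to keep the graph acyclic.
    for v in list(cutset):
        trial = [u for u in cutset if u != v]
        restored = {x: {y for y in adjacency[x] if y not in trial}
                    for x in adjacency if x not in trial}
        if not has_cycle(restored):
            cutset.remove(v)
    return cutset
```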

  16. Summary. Variable elimination: find an order that minimizes the width; the optimal width is the treewidth. The complexity of inference grows exponentially in tw. Treewidth is smallest in trees and largest in cliques. Cutset conditioning: find a cutset that minimizes the cutset size/weight. The complexity of inference grows exponentially in the cutset size. The cutset is smallest in trees and largest in cliques. Example: small loops connected in a chain. Inference is exponential using the second method but polynomial using the first.

  17. The Loop Cutset Problem. [Figure: a directed network on vertices A, B, C, D, E, I, J, K, L, M.]
  • Each vertex that is not a sink with respect to a loop Γ is called an allowed vertex with respect to Γ.
  • A loop cutset of a directed graph D is a set of vertices that contains at least one allowed vertex with respect to each loop in D.
  • A minimum loop cutset of a weighted directed graph D is one for which the weight is minimum.
  Example: L is a sink with respect to the loop containing I and J, and an allowed vertex with respect to the loop J–L–M–J.

  18. Reduction from LC to WVFS
  • Given a weighted directed graph (D, w), produce the weighted undirected graph D_s as follows:
  • Split each vertex v of D into two vertices v_in and v_out in D_s, and connect v_in and v_out.
  • All edges entering v become undirected edges incident to v_in.
  • All edges leaving v become undirected edges incident to v_out.
  • Set w_s(v_in) = ∞ and w_s(v_out) = w(v).
  [Figure: a vertex v with weight w(v) split into v_in (weight ∞) and v_out (weight w(v)), shown for two example loops Γ1, Γ2 and their corresponding cycles C1, C2.]

  19. Algorithm LoopCutset
  Algorithm LC
  • Input: a Bayesian network D.
  • Output: a loop cutset of D.
  1. Construct the graph D_s with weight function w_s.
  2. Find a vertex feedback set F for (D_s, w_s).
  3. Output Ψ(F).
  Here Ψ(X) is the set obtained by replacing each vertex v_in or v_out in X by the corresponding source vertex v in D.
  • There is a one-to-one and onto correspondence between loops in D and cycles in D_s.
  • Hence a 2-approximation for WVFS yields a 2-approximation for LC. A sketch of the reduction appears below.
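
A minimal sketch of the reduction and the Ψ mapping, assuming Python with the directed graph given as a child-set dictionary; `approx_wvfs` stands in for any WVFS routine, such as the greedy sketch shown earlier, and float('inf') plays the role of the infinite weight.

```python
# A minimal sketch of the split construction (D -> D_s) and Algorithm LC.
def split_graph(children, w):
    """Build the undirected graph D_s and its weight function w_s."""
    adj, ws = {}, {}
    for v in children:
        vin, vout = (v, "in"), (v, "out")
        adj.setdefault(vin, set()).add(vout)      # connect v_in -- v_out
        adj.setdefault(vout, set()).add(vin)
        ws[vin], ws[vout] = float("inf"), w[v]
    for v, kids in children.items():
        for u in kids:                            # edge v -> u in D becomes
            adj[(v, "out")].add((u, "in"))        # v_out -- u_in in D_s
            adj[(u, "in")].add((v, "out"))
    return adj, ws

def loop_cutset(children, w, approx_wvfs):
    adj, ws = split_graph(children, w)            # step 1
    feedback = approx_wvfs(adj, ws)               # step 2: any WVFS solver
    return {v for (v, _tag) in feedback}          # step 3: Psi(F)
```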

  20. Approximate Inference • Loopy belief propagation • Gibbs sampling • Bounded conditioning • Likelihood Weighting • Variational methods

  21. Extra Slides If time allows

  22. The Noisy Or-Gate Model

  23. Belief Update in Poly-Trees

  24. Belief Update in Poly-Trees
