Explore Bucket Elimination in AI and Machine Learning: initialization, bucket processing, complexity, and tree decompositions. Learn how to determine complexity graphically and how to compute all marginals efficiently with the Junction Tree Algorithm.
Statistical Methods in AI/ML
Bucket elimination
Vibhav Gogate
Bucket Elimination: Initialization
• Put each function in exactly one bucket
• How? Along the order, find the first bucket whose variable is in the function's scope
Example (order A, E, D, F, B, C):
• Bucket A: (A,B), (A,C)
• Bucket E: (C,E), (E,F)
• Bucket D: (C,D), (D,F), (B,D)
• Buckets F, B, C: initially empty
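The initialization rule can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `init_buckets` and the representation of a function by its scope tuple are assumptions made for the example.

```python
def init_buckets(order, scopes):
    """Place each function (given by its scope) in the bucket of the
    variable in its scope that comes first along the order."""
    position = {var: i for i, var in enumerate(order)}
    buckets = {var: [] for var in order}
    for scope in scopes:
        first = min(scope, key=position.__getitem__)
        buckets[first].append(scope)
    return buckets


# Order and function scopes taken from the slide's example.
order = ["A", "E", "D", "F", "B", "C"]
scopes = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
          ("C", "D"), ("D", "F"), ("B", "D")]
buckets = init_buckets(order, scopes)
for var in order:
    print(var, buckets[var])
```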
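Bucket elimination: Processing Buckets
• Process the buckets in order
• Multiply all the functions in the bucket
• Sum out the bucket variable
• Put the new function in one of the buckets, obeying the initialization constraint
Example (order A, E, D, F, B, C):
• Bucket A: (A,B), (A,C) → ψ(B,C), placed in bucket B
• Bucket E: (C,E), (E,F) → ψ(C,F), placed in bucket F
• Bucket D: (C,D), (D,F), (B,D) → ψ(B,C,F), placed in bucket F
• Bucket F: ψ(C,F), ψ(B,C,F) → ψ2(B,C), placed in bucket B
• Bucket B: ψ(B,C), ψ2(B,C) → ψ(C), placed in bucket C
• Bucket C: ψ(C) → Z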
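The processing loop can be sketched as follows. This is a hedged illustration under several assumptions that are not in the slides: all variables are binary, a function is stored as a `(scope, table)` pair with a dictionary table, the table values are arbitrary made-up numbers, and the names `multiply_and_sum_out`, `bucket_elimination`, and `brute_force` are hypothetical. The brute-force sum is included only to check the result on this small example.

```python
import itertools
import math


def multiply_and_sum_out(factors, var, domain=(0, 1)):
    """Multiply all (scope, table) factors of a bucket, then sum out `var`."""
    scope = sorted({v for s, _ in factors for v in s})
    out_scope = tuple(v for v in scope if v != var)
    table = {}
    for values in itertools.product(domain, repeat=len(scope)):
        assign = dict(zip(scope, values))
        prod = 1.0
        for s, t in factors:
            prod *= t[tuple(assign[v] for v in s)]
        key = tuple(assign[v] for v in out_scope)
        table[key] = table.get(key, 0.0) + prod
    return out_scope, table


def bucket_elimination(order, factors, domain=(0, 1)):
    """Return Z, the sum over all assignments of the product of all factors."""
    pos = {v: i for i, v in enumerate(order)}
    buckets = {v: [] for v in order}
    for f in factors:  # initialization: first variable of the scope along the order
        buckets[min(f[0], key=pos.__getitem__)].append(f)
    z = 1.0
    for var in order:
        if not buckets[var]:
            z *= len(domain)  # variable occurs in no remaining function
            continue
        scope, table = multiply_and_sum_out(buckets[var], var, domain)
        if scope:  # new function goes to the bucket of its earliest variable
            buckets[min(scope, key=pos.__getitem__)].append((scope, table))
        else:      # empty scope: the function is a constant
            z *= table[()]
    return z


def brute_force(variables, factors, domain=(0, 1)):
    """Reference value: enumerate every joint assignment."""
    total = 0.0
    for values in itertools.product(domain, repeat=len(variables)):
        assign = dict(zip(variables, values))
        total += math.prod(t[tuple(assign[v] for v in s)] for s, t in factors)
    return total


order = ["A", "E", "D", "F", "B", "C"]
scopes = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
          ("C", "D"), ("D", "F"), ("B", "D")]
# Made-up positive tables, one per scope (illustrative values only).
factors = [(s, {(x, y): 1.0 + x + 2 * y + i for x in (0, 1) for y in (0, 1)})
           for i, s in enumerate(scopes)]
z = bucket_elimination(order, factors)
print(z, brute_force(order, factors))
```

The dictionary representation keeps the sketch short; a practical implementation would more likely use dense arrays.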
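Bucket elimination: Why does it work?
(Figure: the bucket diagram for order A, E, D, F, B, C, with the original functions (A,B), (A,C), (C,E), (E,F), (C,D), (D,F), (B,D) and the generated functions ψ(B,C), ψ(C,F), ψ(B,C,F), ψ2(B,C), ψ(C), Z.)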
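One way to see why the procedure is correct is to push each sum inward past the functions that do not mention its variable (the distributive law). The derivation below is a sketch for the running example; writing the original functions as φ(·) is a notational assumption, since the slides show them only by scope.

```latex
\begin{align*}
Z &= \sum_{C}\sum_{B}\sum_{F}\sum_{D}\sum_{E}\sum_{A}
     \phi(A,B)\,\phi(A,C)\,\phi(C,E)\,\phi(E,F)\,\phi(C,D)\,\phi(D,F)\,\phi(B,D) \\
  &= \sum_{C}\sum_{B}\sum_{F}\sum_{D} \phi(C,D)\,\phi(D,F)\,\phi(B,D)
     \sum_{E} \phi(C,E)\,\phi(E,F)
     \underbrace{\sum_{A} \phi(A,B)\,\phi(A,C)}_{\psi(B,C)} \\
  &= \sum_{C}\sum_{B} \psi(B,C) \sum_{F}
     \underbrace{\sum_{E} \phi(C,E)\,\phi(E,F)}_{\psi(C,F)}\;
     \underbrace{\sum_{D} \phi(C,D)\,\phi(D,F)\,\phi(B,D)}_{\psi(B,C,F)} \\
  &= \sum_{C}\sum_{B} \psi(B,C)
     \underbrace{\sum_{F} \psi(C,F)\,\psi(B,C,F)}_{\psi_2(B,C)} \\
  &= \sum_{C} \underbrace{\sum_{B} \psi(B,C)\,\psi_2(B,C)}_{\psi(C)}
\end{align*}
```

Each underbraced term is exactly the function produced when the corresponding bucket is processed.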
Bucket elimination: Complexity
• Cost per bucket (order A, E, D, F, B, C): exp(3), exp(3), exp(4), exp(3), exp(2), exp(1); total ≈ 6 exp(3)
• Complexity: O(n exp(w))
• w: scope size of the largest function generated
• n: number of variables
Bucket elimination: Determining complexity graphically
• Schematic operation on a graph
• Process nodes in order
• Connect all children of a node to each other
(Figure: interaction graph over A, B, C, D, E, F, processed in the order A, E, D, F, B, C.)
Bucket elimination: Complexity
• Complexity of processing bucket i: exp(children_i)
• Complexity of bucket elimination: n exp(max_i(children_i))
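The schematic procedure needs only the graph, not the function tables. The sketch below assumes that the "children" of a node are its neighbors that have not yet been processed; the function name `max_children` is hypothetical.

```python
def max_children(order, edges):
    """Run schematic bucket elimination on an undirected graph.

    Returns the largest number of children (not-yet-processed neighbors)
    seen when processing nodes along `order`.
    """
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    largest = 0
    for var in order:
        children = adj.pop(var)           # remaining neighbors of var
        largest = max(largest, len(children))
        for u in children:
            adj[u].discard(var)           # remove the processed node
            adj[u] |= children - {u}      # connect the children to each other
    return largest


# Interaction graph of the running example: one edge per function scope.
edges = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
         ("C", "D"), ("D", "F"), ("B", "D")]
print(max_children(["A", "E", "D", "F", "B", "C"], edges))
```

On the running example the maximum is reached at node D, whose children are B, C, and F; together with D itself these are the four variables behind the exp(4) bucket.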
Treewidth and Tree Decompositions
• Running schematic bucket elimination yields a chordal graph
  • Each cycle of length > 3 has a chord (an edge connecting two nodes that are not adjacent in the cycle)
• Every chordal graph can be represented using a tree decomposition
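Tree Decomposition of Chordal graphs
(Figure: tree decomposition for the order A, E, D, F, B, C with one cluster per bucket: A: ABC, E: EFC, D: DBCF, F: FBC, B: BC, C: C. Separator labels: BC, FC, FBC, BC, C.)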
Tree Decomposition and Treewidth: Definition
• Given a network and its interaction graph
• A tree decomposition is a set of subsets of variables connected by a tree such that:
  • Each variable is present in at least one subset
  • Each edge is present in at least one subset
  • The subsets containing a variable "X" form a connected sub-tree (running intersection property)
• Width of a tree decomposition: cardinality of the largest subset minus 1
• Treewidth: minimum width over all possible tree decompositions
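The three conditions can be checked mechanically. The sketch below is illustrative only: the names `is_tree_decomposition` and `width` are hypothetical, the cluster links are assumed to already form a tree (this is not verified), and the example tree is a reconstruction from the bucket example in which each cluster is linked to the cluster that receives its generated function.

```python
def is_tree_decomposition(clusters, tree_edges, graph_edges):
    """clusters: name -> set of variables; tree_edges: links between cluster
    names (assumed to form a tree); graph_edges: edges of the interaction graph."""
    variables = {v for e in graph_edges for v in e}
    # 1. Each variable is present in at least one subset.
    if not variables <= set().union(*clusters.values()):
        return False
    # 2. Each edge is present in at least one subset.
    for u, v in graph_edges:
        if not any({u, v} <= c for c in clusters.values()):
            return False
    # 3. Running intersection: the clusters containing x are connected.
    for x in variables:
        nodes = {n for n, c in clusters.items() if x in c}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            n = stack.pop()
            for a, b in tree_edges:
                for p, q in ((a, b), (b, a)):
                    if p == n and q in nodes and q not in seen:
                        seen.add(q)
                        stack.append(q)
        if seen != nodes:
            return False
    return True


def width(clusters):
    """Cardinality of the largest subset minus 1."""
    return max(len(c) for c in clusters.values()) - 1


graph_edges = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
               ("C", "D"), ("D", "F"), ("B", "D")]
clusters = {name: set(name) for name in ["ABC", "EFC", "DBCF", "FBC", "BC", "C"]}
tree_edges = [("ABC", "BC"), ("EFC", "FBC"), ("DBCF", "FBC"),
              ("FBC", "BC"), ("BC", "C")]
print(is_tree_decomposition(clusters, tree_edges, graph_edges), width(clusters))
```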
Bucket elimination: Complexity
• Best possible complexity: O(n exp(w+1)), where w is the treewidth of the graph
• Thus, we have a graph-based algorithm for determining the complexity of bucket elimination
• If w is small, we can solve the problem efficiently!
Generating Tree Decompositions
• Computing treewidth is NP-hard
  • Branch and bound algorithm (Gogate & Dechter, 2004)
  • Best-first search algorithm (Dow and Korf, 2009)
• Heuristics in practice
  • min-fill heuristic
  • min-degree heuristic
Min-degree and min-fill
• min-degree
  • At each point, select a variable with minimum degree (ties broken arbitrarily)
  • Connect the children of the variable to each other
• min-fill
  • At each point, select a variable that adds the minimum number of edges to the current graph
  • Connect the children of the selected variable to each other
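Both heuristics share the same greedy loop and differ only in how a candidate variable is scored. The sketch below is a minimal illustration with a hypothetical function name, `greedy_order`; it breaks ties alphabetically, which is one arbitrary choice among many.

```python
def greedy_order(edges, heuristic="min-fill"):
    """Build an elimination order greedily with min-fill or min-degree."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def degree(v):
        return len(adj[v])

    def fill_in(v):
        # Number of edges that eliminating v would add among its neighbors.
        nbrs = sorted(adj[v])
        return sum(1 for i, a in enumerate(nbrs)
                   for b in nbrs[i + 1:] if b not in adj[a])

    score = fill_in if heuristic == "min-fill" else degree
    order = []
    while adj:
        var = min(sorted(adj), key=score)   # alphabetical tie-breaking
        nbrs = adj.pop(var)
        for u in nbrs:
            adj[u].discard(var)
            adj[u] |= nbrs - {u}            # connect the children to each other
        order.append(var)
    return order


edges = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
         ("C", "D"), ("D", "F"), ("B", "D")]
print(greedy_order(edges, "min-degree"))
print(greedy_order(edges, "min-fill"))
```

Recomputing every score in each round is quadratic or worse; it is acceptable for a sketch but practical implementations update scores incrementally.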
Computing all Marginals
• Bucket elimination computes
  • P(e) or Z
  • P(Xi|e), where "Xi" is the last variable eliminated
• To compute all marginals P(Xi|e) for all variables Xi
  • Run bucket elimination "n" times
• Efficient algorithm
  • Junction tree algorithm or bucket tree propagation
  • Requires only two passes to compute all marginals
Junction tree algorithm: An exact message passing algorithm
• Construct a tree decomposition T
• Initialize the tree decomposition as in bucket elimination
• Select an arbitrary node of T as root
• Pass messages from leaves to root (upward pass)
• Pass messages from root to leaves (downward pass)
Message passing Equations
• Multiply all received messages except the one from R
• Multiply all functions
• Sum out all variables except the separator
(Figure: clusters S and R.)
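The slide's equation did not survive extraction. Under the assumption that S is the sending cluster and R the receiving one, the three steps correspond to the standard message equation below; the symbols Φ_S (functions assigned to S), nb(S) (neighbors of S in the tree), and m (messages) are notation introduced here, not taken from the slides.

```latex
m_{S \to R}(S \cap R) \;=\;
  \sum_{S \setminus (S \cap R)}
  \;\prod_{\phi \in \Phi_S} \phi
  \;\prod_{N \in \mathrm{nb}(S) \setminus \{R\}} m_{N \to S}
```

Here S ∩ R is the separator between the two clusters, and the sum ranges over the variables of S that are not in the separator.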
Computing all marginals
(Figure: cluster S and its marginal P(S).)
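The formula for this slide is missing from the extracted text. A standard form, stated here as an assumption and using Φ_S for the functions assigned to cluster S and m_{N→S} for the message S receives from neighbor N (notation introduced for this sketch), is:

```latex
P(S) \;\propto\; \prod_{\phi \in \Phi_S} \phi \;\prod_{N \in \mathrm{nb}(S)} m_{N \to S},
\qquad
P(X_i) \;=\; \sum_{S \setminus \{X_i\}} P(S) \quad \text{for any cluster } S \ni X_i
```

After both passes every cluster has received a message from each neighbor, so every cluster marginal, and from it every single-variable marginal, is available without further message passing.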
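Message passing Equations
• Select "EFC" as root
• Pass messages from leaves to root
• Pass messages from root to leaves
(Figure: tree decomposition with clusters ABC, EFC, DBCF, FBC, BC, C and separators BC, FC, FBC, BC, C. Functions (A,B), (A,C) are placed in ABC; (E,F), (C,E) in EFC; (C,D), (D,F), (B,D) in DBCF.)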
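The two passes can be sketched compactly by computing each directed message once and caching it; requesting all cluster marginals then triggers the same messages that an upward and a downward pass would send. Everything below is an illustrative assumption rather than the lecture's implementation: binary variables, made-up table values, a tree reconstructed from the bucket example, and the hypothetical names `multiply`, `marginalize`, `message`, and `belief`.

```python
import itertools
from functools import lru_cache

DOMAIN = (0, 1)  # assumption: all variables are binary


def multiply(factors):
    """Product of (scope, table) factors over the union of their scopes."""
    scope = tuple(sorted({v for s, _ in factors for v in s}))
    table = {}
    for values in itertools.product(DOMAIN, repeat=len(scope)):
        assign = dict(zip(scope, values))
        prod = 1.0
        for s, t in factors:
            prod *= t[tuple(assign[v] for v in s)]
        table[values] = prod
    return scope, table


def marginalize(factor, keep):
    """Sum out every variable of the factor that is not in `keep`."""
    scope, table = factor
    out = {}
    for values, p in table.items():
        key = tuple(x for v, x in zip(scope, values) if v in keep)
        out[key] = out.get(key, 0.0) + p
    return tuple(v for v in scope if v in keep), out


clusters = {n: set(n) for n in ["ABC", "EFC", "DBCF", "FBC", "BC", "C"]}
tree = [("ABC", "BC"), ("EFC", "FBC"), ("DBCF", "FBC"), ("FBC", "BC"), ("BC", "C")]
neighbors = {n: [] for n in clusters}
for a, b in tree:
    neighbors[a].append(b)
    neighbors[b].append(a)

scopes = [("A", "B"), ("A", "C"), ("C", "E"), ("E", "F"),
          ("C", "D"), ("D", "F"), ("B", "D")]
# Made-up positive tables (illustrative values only).
all_factors = [(s, {(x, y): 1.0 + x + 2 * y + i for x in DOMAIN for y in DOMAIN})
               for i, s in enumerate(scopes)]
functions = {n: [] for n in clusters}
for f in all_factors:  # put each function in one cluster that covers its scope
    home = next(n for n in clusters if set(f[0]) <= clusters[n])
    functions[home].append(f)


@lru_cache(maxsize=None)
def message(s, r):
    """Message from cluster s to its neighbor r (computed once, then cached)."""
    incoming = [message(n, s) for n in neighbors[s] if n != r]
    return marginalize(multiply(functions[s] + incoming), clusters[s] & clusters[r])


def belief(s):
    """Normalized marginal over the variables of cluster s."""
    scope, table = multiply(functions[s] + [message(n, s) for n in neighbors[s]])
    z = sum(table.values())
    return scope, {k: p / z for k, p in table.items()}


for name in clusters:
    print(name, marginalize(belief(name), {"C"})[1])
```

Every cluster contains C in this example, so all printed marginals of C should agree; that agreement is a quick sanity check on the messages.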
Architectures
• Shenoy-Shafer architecture
• Hugin architecture
  • Associate one function with each cluster
  • Requires multiplication
  • Smaller time complexity
  • Higher space complexity