
Book: Constraint Processing Author: Rina Dechter






Presentation Transcript


  1. Chapter 10: Hybrids of Search and Inference Time-Space Tradeoffs Book: Constraint Processing Author: Rina Dechter Zheying Jane Yang Instructor: Dr. Berthe Choueiry

  2. Outline
  • Combining Search and Inference -- the Hybrid Idea
    - The Cycle-Cutset Scheme
  • Hybrids: Conditioning-first
    - A Hybrid Algorithm for Propositional Theories
  • Hybrids: Inference-first
    - Super Cluster Tree Elimination
    - Decomposition into Non-separable Components
    - Hybrids of Hybrids
  • A Case Study of Combinational Circuits
    - Parameters of Primary Join Trees
    - Parameters Controlling Hybrids
  • Summary

  3. Part I Hybrids of Search and Inference

  4. Two primary constraint processing schemes:
  1. Conditioning, or search: based on depth-first backtracking (BT) search. Its time may be exponential, but it needs very little memory.
  2. Inference, or derivation: variable elimination. Both time and space are exponential, but only in the induced width. It works well when the problem is sparse (w* is low).

  5. The hybrid idea
  (Figure: search maps a CSP to subproblems and subsolutions; inference maps a CSP to an equivalent CSP'.)
  • Two ways to combine the two schemes:
  • Use inference procedures as preprocessing for search: inference yields a restricted search space, then search finds the solutions.
  • Alternate between both methods: apply search to a subset of the variables, then perform inference on the rest.

  6. Comparison of BT and VE

                     Backtracking (search)   Elimination (inference)
  Worst-case time    O(exp(n))               O(n·exp(w*)), w* <= n
  Space              O(n)                    O(n·exp(w*)), w* <= n
  Task               Find one solution       Knowledge compilation

  When the induced width is small (w* <= b for a small bound b), variable elimination runs in polynomial time, and is thus far more efficient than backtracking search.

  7. 10.1 The Cycle-Cutset Scheme
  Definition 5.3.5 (cycle-cutset): Given an undirected graph, a subset of its nodes is a cycle-cutset if removing them leaves a graph with no cycles.
  Example: Figure 10.3 -- an instantiated variable cuts its own cycle (instantiating a variable, e.g. x2, breaks the cycles passing through it).

  8. The Cycle-Cutset Scheme (cont'd)
  • Once the cycle-cutset variables are instantiated, the resulting network is cycle-free and can be solved by an inference-based tree-solving algorithm (a complicated problem becomes an easy one).
  • If a solution consistent with this cutset instantiation is found, then a solution to the entire problem has been obtained.
  • Otherwise, another instantiation of the cycle-cutset variables is considered, until a solution is found.

  9. Tradeoff between finding a small cycle-cutset and using search + tree-solving
  A small cycle-cutset is desirable; however, finding a minimal-size cycle-cutset is NP-hard. A compromise between BT search and the tree-solving algorithm:
  Step 1: Use BT, keeping track of the connectivity status of the constraint graph.
  Step 2: As soon as the set of instantiated variables constitutes a cycle-cutset, switch to the tree-solving algorithm.
  Step 3: Either a consistent extension of the remaining variables is found -- a solution.
  Step 4: Or no such extension exists; in that case BT resumes and tries another instantiation of the cutset variables.
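The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the book's algorithm: the cutset assignments are simply enumerated (rather than produced incrementally by BT), and `solve_tree` is a stand-in for the tree-solving algorithm -- here it is plain backtracking, which is efficient when the residual constraint graph is a tree. All helper names (`solve_with_cutset`, `solve_tree`, `consistent`) are invented for this sketch.

```python
from itertools import product

def solve_with_cutset(variables, domains, constraints, cutset):
    """Cycle-cutset scheme (sketch): enumerate assignments to the cutset
    variables; each assignment leaves a cycle-free subproblem that a
    tree-solving routine can handle. `constraints` maps a pair of
    variable names to a binary predicate."""
    rest = [v for v in variables if v not in cutset]
    for values in product(*(domains[v] for v in cutset)):
        assignment = dict(zip(cutset, values))
        # skip cutset assignments that already violate a constraint
        if not consistent(assignment, constraints):
            continue
        solution = solve_tree(rest, domains, constraints, assignment)
        if solution is not None:
            return solution
    return None

def consistent(assignment, constraints):
    """Check all binary constraints whose scope is fully assigned."""
    for (u, v), pred in constraints.items():
        if u in assignment and v in assignment:
            if not pred(assignment[u], assignment[v]):
                return False
    return True

def solve_tree(rest, domains, constraints, assignment):
    """Stand-in tree solver: plain backtracking over the remaining
    variables (cheap when the residual graph is a tree)."""
    if not rest:
        return dict(assignment)
    var, *others = rest
    for val in domains[var]:
        assignment[var] = val
        if consistent(assignment, constraints):
            result = solve_tree(others, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None
```

For example, on a triangle A-B-C with all-different constraints over {0,1,2}, instantiating the 1-cutset {A} leaves the tree B-C, and the sketch finds a consistent coloring.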

  10. Example
  (Figure: a constraint graph over A, B, C, D, E, and the constraint tree generated by the cutset {C, D}.)

  11. Example (Figure 10.4)
  (Figure: (a) a constraint graph; (b) its ordered graph; (c) the constraint graph broken into a tree-variables part and a cycle-cutset-variables part.)
  * Presenter's note: I think Figure 10.4(c) in the textbook is wrong!

  12. Two extreme cases:
  • When the original problem has a tree constraint graph, the cycle-cutset scheme coincides with the tree algorithm.
  • When the constraint graph is complete, the algorithm reverts to regular backtracking. (Why?)

  13. Time and space complexity of the cycle-cutset algorithm
  • In the worst case, all possible assignments to the cutset variables must be tried, giving time complexity O(n·k^(c+2)), where
  • n is the number of variables,
  • c is the cycle-cutset size,
  • k is the domain size,
  • k^c is the number of tree-structured subproblems (one per cutset assignment),
  • and each tree-structured subproblem requires O((n-c)·k^2) steps of the tree algorithm.
  • Thus the time complexity is O((n-c)·k^(c+2)), i.e. O(n·k^(c+2)).
  • The space complexity is linear.

  14. 10.1.1 Structure-based recursive search
  • An algorithm that performs search only, but consults a tree-decomposition to reduce its complexity.
  • Consider a binary constraint network whose constraint graph is a tree. Given such a network:
  • remove a node x1,
  • generating two subtrees of size (approximately) n/2.
  • Let Tn be the time needed to solve this binary tree starting at x1. If x1 has at most k values, Tn obeys the recurrence
  • Tn <= 2k·T(n/2), T1 = k.
  • Then we have Tn = n·k^(log n + 1).
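Unrolling the recurrence above over its log n halving levels (a quick check; logs are base 2) recovers the closed form quoted on the slide:

```latex
T_n \le 2k\,T_{n/2}
    \le (2k)^{\log_2 n}\,T_1
    = 2^{\log_2 n}\,k^{\log_2 n}\cdot k
    = n\,k^{\log_2 n + 1}.
```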

  15. Theorem 10.1.3
  A constraint problem with n variables and k values, having a tree-decomposition of tree-width w*, can be solved by recursive search in linear space and in O(n·k^(2w*(log n + 1))) time.

  16. 10.2 Hybrids: conditioning first
  • The cycle-cutset scheme conditions on the cutset variables.
  • This suggests a framework of hybrid algorithms parameterized by a bound b on the induced width of the subproblems solved by inference:
  • conditioning on a b-cutset.

  17. Conditioning set, or b-cutset
  The algorithm removes a set of cutset variables so that the remaining constraint graph has induced width bounded by b; we call such a set a conditioning set, or b-cutset.
  Definition 10.2.1 (b-cutset): Given a graph G, a subset of nodes is called a b-cutset iff, when the subset is removed, the resulting graph has induced width less than or equal to b. A minimal b-cutset of a graph has the smallest size among all b-cutsets of the graph. A cycle-cutset is a 1-cutset of a graph.

  18. How to find a b-cutset
  Definition 10.2.3 (finding a b-cutset): Given a graph G = (V,E) and a constant b, find a minimal b-cutset; namely, find a smallest subset of nodes U such that G' = (V-U, E'), where E' includes all the edges in E that are not incident to nodes in U, has induced width less than or equal to b.
  A greedy procedure:
  Step 1: Given an ordering d = {x1, …, xn} of G, a b-cutset relative to d is obtained by processing the nodes from last to first.
  Step 2: When node x is processed, if its number of earlier neighbors is greater than b, it is added to the b-cutset.
  Step 3: Else, its earlier neighbors are connected.
  Step 4: The adjusted induced width relative to such a b-cutset is b.
  Step 5: A minimal b-cutset is a smallest one among all b-cutsets.

  19. The purpose of algorithm elim-cond(b)
  • The original problem is divided into two smaller parts:
  • the cutset variables, and
  • the remaining variables (the subproblem).
  • We run BT search on the cutset variables,
  • and bucket elimination on the remaining variables.
  • This yields the elim-cond(b) algorithm.
  • The constant b can be used to control the balance between search and variable elimination, and thus governs the tradeoff between time and space.

  20. Algorithm elim-cond(b) (textbook page 280, Figure 10.5)
  Input: A constraint network R = (X,D,C); Y ⊆ X, which is a b-cutset; d, an ordering that puts Y first, such that the adjusted induced width relative to Y along d is bounded by b; Z = X - Y.
  Output: A consistent assignment, if there is one.
  1. While ȳ = next partial solution of Y found by BT, do
     (a) z̄ ← adaptive-consistency(R_{Y=ȳ}).
     (b) if z̄ is not false, return solution = (ȳ, z̄).
  2. Endwhile.
  3. Return: the problem has no solution.

  21. The complexity of elim-cond(b)
  Theorem 10.2.2: Given R = (X,D,C), if elim-cond(b) is applied along ordering d, where Y is a b-cutset of size c_b, then the space complexity of elim-cond(b) is bounded by O(n·exp(b)), and its time complexity is bounded by O(n·exp(c_b + b)).
  Proof: Once a b-cutset assignment is fixed, the time and space complexity of the inference portion (variable elimination) is bounded by O(n·k^b). The BT portion checks all possible value combinations of the b-cutset, which takes O(k^(c_b)) time and linear space. Thus the total time complexity is O(n·k^b·k^(c_b)) = O(n·k^(b + c_b)).

  22. Part II Trade-off between search and inference

  23. The parameter b can be used to control the trade-off between search and inference
  • If b >= w*_d, where d is the ordering used by elim-cond(b), the algorithm coincides with adaptive-consistency.
  • As b decreases, the algorithm puts more nodes into the cutset, so c_b increases: the algorithm requires less space and more time. (Why?)

  24. Trade-off between search and inference
  • Let c_1 be the size of the smallest 1-cutset (cycle-cutset), and w* the smallest induced width.
  • Then we have the inequality c_1 >= w* - 1, i.e. 1 + c_1 >= w*.
  • The sum b + c_b is the exponent that determines the time complexity of the elim-cond(b) algorithm,
  • while w* dominates the complexity of bucket elimination.
  • Each time we increase b by 1, the cutset size c_b decreases:
  • 1 + c_1 >= 2 + c_2 >= … >= b + c_b >= … >= w* + c_{w*} = w*.
  • When c_{w*} = 0, the whole problem has induced width w* and there are no vertices in the cutset.
  • Search and variable elimination can also be interleaved dynamically.
  • Thus we get a hybrid scheme whose time complexity decreases as its space increases, until it reaches the induced width.

  25. Algorithm DP-DR(b)
  • A variant of elim-cond(b) for processing propositional CNF theories: a hybrid of DPLL and DR.
  • For backtracking, the DPLL algorithm applies look-ahead using unit propagation at each node.
  • For bucket elimination, the Directional Resolution (DR) algorithm is applied (see Chapter 8, page 232: resolution as the variable-elimination operator).
  • It is a special version of elim-cond(b) that incorporates dynamic variable ordering.
  Figure 10.9 (page 284): Algorithm DP-DR(b).

  26. Ed(y). y DP-DR ( , b) Input: A CNF theory  over variables X; a bound b. Output: A decision of whether  is satisfiable. If it is, an assignment to its conditioning variables, and the conditional directional extension • If unit-propagate() = false, return (false) • ElseXX- {variables in unit clauses} • If no more variables to process, return true • Else while Q  X s.t. degree(Q) <=b in the current conditioned graph • resolve over Q • if no empty clause is generated, • add all resolvents to the theory • else return false • XX – {Q} • EndWhile • Select a variable Q  X; X  X –{Q} • Y  Y{Q} • Return(DP-DR( Q, b)  ( Q, b)).

  27. y y y • The theory  conditioned on the assignment Y = is called a • conditional theory of  relative to , and is denoted by . • Conditional graph of , denoted G (y). (which is obtained • by deleting the nodes in Y and all their incident edges from G(). • The conditional induced width of a theory , denoted , • is the induced width of the graph G (y). y y Wy* Represent a Propositional Theory as an Interaction Graph • The interaction graph of a propositional theory , • denoted G(). • Each propositional variable denotes one node in the graph • Each pair of nodes in the same clause denotes an edge in • the graph,which yields a complete graph.

  28. Example
  • φ1 = {(C), (A∨B∨C), (¬A∨B∨E), (B∨C∨D)}
  • If we apply resolution over variable A, we get the new clause (B∨C∨E).
  • The resulting interaction graph therefore gains an edge between nodes E and C (Figure 10.6 in the textbook).
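The resolution step over a single variable (the operation DR performs inside a bucket) can be sketched as follows. Literal encoding and the function name are assumptions of this sketch: clauses are frozensets of integers, with negative integers for negated variables.

```python
def resolve_over(clauses, var):
    """All resolvents over `var` (sketch of one bucket's resolution
    step): pair each clause containing var with each clause containing
    -var, drop the clashing pair, and union the remaining literals.
    Tautological resolvents are filtered out."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    resolvents = set()
    for p in pos:
        for n in neg:
            r = (p - {var}) | (n - {-var})
            if not any(-lit in r for lit in r):   # skip tautologies
                resolvents.add(frozenset(r))
    return resolvents
```

Encoding the slide's example with A,B,C,D,E as 1..5, resolving (A∨B∨C) with (¬A∨B∨E) over A yields exactly {B,C,E}, the clause that adds the E-C edge to the interaction graph.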

  29. Example
  • Given a theory φ = {(C∨E), (A∨B∨C∨D), (¬A∨B∨E∨D), (B∨C∨D)}:
  • Condition on A.
  • When A=0: {(B∨C∨D), (B∨C∨D), (C∨E)}.
  • When A=1: {(B∨E∨D), (B∨C∨D), (C∨E)}.
  • Delete A and its incident edges from the interaction graph.
  • We can also delete some other edges, e.g. between B and E: when A=0 the clause (¬A∨B∨E∨D) is always satisfied, so the edges it contributed can be removed.
  (Figure: the interaction graphs of φ, φ_{A=0}, and φ_{A=1}, labeled with induced widths W*=2, W*=4, W*=3.)

  30. Example: a trace of DCDR(b=2)
  (Figure: variables whose degree in the current conditioned graph exceeds b=2 (W* > 2) are conditioned on -- A, then B, with branches A=0 and A=1; the remaining variables satisfy W* <= 2 and are eliminated by resolution through buckets C, D, and E.)

  31. Complexity of DP-DR(b)
  Theorem 10.2.5: The time complexity of algorithm DP-DR(b) is O(n·exp(c_b + b)), where c_b is the size of the largest cutset conditioned upon. The space complexity is O(n·exp(b)).

  32. Empirical evaluation of DP-DR(b)
  • Evaluation of DP-DR(b) as a conditioning-first hybrid.
  • Empirical results from experiments with differently structured CNFs:
  • random uniform 3-CNFs with 100 variables and 400 clauses;
  • (2,5)-trees with 40 cliques and 15 clauses per clique;
  • (4,8)-trees with 50 cliques and 20 clauses per clique.
  • In general, (k,m)-trees are trees of cliques, each clique having m+k nodes and separators of size k.
  • The randomly generated 3-CNFs were designed to have an interaction graph that corresponds to (k,m)-trees.
  • The performance of DP-DR(b) depends on the induced width of the theories; the overall performance is best at b=5. See Figure 10.10 on page 285.

  33. Part III Non-separable components and tree-decomposition

  34. 10.3 Hybrids: inference-first
  • Another approach to combining conditioning and inference is based on structured constraint networks, using tree-decomposition.
  • The algorithm CTE (Cluster-Tree Elimination, Chapter 9, p. 261) performs
  • variable elimination over the separators (which are small), and
  • search within the tree clusters (which are relatively large).
  • Thus we can trade even more space for time by allowing larger cliques but smaller separators.
  • This is achieved by combining adjacent nodes of a tree-decomposition that are connected by "fat" separators.
  • Rule: keep apart only those nodes that are linked by separators of bounded size.

  35. Tree decomposition (definitions from Chapter 9, page 260)
  Definition 9.2.5 (tree-decomposition): Let R = (X,D,C) be a CSP. A tree-decomposition for R is a triple <T, χ, ψ>, where T = (V,E) is a tree, and χ and ψ are labeling functions associating each vertex v ∈ V with two sets, χ(v) ⊆ X and ψ(v) ⊆ C, that satisfy the following conditions:
  1. For each constraint Ri ∈ C, there is at least one vertex v ∈ V such that Ri ∈ ψ(v) and scope(Ri) ⊆ χ(v).
  2. For each variable x ∈ X, the set {v ∈ V | x ∈ χ(v)} induces a connected subtree of T. (This is the connectedness property.)
  Definition 9.2.6 (tree-width, hyper-width, separator): The tree-width of a tree-decomposition <T, χ, ψ> is tw = max_{v∈V} |χ(v)|, and its hyper-width is hw = max_{v∈V} |ψ(v)|. Given two adjacent vertices u and v of a tree-decomposition, the separator of u and v is defined as sep(u,v) = χ(u) ∩ χ(v).
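The two conditions of Definition 9.2.5 are mechanical to check, which makes a small validator a useful companion to the definition. This sketch (invented helper name and interface) takes the tree edges, the χ labeling as a dict from vertex to variable set, and the constraint scopes; the ψ labeling is left implicit, since condition 1 only needs some cluster to contain each scope.

```python
from collections import defaultdict

def is_tree_decomposition(tree_edges, chi, scopes):
    """Check the two conditions of Definition 9.2.5 (sketch): every
    constraint scope fits inside some cluster, and for every variable
    the clusters containing it induce a connected subtree."""
    # Condition 1: each scope is contained in at least one cluster.
    for scope in scopes:
        if not any(set(scope) <= chi[v] for v in chi):
            return False
    # Condition 2 (connectedness): BFS/DFS restricted to the clusters
    # holding variable x must reach all of them.
    adj = defaultdict(set)
    for u, v in tree_edges:
        adj[u].add(v)
        adj[v].add(u)
    variables = set().union(*chi.values())
    for x in variables:
        holders = {v for v in chi if x in chi[v]}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u] & holders:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if seen != holders:
            return False
    return True
```

A two-cluster chain {A,B}-{B,C} passes; a labeling where B appears in two clusters separated by a B-free cluster violates connectedness and is rejected.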

  36. Tree-decomposition
  • Assume a CSP has a tree-decomposition with tree-width r and separator size s,
  • and assume the space restrictions do not allow memory up to O(exp(s)).
  • One way to overcome this is to collapse those nodes of the tree that are connected by large separators;
  • each combined node includes the variables and constraints of the two previous nodes.
  • The resulting tree-decomposition has larger subproblems but smaller separators:
  • as s decreases, both r and hw increase.

  37. Theorem 10.3.1
  Given a tree-decomposition T over n variables with separator sizes s0, s1, …, st, and secondary tree-decompositions having a corresponding maximal number of variables in any cluster r0, r1, …, rt, the complexity of CTE applied to each secondary tree-decomposition Ti is O(n·exp(ri)) time and O(n·exp(si)) space (i ranges over all the secondary tree-decompositions).
  • The secondary tree-decomposition Ti is generated by combining adjacent nodes whose separator sizes are strictly greater than si.

  38. Example I
  (Figure: (a) a primal constraint graph; (b) a tree-decomposition with clusters AB, BCD, BDG, GDEF, GEFH, whose separators have size 1 (B), size 2 (BD, GD), and a fat separator of size 3; (c) secondary tree-decompositions obtained by collapsing across the larger separators, producing clusters such as GDEFH and BCDGEFH.)

  39. Example II
  (Figure: (a) the primal constraint graph; (b) its induced, triangulated graph; (c) the super tree clusters, e.g. {A,B,C,D}, {B,C,D,F}, {B,E,F}, {D,F,G}, {F,G,I}, {B,F,H}, joined by separators such as {F}, {B,F}, and {F,G}.)

  40. 10.3.1 Super Cluster Tree Elimination(b)
  • Each clique is processed by search;
  • each solution created by BT search is projected onto the separator,
  • and the projected solutions are accumulated.
  • We call the resulting algorithm SUPER CLUSTER TREE ELIMINATION(b), or SCTE(b).
  • It takes a primary tree-decomposition and generates a tree-decomposition whose separator sizes are bounded by b, which is subsequently processed by CTE.

  41. Superbuckets
  (Figure 10.13: a bucket-tree, a super-bucket-tree, and a join-tree.)
  • The bucket-elimination algorithm can be extended to bucket-trees (Section 9.3).
  • A bucket-tree is a tree-decomposition; merging adjacent buckets generates a super-bucket-tree (SBT), in the same way super clusters are generated.
  • In the top-down phase of bucket elimination, several variables are then eliminated at once.
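The super-clustering step itself -- collapse every pair of adjacent clusters whose separator exceeds the bound b -- can be sketched with a union-find over the tree vertices. One pass suffices for a valid tree-decomposition: by the connectedness property, merging two clusters does not enlarge their separators with other neighbours. Function name and representation (χ as a dict of variable sets) are assumptions of this sketch.

```python
def super_cluster(tree_edges, chi, b):
    """Collapse adjacent clusters joined by separators larger than b
    (sketch of the super-clustering step used by SCTE(b)). Returns the
    merged clusters keyed by a representative vertex, plus the edges
    of the new tree; surviving separators all have size <= b."""
    parent = {v: v for v in chi}

    def find(v):
        # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # merge across every fat separator
    for u, v in tree_edges:
        if len(chi[u] & chi[v]) > b:
            parent[find(u)] = find(v)

    merged = {}
    for v in chi:
        merged.setdefault(find(v), set()).update(chi[v])
    new_edges = {tuple(sorted((find(u), find(v))))
                 for u, v in tree_edges if find(u) != find(v)}
    return merged, new_edges
```

On a chain of clusters AB - BCD - BDG - GDEF - GEFH (as in Example I) with b = 2, only the size-3 separator {G,E,F} is fat, so GDEF and GEFH collapse into {G,D,E,F,H} and all remaining separators have size at most 2.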

  42. Definition 10.3.3 (non-separable components)
  A connected graph G = (V,E) is said to have a separation node v if there exist nodes a and b such that all paths connecting a and b pass through v.
  • A graph that has a separation node is called separable; one that has none is called non-separable.
  • A subgraph with no separation nodes is called a non-separable component (or a biconnected component).

  43. 10.3.2 Decomposition into non-separable components
  • Generally we cannot find the best decomposition with bounded separator size in polynomial time.
  • The decomposition into non-separable components, however, yields a tree-decomposition in which all separators are singleton variables, so it requires only linear space (O(n·exp(sep)) with |sep| = 1).
  • Each node of the tree corresponds to a component;
  • the variables of a node are those appearing in its component,
  • and each constraint is placed in a component that contains its scope.
  • Applying CTE to such a tree requires linear space (CTE's space complexity is O(n·exp(sep)), Chapter 9, page 263),
  • but its time is exponential in the components' sizes.
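Non-separable components can be found in linear time with the classical Hopcroft-Tarjan depth-first search; this is a standard textbook version, not the book's code. Separation nodes are exactly the vertices that end up in more than one component.

```python
def biconnected_components(edges):
    """Hopcroft-Tarjan decomposition into non-separable (biconnected)
    components (sketch). Returns a list of components, each a set of
    vertices; separation nodes appear in several components."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    disc, low, stack, comps = {}, {}, [], []
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        for v in adj[u]:
            if v == parent:
                continue
            if v not in disc:
                stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:        # u separates v's subtree
                    comp = set()
                    while True:
                        e = stack.pop()
                        comp.update(e)
                        if e == (u, v):
                            break
                    comps.append(comp)
            elif disc[v] < disc[u]:          # back edge
                stack.append((u, v))
                low[u] = min(low[u], disc[v])

    for u in adj:
        if u not in disc:
            dfs(u, None)
    return comps
```

For example, two triangles sharing the vertex C decompose into the components {A,B,C} and {C,D,E}, with C as the separation node, mirroring how the C1, …, C4 components arise in the textbook's Figure on page 289.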

  44. Example: decomposition into non-separable components (textbook, page 289)
  (Figure: (a) a constraint graph; (b) its non-separable components C1 = {A,B,C,D}, C2 = {C,F,E}, C3 = {E,H,I}, C4 = {G,F,J}.)

  45. Executing message passing along a super-bucket tree
  • In a super-bucket tree, each node Ci dictates a super-cluster; in Figure 10.14 on page 289, C1 includes variables {A,B,C,D}, C2 includes variables {F,C,E}, and so on.
  • When C1 sends a message to C2, it places its message inside the receiving super-bucket C2.
  • The message (the new constraints computed by bucket C1) is sent to and placed inside bucket C2.
  • See the example on page 290.

  46. Part IV Hybrids of hybrids

  47. 10.3.3 Hybrids of hybrids
  Advantage: the space complexity of this algorithm is linear, but its time complexity can be much better than that of the cycle-cutset scheme or the non-separable components scheme alone.
  Example: case studies for circuits c432, c499, …; see Figure 10.19.

  48. Algorithm HYBRID(b1, b2)
  • Combines the two approaches, conditioning and inference.
  • Given a space parameter b1:
  • first, find a tree-decomposition with separators bounded by b1, using the super-clustering approach;
  • then, instead of pure search in each cluster, apply elim-cond(b2), with b2 <= b1.
  • The time complexity is thereby significantly reduced.
  • If c*_{b2} is the size of the maximum b2-cutset in any clique of the b1-tree-decomposition, then
  • the resulting algorithm is space exponential in b1 (separators are restricted to size b1),
  • but time exponential in c*_{b2} (the cutset size is governed by the bound b2).

  49. Two special cases
  1. Apply the cycle-cutset scheme in each clique (hybrid(b1, 1)). Experiments on real circuits, for circuit-diagnosis tasks, show that the reduction in complexity bounds for complex circuits is tremendous.
  2. When b1 = b2: for b = 1, hybrid(1,1) corresponds to taking the non-separable components as the tree-decomposition and applying the cycle-cutset scheme in each component.

  50. 10.4 A case study of combinatorial circuits (textbook pages 291-294)
  Method:
  • Use a triangulation approach to decompose each circuit graph:
  • select an ordering for the nodes,
  • triangulate the graph,
  • generate the induced graph,
  • identify its maximal cliques (by maximal cardinality).
  • Among the orderings tried, the min-degree heuristic yields the smallest clique and separator sizes.
  • See Figure 10.18.
