
SAT Problem Definition KR with SAT Tractable Subclasses DPLL Search Algorithm

Slides by: Florent Madelaine, Roberto Sebastiani, Edmund Clarke, Sharad Malik, Toby Walsh, Kostas Stergiou.


Presentation Transcript


  1. SAT Problem Definition KR with SAT Tractable Subclasses DPLL Search Algorithm Slides by: Florent Madelaine Roberto Sebastiani Edmund Clarke Sharad Malik Toby Walsh Kostas Stergiou KNOWLEDGE REPRESENTATION & REASONING - SAT

  2. Material of lectures on SAT • SAT definitions • Tractable subclasses • Horn-SAT • 2-SAT • CNF • Algorithms for SAT • DPLL-based • Basic chronological backtracking algorithm • Branching heuristics • Look-ahead (propagation) • Backjumping and learning • Local Search • GSAT • WalkSAT • Other enhancements • Applications of SAT • Planning as satisfiability • Hardware verification

  3. What is SAT? Given a propositional formula in Conjunctive Normal Form (CNF), find an assignment to the Boolean variables that makes the formula true: c1 = (x2 ∨ x3), c2 = (x1 ∨ x4), c3 = (x2 ∨ x4). A = {x1=0, x2=1, x3=0, x4=1} is a SATisfying assignment!

  4. Why do we study SAT? • Fundamental problem from a theoretical point of view • NP-completeness • First problem to be proved NP-complete (Cook’s theorem) • Reduction to SAT often used to prove NP-completeness of other problems • Studies on tractability • Numerous applications: • CAD, VLSI • Combinatorial Optimization • Bounded Model Checking and other types of formal software and hardware verification • AI, planning, automated deduction

  5. Representing knowledge using SAT • Embassy ball (a diplomatic problem) • King wants to invite PERU or exclude QATAR • Queen wants to invite QATAR or ROMANIA • King wants to exclude ROMANIA or PERU • Who can we invite?

  6. Representing knowledge using SAT • Embassy ball (a diplomatic problem) • King wants to invite PERU or exclude QATAR • Queen wants to invite QATAR or ROMANIA • King wants to exclude ROMANIA or PERU • (P ∨ ¬Q) ∧ (Q ∨ R) ∧ (¬R ∨ ¬P) is satisfied by P=true, Q=true, R=false and by P=false, Q=false, R=true
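
The two satisfying assignments of the embassy-ball formula can be checked by brute force. The sketch below is only an illustration: the clause representation (lists of `(variable, wanted_value)` pairs) and the helper name `satisfies` are assumptions made for this example, not part of the slides.

```python
from itertools import product

# Each clause is a list of (variable, wanted_value) pairs;
# a clause holds if at least one pair matches the assignment.
clauses = [
    [("P", True), ("Q", False)],   # invite PERU or exclude QATAR
    [("Q", True), ("R", True)],    # invite QATAR or ROMANIA
    [("R", False), ("P", False)],  # exclude ROMANIA or PERU
]

def satisfies(assignment, clauses):
    """True if every clause has at least one literal made true."""
    return all(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in clauses
    )

# Enumerate all 2^3 total assignments and keep the models.
models = []
for values in product([True, False], repeat=3):
    assignment = dict(zip("PQR", values))
    if satisfies(assignment, clauses):
        models.append(assignment)

for m in models:
    print(m)
```

Exhaustive enumeration is only feasible here because there are three variables; the algorithms later in the lecture exist precisely because this does not scale.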

  7. Other applications of SAT • Hardware verification: S = Cin ⊕ (P ⊕ Q), …

  8. Formulation of a famous problem as SAT: k-Coloring • The k-Coloring problem: Given an undirected graph G(V,E) and a natural number k, is there an assignment color: V → {1, …, k} such that color(u) ≠ color(v) for every edge (u,v) ∈ E?

  9. Formulation of a famous problem as SAT: k-Coloring • Variables: xi,j = node i is assigned the ‘color’ j (1 ≤ i ≤ n, 1 ≤ j ≤ k) • Constraints: i) At least one color for each node, e.g. for node 1: (x1,1 ∨ x1,2 ∨ … ∨ x1,k) ii) At most one color for each node iii) Coloring constraints (adjacent nodes receive different colors)
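
The three constraint groups can be generated mechanically. The following sketch assumes the standard pairwise encoding for "at most one" and for the edge constraints (the slide formulas for ii and iii did not survive in the transcript), DIMACS-style integer literals, and hypothetical helper names `var`, `coloring_cnf` and `brute_force_sat`.

```python
from itertools import combinations, product

def var(i, j, k):
    """1-based index of x_{i,j} for node i in 1..n and color j in 1..k."""
    return (i - 1) * k + j

def coloring_cnf(n, edges, k):
    """Clauses (lists of signed ints) for k-coloring a graph with nodes 1..n."""
    clauses = []
    for i in range(1, n + 1):
        # i) at least one color for node i
        clauses.append([var(i, j, k) for j in range(1, k + 1)])
        # ii) at most one color for node i (assumed pairwise encoding)
        for j1, j2 in combinations(range(1, k + 1), 2):
            clauses.append([-var(i, j1, k), -var(i, j2, k)])
    # iii) the endpoints of an edge never share a color
    for u, v in edges:
        for j in range(1, k + 1):
            clauses.append([-var(u, j, k), -var(v, j, k)])
    return clauses

def brute_force_sat(clauses, num_vars):
    """Exhaustive satisfiability check, only suitable for tiny instances."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

triangle = [(1, 2), (2, 3), (1, 3)]
print(brute_force_sat(coloring_cnf(3, triangle, 3), 9))  # triangle with 3 colors
print(brute_force_sat(coloring_cnf(3, triangle, 2), 6))  # triangle with 2 colors
```

A triangle needs three colors, so the first instance is satisfiable and the second is not.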

  10. SAT Notation • Boolean formula: • T and F are formulas • A propositional atom (variable) is a formula • If φ1 and φ2 are formulas then ¬φ1, φ1 ∧ φ2, φ1 ∨ φ2, φ1 → φ2, φ1 ↔ φ2 are formulas • Atoms(φ): the set of atoms appearing in φ • Literal: either an atom p (positive literal) or its negation ¬p (negative literal) • p and ¬p are complementary literals • Clause: a disjunction L1 ∨ … ∨ Ln, n ≥ 0, of literals • Empty clause when n = 0 (the empty clause is false in every interpretation) • Unit clause when n = 1

  11. SAT Notation • Total truth assignment μ for φ: • μ: Atoms(φ) → {T, F} • Partial truth assignment μ for φ: • μ: A → {T, F}, A ⊆ Atoms(φ) • Set and formula representation of an assignment: • μ can be represented as a set of literals: • E.g. {μ(A1) = T, μ(A2) = F} ⇒ {A1, ¬A2} • μ can be represented as a formula: • E.g. {μ(A1) = T, μ(A2) = F} ⇒ A1 ∧ ¬A2 • both representations are also used for sets of clauses (formulas)

  12. SAT Notation • μ |= φ (μ satisfies φ): • μ |= Ai ⟺ μ(Ai) = T • μ |= ¬φ ⟺ not μ |= φ • μ |= φ1 ∧ φ2 ⟺ μ |= φ1 and μ |= φ2 • ... • φ is satisfiable iff μ |= φ for some μ • φ1 |= φ2 (φ1 entails φ2) iff for every μ, μ |= φ1 ⇒ μ |= φ2 • |= φ (φ is valid) iff for every μ, μ |= φ • what does this mean for ¬φ?

  13. SAT Notation • φ1 and φ2 are equivalent iff for every μ, μ |= φ1 iff μ |= φ2 • φ1 and φ2 are equisatisfiable iff there exists μ1 s.t. μ1 |= φ1 iff there exists μ2 s.t. μ2 |= φ2 • If φ1 and φ2 are equivalent then they are also equisatisfiable, but the converse does not hold • Example: φ1 ∨ φ2 and (φ1 ∨ ¬l) ∧ (l ∨ φ2), where l does not occur in φ1 ∨ φ2, are equisatisfiable but not equivalent

  14. Conjunctive Normal Form (CNF) • A formula A is in conjunctive normal form, or simply CNF, if it is either T, or F, or a conjunction of disjunctions of literals (that is, a conjunction of clauses) • A formula B is called a conjunctive normal form of a formula A if B is equivalent to A and B is in conjunctive normal form.

  15. Conjunctive Normal Form • Every sentence in propositional logic can be transformed into conjunctive normal form, i.e. a conjunction of disjunctions • Simple algorithm: • Eliminate → using the rule that (p → q) is equivalent to (¬p ∨ q) • Use de Morgan’s laws so that negation applies to literals only • Distribute ∧ and ∨ to write the result as a conjunction of disjunctions

  16. Conjunctive Normal Form - Example: (p → q) → (r → p) • Eliminate implication signs • ¬(¬p ∨ q) ∨ (¬r ∨ p) • Apply de Morgan’s laws • (p ∧ ¬q) ∨ (¬r ∨ p) • Apply associative and distributive laws • (p ∨ ¬r ∨ p) ∧ (¬q ∨ ¬r ∨ p) • (p ∨ ¬r) ∧ (¬q ∨ ¬r ∨ p)
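
A conversion like this can be sanity-checked with a truth table. The sketch below assumes the example reads (p → q) → (r → p) with final CNF (p ∨ ¬r) ∧ (¬q ∨ ¬r ∨ p); the function names are illustrative only.

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b is (not a) or b."""
    return (not a) or b

def original(p, q, r):
    # (p -> q) -> (r -> p)
    return implies(implies(p, q), implies(r, p))

def cnf(p, q, r):
    # (p or not r) and (not q or not r or p)
    return (p or not r) and ((not q) or (not r) or p)

# Compare the two formulas on all 8 total assignments.
equivalent = all(
    original(p, q, r) == cnf(p, q, r)
    for p, q, r in product([False, True], repeat=3)
)
print(equivalent)
```

The second clause contains every literal of the first, so it is subsumed and the whole formula is equivalent to p ∨ ¬r.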

  17. Tractable Subclasses • SAT is NP-complete • therefore it is generally hard to solve! • Question: • In what ways can we restrict the expressiveness of SAT in order to achieve tractability? • Answer: • Horn-SAT • 2-SAT

  18. Algorithms for SAT • The study of algorithms for SAT dates back to 1960! • one of the most widely studied NP-complete problems • There are five general approaches to SAT solving • Resolution-based (DP) • Complete Search (DPLL) • Decision Diagrams • Incomplete Local Search • Stålmarck’s algorithm (breadth-first search) • Complete search (DPLL) and incomplete local search are the most widely used in practice and the ones we will study

  19. Algorithms for SAT • How do we test if a problem is SAT or not? • Complete methods • Return “Yes” if SATisfiable • Return “No” if UNSATisfiable • Incomplete methods • If they return “Yes”, the problem is SATisfiable • Otherwise they time out or run forever, and the problem can be SAT or UNSAT

  20. Algorithms for SAT • The first algorithm was based on resolution (Davis & Putnam, 1960) • exponential space complexity → memory explosion! • The second algorithm was based on search (Davis, Logemann, Loveland, 1962) • usually referred to as DPLL (although Putnam was not involved) • still the basis of most modern complete SAT solvers • Some early DPLL-based SAT solvers: • Tableau (NTAB), POSIT, 2cl, CSAT • not used any more (many orders of magnitude slower than modern solvers)

  21. Davis-Putnam Algorithm • Existential abstraction using resolution • Iteratively select a variable for resolution until no more variables are left • Example 1: (a ∨ b ∨ c)(b ∨ -c ∨ f)(-b ∨ e) ⇒ ∃b: (a ∨ c ∨ e)(-c ∨ e ∨ f) ⇒ ∃bc: (a ∨ e ∨ f) ⇒ ∃bcaef: T, so SAT • Example 2: (a ∨ b)(a ∨ -b)(-a ∨ c)(-a ∨ -c) ⇒ ∃b: (a)(-a ∨ c)(-a ∨ -c) ⇒ ∃ba: (c)(-c) ⇒ ∃bac: (), so UNSAT
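
Variable elimination by resolution can be sketched in a few lines. This is a minimal illustration, not the 1960 procedure in full: it assumes clauses are sets of signed integers (a=1, b=2, c=3, e=4, f=5), picks the elimination variable arbitrarily, and the names `resolve_out` and `davis_putnam` are made up for this example.

```python
def resolve_out(clauses, var):
    """Eliminate `var`: replace clauses mentioning it by all their resolvents."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    rest = [c for c in clauses if var not in c and -var not in c]
    resolvents = []
    for p in pos:
        for n in neg:
            r = (p - {var}) | (n - {-var})
            # Tautologies (containing l and -l) are always true and can be dropped.
            if not any(-lit in r for lit in r):
                resolvents.append(r)
    return set(rest) | set(resolvents)

def davis_putnam(clauses):
    """Return True if satisfiable, False if the empty clause is derived."""
    clauses = {frozenset(c) for c in clauses}
    while clauses:
        if frozenset() in clauses:
            return False  # empty clause: UNSAT
        # Pick any variable that still occurs (arbitrary elimination order).
        var = abs(next(iter(next(iter(clauses)))))
        clauses = resolve_out(clauses, var)
    return True  # no clauses left: SAT

example_sat = [[1, 2, 3], [2, -3, 5], [-2, 4]]
example_unsat = [[1, 2], [1, -2], [-1, 3], [-1, -3]]
print(davis_putnam(example_sat), davis_putnam(example_unsat))
```

The set of resolvents can grow quadratically at every elimination step, which is the source of the memory explosion attributed to the resolution-based approach.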

  23. DPLL Solvers • DPLL-based solvers are relatively small pieces of software • a few thousand lines of code • but they involve quite complex algorithms and heuristics • The evolution of SAT solvers into the modern ultra-fast tools that can tackle large (and huge) real problems is based on the following enhancements of DPLL (in increasing order of importance?): • preprocessing • advanced propagation/deduction techniques for look-ahead and preprocessing • sophisticated branching heuristics • very detailed and fast implementations + smart memory management • backjumping and learning methods

  24. DPLL • DPLL is traditionally described in a recursive way • We will use this modern iterative description due to Zhang and Malik

         status = preprocess();                   // preprocessing
         if (status != UNKNOWN) return status;
         while (1) {
             decide_next_branch();                // branching heuristics
             while (true) {
                 status = deduce();               // propagation/deduction
                 if (status == CONFLICT) {
                     blevel = analyze_conflict(); // backjumping/learning
                     if (blevel == 0)
                         return UNSATISFIABLE;
                     else
                         backtrack(blevel);
                 }
                 else if (status == SATISFIABLE)
                     return SATISFIABLE;
                 else
                     break;
             }
         }

  25. Unit Propagation • Unit propagation (UP) is the core deduction method used by all DPLL-based solvers • a clause is called unit if all but one of its literals have been assigned false (i.e. it effectively consists of a single literal) • UP repeatedly applies unit resolution (i.e. it resolves unit clauses) • Most of the solver's time is spent doing UP, so an efficient implementation of UP is of primary importance in a SAT solver

  26. DPLL examples • Given a formula in CNF: (x,y,z),(-x,y),(-y,z),(-x,-y,-z) • [Figure: search tree annotated with the Decide(), Deduce() and Analyze_Conflict() steps]
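
The example formula (x,y,z),(-x,y),(-y,z),(-x,-y,-z) can be run through a small DPLL sketch. Note the assumptions: this version is the traditional recursive formulation with chronological backtracking, not the iterative Zhang and Malik loop; it has no learning or heuristics; variables are encoded as x=1, y=2, z=3; and the function names are illustrative.

```python
def unit_propagate(clauses, assignment):
    """Simplify clauses under `assignment`, assigning unit literals as found.

    Mutates `assignment`; returns the remaining clauses, or None on conflict.
    """
    changed = True
    while changed:
        changed = False
        simplified = []
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            remaining = [l for l in clause if abs(l) not in assignment]
            if not remaining:
                return None  # every literal is false: conflict
            if len(remaining) == 1:
                lit = remaining[0]
                assignment[abs(lit)] = lit > 0  # forced by the unit clause
                changed = True
            else:
                simplified.append(remaining)
        clauses = simplified
    return clauses

def dpll(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None if UNSAT."""
    assignment = dict(assignment or {})
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None
    if not clauses:
        return assignment
    var = abs(clauses[0][0])  # naive branching: first variable seen
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

formula = [[1, 2, 3], [-1, 2], [-2, 3], [-1, -2, -3]]
model = dpll(formula)
print(model)
```

Trying x = true first propagates y = true and z = true and falsifies (-x,-y,-z), so the sketch backtracks to x = false before finding a model. Variables missing from the returned dictionary are unconstrained.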

  27. DPLL

         status = preprocess();                   // preprocessing
         if (status != UNKNOWN) return status;
         while (1) {
             decide_next_branch();                // branching heuristics
             while (true) {
                 status = deduce();               // UP
                 if (status == CONFLICT) {
                     blevel = analyze_conflict(); // backjumping/learning
                     if (blevel == 0)
                         return UNSATISFIABLE;
                     else
                         backtrack(blevel);
                 }
                 else if (status == SATISFIABLE)
                     return SATISFIABLE;
                 else
                     break;
             }
         }

  28. Propagation / Deduction • Apart from UP, several other deduction methods have been proposed and used during preprocessing (mainly) and search (less frequently) • Pure Literal rule • Binary Clause reasoning • Hyper Resolution • Failed Literal Detection • Equality Reduction • Krom Subsumption Resolution • Generalized Subsumption Resolution • … • Most of them are only used for preprocessing the formula because they are expensive • One notable exception is the pure literal rule

  29. Pure Literal Rule • The pure literal rule (Davis, Logemann, Loveland, 1962) states the following: • if a variable occurs only positively then it can be assigned true • if a variable occurs only negatively then it can be assigned false • Example: Given a formula in CNF: (x,y,z),(-x,y),(y,-w),(-x,y,-z) • y is a pure literal, so it can be assigned true • w occurs only as the pure literal -w, so it can be assigned false • Clauses with pure literals or tautologies can be removed! • a tautology is a clause of the form x ∨ ¬x ∨ y • The pure literal rule is expensive to apply during search

  30. Pure Literal Rule • The pure literal rule can be sequentially applied • Consider the formula (u ∨ w ∨ x), (-w ∨ x ∨ y), (-u ∨ -x), (v ∨ w ∨ -y) • v is a pure literal, so it can be assigned true • The formula becomes (u ∨ w ∨ x), (-w ∨ x ∨ y), (-u ∨ -x) • y is a pure literal, so it can be assigned true • The formula becomes (u ∨ w ∨ x), (-u ∨ -x) • w is a pure literal, so it can be assigned true • The formula becomes (-u ∨ -x) • both -u and -x are pure literals, so u and x can be assigned false
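
The sequential application above can be reproduced with a short loop. The sketch assumes the encoding u=1, v=2, w=3, x=4, y=5 and removes all currently pure literals in each round rather than one at a time; the function name is illustrative.

```python
def pure_literal_elimination(clauses):
    """Repeatedly assign pure literals and drop the clauses they satisfy.

    Returns (remaining_clauses, assignment).
    """
    clauses = [set(c) for c in clauses]
    assignment = {}
    while True:
        literals = {l for c in clauses for l in c}
        # A literal is pure if its complement does not occur anywhere.
        pure = {l for l in literals if -l not in literals}
        if not pure:
            return clauses, assignment
        for l in pure:
            assignment[abs(l)] = l > 0
        # Every clause containing a pure literal is now satisfied.
        clauses = [c for c in clauses if not (c & pure)]

# (u ∨ w ∨ x), (-w ∨ x ∨ y), (-u ∨ -x), (v ∨ w ∨ -y)
formula = [[1, 3, 4], [-3, 4, 5], [-1, -4], [2, 3, -5]]
remaining, assignment = pure_literal_elimination(formula)
print(remaining, assignment)
```

On this formula the rounds assign v, then y, then w, then u and x, leaving no clauses, which matches the step-by-step trace on the slide.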

  31. Other Deduction Methods (used in preprocessing and in propagation/deduction) • Weaker versions of UP • Binary UP resolves only unit and binary clauses • Can be used to solve a 2-SAT problem in quadratic time • Fixed-depth UP applies UP only up to a certain depth • Variants of Binary Resolution • BinRes, Equality Reduction, HyperBinRes • Failed Literal Detection • Hyper-Resolution • Krom Subsumption Resolution • Generalized Subsumption Resolution • Equivalence Reasoning • Etc.

  32. Failed Literal Detection • Failed literal detection (Freeman, 1995) is a one-step lookahead with UP • Say we force (assign) literal l and then perform UP. If this process yields a contradiction (empty clause) then we know that ¬l is entailed by the current input and we can force it (and then perform UP) • DPLL solvers often perform failed literal detection on a set of likely, heuristically selected, literals at each node • The SATZ system (Li & Anbulagan, 1997) was the first to show that very aggressive failed literal detection can pay off • but doing it on all literals is too expensive
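
A single failed-literal probe can be sketched as follows. The clause set, the integer encoding (a=1, b=2, c=3) and both function names are assumptions chosen for illustration.

```python
def unit_propagate(clauses, assignment):
    """Return `assignment` extended by unit propagation, or None on conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # satisfied
            unassigned = [l for l in clause if abs(l) not in assignment]
            if not unassigned:
                return None  # empty clause derived
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment

def failed_literal_probe(clauses, assignment, literal):
    """Tentatively force `literal`; if UP fails, force its negation instead.

    Returns the unchanged assignment when the probe yields no information.
    """
    trial = unit_propagate(clauses, {**assignment, abs(literal): literal > 0})
    if trial is not None:
        return assignment
    # `literal` is a failed literal, so its negation is entailed.
    return unit_propagate(clauses, {**assignment, abs(literal): literal < 0})

# (¬a ∨ b), (¬a ∨ ¬b), (a ∨ c): forcing a leads to a conflict.
clauses = [[-1, 2], [-1, -2], [1, 3]]
print(failed_literal_probe(clauses, {}, 1))
print(failed_literal_probe(clauses, {}, 3))
```

Probing a fails, so a is forced to false and UP then derives c; probing c succeeds and therefore teaches nothing. Each probe costs a full round of UP, which is why solvers restrict probing to a heuristically chosen subset of literals.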

  33. Binary Resolution • One “cheap” form of binary resolution consists of performing all possible resolutions of pairs of binary clauses • Such resolutions yield only new binary clauses or new unit clauses • BinRes (Bacchus, 2002) repeatedly: • (a) adds to the formula all new binary or unit clauses producible by resolving pairs of binary clauses, and • (b) performs UP on any new unit clauses that appear (which in turn might produce more binary clauses, causing another iteration of (a)), until either a contradiction is reached or nothing new can be added by a step of (a) or (b) • BinRes((a,b),(¬a,c),(¬b,c)) produces the new binary clauses (b,c), (a,c), and the unit clause (c). Then unit propagation yields the final reduction.

  34. Hyper Resolution • A hyper-resolution rule resolves more than two clauses at the same time • HypBinRes is a rule of inference involving hyper-resolution. It takes as input a single n-ary clause (n ≥ 2) (l1, l2, ..., ln) and n−1 binary clauses each of the form (¬li, l) (i = 1, ..., n−1). It produces as output the new binary clause (l, ln) • For example, using HypBinRes hyper-resolution on the inputs (a, b, c, d), (h, ¬a), (h, ¬c), and (h, ¬d) produces the new binary clause (h, b) • HypBinRes is equivalent to a sequence of ordinary resolution steps (i.e., resolution steps involving only two clauses). However, such a sequence would generate clauses of intermediate length, while HypBinRes only generates the final binary clause

  35. Krom Subsumption • Krom-subsumption resolution (Van Gelder and Tsuji, 1996) takes as input two clauses of the form x ∨ y and ¬x ∨ y ∨ Z and generates the clause y ∨ Z • where Z is a clause of arbitrary length • y ∨ Z subsumes (entails) ¬x ∨ y ∨ Z, therefore ¬x ∨ y ∨ Z can be deleted • Generalized subsumption resolution takes two clauses x ∨ Y and ¬x ∨ Y ∨ Z and generates Y ∨ Z • Propagation methods can be derived by repeatedly applying either form of resolution

  36. Equality Reduction • If a formula F contains (a,¬b) as well as (¬a,b), then we can form a new formula EqReduce(F) by equality reduction • Equality reduction (Bacchus, 2002) involves: • (a) replacing all instances of b in F by a (or vice versa), • (b) removing all clauses which now contain both a and ¬a, • (c) removing all duplicate instances of a (or ¬a) from all clauses • This process might generate new binary clauses • For example, EqReduce((a,¬b),(¬a,b),(a,¬b,c),(¬b,¬d),(a,b,d)) = ((¬a,¬d),(a,d)) • EqReduce(F) has a satisfying truth assignment iff F does • And any truth assignment for EqReduce(F) can be extended to one for F by assigning b the same value as a
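
Steps (a) to (c) translate directly into code. In the sketch below the function name `eq_reduce` and the integer encoding (a=1, b=2, c=3, d=4) are assumptions, and the signs in the sample input are one pattern consistent with the three rules, since the negation signs of the slide's own example are not reliably preserved in the transcript.

```python
def eq_reduce(clauses, a, b):
    """Equality reduction for equivalent variables a and b (positive ints).

    (a) replace b by a and ¬b by ¬a, (b) drop clauses containing both
    a and ¬a, (c) remove duplicate literals (done by using sets).
    """
    reduced = []
    for clause in clauses:
        new = set()
        for l in clause:
            if abs(l) == b:
                l = a if l > 0 else -a  # step (a)
            new.add(l)                  # step (c): sets drop duplicates
        if any(-l in new for l in new):
            continue                    # step (b): tautology
        if new not in reduced:
            reduced.append(new)
    return reduced

# (a,¬b), (¬a,b), (a,¬b,c), (¬b,¬d), (a,b,d)
formula = [[1, -2], [-1, 2], [1, -2, 3], [-2, -4], [1, 2, 4]]
print(eq_reduce(formula, a=1, b=2))
```

The first three clauses become tautologies and disappear, while the last two shrink to the binary clauses (¬a,¬d) and (a,d), which shows how the reduction can create new binary clauses.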

  37. DPLL

         status = preprocess();                   // HyperRes, BinRes, EqRed etc.
         if (status != UNKNOWN) return status;
         while (1) {
             decide_next_branch();                // branching heuristics
             while (true) {
                 status = deduce();               // UP
                 if (status == CONFLICT) {
                     blevel = analyze_conflict(); // backjumping/learning
                     if (blevel == 0)
                         return UNSATISFIABLE;
                     else
                         backtrack(blevel);
                 }
                 else if (status == SATISFIABLE)
                     return SATISFIABLE;
                 else
                     break;
             }
         }

  38. Decision heuristics • DLIS (Dynamic Largest Individual Sum) • For a given variable x: • Cx,p = # unresolved clauses in which x appears positively • Cx,n = # unresolved clauses in which x appears negatively • Let x be the literal for which Cx,p is maximal • Let y be the literal for which Cy,n is maximal • If Cx,p > Cy,n choose x and assign it TRUE • Otherwise choose y and assign it FALSE • Requires l (#literals) queries for each decision • (Implemented in some solvers, e.g. GRASP)

  39. Decision heuristics • DLCS (Dynamic Largest Combined Sum) • For a given variable x: • Cx,p = # unresolved clauses in which x appears positively • Cx,n = # unresolved clauses in which x appears negatively • Let x be the variable for which Cx,p + Cx,n is maximal • If Cx,p > Cx,n assign x TRUE • Otherwise assign x FALSE • Requires l (#literals) queries for each decision • (Implemented in some solvers, e.g. GRASP)

  40. Decision heuristics • Bohm’s Heuristic • At each step of the backtrack search algorithm, the BOHM heuristic selects a variable with the maximal vector (H1(x), H2(x), …, Hn(x)) in lexicographic order. Each Hi(x) is computed as follows: Hi(x) = a · max(hi(x), hi(¬x)) + b · min(hi(x), hi(¬x)) • where hi(x) is the number of unresolved clauses with i literals that contain literal x. Hence, each selected literal gives preference to satisfying small clauses (when assigned value true) or to further reducing the size of small clauses (when assigned value false) • The values of a and b are chosen heuristically

  41. Decision heuristics • Jeroslow-Wang method • Compute for every literal l, summing over every clause w that contains l: J(l) := Σ_{w : l ∈ w} 2^(−|w|) • One-sided JW: Choose a literal l that maximizes J(l) • Two-sided JW: Choose a variable x that maximizes J(x) + J(¬x) • Assign it true if J(x) ≥ J(¬x) and false otherwise • This gives an exponentially higher weight to literals in shorter clauses
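
Both Jeroslow-Wang variants can be sketched with the usual weight J(l) = Σ 2^(−|w|) over the clauses w containing l. The function names and the sample clause set are assumptions for illustration.

```python
from collections import defaultdict

def jw_scores(clauses):
    """J(l): sum of 2^-|w| over all clauses w that contain literal l."""
    scores = defaultdict(float)
    for clause in clauses:
        weight = 2.0 ** -len(clause)
        for l in clause:
            scores[l] += weight
    return scores

def one_sided_jw(clauses):
    """Literal with the largest J(l)."""
    scores = jw_scores(clauses)
    return max(scores, key=scores.get)

def two_sided_jw(clauses):
    """(variable, value): variable maximizing J(x) + J(¬x), with its polarity."""
    scores = jw_scores(clauses)
    variables = {abs(l) for l in scores}
    var = max(variables, key=lambda v: scores[v] + scores[-v])
    return var, scores[var] >= scores[-var]

clauses = [[1, 2], [-1, 3, 4], [-1, 2, -3]]
print(one_sided_jw(clauses))
print(two_sided_jw(clauses))
```

For this clause set literal 2 has the largest individual score (0.25 + 0.125), whereas variable 1 has the largest combined score (0.25 for the positive literal plus 0.25 for the negative one), so the two variants pick different branching candidates.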

  42. Decision heuristics • MOM (Maximum Occurrence of clauses of Minimum size) • Let f*(x) be the # of unresolved smallest clauses containing x. Choose x that maximizes: (f*(x) + f*(¬x)) · 2^k + f*(x) · f*(¬x) • k is chosen heuristically • The idea: • Give preference to satisfying small clauses • Among those, give preference to balanced variables (e.g. f*(x) = 3, f*(¬x) = 3 is better than f*(x) = 1, f*(¬x) = 5)

  43. Decision heuristics • VSIDS (Variable State Independent Decaying Sum) 1. Each variable in each polarity has a counter initialized to 0. 2. When a clause is added, the counters are updated. 3. The unassigned variable with the highest counter is chosen. 4. Periodically, all the counters are divided by a constant. (Implemented in Chaff)

  44. Decision heuristics • VSIDS (cont’d) • Chaff holds a list of unassigned variables sorted by the counter value • Updates are needed only when adding conflict clauses • Thus the decision is made in constant time
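
The four VSIDS steps can be sketched with one counter per literal. This is an illustration under stated assumptions, not Chaff's implementation: the class and method names are invented here, the decay constant 2.0 is arbitrary, and `pick` does a linear scan instead of maintaining the sorted list that gives Chaff its constant-time decision.

```python
class VSIDS:
    """Toy VSIDS bookkeeping: one counter per literal (variable + polarity)."""

    def __init__(self, literals, decay=2.0):
        self.counter = {l: 0.0 for l in literals}  # step 1: all counters at 0
        self.decay = decay                         # assumed decay constant

    def on_clause_added(self, clause):
        """Step 2: bump the counter of every literal in the added clause."""
        for l in clause:
            self.counter[l] += 1.0

    def decay_all(self):
        """Step 4: periodically divide all counters by a constant."""
        for l in self.counter:
            self.counter[l] /= self.decay

    def pick(self, assigned_vars):
        """Step 3: unassigned literal with the highest counter (linear scan)."""
        free = [l for l in self.counter if abs(l) not in assigned_vars]
        return max(free, key=self.counter.get) if free else None

h = VSIDS([1, -1, 2, -2, 3, -3])
h.on_clause_added([1, -2])
h.on_clause_added([-2, 3])
first = h.pick(set())      # -2 has been bumped twice
h.decay_all()
h.on_clause_added([3, 1])  # recent clause outweighs the decayed history
later = h.pick({1})        # variable 1 is assigned, so it is skipped
print(first, later)
```

After the decay, literal -2 drops to 1.0 while the freshly bumped literals reach 1.5, which illustrates how recent conflict clauses come to dominate the choice.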

  45. Decision heuristics • VSIDS is a ‘quasi-static’ strategy: - static because it doesn’t depend on the current assignment - dynamic because it gradually changes • Variables that appear in recent conflicts have higher priority • This strategy is a conflict-driven decision strategy • “..employing this strategy dramatically (i.e. an order of magnitude) improved performance ... “

  46. DPLL

         status = preprocess();                   // HyperRes, BinRes, EqRed etc.
         if (status != UNKNOWN) return status;
         while (1) {
             decide_next_branch();                // branching heuristics
             while (true) {
                 status = deduce();               // UP
                 if (status == CONFLICT) {
                     blevel = analyze_conflict(); // backjumping/learning
                     if (blevel == 0)
                         return UNSATISFIABLE;
                     else
                         backtrack(blevel);
                 }
                 else if (status == SATISFIABLE)
                     return SATISFIABLE;
                 else
                     break;
             }
         }

  47. Conflict Analysis, Learning, Backjumping • When a conflicting clause is derived (i.e. a clause with all its literals 0), the solver must backtrack • conflict analysis finds the reason for a conflict and tries to resolve it • The DPLL algorithm uses chronological backtracking • it backtracks to the most recent decision point where a variable has not yet had both of its values tried, and flips the current assignment • Modern SAT solvers employ more advanced conflict analysis techniques to identify the actual reasons for the conflict • in this way they can achieve non-chronological backjumping

  48. Conflict Analysis, Learning, Backjumping • Suppose the conflicting clause ω = (¬a ∨ x ∨ ¬c) has been derived • i.e. a=1, x=0, c=1 • A set R of value assignments to variables in the problem is called a conflict assignment if, after making these assignments and running UP, clause ω becomes unsatisfiable • assignment {a=1, x=0, c=1} is a trivial conflict assignment • But it is not of much use • Question: how can we derive more interesting conflict assignments? • Answer: determine why and at what decision level a=1, x=0, c=1 • Suppose we find that R={x=0, y=1, z=1} is also a conflict assignment for clause ω • the implied clause (x ∨ ¬y ∨ ¬z), which records the conflict assignment R, is called a conflict clause

  49. Conflict Analysis, Learning, Backjumping • Suppose that assignment x=0 of R={x=0, y=1, z=1} is chosen (or implied) at the current decision level v • assume that y=1 and z=1 are deduced at nodes v’ and v’’ respectively • suppose that v > v’ > v’’ (i.e. v’’ is closest to the root) • After adding the conflict clause (x ∨ ¬y ∨ ¬z) to the problem, we can backjump from v to v’ (skipping the nodes in between) • because whatever assignments we make there, the conflict at node v will still exist! • After we make the backjump, we can deduce x=1. Why? • because the added clause (x ∨ ¬y ∨ ¬z) will be a unit clause, forcing x=1 • without learning this clause, this deduction would not be possible • now we can avoid needless search and save time!
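
The backjump and the resulting unit deduction can be traced with concrete numbers. In this sketch the decision levels (x at 5, y at 3, z at 1), the encoding x=1, y=2, z=3, the sign pattern (x ∨ ¬y ∨ ¬z) for the conflict clause, and all helper names are assumptions chosen for illustration.

```python
def backjump_level(conflict_clause, level_of):
    """Second-highest decision level among the clause's variables (0 if none)."""
    levels = sorted({level_of[abs(l)] for l in conflict_clause}, reverse=True)
    return levels[1] if len(levels) > 1 else 0

def undo_above(assignment, level_of, level):
    """Keep only the assignments made at or below `level`."""
    return {v: val for v, val in assignment.items() if level_of[v] <= level}

def unit_literal(clause, assignment):
    """The single unassigned literal of an unsatisfied clause, else None."""
    if any(assignment.get(abs(l)) == (l > 0) for l in clause):
        return None
    free = [l for l in clause if abs(l) not in assignment]
    return free[0] if len(free) == 1 else None

conflict_clause = [1, -2, -3]             # (x ∨ ¬y ∨ ¬z)
assignment = {1: False, 2: True, 3: True}  # R = {x=0, y=1, z=1}
level_of = {1: 5, 2: 3, 3: 1}              # v = 5, v' = 3, v'' = 1

target = backjump_level(conflict_clause, level_of)
after = undo_above(assignment, level_of, target)
forced = unit_literal(conflict_clause, after)
print(target, after, forced)
```

Jumping to level 3 keeps y=1 and z=1 but undoes x, so the learned clause has exactly one unassigned literal and UP forces x=1, as described on the slide.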

  50. Conflict Analysis, Learning, Backjumping • During conflict analysis, information about conflicts is usually recorded and added to the problem as new (learned) clauses • these conflict clauses are redundant but they often help prune the search space in the future • this mechanism is called conflict-directed learning • Non-chronological backtracking is also called conflict-directed backjumping • originally proposed for CSPs (Prosser, 1993) • then incorporated in SAT solvers like GRASP (Silva and Sakallah, 1996) and rel_sat (Bayardo and Schrag, 1997) • Learning and conflict-directed backjumping can be analyzed using implication graphs, or they can be viewed as a resolution process
