Branching Strategies and Restarts in SAT Solvers (Ashish Sabharwal)

  1. Branching Strategies and Restarts in SAT Solvers (Ashish Sabharwal)

  2. Talk Outline
  • The SAT Problem, SAT Solvers
  • Conflict-Driven Systematic SAT Solvers
    • Dramatic Progress
    • Contrast with CP/MIP solvers
    • “Everything” influenced by Learned Clauses and Conflict Analysis
  • Traditional Branching Heuristics
  • CDCL Solvers: Dynamic Heuristics and Associated Techniques
    • Clause Learning
    • Lazy Data Structures
    • VSIDS Variable Selection Heuristic
    • Restarts
  • Summary

  3. SAT: Problem and Solvers

  4. Boolean Satisfiability (SAT): Basics
  • Variables with Boolean domain {T,F} or, equivalently, {1,0}
    • a: variable; a, ¬a: literals; clause = disjunction of literals
  • Constraints specified in Conjunctive Normal Form (CNF), e.g. (a or b) and (c or d or f) and (a or c or d)
  • SAT Solver: an algorithm (typically with an implementation) that, given a CNF formula F, finds a satisfying assignment for F, if there is one
  • Complete SAT Solver: must also terminate and output “unsatisfiable” if F is unsat
  • Dozens of (mostly open-source) SAT solvers available on the Internet
    • 35+ solvers participated in SAT Competition 2006, 65+ in 2011
  (An encoding sketch follows below.)
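
As a concrete illustration (not from the slides), the sketch below encodes a CNF formula in the DIMACS-style convention of signed integers and checks satisfiability by brute-force enumeration. The variable numbering is an assumption; real SAT solvers never enumerate assignments like this, but the clause representation is the one most solvers accept.

```python
# A minimal sketch (not from the talk): CNF clauses as lists of signed
# integers in the DIMACS convention (+v = literal v, -v = literal NOT v),
# checked by brute force. Real SAT solvers never enumerate like this.
from itertools import product

# The slide's example formula as printed, with the assumed numbering
# a=1, b=2, c=3, d=4, f=5.
cnf = [[1, 2], [3, 4, 5], [1, 3, 4]]

def brute_force_sat(clauses):
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
            return assignment          # satisfying assignment found
    return None                        # unsatisfiable

print(brute_force_sat(cnf))
```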

  5. SAT Solvers: 3 Dominant Approaches
  • Local Search based stochastic algorithms
    • Incomplete (do not prove unsatisfiability)
    • Very effective on satisfiable Random instances, esp. near the phase transition
  • Look-Ahead based systematic solvers
    • Complete search with careful selection of variables/values to branch on
    • Spend time exploring the “reduction” in complexity under various branching possibilities
    • Local learning: some local inference within a subtree is learned as “implication arrays”
    • Theoretical studies: autarkies, “reduction” measures based on probability distributions
    • Very effective on unsatisfiable Random instances; also effective on Crafted and some Industrial instances
  • Conflict-Driven Clause Learning (CDCL) solvers
    • Complete search producing General Resolution proofs of unsatisfiability
    • Very effective on Industrial instances, esp. large and highly interconnected ones

  6. Look-Ahead vs. CDCL SAT Solvers (Credit: Heule & van Maaren, Handbook of SAT)
  • Complementary regimes of strength
  • Plot shows the dominating solver on Crafted and Industrial instances
    • March: typical Look-Ahead solver; Minisat: typical CDCL solver
  [Figure axes: low diameter of the resolution graph (two clauses share an edge if they clash in one literal) vs. low constraint density]

  7. CDCL SAT Solvers

  8. Systematic SAT Solvers as Search Engines: Dramatic Progress in 20 Years
  • Started out with ~100 vars, ~200 constraints in the early 1990s
  • Now often easily handle over 1M vars, ~5M constraints
    • Instances with 30M clauses being used in competitions!
  • Was it all just Moore’s Law? It helped, but not much…
    • A 2x faster computer does not solve a 2x larger SAT instance
    • Search difficulty does not scale linearly with problem size!
  • Key Development Drivers
    • Academic: “open” SAT Competitions, Races, and Challenges: Germany ’89, Dimacs ’93, China ’96, SAT-2002, …, SAT-2013
    • Industrial: verification: backend of Model Checkers, SMT solvers
    • Applications to test pattern generation, optimal control, protocol design, routers, cryptography, e-commerce (e-auctions & electronic trading agents), bioinformatics (haplotype inference), etc.

  9. SAT vs. CP/MIP Search: A Contrast
  SAT solvers, esp. CDCL solvers, work in a very different setting and with very different design principles/goals:
  • Blackbox approach
    • No notion of designing custom search / decompositions as in CP Optimizer or CPLEX
    • Expected to work “out of the box” with perhaps a little parameter tuning
  • Very little structure available to exploit
    • Binary domains, CNF form: a very “flat” representation
    • Advantage: standardization, competitions, simplicity
  • No objective function to estimate for guidance or to use to assess progress
    • The number of unsatisfied clauses can be a highly misleading indicator
  • Reliance on LOTS of branching, backtracking, learning, restarting, … all performed extremely fast. How fast?
  • But note: a 1M-variable CNF formula given to a CP or MIP solver will not fly

  10. SAT Solvers as Fast Search Engines
  • CDCL SAT solvers have become really efficient at searching fast
  • E.g., on an IBM model checking instance from SAT Race 2006, with ~170k variables and ~725k clauses, solvers such as Minisat and Rsat roughly:
    • Make 2000-5000 decisions/second
    • Deduce 600-1000 conflicts/second
    • Learn 600-1000 clauses/second (the number of clauses grows rapidly)
    • Restart every 1-2 seconds (aggressive restarts)
  • Leading solvers such as Glucose have pushed restarts even further
    • Extremely aggressive restarts!
    • Rely on techniques such as phase saving, “intelligent” clause deletion (based on the LBD measure; a sketch follows below), and dynamic context-based freezing of restarts to achieve success
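
A minimal sketch of the LBD ("literal block distance") measure Glucose uses to rank learned clauses: the number of distinct decision levels among a clause's literals, where lower LBD suggests a more useful clause. The clause and level map in the usage line are made up for illustration.

```python
# A sketch of the LBD measure used by Glucose to rank learned clauses:
# the number of distinct decision levels among a clause's literals.
def lbd(clause, level_of):
    """clause: iterable of literals (signed ints);
    level_of: dict mapping variable -> decision level of its assignment."""
    return len({level_of[abs(lit)] for lit in clause})

# Hypothetical usage: a learned clause whose literals were assigned at
# decision levels 3, 3 and 7 has LBD 2.
print(lbd([-4, 9, -12], {4: 3, 9: 3, 12: 7}))   # -> 2
```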

  11. SAT vs. CP/MIP: Branching “Tree” Structure
  • CP & MIP solvers traditionally explore a well-defined underlying search tree, albeit in different heuristic orders
    • CP: typically a binary/multi-way tree with DFS or LDS exploration order
    • MIP: typically best-first style tree search with a frontier of open nodes and “diving” to obtain feasible solutions quickly
  • Modern CDCL SAT solvers are very far from building a traditional search tree!
    • Branching is “uneven”
    • Restarts are extremely frequent (context is retained using various techniques)
  • Figure annotation: under the current context, X=0 unit-propagates Y=0; Y is the 1-UIP variable in the last conflict analysis; note that Y=1 implies X=1, but Y=1 does not necessarily unit-propagate X=1
  [Figure: a “normal” search tree branching X=1 / X=0 contrasted with the “smaller” uneven CDCL search]

  12. The Importance of Learned Clauses
  • “Everything” is influenced by Conflict Analysis and Learned Clauses!
  • No need to “flip” the value of the branched-upon variable: the 1-UIP learned clause automatically implies the flipped value of the 1-UIP literal
  • Enablement of aggressive restarts
    • Safe, as context is preserved by learned clauses
  • Conflict-directed backjumping
  • Necessity of lazy data structures due to ~1000 clauses learned per second
    • Fast, but with a drawback: incomplete knowledge of the current state of all clauses
    • E.g., can no longer determine how many clauses are not yet satisfied!
  • Branching heuristic: typical state-based heuristics cannot be computed anymore with lazy data structures, due to missing information about the current state of all clauses
    • VSIDS and variations for variable selection (more later)

  13. It Wasn’t Always the Case… Traditional, State-Dependent, and History-Independent Heuristics in SAT

  14. Traditional Branching Heuristics
  • SAT solvers, before clause learning became a must-have, had many variations of state-dependent heuristics similar to CSP solvers, e.g.:
    • DLCS (Dynamic Largest Combined Sum): maximize CP(x) + CN(x), where CP(x) and CN(x) are the numbers of unresolved clauses in which variable x occurs positively and negatively, respectively
    • DLIS (Dynamic Largest Individual Sum): maximize max{CP(x), CN(x)}
  (A sketch of both scores follows below.)
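
A small Python sketch of the DLCS and DLIS scores, assuming clauses are lists of signed integers and `unresolved` holds the clauses not yet satisfied under the current partial assignment; the clause set in the usage line is hypothetical.

```python
# A sketch of DLCS / DLIS, assuming clauses are lists of signed ints and
# `unresolved` holds the clauses not yet satisfied under the current
# partial assignment.
from collections import Counter

def dlcs_dlis(unresolved, unassigned_vars):
    pos, neg = Counter(), Counter()          # CP(x), CN(x)
    for clause in unresolved:
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    dlcs = max(unassigned_vars, key=lambda x: pos[x] + neg[x])
    dlis = max(unassigned_vars, key=lambda x: max(pos[x], neg[x]))
    return dlcs, dlis

print(dlcs_dlis([[1, -2], [1, 3], [-2, -3]], [1, 2, 3]))
```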

  15. Traditional Branching Heuristics… contd.
  • BOHM [Buro & Kleine Büning, 1992]: lexicographically maximize the vector (H_1(x), H_2(x), …), where H_i(x) = α · max(h_i(x), h_i(¬x)) + β · min(h_i(x), h_i(¬x)) and h_i(x) = #unresolved size-i clauses containing literal x
  • Intuitively, satisfy the most small clauses or further reduce their size
  (A sketch of the score vector follows below.)
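
A sketch of the BOHM score vector under the form stated above; the constants α = 1, β = 2 are the commonly cited suggestion rather than anything taken from the slide, and variables are compared by their vectors lexicographically.

```python
# A sketch of the BOHM score vector, under the assumed form
# H_i(x) = alpha*max(h_i(x), h_i(-x)) + beta*min(h_i(x), h_i(-x))
# with suggested alpha=1, beta=2, where h_i(x) counts unresolved
# size-i clauses containing literal x; candidate variables are
# compared by their vectors (H_1, H_2, ...) lexicographically.
from collections import defaultdict

def bohm_vector(var, unresolved, alpha=1, beta=2):
    h = defaultdict(int)                          # (polarity, size) -> count
    sizes = set()
    for clause in unresolved:
        sizes.add(len(clause))
        for lit in clause:
            if abs(lit) == var:
                h[(lit > 0, len(clause))] += 1
    top = max(sizes, default=0)
    return tuple(alpha * max(h[(True, i)], h[(False, i)]) +
                 beta * min(h[(True, i)], h[(False, i)])
                 for i in range(1, top + 1))

# Hypothetical clause set for illustration
clauses = [[1, -2], [1, 3], [-1, -2, 3]]
best = max([1, 2, 3], key=lambda v: bohm_vector(v, clauses))
print(best, bohm_vector(best, clauses))
```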

  16. Traditional Branching Heuristics… contd.
  • MOMS: Maximum Occurrences in clauses of Minimum Size. Many variations, e.g.: maximize [f*(x) + f*(¬x)] · 2^k + f*(x) · f*(¬x), where f*(x) = #unresolved smallest clauses containing literal x; this also gives preference to variables that appear both positively and negatively in the smallest clauses
  • Jeroslow-Wang [1990]: maximize J(l) = Σ over unresolved clauses c containing literal l of 2^(−|c|), i.e., the number of unresolved clauses literal l appears in, weighted inversely proportionally to exp(clause size). Two-sided version: maximize J(x) + J(¬x)
  (A sketch of the two-sided rule follows below.)
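
A sketch of the two-sided Jeroslow-Wang rule as stated above; the clause set in the usage line is hypothetical, and branching toward the higher-scoring polarity is one common convention.

```python
# A sketch of the two-sided Jeroslow-Wang rule: score each literal l by
# J(l) = sum over unresolved clauses c containing l of 2^(-|c|), pick the
# variable maximizing J(x) + J(-x), and branch on its higher-scoring literal.
from collections import defaultdict

def jeroslow_wang(unresolved, unassigned_vars):
    J = defaultdict(float)
    for clause in unresolved:
        w = 2.0 ** -len(clause)
        for lit in clause:
            J[lit] += w
    var = max(unassigned_vars, key=lambda x: J[x] + J[-x])
    value = J[var] >= J[-var]          # branch toward the stronger polarity
    return var, value

print(jeroslow_wang([[1, -2], [1, 3, -4], [-2, -3]], [1, 2, 3, 4]))
```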

  17. Key Techniques Inside Modern CDCL Solvers

  18. DPLL Search as Implemented in Modern Solvers
  • Note: no “search tree” style search where we set x=0 and then later “flip” to x=1

  19. Clause Learning: Conflict Graphs, etc.
  • Branch: p=0, q=0, b=1
  • Detect a conflict; learn, say, the 1-UIP clause (¬a or t)
  • Backtrack to depth 2: the assignment stack has p=0, q=0
  • Search-tree behavior would now flip the value of b to get b=0
  • Instead, do nothing (not even a state update) and simply observe that t=1 is implied! Further, t=1 implies b=0 (under the current context) (see the sketch below)
  [Figure: assignment stack p=0, q=0, b=1 with the learned clause (¬a or t) asserting t=1]
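
A tiny sketch of this asserting-clause effect: after backjumping, the learned clause has exactly one unassigned literal and unit propagation asserts it, so no explicit "flip" is ever performed. The integer encoding (a = 1, t = 2) and the surviving context are assumptions for illustration.

```python
# Sketch of the "asserting clause" effect described on the slide: after
# backjumping, the learned 1-UIP clause becomes unit under the surviving
# context, and unit propagation asserts the flipped value for free.
# The literal encoding (a=1, t=2) and the fact that a is True at a
# decision level <= 2 are assumptions for illustration.
def asserting_literal(clause, assign):
    """Return the single unassigned literal if all other literals are
    false under `assign`, else None."""
    unassigned = [l for l in clause if abs(l) not in assign]
    falsified = [l for l in clause
                 if abs(l) in assign and assign[abs(l)] != (l > 0)]
    if len(unassigned) == 1 and len(falsified) == len(clause) - 1:
        return unassigned[0]
    return None

learned = [-1, 2]                 # (not a or t), with a=1, t=2
context = {1: True}               # after backjumping, a=True survives
print(asserting_literal(learned, context))   # -> 2, i.e. t=1 is implied
```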

  20. Lazy Data Structures
  • SAT solvers (used to) spend 80% of their time doing unit propagation
  • Must make unit propagation efficient
    • as more and more clauses are added (clause learning)
    • as longer clauses are added (initial clauses tend to be mostly short)
  • Observation: watching two un-falsified literals is sufficient, no matter how long the clause is!
    • With 2 un-falsified literals, the clause is guaranteed not to unit propagate or be falsified
    • Can ignore processing most clauses unless the literal under consideration is being watched in them
  • Head and Tail Lists: SATO solver [1997]
  • Watched Literals: zChaff solver [2001] (see the propagation sketch below)
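
A simplified sketch (assumed, not Minisat's or zChaff's actual code) of two-watched-literal propagation: each clause keeps its two watched literals in positions 0 and 1, and a clause is visited only when one of those two literals becomes false.

```python
# A simplified sketch of two-watched-literal unit propagation: each clause
# watches two non-falsified literals (kept at positions 0 and 1), and is
# only visited when one of its watched literals becomes false.
from collections import defaultdict

def value(lit, assign):                 # True / False / None (unassigned)
    v = assign.get(abs(lit))
    return None if v is None else (v if lit > 0 else not v)

def propagate(clauses, watches, assign, trail):
    i = 0
    while i < len(trail):               # trail holds literals assigned True
        false_lit = -trail[i]; i += 1
        for ci in list(watches[false_lit]):
            clause = clauses[ci]
            if clause[0] == false_lit:  # keep the false watch at position 1
                clause[0], clause[1] = clause[1], clause[0]
            if value(clause[0], assign) is True:
                continue                # clause already satisfied
            for k in range(2, len(clause)):     # look for a replacement watch
                if value(clause[k], assign) is not False:
                    clause[1], clause[k] = clause[k], clause[1]
                    watches[false_lit].remove(ci)
                    watches[clause[1]].append(ci)
                    break
            else:                       # no replacement: unit or conflict
                first = value(clause[0], assign)
                if first is False:
                    return ci           # conflict: all literals false
                if first is None:       # unit: imply clause[0]
                    assign[abs(clause[0])] = clause[0] > 0
                    trail.append(clause[0])
    return None                         # no conflict

# Hypothetical usage: (x1 or x2) and (-x1 or x3); assigning x1=False forces x2.
clauses = [[1, 2], [-1, 3]]
watches = defaultdict(list)
for ci, c in enumerate(clauses):
    watches[c[0]].append(ci); watches[c[1]].append(ci)
assign = {1: False}
trail = [-1]
print(propagate(clauses, watches, assign, trail), assign)
```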

  21. H/T Lists vs. Watched Literals (Credit: Marques-Silva, Lynce, & Malik; Handbook of SAT)
  • The WL structure needs:
    • no pointer “trail” maintenance
    • no work when backtracking
  • But it can mean exploring the whole clause to detect a unit literal

  22. Dynamic Variable Selection Heuristics
  VSIDS: Variable State Independent Decaying Sum (zChaff solver)
  • Fast heuristic: not extremely accurate, but adaptive and informed by conflicts!
  • A key ingredient in making SAT solvers work well on industrial instances
  • Necessitated by lazy data structures: accurate information about reduced clause sizes is no longer available
  • Maintain one score for each literal
    • Increase the score of literals appearing in the conflict clause
    • Periodically divide all scores by 2
  Several variations, e.g., the Berkmin solver:
  • One score for each variable, incremented for all vars appearing in the 1-UIP analysis
  • More importantly: the variable is chosen from the most recently learned and yet-unsatisfied conflict clause!
  (A sketch of the scoring scheme follows below.)
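
A sketch of VSIDS-style scoring in the per-variable flavor mentioned above; the constants (bump amount, decay factor, decay period) are assumptions, not zChaff's or Berkmin's actual values.

```python
# A sketch of VSIDS-style scoring with assumed constants: bump the activity
# of variables seen in conflict analysis, periodically halve all scores,
# and branch on the unassigned variable with the highest activity.
class VSIDS:
    def __init__(self, num_vars, bump=1.0, decay=0.5, decay_period=256):
        self.activity = {v: 0.0 for v in range(1, num_vars + 1)}
        self.bump, self.decay = bump, decay
        self.decay_period, self.conflicts = decay_period, 0

    def on_conflict(self, conflict_vars):
        for v in conflict_vars:                     # bump vars in the conflict
            self.activity[v] += self.bump
        self.conflicts += 1
        if self.conflicts % self.decay_period == 0:
            for v in self.activity:                 # periodically halve scores
                self.activity[v] *= self.decay

    def pick(self, unassigned):
        return max(unassigned, key=lambda v: self.activity[v])

h = VSIDS(4)
h.on_conflict([2, 3]); h.on_conflict([3])
print(h.pick([1, 2, 3, 4]))     # -> 3
```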

  23. Restarts: Without Clause Learning
  [Figure legend: power-law decay, exponential decay, standard distribution (finite mean & variance)]
  • Originally motivated by observations about runtime distributions of SAT solvers without conflict learning [Gomes et al., 1998]
  • Really effective when “heavy-tailed” behavior is present, with many short runs (“backdoors”) and many very long runs
  • The easy-to-grasp concept: the key is the role of the exponential distribution (geometric distribution, really, in the discrete case)
  • If the probability of failure after time T:
    (a) decays faster than exponentially → it hurts to restart
    (b) decays exponentially → it doesn’t matter (easy solution strategy: keep restarting!)
    (c) decays slower than exponentially → one should restart
  (A sketch of a common restart schedule follows below.)
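
Not from the slides, but a concrete example of a restart schedule widely used in practice: the Luby sequence (Luby, Sinclair, and Zuckerman), a universal strategy for unknown runtime distributions. The sketch below computes the cutoff for the i-th restart; the unit of 100 conflicts is an arbitrary choice for illustration.

```python
# A sketch of the Luby restart schedule: cutoffs grow as
# unit * (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ...).
def luby(i):
    """1-based i-th element of the Luby sequence."""
    k = 1
    while (1 << k) - 1 < i:            # smallest k with i <= 2^k - 1
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

unit = 100                              # e.g., 100 conflicts per unit (assumed)
print([unit * luby(i) for i in range(1, 11)])
# -> [100, 100, 200, 100, 100, 200, 400, 100, 100, 200]
```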

  24. Restarts: With Clause Learning
  • No clear empirical runtime distribution study (to my knowledge); however, large runtime variations are often observed in practice, and rapid restarts help!
  • Safe: context is kept through learned clauses and the associated heuristics
  • Theoretical justification/intuition: do we really need restarts?
    • Stems from the characterization of the Clause Learning proof system (CL) and its relation to General Resolution (RES)
    • Full simulation of RES by CL is known only in the presence of restarts!
      • CL (specific learning scheme, no restarts) is exponentially more powerful than any “natural and proper” fragment of RES [2003]
      • CL** + lots of restarts = RES [2003]
      • F has a short RES proof ⇒ F’ has a short CL proof without restarts [2008]
      • CL + lots of restarts = RES [2009]
      • CL has short proofs of natural candidate formulas for separation [2012]

  25. Summary
  • Dramatic progress in CDCL SAT solvers
    • High contrast with CP/MIP solvers w.r.t. “tree” structure
    • “Everything” influenced by Learned Clauses and Conflict Analysis
  • Traditional Branching Heuristics
    • Exist but are no longer common (except in Look-Ahead SAT solvers)
  • CDCL solvers: interesting search design, no clear “tree”
    • Clause Learning
    • Lazy Data Structures
    • VSIDS Variable Selection Heuristic
    • Aggressive (but careful) Restarts
  • Reference: Handbook of SAT
    • 27 chapters: everything from historical perspectives, theoretical foundations, practical solvers, applications, …

  26. Extra slides

  27. Goal of This Talk
  Highlight key advances in the design of DPLL-based SAT solvers that have made this scaling feasible
  • Note: it is not just the “simplicity” of the constraints per se
    • E.g., a CNF formula F given as a set of “clause constraints” to the IBM/ILOG CP solver or to a MIP solver would not scale up!
  • Several fundamental techniques make modern SAT solvers behave very differently from traditional branch-and-backtrack search; e.g.:
    • there is no longer a clearly defined “search tree”, or even a search data structure that “tries both branches” / “flips a variable’s value”
    • they don’t even look at most of the clauses when branching and propagating
    • they literally do nothing upon backtracking besides un-assigning variable values (no “state” to revert to)

  28. Basic DPLL Search for SAT
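
For contrast with the CDCL behavior described above, here is a sketch of classic recursive DPLL with unit propagation; the variable choice is arbitrary rather than any of the heuristics discussed, and the clause representation (lists of signed integers) is an assumption.

```python
# A sketch of basic recursive DPLL with unit propagation. Clauses are lists
# of signed ints; returns a satisfying assignment (dict) or None.
def simplify(clauses, lit):
    """Drop clauses satisfied by lit and delete the falsified literal -lit."""
    out = []
    for c in clauses:
        if lit in c:
            continue
        out.append([l for l in c if l != -lit])
    return out

def dpll(clauses, assign=None):
    assign = dict(assign or {})
    changed = True
    while changed:                                  # unit propagation
        changed = False
        for c in clauses:
            if len(c) == 0:
                return None                         # empty clause: conflict
            if len(c) == 1:
                lit = c[0]
                assign[abs(lit)] = lit > 0
                clauses = simplify(clauses, lit)
                changed = True
                break
    if not clauses:
        return assign                               # all clauses satisfied
    var = abs(clauses[0][0])                        # branch on some variable
    for val in (True, False):                       # try x=1, then "flip" to x=0
        lit = var if val else -var
        result = dpll(simplify(clauses, lit), {**assign, var: val})
        if result is not None:
            return result
    return None

print(dpll([[1, 2], [-1, 3], [-2, -3], [3]]))
```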

  29. Key Techniques in Modern SAT Solvers
  • Clause learning (no-goods)
    • Requires a “conflict analysis” mechanism: implication graph, graph cuts
    • Motivates/necessitates efficient data structures (e.g., watched literals)
    • Enables getting rid of the traditional search tree
    • Takes “restarts” to another level: very rapid and less risky
    • Helps guide the solver in many ways
      • Conflict-directed backjumping / non-chronological backtracking
      • Conflict-directed variable selection: VSIDS
  • Lazy data structures: watched literals
    • Motivated by SAT solvers spending ~80% of their time doing unit propagation, and new clauses being added at a very rapid rate!
    • Enables very efficient propagation: allows ignoring most clauses
    • Enables “no work” upon backtracking
  • Very aggressive restarts
  • Assignment stack shrinking
  • Conflict clause minimization
  • Clause deletion (to save memory) [have to be careful about the search tree]
