This document presents heuristics for fast exact model counting in SAT and #SAT. It reviews the NP-completeness of SAT and the #P-completeness of #SAT, and covers DPLL-based counting, weighted model counting, and the compilation of Bayesian networks into weighted CNF. It then examines ways to speed up counting through component selection, variable selection, randomization, and backtracking strategies. Empirical results show significant speedups in counting models with the optimized heuristics compared to traditional methods.
Heuristics for Fast Exact Model Counting Tian Sang, Paul Beame, and Henry Kautz Computer Science & Engineering University of Washington 2005
SAT and #SAT Given a CNF formula, • SAT: find a satisfying assignment • NP-complete: SAT, graph coloring, Hamiltonian cycle, … • #SAT: count satisfying assignments • #P-complete: #SAT, Bayesian inference, computing the permanent of a matrix, … • Example: F = (x ∨ y) ∧ (y ∨ ¬z) • 5 satisfying assignments (x, y, z): (0,1,0), (0,1,1), (1,1,0), (1,1,1), (1,0,0)
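The 5-model claim for the example is easy to check by brute-force enumeration. A minimal sketch; the DIMACS-style encoding (x=1, y=2, z=3, negative integers for negated literals) is my own choice, not the slides':

```python
from itertools import product

def count_models(clauses, n_vars):
    # Enumerate all 2**n_vars assignments and count the satisfying ones.
    count = 0
    for bits in product([False, True], repeat=n_vars):
        # literal l is true iff the bit for variable |l| matches l's sign
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            count += 1
    return count

# F = (x ∨ y) ∧ (y ∨ ¬z), with x=1, y=2, z=3
F = [[1, 2], [2, -3]]
print(count_models(F, 3))  # 5
```

This encoding of clauses as lists of signed integers is reused in the sketches below.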
DPLL

DPLL(F)
  if F is empty, return 1
  if F contains an empty clause, return 0
  else choose a variable x to branch
  return DPLL(F|x=1) ∨ DPLL(F|x=0)

#DPLL

#DPLL(F)  // computes satisfying probability of F
  if F is empty, return 1
  if F contains an empty clause, return 0
  else choose a variable x to branch
  return 0.5*#DPLL(F|x=1) + 0.5*#DPLL(F|x=0)
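The #DPLL pseudocode translates almost line for line into executable code. A sketch, assuming the clause encoding from the earlier example and a deliberately naive branching rule (first variable of the first clause):

```python
def simplify(clauses, lit):
    # Set lit true: drop satisfied clauses, remove ¬lit from the rest.
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def sharp_dpll(clauses):
    # Returns the satisfying probability; multiply by 2**n for the count.
    if not clauses:
        return 1.0
    if any(len(c) == 0 for c in clauses):
        return 0.0
    x = abs(clauses[0][0])  # naive rule: first variable of first clause
    return 0.5 * sharp_dpll(simplify(clauses, x)) + \
           0.5 * sharp_dpll(simplify(clauses, -x))

F = [[1, 2], [2, -3]]        # (x ∨ y)(y ∨ ¬z) over 3 variables
print(sharp_dpll(F) * 2**3)  # 5.0 models
```

Returning a probability rather than a count is what lets the recursion stop as soon as the residual formula is empty, even if some variables are still unassigned.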
Weighted Model Counting • Each literal has a weight in [0,1] • Weight of a model = product of the weights of its literals • Weight of a formula = sum of the weights of its models

Basic Weighted Model Counting (BWMC)

BWMC(F)
  if F is empty, return 1
  if F contains an empty clause, return 0
  else choose a variable x to branch
  return BWMC(F|x=1) * weight(x) + BWMC(F|x=0) * weight(¬x)
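BWMC differs from #DPLL only in how the two branches are combined. A self-contained sketch; note that returning 1 on an empty formula implicitly assumes weight(x) + weight(¬x) = 1 for every unbranched variable, as in the Bayes-net encodings discussed next:

```python
def simplify(clauses, lit):
    # Set lit true: drop satisfied clauses, remove ¬lit from the rest.
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def bwmc(clauses, weight):
    # weight maps each literal (signed int) to a value in [0, 1].
    if not clauses:
        return 1.0   # assumes weight(x) + weight(-x) = 1 for free variables
    if any(len(c) == 0 for c in clauses):
        return 0.0
    x = abs(clauses[0][0])
    return (bwmc(simplify(clauses, x), weight) * weight[x] +
            bwmc(simplify(clauses, -x), weight) * weight[-x])

# With all weights 0.5, BWMC reduces to the satisfying probability:
w = {l: 0.5 for l in (1, -1, 2, -2, 3, -3)}
print(bwmc([[1, 2], [2, -3]], w))  # 0.625, i.e. 5 of 8 assignments
```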
Compiling Bayesian Networks to Weighted Model Counting (figure omitted: a small Bayes net with nodes A and B and conditional probability parameters P and Q, encoded as a weighted CNF)
Cachet • State-of-the-art model counting algorithm [Sang, Bacchus, Beame, Kautz, Pitassi 2004] • Built on top of zchaff SAT solver[Moskewicz et al. 2001] • Incorporates • Clause learning • Component analysis • Formula caching
Component Caching • If the formula breaks into separate components (no shared variables), we can count each separately and multiply the results: #DPLL(C1 ∧ C2) = #DPLL(C1) * #DPLL(C2) [Bayardo and Pehoushek 2000] • Caching components can yield exponential time savings [Bacchus, Dalmao, & Pitassi 2003] • Subtle interaction with clause learning • Ignore learned clauses when computing components • Flush siblings of unsat components (and their children) from the cache • [Sang, Bacchus, Beame, Kautz & Pitassi 2004]
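Component analysis itself is just connected components of the clause–variable interaction graph. A sketch using union-find over variables; the data structure choice is mine for illustration, not necessarily Cachet's:

```python
def split_components(clauses):
    # Union variables that co-occur in a clause, then group clauses by root.
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v
    for c in clauses:
        vs = [abs(l) for l in c]
        for v in vs:
            parent.setdefault(v, v)
        for v in vs[1:]:
            parent[find(v)] = find(vs[0])
    groups = {}
    for c in clauses:
        groups.setdefault(find(abs(c[0])), []).append(c)
    return list(groups.values())

# (x1 ∨ x2)(x2 ∨ x3) shares no variables with (x4 ∨ x5)
print(len(split_components([[1, 2], [2, 3], [4, 5]])))  # 2
```

Each returned group can then be counted independently and the results multiplied, exactly as the slide's identity states.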
Question • What heuristics can we use to speed up (weighted) model counting? • How well do SAT heuristics translate to #SAT?
Outline • Background • Heuristics for faster model counting • Component selection • Variable selection • Randomization • Backtracking • One-pass computation of all marginals • Comparison with Bayesian solvers
Component Selection • Is it always best to solve components one at a time? • Depth-first strategy: branch on variables within a component until it is empty • Best-first strategy: select next branching variable from any open component • Reduce priority for SAT components • Best-first strategy yields 2X speedup • Increases chance of finding an unsat component • No need to completely solve their siblings!
Variable Selection • Variable selection heuristics try to rapidly reduce problem size • Must balance • Accuracy of selection • Cost of evaluating heuristic • Empirical comparison: • Literal-count heuristics • VSIDS (zChaff heuristic) • Unit-propagation heuristics • New: VSADS
Literal-Count Heuristics • Scores based on the number of occurrences of a literal in the residual formula [Marques Silva 1999] • Dynamic Largest Individual Sum (DLIS) • Score(v) = Max(#occurrence(v), #occurrence(¬v)) • Dynamic Largest Combined Sum (DLCS) • Score(v) = #occurrence(v) + #occurrence(¬v) • Static: does not explicitly consider unit propagations
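With the clause encoding used throughout, both literal-count scores are one-liners over an occurrence table. A sketch (the example formula is mine):

```python
from collections import Counter

def dlis_score(v, counts):
    # Dynamic Largest Individual Sum: best single polarity.
    return max(counts[v], counts[-v])

def dlcs_score(v, counts):
    # Dynamic Largest Combined Sum: both polarities together.
    return counts[v] + counts[-v]

# (x1 ∨ x2)(x2 ∨ ¬x3)(¬x1 ∨ ¬x3): x3 occurs only negatively
F = [[1, 2], [2, -3], [-1, -3]]
counts = Counter(l for c in F for l in c)
print(dlis_score(3, counts), dlcs_score(3, counts))  # 2 2
```

In a real solver `counts` would be recomputed (or incrementally maintained) for the residual formula at each decision, which is the "dynamic" part of both names.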
VSIDS • Variable State Independent Decaying Sum (VSIDS) (zChaff) [Moskewicz et al. 2001] • Scores literals, rather than variables • High score determines variable and initial polarity • Initialize literal scores = # of occurrences in formula • Increment scores of literals in each newly-learned conflict clause • Literal scores decayed periodically
Unit-Propagation Based Heuristics • Exact Unit Propagation Count (EUPC) • Literals that lead to conflicts have top priority • If none, compute: Score(v) = |UP(v)|*|UP(¬v)| + |UP(v)| + |UP(¬v)| • Used in relsat [Bayardo and Schrag 1997] • Approximate Unit Propagation Count (AUPC) • Considers 2 levels of unit propagation through binary clauses [Goldberg and Novikov 2002] • #Bin(x) = number of binary clauses containing literal x • #Bin2(x) = sum of #Bin(y) over all binary clauses (x, y) • Score(v) = #Bin(v) + #Bin2(v) + #Bin(¬v) + #Bin2(¬v)
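The AUPC score depends only on the binary clauses of the residual formula. A sketch of my direct reading of the slide's definition, not relsat's or BerkMin's actual code:

```python
from collections import defaultdict

def aupc_scores(clauses):
    bins = [c for c in clauses if len(c) == 2]
    nbin = defaultdict(int)    # #Bin(x): binary clauses containing x
    for a, b in bins:
        nbin[a] += 1
        nbin[b] += 1
    nbin2 = defaultdict(int)   # #Bin2(x): sum of #Bin(y) over binary (x, y)
    for a, b in bins:
        nbin2[a] += nbin[b]
        nbin2[b] += nbin[a]
    variables = {abs(l) for c in clauses for l in c}
    return {v: nbin[v] + nbin2[v] + nbin[-v] + nbin2[-v] for v in variables}

F = [[1, 2], [-2, 3], [1, 3, 4]]
print(aupc_scores(F)[2])  # 4: x2 appears in both binary clauses
```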
New Heuristic: VSADS • Variable State Aware Decaying Sum (VSADS) • The score of a variable combines its DLCS and VSIDS scores: Score_VSADS(v) = 0.5 * Score_DLCS(v) + Max(Score_VSIDS(v), Score_VSIDS(¬v)) • Intuition: VSIDS is strong when there are many learned clauses, weak when there are few
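A sketch of the bookkeeping behind VSADS: per-literal VSIDS-style scores (initialized to occurrence counts, bumped on learned conflict clauses, periodically decayed) combined with the current DLCS count as on the slide. The decay factor, bump amount, and method names are illustrative assumptions, not zChaff's actual internals:

```python
from collections import Counter, defaultdict

class VSADSScorer:
    def __init__(self, clauses, w=0.5, decay=0.5):
        self.w = w
        self.decay = decay
        self.vsids = defaultdict(float)
        for c in clauses:
            for l in c:
                self.vsids[l] += 1.0      # init: occurrences in the formula

    def on_learned_clause(self, clause):
        for l in clause:                  # bump literals in conflict clause
            self.vsids[l] += 1.0

    def periodic_decay(self):
        for l in self.vsids:
            self.vsids[l] *= self.decay

    def score(self, v, residual_counts):
        # Score_VSADS(v) = w * DLCS(v) + max(VSIDS(v), VSIDS(¬v))
        dlcs = residual_counts[v] + residual_counts[-v]
        return self.w * dlcs + max(self.vsids[v], self.vsids[-v])

F = [[1, 2], [-1, 2]]
scorer = VSADSScorer(F)
counts = Counter(l for c in F for l in c)
print(scorer.score(2, counts))  # 0.5*2 + max(2, 0) = 3.0
```

The DLCS term keeps the heuristic informative early on, when few conflict clauses have been learned and the VSIDS scores carry little signal.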
Results: Circuit Formulas (table omitted; X = time-out of 12 hours)
Results: Logistics Formulas (table omitted; X = time-out of 12 hours)
Summary: Variable Selection • VSADS best performance overall • Most robust: best or 2nd best • Better than DLCS or VSIDS alone • More expensive unit propagation heuristics usually slightly faster, but much worse on circuit benchmarks
Randomized Restarts • Randomize decisions and restart if a run takes too long • Each decision is chosen randomly from the variables whose score is within 25% of the best • Kill the run and restart after a time limit • Good for SAT: large variance between runs • Bad for #SAT: less variance, and each run is worse • Exhaustive search is less sensitive to variable order • Random decisions make the residual formulas diverge quickly and reduce cache hits
Backtracking Strategies • Standard non-chronological (far) backtracking: • At a conflict, learn a conflict clause • Backtrack to most recent literal in the clause • Good for SAT • Prunes only unsatisfiable portions of search space • Bad for #SAT • Many solutions may have been found before conflict was reached • Non-chronological backtracking can skip over non-empty portions of the search space, which will then be re-explored from scratch • Solution: only backtrack to parents of UNSAT components
Computing All Marginals • Task: In one counting pass, • Compute number of models in which each variable is true • Equivalently: compute marginal probabilities • Approach • Each recursion computes a vector of marginals • At branch point: compute left and right vectors, combine with vector sum • Need to account for fact that some variables may appear in one branch, but be eliminated (reduced) in the other • Cache vectors, not just counts • Reasonable overhead: 10% - 40% slower than counting
All_Marginals (Sketch)

AM(F, Marginals)
  if (F == ∅) return 1
  if (∅ ∈ F) return 0
  // process the left branch
  LValue = 1/2
  select v ∈ F to branch
  LMarginals[v] = 0                        // by definition
  for each G ∈ components_of(F|v)
    LValue *= AM(G, LMarginals)            // recursion
  for each u ∈ F
    if u ∈ (F|v)
      LMarginals[u] *= LValue              // adjusting
    else
      LMarginals[u] = LValue / 2           // eliminated vars
  do the same steps for the right branch
  Marginals = sumVector(LMarginals, RMarginals)
  Marginals /= (LValue + RValue)           // normalizing
  return LValue + RValue
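A runnable, deliberately simplified version of the same computation: the recursion returns the satisfying probability together with P(v = 1 | F) for every variable, combining the two branches the way the sketch's normalization step does. It assumes the earlier clause encoding and omits component splitting and caching:

```python
def simplify(clauses, lit):
    # Set lit true: drop satisfied clauses, remove ¬lit from the rest.
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def marginals(clauses, variables):
    # Returns (prob, marg): prob = satisfying probability under uniform
    # assignments; marg[v] = fraction of models in which v is true.
    if any(len(c) == 0 for c in clauses):
        return 0.0, {v: 0.0 for v in variables}
    if not clauses:
        return 1.0, {v: 0.5 for v in variables}  # free vars: true in half
    v = abs(clauses[0][0])
    rest = [u for u in variables if u != v]
    pt, mt = marginals(simplify(clauses, v), rest)   # branch v = 1
    pf, mf = marginals(simplify(clauses, -v), rest)  # branch v = 0
    prob = 0.5 * (pt + pf)
    if prob == 0.0:
        return 0.0, {u: 0.0 for u in variables}
    marg = {v: 0.5 * pt / prob}                      # normalize the branches
    for u in rest:
        marg[u] = 0.5 * (pt * mt[u] + pf * mf[u]) / prob
    return prob, marg

# F = (x ∨ y)(y ∨ ¬z): 5 models, y true in 4 of them
prob, marg = marginals([[1, 2], [2, -3]], [1, 2, 3])
print(prob, marg[2])  # ~0.625 and ~0.8
```

The cost of carrying the marginal vector alongside the count is a dictionary merge per branch point, which matches the slide's reported 10%-40% overhead in spirit.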
Effectiveness of Heuristics • How do we evaluate the final result of Cachet 2005's improved heuristics? • It is the best model-counting program, but has few competitors • One method: compare against Bayesian inference engines • Both problems are #P-complete • Simple translation: Bayes nets → #SAT
Exact Bayesian Inference Algorithms • Comparison: • Junction tree [Shenoy & Shafer 90], [Jensen et al. 90] • Recursive conditioning [Darwiche 01] • Value elimination [Bacchus et al. 03] • Weighted model counting [Sang, Beame and Kautz 05] • Other methods: • Cutset conditioning [Dechter 90] • Variable elimination [Zhang & Poole 94]
Implementations • Junction trees (Netica) • Space/time exponential in the largest clique size • If it fits in memory, very fast; all marginals • Recursive conditioning (Samiam) • DPLL, static variable ordering, dtree-based heuristic • Sub-problem caching, any-space; all marginals • Value elimination (Valelim) • DPLL, static variable ordering • Dependency-set caching, single query node • Weighted model counting (Cachet) • DPLL, dynamic variable ordering, component caching • Clause learning; all marginals
Network Problems (figure omitted: an example network with nodes labeled S and T). A fraction of the nodes are deterministic, specified as a parameter ratio.
Results for ratio = 0.5 (table omitted): 10 problems of each size; X = memory-out or time-out
Strategic Plan Recognition • Task: • Given a planning domain described by STRIPS operators, initial and goal states, and time horizon • Infer the marginal probabilities of each action • Abstraction of strategic plan recognition: • We know enemy’s capabilities and goals, what will it do? • Modified Blackbox planning system [Kautz & Selman 1999] to translate such plans to both • Weighted model counting (CNF) • Standard Bayes net specification language
BN of a 3-step plan graph (figure omitted)
Summary • Good strategies for model counting: • Best-first component selection • VSADS variable selection • Poor strategies: • Randomization • Aggressive non-chronological backtracking • Generalizes to efficiently compute all marginals • Model counting can be useful for Bayesian inference • Best on problems with large cliques and many deterministic nodes