Generating Diverse Solutions in SAT • Alexander Nadel, Intel Israel • IBM, Haifa, Israel • October 31, 2011
Agenda • Introduction • Analysis • Polarity-based Algorithms • Variable-based Algorithms • Local Algorithms • Global Algorithms • Conclusion
DiversekSet in SAT • Given a propositional formula in CNF, generate a number of solutions that are as diverse as possible • A solution is a satisfying assignment • The threshold on the number of solutions is provided by the user
DiversekSet: Brief History • DiversekSet in CSP has been studied since 2005 • See the paper for references • DiversekSet in SAT • The first work is our FMCAD’10 paper on semi-formal FPV • Semi-formal FPV finds bugs in hardware that cannot be identified by other methods • DiversekSet is the prime reasoning engine • The problem has a number of additional applications at Intel • This work is the first full-blown paper
Algorithms for DiversekSet in SAT at a Glance • The idea: • Adapt a modern CDCL SAT solver for DiversekSet • Make minimal changes to remain efficient • Compact algorithms: • Invoke the SAT solver once to generate all the solutions • Restart after a solution is generated • This work: • Diversity is achieved by modifying polarity and variable selection heuristics
Diversification Quality as the Average Hamming Distance • Quality: the average Hamming distance between the solutions, normalized to [0…1]
Solutions (rows 1 to 4) over the variables a, b, c:
     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0
Hamming distances matrix (distance between each pair of solutions):
     2  3  4
  1  2  2  1
  2     2  1
  3        3
[Formula on the slide, annotated with the labels: Hamming distance, variables, solutions]
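The Hamming-distance definition can be sketched in a few lines of Python. This is only an illustration on the four example solutions; the function names are made up here, and the normalization (dividing the average pairwise distance by the number of variables) is an assumption chosen so that the result lies in [0…1].

```python
from itertools import combinations

# The four example solutions over the variables a, b, c.
solutions = [
    (0, 0, 0),
    (1, 1, 0),
    (0, 1, 1),
    (1, 0, 0),
]

def hamming(x, y):
    """Number of variables on which two solutions differ."""
    return sum(xi != yi for xi, yi in zip(x, y))

def quality_hamming(sols):
    """Average pairwise Hamming distance divided by the number of
    variables (assumed normalization to the range [0, 1])."""
    pairs = list(combinations(sols, 2))
    n_vars = len(sols[0])
    total = sum(hamming(x, y) for x, y in pairs)
    return total / (len(pairs) * n_vars)

# Pairwise distances in the order (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
distances = [hamming(x, y) for x, y in combinations(solutions, 2)]
print(distances)                   # [2, 2, 1, 2, 1, 3]
print(quality_hamming(solutions))  # 11 / 18
```

Under this normalization, two solutions that differ on every variable get quality 1, and identical solutions get quality 0.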
Quality via Variable Contribution • We formulate an alternative definition for the same notion of quality: • Induces quality-efficient polarity and variable selection strategies by: • Allowing one to estimate the contribution of each variable to quality online • Inducing methods to improve the quality online
Variable Quality • Variable Quality: the number of different pairs, that is {0,1} or {1,0}, amongst distinct pairs of values assigned to the variable
Example: variable quality $S_a$ for variable a
     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0
Start counting with $S_a = 0$ and go over all pairs of solutions:
  (1,2): 0 vs 1, different: update $S_a$ to 1
  (1,3): 0 vs 0, equal: do not update $S_a$
  (1,4): 0 vs 1, different: update $S_a$ to 2
  (2,3): 1 vs 0, different: update $S_a$ to 3
  (2,4): 1 vs 1, equal: do not update $S_a$
  (3,4): 0 vs 1, different: update $S_a$ to 4
Result: $S_a = 4$
A Simple Way to Calculate Variable Quality • Variable Quality: the number of different pairs, that is {0,1} or {1,0}, amongst distinct pairs of values assigned to the variable • $S_v = p_v \cdot n_v$, where $p_v$ is the number of 1’s assigned to $v$ and $n_v$ is the number of 0’s assigned to $v$ • Example: variable a takes the values 0, 1, 0, 1 in solutions 1 to 4, so $p_a = 2$, $n_a = 2$ and $S_a = 4$
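As a quick check of the shortcut, the sketch below computes the variable quality of column a both ways: by counting differing pairs directly and by multiplying the number of 1’s by the number of 0’s. The function names are illustrative only.

```python
from itertools import combinations

# Values of variable a in solutions 1..4 of the example.
column_a = [0, 1, 0, 1]

def variable_quality_by_pairs(values):
    """Count the pairs of distinct solutions that assign different values."""
    return sum(x != y for x, y in combinations(values, 2))

def variable_quality(values):
    """Shortcut: (number of 1's) times (number of 0's)."""
    ones = sum(values)
    zeros = len(values) - ones
    return ones * zeros

print(variable_quality_by_pairs(column_a))  # 4
print(variable_quality(column_a))           # 4
```

The shortcut works because every differing pair consists of exactly one solution assigning 1 and one assigning 0.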
Quality via Variable Quality • Quality: the average variable quality, normalized to [0…1]:
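The formula itself is not preserved in this transcript. A plausible reconstruction, assuming the same normalization as in the Hamming-distance definition (divide by the number of variables $|V|$ and by the number of solution pairs $\binom{m}{2}$, where $m$ is the number of solutions), would be:

```latex
q \;=\; \frac{\sum_{v \in V} S_v}{|V| \cdot \binom{m}{2}}
  \;=\; \frac{2 \sum_{v \in V} p_v \, n_v}{|V| \cdot m \, (m - 1)}
% Example with the four solutions over a, b, c:
%   S_a = 2 \cdot 2 = 4, \quad S_b = 2 \cdot 2 = 4, \quad S_c = 1 \cdot 3 = 3
%   q = (4 + 4 + 3) / (3 \cdot 6) = 11 / 18
```

The symbols $|V|$ and $q$ are notation chosen for this reconstruction; the slide may have used different ones.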
Why Are the Definitions Identical? • Both definitions of quality sum up the number of different pairs of values assigned to variables • Hamming distance-based definition: counts the number of different pairs for each pair of rows • Variable quality-based definition: counts the number of different pairs per variable, that is, column by column
     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0
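The double-counting argument can be checked mechanically. The sketch below sums the differing value pairs row pair by row pair and column by column, first on the example and then on randomly generated 0/1 matrices; the helper names and the random test sizes are arbitrary choices for illustration.

```python
import random
from itertools import combinations

def total_by_rows(sols):
    """Sum of Hamming distances over all pairs of solutions (rows)."""
    return sum(
        sum(a != b for a, b in zip(x, y))
        for x, y in combinations(sols, 2)
    )

def total_by_columns(sols):
    """Sum over variables (columns) of ones * zeros."""
    m = len(sols)
    total = 0
    for column in zip(*sols):
        ones = sum(column)
        total += ones * (m - ones)
    return total

example = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 0)]
print(total_by_rows(example), total_by_columns(example))  # 11 11

# Both totals agree on arbitrary sets of assignments as well.
rng = random.Random(0)
for _ in range(100):
    m, n = rng.randint(2, 8), rng.randint(1, 6)
    sols = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(m)]
    assert total_by_rows(sols) == total_by_columns(sols)
```

Since the two totals are equal and both definitions divide by the same normalization factor, they yield the same quality.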
Quality via Variable Quality • Quality: the average variable quality, normalized to [0…1]: • This definition induces DiversekSet algorithms that improve quality online, e.g.: • Polarity-based algorithm: • For a decision variable, pick a polarity that maximizes variable quality after the assignment • The current partial assignment is taken into account as a solution when calculating quality online • A simple way to keep track of which polarity maximizes variable quality is provided next
Variable Potential • Variable Potential: difference between the number of 1’s and 0’s assigned to a variable: $\phi_v = p_v - n_v$
Example:
            a   b   c
  1         0   0   0
  2         1   1   0
  3         0   1   1
  4         1   0   0
  $\phi_v$  0   0  -2
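A minimal sketch of the potential computation on the example follows. The function name is illustrative, and the code uses the identity that the number of 0’s equals the number of solutions minus the number of 1’s.

```python
# The four example solutions over the variables a, b, c.
solutions = [
    (0, 0, 0),
    (1, 1, 0),
    (0, 1, 1),
    (1, 0, 0),
]

def potentials(sols):
    """Potential per variable: (number of 1's) - (number of 0's)."""
    m = len(sols)
    result = []
    for column in zip(*sols):
        ones = sum(column)
        zeros = m - ones
        result.append(ones - zeros)
    return result

print(potentials(solutions))  # [0, 0, -2]
```

Variables a and b are perfectly balanced, while c has been assigned 0 more often than 1.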
Relation between Potential and Quality • The closer the variable potential is to 0, the higher the variable quality • $\phi_v = p_v - n_v = p_v - (m - p_v) = 2p_v - m$, where $m$ is the number of solutions • $S_v = p_v n_v = p_v (m - p_v) = m p_v - p_v^2$ • At $\phi_v = 0$, $S_v$ is maximal • As the absolute potential moves away from 0, the variable quality drops
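The two identities on this slide can be combined into a single expression that makes the relation explicit. The derivation below is a worked step added for illustration; it writes the potential as $\phi_v$, which is an assumed symbol in case the slide used a different one.

```latex
% From \phi_v = 2 p_v - m and n_v = m - p_v:
p_v = \frac{m + \phi_v}{2}, \qquad n_v = \frac{m - \phi_v}{2}
% Substituting into S_v = p_v n_v:
S_v = \frac{(m + \phi_v)(m - \phi_v)}{4} = \frac{m^2 - \phi_v^2}{4}
```

For a fixed number of solutions $m$, the variable quality therefore depends only on $\phi_v^2$: it is largest when $|\phi_v|$ is as small as possible and decreases as $|\phi_v|$ grows.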
pGuide: a Dedicated Compact Polarity-based Algorithm • Compactness: • The SAT solver is invoked once • The solver restarts upon each new solution • Only the polarity selection heuristic is modified • pGuide’s polarity selection heuristic: • If the potential is positive, pick 0 • If the potential is negative, pick 1 • If the potential is 0, pick a random value • Properties: • Always prefers the value that yields better quality (if such exists) • The potential is closer to 0 → variable quality improves → overall quality improves • Yields the best possible quality given a tautological formula • Can be easily proven by induction on the number of solutions
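The polarity rule above can be sketched as a small function. This is not the solver integration from the talk, only the decision rule in isolation; `pguide_polarity` and `quality_gain` are hypothetical names, and the gain calculation follows from the variable quality being the number of 1’s times the number of 0’s.

```python
import random

def pguide_polarity(potential, rng=random):
    """pGuide rule: move the potential towards 0; break ties randomly."""
    if potential > 0:
        return 0
    if potential < 0:
        return 1
    return rng.randint(0, 1)

def quality_gain(ones, zeros, value):
    """Increase in variable quality (ones * zeros) when one more
    solution assigns `value` to the variable."""
    if value == 1:
        return (ones + 1) * zeros - ones * zeros  # equals zeros
    return ones * (zeros + 1) - ones * zeros      # equals ones

# A variable assigned 1 three times and 0 once has potential +2,
# so pGuide picks 0, which gains 3 instead of 1.
choice = pguide_polarity(3 - 1)
print(choice, quality_gain(3, 1, choice), quality_gain(3, 1, 1 - choice))  # 0 3 1
```

Picking 1 adds as many differing pairs as there are 0’s so far, and picking 0 adds as many as there are 1’s, so choosing the minority value never yields less quality than the alternative.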
pRand: a Randomized Compact Polarity-based Algorithm • Chooses the polarity randomly • Unlike pGuide, might choose a polarity which yields worse quality than the other polarity
pGuide vs. pRand on Tautological Formulas • pGuide picks values $r_0, r_0', r_1, r_1', r_2, r_2', \ldots$ for every variable, where each $r_i$ is a random value and $r_i'$ is its complement • pRand picks random values • For 2 solutions, the quality is: • 1 for pGuide: {0, 1} or {1, 0} • 0.5 for pRand: {0, 0} or {0, 1} or {1, 0} or {1, 1} with equal probability
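pGuide vs. pRand on Tautological Formulas • [Quality formulas for pGuide (one for even m, one for odd m) and for pRand; the formulas are not recoverable from this transcript]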
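Since the slide's formulas are missing here, the sketch below simulates both heuristics on a tautological formula (every variable is free, so a solution is just the list of chosen polarities) and compares pGuide against a closed form. The closed form is derived here, not copied from the slide: with pGuide each variable ends up with floor(m/2) values of one kind and ceil(m/2) of the other, and the quality is assumed to be normalized by the number of variables times the number of solution pairs. All function names are illustrative.

```python
import random
from fractions import Fraction

def quality(sols):
    """Sum of ones * zeros per variable, normalized by |V| * C(m, 2)."""
    m, n_vars = len(sols), len(sols[0])
    total = sum(sum(col) * (m - sum(col)) for col in zip(*sols))
    return Fraction(total, n_vars * (m * (m - 1) // 2))

def generate(m, n_vars, guided, rng):
    """Generate m solutions of a tautology with pGuide or pRand polarities."""
    potentials = [0] * n_vars
    sols = []
    for _ in range(m):
        sol = []
        for v in range(n_vars):
            if guided and potentials[v] != 0:
                value = 0 if potentials[v] > 0 else 1  # pGuide rule
            else:
                value = rng.randint(0, 1)              # tie or pRand
            sol.append(value)
        # Update the potentials only once the solution is complete.
        for v, value in enumerate(sol):
            potentials[v] += 1 if value else -1
        sols.append(tuple(sol))
    return sols

def pguide_closed_form(m):
    """Derived here: m / (2(m-1)) for even m, (m+1) / (2m) for odd m."""
    if m % 2 == 0:
        return Fraction(m, 2 * (m - 1))
    return Fraction(m + 1, 2 * m)

rng = random.Random(0)
for m in range(2, 8):
    simulated = quality(generate(m, 5, True, rng))
    print(m, simulated, pguide_closed_form(m))
```

For pRand, each pair of values differs with probability 1/2, so the expected quality is 0.5 for any number of solutions; the pGuide values start at 1 for m = 2 and approach 0.5 from above as m grows.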
Our Experimental Setup • 66 instances from semi-formal FPV • Stats: • Variables: 213,047 to 910,868 • Clauses: 738,862 to 3,251,382 • All the instances are available by email • Machines: • Intel Xeon • 4 GHz CPU frequency • 32 GB memory
Quality Comparison between pGuide and pRand • pGuide is clearly preferable to pRand, especially for a small number of solutions • There is a resemblance between the quality function on tautological and real-life formulas • The quality for real-life formulas is significantly lower • How can one improve the quality on real-life formulas?
pBCPGuide: a BCP-aware Compact Polarity-based Algorithm • The idea: take constraints into account by considering the impact of Boolean Constraint Propagation (BCP) • For a newly selected decision variable: • For each polarity in {0,1}: (1) Pick the polarity (2) Propagate with BCP (3) Write down the quality Q (4) Undo • Pick the polarity with the larger Q • For step (3), it is sufficient to measure the delta in the absolute value of the variable potentials: the lower the delta, the better the quality
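The look-ahead can be illustrated with a toy unit-propagation routine. This is a simplified sketch, not the solver implementation from the talk: clauses are lists of signed integers (DIMACS-style literals), assignments are dictionaries, "undo" is done by working on a copy, and skipping a polarity that leads to a conflict is an assumption the slide does not cover. All names are hypothetical.

```python
def bcp(clauses, assignment):
    """Naive unit propagation. Returns the extended assignment,
    or None if some clause becomes falsified."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None  # conflict
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment

def potential_delta(potentials, before, after):
    """Change in the sum of absolute potentials caused by the newly
    assigned variables (lower is better)."""
    delta = 0
    for var, value in after.items():
        if var not in before:
            phi = potentials.get(var, 0)
            delta += abs(phi + (1 if value else -1)) - abs(phi)
    return delta

def pbcpguide_polarity(clauses, assignment, potentials, var):
    """Try both polarities with BCP and keep the one with the lower delta."""
    best_value, best_delta = None, None
    for value in (False, True):
        trial = bcp(clauses, {**assignment, var: value})
        if trial is None:
            continue  # assumption: ignore a polarity that conflicts
        delta = potential_delta(potentials, assignment, trial)
        if best_delta is None or delta < best_delta:
            best_value, best_delta = value, delta
    return best_value

# x1 implies x2, x3 and x4. Looking at x1 alone (potential +1), pGuide would
# pick 0, but x1 = 1 also pulls three negative potentials towards 0.
clauses = [[-1, 2], [-1, 3], [-1, 4]]
potentials = {1: 1, 2: -3, 3: -3, 4: -2}
print(pbcpguide_polarity(clauses, {}, potentials, 1))  # True
```

In the example, assigning 0 to x1 lowers the sum of absolute potentials by 1, while assigning 1 lowers it by 2 once the propagated values of x2, x3 and x4 are counted, so the BCP-aware rule overrides the plain pGuide choice.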
pBCPGuide: More • Plain pBCPGuide is too costly in terms of performance • Performs BCP 2 or 3 times per decision • Optimizations: • Continue with the second polarity if it yields better quality (instead of undoing) to save a BCP • The first polarity should be the inferior one in terms of the impact on variable quality of the decision variable • To increase the chances that the second polarity yields better quality, in which case only 2 BCPs are required • pBCPGuide_T • Use pBCPGuide until T conflicts are encountered from the moment when either: • The search is started • A new solution is discovered • Switch to pGuide after the threshold is reached
Polarity-based Algorithms: Empirical Comparison Summary • pGuide is preferable to pRand in terms of both quality and run-time • pBCPGuide_T is preferable to pGuide in terms of quality, but pays a price in terms of performance • T regulates the trade-off between quality and run-time • pBCPGuide_100 seems to achieve an attractive balance between run-time and quality