The Promise of LP to Boost CSP Techniques for Combinatorial Problems


**The Promise of LP to Boost CSP Techniques for Combinatorial Problems**
Carla P. Gomes (gomes@cs.cornell.edu), David Shmoys (shmoys@cs.cornell.edu)
Department of Computer Science; School of Operations Research and Industrial Engineering
Cornell University
CP-AI-OR 2002

**Motivation**
• Increasing interest in combining Constraint Satisfaction Problem (CSP) formulations and Linear Programming (LP) based techniques for solving hard computational problems.
• Successful results for solving problems that are a mixture of linear constraints (where LP excels) and combinatorial constraints (where CSP excels).
• However, it is surprisingly difficult to integrate LP and CSP based techniques successfully in a purely combinatorial setting. Example: Satisfiability.

**Power of Randomization**
• Randomization is magic, and we have some intuitions why it works.

**Outline of Talk**
• A purely combinatorial problem domain
• Problem formulations
  • CSP formulation
  • LP formulations
    • Assignment formulation
    • Packing formulation
• Randomization
  • Heavy-tailed behavior in combinatorial search
• Approximation Algorithms for QCP
• A Hybrid Complete CSP/LP Randomized Rounding Backtrack Search Approach
• Empirical Results
• Conclusions

**Quasigroup or Latin Square (Order 4)**
• A Quasigroup or Latin Square is an n-by-n matrix such that each row and column is a permutation of the same n colors.
• The Quasigroup or Latin Square Completion Problem (QCP). [Figure: example instance with 68% holes.]
• Quasigroups or Latin Squares: An Abstraction for Real World Applications (Gomes and Selman 97)

**Complexity of Latin Square Completion**
[Figure: complexity of Latin square completion, showing a critically constrained area flanked by two easy areas; sample times 150, 1820, 165.]
• QCP is NP-complete.
• Better characterization beyond worst case?

**QCP as a CSP**
• Variables: one per cell, taking a color as its value.
• Constraints: all cells in a row take different colors, and all cells in a column take different colors.
• Pure CSP approaches solve QCP instances up to order 33 relatively well.
• Higher orders (e.g., in the critically constrained area) are beyond the reach of CSP solvers.

**Assignment Formulation**
[Figure: cubic representation of QCP, with axes for rows, columns, and colors.]

**QCP Assignment Formulation**
• Objective: maximize the number of colored cells.
• Constraints: at most one colored cell along each row/color line, each column/color line, and each row/column line.

**Packing formulation**
[Figure: families of patterns (partial patterns are not shown).]
• Objective: maximize the number of colored cells in the selected patterns,
• subject to: one pattern per family; a cell is covered by at most one pattern.

**QCP Packing Formulation**
• Objective: maximize the number of colored cells.
• Constraints: one pattern per color; at most one pattern covering each cell.

**Any feasible solution to the packing LP relaxation is also a solution to the assignment LP relaxation.**
• The value of the assignment relaxation is at least the bound implied by the packing formulation, so the packing formulation provides a tighter upper bound than the assignment formulation.
• Limitation: the size of the formulation is exponential in n (one may apply column generation techniques).

**Background**
• Stochastic strategies have been very successful in the area of local search:
  • Simulated annealing
  • Genetic algorithms
  • Tabu search
  • Walksat and variants
• Limitation: inherent incomplete nature of local search methods.

**Randomized backtrack search**
• Randomized variable and/or value selection can be done in lots of different ways.
• Example: randomly breaking ties in variable and/or value selection.
• Compare with standard lexicographic tie-breaking.
• Note: No problem maintaining the completeness of the algorithm!

**Empirical Evidence of Heavy-Tails**
[Table: run times 7, 11, 30, and three runs marked (*); (*) = no solution found, reached cutoff of 2000.]
[Figure: "Erratic Behavior of Mean", sample mean vs. number of runs, for an easy instance with 15% preassigned cells; median = 1! (Gomes et al. 97)]

**Decay of Distributions**
• Standard distributions (finite mean and variance): exponential decay, e.g. Normal.
• Heavy-tailed distributions: power-law decay, e.g. Pareto-Levy (infinite variance, infinite mean).

**Exploiting Heavy-Tailed Behavior**
[Figure: unsolved fraction 1-F(x) vs. number of backtracks (log scale); annotations: 70% unsolved; 0.001% unsolved; 250 (62 restarts).]
• Heavy-tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc.
• Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance.
• Restarts eliminate heavy-tailed behavior.

**Randomized backtrack search**
• Active research area; very effective when combined with no-good learning!
• Solved open problems.
• Different variants of randomization/restarts, e.g., biased probability function for variable/value selection, “jumping” to different points in the search tree.
• State-of-the-art SAT solvers incorporate randomized restarts: Chaff, Relsat, Grasp, Goldberg’s Solver, Quest, SatZ, SATO, …
• Used to verify 1/7 of an Alpha chip (Pentium IV).

**Randomized Rounding**
• Solve a relaxation of the combinatorial problem;
• Use randomization to go from the relaxed version to the original problem.

**Randomized Rounding of a 0-1 Integer Programming**
• Solve the LP relaxation;
• Interpret the resulting fractional solution as providing the probability distribution over which to set the variables to 1.
• Note: The resulting solution is not guaranteed to be feasible. Nevertheless, this gives good intuition of why randomized rounding is a powerful tool.

**Approximation Algorithm**
• Assumption: maximization problem.
• $A(I)$: the value of the objective function delivered by algorithm $A$ for input instance $I$.
• $OPT(I)$: the optimal value of the objective function for input instance $I$.
• The performance ratio of an algorithm $A$ is the infimum (supremum, for min) over all $I$ of the ratio $A(I)/OPT(I)$.
• $A$ is an $\alpha$-approximation algorithm if it has performance ratio at least (at most, for min) $\alpha$.

**Approximation Algorithm**
• For randomized algorithms we replace $A(I)$ by $E[A(I)]$ in the definition of performance ratio.
• (The expectation is taken over the random choices performed by the algorithm.)
• Note: the only randomness in the performance guarantee stems from the randomization of the algorithm itself, and not from any probabilistic assumptions on the instance.
• In general, the term approximation algorithm will denote a polynomial-time algorithm.

**Approximations Based on Assignment Formulation**
• Kumar et al. 99
• Algorithm 1: at each iteration, the algorithm solves the LP relaxation and sets to 1 the variable closest to 1. This is a 1/3-approximation algorithm.
• Algorithm 2: at each iteration, the algorithm selects a compatible matching for a color, for which the LP relaxation places the greatest total weight. This is a 1/2-approximation algorithm.
• Experimental evaluation: problems up to order 9.

**Approximation Based on Packing Formulation**
• Randomization scheme: for each color $k$, choose a pattern $M$ with probability $y^*_{k,M}$, its value in the optimal LP solution (so that some matching is selected for each color).
• As a result we have a pattern per color.
• Problem: some patterns may overlap, even though in expectation, the constraints imply that the number of matchings in which a cell is involved is 1.

**Packing formulation**
[Figure: families of patterns annotated with fractional LP values 1, 0.8, 1, 1, 0.2.]
• Objective: maximize the number of colored cells in the selected patterns,
• subject to: one pattern per family; a cell is covered by at most one pattern.

**(1-1/e)-Approximation Based on Packing Formulation**
• Let’s assume that the partial Latin square (PLS) is completable, so $Z^* = h$, where $h$ is the number of holes.
• What is the expected number of cells left uncolored by our randomized procedure due to overlapping conflicts?
• From $Z^* = h$ and the packing constraints we can compute, for each hole $(i,j)$, $\sum_{k=1}^{n} p_k = 1$, where $p_k = \sum_{M \ni (i,j)} y^*_{k,M}$ is the probability that the pattern chosen for color $k$ covers $(i,j)$ and $y^*_{k,M}$ is the LP value of pattern $M$ for color $k$.
• So, the desired probability corresponds to the probability of a cell not being colored with any color, i.e.: $\prod_{k=1}^{n} (1 - p_k)$.

**(1-1/e)-Approximation Based on Packing Formulation**
• This expression is maximized when all the $p_k$ are equal, therefore: $\prod_{k=1}^{n} (1 - p_k) \le (1 - 1/n)^n \le 1/e$.
• So the expected number of uncolored cells is at most $h/e$; at least $(1 - 1/e)\,h$ holes are expected to be filled by this technique.

**How do we put all the pieces together?**
• CSP Model
• LP Model + LP Randomized Rounding
• Heavy-tails
• We want to maintain completeness
A HYBRID COMPLETE CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH

**HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH**
• Central features of the algorithm:
  • Complete backtrack search algorithm
  • It maintains two formulations:
    • CSP model
    • Relaxed LP model
  • LP randomized rounding for setting values at the top of the tree
  • CSP + LP inference

**HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH**
• Populate CSP Model
• Perform propagation
• Populate LP solver
• Solve LP
[Figure: flow of the search. Variable setting controlled by LP Randomized Rounding, with CSP & LP Inference (parameters %LP, Interleave-LP); then Search & Inference controlled by CSP; Adaptive CUTOFF.]

**HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH**
• Initialize the CSP model and perform propagation of constraints (Ilog Solver);
• Solve the LP model (Ilog Cplex Barrier).
• The LP provides good heuristic guidance and pruning information for the search. However, solving the LP is relatively expensive.
• Two parameters control the LP effort:
  • %LP: controls the percentage of variables set based on the LP rounding (%LP=0 is the pure CSP strategy).
  • Interleave-LP: sets the frequency with which we re-solve the LP.
• Randomized rounding scheme: rank variables according to their LP value. Select the highest ranked variable and set its value to 1 with probability p given by its LP value. With probability (1-p), randomly select a color from the colors allowed in the CSP model.
• Perform CSP propagation after each variable setting.
(A total of Interleave-LP variables are assigned this way without re-solving the LP.)
• Use a cutoff value to restart the search (keep increasing it to maintain completeness).

**Performance**
• With the hybrid strategy we also solve instances of order 40 in the critically constrained area, which is out of reach for pure CSP;
• We even solved a few balanced instances of order 50 in the critically constrained area!
• More systematic experimentation is required to better understand the limitations and strengths of the approach.

**Conclusions**
• Approximations based on LP randomized rounding (variable/value setting) + CSP propagation: very powerful.
• Combating heavy-tails of backtrack search through randomization: very effective.
• Consequence: new ways of designing algorithms. Aim for strategies which have highly asymmetric distributions that can be exploited using restarts, portfolios of algorithms, and interleaved/parallel runs.
• The general approach holds promise for a range of combinatorial problems.
Final TAKE HOME MESSAGE: Randomization does not imply incomplete search!

**Demos, papers, etc.**
• www.cs.cornell.edu/gomes
• www.orie.cornell.edu/~shmoys
• Check also: www.cis.cornell.edu/iisi

**Eighth International Conference on the Principles and Practice of Constraint Programming**
• September 7-13
• Cornell, Ithaca NY
• CP 2002
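
The restart idea from the hybrid search slides (randomized tie-breaking, a backtrack cutoff, and a cutoff that keeps increasing so the search stays complete) can be illustrated with a minimal sketch. This is not the authors' hybrid solver: it has no LP component and no Ilog Solver or Cplex calls, only a plain randomized backtrack search for Latin square completion. All names (`allowed`, `search`, `solve_with_restarts`, `CutoffReached`) are hypothetical, and doubling the cutoff is an assumption; the slides only say to keep increasing it.

```python
import random


class CutoffReached(Exception):
    """Raised when a run exceeds its backtrack budget."""


def allowed(grid, r, c):
    """Colors not yet used in row r or column c (0 marks a hole)."""
    n = len(grid)
    used = set(grid[r]) | {grid[i][c] for i in range(n)}
    return [k for k in range(1, n + 1) if k not in used]


def search(grid, budget, rng):
    """Complete grid in place; return False if no completion exists.

    budget is a one-element list holding the remaining backtracks.
    """
    n = len(grid)
    holes = [(r, c) for r in range(n) for c in range(n) if grid[r][c] == 0]
    if not holes:
        return True
    # Pick a hole with the fewest allowed colors, breaking ties at random.
    fewest = min(len(allowed(grid, r, c)) for r, c in holes)
    r, c = rng.choice([h for h in holes if len(allowed(grid, *h)) == fewest])
    colors = allowed(grid, r, c)
    rng.shuffle(colors)  # randomized value selection
    for k in colors:
        grid[r][c] = k
        if search(grid, budget, rng):
            return True
        grid[r][c] = 0  # undo and count one backtrack
        budget[0] -= 1
        if budget[0] < 0:
            raise CutoffReached
    return False


def solve_with_restarts(grid, cutoff=10, growth=2, seed=0):
    """Restart on cutoff; the growing cutoff keeps the search complete."""
    rng = random.Random(seed)
    while True:
        work = [row[:] for row in grid]  # fresh copy for every run
        try:
            return work if search(work, [cutoff], rng) else None
        except CutoffReached:
            cutoff *= growth  # assumed schedule: geometric growth


if __name__ == "__main__":
    puzzle = [[1, 0, 0, 4],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [4, 0, 0, 1]]
    for row in solve_with_restarts(puzzle):
        print(row)
```

Because the cutoff grows without bound, some run eventually either finds a completion or exhausts the search tree, so the procedure returns `None` only when no completion exists. A fixed cutoff would be simpler but would give up completeness.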