The Promise of LP to Boost CSP Techniques for Combinatorial Problems

Carla P. Gomes (gomes@cs.cornell.edu) • David Shmoys (shmoys@cs.cornell.edu) • Department of Computer Science and School of Operations Research and Industrial Engineering, Cornell University • CP-AI-OR 2002

Motivation • Increasing interest in combining Constraint Satisfaction Problem (CSP) formulations and Linear Programming (LP) based techniques for solving hard computational problems. • Successful results for solving problems that are a mixture of linear constraints, where LP excels, and combinatorial constraints, where CSP excels. • However, it is surprisingly difficult to integrate LP and CSP based techniques successfully in a purely combinatorial setting. Example: Satisfiability.

Power of Randomization • Randomization is magic: we have some intuitions about why it works.

Outline of Talk • A purely combinatorial problem domain • Problem formulations • CSP formulation • LP formulations • Assignment formulation • Packing Formulation • Randomization • Heavy-tailed behavior in combinatorial search • Approximation Algorithms for QCP • A Hybrid Complete CSP/LP Randomized Rounding Backtrack Search Approach • Empirical Results • Conclusions

A purely combinatorial problem domain

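Quasigroup or Latin Square (Order 4) • A Quasigroup or Latin Square is an n-by-n matrix such that each row and each column is a permutation of the same n colors. • The Quasigroup or Latin Square Completion Problem (QCP): given a partially colored n-by-n matrix, decide whether the empty cells (holes) can be colored so that the result is a Latin square. [Figure: example instance with 68% holes.] • Quasigroups or Latin Squares: An Abstraction for Real World Applications (Gomes and Selman 97)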

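Complexity of Latin Square Completion • QCP is NP-Complete. • Better characterization beyond worst case? [Figure: complexity of completion versus percentage of preassigned cells, with a critically constrained area between two easy areas; labels in the figure: Time 150, 1820, 165; percentages 20%, 35%, 42%, 50%.]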

Problem Formulations

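QCP as a CSP • Variables: $x_{ij} \in \{1, \dots, n\}$, the color of cell $(i, j)$, for $i, j = 1, \dots, n$; preassigned cells have their color fixed. • Constraints: $\mathrm{alldiff}(x_{i1}, \dots, x_{in})$ for each row $i$, and $\mathrm{alldiff}(x_{1j}, \dots, x_{nj})$ for each column $j$.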
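The CSP view can be illustrated with a small sketch. The representation below (a list of lists with `None` for holes, colors numbered from 0) and the helper names are assumptions made for illustration, not part of the talk.

```python
def is_valid_partial(square):
    """True if no color repeats in any row or column (None marks a hole)."""
    n = len(square)
    lines = [list(row) for row in square]
    lines += [[square[i][j] for i in range(n)] for j in range(n)]
    for line in lines:
        colors = [c for c in line if c is not None]
        if len(colors) != len(set(colors)):
            return False
    return True


def domains(square):
    """Colors still allowed in each hole under the row/column all-different constraints."""
    n = len(square)
    result = {}
    for i in range(n):
        for j in range(n):
            if square[i][j] is None:
                used = {square[i][c] for c in range(n)} | {square[r][j] for r in range(n)}
                result[(i, j)] = set(range(n)) - used
    return result


# An order-4 instance with 8 holes (hypothetical example data).
partial = [
    [0, None, None, 3],
    [None, 0, 3, None],
    [None, 3, 0, None],
    [3, None, None, 0],
]
```

Each hole is a CSP variable, and `domains(partial)` gives its current domain after removing the colors already used in its row and column.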

• Pure CSP approaches solve QCP instances up to order 33 relatively well. • Higher orders (e.g., in the critically constrained area) are beyond the reach of CSP solvers.

LP Formulations

Assignment Formulation • Cubic representation of QCP: the three axes are rows, columns, and colors.

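QCP Assignment Formulation
• Variables: $x_{ijk} \in \{0, 1\}$, with $x_{ijk} = 1$ if cell $(i, j)$ has color $k$; $x_{ijk} = 1$ is fixed for preassigned cells.
• Max number of colored cells: $\max \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} x_{ijk}$
• Row/color line: $\sum_{j=1}^{n} x_{ijk} \le 1$ for all $i, k$
• Column/color line: $\sum_{i=1}^{n} x_{ijk} \le 1$ for all $j, k$
• Row/column line: $\sum_{k=1}^{n} x_{ijk} \le 1$ for all $i, j$
• LP relaxation: replace $x_{ijk} \in \{0, 1\}$ by $0 \le x_{ijk} \le 1$.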
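As a minimal sketch of what the three families of line constraints require, the check below tests a candidate fractional solution against the relaxed assignment formulation. The function names and the nested-list layout `x[i][j][k]` are illustrative assumptions; no LP solver is involved.

```python
from fractions import Fraction
from itertools import product


def satisfies_assignment_relaxation(x, preassigned):
    """Check x[i][j][k] against the relaxed assignment constraints.

    preassigned maps a cell (i, j) to its fixed color k.
    """
    n = len(x)
    idx = range(n)
    if any(not 0 <= x[i][j][k] <= 1 for i, j, k in product(idx, idx, idx)):
        return False
    for a, b in product(idx, idx):
        if sum(x[a][b][k] for k in idx) > 1:  # row/column line: cell (a, b)
            return False
        if sum(x[a][j][b] for j in idx) > 1:  # row/color line: row a, color b
            return False
        if sum(x[i][a][b] for i in idx) > 1:  # column/color line: column a, color b
            return False
    return all(x[i][j][k] == 1 for (i, j), k in preassigned.items())


def objective(x):
    """Total (fractional) number of colored cells."""
    return sum(v for plane in x for line in plane for v in line)


# With no preassigned cells, giving every color weight 1/n in every cell is feasible.
n = 4
uniform = [[[Fraction(1, n)] * n for _ in range(n)] for _ in range(n)]
```

The uniform point has objective value n², which shows how a fractional solution can "color" every cell without committing to any color.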

Packing formulation • Families of patterns, one family per color (partial patterns are not shown). • Max number of colored cells in the selected patterns, s.t. one pattern is selected per family and each cell is covered by at most one pattern.

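QCP Packing Formulation
• Variables: $y_{kM} \in \{0, 1\}$ for each color $k$ and each pattern $M \in \mathcal{M}_k$, the family of patterns for color $k$.
• Max number of colored cells: $\max \sum_{k=1}^{n} \sum_{M \in \mathcal{M}_k} |M| \, y_{kM}$
• One pattern per color: $\sum_{M \in \mathcal{M}_k} y_{kM} = 1$ for all $k$
• At most one pattern covering each cell: $\sum_{k=1}^{n} \sum_{M \in \mathcal{M}_k : (i,j) \in M} y_{kM} \le 1$ for all $i, j$
• LP relaxation: replace $y_{kM} \in \{0, 1\}$ by $0 \le y_{kM} \le 1$.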

• Any feasible solution to the packing LP relaxation is also a solution to the assignment LP relaxation. • The value of the assignment relaxation is at least the bound implied by the packing formulation => the packing formulation provides a tighter upper bound than the assignment formulation. • Limitation: the size of the formulation is exponential in n (one may apply column generation techniques).
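The first claim can be made concrete with a small sketch: summing, for each cell and color, the weights of the patterns of that color covering the cell turns a packing solution into an assignment solution with the same objective value. The data layout (`y[color]` as a list of `(pattern, weight)` pairs, patterns as sets of cells) is an assumption made for illustration.

```python
from fractions import Fraction


def packing_to_assignment(y, n):
    """Map packing weights y[color] = [(pattern, weight), ...] to x[i][j][color]."""
    x = [[[Fraction(0)] * n for _ in range(n)] for _ in range(n)]
    for color, options in y.items():
        for pattern, weight in options:
            for (i, j) in pattern:
                x[i][j][color] += weight
    return x


# Order-2 example: the two diagonals are the only full patterns for each color.
diag = frozenset({(0, 0), (1, 1)})
anti = frozenset({(0, 1), (1, 0)})
half = Fraction(1, 2)
y = {0: [(diag, half), (anti, half)], 1: [(diag, half), (anti, half)]}
x = packing_to_assignment(y, 2)
```

In this example every entry of `x` equals 1/2, each cell has total weight 1, and both formulations report 4 colored cells.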

Randomization

Background • Stochastic strategies have been very successful in the area of local search. • Simulated annealing • Genetic algorithms • Tabu Search • Walksat and variants. • Limitation: inherent incomplete nature of local search methods.

Randomized backtrack search • Randomized variable and/or value selection – lots of different ways. • Example: randomly breaking ties in variable and/or value selection. • Compare with standard lexicographic tie-breaking. • Note: No problem maintaining the completeness of the algorithm!
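A minimal sketch of randomized tie-breaking in a complete backtrack search for QCP follows. The smallest-domain variable heuristic, the grid representation (`None` for holes), and all names are assumptions chosen for illustration; the talk does not specify them.

```python
import random


class CutoffReached(Exception):
    """Raised when a run exceeds its backtrack budget."""


def allowed_colors(grid, i, j):
    n = len(grid)
    used = {grid[i][c] for c in range(n)} | {grid[r][j] for r in range(n)}
    return [k for k in range(n) if k not in used]


def randomized_backtrack(square, rng, cutoff=None):
    """Return (completed square or None, number of backtracks).

    None means either no completion exists or the cutoff was reached.
    """
    n = len(square)
    grid = [row[:] for row in square]
    backtracks = 0

    def search():
        nonlocal backtracks
        holes = [(i, j) for i in range(n) for j in range(n) if grid[i][j] is None]
        if not holes:
            return True
        sizes = {h: len(allowed_colors(grid, *h)) for h in holes}
        smallest = min(sizes.values())
        # Variable selection: break ties among smallest domains at random.
        i, j = rng.choice([h for h in holes if sizes[h] == smallest])
        values = allowed_colors(grid, i, j)
        rng.shuffle(values)  # Value selection: random order, but all values are tried.
        for k in values:
            grid[i][j] = k
            if search():
                return True
            grid[i][j] = None
            backtracks += 1
            if cutoff is not None and backtracks > cutoff:
                raise CutoffReached
        return False

    try:
        return (grid if search() else None), backtracks
    except CutoffReached:
        return None, backtracks
```

Because every value of the chosen variable is still tried, only in a random order, the search stays complete when no cutoff is set.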

Empirical Evidence of Heavy-Tails (Gomes et al. 97) • Easy instance: 15% preassigned cells. • Erratic behavior of the sample mean as the number of runs grows. • Median = 1! • (*) no solution found, reached cutoff: 2000. [Figure: sample mean versus number of runs.]

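Decay of Distributions • Standard distribution (finite mean and variance): exponential decay, e.g. standard Normal: $\Pr[X > x] \le C e^{-x^2/2}$ for some $C > 0$ and all $x \ge 0$. • Heavy-tailed distribution: power law decay, e.g. Pareto-Levy: $\Pr[X > x] = C x^{-\alpha}$, $x > 0$; infinite variance for $\alpha < 2$, infinite mean for $\alpha \le 1$. [Figure: power law decay versus exponential decay.]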
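To get a feel for the difference between the two kinds of decay, the sketch below compares the tail of a standard normal with a Pareto tail. The parameter choices (`alpha = 1`, scale `x_min = 1`) are assumptions picked only for illustration.

```python
import math


def normal_tail(x):
    """Pr[X > x] for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def pareto_tail(x, alpha=1.0, x_min=1.0):
    """Pr[X > x] for a Pareto distribution with index alpha and scale x_min."""
    return 1.0 if x < x_min else (x_min / x) ** alpha


for x in (2, 5, 10, 20):
    print(f"x={x:>2}  normal={normal_tail(x):.3e}  pareto={pareto_tail(x):.3e}")
```

The normal tail becomes negligible within a few units, while the power law tail shrinks only in proportion to 1/x, which is why very long runs remain likely under heavy-tailed behavior.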

Exploiting Heavy-Tailed Behavior • Heavy-tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc. • Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance. • Restarts eliminate heavy-tailed behavior. [Figure: unsolved fraction 1-F(x) versus number of backtracks (log scale); annotations: 70% unsolved, 0.001% unsolved, 250 (62 restarts).]
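A restart strategy can be sketched as a thin wrapper around any randomized run. In the sketch below `toy_run` is a stand-in, not a real solver: it assumes run costs follow a Pareto distribution, and the cutoff schedule (geometric growth) is likewise an illustrative assumption.

```python
import random


def toy_run(cutoff, rng, alpha=0.8):
    """Stand-in for one randomized search run with a heavy-tailed cost.

    Returns the cost if the run finishes within the cutoff, else None.
    """
    cost = rng.paretovariate(alpha)
    return cost if cost <= cutoff else None


def solve_with_restarts(run_once, initial_cutoff=4.0, growth=1.5, seed=0):
    """Rerun run_once with a growing cutoff until it returns a result."""
    rng = random.Random(seed)
    cutoff, restarts = initial_cutoff, 0
    while True:
        result = run_once(cutoff, rng)
        if result is not None:
            return result, restarts, cutoff
        restarts += 1
        cutoff *= growth  # A growing cutoff keeps the overall procedure complete.


result, restarts, final_cutoff = solve_with_restarts(toy_run)
```

Any complete randomized search that accepts a cutoff can be passed in place of `toy_run`; growing the cutoff means a run is eventually allowed to finish.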

Randomized backtrack search • Active research area; very effective when combined with no-good learning! • Solved open problems. • Different variants of randomization/restarts, e.g., biased probability function for variable/value selection, “jumping” to different points in the search tree. • State-of-the-art SAT solvers incorporate randomized restarts: Chaff, Relsat, Grasp, Goldberg’s Solver, Quest, SatZ, SATO, … • Used to verify 1/7 of an Alpha chip (Pentium IV).

Randomized Rounding

Randomized Rounding • Solve a relaxation of the combinatorial problem; • Use randomization to go from the relaxed version to the original problem.

Randomized Rounding of a 0-1 Integer Program • Solve the LP relaxation; • Interpret the resulting fractional solution as a probability distribution according to which the variables are set to 1. • Note: The resulting solution is not guaranteed to be feasible. Nevertheless, this gives good intuition for why randomized rounding is a powerful tool.
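The rounding step itself is short. The sketch below assumes the fractional LP solution is already available as a list of values in [0, 1] and represents each constraint as a hypothetical `(coefficients, bound)` pair meaning "weighted sum at most bound".

```python
import random


def randomized_round(fractional, rng):
    """Set each variable to 1 independently with probability equal to its LP value."""
    return [1 if rng.random() < p else 0 for p in fractional]


def violated(x, constraints):
    """Return the constraints (coeffs, bound) with sum(coeffs * x) > bound."""
    return [c for c in constraints if sum(a * v for a, v in zip(c[0], x)) > c[1]]


rng = random.Random(0)
fractional = [0.5, 0.5, 1.0]
constraints = [([1, 1, 0], 1)]  # x0 + x1 <= 1
rounded = randomized_round(fractional, rng)
print(rounded, violated(rounded, constraints))
```

Here `x0 + x1 <= 1` holds for the fractional point but can fail after rounding when both variables come up 1, which is the infeasibility the note on the slide warns about.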

LP Based Approximations

Approximation Algorithm • Assumption: Maximization problem • $A(I)$: the value of the objective function delivered by algorithm A for input instance I. • $\mathrm{OPT}(I)$: the optimal value of the objective function for input instance I. • The performance ratio of an algorithm A is the infimum (supremum, for min) over all I of the ratio $A(I) / \mathrm{OPT}(I)$. • A is an $\alpha$-approximation algorithm if it has performance ratio at least (at most, for min) $\alpha$.

Approximation Algorithm • For randomized algorithms we replace the value delivered by the algorithm, $A(I)$, by its expectation $E[A(I)]$ in the definition of performance ratio (the expectation is taken over the random choices performed by the algorithm). • Note: the only randomness in the performance guarantee stems from the randomization of the algorithm itself, and not from any probabilistic assumptions on the instance. • In general, the term approximation algorithm will denote a polynomial-time algorithm.

Approximations Based on Assignment Formulation • Kumar et al. 99 • Algorithm 1: at each iteration, the algorithm solves the LP relaxation and sets to 1 the variable closest to 1. This is a 1/3-approximation algorithm. • Algorithm 2: at each iteration, the algorithm selects a compatible matching for a color, for which the LP relaxation places the greatest total weight. This is a 1/2-approximation algorithm. • Experimental evaluation -> problems up to order 9.

Approximation Based on Packing Formulation • Randomization scheme: for each color $k$, choose a pattern $M$ with probability $y^*_{kM}$, its value in the optimal solution of the packing LP relaxation (so that some matching is selected for each color). • As a result we have a pattern per color. • Problem: some patterns may overlap, even though in expectation, the constraints imply that the number of matchings in which a cell is involved is 1.
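The selection scheme can be sketched as follows. The data layout (`y[color]` as a list of `(pattern, weight)` pairs with weights summing to 1) and the tie rule for overlapping cells (lowest-numbered color wins) are assumptions for illustration; the slide does not say how overlaps are resolved.

```python
import random


def select_patterns(y, rng):
    """Pick one pattern per color, with probability given by its LP weight."""
    chosen = {}
    for color, options in y.items():
        patterns = [p for p, _ in options]
        weights = [w for _, w in options]
        chosen[color] = rng.choices(patterns, weights=weights, k=1)[0]
    return chosen


def resolve_overlaps(chosen):
    """Give each covered cell a single color; the lowest-numbered color wins a clash."""
    coloring = {}
    for color in sorted(chosen):
        for cell in chosen[color]:
            coloring.setdefault(cell, color)
    return coloring


diag = frozenset({(0, 0), (1, 1)})
anti = frozenset({(0, 1), (1, 0)})
y = {0: [(diag, 1.0)], 1: [(anti, 0.8), (diag, 0.2)]}
chosen = select_patterns(y, random.Random(3))
coloring = resolve_overlaps(chosen)
```

When color 1 happens to draw `diag`, both colors claim the same cells and the cells of `anti` stay uncolored, which is the overlap problem the slide describes.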

Packing formulation • Max number of colored cells in the selected patterns, s.t. one pattern is selected per family and each cell is covered by at most one pattern. [Figure: example fractional solution with pattern weights 1, 0.8, 1, 1, 0.2.]

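(1-1/e)-Approximation Based on Packing Formulation • Let’s assume that the partial Latin square (PLS) is completable, so $Z^* = h$, where $Z^*$ is the optimal value of the packing LP relaxation and $h$ is the number of holes. • What is the expected number of cells left uncolored by our randomized procedure due to overlapping conflicts? • From $Z^* = h$ we can compute, for each hole $(i, j)$: $\sum_{k=1}^{n} z_{ijk} = 1$, where $z_{ijk} = \sum_{M \in \mathcal{M}_k : (i,j) \in M} y^*_{kM}$ is the probability that the pattern chosen for color $k$ covers cell $(i, j)$. • So, the desired probability corresponds to the probability of a cell not being colored with any color, i.e.: $\prod_{k=1}^{n} (1 - z_{ijk})$.

(1-1/e)-Approximation Based on Packing Formulation • This expression is maximized when all the $z_{ijk}$ are equal, i.e. $z_{ijk} = 1/n$, therefore: $\prod_{k=1}^{n} (1 - z_{ijk}) \le (1 - 1/n)^n \le 1/e$. • So the expected number of uncolored cells is at most $h/e$, i.e. at least $(1 - 1/e)h$ holes are expected to be filled by this technique.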
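The bound can be checked numerically. The sketch below assumes that a cell's per-color coverage probabilities sum to 1 and that colors are chosen independently; the function name is illustrative.

```python
import math


def prob_uncolored(z):
    """Probability that no color covers a cell, given per-color coverage probabilities z."""
    return math.prod(1 - zk for zk in z)


for n in (2, 5, 10, 50):
    print(n, prob_uncolored([1 / n] * n), 1 / math.e)
```

For a fixed sum, an uneven split such as `[0.8, 0.2]` gives a smaller product than the even split `[0.5, 0.5]`, and the even split itself stays below 1/e for every n.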

Putting all the pieces together

• CSP Model • LP Model + LP Randomized Rounding • Heavy-tails • We want to maintain completeness • How do we put all the pieces together? A HYBRID COMPLETE CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH

HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH • Central features of algorithm: • Complete backtrack search algorithm • It maintains two formulations: CSP model and relaxed LP model • LP randomized rounding for setting values at the top of the tree • CSP + LP inference

HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH • Populate CSP Model • Perform propagation • Populate LP solver • Solve LP • Variable setting controlled by LP Randomized Rounding (parameters: %LP, Interleave-LP) • CSP & LP Inference • Search & Inference controlled by CSP • Adaptive CUTOFF

HYBRID CSP/LP RANDOMIZED ROUNDING BACKTRACK SEARCH • Initialize the CSP model and perform propagation of constraints (Ilog Solver); • Solve the LP model (Ilog Cplex Barrier) • The LP provides good heuristic guidance and pruning information for the search. However, solving the LP is relatively expensive. • Two parameters control the LP effort: • %LP: controls the percentage of variables set based on the LP rounding (%LP=0 is a pure CSP strategy) • Interleave-LP: sets the frequency with which we re-solve the LP. • Randomized rounding scheme: rank variables according to their LP value. Select the highest ranked variable and set its value to 1 with probability p given by its LP value. With probability (1-p), randomly select a color from the colors allowed in the CSP model. • Perform CSP propagation after each variable setting. (A total of Interleave-LP variables are assigned this way without re-solving the LP.) • Use a cutoff value to restart the search (keep increasing it to maintain completeness).
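One step of the randomized rounding scheme described above might look like the sketch below. It is not the authors' implementation: the dictionaries `lp_values` (keyed by `(i, j, k)`) and `csp_domains` (keyed by cell) are hypothetical stand-ins for the state held by the LP and CSP solvers, and propagation is left out.

```python
import random


def rounding_step(lp_values, csp_domains, rng):
    """Choose the next cell and color from the LP solution and the CSP domains.

    Returns ((i, j), color) for the highest-ranked LP variable that is still
    consistent with the CSP model.
    """
    candidates = {
        var: value
        for var, value in lp_values.items()
        if var[:2] in csp_domains and var[2] in csp_domains[var[:2]]
    }
    (i, j, k), p = max(candidates.items(), key=lambda item: item[1])
    if rng.random() < p:
        return (i, j), k  # With probability p, follow the LP.
    # With probability 1 - p, pick any color the CSP model still allows.
    return (i, j), rng.choice(sorted(csp_domains[(i, j)]))


lp_values = {(0, 0, 1): 1.0, (0, 1, 0): 0.4}
csp_domains = {(0, 0): {1, 2}, (0, 1): {0}}
cell, color = rounding_step(lp_values, csp_domains, random.Random(0))
```

In a full search, the chosen assignment would be posted to the CSP model, propagated, and the LP re-solved only after Interleave-LP such assignments.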

Empirical Results

Time Performance

Performance in Backtracks

Performance • With the hybrid strategy we also solve instances of order 40 in the critically constrained area, which are out of reach for pure CSP; • We even solved a few balanced instances of order 50 in the critically constrained area! • More systematic experimentation is required to better understand the limitations and strengths of the approach.

Conclusions

Conclusions • Approximations based on LP randomized rounding (variable/value setting) + CSP propagation: very powerful. • Combating heavy-tails of backtrack search through randomization: very effective. • Consequence: new ways of designing algorithms. Aim for strategies which have highly asymmetric distributions that can be exploited using restarts, portfolios of algorithms, and interleaved/parallel runs. • General approach holds promise for a range of combinatorial problems. • Final TAKE HOME MESSAGE: Randomization does not imply incomplete search!!!

Demos, papers, etc. • www.cs.cornell.edu/gomes • www.orie.cornell.edu/~shmoys • Check also: www.cis.cornell.edu/iisi

Eighth International Conference on the Principles and Practice of Constraint Programming (CP 2002) • September 7-13 • Cornell, Ithaca NY
