
Complexity of Approximating Constraint Satisfaction Problems
Prasad Raghavendra
Microsoft Research New England, Cambridge, MA, USA.
Constraint Satisfaction Problem
A constraint satisfaction problem Λ:
Λ = (a finite domain [q] = {1, 2, …, q}, a set of predicates/relations {P1, P2, …, Pr})
Examples:
MaxCut = ({0,1}, {P(a,b) = a ≠ b})
3-SAT = ({0,1}, {P1(a,b,c) = a ∨ b ∨ c, P2(a,b,c) = ¬a ∨ b ∨ c, …, P8(a,b,c) = ¬a ∨ ¬b ∨ ¬c})
Constraint Satisfaction Problem
Instance I of Λ-CSP:
• Set of variables: {x1, x2, …, xn}
• Set of constraints: predicates from Λ applied to subsets of variables.
[Figure: constraints P1, P31, P13 applied to variables x1, x2, x3, x9, …, xn]
Max-Λ-CSP: “Given an instance of Λ-CSP, find an assignment to the variables that satisfies the maximum number of constraints.”
Remarks:
• Use the fraction of constraints instead of the number of constraints (the objective value is always between 0 and 1).
• The constraints can be weighted; one then maximizes the total weight. Easily seen to be equivalent.
• Predicates can be replaced by bounded real-valued payoffs (Generalized CSPs).
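As a concrete sanity check, the definition above can be turned into a few lines of code. This is a minimal sketch under an assumed encoding (a constraint is a predicate plus the tuple of variable indices it is applied to); the triangle MaxCut instance is a toy example, not one from the talk.

```python
from fractions import Fraction
from itertools import product

def satisfied_fraction(constraints, assignment):
    """Fraction of constraints satisfied by `assignment`.

    A constraint is a pair (predicate, scope): `predicate` maps a
    tuple of domain values to True/False, and `scope` lists the
    variable indices it is applied to.
    """
    hits = sum(1 for pred, scope in constraints
               if pred(tuple(assignment[v] for v in scope)))
    return Fraction(hits, len(constraints))

# MaxCut as a CSP over {0,1}: the predicate a != b on every edge.
cut = lambda t: t[0] != t[1]
instance = [(cut, e) for e in [(0, 1), (1, 2), (2, 0)]]   # a triangle

# No cut of a triangle crosses all 3 edges; the optimum is 2/3.
best = max(satisfied_fraction(instance, x)
           for x in product((0, 1), repeat=3))
```

The triangle illustrates why the objective is naturally a fraction in [0, 1]: the best assignment here satisfies 2 of the 3 constraints.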
Approximability
An algorithm A is an α-approximation for Max-Λ-CSP if for every instance I, A(I) ≥ α · OPT(I).
Approximability Threshold: “Given a CSP Λ, what is the largest constant αΛ for which there is an αΛ-approximation for Max-Λ-CSP?”
Polymorphisms
A function F : [q]^R → [q] for some constant R is a “polymorphism” for a CSP Λ if:
For every instance I of CSP Λ and every set of R solutions {X1, X2, …, XR} to instance I,
F(X1, X2, …, XR) is also a solution to instance I.
(Here F is applied to each variable separately: the ith coordinate of the new assignment is F applied to the ith coordinates of X1, …, XR.)
[Figure: R solutions written as rows of a 0/1 matrix; F applied column by column yields a new row.]
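The polymorphism condition is easy to test exhaustively on small instances. The sketch below checks the textbook fact that Majority (with R = 3) maps triples of 2-SAT solutions to solutions; the clause encoding and the particular 4-variable instance are assumptions for the demo.

```python
from itertools import product

def is_solution(clauses, x):
    # A 2-SAT clause is a pair of literals (var, negated):
    # literal (v, neg) is true iff x[v] != neg.
    return all(any(x[v] != neg for v, neg in cl) for cl in clauses)

def majority(bits):
    return int(sum(bits) > len(bits) // 2)

def apply_coordinatewise(f, solutions):
    # F(X1, ..., XR): apply f to each variable (column) separately.
    n = len(solutions[0])
    return tuple(f([s[i] for s in solutions]) for i in range(n))

# A small satisfiable 2-SAT instance on 4 variables:
# (x0 v x1), (!x1 v x2), (!x2 v !x3), (!x0 v x3)
clauses = [((0, False), (1, False)), ((1, True), (2, False)),
           ((2, True), (3, True)), ((0, True), (3, False))]

sols = [x for x in product((0, 1), repeat=4) if is_solution(clauses, x)]
# Majority applied to every triple of solutions is again a solution.
ok = all(is_solution(clauses, apply_coordinatewise(majority, triple))
         for triple in product(sols, repeat=3))
```

This mirrors the slide exactly: the rows are the R solutions, and Majority is applied column by column.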
Polymorphisms and Complexity of Exact CSP
Examples:
• The dictator functions F(x) = xi are polymorphisms for all CSPs Λ.
• For linear equations over {0,1} and odd R, the parity (XOR) of the R bits is a polymorphism.
• For 2-SAT and odd R, the majority function on R bits is a polymorphism.
(Algebraic Dichotomy Conjecture) Exact CSP Λ is in P if and only if there are “non-trivial” polymorphisms, i.e., polymorphisms that are very different from dictators (made precise in [Bulatov-Jeavons-Krokhin]).
“(α,c)-Approximate Polymorphisms”
A function F : [q]^R → [q] for some constant R is an “α-approximate polymorphism” for a CSP Λ if:
For every instance I of CSP Λ, every c > 0, and every set of R assignments {X1, X2, …, XR} to instance I that each satisfy a c fraction of constraints,
F(X1, X2, …, XR) satisfies at least an αc fraction of constraints.
(Here F is applied to each variable separately. Fixing the value of c gives the notion of an (α,c)-approximate polymorphism.)
Distributional Functions
Definition: A distributional function is a map F : [q]^R → {probability distributions over [q]}.
Alternately, F : [q]^R → [0,1]^q such that F1(x) + F2(x) + … + Fq(x) = 1 and Fi(x) ≥ 0.
Definition: A DDF Ψ is a probability distribution over distributional functions F ∈ Ψ on [q]^R.
Approximate Polymorphisms
A DDF Ψ (for some constant R) is an “α-approximate polymorphism” for a CSP Λ if:
For every instance I of CSP Λ, every c > 0, and every set of R assignments {X1, X2, …, XR} to instance I that satisfy a c fraction of constraints:
• Sample a distributional function F ∈ Ψ.
• Apply F to each variable separately.
The expected value of the solution returned is at least αc.
Influences
Definition: The influence of the ith coordinate on a function F : [q]^R → [q] under a product distribution μ^R is
Inf_i^μ(F) = E_{x ~ μ^{R-1}} [ Var_{x_i ~ μ} [F(x)] ],
i.e., fix all the other coordinates at random from μ^{R-1}, then take the variance of F as the ith coordinate is rerandomized according to μ.
(For the ith dictator function, Inf_i^μ(F) is as large as the variance of F.)
Dictator functions are trivially 1-approximate polymorphisms for every CSP; “non-trivial” means not like a dictator.
Definition: A function is τ-quasirandom if for all product distributions μ^R and all i, Inf_i^μ(F) ≤ τ.
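Under the uniform distribution on {0,1}^R, the influence definition above can be computed by brute force. A minimal sketch; `dictator` and `parity3` are illustrative functions chosen for the demo:

```python
from itertools import product
from statistics import pvariance

def influence(f, i, R):
    """Inf_i(f) for f: {0,1}^R -> R under the uniform distribution:
    expectation over the other R-1 coordinates of the variance of f
    as coordinate i is rerandomized."""
    total = 0.0
    for rest in product((0, 1), repeat=R - 1):
        # Insert both values of coordinate i into the fixed rest.
        vals = [f(tuple(list(rest[:i]) + [b] + list(rest[i:])))
                for b in (0, 1)]
        total += pvariance(vals)          # population variance
    return total / 2 ** (R - 1)

dictator = lambda x: x[0]
parity3 = lambda x: x[0] ^ x[1] ^ x[2]
```

For the dictator, the influence of coordinate 0 equals the variance of the function (1/4 for a uniform bit) and every other influence is 0, matching the parenthetical remark on the slide; parity has influence 1/4 in every coordinate, so it is far from a dictator.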
Complexity of Approximability
For a CSP Λ, let αΛ = the largest constant such that there are αΛ-approximate non-trivial polymorphisms, i.e., αΛ-approximate τ-quasirandom polymorphisms for every τ > 0.
Define αΛ(c): the same quantity restricted to instances with value ≥ c.
(Analogue of the Algebraic Dichotomy Conjecture): “For every Max-CSP Λ, αΛ is the threshold of approximability of Max-CSP Λ.”
True for all known approximation thresholds.
Hardness: cannot approximate to a factor better than αΛ. Algorithm: can approximate to a factor αΛ.
Hardness and Algorithm
Theorem [Raghavendra 08]: Under the Unique Games Conjecture [Khot 02],
• For every Max-CSP Λ, it is NP-hard to approximate the problem better than αΛ.
• For every Max-CSP Λ and c, ε > 0, it is NP-hard to approximate the problem on instances with value c − ε to a factor better than αΛ(c).
The hardness follows fairly easily using the reduction of [Khot-Kindler-Mossel-O’Donnell]; a slightly more general version of the hardness is equivalent to the Unique Games Conjecture.
Theorem [Raghavendra 08]: For every ε > 0, every Max-CSP Λ can be approximated within a ratio αΛ − ε in time exp(exp(poly(1/ε, |Λ|))) · poly(n).
• The algorithm is based on semidefinite programming (the LC relaxation).
• This settles the approximability of every CSP under the Unique Games Conjecture.
• The situation is the reverse of the state of the dichotomy conjecture: here the algorithm is unconditional and the matching hardness is conjectural.
More Generally
Theorem [Raghavendra 08]: Under the UGC, this semidefinite program (LC) yields the “optimal” approximation for every Generalized Constraint Satisfaction Problem: bounded real-valued payoff functions instead of predicates (minimization problems too, e.g., Metric Labelling).
On instances with value c:
• The LC relaxation approximates instances with value c to αΛ(c − ε) − ε.
• Under the UGC, it is hard to approximate instances with value c to a factor better than αΛ(c + ε) + ε.
Remarks
• The approximation threshold αΛ is not very explicit.
Theorem [Raghavendra 08]: For every Max-CSP Λ, the value of αΛ can be computed within error ε in time exp(exp(poly(1/ε, |Λ|))).
• The LC semidefinite program is simple and can be solved in near-linear time in the size of the instance [Steurer 09].
Rest of the Talk • Hardness: • Unique Games Conjecture. • Overview of Reduction. • Algorithm: • Intuitive Idea • Description of LC semidefinite program • Random Projections • Invariance Principle
Unique Games, a Special Case: E2LIN mod p
Given a set of linear equations of the form Xi − Xj = cij (mod p), find a solution that satisfies the maximum number of equations.
Example:
x − y = 11 (mod 17)
x − z = 13 (mod 17)
…
z − w = 15 (mod 17)
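A checker for E2LIN mod p is a one-liner; the equations and assignment below are a made-up system in the shape of the slide's example. Note the defining “unique games” feature the talk relies on: shifting every variable by the same constant preserves exactly which equations are satisfied.

```python
def satisfied_fraction(equations, assignment, p):
    """Fraction of equations x_i - x_j = c (mod p) satisfied."""
    good = sum(1 for i, j, c in equations
               if (assignment[i] - assignment[j]) % p == c % p)
    return good / len(equations)

# Three equations mod 17: x-y=11, x-z=13, z-w=15, with variables
# x, y, z, w numbered 0..3 (a demo system, satisfiable by design).
eqs = [(0, 1, 11), (0, 2, 13), (2, 3, 15)]
# x=0, y=6, z=4, w=6: 0-6=11, 0-4=13, 4-6=15 (all mod 17).
assignment = {0: 0, 1: 6, 2: 4, 3: 6}
```

Because only differences appear, the solution space is invariant under a global shift, which is why a single revealed value determines a unique consistent labelling along any chain of equations.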
Unique Games Conjecture [Khot 02], an Equivalent Version [Khot-Kindler-Mossel-O’Donnell]:
For every ε > 0, the following problem is NP-hard for large enough prime p.
Given an E2LIN mod p system, distinguish between:
• There is an assignment satisfying a 1 − ε fraction of the equations.
• No assignment satisfies more than an ε fraction of the equations.
Polymorphisms and Dictatorship Tests
By definition of αΛ, for every α > αΛ there is no α-approximate polymorphism: there is some instance I and set of R solutions on which every low-influence function F fails.
To get a dictatorship-test gadget:
• Merge variables that have the same R-bit vector associated with them.
• The new instance I’ has only 2^R variables, indexed by {0,1}^R.
• All the projections/dictator solutions have value c.
• Any other low-influence function has value < αc.
Algorithm
Theorem [Raghavendra 08]: For every ε > 0, every Max-CSP Λ can be approximated within a ratio αΛ − ε in time exp(exp(poly(1/ε, |Λ|))) · poly(n).
Goal: design an αΛ-approximation algorithm using the existence of an αΛ-approximate polymorphism.
Intuitive Idea
Input: an instance I of Max-CSP Λ.
We know: for every τ > 0, there are αΛ-approximate τ-quasirandom polymorphisms F : [q]^R → [q].
Foolish Algorithm: take R optimal solutions X1, X2, …, XR and apply F(X1, …, XR) to get a solution that is αΛ-approximately optimal. (Of course, we do not have optimal solutions to feed in.)
A Plausible Algorithm:
• Relax the constraint that solutions are integral {0,1}: allow real values. Using semidefinite programming, one can generate “real-valued optimal solutions”.
• Feed these real-valued optimal solutions into F.
• A τ-quasirandom polymorphism cannot distinguish between integral solutions and these real-valued solutions.
Multilinear Expansion
Every function F : {0,1}^R → ℝ can be written as a multilinear polynomial.
Examples:
AND(x(1), x(2)) = x(1)·x(2)
OR(x(1), x(2)) = 1 − (1 − x(1))(1 − x(2))
More generally, the polynomial PF corresponding to F is the multilinear interpolation
PF(x) = Σ_{a ∈ {0,1}^R} F(a) · Π_i ( x_i a_i + (1 − x_i)(1 − a_i) ).
Any function F : [q]^R → ℝ can be written as a multilinear polynomial after suitably arithmetizing the inputs.
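The interpolation step can be checked directly: the sketch below builds the multilinear extension from a function's truth table and recovers the AND and OR polynomials from the slide.

```python
from itertools import product

def multilinear_extension(f, R):
    """The unique multilinear polynomial P_f agreeing with
    f: {0,1}^R -> R on the hypercube, via interpolation:
    P_f(x) = sum_a f(a) * prod_i (x_i*a_i + (1-x_i)*(1-a_i))."""
    def P(x):
        total = 0.0
        for a in product((0, 1), repeat=R):
            w = 1.0
            for xi, ai in zip(x, a):
                w *= xi * ai + (1 - xi) * (1 - ai)
            total += f(a) * w
        return total
    return P

AND = multilinear_extension(lambda a: a[0] & a[1], 2)
OR = multilinear_extension(lambda a: a[0] | a[1], 2)
```

Off the hypercube the polynomial interpolates smoothly, e.g. AND(1/2, 1/2) = 1/4 and OR(1/2, 1/2) = 3/4, matching x(1)·x(2) and 1 − (1 − x(1))(1 − x(2)); this is exactly the sense in which PH can be fed real-valued inputs later in the talk.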
Semidefinite Program for CSPs (illustrated for 3-SAT)
Variables:
• For each variable Xa: vectors {V(a,0), V(a,1)}. (Intuition: if Xa = 0 then V(a,0) = 1, V(a,1) = 0; if Xa = 1 then V(a,0) = 0, V(a,1) = 1.)
• For each clause P = (xa ∨ xb ∨ xc): scalar variables μ(P,000), μ(P,001), μ(P,010), μ(P,011), μ(P,100), μ(P,101), μ(P,110), μ(P,111). (Intuition: if Xa = 0, Xb = 1, Xc = 1, then μ(P,011) = 1 and all other μ(P,·) = 0.)
Constraints:
• For each clause P: 0 ≤ μ(P,α) ≤ 1.
• For each pair Xa, Xb in a clause P, consistency between the vector and LP variables:
V(a,0) · V(b,0) = μ(P,000) + μ(P,001)
V(a,0) · V(b,1) = μ(P,010) + μ(P,011)
V(a,1) · V(b,0) = μ(P,100) + μ(P,101)
V(a,1) · V(b,1) = μ(P,110) + μ(P,111)
Objective function: maximize the total μ-mass on satisfying assignments, Σ_P Σ_{α satisfying P} μ(P,α).
Semidefinite Relaxation for CSP
SDP solution for instance I:
• For every variable xi in I: vectors vi,1, …, vi,q.
• For every constraint φ in I: a local distribution μφ over assignments to the variables of φ.
Example of a local distribution for φ = 3XOR(x3, x4, x7):
x3 x4 x7 : μφ
0 0 0 : 0.1
0 0 1 : 0.01
0 1 0 : 0
…
1 1 1 : 0.6
Constraints: the first and second moments of the local distributions are consistent with the inner products of the vectors, which form PSD matrix constraints (consistency holds for first moments as well).
SDP objective: maximize the expected fraction of constraints satisfied under the local distributions, Σφ E_{x ~ μφ}[φ(x)].
Gaussian Projections
Sample g = a random Gaussian vector. Generate a real-valued solution Z = (Z1, Z2, …, Zn) by random projection along g, where
Zi = |vi,1|² + ( vi,1 − |vi,1|² (vi,1 + vi,0) ) · g.
Lemma: For every constraint φ in the instance I, the first two moments of the random variables {Zi : i ∈ φ} agree with the local distribution μφ. Formally, for all i, j ∈ φ:
E_{xi,xj ~ μφ}[xi xj] = E[Zi Zj]
E_{xi ~ μφ}[xi] = E[Zi]
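The moment-matching behind the lemma can be observed numerically. The sketch below projects two assumed unit vectors (inner product 0.6, chosen for the demo, not from the talk) along many random Gaussian directions and estimates the second moments of the projections.

```python
import random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Two demo vectors in R^2 with inner product 0.6.
v1 = (1.0, 0.0)
v2 = (0.6, 0.8)

samples = 200000
acc = acc1 = acc2 = 0.0
for _ in range(samples):
    g = (random.gauss(0, 1), random.gauss(0, 1))
    z1, z2 = dot(v1, g), dot(v2, g)
    acc += z1 * z2        # estimates E[Z1 Z2]
    acc1 += z1 * z1       # estimates E[Z1^2]
    acc2 += z2 * z2       # estimates E[Z2^2]

est_cross, est1, est2 = acc / samples, acc1 / samples, acc2 / samples
```

Empirically E[Z1 Z2] ≈ v1 · v2 = 0.6 and E[Zi²] ≈ |vi|² = 1, which is the mechanism the lemma formalizes (the slide's Zi additionally shifts by the first moment so that E[Zi] matches μφ).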
Noise
Let F be an αΛ-approximate polymorphism.
Lemma: Let H(X) = E_Y[F(Y)] where Y = X with ε-noise: Yi = Xi with probability 1 − ε, and a random bit with probability ε. Then H is an (αΛ − O(ε))-approximate polymorphism.
Proof: Let X1, X2, …, XR be solutions to instance I with value c. Perturb each coordinate of each solution Xi with probability ε, so Yi = Xi with ε-noise. The expected value of each perturbed solution is > c − O(ε), so the expected value of the solution F(Y1, …, YR) is at least αΛ(c − O(ε)).
Advantage: H is an (αΛ − O(ε))-approximate polymorphism that is essentially a low-degree function, due to the averaging.
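The ε-noise operation itself is simple to simulate; a sketch (the solution vector below is an arbitrary example). Each bit is kept with probability 1 − ε and replaced by a uniform random bit with probability ε, so a bit actually flips with probability ε/2 and the expected agreement with the original is 1 − ε/2.

```python
import random

def add_noise(x, eps, rng):
    """epsilon-noise: keep each bit with prob 1-eps, replace it by
    a uniform random bit with prob eps."""
    return tuple(b if rng.random() >= eps else rng.randrange(2)
                 for b in x)

rng = random.Random(1)
x = (0, 1, 1, 0, 0, 1, 1, 1, 0, 1) * 100    # a 1000-bit demo solution
noisy = add_noise(x, 0.1, rng)
agree = sum(a == b for a, b in zip(x, noisy)) / len(x)
```

For ε = 0.1 the agreement concentrates near 0.95; this small, controlled damage to each solution is what costs the O(ε) in the lemma while buying low degree.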
Algorithm
Setup:
• {v1, v2, …, vn}: the SDP vectors.
• H: a low-degree, (αΛ − O(ε))-approximate polymorphism.
Algorithm:
• Sample R independent Gaussian vectors {g1, g2, …, gR}.
• Generate the corresponding Gaussian projections.
• Apply the polynomial PH corresponding to H.
• The outputs are not distributions, but they are close to distributions over {0,1}: round each to the nearest distribution, and sample from it.
Invariance Principle
Central Limit Theorem: “The sum of a large number of {−1,1} random variables has a similar distribution to the sum of a large number of Gaussian random variables.” Equivalently, for the polynomial P(X) = (X1 + X2 + … + XR)/√R, the distribution of P(X) is similar for X iid {−1,1} and for X an iid Gaussian vector.
Invariance Principle for Low-Degree, Low-Influence Polynomials [Rotar], [Mossel-O’Donnell-Oleszkiewicz], [Mossel 2008]:
If P(X1, …, XR) is a constant-degree polynomial with Inf_i(P) ≤ τ for all i (all influences are small), and Z1, Z2, …, ZR are iid random variables, then the distribution of P(Z1, Z2, …, ZR) depends only on the first and second moments of the random variables Zi.
(P cannot distinguish between two sets of random variables with the same first two moments: if (Z1, Z2, Z3) and (Y1, Y2, Y3) have matching first two moments, then the outputs P(Z1), P(Z2), P(Z3) are distributed like P(Y1), P(Y2), P(Y3).)
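The CLT special case can be checked empirically: the sketch below samples P(X) = (X1 + … + XR)/√R with ±1 inputs and with Gaussian inputs and compares the first two moments (both are 0 and 1). Sample sizes and tolerances are arbitrary demo choices.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

random.seed(2)
R, trials = 100, 10000

def P(xs):
    # A degree-1, low-influence polynomial: every Inf_i(P) = 1/R.
    return sum(xs) / sqrt(R)

signs = [P([random.choice((-1, 1)) for _ in range(R)])
         for _ in range(trials)]
gauss = [P([random.gauss(0, 1) for _ in range(R)])
         for _ in range(trials)]
```

Both samples have mean ≈ 0 and variance ≈ 1; matching all higher moments (i.e., the full distributions being close) is what the CLT, and more generally the invariance principle, guarantees.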
Analysis of the Algorithm
Consider a predicate φ of the instance I, and let μφ be the local distribution for φ.
Compare two experiments: R assignments Xi sampled from μφ, versus the R Gaussian projections Zi.
• Each sampled row Xi satisfies φ in expectation with probability equal to the SDP contribution of φ.
• Hence the output F(X1, …, XR) satisfies φ, in expectation, with probability at least αΛ times that.
• By invariance, the output computed from the projections Z1, …, ZR satisfies φ with (nearly) the same probability in expectation.
Summing up over all constraints: output value ≥ αΛ · SDP value ≥ αΛ · OPT.
Back to Exact CSPs
We introduced ε-noise to make the polymorphism low degree, so we will not get fully satisfiable assignments.
Definition: A function F is noise stable if an ε-perturbation of its inputs changes F with probability g(ε), where g(ε) → 0 as ε → 0.
Theorem: If there exist noise-stable, low-influence polymorphisms for a CSP Λ, then it is tractable (using semidefinite programming).
• The above condition holds for all boolean CSPs except linear equations.
• Semidefinite programs cannot solve linear equations over {0,1}.
• SDPs can solve all bounded-width CSPs trivially.
OPEN PROBLEM: Can one show that if a CSP Λ does not contain the affine type, then it has noise-stable low-influence polymorphisms?
Example: Max Cut
Input: a weighted graph G. Find a cut that maximizes the (weighted) number of crossing edges.
[Figure: a graph with edge weights 10, 15, 7, 1, 1, 3]
Max Cut SDP
Quadratic Program:
Variables: x1, x2, …, xn, each xi = 1 or −1.
Maximize Σ_{(i,j) ∈ E} wij (1 − xi xj)/2.
Semidefinite Program:
Relax the xi to be unit vectors instead of {1, −1}; all products are replaced by inner products of vectors.
Variables: v1, v2, …, vn with |vi|² = 1.
Maximize Σ_{(i,j) ∈ E} wij (1 − vi · vj)/2.
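The rounding used with this relaxation (cut by a random hyperplane, as in Goemans-Williamson) can be sketched on a tiny instance. For the unit-weight triangle, the embedding with three unit vectors at 120 degrees is the known optimal SDP solution; it is hard-coded below rather than obtained by actually solving the SDP.

```python
import random
from math import cos, sin, pi

random.seed(3)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Unit-weight triangle; SDP vectors at 120-degree angles in R^2.
vecs = [(cos(2 * pi * k / 3), sin(2 * pi * k / 3)) for k in range(3)]
edges = [(0, 1), (1, 2), (2, 0)]
sdp_value = sum((1 - dot(vecs[i], vecs[j])) / 2 for i, j in edges)

def round_once():
    # Random hyperplane: side of each vertex = sign of v . g.
    g = (random.gauss(0, 1), random.gauss(0, 1))
    side = [dot(v, g) >= 0 for v in vecs]
    return sum(side[i] != side[j] for i, j in edges)

trials = 20000
avg_cut = sum(round_once() for _ in range(trials)) / trials
```

Here sdp_value = 2.25 while every generic hyperplane cuts exactly 2 of the 3 edges, so the rounding achieves ratio 2/2.25 ≈ 0.888, above the Goemans-Williamson constant ≈ 0.878.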
Local Random Variables
Let v1, v2, v3 be SDP vectors, and fix an edge e = (1,2). For every edge e there is a local distribution Pe over integral solutions, i.e., random variables a1, a2 taking values in {−1,1}, such that all moments of order at most 2 match the inner products:
E[a1 a2] = v1 · v2, E[a1²] = |v1|², E[a2²] = |v2|².
Global Random Variables
Let c = SDP value and v1, v2, …, vm = SDP vectors. Let g = a random Gaussian vector (each coordinate generated by an i.i.d. normal variable), and set
b1 = v1 · g, b2 = v2 · g, b3 = v3 · g, …, bm = vm · g.
This gives a global distribution B = (b1, b2, b3, …, bm) over real numbers such that all moments of order at most 2 match the local distributions Pe:
E[b1 b2] = v1 · v2, E[b2 b3] = v2 · v3, E[b3 b1] = v3 · v1, E[b1²] = |v1|², E[b2²] = |v2|², E[b3²] = |v3|².
Overview • Constraint Satisfaction Problems • Unique Games Conjecture • Semidefinite Programming • Results • Generic Algorithm • Proof Outline Emphasis on connections between Integrality Gaps, UGC Hardness Results, Dictatorship tests and Rounding Schemes
Constraint Satisfaction Problem, a Classic Example: Max-3-SAT
Given a 3-SAT formula, find an assignment to the variables that satisfies the maximum number of clauses (equivalently, the largest fraction of clauses).
Constraint Satisfaction Problem
Instance:
• Set of variables.
• Predicates Pi applied to variables.
Find an assignment that satisfies the largest fraction of constraints.
Problem: Domain {0, 1, …, q−1}; predicates {P1, P2, P3, …, Pr}, Pi : [q]^k → {0,1}.
Max-3-SAT: Domain {0,1}; predicates such as P1(x,y,z) = x ∨ y ∨ z; variables {x1, x2, x3, x4, x5}; constraints: 4 clauses.
Generalized CSP (GCSP)
Replace predicates by payoff functions (bounded, real-valued).
Problem: Domain {0, 1, …, q−1}; payoffs {P1, P2, P3, …, Pr}, Pi : [q]^k → [−1, 1].
Objective: find an assignment that maximizes the average payoff.
Payoff functions can be negative, so GCSPs can model minimization problems like Multiway Cut and 0-Extension.
Examples Max-3-SAT Max Cut Max Di Cut Multiway Cut Metric Labelling 0-Extension Unique Games d- to - 1 Games Label Cover Horn Sat
Unique Games Conjecture A notorious open problem, no general consensus either way. Hardness Results: No constant factor approximation for unique games. [Feige-Reichman]
More.. UG hardness results are intimately connected to the limitations of Semidefinite Programming