# Complexity of Approximating Constraint Satisfaction Problems

##### Presentation Transcript

1. Complexity of Approximating Constraint Satisfaction Problems. Prasad Raghavendra, Microsoft Research New England, Cambridge, MA, USA.

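2. Constraint Satisfaction Problem
A constraint satisfaction problem $\Lambda$ is a pair $\Lambda = ([q], \{P_1, P_2, \dots, P_r\})$, where $[q] = \{1, 2, \dots, q\}$ is a finite domain and $\{P_1, P_2, \dots, P_r\}$ is a set of predicates/relations.
Examples:
• MaxCut $= (\{0,1\}, \{P(a,b) = [a \neq b]\})$
• 3-SAT $= (\{0,1\}, \{P_1(a,b,c) = a \lor b \lor c,\ P_2(a,b,c) = \lnot a \lor b \lor c,\ \dots,\ P_8(a,b,c) = \lnot a \lor \lnot b \lor \lnot c\})$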

3. Constraint Satisfaction Problem
Instance $I$ of $\Lambda$-CSP:
• Set of variables: $\{x_1, x_2, \dots, x_n\}$
• Set of constraints: predicates from $\Lambda$ applied to subsets of variables.
[Figure: predicates applied to subsets of the variables $x_1, x_2, x_3, \dots, x_9, \dots, x_n$]
Max-$\Lambda$-CSP: "Given an instance of $\Lambda$-CSP, find an assignment to the variables that satisfies the maximum number of constraints."
Remarks:
• Use the fraction of constraints instead of the number of constraints (the objective value is then always between 0 and 1).
• The constraints can be weighted, in which case one maximizes the total weight. This is easily seen to be equivalent.
• Predicates can be replaced by bounded real-valued payoffs (Generalized CSPs).
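The objective on this slide can be made concrete with a minimal sketch. The helper names `value` and `brute_force_opt` and the triangle instance are illustrative assumptions, not part of the talk, and the domain is written as {0, ..., q-1} rather than {1, ..., q} for convenience.

```python
from itertools import product

def value(constraints, assignment):
    """Fraction of constraints satisfied by `assignment` (a tuple indexed by variable)."""
    satisfied = sum(1 for pred, scope in constraints
                    if pred(*(assignment[v] for v in scope)))
    return satisfied / len(constraints)

def brute_force_opt(constraints, n, q=2):
    """Best value over all q**n assignments; exponential, so for tiny examples only."""
    return max(value(constraints, a) for a in product(range(q), repeat=n))

# Max-Cut on a triangle: one NOTEQUAL constraint per edge (hypothetical instance).
neq = lambda a, b: a != b
triangle = [(neq, (0, 1)), (neq, (1, 2)), (neq, (0, 2))]

print(value(triangle, (0, 1, 0)))      # cuts edges (0,1) and (1,2) but not (0,2)
print(brute_force_opt(triangle, 3))    # no cut of a triangle separates all three edges
```

Each constraint is stored as a (predicate, scope) pair, which mirrors "predicates from Λ applied to subsets of variables".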

4. Approximability
An algorithm $A$ is an $\alpha$-approximation for Max-$\Lambda$-CSP if for every instance $I$, $A(I) \ge \alpha \cdot \mathrm{OPT}(I)$.
Approximability Threshold: "Given a CSP $\Lambda$, what is the largest constant $\alpha_\Lambda$ for which there is an $\alpha_\Lambda$-approximation for Max-$\Lambda$-CSP?"

5. Polymorphisms
A function $F : [q]^R \to [q]$ for some constant $R$ is a "polymorphism" for a CSP $\Lambda$ if, for every instance $I$ of CSP $\Lambda$ and every set of $R$ solutions $\{X_1, X_2, \dots, X_R\}$ to instance $I$, $F(X_1, X_2, \dots, X_R)$ is also a solution to instance $I$. (Here $F$ is applied to each variable separately.)
Example from the slide figure (one column per variable $x_1, \dots, x_n$):
X1 = 0 1 1 0 0 1 1 1 0 1
X2 = 1 1 1 1 1 1 1 1 1 1
…
XR = 1 0 1 0 0 1 0 1 0 1
F(X1, X2, …, XR) = 0 1 1 1 1 1 1 1 1 0

6. Polymorphisms and Complexity of Exact CSP
Examples:
• The dictator functions $F(x) = x_i$ are polymorphisms for all CSPs $\Lambda$.
• For linear equations over $\{0,1\}$, for odd $R$, the XOR (parity) function on $R$ bits.
• For 2-SAT, for odd $R$, the majority function on $R$ bits.
(Algebraic Dichotomy Conjecture) Exact CSP $\Lambda$ is in P if and only if there are "non-trivial" polymorphisms, i.e., polymorphisms that are very different from dictators (precisely defined in [Bulatov-Jeavons-Krokhin]).
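The polymorphism property can be checked by brute force on a small instance. The sketch below is only an illustration: the clause encoding, the helper names and the three-clause 2-SAT formula are assumptions made here, and a check on one instance cannot prove that a function is a polymorphism for every instance (it can only refute it).

```python
from itertools import product

def solutions(clauses, n):
    """All satisfying assignments of a 2-SAT formula.
    A clause is a pair of literals (variable_index, wanted_value)."""
    return [x for x in product((0, 1), repeat=n)
            if all(any(x[v] == val for v, val in clause) for clause in clauses)]

def is_polymorphism(F, R, clauses, n):
    """Apply F coordinate-wise to every R-tuple of solutions of this one instance."""
    sols = solutions(clauses, n)
    sol_set = set(sols)
    for tup in product(sols, repeat=R):
        combined = tuple(F(*(x[i] for x in tup)) for i in range(n))
        if combined not in sol_set:
            return False
    return True

majority = lambda a, b, c: int(a + b + c >= 2)
xor3 = lambda a, b, c: a ^ b ^ c
dictator = lambda a, b, c: a

# (x0 or x1) and (not x0 or x2) and (x1 or not x2)
clauses = [((0, 1), (1, 1)), ((0, 0), (2, 1)), ((1, 1), (2, 0))]

print(is_polymorphism(majority, 3, clauses, 3))  # majority preserves 2-SAT solutions
print(is_polymorphism(xor3, 3, clauses, 3))      # XOR of solutions 010, 011, 111 is 110
```

On this instance the solutions are 010, 011 and 111; their coordinate-wise XOR is 110, which violates the second clause, so XOR is not a polymorphism for 2-SAT even though it is one for linear equations.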

7. Approximate Polymorphisms
A function $F : [q]^R \to [q]$ for some constant $R$ is an "$\alpha$-approximate polymorphism" for a CSP $\Lambda$ if, for every instance $I$ of CSP $\Lambda$ and every $c > 0$, and for every set of $R$ assignments $\{X_1, X_2, \dots, X_R\}$ to instance $I$ that satisfy a $c$ fraction of constraints, $F(X_1, X_2, \dots, X_R)$ satisfies at least an $\alpha c$ fraction of constraints. (Here $F$ is applied to each variable separately.)
"$(\alpha, c)$-approximate polymorphisms": fix the value of $c$ in the CSP instance.
[Figure: example assignments $X_1, X_2, \dots, X_R$ over $x_1, \dots, x_n$ and the combined assignment $F(X_1, X_2, \dots, X_R)$]

8. Distributional Function
Definition: A distributional function is a map $F : [q]^R \to \{\text{probability distributions over } [q]\}$. Alternately, $F = (F_1, \dots, F_q) : [q]^R \to \mathbb{R}^q$ such that $F_1(x) + F_2(x) + \dots + F_q(x) = 1$ and $F_i(x) \ge 0$.
Definition: A DDF $\Psi$ is a probability distribution over distributional functions $F \in \Psi$ over $[q]^R$.

9. Approximate Polymorphisms
A DDF $\Psi$ for some constant $R$ is an "$\alpha$-approximate polymorphism" for a CSP $\Lambda$ if, for every instance $I$ of CSP $\Lambda$ and every $c > 0$, and for every set of $R$ assignments $\{X_1, X_2, \dots, X_R\}$ to instance $I$ that satisfy a $c$ fraction of constraints, the following procedure returns a solution whose expected value is at least $\alpha c$:
• Sample a distributional function $F \in \Psi$.
• Apply $F$ to each bit separately.
[Figure: example assignments over $x_1, \dots, x_n$ with constraints $P_1, P_2, P_3$ and the combined assignment]

10. Influences
Dictator functions are trivially 1-approximate polymorphisms for every CSP. Non-trivial $\Rightarrow$ not like a dictator.
Definition: The influence of the $i$th coordinate on a function $F : [q]^R \to \mathbb{R}$ under a product distribution $\mu^R$ is defined as
$\mathrm{Inf}_i^{\mu}(F) = \mathbb{E}[\mathrm{Var}[F]]$,
where the expectation is over a random fixing of all other coordinates from $\mu^{R-1}$ and the variance is over changing the $i$th coordinate as per $\mu$.
(For the $i$th dictator function, $\mathrm{Inf}_i^{\mu}(F)$ is as large as the variance of $F$.)
Definition: A function is $\tau$-quasirandom if for all product distributions $\mu^R$ and all $i$, $\mathrm{Inf}_i^{\mu}(F) \le \tau$.
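The influence definition can be evaluated exactly for small $R$ by enumeration. This is a minimal sketch under assumptions made here: real-valued functions on tuples over {0, ..., q-1}, a default uniform distribution on two symbols, and the helper name `influence`.

```python
from itertools import product

def influence(F, R, i, mu=(0.5, 0.5)):
    """Inf_i^mu(F): expectation, over the other R-1 coordinates drawn from mu,
    of the variance of F when coordinate i is resampled from mu."""
    q = len(mu)
    total = 0.0
    for rest in product(range(q), repeat=R - 1):
        weight = 1.0
        for a in rest:
            weight *= mu[a]
        # Re-insert every possible value a at position i.
        vals = [F(rest[:i] + (a,) + rest[i:]) for a in range(q)]
        mean = sum(p * v for p, v in zip(mu, vals))
        var = sum(p * (v - mean) ** 2 for p, v in zip(mu, vals))
        total += weight * var
    return total

dictator = lambda x: x[0]
majority3 = lambda x: int(sum(x) >= 2)

print(influence(dictator, 3, 0))   # equals the variance of a uniform bit
print(influence(dictator, 3, 1))   # other coordinates have no influence
print(influence(majority3, 3, 0))  # coordinate 0 matters only when the other two disagree
```

Under the uniform distribution the dictator's own coordinate has influence 1/4 (its full variance), while each coordinate of 3-bit majority has influence 1/8, which illustrates why majority on many bits is "less dictator-like".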

11. Complexity of Approximability
For a CSP $\Lambda$, $\alpha_\Lambda$ = largest constant such that there are $\alpha_\Lambda$-approximate non-trivial polymorphisms, i.e., $\alpha_\Lambda$-approximate $\tau$-quasirandom polymorphisms for every $\tau > 0$.
Define: $\alpha_\Lambda(c)$, the same quantity restricted to instances with value $\ge c$.
(Analogue of Algebraic Dichotomy Conjecture): "For every Max-CSP $\Lambda$, $\alpha_\Lambda$ is the threshold of approximability of Max-CSP $\Lambda$." True for all known approximation thresholds.
• Hardness: cannot approximate to better than an $\alpha_\Lambda$ factor.
• Algorithm: can approximate to a factor $\alpha_\Lambda$.

12. Hardness and Algorithm
Theorem [Raghavendra 08]
Hardness: Assuming the Unique Games Conjecture [Khot 02], for every Max-CSP $\Lambda$, it is NP-hard to approximate the problem better than $\alpha_\Lambda$.
• For every Max-CSP $\Lambda$ and $c, \varepsilon > 0$, it is NP-hard to approximate the problem on instances with value $c - \varepsilon$ to a factor better than $\alpha_\Lambda(c)$.
• Follows fairly easily using the reduction of [Khot-Kindler-Mossel-O'Donnell].
• A slightly more general version of hardness is equivalent to the Unique Games Conjecture.
Algorithm:
• For every $\varepsilon > 0$, every Max-CSP $\Lambda$ can be approximated within a ratio $\alpha_\Lambda - \varepsilon$ in time $\exp(\exp(\mathrm{poly}(1/\varepsilon, |\Lambda|))) \cdot \mathrm{poly}(n)$.
• Algorithm based on semidefinite programming (LC relaxation).
• Settles the approximability of every CSP under the Unique Games Conjecture.
• Situation is the reverse of the state of the dichotomy conjecture.

13. More Generally
Theorem [Raghavendra 08]: Under UGC, this semidefinite program (LC) yields the "optimal" approximation for:
• Every Generalized Constraint Satisfaction Problem: bounded real-valued functions instead of predicates (minimization problems also, e.g. Metric Labelling).
On instances with value $c$:
• The LC relaxation approximates instances with value $c$ to $\alpha_\Lambda(c - \varepsilon) - \varepsilon$.
• Under UGC, it is hard to approximate instances with value $c$ to a factor better than $\alpha_\Lambda(c + \varepsilon) + \varepsilon$.

14. Remarks
• The approximation threshold $\alpha_\Lambda$ is not very explicit.
Theorem [Raghavendra 08]: For every Max-CSP $\Lambda$, the value of $\alpha_\Lambda$ can be computed within an error $\varepsilon$ in time $\exp(\exp(\mathrm{poly}(1/\varepsilon, |\Lambda|)))$.
• The LC semidefinite program is simple and can be solved in near-linear time [Steurer 09].

15. Rest of the Talk
• Hardness:
  • Unique Games Conjecture
  • Overview of Reduction
• Algorithm:
  • Intuitive Idea
  • Description of LC semidefinite program
  • Random Projections
  • Invariance Principle

16. Hardness

17. Unique Games: A Special Case
E2LIN mod p: Given a set of linear equations of the form $x_i - x_j = c_{ij} \pmod{p}$, find a solution that satisfies the maximum number of equations.
Example: $x - y = 11 \pmod{17}$, $x - z = 13 \pmod{17}$, …, $z - w = 15 \pmod{17}$
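As a small illustration of the E2LIN objective, the sketch below scores an assignment against the three equations that are visible on the slide (the elided ones are left out). The tuple encoding `(i, j, c)` and the function name are assumptions made for this example.

```python
def satisfied_fraction(equations, assignment, p):
    """equations: list of (i, j, c) meaning x_i - x_j = c (mod p)."""
    ok = sum(1 for i, j, c in equations
             if (assignment[i] - assignment[j]) % p == c % p)
    return ok / len(equations)

p = 17
# x - y = 11, x - z = 13, z - w = 15 (mod 17), with x, y, z, w as variables 0..3
eqs = [(0, 1, 11), (0, 2, 13), (2, 3, 15)]

print(satisfied_fraction(eqs, (0, 6, 4, 6), p))   # all three equations hold
print(satisfied_fraction(eqs, (1, 7, 5, 7), p))   # shifting every variable keeps them satisfied
print(satisfied_fraction(eqs, (0, 0, 0, 0), p))   # the all-zero assignment satisfies none
```

Each equation fixes $x_i$ once $x_j$ is chosen, which is the "unique" property, and adding the same constant to every variable preserves every equation.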

18. Unique Games Conjecture [Khot 02]: An Equivalent Version [Khot-Kindler-Mossel-O'Donnell]
For every $\varepsilon > 0$, the following problem is NP-hard for a large enough prime $p$: given an E2LIN mod $p$ system, distinguish between:
• There is an assignment satisfying a $1 - \varepsilon$ fraction of the equations.
• No assignment satisfies more than an $\varepsilon$ fraction of the equations.

19. Polymorphisms and Dictatorship Tests
By definition of $\alpha_\Lambda$, for every $\alpha > \alpha_\Lambda$ there is no non-trivial $\alpha$-approximate polymorphism. $\Rightarrow$ There is some instance $I$ and $R$ solutions such that every low-influence function $F$ fails.
To get a dictatorship test gadget:
• Merge variables that have the same $R$-bit vector associated.
• The new instance $I'$ has only $2^R$ vertices, indexed by $\{0,1\}^R$.
• All the projections/dictator solutions have value $c$.
• Any other low-influence function has value $< \alpha c$.
[Figure: $R$ solutions written as rows over the variables $x_1, \dots, x_n$]

20. Algorithm
Theorem [Raghavendra 08]: For every $\varepsilon > 0$, every Max-CSP $\Lambda$ can be approximated within a ratio $\alpha_\Lambda - \varepsilon$ in time $\exp(\exp(\mathrm{poly}(1/\varepsilon, |\Lambda|))) \cdot \mathrm{poly}(n)$.
Design an $\alpha_\Lambda$-approximation algorithm using the existence of an $\alpha_\Lambda$-approximate polymorphism.

21. Intuitive Idea
Input: An instance $I$ of the Max-CSP $\Lambda$.
We know: for every $\tau > 0$ there are $\alpha_\Lambda$-approximate $\tau$-quasirandom polymorphisms $F : [q]^R \to [q]$.
Foolish Algorithm: Take $R$ optimal solutions $X_1, X_2, \dots, X_R$ and apply $F(X_1, \dots, X_R)$ to get a solution that is $\alpha_\Lambda$-approximately optimal.
A Plausible Algorithm:
• Relax the constraint that solutions are integral $\{0,1\}$ and allow real values. Using semidefinite programming, one can generate "real-valued optimal solutions".
• Feed in these real-valued optimal solutions.
• A $\tau$-quasirandom polymorphism cannot distinguish between integral solutions and these real-valued solutions.
[Figure: $R$ optimal integral solutions over $x_1, \dots, x_n$ next to real-valued solutions with entries such as 0.5, 1.2, -1, 0.3]

22. Multi-linear Expansion
Every function $F : \{0,1\}^R \to \mathbb{R}$ can be written as a multilinear polynomial.
Example: $\mathrm{AND}(x^{(1)}, x^{(2)}) = x^{(1)} x^{(2)}$, $\mathrm{OR}(x^{(1)}, x^{(2)}) = 1 - (1 - x^{(1)})(1 - x^{(2)})$.
More generally, any function $F : [q]^R \to \mathbb{R}$ can be written as a multilinear polynomial $P_F$ after suitably arithmetizing the inputs.
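For Boolean inputs the multilinear polynomial can be computed explicitly. The sketch below uses inclusion-exclusion over subsets (a standard technique chosen here, not necessarily the construction used in the talk); `multilinear_coeffs` and `evaluate` are hypothetical helper names.

```python
from itertools import combinations

def multilinear_coeffs(F, R):
    """Coefficients c_S such that F(x) = sum_S c_S * prod_{i in S} x_i on {0,1}^R."""
    coeffs = {}
    for k in range(R + 1):
        for S in combinations(range(R), k):
            c = 0
            # Inclusion-exclusion over subsets T of S, using the indicator point of T.
            for m in range(k + 1):
                for T in combinations(S, m):
                    point = tuple(1 if i in T else 0 for i in range(R))
                    c += (-1) ** (k - m) * F(point)
            if c != 0:
                coeffs[S] = c
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the polynomial at an arbitrary real point x."""
    total = 0.0
    for S, c in coeffs.items():
        term = c
        for i in S:
            term *= x[i]
        total += term
    return total

AND = lambda x: x[0] & x[1]
OR = lambda x: x[0] | x[1]

print(multilinear_coeffs(AND, 2))                   # single monomial x0*x1
print(multilinear_coeffs(OR, 2))                    # x0 + x1 - x0*x1
print(evaluate(multilinear_coeffs(OR, 2), (0.5, 0.5)))
```

The last line evaluates OR at a non-integral point, which is exactly what happens when a polymorphism is fed real-valued solutions instead of bits.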

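23. Semidefinite Program for CSPs
Variables:
• For each variable $X_a$: vectors $\{V_{(a,0)}, V_{(a,1)}\}$.
• For each clause $P = (x_a \lor x_b \lor x_c)$: scalar variables $\mu_{(P,000)}, \mu_{(P,001)}, \mu_{(P,010)}, \mu_{(P,100)}, \mu_{(P,011)}, \mu_{(P,110)}, \mu_{(P,101)}, \mu_{(P,111)}$.
Intended integral solution:
• $X_a = 1$: $V_{(a,0)} = 0$, $V_{(a,1)} = 1$. $X_a = 0$: $V_{(a,0)} = 1$, $V_{(a,1)} = 0$.
• If $X_a = 0, X_b = 1, X_c = 1$: $\mu_{(P,011)} = 1$ and $\mu_{(P,000)} = \mu_{(P,001)} = \mu_{(P,010)} = \mu_{(P,100)} = \mu_{(P,110)} = \mu_{(P,101)} = \mu_{(P,111)} = 0$.
Constraints:
• For each clause $P$ and assignment $\alpha$: $0 \le \mu_{(P,\alpha)} \le 1$.
• For each clause $P = (x_a \lor x_b \lor x_c)$ and each pair $X_a, X_b$ in $P$, consistency between vector and LP variables:
$V_{(a,0)} \cdot V_{(b,0)} = \mu_{(P,000)} + \mu_{(P,001)}$
$V_{(a,0)} \cdot V_{(b,1)} = \mu_{(P,010)} + \mu_{(P,011)}$
$V_{(a,1)} \cdot V_{(b,0)} = \mu_{(P,100)} + \mu_{(P,101)}$
$V_{(a,1)} \cdot V_{(b,1)} = \mu_{(P,110)} + \mu_{(P,111)}$
Objective Function: [formula not preserved in the transcript]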

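24. Semidefinite Relaxation for CSP
SDP solution for an instance $I$:
• for every constraint $\phi$ in $I$: local distributions $\mu_\phi$ over assignments to the variables of $\phi$
• for every variable $x_i$ in $I$: vectors $v_{i,1}, \dots, v_{i,q}$
Example of a local distribution: $\phi = \mathrm{3XOR}(x_3, x_4, x_7)$

| $x_3$ | $x_4$ | $x_7$ | $\mu_\phi$ |
|---|---|---|---|
| 0 | 0 | 0 | 0.1 |
| 0 | 0 | 1 | 0.01 |
| 0 | 1 | 0 | 0 |
| … | … | … | … |
| 1 | 1 | 1 | 0.6 |

Explanation of constraints: first and second moments of the distributions are consistent and form a PSD matrix (consistency constraints also for first moments).
SDP objective: maximize [expression not preserved in the transcript]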

25. Gaussian Projections
Sample $g$ = a random Gaussian vector.
Generate a real-valued solution $Z = (Z_1, Z_2, \dots, Z_{n-1}, Z_n)$ by random projection along $g$, where
$Z_i = \|v_{i,1}\|^2 + \left(v_{i,1} - \|v_{i,1}\|^2 (v_{i,1} + v_{i,0})\right) \cdot g$
Lemma: For every constraint $\phi$ in the instance $I$, the first two moments of the random variables $\{Z_i \mid i \in \phi\}$ agree with the local distribution $\mu_\phi$. Formally, for all $i, j \in \phi$,
$\mathbb{E}_{x_i, x_j \sim \mu_\phi}[x_i x_j] = \mathbb{E}[Z_i Z_j]$
$\mathbb{E}_{x_i \sim \mu_\phi}[x_i] = \mathbb{E}[Z_i]$
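The moment-matching claim can be checked with a short calculation. The sketch below assumes, beyond what the transcript states, that $g$ is a standard Gaussian vector ($\mathbb{E}[g] = 0$, $\mathbb{E}[g g^{\top}] = \mathrm{Id}$), that $v_{i,0} \cdot v_{i,1} = 0$, and that $v_{i,0} + v_{i,1}$ equals one fixed unit vector $v_0$ for every $i$. The symbols $v_0$ and $w_i$ are introduced here only for the calculation.

```latex
\begin{align*}
w_i &:= v_{i,1} - \|v_{i,1}\|^2 v_0, \qquad Z_i = \|v_{i,1}\|^2 + w_i \cdot g,\\
\mathbb{E}[Z_i] &= \|v_{i,1}\|^2 + w_i \cdot \mathbb{E}[g] = \|v_{i,1}\|^2,\\
\mathbb{E}[Z_i Z_j] &= \|v_{i,1}\|^2 \|v_{j,1}\|^2 + w_i \cdot w_j
  \qquad (\text{cross terms vanish since } \mathbb{E}[g] = 0),\\
w_i \cdot w_j &= v_{i,1} \cdot v_{j,1}
  - \|v_{j,1}\|^2 (v_{i,1} \cdot v_0)
  - \|v_{i,1}\|^2 (v_0 \cdot v_{j,1})
  + \|v_{i,1}\|^2 \|v_{j,1}\|^2 \|v_0\|^2\\
&= v_{i,1} \cdot v_{j,1} - \|v_{i,1}\|^2 \|v_{j,1}\|^2
  \qquad (\text{using } v_{i,1} \cdot v_0 = \|v_{i,1}\|^2,\ \|v_0\| = 1),\\
\mathbb{E}[Z_i Z_j] &= v_{i,1} \cdot v_{j,1}.
\end{align*}
```

If the SDP constraints additionally force $\|v_{i,1}\|^2 = \Pr_{\mu_\phi}[x_i = 1]$ and $v_{i,1} \cdot v_{j,1} = \Pr_{\mu_\phi}[x_i = 1, x_j = 1]$ (again an assumption about the relaxation, in the spirit of the consistency constraints described earlier), these two lines are exactly the identities in the lemma for $\{0,1\}$-valued variables.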

26. Noise
Let $F$ be an $\alpha_\Lambda$-approximate polymorphism.
Lemma: Let $H(X) = \mathbb{E}_Y[F(Y)]$, where $Y$ = $X$ with $\varepsilon$ noise: $Y_i = X_i$ with probability $1 - \varepsilon$, and a random bit with probability $\varepsilon$. Then $H$ is an $(\alpha_\Lambda - O(\varepsilon))$-approximate polymorphism.
Proof: Let $X_1, X_2, \dots, X_R$ be solutions to instance $I$ with value $c$. Perturb each coordinate of each solution $X_i$ with probability $\varepsilon$: $Y_i$ = $X_i$ with $\varepsilon$ noise. The expected value of each perturbed solution is $> c - O(\varepsilon)$. The expected value of the solution $F(Y_1, \dots, Y_R)$ is at least $\alpha_\Lambda (c - O(\varepsilon))$.
Advantage: $H$, an $(\alpha_\Lambda - O(\varepsilon))$-approximate polymorphism, is essentially a low-degree function due to averaging.
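The averaged function $H$ from the lemma can be computed exactly for small $R$ by enumerating all noisy inputs. This is a minimal sketch for Boolean inputs; the name `noisy_average` is an assumption, and "random bit with probability ε" is read as replacing the bit by a fresh uniform bit, so a coordinate actually flips with probability ε/2.

```python
from itertools import product

def noisy_average(F, x, eps):
    """H(x) = E_Y[F(Y)], where each Y_i keeps x_i with probability 1 - eps
    and is replaced by a uniform random bit with probability eps."""
    flip = eps / 2.0  # a fresh uniform bit differs from x_i half the time
    total = 0.0
    for y in product((0, 1), repeat=len(x)):
        prob = 1.0
        for xi, yi in zip(x, y):
            prob *= flip if xi != yi else 1.0 - flip
        total += prob * F(y)
    return total

majority3 = lambda y: int(sum(y) >= 2)

print(noisy_average(majority3, (1, 1, 0), 0.0))  # no noise: H agrees with F
print(noisy_average(majority3, (1, 1, 0), 0.2))  # small noise: H is a smoothed version of F
print(noisy_average(majority3, (1, 1, 0), 1.0))  # full noise: H is the constant E[F]
```

The output of $H$ is a number in $[0,1]$ rather than a bit, which is why the talk moves from functions to distributional functions.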

27. Algorithm
Setup:
• $\{v_1, v_2, v_3, \dots, v_n\}$: SDP vectors.
• $H$: a low-degree $(\alpha_\Lambda - O(\varepsilon))$-approximate polymorphism.
Algorithm:
• Sample $R$ independent Gaussian vectors $\{g_1, g_2, \dots, g_R\}$.
• Generate the corresponding Gaussian projections.
• Apply the polynomial $P_H$ corresponding to $H$.
• Outputs are not distributions, but are close to distributions over $\{0,1\}$. Round them to the nearest distribution, and sample from them.
[Figure: real-valued Gaussian projections $Z_1, Z_2, Z_3$ over $x_1, \dots, x_n$ and the outputs $P(Z_i)$, $Q(Z_i)$]

28. Central Limit Theorem: "The sum of a large number of $\{-1,1\}$ random variables has a similar distribution to the sum of a large number of Gaussian random variables." For the polynomial $P(X)$ [formula not preserved in the transcript], the distribution of $P(X)$ is similar for $X$ an i.i.d. $\{-1,1\}$ vector or an i.i.d. Gaussian vector.
Invariance Principle for Low Degree Low Influence Polynomials [Rotar], [Mossel-O'Donnell-Oleszkiewicz], [Mossel 2008]: If $P(X_1, \dots, X_R)$ is a constant-degree polynomial such that
• $\mathrm{Inf}_i(P) \le \tau$ for all $i$ (all influences are small),
• $Z_1, Z_2, \dots, Z_R$ are i.i.d. random variables,
then the distribution of $P(Z_1, Z_2, \dots, Z_R)$ only depends on the first and second moments of the random variables $Z_i$. ($P$ cannot distinguish between two sets of random variables with the same first two moments.)
Example: let $Z_1, Z_2, Z_3$ and $Y_1, Y_2, Y_3$ be two sets of random variables with matching first two moments, and take $R$ i.i.d. copies $(Z_1^j, Z_2^j, Z_3^j)$ and $(Y_1^j, Y_2^j, Y_3^j)$, $j = 1, \dots, R$. Then $(P(Z_1), P(Z_2), P(Z_3))$ and $(P(Y_1), P(Y_2), P(Y_3))$ have nearly the same distribution.

29. Analysis of Algorithm
Consider a predicate $\phi$ of the instance $I$. Let $\mu_\phi$ be the local distribution for $\phi$.
• $X_i$ from $\mu_\phi$: each row satisfies the constraints in expectation $\Rightarrow$ the output satisfies them in expectation (the quantities are not preserved in the transcript).
• $Z_i$ from Gaussian projection: by invariance, the output satisfies them in expectation.
Summing up over all constraints, Output value $\ge \alpha_\Lambda \cdot$ SDP value $\ge \alpha_\Lambda \cdot \mathrm{OPT}$.
[Figure: rows of integral samples from $\mu_\phi$ and rows of real-valued Gaussian projections over $x_1, x_2, x_3, x_4$]

30. Back to Exact CSPs
Introduced $\varepsilon$ noise to make the polymorphism low degree $\Rightarrow$ will not get fully satisfiable assignments.
Definition: A function $F$ is noise stable if an $\varepsilon$ perturbation of the inputs changes $F$ with probability $g(\varepsilon)$ such that $g(\varepsilon) \to 0$ as $\varepsilon \to 0$.
Theorem: If there exist noise-stable low-influence polymorphisms for a CSP $\Lambda$, then it is tractable (using semidefinite programming).
• The above condition holds for all boolean CSPs except linear equations.
• Semidefinite programs cannot solve linear equations over $\{0,1\}$.
• SDPs can solve all bounded-width CSPs trivially.
OPEN PROBLEM: Can one show that if CSP $\Lambda$ does not contain the affine type, then it has noise-stable low-influence polymorphisms?

31. Thank You

32. Degree
[Figure: real-valued solutions and two sets of random variables $Z_1, Z_2, Z_3$ and $Y_1, Y_2, Y_3$ with matching first two moments, each in $R$ copies; $P(Z_1), P(Z_2), P(Z_3)$ compared with $P(Y_1), P(Y_2), P(Y_3)$]

33. Example: Max Cut
Input: a weighted graph $G$. Find a cut that maximizes the number (total weight) of crossing edges.
[Figure: example graph with edge weights 10, 15, 7, 1, 1, 3]

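34. Max Cut SDP
Quadratic Program. Variables: $x_1, x_2, \dots, x_n$ with $x_i = 1$ or $-1$. Maximize [objective not preserved in the transcript].
Semidefinite Program. Variables: $v_1, v_2, \dots, v_n$ with $|v_i|^2 = 1$. Maximize [objective not preserved in the transcript].
Relax all the $x_i$ to be unit vectors instead of $\{1, -1\}$. All products are replaced by inner products of vectors.
[Figure: example graph with vertices labelled $\pm 1$ and edge weights 10, 15, 7, 3, 1]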
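The objective expressions on this slide did not survive extraction. For orientation, the standard textbook formulation of the Max Cut quadratic program and its semidefinite relaxation is sketched below; the edge set $E$ and weights $w_{ij}$ are notation assumed here, and the slide may have used a different normalization.

```latex
\begin{align*}
\text{Quadratic program:}\quad
  & \max \sum_{(i,j) \in E} w_{ij}\, \frac{1 - x_i x_j}{2}
  && \text{subject to } x_i \in \{1, -1\},\\
\text{Semidefinite program:}\quad
  & \max \sum_{(i,j) \in E} w_{ij}\, \frac{1 - v_i \cdot v_j}{2}
  && \text{subject to } |v_i|^2 = 1.
\end{align*}
```

An edge contributes $w_{ij}$ exactly when $x_i x_j = -1$, i.e. when its endpoints lie on opposite sides of the cut; replacing the product $x_i x_j$ by the inner product $v_i \cdot v_j$ gives the relaxation described on the slide.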

35. Local Random Variables
$v_1, v_2, v_3$ = SDP vectors.
Fix an edge $e = (1,2)$. There exist random variables $a_1, a_2$ taking values in $\{-1, 1\}$ such that all the moments of order at most 2 match the inner products:
$\mathbb{E}[a_1 a_2] = v_1 \cdot v_2$, $\mathbb{E}[a_1^2] = |v_1|^2$, $\mathbb{E}[a_2^2] = |v_2|^2$
For every edge $e$, there is such a local distribution $P_e$ over integral solutions.
[Figure: example graph with vertices 1, 2, 3 and edge weights 10, 15, 7, 1, 1, 1]

36. Global Random Variables
$c$ = SDP value; $v_1, v_2, v_3$ = SDP vectors.
$g$ = random Gaussian vector (each coordinate generated by an i.i.d. normal variable).
$b_1 = v_1 \cdot g$, $b_2 = v_2 \cdot g$, $b_3 = v_3 \cdot g$, …, $b_m = v_m \cdot g$
There is a global distribution $B = (b_1, b_2, b_3, \dots, b_m)$ over real numbers such that all the moments of order at most 2 match with the local distributions $P_e$:
$\mathbb{E}[b_1 b_2] = v_1 \cdot v_2$, $\mathbb{E}[b_2 b_3] = v_2 \cdot v_3$, $\mathbb{E}[b_3 b_1] = v_3 \cdot v_1$
$\mathbb{E}[b_1^2] = |v_1|^2$, $\mathbb{E}[b_2^2] = |v_2|^2$, $\mathbb{E}[b_3^2] = |v_3|^2$
[Figure: example graph with edge weights 10, 15, 1, 7, 1, 1, 3]
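The second-moment identities for Gaussian projections can be checked empirically. The sketch below is a Monte Carlo illustration in plain Python; the three unit vectors at 120 degrees are an assumed example (they happen to be the usual SDP solution for a triangle), and the helper names, sample count and seed are arbitrary choices.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gaussian_projections(vectors, rng):
    """One sample of b_i = v_i . g for a standard Gaussian vector g."""
    dim = len(vectors[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [dot(v, g) for v in vectors]

def empirical_second_moment(vectors, i, j, samples=20000, seed=0):
    """Monte Carlo estimate of E[b_i * b_j]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        b = gaussian_projections(vectors, rng)
        total += b[i] * b[j]
    return total / samples

# Hypothetical SDP vectors: three unit vectors in the plane, 120 degrees apart.
vectors = [(1.0, 0.0),
           (-0.5, math.sqrt(3) / 2),
           (-0.5, -math.sqrt(3) / 2)]

print(empirical_second_moment(vectors, 0, 1), dot(vectors[0], vectors[1]))
print(empirical_second_moment(vectors, 0, 0), dot(vectors[0], vectors[0]))
```

The estimates should be close to $v_1 \cdot v_2 = -0.5$ and $|v_1|^2 = 1$, because $\mathbb{E}[(v_i \cdot g)(v_j \cdot g)] = v_i \cdot v_j$ whenever the coordinates of $g$ are independent standard normals.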

37. Overview
• Constraint Satisfaction Problems
• Unique Games Conjecture
• Semidefinite Programming
• Results
• Generic Algorithm
• Proof Outline
Emphasis on connections between Integrality Gaps, UGC Hardness Results, Dictatorship Tests and Rounding Schemes.

38. Constraint Satisfaction Problem: A Classic Example: Max-3-SAT
Given a 3-SAT formula, find an assignment to the variables that satisfies the maximum number of clauses (equivalently, the largest fraction of clauses).

39. Constraint Satisfaction Problem
Problem:
• Domain: $\{0, 1, \dots, q-1\}$
• Predicates: $\{P_1, P_2, P_3, \dots, P_r\}$, $P_i : [q]^k \to \{0,1\}$
Instance:
• Set of variables.
• Predicates $P_i$ applied on variables.
Find an assignment that satisfies the largest fraction of constraints.
Example (Max-3-SAT):
• Domain: $\{0,1\}$
• Predicates: $P_1(x,y,z) = x \lor y \lor z$
• Variables: $\{x_1, x_2, x_3, x_4, x_5\}$
• Constraints: 4 clauses

40. Generalized CSP (GCSP)
Replace predicates by payoff functions (bounded, real-valued).
Problem:
• Domain: $\{0, 1, \dots, q-1\}$
• Payoffs: $\{P_1, P_2, P_3, \dots, P_r\}$, $P_i : [q]^k \to [-1, 1]$
Objective: Find an assignment that maximizes the average payoff.
Payoff functions can be negative, so GCSPs can model minimization problems like Multiway Cut and 0-Extension.

41. Examples: Max-3-SAT, Max Cut, Max Di Cut, Multiway Cut, Metric Labelling, 0-Extension, Unique Games, d-to-1 Games, Label Cover, Horn Sat

42. Unique Games: A Special Case
E2LIN mod p: Given a set of linear equations of the form $x_i - x_j = c_{ij} \pmod{p}$, find a solution that satisfies the maximum number of equations.
Example: $x - y = 11 \pmod{17}$, $x - z = 13 \pmod{17}$, …, $z - w = 15 \pmod{17}$

43. Unique Games Conjecture [Khot 02]: An Equivalent Version [Khot-Kindler-Mossel-O'Donnell]
For every $\varepsilon > 0$, the following problem is NP-hard for a large enough prime $p$: given an E2LIN mod $p$ system, distinguish between:
• There is an assignment satisfying a $1 - \varepsilon$ fraction of the equations.
• No assignment satisfies more than an $\varepsilon$ fraction of the equations.

44. Unique Games Conjecture
A notorious open problem, with no general consensus either way.
Hardness Results: No constant factor approximation for unique games [Feige-Reichman].

45. Why is UGC important?

46. Why is UGC important?

47. Why is UGC important?

48. More.. UG hardness results are intimately connected to the limitations of Semidefinite Programming

49. Semidefinite Programming

50. Max Cut
Input: a weighted graph $G$. Find a cut that maximizes the number (total weight) of crossing edges.
[Figure: example graph with edge weights 10, 15, 7, 1, 1, 3]