Ryan O'Donnell (CMU, IAS) Yi Wu (CMU, IBM) Yuan Zhou (CMU)

Hardness of Solving Sparse Linear Equations over Integers (and Large Cyclic Groups)

Solving linear equations • Given a set of linear equations over the reals, is there a solution satisfying all the equations? • Easy: Gaussian elimination.

Noisy version • Given a set of linear equations for which some solution satisfies 99% of the equations, • can we find a solution that satisfies at least 1% of the equations? • I.e., is there a 99% vs 1% approximation algorithm for linear equations over the reals?

Hardness of Max-3Lin(q) • Theorem. [Håstad '01] Given a set of linear equations modulo q, it is NP-hard to distinguish between • there is a solution satisfying a (1 - ε)-fraction of the equations • no solution satisfies more than a (1/q + ε)-fraction of the equations • Equations are sparse, and are of the form x_i + x_j - x_k = c (mod q) • (1 - ε) vs (1/q + ε) approx. for Max-3Lin(q) is NP-Hard • A 3-query PCP with completeness (1 - ε), soundness (1/q + ε)
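The soundness value 1/q + ε sits just above what a uniformly random assignment achieves. The following is a minimal sketch of that baseline, not part of the talk: the tuple encoding `(i, j, k, c)` for the equation x_i + x_j - x_k = c (mod q), the function names, and the toy instance are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def fraction_satisfied(equations, assignment, q):
    """Fraction of equations x_i + x_j - x_k = c (mod q) that `assignment` satisfies."""
    hits = sum(
        (assignment[i] + assignment[j] - assignment[k]) % q == c % q
        for (i, j, k, c) in equations
    )
    return Fraction(hits, len(equations))

def average_over_all_assignments(equations, num_vars, q):
    """Exact expected satisfied fraction for a uniformly random assignment."""
    total = sum(
        fraction_satisfied(equations, a, q)
        for a in product(range(q), repeat=num_vars)
    )
    return total / q ** num_vars

# Hypothetical toy instance: 3 equations over 4 variables, modulo q = 3.
toy_equations = [(0, 1, 2, 1), (1, 2, 3, 0), (0, 3, 1, 2)]
```

Because each equation involves three distinct variables, exactly a 1/q fraction of assignments satisfies it, so the average comes out to 1/q regardless of the instance.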

Sparser equations: Max-2Lin(q) • Theorem. [KKMO '07] Assuming the Unique Games Conjecture, for any ε, δ > 0 there exists q > 0 such that (1 - ε) vs δ approx. for Max-2Lin(q) is NP-Hard

Over integers/reals?

Equations over integers: Max-3Lin(Z) • Can we approximate Max-3Lin/Max-2Lin over large domains? • Intuitively, it should be harder, because as the domain size increases, • the soundness becomes smaller in both [Håstad '01] and [KKMO '07] • Obstacle to proving hardness • "Long code" becomes too long (even infinitely long)

Hardness of Max-3Lin(Z) • Theorem. [Guruswami-Raghavendra '07] For all ε, δ > 0, it is NP-Hard to (1 - ε) vs δ approximate Max-3Lin(Z) • 3-query PCP over integers • Implies the hardness for Max-3Lin(R) • Proof follows [Håstad '01], but is much more involved • derandomized Long Code testing • Fourier analysis with respect to an exponential distribution on Z+

Unique Games over Integers? • Can we use the techniques in [Guruswami-Raghavendra '07] to prove a (1 - ε) vs δ UG-hardness for Max-2Lin(Z)? • Seems difficult • Posed as an open question in Raghavendra's thesis [Raghavendra '09]

Our results • Relatively easy to modify the KKMO proof to get • Theorem. For all ε, δ > 0, it is UG-Hard to (1 - ε) vs δ approximate Max-2Lin(Z) • Also applies to Max-2Lin over reals and large domains • Simpler proof (and better parameters) of Max-3Lin(Z) hardness

Dictatorship Test • Theorem. For all ε, δ > 0, it is UG-Hard to (1 - ε) vs δ approximate Max-2Lin(Z) • By [KKMO '07], only need to design a (1 - ε) vs δ 2-query dictatorship test over integers.

Dictatorship Test (cont'd) • f: [q]^d -> Z is called a dictator if f(x_1, x_2, ..., x_d) = x_i (for some i) • Dictatorship test over [q]: a distribution over equations f(x) - f(y) = c (mod q) • Completeness: for dictators, Pr[equation holds] ≥ 1 - ε • Soundness: for functions far from dictators, Pr[equation holds] < δ • ⇒ (1 - ε) vs δ hardness of Max-2Lin(q)

Dictatorship Test over Integers • A distribution over equations f(x) - f(y) = c • Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε • Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ • It is UG-Hard to distinguish between • a Max-2Lin(Z) instance is (1 - ε)-satisfiable • the instance is not δ-satisfiable even when the equations are taken modulo q

Recap of KKMO Dictatorship Test

Back to KKMO Dictatorship Test • Dictatorship test over [q]: a distribution over equations • f(x) - f(y) = c (mod q) • Completeness: for dictators, Pr[equation holds] ≥ 1 - ε • Soundness: for functions far from dictators, • Pr[equation holds] < δ • KKMO Test • Pick x ∈ [q]^d at random • Get y by rerandomizing each coordinate of x w.p. ε • Test f(x) - f(y) = 0 (mod q)
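For small q and d the test distribution can be enumerated exactly. The sketch below is an illustrative brute-force check, not the authors' code; the function names are hypothetical, points of [q]^d are represented as tuples over {0, ..., q-1}, and "rerandomizing w.p. ε" is read as "with probability ε, redraw the coordinate uniformly from [q]".

```python
from fractions import Fraction
from itertools import product

def kkmo_pass_probability(f, q, d, eps):
    """Exact Pr[f(x) - f(y) = 0 (mod q)] under the KKMO test distribution."""
    eps = Fraction(eps)
    total = Fraction(0)
    for x in product(range(q), repeat=d):
        for y in product(range(q), repeat=d):
            # x is uniform; each y_i equals x_i w.p. 1 - eps, else is redrawn uniformly.
            weight = Fraction(1, q ** d)
            for xi, yi in zip(x, y):
                weight *= eps / q + ((1 - eps) if xi == yi else 0)
            if (f(x) - f(y)) % q == 0:
                total += weight
    return total

def first_dictator(x):
    return x[0]

def constant_zero(x):
    return 0
```

A dictator passes with probability 1 - ε + ε/q, which is at least 1 - ε. A constant function passes with probability 1 even though it is far from every dictator, which is why the soundness analysis needs a balance condition.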

Back to KKMO Dictatorship Test (cont'd) • KKMO Test • Pick x ∈ [q]^d at random • Get y by rerandomizing each coordinate of x w.p. ε • Test f(x) - f(y) = 0 (mod q) • Soundness analysis: "Majority Is Stablest" Theorem [MOO '05] • If f is far from dictators and "β-balanced", then Pr[f passes the test] < β^{ε/2} • f is β-balanced: Pr[f(x) = a mod q] ≤ β for all 0 ≤ a < q

Back to KKMO Dictatorship Test (cont'd) • KKMO Test • Pick x ∈ [q]^d at random • Get y by rerandomizing each coordinate of x w.p. ε • Test f(x) - f(y) = 0 (mod q) • Soundness analysis • "Folding" trick: to make sure f is β-balanced • Idea: on query f(x) = f(x_1, x_2, ..., x_d), return g(x) = f(0, (x_2 - x_1) mod q, ..., (x_d - x_1) mod q) + x_1 • Dictators not affected in completeness analysis • g(x) is 1/q-balanced
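The two properties claimed for folding can be checked by enumeration on a small domain. This is a rough sketch under the same tuple representation of [q]^d; `fold` and `max_residue_probability` are hypothetical helper names, and index 0 in the code plays the role of x_1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def fold(f, q):
    """Return g(x) = f(0, (x_2 - x_1) mod q, ..., (x_d - x_1) mod q) + x_1."""
    def g(x):
        shifted = tuple((xi - x[0]) % q for xi in x)  # first coordinate becomes 0
        return f(shifted) + x[0]
    return g

def max_residue_probability(g, q, d):
    """max over a of Pr_x[g(x) = a (mod q)] for x uniform in [q]^d."""
    counts = Counter(g(x) % q for x in product(range(q), repeat=d))
    return Fraction(max(counts.values()), q ** d)

def constant_zero(x):
    return 0

def second_dictator(x):
    return x[1]
```

The constant function is as unbalanced as possible before folding and exactly 1/q-balanced afterwards, while a folded dictator still agrees with the original dictator modulo q.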

Dictatorship Test for Max-2Lin(Z) • A distribution over equations f(x) - f(y) = c • Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε • Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ • If we use the KKMO test... • Soundness: the same • Completeness does not hold, because • on query f(x), we get g(x) = (x_i - x_1) mod q + x_1 • on query f(y), we get g(y) = (y_i - y_1) mod q + y_1 • Max-2Lin(q): Pr[g(x) - g(y) = 0 mod q] ≥ 1 - ε • Max-2Lin(Z): Pr[g(x) - g(y) ≠ 0] ≥ Pr["wrap-around" (exactly one of g(x), g(y) is ≥ q)] ≈ 1/2
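To see where the factor of roughly 1/2 comes from, one can count how often a single folded query to a dictator overshoots its intended value over the integers. The sketch below only examines one query in isolation (it does not model the joint distribution of x and y), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def folded_dictator(x, i, q):
    """Integer value returned for the i-th dictator after KKMO-style folding."""
    return (x[i] - x[0]) % q + x[0]

def wrap_around_probability(q, i=1, d=2):
    """Exact Pr_x[folded value != x_i over Z] for x uniform in [q]^d (needs i != 0)."""
    bad = sum(folded_dictator(x, i, q) != x[i] for x in product(range(q), repeat=d))
    return Fraction(bad, q ** d)
```

The folded value equals x_i + q exactly when x_i < x_1, which happens with probability (q - 1)/(2q); this tends to 1/2 as q grows, while the value modulo q is always correct.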

Our method, Step I: Introducing the new "active folding"

The new "active folding" • KKMO Test with active folding • Pick x ∈ [q]^d at random • Get y by rerandomizing each coordinate of x w.p. ε • Pick c, c' ∈ [q] at random, test • f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c' (mod q) • Completeness: as in the KKMO test, ≥ 1 - ε for dictators • Soundness: • Claim. g(x) = f(x_1 - c, ..., x_d - c) + c is 1/q-balanced • Proof. Pr_{x,c}[f(x_1 - c, ..., x_d - c) + c = a mod q] = E_c[Pr_x[f(x_1 - c, ..., x_d - c) = a - c mod q]] = E_c[Pr_x[f(x) = a - c mod q]] = E_x[Pr_c[f(x) = a - c mod q]] ≤ 1/q
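The balance claim holds for an arbitrary f, which can be confirmed by brute force on a small domain. In this sketch the coordinates x_j - c are reduced modulo q so that they stay in [q] (an assumption about how the shift is applied), and the optional `num_shifts` parameter and the example function `lopsided` are hypothetical additions for experimentation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def actively_folded_balance(f, q, d, num_shifts=None):
    """max over a of Pr_{x,c}[f(x_1 - c, ..., x_d - c) + c = a (mod q)].

    x is uniform in [q]^d and c is uniform in {0, ..., num_shifts - 1}
    (all of [q] when num_shifts is None).
    """
    m = q if num_shifts is None else num_shifts
    counts = Counter()
    for x in product(range(q), repeat=d):
        for c in range(m):
            shifted = tuple((xi - c) % q for xi in x)  # shift every coordinate by c
            counts[(f(shifted) + c) % q] += 1
    return Fraction(max(counts.values()), m * q ** d)

def lopsided(x):
    # An arbitrary, far-from-balanced example function.
    return x[0] * x[1] + 1

def constant_zero(x):
    return 0
```

With c uniform over all of [q], every residue is hit with probability exactly 1/q. Drawing c from only m of the q shifts weakens the guarantee to at most 1/m, since for each x at most one of the m shifts gives the target residue.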

Our method, Step II: "Partial active folding"

"Partial active folding" • KKMO Test with partial active folding for Max-2Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w.p. ε • Pick c, c'∈ [q0.5] by random, test • f(x1 - c, ..., xn - c) + c = f(y1 - c', ..., yn - c') + c' • Completeness: • f(x1 - c, ..., xn - c) + c = (xi - c) mod q + c = (xi - c) + c = xiw.p. 1 - 1/q0.5 • f(y1 - c', ..., yn - c') + c' = yiw.p. 1 - 1/q0.5 Pr[f(x1-c, ..., xn-c)+c = f(y1-c', ..., yn-c')+c'] ≥ 1 - ε - 2/q0.5

"Partial active folding" (cont'd) • KKMO Test with partial active folding for Max-2Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w.p. ε • Pick c, c'∈[q0.5] by random, test • f(x1 - c, ..., xn - c) + c = f(y1 - c', ..., yn - c') + c' • Completeness: • Soundness: • Claim. g(x) = f(x1 - c, ..., xn - c) + c is 1/q0.5-balanced • Proof. Prx,c[f(x1 - c, ..., xn - c) + c = a mod q] = Ec [Prx[f(x1 - c, ..., xn - c) = a - c mod q] ] = Ec [Prx[f(x) = a - c mod q] ] = Ex [Prc[f(x) = a - c mod q] ] ≤ 1/q0.5

"Partial active folding" (cont'd) • KKMO Test with partial active folding for Max-2Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w.p. ε • Pick c, c'∈[q0.5] by random, test • f(x1 - c, ..., xn - c) + c = f(y1 - c', ..., yn - c') + c' • Completeness: • Soundness: • Claim. g(x) = f(x1 - c, ..., xn - c) + c is 1/q0.5-balanced • By Majority Is Stablest Theorem, when f is far from dictators Pr[f(x1-c,...,xn-c)+c = f(y1-c',...,yn-c')+c' mod q] < 1/qε/4

Application to Max-3Lin(Z) • Key idea in Max-2Lin(Z): "partial folding" to deal with the "wrap-around" event

Håstad's reduction for Max-3Lin(q) • Håstad's Matching Dictatorship Test for • f: [q]^L -> Z, g: [q]^R -> Z, π: [R] -> [L] • Pick x ∈ [q]^L, y ∈ [q]^R at random • Let z ∈ [q]^R s.t. z_i = (y_i + x_{π(i)}) mod q • Rerandomize each coordinate of x, y, z w.p. ε • Test f(0, x_2 - x_1, ..., x_L - x_1) + x_1 + g(y) = g(z) mod q • Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f, g pass the test] ≥ 1 - 3ε • Soundness: if f and g are far from being "matching dictators", Pr[f, g pass the test] < 1/q + δ • ⇒ (1 - 3ε) vs (1/q + δ) NP-Hardness of Max-3Lin(q)

Our reduction for Max-3Lin(Z) • Matching Dictatorship Test with partial active folding for • f: [q^2]^L -> Z, g: [q^3]^R -> Z, π: [R] -> [L] • Pick x ∈ [q^2]^L, y ∈ [q^3]^R at random • Let z ∈ [q^3]^R s.t. z_i = (y_i + x_{π(i)}) mod q^3 • Rerandomize each coordinate of x, y, z w.p. ε • Pick c ∈ [q] at random • Test f(x_1 - c, ..., x_L - c) + c + g(y) = g(z) • Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z)] ≥ 1 - 3ε - 2/q • Soundness: if f and g are far from being "matching dictators", Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z) mod q] < 1/q + δ • ⇒ (1 - 3ε - 2/q) vs (1/q + δ) NP-Hardness of Max-3Lin(Z)
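The 2/q loss in completeness can be sanity-checked by enumerating the two wrap-around events for matching dictators with the rerandomization step switched off. The sketch rests on two labeled assumptions: the shifted inputs of f are reduced modulo q^2 so they stay in [q^2], and z_i is reduced modulo q^3 so that z stays in [q^3]^R. Only the three relevant values are enumerated, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def noiseless_pass_probability(q):
    """Exact Pr[f(x - c) + c + g(y) = g(z) over Z] for matching dictators, no noise.

    a stands for x_{pi(i)} in [q^2], b for y_i in [q^3], and c is the shift in [q].
    """
    good = 0
    for a, b, c in product(range(q ** 2), range(q ** 3), range(q)):
        lhs = ((a - c) % q ** 2) + c + b  # f(x - c) + c + g(y), assuming shifts mod q^2
        rhs = (b + a) % q ** 3            # g(z), assuming z_i = (y_i + x_{pi(i)}) mod q^3
        good += lhs == rhs
    return Fraction(good, q ** 6)
```

The equation fails only if a < c (the shift wraps around) or a + b ≥ q^3 (the sum wraps around), and each event has probability at most 1/q. For q = 3 the exact pass probability is 541/729, above the bound 1 - 2/q = 1/3.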

The End. Any questions?
