
The Goldreich-Levin Theorem: List-decoding the Hadamard code


Presentation Transcript


  1. The Goldreich-Levin Theorem: List-decoding the Hadamard code

  2. Outline • Motivation • Probability review • Theorem and proof

  3. Hadamard Codes • A [2^n, n, 2^(n-1)]_2 linear code • The encoding of a message x ∈ F_2^n is given by all 2^n inner products <x, y> for y ∈ F_2^n (Note: all string-related arithmetic here is mod 2.) • Why is the relative distance 1/2? • We will see a probabilistic algorithm that list-decodes Hadamard codes when up to a 1/2 − ε fraction of the bits are corrupted
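
A minimal sketch (not from the slides) of the encoding map, assuming messages and positions are represented as n-bit Python integers:

    def inner(x: int, y: int) -> int:
        # Inner product <x, y> over F_2: parity of the bitwise AND.
        return bin(x & y).count("1") % 2

    def hadamard_encode(x: int, n: int) -> list[int]:
        # The codeword lists <x, y> for every y in F_2^n, so it has length 2^n.
        return [inner(x, y) for y in range(2 ** n)]

For any nonzero x, <x, y> equals 1 for exactly half of the y's, so two distinct codewords differ in exactly 2^(n-1) positions; that is why the relative distance is 1/2.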

  4. Low error case: p = 3/4 + ε • Unique decoding • Probabilistic algorithm: Estimate-Had(x): For j = 1…k (k to be fixed) Choose r_j ∈ {0,1}^n randomly a_j ← f(r_j + x) − f(r_j) Return majority(a_1,…,a_k) • Now set the i-th bit of the solution to Estimate-Had(e_i)
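
A sketch of Estimate-Had in Python, assuming the corrupted codeword is given as an oracle f over n-bit integers (over F_2, addition and subtraction are both XOR):

    import random

    def estimate_had(f, x: int, n: int, k: int) -> int:
        # Majority vote over k random self-corrections of the oracle f.
        votes = 0
        for _ in range(k):
            r = random.randrange(2 ** n)   # uniformly random r_j in {0,1}^n
            votes += f(r ^ x) ^ f(r)       # a_j = f(r_j + x) - f(r_j) over F_2
        return 1 if 2 * votes > k else 0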

  5. Analysis • Analysis: Choose r_j ∈ {0,1}^n randomly a_j ← f(r_j + x) − f(r_j) • If both f(r_j + x) and f(r_j) are correct then a_j = f(r_j + x) − f(r_j) = <s, r_j + x> − <s, r_j> = <s, x> • Using a union bound we get Pr[a_j ≠ <s, x>] ≤ 2(1 − p) = 1/2 − 2ε

  6. Analysis (contd.) • Since we take a majority vote of a_1,…,a_k, and they are independent, a Chernoff bound gives an error probability of at most e^(−Ω(kε^2)) • The probability of getting some bit wrong is Pr[Estimate-Had(e_i) is wrong for some i] ≤ n·e^(−Ω(kε^2)) • Taking k = O(log n / ε^2) gives an O(n log n / ε^2) algorithm with arbitrarily small error • Note that the union bound doubles the corruption probability (from 1 − p to 2(1 − p)), so this analysis breaks down when p < 3/4
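
Putting the pieces together for the unique-decoding regime, a sketch (with a generously chosen constant in k) that recovers s bit by bit using the estimate_had sketch above:

    import math

    def decode_unique(f, n: int, eps: float) -> int:
        # k = O(log n / eps^2) repetitions push the per-bit error below 1/(2n).
        k = max(1, math.ceil(8 * math.log(2 * n) / eps ** 2))
        s = 0
        for i in range(n):
            if estimate_had(f, 1 << i, n, k):  # query at the unit vector e_i
                s |= 1 << i
        return s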

  7. Decoding - The noisy scenario • If m < d/2 then there is a unique codeword within distance m of the received word • If d/2 ≤ m < d there could be multiple such codewords

  8. List Decoding • Fix an (n, k, d) code C over an alphabet Σ, and suppose there is an unknown message x ∈ Σ^k • We are given a vector y ∈ Σ^n which is equal to the codeword C(x) with at most m of the places corrupted • We want to find all possible values x' ∈ Σ^k for the original message, i.e. those with d_H(C(x'), y) ≤ m

  9. List decoding Had • Input: a function f() that agrees with Had(s) on a p fraction of the function inputs: Pr_x[f(x) = <s, x>] = p. Assume calling the function has O(1) cost. • Output: a list of possible messages. • A message s is possible if Had(s) agrees with f on at least a p fraction of the inputs.

  10. General case: p = 1/2 + ε • List decoding • Theorem (Goldreich-Levin): there exists a probabilistic algorithm that solves this problem. Specifically: • Output: a list L of strings such that every possible solution s, i.e. every s with Pr_x[f(x) = <s, x>] ≥ 1/2 + ε, satisfies Pr[s ∈ L] ≥ 1/2 • Run time: poly(n/ε)

  11. Basic probability theory review • Random variables (discrete) • Expected value (μ): E(X) = Σ_x x·p(x) • Variance (σ^2): Var(X) = E[(X − E(X))^2] = E[X^2] − E[X]^2

  12. Binary random variables • Pr(X=1) = p, Pr(X=0) = 1 − p • Often used as indicator variables • E(X) = p • Var(X) = p(1 − p) ≤ 1/4

  13. Majority votes • Consider a probabilistic algorithm that returns a binary value (0 or 1), with probability > 1/2 of returning the correct result • We can amplify the probability of getting the correct answer by calling the algorithm multiple times and deciding by majority vote • For this to work well there must be some independence between the algorithm's results across invocations

  14. Independence • Events A_1,...,A_n are (mutually) independent if for every subset S ⊆ {1,...,n}: Pr[∩_{i∈S} A_i] = Π_{i∈S} Pr[A_i] • Likewise, random variables X_1,...,X_n are independent if for each possible assignment x_1,...,x_n: Pr[X_1=x_1,...,X_n=x_n] = Pr[X_1=x_1]···Pr[X_n=x_n]

  15. Pairwise independence • A set of r.v.'s (or events) is pairwise independent if each pair of the set is independent • Does one type of independence imply the other?
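
Full independence implies pairwise independence, but not conversely. A quick illustration (not from the slides): two uniform bits and their XOR are pairwise independent, yet the third is determined by the first two.

    import random

    def sample() -> tuple[int, int, int]:
        # X, Y uniform and independent; Z = X xor Y.
        x, y = random.randint(0, 1), random.randint(0, 1)
        return x, y, x ^ y

    # Each pair among (X, Y), (X, Z), (Y, Z) is uniform on {0,1}^2,
    # but Pr[X=1, Y=1, Z=1] = 0, not 1/8, so the triple is not independent.

The same XOR construction is what makes the r_J's on slide 21 pairwise independent.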

  16. Chernoff bound • If n independent events each occur with probability p ≥ 1/2 + ε, then the probability that a majority of them occur is at least P ≥ 1 − exp(−2nε^2)

  17. Chebyshev inequality • For any r.v. X with expected value μ and variance σ^2: Pr(|X − μ| ≥ a) ≤ σ^2/a^2 • Can be used to get a lower bound on the probability that a majority of n pairwise independent events, each with p ≥ 1/2 + ε, occur: Pr ≥ 1 − 1/(4nε^2)

  18. No error case • In this case we can recover the i-th bit of the secret string by computing f(e_i), where e_i is the string with 1 at the i-th position and 0 everywhere else, since f(e_i) = <s, e_i> = s_i.

  19. General case: p = 1/2 + ε • List decoding • Theorem (Goldreich-Levin): there exists a probabilistic algorithm that solves this problem. Specifically: • Output: a list L of strings such that every possible solution s, i.e. every s with Pr_x[f(x) = <s, x>] ≥ 1/2 + ε, satisfies Pr[s ∈ L] ≥ 1/2 • Run time: poly(n/ε)

  20. The algorithm (almost) • Suppose that we somehow know the values of Had(s) in m places. Specifically, we are given strings r_1,…,r_m and values b_1,…,b_m where b_j = <s, r_j> • We can then try to compute the value of Had(s) at any x: Estimate-With-Guess(x, r_1,…,r_m, b_1,…,b_m): For each nonempty J ⊆ {1,...,m}: a_J ← f(x + Σ_{j∈J} r_j) − Σ_{j∈J} b_j Return majority of all a_J • Now get the bits of s by calling Estimate-With-Guess with e_i as before
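
A sketch of Estimate-With-Guess with the same integer-bitmask conventions as above; subsets J of {1,...,m} are enumerated as bitmasks:

    def estimate_with_guess(f, x: int, rs: list[int], bs: list[int]) -> int:
        # Majority vote of a_J over all nonempty subsets J of {1,...,m}.
        m = len(rs)
        votes = total = 0
        for mask in range(1, 2 ** m):          # nonempty J, encoded as a bitmask
            r_J = b_J = 0
            for j in range(m):
                if mask >> j & 1:
                    r_J ^= rs[j]               # r_J = sum of r_j over j in J
                    b_J ^= bs[j]               # guess for <s, r_J>
            votes += f(x ^ r_J) ^ b_J          # a_J = f(x + r_J) - <s, r_J> over F_2
            total += 1
        return 1 if 2 * votes > total else 0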

  21. Analysis • The idea is that, by linearity, we can get correct values at more places than we are given • For any J ⊆ {1,...,m} define r_J = Σ_{j∈J} r_j. Then <s, r_J> = <s, Σ_{j∈J} r_j> = Σ_{j∈J} <s, r_j> = Σ_{j∈J} b_j • If the r_j's are uniformly random, so are the r_J's • The probability of getting a_J wrong is therefore the probability of getting f(x + r_J) wrong, which is bounded by 1/2 − ε

  22. But! • The r_J's are not independent, so the Chernoff bound can't be used • However, they are pairwise independent, so we can use Chebyshev • Pr[EWG(x, r_1,…,r_m, b_1,…,b_m) ≠ <s, x>] ≤ 1/(2^m ε^2) when the r_i's are chosen independently and uniformly and for each i, b_i = <s, r_i> • We can recover all bits with an error of at most n/(2^m ε^2). Taking 2^m = O(n/ε^2) gives an O(n^2/ε^2) algorithm with arbitrarily small error

  23. Completing the algorithm • We don't actually have the correct values for the b_i's • But if m is small we can try all 2^m combinations – one of them must be correct! • The final algorithm: 1. Choose r_1,…,r_m randomly 2. For each (b_1,…,b_m) ∈ {0,1}^m: 2.1 For i = 1,…,n: a_i ← EWG(e_i, r_1,…,r_m, b_1,…,b_m) 2.2 Output (a_1,…,a_n) • Complexity: O(n^3/ε^4)
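
A sketch of the complete list decoder, combining the pieces above (again over n-bit integer bitmasks; the names are illustrative, not from the slides):

    import math
    import random

    def goldreich_levin(f, n: int, eps: float) -> list[int]:
        # Choose m so that 2^m = O(n / eps^2); each bit then errs w.p. at most 1/(4n).
        m = max(1, math.ceil(math.log2(4 * n / eps ** 2)))
        rs = [random.randrange(2 ** n) for _ in range(m)]    # r_1,...,r_m
        candidates = []
        for guess in range(2 ** m):                          # all guesses for b_1,...,b_m
            bs = [(guess >> j) & 1 for j in range(m)]
            s = 0
            for i in range(n):
                if estimate_with_guess(f, 1 << i, rs, bs):   # estimate the i-th bit of s
                    s |= 1 << i
            candidates.append(s)
        return candidates

Each of the 2^m guesses triggers n calls to EWG, and each call costs O(2^m) oracle queries, which gives the O(n^3/ε^4) total stated above.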

  24. Back to the Goldreich-Levin theorem • The only thing we assumed about the desired output string s was that Had(s) agrees with f() on at least a 1/2 + ε fraction of inputs. So the algorithm's output list in fact contains, with high probability, every string with that agreement.

  25. Alternative algorithm?
