
Information and interactive computation


Presentation Transcript


  1. Information and interactive computation Mark Braverman Computer Science, Princeton University January 16, 2012

  2. Prelude: one-way communication • Basic goal: send a message from Alice to Bob over a channel. [Figure: Alice → communication channel → Bob]

  3. One-way communication • Encode; • Send; • Decode. [Figure: Alice → communication channel → Bob]

  4. Coding for one-way communication • There are two main problems a good encoding needs to address: • Efficiency: use the least amount of the channel/storage necessary. • Error-correction: recover from (reasonable) errors.

  5. Interactive computation • Today’s theme: extending information and coding theory to interactive computation. I will talk about interactive information theory and Anup Rao will talk about interactive error correction.

  6. Efficient encoding • Can measure the cost of storing a random variable X very precisely. • Entropy: H(X) = ∑Pr[X=x] log(1/Pr[X=x]). • H(X) measures the average amount of information a sample from X reveals. • A uniformly random string of 1,000 bits has 1,000 bits of entropy.
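As a concrete illustration of the formula above, here is a minimal Python sketch (ours, not from the talk) that evaluates H(X) for a distribution given as a dictionary of probabilities; the function name `entropy` is our own.

```python
import math

def entropy(dist):
    """Shannon entropy H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x]), in bits."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A single fair bit has 1 bit of entropy; 1,000 independent fair bits have 1,000 bits.
fair_bit = {0: 0.5, 1: 0.5}
print(entropy(fair_bit))           # 1.0
print(1000 * entropy(fair_bit))    # 1000.0

# A biased bit reveals less information per sample.
biased_bit = {0: 0.9, 1: 0.1}
print(entropy(biased_bit))         # ~0.469
```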

  7. Efficient encoding • H(X) = ∑Pr[X=x] log(1/Pr[X=x]). • The ZIP algorithm works because • H(X=typical 1MB file) < 8Mbits. • P[“Hello, my name is Bob”] >> P[“h)2cjCv9]dsnC1=Ns{da3”]. • For one-way encoding, Shannon’s source coding theorem states that • Communication ≈ Information.

  8. Efficient encoding • The problem of sending many samples of X can be implemented in H(X) communication per sample on average. • The problem of sending a single sample of X can be implemented in < H(X)+1 communication in expectation.
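The slide does not name a coding scheme, but one standard way to achieve expected length below H(X)+1 for a single sample is Huffman coding. The sketch below is our illustration under that assumption; `huffman_lengths` is a helper we made up for it.

```python
import heapq, math

def huffman_lengths(dist):
    """Return {symbol: codeword length} for a Huffman code of the distribution."""
    # Heap entries: (probability, tiebreaker, symbols in this subtree).
    heap = [(p, i, [x]) for i, (x, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    lengths = {x: 0 for x in dist}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, i, s2 = heapq.heappop(heap)
        for x in s1 + s2:               # merging two subtrees adds one bit to each symbol
            lengths[x] += 1
        heapq.heappush(heap, (p1 + p2, i, s1 + s2))
    return lengths

dist = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
lengths = huffman_lengths(dist)
expected_length = sum(dist[x] * lengths[x] for x in dist)
H = sum(p * math.log2(1 / p) for p in dist.values())
print(expected_length, H)   # 1.75 1.75 here; in general H(X) <= expected length < H(X)+1
```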

  9. Communication complexity [Yao] • Focus on the two-party setting: A and B implement a functionality F(X,Y), e.g. F(X,Y) = “X=Y?” [Figure: Alice holds X, Bob holds Y; together they compute F(X,Y).]

  10. Communication complexity • Goal: implement a functionality F(X,Y). • A protocol π(X,Y) computing F(X,Y): Alice (holding X) and Bob (holding Y) alternate messages m1(X), m2(Y,m1), m3(X,m1,m2), …, after which both parties know F(X,Y). • Communication cost = # of bits exchanged.

  11. Distributional communication complexity • The input pair (X,Y) is drawn according to some distribution μ. • Goal: make a mistake on at most an ε fraction of inputs. • The communication cost: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).

  12. Example • μ is a distribution on pairs of files. F is “X=Y?”. • Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X=Y?” (1 bit). • Communication cost = 129 bits. ε ≈ 2^-128.
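A minimal sketch (ours) of this one-round protocol, using Python's hashlib for the 128-bit MD5 fingerprint. The 2^-128 error figure treats MD5 as an ideal hash function, which it is not against adversarially chosen files; the function names are our own.

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    """Alice sends a 128-bit (16-byte) fingerprint of her file."""
    return hashlib.md5(x).digest()

def bob_answer(y: bytes, fingerprint: bytes) -> bool:
    """Bob compares the fingerprint with his own file and replies with 1 bit."""
    return hashlib.md5(y).digest() == fingerprint

x = b"Hello, my name is Bob"
y = b"Hello, my name is Bob"
msg = alice_message(x)        # 128 bits over the channel
print(bob_answer(y, msg))     # True; total communication: 128 + 1 = 129 bits
```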

  13. Randomized communication complexity • Goal: on every input, make a mistake with probability at most ε. • The communication cost: R(F,ε). • Clearly: C(F,μ,ε) ≤ R(F,ε) for all μ. • What about the converse? • A minimax(!) argument [Yao]: R(F,ε) = max_μ C(F,μ,ε).

  14. A note about the model • We assume a shared public source of randomness R, visible to both Alice and Bob.

  15. The communication complexity of EQ(X,Y) • The communication complexity of equality: R(EQ,ε) ≈ log(1/ε). • Apply log(1/ε) shared random hash functions to the inputs and exchange the results; accept if all of them agree. • What if ε=0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
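One standard way to instantiate the log(1/ε) random hash functions is with random parities (inner products over GF(2)) drawn from the public randomness. The sketch below is our illustration, not the talk's construction; the seeded random.Random object stands in for the shared random string.

```python
import math
import random

def equality_protocol(x: int, y: int, n: int, eps: float, seed: int) -> bool:
    """Randomized equality test: compare k = ceil(log2(1/eps)) shared random parities.
    If x == y the parties always agree; if x != y each parity disagrees with
    probability 1/2, so the error probability is at most 2^-k <= eps."""
    k = max(1, math.ceil(math.log2(1 / eps)))
    shared = random.Random(seed)              # stands in for the public randomness
    for _ in range(k):
        r = shared.getrandbits(n)             # random n-bit vector
        parity_x = bin(x & r).count("1") % 2  # <x, r> over GF(2), sent by Alice
        parity_y = bin(y & r).count("1") % 2  # <y, r> over GF(2), sent by Bob
        if parity_x != parity_y:
            return False                      # definitely unequal
    return True                               # equal, except with probability <= eps

print(equality_protocol(0b1011, 0b1011, n=4, eps=0.01, seed=42))  # True
print(equality_protocol(0b1011, 0b1001, n=4, eps=0.01, seed=42))  # False (w.h.p.)
```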

  16. Information in a two-way channel • H(X) is the “inherent information cost” of sending a message distributed according to X over the channel. • What is the two-way analogue of H(X)?

  17. Entropy of interactive computation • The “inherent information cost” of interactive two-party tasks. [Figure: Alice (X) and Bob (Y) with shared randomness R.]

  18. One more definition: Mutual Information • The mutual information of two random variables is the amount of information knowing one reveals about the other: I(A;B) = H(A)+H(B)-H(AB). • If A,B are independent, I(A;B)=0. • I(A;A)=H(A). [Venn diagram: H(A) and H(B) overlapping in I(A;B).]
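A small Python sketch (ours) that evaluates I(A;B) = H(A)+H(B)-H(AB) directly from a joint distribution and checks the two bullet points above.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(A;B) = H(A) + H(B) - H(A,B) for a joint distribution {(a, b): prob}."""
    def H(dist):
        return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return H(pa) + H(pb) - H(joint)

# Independent fair bits: I(A;B) = 0.
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
print(mutual_information(indep))   # 0.0

# A paired with itself: I(A;A) = H(A) = 1 for a fair bit.
equal = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(equal))   # 1.0
```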

  19. Information cost of a protocol • [Chakrabarti-Shi-Wirth-Yao-01, Bar-Yossef-Jayram-Kumar-Sivakumar-04, Barak-B-Chen-Rao-10]. • Caution: different papers use “information cost” to denote different things! • Today, we have a better understanding of the relationship between those different things.

  20. Information cost of a protocol • Prior distribution: (X,Y) ~ μ. Alice and Bob run the protocol π, producing the protocol transcript (also denoted π). • I(π,μ) = I(π;Y|X) + I(π;X|Y) = (what Alice learns about Y) + (what Bob learns about X).
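To make the definition concrete, here is a toy computation (ours, not from the talk): for independent uniform bits X, Y and the trivial protocol whose transcript is just π = X (Alice announces her input), the information cost is I(π;Y|X) + I(π;X|Y) = 0 + 1 = 1 bit.

```python
import math
from collections import defaultdict

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    m = defaultdict(float)
    for key, p in joint.items():
        m[tuple(key[i] for i in idx)] += p
    return m

def cond_mutual_information(joint, a_idx, b_idx, c_idx):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return (H(marginal(joint, a_idx + c_idx)) + H(marginal(joint, b_idx + c_idx))
            - H(marginal(joint, a_idx + b_idx + c_idx)) - H(marginal(joint, c_idx)))

# Joint distribution over (X, Y, transcript) for the toy protocol pi = X,
# with X, Y independent uniform bits.
joint = {(x, y, x): 0.25 for x in (0, 1) for y in (0, 1)}
X, Y, PI = (0,), (1,), (2,)
info_cost = (cond_mutual_information(joint, PI, Y, X)    # what Alice learns about Y: 0
             + cond_mutual_information(joint, PI, X, Y)) # what Bob learns about X: 1
print(info_cost)  # 1.0
```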

  21. External information cost • (X,Y) ~ μ. An external observer Charlie watches the transcript π of the protocol run by Alice and Bob. • Iext(π,μ) = I(π;XY) = what Charlie learns about (X,Y).

  22. Another view on I and Iext • It is always the case that C(π, μ) ≥ Iext(π, μ) ≥ I(π, μ). • Iext measures the ability of Alice and Bob to compute F(X,Y) in an information-theoretically secure way if they are afraid of an eavesdropper. • I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.

  23. Example • F is “X=Y?”. • μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. • Protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?”. • Iext(π,μ) = I(π;XY) = 129 bits = what Charlie learns about (X,Y).

  24. Example • F is “X=Y?”. • μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. • Protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?”. • I(π,μ) = I(π;Y|X) + I(π;X|Y) ≈ 1 + 64.5 = 65.5 bits = (what Alice learns about Y) + (what Bob learns about X).
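A rough accounting for these numbers (a back-of-the-envelope sketch consistent with the slide, not an exact calculation): Alice only sees the 1-bit answer “X=Y?”, which is about 1 bit of information about Y. Bob already knows Y; with probability ½ the inputs are equal, so MD5(X) merely confirms that X=Y (about 1 bit), and with probability ½ X is independent of Y, so the 128-bit fingerprint reveals about 128 bits of X. Averaging gives roughly ½·1 + ½·128 = 64.5 bits for Bob, and about 1 + 64.5 = 65.5 bits in total.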

  25. The (distributional) information cost of a problem F • Recall: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ). • By analogy: I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ); Iext(F,μ,ε) := inf_{π computes F with error ≤ ε} Iext(π, μ).

  26. I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation • Source Coding Theorem: the problem of sending a sample of X can be implemented in expected cost < H(X)+1 communication, roughly the information content of X. • Is the same compression true for interactive protocols? Can F be solved in I(F,μ,ε) communication? Or in Iext(F,μ,ε) communication?

  27. The big question • Can interactive communication be compressed? • Can π be simulated by π’ such that C(π’, μ) ≈ I(π, μ)? Does I(F,μ,ε) ≈ C(F,μ,ε)?

  28. Compression results we know • Let ε, ρ be constants; let π be a protocol that computes F with error ε. • π’s costs: C, Iext, I. • Then π can be simulated using: • (I·C)^½·polylog(C) communication [Barak-B-Chen-Rao’10]; • Iext·polylog(C) communication [Barak-B-Chen-Rao’10]; • 2^O(I) communication [B’11]; while introducing an extra error of ρ.

  29. The amortized cost of interactive computation • Source Coding Theorem: the amortized cost of sending many independent samples of X is H(X) per sample. • What is the amortized cost of computing many independent copies of F(X,Y)?

  30. Information = amortized communication • Theorem [B-Rao’11]: for ε>0, I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • I(F,μ,ε) is the interactive analogue of H(X).

  31. Information = amortized communication • Theorem [B-Rao’11]: for ε>0, I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • I(F,μ,ε) is the interactive analogue of H(X). • Can we get rid of μ? I.e. make I(F,ε) a property of the task F?

  32. Prior-free information cost • Define: I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). • Want a protocol that reveals little information against all priors μ! • Definitions are cheap! • What is the connection between the “syntactic” I(F,ε) and the “meaningful” I(F,μ,ε)? • I(F,μ,ε) ≤ I(F,ε)…

  33. Prior-free information cost • I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). • I(F,μ,ε) ≤ I(F,ε) for all μ. • Recall: R(F,ε) = max_μ C(F,μ,ε). • Theorem [B’11]: I(F,ε) ≤ 2·max_μ I(F,μ,ε/2); I(F,0) = max_μ I(F,μ,0).

  34. Prior-free information cost • Recall: I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • Theorem: for ε>0, I(F,ε) = lim_{n→∞} R(F^n,ε)/n.

  35. Example • R(EQ,0) ≈ n. • What is I(EQ,0)?

  36. The information cost of Equality • What is I(EQ,0)? • Consider the following protocol for X, Y ∈ {0,1}^n: using public randomness, pick a random non-singular matrix A with rows A1, A2, …, An. In round i, Alice sends Ai·X and Bob sends Ai·Y. • Continue for n steps, or until a disagreement is discovered.

  37. Analysis (sketch) • If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information. • If X=Y… the players only learn the fact that X=Y (≤1 bit of information). • Thus the protocol has O(1) information complexity.
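A Python sketch (ours) of the round-by-round protocol from the last two slides. For brevity it draws independent random parities instead of the rows of a random non-singular matrix; this already exhibits the O(1) expected number of rounds when X ≠ Y, but the zero-error guarantee on the slide needs the non-singular matrix.

```python
import random

def equality_rounds(x: int, y: int, n: int, seed: int):
    """Run up to n rounds; in round i both parties announce a shared random parity
    of their inputs and stop at the first disagreement. If x != y, each round
    detects the difference with probability 1/2, so ~2 rounds suffice on average.
    (With independent random rows, agreeing for all n rounds leaves a tiny error
    probability; a random non-singular matrix removes it.)"""
    shared = random.Random(seed)                # stands in for the public random string
    for i in range(n):
        a = shared.getrandbits(n)               # round-i row
        bit_alice = bin(a & x).count("1") % 2   # Alice announces <A_i, X>
        bit_bob = bin(a & y).count("1") % 2     # Bob announces <A_i, Y>
        if bit_alice != bit_bob:
            return False, i + 1                 # unequal, found after i+1 rounds
    return True, n                              # no disagreement in n rounds

print(equality_rounds(0b100110, 0b100111, n=6, seed=7))  # unequal: stops early
print(equality_rounds(0b100110, 0b100110, n=6, seed=7))  # equal: runs all n rounds
```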

  38. Direct sum theorems • I(F,ε) = lim_{n→∞} R(F^n,ε)/n. • Questions: • Does R(F^n,ε) = Ω(n·R(F,ε))? • Does R(F^n,ε) = ω(R(F,ε))?

  39. Direct sum strategy • The strategy for proving direct sum results: take a protocol for F^n that costs Cn = R(F^n,ε), and make a protocol for F that costs ≈ Cn/n. • This would mean (up to lower-order factors) that C < Cn/n, i.e. Cn > n·C, where C is the cost of a single copy. [Figure: a protocol for n copies of F with cost Cn yielding a protocol for 1 copy of F with cost ≈ Cn/n.]

  40. Direct sum strategy • If life were so simple… [Figure: if the protocol for n copies (cost Cn) handled copies 1, 2, …, n separately, extracting a protocol for 1 copy of F with cost Cn/n would be easy.]

  41. Direct sum strategy • Theorem: I(F,ε) = I(F^n,ε)/n ≤ Cn/n, where Cn = R(F^n,ε). • Compression → direct sum!

  42. The information cost angle • There is a protocol of communication cost Cn, but information cost ≤ Cn/n. [Figure: restricting the protocol for n copies to a single copy gives a protocol for 1 copy of F with communication cost Cn but only ≈ Cn/n bits of information; compressing it would give ≈ Cn/n communication.]

  43. Direct sum theorems • Best known general simulation [BBCR’10]: a protocol with C communication and I information cost can be simulated using (I·C)^½·polylog(C) communication. • Implies: R(F^n,ε) = Ω(n^½·R(F,ε)), up to polylogarithmic factors.

  44. Compression vs. direct sum • We saw that compression → direct sum. • A form of the converse is also true. • Recall: I(F,ε) = lim_{n→∞} R(F^n,ε)/n. • If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n,ε) = o(n·R(F,ε)).

  45. A complete problem • Can define a problem called Correlated Pointer Jumping – CPJ(C,I). • The problem has communication cost C and information cost I. • CPJ(C,I) is the “least compressible problem”. • If R(CPJ(C,I),1/3)=O(I), then R(F,1/3)=O(I(F,1/3)) for all F.

  46. The big picture • [Diagram: I(F^n,ε)/n ↔ I(F,ε) (direct sum for information); I(F,ε) ↔ R(F^n,ε)/n (information = amortized communication); I(F,ε) ↔ R(F,ε) (interactive compression?); R(F^n,ε)/n ↔ R(F,ε) (direct sum for communication?).]

  47. Partial progress • Can compress bounded-round interactive protocols. • The main primitive is a one-shot version of the Slepian-Wolf theorem. • Alice gets a distribution PX. • Bob gets a prior distribution PY. • Goal: both parties must end up with the same sample from PX.

  48. Correlated sampling • Alice has PX, Bob has PY; at the end both parties hold the same sample M ~ PX. • The best we can hope for is D(PX||PY) communication.

  49. Proof idea • Sample using D(PX||PY) + O(log(1/ε) + D(PX||PY)^½) communication with statistical error ε. • Public randomness: ~|U| samples (u1,q1), (u2,q2), (u3,q3), … with each ui ∈ U and qi ∈ [0,1]. [Figure: the pairs (ui,qi) plotted against the graphs of PX and PY on [0,1]; Alice picks the first ui that falls under PX, here u4.]

  50. Proof idea (continued) • Sample using D(PX||PY) + O(log(1/ε) + D(PX||PY)^½) communication with statistical error ε. [Figure: Alice sends hash values h1(u4), h2(u4), … of her sample u4 so that Bob can identify it among his own candidates (here u2 and u4).]
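A Python sketch (ours) of the shared-randomness sampling step from these two slides: both parties see the same pairs (ui, qi); Alice's sample is the first ui with qi < PX(ui), and Bob keeps every ui with qi < PY(ui) as a candidate. The hashing step (h1(u4), h2(u4), …) that lets Bob pinpoint Alice's index using roughly D(PX||PY) bits of communication is not implemented here.

```python
import random

def correlated_sampling_step(p_x, p_y, seed, max_tries=100000):
    """Shared randomness: pairs (u_i, q_i) with u_i a uniform symbol and q_i uniform
    in [0, 1]. Alice accepts the first u_i that falls under the graph of PX;
    conditioned on acceptance, that u_i is distributed according to PX.
    Bob's candidate list holds every u_i seen so far that falls under PY."""
    shared = random.Random(seed)                 # same random tape on both sides
    symbols = sorted(set(p_x) | set(p_y))
    bob_candidates = []
    for i in range(max_tries):
        u = shared.choice(symbols)
        q = shared.random()
        if q < p_y.get(u, 0.0):
            bob_candidates.append((i, u))        # Bob's view of plausible indices
        if q < p_x.get(u, 0.0):
            return u, i, bob_candidates          # Alice's sample and its index
    raise RuntimeError("no sample accepted")

p_x = {"a": 0.7, "b": 0.2, "c": 0.1}
p_y = {"a": 0.4, "b": 0.4, "c": 0.2}
sample, index, candidates = correlated_sampling_step(p_x, p_y, seed=1)
print(sample, index, len(candidates))
```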
