
Tight Bounds for Distributed Functional Monitoring


Presentation Transcript


  1. Tight Bounds for Distributed Functional Monitoring David Woodruff IBM Almaden Qin Zhang Aarhus University MADALGO

  2. Distributed Functional Monitoring • A coordinator C communicates with k sites P1, P2, P3, …, Pk • Site Pi holds an input vector xi • Updates: xi ← xi + ej arrive over time • Static case vs. dynamic case • Problems on x1 + x2 + … + xk: sampling, p-norms, heavy hitters, compressed sensing, quantiles, entropy • Authors: Can, Cormode, Huang, Muthukrishnan, Patt-Shamir, Shafrir, Tirthapura, Wang, Yi, Zhao, many others
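
A minimal runnable sketch of this model (the Site class and all names below are our own, purely illustrative): k sites apply coordinate-wise updates to local vectors, and the coordinator's target is a function of the aggregate vector that no single party holds.

```python
import random

n, k = 8, 3  # dimension n, number of sites k

class Site:
    def __init__(self, n):
        self.x = [0] * n   # the site's local non-negative vector xi

    def update(self, j):
        self.x[j] += 1     # the update xi <- xi + ej

sites = [Site(n) for _ in range(k)]
for _ in range(20):        # a stream of updates, each arriving at one site
    random.choice(sites).update(random.randrange(n))

# aggregate x = x1 + x2 + ... + xk, which no single party sees directly
x = [sum(s.x[j] for s in sites) for j in range(n)]
print(x)
```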

  3. Motivation • Data distributed and stored in the cloud • Impractical to put data on a single device • Sensor networks • Communication very power-intensive • Network routers • Bandwidth limitations

  4. Problems What is the randomized communication cost of these problems? I.e., the minimal cost of a protocol which, for every input, fails with probability < 1/3. Static case, dynamic case • Which functions f(x1, …, xk) do we care about? • x1, …, xk are non-negative length-n vectors • x = Σi=1..k xi • f(x1, …, xk) = |x|p = (Σj=1..n xj^p)^(1/p) • |x|0 is the number of non-zero coordinates
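
A small sketch of the quantities just defined, computing |x|p on the aggregate vector and |x|0 as the number of non-zero coordinates (toy vectors, illustrative only):

```python
def p_norm(x, p):
    if p == 0:
        return sum(1 for v in x if v != 0)   # |x|0: distinct elements
    return sum(v ** p for v in x) ** (1.0 / p)

x1, x2 = [1, 0, 2, 0], [0, 0, 3, 1]          # two sites' vectors
x = [a + b for a, b in zip(x1, x2)]          # x = x1 + x2
print(p_norm(x, 2), p_norm(x, 0))            # Euclidean norm, distinct count
```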

  5. Exact Answers • An Ω(n) communication bound for computing |x|p, p ≠ 1 • Reduction from 2-Player Set-Disjointness (DISJ) • Alice has a set S ⊆ [n] of size n/4 • Bob has a set T ⊆ [n] of size n/4, with either |S ∩ T| = 0 or |S ∩ T| = 1 • Is S ∩ T = ∅? • |S ∩ T| = 1 → DISJ(S, T) = 1, |S ∩ T| = 0 → DISJ(S, T) = 0 • [KS, R]: Ω(n) communication • Prohibitive for applications
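
A hedged sketch of why the reduction works: placing the 0/1 indicator vectors of S and T at two sites, the exact value of |x|p^p equals |S| + |T| when the sets are disjoint and |S| + |T| − 2 + 2^p when they share one element, so an exact protocol for |x|p with p ≠ 1 decides DISJ.

```python
n, p = 8, 2
S, T = {0, 2}, {2, 5}                        # here |S ∩ T| = 1
xA = [1 if j in S else 0 for j in range(n)]  # Alice's site vector
xB = [1 if j in T else 0 for j in range(n)]  # Bob's site vector
x = [a + b for a, b in zip(xA, xB)]
print(sum(v ** p for v in x))                # 6 = |S| + |T| - 2 + 2^p, not 4
```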

  6. Approximate Answers f(x1, …, xk) = (1 ± ε) |x|p What is the randomized communication cost as a function of k, ε, and n? Ignore log(nk/ε) factors

  7. Previous Results Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative) • |x|0: Ω(k + 1/ε²) and O(k/ε²) • |x|p: Ω(k + 1/ε²) • |x|2: O(k²/ε + k^1.5/ε³) • |x|p, p > 2: O(k^(2p+1) · n^(1-2/p) · poly(1/ε))

  8. Our Results • First lower bounds to depend on the product of k and 1/ε² • Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative) • |x|0: Ω(k + 1/ε²) improved to Ω(k/ε²), matching the O(k/ε²) upper bound • |x|p: Ω(k + 1/ε²) improved to Ω(k^(p-1)/ε²). Talk will focus on p = 2 • |x|2: O(k²/ε + k^1.5/ε³) improved to O(k · poly(1/ε)) • |x|p, p > 2: O(k^(2p+1) · n^(1-2/p) · poly(1/ε)) improved to O(k^(p-1) · poly(1/ε)) • Upper bound doesn't depend polynomially on n

  9. Talk Outline • Lower Bounds • Non-zero elements • Euclidean norm • Upper Bounds • p-norm

  10. Previous Lower Bounds • Lower bounds for any p-norm, p ≠ 1 • [CMY]: Ω(k) • [ABC]: Ω(1/ε²) • Reduction from Gap-Orthogonality (GAP-ORT) • Alice, Bob have u, v ∈ {0,1}^(1/ε²), respectively • Promise: |Δ(u, v) − 1/(2ε²)| < 1/ε or |Δ(u, v) − 1/(2ε²)| > 2/ε, where Δ denotes Hamming distance • [CR, S]: Ω(1/ε²) communication

  11. Talk Outline • Lower Bounds • Non-zero elements • Euclidean norm • Upper Bounds • p-norm

  12. Lower Bound for Distinct Elements • Improve bound to optimal Ω(k/ε²) • Simpler problem: k-GAP-THRESH • Each site Pi holds a bit Zi • The Zi are i.i.d. Bernoulli(β) • Decide if Σi=1..k Zi > βk + √(βk) or Σi=1..k Zi < βk − √(βk); otherwise don't care • Rectangle property: for any correct protocol with transcript τ, the bits Z1, Z2, …, Zk are independent conditioned on τ
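
A toy instance of k-GAP-THRESH (the parameters are illustrative, not from the talk): each site draws Zi ~ Bernoulli(β), and the promise is that the sum lands outside the window βk ± √(βk); inside it, any answer is acceptable.

```python
import random, math

k, beta = 1000, 0.1
Z = [int(random.random() < beta) for _ in range(k)]
s, mean, dev = sum(Z), beta * k, math.sqrt(beta * k)
if s > mean + dev:
    print("sum is high")
elif s < mean - dev:
    print("sum is low")
else:
    print("inside the gap: don't care")
```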

  13. A Key Lemma • Lemma: For any protocol Π which succeeds with probability > .9999, the transcript τ is such that with probability > 1/2, for at least k/2 different i, H(Zi | τ) < H(0.01β) • Proof: Suppose τ does not satisfy this • With large probability, βk − O(√(βk)) < E[Σi=1..k Zi | τ] < βk + O(√(βk)) • Since the Zi are independent given τ, Σi=1..k Zi | τ is a sum of independent Bernoullis • Since most H(Zi | τ) are large, by anti-concentration both events occur with constant probability: Σi=1..k Zi | τ > βk + √(βk) and Σi=1..k Zi | τ < βk − √(βk) • So Π can't succeed with large probability
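
The anti-concentration step can be checked empirically; in this small Monte Carlo sketch (parameters our own) a sum of k independent Bernoulli(β) bits exceeds βk + √(βk), and falls below βk − √(βk), each with constant probability.

```python
import random, math

k, beta, trials = 400, 0.1, 20000
mean, dev = beta * k, math.sqrt(beta * k)
above = below = 0
for _ in range(trials):
    s = sum(random.random() < beta for _ in range(k))  # sum of Bernoullis
    above += s > mean + dev
    below += s < mean - dev
print(above / trials, below / trials)  # both are constants, not o(1)
```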

  14. Composition Idea • Can think of C as a player running a 2-party DISJ instance with each of the sites P1, P2, …, Pk • The input to Pi in k-GAP-THRESH, denoted Zi, is the output of a 2-party Disjointness (DISJ) instance between C and Pi • Let X be a random set of size 1/(4ε²) from {1, 2, …, 1/ε²} • For each i, if Zi = 1, then choose Yi so that DISJ(X, Yi) = 1, else choose Yi so that DISJ(X, Yi) = 0 • Distributional complexity Ω(1/ε²) [Razborov]
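
A sketch of the input construction (the way Yi is padded outside of X is our guess at one valid choice, not a verbatim construction from the paper): the coordinator fixes X, and each site's set Yi is drawn to intersect X exactly when Zi = 1.

```python
import random

eps = 0.25
m = round(eps ** -2)                       # universe {0, ..., m-1}, size 1/eps^2
X = set(random.sample(range(m), m // 4))   # |X| = 1/(4 eps^2)

def draw_Y(Z_i):
    outside = [j for j in range(m) if j not in X]
    Y = set(random.sample(outside, m // 4))  # starts disjoint from X
    if Z_i == 1:                             # plant exactly one common element
        Y.remove(next(iter(Y)))
        Y.add(random.choice(sorted(X)))
    return Y

Y1 = draw_Y(1)
print(len(X & Y1))                         # 1, so DISJ(X, Y1) = 1
```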

  15. Putting it All Together • Key Lemma → for most i, H(Zi | τ) < H(0.01β) • Since H(Zi) = H(β) for all i, for most i the protocol Π solves DISJ(X, Yi) with constant probability • Since the Zi | τ are independent, solving DISJ requires communication Ω(1/ε²) on each of k/2 copies • Total communication is Ω(k/ε²) • Can show a reduction: • |x|0 > 1/(2ε²) + 1/ε if Σi=1..k Zi > βk + √(βk) • |x|0 < 1/(2ε²) − 1/ε if Σi=1..k Zi < βk − √(βk)

  16. Talk Outline • Lower Bounds • Non-zero elements • Euclidean norm • Upper Bounds • p-norm

  17. Lower Bound for Euclidean Norm • Improve the Ω(k + 1/ε²) bound to the optimal Ω(k/ε²) • Base problem: Gap-Orthogonality (GAP-ORT(X, Y)) • Consider the uniform distribution on (X, Y) • We observe an information lower bound for GAP-ORT • Sherstov's lower bound for GAP-ORT holds for the uniform distribution on (X, Y) • [BBCR] + [Sherstov] → for any protocol Π and t > 0, either I(X, Y; Π) = Ω(1/(ε² log t)) or Π uses t^Ω(1) communication

  18. Information Implications • By the chain rule, I(X, Y; Π) = Σi=1..1/ε² I(Xi, Yi; Π | X<i, Y<i) = Ω(1/ε²) • For most i, I(Xi, Yi; Π | X<i, Y<i) = Ω(1) • Maximum Likelihood Principle: non-trivial advantage in guessing (Xi, Yi)

  19. 2-BIT k-Party DISJ We compose GAP-ORT with a variant of k-party DISJ: sites P1, P2, P3, …, Pk hold sets T1, T2, T3, …, Tk ⊆ [k²] • Choose a random j ∈ [k²]; one of four cases holds: • j doesn't occur in any Ti • j occurs only in T1, …, Tk/2 • j occurs only in Tk/2+1, …, Tk • j occurs in T1, …, Tk • All j' ≠ j occur in at most one set Ti (assume k ≥ 4) • We show Ω(k) information cost
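
A hypothetical sampler for this distribution; the rule for the non-special elements (each lands in at most one uniformly random set, with probability 1/2) is our reading of the slide, not necessarily the paper's exact distribution.

```python
import random

k = 8
j = random.randrange(k * k)                  # the special element of [k^2]
owners = random.choice([
    [],                                      # j occurs in no Ti
    list(range(k // 2)),                     # j only in T1, ..., T(k/2)
    list(range(k // 2, k)),                  # j only in T(k/2+1), ..., Tk
    list(range(k)),                          # j in every Ti
])
T = [set() for _ in range(k)]
for i in owners:
    T[i].add(j)
for jp in range(k * k):                      # every j' != j occurs in at
    if jp != j and random.random() < 0.5:    # most one set Ti
        T[random.randrange(k)].add(jp)
```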

  20. Rough Composition Idea • Bits Xi and Yi in GAP-ORT determine the output of the i-th 2-BIT k-party DISJ instance; GAP-ORT is composed of 1/ε² such instances • Information adds (if we condition on enough "helper" variables) • Pi participates in all instances • An algorithm for approximating the Euclidean norm solves GAP-ORT, and therefore solves most 2-BIT k-party DISJ instances • Show Ω(k/ε²) overall information is revealed

  21. Talk Outline • Lower Bounds • Non-zero elements • Euclidean norm • Upper Bounds • p-norm

  22. Algorithm for p-norm • We get k^(p-1) · poly(1/ε), improving k^(2p+1) · n^(1-2/p) · poly(1/ε) for general p and O(k²/ε + k^1.5/ε³) for p = 2 • Our protocol is the first 1-way protocol, that is, all communication is from the sites to the coordinator • Focus on the Euclidean norm (p = 2) in the talk • Non-negative vectors • Just determine if the Euclidean norm exceeds a threshold θ

  23. The Most Naïve Thing to Do • xi is Site i's current vector • x = Σi=1..k xi • Suppose Site i sees an update xi ← xi + ej • Send j to the Coordinator with a certain probability that depends only on k and θ?

  24. Sample and Send • Send each update with probability at least 1/k: communication = O(k), so okay • Hard instance: x has k⁴ coordinates that are 1, split into disjoint blocks across the k sites, and may also have a unique heavy coordinate equal to k², occurring k times on each site • Without the heavy coordinate, |x|2 = k²; with it, |x|2 = √2 · k² • Send each update with probability 1/k²: will find the large coordinate, but communication is Ω(k²)
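
The hard instance can be rebuilt at toy scale to verify the norm claim; this sketch (our own construction following the slide's description) shows |x|2² doubling once the heavy coordinate is present.

```python
k = 4
n = k ** 4 + 1                       # k^4 light coordinates plus one heavy one
sites = []
for i in range(k):
    xi = [0] * n
    for j in range(i * k ** 3, (i + 1) * k ** 3):
        xi[j] = 1                    # site i's disjoint block of k^3 ones
    xi[-1] = k                       # the heavy coordinate, k times per site
    sites.append(xi)
x = [sum(col) for col in zip(*sites)]
print(sum(v * v for v in x), 2 * k ** 4)   # |x|_2^2 = 2k^4, vs. k^4 without
```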

  25. What Is Happening? • Sampling with probability ≈ 1/k² is good for getting a few samples from the heavy item • But all the light coordinates are in the way, making the communication Ω(k²) • Suppose we put a barrier of k, that is, sample with probability ≈ 1/k² but only send an item if it has occurred at least k times on a site • Now communication is O(1) and the heavy coordinate is found • But light coordinates also contribute to the overall |x|2 value

  26. Algorithm for Euclidean Norm • Sample at different scales with different barriers • Use the public coin to create O(log n) groups T1, …, Tlog n of the n input coordinates • Tz contains n/2^z random coordinates • Suppose Site i sees the update xi ← xi + ej • For each Tz containing j: if xij > √(θ/2^z)/k, then with probability √(2^z/θ) · poly(ε^-1 log n) send (j, z) to the coordinator • Expected communication Õ(k) • If a group of coordinates contributes to |x|2, there is a z for which a few coordinates in the group are sampled multiple times
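
A simplified sketch of one site's sampling rule; the acceptance probability below drops the poly(1/ε, log n) factor and the coordinator's estimation step is omitted, so this is schematic rather than the actual protocol.

```python
import random, math

n, k, theta = 1024, 10, 10_000.0
rng = random.Random(1234)    # shared seed stands in for the public coin, so
levels = int(math.log2(n))   # the coordinator can rebuild the same groups
groups = [{j for j in range(n) if rng.random() < 2.0 ** -z}
          for z in range(1, levels + 1)]     # Tz holds ~ n / 2^z coordinates

def on_update(xi, j, send):
    """Handle the update xi <- xi + ej at one site."""
    xi[j] += 1
    for z, Tz in enumerate(groups, start=1):
        if j in Tz and xi[j] > math.sqrt(theta / 2 ** z) / k:  # the barrier
            if random.random() < min(1.0, math.sqrt(2 ** z / theta)):
                send((j, z))     # report (coordinate, scale) to coordinator

xi, msgs = [0] * n, []
for _ in range(5000):
    on_update(xi, random.randrange(n), msgs.append)
print(len(msgs), "messages sent")
```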

  27. Conclusions • Improved communication lower and upper bounds for estimating |x|p • Implies tight lower bounds for estimating entropy, heavy hitters, quantiles • Implications for the data stream model: • First lower bound for |x|0 without Gap-Hamming • Useful information cost lower bound for Gap-Hamming, or else the protocol has very large communication • Improve the Ω(n^(1-2/p)/ε^(2/p)) bound for estimating |x|p in a stream to Ω(n^(1-2/p)/ε^(4/p))
