
Capacity of Agreement with Finite Link Capacity



  1. Capacity of Agreement with Finite Link Capacity Guanfeng Liang @ Infocom 2011 Electrical and Computer Engineering University of Illinois at Urbana-Champaign Joint work with Prof. Nitin Vaidya

  2. Motivation

  3. Motivation • Distributed systems are emerging • Cloud computing (e.g. Windows Azure), distributed file systems, data centers, multiplayer online games • Large number of distributed components • Distributed components need to be coordinated

  4. Motivation • Distributed primitives • Clock synchronization • Mutual exclusion • Agreement • etc. • Large body of literature in Distributed Algorithms

  5. Motivation A networking guy asks: “How would constraints of the network affect the performance of these primitives?” An algorithms guy replies: “……” Network-aware distributed algorithm design

  6. Byzantine agreement in p2p networks

  7. Byzantine Agreement (BA): Broadcast • A sender wants to send a message to n-1 receivers • Fault-free receivers must agree • If the sender is fault-free, they agree on its message • Any ≤ f nodes may fail

  8. Why agreement? • Distributed systems are failure-prone • Non-malicious: crashed nodes, buggy code • Malicious: an attacker tries to crack the system • To make a system robust against faults, it is important to maintain consistent state

  9. Impact of the Network • How does capacity (rate region) of the network affect agreement performance? • How to quantify the impact?

  10. Rate Region • Defines the way “links” may share the channel • The interference links pose to each other determines whether a set of transmissions can succeed together

  11. “Ethernet” Rate Region • Sender S and receivers 1, 2 share one channel • Rate S→1 + Rate S→2 ≤ C

  12. Point-to-Point Network Rate Region • Rate i→j ≤ Capacity i→j • Each directed link is independent of the other links
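The two rate regions above can be captured by simple feasibility checks. This is an illustrative sketch; the function names and numeric rates are my own, not from the talk:

```python
# Feasibility checks for the two rate regions (illustrative helper names).
# Rates and capacities are in bits per unit time.

def ethernet_feasible(rates, C):
    """Shared channel: the sum of all link rates must fit within C."""
    return sum(rates.values()) <= C

def p2p_feasible(rates, capacity):
    """Point-to-point: each directed link is constrained independently."""
    return all(r <= capacity[link] for link, r in rates.items())

# Example: sender S pushing to receivers 1 and 2 (made-up numbers).
rates = {("S", 1): 0.6, ("S", 2): 0.5}
print(ethernet_feasible(rates, C=1.0))                       # False: 0.6 + 0.5 > 1
print(p2p_feasible(rates, {("S", 1): 1.0, ("S", 2): 1.0}))   # True
```

The contrast is the point of the two slides: on a shared channel the links constrain each other, while in a point-to-point network each link can be loaded to its own capacity.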

  13. Capacity of Agreement • b(t) = # bits agreed in [0,t] • Throughput = lim b(t)/t • Capacity of agreement: supremum of achievable throughput for a given rate region

  14. Upper Bound of Capacity in P2P Networks • NC1: C ≤ min-cut(S, X | f receivers removed)

  15. Upper Bound of Capacity in P2P Networks • NC2: C ≤ In(X | f nodes removed)

  16. Upper Bound of Capacity in P2P Networks • NC1: C ≤ min-cut(S, X | f receivers removed) • NC2: C ≤ In(X | f nodes removed) • Example: 4-node network with one link of capacity ε, upper bound = 1 + ε
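For a concrete feel for the NC1 bound, here is a brute-force sketch that evaluates it on a hypothetical 4-node network: every directed link has capacity 1 except S→3, which has capacity ε. The exact topology in the slide's figure is not recoverable from the transcript, so this network is my assumption; the bound it produces, 1 + ε, matches the slide.

```python
from itertools import combinations

def min_cut(capacity, nodes, s, t):
    """Brute-force min cut from s to t: try every vertex bipartition."""
    best = float("inf")
    others = [n for n in nodes if n not in (s, t)]
    for r in range(len(others) + 1):
        for side in combinations(others, r):
            s_side = {s, *side}
            cut = sum(c for (u, v), c in capacity.items()
                      if u in s_side and v not in s_side)
            best = min(best, cut)
    return best

def nc1_bound(capacity, s, receivers, f):
    """NC1: C <= min-cut(S, X) after removing any f receivers."""
    best = float("inf")
    for removed in combinations(receivers, f):
        kept = [x for x in receivers if x not in removed]
        nodes = [s] + kept
        cap = {(u, v): c for (u, v), c in capacity.items()
               if u in nodes and v in nodes}
        for x in kept:
            best = min(best, min_cut(cap, nodes, s, x))
    return best

# Assumed network: complete digraph on {S, 1, 2, 3}, all capacities 1,
# except the S -> 3 link which has capacity eps.
eps = 0.5
cap = {(u, v): 1.0 for u in ["S", 1, 2, 3] for v in ["S", 1, 2, 3] if u != v}
cap[("S", 3)] = eps
print(nc1_bound(cap, "S", [1, 2, 3], f=1))  # 1 + eps = 1.5
```

Brute force is fine here only because the network is tiny; for larger graphs a max-flow algorithm would replace `min_cut`.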

  17. Classic Solution for Broadcast • Fault-free source S sends its value v to peers 1, 2, 3 • Peer 2 is faulty

  18. Classic Solution for Broadcast • Each fault-free peer relays the v it received to the other two peers

  19. Classic Solution for Broadcast • Faulty peer 2 relays an arbitrary value (?) instead

  20. Classic Solution for Broadcast • After the exchange, every peer has heard from S and from both other peers

  21. Classic Solution for Broadcast • Each fault-free peer holds the vector [v, v, ?]

  22. Classic Solution for Broadcast • Majority vote results in the correct result at each good receiver

  23. Classic Solution for Broadcast • Faulty source S sends different values v, w, x to peers 1, 2, 3

  24. Classic Solution for Broadcast • The fault-free peers relay the values they received

  25. Classic Solution for Broadcast • Each peer relays its received value to the other two

  26. Classic Solution for Broadcast • Every peer holds the same vector [v, w, x]

  27. Classic Solution for Broadcast • The vote result is identical at the good receivers, so they still agree
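The two executions above (faulty peer, then faulty source) can be reproduced with a toy majority-vote routine. This is an illustrative sketch of the classic 4-node, f = 1 scheme; all names are my own:

```python
from collections import Counter

def majority(votes):
    counts = Counter(votes)
    # Deterministic tie-break (smallest value wins) so that when the faulty
    # source splits the vote, all good nodes still decide identically.
    return max(sorted(counts), key=lambda v: counts[v])

def classic_broadcast(source_sends, relays):
    """source_sends[j]: value S claims to peer j.
    relays[i][j]: value peer i forwards to peer j (faulty peers may lie)."""
    decided = {}
    for j in source_sends:
        votes = [source_sends[j]] + [relays[i][j] for i in source_sends if i != j]
        decided[j] = majority(votes)
    return decided

# Case 1: fault-free source, faulty peer 2 relays garbage.
d1 = classic_broadcast(
    {1: "v", 2: "v", 3: "v"},
    {1: {2: "v", 3: "v"}, 2: {1: "?", 3: "?"}, 3: {1: "v", 2: "v"}})
print(d1[1], d1[3])  # v v  (good receivers recover the sender's value)

# Case 2: faulty source sends three different values; peers relay honestly.
d2 = classic_broadcast(
    {1: "v", 2: "w", 3: "x"},
    {1: {2: "v", 3: "v"}, 2: {1: "w", 3: "w"}, 3: {1: "x", 2: "x"}})
print(len(set(d2.values())) == 1)  # True (all receivers decide alike)
```

Note that every value crosses every link, which is exactly the throughput problem the next slide points out.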

  28. Classic Solution in P2P Networks • The whole message is sent on every link, so throughput ≤ slowest link • With an ε-capacity link: throughput ≤ ε, but the upper bound is 1 + ε

  29. Improving Broadcast Throughput • Observation: classic solution is in fact an “error correction code” • “Error detection codes” are more efficient

  30. Error Detection Code • Two-bit value (a, b): S sends a to peer 1, b to peer 2, and a+b to peer 3

  31. Error Detection Code • Peers exchange their symbols; each collects [a, b, a+b]

  32. Error Detection Code • Parity check passes at all nodes  Agree on (a, b)

  33. Error Detection Code • Parity check fails at a node if peer 1 misbehaves (relays ? instead of a)

  34. Error Detection Code • Check fails at a good node if S sends a bad codeword (a, b, z) • Detection alone is not what we want

  35. Modification • Agree on small pieces of data in each “round” • If X misbehaves with Y in a given round, avoid using the XY link in the next round (for the next piece of data) • Repeat

  36. Algorithm Structure • Fast round (as in the example)

  37. Algorithm Structure • Fast round (as in the example): S sends (a, b, a+b); peers exchange symbols and parity-check

  38. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure

  39. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure.

  40. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure • After a small number of expensive rounds, failures are completely identified

  41. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure • Only fast rounds from here on, since after a small number of expensive rounds the failures are completely identified

  42. Algorithm “Analysis” • Many fast rounds • Few expensive rounds • When averaged over time, the cost of the expensive rounds is negligible • Average usage of link capacity depends only on the fast round, which is very efficient • Achieves capacity for 4-node networks and for symmetric networks
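The amortization argument is just an average over rounds: since the number of expensive rounds is bounded, their cost washes out as the number of fast rounds grows. A back-of-the-envelope sketch with made-up numbers (not from the talk):

```python
# Amortized throughput: fast rounds carry the data; the bounded number of
# expensive rounds adds time but (approximately) no new agreed bits.

def avg_throughput(fast_rounds, fast_bits, fast_time, exp_rounds, exp_time):
    total_bits = fast_rounds * fast_bits
    total_time = fast_rounds * fast_time + exp_rounds * exp_time
    return total_bits / total_time

# Illustrative numbers: 2 bits per fast round of unit duration, and 5
# expensive rounds each 100x as long. As fast_rounds grows, throughput
# approaches the fast-round rate of 2 bits per unit time.
print(avg_throughput(10**6, 2, 1.0, 5, 100.0))
```

With a million fast rounds the result is already within 0.1% of the fast-round rate, which is the sense in which "the cost of the expensive rounds is negligible."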

  43. Open problems

  44. Open Problems • Capacity of agreement for general rate regions

  45. Open Problems • Capacity of agreement for general rate regions • Even the multicast problem with Byzantine nodes is unsolved (for multicast, the source is fault-free)

  46. Rich Problem Space • Wireless channel allows overhearing • Transmit to peer 2 at high rate, or at low rate? The low rate also allows reception at peer 1

  47. Rich Problem Space • Similar questions are relevant for any multi-party computation • Distributed computation + communication: multi-party computing under communication constraints

  48. Mind teaser

  49. How many bits needed? • N nodes, each with a k-bit input • Check if all inputs are identical • At least 1 node must “detect” if they are not identical • Intuitive guess: (N-1)k bits • Is it the best we can do?
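The intuitive (N-1)k-bit protocol behind the guess: the other N-1 nodes each send their full k-bit input to node 1, which compares. A minimal sketch (names are illustrative; whether fewer bits suffice is the teaser):

```python
# Naive equality check: node 1 collects everyone's k-bit input and compares.
# Communication cost: each of the other N-1 nodes sends k bits.

def check_identical(inputs, k):
    """inputs: one k-bit value per node, node 1's first.
    Returns (all_identical, bits_sent)."""
    first = inputs[0]
    mismatch = any(x != first for x in inputs[1:])
    bits_sent = (len(inputs) - 1) * k
    return (not mismatch), bits_sent

print(check_identical([0b1011, 0b1011, 0b1011], k=4))  # (True, 8)
print(check_identical([0b1011, 0b1011, 0b1010], k=4))  # (False, 8)
```

Note the problem only asks that some node detect a mismatch, not that all nodes learn all inputs, which is why it is not obvious that (N-1)k bits are necessary.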

  50. Thank you!
