
Capacity of Agreement with Finite Link Capacity



  1. Capacity of Agreement with Finite Link Capacity Guanfeng Liang @ Infocom 2011 Electrical and Computer Engineering University of Illinois at Urbana-Champaign Joint work with Prof. Nitin Vaidya

  2. Motivation

  3. Motivation • Distributed systems are emerging • Cloud computing (e.g. Windows Azure), distributed file systems, data centers, multiplayer online games • Large number of distributed components • Distributed components need to be coordinated

  4. Motivation • Distributed primitives • Clock synchronization • Mutual exclusion • Agreement • etc. • Large body of literature in Distributed Algorithms

  5. Motivation A networking guy asks: “How would constraints of the network affect the performance of these primitives?” An algorithms guy replies: “……” Network-aware distributed algorithm design

  6. Byzantine agreement in p2p networks

  7. Byzantine Agreement (BA): Broadcast • A sender wants to send a message to n-1 receivers • Fault-free receivers must agree • If the sender is fault-free, they agree on its message • Any ≤ f nodes may fail

  8. Why agreement? • Distributed systems are failure-prone • Non-malicious: crashed nodes, buggy code • Malicious: an attacker tries to crack the system • To make a system robust against faults, it is important to maintain consistent state

  9. Impact of the Network • How does capacity (rate region) of the network affect agreement performance? • How to quantify the impact?

  10. Rate Region • Defines the way “links” may share the channel • The interference links pose to each other determines whether a set of transmissions can succeed together

  11. “Ethernet” Rate Region • Sender S and receivers 1, 2 share one channel • Rate S→1 + Rate S→2 ≤ C

  12. Point-to-Point Network Rate Region • Rate i→j ≤ Capacity i→j • Each directed link is independent of the other links
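The two rate regions above can be captured by simple feasibility checks. This is an illustrative sketch; the function names and numeric rates are my own, not from the talk:

```python
# Feasibility checks for the two rate regions (illustrative helper names).
# Rates and capacities are in bits per unit time.

def ethernet_feasible(rates, C):
    """Shared channel: the sum of all link rates must fit within C."""
    return sum(rates.values()) <= C

def p2p_feasible(rates, capacity):
    """Point-to-point: each directed link is constrained independently."""
    return all(r <= capacity[link] for link, r in rates.items())

# Example: sender S pushing to receivers 1 and 2 (made-up numbers).
rates = {("S", 1): 0.6, ("S", 2): 0.5}
print(ethernet_feasible(rates, C=1.0))                       # False: 0.6 + 0.5 > 1
print(p2p_feasible(rates, {("S", 1): 1.0, ("S", 2): 1.0}))   # True
```

The contrast is the point of the two slides: on a shared channel the links constrain each other, while in a point-to-point network each link can be loaded to its own capacity.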

  13. Capacity of Agreement • b(t) = # bits agreed in [0,t] • Throughput = lim b(t)/t • Capacity of agreement: supremum of achievable throughput for a given rate region

  14. Upper Bound of Capacity in P2P Networks • NC1: C ≤ min-cut(S, X | f receivers removed)

  15. Upper Bound of Capacity in P2P Networks • NC2: C ≤ In(X | f nodes removed)

  16. Upper Bound of Capacity in P2P Networks • NC1: C ≤ min-cut(S, X | f receivers removed) • NC2: C ≤ In(X | f nodes removed) • Example: 4-node network with one link of capacity ε, upper bound = 1 + ε
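For a concrete feel for the NC1 bound, here is a brute-force sketch that evaluates it on a hypothetical 4-node network: every directed link has capacity 1 except S→3, which has capacity ε. The exact topology in the slide's figure is not recoverable from the transcript, so this network is my assumption; the bound it produces, 1 + ε, matches the slide.

```python
from itertools import combinations

def min_cut(capacity, nodes, s, t):
    """Brute-force min cut from s to t: try every vertex bipartition."""
    best = float("inf")
    others = [n for n in nodes if n not in (s, t)]
    for r in range(len(others) + 1):
        for side in combinations(others, r):
            s_side = {s, *side}
            cut = sum(c for (u, v), c in capacity.items()
                      if u in s_side and v not in s_side)
            best = min(best, cut)
    return best

def nc1_bound(capacity, s, receivers, f):
    """NC1: C <= min-cut(S, X) after removing any f receivers."""
    best = float("inf")
    for removed in combinations(receivers, f):
        kept = [x for x in receivers if x not in removed]
        nodes = [s] + kept
        cap = {(u, v): c for (u, v), c in capacity.items()
               if u in nodes and v in nodes}
        for x in kept:
            best = min(best, min_cut(cap, nodes, s, x))
    return best

# Assumed network: complete digraph on {S, 1, 2, 3}, all capacities 1,
# except the S -> 3 link which has capacity eps.
eps = 0.5
cap = {(u, v): 1.0 for u in ["S", 1, 2, 3] for v in ["S", 1, 2, 3] if u != v}
cap[("S", 3)] = eps
print(nc1_bound(cap, "S", [1, 2, 3], f=1))  # 1 + eps = 1.5
```

Brute force is fine here only because the network is tiny; for larger graphs a max-flow algorithm would replace `min_cut`.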

  17. Classic Solution for Broadcast • Fault-free source S sends its value v to peers 1, 2, 3 • Peer 2 is faulty

  18. Classic Solution for Broadcast • Each fault-free peer relays the v it received to the other two peers

  19. Classic Solution for Broadcast • Faulty peer 2 relays an arbitrary value (?) instead

  20. Classic Solution for Broadcast • After the exchange, every peer has heard from S and from both other peers

  21. Classic Solution for Broadcast • Each fault-free peer holds the vector [v, v, ?]

  22. Classic Solution for Broadcast • Majority vote results in the correct result at each good receiver

  23. Classic Solution for Broadcast • Faulty source S sends different values v, w, x to peers 1, 2, 3

  24. Classic Solution for Broadcast • The fault-free peers relay the values they received

  25. Classic Solution for Broadcast • Each peer relays its received value to the other two

  26. Classic Solution for Broadcast • Every peer holds the same vector [v, w, x]

  27. Classic Solution for Broadcast • The vote result is identical at the good receivers, so they still agree
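The two executions above (faulty peer, then faulty source) can be reproduced with a toy majority-vote routine. This is an illustrative sketch of the classic 4-node, f = 1 scheme; all names are my own:

```python
from collections import Counter

def majority(votes):
    counts = Counter(votes)
    # Deterministic tie-break (smallest value wins) so that when the faulty
    # source splits the vote, all good nodes still decide identically.
    return max(sorted(counts), key=lambda v: counts[v])

def classic_broadcast(source_sends, relays):
    """source_sends[j]: value S claims to peer j.
    relays[i][j]: value peer i forwards to peer j (faulty peers may lie)."""
    decided = {}
    for j in source_sends:
        votes = [source_sends[j]] + [relays[i][j] for i in source_sends if i != j]
        decided[j] = majority(votes)
    return decided

# Case 1: fault-free source, faulty peer 2 relays garbage.
d1 = classic_broadcast(
    {1: "v", 2: "v", 3: "v"},
    {1: {2: "v", 3: "v"}, 2: {1: "?", 3: "?"}, 3: {1: "v", 2: "v"}})
print(d1[1], d1[3])  # v v  (good receivers recover the sender's value)

# Case 2: faulty source sends three different values; peers relay honestly.
d2 = classic_broadcast(
    {1: "v", 2: "w", 3: "x"},
    {1: {2: "v", 3: "v"}, 2: {1: "w", 3: "w"}, 3: {1: "x", 2: "x"}})
print(len(set(d2.values())) == 1)  # True (all receivers decide alike)
```

Note that every value crosses every link, which is exactly the throughput problem the next slide points out.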

  28. Classic Solution in P2P Networks • The whole message is sent on every link, so throughput ≤ slowest link • With an ε-capacity link: throughput ≤ ε, but the upper bound is 1 + ε

  29. Improving Broadcast Throughput • Observation: classic solution is in fact an “error correction code” • “Error detection codes” are more efficient

  30. Error Detection Code • Two-bit value (a, b): S sends a to peer 1, b to peer 2, and a+b to peer 3

  31. Error Detection Code • Peers exchange their symbols; each collects [a, b, a+b]

  32. Error Detection Code • Parity check passes at all nodes  Agree on (a, b)

  33. Error Detection Code • Parity check fails at a node if peer 1 misbehaves (relays ? instead of a)

  34. Error Detection Code • Check fails at a good node if S sends a bad codeword (a, b, z) • Detection alone is not what we want

  35. Modification • Agree on small pieces of data in each “round” • If X misbehaves with Y in a given round, avoid using the XY link in the next round (for the next piece of data) • Repeat

  36. Algorithm Structure • Fast round (as in the example)

  37. Algorithm Structure • Fast round (as in the example): S sends (a, b, a+b); peers exchange symbols and parity-check

  38. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure

  39. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure.

  40. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure • After a small number of expensive rounds, failures are completely identified

  41. Algorithm Structure • Fast round (as in the example) • Fast round … • Fast round in which failure is detected • Expensive round to learn new info about failure • Fast round • Fast round … • Expensive round to learn new info about failure • Only fast rounds from here on, since after a small number of expensive rounds the failures are completely identified

  42. Algorithm “Analysis” • Many fast rounds • Few expensive rounds • When averaged over time, the cost of the expensive rounds is negligible • Average usage of link capacity depends only on the fast round, which is very efficient • Achieves capacity for 4-node networks and for symmetric networks
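The amortization argument is just an average over rounds: since the number of expensive rounds is bounded, their cost washes out as the number of fast rounds grows. A back-of-the-envelope sketch with made-up numbers (not from the talk):

```python
# Amortized throughput: fast rounds carry the data; the bounded number of
# expensive rounds adds time but (approximately) no new agreed bits.

def avg_throughput(fast_rounds, fast_bits, fast_time, exp_rounds, exp_time):
    total_bits = fast_rounds * fast_bits
    total_time = fast_rounds * fast_time + exp_rounds * exp_time
    return total_bits / total_time

# Illustrative numbers: 2 bits per fast round of unit duration, and 5
# expensive rounds each 100x as long. As fast_rounds grows, throughput
# approaches the fast-round rate of 2 bits per unit time.
print(avg_throughput(10**6, 2, 1.0, 5, 100.0))
```

With a million fast rounds the result is already within 0.1% of the fast-round rate, which is the sense in which "the cost of the expensive rounds is negligible."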

  43. Open problems

  44. Open Problems • Capacity of agreement for general rate regions

  45. Open Problems • Capacity of agreement for general rate regions • Even the multicast problem with Byzantine nodes is unsolved (for multicast, the source is fault-free)

  46. Rich Problem Space • Wireless channel allows overhearing • Transmit to peer 2 at high rate, or at low rate? The low rate also allows reception at peer 1

  47. Rich Problem Space • Similar questions are relevant for any multi-party computation • Distributed computation + communication: multi-party computing under communication constraints

  48. Mind teaser

  49. How many bits needed? • N nodes, each with a k-bit input • Check if all inputs are identical • At least 1 node must “detect” if they are not identical • Intuitive guess: (N-1)k bits • Is it the best we can do?
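The intuitive (N-1)k-bit protocol behind the guess: the other N-1 nodes each send their full k-bit input to node 1, which compares. A minimal sketch (names are illustrative; whether fewer bits suffice is the teaser):

```python
# Naive equality check: node 1 collects everyone's k-bit input and compares.
# Communication cost: each of the other N-1 nodes sends k bits.

def check_identical(inputs, k):
    """inputs: one k-bit value per node, node 1's first.
    Returns (all_identical, bits_sent)."""
    first = inputs[0]
    mismatch = any(x != first for x in inputs[1:])
    bits_sent = (len(inputs) - 1) * k
    return (not mismatch), bits_sent

print(check_identical([0b1011, 0b1011, 0b1011], k=4))  # (True, 8)
print(check_identical([0b1011, 0b1011, 0b1010], k=4))  # (False, 8)
```

Note the problem only asks that some node detect a mismatch, not that all nodes learn all inputs, which is why it is not obvious that (N-1)k bits are necessary.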

  50. Thank you!
