
Error-Free Multi-Valued Consensus with Byzantine Failures




Presentation Transcript


  1. Error-Free Multi-Valued Consensus with Byzantine Failures Guanfeng Liang, Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. Joint work with Nitin Vaidya

  2. Multi-Valued Byzantine Consensus N nodes, each given an L-bit input value, want to compute their L-bit outputs such that • Synchronous • Fault-free nodes must agree • If all fault-free nodes have identical input values ⇒ agree on this value • Up to f < N/3 Byzantine failures

  3. Related Work • Consensus: • Ω(N^2) for 1-bit agreement [Dolev and Reischuk, J.ACM'85] (error-free) • O(N^1.5) for 1-bit, randomized [King and Saia, PODC'10] (may err) • O(NL) for large L with hashing [Fitzi and Hirt, PODC'06] (may err) • Broadcast: • O(NL) for large L [Beerliova-Trubiniova and Hirt, TCC'08] (error-free)

  4. Related Work • Multicast with Byzantine faults • System level diagnosis

  5. Our Work • Error-free consensus within 4x optimal • Can be improved to 2x optimal

  6. Overview of the Algorithm • Divide L-bit input into many generations of D bits • Consensus one generation at a time • Exchange information efficiently with coding • Identify a clique of S nodes that “trust” each other, and appear to have identical inputs: If not found, terminate with default output • If found: Try to agree on the inputs in the clique • Any misbehavior will be detected, then update “trust”: If X did “bad” things to Y, Y will not trust X any more • Repeat for next generation • Memory of “trust” across generations
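The per-generation structure above can be sketched as a small driver loop. This is a minimal illustration, not the paper's algorithm: `run_generation` is a hypothetical stand-in for the coded exchange plus clique search on one D-bit generation, and all names are illustrative.

```python
def run_generations(value_bits, D, run_generation):
    """Drive consensus one D-bit generation at a time over an
    L-bit input.  `run_generation` stands in for the coded
    exchange plus clique search on one generation: it returns
    the agreed D bits, or None when no clique of S nodes is
    found (fault-free inputs differ)."""
    output = []
    for i in range(0, len(value_bits), D):
        agreed = run_generation(value_bits[i:i + D])
        if agreed is None:
            return None  # no clique: terminate with default output
        output.extend(agreed)
    return output
```

In the failure-free case every generation is cheap, which is why the total cost is dominated by these fast rounds.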

  7. Code used for info exchange (n,k) MDS (maximum distance separable) code • n: length; k: dimension • k data symbols ⇒ n coded symbols • Any k coded symbols ⇒ k data symbols • Any m ≥ k locations constitute an (m,k) MDS code (also of dimension k)
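A minimal sketch of such an (n,k) MDS code, Reed-Solomon style: the k data symbols are the coefficients of a polynomial, and the n coded symbols are its evaluations at n distinct points of a prime field. The field GF(257) and the function names are my own illustration, not taken from the talk.

```python
P = 257  # any prime larger than n works for this sketch

def encode(data, n):
    """k data symbols -> n coded symbols (evaluate at x = 1..n)."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
            for x in range(1, n + 1)]

def _mul_linear(poly, c):
    """Multiply a coefficient list (ascending powers) by (x - c)."""
    out = [0] * (len(poly) + 1)
    for i, a in enumerate(poly):
        out[i] = (out[i] - c * a) % P
        out[i + 1] = (out[i + 1] + a) % P
    return out

def decode(points, k):
    """Any k (location, symbol) pairs -> the k data symbols,
    via Lagrange interpolation over GF(P)."""
    pts = points[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(pts):
        basis, denom = [1], 1
        for m, (xm, _) in enumerate(pts):
            if m != j:
                basis = _mul_linear(basis, xm)
                denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P
        for i, a in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * a) % P
    return coeffs
```

Because any k evaluation points determine the degree-(k-1) polynomial, restricting to any m ≥ k locations again gives an (m,k) MDS code, exactly the property the last bullet states.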

  8. [Figure: nodes 1 … n, each holding S - f data symbols]

  9. [Figure: nodes 1 … n] Encode with (N, S - f) MDS code

  10. [Figure: nodes 1 … n]

  11. [Figure: nodes 1 … n with 1-bit flags] Same as the local one? If not ⇒ inputs must be different

  12. [Figure: nodes 1 … n with 1-bit flags]

  13. [Figure: nodes 1 … n with 1-bit flags]

  14. [Figure: flag matrix at nodes 1 … n] Broadcast the 1-bit flags. Find a “clique” of S nodes that match with each other
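Once the 1-bit flags are broadcast, every node sees the same symmetric match matrix and can search it for a clique. A brute-force sketch (helper name is hypothetical; exponential in N, for illustration only):

```python
from itertools import combinations

def find_clique(match, S):
    """match[i][j] is True iff nodes i and j reported matching
    flags for each other's coded symbols (match is symmetric).
    Return some set of S mutually matching nodes, or None if no
    such clique exists.  Brute force, fine only for small N."""
    for group in combinations(range(len(match)), S):
        if all(match[i][j] for i, j in combinations(group, 2)):
            return set(group)
    return None
```

Since all fault-free nodes run this search on identical broadcast flags, they all find the same clique (or all fail to find one).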

  15. [Figure: flag matrix] If not found: Good nodes have different inputs

  16. [Figure: flag matrix] If a clique of S nodes is found: Try to “agree” using packets from the clique

  17. [Figure: flag matrix] At most f bad nodes ⇒ At least S - f good nodes in the clique ⇒ Good nodes share ≥ S - f packets identically

  18. [Figure: flag matrix] Nodes in the clique: The code has dimension S - f ⇒ All have the same input

  19. [Figure: flag matrix] Nodes not in the clique: Either all have an identical codeword of the (S, S - f) code, or someone's word is not a codeword

  20. [Figure: flag matrix] Either all decode to the input of the good nodes in the clique, or someone can't decode ⇒ misbehavior detected
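The detection step in slides 19-20 boils down to checking whether the symbols a node received all lie on one codeword. A minimal consistency-check sketch, assuming evaluation points x = 1..n; it works over the rationals for simplicity, whereas the actual code operates over a finite field, and the function name is illustrative:

```python
from fractions import Fraction

def is_codeword(symbols, k):
    """Check that all n symbols lie on a single polynomial of
    degree < k evaluated at x = 1..n, i.e. that they form one
    (n, k) codeword.  Interpolates through the first k symbols
    and re-evaluates the rest; any mismatch means some symbol
    is off the codeword, exposing misbehavior."""
    pts = [(Fraction(i + 1), Fraction(s))
           for i, s in enumerate(symbols[:k])]

    def interp(x):
        # Lagrange evaluation of the degree-(k-1) interpolant at x
        total = Fraction(0)
        for j, (xj, yj) in enumerate(pts):
            term = yj
            for m, (xm, _) in enumerate(pts):
                if m != j:
                    term *= (x - xm) / (xj - xm)
            total += term
        return total

    return all(interp(Fraction(i + 1)) == s
               for i, s in enumerate(symbols) if i >= k)
```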

  21. Overview of the Algorithm • Divide L-bit input into many generations of D bits • Consensus one generation at a time • Exchange information efficiently with coding • Identify a clique of S nodes that trust each other, and appear to have identical inputs: If not found, terminate with default output • If found: Try to agree on the inputs in the clique • Any misbehavior will be detected, then update “trust”: If X did “bad” things to Y, Y will not trust X any more • Repeat for next generation • Memory of “trust” across generations

  22. Complexity of the Algorithm • Many generations without failure • Few expensive generations with failure • Total cost dominated by the failure-free generations • For large L, communication complexity is O(NL)

  23. Property of the Algorithm • S = N - f: original consensus • Stronger properties satisfied when • S < N - f: If S fault-free nodes have the same input ⇒ output is the input of a fault-free node • S > N/2: If S fault-free nodes have the same input ⇒ output is the majority input

  24. Summary of Results • Error-free multi-valued Byzantine consensus with complexity < 3NL • Order optimal, within 4x of optimal • Same complexity for many consensus instances on small inputs, instead of one very long one • Can be improved to 1.5NL (2x optimal)

  25. Future Work & Latest Results • Is 1.5NL the best we can do? • Generalize to other network models: point-to-point, wireless, etc. • Point-to-point network model: max # of bits transmitted on each link per unit time is independent of other links • Capacity of Byzantine agreement: max # of bits agreed on per unit time • Achieve at least 1/2 of capacity using Random Linear Codes

  26. Thank you!

  27. Flow of the Algorithm • Fast generation (no failure) • Fast generation …… • Fast generation in which failure is detected • Expensive operation to learn new info about failure • Fast generation • Fast generation …… • Fast generation in which failure is detected • Expensive operation to learn new info about failure • Only fast generations hereon. Failures identified after a small number of generations

  28. Failure models • Crash failure – fail by stopping (“do no harm”) • Byzantine failure – arbitrary, potentially harmful, behavior

  29. Known results • Need N ≥ 3f + 1 nodes to tolerate f failures • Need Ω(N^2) messages

  30. 1-bit value • Each message is at least 1 bit • O(N^2) bits “communication complexity” to agree on just a 1-bit value

  31. Larger values (L bits) • Upper bound: Agree on each bit separately ⇒ O(N^2 L) bits communication complexity • Lower bound: Need Ω(NL) bits to agree on L bits

  32. Effort To Improve Complexity • L = 1: O(N^1.5) with randomized algorithm [King and Saia, PODC'10] • Large L: O(NL) with hashing [Fitzi and Hirt, PODC'06] • Both probabilistically correct = Not error-free

  33. Modification • Try to agree on small pieces (D bits) out of the L bits of data in each “round” • If X misbehaves with Y in a given round, avoid using the link between X and Y in the next round (for the next round's D bits of data) • Repeat
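The “avoid using the link” rule amounts to keeping a trust matrix with memory across rounds. A minimal sketch with hypothetical helper names:

```python
def make_trust(n):
    """trust[x][y] == True means x still uses its link to y."""
    return [[True] * n for _ in range(n)]

def report_misbehavior(trust, x, y):
    """If X did "bad" things to Y in some round, stop using the
    link between them from the next round on (both directions)."""
    trust[x][y] = False
    trust[y][x] = False
```

Every detection permanently removes a link adjacent to a faulty node, so with at most f faulty nodes only a bounded number of expensive rounds can ever occur, matching the flow on slide 27.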

  34. Algorithm structure • Fast round (as in the example)
