
Abstracting out Byzantine Behavior

Peter Druschel, Andreas Haeberlen, Petr Kouznetsov. Max Planck Institute for Software Systems.



Presentation Transcript


  1. Abstracting out Byzantine Behavior. Peter Druschel, Andreas Haeberlen, Petr Kouznetsov. Max Planck Institute for Software Systems. P. Kouznetsov, 2006

  2. Why Distributed ≠ Centralized?
     • Failures: a process can deviate from its specification
     • Some problems cannot be solved in a fault-tolerant manner (even if just one process might fail)

  3. Crash failures
     • Crash fault-tolerant consensus cannot be achieved in an asynchronous system [FLP85]
     • A process crashes when it prematurely halts all its activities

  4. Abstracting out crash failures
     • Failure detectors [Chandra and Toueg, 1996]
     • Engineering side: can be specified and implemented independently of algorithms
     • Theory side: can be used for comparing and classifying problems (the weakest failure detectors)

  5. Using failure detectors
     Eventually strong FD <>S [Chandra and Toueg, 1996]: outputs a list of suspected processes. There is a time after which:
     • every crashed process is suspected by every correct process
     • some correct process is never suspected by any correct process
     Consensus is solvable with <>S and a majority of correct processes
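The two <>S properties can be checked mechanically on a recorded run. The sketch below is only an illustration: the trace format and the function name `satisfies_eventually_strong` are assumptions, and "there is a time after which" is approximated by "from some snapshot until the end of a finite trace".

```python
def satisfies_eventually_strong(trace, correct, crashed):
    """Check the two <>S properties on a finite trace (hypothetical format).

    trace   -- list of snapshots; each maps a correct process to its suspect set
    correct -- set of correct processes
    crashed -- set of crashed processes
    """
    for start in range(len(trace)):
        suffix = trace[start:]
        # Completeness: from `start` on, every correct process suspects
        # every crashed process.
        complete = all(crashed <= snap[p] for snap in suffix for p in correct)
        # Accuracy: some correct process q is suspected by no correct
        # process from `start` on.
        accurate = any(
            all(q not in snap[p] for snap in suffix for p in correct)
            for q in correct
        )
        if complete and accurate:
            return True
    return False


trace = [
    {"a": {"b"}, "b": set()},   # early mistake: a wrongly suspects b
    {"a": {"c"}, "b": {"c"}},   # from here on, only the crashed c is suspected
    {"a": {"c"}, "b": {"c"}},
]
print(satisfies_eventually_strong(trace, correct={"a", "b"}, crashed={"c"}))
```

Early false suspicions are allowed: only the behavior of the suffix matters.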

  6. Using failure detectors, contd.
     • Abstracting out the majority assumption: the quorum failure detector Σ [DFG, 2004] outputs a list of processes, called a quorum
     • Every two quorums (output at any processes, at any times) intersect
     • There is a time after which every output quorum contains only correct processes

  7. The weakest failure detector
     • <>S is necessary to solve consensus [CHT, 1996]
     • Σ is the weakest FD to implement a RW register [DFG, 2004]
     => (<>S, Σ) is the weakest FD to solve consensus

  8. State machine replication [Lamport, 1984; Schneider, 1993;…] (Figure: clients send requests to servers and receive a response.)

  9. State machine replication
     Client:
       broadcast the request to all servers
       wait until a response is received
     Server:
       repeat forever
         if there are unserved requests
           use consensus (<>S, Σ) to agree on the order in which the requests are served
           send the results of served requests to the clients
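A minimal sketch of the idea behind this loop: if every server applies the same requests in the same agreed order to a deterministic state machine, all servers end up in the same state. The names `Replica` and `agree_on_order` are hypothetical, and `agree_on_order` is only a stand-in for consensus (a deterministic sort), not a consensus protocol.

```python
class Replica:
    """A deterministic state machine: same request order implies same state."""

    def __init__(self):
        self.state = 0

    def apply(self, request):
        _request_id, op, arg = request
        if op == "add":
            self.state += arg
        elif op == "mul":
            self.state *= arg
        return self.state


def agree_on_order(proposals):
    """Stand-in for consensus (assumption, not a real protocol).

    Sorting the union of all proposals by request id merely guarantees
    that every replica obtains the same sequence.
    """
    merged = set()
    for proposal in proposals:
        merged.update(proposal)
    return sorted(merged)


requests = [(1, "add", 3), (2, "mul", 2), (3, "add", 1)]
# Each server receives the broadcast requests in a different order.
received = [requests, requests[::-1], [requests[1], requests[2], requests[0]]]

order = agree_on_order(received)
replicas = [Replica() for _ in received]
results = [[replica.apply(r) for r in order][-1] for replica in replicas]
print(results)
```

Because the operations do not commute (add then multiply differs from multiply then add), the replicas would diverge without the agreement step.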

  10. Useful abstractions
      • SMR (Totally ordered broadcast) = reliable broadcast + consensus [Toueg, Hadzilacos, 1993]
      • Consensus = (<>S, Σ)

  11. Detectable Byzantine failures (Figure: failure classes, labeled Ignorant, Crash, Mute, Detectable Byzantine, and Byzantine failures.)

  12. Byzantine failure detectors
      • BFDs are parameterized with the specification of the correct system behavior
      • The output of a BFD depends solely on detectable failures: no information about steps performed by correct processes can be extracted (necessary to distinguish algorithms from BFDs)

  13. Byzantine FD abstraction (Figure labels: Application; Monitoring algorithms (PeerReview, HotDep 2006); Automaton Ai; BFD; Enforcing algorithms (SMR); Network.)

  14. State machine replication: classics
      Client:
        broadcast requests to all servers
        wait until a response is received
      Server:
        repeat forever
          if there are unserved requests
            use consensus to agree on the order in which the requests are served
            send the results of served requests to the clients
      (!) A single malicious process can ignore correct requests and inject bogus requests

  15. BFT state machine replication [Doudou et al, 2005]
      reliable broadcast + weak interactive consistency
      WIConsistency: every correct process proposes a value and decides on a set of values
      • the decided set contains at least one value proposed by a correct process
      • no two correct processes decide differently
      SMR can be implemented using RB and WIConsistency
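The two WIConsistency conditions can be written down as a small checker over the outcome of one run. This is a sketch with hypothetical names (`check_wic`, the dictionary layout); it only tests the two properties listed on the slide and says nothing about how the decisions are reached.

```python
def check_wic(proposals, decisions, correct):
    """Check the two WIConsistency properties for one finished run.

    proposals -- maps each process to the value it proposed
    decisions -- maps each correct process to the set of values it decided
    correct   -- set of correct processes
    """
    decided = [frozenset(decisions[p]) for p in correct]
    # Agreement: no two correct processes decide differently.
    agreement = len(set(decided)) <= 1
    # Validity: each decided set contains at least one value proposed
    # by a correct process.
    correct_values = {proposals[p] for p in correct}
    validity = all(d & correct_values for d in decided)
    return agreement and validity


proposals = {"p1": "x", "p2": "y", "p3": "bogus"}   # p3 is Byzantine
correct = {"p1", "p2"}

print(check_wic(proposals, {"p1": {"x", "bogus"}, "p2": {"x", "bogus"}}, correct))
print(check_wic(proposals, {"p1": {"bogus"}, "p2": {"bogus"}}, correct))
```

The first run is acceptable even though a bogus value slipped into the decided set; the second is not, because nothing in the decided set comes from a correct process.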

  16. The question
      • SMR = RB + WIConsistency?
      • No: (<>SB, ΣB) can implement SMR but cannot implement WIConsistency
      => WIConsistency > SMR (WIConsistency is strictly stronger than SMR)

  17. <>SB [MR97, DS98, KMM03]
      Outputs a list of processes suspected to be mute. There is a time after which:
      • every mute process is suspected by every correct process
      • some correct process is never suspected by any correct process

  18. Byzantine quorum FD ΣB
      Outputs a list of processes, called a quorum
      • Every two quorums (output at any two correct processes, at any times) share at least one correct process
      • There is a time after which every output quorum contains only correct processes
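To see why the ΣB intersection property is stronger than plain intersection, the sketch below checks it on concrete quorum sets. The setting n = 3f + 1 with quorums of size 2f + 1 is an assumption chosen for illustration (two such quorums overlap in at least f + 1 processes, so at least one of them is correct); the function name is hypothetical.

```python
from itertools import combinations


def share_correct_process(quorums, correct):
    """True if every two quorums have at least one correct process in common."""
    return all(a & b & correct for a, b in combinations(quorums, 2))


f = 1
processes = set(range(3 * f + 1))          # assumption: n = 3f + 1
quorums = [set(q) for q in combinations(sorted(processes), 2 * f + 1)]

# Whichever single process is faulty, any two quorums still share a correct one.
print(all(share_correct_process(quorums, processes - {bad}) for bad in processes))

# Plain majorities of 5 processes intersect, but possibly only in the faulty process 2.
print(share_correct_process([{0, 1, 2}, {2, 3, 4}], correct={0, 1, 3, 4}))
```

The second call shows the failure mode: the two majorities do intersect, so the crash-model Σ property holds, but their only common member is faulty.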

  19. SMR using (<>SB, ΣB)
      • (<>SB, ΣB) can be used to implement a BFT replication system
      • Adaptation of BFT [Castro, Liskov, 1999]:
        • "wait until acks are received from 2f+1 processes" => "wait until acks are received from a quorum output by ΣB"
        • "if the primary replica has timed out, initiate a view change" => "if the primary replica is in <>SB, initiate a view change"
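The two substitutions can be pictured as swapping the predicates a replica evaluates. The sketch below is an assumption-laden illustration, not code from BFT or from the talk: all function names are hypothetical, and "acks from ΣB" is read as "acks from every member of the quorum currently output by ΣB".

```python
def enough_acks_classic(acked, f):
    """Original rule: acknowledgements from at least 2f+1 processes."""
    return len(acked) >= 2 * f + 1


def enough_acks_quorum(acked, current_quorum):
    """Adapted rule (assumed reading): every member of the current quorum has acked."""
    return current_quorum <= acked


def should_change_view(primary, suspected_mute):
    """Adapted rule: change view when the primary is in the <>SB output."""
    return primary in suspected_mute


acked = {"r0", "r1", "r3"}
print(enough_acks_classic(acked, f=1))
print(enough_acks_quorum(acked, current_quorum={"r0", "r1", "r2"}))
print(should_change_view("r2", suspected_mute={"r2"}))
```

In the adapted form the replica no longer needs to know f or to manage timeouts itself; both are hidden behind the failure detector outputs.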

  20. WIConsistency using (<>SB, ΣB)?
      Assume an algorithm exists
      • Let the processes in Q be correct and the rest crash initially
      • E: Q decides on V (a set of values proposed by Q)
      • E’: an extension of E in which some pi not in Q decides V
      • E’’: an extension of E in which all processes that proposed the values in V (that is, Q) are faulty and pi is correct
      => contradiction

  21. Related work
      • State machine replication [Lamport 84, 89; Schneider, 1990; Doudou et al., 2005;…]
      • Failure detectors [Chandra, Toueg, 1991; Chandra et al., 1992; Delporte et al., 2003;…]
      • Byzantine quorum systems [Malkhi, Reiter, 1997]
      • Byzantine failure detection [MR97; DS98; KMM03; AMPR01; BAR, 2005; …]

  22. Conclusions
      Byzantine FD abstraction does make sense!
      • BFT state machine replication using (<>SB, ΣB)
      • BFT SMR is strictly weaker than WIConsistency
      • Is the lower bound tight?
      • How to implement Byzantine FDs?

  23. Monitoring: PeerReview [HKD06]
      BFD produces three types of indications for the application layer: trusted, suspected, and exposed.
      Completeness:
      • Eventually, every detectably ignorant node is forever suspected by every correct node
      • Eventually, every detectably malicious node is exposed by every correct node
      Accuracy:
      • No correct node is forever suspected by a correct node
      • No node is exposed by a correct node, unless it is detectably malicious

  24. PeerReview approach
      • Nodes locally observe message traffic and classify other nodes as trusted, suspected, or exposed
      • Quick overview:
        • Every node keeps a log of all its local inputs and outputs
        • Crypto techniques ensure that the log is accurate and linear
        • Nodes can audit each other's logs at any time
        • To check for faulty behavior, auditors replay the contents of the log
        • In case of misbehavior, auditors produce evidence that other nodes can verify independently
      • Eventually complete and accurate!
      (Figure labels: {trusted, suspected, exposed}; Application; PeerReview detector; State machine (e.g. NFS); Network.)
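The logging and replay steps can be sketched with a generic hash chain. This is not PeerReview's actual log format: signatures, authenticators, and the "suspected" indication for unresponsive nodes are left out, and every name below (`Log`, `verify`, `audit`, `counter_step`) is hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32


def entry_hash(prev_hash, seq, kind, payload):
    """Hash of an entry, chained to the hash of the previous entry."""
    return hashlib.sha256(prev_hash + f"{seq}|{kind}|{payload}".encode()).digest()


class Log:
    """Append-only log of inputs ("in") and outputs ("out")."""

    def __init__(self):
        self.entries = []  # tuples (seq, kind, payload, hash)

    def append(self, kind, payload):
        prev = self.entries[-1][3] if self.entries else GENESIS
        seq = len(self.entries)
        self.entries.append((seq, kind, payload, entry_hash(prev, seq, kind, payload)))


def verify(entries):
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for seq, kind, payload, digest in entries:
        if entry_hash(prev, seq, kind, payload) != digest:
            return False
        prev = digest
    return True


def audit(entries, step, initial_state):
    """Replay logged inputs through the reference `step` and compare outputs."""
    if not verify(entries):
        return "exposed"
    state, expected = initial_state, None
    for _seq, kind, payload, _digest in entries:
        if kind == "in":
            state, expected = step(state, payload)
        elif payload != expected:
            return "exposed"
    return "trusted"


def counter_step(state, payload):
    """Reference state machine (assumption): a running sum of the inputs."""
    new_state = state + int(payload)
    return new_state, str(new_state)


log = Log()
for kind, payload in [("in", "2"), ("out", "2"), ("in", "3"), ("out", "5")]:
    log.append(kind, payload)
print(audit(log.entries, counter_step, 0))
```

Replay works only because the reference state machine is deterministic: given the logged inputs, the auditor can recompute exactly which outputs a correct node would have produced.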

  25. Typical consensus algorithm
      repeat
        round++
        c = round mod n
        if p = c then try to “lock” the current estimate
        help in locking until a decided value is received from c, or c is suspected by <>S
      until a decided value is received
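The coordinator rotation in this loop can be illustrated with a heavily simplified simulation. It models only two things: the choice `c = round mod n`, and the fact that a crashed coordinator is eventually suspected so the round is abandoned. The locking phase itself is not modeled; the sketch assumes that a correct, unsuspected coordinator succeeds. The function name and parameters are hypothetical.

```python
def rotate_until_decision(estimates, crashed, max_rounds=100):
    """Return (round, coordinator, decided value), or None if no round succeeds.

    estimates -- list indexed by process id, holding each process's estimate
    crashed   -- set of crashed process ids
    """
    n = len(estimates)
    for round_no in range(1, max_rounds + 1):
        c = round_no % n  # coordinator of this round
        if c in crashed:
            # c never sends a decision; <>S eventually suspects it, so
            # every process stops waiting and moves to the next round.
            continue
        # Assumption: a correct, unsuspected coordinator manages to lock
        # its estimate and broadcast it as the decision.
        return round_no, c, estimates[c]
    return None


print(rotate_until_decision(["a", "b", "c"], crashed={1}))
```

In the real algorithm a correct coordinator may also be wrongly suspected for a while, which is why the accuracy property of <>S (some correct process is eventually never suspected) is needed for termination.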
