
High Throughput Byzantine Fault Tolerance

Ramakrishna Kotla, Mike Dahlin. Laboratory for Advanced Systems Research, The University of Texas at Austin.


Presentation Transcript


  1. High Throughput Byzantine Fault Tolerance Ramakrishna Kotla, Mike Dahlin Laboratory for Advanced Systems Research, The University of Texas at Austin

  2. Summary of the talk • High throughput is achievable along with Byzantine fault tolerance • Contributions • High Throughput BFT Architecture • CBASE : Generic Prototype • CBASE-FS : High throughput replicated NFS Department of Computer Sciences, UT Austin

  3. Outline • Overview • Architecture • Implementation • Evaluation • Conclusion

  4. Motivation • Large scale Internet services • High Availability : 24 X 7 service • High Reliability : Correctness • High Security : Data integrity/Confidentiality • High Throughput : System load • Challenges : Byzantine failures • Malicious attacks • http://www.cert.org • Software and operator errors • ROC@USITS03 • Network and hardware failures

  5. BFT State Machine Replication

  6. BFT state machine replication [Figure: clients and four server replicas, each replica with an agreement stage and an execution stage] • Byzantine Fault Tolerance Protocol • Tolerates f Byzantine server failures using 3f+1 replicas • Agreement stage : Order requests from clients • Execution stage : Execute requests • Provides high availability, reliability and security • PBFT, Farsite, Oceanstore [OSDI99, OSDI01, SOSP01, SOSP03]
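The 3f+1 bound on this slide can be checked with a little arithmetic. The sketch below is only an illustration; the function names are made up for this example and are not part of PBFT or any library.

```python
def replicas_needed(f: int) -> int:
    """Minimum replica count that tolerates f Byzantine failures (3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults_tolerated(n: int) -> int:
    """Largest f such that 3f + 1 <= n, for a group of n replicas."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3
```

Under this rule a 4-replica group tolerates one Byzantine failure, and growing the group to 5 or 6 replicas does not raise that number; 7 replicas are needed to tolerate two.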

  7. BFT : Trade off throughput for fault tolerance?

  8. Traditional BFT : Limitations • Fails to provide high throughput • Does not scale with hardware resources or application parallelism • Reason • Uses generalized state machine replication • Correctness conditions: • Agreement : Every non-faulty state machine replica receives every request • Order : Every non-faulty state machine replica processes the requests in the same relative order • BFT state machine replication : • Executes requests sequentially to ensure order

  9. High Throughput BFT : Idea • Relax the Order condition without compromising consistency/safety • Relaxed order : Every non-faulty replica executes dependent requests in the same relative order • Dependent requests : Two requests are dependent if the read set or write set of one intersects with the write set of the other • Requests that are not dependent can be executed concurrently • Exploit application parallelism to provide high throughput • Commercial applications such as web servers, file systems, and databases have inherent data parallelism
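The definition of dependent requests on this slide can be written down directly as a set-intersection test. The following is a minimal sketch; the `Request` type and its field names are assumptions made for illustration and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request record: the objects it reads and the objects it writes."""
    name: str
    read_set: frozenset = frozenset()
    write_set: frozenset = frozenset()


def dependent(a: Request, b: Request) -> bool:
    """True if the read or write set of one request intersects the write set of the other."""
    a_touches = a.read_set | a.write_set
    b_touches = b.read_set | b.write_set
    return bool(a_touches & b.write_set) or bool(b_touches & a.write_set)
```

Two reads of the same object are not dependent under this test, while a read and a write of the same object are, so only the latter pair must be executed in the agreed order.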

  10. Outline • Overview • Architecture • Implementation • Evaluation • Conclusion

  11. HT BFT : Architecture • Goals : • Generic : Generic interface that exposes application parallelism • Extensible : Easily extensible to support any application • Modular : Support different fault models easily • Reuse : Reuse existing agreement protocols [Figure: four server replicas, each with an agreement stage, a parallelizer, and an execution stage]

  12. Parallelizer • Application independent module • Receives ordered requests from the agreement stage • Maintains/Updates a dependency graph of requests • 2 level dependency analysis • Concurrency matrix • Schedules a request if it does not depend on any outstanding request (no outgoing edges at its request node) • Requests that are not dependent are executed concurrently
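The scheduling rule on this slide (a request is runnable when its node has no outgoing edges) can be sketched as a small single-threaded data structure. This is an illustrative sketch and not the CBASE code: the class name, the method names, and the `same_key_conflict` predicate over `(operation, key)` tuples are all assumptions.

```python
from typing import Any, Callable, Dict, List, Set


def same_key_conflict(a: tuple, b: tuple) -> bool:
    """Hypothetical dependency rule: same key, and at least one of the two is a write."""
    return a[1] == b[1] and "write" in (a[0], b[0])


class DependencyGraph:
    """Requests are inserted in agreed order; edges point to earlier outstanding requests."""

    def __init__(self, depends: Callable[[Any, Any], bool]) -> None:
        self._depends = depends
        self._requests: Dict[int, Any] = {}    # outstanding requests by sequence number
        self._edges: Dict[int, Set[int]] = {}  # seq -> earlier seqs it must wait for
        self._dispatched: Set[int] = set()     # handed out for execution, not yet completed
        self._next_seq = 0

    def insert(self, request: Any) -> int:
        """Add a request and link it to every earlier outstanding request it depends on."""
        seq = self._next_seq
        self._next_seq += 1
        self._edges[seq] = {
            earlier for earlier, other in self._requests.items()
            if self._depends(other, request)
        }
        self._requests[seq] = request
        return seq

    def take_ready(self) -> List[int]:
        """Return, and mark as dispatched, all requests with no outgoing edges."""
        ready = sorted(
            seq for seq, out in self._edges.items()
            if not out and seq not in self._dispatched
        )
        self._dispatched.update(ready)
        return ready

    def complete(self, seq: int) -> None:
        """Remove a finished request and the edges that pointed to it."""
        del self._requests[seq]
        del self._edges[seq]
        self._dispatched.discard(seq)
        for out in self._edges.values():
            out.discard(seq)
```

In this sketch, inserting a write to `a`, a read of `a`, and a read of `b` makes the write and the read of `b` runnable immediately, while the read of `a` becomes runnable only after the write is completed.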

  13. Parallelizer : Concurrency Matrix • Definition/Figure : Square matrix whose rows and columns represent operations • 1 represents independent operations, 0 represents dependent operations • Exports application level parallelism • Statically defined • Two matrices, because dependency also depends on the objects involved : • Related objects • Unrelated objects • Table lookup • Low overhead
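A concurrency matrix of this kind reduces the dependency check to a table lookup. The sketch below uses a made-up two-operation application (`READ`, `WRITE`) with hypothetical matrix contents; the real CBASE-FS matrices cover NFS operations and are not reproduced in these slides.

```python
# Hypothetical operation set; rows and columns share this ordering.
OPS = ("READ", "WRITE")
INDEX = {op: i for i, op in enumerate(OPS)}

# 1 = independent (may run concurrently), 0 = dependent (must keep agreed order).
# Matrix used when the two requests touch related objects (assumed contents).
RELATED = [
    [1, 0],  # READ  vs READ, WRITE
    [0, 0],  # WRITE vs READ, WRITE
]
# Matrix used when the two requests touch unrelated objects (assumed contents).
UNRELATED = [
    [1, 1],
    [1, 1],
]


def independent(op_a, obj_a, op_b, obj_b, related=lambda x, y: x == y):
    """Look up whether two requests may run concurrently.

    `related` decides which matrix applies; object equality is an
    assumption standing in for an application-specific rule.
    """
    matrix = RELATED if related(obj_a, obj_b) else UNRELATED
    return matrix[INDEX[op_a]][INDEX[op_b]] == 1
```

Because both matrices are static, a conservative application can start with mostly zeros in the related-objects matrix and flip entries to 1 as each pair of operations is shown to be safe.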

  14. Parallelizer : Dependence Analysis [Figure: agreement stage, input queue, dependency graph, multithreaded execution stage]

  15. Advantages/Limitations • Advantages : • Supports high throughput applications • Simple : Minimal/No changes to client/agreement protocol/application • Flexible : Supports different fault models easily • Limitations : • Defining the concurrency matrix requires knowledge of the application's inner workings • Conservative rules ensure correctness at the expense of performance • Incrementally refine the rules to gain performance

  16. Outline • Overview • Architecture • Implementation • Evaluation • Conclusion

  17. System Model • Asynchronous system • Nodes operate at arbitrarily different speeds • Network may delay, drop or deliver messages out of order • Assumption : Bounded fair links • Fault Model : Byzantine Faults • Faulty nodes may behave arbitrarily : crash, lose/alter data, send incorrect messages • Adversary : Strong adversary • Can coordinate faulty nodes in arbitrarily bad ways • Assumption : Computationally limited

  18. CBASE : Concurrent BASE • Uses unmodified PBFT agreement protocol [OSDI 1999] • Built upon BASE library [SOSP 2001] • Agreement stage : Single thread • Execution stage : Multithreaded • Parallelizer : Producer/Consumer queue • Figure ??

  19. Parallelizer : Interface • Parallelizer.insert() • Parallelizer.next_request() • Parallelizer.sync()
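The slide only lists the three method names, so the sketch below guesses at plausible semantics for a producer/consumer parallelizer: the agreement thread calls `insert()`, execution threads call `next_request()`, and `sync()` waits until nothing is outstanding. These semantics, the class name `ParallelizerSketch`, the extra `complete()` method, and the `same_key_conflict` predicate are all assumptions made for illustration; this is not the CBASE implementation.

```python
import threading
from typing import Any, Callable, List, Optional


def same_key_conflict(a: tuple, b: tuple) -> bool:
    """Hypothetical symmetric dependency rule over (operation, key) tuples."""
    return a[1] == b[1] and "write" in (a[0], b[0])


class ParallelizerSketch:
    def __init__(self, depends: Callable[[Any, Any], bool]) -> None:
        self._depends = depends           # assumed to be symmetric
        self._cond = threading.Condition()
        self._pending: List[Any] = []     # inserted in agreed order, not yet handed out
        self._running: List[Any] = []     # handed out, not yet completed

    def insert(self, request: Any) -> None:
        """Producer side: called by the agreement thread in agreed order."""
        with self._cond:
            self._pending.append(request)
            self._cond.notify_all()

    def _pick(self) -> Optional[Any]:
        # First pending request that depends on no running request and
        # on no request pending ahead of it in the agreed order.
        for i, req in enumerate(self._pending):
            earlier = self._running + self._pending[:i]
            if not any(self._depends(e, req) for e in earlier):
                return self._pending.pop(i)
        return None

    def next_request(self) -> Any:
        """Consumer side: blocks until some request is safe to execute."""
        with self._cond:
            while True:
                req = self._pick()
                if req is not None:
                    self._running.append(req)
                    return req
                self._cond.wait()

    def complete(self, request: Any) -> None:
        """Hypothetical addition (not on the slide): report a finished request."""
        with self._cond:
            self._running.remove(request)
            self._cond.notify_all()

    def sync(self) -> None:
        """Block until no request is pending or running."""
        with self._cond:
            while self._pending or self._running:
                self._cond.wait()
```

A worker thread in this sketch would loop over `next_request()`, execute the request, and then call `complete()`. Using a single condition variable keeps the example short at the cost of waking all waiters on every change.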

  20. CBASE-FS : BFT NFS • Figure • Brief description of NFS concurrency matrix rules • Related objects : Same NFS handle • Rules are conservative • Refer to the paper for more details

  21. Outline • Overview • Architecture • Implementation • Evaluation • Conclusion

  22. Evaluation • 4 server replicas, tolerating 1 Byzantine failure • Each replica runs on a separate uniprocessor machine • 933 MHz P3, 256 MB RAM • 5 client machines • Dedicated network with a 100 Mbps Ethernet hub • OS : Red Hat Linux 7.2 with NFS 2.0 • Assumption : No correlated failures due to OS

  23. Microbenchmark : Overhead • BASE versus CBASE

  24. Microbenchmark : Scalability • Scalability with hardware resources • Scalability with application level parallelism

  25. Microbenchmark : CBASE-FS/BASE-FS/NFS • Latency versus Throughput with no sleep • Latency versus Throughput with 20 ms sleep • Iozone results summary

  26. Macrobenchmarks • Postmark : • Andrew :

  27. Conclusions • Commercial applications have inherent parallelism • High throughput BFT provides a simple, flexible way to achieve high throughput

  28. Questions? • Why don’t you have a parallelizer in the agreement stage to reduce agreement cost?
