
Tolerating Byzantine Faults in Database Systems using Commit Barrier Scheduling

Ben Vandiver, Hari Balakrishnan, Barbara Liskov, and Sam Madden. CSAIL, MIT. Sponsors: Quanta Computer Inc, NSF.


Presentation Transcript


  1. Tolerating Byzantine Faults in Database Systems using Commit Barrier Scheduling • Ben Vandiver, Hari Balakrishnan, Barbara Liskov, and Sam Madden • CSAIL, MIT • Sponsors: Quanta Computer Inc, NSF

  2. Non-crash faults in Databases • Over 50% of reported bugs were non-crash faults • Incorrect answers, data or index corruption, etc. • Previous focus on fail-stop faults • Better model: Byzantine faults

  3. Failure Independence • Heterogeneous replicas • Different implementations / versions • Easiest with non-invasive solution • Requires standard interface • SQL is moderately standard

  4. Client Interaction • Organized into Transactions • Query, Query, …, Commit / Rollback • Interactive • Strong consistency • Single-copy serializable

  5. Database Functionality • Each Database provides • Serializable isolation • Strict (rigorous) 2-phase locking • Databases don’t execute in issue-order • Limited control over execution order [Figure: statements issued as S1, S2; Replica 1 executes S1, S2; Replica 2 executes S2, S1]

  6. Replica Coordination • BFT well known solution • 3f+1 replicas • Globally order client requests • Replicas execute in order • Exhibits no concurrency • Goal: mechanism to extract concurrency in database context

  7. Architecture [Figure: three clients send SQL to the Shepherd, which sends SQL to replicas DB1, DB2, and DB3]

  8. Architecture • Need f+1 matching votes [Figure: the Shepherd sends SQL to DB1, DB2, and DB3, votes on the results they return, and sends the result to the client; one replica's result is marked "?"]
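
The f+1 voting rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not HRDB's code: the function name `vote` and the representation of an answer as a hashable tuple of rows are assumptions made for the example.

```python
from collections import Counter


def vote(answers, f):
    """Return the answer reported by at least f+1 replicas, or None.

    `answers` holds one hashable answer per replica that has replied so far
    (assumption: a result set is represented as a tuple of row tuples).
    """
    counts = Counter(answers)
    for answer, n in counts.most_common():
        if n >= f + 1:
            return answer
    return None  # not enough matching votes yet


# With f = 1, two matching answers outvote one faulty replica.
good = ((1, "alice"),)
bad = ((1, "mallory"),)
print(vote([good, bad, good], f=1))
```

With at most f faulty replicas, any answer that f+1 replicas agree on is backed by at least one non-faulty replica, which is why f+1 is the threshold.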

  9. How to extract concurrency? • Just issue statements to replicas • Likely to get stuck • Solution: pre-determine which statements conflict • Inspecting SQL is very hard

  10. Commit Barrier Scheduling • Primary / Secondary Scheme • Run transactions first on the primary • Duplicate primary’s ordering on the secondaries • Works best when primary is Sufficiently Blocking • Required for performance, not correctness

  11. Commit Barrier Scheduling [Figure: three clients send SQL to the Shepherd and receive results; the Shepherd sends SQL to three DBs, one of which is labeled Primary; one replica's result is marked "?"]

  12. Correct Execution • Statement Ordering Rule • Execute statements of transaction in order • Commit Ordering Rule • All replicas commit transactions in the same order • Order determined by Shepherd

  13. Execution Trace on Primary [Figure: timeline showing T1: SX, C and T2: SY, SZ, C]

  14. Extracting Conflict Info [Figure: the same trace of T1 (SX, C) and T2 (SY, SZ, C), annotated "Don’t Conflict!"]

  15. Avoiding Conflicts [Figure: the same trace of T1 (SX, C) and T2 (SY, SZ, C), annotated "Might Conflict!"] • Transaction-Ordering Rule: A query from transaction T2 that the primary executed after the COMMIT of transaction T1 can be sent to a secondary only after that secondary has processed all queries of T1.

  16. Commit Barrier Scheduling • Maintain barrier for each replica • Mark statements and transactions with barriers • Issue statements and commits when replica’s barrier reaches appropriate value • Simple to implement

  17. Analysis of CBS: Non-faulty primary • Full concurrency on the Primary • Deadlocks detected and resolved locally • Ample concurrency on Secondaries • Allows many statements to run in parallel • Secondaries hardly ever block • Latency increase

  18. Early Return • Pipelined Execution! [Figure: the Shepherd returns a result to the client and receives the client's next SQL statement while SQL is still being sent to the other DBs; one DB is labeled Primary]

  19. Early Return Analysis • Cuts latency in half • Must vote at Commit • If a wrong answer was sent, abort the transaction • Correctness Condition • Clients receive correct answers for all transactions that commit
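
The vote-at-commit check can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the name `commit_decision`, the list-based inputs, and counting the primary's own answer as one of the f+1 votes are all assumptions of this example.

```python
def commit_decision(sent_answers, secondary_answers, f):
    """Decide COMMIT or ABORT for a transaction that used early return.

    sent_answers: the primary's answers already returned to the client,
        one per statement.
    secondary_answers: one list per secondary, aligned with sent_answers.
    Assumption: the primary's answer counts as one vote, so each sent
    answer needs f more matching secondaries.
    """
    for i, sent in enumerate(sent_answers):
        matching = 1 + sum(1 for replica in secondary_answers if replica[i] == sent)
        if matching < f + 1:
            return "ABORT"  # the client saw an answer that the vote rejects
    return "COMMIT"


# f = 1: the second secondary disagrees on statement 2, but f+1 = 2 votes match.
print(commit_decision([10, 20], [[10, 20], [10, 99]], f=1))
```

Aborting whenever any returned answer fails the vote is what keeps the correctness condition: a client may briefly see a wrong answer, but never inside a transaction that commits.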

  20. Masking Faults • Faulty Secondary not a problem • Voting resolves wrong answers • Faulty Primary is a problem • Generates invalid schedule • Goal: correct execution

  21. Faulty Primary Scenario • T1, T2 – Increment A by 1, return A • A initially 0, should end up 2 • f+1 matching votes for both answers!

  22. Other Issues • Mechanics • Replica Repair • Shepherd crashes • Heterogeneity & SQL

  23. Implementation • Prototype called HRDB • Implemented in Java • About 3500 semicolon-lines of code • JDBC interface to clients and databases • Works with MySQL, DB2, Derby, and SQLServer

  24. Performance 17%

  25. Heterogeneous Replication • Ran 2f+1=3 replica system, heterogeneous vendors • MySQL, DB2, Commercial DB X • Sufficiently Blocking holds in practice • System runs at slowest of f+1 fastest replicas, or primary

  26. Fail-Stop Faults

  27. Bugs and HRDB • Successfully masked bugs • Heterogeneous vendors & heterogeneous versions • Found a new bug in MySQL • While running TPC-C • Present since October 2001 • Patched in recent release • Starting to look for bugs actively with HRDB

  28. Conclusion • First practical Byzantine Fault Tolerant Database • Failure independence by supporting heterogeneous replicas • Novel concurrency extraction scheme • Tool for finding new bugs in databases

  29. Backup Slides

  30. Snapshot Isolation • Allows read-after-write hazards • Converts fail-stop to Byzantine faults • Need write-sets to implement • Scheme called Snapshot Barrier Scheduling

  31. Implement with Barriers [Figure: timeline divided into barrier values B=0, B=1, B=2, B=3, showing T1: SW, C; T2: SX, SY, SZ, C; T3: SJ, SK, C] • Primary • S – Annotate with current barrier upon completion • C – Increment barrier before issue • Secondary • S – Issue when replica barrier is at least the value of the annotation • C – Increment replica barrier after completion
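
The barrier rules above can be sketched as two small Python classes. The class and method names are hypothetical, and one detail is an assumption the slides do not spell out: a commit is annotated with the barrier value after the primary's increment, and a secondary issues it only when its own barrier is exactly one less, which yields the same commit order on every replica. The statement-ordering rule (statements of one transaction run in order) is not modeled here.

```python
class PrimaryAnnotator:
    """Annotates operations with barrier values as they run on the primary."""

    def __init__(self):
        self.barrier = 0

    def statement_completed(self, txn, stmt):
        # S: annotate with the current barrier upon completion.
        return {"txn": txn, "kind": "S", "stmt": stmt, "barrier": self.barrier}

    def commit_issued(self, txn):
        # C: increment the barrier before issuing the commit.
        self.barrier += 1
        return {"txn": txn, "kind": "C", "barrier": self.barrier}


class SecondaryGate:
    """Decides when an annotated operation may be issued to a secondary."""

    def __init__(self):
        self.barrier = 0

    def can_issue(self, op):
        if op["kind"] == "S":
            # S: issue once the replica barrier has reached the annotation.
            return self.barrier >= op["barrier"]
        # C (assumption): issue only after all earlier commits have completed.
        return self.barrier == op["barrier"] - 1

    def commit_completed(self):
        # C: increment the replica barrier after the commit completes.
        self.barrier += 1


# Trace from the earlier slides; the interleaving is assumed:
# SY completes before T1 commits, SZ completes after.
p = PrimaryAnnotator()
sx = p.statement_completed("T1", "SX")
sy = p.statement_completed("T2", "SY")
c1 = p.commit_issued("T1")
sz = p.statement_completed("T2", "SZ")

g = SecondaryGate()
print(g.can_issue(sy), g.can_issue(sz))  # SY may run now; SZ must wait for T1
g.commit_completed()                     # T1's commit finishes on the secondary
print(g.can_issue(sz))
```

Statements that share a barrier value ran concurrently on the primary without blocking one another, so the secondary may issue them in parallel; only a statement that followed a commit on the primary has to wait for that commit.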

  32. Heterogeneity Issues • Non-determinism in answers • Result set ordering • Non-deterministic functions in queries • Database-assigned row IDs • Query Rewriting • SQL incompatibility • Translation Engine • SQL hiding – Views and Stored Procedures
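
One of the issues listed, result set ordering, matters because two correct replicas may return the same rows in different orders when a query has no ORDER BY. The slide names query rewriting as a remedy; the sketch below shows a different, comparison-side approach purely as an illustration. The function name `results_match` and the `ordered` flag are assumptions of this example, not part of HRDB.

```python
from collections import Counter


def results_match(rows_a, rows_b, ordered):
    """Compare two result sets given as sequences of hashable row tuples.

    ordered=True: the query fixed the row order, so compare as sequences.
    ordered=False: compare as multisets, ignoring order but keeping duplicates.
    """
    if ordered:
        return list(rows_a) == list(rows_b)
    return Counter(rows_a) == Counter(rows_b)


a = [(1, "x"), (2, "y")]
b = [(2, "y"), (1, "x")]
print(results_match(a, b, ordered=False))
print(results_match(a, b, ordered=True))
```

A multiset comparison is used rather than a set comparison so that a replica returning a duplicated or missing row is still detected.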

  33. Future Work • Replicating the Shepherd • Efficient Replica Repair • Finding Bugs

  34. Replica Recovery • Replicas • Fail-stop crashes – Shepherd replays missing transactions • Uses transaction log table in database to discover which transactions to replay • Byzantine faults – Shepherd repairs faulty state, then replays • Efficient repair mechanism under development • Shepherd • Fail-stop crashes – Maintains a write-ahead log
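
The replay step for a replica that crashed and restarted can be sketched as below. The data shapes are assumptions made for illustration: the Shepherd's log is taken to be a list of `(txn_id, statements)` pairs in commit order, and the replica's transaction log table is taken to yield the IDs of the transactions it has committed.

```python
def transactions_to_replay(shepherd_log, replica_committed_ids):
    """Return the Shepherd's log entries the replica has not yet committed.

    shepherd_log: list of (txn_id, statements) in commit order (assumed shape).
    replica_committed_ids: IDs read from the replica's transaction log table.
    The commit order of the Shepherd's log is preserved in the result.
    """
    done = set(replica_committed_ids)
    return [entry for entry in shepherd_log if entry[0] not in done]


log = [
    (1, ["UPDATE t SET a = a + 1"]),
    (2, ["INSERT INTO t VALUES (5)"]),
    (3, ["DELETE FROM t WHERE a = 0"]),
]
print(transactions_to_replay(log, replica_committed_ids=[1]))
```

Replaying in the Shepherd's commit order keeps the recovering replica consistent with the commit-ordering rule the other replicas followed.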

  35. Faulty Primary • Wrong answers result in transaction abort • Concurrency Faults • Can result in secondaries being unable to make progress • System is back to “Correct but Slow” solution • Same case as when primary is not sufficiently blocking • Can be hard to tell if primary is faulty • Replace primary by doing a view change
