What Should the Design of Cloud-Based (Transactional) Database Systems Look Like?

Daniel Abadi, Yale University. March 17th, 2011.


Presentation Transcript


  1. What Should the Design of Cloud-Based (Transactional) Database Systems Look Like?
     Daniel Abadi, Yale University, March 17th, 2011

  2. Does the Cloud Force Us to Build New Database Systems?
     • Traditional vendors argue ‘no’
     • But there is increased desire for certain things:
     • Horizontal scalability to leverage cloud elasticity
     • Traditional solutions use high-end SANs, but this is not a cloudy concept
     • Shared-nothing is the most effective way to achieve horizontal scalability in the cloud
     • Virtualization can result in wild fluctuations of node performance
     • Do not want to operate at the speed of the slowest node
     • Virtual machines in the cloud have orders of magnitude higher mortality rates
     • Greedy owners like to murder the poor VMs
     • Fault tolerance must be treated as a first-class citizen

  3. The Problem With Traditional Database Solutions
     • Available shared-nothing solutions do not achieve high transactional throughput
     • Distributed concurrency control and commit protocols are expensive
     • Need more research to reduce this overhead
     • Database systems generally optimize everything in advance
     • Adaptive execution and optimization frameworks must be high on the research agenda
     • Database systems treat faults as a rare event
     • Machine failures cause transactions to abort
     • Recovery from a REDO log is slow

  4. Common Solutions
     • Drop the A or the C of ACID
     • Relaxing consistency makes replication easy and facilitates fault tolerance
     • Relaxing atomicity reduces (or eliminates) the need for distributed concurrency control
     • Examples: SimpleDB, BigTable (HBase), Cassandra, PNUTS, SQL Azure, sharded MySQL, etc.
     • Often called NoSQL systems
     • (Dropping ‘C’ also helps with CAP, but this is only part of the story)

  5. Whither ACID in the Cloud?
     • People still want ACID
     • Engineers at Google, Facebook, Amazon, Twitter, etc. are a very loud minority
     • NoSQL should not be the only option in the cloud
     • Needed research:
     • Building an ACID-compliant, horizontally scalable, fault-tolerant database for the cloud

  6. One Potential Idea
     • Get replication to work right out of the box
     • Today’s systems generally act, then replicate
     • Complicates the semantics of sending read queries to replicas
     • Need confirmation from a replica before commit (increased latency) if you want durability and high availability
     • In-progress transactions must be aborted upon a master failure
     • Want a system that replicates, then acts
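The "replicate, then act" idea on this slide can be sketched in a few lines of Python. This is a minimal single-process illustration, not the system described in the talk: the `Sequencer` and `Replica` classes, the `(key, delta)` transaction format, and the `demo` function are all hypothetical names introduced here. The point it tries to show is that if every replica receives the same input in the same order before anyone executes it, the replicas end up in the same state without shipping results between them.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica that first stores replicated input, then executes it."""
    name: str
    log: list = field(default_factory=list)    # replicated input, in agreed order
    state: dict = field(default_factory=dict)  # key -> integer value
    applied: int = 0                           # number of log entries executed

    def receive(self, txn):
        # Step 1 (replicate): only record the request, do not execute yet.
        self.log.append(txn)

    def act(self):
        # Step 2 (act): execute every logged request, in log order.
        while self.applied < len(self.log):
            key, delta = self.log[self.applied]
            self.state[key] = self.state.get(key, 0) + delta
            self.applied += 1


class Sequencer:
    """Hypothetical front end that fixes one global order of requests."""

    def __init__(self, replicas):
        self.replicas = replicas

    def submit(self, txn):
        # The request reaches every replica before any replica acts on it.
        for replica in self.replicas:
            replica.receive(txn)


def demo():
    a, b = Replica("a"), Replica("b")
    sequencer = Sequencer([a, b])
    for txn in [("x", 5), ("y", 2), ("x", -3)]:
        sequencer.submit(txn)
    a.act()
    b.act()
    return a.state, b.state
```

Because each replica executes the same log, a read sent to either replica after it has applied the same log prefix sees the same data, which is the property that "act, then replicate" systems have trouble guaranteeing.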

  7. Therefore …
     • Instead of weakening ACID, strengthen it!
     • Guaranteeing equivalence to SOME serial order makes active replication difficult
     • Running the same set of xacts on two different replicas might cause replicas to diverge
     • Disallow any nondeterministic behavior
     • Disallow aborts caused by DBMS
     • Disallow deadlock
     • Distributed commit much easier if you don’t have to worry about aborts
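One way to picture "disallow deadlock" and "equivalence to one predetermined serial order" is ordered lock granting. The sketch below is an assumption about how such a scheme could look, not the mechanism from the talk: `deterministic_schedule` is a hypothetical helper, and it assumes each transaction declares all the keys it touches up front. Locks on each key are granted strictly in the agreed transaction order, so the earliest pending transaction is always at the head of every queue it waits in and can always run.

```python
from collections import defaultdict, deque


def deterministic_schedule(txns):
    """Group transactions into rounds that can run concurrently.

    txns is a list of (txn_id, set_of_keys) in the agreed global order.
    Every replica given the same list computes the same rounds.
    """
    # One FIFO lock queue per key, filled in global transaction order.
    queues = defaultdict(deque)
    for txn_id, keys in txns:
        for key in keys:
            queues[key].append(txn_id)

    keys_of = dict(txns)
    pending = [txn_id for txn_id, _ in txns]
    rounds = []
    while pending:
        # A transaction may run once it heads the queue of every key it needs.
        runnable = [t for t in pending
                    if all(queues[k][0] == t for k in keys_of[t])]
        # The earliest pending transaction always qualifies: no deadlock.
        assert runnable
        for t in runnable:
            for k in keys_of[t]:
                queues[k].popleft()  # release the locks
        pending = [t for t in pending if t not in runnable]
        rounds.append(runnable)
    return rounds
```

For example, two transactions that each need keys `a` and `b` would classically risk deadlock if they locked in opposite orders; here the first in the global order always gets both locks first, and the second simply runs in the next round.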

  8. Consequences of Determinism
     • Replicas produce the same output, given the same input
     • Facilitates active replication
     • Only the initial input needs to be logged; state at failure can be reconstructed from this input log (or from a replica)
     • Active distributed xacts are not aborted upon node failure
     • Greatly reduces (or eliminates) the cost of distributed commit
     • Don’t have to worry about nodes failing during the commit protocol
     • Don’t have to worry about the effects of a transaction making it to disk before promising to commit the transaction
     • Just need one message from any node that can potentially deterministically abort the xact
     • This message can be sent in the middle of the xact, as soon as the node knows it will commit
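The claim that only the input needs to be logged can be illustrated with a small replay sketch. Everything here is hypothetical and chosen for illustration: the `(op, key, amount)` transaction format, the `apply_txn` and `replay` functions, and the rule that an overdrawing withdrawal is skipped (a deterministic, logic-driven abort, as opposed to an abort caused by the DBMS).

```python
def apply_txn(state, txn):
    """Apply one transaction deterministically to state (a dict)."""
    op, key, amount = txn
    if op == "deposit":
        state[key] = state.get(key, 0) + amount
    elif op == "withdraw":
        # Deterministic abort: depends only on the state and the input,
        # so every replica and every replay makes the same decision.
        if state.get(key, 0) >= amount:
            state[key] = state.get(key, 0) - amount
    return state


def replay(input_log):
    """Rebuild the database state from the logged input alone."""
    state = {}
    for txn in input_log:
        apply_txn(state, txn)
    return state


input_log = [
    ("deposit", "alice", 100),
    ("withdraw", "alice", 30),
    ("withdraw", "alice", 500),  # deterministically aborts
    ("deposit", "bob", 10),
]

# State of the node that was running when it "failed".
live_state = replay(input_log)
# A recovering node (or a second replica) replays the same input log.
recovered_state = replay(input_log)
```

Since `apply_txn` uses no clocks, random numbers, or thread timing, replaying the log always arrives at the same state, so no per-write REDO records are needed in this toy model.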

  9. If This Works Then …
     • Node failure does not cause transaction failure
     • Fault tolerance will be extremely high
     • Can run distributed transactions without an expensive commit protocol
     • Shared-nothing becomes much more attractive
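A rough sketch of what "distributed transactions without an expensive commit protocol" might look like follows. It is a single-process simulation under strong assumptions, and the `Node` class and `transfer` function are hypothetical names: only the node holding the source account can deterministically abort (insufficient funds), so it sends one one-way decision message to the other participant, with no voting round and no acknowledgements. It assumes a failed node recovers by replaying the input, so its effects are never lost.

```python
class Node:
    """A shared-nothing node holding some accounts and a message inbox."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.inbox = []


def transfer(src_node, dst_node, src, dst, amount):
    """Move amount from src (on src_node) to dst (on dst_node).

    Returns (committed, number_of_messages_sent).
    """
    # Only src_node can deterministically abort, so only it must speak.
    ok = src_node.balances.get(src, 0) >= amount
    dst_node.inbox.append(("decision", ok))  # the single one-way message
    messages = 1
    if ok:
        src_node.balances[src] -= amount

    # dst_node acts on the decision; it has no vote and sends no reply.
    _, decision = dst_node.inbox.pop(0)
    if decision:
        dst_node.balances[dst] = dst_node.balances.get(dst, 0) + amount
    return ok, messages
```

Compared with two-phase commit, where each participant votes and then waits for a coordinator's verdict, this toy version needs one message per node that could abort, which is the saving the slides point to.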
