Highly Available ACID Memory

Presentation Transcript


  1. Highly Available ACID Memory
     Vijayshankar Raman

  2. Introduction
     • Why ACID memory?
       • Non-database apps:
         • want updates to critical data to be atomic and persistent
         • synchronization is useful when multiple threads access critical data
       • Databases:
         • concurrency control and recovery logic run through most of the database code
         • extremely complicated and hard to get right
         • bugs lead to data loss, which is disastrous

  3. Project goal
     • Take recovery logic out of apps
     • Build a simple user-level library that provides recoverable, transactional memory
       • all the logic in one place => easy to debug and maintain
       • easy to make use of hardware advances
     • Use replication and persistent memory for recovery instead of writing logs
       • simpler to implement
       • simpler for applications to use ??

  4. Questions to answer
     • Program simplicity vs. performance
       • how much do we lose by replicating instead of logging?
     • On a cluster, can we use replication directly for availability?
       • traditionally, availability is handled on top of the recovery system

  5. Outline
     • Introduction
     • Acid Memory API
     • Single node design & implementation
       • Evaluation
     • High availability: multiple node design and implementation
       • Evaluation
     • Conclusion

  6. Acid Memory API
     • Transaction manager interface
       • TransactionManager(database name, acid memory area)
     • Transaction interface
       • beginTransaction()
       • getLock(memory region1, READ/WRITE)
       • getLock(memory region2, READ/WRITE)
       • ...
         • memory region = virtual address prefix
       • commit/abort() -- all locks released
     • Combines concurrency control with recovery
       • recovery is done on write-locked regions
       • supports fine-granularity locking => cannot use VM for recovery
     • Applications can modify data directly
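
One way to read "memory region = virtual address prefix" is sketched below. This is an illustration only: the slides do not show the library's actual data structures, so `Region`, `makeRegion`, `contains`, `overlaps`, and `conflicts` are hypothetical names, and 64-bit addresses are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

enum class LockMode { READ, WRITE };

// Hypothetical: a region is every address sharing the top `prefixBits` bits.
struct Region {
    std::uint64_t prefix;   // address with the low (64 - prefixBits) bits cleared
    unsigned prefixBits;    // 0..64; 0 means the whole address space
};

// Bit mask that keeps only the top `prefixBits` bits of an address.
inline std::uint64_t maskFor(unsigned prefixBits) {
    assert(prefixBits <= 64);
    return prefixBits == 64 ? ~0ULL : ~(~0ULL >> prefixBits);
}

inline Region makeRegion(std::uint64_t addr, unsigned prefixBits) {
    return Region{addr & maskFor(prefixBits), prefixBits};
}

// True if `addr` lies inside the region.
inline bool contains(const Region& r, std::uint64_t addr) {
    return (addr & maskFor(r.prefixBits)) == r.prefix;
}

// Two prefix regions overlap exactly when they agree on the shorter prefix.
inline bool overlaps(const Region& a, const Region& b) {
    const std::uint64_t m = maskFor(std::min(a.prefixBits, b.prefixBits));
    return (a.prefix & m) == (b.prefix & m);
}

// Overlapping locks conflict unless both are READ locks.
inline bool conflicts(const Region& a, LockMode ma, const Region& b, LockMode mb) {
    return overlaps(a, b) && (ma == LockMode::WRITE || mb == LockMode::WRITE);
}
```

Under this reading, a 52-bit prefix names a 4 KB region and a 56-bit prefix names a 256-byte region nested inside it, which is how a prefix scheme could support locks finer than a VM page.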

  7. Implementation
     [Figure labels: acid memory area (master copy), disk file, mmap, mirror]
     • Assume non-volatile memory (NVRAM, battery backup)
     • Assume a persistent file cache
       • acid memory area mmap'd from a file
       • persistence => writes are permanent
     • getLock(WRITE) -- copy the region onto the mirror area
     • Transaction abort / system crash
       • undo changes on all write-locked regions using the copy in the mirror area
     • The only recovery overhead is a memcpy on each write lock
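
The mirror-copy scheme on this slide can be sketched as follows. This is a minimal, single-threaded illustration, not the library's code: `SketchTransaction` and its methods are hypothetical names, ordinary heap memory stands in for the mmap'd non-volatile area, and the lock manager itself is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical sketch: undo via a mirror copy taken at write-lock time.
class SketchTransaction {
public:
    // Write lock: save the region's current bytes before the caller edits them.
    void getWriteLock(void* region, std::size_t size) {
        assert(region != nullptr);
        Saved s;
        s.addr = static_cast<unsigned char*>(region);
        s.mirror.assign(s.addr, s.addr + size);
        saved_.push_back(std::move(s));
    }

    // Commit: the in-place updates stand, so the mirror copies are dropped.
    void commit() { saved_.clear(); }

    // Abort: copy each mirror back over its region, newest lock first.
    void abort() {
        for (auto it = saved_.rbegin(); it != saved_.rend(); ++it) {
            if (!it->mirror.empty())
                std::memcpy(it->addr, it->mirror.data(), it->mirror.size());
        }
        saved_.clear();
    }

private:
    struct Saved {
        unsigned char* addr;
        std::vector<unsigned char> mirror;
    };
    std::vector<Saved> saved_;
};
```

Between `getWriteLock` and `commit` the application writes to the region directly; the one `memcpy` into the mirror is the only recovery work done on the normal path, which matches the slide's claim about overhead.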

  8. Evaluation
     • Overhead of acid memory
       • read lock: 35 usec (lock manager overhead)
       • write lock: 35 usec + 5.5 usec/KB (memcpy cost)
       • much lower than methods that write a log to disk
     • Ease of programming
       • an application only needs to acquire locks to become recoverable
       • it can manipulate the data directly, without calling a special function on every update
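
The measured overheads above amount to a simple linear cost model. The sketch below just encodes the two numbers from the slide; the function and constant names are made up for illustration.

```cpp
#include <cstddef>

// Figures taken from the slide; names are hypothetical.
constexpr double kLockManagerUsec = 35.0;  // fixed lock manager overhead
constexpr double kMemcpyUsecPerKB = 5.5;   // mirror copy cost per KB

// A read lock pays only the lock manager overhead.
inline double readLockCostUsec() { return kLockManagerUsec; }

// A write lock additionally pays for copying the region to the mirror.
inline double writeLockCostUsec(std::size_t regionBytes) {
    return kLockManagerUsec +
           kMemcpyUsecPerKB * (static_cast<double>(regionBytes) / 1024.0);
}
```

For example, write-locking a 1 KB region costs 35 + 5.5 = 40.5 usec under this model, and a 4 KB region costs 35 + 22 = 57 usec.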

  9. Example: suppose I want to transfer $1M from A’s account to B’s

     With ACID memory:

         /* a points to A’s account */
         /* b points to B’s account */
         trans = new Transaction(transMgr);
         trans->getLock(a, WRITE);
         trans->getLock(b, WRITE);
         *a = *a - 1000000;   /* update the locked memory in place */
         *b = *b + 1000000;
         trans->commit();

     Using logging:

         BeginTransaction();
         getLock(A’s account, WRITE);
         getLock(B’s account, WRITE);
         read(A’s account, a);
         read(B’s account, b);
         a = a - 1000000;
         b = b + 1000000;
         Update(A’s account, a);
         Update(B’s account, b);
         commit();

     (Update() creates the needed logs)

  10. Performance comparison: acid memory vs. logging
      • Consider a transaction updating integers in a 1KB data structure
        • Acid memory: write-lock the data structure
        • Logging: write-lock the structure and update each integer separately
      • Logging each individual update is a bit faster, to an extent
      • Acid memory gives okay performance with very easy programmability
      [Figure: time (in microseconds) vs. number of integer writes]

  11. Outline
      • Introduction
      • Acid Memory API
      • Single node design & implementation
        • Evaluation
      • High availability: multiple node design and implementation
        • Evaluation
      • Conclusion

  12. Replication for availability
      [Figure labels: transaction processing monitor, replicate, DBMS (x3)]
      • Traditionally, availability has been handled in a separate layer above recovery
      • Can we handle both recovery and availability via the same mechanism?

  13. Architecture
      [Figure labels: client, transaction handler, owner (lock manager, data), replicas (data)]
      • Transactions are run by a transaction handler
        • all lock requests must go to the owner
        • data in all replicas must be kept in sync
      • Balance load by partitioning data
        • different owner for each partition
      • Failure model
        • fail-stop: nodes never send incorrect messages to others
        • failed nodes never recover data after a crash
        • the network never fails

  14. [Figure labels: client, transaction handler, owner (lock manager, data), replicas (data)]
      • Reads: the client gets data from a random replica
      • Writes: must update all replicas
        • on commit, the transaction sends the new data to the owner
        • the owner propagates the update atomically to all replicas
          • three-phase non-blocking commit protocol: always ensure that someone can take over the propagation if you crash
        • if the owner crashes, fail over to a replica
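
The read-any / write-all behaviour with owner fail-over can be illustrated with a small in-process simulation. It deliberately does not implement the three-phase commit protocol or any networking: `Partition`, `Replica`, and their methods are hypothetical, and "propagating atomically" is reduced to a loop that cannot be interrupted in this single-threaded sketch.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one node's copy of a data partition.
struct Replica {
    bool alive = true;
    std::map<std::string, long> data;
};

class Partition {
public:
    // Replica 0 starts as the owner.
    explicit Partition(std::size_t n) : replicas_(n), owner_(0) {}

    // Commit path: the write goes through the owner to every live replica.
    // Returns false when no live owner exists.
    bool commitWrite(const std::string& key, long value) {
        if (!replicas_.at(owner_).alive) return false;
        for (auto& r : replicas_)
            if (r.alive) r.data[key] = value;
        return true;
    }

    // Read path: any live replica can serve a read.
    bool read(std::size_t i, const std::string& key, long& out) const {
        const Replica& r = replicas_.at(i);
        if (!r.alive) return false;
        auto it = r.data.find(key);
        if (it == r.data.end()) return false;
        out = it->second;
        return true;
    }

    // Fail-stop crash; if the owner dies, the first live replica takes over.
    void crash(std::size_t i) {
        replicas_.at(i).alive = false;
        if (i != owner_) return;
        for (std::size_t j = 0; j < replicas_.size(); ++j)
            if (replicas_[j].alive) { owner_ = j; break; }
    }

    std::size_t owner() const { return owner_; }

private:
    std::vector<Replica> replicas_;
    std::size_t owner_;
};
```

Because every live replica already holds the committed data, fail-over here only changes which node is the owner; nothing has to be replayed from a log.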

  15. Evaluation
      • Very fast recovery -- 424 usecs
      • Get fast transactions without non-volatile memory
      • Writes are slower
        • 4n messages at commit if there are n replicas
        • still, this is faster than logging to disk
      • Homogeneous software: susceptible to bugs

  16. Conclusions
      • Acid memory is easier to use
      • Performance relative to logging is not too bad
      • Replication gives fast recovery

      Future Work
      • Using cache for replication
      • When/how much to replicate?

  17. Additional Slides

  18. Evaluation, w.r.t. logging-based approach
      • Ease of implementation
        • very little to code, mostly lock manager code
        • whereas a traditional DBMS needs:
          • a specialized buffer manager
          • a log manager
          • a complex recovery mechanism

  19. How to make the file cache persistent
      • Rio (Chen et al., 1996)
        • place the file cache in non-volatile memory
        • protect it against OS crashes using VM protection
        • flush pages in the file cache to disk files on reboot
