
Recovery in Main Memory Databases

Recovery in Main Memory Databases. Le Gruenwald, Jing Huang, Margaret H. Dunham et al. Engineering Intelligent Systems, Vol. 4, No. 3, September 1996. 이 인선, 97/08/21.


Presentation Transcript


  1. Recovery in Main Memory Databases - Le Gruenwald, Jing Huang, Margaret H. Dunham et al. - Engineering Intelligent Systems, Vol. 4, No. 3, September 1996 - 이 인선, 97/08/21

  2. Introduction • General MMDB Architecture • Main Memory (MM): the database resides in RAM • Stable Memory (SM) • optional nonvolatile memory • used to hold log buffers (the log tail) • avoids I/O when transactions commit • essential to performance • Archive Memory (AM): holds a backup of the entire database • This presentation focuses on logging, checkpointing, and reloading

  3. MMDB Logging(1) • Physical logging • the database state modified by an operation is logged • recommended for MMDB systems • Logical logging • log records describe higher-level operations and record the state transitions of the database • the idempotent property does not hold
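The idempotence point can be made concrete with a small sketch. The record formats and function names below are assumptions chosen for illustration, not taken from the paper: a physical record carries the after image of an item, while a logical record carries the operation to re-execute.

```python
# Hypothetical record formats, for illustration only.

def redo_physical(db, record):
    """Physical record: (key, after_image). Replaying overwrites the item."""
    key, after_image = record
    db[key] = after_image

def redo_logical(db, record):
    """Logical record: (key, delta). Replaying re-executes the operation."""
    key, delta = record
    db[key] += delta

db_physical = {"x": 10}
db_logical = {"x": 10}

# The same update (x: 10 -> 15) is logged both ways and replayed twice,
# as might happen when recovery cannot tell whether the update had
# already reached the backup copy.
for _ in range(2):
    redo_physical(db_physical, ("x", 15))
    redo_logical(db_logical, ("x", 5))

print(db_physical["x"])  # 15: replaying a physical record is harmless
print(db_logical["x"])   # 20: replaying a logical record applies it twice
```

This is why physical logging pairs well with fuzzy checkpoints, where recovery may not know exactly which updates a backup image already contains.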

  4. MMDB Logging(2) • Logging rules • Write Ahead Rule • undo-log data must be written to nonvolatile memory before the update is applied to the database • Commit rule • a DBMS may allow a transaction to commit only once its redo-log data is safely in nonvolatile storage • Logging After Writing (LAW) • the after image of an updated item should be written to the log after the corresponding update is propagated to the database • simplifies log processing in an MMDB that uses fuzzy checkpointing
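The ordering imposed by the three rules can be sketched as follows. This is a minimal illustration under stated assumptions: `LogManager`, its fields, and the record tuples are hypothetical names, and a Python list stands in for the nonvolatile log buffer.

```python
class LogManager:
    """Toy model: 'stable_log' stands in for nonvolatile (stable) memory."""

    def __init__(self):
        self.stable_log = []
        self.db = {}  # the main-memory database

    def update(self, txn, key, new_value):
        # Write Ahead Rule: the before image reaches the stable log first.
        self.stable_log.append(("undo", txn, key, self.db.get(key)))
        self.db[key] = new_value
        # Logging After Writing: the after image is logged only after
        # the update has been applied to the database.
        self.stable_log.append(("redo", txn, key, new_value))

    def commit(self, txn):
        # Commit rule: the redo data is already in the stable log, so
        # committing adds a marker and needs no disk I/O in this model.
        self.stable_log.append(("commit", txn))


log = LogManager()
log.update("T1", "x", 1)
log.commit("T1")
record_kinds = [record[0] for record in log.stable_log]
print(record_kinds)  # ['undo', 'redo', 'commit']
```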

  5. MMDB Logging(3) • MMDB logging differs from disk-resident database (DRDB) logging in three ways • a nonvolatile log buffer should be used to satisfy the Write Ahead Rule (WAL) without requiring I/O prior to transaction commit • physical logging is recommended because it is easier to use with fuzzy checkpointing • the LAW policy should be followed to reduce the amount of log needed to redo transactions after a system failure

  6. Checkpointing DRDBs • Commit-consistent checkpointing • periodically stop processing transactions • flush all dirty cache slots and mark the log • Cache-consistent checkpointing • Fuzzy checkpointing • flushes only those dirty slots that have not been flushed since before the previous checkpoint • normal replacement activity will already have flushed most cache slots that were dirty since before the previous checkpoint • so the checkpoint has little flushing to do and does not delay active transactions for long

  7. Checkpointing MMDBs(1) • Focus on low interference with normal transactions and on supporting efficient recovery • Fuzzy checkpointing • Hagmann • first suggested using fuzzy checkpointing for MMDBs • “a crash recovery scheme for a memory-resident database system” • IEEE Transactions on Computers, Vol. C-35, No. 9, September 1986 • the checkpointer does not need to obtain locks on the data items being checkpointed • the database is dumped in sections • after dumping a section, the checkpointer writes a log record to the log • a section must not overwrite its previous image (sliding monoplexed backups)
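A rough sketch of a section-by-section fuzzy dump follows. It is one possible way to honor the "never overwrite the previous image" constraint, assuming a backup area with one spare slot; the function name, the slot bookkeeping, and the log record shape are all assumptions for illustration and are not Hagmann's actual design.

```python
def fuzzy_checkpoint(sections, backup_slots, slot_of, free_slot, log):
    """Dump each section without locks into the currently free backup slot.

    slot_of[i] is the slot holding the previous image of section i.
    Returns the slot left free after the checkpoint.
    """
    for section_id, section in enumerate(sections):
        backup_slots[free_slot] = dict(section)  # copy taken without locking
        log.append(("section_dumped", section_id, free_slot))
        old_slot = slot_of[section_id]
        slot_of[section_id] = free_slot
        free_slot = old_slot  # the previous image is released only now
    return free_slot


sections = [{"a": 1}, {"b": 2}]
backup_slots = [{"a": 0}, {"b": 0}, None]  # two old images plus one spare
slot_of = [0, 1]
log = []
free_after = fuzzy_checkpoint(sections, backup_slots, slot_of, 2, log)
print(slot_of, free_after)  # [2, 0] 1
```

If a crash interrupts the dump of a section, the previous image of that section is still intact in its old slot, which is the property the slide asks for.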

  8. LAW with fuzzy checkpointing

  9. Checkpointing MMDBs(2) • Salem and Garcia-Molina • “checkpointing memory-resident databases” (‘89) • compared the fuzzy checkpointing scheme with two non-fuzzy checkpointing schemes • fuzzy checkpointing is the most efficient of the three • ping-pong scheme • each dirty page is flushed twice • Lin and Dunham • “segmented fuzzy checkpointing for main memory databases” (‘94) • checkpoints one segment at a time in round-robin fashion • automatically adjusts the segment boundaries based on the distribution of update operations

  10. Checkpointing MMDBs(3) • [Figure: redo log size in segmented fuzzy checkpointing] • Li et al • “checkpointing and recovery in partitioned main memory databases” (‘95) • the database is divided into partitions, each of which has its own log disks • the time to recover from a system failure is reduced

  11. Checkpointing MMDBs(4) • Non-fuzzy checkpointing • overhead comes from locking the checkpointed objects to ensure transaction consistency or action consistency • Lehman and Carey • “a recovery algorithm for a high-performance memory-resident database system” (‘87) • transaction-consistent scheme (at relation level) • no need to maintain undo-log records in nonvolatile storage • checkpointing increases data contention with normal transactions

  12. Checkpointing MMDBs(5) • Salem and Garcia-Molina • “checkpointing memory-resident databases” (‘89) • discuss two non-fuzzy checkpointing approaches • the first (black and white) aborts some update transactions • the second (Copy-On-Update) requires some update transactions to store the original values of the data items they update • both have a severe impact on system performance • Jagadish et al • “recovering from main-memory lapses” (‘93) • propose an action-consistent checkpointing scheme • the undo logs of active transactions are first written to the log, and then dirty pages are flushed to disk • during normal processing, the redo logs of committed transactions are written to the log • ping-pong update • this approach was originally used in Dali
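The Copy-On-Update idea can be sketched as below. This is a simplified illustration, not the scheme as specified by Salem and Garcia-Molina: the class and method names are hypothetical, the snapshot is taken in one step rather than page by page, and updates are assumed to touch existing keys only.

```python
class CopyOnUpdateStore:
    def __init__(self, data):
        self.data = dict(data)
        self.saved = {}  # original values preserved for the checkpointer
        self.checkpointing = False

    def begin_checkpoint(self):
        self.saved = {}
        self.checkpointing = True

    def update(self, key, value):
        # The first update of an item during a checkpoint pays for an
        # extra copy of the original value; this is the overhead the
        # slide refers to.
        if self.checkpointing and key not in self.saved:
            self.saved[key] = self.data[key]
        self.data[key] = value

    def end_checkpoint(self):
        # Saved originals take precedence, so the snapshot reflects the
        # state at begin_checkpoint().
        snapshot = {**self.data, **self.saved}
        self.checkpointing = False
        self.saved = {}
        return snapshot


store = CopyOnUpdateStore({"x": 1, "y": 2})
store.begin_checkpoint()
store.update("x", 10)
store.update("x", 11)
snapshot = store.end_checkpoint()
print(snapshot)    # {'x': 1, 'y': 2}
print(store.data)  # {'x': 11, 'y': 2}
```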

  13. Checkpointing MMDBs(6) • Log-driven checkpointing • applies the log to a previous dump to generate a new dump • originally used to generate a remote backup of the database • adopted in “incremental recovery in main memory database systems” (‘92) • with the high transaction processing rates of MMDBs, the log can grow rapidly • quite inefficient compared to fuzzy checkpointing
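In its simplest form, log-driven checkpointing amounts to replaying redo records over the old dump. The sketch below assumes physical redo records of the form `(key, after_image)`; the function name and record shape are illustrative assumptions.

```python
def log_driven_checkpoint(previous_dump, redo_log):
    """Build a new dump by applying redo records, in log order, to the old one."""
    new_dump = dict(previous_dump)
    for key, after_image in redo_log:
        new_dump[key] = after_image
    return new_dump


old_dump = {"x": 1, "y": 2}
redo_log = [("x", 5), ("z", 7), ("x", 9)]
new_dump = log_driven_checkpoint(old_dump, redo_log)
print(new_dump)  # {'x': 9, 'y': 2, 'z': 7}
```

The cost is proportional to the length of the log rather than to the number of dirty pages, which is why a fast-growing log makes this approach expensive.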

  14. MMDB Reloading(1) • Issues • how often the reload process occurs • on average, a system failure occurs once every few weeks • media failures, MM page faults • when the system should resume execution after a failure • 28.43 minutes are needed to recover a 1 GB database [?] • if the system is not available at all during recovery, many transactions will be backlogged • reload prioritization • reload priority can be determined from access frequency, transaction deadline (“MMDB reload algorithms”), or temporal data intervals from real-time applications [?]
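Prioritizing by access frequency can be sketched as a simple ordering of pages. The page identifiers and counts below are made up for illustration, and the tie-breaking rule (by page id) is an assumption.

```python
def reload_order(access_counts):
    """Return page ids ordered from most to least frequently accessed."""
    return sorted(access_counts, key=lambda page: (-access_counts[page], page))


access_counts = {"p1": 3, "p2": 10, "p3": 3}
order = reload_order(access_counts)
print(order)  # ['p2', 'p1', 'p3']
```

A deadline-based or temporal-interval policy would swap the sort key while keeping the same structure.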

  15. MMDB Reloading(2) • Existing reload schemes • simple reloading • the system cannot be brought online until the entire database is memory-resident • concurrent reloading • Gruenwald • “mmdb reload algorithms” (‘91) • two processors (RP & DP), nonvolatile shadow memory (SM), and a dual address translation mechanism in the MARS system • ordered reload with prioritization / smart reload / frequency reload • the differences lie in the structure of AM, the use of data access frequency, reload prioritization, and reload granularity • frequency reload yields the best transaction response time and system throughput

  16. MMDB Reloading(3) • Lehman • “a recovery algorithm for a high-performance” • regular transaction processing is allowed to resume once the system catalogs and their indices are reloaded • Levy and Silberschatz • “incremental recovery in main memory database systems” (‘92) • resumes transaction processing immediately after a system failure and recovers pages individually according to the demands of post-crash transactions • stale/fresh marking technique • to implement page-based recovery, log records must be grouped on a per-page basis during normal operation
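On-demand, page-based recovery with stale/fresh marking might look like the following sketch. It is a simplified illustration and not Levy and Silberschatz's algorithm: the class name and data layout are assumptions, a page counts as "fresh" once it is present in memory, and the log is assumed to be grouped per page already.

```python
class IncrementalRecovery:
    def __init__(self, backup_pages, log_by_page):
        self.backup = backup_pages      # page_id -> backup image
        self.log_by_page = log_by_page  # page_id -> [(key, after_image), ...]
        self.memory = {}                # fresh (recovered) pages only

    def read(self, page_id):
        if page_id not in self.memory:  # page is stale: recover it now
            page = dict(self.backup[page_id])
            for key, after_image in self.log_by_page.get(page_id, []):
                page[key] = after_image
            self.memory[page_id] = page  # mark as fresh
        return self.memory[page_id]


recovery = IncrementalRecovery(
    backup_pages={"p1": {"x": 1}, "p2": {"y": 2}},
    log_by_page={"p1": [("x", 5)]},
)
page = recovery.read("p1")
print(page)                     # {'x': 5}
print(sorted(recovery.memory))  # ['p1']  (p2 stays stale until touched)
```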

  17. Recovery with Existing MMDB Systems(1) • Dali from AT&T • the original recovery manager was implemented according to “recovering from main-memory lapses” (‘93) • logs only redo records during normal execution • segment-level action-consistent checkpoints • the checkpointer writes the relevant parts of the undo log to disk • recovery makes only a single pass over the log • requires no special hardware to preserve the data • testing led to a restructuring of its recovery manager • “multi-level recovery in the Dali storage manager” (‘95) • multi-level logging, post-commit actions, dirty page detection, and fuzzy checkpoints

  18. Recovery with Existing MMDB Systems(2) • Fast Path • supports both memory-resident and disk-resident data • performs updates to memory-resident data at commit time • no undo operations are required when a failure occurs • group commit is used • a transaction-consistent backup copy of the database is refreshed during system shutdown or at infrequent checkpoints • two backup databases with ping-pong backups
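Group commit batches the commit records of several transactions into one log write. The sketch below is a generic illustration of the technique, not Fast Path's implementation; the class name, the fixed group size, and the list standing in for log I/O are assumptions.

```python
class GroupCommitLog:
    def __init__(self, group_size):
        self.group_size = group_size
        self.pending = []  # commit records waiting for the next log write
        self.flushes = []  # each entry models one log I/O

    def commit(self, txn):
        self.pending.append(txn)
        if len(self.pending) >= self.group_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flushes.append(list(self.pending))
            self.pending.clear()


group_log = GroupCommitLog(group_size=3)
for i in range(7):
    group_log.commit(f"T{i}")
print(len(group_log.flushes))  # 2 log writes for the first 6 transactions
print(group_log.pending)       # ['T6'] still waiting for its group
```

A real system would also flush on a timer so that a lone transaction is not left waiting indefinitely.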

  19. Recovery with Existing MMDB Systems(3) • two real-time system examples • NEC Real-Time DBMS • Stone RTDB • NEC RTDBMS has several features to ensure high throughput and accurate predictability • no page faults • in-memory log buffer is nonvolatile • physical logging using deferred update • fuzzy checkpointing • real-time characteristics such as transaction deadline and criticality are not used in the recovery components

  20. Summary and Conclusion • Discussed 3 logging rules • nonvolatile log buffer should be used to satisfy WAL without requiring I/O prior to transaction commit • LAW should be followed to reduce the amount of log needed to redo transactions after a system failure • described three groups of checkpointing • identified 3 issues about reloading • data should be prioritized for reload purposes • future research • investigate how real-time requirements such as transaction deadline and temporal data intervals can be incorporated into MMDB recovery

  21. A crash recovery scheme for a memory-resident database system • Robert B. Hagmann • IEEE Transactions on Computers, Vol. C-35, No. 9, September 1986

  22. Overview • Presents a recovery method that uses the existing techniques of fuzzy dumps and log compression • design requirements • small system example • 2 pages/transaction * 100 transactions/s * 3600 s/h * 8 h = 5,760,000 pages written to the log • transactions must be short • the database is checkpointed periodically, every five minutes
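The log volume in the small system example can be checked directly. The figures come from the slide; the variable names, and the derived per-checkpoint figure, are added here for illustration.

```python
# Figures from the slide; names are illustrative only.
pages_per_transaction = 2
transactions_per_second = 100
seconds_per_hour = 3600
hours_of_operation = 8

log_pages_total = (pages_per_transaction * transactions_per_second
                   * seconds_per_hour * hours_of_operation)
print(log_pages_total)  # 5760000

# Log pages accumulated between two checkpoints taken five minutes apart.
log_pages_per_interval = pages_per_transaction * transactions_per_second * 5 * 60
print(log_pages_per_interval)  # 60000
```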

  23. Overview(2) • The principal requirement of the system is “fast” recovery from a system crash • critical factor: the transfer rate of the disk • can be improved by using several parallel processors • design overview • fuzzy dump • simply a copy of the database taken without any synchronization • if the DBMS uses nonvolatile storage, some log compression can occur • otherwise, precommitting and group commits can be used to increase performance

  24. Overview(3) • Design details
