
ICS 214A: Database Management Systems Fall 2002

ICS 214A: Database Management Systems, Fall 2002. Lecture 17: Checkpoints. Professor Chen Li. Recovery is very, very slow when the undo log stretches from a first record written a year ago to the last record before the crash. We do not want to rescan all the log records, and some of them can be removed. Solution: checkpoints.


Presentation Transcript


  1. ICS 214A: Database Management Systems Fall 2002 Lecture 17: Checkpoints Professor Chen Li

  2. Recovery is very, very SLOW! The undo log runs from its first record (1 year ago) to its last record, written just before the crash. We do not want to rescan all the log records! Some of them can be removed. Notes 17

  3. Solution: Checkpoint (simple version). Periodically:
     (1) Do not accept new transactions (“quiescent”)
     (2) Wait until all current transactions finish
     (3) Flush all log records to disk
     (4) Flush all data buffers to disk
     (5) Write log record <CKPT> and flush the log
     (6) Resume accepting transactions

  4. Example: Undo log, quiescent ckpt
     Log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10>
     Do a checkpoint at this point:
     • Wait until both T1 and T2 finish (commit or abort);
     • Then flush the data and log, and write <CKPT> to the log.
     Final log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <T2, C, 15> <T1, D, 20> <T1, COMMIT> <T2, COMMIT> <CKPT> …

  5. Recovery: Undo log, quiescent ckpt
     Log after a crash: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <T2, C, 15> <T1, D, 20> <T1, COMMIT> <T2, COMMIT> <CKPT> <T3, START> <T3, E, 25> <T3, F, 30>
     • Scan the log backwards from the end and identify the incomplete transactions
     • Once we see a <CKPT> record, ignore all records before it
     • Why? All transactions that started before this ckpt must have finished.
     • Other operations are the same as before
     • Example:
       • T3 is the only incomplete transaction
       • Undo F and E (restore F to 30 and E to 25), then write <T3, ABORT>
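The backward scan on this slide can be sketched in Python. This is only an illustration, not code from the lecture: the tuple encoding of log records, the function name `undo_recover_quiescent`, and the on-disk values in `db` are all assumptions made for the example.

```python
def undo_recover_quiescent(log, db):
    """Undo-log recovery with a quiescent checkpoint (illustrative sketch).

    log: list of tuples (hypothetical encoding):
         ("START", txn), ("UPDATE", txn, element, old_value),
         ("COMMIT", txn), ("ABORT", txn), ("CKPT",)
    db:  dict mapping element name -> value on disk; modified in place.
    Returns the <T, ABORT> records that recovery would append to the log.
    """
    finished = set()   # transactions with a COMMIT or ABORT record
    incomplete = []    # incomplete transactions, in the order they are met
    for rec in reversed(log):
        kind = rec[0]
        if kind == "CKPT":
            break      # everything before a quiescent <CKPT> has finished
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "UPDATE":
            _, txn, elem, old_value = rec
            if txn not in finished:
                db[elem] = old_value          # restore the old value
                if txn not in incomplete:
                    incomplete.append(txn)
        elif kind == "START":
            txn = rec[1]
            if txn not in finished and txn not in incomplete:
                incomplete.append(txn)
    return [("ABORT", txn) for txn in incomplete]


# The crash log from the slide.
crash_log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 5),
    ("START", "T2"), ("UPDATE", "T2", "B", 10),
    ("UPDATE", "T2", "C", 15), ("UPDATE", "T1", "D", 20),
    ("COMMIT", "T1"), ("COMMIT", "T2"), ("CKPT",),
    ("START", "T3"), ("UPDATE", "T3", "E", 25), ("UPDATE", "T3", "F", 30),
]
# Made-up on-disk values at crash time (not from the slides).
db = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
aborts = undo_recover_quiescent(crash_log, db)
```

Only E and F are touched, because the scan stops at `<CKPT>` and never looks at the records of T1 and T2.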

  6. Nonquiescent checkpoint (undo)
     • We don’t want the system to “halt” to do a checkpoint
     • How can we accept xacts during a checkpoint?
       (1) Write (flush) the log record <START CKPT(T1,…,Tk)>, where T1,…,Tk are the active (not finished) transactions.
       (2) Wait for them to finish (commit or abort). Meanwhile, accept new transactions.
       (3) After these k transactions complete, write (flush) the log record <END CKPT>.

  7. Ex: Undo log, nonquiescent ckpt
     Undo log (annotations in parentheses):
     <T1, START> <T1, A, 5> <T2, START> <T2, B, 10>
     <START CKPT(T1,T2)> (start checkpointing)
     <T2, C, 15> <T3, START> <T1, D, 20> <T1, COMMIT> <T3, E, 25> <T2, COMMIT> (continue and accept new xacts until T1 and T2 complete)
     <END CKPT> (end checkpointing)
     <T3, F, 30> (continue)

  8. Recovery: Undo log, nonquiescent ckpt
     • Scan the log backwards from the end
     • Case 1: we meet an <END CKPT> first
       • Then all incomplete xacts began after the previous <START CKPT(…)> log record
       • Thus we only need to scan backwards as far as that <START CKPT(…)> log record
       • Ignore the log before this record
     • Ex:
       • T3 is the only incomplete xact, and should be undone
       • Restore data element F back to 30 and E back to 25.
     Log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <START CKPT(T1,T2)> <T2, C, 15> <T3, START> <T1, D, 20> <T1, COMMIT> <T3, E, 25> <T2, COMMIT> <END CKPT> <T3, F, 30>

  9. Recovery: Undo log, nonquiescent ckpt, case 2
     • Scan the log backwards from the end
     • Case 2: we meet a <START CKPT(T1,…,Tk)> first
       • Then the incomplete xacts are:
         • Those incomplete xacts we met before reaching this <START CKPT()> log record; and
         • Those of (T1,…,Tk) that are incomplete
       • Thus we need to scan back to the start of the earliest incomplete xact
       • Discard the log records before that point
       • Undo the incomplete xacts
     • Ex:
       • Incomplete xacts: (T2, T3)
       • T1 is complete!
       • Scan until the start of T2 (the earliest)
     Log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <START CKPT(T1,T2)> <T2, C, 15> <T3, START> <T1, D, 20> <T1, COMMIT> <T3, E, 25>
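Both cases of undo recovery with a nonquiescent checkpoint fit in one backward scan. The following Python sketch is an illustration under assumptions: the tuple encoding of log records, the function name `undo_recover_nonquiescent`, and the on-disk values used in the example are not from the lecture.

```python
def undo_recover_nonquiescent(log, db):
    """Undo-log recovery with a nonquiescent checkpoint (illustrative sketch).

    Records (hypothetical encoding): ("START", txn),
    ("UPDATE", txn, element, old_value), ("COMMIT", txn), ("ABORT", txn),
    ("START_CKPT", (txn, ...)), ("END_CKPT",).
    db is modified in place; returns the ABORT records to append.
    """
    finished = set()       # transactions with COMMIT or ABORT
    incomplete = []        # incomplete transactions, in the order met
    seen_end_ckpt = False
    pending = None         # case 2: checkpointed xacts whose START we still need
    for rec in reversed(log):
        kind = rec[0]
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "UPDATE":
            _, txn, elem, old_value = rec
            if txn not in finished:
                db[elem] = old_value
                if txn not in incomplete:
                    incomplete.append(txn)
        elif kind == "START":
            txn = rec[1]
            if txn not in finished and txn not in incomplete:
                incomplete.append(txn)
            if pending is not None:
                pending.discard(txn)
                if not pending:
                    break  # reached the start of the earliest incomplete xact
        elif kind == "END_CKPT":
            if pending is None:
                seen_end_ckpt = True
        elif kind == "START_CKPT" and pending is None:
            if seen_end_ckpt:
                break      # case 1: stop at the matching <START CKPT>
            # Case 2: keep scanning for the unfinished checkpointed xacts.
            pending = {t for t in rec[1] if t not in finished}
            if not pending:
                break
    return [("ABORT", txn) for txn in incomplete]


prefix = [
    ("START", "T1"), ("UPDATE", "T1", "A", 5),
    ("START", "T2"), ("UPDATE", "T2", "B", 10),
    ("START_CKPT", ("T1", "T2")),
    ("UPDATE", "T2", "C", 15), ("START", "T3"), ("UPDATE", "T1", "D", 20),
    ("COMMIT", "T1"), ("UPDATE", "T3", "E", 25),
]
case1_log = prefix + [("COMMIT", "T2"), ("END_CKPT",), ("UPDATE", "T3", "F", 30)]
case2_log = prefix

# Made-up on-disk values at crash time (not from the slides).
db1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
db2 = dict(db1)
aborts1 = undo_recover_nonquiescent(case1_log, db1)
aborts2 = undo_recover_nonquiescent(case2_log, db2)
```

In case 1 only T3 is undone (E and F are restored). In case 2 both T3 and T2 are undone, and the scan stops at `<T2, START>` without ever reading the records of T1 that precede it.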

  10. Improvement
      Log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <START CKPT(T1,T2)> <T2, C, 15> <T3, START> <T1, D, 20> <T1, COMMIT> <T3, E, 25>
      • Use pointers to chain together the log records of the same xact
      • Then we can follow the chain to find the “start” record of this xact.

  11. General rule: Undo log, nonquiescent ckpt
      Log: <T1, START> <T1, A, 5> <T2, START> <T2, B, 10> <START CKPT(T1,T2)> <T2, C, 15> <T3, START> <T1, D, 20> <T1, COMMIT> <T3, E, 25> <T2, COMMIT> <END CKPT> <T3, F, 30>
      • Once an <END CKPT> record has been written to disk, we can delete the part of the log that precedes the matching (most recent earlier) <START CKPT> record
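The truncation rule can be sketched as a small Python function. The tuple encoding of records and the name `truncate_undo_log` are assumptions made for illustration.

```python
def truncate_undo_log(log):
    """Drop the log prefix made obsolete by the last completed checkpoint.

    Records use a hypothetical tuple encoding, e.g. ("START_CKPT", (...))
    and ("END_CKPT",). Returns a new list; the input is left unchanged.
    """
    end_positions = [i for i, rec in enumerate(log) if rec[0] == "END_CKPT"]
    if not end_positions:
        return list(log)            # no completed checkpoint: keep everything
    # Walk back from the last <END CKPT> to its matching <START CKPT>.
    for i in range(end_positions[-1] - 1, -1, -1):
        if log[i][0] == "START_CKPT":
            return list(log[i:])    # keep from <START CKPT> onwards
    return list(log)


slide_log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 5),
    ("START", "T2"), ("UPDATE", "T2", "B", 10),
    ("START_CKPT", ("T1", "T2")),
    ("UPDATE", "T2", "C", 15), ("START", "T3"), ("UPDATE", "T1", "D", 20),
    ("COMMIT", "T1"), ("UPDATE", "T3", "E", 25), ("COMMIT", "T2"),
    ("END_CKPT",), ("UPDATE", "T3", "F", 30),
]
kept = truncate_undo_log(slide_log)
```

For the log on this slide, the four records before `<START CKPT(T1,T2)>` are dropped. This is safe because every transaction active at the checkpoint has finished by the time `<END CKPT>` is written.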

  12. Next: checkpoint in Redo Logging

  13. Complications
      • For a xact whose <COMMIT> log record is already on disk, its changed data elements may be copied to disk much later
      • Thus, between a <START CKPT> and an <END CKPT>:
        • We must write to disk all DB elements that have been modified by committed xacts but not yet written to disk
        • We need to keep track of all the dirty buffers
      • We can complete the ckpt without waiting for the active (not yet completed) xacts to commit or abort, since they are not allowed to write their pages to disk at that time anyway

  14. Nonquiescent checkpoint (redo)
      (1) Write (flush) the log record <START CKPT(T1,…,Tk)>, where T1,…,Tk are the active (uncommitted) xacts.
      (2) Write to disk all DB elements that were written to buffers, but not yet to disk, by xacts that had already committed when the <START CKPT> record was written to the log
      (3) Write (flush) the log record <END CKPT>.
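The three checkpoint steps above can be sketched in Python under a deliberately simple model. Everything here is an assumption for illustration: the buffer pool is a dict mapping each dirty element to `(value, writer_txn)`, the function name `redo_checkpoint` is hypothetical, and the whole checkpoint runs in one call, whereas a real system keeps accepting transactions between the two checkpoint records.

```python
def redo_checkpoint(log, dirty, disk, active, committed):
    """Nonquiescent checkpoint for a redo log (illustrative sketch).

    log:       list of record tuples, appended to in place
    dirty:     dict element -> (value, writer_txn) for buffers not yet on disk
    disk:      dict element -> value, updated in place
    active:    transactions that are uncommitted right now
    committed: transactions whose COMMIT record is already in the log
    """
    # Step 1: record which transactions are active.
    log.append(("START_CKPT", tuple(active)))
    # Only transactions committed at this moment are covered.
    committed_at_start = set(committed)
    # Step 2: flush dirty elements written by those committed transactions.
    for elem in list(dirty):
        value, writer = dirty[elem]
        if writer in committed_at_start:
            disk[elem] = value
            del dirty[elem]
    # Step 3: mark the checkpoint as complete.
    log.append(("END_CKPT",))


log = [("START", "T1"), ("UPDATE", "T1", "A", 5),
       ("START", "T2"), ("COMMIT", "T1")]
# A=5 was written by committed T1. The T2 entry is placed in the buffer
# pool purely to show that uncommitted data stays in memory.
dirty = {"A": (5, "T1"), "C": (15, "T2")}
disk = {"A": 0, "C": 0}   # made-up old on-disk values
redo_checkpoint(log, dirty, disk, active=["T2"], committed=["T1"])
```

After the call, A=5 is on disk while the change by T2 remains only in the buffer pool, which matches the redo rule that uncommitted changes never reach disk.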

  15. Ex: redo, checkpoint, nonquiescent
      Redo log (annotations in parentheses):
      <T1, START> <T1, A, 5> <T2, START> <T1, COMMIT>
      <START CKPT(T2)> (start checkpoint)
      <T2, C, 15> <T3, START> <T3, D, 20> (continue, accept new xacts, and make sure A=5 written by T1 is on disk)
      <END CKPT> (end checkpoint)
      <T2, COMMIT> <T3, COMMIT> (continue)

  16. Recovery: redo, nonquiescent (case 1)
      • Scan the log backwards
      • Case 1: <END CKPT> is seen before <START CKPT(T1,…,Tk)>
        • All xacts that committed before <START CKPT> have their data element changes on disk. These xacts can be ignored
        • Xacts T1,…,Tk, and the new xacts that started after <START CKPT>, need to be redone if they have committed
        • Find the earliest of the <START Ti> records
        • Can use pointers to improve the performance
      • Ex:
        • T2 and T3 need to be considered
        • Since both have “COMMIT” records, both need to be redone
      Redo log: <T1, START> <T1, A, 5> <T2, START> <T1, COMMIT> <START CKPT(T2)> <T2, C, 15> <T3, START> <T3, D, 20> <END CKPT> <T2, COMMIT> <T3, COMMIT>

  17. Recovery: redo, nonquiescent (case 1)
      • Ex:
        • T2 and T3 need to be considered
        • Since T2 has a “COMMIT” record, it needs to be redone
        • T3 has no “COMMIT” record, so it can be ignored
      Redo log: <T1, START> <T1, A, 5> <T2, START> <T1, COMMIT> <START CKPT(T2)> <T2, C, 15> <T3, START> <T3, D, 20> <END CKPT> <T2, COMMIT>

  18. Recovery: redo, nonquiescent (case 2)
      • Scan the log backwards
      • Case 2: <START CKPT(T1,…,Tk)> is seen before <END CKPT>
        • We cannot be sure that xacts committed before this <START CKPT> have their data element changes on disk.
        • Need to find the previous <START CKPT(S1,…,Sm)>
        • Redo the committed xacts that started after the previous <START CKPT> or are among the Si’s
      • Ex:
        • Look for the previous <START CKPT>
        • T0 and T1 are the committed xacts, so they need to be redone
        • T2 and T3 are ignored
      Redo log: <START CKPT(T0)> … <T0, COMMIT> … <END CKPT(T2)> <T1, START> <T1, A, 5> <T2, START> <T1, COMMIT> <START CKPT(T2)> <T2, C, 15> <T3, START> <T3, D, 20>
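Both redo cases reduce to the same rule: find the last completed checkpoint (an `<END CKPT>` and the `<START CKPT>` before it), then redo the committed xacts that were active at that `<START CKPT>` or started after it. The Python sketch below is an illustration under assumptions: the tuple encoding, the function name `redo_recover_nonquiescent`, and the on-disk values are not from the lecture. The last example uses only the visible tail of the case 2 log, leaving out the elided prefix, so no earlier checkpoint exists and every committed xact is redone.

```python
def redo_recover_nonquiescent(log, db):
    """Redo-log recovery with a nonquiescent checkpoint (illustrative sketch).

    Records (hypothetical encoding): ("START", txn),
    ("UPDATE", txn, element, new_value), ("COMMIT", txn),
    ("START_CKPT", (txn, ...)), ("END_CKPT",).
    db is modified in place; returns the sorted list of redone transactions.
    """
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    # Locate the last completed checkpoint, if there is one.
    start_idx, active = 0, None
    end_positions = [i for i, rec in enumerate(log) if rec[0] == "END_CKPT"]
    if end_positions:
        for i in range(end_positions[-1] - 1, -1, -1):
            if log[i][0] == "START_CKPT":
                start_idx, active = i, set(log[i][1])
                break
    if active is None:
        candidates = set(committed)   # no usable checkpoint: redo all committed
    else:
        started_after = {r[1] for r in log[start_idx:] if r[0] == "START"}
        candidates = (active | started_after) & committed
    # Replay forward from the earliest START record of a candidate.
    first = next((i for i, r in enumerate(log)
                  if r[0] == "START" and r[1] in candidates), len(log))
    for rec in log[first:]:
        if rec[0] == "UPDATE" and rec[1] in candidates:
            db[rec[2]] = rec[3]       # write the new value again
    return sorted(candidates)


base = [
    ("START", "T1"), ("UPDATE", "T1", "A", 5), ("START", "T2"),
    ("COMMIT", "T1"), ("START_CKPT", ("T2",)),
    ("UPDATE", "T2", "C", 15), ("START", "T3"), ("UPDATE", "T3", "D", 20),
]
log_both = base + [("END_CKPT",), ("COMMIT", "T2"), ("COMMIT", "T3")]
log_t2_only = base + [("END_CKPT",), ("COMMIT", "T2")]
log_case2_tail = base                 # crash before <END CKPT>

# Made-up on-disk values at crash time (not from the slides).
db_a = {"A": 5, "C": 0, "D": 0}
db_b = {"A": 5, "C": 0, "D": 0}
db_c = {"A": 0, "C": 0, "D": 0}
redone_a = redo_recover_nonquiescent(log_both, db_a)
redone_b = redo_recover_nonquiescent(log_t2_only, db_b)
redone_c = redo_recover_nonquiescent(log_case2_tail, db_c)
```

The sketch finds the earliest START record with a linear search; the pointer chains mentioned earlier in the lecture would avoid that search.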

  19. Next: Redo/Undo logging
