The Vagabond Approach to Logging and Recovery in Transaction-Time Temporal Object Database Systems


Presentation Transcript


  1. The Vagabond Approach to Logging and Recovery in Transaction-Time Temporal Object Database Systems • A presentation by Sean Matthews

  2. Introduction – Log-only? Temporal? Transaction-time? • Log-only log management • Log-only database operations

  3. Introduction • Nørvåg, Kjetil. "The Vagabond Approach to Logging and Recovery in Transaction-Time Temporal Object Database Systems". IEEE Transactions on Knowledge and Data Engineering, April 2004, pp. 504-518. • Why log-only? • Fast recovery • More benefit from RAID technology • Previous versions of objects are easy to keep • The Vagabond approach • Efficient storage and management of temporal objects • Supports steal/no-force buffer management, fuzzy checkpointing, and fast commit

  4. Introduction • Steal/no-force buffer management: • Steal: When the buffer manager needs space, it may replace dirty pages, including pages modified by transactions that have not yet committed • No-force: Modified pages do not have to be written to disk when a transaction commits • Fast commit • Faster processing of database transactions • Fuzzy checkpointing • Transaction processing resumes as soon as the checkpoint record has been written to the log • It does not wait until the buffers have been written to the log

  5. Introduction • Log-only: the log is written contiguously to disk in large blocks, without overwriting existing data • In the log-only approach, the log is the final repository for data. • Current object database systems update data in place: updated pages are written back to the same location. • Temporal objects: • Objects that change over time • These objects have current and previous versions • Transaction-time: • The time during which a fact is current in the database and may be retrieved • Transaction times are not time instants; they have duration • A database object is stored at some point in time and is current until it is logically deleted

  6. Log-only log management • Data that is already written is never modified. • Rather, new versions of the already-existing objects are appended to the log.

  7. Log-only log management • Objects • Uniquely identified by an object identifier (OID) • An updated object has the same OID as its previous version • Timestamp: the commit timestamp of the transaction that created or updated the actual object version • Object descriptor (OD): contains OID, physical location of object in log, commit timestamp, end timestamp, and other information. • Stored in the OID index (OIDX) and used for mapping from OID (and/or timestamp) to physical location
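
A minimal sketch of the object descriptor and the OID index described on this slide. The class names, field names, and the half-open interval rule (a version is valid from its commit timestamp up to, but not including, its end timestamp) are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ObjectDescriptor:
    oid: int                       # same OID for every version of an object
    location: Optional[int]        # physical position in the log
    commit_ts: int                 # commit time of the creating/updating transaction
    end_ts: Optional[int] = None   # None while this version is still current


class OIDIndex:
    """Hypothetical in-memory stand-in for the OIDX."""

    def __init__(self):
        self._versions: Dict[int, List[ObjectDescriptor]] = {}

    def install(self, od: ObjectDescriptor) -> None:
        # Versions are assumed to be installed in commit order.
        self._versions.setdefault(od.oid, []).append(od)

    def current(self, oid: int) -> Optional[ObjectDescriptor]:
        versions = self._versions.get(oid, [])
        return versions[-1] if versions else None

    def lookup(self, oid: int, ts: int) -> Optional[ObjectDescriptor]:
        # Map (OID, timestamp) to the descriptor valid at that time.
        for od in self._versions.get(oid, []):
            if od.commit_ts <= ts and (od.end_ts is None or ts < od.end_ts):
                return od
        return None
```

The index keeps current and previous descriptors together, mirroring the single index structure mentioned on the next slide.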

  8. Log-only log management • OID Index (OIDX) • The Vagabond approach has one index structure containing all current and previous ODs • PCache = persistent cache • An intermediate index structure that contains a subset of the entries in the OIDX • Provides an intermediate storage area for persistent data (ODs)

  9. Log-only DB operations • Object Operations • Creating Objects • A unique OID is allocated to the object • A new OD is created • Updating Objects • A new OD is created • The old version is not deleted

  10. Log-only DB operations • Deleting Objects • Temporal objects: A tombstone OD (physical location = NULL) is created, with the delete time as its timestamp • Non-temporal objects: An OD in which both the physical location and the timestamp are set to NULL is written to the log • New objects: no effect on the database • Reading Objects • Current versions • When the OD is found, the object is read from the physical location given in the OD • Historical versions • If the desired timestamp is known, the same method as for retrieving a current version is used. • Otherwise, a search is performed for the object version with the largest end timestamp less than or equal to the desired time.
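
The create, update, delete, and read operations from the last two slides can be sketched as follows. This is an illustrative toy, not the paper's implementation: the class `LogOnlyStore` and its method names are hypothetical, the log is a Python list, and a version is assumed to be visible from its commit timestamp until the next version of the same object commits.

```python
from typing import Any, Dict, List, Optional, Tuple


class LogOnlyStore:
    def __init__(self):
        self.log: List[Any] = []  # append-only; entries are never modified
        # oid -> list of (commit timestamp, location in log or None for a tombstone)
        self.oidx: Dict[int, List[Tuple[int, Optional[int]]]] = {}
        self._next_oid = 1

    def _append(self, value: Any) -> int:
        self.log.append(value)
        return len(self.log) - 1

    def create(self, value: Any, ts: int) -> int:
        oid = self._next_oid          # allocate a unique OID
        self._next_oid += 1
        self.oidx[oid] = [(ts, self._append(value))]
        return oid

    def update(self, oid: int, value: Any, ts: int) -> None:
        # A new OD is added; the old version stays in the log.
        self.oidx[oid].append((ts, self._append(value)))

    def delete(self, oid: int, ts: int) -> None:
        # Tombstone: location is None, timestamp is the delete time.
        self.oidx[oid].append((ts, None))

    def read(self, oid: int, as_of: Optional[int] = None) -> Any:
        # Newest version first; with as_of, skip versions committed later.
        for ts, loc in reversed(self.oidx[oid]):
            if as_of is None or ts <= as_of:
                return None if loc is None else self.log[loc]
        return None
```

Reading with `as_of` returns the historical version visible at that time, and reading without it returns the current version (or `None` after a delete).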

  11. Log-only DB operations • Transaction Management • To make recovery after a failure possible, enough information must have been written to the log before a transaction commits • Commit • Write to the log the objects from the transaction that are still dirty in the object buffer • Append a transaction finished mark (TxID, timestamp) • TxID = transaction ID • Abort • Operations must be undone when a transaction is aborted • Any objects written to the log before the abort operation become dead objects, which are removed the next time the segment is cleaned
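
A small sketch of the commit and abort steps above. The record layout (tuples tagged `"object"` and `"finished"`), the `Transaction` class, and the helper `committed_objects` are assumptions made for this example only.

```python
class Transaction:
    def __init__(self, tx_id):
        self.tx_id = tx_id
        self.dirty = {}  # oid -> value, still only in the object buffer


def commit(log, tx, timestamp):
    # 1. Write the objects that are still dirty in the object buffer.
    for oid, value in tx.dirty.items():
        log.append(("object", tx.tx_id, oid, value))
    tx.dirty.clear()
    # 2. Append the transaction finished mark (TxID, timestamp).
    log.append(("finished", tx.tx_id, timestamp))


def abort(tx):
    # Nothing is appended. Objects already in the log are now dead
    # and would be reclaimed when their segment is cleaned.
    tx.dirty.clear()


def committed_objects(log):
    # Only objects whose transaction has a finished mark are visible.
    finished = {rec[1]: rec[2] for rec in log if rec[0] == "finished"}
    return [
        (rec[2], rec[3], finished[rec[1]])
        for rec in log
        if rec[0] == "object" and rec[1] in finished
    ]
```

An object written early by a transaction that later aborts stays in the log but is never returned by `committed_objects`, which is the sense in which it is "dead".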

  12. Log-only DB operations • Transaction process • Starting a transaction writes nothing to the log; a transaction leaves no log record before it writes an object or commits • If the transaction aborts before it writes an object to the log, no trace of the aborted transaction is left • When a transaction commits, the OD is entered into the OIDX • When an object is written to the log, it is always written together with its OD (for use in crash recovery)

  13. Log-only DB operations • Recovery • Checkpointing • Crash Recovery • When the system is restarted, the checkpoint block shows whether the previous shutdown was controlled or caused by a crash. If it was caused by a crash, recovery is needed

  14. Log-only DB operations • Checkpointing: allows you to "freeze" a copy of the database in order to reduce the recovery time • The main task of checkpointing is to install the ODs into the OIDX. • This reduces the amount of log that has to be read at recovery time in search of ODs that had not been installed into the OIDX before the system crashed.

  15. Log-only DB operations • Checkpoint Algorithm • Wait until the number of objects written (or the number of segments written) since the last checkpoint reaches a certain threshold. • If there are ODs that 1) were created before the last checkpoint and 2) are not yet installed into the OIDX, install them. • If there are entries in the segment status, PCache status, or TxID/timestamp/counter tables that have not been written during this checkpoint interval, write them to the log now. • Update the least recently written checkpoint block. This finishes the checkpointing, and by definition starts a new checkpoint interval.
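
The checkpoint steps above can be sketched roughly as follows. The class `Checkpointer`, the use of exactly two alternating checkpoint blocks, and the interval counter are assumptions for illustration; writing the status tables is left out.

```python
class Checkpointer:
    def __init__(self, threshold):
        self.threshold = threshold
        self.written_since_checkpoint = 0
        self.interval = 0                      # current checkpoint interval
        self.pending_ods = []                  # (interval created in, OD), not yet in OIDX
        self.oidx = []                         # stand-in for the OID index
        self.checkpoint_blocks = [None, None]  # assumed: two blocks, written alternately

    def object_written(self, od):
        self.pending_ods.append((self.interval, od))
        self.written_since_checkpoint += 1
        # Step 1: checkpoint once the threshold is reached.
        if self.written_since_checkpoint >= self.threshold:
            self.checkpoint()

    def checkpoint(self):
        # Step 2: install ODs created before the last checkpoint.
        still_pending = []
        for created_in, od in self.pending_ods:
            if created_in < self.interval:
                self.oidx.append(od)
            else:
                still_pending.append((created_in, od))
        self.pending_ods = still_pending
        # Step 3 (writing status tables) is omitted in this sketch.
        # Step 4: update the least recently written checkpoint block.
        self.interval += 1
        self.checkpoint_blocks[self.interval % 2] = self.interval
        self.written_since_checkpoint = 0
```

Because ODs from the most recent interval are left pending, a crash can leave uninstalled ODs anywhere after the penultimate checkpoint, which is why recovery starts reading there.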

  16. Log-only DB operations • Crash Recovery: allows a reconstruction of a consistent state • A traditional system requires an analysis phase, a redo phase, and an undo phase to complete recovery • In a log-only approach, only the ODs and the transaction management information (both in the tail of the log) have to be read in order to rebuild the resident structures.

  17. Log-only DB operations • Crash Recovery Algorithm • Identify the last segment that was successfully written before the crash. • Read the log from the last checkpoint until you encounter 1) a partially written segment, or 2) a segment that points to a next segment that doesn't exist. • While the log is read, ODs that do not have a commit record are discarded • After the end of the log has been identified, the log is read from the penultimate checkpoint to the last checkpoint • In this way, all segments that might have ODs from committed transactions, but where the ODs have not yet been installed into the OIDX, are processed. • When the log has been processed, a checkpoint is performed, and then the checkpoint blocks are updated. • If the system crashes during recovery, recovery starts in the same way the next time.
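
A simplified sketch of the recovery scan. It merges the two reads described above (last checkpoint to end of log, then penultimate to last checkpoint) into one pass from the penultimate checkpoint, which is a simplification made here. The segment layout (a dict with `complete` and `records`) and the record tuples are hypothetical.

```python
def recover(segments, penultimate_checkpoint):
    """Return (oid, location) for ODs of committed transactions.

    segments: list of {"complete": bool, "records": [...]} in log order.
    Records are ("od", tx_id, oid, location) or ("finished", tx_id).
    penultimate_checkpoint: index of the first segment to read.
    """
    ods, finished = [], set()
    for segment in segments[penultimate_checkpoint:]:
        if not segment["complete"]:
            break  # partially written segment: end of the usable log
        for record in segment["records"]:
            if record[0] == "od":
                ods.append(record)
            elif record[0] == "finished":
                finished.add(record[1])
    # Discard ODs whose transaction has no commit record.
    return [(oid, loc) for _, tx, oid, loc in ods if tx in finished]
```

Running the function twice on the same log gives the same result, which matches the remark that a crash during recovery simply restarts recovery in the same way.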

  18. Log-only DB operations • Vacuuming • The process of physically deleting data that has previously been logically deleted • For example, deleted temporal objects and non-current versions of data • Eager Vacuuming • The OIDX is searched • The ODs of all objects that are non-current and were created before time t.v (an arbitrary vacuuming age) are removed from the OIDX • Lazy Vacuuming • The physical removal of an object is deferred until the segment it resides in is cleaned
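
Eager vacuuming can be sketched as a pass over the index. The dictionary layout (OID mapped to a commit-ordered list of `(commit_ts, location)` pairs) and the function name are assumptions; the last entry of each list is treated as the current version and is always kept.

```python
def eager_vacuum(oidx, vacuum_age):
    """Remove non-current ODs created before vacuum_age; return how many."""
    removed = 0
    for oid, versions in oidx.items():
        current = versions[-1]
        # Keep historical versions that are not older than the vacuuming age.
        kept = [od for od in versions[:-1] if od[0] >= vacuum_age]
        kept.append(current)
        removed += len(versions) - len(kept)
        oidx[oid] = kept  # replaces a value only, so iteration stays valid
    return removed
```

Only index entries are removed here; the object data itself stays in the log until the segment holding it is cleaned.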

  19. Log-only DB operations • Segment Cleaning • Alive data: • All current versions of objects • Historical versions of temporal objects that have not yet been vacuumed • Sub-object and sub-object-index nodes that are reachable from a large object that is still alive • OIDX nodes that are in the current version of the OIDX, including the PCache • Only dirty segments written before the penultimate checkpoint can be cleaned • Data that is still valid is moved out of segments that hold mostly dead data and written to new segments. The result of the process is clean segments.
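
A toy sketch of segment cleaning under stated assumptions: segments are lists of item identifiers keyed by integer segment number, `alive` is the set of identifiers that count as alive data, `eligible` is the set of segments written before the penultimate checkpoint, and a segment is cleaned when at least `dead_fraction` of its items are dead. All of these names and the threshold are hypothetical.

```python
def clean_segments(segments, alive, eligible, dead_fraction=0.5):
    """Move live data out of mostly dead segments; return cleaned segment ids."""
    moved, cleaned = [], []
    for seg_id, items in segments.items():
        if seg_id not in eligible or not items:
            continue
        live = [item for item in items if item in alive]
        if 1 - len(live) / len(items) >= dead_fraction:
            moved.extend(live)      # still-valid data is carried over
            cleaned.append(seg_id)
    for seg_id in cleaned:
        segments[seg_id] = []       # the segment is now clean
    if moved:
        segments[max(segments) + 1] = moved  # written to a new segment
    return cleaned
```

In a real system the move would also produce new ODs, since the physical location of each moved object changes; that bookkeeping is omitted here.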

  20. Questions?
