Cs 440 database management systems
This presentation is the property of its rightful owner.
Sponsored Links
1 / 50

CS 440 Database Management Systems PowerPoint PPT Presentation


  • 53 Views
  • Uploaded on
  • Presentation posted in: General

CS 440 Database Management Systems. Transaction Management - Recovery. What we have discussed so far:. RDBMS Implementation. Use indexing to speed up queries Which type of index to use Which attributes to index … Storage Which block size to choose

Download Presentation

CS 440 Database Management Systems

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Cs 440 database management systems

CS 440 Database Management Systems

Transaction Management - Recovery


What we have discussed so far

What we have discussed so far:


Rdbms implementation

RDBMS Implementation

  • Use indexing to speed up queries

    • Which type of index to use

    • Which attributes to index

  • Storage

    • Which block size to choose

    • Clustered versus unclustered relations


Rdbms implementation1

RDBMS Implementation

  • Query execution

    • Algorithms to implement our operators

    • Analyze their costs

  • Query optimization

    • Find the best plan to execute the queries

    • Check if your RDBMS picks the best plan

    • Change RDBMS setting to get the best time

      • Build hash-index if RDBMS uses hash-based join


Rdbms implementation2

RDBMS Implementation

  • You can start tuning your RDBMS

    • or writing data management programs

    • A rare skill!

  • We assumed the input is a single query

    • Programs use more than one query to do the job

  • What is different?


Example balance transfer

Example: Balance Transfer

  • Transfer $200 from account 170 to account 103

  • Three SQL queries:

  • Check if Account 170 has at least $200.

  • Reduce the balance of 170 by $200.

  • Increase the balance of 103 by $200.

  • There is a crash after step 2.

    • Power loss, hardware problem, software bug in OS,…

    • Account 170 loses $200.


Balance transfer example lessons

Balance Transfer Example: Lessons

  • Check if Account 170 has at least $200.

  • Reduce the balance of 170 by $200.

  • Increase the balance of 103 by $200.

  • Queries in step 2 and 3 must be executed together.

    • All or none.

    • Atomicity.


What should we care

What should we care?

  • Failures do not happen that often!

    • Ostrich method: let them happen

  • This makes the database inconsistent

    • Deposit and withdrawal do not sum up

  • A DB that is 1% inconsistent is 100% useless.

    • Nobody can rely on the data

    • Prohibitively expensive to find the inconsistent accounts


Rdbms implementation3

RDBMS Implementation

  • We assumed that there is only one user

  • Multiple users concurrently read/write the data.

  • What is different?


Example 6 40

Example 6.40

Flight(fltNo, fltDate, seatNo, seatStatus)

  • Check the available seats

    SELECT seatNo

    FROM Flight

    WHERE fltNo=123 AND fltDate=DATE’08-1-1’

    AND seatStatus = ‘available’;

    2. Book the available seat

    UPDATE Flight

    SET seatStatus = ‘occupied’

    WHERE fltNo=123 AND fltDate=DATE‘08-1-1’

    AND seatNo = 22A


Example 6 401

Example 6.40

  • John checks for availability and gets seat 22A

  • John books seat 22A

  • Mary checks for availability and gets seat 22B

  • Mary books seat 22B


Example 6 402

Example 6.40

  • John checks for availability and gets seat 22A

  • Mary checks for availability and gets seat 22A

  • John books seat 22A

  • Mary books seat 22A

    Double booking!


Example 6 40 lessons

Example 6.40: Lessons

  • Either John’s program must execute first and then Mary’s, or the other way. They cannot run in parallel.

    • Or they seem to run serially!

  • They must not interfere with each other

  • Serializablity


Transaction

Transaction

  • An execution of a DB program

    • Or a coherent fraction of a DB program

  • A large DB program may contain more than one transaction

    • Flight: one transaction for booking, one transaction for canceling, …


Transaction acid properties

Transaction: ACID Properties

  • Atomicity

    • All or nothing

  • Consistency

    • Each transaction maps database form one consistent state to another consistent state.

  • Isolation

    • Concurrent transactions must not interfere with each other.

  • Durability

    • The result of a committed transaction does no vanish due to failures: power loss, errors, …


Balance transfer example

Balance Transfer Example

  • Check if Account 170 has at least $200.

  • Reduce the balance of 170 by $200.

  • Increase the balance of 103 by $200.

  • Power outage in the middle violates Atomicity,Consistency, and durability


Example 6 403

Example 6.40

  • John checks for availability and gets seat 22A

  • Mary checks for availability abd gets seat 22A

  • John books seat 22A

  • Mary books seat 22A

  • It violates consistency and isolation.


Transaction1

Transaction

START TRANSACTION

SQL-statement 1

SQL-statement 2

...

SQL-statement n

COMMIT

SQL-statement1 to SQL-statement n will be treated as an atomic programming fragment: “All or none”.


Transaction2

Transaction

  • We can also undo a transaction programmatically:

    START TRANSACTION

    SQL-statement 1

    SQL-statement 2

    ...

    IF (Some condition) ROLLBACK

    SQL-statement n

    COMMIT

  • ROLLBACK undoes the effect of all executed statements -> aborts the transaction.


Transaction manager

Transaction Manager

  • A component in DBMS that supports transaction processing.

    • COMMIT, ROLLBACK

  • It guarantees ACID properties.


Transaction manager modules

Transaction Manager Modules

  • Recovery

    • Rolls back the database to consistent state in the case system failures

    • Guarantees atomicity, durability, and consistency.

    • Recovery manager.

  • Concurrency control

    • Does not allow concurrent transactions to interfere with each other.

    • Guarantees isolation and consistency.

    • Concurrency control manager.


Recovery

Recovery


Types of failures

Types of Failures

  • Human error

    • Enter wrong data, wrong logic in transactions

    • Use integrity constraints

  • Disk crashes

    • Redundancy: RAID, Archive on tape, …

  • System failures

    • Power outage, (other) hardware errors, software errors,…

    • Recovery


Database log

Database Log

  • Transactions have some intermediate state

    • May be in memory buffers

    • Lost in case of system failures

  • Log

    • A file that records every action in the transaction.

  • Log Manager

    • Maintains the log of a database.


Transaction notations

Transaction Notations

  • Database is a set of data items

    • Usually data item = block

    • Could be record (smaller) or table (larger)

  • Transaction is a sequence of read/write data items

  • Three types of storage

    • Disk

    • Memory buffers: managed by buffer manager

    • Transaction’s local memory (variables): managed by transaction manager.


Operations of transactions

Operations of Transactions

  • INPUT(A)

    • Read data item A from disk to memory buffer

  • READ(A,v)

    • Read data item A to local variable v

  • WRITE(A,v)

    • Write local variable v to data item A in memory buffer

  • OUTPUT(A)

    • Write data item A from memory buffer to disk.


Example 17 1

Example 17.1

  • Salary increase

  • Data items A and B has the same value in DB

    • Consistent state

  • Transaction T

    • A = A * 2;

    • B = B * 2;

  • After T is done A and B must have the same value

    • A new consistent state


Example 17 11

Example 17.1


Example 17 12

Example 17.1

  • Crash happens before OUTPUT(A)

    • DB in consistent state.

  • Crash happens after OUTPUT(B)

    • DB in a new consistent state.

  • Crash happens after OUTPUT(A) but before OUTPUT(B),

    • DB is in an inconsistent state!

  • Recovery guarantees that the transactions starts with and ends with consistent states.

    • Atomicity: All or none


More on log

More on Log

  • Contains log records

  • An append-only file

    • It is not used to answer queries (very inefficient). It is used only to recover information.

  • Every transaction has some records in the DB log

  • After a system failure, TM uses log to

    • Undo the operations of the uncommitted transactions

    • Redo all or some operations of the committed transactions.


Undo logging

Undo Logging


Undo logging1

Undo Logging

  • Records every modification on the log before they are written to the disk.

  • We can undo all modifications, if system crashes in the middle of transaction.


Log records in an undo log

Log Records in an Undo Log

  • <START T>

    • Transaction T has started

  • <COMMIT T>

    • Transaction T is committed.

  • <ABORT T>

    • Transaction T has aborted. TM must undo all operations of T (not covered in this course).

  • <T,A,v>

    • Transaction T has updated data item A whose old value was v. One per WRITE operation, none for OUTPUT.


Rules of undo logging

Rules of Undo Logging

  • Rule 1:

    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).

    • The new information is in the log, before they are written on disk.

  • Rule 2:

    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.


Sequence of operations

Sequence of Operations

  • According to Rule 1 and Rule 2:

    • The modifications to data items are recorded in the log.

    • The updated data items are written to disk.

    • COMMIT record is written to the log.


Recovery algorithm

Recovery Algorithm

  • System crash-> perform recovery:

  • Figure out which transaction is completed and which one is not

    • <START T>…<COMMIT T> ... : completed

    • <START T>…<ABORT T> ... : completed

    • <START T>… : incomplete

  • Undo all updates done by incomplete transactions.


Recovery algorithm1

Recovery Algorithm

  • Read log from the end

    • If <COMMIT T> : mark T as completed

    • If <ABORT T> : mark T as completed

    • If <T,A,v>

      If (T is incomplete) write A=v to disk


Example

Example

<START T5>

<T2,A2,v2>

  • <START T4>

  • <T5,A5,v5>

    <T4,A4,v4>

    <T3,A3,v3>

  • <COMMIT T4>

    <T1,A1,v1>

  • Write v2 to A2 on disk

  • Write v5 to A5 on disk

  • Write v3 to A3 on disk

  • Mark T4 as completed

  • Write v1 to A1 on disk


How far up in the log

How far up in the Log?

  • The recovery algorithm has to examine all records in the log.

  • Very inefficient.

  • We can use check-pointing to limit the number of examined log records.


Crash during recovery

Crash During Recovery

  • Undo operations are idempotent

  • We can repeat them without losing consistency.

  • In the case crash, just re-start the recovery from the initial state.


Checkpointing

Checkpointing

  • Periodically create check points:

    • Do not accept any new transaction

    • Wait for the active transactions to complete

    • Flush log information to the log file.

    • Write checkpoint entry (<CKPT>) to log file.

    • Start accepting new transactions


Example1

Example

  • <T16, A16, v16>

  • <T14, A14, v14>

    <CKPT>

    <START T5>

    <T2,A2,v2>

  • <START T4>

  • <T5,A5,v5>

    <T4,A4,v4>

    <T3,A3,v3>

  • <COMMIT T4>

    <T1,A1,v1>

Other transactions (T14,

T16, …) are completed.

Stop here

  • Write v2 to A2 on disk

  • Write v5 to A5 on disk

  • Write v3 to A3 on disk

  • Mark T4 as completed

  • Write v1 to A1 on disk


Any problem in checkpointing

Any problem in checkpointing?

  • It makes the database frequently unavailable.

  • Not acceptable in large systems.

  • Can we do checkpointing and accept transactions?


Nonquiescent checkpointing

NonquiescentCheckpointing

  • Find active transactions: T1, …, Tn

  • Write <START CKPT T1, …, Tn> to the log file.

  • Wait for T1,…,Tn to complete

    • Other transaction can operate and use the database

  • Write <END CKPT> to the log file.


Nonquiescent checkpointing1

NonquiescentCheckpointing

<START CKPT T1, T2>

<END CKPT>

Other completed transactions

and active transactions: T1,

T2

T1, T2, and other

active transactions

Undo till the <START CKPT>

for this <END CKPT>

Crash

Other active transactions


Nonquiescent checkpointing2

NonquiescentCheckpointing

<START T1>

<START T2>

<START CKPT T1, T2>

Other completed transactions

and active transactions: T1,

T2

Undo till the earliest record of T1 and T2

Crash

T1, T2, and other

active transactions


Nonquiescent checkpointing3

NonquiescentCheckpointing

  • Space optimization

  • When inserting an <END CKPT> record, we remove all log records before the last

    <START CKPT T1, …, Tn>


Problems with undo logging

Problems with Undo Logging?

  • Rule 1:

    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).

  • Rule 2:

    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.


Problems with undo logging1

Problems with Undo Logging?

  • Rule 1:

    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).

  • Rule 2:

    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.

INEFFICIENT


  • Login