cs 440 database management systems
Download
Skip this Video
Download Presentation
CS 440 Database Management Systems

Loading in 2 Seconds...

play fullscreen
1 / 50

CS 440 Database Management Systems - PowerPoint PPT Presentation


  • 68 Views
  • Uploaded on

CS 440 Database Management Systems. Transaction Management - Recovery. What we have discussed so far:. RDBMS Implementation. Use indexing to speed up queries Which type of index to use Which attributes to index … Storage Which block size to choose

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' CS 440 Database Management Systems' - michi


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
cs 440 database management systems

CS 440 Database Management Systems

Transaction Management - Recovery

rdbms implementation
RDBMS Implementation
  • Use indexing to speed up queries
    • Which type of index to use
    • Which attributes to index
  • Storage
    • Which block size to choose
    • Clustered versus unclustered relations
rdbms implementation1
RDBMS Implementation
  • Query execution
    • Algorithms to implement our operators
    • Analyze their costs
  • Query optimization
    • Find the best plan to execute the queries
    • Check if your RDBMS picks the best plan
    • Change RDBMS setting to get the best time
      • Build hash-index if RDBMS uses hash-based join
rdbms implementation2
RDBMS Implementation
  • You can start tuning your RDBMS
    • or writing data management programs
    • A rare skill!
  • We assumed the input is a single query
    • Programs use more than one query to do the job
  • What is different?
example balance transfer
Example: Balance Transfer
  • Transfer $200 from account 170 to account 103
  • Three SQL queries:
  • Check if Account 170 has at least $200.
  • Reduce the balance of 170 by $200.
  • Increase the balance of 103 by $200.
  • There is a crash after step 2.
    • Power loss, hardware problem, software bug in OS,…
    • Account 170 loses $200.
balance transfer example lessons
Balance Transfer Example: Lessons
  • Check if Account 170 has at least $200.
  • Reduce the balance of 170 by $200.
  • Increase the balance of 103 by $200.
  • Queries in step 2 and 3 must be executed together.
    • All or none.
    • Atomicity.
what should we care
What should we care?
  • Failures do not happen that often!
    • Ostrich method: let them happen
  • This makes the database inconsistent
    • Deposit and withdrawal do not sum up
  • A DB that is 1% inconsistent is 100% useless.
    • Nobody can rely on the data
    • Prohibitively expensive to find the inconsistent accounts
rdbms implementation3
RDBMS Implementation
  • We assumed that there is only one user
  • Multiple users concurrently read/write the data.
  • What is different?
example 6 40
Example 6.40

Flight(fltNo, fltDate, seatNo, seatStatus)

  • Check the available seats

SELECT seatNo

FROM Flight

WHERE fltNo=123 AND fltDate=DATE’08-1-1’

AND seatStatus = ‘available’;

2. Book the available seat

UPDATE Flight

SET seatStatus = ‘occupied’

WHERE fltNo=123 AND fltDate=DATE‘08-1-1’

AND seatNo = 22A

example 6 401
Example 6.40
  • John checks for availability and gets seat 22A
  • John books seat 22A
  • Mary checks for availability and gets seat 22B
  • Mary books seat 22B
example 6 402
Example 6.40
  • John checks for availability and gets seat 22A
  • Mary checks for availability and gets seat 22A
  • John books seat 22A
  • Mary books seat 22A

Double booking!

example 6 40 lessons
Example 6.40: Lessons
  • Either John’s program must execute first and then Mary’s, or the other way. They cannot run in parallel.
    • Or they seem to run serially!
  • They must not interfere with each other
  • Serializablity
transaction
Transaction
  • An execution of a DB program
    • Or a coherent fraction of a DB program
  • A large DB program may contain more than one transaction
    • Flight: one transaction for booking, one transaction for canceling, …
transaction acid properties
Transaction: ACID Properties
  • Atomicity
    • All or nothing
  • Consistency
    • Each transaction maps database form one consistent state to another consistent state.
  • Isolation
    • Concurrent transactions must not interfere with each other.
  • Durability
    • The result of a committed transaction does no vanish due to failures: power loss, errors, …
balance transfer example
Balance Transfer Example
  • Check if Account 170 has at least $200.
  • Reduce the balance of 170 by $200.
  • Increase the balance of 103 by $200.
  • Power outage in the middle violates Atomicity,Consistency, and durability
example 6 403
Example 6.40
  • John checks for availability and gets seat 22A
  • Mary checks for availability abd gets seat 22A
  • John books seat 22A
  • Mary books seat 22A
  • It violates consistency and isolation.
transaction1
Transaction

START TRANSACTION

SQL-statement 1

SQL-statement 2

...

SQL-statement n

COMMIT

SQL-statement1 to SQL-statement n will be treated as an atomic programming fragment: “All or none”.

transaction2
Transaction
  • We can also undo a transaction programmatically:

START TRANSACTION

SQL-statement 1

SQL-statement 2

...

IF (Some condition) ROLLBACK

SQL-statement n

COMMIT

  • ROLLBACK undoes the effect of all executed statements -> aborts the transaction.
transaction manager
Transaction Manager
  • A component in DBMS that supports transaction processing.
    • COMMIT, ROLLBACK
  • It guarantees ACID properties.
transaction manager modules
Transaction Manager Modules
  • Recovery
    • Rolls back the database to consistent state in the case system failures
    • Guarantees atomicity, durability, and consistency.
    • Recovery manager.
  • Concurrency control
    • Does not allow concurrent transactions to interfere with each other.
    • Guarantees isolation and consistency.
    • Concurrency control manager.
types of failures
Types of Failures
  • Human error
    • Enter wrong data, wrong logic in transactions
    • Use integrity constraints
  • Disk crashes
    • Redundancy: RAID, Archive on tape, …
  • System failures
    • Power outage, (other) hardware errors, software errors,…
    • Recovery
database log
Database Log
  • Transactions have some intermediate state
    • May be in memory buffers
    • Lost in case of system failures
  • Log
    • A file that records every action in the transaction.
  • Log Manager
    • Maintains the log of a database.
transaction notations
Transaction Notations
  • Database is a set of data items
    • Usually data item = block
    • Could be record (smaller) or table (larger)
  • Transaction is a sequence of read/write data items
  • Three types of storage
    • Disk
    • Memory buffers: managed by buffer manager
    • Transaction’s local memory (variables): managed by transaction manager.
operations of transactions
Operations of Transactions
  • INPUT(A)
    • Read data item A from disk to memory buffer
  • READ(A,v)
    • Read data item A to local variable v
  • WRITE(A,v)
    • Write local variable v to data item A in memory buffer
  • OUTPUT(A)
    • Write data item A from memory buffer to disk.
example 17 1
Example 17.1
  • Salary increase
  • Data items A and B has the same value in DB
    • Consistent state
  • Transaction T
    • A = A * 2;
    • B = B * 2;
  • After T is done A and B must have the same value
    • A new consistent state
example 17 12
Example 17.1
  • Crash happens before OUTPUT(A)
    • DB in consistent state.
  • Crash happens after OUTPUT(B)
    • DB in a new consistent state.
  • Crash happens after OUTPUT(A) but before OUTPUT(B),
    • DB is in an inconsistent state!
  • Recovery guarantees that the transactions starts with and ends with consistent states.
    • Atomicity: All or none
more on log
More on Log
  • Contains log records
  • An append-only file
    • It is not used to answer queries (very inefficient). It is used only to recover information.
  • Every transaction has some records in the DB log
  • After a system failure, TM uses log to
    • Undo the operations of the uncommitted transactions
    • Redo all or some operations of the committed transactions.
undo logging1
Undo Logging
  • Records every modification on the log before they are written to the disk.
  • We can undo all modifications, if system crashes in the middle of transaction.
log records in an undo log
Log Records in an Undo Log
  • <START T>
    • Transaction T has started
  • <COMMIT T>
    • Transaction T is committed.
  • <ABORT T>
    • Transaction T has aborted. TM must undo all operations of T (not covered in this course).
  • <T,A,v>
    • Transaction T has updated data item A whose old value was v. One per WRITE operation, none for OUTPUT.
rules of undo logging
Rules of Undo Logging
  • Rule 1:
    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).
    • The new information is in the log, before they are written on disk.
  • Rule 2:
    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.
sequence of operations
Sequence of Operations
  • According to Rule 1 and Rule 2:
    • The modifications to data items are recorded in the log.
    • The updated data items are written to disk.
    • COMMIT record is written to the log.
recovery algorithm
Recovery Algorithm
  • System crash-> perform recovery:
  • Figure out which transaction is completed and which one is not
    • <START T>…<COMMIT T> ... : completed
    • <START T>…<ABORT T> ... : completed
    • <START T>… : incomplete
  • Undo all updates done by incomplete transactions.
recovery algorithm1
Recovery Algorithm
  • Read log from the end
    • If <COMMIT T> : mark T as completed
    • If <ABORT T> : mark T as completed
    • If <T,A,v>

If (T is incomplete) write A=v to disk

example
Example

<START T5>

<T2,A2,v2>

  • <START T4>
  • <T5,A5,v5>

<T4,A4,v4>

<T3,A3,v3>

  • <COMMIT T4>

<T1,A1,v1>

  • Write v2 to A2 on disk
  • Write v5 to A5 on disk
  • Write v3 to A3 on disk
  • Mark T4 as completed
  • Write v1 to A1 on disk
how far up in the log
How far up in the Log?
  • The recovery algorithm has to examine all records in the log.
  • Very inefficient.
  • We can use check-pointing to limit the number of examined log records.
crash during recovery
Crash During Recovery
  • Undo operations are idempotent
  • We can repeat them without losing consistency.
  • In the case crash, just re-start the recovery from the initial state.
checkpointing
Checkpointing
  • Periodically create check points:
    • Do not accept any new transaction
    • Wait for the active transactions to complete
    • Flush log information to the log file.
    • Write checkpoint entry (<CKPT>) to log file.
    • Start accepting new transactions
example1
Example

  • <T16, A16, v16>
  • <T14, A14, v14>

<CKPT>

<START T5>

<T2,A2,v2>

  • <START T4>
  • <T5,A5,v5>

<T4,A4,v4>

<T3,A3,v3>

  • <COMMIT T4>

<T1,A1,v1>

Other transactions (T14,

T16, …) are completed.

Stop here

  • Write v2 to A2 on disk
  • Write v5 to A5 on disk
  • Write v3 to A3 on disk
  • Mark T4 as completed
  • Write v1 to A1 on disk
any problem in checkpointing
Any problem in checkpointing?
  • It makes the database frequently unavailable.
  • Not acceptable in large systems.
  • Can we do checkpointing and accept transactions?
nonquiescent checkpointing
NonquiescentCheckpointing
  • Find active transactions: T1, …, Tn
  • Write <START CKPT T1, …, Tn> to the log file.
  • Wait for T1,…,Tn to complete
    • Other transaction can operate and use the database
  • Write <END CKPT> to the log file.
nonquiescent checkpointing1
NonquiescentCheckpointing

<START CKPT T1, T2>

<END CKPT>

Other completed transactions

and active transactions: T1,

T2

T1, T2, and other

active transactions

Undo till the <START CKPT>

for this <END CKPT>

Crash

Other active transactions

nonquiescent checkpointing2
NonquiescentCheckpointing

<START T1>

<START T2>

<START CKPT T1, T2>

Other completed transactions

and active transactions: T1,

T2

Undo till the earliest record of T1 and T2

Crash

T1, T2, and other

active transactions

nonquiescent checkpointing3
NonquiescentCheckpointing
  • Space optimization
  • When inserting an <END CKPT> record, we remove all log records before the last

<START CKPT T1, …, Tn>

problems with undo logging
Problems with Undo Logging?
  • Rule 1:
    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).
  • Rule 2:
    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.
problems with undo logging1
Problems with Undo Logging?
  • Rule 1:
    • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A).
  • Rule 2:
    • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.

INEFFICIENT

ad