transactions
Download
Skip this Video
Download Presentation
Transactions

Loading in 2 Seconds...

play fullscreen
1 / 36

Transactions - PowerPoint PPT Presentation


  • 428 Views
  • Uploaded on

Transactions. A transaction is a complete and consistent series of database operations to perform some logical task that is part of a database application. a database application usually consists of many such transactions

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Transactions' - ostinmannual


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
transactions
Transactions
  • A transaction is a complete and consistent series of database operations to perform some logical task that is part of a database application.
    • a database application usually consists of many such transactions
  • A transactions starts with the first SELECT, INSERT, UPDATE and DELETE statement and ends with either:
    • “commit work” – save the work done by the transaction
    • “rollback work” – undo the work done by the transaction

Database Systems

properties of a transaction
Properties of a Transaction
  • Transaction management in a DBMS allows a large number of transactions to execute concurrently while maintaining:
    • Atomicity:a transaction either finishes all operations successfully or has no effect on the database
    • Consistency:when a transaction completes successfully, it leaves the database in a consistent state
    • Isolation:transactions run as if no other process is accessing the database while they execute
    • Durability:when a transaction completes successfully, its effects are written to disk and are never lost!

Database Systems

transactions1
Transactions
  • A transaction is defined as a series of read and write operations
    • select is a read type operation, insert is a write operation, delete and update are a read operation followed by a write operation
    • data items (usually tuples) are operands for the operations in a transaction
    • most often we disregard or simplify the program logic that takes place in between the database operations
  • A transaction:

R(A) R(B)

A := A + 500

W(A)

B:= B - 500

W(B)

Database Systems

transaction management
T1 is overwriting a

change T2 made

T2 is reading something

T1 changed (DIRTY READ)

As a result, T2 used a value

for A that should have never

been in the database

Transaction Management

T1 T2

R(A)

R(B)

R(B)

A := A + 500

W(A)

B := B+200

W(B)

R(A)

B:= B - 500

W(B)

A:= A-200

W(A)

ABORT

Time

Database Systems

transaction management1
Transaction Management
  • What does it mean to correctly execute a series of transactions? When are interleaved read/write operations from transactions valid?
    • a schedule is a global sequence of database operations from multiple transactions
    • serializable schedules are equivalent to executing all transactions one by one in some order T1,T2,…,Tk (i.e. T1 first, then T2, etc.)
    • serializable schedules guarantee isolation and consistency

Database Systems

achieving the acid properties
Achieving the ACID Properties
  • How do we guarantee all schedules generated by a DBMS are serializable? [Concurrency Control]
    • locking can be used to guarantee serializable schedules as we will see (consistency and isolation)
    • locking also guarantees atomicity
  • How do we make sure that if a transaction succeeds, its effects are never lost? [Recovery]
    • write-ahead logging, achieves durability

Database Systems

concurrency control
Concurrency Control
  • Concurrency control – activity of coordinating the actions of transactions that operate in parallel and access shared data.
  • The operations of multiple transactions must be interleaved in such a way that the transactions do not interfere with one another.
  • The results of the execution of the transactions must be the same as if they were run one after the other in a serial fashion – serializability.
  • A database system must find serializable executions of the transactions it receives.
  • For simplicity, we will assume transactions are composed of the operations: read(x), write(x), commit and abort.

Database Systems

concurrency control problems
Concurrency Control Problems

T1

T2

  • Lost update problem:
  • Dirty read problem:

read(x)

x = x + 1

read(x)

The final value of x is

incremented by one rather

than by two, as it should be.

time

x = x + 1

write(x)

write(x)

T1

T2

read(x)

T2 reads a value for x

that should not have been

part of the database.

x = x + 1

write(x)

time

read(x)

abort

Database Systems

more problems
More Problems
  • Incorrect summary problem:

T1

T2

sum = 0

T1 incorrectly computes

the sum of x, y, and z.

read(x)

sum = sum + x

read(y)

y = y - 1

write(y)

time

If T2 is moving money

From one account to

another and T1 is

computing total assets,

this can be a serious

problem.

read(y)

sum = sum + y

read(z)

sum = sum + z

read(z)

z = z + 1

write(z)

Database Systems

transaction histories
Transaction Histories
  • A history is a partial ordering of the operations in a set of transactions recording how the transactions executed.
  • Two operations conflict if they operate on the same data item and at least one of them is a write.
    • in this case it is important to know which of the two operations went first
  • Let T = {T1, T2, ..., Tn}. A history H over T is a partial order

Database Systems

history example
History Example
  • Consider the following three transactions:
    • T1 = r1[x] ---> w1[x] ---> c1
    • T2 = r2[x] ---> w2[y] ---> w2[x] ---> c2
    • T3 = r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3
  • A history over these transactions is:
  • r2[x] ---> w2[y] ---> w2[x] ---> c2
  • r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3

H =

  • r1[x] ---> w1[x] ---> c1

Database Systems

serialization graph
Serialization Graph
  • A serialization graph of a history H is a directed graph whose nodes are the transactions in H and whose edges are all of the form Ti ---> Tj such that an operation in Ti precedes and conflicts with an operation in Tj.
  • Example
  • The serialization graph for this example is:
  • r3[x] ---> w3[x] ---> c3
  • r1[x] ---> w1[x] ---> w1[y] ---> c1

H =

  • r2[x] ---> w2[y] ---> c2

T2 T1 T3

Database Systems

serializability theory
Serializability Theory
  • Two histories are equivalent if they order conflicting operations of non-aborted transactions in the same way.
  • Goal -- determine if there is a serial history equivalent to a given history H.
    • if there is, then the transactions in H did not interfere with one another
  • Therefore, we need a serial history consistent with the edges in the serialization graph for the history H.
  • Such a serial history exists if the serialization graph for history H is acyclic.

Database Systems

example a serializable history
Example - A Serializable History
  • Example:
  • The serialization graph for this example is:
  • An equivalent serial history is:
  • r3[x] ---> w3[x] ---> c3
  • r1[x] ---> w1[x] ---> w1[y] ---> c1

H =

  • r2[x] ---> w2[y] ---> c2

T2 T1 T3

T2 T1 T3

Database Systems

example non serializable history
Example: Non-Serializable History
  • r2[x] ---> w2[y] ---> w2[x] ---> c2
  • Example:
  • The serialization graph for this history is:
  • There are no equivalent serial histories because the serialization graph has a cycle.
    • these transactions interfere with one another
  • r3[y] ---> w3[x] ---> w3[y] ---> w3[z] ---> c3

H =

  • r1[x] ---> w1[x] ---> c1

T1 T2 T3

Database Systems

serialization graphs
Serialization Graphs
  • Given a serialization graph that does not contain a cycle, an equivalent serial schedule can always be constructed:
    • create an empty schedule S
    • while the graph is not empty:
      • find all nodes N in the graph with no incoming edges
      • order transactions in N in any arbitrary serial order and add them to the end of S

for each transaction in N, select the operations in the current schedule and write them in the same order to the new schedule

      • delete nodes N and all edges from nodes in N to the remaining graph

Database Systems

schedulers
Schedulers
  • A database system must have a scheduler that orders operations in concurrently executing transactions.
  • Aggressive (optimistic) scheduler -- assume the best and abort if discover a problem later
    • the abort undoes operations that the transaction completed before the abort
    • good when conflicts are rare
  • Conservative scheduler -- delay operations so that they can be reordered to avoid problems.
    • less concurrency
    • good when conflicts are likely

Database Systems

locking to ensure serializability
Locking to Ensure Serializability
  • A transaction should have a lock in its possession for any data objects it accesses.
    • read/shared lock is required to read an item
    • write/exclusive lock is required to write an item
  • Many transactions may have read locks on the same data item.
  • If a transaction holds a write lock on an item, no other transaction may obtain a lock of any kind for this item.

Database Systems

two phase locking 2pl
Two Phase Locking (2PL)
  • Before each read and write operation, the transaction manager attempts to obtain the necessary lock for the appropriate data item
    • if the lock is granted, then the operation is performed
    • if the lock is not granted, then the whole transaction waits in a queue until the lock can be obtained
  • A transaction can release a lock at any time
    • after a transaction releases a lock for the first time, it cannot obtain any new locks on any data items
  • The COMMIT command in SQL releases all locks a transaction holds, which is a stronger condition than required by 2PL. [Strict Two Phase Locking]

Database Systems

slide20
2PL

THEOREM: All schedules generated by enforcing the 2PL protocol are serializable. The actual read/write events that take place in the database correspond to some serial execution of transactions.

PROOF: The conflict graph induced by 2PL compliant schedulers are acyclic. By contradiction, assume the serialization graph has a cycle of the form:

Then there must be at least two conflicts.

A sample scenario for conflicts 1 and 2:

R1(A)…W2(A)...R2(B)…W1(B)

conflict1

T1

T2

conflict2

Database Systems

slide21
2PL

conflict1

T1

T2

R1(A)…W2(A)...R2(B)…W1(B)

  • Such a cycle is not possible. T2 could only obtain a write lock for “A” if T1 first releases it’s read lock and enters its shrinking phase.
  • Consequently, T1 will be able to write “B” only if it already holds a write lock for “B”.
  • But is T1 already holds a write lock for “B”, then T2 will not be able to get the read lock on B that it needs.
  • Hence, this schedule cannot happen.

conflict2

Database Systems

but there is a problem
But There is a Problem !
  • Two-phase locking is prone to deadlocks.
  • Example:
    • T1: r1 [x] ---> w1 [y] ---> c1
    • T2: w2 [y] ---> w2 [x] ---> c2
  • Lock conversion from read locks to write locks can also cause a problem.
    • T1: r1 [x] ---> w1 [x] ---> c1
    • T2: r2 [x] ---> w2 [x] ---> c2

Database Systems

deadlocks in 2pl
Deadlocks in 2PL
  • If a transaction requests a lock of any kind and cannot obtain it, it enters a wait state
    • this can lead to deadlocks
  • A wait-for graph to detect deadlocks is constructed as follows:
    • if transaction Ti is waiting for a lock that transaction Tj holds, draw an edge from Ti to Tj
    • If the wait-for graph has a cycle, then no transaction in the cycle will ever exit the “wait” state. [DEADLOCK]
  • One of the transactions participating in a deadlock can be aborted to remove the deadlock

Database Systems

example
Example
  • T1: R(A) W(B)
  • T2: R(B) W(A)
  • Schedule: R1(A) R2(B)
  • T1 is waiting for T2 to release the lock on B, T2 is waiting for T1 to release the lock on A
  • Neither of the transactions can complete  there is a cycle in the wait-for graph  deadlock

T1

T2

Database Systems

deadlocks
Deadlocks
  • Time-outs -- if a transaction has waited too long for a lock, the scheduler “guesses” that it is deadlocked and aborts it
  • Maintain wait-for graphs
    • things to consider when aborting transactions:
      • effort already invested
      • cost to abort
      • other potential deadlocks
      • work remaining
      • priority and locks already held

Database Systems

implementation issues
Implementation Issues
  • The scheduler in a database system is usually implemented as a combination of a lock manager and a transaction manager.
  • The lock manager maintains a table of locks and supports the following operations:
    • lock (transaction id, data item, mode)
    • unlock (transaction id, data item)
    • these must be atomic operations
  • The lock manager also maintains a queue for each data item of transactions waiting for locks on that item.

Database Systems

locking granularity
Locking Granularity
  • Possible lock granularities:
    • attribute value in a tuple
    • tuple
    • disk block
    • file
    • whole database
  • Larger granularities allow less concurrency
  • Smaller granularities require more overhead
  • Many database systems provide a small set of choices to pick from.
  • Row (tuple level) locking only in Oracle.

Database Systems

timestamp ordering
Timestamp Ordering
  • A unique timestamp ts(Ti) is given to each transaction.
  • If operations oi[x] from Ti and oj[x] from Tj conflict, then the system processes oi[x] before oj[x] iff ts(Ti) < ts(Tj).
  • Basic Timestamp Ordering: (aggressive)
    • operations executed in first come first serve order
    • if an operation arrives to late, its transaction is aborted
    • the scheduler must maintain the maximum timestamp of previous reads and writes for each data item to do this
  • Timestamp Ordering establishes an ordering on the transactions so that the serialization graph will be acyclic.

Database Systems

cascading abort
Cascading Abort
  • By definition, if a transaction aborts, its effects must be completely erased from the database
  • Example:
  • if T1 aborts, since T2 has seen the changes caused by T1, then T2 has to be aborted as well -> cascading abort
  • schedules generated by 2PL may require cascading abort

Database Systems

strict 2pl
Strict 2PL
  • A transaction obtains all locks it requires as in normal 2PL.
  • A transaction holds all locks until it commits
  • Since all transactions see only the changes made by committed (not aborted) transactions, these transactions do not require cascading aborts
    • the example on the previous slide could not happen
  • Remember that most commercial database systems use strict 2PL
    • this is one of the reasons why

Database Systems

levels of isolation
Levels of Isolation
  • 2PL ensures serializability, but at the expense of lower concurrency
  • “Consistency” may not be vital to some types of transactions
  • In all the isolations levels, all locks are kept until commit time!

Database Systems

read uncommitted
Read Uncommitted
  • Read all tuples regardless of the locks that are on them.
  • When reading, a short lock (latch) is used to ensure atomicity.
  • Transactions that are set to be “read uncommitted” are read only and cannot make database updates
  • Results of such transactions are not guaranteed to be correct!

SET TRANSACTION READ UNCOMMITTED ;

SELECT count(F.reserved_seats) FROM Flights F

WHERE F.flight_id = “AA112” ;

Database Systems

read committed cursor stability
Read committed (cursor stability)
  • There are no read locks (except for latches) ! But, transactions can only read items with no locks.
  • A transaction needs a write lock to write a data item.
  • Write locks are not released until the transaction holding them commits.
    • Avoids all W1(A) -> R2(A) and W1(A) -> W2(A) type conflicts since transaction T2 will only execute after transaction T1 commits
    • Problem with R1(A) -> … -> W1(A) type transactions

Another transaction T2 may read and

alter A in between these operations

Database Systems

cursor stability
Cursor Stability
  • Cursor Stability
    • declare a cursor for update when using “read committed”
    • a read lock is held on each row the cursor points to until the cursor is moved from that location

exec sql set transaction read committed ;

exec sql declare cursor deposit for select balance from accounts where branch_id = ‘Alb’ for update of balance ;

exec sql open deposit ;

(loop through rows in cursor)

exec sql fetch deposit into :balance ;

balance = balance + 10 ;

exec sql update account set balance = :balance

where current of deposit ;

(end of loop)

exec sql close deposit ;

exec sql commit work ;

Database Systems

repeatable reads
Repeatable reads
  • Hold read locks on rows as well as write locks until commit time
    • similar to the 2PL mechanism we have seen
  • Suffers from “phantom update” problems.

exec sql set transaction repeatable reads ;

exec sql select count(F.reserved_seats) into :num;

from Flights F where F.flight_id = “AA112”

exec sql update current set status = :num where current.flight_id = “AA112” ;

Another transaction may insert a

reservation tuple in between these

operations making the count incorrect

Database Systems

serializable
Serializable
  • Serializable transactions do not have the phantom update problem.
  • Serializable transactions lock a “predicate”, limiting all read/write access to operations that would satisfy a predicate such as F.flight_id = “AA112”
  • In Oracle

Database Systems

ad