190 likes | 327 Views
Learn about transformation of relational expressions, equivalence rules, ACID properties, transaction states, concurrent execution, consistency, and more in database transactions. Understand conflict resolution and serializability testing methods.
E N D
Quick Review of Apr 29 material • Transformation of Relational Expressions • Equivalence Rules • Transactions • ACID properties (Atomic, Consistent, Isolated, Durable) • Transaction States (active, partially committed, failed, committed, aborted) • Concurrent Execution and Serializability
Serial and Interleaved Schedules Serial schedule Interleaved schedule
Another Interleaved Schedule The previous serial and interleaved schedules ensured database consistency (A+B before execution = A+B after execution) The interleaved schedule on the right is only slightly different, but does not ensure a consistent result. • assume A=1000 and B=2000 before (sum = 3000) • after execution, A=950 and B=2150 (sum = 3100)
Inconsistent Transaction Schedules • So what caused the problem? What makes one concurrent schedule consistent, and another one inconsistent? • Operations on data within a transaction are not relevant, as they are run on copies of the data residing in local buffers of the transaction • For scheduling purposes, the only significant operations are read and write • Given two transactions, Ti and Tj, both attempting to access data item Q: • if Ti and Tj are both executing read(Q) statements, order does not matter • if Ti is doing write(Q), and Tj read(Q), then order does matter • same if Tj is writing and Ti reading • if both Ti and Tj are executing write(Q), then order might matter if there are any subsequent operations in Ti or Tj accessing Q
Transaction Conflicts • Two operations on the same data item by different transactions are said to be in conflict if at least one of the operations is a write • If two consecutive operations of different transactions in a schedule S are not in conflict, then we can swap the two to produce another schedule S’ that is conflict equivalent with S • A schedule S is serializable if it is conflict equivalent (after some series of swaps) to a serial schedule.
Transaction Conflict Example Example: read/write(B) in T0 do not conflict with read/write(A) in T1 T0 T1 T0 T1 read(A) read(A) write(A) write(A) read(A) read(B) read(B) write(B) write(A) read(A) write(B) write(A) read(B) read(B) write(B) write(B)
Serializability Testing (15.9) and Precedence Graphs • So we need a simple method to test a schedule S and discover whether it is serializable. • Simple method involves constructing a directed graph called a Precedence Graph from S • Construct a precedence graph as follows: • a vertex labelled Ti for every transaction in S • an edge from Ti to Tj if any of these three conditions holds: • Ti executes write(Q) before Tj executes read(Q) • Ti executes read(Q) before Tj executes write(Q) • Ti executes write(Q) before Tj executes write(Q) • if the graph has a cycle, S is not serializable
Precedence Graph Example 1 • Compute a precedence graph for schedule B (right) • three vertices (T1, T2, T3) • edge from Ti to Tj if • Ti writes Q before Tj reads Q • Ti reads Q before Tj writes Q • Ti writes Q before Tj writes Q
Precedence Graph Example 1 • Compute a precedence graph for schedule B (right) • three vertices (T1, T2, T3) • edge from Ti to Tj if • Ti writes Q before Tj reads Q • Ti reads Q before Tj writes Q • Ti writes Q before Tj writes Q
Precedence Graph Example 2 • Slightly more complicated example • Compute a precedence graph for schedule A (right) • three vertices (T1, T2, T3) • edge from Ti to Tj if • Ti writes Q before Tj reads Q • Ti reads Q before Tj writes Q • Ti writes Q before Tj writes Q
Precedence Graph Example 2 • Slightly more complicated example • Compute a precedence graph for schedule A (right)
Concurrency Control • So now we can recognize when a schedule is serializable. In practice, it is often difficult and inefficient to determine a schedule in advance, much less examine it for serializability. • Lock-based protocols are a common system used to prevent transaction conflicts on the fly (i.e., without knowing what operations are coming later) • Basic concept is simple: to prevent transaction conflict (two transactions working on the same data item with at least one of them writing), we implement a lock system -- a transaction may only access an item if it holds the lock on that item.
Lock-based Protocols • We recognize two modes of locks: • shared: if Ti has a shared-mode (“S”) lock on data item Q then Ti may read, but not write, Q. • exclusive: if Ti has an exclusive-mode (“X”) lock on Q then Ti can both read and write Q. • Transactions be granted a lock before accessing data • A concurrency-control manager handles granting of locks • Multiple S locks are permitted on a single data item, but only one X lock • this allows multiple reads (which don’t create serializability conflicts) but prevents any R/W, W/R, or W/W interactions (which create conflicts)
Lock-based Protocols • Transactions must request a lock before accessing a data item; they may release the lock at any time when they no longer need access • If the concurrency-control manager does not grant a requested lock, the transaction must wait until the data item becomes available later on. • Unfortunately, this can lead to a situation called a deadlock • suppose T1 holds a lock on item R and requests a lock on Q, but transaction T2 holds an exclusive lock on Q. So T1 waits. Then T2 gets to where it requests a lock on R (still held by waiting T1). Now both transactions are waiting for each other. Deadlock.
Deadlocks • To detect a deadlock situation we use a wait-for graph • one node for each transaction • directed edge Ti --> Tj if Ti is waiting for a resource locked by Tj • a cycle in the wait-for graph implies a deadlock. • The system checks periodically for deadlocks • If a deadlock exists, one of the nodes in the cycle must be aborted • 95% of deadlocks are between two transactions • deadlocks are a necessary evil • preferable to allowing the database to become inconsistent • deadlocks can be rolled back; inconsistent data is much worse
Two-phase Locking Protocol • A locking protocol is a set of rules for placing and releasing locks • a protocol restricts the number of possible schedules • lock-based protocols are pessimistic • Two-phase locking is (by far) the most common protocol • growing phase: a transaction may only obtain locks (never release any of its locks) • shrinking phase: a transaction may only release locks (never obtain any new locks)
Two-phase with Lock Conversion • Two-phase with lock conversion: • S can be upgraded to X during the growing phase • X can be downgraded to S during the shrinking phase (this only works if the transaction has already written any changed data value with an X lock, of course) • The idea here is that during the growing phase, instead of holding on X on an item that it doesn’t need to write yet, to hold an S lock on it instead (allowing other transactions to read the old value for longer) until the point where modifications to the old value begin. • Similarly in the shrink phase, once a transaction downgrades an X lock, other transactions can begin reading the new value earlier.
Variants on Two-phase Locking • Strict two-phase locking • additionally requires that all X locks are held until commit time • prevents any other transactions from seeing uncommitted data • viewing uncommitted data can lead to cascading rollbacks • Rigorous two-phase locking • requires that ALL locks are held until commit time