II. Selected Database Issues

3C13/D6 II. Selected Database Issues Part 2: Transaction Management Lecture 4 Lecturer: Chris Clack

4.0 Content Content 4.1Objectives 4.2 Transaction Support - 4.2.1 Properties of transactions - 4.2.2 Database architecture 4.3 Concurrency Control (Part 1) - 4.3.1 Needs - 4.3.2 Serializability and Recoveryability - 4.3.3 Locking Methods - 4.3.4 Deadlock 4.4 Summary

4.1 Objectives Objectives In this Lecture you will learn: • The purpose of concurrency control • The purpose of database recovery • The function and importance of transactions • Transaction properties • The meaning of serializability and its application • The use of Locks in serializability • The workings of the 2-phase locking (2PL) protocol • The meaning of deadlock and its resolution

4.2 Transaction Support Transaction Support • Transaction: Action, or series of actions, carried out by user or application, which accesses or changes contents of database. • A Logical unit of work on the database. • An application program can be thought of as a series of transactions with non-database processing in between. • Can have one of two outcome • Success - transaction commitsand database reaches a new consistent state. • Failure - transaction aborts, and database must be restored to consistent state before it started. Such a transaction is rolled backor undone.

4.2 Transaction Support Transaction Support Example A transaction must transform the database from one consistent state to another. In the example transaction (b) all elements in the database related to the member of staff who has left must be allocated a new staff number, otherwise the database will be in an inconsistent state (using a staff number that doesn’t exist).

4.2 Transaction Support Transaction Support • Committed transactions cannot be aborted. If it was a mistake reverse with a compensating transaction • An aborted transaction that is rolled back can be restarted later. • DBMS has no inherent way of knowing which updates are grouped together to form a logical transaction. • usually use keywords BEGIN, TRANSACTION, COMMIT, ROLLBACK to delimit transactions. • or DBMS treats entire program as a single transaction

4.2 Transaction Support Transaction Support State transition diagram for a Transaction. • FAILED: if transaction cannot be committed or if it isaborted while in the ACTIVE state (perhaps the user or the concurrency control protocol aborted to ensure serializability) • PARTIALLY COMMITTED: after final statement executed, it may be found that the transaction has violated serializability(see later) or integrity constraints, and would need to be aborted.

4.2 Transaction Support ACIDTransaction properties • Atomicity ‘All or nothing’ • Responsibility of recovery subsystem. • Consistencymust transform the database between consistent states. • Responsibility of DBMS and application developers • Isolationtransactions execute independently of one another. • Responsibility of concurrency control subsystems • Durabilityeffects of a committed transaction are not lost in a failure. • Responsibility of recovery subsystem.

4.2 Transaction Support Database Architecture • Transaction manager: coordinates transactions on behalf of application programs. Communicates with scheduler. • Scheduler: (or Lock manager) implemets particular strategies or concurrency controls. Maximizes concurrency and isolation to ensure integrity and consistency. • Recovery manager: ensures database restoration after failure when failure occurs during a transaction. • Buffer manager responsible for transfer of data between disk storage and main memory.

4.3 Concurrency control Concurrency controlNeeds Concurrency control:Process of managing simultaneous operations on the database without having them interfere with one another. main objective of databases is to allow many users to share data concurrently. When one user updates a multi-user database this could cause inconsistencies for others accessing same data. Example: system that handles input/output (I/O) operations independently, while performing CPU operations. • Interleaved: executes CPU transactions until an I/O is reached, then suspends 1st transaction and executes second transaction. When 2nd transaction reaches I/O operation, control returns to the 1st transaction at the point where it was suspended. Achieves concurrent execution • Throughput: the amount of work accomplished in a given time interval, is improved as CPU executes other transactions instead of waiting for in an idle state for I/O operations to complete. Interleaving may produce an incorrect result, compromising integrity, consistency.

4.3 Concurrency control Concurrency controlNeeds Three potential problems with consistency: 1. Lost update problem: (T1 overwrites T2) • Solution: prevent T1 from accessing data till T2 updates data. 2. Uncommitted dependency problem: (T1 sees intermediate results of T2) • Solution: prevent T1 reading data till T2 commits or aborts 3. Inconsistent analysis problem: dirty read or unrepeatable read(T2 sees data which T1 has not yet updated) • Solution: prevent T2 reading data till T1 has finished updating.

4.3 Concurrency control Concurrency controlSerializability Objective of a concurrency control protocol is to schedule transactions in such a way as to avoid interference. Could run transactions serially – limits parallelism, some can execute together consistently. Schedule: Sequence of reads/writes by set of concurrent transactions. Serial Schedule: operations are executed consecutively without interleaved operations from other transactions. No guarantee that results of all serial executions of a given set of transactions will be identical. Nonserial Schedule: operations from set of concurrent transactions are interleaved. Objective of serializability is to find nonserial schedules that allow concurrent execution without interference. Serializability identifies executions guaranteed to ensure consistency.

4.3 Concurrency control Concurrency controlSerializability • In serializability, ordering of read/writes is important: (a) If two transactions only read a data item, they do not conflict and order is not important. (b) If two transactions either read or write completely separate data items, they do not conflict and order is not important. (c) If one transaction writes a data item and another reads or writes same data item, order of execution is important. • Under constrained write rule (transaction updates data item based on its old value, which is first read), use precedence graph to test for serializability. • There are two definitions of equivalence (and thus of serializability)

4.3 Concurrency control Concurrency controlSerializability • VIEW EQUIVALENCE / SERIALIZABILITY • View equivalence: if two schedules S1, S2 cause all transactions Ti to read the same values and make the same final writes, then S1 and S2 are view-equivalent • View serializability: S is view-equivalent to a serial schedule • CONFLICT EQUIVALENCE / SERIALIZABILITY • Two operations conflict if they are issued by different transactions, operate on the same data item, and one of them is a write operation • Conflict equivalence: all conflicting operations have the same order • Conflict serializability: S is conflict-equivalent to a serial schedule

4.3 Concurrency control Concurrency controlSerializability • If precedence graph contains a cycle, the schedule is not conflict serializable. • In practice DBMS doesn’t test serializability, it uses protocols (to be discussed). Recoverability: Effect a schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.

4.3 Concurrency control Locking methods Locking: Transaction uses locks to deny access to other transactions and so prevent incorrect updates. • Most widely used approach to ensure serializability. • Generally, a transaction must claim a shared (read) or exclusive (write) lock on a data item before read or write. • Lock prevents another transaction from modifying item or even reading it. Shared Lock: if a transaction has a shared lock on an item it can read the item but not update it. Exclusive Lock: if a transaction has an exclusive lock on a data item it can both read and update the item.

4.3 Concurrency control Locking methods:2-Phase locking (2PL) 2PL: Transaction follows 2PL protocol if all locking operations precede the first unlock operation in the transaction. • Two phases for transaction: • Growing phase - acquires all locks but cannot release any locks. • Shrinking phase - releases locks but cannot acquire any new locks. • Rules: • transaction must acquire a lock on an item before operating on it. • Once a transaction releases a lock it can never acquire any new locks. • upgrading of locks can only take place in the growing phase. • downgrading can only take place during the shrinking phase. • 2PL can prevent the lost update problem etc…

4.3 Concurrency control Deadlock Deadlock: An impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released. • Once it occurs, the applications involved cannot resolve the problem, the DBMS has to recognize it and break it • Only one way to break a deadlock: Abort one or more of the transactions or applications involved. • Three general techniques for handling deadlock: • Timeouts lock request only waits for certain amount of time, after which transaction is aborted and restarted. Very simple and practical. • Deadlock prevention Order transactions using timestamps. Wait-Die or wound-wait algorithms (see the book for details). • Deadlock detection and recovery DBMS allows deadlock to occur but recognizes it and breaks it. construct wait-for graph (WFG).

4.3 Concurrency control Recovery from Deadlock Detection Several issues: • choice of deadlock victim: choice of which transaction to abort may not be clear. Abort the T that incurs minimal cost. Consider • how long T has been running • how many data items have been updated by the T • How many data items T has left to update • how far to roll a transaction back: undoing all changes T made is simplest solution, not necessarily most efficient. May be possible to resolve deadlock whilst only partly rolling back. • avoiding starvation: starvation is when the same T is always chosen as the victim. (similar to livelock). Avoid by storing a counter for number of times T has been selected.

4.4 Summary Summary • 4.1Objectives • 4.2 Transaction Support • Properties of transactions • Database architecture • 4.3 Concurrency Control (Part 1) • Needs • Serializability and Recoveryability • Locking Method • Deadlock NEXT LECTURE: Selected Database Issues 2: Transaction Management Part 2: - More Concurrency Control - Database Recovery - Advanced Transaction Models - Oracle (concurrency control and recovery)

II. Selected Database Issues