Transaction Communications Yi Sun
Outline • Transaction • ACID Property • Distributed transaction • Two phase commit protocol • Nested transaction
Transaction • Service-oriented request/reply communication and multicast communication can be combined as transaction communication. • Commonly known as fundamental unit of interaction between client and server processes in a database system. • In database, represented by a set of synchronous request/reply operations • Atomicity, consistency, isolation, durability properties • In communications: set of asynchronous RPC communications with the ACID properties
ACID Properties Transactionsare communications with ACID property, ACID mainly concerned with concurrency transparency of distributed system. • Atomicity: all or nothing • Either all of the operations in a transaction are performed or none of them are, in spite of failures. • Consistency: consistent state before transaction => consistent state after • Interleaved transaction results in serial execution in some order
ACID Properties (cont.) • Isolation: concurrent transactions may not interfere with each other • Partial results of an incomplete transaction are not visible to others before the transaction is successfully committed. • Durability: after committing, the results are permanent • result of a committed transaction is permanent even in case of failures.
Atomicity and Isolation • Atomicity = indivisibility != exclusivity • Isolation => concurrency control • Serializability • An initiated transaction is either committed or aborted
Durability • Problem: mechanism to allow commits even if hardware or software fails • Hardware: stable storage • Software: recovery mechanisms • Logging • Shadow versions
Stable Storage • Idea: replication on at least 2 disks, keep one copy “correct” at all times • After a crash (or periodical comparison): • Both disks identical: you’re in good shape • One bad, the other OK (checksums): choose the good one • Both seem OK but are different: choose the main disk • For durability: write date to stable storage and check regularly • Works extremely well in practice
Logging • Keep track of all write operations • Log old and new values of data in a log-file (in stable storage) • Need to keep track of if the changes have been committed • On abort: undo all changes (rollback) • After crash: redo all changes (roll forward) • Need to make undo and redo operations
Shadow Copies • Idea: conceptually, copy all files to a private workspace before starting the actual transaction • Commit: copy private workspace to stable storage • Abort: just delete private workspace • Optimizations: • Don’t copy files from which data is only read • Use shadow blocks instead of shadow files
Concurrency Control • Problem: increase efficiency by concurrency • Constraint: isolation => (global) serialization • Solutions: • Two-phase locking • Time stamp ordering • Optimistic concurrency control
Locking • Using locking approach, all shared objects must be locked before they can be accessed and must be released before the end of transaction. • Read and write operations only within transactions • Locks granted and released only by scheduler • Concurrent read / exclusive write locks • A shared lock used for reading and exclusive lock used for write • Locking policy avoids conflicts between operations
Two-Phase Locking • Two phases: • First phase: any number of locks can be requested • Second phase: once any lock is released no more locks may be granted • Problems: • Deadlocks? • timeouts • When should a lock actually be released? • Strict two-phase locking
Example on Two-Phase Locking • Assume there are three transactions: t0, t1 and t2. Transaction t0 has been committed. t1 and t2 are for concurrent execution. • t0: Write A=100, Write B=20 • t1: Read A, Read B, 1: Write sum in C, 2: Write diff in D • t2: Read A, Read B, 3: Write diff in C, 4: Write sum in D
Timestamp Ordering • Idea: • A unique timestamp is associated to each transaction • Each operation is timestamped with the transaction’s timestamp • Schedule operations in the timestamps order • If a single operation is rejected, abort
Example on Two-Phase Locking and Timestamp • Assume there are three transactions: t0, t1 and t2. Transaction t0 has been committed. t1 and t2 are for concurrent execution. • t0: Write A=100, Write B=20 • t1: Read A, Read B, 1: Write sum in C, 2: Write diff in D • t2: Read A, Read B, 3: Write diff in C, 4: Write sum in D
Optimistic Concurrency Control • Hypothesis: conflicts are unlikely • Also some real-time requirement • Method: • Allow transaction complete and then validate before making effect permanent • Work on shadow copies • If validation of changes then commit, else abort
Distributed Transaction • One coordinator (usually the initiator of the transaction) and several participating processes (remote process) • At commit • Atomicity: either all nodes commit or none do • Isolation: effects of the transaction not made visible until all nodes have made an irrevocable decision to commit or abort
Two Phase Commit Protocol ACID properties can be achieved by the two-phase commit(2PC) protocol. There is one coordinator and multiple participants. Each of them have access to some stable storage. Activity log is maintained in the stable storage. Coordinator: • Prepare to commit the transaction by writing every update in activity log. • Write a precommit message in the activity log. Send a voting message to all participants asking whether they are ready to commit. • If all participants vote yes within the time-out period, multicast a commit message. Otherwise, multicast an abort message.
Two Phase Commit Protocol (cont.) Participant: • Prepare to commit the transaction by writing every update in activity log. • Write a precommit message in the activity log. Wait for request to vote from coordinator. • When receiving a vote request, test whether the transaction can be committed and if yes, writes a precommit to its activity log and sends a YES reply, otherwise send a NO reply. • Wait for commit message from the coordinator. If received, commit the transaction. If abort message is received, abort the transaction.
Two Phase Commit Protocol (Recovery) Coordinator: • If the processor crashes, it will check the activity log for the transaction. • If the precommit message is not in the log, abort the transaction (failure before precommit). • If the commit message is not in the log, retake the vote (failure after precommit and before commit). • If the commit message is there in the log, finish the transaction (failure after commit).
Failure Resistance of Two Phase Commit • Use reliable communications (RPC) • Participant failures detected by coordinator • If coordinator fails: • Participant nodes that voted for commit will timeout • Unavoidable uncertainty period • Can be reduced through three phase commit or by allowing participants to contact each other • Put the coordinator on a reliable machine
Nested Transaction • Definition: nested transaction = tree of transactions • Commit of a subtransaction takes place only if parent transaction commits • Rollback of a transaction forces rollback of all its subtransactions
Rules for Nested Transactions • Commit rule • Commit of a transaction makes it result accessible only to its parent • Final global commit happens only if local and all ancestors commit • Rollback rule • If a transaction is rolled back, all its subtransactions are also rolled back (whatever their status)
Rules for Nested Transaction (cont’d) • Visibility rule • All changes by a subtransaction are visible to parent upon local commit • All objects held by parent are visible to subtransactions • Locking rule • Externally, toplevel transaction holds all locks • Internally, multiple transactions may hold exclusive locks • Only leaf transactions may operate on locked objects • Lock inheritance
Final Remarks on Transactions • First introduced for database requirements • Very high reliability • High throughput => concurrency • Data characteristics: huge volumes, fine graularity • Search expressed by simple languages • Used as building blocks for more general distributed environments
References • Distributed Operating Systems & Algorithms, by Randy Chow and Theodore Johnson, 1997 • Towards a Transport Service for Transaction Processing Applications, Network Working Group UCLA, R. Braden • http://www.freenetpages.co.uk/hp/alan.gauld/tutipc.htm