
Chapter 7: Distributed Transactions



Presentation Transcript


  1. Chapter 7: Distributed Transactions • Transaction concepts • Centralized/Distributed Transaction Architecture • Schedule concepts • Locking schemes • Deadlocks • Distributed commit protocol (2PC)

  2. An Updating Transaction Updating a master tape is fault tolerant: if a run fails for any reason, the tape can be rewound and the job restarted with no harm done.

  3. Transaction concepts • OS: processes ↔ DBMS: transactions • A transaction is a collection of actions that transforms the system from one consistent state to another • Termination of transactions: commit vs abort • Example: T: x = x + y, executed as Read(x); Read(y); Write(x); Commit – written compactly as R(x) R(y) W(x) C
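
A minimal sketch of the example, not from the slides: the toy store and values are hypothetical, but the steps mirror R(x) R(y) W(x) C.

```python
# Hypothetical committed state; T computes x = x + y.
store = {"x": 10, "y": 5}

def run_T(store):
    workspace = {}                      # private workspace until commit
    workspace["x"] = store["x"]         # R(x)
    workspace["y"] = store["y"]         # R(y)
    workspace["x"] += workspace["y"]    # x = x + y
    store["x"] = workspace["x"]         # W(x)
    # C: commit point - the new value of x is now visible to others

run_T(store)
print(store)                            # {'x': 15, 'y': 5}
```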

  4. Properties of Transactions: ACID • Atomicity • All or nothing • Consistency • No violation of integrity constraints, i.e., from one consistent state to another consistent state • Isolation • Concurrent changes invisible, i.e., serializable • Durability • Committed updates persist (saved in permanent storage)

  5. Transaction Processing Issues • Transaction structure • Flat vs Nested vs Distributed • Internal database consistency • Semantic data control and integrity enforcement • Reliability protocols • Atomicity and durability • Local recovery protocols • Global commit protocols • Concurrency control algorithms • Replica control protocols

  6. Basic Transaction Primitives
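
The primitives figure is not reproduced in the transcript. As a rough sketch in the spirit of the usual BEGIN/READ/WRITE/END/ABORT primitives (all names and semantics here are assumptions, not the slides' figure):

```python
class Transaction:
    """Toy transaction with a private workspace (deferred-update style)."""

    def __init__(self, store):          # BEGIN_TRANSACTION
        self.store = store              # shared committed state
        self.writes = {}                # buffered, uncommitted updates

    def read(self, key):                # READ: own writes win over store
        return self.writes.get(key, self.store.get(key))

    def write(self, key, value):        # WRITE: buffer the update
        self.writes[key] = value

    def commit(self):                   # END_TRANSACTION: install updates
        self.store.update(self.writes)

    def abort(self):                    # ABORT_TRANSACTION: discard updates
        self.writes.clear()
```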

  7. A Transaction Example • A transaction to reserve three flights commits • The transaction aborts when the third flight is unavailable
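
A sketch of the flight example. The FlightTxn class, reserve() helper, and seat counts are hypothetical; the point is the control flow: commit only when all three legs succeed, and abort (undoing tentative reservations) when the third flight is unavailable.

```python
class FlightTxn:
    def __init__(self, availability):
        self.availability = availability   # seats left per flight
        self.reserved = []                 # tentative reservations

    def reserve(self, flight):
        if self.availability.get(flight, 0) > 0:
            self.availability[flight] -= 1
            self.reserved.append(flight)
            return True
        return False

    def abort(self):                       # undo all tentative reservations
        for flight in self.reserved:
            self.availability[flight] += 1
        self.reserved = []

def book_trip(txn, legs):
    for leg in legs:
        if not txn.reserve(leg):           # e.g. the third flight is full
            txn.abort()
            return False
    return True                            # commit: all legs reserved

seats = {"leg1": 1, "leg2": 1, "leg3": 0}  # third flight has no seats
print(book_trip(FlightTxn(seats), ["leg1", "leg2", "leg3"]))  # False
print(seats)                               # reservations rolled back
```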

  8. Transaction execution

  9. Nested vs Distributed Transactions

  10. Flat/nested Distributed Transactions (figure) • (a) Distributed flat • (b) Distributed nested • A circle (Si) denotes a server, and a square (Tj) represents a sub-transaction

  11. Distributed Transaction • A distributed transaction accesses resource managers distributed across a network • When the resource managers are DBMSs, we refer to the system as a distributed database system • Each DBMS might export stored procedures or an SQL interface. In either case, operations at a site are grouped together as a subtransaction, and the site is referred to as a cohort of the distributed transaction • A coordinator module plays the major role in supporting the ACID properties of a distributed transaction • The transaction manager acts as coordinator

  12. Distributed Transaction execution

  13. Distributed ACID • Global atomicity: all subtransactions of a distributed transaction must commit, or all must abort • An atomic commit protocol, initiated by a coordinator (e.g., the transaction manager), ensures this • The coordinator must poll the cohorts to determine whether they are all willing to commit • Global deadlocks: there must be no deadlocks involving multiple sites • Global serialization: a distributed transaction must be globally serializable

  14. Schedule • Synchronizing concurrent transactions • The database remains consistent • Maximum degree of concurrency • Transactions execute concurrently, but the net effect of the resulting history is equivalent to some serial history • Conflict equivalence • The relative order of execution of the conflicting operations belonging to unaborted transactions is the same in both schedules
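
To make conflict equivalence concrete, here is a small sketch (the encoding of operations as (transaction, operation, item) tuples is an assumption): two schedules over the same operations are conflict equivalent iff every conflicting pair appears in the same relative order.

```python
def conflicts(a, b):
    # Same item, different transactions, at least one write.
    return a[2] == b[2] and a[0] != b[0] and "w" in (a[1], b[1])

def conflict_pairs(schedule):
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                pairs.add((a, b))          # records "a precedes b"
    return pairs

def conflict_equivalent(s1, s2):
    return conflict_pairs(s1) == conflict_pairs(s2)

# w1(x) then r2(x) vs r2(x) then w1(x): the conflict order differs.
s1 = [(1, "w", "x"), (2, "r", "x")]
s2 = [(2, "r", "x"), (1, "w", "x")]
print(conflict_equivalent(s1, s2))         # False
```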

  15. Lost Update Problem Transactions T and U both work on the same bank branch K
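
The slide's table is not reproduced; as an illustration (account and amounts are hypothetical), the lost update arises when T and U both read the balance before either one writes:

```python
balance = {"A": 100}

t_read = balance["A"]        # T: reads 100
u_read = balance["A"]        # U: reads 100, before T has written
balance["A"] = t_read + 10   # T: writes 110
balance["A"] = u_read + 20   # U: writes 120 - T's update is lost

print(balance["A"])          # 120, not the serial result 130
```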

  16. Inconsistent Retrieval Problem Transactions T and U both work on the same bank branch K

  17. Serial Equivalence Transactions T and U both work on the same bank branch K

  18. Serializability • (a)–(c) Three transactions T1, T2, and T3 • (d) Possible schedules

  19. Concurrency Control Algorithms • Pessimistic • Two-phase locking based (2PL) • Centralized 2PL • Distributed 2PL • Timestamp Ordering (TO) • Basic TO • Multiversion TO • Conservative TO • Hybrid • Optimistic • Locking and TO based

  20. Two-Phase Locking (1) • A transaction locks an object before using it • When an object is already locked by another transaction, the requesting transaction must wait • Once a transaction has released a lock, it may not request any more locks • Strict 2PL – hold all locks until the end of the transaction • The scheduler acquires all the locks it needs during the growing phase • The scheduler releases locks during the shrinking phase
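
A minimal single-threaded sketch of strict 2PL bookkeeping (exclusive locks only; the class is illustrative, not the slides' code): locks accumulate during the growing phase and are all released together at commit or abort.

```python
class LockManager:
    def __init__(self):
        self.owner = {}                   # item -> transaction holding it

    def lock(self, txn, item):
        holder = self.owner.get(item)
        if holder is not None and holder != txn:
            return False                  # conflict: requester must wait
        self.owner[item] = txn            # growing phase: acquire
        return True

    def release_all(self, txn):           # strict 2PL: release at the end
        for item in [i for i, t in self.owner.items() if t == txn]:
            del self.owner[item]

lm = LockManager()
assert lm.lock("T1", "x")
assert not lm.lock("T2", "x")             # T2 blocked while T1 holds x
lm.release_all("T1")                      # T1 commits, drops all locks
assert lm.lock("T2", "x")                 # now T2 may proceed
```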

  21. Two-Phase Locking (2)

  22. Two-Phase Locking (3)

  23. Two-Phase Locking (4) • Centralized 2PL • One 2PL scheduler for the whole distributed system • Lock requests are issued to the central scheduler • Primary 2PL • Each data item is assigned a primary copy; the lock manager at that copy's site is responsible for locking/releasing it • Like centralized 2PL, but locking is distributed • Distributed 2PL • A 2PL scheduler is placed at each site, and each scheduler handles the lock requests for the data at that site • A transaction may read any one of the replicated copies of x by obtaining a read lock on that copy; writing x requires obtaining write locks on all copies

  24. Timestamp ordering (1) • Each transaction Ti is assigned a globally unique timestamp ts(Ti) • Using Lamport's algorithm, we can ensure that the timestamps are unique (important) • The transaction manager attaches the timestamp to all operations of the transaction • Each data item x is assigned a write timestamp (wts) and a read timestamp (rts) such that: • rts(x) = largest timestamp of any read on x • wts(x) = largest timestamp of any write on x

  25. Timestamp ordering (2) • Conflicting operations are resolved by timestamp order. Let ts(Ti) be the timestamp of transaction Ti, and Ri(x), Wi(x) a read/write operation of Ti • For Ri(x): if ts(Ti) < wts(x) then reject Ri(x) (abort Ti); else accept Ri(x) and set rts(x) ← max(rts(x), ts(Ti)) • For Wi(x): if ts(Ti) < wts(x) or ts(Ti) < rts(x) then reject Wi(x) (abort Ti); else accept Wi(x) and set wts(x) ← ts(Ti)
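
A small sketch of the rules above (one read and one write timestamp per item; the dictionaries and zero defaults are assumptions):

```python
rts, wts = {}, {}                 # read/write timestamp per item

def read(ts, x):                  # Ri(x) with ts = ts(Ti)
    if ts < wts.get(x, 0):
        return "reject"           # a younger txn already wrote x: abort Ti
    rts[x] = max(rts.get(x, 0), ts)
    return "accept"

def write(ts, x):                 # Wi(x) with ts = ts(Ti)
    if ts < wts.get(x, 0) or ts < rts.get(x, 0):
        return "reject"           # a younger txn already read or wrote x
    wts[x] = ts
    return "accept"

print(write(5, "x"))              # accept: wts(x) becomes 5
print(read(3, "x"))               # reject: ts 3 < wts(x) = 5
```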

  26. Distributed commit protocols • How to execute commit for distributed transactions • Issue: how to ensure atomicity and durability • One-phase commit (1PC): the coordinator communicates with all servers to commit. Problem: a server cannot abort a transaction • Two-phase commit (2PC): allows any server to abort its part of a transaction. Commonly used • Three-phase commit (3PC): avoids blocking servers in the presence of coordinator failure. Often discussed in the literature, rarely used in practice

  27. Two-phase commit (2PC) • Consider a distributed transaction involving the participation of a number of processes, each running on a different machine, and assume no failures occur • Phase 1: the coordinator gets the participants ready to write the results into the database • Phase 2: everybody writes the results into the database • Coordinator: the process at the site where the transaction originates, which controls the execution • Participant: a process at any other site that participates in executing the transaction • Global commit rule: • The coordinator aborts iff at least one participant votes to abort • The coordinator commits iff all participants vote to commit
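
A failure-free sketch of the two phases and the global commit rule (class and method names are hypothetical; a real implementation also forces log records at every step):

```python
class Participant:
    def __init__(self, can_commit):
        self.can_commit = can_commit      # hypothetical local vote

    def prepare(self):                    # phase 1: get ready, then vote
        return self.can_commit

    def finish(self, decision):           # phase 2: write result locally
        self.decision = decision

def two_phase_commit(participants):
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Global commit rule: commit iff ALL vote commit, abort otherwise.
    decision = "commit" if all(votes) else "abort"
    # Phase 2: everybody writes the decided outcome.
    for p in participants:
        p.finish(decision)
    return decision

print(two_phase_commit([Participant(True), Participant(False)]))  # abort
```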

  28. 2PC phases

  29. 2PC actions

  30. Distributed 2PC • The coordinator initiates 2PC • The participants run a distributed algorithm to reach agreement on global commit or abort

  31. Problems with 2PC • Blocking • A participant in the READY state must wait for the coordinator • If the coordinator fails, the site is blocked until recovery • Blocking reduces availability • Independent recovery is not possible • Independent recovery protocols do not exist for multiple site failures

  32. Deadlocks • Even if transactions follow 2PL, deadlocks may occur • Consider the following scenarios: • w1(x) w2(y) r1(y) r2(x) • r1(x) r2(x) w1(x) w2(x) • Deadlock management • Ignore – let the application programmers handle it • Prevention – no run-time support • Avoidance – run-time support • Detection and recovery – find it at your own leisure!!

  33. Deadlock Conditions • Mutual exclusion • Hold and wait • No preemption • Circular chain • (figure) (a) Simple cycle: T → U → V → W → T • (b) Complex cycles: V → W → T → V and V → W → V

  34. Deadlock Avoidance • WAIT-DIE rule • If ts(Ti) < ts(Tj) then Ti waits, else Ti dies • Non-preemptive – Ti never preempts Tj • Prefers younger transactions • WOUND-WAIT rule • If ts(Ti) < ts(Tj) then Tj is wounded, else Ti waits • Preemptive – Ti preempts Tj if Tj is younger • Prefers older transactions • Problem: avoidance is expensive; deadlocks occur rarely and at random, so aborting and restarting transactions on every suspicious conflict adds system overhead
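
The two rules as a sketch, where a smaller timestamp means an older transaction; Ti is the requester and Tj the current lock holder:

```python
def wait_die(ts_i, ts_j):
    # Non-preemptive: the holder Tj is never aborted.
    return "Ti waits" if ts_i < ts_j else "Ti dies (restarts with same ts)"

def wound_wait(ts_i, ts_j):
    # Preemptive: an older requester wounds (aborts) a younger holder.
    return "Tj is wounded" if ts_i < ts_j else "Ti waits"

print(wait_die(1, 2))    # older requester: "Ti waits"
print(wound_wait(2, 1))  # younger requester: "Ti waits"
```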

  35. Deadlock Detection • Deadlock is a stable property • With distributed transactions, a deadlock might not be detectable at any single site • But a deadlock still shows up as a cycle in a wait-for graph • Topology for deadlock detection algorithms • Centralized – periodically collecting waiting states • Distributed – path pushing • Hierarchical – build a hierarchy of detectors

  36. Centralized detection – periodically collecting waiting states • (figure: transactions U, V, W holding and awaiting locks on objects A–D across sites S1–S3) • Local wait-for edges: S1: U → V, S2: V → W, S3: W → U • Their union contains the cycle U → V → W → U, a global deadlock
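
A sketch of what the central detector does with the edges the sites report (the graph encoding is an assumption): union them and search the combined wait-for graph for a cycle.

```python
def has_cycle(edges):
    graph = {}
    for u, v in edges:                     # union of all reported edges
        graph.setdefault(u, []).append(v)

    def dfs(node, path):
        if node in path:
            return True                    # reached a node already on path
        return any(dfs(nxt, path | {node}) for nxt in graph.get(node, []))

    return any(dfs(n, set()) for n in graph)

# Waiting states reported by the three sites in the figure.
reported = [("U", "V"), ("V", "W"), ("W", "U")]
print(has_cycle(reported))                 # True: cycle U -> V -> W -> U
```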
