
ICS 214B: Transaction Processing and Distributed Data Management


Presentation Transcript


  1. ICS 214B: Transaction Processing and Distributed Data Management Lecture 11: Concurrency Control and Distributed Commits Professor Chen Li

  2. Overview
  • Concurrency Control
  • Schedules and Serializability
  • Locking
  • Timestamp Control
  • Deadlocks

  3. In a centralized DB
  [Figure: transactions T1, T2, …, Tn all access one DB, subject to its consistency constraints]

  4. In a distributed DB
  [Figure: transactions T1 and T2 access data spread across nodes X, Y, and Z]

  5. Concepts (similar to a centralized DB)
  • Transaction: a sequence of ri(x), wi(x) actions
  • Conflicting actions (same item, at least one write): r1(A) w2(A); w1(A) w2(A); w1(A) r2(A)
  • Schedule: represents the chronological order in which actions are executed
  • Serial schedule: no interleaving of actions from different transactions
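As a small illustration (not part of the slides; the Action class and conflicts function below are mine), two actions conflict exactly when they touch the same item, come from different transactions, and at least one of them is a write:

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    kind: str    # 'r' or 'w'
    txn: int     # transaction number, e.g. 1 for T1
    item: str    # data item, e.g. 'A'

def conflicts(p: Action, q: Action) -> bool:
    """Same item, different transactions, at least one write."""
    return p.item == q.item and p.txn != q.txn and 'w' in (p.kind, q.kind)

assert conflicts(Action('r', 1, 'A'), Action('w', 2, 'A'))      # r1(A), w2(A)
assert conflicts(Action('w', 1, 'A'), Action('w', 2, 'A'))      # w1(A), w2(A)
assert not conflicts(Action('r', 1, 'A'), Action('r', 2, 'A'))  # two reads never conflict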

  6. Example constraint: X = Y (X at node 1, Y at node 2)
  T1:                      T2:
  1. (T1) a ← X            5. (T2) c ← X
  2. (T1) X ← a+100        6. (T2) X ← 2c
  3. (T1) b ← Y            7. (T2) d ← Y
  4. (T1) Y ← b+100        8. (T2) Y ← 2d
  (The figure also shows the precedence relation among these actions.)

  7. Precedence: intra-transaction and inter-transaction
  Schedule S1:
  (node X)                  (node Y)
  1. (T1) a ← X
  2. (T1) X ← a+100
  5. (T2) c ← X             3. (T1) b ← Y
  6. (T2) X ← 2c            4. (T1) Y ← b+100
                            7. (T2) d ← Y
                            8. (T2) Y ← 2d
  If X = Y = 0 initially, then X = Y = 200 at the end.
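A tiny sanity check of the claim on this slide (illustrative only; only the per-item order of the actions matters for the final values, so the steps are grouped by node here):

db = {'X': 0, 'Y': 0}

a = db['X']; db['X'] = a + 100   # steps 1-2: T1 at node X
c = db['X']; db['X'] = 2 * c     # steps 5-6: T2 at node X
b = db['Y']; db['Y'] = b + 100   # steps 3-4: T1 at node Y
d = db['Y']; db['Y'] = 2 * d     # steps 7-8: T2 at node Y

assert db == {'X': 200, 'Y': 200}   # the constraint X = Y still holds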

  8. Enforcing Serializability
  • Locking
  • Timestamp Ordering Schedulers

  9. Locking rules in a centralized DB (2-phase locking)
  • Well-formed transactions
  • Legal schedulers
  • Two-phase transactions
  These rules guarantee serializable schedules.

  10. Strict 2PL
  • Hold all locks until the transaction commits
  • Called “strict 2-phase locking”
  • Strict 2PL automatically avoids cascading rollbacks
  [Figure: number of locks held vs. time: locks accumulate during execution and are all released at commit]
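As a rough illustration of these locking rules and the strict variant (not part of the slides; the LockManager class, the S/X modes, and the single-threaded setting are all assumptions), a strict 2PL lock table grants shared or exclusive locks before each access and releases nothing until release_all is called at commit or abort:

class LockManager:
    def __init__(self):
        self.locks = {}                      # item -> (mode, set of holders)

    def acquire(self, txn, item, mode):
        """mode is 'S' (read) or 'X' (write); returns True if granted."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == 'S' and held_mode == 'S':  # shared locks are compatible
            holders.add(txn)
            return True
        if holders == {txn}:                  # re-request / upgrade by the sole holder
            stronger = 'X' if 'X' in (mode, held_mode) else 'S'
            self.locks[item] = (stronger, holders)
            return True
        return False                          # caller must wait (or handle deadlock)

    def release_all(self, txn):
        """Strict 2PL: locks are released only here, at commit or abort."""
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lm = LockManager()
assert lm.acquire(1, 'A', 'X')        # T1 write-locks item A
assert not lm.acquire(2, 'A', 'S')    # T2 is blocked until T1 finishes
lm.release_all(1)                     # at T1's commit
assert lm.acquire(2, 'A', 'S')        # now T2 can read A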

  11. Two-phase locking in a distributed DB
  • Just like in a centralized system, but with multiple lock managers
  [Figure: T accesses & locks data through scheduler 1 (locks for D1, node 1) and scheduler 2 (locks for D2, node 2), and releases all locks at the end]

  12. Replicated data
  [Figure: item X is replicated at node 1 and node 2; T1 and T2 access it through scheduler 1 (locks for D1) and scheduler 2 (locks for D2)]

  13. Replicated data
  • Simplest scheme (read all, write all): if T wants to read (write) data item X, T obtains read (write) locks for X at all sites that have X
  • Better scheme (read one, write all):
    - If T wants to read X, T obtains a read lock at any one site that has X
    - If T wants to write X, T obtains write locks at all sites that have X
  • More sophisticated schemes are possible
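A sketch of how the read-one-write-all scheme could be coded (illustrative only; sites_holding and request_lock are hypothetical helpers that list the replica sites of X and request a lock at one site):

def lock_for_read(x, sites_holding, request_lock):
    """Read-one-write-all: a read lock at any single replica of X suffices."""
    for site in sites_holding(x):
        if request_lock(site, x, 'S'):
            return [site]
    return []                            # no replica reachable: give up

def lock_for_write(x, sites_holding, request_lock):
    """A write must lock X at every site that stores a replica."""
    granted = []
    for site in sites_holding(x):
        if not request_lock(site, x, 'X'):
            return []                    # a real system would wait or retry
        granted.append(site)
    return granted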

  14. Timestamp Ordering Schedulers
  • Basic idea:
    - assign each transaction a timestamp as it begins
    - if ts(T1) < ts(T2) < … < ts(Tn), then the scheduler produces a history equivalent to the serial order T1, T2, …, Tn

  15. T.O. rule
  If pi[x] and qj[x] are conflicting operations, then pi[x] is executed before qj[x] (pi[x] <S qj[x]) iff ts(Ti) < ts(Tj).

  16. Example: schedule S2, with ts(T1) < ts(T2)
  (Node X)                    (Node Y)
  (T1) a ← X                  (T2) d ← Y
  (T1) X ← a+100              (T2) Y ← 2d
  (T2) c ← X                  (T1) b ← Y    ← reject! abort T1
  (T2) X ← 2c                 (T1) Y ← b+100
  T1 must then be aborted at both nodes; and because T2 read the value of X that T1 wrote, aborting T1 forces T2 to abort at both nodes as well.

  17. Strict T.O.
  • Keep written items locked until it is certain that the writing transaction has succeeded (avoids cascading rollbacks)

  18. Example revisited (with strict T.O.), ts(T1) < ts(T2)
  (Node X)                    (Node Y)
  (T1) a ← X                  (T2) d ← Y
  (T1) X ← a+100              (T2) Y ← 2d
  (T2) c ← X   ← delayed      (T1) b ← Y    ← reject! abort T1
  T1 is aborted at both nodes; once its lock on X is released, the delayed (T2) c ← X and (T2) X ← 2c go ahead, so T2 does not have to be rolled back.

  19. Enforcing T.O.: for each data item X keep
  • MAX_R[X]: maximum timestamp of a transaction that read X
  • MAX_W[X]: maximum timestamp of a transaction that wrote X
  • rL[X]: number of transactions currently reading X (0, 1, 2, …)
  • wL[X]: number of transactions currently writing X (0 or 1)

  20. T.O. Scheduler, part 1: ri[X] arrives
  IF (ts(Ti) < MAX_W[X]) THEN {
      ABORT Ti
  } ELSE {
      IF (ts(Ti) > MAX_R[X]) THEN MAX_R[X] ← ts(Ti);
      IF (queue is empty AND wL[X] = 0) THEN {
          rL[X] ← rL[X] + 1;
          START READ OF X
      } ELSE
          add (r, Ti) to queue
  }

  21. T.O. Scheduler, part 2: wi[X] arrives
  IF (ts(Ti) < MAX_W[X] OR ts(Ti) < MAX_R[X]) THEN {
      ABORT Ti
  } ELSE {
      MAX_W[X] ← ts(Ti);
      IF (queue is empty AND wL[X] = 0 AND rL[X] = 0) THEN {
          wL[X] ← 1;
          WRITE X   // wait for Ti to finish
      } ELSE
          add (w, Ti) to queue
  }

  22. T.O. Scheduler, part 3: when an operation o (r or w) on X finishes
  oL[X] ← oL[X] - 1;
  NDONE ← TRUE;
  WHILE NDONE DO {
      let the head of the queue be (q, Tj)   // smallest timestamp
      IF (q = w AND rL[X] = 0 AND wL[X] = 0) THEN {
          remove (q, Tj); wL[X] ← 1; WRITE X   // wait for Tj to finish
      } ELSE IF (q = r AND wL[X] = 0) THEN {
          remove (q, Tj); rL[X] ← rL[X] + 1; START READ OF X
      } ELSE
          NDONE ← FALSE
  }
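Parts 1-3 can be read as a per-item state machine. The Python sketch below (an illustration only; the Item and TOScheduler names and the start_read / start_write / abort callbacks are assumptions, and everything is single-threaded) puts the three parts together, keeping the queue as a min-heap so its head is always the waiting operation with the smallest timestamp:

import heapq

class Item:
    """Per-item bookkeeping from slide 19."""
    def __init__(self):
        self.max_r = 0        # MAX_R[X]: largest ts that has read X
        self.max_w = 0        # MAX_W[X]: largest ts that has written X
        self.rl = 0           # rL[X]: transactions currently reading X
        self.wl = 0           # wL[X]: transactions currently writing X (0 or 1)
        self.queue = []       # heap of (ts, op, txn), smallest ts at the head

class TOScheduler:
    def __init__(self, start_read, start_write, abort):
        self.items = {}
        self.start_read = start_read      # callbacks into the data manager
        self.start_write = start_write
        self.abort = abort

    def _item(self, x):
        return self.items.setdefault(x, Item())

    def read(self, ts, txn, x):           # Part 1: r_i[X] arrives
        it = self._item(x)
        if ts < it.max_w:
            self.abort(txn); return
        it.max_r = max(it.max_r, ts)
        if not it.queue and it.wl == 0:
            it.rl += 1; self.start_read(txn, x)
        else:
            heapq.heappush(it.queue, (ts, 'r', txn))

    def write(self, ts, txn, x):          # Part 2: w_i[X] arrives
        if ts < self._item(x).max_w or ts < self._item(x).max_r:
            self.abort(txn); return
        it = self._item(x)
        it.max_w = ts
        if not it.queue and it.wl == 0 and it.rl == 0:
            it.wl = 1; self.start_write(txn, x)
        else:
            heapq.heappush(it.queue, (ts, 'w', txn))

    def done(self, op, x):                # Part 3: a read or write on X finished
        it = self._item(x)
        if op == 'r': it.rl -= 1
        else:         it.wl -= 1
        while it.queue:
            ts, q, txn = it.queue[0]      # head = smallest timestamp
            if q == 'w' and it.rl == 0 and it.wl == 0:
                heapq.heappop(it.queue); it.wl = 1; self.start_write(txn, x)
            elif q == 'r' and it.wl == 0:
                heapq.heappop(it.queue); it.rl += 1; self.start_read(txn, x)
            else:
                break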

  23. Starvation is possible
  • If a transaction is aborted, it must be retried with a new, larger timestamp
  [Figure: T with ts(T)=8 tries to read X but is aborted because MAX_W[X]=9; it retries with ts(T)=11 while MAX_R[X]=10, and may keep losing to newer transactions]

  24. Theorem: If S is a schedule representing an execution by a T.O. scheduler, then S is serializable.

  25. Improvement: the Thomas Write Rule
  [Figure: timeline with MAX_R[X] < ts(Ti) < MAX_W[X]; Ti wants to write X, but a write with a larger timestamp has already been recorded]

  26. Change in the T.O. Scheduler (Ti wants to write X, with MAX_R[X] < ts(Ti) < MAX_W[X])
  When wi[X] arrives:
  IF ts(Ti) < MAX_R[X] THEN
      ABORT Ti
  ELSE IF ts(Ti) < MAX_W[X] THEN
      IGNORE THIS WRITE (but tell Ti it was OK)
  ELSE
      process the write as before…
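In the scheduler sketch above, only the write path changes under the Thomas Write Rule; a hypothetical subclass makes the change explicit:

class TWRScheduler(TOScheduler):
    """Thomas Write Rule: an obsolete write is dropped instead of aborting Ti."""
    def write(self, ts, txn, x):
        it = self._item(x)
        if ts < it.max_r:
            self.abort(txn)            # a newer read already saw an older value
        elif ts < it.max_w:
            pass                       # ignore this write, but tell Ti it was OK
        else:
            super().write(ts, txn, x)  # process the write as before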

  27. 2PL vs. T.O.: Example 1 (ts(T1) < ts(T2))
  T1: r1[X] r1[Y] w1[Z]
  T2: w2[X]
  S: r1[X] w2[X] r1[Y] w1[Z]
  S could be produced with T.O. but not with 2PL (T1 would have to release its lock on X before w2[X] and then acquire locks on Y and Z afterwards, violating the two-phase rule).

  28. 2PL vs. T.O.: Example 2 (ts(T1) < ts(T2))
  T1: r1[X] r1[Y] w1[Z]
  T2: w2[Y]
  S: r1[X] w2[Y] r1[Y] w1[Z]
  S could be produced with 2PL but not with T.O. (the T.O. rule rejects r1[Y], since ts(T1) < ts(T2) = MAX_W[Y]).

  29. Relationship between 2PL and T.O.
  [Figure: Venn diagram: the serializable schedules contain both the 2PL schedules and the T.O. schedules; the two sets overlap, but neither contains the other]

  30. Distributed T.O. Scheduler
  [Figure: T accesses data through scheduler 1 (with a ts cache for D1, node 1) and scheduler 2 (with a ts cache for D2, node 2)]
  • Each scheduler is “independent”
  • At the end of the transaction, signal all schedulers involved to release all wL[X] locks

  31. Next: Deadlocks
  • If nodes use 2-phase locking, global deadlocks are possible
  [Figure: two local wait-for graphs (WFGs), T1 → T2 at one node and T2 → T1 at the other; neither has a cycle]

  32. Need to “combine” the local WFGs to discover global deadlocks
  e.g., send them to a central detection node
  [Figure: the local edges T1 → T2 and T2 → T1 combine into the global cycle T1 → T2 → T1]
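A sketch (not from the slides) of what the central detection node could do: union the local waits-for edges it receives and run a standard cycle check over the combined graph.

def has_cycle(edges):
    """edges: (waiter, holder) pairs collected from all nodes."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def visit(v):                     # depth-first search for a back edge
        color[v] = GRAY
        for w in graph[v]:
            if color[w] == GRAY or (color[w] == WHITE and visit(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)

# The two local WFGs from the slide:
assert not has_cycle([('T1', 'T2')])                 # node 1's local graph is acyclic
assert has_cycle([('T1', 'T2'), ('T2', 'T1')])       # but the union has a cycle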

  33. Deadlocks
  • Local vs. global
  • Deadlock detection
    - Waits-for graph
    - Timeouts
  • Deadlock prevention
    - Wound-wait
    - Wait-die
  • Covered in ICS 214A
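Wound-wait and wait-die are timestamp-based prevention schemes covered in ICS 214A; as a quick refresher (my own sketch), the decision each makes when a requester is blocked by a holder is:

# Deadlock prevention by timestamps (older transaction = smaller timestamp).
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies
    (aborts and later restarts with its original timestamp)."""
    return 'wait' if ts_requester < ts_holder else 'abort requester'

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (aborts) the holder;
    a younger requester waits."""
    return 'abort holder' if ts_requester < ts_holder else 'wait'

assert wait_die(5, 9) == 'wait'
assert wait_die(9, 5) == 'abort requester'
assert wound_wait(5, 9) == 'abort holder'
assert wound_wait(9, 5) == 'wait'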

  34. Summary
  • 2PL: the most popular; deadlocks are possible; useful in distributed systems
  • T.O.: aborts are more likely; no deadlocks; useful in distributed systems

  35. Next: reliable distributed database management
  • Dealing with failures
  • Distributed commit algorithms
  • The “two generals” problem

  36. Reliability
  • Correctness
    - Serializability
    - Atomicity
    - Persistence
  • Availability

  37. Types of failures
  • Processor failures: halt, delay, restart, berserk, …
  • Storage failures: atomic write, transient errors, disk crash
  • Communication (network) failures: lost messages, out-of-order messages, partitions

  38. Failure models
  • We cannot protect against everything:
    - unlikely failures (e.g., flooding in the Sahara)
    - failures that are too expensive to protect against (e.g., earthquakes)
  • We focus on the failures we know how to protect against (e.g., with message sequence numbers; stable storage)

  39. Failure model
  [Figure: events classified as desired vs. undesired and expected vs. unexpected; the failure model covers the undesired but expected events]

  40. Node models (1): fail-stop nodes
  [Figure: timeline of a node: perfect operation, then halted, then recovery, then perfect operation again]
  • On a failure, volatile memory is lost; stable storage is ok

  41. Node models (2): Byzantine nodes
  [Figure: nodes A, B, C alternate between perfect operation, arbitrary (Byzantine) failure, and recovery]
  • At any given time, at most some fraction f of the nodes have failed (typically f < 1/2 or f < 1/3)

  42. Network models (1): reliable network
  - In-order messages
  - No spontaneous messages
  - Timeout TD: if no ack arrives within TD seconds, the destination is down (not just paused)
  I.e., no lost messages, except those caused by node failures.
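In code, the reliable-network assumption might be used as in this sketch (send and wait_for_ack are hypothetical helpers): a missing ack within TD seconds can safely be taken to mean the destination has failed, never that it is merely slow.

def send_with_timeout(dest, msg, send, wait_for_ack, t_d):
    """Under the reliable-network model, a timeout acts as a failure detector."""
    send(dest, msg)
    if wait_for_ack(dest, timeout=t_d):
        return 'delivered'
    return 'destination down'   # the model rules out lost or merely slow messages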

  43. Variation of the reliable net: persistent messages
  • If the destination is down, the net will eventually deliver the message
  • Simplifies node recovery, but leads to inefficiencies; it just moves the problem one level lower down the stack
  • Not considered here

  44. Network models (2): partitionable network
  - In-order messages
  - No spontaneous messages
  - Nodes can have different views of which failures have occurred

  45. Scenarios
  • Reliable network, fail-stop nodes
    - No data replication (1)
    - Data replication (2)
  • Partitionable network, fail-stop nodes (3)

  46. No data replication (reliable network, fail-stop nodes)
  • Basic idea: node P controls item X
  [Figure: node P on the net, holding item X]
  - A single control point simplifies concurrency control and recovery
  - Note the availability hit: if P is down, X is unavailable too!

  47. “P controls X” means - P does concurrency control for X - P does recovery for X Notes 11

  48. Say transaction T wants to access X
  • PT is a process that represents T at this node
  [Figure: PT sends a request to the local DBMS, which manages X, the lock manager, and the LOG]

  49. Distributed commit problem
  [Figure: transaction T spans three nodes, executing actions a1, a2 at one node, a3 at another, and a4, a5 at a third]
  • Commit must be atomic: either all of T's actions take effect or none do

  50. Distributed commit problem
  • Commit must be atomic
  • Solution: two-phase commit (2PC)
    - Centralized 2PC
    - Distributed 2PC
    - Linear 2PC
    - Many other variants…
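A minimal sketch of centralized 2PC from the coordinator's side, as a preview of the next lecture (the send and collect_votes helpers are assumed; a real implementation also forces log records to stable storage and handles timeouts at each step):

def two_phase_commit(participants, send, collect_votes):
    # Phase 1: ask every participant to prepare and vote.
    for p in participants:
        send(p, 'PREPARE')
    votes = collect_votes(participants)      # one 'YES' or 'NO' per participant

    # Phase 2: commit only if everyone voted YES, otherwise abort everywhere.
    decision = 'COMMIT' if all(v == 'YES' for v in votes) else 'ABORT'
    for p in participants:
        send(p, decision)
    return decision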
