
CS 347: Distributed Databases and Transaction Processing Data Replication



1. CS 347: Distributed Databases and Transaction Processing, Data Replication. Hector Garcia-Molina

2. Replication Space
• Updates
  • at any copy
  • at fixed (primary) copy
  • at one copy but control can migrate
  • no updates

3. Replication Space
• Correctness
  • no consistency
  • local consistency
  • order preserving
  • serializable schedule
  • 1-copy serializability

4. Replication Space
• Expected Failures
  • processors: fail-stop, Byzantine?
  • network: reliable, partitions, in-order messages?
  • storage: stable disk?

5. Replication Space
• Implementation Details
  • update propagation
    • physical log records
    • logical log records
    • SQL updates
    • transactions
  • reads at backup?
  • architecture
    • cross backups
    • multi-computer copy
  • initialization of backup copy

6. Cross Backups
[diagram: site A holds the primary copy of DB1 and the backup copy of DB2; site B holds the primary copy of DB2 and the backup copy of DB1]

7. Multi-Computer Sites
[diagram: a primary site with computers P1, P2, P3, each holding data items (X1/Y1, X2/Y2, X3/Y3) and a local log (L1, L2, L3); a backup site with matching computers B1, B2, B3 and logs L1’, L2’, L3’]

8. 1-Safe Backups
[diagram: primary computer P1 with data X1, Y1 and log L1 ships log records to backup B1 with log L1’]
• Transactions commit at primary
• Redo log records propagated to backup
• Transactions then commit at backup
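
To make the 1-safe flow concrete, here is a minimal Python sketch (all names, including ship_queue, primary_commit, and backup_apply_once, are hypothetical, not from the notes or the paper). The client is acknowledged before any redo record reaches the backup, which is exactly why a committed transaction can be lost:

```python
import queue

ship_queue = queue.Queue()  # stands in for the (assumed reliable) network link

def primary_commit(txn_id, redo_records, log):
    log.extend(redo_records)                 # force redo records to the stable log
    print(f"{txn_id} committed at primary")  # commit point: client is acked here
    ship_queue.put((txn_id, redo_records))   # propagation happens *after* commit

def backup_apply_once(db):
    txn_id, redo_records = ship_queue.get()  # records arrive in log order
    for key, value in redo_records:
        db[key] = value                      # redo the writes at the backup
    print(f"{txn_id} committed at backup")   # the backup lags the primary
```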

9. 1-Safe Backups
• Transactions can get lost
[diagram: the primary's log holds T1, T2, T3 but only T1, T2 have reached the backup when the primary fails; the backup takes over and runs T4, T5, so its log reads T1, T2, T4, T5 and T3 is lost]

10. 2-Safe Backups
[diagram: primary P1 and backup B1, as before]
• Transactions do two-phase commit
• Redo log records propagated in the prepare message
• Transactions not lost, but
  • longer delay, contention
  • cannot process unless both sites are up
• After a failure, go to 1-safe (no backup)
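
Contrast with a 2-safe sketch (again hypothetical names; `backup` is assumed to expose prepare/commit/abort): the redo records travel inside the prepare message, so the commit point waits on the backup, giving the longer delay and the fall-back rule the slide mentions.

```python
class BackupUnavailable(Exception):
    """Raised when the backup cannot vote; per the slide, the primary
    would then switch to 1-safe operation (no backup)."""

def two_safe_commit(txn_id, redo_records, backup, log):
    log.extend(redo_records)                         # force the redo log locally
    try:
        vote = backup.prepare(txn_id, redo_records)  # redo records ride in prepare
    except ConnectionError:
        raise BackupUnavailable("backup down: fall back to 1-safe")
    if vote == "yes":
        log.append(("commit", txn_id))               # commit point: both sides hold T
        backup.commit(txn_id)                        # phase 2
    else:
        log.append(("abort", txn_id))
        backup.abort(txn_id)
```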

11. What is Correctness?
• In 2-safe
• In 1-safe

12. What is in the Paper You Read?
• Specific scenario
  • updates at a fixed primary site
  • each site has multiple computers
  • primary and backup sites are matched one-to-one
  • clean site failures; stable storage; reliable network
  • log shipping
  • no reads at backup
  • no initialization

13. Main Problem: Update Dependencies
[diagram: primary computers P1, P2 with logs L1, L2 ship to backups B1, B2 with logs L1’, L2’; Ta writes at both nodes (records Ta(1), Ta(2)), and Tb runs at node 1 after Ta]
• data dependency: Ta → Tb

14. Main Problem: Update Dependencies
[diagram: at the backup, B1 has received Ta(1) and Tb, but Ta(2) has not yet arrived at B2; can Tb be installed?]
• data dependency: Ta → Tb

15. Main Problem: Update Dependencies
[diagram: same scenario: Ta(1) and Tb have arrived at B1; Ta(2) has not arrived at B2]
• should not install Ta (its node-2 record is missing)
• should not install Tb (it depends on Ta)
• data dependency: Ta → Tb

16. Dependency Reconstruction Algorithm
• Locking at backup to detect dependencies
• Ensure locks are granted in the same order as they were granted at the primary (a sketch follows the example below)

17. Example: Dependency Reconstruction
• tickets reflect local commit order
[diagram: at primary node P1, Ta(1) gets ticket 5 and Tb gets ticket 6; at P2, Ta(2) gets ticket 18]
• data dependency: Ta → Tb

18. Example: Dependency Reconstruction
[diagram: tickets 5 (Ta(1)) and 6 (Tb) arrive at backup B1; ticket 18 (Ta(2)) has not yet arrived at B2; may Tb's locks be granted?]
• data dependency: Ta → Tb

19. Example: Dependency Reconstruction
[diagram: same scenario as slide 18]
• Say Tb requests a lock first at B1;
• Tb's request is delayed until all locks with tickets < 6 have been granted
• data dependency: Ta → Tb
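
A small sketch of the ticket rule (my structure, not the paper's; it also assumes tickets are dense per node, which the example's 5, 6 at B1 satisfies but which is only a simplification): each backup node grants lock requests strictly in ticket order, so Tb's ticket-6 request waits for Ta's ticket-5 request even if Tb's log records arrive first.

```python
import heapq

class BackupNodeLocks:
    """One backup node's lock table (a sketch only)."""
    def __init__(self, first_ticket):
        self.pending = []              # min-heap of (ticket, txn, item)
        self.expected = first_ticket   # smallest ticket not yet granted here

    def replay(self, ticket, txn, item):
        heapq.heappush(self.pending, (ticket, txn, item))
        # a request is held until all smaller tickets have been granted:
        # Tb (ticket 6) waits for Ta (ticket 5) even if Tb arrived first
        while self.pending and self.pending[0][0] == self.expected:
            t, who, what = heapq.heappop(self.pending)
            print(f"grant lock on {what} to {who} (ticket {t})")
            self.expected = t + 1

b1 = BackupNodeLocks(first_ticket=5)
b1.replay(6, "Tb", "X1")   # held: ticket 5 not granted yet
b1.replay(5, "Ta", "X1")   # grants 5 (Ta), then 6 (Tb), matching the primary
```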

20. Epoch Algorithm
• Backup updates are installed in batches
• Epoch delimiters written on log

21. Writing Delimiters at Primary
[diagram: over log time, the master writes epoch delimiters 15 and 16 to its log, and each slave writes the same delimiters to its own log]

22. Problem with Commits
[diagram: logs of the master and two slaves during 2PC for a transaction T, with the epoch delimiters 15 and 16 falling at different points relative to T's prepare and commit records]
• T's commit record is in epoch 15 in some logs and in epoch 16 in others

23. Solution: Bump Epoch
[diagram: same logs as slide 22]
• The prepare ack reports the slave's epoch number; the coordinator bumps its epoch if necessary
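
One way to realize the bump, sketched below with a hypothetical Node class: every prepare ack carries the slave's current epoch, and before writing its commit record the coordinator writes any missing delimiters so that it catches up to the largest epoch reported.

```python
class Node:
    def __init__(self, epoch=15):
        self.log, self.epoch = [], epoch

    def end_epoch(self, n):                  # delimiter, normally broadcast by the master
        self.log.append(("end-epoch", n))
        self.epoch = n + 1

    def prepare(self, txn):                  # slave: log the prepare, ack with epoch
        self.log.append(("prepare", txn))
        return self.epoch

    def coordinator_commit(self, txn, slaves):
        acks = [s.prepare(txn) for s in slaves]
        while self.epoch < max(acks):        # bump: join the latest reported epoch
            self.end_epoch(self.epoch)       # so the commit record cannot land in
        self.log.append(("commit", txn))     # an earlier epoch than any prepare
```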

24. Installing an Epoch at Backup
[diagram: once the "end of 16" delimiter appears in the master's and each slave's shipped log, the backup installs epoch 16 as a batch]

25. To Install Epoch X at Backup Node J
• Redo transactions:
  • If commit(T) ≤ X, commit T
  • If prepare(T) ≤ X but commit(T) > X:
    • If T's primary peer was the coordinator, do not commit;
    • Else check with B', the backup of T's coordinator:
      • If B' is committing T in epoch X, then we commit T
      • Else do not commit T
  • Otherwise do not commit T (defer to the next epoch)
• commit(T) ≤ X means that T's commit record is found in epoch X (or earlier) at node J
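
The decision rule transcribed into Python (argument names are mine; `b_prime_commits_in_X` stands for the check with the coordinator's backup B'):

```python
def install_in_epoch_X(X, prepare_epoch, commit_epoch,
                       peer_was_coordinator, b_prime_commits_in_X):
    """Should backup node J commit transaction T while installing epoch X?
    prepare_epoch / commit_epoch give the epoch in which each record was
    found in J's shipped log (float('inf') if the record never arrived)."""
    if commit_epoch <= X:
        return True                       # commit record in epoch X or earlier
    if prepare_epoch <= X:                # prepared by X but not committed by X
        if peer_was_coordinator:
            return False                  # J's own log is authoritative: no commit
        return b_prime_commits_in_X()     # otherwise ask the coordinator's backup
    return False                          # neither record by X: handle in a later epoch
```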

26. Why Do We Need the Coordinator Check?
• Assignment: construct two scenarios that look the same to backup J:
  • In Scenario 1, T should be installed
  • In Scenario 2, T should not be installed

27. Scenario 1
[diagram: at the slave's backup J, P(T) falls in epoch 15 and C(T) in epoch 16; at B', both P(T) and C(T) fall in epoch 15, so T should be installed with epoch 15]

28. Scenario 2
[diagram: the slave's backup log is identical to Scenario 1, but at B', C(T) falls in epoch 16; J cannot tell the two scenarios apart without asking B']

29. Scenario 3: Possible?
[diagram: logs of B' and the slave with epoch delimiters 15, 16, 17 and records P(T), C(T)]
• Note that T commits at the slave but not at B'!!

30. Scenario 4: Possible?
[diagram: logs of B' and the slave with epoch delimiters 15, 16, 17 and records P(T), C(T)]
• Note that T commits at B' but not at the slave!!

31. Comparison of Options
• 2-safe
• 1-safe
• dependency reconstruction
• epoch
• Specific scenario (from the paper)
  • updates at a fixed primary site
  • each site has multiple computers
  • primary and backup sites are matched one-to-one
  • clean site failures; stable storage; reliable network
  • log shipping
  • no reads at backup
  • no initialization

32. How to Evaluate
• What system?
  • actual system(s)
  • simulation
  • testbed
• What transactions?
  • real transactions
  • synthetic transactions

33. Metrics
• I/O utilization
• CPU utilization
• Throughput (given max delay?)
• Transaction commit delay
• Backup copy lag
• Network overhead
• Probability of inconsistency

34. Sample Results [chart not included in the transcript]

35. Sample Results [chart not included in the transcript]

36. And Now For Something Completely Different:
• Updates
  • at any copy (next: available copies)
  • at fixed (primary) copy (covered already)
  • at one copy but control can migrate
  • no updates

37. PC-Lock Available Copies
• Transactions write-lock at all available copies
• Transactions read-lock at any available copy
• Primary site (static) manages U, the set of available copies
[diagram: copies X1, X2, X3, X4, with X1 the primary and one copy down]

38. Update Transaction
(1) Get U from the primary
(2) Get write locks from the U nodes
(3) Commit at the U nodes
[diagram: primary C0 and backups C1, C2; transaction T3 runs with U = {C0, C1}, sending updates and 2PC messages to C0 and C1]
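
A sketch of the three steps (interfaces are hypothetical; `copies` maps node names to replica objects). This naive version has the race shown on the next two slides, because U can change between step (1) and step (3):

```python
def run_update_transaction(txn, primary, copies):
    U = primary.get_U()                     # (1) get U from the primary
    for node in U:
        copies[node].write_lock(txn)        # (2) write locks at every node in U
    for node in U:
        copies[node].apply_and_commit(txn)  # (3) commit at the U nodes (2PC in full)
```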

39. A Potential Problem: Example
• Now: U = {C0, C1}
[diagram: C2 is recovering and tells the primary C0 "I am recovering" while transaction T3 is running with U = {C0, C1}]

40. A Potential Problem: Example
• Later: U = {C0, C1, C2}
[diagram: the primary replies to C2 "You missed T0, T1, T2" and adds it to U, but T3 then sends its updates only to C0 and C1, so C2 misses T3]

41. Solution:
• Initially, transaction T gets a copy U' of U from the primary (or uses a cached value)
• At commit of T, check U' against the current U at the primary (if different, abort T)

42. Solution Continued
• When CX recovers:
  • request missed and pending transactions from the primary (primary updates U)
  • set write locks for pending transactions
• Primary polls nodes to detect failures (updates U)
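
A sketch of the complete fix from slides 41 and 42 (class and method names are mine): the transaction revalidates its cached U' at commit, and a recovering node is caught up and re-locked before the primary adds it to U, so any transaction still running against the old U aborts.

```python
class Primary:
    def __init__(self, up_nodes):
        self.U = set(up_nodes)              # current set of available copies

    def get_U(self):
        return set(self.U)                  # U', the copy a transaction caches

    def ok_to_commit(self, U_seen):
        return U_seen == self.U             # slide 41: if U changed, abort T

    def node_recovers(self, node, missed, pending):
        node.apply(missed)                  # "you missed T0, T1, T2"
        node.set_write_locks(pending)       # locks for still-pending transactions
        self.U.add(node)                    # from now on, updates must include node

    def node_failed(self, node):            # detected by polling
        self.U.discard(node)
```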

43. Example Revisited
[diagram: T3 runs with U' = {C0, C1} and sends prepare to C0 and C1; meanwhile C2 recovers ("I am recovering"), is told "You missed T0, T1, T2", and U becomes {C0, C1, C2}; at commit, T3's U' no longer matches U, so the commit is rejected and T3 aborts]

44. Available Copies, No Primary
• Let all nodes have a copy of U (not just the primary)
• To modify U, run a special atomic transaction at all available sites (use a commit protocol)
• E.g.: U1 = {C1, C2} → U2 = {C1, C2, C3}; only C1, C2 participate in this transaction
• E.g.: U2 = {C1, C2, C3} → U3 = {C1, C2}; only C1, C2 participate in this transaction

45. Details are tricky...
• What if the commit of a U-change blocks?

46. Node Recovery (No Primary)
• Get missed updates from any active node
• No unique sequence of transactions
• If all nodes fail, wait for
  • all to recover, or
  • a majority to recover?

47. Example
[diagram: a recovering node knows Committed: A, B; one active node knows Committed: A, B, C, D, E, F with Pending: G; another knows Committed: A, C, B, E, D with Pending: F, G, H]
• How much information (update values) must be remembered? By whom?

48. Correctness with Replicated Data
[diagram: X1 and X2 are two copies of item X]
• S1: r1[X1] r2[X2] w1[X1] w2[X2]
• Is this schedule serializable?

49. One-Copy Serializable (1SR)
• A schedule S on replicated data is 1SR if it is equivalent to a serial history of the same transactions on a one-copy database

50. To Check 1SR
• Take the schedule S
• Treat ri[Xj] as ri[X] and wi[Xj] as wi[X] (Xj is a copy of X)
• Compute the precedence graph P(S)
• If P(S) is acyclic, S is 1SR
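
A sketch of the test in Python (the construction is the standard conflict/precedence graph; the encoding of operations is mine). Run on S1 from slide 48 it returns False: once the copies are folded into the single item X, r1[X] → w2[X] and r2[X] → w1[X] put a cycle in P(S), so S1 is serializable per copy but not 1SR.

```python
from itertools import combinations

# schedule S1 from slide 48: (transaction, operation, copy)
S1 = [(1, 'r', 'X1'), (2, 'r', 'X2'), (1, 'w', 'X1'), (2, 'w', 'X2')]

def one_copy_serializable(schedule):
    # step 1: treat ri[Xj] as ri[X], wi[Xj] as wi[X]
    ops = [(t, op, item.rstrip('0123456789')) for t, op, item in schedule]
    # step 2: P(S) has an edge Ti -> Tk for each conflict (same item,
    # different transactions, at least one write, Ti's operation first)
    graph = {}
    for (t1, op1, x1), (t2, op2, x2) in combinations(ops, 2):
        if x1 == x2 and t1 != t2 and 'w' in (op1, op2):
            graph.setdefault(t1, set()).add(t2)
    # step 3: S is 1SR iff P(S) is acyclic (checked by depth-first search)
    def cyclic(node, stack):
        if node in stack:
            return True
        return any(cyclic(n, stack | {node}) for n in graph.get(node, ()))
    return not any(cyclic(n, frozenset()) for n in graph)

print(one_copy_serializable(S1))  # False: S1 is not 1SR
```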
