
Transaction chains: achieving serializability with low-latency in geo-distributed storage systems






  1. Transaction chains: achieving serializability with low-latency in geo-distributed storage systems Yang Zhang Russell Power Siyuan Zhou Yair Sovran *Marcos K. Aguilera Jinyang Li New York University *Microsoft Research Silicon Valley

  2. Why geo-distributed storage? • Large-scale Web applications • Geo-distributed storage • Replication

  3. Geo-distribution is hard • Strong semantics: relational tables w/ transactions • Low latency: O(intra-datacenter RTT)

  4. Prior work (latency vs. semantics) • Spanner [OSDI’12]: strict serializability with general transactions, but provably high latency according to CAP • Walter [SOSP’11], COPS [SOSP’11], Eiger [NSDI’13]: low latency, but various non-serializable semantics and limited forms of transactions • Dynamo [SOSP’07]: low latency, eventual consistency, key/value only • Our work: serializable general transactions at low latency

  5. Our contributions • A new primitive: transaction chain • Allow for low latency, serializable transactions • Lynx geo-storage system: built with chains • Relational tables • Secondary indices, materialized join views

  6. Talk Outline • Motivation • Transaction chains • Lynx • Evaluation

  7. Why transaction chains? Running example: an auction service with Items and Bids tables; Alice is in Datacenter-1, Bob in Datacenter-2

  8. Why transaction chains? Operation: Alice bids on Bob’s camera 1. Insert bid to Alice’s Bids 2. Update highest bid on Bob’s Items (Alice’s Bids in Datacenter-1, Bob’s Items in Datacenter-2)

  9. Why transaction chains? Operation: Alice bids on Bob’s camera 1. Insert bid to Alice’s Bids 2. Update highest bid on Bob’s Items Alice’s Bids Bob Bob’s Items Alice Datacenter-1 Datacenter-2
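The two-step operation on this slide can be pictured as an ordered list of hops, each naming the table (and thus the server and datacenter) it runs on plus a local operation to execute there. A minimal Python sketch with illustrative names only (`Hop`, `Chain`, and the fixed $500 bid are not Lynx's API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Hop:
    table: str                   # table (and thus server) this hop runs on
    op: Callable[[dict], None]   # local operation executed at that hop

@dataclass
class Chain:
    hops: list

def insert_bid(row):
    # Hop 1, at Alice's datacenter: append the bid to her Bids row.
    row.setdefault("bids", []).append(("alice", 500))

def update_highest(row):
    # Hop 2, at Bob's datacenter: raise the item's highest bid if needed.
    row["highest"] = max(row.get("highest", 0), 500)

# Alice bids on Bob's camera: hop 1 in Datacenter-1, hop 2 in Datacenter-2.
bid_chain = Chain(hops=[
    Hop(table="Bids:alice", op=insert_bid),
    Hop(table="Items:bob", op=update_highest),
])
```

Executing the chain means visiting its hops in order, each one running against its own server's copy of the table.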

  10. Low latency with first-hop return Alice bids $500 on Bob’s camera: Alice’s Bids in Datacenter-1, Bob’s Items in Datacenter-2

  11. Problem: what if chains fail? • What if servers fail after executing the first hop? • What if a chain is aborted in the middle?

  12. Solution: provide all-or-nothing atomicity • Chains are durably logged at the first hop • Logs are replicated to the nearest other datacenter • Chains are re-executed upon recovery • Chains allow user aborts only at the first hop • Guarantee: first hop commits ⇒ all hops eventually commit
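These bullets amount to a first-hop server that durably logs the entire chain before acknowledging, so the client sees only intra-datacenter latency while the later hops run asynchronously and survive crashes through re-execution. A hypothetical sketch, not Lynx's actual code (a real system also needs deduplication so re-executed hops stay idempotent):

```python
import json

class FirstHopServer:
    """Sketch: the first hop durably logs the whole chain, applies its own
    hop, and replies; remaining hops run asynchronously and are re-executed
    from the log after a failure."""

    def __init__(self):
        self.log = []    # stand-in for a durable log, replicated nearby
        self.data = {}   # this server's local table

    def execute_chain(self, chain):
        self.log.append(json.dumps(chain))   # 1. log before acknowledging
        key, value = chain[0]                # 2. apply the first hop locally
        self.data[key] = value
        return "committed"                   # 3. reply after one local round trip
        # chain[1:] is forwarded asynchronously; because the chain is
        # logged, the later hops eventually commit even across crashes.

    def recover(self, remote_apply):
        """Re-execute the remaining hops of every logged chain."""
        for entry in self.log:
            for key, value in json.loads(entry)[1:]:
                remote_apply(key, value)
```

Because the log entry is written before the client sees "committed", a crash between hops loses nothing: recovery replays the tail of every logged chain, which is exactly the first-hop-commits ⇒ all-hops-commit guarantee.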

  13. Problem: non-serializable interleaving • Concurrent chains can be ordered inconsistently at different hops: Server-X executes T1 (X=1) before T2 (X=2), while Server-Y executes T2 (Y=2) before T1 (Y=1). Not serializable! • Traditional 2PL+2PC prevents non-serializable interleaving, at the cost of high latency

  14. Solution: detect non-serializable interleaving via static analysis • Statically analyze all chains to be executed; Web applications invoke a fixed set of operations • Build a graph over the hops of T1 (X=1, Y=1) and T2 (X=2, Y=2): do they conflict? • An SC-cycle has both red (same-chain) and blue (conflict) edges • Serializable if no SC-cycle [Shasha et al., TODS ’95]
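The SC-cycle test from transaction chopping can be sketched concretely: S-edges connect hops of the same chain, C-edges connect hops of different chains that touch the same data, and the chopping is serializable iff no cycle contains both kinds of edge. A simplified Python sketch (it treats any two hops on the same table as conflicting, which is coarser than a real read/write-set analysis):

```python
from collections import deque
from itertools import combinations

def build_edges(chains):
    """S-edges join hops of the same chain; C-edges join hops of
    different chains that touch the same table (a coarse conflict test)."""
    s_edges, c_edges = set(), set()
    for ci, chain in enumerate(chains):
        for h1, h2 in combinations(range(len(chain)), 2):
            s_edges.add(((ci, h1), (ci, h2)))
    for (ci, chain_i), (cj, chain_j) in combinations(list(enumerate(chains)), 2):
        for hi, ti in enumerate(chain_i):
            for hj, tj in enumerate(chain_j):
                if ti == tj:
                    c_edges.add(((ci, hi), (cj, hj)))
    return s_edges, c_edges

def has_sc_cycle(chains):
    """True iff some cycle mixes S- and C-edges (the non-serializable case)."""
    s_edges, c_edges = build_edges(chains)
    adj = {}
    for u, v in s_edges | c_edges:
        is_c = (u, v) in c_edges
        adj.setdefault(u, []).append((v, is_c))
        adj.setdefault(v, []).append((u, is_c))
    # An SC-cycle exists iff, for some S-edge (a, b), b is reachable from a
    # without using that edge, along a path crossing at least one C-edge.
    for a, b in s_edges:
        seen = set()
        queue = deque([(a, False)])
        while queue:
            node, via_c = queue.popleft()
            if (node, via_c) in seen:
                continue
            seen.add((node, via_c))
            if node == b and via_c:
                return True
            for nxt, is_c in adj.get(node, []):
                if {node, nxt} == {a, b} and not is_c:
                    continue  # do not reuse the S-edge under test
                queue.append((nxt, via_c or is_c))
    return False
```

Run on two concurrent Put-bid chains, each touching the Bids table and then the Items table, this reports an SC-cycle, matching the analysis shown later for slide 19.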

  15. Outline • Motivation • Transaction chains • Lynx’s design • Evaluation

  16. How Lynx uses chains • User chains: used by programmers to implement application logic • System chains: used internally to maintain • Secondary indexes • Materialized join views • Geo-replicas

  17. Example: secondary index Bids (secondary index) Bids (base table)

  18. Example user and system chain Alice bid on Bob’s camera Bob Datacenter-1 Datacenter-2

  19. Lynx statically analyzes all chains beforehand • Put-bid chain: insert to Bids table, then update Items table • Read-bids chain: read Bids table • Two concurrent Put-bid chains form an SC-cycle • One solution: execute the chain as a distributed transaction

  20. SC-cycle source #1: false conflicts in user chains • Two concurrent Put-bid chains (insert to Bids table, update Items table) appear to conflict on Items • False conflict, because max(bid, current_price) commutes

  21. Solution: users annotate commutativity • Annotating Put-bid’s Items update as commutative removes the false conflict between concurrent Put-bid chains
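The annotation is sound because max-updates commute: applying two bids in either order yields the same highest price, so the analysis can safely drop the conflict edge between the two chains' Items hops. A tiny sketch (illustrative function name, not Lynx's annotation syntax):

```python
def update_highest_bid(item, bid):
    """Hypothetical annotated-commutative hop: max() gives the same
    result whatever order concurrent bids arrive in."""
    item["highest"] = max(item["highest"], bid)

# Two interleavings of the same pair of bids converge on one state,
# so the apparent conflict between two Put-bid chains is false.
camera = {"highest": 0}
for bid in (500, 300):
    update_highest_bid(camera, bid)
```

Either arrival order, 500-then-300 or 300-then-500, leaves the item at 500, which is exactly the property the programmer asserts with the annotation.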

  22. SC-cycle source #2: system chains • Each Put-bid chain inserts to the Bids table and then to the Bids-secondary index • Two concurrent Put-bid chains conflict at both hops, forming an SC-cycle

  23. Solution: chains provide origin-ordering • Observation: conflicting system chains originate at the same first-hop server — T1 (insert to Bids table, insert to Bids-secondary) and T2 both write the same row of the Bids table • Origin-ordering: if T1 < T2 at the same first hop, then T1 < T2 at all subsequent overlapping hops • Can be implemented cheaply with sequence number vectors
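Origin-ordering with sequence numbers can be sketched as follows: the first hop stamps each outgoing chain with an (origin, seq) pair, and every subsequent hop buffers early arrivals so that chains from the same origin apply strictly in sequence order. A hypothetical sketch of the receiving side (names are illustrative, not Lynx's code):

```python
import heapq

class SubsequentHop:
    """Sketch of a later hop enforcing origin-ordering: chains carry an
    (origin, seq) stamp from their first hop, and this hop applies chains
    from each origin strictly in seq order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = {}   # origin -> next sequence number to apply
        self.pending = {}    # origin -> min-heap of (seq, op)
        self.applied = []    # operations applied, in order

    def deliver(self, origin, seq, op):
        self.next_seq.setdefault(origin, 0)
        heap = self.pending.setdefault(origin, [])
        heapq.heappush(heap, (seq, op))
        # Drain everything that is now in order for this origin.
        while heap and heap[0][0] == self.next_seq[origin]:
            _, ready = heapq.heappop(heap)
            self.applied.append(ready)
            self.next_seq[origin] += 1
```

Note that ordering is only enforced per origin; chains from different first-hop servers never conflict (per the observation above), so they apply as soon as they arrive. That is what makes this cheap compared to 2PL+2PC.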

  24. Limitations of Lynx/chains • Chains are serializable, but not strictly serializable • Programmers can abort only at the first hop • Our application experience: the limitations are manageable

  25. Outline • Motivation • Transaction chains • Lynx’s design • Evaluation

  26. Simple Twitter Clone on Lynx • Geo-replicated Tweets and Follow-Graph tables • Timeline: materialized view of Tweets JOIN Follow-Graph • Secondary index on Follow-Graph

  27. Experimental setup • Three datacenters: us-east, us-west, europe; inter-datacenter RTTs of 82ms, 102ms, and 153ms • Lynx prototype: in-memory database, local disk logging only

  28. Returning on first-hop allows low latency Chain completion First hop return

  29. Applications achieve good throughput

  30. Related work • Transaction decomposition: SAGAS [SIGMOD’87], step-decomposed transactions • Incremental view maintenance: views for PNUTS [SIGMOD’09] • Various geo-distributed/replicated storage: Spanner [OSDI’12], MDCC [Eurosys’13], Megastore [CIDR’11], COPS [SOSP’11], Eiger [NSDI’13], RedBlue [OSDI’12]

  31. Conclusion • Chains support serializability at low latency, via static analysis of SC-cycles • Key techniques to reduce SC-cycles: origin ordering, commutativity annotations • Chains are useful for performing application logic and for maintaining indices/join views/geo-replicas

  32. Limitations of Lynx/chains 1. Chains are not strictly serializable, only serializable • Remedies: programmers can wait for chain completion; Lynx provides read-your-own-writes 2. Programmers can only abort at the first hop • Our application experience shows the limitations are manageable

  33. 2PC and chains: the easy way (figure: T1 reads A while T2 writes A and B; T2’s chain hops W(A), W(B) are executed as a single 2PC write 2PC-W(AB))

  34. 2PC and chains: the hard way (figure: T1 reads both A and B while T2 writes both; T2’s W(A), W(B) are executed as 2PC-W(AB))

  35. 2PC and chains: the hard way (figure: a chain over A, B, C, D across DC1–DC4, with parallel unlock and 2PC retry)

  36. Lynx is scalable

  37. Challenge of static analysis: false conflicts • T1 and T2 each: 1. insert bid into bid history, 2. update max price on item • Apparent conflicts on bid history and on item • SC-cycle ⇒ not serializable

  38. Solution: commutativity annotations • No real conflict on bid history, because bid ids are unique • Updating the max price commutes • With both operations annotated commutative: no SC-cycle ⇒ serializable

  39. ACID: all-or-nothing atomicity • Chain’s failure guarantee: • If the first hop of a chain commits, then all hops eventually commit • Users are only allowed to abort a chain in the first hop • Achievable with low latency: • Log chains durably at the first hop • Logs replicated to a nearby datacenter • Re-execute stalled chains upon failure recovery

  40. ACID: serializability • Serializability: the execution result appears as if all transactions obeyed some serial order • No restrictions on the serial order

  41. Problem #2: unsafe interleaving • Serializability: the execution result appears as if all transactions obeyed some serial order • No restrictions on the serial order

  42. Chains are not linearizable • Serializability ⇒ a total ordering of chains • Linearizability ⇒ a total ordering of chains, and the total order obeys the issue order

  43. Transaction chains: recap • Chains provide all-or-nothing atomicity • Chains ensure serializability via static analysis • Practical challenges: • How to use chains? • How to avoid SC-cycles?

  44. Example user chain Items Bids Alice Bob 2. Update max price on Bob’s camera 1. Insert bid into Alice’s bid history

  45. Lynx implementation • 5000 lines of C++ and a 3500-line RPC library • Uses an in-memory key/value store • Supports user chains written in JavaScript (via V8)

  46. Geo-distributed storage is hard • Applications demand simplicity & performance • Friendly programming model • Relational tables • Transactions • Fast response • Ideally, operation latency = O(intra-datacenter RTT) • Geo-distribution leads to high latency • Coordinate data access across datacenters • Operation latency = O(inter-datacenter RTT) = O(100ms)
