
Consistency Guarantees and Snapshot isolation

Marcos Aguilera, Mahesh Balakrishnan, Rama Kotla, Vijayan Prabhakaran, Doug Terry (MSR Silicon Valley)


Presentation Transcript


  1. Consistency Guarantees and Snapshot isolation
     Marcos Aguilera, Mahesh Balakrishnan, Rama Kotla, Vijayan Prabhakaran, Doug Terry
     MSR Silicon Valley

  2. Goals
     Develop a cloud storage system featuring:
     • multiple consistency levels
     • requires one API to learn, one system to administer
     • handles diversity of requirements within and across applications
     • read-write transactions
     • with snapshot isolation
     • on replicated and partitioned data
     • consistency-based SLAs

  3. Geo-Replication
     [Figure labels: datacenter, remote datacenter, primary, secondaries, remote secondaries, Read, Write]

  4. Client API
     • Get (key)
     • Put (key, object)
     • BeginTx (consistency)
     • EndTx ()
     • BeginSession (consistency)
     • EndSession ()
     [Figure labels: Session, Transaction, Puts/Gets]
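
The six calls on this slide can be illustrated with a toy in-memory client. This is only a sketch of how the calls might fit together: the class name `Client`, the snake_case method names, the inheritance of the session's consistency level by a transaction, and reads seeing the transaction's own buffered Puts are all assumptions, not details given in the slides.

```python
from typing import Any, Dict, Optional


class Client:
    """Toy in-memory stand-in for the storage client (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self._session_consistency: Optional[str] = None
        self._tx_consistency: Optional[str] = None
        self._tx_writes: Optional[Dict[str, Any]] = None

    def begin_session(self, consistency: str) -> None:
        self._session_consistency = consistency

    def end_session(self) -> None:
        self._session_consistency = None

    def begin_tx(self, consistency: Optional[str] = None) -> None:
        # Assumption: a transaction with no level of its own uses the session's.
        self._tx_consistency = consistency or self._session_consistency
        self._tx_writes = {}

    def end_tx(self) -> None:
        if self._tx_writes is None:
            raise RuntimeError("no transaction in progress")
        # All buffered Puts become visible together.
        self._store.update(self._tx_writes)
        self._tx_writes = None
        self._tx_consistency = None

    def put(self, key: str, obj: Any) -> None:
        if self._tx_writes is not None:
            self._tx_writes[key] = obj  # buffered until end_tx
        else:
            self._store[key] = obj

    def get(self, key: str) -> Any:
        # Assumption: a transaction sees its own buffered Puts.
        if self._tx_writes is not None and key in self._tx_writes:
            return self._tx_writes[key]
        return self._store.get(key)


client = Client()
client.begin_session("read-my-writes")
client.begin_tx("strong")
client.put("a", 1)
client.put("b", 2)
client.end_tx()
client.end_session()
```

The sketch ignores replication and timestamps entirely; it only shows the nesting of Gets and Puts inside a transaction, and of transactions inside a session.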

  5. Transaction Properties
     • Conventional transaction model
     • BeginTx … EndTx
     • Atomic updates to multiple objects
     • Multi-object reads from snapshots
     • Even across partitions

  6. Partitioned Data for Scalability
     • Data partitioned by key range
     • Each partition has its own primary and secondary servers
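
Key-range partitioning can be sketched as a sorted list of range start keys searched with binary search. The names `Partition` and `PartitionMap`, the string keys, and the server names are hypothetical; the slides do not say how ranges are represented.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    start_key: str            # inclusive lower bound of this key range
    primary: str              # hypothetical server name
    secondaries: List[str]


class PartitionMap:
    def __init__(self, partitions: List[Partition]) -> None:
        self._partitions = sorted(partitions, key=lambda p: p.start_key)
        self._starts = [p.start_key for p in self._partitions]

    def lookup(self, key: str) -> Partition:
        # Find the last partition whose start_key is <= key.
        i = bisect.bisect_right(self._starts, key) - 1
        if i < 0:
            raise KeyError(f"no partition covers key {key!r}")
        return self._partitions[i]


pmap = PartitionMap([
    Partition("", "p0-primary", ["p0-sec1", "p0-sec2"]),
    Partition("h", "p1-primary", ["p1-sec1"]),
    Partition("q", "p2-primary", ["p2-sec1"]),
])
```

A write to `"kiwi"` would go to `pmap.lookup("kiwi").primary`, while a read could go to any server of that partition that satisfies the requested consistency.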

  7. Write Operations
     • Writes performed at primary server(s)
     • May have different primaries for different objects
     • Propagate to secondary servers eventually
     • Any gossip or anti-entropy protocol will do
     • Have a commit timestamp, i.e. global order
     • And deterministic outcomes
     • No write conflicts
     => All replicas converge towards a mutually consistent state

  8. Versioned Data Store
     • Store version history for each object
     • Can perform writes as soon as commit timestamp is known
     • Need not perform writes in commit order
     • Can eventually prune old versions
     [Figure: version timelines over time. Object A: V1, V2, V3, V4. Object B: V1, V2.]
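
A minimal sketch of such a version history, assuming integer commit timestamps and a per-key list kept sorted by timestamp. The class `VersionedStore` and its methods are illustrative names, and the pruning rule (keep the newest version at or before the cutoff so reads at the cutoff still work) is one possible choice rather than something the slides specify.

```python
import bisect
from typing import Any, Dict, List, Tuple


class VersionedStore:
    def __init__(self) -> None:
        # key -> list of (commit_ts, value), sorted by commit_ts
        self._history: Dict[str, List[Tuple[int, Any]]] = {}

    def write(self, key: str, value: Any, commit_ts: int) -> None:
        # Writes may arrive in any order; insert at the sorted position.
        history = self._history.setdefault(key, [])
        timestamps = [ts for ts, _ in history]
        history.insert(bisect.bisect_left(timestamps, commit_ts), (commit_ts, value))

    def prune(self, key: str, before_ts: int) -> int:
        """Drop versions superseded at or before before_ts; return how many were dropped."""
        history = self._history.get(key, [])
        timestamps = [ts for ts, _ in history]
        keep_from = max(bisect.bisect_right(timestamps, before_ts) - 1, 0)
        del history[:keep_from]
        return keep_from

    def versions(self, key: str) -> List[Tuple[int, Any]]:
        return list(self._history.get(key, []))


store = VersionedStore()
store.write("A", "V3", 30)   # applied before the earlier versions arrive
store.write("A", "V1", 10)
store.write("A", "V2", 20)
```

Because each version carries its commit timestamp, applying V3 before V1 and V2 still yields the same history as applying them in commit order.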

  9. Per-Replica State
     • Datastore = set of <key, value, timestamp>
     • High-time = timestamp of latest received write transaction
     • Assumes transactions are received in order
     • May receive periodic null transactions
     • Low-time = timestamp of most recent discarded object version
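
The three pieces of per-replica state can be sketched as a small class. The names `ReplicaState`, `receive`, and `discard` are hypothetical, timestamps are assumed to be integers, and values are assumed hashable so that the datastore can literally be a set of tuples.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Set, Tuple


@dataclass
class ReplicaState:
    datastore: Set[Tuple[str, Any, int]] = field(default_factory=set)
    high_time: int = 0   # timestamp of latest received write transaction
    low_time: int = 0    # timestamp of most recent discarded object version

    def receive(self, commit_ts: int, writes: Optional[Dict[str, Any]] = None) -> None:
        """Apply a write transaction; an empty one acts as a null transaction."""
        if commit_ts <= self.high_time:
            raise ValueError("transactions must be received in timestamp order")
        for key, value in (writes or {}).items():
            self.datastore.add((key, value, commit_ts))
        # Even a null transaction advances high_time.
        self.high_time = commit_ts

    def discard(self, key: str, value: Any, ts: int) -> None:
        self.datastore.remove((key, value, ts))
        self.low_time = max(self.low_time, ts)


replica = ReplicaState()
replica.receive(5, {"x": "a"})
replica.receive(8)               # null transaction: no writes, high_time moves on
replica.receive(9, {"x": "b"})
replica.discard("x", "a", 5)
```

Null transactions let an idle replica show that it has missed nothing up to a given time, which widens the range of read timestamps it can serve.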

  10. Read Operations
     • Single-key Gets go to one server
     • Multi-partition transactions may read from multiple servers
     • Server(s) selected based on desired consistency
     • E.g. read from nearby server when possible
     • Alternative: Broadcast operation to all servers
     • Take first response that is consistent enough

  11. Read-Only Transactions
     • Transaction assigned a read timestamp
     • Read from snapshot at that time
     • See all write transactions committed before this time, and only those writes
     • Consistency guarantee places constraints on read timestamp

  12. Reads on Versioned Data Store
     • Allows reads at any timestamp
     • Without placing constraints on write propagation
     • Assuming no future transaction could be assigned a commit timestamp before the read timestamp
     [Figure: version timelines over time with a read timestamp marked. Object A: V1, V2, V3, V4. Object B: V1, V2.]
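
A snapshot read against a version history reduces to finding the newest version whose commit timestamp does not exceed the read timestamp. The sketch below assumes histories sorted by commit timestamp and treats a version committed exactly at the read timestamp as visible; that boundary choice, the function name `read_at`, and the timestamps in the example are assumptions.

```python
import bisect
from typing import Any, List, Optional, Tuple


def read_at(history: List[Tuple[int, Any]], read_ts: int) -> Optional[Any]:
    """Return the value visible at read_ts, or None if no version existed yet.

    history is a list of (commit_ts, value) sorted by commit_ts.
    """
    timestamps = [ts for ts, _ in history]
    i = bisect.bisect_right(timestamps, read_ts)
    return history[i - 1][1] if i > 0 else None


# Hypothetical timestamps for the two objects in the figure.
object_a = [(10, "A.V1"), (20, "A.V2"), (30, "A.V3"), (40, "A.V4")]
object_b = [(15, "B.V1"), (35, "B.V2")]

snapshot = (read_at(object_a, 25), read_at(object_b, 25))
```

Reading both objects at the same timestamp yields a mutually consistent pair, even if the two histories live on different servers.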

  13. Selecting Read Timestamp
     Assuming in-order delivery of writes

  14. Acceptable Read Timestamps
     [Figure: acceptable read timestamp ranges for strong, read-my-writes, monotonic, bounded, causal, and eventual consistency, on a time axis marked 0 and BeginTx]
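
One way to read this figure is that each guarantee sets a lower bound on the read timestamp. The mapping below is an interpretation, not something the slide states: `SessionState`, `min_read_timestamp`, and the per-guarantee rules (in particular the session-wide tracking for monotonic and causal reads) are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    max_write_ts: int = 0   # commit timestamp of the session's latest write
    max_read_ts: int = 0    # largest read timestamp used so far in the session


def min_read_timestamp(consistency: str, session: SessionState,
                       begin_tx_time: int, latest_commit_ts: int,
                       bound: int = 0) -> int:
    """Smallest acceptable read timestamp under the assumed rules."""
    if consistency == "strong":
        return latest_commit_ts                  # must see every committed write
    if consistency == "read-my-writes":
        return session.max_write_ts              # must see the session's own writes
    if consistency == "monotonic":
        return session.max_read_ts               # must not go back in time
    if consistency == "bounded":
        return max(0, begin_tx_time - bound)     # at most `bound` time units stale
    if consistency == "causal":
        return max(session.max_write_ts, session.max_read_ts)
    if consistency == "eventual":
        return 0                                 # any snapshot is acceptable
    raise ValueError(f"unknown consistency level: {consistency}")


session = SessionState(max_write_ts=30, max_read_ts=20)
floor_for_bounded = min_read_timestamp("bounded", session, 100, 90, bound=15)
```

Under these assumptions, weaker guarantees allow older read timestamps, which in turn lets more replicas answer the read.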

  15. Selecting Read Timestamp
     [Figure: low-time to high-time ranges for node A, node B, and node C on a time axis, with a read timestamp marked]
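
For a transaction that touches several nodes, the figure suggests choosing a read timestamp that falls inside every node's low-time to high-time range. The sketch below assumes a node can answer at any timestamp in its closed range [low, high] and picks the freshest common timestamp; the function name, the boundary handling, and the preference for the largest value are assumptions.

```python
from typing import Dict, Optional, Tuple


def select_read_timestamp(node_ranges: Dict[str, Tuple[int, int]],
                          min_acceptable: int = 0) -> Optional[int]:
    """Pick the largest timestamp every node can serve, or None if there is none.

    node_ranges maps a node name to its (low_time, high_time).
    min_acceptable is the floor imposed by the consistency guarantee.
    """
    if not node_ranges:
        raise ValueError("at least one node is required")
    lowest_ok = max(min_acceptable, max(low for low, _ in node_ranges.values()))
    highest_ok = min(high for _, high in node_ranges.values())
    return highest_ok if lowest_ok <= highest_ok else None


ranges = {"A": (10, 50), "B": (20, 40), "C": (5, 45)}
chosen = select_read_timestamp(ranges)
```

If no common timestamp exists, or the guarantee's floor is above every node's high-time, the client would have to contact different servers, such as the primaries.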

  16. Read-Write Transactions
     • Transaction assigned a read timestamp and a commit timestamp
     • Use optimistic concurrency control
     • Old read timestamps increase the chance of abort
     • Read from snapshot at read timestamp
     • With selected consistency guarantee
     • Batch writes until commit
     • No undo needed
     • Validate transaction at commit timestamp

  17. Transaction Lifetime
     [Figure: timeline of a Transaction within a Session, showing Get(x) … Put(x, value)]
     Annotations along the timeline:
     • Select read timestamp and perform Get
     • Buffer Put
     • Get commit timestamp, validate, and perform Puts

  18. Committing Write Transactions
     Snapshot isolation =>
     • Check that no object being written has a version between the transaction’s read timestamp and commit timestamp
     Serializability =>
     • Check that no object being read or written has a version between the transaction’s read timestamp and commit timestamp
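
The two commit checks differ only in which keys are validated. The sketch below assumes integer timestamps and treats "between" as strictly after the read timestamp and strictly before the commit timestamp; the function names and the example data are hypothetical.

```python
from typing import Dict, Iterable, List


def has_conflict(versions: Dict[str, List[int]], keys: Iterable[str],
                 read_ts: int, commit_ts: int) -> bool:
    """True if any key has a version committed between read_ts and commit_ts."""
    return any(read_ts < ts < commit_ts
               for key in keys
               for ts in versions.get(key, []))


def can_commit(versions: Dict[str, List[int]], read_set: Iterable[str],
               write_set: Iterable[str], read_ts: int, commit_ts: int,
               serializable: bool = False) -> bool:
    keys = set(write_set)
    if serializable:
        keys |= set(read_set)   # serializability also validates the read set
    return not has_conflict(versions, keys, read_ts, commit_ts)


# Commit timestamps of existing versions, per key.
versions = {"x": [10, 25], "y": [12]}

# A transaction that read x at timestamp 20 and wants to write y at 30.
ok_under_si = can_commit(versions, {"x"}, {"y"}, read_ts=20, commit_ts=30)
ok_under_ser = can_commit(versions, {"x"}, {"y"}, read_ts=20, commit_ts=30,
                          serializable=True)
```

In this example the transaction passes the snapshot isolation check, because nobody wrote `y` in the interval, but fails the serializability check, because `x` was overwritten at timestamp 25 after the transaction read it.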
