
Telex/IceCube: a distributed memory with tunable consistency


Presentation Transcript


1. Telex/IceCube: a distributed memory with tunable consistency
Marc Shapiro, Pierre Sutra, Pierpaolo Cincilla (INRIA & LIP6, Regal group)
Nuno Preguiça (Universidade Nova de Lisboa)
Distributed TMs, 22-Feb-2012

2. Telex/IceCube
• It is:
  • A distributed shared memory
  • Object-based (high-level operations)
  • Transactional (ACID… for some definition of ID)
  • Persistent
• Top-level design goal: availability
  • ∞-optimistic ⇒ (reconcile ⇒ cascading aborts)
  • Minimize aborts ⇒ non-sequential schedules
  • (Only) consistent enough to enforce invariants
  • Partial replication

3. Scheduling (∞-optimistic execution)
• Sequential execution
  • In (semantic) dependence order
  • Dynamic checks for preconditions: if OK, application approves schedule
• Conflict ⇒ fork
  • Latest checkpoint + replay
  • Independent actions not rolled back
• Semantics:
  • Conflict = non-commuting, antagonistic
  • Independent
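As a rough illustration of this loop, the sketch below (plain Java with invented names such as Calendar, Action, and replay; not the actual Telex code) tentatively executes a schedule against a checkpoint, checks each action's dynamic precondition, and abandons the schedule so an alternative can be replayed when a check fails.

    import java.util.List;

    // Illustrative sketch only: names (Action, Calendar, replay) are hypothetical,
    // not the real Telex API. It shows optimistic execution of a tentative
    // schedule with dynamic precondition checks and checkpoint/replay on failure.
    public class OptimisticReplay {

        // Hypothetical shared state: one free/busy slot per weekday.
        static class Calendar {
            boolean[] busy = new boolean[7];
            Calendar copy() { Calendar c = new Calendar(); c.busy = busy.clone(); return c; }
        }

        // A reified operation: a dynamic precondition plus an effect.
        interface Action {
            boolean precondition(Calendar c);   // checked at (re)execution time
            void apply(Calendar c);
        }

        static Action book(int day) {
            return new Action() {
                public boolean precondition(Calendar c) { return !c.busy[day]; }
                public void apply(Calendar c) { c.busy[day] = true; }
            };
        }

        // Replays a tentative schedule against a checkpoint; returns the new state
        // if every precondition holds, or null so the caller can try another schedule.
        static Calendar replay(Calendar checkpoint, List<Action> schedule) {
            Calendar state = checkpoint.copy();          // fork from the latest checkpoint
            for (Action a : schedule) {
                if (!a.precondition(state)) return null; // dynamic check failed: reject schedule
                a.apply(state);
            }
            return state;                                // application would approve this schedule
        }

        public static void main(String[] args) {
            Calendar checkpoint = new Calendar();
            checkpoint.busy[1] = true;                   // Tuesday already taken

            List<Action> bad = List.of(book(1), book(4));  // conflicts on Tuesday
            List<Action> good = List.of(book(4), book(0)); // independent bookings

            System.out.println("bad schedule ok?  " + (replay(checkpoint, bad) != null));
            System.out.println("good schedule ok? " + (replay(checkpoint, good) != null));
        }
    }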

4. Telex lifecycle: application operations
• User requests
  → application: actions, dependence
  → Telex: add to ACG, transmit
[Figure: the Shared Calendar application passes +action(appt) to Telex]
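A minimal sketch of this step, assuming a hypothetical TelexSite facade with addAction/addConstraint methods (not the real Telex interface): the application reifies user requests as actions, records their dependence as constraints, and the site stores both in its ACG before (conceptually) transmitting them.

    import java.util.ArrayList;
    import java.util.List;

    // Minimal sketch of the "log actions and constraints" step; class and method
    // names (TelexSite, ActionRef, addAction, addConstraint) are invented for
    // illustration and are not the real Telex interface.
    public class LifecycleSketch {

        record ActionRef(long id, String description) {}
        record Constraint(String kind, ActionRef from, ActionRef to) {}

        // Stand-in for one site's action-constraint graph (ACG).
        static class TelexSite {
            private long next = 0;
            final List<ActionRef> actions = new ArrayList<>();
            final List<Constraint> constraints = new ArrayList<>();

            ActionRef addAction(String description) {          // application reifies a user request
                ActionRef a = new ActionRef(next++, description);
                actions.add(a);                                  // Telex adds it to the ACG ...
                // ... and would asynchronously transmit it to the other replicas here.
                return a;
            }

            void addConstraint(String kind, ActionRef from, ActionRef to) {
                constraints.add(new Constraint(kind, from, to)); // scheduling invariant between actions
            }
        }

        public static void main(String[] args) {
            TelexSite telex = new TelexSite();
            // Shared-calendar application turning user requests into actions + dependence.
            ActionRef create = telex.addAction("create appointment #42");
            ActionRef move   = telex.addAction("move appointment #42 to Friday");
            telex.addConstraint("causal-dependence", move, create); // move only makes sense after create

            System.out.println(telex.actions.size() + " actions, "
                    + telex.constraints.size() + " constraint(s) in the ACG");
        }
    }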

5. Telex lifecycle: schedule
• ACG
  → Telex: valid path = sound schedule
  → application: tentative execute
  → application: if OK, approve
[Figure: Telex hands a schedule ("sched") to the Shared Calendar application, which replies "approve"]

6. Telex lifecycle: conflict ⇒ multiple schedules
• ACG
  → Telex: valid path = sound schedule
  → application: tentative execute
  → application: if OK, approve
[Figure: Telex proposes sched1 and sched2 to the Shared Calendar application, which answers approve2]

7. Constraints (semantics-based conflict detection)
• Action: reified operation
• Constraint: scheduling invariant
• Binary relations:
  • NotAfter
  • Enables (implication)
  • NonCommuting
• Combinations:
  • Antagonistic
  • Atomic
  • Causal dependence
• Action-constraint graph (ACG)
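The sketch below shows one natural encoding of the composite constraints in terms of the two primitives (mutual NotAfter for antagonism, mutual Enables for atomicity, NotAfter plus Enables for causal dependence). The class names are illustrative and the encoding is an assumption based on the slide's list, not a copy of the Telex sources.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of how the composite constraints can be expressed with the two
    // primitives NotAfter and Enables; illustrative names, not the Telex source.
    public class Constraints {

        enum Primitive { NOT_AFTER, ENABLES }

        record Action(String name) {}
        record Constraint(Primitive kind, Action a, Action b) {}

        // NotAfter(a, b): in any schedule containing both, a does not run after b.
        static Constraint notAfter(Action a, Action b) { return new Constraint(Primitive.NOT_AFTER, a, b); }
        // Enables(a, b): if a is in a schedule, b must be in it too (implication).
        static Constraint enables(Action a, Action b) { return new Constraint(Primitive.ENABLES, a, b); }

        // Antagonistic: mutual NotAfter, so a and b can never both be scheduled.
        static List<Constraint> antagonistic(Action a, Action b) {
            return List.of(notAfter(a, b), notAfter(b, a));
        }

        // Atomic: mutual Enables, so either both actions run or neither does.
        static List<Constraint> atomic(Action a, Action b) {
            return List.of(enables(a, b), enables(b, a));
        }

        // Causal dependence of b on a: b runs only after a, and only if a runs.
        static List<Constraint> causallyDependsOn(Action b, Action a) {
            return List.of(notAfter(a, b), enables(b, a));
        }

        public static void main(String[] args) {
            Action m1tue = new Action("M1 on Tuesday");
            Action m1fri = new Action("M1 on Friday");
            List<Constraint> acg = new ArrayList<>(antagonistic(m1tue, m1fri)); // pick at most one slot
            System.out.println(acg);
        }
    }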

8. Single-site scheduling
• Sound schedule:
  • Path in the ACG that satisfies the constraints
• Fork:
  • Antagonism
  • NonCommuting + dynamic checks
  • Several possible schedules
• Penalise lost work
• Optimal schedule: NP-hard
• IceCube heuristics
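To make "sound schedule" concrete, here is a small, hypothetical checker that verifies a candidate sequence against NotAfter and Enables constraints. Finding a good sound schedule, rather than merely checking one, is the NP-hard part that the IceCube heuristics approximate.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Sketch of a soundness check for a candidate schedule against NotAfter and
    // Enables constraints (see slide 7). Names are illustrative.
    public class SoundSchedule {

        record NotAfter(String a, String b) {}   // a never runs after b
        record Enables(String a, String b) {}    // if a is scheduled, b must be too

        static boolean isSound(List<String> schedule,
                               List<NotAfter> notAfter, List<Enables> enables) {
            Set<String> scheduled = new HashSet<>(schedule);
            // Enables: every scheduled a drags in its b.
            for (Enables e : enables) {
                if (scheduled.contains(e.a()) && !scheduled.contains(e.b())) return false;
            }
            // NotAfter: if both actions are present, a must appear before b.
            for (NotAfter c : notAfter) {
                int ia = schedule.indexOf(c.a()), ib = schedule.indexOf(c.b());
                if (ia >= 0 && ib >= 0 && ia > ib) return false;
            }
            return true;
        }

        public static void main(String[] args) {
            List<NotAfter> na = List.of(new NotAfter("create", "move"));   // create before move
            List<Enables> en = List.of(new Enables("move", "create"));     // move needs create

            System.out.println(isSound(List.of("create", "move"), na, en)); // true
            System.out.println(isSound(List.of("move"), na, en));           // false: missing create
        }
    }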

9. Minimizing aborts (IceCube heuristics)
• Iteratively pick a sound schedule
• Application executes, checks, approves
  • Check invariants
  • Reject if violation
  • Request alternative schedule
  • Restart from previous
• Approved schedule
  • Preferred for future
  • Proposed to agreement
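A toy version of this propose/execute/approve loop, with a fixed candidate list standing in for IceCube's heuristic search and an application-supplied invariant check; all names are invented.

    import java.util.List;
    import java.util.Optional;
    import java.util.function.Predicate;

    // Sketch of the propose / tentatively-execute / approve loop. The scheduler
    // here just iterates over a fixed list of candidate schedules in preference
    // order; the real IceCube heuristics search the ACG.
    public class ApprovalLoop {

        record Schedule(List<String> actions) {}

        // Keeps proposing sound schedules until the application approves one.
        static Optional<Schedule> agreeLocally(List<Schedule> candidates,
                                               Predicate<Schedule> applicationCheck) {
            for (Schedule s : candidates) {            // iteratively pick a sound schedule
                if (applicationCheck.test(s)) {        // app executes tentatively, checks invariants
                    return Optional.of(s);             // approved: preferred, proposed to agreement
                }
                // rejected: roll back and request an alternative schedule
            }
            return Optional.empty();
        }

        public static void main(String[] args) {
            List<Schedule> candidates = List.of(
                    new Schedule(List.of("book(Tue)", "book(Tue)")),   // violates "no double booking"
                    new Schedule(List.of("book(Tue)", "book(Fri)")));  // fine

            // Application-level invariant: no day is booked twice in a schedule.
            Predicate<Schedule> noDoubleBooking =
                    s -> s.actions().stream().distinct().count() == s.actions().size();

            System.out.println("approved: " + agreeLocally(candidates, noDoubleBooking));
        }
    }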

10. Telex lifecycle: receive remote operations
• User requests
  → application: actions, dependence
  → Telex: add to ACG, transmit, merge
[Figure: Telex asks the Shared Calendar application for constraints (getCstrt); the application answers with +cstrt(antg), which is merged into the ACG]

11. Eventual consistency
• Common stable prefix
• Diverge beyond
• Combine approved schedules
  • ⇒ Consensus on next extension of prefix
• Equivalence, not equality

12. Telex lifecycle: convergence protocol
• Approved schedules
  → Telex: exchange, agree
  → commit/abort, serialise
[Figure: Telex replicas run an agreement protocol beneath the Shared Calendar application]

13. FGGC
• Special-cases commutative commands ⇒ avoids collision recovery
• Improvements are higher on a WAN
[Figure: comparison of Fast Paxos, Generalised Paxos, Paxos, and FGGC in a typical WAN setting]
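The benefit comes from a commutativity (conflict) test on pending commands. The sketch below assumes the register workload of the next slide: two commands need ordering only if they touch the same register and at least one writes, so with many registers most pairs commute and collision recovery is rarely needed. This illustrates the idea; it is not the FGGC implementation.

    import java.util.Random;

    // Sketch of the commutativity test that lets a Generalised-Paxos-style
    // protocol (such as FGGC) accept concurrent commands without collision
    // recovery. Commands model the slide-14 benchmark: each one reads or writes
    // a randomly chosen register. Names are illustrative.
    public class Commutativity {

        record Command(boolean isWrite, int register) {}

        // Two commands conflict (must be ordered) only if they touch the same
        // register and at least one of them is a write; otherwise they commute.
        static boolean conflict(Command a, Command b) {
            return a.register() == b.register() && (a.isWrite() || b.isWrite());
        }

        public static void main(String[] args) {
            Random rnd = new Random(42);
            for (int registers : new int[]{1, 1024, 16384}) {
                int collisions = 0, trials = 100_000;
                for (int i = 0; i < trials; i++) {
                    Command a = new Command(rnd.nextBoolean(), rnd.nextInt(registers));
                    Command b = new Command(rnd.nextBoolean(), rnd.nextInt(registers));
                    if (conflict(a, b)) collisions++;
                }
                // More registers => fewer conflicting pairs => fewer collision recoveries.
                System.out.printf("%6d registers: %.2f%% conflicting pairs%n",
                        registers, 100.0 * collisions / trials);
            }
        }
    }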

14. FGGC: varying commutativity
• Each command reads or writes a randomly-chosen register; WAN setting
[Plot: Paxos vs. FGGC with 1 register and with 16384 registers; register counts 1024, 2048, 4096, 8192, 16384]

15. Example: Sakura, a shared calendar over Telex
• Private calendars + common meetings
• Example proposals:
  • M1: Marc & Lamia & JeanMi, Monday | Tuesday | Friday
  • M2: Lamia & Marc & Pierre, Tuesday | Wednesday | Friday
  • M3: JeanMi & Pierre & Marc, Monday | Tuesday | Thursday
  • Change M3 to Friday
• Telex:
  • (Local) Explore solutions, approve
  • (Global) Combine, commit
[Figure: calendar view (Marc, Lamia; Tuesday, Friday) with a run-time check between M1 and M2]
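A worked sketch of this example: each proposal's alternative days are mutually exclusive, and a run-time check rejects any combination that double-books a participant. The brute-force search below stands in for IceCube's exploration of alternative schedules; M3 is encoded after its change to Friday, and the code is illustrative only.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Worked sketch of the Sakura example: proposals with alternative days plus a
    // run-time check against double-booking a participant. Illustrative only.
    public class SakuraExample {

        record Proposal(String name, List<String> participants, List<String> days) {}

        // Returns one day per proposal such that no participant is double-booked,
        // or null if the proposals are irreconcilable.
        static Map<String, String> explore(List<Proposal> proposals) {
            return explore(proposals, 0, new HashMap<>());
        }

        static Map<String, String> explore(List<Proposal> ps, int i, Map<String, String> chosen) {
            if (i == ps.size()) return new HashMap<>(chosen);
            Proposal p = ps.get(i);
            for (String day : p.days()) {                       // alternatives are antagonistic:
                if (free(ps, chosen, p, day)) {                 // at most one is selected
                    chosen.put(p.name(), day);
                    Map<String, String> result = explore(ps, i + 1, chosen);
                    if (result != null) return result;
                    chosen.remove(p.name());                    // roll back, try the next alternative
                }
            }
            return null;
        }

        // Run-time check: nobody attends two meetings on the same day.
        static boolean free(List<Proposal> ps, Map<String, String> chosen, Proposal p, String day) {
            for (Proposal q : ps) {
                if (day.equals(chosen.get(q.name()))
                        && q.participants().stream().anyMatch(p.participants()::contains)) return false;
            }
            return true;
        }

        public static void main(String[] args) {
            List<Proposal> proposals = List.of(
                    new Proposal("M1", List.of("Marc", "Lamia", "JeanMi"), List.of("Mon", "Tue", "Fri")),
                    new Proposal("M2", List.of("Lamia", "Marc", "Pierre"), List.of("Tue", "Wed", "Fri")),
                    new Proposal("M3", List.of("JeanMi", "Pierre", "Marc"), List.of("Fri"))); // M3 after the change
            System.out.println(explore(proposals));
        }
    }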

16. Example ACG: calendar application
• Schedule: a sound cut in the graph

17. Status
• Open source
  • 35,000 lines of well-commented Java code
  • Available at gforge.inria.fr, BSD license
• Applications:
  • Sakura co-operative calendar (INRIA)
  • Decentralised Collaborative Environment (UPC)
  • Database, STMBench7 (UOC)
  • Co-operative text editor (UNL, INRIA, UOC)
  • Co-operative ontology editor (U. Aegean)
  • Co-operative UML editor (LIP6)

18. Lessons learned
• Separation of concerns:
  • Application: business logic
  • Semantics: constraints, run-time checks
  • System: persistence, replication, consistency
• Optimistic: availability, WAN operation
• Adapts to application requirements
  • Constraints + run-time machine learning
• Modular ⇒ efficiency is an issue
  • High latency, availability ⇒ reduce messages
  • Commutative operations ⇒ CRDTs
• Partial replication is hard!
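As a reminder of the CRDT point, here is a textbook state-based grow-only counter: increments commute and merge is a per-entry maximum, so replicas converge regardless of delivery order. This is a generic example, not code from Telex.

    import java.util.Arrays;

    // Sketch of "commutative operations => CRDTs": a state-based grow-only
    // counter (G-Counter). Merge is commutative, associative, and idempotent.
    public class GCounter {
        private final long[] counts;   // one slot per replica
        private final int myId;

        GCounter(int replicas, int myId) { this.counts = new long[replicas]; this.myId = myId; }

        void increment() { counts[myId]++; }                  // local update, no coordination

        long value() { return Arrays.stream(counts).sum(); }

        void merge(GCounter other) {                          // per-entry maximum
            for (int i = 0; i < counts.length; i++) {
                counts[i] = Math.max(counts[i], other.counts[i]);
            }
        }

        public static void main(String[] args) {
            GCounter a = new GCounter(2, 0), b = new GCounter(2, 1);
            a.increment(); a.increment();    // two updates at replica 0
            b.increment();                   // one concurrent update at replica 1
            a.merge(b); b.merge(a);          // exchange state in either order
            System.out.println(a.value() + " == " + b.value());   // both read 3
        }
    }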


21. Application design
• MVC-like execution loop + rollback
• Designing for replication takes effort:
  • Prefer, in this order: Commute (no constraints) >> Antagonistic >> Non-Commuting
  • Weaken invariants
  • Make invariants explicit
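A small sketch of the "prefer commuting operations" guideline: the same user intent written as a whole-set assignment (which does not commute with a concurrent assignment) versus an element addition (which commutes, so it needs no scheduling constraint). The tag example and all names are invented for illustration.

    import java.util.Set;
    import java.util.TreeSet;

    // Sketch of "design operations to commute": setTags replaces the whole set
    // and loses concurrent updates; addTag adds one element and commutes with
    // any other addTag. Illustrative only.
    public class DesignForCommutativity {

        static Set<String> setTags(Set<String> state, Set<String> tags) {
            return new TreeSet<>(tags);                 // overwrites concurrent updates: non-commuting
        }

        static Set<String> addTag(Set<String> state, String tag) {
            Set<String> next = new TreeSet<>(state);    // order of concurrent addTag calls is irrelevant
            next.add(tag);
            return next;
        }

        public static void main(String[] args) {
            Set<String> initial = new TreeSet<>();

            // Two replicas concurrently tag the same appointment.
            Set<String> ab = addTag(addTag(initial, "work"), "urgent");
            Set<String> ba = addTag(addTag(initial, "urgent"), "work");
            System.out.println("addTag commutes: " + ab.equals(ba));          // true

            Set<String> xy = setTags(setTags(initial, Set.of("work")), Set.of("urgent"));
            Set<String> yx = setTags(setTags(initial, Set.of("urgent")), Set.of("work"));
            System.out.println("setTags commutes: " + xy.equals(yx));         // false
        }
    }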
