Highly Available Services and Transactions with Replicated Data Jason Lenthe

Highly Available Services (1 of 17) • What is availability? • The percentage of time that a service is “up” • What is highly available? • Availability close to 100% • With reasonable response times • May not conform to sequential consistency
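As a small worked example of the definition above, availability can be computed as the percentage of observed time that the service was up. The function name and the sample figures are illustrative assumptions, not taken from the slides.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return the percentage of total observed time that the service was up."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return 100.0 * uptime_hours / total


if __name__ == "__main__":
    # Hypothetical figures: 8.76 hours of downtime in a 365-day year (8760 hours).
    print(f"{availability(8760 - 8.76, 8.76):.3f}%")
```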

Highly Available Services (2 of 17) • The Gossip Architecture • Is a framework for implementing highly available services • Replica managers periodically “gossip” with each other to convey the updates they have received • [Figure: Clients, Front Ends, and three Replica Managers (RM) exchanging Gossip messages]

Highly Available Services (3 of 17) • The Gossip Architecture (con't) • Outline for Processing Queries and Updates: • 1) Request – Front end sends the request to a replica manager • 2) Update Response – If the request is an update, the replica manager replies as soon as it has received the request • 3) Coordination – Replica managers “gossip” (send gossip messages) • 4) Execution – Replica manager executes the request • 5) Query Response – If the request is a query, the replica manager responds • 6) Agreement – More gossip messages may be sent out. Generally, a lazy approach is taken
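The outline above can be sketched as a toy model. This is a minimal illustration under simplifying assumptions: the class and method names are hypothetical, updates are identified by caller-supplied ids, and ordering constraints are ignored. It only shows that an update is acknowledged immediately and reaches other replica managers later, through gossip.

```python
class ReplicaManager:
    """Toy replica manager: acknowledges updates at once, spreads them lazily."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: dict[str, str] = {}  # update id -> payload

    def update(self, update_id: str, payload: str) -> str:
        # Step 2: reply as soon as the update has been recorded locally.
        self.log[update_id] = payload
        return "ack"

    def gossip_to(self, peer: "ReplicaManager") -> None:
        # Steps 3 and 6: send the peer any updates it has not seen yet.
        for update_id, payload in self.log.items():
            peer.log.setdefault(update_id, payload)

    def query(self) -> list[str]:
        # Step 5: answer from whatever state this replica currently holds.
        return sorted(self.log.values())


if __name__ == "__main__":
    rm_a, rm_b = ReplicaManager("A"), ReplicaManager("B")
    rm_a.update("u1", "hello")
    print(rm_b.query())   # rm_b has not heard about u1 yet
    rm_a.gossip_to(rm_b)
    print(rm_b.query())   # after gossip, rm_b has caught up
```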

Highly Available Services (4 of 17) • The Gossip Architecture (con't) • Each Front End maintains a vector timestamp for each value that it has accessed • Contains the last update seen from each replica manager • Is sent as part of a query/update • Each Replica Manager uses the received vector timestamp to find out whether it is up to date • If it is not, it can wait for the missing updates or request them explicitly
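The up-to-date check described above amounts to a componentwise comparison of vector timestamps. The sketch below assumes timestamps are equal-length lists of integers with one entry per replica manager; the function names are illustrative.

```python
from typing import List


def leq(a: List[int], b: List[int]) -> bool:
    """True if timestamp a is componentwise less than or equal to b."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


def merge(a: List[int], b: List[int]) -> List[int]:
    """Componentwise maximum: the earliest timestamp that covers both inputs."""
    return [max(x, y) for x, y in zip(a, b)]


def can_answer(front_end_ts: List[int], replica_ts: List[int]) -> bool:
    """A replica can answer a query only if it has seen everything the front end has."""
    return leq(front_end_ts, replica_ts)


if __name__ == "__main__":
    front_end = [2, 0, 1]
    replica = [2, 1, 0]               # missing an update from the third manager
    print(can_answer(front_end, replica))
    replica = merge(replica, [0, 0, 1])   # the missing update arrives via gossip
    print(can_answer(front_end, replica))
```

After a successful reply, the front end would merge the replica's timestamp into its own, so that later requests reflect what it has already seen.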

Highly Available Services (5 of 17) • The Gossip Architecture (con't) • Examples of the Gossip architecture? • Textbook is skimpy on examples in this section, but... • Suggests a bulletin board service • Clients may have a different view of the bulletin board at any time, if the network is partitioned • All messages will eventually be propagated to each replica manager

Highly Available Services (6 of 17) • The Gossip Architecture – Conclusions • Clients can keep operating when the network is partitioned (as long as at least one replica manager is accessible) • Lazy approach makes it inappropriate for near-real-time collaboration • Not particularly scalable • Number of messages transmitted per update = 2 + (R – 1)/G, where R = number of replica managers and G = number of updates packed into a gossip message
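To make the scalability point concrete, the message-count formula can be evaluated for a few values of R and G. The sample values below are arbitrary illustrations, not figures from the slides.

```python
def messages_per_update(num_replicas: int, updates_per_gossip: int) -> float:
    """Evaluate 2 + (R - 1) / G from the slide.

    The 2 covers the front end's request and the replica manager's reply;
    (R - 1) / G is the gossip cost per update when each gossip message
    carries G updates to each of the other R - 1 replica managers.
    """
    if num_replicas < 1 or updates_per_gossip < 1:
        raise ValueError("R and G must both be at least 1")
    return 2 + (num_replicas - 1) / updates_per_gossip


if __name__ == "__main__":
    for r, g in [(5, 1), (5, 4), (21, 4), (21, 20)]:
        print(f"R={r:2d} G={g:2d} -> {messages_per_update(r, g):.2f} messages per update")
```

Packing more updates into each gossip message lowers the per-update cost, but it also delays propagation, which is the trade-off behind the "lazy approach" bullet.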

Highly Available Services (7 of 17) • The Bayou System • Another framework for providing highly available services • Uses Operational Transformation • Allows domain-specific conflict detection and conflict resolution

Highly Available Services (8 of 17) • The Bayou System (con't) • Updates have two states: • Tentative – may be undone or reapplied as the system becomes consistent • Committed – cannot be undone

Highly Available Services (9 of 17) • The Bayou System (con't) • Uses application-specific dependency checks and merge procedures • Dependency checks determine whether a new update conflicts with an update that has already been applied • Merge procedure produces a new update that does not conflict with the previous update
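A dependency check and merge procedure might look like the following sketch for a hypothetical room-booking application. The `Booking` type, the function names, and the "try alternative hours" policy are all assumptions chosen for illustration; in Bayou these procedures are supplied by the application programmer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Booking:
    room: str
    hour: int
    who: str


def dependency_check(db: List[Booking], new: Booking) -> bool:
    """Return True if the new booking conflicts with one already applied."""
    return any(b.room == new.room and b.hour == new.hour for b in db)


def merge_procedure(db: List[Booking], new: Booking,
                    alternatives: List[int]) -> Optional[Booking]:
    """Produce a non-conflicting replacement update, or None if there is none."""
    for hour in alternatives:
        candidate = Booking(new.room, hour, new.who)
        if not dependency_check(db, candidate):
            return candidate
    return None


def apply_update(db: List[Booking], new: Booking,
                 alternatives: List[int]) -> Optional[Booking]:
    """Apply the update, falling back to the merge procedure on conflict."""
    applied: Optional[Booking] = new
    if dependency_check(db, new):
        applied = merge_procedure(db, new, alternatives)
    if applied is not None:
        db.append(applied)
    return applied


if __name__ == "__main__":
    db = [Booking("A", 9, "ann")]
    print(apply_update(db, Booking("A", 9, "bob"), alternatives=[10, 11]))
```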

Highly Available Services (10 of 17) • The Bayou System – Conclusions • Uses application-specific logic to produce an eventually sequentially consistent state • Complicated for the application programmer and user • Programmer needs to provide dependency check and merge procedures • User needs to deal with tentative data • Generally limited to applications where • Conflicts are rare • Data semantics are simple

Highly Available Services (11 of 17) • The Coda File System • Coda is basically a highly available version of AFS • Aims to provide constant data availability • Good for mobile environments • Follows an optimistic strategy – conflicts are not likely

Highly Available Services (12 of 17) • The Coda File System (con't) • Architecture • Venus – client process • Vice – server process • Volume Storage Group (VSG) – the set of servers that have a copy of a particular file volume • Available Volume Storage Group (AVSG) – the subset of the VSG for a file volume that is accessible

Highly Available Services (13 of 17) • The Coda File System (con't) • Basic Operation • On open: • Venus gets the file from its local cache or • Determines which server in the AVSG has the most recent version (the preferred server) and gets the file (and callback promises) from there • On close (after modification): • Venus sends the updated file to everyone in the AVSG using multicast RPC • But some servers might not be in the AVSG of this client...

Highly Available Services (14 of 17) • The Coda File System (con't) • Venus periodically sends out a probe for each file in its cache • This determines the AVSG for each file • Each server responds with a volume CVV (version vector) • Contains a summary of all files in the volume • Mismatches are detected
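Mismatches between version vectors can be classified by componentwise comparison. The sketch below assumes each vector is a list of update counts, one entry per server in the VSG; the function name and return labels are illustrative and are not Coda's actual interface.

```python
from typing import List


def compare_cvv(a: List[int], b: List[int]) -> str:
    """Classify two version vectors of equal length.

    'equal'       - both replicas have seen the same updates
    'a_dominates' - a has seen everything b has, and more (b is stale)
    'b_dominates' - b has seen everything a has, and more (a is stale)
    'conflict'    - each has updates the other lacks
    """
    if len(a) != len(b):
        raise ValueError("version vectors must have the same length")
    a_ge = all(x >= y for x, y in zip(a, b))
    b_ge = all(y >= x for x, y in zip(a, b))
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_dominates"
    if b_ge:
        return "b_dominates"
    return "conflict"


if __name__ == "__main__":
    print(compare_cvv([2, 2, 1], [2, 2, 1]))
    print(compare_cvv([3, 3, 1], [2, 2, 1]))
    print(compare_cvv([3, 2, 1], [2, 3, 1]))
```

When one vector dominates, the stale replica can simply be brought up to date. Only the last case, where neither dominates, indicates concurrent updates that need resolution.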

Highly Available Services (15 of 17) • The Coda File System (con't) • Disconnected operation is supported (AVSG is empty) • User specifies which files Venus should make available during periods of disconnectivity • When connectivity is restored, the reintegration process begins • Conflicts are detected and files are flagged for manual integration

Highly Available Services (16 of 17) • The Coda File System (con't) • Performance: Coda vs. AFS • With no replication: about the same • With three-fold replication: • For 5 users, Coda increases benchmark time by 5% • Going to 50 users, Coda increases benchmark time by 70% while AFS increases it by 16%

Highly Available Services (17 of 17) • The Coda File System – Summary • Coda FS provides a highly available filesystem which works during periods of disconnectivity • Requires some user interaction • Identifying files to be available during disconnectivity • Manually resolving occasional update conflicts • Does not perform as well as AFS

Transactions with Replicated Data (1 of x) • The goal of normal distributed transactions is serial equivalence • When replicated data is involved, one-copy serializability is needed • Which means the effect of the transactions is the same as if they were • Performed one at a time • On a single set of objects

Transactions with Replicated Data (2 of x) • Architectural Issues • Eager vs. Lazy Update Propagation • Eager – propagate updates to the replica managers during the transaction (before commit) • Lazy – commit the transaction and propagate updates later • Two-Phase Commit Protocol needed • Primary Copy Replication • Only one replica manager at a time can interact with front ends • All other replica managers are backups (one of them can become the primary if the current one fails)

Transactions with Replicated Data (3 of x) • Schemes for Dealing with Network Partitions • Available copies with validation • Quorum consensus • Virtual Partition

Transactions with Replicated Data (3 of x) • Available Copies with Validation Method • Reads are serviced by any available replica manager • Updates must be performed by all available replica managers (some replica managers may be unavailable) • When the network is partitioned each partition can carry out transactions • When the network is fixed, conflicts may have arisen • Conflicts are eliminated by aborting one of the transactions

Transactions with Replicated Data (4 of x) • Quorum Consensus Method • Only one of the network partitions has the right to carry on with transactions • When the network is fixed replica managers are brought up-to-date with those in the quorum • Quorum is determined by a voting algorithm which is applied on each operation request
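The voting idea can be sketched as follows. This assumes a weighted-voting scheme in which each replica manager holds a number of votes, a write quorum must exceed half the total votes, and the read and write quorums together must exceed the total. These constraints are the usual ones for quorum consensus but are not spelled out on the slide, and all names here are illustrative.

```python
from typing import Dict, Iterable


def valid_quorum_config(votes: Dict[str, int],
                        read_quorum: int, write_quorum: int) -> bool:
    """Check the constraints that keep two partitions from both writing,
    and that make every read quorum overlap every write quorum."""
    total = sum(votes.values())
    return write_quorum > total / 2 and read_quorum + write_quorum > total


def can_proceed(votes: Dict[str, int],
                reachable: Iterable[str], quorum: int) -> bool:
    """True if the reachable replica managers hold at least `quorum` votes."""
    return sum(votes[rm] for rm in reachable) >= quorum


if __name__ == "__main__":
    votes = {"rm1": 1, "rm2": 1, "rm3": 1}
    print(valid_quorum_config(votes, read_quorum=2, write_quorum=2))
    # A partition splits the replica managers into {rm1, rm2} and {rm3}.
    print(can_proceed(votes, {"rm1", "rm2"}, quorum=2))  # majority side
    print(can_proceed(votes, {"rm3"}, quorum=2))         # minority side
```

Because the write quorum is more than half the votes, at most one side of a partition can gather it, which is why only one partition can carry on with update transactions.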

Transactions with Replicated Data (4 of x) • Virtual Partition Method • Combines Available copies method with Quorum consensus method • New virtual partition created on write failure • If a virtual partition has a quorum, transactions can proceed
