
Fault Tolerance and Replication

This power point presentation has been adapted from:

(1) web.njit.edu/~gblank/cis633/Lectures/Replication.ppt


Content

  • Introduction

  • System model and the role of group communication

  • Fault tolerant services

  • Case study: Bayou and Coda

  • Transaction with replicated data



Introduction

  • Replication

  • Duplicating limited or heavily loaded resources

    • provides access under load and preserves access after failures

  • Replication is important for performance enhancement, increased availability and fault tolerance.


Introduction

  • Replication

  • Performance enhancement

    • Data are replicated between several originating servers in the same domain

    • The workload is shared between the servers by binding all the server IP addresses to the site’s DNS name

    • This increases performance at little cost to the system (a client-side sketch follows below)
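
As a minimal sketch of this idea, a client can resolve the site's DNS name and pick one of the returned server addresses at random; the hostname here is a placeholder, not a real replicated site:

```python
import random
import socket

def pick_replica(hostname: str, port: int = 80) -> str:
    """Resolve a name bound to several replica addresses and pick one
    at random, spreading client load across the servers."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    return random.choice(addresses)

# Placeholder replicated site: any returned address serves the same data.
print(pick_replica("example.org"))
```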


Introduction

  • Replication

  • Increased availability

    • Replication is a technique for automatically maintaining the availability of data despite server failures

    • If data are replicated at two or more failure-independent servers, then client software may be able to access data at an alternative server should the default server fail or become unreachable


Introduction

  • Replication

  • Fault tolerance

    • Highly available data is not necessarily correct data (it may be out of date)

    • A fault-tolerant service always guarantees the freshness of the data supplied to the client and the correctness of the effects of the client's operations on that data


Introduction

  • Replication

  • Replication requirements:

    • Transparency

      • Users should not need to be aware that data are replicated, and the performance and convenience of access should not be noticeably different from working with unreplicated data

    • Consistency

      • Different copies of replicated data should be the same. When data are changed, the changes must be propagated to all replicas


Content

  • Introduction

  • System model and the role of group communication

  • Fault tolerant services

  • Case study: Bayou and Coda

  • Transaction with replicated data


System Model & The Role of Group Communication

  • Introduction

  • The data in the system are composed of objects (e.g., files, components, Java objects, etc.)

  • Each logical object is implemented by a collection of physical objects called replicas, each stored on a computer.

  • The replicas of a given object are not necessarily identical, at least not at any particular point in time. Some replicas may have received updates that others have not received.


System Model & The Role of Group Communication

  • System Model

[Figure: a basic architectural model for the management of replicated data. Clients send requests through front ends, which communicate by message passing with one or more replica managers holding the replicas.]


System Model & The Role of Group Communication

  • System Model

  • Replica Managers (RM)

    • components that contain the objects on a particular computer and perform operations on them.

  • Front ends (FE)

    • Components that handle clients' requests

      • communicate with one or more of the replica managers by message passing

      • A front end may be implemented in the client’s address space, or it may be a separate process


System Model & The Role of Group Communication

  • System Model

  • Five phases in a request upon replicated objects [Wiesmann et al. 2000]

    • Front end requests service from one or more RMs which may communicate with the other RMs. The front end may communicate through one RM or multicast to all of them.

    • RMs coordinate to prepare to execute the request. This may require ordering of the operations.

    • RMs execute the request (perhaps tentatively, so that its effects can be undone later).

    • RMs reach agreement on the effect of the request.

    • One or more RMs pass a response back to the front end.


System Model & The Role of Group Communication

  • The role of group communication

  • Managing RMs through group communication is complex, especially in the case of dynamic groups, where replica managers may join, leave, or crash.

    • A group membership service may be used to manage the addition and removal of replica managers, and detect and recover from crashes and faults.


System Model & The Role of Group Communication

  • The role of group communication

  • Tasks of a Group Membership Service

    • Provide an interface for group membership changes

    • Implement a failure detector

    • Notify members of group membership changes

    • Perform group address expansion for multicast delivery of messages.
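
The four tasks above can be pictured as a small interface. The following is an illustrative sketch only; the class and method names are assumptions, not a real group communication library:

```python
from typing import Callable, List, Set

class GroupMembershipService:
    """Hypothetical interface covering the four tasks listed above."""

    def __init__(self) -> None:
        self.members: Set[str] = set()                       # current group view
        self.listeners: List[Callable[[Set[str]], None]] = []

    # Task 1: interface for group membership changes
    def join(self, member: str) -> None:
        self.members.add(member)
        self._deliver_view()

    def leave(self, member: str) -> None:
        self.members.discard(member)
        self._deliver_view()

    # Task 2: a failure detector reports a suspected crash,
    # which is treated as a forced leave
    def report_failure(self, member: str) -> None:
        self.leave(member)

    # Task 3: notify members of membership changes (view delivery)
    def on_view_change(self, callback: Callable[[Set[str]], None]) -> None:
        self.listeners.append(callback)

    def _deliver_view(self) -> None:
        for notify in self.listeners:
            notify(set(self.members))

    # Task 4: group address expansion, mapping the group address
    # to the current members for multicast delivery
    def expand_group_address(self) -> Set[str]:
        return set(self.members)
```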


System Model & The Role of Group Communication

  • The role of group communication

[Figure: the role of a group membership service. Processes Join, Leave, or Fail; group membership management cooperates with multicast communication, performing group address expansion so that a multicast send reaches the current members of the process group.]


Content

  • Introduction

  • System model and the role of group communication

  • Fault tolerant services

  • Case study: Bayou and Coda

  • Transaction with replicated data


Fault Tolerant Services

  • Introduction

  • Replicating data and functionality at replica managers can be used to provide a service that is correct despite process failures

    • A replication service is correct if it keeps responding despite faults

    • Clients can’t see the difference between a service provided by replication and one with a single copy of the data.


Fault Tolerant Services

  • Introduction

  • A correctness criterion for replicated objects is linearizability

    • Every operation is synchronous

      • Clients must wait for one operation to complete before starting another.

    • A replicated shared object is sequentially consistent if, for every execution, there is some interleaving of the clients' operations that meets the specification of a single correct copy and is consistent with the program order in which each client performed its operations


Fault Tolerant Services

  • Update process

  • Read-only requests have no impact on the replicated object

  • Updates may need to be managed properly to avoid inconsistency.

  • A strategy to avoid inconsistency

    • Make all updates to a primary copy of the data and copy that to the other replicas (passive replication).

    • If the primary fails, one of the backups is promoted to act as primary.


Fault Tolerant Services

  • Passive (primary-backup) replication

[Figure: the passive replication model. Front ends send client requests to the primary replica manager, which propagates updates to the backup replica managers.]


Fault Tolerant Services

  • Passive (primary-backup) replication

  • The sequence of events when a client requests an operation

    • Request: front end issues a request with a unique identifier to the primary replica manager.

    • Coordination: primary processes request atomically, checking ID for duplicate requests.

    • Execution: request is processed and stored.

    • Agreement: if an update, primary sends info to backups, which update and acknowledge.

    • Response: primary notifies front end, which passes information to client.
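
A minimal sketch of this sequence, under simplifying assumptions (the replicated object is a key-value store, backups are local objects, and acknowledgements are implicit in a method call returning):

```python
class PrimaryReplicaManager:
    """Primary RM in passive replication: request, coordination,
    execution, agreement, response."""

    def __init__(self, backups):
        self.state = {}       # the replicated object: a key-value store
        self.handled = {}     # request id -> stored response (duplicate check)
        self.backups = backups

    def handle(self, request_id, key, value):
        # Coordination: a duplicate request gets the stored response
        if request_id in self.handled:
            return self.handled[request_id]
        # Execution: process the request and store the response
        self.state[key] = value
        response = ("ok", key, value)
        # Agreement: send the update to every backup (ack on return)
        for backup in self.backups:
            backup.apply(key, value)
        # Response: hand the result back to the front end
        self.handled[request_id] = response
        return response

class BackupReplicaManager:
    def __init__(self):
        self.state = {}

    def apply(self, key, value):
        self.state[key] = value   # a backup can be promoted to primary

backups = [BackupReplicaManager(), BackupReplicaManager()]
primary = PrimaryReplicaManager(backups)
primary.handle("req-1", "x", 42)
assert all(b.state == {"x": 42} for b in backups)
```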


Fault Tolerant Services

  • Passive (primary-backup) replication

  • It gives fault tolerance at a cost in performance.

    • There is high overhead in updating the replicas, so performance is lower than with non-replicated objects.

  • To solve this issue:

    • Allow read-only requests to be made to backup RMs, but send all updates to the primary.

    • This has limited value for transaction processing systems but is very effective for decision support systems (mostly read-only requests).


Fault Tolerant Services

  • Active Replication

[Figure: the active replication model. Front ends multicast each request to all replica managers; every replica manager executes it and responds to the front end.]


Fault Tolerant Services

  • Active Replication

  • Active Replication steps:

    • Request: front end attaches unique ID to request and multicasts (totally ordered, reliable) to RMs. Front end is assumed to fail only by crashing.

    • Coordination: every correct RM receives request in same total order.

    • Execution: every RM executes the request.

    • Agreement: not required, because the totally ordered multicast already delivers every request to all RMs in the same order.

    • Response: each RM sends response to front end, which manages responses depending on failure assumptions and multicast algorithm.
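
The sketch below shows the state-machine idea behind these steps. The totally ordered, reliable multicast is simulated by a plain loop that feeds every replica the identical request sequence; in a real system, that ordering is exactly what the multicast protocol must provide:

```python
class ActiveReplicaManager:
    """Each RM executes the same requests in the same total order,
    so all correct replicas hold identical state."""

    def __init__(self):
        self.state = {}

    def execute(self, request):
        op, key, value = request
        if op == "write":
            self.state[key] = value
        return self.state.get(key)

def totally_ordered_multicast(replicas, requests):
    """Stand-in for the multicast: one agreed delivery order for all."""
    responses = []
    for request in requests:
        results = [rm.execute(request) for rm in replicas]
        responses.append(results[0])   # front end may take the first reply
    return responses

rms = [ActiveReplicaManager() for _ in range(3)]
totally_ordered_multicast(rms, [("write", "x", 1), ("read", "x", None)])
assert all(rm.state == {"x": 1} for rm in rms)
```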


Fault Tolerant Services

  • Active Replication

  • The model assumes totally ordered and reliable multicasting.

    • This is equivalent to solving consensus, which requires either a synchronous system or a technique such as failure detectors in an asynchronous system.

    • The model can be simplified if updates are assumed to be commutative, so that the effect of two operations is the same in any order.

      • E.g., a bank account: daily deposits and withdrawals can be done in any order unless the balance goes below zero. If a process avoids overdrafts, the effects are commutative (see the sketch below).
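
A tiny demonstration of this commutativity, assuming the account is just a number and operations are signed amounts:

```python
def apply_ops(balance, ops):
    """Apply deposits (positive) and withdrawals (negative) in order."""
    for amount in ops:
        balance += amount
    return balance

ops = [+100, -30, +50]
# With no overdraft triggered, the final balance is the same in any
# order, so the updates commute and need no total ordering.
assert apply_ops(200, ops) == apply_ops(200, list(reversed(ops))) == 320
```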


Content

  • Introduction

  • System model and the role of group communication

  • Fault tolerant services

  • Case study: Bayou and Coda

  • Transaction with replicated data


Case study: Bayou and Coda

  • Introduction

  • Implementation of replication techniques to make services highly available

    • Giving clients access to the service (with reasonable response times)

    • Fault-tolerant systems send updates to all correct RMs, which receive them as soon as possible.

      • May be unacceptable for high availability systems.

      • May be desirable to increase performance by providing slower (but still acceptable) updates with a minimal set of RMs.

      • Weaker consistency tends to require less agreement and provides more availability.


Case study: Bayou and Coda

  • Bayou

  • Is an approach to high availability

    • Users working in a disconnected fashion can make any updates in any partition at any time, with the updates recorded at any replica manager.

    • The replica managers are required to detect and manage conflicts at the time when two partitions are rejoined and the updates are merged.

    • Domain-specific policies, called operational transformations, are used to resolve conflicts by giving priority to some partitions.


Case study: Bayou and Coda

  • Bayou

  • Bayou holds state values in a database to support queries and updates.

  • Updates are a special case of a transaction, using the equivalent of a stored procedure to guarantee the ACID properties.

  • Eventually every RM gets the same set of updates and applies them so that their databases are identical.

  • However, since this propagation is delayed, in an active system with a steady stream of updates the databases may never actually be identical at any given moment.


Case study: Bayou and Coda

  • Bayou

  • Bayou Update Resolution

    • Updates are marked as tentative when they are first applied to a database.

    • Once coordination with the other RMs makes it possible to resolve conflicts and place the updates in a canonical order, they are committed.

    • Once committed, they remain applied in their allotted order. Usually, this is achieved by designating a primary RM.

    • Every update includes a dependency check and follows a merge procedure (sketched below).
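
A minimal sketch of a dependency check and merge procedure. The reservation domain, field names, and conflict rule are assumptions chosen only to make the mechanism concrete:

```python
def bayou_apply(db, write):
    """Apply one Bayou-style write to the database `db` (a dict from
    slot to owner). The dependency check asks whether the preferred
    slot is free; on conflict, the merge procedure tries alternatives."""
    wanted, owner = write["wanted"], write["owner"]
    # Dependency check: does the expected condition still hold?
    if db.get(wanted) is None:
        db[wanted] = owner
        return wanted
    # Merge procedure: a domain-specific rule resolves the conflict
    for slot in write["alternatives"]:
        if db.get(slot) is None:
            db[slot] = owner
            return slot
    return None   # unresolved conflict: reported to the user

db = {}
bayou_apply(db, {"owner": "alice", "wanted": "10:00", "alternatives": ["11:00"]})
bayou_apply(db, {"owner": "bob", "wanted": "10:00", "alternatives": ["11:00"]})
assert db == {"10:00": "alice", "11:00": "bob"}
```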


Case study: Bayou and Coda

  • Bayou

[Figure: committed and tentative updates in Bayou. Tentative updates follow the committed updates and are reordered as they become committed.]


Case study: Bayou and Coda

  • Bayou

  • In Bayou, replication is not transparent to the application.

    • Knowledge of the application semantics is required to increase data availability while maintaining a replication state that can be called eventually sequentially consistent.

  • Disadvantages include increased complexity for the application programmers and the users.

  • The operational transformation approach is particularly suited for groupware, where workers access documents remotely.


Case study: Bayou and Coda

  • Coda

  • The Coda file system is a descendant of the Andrew File System (AFS)

    • To address several requirements that AFS does not meet – particularly the requirement to provide high availability despite disconnected operation

    • It was developed in a research project at Carnegie-Mellon University

    • The increasing number of AFS users with laptops created a need to support disconnected use of replicated data and to increase performance and availability.


Case study: Bayou and Coda

  • Coda

  • The Coda architecture:

    • Coda has Venus processes at the client computers and Vice processes at the file servers.

    • The Vice processes are replica managers.

    • A set of servers holding replicas of a file volume is a volume storage group (VSG).

    • Clients access a subset known as the available volume storage group (AVSG), which varies as servers are connected or disconnected.

    • Updates are distributed by broadcasting to the AVSG after a close.

    • If the AVSG is empty (disconnected operation), files are cached locally until reconnection.


Case study: Bayou and Coda

  • Coda

  • Coda uses an optimistic replication strategy

    • files can be updated when the network is partitioned or during disconnected operation.

  • A Coda version vector (CVV) is a vector timestamp attached to each copy of a file, with one entry per server, used to determine whether there are any conflicts among updates at the time of reconnection (a comparison sketch follows this slide).

  • If no conflict, updates are performed.

  • Coda does not attempt to resolve conflicts.

  • If there is a conflict, the file is marked inoperable, and the owner of the file is notified. This is done at the AVSG level, so conflicts may recur at the VSG level.
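
A sketch of how two version vectors can be compared at reconnection. The vector layout (one update counter per server) follows the description above; the function name is an assumption:

```python
def compare_cvv(v1, v2):
    """Compare two Coda version vectors of equal length.
    Returns 'same', 'v1-newer', 'v2-newer', or 'conflict'."""
    dominates = all(a >= b for a, b in zip(v1, v2))   # v1 saw everything v2 saw
    dominated = all(a <= b for a, b in zip(v1, v2))   # v2 saw everything v1 saw
    if dominates and dominated:
        return "same"
    if dominates:
        return "v1-newer"     # no conflict: v2 simply missed some updates
    if dominated:
        return "v2-newer"
    return "conflict"         # concurrent updates: file marked inoperable

assert compare_cvv([2, 2, 1], [2, 1, 1]) == "v1-newer"
assert compare_cvv([2, 1, 1], [1, 2, 1]) == "conflict"
```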


Content

  • Introduction

  • System model and the role of group communication

  • Fault tolerant services

  • Case study: Bayou and Coda

  • Transaction with replicated data


Transaction with Replicated Data

  • Introduction

  • Clients should see transactions on replicated objects behave the same as transactions on non-replicated objects

  • Client transactions are interleaved in a serially equivalent manner.

  • One-copy serializability:

    • Transactions performed on replicated objects have the same effect as transactions performed one at a time on a single set of objects


Transaction with Replicated Data

  • Introduction

  • Three replication schemes for network partitions:

    • Available copies with validation

      • Available copies replication is applied in each partition. When a partition is repaired, a validation procedure is applied and any inconsistencies are dealt with.

    • Quorum consensus:

      • A subgroup must have a quorum (has sufficient members) in order to be allowed to continue providing a service in the presence of a partition. When a partition is repaired (and when a replica manager restarts after a failure), replica managers get their objects up-to-date by means of recovery procedures.

    • Virtual partition:

      • A combination of quorum consensus and available copies. If a virtual partition has a quorum, it can use available copies replication.


Transaction with Replicated Data

  • Available copies

  • Allows for some RMs to be unavailable.

  • Updates must be made to all available replicas of the data, with provisions to restore and update an RM that has crashed.


Transaction with Replicated Data

  • Available copies

[Figure: available copies replication. Transactions read from any single available replica and apply their updates to all available replicas.]


Transaction with Replicated Data

  • Available copies with validation

  • An optimistic approach that allows updates in different partitions of a network.

  • When the partition is corrected, conflicts must be detected and compensating actions must be taken.

  • This approach is limited to situations in which such compensation is possible.


Transaction with Replicated Data

  • Quorum consensus

  • Is a pessimistic approach to replicated transactions.

  • A quorum is a subgroup of RMs that is large enough to give it the right to carry out transactions even if some RMs are not available.

  • This limits updates to a single subset of the RMs, which update other RMs after a partition is corrected.

  • Gifford’s File Replication:

    • a Quorum scheme in which a number of votes is assigned to each copy of a replicated file.

    • A certain number of votes is required for a read or write operation; a write quorum must comprise more than half of the total votes, so any two write quorums overlap (see the sketch after this list).

    • The rest of the RMs will be updated as a background task when they are available.

    • Copies of data without enough read votes are considered weak copies and may be read locally with limits assumed on their currency and quality.
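
A minimal check of the quorum constraints in a vote-weighted scheme of this kind; the vote assignment is a made-up example:

```python
def quorums_valid(votes, r, w):
    """Gifford-style constraints: a read quorum R and write quorum W
    must overlap (R + W > total votes), and two write quorums must
    overlap (W > half the total votes)."""
    total = sum(votes)
    return r + w > total and 2 * w > total

votes = [1, 1, 2]      # three copies with weighted votes, total = 4
assert quorums_valid(votes, r=2, w=3)
assert not quorums_valid(votes, r=1, w=2)   # quorums could miss each other
```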


Transaction with Replicated Data

  • Virtual Partition Algorithm

  • This approach combines Quorum Consensus to handle partitions and Available Copies for faster read operations.

  • A virtual partition is an abstraction of a real partition and contains a set of replica managers.


Transaction with Replicated Data

  • Virtual Partition Algorithm

[Figure: a virtual partition, the set of replica managers reachable within a real partition that together can form read and write quorums.]


Transaction with Replicated Data

  • Virtual Partition Algorithm

  • Issues:

    • If network partitions are intermittent, different virtual partitions can form:

      • Overlapping virtual partitions violate one-copy serializability.

    • Virtual partitions carry logical timestamps; where real partitions are uncommon, replica managers adopt the proposal with the highest logical timestamp, so a single consistent virtual partition is selected (see the sketch below).
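
A sketch of that selection rule; the proposal structure is an assumption used only to show the comparison:

```python
def choose_virtual_partition(proposals):
    """Join the virtual-partition proposal with the highest logical
    timestamp, so competing proposals cannot both survive."""
    return max(proposals, key=lambda p: p["timestamp"])

proposals = [
    {"timestamp": 4, "members": {"X", "Y"}},
    {"timestamp": 7, "members": {"X", "Y", "Z"}},
]
assert choose_virtual_partition(proposals)["members"] == {"X", "Y", "Z"}
```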



End of the Chapter …

