Fault tolerant distributed computing system

Fundamentals


  • What is a fault?

    • A fault is a blemish, weakness, or shortcoming of a particular hardware or software component.

    • Faults, errors, and failures

  • Why fault tolerance?

    • Availability, reliability, dependability, …

  • How to provide fault tolerance?

    • Replication

    • Checkpointing and message logging

    • Hybrid

Message Logging

  • Tolerate crash failures

  • Each process periodically records its local state and logs the messages it receives afterward

    • Once a crashed process recovers, its state must be consistent with the states of other processes

    • Orphan processes

      • surviving processes whose states are inconsistent with the recovered state of a crashed process

    • Message logging protocols guarantee that no process is an orphan after recovery
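
The checkpoint-plus-log idea on this slide can be sketched as follows. This is a minimal illustration, not any system's implementation: the class `LoggedProcess`, its running-total state, and the in-memory stand-ins for stable storage are all assumptions made for the example.

```python
class LoggedProcess:
    """Toy deterministic process whose state is a running total of received values."""

    def __init__(self):
        self.state = 0
        self.checkpoint = 0   # last recorded local state (stands in for stable storage)
        self.log = []         # messages received since the last checkpoint

    def receive(self, value):
        self.log.append(value)    # log the message, then apply it
        self.state += value

    def take_checkpoint(self):
        self.checkpoint = self.state
        self.log.clear()          # older messages are now covered by the checkpoint

    def crash(self):
        self.state = None         # volatile state is lost

    def recover(self):
        self.state = self.checkpoint
        for value in self.log:    # replay logged messages in their original order
            self.state += value


p = LoggedProcess()
p.receive(5)
p.take_checkpoint()
p.receive(7)
p.receive(1)
state_before_crash = p.state
p.crash()
p.recover()
```

Because the toy process is deterministic, replaying the log from the checkpoint rebuilds the pre-crash state, so other processes that saw its earlier messages are not left as orphans.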

Message Logging Protocols

  • Pessimistic Message Logging

    • Avoids the creation of orphans during execution

    • No process p sends a message m until it knows that all messages delivered before sending m are logged; recovery is quick

    • Can block a process for each message it receives, which reduces throughput

    • Allows processes to communicate only from recoverable states; any information that may be needed for recovery is logged synchronously to stable storage before a process is allowed to communicate
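
A minimal sketch of the pessimistic rule, assuming a hypothetical `PessimisticProcess` class in which a Python list stands in for stable storage: every delivered message is forced to the log before the next send is allowed.

```python
class PessimisticProcess:
    """Sends only after every delivered message is on (simulated) stable storage."""

    def __init__(self):
        self.stable_log = []   # stands in for stable storage
        self.unlogged = []     # delivered, but not yet logged
        self.outbox = []       # messages handed to the network
        self.sync_writes = 0   # number of blocking log writes

    def deliver(self, msg):
        self.unlogged.append(msg)

    def _flush(self):
        if self.unlogged:
            self.stable_log.extend(self.unlogged)
            self.unlogged.clear()
            self.sync_writes += 1   # a real system would block here until the write completes

    def send(self, msg):
        self._flush()               # log everything delivered so far, then send
        self.outbox.append(msg)


proc = PessimisticProcess()
proc.deliver("a")
proc.deliver("b")
proc.send("x")
```

The synchronous write in `send` is where the throughput cost mentioned above comes from.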

Message Logging

  • Optimistic Message Logging

    • Takes appropriate actions during recovery to eliminate all orphans

    • Better performance during failure-free runs

    • Allows processes to communicate from non-recoverable states; a failure may make these states permanently unrecoverable, forcing rollback of any process that depends on them
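
One way to picture the recovery-time orphan check is sketched below. The function `find_orphans` and its state-interval bookkeeping are assumptions for illustration; the sketch also assumes the dependency records already include transitive dependencies.

```python
def find_orphans(recoverable, dependencies):
    """Return processes that depend on a state the crashed process cannot recover.

    recoverable:  {crashed process: highest state interval it can recover}
    dependencies: {process: {other process: highest interval of it depended on}}
    """
    orphans = set()
    for proc, deps in dependencies.items():
        for crashed, limit in recoverable.items():
            if proc != crashed and deps.get(crashed, -1) > limit:
                orphans.add(proc)   # proc saw a state of `crashed` that is now lost
    return orphans


# p crashed after reaching interval 5 but had logged only up to interval 3.
orphans = find_orphans(
    {"p": 3},
    {"q": {"p": 4}, "r": {"p": 2}, "p": {"p": 5}},
)
```

Here `q` must roll back because it depends on interval 4 of `p`, which was never logged, while `r` is unaffected.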

Causal Message Logging

  • Causal Message Logging

    • Creates no orphans when failures happen and does not block processes when no failures occur

    • Weakens the condition imposed by pessimistic protocols

    • Allows the possibility that the state from which a process communicates is unrecoverable because of a failure, but only if this does not affect consistency

    • Appends to every message the information needed to recover the state from which the message originates; this information is replicated in the memory of processes that causally depend on the originating state
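
The piggybacking idea can be sketched as follows. The `Determinant` fields, the `CausalProcess` class, and the helper `determinants_for` are hypothetical names chosen for this example; real causal protocols also limit how long a determinant keeps being piggybacked, which is omitted here.

```python
from collections import namedtuple

# Hypothetical record of one delivery: who sent it, who delivered it, and in what order.
Determinant = namedtuple("Determinant", "source dest seq")


class CausalProcess:
    def __init__(self, name):
        self.name = name
        self.known = set()    # determinants held in volatile memory
        self.delivered = 0

    def send(self, dest, payload):
        # Piggyback every determinant the current state causally depends on.
        return {"source": self.name, "dest": dest, "payload": payload,
                "determinants": frozenset(self.known)}

    def deliver(self, msg):
        self.delivered += 1
        self.known |= msg["determinants"]
        self.known.add(Determinant(msg["source"], self.name, self.delivered))


def determinants_for(crashed, survivors):
    """Collect the crashed process's delivery order from the survivors' memories."""
    found = set()
    for proc in survivors:
        found |= {d for d in proc.known if d.dest == crashed}
    return sorted(found, key=lambda d: d.seq)


p, q, r = CausalProcess("p"), CausalProcess("q"), CausalProcess("r")
p.deliver(r.send("p", "m1"))   # p's state now depends on m1
q.deliver(p.send("q", "m2"))   # q depends on p, so it also holds p's determinant
recovered = determinants_for("p", [q, r])
```

After `p` crashes, the determinant for its delivery of `m1` is still available from `q`, the only process whose state depends on it.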

KAN – A Reliable Distributed Object System

  • Developed at UC Santa Barbara

  • Project goals:

    • Language support for parallelism and distribution

    • Transparent location/migration/replication

    • Optimized method invocation

    • Fault-tolerance

    • Composition and proof reuse

System Description

Layers shown in the slide's diagram, top to bottom:

  • Kan source

  • Kan compiler

  • Java bytecode + Kan run-time libraries

  • UNIX sockets

Fault Tolerance in Kan

  • Log-based forward recovery scheme:

    • Log of recovery information for a node is maintained externally on other nodes.

    • The failed nodes are recovered to their pre-failure states, and the correct nodes keep their states at the time of the failures.

  • Only node crash failures are considered.

    • A processor stops taking steps, and failures are eventually detected.

Basic Architecture of the Fault Tolerance Scheme

Figure labels: Physical Node i (IP Address); Logical Node x; Logical Node y; Failure handler; Fault Detector; Request handler; Communication Layer


Logical Ring

  • Use logical ring to minimize the need for global synchronization and recovery.

  • The ring is only used for logging (remote method invocations).

  • Two parts:

    • A static part containing the active correct nodes. It has a leader and a sense of direction: upstream and downstream.

    • A dynamic part containing nodes that are trying to join the ring

  • A logical node is logged at the next T physical nodes in the ring, where T is the maximum number of node failures to tolerate.
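
The placement rule in the last bullet can be sketched as a small function. The name `logging_nodes` is hypothetical, and the sketch assumes that "next" means downstream in ring order with wrap-around at the end of the ring.

```python
def logging_nodes(ring, host, t):
    """Return the t physical nodes that follow `host` in ring order.

    ring: physical nodes listed in ring order (assumed to wrap around).
    host: physical node on which the logical node resides.
    t:    maximum number of node failures to tolerate.
    """
    if t >= len(ring):
        raise ValueError("need more than t physical nodes in the ring")
    start = ring.index(host)
    return [ring[(start + k) % len(ring)] for k in range(1, t + 1)]


ring = ["A", "B", "C", "D"]
loggers = logging_nodes(ring, "C", 2)
```

With T = 2, a logical node hosted on `C` is logged on `D` and `A`, so its log survives any two node crashes.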

Logical Ring Maintenance

  • Each node i participating in the protocol maintains the following variables:

    • Failed_i(j): true if i has detected the failure of j

    • Map_i(x): the physical node on which logical node x resides

    • Leader_i: i’s view of the leader of the ring

    • View_i: i’s view of the logical ring (membership and order)

    • Pending_i: the set of physical nodes that i suspects of failing

    • Recovery_count_i: the number of logical nodes that need to be recovered

    • Ready_i: records whether i is active.

      • Initial set of ready nodes; new nodes become ready when they are linked into the ring.
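
The per-node variables above map naturally onto a small record type. The sketch below is only one possible representation; the class name `RingState`, the field names, and the chosen container types are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class RingState:
    """Per-node protocol state for physical node `node_id` (illustrative only)."""
    node_id: str
    failed: dict = field(default_factory=dict)     # Failed_i(j): j -> bool
    node_map: dict = field(default_factory=dict)   # Map_i(x): logical node -> physical node
    leader: str = ""                               # Leader_i
    view: list = field(default_factory=list)       # View_i: ring membership, in order
    pending: set = field(default_factory=set)      # Pending_i: suspected physical nodes
    recovery_count: int = 0                        # Recovery_count_i
    ready: bool = False                            # Ready_i


a = RingState("A", leader="A", view=["A", "B", "C"], ready=True)
b = RingState("B", leader="A", view=["A", "B", "C"])
a.pending.add("C")          # A suspects C; B's state is unaffected
a.node_map["x"] = "A"       # logical node x resides on physical node A
```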

Failure Handling

  • When node i is informed of failure of node j:

    • If every node upstream of i has failed, then i must become the new leader. It remaps all logical nodes from the upstream physical nodes and informs the other correct nodes by sending a remap message. It then recovers the logical nodes.

    • If the leader has failed but there is some upstream node k that will become the new leader, then just update the map and leader variables to reflect the new situation.

    • If the failed node j is upstream of i, then just update the map. If i is the next downstream node from j, also recover the logical nodes from j.

    • If j is downstream of i and there is some node k downstream of j, then just update the map.

    • If j is downstream of i and there is no node downstream of j, then wait for the leader to update the map.

    • If i is the leader and must recover j, then change the map, send a remap message to change the correct nodes’ maps, and recover all logical nodes that are mapped locally.
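
The case analysis above can be read as a decision function. The sketch below is one possible reading, not the Kan implementation: the function name `handle_failure`, the action strings, and the precedence among overlapping cases are assumptions. The ring is modelled as a list ordered from most upstream to most downstream.

```python
def handle_failure(ring, leader, i, j, failed):
    """Return the actions node i takes on learning that node j has failed.

    ring:   physical nodes ordered from most upstream to most downstream.
    leader: the leader before this failure.
    failed: nodes already known to have failed (j is added automatically).
    """
    failed = set(failed) | {j}
    idx = {node: k for k, node in enumerate(ring)}
    alive = [n for n in ring if n not in failed]
    alive_upstream_of_i = [n for n in alive if idx[n] < idx[i]]
    alive_downstream_of_j = [n for n in alive if idx[n] > idx[j]]

    if idx[j] < idx[i]:                        # j is upstream of i
        if not alive_upstream_of_i:            # every upstream node has failed
            return ["become leader", "remap", "send remap", "recover"]
        if j == leader:                        # some upstream node k takes over
            return ["update map", "update leader"]
        if alive_downstream_of_j[0] == i:      # i is the next node downstream of j
            return ["update map", "recover"]
        return ["update map"]

    # j is downstream of i
    if alive_downstream_of_j:
        return ["update map"]
    if i == leader:                            # the leader must recover j itself
        return ["remap", "send remap", "recover"]
    return ["wait for remap"]


ring = ["A", "B", "C", "D"]
new_leader_actions = handle_failure(ring, "A", "B", "A", set())
bystander_actions = handle_failure(ring, "A", "D", "B", set())
```

In this reading, when `A` fails, `B` becomes leader and recovers the logical nodes, while `D` only updates its map when `B` fails because `C` is the next node downstream.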

Physical Node and Leader Recovery

  • When a physical node comes back up:

    • It sends a join message to the leader.

    • The leader tries to link this node in the ring:

      • Acquire <-> Grant

      • Add, Ack_add

      • Release

  • When the leader fails, the next downstream node in the ring becomes the new leader.


AQuA

  • Adaptive Quality of Service Availability

  • Developed at UIUC and BBN.

  • Goal:

    • Allow distributed applications to request and obtain a desired level of availability.

  • Fault tolerance

    • replication

    • reliable messaging

Features of AQuA

  • Uses the QuO runtime to process and make availability requests.

  • Uses the Proteus dependability manager to configure the system in response to faults and availability requests.

  • Uses Ensemble to provide group communication services.

  • Provides a CORBA interface to application objects through the AQuA gateway.

Proteus Functionality

  • How to provide fault tolerance for applications

    • Style of replication (active, passive)

    • Voting algorithm to use

    • Degree of replication

    • Type of faults to tolerate (crash, value, or time)

    • Location of replicas

  • How to implement the chosen fault tolerance scheme

    • Dynamic configuration modification

    • Start/kill replicas; activate/deactivate monitors and voters

Group Structure

  • For reliable multicast and point-to-point communication

    • Replication groups

    • Connection groups

    • Proteus Communication Service Group for the replicated Proteus manager

      • Replicas and objects that communicate with the manager

      • E.g. notification of a view change, a new QuO request

      • Ensures that all replica managers receive the same information

    • Point-to-point groups

      • Proteus manager to object factory

Fault Model, Detection and Handling

  • Object Fault Model:

    • Object crash failure - occurs when object stops sending out messages; internal state is lost

      • A crash failure of an object is due to the crash of at least one element composing the object

    • Value faults - message arrives in time with wrong content (caused by application or QuO runtime)

      • Detected by voter

    • Time faults

      • Detected by monitor

    • Leaders report faults to Proteus; Proteus kills faulty objects if necessary and generates new objects
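
A voter and a monitor of the kind named above can be sketched as follows. The slides do not say which voting algorithm AQuA uses; simple majority voting is an assumption here, and the function names `vote` and `late_replicas` are hypothetical.

```python
from collections import Counter


def vote(replies):
    """Majority-vote over replica replies.

    replies: {replica name: returned value}
    Returns (agreed value, replicas suspected of a value fault).
    """
    value, count = Counter(replies.values()).most_common(1)[0]
    if count <= len(replies) // 2:
        raise ValueError("no majority among replies")
    suspects = sorted(r for r, v in replies.items() if v != value)
    return value, suspects


def late_replicas(arrival_times, deadline):
    """Return replicas whose reply arrived after the deadline (a time fault)."""
    return sorted(r for r, t in arrival_times.items() if t > deadline)


value, value_suspects = vote({"r1": 42, "r2": 42, "r3": 41})
time_suspects = late_replicas({"r1": 0.8, "r2": 1.7, "r3": 0.9}, deadline=1.0)
```

The suspect lists are what a group leader would report to Proteus, which then decides whether to kill and replace the faulty objects.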


Egida

  • Developed at UT Austin

  • An object-oriented, extensible toolkit for low-overhead fault tolerance

  • Provides a library of objects that can be used to compose log-based rollback-recovery protocols.

    • Specification language to express arbitrary rollback-recovery protocols

Log-based Rollback Recovery

  • Checkpointing

    • independent, coordinated, induced by specific patterns of communication

  • Message Logging

    • Pessimistic, optimistic, causal

Core Building Blocks

  • Almost all the log-based rollback recovery protocols share event-driven structures

  • The common events are:

    • Non-deterministic events

      • Orphans, determinant

    • Dependency-generating events

    • Output-commit events

    • Checkpointing events

    • Failure-detection events

A Grammar for Specifying Rollback-Recovery Protocols

Protocol := <non-det-event-stmt>* <output-commit-event-stmt>*
            <dep-gen-event-stmt> <ckpt-stmt>opt <recovery-stmt>opt

<non-det-event-stmt> := <event> : determinant : <determinant-structure>
                        <Log <event-info-list> <how-to-log> on <stable-storage>>opt

<output-commit-event-stmt> := <output-commit-proto> output commit on <event-list>

<event> := send | receive | read | write

<determinant-structure> := {source, sesn, dest, dest}

<output-commit-proto> := independent | co-ordinated

<how-to-log> := synchronously | asynchronously

<stable-storage> := local disk | volatile memory of self
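
The terminal choices in the grammar can be checked mechanically. The sketch below is not part of Egida; the `LogClause` class is a hypothetical helper that only validates the three terminal choices (`<event>`, `<how-to-log>`, `<stable-storage>`) listed above.

```python
from dataclasses import dataclass

# Terminal alternatives copied from the grammar above.
EVENTS = {"send", "receive", "read", "write"}
HOW_TO_LOG = {"synchronously", "asynchronously"}
STABLE_STORAGE = {"local disk", "volatile memory of self"}


@dataclass(frozen=True)
class LogClause:
    """Choices made for one non-deterministic event statement (illustrative)."""
    event: str
    how_to_log: str
    stable_storage: str

    def __post_init__(self):
        checks = [
            (self.event, EVENTS, "event"),
            (self.how_to_log, HOW_TO_LOG, "how-to-log"),
            (self.stable_storage, STABLE_STORAGE, "stable-storage"),
        ]
        for value, allowed, name in checks:
            if value not in allowed:
                raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")


# A pessimistic-style choice: log receive events synchronously to local disk.
clause = LogClause("receive", "synchronously", "local disk")
```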

Egida Modules

  • EventHandler

  • Determinant

  • HowToOutputCommit

  • LogEventDeterminant

  • LogEventInfo

  • HowToLog

  • WhereToLog

  • StableStorage

  • VolatileStorage

  • Checkpointing