checkpointing rollback recovery n.
Skip this Video
Loading SlideShow in 5 Seconds..
Checkpointing & Rollback Recovery PowerPoint Presentation
Download Presentation
Checkpointing & Rollback Recovery

Loading in 2 Seconds...

play fullscreen
1 / 57

Checkpointing & Rollback Recovery - PowerPoint PPT Presentation

  • Uploaded on

Checkpointing & Rollback Recovery. Chapter 13 Anh Huy Bui Jason Wiggs Hyun Seok Roh. Introduction . Rollback recovery protocols restore the system back to a consistent state after a failure

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Checkpointing & Rollback Recovery' - nascha

Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
checkpointing rollback recovery

Checkpointing &Rollback Recovery

Chapter 13

Anh Huy Bui

Jason Wiggs

Hyun Seok Roh


Rollback recovery protocols

restore the system back to a consistent state after a failure

achieve fault tolerance by periodically saving the state of a process during the failure-free execution

treats a distributed system application as a collection of processes that communicate over a network


the saved states of a process

Why is rollback recovery of distributed systems complicated?

messages induce inter-process dependencies during failure-free operation

Rollback propagation

the dependencies may force some of the processes that did not fail to roll back

This phenomenon is called “domino effect”


If each process takes its checkpoints independently, then the system can not avoid the domino effect

this scheme is called independent or uncoordinated checkpointing

Techniques that avoid domino effect

Coordinated checkpointingrollback recovery

processes coordinate their checkpoints to form a system-wide consistent state

Communication-induced checkpointingrollback recovery

forces each process to take checkpoints based on information piggybacked on the application

Log-based rollback recovery

combines checkpointing with logging of non-deterministic events

relies on piecewise deterministic (PWD) assumption

a local checkpoint
A local checkpoint
  • All processes save their local states at certain instants of time
  • A local check point is a snapshot of the state of the process at a given instance
  • Assumption
    • Aprocess stores all local checkpoints on the stable storage
    • A process is able to roll back to any of its existing local checkpoints
    • The kth local checkpoint at process
    • Aprocess takes a checkpoint before it starts execution
consistent states
Consistent states
  • A global state of a distributed system
    • a collection of the individual states of all participating processes and the states of the communication channels
  • Consistent global state
    • a global state that may occur during a failure-free execution of distribution of distributed computation
    • if a process’s state reflects a message receipt, then the state of the corresponding sender must reflect the sending of the message
  • A global checkpoint
    • a set of local checkpoints, one from each process
  • A consistent global checkpoint
    • a global checkpoint such that no message is sent by a process after taking its local point that is received by another process before taking its checkpoint
interactions with outside world
Interactions with outside world
  • A distributed system often interacts with the outside world to receive input data or deliver the outcome of a computation
  • Outside World Process (OWP)
    • a special process that interacts with the rest of the system through message passing
  • A common approach
    • save each input message on the stable storage before allowing the application program to process it
  • Symbol “||”
    • An interaction with the outside world to deliver the outcome of a computation
  • In-transit message
    • messages that have been sent but not yet received
  • Lost messages
    • messages whose ‘send’ is done but ‘receive’ is undone due to rollback
  • Delayed messages
    • messages whose ‘receive’ is not recorded because the receiving process was either down or the message arrived after rollback
  • Orphan messages
    • messages with ‘receive’ recorded but message ‘send’ not recorded
    • do not arise if processes roll back to a consistent global state
  • Duplicate messages
    • arise due to message logging and replaying during process recovery
messages example
Messages – example
  • In-transit
    • ,
  • Lost
  • Delayed
    • ,
  • Orphan
    • none
  • Duplicated
    • ,
issues in failure recovery
Issues in failure recovery
  • Checkpoints : {, }, {, }, and {, }
  • Messages : A - J
  • The restored global consistent state : {, , }
issues in failure recovery1
Issues in failure recovery
  • The rollback of process to checkpoint created an orphan message H
  • Orphan message I is created due to the roll back of process to checkpoint
  • Messages C, D, E, and F are potentially problematic
    • Message C: a delayed message
    • Message D: a lost message since the send event for D is recorded in the restored state for , but the receive event has been undone at process .
    • Lost messages can be handled by having processes keep a message log of all the sent messages
    • Messages E, F: delayed orphan messages. After resuming execution from their checkpoints, processes will generate both of these messages
uncoordinated checkpointing
Uncoordinated Checkpointing
  • Each process has autonomy in deciding when to take checkpoints
  • Advantages
    • The lower runtime overhead during normal execution
  • Disadvantages
    • Domino effect during a recovery
    • Recovery from a failure is slow because processes need to iterate to find a consistent set of checkpoints
    • Each process maintains multiple checkpoints and periodically invoke a garbage collection algorithm
    • Not suitable for application with frequent output commits
  • The processes record the dependencies among their checkpoints caused by message exchange during failure-free operation
direct dependency tracking technique
Direct dependency tracking technique
  • Assume each process starts its execution with an initial checkpoint
  • : checkpoint interval, interval between and
  • When receives a message m during , it records the dependency from to , which is later saved onto stable storage when takes
coordinated checkpointing
Coordinated Checkpointing
  • Blocking Checkpointing
    • After a process takes a local checkpoint, to prevent orphan messages, it remains blocked until the entire checkpointing activity is complete
    • Disadvantages
      • the computation is blocked during the checkpointing
  • Non-blocking Checkpointing
    • The processes need not stop their execution while taking checkpoints
    • A fundamental problem in coordinated checkpointing is to prevent a process from receiving application messages that could make the checkpoint inconsistent.
coordinated checkpointing1
Coordinated Checkpointing
  • Example (a) : checkpoint inconsistency
    • message m is sent by after receiving a checkpoint request from the checkpoint coordinator
    • Assume m reaches before the checkpoint request
    • This situation results in an inconsistent checkpoint since checkpoint shows the receipt of message m from , while checkpoint does not show m being sent from
  • Example (b) : a solution with FIFO channels
    • If channels are FIFO, this problem can be avoided by preceding the first post-checkpoint message on each channel by a checkpoint request, forcing each process to take a checkpoint before receiving the first post-checkpoint message
communication induced checkpointing
Communication-induced Checkpointing
  • Two types of checkpoints
    • autonomous and forced checkpoints
  • Communication-induced checkpointing piggybacks protocol- related information on each application message
  • The receiver of each application message uses the piggybacked information to determine if it has to take a forced checkpoint to advance the global recovery line
  • The forced checkpoint must be taken before the application may process the contents of the message
  • In contrast with coordinated checkpointing, no special coordination messages are exchanged
  • Two types of communication-induced checkpointing
    • model-based checkpointing and index-based checkpointing.
log based rollback recovery
Log-based Rollback Recovery
  • A log-based rollback recovery makes use of deterministic and nondeterministic events in a computation.
  • Deterministic and Non-deterministic events
    • Non-deterministic events can be the receipt of a message from another process or an event internal to the process
    • a message send event is not a non-deterministic event.
    • the execution of process is a sequence of four deterministic intervals
    • Log-based rollback recovery assumes that all non-deterministic events can be identified and their corresponding determinants can be logged into the stable storage
    • During failure-free operation, each process logs the determinants of all non-deterministic events that it observes onto the stable storage
no orphans consistency condition
No-orphans consistency condition
  • Let e be a non-deterministic event that occurs at process p
  • Depend(e)
    • the set of processes that are affected by a non-deterministic event e. This set consists of p, and any process whose state depends on the event e according to Lamport’s happened before relation
  • Log(e)
    • the set of processes that have logged a copy of e’s determinant in their volatile memory
  • Stable(e)
    • a predicate that is true if e’s determinant is logged on the stable storage
  • always-no-orphans condition
    • ∀(e) :¬Stable(e) ⇒ Depend(e) ⊆ Log(e)
pessimistic logging
Pessimistic Logging
  • Pessimistic logging protocols assume that a failure can occur after any non-deterministic event in the computation
  • However, in reality failures are rare
  • synchronous logging
    • ∀e: ¬Stable(e) ⇒ |Depend(e)| = 0
    • if an event has not been logged on the stable storage, then no process can depend on it.
    • stronger than the always-no-orphans condition
pessimistic logging1
Pessimistic Logging
  • Suppose processes and fail as shown, restart from checkpoints B and C, and roll forward using their determinant logs to deliver again the same sequence of messages as in the pre-failure execution
  • Once the recovery is complete, both processes will be consistent with the state of that includes the receipt of message from
optimistic logging
Optimistic Logging
  • Processes log determinants asynchronously to the stable storage
  • Optimistically assume that logging will be complete before a failure occurs
  • Do not implement the always-no-orphans condition
  • To perform rollbacks correctly, optimistic logging protocols track causal dependencies during failure free execution
  • Optimistic logging protocols require a non-trivial garbage collection scheme
  • Pessimistic protocolsneed only keep the most recent checkpoint of each process, whereas optimistic protocolsmay need to keep multiple checkpoints for each process
causal logging
Causal Logging
  • Combines the advantages of both pessimistic and optimistic logging at the expense of a more complex recovery protocol
  • Like optimistic logging, it does not require synchronous access to the stable storage except during output commit
  • Like pessimistic logging, it allows each process to commit output independently and never creates orphans, thus isolating processes from the effects of failures at other processes
  • Make sure that the always-no-orphans property holds
  • Each process maintains information about all the events that have causally affected its state
koo toueg coordinated checkpointing algorithm
Koo-Toueg coordinated checkpointing algorithm

A coordinated checkpointing and recovery technique that takes a consistent set of checkpointing and avoids domino effect and livelock problems during the recovery

Includes 2 parts: the checkpointing algorithm and the recovery algorithm

koo toueg coordinated checkpointing algorithm cont
Koo-Toueg coordinated checkpointing algorithm(cont.)
  • Checkpointing algorithm
    • Assumptions: FIFO channel, end-to-end protocols, communication failures do not partition the network, single process initiation, no process fails during the execution of the algorithm
    • Two kinds of checkpoints: permanent and tentative
      • Permanent checkpoint: local checkpoint, part of a consistent global checkpoint
      • Tentative checkpoint: temporary checkpoint, become permanent checkpoint when the algorithm terminates successfully
koo toueg coordinated checkpointing algorithm cont1
Koo-Toueg coordinated checkpointing algorithm(cont.)
  • Checkpointing algorithm
    • 2 phases
      • The initiating process takes a tentative checkpoint and requests all other processes to take tentative checkpoints. Every process can not send messages after taking tentative checkpoint. All processes will finally have the single same decision: do or discard
      • All processes will receive the final decision from initiating process and act accordingly
    • Correctness: for 2 reasons
      • Either all or none of the processes take permanent checkpoint
      • No process sends message after taking permanent checkpoint
    • Optimization: maybe not all of the processes need to take checkpoints (if not change since the last checkpoint)
koo toueg coordinated checkpointing algorithm cont2
Koo-Toueg coordinated checkpointing algorithm(cont.)
  • The rollback recovery algorithm
    • Restore the system state to a consistent state after a failure with assumptions: single initiator, checkpoint and rollback recovery algorithms are not invoked concurrently
    • 2 phases
      • The initiating process send a message to all other processes and ask for the preferences – restarting to the previous checkpoints. All need to agree about either do or not.
      • The initiating process send the final decision to all processes, all the processes act accordingly after receiving the final decision.
koo toueg coordinated checkpointing algorithm cont3
Koo-Toueg coordinated checkpointing algorithm(cont.)
  • Correctness: resume from a consistent state
  • Optimization: may not to recover all, since some of the processes did not change anything
juang venkatesan algorithm for asynchronous checkpointing and recovery
Juang-Venkatesan algorithm for asynchronous checkpointing and recovery
  • Assumptions: communication channels are reliable, delivery messages in FIFO order, infinite buffers, message transmission delay is arbitrary but finite
  • Underlying computation/application is event-driven: process P is at state s, receives message m, processes the message, moves to state s’ and send messages out. So the triplet (s, m, msgs_sent) represents the state of P
  • Two type of log storage are maintained:
    • Volatile log: short time to access but lost if processor crash. Move to stable log periodically.
    • Stable log: longer time to access but remained if crashed
juang venkatesan algorithm for asynchronous checkpointing and recovery cont
Juang-Venkatesan algorithm for asynchronous checkpointing and recovery(cont.)
  • Asynchronous checkpointing:
    • After executing an event, the triplet is recorded without any synchronization with other processes.
    • Local checkpoint consist of set of records, first are stored in volatile log, then moved to stable log.
  • Recovery algorithm
    • Notations:
      • (): number of messages received by from , from the beginning of computation to checkpoint
      • (): number of messages sent by to , from the beginning of computation to checkpoint
    • Idea:
      • From the set of checkpoints, find a set of consistent checkpoints
      • Doing that based on the number of messages sent and received
manivannan singhal algorithm
Manivannan-Singhal algorithm
  • Observation: there are some checkpoints useless (i.e. never included in any consistent global checkpoint), even none of them are useful
  • Combine the coordinated and uncoordinated checkpointing approaches
    • Take checkpoint asynchronously
    • Use communication-induced checkpointing to eliminates the useless checkpoint
    • Every checkpoint lies on a consistent checkpoint, determine the recovery line is easy and fast
  • Idea
    • Each checkpoint of a process has a unique sequence number – local number, increased periodically
    • When a process send out a message, its sequence number is piggybacked
    • When a process received a message, if the received sequence number > its sequence number, it is forced to take checkpoint, and any basic checkpointing with smaller sequence number is skipped
manivannan singhal checkpointing alg 1
Manivannan-Singhal – Checkpointing Alg. (1)
  • Checkpointing algorithm
    • Checkpoints satisfy the following interesting properties
      • Ci,m of Pi is concurrent with C*, m of all other processes
      • Checkpoints C*,m of all processes form a consistent global checkpoint
      • Checkpoint Ci,m of Pi is concurrent with earliest checkpoint Cj, n with m ≤ n
manivannan singhal checkpointing alg 2
Manivannan-Singhal – CheckpointingAlg. (2)

For a forced


For a basic


manivannan singhal checkpointing ex
Manivannan-Singhal – Checkpointing Ex
  • M1forces P2to take a forced checkpoint with sequence number 3 before processing M1because> sn2
manivannan singhal recovery ex
Manivannan-Singhal – Recovery Ex
  • When recovers,
    • broadcast rollback(inc3, rec_lin3) where inc3 = 1 and rec_line3 = 5
    • rollback to
    • does not have a checkpoint with sequence number ≥ 5. So it takes a local check point and assign 5 as its sequence number
manivannan singhal quasi synchronous checkpointing algorithm cont
Manivannan-Singhal quasi-synchronous checkpointing algorithm(cont.)
  • Comprehensive handling messages during recovery
    • Handling the replay of messages
    • Handle of received messages
peterson kearns algorithm definition 1
Peterson-Kearns algorithm – Definition (1)
  • Based on optimistic recovery protocol
  • Rollback based on vector time
  • Ring configuration : each processor knows its successor on the ring
  • Each process has a vector clock , 0 ≤ j ≤ N-1
  • : the clock value of an event which occurred at
  • : the current vector clock time of and denotes the most recent event in , thus =
  • : i th event on
  • s : A send event of the underlying computation
  • : The process where send event s occurs
  • (s) : The process where the receive event matched with send event s occurs
  • : The ith failure on
peterson kearns algorithm definition 2
Peterson-Kearns algorithm – Definition (2)
  • : The ith state checkpoint on . The check point resides on the stable stoarge
  • : The i th restart event on
  • : The i throllback event on
  • LastEvent() = iff
  • : The arrival of the final polling wave message for rollback from failure at process
  • : The response to this final polling wave by . If no response is required, =
  • The final polling wave for recovery from failure :


  • tk(i, m).ts: the token with failure and restart event
  • tk(i, m).inc : incarnation number of in the token
peterson kearns alg informal description 1
Peterson-Kearns Alg. – Informal Description (1)
  • Step 1
    • When a process restarts after failure, it retrieves its latest checkpoint, including its vector time value, and roll back to it
  • Step 2
    • The message log is replayed
  • Step 3
    • The recovering process executes a restart event to begin the global rollback protocol
    • creates a token message containing the vector timestamp of the restart event
    • sends the token to its successor process
peterson kearns alg informal description 2
Peterson-Kearns Alg. – Informal Description (2)
  • Step 4
    • The token is circulated through all the processes on the ring

(propagation rule : from to )

    • When the token arrives at process , the timestamp in the token is used to determine whether must roll back

If tk(i, m).ts < ,

then must roll back to an earlier state because an orphan event has occurred at

Otherwise, the state of is not changed

  • Step 5
    • When the token returns to the originating process, the roll back recovery is complete
peterson kearns alg formal description 1
Peterson-Kearns Alg. – Formal Description (1)
  • described as set of six rules, CRB1 to CRB6
  • CRB1
    • A formerly failed process creates and propagates a token, event , only after restoring the state from the latest checkpoint and executing the message log from the stable storage
  • CRB2
    • The restart event increments the incarnation number at the recovering process, and the token carries the vector timestamp of the restart event and the newly incremented incarnation number
  • CRB3
    • A non-failed process will propagate the token only after it has rolled back
peterson kearns alg formal description 2
Peterson-Kearns Alg. – Formal Description (2)
  • CRB4
    • A non-failed process will propagate the token only after it has incremented its incarnation number and has stored the vector timestamp of the token and the incarnation number of the token in its OrVectset
  • CRB5
    • When the process that failed, recovered, and initiated the token, receives its token back, the rollback is complete
  • CRB6
    • Messages that were in transit and which were orphaned by the failure and subsequent restart and recovery must be discarded
helary mostefaoui netzer raynal protocol 1
Helary-Mostefaoui-Netzer-Raynal protocol (1)
  • Communication-induced checking protocol
  • Some coordination is required in taking local checkpoints
  • Achieve the coordination by piggybacking control information on application messages
  • Basic checkpoints
    • Processes take local checkpoints independently
  • Forced checkpoints
    • The protocol directs processes to take additional local checkpoints
    • A process takes a forces checkpoint when it receives a message and its predicate becomes true
  • No local checkpoint is useless
  • Takes as few forced checkpoints as possible
helary mostefaoui netzer raynal protocol 2
Helary-Mostefaoui-Netzer-Raynal protocol (2)
  • Based on Z-path and Z-cycle theory
    • A useless checkpoint exactly corresponds to the existence of a Z-cycle in the distributed computation
    • The protocol prevents Z-Cycles
  • A Z-path exists from local check point A to local checkpoint Biff (i) A precedes B in the same process, or (ii) a sequence of message [, , . . . , ] (q 1) exists such that
    • (1)Aprecedes send() in the same process, and
    • (2) for each , i < q, delivery()is in the same or earlier interval as send(), and
    • (3) delivery() precedes B in the same process
helary mostefaoui netzer raynal protocol 3
Helary-Mostefaoui-Netzer-Raynal protocol (3)
  • AZ-path from a local checkpoint to the same local checkpoint is called a Z-cycle (i.e., it involves the local checkpoint )
  • In a Z-path [, , . . . , ], two consecutive messages and form a Z-patterniff send() delivery()
  • Theorem : For any pair of checkpoints and , such that there is a Z-path from to , < implies that there is no Z-cycle
h m n r protocol z path z cycle ex
H-M-N-R protocol – Z-path & Z-cycle ex.
  • [] is a Z-path from to
  • [] and [] are two Z-paths from to
  • [] and [] are two Z-patterns
  • The Z-path [, , ]isa Z-cycle that involves the local checkpoint
h m n r protocol forced checkpoints ex
H-M-N-R protocol – forced checkpoints ex.
  • (a) ≤ : < . Hence, the Z-pattern [, is consistent with the assumption of the above theorem
  • (b) >: A safe strategy to prevent Z-cycle formation is to direct to take a forced checkpoint before delivering . This “breaks” [, ], so it is no longer a Z-pattern
  • How to implement “taking a forced checkpoint”?
    • takes a forced checkpoint if C is true, where

C k: > )

the end
The End
  • Question portion