
Message Logging: Pessimistic & Optimistic

CS717 Lecture 10/16/01-10/18/01

Kamen Yotov

kyotov@cs.cornell.edu

Introduction
  • Context & Applications
  • Check-pointing
  • Message Logging
    • Pessimistic (failure-free mode suffers)
    • Optimistic (good for failure-free mode)
    • Causal (to be discussed in next lectures...)
  • Main problems
    • Consistency
      • Orphans
Fault Tolerance “Why”s
  • Flow of events
    • Check-point
    • Log messages
    • Crash
    • Restore
    • Replay
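
The flow above can be illustrated with a toy process. This is only a sketch under stated assumptions: the process is deterministic, its state changes only on message delivery, and the `Process` class with its `log` and `saved` fields is a hypothetical stand-in for stable storage, not part of the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Deterministic toy process: state changes only when a message is delivered."""
    total: int = 0
    delivered: int = 0                        # number of messages delivered so far
    log: list = field(default_factory=list)   # stand-in for the message log on stable storage
    saved: tuple = (0, 0)                     # stand-in for the latest check-point

    def deliver(self, msg: int, replaying: bool = False) -> None:
        if not replaying:
            self.log.append(msg)              # log the message before acting on it
        self.total += msg
        self.delivered += 1

    def checkpoint(self) -> None:
        self.saved = (self.total, self.delivered)

    def crash(self) -> None:
        self.total, self.delivered = -1, -1   # volatile state is lost

    def recover(self) -> None:
        self.total, self.delivered = self.saved      # restore from the check-point
        for msg in self.log[self.delivered:]:        # replay messages logged after it
            self.deliver(msg, replaying=True)


p = Process()
p.deliver(5)
p.checkpoint()
p.deliver(7)
p.deliver(1)
before = (p.total, p.delivered)
p.crash()
p.recover()
```

After `recover()` the process is back in the state it had before the crash, because replaying the same messages in the same order is deterministic.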
Common Assumptions
  • Fail-stop model
  • Failure eventually detectable by all
  • Channels
    • Asynchronous
    • Reliable
    • FIFO
    • Unbounded message delivery
    • Failures
      • Transiently dropping
      • No duplication and/or corruption
  • Stable storage
  • Spare processing capacity
Common goals
  • Application independence
  • Application transparency
    • Simple
    • Independent evolution
    • Handles preexisting programs
  • High throughput
    • Failure-free mode with little overhead
  • Maximum fault-tolerance
    • Any number of failures
Formal Terminology
  • Delivery (as opposed to receipt)
    • Non-faulty processes eventually deliver all messages that they have received
    • Receive sequence number
      • If p delivers m and m.rsn = l, then m is the l-th message p delivers
  • Run
    • Sequence of system states
    • Asynchronous
      • Only one process changes state at once
Formal Terminology (cont.)
  • Properties: Logical expressions over runs
    • □ - Always 
    • ◊ - Eventually 
  • Message determinant
    • #m = <m.src, m.ssn, m.dest, m.rsn, m.data>
    • m.data and m.dest not essential
    • Logging determinants vs. actual messages
  • Other notation
    • N – set of all processes
    • C – set of failed processes
    • Log(m) – set of processes possessing a copy of #m
    • Depend(m) – set of processes that depend on m
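
As a rough illustration of the determinant tuple, the sketch below assigns receive sequence numbers at delivery time. The `Determinant` and `Receiver` names are hypothetical, `ssn` is read as the sender's sequence number, and numbering deliveries from 1 is an assumption taken from "the l-th message p delivers".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Determinant:
    src: str                       # m.src
    ssn: int                       # m.ssn, assigned by the sender
    dest: str                      # m.dest
    rsn: int                       # m.rsn, assigned when the receiver delivers m
    data: Optional[bytes] = None   # m.data; optional, since it is not essential


class Receiver:
    def __init__(self, name: str) -> None:
        self.name = name
        self.next_rsn = 1          # assumption: the first delivered message gets rsn 1
        self.determinants = []     # determinants this process currently holds

    def deliver(self, src: str, ssn: int, data: Optional[bytes] = None) -> Determinant:
        det = Determinant(src, ssn, self.name, self.next_rsn, data)
        self.next_rsn += 1
        self.determinants.append(det)
        return det


q = Receiver("q")
d1 = q.deliver("p", ssn=4)
d2 = q.deliver("r", ssn=1)
```

Logging only the determinant (without `data`) is enough to fix the delivery order during replay, provided the message contents can be regenerated by the sender.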
Orphan Properties
  • Before failure, by definition #mLog(m)
  • #m is lost if $\text{Log}(m) \subseteq C$
  • stable(m) if #m cannot be lost
  • p is an orphan of C if
    • p did not fail
    • $p \in \text{Depend}(m)$
    • #m is lost
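
A small sketch of the orphan test, assuming "lost" means every process in Log(m) is in the failed set C. The function names and the dictionary layout (`log` and `depend` keyed by a message id) are illustrative choices, not from the lecture.

```python
def is_lost(log_m: set, failed: set) -> bool:
    """#m is lost when every process holding a copy of it has failed."""
    return log_m <= failed


def orphans(log: dict, depend: dict, failed: set) -> set:
    """Processes that did not fail but depend on some message whose determinant is lost."""
    result = set()
    for m, holders in log.items():
        if is_lost(holders, failed):
            result |= depend.get(m, set()) - failed
    return result


log = {"m1": {"q"}, "m2": {"q", "r"}}       # Log(m) for two messages
depend = {"m1": {"q", "r"}, "m2": {"q"}}    # Depend(m) for the same messages
found = orphans(log, depend, failed={"q"})
```

Here `m1` is lost because its only holder `q` failed, so the surviving process `r` that depends on `m1` is an orphan; `m2` survives because `r` still holds its determinant.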
Performance Metrics
  • Number of forced roll-backs
  • Time spent blocking
  • Number of messages
  • Size of messages
Got to the real-world stuff!
  • No additional messages
  • Any number of failures (including total)
  • No assumptions about the logging protocol
    • Pessimistic doesn’t require that generality
The Model: Process states
  • Process states
    • State interval
      • Instantiates a new one on each message received
      • State interval index (auto increment)

[Figure: timelines of processes p1, p2 and p3, each divided into numbered state intervals]

The Model: Process states (cont.)
  • Dependencies between process states (pi depends on pj)
    • Maximum index of any interval of pj, on which pi depends
    • Inside a process each interval depends on the previous one
  • Dependence vector
    • $d_i = \langle \delta_* \rangle = \langle \delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_n \rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$
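
One way to maintain such a vector is sketched below. It assumes each message carries only the index of the sender's current state interval (direct dependencies only), uses 0-based process ids, and encodes "no dependency" as `-1`; the `on_deliver` helper is hypothetical.

```python
BOTTOM = -1  # assumption: encodes "no dependency" and compares below every interval index


def on_deliver(vector: list, receiver: int, sender: int, sender_interval: int) -> list:
    """Return the dependence vector of the new state interval started by a delivery."""
    new = list(vector)
    new[receiver] += 1                                # a new interval begins; the index auto-increments
    new[sender] = max(new[sender], sender_interval)   # keep the maximum interval depended on
    return new


d0 = [0, BOTTOM, BOTTOM]                 # process 0 in its initial interval
d1 = on_deliver(d0, receiver=0, sender=2, sender_interval=3)
d2 = on_deliver(d1, receiver=0, sender=1, sender_interval=0)
d3 = on_deliver(d2, receiver=0, sender=2, sender_interval=2)
```

The last delivery leaves the entry for process 2 at 3, since only the maximum interval index is recorded.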

[Figure: timelines of processes p1, p2 and p3, each divided into numbered state intervals]

The Model: System states
  • Process state – dependence vector
    • $d_i = \langle \delta_* \rangle = \langle \delta_1, \delta_2, \delta_3, \delta_4, \ldots, \delta_n \rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$
  • System state – dependence matrix
    • $n \times n$
    • Row i – process state for pi
    • Diagonal – current state intervals
The Model: System states (cont.)
  • S – set of all system states
  • $A = [\alpha_{**}] \in S$ and $B = [\beta_{**}] \in S$
    • $A \preceq B \iff \forall i = 1..n : \alpha_{ii} \le \beta_{ii}$
    • Partial order different than Lamport’s
      • Orders system states vs. events
      • Only events are state intervals
  • Lattice
    • $A \sqcup B = [\phi_{**}]$, $\phi_{ik} = (\alpha_{ii} \ge \beta_{ii}) \;?\; \alpha_{ik} : \beta_{ik}$
    • $A \sqcap B = [\phi_{**}]$, $\phi_{ik} = (\alpha_{ii} \le \beta_{ii}) \;?\; \alpha_{ik} : \beta_{ik}$
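
A minimal sketch of the order and the two lattice operations, assuming a system state is an n×n list of lists, the order compares only diagonals, and join and meet take each whole row from whichever matrix has the larger or smaller diagonal entry. The names `leq`, `join` and `meet` are illustrative.

```python
def leq(a: list, b: list) -> bool:
    """A precedes B when no process is further along in A than in B (diagonals only)."""
    return all(a[i][i] <= b[i][i] for i in range(len(a)))


def join(a: list, b: list) -> list:
    """Row i comes from whichever state has the larger current interval for process i."""
    return [list(a[i]) if a[i][i] >= b[i][i] else list(b[i]) for i in range(len(a))]


def meet(a: list, b: list) -> list:
    """Row i comes from whichever state has the smaller current interval for process i."""
    return [list(a[i]) if a[i][i] <= b[i][i] else list(b[i]) for i in range(len(a))]


A = [[1, -1], [0, 2]]    # -1 is an assumed encoding of "no dependency"
B = [[2, 1], [-1, 1]]
J = join(A, B)
M = meet(A, B)
```

`A` and `B` are incomparable here (process 0 is further along in `B`, process 1 in `A`), yet both lie between `M` and `J`.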
The Model: Consistent System states
  • Consistent state
    • All received messages
      • Sent in the current state of the sender
      • Can be deterministically sent in the future
    • Messages not yet received are not a problem
    • Definition: $D = [\delta_{**}] \in S$ is consistent $\iff \forall i, k = 1..n : \delta_{ik} \le \delta_{kk}$
      • A process cannot depend on a state interval of another process that has not been reached yet
    • $C = \{ D \in S \mid D \text{ is consistent} \}$
      • C is a sub-lattice of S – proof straightforward!
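
The consistency condition can be checked directly on a dependence matrix. The sketch assumes the condition is that no entry in column k exceeds the diagonal entry for process k, and encodes "no dependency" as `-1`; `is_consistent` is a hypothetical helper.

```python
def is_consistent(d: list) -> bool:
    """No process depends on an interval of process k beyond k's current interval."""
    n = len(d)
    return all(d[i][k] <= d[k][k] for i in range(n) for k in range(n))


ok = [[1, 0], [-1, 0]]     # process 0 depends on interval 0 of process 1, which has been reached
bad = [[1, 2], [-1, 1]]    # process 0 depends on interval 2 of process 1, which is only at 1
```

In `bad`, process 0 has received a message that process 1 has not yet sent in the recorded state, which is exactly the situation the definition rules out.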
The Model: Logging and Stability
  • $\text{logged}(i, \sigma)$
    • The message that started state interval $\sigma$ of process i has been logged on stable storage
  • $\text{checkpoint}(i, \sigma)$
    • A check-point containing the state of process i in state interval $\sigma$ exists on stable storage
    • $\text{checkpoint}(i, 0)$ is implicit
    • The effective check-point for $\sigma$ on i is $\text{checkpoint}(i, \epsilon)$ with $\epsilon \le \sigma$ and $\epsilon$ maximal
  • $\text{stable}(i, \sigma) \iff \forall \alpha : \epsilon < \alpha \le \sigma \implies \text{logged}(i, \alpha)$
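
For a single process, the stability test might look like the sketch below. It assumes an interval is stable when every interval after its effective check-point, up to and including itself, has its starting message logged; `logged` and `checkpoints` are hypothetical sets of interval indices for that one process.

```python
def effective_checkpoint(checkpoints: set, sigma: int) -> int:
    """Largest check-pointed interval index not exceeding sigma (interval 0 is implicit)."""
    return max(e for e in checkpoints | {0} if e <= sigma)


def stable(logged: set, checkpoints: set, sigma: int) -> bool:
    """True if interval sigma can be rebuilt from the check-point plus logged messages."""
    eps = effective_checkpoint(checkpoints, sigma)
    return all(alpha in logged for alpha in range(eps + 1, sigma + 1))


logged = {1, 2, 5}     # intervals whose starting message is on stable storage
checkpoints = {3}      # intervals recorded by a check-point
results = [stable(logged, checkpoints, s) for s in range(6)]
```

Interval 3 is stable through its check-point alone, while interval 5 is not, because the message that started interval 4 was never logged.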
The Model: Recoverable System states
  • Recoverable system state
    • System state is consistent
    • All current process states are stable
    • $D = [\delta_{**}] \in S$
      • $\text{recoverable}(D) \iff D \in C \land \forall i : \text{stable}(i, \delta_{ii})$
    • $R = \{ D \in S \mid \text{recoverable}(D) \}$
      • R is a sub-lattice of S – proof straightforward!
  • Theorem: A single maximum recoverable state exists!
    • Proof
      • $R \subseteq S$
      • $A \sqcup B \in R$ if $A, B \in R$, and $A, B \preceq A \sqcup B$
      • Therefore the maximum is $\bigsqcup_{D \in R} D$, obviously unique!
The Model: Recoverable System states (cont.)
  • Current recovery state
    • The maximum recoverable state at any time
    • Never decreases
      • $D = [\delta_{**}]$: no state interval $\sigma$ of any process i with $\sigma \le \delta_{ii}$ is ever rolled back
      • Proof:
        • D will always remain consistent
        • Each $\delta_{ii}$ will always remain stable
        • Since R is a lattice, any new state formed after D will be greater than D
        • In any new current recovery state:
          • $\delta_{ii} \le$ the state interval index of each process i
          • Therefore, no state interval $\sigma \le \delta_{ii}$, for any i, will ever need to be rolled back!
The Model: Wrapup…
  • Corollary 1: If all messages received are eventually logged, no domino effect occurs
  • If $D = [\delta_{**}]$ is the current recovery state
    • Corollary 2: Any messages sent by process i from a state interval $\sigma \le \delta_{ii}$ may be committed
    • With $\epsilon_i$ being the effective checkpoint of $\delta_{ii}$
      • Corollary 3: All previous checkpoints of process i may be discarded
      • Corollary 4: All messages that begin state intervals prior to $\epsilon_i$ may be discarded
The Algorithm: Overview
  • Keep a current recovery state
  • Whenever a new interval $\sigma$ of some process k becomes stable
    • Try to improve the current recovery state, such that:
      • State of process k advances to $\sigma$
      • Add more state intervals from other processes to maintain consistency
      • Succeed if all such included intervals are stable
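The Algorithm: Basic implementation
  • Notation
    • $D = [\delta_{**}]$ – the current recovery state
    • $\sigma$ – state interval of process k becoming stable
    • $d_k = \langle \gamma_* \rangle = \langle \gamma_1, \gamma_2, \gamma_3, \gamma_4, \ldots, \gamma_n \rangle$, $\gamma_j \in \{\bot, 0, 1, \ldots\}$ – state of process k (dependence vector)
  • Algorithm
      if ($\sigma > \delta_{kk}$) {
        $\forall i : \delta_{ki} \leftarrow \gamma_i$   // update row k of D
        while ($\exists i, j : \delta_{ij} > \delta_{jj}$)
          if ($\exists \alpha \ge \delta_{ij} : \text{stable}(j, \alpha)$)   // $\alpha$ – an interval of process j
            $\forall i : \delta_{ji} \leftarrow \gamma_i$   // update row j of D with $d_j$ for $\alpha$
          else fail
      }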
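
The loop can be sketched in Python as follows. This is one reading of the slide, not the paper's implementation: `vectors[j][a]` is assumed to hold the dependence vector of interval `a` of process `j`, `stable[j]` the set of stable interval indices of process `j`, `-1` encodes "no dependency", and the function returns `None` where the slide says "fail". All names are hypothetical.

```python
def advance(D: list, k: int, sigma: int, vectors: list, stable: list):
    """Try to advance recovery state D so that process k reaches interval sigma.

    Returns the new dependence matrix, D itself if sigma is not an advance,
    or None if consistency cannot be restored using stable intervals only.
    """
    if sigma <= D[k][k]:
        return D
    n = len(D)
    new = [list(row) for row in D]          # work on a copy; D stays untouched
    new[k] = list(vectors[k][sigma])        # update row k with the vector of sigma
    while True:
        # look for an entry that depends on an interval not yet included
        bad = next(((i, j) for i in range(n) for j in range(n)
                    if new[i][j] > new[j][j]), None)
        if bad is None:
            return new                      # consistent, and every row is stable
        i, j = bad
        # minimum stable interval of process j that covers the dependency
        alpha = next((a for a in range(new[i][j], len(vectors[j]))
                      if a in stable[j]), None)
        if alpha is None:
            return None                     # fail: keep the old recovery state
        new[j] = list(vectors[j][alpha])    # update row j with the vector of alpha


# Two processes; interval 1 of each depends on an interval of the other.
vectors = [[[0, -1], [1, 1]],     # process 0: intervals 0 and 1
           [[-1, 0], [0, 1]]]     # process 1: intervals 0 and 1
D0 = [[0, -1], [-1, 0]]
blocked = advance(D0, 0, 1, vectors, [{0, 1}, {0}])
pulled = advance(D0, 0, 1, vectors, [{0, 1}, {0, 1}])
```

In the first call the attempt fails because interval 1 of process 1 is not stable yet; in the second call that interval is pulled into the recovery state to keep it consistent. Each pass of the loop strictly raises one diagonal entry, so the loop terminates.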
The Algorithm: Some details
  • The chosen $\alpha$ should be the minimum stable state interval with $\alpha \ge \delta_{ij}$
  • The comparisons $\delta_{ij} > \delta_{jj}$ can be made in any order without affecting the final result
  • When state interval $\sigma$ of process k becomes stable, the algorithm finds some recoverable D with $\delta_{kk} = \sigma$, if one exists
  • No stable process state interval $\sigma$ that was not suitable should be checked again before the current recovery state advances
    • Corollary: When the recovery state advances from some D to D’, the rejected $\sigma$’s above that need to be rechecked are those with a direct dependency on some $\alpha$ of any process i such that $\delta_{ii} < \alpha < \delta'_{ii}$
The Algorithm: Proof of Correctness
  • The algorithm presented always finds the current recovery state of the system
    • Only finds recoverable system states
    • Any such state found is greater
    • Following the observations stated before, all possible new states are considered
    • Therefore, the correct one is always found!
The Algorithm: Optimizations & Implementation
  • Optimization considerations
    • Keeping work list of rows to update D
      • Keep only the one with max index
    • Keeping only the diagonal of D
  • Implementation
    • Provided in the paper
    • Follows everything said till now
    • Takes advantage of some specifics
Conclusions
  • General Model and Algorithm
    • Work for both pessimistic and optimistic protocols
    • Does not need the generality for the pessimistic case
  • Optimistic logging is desirable from performance standpoint in low failure environments
  • Unifies existing approaches to fault tolerance
    • Check-pointing
    • Message Logging
  • Results
    • Existence of unique maximum recoverable state
    • Never decreases (progress is being made)
    • Domino effect cannot occur
Future work list…
  • Address non-determinism
    • Switch between
      • check-pointing for the non-deterministic part
      • Check-pointing + message logging elsewhere
  • Output-driven optimistic message logging and check-pointing
    • Pay attention to communication of the results
  • Application specific knowledge