CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS

Self Stabilization
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS, Fall 2011, Prof. Jennifer Welch

Reference
• Self-Stabilization, Shlomi Dolev, MIT Press, 2000.
  • Chapter 2
• Slides prepared for the book by Shlomi Dolev
  • available at http://www.cs.bgu.ac.il/~dolev/book/slides.html

Self-Stabilization
• A powerful form of fault tolerance.
• Starting from an arbitrary system configuration, the algorithm begins working properly on its own.
• An arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, …
• As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.
• The paradigm has been applied to both shared memory and message passing models.

Definitions
• An execution is no longer defined to start with an initial configuration
  • instead, it can start with an arbitrary configuration.
• Depending on the problem to be solved, certain executions are considered legal, forming the set LE.
• A configuration C is safe if every admissible execution starting with C is in LE.
• An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.

Self-Stabilization Definition
[Figure: an execution drawn as a sequence of configurations, with labels "arbitrary configuration", "safe configuration", and "legal execution"]

Communication Model
• A "hybrid" of message passing and shared memory
• Communication topology is represented as an undirected graph
  • not necessarily fully connected
• Processors correspond to vertices
• Corresponding to each edge (p_i, p_j) are two shared read/write registers:
  • R_ij: written by p_i and read by p_j
  • R_ji: written by p_j and read by p_i
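The register layout described above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the helper name `make_registers` and the dictionary representation are assumptions, and the example edge list is inferred from the register names shown in the slide's figure.

```python
from collections import defaultdict

def make_registers(edges):
    """Build neighbor sets and registers for an undirected graph.

    registers[(i, j)] models R_ij: written only by p_i, read only by p_j.
    """
    neighbors = defaultdict(set)
    registers = {}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
        registers[(i, j)] = None  # R_ij, not yet written
        registers[(j, i)] = None  # R_ji, not yet written
    return dict(neighbors), registers

# Assumed example topology: edges p0-p1, p1-p2, p1-p3, p2-p3.
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]
neighbors, registers = make_registers(edges)
```

Each undirected edge contributes exactly two registers, one per direction, so a graph with m edges has 2m registers.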

Communication Model
[Figure: example communication graph on processors p_0, p_1, p_2, p_3, with registers R_01, R_10, R_12, R_21, R_13, R_31, R_23, R_32 on its edges]

Self-Stabilizing Spanning Tree Definition
• Every processor has a variable parent in its local state.
• There is a distinguished root processor.
• LE consists of all admissible executions in which the parent variables form a spanning tree rooted at root.

SS Spanning Tree Algorithm
• Each processor has local variables
  • parent, the id of the neighbor that is its parent
  • dist, its estimated distance to the root
• The root sets dist to 0 and copies its state to all its "outgoing" registers
• A non-root reads its neighbors' states from its "incoming" registers, adopts as its parent a neighbor with the smallest distance, and sets its own distance to one more than that neighbor's
• Nodes perform these actions repeatedly

SS Spanning Tree Algorithm
Code for root p_0:
  while true do
    parent := ⊥
    dist := 0
    for each neighbor p_j do
      R_0j := 0                    // write shared variable
    endfor
  endwhile

SS Spanning Tree Algorithm
Code for non-root p_i:
  while true do
    for each neighbor p_j do
      neigh-dist[j] := R_ji        // read shared variable
    endfor
    dist := 1 + min{neigh-dist[j] : p_j is a neighbor}
    foundParent := false
    for each neighbor p_j do
      if !foundParent and neigh-dist[j] = dist - 1 then
        parent := j; foundParent := true
      endif
      R_ij := dist                 // write shared variable
    endfor
  endwhile
Note: storage of negative values is not allowed.
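The root and non-root code can be exercised with a small simulation. The sketch below is an illustration under stated assumptions, not the slides' own implementation: each processor is a Python generator that performs one register read or write per step, the scheduler is a simple round-robin, registers start with arbitrary nonnegative values, and every program counter starts at the top of its loop (the proof also allows starting mid-iteration). The names `root_proc`, `nonroot_proc`, and `simulate`, and the example graph, are hypothetical.

```python
import random

def root_proc(me, nbrs, R, state):
    """Root: set dist to 0 and write 0 into every outgoing register."""
    while True:
        state[me]["parent"] = None       # stands in for "no parent"
        state[me]["dist"] = 0
        for j in nbrs:
            R[(me, j)] = 0               # write shared variable
            yield                        # one register access per step

def nonroot_proc(me, nbrs, R, state):
    """Non-root: read all neighbors, then choose a parent and write dist."""
    while True:
        neigh_dist = {}
        for j in nbrs:
            neigh_dist[j] = R[(j, me)]   # read shared variable
            yield
        dist = 1 + min(neigh_dist.values())
        state[me]["dist"] = dist
        found_parent = False
        for j in nbrs:
            if not found_parent and neigh_dist[j] == dist - 1:
                state[me]["parent"] = j
                found_parent = True
            R[(me, j)] = dist            # write shared variable
            yield

def simulate(neighbors, root, rounds, seed=0):
    """Start from arbitrary nonnegative values and run round-robin rounds."""
    rng = random.Random(seed)
    R = {(i, j): rng.randrange(50) for i in neighbors for j in neighbors[i]}
    state = {i: {"parent": rng.choice(neighbors[i]), "dist": rng.randrange(50)}
             for i in neighbors}
    procs = [(root_proc if i == root else nonroot_proc)(i, neighbors[i], R, state)
             for i in neighbors]
    for _ in range(rounds):
        for proc in procs:               # every processor takes one step
            next(proc)
    return state

# Assumed example graph: edges p0-p1, p1-p2, p1-p3, p2-p3, with root p0.
neighbors = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
final = simulate(neighbors, root=0, rounds=100)
```

On this graph the parent variables settle into a BFS tree rooted at p_0 regardless of the seed, because the initial garbage in the registers is overwritten once every processor has completed a few iterations.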

Output of Spanning Tree Algorithm
[Figure: example output on a graph; numbers at the nodes are distances (the root has 0), red arrows indicate parents, black edges are non-tree edges]

Correctness Proof of SS ST Alg
Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.
Definition: Δ is the degree (maximum number of neighbors) of the communication graph.
Definition: D is the diameter of the communication graph.

Correctness Proof of SS ST Alg
Lemma: Consider any admissible execution. There exist T_1 < T_2 < … < T_D such that after asynchronous round T_k:
(a) every processor at distance ≤ k from the root has dist equal to its shortest-path distance to the root, and their parent variables form a BFS tree;
(b) every processor at distance > k from the root has dist ≥ k.

Correctness Proof of SS ST Alg
Proof: By induction on k.
Basis (k = 1): Let T_1 = 5Δ.
• Initially all distances are nonnegative.
• Processors might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: all computed values are > 0.
• After the next Δ rounds, all non-root processors have completed the write for-loop at least once.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: every neighbor of the root reads 0 from the root and > 0 from every other node, so it sets dist to 1 and parent to the root.

Correctness Proof of SS ST Alg
Induction (k > 1): Assume the lemma for k - 1 and show it for k. Let T_k = T_{k-1} + 2Δ.
• Consider the execution just after the end of asynchronous round T_{k-1}.
• After the next Δ rounds, all non-root nodes have executed the write for-loop at least once (and written their dist values).
• After the next Δ rounds, all non-root nodes have executed the read for-loop at least once.
• Suppose p_i is at distance d ≤ k from the root.
  • p_i has at least one neighbor p_j at distance d-1 ≤ k-1 from the root, and no neighbor that is closer to the root.
  • By the inductive hypothesis, p_j's register has the correct value in it, and all other neighbors of p_i have registers with values ≥ d-1.
  • Thus p_i correctly computes dist and parent.
• Suppose p_i is at distance > k from the root.
  • Every neighbor of p_i is at distance ≥ k from the root.
  • By the inductive hypothesis, all their registers have values ≥ k-1.
  • Thus p_i computes dist to be ≥ k.

Correctness Proof of SS ST Alg
• Since every processor is at distance at most D from the root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(ΔD) asynchronous rounds, no matter what the starting configuration.
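Assuming the recurrence from the lemma's proof is T_1 = 5Δ and T_k = T_{k-1} + 2Δ, with Δ the maximum degree of the communication graph, unrolling it gives an explicit round count. The following is a sketch of that arithmetic:

```latex
\begin{align*}
T_D &= T_1 + \sum_{k=2}^{D} 2\Delta \\
    &= 5\Delta + 2\Delta\,(D - 1) \\
    &= (2D + 3)\,\Delta \;=\; O(\Delta D).
\end{align*}
```

When Δ is treated as a constant, this is O(D) asynchronous rounds.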

Another Classic SS Algorithm
• Proposed by Dijkstra
• Suggested for mutual exclusion
  • we will view it as a "token circulation" algorithm
• Uses a stronger model of computation
  • in one atomic step, a processor can read all its "incoming" registers and write all its "outgoing" registers

Ring Communication Topology
[Figure: ring of processors p_0, p_1, p_2, p_3 with registers R_0, R_1, R_2, R_3]
• Processors are arranged in a unidirectional ring.
• Only one register is needed for each processor: p_0 writes into R_0, p_1 reads from R_0, etc.

Processor's States
• Each processor's state consists solely of an integer ranging from 0 to K - 1 (for a suitable value of K).
• In fact, the processor simply stores this integer in its register.

Definition of Holding the Token
• Processor p_0 holds the token if R_0 = R_{n-1}.
• Processor p_i (other than p_0) holds the token if R_i ≠ R_{i-1}.
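The token definition translates directly into a predicate on the register values. The sketch below assumes the ring's registers are stored in a Python list `R` with `R[i]` holding R_i; the function names are hypothetical.

```python
def holds_token(R, i):
    """Return True if processor i holds the token in configuration R."""
    if i == 0:
        return R[0] == R[-1]     # p_0 holds the token if R_0 = R_{n-1}
    return R[i] != R[i - 1]      # p_i holds the token if R_i != R_{i-1}

def token_holders(R):
    """List the indices of all processors currently holding a token."""
    return [i for i in range(len(R)) if holds_token(R, i)]
```

For example, `token_holders([3, 3, 3, 3])` is `[0]`, while the arbitrary configuration `[3, 1, 0, 4]` gives `[1, 2, 3]`. Some processor always holds a token: if no p_i with i ≠ 0 does, then all registers are equal, so p_0 does.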

Self-Stabilizing Token Circulation Definition
• LE consists of all admissible executions in which
  • in every configuration only one processor holds the token, and
  • every processor holds the token infinitely often.
(Note the resemblance to the mutual exclusion problem.)

Dijkstra's Algorithm
Code for p_0:
  while true do
    if R_0 = R_{n-1} then R_0 := (R_0 + 1) mod K endif
  endwhile
Code for p_i, i ≠ 0:
  while true do
    if R_i ≠ R_{i-1} then R_i := R_{i-1} endif
  endwhile
Note: each if statement executes atomically.
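A small simulation can make the behavior of the two code fragments concrete. The sketch below is an illustration under assumptions: registers live in a list `R`, each call to `step` is one atomic step, and a round is modeled as a random permutation of the processors so that every processor takes exactly one step per round. The function names and the choice of scheduler are hypothetical, and K is set to n + 1.

```python
import random

def step(R, i, K):
    """One atomic step of processor i on the ring registers R (a list)."""
    if i == 0:
        if R[0] == R[-1]:               # p_0 holds the token
            R[0] = (R[0] + 1) % K
    elif R[i] != R[i - 1]:              # p_i holds the token
        R[i] = R[i - 1]

def rounds_until_all_equal(R, K, rng):
    """Run rounds until all registers are equal; return the round count."""
    n = len(R)
    rounds = 0
    while len(set(R)) > 1:
        rounds += 1
        order = list(range(n))
        rng.shuffle(order)              # each processor steps once per round
        for i in order:
            step(R, i, K)
            if len(set(R)) == 1:        # stop as soon as all are equal
                break
    return rounds

rng = random.Random(1)
n = 5
K = n + 1
R = [rng.randrange(K) for _ in range(n)]   # arbitrary starting configuration
rounds = rounds_until_all_equal(R, K, rng)
```

The loop stops at the first configuration in which all registers are equal. Checking after every step rather than at the end of a round matters, because p_0 may increment R_0 again within the same round.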

Analysis of Dijkstra's Algorithm
Lemma: If all registers are equal in a configuration, then the configuration is safe.
Proof: Suppose K = 5.
[Figure: ring of p_0, p_1, p_2, p_3 with example register values for K = 5]

Analysis of Dijkstra's Algorithm
• If an execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., that a safe configuration is reached)?
• This depends on K being large enough.
• Suppose K = n+1 (so there are n+1 different values).
• Lemma 1: In every configuration, there is at least one integer in {0,…,K-1} that does not appear in any register. (There are only n registers but n+1 possible values.)

Analysis of Dijkstra's Algorithm
Lemma 2: In every admissible execution (starting from any configuration), p_0 holds the token, and thus changes R_0, at least once during every n rounds.
Proof: Suppose, for contradiction, that there is a segment of n rounds in which p_0 does not change R_0.
• Once p_1 takes a step in the first round, R_1 = R_0, and this equality remains true.
• Once p_2 takes a step in the second round, R_2 = R_1 = R_0, and this equality remains true.
• …
• Once p_{n-1} takes a step in the (n-1)-st round, R_{n-1} = R_{n-2} = … = R_0.
• So when p_0 takes a step in the n-th round, it will change R_0.

Analysis of Dijkstra's Algorithm
Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n^2) rounds.
Proof: Let j be a value not in any register in C.
• By Lemma 2, p_0 changes R_0 (by incrementing it) at least once every n rounds.
• Thus eventually R_0 holds j, in some configuration D, after O(n^2) rounds.
• Since other processors only copy values, no register holds j between C and D.
• After at most n more rounds, the value j propagates around the ring to p_{n-1}.
• At that point all registers hold j, so by the earlier lemma the configuration is safe.
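The round count in this proof can be tallied as follows, assuming K = n + 1 as in the earlier analysis and using Lemma 2 for the time between consecutive changes of R_0. This is a sketch of the arithmetic rather than a tight bound:

```latex
\begin{align*}
\text{increments of } R_0 \text{ needed to reach } j &\le K - 1 = n \\
\text{rounds per increment (Lemma 2)} &\le n \\
\text{rounds until } R_0 = j &\le n \cdot n = n^2 \\
\text{rounds for } j \text{ to reach } p_{n-1} &\le n \\
\text{total} &\le n^2 + n = O(n^2)
\end{align*}
```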

What about Reducing K?
• It is easy to see that K = n (n different values) suffices: either there is a missing value or p_0's value is unique.
• One can also show that K = n - 1 (n-1 different values) suffices.
• But if K < n - 1 (fewer than n-1 different values), then there is a counterexample.
• If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 2 suffices.
