Lecture 4: Introduction to Principles of Distributed Computing
Sergio Rajsbaum, Math Institute, UNAM, Mexico
Consensus in partially synchronous systems, and failure detectors
Part I: Realistic timing model and metric
Part II: Failure detectors, algorithms
Each process has an input and should decide an output such that:
Agreement: correct processes’ decisions are the same
Validity: decision is input of one process
Termination: eventually all correct processes decide
There are at least two possible input values 0 and 1.
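The three properties above can be checked mechanically on the record of one finished run. The following is a minimal sketch, not part of the lecture: the function name `check_consensus` and the dictionary encoding of a run are assumptions made for illustration, and "eventually" is approximated by "decided before the recorded run ended".

```python
from typing import Dict, Hashable, Optional, Set


def check_consensus(inputs: Dict[int, Hashable],
                    decisions: Dict[int, Optional[Hashable]],
                    correct: Set[int]) -> Dict[str, bool]:
    """Check agreement, validity and termination on one finished run.

    inputs    : process id -> input value (assumed never None)
    decisions : process id -> decided value, or None if it never decided
    correct   : ids of the processes that did not fail in this run
    """
    decided_by_correct = {decisions.get(p) for p in correct}
    # Termination: every correct process has decided by the end of the run.
    termination = None not in decided_by_correct
    # Agreement: the correct processes that decided chose a single value.
    agreement = len(decided_by_correct - {None}) <= 1
    # Validity: every decision is the input of some process.
    validity = all(d in inputs.values()
                   for d in decisions.values() if d is not None)
    return {"agreement": agreement,
            "validity": validity,
            "termination": termination}
```

For example, with inputs `{1: 0, 2: 1, 3: 1}`, correct processes `{1, 2}` and decisions `{1: 1, 2: 1, 3: None}`, all three properties hold, because process 3 is faulty and need not decide.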
[Figure: the initial states X0 (all possible vectors over the input values V), the states L(X0) after one round, and the states L2(X0) after 2 rounds; labels: "Connectivity preserved", "Connectivity destroyed".]
t < n potential failures out of n >1 processes
Depends on the timing model:
Part I: Realistic Timing Model
First two simple models
Consensus impossible even for t=1 [FLP85]
Round
[Lamport, Fischer 82; Dolev, Reischuk, Strong 90] (early-deciding)
Many real networks are neither synchronous nor asynchronous
Timed Asynchronous Model [Cristian, Fetzer 98]
Unbounded running time
by [FLP85], because model can be asynchronous for unbounded time
Can we say more than:
consensus will be solved eventually ?
Number of rounds in well-behaved runs
Part II: Algorithms, and the Failure Detector Abstraction
II.a Failure Detectors and Partial Synchrony
II.b Algorithms
In well-behaved runs: process 1 always trusted
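One way to picture a leader oracle with this behaviour is a timeout-based elector that trusts the lowest-numbered process it has heard from recently. The sketch below is an illustration under assumptions, not the construction used in the lecture: the class `LeaderElector`, its methods, and the explicit `now` timestamps are all hypothetical. When process 1 is alive and its messages are timely, the elector always returns 1.

```python
from typing import Dict


class LeaderElector:
    """Hypothetical timeout-based leader oracle run by process `me`."""

    def __init__(self, n: int, me: int, timeout: float) -> None:
        self.n = n
        self.me = me
        self.timeout = timeout
        # Time at which a heartbeat from each process was last received.
        self.last_heard: Dict[int, float] = {p: 0.0 for p in range(1, n + 1)}

    def heartbeat(self, p: int, now: float) -> None:
        """Record a heartbeat from process p received at time `now`."""
        self.last_heard[p] = now

    def leader(self, now: float) -> int:
        """Trust the lowest-numbered process that is not timed out.

        A process never suspects itself, so the result is always defined.
        """
        trusted = [p for p in range(1, self.n + 1)
                   if p == self.me or now - self.last_heard[p] <= self.timeout]
        return min(trusted)
```

If heartbeats from process 1 stop arriving, the elector falls back to the next process, and it returns to process 1 once its heartbeats resume. During an unstable period different processes may therefore trust different leaders.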
Food for thought:
What is the weakest timing model where ◇S and/or Ω are implementable but ◇P is not?
Partial answer: at PODC’03, Aguilera et al. present a system S with synchronous processes:
◇P is not implementable in S, but Ω is
New proof that ◇S is strictly weaker than ◇P
does not cover sensitivity to failures, timing, etc.
Food for thought:
When is building ◇P more costly than ◇S or Ω?
Partial answer: Aguilera et al. (PODC’03) observe
II.b Algorithms
for r = 1, 2, … do
    coord ← (r mod n) + 1
    if I am coord, then send (r, val) to all
    wait for ((r, val) from coord OR suspect coord (by ◇S))
    if receive val from coord then est ← val else est ← null
    send (r, est) to all
    wait for (r, est) from n-t processes
    if any non-null est received then val ← est
    if all ests have same v then send (“decide”, v) to all; return(v)
od
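The round structure above can be exercised with a small lock-step simulation. This is a sketch under strong assumptions, not the lecture's code: the function name `run_rotating_coordinator` is hypothetical, processes in `crashed` are down from the start, every live process suspects a crashed coordinator immediately (an idealized failure detector), and all messages between live processes are delivered within the round.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple


def run_rotating_coordinator(inputs: List[int], t: int,
                             crashed: FrozenSet[int] = frozenset(),
                             max_rounds: int = 20) -> Tuple[Dict[int, int], int]:
    """Simulate the rotating-coordinator rounds for processes 1..n.

    Returns (decisions of the live processes, deciding round number).
    """
    n = len(inputs)
    assert len(crashed) <= t < n
    # val[p] is the current value of each live process p.
    val = {p: inputs[p - 1] for p in range(1, n + 1) if p not in crashed}
    for r in range(1, max_rounds + 1):
        coord = (r % n) + 1
        # Step 1: a live coordinator's value reaches everyone; a crashed
        # coordinator is suspected, so est becomes null (None here).
        est: Dict[int, Optional[int]] = {
            p: (val[coord] if coord in val else None) for p in val}
        # Step 2: every live process receives the ests of all live processes.
        received = list(est.values())
        assert len(received) >= n - t  # the wait for n-t messages completes
        non_null = [e for e in received if e is not None]
        if non_null:
            for p in val:
                val[p] = non_null[0]
        # Decide when all received ests carry the same non-null value.
        if len(non_null) == len(received) and len(set(non_null)) == 1:
            return dict(val), r
    raise RuntimeError("no decision within max_rounds")
```

With inputs `[5, 7, 9]` and no failures, the round-1 coordinator is process `(1 mod 3) + 1 = 2`, so every process decides 7 in round 1. If process 2 has crashed, round 1 yields only null estimates and the decision moves to round 2 with coordinator 3.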
[Figure: message diagram of a well-behaved run among processes 1, 2, …, n, with labels: (1, v1), est = v1, (1, v1), decide v1.]
The algorithm can block in case of transient message omissions, waiting for a specific round message that will not arrive
Rank BallotNum, initially r0
Rank AcceptNum, initially r0
Value ∪ {⊥} AcceptVal, initially ⊥
if leader (by Ω) then
    BallotNum ← (unique rank > BallotNum)
    send (“prepare”, BallotNum) to all
Upon receive (“prepare”, rank) from i
    if rank > BallotNum then
        BallotNum ← rank
        send (“ack”, rank, AcceptNum, AcceptVal) to i
Upon receive (“ack”, BallotNum, b, val) from n-t
    if all vals = ⊥ then myVal = initial value
    else myVal = received val with highest b
    send (“accept”, BallotNum, myVal) to all /* proposal */
Upon receive (“accept”, b, v) with b ≥ BallotNum
    AcceptNum ← b; AcceptVal ← v /* accept proposal */
    send (“accept”, b, v) to all (first time only)
Upon receive (“accept”, b, v) from n-t
    decide v
    periodically send (“decide”, v) to all
Upon receive (“decide”, v)
    decide v
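The handlers above can be tried out in a single-threaded sketch in which messages are delivered instantly and each call to `run_ballot` plays one leader's ballot. The names `Process` and `run_ballot`, the encoding of r0 as 0 and of the empty value as `None`, and the caller-supplied `rank` are assumptions for illustration. The sketch omits the "decide" broadcast and any concurrency.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Process:
    ballot_num: int = 0                # BallotNum, r0 encoded as 0 (assumption)
    accept_num: int = 0                # AcceptNum
    accept_val: Optional[str] = None   # AcceptVal, None stands for the empty value

    def on_prepare(self, rank: int) -> Optional[Tuple[int, int, Optional[str]]]:
        """Handle ("prepare", rank): ack only ranks above BallotNum."""
        if rank > self.ballot_num:
            self.ballot_num = rank
            return (rank, self.accept_num, self.accept_val)
        return None

    def on_accept(self, b: int, v: str) -> bool:
        """Handle ("accept", b, v): accept if b >= BallotNum."""
        if b >= self.ballot_num:
            self.accept_num, self.accept_val = b, v
            return True
        return False


def run_ballot(procs: List[Process], rank: int,
               initial_value: str, t: int) -> Optional[str]:
    """Run one ballot led with `rank`; return the decided value or None."""
    n = len(procs)
    acks = [a for a in (p.on_prepare(rank) for p in procs) if a is not None]
    if len(acks) < n - t:
        return None  # not enough acks: this ballot makes no progress
    accepted = [(b, v) for (_, b, v) in acks if v is not None]
    # Propose the value accepted with the highest b, else our own input.
    my_val = max(accepted, key=lambda bv: bv[0])[1] if accepted else initial_value
    echoes = sum(p.on_accept(rank, my_val) for p in procs)
    return my_val if echoes >= n - t else None
```

Running a second ballot with a higher rank and a different initial value returns the value decided by the first ballot, which illustrates why the leader adopts the received value with the highest b. A later ballot with a stale rank gathers no acks and returns `None`.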
[Figure: message diagram of a well-behaved run among processes 1, 2, …, n, with labels: (“prepare”, 1), (“ack”, 1, r0, ⊥), (“accept”, 1, v1), (“accept”, 1, v1), decide v1. Our Ω implementation always trusts process 1.]
If n-t correct processes including the leader can communicate with each other
then they eventually decide
[Lamport, Fischer 82; Dolev, Reischuk, Strong 90] (early-deciding)
Recall: with consensus, only correct processes have to agree (disagreement with the dead is OK)
This version of consensus will be useful to extend the lower bound argument to asynchronous models
Every algorithm has a run with f failures (f < t-1) that takes at least f+2 rounds to decide
A Simple Proof of the Uniform Consensus Synchronous Lower Bound [Keidar, Rajsbaum IPL 02]
We saw that there are algorithms that take 2 rounds to decide in well-behaved runs
There is a lower bound of 2 rounds in well-behaved executions
Recall: with consensus, only correct processes have to agree
In the partial synchrony model, any algorithm A for consensus solves uniform consensus [Guerraoui 95]
Proof: Assume by contradiction that A does not solve uniform consensus
Every algorithm has a well-behaved run that takes 2 rounds to decide
End of Lecture 4