Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey
(Paper by X. Défago, A. Schiper, and P. Urbán)
ACM Computing Surveys, Vol. 36, No. 4, Dec. 2004, pp. 372-421
Aida Omerovic
4 March 2008
Seminar on Dependable and Adaptive Distributed Systems
Outline
Background
Total order broadcast and multicast algorithms
Lack of a roadmap for use of the algorithms.
Lack of generality of existing comparisons.
Broadcast (messages are sent to all processes) vs.
Multicast (messages are sent to a subset of processes)
Closed vs. open groups (whether the sender must belong to the group)
Single vs. multiple groups (disjoint/overlapping)
A correct process never expresses any of the faulty behaviors:
The total order broadcast problem specification
Properties of total order broadcast:
Properties 1, 2 and 3 satisfied -> ”reliable broadcast”.
Properties 1 and 2: ”liveness properties”. (The property can still eventually hold, regardless of what has happened so far.)
Properties 3 and 4: ”safety properties”. (Once the property does not hold, it never will.)
Properties 2 and 4: uniform. (Apply to both correct and faulty processes.) Costly. Algorithms tolerant to Byzantine failures cannot guarantee any of the uniform properties above.
Nonuniform: neither 2 nor 4 holds in its uniform form. The properties apply only to correct processes, with no restriction on the faulty ones. Voting can be a measure.
Alternative: uniform properties are those enforced by honest processes, correct or not. (Honest process: behaves according to its specification.)
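The difference between the uniform and nonuniform readings can be illustrated with a small checker. This is only a sketch, assuming that property 4 is the total order property (if two processes both deliver m and m', they deliver them in the same order). The function names, the `logs` dictionary and the `correct` set are hypothetical and chosen for illustration.

```python
from itertools import combinations


def violates_total_order(log_p, log_q):
    """Return True if the two delivery logs order some common pair of
    messages differently."""
    pos_q = {m: i for i, m in enumerate(log_q)}
    # Messages delivered by both processes, in p's delivery order.
    common = [m for m in log_p if m in pos_q]
    # p delivers a before b for every pair below, so q must do the same.
    return any(pos_q[a] > pos_q[b] for a, b in combinations(common, 2))


def total_order_holds(logs, correct, uniform=True):
    """logs: dict mapping process name -> list of delivered messages.
    correct: set of processes that never fail (assumed known here).
    uniform=True checks every process, uniform=False only correct ones."""
    checked = [p for p in logs if uniform or p in correct]
    return not any(
        violates_total_order(logs[p], logs[q])
        for p, q in combinations(checked, 2)
    )
```

With a faulty process that delivered two messages in the opposite order before crashing, the nonuniform check passes while the uniform check fails.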
An issue: contamination. (A faulty process in an inconsistent state ”legally” TO-broadcasts a message prior to crashing, thus contaminating the correct processes.)
Note: satisfies even the strongest specification so far.
This is disallowed by
However, contamination cannot be avoided in case of arbitrary failures (e.g. correct delivery by a faulty process).
Other ordering properties include:
Generally: broadcast of m before m’ implies delivery of m before m’ by correct processes.
Note: these two properties further restrict total order property definition by properties related to SENDERS.
Causal order <-> FIFO order + Local order
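As an illustration of the sender-related ordering, FIFO order can be checked on a single delivery log. This is a minimal sketch, assuming each message is tagged with its sender and a per-sender sequence number starting at 0. That representation and the function name are assumptions for illustration, not something the slides prescribe.

```python
def fifo_order_holds(delivery_log):
    """delivery_log: list of (sender, seq) pairs in delivery order.

    Returns True if, for every sender, messages are delivered in the
    order they were broadcast (seq 0, 1, 2, ...) with no gaps.
    A duplicate delivery also makes the check fail.
    """
    next_expected = {}
    for sender, seq in delivery_log:
        if seq != next_expected.get(sender, 0):
            return False  # an earlier message from this sender was skipped
        next_expected[sender] = seq + 1
    return True
```

Messages from different senders may interleave freely under this check. Relating them to each other is what the causal order adds.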
… according to how the ordering (e.g. timestamp, sequence number) is performed and by whom (type of role).
Process roles: sender, destination, sequencer.
Five classes of total order broadcast algorithms:
Another distinction is between time-free and time-based (physical time) ordering.
None of the five is failure tolerant!
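To make the idea of ordering by a dedicated role concrete, here is a failure-free sketch of one class from the surveyed taxonomy, the fixed sequencer: a single process assigns consecutive sequence numbers and destinations deliver in sequence-number order. The class and method names are hypothetical, and the network is replaced by direct method calls. Like the basic classes, it tolerates no failures.

```python
class Sequencer:
    """Single process that assigns consecutive sequence numbers."""

    def __init__(self):
        self.next_seq = 0

    def order(self, msg):
        seq = self.next_seq
        self.next_seq += 1
        return seq, msg


class Destination:
    """Delivers messages strictly in sequence-number order."""

    def __init__(self):
        self.pending = {}          # seq -> message received but not yet delivered
        self.next_to_deliver = 0
        self.delivered = []

    def receive(self, seq, msg):
        self.pending[seq] = msg
        # Deliver every message whose predecessors have all been delivered.
        while self.next_to_deliver in self.pending:
            self.delivered.append(self.pending.pop(self.next_to_deliver))
            self.next_to_deliver += 1
```

Even if the network reorders the sequenced messages, every destination ends up with the same delivery order, because a message is held back until all lower sequence numbers have arrived.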
Synchronous system: a system in which upper bounds on process speed interval and communication delay are set.
Asynchronous system: the two parameters are unbounded.
Timed asynchronous model: asynchronous model with notion of physical time and assumption that ”most of the messages are likely to reach their destination within a delay δ”.
Consensus in asynchronous systems has no deterministic solution if even a single process can crash.
Total order broadcast can be transformed into consensus -> the impossibility also holds here!
Solution: extend the asynchronous system with oracles.
An oracle provides information that processes can use to guide their choices.
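One commonly used kind of oracle is a failure detector, which tells a process which other processes it currently suspects of having crashed. The sketch below is a heartbeat-based version under stated assumptions: the class name, the `timeout` parameter and the explicit `now` argument (used instead of a real clock so the example stays deterministic) are all hypothetical. Suspicions may be wrong, since a slow process looks the same as a crashed one.

```python
class HeartbeatFailureDetector:
    """Suspects any process not heard from for more than `timeout` time units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}  # process -> time of its last heartbeat

    def heartbeat(self, process, now):
        """Record that a heartbeat from `process` arrived at time `now`."""
        self.last_heard[process] = now

    def suspects(self, now):
        """Return the set of processes currently suspected of having crashed."""
        return {
            p for p, t in self.last_heard.items()
            if now - t > self.timeout
        }
```

A suspected process is removed from the suspect set as soon as a later heartbeat from it is recorded, so the information is a hint that can change over time rather than a verdict.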
Process controlled crash: the ability to artificially force the crash of a process.
Useful in crashing incorrect or suspect processes.
However, a fault-tolerant algorithm can only tolerate the crash of a bounded number of processes.
Failures: provoked + genuine => provoking failures degrades the actual fault tolerance of the system.
The main fault-tolerance mechanisms algorithms rely on: