# Coverage

##### Presentation Transcript

1. Coverage • Nature of Causality • Causality: Why is it Important/Useful? • Causality in Life vs. Causality in Distributed Systems • Modeling Distributed Events - Defining Causality • Logical Clocks • General Implementation of Logical Clocks • Scalar Logical Time • Demo: Scalar Logical Time with Asynchronous Unicast Communication between Multiple Processes • Conclusions • Questions / Reference

2. Nature of Causality • Consider a distributed computation which is performed by a set of processes: • The objective is to have the processes work towards and achieve a common goal • Processes do not share global memory • Communication occurs through message passing only

3. Process Actions • Actions are modeled as three types of events: • Internal Event: affects only the process which is executing the event, • Send Event: a process passes messages to other processes, • Receive Event: a process receives messages from other processes. [Diagram: events a–r spread across the timelines of P1, P2, and P3]

4. Causal Precedence • Ordering of events for a single process is simple: they are ordered by their occurrence. [Diagram: events a → b → c → d on a single process P] • Send and Receive events signify the flow of information between processes, and establish causal precedence between events at the sender and receiver. [Diagram: a send event on P1 and the matching receive event on P2]

5. Distributed Events • The execution of this distributed computation results in the generation of a set of distributed events. • The causal precedence induced by the send and receive events establishes a partial order of these distributed events. • The precedence relation in our case is “Happened Before”: for two events a and b, a → b means “a happened before b” (event a precedes event b). [Diagram: event a on P1, event b on P2, with a → b]
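The happened-before relation can be made concrete as graph reachability: program order within each process plus send/receive edges are the direct dependencies, and a → b holds when b is reachable from a. A minimal Python sketch (the event names and edges here are illustrative, not taken from the slides):

```python
# Direct dependencies: program order within each process, plus one
# send/receive edge between processes (illustrative events).
events = {"P1": ["a", "b"], "P2": ["c", "d"]}
direct = set()
for proc, evs in events.items():
    for x, y in zip(evs, evs[1:]):
        direct.add((x, y))              # program order: a -> b, c -> d
direct.add(("a", "c"))                  # message: sent at a, received at c

def happened_before(x, y, edges):
    """True iff x -> y under the transitive closure of the direct edges."""
    frontier, seen = {x}, set()
    while frontier:
        cur = frontier.pop()
        for (u, v) in edges:
            if u == cur and v not in seen:
                seen.add(v)
                frontier.add(v)
    return y in seen

print(happened_before("a", "d", direct))  # True: a -> c -> d
print(happened_before("b", "c", direct))  # False: no path either way
```

Two events with no path in either direction, like b and c here, are exactly the concurrent events that the partial order leaves unordered.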

6. Causality: Why is it Important/Useful? • This causality among events (induced by the “happened before” precedence relation) is important for several reasons: • Helps us solve problems in distributed computing: • Can ensure liveness and fairness in mutual exclusion algorithms, • Helps maintain consistency in replicated databases, • Facilitates the design of deadlock detection algorithms in distributed systems.

7. Importance of Causality (Continued) • Debugging of distributed systems: allows execution to be paused and resumed for replay. • System failure recovery: allows checkpoints to be built so a system can be restarted from a point other than the beginning. • Helps a process measure the progress of other processes in the system: • Allows processes to discard obsolete information, • Detect the termination of other processes.

8. Importance of Causality • Allows distributed systems to optimize the concurrency of all processes involved: • Knowing the number of causally dependent events in a distributed system allows one to measure the concurrency of a system: All events that are not causally related can be executed concurrently.

9. Causality: Life vs. Distributed Systems • We use causality in our lives to determine the feasibility of daily, weekly, and yearly plans, • We use global time and (loosely) synchronized clocks (wristwatches, wall clocks, PC clocks, etc.).

10. Causality (Continued) • However, (usually) events in real life do not occur at the same rate as those in a distributed system: • Distributed systems’ event occurrence rates are obviously much higher, • Event execution times are obviously much smaller. • Also, distributed systems do not have a “global” clock that they can refer to, • There is hope though! We can use “Logical Clocks” to establish order.

11. Modeling Distributed Events: Defining Causality and Order • A distributed program is a set of asynchronous processes p1, p2, …, pn, which communicate through a network using message passing only. Process execution and message transfer are asynchronous. [Diagram: processes P1–P4 connected by a network]

12. Modeling Distributed Events • Notation: given two events e1 and e2, • e1 → e2 : e2 is dependent on e1, • if e1 ↛ e2 and e2 ↛ e1, then e1 and e2 are concurrent: e1 || e2. [Diagrams: e1 → e2 across P1 and P2; and e1, e2 with no path between them, so e1 || e2]

13. Logical Clocks • In a system of logical clocks, every process has a logical clock that is advanced using a set of rules. [Diagram: P1, P2, and P3, each with its own logical clock]

14. Logical Clocks - Timestamps • Every event is assigned a timestamp (which the processes use to infer causality between events). [Diagram: P1 sends data to P2; the message carries a timestamp]

15. Logical Clocks - Timestamps • The timestamps obey the monotonicity property, i.e. if an event a causally affects an event b, then the timestamp of a is smaller than the timestamp of b. [Diagram: event a on P1 causally precedes event b on P2; a’s timestamp is smaller than b’s]

16. Formal Definition of Logical Clocks • The definition of a system of logical clocks: • We have a logical clock C, which is a function that maps events to timestamps, i.e. for an event e, C(e) is its timestamp. [Diagram: event e on P1; the data message to P2 carries C(e)]

17. Formal Definition of Logical Clocks • For all events e in a distributed system, call them the set H, applying the function C to all events in H generates a set T: ∀ e ∈ H, C(e) ∈ T. [Diagram: events a, b on P1 and c, d on P2; H = { a, b, c, d }, T = { C(a), C(b), C(c), C(d) }]

18. Formal Definition of Logical Clocks • We define the relation for timestamps, “<”, to be our precedence relation: “happened before”. • Elements in the set T are partially ordered by this precedence relation, i.e. the timestamps for each event in the distributed system are partially ordered by their time of occurrence. More formally: e1 → e2 ⇒ C(e1) < C(e2).

19. Formal Definition of Logical Clocks • What we’ve said so far is, “If e2 depends on e1, then e1’s timestamp is smaller than e2’s.” • This enforces monotonicity for timestamps of events in the distributed system, and is sometimes called the clock consistency condition.

20. General Implementation of Logical Clocks • We need to address two issues: • The data structure to use for representing the logical clock, and • The design of a protocol which dictates how the logical clock data structure is updated.

21. Logical Clock Implementation: Clock Structure • The structure of a logical clock should allow a process to keep track of its own progress and the value of the logical clock. There are three well-known structures: • Scalar: a single integer, • Vector: an n-element vector (n is the number of processes in the distributed system), • Matrix: an n×n matrix.

22. Logical Clock Implementation: Clock Structure • Vector: each process keeps an n-element vector [C1, C2, C3]; for process 1, C1 is its own logical time, while C2 and C3 are its views of process 2’s and process 3’s logical time. • Matrix: each process keeps an n×n matrix; for process 1, the first row [C1, C2, C3] holds its own logical time and its views of processes 2 and 3, and each remaining row holds its view of another process’s view of everyone’s logical time.
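The vector structure can be sketched as a small class. The element-wise-max merge on receive is the standard vector-clock update rule and is an assumption here, since the slides only show the data layout:

```python
class VectorClock:
    """n-element vector: entry pid is this process's own counter,
    the rest are its latest known view of each peer's counter."""
    def __init__(self, n, pid):
        self.v = [0] * n
        self.pid = pid

    def tick(self):
        self.v[self.pid] += 1           # local or send event

    def merge(self, other):
        # receive event: element-wise max with the message's vector, then tick
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        self.tick()

c1, c2 = VectorClock(3, 0), VectorClock(3, 1)
c1.tick()                               # P1 sends; its vector is [1, 0, 0]
c2.merge(c1.v)                          # P2 receives P1's vector
print(c2.v)                             # [1, 1, 0]
```

Unlike the scalar clock that follows, the vector tells P2 exactly how far its view of P1 has advanced.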

23. Logical Clock Implementation: Clock Update Protocol • The goal of the update protocol is to ensure that the logical clock is managed consistently; consequently, we’ll use two general rules: • R1: Governs the update of the local clock when an event occurs (internal, send, receive), • R2: Governs the update of the global logical clock (determines how to handle timestamps of messages received). • All logical clock systems use some form of these two rules, even if their implementations differ; clock monotonicity (consistency) is preserved due to these rules.

24. Scalar Logical Time • Scalar implementation – Lamport, 1978. • Again, the goal is to have some mechanism that enforces causality between some events, inducing a partial order of the events in a distributed system, • Scalar Logical Time is a way to totally order all the events in a distributed system, • As with all logical time methods, we need to define both a structure and update methods.

25. Scalar Logical Time: Structure • Local time and logical time are represented by a single integer, i.e.: • Each process pi uses an integer Ci to keep track of logical time. [Diagram: P1 with C1, P2 with C2, P3 with C3]

26. Scalar Logical Time: Logical Clock Update Protocol • Next, we need to define the clock update rules. For each process pi: • R1: Before executing an event, pi executes the following: Ci = Ci + d (d > 0); d is a constant, typically the value 1. • R2: Each message carries the timestamp of the sending process. When pi receives a message with a timestamp Cmsg, it executes the following: Ci = max(Ci, Cmsg), then executes R1.
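R1 and R2 translate directly into code. A minimal Python sketch (the class and method names are mine, not from the deck):

```python
class LamportClock:
    """Scalar logical clock driven by rules R1 and R2."""
    def __init__(self, d=1):
        self.c = 0
        self.d = d                      # increment, typically 1

    def event(self):
        self.c += self.d                # R1: advance before any event
        return self.c

    def receive(self, c_msg):
        self.c = max(self.c, c_msg)     # R2: absorb the message timestamp...
        return self.event()             # ...then execute R1

p1, p2 = LamportClock(), LamportClock()
t_send = p1.event()                     # P1 sends with timestamp 1
p2.event()                              # P2's clock runs ahead...
p2.event()                              # ...to 2
t_recv = p2.receive(t_send)             # max(2, 1) + 1 = 3
print(t_send, t_recv)                   # 1 3
```

The clock consistency condition holds: the send causally precedes the receive, and 1 < 3.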

27. Scalar Logical Time: Update Protocol Example • Initially C1 = 0 and C2 = 0, with d = 1. [Diagram: timelines for P1 and P2] • P1: C1 = 1 (R1, message sent to P2); P2: C2 = 1, C2 = 2 (R1) • P2 receives the message with timestamp 1: C2 = max(2, 1) (R2), then C2 = 3 (R1) • P1: C1 = 2 (R1); P2: C2 = 4, C2 = 5, C2 = 6 (R1, message sent to P1) • P1: C1 = 3 (R1) • P1 receives the message with timestamp 6: C1 = max(3, 6) (R2), then C1 = 7 (R1) • P2: C2 = 7 (R1)

28. Scalar Logical Time: Properties • Properties of this implementation: • Maintains monotonicity and consistency properties, • Provides a total ordering of events in a distributed system.
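The total ordering mentioned above needs one ingredient the slides leave implicit: ties between equal scalar timestamps are broken by process identifier, which was Lamport's original device. A sketch of that tie-break:

```python
# Events tagged (timestamp, pid): sorting lexicographically yields a total
# order, with equal timestamps broken by process ID (implicit in the slides).
events = [(2, "P2"), (1, "P1"), (2, "P1"), (1, "P2")]
total_order = sorted(events)
print(total_order)      # [(1, 'P1'), (1, 'P2'), (2, 'P1'), (2, 'P2')]
```

Any fixed, agreed-upon ordering of process IDs works; the choice only has to be consistent across all processes.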

29. Scalar Logical Time: Pros and Cons • Advantages • We get a total ordering of events in the system. All the benefits gained from knowing the causality of events in the system apply, • Small overhead: one integer per process. • Disadvantage • Clocks are not strongly consistent: a clock loses track of the timestamps of the events on which it depends, because a single integer stores both the local and the logical time.
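The weak-consistency caveat is easy to demonstrate: scalar timestamps can compare even between events with no causal path. A tiny sketch with two processes that never exchange messages:

```python
# Two isolated processes: every pair of events across them is concurrent.
c1, c2 = 0, 0
c1 += 1                 # event e1 on P1: timestamp 1
c2 += 1                 # first event on P2
c2 += 1                 # event e2 on P2: timestamp 2
print(c1 < c2)          # True, yet e1 and e2 are concurrent, so
                        # C(e1) < C(e2) does NOT imply e1 -> e2
```

This is exactly why vector clocks exist: their per-process entries preserve the information a single integer throws away.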

30. Demo - Simple Scalar Logical Time Application • Consists of several processes, communicating asynchronously via Unicast, • Only Send and Receive events are used; internal events can be disregarded since they only complicate the demo (imagine processes which perform no internal calculations), • Scalar logical time is used, • Written in Java.

31. Demo: Event Sequence • Start one process (P1) • P1 uses a receive thread to process incoming messages asynchronously. • P1 will sleep for a random number of seconds • Upon waking, P1 will attempt to send a message to a random process, emulating asynchronous and random sending. P1 repeats this process. • Start process 2 (P2). The design of the application allows processes to know who is in the system at all times. • P2 performs the same steps as P1…
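The demo itself is written in Java; the following Python sketch is only an assumption about its structure (an inbox queue plus a receive thread per process), with the random sleeps dropped so the result is deterministic:

```python
import queue
import threading

class Process:
    """One demo process: a scalar clock, an inbox, and a receiver loop."""
    def __init__(self, name):
        self.name, self.clock = name, 0
        self.inbox = queue.Queue()
        self.lock = threading.Lock()

    def send(self, peer):
        with self.lock:
            self.clock += 1                           # R1 before the send
            peer.inbox.put(self.clock)                # message carries the timestamp

    def run_receiver(self, n_msgs):
        for _ in range(n_msgs):
            ts = self.inbox.get()                     # blocks until a message arrives
            with self.lock:
                self.clock = max(self.clock, ts) + 1  # R2, then R1

p1, p2 = Process("P1"), Process("P2")
receiver = threading.Thread(target=p2.run_receiver, args=(3,))
receiver.start()
for _ in range(3):
    p1.send(p2)
receiver.join()
print(p1.clock, p2.clock)               # 3 4
```

Because the receiver drains a FIFO queue, P2 absorbs timestamps 1, 2, 3 in order, ending one tick ahead of the sender regardless of thread interleaving.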

32. Conclusions • Logical time is used for establishing an ordering of events in a distributed system, • Logical time is useful for several important areas and problems in Computer Science, • Implementations of logical time in a distributed system range from simple (scalar-based) to complex (matrix-based), and cover a wide range of applications, • Efficient implementations exist for vector- and matrix-based clocks, and must be considered for any large-scale distributed system.