
Logical Clocks




  1. Logical Clocks

  2. Topics • Logical clocks • Totally-Ordered Multicasting • Vector timestamps

  3. Readings • Van Steen and Tanenbaum: 5.2 • Coulouris: 10.4 • L. Lamport, “Time, Clocks and the Ordering of Events in Distributed Systems,” Communications of the ACM, Vol. 21, No. 7, July 1978, pp. 558-565. • C.J. Fidge, “Timestamps in Message-Passing Systems that Preserve the Partial Ordering”, Proceedings of the 11th Australian Computer Science Conference, Brisbane, pp. 56-66, February 1988.

  4. Ordering of Events • For many applications, it is sufficient to be able to agree on the order that events occur and not the actual time of occurrence. • It is possible to use a logical clock to unambiguously order events • May be totally unrelated to real time. • Lamport showed this is possible (1978).

  5. The Happened-Before Relation • Lamport’s algorithm synchronizes logical clocks and is based on the happened-before relation: • a → b is read as “a happened before b” • The definition of the happened-before relation: • If a and b are events in the same process and a occurs before b, then a → b • For any message m, send(m) → rcv(m), where send(m) is the event of sending the message and rcv(m) is the event of receiving it. • If a, b and c are events such that a → b and b → c, then a → c

  6. The Happened-Before Relation • If two events, x and y, happen in different processes that do not exchange messages, then x → y is not true, but neither is y → x. Such events are said to be concurrent. • The happened-before relation is sometimes referred to as causality.

  7. Example • Say in process P1 you have a code segment as follows: 1.1 x = 5; 1.2 y = 10*x; 1.3 send(y,P2); • Say in process P2 you have a code segment as follows: 2.1 a = 8; 2.2 b = 20*a; 2.3 rcv(y,P1); 2.4 b = b+y; • Let’s say that you start P1 and P2 at the same time. • You know that 1.1 occurs before 1.2, which occurs before 1.3. • You know that 2.1 occurs before 2.2, which occurs before 2.3, which occurs before 2.4. • You do not know whether 1.1 occurs before 2.1 or 2.1 occurs before 1.1. • You do know that 1.3 occurs before 2.3 and 2.4.

  8. Example • Continuing from the example on the previous page: the actual order in which operations occur is often not the same from execution to execution. For example: • Execution 1 (order of occurrence): 1.1, 1.2, 1.3, 2.1, 2.2, 2.3, 2.4 • Execution 2 (order of occurrence): 2.1, 2.2, 1.1, 1.2, 1.3, 2.3, 2.4 • Execution 3 (order of occurrence): 1.1, 2.1, 2.2, 1.2, 1.3, 2.3, 2.4 • We can say that 1.1 “happens before” 2.3, but not that 1.1 “happens before” 2.2 or that 2.2 “happens before” 1.1. • Note that the above executions provide the same result.
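The claims on the last two slides can be checked mechanically. The sketch below is only an illustration: it encodes the program order of P1 and P2 and the single message from statement 1.3 to statement 2.3 as edges, then takes the transitive closure. The names `edges`, `transitive_closure`, `happened_before` and `concurrent` are made up for this example.

```python
from itertools import product

# Program order inside each process, plus the one message in the example.
edges = {
    ("1.1", "1.2"), ("1.2", "1.3"),                  # P1 program order
    ("2.1", "2.2"), ("2.2", "2.3"), ("2.3", "2.4"),  # P2 program order
    ("1.3", "2.3"),                                  # send(y) -> rcv(y)
}


def transitive_closure(pairs):
    """Return the smallest transitive relation that contains `pairs`."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for (a, b), (c, d) in product(closure, closure) if b == c}
        if extra <= closure:
            return closure
        closure |= extra


hb = transitive_closure(edges)


def happened_before(a, b):
    return (a, b) in hb


def concurrent(a, b):
    """Neither event happened before the other."""
    return not happened_before(a, b) and not happened_before(b, a)


print(happened_before("1.1", "2.3"))  # True: 1.1 -> 1.2 -> 1.3 -> 2.3
print(concurrent("1.1", "2.2"))       # True: no chain links them
```

The closure loop is quadratic per pass, which is fine for seven events but not meant for large traces.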

  9. Lamport’s Algorithm • We need a way of measuring time such that every event a is assigned a time value C(a) on which all processes agree, with the following properties: • The clock time C must increase monotonically, i.e., always go forward. • If a → b then C(a) < C(b) • Each process, p, maintains a local counter Cp • The counter is adjusted based on the rules presented on the next page.

  10. Lamport’s Algorithm • Cp is incremented before each event is issued at process p: Cp = Cp + 1 • When p sends a message m, it piggybacks on m the value t = Cp • On receiving (m,t), process q computes Cq = max(Cq,t) and then applies the first rule before timestamping the event rcv(m).
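The three rules above fit in a few lines of code. This is a minimal sketch, not a reference implementation; the class name `LamportClock` and its method names are chosen for this example only.

```python
class LamportClock:
    """One process's logical clock, following the three rules above."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Rule 1: increment the counter before timestamping any event."""
        self.time += 1
        return self.time

    def send(self):
        """Rule 2: sending is an event; its timestamp t travels with the message."""
        return self.tick()

    def receive(self, t):
        """Rule 3: take the max with the piggybacked value, then apply rule 1."""
        self.time = max(self.time, t)
        return self.tick()


p, q = LamportClock(), LamportClock()
p.tick()               # local event at p, Cp = 1
t = p.send()           # send event at p, Cp = 2, message carries t = 2
q_recv = q.receive(t)  # at q: max(0, 2) + 1 = 3
print(t, q_recv)       # 2 3
```

Because the receive rule takes the maximum before incrementing, the receive event always gets a larger timestamp than the matching send, which is exactly the requirement C(send(m)) < C(rcv(m)).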

  11. Example • [Timing diagram: events a, b, c, d at P1; e, f, g, h, i at P2; j, k, l at P3] • Assume that each process’s logical clock is set to 0

  12. Example • [Timing diagram with Lamport timestamps. P1: a = 1, b = 2, c = 3, d = 4. P2: e = 1, f = 3, g = 4, h = 5, i = 6. P3: j = 1, k = 2, l = 3] • Assume that each process’s logical clock is set to 0

  13. Example • From the timing diagram on the previous slide, what can you say about the following events? • Between a and b: a → b • Between b and f: b → f • Between e and k: concurrent • Between c and h: concurrent • Between k and h: k → h

  14. Total Order • A timestamp of 1 is associated with events a, e, j in processes P1, P2, P3 respectively. • A timestamp of 2 is associated with events b, k in processes P1, P3 respectively. • The times may be the same but the events are distinct. • We would like to create a total order of all events, i.e., for any two distinct events a and b we would like to be able to say that either a comes before b or b comes before a.

  15. Total Order • Create a total order by attaching a process number to an event. • Pi timestamps event e with Ci(e).i • We then say that Ci(a).i happens before Cj(b).j iff: • Ci(a) < Cj(b); or • Ci(a) = Cj(b) and i < j
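The tie-breaking rule is a lexicographic comparison of (clock value, process number) pairs. As a small sketch, the dictionary below holds the Lamport timestamps of the twelve events in the timing-diagram example of this deck, paired with their process numbers; `stamps`, `precedes` and `total_order` are names invented for the illustration.

```python
# (Lamport time, process number) for each event of the timing-diagram example.
stamps = {
    "a": (1, 1), "b": (2, 1), "c": (3, 1), "d": (4, 1),
    "e": (1, 2), "f": (3, 2), "g": (4, 2), "h": (5, 2), "i": (6, 2),
    "j": (1, 3), "k": (2, 3), "l": (3, 3),
}


def precedes(x, y):
    """True iff event x comes before event y in the total order."""
    (cx, i), (cy, j) = stamps[x], stamps[y]
    return cx < cy or (cx == cy and i < j)


# Python compares tuples lexicographically, which is the same rule.
total_order = sorted(stamps, key=lambda ev: stamps[ev])
print(" ".join(total_order))  # a e j b k c f l d g h i
```

Note that the total order is only a consistent tie-break: a is placed before e, although the two events are concurrent.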

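  16. Example (total order) • [Timing diagram with totally ordered timestamps, written as time.process. P1: a = 1.1, b = 2.1, c = 3.1, d = 4.1. P2: e = 1.2, f = 3.2, g = 4.2, h = 5.2, i = 6.2. P3: j = 1.3, k = 2.3, l = 3.3] • Assume that each process’s logical clock is set to 0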

  17. Example: Totally-Ordered Multicast • Application of Lamport timestamps (with total order) • Scenario • Replicated accounts in New York (NY) and San Francisco (SF) • Two transactions occur at the same time and are multicast • Current balance: $1,000 • Add $100 at SF • Add interest of 1% at NY • If the updates are not applied in the same order at each site, then one site will record a total of $1,111 and the other $1,110.
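The two balances follow directly from the arithmetic. The snippet below is a sketch that works in whole dollars with integer arithmetic to avoid floating-point rounding; the function names are illustrative.

```python
def deposit(balance):
    """Add $100."""
    return balance + 100


def add_interest(balance):
    """Add 1% interest, in whole dollars."""
    return balance * 101 // 100


start = 1000
deposit_then_interest = add_interest(deposit(start))  # (1000 + 100) * 1.01
interest_then_deposit = deposit(add_interest(start))  # 1000 * 1.01 + 100
print(deposit_then_interest, interest_then_deposit)   # 1111 1110
```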

  18. Example: Totally-Ordered Multicasting • Updating a replicated database and leaving it in an inconsistent state.

  19. Example: Totally-Ordered Multicasting • We must ensure that the two update operations are performed in the same order at each copy. • Although it makes a difference to the final balance whether the deposit is processed before the interest update or the other way around, it does not matter which order is followed from the point of view of consistency, as long as both copies follow the same one. • We need totally-ordered multicast, that is, a multicast operation by which all messages are delivered in the same order to each receiver. • NOTE: Multicast refers to the sender sending a message to a collection of receivers.

  20. Example: Totally Ordered Multicast • Algorithm • The update message is timestamped with the sender’s logical time • The update message is multicast (including to the sender itself) • When a message is received: • It is put into a local queue • The queue is ordered according to timestamp • An acknowledgement is multicast

  21. Example: Totally Ordered Multicast • Message is delivered to applications only when • It is at head of queue • It has been acknowledged by all involved processes • Pi sends an acknowledgement to Pj if • Pi has not made an update request • Pi’s identifier is less than Pj’s identifier • Pi’s update has been processed • Lamport algorithm (extended for total order) ensures total ordering of events
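The following simulation sketches the queue-and-acknowledge idea for the two-site bank example. It is a simplification under stated assumptions: every replica acknowledges every update unconditionally (the common textbook variant, not the conditional acknowledgement rule listed on the slide), and each sender-to-receiver link is reliable and FIFO. The class `Replica`, the function `run` and the message dictionaries are invented for this illustration.

```python
import heapq
import random
from collections import deque


class Replica:
    def __init__(self, pid, n):
        self.pid, self.n, self.clock = pid, n, 0
        self.queue = []      # min-heap of ((time, pid), update), ordered by timestamp
        self.acks = {}       # update timestamp -> set of pids that acknowledged it
        self.delivered = []  # updates handed to the application, in delivery order

    def issue(self, update):
        """Timestamp a new update with the local logical time."""
        self.clock += 1
        return {"kind": "update", "stamp": (self.clock, self.pid),
                "update": update, "time": self.clock}

    def receive(self, msg):
        """Process one message; return an ack to multicast, or None."""
        self.clock = max(self.clock, msg["time"]) + 1
        reply = None
        if msg["kind"] == "update":
            heapq.heappush(self.queue, (msg["stamp"], msg["update"]))
            self.clock += 1  # sending the ack is an event too
            reply = {"kind": "ack", "stamp": msg["stamp"],
                     "sender": self.pid, "time": self.clock}
        else:
            self.acks.setdefault(msg["stamp"], set()).add(msg["sender"])
        # Deliver while the head of the queue has been acknowledged by everyone.
        while self.queue and len(self.acks.get(self.queue[0][0], ())) == self.n:
            _, update = heapq.heappop(self.queue)
            self.delivered.append(update)
        return reply


def run(seed, n=2):
    """Deliver pending messages in a random interleaving; return each replica's order."""
    rng = random.Random(seed)
    replicas = {pid: Replica(pid, n) for pid in range(1, n + 1)}
    channels = {(s, d): deque() for s in replicas for d in replicas}  # FIFO links

    def multicast(src, msg):
        for dst in replicas:  # includes the sender itself
            channels[(src, dst)].append(msg)

    multicast(1, replicas[1].issue("add $100"))         # m, issued at SF (P1)
    multicast(2, replicas[2].issue("add 1% interest"))  # n, issued at NY (P2)
    while any(channels.values()):
        src, dst = rng.choice([link for link, q in channels.items() if q])
        reply = replicas[dst].receive(channels[(src, dst)].popleft())
        if reply is not None:
            multicast(dst, reply)
    return [replicas[pid].delivered for pid in sorted(replicas)]


orders = [run(seed) for seed in range(20)]
print(orders[0])
```

Whatever interleaving the random scheduler picks, both replicas should apply the deposit (timestamp 1.1) before the interest update (timestamp 1.2). Dropping the FIFO assumption would break the argument, because an acknowledgement could then overtake an earlier update from the same sender.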

  22. Example: Totally Ordered Multicast • On the next slide m corresponds to “Add $100” and n corresponds to “Add interest of 1%”. • When sending an update message (e.g., m, n) the message will include the timestamp generated when the update was issued.

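  23. Example: Totally Ordered Multicast • [Timing diagram for San Francisco (P1) and New York (P2). Events shown at P1: Issue m, Send m, Recv n, Recv ack(m), Process m, Send ack(n), with timestamp labels 1.1, 2.1, 3.1, 5.1, 6.1. Events shown at P2: Issue n, Send n, Recv m, Send ack(m), Recv ack(n), with timestamp labels 1.2, 2.2, 3.2, 4.2, 5.2]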

  24. Example: Totally Ordered Multicast • When P1 issues the update message (m) the timestamp associated with it is 1.1 • When P2 issues the update message (n) the timestamp associated with it is 1.2 • At both P1’s queue and P2’s queue the update messages are ordered such that m is before n, since 1.1 precedes 1.2 in the total order.

  25. A Note on Ordering and Consistency • The previous examples assume that messages are received in the order they were sent and that message passing is reliable. • There are different definitions of consistency. • We will study these issues in more detail.

  26. Problems with Lamport Clocks • Lamport timestamps do not capture causality. • With Lamport’s clocks, one cannot directly compare the timestamps of two events to determine their precedence relationship. • If C(a) < C(b) is not true then a → b is also not true. • Knowing that C(a) < C(b) is true does not allow us to conclude that a → b is true. • Example: In the first timing diagram, C(e) = 1 and C(b) = 2; thus C(e) < C(b), but it is not the case that e → b

  27. Problem with Lamport Clocks • The main problem is that a simple integer clock cannot order both events within a process and events in different processes. • C. Fidge developed an algorithm that overcomes this problem. • Fidge’s clock is represented as a vector [v1,v2,…,vn] with an integer clock value for each process (vi contains the clock value of process i). This is a vector timestamp.

  28. Fidge’s Algorithm • Properties of vector timestamps • vi[i] is the number of events that have occurred so far at Pi • If vi[j] = k then Pi knows that k events have occurred at Pj

  29. Fidge’s Algorithm • Fidge’s logical clock is maintained as follows: • Initially all clock values are set to the smallest value (e.g., 0). • The local clock value is incremented at least once before each primitive event in a process, i.e., vi[i] = vi[i] + 1 • The current value of the entire logical clock vector is sent to the receiver with every outgoing message. • Values in the timestamp vectors are never decremented.

  30. Fidge’s Algorithm • Upon receiving a message, the receiver sets the value of each entry in its local timestamp vector to the maximum of the two corresponding values in the local vector and in the remote vector received. • Let vq be piggybacked on the message sent by process q to process p. We then have: • for i = 1 to n do vp[i] = max(vp[i], vq[i]) • then vp[p] = vp[p] + 1
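The update rules can be sketched as a small class. To have something to check against, the replay below reproduces the vector timestamps of the timing-diagram example in this deck, under an assumed message pattern inferred from those timestamps: b is sent by P1 and received as f, k is sent by P3 and received as g, and d is sent by P1 and received as i. The class name `VectorClock` and its methods are illustrative, and processes are indexed from 0 in the code.

```python
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid    # 0-based index of the owning process
        self.v = [0] * n  # all entries start at the smallest value

    def event(self):
        """Increment the local entry before each event; return the timestamp."""
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):
        """A send is an event; the whole vector is piggybacked on the message."""
        return self.event()

    def receive(self, remote):
        """Entry-wise maximum with the received vector, then count the receive."""
        self.v = [max(a, b) for a, b in zip(self.v, remote)]
        return self.event()


p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p1.event()
b = p1.send()      # assumed: message from P1 to P2
e = p2.event()
f = p2.receive(b)
j = p3.event()
k = p3.send()      # assumed: message from P3 to P2
g = p2.receive(k)
c = p1.event()
d = p1.send()      # assumed: message from P1 to P2
h = p2.event()
i = p2.receive(d)
l = p3.event()
print(f, g, i)     # [2, 2, 0] [2, 3, 2] [4, 5, 2]
```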

  31. Fidge’s Algorithm • For two vector timestamps, Ta and Tb • Ta is not equal to Tb if there exists an i such that Ta[i] is not equal to Tb[i] • Ta <= Tb if for all i Ta[i] <= Tb[i] • Ta < Tb if for all i Ta[i] <= Tb[i] AND Ta is not equal to Tb • Events a and b are causally related if Ta < Tb or Tb < Ta; otherwise they are concurrent.
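These comparisons translate directly into code. The sketch below uses timestamps taken from the vector-timestamp example in this deck (b = [2,0,0], f = [2,2,0], e = [0,1,0], c = [3,0,0], h = [2,4,2], k = [0,0,2]); the function names are made up for the illustration.

```python
def leq(ta, tb):
    """Ta <= Tb: every entry of Ta is at most the matching entry of Tb."""
    return all(x <= y for x, y in zip(ta, tb))


def less(ta, tb):
    """Ta < Tb: Ta <= Tb and the two vectors differ."""
    return leq(ta, tb) and ta != tb


def causally_related(ta, tb):
    return less(ta, tb) or less(tb, ta)


b, f = [2, 0, 0], [2, 2, 0]
e, c = [0, 1, 0], [3, 0, 0]
h, k = [2, 4, 2], [0, 0, 2]

print(less(b, f))              # True: b happened before f
print(causally_related(e, b))  # False: concurrent, although C(e) < C(b) with Lamport clocks
print(less(k, h))              # True: k happened before h
print(causally_related(c, h))  # False: concurrent
```

The pair e, b is the case that scalar Lamport timestamps cannot distinguish: the integers 1 and 2 are ordered, while the vectors [0,1,0] and [2,0,0] are incomparable.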

  32. Example • [Timing diagram: events a, b, c, d at P1; e, f, g, h, i at P2; j, k, l at P3]

  33. Example • [Timing diagram with vector timestamps. P1: a = [1,0,0], b = [2,0,0], c = [3,0,0], d = [4,0,0]. P2: e = [0,1,0], f = [2,2,0], g = [2,3,2], h = [2,4,2], i = [4,5,2]. P3: j = [0,0,1], k = [0,0,2], l = [0,0,3]]

  34. Example Application:Bulletin Board • The Internet’s electronic bulletin board service (network news) • Users (processes) join specific groups (discussion groups). • Postings, whether they are articles or reactions, are multicast to all group members. • Could use a totally-ordered multicasting scheme.

  35. Display from a Bulletin Board Program • Users run bulletin board applications which multicast messages • One multicast group per topic (e.g. os.interesting) • Require reliable multicast, so that all members receive messages • Ordering: total (makes the numbers the same at all sites), causal (makes replies come after the original message), FIFO (gives sender order) • Figure 11.13, Bulletin board: os.interesting (Item, From, Subject): 23, A.Hanlon, Mach; 24, G.Joseph, Microkernels; 25, A.Hanlon, Re: Microkernels; 26, T.L’Heureux, RPC performance; 27, M.Walker, Re: Mach

  36. Example Application: Bulletin Board • A totally-ordered multicasting scheme does not imply that if message B is delivered after message A, that B is a reaction to A. • Totally-ordered multicasting is too strong in this case. • The receipt of an article causally precedes the posting of a reaction. The receipt of the reaction to an article should always follow the receipt of the article.

  37. Example Application: Bulletin Board • If we look at the bulletin board example, it is allowed to have items 26 and 27 in different order at different sites. • Items 25 and 26 may be in different order at different sites.

  38. Example Application: Bulletin Board • Vector timestamps can be used to guarantee causal message delivery. A slight variation of Fidge’s algorithm is used. • Each process Pi has an array Vi where Vi[j] denotes the number of events that process Pi knows have taken place at process Pj. • Vector timestamps are assumed to be updated only when posting or receiving articles, i.e., when a message is sent or received. Incrementing a component is only done during sending.

  39. Example Application: Bulletin Board • When a process Pi posts an article, it multicasts that article as a message with the vector timestamp. Let’s call this message a. Assume that the value of the timestamp is Vi • Process Pj posts a reaction. Let’s call this message r. Assume that the value of the timestamp is Vj • Note that Vj > Vi • Message r may arrive at Pk before message a.

  40. Example Application: Bulletin Board • Pk will postpone delivery of r to the display of the bulletin board until all messages that causally precede r have been received as well. • Message r is delivered iff the following conditions are met: • Vj[j] = Vk[j] + 1 • This states that r is the next message that Pk was expecting from process Pj • Vj[i] <= Vk[i] for all i not equal to j • This states that Pk has already seen every message that Pj had seen when it sent message r.
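The two delivery conditions amount to a single predicate. In this sketch, `v_msg` is the timestamp carried by the message, `sender` is the 0-based index of the posting process and `v_local` is the receiver's current vector; all three names are assumptions made for the example.

```python
def can_deliver(v_msg, sender, v_local):
    """Causal delivery test for a message stamped v_msg from process `sender`."""
    # Condition 1: this is the next message expected from the sender.
    next_from_sender = v_msg[sender] == v_local[sender] + 1
    # Condition 2: the receiver has seen everything the sender had seen.
    seen_the_rest = all(
        v_msg[i] <= v_local[i] for i in range(len(v_msg)) if i != sender
    )
    return next_from_sender and seen_the_rest


# A reply stamped [1,0,1] from the third process reaches a receiver at [0,0,0]:
print(can_deliver([1, 0, 1], 2, [0, 0, 0]))  # False: the article has not arrived
# The article itself, stamped [1,0,0] from the first process:
print(can_deliver([1, 0, 0], 0, [0, 0, 0]))  # True
# Once the article has been delivered, the reply passes the test:
print(can_deliver([1, 0, 1], 2, [1, 0, 0]))  # True
```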

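  41. Example Application: Bulletin Board • [Timing diagram: P1, P2 and P3 all start at [0,0,0]. P1 posts article a with timestamp [1,0,0]. P3 receives a and posts the reply r with timestamp [1,0,1]. P2 receives a, moving to [1,0,0], and then r, moving to [1,0,1]] • Message a arrives at P2 before the reply r from P3 does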

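  42. Example Application: Bulletin Board • [Timing diagram: P1, P2 and P3 all start at [0,0,0]. P1 posts article a with timestamp [1,0,0]. P3 receives a and posts the reply r with timestamp [1,0,1]. At P2, r arrives first and is buffered; once a arrives, P2 delivers a and moves to [1,0,0], then delivers r and moves to [1,0,1]] • The message a arrives at P2 after the reply from P3; the reply is not delivered right away.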
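The buffering behaviour in this scenario can be replayed with a small receiver object. This is a sketch only: `CausalReceiver` is a hypothetical name, processes are indexed from 0 (so P1 is sender 0 and P3 is sender 2), and the receiver's vector counts messages delivered from each sender.

```python
class CausalReceiver:
    def __init__(self, n):
        self.v = [0] * n     # self.v[j]: messages from process j delivered so far
        self.buffer = []     # messages that arrived too early
        self.delivered = []  # names of delivered messages, in order

    def _deliverable(self, stamp, sender):
        return (stamp[sender] == self.v[sender] + 1 and
                all(stamp[i] <= self.v[i]
                    for i in range(len(stamp)) if i != sender))

    def receive(self, name, stamp, sender):
        self.buffer.append((name, stamp, sender))
        progress = True
        while progress:  # one delivery may unblock buffered messages
            progress = False
            for item in list(self.buffer):
                item_name, item_stamp, item_sender = item
                if self._deliverable(item_stamp, item_sender):
                    self.buffer.remove(item)
                    self.delivered.append(item_name)
                    self.v[item_sender] = item_stamp[item_sender]
                    progress = True


p2 = CausalReceiver(3)
p2.receive("r", [1, 0, 1], sender=2)  # reply from P3 arrives first
after_r = list(p2.delivered)          # nothing delivered yet, r is buffered
p2.receive("a", [1, 0, 0], sender=0)  # article from P1 arrives later
print(after_r, p2.delivered, p2.v)    # [] ['a', 'r'] [1, 0, 1]
```

Rescanning the buffer after every delivery is the simplest correct strategy; a real system with many pending messages would likely index the buffer by sender instead.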

  43. Summary • No notion of a globally shared clock. • Local (physical) clocks must be synchronized based on algorithms that take into account network latency. • Knowing the absolute time is not necessary. • Logical clocks can be used for ordering purposes.
