Chapter 5 Synchronization: Clocks and Synchronization Algorithms, Lamport Timestamps and Vector Clocks, Distributed Snapshots, Termination Detection, Election Algorithms.


Chapter 5 Synchronization: Clocks and Synchronization Algorithms; Lamport Timestamps and Vector Clocks; Distributed Snapshots; Termination Detection; Election Algorithms; Distributed Mutual Exclusion; Transactions – concurrency control.

2 What Do We Mean By Time? Monotonically increasing. Useful when everyone agrees on it. UTC is Coordinated Universal Time. NIST operates the short-wave radio station WWV, which broadcasts UTC from Colorado.

3 Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time.

4 Time Time is complicated in a distributed system. Physical clocks run at slightly different rates – so they can ‘drift’ apart. Clock makers specify a maximum drift rate ρ (rho). By definition, 1 − ρ ≤ dC/dt ≤ 1 + ρ, where C(t) is the clock’s time as a function of the real time t.

5 Clock Synchronization The relation between clock time and UTC when clocks tick at different rates.

6 Clock Synchronization 1 − ρ ≤ dC/dt ≤ 1 + ρ. A perfect clock has dC/dt = 1. Assuming two clocks have the same maximum drift rate ρ, to keep them synchronized to within a time interval δ (delta), they must re-sync at least every δ/2ρ seconds.
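
A quick worked example of that bound (the numeric values are illustrative assumptions, not figures from the slides):

```python
# Worked example of the resynchronization interval delta / (2 * rho).
# The numeric values below are illustrative assumptions only.

rho = 1e-5     # assumed maximum drift rate: 10 microseconds per second
delta = 1e-3   # assumed tolerance: clocks must agree within 1 millisecond

# Two clocks drifting in opposite directions can separate at up to 2*rho
# seconds per second, so they may be delta apart after delta / (2 * rho).
interval = delta / (2 * rho)
print(f"re-sync at least every {interval:.0f} seconds")  # -> 50 seconds
```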

7 Cristian’s Algorithm One of the nodes (or processors) in the distributed system is a time server TS (presumably with access to UTC). How can the other nodes be synchronized? Periodically, at least every δ/2ρ seconds, each machine sends a message to the TS asking for the current time, and the TS responds.

8 Cristian's Algorithm Getting the current time from a time server.

9 Cristian’s Algorithm Should the client node simply force its clock to the value in the message? Potential problem: if the client’s clock was fast, the new time may be less than its current time, and just setting the clock to the new time might make time appear to run backwards on that node. TIME MUST NEVER RUN BACKWARDS. Many applications depend on the fact that time is always increasing, so the new time must be worked in gradually.

10 Cristian’s Algorithm Can we compensate for the delay between when the TS sends the response and T1, when it is received? If no outside information is available, add (T1 – T0)/2. Better: estimate, or ask the server, how long it takes to process a time request, say R, and add (T1 – T0 – R)/2. Better still: take several measurements and use the smallest, or an average after throwing out the large values.
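
A minimal client-side sketch of that estimate (`request_server_time` is an assumed stand-in for whatever request mechanism the system actually provides; it is not an API from the slides):

```python
import time

def cristian_adjust(request_server_time, server_processing_time=0.0):
    """Estimate the time server's current clock, compensating for delay.

    request_server_time: callable that asks the time server for its clock
    and returns that value (an assumed stand-in for the real request).
    server_processing_time: optional estimate R of the server's handling time.
    """
    t0 = time.monotonic()                # T0: local time when the request is sent
    server_time = request_server_time()  # time reported by the server
    t1 = time.monotonic()                # T1: local time when the reply arrives

    # Assume the one-way delay is half the round trip, minus any time the
    # server spent processing the request.
    one_way_delay = (t1 - t0 - server_processing_time) / 2
    # Return the estimate; the caller should work the adjustment in gradually
    # rather than ever stepping the clock backwards.
    return server_time + one_way_delay
```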

11 The Berkeley Algorithm The server actively tries to sync the clocks of a DS. This algorithm is appropriate if no one has UTC and all must agree on the time. The server “polls” each machine by sending its current time and asking for the difference between its time and theirs. Each site responds with the difference. The server computes an ‘average’ with some compensation for transmission time, computes how each machine would need to adjust its clock, and sends each machine instructions.

12 The Berkeley Algorithm a) The time daemon asks all the other machines for their clock values. b) The machines answer. c) The time daemon tells everyone how to adjust their clock.
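
A rough sketch of the time daemon’s calculation, assuming the offsets have already been gathered (transmission-delay compensation and outlier rejection are left out; the names and example numbers are illustrative assumptions):

```python
def berkeley_adjustments(offsets):
    """Compute per-machine clock corrections from reported offsets.

    offsets: dict mapping machine id -> (that machine's clock - daemon's clock),
    with the daemon itself included at offset 0.0.
    """
    average = sum(offsets.values()) / len(offsets)
    # Each machine shifts its clock so that it lands on the average.
    return {machine: average - offset for machine, offset in offsets.items()}

# Example: the daemon is "A"; B is 2 s fast, C is 3 s slow.
print(berkeley_adjustments({"A": 0.0, "B": 2.0, "C": -3.0}))
# -> A moves back ~0.33 s, B back ~2.33 s, C forward ~2.67 s
```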

13 Analysis of Sync Algorithms Cristian’s algorithm: N clients send and receive a message every δ/2ρ seconds. Berkeley algorithm: 3N messages every δ/2ρ seconds. Both assume a central time server or coordinator. More distributed algorithms exist in which each processor broadcasts its time at an agreed-upon time interval and processors go through an agreement protocol to average the value and agree on it.

14 Analysis of Sync Algorithms In general, algorithms with no coordinator have greater message complexity (more messages for the same number of nodes). That’s the price you pay for equality and no-single-point-of-failure. With modern hardware, we can achieve “loosely synchronized” clocks. This forms the basis for many distributed algorithms in which logical clocks are used with physical clock timestamps to disambiguate when logical clocks roll over or servers crash and sequence numbers start over (which is inevitable in real implementations).

15 Logical Clocks What do we really need in a “clock”? For many applications, it is not necessary for nodes of a DS to agree on the real time, only that they agree on some value that has the attributes of time. Attributes of time: X(t) has the sense or attributes of time if it is strictly increasing. A real or integer counter can be used. A real number would be closer to reality, however, an integer counter is easier for algorithms and programmers. Thus, for convenience, we use an integer which is incremented anytime an event of possible interest occurs.

16 Logical Clocks in a DS What is important is usually not when things happened but in what order they happened so the integer counter works well in a centralized system. However, in a DS, each system has its own logical clock, and you can run into problems if one “clock” gets ahead of others. (like with physical clocks) We need a rule to synchronize the logical clocks.

17 Lamport Clocks Lamport defined the happens-before relation for DS. A → B means “A happens before B”. (1) If A and B are events in the same process and A occurs before B, then A → B. (2) If A is the event of a message being sent by one process and B is the event of that message being received by another process, then A → B (a message must be sent before it is received). Happens-before is the transitive closure of (1) and (2): if A → B and B → C, then A → C. Any other pairs of events are said to be concurrent.

18 Lamport Clocks p1 → q2, q1 → p2, and q1 → q2, but p1 and q1 are incomparable. p1 → q3 and p1 → r2. Does p1 → r1? [Figure: space-time diagram of processes P, Q, R with events p1–p3, q1–q3, r1–r2.]

19 Lamport Clocks Desired properties: (1) anytime A → B, C(A) < C(B), that is, the logical clock value of the earlier event is less; (2) the clock value C is increasing (never runs backwards).

20 Lamport Clocks Rules An event is an internal event or a message send or receive. The local clock is increased by one for each message sent, and the message carries that timestamp with it. The local clock is increased for an internal event. When a message is received, the current local clock value, C, is compared to the message timestamp, T. If T = C, set the local clock to C+1. If T > C, set the clock to T+1. If T < C, set the clock to C+1. In every case the new value is max(C, T) + 1.
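
A minimal sketch of these rules (the tuple message format is an illustrative assumption):

```python
class LamportClock:
    """Integer logical clock following the send/receive rules above."""

    def __init__(self):
        self.time = 0

    def internal_event(self):
        self.time += 1
        return self.time

    def send(self, payload):
        # Increment before sending; the timestamp travels with the message.
        self.time += 1
        return (payload, self.time)

    def receive(self, message):
        payload, sender_time = message
        # The three cases on the slide reduce to max(local, received) + 1.
        self.time = max(self.time, sender_time) + 1
        return payload
```

For example, a process whose clock reads 4 that receives a message stamped 9 moves its clock to 10.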

21 Lamport Clocks Anytime A → B, C(A) < C(B). However, C(E) < C(F) doesn’t mean E → F (for example, event 6 on P may not precede event 7 on Q). [Figure: space-time diagram of processes P, Q, R with Lamport clock values.]

22 Lamport Timestamps If you need a total ordering (to distinguish between event 6 on P and event 6 on Q), use Lamport timestamps. The Lamport timestamp of event A at node i is (C(A), i). For any two timestamps T1 = (C(A), i) and T2 = (C(B), j): if C(A) > C(B) then T1 > T2; if C(A) < C(B) then T1 < T2; if C(A) = C(B), consider the node numbers: if i > j then T1 > T2, if i < j then T1 < T2, and if i = j the two events occurred at the same node, so since their clock values are the same they must be the same event.
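
In code, a timestamp can simply be the pair (clock value, node id); tuple comparison then gives exactly the total order described above (a small sketch under that assumed representation):

```python
def lamport_total_order(ts1, ts2):
    """Compare (clock, node_id) pairs; ties on the clock break on node id."""
    # Python tuple comparison looks at the clock first, then the node id,
    # which matches the rule on the slide.
    if ts1 < ts2:
        return -1
    if ts1 > ts2:
        return 1
    return 0  # same clock and same node: the same event

print(lamport_total_order((6, 1), (6, 2)))  # -> -1, (6,1) orders before (6,2)
print(lamport_total_order((4, 3), (6, 2)))  # -> -1, (4,3) orders before (6,2)
```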

23 Lamport Timestamps (6,1) < (6,2) and (4,3) < (6,2). [Figure: space-time diagram of processes P1, P2, P3 with timestamps (2,2), (3,3), (4,3), (5,1), (6,1), (6,2), (7,1), (7,2), (8,3).]

24 Lamport Timestamps a) Three processes, each with its own clock. The clocks run at different rates. b) Lamport's algorithm corrects the clocks.

25 Exercise: Lamport Clocks and Timestamps Assuming the only events are message send and receive, what are the clocks at events 1–7? [Figure: space-time diagram of processes A, B, C with events 1–7.]

26 Why Vector Timestamps Lamport timestamps give us the property: if A → B then C(A) < C(B). But they don’t give us the converse: if C(A) < C(B), A and B may be concurrent or incomparable, but never B → A. [Figure: processes A, B, C with Lamport timestamps (1,1), (2,1), (1,2), (2,2), (1,3), (2,3); the timestamp (1,1) < (2,2), but the events are unrelated.]

27 Why Vector Timestamps Also, Lamport timestamps do not detect causality violations. Causality violations are caused by long communication delays in one channel that are not present in other channels, or by a non-FIFO channel. [Figure: space-time diagram of processes A, B, C.]

28 Causality Violation Causality violation example: A gets a message from B that was broadcast to all nodes. A responds by broadcasting an answer to all nodes. C gets A’s answer to B before it receives B’s original message. How can C tell that this message is out of order? [Figure: space-time diagram of processes A, B, C showing the violation.]

29 Causality: Solution The solution is vector timestamps: each node maintains an array of counters. If there are N nodes, the array has N integers, V[1..N]. V[I] = C, the local clock, if I is the index of the local node. In general, V[X] is the latest information the node has about X’s local clock. This gives us the property e → f iff ts(e) < ts(f).

30 Vector Timestamps Each site has a local clock incremented at each event (not according to Lamport clocks) The vector clock timestamp is piggybacked on each message sent. RULES: Local clock is incremented for a local event and for a send event. The message carries the vector time stamp. When a message is received, the local clock is incremented by one. Each other component of the vector is increased to the received vector timestamp component if the current value is less. That is, the maximum of the two vector components is the new value.
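
A sketch of those rules (indexing and the piggybacked message format are illustrative assumptions):

```python
class VectorClock:
    """Vector clock for node `me` in a system of n nodes."""

    def __init__(self, me, n):
        self.me = me
        self.v = [0] * n

    def local_event(self):
        self.v[self.me] += 1

    def send(self, payload):
        # A send is a local event; the whole vector is piggybacked.
        self.v[self.me] += 1
        return (payload, list(self.v))

    def receive(self, message):
        payload, ts = message
        # Receiving is a local event...
        self.v[self.me] += 1
        # ...and every other component becomes the max of the two vectors.
        for i, t in enumerate(ts):
            if i != self.me and t > self.v[i]:
                self.v[i] = t
        return payload
```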

31 Vector Timestamps and Causal Violations C receives the message timestamped (2,1,0) before the one timestamped (0,1,0). The later-arriving message causally precedes the first one, and we can detect this if we define how to compare timestamps correctly. [Figure: space-time diagram of processes A, B, C.]

32 Vector Clock Comparison VC1 > VC2 if for each component j, VC1[j] >= VC2[j], and for some component k, VC1[k] > VC2[k]. VC1 = VC2 if for each j, VC1[j] = VC2[j]. Otherwise, VC1 and VC2 are incomparable and the events they represent are concurrent. [Figure: processes A, B, C; the clocks at points 1–4 are 1 = (2,1,0), 2 = (2,2,0), 3 = (2,1,1), 4 = (2,1,2).]
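
The comparison rule as a small function (a sketch; the example values follow the figure above):

```python
def vc_compare(vc1, vc2):
    """Return '<', '>', '=', or 'concurrent' for two equal-length vectors."""
    le = all(a <= b for a, b in zip(vc1, vc2))  # vc1 <= vc2 componentwise
    ge = all(a >= b for a, b in zip(vc1, vc2))  # vc1 >= vc2 componentwise
    if le and ge:
        return "="
    if le:
        return "<"           # the first event happened before the second
    if ge:
        return ">"           # the second event happened before the first
    return "concurrent"      # incomparable: the events are concurrent

print(vc_compare((2, 1, 0), (2, 2, 0)))  # -> '<'  (point 1 before point 2)
print(vc_compare((2, 2, 0), (2, 1, 1)))  # -> 'concurrent'  (points 2 and 3)
```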

33 Vector Clock Exercise Assuming the only events are send and receive: What is the vector clock at events 1–6? Which events are concurrent? [Figure: space-time diagram of processes A, B, C with events 1–6.]

34 Matrix Timestamps Matrix timestamps can be used to give each node more information about the state of the other nodes. Each site keeps a two-dimensional time table. If T_i[j,k] = v, then site i knows that site j is aware of all events at site k up to v. Row x is the view of the vector clock at site x. [Figure: A’s time table, a 3×3 matrix with rows and columns labeled A, B, C, alongside a space-time diagram of processes A, B, C.]
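
A sketch of one common way to maintain such a time table (the slide does not give update rules, so the rules below are an assumed formulation; textbook presentations vary in the details):

```python
class MatrixClock:
    """Matrix timestamp ('time table') for node `me` among n nodes.

    tt[j][k] = what `me` knows about how far node j has seen node k's events.
    Row `me` is this node's own vector clock.
    """

    def __init__(self, me, n):
        self.me = me
        self.n = n
        self.tt = [[0] * n for _ in range(n)]

    def send(self, payload):
        self.tt[self.me][self.me] += 1
        # The whole table is piggybacked on the message.
        return (payload, [row[:] for row in self.tt], self.me)

    def receive(self, message):
        payload, their_tt, sender = message
        self.tt[self.me][self.me] += 1
        # Every entry becomes the componentwise max of the two tables...
        for j in range(self.n):
            for k in range(self.n):
                self.tt[j][k] = max(self.tt[j][k], their_tt[j][k])
        # ...and our own row additionally absorbs the sender's row, since we
        # have now seen everything the sender had seen.
        for k in range(self.n):
            self.tt[self.me][k] = max(self.tt[self.me][k], their_tt[sender][k])
        return payload
```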

35 Global State Matrix timestamps are one way of getting information about the distributed system. Another way is to sample the global state. The global state is the combination of the states of all the processors and channels at some time which could have occurred. Because there is no way of recording states at exactly the same time at every node, we will have to be careful how we define this.

36 Global State There are many reasons for wanting to sample the global state, i.e., “take a snapshot”: deadlock detection, finding a lost token, detecting termination of a distributed computation, and garbage collection. We must define what is meant by the state of a node or a channel.

37 Defining Global State There are N processes P1…Pn. The state of a process Pi is defined by the system and application being used. Between each pair of processors, Pi and Pj, there is a one-way communication channel Ci,j. Channels are reliable and FIFO, i.e., messages arrive in the order sent. The contents of Ci,j is an ordered list of messages Li,j = (m1, m2, m3, …), where m1 (the head or front) is the next message to be delivered. The state of the channel is the messages in the channel and their order.

38 Defining Global State It is not necessary for all processors to be interconnected, but each processor must have at least one incoming channel and one outgoing channel and it must be possible to reach each processor from any other processor (graph is strongly connected)

39 Defining Global State The Global state is the combination of the states of all the processors and channels. The state of all the channels, L, is the set of messages sent but not yet received. Defining the state was easy, getting the state is more difficult. Intuitively, we say that a consistent global state is a “snapshot” of the DS that looks to the processes as if it were taken at the same instant everywhere.

40 Defining Global State For a global state to be meaningful, it must be one that could have occurred. Suppose we observe processor Pi (getting state Si) and it has just received a message m from processor Pk. When we observe processor Pk to get Sk, it should already have sent m to Pi in order for us to have a consistent global state. In other words, if we get Pk’s state before it sent message m and then get Pi’s state after it received m, we have an inconsistent global state. [Figure: consistent and inconsistent observations of Pi and Pk.]

41 Consistent Cut So we say that the global state must represent a consistent cut. One way of defining a consistent cut is that the observations resulting in the states Si should all occur concurrently (as defined using vector clocks). Equivalently, using the happens-before relation: events before the cut either happen-before the events after it or are unrelated, and no event after the cut happens-before an event before it.

42 Global State (1) a) A consistent cut. b) An inconsistent cut.

43 Algorithm for Distributed Snapshot A well-known algorithm by Chandy and Lamport. When instructed, each processor Pi will stop other processing, record its state Si, send out marker messages, and record the sequence of messages arriving on each incoming channel until a marker comes in (this will enable us to get the channel state Ci,j). At the end of the algorithm, the initiator or another coordinator collects the local states and compiles the global state.

44 Chandy Lamport Snapshot a) Organization of a process and channels for a distributed snapshot.

45 Chandy Lamport Snapshot One processor starts the snapshot by recording its own local state and immediately sending a marker message M on each of its outgoing channels (this marks the causal boundary between before the local state was recorded and after). It then records all the messages arriving on all incoming channels; when it has received markers from all incoming channels, it is done. When a processor that is not the initiator receives the marker for the first time, it immediately records its local state and sends out markers on all outgoing channels. It begins recording the received message sequence on all incoming channels other than the one it just received the marker on. When a marker has been received on each incoming channel, the processor is done with its part of the snapshot.
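
A sketch of the per-process logic (channel identifiers, the `send` callable, and how the recorded pieces reach a collector are simplifying assumptions, not details from the slides):

```python
MARKER = "MARKER"

class SnapshotProcess:
    """Per-process state for a Chandy-Lamport style snapshot (a sketch)."""

    def __init__(self, incoming, outgoing, send, local_state):
        self.incoming = set(incoming)    # ids of incoming channels
        self.outgoing = set(outgoing)    # ids of outgoing channels
        self.send = send                 # assumed transport: send(channel, msg)
        self.local_state = local_state
        self.recording = False
        self.recorded_state = None
        self.channel_log = {}            # channel id -> messages recorded
        self.pending = set()             # channels still being recorded

    def initiate(self):
        self._record_and_broadcast()

    def on_message(self, channel, msg):
        """Handle one incoming message; return True when this process is done."""
        if msg == MARKER:
            if not self.recording:
                # First marker seen: record local state and send markers out.
                self._record_and_broadcast()
            # A marker closes recording on the channel it arrived on; the
            # channel state is whatever application messages preceded it.
            self.channel_log.setdefault(channel, [])
            self.pending.discard(channel)
            return self.is_done()
        if self.recording and channel in self.pending:
            # An application message still in flight when the snapshot began:
            # it belongs to this channel's recorded state.
            self.channel_log.setdefault(channel, []).append(msg)
        return False

    def _record_and_broadcast(self):
        self.recording = True
        self.recorded_state = self.local_state   # record own state
        self.pending = set(self.incoming)        # record every incoming channel
        for channel in self.outgoing:            # markers mark the boundary
            self.send(channel, MARKER)

    def is_done(self):
        return self.recording and not self.pending
```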

46 Chandy Lamport Snapshot b) Process Q receives a marker for the first time and records its local state. c) Q records all incoming messages. d) Q receives a marker on its incoming channel and finishes recording the state of that channel.

47 Snapshot Node 2 initiates the snapshot. [Figure: four nodes; markers M go out on node 2’s outgoing channels while messages a, b, c are in flight.] Recorded so far: node 2’s state S2.

48 Snapshot [Figure: message a is delivered; markers and messages b and c are still in flight.] Recorded so far: state S2.

49 Snapshot [Figure: nodes 1 and 4 receive markers, record their states, and send markers of their own; message d is in flight.] Recorded so far: states S2, S1, S4.

50 Snapshot [Figure: on the channel from node 3 to node 2, message b arrives before the marker.] Recorded so far: states S2, S1, S4; channel state L3,2 = b.

51 Snapshot [Figure: node 2 has now received markers on all of its incoming channels.] Recorded so far: states S2, S1, S4; channel states L3,2 = b, L1,2 = empty, L4,2 = empty.

52 Snapshot [Figure: node 3 records its state S3 and sends markers; message d is recorded on the channel from node 3 to node 1.] Recorded so far: states S2, S1, S4, S3; channel states L3,2 = b, L1,2 = empty, L4,2 = empty, L3,1 = d.

53 Snapshot [Figure: the remaining markers are delivered.] Recorded so far: unchanged.

54 Snapshot [Figure: the snapshot is complete.] Recorded: states S1, S2, S3, S4 and channel states L3,2 = b, L1,2 = empty, L4,2 = empty, L3,1 = d.

55 Chandy Lamport Snapshot Uses O(|E|) messages, where |E| is the number of edges (channels). The time bound depends on the topology of the graph.

56 Next: Termination detection