
1 Distributed Systems Principles and Paradigms Chapter 05 Synchronization

2 Communication & Synchronization Why do processes communicate in a DS? –To exchange messages –To synchronize processes Why do processes synchronize in a DS? –To coordinate access to shared resources –To order events

3 Time, Clocks and Clock Synchronization Time –Why is time important in a DS? –E.g., the UNIX make utility (see Fig. 5-1) Clocks (Timers) –Physical clocks –Logical clocks (introduced by Leslie Lamport) –Vector clocks (introduced by Colin Fidge) Clock Synchronization –How do we synchronize clocks with real-world time? –How do we synchronize clocks with each other? 05 – 1 Distributed Algorithms/5.1 Clock Synchronization

4 Physical Clocks (1/3) Problem: Clock skew – clocks gradually get out of sync and give different values. Solution: Coordinated Universal Time (UTC): Formerly called GMT (Greenwich Mean Time). Based on the number of transitions per second of the cesium 133 atom (very accurate). At present, the real time is taken as the average of some 50 cesium clocks around the world – International Atomic Time. A leap second is introduced from time to time to compensate for the fact that days are getting longer. UTC is broadcast through shortwave radio (with an accuracy of +/- 1 msec) and satellite (Geostationary Operational Environmental Satellite, GOES, with an accuracy of +/- 0.5 msec). Question: Does this solve all our problems? Don’t we now have some global timing mechanism? 05 – 2 Distributed Algorithms/5.1 Clock Synchronization

5 Physical Clocks (2/3) Problem: Even with a UTC receiver somewhere in the system, we still have to distribute its time to each machine. Basic principle: Every machine has a timer that generates an interrupt H (typically 60) times per second. There is a clock in machine p that ticks on each timer interrupt. Denote the value of that clock by Cp(t), where t is UTC time. Ideally, we have that for each machine p, Cp(t) = t, or, in other words, dC/dt = 1. In theory, a timer with H = 60 should generate 216,000 ticks per hour. In practice, the relative error of modern timer chips is about 10^-5 (i.e., between 215,998 and 216,002 ticks per hour). 05 – 3 Distributed Algorithms/5.1 Clock Synchronization
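The numbers on this slide can be checked directly. A quick sketch (H and the drift rate are the slide's values, not measurements from a real timer chip):

```python
# An ideal timer with H = 60 interrupts/second produces 60 * 3600 ticks per hour;
# a relative error of 10^-5 bounds the actual count either side of that.
H = 60                    # timer interrupts per second
ideal = H * 3600          # ideal ticks per hour
rho = 1e-5                # maximum relative error of the timer chip

low, high = ideal * (1 - rho), ideal * (1 + rho)
print(ideal, round(low), round(high))   # 216000 215998 216002
```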

6 Physical Clocks (3/3) Goal: Never let two clocks in any system differ by more than δ time units ⇒ synchronize at least every δ/(2ρ) seconds, where ρ is the maximum drift rate. 05 – 4 Distributed Algorithms/5.1 Clock Synchronization
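The bound can be computed directly. A minimal sketch (the reasoning: if each clock drifts from UTC by at most ρ, two clocks can drift apart at up to 2ρ, so staying within δ requires resynchronizing every δ/(2ρ) seconds; the example values are illustrative):

```python
def resync_interval(delta, rho):
    """Maximum seconds between synchronizations: delta is the allowed
    difference between two clocks, rho the maximum drift rate."""
    return delta / (2 * rho)

# e.g. delta = 1 ms, rho = 10^-5  ->  resynchronize roughly every 50 seconds
print(resync_interval(0.001, 1e-5))   # ~50.0
```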

7 Clock Synchronization Principles Principle I: Every machine asks a time server for the accurate time at least once every δ/(2ρ) seconds (see Fig. 5-5). But you need an accurate measure of the round-trip delay, including interrupt handling and the processing of incoming messages. Principle II: Let the time server scan all machines periodically, calculate an average, and inform each machine how it should adjust its time relative to its present time. Okay, you’ll probably get every machine in sync. Note: you don’t even need to propagate UTC time (why not?) 05 – 5 Distributed Algorithms/5.1 Clock Synchronization
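Principle I's round-trip correction can be sketched as follows. The timestamps are hypothetical, and halving the round trip is the usual simplifying assumption that the delay is symmetric:

```python
def estimate_server_time(t_sent, t_received, server_time):
    """All arguments in seconds on the client's clock (server_time is the
    value carried in the reply); returns the estimated current server time."""
    rtt = t_received - t_sent
    return server_time + rtt / 2   # assume half the round trip was the return leg

# Hypothetical exchange: request sent at t=10.000, reply at t=10.040 carrying 12.000
print(estimate_server_time(10.000, 10.040, 12.000))  # ~12.02
```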

8 Clock Synchronization Algorithms The Berkeley Algorithm  The time server periodically polls every machine for its time  The received times are averaged, and each machine is notified of the amount of time it should adjust  Centralized algorithm; see Figure 5-6 Decentralized Algorithm  Every machine broadcasts its time periodically, at the start of each fixed-length resynchronization interval  Each machine averages the values from all other machines (or averages after discarding the highest and lowest values) Network Time Protocol (NTP)  the most popular protocol, used by machines on the Internet  uses an algorithm that combines the centralized and distributed approaches 05 – 6 Distributed Algorithms/5.2 Logical Clocks
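The core step of the Berkeley algorithm can be sketched in a few lines (machine names and clock values below are made up; the real algorithm also compensates for message delays):

```python
def berkeley_adjustments(times):
    """times: dict mapping machine -> reported clock value (server included).
    Returns machine -> signed adjustment the server tells it to apply."""
    avg = sum(times.values()) / len(times)
    return {m: avg - t for m, t in times.items()}

polled = {"server": 3.00, "m1": 3.25, "m2": 2.75}
print(berkeley_adjustments(polled))   # {'server': 0.0, 'm1': -0.25, 'm2': 0.25}
```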

9 Network Time Protocol (NTP) a protocol for synchronizing the clocks of computers over packet-switched, variable-latency data networks (i.e., the Internet) NTP uses UDP on port 123 as its transport. It is designed particularly to resist the effects of variable latency NTPv4 can usually maintain time to within 10 milliseconds (1/100 s) over the public Internet, and can achieve accuracies of 200 microseconds (1/5000 s) or better in local area networks under ideal conditions visit the following URL to understand NTP in more detail http://en.wikipedia.org/wiki/Network_Time_Protocol
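The offset and delay computation that NTP is built around uses four timestamps: t0 = client send, t1 = server receive, t2 = server send, t3 = client receive. A sketch with made-up values (a client 0.5 s behind the server, 0.1 s network delay each way):

```python
def ntp_offset_delay(t0, t1, t2, t3):
    offset = ((t1 - t0) + (t2 - t3)) / 2   # estimated client clock error
    delay = (t3 - t0) - (t2 - t1)          # round-trip network delay
    return offset, delay

# t0, t3 on the client's clock; t1, t2 on the server's clock
print(ntp_offset_delay(10.0, 10.6, 10.7, 10.3))   # ~ (0.5, 0.2)
```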

10 The Happened-Before Relationship Problem: We first need to introduce a notion of ordering before we can order anything. The happened-before relation on the set of events in a distributed system is the smallest relation satisfying: If a and b are two events in the same process, and a comes before b, then a → b. (a happened before b) If a is the sending of a message, and b is the receipt of that message, then a → b. If a → b and b → c, then a → c. (transitive relation) Note: if two events, x and y, happen in different processes that do not exchange messages, then they are said to be concurrent. Note: this introduces a partial ordering of events in a system with concurrently operating processes. 05 – 6 Distributed Algorithms/5.2 Logical Clocks

11 Logical Clocks (1/2) Problem: How do we maintain a global view of the system’s behavior that is consistent with the happened-before relation? Solution: attach a timestamp C(e) to each event e, satisfying the following properties: P1: If a and b are two events in the same process, and a → b, then we demand that C(a) < C(b) P2: If a corresponds to sending a message m, and b to the receipt of that message, then also C(a) < C(b) Problem: How do we attach a timestamp to an event when there’s no global clock? ⇒ maintain a consistent set of logical clocks, one per process. 05 – 7 Distributed Algorithms/5.2 Logical Clocks

12 Logical Clocks (2/2) Each process Pi maintains a local counter Ci and adjusts this counter according to the following rules: (1) For any two successive events that take place within Pi, Ci is incremented by 1. (2) Each time a message m is sent by process Pi, the message receives a timestamp Tm = Ci. (3) Whenever a message m is received by a process Pj, Pj adjusts its local counter Cj to max{Cj, Tm} + 1. Property P1 is satisfied by (1); Property P2 by (2) and (3). This is called Lamport’s Algorithm 05 – 8 Distributed Algorithms/5.2 Logical Clocks
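Rules (1)–(3) can be sketched as a tiny clock per process (a minimal illustration, not the book's code; the two-process scenario at the bottom is made up):

```python
class LamportClock:
    def __init__(self):
        self.c = 0
    def internal_event(self):      # rule (1): tick on each local event
        self.c += 1
        return self.c
    def send(self):                # rules (1) + (2): tick, then timestamp the message
        self.c += 1
        return self.c              # Tm
    def receive(self, tm):         # rule (3): Cj := max{Cj, Tm} + 1
        self.c = max(self.c, tm) + 1
        return self.c

p, q = LamportClock(), LamportClock()
tm = p.send()          # p's clock becomes 1; message carries Tm = 1
q.internal_event()     # q's clock becomes 1
print(q.receive(tm))   # max(1, 1) + 1 = 2
```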

13 Logical Clocks – Example 05 – 9 Distributed Algorithms/5.2 Logical Clocks Fig. 5-7. (a) Three processes, each with its own clock. The clocks run at different rates. (b) Lamport’s algorithm corrects the clocks.

14 Election Algorithms Principle: Many distributed algorithms require that some process acts as a coordinator. The question is how to select this special process dynamically. Note: In many systems the coordinator is chosen by hand (e.g., file servers, DNS servers). This leads to centralized solutions => single point of failure. Question: If a coordinator is chosen dynamically, to what extent can we speak about a centralized or distributed solution? Question: Is a fully distributed solution, i.e., one without a coordinator, always more robust than any centralized/coordinated solution? 05 – 18 Distributed Algorithms/5.4 Election Algorithms

15 Election by Bullying (1/2) Principle: Each process has an associated priority (weight). The process with the highest priority should always be elected as the coordinator. Issue: How do we find the heaviest process? Any process can start an election by sending an election message to all other processes (assuming you don’t know the weights of the others). If a process Pheavy receives an election message from a lighter process Plight, it sends a take-over message to Plight. Plight is out of the race. If a process doesn’t get a take-over message back, it wins, and sends a victory message to all other processes. 05 – 19 Distributed Algorithms/5.4 Election Algorithms

16 Election by Bullying (2/2) Question: We’re assuming something very important here – what? Assumption: Each process knows the process numbers of all other processes 05 – 20 Distributed Algorithms/5.4 Election Algorithms
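A toy run of the bully election can be sketched as follows (process numbers double as weights, per the assumption above; which processes are alive is made up for the example, and messaging is collapsed into direct set lookups):

```python
def bully_election(initiator, alive):
    """Return the id of the new coordinator among the alive processes."""
    candidate = initiator
    while True:
        heavier = [p for p in alive if p > candidate]  # these answer "take-over"
        if not heavier:
            return candidate       # no take-over received: candidate wins
        candidate = min(heavier)   # a heavier process continues the election

print(bully_election(2, alive={1, 2, 3, 5}))   # 5
```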

17 Mutual Exclusion Problem: A number of processes in a distributed system want exclusive access to some resource. Basic solutions: Via a centralized server. Completely distributed, with no topology imposed. Completely distributed, making use of a (logical) ring. Centralized: really simple. 05 – 22 Distributed Algorithms/5.5 Mutual Exclusion
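The centralized scheme can be sketched in a few lines (message passing is collapsed into method calls; the return values GRANT/QUEUED are my labels, not the book's):

```python
from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = None       # process currently holding the resource
        self.queue = deque()     # waiting requesters, in arrival order
    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANT"
        self.queue.append(pid)
        return "QUEUED"          # or simply no reply, blocking the requester
    def release(self, pid):
        assert pid == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder       # next process granted (None if queue is empty)

c = Coordinator()
print(c.request(1))   # GRANT
print(c.request(2))   # QUEUED
print(c.release(1))   # 2
```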

18 Mutual Exclusion: Ricart & Agrawala Principle: The same as Lamport’s, except that acknowledgments aren’t sent. Instead, replies (i.e., grants) are sent only when: The receiving process has no interest in the shared resource; or The receiving process is waiting for the resource but has lower priority (determined by comparing timestamps). In all other cases, the reply is deferred (see the algorithm on pg. 267) 05 – 23 Distributed Algorithms/5.5 Mutual Exclusion
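The reply rule can be written as a single predicate (a sketch; the state names RELEASED/WANTED/HELD and the tie-breaking by process id are the usual conventions, not taken verbatim from the book):

```python
def should_reply(state, own_ts, own_id, req_ts, req_id):
    """Decide whether to reply immediately to a request carrying Lamport
    timestamp req_ts from process req_id. state is 'RELEASED', 'WANTED',
    or 'HELD'; ties on timestamps are broken by process id."""
    if state == "RELEASED":
        return True                              # no interest in the resource
    if state == "HELD":
        return False                             # using it: defer the reply
    return (req_ts, req_id) < (own_ts, own_id)   # WANTED: older request wins

print(should_reply("WANTED", own_ts=8, own_id=1, req_ts=5, req_id=2))  # True
print(should_reply("WANTED", own_ts=5, own_id=1, req_ts=8, req_id=2))  # False
```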

19 The most straightforward way to achieve mutual exclusion in a distributed system is to simulate how it is done in a one-processor system: one process is elected as the coordinator

20 READING: Read Chapter 5

