Hwajung Lee. Question 1. Why is physical clock synchronization important? Question 2. With the price of atomic clocks or GPS coming down, should we care.


Hwajung Lee

Question 1. Why is physical clock synchronization important? Question 2. With the price of atomic clocks or GPS coming down, should we care about physical clock synchronization?

Types of synchronization
- External synchronization
- Internal synchronization
- Phase synchronization

Types of clocks
- Unbounded: 0, 1, 2, 3, ...
- Bounded: 0, 1, 2, ..., M-1, 0, 1, ...

Unbounded clocks are not realistic, but they are easier to deal with in the design of algorithms. Real clocks are always bounded.

What are these?
- Drift rate ρ
- Clock skew δ
- Resynchronization interval R

A maximum drift rate ρ implies (1 - ρ) ≤ dC/dt < (1 + ρ).

Challenges (drift is unavoidable)
- Accounting for propagation delay
- Accounting for processing delay
- Faulty clocks

Berkeley algorithm
A simple averaging algorithm that guarantees mutual consistency |c(i) - c(j)| < δ.
Step 1. Read every clock in the system.
Step 2. Discard outliers and substitute them by the value of the local clock.
Step 3. Update the clock using the average of these values.
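The three steps above can be sketched as follows. This is a minimal illustration, not the algorithm's reference implementation: the slide does not define "outlier", so the sketch assumes a reading is an outlier when it differs from the local clock by more than a threshold `delta`. The function and parameter names are hypothetical.

```python
def averaging_round(local_index, readings, delta):
    """One averaging round as seen by the clock at position local_index.

    readings: values read from every clock in the system (our own included).
    delta:    ASSUMED outlier threshold; the slide does not define it.
    """
    local = readings[local_index]
    # Step 2: replace each outlier by the value of the local clock.
    cleaned = [r if abs(r - local) <= delta else local for r in readings]
    # Step 3: the new clock value is the average of the cleaned readings.
    return sum(cleaned) / len(cleaned)


readings = [100.0, 102.0, 98.0, 500.0]   # the last clock is far off
print(averaging_round(0, readings, delta=5.0))
```

With these readings the value 500.0 is replaced by the local value 100.0 before averaging, so a single wild clock cannot drag the average away.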

Lamport and Melliar-Smith's averaging algorithm handles Byzantine clocks too.
Assume n clocks, of which at most t are faulty.
Step 1. Read every clock in the system.
Step 2. Discard outliers and substitute them by the value of the local clock.
Step 3. Update the clock using the average of these values.
Synchronization is maintained if n > 3t. Why? A faulty clock exhibits two-faced (Byzantine) behavior: it may report different values to different readers.
[Figure: a bad clock reporting different values to two non-faulty clocks]

Lamport and Melliar-Smith's algorithm (continued)
The maximum difference between the averages computed by two non-faulty nodes is 3tδ/n.
To keep the clocks synchronized, 3tδ/n < δ. So 3t < n.
[Figure: bad clocks]
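A hedged sketch of where this bound comes from, writing δ for the clock skew bound. It rests on assumptions the slide does not spell out: non-faulty clocks differ by at most δ, every reader sees the same value for a non-faulty clock (read errors are ignored), and a reading is kept only if it lies within δ of the reader's own clock. The symbol a_i(k), the value node i ends up using for clock k, is introduced here for illustration only.

```latex
\begin{align*}
\text{non-faulty } k &: \quad a_i(k) = a_j(k) = c(k) \\
\text{faulty } k &: \quad |a_i(k) - a_j(k)|
   \le |a_i(k) - c(i)| + |c(i) - c(j)| + |c(j) - a_j(k)|
   \le 3\delta \\
\left| \frac{1}{n}\sum_{k} a_i(k) - \frac{1}{n}\sum_{k} a_j(k) \right|
   &\le \frac{1}{n} \sum_{\text{faulty } k} 3\delta
   \le \frac{3t\delta}{n} \\
\frac{3t\delta}{n} < \delta &\iff n > 3t
\end{align*}
```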

External synchronization
A client pulls data from a time server every R units of time, where R < δ/(2ρ). For accuracy, clients must compute the round-trip time (RTT) and compensate for this delay while adjusting their own clocks. (RTTs that are too large are rejected.)
[Figure: clients pulling time from a time server]
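A small sketch of the two calculations on this slide, writing δ (`delta`) for the tolerated skew and ρ (`rho`) for the maximum drift rate. The compensation rule used here, adding half the RTT to the server's timestamp, assumes symmetric network delay; the slide only says that the delay must be compensated for. The function names and the `max_rtt` cutoff are hypothetical.

```python
import math


def max_resync_interval(delta, rho):
    """Upper bound on R: two clocks drifting apart at up to 2*rho
    must not diverge by more than delta between synchronizations."""
    return delta / (2 * rho)


def adjusted_time(server_time, t_send, t_recv, max_rtt):
    """Client-side estimate of the current server time.

    t_send, t_recv: local clock values when the request was sent and
                    when the reply arrived.
    Returns None when the RTT is too large to trust (sample rejected).
    ASSUMPTION: delay is symmetric, so the reply is about RTT/2 old.
    """
    rtt = t_recv - t_send
    if rtt > max_rtt:
        return None
    return server_time + rtt / 2


print(max_resync_interval(delta=0.010, rho=1e-5))   # seconds between pulls
print(adjusted_time(1000.0, t_send=10.0, t_recv=10.4, max_rtt=1.0))
print(adjusted_time(1000.0, t_send=10.0, t_recv=13.0, max_rtt=1.0))
```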

Tiered architecture
- Broadcast mode: least accurate
- Procedure call: medium accuracy
- Peer-to-peer mode: upper-level servers use this for maximum accuracy
The tree can reconfigure itself if some node fails.
[Figure: tree of time servers at Level 0, Level 1, and Level 2]

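Let Q's time be ahead of P's time by δ. P sends a message at T1 and receives the reply at T4 (both on P's clock); Q receives the message at T2 and replies at T3 (both on Q's clock). Then
T2 = T1 + T_PQ + δ
T4 = T3 + T_QP - δ
y = T_PQ + T_QP = T2 + T4 - T1 - T3 (the RTT)
δ = (T2 - T4 - T1 + T3)/2 - (T_PQ - T_QP)/2
Let x = (T2 - T4 - T1 + T3)/2. Since (T_PQ - T_QP)/2 lies between -y/2 and y/2, we get x - y/2 ≤ δ ≤ x + y/2.
Ping several times and obtain the smallest value of y. Use it to calculate δ.
[Figure: message exchange between P and Q with timestamps T1, T2, T3, T4]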
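The formulas above translate directly into code. The sketch below writes δ for the amount by which Q's clock is ahead of P's, computes x = (T2 - T4 - T1 + T3)/2 and y = T2 + T4 - T1 - T3 for each ping, and keeps the sample with the smallest y, since that gives the tightest interval for δ. The function names are hypothetical.

```python
def offset_sample(t1, t2, t3, t4):
    """Return (x, y, (low, high)) for one ping.

    t1, t4: send and receive times on P's clock.
    t2, t3: receive and reply times on Q's clock.
    The true offset delta satisfies low <= delta <= high.
    """
    x = (t2 - t4 - t1 + t3) / 2
    y = t2 + t4 - t1 - t3          # round-trip time
    return x, y, (x - y / 2, x + y / 2)


def best_sample(pings):
    """Pick the ping with the smallest round-trip time y."""
    return min((offset_sample(*p) for p in pings), key=lambda s: s[1])


# Q is really 5 units ahead; one-way delays are 2 (P to Q) and 4 (Q to P).
x, y, (low, high) = offset_sample(t1=100, t2=107, t3=110, t4=109)
print(x, y, low, high)
```

In this example the estimate x is 4, the RTT y is 6, and the true offset 5 lies inside the interval [1, 7].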

1. What problems can occur when a clock value is advanced from 171 to 174?
2. What problems can occur when a clock value is moved back from 180 to 175?
Answers:
1. What happened to the instants 172 and 173? They are skipped.
2. The instants between 175 and 180 appear twice.

Mutual Exclusion
[Figure: processes p0, p1, p2, p3 competing for the critical section (CS)]

Some applications are: 1. Resource sharing 2. Avoiding concurrent update on shared data 3. Controlling the grain of atomicity 4. Medium Access Control in Ethernet 5. Collision avoidance in wireless broadcasts

ME1. At most one process in the CS. (Safety property)
ME2. No deadlock. (Safety property)
ME3. Every process trying to enter its CS must eventually succeed. This is called progress. (Liveness property)
Progress is quantified by the criterion of bounded waiting. It measures a form of fairness by answering the question: between two consecutive CS trips by one process, how many times can other processes enter the CS?
There are many solutions, both in the shared-memory model and in the message-passing model.

Client
do true →
    send request;
    reply received → enter CS;
    send release;
od

Server (busy: boolean; queue of waiting clients)
do request received and not busy → send reply; busy := true
   request received and busy → enqueue sender
   release received and queue is empty → busy := false
   release received and queue not empty → send reply to the head of the queue
od

[Figure: clients exchanging req, reply, and release messages with the server]
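The server's four guarded actions can be sketched as a small Python class. This is an illustrative, single-threaded model: message transport is left out, and each handler simply returns the client that should receive a reply (or `None`). The class and method names are hypothetical, and `busy` is assumed to start as false.

```python
from collections import deque


class Coordinator:
    """Central server granting the critical section to one client at a time."""

    def __init__(self):
        self.busy = False          # ASSUMED initial value
        self.queue = deque()       # clients waiting for a reply

    def on_request(self, sender):
        """Return the client to reply to now, or None if the request is queued."""
        if not self.busy:
            self.busy = True
            return sender
        self.queue.append(sender)
        return None

    def on_release(self):
        """Return the next client to reply to, or None if nobody is waiting."""
        if self.queue:
            return self.queue.popleft()   # busy stays true
        self.busy = False
        return None


server = Coordinator()
print(server.on_request("p0"))   # p0 is granted the CS
print(server.on_request("p1"))   # p1 must wait
print(server.on_release())       # p0 leaves, p1 is granted the CS
```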

- The centralized solution is simple.
- But the server is a single point of failure. This is bad.
- ME1-ME3 are satisfied, but FIFO fairness is not guaranteed. Why?
Can we do better? Yes!

{Life of each process}
1. Broadcast a timestamped request to all.
2. Request received → enqueue the sender in the local Q. Not in CS → send ack. In CS → postpone sending the ack (until exit from CS).
3. Enter CS when (i) you are at the head of your own local Q, and (ii) you have received an ack from all processes.
4. To exit from the CS, (i) delete the request from Q, and (ii) broadcast a timestamped release.
5. Release received → remove the sender from the local Q.
Completely connected topology.
Can you show that it satisfies all the properties (i.e., ME1, ME2, ME3) of a correct solution?

{Lamport's algorithm}
1. Broadcast a timestamped request to all.
2. Request received → enqueue it in the local Q. Not in CS → send ack; else postpone sending the ack until exit from CS.
3. Enter CS when (i) you are at the head of your Q, and (ii) you have received an ack from all.
4. To exit from the CS, (i) delete the request from your Q, and (ii) broadcast a timestamped release.
5. When a process receives a release message, it removes the sender from its Q.
Completely connected topology.
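The local bookkeeping behind steps 2, 3, and 5 can be sketched as below. This models only one process's state, not the message transport or the postponed acks. Ordering the queue by `(timestamp, pid)` pairs, so that ties are broken by process id, is an assumption of this sketch; the class and method names are hypothetical.

```python
import heapq


class LamportQueue:
    """Local request queue and ack set of one process."""

    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = set(peers)    # every other process
        self.queue = []            # heap of (timestamp, pid)
        self.acks = set()

    def on_request(self, ts, sender):
        # Step 2: enqueue the request (our own requests go here too).
        heapq.heappush(self.queue, (ts, sender))

    def on_ack(self, sender):
        self.acks.add(sender)

    def on_release(self, sender):
        # Step 5: remove the sender's request from the local queue.
        self.queue = [e for e in self.queue if e[1] != sender]
        heapq.heapify(self.queue)

    def can_enter(self):
        # Step 3: head of own queue AND acks from all peers.
        return (bool(self.queue)
                and self.queue[0][1] == self.pid
                and self.acks == self.peers)


p1 = LamportQueue(pid=1, peers=[0, 2])
p1.on_request(30, 1)       # our own request
p1.on_request(20, 0)       # an older request from process 0
p1.on_ack(0)
p1.on_ack(2)
print(p1.can_enter())      # process 0 is still ahead of us
p1.on_release(0)
print(p1.can_enter())
```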

Can you show that it satisfies all the properties (i.e., ME1, ME2, ME3) of a correct solution?
Observation. Processes deciding to enter the CS must have identical views of their local queues once all acks have been received.
Proof of ME1. At most one process can be in its CS at any time. Suppose not, and both j and k enter their CS. This implies
j in CS ⇒ Qj.ts.j < Qk.ts.k
k in CS ⇒ Qk.ts.k < Qj.ts.j
Impossible.

Proof of ME2 (no deadlock). The waiting chain is acyclic: i waits for j ⇒ i is behind j in all queues (or j is in its CS) ⇒ j does not wait for i.
Proof of ME3 (progress). New requests join the end of the queues, so new requests do not pass the old ones.

Proof of FIFO fairness. timestamp(j) < timestamp(k) ⇒ j enters its CS before k does.
Suppose not, so k enters its CS before j. Then k did not receive j's request before entering. But k received the ack from j for its own request, and j sent that ack after broadcasting its own request. This is impossible if the channels are FIFO.
Message complexity = 3(N-1) (N-1 requests + N-1 acks + N-1 releases).
[Figure: requests Req(20) and Req(30) and an ack exchanged between j and k]

{Ricart & Agrawala's algorithm}
What is new?
1. Broadcast a timestamped request to all.
2. Upon receiving a request, send ack if
   - you do not want to enter your CS, or
   - you are trying to enter your CS, but your timestamp is higher than that of the sender.
   (If you are already in CS, then buffer the request.)
3. Enter CS when you receive an ack from all.
4. Upon exit from CS, send an ack to each pending request before making a new request. (No release message is necessary.)
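The decision in step 2 fits in one small function. A minimal sketch: requests are modeled as `(timestamp, pid)` tuples so that equal timestamps are ordered by process id, which is an assumption of this sketch and is not stated on the slide. The state names and the function name are hypothetical.

```python
def should_ack_now(state, my_request, incoming):
    """Decide whether to ack an incoming request immediately.

    state:      'idle' (not interested), 'trying', or 'in_cs'
    my_request: our own (timestamp, pid) when trying, else None
    incoming:   the sender's (timestamp, pid)
    Returns False when the request must be buffered until we exit the CS.
    """
    if state == "idle":
        return True
    if state == "in_cs":
        return False
    # Trying: ack only if our timestamp is higher (we have lower priority).
    return incoming < my_request


print(should_ack_now("idle", None, (5, 1)))
print(should_ack_now("trying", (30, 0), (20, 1)))   # sender is older: ack
print(should_ack_now("trying", (20, 0), (30, 1)))   # we are older: buffer
```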

{Ricart & Agrawala's algorithm}
ME1. Prove that at most one process can be in CS.
ME2. Prove that deadlock is not possible.
ME3. Prove that FIFO fairness holds even if channels are not FIFO.
Message complexity = 2(N-1) (N-1 requests + N-1 acks; no release message).
[Figure: Req(j), Req(k), and Ack(j) exchanged between j and k, with TS(j) < TS(k)]

Timestamps grow in an unbounded manner. This makes a real implementation impossible. Can we somehow bound the timestamps? Think about it.

{Maekawa's algorithm}
- First solution with a sublinear O(√N) message complexity.
- "Close to" Ricart and Agrawala's solution, but each process is required to obtain permission from only a subset of its peers.

With each process i, associate a subset S_i. Divide the set of processes into subsets that satisfy the following two conditions:
i ∈ S_i
∀ i, j : 0 ≤ i, j ≤ n-1 :: S_i ∩ S_j ≠ ∅
Main idea. Each process i is required to receive permission from S_i only. Correctness requires that multiple processes never receive permission from all members of their respective subsets at the same time.
[Figure: overlapping subsets S_0 = {0, 1, 2}, S_1 = {1, 3, 5}, S_2 = {2, 4, 5}]

Example. Let there be seven processes 0, 1, 2, 3, 4, 5, 6.
S_0 = {0, 1, 2}
S_1 = {1, 3, 5}
S_2 = {2, 4, 5}
S_3 = {0, 3, 4}
S_4 = {1, 4, 6}
S_5 = {0, 5, 6}
S_6 = {2, 3, 6}
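As a quick check, the seven subsets of this example can be tested against the two conditions: every process belongs to its own subset, and every pair of subsets intersects. The helper name `valid_subsets` is hypothetical.

```python
from itertools import combinations

# Subsets from the seven-process example; key i holds the subset of process i.
SUBSETS = {
    0: {0, 1, 2},
    1: {1, 3, 5},
    2: {2, 4, 5},
    3: {0, 3, 4},
    4: {1, 4, 6},
    5: {0, 5, 6},
    6: {2, 3, 6},
}


def valid_subsets(subsets):
    """Check both conditions: i is in its own subset, and every two subsets overlap."""
    contains_self = all(i in s for i, s in subsets.items())
    pairwise = all(subsets[i] & subsets[j]
                   for i, j in combinations(subsets, 2))
    return contains_self and pairwise


print(valid_subsets(SUBSETS))
```

Both conditions hold for this example; removing a shared member from any subset (for instance, dropping 2 from the subset of process 0) breaks the intersection property.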

Version 1 {Life of process i}
1. Send a timestamped request to each process in S_i.
2. Request received → send ack to the process with the lowest timestamp. Thereafter, "lock" (i.e., commit) yourself to that process, and keep others waiting.
3. Enter CS if you receive an ack from each member in S_i.
4. To exit CS, send release to every process in S_i.
5. Release received → unlock yourself. Then send ack to the next process with the lowest timestamp.

ME1. At most one process can enter its critical section at any time.
Let i and j attempt to enter their critical sections.
S_i ∩ S_j ≠ ∅ ⇒ there is a process k ∈ S_i ∩ S_j.
Process k will never send ack to both. So it acts as the arbitrator and establishes ME1.

ME2. No deadlock. Unfortunately, deadlock is possible! Assume 0, 1, 2 want to enter their critical sections.
From S_0 = {0, 1, 2}: 0 and 2 send ack to 0, but 1 sends ack to 1.
From S_1 = {1, 3, 5}: 1 and 3 send ack to 1, but 5 sends ack to 2.
From S_2 = {2, 4, 5}: 4 and 5 send ack to 2, but 2 sends ack to 0.
Now 0 waits for 1, 1 waits for 2, and 2 waits for 0. So deadlock is possible!
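The deadlock in this scenario can be reproduced mechanically. The sketch below records which requester each arbitrator has locked itself to, derives a wait-for graph (requester i waits for whoever holds the lock of a member of its subset), and searches that graph for a cycle. The data layout and helper names are hypothetical.

```python
# Subsets of the three requesting processes.
SUBSETS = {0: {0, 1, 2}, 1: {1, 3, 5}, 2: {2, 4, 5}}

# locked_to[m] = requester that arbitrator m has sent its ack to.
LOCKED_TO = {0: 0, 2: 0, 1: 1, 3: 1, 4: 2, 5: 2}


def wait_for_graph(subsets, locked_to):
    """i waits for r when some member of i's subset is locked to r != i."""
    return {i: {locked_to[m] for m in members if locked_to[m] != i}
            for i, members in subsets.items()}


def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(n) for n in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in graph)


graph = wait_for_graph(SUBSETS, LOCKED_TO)
print(graph)
print(has_cycle(graph))
```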

Avoiding deadlock
If processes receive messages in increasing order of timestamp, then deadlock "could be" avoided. But this is too strong an assumption.
Version 2 uses three additional messages:
- failed
- inquire
- relinquish

New features in version 2
- Send ack and set lock as usual.
- If the lock is set and a request with a larger timestamp arrives, send failed (you have no chance). If the incoming request has a lower timestamp, then send inquire (are you in CS?) to the locked process.
- Receive inquire and at least one failed message → send relinquish. The recipient resets the lock.

- Let K = |S_i|. Let each process be a member of D subsets. When N = 7, K = D = 3. When K = D, N = K(K-1) + 1. So K is of the order √N.
- The message complexity of Version 1 is 3√N. Maekawa's analysis of Version 2 reveals a complexity of 7√N.
- Sanders identified a bug in Version 2 …
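A small worked calculation of the relation N = K(K-1) + 1. Solving the quadratic for K gives K = (1 + √(4N - 3))/2, which grows like √N. The message count below assumes one request, one ack, and one release per subset member, i.e. about 3K messages per CS entry; that breakdown is this sketch's reading of the 3√N figure, and the function names are hypothetical.

```python
import math


def subset_size(n):
    """Solve N = K(K-1) + 1 for K (positive root of K^2 - K + 1 - N = 0)."""
    return (1 + math.sqrt(4 * n - 3)) / 2


def version1_messages(n):
    """ASSUMED breakdown: K requests + K acks + K releases per CS entry."""
    return 3 * subset_size(n)


for n in (7, 13, 21):
    print(n, subset_size(n), version1_messages(n), 3 * math.sqrt(n))
```

For N = 7 this gives K = 3, matching the example, and for N = 13 it gives K = 4.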

Suzuki-Kasami algorithm
The main idea
- Completely connected network of processes.
- There is one token in the network. The holder of the token has the permission to enter CS.
- Any other process trying to enter CS must acquire that token. Thus the token will move from one process to another based on demand.
[Figure: a process announcing "I want to enter CS"]

Process i broadcasts (i, num), where num is the sequence number of the request.
Each process maintains
- an array req: req[j] denotes the sequence number of the latest request from process j. (Some requests will be stale soon.)
Additionally, the holder of the token maintains
- an array last: last[j] denotes the sequence number of the latest visit to the CS by process j.
- a queue Q of waiting processes.
req: array[0..n-1] of integer
last: array[0..n-1] of integer

When a process i receives a request (k, num) from process k, it sets req[k] to max(req[k], num).
The holder of the token
- completes its CS
- sets last[i] := its own num
- updates Q by retaining each process k only if 1 + last[k] = req[k] (this guarantees the freshness of the request)
- sends the token to the head of Q, along with the array last and the tail of Q
In fact, token ≡ (Q, last).

{Program of process j}
Initially, ∀ i: req[i] = last[i] = 0

* Entry protocol *
req[j] := req[j] + 1
Send (j, req[j]) to all
Wait until token (Q, last) arrives
Critical Section

* Exit protocol *
last[j] := req[j]
∀ k ≠ j: k ∉ Q ∧ req[k] = last[k] + 1 → append k to Q;
if Q is not empty → send (tail-of-Q, last) to head-of-Q fi

* Upon receiving a request (k, num) *
req[k] := max(req[k], num)
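The program above can be sketched in Python as follows. This is a single-process model under simplifying assumptions: message delivery is left to the caller, the token is a `(Q, last)` pair held in a `token` attribute, and the case of an idle token holder receiving a request is not handled. The class and method names are hypothetical.

```python
from collections import deque


class SKProcess:
    """State and protocol steps of one process j."""

    def __init__(self, pid, n):
        self.pid = pid
        self.req = [0] * n     # req[k]: latest request number seen from k
        self.token = None      # (Q, last) while this process holds the token

    def entry(self):
        """Entry protocol: returns the request message to broadcast."""
        self.req[self.pid] += 1
        return (self.pid, self.req[self.pid])

    def on_request(self, k, num):
        self.req[k] = max(self.req[k], num)

    def exit_cs(self):
        """Exit protocol: returns (next_holder, token) or None if Q is empty."""
        queue, last = self.token
        last[self.pid] = self.req[self.pid]
        for k in range(len(self.req)):
            # Append only fresh requests that are not already queued.
            if k != self.pid and k not in queue and self.req[k] == last[k] + 1:
                queue.append(k)
        if not queue:
            return None                      # keep the token
        next_holder = queue.popleft()
        self.token = None
        return next_holder, (queue, last)    # tail of Q travels with the token


p0 = SKProcess(0, 5)
p0.token = (deque(), [0] * 5)    # process 0 starts with the token
p0.entry()                       # req becomes [1, 0, 0, 0, 0]
p0.on_request(1, 1)
p0.on_request(2, 1)
print(p0.exit_cs())
```

In this run process 0 appends 1 and 2 to Q, sets last[0] to 1, and hands the token to process 1 with Q = (2).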

Example with five processes (process 0 holds the token first):
1. Initial state: req = [1,0,0,0,0], last = [0,0,0,0,0].
2. 1 and 2 send requests: req = [1,1,1,0,0], last = [0,0,0,0,0].
3. 0 prepares to exit CS: req = [1,1,1,0,0], last = [1,0,0,0,0], Q = (1,2).
4. 0 passes the token (Q and last) to 1: req = [1,1,1,0,0], last = [1,0,0,0,0], Q = (2).
5. 0 and 3 send requests: req = [2,1,1,1,0], last = [1,0,0,0,0], Q = (2,0,3).
6. 1 sends the token to 2: req = [2,1,1,1,0], last = [1,1,0,0,0], Q = (0,3).

[Figure: processes in the tree, including 4 and 7, want to enter their CS]

[Figure: a process sends the token to 6]

The message complexity is O(diameter) of the tree. Extensive empirical measurements show that the average diameter of randomly chosen trees of size n is O(log n). Therefore, the authors claim that the average message complexity is O(log n).
[Figure: 6 forwards the token to 1]

- How many messages are in transit on the internet?
- What is the global state of a distributed system of N processes?
How do we compute these?

Let a $1 coin circulate in a network of a million banks. How can someone count the total $ in circulation? If not counted “properly,” then one may think the total $ in circulation to be one million.
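A toy simulation of how a careless count goes wrong. It assumes a particular adversarial schedule that the slide does not specify: the banks form a ring, the counter visits them one by one, and the coin moves to the next bank right after each visit, so the same coin is seen at every bank. The function name is hypothetical.

```python
def naive_count(n_banks):
    """Visit banks 0..n-1 in order while the single coin keeps moving."""
    coin_at = 0
    total = 0
    for bank in range(n_banks):
        if coin_at == bank:
            total += 1                       # the same coin is counted again
        coin_at = (coin_at + 1) % n_banks    # coin moves on during the walk
    return total


print(naive_count(1_000_000))   # one coin, counted once per bank
```

The count equals the number of banks even though only one dollar exists, because the observations were taken at different times and do not form a consistent picture of the system.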

Major uses in
- deadlock detection
- termination detection
- rollback recovery
- global predicate computation

A cut is a set of events.
(a ∈ consistent cut C) ∧ (b happened before a) ⇒ b ∈ C
If this is not true, then the cut is inconsistent.
[Figure: events a, b, c, d, e, f, g, h, i, j, k, m on processes P1, P2, P3, with Cut 1 (not consistent) and Cut 2 (consistent)]
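The consistency condition (every event that happened before a member of the cut is itself in the cut) can be checked directly. In this sketch a cut is a set of event names and `happened_before` is a set of `(b, a)` pairs meaning that b directly precedes a, either on the same process or as the send of a message that a receives; closure under these direct pairs implies closure under the full happened-before relation. The event names below are made up for illustration and are not the events of the slide's figure.

```python
def is_consistent(cut, happened_before):
    """True when every direct predecessor of an event in the cut is in the cut."""
    return all(b in cut for (b, a) in happened_before if a in cut)


# Hypothetical run: p1 performs s1 then s2; the message sent at s2
# is received at r1 on p2, which then performs r2.
HB = {("s1", "s2"), ("s2", "r1"), ("r1", "r2")}

print(is_consistent({"s1", "s2", "r1"}, HB))   # receive with its send
print(is_consistent({"s1", "r1"}, HB))         # receive without its send
```

The second cut is inconsistent because it contains the receive event r1 but not the send event s2 that happened before it.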

The set of states immediately following a consistent cut forms a consistent snapshot of a distributed system.
- A snapshot that is of practical interest is the most recent one. Let C1 and C2 be two consistent cuts with C1 ⊂ C2. Then C2 is more recent than C1.
- Analyze why certain cuts in the one-dollar bank are inconsistent.

How to record a consistent snapshot? Note that 1. The recording must be non-invasive 2. Recording must be done on-the-fly. You cannot stop the system.