DISTRIBUTED SYSTEM MUTUAL EXCLUSION (UNIT-3)
DISTRIBUTED MUTUAL EXCLUSION Concurrent access to a shared resource (critical region) by several uncoordinated processes located on different sites must be serialized to preserve the integrity of the shared resource. The major differences compared with the single-processor ME problem are: (1) there is no shared memory and (2) there is no common physical clock. Classification of mutual exclusion algorithms: non-token-based algorithms and token-based algorithms.
Requirements of Distributed ME Mutual exclusion: guarantee that only one request accesses the CR at a time. Freedom from deadlock: two or more sites (hosts) should not endlessly wait for messages that will never arrive. Freedom from starvation: a site should not be forced to wait indefinitely to access the CR. Fairness: requests must be executed in the order they are made (fairness implies freedom from starvation, but not the reverse). Fault tolerance: in the case of a failure, the algorithm can reorganize itself so that it continues to function without disruption.
HOW TO MEASURE THE PERFORMANCE
Number of messages necessary per CS invocation; the synchronization delay (SD); the response time; and the system throughput = 1 / (SD + E), where SD is the synchronization delay and E is the average critical-section execution time.
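As a quick illustration of how these metrics combine (the numbers below are assumed for the example, not taken from the slides), the throughput formula can be evaluated directly:

```python
# Illustrative values only (assumptions for this example):
T = 0.010   # one-way communication delay, 10 ms
E = 0.030   # average critical-section execution time, 30 ms

# For the centralized scheme described next, SD = 2T
# (RELEASE to the coordinator followed by a grant to the next requester).
SD = 2 * T
throughput = 1 / (SD + E)   # system throughput = 1 / (SD + E)
print(f"SD = {SD*1000:.0f} ms, throughput = {throughput:.0f} CS executions per second")
# -> SD = 20 ms, throughput = 20 CS executions per second
```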
SOLUTIONS TO MUTUAL EXCLUSION There are three different approaches to mutual exclusion in a distributed system: the centralized approach, the distributed approach, and the token ring approach.
Centralized Solution A coordinator C queues up the requests from the processes (p1, p2, p3, ...) and grants the CR one by one. 3 messages per CR invocation: REQ, ACK, REL. Single point of failure; the control site is a bottleneck. SD = 2T, where T is the communication delay; system throughput ST = 1/(2T + E).
a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted. b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply. c) When process 1 exits the critical region, it tells the coordinator, which then replies to process 2. As shown in figure b), the coordinator does not reply to process 2 while the critical region is occupied. Depending on the type of system, the coordinator may instead reply to process 2 that it has been queued. If the coordinator does not do so, the waiting process 2 is unable to distinguish between "permission denied" and a dead coordinator. This type of system has a single point of failure: if the coordinator fails, the entire system goes down.
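A minimal sketch of the coordinator bookkeeping described above. The class and message names (Coordinator, send, "GRANT") are illustrative assumptions, and the message transport is left abstract.

```python
from collections import deque

class Coordinator:
    """Centralized mutual-exclusion coordinator (illustrative sketch)."""
    def __init__(self, send):
        self.send = send        # send(dest, msg): message transport supplied by the caller
        self.holder = None      # process currently inside the critical region, if any
        self.queue = deque()    # waiting requesters, served first-come first-served

    def on_request(self, pid):
        if self.holder is None:
            self.holder = pid
            self.send(pid, "GRANT")   # permission granted immediately (the ACK/OK message)
        else:
            self.queue.append(pid)    # region busy: queue the request, send no grant yet

    def on_release(self, pid):
        assert pid == self.holder
        self.holder = None
        if self.queue:                # grant the region to the longest-waiting process
            self.holder = self.queue.popleft()
            self.send(self.holder, "GRANT")
```

With this structure each CS entry costs exactly the three messages listed above (REQ, the grant/ACK, and REL).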
DISTRIBUTED ALGORITHM The processes communicate using messages only and there is no global controller. When a process wants to enter a critical region, it builds a message containing: the name of the critical region, its process number, and its current time (timestamp).
DISTRIBUTED ALGORITHM (CONTD…) The process sends this message to all the processes in the network. When another process receives this message, it takes an action depending on its own state with respect to the critical region mentioned. Three cases are possible: 1) If the receiving process is not in the critical region and does not wish to enter it, it sends back an OK message. 2) If the receiver is already in the critical region, it does not reply; it queues the request. 3) If the receiver also wants to enter the same critical region but has not yet done so, it compares the timestamp of the incoming message with the one it sent to the others for permission; the lowest timestamp wins and may enter the critical region. When a process exits the critical region, it sends an OK message to inform everyone.
DISTRIBUTED ALGORITHM (CONTD…) a) Two processes want to enter the same critical region at the same moment. b) Process 0 has the lowest timestamp, so it wins. c) When process 0 is done, it sends an OK as well, so process 2 can now enter the critical region.
TOKEN RING ALGORITHM: a logical ring is constructed in which each process is assigned a position in the ring. The ring positions may be allocated in numerical order of network addresses or by some other means. When the ring is initialized, process 0 is given a token. The token circulates around the ring: it is passed from process k to process k+1 in point-to-point messages. When a process acquires the token from its neighbour, it checks to see whether it is attempting to enter a critical region. If so, the process enters the region, does all the work it needs to, and leaves the region. After it has exited, it passes the token along the ring. It is not permitted to enter a second critical region using the same token. If a process is handed the token by its neighbour and is not interested in entering a critical region, it just passes the token along. As a consequence, when no process wants to enter any critical region, the token simply circulates at high speed around the ring.
TOKEN RING ALGORITHM (CONTD…) a) An unordered group of processes on a network. b) A logical ring constructed in software.
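A sketch of one process's behaviour in the token ring, under the assumption that the application sets a wants_cs flag and that send delivers the token to the ring neighbour; these names are ours, not part of the original description.

```python
class RingNode:
    """One process in the token-ring algorithm (illustrative sketch)."""
    def __init__(self, my_id, n_processes, send):
        self.my_id = my_id
        self.next_id = (my_id + 1) % n_processes   # neighbour in the logical ring
        self.send = send                           # send(dest, msg): transport supplied by caller
        self.wants_cs = False                      # set by the application when CS entry is needed

    def on_token(self):
        if self.wants_cs:
            self.critical_section()                # the token may be used for one CS entry only
            self.wants_cs = False
        self.send(self.next_id, "TOKEN")           # pass the token along in either case

    def critical_section(self):
        pass                                       # critical-region work goes here
```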
NON-TOKEN-BASED ALGORITHMS Lamport's algorithm for mutual exclusion: for all i, 1 ≤ i ≤ N: Ri = {S1, S2, …, SN}. Every site Si keeps a queue, request_queue_i, which contains mutual exclusion requests ordered by their timestamps.
LAMPORT'S ALGORITHM FOR MUTUAL EXCLUSION Requesting the critical section: 1. When a site Si wants to enter the CS, it sends a REQUEST(ts_i, i) message to all the sites in its request set Ri and places the request on request_queue_i. 2. When a site Sj receives the REQUEST(ts_i, i) message from site Si, it returns a timestamped REPLY message to Si and places Si's request on request_queue_j.
LAMPORT'S MUTUAL EXCLUSION ALGO… Executing the critical section: Site Si enters the CS when the following two conditions hold: L1: Si has received a message with timestamp larger than (ts_i, i) from all other sites. L2: Si's request is at the top of request_queue_i.
LAMPORT'S MUTUAL EXCLUSION ALGO… Releasing the critical section: 3. Site Si, upon exiting the CS, removes its request from the top of its request queue and broadcasts a timestamped RELEASE message to all other sites. 4. When a site Sj receives a RELEASE message from site Si, it removes Si's request from its request queue. When a site removes a request from its request queue, its own request may come to the top of the queue, enabling it to enter the CS.
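The state kept by one site Si can be sketched as below. The transport (send) and the exact message encoding are assumptions; the entry conditions L1 and L2 are checked exactly as stated above.

```python
import heapq

class LamportSite:
    """One site in Lamport's mutual exclusion algorithm (illustrative sketch)."""
    def __init__(self, i, sites, send):
        self.i, self.sites, self.send = i, sites, send
        self.clock = 0
        self.queue = []          # request_queue_i: (timestamp, site id) min-heap
        self.last_ts = {}        # highest timestamp seen in any message from each site
        self.my_req = None

    def _tick(self, ts=0):
        self.clock = max(self.clock, ts) + 1

    def request_cs(self):
        self._tick()
        self.my_req = (self.clock, self.i)
        heapq.heappush(self.queue, self.my_req)
        for j in self.sites:
            if j != self.i:
                self.send(j, ("REQUEST", *self.my_req))

    def on_request(self, ts, j):                    # REQUEST(ts, j) received from Sj
        self._tick(ts)
        self.last_ts[j] = max(self.last_ts.get(j, 0), ts)
        heapq.heappush(self.queue, (ts, j))
        self.send(j, ("REPLY", self.clock, self.i))  # timestamped REPLY back to the requester

    def on_reply(self, ts, j):
        self._tick(ts)
        self.last_ts[j] = max(self.last_ts.get(j, 0), ts)

    def can_enter_cs(self):
        # L1: a message with a larger timestamp has been received from every other site
        l1 = all(self.last_ts.get(j, 0) > self.my_req[0]
                 for j in self.sites if j != self.i)
        # L2: our own request is at the top of request_queue_i
        l2 = bool(self.queue) and self.queue[0] == self.my_req
        return l1 and l2

    def release_cs(self):
        self.queue.remove(self.my_req)
        heapq.heapify(self.queue)
        self._tick()
        for j in self.sites:
            if j != self.i:
                self.send(j, ("RELEASE", self.clock, self.i))

    def on_release(self, ts, j):
        self._tick(ts)
        self.last_ts[j] = max(self.last_ts.get(j, 0), ts)
        self.queue = [r for r in self.queue if r[1] != j]   # drop Sj's request
        heapq.heapify(self.queue)
```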
RICART-AGRAWALA ALGORITHM Requesting the Critical Section: 1. When a site Si wants to enter the CS, it sends a timestamped REQUEST message to all the sites in its request set. 2. When site Sj receives the REQUEST message from site Si, it sends a REPLY message to Si if Sj is neither requesting nor executing the CS, or if Sj is requesting and Si's request's timestamp is smaller than Sj's own request's timestamp. The request is deferred otherwise.
RICART-AGRAWALA ALGORITHM CONTD… Executing the Critical Section: 3. Site Si enters the CS after it has received REPLY messages from all the sites in its request set. Releasing the Critical Section: 4. When site Si exits the CS, it sends REPLY messages to all the deferred requests.
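A corresponding sketch of one Ricart-Agrawala site. The (timestamp, site id) pairs are compared lexicographically, so ties are broken by site number; the message format and names are assumptions.

```python
class RicartAgrawalaSite:
    """One site in the Ricart-Agrawala algorithm (illustrative sketch)."""
    def __init__(self, i, sites, send):
        self.i, self.sites, self.send = i, sites, send
        self.clock = 0
        self.requesting = False
        self.in_cs = False
        self.my_req = None
        self.deferred = []            # sites whose REQUESTs we answer only after exiting the CS
        self.awaiting = set()         # sites from which a REPLY is still expected

    def request_cs(self):
        self.clock += 1
        self.my_req = (self.clock, self.i)
        self.requesting = True
        self.awaiting = {j for j in self.sites if j != self.i}
        for j in self.awaiting:
            self.send(j, ("REQUEST", self.my_req))

    def on_request(self, req, j):
        self.clock = max(self.clock, req[0]) + 1
        # Defer iff we are in the CS, or we are requesting and our own request is older (smaller).
        if self.in_cs or (self.requesting and self.my_req < req):
            self.deferred.append(j)
        else:
            self.send(j, ("REPLY", self.i))

    def on_reply(self, j):
        self.awaiting.discard(j)
        if self.requesting and not self.awaiting:
            self.in_cs = True         # REPLYs from every site in the request set: enter the CS

    def release_cs(self):
        self.in_cs = False
        self.requesting = False
        for j in self.deferred:       # answer all deferred requests on exit
            self.send(j, ("REPLY", self.i))
        self.deferred.clear()
```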
MAEKAWA'S ALGORITHM FOR MUTUAL EXCLUSION To get access to the CS, not all processes have to agree. The set of processes is split into overlapping subsets, and there is consensus within every subset. When a process wishes to enter the CS, it sends a request to every member of its district (its subset).
MAEKAWA'S ALGORITHM FOR MUTUAL EXCLUSION CONTD…
When the process receives replies from all the members of its district, it can enter the CS. When a process receives a request, it responds with a "YES" if it has not already sent a reply to some other process. When a process exits the CS, it informs its district, whose members can then reply to other waiting nodes.
MAEKAWA'S ALGORITHM FOR MUTUAL EXCLUSION CONTD… Request sets: N = {1, 2, ..., N}; Ri ∩ Rj ≠ ∅ for all i, j ∈ N. A site can send a REPLY message only if it has not sent a REPLY message since receiving the last RELEASE message.
REQUESTING THE CRITICAL SECTION: 1. A site Si requests access to the CS by sending REQUEST(i) messages to all the sites in its request set Ri. 2. When a site Sj receives the REQUEST(i) message, it sends a REPLY(j) message to Si provided it has not sent a REPLY message to any site since it received the last RELEASE message. Otherwise it queues up the REQUEST for later consideration.
EXECUTING & RELEASING THE CS Executing the Critical Section: 3. Site Si accesses the CS only after receiving REPLY messages from all the sites in Ri. Releasing the CS: 4. After executing the CS, site Si sends a RELEASE(i) message to all the sites in Ri. 5. When a site Sj receives a RELEASE(i) message from site Si, it sends a REPLY message to the next site waiting in its queue and deletes that entry from the queue. If the queue is empty, then the site updates its state to reflect that it has not sent out any REPLY message.
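A sketch of the arbiter role that a site Sj plays for the members of its request set, matching steps 2 and 5 above; deadlock handling is deliberately omitted here, and the identifiers are illustrative.

```python
from collections import deque

class MaekawaArbiter:
    """Arbiter behaviour of a site Sj in Maekawa's algorithm (illustrative sketch)."""
    def __init__(self, j, send):
        self.j = j
        self.send = send
        self.granted_to = None     # site that holds our REPLY and has not yet sent RELEASE
        self.waiting = deque()     # queued REQUESTs awaiting later consideration

    def on_request(self, i):
        if self.granted_to is None:          # no REPLY outstanding since the last RELEASE
            self.granted_to = i
            self.send(i, ("REPLY", self.j))
        else:
            self.waiting.append(i)           # step 2, "otherwise": queue the REQUEST

    def on_release(self, i):
        assert i == self.granted_to
        if self.waiting:                     # step 5: reply to the next waiting site
            self.granted_to = self.waiting.popleft()
            self.send(self.granted_to, ("REPLY", self.j))
        else:
            self.granted_to = None           # record that no REPLY is outstanding
```

A requesting site enters the CS only after collecting a REPLY from every member of Ri; as the version-1 discussion below shows, this basic scheme can deadlock when conflicting requests lock different arbiters.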
MAEKAWA'S ALGORITHM With each process i, associate a subset Si. Divide the set of processes into subsets that satisfy the following two conditions: (1) i ∈ Si, and (2) ∀ i, j : 0 ≤ i, j ≤ n-1 : Si ⋂ Sj ≠ ∅. Main idea: each process i is required to receive permission from Si only. Correctness requires that multiple processes never receive permission from all members of their respective subsets. (Figure: example subsets S0 = {0, 1, 2}, S1 = {1, 3, 5}, S2 = {2, 4, 5}.)
MAEKAWA'S ALGORITHM Example. Let there be seven processes 0, 1, 2, 3, 4, 5, 6 with S0 = {0, 1, 2}, S1 = {1, 3, 5}, S2 = {2, 4, 5}, S3 = {0, 3, 4}, S4 = {1, 4, 6}, S5 = {0, 5, 6}, S6 = {2, 3, 6}. (The two subset conditions can be verified with the short check below.)
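Both conditions (i ∈ Si and pairwise non-empty intersection) can be checked mechanically for this example; the snippet below is just such a check.

```python
from itertools import combinations

# Request sets from the 7-process example above.
S = {
    0: {0, 1, 2}, 1: {1, 3, 5}, 2: {2, 4, 5}, 3: {0, 3, 4},
    4: {1, 4, 6}, 5: {0, 5, 6}, 6: {2, 3, 6},
}

assert all(i in S[i] for i in S)                        # every process belongs to its own set
assert all(S[i] & S[j] for i, j in combinations(S, 2))  # every pair of sets intersects
print("Maekawa's conditions hold; each set has", len(S[0]), "members")  # 3, roughly sqrt(7)
```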
MAEKAWA'S ALGORITHM Version 1 {Life of process i} 1. Send a timestamped request to each process in Si. 2. On receiving a request: send ack to the process with the lowest timestamp. Thereafter, "lock" (i.e. commit) yourself to that process and keep the others waiting. 3. Enter the CS if you receive an ack from each member of Si. 4. To exit the CS, send release to every process in Si. 5. On receiving a release: unlock yourself, then send ack to the waiting process with the lowest timestamp.
MAEKAWA'S ALGORITHM - VERSION 1 ME1. At most one process can enter its critical section at any time. Let i and j attempt to enter their critical sections. Si ∩ Sj ≠ ∅ implies there is a process k ∈ Si ⋂ Sj. Process k will never send ack to both, so it acts as the arbitrator and establishes ME1.
MAEKAWA'S ALGORITHM - VERSION 1 ME2. No deadlock. Unfortunately, deadlock is possible! Assume 0, 1, 2 want to enter their critical sections. From S0 = {0, 1, 2}, processes 0 and 2 send ack to 0, but 1 sends ack to 1. From S1 = {1, 3, 5}, processes 1 and 3 send ack to 1, but 5 sends ack to 2. From S2 = {2, 4, 5}, processes 4 and 5 send ack to 2, but 2 sends ack to 0. Now 0 waits for 1 (to send a release), 1 waits for 2, and 2 waits for 0. So deadlock is possible!
TOKEN-BASED ALGORITHMS There is one token, shared among all sites; a site can enter its CS iff it holds the token. The major difference among these algorithms is the way the token is searched for. They use sequence numbers instead of timestamps: a sequence number is kept independently for each site and is used to distinguish a site's old requests from its current request.
Token-based ME Algorithms A unique token is shared among all sites. A site is allowed to enter the CR only if it holds the token. Token-based algorithms use sequence numbers instead of timestamps. The correctness (mutual exclusion) proof is trivial; rather, the issues of freedom from starvation and freedom from deadlock are the more important ones.
SUZUKI-KASAMI'S BROADCAST ALGORITHM The node holding the TOKEN can execute the CS repeatedly if no request from other nodes arrives. If a node wants the TOKEN, it broadcasts a REQUEST message to all other nodes: REQUEST(j, n) means node j is requesting its n-th CS invocation, where n = 1, 2, 3, ... is a sequence number. When node i receives REQUEST(j, n), it updates RN_i[j] = max(RN_i[j], n).
SUZUKI-KASAMI'S ALGORITHM CONTD… RN_i[j] = largest sequence number received so far from node j. The TOKEN is TOKEN(Q, LN) (suppose it is at node i): Q is the queue of requesting nodes; LN is an array of size N such that LN[j] = the sequence number of node j's most recently granted request.
SUZUKI-KASAMI'S ALGORITHM CONTD… Requesting the critical section: 1. If the requesting site Si does not have the token, it increments its sequence number RN_i[i] and sends a REQUEST(i, sn) message to all other sites, where sn is the updated RN_i[i]. 2. When a site Sj receives this message, it sets RN_j[i] to max(RN_j[i], sn). If Sj holds the idle token, it sends the token to Si if RN_j[i] = LN[i] + 1.
SUZUKI-KASAMI'S ALGORITHM CONTD… Executing the Critical Section: 3. Site Si executes the CS once it has received the token. Releasing the Critical Section (after executing the CS): 4. Si sets the LN[i] element of the token array equal to RN_i[i]. 5. For every site Sj whose ID is not in the token queue, Si appends Sj's ID to the token queue if RN_i[j] = LN[j] + 1.
SUZUKI-KASAMI'S ALGORITHM CONTD… 6. If the token queue is non-empty after the above update, Si deletes the top site ID from the queue and sends the token to the site indicated by that ID.
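Steps 1-6 can be collected into a per-site sketch like the one below; the transport and message tuples are assumptions, and the token is represented directly as the pair (Q, LN).

```python
class SuzukiKasamiSite:
    """One site in the Suzuki-Kasami broadcast algorithm (illustrative sketch)."""
    def __init__(self, i, n, send, has_token=False):
        self.i, self.n, self.send = i, n, send
        self.RN = [0] * n                          # RN_i[j]: largest request seq # seen from j
        self.token = ([], [0] * n) if has_token else None   # token = (Q, LN), only at the holder
        self.in_cs = False

    def request_cs(self):
        if self.token is not None:
            self.in_cs = True                      # already holding the token: enter directly
            return
        self.RN[self.i] += 1                       # new sequence number sn
        for j in range(self.n):
            if j != self.i:
                self.send(j, ("REQUEST", self.i, self.RN[self.i]))

    def on_request(self, j, sn):
        self.RN[j] = max(self.RN[j], sn)           # keep only the latest request from j
        if self.token is not None and not self.in_cs:
            Q, LN = self.token
            if self.RN[j] == LN[j] + 1:            # j's request is outstanding: pass the idle token
                self.token = None
                self.send(j, ("TOKEN", (Q, LN)))

    def on_token(self, token):
        self.token = token
        self.in_cs = True

    def release_cs(self):
        self.in_cs = False
        Q, LN = self.token
        LN[self.i] = self.RN[self.i]               # record own request as satisfied
        for j in range(self.n):                    # append every site with an outstanding request
            if j != self.i and j not in Q and self.RN[j] == LN[j] + 1:
                Q.append(j)
        if Q:                                      # pass the token to the head of the queue
            j = Q.pop(0)
            self.token = None
            self.send(j, ("TOKEN", (Q, LN)))
```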
SUZUKI-KASAMI EXAMPLE
CONSIDER 5 SITES (Figure: sites S1-S5; initially the token is at site S1, every RN_i = [0, 0, 0, 0, 0], the token's LN = [0, 0, 0, 0, 0], and its queue Q = { } is empty.)
SITE S1 WANTS TO ENTER CS S1 will not execute the requesting part of the algorithm; it enters the CS directly because it already holds the token. When S1 releases the CS, it performs the following tasks: (i) it sets LN[1] = RN1[1] = 0, i.e. the same as before; (ii) if during S1's CS execution a request from another site (say S2) has arrived, that site's ID is now entered into the token queue if RN1[2] = LN[2] + 1. So if the site holding the token (not having received it from another site) enters the CS, the overall state of the system does not change; it remains the same.
SITE S2 WANTS TO ENTER CS S2 does not have the token, so it sends a REQUEST(2, 1) message to every other site. Sites S1, S3, S4 and S5 update their RN arrays, setting RN[2] = 1. It may happen that S2's request arrives at S1 while S1 is executing its CS; at that point S1 will not pass the token to S2. It finishes its CS and then adds S2 to the token queue, because at that point RN1[2] = 1, which is LN[2] + 1. Now, as the queue is not empty, S1 passes the token to S2, removing S2's entry from the token queue.
SITE S2 SENDS REQUEST(2,1) (Figure: S2 broadcasts REQUEST(2,1); each site sets RN[2] = 1. The token is still at site S1, and S2 is inserted into the token queue Q only when S1 leaves the CS.)
SITE S1 SENDS TOKEN TO S2 (Figure: the token is now at site S2 and its queue Q is empty.)
SITE S2 LEAVES CS (Figure: S2 still holds the token, with LN[2] updated to 1 and the queue Q empty.)
NOW SUPPOSE S2 AGAIN WANTS TO ENTER CS S2 can enter the CS because the token queue is empty (i.e. no other site has requested the CS) and the token is currently held by S2. When S2 releases the CS, it sets LN[2] to RN2[2], which is still 1.
SALIENT POINTS A site updates its RN table as soon as it receives a REQUEST message from any other site (at that point the site may hold an idle token or may be executing the CS). When a site enters the CS without sending a request message (i.e. it re-enters the CS while already holding the token), its RN entry and the token's LN entry remain unchanged. A site can be added to the token queue only by the site that is releasing the CS. A site holding the token can repeatedly enter the CS as long as it has not sent the token to some other site. A site gives priority to other sites with outstanding requests for the CS over its own pending requests for the CS.
RAYMOND'S TREE-BASED ALGORITHM Sites are arranged logically as a tree (e.g. a minimal spanning tree); edges are directed toward the site that holds the token. Each site has a variable HOLDER that indicates the location of the TOKEN relative to the site (node) itself.
RAYMOND'S TREE-BASED ALGORITHM CONTD…
Example: HOLDER(A) = D, HOLDER(B) = A, HOLDER(C) = A, HOLDER(D) = E, HOLDER(E) = self, HOLDER(F) = D.
RAYMOND'S TREE-BASED ALGORITHM CONTD…
When a node that does not hold the token (e.g. A) wants to enter the CS, it sends a REQUEST to HOLDER(A), i.e. D. D in turn sends a REQUEST to HOLDER(D), i.e. E. When E no longer needs the TOKEN, it sends the TOKEN to one of the neighbours that requested it.
RAYMOND'S TREE-BASED ALGORITHM CONTD… Si requests entry to the CS: if Si does not hold the token and Qi is empty, then send a request to holder_i; add Si to Qi. Sj receives a request from Si: if Sj is holding the token, send the token to Si and set holder_j to Si; if Sj is not holding the token, place the request in Qj, and if Sj does not have a pending request, send a request to holder_j.
RAYMOND'S TREE-BASED ALGORITHM CONTD… Si receives the token: delete the top entry Sj from Qi; if j = i, enter own critical section; if j ≠ i, then { send the token to Sj; set holder_i to Sj }. Si leaves the CS: if Qi is nonempty, then { delete the top entry Sj from Qi; send the token to Sj; set holder_i to Sj }; if Qi is (still) nonempty, then send a request to holder_i.
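The pseudocode above can be sketched per node as follows. One guard not spelled out in the slides is added: a node that holds the token but is inside its CS queues an incoming request instead of giving the token away. Everything else follows the rules listed above, and the names are illustrative.

```python
from collections import deque

class RaymondNode:
    """One node in Raymond's tree-based algorithm (illustrative sketch)."""
    def __init__(self, i, holder, send):
        self.i = i
        self.holder = holder      # parent on the path toward the token; i itself if holding it
        self.send = send
        self.queue = deque()      # pending requests: own id or ids of requesting neighbours
        self.in_cs = False

    def request_cs(self):
        if self.holder == self.i and not self.queue:
            self.in_cs = True                      # token is local and idle: enter directly
            return
        if not self.queue:                         # no request outstanding yet on this path
            self.send(self.holder, ("REQUEST", self.i))
        self.queue.append(self.i)

    def on_request(self, j):
        if self.holder == self.i and not self.in_cs and not self.queue:
            self.holder = j                        # idle token: hand it straight to the requester
            self.send(j, ("TOKEN",))
            return
        if not self.queue and self.holder != self.i:
            self.send(self.holder, ("REQUEST", self.i))
        self.queue.append(j)                       # place the request in our queue

    def on_token(self):
        j = self.queue.popleft()                   # see who the token is for
        if j == self.i:
            self.holder = self.i
            self.in_cs = True                      # enter own critical section
        else:
            self.holder = j                        # forward the token toward the requester
            self.send(j, ("TOKEN",))
            if self.queue:                         # more requests still waiting behind it
                self.send(self.holder, ("REQUEST", self.i))

    def release_cs(self):
        self.in_cs = False
        if self.queue:
            j = self.queue.popleft()
            self.holder = j
            self.send(j, ("TOKEN",))
            if self.queue:                         # still non-empty: ask for the token back
                self.send(self.holder, ("REQUEST", self.i))
```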
RAYMOND'S TREE-BASED ALGORITHM CONTD…
Initially, P0 holds the token. Also, P0 is the current root.
P3 wants the token to get into its critical section. So, P3 adds itself to its own FIFO queue and sends a request message to its parent P2.
P2 receives the request from P3. It adds P3 to its FIFO queue and passes the request message to its parent P1.
P1 receives the request from P2. It adds P3 to its FIFO queue and passes the request message to its parent P0.
At this point, P2 also wants the token. Since its FIFO queue is already non-empty (a request has already been forwarded), it simply adds itself to its own FIFO queue without sending another request message.
P0 receives the request message from P3 through P1. It surrenders the token and passes it on to P1. It also changes the direction of the arrow between them, making P1 the root temporarily.
P1 removes the top element of its FIFO queue to see which node requested the token. Since the token needs to go to P3, P1 surrenders the token and passes it on to P2. It also changes the direction of the arrow between them, making P2 the root temporarily.
P2 removes the top element of its FIFO queue to see which node requested the token. Since the token needs to go to P3, P2 surrenders the token and passes it on to P3. It also changes the direction of the arrow between them, making P3 the root.
Now P3 holds the token and can execute its critical section. It is able to clear the top (and only) element of its FIFO queue. Note that P3 is the current root. In the meantime, P2 checks the top element of its FIFO queue and realizes that it also needs to request the token. So P2 sends a request message to its current parent, P3, which appends the request to its FIFO queue.
As soon as P3 completes its critical section, it checks the top element of its FIFO queue to see whether the token is needed elsewhere. In this case, P2 has requested it, so P3 sends the token back to P2. It also changes the direction of the arrow between them, making P2 the new root.
P2 holds the token and is able to complete its critical section. Then it checks its FIFO queue, which is empty, so it keeps the token until some other node requests it.
Comparison of Distributed ME Algorithms (LL: Light Load, HL: Heavy Load)

Algorithm        | Response time | SD          | # of messages (LL) | # of messages (HL)
Lamport          | 2T + E        | T           | 3(n - 1)           | 3(n - 1)
Ricart-Agrawala  | 2T + E        | T           | 2(n - 1)           | 2(n - 1)
Maekawa          | 2T + E        | 2T          | 3(sqrt(n) + 1)     | 5(sqrt(n) + 1)
Suzuki-Kasami    | 2T + E        | T           | n                  | n
Singhal          | 2T + E        | T           | n/2                | n
Raymond          | T(log n) + E  | T(log n)/2  | log n              | 4