A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University.

Presentation transcript:

A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University Taiwan, R. O. C.

• The slides are for the extended version: PDCAT04-Extended.pdf
• We’ve included a deadlock-free and starvation-free conflict resolution mechanism.
• We’ve proposed pre-release action and conditional inquiring to achieve the k-concurrency property.

Outline
• Introduction
• Related Work
• The Proposed Algorithm
• Analysis and Comparison
• Conclusion

Distributed Systems
• A distributed system consists of interconnected, autonomous nodes that communicate with each other by passing messages.
[Figure: nodes a, b, c, d and e attached to an interconnected network.]

Mutual Exclusion
• A node in the system may occasionally need to enter the critical section (CS) to access a shared resource, such as a shared file or a shared table.
• The mutual exclusion problem is how to control the nodes so that the shared resource is accessed by at most one node at a time.

Mutual Exclusion Example
[Figure: nodes a, b, c, d and e, with one node in CS.]

k-Mutual Exclusion
• If there are k (k ≥ 1) identical copies of a shared resource, such as a k-user software license, then at most k nodes can access the resources at a time.
• This raises the k-mutual exclusion problem.

k-Mutual Exclusion Example
[Figure: nodes a, b, c, d and e with k=2, showing nodes in CS.]

h-out of-k Mutual Exclusion
• On some occasions, a node may need to access h (1 ≤ h ≤ k) copies out of the k shared resources at a time; for example, a node may need h disks from a pool of k disks to proceed.
• The h-out of-k mutual exclusion problem, also called the h-out of-k resource allocation problem, is how to control the nodes so that each acquires the number of resources it wants while the total number of resources accessed concurrently does not exceed k.
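The safety requirement above amounts to an invariant: the resources held by all nodes in CS never total more than k. A minimal Python sketch of that invariant follows. `ResourcePool` is a hypothetical, centralized bookkeeping class introduced purely for illustration; the algorithm in these slides is distributed and does not use such a class.

```python
class ResourcePool:
    """Centralized bookkeeping for k identical resources (illustration only)."""

    def __init__(self, k):
        self.k = k
        self.held = {}  # node id -> number of resources currently held

    def in_use(self):
        return sum(self.held.values())

    def try_enter(self, node, h):
        """Grant h resources to node if 1 <= h <= k and the total stays within k."""
        if not 1 <= h <= self.k:
            raise ValueError("h must satisfy 1 <= h <= k")
        if node in self.held or self.in_use() + h > self.k:
            return False
        self.held[node] = h
        return True

    def leave(self, node):
        self.held.pop(node, None)


pool = ResourcePool(k=3)
print(pool.try_enter("a", 2))  # True
print(pool.try_enter("c", 1))  # True
print(pool.try_enter("e", 1))  # False: all 3 resources are in use
pool.leave("a")
print(pool.try_enter("e", 2))  # True
```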

h-out of-k Mutual Exclusion Example
[Figure: nodes a, b, c, d and e with k=3; one node is in CS with h=2 and another with h=1.]

h-out of-k Mutual Exclusion Example
[Figure: nodes a, b, c, d and e with k=3; one node is in CS with h=1.]

Outline
• Introduction
• Related Work
• The Proposed Algorithm
• Analysis and Comparison
• Conclusion

h-out of-k ME Algorithms
• Raynal (1991): uses request broadcast
• Baldoni et al. (1998): uses k-arbiters
• Manabe et al. (2004): uses (h, k)-arbiters
• Jiang (2004): uses k-coteries

Jiang’s Algorithm
• Among the four algorithms, only Jiang’s algorithm using k-coteries is fault-tolerant.
• It can tolerate node and/or network link failures even when the failures lead to network partitioning.
• It has a lower message cost than the others.

k-Coterie

Example of k-Coteries
• The collection of sets (Q1={3, 4}, Q2={3, 5}, Q3={4, 5}, Q4={1, 3}, Q5={1, 4}, Q6={1, 5}, Q7={2, 3}, Q8={2, 4}, Q9={2, 5}) is a 2-coterie.
• There are at most 2 mutually disjoint quorums.
• For any quorum Q, we can find a quorum Q′ such that Q and Q′ are mutually disjoint.
• Every quorum is minimal.
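The three properties listed for this example can be checked by brute force. The Python sketch below tests exactly those three statements for k = 2 on the nine quorums; the function names are made up for illustration, and the checks mirror the slide's bullets rather than a general k-coterie definition.

```python
from itertools import combinations

QUORUMS = [frozenset(q) for q in (
    {3, 4}, {3, 5}, {4, 5}, {1, 3}, {1, 4}, {1, 5}, {2, 3}, {2, 4}, {2, 5},
)]


def pairwise_disjoint(group):
    return all(a.isdisjoint(b) for a, b in combinations(group, 2))


def max_disjoint(quorums):
    """Largest number of pairwise disjoint quorums, found by exhaustive search."""
    return max(
        r for r in range(1, len(quorums) + 1)
        if any(pairwise_disjoint(g) for g in combinations(quorums, r))
    )


def every_quorum_has_disjoint_partner(quorums):
    return all(any(q.isdisjoint(p) for p in quorums) for q in quorums)


def all_minimal(quorums):
    # No quorum may be a proper subset of another quorum.
    return not any(a < b for a in quorums for b in quorums)


print(max_disjoint(QUORUMS))                       # 2
print(every_quorum_has_disjoint_partner(QUORUMS))  # True
print(all_minimal(QUORUMS))                        # True
```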

Basic Idea of Jiang’s Alg.
• A node should select h mutually disjoint sets and collect permissions from all the nodes of the h sets to enter CS for accessing h resources.
• To make the algorithm fault-tolerant, a node that fails to gather enough permissions to enter CS within a time-out period must repeatedly reselect h mutually disjoint sets and gather permissions incrementally.

Drawbacks of Jiang’s Alg.
• First, it does not specify explicitly how a node can efficiently select and reselect h mutually disjoint sets.
• Second, when there is contention, a low-priority node always yields its gathered permissions to high-priority nodes, which causes higher message overhead and may prohibit nodes from entering CS concurrently.

Outline
• Introduction
• Related Work
• The Proposed Algorithm
• Analysis and Comparison
• Conclusion

Overview of the Proposed Alg.
• Using a specific k-coterie: the cohorts coterie
• Having constant message cost in the best case
• A candidate to achieve the highest availability among all the algorithms using k-coteries
• Achieving k-concurrency by pre-release action and conditional inquiring

Cohorts Structure Coh(k, m)

Example of Coh(2, 4)
• (C_1 = {13, 14}, C_2 = {9, 10, 11, 12}, C_3 = {5, 6, 7, 8}, C_4 = {1, 2, 3, 4}) is a Coh(2, 4).
[Figure: the four cohorts C_1, C_2, C_3 and C_4.]

Quorum under Coh(k, m)

Construction of Quorums under Coh(2, 4)
• One primary cohort
• With supporting cohorts at the rear
• E.g.: {3, 6, 7, 8} and {2, 5, 9, 10, 11}
• There will be no disjoint quorum to be formed.
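The two example quorums can be checked mechanically. The Python sketch below rests on an assumed definition that the transcript does not spell out: a set is a quorum if it contains all but k − 1 members of one primary cohort, exactly one member of every later (supporting) cohort, and no member of any earlier cohort. The cohort values come from the Coh(2, 4) example; `is_quorum` is a hypothetical helper name.

```python
# C_1 .. C_4 from the Coh(2, 4) example, in order.
COHORTS = [{13, 14}, {9, 10, 11, 12}, {5, 6, 7, 8}, {1, 2, 3, 4}]
K = 2


def is_quorum(q, cohorts=COHORTS, k=K):
    """Check q against the assumed primary/supporting-cohort definition."""
    q = set(q)
    if not q <= set().union(*cohorts):
        return False  # q contains a node outside the cohorts structure
    for i, primary in enumerate(cohorts):
        if any(q & c for c in cohorts[:i]):
            continue  # q must not touch cohorts in front of the primary one
        if len(q & primary) != len(primary) - (k - 1):
            continue  # the primary cohort contributes all but k - 1 members
        if all(len(q & c) == 1 for c in cohorts[i + 1:]):
            return True  # each cohort at the rear contributes exactly one member
    return False


print(is_quorum({3, 6, 7, 8}))        # True: primary C_3, supporting C_4
print(is_quorum({2, 5, 9, 10, 11}))   # True: primary C_2, supporting C_3 and C_4
print(is_quorum({6, 7, 8}))           # False: no member from C_4
```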

Requesting h Resources
• When a node u wants to enter CS to access h resources, u should invoke Get_Quorum(h, k, (C_1, ..., C_m)) and wait for it to return.
• h, k: integers
• (C_1, ..., C_m): cohorts structure

[Figure: Case 1, Case 2 and Case 3.]

Probe(C_i, g)

Case 1 for Probe(C_i, g) to return

Case 2 for Probe(C_i, g) to return

Case 3 for Probe(C_i, g) to return

Probe(C_i, g) waits
• Note that Probe(C_i, g) postpones its return if none of the three cases holds, which means that no node in C_i can reply to grant its permission immediately.

How to avoid deadlock and starvation?
• We adopt a mechanism similar to Maekawa’s, using timestamped messages.
• We apply two new mechanisms to also achieve k-concurrency:
  • pre-release action
  • conditional inquiring

k-Concurrency
• When the number of resources that are currently used or requested does not exceed k, a low-priority node should not be prohibited from entering CS by a high-priority node.

Pre-release

Six Types of Messages
• REQUEST
• LOCKED
• RELEASE
• PRE-RELEASE
• INQUIRE
• RELINQUISH
Comparison: Maekawa’s algorithm uses the following six messages: REQUEST, LOCKED, FAILED, RELEASE, INQUIRE and RELINQUISH.

On Receiving REQUEST
• On receiving a REQUEST from node u, node v checks whether it is currently locked for another REQUEST. If not, v marks itself locked, sets u as the locker, records the number h of resources that u requests, and sends a LOCKED message to u.

Two Local Priority Queues
• R-QUEUE: On receiving a REQUEST from node u, if v is locked for a REQUEST from another node w (w is the locker), the REQUEST from node u is inserted into R-QUEUE.
• P-QUEUE: On receiving a PRE-RELEASE message from node u, node v inserts the message into P-QUEUE.

Timestamp

Conflict Condition

On Receiving PRE-RELEASE
• On receiving a PRE-RELEASE message from node u (u must be the locker), node v inserts the message into P-QUEUE.
• It then marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R-QUEUE the node w at the front of R-QUEUE, sets w as the locker, and sends w a LOCKED message.

Conditional Inquiring
• Node v sends an INQUIRE message to node w only if the conflict condition holds for the locker w.
• It is not necessary to send the INQUIRE message if an INQUIRE has already been sent to w and w has not yet sent RELINQUISH or RELEASE (the two messages are explained later).

On Receiving INQUIRE
• When node w receives an INQUIRE message from node v, it replies with a RELINQUISH message to cancel its lock if it is not in CS.
• Otherwise, it replies with a RELEASE message, but only after it exits CS. If an INQUIRE message arrives after w has sent a RELEASE message, it is simply ignored.

On Receiving RELINQUISH
• On receiving a RELINQUISH message from w (w must be the locker), node v swaps w with u, sets u as the locker, and sends a LOCKED message to u, where u is the node at the front of R-QUEUE.

Effective Permission
• Node u is said to have an effective permission of node v if u has received LOCKED from v and has not sent the corresponding RELINQUISH or RELEASE to v.
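On the requester's side, this definition reduces to simple bookkeeping: a permission becomes effective when LOCKED arrives and stops being effective once the requester sends RELINQUISH or RELEASE back. A minimal Python sketch, with a hypothetical `Requester` class and method names chosen only for illustration:

```python
class Requester:
    """Tracks which nodes' permissions are currently effective for node u."""

    def __init__(self):
        self.effective = set()

    def on_locked(self, v):
        # LOCKED received from v: v's permission becomes effective.
        self.effective.add(v)

    def on_sent_relinquish_or_release(self, v):
        # u has handed the permission back to v, so it is no longer effective.
        self.effective.discard(v)

    def holds_all(self, nodes):
        """True if u has an effective permission of every node in nodes."""
        return set(nodes) <= self.effective


u = Requester()
u.on_locked(3)
u.on_locked(6)
u.on_sent_relinquish_or_release(3)
print(sorted(u.effective))   # [6]
print(u.holds_all({6}))      # True
print(u.holds_all({3, 6}))   # False
```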

Entering CS

Exiting CS
• After exiting CS, node u should send a RELEASE message to all the nodes to which u has sent REQUEST.

On Receiving RELEASE
• On receiving a RELEASE message from node u (u may or may not be the locker), node v removes u’s PRE-RELEASE message from P-QUEUE if u’s PRE-RELEASE message is in P-QUEUE.
• Node v marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R-QUEUE the REQUEST message of node w, sets w as the locker, and sends w a LOCKED message, where w is the node whose REQUEST is at the front of R-QUEUE.
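The REQUEST, PRE-RELEASE and RELEASE rules for node v can be sketched in Python as below. Several points are assumptions rather than part of the slides: R-QUEUE is ordered by (timestamp, node id) with smaller values first, a plain dict stands in for P-QUEUE because the sketch never uses its ordering, the lock is handed on after a RELEASE only when the sender is the current locker, and INQUIRE/RELINQUISH handling is left out because the conflict condition is not given in the transcript. `Arbiter` and its method names are hypothetical.

```python
import heapq


class Arbiter:
    """Node v's local state; `sent` collects outgoing (message, destination) pairs."""

    def __init__(self):
        self.locker = None   # node currently holding v's permission
        self.locker_h = 0    # number of resources the locker requested
        self.r_queue = []    # heap of (timestamp, node, h) for waiting REQUESTs
        self.p_queue = {}    # node -> h for received PRE-RELEASE messages
        self.sent = []

    def _grant_next(self):
        # Hand the lock to the front of R-QUEUE, or become unlocked if it is empty.
        if self.r_queue:
            _, w, h = heapq.heappop(self.r_queue)
            self.locker, self.locker_h = w, h
            self.sent.append(("LOCKED", w))
        else:
            self.locker, self.locker_h = None, 0

    def on_request(self, u, h, timestamp):
        if self.locker is None:
            self.locker, self.locker_h = u, h
            self.sent.append(("LOCKED", u))
        else:
            heapq.heappush(self.r_queue, (timestamp, u, h))

    def on_pre_release(self, u):
        assert u == self.locker, "PRE-RELEASE must come from the locker"
        self.p_queue[u] = self.locker_h
        self._grant_next()

    def on_release(self, u):
        self.p_queue.pop(u, None)
        if u == self.locker:  # assumption: only the locker's RELEASE frees the lock
            self._grant_next()


v = Arbiter()
v.on_request("u", h=1, timestamp=1)  # v is unlocked, so u gets LOCKED
v.on_request("w", h=1, timestamp=2)  # v is locked by u, so w waits in R-QUEUE
v.on_pre_release("u")                # lock passes to w; u is remembered in P-QUEUE
v.on_release("u")                    # u's PRE-RELEASE entry is removed
print(v.sent)    # [('LOCKED', 'u'), ('LOCKED', 'w')]
print(v.locker)  # w
```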

Illustration #1: 3 nodes for k=1
[Timing diagram: the requester's REQUEST, LOCKED, PRE-RELEASE and RELEASE messages along the time axis, with the turn-around time and the time in CS marked.]
Node 2 may fail, but the requester can still succeed in entering CS.

Illustration #2: 5 nodes for k=2
[Figure: nodes 1, 2, 3, 4 and 5.]
Node 5 may fail, but two requesters can still succeed in entering CS concurrently.

Illustration #2
[Timing diagram for nodes 1 to 5: the two requesters exchange REQUEST, LOCKED and PRE-RELEASE messages along the time axis, with each requester's turn-around time and time in CS marked.]
Requesters 1 and 2 can be in CS concurrently.

Illustration #3
[Timing diagram for nodes 1 to 5: the same message exchange as in Illustration #2, with time t marked on the time axis.]
Now suppose that there is a new requester 3 at time t, which requests one resource. Requester 3 will first send REQUESTs to nodes 1, 2 and 3. All three nodes will check the conflict condition and perform conditional inquiring to achieve k-concurrency.

Illustration #3
[Figure: the local state of node 1 (locker: requester 1 for 1 resource) and the local state of node 2 (locker: requester 2 for 1 resource), each with its P-QUEUE and R-QUEUE; the queue entries shown are R1, R2 and R3.]
Suppose the priority order is R3 > R2 > R1. The conflict condition holds for requester 1. The conflict condition does not hold for requester 2.

Illustration #3
[Figure: the local state of node 1 (locker: requester 1 for 1 resource) and the local state of node 2 (locker: requester 2 for 1 resource), each with its P-QUEUE and R-QUEUE; the queue entries shown are R1, R2 and R3.]
If requester 3 requests 2 resources, the conflict condition holds for requesters 1 and 2.

Outline
• Introduction
• Related Work
• The Proposed Algorithm
• Analysis and Comparison
• Conclusion

Analysis
• Message cost:
  • Best case: 4c × h, where c is the cohort size, c > 2k − 2: one REQUEST, LOCKED and RELEASE for each node in (C_m ∪ … ∪ C_{m−h+1}), and one PRE-RELEASE for the nodes in (C_m ∪ … ∪ C_{m−h+1}) − R.
  • Worst case: 7n, where n is the number of nodes: one REQUEST, INQUIRE, RELINQUISH, LOCKED, RELEASE and LOCKED for each node, and one RELEASE for those not in R. (It occurs only when there are conflicting nodes.)
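As a quick arithmetic illustration, the two bounds can be evaluated for the Coh(2, 4) example, where the cohort size is c = 4 and the total number of nodes is n = 14. The Python sketch below reads the best-case bound as the product 4 · c · h, which is an assumption about the operator in the slide; the function names are hypothetical.

```python
def best_case_messages(c, h):
    """Best-case bound, read as 4 * c * h (c = cohort size, h = resources requested)."""
    return 4 * c * h


def worst_case_messages(n):
    """Worst-case bound 7n, where n is the number of nodes."""
    return 7 * n


print(best_case_messages(c=4, h=1))  # 16
print(best_case_messages(c=4, h=2))  # 32
print(worst_case_messages(n=14))     # 98
```

The best-case figure depends only on c and h, not on n, which is what makes it constant in the number of nodes.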

Comparison

Outline
• Introduction
• Related Work
• The Proposed Algorithm
• Analysis and Comparison
• Conclusion

Conclusion
• The proposed algorithm becomes a k-mutual exclusion algorithm for k>h=1, and becomes a mutual exclusion algorithm for k=h=1.
• It is resilient to node and/or link failures and has constant message cost in the best case.
• It is a candidate to achieve the highest availability among all the algorithms using k-coteries, since the cohorts coterie is ND.
• It has the k-concurrency property, which guarantees that a low-priority node is not postponed by a high-priority node when there are no conflicting nodes.

Thanks!!

Dominated k-Coteries

Nondominated k-Coteries
• Since an available quorum implies an available entry to CS, we should always concentrate on ND (nondominated) k-coteries, which no other k-coterie can dominate.
• An algorithm using ND k-coteries, such as the proposed algorithm, is a candidate to achieve the highest availability.