Presentation is loading. Please wait.

Presentation is loading. Please wait.

A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University.

Similar presentations


Presentation on theme: "A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University."— Presentation transcript:

1 A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University Taiwan, R. O. C.

2  The slides are for the extended version: http://www.csie.ncu.edu.tw/~jrjiang/ PDCAT04-Extended.pdf http://www.csie.ncu.edu.tw/~jrjiang/ PDCAT04-Extended.pdf  We’ve included a deadlock-free and starvation-free conflict resolution mechanism.  We’ve proposed pre-release action and conditional inquiring to achieve the k-concurrency property.

3 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

4 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

5 Distributed Systems  A distributed system consists of interconnected, autonomous nodes which communicate with each other by passing messages. Interconnected Network node a node b node c node d node e

6 Mutual Exclusion  A node in the system may need to enter the critical section (CS) occasionally to access a shared resource, such as a shared file or a shared table, etc.  How to control the nodes so that the shared resource is accessed by at most one node at a time is called the mutual exclusion problem.

7 Mutual Exclusion Example in CS node a node b node c node d node e

8 Mutual Exclusion Example in CS node a node b node c node d node e

9 k-Mutual Exclusion  If there are k, k  1, identical copies of shared resources, such as a k-user software license, then there can be at most k nodes accessing the resources at a time.  This raises the k-mutual exclusion problem.

10 k-Mutual Exclusion Example in CS k=2 node a node b node c node d node e

11 k-Mutual Exclusion Example in CS k=2 node a node b node c node d node e

12 h-out of k-mutual exclusion  On some occasions, a node may require to access h (1  h  k) copies out of the k shared resources at a time; for example, a node may need h disks from a pool of k disks to proceed.  How to control the nodes to acquire the desired number of resources with the total number of resources accessed concurrently not exceeding k is called the h-out of-k mutual exclusion problem or the h-out of-k resource allocation problem.

13 h-out of-k Mutual Exclusion Example in CS (h=2) in CS (h=1) k=3 node a node b node c node d node e

14 h-out of-k Mutual Exclusion Example in CS (h=1) k=3 node a node b node c node d node e

15 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

16 h-out of-k ME Algorithms  Raynal (1991): uses request broadcast  Baldoni et al. (1998): uses k-arbiters  Manabe et al. (2004): uses (h, k)-arbiters  Jiang (2004): uses k-coteries

17 Jiang’s Algorithm  Among the four algorithms, only Jiang’s algorithm using k-coteries is fault-tolerant.  It can tolerate node and/or network link failures even when the failures lead to network partitioning.  It has lower message cost than others.

18 k-Coterie

19 Example of k-Coteries  The collection of sets (Q1={3, 4}, Q2={3, 5}, Q3={4, 5}, Q4={1, 3}, Q5={1, 4}, Q6={1, 5}, Q7={2, 3}, Q8={2, 4} Q9={2, 5}) is a 2-coterie  There are at most 2 mutually disjoint quorums  For any quorum Q, we can find a quorum Q such that Q and Q are mutually disjoint  Every quorum is minimal

20 Basic Idea of Jiang’s Alg.  A node should select h mutually disjoint sets and collect permissions from all the nodes of the h sets to enter CS for accessing h resources.  To render the algorithm fault-tolerant, a node is demanded to repeatedly reselect h mutually disjoint sets for gathering incremental permissions when a node fails to gather enough permissions to enter CS after a time-out period.

21 Drawbacks of Jiang’s Alg.  First, it does not specify explicitly how a node can efficiently select and reselect h mutually disjoint sets.  Second, when there is contention, a low-priority node always yields its gathered permissions to high-priority nodes, which causes higher message overhead and may prohibit nodes from entering CS concurrently.

22 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

23 Overview of the Proposed Alg.  Using a specific k-coterie  cohorts coterie  Having constant message cost in the best case  A candidate to achieve the highest availability among all the algorithms using k-coteries  Achieving k-concurrency by pre-release action and conditional inquiring

24 Cohorts Structure Coh(k, m)

25 Example of Coh(2, 4)  (C 1 ={13,14}, C 2 ={9,10,11,12}, C 3 ={5,6,7,8}, C 4 ={1,2,3,4}) is a Coh(2,4). 14 12 11 10 4 3 13 2 6 l5 9 7 8 C1C1 C2C2 C3C3 C4C4

26 Quorum under Coh(k, m)

27 Construction of Quorums under Coh(2, 4)  one primary cohort  with supporting cohorts at rear  E.G.: {3, 6, 7, 8} 14 12 11 10 4 3 13 2 6 l5 9 7 8 14 12 11 10 4 3 13 2 6 15 9 7 8 14 12 11 10 4 3 13 2 6 1 5 9 7 8 There will be no disjoint quorum to be formed. and {2, 5, 9, 10,11}

28 Requesting h Resources  When a node u wants to enter CS to access h resources, u should invoke Get_Quorum(h, k, (C 1,...,C m )) and waits for it to return.  h, k: integers  (C 1,...,C m ): Cohorts Structure

29 Case 1 Case 2 Case 3

30 Probe(C i, g)

31 Case 1 for Probe(C i, g) to return

32 Case 2 for Probe(C i, g) to return

33 Case 3 for Probe(C i, g) to return

34 Probe(C i, g) waits  It is noted that Probe(C i, g) will postpone the return if none of the three cases stands, which means no node in C i can reply to grant its permission immediately.

35 How to avoid deadlock and starvation?  Adopting mechanism similar to Maekawa’s using timestamped messages.  We apply new mechanisms  pre-release action  conditional inquiring to also achieve k-concurrency

36 k-Concurrency  When the number of resources that are currently used or requested does not exceed k, a low- priority node should not be prohibited from entering CS by a high-priority node.

37 Pre-release

38 Six Types of Messages  REQUEST  LOCKED  RELEASE  PRE-RELEASE  INQUIRE  RELINQUISH Comparison: Maekawa’s algorithm uses the following six messages: REQUEST LOCKED FAILED RELEASE INQUIRE RELINQUISH

39 On Receiving REQUEST  On receiving a REQUEST from node u, a node v checks it is currently locked for another REQUEST. If not so, v marks itself locked, set u as the locker, records the number h of resources that u requests, and sends a LOCKED message to u.

40 Two Local Priority Queues  R-QUEUE: On receiving a REQUEST from node u, if v is locked for a REQUEST from another node w (w is the locker), the REQUEST from node u is inserted into R-QUEUE  P-QUEUE: On receiving a PRE-RELEASE message from node u, node v inserts the message into P-QUEUE.

41 Timestamp

42 Conflict Condition

43 On Receiving PRE-RELEASE  On receiving a PRE-RELEASE message from node u (u must be the locker), node v inserts the message into P-QUEUE.  It then marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R- QUEUE the node w, sets w as locker, and sends w a LOCKED message, where w is the node at the front of R-QUEUE.

44 Conditional Inquiring  Node v sends an INQUIRE message to node w only if the conflict condition holds for the locker w.  It is not necessary to send the INQUIRY message if an INQUIRY has already sent to w and w has not yet sent RELINQUISH or RELEASE (we will explain the two messages later).

45 On Receiving INQUIRE  When node w receives an INQUIRE message from node v, it replies a RELINQUISH message to cancel its lock if it is not in CS.  Otherwise, it replies a RELEASE message, but only after it exits CS. If an INQUIRE message has arrived after w has sent a RELEASE message, it is simply ignored.

46 On Receiving RELINQUISH  On receiving a RELINQUISH message form w (w must be the locker), node v swaps w with u, sets u as the locker, and sends a LOCKED message to u, where u is the node at the front of R-QUEUE.

47 Effective Permission  Node u is said to have an effective permission of node v if u has received LOCKED from v and does not sent corresponding RELINQUISH or RELEASE to v.

48 Entering CS

49 Exiting CS  After existing CS, node u should send RELEASE message to all the nodes to which u has sent REQUEST.

50 On Receiving RELEASE  On receiving a RELEASE message from node u (u may or may not be the locker), node v removes u’s PRE-RELEASE message from P-QUEUE if u’s PRE-RELEASE message is in P-QUEUE.  Node v marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R-QUEUE the REQUEST message of node w, sets w as locker, and sends w a LOCKED message, where w is the node whose REQUEST is at the front of R-QUEUE.

51 3 2 l requester 1 2 3 REQUEST LOCKED REQUEST LOCKED PR-ERELEASE turn-around time in CS RELEASE Illustration #1 3 nodes for k=1 time Node 2 maybe fails. But the requester can still succeed to enter CS.

52 5 2 l 5 nodes for k=2 Node 5 maybe fails. But two requesters can still succeed to enter CS concurrently. 4 3 Illustration #2

53 requester 1 1 2 3 REQUEST LOCKED REQUEST LOCKED PRE-RELEASE turn-around time in CS Illustration #2 time 4 5 LOCKED requester 2 REQUEST LOCKED PRE-RELEASE PRE-RELESE REQUEST LOCKED PRE-RELEASE turn-around time in CS turn-around time 5 2 l 4 3 Requesters 1 and 2 can be in CS concurrently.

54 requester 1 1 2 3 REQUEST LOCKED REQUEST LOCKED PRE-RELEASE turn-around time in CS Illustration #3 time 4 5 LOCKED requester 2 REQUEST LOCKED PRE-RELEASE PRE-RELESE REQUEST LOCKED PRE-RELEASE turn-around time in CS turn-around time Now suppose that there is a new requester 3 at time t, which requests one resource. 5 2 l 4 3 time t The requester 3 will first send REQUESTs to nodes 1, 2 and 3. All the three nodes will check the conflict condition and perform the conditional inquiring to achieve k-concurrency.

55 R2 Illustration #3 R3 locker: requester 1 for 1 resource P-QUEUE: R-QUEUE: R1 locker: requester 2 for 1 resource P-QUEUE: R-QUEUE: The local state of node 1The local state of node 2 R3 Suppose the priority order is R3>R2>R1 The conflict condition holds for requester 1. The conflict condition does not hold for requester 2.

56 R2 Illustration #3 R3 locker: requester 1 for 1 resource P-QUEUE: R-QUEUE: R1 locker: requester 2 for 1 resource P-QUEUE: R-QUEUE: The local state of node 1The local state of node 2 R3 If the requester 3 requests 2 resoueces conflict condition holds for requester 1. The conflict condition holds for requesters 1 and 2.

57 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

58 Analysis  Message Cost:  Best Case: 4c  h, where c is the cohort size, c > 2k  2  one REQUEST, LOCKED and RELEASE for each node in (C m  …  C m  h  1 ) and one PRERELEASE for the nodes in (C m  …  C m  h  1 )  R.  Worst Case: 7n, where n is the number of nodes  one REQUEST, INQUIRE, RELINQUISH, LOCKED, RELEASE, and LOCKED for each nodes and one RELEASE for those not in R. (it occurs only when there are conflicting nodes.)

59 Comparison

60 Outline  Introduction  Related Work  The proposed Algorithm  Analysis and Comparison  Conclusion

61 Conclusion  The proposed algorithm becomes a k-mutual exclusion algorithm for k>h=1, and becomes a mutual exclusion algorithm for k=h=1.  It is resilient to node and/or link failures and has constant message cost in the best case.  It is a candidate to achieve the highest availability among all the algorithms using k-coteries since the cohorts coterie is ND.  It has the k-concurrency property, which guarantees that a low-priority node is not postponed by a high- priority node when there are not conflicting nodes.

62 Thanks!!

63 Dominated k-Coteries

64 Nondominated k-Coteries  Since an available quorum implies an available entry to CS, we should always concentrate on ND (nondominated) k-coteries that no other k- coterie can dominate.  The algorithm using ND k-coteries, for example the proposed algorithm, is a candidate to achieve the highest availability.


Download ppt "A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University."

Similar presentations


Ads by Google