1
Chapter 18.3: Distributed Coordination
2
18.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts
Chapter 18: Distributed Coordination
Chapter 18.1: Event Ordering, Mutual Exclusion, Atomicity
Chapter 18.2: Concurrency Control, Deadlock Handling
Chapter 18.3: Deadlock Prevention (finish up), Election Algorithms (a little bit), Reaching Agreement (a little bit)
3
Chapter Objectives
To present schemes for deadlock detection in a distributed system (deadlock prevention and avoidance have already been covered).
To take a brief look at election algorithms.
To take a brief look at the considerations involved in reaching agreement.
4
Deadlock Detection
5
Deadlock Detection
A deadlock-prevention algorithm may preempt resources even when no deadlock has occurred. This is a real weakness of deadlock prevention, and we want to avoid unnecessary preemptions wherever possible.
To help avoid unnecessary preemptions, we can build a wait-for graph that describes the state of resource allocation. Remember that we are considering only a single resource of each type, so a cycle in the wait-for graph means the system is deadlocked.
The idea behind the wait-for graph is reasonably straightforward; the issue is how to maintain it. The two techniques we consider both require each site to keep its own local wait-for graph.
In a local wait-for graph, the nodes correspond to all processes (local and non-local) that currently hold or request any resource local to that site.
The figure on the next slide shows a system of two sites, each maintaining its own local wait-for graph. Note that P(2) and P(3) appear in both graphs, which indicates that these processes have requested resources at both sites.
6
Two Local Wait-For Graphs
Both local wait-for graphs are built in the usual manner for local processes and resources.
When a process P(i) at site S(1) needs a resource held by process P(j) at site S(2), P(i) sends a request message to site S(2). The edge P(i) → P(j) is then inserted into the local wait-for graph of site S(2).
If any local wait-for graph has a cycle, we have a deadlock. BUT the absence of cycles in the local graphs does not mean there are no deadlocks. We must look at the larger picture.
To see this, note that each graph in the figure is acyclic; nevertheless, a deadlock exists in the system.
To prove that a deadlock has NOT occurred, we must show that the UNION of all local graphs is acyclic. The next slide shows that this is not the case here.
7
Global Wait-For Graph
When we take the union of the two local wait-for graphs, it is clear that we do have a cycle, which implies that the system is in a deadlocked state.
There are a number of methods for organizing the wait-for graph in a distributed system. Two common ones are the centralized approach and the fully distributed approach.
Both are very detailed. In the interest of time (and the desire to cover another chapter after this one), we will not go into detail on them in this course. Instead, we will jump to Election Algorithms and Reaching Agreement.
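The union-and-cycle check described for the local wait-for graphs can be sketched in a few lines of Python. This is only an illustration: the edges in `site_1` and `site_2` are hypothetical, chosen so that P2 and P3 appear at both sites as in the text, and are not copied from the figure. The helper names `union_graphs` and `has_cycle` are likewise made up for this sketch.

```python
def union_graphs(*graphs):
    """Merge local wait-for graphs (dict: waiting process -> set of holders)."""
    merged = {}
    for graph in graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_cycle(graph):
    """Depth-first search; reaching a node that is still on the stack means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# Hypothetical local graphs (assumed edges, not the figure's).
site_1 = {"P1": {"P2"}, "P2": {"P3"}}
site_2 = {"P3": {"P4"}, "P4": {"P2"}}

global_graph = union_graphs(site_1, site_2)
print(has_cycle(site_1), has_cycle(site_2), has_cycle(global_graph))
```

Each local graph is acyclic on its own, yet the union contains the cycle P2 → P3 → P4 → P2, which is exactly the situation a purely local check would miss.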
8
Election Algorithms
9
Election Algorithms
We have discussed in a number of places how centralized and fully distributed approaches handle the coordination of transactions.
Given that we understand the role (and possible distribution) of transaction coordinators, what happens when a coordinator becomes unavailable? We must determine where a new copy of the coordinator should be restarted. This is the job of election algorithms.
These algorithms assume that a unique priority number is associated with each active process in the system, and that the priority number of process P(i) is i.
They also assume a one-to-one correspondence between processes and sites.
The coordinator is always the process with the largest priority number. So, when a coordinator fails, the algorithm must elect the active process with the largest priority number. This number is then sent to each active process in the system.
When the former coordinator recovers, it must also be able to identify the new coordinator through the same algorithm.
Two algorithms are typically used to elect a new coordinator: the bully algorithm and the ring algorithm.
10
Bully Algorithm (1 of 2)
This algorithm applies to systems in which every process can send a message to every other process in the system.
Given this assumption, if process P(i) sends a request that is not answered by the coordinator within a time interval T, P(i) assumes that the coordinator has failed. P(i) then acts like a bully and tries to elect itself as the new coordinator.
P(i) sends an election message to every process P(j) with a higher priority number, then waits up to time T for any of these processes to answer.
If there is no response within T, P(i) assumes that all processes with numbers greater than i have failed, and P(i) elects itself the new coordinator.
If an answer is received, P(i) begins a time interval T´, waiting to receive a message that a process with a higher priority number has been elected.
If no such message is received within T´, P(i) assumes that the process with the higher number has failed, and P(i) restarts the algorithm.
11
Bully Algorithm (Cont.)
If P(i) is not the coordinator, then at any time during execution P(i) may receive one of the following two messages from process P(j):
1. P(j) is the new coordinator (j > i). P(i), in turn, records this information.
2. P(j) has started an election (j < i, since election messages go only to higher-numbered processes). P(i) sends a response to P(j) and begins its own election algorithm, provided that P(i) has not already initiated such an election.
The process that completes its algorithm has the highest number and is elected as the coordinator. It will also have sent its number to all active processes with smaller numbers.
After a failed process recovers, it immediately begins execution of the same algorithm, being the bully that it is.
If there are no active processes with higher numbers, the recovered process forces all processes with lower numbers to let it become the coordinator, even if there is a currently active coordinator with a lower number.
You can go through the detailed example of how these elections occur.
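The election rules above can be traced with a small, synchronous Python sketch. It makes simplifying assumptions that are not in the slides: failures are modeled by a fixed set `alive` of priority numbers, a missing reply within T is modeled as "no higher number is in `alive`", and the function name `bully_election` is hypothetical.

```python
def bully_election(initiator, alive):
    """Return the priority number of the coordinator elected when
    `initiator` starts an election among the processes in `alive`."""
    # Election messages go only to processes with a higher number.
    responders = [p for p in alive if p > initiator]
    if not responders:
        # No answer within T: the initiator elects itself.
        return initiator
    # Every responder answers and begins its own election;
    # all of those elections must agree on the same winner.
    winners = {bully_election(p, alive) for p in responders}
    assert len(winners) == 1
    return winners.pop()


# Processes 1, 2 and 4 are up; the old coordinator 5 has failed.
alive = {1, 2, 4}
print(bully_election(1, alive))
```

Whichever process starts the election, the result under these assumptions is the largest number in `alive`, matching the rule that the coordinator is always the active process with the largest priority number.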
12
Ring Algorithm (1 of 2)
No great surprise here: this election algorithm is based on a ring structure, or at least a logical ring if not a physical one.
Communication is as expected: each process sends its messages to its neighbor on the right.
The Active List. The main data structure used by the algorithm is the active list, which contains the priority numbers of all processes active in the system. Each process maintains its own active list.
If process P(i) detects a coordinator failure, it creates a new, initially empty active list. It then sends a message elect(i) to its right neighbor and adds the number i to its active list.
Note the direction of the communications.
13
Ring Algorithm (Cont.)
If P(i) receives a message elect(j) from the process on the left, it must respond in one of three ways:
1. If this is the first elect message it has seen or sent, P(i) creates a new active list with the numbers i and j. It then sends the message elect(i), followed by the message elect(j).
2. If i ≠ j (the message received does not contain P(i)'s own number), P(i) adds j to its active list and forwards the message to its right neighbor.
3. If i = j (P(i) receives its own message elect(i) back), the active list for P(i) now contains the numbers of all the active processes in the system. P(i) can now determine the largest number in the active list to identify the new coordinator process.
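The three cases can be exercised with a short simulation. This is a sketch under assumptions that go beyond the slides: `ring` lists the priority numbers of the live processes in ring order (each sends to the next entry, wrapping around), messages are delivered in the order they were sent, and the names `ring_election`, `active` and `decided` are hypothetical.

```python
from collections import deque


def ring_election(ring, initiator):
    """Simulate the ring algorithm; return {process: coordinator it determined}."""
    position = {p: k for k, p in enumerate(ring)}

    def right(p):
        return ring[(position[p] + 1) % len(ring)]

    # The initiator adds its own number and sends elect(initiator) to the right.
    active = {initiator: [initiator]}
    decided = {}
    queue = deque([(right(initiator), initiator)])  # (recipient i, number j)

    while queue:
        i, j = queue.popleft()
        if i not in active:
            # Case 1: first elect message seen; send elect(i), then elect(j).
            active[i] = [i, j]
            queue.append((right(i), i))
            queue.append((right(i), j))
        elif i != j:
            # Case 2: record j and forward the message to the right.
            if j not in active[i]:
                active[i].append(j)
            queue.append((right(i), j))
        else:
            # Case 3: own message came back, so the active list is complete.
            decided[i] = max(active[i])
    return decided


print(ring_election([3, 7, 1, 5], initiator=1))
```

In this run every process ends up with all four numbers in its active list and independently picks 7 as the coordinator.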
14
Reaching Agreement
15
Reaching Agreement (directly from book)
Normally, application processes wish to agree on a common value. Such agreement, however, may not take place because of:
A faulty communication medium, which may result in lost or garbled messages, or
Faulty processes. Processes may send garbled or otherwise incorrect messages to other processes, or may be flawed in other ways that lead to unpredictable behavior.
In short, we can have a mess. We can hope that processes fail in a clean manner, but a process can fail badly: it can send garbled or incorrect messages to other processes, or even collaborate with other failed processes in an attempt to destroy the integrity of the system.
So let's look more closely at reaching agreement.
16
Reaching Agreement – Unreliable Communications
Approach 1: assume that processes fail in a clean manner and that the communication medium is unreliable.
Suppose that process P(i) at site S(1), which has sent a message to process P(j) at site S(2), needs to know whether P(j) has received the message so that it can decide how to proceed with its computation. For example, P(i) may decide to compute a function foo if P(j) has received its message, or a function boo if P(j) has not received the message (because of some hardware failure).
We can use a time-out scheme similar to the one described earlier to detect failures. When P(i) sends out a message, it also specifies a time interval during which it is willing to wait for an acknowledgment message from P(j). When P(j) receives the message, it immediately sends an acknowledgment to P(i).
If P(i) receives the acknowledgment within the specified time interval, it can safely conclude that P(j) has received its message. If a time-out occurs instead, P(i) needs to retransmit its message and wait for an acknowledgment.
This procedure continues until P(i) either gets the acknowledgment back or is notified by the system that site S(2) is down. In the first case P(i) computes foo; in the second it computes boo.
Note that, if these are the only two viable alternatives, P(i) must wait until it has been notified that one of the situations has occurred.
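The retransmit-until-resolved loop can be sketched as follows. This is an illustration only: real message passing is replaced by a pre-scripted sequence of outcomes, one per transmission attempt, and the outcome labels (`"ack"`, `"timeout"`, `"site_down"`) and the function name `decide` are assumptions made for this sketch.

```python
def decide(outcomes):
    """Return (function to compute, number of transmissions) for P(i).

    `outcomes` yields what happens after each transmission:
    "ack"       - the acknowledgment arrived within the time interval
    "timeout"   - nothing arrived in time, so P(i) retransmits
    "site_down" - the system reports that site S(2) is down
    """
    attempts = 0
    for outcome in outcomes:
        attempts += 1
        if outcome == "ack":
            return "foo", attempts
        if outcome == "site_down":
            return "boo", attempts
        # Time-out: loop around and retransmit the message.
    # Neither alternative has occurred yet, so P(i) must keep waiting.
    raise RuntimeError("undecided: no acknowledgment and no failure notice")


print(decide(["timeout", "timeout", "ack"]))
print(decide(["timeout", "site_down"]))
```

The `RuntimeError` branch reflects the last point above: with only these two alternatives, P(i) cannot decide until one of them has actually occurred.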
17
Reaching Agreement – Unreliable Communications
Suppose now that P(j) also needs to know that P(i) has received its acknowledgment message, so that it can decide how to proceed with its computation. For example, P(j) may want to compute foo only if it is assured that P(i) got its acknowledgment.
In other words, P(i) and P(j) will compute foo if and only if both have agreed on it.
It turns out that, in the presence of failure, it is not possible to accomplish this task. More precisely, it is not possible in a distributed environment for processes P(i) and P(j) to agree completely on their respective states.
18
Reaching Agreement – Unreliable Communications
To prove this claim, suppose that a minimal sequence of message transfers exists such that, after the messages have been delivered, both processes agree to compute foo.
Let m’ be the last message sent by P(i) to P(j). Since P(i) does not know whether its message will arrive at P(j) (the message may be lost due to a failure), P(i) will execute foo regardless of the outcome of the message delivery.
Thus, the message m’ could be removed from the sequence without affecting the decision procedure. Hence the original sequence was not minimal, contradicting our assumption and showing that no such sequence exists (proof by contradiction).
The processes can never be sure that both will compute foo.
19
End of Chapter 18.3