CS377
Today: deadlock detection and election algorithms
Previous class
- Event ordering in distributed systems
- Various approaches for mutual exclusion in distributed systems
  - Centralized approach
  - Token-based approach
  - Fully distributed approach
- Deadlock prevention
Today
- Deadlock detection
  - Global wait-for graph
  - Deadlock detection algorithms
- Election algorithms
Deadlock handling with deadlock detection
- Deadlock prevention may preempt resources even if no deadlock has occurred.
- Deadlock detection is based on wait-for graphs. Each site keeps a local wait-for graph whose nodes correspond to the processes, local or nonlocal, that hold or request resources local to that site.
- A wait-for graph shows the resource-allocation state.
- A cycle in the wait-for graph represents a deadlock.
[Figure: local wait-for graphs at Site A and Site B, over processes P1 to P5]
Global wait-for graphs
- To show that there is NO DEADLOCK, it is not enough to show that no local graph contains a cycle.
- We need to construct the global wait-for graph, which is the union of all local graphs.
[Figure: global wait-for graph over processes P1 to P5]
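The union-then-check idea can be sketched in a few lines of Python. This is only an illustration: the names `union_graphs` and `has_cycle` are made up here, and the edges in `site_a` and `site_b` are hypothetical (they are not read off the slide figure). Neither local graph has a cycle, yet their union does.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # process -> processes it is waiting for


def union_graphs(local_graphs: List[Graph]) -> Graph:
    """Global wait-for graph: the union of the edges of all local graphs."""
    merged: Graph = {}
    for graph in local_graphs:
        for src, dsts in graph.items():
            merged.setdefault(src, set()).update(dsts)
    return merged


def has_cycle(graph: Graph) -> bool:
    """Depth-first search; reaching a node that is on the current path means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    nodes = set(graph) | {d for dsts in graph.values() for d in dsts}
    color = dict.fromkeys(nodes, WHITE)

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


# Hypothetical local graphs (assumed for illustration only).
site_a: Graph = {"P1": {"P2"}, "P2": {"P3"}}
site_b: Graph = {"P3": {"P4"}, "P4": {"P2"}}

print(has_cycle(site_a), has_cycle(site_b))      # no local cycle at either site
print(has_cycle(union_graphs([site_a, site_b])))  # the union contains P2->P3->P4->P2
```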
How to construct the global wait-for graph?
- Centralized approach: the graph is maintained in ONE process, the deadlock-detection coordinator.
- Because of communication delays in the system, there are two types of graph:
  - Real wait-for graph: the real but unknown state of the system
  - Constructed wait-for graph: an approximation generated by the coordinator during the execution of its algorithm
- When is the wait-for graph constructed?
  1. Whenever a local edge is inserted or removed (a message is sent to the coordinator)
  2. Periodically
  3. Whenever the coordinator invokes the cycle-detection algorithm
- What happens if a cycle is detected? The coordinator selects a victim and notifies all processes.
Centralized approach: deadlock detection
- False cycles may exist in the constructed global wait-for graph. Messages reach the coordinator with different delays, so if a remove-edge message arrives after a later add-edge message, the constructed graph can briefly hold both the stale edge and the new one, forming a cycle that never existed in the real graph.
- There is a centralized deadlock detection algorithm based on Option 3 that detects all deadlocks and never reports a false deadlock.
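A small replay makes the false-cycle problem concrete. The sketch below is an assumption-laden illustration: the function names `has_cycle` and `replay`, the initial edges, and the two messages are all hypothetical. The coordinator applies add/remove messages in arrival order, and the same two messages produce a cycle only when the remove message is delayed.

```python
def has_cycle(edges):
    """True if the directed graph given as a set of (u, v) edges contains a cycle."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    on_path, done = set(), set()

    def visit(node):
        if node in on_path:
            return True
        if node in done:
            return False
        on_path.add(node)
        found = any(visit(nxt) for nxt in succ[node])
        on_path.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in succ)


def replay(initial, messages):
    """Apply messages in arrival order; report whether the constructed graph ever shows a cycle."""
    edges = set(initial)
    saw_cycle = False
    for op, edge in messages:
        if op == "add":
            edges.add(edge)
        else:
            edges.discard(edge)
        saw_cycle = saw_cycle or has_cycle(edges)
    return saw_cycle


# Hypothetical scenario: P2 releases a resource (P1 stops waiting on it),
# and only afterwards does P2 request a resource held by P3.
initial = {("P1", "P2"), ("P3", "P1")}
remove_msg = ("remove", ("P1", "P2"))
add_msg = ("add", ("P2", "P3"))

print(replay(initial, [remove_msg, add_msg]))  # messages arrive in the real order
print(replay(initial, [add_msg, remove_msg]))  # the remove message is delayed
```

In the delayed order the coordinator momentarily sees P1->P2, P2->P3 and P3->P1 together, although these three edges never coexisted in the real system.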
Centralized deadlock detection algorithm
- The wait-for graph is built whenever the coordinator needs to invoke the cycle-detection algorithm. In addition:
  1. When Pi at site A requests a resource from Pj at site B, a Request message is sent together with a timestamp (TS).
  2. The edge Pi->Pj is inserted in the local wait-for graph of A.
  3. This edge is inserted in the local wait-for graph of B only if B cannot grant the requested resource immediately. The TS is associated with the edge.
  4. Local edges (requests within a single site) do not have a TS associated.
Cycle detection phase
1. The coordinator sends a message to each site.
2. Each site responds with its local wait-for graph.
3. The coordinator builds the global wait-for graph as follows:
   - The graph contains one vertex for every process.
   - The graph has an edge Pi->Pj if and only if:
     - Pi->Pj is a local edge in one of the local graphs, or
     - Pi->Pj has a TS associated and appears in more than one local graph.
4. A cycle in this graph means deadlock; no cycle means the system is not in a deadlock state.
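The edge-selection rule in step 3 can be sketched as follows. The representation is an assumption made for this example: each site reports a list of `(src, dst, ts)` tuples, with `ts` set to `None` for a local edge, and `build_global_graph` is a hypothetical name. The sketch takes the threshold for a timestamped edge to be two local graphs, since an inter-site edge can be recorded at most at the requester's site and the holder's site.

```python
from collections import Counter


def build_global_graph(local_graphs):
    """Combine per-site edge lists into the coordinator's global edge set.

    local_graphs: one list of (src, dst, ts) tuples per site; ts is None
    for an edge between two processes of the same site.
    """
    edges = set()
    stamped = Counter()
    for site_edges in local_graphs:
        for src, dst, ts in set(site_edges):   # count each site at most once
            if ts is None:
                edges.add((src, dst))           # local edges are always kept
            else:
                stamped[(src, dst, ts)] += 1
    # A timestamped edge is kept only if both sites involved have recorded it.
    edges.update((src, dst) for (src, dst, ts), n in stamped.items() if n > 1)
    return edges


# Hypothetical reports from two sites (assumed for illustration only).
site_a = [("P1", "P2", None), ("P2", "P3", 7)]
site_b = [("P2", "P3", 7), ("P3", "P4", None), ("P4", "P1", 9)]

print(sorted(build_global_graph([site_a, site_b])))
```

Here P4->P1 carries a timestamp but was reported by one site only, so it is left out; including it would have closed a cycle through all four processes.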
Fully distributed approach: deadlock detection
- Every site takes part in the deadlock detection process.
- Every site constructs a special augmented local wait-for graph that characterizes the dynamic behavior of the system.
- If a deadlock exists, a cycle appears in the graph of at least one of the sites.
Building the special local wait-for graphs
- We add one additional node, Pex, to each local wait-for graph.
- We add edges:
  - Pi->Pex if Pi is waiting for a resource held by any process at a different site.
  - Pex->Pj if a process at a remote site is waiting on Pj.
- Detection idea: a cycle that does not involve Pex is a deadlock; a cycle that involves Pex is only a possible deadlock.
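The two Pex rules can be written down directly. The sketch below simplifies the model by assuming every process lives at exactly one site (the `home` mapping); `augmented_graph`, `home` and `waits` are hypothetical names, and the wait relation is invented for the example.

```python
PEX = "Pex"


def augmented_graph(site, home, waits):
    """Build the augmented local wait-for graph of one site.

    home:  process -> site where it runs (simplifying assumption)
    waits: iterable of (waiter, holder) pairs across the whole system
    """
    edges = set()
    for waiter, holder in waits:
        waiter_local = home[waiter] == site
        holder_local = home[holder] == site
        if waiter_local and holder_local:
            edges.add((waiter, holder))   # ordinary local edge
        elif waiter_local:
            edges.add((waiter, PEX))      # Pi -> Pex: waits on a remote process
        elif holder_local:
            edges.add((PEX, holder))      # Pex -> Pj: a remote process waits on Pj
    return edges


# Hypothetical system with two sites and four processes.
home = {"P1": "A", "P2": "A", "P3": "B", "P4": "B"}
waits = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P1")]

print(sorted(augmented_graph("A", home, waits)))
print(sorted(augmented_graph("B", home, waits)))
```

Both sites end up with a cycle through Pex, which by the detection idea above signals only a possible deadlock that must be confirmed by exchanging messages.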
Augmented local wait-for graphs
[Figure: augmented local wait-for graphs at Site A and Site B, each with the extra node Pex]
- P3 is waiting for a resource held by P4, therefore P3->Pex is added.
- Pex->P2 is added because P4 waits on P2 at site B.
Detection
- If at site Si there is a cycle Pex->…->Pk->Pex, then, assuming Pk is waiting for a resource at site Sj:
  - Si sends a message to Sj, Sj's local graph is updated, and Sj searches for a cycle that does not involve Pex.
  - If there is such a cycle, a deadlock is detected and the algorithm terminates.
  - If there is a cycle involving Pex, a message is sent on in the same way.
- The algorithm ends when a deadlock is detected, or when all sites have been updated and no local cycle free of Pex has been found.
Detection - example
[Figure: augmented wait-for graph at Site B with nodes P2, P3, P4 and Pex]
- Edge P2->P3 is added because of the message from Site A.
- Deadlock detected: P2, P3, P4, P2
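The Site B step of this example can be replayed in code. This is a sketch under assumptions: `find_cycle` is a hypothetical helper, and the initial edge set of Site B (in particular the two Pex edges) is guessed for illustration, since the figure itself is not available. Only the added edge P2->P3 and the resulting cycle come from the slide.

```python
PEX = "Pex"


def find_cycle(edges, exclude=frozenset()):
    """Return one cycle as a node list (first node repeated at the end), or None.

    Edges touching a node in `exclude` are ignored, which lets a site look
    for cycles that do not involve Pex.
    """
    succ = {}
    for u, v in edges:
        if u in exclude or v in exclude:
            continue
        succ.setdefault(u, []).append(v)
        succ.setdefault(v, [])
    done = set()  # nodes from which no cycle is reachable

    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        for nxt in succ[node]:
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for start in sorted(succ):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Assumed augmented graph at Site B before the message from Site A arrives.
site_b = {(PEX, "P3"), ("P3", "P4"), ("P4", "P2"), ("P2", PEX)}

print(find_cycle(site_b, exclude={PEX}))   # no Pex-free cycle yet
updated = site_b | {("P2", "P3")}          # edge carried by Site A's message
print(find_cycle(updated, exclude={PEX}))  # Pex-free cycle, so a real deadlock
```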
Election algorithms
- Many distributed algorithms use a coordinator process, e.g. for:
  - Enforcing mutual exclusion
  - Maintaining the global wait-for graph
  - Replacing a lost token
- If the coordinator process fails, a new copy of the coordinator should be started.
- Where? This is decided with election algorithms.
Election algorithms - assumptions
- Every process has a priority number.
- The coordinator is always the process with the largest priority number.
- If the coordinator fails, the algorithm must elect the active process with the largest priority number.
The Bully algorithm
- Suppose Pi sends a message to the coordinator and it is not answered within time T. Pi assumes that the coordinator has failed.
- Pi sends an election message to all processes with higher priorities.
- If no response is received within time T, Pi assumes that all these processes have failed and starts a copy of the coordinator itself.
- If a response is received, Pi waits for time T’ to see whether a process with a higher priority has been elected.
The Bully algorithm, cont.
- If no answer is received within T’, the algorithm is restarted (this may indicate that the process that sent the first reply has failed).
- A process that is elected sends a message to all other processes.
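The outcome of the Bully algorithm can be illustrated with a small synchronous simulation. It is a sketch, not a networked implementation: timeouts T and T’ are not modelled with clocks, and "no answer within the timeout" is represented simply by a process not being in the `alive` set. The names `bully_election`, `alive` and `priorities` are hypothetical.

```python
def bully_election(initiator, alive, priorities):
    """Return the process that ends up as coordinator.

    alive:      set of processes that answer messages (a missing process
                stands for a timeout)
    priorities: process -> priority number
    """
    higher = [p for p in priorities if priorities[p] > priorities[initiator]]
    responders = [p for p in higher if p in alive]
    if not responders:
        # Nobody with a higher priority answered: the initiator takes over.
        return initiator
    # Every responder would start its own election; following the
    # highest-priority responder is enough to find the final winner.
    next_initiator = max(responders, key=priorities.get)
    return bully_election(next_initiator, alive, priorities)


priorities = {"P1": 1, "P2": 2, "P3": 3, "P4": 4, "P5": 5}

# P5, the old coordinator, has failed; P2 notices and starts an election.
print(bully_election("P2", {"P1", "P2", "P3", "P4"}, priorities))
# All processes above P2 have failed, so P2 elects itself.
print(bully_election("P2", {"P1", "P2"}, priorities))
```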
Summary
- Deadlocks in distributed systems are primarily handled with detection.
- The main problem is maintaining the wait-for graph.
- Some of the distributed algorithms require the use of a coordinator.
- If the coordinator fails, an election algorithm is run to start a new copy on some site.