Distributed and hierarchical deadlock detection, deadlock resolution


1 Distributed and hierarchical deadlock detection, deadlock resolution
Distributed algorithms: Obermarck's path-pushing; Chandy, Misra, and Haas's edge-chasing
Hierarchical algorithms: Menasce and Muntz's algorithm; Ho and Ramamoorthy's algorithm
Deadlock resolution

2 Distributed deadlock detection
Path-pushing: the WFG is disseminated as paths (sequences of edges); a deadlock is declared when a process detects a local cycle.
Edge-chasing: probe messages circulate through the WFG; blocked processes forward each probe to the processes holding the resources they requested; a deadlock is declared when the initiator receives its own probe.
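Both families work over a wait-for graph (WFG). For concreteness, here is a minimal sketch (mine, not from the slides) of a WFG kept as an adjacency map with a DFS cycle check; the class and method names are illustrative, and the later sketches assume a similar representation.

```python
# Minimal wait-for graph (WFG) sketch: nodes are process ids,
# an edge p -> q means "p waits for a resource held by q".
from collections import defaultdict

class WFG:
    def __init__(self):
        self.edges = defaultdict(set)   # process id -> set of process ids it waits for

    def add_wait(self, waiter, holder):
        self.edges[waiter].add(holder)

    def remove_wait(self, waiter, holder):
        self.edges[waiter].discard(holder)

    def has_cycle(self):
        # Standard DFS with a recursion stack; returns True if any cycle exists.
        WHITE, GRAY, BLACK = 0, 1, 2
        color = defaultdict(int)

        def visit(u):
            color[u] = GRAY
            for v in self.edges[u]:
                if color[v] == GRAY:
                    return True
                if color[v] == WHITE and visit(v):
                    return True
            color[u] = BLACK
            return False

        return any(color[u] == WHITE and visit(u) for u in list(self.edges))

# Example: P1 -> P2 -> P3 -> P1 is a deadlock.
g = WFG()
g.add_wait("P1", "P2"); g.add_wait("P2", "P3"); g.add_wait("P3", "P1")
assert g.has_cycle()
```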

3 Obermarck’s Path-Pushing
Each site maintains a local WFG, with nodes for local processes and a node "Pex" representing external processes, e.g. Pex1 -> P1 -> P2 -> P3 -> Pex2. A process may wait for multiple other processes.
Deadlock detection: if site Si finds a cycle that does not involve Pex, there is a deadlock. If Si finds a cycle that does involve Pex, there is only a possibility of a deadlock, so Si sends a message containing the detected path to the downstream sites. To decrease network traffic, the message is sent only when Pex1 > Pex2. Assumption: a process spanning several sites has the same identifier at all of them. When site Sj receives such a message, it updates its local WFG and reevaluates it (possibly pushing a path again).
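A rough sketch of one site's role in a path-pushing scheme of this kind. It is a simplification, not Obermarck's actual algorithm: the Site class, the detect/receive_path names, and the representation of external processes (any id not local to the site plays the role of Pex) are illustrative assumptions.

```python
# Rough sketch of one site's role in path-pushing deadlock detection (illustrative).
from collections import defaultdict

class Site:
    def __init__(self, local_procs):
        self.local = set(local_procs)
        self.waits = defaultdict(set)            # local view of the WFG: p -> ids p waits for

    def add_wait(self, waiter, holder):
        self.waits[waiter].add(holder)

    def receive_path(self, path):
        """A path pushed from another site is merged into the local WFG."""
        for a, b in zip(path, path[1:]):
            self.waits[a].add(b)

    def detect(self, push):
        """Return purely local cycles (real deadlocks); push potential ones downstream."""
        deadlocks, potential = [], []

        def walk(node, path):
            if node not in self.local and path:          # reached an external process (Pex2)
                if path[0] not in self.local:            # full path Pex1 -> ... -> Pex2
                    potential.append(path + [node])
                return
            if node in path:                             # closed a cycle among local processes
                deadlocks.append(path[path.index(node):])
                return
            for nxt in self.waits[node]:
                walk(nxt, path + [node])

        for start in list(self.waits):
            walk(start, [])
        for path in potential:
            if path[0] > path[-1]:                       # the Pex1 > Pex2 rule from the slide
                push(path[-1], path)                     # push downstream, toward Pex2's site
        return deadlocks

# Example: local processes P2, P3; external P9 waits for P2, and P3 waits for external P1.
s = Site({"P2", "P3"})
s.add_wait("P9", "P2"); s.add_wait("P2", "P3"); s.add_wait("P3", "P1")
s.detect(lambda dst, path: print("push", path, "toward the site of", dst))
# Prints the path ['P9', 'P2', 'P3', 'P1'] because 'P9' > 'P1'.
```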

4 Chandy, Misra, and Haas’s Edge-Chasing
When a process has to wait for a resource (blocks), it sends a probe message to the process holding the resource. A process can request (and wait for) multiple resources at once. A probe message contains three values: the ID of the process that blocked, the ID of the process sending the message, and the ID of the process the message is sent to. When a blocked process receives a probe, it propagates the probe to the process(es) holding the resources it has requested; the ID of the blocked process stays the same, and the other two values are updated as appropriate. If the blocked process receives its own probe, there is a deadlock. The size of a message is O(1).
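A sketch of the probe mechanics just described, with the "network" simulated by an in-memory dictionary. The Process class and method names are illustrative assumptions; only the three-field probe and the forwarding/termination rules come from the description above.

```python
# Sketch of Chandy-Misra-Haas edge-chasing probes (simulated in memory; illustrative names).

class Process:
    def __init__(self, pid, system):
        self.pid = pid
        self.system = system          # pid -> Process, stands in for the network
        self.waiting_for = set()      # pids holding resources this process requested
        self.seen = set()             # initiators whose probe was already forwarded
        self.deadlocked = False

    def block_on(self, holder_pid):
        """Called when this process must wait; start a probe round."""
        self.waiting_for.add(holder_pid)
        self._forward(initiator=self.pid)

    def receive_probe(self, initiator, sender):
        # The probe carries (initiator, sender, receiver); the receiver is self.pid.
        if initiator == self.pid:
            self.deadlocked = True    # own probe came back: a cycle exists
            return
        if self.waiting_for and initiator not in self.seen:
            self.seen.add(initiator)  # forward each initiator's probe at most once
            self._forward(initiator)

    def _forward(self, initiator):
        for holder in self.waiting_for:
            # A message is three ids: (blocked initiator, sender, receiver) -- O(1) size.
            self.system[holder].receive_probe(initiator, sender=self.pid)

# Example: 1 waits for 2, 2 waits for 3, 3 waits for 1 -> process 1 detects the deadlock.
system = {}
for pid in (1, 2, 3):
    system[pid] = Process(pid, system)
system[2].waiting_for.add(3)
system[3].waiting_for.add(1)
system[1].block_on(2)
print(system[1].deadlocked)   # True
```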

5 Performance evaluation of Obermarck’s and Chandy-Misra-Haas algorithms
Obermarck's: on average(?) only half of the sites involved in the deadlock send messages, and every such site sends one message; thus n(n-1)/2 messages to detect a deadlock for n sites, and the size of a message is O(n).
Chandy, Misra, and Haas's (Singhal's estimate is incorrect): given n processes, a process may be blocked by up to (n-1) processes, the next process may be blocked by another (n-2) processes, and so on. If there are more sites than processes, the worst-case number of messages is n(n-1)/2; if there are fewer sites m than processes, the worst-case estimate is n^2(n-m)/(2m). The size of a message is 3 integers.

Notes: This is on Chandy-Misra-Haas and Obermark's algorithms. The links to the original papers are on the course's website. I advise you to read the original papers, as Singhal's description is not very precise.

Both algorithms assume that a process may wait for more than one process. However, this is different from the multiple resource instances that we studied: for CMH and Obermark's it is assumed that the process will not be able to make progress unless BOTH processes free their resources. Thus, in the worst case the processes' wait-for graph (WFG) is completely connected. Assume that all links span sites; then the total number of external links is n(n-1)/2, where n is the number of processes.

My message complexity estimate is as follows. For CMH, each (process) initiator maintains a counter. Every time there is an external chain, the process increments the counter and sends a probe with this counter. Each process keeps the counter for all other processes and only sends the probe once (for each initiator and each counter). In the worst case, the probe for one initiator can traverse all edges in the WFG. In their (conference) paper CMH try to limit the number of initiators per deadlock by allowing only the processes that just obtained an incoming link in the WFG to initiate probes. However, the cycle can be formed such that the links are created in parallel; hence every process becomes an initiator. Thus, the total number of messages in the worst case is n * n(n-1)/2, which is in O(n^3).

For Obermark, the sites send a "path" to the other sites. Recall that, to decrease traffic, Obermark proposes that only those sites send the path whose outgoing process identifier is less than the incoming one. (I was not clear in class: Obermark proposes that the site sends the path only "downstream" with respect to the shared process (transaction); that is, the site does not broadcast it to all other sites.) Now let us estimate the complexity of the algorithm. I find it a little easier to think of the processes being inside the sites and the edges in the WFG being external; the reasoning, however, is similar in Obermark's notation. Suppose the identifiers of the processes are 1, 2, ..., n, all WFG links are external, and the higher-id processes wait for the lower-id ones (except for process 1, which waits for process n so that we have a deadlock). For simplicity, suppose there are n sites (one per process), and call each site by the identifier of the process that resides there: 1, 2, 3, ..., n. In the worst case:
- site 2 pushes process id 2 to site 1,
- site 3 pushes process id 3 to sites 1 and 2, which forces site 2 to push process id 3 to site 1,
and so on. Thus, the identifier of process n travels along all links, the identifier of process n-1 travels along all links except those leading to site n, and so on. Notice that the graph is completely connected.
The total number of links is n(n-1)/2. Thus, process n's identifier travels along n(n-1)/2 links, process n-1's along (n-1)(n-2)/2 links, and so on. The total number of messages is therefore
A(n) = n(n-1)/2 + (n-1)(n-2)/2 + (n-2)(n-3)/2 + (n-3)(n-4)/2 + ...
Doing pairwise factoring we obtain
A(n) = [(n-1)(n + (n-2)) + (n-3)((n-2) + (n-4)) + ...] / 2 = (n-1)^2 + (n-3)^2 + (n-5)^2 + ...
There is probably an exact formula for this series, but I just estimated it from above and below. It is known that
B(n) = n^2 + (n-1)^2 + (n-2)^2 + ... + 1^2 = n(n+1)(2n+1)/6.
Clearly B(n) > A(n), since some of the addends of B(n) are missing from A(n). To bound A(n) from below, group the addends of B(n-1) in pairs:
2A(n) = 2(n-1)^2 + 2(n-3)^2 + ... >= [(n-1)^2 + (n-2)^2] + [(n-3)^2 + (n-4)^2] + ... = B(n-1).
That is, B(n) > A(n) >= B(n-1)/2, and both B(n) and B(n-1)/2 are in O(n^3). Therefore A(n) is in O(n^3): the worst-case message complexity of Obermark's algorithm is the same as that of CMH, cubic in the number of processes in the system.

THIS IS OLD (an earlier version of the estimate): It is not clear how Singhal came up with the message complexity for CMH. I put the original paper on the web. In the paper the complexity is computed as follows: there could be one probe message passed between each pair of processes in the system. Consider the case of N processes. In the worst case, process P1 can wait for processes P2..PN, P2 for P3..PN, and so on. If all processes are on separate sites, the total number of probe messages will be N(N-1)/2. However, the original paper does not consider the situation where the number of sites M is smaller than the number of processes. Suppose this is the case. Observe that if two processes share a site, the wait-for link between them does not generate a message; hence the worst-case allocation of processes is the one that maximizes the number of external links.

I'd like to show that the number of external links is maximized when the processes are spread evenly among the sites. Indeed, consider the following setup: two sites A and B have NA and NB processes respectively, with NA > NB, and all processes wait for each other. Consider moving a process P from site A to site B. In this move, all wait-for links from P to processes in A are externalized and all links from P to processes in B are internalized. Since all processes wait for each other, the number of externalized links is NA-1 and the number of internalized links is NB. This means that if we move a process from A to B (where NA > NB), the number of external links at least does not decrease. Hence the number of external links is maximal when the number of processes is equal on all sites.

Now let us count the number of external links. I am going to count all links and then subtract the ones that are internal because their endpoints are on the same site. The total number of links (as calculated above) is N(N-1)/2. Let us count the internal links. Suppose M divides N, so the number of processes at each site is N/M. Since every process is connected with every other by a wait-for relation, the number of internal links per site is
(N/M)(N/M - 1)/2 = N(N - M)/(2M^2).
There are exactly M sites, hence the total number of internal links is
M * N(N - M)/(2M^2) = N(N - M)/(2M).
The final formula for the number of external links is
N(N-1)/2 - N(N-M)/(2M),
which simplifies to
N^2(M-1)/(2M).
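The closed forms above are easy to sanity-check numerically. The short script below is my own check (not part of the original notes): it verifies the bounds on A(n) and the N^2(M-1)/(2M) external-link formula for small parameter values.

```python
# Quick numeric sanity check (my own, not part of the original notes) of the
# closed forms used above, for small parameter values.
from math import comb

def A(n):                      # worst-case path-pushing message count from the argument above
    return sum(k * (k - 1) // 2 for k in range(2, n + 1))

def B(n):                      # sum of the first n squares
    return n * (n + 1) * (2 * n + 1) // 6

for n in range(3, 50):
    assert B(n) > A(n) >= B(n - 1) / 2          # bounds used in the estimate; both are Theta(n^3)

def external_links(N, M):      # N processes spread evenly over M sites, complete WFG
    assert N % M == 0
    return comb(N, 2) - M * comb(N // M, 2)     # all links minus the internal ones

for M in (2, 3, 5):
    for N in (M, 2 * M, 6 * M):
        assert external_links(N, M) == N * N * (M - 1) // (2 * M)   # the N^2(M-1)/(2M) formula

print("all checks passed")
```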

6 Menasce and Muntz’ hierarchical deadlock detection
Sites (called controllers) are organized in a tree. Leaf controllers manage resources; each maintains a local WFG concerned only with its own resources. Interior controllers are responsible for deadlock detection; each maintains a global WFG that is the union of the WFGs of its children and detects deadlock among its children. Changes are propagated upward either continuously or periodically.
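A minimal sketch of the parent/child structure described above: leaf controllers hold local WFGs, and an interior controller unions its children's graphs and runs a cycle check. The class names, and the pull-style propagation (the parent reads its children on demand rather than receiving periodic updates), are illustrative assumptions.

```python
# Sketch of hierarchical detection: leaves keep local WFGs, an interior controller
# unions its children's graphs and looks for cycles (illustrative names and flow).
from collections import defaultdict

class LeafController:
    def __init__(self):
        self.waits = defaultdict(set)            # local WFG: p -> processes p waits for

    def add_wait(self, waiter, holder):
        self.waits[waiter].add(holder)

class InteriorController:
    def __init__(self, children):
        self.children = children                 # leaf or interior controllers below this one

    def global_wfg(self):
        merged = defaultdict(set)                # union of the children's WFGs
        for child in self.children:
            graph = child.waits if isinstance(child, LeafController) else child.global_wfg()
            for p, holders in graph.items():
                merged[p] |= holders
        return merged

    def detect_deadlock(self):
        graph, seen = self.global_wfg(), {}
        def dfs(u):                              # gray/black DFS cycle check
            seen[u] = "gray"
            for v in graph[u]:
                if seen.get(v) == "gray" or (v not in seen and dfs(v)):
                    return True
            seen[u] = "black"
            return False
        return any(u not in seen and dfs(u) for u in list(graph))

# Example: a cycle that spans two leaves is only visible to their common parent.
a, b = LeafController(), LeafController()
a.add_wait("P1", "P2")      # P2 lives under leaf b
b.add_wait("P2", "P1")
print(InteriorController([a, b]).detect_deadlock())   # True
```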

7 Ho and Ramamoorthy’s hierarchical deadlock detection
Sites are grouped into disjoint clusters. Periodically, a site is chosen as the central control site, and the central control site chooses a control site for each cluster. Each control site collects the status tables from its cluster and uses the Ho and Ramamoorthy one-phase centralized deadlock detection algorithm to detect deadlock in that cluster. All control sites then forward their status information and WFGs to the central control site, which combines that information into a global WFG and searches it for cycles. Thus control sites detect deadlocks within clusters, and the central control site detects deadlocks between clusters.
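A rough sketch of the two-level flow described above, with wait-for edges standing in for the status tables. The function names and data shapes are illustrative assumptions; the real one-phase algorithm works on process and resource status tables rather than ready-made WFG edges.

```python
# Sketch of the two-level cluster scheme: sites report wait-for edges, a cluster's
# control site checks its cluster, and the central control site checks the
# combined graph (illustrative only).
from collections import defaultdict

def has_cycle(graph):
    """Gray/black DFS cycle test on a dict: node -> set of nodes it waits for."""
    state = {}
    def dfs(u):
        state[u] = "gray"
        for v in graph.get(u, ()):
            if state.get(v) == "gray" or (v not in state and dfs(v)):
                return True
        state[u] = "black"
        return False
    return any(u not in state and dfs(u) for u in list(graph))

def cluster_wfg(site_tables):
    """Control site: merge the (waiter -> holders) reports from the sites of one cluster."""
    merged = defaultdict(set)
    for table in site_tables:
        for waiter, holders in table.items():
            merged[waiter] |= set(holders)
    return merged

# Two clusters, each with two sites reporting (waiter -> holders) edges.
cluster1 = cluster_wfg([{"P1": ["P2"]}, {"P2": ["P3"]}])    # P3 is in the other cluster
cluster2 = cluster_wfg([{"P3": ["P4"]}, {"P4": ["P1"]}])

print(has_cycle(cluster1), has_cycle(cluster2))   # False False: no intra-cluster deadlock
central = cluster_wfg([cluster1, cluster2])       # central control site merges cluster WFGs
print(has_cycle(central))                         # True: the deadlock spans the clusters
```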

8 Estimating performance of deadlock detection algorithms
Performance is usually measured as the number of messages exchanged to detect deadlock. This is deceptive, since messages are also exchanged when there is no deadlock, and it does not account for the size of the messages. One should also measure: deadlock persistence time (a measure of how long resources are wasted), which trades off against communication overhead; storage overhead (graphs, tables, etc.); processing overhead to search for cycles; and the time to optimally recover from deadlock.

9 Deadlock resolution
Resolution means aborting at least one process (the victim) in the cycle and granting its resources to the others.
Efficiency issues of deadlock resolution: it should be fast (after a deadlock is detected, the victim should be quickly selected); minimal (abort the minimum number of processes, ideally the less "expensive" processes with respect to completed computation, consumed resources, etc.); complete (after the victim is aborted, information about it should be quickly removed from the system, so there are no phantom deadlocks); and starvation-free (avoid repeatedly aborting the same process).
Problems: the detecting process may not know enough about the victim (propagating enough information makes detection expensive); multiple sites may detect the deadlock simultaneously, since the WFG is distributed; and removing information about the victim takes time.
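A sketch of cost-based victim selection over a detected cycle, reflecting the criteria above (cheap victims, no starvation). The cost function, its weights, and the anti-starvation penalty are illustrative assumptions.

```python
# Sketch of victim selection for a detected cycle: prefer the "cheapest" process,
# and penalize processes that were already aborted before (anti-starvation).
# The weights are illustrative assumptions.

def victim_cost(proc, abort_counts):
    # Cheaper victims: little completed computation, few resources held,
    # plus a growing penalty for each previous abort.
    return (proc["cpu_time"] + 2 * proc["resources_held"]
            + 10 * abort_counts.get(proc["pid"], 0))

def choose_victim(cycle, abort_counts):
    """Pick the minimum-cost process in the wait-for cycle; break ties by pid."""
    return min(cycle, key=lambda p: (victim_cost(p, abort_counts), p["pid"]))

cycle = [
    {"pid": 1, "cpu_time": 40, "resources_held": 3},
    {"pid": 2, "cpu_time": 5,  "resources_held": 1},
    {"pid": 3, "cpu_time": 12, "resources_held": 2},
]
abort_counts = {2: 2}                              # pid 2 was already aborted twice
print(choose_victim(cycle, abort_counts)["pid"])   # 3: pid 2 is cheap but penalized
```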

10 Deadlock recovery (cont.)
Are there any less drastic alternatives? Preempt resources, one at a time, until there is no deadlock. Which "victim"? Again, the choice is based on cost, similar to CPU scheduling. Is rollback possible?
Preempt resources: take them away from the victim.
Rollback: "roll" the process back to some safe state and restart it from there. The OS must checkpoint the process frequently, i.e., write its state to a file. It could roll the process back to the beginning, or just far enough to break the deadlock; this second time through, the process has to wait for the resource. The OS has to keep multiple checkpoint files, which adds a lot of overhead.
Avoid starvation: starvation may happen if the decision is based on the same cost factors each time; don't keep preempting the same process (i.e., set some limit).
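A sketch of checkpoint/rollback recovery with a preemption cap, matching the options listed above. The in-memory "checkpoints", the rollback rule, and the MAX_PREEMPTIONS limit are illustrative assumptions.

```python
# Sketch of checkpoint/rollback recovery: the process state is saved periodically,
# and on preemption the process is rolled back far enough to release the contested
# resource; a per-process cap limits repeated preemption (illustrative values).
import copy

MAX_PREEMPTIONS = 3

class Checkpointed:
    def __init__(self, pid):
        self.pid = pid
        self.state = {"step": 0, "held": set()}   # toy process state
        self.checkpoints = []                     # a real OS would write these to stable storage
        self.preemptions = 0

    def checkpoint(self):
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback_before(self, resource):
        """Roll back to the most recent checkpoint taken before `resource` was acquired."""
        if self.preemptions >= MAX_PREEMPTIONS:
            raise RuntimeError(f"process {self.pid} preempted too often; pick another victim")
        self.preemptions += 1
        for snap in reversed(self.checkpoints):
            if resource not in snap["held"]:
                self.state = copy.deepcopy(snap)  # restart here; must re-request the resource
                return
        self.state = {"step": 0, "held": set()}   # no suitable checkpoint: restart from the beginning

# Example: acquire R1 at step 2, then roll back so R1 can be handed to another process.
p = Checkpointed(7)
p.checkpoint()                        # checkpoint at step 0, holding nothing
p.state["step"], p.state["held"] = 2, {"R1"}
p.checkpoint()                        # checkpoint while holding R1
p.rollback_before("R1")
print(p.state)                        # {'step': 0, 'held': set()}
```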

