© Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Deadlock: Part II - Recovery.

1 © Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Deadlock: Part II - Recovery

2 ECE 8813a (2) Approach
- Employ recovery mechanisms to break deadlocked configurations of messages
- Predicated on the following hypothesis
  - Deadlocks are rare
  - Make them rare
  - Therefore recovery costs are acceptable
  - Recovery is less resource intensive than avoidance

3 ECE 8813a (3) Key Questions
- Probability of occurrence
- Characterization
- On-line detection
- Recovery techniques

4 ECE 8813a (4) Probability of Occurrence
Influential factors
- Routing freedom
  - Exponential decrease in probability of occurrence
- Number of blocked packets
  - Correlated blocking patterns
- Number of resource dependency cycles
  - Increases with routing freedom
- Presence of virtual channels
  - Reduces the probability of blocking

5 ECE 8813a (5) Characterization of Deadlocks
- Deadlock set
  - Set of messages that are deadlocked
- Resource set
  - Set of buffer resources occupied by the deadlocked set
- Knot cycle density
  - Number of unique cycles within a knot
    - Captures complexity of formation of deadlocked message configurations
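The knot and knot cycle density defined on this slide can be illustrated with a small sketch. It assumes the channel wait-for graph is given as a plain adjacency dictionary with no duplicate edges. The function names (`reachable`, `knot_nodes`, `count_simple_cycles`) and the three-channel example are hypothetical and are not taken from the slides. A node is treated as part of a knot when it has at least one outgoing edge and every node it can reach can also reach it back.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # channel -> channels it is waiting for


def reachable(graph: Graph, start: str) -> Set[str]:
    """Return all nodes reachable from start by following one or more edges."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def knot_nodes(graph: Graph) -> Set[str]:
    """Nodes with a nonempty reach set in which every reachable node reaches back."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    reach = {v: reachable(graph, v) for v in nodes}
    return {v for v in nodes
            if reach[v] and all(v in reach[u] for u in reach[v])}


def count_simple_cycles(graph: Graph, nodes: Set[str]) -> int:
    """Count simple cycles restricted to `nodes` (each cycle counted once)."""
    order = sorted(nodes)
    rank = {n: i for i, n in enumerate(order)}
    count = 0

    def dfs(start: str, node: str, on_path: Set[str]) -> None:
        nonlocal count
        for nxt in graph.get(node, []):
            # Only walk through nodes ranked at or after the start node, so
            # every cycle is discovered exactly once, from its lowest node.
            if nxt not in rank or rank[nxt] < rank[start]:
                continue
            if nxt == start:
                count += 1
            elif nxt not in on_path:
                on_path.add(nxt)
                dfs(start, nxt, on_path)
                on_path.remove(nxt)

    for s in order:
        dfs(s, s, {s})
    return count


# Hypothetical wait-for graph: a ring a -> b -> c -> a plus a chord b -> a.
example = {"a": ["b"], "b": ["c", "a"], "c": ["a"]}
knot = knot_nodes(example)
density = count_simple_cycles(example, knot)
```

In this toy graph all three channels form a knot containing two unique cycles (a-b and a-b-c). The exhaustive cycle enumeration is only practical for small graphs.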

6 ECE 8813a (6) Example
[Figure: channel wait-for graph; legend: reserved channel, alternative virtual channels]
- Deadlock set: all messages
- Resource set: channels vc0 to vc15
- Knot cycle density: 24

7 ECE 8813a (7) Impact of Routing Freedom
[Figure annotation: note the additional channel for m4]
- Addition of an additional channel for m4
- Extension of the range of the routing function is sufficient to break the knot
  - Note that it is not possible to reach any other node from vc16
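The effect of one extra routing option can be sketched with a simple fixed-point computation. The assumptions are stated here rather than taken from the slides: a blocked message can advance if at least one of the channels it may use next is free or is held by a message that can itself advance, and a message that advances eventually releases its channels. The function name `deadlocked_messages`, the message names m1 to m4 and the channel names c1 to c5 are hypothetical. The example does not reproduce the slide's figure.

```python
from typing import Dict, List, Optional, Set


def deadlocked_messages(wants: Dict[str, List[str]],
                        owner: Dict[str, Optional[str]]) -> Set[str]:
    """Return the blocked messages that can never advance.

    wants[m] lists the channels message m may use next (its routing options).
    owner[c] is the message holding channel c, or None if c is free.
    Channels missing from owner are treated as free.
    """
    can_move: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for msg, options in wants.items():
            if msg in can_move:
                continue
            for ch in options:
                holder = owner.get(ch)
                # A channel is usable if it is free, if its holder is not
                # blocked at all, or if its holder is known to advance.
                if holder is None or holder not in wants or holder in can_move:
                    can_move.add(msg)
                    changed = True
                    break
    return set(wants) - can_move


# Four messages blocked in a ring: each holds one channel and waits for the next.
owner = {"c1": "m1", "c2": "m2", "c3": "m3", "c4": "m4", "c5": None}
wants = {"m1": ["c2"], "m2": ["c3"], "m3": ["c4"], "m4": ["c1"]}
before = deadlocked_messages(wants, owner)

# Give m4 one additional (free) channel as a routing option.
wants["m4"] = ["c1", "c5"]
after = deadlocked_messages(wants, owner)
```

With the single routing option per message, all four messages are reported as deadlocked. Once m4 may also use the free channel c5, m4 can drain, which in turn unblocks m3, m2 and m1, so the set is empty.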

8 ECE 8813a (8) Performance Issues
- Cyclic non-deadlocks can form
  - Extended blocking: "sludgelock"
  - Occurs when available routing freedom is being extensively exploited at high loads
- Deadlock avoidance places acyclic dependency guarantees before routing freedom
- Deadlock recovery emphasizes routing freedom first, and hence can have a performance advantage

9 ECE 8813a (9) On-Line Deadlock Detection
- Exact detection vs. heuristics
  - False detection
- Local vs. centralized detection
- Time-outs
  - At nodes with packet headers
    - Counter + comparator
  - Optimal value depends on message length
- Concurrent recovery
  - Local prioritization for selection of recovery packet
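The counter-plus-comparator time-out mentioned on this slide might look like the following sketch. The class name `HeaderTimeout`, the per-cycle `tick` interface and the threshold value are assumptions made for illustration. The slides do not specify them.

```python
class HeaderTimeout:
    """Time-out detector for a channel holding a blocked packet header."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # assumed tunable; depends on message length
        self.counter = 0

    def tick(self, header_blocked: bool) -> bool:
        """Advance one cycle; return True if a deadlock is presumed."""
        if header_blocked:
            self.counter += 1   # counter: consecutive blocked cycles
        else:
            self.counter = 0    # any progress resets the count
        return self.counter > self.threshold  # comparator


detector = HeaderTimeout(threshold=3)
flags = [detector.tick(True) for _ in range(4)]
```

Here the header is flagged only on the fourth consecutive blocked cycle. Because this is a heuristic, a packet that is merely delayed by congestion can also be flagged (false detection). This is one reason the threshold has to be chosen with message length in mind.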

10 ECE 8813a (10) Deadlock Recovery Principles
- Progressive deadlock recovery
  - Robin Hood approach: de-allocate resources from normal packets and assign them to the recovery packet
    - Remove any message from a deadlocked cycle
    - Ensure its progress towards the destination
- Regressive deadlock recovery
  - De-allocate resources from deadlocked packets
  - Typically destroy packets
  - Need some recovery mechanism, for example, end-to-end flow control
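Regressive recovery destroys packets, so the source has to be able to send them again. A minimal sketch of that bookkeeping follows, assuming the source keeps a copy of every packet until it is acknowledged end to end. The class `Source` and its methods (`send`, `next_to_inject`, `on_abort`, `on_ack`) are hypothetical names, not an interface from the slides.

```python
from collections import deque
from typing import Deque, Dict, Optional


class Source:
    """Source node that re-injects packets aborted by regressive recovery."""

    def __init__(self) -> None:
        self.pending: Dict[int, str] = {}        # packet id -> payload awaiting ack
        self.inject_queue: Deque[int] = deque()  # ids waiting to enter the network
        self.next_id = 0

    def send(self, payload: str) -> int:
        """Buffer a new packet and queue it for injection."""
        pid = self.next_id
        self.next_id += 1
        self.pending[pid] = payload
        self.inject_queue.append(pid)
        return pid

    def next_to_inject(self) -> Optional[int]:
        """Return the id of the next packet to inject, or None if idle."""
        return self.inject_queue.popleft() if self.inject_queue else None

    def on_abort(self, pid: int) -> None:
        """The network destroyed the packet: queue it again if still unacknowledged."""
        if pid in self.pending:
            self.inject_queue.append(pid)

    def on_ack(self, pid: int) -> None:
        """The destination confirmed delivery: the copy can be dropped."""
        self.pending.pop(pid, None)


src = Source()
pid = src.send("payload")
first = src.next_to_inject()   # packet enters the network
src.on_abort(pid)              # deadlock detected, packet destroyed
retry = src.next_to_inject()   # same packet is injected again
```

The cost of this scheme is the buffer space held at the source until the acknowledgement arrives. Progressive recovery avoids that cost by delivering the packet instead of destroying it.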

11 ECE 8813a (11) Trade-offs
- Deadlock recovery at the source is typically regressive
  - De-allocate and re-inject
- Deadlock recovery in the network can be both
  - Regressive: propagate de-allocation signals upstream to release resources and abort the packet
  - Progressive: re-allocation of resources from normal to deadlocked packets

12 ECE 8813a (12) Progressive Deadlock Recovery
- Forms the recovery lane across routers
- Floating virtual channel can be used by all VCs
- Utilized via a separate control path
  - Effectively cycle stealing on the physical channels
- Use of recovery lanes must be deadlock free
- Important to be on the output side

13 ECE 8813a (13) Recovering from Deadlock
[Figure labels: control to steal physical channel cycles; eliminates one cycle]

14 ECE 8813a (14) Implementation Issues
- Number of deadlock buffers
  - Impact on crossbar size
- Location
  - Centralized vs. edge
    - Rate at which deadlocked packets can drain
  - At the switch output vs. switch input

15 ECE 8813a (15) Optimizations
- Sequential progressive recovery
  - Only one packet is permitted to enter the deadlock recovery lane
  - Mutual exclusion via a circulating token
  - Recovery lane implements a connected routing subfunction with no cyclic dependencies
- Concurrent recovery
  - Hamiltonian path based
  - Spanning tree based
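The token-based mutual exclusion used by sequential progressive recovery can be sketched as follows. The model is deliberately simplified and rests on assumptions not stated in the slides: routers are visited by the token in a fixed ring order, one router per cycle, and a router keeps the token until its recovery packet has drained. The class name `RecoveryToken` and its methods are hypothetical.

```python
from typing import List, Optional


class RecoveryToken:
    """Circulating token that grants exclusive use of the recovery lane."""

    def __init__(self, num_routers: int) -> None:
        self.num_routers = num_routers
        self.position = 0                   # router the token is visiting
        self.holder: Optional[int] = None   # router currently using the lane

    def step(self, wants_recovery: List[bool]) -> Optional[int]:
        """Advance one cycle; return the router granted the lane, if any."""
        if self.holder is not None:
            return None                     # lane busy: the token is captured
        if wants_recovery[self.position]:
            self.holder = self.position     # capture the token
            return self.holder
        self.position = (self.position + 1) % self.num_routers
        return None

    def release(self) -> None:
        """Recovery packet delivered: pass the token to the next router."""
        if self.holder is None:
            raise RuntimeError("token is not held")
        self.position = (self.holder + 1) % self.num_routers
        self.holder = None


token = RecoveryToken(num_routers=4)
requests = [False, False, True, True]       # routers 2 and 3 detected a time-out
grants = [token.step(requests) for _ in range(4)]
```

In this run the token passes routers 0 and 1, router 2 captures it on the third cycle, and router 3 has to wait until `release()` is called. Serializing recovery this way is what keeps the lane free of cyclic dependencies, at the price of draining one deadlocked packet at a time.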

16 ECE 8813a (16) Summary
- Routing protocols must be designed to be correct
  - Applications to fault tolerance and multicast
- Recovery vs. avoidance
  - Resource commitment vs. latency impact
- Customized solutions can favor recovery

