Introduction to Self-Stabilization Stéphane Devismes (CNRS, LRI)
Example of Self-Stabilizing System Dijkstra’s Token Ring
Model Locally Shared Memory Guarded Action: an action is executed only if its guard is true (the action is then enabled) The execution is asynchronous, but each step is atomic
Topology: Rooted Oriented Ring
Algorithm (K = 7) [Figure: ring whose processes hold the values 1 2 1 1 1 1 1]
Transient Faults… (non-permanent and rare) [Figure: successive ring configurations after transient faults] The system recovers a correct behavior by itself: Self-stabilization
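A minimal sketch of the token ring recovering from an arbitrary configuration. It assumes the classic K-state rules of Dijkstra's algorithm (the slides only show ring values) and a central daemon that activates one enabled process per step. The names `enabled`, `step`, `tokens`, and `run` are illustrative, not taken from the slides.

```python
import random

K = 7  # number of counter values, as on the slide


def enabled(x, i):
    """Guard of process i. The root (i == 0) holds a token when its value
    equals its predecessor's; any other process holds one when they differ."""
    pred = x[i - 1]  # for i == 0 this is x[-1], the last process of the ring
    return x[i] == pred if i == 0 else x[i] != pred


def step(x, i):
    """Action of process i, executed as one atomic step."""
    if i == 0:
        x[0] = (x[0] + 1) % K
    else:
        x[i] = x[i - 1]


def tokens(x):
    """Indices of the processes that are currently enabled (hold a token)."""
    return [i for i in range(len(x)) if enabled(x, i)]


def run(x, rng, max_steps=10_000):
    """Activate one randomly chosen enabled process per step until a single
    token remains. Returns the number of steps taken."""
    steps = 0
    while len(tokens(x)) != 1 and steps < max_steps:
        step(x, rng.choice(tokens(x)))
        steps += 1
    return steps


rng = random.Random(0)
ring = [rng.randrange(K) for _ in range(7)]  # arbitrary, possibly corrupted, start
run(ring, rng)
```

Once `run` returns, exactly one process is enabled, and every further step simply passes that single token to the next process.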
Self-Stabilization Self-Stabilization [Dijkstra, 1974]: Starting from any configuration, a self-stabilizing system reaches in a finite time a configuration c such that any suffix starting from c satisfies the intended specification.
Self-Stabilization [Figure: the system states are split into illegitimate states and legitimate states; Convergence leads to the legitimate states, Closure keeps the system inside them]
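The two properties can be written down as follows. This is a sketch in assumed notation, none of which appears on the slides: \mathcal{C} is the set of all configurations, \mathcal{L} \subseteq \mathcal{C} the set of legitimate ones, and c \mapsto c' denotes one atomic step.

```latex
\begin{align*}
\textbf{Closure:} &\quad \forall c \in \mathcal{L},\ \forall c' \in \mathcal{C}:\
  (c \mapsto c') \Rightarrow c' \in \mathcal{L} \\
\textbf{Convergence:} &\quad \text{for every execution } c_0\, c_1\, c_2 \ldots
  \text{ with } c_0 \in \mathcal{C},\ \exists i \ge 0:\ c_i \in \mathcal{L}
\end{align*}
```

Together they give the definition above: some legitimate configuration c is reached in finite time, and every suffix starting from c stays legitimate.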
Advantages Fault-Tolerance Initialization Dynamic Topology
Disadvantages Initial inconsistencies (stabilization time) Overhead No detection of stabilization
Around Self-Stabilization Probabilistic Self-Stabilization Robust Stabilization Weak-Stabilization Pseudo-Stabilization Snap-Stabilization Fault-Containment …
Robust Stabilization The system stabilizes even if some processes crash
Pseudo-Stabilization Pseudo-Stabilization [Burns, Gouda, and Miller, 1993]: Starting from any configuration, any execution of a pseudo-stabilizing system has a non-empty suffix that satisfies the intended specification. Self ≠ Pseudo ?
Self- vs Pseudo- Specification = {(i,i,i,…),(j,j,j,…)} [Figure: state diagram with labels i, r, i, j, j]
Robust Stabilizing Leader Election (SSS’07) Carole Delporte-Gallet (LIAFA) Stéphane Devismes (CNRS, LRI) Hugues Fauconnier (LIAFA)
SSS’07 (+WRAS) 9th International Symposium on Stabilization, Safety, and Security of Distributed Systems 14-16 November 2007, Paris, France http://sss07.lri.fr/
Related Works on Robust Stabilization Gopal and Perry, PODC’93 Beauquier and Kekkonen-Moneta, JSS’97 Anagnostou and Hadzilacos, WDAG’93 In a partially synchronous model? The first two: synchronous systems The third: impossibility result in asynchronous networks
Model Network: fully-connected n Processes (numbered from 1 to n): timely, can crash (an arbitrary number of processes may crash) Variables: initially arbitrarily assigned Links: Unidirectional Initially not necessarily empty No order on message delivery Variable reliability and timeliness assumptions
Communication-Efficiency [Larrea, Fernandez, and Arevalo, 2000]: « An algorithm is communication-efficient if it eventually only uses n - 1 unidirectional links »
Can we implement Self-Stabilizing Leader Election in a fully synchronous network? Yes, it can be communication-efficiently implemented
Principle of the algorithm A process p periodically sends ALIVE to every other process if Leader = p Any process q such that Leader <> q always chooses as leader the process from which it most recently received ALIVE When a process p such that Leader = p receives ALIVE from q, then Leader := q if q < p On timeout, a process p sets Leader to p
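The four rules above can be sketched as a round-based simulation. It assumes a lock-step synchronous model in which a timeout is the absence of any ALIVE during one round, and models "most recently received" as the last message in an arbitrary delivery order within the round. `round_step` and `elect` are hypothetical names.

```python
import random


def round_step(leader, alive, rng):
    """One synchronous round. `leader` maps each process ID to its Leader
    variable; `alive` lists the processes that have not crashed."""
    # Only processes that consider themselves leader send ALIVE.
    senders = [p for p in alive if leader[p] == p]
    new = dict(leader)
    for q in alive:
        inbox = [p for p in senders if p != q]
        rng.shuffle(inbox)  # arbitrary delivery order within the round
        if not inbox:
            new[q] = q  # timeout: q elects itself
        for p in inbox:
            # A non-leader adopts the latest sender; a leader only yields
            # to a sender with a smaller ID.
            if new[q] != q or p < q:
                new[q] = p
    return new


def elect(leader, alive, rounds, rng):
    """Run `rounds` synchronous rounds and return the resulting Leader map."""
    for _ in range(rounds):
        leader = round_step(leader, alive, rng)
    return leader


rng = random.Random(0)
start = {1: 4, 2: 2, 3: 9, 4: 4, 5: 1}  # arbitrary initial Leader values
result = elect(start, alive=[2, 3, 5], rounds=4, rng=rng)
```

In this sketch, once a single process remains self-elected, it is the only sender, so only its n - 1 outgoing links carry messages.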
Can we implement Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous? No
Impossibility of Communication-Efficiency in a system with at most one asynchronous link Claim: Any process p such that Leader <> p must periodically receive messages within a bounded time otherwise it chooses another leader
Can we implement (non communication efficient) Self-Stabilizing Leader Election in a system where some links are asynchronous? Yes
Self-Stabilizing Leader Election in a system with a timely routing overlay For each pair of alive processes (p,q), there exist at least two paths of timely links: From p to q From q to p
Principle of the algorithm Each process computes the set of alive processes and chooses as leader the smallest process of this set To compute the set: Each process p periodically sends ALIVE,p to every other process Any ALIVE,p message is repeated n - 1 times (any other process periodically receives such a message)
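A minimal sketch of the alive-set computation, assuming the timely links are given as a directed graph and that each relay takes one round. Recovery from arbitrary initial values (for example, expiring stale entries) is deliberately left out. `alive_sets` and `leaders` are illustrative names.

```python
def alive_sets(timely, alive):
    """timely: dict mapping each process to the set of processes it reaches
    through a timely link. Returns, for each alive process, the set of
    processes whose ALIVE message it has received (plus itself)."""
    n = len(timely)
    heard = {p: {p} for p in alive}
    for _ in range(n - 1):  # each ALIVE,p is relayed at most n - 1 times
        nxt = {p: set(s) for p, s in heard.items()}
        for p in alive:
            for q in timely[p]:
                if q in alive:
                    nxt[q] |= heard[p]  # q learns every origin p has heard of
        heard = nxt
    return heard


def leaders(timely, alive):
    """Each process chooses the smallest ID in its alive set."""
    return {p: min(s) for p, s in alive_sets(timely, alive).items()}


# A directed ring of timely links: every pair is connected in both directions.
ring_links = {1: {2}, 2: {3}, 3: {4}, 4: {1}}
print(leaders(ring_links, alive={1, 2, 3, 4}))
```

With the overlay assumption, n - 1 relays are enough for every ALIVE,p to reach every other alive process, so all alive sets coincide and so do the chosen leaders.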
Can we implement Self-Stabilizing Leader Election in a system without timely routing overlay ? No
Can we implement a Communication-Efficient Pseudo-Stabilizing Leader Election in a system where Communication-Efficient Self-Stabilizing Leader Election is not possible ? Yes In a system having a timely source and fair links (adaptation of an algorithm of [Aguilera et al, PODC’93])
Algorithm for systems with Source + fair links A process p periodically sends ALIVE to every other process if Leader = p Each process stores in an Active set the IDs of the processes from which it has recently received ALIVE Each process chooses its leader among the processes in its Active set Problem: we cannot use the IDs to choose a leader
Accusation Counter p stores in Counter[p] how many times it was suspected to have crashed When p suspects its leader: it sends an ACCUSATION to Leader And chooses as new leader the process in its Active set with the smallest accusation counter (we use IDs to break ties) p periodically sends ALIVE,Counter[p] to every other process if Leader = p Problem: assuming that Leader = s, the source s can voluntarily stop sending ALIVE
Phase Counter Each process maintains in Phase[p] the number of times it has lost the leadership p periodically sends ALIVE,Counter[p],Phase[p] to every other process if Leader = p p increments Counter[p] only when receiving ACCUSATION,ph with ph = Phase[p]
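The accusation and phase rules can be sketched as local handlers of one process. This is an illustration under stated assumptions, not the authors' algorithm: timers and message transport are omitted, a process always counts itself as a candidate, and the `Process` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    leader: int
    counter: int = 0  # Counter[p]: accusations accepted so far
    phase: int = 0    # Phase[p]: number of times p lost the leadership
    # Active set: sender ID -> (counter, phase) carried by its last ALIVE
    active: dict = field(default_factory=dict)

    def on_alive(self, sender, counter, phase):
        self.active[sender] = (counter, phase)

    def choose_leader(self):
        candidates = {q: c for q, (c, _) in self.active.items()}
        candidates[self.pid] = self.counter  # assumption: p is a candidate
        # Smallest accusation counter wins; IDs break ties.
        new = min(candidates, key=lambda q: (candidates[q], q))
        if self.leader == self.pid and new != self.pid:
            self.phase += 1  # p loses the leadership
        self.leader = new
        return new

    def suspect_leader(self):
        """Timeout on the current leader. Returns (accused, ph), the
        destination and phase of the ACCUSATION to send."""
        accused = self.leader
        # Assumption: default to phase 0 if no ALIVE was recorded.
        _, ph = self.active.pop(accused, (0, 0))
        self.choose_leader()
        return accused, ph

    def on_accusation(self, ph):
        # Accusations from an earlier phase are ignored, so a process that
        # stopped sending ALIVE voluntarily is not penalized.
        if ph == self.phase:
            self.counter += 1


p = Process(pid=3, leader=1, counter=4)
p.on_alive(1, counter=2, phase=0)
p.on_alive(2, counter=2, phase=5)
p.choose_leader()   # processes 1 and 2 tie on the counter; ID 1 wins
p.suspect_leader()  # accuses process 1, then falls back to process 2
```

The phase check is what lets the source stop sending ALIVE after losing the leadership without its counter growing afterwards.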
Can we implement a Communication-Efficient Pseudo-Stabilizing Leader Election in a system having only a timely source? No, but a non communication efficient pseudo-stabilizing leader election can be done (techniques similar to those used in the algorithm of [Aguilera et al, PODC’93])
Result Summary (columns: ce-SS, SS, ce-PS, PS) Synchronous: Yes; Timely bi-source: No; Timely routing: ?; Timely source + fair links; Timely source; Totally asynchronous
Perspectives Communication-efficient leader election in a system with timely routing Extend these results to other topologies and models Robust stabilizing decision problems? Example model: only a finite number of processes crash