Around Self-Stabilization
Part 2: Strengthened Forms of Self-Stabilization
  Stéphane Devismes, Post-Doc CNRS at the LRI (Paris VII)
06/08/2008, Computer Science Department, University of Osaka
Roadmap
  1. Self-Stabilization (recall)
  2. Motivation
  3. Tolerating more types of fault
  4. FTSS
  5. Enhance the convergence
  6. Snap-Stabilization
  7. Conclusion
Self-Stabilization (recall)
  [Dijkstra 1974] A general approach for recovering from the effects of any transient fault
Motivation
  Self-stabilization has several advantages:
  1. Tolerance to any transient fault:
     No hypothesis on the nature or extent of transient faults
     Recovers from the effects of those faults in a unified manner
  2. No initialization:
     Large-scale systems
  3. Dynamicity:
     Self-organization in sensor and ad hoc networks
Motivation
  But also several drawbacks:
  1. Impossibility results
     Some fundamental problems have no self-stabilizing solution
  2. Overhead
     Self-stabilizing protocols can use a large amount of resources
  3. Usually not tolerant of other kinds of faults
  4. Eventual safety
     During convergence, almost nothing is guaranteed
  [Figure labels: Weakened Forms, Strengthened Forms]
Motivation
  Strengthened forms for:
  Tolerating more types of faults
  Enhancing the convergence property
    Converging quickly in some (frequent) cases
    Ensuring some weak safety property when there are faults
Tolerating more types of faults
  Types of faults:
    Transient
    Intermittent
    Crash
    Byzantine
Tolerating more types of fault
  Transient faults: usually handled by self-stabilization
    Duration: finite
    Periodicity: rare
    Effect: alter the content of some component(s) of the network (processes and/or links)
    E.g., memory/message corruption, crash-recovery, loss of messages…
Tolerating more types of fault
  Intermittent faults:
    Duration: finite
    Periodicity: frequent
    Effect: alter the content of some component(s) of the network (processes and/or links)
    E.g., memory/message corruption, crash-recovery, loss of messages…
  Some papers deal with both self-stabilization and certain types of intermittent fault, e.g., [Delaët and Tixeuil, JPDC’02]:
    Fair loss of messages + a finite number of message corruptions
Tolerating more types of fault
  Crash failures:
    Duration: permanent
    Effect: some component(s) of the network (processes and/or links) permanently stop working
    E.g., process crash, link removal
  Fault-Tolerant Self-Stabilization (FTSS) [Gopal and Perry, PODC’93]
    Usually considers process crashes only.
Tolerating more types of fault
  Byzantine failures:
    Duration: unlimited
    Effect: some component(s) of the network (usually processes) behave in an arbitrary manner
    E.g., processes hit by an attack
  Byzantine-Tolerant Self-Stabilization [Dolev and Welch, PODC’95]
    Restriction on the number of Byzantine processes and/or some synchrony assumptions
Robust Stabilizing Leader Election
  Carole Delporte-Gallet (LIAFA)
  Stéphane Devismes (CNRS, LRI)
  Hugues Fauconnier (LIAFA)
Topics
  Designing leader election protocols in the message-passing model that are:
  1. Crash tolerant
  2. Self-stabilizing
  3. Communication-efficient
  4. Designed for weak synchrony assumptions
Model
  Fully-connected network
  Communication by messages
  Link:
    Unidirectional
    No ordering on message delivery
    May be synchronous
  Process:
    Synchronous or crashed
    With identifier
    State initially arbitrary
Communication-Efficiency
  [Larrea, Fernandez, and Arevalo, 2000]:
  "An algorithm is communication-efficient if it eventually only uses n - 1 unidirectional links"
Related Works
  [Gopal and Perry, PODC’93]
  [Anagnostou and Hadzilacos, WDAG’93]
  [Beauquier and Kekkonen-Moneta, JSS’97]
  Communication-efficiency never considered
Self-Stabilizing Leader Election in a fully timely network?
  Yes + communication-efficiency
Algorithm (1/4)
  Each process p such that Leader = p periodically sends ALIVE,p to every other process
  [Figure: processes with Leader=1 and Leader=2 sending ALIVE,1 and ALIVE,2]
Algorithm (2/4)
  When an alive process p such that Leader = p receives ALIVE from process q, Leader := q if q < p
  [Figure: processes exchanging ALIVE,1 and ALIVE,2, with their Leader variables]
Algorithm (3/4)
  Each alive process q such that Leader ≠ q always chooses as leader the process from which it most recently received ALIVE
  [Figure: processes receiving ALIVE,1, with their Leader variables]
Algorithm (4/4)
  On timeout, each alive process p sets Leader to p
  [Figure: processes exchanging ALIVE,1 and ALIVE,2, with their Leader variables]
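The four rules above can be tried out in a small round-based simulation. This is only a sketch under simplifying assumptions that are not in the slides: all links are timely with a delay of one round, a timeout lasts exactly one round, no stale messages are in transit initially, and messages received in the same round are processed in increasing sender ID. The names `step` and `run` are hypothetical.

```python
def step(alive, leader):
    """One synchronous round; `leader` maps each alive process to its Leader variable."""
    # Rule 1: every process that believes it is the leader sends ALIVE.
    senders = sorted(p for p in alive if leader[p] == p)
    new_leader = {}
    for p in alive:
        current = leader[p]
        received = [q for q in senders if q != p]
        if not received:
            # Rule 4: nothing arrived before the timer expired.
            current = p
        for q in received:
            if current == p:
                # Rule 2: a self-proclaimed leader yields only to a smaller ID.
                if q < p:
                    current = q
            else:
                # Rule 3: a follower adopts the most recent ALIVE sender.
                current = q
        new_leader[p] = current
    return new_leader


def run(alive, leader, rounds):
    """Apply `rounds` synchronous rounds starting from an arbitrary state."""
    for _ in range(rounds):
        leader = step(alive, leader)
    return leader


# Process 1 has crashed, yet every alive process initially trusts it.
print(run([2, 3, 4], {2: 1, 3: 1, 4: 1}, rounds=3))  # {2: 2, 3: 2, 4: 2}
```

In this model, once a single process is left sending ALIVE, only the n - 1 links leaving that process carry messages, which is the communication-efficiency property quoted earlier.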
Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous?
  No
Impossibility of Communication-Efficiency in a system with at most one asynchronous link
  Claim: any process p such that Leader ≠ p must periodically receive messages within a bounded time; otherwise it chooses another leader
Self-Stabilizing (non communication-efficient) Leader Election in a system where some links are asynchronous?
  Yes
Self-Stabilizing Leader Election in a system with a timely routing overlay
  For each pair of alive processes (p,q), there exist at least two paths of timely links:
    One from p to q
    One from q to p
Algorithm
  Each process computes the set of alive processes and chooses as leader the smallest process of this set
  To compute the set:
  1. Each process p periodically sends ALIVE,p to every other process
  2. Any ALIVE,p message is repeated n - 1 times (any other process periodically receives such a message)
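As a rough illustration of the two steps above, the sketch below floods each ALIVE,p message over the timely links only and lets every process pick the smallest ID it has heard. It reads "repeated n - 1 times" as "relayed for up to n - 1 hops", which is an interpretation, and it ignores stale initial state and timeouts. The names `elect` and `timely` are hypothetical.

```python
def elect(n, timely, alive):
    """Return the leader chosen by each alive process.

    n      -- total number of processes (bounds the number of relays)
    timely -- set of directed timely links (u, v)
    alive  -- iterable of alive process IDs
    """
    alive = list(alive)
    heard = {p: {p} for p in alive}      # IDs whose ALIVE reached p
    frontier = {p: {p} for p in alive}   # origin -> processes relaying it next
    for _ in range(n - 1):               # each ALIVE is relayed at most n - 1 times
        next_frontier = {}
        for origin, holders in frontier.items():
            reached = set()
            for u in holders:
                for v in alive:
                    if (u, v) in timely and origin not in heard[v]:
                        heard[v].add(origin)
                        reached.add(v)
            next_frontier[origin] = reached
        frontier = next_frontier
    # Each process elects the smallest process it believes to be alive.
    return {p: min(heard[p]) for p in alive}


# Process 1 has crashed; the timely links form the directed ring 2 -> 3 -> 4 -> 2.
print(elect(4, {(2, 3), (3, 4), (4, 2)}, [2, 3, 4]))  # {2: 2, 3: 2, 4: 2}
```

Because the ring provides a timely path in both directions between every pair of alive processes, every process hears every other one and all of them agree on process 2.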
Self-Stabilizing Leader Election in a system without timely routing overlay?
  No
Conclusion
  Obtaining algorithms that are both self-stabilizing and crash tolerant is highly desirable
  But designing communication-efficient solutions requires strong synchrony assumptions, even if the network is fully-connected
  Solution: FTPS (Fault-Tolerant Pseudo-Stabilization)
Enhance The Convergence
  Fault-Containing Self-Stabilization
  Time-Adaptive Self-Stabilization
  Safe-Converging Self-Stabilization
  Superstabilization
  Snap-Stabilization
Fault-Containing Self-Stabilization
  [Ghosh et al, PODC’96]
  Self-stabilizing + if there are only a few faults:
    Spatial containment: only a small number of processes can be contaminated by the faults
    Fast convergence time
Time-Adaptive Self-Stabilization
  [Kutten & Patt-Shamir, PODC’97]
  Self-stabilizing and, if f < k processes are faulty:
    The output of the algorithm stabilizes in O(f) time
  [Timeline: faults hit f processes, then the output is stabilized, then the state is stabilized]
Safe-Converging Self-Stabilization
  [Kakugawa & Masuzawa, IPDPS’06]
  Self-stabilizing and fast convergence to a weaker (useful) predicate
  E.g., Minimal Dominating Set (MDS):
  [Figure: arbitrary initial configuration, then DS, then MDS]
Superstabilization
  [Dolev & Herman, CJTCS’97]
  A superstabilizing algorithm:
    Must be self-stabilizing
    Must preserve a "passage predicate"
  Passage predicate: defined with respect to a class of topology changes (a topology change falsifies legitimacy, and therefore the passage predicate must be weaker than legitimacy but strong enough to be useful).
  [Figure labels: topological change, passage predicate]
Passage Predicate - Example
  In a token ring: a processor crash can lose the token but still not falsify the passage predicate
  Passage predicate: at most one token exists in the system (e.g., the existence of 2 tokens isn’t legal)
  Legitimate state: exactly one token exists in the system
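The difference between the two predicates can be written down directly. In this small sketch the ring is modelled, as an assumption made only for illustration, by a list of booleans telling which processor currently holds a token; the function names are hypothetical.

```python
def legitimate(tokens):
    """Legitimate state: exactly one token exists in the system."""
    return sum(tokens) == 1


def passage(tokens):
    """Passage predicate: at most one token exists in the system."""
    return sum(tokens) <= 1


ring = [False, True, False]         # the second processor holds the token
after_crash = [ring[0], ring[2]]    # that processor crashes, the token is lost

print(legitimate(ring), passage(ring))                # True True
print(legitimate(after_crash), passage(after_crash))  # False True
```

The crash falsifies legitimacy but not the passage predicate, which is the situation the slide describes.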
Snap-Stabilization
  [Bui et al, WSS’99]
  A snap-stabilizing algorithm operates correctly immediately after the end of the faults
  Request-based algorithm and user-centric point of view:
    Each time a user initiates a request, it obtains a correct result for its request
Self vs. Snap
  [Two figures, not recoverable from the extracted text; surviving labels: "1.X 2.X N.X" and "1.X"]
Snap-Stabilization in Message-Passing Systems
  Sylvie Delaët (LRI)
  Stéphane Devismes (CNRS, LRI)
  Mikhail Nesterenko (Kent State University)
  Sébastien Tixeuil (LIP6)
Message-Passing Model
  Network bidirectional and fully-connected
  Communication by messages
  Links asynchronous, fair, and FIFO
  IDs on processes
  Transient faults
Related Works in message-passing (reliable communication in self-stabilization)
  [Gouda & Multari, 1991]
    Deterministic + Unbounded Capacity => Unbounded Counter
    Deterministic + Bounded Capacity => Bounded Counter
  [Afek & Brown, 1993]
    Probabilistic + Unbounded Capacity + Bounded Counter
Related Works in message-passing (self-stabilization)
  [Varghese, 1993]
    Deterministic + Bounded Capacity
  [Katz & Perry, 1993]
    Unbounded Capacity, deterministic, infinite counter
  [Delaët et al]
    Unbounded Capacity, deterministic, finite memory, silent tasks
Related Works (snap-stabilization)
  Nothing in the message-passing model
  Only in the state model:
    Locally shared memory
    Composite atomicity
    [Cournier et al, 2003]
Snap-Stabilization in Message-Passing Systems
Case 1: unbounded capacity links
  Impossible for safety-distributed specifications
Safety-distributed specification
  Example: Mutual Exclusion
  [Figure: processes p and q, labelled A and B]
Safety-distributed specification
  [Figure: process p (label A) in state s_p with messages m1 to m5; process q (label B) in state s_q with messages m'1 to m'4]
Case 2: bounded capacity links
  Problem to solve: Reliable Communication
  Starting from any configuration, if Tintin sends a question to Captain Haddock, then:
    Tintin eventually receives good answers
    Tintin only delivers the good answers
Case 2: bounded capacity links
  Case study: single-message capacity
    0 or 1 message
Case 2: bounded capacity links
  Sequence number: State ∈ {0,1,2,3,4}
  [Figure: p and q each hold State and NeigState; State_p starts at 0 while the other values and the link contents are unknown (?); the exchange repeats until State_p = 4]
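The role of the five sequence values can be explored with a toy simulation of one request from p to q. Everything below is an illustrative model rather than the authors' protocol: each link holds at most one message, a message sent on a full link is lost, q simply echoes its NeigState, and rounds follow a fixed schedule. The `fresh` flag is instrumentation only; it marks data that really originates from p after the request started. The function name `run` and its parameters are hypothetical.

```python
def run(init_pq, init_qp, init_neig_q, max_state=4, max_rounds=100):
    """Return, for each increment of State_p, whether a fresh echo caused it.

    init_pq, init_qp -- arbitrary initial content of links p->q and q->p
                        (None means the link is empty)
    init_neig_q      -- arbitrary initial value of NeigState_q
    """
    state_p = 0                           # p starts its request at 0
    neig_q = (init_neig_q, False)         # (value, fresh)
    pq = None if init_pq is None else (init_pq, False)
    qp = None if init_qp is None else (init_qp, False)
    increments = []
    for _ in range(max_rounds):
        # 1. p consumes the echo waiting on link q->p, if any.
        if qp is not None:
            value, fresh = qp
            qp = None
            if value == state_p:          # q seems to have seen State_p
                state_p += 1
                increments.append(fresh)
                if state_p == max_state:
                    return increments
        # 2. q echoes its current NeigState (link q->p is empty here).
        qp = neig_q
        # 3. p sends State_p; the message is lost if link p->q is still full.
        if pq is None:
            pq = (state_p, True)
        # 4. q consumes the message on link p->q and records it.
        neig_q, pq = pq, None
    raise RuntimeError("p did not reach max_state")


# Adversarial start: stale 2 on p->q, stale 0 on q->p, NeigState_q = 1.
print(run(2, 0, 1))               # [False, False, False, True]
print(run(2, 0, 1, max_state=3))  # [False, False, False]
```

In this model the initial configuration can fake at most three acknowledgements (one per link plus NeigState_q), so with State ranging over {0,...,4} the last increment, from 3 to 4, is always caused by a genuine echo, whereas with only four values p could terminate on stale data alone.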
Case 2: bounded capacity links
  Pathological case:
  [Figure: p and q with State_p = 0; NeigState values and link contents are unknown (?), one of them shown as 1]
Generalizations
  Arbitrary bounded capacity: 2 × C_max + 3 values
  [Figure: p and q; each link can hold C_max values, and NeigState holds 1 value]
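One way to read the bound, offered here as an interpretation of the figure labels rather than as a statement taken from the slides, is to count the stale values that an arbitrary initial configuration can hide between p and q:

```latex
\underbrace{C_{\max}}_{\text{link } p \to q}
+ \underbrace{C_{\max}}_{\text{link } q \to p}
+ \underbrace{1}_{\mathit{NeigState}_q}
= 2C_{\max} + 1 \ \text{stale values}
```

If each stale value can trigger at most one spurious increment of State_p, then 2C_max + 2 increments guarantee that the last one is genuine, and counting from 0 this takes 2C_max + 3 distinct values. For C_max = 1 this gives 5 values, matching State ∈ {0,1,2,3,4} in the single-message case study.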
Generalizations
  PIF (Propagation of Information with Feedback) in fully-connected network
  [Figure: a message m is sent to three processes, each of which answers with A_m]
Application
  Mutual Exclusion in a fully-connected & identified network using the PIF
Mutual Exclusion
  Specification:
    Any process that requests the CS enters the CS in finite time (Liveness)
    If a requesting process enters the CS, then it executes the CS alone (Safety)
  N.B. Some non-requesting processes may be initially in the CS
Principles (1/6)
  Let L be the process with the smallest ID
  L decides, using Value_L, which process is authorized to access the CS:
  1. If Value_L = 0, then L is authorized
  2. If Value_L = i, then the i-th neighbour of L is authorized
  When a process learns that it is authorized by L to access the CS:
  1. It ensures that no other process can execute the CS
  2. It executes the CS, if it requests it
  3. It notifies L when it terminates Step 2 (so that L increments Value_L)
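The way Value_L designates the authorized process can be sketched as follows. Two details are assumptions made for the example and are not stated in the slides: L numbers its neighbours by increasing ID, and the increment wraps around modulo n. The function names are hypothetical.

```python
def authorized(leader, neighbours, value):
    """Process designated by Value_L.

    neighbours -- the leader's neighbours in its local numbering 1..n-1
                  (assumed here to be sorted by ID)
    """
    return leader if value == 0 else neighbours[value - 1]


def next_value(value, n):
    """Increment of Value_L, assumed to wrap around modulo n."""
    return (value + 1) % n


# Four processes with IDs 2, 3, 5, 8: L = 2 and its neighbours are 3, 5, 8.
value = 2
print(authorized(2, [3, 5, 8], value))   # 5
value = next_value(value, 4)             # 3
print(authorized(2, [3, 5, 8], value))   # 8
value = next_value(value, 4)             # 0
print(authorized(2, [3, 5, 8], value))   # 2
```

With the wrap-around, every process is designated in turn, which is what the liveness part of the specification needs.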
Principles (2/6)
  Each process executes 4 phases in sequence, infinitely often
  A requesting process p can enter the CS only after executing Phases 1 to 4 consecutively
  The CS is executed in Phase 4
Principles (3/6)
  Process p evaluates the IDs
  [Figure: process 5, in Phase=1, sends "Id?" and obtains Leader=2]
Principles (4/6)
  Process p asks each other process q whether Value_q = p
  [Figure: process 5, in Phase=2 with Leader=2, sends "Ok?" to processes 2, 3, and 8; process 2 (Value=2) answers Yes, the others answer No, and process 5 sets Ok=true]
Principles (5/6)
  If Winner(p), then p broadcasts EXIT to every other process
  [Figure: process 5, in Phase=3 with Leader=2, Ok=true, and Winner(5)=true, sends Exit to processes 2, 3, and 8 and receives Ok; the Phase, Leader, Ok, and Winner values of the other processes are unknown (?), and a receiver is shown with Phase=1]
Principles (6/6)
  If Winner(p), then p executes the CS; then, if p ≠ L, p broadcasts ExitCS, else p increments Value_p
  [Figure: process 5, in Phase=4 with Leader=2, Ok=true, and Winner(5)=true, sends ExitCS to processes 2, 3, and 8 and receives Ok; the value held by process 2 changes from Value=2 to Value=3]
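Putting the four phases together, a single fault-free pass of one process might look like the sketch below. It replaces every PIF exchange by a direct read of the other processes' variables, so it illustrates the control flow only and says nothing about stabilization. The Winner predicate used here (the leader's Value designates p), the neighbour numbering by increasing ID, and the modulo-n increment are assumptions; `one_pass` and `designates` are hypothetical names.

```python
def designates(q, target, ids, value):
    """True if Value_q designates `target` (neighbours numbered by increasing ID)."""
    neighbours = sorted(r for r in ids if r != q)
    chosen = q if value[q] == 0 else neighbours[value[q] - 1]
    return chosen == target


def one_pass(p, ids, value, wants_cs, log):
    """One fault-free pass of process p through Phases 1 to 4."""
    # Phase 1: p collects the IDs and takes the smallest one as leader.
    leader = min(ids)
    # Phase 2: p asks every other process q whether Value_q designates p.
    ok = {q: designates(q, p, ids, value) for q in ids if q != p}
    # Assumed Winner predicate: only the leader's opinion counts.
    winner = designates(p, p, ids, value) if p == leader else ok[leader]
    if not winner:
        return False
    # Phase 3: p tells every other process to leave its current pass.
    log.append(("EXIT", p))
    # Phase 4: critical section, then release.
    if wants_cs:
        log.append(("CS", p))
    if p != leader:
        log.append(("ExitCS", p))        # the leader reacts by incrementing
    value[leader] = (value[leader] + 1) % len(ids)
    return True


ids = [5, 2, 3, 8]
value = {2: 2, 3: 0, 5: 1, 8: 1}         # Value_2 = 2 designates process 5
log = []
print(one_pass(5, ids, value, True, log), value[2], log)
# True 3 [('EXIT', 5), ('CS', 5), ('ExitCS', 5)]
```

The example uses the IDs that survive in the slide figures (5, 2, 3, 8 with Leader=2) and ends with Value_2 = 3, like the last figure.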
Conclusion
  Snap-stabilization in message-passing is no longer an open question
Extensions
  Apply snap-stabilization in message-passing to:
    Other topologies (tree, arbitrary topology)
    Other problems
    Other failure patterns
  Space requirement
Thank you very much!