Introduction to Self-Stabilization Stéphane Devismes.

Slides:



Advertisements
Similar presentations
Impossibility of Distributed Consensus with One Faulty Process
Advertisements

Teaser - Introduction to Distributed Computing
Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Authored by: Seth Gilbert and Nancy Lynch Presented by:
Snap-stabilizing Committee Coordination Borzoo Bonakdarpour Stephane Devismes Franck Petit IEEE International Parallel and Distributed Processing Symposium.
Chapter 15 Basic Asynchronous Network Algorithms
Leader Election Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader.  i,j  V  i,j are non-faulty.
Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.
Self Stabilizing Algorithms for Topology Management Presentation: Deniz Çokuslu.
Self-stabilizing Distributed Systems Sukumar Ghosh Professor, Department of Computer Science University of Iowa.
Fast Leader (Full) Recovery despite Dynamic Faults Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Sébastien Tixeuil.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Self Stabilization 1.
Lecture 4: Elections, Reset Anish Arora CSE 763 Notes include material from Dr. Jeff Brumfield.
UPV / EHU Efficient Eventual Leader Election in Crash-Recovery Systems Mikel Larrea, Cristian Martín, Iratxe Soraluze University of the Basque Country,
Byzantine Generals Problem: Solution using signed messages.
Failure Detectors. Can we do anything in asynchronous systems? Reliable broadcast –Process j sends a message m to all processes in the system –Requirement:
From Self- to Snap- Stabilization Alain Cournier, Stéphane Devismes, and Vincent Villain SSS’2006, November 17-19, Dallas (USA)
Self-Stabilization: An approach for Fault-Tolerance in Distributed Systems Stéphane Devismes 16/12/2013MAROC'2013.
Chapter 8 - Self-Stabilizing Computing1 Chapter 8 – Self-Stabilizing Computing Self-Stabilization Shlomi Dolev MIT Press, 2000 Draft of January 2004 Shlomi.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
CPSC 668Set 3: Leader Election in Rings1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:
CPSC 668Self Stabilization1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
CS294, YelickSelf Stabilizing, p1 CS Self-Stabilizing Systems
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 2 – Distributed Systems.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 12: Impossibility.
Efficient Algorithms to Implement Failure Detectors and Solve Consensus in Distributed Systems Mikel Larrea Departamento de Arquitectura y Tecnología de.
Composition Model and its code. bound:=bound+1.
On Probabilistic Snap-Stabilization Karine Altisen Stéphane Devismes University of Grenoble.
Selected topics in distributed computing Shmuel Zaks
Andreas Larsson, Philippas Tsigas SIROCCO Self-stabilizing (k,r)-Clustering in Clock Rate-limited Systems.
On Probabilistic Snap-Stabilization Karine Altisen Stéphane Devismes University of Grenoble.
Snap-Stabilizing PIF and Useless Computations Alain Cournier, Stéphane Devismes, and Vincent Villain ICPADS’2006, July , Minneapolis (USA)
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 10 Instructor: Haifeng YU.
Review for Exam 2. Topics included Deadlock detection Resource and communication deadlock Graph algorithms: Routing, spanning tree, MST, leader election.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 10, 2005 Session 9.
A Self-Stabilizing O(n)-Round k-Clustering Algorithm Stéphane Devismes, VERIMAG.
Autonomic distributed systems. 2 Think about this Human population x10 9 computer population.
Self-Stabilizing K-out-of-L Exclusion on Tree Networks Stéphane Devismes, VERIMAG Joint work with: – Ajoy K. Datta (Univ. Of Nevada) – Florian Horn (LIAFA)
Self-Stabilizing K-out-of-L Exclusion on Tree Networks Stéphane Devismes, VERIMAG Joint work with: – Ajoy K. Datta (Univ. Of Nevada) – Florian Horn (LIAFA)
Approximation of δ-Timeliness Carole Delporte-Gallet, LIAFA UMR 7089, Paris VII Stéphane Devismes, VERIMAG UMR 5104, Grenoble I Hugues Fauconnier, LIAFA.
Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.
Weak vs. Self vs. Probabilistic Stabilization Stéphane Devismes (CNRS, LRI, France) Sébastien Tixeuil (LIP6-CNRS & INRIA, France) Masafumi Yamashita (Kyushu.
SysRép / 2.5A. SchiperEté The consensus problem.
Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.
Impossibility of Distributed Consensus with One Faulty Process By, Michael J.Fischer Nancy A. Lynch Michael S.Paterson.
Agreement in Distributed Systems n definition of agreement problems n impossibility of consensus with a single crash n solvable problems u consensus with.
CS 542: Topics in Distributed Systems Self-Stabilization.
1 Fault tolerance in distributed systems n Motivation n robust and stabilizing algorithms n failure models n robust algorithms u decision problems u impossibility.
Self-stabilization. Technique for spontaneous healing after transient failure or perturbation. Non-masking tolerance (Forward error recovery). Guarantees.
Self-Stabilizing Algorithm with Safe Convergence building an (f,g)-Alliance Fabienne Carrier Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Yvan Rivierre.
Hwajung Lee.  Technique for spontaneous healing.  Forward error recovery.  Guarantees eventual safety following failures. Feasibility demonstrated.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.
Superstabilizing Protocols for Dynamic Distributed Systems Authors: Shlomi Dolev, Ted Herman Presented by: Vikas Motwani CSE 291: Wireless Sensor Networks.
ITEC452 Distributed Computing Lecture 15 Self-stabilization Hwajung Lee.
Self-stabilizing (f,g)-Alliances with Safe Convergence Fabienne Carrier Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Yvan Rivierre.
Snap-Stabilizing Committee Coordination Borzoo Bonakdarpour, Stéphane Devismes, and Frank Petit.
Snap-Stabilizing Depth-First Search on Arbitrary Networks Alain Cournier, Stéphane Devismes, Franck Petit, and Vincent Villain OPODIS 2004, December
Around Self-Stabilization Part 2: Strengthened Forms of Self-Stabilization Stéphane Devismes Post-Doc CNRS at the LRI (Paris VII)
CSE-591: Term Project Self-stabilizing Network Algorithms by Tridib Mukherjee ASU ID :
Computer Science 425/ECE 428/CSE 424 Distributed Systems (Fall 2009) Lecture 20 Self-Stabilization Reading: Chapter from Prof. Gosh’s book Klara Nahrstedt.
Fundamentals of Fault-Tolerant Distributed Computing In Asynchronous Environments Paper by Felix C. Gartner Graeme Coakley COEN 317 November 23, 2003.
第1部: 自己安定の緩和 すてふぁん どぅゔぃむ ポスドク パリ第11大学 LRI CNRS あどばいざ: せばすちゃ てぃくそい
New Variants of Self-Stabilization
Robust Stabilizing Leader Election
Algorithms for Extracting Timeliness Graphs
Introduction to Self-Stabilization
Corona Robust Low Atomicity Peer-To-Peer Systems
Snap-Stabilization in Message-Passing Systems
Presentation transcript:

Introduction to Self-Stabilization Stéphane Devismes

27/03/20082 Self-Stabilization [Dijkstra, 1974] Example: Dijkstra’s Token Ring

27/03/20083 Starting from an arbitrary state

27/03/20084 Definition: Closure + Convergence States of the system Illegitimate statesLegitimate States Convergence Closure

27/03/20085 Why Self-Stabilization? Tolerance to transient faults Eventually Safe No initializationOvercost Dynamicity No Detection of Stability AdvantagesDrawbacks

27/03/20086 Protocols for: Resources Allocation (Mutual Exclusion) Broadcast Routing Overlay (Spanning trees, Routing table) …

27/03/20087 Around Self-Stabilization (1/2) Weaker Properties: K-Stabilization (no more than K faults) Weak-Stabilization (possible convergence) Probabilistic Stabilization (probabilistic convergence) Pseudo-Stabilization Aim: circumvent impossibility results Example: alternated bit protocol

27/03/20088 Pseudo-Stabilization ? Self-Stabilization [Dijkstra, 1974]: Starting from any configuration, a self-stabilizing system reaches in a finite time a configuration c such that any suffix starting from c satisfies the intended specification. Pseudo-Stabilization [Burns, Gouda, and Miller, 1993]: Starting from any configuration, any execution of a pseudo-stabilizing system has a non-empty suffix that satisfies the intended specification.

27/03/20089 Self- vs. Pseudo- Stabilization Illegitimate States Legitimate States Strong Closure vs. Ultimate Closure

27/03/ Self- vs. Pseudo- Stabilization Example: Leader Election Self-Stabilizing Leader Election: Eventually there is a unique leader that cannot change Pseudo-Stabilizing Leader Election: We never have the guarantee that the leader no more changes but eventually it no more change Remark: no stabilization time in pseudo-stabilization

27/03/ Around Self-Stabilization (2/2) Stronger Properties: Fault-containment (Quick stabilization when there are few faults) Snap-Stabilization (Safety for the tasks started after the faults) Byzantine-Tolerant Stabilization Fault-Tolerant Stabilization (Stabilization despite crashes) Aim: circumvent the drawbacks

Fault-Tolerant Stabilizing Leader Election Carole Delporte-Gallet (LIAFA) Stéphane Devismes (CNRS, LRI) Hugues Fauconnier (LIAFA) LIAFA

27/03/ Fault-Tolerant Stabilization Gopal and Perry, PODC’93 Beauquier and Kekkonen-Moneta, JSS’97 Anagnostou and Hadzilacos, WDAG’93 In partial synchronous model ?

27/03/ Leader Election Fault-Tolerant Stabilizing Leader Election with: weak reliability and synchrony assumptions

27/03/ Model Network: fully-connected n Processes: timely may crash (an arbitrary number of processes may crash) Variables: initially arbitrary assigned Links: Unidirectional Initially not necessarily empty No order on the message deliverance Variable reliability and timeliness assumptions

27/03/ Communication-Efficiency [Larrea, Fernandez, and Arevalo, 2000]: « An algorithm is communication-efficient if it eventually only uses n - 1 unidirectional links »

27/03/ Self-Stabilizing Leader Election in a full timely network? Yes + communication-efficiency

27/03/ Principles of the algorithm A process p periodically sends ALIVE to every other if Leader = p Leader=1 Leader=2 Alive,2 Alive,1

27/03/ Principles of the algorithm When a process p such that Leader = p receives ALIVE from q, then Leader := q if q < p Leader=1 Leader=2 Alive,2 Alive,1 Leader=1 4

27/03/ Principles of the algorithm Any process q such that Leader ≠ q always chooses as leader the process from which it receives ALIVE the most recently Leader=1 Leader=2Leader=1 Alive,1 Leader=1 4

27/03/ Principles of the algorithm On Time out, a process p sets Leader to p Leader=3 Leader=2Leader=4 Alive,2 Alive,1 Leader=1 Leader=2 4

27/03/ Communication-Efficient Self-Stabilizing Leader Election in a system where at most one link is asynchronous? No

27/03/ Impossibility of Communication-Efficiency in a system with at most one asynchronous link Claim: Any process p such that Leader ≠ p must periodically receive messages within a bounded time otherwise it chooses another leader The process chooses another leader

27/03/ Self-Stabilizing (non communication-efficient) Leader Election in a system where some links are asynchronous? Yes

27/03/ Self-Stabilizing Leader Election in a system with a timely routing overlay For each pair of alive processes (p,q), there exists at least two paths of timely links: From p to q From q to p

27/03/ Principle of the algorithm Each process computes the set of alive processes and chooses as leader the smallest process of this set To compute the set: 1. Each process p periodically sends ALIVE,p to every other process 2. Any ALIVE,p message is repeated n - 1 times (any other process periodically receives such a message)

27/03/ Self-Stabilizing Leader Election in a system without timely routing overlay ? No

27/03/ Pseudo-Stabilizing Leader Election in a system where Self-Stabilizing Leader Election is not possible ? Yes + communication-efficiently In a system having a source and fair links

27/03/ Algorithm for systems with Source + fair links A process p periodically sends ALIVE to every other if Leader = p Each process stores in Active its ID + the IDs of each process from which it recently receives ALIVE Each process chooses its leader among the processes in its Active set Problem: we cannot use the IDs to choose a leader 21 Source <1><1><2><2> Alive,1 Alive,2

27/03/ Accusation Counter p stores in Counter[p] how many times it was suspected to be crashed When a process suspects its leader: it sends an ACCUSATION to LEADER, and chooses as new leader the process in Active with the smallest accusation counter p periodically sends ALIVE,Counter[p] to every other if Leader = p Problem: the accusation counter of the source can increase infinitely often 12 3 Source 3 <3><3> ,C=2 1,C=1 <2><2> Accuse

27/03/ Phase Counter Each process maintains in Phase[p] the number of times it looses the leadership p periodically sends ALIVE,Counter[p],Phase[p] to every other if Leader = p p increments Counter[p] only when receiving ACCUSATION,ph with ph = Phase[p] 12 3 Source <3><3> ,C=2 1,C=1 Ph=3 Ph=1Ph=2 Ph=4 (previously 3) <2><2> Accuse,3

27/03/ Communication-Efficient Pseudo-Stabilizing Leader Election in a system having only a source? No, but a non communication-efficient pseudo-stabilizing leader election can be done

27/03/ Result Summary ce-FTSSFTSSce-FTPSFTPS Full-TimelyYes Bi-sourceNoYes Timely routingNoYes? Source + fair linksNo Yes SourceNo Yes Totally asynchronousNo

27/03/ Perspectives Communication-efficient FTPS leader election in a system with timely routing overlay Extend these results to other topologies and models Fault-tolerant stabilizing decision problems ?

Thank You!