On Probabilistic Snap-Stabilization Karine Altisen Stéphane Devismes University of Grenoble.

Slides:

Advertisements

Similar presentations

Chapter 6 - Convergence in the Presence of Faults1-1 Chapter 6 Self-Stabilization Self-Stabilization Shlomi Dolev MIT Press, 2000 Shlomi Dolev, All Rights.

Advertisements

Chapter 7 - Local Stabilization1 Chapter 7: roadmap 7.1 Super stabilization 7.2 Self-Stabilizing Fault-Containing Algorithms 7.3 Error-Detection Codes.

Lecture 8: Asynchronous Network Algorithms

Snap-stabilizing Committee Coordination Borzoo Bonakdarpour Stephane Devismes Franck Petit IEEE International Parallel and Distributed Processing Symposium.

Leader Election Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader.  i,j  V  i,j are non-faulty.

Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.

Self Stabilizing Algorithms for Topology Management Presentation: Deniz Çokuslu.

Self-Stabilization in Distributed Systems Barath Raghavan Vikas Motwani Debashis Panigrahi.

Fast Leader (Full) Recovery despite Dynamic Faults Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Sébastien Tixeuil.

CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Self Stabilization 1.

Introduction to Self-Stabilization Stéphane Devismes.

Byzantine Generals Problem: Solution using signed messages.

From Self- to Snap- Stabilization Alain Cournier, Stéphane Devismes, and Vincent Villain SSS’2006, November 17-19, Dallas (USA)

Self-Stabilization: An approach for Fault-Tolerance in Distributed Systems Stéphane Devismes 16/12/2013MAROC'2013.

1 Complexity of Network Synchronization Raeda Naamnieh.

CPSC 668Set 3: Leader Election in Rings1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.

Chapter 4 - Self-Stabilizing Algorithms for Model Conservation4-1 Chapter 4: roadmap 4.1 Token Passing: Converting a Central Daemon to read/write 4.2 Data-Link.

1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:

CPSC 668Self Stabilization1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.

LSRP: Local Stabilization in Shortest Path Routing Anish Arora Hongwei Zhang.

Chapter 4 - Self-Stabilizing Algorithms for Model Conversions iddistance 22 iddistance iddistance iddistance.

CS294, YelickSelf Stabilizing, p1 CS Self-Stabilizing Systems

Self-Stabilization An Introduction Aly Farahat Ph.D. Student Automatic Software Design Lab Computer Science Department Michigan Technological University.

Chapter Resynchsonous Stabilizer Chapter 5.1 Resynchsonous Stabilizer Self-Stabilization Shlomi Dolev MIT Press, 2000 Draft of Jan 2004, Shlomi.

A Local Facility Location Algorithm Supervisor: Assaf Schuster Denis Krivitski Technion – Israel Institute of Technology.

Self Stabilization Classical Results and Beyond… Elad Schiller CTI (Grece)

On Probabilistic Snap-Stabilization Karine Altisen Stéphane Devismes University of Grenoble.

Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.

Selected topics in distributed computing Shmuel Zaks

Andreas Larsson, Philippas Tsigas SIROCCO Self-stabilizing (k,r)-Clustering in Clock Rate-limited Systems.

Fault-containment in Weakly Stabilizing Systems Anurag Dasgupta Sukumar Ghosh Xin Xiao University of Iowa.

Snap-Stabilizing PIF and Useless Computations Alain Cournier, Stéphane Devismes, and Vincent Villain ICPADS’2006, July , Minneapolis (USA)

Consensus and Its Impossibility in Asynchronous Systems.

Fault-containment in Weakly Stabilizing Systems Anurag Dasgupta Sukumar Ghosh Xin Xiao University of Iowa.

Energy-Efficient Shortest Path Self-Stabilizing Multicast Protocol for Mobile Ad Hoc Networks Ganesh Sridharan

The Complexity of Distributed Algorithms. Common measures Space complexity How much space is needed per process to run an algorithm? (measured in terms.

A Self-Stabilizing O(n)-Round k-Clustering Algorithm Stéphane Devismes, VERIMAG.

6.852: Distributed Algorithms Spring, 2008 Class 25-1.

Self-Stabilizing K-out-of-L Exclusion on Tree Networks Stéphane Devismes, VERIMAG Joint work with: – Ajoy K. Datta (Univ. Of Nevada) – Florian Horn (LIAFA)

Self-Stabilizing K-out-of-L Exclusion on Tree Networks Stéphane Devismes, VERIMAG Joint work with: – Ajoy K. Datta (Univ. Of Nevada) – Florian Horn (LIAFA)

Weak vs. Self vs. Probabilistic Stabilization Stéphane Devismes (CNRS, LRI, France) Sébastien Tixeuil (LIP6-CNRS & INRIA, France) Masafumi Yamashita (Kyushu.

Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.

Impossibility of Distributed Consensus with One Faulty Process By, Michael J.Fischer Nancy A. Lynch Michael S.Paterson.

CS 542: Topics in Distributed Systems Self-Stabilization.

Sorting on Skip Chains Ajoy K. Datta, Lawrence L. Larmore, and Stéphane Devismes.

CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch CSCE 668 Set 3: Leader Election in Rings 1.

1 Fault tolerance in distributed systems n Motivation n robust and stabilizing algorithms n failure models n robust algorithms u decision problems u impossibility.

Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb

Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb

Self-Stabilizing Algorithm with Safe Convergence building an (f,g)-Alliance Fabienne Carrier Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Yvan Rivierre.

Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb

Snap-Stabilization in Message-Passing Systems Sylvie Delaët (LRI) Stéphane Devismes (CNRS, LRI) Mikhail Nesterenko (Kent State University) Sébastien Tixeuil.

Superstabilizing Protocols for Dynamic Distributed Systems Authors: Shlomi Dolev, Ted Herman Presented by: Vikas Motwani CSE 291: Wireless Sensor Networks.

1 Fault-Tolerant Consensus. 2 Communication Model Complete graph Synchronous, network.

Distributed Error- Confinement Shay Kutten (Technion) with Boaz Patt-Shamir (Tel Aviv U.) Yossi Azar (Tel Aviv U.)

CIS 825 Review session. P1: Assume that processes are arranged in a ring topology. Consider the following modification of the Lamport’s mutual exclusion.

Self-stabilizing (f,g)-Alliances with Safe Convergence Fabienne Carrier Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Yvan Rivierre.

Snap-Stabilizing Depth-First Search on Arbitrary Networks Alain Cournier, Stéphane Devismes, Franck Petit, and Vincent Villain OPODIS 2004, December

Around Self-Stabilization Part 2: Strengthened Forms of Self-Stabilization Stéphane Devismes Post-Doc CNRS at the LRI (Paris VII)

CSE-591: Term Project Self-stabilizing Network Algorithms by Tridib Mukherjee ASU ID :

Model and complexity Many measures Space complexity Time complexity

第１部: 自己安定の緩和すてふぁんどぅゔぃむポスドクパリ第１１大学 LRI CNRS あどばいざ: せばすちゃてぃくそい

New Variants of Self-Stabilization

Alternating Bit Protocol

CS60002: Distributed Systems

A Snap-Stabilizing DFS with a Lower Space Requirement

Robust Stabilizing Leader Election

Distributed Error- Confinement

Introduction to Self-Stabilization

Snap-Stabilization in Message-Passing Systems

Presentation transcript:

On Probabilistic Snap-Stabilization Karine Altisen Stéphane Devismes University of Grenoble

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] Transient faults: Location: node or link Duration: finite Frequency: low e.g., memory corruptions, message losses, message corruptions AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] AlgoTel'2014, Ile de Ré04/06/2014

Self-Stabilization [Dijkstra,74] Recover after any number of transient faults AlgoTel'2014, Ile de Ré04/06/2014

Advantages Tolerate any finite number of transient faults Lightweight – Low overhead No initialization – Large-scale network – Self-organization in wireless sensor network Tolerate (detectable) topological changes Self-stabilizing algorithms can be easily composed AlgoTel'2014, Ile de Ré04/06/2014

Drawbacks of Self-Stabilization Temporary Loss of Safety No local detection of stabilization – Permanent local checks – Endlessly repeat computations Impossibility Results AlgoTel'2014, Ile de Ré04/06/2014

Drawbacks of Self-Stabilization Temporary Loss of Safety No local detection of stabilization – Permanent local checks – Endlessly repeat computations Impossibility Results AlgoTel'2014, Ile de Ré04/06/2014

Temporary Loss of Safety AlgoTel'2014, Ile de Ré Goal: Minimize the stabilization time 04/06/2014

Temporary Loss of Safety AlgoTel'2014, Ile de Ré Stabilization time usually in Ω(D) rounds [Tixeuil & Genolini, 2002] Stabilization time usually in Ω(D) rounds [Tixeuil & Genolini, 2002] 04/06/2014

Self-stabilization: endless repetitions of finite task executions AlgoTel'2014, Ile de Ré Transient faults 04/06/2014

Self-stabilization: endless repetitions of finite task executions AlgoTel'2014, Ile de Ré Finite (but usually unbounded) number of times Transient faults 04/06/2014

(Deterministic) Snap-Stabilization [Bui et al, 1999] Resume a correct behavior since the first started task after the end of faults AlgoTel'2014, Ile de Ré Transient faults First start 04/06/2014

(Deterministic) Snap-Stabilization [Bui et al, 1999] Strengthened form of self-stabilization – Snap-Stabilization ⇒ Self-Stabilization Lot of solutions are available, but only in – Identified, or – Rooted networks What about anonymous networks ? AlgoTel'2014, Ile de Ré04/06/2014

(Deterministic) Snap-Stabilization [Bui et al, 1999] Most of classical problems even have no deterministic self-stabilizing solutions, e.g., – Token passing – Leader Election Several weakened forms of self-stabilization have been introduced to circumvent these impossibility results, e.g., – Weak-stabilization [Gouda, 2001] – K-stabilization [Beauquier et al, 1998] – Probabilistic self-stabilization [Herman, 1990] AlgoTel'2014, Ile de Ré04/06/2014

(Deterministic) Snap-Stabilization [Bui et al, 1999] Most of classical problems even have no deterministic self-stabilizing solutions, e.g., – Token passing – Leader Election Several weakened forms of self-stabilization have been introduced to circumvent these impossibility results, e.g., – Weak-stabilization [Gouda, 2001] – K-stabilization [Beauquier et al, 1998] – Probabilistic self-stabilization [Herman, 1990] AlgoTel'2014, Ile de Ré04/06/2014

Deterministic vs. Probabilistic Self-Stabilization Deterministic Self-stabilization – Deterministic convergence: Starting from any configuration, the system converges to a legitimate configuration within finite time Probabilistic Self-stabilization – Probabilistic convergence: Starting from any configuration, the system converges to a legitimate configuration within almost surely finite time (Las Vegas Approach) AlgoTel'2014, Ile de Ré04/06/2014

Contribution (1/2) Probabilistic Snap-stabilization – Weakened form of snap-stabilization – Not comparable to self-stabilization Ideas – We relax the definition of snap-stabilization without altering its strong safety guarantees – to address anonymous networks AlgoTel'2014, Ile de Ré04/06/2014

Definition For every task started after the faults (i.e., starting from an arbitrary configuration) – The safety part of the specification is satisfied – However, the liveness part of the specification is only almost surely ensured (Las Vegas approach) AlgoTel'2014, Ile de Ré Transient faults The task is executed safely, but it terminates in almost surely finite time 04/06/2014

Contribution (2/2) 2 probabilistic snap-stabilizing protocols for the guaranteed service leader election in anonymous networks using the locally shared memory model – The first one assumes a synchronous scheduler – The second one assumes an asynchronous scheduler – This problem has no deterministic self- or snap- stabilizing solution AlgoTel'2014, Ile de Ré04/06/2014

Definition of the problem AlgoTel'2014, Ile de Ré time processes P1P1 P2P2 P3P3 P4P4 P5P5 04/06/2014

Definition of the problem AlgoTel'2014, Ile de Ré time processes P1P1 P2P2 P3P3 P4P4 P5P5 ? ? ? ? ? Upon a request A process initiates the question “Am I the leader of the network ?” Upon a request A process initiates the question “Am I the leader of the network ?” 04/06/2014

yes Definition of the problem AlgoTel'2014, Ile de Ré time processes P1P1 P2P2 P3P3 P4P4 P5P5 The task consists in computing the answer (yes/no) ? ? ? ? ? no 04/06/2014

yes Definition of the problem AlgoTel'2014, Ile de Ré time processes P1P1 P2P2 P3P3 P4P4 P5P5 Each process always receives the same answer to each of its request ? ? ? ? ? no yes ? ? ? no ? ? ? ? 04/06/2014

yes Definition of the problem AlgoTel'2014, Ile de Ré time processes P1P1 P2P2 P3P3 P4P4 P5P5 Exactly one process always receives yes ? ? ? ? ? no yes ? ? ? no ? ? ? ? 04/06/2014

Overview of the synchronous solution AlgoTel'2014, Ile de Ré04/06/2014

Roadmap Non-stabilizing probabilistic solution Probabilistic self-stabilizing solution Probabilistic snap-stabilizing solution AlgoTel'2014, Ile de Ré04/06/2014

Required Knowledge If processes has no global information about the network (e.g., its size), the problem is impossible to solve Sketch: we can reduce the problem to the Ring-Size Counting problem, which is known to be impossible to solve in anonymous networks AlgoTel'2014, Ile de Ré04/06/2014

Required Knowledge We assume that each process knows a value B s.t.: B < n ≤ 2B, where n is the number of processes AlgoTel'2014, Ile de Ré04/06/2014

Solution in a fault-free synchronous system AlgoTel'2014, Ile de Ré04/06/2014

Cycles Each process executes infinitely many cycles – The inputs of a cycle are a Boolean variable LE at each process – Initially, the value of LE is randomly chosen – A cycle consists in computing a variable UNIQ at each process which states if there is a unique process satisfying LE = true If yes, LE remains unchanged for each process Otherwise, each LE variable is randomly reset AlgoTel'2014, Ile de Ré04/06/2014

Request Upon a request, a process waits the end of the cycle – If UNIQ = true, then the value of LE is the answer – Otherwise, the answer is delayed until (at least) the end of the next cycle Aim: ensure that within almost surely finite time there is a unique process satisfying LE = true AlgoTel'2014, Ile de Ré04/06/2014

How to compute a cycle? Initially – The value of LE is randomly chosen by each process – UNIQ is set to false Each cycle is made of 3 phases Each phase lasts 2B steps (2B > Diameter) Each process has a local clock which takes value in [0..6B-1] to know in which phase it is AlgoTel'2014, Ile de Ré04/06/2014

How to compute a cycle? AlgoTel'2014, Ile de Ré Phase 1Phase 2 Phase 3 02B4B6B-1 One cycle 04/06/2014

How to compute a cycle? First Phase: compute a spanning forest s.t. – Each tree is rooted at a process such that LE = true – If no such process: no tree at the end of the phase AlgoTel'2014, Ile de Ré T T T T F F F F F F F F F F F F F F 04/06/2014

How to compute a cycle? First Phase: compute a spanning forest s.t. – Each tree is rooted at a process such that LE = true – If no such process: no tree at the end of the phase AlgoTel'2014, Ile de Ré T T T T F F F F F F F F F F F F F F 04/06/2014

How to compute a cycle? Second Phase: each root (if any) computes the number of nodes in its tree AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T T T F F F F F F F F F F F F F F /06/2014

How to compute a cycle? Second Phase: Each root (if any) computes the number of node in its tree AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T T T F F F F F F F F F F F F F F At the end of the phase If a root has more that B processes in its tree: LE ← true; UNIQ ← true In either cases, processes execute: LE ← false; UNIQ ← false (Here, we choose B = 5) At the end of the phase If a root has more that B processes in its tree: LE ← true; UNIQ ← true In either cases, processes execute: LE ← false; UNIQ ← false (Here, we choose B = 5) 04/06/2014

How to compute a cycle? Second Phase: Each root (if any) computes the number of node in its tree AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F F,1 F,2 F,3 T,6 At the end of the phase If one root has more that B processes in its tree: LE ← true; UNIQ ← true In either cases, processes execute: LE ← false; UNIQ ← false (Here, we choose B = 5) At the end of the phase If one root has more that B processes in its tree: LE ← true; UNIQ ← true In either cases, processes execute: LE ← false; UNIQ ← false (Here, we choose B = 5) 04/06/2014

How to compute a cycle? Second Phase: Each root (if any) computes the number of node in its tree AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F F,1 F,2 F,3 T,6 Remark: at most one process can satisfy the first case because B ≥ n/2 04/06/2014

How to compute a cycle? Third Phase: broadcast of the result – Each process computes the disjunction of all the UNIQ variables AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F F,1 F,2 F,3 T,6 04/06/2014

How to compute a cycle? Third Phase: broadcast of the result – Each process computes the disjunction of all the UNIQ variables AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F F,1 T,1 F,1 T,2 F,3 T,3 T,6 04/06/2014

How to compute a cycle? Third Phase: broadcast of the result – Each process computes the disjunction of all the UNIQ variables AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F T,1 T,2 T,3 T,6 04/06/2014

How to compute a cycle? Third Phase: broadcast of the result – Each process computes the disjunction of all the UNIQ variables AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F T,1 T,2 T,3 T,6 At the end of the cycle, UNIQ = true at a process IFF there a unique process such that LE = true 04/06/2014

How to compute a cycle? Third Phase: broadcast of the result – Each process computes the disjunction of all the UNIQ variables AlgoTel'2014, Ile de RéICDCN'2014, Coimbatore T T F F F F F F F F F F F F F F F F T,1 T,2 T,3 T,6 At the beginning of the next cycle: LE is randomly reset if UNIQ = false W.p.p. a process will be elected in the next cycle Otherwise LE remains constant UNIQ is reset to false for each process At the beginning of the next cycle: LE is randomly reset if UNIQ = false W.p.p. a process will be elected in the next cycle Otherwise LE remains constant UNIQ is reset to false for each process At the end of the cycle, UNIQ = true at a process IFF there a unique process such that LE = true 04/06/2014

Complexity The complexity depends on the probability law we use each time we (re)set LE We compute that the best choice is to (re)set LE to true with probability p ∈ [1/2B,1/(B+1)] In this case, the expected number of cycles to “stabilize” the value of LE is e 2 /2 ≤ 3.70 The expected time to give an answer to a request is then in 0(n) AlgoTel'2014, Ile de Ré04/06/2014

How to make this algorithm self-stabilizing? After faults, processes can be desynchronized Solution: use a phase clock algorithm [Boulinier et al,2004] – Clocks synchronize in at most 6B steps – Then, the first cycle started at 6B steps will correctly compute UNIQ! AlgoTel'2014, Ile de Ré04/06/2014

How to make this algorithm snap-stabilizing? The first cycle started at 6B steps will correctly compute UNIQ! For each request – A process first waits 6B steps (using a counter) – And only considers outputs of cycles started after these 6B steps – So, only outputs of correct cycles will be considered! – The answer LE will be output only when UNIQ = true at the end of such a cycle The expected delay for each answer remains in 0(n) AlgoTel'2014, Ile de Ré04/06/2014

Overview of the asynchronous solution We use the self-stabilizing UNISON algorithm of [Boulinier et al, 2004] to emulate synchronous executions of cycles. – After stabilization, clocks differ from at most 1 Using this algorithm, cycles ``synchronize’’ in 8B rounds AlgoTel'2014, Ile de Ré04/06/2014

Overview of the asynchronous solution We use a result from [Boulinier et al, 2008] to detect when cycles have stabilized: If the clock of some process successively takes values u, u+1, … u+(2D+1), with ∀ i ∈ {1, …, 2D+1}, u+i > 0, then every other process executes at least one step during that period Drawback: The expected delay for each answer is in 0(n 2 ) AlgoTel'2014, Ile de Ré04/06/2014

Conclusion We introduced probabilistic snap-stabilization We proposed 2 probabilistic snap-stabilizing protocols for the guaranteed service leader election in anonymous networks – The first one assumes a synchronous scheduler, and its expected time complexity is in O(n) – The second one assumes an asynchronous scheduler, and its expected time complexity is in O(n 2 ) These two expected times can be reduced to O(D) and O(Dn), respectively, if processes have the knowledge of the diameter D of the network. AlgoTel'2014, Ile de Ré04/06/2014

Thank you! AlgoTel'2014, Ile de Ré04/06/2014