Self-stabilization

Technique for spontaneous healing after a transient failure or perturbation. Non-masking tolerance (forward error recovery). Guarantees eventual safety following failures. Feasibility was demonstrated by Dijkstra in his 1974 Communications of the ACM paper.

Self-stabilizing systems recover from any initial configuration to a legitimate configuration in a bounded number of steps, as long as the program code itself is not corrupted. Because the system can spontaneously recover from any initial state, no initialization is ever required. Such systems can be deployed ad hoc and are guaranteed to function properly within bounded time.

Self-stabilizing systems

Recall the earlier examples of clock phase synchronization and graph coloring discussed in class. They were all self-stabilizing. Why? (See the lecture of September 3, pages 8 and 14. The example on page 8 was not self-stabilizing, but the example on page 14 was.)

Example 1: Stabilizing mutual exclusion (Dijkstra 1974)

Consider a unidirectional ring of processes 0, 1, ..., N-1. In the legal configuration, exactly one token circulates in the network.

Stabilizing mutual exclusion

{Process 0}      do x[0] = x[N-1] → x[0] := (x[0] + 1) mod K od
{Process j > 0}  do x[j] ≠ x[j-1] → x[j] := x[j-1] od

The state of process j is x[j] ∈ {0, 1, 2, ..., K-1}. (TOKEN = ENABLED GUARD)

Hand-execute this first, before reading further. Start the system from an arbitrary initial configuration.

Stabilizing mutual exclusion

{Process 0}      do x[0] = x[N-1] → x[0] := (x[0] + 1) mod K od
{Process j > 0}  do x[j] ≠ x[j-1] → x[j] := x[j-1] od

The state of process j is x[j] ∈ {0, 1, 2, ..., K-1}. (TOKEN = ENABLED GUARD)
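One way to hand-execute the program is to simulate it. The sketch below is only an illustration, not part of the original slides: it assumes a central scheduler that picks one enabled process at random per step, and the helper names (privileged, move, stabilize), the seed, and the values N = 5, K = 6 are all hypothetical choices.

```python
import random

def privileged(x):
    """Return the indices of processes whose guard is enabled (token holders)."""
    n = len(x)
    holders = [0] if x[0] == x[n - 1] else []
    holders += [j for j in range(1, n) if x[j] != x[j - 1]]
    return holders

def move(x, j, K):
    """Execute the action of process j in place."""
    if j == 0:
        x[0] = (x[0] + 1) % K      # process 0 increments modulo K
    else:
        x[j] = x[j - 1]            # process j copies its predecessor

def stabilize(x, K, rng, max_steps=10_000):
    """Schedule random enabled processes until exactly one token remains."""
    steps = 0
    while len(privileged(x)) != 1:
        move(x, rng.choice(privileged(x)), K)
        steps += 1
        if steps > max_steps:
            raise RuntimeError("did not stabilize within max_steps")
    return steps

rng = random.Random(1974)                   # fixed seed, chosen arbitrarily
N, K = 5, 6                                 # K > N, as the slides assume
x = [rng.randrange(K) for _ in range(N)]    # arbitrary initial configuration
steps = stabilize(x, K, rng)
print(f"one token left after {steps} moves: x = {x}")
```

Changing the seed or the initial list x gives other starting configurations to try; the random scheduler is just one possible interleaving among many.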

Stabilizing mutual exclusion

Why will it work? Here is a quick summary of the argument. As long as K > N, there is at least one value x (0 ≤ x ≤ K-1) that is not the initial state of any process. Observe the following facts:

- There is no deadlock.
- The number of tokens never increases (closure).
- Processes 1..N-1 acquire their states from their left neighbor.
- Eventually process 0 attains the state x.
- Thereafter, in N-1 steps, all processes attain the state x. This is a legal configuration: only process 0 has a token (convergence).

So the system stabilizes.
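The first two facts can be checked by brute force on a small ring. The sketch below is an illustrative check, not a proof: it enumerates every configuration for one hypothetical pair (N = 4, K = 5) and counts violations of "no deadlock" and "the number of tokens never increases". The function names are made up for this example, and the check does not cover convergence.

```python
from itertools import product

def tokens(x):
    """Indices of processes whose guard is enabled in configuration x."""
    n = len(x)
    held = [0] if x[0] == x[n - 1] else []
    held += [j for j in range(1, n) if x[j] != x[j - 1]]
    return held

def successor(x, j, K):
    """Configuration reached when process j executes its action."""
    y = list(x)
    y[j] = (x[0] + 1) % K if j == 0 else x[j - 1]
    return tuple(y)

def check(N, K):
    """Count deadlocked configurations and token-increasing moves over all K**N states."""
    deadlocks = increases = 0
    for x in product(range(K), repeat=N):
        held = tokens(x)
        if not held:
            deadlocks += 1                      # nobody can move
        for j in held:
            if len(tokens(successor(x, j, K))) > len(held):
                increases += 1                  # a move created extra tokens
    return deadlocks, increases

print(check(N=4, K=5))   # (deadlocks, increases); both should be 0 if the facts hold
```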

Example 2: Stabilizing spanning tree

Given a connected graph G = (V,E) and a root r, design an algorithm for maintaining a spanning tree in the presence of transient failures that may corrupt the local states of processes. Let n = |V|.

An illustration

[Figure: the parent pointer of node 2 is corrupted.]

Definitions

Each process i has two variables:
L(i) = distance from the root via tree edges
P(i) = parent of process i
N(i) denotes the neighbors of i.

By definition L(r) = 0, and P(r) is undefined. Also, 0 ≤ L(i) ≤ n.

In a legal state, ∀ i ∈ V, i ≠ r :: L(i) ≠ n and L(i) = L(P(i)) + 1.

The algorithm

do   (L(i) ≠ n) ∧ (L(i) ≠ L(P(i)) + 1) ∧ (L(P(i)) ≠ n)  →  L(i) := L(P(i)) + 1
[]   (L(i) ≠ n) ∧ (L(P(i)) = n)  →  L(i) := n
[]   (L(i) = n) ∧ (∃ k ∈ N(i) : L(k) < n-1)  →  L(i) := L(k) + 1; P(i) := k
od

[Figure: P(2) is corrupted. The blue labels denote the values of L.]
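The three guarded actions can be tried out on a small graph. The sketch below is a minimal simulation under stated assumptions: the 6-node graph NBRS, the choice of node 0 as root, the random central scheduler, and all function names are hypothetical and not taken from the slides. It starts from an arbitrary state and runs until no guard is enabled.

```python
import random

# Hypothetical 6-node connected graph (adjacency lists); node 0 plays the root r.
NBRS = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4, 5], 4: [2, 3, 5], 5: [3, 4]}
ROOT = 0

def enabled_action(i, L, P, nbrs):
    """Return the (L, P) values node i's enabled guard would assign, or None."""
    n, p = len(nbrs), P[i]
    if L[i] != n and L[p] != n and L[i] != L[p] + 1:
        return L[p] + 1, p              # action 1: level disagrees with the parent's
    if L[i] != n and L[p] == n:
        return n, p                     # action 2: parent has level n, so take level n too
    if L[i] == n:
        for k in nbrs[i]:
            if L[k] < n - 1:
                return L[k] + 1, k      # action 3: adopt a neighbor with a usable level
    return None

def random_configuration(nbrs, root, rng):
    """Arbitrary (possibly corrupted) state: any level in 0..n, any neighbor as parent."""
    n = len(nbrs)
    L = {i: rng.randint(0, n) for i in nbrs}
    P = {i: rng.choice(nbrs[i]) for i in nbrs}
    L[root], P[root] = 0, None          # L(r) = 0 and P(r) is undefined
    return L, P

def stabilize(L, P, nbrs, root, rng, max_steps=100_000):
    """Let a randomly chosen enabled node act until no guard is enabled."""
    for steps in range(max_steps):
        ready = [i for i in nbrs
                 if i != root and enabled_action(i, L, P, nbrs) is not None]
        if not ready:
            return steps
        i = rng.choice(ready)
        L[i], P[i] = enabled_action(i, L, P, nbrs)
    raise RuntimeError("no fixed point reached within max_steps")

def is_legal(L, P, root):
    """Legal state: every non-root i has L(i) != n and L(i) = L(P(i)) + 1."""
    n = len(L)
    return all(L[i] != n and L[i] == L[P[i]] + 1 for i in L if i != root)

rng = random.Random(7)                  # arbitrary seed
L, P = random_configuration(NBRS, ROOT, rng)
steps = stabilize(L, P, NBRS, ROOT, rng)
print(f"{steps} actions; legal = {is_legal(L, P, ROOT)}; L = {L}; P = {P}")
```

When several neighbors satisfy the third guard, this sketch simply takes the first one in the adjacency list; the guarded-command notation leaves that choice open.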

Proof of stabilization

Define an edge from i to P(i) to be well-formed when L(i) ≠ n, L(P(i)) ≠ n, and L(i) = L(P(i)) + 1. Delete all edges that are not well-formed. In any configuration, the remaining well-formed edges form a spanning forest. Each tree T(k) in the forest is identified by k, the lowest value of L in that tree.

Example

In the sample graph shown earlier, the original spanning tree is decomposed into two well-formed trees:
T(0) = {0, 1}
T(2) = {2, 3, 4, 5}

Let F(k) denote the number of T(k)'s in the forest. Define a tuple F = (F(0), F(1), F(2), …, F(n)). For the sample graph, F = (1, 0, 1, 0, 0, 0) after node 2 suffers a transient failure.
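The decomposition into well-formed trees and the tuple F can be computed mechanically. The sketch below assumes a hypothetical assignment of L and P values chosen so that it yields T(0) = {0, 1} and T(2) = {2, 3, 4, 5}; the actual values in the figure are not recoverable from the transcript. It follows the definition F(0), ..., F(n) literally, so for six nodes it returns seven entries, the first six of which correspond to the sample tuple.

```python
ROOT = 0
# Hypothetical state: node 2's parent pointer was corrupted from 1 to 4.
L = {0: 0, 1: 1, 2: 2, 3: 3, 4: 3, 5: 4}
P = {0: None, 1: 0, 2: 4, 3: 2, 4: 2, 5: 3}

def well_formed(i, L, P, root):
    """Edge i -> P(i) is well-formed when L(i) != n, L(P(i)) != n, L(i) = L(P(i)) + 1."""
    if i == root:
        return False                    # the root has no parent edge
    n, p = len(L), P[i]
    return L[i] != n and L[p] != n and L[i] == L[p] + 1

def trees(L, P, root):
    """Group nodes by the head reached through well-formed parent edges."""
    members = {}
    for i in L:
        head = i
        while well_formed(head, L, P, root):
            head = P[head]              # L strictly decreases, so this terminates
        members.setdefault(head, set()).add(i)
    return members

def forest_tuple(L, P, root):
    """F(k) = number of trees whose lowest L value is k, for k = 0..n."""
    F = [0] * (len(L) + 1)
    for head in trees(L, P, root):
        F[L[head]] += 1                 # the head carries the lowest L in its tree
    return tuple(F)

print(trees(L, P, ROOT))
print(forest_tuple(L, P, ROOT))
```

Python compares tuples lexicographically, so two forest tuples can be compared directly with < when checking how F changes from one configuration to the next.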

Skeleton of the proof

Minimum F = (1, 0, 0, 0, 0, 0) {legal configuration}
Maximum F = (1, n-1, 0, 0, 0, 0)

With each action of the algorithm, F decreases lexicographically. Verify the claim! This proves that eventually F becomes (1, 0, 0, 0, 0, 0) and the spanning tree stabilizes.

What is the time complexity of this algorithm?