CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Self Stabilization 1.


Reference
• Self-Stabilization, Shlomi Dolev, MIT Press, 2000
• Chapter 2
• Slides prepared for the book by Shlomi Dolev
• available at

Self-Stabilization
• A powerful form of fault tolerance.
• Starting from an arbitrary system configuration, the algorithm is able to start working properly entirely on its own.
• An arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, ...
• As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.
• The paradigm has been applied to both shared memory and message passing models.

Definitions
• An execution is no longer defined to start with an initial configuration; instead, it can start with an arbitrary configuration.
• Depending on the problem to be solved, certain executions are considered legal, forming the set LE.
• A configuration C is safe if every admissible execution starting with C is in LE.
• An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.

Self-Stabilization Definition
[Figure: an execution that starts in an arbitrary configuration, reaches a safe configuration, and is a legal execution from that point on.]

Communication Model
• A "hybrid" of message passing and shared memory.
• The communication topology is represented as an undirected graph, not necessarily fully connected.
• Processors correspond to vertices.
• Corresponding to each edge (p_i, p_j) are two shared read/write registers:
  • R_ij: written by p_i and read by p_j
  • R_ji: written by p_j and read by p_i

Communication Model
[Figure: example graph on processors p_0, p_1, p_2, p_3 with register pairs R_01/R_10, R_12/R_21, R_23/R_32, and R_13/R_31 on the edges p_0-p_1, p_1-p_2, p_2-p_3, and p_1-p_3.]
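The register layout can be made concrete with a small Python sketch. It is only an illustration: the names `make_registers`, `neighbors`, and `EXAMPLE_EDGES` are hypothetical, and the edge list is read off the register labels in the four-processor example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

def make_registers(edges: List[Edge]) -> Dict[Edge, int]:
    """Create two registers per undirected edge.

    registers[(i, j)] models R_ij: written only by p_i, read only by p_j.
    The initial contents are arbitrary; 0 is used here for simplicity.
    """
    registers: Dict[Edge, int] = {}
    for i, j in edges:
        registers[(i, j)] = 0  # R_ij
        registers[(j, i)] = 0  # R_ji
    return registers

def neighbors(p: int, edges: List[Edge]) -> List[int]:
    """Return the sorted ids of the processors adjacent to p."""
    return sorted({j for i, j in edges if i == p} | {i for i, j in edges if j == p})

# Assumed edges of the example: p0-p1, p1-p2, p2-p3, p1-p3.
EXAMPLE_EDGES = [(0, 1), (1, 2), (2, 3), (1, 3)]
registers = make_registers(EXAMPLE_EDGES)
print(sorted(registers))            # the eight (writer, reader) pairs
print(neighbors(1, EXAMPLE_EDGES))  # processors adjacent to p1
```

Keying each register by the ordered pair (writer, reader) keeps the single-writer, single-reader discipline visible in the data structure itself.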

Self-Stabilizing Spanning Tree Definition
• Every processor has a variable parent in its local state.
• There is a distinguished root processor.
• LE consists of all admissible executions in which the parent variables form a spanning tree rooted at the root.

SS Spanning Tree Algorithm
• Each processor has two local variables:
  • parent, the id of the neighbor that is its parent
  • dist, its estimated distance to the root
• The root sets dist to 0 and copies its state to all its "outgoing" registers.
• A non-root processor reads its neighbors' states from its "incoming" registers, adopts as its parent the neighbor with the smallest distance, and sets its own distance to one more than that neighbor's.
• Processors perform these actions repeatedly.

SS Spanning Tree Algorithm
Code for root p_0:
  while true do
    parent := ⊥
    dist := 0
    for each neighbor p_j do
      R_0j := 0    // write shared variable
    endfor
  endwhile

SS Spanning Tree Algorithm
Code for non-root p_i:
  while true do
    for each neighbor p_j do
      neigh-dist[j] := R_ji    // read shared variable
    endfor
    dist := 1 + min{neigh-dist[j] : p_j is a neighbor}
    foundParent := false
    for each neighbor p_j do
      if !foundParent and neigh-dist[j] = dist - 1 then
        parent := j; foundParent := true
      endif
      R_ij := dist    // write shared variable
    endfor
  endwhile
Note: storage of negative values is not allowed.
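The root and non-root code can be tried out with a short Python simulation. This is a minimal sketch under simplifying assumptions, not the algorithm's exact execution model: each processor executes one whole iteration of its while loop as a single step (coarser than individual register reads and writes), processors are scheduled round-robin, and the initial contents are random nonnegative values. The names `run_spanning_tree`, `bfs_distances`, and `ADJ` are hypothetical.

```python
import random
from collections import deque

def bfs_distances(adj, root):
    """Reference answer: true shortest-path distances from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def run_spanning_tree(adj, root, rounds, seed=0):
    """Simulate the algorithm from an arbitrary (random) configuration."""
    rng = random.Random(seed)
    # Arbitrary starting state: nonnegative garbage in every register and variable.
    reg = {(i, j): rng.randrange(50) for i in adj for j in adj[i]}
    dist = {i: rng.randrange(50) for i in adj}
    parent = {i: rng.choice(adj[i]) for i in adj}
    for _ in range(rounds):
        for i in adj:  # round-robin: one full loop iteration per processor
            if i == root:
                parent[i] = None
                dist[i] = 0
            else:
                # Read all incoming registers R_ji.
                neigh = {j: reg[(j, i)] for j in adj[i]}
                dist[i] = 1 + min(neigh.values())
                # Adopt the first neighbor whose value is dist - 1.
                parent[i] = next(j for j in adj[i] if neigh[j] == dist[i] - 1)
            for j in adj[i]:
                reg[(i, j)] = dist[i]  # write all outgoing registers R_ij
    return dist, parent

# Assumed four-processor example graph: edges p0-p1, p1-p2, p2-p3, p1-p3.
ADJ = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [2, 1]}
dist, parent = run_spanning_tree(ADJ, root=0, rounds=10)
print(dist, parent)
```

Comparing `dist` with `bfs_distances(ADJ, 0)` for several seeds is a quick way to see that the starting values do not matter once enough rounds have passed.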

Output of Spanning Tree Algorithm
[Figure: example output of the algorithm. Numbers are distances, red arrows indicate parents, black edges are non-tree edges, and the root is marked.]

Correctness Proof of SS ST Alg
Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.
Definition: Δ is the degree (maximum number of neighbors) of the communication graph.
Definition: D is the diameter of the communication graph.

Correctness Proof of SS ST Alg
Lemma: Consider any admissible execution. There exist T_1 < T_2 < ... < T_D such that after asynchronous round T_k:
(a) every processor at distance ≤ k from the root has dist equal to its shortest-path distance to the root, and the parent variables of these processors form a BFS tree;
(b) every processor at distance > k from the root has dist ≥ k.

Correctness Proof of SS ST Alg
Proof: By induction on k.
Basis (k = 1): Let T_1 = 5Δ.
• Initially all distances are nonnegative.
• Processors might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: all are > 0.
• After the next Δ rounds, all non-root processors have completed the write for-loop at least once.
• After the next Δ rounds, all non-root processors have completed the read for-loop at least once and computed dist: every neighbor of the root reads 0 from the root and > 0 from every other node, so it sets dist to 1 and parent to the root.

Correctness Proof of SS ST Alg
Induction (k > 1): Assume the claim for k - 1 and show it for k. Let T_k = T_{k-1} + 2Δ.
• Consider the execution just after the end of asynchronous round T_{k-1}.
• After the next Δ rounds, all non-root processors have executed the write for-loop at least once (and written their dist values).
• After the next Δ rounds, all non-root processors have executed the read for-loop at least once.
• Suppose p_i is at distance d ≤ k from the root.
  • p_i has at least one neighbor p_j at distance d-1 ≤ k-1 from the root, and no neighbor that is closer to the root.
  • By the inductive hypothesis, p_j's register has the correct value in it, and all other neighbors of p_i have registers with values ≥ d-1.
  • Thus p_i correctly computes dist and parent.
• Suppose p_i is at distance > k from the root.
  • Every neighbor of p_i is at distance ≥ k from the root.
  • By the inductive hypothesis, all their registers have values ≥ k-1.
  • Thus p_i computes dist to be ≥ k.

Correctness Proof of SS ST Alg
• Since every processor is at distance at most D from the root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(DΔ) asynchronous rounds, no matter what the starting configuration.
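The round bound can be spelled out as a short calculation. Writing Δ for the maximum degree and D for the diameter, and assuming the recurrence suggested by the induction step (one block of Δ rounds for the writes and one for the reads per level, so T_k = T_{k-1} + 2Δ, with T_1 = 5Δ from the basis), unrolling gives:

```latex
\begin{align*}
T_D &= T_1 + \sum_{k=2}^{D} \left( T_k - T_{k-1} \right) \\
    &= 5\Delta + 2\Delta\,(D - 1) \\
    &= (2D + 3)\,\Delta \;=\; O(D\Delta).
\end{align*}
```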

Another Classic SS Algorithm
• Proposed by Dijkstra.
• Suggested for mutual exclusion; we will view it as a "token circulation" algorithm.
• Uses a stronger model of computation: in one atomic step, a processor can read all its "incoming" registers and write all its "outgoing" registers.

Ring Communication Topology
• Processors are arranged in a unidirectional ring.
• Only one register is needed for each processor.
[Figure: ring of processors p_0, p_1, p_2, p_3 with registers R_0, R_1, R_2, R_3. p_0 writes into R_0, p_1 reads from R_0, etc.]

Processor's States
• Each processor's state consists solely of an integer ranging from 0 to K - 1 (for a suitable value of K).
• In fact, the processor simply stores this integer in its register.

Definition of Holding the Token
• Processor p_0 holds the token if R_0 = R_{n-1}.
• Processor p_i (other than p_0) holds the token if R_i ≠ R_{i-1}.
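The two token conditions translate directly into a few lines of Python. This is an illustrative sketch: `token_holders` is a hypothetical helper, and the register contents are passed as a sequence R with R[i] standing for R_i.

```python
from itertools import product

def token_holders(R):
    """Return the indices of all processors that hold a token in configuration R."""
    n = len(R)
    holders = []
    if R[0] == R[n - 1]:  # p_0 holds the token when R_0 = R_{n-1}
        holders.append(0)
    # p_i (i != 0) holds the token when R_i differs from R_{i-1}.
    holders.extend(i for i in range(1, n) if R[i] != R[i - 1])
    return holders

print(token_holders([2, 2, 2, 2]))  # all registers equal
print(token_holders([0, 1, 0, 1]))  # an arbitrary configuration can have several holders

# Fewest token holders over every configuration with n = 4 and K = 5.
fewest = min(len(token_holders(R)) for R in product(range(5), repeat=4))
print(fewest)
```

The exhaustive check at the end illustrates a consequence of the definition: if no p_i with i ≠ 0 holds a token, then all registers are equal, so p_0 holds one. A configuration can therefore have too many tokens but never none.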

Self-Stabilizing Token Circulation Definition
• LE consists of all admissible executions in which
  • in every configuration only one processor holds the token, and
  • every processor holds the token infinitely often.
(Note the resemblance to the mutual exclusion problem.)

Dijkstra's Algorithm
Code for p_0:
  while true do
    if R_0 = R_{n-1} then
      R_0 := (R_0 + 1) mod K
    endif
  endwhile
Code for p_i, i ≠ 0:
  while true do
    if R_i ≠ R_{i-1} then
      R_i := R_{i-1}
    endif
  endwhile
Each if statement (the test together with the assignment) executes atomically.
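A small Python simulation shows the algorithm recovering from an arbitrary configuration. It is a sketch under stated assumptions rather than a definitive implementation: the scheduler is a fixed reverse round-robin (p_{n-1}, ..., p_1, p_0), which is just one admissible schedule, and the names `step`, `count_tokens`, and `run_until_equal` are hypothetical.

```python
def step(R, i, K):
    """Apply one atomic step of processor p_i to the register list R (in place)."""
    n = len(R)
    if i == 0:
        if R[0] == R[n - 1]:       # p_0 holds the token
            R[0] = (R[0] + 1) % K
    elif R[i] != R[i - 1]:         # p_i holds the token
        R[i] = R[i - 1]

def count_tokens(R):
    """Number of processors holding a token in configuration R."""
    n = len(R)
    return int(R[0] == R[n - 1]) + sum(R[i] != R[i - 1] for i in range(1, n))

def run_until_equal(R, K, max_steps=10_000):
    """Run until all registers are equal; return (steps taken, final registers)."""
    R = list(R)
    n = len(R)
    for t in range(max_steps):
        if len(set(R)) == 1:
            return t, R
        step(R, (n - 1) - (t % n), K)  # reverse round-robin schedule
    raise RuntimeError("registers did not become equal within max_steps")

# n = 4 processors, K = n + 1 = 5 values, arbitrary starting configuration.
steps, final = run_until_equal([0, 1, 0, 1], K=5)
print(steps, final, count_tokens(final))
```

The reverse order is chosen on purpose: with the forward order p_0, p_1, ..., p_{n-1}, every register copies R_0 within a single pass, which hides the gradual convergence the analysis is about.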

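Analysis of Dijkstra's Algorithm
Lemma: If all registers are equal in a configuration, then the configuration is safe.
Proof: [Figure: worked example on the ring p_0, p_1, p_2, p_3 for a specific value of K.] When all registers are equal, only p_0 holds the token. After p_0 increments R_0, only p_1 holds the token; each subsequent copy passes the token one position around the ring until all registers are equal again and p_0 holds the token once more.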

Analysis of Dijkstra's Algorithm
• If an execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., that a safe configuration is reached)?
• This depends on K being large enough.
• Suppose K = n+1 (so there are n+1 different values).
• Lemma 1: In every configuration, there is at least one integer in {0,...,K-1} that does not appear in any register, since there are only n registers.

Analysis of Dijkstra's Algorithm
Lemma 2: In every admissible execution (starting from any configuration), p_0 holds the token, and thus changes R_0, at least once during every n rounds.
Proof: Suppose, for contradiction, that there is a segment of n rounds in which p_0 does not change R_0.
• Once p_1 takes a step in the first round, R_1 = R_0, and this equality remains true.
• Once p_2 takes a step in the second round, R_2 = R_1 = R_0, and this equality remains true.
• ...
• Once p_{n-1} takes a step in the (n-1)-st round, R_{n-1} = R_{n-2} = ... = R_0.
• So when p_0 takes a step in the n-th round, it will change R_0, a contradiction.

Analysis of Dijkstra's Algorithm
Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n^2) rounds.
Proof: Let j be a value not in any register in C (one exists by Lemma 1).
• By Lemma 2, p_0 changes R_0 (by incrementing it) at least once every n rounds.
• Thus eventually R_0 holds j, in configuration D, after at most O(n^2) rounds: at most n increments are needed, each within n rounds.
• Since other processors only copy values, no register holds j between C and D.
• After at most n more rounds, the value j propagates around the ring to p_{n-1}. Until then p_0 cannot change R_0, so at that point all registers hold j, and the configuration is safe by the earlier lemma.

What about Reducing K?
• It is easy to see that K = n (n different values) suffices: either there is a missing value or p_0's value is unique.
• One can also show that K = n - 1 (n-1 different values) suffices.
• But if K < n - 1 (fewer than n-1 different values), then there is a counterexample.
• If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 2 suffices.