CPSC 668: Self Stabilization. CPSC 668 Distributed Algorithms and Systems, Spring 2008, Prof. Jennifer Welch.



Slide 2: Reference
Self-Stabilization, Shlomi Dolev, MIT Press, 2000
– Chapter 2
Slides prepared for the book by Shlomi Dolev
– available at html

Slide 3: Self-Stabilization
A powerful form of fault tolerance: starting from an arbitrary system configuration, the algorithm is able to start working properly all on its own.
An arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, …
As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.
The paradigm has been applied to both shared memory and message passing models.

Slide 4: Definitions
An execution is no longer defined to start with an initial configuration
– instead it can start with an arbitrary configuration.
Depending on the problem to be solved, certain executions are considered legal, forming the set LE.
A configuration C is safe if every admissible execution starting with C is in LE.
An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.

Slide 5: Self-Stabilization Definition
[Figure: an execution that starts in an arbitrary configuration and reaches a safe configuration; from there on it is a legal execution.]

Slide 6: Communication Model
A "hybrid" of message passing and shared memory.
The communication topology is represented as an undirected graph
– not necessarily fully connected.
Processors correspond to vertices.
Corresponding to each edge (p_i, p_j) are two shared read/write registers:
– R_ij: written by p_i and read by p_j
– R_ji: written by p_j and read by p_i

Slide 7: Communication Model
[Figure: four processors p_0, p_1, p_2, p_3, with registers R_01, R_10, R_12, R_21, R_32, R_23, R_31, R_13 on the edges.]
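The register layout of this model can be sketched in Python. This is only an illustration: the helper name `make_registers` and the dictionary representation are assumptions, and the edge list is read off the register names in the four-processor figure.

```python
def make_registers(edges, initial=0):
    """Create the two single-writer registers for every undirected edge.

    Hypothetical helper: the key (i, j) stands for R_ij, which is written
    by p_i and read by p_j.
    """
    registers = {}
    for i, j in edges:
        registers[(i, j)] = initial  # R_ij: written by p_i, read by p_j
        registers[(j, i)] = initial  # R_ji: written by p_j, read by p_i
    return registers


# Edges implied by the register names R_01, R_12, R_23, R_13 in the figure.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
registers = make_registers(edges)
print(sorted(registers))
```

Each undirected edge contributes exactly two registers, so the four edges above yield eight.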

Slide 8: Self-Stabilizing Spanning Tree Definition
Every processor has a variable parent in its local state.
There is a distinguished root processor.
LE consists of all admissible executions in which the parent variables form a spanning tree rooted at root.

Slide 9: SS Spanning Tree Algorithm
Each processor has local variables
– parent, the id of the neighbor who is its parent
– dist, its estimated distance to the root
The root sets dist to 0 and copies its state to all its "outgoing" registers.
A non-root reads its neighbors' states, adopts as its parent the neighbor with the smallest distance, and sets its own distance to one more.
Nodes perform these actions repeatedly.

Slide 10: SS Spanning Tree Algorithm
Code for root p_0:
  while true do
    parent := ⊥ // the root has no parent
    dist := 0
    for each neighbor p_j do
      R_0j := 0 // write shared variable
    endfor
  endwhile

Slide 11: SS Spanning Tree Algorithm
Code for non-root p_i:
  while true do
    for each neighbor p_j do
      neigh-dist[j] := R_ji // read shared variable
    endfor
    dist := 1 + min{neigh-dist[j] : p_j is a neighbor}
    foundParent := false
    for each neighbor p_j do
      if !foundParent and neigh-dist[j] = dist - 1 then
        parent := j
        foundParent := true
      endif
      R_ij := dist // write shared variable
    endfor
  endwhile
Note: storage of negative values is not allowed.
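The root and non-root code can be tried out with a small round-based simulation. This is a sketch under simplifying assumptions, not the model used in the slides: each processor runs one whole loop iteration atomically per round (coarser than individual register reads and writes), the function name `stabilize_spanning_tree` and the `neighbors` map are hypothetical, and the graph is the four-processor example.

```python
import random


def stabilize_spanning_tree(neighbors, root=0, max_rounds=100, seed=0):
    """Run the spanning tree algorithm from an arbitrary starting state."""
    rng = random.Random(seed)
    n = len(neighbors)
    # Arbitrary (possibly corrupted) nonnegative values everywhere.
    registers = {(i, j): rng.randrange(2 * n)
                 for i in neighbors for j in neighbors[i]}
    dist = {i: rng.randrange(2 * n) for i in neighbors}
    parent = {i: rng.choice(neighbors[i]) for i in neighbors}

    for _ in range(max_rounds):
        for i in neighbors:
            if i == root:
                parent[i], dist[i] = None, 0
            else:
                # Read loop: collect R_ji from every neighbor p_j.
                seen = {j: registers[(j, i)] for j in neighbors[i]}
                dist[i] = 1 + min(seen.values())
                # Adopt the first neighbor whose value is dist - 1.
                parent[i] = next(j for j in neighbors[i]
                                 if seen[j] == dist[i] - 1)
            # Write loop: publish dist in every outgoing register R_ij.
            for j in neighbors[i]:
                registers[(i, j)] = dist[i]
    return dist, parent


neighbors = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
dist, parent = stabilize_spanning_tree(neighbors)
print(dist, parent)
```

Changing `seed` changes the corrupted starting state but not the final tree, which is the point of self-stabilization.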

Slide 12: Output of Spanning Tree Algorithm
[Figure: example output of the algorithm. Numbers are distances, red arrows indicate parents, white edges are non-tree edges.]

Slide 13: Correctness Proof of SS ST Alg
Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.
Definition: Δ is the degree (maximum number of neighbors) of the communication graph.
Definition: D is the diameter of the communication graph.

Slide 14: Correctness Proof of SS ST Alg
Lemma: Consider any admissible execution. There exist T_1 < T_2 < … < T_D such that after asynchronous round T_k:
(a) every proc. at distance ≤ k from the root has dist = shortest path distance to the root, and the parent variables form a BFS tree;
(b) every proc. at distance > k from the root has dist ≥ k.

Slide 15: Correctness Proof of SS ST Alg
Proof: By induction on k.
Basis (k = 1): Let T_1 = 5Δ. Initially all distances are nonnegative.
Procs might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.
After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist: all are > 0.
After the next Δ rounds, all non-root procs have completed the write for-loop at least once.
After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist: every neighbor of the root reads 0 from the root and > 0 from every other node, so it sets dist to 1 and parent to root.

Slide 16: Correctness Proof of SS ST Alg
Induction (k > 1): Assume for k - 1 and show for k. Let T_k = T_{k-1} + 2Δ.
Consider the execution just after the end of asynchronous round T_{k-1}.
After the next Δ rounds, all non-root nodes have executed the write for-loop at least once (and written their dist values).
After the next Δ rounds, all non-root nodes have executed the read for-loop at least once.
Suppose p_i is at distance d ≤ k from the root.
– p_i has at least one neighbor p_j at distance d-1 ≤ k-1 from the root, and no neighbor that is closer to the root.
– By the inductive hypothesis, p_j's register has the correct value in it and all other neighbors of p_i have registers with values ≥ d-1.
– Thus p_i correctly computes dist and parent.
Suppose p_i is at distance > k from the root.
– Every neighbor of p_i is at distance ≥ k from the root.
– By the inductive hypothesis, all their registers have values ≥ k-1.
– Thus p_i computes dist to be ≥ k.

Slide 17: Correctness Proof of SS ST Alg
Since every processor is at most distance D from the root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(DΔ) asynchronous rounds, no matter what the starting configuration.
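The asymptotic bound can be made concrete by unrolling the round counts used in the induction. Writing Δ for the degree of the graph, the basis step takes 5Δ rounds (2Δ + Δ + Δ + Δ), and each induction step appears to add 2Δ rounds (Δ for the write loop and Δ for the read loop). Under that reading:

```latex
T_D \;=\; T_1 + \sum_{k=2}^{D} 2\Delta
    \;=\; 5\Delta + 2\Delta\,(D-1)
    \;=\; (2D+3)\,\Delta
    \;=\; O(D\Delta)
```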

Slide 18: Another Classic SS Algorithm
Proposed by Dijkstra.
Suggested for mutual exclusion
– we will view it as a "token circulation" algorithm.
Uses a stronger model of computation
– in one atomic step, a proc can read all its "incoming" registers and write all its "outgoing" registers.

Slide 19: Ring Communication Topology
Procs are arranged in a unidirectional ring.
Only one register is needed for each proc: p_0 writes into R_0, p_1 reads from R_0, etc.
[Figure: ring of four processors p_0, p_1, p_2, p_3 with registers R_0, R_1, R_2, R_3.]

Slide 20: Processor's States
Each processor's state consists solely of an integer, ranging from 0 to K - 1 (for a suitable value of K).
Actually, the processor just stores this information in its register.

Slide 21: Definition of Holding the Token
Proc p_0 holds the token if R_0 = R_{n-1}.
Proc p_i (other than p_0) holds the token if R_i ≠ R_{i-1}.
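The token definition translates directly into a predicate over the list of register values. The function name `token_holders` and the list representation (index i holds R_i) are assumptions made for this sketch.

```python
def token_holders(R):
    """Return the indices of the processors that hold a token.

    R[i] is the value of register R_i on a ring of n = len(R) processors.
    """
    n = len(R)
    holders = []
    if R[0] == R[n - 1]:  # p_0 holds the token if R_0 = R_{n-1}
        holders.append(0)
    # p_i (i != 0) holds the token if R_i != R_{i-1}
    holders.extend(i for i in range(1, n) if R[i] != R[i - 1])
    return holders


print(token_holders([2, 2, 2, 2]))  # all registers equal: only p_0
print(token_holders([3, 3, 1, 1]))  # only p_2 differs from its predecessor
print(token_holders([0, 1, 2, 3]))  # a corrupted state with several tokens
```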

Slide 22: Self-Stabilizing Token Circulation Definition
LE consists of all admissible executions in which
– in every configuration only one processor holds the token, and
– every processor holds the token infinitely often.
(Note the resemblance to the mutual exclusion problem.)

Slide 23: Dijkstra's Algorithm
Code for p_0:
  while true do
    if R_0 = R_{n-1} then // the if statement executes atomically
      R_0 := (R_0 + 1) mod K
    endif
  endwhile
Code for p_i, i ≠ 0:
  while true do
    if R_i ≠ R_{i-1} then
      R_i := R_{i-1}
    endif
  endwhile
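The two code fragments can be exercised with a short Python sketch that shows the token moving around the ring once the registers agree. The names `step` and `holders`, the parameter `num_values` (the number of distinct register values, used in place of K), and the fixed schedule p_0, p_1, p_2, p_3 are all assumptions made for illustration.

```python
def holders(R):
    """Indices of processors holding a token (R[i] is register R_i)."""
    n = len(R)
    first = [0] if R[0] == R[n - 1] else []
    return first + [i for i in range(1, n) if R[i] != R[i - 1]]


def step(R, i, num_values):
    """Let p_i take one atomic step of its loop body."""
    n = len(R)
    if i == 0:
        if R[0] == R[n - 1]:
            R[0] = (R[0] + 1) % num_values  # p_0 increments its register
    elif R[i] != R[i - 1]:
        R[i] = R[i - 1]                     # p_i copies its predecessor


R = [0, 0, 0, 0]  # all registers equal: p_0 holds the only token
trace = []
for t in range(8):
    step(R, t % len(R), num_values=5)
    trace.append(holders(R))
print(trace)
```

Every entry of `trace` is a single index, and the indices cycle through 1, 2, 3, 0: exactly one processor holds the token after each step.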

Slide 24: Analysis of Dijkstra's Algorithm
Lemma: If all registers are equal in a configuration, then the configuration is safe.
Proof: [Figure: worked example on a ring of four processors p_0, p_1, p_2, p_3 for a fixed value of K.]

Slide 25: Analysis of Dijkstra's Algorithm
If execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., reach a safe state)?
This depends on K being large enough. Suppose K = n + 1 (so there are n+1 different values for n registers).
Lemma 1: In every configuration, there is at least one integer in {0,…,K-1} that does not appear in any register.
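Lemma 1 is a pigeonhole argument: n registers cannot cover n+1 different values. A brute-force check for a small ring illustrates it; the helper name `missing_values` and the parameter `num_values` (the number of distinct register values) are assumptions made for this sketch.

```python
from itertools import product


def missing_values(R, num_values):
    """Values in {0, ..., num_values - 1} that appear in no register."""
    return sorted(set(range(num_values)) - set(R))


n = 3
num_values = n + 1  # n+1 different values for n registers

# Every one of the (n+1)^n configurations misses at least one value.
every_config_misses_one = all(
    missing_values(config, num_values)
    for config in product(range(num_values), repeat=n)
)
print(every_config_misses_one)
print(missing_values([3, 0, 3], num_values))
```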

Slide 26: Analysis of Dijkstra's Algorithm
Lemma 2: In every admissible execution (starting from any configuration), p_0 holds the token, and thus changes R_0, at least once during every n rounds.
Proof: Suppose in contradiction there is a segment of n rounds in which p_0 does not change R_0.
Once p_1 takes a step in the first round, R_1 = R_0, and this equality remains true.
Once p_2 takes a step in the second round, R_2 = R_1 = R_0, and this equality remains true.
…
Once p_{n-1} takes a step in the (n-1)-st round, R_{n-1} = R_{n-2} = … = R_0.
So when p_0 takes a step in the n-th round, it will change R_0.

Slide 27: Analysis of Dijkstra's Algorithm
Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n^2) rounds.
Proof: Let j be a value not in any register in C (one exists by Lemma 1).
By Lemma 2, p_0 changes R_0 (by incrementing it) at least once every n rounds.
Thus eventually R_0 holds j, in configuration D, after at most O(n^2) rounds.
Since other procs only copy values, no register holds j between C and D.
After at most n more rounds, the value j propagates around the ring to p_{n-1}.
All registers then hold j, so the configuration is safe by the earlier lemma on configurations with all registers equal.
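The theorem can be checked exhaustively on a small ring. The sketch below is an experiment, not part of the proof, and rests on several assumptions: `num_values` is the number of distinct register values (n+1 here), each round lets every processor take exactly one step in the order p_{n-1}, …, p_0 (one admissible schedule among many), and stabilization is detected as the first round boundary with exactly one token.

```python
from itertools import product


def token_count(R):
    """Number of processors holding a token (R[i] is register R_i)."""
    n = len(R)
    return int(R[0] == R[n - 1]) + sum(R[i] != R[i - 1] for i in range(1, n))


def one_round(R, num_values):
    """One round: p_{n-1}, ..., p_1, p_0 each take one atomic step."""
    n = len(R)
    for i in reversed(range(n)):
        if i == 0:
            if R[0] == R[n - 1]:
                R[0] = (R[0] + 1) % num_values
        elif R[i] != R[i - 1]:
            R[i] = R[i - 1]


def rounds_to_single_token(start, num_values, limit=1000):
    """Rounds until exactly one token remains, starting from `start`."""
    R = list(start)
    for rounds in range(limit + 1):
        if token_count(R) == 1:
            return rounds
        one_round(R, num_values)
    raise RuntimeError("did not stabilize within the limit")


n, num_values = 4, 5
worst = max(rounds_to_single_token(config, num_values)
            for config in product(range(num_values), repeat=n))
print(worst)
```

The reverse order is chosen on purpose: with the order p_0, …, p_{n-1}, a single round already copies R_0 all the way around the ring, which hides the behaviour the theorem is about.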

Slide 28: What about Reducing K?
Easy to see that K = n (n different values) suffices: either there is a missing value or p_0's value is unique.
Can also show that K = n - 1 (n-1 different values) suffices.
But if K < n - 1 (fewer than n-1 different values), then there is a counter-example.
If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 1 suffices.