Chapter 2 - Definitions, Techniques and Paradigms
Self-Stabilization, Shlomi Dolev, MIT Press, 2000
Draft of May 2003, Shlomi Dolev, All Rights Reserved ©
Chapter 2: roadmap
2.1 Definitions of Computational Model
2.2 Self-Stabilization Requirements
2.3 Complexity Measures
2.4 Randomized Self-Stabilization
2.5 Example: Spanning-Tree Construction
2.6 Example: Mutual Exclusion
2.7 Fair Composition of Self-Stabilization Algorithms
2.8 Recomputation of Floating Output
2.9 Proof Techniques
2.10 Pseudo-Self-Stabilization
What is a Distributed System?
* Examples: communication networks, multiprocessor computers, a multitasking single processor
* A distributed system is modeled by a set of n state machines, called processors, that communicate with each other
The Distributed System Model
* Pi denotes the ith processor
* A neighbor of Pi is a processor that can communicate with it
* How to represent the model: node i stands for processor Pi, and a link Pi <-> Pj means that Pi can communicate with Pj
* Ways of communication:
  * message passing - fits communication networks and all other systems
  * shared memory - fits geographically close systems
Asynchronous Distributed Systems - Message Passing
* A communication link that is unidirectional from Pi to Pj transfers messages from Pi to Pj
* A unidirectional link is abstracted as a FIFO queue qij
* Figure (three processors P1, P2, P3): starting from q13 = (), q32 = (), q21 = (m2, m10), P3 sends m1, giving q32 = (m1); then P1 receives m2, giving q21 = (m10)
Asynchronous Distributed Systems - Message Passing
* System configuration (configuration): a description of a distributed system at a particular time
* A configuration is denoted by c = (s1, s2, …, sn, q1,2, q1,3, …, qi,j, …, qn,n-1), where si is the state of Pi and qi,j (i ≠ j) is the message queue from Pi to Pj
Asynchronous Distributed Systems - Shared Memory
* Processors communicate by the use of shared communication registers
* A configuration is denoted by c = (s1, s2, …, sn, r1,2, r1,3, …, ri,j, …, rn,n-1), where si is the state of Pi and ri,j is the content of the communication register written by Pi and read by Pj
The Distributed System - A Computation Step
* In the shared memory model a step is a read or a write. Figure: register r12 holds x; P1 writes m1, so r12 holds m1; then P2 reads m1 from r12
* In the message passing model a step is a send, a receive, or a message loss. Figure: from q12 = (m1), q21 = (m2, m3, m4), the step loss21(m3) gives q21 = (m2, m4); then P1 receives m2, leaving q21 = (m4)
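The message-passing steps in the figure can be replayed with a small Python sketch. The helper names (`make_queues`, `send`, `receive`, `lose`) are hypothetical and chosen only for this illustration; each queue qij is modeled as a `collections.deque`.

```python
from collections import deque

def make_queues(links):
    """Create an empty FIFO queue q[(i, j)] for every directed link i -> j."""
    return {link: deque() for link in links}

def send(q, i, j, m):
    """P_i sends m to P_j: m is appended to the tail of q[(i, j)]."""
    q[(i, j)].append(m)

def receive(q, i, j):
    """P_j receives the message at the head of q[(i, j)]."""
    return q[(i, j)].popleft()

def lose(q, i, j, m):
    """Loss step: message m disappears from q[(i, j)]."""
    q[(i, j)].remove(m)

# Replay the figure: q12 = (m1), q21 = (m2, m3, m4).
q = make_queues([(1, 2), (2, 1)])
send(q, 1, 2, "m1")
for m in ("m2", "m3", "m4"):
    send(q, 2, 1, m)

lose(q, 2, 1, "m3")          # loss21(m3): q21 = (m2, m4)
received = receive(q, 2, 1)  # P1 receives m2: q21 = (m4)
```

Each call changes exactly one queue, which mirrors the requirement that a step is the only difference between two consecutive configurations.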
The Interleaving Model
* Scheduling of events in a distributed system influences the transitions made by the processors
* The interleaving model: at each given time only a single processor executes a computation step
* Every state transition of a processor is due to the execution of a communication step
* A step is denoted by a; the notation c1 -a-> c2 denotes the fact that c2 can be reached from c1 by a single step a
The Distributed System - More Definitions
* Step a is applicable to configuration c iff there exists a configuration c' such that c -a-> c'
* In the message passing system, a receive step of message m by Pj means that queue qi,j contains m in ck, and in ck+1 m is removed from qi,j and placed in the state of Pj; this is the only difference between ck and ck+1
* An execution E = (c1, a1, c2, a2, …) is an alternating sequence of configurations and steps such that c(i-1) -a(i-1)-> ci for every i > 1
* A fair execution: every step that is applicable infinitely often is executed infinitely often
* In the message passing model, a message that is sent infinitely often is received infinitely often; that is, the number of consecutive losses of such a message is finite
Synchronous Distributed Systems
* A global clock pulse (pulse) triggers a simultaneous step of every processor in the system
* Fits multiprocessor systems in which the processors are located close to one another
* The execution of a synchronous system E = (c1, c2, …) is totally defined by c1, the first configuration in E
* Figure (four processors): within one pulse every processor sends, which gives the intermediate state q12 = (m12), q14 = (m14), q21 = (m21), q31 = (m31), q32 = (m32), and then every processor receives, which leaves all queues empty
* This models message passing; shared memory is analogous, with a write followed by a read
Legal Behavior
* A desired legal behavior is a set of legal executions, denoted LE, defined for a particular system and a particular task
* Every execution of a self-stabilizing system should have a suffix that appears in LE
* A configuration c is safe with regard to a task LE and an algorithm if every fair execution of the algorithm that starts from c belongs to LE
* An algorithm is self-stabilizing for a task LE if every fair execution of the algorithm reaches a safe configuration with relation to LE
* A self-stabilizing system can be started in any arbitrary configuration and will eventually exhibit a desired "legal" behavior
Time Complexity
* The first asynchronous round (round) in an execution E is the shortest prefix E' of E such that each processor executes at least one step in E', where E = E'E''. The second round of E is the first round of E'', and so on
* The time complexity of an asynchronous algorithm is the number of rounds
* A self-stabilizing algorithm is usually a do forever loop
* The number of steps required to execute a single iteration of such a loop is O(Δ), where Δ is an upper bound on the number of neighbors of Pi
* Asynchronous cycle (cycle): the first cycle in an execution E is the shortest prefix E' of E such that each processor executes at least one complete iteration of its do forever loop in E', where E = E'E''
* Note: each cycle spans O(Δ) rounds
* The time complexity of a synchronous algorithm is the number of pulses in the execution
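As a small illustration of the round definition, the sketch below counts the complete asynchronous rounds in a finite schedule. The function name `count_rounds` and the representation of an execution as a list of activated processor ids are assumptions made for this example only.

```python
def count_rounds(schedule, processors):
    """Count complete asynchronous rounds in a finite schedule.

    schedule   -- sequence of processor ids, one id per executed step
    processors -- collection of all processor ids in the system
    """
    remaining = set(processors)   # processors that have not yet stepped in this round
    rounds = 0
    for p in schedule:
        remaining.discard(p)
        if not remaining:         # shortest prefix in which everyone stepped
            rounds += 1
            remaining = set(processors)
    return rounds
```

For example, `count_rounds([1, 2, 3, 1, 1, 2, 3], {1, 2, 3})` returns 2: the first round ends after the third step, and the second round ends only when processor 3 finally takes a step.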
Space Complexity
* The space complexity of an algorithm is the total number of (local and shared) memory bits used to implement the algorithm
Randomized Self-Stabilization - Assumptions and Definitions
* Processor activity is managed by a scheduler
* The scheduler's assumption: at most one step is executed at any given time
* The scheduler is regarded as an adversary
* The scheduler is assumed to have unlimited resources and chooses the next activated processor on-line
* A scheduler S is fair if, for any configuration c, an execution that starts from c and in which processors are activated by S is fair with probability 1
Randomized Self-Stabilization - Assumptions and Definitions (continued)
* An algorithm is randomized self-stabilizing for a task LE if, starting with any system configuration and considering any fair scheduler, the algorithm reaches a safe configuration within a finite expected number of rounds
* Randomized algorithms are often used to break symmetry in a system of totally identical processors
Spanning-Tree Construction
* The root writes distance 0 to all its neighbors
* Every other processor repeatedly chooses the minimal distance among its neighbors, adds one, and updates its neighbors
Spanning-Tree Algorithm for Pi
    Root:  do forever
             for m := 1 to δ do write rim := ⟨0,0⟩
           od
    Other: do forever
             for m := 1 to δ do lrmi := read(rmi)
             FirstFound := false
             dist := 1 + min{ lrmi.dis | 1 ≤ m ≤ δ }
             for m := 1 to δ
             do
               if not FirstFound and lrmi.dis = dist - 1
                 write rim := ⟨1,dist⟩
                 FirstFound := true
               else
                 write rim := ⟨0,dist⟩
             od
           od
* δ = the number of neighbors of the processor
* i = the writing processor, m = the neighbor for whom the data is written
* lrji (local register ji) holds the last value of rji read by Pi
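The following Python sketch simulates the algorithm above so that its convergence from arbitrary register contents can be observed. It is a simplified model, not the slides' exact setting: each do forever iteration is treated as one atomic step (coarser than read/write atomicity), a round activates every processor once in a random order, and the names `iterate`, `stabilize` and `bfs_distances` are hypothetical. A register rim is stored as `r[(i, m)] = (parent, dis)`.

```python
import random
from collections import deque

def bfs_distances(adj, root):
    """Reference BFS distances, used only to check the result."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def iterate(adj, r, i, root):
    """One do-forever iteration of P_i (assumed atomic in this sketch)."""
    if i == root:
        for m in adj[i]:
            r[(i, m)] = (0, 0)
        return
    lr = {m: r[(m, i)] for m in adj[i]}          # read every neighbour's register
    dist = 1 + min(lr[m][1] for m in adj[i])
    first_found = False
    for m in adj[i]:
        if not first_found and lr[m][1] == dist - 1:
            r[(i, m)] = (1, dist)                # first neighbour at dist - 1 is the parent
            first_found = True
        else:
            r[(i, m)] = (0, dist)

def stabilize(adj, root, rounds, seed=0):
    """Start from arbitrary register contents and run the given number of rounds."""
    rng = random.Random(seed)
    n = len(adj)
    r = {(i, m): (rng.randint(0, 1), rng.randint(0, 2 * n))
         for i in adj for m in adj[i]}
    for _ in range(rounds):
        order = list(adj)
        rng.shuffle(order)
        for i in order:
            iterate(adj, r, i, root)
    return r

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
registers = stabilize(adj, root=0, rounds=2 * len(adj))
```

After the run, every `dis` field should equal the BFS distance of its writer from the root, and every non-root processor should mark exactly one neighbor as its parent.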
Spanning-Tree, System and Code
* The example demonstrates the use of our definitions and requirements
* The system: we use the shared memory model for this example
* The system consists of n processors
* A processor Pi communicates with its neighbor Pj by writing in the communication register rij and reading from rji
Spanning-Tree, System and Code
* The output tree is encoded by means of the registers as follows:
  * Each register rij contains a binary parent field; if Pj is the parent of Pi in the BFS tree, then the value is 1
  * In addition, each register rij has a distance field (dis) that holds the distance from the root to Pi
* Figure: an eight-processor example annotated with the parent and dis fields of registers r21, r58 and r73
The Spanning-Tree Algorithm is Self-Stabilizing
* A configuration is a vector of the processors' states and a vector of the communication registers' values
* The task ST of legitimate sequences is defined as the set of all configuration sequences in which every configuration encodes a BFS tree of the communication graph
* Definitions:
  * A floating distance in some configuration c is a value in a register rij.dis that is smaller than the distance of Pi from the root
  * The smallest floating distance in some configuration c is the smallest value among the floating distances
  * Δ is the maximum number of links adjacent to a processor
The Spanning-Tree Algorithm is Self-Stabilizing
* Lemma 2.1: For every k > 0 and for every configuration that follows Δ + 4kΔ rounds, it holds that:
  * If there exists a floating distance, then the value of the smallest floating distance is at least k
  * The value in the registers of every processor that is within distance k from the root is equal to its distance from the root
* Once the value in the registers of every processor is equal to its distance from the root, a processor Pi chooses its parent to be its parent in the first BFS tree, that is, the first neighbor in its neighbor order whose distance is one less than its own
* This implies that the algorithm presented is self-stabilizing for ST
Mutual Exclusion
* The root changes its state if it is equal to its neighbor's state
* Every other processor copies its neighbor's state if it is different from its own
Dijkstra's Algorithm
    P1:          do forever
                   if x1 = xn then
                     x1 := (x1 + 1) mod (n + 1)
    Pi (i ≠ 1):  do forever
                   if xi ≠ xi-1 then
                     xi := xi-1
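A minimal Python sketch of the ring algorithm follows, useful for watching an arbitrary configuration converge. The list `x` is 0-based, so `x[0]` plays the role of x1 and `x[n-1]` the role of xn. The function names and the scheduler, which activates Pn, …, P1 in that order in every round, are assumptions of this sketch; any fair schedule could be substituted.

```python
def privileged(x):
    """Indices of the processors that can change their state in configuration x."""
    n = len(x)
    result = [0] if x[0] == x[n - 1] else []
    result += [i for i in range(1, n) if x[i] != x[i - 1]]
    return result

def step(x, i):
    """Activate processor i once (one iteration of its do-forever loop)."""
    n = len(x)
    if i == 0:
        if x[0] == x[n - 1]:
            x[0] = (x[0] + 1) % (n + 1)
    elif x[i] != x[i - 1]:
        x[i] = x[i - 1]

def run(x, rounds):
    """Each round activates every processor once, in the order Pn, ..., P1."""
    for _ in range(rounds):
        for i in reversed(range(len(x))):
            step(x, i)
    return x

x = [3, 1, 4, 1, 5]            # arbitrary initial values in 0..n, with n = 5
run(x, 3 * len(x) ** 2)        # generous budget compared with the O(n^2) bound
```

After the run, `privileged(x)` should contain exactly one index, and it should keep containing exactly one index after every further step.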
Dijkstra's Algorithm is Self-Stabilizing
* A configuration of the system is a vector of n integer values, one for each processor in the system
* The task ME is defined by the set of all configuration sequences in which exactly one processor can change its state in any configuration, and every processor can change its state in infinitely many configurations in every sequence in ME
* A safe configuration for ME and Dijkstra's algorithm is a configuration in which all the x variables have the same value
Dijkstra's Algorithm is Self-Stabilizing
* Lemma 2.2: A configuration c in which all the x variables have the same value is a safe configuration for ME and Dijkstra's algorithm
* Lemma 2.3: For every possible configuration c, there exists at least one integer 0 ≤ j ≤ n such that xi ≠ j for every 1 ≤ i ≤ n in c
* Lemma 2.4: For every possible configuration c, in every fair execution that starts in c, the special processor P1 changes the value of x1 at least once in every n rounds
* Theorem 2.1: For every possible configuration c, every fair execution that starts in c reaches a safe configuration with relation to ME within O(n^2) rounds
Fair Composition - Some Definitions
* The idea: compose self-stabilizing algorithms AL1, …, ALk so that the stabilized behavior of AL1, AL2, …, ALi is used by ALi+1
* ALi+1 cannot detect whether the algorithms AL1, …, ALi have stabilized, but it is executed as if they have done so
* The technique is described for k = 2:
  * Two simple algorithms, called a server algorithm and a client algorithm, are combined to obtain a more complex algorithm
  * The server algorithm ensures that some properties will hold, and these properties are later used by the client algorithm
Fair Composition - More Definitions
* Assume the server algorithm AL1 is for a task defined by a set of legal executions T1, and the client algorithm AL2 is for a task defined by a set of legal executions T2
* Let Ai be the state set of Pi in AL1 and Si = Ai × Bi the state set of Pi in AL2, where whenever Pi executes AL2 it modifies only the Bi components of Ai × Bi
* For a configuration c ∈ S1 × … × Sn, define the A-projection of c as the configuration (ap1, …, apn) ∈ A1 × … × An, where api is the Ai component of the state of Pi in c
* The A-projection of an execution is defined analogously to consist of the A-projection of every configuration of the execution
Fair Composition - More Definitions (continued)
* Algorithm AL2 is self-stabilizing for task T2 given task T1 if any fair execution of AL2 that has an A-projection in T1 has a suffix in T2
* An algorithm AL is a fair composition of AL1 and AL2 if, in AL, every processor executes steps of AL1 and AL2 alternately
* Note: for an execution E of AL, the A-projection of E is a sub-execution of E corresponding to a fair execution of the server algorithm AL1
* Theorem 2.2: Assume that AL2 is self-stabilizing for task T2 given task T1. If AL1 is self-stabilizing for T1, then the fair composition of AL1 and AL2 is self-stabilizing for T2
Example: Mutual Exclusion for General Communication Graphs
* The example demonstrates the power of the fair composition method
* What will we do? Compose the spanning-tree construction algorithm with the mutual exclusion algorithm
* The server algorithm is the spanning-tree construction
* The client algorithm is the mutual exclusion
Modified Version of the Mutual Exclusion Algorithm
* Designed to stabilize in a system in which a rooted spanning tree exists and in which only read/write atomicity is assumed
* The Euler tour of the tree defines a virtual ring
* Figure: P1 with neighbors P2, P3 and P4; every tree link Pi-Pj carries the registers ri,j and rj,i, and every processor keeps local copies lrj,i of the registers it reads
Mutual Exclusion for Tree Structure (for Pi)
    Root:  do forever
             lr1,i := read(r1,i)
             if lr1,i = ri,2 then
               (* critical section *)
               write ri,2 := (lr1,i + 1) mod (4n - 5)
             for m := 2 to δ do
               lrm,i := read(rm,i)
               write ri,m+1 := lrm,i
             od
           od
    Other: do forever
             lr1,i := read(r1,i)
             if lr1,i ≠ ri,2 then
               (* critical section *)
               write ri,2 := lr1,i
             for m := 2 to δ do
               lrm,i := read(rm,i)
               write ri,m+1 := lrm,i
             od
           od
* δ = the number of tree neighbors of Pi; register indices are taken cyclically, so ri,δ+1 denotes ri,1
Mutual Exclusion for Communication Graphs
* The mutual-exclusion algorithm can be applied to a spanning tree of the system using the ring defined by the Euler tour on the tree
* Note: the value of the parent fields of the server algorithm (spanning-tree) eventually defines the parent of each processor in the tree
* Until the server algorithm reaches the stabilized state, the client's execution may not be as considered during the design of the algorithm. Consequently, once the tree is fixed, the self-stabilizing mutual-exclusion algorithm is in an arbitrary state, from which it converges to reach a safe configuration
* Let αi = i1, i2, …, iδ be an arbitrary ordering of the tree edges incident to a non-root node vi ∈ V, where the first edge is the edge leading to the parent of vi in the tree
Recomputation of Floating Output - The Idea
* The recomputation of the floating output is a way to convert a non-stabilizing algorithm AL that computes a fixed distributed output into a self-stabilizing algorithm (the idea is to execute AL repeatedly)
* Each time the execution of AL is over, the output is written in special output variables called floating output variables
* What do we need to prove?
  * From every possible configuration, an execution of AL will eventually begin from a predefined initial state
  * Any two executions of AL that begin from the initial state have identical output
* Figure: along the execution c1, c2, …, ci, …, ck the floating output variables hold BAD values at first and eventually hold the OK output
Recomputation of Floating Output - Example: Synchronous Consensus
* The example demonstrates the core idea behind the recomputation of the floating output
* Assumptions: we ignore synchronization issues and assume that the system is a message-passing synchronous system
* The synchronous consensus task is a set of (synchronous) executions SC in which the output variable of every processor is 1 iff there exists an input with value 1
Non-Stabilizing Synchronous Consensus Algorithm
This algorithm is not a self-stabilizing algorithm
    initialization
      pulsei := 0
      Oi := Ii
    while pulsei ≤ D do
      upon a pulse
        pulsei := pulsei + 1
        send(Oi)
        forall Pj ∈ N(i) do receive(Oj)
        if Oi = 0 and ∃ Pj ∈ N(i) | Oj = 1 then
          Oi := 1
* D is the diameter of the system; N(i) is the set of neighbors of Pi
Self-Stabilizing Synchronous Consensus Algorithm
    upon a pulse
      if (pulsei mod (D + 1)) = 0 then
      begin
        Oi := oi   (* floating output is assigned by the result *)
        oi := Ii   (* initializing a new computation *)
      end
      send(⟨oi, pulsei⟩)
      forall Pj ∈ N(i) do receive(⟨oj, pulsej⟩)
      if oi = 0 and ∃ Pj ∈ N(i) | oj = 1 then
        oi := 1
      pulsei := max{ {pulsei} ∪ {pulsej | Pj ∈ N(i)} } + 1
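The sketch below simulates the self-stabilizing consensus algorithm on a small synchronous system, starting from arbitrary values of Oi, oi and pulsei. The function name `pulse_step`, the use of dictionaries for the per-processor variables, and the four-processor path graph are assumptions made for this illustration; D is taken to be the diameter of that graph.

```python
def pulse_step(adj, D, I, O, o, pulse):
    """Execute one synchronous pulse at every processor (dicts are updated in place)."""
    # Start of a (D+1)-pulse window: publish the result and restart the computation.
    for i in adj:
        if pulse[i] % (D + 1) == 0:
            O[i] = o[i]
            o[i] = I[i]
    # All processors send (o_i, pulse_i) simultaneously.
    sent = {i: (o[i], pulse[i]) for i in adj}
    # All processors receive and update.
    for i in adj:
        if o[i] == 0 and any(sent[j][0] == 1 for j in adj[i]):
            o[i] = 1
        pulse[i] = max([sent[i][1]] + [sent[j][1] for j in adj[i]]) + 1

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # a path, diameter D = 3
D = 3
I = {0: 0, 1: 0, 2: 0, 3: 1}                   # one input is 1
O = {0: 0, 1: 1, 2: 0, 3: 0}                   # arbitrary floating outputs
o = {0: 1, 1: 0, 2: 1, 3: 0}                   # arbitrary intermediate values
pulse = {0: 7, 1: 2, 2: 40, 3: 13}             # unsynchronized pulse counters

for _ in range(4 * (D + 1)):
    pulse_step(adj, D, I, O, o, pulse)
```

In this run the pulse counters first converge to a common value, after which every window of D + 1 pulses recomputes the OR of the inputs from scratch and copies it into the floating output.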
Self-Stabilizing Synchronous Consensus Algorithm
* The output variable Oi of every processor Pi is used as a floating output, which is eventually fixed and correct
* The variable oi of every processor Pi is used for recomputing the output of the algorithm
Variant Function
* Used for proving convergence
* Can be used to estimate the number of steps required to reach a safe configuration
* The idea:
  * Use a function over a configuration set whose value is bounded
  * Prove that this function monotonically decreases (or increases) when processors execute a step
  * Show that after the function reaches a certain threshold, the system is in a safe configuration
* Figure: along the steps c, c1, c2, c3, …, csafe the values |VF(c)|, |VF(c1)|, |VF(c2)|, |VF(c3)|, … move monotonically toward the bound, which is reached at csafe
Variant Function - Example: Self-Stabilizing Maximal Matching
Every processor Pi tries to find a matching neighbor Pj
Program for Pi:
    do forever
      if pointeri = null and (∃ Pj ∈ N(i) | pointerj = i) then
        pointeri := j
      if pointeri = null and (∀ Pj ∈ N(i) | pointerj ≠ i) and
         (∃ Pj ∈ N(i) | pointerj = null) then
        pointeri := j
      if pointeri = j and pointerj = k and k ≠ i then
        pointeri := null
    od
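A Python sketch of the matching program under a central daemon is given below. Several details are assumptions of the sketch rather than part of the slides: pointers always hold a neighbor id or `None`, the daemon sweeps the processors in a fixed order, and whenever a rule allows several choices of j the first one in the adjacency list is taken. The names `move` and `run_daemon` are hypothetical.

```python
def move(adj, pointer, i):
    """Activate P_i once; return True if its pointer changed."""
    p = pointer[i]
    if p is None:
        proposers = [j for j in adj[i] if pointer[j] == i]
        if proposers:                       # rule 1: accept a neighbour pointing at P_i
            pointer[i] = proposers[0]
            return True
        free = [j for j in adj[i] if pointer[j] is None]
        if free:                            # rule 2: propose to a neighbour with a null pointer
            pointer[i] = free[0]
            return True
        return False
    if pointer[p] is not None and pointer[p] != i:
        pointer[i] = None                   # rule 3: P_i points at someone matched elsewhere
        return True
    return False

def run_daemon(adj, pointer, max_passes=10_000):
    """Central daemon: activate one processor at a time until no pointer can change."""
    moves = 0
    for _ in range(max_passes):
        changed = False
        for i in adj:
            if move(adj, pointer, i):
                changed = True
                moves += 1
        if not changed:
            return moves
    raise RuntimeError("did not stabilize within max_passes")

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pointer = {0: 1, 1: 2, 2: 3, 3: 4, 4: None}     # an arbitrary "chain" of pointers
total_moves = run_daemon(adj, pointer)
```

When the daemon stops, every non-null pointer should be reciprocated, and no two neighbors should both be left with a null pointer, which is exactly the maximal matching condition.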
Variant Function - Example: Self-Stabilizing Maximal Matching
* The maximal matching algorithm should reach a configuration cl in which the existence of a pointer of Pi that points to Pj implies the existence of a pointer of Pj that points to Pi (pointeri = j implies pointerj = i)
* To simplify the discussion, we assume the existence of a central daemon that activates one processor at a time
* The set of legal executions MM for the maximal matching task includes every execution in which the values of the pointers of all the processors are fixed and form a maximal matching
Variant Function - Example: Self-Stabilizing Maximal Matching - Definitions
Given a configuration cl, we say that a processor Pi is:
* matched in cl, if Pi has a neighbor Pj such that pointeri = j and pointerj = i
* single in cl, if pointeri = null and every neighbor of Pi is matched
* waiting in cl, if Pi has a neighbor Pj such that pointeri = j and pointerj = null
* free in cl, if pointeri = null and there exists a neighbor Pj such that Pj is not matched
* chaining in cl, if there exists a neighbor Pj for which pointeri = j and pointerj = k, k ≠ i
Variant Function - Example: Self-Stabilizing Maximal Matching - Proving Correctness
* The variant function VF(c) returns a vector (m+s, w, f, c), where m, s, w, f and c are the numbers of matched, single, waiting, free and chaining processors, respectively
* Values of VF are compared lexicographically
* A configuration is a safe configuration with relation to MM and to our algorithm iff its VF value is (n, 0, 0, 0)
* Once a system reaches a safe configuration, no processor changes the value of its pointer
Variant Function - Example: Self-Stabilizing Maximal Matching - Proving Correctness
* In every non-safe configuration, there exists at least one processor that can change the value of its pointer when it is activated by the central daemon
* Every change of a pointer value increases the value of VF
* The number of such pointer-value changes is bounded by the number of all possible vector values
* Since m+s+w+f+c = n, the value of n and the first three elements of the vector (m+s, w, f, c) imply the value of c, so the number of possible vector values is O(n^3)
* Therefore the system reaches a safe configuration within O(n^3) pointer-value changes
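To make the variant function concrete, the following sketch classifies every processor according to the five definitions and returns the vector (m+s, w, f, c) as a Python tuple, which Python already compares lexicographically. The names `classify` and `VF` and the dictionary representation of pointers (neighbor id or `None`) are assumptions of this illustration.

```python
def classify(adj, pointer, i):
    """Return the class of P_i: matched, single, waiting, free or chaining."""
    def matched(u):
        q = pointer[u]
        return q is not None and pointer[q] == u

    p = pointer[i]
    if p is None:
        # single if every neighbour is matched, otherwise free
        return "single" if all(matched(j) for j in adj[i]) else "free"
    if pointer[p] == i:
        return "matched"
    if pointer[p] is None:
        return "waiting"
    return "chaining"

def VF(adj, pointer):
    """Variant function: the vector (m + s, w, f, c) of a configuration."""
    count = dict.fromkeys(("matched", "single", "waiting", "free", "chaining"), 0)
    for i in adj:
        count[classify(adj, pointer, i)] += 1
    return (count["matched"] + count["single"],
            count["waiting"], count["free"], count["chaining"])

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
chain = {0: 1, 1: 2, 2: 3, 3: 4, 4: None}       # three chaining, one waiting, one free
safe = {0: 1, 1: 0, 2: None, 3: 4, 4: 3}        # two matched pairs and one single
```

Here `VF(adj, chain)` evaluates to `(0, 1, 1, 3)` and `VF(adj, safe)` to `(5, 0, 0, 0)`, so the safe configuration has the lexicographically largest value for n = 5.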
Convergence Stairs
* A1, A2, …, Ak are predicates over system configurations
* Definition: Ai+1 refines predicate Ai if Ai holds whenever Ai+1 holds. The term attractor is often used for such an Ai predicate
* The idea: prove that the self-stabilizing algorithm converges to fulfill k > 1 predicates A1, A2, …, Ak such that, for every 1 ≤ i < k, Ai+1 is a refinement of Ai
* The "stairs": prove that, from some point of the execution, every configuration satisfies Ai, and then prove that an execution in which Ai holds reaches a configuration after which every configuration satisfies Ai+1, until reaching Ak, which is the predicate for a safe configuration
* Figure: an execution c1 c2 … cj cj+1 … cl cl+1 … cm cm+1 … in which A1 holds from cj+1 on, A2 holds from cl+1 on, and so on up to Ak, which holds from csafe on
Convergence Stairs - Example: Leader Election in a General Communication Network
Program for Pi; each processor reads its neighbors' leader values and chooses the candidate with the lowest value:
    do forever
      ⟨candidate, distance⟩ := ⟨ID(i), 0⟩
      forall Pj ∈ N(i) do
      begin
        ⟨leaderi[j], disi[j]⟩ := read(⟨leaderj, disj⟩)
        if (disi[j] < N) and ((leaderi[j] < candidate) or
           ((leaderi[j] = candidate) and (disi[j] < distance))) then
          ⟨candidate, distance⟩ := ⟨leaderi[j], disi[j] + 1⟩
      end
      write ⟨leaderi, disi⟩ := ⟨candidate, distance⟩
    od
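The leader election program can be exercised with the sketch below, which starts from registers that contain floating identifiers (1 and 2 do not belong to any processor). As assumptions of the sketch, processors are keyed by their identifiers, each do forever iteration is treated as atomic, a round activates every processor once, and the names `iterate` and `run` are hypothetical.

```python
def iterate(adj, N, reg, i):
    """One do-forever iteration of the processor whose identifier is i."""
    candidate, distance = i, 0
    for j in adj[i]:
        leader_j, dis_j = reg[j]                       # read <leader_j, dis_j>
        if dis_j < N and (leader_j < candidate or
                          (leader_j == candidate and dis_j < distance)):
            candidate, distance = leader_j, dis_j + 1
    reg[i] = (candidate, distance)                     # write <leader_i, dis_i>

def run(adj, N, reg, rounds):
    for _ in range(rounds):
        for i in adj:
            iterate(adj, N, reg, i)
    return reg

N = 10                                                 # upper bound on the number of processors
adj = {5: [7, 9], 7: [5, 3], 9: [5, 3], 3: [7, 9]}     # a ring of four processors
reg = {5: (1, 0), 7: (2, 3), 9: (1, 2), 3: (3, 9)}     # arbitrary start with floating ids 1 and 2
run(adj, N, reg, 3 * N)
```

During the first rounds the distance attached to a floating identifier keeps growing until it reaches N and the identifier is discarded (predicate A1); after that the minimal real identifier, 3, spreads to every processor together with its distance (predicate A2).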
Convergence Stairs - Example: Leader Election in a General Communication Network
* We assume that every processor has a unique identifier in the range 1 to N, where N is an upper bound on the number of processors in the system
* The leader election task is to inform every processor of the identifier of a single processor in the system; this single processor with the elected identifier is the leader
* A floating identifier is an identifier that appears in the initial configuration although no processor in the system has this identifier (we use distance variables and the upper bound N to eliminate floating identifiers)
Convergence Stairs - Example: Leader Election, Proving Correctness
* We use 2 convergence stairs:
  * A1 - a predicate on system configurations verifying that no floating identifier exists
  * A2 (for a safe configuration) - a predicate verifying that every processor chooses the minimal identifier of a processor in the system as the identifier of the leader
* To show that A1 holds, we argue that, if a floating identifier exists, then during O(Δ) rounds the minimal distance of a floating identifier increases
Convergence Stairs - Example: Leader Election, Proving Correctness (continued)
* After the first stair only identifiers of processors in the system exist, so the minimal one can be chosen
* From that point, every fair execution that starts from a configuration in which A1 holds reaches a safe configuration
* Notice: if A1 did not hold, we could not prove correctness, because the minimal identifier z that appears in the first (arbitrary) configuration of the system might not be an identifier of a processor in the system
Scheduler-Luck Game
* Purpose: proving upper bounds on the time complexity of randomized distributed algorithms by probability calculations
* The sl-game method tries to avoid considering every possible outcome of the randomized function used by the randomized algorithm
The sl-game
* Given a randomized algorithm AL, we define a game between 2 players, scheduler and luck
* The goal of the scheduler is to prevent the algorithm AL from fulfilling its task; it can choose the activation interleaving of the processors
* Luck may determine the result of the randomized function invoked
* The rules: in each turn of the game, the scheduler chooses the next processor to be activated, which then makes a step. If, during this step, the activated processor uses a random function, then luck may intervene
The sl-game (2)
* Luck's strategy is used to show that the algorithm stabilizes
* Definition: cp = ∏_{i=1}^{f} pi, where f is the number of times that luck intervenes and pi is the probability of the ith intervention
* Luck has a (cp, r)-strategy to win the game if it has a strategy to reach a safe configuration in the game in an expected number of at most r rounds, and with interventions that yield a combined probability of no less than cp
* Theorem 2.4: If luck has a (cp, r)-strategy, then AL reaches a safe configuration within, at most, r/cp expected number of rounds
SL-Game, Example: Self Stabilizing Leader election in Complete Graphs
* The next algorithm works in complete graph systems in which every processor can communicate with every other processor via shared memory
Program for P_i:
  do forever
    forall P_j ∈ N(i) do
      l_i[j] := read(leader_j)
    if (leader_i = 0 and {∀ j ≠ i | l_i[j] = 0}) or
       (leader_i = 1 and {∃ j ≠ i | l_i[j] = 1}) then
      write leader_i := random({0,1})
  end
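A minimal Python sketch of the program for P_i under coarse atomicity (all reads, the coin toss and the write form one step). The list-based registers, the function names `step` and `run`, and the round-robin schedule are assumptions made for illustration; the algorithm itself allows any fair schedule.

```python
import random

def step(i, leader, l, rng):
    """One coarse-atomic step of P_i: read every leader_j, then possibly toss and write."""
    n = len(leader)
    others = [j for j in range(n) if j != i]
    for j in others:
        l[i][j] = leader[j]                      # l_i[j] := read(leader_j)
    sees_no_leader = all(l[i][j] == 0 for j in others)
    sees_other_leader = any(l[i][j] == 1 for j in others)
    if (leader[i] == 0 and sees_no_leader) or (leader[i] == 1 and sees_other_leader):
        leader[i] = rng.choice([0, 1])           # write leader_i := random({0,1})

def run(n, rounds, seed=0):
    """Start from an arbitrary configuration and activate P_0..P_{n-1} in turn."""
    rng = random.Random(seed)
    leader = [rng.choice([0, 1]) for _ in range(n)]
    l = [[rng.choice([0, 1]) for _ in range(n)] for _ in range(n)]
    for _ in range(rounds):
        for i in range(n):                       # round-robin is one fair schedule
            step(i, leader, l, rng)
    return leader

print(run(4, 200))
```

Once exactly one register holds 1, no processor tosses a coin again, so the leader stays fixed.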
SL-Game, Example: Self Stabilizing Leader election in Complete Graphs
* Task LE: the set of executions in which there exists a single fixed leader throughout the execution
* A configuration is safe if it satisfies the following:
  * for exactly one processor, say P_i, leader_i = 1 and ∀ j ≠ i, l_i[j] = 0
  * for every other processor P_j ≠ P_i, leader_j = 0 and l_j[i] = 1
* In any fair execution E that starts with a safe configuration, P_i is a single leader and thus E ∈ LE
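The safe-configuration condition can be written as a predicate. This is a sketch under an assumed representation: `leader[i]` is the register of P_i and `l[i][j]` is P_i's internal copy of leader_j; the name `is_safe` is introduced here for illustration.

```python
def is_safe(leader, l):
    """True iff exactly one P_i has leader_i = 1 and all internal copies agree with that."""
    n = len(leader)
    leaders = [i for i in range(n) if leader[i] == 1]
    if len(leaders) != 1:
        return False
    i = leaders[0]
    # The leader must have read 0 from everybody else.
    if any(l[i][j] != 0 for j in range(n) if j != i):
        return False
    # Every other processor must have read 1 from the leader.
    return all(l[j][i] == 1 for j in range(n) if j != i)

print(is_safe([1, 0, 0], [[0, 0, 0], [1, 0, 0], [1, 0, 0]]))
```

The entries l[i][i] are never inspected, since a processor keeps no copy of its own register.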
SL-Game, Example: Self Stabilizing Leader election in Complete Graphs - proof
* Lemma 2.6: The algorithm stabilizes within $2^{O(n)}$ expected number of rounds
* Using Theorem 2.4 we will show that the expected number of rounds is at most $2n \cdot 2^n$
* We present a $(1/2^n, 2n)$-strategy for luck to win the sl-game; Theorem 2.4 then gives $r/cp = 2n / (1/2^n) = 2n \cdot 2^n$
* Luck's strategy is as follows: whenever some processor P_i tosses a coin, luck intervenes; if for all j ≠ i, leader_j = 0, then luck fixes the coin toss to be 1; otherwise, it fixes the coin toss to be 0
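Luck's strategy can be replayed in a few lines. The sketch assumes coarse atomicity, so P_i decides on fresh values of the leader registers and only those registers are tracked. It also assumes a round-robin scheduler, which is just one of the schedulers luck has to beat; the names `luck_step` and `play` are illustrative.

```python
from itertools import product

def luck_step(i, leader):
    """Coarse-atomic step of P_i with luck fixing the coin; returns True if luck intervened."""
    others_all_zero = all(leader[j] == 0 for j in range(len(leader)) if j != i)
    # P_i tosses a coin when it is a non-leader that sees no leader,
    # or a leader that sees another leader.
    tosses = (leader[i] == 0 and others_all_zero) or (leader[i] == 1 and not others_all_zero)
    if tosses:
        leader[i] = 1 if others_all_zero else 0   # luck's choice of the coin result
    return tosses

def play(initial, rounds):
    """Run round-robin rounds from the given leader registers; count luck's interventions."""
    leader = list(initial)
    interventions = 0
    for _ in range(rounds):
        for i in range(len(leader)):
            interventions += luck_step(i, leader)
    return leader, interventions

for initial in product([0, 1], repeat=3):
    print(initial, play(initial, 1))
```

In this setting luck intervenes at most n times, each time with probability 1/2, which is where the combined probability 1/2^n comes from. The internal copies l_i[j] become consistent after one more round.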
SL-Game, Example: Self Stabilizing Leader election in Complete Graphs - proof...
* Lemma 2.7: The algorithm is not self-stabilizing under fine-atomicity (in which a coin-toss is a separate atomic step)
* Proof: by presenting a winning strategy for the scheduler (that guarantees that the obtained schedule is a fair schedule with probability 1) that ensures that the system never stabilizes
* Starting with all leader registers holding 1, the scheduler persistently activates each processor until it chooses 0, and stops it before it writes the chosen value. Then it lets all the processors write. Analogously, the scheduler then forces all of them to choose 1
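The scheduler's strategy under fine atomicity can be sketched as follows. The phase structure, the name `scheduler_phase` and the seed are assumptions made for illustration. A processor whose toss equals the current register value is allowed to write it (nothing changes) and is activated again.

```python
import random

def scheduler_phase(leader, rng):
    """One phase: every processor ends up holding the opposite value in a pending write,
    then all pending writes are released. Returns the number of coin tosses used."""
    target = 1 - leader[0]            # all registers hold the same value at phase start
    tosses = 0
    pending = []
    for i in range(len(leader)):
        while True:
            # P_i reads registers that all equal its own, so its condition to toss holds.
            tosses += 1
            coin = rng.choice([0, 1])
            if coin == target:
                pending.append((i, coin))   # stop P_i just before its write
                break
            leader[i] = coin                # coin equals the old value: the write changes nothing
    for i, coin in pending:                 # now let every stopped processor write
        leader[i] = coin
    return tosses

leader = [1, 1, 1]
rng = random.Random(1)
for phase in range(4):
    scheduler_phase(leader, rng)
    print(phase, leader)
```

The registers flip between all 1 and all 0 forever, so no single fixed leader ever emerges, while every processor is activated infinitely often.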
Neighborhood Resemblance
* Purpose: proving memory lower bounds, that is, proving that it is impossible to achieve certain tasks with less than a certain amount of memory
* A silent self-stabilizing algorithm is a self-stabilizing algorithm in which the communication between the processors is fixed from some point of the execution
* The technique can be applied to silent self-stabilizing algorithms
* The idea: the neighborhood-resemblance technique uses a configuration c0 that is claimed to be a safe configuration and constructs a non-safe configuration c1 in which every processor has the same neighborhood and therefore cannot distinguish c0 from c1
Reminder … Spanning-Tree Algorithm for P_i
  Root:  do forever
           for m := 1 to δ do write r_im := ⟨0,0⟩
         od
  Other: do forever
           for m := 1 to δ do lr_mi := read(r_mi)
           FirstFound := false
           dist := 1 + min{ lr_mi.dis | 1 ≤ m ≤ δ }
           for m := 1 to δ
           do
             if not FirstFound and lr_mi.dis = dist - 1
               write r_im := ⟨1,dist⟩
               FirstFound := true
             else
               write r_im := ⟨0,dist⟩
           od
         od
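A small Python simulation of the spanning-tree program, as a sketch. The dictionary `r[(i, m)]` stands for the register that P_i writes and its neighbor m reads, holding a pair (parent flag, dis). The example graph, the starting register contents and the round-robin schedule are assumptions chosen for illustration.

```python
def root_step(i, graph, r):
    """The root writes <0,0> in all its registers."""
    for m in graph[i]:
        r[(i, m)] = (0, 0)

def other_step(i, graph, r):
    """A non-root reads its neighbors, computes dist and marks the first closest one as parent."""
    lr = {m: r[(m, i)] for m in graph[i]}
    dist = 1 + min(dis for _, dis in lr.values())
    first_found = False
    for m in graph[i]:
        if not first_found and lr[m][1] == dist - 1:
            r[(i, m)] = (1, dist)      # m is chosen as the parent of P_i
            first_found = True
        else:
            r[(i, m)] = (0, dist)

def stabilize(graph, root, r, rounds):
    for _ in range(rounds):
        for i in graph:                # round-robin activation
            if i == root:
                root_step(i, graph, r)
            else:
                other_step(i, graph, r)
    return r

# Hypothetical 4-node system rooted at 0, started with arbitrary register contents.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
r = {(i, m): (1, 7) for i in graph for m in graph[i]}
stabilize(graph, 0, r, 10)
print({i: r[(i, graph[i][0])][1] for i in graph})
```

After stabilization the registers stop changing, which is the silent behavior that the neighborhood-resemblance argument relies on.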
Neighborhood Resemblance - Example: Spanning-Tree Construction
* We will show that implementing the distance field requires $\Omega(\log d)$ bits in every communication register, where d is the diameter of the system
* Note: the Spanning-Tree Construction is indeed a silent self-stabilizing algorithm (once the system stabilizes, the content of each communication register is fixed)
Neighborhood Resemblance, Example: Spanning Tree Construction
[Figure: four panels titled Graph, Tree, Non-tree with special processor, and Non-tree without special processor; the drawings carry labels such as P1, P2, Q1, Q2, a_i, b_i, e_i, e_j, a_k, b_k, e_k]
What is Pseudo-Self-Stabilization?
* The Concept: pseudo-self-stabilizing algorithms converge from any initial state to an execution in which they exhibit a legal behavior; but they still may deviate from this legal behavior a finite number of times
[Figure: executions drawn as chains of steps between configurations, some of which are marked safe]
An Abstract Task
* An abstract task is defined by a set of variables and a set of restrictions on their values
* Let us define the token passing abstract task AT for a system of 2 processors, Sender (S) and Receiver (R). S and R have boolean variables token_S and token_R respectively
* Given E = (c_1,a_1,c_2,a_2,…) one may consider only the values of token_S and token_R in every configuration c_i to check whether the token-passing task is achieved
* An algorithm is pseudo-self-stabilizing for an abstract task AT if every infinite execution of the algorithm has a suffix satisfying the restrictions of AT
Pseudo-Self-Stabilization
* Denote:
  * c_i|tkns: the value of the boolean variables (token_S, token_R) in c_i
  * E|tkns: the sequence (c_1|tkns, c_2|tkns, c_3|tkns, …)
* We can define AT by E|tkns as follows: there is no c_i|tkns for which token_S = token_R = true
* Note that it is impossible to define a safe configuration in terms of c_i|tkns, since we ignore the state variables of R/S
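The projection E|tkns and the restriction of AT can be checked on a finite prefix of an execution. This is a sketch: configurations are assumed to be dictionaries with keys `token_S` and `token_R`, and all function names are illustrative. A finite prefix can only suggest where a legal suffix might start; the definition itself speaks about infinite executions.

```python
def project_tokens(execution):
    """E|tkns: keep only (token_S, token_R) of every configuration."""
    return [(c["token_S"], c["token_R"]) for c in execution]

def satisfies_restriction(tkns):
    """The restriction of AT: no configuration has token_S = token_R = true."""
    return all(not (s and r) for s, r in tkns)

def legal_suffix_start(tkns):
    """Index of the first configuration after the last violation in this prefix."""
    last_bad = max((k for k, (s, r) in enumerate(tkns) if s and r), default=-1)
    return last_bad + 1

# Hypothetical prefix: two illegal configurations, then legal behavior.
execution = [
    {"token_S": True, "token_R": True, "other_state": 3},
    {"token_S": True, "token_R": False, "other_state": 4},
    {"token_S": False, "token_R": True, "other_state": 5},
    {"token_S": True, "token_R": True, "other_state": 6},
    {"token_S": False, "token_R": True, "other_state": 7},
    {"token_S": True, "token_R": False, "other_state": 8},
]
tkns = project_tokens(execution)
start = legal_suffix_start(tkns)
print(start, satisfies_restriction(tkns[start:]))
```

The field `other_state` is dropped by the projection, which illustrates why a safe configuration cannot be defined from c_i|tkns alone.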
Pseudo-Self-Stabilization
[Figure: state diagram of the token values, with legal states (token_R = true, token_S = false) and (token_R = false, token_S = true) and an illegal state (token_R = true, token_S = true)]
* The system is not in a safe configuration, and there is no time bound for reaching a safe configuration
* Only after a message is lost does the system reach the lower cycle in the figure, and a safe configuration
Pseudo-Self-Stabilization, The Alternating Bit Algorithm
* A data link algorithm used for message transfer over a communication link
* Messages can be lost, since the common communication link is unreliable
* The algorithm uses retransmission of messages to cope with message loss
* The term frame distinguishes the higher-level messages that must be transferred from the messages that are actually sent (between S and R) in order to transfer the higher-level messages
Pseudo-Self-Stabilization, The Data Link Algorithm
* The task of delivering a message from one processor in the network to another remote processor is sophisticated, and may cause message corruption or even loss
* There are several layers involved: Physical Layer, Data Link Layer (which concerns us), Network Layer
[Figure: a Packet placed in a Frame between a Head and a Tail]
Pseudo-Self-Stabilization, The Data Link Algorithm
* The flow of a message from the sender's queue to the receiver: fetch -> insert into a frame and send -> receive frame -> deliver, and send notification to the sender
[Figure: S with its queue m1, m2, m3 and R, shown at the steps fetch, send (f1), receive (f1), deliver, send (f2)]
Pseudo-Self-Stabilization, Back to The Alternating Bit Algorithm
* The abstract task of the algorithm: S has an infinite queue of input messages (im_1, im_2, …) that should be transferred to the receiver in the same order without duplications, reordering or omissions
* R has an output queue of messages (om_1, om_2, …). The sequence of messages in the output queue should always be a prefix of the sequence of messages in the input queue
The alternating bit algorithm - Sender
  initialization
  begin
    i := 1
    bit_s := 0
    send(bit_s, im_i)            (* im_i is fetched *)
  end                            (* end initialization *)
  upon a timeout
    send(bit_s, im_i)
  upon frame arrival
  begin
    receive(FrameBit)
    if FrameBit = bit_s then     (* acknowledge arrives *)
    begin
      bit_s := (bit_s + 1) mod 2
      i := i + 1
    end
    send(bit_s, im_i)            (* im_i is fetched *)
  end
The alternating bit algorithm - Receiver
  initialization
  begin
    j := 1
    bit_r := 1
  end                            (* end initialization *)
  upon frame arrival
  begin
    receive(FrameBit, msg)
    if FrameBit ≠ bit_r then     (* a new message arrived *)
    begin
      bit_r := FrameBit
      j := j + 1
      om_j := msg                (* om_j is delivered *)
    end
    send(bit_r)
  end
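The sender and receiver programs can be exercised together over a lossy link. The sketch below makes several assumptions: the run starts from the initialization rather than from an arbitrary configuration, at most one frame is in transit, a timeout fires only when both queues are empty, and losses are decided by a caller-supplied function `lose`. The output is kept as a Python list instead of the indexed variables om_j.

```python
from collections import deque

class Sender:
    def __init__(self):
        self.i = 1
        self.bit = 0

    def frame(self):
        return (self.bit, f"im{self.i}")     # im_i is fetched

    def on_ack(self, frame_bit):
        if frame_bit == self.bit:            # acknowledge arrives
            self.bit = (self.bit + 1) % 2
            self.i += 1
        return self.frame()

class Receiver:
    def __init__(self):
        self.bit = 1
        self.output = []

    def on_frame(self, frame_bit, msg):
        if frame_bit != self.bit:            # a new message arrived
            self.bit = frame_bit
            self.output.append(msg)          # the message is delivered
        return self.bit

def simulate(steps, lose):
    s, r = Sender(), Receiver()
    q_sr, q_rs = deque([s.frame()]), deque()
    for t in range(steps):
        if q_sr:
            frame = q_sr.popleft()
            if not lose(t):
                q_rs.append(r.on_frame(*frame))
        elif q_rs:
            ack = q_rs.popleft()
            if not lose(t):
                q_sr.append(s.on_ack(ack))
        else:
            q_sr.append(s.frame())           # timeout: retransmit the current frame
    return r.output

print(simulate(12, lambda t: t % 4 == 0))
```

With every fourth step losing the frame in transit, retransmission still delivers the messages in order, so the output stays a prefix of the input sequence.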
Pseudo-Self-Stabilization, The Alternating Bit Algorithm
* Denote L = ⟨bit_s, q_{s,r}, bit_r, q_{r,s}⟩, the sequence of labels, where q_{s,r} and q_{r,s} are the queues of messages in transit on the link from S to R and from R to S respectively
* A border is two consecutive but different labels
* Once a safe configuration is reached, there is at most one border in L: the value of this label sequence is in [0*1*] or [1*0*]
* We say that a single border between the labels of value 0 and the labels of value 1 slides from the sender to the receiver and back to the sender
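Counting borders in L is a one-line scan. In this sketch the queues are assumed to be listed from the most recently sent item to the oldest, so that L reads along the cycle from S to R and back. Frames are (bit, message) pairs and acknowledgments are bare bits; the function names are illustrative.

```python
def label_sequence(bit_s, q_sr, bit_r, q_rs):
    """L = <bit_s, labels of q_sr, bit_r, labels of q_rs> as a flat list of bits."""
    return [bit_s] + [bit for bit, _ in q_sr] + [bit_r] + list(q_rs)

def count_borders(labels):
    """Number of positions where two consecutive labels differ."""
    return sum(a != b for a, b in zip(labels, labels[1:]))

# Hypothetical configuration: one frame with the new bit and one with the old bit in transit.
L = label_sequence(1, [(1, "a"), (0, "b")], 0, [0, 0])
print(L, count_borders(L))
```

A sequence with at most one border matches 0*1* or 1*0*, which is the shape required once a safe configuration is reached.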
The Alternating Bit Algorithm, borders sample
* Suppose we have two borders
* If frame m2 gets lost, the receiver will have no knowledge about it
[Figure: the link from S (bit_S = 0) to R (bit_R = 1) carries <m3,0> .. <m2,1> .. <m1,0>; after <m2,1> is lost (marked X) it carries <m3,0> … <m1,0>]
Pseudo-Self-Stabilization, The Alternating Bit Algorithm
* Denote L(c_i): the sequence L of the configuration c_i
* A loaded configuration c_i is a configuration in which the first and last values in L(c_i) are equal
Pseudo-Self-Stabilization, The Alternating Bit Algorithm
* The algorithm is pseudo-self-stabilizing for the data-link task: it guarantees that the number of messages that are lost during the infinite execution is bounded, and that the behavior between any two such losses is according to the abstract task of the data link