Self-Stabilization An Introduction Aly Farahat Ph.D. Student Automatic Software Design Lab Computer Science Department Michigan Technological University.


10/14/2009, Automatic Software Design Lab

Self-stabilization is the ability of a system to resume its ‘legal behavior’ in a finite number of steps, regardless of its initial state.

Contents
– Basic Concepts
– Self-Stabilizing Operating System
– Design of Self-Stabilization

Basic Concepts

Properties of Self-Stabilization
– Closure: in the absence of faults, program execution remains in a set of legitimate states (i.e., the invariant).
– Convergence: any computation has a suffix entirely confined within the set of legitimate states.
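The two properties can be checked mechanically on a small finite transition system. The sketch below is only an illustration: the dictionary encoding of transitions, the function names `is_closed` and `converges`, and the toy systems `good` and `bad` are assumptions made for the example, not part of the presentation. Convergence is checked as "every computation reaches the legitimate set", which together with closure gives the suffix property stated above.

```python
def is_closed(transitions, legit):
    """Closure: every transition that starts in a legitimate state ends in one."""
    return all(t in legit for s in legit for t in transitions.get(s, ()))


def converges(transitions, legit):
    """Convergence: every computation from every state reaches the legitimate set.

    Computed as a fixpoint: a state is 'safe' if it is legitimate, or if it has
    at least one successor and all of its successors are already safe.
    A deadlocked state outside the legitimate set is therefore never safe.
    """
    states = set(transitions) | {t for succ in transitions.values() for t in succ}
    safe = set(legit)
    changed = True
    while changed:
        changed = False
        for s in states - safe:
            succ = transitions.get(s, ())
            if succ and all(t in safe for t in succ):
                safe.add(s)
                changed = True
    return safe >= states


# Hypothetical toy systems: state 0 is the only legitimate state.
legit = {0}
good = {3: [2], 2: [1, 0], 1: [0], 0: [0]}
bad = {3: [2], 2: [1, 0], 1: [0, 2], 0: [0]}  # 1 -> 2 -> 1 is a cycle outside legit

print(is_closed(good, legit), converges(good, legit))
print(is_closed(bad, legit), converges(bad, legit))
```

In `bad`, closure still holds, but the cycle between states 1 and 2 lets a computation stay outside the legitimate set forever, so convergence fails.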

Motivation
– No need for initialization
– Recovery after faults stop occurring
– Few assumptions about the fault model (transient faults)
– Captures a family of fault-tolerant systems

Program’s Execution

Invariant
A set of states from which every program execution meets the program’s requirements
– a.k.a. the set of legitimate states
– E.g., the non-faulty nodes in a network should remain connected
Assuming no faults, the program’s state in any computation is in the invariant.

Dijkstra’s Token-Ring
P_1: x_1 = x_N → x_1 := (x_N + 1) mod N
P_i (i > 1): x_i ≠ x_{i-1} → x_i := x_{i-1}
N: number of processes
x_i: variable associated with P_i
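A minimal simulation of the two guarded commands above, assuming a central scheduler that fires one enabled process at a time, chosen at random. The 0-based indexing (index 0 plays the role of P_1), the function names, and the `max_steps` bound are assumptions of this sketch rather than anything prescribed by the slides.

```python
import random


def enabled(x, i):
    """True when process i's guard holds (index 0 plays the role of P_1)."""
    if i == 0:
        return x[0] == x[-1]
    return x[i] != x[i - 1]


def fire(x, i):
    """Execute process i's action in place; the caller ensures the guard holds."""
    n = len(x)
    if i == 0:
        x[0] = (x[-1] + 1) % n
    else:
        x[i] = x[i - 1]


def tokens(x):
    """Indices of all processes whose guard is currently true."""
    return [i for i in range(len(x)) if enabled(x, i)]


def stabilize(x, rng, max_steps=10_000):
    """Fire randomly chosen enabled processes until exactly one guard is true."""
    steps = 0
    while len(tokens(x)) != 1:
        if steps >= max_steps:
            raise RuntimeError("did not converge within max_steps")
        fire(x, rng.choice(tokens(x)))
        steps += 1
    return steps


rng = random.Random(0)
state = [rng.randrange(5) for _ in range(5)]  # arbitrary initial state
print("initial:", state, "enabled:", tokens(state))
steps = stabilize(state, rng)
print("after", steps, "steps:", state, "enabled:", tokens(state))
```

Starting from an arbitrary assignment of the x values, the run ends in a state where exactly one guard is true, and firing that process passes the single privilege to its successor.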

Self-Stabilizing Operating System

SS Operating Systems
Need for self-organizing OS
Defines
– a model for the processor execution
– a model for system execution
– the aggregate system state and self-stabilization (high atomicity)

Solution Foundations
General ideas:
– Continuous monitoring and consistency enforcement
– Reset
SS components of an OS:
– Loader
– Scheduler
– Code portions in ROM

Design of Self-Stabilization

Designing self-stabilization is a complex task
– In part because there is no central point of control
Automation is one way to facilitate the design of SS

Research Problem

Write Restrictions
– Transitions belonging to a specified process are allowed to write only a proper subset of the state variables
– Effect: transitions that write any variable outside the process’s write set (a “non-subset” of the write variables) are not included in this process

Read Restrictions
– Transitions belonging to a specified process are allowed to read only a proper subset of the state variables
– Effect: each transition in this process has group-mates, i.e., transitions originating at the source states that differ only in the unreadable variables, which act as “don’t cares”
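The grouping effect of read restrictions can be illustrated by enumerating the group-mates of one transition. This is a sketch under stated assumptions: states are tuples of variable values, `group_of` is a hypothetical helper name, and the process is assumed unable to write a variable it cannot read, so each unreadable variable keeps its value across the transition.

```python
from itertools import product


def group_of(transition, readable, domains):
    """All transitions a process cannot tell apart from `transition`.

    transition: (source_state, target_state), each a tuple of variable values.
    readable:   set of variable indices the process can read.
    domains:    one iterable of possible values per variable.
    """
    src, dst = transition
    unreadable = [v for v in range(len(domains)) if v not in readable]
    group = set()
    for values in product(*(domains[v] for v in unreadable)):
        s, d = list(src), list(dst)
        for v, val in zip(unreadable, values):
            s[v] = val  # unreadable variable is a "don't care" in the source
            d[v] = val  # assumption: unreadable implies unwritten, so unchanged
        group.add((tuple(s), tuple(d)))
    return group


# Three binary variables; the process reads variables 0 and 1 only.
domains = [(0, 1)] * 3
group = group_of(((0, 1, 0), (1, 1, 0)), readable={0, 1}, domains=domains)
for s, d in sorted(group):
    print(s, "->", d)
```

Adding or removing a transition therefore means adding or removing its whole group, which is why a recovery transition that looks harmless in one state may introduce unwanted behavior in another.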

Token-Ring R/W Restrictions

Token-Ring Design
– Fault-intolerant token ring
– Resolve deadlock states
– Consider R/W restrictions
– Do not add cycles

Further Readings
– E. W. Dijkstra, “Self-stabilization in spite of distributed control,” in Selected Writings on Computing: A Personal Perspective. Springer-Verlag, Berlin, 1982, Originally Published in
– A. Arora & M. G. Gouda, “Closure and convergence: a foundation of fault-tolerant computing,” in Proceedings of the 22nd International Conference on Fault-Tolerant Computing Systems,
– S. Dolev & R. Yagel, “Toward self-stabilizing operating systems,” in Proceedings of the 2nd International Workshop on Self-Adaptive and Autonomic Computing Systems (SAACS’04), Zaragoza, Spain, pp , 2004

Thank you!

Bottlenecks
State space explosion
Considering the following is hard:
– Cycle resolution
– Deadlock resolution
– Read restrictions (grouping of transitions)

Self-Stabilization in High Atomicity
Definitions:
– Global predicate: a Boolean function of all state variables
– Local predicate: a predicate on a process’s locally readable variables
High atomicity:
– All variables are atomically readable by any process
– Global predicates are detectable in any process
Recovery is ensured in at most N steps under write restrictions only
– N: the number of distributed processes

Self-Stabilization in Low Atomicity
– Read restrictions: local actions affecting global behavior
– The invariant is not detectable in each component due to read restrictions
– Grouping: transitions are grouped for all unreadable variables

Handling Bottlenecks
State explosion: use of symbolic methods
– Currently we are using BDD libraries
Cycle resolution: symbolic SCC detection algorithms [Gentilini et al.]
– We are investigating properties of directed graphs to resolve cycles in SCCs
Hardness: use heuristic search and sound but incomplete algorithms

Ranking States
We rank the states outside the invariant based on write restrictions only.
Ranks construction:
– Rank 0 =def Invariant
– Rank n: states that directly reach Rank n-1 and are unreachable by non-ranked states
– I.e., there exists a process that can modify a state in Rank i by atomically writing its own variables to reach a state of Rank i-1
This is a backward reachability analysis using “Onion Rings” [Gentilini et al.]
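The rank construction can be sketched as an explicit-state, layer-by-layer backward search from the invariant. This is only an illustration: the presentation works symbolically (with BDDs), whereas the code below enumerates states directly, and the names `one_step` and `rank_states`, the tuple encoding of states, and the four-variable example are assumptions of the sketch.

```python
from itertools import product


def one_step(s, t, write_sets):
    """True if a single process can turn s into t by writing only its own variables."""
    changed = {v for v in range(len(s)) if s[v] != t[v]}
    return bool(changed) and any(changed <= w for w in write_sets)


def rank_states(domains, write_sets, invariant):
    """Assign rank 0 to invariant states and rank i to states one write away from rank i-1."""
    rank = {s: 0 for s in invariant}
    unranked = set(product(*domains)) - set(invariant)
    previous = set(invariant)
    level = 0
    while previous and unranked:
        level += 1
        current = {
            s for s in unranked
            if any(one_step(s, t, write_sets) for t in previous)
        }
        for s in current:
            rank[s] = level
        unranked -= current
        previous = current
    return rank


# Hypothetical example: four binary variables, process i writes variable i only,
# and the invariant holds when all variables agree.
domains = [(0, 1)] * 4
write_sets = [{0}, {1}, {2}, {3}]
invariant = {(0, 0, 0, 0), (1, 1, 1, 1)}
rank = rank_states(domains, write_sets, invariant)
print(rank[(0, 0, 0, 1)], rank[(0, 0, 1, 1)])
```

A state's rank is then a lower bound on the number of recovery steps needed from it, since only write restrictions are taken into account.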

Ranking (barrier synchronization example)

Incremental Addition
We first check that the original program has no cycles outside the invariant; otherwise we declare failure.
We then incrementally add recovery transitions with their group-mates, process by process and rank by rank.
We ensure the following:
– We add groups having no transitions in cycles outside the invariant
– We add groups having at least one transition resolving deadlock states
– We do not add transitions originating in the invariant
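The three acceptance conditions above can be sketched as a filter applied to each candidate group. This is an explicit-state illustration, not the symbolic procedure used in the presented work: transitions are modeled as `(source, target)` pairs, and the function names and the toy program are assumptions of the sketch.

```python
def has_cycle_outside(transitions, invariant):
    """True if some cycle lies entirely outside the invariant.

    Repeatedly prunes outside states with no outside successor; any state
    that survives has a successor among the survivors, hence lies on or
    leads into a cycle.
    """
    edges = {(s, t) for s, t in transitions
             if s not in invariant and t not in invariant}
    nodes = {s for s, _ in edges} | {t for _, t in edges}
    while True:
        dead = nodes - {s for s, _ in edges}
        if not dead:
            return bool(nodes)
        nodes -= dead
        edges = {(s, t) for s, t in edges if t in nodes}


def deadlocks(program, states, invariant):
    """States outside the invariant with no outgoing transition."""
    have_out = {s for s, _ in program}
    return {s for s in states if s not in invariant and s not in have_out}


def try_add_group(program, group, states, invariant):
    """Return program plus group if the group passes all three checks, else program."""
    if any(s in invariant for s, _ in group):
        return program  # never add transitions originating in the invariant
    dead = deadlocks(program, states, invariant)
    if not any(s in dead for s, _ in group):
        return program  # the group resolves no deadlock state
    candidate = program | group
    if has_cycle_outside(candidate, invariant):
        return program  # the group would create a cycle outside the invariant
    return candidate


# Hypothetical toy program: state 0 is the invariant.
states, invariant = {0, 1, 2, 3}, {0}
program = {(0, 0)}
for group in [{(1, 0)}, {(2, 3), (3, 2)}, {(2, 1), (3, 1)}]:
    program = try_add_group(program, group, states, invariant)
print(sorted(program), deadlocks(program, states, invariant))
```

The second group is rejected because it would trap the program in the cycle between states 2 and 3; the third group resolves the same deadlocks without creating a cycle.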

Future Work
– Parallelizing synthesis algorithms
– Compositional synthesis of self-stabilization

Token-Ring (cont’d)
– A guard for P_i evaluates to true when P_i has the token
– Legal states (invariant): only one process has the token
– Illegal state: more than one process has a token
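For a small ring, the token-based invariant can be checked exhaustively over every state. The sketch below assumes the guarded commands of Dijkstra's token ring as given earlier in the deck, with 0-based indices (index 0 plays the role of P_1); `token_holders`, `step` and `check_ring` are hypothetical helper names.

```python
from itertools import product


def token_holders(x):
    """Indices of processes whose guard is true, i.e. that hold a token."""
    holders = [0] if x[0] == x[-1] else []
    holders += [i for i in range(1, len(x)) if x[i] != x[i - 1]]
    return holders


def step(x, i):
    """State reached when process i executes its action in state x."""
    y = list(x)
    y[i] = (x[-1] + 1) % len(x) if i == 0 else x[i - 1]
    return tuple(y)


def check_ring(n):
    """Check two facts over all n**n states of an n-process ring."""
    always_a_token = True   # no state is deadlocked
    legal_is_closed = True  # one token stays one token after a step
    for x in product(range(n), repeat=n):
        holders = token_holders(x)
        if not holders:
            always_a_token = False
        if len(holders) == 1 and len(token_holders(step(x, holders[0]))) != 1:
            legal_is_closed = False
    return always_a_token, legal_is_closed


print(check_ring(4))
```

The first flag says that a state with zero tokens never occurs, which is why the illegal states are exactly those with more than one token; the second flag is the closure property for the legal states.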

Generalized Self-Stabilization
– Q leads-to P [Arora]
– In our case, Q is the constant predicate True