Synthesis of Fault-Tolerant Distributed Programs Ali Ebnenasir Department of Computer Science and Engineering Michigan State University East Lansing MI.

Presentation transcript:

Synthesis of Fault-Tolerant Distributed Programs
Ali Ebnenasir
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Advisor: Dr. Sandeep S. Kulkarni

2 Motivation
- Programs are subject to unanticipated faults
  - New classes of faults: add corresponding fault-tolerance
- How to add fault-tolerance?
  - Design a fault-tolerant program from scratch
  - Incremental addition of fault-tolerance
- How to ensure correctness?
  - Verification after the fact
  - Automatic synthesis of fault-tolerant programs (correct by construction)

3 Motivation (Continued)
- Synthesis of fault-tolerant programs
  - Start from a (temporal logic) specification
  - Start from the fault-intolerant program
- Synthesis of fault-tolerant programs from their fault-intolerant versions has the potential to
  - Reuse the behaviors of the fault-intolerant program
  - Preserve behaviors that are hard to specify (e.g., efficiency)
- Problem: complexity of synthesis
  - A polynomial-time non-deterministic algorithm for the synthesis of fault-tolerant distributed programs [FTRTFT00]

4 Outline
- Program and Fault Model
- Distribution Model
- Problem Statement
- Strategy
- Current Results
- Future Plan

5 Program and Fault Model
- A program is identified by its state space and its set of transitions
  - Finite state space S_p
  - Invariant S, fault-span T ⊆ S_p
  - Program p, fault f, safety ⊆ { (s_0, s_1) | (s_0, s_1) ∈ S_p × S_p }
- Fault-tolerance
  - Satisfy a particular fault-tolerance specification in the presence of faults
  - Failsafe, nonmasking, masking
[Figure: state space S_p containing fault-span T and invariant S, with transitions labeled p, f, and p/f]
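The model above can be made concrete with a small sketch. The slide does not say how a fault-span is obtained; the code below assumes one common choice, namely the set of states reachable from the invariant S through program or fault transitions. The function name `fault_span` and the four-state toy model are hypothetical and only serve as an illustration.

```python
from collections import deque

def fault_span(invariant, program, faults):
    """Return the states reachable from `invariant` via program or fault transitions."""
    # Index all transitions of p and f by their source state.
    successors = {}
    for s0, s1 in program | faults:
        successors.setdefault(s0, set()).add(s1)

    # Breadth-first search starting from every invariant state.
    span = set(invariant)
    frontier = deque(invariant)
    while frontier:
        state = frontier.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in span:
                span.add(nxt)
                frontier.append(nxt)
    return span

# Hypothetical toy model with state space S_p = {0, 1, 2, 3}.
program = {(0, 1), (1, 0), (2, 0)}   # p: program transitions
faults = {(1, 2)}                     # f: a fault perturbs state 1 to state 2
invariant = {0, 1}                    # S: closed under p

T = fault_span(invariant, program, faults)
print(sorted(T))  # [0, 1, 2]
```

State 3 is never reached, so it lies in S_p but outside T; the fault transition (1, 2) is what makes T strictly larger than S.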

6 Distribution Model
- Read/write restrictions
- Example
  - A program p with two processes j and k
  - Two Boolean variables a and b
  - Process j cannot read b
  - Can we include the following transition? (a=0, b=0) --> (a=1, b=0)
  - Only if we also include the transition (a=0, b=1) --> (a=1, b=1)
- Groups of transitions (instead of individual transitions) must be chosen
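The grouping in this example can be sketched in a few lines. The helper `group_of` is hypothetical; it assumes that a process cannot write a variable it cannot read, so unreadable variables keep their value across the transition, and it enumerates every value those variables could have.

```python
from itertools import product

VARIABLES = ("a", "b")   # fixed order used to encode states as tuples
DOMAIN = (0, 1)          # both variables are Boolean

def group_of(s0, s1, readable):
    """Return all transitions a process cannot tell apart from (s0, s1)."""
    unreadable = [v for v in VARIABLES if v not in readable]
    # Assumption: a process cannot write what it cannot read.
    if any(s0[v] != s1[v] for v in unreadable):
        raise ValueError("a process cannot change a variable it cannot read")

    group = set()
    for values in product(DOMAIN, repeat=len(unreadable)):
        hidden = dict(zip(unreadable, values))
        before = {**s0, **hidden}
        after = {**s1, **hidden}
        group.add((tuple(before[v] for v in VARIABLES),
                   tuple(after[v] for v in VARIABLES)))
    return group

# Process j reads only a and wants to set a from 0 to 1.
group = group_of({"a": 0, "b": 0}, {"a": 1, "b": 0}, readable={"a"})
print(sorted(group))  # [((0, 0), (1, 0)), ((0, 1), (1, 1))]
```

Including or excluding the transition for process j then means including or excluding the whole returned group.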

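7 Problem Statement
- Inputs to the synthesis algorithm: fault-intolerant program p, specification Spec, invariant S, faults f
- Outputs: fault-tolerant program p', invariant S'
- Setting: finite state space, distribution restrictions
[Figure: state space S_p with invariants S and S' and transitions labeled p and f; one region is annotated "No new transition here" and another "New transitions added here"]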

8 Strategy
- Theoretical issues
  - Develop heuristics
  - Explore polynomial-time boundaries
  - Analyze fault-intolerant programs
- Develop a synthesis framework for
  - Developers of fault-tolerance
  - Developers of heuristics

9 Theoretical Issues - Heuristics
- Apply heuristics to reduce the exponential complexity [SRDS01]
  - Assign weights to transitions and states based on their usefulness
  - Different approaches for resolving deadlocks and livelocks
- Identify the applicability of heuristics to the problem at hand
  - Choose different subsets of heuristics
  - Apply them in different orders

10 Theoretical Issues – Polynomial-Time Boundary
- Find properties of programs/specifications for which polynomial-time synthesis is possible
- Example:
  - Algorithmic synthesis of failsafe fault-tolerant programs is NP-hard [ICDCS02]
  - Polynomial-time synthesis of failsafe fault-tolerance for monotonic programs and specifications

11 Example for Polynomial-Time Boundary: Monotonicity of Specifications
Definition: A specification spec is positive monotonic with respect to variable x iff:
For every s_0, s_1, s'_0, s'_1 such that
- x = false in s_0 and s_1, and x = true in s'_0 and s'_1,
- the values of all other variables in s_0 and s'_0 are the same, and
- the values of all other variables in s_1 and s'_1 are the same:
If (s_0, s_1) does not violate safety, then (s'_0, s'_1) does not violate safety.
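For a small model, this definition can be checked by brute force. The sketch below assumes that all variables are Boolean, that states are tuples in a fixed variable order, and that safety is given as a set `bad` of violating transitions; the function name and the two toy specifications are hypothetical.

```python
from itertools import product

def is_positive_monotonic(bad, variables, x):
    """Check the definition by enumerating all pairs of states with x = False."""
    i = variables.index(x)

    def with_x_true(state):
        # Same state, except that x is set to True.
        return state[:i] + (True,) + state[i + 1:]

    all_states = product((False, True), repeat=len(variables))
    low_states = [s for s in all_states if not s[i]]
    for s0, s1 in product(low_states, repeat=2):
        safe_when_false = (s0, s1) not in bad
        safe_when_true = (with_x_true(s0), with_x_true(s1)) not in bad
        if safe_when_false and not safe_when_true:
            return False
    return True

VARS = ("x", "y")
STATES = list(product((False, True), repeat=2))

# Toy spec 1: raising y is a violation only while x stays False.
bad_if_x_false = {(s0, s1) for s0 in STATES for s1 in STATES
                  if not s0[1] and s1[1] and not s0[0] and not s1[0]}
# Toy spec 2: raising y is a violation only while x stays True.
bad_if_x_true = {(s0, s1) for s0 in STATES for s1 in STATES
                 if not s0[1] and s1[1] and s0[0] and s1[0]}

print(is_positive_monotonic(bad_if_x_false, VARS, "x"))  # True
print(is_positive_monotonic(bad_if_x_true, VARS, "x"))   # False
```

In the second specification, the transition that raises y is safe with x = false but unsafe with x = true, which is exactly the pattern the definition rules out.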

12 Example for Polynomial-Time Boundary: Monotonicity of Programs
Definition: Program p with invariant S is negative monotonic with respect to variable x iff:
For every s_0, s_1, s'_0, s'_1 such that
- x = true in s_0 and s_1, and x = false in s'_0 and s'_1,
- the values of all other variables in s_0 and s'_0 are the same, and
- the values of all other variables in s_1 and s'_1 are the same:
If (s_0, s_1) is a transition of p within invariant S, then (s'_0, s'_1) is also a transition of p within invariant S.

13 Example for Polynomial-Time Boundary: Theorem
- Synthesis of failsafe fault-tolerance can be done in polynomial time if either:
  - the program is negative monotonic and the spec is positive monotonic; or
  - the program is positive monotonic and the spec is negative monotonic.
- If only one of the two (the program or the spec) is monotonic, synthesizing failsafe fault-tolerance is still NP-hard.
- For many problems, these requirements are easily met, e.g., Agreement, Consensus, and Commit.

14 Example for Polynomial-Time Boundary: Byzantine Agreement
- Processes: a general g and three non-generals j, k, and l
- Variables
  - d.g : {0, 1}
  - d.j, d.k, d.l : {0, 1, ⊥}
  - b.g, b.j, b.k, b.l : {true, false}
  - f.j, f.k, f.l : {0, 1}
- Fault-intolerant program transitions
  - d.j = ⊥ /\ f.j = 0  -->  d.j := d.g
  - d.j ≠ ⊥ /\ f.j = 0  -->  f.j := 1
- Fault transitions
  - ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l  -->  b.j := true
  - b.j  -->  d.j := 0|1
[Figure: general g connected to non-generals j, k, and l]

15 Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
- Safety specification
  - Agreement: No two non-Byzantine non-generals can finalize with different decisions
  - Validity: If g is not Byzantine, each non-Byzantine non-general process should finalize with the same decision as g
- Read/write restrictions
  - Process j can read b.j, d.j, f.j, d.g, d.k, d.l
  - Process j can write d.j, f.j
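The two safety conditions can be written as predicates over a single state. This is only a sketch: the nested-dictionary state encoding is an assumption, as is reading b.p = true as "p is Byzantine" and f.p = 1 as "p has finalized"; `None` stands for the undecided value ⊥.

```python
NON_GENERALS = ("j", "k", "l")

def finalized_honest(state):
    """Non-Byzantine non-generals that have finalized their decision."""
    return [p for p in NON_GENERALS
            if not state["b"][p] and state["f"][p] == 1]

def violates_agreement(state):
    # Two honest, finalized non-generals hold different decisions.
    decisions = {state["d"][p] for p in finalized_honest(state)}
    return len(decisions) > 1

def violates_validity(state):
    # Validity only constrains executions where the general is honest.
    if state["b"]["g"]:
        return False
    return any(state["d"][p] != state["d"]["g"]
               for p in finalized_honest(state))

HONEST = {"g": False, "j": False, "k": False, "l": False}

good = {"d": {"g": 1, "j": 1, "k": 1, "l": None},
        "b": dict(HONEST),
        "f": {"j": 1, "k": 1, "l": 0}}

# A Byzantine general told j "0" and told k and l "1"; j and k then finalized.
split = {"d": {"g": 0, "j": 0, "k": 1, "l": 1},
         "b": {**HONEST, "g": True},
         "f": {"j": 1, "k": 1, "l": 0}}

print(violates_agreement(good), violates_validity(good))    # False False
print(violates_agreement(split), violates_validity(split))  # True False
```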

16 Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
- Observation 1: Positive monotonicity of specification with respect to b.j
- Observation 2: Negative monotonicity of program, consisting of the transitions of j, with respect to b.k
- Observation 3: Negative monotonicity of specification with respect to f.j
- Observation 4: Positive monotonicity of program, consisting of the transitions of j, with respect to f.k

17 Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
- Failsafe fault-tolerant program
  - d.j = ⊥ /\ f.j = 0  -->  d.j := d.g
  - d.j ≠ ⊥ /\ ((d.j = d.k) \/ (d.j = d.l)) /\ f.j = 0  -->  f.j := 1
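The difference between the fault-intolerant and the failsafe finalize guards can be seen on one scenario. The sketch below encodes only the two guards for f.j := 1; the function names and the dictionary encoding are hypothetical, and `BOTTOM` stands for ⊥.

```python
BOTTOM = None  # stands for the undecided value

def intolerant_can_finalize(d, f, p):
    # Guard of the fault-intolerant program: d.p != bottom and f.p = 0.
    return d[p] is not BOTTOM and f[p] == 0

def failsafe_can_finalize(d, f, p, others):
    # Strengthened guard: p must also agree with at least one other non-general.
    return (d[p] is not BOTTOM
            and f[p] == 0
            and any(d[p] == d[q] for q in others))

# A Byzantine general sent 0 to j and 1 to k and l; nobody has finalized yet.
d = {"j": 0, "k": 1, "l": 1}
f = {"j": 0, "k": 0, "l": 0}

print(intolerant_can_finalize(d, f, "j"))            # True
print(failsafe_can_finalize(d, f, "j", ("k", "l")))  # False
print(failsafe_can_finalize(d, f, "k", ("j", "l")))  # True
```

With the strengthened guard, j is blocked from finalizing a decision that neither k nor l shares, while k can still finalize because it agrees with l.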

18 Theoretical Issues – Analysis of Fault-Intolerant Programs
- Analyze the behavior and the structure of the fault-intolerant program.
- Example:
  - Reasoning about the program in high atomicity; i.e., no distribution restrictions.
  - Enhancement of fault-tolerance [ICDCS03].
- Take advantage of model checkers.

19 Theoretical Issues – Analysis of Fault-Intolerant Programs
[Figure: the synthesis framework takes the fault-intolerant program and produces the fault-tolerant program; it exchanges an intermediate program in Promela and a counterexample with the SPIN model checker]

20 Theoretical Issues: Current Results
[Figure: results relating the intolerant program to three levels of fault-tolerance]
- Masking fault-tolerant [FTRTFT00]
- Failsafe fault-tolerant [ICDCS02]
- Nonmasking fault-tolerant [ICDCS03]

21 Synthesis Framework
- Goals:
  - Algorithmic synthesis of fault-tolerant programs from their fault-intolerant versions.
  - Easy to integrate new heuristics.
  - Easy to change its implementation.
- Users:
  - Developers of fault-tolerance.
  - Developers of heuristics.
- Examples:
  - A canonical version of Byzantine agreement.
  - An agreement program that is subject to Byzantine and failstop faults (1.3 million states).
  - A token ring program perturbed by state-corruption faults.

22 Related Work
- E.A. Emerson and E.M. Clarke, Using branching time temporal logic to synthesize synchronization skeletons
- Z. Manna and P. Wolper, Synthesis of communicating processes from temporal logic specifications
- A. Arora, P.C. Attie, and E.A. Emerson, Synthesis of fault-tolerant concurrent programs
- P.C. Attie and E.A. Emerson, Synthesis of concurrent programs for an atomic read/write model of computation
- O. Kupferman and M. Vardi, Synthesis with incomplete information, 1997

23 Future Plan
- Theoretical issues
  - Develop more intelligent heuristics to reduce the chance of failure in the synthesis
  - Find the polynomial-time boundary for other levels of fault-tolerance
- Synthesis framework issues
  - Scalability of the synthesis framework for larger programs
  - Implement the synthesis algorithm on a distributed platform

24 Future Plan - Continued
- Synthesis framework issues
  - Use model checkers for behavioral analysis
  - Query: intermediate program; reachability analysis from a given state
  - Result set: deadlock states; non-progress cycles; finite sequence of states
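One kind of query listed here, reachability analysis from a given state that returns deadlock states together with a finite sequence of states leading to each, can be sketched with an explicit-state search. This is a stand-in written for illustration, not the interface of SPIN or of the synthesis framework; the function name and the toy transition set are hypothetical, and a deadlock state is taken to be a state with no outgoing transition.

```python
from collections import deque

def reachable_deadlocks(start, transitions):
    """Map each deadlock state reachable from `start` to a path that reaches it."""
    successors = {}
    for s0, s1 in transitions:
        successors.setdefault(s0, []).append(s1)

    parent = {start: None}   # also serves as the visited set
    queue = deque([start])
    deadlocks = {}
    while queue:
        state = queue.popleft()
        nexts = successors.get(state, [])
        if not nexts:
            # No outgoing transition: rebuild the path back to `start`.
            path, node = [], state
            while node is not None:
                path.append(node)
                node = parent[node]
            deadlocks[state] = path[::-1]
        for nxt in nexts:
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return deadlocks

# Hypothetical intermediate program given as a set of transitions.
transitions = {("s0", "s1"), ("s1", "s0"), ("s1", "s2"), ("s3", "s4")}

print(reachable_deadlocks("s0", transitions))  # {'s2': ['s0', 's1', 's2']}
```

State s4 also has no outgoing transition, but it is not reported because it cannot be reached from s0.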

25 Publications
[ICDCS02] Sandeep S. Kulkarni and Ali Ebnenasir. The Complexity of Adding Failsafe Fault-Tolerance. The 22nd International Conference on Distributed Computing Systems, July 2-5, Vienna, Austria.
[ICDCS03] Sandeep S. Kulkarni and Ali Ebnenasir. Enhancing the Fault-Tolerance of Nonmasking Programs. Accepted in the 23rd International Conference on Distributed Computing Systems, May , Providence, Rhode Island, USA.
[SRDS03] Sandeep S. Kulkarni and Ali Ebnenasir. A Framework for Automatic Synthesis of Fault-Tolerance. Submitted to the 22nd Symposium on Reliable Distributed Systems, October 6th-8th, Florence, Italy.
The implementation of the synthesis framework:

26 Thank You! Questions and Comments?

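27 Reduction from 3-SAT
[Figure: reduction gadget over states a_0, ..., a_n (with a_n = a_0), x_0, ..., x_n, and x'_0, ..., x'_n. Transition labels for a variable: "Included iff x_0 is false", "Included iff x_0 is true". For a clause c_j over the variables x_j, x_k, x_l: "Included iff x_j is false", "Included iff x_k is true", "Included iff x_l is false".]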