The Complexity of Adding Failsafe Fault-tolerance
Sandeep S. Kulkarni, Ali Ebnenasir

Motivations
- Why automatic addition of fault-tolerance?
- Why begin with a fault-intolerant program?
  - Reuse of the fault-intolerant program
  - Separation of concerns (functionality vs. fault-tolerance)
  - Potential to preserve properties such as efficiency
- One obstacle
  - Adding masking fault-tolerance to distributed programs is NP-hard [FTRTFT 2000]

Motivation (Continued)
- Approaches for dealing with complexity
  - Heuristics [SRDS 2001]
  - Weaker forms of tolerance
    - Failsafe: only safety is guaranteed in the presence of faults
    - Nonmasking: safety may be temporarily violated
  - Restricting input
    - Programs
    - Specifications

Motivation (Continued)
- Why failsafe fault-tolerance?
  - Simplifies the design of masking fault-tolerance
  - Partial automation of masking fault-tolerance (using TSE '98)
[Figure: diagram relating the intolerant program, failsafe fault-tolerant, nonmasking fault-tolerant, and masking fault-tolerant programs; one step is labeled "Automate"]

Outline of the Talk
- Problem of adding fault-tolerance
- Difficulties caused by distribution
- Complexity of failsafe fault-tolerance
- Class of programs and specifications for which polynomial synthesis is possible

Basic Concepts: Programs and Faults
- State space S_p
- Program transitions delta_p, faults delta_f
- Invariant S, fault-span T
- Specification spec: safety is specified by transitions (s_j, s_k) that should not be executed
[Figure: invariant S and fault-span T, with transitions labeled p/f, p, and f]
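The concepts on this slide can be made concrete with a small sketch. It assumes, beyond what the slide states, that a fault-span can be taken as the set of states reachable from the invariant under program and fault transitions. The integer states and the names `reachable`, `is_closed`, and `unsafe_reachable` are hypothetical.

```python
def reachable(start, transitions):
    """States reachable from `start` by following `transitions`."""
    seen = set(start)
    frontier = list(start)
    while frontier:
        s = frontier.pop()
        for (u, v) in transitions:
            if u == s and v not in seen:
                seen.add(v)
                frontier.append(v)
    return seen


def is_closed(states, transitions):
    """True if no transition that starts in `states` leaves `states`."""
    return all(v in states for (u, v) in transitions if u in states)


# Hypothetical toy instance: states are the integers 0..3.
delta_p = {(0, 1), (1, 0), (2, 0)}   # program transitions
delta_f = {(1, 2), (2, 3)}           # fault transitions
S = {0, 1}                           # invariant
bad = {(2, 3)}                       # safety spec: transitions that must not be executed

# Assumption: use reachability under program and fault transitions as the fault-span.
T = reachable(S, delta_p | delta_f)

# Bad transitions whose source state can be reached once faults occur.
unsafe_reachable = {t for t in bad if t[0] in T}
```

In this toy instance the invariant is closed under program transitions alone, but faults can take the program outside it to a state from which a forbidden transition is possible.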

Problem Statement
- Inputs: program p, invariant S, faults f, specification spec
- Outputs: program p', invariant S'
- Requirements: only fault-tolerance is added; no new functional behavior is added
[Figure: invariant of the fault-intolerant program and invariant of the fault-tolerant program, with regions labeled "No new transition here" and "New transitions may be added here"]

Difficulties with Distribution
- Read/write restrictions
  - Two Boolean variables a and b
  - The process cannot read b
- Can we include the following transition?
    (a=0, b=0) → (a=1, b=0)
- Only if we also include the transition
    (a=0, b=1) → (a=1, b=1)
- Groups of transitions (instead of individual transitions) must be chosen.
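The grouping effect of the read restriction can be sketched as follows. This is a minimal illustration for the two-variable example only; the state encoding as `(a, b)` pairs and the helper `group_of` are hypothetical.

```python
# Hypothetical toy model: a state is a pair (a, b) of Boolean variables (0 or 1).
# The process can read and write a, but cannot read b (and so does not change it).

def group_of(transition, b_values=(0, 1)):
    """Return every transition the process cannot distinguish from `transition`.

    Because b is unreadable, the group holds one copy of the transition
    for each possible value of b.
    """
    (a_from, _), (a_to, _) = transition
    return {((a_from, b), (a_to, b)) for b in b_values}


t = ((0, 0), (1, 0))   # (a=0, b=0) -> (a=1, b=0)
grp = group_of(t)      # also contains (a=0, b=1) -> (a=1, b=1)
```

A synthesis procedure under these restrictions would include or exclude `grp` as a whole, never a single member.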

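Reduction from 3-SAT
[Figure: reduction construction. Transitions are labeled "included iff x_0 is false", "included iff x_0 is true", "included iff x_j is false", "included iff x_k is true", and "included iff x_l is false"; the clause shown is c_j over x_j, x_k, x_l; states a_0 and a_n, with a_n = a_0]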

Dealing with the Complexity of Adding Failsafe Fault-tolerance
- For what class of problems can failsafe fault-tolerance be added in polynomial time?
- Restrictions on
  - Fault-tolerant programs
  - Specifications
  - Faults
- Our approach for restrictions: in the absence of faults, preserve all computations of the fault-intolerant program

Restrictions on Programs and Specifications
- Monotonicity requirements
  - Capture the notion that safe assumptions can be made about variables that cannot be read
  - Focus on specifications and transitions of fault-intolerant programs

Monotonicity of Specifications
- Definition: A specification spec is positive monotonic with respect to variable x iff, for every s_0, s_1, s'_0, s'_1 such that
  - the values of all other variables in s_0 and s'_0 are the same, and
  - the values of all other variables in s_1 and s'_1 are the same:
- If the transition (s_0, s_1), with x = false, does not violate safety,
- then the transition (s'_0, s'_1), with x = true, does not violate safety.
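For small state spaces the definition can be checked by brute force. The sketch below assumes Boolean variables only and represents the safety specification as a set `bad` of forbidden transitions; the names `set_var` and `is_positive_monotonic` are hypothetical.

```python
from itertools import product


def set_var(state, idx, value):
    """Copy of `state` (a tuple of Booleans) with position `idx` set to `value`."""
    s = list(state)
    s[idx] = value
    return tuple(s)


def is_positive_monotonic(bad, n_vars, x):
    """Check positive monotonicity of the spec `bad` with respect to variable index x."""
    states = list(product([False, True], repeat=n_vars))
    for s0, s1 in product(states, repeat=2):
        if s0[x] or s1[x]:
            continue              # premise needs x = false in both states
        if (s0, s1) in bad:
            continue              # premise needs a transition that is safe
        twin = (set_var(s0, x, True), set_var(s1, x, True))
        if twin in bad:
            return False          # safe with x = false, unsafe with x = true
    return True


X = 0  # states are (x, y) pairs
# Forbidden only when x = false: the x = true copy stays safe.
bad_ok = {((False, False), (False, True))}
# Forbidden only when x = true: the safe x = false copy has an unsafe twin.
bad_not = {((True, False), (True, True))}
```

Here `is_positive_monotonic(bad_ok, 2, X)` holds and `is_positive_monotonic(bad_not, 2, X)` does not.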

Monotonicity of Programs
- Definition: Program p with invariant S is negative monotonic with respect to variable x iff, for every s_0, s_1, s'_0, s'_1 such that
  - the values of all other variables in s_0 and s'_0 are the same, and
  - the values of all other variables in s_1 and s'_1 are the same:
- If (s_0, s_1), with x = true, is a transition of p inside the invariant S,
- then (s'_0, s'_1), with x = false, is also a transition of p.
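A brute-force check of this property might look like the following. It assumes that the conclusion of the definition is "the x = false copy is also a program transition", and that "inside the invariant" means both endpoints lie in S. The names `set_var` and `is_negative_monotonic` are hypothetical.

```python
def set_var(state, idx, value):
    """Copy of `state` (a tuple of Booleans) with position `idx` set to `value`."""
    s = list(state)
    s[idx] = value
    return tuple(s)


def is_negative_monotonic(program, invariant, x):
    """Check negative monotonicity of `program` on `invariant` for variable index x."""
    for (s0, s1) in program:
        inside = s0 in invariant and s1 in invariant
        if inside and s0[x] and s1[x]:
            twin = (set_var(s0, x, False), set_var(s1, x, False))
            if twin not in program:
                return False      # x = true transition lacks its x = false copy
    return True


X = 0  # states are (x, y) pairs
S = {(True, False), (True, True)}
p_ok = {((True, False), (True, True)), ((False, False), (False, True))}
p_not = {((True, False), (True, True))}
```

`p_ok` satisfies the check because its x = true transition has a matching x = false copy; `p_not` does not.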

Theorem
- Adding failsafe fault-tolerance can be done in polynomial time if either:
  - Program is negative monotonic, and
  - Spec is positive monotonic
- Or
  - Program is positive monotonic, and
  - Spec is negative monotonic
- If only one of these conditions is satisfied, then adding failsafe fault-tolerance is still NP-hard
- For many problems, these requirements are easily met

Example: Byzantine Agreement
- Processes: general g, and three non-generals j, k, and l
- Variables
  - d.g : {0, 1}
  - d.j, d.k, d.l : {0, 1, ⊥}
  - b.g, b.j, b.k, b.l : {true, false}
  - f.g, f.j, f.k, f.l : {0, 1}
- Fault-intolerant program transitions
    d.j = ⊥ /\ f.j = 0  →  d.j := d.g
    d.j ≠ ⊥ /\ f.j = 0  →  f.j := 1
- Fault transitions
    ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l  →  b.j := true
    b.j  →  d.j, f.j := 0|1, 0|1
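The guarded commands for process j can be sketched as executable functions. This is a minimal illustration: the dictionary-based state, the use of `None` for ⊥, and the function names `program_step_j` and `fault_successors_j` are assumptions, and only the variables that j's actions and faults touch are modeled.

```python
from itertools import product

BOTTOM = None  # stands for the undecided value


def program_step_j(state):
    """Apply the enabled fault-intolerant action of non-general j, if any."""
    s = dict(state)
    if s["f.j"] == 0:
        if s["d.j"] is BOTTOM:
            s["d.j"] = s["d.g"]   # copy the general's decision
        else:
            s["f.j"] = 1          # finalize the decision
    return s


def fault_successors_j(state):
    """All states that a fault affecting j can produce from `state`."""
    succ = []
    if not any(state[f"b.{p}"] for p in "gjkl"):
        s = dict(state)
        s["b.j"] = True           # j becomes Byzantine
        succ.append(s)
    if state["b.j"]:
        for d, f in product((0, 1), repeat=2):
            s = dict(state)
            s["d.j"], s["f.j"] = d, f   # a Byzantine j sets d.j and f.j arbitrarily
            succ.append(s)
    return succ


init = {"d.g": 1, "d.j": BOTTOM, "f.j": 0,
        "b.g": False, "b.j": False, "b.k": False, "b.l": False}
s1 = program_step_j(init)   # j copies d.g
s2 = program_step_j(s1)     # j finalizes
```

Starting from `init`, two program steps take j from undecided to a finalized copy of the general's decision, after which no program action of j is enabled.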

Example: Byzantine Agreement (Continued)
- Safety specification
  - Agreement: no two non-Byzantine non-generals can finalize with different decisions
  - Validity: if g is not Byzantine, no process can finalize with a decision different from that of g
- Read/write restrictions
  - Process j can read: b.j, d.j, f.j, d.g, d.k, d.l
  - Process j can write: d.j, f.j
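The two safety conditions can be written as state predicates, as in the sketch below. It assumes that a transition violates safety when it reaches a state where one of these predicates holds, and that validity is checked only for non-Byzantine non-generals; the helper names (`make_state`, `finalized_good`, `violates_agreement`, `violates_validity`) are hypothetical.

```python
BOTTOM = None  # stands for the undecided value


def make_state(d_g, b_g, decisions, finals, byz):
    """Build a state dictionary for the general g and non-generals j, k, l."""
    s = {"d.g": d_g, "b.g": b_g}
    for p, d, f, b in zip("jkl", decisions, finals, byz):
        s[f"d.{p}"], s[f"f.{p}"], s[f"b.{p}"] = d, f, b
    return s


def finalized_good(state):
    """Non-Byzantine non-generals that have finalized."""
    return [p for p in "jkl" if not state[f"b.{p}"] and state[f"f.{p}"] == 1]


def violates_agreement(state):
    """Two non-Byzantine non-generals finalized with different decisions."""
    return len({state[f"d.{p}"] for p in finalized_good(state)}) > 1


def violates_validity(state):
    """g is not Byzantine, yet some finalized decision differs from d.g."""
    if state["b.g"]:
        return False
    return any(state[f"d.{p}"] != state["d.g"] for p in finalized_good(state))


honest = (False, False, False)
ok = make_state(1, False, (1, 1, BOTTOM), (1, 1, 0), honest)
split = make_state(1, False, (1, 0, BOTTOM), (1, 1, 0), honest)
byz_k = make_state(1, False, (1, 0, BOTTOM), (1, 1, 0), (False, True, False))
```

In `split`, j and k finalize with different decisions, so both predicates hold; in `byz_k` the disagreeing process is Byzantine, so neither does.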

Example: Byzantine Agreement (Continued)
- Observation 1: Positive monotonicity of the specification with respect to b.j
- Observation 2: Negative monotonicity of the program, consisting of the transitions of j, with respect to b.k
- Observation 3: Negative monotonicity of the specification with respect to f.j
- Observation 4: Positive monotonicity of the program, consisting of the transitions of j, with respect to f.k

Summary
- Complexity analysis for failsafe fault-tolerance
  - Reduction from 3-SAT
- Restrictions on specifications and programs for which polynomial synthesis is possible
  - Several problems fall in this category: Byzantine agreement, consensus, commit, …
- Necessity of these restrictions

Future Work
- Simplifying the design of masking fault-tolerance using the two-step approach
- Refining the boundary between the classes for which polynomial synthesis is possible and those for which exponential complexity is inevitable
- Using monotonicity requirements to simplify masking fault-tolerance

Thank You Questions?

Conclusion
- Specifying the boundary between problems for which
  - Fault-tolerance addition can be done in polynomial time
  - Exponential complexity is inevitable
  - Goal: what problems can benefit from automation?
- Necessity and sufficiency of monotonicity requirements
Future Work
- How can we change a non-monotonic program to a monotonic one by modifying its invariant?
- How can we strengthen a non-monotonic specification to a monotonic one?
- How can a nonmasking program be designed manually to satisfy the monotonicity requirements?

Basic Concepts: Fault-tolerant Program
- Fault-tolerance in the presence of faults:
  - Failsafe: satisfies its safety specification
  - Nonmasking: satisfies its liveness specification (safety may be violated temporarily)
  - Masking: satisfies both its safety and liveness specifications

The Complexity of Adding Failsafe Fault-tolerance
- Adding (failsafe/nonmasking/masking) fault-tolerance in the high atomicity model is in P
- Adding masking fault-tolerance to distributed programs is in NP
- How about failsafe?
  - Adding failsafe fault-tolerance to distributed programs is NP-hard (proof in the paper)
  - Reduction of 3-SAT to the problem of failsafe fault-tolerance addition

Our Approach
- Stepwise towards masking fault-tolerance: automating the addition of failsafe fault-tolerance
- How hard is adding failsafe fault-tolerance?
- Where are the polynomial-time boundaries for failsafe fault-tolerance addition?
