1
The Complexity of Adding Failsafe Fault-tolerance
Sandeep S. Kulkarni, Ali Ebnenasir
2
Motivations
Why automatic addition of fault-tolerance?
Why begin with a fault-intolerant program?
  Reuse of the fault-intolerant program
  Separation of concerns (functionality vs. fault-tolerance)
  Potential to preserve properties such as efficiency
One obstacle:
  Adding masking fault-tolerance to distributed programs is NP-hard [FTRTFT 2000]
3
Motivation (Continued)
Approaches for dealing with complexity:
  Heuristics [SRDS 2001]
  Weaker forms of tolerance
    Failsafe: safety only, in the presence of faults
    Nonmasking: safety may be temporarily violated
  Restricting the input
    Programs
    Specifications
4
Motivation (Continued)
Why failsafe fault-tolerance?
  Simplify the design of masking fault-tolerance
  Partial automation of masking fault-tolerance (using TSE '98)
[Figure: intolerant program, nonmasking fault-tolerant program, failsafe fault-tolerant program, and masking fault-tolerant program, with one step labeled "Automate"]
5
Outline of the Talk
  Problem of adding fault-tolerance
  Difficulties caused by distribution
  Complexity of failsafe fault-tolerance
  Class of programs and specifications for which polynomial synthesis is possible
6
Basic Concepts: Programs and Faults
  State space S_p
  Program transitions δ_p, faults δ_f
  Invariant S, fault-span T
  Specification spec: safety is specified by transitions (s_j, s_k) that should not be executed
[Figure: invariant S and fault-span T, with transitions labeled p/f, p, and f]
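The definitions on this slide can be made concrete with a small sketch. The Python below models program and fault transitions as sets of (source, target) pairs over a finite state space and computes one natural fault-span: the set of states reachable from the invariant under program and fault transitions. The function names `fault_span` and `violates_safety` and the three-state toy example are illustrative assumptions, not taken from the slides.

```python
from collections import deque

def fault_span(invariant, program, faults):
    """Return the states reachable from `invariant` via program or fault transitions."""
    successors = {}
    for src, dst in program | faults:
        successors.setdefault(src, set()).add(dst)
    reached = set(invariant)
    queue = deque(invariant)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached

def violates_safety(transitions, bad_transitions):
    """Safety is given as a set of transitions that must never be executed."""
    return bool(transitions & bad_transitions)

# Hypothetical toy example: a fault perturbs the program out of its invariant,
# after which the program itself can execute a bad transition.
invariant = {"ok"}
program = {("ok", "ok"), ("perturbed", "bad")}
faults = {("ok", "perturbed")}
bad = {("perturbed", "bad")}

span = fault_span(invariant, program, faults)
```

In this toy example the program is safe inside its invariant but not inside the fault-span, which is the situation that adding failsafe fault-tolerance has to repair.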
7
Problem Statement
Inputs: program p, invariant S, faults f, specification spec
Outputs: program p', invariant S'
Requirements: only fault-tolerance is added; no new functional behavior is added
[Figure: invariant of the fault-intolerant program and invariant of the fault-tolerant program. Labels: "No new transition here"; "New transitions may be added here"]
8
Difficulties with Distribution
Read/Write restrictions
  Two Boolean variables a and b
  Process cannot read b
  Can we include the following transition?
    a=0,b=0 → a=1,b=0
  Only if we include the transition
    a=0,b=1 → a=1,b=1
Groups of transitions (instead of individual transitions) must be chosen.
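The grouping effect described on this slide can be sketched in a few lines of Python. Given one transition and the set of variables the process cannot read, the sketch enumerates every transition the process cannot distinguish from it. The helper name `transition_group`, the dictionary encoding of states, and the assumption that a process cannot write a variable it cannot read are illustrative choices, not part of the slides.

```python
from itertools import product

def transition_group(src, dst, unreadable, domains):
    """Return all transitions indistinguishable from src -> dst.

    src, dst: dicts mapping variable name -> value.
    unreadable: names of variables the process cannot read.
    domains: dict mapping variable name -> iterable of possible values.
    """
    # Assumption: an unreadable variable is also unwritable, so it stays unchanged.
    for var in unreadable:
        if src[var] != dst[var]:
            raise ValueError(f"process cannot change unreadable variable {var}")
    names = sorted(unreadable)
    group = set()
    for values in product(*(domains[name] for name in names)):
        hidden = dict(zip(names, values))
        s = {**src, **hidden}
        d = {**dst, **hidden}
        # Freeze the dicts so that transitions can be stored in a set.
        group.add((tuple(sorted(s.items())), tuple(sorted(d.items()))))
    return group

domains = {"a": (0, 1), "b": (0, 1)}
group = transition_group({"a": 0, "b": 0}, {"a": 1, "b": 0}, {"b"}, domains)
```

For the slide's example, `group` contains both a=0,b=0 → a=1,b=0 and a=0,b=1 → a=1,b=1, so the synthesis procedure has to include or exclude the two together.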
9
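Reduction from 3-SAT
[Figure: reduction gadget. Transition labels: "Included iff x_0 is false"; "Included iff x_0 is true"; "Included iff x_j is false"; "Included iff x_k is true"; "Included iff x_l is false". Clause label: c_j = x_j \/ x_k \/ x_l, with an overbar (negation) over one literal. State labels: a_0, a_n = a_0]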
10
Dealing with the Complexity of Adding Failsafe Fault-tolerance
For what class of problems can failsafe fault-tolerance be added in polynomial time?
Restrictions on:
  Fault-tolerant programs
  Specifications
  Faults
Our approach for restrictions: in the absence of faults, preserve all computations of the fault-intolerant program
11
Restrictions on Programs and Specifications
Monotonicity requirements
  Capture the notion that safe assumptions can be made about variables that cannot be read
  Focus on specifications and transitions of fault-intolerant programs
12
Monotonicity of Specifications
Definition: A specification spec is positive monotonic with respect to variable x iff, for every s_0, s_1, s'_0, s'_1 such that
  the values of all other variables are the same in s_0 and s'_0, and
  the values of all other variables are the same in s_1 and s'_1:
If the transition (s_0, s_1), with x = false, does not violate safety,
then the transition (s'_0, s'_1), with x = true, does not violate safety.
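For a small finite state space, the definition above can be checked by brute force. The sketch below assumes Boolean variables only, encodes a state as a tuple of Booleans, and represents the safety specification as a set of bad transitions; the function name `is_positive_monotonic` and both toy specifications are hypothetical and serve only to illustrate the definition.

```python
from itertools import product

def is_positive_monotonic(bad_transitions, num_vars, x):
    """Brute-force check of positive monotonicity with respect to variable index x."""
    def with_x(state, value):
        # Same state, except that variable x takes the given value.
        return state[:x] + (value,) + state[x + 1:]

    states = list(product((False, True), repeat=num_vars))
    for s0, s1 in product(states, repeat=2):
        if s0[x] or s1[x]:
            continue  # the premise only covers transitions with x = false
        if (s0, s1) in bad_transitions:
            continue  # the premise requires a safe transition
        if (with_x(s0, True), with_x(s1, True)) in bad_transitions:
            return False  # the x = true counterpart violates safety
    return True

# Two variables, indexed (x, y). Toy specifications:
all_states = list(product((False, True), repeat=2))
# Bad: reaching a state where y holds while x is false.
bad_when_x_false = {(s0, s1) for s0, s1 in product(all_states, repeat=2)
                    if not s1[0] and s1[1]}
# Bad: reaching a state where y holds while x is true.
bad_when_x_true = {(s0, s1) for s0, s1 in product(all_states, repeat=2)
                   if s1[0] and s1[1]}

first = is_positive_monotonic(bad_when_x_false, num_vars=2, x=0)
second = is_positive_monotonic(bad_when_x_true, num_vars=2, x=0)
```

The first specification is positive monotonic in x because setting x to true can only make a transition safer; the second is not, since a transition that is safe with x = false becomes unsafe once x = true.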
13
Monotonicity of Programs
Definition: Program p with invariant S is negative monotonic with respect to variable x iff, for every s_0, s_1, s'_0, s'_1 such that
  the values of all other variables are the same in s_0 and s'_0, and
  the values of all other variables are the same in s_1 and s'_1:
[Figure: transition (s_0, s_1) inside invariant S, with x = true in both states, and the corresponding transition (s'_0, s'_1), with x = false in both states]
14
Theorem
Adding failsafe fault-tolerance can be done in polynomial time if either:
  Program is negative monotonic, and spec is positive monotonic
  Or
  Program is positive monotonic, and spec is negative monotonic
If only one of these conditions is satisfied, then adding failsafe fault-tolerance is still NP-hard
For many problems, these requirements are easily met
15
Example: Byzantine Agreement
Processes: general g, and three non-generals j, k, and l
Variables
  d.g : {0, 1}
  d.j, d.k, d.l : {0, 1, ⊥}
  b.g, b.j, b.k, b.l : {true, false}
  f.g, f.j, f.k, f.l : {0, 1}
Fault-intolerant program transitions
  d.j = ⊥ /\ f.j = 0 → d.j := d.g
  d.j ≠ ⊥ /\ f.j = 0 → f.j := 1
Fault transitions
  ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l → b.j := true
  b.j → d.j, f.j := 0|1, 0|1
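The guarded commands on this slide can be read as successor functions on states. The Python sketch below encodes them for non-general j only, under several assumptions: a state is a dictionary keyed by the variable names on the slide, ⊥ is represented by `None`, and the function names `program_steps_j` and `fault_steps_j` are made up for illustration.

```python
from itertools import product

BOTTOM = None  # stands in for the undecided value ⊥

def program_steps_j(state):
    """Successor states under the two fault-intolerant actions of non-general j."""
    successors = []
    # d.j = ⊥ /\ f.j = 0 → d.j := d.g  (copy the general's decision)
    if state["d.j"] is BOTTOM and state["f.j"] == 0:
        successors.append({**state, "d.j": state["d.g"]})
    # d.j ≠ ⊥ /\ f.j = 0 → f.j := 1  (finalize the decision)
    if state["d.j"] is not BOTTOM and state["f.j"] == 0:
        successors.append({**state, "f.j": 1})
    return successors

def fault_steps_j(state):
    """Successor states under the two fault actions that affect j."""
    successors = []
    # If no process is Byzantine, j may become Byzantine.
    if not any(state[b] for b in ("b.g", "b.j", "b.k", "b.l")):
        successors.append({**state, "b.j": True})
    # A Byzantine j may set d.j and f.j arbitrarily.
    if state["b.j"]:
        for d, f in product((0, 1), repeat=2):
            successors.append({**state, "d.j": d, "f.j": f})
    return successors

initial = {"d.g": 1, "d.j": BOTTOM, "f.j": 0,
           "b.g": False, "b.j": False, "b.k": False, "b.l": False}
after_copy = program_steps_j(initial)
after_finalize = program_steps_j(after_copy[0])
after_fault = fault_steps_j(initial)
```

Starting from `initial`, j first copies d.g and then finalizes; a single fault step can instead make j Byzantine, after which four arbitrary (d.j, f.j) assignments become possible.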
16
Example: Byzantine Agreement (Continued)
Safety specification:
  Agreement: no two non-Byzantine non-generals can finalize with different decisions
  Validity: if g is not Byzantine, no process can finalize with a decision different from that of g
Read/Write restrictions
  Readable variables for process j: b.j, d.j, f.j, d.g, d.k, d.l
  Process j can write d.j, f.j
17
Example: Byzantine Agreement (Continued)
Observation 1: Positive monotonicity of specification with respect to b.j
Observation 2: Negative monotonicity of program, consisting of the transitions of j, with respect to b.k
Observation 3: Negative monotonicity of specification with respect to f.j
Observation 4: Positive monotonicity of program, consisting of the transitions of j, with respect to f.k
18
Summary
Complexity analysis for failsafe fault-tolerance
  Reduction from 3-SAT
Restrictions on specifications and programs for which polynomial synthesis is possible
  Several problems fall in this category: Byzantine agreement, consensus, commit, …
Necessity of these restrictions
19
Future Work
  Simplifying the design of masking fault-tolerance using the two-step approach
  Refining the boundary between classes for which polynomial synthesis is possible and classes for which exponential complexity is inevitable
  Using monotonicity requirements for simplifying masking fault-tolerance
20
Thank You Questions?
21
Conclusion
  Specifying the boundary between two classes of problems:
    Fault-tolerance addition can be done in polynomial time
    Exponential complexity is inevitable
  Goal: what problems can benefit from automation?
  Necessity and sufficiency of monotonicity requirements
Future Work
  How can we change a non-monotonic program to a monotonic one by modifying its invariant?
  How can we strengthen a non-monotonic specification to a monotonic one?
  How can a nonmasking program be designed manually to satisfy monotonicity requirements?
22
Basic Concepts: Fault-tolerant Program
Fault-tolerance in the presence of faults:
  Failsafe: satisfies its safety specification
  Nonmasking: satisfies its liveness specification (safety may be violated temporarily)
  Masking: satisfies both its safety and liveness specifications
23
The Complexity of Adding Failsafe Fault-tolerance
Adding (failsafe/nonmasking/masking) fault-tolerance in the high atomicity model is in P
Adding masking fault-tolerance to distributed programs is in NP
How about failsafe?
  Adding failsafe fault-tolerance to distributed programs is NP-hard (proof in the paper)
  Reduction of 3-SAT to the problem of failsafe fault-tolerance addition
24
Our Approach
Stepwise towards masking fault-tolerance:
  Automating the addition of failsafe fault-tolerance
  How hard is adding failsafe fault-tolerance?
  Polynomial time boundaries for failsafe tolerance addition?