Reliability Modeling for Design Diversity: A Review and Some Empirical Studies Teresa Cai Group Meeting April 11, 2006
2 Outline Introduction Reliability modeling on design diversity Empirical Studies Possible directions Conclusion
3 Software reliability engineering techniques and their application domains Fault (Defect)Fault (Failure)... -> Defects -> Faults -> Errors -> Failures ->... AvailabilityReliabilitySafetySecurity Avoidance Fault Removal Fault (Error) TolerancePrediction
4 Introduction Software fault tolerance adopts two main techniques on top of fault avoidance and fault removal Single version techniques: checkpointing and recovery; Exception handling; Data diversity Multiple version techniques: Recovery blocks N-version programming N-version self-checking programming
5 N-Version Programming (NVP) fault tolerant software architecture Examples: N-Version Programming (NVP)
6 Introduction The rationale is the expectation that software components built differently will fail differently The probability of coincident failures in multiple versions remains the key issue in design diversity Reliability models attempt to the modeling of reliability and fault correlations in diverse systems Empirical data are highly demanded for evaluation and cross-validation of the usefulness and/or effectiveness of these models
7 Eckhardt and Lee (1985) Variation of difficulty on demand space Positive correlations between version failures Littlewood and Miller (1989) Forced design diversity Possibility of negative correlations Dugan and Lyu (1995) Markov reward model Tomek and Trivedi (1995) Stochastic reward net Popov, Strigini et al (2003) Subdomains on demand space Upper bounds and “likely” lower bounds for reliability Conceptual models Structural models In between Current Reliability Modeling
8 Eckhardt and Lee Model Assumption: Failures of an individual program π are deterministic and a program version either fails or succeeds for each input value x; There is a randomness due to the development process. P(π) is the probability that a particular version π will be produced from the set of all possible program version Π There is a randomness due to the demands in operation. P(x): probability of selection of a given input demands x in the set of all possible demands X.
9 Eckhardt and Lee Model Score function ω(x) = 0: program πsucceeds for input x ω(x) = 1: program πfails for input x Difficulty function: the average probability of a program version failing on a given demand
10 Eckhardt and Lee Model The average probability of failure per demand (pfd) of a randomly chosen single version: The average pfd of randomly chosen pair of program versions:
11 Eckhardt and Lee Model If for process A and B, the difficulty functions are identical and constant: Otherwise, it is always the case that:
12 Littlewood and Miller Model Assumption: the same as EL model The LM model generalizes the EL model to take account to forced diversity by defining different distributions over the population of all program.
13 Littlewood and Miller Model Independence: Cov(A<B) = 0 Positive correlated: Cov(A 0 Negative correlated: Cov(A<B) < 0 Basic intuition: What you find difficult, I may find easy (or at least easier) Forced diversity: diverse processes and techniques are employed to force the diversity of final program versions
14 Popov and Strigini Model Alternative estimates for probability of failures on demand (pfd) of a 1-out-of-2 system
15 Popov and Strigini Model Upper bound of system pfd “likely” lower bound of system pfd - under the assumption of conditional independence
16 Empirical studies Various projects have been conducted to investigate and evaluate the effectiveness of design diversity evaluations on the effectiveness and cost issues of the final product of diverse systems Avizienis and Chen (1977) Knight and Leveson (1986) NASA-4 University project (1990) experiments evaluating the design process of diverse systems Avizienis (1995) adoption of design diversity into different aspects of software engineering practice Popov & Strigini (2003)
17 Recent simple empirical studies on difficulty functions From Center for Software Reliability, City University First experiment: “Online Judge” On-line spec / submission / testing 3444 C program versions Source code: about 20 Input domain: 2500 input pairs Second experiment: ACM program contest Contest Host problem 2666 program versions fall into 34 equivalence classes, while 5 classes dominates 98% of the population Input domain: input pairs
18 First experiment Difficulty functions Expected pdfs Dominated faults are spec faults
19 Second experiment Difficulty functions with / without spec faults for initial release
20 Second experiment Difficulty functions without spec faults
21 Second experiment Comparison of expected probability of failure on demand
22 Second experiment Observation Difficulty functions are relative flat The increase in the pfd of a diverse pair relative to the independence assumption is small for all population Limitations Simple programs The input domain may not be representative for real world application
23 Our on-going work Two program pools on RSDIMU project C vs. PASCAL 34 versions vs. 20 versions 100 million test cases generated Fault correlation analysis in and between: 34 C program versions 20 Pascal program versions Between the two
24 Our objective Is there any difference between the fault correlation in two different program pools? Does diversity (different programming languages and process) have any influence on the fault correlation of the final programs? What’s the benefit or loss in diversity with this empirical study?