The Infeasibility of Quantifying the Reliability of Life-Critical Real-Time Software.


Introduction
- The availability of enormous computing power at low cost has led to expanded use of digital computers in current applications and to their introduction into many new applications.
- Performance has increased at minimal hardware cost.
- The result is software systems that contain more errors.

Software Reliability
Terminology, by failure rate per hour:
- Ultra reliability = <
- Moderate reliability = to
- Low reliability = >
- Software errors behave like a stochastic point process.
- In a real-time system the software is scheduled periodically, so the probability of software failure is given by the binomial distribution:

$$P(S_n = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
$$P(S_n > 0) = 1 - (1-p)^n = 1 - (1-p)^{Kt}$$
Here $K$ is the number of inputs per unit time, so $n = Kt$. To simplify: $P(S_n > 0) \approx 1 - e^{-Ktp}$.
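The two expressions for the probability of at least one failure can be compared numerically. The sketch below is illustrative only: the function names and the sample values (10 inputs per second, a 10 hour mission, a per-input failure probability of 2.78e-15) are assumptions chosen to match figures used elsewhere in the slides. It uses `math.log1p` and `math.expm1` because `1 - p` rounds to exactly 1.0 in floating point when `p` is this small.

```python
import math

def failure_probability_exact(p, inputs_per_hour, hours):
    """P(S_n > 0) = 1 - (1 - p)^n, with n = K * t inputs."""
    n = inputs_per_hour * hours
    # log1p/expm1 avoid forming 1 - p, which would lose all precision.
    return -math.expm1(n * math.log1p(-p))

def failure_probability_approx(p, inputs_per_hour, hours):
    """Exponential approximation 1 - exp(-K * t * p)."""
    return -math.expm1(-inputs_per_hour * hours * p)

# Assumed sample values: K = 10 inputs/sec, t = 10 hours, p = 2.78e-15.
exact = failure_probability_exact(2.78e-15, 10 * 3600, 10)
approx = failure_probability_approx(2.78e-15, 10 * 3600, 10)
print(f"exact  = {exact:.6e}")
print(f"approx = {approx:.6e}")
```

For failure probabilities this small the two forms agree to many digits, which is why the exponential form is used in the rest of the analysis.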

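Analyzing Software as a Black Box
1. Testing with replacement: $D_t = \mu_0 \, (r/n)$
2. Testing without replacement: $D_t = \mu_0 \sum_{j=1}^{r} \frac{1}{n-j+1}$
Here $\mu_0$ is the mean failure time of a test specimen, $n$ is the number of replicates and $r$ is the number of observed failures.
For a probability of failure of $10^{-9}$ for a 10 hour mission: $\mu_0 = 10 / (-\ln(1 - 10^{-9})) \approx 10^{10}$ hours.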

Table: expected test duration $D_t$, in hours and in years, against the number of replicates $n$, for $r = 1$. The last row gives 114 years.
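The expected test durations can be recomputed from the mean failure time. This is a minimal sketch under stated assumptions: failure times are taken to be exponential, the without-replacement sum is the standard order-statistics result for exponential lifetimes, a year is taken as 365 days, and the replicate counts other than 10,000 are hypothetical choices for illustration. With $r = 1$ and $n = 10{,}000$ it reproduces the 114 year figure.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year

def mean_failure_time(p_mission, mission_hours):
    """mu_0 = t / -ln(1 - P) for an exponential failure law."""
    return mission_hours / -math.log1p(-p_mission)

def duration_with_replacement(mu0, r, n):
    """Expected time to observe r failures with n specimens, failed units replaced."""
    return mu0 * r / n

def duration_without_replacement(mu0, r, n):
    """Expected time to the r-th failure among n specimens, no replacement."""
    return mu0 * sum(1.0 / (n - j + 1) for j in range(1, r + 1))

mu0 = mean_failure_time(1e-9, 10.0)
for n in (1, 10, 100, 10_000):  # hypothetical replicate counts
    hours = duration_with_replacement(mu0, 1, n)
    print(f"n = {n:>6}: {hours:.3g} hours = {hours / HOURS_PER_YEAR:,.0f} years")
```

Even with ten thousand copies of the system under test in parallel, the expected wait for a single failure is more than a century.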

Reliability Growth Models
- Software design involves a repetitive cycle of testing and repairing a program. The result is a sequence of programs $p_1, \ldots, p_n$ and a sequence of failure times $t_1, \ldots, t_n$.
- The goal is to predict the reliability of $p_n$.
Experiment performed by Nagel and Skrivan, program A1: table of number of bugs removed against failure probability per input.

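Extrapolation to predict when ultra reliability will be reached
Calculating the requirement per input: $p = -\ln(1 - P_{sys}) / (Kt)$, where $P_{sys}$ is the required system failure probability. With $P_{sys} = 10^{-9}$ for a 10 hour mission and $K = 10$/sec: $p = 2.78 \times 10^{-15}$.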

- To get a rate of $2.78 \times 10^{-15}$, about 24 bugs must be removed.
- Bug 23 will have a failure rate of about $9.38 \times 10^{-15}$. The expected number of test cases until observing a binomial event of probability $9.38 \times 10^{-15}$ is $1.07 \times 10^{14}$. If each test case requires 0.10 sec, the expected time to discover bug 23 alone is $1.07 \times 10^{13}$ sec, or $3.4 \times 10^{5}$ years.
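The arithmetic on this slide can be checked in a few lines. The sketch below assumes a required mission failure probability of 1e-9, a 365-day year, and hypothetical function names; the expected number of test cases is the mean of a geometric distribution, 1/p.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

def required_p_per_input(p_sys, inputs_per_hour, mission_hours):
    """Per-input failure probability p = -ln(1 - P_sys) / (K * t)."""
    return -math.log1p(-p_sys) / (inputs_per_hour * mission_hours)

def expected_test_cases(p_bug):
    """Mean number of trials until the first occurrence of an event of probability p_bug."""
    return 1.0 / p_bug

def expected_test_years(p_bug, seconds_per_case):
    """Expected wall-clock testing time, in years, to see the bug once."""
    return expected_test_cases(p_bug) * seconds_per_case / SECONDS_PER_YEAR

p_required = required_p_per_input(1e-9, 10 * 3600, 10)
cases = expected_test_cases(9.38e-15)
years = expected_test_years(9.38e-15, 0.10)
print(f"required p per input : {p_required:.3g}")
print(f"expected test cases  : {cases:.3g}")
print(f"expected test time   : {years:.3g} years")
```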

Results for 5 different programs:
Table: program, slope, last bug, test time. Every listed test time is on the order of $10^5$ years.

Low Sample Rate Systems and Accelerated Testing
- $R$ = test time per input
- $1/p$ = number of inputs until the next bug appears
- $K$ = number of inputs per unit time
Then $D_t = R/p$, and therefore $D_t = RKt / (-\ln(1 - P_{sys}))$, where $P_{sys}$ is the required system failure probability.

Expected test time $D_t$ for $R = 0.1$:

K           Expected test time D_t
10/sec      1.14 × 10^6 years
1/sec       1.14 × 10^5 years
1/minute    1.9 × 10^3 years
1/hour      31.7 years
1/day       1.32 years
1/month     16 days
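The table can be regenerated from $D_t = RKt / (-\ln(1 - P_{sys}))$. This sketch assumes $R = 0.1$ sec per test input, a 10 hour mission, a required failure probability of 1e-9, a 365-day year and a 30-day month; the names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year

def expected_test_seconds(r_seconds, inputs_per_hour, mission_hours, p_sys):
    """D_t = R * K * t / -ln(1 - P_sys), returned in seconds."""
    return r_seconds * inputs_per_hour * mission_hours / -math.log1p(-p_sys)

# Sample rates expressed as inputs per hour (30-day month assumed).
SAMPLE_RATES = {
    "10/sec": 10 * 3600.0,
    "1/sec": 3600.0,
    "1/minute": 60.0,
    "1/hour": 1.0,
    "1/day": 1.0 / 24,
    "1/month": 1.0 / (24 * 30),
}

def format_duration(seconds):
    """Show years when the duration is at least a year, otherwise days."""
    years = seconds / SECONDS_PER_YEAR
    if years >= 1:
        return f"{years:.3g} years"
    return f"{seconds / SECONDS_PER_DAY:.0f} days"

for label, k in SAMPLE_RATES.items():
    d_t = expected_test_seconds(0.1, k, 10.0, 1e-9)
    print(f"{label:>9}  {format_duration(d_t)}")
```

Only at sample rates of roughly one input per day or slower does the expected test time fall to a practical length.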

Reliability Growth Models and Accelerated Testing
- If the sample rate is 1 input per minute, then the failure rate per input must be less than $10^{-9}/60 = 1.67 \times 10^{-11}$.
- The removal of the last bug alone would take approximately $2.2 \times 10^{10}$ test cases. Even if each test case took only 60/1000 sec, testing would take 42 years.
Table: bug number against failure rate per input.

Summary for all the programs: Test Time to Remove the Last Bug to Obtain Ultra Reliability
Table: program, slope, last bug, test time in years.

Models of Software Fault Tolerance
- The independence assumption enables quantification in the ultra reliable region.
- Quantification of fault-tolerant software reliability is unlikely without the independence assumption.
- The independence assumption cannot be experimentally justified for the ultra reliable region.

Independence Enables Quantification of Ultra Reliability
- $E_{i,k}$ = the event that the $i$th version fails on its $k$th execution.
- $P_{i,k}$ = the probability that version $i$ fails during the $k$th execution.
The probability that two or more versions fail on the $k$th execution:
$$P_{sys,k} = P\big((E_{1,k} \wedge E_{2,k}) \vee (E_{1,k} \wedge E_{3,k}) \vee (E_{2,k} \wedge E_{3,k}) \vee (E_{1,k} \wedge E_{2,k} \wedge E_{3,k})\big)$$
$$= P(E_{1,k} \wedge E_{2,k}) + P(E_{1,k} \wedge E_{3,k}) + P(E_{2,k} \wedge E_{3,k}) - 2P(E_{1,k} \wedge E_{2,k} \wedge E_{3,k})$$
$$= P(E_{1,k})P(E_{2,k}) + P(E_{1,k})P(E_{3,k}) + P(E_{2,k})P(E_{3,k}) - 2P(E_{1,k})P(E_{2,k})P(E_{3,k})$$
where the last step assumes independence. With $P(E_{i,k}) = p$ for every version: $P_{sys,k} = 3p^2 - 2p^3 \approx 3p^2$.

$P_{sys}(T) = 1 - e^{-3p^2 KT} \approx 3p^2 KT$. If $T = 1$ hour, $K = 3600$ (1 execution per second) and $P(E_{i,k}) = 10^{-6}$, then $P_{sys}(T) = 1.08 \times 10^{-8}$.
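The independence calculation for a three-version system is easy to reproduce. The sketch below is an illustration with hypothetical function names; it assumes every version has the same per-execution failure probability (taken here as 1e-6), that failures are independent across versions and across executions, and that the system fails when two or more versions fail on the same execution.

```python
import math

def p_two_or_more_fail(p):
    """Per-execution probability that at least 2 of 3 independent versions fail."""
    return 3 * p**2 - 2 * p**3

def p_system_failure(p, executions_per_hour, hours):
    """Probability of at least one majority failure over the whole mission."""
    n = executions_per_hour * hours
    # 1 - (1 - q)^n computed stably for very small q.
    return -math.expm1(n * math.log1p(-p_two_or_more_fail(p)))

# Assumed sample values: p = 1e-6, one execution per second, one hour.
p_sys = p_system_failure(1e-6, 3600, 1)
print(f"P_sys(T) = {p_sys:.3g}")
```

The result sits in the ultra reliable region even though each version alone is only moderately reliable, which is exactly why the independence assumption is so attractive and so consequential.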

Ultra Reliable Quantification Is Infeasible Without Independence
$$P_{sys} = P(E_1 \wedge E_2) + P(E_1 \wedge E_3) + P(E_2 \wedge E_3) - 2P(E_1 \wedge E_2 \wedge E_3)$$
$$= P(E_1)P(E_2) + P(E_1)P(E_3) + P(E_2)P(E_3) - 2P(E_1)P(E_2)P(E_3)$$
$$+ [P(E_1 \wedge E_2) - P(E_1)P(E_2)] + [P(E_1 \wedge E_3) - P(E_1)P(E_3)] + [P(E_2 \wedge E_3) - P(E_2)P(E_3)]$$
$$- 2[P(E_1 \wedge E_2 \wedge E_3) - P(E_1)P(E_2)P(E_3)]$$
- Since $P(E_1 \wedge E_2 \wedge E_3) \le P(E_i \wedge E_j)$, it follows that $P(E_i \wedge E_j) \le P_{sys}$.
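A small numerical model shows how much the interaction terms can matter. The common-cause model below is purely hypothetical and is not taken from the slides: with probability `c` a shared fault makes all three versions fail, otherwise each version fails independently with probability `q`. The values of `c` and `q` are made up for illustration.

```python
def joint_probabilities(c, q):
    """Marginal, pairwise and triple failure probabilities under the common-cause model."""
    marginal = c + (1 - c) * q
    pair = c + (1 - c) * q**2
    triple = c + (1 - c) * q**3
    return marginal, pair, triple

def p_sys_actual(pair, triple):
    """P_sys = sum of the three pairwise terms minus twice the triple term."""
    return 3 * pair - 2 * triple

def p_sys_if_independent(marginal):
    """What the independence model would predict from the marginals alone."""
    return 3 * marginal**2 - 2 * marginal**3

# Hypothetical values: rare shared fault, moderately reliable versions.
marginal, pair, triple = joint_probabilities(c=1e-7, q=1e-4)
actual = p_sys_actual(pair, triple)
independent = p_sys_if_independent(marginal)
print(f"pairwise coincident failure : {pair:.3g}")
print(f"actual P_sys                : {actual:.3g}")
print(f"independence prediction     : {independent:.3g}")
```

In this model the actual system failure probability is never below the pairwise coincident-failure probability, and the independence prediction understates it several times over, although the shared fault barely changes the marginal failure probability of any single version.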

Danger of Extrapolation to the Ultra Reliability Region
Example 1: E 1 = E 2 = E 3 = If independent, then $P(E_i \wedge E_j)$ = If $P(E_i \wedge E_j)$ = /hour, one could test for 100 years and not see even one coincident error.
Example 2: E 1 = E 2 = E 3 = If $P(E_i \wedge E_j)$ = /hour, one could test for one year and likely not see even one coincident error.

- In the second case, making the erroneous assumption of independence would allow a failure probability of $3 \times 10^{-8}$ to be assigned to the system, when in reality the system is no better than In order for the probability of failure to be less than at 1 hour, $P(E_i \wedge E_j)$ must be less than

Feasibility of a General Model for Coincident Errors
There are two kinds of models:
1. The model includes terms which cannot be measured within feasible amounts of time.
2. The model includes only parameters which can be measured within feasible amounts of time.
- A general model must provide a mechanism that makes the interaction terms negligibly small.
- There is little hope of deriving the interaction terms from fundamental laws, since the error process occurs in the human mind.

The Coincident-Error Experiments
Experiment performed by Knight and Leveson:
- 27 versions of a program were produced and subjected to 1,000,000 input cases.
- The observed average failure rate per input was ; the independence model was rejected.
- In order to observe the errors, the error rate must be in the low to moderate reliability region.
Future experiments will have one of the following results:

1. Demonstration that the independence assumption does not hold for the low reliability system.
2. Demonstration that the independence assumption does hold for the low reliability system.
3. No coincident errors are seen, but the test time is insufficient to demonstrate independence for the potentially ultra reliable system.

Conclusions
- The potential performance advantages of using computers over their analog predecessors have created an atmosphere where serious safety concerns about digital hardware and software are not adequately addressed.
- Practical methods to prevent design errors have not been found.

- Life testing of ultra reliable software is infeasible (i.e., to quantify a $10^{-8}$/hour failure rate requires more than $10^8$ hours of testing).
- The assumption of independence is not reasonable for software and cannot be tested for software or for hardware in the ultra reliable region.
- It is possible that models which are inferior to other models in the moderate region are superior in the ultra reliable region, but this cannot be demonstrated.