Software Reliability Research Pankaj Jalote Professor, CSE, IIT Kanpur, India.

System Reliability
- System: an entity that provides defined behavior at its interfaces
  - A system is a hierarchy of subsystems, each subsystem itself being a system
- Reliability of a system: its ability to provide failure-free operation
- Failure: the system behavior is incorrect or not as expected; failure is a random phenomenon

Reliability Quantification
- Reliability of a system is defined as the probability of failure-free operation over a time period: R(t) = probability that the system has not failed by time t
- For reliability work, the distribution of R(t) is often specified
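
As an illustration of R(t), here is a minimal sketch that assumes a constant failure rate, which gives the exponential form R(t) = exp(-rate * t). The slides do not say which distribution is used; the function name `reliability` and the rate of 0.01 failures per hour are hypothetical.

```python
import math


def reliability(t: float, failure_rate: float) -> float:
    """R(t) under an assumed constant failure rate: P(no failure by time t)."""
    if t < 0 or failure_rate < 0:
        raise ValueError("t and failure_rate must be non-negative")
    return math.exp(-failure_rate * t)


# Hypothetical rate of 0.01 failures per hour, chosen only for illustration.
for hours in (0, 10, 100):
    print(hours, round(reliability(hours, 0.01), 4))
```

R(t) starts at 1 and can only decrease with t; other time-to-failure distributions would give a different curve.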

Reliability Quantification (contd.)
- Reliability can also be quantified by Mean Time to Failure (MTTF)
- Or by the failure rate (number of failures per unit time)
- MTTF or failure rate can be determined from R(t)
- Under some assumptions, failure rate and MTTF are inversely related
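
The standard relations behind these bullets can be written out as follows. The constant-rate case is one example of the "some assumptions" mentioned above; it is an assumption here, not something the slides specify.

```latex
% Failure (hazard) rate and MTTF obtained from R(t)
h(t) = -\frac{R'(t)}{R(t)}, \qquad
\mathrm{MTTF} = \int_0^{\infty} R(t)\,dt

% Assuming a constant failure rate \lambda, so that R(t) = e^{-\lambda t}:
\mathrm{MTTF} = \int_0^{\infty} e^{-\lambda t}\,dt = \frac{1}{\lambda}
```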

Software Reliability
- Software (un)reliability is caused not by aging but by bugs
- The more bugs, the lower the reliability of the software
- Failures still appear random, hence reliability theory can be applied

Software Reliability Research
- Two main threads
  - Software reliability modeling: how to model and predict software reliability
  - Improving software reliability: removing defects through program checking, verification, testing,…
- Will discuss some work being done here in these two threads

Software Reliability Modeling

Software Reliability
- Software systems are often one-off
  - Measuring reliability in the lab is not practical, as too much failure data is needed, which requires time
- Failures often result in fault removal, leading to reliability improvement
  - Predicting future reliability from measured reliability is harder
- Hence different models are needed

Software Reliability Growth Models
- Assume that reliability is a function of the defect level; as defects are removed, reliability improves
- Model the failure-fix process of software evolution
- Many models have been proposed in the last 3 decades
- Model parameters are determined from past data on failures and fixes
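
To make the idea of a growth model concrete, here is a sketch of one well-known SRGM, the Goel-Okumoto model. The slides do not single out any particular model, so it is shown only as an example. Its mean value function is m(t) = a(1 - exp(-b t)). The parameter values below are hypothetical; in practice a and b would be estimated from past failure data.

```python
import math


def expected_failures(t: float, a: float, b: float) -> float:
    """Goel-Okumoto mean value function: expected cumulative failures by time t."""
    return a * (1.0 - math.exp(-b * t))


def failure_intensity(t: float, a: float, b: float) -> float:
    """Failure intensity at time t (derivative of the mean value function)."""
    return a * b * math.exp(-b * t)


# Hypothetical parameters: a = expected total failures, b = detection rate per week.
A, B = 100.0, 0.05
for week in (0, 10, 50):
    print(week,
          round(expected_failures(week, A, B), 1),
          round(failure_intensity(week, A, B), 3))
```

The intensity falls as failures are observed and fixed, which is the "growth" in reliability these models describe.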

Reliability of Software Products
- For software products, a large population exists in the field, and faults are not removed as failures occur
- According to SRGMs, the reliability should remain the same
- I.e., the failure rate should be constant

Average Failure Rate of an MS Product

Reasons for this Phenomenon
- Users learn with time and avoid failure-causing situations
- Users start by exploring more, then limit themselves to some part of the product
  - Most users use only a few product features
- Configuration-related failures are much more frequent at the start
- These failures reduce with time

A New Model for Product Reliability
- For a user, there is a transient failure rate, which decays by a factor
- With time the transient dies out, and the failure rate reaches a steady state
- The steady-state failure rate represents the reliability of the product

Failure Rate of a Unit
- Failure rate for one unit is λ(i) = λ0 · α^i + λf
- λ0 is the initial transient rate
- λf is the final steady-state rate
- α is the decay factor
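
A minimal sketch of the per-unit model follows. It assumes the decay term is α raised to the power i and that i counts whole time periods since the unit entered use. Both are readings of the slide, not confirmed by it. The function name and the parameter values are hypothetical.

```python
def unit_failure_rate(i: int, lam0: float, lam_f: float, alpha: float) -> float:
    """Failure rate of one unit in period i: decaying transient plus steady state."""
    if i < 0:
        raise ValueError("i must be non-negative")
    return lam0 * alpha ** i + lam_f


# Hypothetical parameter values, chosen only for illustration.
for i in range(6):
    print(i, round(unit_failure_rate(i, lam0=0.1, lam_f=0.02, alpha=0.5), 4))
```

For 0 < α < 1 the transient term shrinks every period, so the rate approaches λf from above.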

Applying it to a Product
- Considered the failure and sales data of a real MS product
- Applying the model to the data and determining parameters, we get
  - λ0 = 0.04 failures/month
  - λf = 0.008 failures/month
  - α = 0.4 (i.e. 40% decay each month)
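
The following sketch shows how per-unit rates might combine into an average failure rate for a population sold over several months. The aggregation rule is an assumption: a unit sold in month j has age (month - j), and ages are weighted by units sold. The slides do not confirm this as the procedure used. The sales figures are hypothetical, and only the three parameters come from the slide.

```python
# Parameters reported on the slide (failures/month); alpha is read as the
# per-month multiplier of the transient term, which is an assumption.
LAM0, LAM_F, ALPHA = 0.04, 0.008, 0.4


def unit_failure_rate(age_months: int) -> float:
    """Per-unit failure rate at a given age, transient plus steady state."""
    return LAM0 * ALPHA ** age_months + LAM_F


def average_failure_rate(sales_by_month: list[int], month: int) -> float:
    """Average per-unit failure rate in `month` over all units sold so far."""
    sold = sales_by_month[: month + 1]
    total_units = sum(sold)
    if total_units == 0:
        raise ValueError("no units in the field")
    expected = sum(n * unit_failure_rate(month - j) for j, n in enumerate(sold))
    return expected / total_units


# Hypothetical units sold per month, NOT the real product data.
sales = [1000, 800, 600, 400, 200, 100]
for m in range(len(sales)):
    print(m, round(average_failure_rate(sales, m), 4))
```

As new sales taper off, older units dominate the population and the average drifts toward the steady-state rate.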

Example…
- Steady-state failure rate is 1/6th of the average rate in month 2, and 1/3rd of the average rate in month 4
- I.e., the initial MTTF could be 1/6th of the steady-state MTTF
- Steady state is reached quite soon: in two to three months
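
The factor of six can be checked against the per-unit parameters (λ0 = 0.04, λf = 0.008, α = 0.4). The check assumes the decay term is α^i and approximates MTTF as the reciprocal of the failure rate; both are assumptions.

```latex
\lambda(0) = 0.04 \cdot 0.4^{0} + 0.008 = 0.048, \qquad
\frac{\lambda(0)}{\lambda_f} = \frac{0.048}{0.008} = 6

\mathrm{MTTF}_{\text{initial}} \approx \frac{1}{0.048} \approx 20.8 \text{ months}, \qquad
\mathrm{MTTF}_{\text{steady}} \approx \frac{1}{0.008} = 125 \text{ months}

\lambda(3) = 0.04 \cdot 0.4^{3} + 0.008 = 0.00256 + 0.008 = 0.01056
```

By month 3 the transient term is already well below the steady-state term, in line with steady state being reached in two to three months.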

Software Architecture Based Rel Estimation

Software Architecture
- Architecture is the components in the system and how they are connected
- It is decided very early in a software project
- If reliability and performance can be modeled from the architecture, the architecture can be improved
- Some work is going on in architecture-based performance and reliability modeling
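
As a very simple illustration of estimating reliability from an architecture, the sketch below multiplies the reliabilities of the components visited along one execution path. It assumes independent component failures and is not the modeling approach referred to in the slides, which is not described. Component names and values are hypothetical.

```python
from math import prod


def path_reliability(component_reliability: dict[str, float], path: list[str]) -> float:
    """Reliability of one execution path, assuming independent component failures."""
    return prod(component_reliability[name] for name in path)


# Hypothetical components with per-invocation reliabilities.
components = {"ui": 0.999, "logic": 0.995, "db": 0.990}
path = ["ui", "logic", "db"]

baseline = path_reliability(components, path)
# Compare an alternative design in which the weakest component is improved.
improved = path_reliability({**components, "db": 0.999}, path)
print(round(baseline, 4), round(improved, 4))
```

Even a model this crude lets two candidate architectures be compared before any code exists, which is the motivation given above.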

Program Verification

- Basic goal: to ensure that the program is as free of defects (bugs) as possible
- Good program verification leads to higher reliability

Program Verification Techniques
- Testing: the program is executed with test data to find bugs
- Static analysis: the program source code is analyzed
- Dynamic analysis: the program is run on some data and assertions are made
- Model checking
- Formal verification
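
A small sketch contrasting two of these techniques on a hypothetical function: an assertion inside the code is checked on every run, while test cases execute the code on chosen data and compare against expected results. The function and test data are illustrative only.

```python
def binary_search(items: list[int], target: int) -> int:
    """Return an index of target in sorted items, or -1 if absent."""
    # Run-time assertion: the precondition that the input is sorted.
    assert all(a <= b for a, b in zip(items, items[1:])), "items must be sorted"
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


# Testing: execute the program on test data and check the results.
assert binary_search([1, 3, 5, 7], 5) == 2
assert binary_search([1, 3, 5, 7], 4) == -1
assert binary_search([], 1) == -1
```

The run-time assertion catches misuse that the three test cases never exercise, a small instance of techniques detecting different defects.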

Techniques
- Most techniques work in isolation
- Sometimes they are complementary in their defect detection capability
- Combining techniques meaningfully can improve reliability
- We are working on techniques for combining testing and static analysis

State-based Testing Automation

Testing
- Testing remains the main verification activity, with the most reliance placed on it
- Consumes as much as half of the total effort in a software product
- Testing involves test case design, execution, checking the results, then debugging, fixing, retesting
- Each step is expensive

Test Automation
- Test automation can help reduce cost and make testing more effective
- Most test automation approaches focus on data collection and re-testing
- Little effort has gone into complete end-to-end automation
- We are working on automating OO testing using state-based models
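
To show what state-based test automation for a class can look like, here is a toy sketch. A state model drives test generation (one call sequence per transition), execution, and result checking. It is not the approach being developed by the authors, whose details the slides do not give. The class `BoundedStack`, the model, and all helper names are hypothetical.

```python
from collections import deque


class BoundedStack:
    """Hypothetical class under test: a stack with capacity 2."""
    CAPACITY = 2

    def __init__(self):
        self._items = []

    def push(self, x=0):
        if len(self._items) >= self.CAPACITY:
            raise OverflowError("stack full")
        self._items.append(x)

    def pop(self):
        return self._items.pop()

    def state(self) -> str:
        n = len(self._items)
        return "EMPTY" if n == 0 else "FULL" if n == self.CAPACITY else "PARTIAL"


# State model: (state, method) -> expected next state.
MODEL = {
    ("EMPTY", "push"): "PARTIAL",
    ("PARTIAL", "push"): "FULL",
    ("PARTIAL", "pop"): "EMPTY",
    ("FULL", "pop"): "PARTIAL",
}


def generate_tests(model, start="EMPTY"):
    """One call sequence per transition: shortest path to its source state, then the call."""
    paths = {start: []}
    queue = deque([start])
    while queue:  # breadth-first search over the state graph
        s = queue.popleft()
        for (src, method), dst in model.items():
            if src == s and dst not in paths:
                paths[dst] = paths[s] + [method]
                queue.append(dst)
    return [paths[src] + [method] for (src, method) in model if src in paths]


def run_test(sequence, model, start="EMPTY") -> bool:
    """Run the sequence on a fresh object, checking the state after every call."""
    obj, expected = BoundedStack(), start
    for method in sequence:
        getattr(obj, method)()
        expected = model[(expected, method)]
        if obj.state() != expected:
            return False
    return True


tests = generate_tests(MODEL)
results = [run_test(t, MODEL) for t in tests]
print(len(tests), all(results))
```

Test design, execution, and result checking are all derived from the model here, which is the end-to-end flavor of automation the slide points to.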

Summary
- Software reliability is a rich and wide area
- Exciting work is going on across the world in modeling, analysis, program checking, testing, etc.
- Lots of open issues
