Achieving Better Reliability With Software Reliability Engineering
Russel D’Souza
The Software Problem
Customers demand:
► More reliable software
► Faster products
► Cheaper products
Success or failure in meeting these demands affects:
► Market share
► Profitability
The demands conflict, causing risk and overwhelming pressure.
Problem Aggravated
Software glitches cause:
► Loss of competitive position and market share
► Poor-quality products and high costs of defects
► Lack of security and fault protection
► Loss of consumer confidence
► Slow response to consumers’ needs
► Unsatisfactory return on software investment
Real World Problems
► Defective software cost industry $175 billion in Y2K
► Loss of a single network cell costs $18K per minute of downtime
► More than 110 million computers are online, connected via the Internet, and prone to virus attacks and defects
► More than 90% of institutions reported insider abuse of network access in the year 2000
The Solution – Software Reliability Engineering (SRE)
► Reduces or eliminates defects from software
► Designs software for reliability, fault tolerance, and rapid fault recovery
► Maximizes use of proven SRE models
► Applies existing statistical models to real-world software environments
► Adds to and integrates with other good processes and practices without replacing them
What is SRE?
► A sub-discipline of software engineering based on a solid body of theory that includes operational profiles, random-process software reliability models, statistical estimation, and sequential sampling theory
► Works by quantitatively characterizing the operational behavior of software-based systems
► Based on two fundamental ideas:
  1. Deliver the desired functionality for a product efficiently by quantitatively characterizing the expected use of the product and precisely focusing resources on the most used and most critical functions
  2. Make testing realistically represent field conditions
SRE Is Widely Applicable
► Technically speaking, you can apply SRE to any software-based product, beginning at the start of any release cycle.
► Economically speaking, the complete SRE process may be impractical for small components (involving perhaps fewer than 2 staff months of effort), unless they are used in a large number of products.
► SRE is independent of development technology and platform.
► SRE requires no changes in architecture, design, or code, but it may suggest changes that would be beneficial.
The SRE Process
List Associated Systems
► List all systems associated with the product that must be tested independently. They are of two types: the base product and its variations, and supersystems.
Develop Operational Profiles
► An operational profile is a complete set of operations with their probabilities of occurrence
► System testers and system engineers are included in this activity
► Testers come into closer contact with the product’s users, who give feedback on which system behavior is acceptable, which is not, and how they employ the product
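
As a minimal sketch of what an operational profile looks like in practice, the Python below derives occurrence probabilities from a usage log. The operation names and the log itself are hypothetical, chosen only for illustration; the slides do not prescribe how usage data is collected.

```python
from collections import Counter


def build_operational_profile(operation_log):
    """Return {operation: probability}, most frequent operation first."""
    counts = Counter(operation_log)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("operation log is empty")
    # most_common() orders operations by descending count
    return {op: n / total for op, n in counts.most_common()}


# Hypothetical field log: one entry per invocation of an operation.
log = (["process_call"] * 700
       + ["generate_bill"] * 200
       + ["update_account"] * 100)

profile = build_operational_profile(log)
for op, p in profile.items():
    print(f"{op:15s} {p:.2f}")
```

The probabilities sum to 1, so the profile can be used directly as sampling weights when test cases are chosen.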
Define “Just Right” Reliability
► Define what “failure” means for the product. A failure is any departure of system behavior in execution from user needs.
► Failure intensity is the number of failures per unit time
► Choose a common measure for all failure intensities: either failures per some natural unit or failures per hour
► Set the total system failure intensity objective (FIO) for each associated system, using field data such as customer satisfaction surveys related to measured failure intensity, or an analysis of competing products, balancing among the major quality characteristics users need
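
To make the choice of measure concrete, here is a small sketch that computes a failure intensity in a natural unit and converts it to failures per hour. The natural unit (calls), the throughput figure, and the failure counts are all assumptions made up for the example.

```python
def failure_intensity(failures, exposure):
    """Failures per unit of exposure (natural units or hours)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return failures / exposure


def to_failures_per_hour(fi_per_natural_unit, natural_units_per_hour):
    """Convert failures per natural unit into failures per hour."""
    return fi_per_natural_unit * natural_units_per_hour


# Hypothetical data: 5 failures observed over 100,000 calls,
# on a system that handles 2,000 calls per hour.
fi_per_call = failure_intensity(failures=5, exposure=100_000)
fi_per_hour = to_failures_per_hour(fi_per_call, natural_units_per_hour=2_000)

print(f"{fi_per_call * 1000:.3f} failures per 1000 calls")
print(f"{fi_per_hour:.3f} failures per hour")
```

Whichever measure is chosen, the failure intensity objective and all measured failure intensities should be expressed in that same unit so they can be compared directly.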
Prepare for Test
► Use the operational profiles to prepare the test cases and test procedures
► Select test cases within each operation on a uniform basis
Execute Test
► Allocate test time among feature test, load test, and regression test
► Feature tests execute test cases with interactions and effects of the field environment minimized
► Load tests execute test cases simultaneously, with full interactions and all the effects of the field environment
► Regression tests execute some or all feature tests and are designed to reveal failures caused by faults introduced by program changes
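
The selection rule above can be sketched as a two-stage random draw: pick an operation according to its probability in the operational profile, then pick a test case uniformly within that operation. The profile values, operation names, and test case identifiers below are hypothetical.

```python
import random
from collections import Counter


def select_test_cases(profile, cases_by_operation, n, seed=None):
    """Draw n (operation, test_case) pairs.

    Operations are drawn with the probabilities in `profile`;
    test cases are drawn uniformly within the chosen operation.
    """
    rng = random.Random(seed)
    operations = list(profile)
    weights = [profile[op] for op in operations]
    chosen_ops = rng.choices(operations, weights=weights, k=n)
    return [(op, rng.choice(cases_by_operation[op])) for op in chosen_ops]


# Hypothetical profile and test case inventory.
profile = {"process_call": 0.7, "generate_bill": 0.2, "update_account": 0.1}
cases_by_operation = {
    "process_call": ["pc-01", "pc-02", "pc-03"],
    "generate_bill": ["gb-01", "gb-02"],
    "update_account": ["ua-01"],
}

selected = select_test_cases(profile, cases_by_operation, n=10_000, seed=1)
usage = Counter(op for op, _ in selected)
print(usage.most_common())
```

With a large enough draw, the share of test runs per operation approaches the operational profile, which is what lets test results stand in for field behavior.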
Guiding Test
► Involves guiding the product’s system test phase and release
► Failure data is interpreted differently for software we are developing and software we acquire
► We attempt to remove the faults that are causing failures
► For developed software, we estimate the FI/FIO ratio (failure intensity divided by the failure intensity objective) from the times of failure events or the number of failures per time interval, using reliability estimation programs such as CASRE (Computer-Aided Software Reliability Estimation)
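
As a rough illustration of tracking the FI/FIO ratio from failure times, the sketch below estimates failure intensity as the number of failures in a recent time window divided by the window length. This is a deliberately crude constant-rate estimate and is not how CASRE works; tools of that kind fit reliability growth models to the data. The failure times, window, and objective are invented for the example.

```python
def fi_fio_ratio(failure_times, now, fio, window):
    """Estimate FI/FIO from the failures seen in the last `window` time units."""
    if window <= 0 or fio <= 0:
        raise ValueError("window and fio must be positive")
    start = max(0.0, now - window)
    span = now - start
    if span <= 0:
        raise ValueError("no elapsed test time to estimate from")
    recent = sum(1 for t in failure_times if start < t <= now)
    return (recent / span) / fio


# Hypothetical failure times in test hours, and an objective
# of 0.02 failures per hour.
failure_times = [2, 5, 9, 15, 24, 40, 70, 110]
fio = 0.02

ratio_early = fi_fio_ratio(failure_times, now=50, fio=fio, window=50)
ratio_late = fi_fio_ratio(failure_times, now=120, fio=fio, window=50)
print(f"ratio at 50 h:  {ratio_early:.2f}")
print(f"ratio at 120 h: {ratio_late:.2f}")
```

A ratio well above 1 means the measured failure intensity is still far from the objective; a ratio that falls toward 1 as faults are removed indicates reliability growth.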
Software Reliability Engineering (SRE) Types of Tests
Reliability Growth Test
► SRE is used to estimate and track reliability
► The main objective of this test is to find and remove faults
► Includes feature, load, and regression tests
► Feature test: operations are executed separately, with interactions and effects of the field environment minimized
► Load test: runs in an environment similar to the actual field environment. It is subdivided into two types: acceptance test and performance test
► Regression test: the execution of randomly selected or all feature tests after a significant change in a system build
Certification Test
► Makes a binary decision about the software being tested, i.e., the software is either accepted or rejected
► Certification test is generally used only for load tests
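
The slides do not spell out how the accept/reject decision is made. One possible form, sketched here under stated assumptions, is a sequential probability ratio test (a kind of sequential sampling): after each failure, compare the evidence that the failure intensity equals the objective against the evidence that it is `gamma` times worse. The function name, the default discrimination ratio `gamma`, and the risk levels `alpha` and `beta` are assumptions for illustration. Time is normalized by multiplying elapsed test time by the failure intensity objective.

```python
import math


def certification_decision(failures, normalized_time,
                           gamma=2.0, alpha=0.1, beta=0.1):
    """Return 'accept', 'reject', or 'continue'.

    failures        -- number of failures observed so far
    normalized_time -- elapsed test time multiplied by the FIO
    gamma           -- discrimination ratio (> 1): how much worse than
                       the FIO the rejected alternative is
    alpha           -- risk of rejecting software that meets the FIO
    beta            -- risk of accepting software that is gamma times worse
    """
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    # Log-likelihood ratio of "rate = gamma * FIO" against "rate = FIO"
    # for a constant-rate (Poisson) failure process.
    llr = failures * math.log(gamma) - (gamma - 1.0) * normalized_time
    accept_bound = math.log(beta / (1.0 - alpha))
    reject_bound = math.log((1.0 - beta) / alpha)
    if llr <= accept_bound:
        return "accept"
    if llr >= reject_bound:
        return "reject"
    return "continue"


print(certification_decision(failures=0, normalized_time=2.5))
print(certification_decision(failures=2, normalized_time=2.0))
print(certification_decision(failures=4, normalized_time=0.5))
```

The "continue" outcome is what makes the test sequential: testing goes on until the accumulated evidence crosses one of the two boundaries.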
Software Reliability Models (SRM)
► Modeling techniques can be divided into prediction modeling and estimation modeling
► Both techniques are based on observing and accumulating failure data and analyzing it with statistical inference
Features of a Good SRM:
► Gives good predictions of future failure behavior
► Computes useful quantities
► Is simple enough for many to use
► Is widely applicable
► Is based on sound assumptions
► Becomes and remains stable
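
The slides do not name a specific model. As one illustrative example, the sketch below implements Musa's basic execution time model, in which failure intensity decays exponentially with execution time tau: lambda(tau) = lambda0 * exp(-lambda0 * tau / nu0), where lambda0 is the initial failure intensity and nu0 is the total number of failures expected over unlimited test time. The parameter values are made up; in practice they would be estimated from failure data.

```python
import math


def basic_failure_intensity(tau, lambda0, nu0):
    """Failure intensity after tau units of execution time."""
    return lambda0 * math.exp(-lambda0 * tau / nu0)


def basic_expected_failures(tau, lambda0, nu0):
    """Expected cumulative failures after tau units of execution time."""
    return nu0 * (1.0 - math.exp(-lambda0 * tau / nu0))


def time_to_reach(fio, lambda0, nu0):
    """Execution time at which the modeled failure intensity falls to fio."""
    if not 0 < fio <= lambda0:
        raise ValueError("fio must be positive and no larger than lambda0")
    return (nu0 / lambda0) * math.log(lambda0 / fio)


# Hypothetical parameters: 10 failures per hour initially,
# 100 failures expected in total, objective of 1 failure per hour.
lambda0, nu0, fio = 10.0, 100.0, 1.0
tau_needed = time_to_reach(fio, lambda0, nu0)
print(f"test time needed: {tau_needed:.1f} h")
print(f"failures expected by then: "
      f"{basic_expected_failures(tau_needed, lambda0, nu0):.1f}")
```

The model has only two parameters and yields directly useful quantities such as the remaining test time needed to reach an objective, which ties in with the "simple" and "useful quantities" features listed above.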
Conclusion
SRE is a field of engineering where you:
► Design, build, and balance testing and other reliability improvement approaches for a software product
► Allocate testing resources in accordance with the use and criticality of operations
► Control the software-based products you develop, rather than letting the process control you
► Can be confident of the reliability and availability of the products
► Can deliver them in minimum time and at minimum cost for the high levels of reliability and availability achieved
Thus SRE is a vital skill to possess to be competitive in today’s marketplace.