Published by Silvia Hill. Modified over 9 years ago.
1
SENG 521 (Fall 2002): Software Reliability & Testing
Overview of Software Reliability Engineering
Department of Electrical & Computer Engineering, University of Calgary
B.H. Far (far@enel.ucalgary.ca)
http://www.enel.ucalgary.ca/~far/Lectures/SENG521/01/
2
Contents
- About this course
- What is software reliability?
- What factors affect software quality?
- What is software reliability engineering?
- Software reliability engineering process
3
Section 1 Basic Concepts & Definitions
4
Realities …
Software development is a very high-risk task.
- About 20% of software projects are canceled (missed schedules, etc.).
- About 84% of software projects are incomplete when released (they need patches, etc.).
- Almost all software projects exceed their initial cost estimates (cost overrun).
5
Software Engineering /1
Business software has a large number of parts with many interactions (i.e., complexity). Software engineering paradigms provide models and techniques that make this complexity easier to handle. A number of contemporary software engineering paradigms have been proposed:
- Object-orientation
- Component-ware
- Design patterns
- Software architectures
- etc.
6
Software Engineering /2
Evolution of software engineering paradigms:
- Assembly languages
- Procedural and structured programming
- Object-oriented programming
- Component-ware, design patterns, software architectures…
- Software agents
[Figure: paradigms plotted against time, with complexity increasing. Annotations: "Languages that have their conceptual basis determined by machine architecture" and "Languages that have their key abstractions rooted in the problem domain".]
7
What Affects Software?
- Timeliness: Meeting the project deadline. Reaching the market at the right time.
- Cost: Meeting the anticipated project costs.
- Reliability: Working correctly for the designated period on the designated system.
8
Definition: Failure & Availability
- Failure: Any departure of system behavior in execution from user needs.
- Failure intensity: The number of failures per natural unit or time unit. Failure intensity is a way of expressing reliability.
- Availability: The probability at any given time that a system, or a capability of a system, functions satisfactorily in a specified environment. Given an average downtime per failure, an availability value implies a certain reliability.
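The last point can be made concrete under a simplifying assumption that the slide does not state: failures arrive at a constant intensity λ, so the mean up time between failures is 1/λ. Availability is then A = 1 / (1 + λ·t_m), where t_m is the average downtime per failure, and solving for λ gives the failure intensity implied by an availability figure. The function name and the numbers below are illustrative only.

```python
def failure_intensity_from_availability(availability: float, mean_downtime: float) -> float:
    """Failure intensity implied by an availability value.

    Assumes a constant failure intensity (not stated in the slides), so that
    A = 1 / (1 + lambda * t_m)  and therefore  lambda = (1 - A) / (A * t_m).
    The result is in failures per unit of up time, in the units of mean_downtime.
    """
    if not 0.0 < availability <= 1.0:
        raise ValueError("availability must be in (0, 1]")
    if mean_downtime <= 0.0:
        raise ValueError("mean_downtime must be positive")
    return (1.0 - availability) / (availability * mean_downtime)


# Illustrative numbers: 99.9% availability and 0.1 hour of downtime per failure.
lam = failure_intensity_from_availability(0.999, 0.1)
print(f"{lam:.5f} failures per hour")
```

Substituting the result back into A = 1 / (1 + λ·t_m) recovers the original availability, which is a quick sanity check on the algebra.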
9
Definition: Verification & Validation
- Verification: For each development phase or module, are the inputs and outputs generated correctly, and do they match correctly?
- Validation: Does the software meet its requirements?
10
Definition: Reliability
Reliability is the probability that a system or a capability of a system functions without failure for a “specified time” or “number of natural units” in a specified environment (Musa et al.).
A recent survey of software consumers revealed that reliability was the most important quality attribute of application software.
This course is concerned with the engineering of reliable software products.
11
About This Course …
The topics discussed include:
- Concepts and relationships
- Analytical models and supporting tools
- Techniques for software reliability improvement, including fault avoidance, fault elimination, fault tolerance, error detection and repair, and failure detection and retraction
- Risk management
12
Section 2 Reliability
13
Reliability: Natural System
Natural system life cycle.
Aging effect: The life span of a natural system is limited by the maximum reproduction rate of its cells.
14
Reliability: Hardware
Hardware life cycle.
The useful life span of a hardware system is limited by the age (wear-out) of the system.
15
Reliability: Software
Software life cycle.
Software systems are changed (updated) many times during their life cycle. Each update adds to the structural deterioration of the software system.
16
Software vs. Hardware
- Software reliability doesn’t decrease with time.
- Hardware faults are mostly physical faults.
- Software faults are mostly design faults, which are harder to measure, model, detect and correct.
17
Reliability: Science
Exploring ways of implementing “reliability” in software products.
Reliability Science’s goals:
- Developing “models” and “techniques” to build reliable software.
- Testing such models and techniques for adequacy, soundness and completeness.
18
Section 3 Reliability Engineering
19
What is Engineering?
Engineering = Analysis + Design + Construction + Verification + Management
- What is the problem to be solved?
- What characteristics of the entity are used to solve the problem?
- How will the entity be realized?
- How is it constructed?
- What approach is used to uncover errors in design and construction?
- How will the entity be supported in the long term?
20
Reliability: Engineering /1
Engineering of “reliability” in software products.
Reliability Engineering’s goal: developing software to reach the market
- with “minimum” development time,
- with “minimum” development cost,
- with “maximum” reliability.
[Figure label: Software Quality]
21
Reliability: Engineering /2
Pick quantitative representations for the three factors (cost, time and reliability) and measure them!
Software quality means getting the right balance among development cost, development time and reliability.
[Figure labels: SRE; Minimum & Maximum; Cost, Time, Reliability; Optimum]
22
What is SRE? /1
Software Reliability Engineering (SRE) is a multifaceted discipline covering the software product lifecycle. It involves both technical and management activities in three basic areas:
- Software development and maintenance
- Measurement and analysis of reliability data
- Feedback of reliability information into the software lifecycle activities
23
What is SRE? /2
SRE is a practice for quantitatively planning and guiding software development and test, with emphasis on reliability and availability. SRE simultaneously does three things:
- It ensures that product reliability and availability meet user needs.
- It delivers the product to market faster.
- It increases productivity, lowering product life-cycle cost.
In applying SRE, one can vary the relative emphasis placed on these three factors.
24
Section 4 Software Reliability Engineering (SRE) Process
25
SRE: Process /1
There are 5 steps in the SRE process (for each system to test):
1) Define necessary reliability
2) Develop operational profiles
3) Prepare for test
4) Execute test
5) Apply failure data to guide decisions
26
SRE: Process /2
The Define Necessary Reliability, Develop Operational Profiles, and Prepare for Test activities all start during the Requirements and Architecture phases of the software development process. They all extend to varying degrees into the Design and Implementation phase, as they can be affected by it. The Execute Test and Apply Failure Data (Guide Test) activities coincide with the Test phase.
27
SRE: Necessary Reliability
- Define what “failure” means for the product.
- Choose a common measure for all failure intensities: either failures per some natural unit or failures per hour.
- Set the total system failure intensity objective (FIO).
- Compute the developed software FIO by subtracting the total of the FIOs of all hardware and acquired software components from the system FIO.
- Use the developed software FIO to track reliability growth during system test.
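The subtraction step can be sketched in a few lines. This is only an illustration: the function name, the component names and every number are hypothetical, and all intensities are assumed to be expressed in the same unit (here, failures per 1000 hours).

```python
def developed_software_fio(system_fio: float, component_fios: dict[str, float]) -> float:
    """Subtract the FIOs of hardware and acquired software from the system FIO."""
    remaining = system_fio - sum(component_fios.values())
    if remaining <= 0:
        # Nothing is left for the software being developed: the system objective
        # or the component choices would have to be revisited.
        raise ValueError("component FIOs use up the whole system FIO")
    return remaining


# Hypothetical values, all in failures per 1000 hours.
system_fio = 100.0
acquired = {"hardware": 10.0, "operating_system": 25.0, "database": 15.0}
software_fio = developed_software_fio(system_fio, acquired)
print(software_fio)  # 50.0
```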
28
SRE: Operational Profile /1
An operation is a major system logical task, which returns control to the system when complete.
An operational profile is a complete set of operations with their probabilities of occurrence.
29
SRE: Operational Profile /2
There are four principal steps in developing an operational profile:
1) Identify the operation initiators
2) List the operations invoked by each initiator
3) Determine the occurrence rates
4) Determine the occurrence probabilities by dividing the occurrence rates by the total occurrence rate
There are three kinds of initiators: user types, external systems, and the system itself.
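The last step is a simple normalization, sketched below. The operation names and occurrence rates are invented for illustration; in practice they would come from the first three steps.

```python
def operational_profile(occurrence_rates: dict[str, float]) -> dict[str, float]:
    """Turn occurrence rates into occurrence probabilities that sum to 1."""
    total_rate = sum(occurrence_rates.values())
    if total_rate <= 0:
        raise ValueError("total occurrence rate must be positive")
    return {op: rate / total_rate for op, rate in occurrence_rates.items()}


# Hypothetical operations with occurrence rates in operations per hour.
rates = {
    "process_order": 600,
    "query_status": 300,
    "update_account": 90,
    "generate_report": 10,
}
profile = operational_profile(rates)
for operation, probability in profile.items():
    print(f"{operation:16s} {probability:.2f}")
```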
30
SRE: Operational Profile /3
Review the operational profile:
- Review the functionality to be implemented, to remove operations that are not likely to be worth their cost.
- Suggest operations where opportunities for reuse will be most cost-effective.
- Plan a more competitive release strategy using operational development. With operational development, development proceeds operation by operation, ordered by the operational profile. This makes it possible to deliver the most used, most critical capabilities to customers earlier than scheduled.
- Allocate resources for requirements, design, and code reviews among operations to cut schedules and costs.
- Allocate system engineering, architectural design, development, and code resources among operations to cut schedules and costs.
- Allocate development, code, and test resources among modules to cut schedules and costs.
31
SRE: Prepare for Test
The Prepare for Test activity uses the operational profiles to prepare test cases and test procedures.
- Test cases are allocated in accordance with the operational profile.
- Test cases are assigned to the operations by selecting from all the possible intra-operation choices with equal probability.
- The test procedure is the controller that invokes test cases during execution.
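One way to allocate a fixed budget of test cases in proportion to the operational profile is sketched below. The slide does not say how fractional allocations are rounded; this sketch assumes largest-remainder rounding so that the total is preserved, and it works from integer occurrence rates to keep the arithmetic exact. The operations and rates are hypothetical.

```python
def allocate_test_cases(occurrence_rates: dict[str, int], total_cases: int) -> dict[str, int]:
    """Split total_cases among operations in proportion to their occurrence rates."""
    total_rate = sum(occurrence_rates.values())
    allocation: dict[str, int] = {}
    remainders: dict[str, int] = {}
    for op, rate in occurrence_rates.items():
        # Integer share plus the remainder that was rounded away.
        allocation[op], remainders[op] = divmod(rate * total_cases, total_rate)
    # Hand out the test cases lost to rounding, largest remainder first.
    leftover = total_cases - sum(allocation.values())
    for op in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[op] += 1
    return allocation


# Hypothetical occurrence rates (operations per hour) and a budget of 120 test cases.
rates = {"process_order": 600, "query_status": 300, "update_account": 90, "generate_report": 10}
allocation = allocate_test_cases(rates, 120)
print(allocation)
```

A strictly proportional split can leave rare operations with very few test cases; whether to guarantee a minimum per operation is a project decision that this sketch leaves out.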
32
SRE: Execute Test
- Allocate test time among the associated systems and types of test (feature, load, regression, etc.).
- Invoke the test cases at random times, choosing operations randomly in accordance with the operational profile.
- Identify failures, along with when they occur. This information will be used in Apply Failure Data and Guide Test.
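Choosing operations randomly in accordance with the operational profile amounts to weighted random sampling. The sketch below uses the standard library's `random.choices`; the profile, the seed and the number of runs are illustrative assumptions, and the random timing of invocations is not modeled.

```python
import random
from collections import Counter


def select_operations(profile: dict[str, float], n_runs: int, seed=None) -> list[str]:
    """Draw n_runs operations, each with probability given by the profile."""
    rng = random.Random(seed)  # a seeded generator makes a test run repeatable
    operations = list(profile)
    weights = [profile[op] for op in operations]
    return rng.choices(operations, weights=weights, k=n_runs)


# Hypothetical operational profile.
profile = {"process_order": 0.60, "query_status": 0.30, "update_account": 0.09, "generate_report": 0.01}
runs = select_operations(profile, 10_000, seed=521)
counts = Counter(runs)
for operation in profile:
    print(f"{operation:16s} {counts[operation] / len(runs):.3f}")
```

Over many runs the observed frequencies approach the profile probabilities, so test effort concentrates on the operations that users invoke most.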
33
Types of Test
- Reliability growth test
- Certification test
34
SRE: Apply Failure Data
- Plot each new failure as it occurs on a reliability demonstration chart.
- Accept or reject the software (operations) using the reliability demonstration chart.
- Track reliability growth as faults are removed.
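The slides do not give the boundaries of the reliability demonstration chart. A common construction is Wald's sequential probability ratio test, and the sketch below assumes it: failure times are normalized by multiplying them by the failure intensity objective, and the n-th failure is compared against an accept and a reject boundary. The discrimination ratio `gamma` and the risk levels `alpha` (supplier risk) and `beta` (consumer risk) are assumed parameters, and the function names are hypothetical.

```python
import math


def demonstration_boundaries(n: int, gamma: float = 2.0, alpha: float = 0.1, beta: float = 0.1):
    """Return (reject_at_or_below, accept_at_or_above) normalized times for failure n.

    Sequential probability ratio test for failure intensity FIO versus gamma * FIO.
    """
    a = math.log(beta / (1.0 - alpha))
    b = math.log((1.0 - beta) / alpha)
    accept = (a - n * math.log(gamma)) / (1.0 - gamma)
    reject = (b - n * math.log(gamma)) / (1.0 - gamma)
    return reject, accept


def classify_failure(n: int, failure_time: float, fio: float, **risks) -> str:
    """Place the n-th failure on the chart: 'accept', 'reject' or 'continue'."""
    normalized_time = failure_time * fio  # time measured in units of 1 / FIO
    reject, accept = demonstration_boundaries(n, **risks)
    if normalized_time >= accept:
        return "accept"
    if normalized_time <= reject:
        return "reject"
    return "continue"


# Hypothetical objective of 0.01 failures per hour.
print(classify_failure(1, 350.0, 0.01))  # first failure after 350 hours
print(classify_failure(2, 200.0, 0.01))  # second failure after 200 hours
print(classify_failure(4, 50.0, 0.01))   # fourth failure after only 50 hours
```

With these defaults a late first failure falls in the accept region, an early fourth failure falls in the reject region, and points in between mean that testing should continue.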
35
Collect Field Data
SRE for the software product lifecycle.
- Collect field data to use in succeeding releases, either using automatic reporting routines or manual collection, using a random sample of field sites.
- Collect data on failure intensity and on customer satisfaction, and use this information in setting the failure intensity objective for the next release.
- Measure operational profiles in the field and use this information to correct the operational profiles we estimated.
- Collect information to refine the process of choosing reliability strategies in future projects.
36
Section 5 Error & Failure
37
Definition: Fault
- A fault is a cause of either a failure of the program or an internal error (e.g., an incorrect state, incorrect timing).
- A fault must be detected and then removed.
- A fault can be removed without execution (e.g., through code inspection or design review).
- Fault removal through execution depends on the occurrence of the associated “failure”. Occurrence depends on the length of execution time and on the operational profile.
38
Definition: Error
Error has two meanings:
- A discrepancy between a computed, observed or measured value or condition and the true, specified or theoretically correct value or condition.
- A human action that results in software containing a fault.
Human errors are the hardest to detect.
39
More Definitions
- Defect: Refers to either fault (cause) or failure (effect).
- Service: Expected behavior of a software system.
- Availability: System uptime divided by the sum of system uptime and downtime.
40
Failure Specification /1
1) Time of failure
2) Time interval between failures
3) Cumulative failures up to a given time
4) Failures experienced in a time interval
Time-based failure specification:
Failure no.   Failure time (hours)   Failure interval (hours)
 1             10                     10
 2             19                      9
 3             32                     13
 4             43                     11
 5             58                     15
 6             70                     12
 7             88                     18
 8            103                     15
 9            125                     22
10            150                     25
11            169                     19
12            199                     30
13            231                     32
14            256                     25
15            296                     40
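The four ways of specifying failures carry the same information and can be converted into one another. The sketch below does this for the failure times on this slide; the helper names are hypothetical, and the first interval is assumed to be measured from time 0.

```python
from bisect import bisect_right
from itertools import accumulate

# Failure times in hours, as listed in the time-based failure specification.
failure_times = [10, 19, 32, 43, 58, 70, 88, 103, 125, 150, 169, 199, 231, 256, 296]


def intervals_from_times(times):
    """Time between successive failures (first interval measured from time 0)."""
    return [t - prev for prev, t in zip([0] + times[:-1], times)]


def times_from_intervals(intervals):
    """Inverse conversion: running sum of the failure intervals."""
    return list(accumulate(intervals))


def cumulative_failures(times, at):
    """Number of failures observed up to and including time `at`."""
    return bisect_right(times, at)


def failures_in_interval(times, start, end):
    """Number of failures in the interval (start, end]."""
    return bisect_right(times, end) - bisect_right(times, start)


intervals = intervals_from_times(failure_times)
print(intervals)
print(cumulative_failures(failure_times, 100))        # failures in the first 100 hours
print(failures_in_interval(failure_times, 100, 200))  # failures between hours 100 and 200
```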
41
Failure Specification /2
1) Time of failure
2) Time interval between failures
3) Cumulative failures up to a given time
4) Failures experienced in a time interval
Failure-based failure specification:
Time (s)   Cumulative failures   Failures in interval
 30         2                    2
 60         5                    3
 90         7                    2
120         8                    1
150        10                    2
180        11                    1
210        12                    1
240        13                    1
270        14                    1
42
Failure Specification /3
Many reliability models, and the tools based on them (e.g., SMERFS and CASRE), can estimate model parameters from either “failure count” or “time interval between failures” data.
43
Failure Functions /1
The cumulative failure function (mean value function) denotes the average cumulative failures associated with each time point.
Failure distribution:
Failures in time period   Probability   Value x Probability
 0                        0.10          0.00
 1                        0.18          0.18
 2                        0.22          0.44
 3                        0.16          0.48
 4                        0.11          0.44
 5                        0.08          0.40
 6                        0.05          0.30
 7                        0.04          0.28
 8                        0.03          0.24
 9                        0.02          0.18
10                        0.01          0.10
Mean cumulative failures                3.04
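The value of the mean value function at one time point is the expected value of the failure distribution at that point, that is, the sum of each failure count multiplied by its probability. The sketch below reproduces the 3.04 from the table; the function name is hypothetical.

```python
# Failure distribution from the table: failure count -> probability.
distribution = {
    0: 0.10, 1: 0.18, 2: 0.22, 3: 0.16, 4: 0.11, 5: 0.08,
    6: 0.05, 7: 0.04, 8: 0.03, 9: 0.02, 10: 0.01,
}


def mean_failures(dist: dict[int, float]) -> float:
    """Expected number of failures: sum of value times probability."""
    total_probability = sum(dist.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(count * p for count, p in dist.items())


print(round(mean_failures(distribution), 2))  # 3.04
```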
44
Failure Functions /2
The failure intensity function (FIF) represents the rate of change of the cumulative failure function.
As faults are removed, failure intensity tends to drop and reliability tends to increase.
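With grouped failure data, the rate of change can be approximated by dividing the failures in each interval by the interval length. The sketch below applies this finite-difference estimate to the failure-count data shown under Failure Specification /2; it is a rough empirical estimate, not a fitted reliability growth model, and the function name is hypothetical.

```python
# Observation times in seconds and cumulative failures at each time.
times = [30, 60, 90, 120, 150, 180, 210, 240, 270]
cumulative = [2, 5, 7, 8, 10, 11, 12, 13, 14]


def interval_failure_intensity(times, cumulative):
    """Failures per unit time within each interval (finite-difference estimate)."""
    intensities = []
    previous_time, previous_count = 0, 0
    for t, count in zip(times, cumulative):
        intensities.append((count - previous_count) / (t - previous_time))
        previous_time, previous_count = t, count
    return intensities


for t, intensity in zip(times, interval_failure_intensity(times, cumulative)):
    print(f"up to {t:3d} s: {intensity:.3f} failures per second")
```

The estimates are noisy from interval to interval, but the later intervals show a lower intensity than the early ones, which is the downward trend described above.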
45
Failure Functions /3
Mean time to failure (MTTF): the expected time until the next failure is observed. With R(x) denoting the reliability, $\mathrm{MTTF} = \int_0^\infty R(x)\,dx$.
Mean time to repair (MTTR): the expected time until the system is repaired.
46
Failure Functions /4
Failure rate function: the probability per unit time that a failure occurs in the interval [t, t+Δt], given that no failure has occurred before t.
Mean time between failures (MTBF): MTBF = MTTF + MTTR
Availability can also be defined as: Availability = MTTF / (MTTF + MTTR) = MTTF / MTBF
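A small numeric sketch of these relations follows. It uses the standard expression availability = MTTF / (MTTF + MTTR), which matches the earlier definition of availability as uptime divided by uptime plus downtime. The function names and the figures are hypothetical.

```python
def mtbf(mttf: float, mttr: float) -> float:
    """Mean time between failures: time to fail plus time to repair."""
    return mttf + mttr


def availability(mttf: float, mttr: float) -> float:
    """Long-run fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    if mttf <= 0 or mttr < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf / mtbf(mttf, mttr)


# Hypothetical figures: 98 hours to failure on average, 2 hours to repair.
print(mtbf(98.0, 2.0))          # 100.0
print(availability(98.0, 2.0))  # 0.98
```

Availability can therefore be raised either by failing less often (larger MTTF) or by repairing faster (smaller MTTR).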
47
Failure Functions /5
Failures in time period   Probability (elapsed time 1 hour)   Probability (elapsed time 5 hours)
 0                        0.10                                0.01
 1                        0.18                                0.02
 2                        0.22                                0.03
 3                        0.16                                0.04
 4                        0.11                                0.05
 5                        0.08                                0.07
 6                        0.05                                0.09
 7                        0.04                                0.12
 8                        0.03                                0.16
 9                        0.02                                0.13
10                        0.01                                0.10
11                        0                                   0.07
12                        0                                   0.05
13                        0                                   0.03
14                        0                                   0.02
15                        0                                   0.01
Mean                      3.04                                7.77
48
Reliability Model
[Figure label: Reliability Model]
- Fault introduction: characteristics of the product (e.g., program size); development process (e.g., SE tools and techniques, staff experience, etc.)
- Fault removal: failure discovery (e.g., extent of execution, operational profile); quality of repair activity
- Environment
49
Conclusion
Software Reliability Engineering (SRE) can offer metrics to help elevate a software development organization to the upper levels of software development maturity.