CIS 376 Bruce R. Maxim UM-Dearborn

Software Reliability

Functional and Non-functional Requirements: System functional requirements may specify error checking, recovery features, and system failure protection. System reliability and availability are specified as part of the non-functional requirements for the system.

System Reliability Specification: Hardware reliability - probability a hardware component fails; Software reliability - probability a software component will produce an incorrect output (software does not wear out, and software can continue to operate after a bad result); Operator reliability - probability the system user makes an error.

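Failure Probabilities: If there are two independent components in a system and the operation of the system depends on them both, then the probability of system failure is approximately P(S) = P(A) + P(B). The exact value is P(A) + P(B) - P(A)P(B); the product term is negligible when both probabilities are small. If a component is replicated n times, then the probability of failure is P(S) = P(A)^n, meaning that all replicas fail at once.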
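A minimal Python sketch of the two cases, assuming independent failures. The function names and the sample probabilities are illustrative assumptions, not part of the slides. The series case uses the exact form 1 - (1 - P(A))(1 - P(B)), which the sum P(A) + P(B) approximates for small probabilities.

```python
def series_failure(p_a: float, p_b: float) -> float:
    """Failure probability when the system needs BOTH independent components."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


def replicated_failure(p_a: float, n: int) -> float:
    """Failure probability when all n independent replicas must fail at once."""
    return p_a ** n


# Hypothetical component failure probabilities.
p_a, p_b = 0.01, 0.02
print(series_failure(p_a, p_b))    # close to p_a + p_b
print(replicated_failure(p_a, 3))  # far smaller than p_a
```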

Functional Reliability Requirements: The system will check all operator inputs to see that they fall within their required ranges. The system will check all disks for bad blocks each time it is booted. The system must be implemented using a standard implementation of Ada.
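The first requirement above (range-checking operator inputs) might look like the following in code. This is only a sketch in Python rather than Ada; the helper name `check_in_range`, the field name, and the limits are hypothetical.

```python
def check_in_range(name: str, value: float, low: float, high: float) -> float:
    """Return value unchanged if low <= value <= high, else raise ValueError."""
    if not (low <= value <= high):
        raise ValueError(f"{name}={value} is outside the required range [{low}, {high}]")
    return value


# Hypothetical operator input with an assumed valid range of 0..10.
dose = check_in_range("dose_units", 5, 0, 10)
print(dose)
```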

Non-functional Reliability Specification: The required level of reliability must be expressed quantitatively. Reliability is a dynamic system attribute, so source code reliability specifications (e.g. N faults/1000 LOC) are meaningless. An appropriate metric should be chosen to specify the overall system reliability.

Hardware Reliability Metrics: Hardware metrics are not suitable for software because they are based on the notion of component failure. Software failures are often design failures. Often the system is available after the failure has occurred. Hardware components can wear out.

Software Reliability Metrics: Reliability metrics are units of measure for system reliability. System reliability is measured by counting the number of operational failures and relating these to demands made on the system at the time of failure. A long-term measurement program is required to assess the reliability of critical systems.

Reliability Metrics - part 1: Probability of Failure on Demand (POFOD) - POFOD = 0.001 means the service fails for one in every 1000 requests; Rate of Fault Occurrence (ROCOF) - ROCOF = 0.02 means two failures for each 100 operational time units.
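As a small illustration, both metrics are simple ratios of observed counts. The function names below are assumptions made for this sketch; the numbers are the ones quoted on the slide.

```python
import math


def pofod(failed_requests: int, total_requests: int) -> float:
    """Probability of failure on demand: failed requests per request made."""
    return failed_requests / total_requests


def rocof(failures: int, operational_time_units: float) -> float:
    """Rate of fault occurrence: failures per operational time unit."""
    return failures / operational_time_units


print(pofod(1, 1000))  # one failure in every 1000 requests
print(rocof(2, 100))   # two failures per 100 operational time units
```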

Reliability Metrics - part 2: Mean Time to Failure (MTTF) - average time between observed failures (aka MTBF). Availability = MTBF / (MTBF + MTTR), where MTBF = Mean Time Between Failures and MTTR = Mean Time to Repair. Reliability = MTBF / (1 + MTBF).
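A short sketch of the availability formula in Python. The MTBF and MTTR values are hypothetical and only need to share the same time unit.

```python
import math


def availability(mtbf: float, mttr: float) -> float:
    """Fraction of time the system is usable: MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


# Hypothetical figures: 998 hours between failures, 2 hours to repair.
print(availability(998.0, 2.0))
```

With these assumed figures the system is unavailable for 2 hours out of every 1000.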

Time Units: Raw Execution Time - non-stop systems; Calendar Time - if the system has regular usage patterns; Number of Transactions - demand type transaction systems.

Availability: Measures the fraction of time the system is really available for use. Takes repair and restart times into account. Relevant for non-stop, continuously running systems (e.g. traffic signal).

Probability of Failure on Demand: Probability the system will fail when a service request is made. Useful when requests are made on an intermittent or infrequent basis. Appropriate for protection systems, where service requests may be rare and consequences can be serious if service is not delivered. Relevant for many safety-critical systems with exception handlers.

Rate of Fault Occurrence: Reflects the rate of failure in the system. Useful when the system has to process a large number of similar requests that are relatively frequent. Relevant for operating systems and transaction processing systems.

Mean Time to Failure: Measures time between observable system failures. For stable systems, MTTF = 1/ROCOF. Relevant for systems where individual transactions take lots of processing time (e.g. CAD or WP systems).

Failure Consequences - part 1: Reliability does not take consequences into account. Transient faults have no real consequences, but other faults might cause data loss or corruption. It may be worthwhile to identify different classes of failure and use different metrics for each.

Failure Consequences - part 2: When specifying reliability, both the number of failures and the consequences of each matter. Failures with serious consequences are more damaging than those where repair and recovery is straightforward. In some cases, different reliability specifications may be defined for different failure types.

Failure Classification: Transient - only occurs with certain inputs; Permanent - occurs on all inputs; Recoverable - system can recover without operator help; Unrecoverable - operator has to help; Non-corrupting - failure does not corrupt system state or data; Corrupting - system state or data are altered.

Building Reliability Specification: For each sub-system, analyze the consequences of possible system failures. From the system failure analysis, partition failures into appropriate classes. For each class, set out the appropriate reliability metric.
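One way to record the outcome of these steps is a small table that maps each failure class to a metric and a target. The sketch below is illustrative only: the class names, the metric choices, and the target values are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReliabilityRequirement:
    failure_class: str  # e.g. "transient, non-corrupting"
    metric: str         # "POFOD" or "ROCOF"
    target: float       # maximum acceptable measured value


# Hypothetical specification: one requirement per failure class.
SPEC = {
    r.failure_class: r
    for r in [
        ReliabilityRequirement("transient, non-corrupting", "POFOD", 1e-3),
        ReliabilityRequirement("permanent, non-corrupting", "ROCOF", 1e-3),
    ]
}


def is_met(failure_class: str, measured: float) -> bool:
    """True if the measured value does not exceed the target for that class."""
    return measured <= SPEC[failure_class].target


print(is_met("transient, non-corrupting", 5e-4))
print(is_met("permanent, non-corrupting", 2e-3))
```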

Examples

Specification Validation: It is impossible to empirically validate very high reliability specifications. A requirement of no database corruption really means a POFOD of less than 1 in 200 million. If each transaction takes 1 second to verify, simulating one day’s transactions takes 3.5 days.
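The arithmetic behind this slide can be worked through in a few lines of Python. The daily volume is inferred from the slide's own figures (3.5 days of simulation at 1 second per transaction), and treating 200 million failure-free transactions as the minimum evidence needed is a simplifying assumption of this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_TRANSACTION = 1

# 3.5 days of simulation at 1 s each implies this many transactions per day.
transactions_per_day = 3.5 * SECONDS_PER_DAY / SECONDS_PER_TRANSACTION

# Assumed minimum test volume to support "less than 1 in 200 million".
required_transactions = 200_000_000
simulation_days = required_transactions * SECONDS_PER_TRANSACTION / SECONDS_PER_DAY
simulation_years = simulation_days / 365

print(transactions_per_day)
print(round(simulation_days), round(simulation_years, 1))
```

Under these assumptions the simulation alone would run for several years, which is why such specifications cannot be validated by testing.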

Statistical Reliability Testing: The test data used needs to follow typical software usage patterns. The error count needs to include both errors of omission (failing to do the right thing) and errors of commission (doing the wrong thing).
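A minimal sketch of the idea in Python: inputs are drawn according to an operational profile (the typical usage pattern), and the observed failure fraction is used as an estimate of POFOD. The profile, the input class names, and the toy system under test are all hypothetical.

```python
import random


def estimate_pofod(system, profile, trials, seed=0):
    """Run `trials` demands sampled from `profile`; return the failure fraction.

    `system(input_class)` returns True on success and False on failure.
    `profile` maps each input class to its relative frequency of use.
    """
    rng = random.Random(seed)
    classes = list(profile)
    weights = [profile[c] for c in classes]
    failures = 0
    for _ in range(trials):
        input_class = rng.choices(classes, weights=weights, k=1)[0]
        if not system(input_class):
            failures += 1
    return failures / trials


# Hypothetical operational profile and a toy system that fails on one class.
profile = {"balance_query": 0.7, "withdrawal": 0.25, "transfer": 0.05}
toy_system = lambda input_class: input_class != "transfer"
print(estimate_pofod(toy_system, profile, trials=10_000))
```

The estimate is only as good as the profile: if rarely used input classes are under-represented, failures in them will be under-counted.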

Difficulties with Statistical Reliability Testing: Uncertainty when creating the operational profile. High cost of generating the operational profile. Statistical uncertainty problems when high reliabilities are specified.

Safety Specification: The safety requirements of a system should be specified separately. These requirements should be based on hazard and risk analysis. Safety requirements usually apply to the system as a whole rather than to individual components. System safety is an emergent system property.

Safety Life Cycle - part 1: Concept and scope definition; Hazard and risk analysis; Safety requirements specification (safety requirements derivation, safety requirements allocation); Planning and development (safety related systems development, external risk reduction facilities).

Safety Life Cycle - part 2: Deployment (safety validation, installation and commissioning); Operation and maintenance; System decommissioning.

Safety Processes: Hazard and risk analysis - assess the hazards and risks associated with the system; Safety requirements specification - specify system safety requirements; Designation of safety-critical systems - identify sub-systems whose incorrect operation can compromise entire system safety; Safety validation - check overall system safety.

Hazard Analysis Stages: Hazard identification - identify potential hazards that may arise; Risk analysis and hazard classification - assess the risk associated with each hazard; Hazard decomposition - seek to discover potential root causes for each hazard; Risk reduction assessment - describe how each hazard is to be taken into account when the system is designed.

Fault-tree Analysis: Hazard analysis method that starts with an identified fault and works backwards to the cause of the fault. Can be used at all stages of hazard analysis. It is a top-down technique that may be combined with bottom-up hazard analysis techniques, which start with system failures that lead to hazards.

Fault-tree Analysis Steps: Identify the hazard. Identify potential causes of the hazard. Link combinations of alternative causes using “or” or “and” symbols as appropriate. Continue the process until “root” causes are identified; the result will be an and/or tree (or a logic circuit) in which the causes are the “leaves”.

How does it work? What would a fault tree look like for a hazard like “data deleted”?
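One possible shape for such a tree, sketched in Python as an and/or tree whose leaves are root causes. The causes listed here are illustrative assumptions chosen for the sketch, not an answer given in the slides, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Basic:
    """Leaf node: a root cause identified by name."""
    name: str


@dataclass
class Gate:
    """Inner node: combines child causes with "and" or "or"."""
    kind: str
    children: List[Union["Gate", Basic]]


def occurs(node: Union[Gate, Basic], events: Set[str]) -> bool:
    """True if the hazard at `node` arises given the root causes in `events`."""
    if isinstance(node, Basic):
        return node.name in events
    results = [occurs(child, events) for child in node.children]
    if node.kind == "and":
        return all(results)
    if node.kind == "or":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical fault tree for the hazard "data deleted".
DATA_DELETED = Gate("or", [
    Gate("and", [Basic("user_delete_command"), Basic("no_confirmation")]),
    Gate("and", [Basic("disk_failure"), Basic("no_backup")]),
    Basic("malicious_deletion"),
])

print(occurs(DATA_DELETED, {"user_delete_command"}))
print(occurs(DATA_DELETED, {"disk_failure", "no_backup"}))
```

Reading the tree top-down mirrors the analysis steps: the root is the hazard, each gate records how causes combine, and the leaves are the root causes.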

Risk Assessment: Assess the hazard severity, hazard probability, and accident probability. The outcome of risk assessment is a statement of acceptability: Intolerable (can never occur); ALARP (as low as reasonably practicable, given cost and schedule constraints); Acceptable (consequences are acceptable and no extra cost should be incurred to reduce the risk further).

Risk Acceptability: Determined by human, social, and political considerations. In most societies, the boundaries between regions are pushed upwards with time (meaning risk becomes less acceptable). Risk assessment is always subjective (what is acceptable to one person is ALARP to another).

Risk Reduction: The system should be specified so that hazards do not arise or result in an accident. Hazard avoidance - system designed so that the hazard can never arise during normal operation; Hazard detection and removal - system designed so that hazards are detected and neutralized before an accident can occur; Damage limitation - system designed to minimize accident consequences.

Security Specification: Similar to safety specification - not possible to specify quantitatively; usually stated in “system shall not” terms rather than “system shall” terms. Differences - no well-defined security life cycle yet; security deals with generic threats rather than system-specific hazards.

Security Specification Stages - part 1: Asset identification and evaluation - data and programs are identified with their level of protection; the degree of protection depends on asset value. Threat analysis and risk assessment - security threats are identified and the risks associated with each are estimated. Threat assignment - identified threats are related to assets so that each asset has a list of associated threats.

Security Specification Stages - part 2: Technology analysis - available security technologies and their applicability against the threats; Security requirements specification - where appropriate, these will identify the security technologies that may be used to protect against different threats to the system.