Slide 1: Critical Systems Specification
IS301 – Software Engineering, Lecture #18, 2003-10-23
M. E. Kabay, PhD, CISSP
Dept of Computer Information Systems, Norwich University
mkabay@norwich.edu
Copyright © 2003 M. E. Kabay. All rights reserved.
Slide 2: Acknowledgement
All of the material in this presentation is based directly on slides kindly provided by Prof. Ian Sommerville on his Web site at http://www.software-engin.com. Used with Sommerville's permission, as extended by him for all non-commercial educational use. Copyright in Kabay's name applies solely to the appearance of and minor changes in Sommerville's work, or to original materials, and is used solely to prevent commercial exploitation of this material.
Slide 3: Topics
- Software reliability specification
- Safety specification
- Security specification
Slide 4: Dependable Systems Specification
Processes and techniques for developing specifications for:
- System availability
- Reliability
- Safety
- Security
Slide 5: Functional and Non-Functional Requirements
System functional requirements may:
- Define error checking
- Define recovery facilities and features
- Provide protection against system failures
Non-functional requirements may specify:
- The required reliability of the system
- The required availability of the system
Slide 6: System Reliability Specification
Hardware reliability
- What is the probability of a hardware component failing?
- How long does it take to repair the component?
Software reliability
- What is the probability of incorrect output?
- Software can often continue operation after an error, whereas a hardware fault often causes a stoppage.
Operator reliability
- What is the probability of operator error?
Slide 7: What Happens When All Components Must Work?
Consider a system with two independent components A and B, where
- P{failure of A} = P_A
- P{failure of B} = P_B
Operation of the system depends on both components, so the system fails when at least one of them fails:
- P{A does not fail} = 1 – P_A
- P{B does not fail} = 1 – P_B
- P{neither A nor B fails} = (1 – P_A)(1 – P_B)
- P{system failure} = 1 – (1 – P_A)(1 – P_B)
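A quick numeric sketch of the two-component case; the component failure probabilities below are made-up illustrative values, not from the slides:

```python
# Series reliability: the system fails if at least one component fails.
# Failure probabilities are illustrative assumptions.
p_a = 0.05  # P{failure of A}
p_b = 0.02  # P{failure of B}

p_both_ok = (1 - p_a) * (1 - p_b)   # neither component fails
p_system_failure = 1 - p_both_ok    # at least one component fails

print(round(p_system_failure, 6))   # 0.069, worse than either component alone
```

Note that the system failure probability exceeds the failure probability of either individual component, which is the point of the next slide.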
Slide 8: General Principles
If there are n elements, each with probability of failure P_i, and all of them have to work for the system to work, then the probability of system failure P_S is

P_S = 1 – (1 – P_1)(1 – P_2) … (1 – P_n)

Therefore, as the number of components that must all function increases, the probability of system failure increases.
Slide 9: Component Replication
If a component with failure probability P is replicated n times so that the system works as long as any one of the replicas works, then the probability of system failure is

P_S = P{all n replicas fail} = P^n

Conversely, if the system fails when any one of n such components fails, then the probability of system failure is

P_S = P{at least 1 fails} = 1 – P{none fail} = 1 – (1 – P)^n
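The contrast between the two configurations on slides 8 and 9 can be seen numerically; p and n below are illustrative assumptions:

```python
# Series vs. replicated (parallel) configurations of n identical
# components, each failing independently with probability p.
# p and n are illustrative values, not from the slides.
p = 0.01
n = 5

# Series: every component must work; one failure brings the system down.
p_fail_series = 1 - (1 - p) ** n

# Replicated: the system works while any one replica still works.
p_fail_parallel = p ** n

print(round(p_fail_series, 5))   # 0.04901, worse than a single component
print(p_fail_parallel)           # 1e-10, far better than a single component
```

Adding components in series degrades reliability; adding redundant replicas improves it dramatically.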
Slide 10: Examples of Functional Reliability Requirements
- A predefined range shall be defined for all values input by the operator, and the system shall check that all operator inputs fall within this predefined range.
- The system shall check all disks for bad blocks when it is initialized.
- The system must use N-version programming to implement the braking control system.
- The system must be implemented in a safe subset of Ada and checked using static analysis.
Slide 11: Non-Functional Reliability Specification
- The required level of system reliability should be expressed quantitatively.
- Reliability is a dynamic system attribute, so reliability specifications related to the source code are meaningless:
  - "No more than N faults per 1,000 lines" – BAD
  - Such counts are useful only for post-delivery process analysis, i.e., trying to assess the quality of the development techniques.
- An appropriate reliability metric should be chosen to specify the overall system reliability.
Slide 12: Reliability Metrics
Reliability metrics are units of measurement of system reliability. They:
- Count the number of operational failures
- Relate failures to the demands on the system and to the time the system has been operational
A long-term measurement program is required to assess the reliability of critical systems.
Slide 13: Reliability Metrics (summary table; figure not reproduced)
Slide 14: Probability of Failure on Demand (POFOD)
- The probability that the system will fail when a service request is made.
- Useful when demands for service are intermittent and relatively infrequent.
- Appropriate for protection systems, where services are demanded occasionally and where there are serious consequences if the service is not delivered.
- Relevant for many safety-critical systems with exception-management components, e.g., an emergency shutdown system in a chemical plant.
Slide 15: Rate of Occurrence of Failures (ROCOF)
- Reflects the rate of occurrence of failures in the system.
- A ROCOF of 0.002 means 2 failures are likely in every 1,000 operational time units, e.g., 2 failures per 1,000 hours of operation.
- Relevant for operating systems and transaction-processing systems that must handle a large number of similar requests relatively frequently, e.g., a credit-card processing system or an airline booking system.
Slide 16: Mean Time to Failure (MTTF)
- A measure of the time between observed failures of the system; the reciprocal of ROCOF for stable systems.
- An MTTF of 500 means the mean time between failures is 500 time units.
- Relevant for systems with long transactions, i.e., where system processing takes a long time; the MTTF should be longer than the transaction length.
- Examples: computer-aided design systems, where a designer works on a design for several hours, and word-processor systems.
Slide 17: Availability
- A measure of the fraction of time the system is available for use; takes repair and restart time into account.
- An availability of 0.998 means the software is available for 998 out of every 1,000 time units.
- Relevant for non-stop, continuously running systems, e.g., telephone switching systems and railway signaling systems.
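A small sketch of how the four metrics on slides 14-17 might be computed from operational data; the demand counts, failure counts, and downtime figures below are invented illustrative values:

```python
# Computing the four reliability metrics from invented operational data.
# All numbers below are illustrative assumptions, not from the slides.

demands = 10_000          # service requests made to the system
failed_demands = 4        # requests on which the system failed
pofod = failed_demands / demands            # probability of failure on demand

operational_hours = 5_000
failures = 10
rocof = failures / operational_hours        # failures per operational hour
mttf = operational_hours / failures         # reciprocal of ROCOF (stable system)

downtime_hours = 12       # total repair/restart time in the period
availability = (operational_hours - downtime_hours) / operational_hours

print(pofod)         # 0.0004
print(rocof)         # 0.002
print(mttf)          # 500.0
print(availability)  # 0.9976
```

Which metric to quote in a specification depends on the demand pattern, as the slides note: POFOD for infrequent demands, ROCOF for high-volume request streams, MTTF for long transactions, availability for continuously running services.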
Slide 18: Failure Consequences
- Reliability measurements do NOT take the consequences of failure into account.
- Transient faults may have no real consequences, but other faults may cause data loss or corruption, or loss of system service.
- Therefore: identify the different failure classes, use a different metric for each of them, and structure the reliability specification accordingly.
Slide 19: Failure Consequences (cont'd)
- When specifying reliability, it is not just the number of system failures that matters, but also the consequences of those failures.
- Failures with serious consequences are clearly more damaging than those where repair and recovery are straightforward.
- In some cases, therefore, different reliability specifications may be defined for different types of failure.
Slide 20: Failure Classification (classification table; figure not reproduced)
Slide 21: Steps to Reliability Specification
1. For each sub-system, analyze the consequences of possible system failures.
2. From the system failure analysis, partition failures into appropriate classes.
3. For each failure class identified, set out the reliability using an appropriate metric; different metrics may be used for different reliability requirements.
4. Identify functional reliability requirements that reduce the chances of critical failures.
Slide 22: Bank Auto-Teller System
Expected usage statistics:
- Each machine in the network is used about 300 times per day
- The lifetime of a software release is 2 years
- Each machine therefore handles about 220,000 transactions over 2 years
Total throughput:
- The bank has 1,000 ATMs
- ~300,000 database transactions per day
- ~110M transactions per year
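The throughput figures on this slide can be checked with a few lines of arithmetic:

```python
# Sanity-checking the ATM usage statistics from the slide.
uses_per_machine_per_day = 300
machines = 1_000
lifetime_days = 2 * 365  # 2-year software release lifetime

per_machine_lifetime = uses_per_machine_per_day * lifetime_days
transactions_per_day = uses_per_machine_per_day * machines
transactions_per_year = transactions_per_day * 365
lifetime_total = transactions_per_year * 2

print(per_machine_lifetime)   # 219000, i.e. about 220,000 per machine over 2 years
print(transactions_per_day)   # 300000 per day across the network
print(transactions_per_year)  # 109500000, i.e. about 110M per year
print(lifetime_total)         # 219000000, i.e. about 220M over the release lifetime
```

The network-wide lifetime total (~220M transactions) is what drives the validation problem on slide 25.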
Slide 23: Bank ATM (cont'd)
Types of failure:
- Single-machine failures: affect an individual ATM
- Network failures: affect groups of ATMs and lower throughput
- Central database failures: potentially affect the entire network
Slide 24: Examples of Reliability Specification

Failure class: Permanent, non-corrupting
  Example: The system fails to operate with any card input; the software must be restarted to correct the failure.
  Reliability metric: ROCOF – 1 occurrence per 1,000 days

Failure class: Transient, non-corrupting
  Example: Magnetic-stripe data cannot be read on an undamaged card that is input.
  Reliability metric: POFOD – 1 in 1,000 transactions

Failure class: Transient, corrupting
  Example: A pattern of transactions across the network causes database corruption.
  Reliability metric: Unquantifiable! Should never happen in the lifetime of the system.
Slide 25: Specification Validation
- It is impossible to validate very high reliability specifications empirically.
- E.g., in the ATM example, "no database corruptions" implies a POFOD of less than 1 in 220 million.
- If a transaction takes 1 second, then simulating one day's ATM transactions (300,000) on a single system would take 300,000 seconds, about 3.5 days.
- Testing a single run of 110M transactions would take about 3.5 years.
- It would therefore take far longer than the system's lifetime (2 years) to test it for reliability.
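The time estimates on this slide follow directly from the 1-second-per-transaction assumption:

```python
# How long would empirical validation of the ATM reliability spec take,
# assuming each simulated transaction takes 1 second?
SECONDS_PER_DAY = 86_400

one_day_of_traffic = 300_000        # network-wide transactions per day
one_year_of_traffic = 110_000_000   # network-wide transactions per year

days_to_simulate_one_day = one_day_of_traffic / SECONDS_PER_DAY
years_to_simulate_one_year = one_year_of_traffic / SECONDS_PER_DAY / 365

print(round(days_to_simulate_one_day, 1))    # 3.5 days
print(round(years_to_simulate_one_year, 1))  # 3.5 years
```

Since 3.5 years exceeds the 2-year lifetime of a software release, the "should never happen" requirement on slide 24 cannot be demonstrated by testing alone.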
Slide 26: Topics
- Software reliability specification
- Safety specification
- Security specification
Slide 27: Safety Specification
- The safety requirements of a system should be specified separately and based on an analysis of possible hazards and risks.
- Safety requirements usually apply to the system as a whole rather than to individual sub-systems; in systems-engineering terms, the safety of a system is an emergent property.
Slide 28: Safety Life-Cycle (process diagram; figure not reproduced)
Slide 29: Safety Processes
- Hazard and risk analysis: assess the hazards and the risks of damage associated with the system.
- Safety requirements specification: specify the set of safety requirements that apply to the system.
- Designation of safety-critical systems: identify the sub-systems whose incorrect operation may compromise system safety; ideally, these should be as small a part as possible of the whole system.
- Safety validation: check the overall system safety.
Slide 30: Hazard and Risk Analysis (process diagram; figure not reproduced)
Slide 31: Hazard Analysis Stages
1. Hazard identification: identify the potential hazards that may arise.
2. Risk analysis and hazard classification: assess the risk associated with each hazard.
3. Hazard decomposition: decompose hazards to discover their potential root causes.
4. Risk-reduction assessment: define how each hazard must be taken into account when the system is designed.
Slide 32: Fault-Tree Analysis
- A method of hazard analysis that starts with an identified fault and works backward to the causes of the fault.
- Used at all stages of hazard analysis, from preliminary analysis through detailed software checking.
- A top-down hazard analysis method; may be combined with bottom-up methods, which start with system failures and work forward to hazards.
Slide 33: Fault-Tree Analysis (cont'd)
- Identify the hazard.
- Identify the potential causes of the hazard; there are usually several alternative causes. Link these on the fault tree with "or" or "and" symbols.
- Continue the process until the root causes are identified.
- The following example considers how data might be lost in a system where a backup process is running.
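The and/or gate structure described above can be sketched as a tiny evaluator. The event names and the tree below are hypothetical, loosely modeled on the data-loss example, and are not taken from Sommerville's diagram:

```python
# A minimal fault-tree evaluator. Leaves are basic event names; internal
# nodes are ("or" | "and", [subtrees]) gates, as on a drawn fault tree.
# The example tree is hypothetical, loosely based on the data-loss case.

def evaluate(node, events):
    """Return True if the hazard described by this (sub)tree occurs."""
    if isinstance(node, str):               # leaf: a basic event name
        return events[node]
    gate, children = node                   # internal node: (gate, subtrees)
    results = [evaluate(child, events) for child in children]
    return any(results) if gate == "or" else all(results)

# Hazard: data lost. It occurs if the primary copy is lost AND the
# backup also fails, where the backup can fail for either of two reasons.
data_lost = ("and", [
    "primary_disk_failure",
    ("or", ["backup_process_crashed", "backup_media_corrupt"]),
])

events = {
    "primary_disk_failure": True,
    "backup_process_crashed": False,
    "backup_media_corrupt": True,
}
print(evaluate(data_lost, events))  # True: disk failed and backup media corrupt
```

Walking such a tree from the top-level hazard down to the leaves is exactly the backward, cause-seeking process the slide describes.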
Slide 34: Fault Tree Example (fault-tree diagram for the data-loss case; figure not reproduced)
Slide 35: Risk Assessment
- Assesses hazard severity, hazard probability, and accident probability.
- The outcome of risk assessment is a statement of acceptability for each risk:
  - Intolerable: must never arise or result in an accident.
  - As low as reasonably practical (ALARP): must minimize the possibility of the hazard, given cost and schedule constraints.
  - Acceptable: the consequences of the hazard are acceptable, and no extra costs should be incurred to reduce the hazard probability.
Slide 36: Levels of Risk (diagram trading risks against costs, with an "as low as reasonably practical" region; figure not reproduced)
Slide 37: Risk Acceptability
- The acceptability of a risk is determined by human, social, and political considerations.
- In most societies, the boundaries between the regions are pushed upward over time; i.e., society becomes increasingly less willing to accept risk.
- For example, the cost of cleaning up pollution may be less than the cost of preventing it, but pollution may not be socially acceptable.
- Risk assessment is often highly subjective: we often lack hard data on real probabilities, and whether a risk is labeled "probable," "unlikely," etc. depends on who is making the assessment.
Slide 38: Why Do We Lack Firm Risk Probabilities and Costs?
- Failure of observation – we don't notice incidents
- Failure of reporting – we don't tell anyone
- Variability of systems – we can't pool data
- Difficulty of classifying incidents – we can't compare problems
- Difficulty of measuring costs – we don't know all the repercussions
Slide 39: Risk Reduction
The system should be specified so that hazards do not arise or do not result in an accident:
- Hazard avoidance: design so that the hazard can never arise during correct system operation.
- Hazard detection and removal: design so that hazards are detected and neutralized before they result in an accident.
- Damage limitation or mitigation: design so that the consequences of an accident are minimized or at least reduced.
Slide 40: Specifying Forbidden Behavior – Examples
- The system shall not allow users to modify the access permissions on any files they have not created (security).
- The system shall not allow reverse-thrust mode to be selected when the aircraft is in flight (safety).
- The system shall not allow the simultaneous activation of more than three alarm signals (safety).
Slide 41: Topics
- Software reliability specification
- Safety specification
- Security specification
Slide 42: Security Specification
Similarities to safety specification:
- It is not possible to specify security requirements quantitatively.
- Requirements are often "shall not" rather than "shall" requirements.
Differences:
- There is no well-defined life cycle for security management.
- Security deals with generic threats rather than system-specific hazards.
- Security technology (encryption, etc.) is mature, but there are problems in transferring it into general use, e.g., because of corporate culture.
Slide 43: Security Specification Process (process diagram; figure not reproduced)
Slide 44: Stages in Security Specification (1)
- Asset identification and evaluation: the assets (data and programs) are identified, along with their required degree of protection, based on their criticality and sensitivity.
- Threat analysis and risk assessment: possible threats are identified and the risks are estimated.
- Threat assignment: identified threats are related to assets, so that for each identified asset there is a list of associated threats.
Slide 45: Stages in Security Specification (2)
- Technology analysis: identify available security technologies and assess their applicability against the identified threats.
- Security requirements specification: specify the security requirements in terms of policy, procedure, and technology.
Slide 46: Homework
- Apply the full Read-Recite-Review phases of SQ3R to Chapter 17 of Sommerville's text.
- For next class (Tuesday), apply the Survey-Question phases to Chapter 18 on Critical Systems Development.
- For Thursday, 30 Oct 2003 (REQUIRED): hand in responses to Exercises 17.1 (2 points), 17.2 (6), 17.3 (4), 17.4 (4), 17.5 (2), 17.6 (6), and 17.7 (6) – 30 points total.
- OPTIONAL, by 6 Nov: 17.8 and/or 17.9 for 3 extra points each.
Slide 47: DISCUSSION