Safety-Critical Systems 3 Hardware/Software T 79.232 Ilkka Herttua.



Current situation / critical systems
Based on the data on recent failures of critical systems, the following can be concluded:
a) Failures are becoming more and more distributed and are often nation-wide (e.g. commercial systems such as credit card denial of authorisation).
b) The source of failure is more rarely in hardware (physical faults), and more frequently in system design or end-user operation / interaction (software).
c) The harm caused by failures is mostly economic, but sometimes health and safety concerns are also involved.
d) Failures can impact many different aspects of dependability (dependability = ability to deliver service that can justifiably be trusted).

Examples of computer failures in critical systems

Driving force: federation
Safety-related systems have traditionally been based on the idea of federation. This means that a failure of any piece of equipment should be confined and should not cause the collapse of the entire system. When computers were introduced to safety-critical systems, the principle of federation was in most cases kept in force. As a result, the Boeing 757 / 767 flight management control system has 80 distinct microprocessors (300, if redundancy is taken into account). Although having this number of microprocessors is no longer too expensive, the principle of federation causes other problems.

Hardware Faults
Intermittent faults
- Fault occurs and recurs over time (loose connector)
Transient faults
- Fault occurs and may not recur (lightning)
- Electromagnetic interference
Permanent faults
- Fault persists / physical processor failure (design fault: over-current)

Fault Tolerance
Fault tolerance hardware
- Achieved mainly by redundancy
Redundancy
- Adds cost, weight, power consumption, complexity
Other means:
- Improved maintenance, single system with better materials (higher MTBF)

Redundancy types
Active Redundancy:
- Redundant units are always operating.
Dynamic Redundancy (standby):
- Failure has to be detected
- Changeover to other module

Hardware redundancy techniques
Active techniques:
- Parallel (k of N)
- Voting (majority/simple)
Standby:
- Operating: hot standby
- Non-operating: cold standby
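The two active techniques above can be sketched in a few lines. This is only an illustration, not taken from the slides: the function names `majority_vote` and `k_of_n_reliability` are hypothetical, and the reliability formula assumes that all units are identical and fail independently, which common-mode failures would violate.

```python
from collections import Counter
from math import comb


def majority_vote(outputs):
    """Return the value produced by a strict majority of redundant units.

    Raises ValueError when no value has more than half of the votes,
    so the caller can switch to a safe state instead of guessing.
    """
    if not outputs:
        raise ValueError("no unit outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among unit outputs")
    return value


def k_of_n_reliability(k, n, r):
    """Probability that at least k of n units work.

    Assumption: every unit has the same reliability r and units fail
    independently of each other.
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Three redundant channels: the single faulty reading is outvoted.
print(majority_vote([42, 42, 17]))
# 2-of-3 arrangement built from units with reliability 0.9.
print(round(k_of_n_reliability(2, 3, 0.9), 3))
```

With these example numbers the 2-of-3 arrangement is more reliable than a single unit (about 0.972 against 0.9), at the price of the extra cost, weight and complexity noted earlier.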

Reliability prediction
Electronic Component
- Based on probability and statistics
- MIL-Handbook 217: experimental data on actual device behaviour
- Manufacturer information and allocated circuit types
- Bathtub curve: burn-in, useful life, wear-out
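For the flat "useful life" part of the bathtub curve, a constant failure rate is the usual simplifying assumption, which gives an exponential reliability function. The sketch below shows that calculation only; the failure rate used is a made-up figure, not a value from MIL-Handbook 217, and the function names are hypothetical.

```python
import math


def reliability(failure_rate_per_hour, hours):
    """R(t) = exp(-lambda * t), valid only for a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def mttf_from_rate(failure_rate_per_hour):
    """With a constant failure rate, MTTF is the reciprocal of that rate."""
    return 1.0 / failure_rate_per_hour


# Hypothetical component: 1e-5 failures per hour during useful life.
rate = 1e-5
print(round(reliability(rate, 10_000), 4))   # survival probability after 10 000 h
print(round(mttf_from_rate(rate)))           # mean time to failure in hours
```

The burn-in and wear-out phases have changing failure rates, so this formula does not apply there.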

Reliability calculation for system
- MTTF, mean time to failure: average time for which the system would operate before the first failure
- MTTR, mean time to repair: time to get the system back in service again
- MTBF, mean time between failures
MTBF = MTTF + MTTR
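A small worked example of the relation above. The availability formula is not on the slide; it is the standard steady-state definition, added here as an assumption to show how the two figures are typically used together. The numbers are invented.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures, as defined above: MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is in service."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical system: runs 990 h on average, then needs 10 h of repair.
print(mtbf(990.0, 10.0))
print(availability(990.0, 10.0))
```

In this example the system is in service 99 % of the time, so its unavailability is 1 %.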

Safety-Critical Hardware
Fault Detection:
- Routines to check that hardware works
- Signal comparisons
- Information redundancy: parity check etc.
- Watchdog timers
- Bus monitoring: check that the processor is alive
- Power monitoring
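As an illustration of information redundancy, the sketch below appends one even-parity bit to a data word and checks it on receipt. The function names are hypothetical, and real hardware would do this in logic rather than software.

```python
def parity_bit(data):
    """Even parity: 1 if the data word holds an odd number of 1 bits."""
    return bin(data).count("1") % 2


def encode(data):
    """Append the parity bit as the least significant bit of the word."""
    return (data << 1) | parity_bit(data)


def check(word):
    """A received word is accepted when its total number of 1 bits is even."""
    return bin(word).count("1") % 2 == 0


word = encode(0b1011001)
print(check(word))              # intact word is accepted
print(check(word ^ 0b100))      # a single flipped bit is detected
print(check(word ^ 0b110))      # two flipped bits cancel out and go unnoticed
```

The last line shows the limit of a single parity bit: it detects any odd number of bit errors but misses an even number.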

Safety-Critical Hardware
Possible hardware: COTS Microprocessors
- No safety firmware, least assurance
- Redundancy helps, but common failures are still possible
- Fabrication failures, microcode and documentation errors
- Use components with a known history and failure statistics.

Safety-Critical Hardware
Specialist Microprocessors
- Collins Avionics/Rockwell AAMP2
- Used in Boeing (30+ pieces)
- High cost: bench testing, documentation, formal verification
- Other models: SparcV7, TSC695E, ERC32 (ESA radiation-tolerant), 68HC908GP32 (airbag)

Safety-Critical Hardware
Programmable Logic Controllers (PLC)
- Contains power supply, interface and one or more processors
- Designed for high MTBF
- Firmware: program stored in EEPROMs
- Programmed with ladder or function block diagrams

Safety-Critical Software
Correct Program:
- Normally iteration is needed to develop a working solution (writing code, testing and modification).
- In a non-critical environment, code is accepted when the tests are passed.
- Testing is not enough for a safety-critical application. An assessment process is needed: dynamic/static testing, simulation, code analysis and formal verification.

Safety-Critical Software
Dependable Software:
- Process for development
- Work discipline
- Well documented
- Quality management
- Validated/verified

Safety-Critical Software
Safety-Critical Programming Language:
- Logical soundness: Unambiguous definition of the language, no dialects of C++
- Simple definition: Complexity can lead to errors in compilers or other support tools
- Expressive power: The language shall allow domain features to be expressed efficiently and easily
- Security of definition: Violations of the language definition shall be detected
- Verification: The language supports verification, i.e. proving that the produced code is consistent with the specification.
- Memory/time constraints: Stack, register and memory usage are controlled.

Safety-Critical Software
Software faults:
- Requirements defects: failure of the software requirements to specify the environment in which the software will be used, or ambiguous requirements
- Design defects: not satisfying the requirements, or documentation defects
- Code defects: failure of the code to conform to the software design.

Safety-Critical Software
Software faults:
- Subprogram effects: Definition of a called variable may be changed.
- Definition aliasing: Names refer to the same storage location.
- Initialising failures: Variables are used before they are assigned values.
- Memory management: Buffer, stack and memory overflows
- Expression evaluation errors: Divide-by-zero/arithmetic overflow
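Three of the fault classes above (subprogram effects, aliasing and expression evaluation errors) can be shown in a short sketch. It is written in Python purely for illustration; the slides discuss Ada, C and assembler, where the symptoms differ, and all names here are hypothetical.

```python
def scale_in_place(values, factor):
    """Subprogram with a side effect: it modifies the caller's list."""
    for i in range(len(values)):
        values[i] *= factor


def scaled_copy(values, factor):
    """Side-effect-free alternative: the caller's data is left untouched."""
    return [v * factor for v in values]


def safe_divide(numerator, denominator, fallback):
    """Guard against divide-by-zero by returning an explicit fallback."""
    if denominator == 0:
        return fallback
    return numerator / denominator


setpoints = [1.0, 2.0]
alias = setpoints                 # aliasing: two names, one storage location
scale_in_place(alias, 10)
print(setpoints)                  # changed, although only 'alias' was passed

limits = [1.0, 2.0]
print(scaled_copy(limits, 10), limits)
print(safe_divide(5.0, 0.0, fallback=0.0))
```

Language subsets for safety-critical work typically restrict or forbid constructs that make such effects hard to see.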

Safety-Critical Software
Language comparison:
- Structured assembler (wild jumps, exhaustion of memory, well understood)
- Ada (wild jumps, data typing, exception handling, separate compilation)
- Subset languages: CORAL, SPADE and Ada (Alsys CSMART Ada kernel)
- Validated compilers for Pascal and Ada
- Available expertise: common languages give higher productivity and fewer mistakes, but C is still not appropriate.

Safety-Critical Software
Languages used:
- Boeing uses mostly Ada, but still for type about 75 languages used.
- ESA mandated Ada for mission-critical systems.
- NASA Space Station in Ada, some systems with C and Assembler.
- Car ABS systems with Assembler
- Train control systems with Ada
- Medical systems with Ada and Assembler
- Nuclear reactor core and shut-down systems with Assembler, migrating to Ada.

Safety-Critical Software
Tools
- High-reliability, validated tools are required: faults in a tool can result in faults in the safety-critical software.
- Widespread tools are better tested
- Use a confirmed process for using the tool
- Analyse the output of the tool: static analysis of the object code
- Use alternative products and compare results
- Use different tools (diversity) to reduce the likelihood of wrong test results.

Safety-Critical Software
Designing Principles
- Use hardware interlocks before computer/software
- New software features add complexity; try to keep software simple
- Plan for avoiding human error: unambiguous human-computer interface
- Removal of hazardous module (Ariane 5 unused code)

Safety-Critical Software
Designing Principles
- Add barriers: hard/software locks for critical parts
- Minimise single point failures: increase safety margins, exploit redundancy and allow recovery.
- Isolate failures: don't let things get worse.
- Fail-safe: panic shut-downs, watchdog code
- Avoid common mode failures: use diversity (different programmers, n-version programming)
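The fail-safe principle with watchdog code can be sketched as follows. This is a minimal software illustration under stated assumptions: the class `Watchdog`, the function `supervise` and the "shutdown" action are hypothetical, and a real system would normally rely on an independent hardware watchdog timer.

```python
import time


class Watchdog:
    """Software watchdog: reports expiry if it is not kicked in time."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock          # injectable so the sketch can be tested
        self._last_kick = clock()

    def kick(self):
        """Called by the monitored task to show it is still alive."""
        self._last_kick = self._clock()

    def expired(self):
        return self._clock() - self._last_kick > self._timeout_s


def supervise(watchdog, enter_safe_state):
    """Fail-safe reaction: trigger the safe state once the watchdog expires."""
    if watchdog.expired():
        enter_safe_state()
        return True
    return False


# Demonstration with a hand-driven clock instead of real time.
now = [0.0]
actions = []
dog = Watchdog(timeout_s=0.5, clock=lambda: now[0])

now[0] = 0.4
dog.kick()                                   # task reports in on time
supervise(dog, lambda: actions.append("shutdown"))
now[0] = 1.0                                 # task hangs: no kick for 0.6 s
supervise(dog, lambda: actions.append("shutdown"))
print(actions)
```

Only the second check triggers the shut-down, because the monitored task failed to report within the timeout.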

Safety-Critical Software
Designing Principles:
- Fault tolerance: Recovery blocks, i.e. if one module fails, execute an alternative module.
- Don't rely on run-time systems
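A recovery block can be sketched as a loop over alternative modules guarded by an acceptance test. The code below is an illustration only: the names are hypothetical, the deliberately faulty primary module is contrived, and the alternates are pure functions, so the state restoration that a full recovery block performs before each retry is not needed here.

```python
import math


def recovery_block(alternates, acceptance_test, *args):
    """Run alternates in order and return the first result that is accepted."""
    for alternate in alternates:
        try:
            result = alternate(*args)
        except Exception:
            continue                     # a crash counts as a failed alternate
        if acceptance_test(result, *args):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def primary_sqrt(x):
    return x / 2                         # contrived fault: wrong for most inputs


def backup_sqrt(x):
    return math.sqrt(x)


def acceptable(result, x):
    """Acceptance test: the result squared must reproduce the input."""
    return result >= 0 and abs(result * result - x) < 1e-9


print(recovery_block([primary_sqrt, backup_sqrt], acceptable, 9.0))
```

The quality of the acceptance test decides how much protection the scheme gives: a weak test lets wrong results from the primary module through.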

Safety-Critical Software
Techniques/Tools:
- Fault prevention: Preventing the introduction or occurrence of faults by using design-supporting tools (UML with a CASE tool)
- Fault removal: Testing, debugging and code modification

Safety-Critical Software
Software faults:
- Faults in software tools (development/modelling) can result in system faults.
- Techniques for software development (language/design notation) can have a great impact on the performance of the people involved and also determine the likelihood of faults.
- The characteristics of the programming systems and their runtime determine how great the impact of possible faults on the overall software subsystem can be.

Safety-Critical Software
Architectural design: Layered structure
1 - High-level command and control functions
2 - Intermediate-level routines
3 - I/O routines and device drivers
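The three layers above might be arranged as in the following sketch, where each layer only calls the one directly below it. The valve example and all class names are hypothetical and serve only to show the layering.

```python
class ValveDriver:
    """Layer 3: I/O routine / device driver (simulated hardware register)."""

    def __init__(self):
        self.position = 0

    def write_position(self, percent):
        self.position = percent


class ValveRoutines:
    """Layer 2: intermediate routine that validates values before I/O."""

    def __init__(self, driver):
        self._driver = driver

    def set_opening(self, percent):
        clamped = max(0, min(100, percent))   # never pass out-of-range values down
        self._driver.write_position(clamped)
        return clamped


class Controller:
    """Layer 1: high-level command and control functions."""

    def __init__(self, routines):
        self._routines = routines

    def request_flow(self, percent):
        return self._routines.set_opening(percent)

    def shut_down(self):
        return self._routines.set_opening(0)


driver = ValveDriver()
controller = Controller(ValveRoutines(driver))
controller.request_flow(150)     # out-of-range command is limited in layer 2
print(driver.position)
controller.shut_down()
print(driver.position)
```

Keeping hardware access in the lowest layer means that a change of device affects only the driver, which supports the fault-limiting role of partitioning.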

Safety-Critical Software
Architectural design:
- Design is done after partitioning of the required functions between hardware and software.
- Complete specification of the architecture with components, data structures and interfaces (messages/protocols)

Safety-Critical Software
Architectural design:
- Test plan for each module (testability)
- Human-computer interface
- Change control system needed for inconsistencies and inadequacies within the specification.
- Verification of the architectural design against the specification
- Software partitioning: a modular structure aids comprehension and isolation (fault limiting)

Safety-Critical Software
Reduction of Hazardous Conditions: summary
- Simplify: Code contains only the minimum features and no unnecessary or undocumented features or unused executable code
- Diversity: Data and control redundancy
- Multi-version programming: a shared specification leads to common-mode failures, and synchronisation code increases complexity

Safety-Critical Software
Home assignments 3:
- (fault-tolerant system)
- (reliability model)
- (reuse of software)
Due by 24 February 2004.

Home assignments 1&
- (primary, functional and indirect safety)
- 2.4 (unavailability)
- 3.23 (fault tree)
- 4.18 (tolerable risk)
- 5.10 (incompleteness within specification)
Due before 24 February.
11 and 18 February: Case Studies / Teemu Tynjälä