Safety Critical Systems

Safety Critical Systems: Design for Safety in Hardware and Software. Ilkka Herttua

V - Lifecycle Model
Descending (development) branch: Requirements Analysis (Requirements Model, Test Scenarios), Systems Analysis & Design (Functional/Architectural Model, Specification), Design Document, Software Implementation & Unit Test.
Ascending (integration) branch: Module Integration & Test, System Integration & Test, System Acceptance (against the requirements and test scenarios).
Knowledge Base*
* Configuration-controlled knowledge that increases in understanding until completion of the system: requirements documentation, requirements traceability, model data/parameters, test definitions/vectors.

Designing for Safety
Fault groups:
- requirement/specification errors
- random component failures
- systematic faults in design (software)
Approaches to tackle these problems:
- the right system architecture (fault-tolerant)
- reliability engineering (component, system)
- quality management (design and production processes)

Designing for Safety
Hierarchical design:
- simple modules, encapsulated functionality
- a separated safety kernel for safety-critical functions
Maintainability:
- preventive versus corrective maintenance
- scheduled maintenance routines for the whole lifecycle
- easy to find faults and repair: short MTTR (mean time to repair)
Reduce human error:
- a proper HMI

Hardware Faults
Intermittent faults: the fault occurs and recurs over time (e.g. a loose connector).
Transient faults: the fault occurs and may not recur (e.g. lightning, electromagnetic interference).
Permanent faults: the fault persists, e.g. a physical processor failure or a design fault such as over-current.

Fault Tolerance
Fault-tolerant hardware:
- achieved mainly by redundancy
- adds cost, weight, power consumption and complexity
Other means:
- improved maintenance; a single system built with better materials (higher mean time between failures, MTBF)

Redundancy Types
Active redundancy: redundant units are always operating in parallel.
Dynamic redundancy (standby): a failure has to be detected, followed by a changeover to the other module.
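A minimal sketch of the dynamic (standby) scheme described above. The two channel functions and the latching failure flag are illustrative stand-ins, not from the slides; the point is that a detected failure triggers the changeover to the standby unit.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the two redundant channels; in a real system these
 * would read two independent hardware units. */
static bool read_sensor_primary(double *value) { *value = 42.0; return true; }
static bool read_sensor_standby(double *value) { *value = 42.1; return true; }

/* Dynamic (standby) redundancy: use the primary unit until a failure
 * is detected, then latch a changeover to the standby unit. */
static bool primary_failed = false;

static bool read_sensor(double *value)
{
    if (!primary_failed && read_sensor_primary(value))
        return true;                  /* primary still healthy */
    primary_failed = true;            /* failure detected: change over */
    return read_sensor_standby(value);
}

int main(void)
{
    double v;
    if (read_sensor(&v))
        printf("reading: %f\n", v);
    return 0;
}
```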

Hardware Redundancy Techniques
Active techniques:
- parallel (k out of N)
- voting (majority/simple)
Standby techniques:
- operating (hot standby)
- non-operating (cold standby)
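As an illustration of the voting style of active redundancy listed above, here is a minimal 2-out-of-3 majority voter (triple modular redundancy). The channel values and the fallback behaviour are assumptions for the sketch; a real design would treat a no-majority outcome as an unrecoverable fault.

```c
#include <stdio.h>

/* 2-out-of-3 majority voter: returns the value agreed on by at least
 * two of the three channels. If all three disagree, report it and fall
 * back to channel a. */
static int vote_2oo3(int a, int b, int c, int *no_majority)
{
    *no_majority = 0;
    if (a == b || a == c) return a;
    if (b == c) return b;
    *no_majority = 1;                /* no two channels agree */
    return a;
}

int main(void)
{
    int fault;
    int result = vote_2oo3(7, 7, 9, &fault);   /* one faulty channel */
    printf("voted value: %d, no-majority flag: %d\n", result, fault);
    return 0;
}
```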

Hardware Reliability Prediction
Electronic components:
- based on probability and statistics; MIL-HDBK-217 provides experimental data on actual device behaviour
- manufacturer information and allocated circuit types
- bathtub curve: burn-in, useful life, wear-out
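For the reliability block models used in such predictions (the same kind of model as in home assignment 7.15 at the end), the standard series/parallel combination rules are:

```latex
% Series structure: the system works only if every module works
R_{\mathrm{series}} = \prod_{i=1}^{n} R_i
% Parallel structure (active redundancy): the system fails only if all modules fail
R_{\mathrm{parallel}} = 1 - \prod_{i=1}^{n} (1 - R_i)
% Worked example: two modules of reliability 0.7 in parallel
R = 1 - (1 - 0.7)(1 - 0.7) = 0.91
```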

Safety Critical Hardware
Fault detection: routines to check that the hardware works
- signal comparisons
- information redundancy (parity checks etc.)
- watchdog timers
- bus monitoring (check that the processor is alive)
- power monitoring
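A minimal sketch of one of the information-redundancy checks listed above: computing and verifying an even parity bit over an 8-bit data word (the word size and test values are assumptions for the example).

```c
#include <stdint.h>
#include <stdio.h>

/* Even parity over an 8-bit data word: the parity bit is chosen so that
 * the total number of 1-bits (data + parity) is even. */
static uint8_t parity_bit(uint8_t data)
{
    uint8_t ones = 0;
    for (int i = 0; i < 8; i++)
        ones ^= (data >> i) & 1u;    /* XOR of all bits */
    return ones;                      /* 1 if the count of 1s is odd */
}

/* Detection: any single-bit error flips the recomputed parity. */
static int parity_ok(uint8_t data, uint8_t stored_parity)
{
    return parity_bit(data) == stored_parity;
}

int main(void)
{
    uint8_t word = 0x5A;
    uint8_t p = parity_bit(word);
    printf("ok: %d, after bit flip: %d\n",
           parity_ok(word, p), parity_ok(word ^ 0x08, p));
    return 0;
}
```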

Safety Critical Hardware
1. Commercial microprocessors
- no safety firmware, least assurance
- redundancy improves matters, but common-mode failures are still possible
- fabrication failures, microcode and documentation errors
- use components that have a history and statistics

Safety Critical Hardware
2. Special reliable microprocessors
- Collins Avionics/Rockwell AAMP2, used in the Boeing 747-400 (30+ pieces)
- high cost: bench testing, documentation, formal verification
- other models: SPARC V7, TSC695E, ERC32 (ESA radiation-tolerant), 68HC908GP32 (airbag)

Safety Critical Hardware
3. Programmable logic controllers (PLCs)
- contain a power supply, interfaces and one or more processors
- designed for a high mean time between failures (MTBF)
- solid firmware; program stored in EEPROMs
- programmed with ladder or function block diagrams

Safety Critical Software
Software development: normally iteration (writing code, testing and modifying) is needed to develop a working solution. In a non-critical environment, code is accepted once the tests pass. Testing alone is not enough for a safety-critical application: the software needs an assessment process of dynamic/static testing, simulation, code analysis and formal verification.

Safety Critical Software
Dependable software:
- a process for development
- work discipline
- well documented
- quality management
- validated/verified

Safety-Critical Software
Software faults:
- Requirements defects: failure of the software requirements to specify the environment in which the software will be used, or to state the requirements unambiguously.
- Design defects: not satisfying the requirements, or documentation defects.
- Code defects: failure of the code to conform to the software design.

Safety-Critical Software
Software faults:
- Subprogram side effects: the definition of a variable visible to the caller may be changed.
- Aliasing: different names refer to the same storage location.
- Initialisation failures: variables are used before they are assigned values.
- Memory management: buffer, stack and memory overflows.
- Expression evaluation errors: divide-by-zero/arithmetic overflow.
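A small illustrative C fragment (not from the slides) showing two of the fault classes above together with the kind of defensive coding that guards against them: use before initialisation and divide-by-zero.

```c
#include <stdbool.h>
#include <stdio.h>

/* Expression evaluation errors: guard the division instead of assuming
 * the divisor is non-zero. */
static bool safe_divide(int numerator, int denominator, int *result)
{
    if (denominator == 0)
        return false;                 /* report the error to the caller */
    *result = numerator / denominator;
    return true;
}

int main(void)
{
    /* Initialisation failure avoided: give the variable a defined value
     * before any use, rather than relying on whatever is in memory. */
    int reading = 0;

    int scaled;
    if (safe_divide(reading, 0, &scaled))   /* divisor 0: rejected safely */
        printf("scaled: %d\n", scaled);
    else
        printf("division rejected\n");
    return 0;
}
```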

Safety Critical Software
Requirements on a safety-critical programming language:
- Logical soundness: an unambiguous definition of the language, with no dialects (a problem with C++).
- Simple definition: complexity can lead to errors in compilers or other support tools.
- Expressive power: the language shall support expressing domain features efficiently and easily.
- Security of definition: violations of the language definition shall be detected.
- Verifiability: the language supports verification, i.e. proving that the produced code is consistent with the specification.
- Memory/time constraints: stack, register and memory usage are controlled.

Safety Critical Software
Language comparison:
- Structured assembler: wild jumps and exhaustion of memory possible, but well understood.
- Ada: wild jumps, data typing, exception handling, separate compilation.
- Subset languages: CORAL, SPADE and Ada subsets (e.g. the Alsys CSMART Ada kernel).
- Validated compilers exist for Pascal and Ada.
- Available expertise: common languages bring higher productivity and fewer mistakes, but C is still not appropriate.

Safety Critical Software
Languages used:
- Boeing uses mostly Ada, but about 75 languages are still used on the 747-400.
- ESA mandated Ada for mission-critical systems.
- The NASA space station is in Ada, with some systems in C and assembler.
- Car ABS systems: assembler.
- Train control systems: Ada.
- Medical systems: Ada and assembler.
- Nuclear reactor core and shutdown systems: assembler, migrating to Ada.

Safety Critical Software Tools
Highly reliable and validated tools are required: faults in a tool can result in faults in the safety-critical software.
- Widespread tools are better tested.
- Use a confirmed process for the usage of the tool.
- Analyse the output of the tool: static analysis of the object code.
- Use alternative products and compare the results.
- Use different tools (diversity) to reduce the likelihood of wrong test results.

Safety Critical Software Design Principles 1
- New software features add complexity; try to keep the software simple.
- Plan for avoiding human error: an unambiguous human-computer interface.
- Remove hazardous modules (the Ariane 5 unused code).

Safety Critical Software Design Principles 2
- Add barriers: hardware/software locks for critical parts.
- Minimise single-point failures: increase safety margins, exploit redundancy and allow recovery.
- Isolate failures: don't let things get worse.
- Fail-safe: panic shutdowns, watchdog code (see the sketch below).
- Avoid common-mode failures: use diversity, i.e. different programmers, N-version programming.
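A minimal sketch of the watchdog/fail-safe idea from the list above. The enter_safe_state() routine and the one-second window are assumptions for the example; the main loop must "kick" the watchdog regularly, and a missed deadline drives the system to its safe state.

```c
#include <stdio.h>
#include <time.h>

/* Stand-in safe-state action: in a real system this would de-energise
 * outputs, apply brakes, close valves, etc. */
static void enter_safe_state(void)
{
    printf("entering safe state\n");
}

/* Simple software watchdog: if the main loop has not kicked the watchdog
 * within the allowed window, force a fail-safe shutdown. */
static time_t last_kick;
static const double WATCHDOG_WINDOW_S = 1.0;

static void watchdog_kick(void) { last_kick = time(NULL); }

static void watchdog_check(void)
{
    if (difftime(time(NULL), last_kick) > WATCHDOG_WINDOW_S)
        enter_safe_state();           /* deadline missed: fail safe */
}

int main(void)
{
    watchdog_kick();
    /* ... the control loop would call watchdog_kick() every cycle, while
     * watchdog_check() runs from a timer or a separate monitor ... */
    watchdog_check();
    return 0;
}
```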

Safety Critical Software Design Principles 3
- Fault tolerance: recovery blocks; if one module fails, execute an alternative module (a sketch follows below).
- Don't rely on run-time systems.
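A minimal sketch of the recovery-block pattern named above. The acceptance test and the two alternative implementations are illustrative assumptions: the primary result is checked, and only if it fails the test is the alternate executed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Two independently written alternatives for the same computation. */
static bool primary_alg(int in, int *out)   { *out = in * 2;  return true; }
static bool alternate_alg(int in, int *out) { *out = in + in; return true; }

/* Illustrative acceptance test: checks that a result is plausible. */
static bool acceptance_test(int out) { return out >= 0; }

/* Recovery block: run the primary alternative, apply the acceptance test,
 * and fall back to the alternate if the test fails. */
static bool recovery_block(int in, int *out)
{
    if (primary_alg(in, out) && acceptance_test(*out))
        return true;                   /* primary result accepted */
    if (alternate_alg(in, out) && acceptance_test(*out))
        return true;                   /* recovered via the alternate */
    return false;                      /* both alternatives failed */
}

int main(void)
{
    int result;
    if (recovery_block(21, &result))
        printf("result: %d\n", result);
    return 0;
}
```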

Safety-Critical Software
Techniques/tools:
- Fault prevention: preventing the introduction or occurrence of faults by using design support tools (UML with a CASE tool).
- Fault removal: testing, debugging and code modification.

Safety Critical Software
Software tool faults:
- Faults in software tools (development/modelling) can result in system faults.
- Techniques for software development (language/design notation) have a great impact on the performance of the people involved and also determine the likelihood of faults.
- The characteristics of the programming system and its runtime determine how great the impact of possible faults on the overall software subsystem can be.

Practical Design Process (by the tool manufacturer I-Logix: Statemate)

Improved Development Process

Integrated Development Process

Verified Software Process

Safety Critical Software
Reduction of hazardous conditions:
- Simplify: the code contains only the minimum features, with no unnecessary or undocumented features and no unused executable code.
- Diversity: data and control redundancy.
- Multi-version programming: a shared specification can lead to common-mode failures, and the synchronisation code increases complexity.

Home Assignments 2a
From Neil Storey's book, Safety-Critical Computer Systems:
- 5.10 Describe a common cause of incompleteness within specifications. How can this situation cause problems?
- 9.17 Describe the advantages and disadvantages of the reuse of software within safety-critical projects.
(continued in 2b)

Home Assignments 2b
- 7.15 A system may be described by the following reliability model, where the numbers within the boxes represent the module reliabilities (block diagram values: 0.7, 0.7, 0.9, 0.7, 0.98, 0.97, 0.99). Calculate the system reliability.
Email answers by 1 March to herttua@eurolock.org.