Safety-Critical Systems 7 Summary T
V-Lifecycle model
Stages (in V order): Requirements Analysis, Systems Analysis & Design, Software Design, Software Implementation & Unit Test, Module Integration & Test, System Integration & Test, System Acceptance
Artefacts: Requirements Document, Requirements Model, Test Scenarios, Functional / Architectural Model, Specification Document
Knowledge Base (configuration controlled): knowledge that increases in understanding until completion of the system
- Requirements Documentation
- Requirements Traceability
- Model Data/Parameters
- Test Definition/Vectors
1. - Requirements
Requirements are the stakeholders' (customers') demands: what they want the system to do, not how it should do it (the how belongs in the specification).
Safety requirements define what the system must do and must not do in order to ensure safety. They cover both positive and negative functionality.
1. - Requirements Engineering
Right requirements: ways to get better requirements
- complete: link requirements to hazards (possible dangerous events)
- correct: validate with tests and models
- consistent: use a semi-formal or formal language
- unambiguous: use terms and sentences that are understandable
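The completeness check by linking to hazards can be sketched as a small traceability script. This is only an illustration: the hazard IDs, requirement IDs and texts below are hypothetical and not taken from the course material.

```python
# Hypothetical hazard and requirement identifiers, for illustration only.
hazards = {
    "H1": "Door opens while the train is moving",
    "H2": "Brake command is not executed",
}

requirements = {
    "R1": {"text": "Doors shall stay locked while speed > 0.", "hazards": ["H1"]},
    "R2": {"text": "The system shall display the current speed.", "hazards": []},
}


def uncovered_hazards(hazards, requirements):
    """Return the sorted IDs of hazards that no requirement is linked to."""
    covered = {h for req in requirements.values() for h in req["hazards"]}
    return sorted(set(hazards) - covered)


print(uncovered_hazards(hazards, requirements))  # ['H2']
```

A non-empty result points at a hazard for which a safety requirement is still missing.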
1. - Hazard Analysis
A hazard is a situation in which there is actual or potential danger to people or to the environment.
Analytical techniques:
- Failure modes and effects analysis (FMEA)
- Failure modes, effects and criticality analysis (FMECA)
- Hazard and operability studies (HAZOP)
- Event tree analysis (ETA)
- Fault tree analysis (FTA)
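As an illustration of one of these techniques, a fault tree can be evaluated numerically by combining basic-event probabilities through AND and OR gates. The sketch below assumes statistically independent basic events. The tree structure and the probability values are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must occur: multiply the (independent) probabilities."""
    return prod(probabilities)


def or_gate(probabilities):
    """At least one input occurs: 1 minus the probability that none occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic events (probability per demand), for illustration only.
p_sensor_a_fails = 1e-3
p_sensor_b_fails = 1e-3
p_controller_fails = 1e-5

# Top event: "hazard not detected" = (both sensors fail) OR (controller fails).
p_both_sensors = and_gate([p_sensor_a_fails, p_sensor_b_fails])
p_top = or_gate([p_both_sensors, p_controller_fails])
print(f"top event probability: {p_top:.3e}")
```

The independence assumption rarely holds exactly in practice. Common-mode failures make the real top-event probability higher than this estimate.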
1. - Hazard formalisation
1. – Multiple Hazards
1. - Risk Analysis
Risk is a combination of the severity (class) and the frequency (probability) of the hazardous event.
Risk analysis is the process of evaluating the probability of hazardous events.
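The combination of severity and frequency is often expressed as a risk matrix. The sketch below shows the idea only: the category names, numeric scales and thresholds are assumptions made for this example, not values from a specific standard.

```python
# Hypothetical category scales and thresholds; real projects take these
# from the applicable standard and the agreed safety plan.
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
FREQUENCY = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4, "frequent": 5}


def risk_class(severity, frequency):
    """Combine severity and frequency into a coarse risk class."""
    score = SEVERITY[severity] * FREQUENCY[frequency]
    if score >= 12:
        return "intolerable"
    if score >= 6:
        return "undesirable"
    if score >= 3:
        return "tolerable"
    return "negligible"


print(risk_class("catastrophic", "occasional"))  # intolerable
print(risk_class("marginal", "remote"))          # tolerable
```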
2. - Safety Design
Fault groups:
- requirement/specification errors
- random component failures
- systematic faults in design (software)
Approaches to tackle these problems:
- the right system architecture (fault-tolerant)
- reliability engineering (component, system)
- quality management (design and production processes)
2. - Safety Design
Hierarchical design
- simple modules with encapsulated functionality
- separated safety kernel for the safety-critical functions
Maintainability
- preventive versus corrective maintenance
- scheduled maintenance routines for the whole lifecycle
- faults that are easy to find and repair: short MTTR (mean time to repair)
Human error
- proper HMI
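Why a short MTTR matters can be shown with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures. The figures below are hypothetical and only illustrate the effect of faster repair.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: same MTBF, repair time cut from 10 h to 1 h.
slow_repair = availability(mtbf_hours=1000.0, mttr_hours=10.0)
fast_repair = availability(mtbf_hours=1000.0, mttr_hours=1.0)
print(f"{slow_repair:.4f} -> {fast_repair:.4f}")  # 0.9901 -> 0.9990
```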
2. - Safety Design: Fault Tolerance
Fault-tolerant hardware
- achieved mainly by redundancy
Redundancy
- adds cost, weight, power consumption and complexity
Other means:
- improved maintenance, or a single system with better materials (higher MTBF)
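The benefit of redundancy can be estimated with the textbook formula for parallel redundant units, R_system = 1 - (1 - R_unit)^n. The sketch assumes independent failures and that one working unit is enough. The unit reliability of 0.9 is a hypothetical value.

```python
def parallel_reliability(unit_reliability, n_units):
    """Reliability of n independent redundant units where one working unit suffices."""
    return 1.0 - (1.0 - unit_reliability) ** n_units


# Hypothetical unit reliability of 0.9 over the mission time.
for n in (1, 2, 3):
    print(n, round(parallel_reliability(0.9, n), 6))
```

Each added unit removes a smaller share of the remaining failure probability while adding the full cost, weight and power consumption of one more unit, which is the trade-off named on the slide.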
3. Safety-Critical Software
Correct program:
- Normally iteration (writing code, testing and modification) is needed to develop a working solution.
- In a non-critical environment, code is accepted when the tests are passed.
- Testing is not enough for a safety-critical application, which needs an assessment process: dynamic/static testing, simulation, code analysis and formal verification.
3. Safety-Critical Software
Dependable software:
- Process for development
- Work discipline
- Well documented
- Quality management
- Validated and verified
3. Safety-Critical Software
Design principles:
- Use hardware interlocks together with computer/software solutions
- New software features add complexity, so try to keep the software simple
- Plan for avoiding human error: unambiguous human-computer interface
- Remove unused code or modules
3. Safety-Critical Software
Design principles:
- Add barriers: hardware/software locks for critical parts
- Minimise single points of failure: increase safety margins, exploit redundancy and allow recovery
- Isolate failures: module integrity
- Fail-safe: panic shut-downs, watchdog code
- Avoid common-mode failures: use diversity (different programmers, n-version programming)
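The fail-safe principle with watchdog code can be sketched as follows. This is a minimal illustration of the logic only: the class and function names are hypothetical, time is passed in explicitly to keep the example deterministic, and real systems typically rely on a hardware watchdog timer.

```python
class Watchdog:
    """Software watchdog: the supervised task must call kick() before the timeout expires."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        self.last_kick = now

    def kick(self, now):
        """Called by the supervised task to signal that it is still alive."""
        self.last_kick = now

    def expired(self, now):
        return (now - self.last_kick) > self.timeout_s


def supervise(watchdog, now, enter_safe_state):
    """Called periodically by an independent supervisor; trips the fail-safe on expiry."""
    if watchdog.expired(now):
        enter_safe_state()
        return True
    return False


events = []
wd = Watchdog(timeout_s=0.5, now=0.0)
wd.kick(now=0.3)  # the task is alive
supervise(wd, now=0.6, enter_safe_state=lambda: events.append("shutdown"))
supervise(wd, now=1.0, enter_safe_state=lambda: events.append("shutdown"))
print(events)  # ['shutdown']
```

The first check passes because the task reported in 0.3 s earlier. The second check finds no report for 0.7 s and forces the shut-down.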
3. Safety-Critical Software
Design principles:
- Fault tolerance: recovery blocks (if one module fails, execute an alternative module)
- Don't rely on run-time operating systems in time-critical solutions
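A recovery block can be sketched as a loop over alternative modules guarded by an acceptance test. The function names, the acceptance range and the deliberately faulty primary module below are hypothetical and serve only to show the control flow.

```python
def recovery_block(alternates, acceptance_test, state):
    """Try each alternate on a copy of the state; return the first accepted result."""
    for alternate in alternates:
        checkpoint = dict(state)  # restore point: every try starts from the same state
        try:
            result = alternate(checkpoint)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


# Hypothetical modules: a primary with a fault and a simpler alternate.
def primary(s):
    return s["a"] / (s["b"] - s["b"])  # deliberately faulty: division by zero


def alternate(s):
    return s["a"] / s["b"]


result = recovery_block([primary, alternate], lambda r: 0 <= r <= 100, {"a": 10, "b": 4})
print(result)  # 2.5
```

If every alternate fails, the block raises an error so that the caller can move the system to a safe state instead of continuing with an unchecked result.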
3. Safety-Critical Software
Reduction of hazardous conditions: summary
- Simplify: code contains only the minimum features, with no unnecessary or undocumented features and no unused executable code
- Diversity: data and control redundancy
- Multi-version programming: a shared specification can still lead to common-mode failures, and the synchronisation code increases complexity
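Multi-version programming is usually combined with a voter that compares the outputs of the versions. The sketch below uses a strict majority vote over three hypothetical versions, one of which is deliberately faulty. It does not model the synchronisation code mentioned above.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else raise."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 > len(outputs):
        return value
    raise RuntimeError("no majority: treat as a detected failure")


# Hypothetical independently written versions of the same specification.
def version_a(x):
    return x * x


def version_b(x):
    return x ** 2


def version_c(x):
    return x * 2  # deliberately faulty version


outputs = [version(3) for version in (version_a, version_b, version_c)]
print(majority_vote(outputs))  # 9
```

The voter masks the single faulty version, but it cannot help when two versions share the same error, for example one inherited from the shared specification.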
Verified software process
4. Testing
Testing is a process used to verify or validate a system or its components.
- Module testing: evaluates a small function of the hardware/software.
- System integration testing: investigates the correct interaction of modules.
- System validation testing: checks that the complete system satisfies its requirements.
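At module level, a test exercises one small function against its requirements, including what the function must not do. The module and its door-release rule below are hypothetical examples, not part of the course material.

```python
def door_release_allowed(speed_kmh, at_platform):
    """Hypothetical module under test: doors may open only when stopped at a platform."""
    return speed_kmh == 0 and at_platform


def test_door_release():
    # Positive functionality: what the module must do.
    assert door_release_allowed(0, True) is True
    # Negative functionality: what the module must not do.
    assert door_release_allowed(5, True) is False
    assert door_release_allowed(0, False) is False


test_door_release()
print("module tests passed")
```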
5. Safety Management
6. Certification
Process to indicate conformance with a standard, checked by an authorised body:
- National Safety Authority, Minister of Transportation
- International institutes and certified/notified bodies in the EU
Follow given guidelines, such as DO-178B, IEC or CENELEC norms.
Safety-Critical Systems
Further information:
- ERCIM working group on Formal Methods for Industrial Critical Systems (FMICS)
- International Conference on Computer Safety, Reliability and Security
Please submit your additional home assignments by 15 May 2008.
- References: OFFIS, I-Logix, KnowGravity