Software Engineering for Safety: a Roadmap
Ahmad Alsawi, Dapeng Xie, Vladimir Jakobac
Motivation
. Software has become an integral part of many safety-critical systems
. “The Nation depends on fragile software”… more work is needed
. More and more safety-critical systems are being built
. Demand for software engineering for safety is increasing as the amount of embedded software in safety-critical systems increases
4/18/2019
You Can’t Compromise
Some Issues in SE4Safety
. Hazard analysis
. Safety requirement specifications
. Designing for safety
. Testing
. Certifications & standards
Hazard Analysis
. Core of safe systems development
. Analyze the severity of effects and the likelihood of occurrence
. Decide which hazards to avoid and which to handle
. Identify s/w components that can contribute to or prevent hazards
. Generate a set of safety constraints and requirements
. Hazards: states that can lead to an accident
. Accidents: unplanned events
. Forward analysis methods identify the possibly hazardous effects of a failure
. Backward analysis methods investigate whether a hypothesized failure is credible
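The triage step on this slide (weigh severity against likelihood, then decide which hazards to avoid or handle) can be sketched as a small risk ranking. This is only an illustration: the ordinal scales, the multiplication rule, the threshold of 6, and the example hazards are all assumptions, not values from the slides or from any particular standard.

```python
from dataclasses import dataclass

# Hypothetical ordinal scales; a real project takes these from its own standard.
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3, "frequent": 4}


@dataclass(frozen=True)
class Hazard:
    name: str
    severity: str
    likelihood: str

    def risk(self) -> int:
        # Assumed scoring rule: product of the two ordinal ranks.
        return SEVERITY[self.severity] * LIKELIHOOD[self.likelihood]


def triage(hazards, threshold=6):
    """Rank hazards by risk and split them at an assumed acceptance threshold."""
    ranked = sorted(hazards, key=lambda h: h.risk(), reverse=True)
    must_handle = [h for h in ranked if h.risk() >= threshold]
    accepted = [h for h in ranked if h.risk() < threshold]
    return must_handle, accepted


# Made-up hazards for illustration only.
hazards = [
    Hazard("valve stuck open", "catastrophic", "remote"),
    Hazard("display flicker", "negligible", "frequent"),
    Hazard("sensor drift", "critical", "occasional"),
]
must_handle, accepted = triage(hazards)
```

Each hazard in `must_handle` would then be traced to the s/w components that can contribute to or prevent it, which is where the safety constraints and requirements come from.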
Req. Specification and Analysis
. Check that safety properties are preserved
. Formal specification (allows automation)
. Req. are internally consistent
. All data are used, all states are reachable
. Interactive theorem provers, model checkers
. [wrong] 1st identify critical s/w components, then analyze whether the hazard level is acceptable
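One of the automated checks that a formal specification allows, "all states are reachable", can be sketched as a breadth-first search over a state machine. The pump controller below is a made-up specification used only for illustration; real model checkers do far more than this.

```python
from collections import deque


def reachable_states(transitions, initial):
    """Return every state reachable from `initial` via breadth-first search."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_states(transitions, initial):
    """Return states that are declared or targeted but can never be entered."""
    all_states = set(transitions)
    for targets in transitions.values():
        all_states.update(targets.values())
    return all_states - reachable_states(transitions, initial)


# Hypothetical specification: state -> {event: next state}.
PUMP = {
    "idle": {"start": "running"},
    "running": {"stop": "idle", "fault": "safe_shutdown"},
    "safe_shutdown": {"reset": "idle"},
    "maintenance": {"done": "idle"},  # no transition leads into this state
}
unreachable = unreachable_states(PUMP, "idle")
```

Here `unreachable` contains `"maintenance"`, which flags either a missing transition or a leftover state in the requirements.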
Designing for Safety
. Design trade-offs: time to market, features, budget, …
. Will a fault-tolerance feature cause another hazardous condition?
. Vulnerability to simple design errors
  a.) Small errors tend to be neglected; “small errors have small consequences” is not true in s/w
  b.) Small error, e.g. the Mars Climate Orbiter spacecraft
. Limited use of known design techniques
  a.) Practitioners are not disciplined!
  b.) Rules are not followed, e.g. Aegis missile cruiser: bad data avoided by manual operator intervention
Current State
Testing: very important in both:
. Development of safe systems
. Certification of safe systems
Assumptions:
. About the environment
. About users
. About operation
A new approach
Current State (cont.)
Certification and Standards:
Certification:
. More complicated
. Less well-defined
Standards:
. Issue: which standards are appropriate for large, safety-critical systems composed of subsystems from different domains?
. Problems:
  a.) lack of guidance in existing standards
  b.) poor integration of software issues with system safety
  c.) heavy burden of making a safety case for certification
. Recommendations:
  a.) classifying and evaluating standards according to products, processes and resources
  b.) constructing domain-specific standards for products
Current State (cont.)
Resources:
. Books
  a.) Safeware by N. Leveson
  b.) Software Safety and Reliability by D. S. Herrmann
. Websites
  a.) Bowen’s website “Safety-Critical Systems”
  b.) a recent IEEE video on the subject: “Developing software for safety critical systems”
Directions for future work
Integration of informal and formal methods
. Three important working areas
  a.) automatic translation of informal notations into formal models
  b.) lightweight formal methods
  c.) integration of previously distinct formal methods
Directions… (cont.)
Constraints on safe product families and safe reuse
. Two research areas
  a.) Safety analysis of product families
    . A major goal
  b.) Safe reuse of COTS software
    . Two problems
Directions… (cont.)
Testing and evaluation
. Improve testing and evaluation through:
  a.) use of requirements-based testing
  b.) evaluation from multiple sources
  c.) model consistency testing
  d.) virtual environment simulations
. Often, in practice, additional safety requirements are discovered during design or integration testing, especially from testing of prototypes
. Include domain experts and Independent Verification and Validation
. Show not only that software does what it is supposed to do, but also that it cannot do what it is not supposed to do
Runtime monitoring
. Detection of hazardous states
. Well suited to monitoring for known hazardous conditions
. Remote agent software can diagnose broader mismatches between expected and actual behavior and recommend recovery actions (figure)
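Runtime monitoring for known hazardous conditions can be sketched as a set of named predicates evaluated against each state sample. The class name, the state fields (`pressure_kpa`, `heater_on`, `water_level`) and the limits below are hypothetical, chosen only to make the idea concrete.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Predicate = Callable[[State], bool]


class HazardMonitor:
    """Checks each state sample against registered hazardous-state predicates."""

    def __init__(self) -> None:
        self._checks: Dict[str, Predicate] = {}

    def register(self, name: str, is_hazardous: Predicate) -> None:
        self._checks[name] = is_hazardous

    def check(self, state: State) -> List[str]:
        # Return the names of all hazardous conditions that hold in this state.
        return [name for name, pred in self._checks.items() if pred(state)]


monitor = HazardMonitor()
# Assumed limits, for illustration only.
monitor.register("overpressure", lambda s: s["pressure_kpa"] > 800.0)
monitor.register(
    "heater_on_while_dry",
    lambda s: s["heater_on"] == 1 and s["water_level"] < 0.1,
)

samples = [
    {"pressure_kpa": 400.0, "heater_on": 1, "water_level": 0.6},
    {"pressure_kpa": 850.0, "heater_on": 1, "water_level": 0.05},
]
alarms = [monitor.check(s) for s in samples]
```

A monitor like this only catches conditions someone thought to register, which matches the slide's point that it suits known hazards; diagnosing broader mismatches between expected and actual behavior needs a model of the expected behavior as well.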
Directions… (cont.)
Education
. More scientific
. University courses
. Textbooks
Related areas
. Safety: a subset of survivability, security?
  a.) Safety: freedom from accidents or losses; threats to life or property; focuses on well-intended actions; preventing more general malicious activities
  b.) Security: threats to privacy or national security; focuses on malicious actions; preventing unauthorized access
  c.) Survivability: the ability to satisfy certain specified critical requirements (for example, security, reliability, real-time responsiveness, and correctness) in the face of adverse conditions; in some cases it may require reconfigurability, interoperability, etc.
. Software architecture: safety consequences of product lines
. Human factors engineering: formal specification of mental models in order to have more accurate safety requirements…
Software Fault Tree Analysis
. Hazard events represented by nodes
. AND/OR gates
. Domino effect
. Errors in the requirements phase
. Automated analysis with human interaction
. Example taken from: http://www.cs.cmu.edu/~koopman/des_s99/safety_critical/
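The structure described on this slide (hazard events as nodes, combined through AND/OR gates) can be sketched as a small recursive evaluator. The tree below is a made-up example, not the one from the cited page, and the event names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Event:
    """Leaf node: a basic event that either has or has not occurred."""
    name: str


@dataclass
class Gate:
    """Intermediate node combining its inputs with AND or OR."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", Event]]


def occurs(node: Union[Gate, Event], active: Set[str]) -> bool:
    """Return True if the node's event occurs given the set of active basic events."""
    if isinstance(node, Event):
        return node.name in active
    results = [occurs(child, active) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree: the top hazard occurs if the operator overrides the
# interlock, OR if a bad sensor reading AND a missing range check coincide.
top = Gate("unintended dose", "OR", [
    Event("operator override"),
    Gate("bad value accepted", "AND", [
        Event("sensor reads low"),
        Event("range check missing"),
    ]),
])
```

Evaluating `occurs(top, {"sensor reads low"})` returns `False`, while adding `"range check missing"` to the set makes the top event occur. That is the domino effect the slide mentions: a requirements-phase omission turns an otherwise tolerated fault into a hazard.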
The Way Forward
. Placing too much reliance on probabilistic risk assessment is unwise
. Building safety into a system will be much more effective than adding protection devices onto a completed design
. The earlier safety is considered in the development process, the better the results will be
. Safety is a system problem and can only be solved by experts in different disciplines working together
. Software engineers must understand SYSTEM safety concepts and techniques
. Automate the process of safety analysis
. Tools able to evolve dynamically over time