Chapter 6: Computers in Safety-Critical Systems

- Introduction
- “Safety-critical” and other definitions
- How and why do failures occur?
- Risk analysis
- Evaluating software
- A case study of safety-critical failures
- Model verification and validation

When human welfare is at stake, the price of haphazard practices is severe, and computing professionals must exercise extreme care to ensure that a system is safe. Two requirements must be satisfied:
1) Have some idea of the techniques needed to develop computer systems that are as safe as is practically possible.
2) Be able to arrive at a reasonably objective assessment of exactly what that level of safety is.

Introduction
How do accidents happen? Most accidents are caused by a combination of factors:
- organizational,
- managerial,
- technical, and
- sociological or political.
Preventing an accident requires paying attention to all of the root causes.

“Safety-critical” and other definitions
Safety-critical systems are sometimes seen as systems with a component of real-time control that can have a direct life-threatening impact.
“Managing Murphy’s Law”
Examples:
- Aircraft industry
- Medical treatment systems
- Nuclear power plants
- Missile systems
What do we need?
- A thorough risk assessment of a system
- Key terms: risk, hazard, and reliability
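
The slide names risk, hazard, and reliability without defining them. One common convention in risk assessment, assumed here and not taken from the source, scores each hazard as likelihood times severity and ranks hazards by that score. The sketch below shows the idea; the `Hazard` class, the 1 to 5 scales, and the example hazards are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    """A hypothetical hazard record; the 1..5 scales are an assumption."""
    name: str
    likelihood: int  # 1 (rare) .. 5 (frequent)
    severity: int    # 1 (negligible) .. 5 (catastrophic)

    @property
    def risk(self) -> int:
        # Assumed convention: risk score = likelihood x severity.
        return self.likelihood * self.severity


def rank_hazards(hazards):
    """Return hazards ordered from highest to lowest risk score."""
    return sorted(hazards, key=lambda h: h.risk, reverse=True)


# Made-up example data, for illustration only.
hazards = [
    Hazard("operator data-entry error", likelihood=4, severity=3),
    Hazard("sensor failure", likelihood=2, severity=5),
    Hazard("display glitch", likelihood=3, severity=1),
]

ranked = rank_hazards(hazards)
for h in ranked:
    print(f"{h.name}: risk score {h.risk}")
```

A single score like this hides a lot: a rare but catastrophic hazard can rank below a frequent nuisance, which is one reason a thorough assessment looks at likelihood and severity separately as well.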

How and why do failures occur?
- Failures are difficult to assess and predict.
- Example: the failure of three independent disk drives resulted in the failure of the Toronto Stock Exchange.
- Is a computer model an abstraction of the real world?
- What is the danger of such an engineering view?

Risk analysis
The responsibility of managers is to take the results of risk analysis seriously and act on them by ensuring that the system is managed conscientiously, by:
- selecting the proper people,
- training them, and
- not overworking them.
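
The difficulty of predicting failures, illustrated by the disk-drive example in this section, can be made concrete with a small calculation. The sketch below compares the chance that all redundant units fail under a pure independence assumption with the chance under a simplified common-cause model. The function names, the model, and the numbers are hypothetical and are not data from the Toronto Stock Exchange incident.

```python
def all_fail_independent(p: float, n: int) -> float:
    """Probability that all n units fail if each fails independently with probability p."""
    return p ** n


def all_fail_with_common_cause(p: float, n: int, c: float) -> float:
    """Simplified model (an assumption): with probability c a shared cause
    takes out every unit at once; otherwise the units fail independently."""
    return c + (1.0 - c) * p ** n


# Hypothetical numbers chosen only to show the effect.
p = 0.01   # per-unit failure probability
c = 0.001  # probability of a shared cause (power, firmware, operator error)

independent = all_fail_independent(p, 3)
with_common = all_fail_with_common_cause(p, 3, c)

print(f"independent model:  {independent:.3e}")
print(f"common-cause model: {with_common:.3e}")
```

Under these assumed numbers the shared cause dominates the result, so an analysis that relies on independence alone can badly underestimate the risk of a total failure.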

Evaluating software
Computers now have safety-critical functions in:
- both military and civilian aircraft
- nuclear plants
- medical devices
It is incumbent upon those responsible for programming, purchasing, installing, and licensing these systems to determine whether or not the software is ready to be used.
- What standard must a software product satisfy if it is to be used in safety-critical applications?
- What documentation should be required?
- How much testing is required?
- How should the software be structured?

A case study of safety-critical failures
The Therac-25 accident: a failure in a radiation-therapy system.
- Errors occurred in special timing-dependent sequences of events in the user interface.
- Was it a software error? An appropriate software/hardware interlock could have prevented the events.
- What was the role of medical doctors in this accident?
- Should we use software in a safety-critical system?
- How was the standard used in the system?
- Were the redundant safety checks sufficient?

Model verification and validation
The numerical model of a real system must go through an extensive verification and validation process.

Summary
- Risk assessment is very difficult to do.
- Software models of real-world systems can never fully represent all cases.
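
As an illustration of the interlock point raised in the case study, the sketch below shows the general idea: a dangerous action is allowed only when an independently sensed hardware state agrees with what the software believes. This is a generic, hypothetical example and not a model of the actual Therac-25 design; `MachineState`, `fire_beam`, and the mode names are invented for illustration. A real interlock would be enforced in hardware, independent of the control software.

```python
from dataclasses import dataclass


class InterlockError(RuntimeError):
    """Raised when the interlock refuses to allow the action."""


@dataclass
class MachineState:
    # Hypothetical fields: what the software believes vs. what a sensor reports.
    software_mode: str
    sensed_hardware_mode: str


def fire_beam(state: MachineState) -> str:
    """Allow the action only if the sensed hardware mode matches the software mode."""
    if state.software_mode != state.sensed_hardware_mode:
        raise InterlockError(
            f"mismatch: software={state.software_mode!r}, "
            f"hardware={state.sensed_hardware_mode!r}"
        )
    return f"beam fired in {state.software_mode} mode"


# Consistent state: the action proceeds.
print(fire_beam(MachineState("mode_a", "mode_a")))

# Inconsistent state (for example after a timing-dependent edit): refused.
try:
    fire_beam(MachineState("mode_a", "mode_b"))
except InterlockError as err:
    print("interlock tripped:", err)
```

The design choice is that the check does not trust the software's own view of the machine; it compares that view against an independent reading and fails safe on any disagreement.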