Published by Laurel Glenn. Modified over 9 years ago.
1
STPA: A new hazard analysis technique based on the STAMP model of accident causation
2
Outline
- What is STPA
- STPA process
- Example: Robot
- Example: TCAS
- Comparison and Results
- In-class STPA
3
STAMP-Based Hazard Analysis (STPA)
Basic premise is to prevent accidents by enforcing safety constraints on system behavior (controlling hazardous system states)
Goals (same as any hazard analysis):
- Identification of system hazards and related safety constraints necessary to ensure acceptable risk
- Design for safety
- Accumulation of information about how hazards can occur. Use this information to eliminate, mitigate, and control hazards in system design, development, manufacturing, and operations
4
Controlling States
- Since hazardous states can be prevented through appropriate control (enforcing safety constraints), this hazard analysis method seeks to find instances of inadequate control
- Inadequate control occurs when there are state transitions to hazardous states
- The commands, decisions, or actions that lead to violation of safety constraints are called “Inadequate Control Actions”
5
Inadequate Control Actions
Identify inadequate control actions:
- A required control action is not provided or not followed
- An incorrect or unsafe control action is provided
- A potentially correct control action is provided too late or too early (at the wrong time)
- A correct control action is stopped too soon
Use automatic cruise control example
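The four types of inadequate control action can be written down as a small classifier. This is only a sketch: the `Observation` fields, the timing window, and the cruise-control numbers in the usage note are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ICA(Enum):
    """The four inadequate control action types listed on this slide."""
    NOT_PROVIDED = "required control action not provided or not followed"
    UNSAFE_PROVIDED = "incorrect or unsafe control action provided"
    WRONG_TIMING = "control action provided too late or too early"
    STOPPED_TOO_SOON = "correct control action stopped too soon"


@dataclass
class Observation:
    """What the controller actually did (all fields are hypothetical)."""
    provided: bool                 # was any control action issued?
    correct: bool = True           # was it the right action?
    start: Optional[float] = None  # time the action began, in seconds
    duration: float = 0.0          # how long the action was held, in seconds


def classify(obs, earliest, latest, min_duration):
    """Return the ICA type for an observation, or None if it was adequate."""
    if not obs.provided:
        return ICA.NOT_PROVIDED
    if not obs.correct:
        return ICA.UNSAFE_PROVIDED
    if obs.start is None or not (earliest <= obs.start <= latest):
        return ICA.WRONG_TIMING
    if obs.duration < min_duration:
        return ICA.STOPPED_TOO_SOON
    return None
```

As a usage example, suppose (hypothetically) that a cruise controller must release the throttle within 0.5 s of the brake pedal being pressed and hold it released for at least 1.0 s. Then `classify(Observation(True, True, 0.9, 2.0), 0.0, 0.5, 1.0)` reports `ICA.WRONG_TIMING`, because the action started after the window closed.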
6
Control Flaw Taxonomy
- Design of the control algorithm does not enforce constraints
-- Flaw(s) in creation process
-- Process changes without appropriate change in control algorithm (asynchronous evolution)
-- Incorrect modification or adaptation
- Process models are inconsistent, incomplete, or incorrect
-- Flaw(s) in updating process
-- Inadequate or missing feedback
--- Not provided in system design
--- Communication flaw
--- Time lag
--- Inadequate sensor operation
7
Control Flaw Taxonomy (cont)
- Time lags and measurement inaccuracies not accounted for
- Expected process inputs are wrong or missing
- Expected control inputs are wrong or missing
- Disturbance model is wrong
-- Amplitude, frequency, or period is out of range
-- Unidentified disturbance
- Inadequate coordination among controllers and decision makers
8
Inadequate Control Execution
Inadequate Execution of Control Actions
- Communication flaw
- Inadequate actuator operation
- Time lag
© Copyright Nancy Leveson, Aug. 2006
9
STPA: A New Hazard Analysis Technique
Based on STAMP
[Figure: generic control loop annotated with control flaws. Labels recovered from the diagram:]
- Controller: inadequate control algorithm; process model wrong; control input wrong or missing
- Actuator(s): inadequate control commands; inadequate actuator operation
- Controlled process: failure; process input wrong or missing; process output wrong or missing; disturbances unidentified or out of range
- Sensor(s): inadequate sensor operation; feedback wrong or missing
10
How to Perform STPA
- High-level hazard analysis: identify
-- Accidents
-- Hazards
-- High-level safety constraints
- Identify
-- Inadequate control actions
-- Control structure
-- Control flaws in the design
- Change the design to eliminate, mitigate, or control potentially unsafe control actions and behaviors, or accept them
- Iterate
11
Identifying and Specifying Safety Constraints
- Most requirements only specify nominal behavior
- Need to specify off-nominal behavior
- Need to specify what system and software must NOT do
- What the system must not do is not the inverse of what it must do
- Derive from system hazard analysis
12
Example: Mobile Robot
13
Thermal Tile Robot Example
1. Identify high-level functional requirements and environmental constraints, e.g. size of physical space, crowded area
2. Identify high-level hazards
a. Violation of minimum separation between mobile base and objects (including orbiter and humans)
b. Mobile robot becomes unstable (e.g., could fall over)
c. Manipulator arm hits something
d. Fire or explosion
e. Contact of human with DMES
f. Inadequate thermal control (e.g., damaged tiles not detected, DMES not applied correctly)
g. Damage to robot
14
Thermal Tile Robot Example (2)
3. Restate hazards as high-level safety constraints, e.g. "Robot must not allow humans to come in contact with DMES"
- Try to eliminate hazards from system design
- If they cannot be eliminated or adequately controlled at the system design level, refine the constraints and allocate them to system components
15
Design Constraints are Refined and Traced to Components
Mobile Base (MB):
Requirements:
- MB-FR1: The mobile base shall be able to carry all the mobile robot subsystem components [2.6.3(73)]
- MB-FR2: The mobile base shall be able to move smoothly in any direction and to cross cable covers on the floor [EA.3(15), H3(38), 2.6.2(73)]
- MB-FR3: The mobile base shall be able to raise its inspection and injection equipment to the level required for servicing the tiles, from 2.9 meters to 4 meters [EA.2(15), 2.6.3(73), (81), (81)]
16
Design Constraints are Refined and Traced to Components (2)
- MB-C1: The mobile base must be no more than 2.5 meters long and 1 meter wide. While moving, it must fit under structural beams as low as 1.75 meters [EA.2(15), 4.6)]
Safety-Related Design Constraints
- MB-SC1: The mobile base must be able to ensure accuracy of 10 cm for positioning and 1 mm for tile servicing (inspection and injection tasks) [EA.2(15), H4(38), 2.6.1(73), 2.6.4(73)]
- MB-SC2: The mobile base design must protect against fire and explosion [H6(39), 2.6.5(73), 2.6.6(73)]
- MB-SC3: It must be possible to move the mobile base out of the way in case of an emergency [2.9.2(79)]
17
Design Constraints are Refined and Traced to Components (3)
Motor Controller: The drivetrains for locomotion are within the diameter of the wheel hub and consist of a brushless DC motor, resolver for positioning and commutation, a brake, a cycloidal reducer providing 225:1 gear reduction with exceptional stiffness, and a locking hub that couples the output of the reducer to the wheel. The locking hub allows the operator to disengage the wheels from the drivetrain completely [MB-SC3(20)]
Rationale: In an emergency, the ability to disengage the wheels will allow towing or pushing the machine out of the way.
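Tracing constraints to design decisions, as the locking hub does for MB-SC3, can be checked mechanically. The sketch below assumes a simple dictionary layout that is not part of the slides; only the MB-SC identifiers and the locking-hub link come from the source, and the constraint texts are abbreviated.

```python
# Safety-related design constraints for the mobile base (abbreviated).
SAFETY_CONSTRAINTS = {
    "MB-SC1": "accuracy of 10 cm for positioning, 1 mm for tile servicing",
    "MB-SC2": "protect against fire and explosion",
    "MB-SC3": "mobile base can be moved out of the way in an emergency",
}

# Design decisions mapped to the constraints they trace to. Only the
# locking-hub link is stated in the slides; the layout is an assumption.
DESIGN_DECISIONS = {
    "locking hub disengages wheels from drivetrain": ["MB-SC3"],
}


def untraced(constraints, decisions):
    """Return the sorted constraint IDs that no design decision traces to."""
    covered = {cid for ids in decisions.values() for cid in ids}
    return sorted(cid for cid in constraints if cid not in covered)


def dangling(constraints, decisions):
    """Return (decision, id) pairs that cite an unknown constraint ID."""
    return [
        (decision, cid)
        for decision, ids in decisions.items()
        for cid in ids
        if cid not in constraints
    ]
```

With only the one design decision shown on this slide, `untraced(SAFETY_CONSTRAINTS, DESIGN_DECISIONS)` lists MB-SC1 and MB-SC2. That reflects what this excerpt covers, not a gap in the actual robot design.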
18
Define preliminary control structure and refine constraints and design in parallel.
19
Refinement and Allocation
- After defining initial control structure, refine constraints and design in parallel.
- Identify potentially hazardous control actions by each of the system components that would violate system design constraints. Restate as component safety design requirements and constraints.
- Perform hazard analysis using STPA to identify how safety-related requirements and constraints could be violated (the potential causes of inadequate control and enforcement of safety-related constraints).
- Augment the basic design to eliminate, mitigate, or control potential unsafe control actions and behaviors.
- Iterate over the process, i.e. perform STPA on the new augmented design and continue to refine the design until all hazardous scenarios are eliminated, mitigated, or controlled.
- Document design rationale and trace requirements and constraints to the related design decisions.
20
Try to eliminate hazards from system conceptual design. If not possible, then identify controls and new design constraints.
For unstable base hazard:
System Safety Constraint: Mobile base must not be capable of falling over under worst-case operational conditions
21
First try to eliminate:
- Make base heavy
-- Could increase damage if it hits someone or something
-- Difficult to move out of the way manually in an emergency
- Make base long and wide
-- Eliminates hazard but violates environmental constraints
- Use lateral stability legs that are deployed when manipulator arm is extended but must be retracted when mobile base moves
Two new design constraints:
- Manipulator arm must move only when stabilizer legs are fully deployed
- Stabilizer legs must not be retracted until manipulator arm is fully stowed
22
Identify potentially hazardous control actions by each of the system components:
- A required control action is not provided or not followed
- An incorrect or unsafe control action is provided
- A potentially correct or inadequate control action is provided too late or too early (at the wrong time)
- A correct control action is stopped too soon
Hazardous control of stabilizer legs:
- Legs not deployed before arm movement enabled
- Legs retracted when manipulator arm extended
- Legs retracted after arm movements are enabled or retracted before manipulator arm fully stowed
- Leg extension stopped before they are fully extended
23
Restate as safety design constraints on components
- Controller must ensure stabilizer legs are extended whenever arm movement is enabled
- Controller must not command a retraction of stabilizer legs when manipulator arm is extended
- Controller must not command retraction of stabilizer legs after arm movements are enabled
- Controller must not command retraction of legs before manipulator arm is fully stowed
- Controller must not stop leg deployment before the legs are fully extended
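A minimal sketch of how a combined arm and leg controller might enforce the interlock: arm movement is enabled only once the legs are fully extended, and the legs are retracted only when the arm is stowed and arm movement is disabled. The class, method, and state names are hypothetical, and the stored states are the controller's own beliefs, which a real design would have to keep consistent with sensor feedback.

```python
from enum import Enum


class Legs(Enum):
    RETRACTED = "retracted"
    EXTENDING = "extending"
    EXTENDED = "extended"


class Arm(Enum):
    STOWED = "stowed"
    EXTENDED = "extended"


class InterlockController:
    """Each command returns True if accepted, False if rejected as unsafe."""

    def __init__(self):
        self.legs = Legs.RETRACTED
        self.arm = Arm.STOWED
        self.arm_enabled = False

    def deploy_legs(self):
        if self.legs is Legs.RETRACTED:
            self.legs = Legs.EXTENDING
        return True

    def legs_fully_extended(self):
        """Feedback from a (hypothetical) limit sensor on the legs."""
        self.legs = Legs.EXTENDED

    def enable_arm(self):
        # Rejected while legs are retracted or still extending.
        if self.legs is not Legs.EXTENDED:
            return False
        self.arm_enabled = True
        return True

    def extend_arm(self):
        if not self.arm_enabled:
            return False
        self.arm = Arm.EXTENDED
        return True

    def stow_arm(self):
        self.arm = Arm.STOWED
        return True

    def disable_arm(self):
        # Arm movement stays enabled until the arm is stowed.
        if self.arm is not Arm.STOWED:
            return False
        self.arm_enabled = False
        return True

    def retract_legs(self):
        # Rejected while the arm is enabled or not fully stowed.
        if self.arm_enabled or self.arm is not Arm.STOWED:
            return False
        self.legs = Legs.RETRACTED
        return True
```

The intermediate `EXTENDING` state is one way to cover the "stopped too soon" case: a partially deployed leg never counts as extended, so the arm cannot be enabled on it.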
24
Do the same for all hazardous commands, e.g., arm controller must not enable manipulator arm movement before stabilizer legs are completely extended.
At this point, may decide to have arm controller and leg controller in same component
25
To produce detailed scenarios for violation of safety constraints, augment control structure with process models:
- Arm Movement: Enabled, Disabled, Unknown
- Stabilizer Legs: Extended, Retracted, Unknown
- Manipulator Arm: Stowed, Extended, Unknown
How could the process model become inconsistent with the real state? e.g. issue command to extend stabilizer legs but external object could block extension or extension motor could fail
26
Problems often in startup or shutdown:
e.g., Emergency shutdown while servicing tiles. Stability legs manually retracted to move robot out of way. When restart, assume stabilizer legs still extended and arm movement could be commanded. So use “unknown” state when starting up
Do not need to know all causes, only safety constraints:
- May decide to turn off arm motors when legs extended or when arm extended. Could use interlock or tell computer to power it off.
- Must not move when legs extended? Power down wheel motors while legs extended.
- Coordination problems
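The restart scenario can be illustrated with a process model that always comes up in the "unknown" state. This is a sketch under assumptions: the class and method names are hypothetical, and a single boolean sensor reading stands in for whatever feedback the real design provides.

```python
from enum import Enum


class LegState(Enum):
    EXTENDED = "extended"
    RETRACTED = "retracted"
    UNKNOWN = "unknown"


class ProcessModel:
    """The controller's belief about the stabilizer legs."""

    def __init__(self):
        # Nothing is carried over from before shutdown: the legs may
        # have been moved manually while the controller was off.
        self.legs = LegState.UNKNOWN

    def on_sensor_reading(self, legs_extended):
        """Update the belief from (hypothetical) leg position feedback."""
        self.legs = LegState.EXTENDED if legs_extended else LegState.RETRACTED

    def arm_movement_permitted(self):
        # UNKNOWN is treated like RETRACTED: arm movement is not allowed.
        return self.legs is LegState.EXTENDED


def restart():
    """Simulate a controller restart, which yields a fresh UNKNOWN model."""
    return ProcessModel()
```

Before an emergency shutdown the model may correctly believe the legs are extended. After `restart()`, `arm_movement_permitted()` stays false until a fresh reading arrives, so legs that were manually retracted during the shutdown cannot be mistaken for extended ones.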
27
Example: TCAS
28
Step 1: Identify hazards and translate into high-level requirements and constraints on behavior
TCAS Hazards:
- A near mid-air collision (NMAC): two controlled aircraft violate minimum separation standards
- A controlled maneuver into ground
- Loss of control of aircraft
- Interference with other safety-related aircraft systems
- Interference with the ground-based ATC system
- Interference with ATC safety-related advisory
System Safety Design Constraints:
- TCAS must not cause or contribute to an NMAC
- TCAS must not cause or contribute to a controlled maneuver into the ground
- …
29
Step 2: Define basic control structure
30
Component Responsibilities
TCAS:
- Receive and update information about its own and other aircraft
- Analyze information received and provide pilot with
-- Information about where other aircraft in the vicinity are located
-- An escape maneuver to avoid potential NMAC threats
Pilot:
- Maintain separation between own and other aircraft using visual scanning
- Monitor TCAS displays and implement TCAS escape maneuvers
- Follow ATC advisories
Air Traffic Controller:
- Maintain separation between aircraft in controlled airspace by providing advisories (control action) for pilot to follow
31
Aircraft components (e.g., transponders, antennas):
- Execute control maneuvers
- Receive and send messages to/from aircraft
- Etc.
Airline Operations Management:
- Provide procedures for using TCAS and following TCAS advisories
- Train pilots
- Audit pilot performance
Air Traffic Control Operations Management:
- Provide procedures
- Train controllers
- Audit performance of controllers
- Audit performance of overall collision avoidance system
32
For the NMAC hazard:
TCAS:
- The aircraft are on a near collision course and TCAS does not provide an RA
- The aircraft are in close proximity and TCAS provides an RA that degrades vertical separation
- The aircraft are on a near collision course and TCAS provides an RA too late to avoid an NMAC
- TCAS removes an RA too soon
Pilot:
- The pilot does not follow the resolution advisory provided by TCAS (does not respond to the RA)
- The pilot incorrectly executes the TCAS resolution advisory
- The pilot applies the RA but too late to avoid the NMAC
- The pilot stops the RA maneuver too soon
33
Step 3b: Use identified inadequate control actions to refine system safety design constraints
- When two aircraft are on a collision course, TCAS must always provide an RA to avoid the collision
- TCAS must not provide RAs that degrade vertical separation
- …
- The pilot must always follow the RA provided by TCAS
34
Step 4: Determine how potentially hazardous control actions could occur (scenarios of how constraints can be violated). Eliminate from design or control in design or operations.
- Step 4a: Augment control structure with process models for each control component.
- Step 4b: For each of the inadequate control actions, examine the parts of the control loop to see if they could cause it. Guided by a set of generic control loop flaws.
- Step 4c: Design controls and mitigation measures.
- Step 4d: Consider how designed controls could degrade over time.
35
36
TCAS does not provide an RA when required to avoid an NMAC
- Unit is not operational
-- Pilot does not turn it on
-- Self-monitor turns off TCAS unit
-- Component failure
- TCAS does not perceive a conflict
-- Current location of aircraft is incorrect
--- TCAS thinks other aircraft is on the ground
--- Incorrect altitude provided to TCAS
--- Uneven terrain
--- TCAS puts other aircraft outside protected volume
-- Location of own aircraft incorrect
--- Altimeter error
--- Delay in receipt of information about altitude change
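A causal breakdown like this one can be kept as a nested structure and flattened into one scenario per leaf for review. The sketch below encodes only the first two levels of the slide; the nested-dictionary representation and the function name are assumptions, not part of STPA itself.

```python
# First two levels of the causal breakdown from the slide.
# An empty dict marks a leaf (no further refinement recorded here).
SCENARIOS = {
    "TCAS does not provide an RA when required to avoid an NMAC": {
        "Unit is not operational": {
            "Pilot does not turn it on": {},
            "Self-monitor turns off TCAS unit": {},
            "Component failure": {},
        },
        "TCAS does not perceive a conflict": {
            "Current location of aircraft is incorrect": {},
            "Location of own aircraft incorrect": {},
        },
    }
}


def causal_paths(tree, prefix=()):
    """Yield one tuple per leaf, listing every node from root to leaf."""
    for node, children in tree.items():
        path = prefix + (node,)
        if children:
            yield from causal_paths(children, path)
        else:
            yield path
```

Calling `list(causal_paths(SCENARIOS))` gives one root-to-leaf path per candidate cause, which is a convenient unit for deciding whether each scenario is eliminated, mitigated, or controlled.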
37
Comparison with Traditional HA Techniques
- Top-down (vs. bottom-up like FMECA)
- Considers more than just component failure and failure events (includes these but more general)
- Guidance in doing analysis (vs. FTA)
- Handles dysfunctional interactions and system accidents, software, management, etc.
38
Comparisons (2)
- Concrete model (not just in head)
- Not physical structure (HAZOP) but control (functional) structure
- General model of inadequate control (based on control theory)
- HAZOP guidewords based on model of accidents being caused by deviations in system variables
- Includes HAZOP model but more general
- Compared with TCAS II Fault Tree (MITRE)
-- STPA results more comprehensive
-- Included Ueberlingen accident
39
Ballistic Missile Defense System (BMDS) Non-Advocate Safety Assessment using STPA
A layered defense to defeat all ranges of threats in all phases of flight (boost, mid-course, and terminal)
Made up of many existing systems (BMDS Element):
- Early warning radars
- Aegis
- Ground-Based Midcourse Defense (GMD)
- Command and Control Battle Management and Communications (C2BMC)
- Others
MDA used STPA to evaluate the residual safety risk of inadvertent launch prior to deployment and test
40
Results
- Deployment and testing held up for 6 months because so many scenarios identified for inadvertent launch. In many of these scenarios:
-- All components were operating exactly as intended
-- Complexity of component interactions led to unanticipated system behavior
- STPA also identified component failures that could cause inadequate control (most analysis techniques consider only these failure events)
- As changes are made to the system, the differences are assessed by updating the control structure diagrams and assessment analysis templates.
- Adopted as primary safety approach for BMDS
In the example above, neither of the causal factors identified by the assessment involved component failures. All components involved were operating exactly as intended; however, the complexity of their interactions led to unanticipated system behavior. Component failure can be a cause of inadequate control over a system’s behavior, and this assessment methodology does include those possibilities. However, many analysis techniques, particularly those based on reliability failure, consider only failure events, not the effects of complex system interactions.
41
In-Class STPA: Subway Train Doors
42
Train Doors
- What is/are the system goal(s)?
- What are the accidents?
- What are the hazards?
- Translate the hazards into safety constraints
44
What to do for Train Doors Exercise
To do in your groups or individually, your choice.
See slide 10, “How to do STPA”, and follow it to do an STPA hazard analysis on an existing train door design you are familiar with (feel free to add design changes that are interesting to you).
Be sure to include:
- Control structure and control loops, process models (the controller’s model of what the system/process is doing)
- The expected control inputs, measurements, sensors, etc. that you find to be relevant as you are going through the STPA process
- Inadequate control actions (slide 5)
- Control flaws and inadequate control executions (slides 6-9)
45
What to do for Train Doors Exercise
Once you’ve found the inadequate control actions and related control flaws and inadequate control executions, identify new safety constraints on the system and new design decisions to enforce the safety constraints (and prevent inadequate control).
We’ll talk about the Train Doors STPA in class next week. No need to turn in any papers, but bring what you’ve done so we can go over it as a group.
Feel free to contact me with questions. Maggie Stringfellow: