STPA FOR LINAC4 AVAILABILITY REQUIREMENTS


A. Apollonio, R. Schmidt
4th European STAMP Workshop, Zurich, 2016

CERN: Accelerators and Experiments
- CERN provides the world's largest and most complex scientific instruments to study the elementary particles.
- These instruments are particle accelerators and experiments.
- Accelerators boost beams of elementary particles to high energies before they are made to collide with each other.
- Experiments observe and record the results of these collisions.
- Our flagship project is the Large Hadron Collider (LHC)…
- The LHC relies on the reliable operation of its injectors, e.g. LINAC4.

CERN: Injector Chain LINAC4

Risks for High-Power Accelerators
- Not being able to complete the construction of the accelerator
  - Happened to other projects; the most expensive was the Superconducting Super Collider (SSC) in Texas, USA, with a length of ~80 km
  - Cost increased from 4.4 billion US$ to 12 billion US$; the US Congress stopped the project in 1993 after having invested more than 2 billion US$
- Not being able to operate the accelerator
  - Damage to the accelerator beyond repair due to an accident
  - No LHC: future of particle physics compromised

LHC: Hazards and Machine Protection
- Safety-critical:
  - 362 MJ stored beam energy
  - 9 GJ energy stored in the magnet powering system
- Complex:
  - Tens of thousands of interlocks
  - Mix of hardware, software, human interventions, procedures, etc.
- Need for an efficient method to address protection requirements consistently
- STPA was not used to develop the LHC machine protection system (MPS)
[Figures: arcing in an interconnection at the LHC in 2008; effect of 0.1% of the LHC beam energy (experiment at SPS)]

LHC MPS to prevent beam accidents
- Tens of thousands of interlocks, across more than 20 subsystems

Design principles for protection systems
- Efficient accelerator operation
  - Priority 1: Avoid accidents (which reduce availability and introduce repair costs)
  - Priority 2: Operate with high availability
- Failsafe design
  - Detect internal faults
  - If the protection system does not work, it is better to stop operation than to damage equipment (which would affect availability)
- Excellent diagnostics
  - Record all failures
- Flexibility: managing interlocks
  - Disabling of interlocks is common practice (keep track!)
  - LHC: masking of some interlocks is possible for low-intensity / low-energy beams
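The failsafe and masking principles can be illustrated with a minimal Python sketch. Everything in it is hypothetical (the `Interlock` class, the `beam_permit` function, the interlock names); it is not taken from LHC or Linac4 software. It assumes that a mask is honoured only for interlocks declared maskable and only while the beam is flagged as low intensity / low energy, and that every honoured mask is logged.

```python
from dataclasses import dataclass


@dataclass
class Interlock:
    name: str
    ok: bool                # True while the monitored condition is healthy
    maskable: bool = False  # only some interlocks may be masked
    masked: bool = False    # a mask has been requested by operations


def beam_permit(interlocks, low_risk_beam, mask_log):
    """Return True only if every interlock that is not validly masked is healthy.

    A mask is honoured only for a maskable interlock and only while the beam
    is flagged as low intensity / low energy. Every honoured mask is appended
    to mask_log so that disabled interlocks stay traceable.
    """
    permit = True
    for ilk in interlocks:
        if ilk.masked and ilk.maskable and low_risk_beam:
            mask_log.append(ilk.name)
            continue
        if not ilk.ok:
            permit = False  # failsafe: any unhealthy, unmasked interlock stops the beam
    return permit


# Hypothetical example: one healthy interlock, one unhealthy but masked interlock.
interlocks = [
    Interlock("vacuum", ok=True),
    Interlock("beam_loss_monitor", ok=False, maskable=True, masked=True),
]
log = []
print(beam_permit(interlocks, low_risk_beam=True, mask_log=log), log)
print(beam_permit(interlocks, low_risk_beam=False, mask_log=[]))
```

With the same interlock states, the permit is granted for the low-risk beam and refused otherwise, which is the behaviour the "keep track!" and masking bullets describe.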

LINAC 4
- New injector for the CERN accelerator complex
- Being commissioned; regular operation starting in the coming years

Motivation for the use of STPA
- Increasing accelerator complexity requires a systematic approach for the identification of machine protection requirements
- Address and optimize contradictory requirements (safety vs availability)
- Applicable from early design stages (not applied to a given design)
- Results should not concern only the system architecture, but also provide recommendations for correct operation and management of the accelerator
- Long-term goal: identify a suitable method for the design of machine protection systems for the next generation of particle accelerators
- As a start: apply the method for the first time to a small accelerator to verify its suitability → Linac4

STPA steps
- Step 1: Identify accidents and hazards
- Step 2: Draw the control structure
  - Controller + controlled process
  - Control actions + feedback
- Step 3: Identify Unsafe Control Actions
- Step 4: Identify Causal Factors
- (Step 5: Iterate 1 to 4 until suitable mitigation is found)
[Figure: control loop with Controller, Control Action, Controlled Process and Feedback] (J. Thomas, RSRA2015)

Step 1: Linac4 Accidents and Hazards
ACCIDENTS:
- A1: Lack of beam for other accelerators
- A2: Damage to accelerator equipment
- A3: Injuries to staff members
- A4: Release of radioactive material in the environment
HAZARDS (only related to A1):
- H1: Accelerator equipment is not ready for operation [A1, A2]
- H2: Beam is lost before reaching the transfer line [A1, A2]
- H3: Beam is stopped before reaching the transfer line when it is not necessary [A1]
- H4: Beam doesn't have the required quality for following accelerators
LINAC REQUIREMENTS:
- R1: Accelerator equipment must be operational [H1]
- R2: The beam must not be lost before reaching the transfer line [H2]
- R3: The beam must not be stopped when it is not necessary [H3]
- R4: The beam must have the required quality for following accelerators [H4]
[Figure label: transfer line for delivery]
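The bracketed tags on this slide form a traceability chain from requirements to hazards to accidents. A minimal Python sketch of that chain follows; the dictionary layout and the two helper functions are illustrative assumptions, not part of the original analysis. The tags are copied as they appear on the slide, so H4 is left without an accident tag.

```python
# Accidents, hazards and requirements as listed on the slide.
ACCIDENTS = {
    "A1": "Lack of beam for other accelerators",
    "A2": "Damage to accelerator equipment",
    "A3": "Injuries to staff members",
    "A4": "Release of radioactive material in the environment",
}

# hazard id -> (description, accident tags as printed on the slide)
HAZARDS = {
    "H1": ("Accelerator equipment is not ready for operation", ["A1", "A2"]),
    "H2": ("Beam is lost before reaching the transfer line", ["A1", "A2"]),
    "H3": ("Beam is stopped before reaching the transfer line when it is not necessary", ["A1"]),
    "H4": ("Beam doesn't have the required quality for following accelerators", []),  # no tag on the slide
}

# requirement id -> (description, hazard tags)
REQUIREMENTS = {
    "R1": ("Accelerator equipment must be operational", ["H1"]),
    "R2": ("The beam must not be lost before reaching the transfer line", ["H2"]),
    "R3": ("The beam must not be stopped when it is not necessary", ["H3"]),
    "R4": ("The beam must have the required quality for following accelerators", ["H4"]),
}


def hazards_without_requirement():
    """Hazards that no requirement refers to."""
    covered = {h for _, tags in REQUIREMENTS.values() for h in tags}
    return sorted(h for h in HAZARDS if h not in covered)


def hazards_without_accident():
    """Hazards that are not traced to any listed accident."""
    return sorted(h for h, (_, tags) in HAZARDS.items() if not any(a in ACCIDENTS for a in tags))


print("Hazards lacking a requirement:", hazards_without_requirement())
print("Hazards lacking an accident tag:", hazards_without_accident())
```

A check of this kind makes gaps in the tagging visible early, for instance the missing accident tag on H4.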

Step 2: Linac4 Control Structure
[Figure: control structure diagram. Recoverable labels:]
- Levels: Operational level (humans + software); Controls level; Technical level
- Personnel: LINAC operators; Technical personnel
- Control system: Personnel access control; Equipment control; Beam delivery control; Beam parameter control
- Signals: Regulation parameters; Beam stop trigger
- System under control (LINAC): Sensors and instrumentation; Protection actuators; Beam stop; Electrical supplies; Cooling; Vacuum; Beam generation; Beam bunching and focusing; Beam acceleration; Beam delivery

Step 3: Unsafe Control Actions
Control action: Beam stop
- Not providing causes hazard: UCA2, UCA4, UCA5, UCA2
- Providing causes hazard: UCA1
- Too early/too late, wrong order: UCA3
- Stopped too soon/applied too long: -
Unsafe Control Actions (UCAs):
- UCA1: The beam is stopped when it is not necessary (automatically or by an operator)
- UCA2: The beam is not stopped in a detected emergency situation (automatically or by an operator) due to the unavailability of an actuator
- UCA3: The beam is not stopped while personnel have access to the linac
- UCA4: The beam is not stopped following the missed detection of an undesirable accelerator configuration
- UCA5: The beam is not stopped when the beam quality is not sufficient for following accelerators
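The UCA table can be kept as structured data so that each unsafe control action carries its type. The Python sketch below is only an illustration: the class and field names are hypothetical, and the type assigned to each UCA follows one reading of the table row on this slide, which is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class UcaType(Enum):
    NOT_PROVIDED = "Not providing causes hazard"
    PROVIDED = "Providing causes hazard"
    WRONG_TIMING = "Too early/too late, wrong order"
    WRONG_DURATION = "Stopped too soon/applied too long"


@dataclass(frozen=True)
class Uca:
    uca_id: str
    control_action: str
    uca_type: UcaType  # assumed assignment, see lead-in
    description: str


UCAS = [
    Uca("UCA1", "Beam stop", UcaType.PROVIDED,
        "The beam is stopped when it is not necessary"),
    Uca("UCA2", "Beam stop", UcaType.NOT_PROVIDED,
        "The beam is not stopped in a detected emergency situation"),
    Uca("UCA3", "Beam stop", UcaType.WRONG_TIMING,
        "The beam is not stopped while personnel have access to the linac"),
    Uca("UCA4", "Beam stop", UcaType.NOT_PROVIDED,
        "The beam is not stopped following a missed detection"),
    Uca("UCA5", "Beam stop", UcaType.NOT_PROVIDED,
        "The beam is not stopped when the beam quality is not sufficient"),
]


def by_type(ucas):
    """Group UCA ids by type, keeping empty columns so the table stays complete."""
    table = {t: [] for t in UcaType}
    for uca in ucas:
        table[uca.uca_type].append(uca.uca_id)
    return table


for uca_type, ids in by_type(UCAS).items():
    print(f"{uca_type.value}: {', '.join(ids) if ids else '-'}")
```

Keeping the empty column explicit mirrors the "-" entry in the table and shows that the case was considered rather than forgotten.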

Identify Causal Factors J. Thomas, RSRA2015

Step 4: Causal Factors
UCA: a beam stop is executed when it is not necessary
Table columns: Scenario; Associated Causal Factors; Notes; Requirements
Scenario: Control input or external information wrong or missing: operator triggers an unnecessary beam stop
- Causal factor: Operator accidentally acts on the physical device connected to the controller.
  Notes: The emergency button in the control room is accidentally pushed.
  Requirements: Protect the physical device from accidental contact.
- Causal factor: Operator misinterprets feedback from instrumentation and triggers the beam stop.
  Notes: The operator misinterprets a signal, judging it to be a relevant deviation from the nominal configuration, and decides to stop the beam for safety reasons.
  Requirements: Train operators to use the software and processes running in the control room. Consider improving the GUI.
- Causal factor: Operator executes a command that triggers a dangerous situation and thus a beam stop.
  Notes: Operator tries to compensate a beam or hardware setting, but this leads to a dangerous state that requires a beam stop.
  Requirements: Train operators to use the software and processes running in the control room. Consider software that provides limits / warnings.
- Causal factor: Technical personnel try to access the linac while it is working, causing a beam stop.
  Notes: Technical personnel are unaware that the machine is running and try to access it.
  Requirements: Require authorization from the control room for machine access.
Scenario: Sensor - inadequate or missing feedback: the sensor feedback is wrong and automatically triggers a beam stop
- Causal factor: Sensor is faulty and causes a beam stop.
  Notes: A sensor gives wrong information and determines that a beam stop is needed, even if no direct machine harm exists.
  Requirements: A dedicated reliability analysis can assess the number and type of sensors needed to minimize the occurrence of false detection.
- Causal factor: Spurious trigger of a sensor causing a beam stop.
  Notes: A sensor signals a hazardous operating condition due to a spurious failure (e.g. radiation-induced).
  Requirements: Consider adding redundancy. When possible, locate sensors and instrumentation far from radiation-exposed areas.
'Practical' measures:
- Managerial and organizational measures
- Procedural measures
- Technical requirements: trigger further analyses with traditional methods
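The requirement to consider redundancy against spurious sensor triggers is the kind of technical requirement that a dedicated reliability analysis would quantify. As a rough illustration only, the Python sketch below compares a single sensor with a 2-out-of-3 voting arrangement. The voting scheme, the independence of the sensors and the numerical value of the spurious-trigger probability are all assumptions made for the example; none of them comes from the Linac4 analysis.

```python
from math import comb


def spurious_trip_probability(p, n, k):
    """Probability that at least k of n independent sensors trigger spuriously.

    p is the (assumed) probability that one sensor triggers spuriously in a
    given time window; the beam is stopped when k or more sensors agree.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p = 0.01  # hypothetical per-sensor spurious-trigger probability
single = spurious_trip_probability(p, n=1, k=1)
voted = spurious_trip_probability(p, n=3, k=2)
print(f"single sensor: {single:.6f}")
print(f"2-out-of-3 voting: {voted:.6f}")
```

Under these assumptions voting reduces the rate of unnecessary beam stops, at the cost of more hardware; the same formula applied to missed detections shows whether the protection function is degraded, which is why the choice belongs in a dedicated analysis.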

Linac4 Machine Protection

Main Outcomes of Linac4 STPA
- Availability-oriented design of the Machine Protection System (MPS)
  - Modular design of MPS → tree-like architecture
  - Management of beam destinations → external conditions
  - Flexibility of MPS → Software Interlock System
- Procedural/managerial measures
  - Definition of an MPS responsible for approval of changes/settings of the MPS
  - Document for MPS requirements during Linac4 commissioning
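One way to picture a tree-like, modular interlock architecture is a tree in which each node gives its permit only if its own condition and all of its children are healthy. The Python sketch below is a hypothetical illustration of that idea; the slides state only that the architecture is tree-like, so the aggregation rule, the `PermitNode` class and the node names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PermitNode:
    name: str
    ok: bool = True  # this node's own condition
    children: list = field(default_factory=list)

    def permit(self):
        """Permit is given only if this node and every node below it are healthy."""
        return self.ok and all(child.permit() for child in self.children)

    def blocking(self):
        """Names of the nodes whose own condition removes the permit."""
        found = [] if self.ok else [self.name]
        for child in self.children:
            found.extend(child.blocking())
        return found


# Hypothetical tree: module names are placeholders, not the real Linac4 layout.
source = PermitNode("source")
rf = PermitNode("rf", children=[PermitNode("rf_cavity_1"), PermitNode("rf_cavity_2")])
vacuum = PermitNode("vacuum")
root = PermitNode("linac_permit", children=[source, rf, vacuum])

print(root.permit(), root.blocking())
rf.children[1].ok = False  # simulate a fault in one module
print(root.permit(), root.blocking())
```

The `blocking` helper reflects the diagnostics principle: when the permit is removed, the structure itself says which module caused it.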

Summary
- STPA: suitable tool for hazard analysis of safety-critical systems in accelerators
  - Allows dealing with increasing system complexity
  - Results go beyond requirements for hardware design
- Successful application to Linac4 MPS
  - Set of availability requirements
  - Impact on Linac4 architecture design
- Needs to be complemented by other tools (e.g. fault trees)
  - In particular for sub-systems / components
- Independently of the method: spread safety culture for particle accelerators

Thanks a lot for your attention
References:
- J. Thomas, "The Reliability and System Risk Analysis (RSRA) Workshop", CERN, 2015 (link)
- N. Leveson, "An STPA primer" (link)
- A. Apollonio, "Machine Protection: Availability for Particle Accelerators" (link)
- A-STPA, University of Stuttgart (link)

Safety vs Reliability
- Reliability: component failures (bottom-up)
- Safety: system accidents (top-down)
- Extreme case: accidents can occur even when all components in the system behave reliably (examples: wrong requirements, change of boundary conditions, etc.)
(J. Thomas, RSRA2015)

Energy stored in one LHC beam
- 3×10^14 protons in each beam
- Stored energy per beam is 360 MJ
- Kinetic energy of a 200 m train at 155 km/h ≈ 360 MJ
Picture source: http://en.wikipedia.org/wiki/File:Alstom_AGV_Cerhenice_img_0365.jpg
Shared as: http://creativecommons.org/licenses/by-sa/3.0/deed.en [11]

Energy stored in the LHC magnet system
- Stored energy in the magnet circuits is 9 GJ
- Kinetic energy of an aircraft carrier at 50 km/h ≈ 9 GJ
- …can melt 14 tons of copper [11]
Picture source: http://militarytimes.com/blogs/scoopdeck/2010/07/07/the-airstrike-that-never-happened/
Shared as: public domain
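The orders of magnitude on the two energy slides can be checked with a few lines of arithmetic. The Python sketch below is a rough check, not a reproduction of the authors' calculation. It assumes a proton energy of 7 TeV (the LHC design value, which the slides do not state) and about 3×10^14 protons per beam, and it inverts E = ½·m·v² to see which train and ship masses the two comparisons imply.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # joules per electronvolt (exact SI value)


def beam_energy_joule(n_protons, proton_energy_ev):
    """Total kinetic energy stored in a beam of n_protons."""
    return n_protons * proton_energy_ev * ELEMENTARY_CHARGE


def mass_for_kinetic_energy(energy_joule, speed_kmh):
    """Mass in kg whose kinetic energy at speed_kmh equals energy_joule."""
    speed_ms = speed_kmh / 3.6
    return 2.0 * energy_joule / speed_ms**2


# Assumptions: 3e14 protons per beam (rounded), 7 TeV per proton.
beam_mj = beam_energy_joule(3e14, 7e12) / 1e6
train_tonnes = mass_for_kinetic_energy(360e6, 155) / 1000
carrier_tonnes = mass_for_kinetic_energy(9e9, 50) / 1000

print(f"beam energy: {beam_mj:.0f} MJ")
print(f"train mass implied by 360 MJ at 155 km/h: {train_tonnes:.0f} t")
print(f"ship mass implied by 9 GJ at 50 km/h: {carrier_tonnes:.0f} t")
```

With the rounded proton count the result comes out slightly below the quoted 360 MJ, which suggests that the figure on the slide is based on a somewhat higher, unrounded number of protons.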

Traditional Reliability Methods
- Qualitative: FMECA, Fault-Tree, …
- Quantitative: RBD, Fault-Tree, Markov models, …

STPA compared to traditional methods
- Qualitative method
- It can be used from early design stages
- Top-down
- 'Design independent'
- Suitable to handle increasing system complexity
- Guidelines provided to follow method steps (see references)
- Produces tables with Unsafe Control Actions (UCAs) and associated Causal Factors
- Readability similar to FMECA

OLD Step 4: Causal Factors
'Practical' measures:
- Managerial and organizational measures
- Procedural measures
- Technical requirements: trigger further analyses with traditional methods

OLD Control Structure
[Figure: earlier version of the control structure diagram. Recoverable labels:]
- Levels: Operational level (humans + software); Controls level; Technical level
- Personnel: LINAC operators; Technical personnel
- Control system: Personnel access control; Equipment control; Beam delivery control; Beam parameter control; Beam trajectory control
- Signals: Regulation parameters; Beam stop trigger
- System under control (LINAC): Sensors and instrumentation; Protection actuators; Beam stop; Electrical supplies; Cooling; Vacuum; Beam generation; Beam bunching and focusing; Beam acceleration; Beam delivery