TRACKING OF FAULTS AND FOLLOW-UP Accelerator Fault Tracking project Jakub Janczyk (TE-MPE-PE / BE-CO-DS) with input from: Andrea Apollonio, Chris Roderick,


1 TRACKING OF FAULTS AND FOLLOW-UP
Accelerator Fault Tracking project
Jakub Janczyk (TE-MPE-PE / BE-CO-DS)
With input from: Andrea Apollonio, Chris Roderick, Rudiger Schmidt, Benjamin Todd, Daniel Wollmann

2 Agenda
- Purpose of fault tracking
- What has been done in the past
- Accelerator Fault Tracking project: plans & status
- Summary
10/14/2014, R2E/Availability Workshop

3 Purpose of fault tracking
Complete and consistent tracking makes it possible to identify:
- Problems as early as possible, to allow for timely mitigation
- Key issues which will limit the performance of accelerators or equipment in the future (Run2, Run3, HL-LHC)
Increase availability, in both the short and the long term, by dealing with issues as soon as possible
Track faults in two areas:
1. Faults directly affecting accelerator operation: identify root causes (e.g. R2E effects, glitches in the electrical network, etc.)
2. Equipment (electronic) faults, independently of their immediate impact on accelerator operation

4 What has been done in the past
- Many different tools for logging faults, used by different teams: eLogbook, Post-Mortem, RadWG page, tools in equipment groups (JIRA, Excel, OneNote, eLogbook)
- A lot of effort was required from individual teams and working groups to gather and exploit fault data
- Nevertheless, it was difficult to get a consistent picture

5 Credit M. Brugger

6 Cardiogram - "life" of the LHC from an operational point of view
- Graphical analytic tool for combining data from different sources
- Initially created by members of the Availability WG: B. Todd, L. Ponce, A. Apollonio
- Tedious work to gather and prepare all the necessary data: several months for the 2010-2012 cardiogram

7 Cardiogram - example
Figure labels: Accelerator Mode (Proton Physics, Ion Physics, etc.), Access, Fill Number, Particle Momentum, Beam Intensities, Stable Beams, PM Beam Dump, Beam Dump Classification, Fault, Fault Lines (Systems / Fault Classifications)
Credit: AWG

8 Cardiogram - data preparation
Credit: Benjamin Todd


10 Accelerator Fault Tracking project
Project launched February 2014 (BE/CO, BE/OP, TE/MPE collaboration)
Based on initial inputs from:
- Evian Workshops
- Availability Working Group
- Workshop on Machine Availability & Dependability for Post-LS1 LHC
- BE/OP
Goals:
- Capture consistent and complete fault data
- Facilitate fault tracking from the perspective of all interested parties (OP, equipment groups, working groups)
- Single source of data: easier to complete, clean and analyse
- Provide consistent, standardized statistics, analyses and reports for different users (8:30 meetings, weekly reports / summaries)
- Interactive overview of faults (cardiogram on demand)
- Proactively identify incomplete data
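The goal of proactively identifying incomplete data can be illustrated with a minimal Python sketch. The record layout and all names here (`FaultRecord`, `root_cause`, `incomplete_faults`) are assumptions made for illustration; they are not taken from the AFT data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Optional


@dataclass
class FaultRecord:
    """Hypothetical fault record; field names are illustrative, not the AFT schema."""
    fault_id: int
    system: str                       # e.g. "QPS", "Cryogenics"
    start: datetime
    end: Optional[datetime] = None    # None while the fault is still open
    root_cause: Optional[str] = None  # e.g. "R2E", "electrical network glitch"


def missing_fields(fault: FaultRecord) -> List[str]:
    """Return the names of the fields that still have to be filled in."""
    missing = []
    if fault.end is None:
        missing.append("end")
    if not fault.root_cause:
        missing.append("root_cause")
    return missing


def incomplete_faults(faults: Iterable[FaultRecord]) -> Dict[int, List[str]]:
    """Map each incomplete fault id to the list of fields it is missing."""
    report = {}
    for fault in faults:
        missing = missing_fields(fault)
        if missing:
            report[fault.fault_id] = missing
    return report
```

A report of this kind could be sent to the responsible team, so that gaps are closed while the details of the fault are still fresh.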

11 Plans (as presented by Chris Roderick @ LMC 30-04-2014)
Provide infrastructure to consistently and coherently capture, persist and make available accelerator fault data for further analysis.
Foreseen project stages:
1. Put in place a fault tracking infrastructure to capture LHC fault data from an operational perspective
   - Enable data exploitation by others (e.g. AWG and OP) to identify areas to improve accelerator availability for physics
   - Ready before LHC beam commissioning
   - Infrastructure should already support capture of equipment group fault data, but this is not the primary focus
2. Focus on equipment group fault data capture
3. Explore integration with other CERN data management systems (e.g. Infor EAM)
   - Potential to perform deeper analyses of system and equipment availability
   - In turn, start predicting and improving dependability
To support data analysis, the AFT data extraction infrastructure should also provide data complementary to the actual fault data, such as accelerator operational modes and states.
Scope: initial focus on the LHC, but the aim is to provide a generic infrastructure capable of handling fault data of any CERN accelerator.
(Timeline graphic: a "We are here..." marker along a time axis)

12 Status
- AFT is under development: a Web application, available to different users, with eLogbook integration for LHC operators
- Functionalities available from day 1 will be as planned for the first stage of the project
- AFT test version available
- We are open to starting discussions with equipment groups: acc-fault-tracking-team@cern.ch


16 Turnaround Time

17 Summary
- Consistent and complete tracking of faults is the key to identifying and efficiently mitigating issues
- The AFT will ease the recording of faults and their root causes in a complete and consistent way
- Run2 data will be essential to identify future performance and availability limitations towards HL-LHC
- The quality and completeness of the data require effort from all involved parties
- Open to discussing the integration of equipment group data

18 Questions

19 Extra Slides

20 Roles and simplified workflow

21 Figure panels: 2011, 2010, 2012

22 Multiple failures
It is easy to see whether there are multiple failures at the same time, but it is not obvious whether they are related. One of the goals of the AFT project is to capture data that will make it possible to show the relations between faults.
Examples (figure annotations):
- Related faults: a water leak and the problems caused by the water leak
- Unrelated faults: the QPS failed, and the rest are accesses in its shadow
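The distinction between faults that merely coincide in time and faults that are actually related can be sketched as follows. This is only an illustration under stated assumptions: the `caused_by` link and the names `Fault`, `overlaps` and `classify_pairs` are hypothetical and do not describe how the AFT stores relations.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Fault:
    """Hypothetical fault with an optional explicit link to a parent fault."""
    fault_id: int
    description: str
    start: datetime
    end: datetime
    caused_by: Optional[int] = None  # assumed field: id of the fault that caused this one


def overlaps(a: Fault, b: Fault) -> bool:
    """True if the two faults are active at the same time at some point."""
    return a.start < b.end and b.start < a.end


def classify_pairs(faults: List[Fault]) -> List[Tuple[int, int, bool]]:
    """For every pair overlapping in time, report whether a relation is recorded."""
    result = []
    for i, a in enumerate(faults):
        for b in faults[i + 1:]:
            if overlaps(a, b):
                related = a.caused_by == b.fault_id or b.caused_by == a.fault_id
                result.append((a.fault_id, b.fault_id, related))
    return result
```

Time overlap alone flags candidate pairs; only the explicitly recorded link says that one fault caused another, which is why the relation has to be captured as data rather than inferred from the timeline.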

23 Access without faults
- In 2012, there were around 40 accesses without any associated fault
- The reasons for these accesses are not classified, but often something is repaired
- Inconsistent data: the cardiogram makes it possible to spot this
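The consistency check that the cardiogram supports visually can also be expressed as a small Python sketch: list every access period that overlaps no recorded fault. Representing accesses and faults as plain (start, end) pairs, and the function name `accesses_without_fault`, are assumptions made for this illustration.

```python
from datetime import datetime
from typing import List, Tuple

# An interval is a (start, end) pair; this representation is an assumption.
Interval = Tuple[datetime, datetime]


def accesses_without_fault(accesses: List[Interval],
                           faults: List[Interval]) -> List[Interval]:
    """Return the access periods that do not overlap any recorded fault."""
    unexplained = []
    for access_start, access_end in accesses:
        has_fault = any(
            access_start < fault_end and fault_start < access_end
            for fault_start, fault_end in faults
        )
        if not has_fault:
            unexplained.append((access_start, access_end))
    return unexplained
```

Each returned period is a candidate for follow-up: either a fault was not recorded, or the access had another reason that should be classified.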

24 Access without faults - examples
Figure annotations:
- A few accesses: ATLAS, change of PC, repair of QPS, intervention on the crates of the BPMD
- LHCb: fixing muon detectors
- Accesses in the shadow of a QPS failure: QPS (reset cards), ALICE and CMS, Cryogenics (valve regulation), RF (replacing a broken attenuator)
- ATLAS access

