Published by Gilbert King. Modified over 8 years ago.
15 April 2010: Post Mortem Analysis by M. Zerlauth
MP3.1
Markus, Adriaan, Odd, Zinur and many others…

Outline
- How do we analyze powering events today?
- Is there room for improvement?
- What can we do to make it even better/easier?
- Current MP3/MPE tasks
- What can the PMA framework + HWC analysis do for MP3?
- Use-case - Demo
- Conclusions

How we analyze powering events today…
- Currently we rely on expert presence and follow-up for safe operation of the LHC magnet powering system.
- Despite the considerable effort (MP3 shifts, …), even today not all ‘critical’ events are analyzed in a deterministic way, for multiple reasons:
  - We don’t know (or are not told) that something happened, e.g. during the night.
  - Correlated/global events (>> buffers and circuit trips during e.g. tune feedback errors, global protection) are not completely analyzed for the time being.
  - We do things differently, we make mistakes or might overlook something, and there are no clear ‘analysis guidelines’ per circuit/equipment, …
- Attention and expert presence will (have to) decrease further in the months/years to come, e.g. future HWC campaigns will have to be done with << resources.
- In the long run, equipment safety must not depend on continuous expert presence, procedures, …; abnormal behavior must be detected and the equipment re-start inhibited automatically.
- If not done at HW level, SW can help to do so (SIS, PM, XPOC, Sequencer, …).

Examples where we might want to improve… (from our MP3 logbook)

Sector 3-4 SOC: - Feb 2010
CIRCUIT: many
COMMENT: This concerns a provoked quench on dipole A19R3 a few days ago (on 20/2/2010 at 23h11) at a current of 6 kA. It did not cause secondary quenches in other dipoles but caused trips in many other circuits. Nothing is mentioned in the MP3 logbook about this (nor in the QPS or OP logbooks)!!??
Circuits concerned:
- RB.A34
- RQS.L4B1
- ROD.A34B2
- RCO.A34B2
- RSS.A34B1
- RQTD.A34B2
- RQTL11.R3B2
- RCS.A34B2
- RQTL9.R3B2
- RSD2.A34B2
- RQT13.R3B2
- RQ6.R3B2
- bus RQD.A34
- bus RQF.A34
Maybe a point of discussion for MP3: do we have to analyze all trips/quenches (I think yes) or not? Quench signals on A19R3 look normal. I did not look at the others.

Major trip (all sectors but 56 & 81) - Feb 2010
- 18 kV failure in Meyrin
- Seen by the PC as external FPA
- About 400 PM files!!
- No heater discharged for the main circuits
- Could be interesting to have a check list for similar cases
- Temperature of RB, RQF, RQD resistors not increasing (<50 deg)
- There are QPS files for many circuits (in 600 A QPS1 (2), 600 A QPS2 (2), 600 A QPS4 (about 50)) and also for dipole; checked some of them and they are not quenches

What can be done to improve this?
- Yes, it’s a complex system, but we have the necessary experience from these past months of beam operation and previous years of hardware commissioning.
- We have plenty of tools, algorithms, etc. which currently support the expert analysis of powering events.
- To assure the safety of the magnet powering system while limiting the (expert) resources, we must automate our (repetitive) tasks as much as possible (using automated PM analysis, SW interlocks, sequencer checks, etc.).
- It is obvious that…
  - Automated analysis cannot cover the whole range, but it can help to analyze ALL PM buffers and bring forward those 10% which are not ‘normal’ for detailed manual analysis.
  - Quality of analysis modules is important (i.e. they have to be implemented and tested carefully to work as expected).
  - To work for operations (and HWC), analysis has to be purely data driven (i.e. no prior knowledge about the cause or type of failure).

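The idea of analyzing every PM buffer and bringing forward only the abnormal ones can be sketched as a simple triage loop. This is a minimal illustration, not the PMA framework's code: the buffer representation (a `dict`) and the `is_normal` check are assumptions made for the example.

```python
from typing import Callable, Iterable

def triage(buffers: Iterable[dict], is_normal: Callable[[dict], bool]) -> list[dict]:
    """Analyze every buffer and return only those needing manual follow-up."""
    flagged = []
    for buf in buffers:
        try:
            ok = is_normal(buf)
        except Exception:
            # An analysis module that fails is itself treated as 'not normal',
            # so that nothing is silently dropped.
            ok = False
        if not ok:
            flagged.append(buf)
    return flagged
```

Treating a failed check as abnormal keeps the analysis purely data driven while ensuring that incomplete buffers still reach an expert.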
Currently 3 categories of MP3 tasks / interventions
- ‘Standard’ operational tasks currently requiring QPS_OP or QPS_EXPERT rights (e.g. SEND_LOGGING, changing board A/B, RESET, …)
  - Commands that cannot put equipment in danger should be granted to operations (they are willing and asking to do more…).
- Detect degradation of equipment protection, e.g. heater redundancy, QPS_OK, …
  - This was done manually so far; inclusion in SIS prevents these things from going undetected and triggers the necessary actions early.
- Trips of circuits during operation (with or without beam), through analysis of Post Mortem data
  - Automated analysis can help to validate repetitive events, such as switch openings after global events, heater discharges, …
  - In case of malfunction, the framework can create injection inhibits in SIS and/or (super-)lock circuits to force an acknowledgement/intervention by an expert.

WHEN TO CALL MP3 OR MPE?
- In case of doubt! Call either MP3 or MPE
- Trip or QH discharge on IPQ/IPD/IT/RB/RQ: Call MP3
- Any FPA from QPS (see PIC): Call MP3
- Cannot close EE switches: Call MPE piquet
- QPS_OK lost: Call MPE
- QPS_OK lost, injection permit would need to be masked: Call MP3
- Any signal to be masked: Call MP3
- An event that is not fully understood: Inform MP3
- An event that is not fully understood and happens twice within 48 hours: Call MP3

WHEN TO CALL MP3.1 OR MPE?
- In case of doubt! Call either MP3 or MPE
- Trip or QH discharge on IPQ/IPD/IT/RB/RQ: Call MP3 -> QH Discharge tool + locking of circuit if NOT_OK
- Any FPA from QPS (see PIC): Call MP3 -> Powering Event analysis + circuit lock
- Cannot close EE switches: Call MPE piquet
- QPS_OK lost: Call MPE -> automatically captured by SIS, transfer rights to operations
- QPS_OK lost, injection permit would need to be masked: Call MP3 (only if it cannot be recovered by standard tasks such as SEND_LOGGING, RESET, etc.)
- Any signal to be masked: Call MP3
- An event that is not fully understood: Inform MP3
- An event that is not fully understood and happens twice within 48 hours: Call MP3 -> Automated analysis will fill a DB (or similar), making it easier to detect similar trips over time

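Once analysis results are stored per event, the "happens twice within 48 hours" rule becomes a simple query. The sketch below is an assumption about how such a check could look: the event signature (for instance a circuit name paired with a failure type) and the in-memory list standing in for the database are hypothetical.

```python
from collections import defaultdict
from datetime import timedelta

def repeated_within(events, window=timedelta(hours=48)):
    """Return the signatures that occur at least twice within `window`.

    `events` is an iterable of (signature, timestamp) pairs; the signature
    is a hypothetical key such as (circuit, failure_type).
    """
    by_signature = defaultdict(list)
    for signature, timestamp in events:
        by_signature[signature].append(timestamp)

    repeated = set()
    for signature, stamps in by_signature.items():
        stamps.sort()
        # Two consecutive occurrences closer than the window are enough.
        if any(later - earlier <= window
               for earlier, later in zip(stamps, stamps[1:])):
            repeated.add(signature)
    return repeated
```

Comparing only consecutive timestamps is sufficient, because any pair of occurrences inside the window implies a consecutive pair inside it as well.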
LHC PM Analysis framework

Detailed raw data/result data GUIs

Current state of PMA framework
Analysis framework includes (without going into details):
- Dependable data storage (+ client APIs) for storage of PM raw data
- Event building (ONLINE and OFFLINE identification of different event types)
- DEV and PRO servers for ONLINE (‘real time’ CCC operation) and PLAYBACK analysis (expert OFFLINE analysis of any prior event), including versioning in SVN
- Scheduling and execution of analysis configurations depending on event type
- Data consistency and completeness checks for dependent modules (exchange ‘contracts’)
- Execution environment for analysis modules (r/w of PM data, DB & Reference access, dump context, logger, …)
- Execution environment for specific data viewers (dedicated GUI framework)
- Integration with external systems (SIS, LHC sequencer, Page 1, DIP, Fixed Displays, PMA GUI, Database upload for event summaries)

Powering Event Analysis
- The PMA framework offers all necessary building blocks to schedule the execution of (any combination of) analysis modules in case of powering events (with or without beam) and to trigger the necessary actions/info towards the rest of the controls environment.
- Existing analysis for ‘global post mortems’ covers mostly beam-related systems (BLM, BPM, BIC, FMCM, LBDS, etc.).
- For powering event analysis it is vital to use the algorithms and tools developed during HWC.
- Some work is needed to make them more generic, fully automated and GUI-less (i.e. strip them of the ‘HWC knowledge’).
- Example: the Discharge analysis tool currently knows that the PM file originates from a PNO.d1 test. During operation, the tool needs to identify by itself that the contained current decay is a normal (or not!!) discharge following a Fast PA or Slow PA.
- [Figure: example of a HWC event as identified by PMA]

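A data-driven check of a current decay could, for example, fit a time constant and compare it with a reference value for the circuit. The sketch below is only an illustration under strong assumptions: it models the decay as a single exponential I(t) = I0·exp(-t/tau), and the function names, the reference `expected_tau` and the 10% tolerance are all hypothetical rather than taken from the HWC discharge tool.

```python
import math

def fit_time_constant(times, currents):
    """Least-squares fit of ln(I) = ln(I0) - t/tau; returns tau in seconds.

    Assumes strictly positive currents. Returns math.inf if the current
    does not decay (zero or positive slope).
    """
    logs = [math.log(i) for i in currents]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf
    return -1.0 / slope

def is_normal_discharge(times, currents, expected_tau, tolerance=0.1):
    """True if the fitted tau is within a relative `tolerance` of expected_tau."""
    tau = fit_time_constant(times, currents)
    return abs(tau - expected_tau) / expected_tau <= tolerance
```

A real module would additionally have to recognize from the data alone whether the decay follows a Fast PA or a Slow PA, which this sketch does not attempt.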
Powering Event Analysis configuration: 1st Proposal
[Diagram: analysis configuration; each block below is shown with a "Data" output]
- FGC Raw Data
- QPS Raw Data
- CRYO Raw Data
- WIC Raw Data
- PIC Raw Data
- FGC Ext
- MB Heater Discharge
- MQ Heater Discharge
- DQAMS 600A
- DQAMS RB/RQ
- DQAMGA
- FGC Faults
- SCE
- PIC_ISA
- Automatic Event Recog

Powering Event Analysis configuration: Example of SCE
[Diagram: analysis configuration; each block below is shown with a "Data" output]
- FGC Raw Data
- QPS Raw Data
- CRYO Raw Data
- WIC Raw Data
- PIC Raw Data
- FMCM Raw Data
- FGC Ext
- MB Heater Discharge
- MQ Heater Discharge
- DQAMS 600A
- DQAMS RB/RQ
- DQAMGA
- FGC Faults
- FMCM_ISA
- SCE
- PIC_ISA
- Automatic Event Recog
Result: 1 output file for circuit RSF1.A45B2, stating failure type, trip parameters, related buffers, …

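An analysis configuration of this kind is a dependency graph: each module may only run once the data it consumes is available. A minimal sketch of deriving an execution order with Python's standard `graphlib` (Python 3.9+) follows. The block names are taken from the diagram, but the dependency edges are assumed for illustration, since the arrows of the original figure are not recoverable from the text.

```python
from graphlib import TopologicalSorter

# Module -> set of inputs it depends on.
# Block names come from the slide; the edges are ASSUMED for illustration.
DEPENDENCIES = {
    "MB Heater Discharge": {"QPS Raw Data"},
    "FGC Faults": {"FGC Raw Data"},
    "PIC_ISA": {"PIC Raw Data"},
    "SCE": {"FGC Faults", "PIC_ISA"},
    "Automatic Event Recog": {"SCE", "MB Heater Discharge"},
}

def execution_order(dependencies):
    """Return an order in which every module runs after all of its inputs."""
    return list(TopologicalSorter(dependencies).static_order())
```

`TopologicalSorter` raises `graphlib.CycleError` for circular dependencies, which is a cheap consistency check on a configuration before it is scheduled.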
A first example…
LV module of Zinur for QH discharge analysis
http://elogbook/eLogbook/eLogbook.jsp?lgbk=350&date=20100330&shift=1
CIRCUIT: RQF.A12
COMMENT: RQF tripped at 8h51h17m at 2151 A, so about 90 sec after sector 81. Same reason as before. One dipole magnet fired the quench heaters, C19R1.

Possible modules in Powering Event analysis we have/can do today
General
- Dissection of powering events into circuit trips (SCE builder) [Started]
- Global event analysis/automatic event recognition [New]
Power converters (in principle ‘self-protected’)
- FGC_FAULTS (identification of ‘abnormal’ faults in the FGC buffer) to detect early signs of needed EPC interventions [Started]
- FGC_Discharge (analysis of discharge curve + 60/120A quenches) [Started]
QPS (capture and block upon critical issues)
- Heater Discharge (MB, MQ, IPQ/IPD and IT) [Started]
- Splice measurement -> pure Logging (triggering tbd, possibly by sequencer) [Started]
- nQPS analysis -> pure Logging [Started]
- DQAMS 600A (analysis of switch opening, redundancy, etc.) [New]
- DQAMS 13kA (analysis of switch opening, redundancy, etc.) [New]
- DQAMGA (quench analysis for 600A, identification of quench origin, …) [New]
- >6kA circuits, block via SIS/PIC and request manual analysis [New]

Conclusions
- To improve the long-term safety of powering equipment and limit expert interventions, now is the moment to transform our existing (!!) knowledge into additional automated tools.
- The PMA framework in combination with HWC analysis modules can assist a lot in consistent and deterministic analysis of powering events during operation, with or without beam.
- A mechanism for integration of LV code exists (first use-case integrated).
- Quite some additional development & validation work is still to be done by all of us who have developed modules for HWC, but this effort will pay off!
- Implementation of the missing building blocks and a working V1.0 (e.g. with heater discharge modules, circuit locking, Event DB) is possible for July.

PM in 2010 and beyond
- Extended PM data viewer (allowing correlation of PM data items, e.g. plotting beam losses against BPM pickup over time, …) -> Magnet quenches induced by beam losses!
- Extended integration of powering analysis to enforce automated analysis of relevant powering events even in the absence of experts/MP3… (so as not to miss anything important, as already happens now!!)
  - Dedicated analysis configuration for powering events, which will automatically archive all result files, populate the DB (OK/NOT_OK) and eventually block the next injection / circuit…
  - As planned since the beginning, it is vital to use the well-developed modules from HWC for this purpose, but they need to be stripped of the ‘HWC knowledge’. Example: the Discharge analysis tool is currently being told that the PM file originates from a PNO.d1 test. During operation, the tool needs to identify by itself that the contained current decay is a normal (or not!!) discharge following a Fast PA or Slow PA.
  - Modules for a 1st go: Heater discharge for MB/MQ (Zinur), Discharge tool, PIC analysis, SCE module + new modules for automated switch + quench analysis

Main design goals of PMA framework
- PM analysis should assist operations (and equipment experts) in analyzing events, to
  - Identify the event sequence leading to the dump/incident (initiating event)
  - Verify the protection functionality and give the green light to proceed
- In view of the diversity of equipment systems, event types, domain knowledge, etc. involved, we decided on a flexible and performant analysis framework (‘PM core’) in combination with analysis modules (provided by experts, operations, …), allowing for contributions from many people.
- For normal machine operation, analysis must be purely data driven (no prior knowledge as to the cause or nature of the event).
- Due to the vast amount of data, and to limit the need for experts in the analysis of ‘standard’ events, analysis must be fully automated (with logging of results).
- Today, the PMA framework is used operationally by Global PM analysis (mainly for beam-related equipment), Injection Quality Check (IQC) and eXternal Post Operational Check for the LBDS (XPOC).
  - IQC & XPOC result <= 20 s, PM preliminary result <= 60 s, final result <= 7 minutes after the event

Event building for powering events…
Maintain/extend current event building for SCE/MCE/…
- (+) More flexibility in (pre-)filtering, thus reducing the number of events to the interesting ones
- (-) Not necessarily useful info for analysis
Simple event building (Powering events)
- (-) Potentially more events; what to do with FGC_ext?
- (+) No additional logic in EB, simpler
- (+) More flexibility for HWC (every ‘event’ will be analyzed)
Link to HWC
- Any event will produce a result file for the circuit, which can be matched with executed tests (will need an additional class to search and pick up PMA results from DB, repository, …)

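A "simple" event builder with no additional logic can be pictured as grouping circuit trips purely by their proximity in time. The sketch below is an assumption about what such a builder might do, not the framework's actual event building; the `(timestamp, circuit)` tuple format and the 1 s default gap are illustrative values.

```python
def build_events(trips, max_gap=1.0):
    """Group (timestamp_seconds, circuit) trips into powering events.

    A new event starts whenever the gap to the previous trip exceeds
    `max_gap` seconds. The default gap is an illustrative value only.
    """
    events = []
    for timestamp, circuit in sorted(trips):
        # events[-1][-1][0] is the timestamp of the last trip in the last event.
        if events and timestamp - events[-1][-1][0] <= max_gap:
            events[-1].append((timestamp, circuit))
        else:
            events.append([(timestamp, circuit)])
    return events
```

With a small `max_gap` every isolated trip becomes its own event and is analyzed, which matches the "more events, but simpler" trade-off listed above.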
Modules in Powering Event analysis
Current HWC tools
- QPS (no automated analysis, dedicated displays for magnet, leads, heaters, etc.)
- QPS PCC 600A (no automated analysis)
- QPS Snapshot (not for operations)
- PNO2, PCS (dedicated to the HWC current cycle; algorithms re-usable if at stable current for correctors, but will only get FGC_ext buffers)
- Discharge (possibility for automation, needs fault type as input)
- PIC (possibility for integration, needs fault type as input)
- DFB, CRYO (???)