TE-MPE-CP, RD, 28-Sep
Problems with QPS DAQ Systems During LHC Operation, 1st Results from 2010 CNRAD Tests
R. Denz, TE-MPE-CP
Problems with stalled DAQ systems of QPS protection crates type DQLPU during LHC operation
The QPS DAQ systems use hardwired triggers to signal a trigger of a quench detection system etc.
–The hardwired trigger link permits time-stamping of the event with a precision of 1 ms, as required for post mortem analysis
During 2010 LHC operation several DAQ systems supervising protection units type DQLPU (main magnet protection) stalled after a spurious trigger not linked to a real trigger of a detection system
–Recovery only after a power cycle, requiring access to the LHC machine
–Similar events already observed during electrical tests
   Electrical breakdowns during isolation tests of the LHC main superconducting circuits
–Events are normally non-destructive
Problems with stalled DAQ systems of QPS protection crates type DQLPU during LHC operation
Observed events
Date | Crate | Remark
 | A8L7 | Close to cleaning insertion
 | A8R8 | Close to TI
 | B23L1 | During technical stop
 | B9L2 | Close to TI
 | A8R8 | Close to TI
 | A9L2 | Close to TI
 | C14R2 | BLM threshold test Q14 (2RM10S > 700 SEU counted)
–Apart from the incident during the technical stop, all other events correlated with beam losses
–The problem concerns only the DAQ system, not the detection electronics
–The field-bus coupler has not been affected
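The events listed on this slide can be tallied in a few lines. The sketch below only transcribes the crate names and remarks given here (no dates are used); the variable names are illustrative, and treating every event other than the technical-stop one as beam related follows the statement on the slide rather than any additional data.

```python
from collections import Counter

# (crate, remark) pairs transcribed from this slide
events = [
    ("A8L7", "Close to cleaning insertion"),
    ("A8R8", "Close to TI"),
    ("B23L1", "During technical stop"),
    ("B9L2", "Close to TI"),
    ("A8R8", "Close to TI"),
    ("A9L2", "Close to TI"),
    ("C14R2", "BLM threshold test Q14"),
]

# Number of events per remark
by_remark = Counter(remark for _, remark in events)

# Crates that stalled more than once
crate_counts = Counter(crate for crate, _ in events)
repeat_crates = sorted(c for c, n in crate_counts.items() if n > 1)

# Per the slide, everything except the technical-stop incident
# was correlated with beam losses
beam_related = sum(1 for _, r in events if r != "During technical stop")

print(len(events), beam_related, repeat_crates, by_remark.most_common(1))
```

The tally makes the clustering visible: most events sit close to TI, and one crate (A8R8) appears twice.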
Problems with stalled DAQ systems of QPS protection crates type DQLPU during LHC operation
Analysis:
–The DAQ system gets stalled by a permanent logic low state on the trigger input
–The problem can be partially cured by a firmware upgrade of the DAQ system, which will indicate the fault but allow LHC operation to continue
   Access still required, but it can be organized at a more suitable time, e.g. within a normal accelerator stop
–Analysis of the hardware layer points to a specific problem with a digital isolator of the ISO150AU type
   Device successfully tested in TCC2 by QPS; failures reported recently (CNRAD 2009) by CRYO
   Recoverable EMC vulnerability reported by manufacturer
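The behaviour expected from the firmware upgrade (flag a permanently low trigger input instead of stalling) can be illustrated with a small model. This is a sketch only, not the actual DQLPU firmware: the class name, the returned action strings, the 1 ms sampling and the timeout value are all assumptions made for illustration, as is the choice to treat the first low sample as the trigger edge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggerMonitor:
    """Hypothetical model of a trigger input watched for a stuck-low state."""
    timeout_ms: int = 100            # assumed threshold, not from the slides
    low_since_ms: Optional[int] = None
    fault: bool = False              # latched once a stuck-low state is seen

    def sample(self, level_low: bool, now_ms: int) -> str:
        """Process one sample of the trigger input and return the action."""
        if not level_low:
            # Input released: restart the timer, keep the fault flag latched
            self.low_since_ms = None
            return "idle"
        if self.low_since_ms is None:
            # First low sample: treat as a trigger and remember its time stamp
            self.low_since_ms = now_ms
            return "trigger"
        if now_ms - self.low_since_ms >= self.timeout_ms:
            # Low for too long: report the fault but keep acquiring data
            self.fault = True
            return "fault"
        return "holding"


# Input stuck low from t = 0 ms, sampled every 1 ms with a 5 ms timeout
mon = TriggerMonitor(timeout_ms=5)
actions = [mon.sample(True, t) for t in range(7)]
print(actions, mon.fault)
```

Latching the fault flag rather than clearing it on release matches the idea that the fault is indicated for a later intervention while operation continues.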
Problems with stalled DAQ systems of QPS protection crates type DQLPU during LHC operation
Analysis continued:
–Problem seems to be related to the operational mode:
   Device errors can be cured by changing the input signal of the digital coupler
      This is the normal operational mode for QPS and was used during the TCC2 tests, where no problem was observed (isolation of SPI bus signals)
   The trigger channel operates with an input signal that only changes during real triggers or power cycles (to my understanding CRYO uses a similar operational mode)
Possible solution:
–Limited number of potentially affected devices (so far only DS areas, 72 units)
–Firmware upgrade will reduce impact on accelerator operation significantly
   Could be deployed within one technical stop
–Hardware upgrade would require production and installation of 160 new circuit boards
1st results from QPS CNRAD 2010 tests
Objectives 2010
–Identify possible problems for QPS electronics installed in RR13, RR17, RR53, RR57, RR73, RR77, UJ14, UJ16 and UJ56 during the current LHC run, especially in 2011
–Continue radiation test evaluation of latest uFIP version (manufacturer B)
Test periods (position TSG46-4)
–Period 6: crate to be moved to test with higher dose rate
Slot | Period | Test program
 |  | Devices installed in RR and UJ, uFIP
 |  | Devices installed in RR and UJ, uFIP
 |  | ISO150, uFIP, nQPS
1st results from QPS CNRAD 2010 tests
Devices under test 2010
Board type / device | QTY | Function | Main processors
DQAMS | 1 | Field-bus coupler | VY27257 (old version), ADuC831BS
DQAMGS | 1 | Field-bus coupler | VY27257 (new version, manufacturer B), ADuC831BS
DQQDC | 2 x 2 | Quench detector HTS leads | ADuC834BS
DQQDG | 2 | Quench detector 600 A circuits, IPQ, IPD and IT | TMS320C6211
AQW210EHA | 9 | Galvanic isolation of hardwired interlock signal | N/A
Linear tri-volt power supply (TE-MPE-CP development) | 3 | +5V, ±15V AC-DC converter | N/A
1st results from QPS CNRAD 2010 tests
1st results 2010
–Radiation load ( ): D = 11.8 Gy (~0.01 Gy/h), n = 1.4 x cm^-2, p = 9.6 x cm^-2
Board type / device | QTY | Observed faults
DQAMS | 1 | None
DQAMGS | 1 | Partially stalled once (content of variables not correctly updated), cured by soft RESET
DQQDC | 2 x 2 | Operating normally (detailed analysis to be done)
DQQDG | 2 | One device stalled after power cycle, non-recoverable error (problem with boot-loader?); 2nd device stalled once during operation (cured by soft RESET)
AQW210EHA | 9 | No errors observed
Linear tri-volt power supply (TE-MPE-CP development) | 3 | Feeding all devices mentioned above, no errors observed
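As a rough cross-check of the radiation load, the quoted dose and dose rate imply an exposure time. The sketch below assumes that the approximate rate of 0.01 Gy/h was the average over the whole exposure, which the slide does not state explicitly; the variable names are illustrative.

```python
# Values quoted on the slide
dose_gy = 11.8                 # accumulated dose D
dose_rate_gy_per_h = 0.01      # approximate dose rate (assumed to be the average)

# Implied exposure time under the constant-rate assumption
exposure_h = dose_gy / dose_rate_gy_per_h
exposure_days = exposure_h / 24

print(f"{exposure_h:.0f} h, about {exposure_days:.0f} days")
```

Under that assumption the accumulated dose corresponds to roughly 1180 hours of exposure, i.e. about seven weeks.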