2
LBDS TSU & AS-i failure report (Sept. 2016)
A. Antoine
27 September 2016
LBDS: TSU & AS-i Status
3
Content
- TSU
  - Operation History
  - Failure Impact
  - Failure Analysis
- AS-i
  - Specifications & Framework
  - LBDS Configuration
- Conclusion
4
TSU
5
TSU: Operation History
- Version 1: prototype, never in operation
- Version 2: in operation from LHC start-up to LS1
  - First operational experience
  - No critical hardware failure
  - Poor diagnosis capability
  - SPS compatibility required (new request)
  - Potential major failure detected (internal review)
- Version 3: in operation since LS1
  - Critical hardware failure on 1 July 2016
  - Synchronous dump done
  - LBDS B1: TSU-B replaced
6
TSU: Failure Impact
- LBDS worst-case failure!
- Thanks to the redundant, fail-safe design: synchronous dump done
- Operation:
  - Expert investigation needed
  - MTTR: ~1 hour
  - 5 hours of downtime (LHC access required!)
- Cost:
  - Materials: ~2500 CHF / intervention
  - Expert & on-call service: ~500 CHF
7
TSU: Failure Analysis (1 July)
- FPGA fatal error (not recoverable)
- Power supplies suspected
- 3 dependent + 2 independent power supplies on a TSU board:
  - +1.2V -> FPGA core
  - +1.8V -> EEPROM (Flash ROM for FPGA)
  - +2.5V -> FPGA & CPLD
  - +3.3V -> most components, FPGA interface included
  - +5V -> CIBO powering
8
TSU: Failure Analysis, abnormal startup
[Figure: startup traces of the +1.2V, +1.8V and +2.5V rails, with levels annotated ~ +3V and ~ +1.8V]
9
TSU: Failure Analysis, normal startup (FPGA removed)
[Figure: startup traces of the +1.2V, +1.8V and +2.5V rails]
10
TSU: Failure Diagnosis
- An internal FPGA failure induces a short circuit on the +1.2V power supply
- Design review with N. Magnin:
  - +1.2V power supply very noisy
  - Noise with transients above FPGA specifications
  - Some decoupling capacitors missing on the +5V power supply used to generate the +1.2V
- Still not clear why the FPGA creates a short circuit!
11
TSU: Failure Diagnosis, Power Supplies Noise
[Figure: noise on the +1.2V and +1.8V rails, annotated ~250mV on each trace]
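To put the annotated noise in proportion, the sketch below expresses ~250mV as a fraction of each rail's nominal voltage. It assumes that one ~250mV annotation belongs to each rail. The 5% tolerance band is a hypothetical placeholder, not a figure from the slides; the real limit has to be taken from the FPGA datasheet.

```python
# Nominal rail voltages and the noise amplitude annotated on the slide.
# Assumption: the two "~250mV" labels refer to the +1.2V and +1.8V rails, one each.
RAILS_V = {"+1.2V": 1.2, "+1.8V": 1.8}
NOISE_V = 0.250

# Hypothetical tolerance, for illustration only (not quoted in the slides).
ASSUMED_TOLERANCE = 0.05


def relative_noise(nominal_v, noise_v=NOISE_V):
    """Return the noise amplitude as a fraction of the nominal rail voltage."""
    return noise_v / nominal_v


def report(rails=RAILS_V, tolerance=ASSUMED_TOLERANCE):
    """Build one summary line per rail."""
    lines = []
    for name, nominal in rails.items():
        ratio = relative_noise(nominal)
        verdict = "exceeds" if ratio > tolerance else "is within"
        lines.append(f"{name}: noise is {ratio:.1%} of nominal, "
                     f"{verdict} the assumed {tolerance:.0%} band")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Under these assumptions the noise is roughly 21% of nominal on the +1.2V rail and roughly 14% on the +1.8V rail, which is consistent with the slide's remark that the transients are above FPGA specifications.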
12
TSU: Failure Diagnosis, Power Supplies Noise
- +5V from VME is the source of all power supplies …
[Figure: +5V trace]
13
Conclusion (TSU)
- 1 critical failure in 10 years of operation
- MTTR of 5 hours
- Redundant TSU strategy worked fine:
  - Detection of the failure
  - Synchronous dump done
- Corrective action to be validated and deployed to remove noise on the +5V and +1.2V power supplies
14
AS-i
15
AS-i: Actuator-Sensor Interface
Specifications:
- IEC and EN standards
- Data on power line (decoupling filter)
- 8-bit serial data bus with safety capability (SIL3)
- Up to 62 standard nodes or 31 safety nodes
- Reaction time < 10 ms
- Up to 100 m length (300 m with repeater)
Framework:
- 1x AS-i master controller
- 1x dedicated power supply
- Unshielded 2-wire cable wrapped in an electrical insulator, for data and power
- Actuators & sensors
- Safety monitor (when needed)
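The limits listed above can be turned into a quick sanity check for a planned network. The sketch below is only an illustration: `AsiSegment` and `check_segment` are hypothetical names, and the rule for mixing node types (one safety node counted as two standard nodes, from 62 / 31) is an assumption, since the slide only gives the two limits separately.

```python
from dataclasses import dataclass

# Limits as quoted on the slide.
MAX_STANDARD_NODES = 62
MAX_SAFETY_NODES = 31
MAX_LENGTH_M = 100
MAX_LENGTH_WITH_REPEATER_M = 300


@dataclass
class AsiSegment:
    """Hypothetical description of one AS-i network."""
    standard_nodes: int
    safety_nodes: int
    cable_length_m: float
    has_repeater: bool = False


def check_segment(seg):
    """Return a list of limit violations (an empty list means none found)."""
    problems = []
    # Assumption: a safety node uses the budget of two standard nodes.
    budget_used = seg.standard_nodes + 2 * seg.safety_nodes
    if seg.safety_nodes > MAX_SAFETY_NODES or budget_used > MAX_STANDARD_NODES:
        problems.append(f"node budget exceeded: {budget_used} > {MAX_STANDARD_NODES}")
    limit = MAX_LENGTH_WITH_REPEATER_M if seg.has_repeater else MAX_LENGTH_M
    if seg.cable_length_m > limit:
        problems.append(f"cable too long: {seg.cable_length_m} m > {limit} m")
    return problems


if __name__ == "__main__":
    print(check_segment(AsiSegment(10, 5, 80)))
    print(check_segment(AsiSegment(40, 15, 250)))
```

The first call reports no violation; the second reports both a node-budget and a cable-length violation, because no repeater is declared.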
16
AS-i: LBDS Configuration
17
AS-i: Operation History
- 2 hardware failures in 10 years of operation
  - Same failure signature … but one was the AS-i F Link module
  - All 4 systems impacted (beam 1 & 2)
- First occurrence shortly before LS1 (6 years of operation)
  - Curative maintenance (on-call service)
  - Early LS1: preventive maintenance done, with replacement of AS-i F Link & power supply components
- Second occurrence some weeks ago, on 3 systems
  - Preventive maintenance done during TS, with replacement of all AS-i power supplies
18
AS-i: Failure Impact
- LBDS abruptly stopped (as an AUE)
- AS-i worst-case failure (power and discharging switches switched off)
- Synchronous dump (thanks to fail-safe design)
- Operation:
  - Short MTTR: 45 min
  - 4 h of downtime / intervention (access to the LHC needed!)
- Cost:
  - Materials: ~1000 CHF / intervention
  - On-call service: ~300 CHF
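The figures on the two Failure Impact slides (TSU and AS-i) can be compared side by side with a few lines of arithmetic. This is a rough sketch: the input values are the approximate ones quoted on the slides, and reading "downtime minus MTTR" as overhead for access and restart is an assumption, not something the slides state. The `Intervention` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    """Hypothetical record of one intervention, using the slide figures."""
    name: str
    mttr_h: float
    downtime_h: float
    materials_chf: float
    service_chf: float

    @property
    def total_cost_chf(self):
        # Materials plus expert / on-call service.
        return self.materials_chf + self.service_chf

    @property
    def overhead_h(self):
        # Assumption: downtime not spent on the repair itself.
        return self.downtime_h - self.mttr_h


TSU = Intervention("TSU", mttr_h=1.0, downtime_h=5.0,
                   materials_chf=2500, service_chf=500)
AS_I = Intervention("AS-i", mttr_h=0.75, downtime_h=4.0,
                    materials_chf=1000, service_chf=300)

if __name__ == "__main__":
    for item in (TSU, AS_I):
        print(f"{item.name}: ~{item.total_cost_chf:.0f} CHF per intervention, "
              f"{item.overhead_h:.2f} h of downtime beyond the repair")
```

With these inputs the totals come to about 3000 CHF for a TSU intervention and about 1300 CHF for an AS-i intervention. In both cases most of the downtime falls outside the repair time itself.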
19
AS-i: Failure Diagnosis
- 2 components identified as potentially responsible for the AS-i failure:
  - AS-i master controller (AS-i F Link)
  - AS-i power supply
- Master controller:
  - Controller down and not resettable!
  - No software diagnosis available
- Power supply:
  - Output filter showed degradation (capacitors)
  - Out-of-specification connection of the AS-i bus (spring terminal -> no pod on wire allowed!)
20
AS-i: Failure Diagnosis
Scenario 1:
- Data on the AS-i bus are altered by the degradation of the capacitor of the power supply output filter
- The AS-i master controller gets wrong reply messages from safety sensors (data corruption)
- The AS-i master controller goes to a safe state with failure (not resettable)
Scenario 2:
- Bad connections (use of pods on spring terminals)
- Data corruption
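The chain of events in Scenario 1 can be pictured as a latching state machine: once a corrupted reply is seen, the controller enters a safe state that a reset cannot clear. The toy model below is only an illustration of that behavior. It is not the AS-i F Link firmware, and the class name, state names and the "one bad reply is enough" rule are all assumptions.

```python
class ToySafetyMaster:
    """Toy model of a master that latches into a safe state (hypothetical)."""

    RUN = "RUN"
    SAFE_LATCHED = "SAFE_LATCHED"

    def __init__(self):
        self.state = self.RUN

    def on_reply(self, reply_ok):
        """Process one safety-sensor reply; a corrupted reply latches the fault."""
        if self.state == self.RUN and not reply_ok:
            # Assumption: a single corrupted reply triggers the safe state.
            self.state = self.SAFE_LATCHED
        return self.state

    def reset(self):
        """Try to reset; return True only if the master is running afterwards."""
        # Modeled after the slide: the controller was "not resettable",
        # so a latched fault survives the reset request.
        return self.state == self.RUN


if __name__ == "__main__":
    master = ToySafetyMaster()
    for ok in (True, True, False, True):
        print(master.on_reply(ok))
    print("reset succeeded:", master.reset())
```

In this model, valid replies that arrive after the corrupted one do not bring the master back to `RUN`, which matches the need for a hardware replacement described on the slides.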
21
AS-i: Corrective Action during TS3
- Done on all systems (4x)
- New AS-i power supply
- All pods removed from wires connected to spring terminals
22
Conclusion (AS-i)
- 2 periods of failures in 10 years
- MTTR short, but MTBF increases after one occurrence (burst behavior)
- Fail-safe design: synchronous dump done
- Corrective action during TS3:
  - Replacement of all AS-i power supplies
  - Removal of wire pods on spring terminals