LBDS KICKERS MPP workshop - Annecy 11.03.2013 Nicolas Magnin TE-ABT-EC.

Outline  Operational experience  Planned changes during LS1  Full re-commissioning Many thanks to all my colleagues for their help: A.Antoine, E.Carlier, F.Castronuovo, L.Ducimetière, R.Filippini, B.Goddard, V.Mertens, A.Patsouli, V.Senaj, M.Stjepic, J.Uythoven, N.Voumard, … sorry for those I forgot !

Operational experience LBDS usage Number of dumps performed for LBDS - Beam 1 Many more LOCAL dumps than REMOTE dumps: test and calibration purposes during WS and TS. During 3 years of operation for beam 1: > 40’000 pulses in LOCAL, > 12’000 pulses in REMOTE. [Chart: dumps per year for 2010, 2011 and 2012;  LOCAL dumps  REMOTE dumps]

Operational experience LBDS usage Number of days spent at various energies for LBDS - Beam 1 2010 - 2011: Most of the time is spent at INJECTION; the same amount of time is spent in STANDBY and at FULL energy. 2012: Most of the time is spent at FULL energy; the same amount of time is spent in STANDBY state and at INJECTION. => Related to LHC operation… [Chart: days per year for 2010, 2011 and 2012;  FULL energy (3.5/4 TeV)  RAMP (0.45 -> 3.5/4 TeV)  INJECTION (0.45 TeV)  STANDBY (0.4 TeV)  OFF (0 TeV)]

Operational experience LBDS failures 2011 Failures : 48 / False Dump: 19 2012 Failures : 33 / False Dump: 16  Failure events  False dumps More than 50% of problems occurred without beam (LOCAL / Failed arming / …) Many recurrent problems at start-up 2011 (TSU / VAC / …) Many recurrent problems at end of 2012 (ASI bus / ANY bus / …) => All dump requests were correctly executed ! Number of recorded failures / false dumps for LBDS - Beam 1

Operational experience LBDS failures in operation 2012 Total failures: 28 2011 Total failures: 43  Fail detected  Fail silent Fail detected: Surveillance system detects the fault (fail safe => dump OK). Example: ASI bus failure. Fail silent: Post operation check detects the fault (fault tolerant => dump OK) Example: Retrigger pulse is missing. More than 80 % of failures are detected

Operational experience LBDS interventions in operation 2012 Total interventions: 28 2011 Total interventions: 43  Remote  Masked & postponed  Postponed  Immediate access Remote: The system is operational after a remote reset. (Example: VAC spike/glitch error on MKB) Masked & Postponed: Problem is recurrent but not major (Example: Retrigger pulse is missing) Postponed: Access needed but postponed (Example: Problem arming B1, but proceed with B2) Immediate access: The problem is blocking OP, immediate access for repair (Example: ASI bus error) More than 50 % of interventions are remote

Operational experience False XPOC errors in operation Number of false XPOC errors in 2012-2013: XPOC errors: ~430 / False XPOC errors: ~211 => nearly 50 % of XPOC errors are false. Way too many false XPOC errors ! => We have to find explanations/solutions to all these problems during LS1… False resets done by OP: (~191) Filling pattern cleared, Missing BCTFR, Missing BLM B1/B2 False resets done by experts: (~21) Received E= -1 GeV, TSU DR not detected (CTRV)

Operational experience LBDS self-trigger with beam Details on LBDS self-triggers 2012 (Following up on Ben’s talk):  02-OCT-12 16.17.38 – Beam 2 :  Context: Proton Physics / Ramp / E= 3310 GeV  Cause: ASI bus error (ASI PS)  27-OCT-12 07.58.38 – Beam 1 :  Context: Proton Physics / Stable beam / E= 4000 GeV  Cause: ASI bus error (ASI PS)  28-OCT-12 11.11.05 – Beam 2 :  Context: Proton Physics / Ramp / E= 494 GeV  Cause: BEM ANYBUS module error  29-OCT-12 19.50.47 – Beam 2 :  Context: Proton Physics / Ramp / E= 458 GeV  Cause: BEM ANYBUS module error Only 4 physics beams lost in 2012 due to LBDS self-trigger 2 recurrent problems: => Diagnostics problems ? E > 450 GeV

Operational experience Industrial component failures Many interventions due to industrial component design problems:  NI PXI-5122 : Fuse problem (Digitizers for kicker waveforms for IPOC/XPOC)  61 cards returned to manufacturer for repair.  PK55 PSU : Electrolytic capacitor problems.  50 units repaired at CERN.  3kV HVPS Heinzinger (PTU) : Electrolytic capacitor problems.  80 units repaired at CERN.  35kV HVPS Heinzinger (MKD principal) : HV transformer sparking problems.  40 units returned to manufacturer for repair.  ASI-bus Siemens PS : Electrolytic capacitor problems.  4 units replaced.

Operational experience Potentially dangerous failures LBDS failures (asynchronous dumps):  TFOT driver chip burned (19.11.2010) => async trigger of 2 MKDs. Context: Pilot in Actions:  Re-cabling of TFOT => Now only 1 MKD would trigger in this case  Review of TFOT => No design problem: Driver fault (included in reliability analysis)  WIENER power supply failure (02&04/2012) => async trigger only Context: No beam Actions:  Review of LBDS powering => Add 2nd UPS (from QPS), fuses for each crate PSU,…

Operational experience Potentially dangerous failures Anticipated LBDS failure modes:  +12V PS loss on TSU crate (06/2012, but did not happen !) => sync and async trigger pulses blocked => NO DUMP ! Context: During analyses for preparation of LBDS powering review… Actions:  Stopped the machine immediately !  Implemented a temporary solution: Generate an async trigger in case of +12V failure  During following TS: Consolidated the fix implementation  Will be fully consolidated during LS1  MKD generator HV sparking above 6 TeV (2009) => self-trigger (async). Context: Related to Peltier cells added after reliability run 2009 Actions:  LBDS energy limited to 5000 GeV  Started upgrade of HV generators

Operational experience Procedure in case of hardware changes ? How to handle any hardware component exchange ? Examples:  Problem with TSU + CIBU (Start-up 2011): Bad filtering of CIBU powering -> glitches on FPGA +1.2V => FPGA frozen in UNKNOWN state ! Many dumps due to this problem (no time allocated to test LBDS during machine cold check-out)  MKD GTO stack exchange (Q4 2011)  Peltier cells added into MKD generators (After reliability run 2009) (Modification has not been validated by reliability run) Limited re-commissioning occurred after these interventions ! => Procedure ?  Dry dumps & pilot dumps 450 -> 7000 GeV ?  Operation with safe beam for some time ?  Reliability run / re-commissioning ?

Operational experience Running the LBDS in ‘degraded mode’ How to handle temporary masking of redundant signals or enlargement of tolerances ? Examples:  Masking of PTU retrigger (1 out of 4) => Normal operation  Masking of MKD switch ratio (problem @ 450 GeV) => Normal operation  Enlarging XPOC tolerances (MKD & MKB) => Normal operation Decision to mask and go ahead was taken by LBDS experts and EIC  Should LHC operation be limited to safe beam until intervention ?  Risk evaluation / Procedure ? Some masks stayed for quite a long time (PTU retrigger for instance)  Should degraded mode be visible on LBDS fixed display (at least) ?

Planned changes during LS1 Additional re-trigger from BIS What happens if TSU cards do not send triggers (+12V VME problem) ? BIS sends retrigger pulses after 250 us: Potential danger: Increase of async dump rate ? => To be evaluated !
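The timing of this additional retrigger can be sketched as below. This is only a minimal illustration, assuming a simplified model in which the kickers fire on whichever pulse arrives first: the TSU trigger or the delayed BIS retrigger. The function name and the use of None for a missing TSU trigger are hypothetical; only the 250 us delay comes from the slide.

```python
from typing import Optional

BIS_RETRIGGER_DELAY_US = 250.0  # delay quoted on the slide


def effective_trigger_time_us(dump_request_us: float,
                              tsu_trigger_us: Optional[float]) -> float:
    """Return the time at which the kickers are triggered (simplified model).

    tsu_trigger_us is None when the TSU cards send no trigger at all,
    for example after a +12V loss in the VME crate.
    """
    bis_retrigger_us = dump_request_us + BIS_RETRIGGER_DELAY_US
    if tsu_trigger_us is None:
        # Only the BIS retrigger is left to fire the kickers.
        return bis_retrigger_us
    # Normal case: the earlier of the two pulses wins.
    return min(tsu_trigger_us, bis_retrigger_us)
```

In this model a healthy TSU always wins, and the BIS pulse only matters when the TSU trigger is missing or later than 250 us.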

Planned changes during LS1 Additional re-trigger from BIS IPOC checks the presence of BIS pulses on retrigger lines (after every dump): Expected retrigger pulses from BIS

Planned changes during LS1 Upgrade of TSU card Why do we upgrade the TSU v2 card ?  Following external review TSU v2 (Studiel) outcome:  No major issues detected  Missing ESD protections on some I/O  Minor VHDL issues (already applied on TSU v2 in 2011).  Outlined problem with TSU cards powering => coupling through VME PSU, no protection (fuse) on TSU cards  Following LBDS Powering Review (internal) outcome:  The 2 TSU cards are deployed on 3 crates instead of 1 crate (2 TSU crates non-redundant / 1 BLMDD, BRF, CTRV crate redundant)  Following VME crate +12V problem:  PS surveillance: +1.2V / +1.8V / +3.3V / +5V / +12V => Second TSU will trigger in case of detected PS problem. More TSU changes:  Improvement of CIBU powering filters (After problems at start-up 2011).  Improvement of post operation diagnostics: (Check presence of current on all trigger outputs, added all CPDLs clients in IPOC, …)

Planned changes during LS1 LBDS powering modifications Modifications after LBDS Powering Review:  Separated connection to second UPS (US65) for LBDS  Individual circuit breaker for each crate PSU  Software monitoring of crate redundant PSU  FEAlimMonitor FESA class checks PSU states  SIS monitors PSU states => dump request in case of failure

Planned changes during LS1 MKB vacuum interlocking problems MKB vacuum probes give noisy signals (spikes/glitches):  Analog signal very noisy => Masked from the beginning !  Digital signal experienced glitches/spikes => Many dumps due to this problem ! (More than 13 during 2011-2012 operation)  3 noisy vacuum probes masked in XPOC since the beginning => Action plan for LS1 to be defined with TE-VSC

Planned changes during LS1 Changes to HV generators Sparking in the GTO stacks causing self-trigger: (operation limited to 5 TeV) Modify all stacks: 30x2 MKD + 10x2 MKB = 80 stacks To be noted as well: Until now we used two GTO brands: Westcode ABB-Dynex After LS1: only ABB GTOs, because: Best SEB test results More stable turn-on delay Problem: common mode failure ? => to be evaluated ! HV Insulators => HV insulators are added between return current Plexiglas isolated rods and GTO HV deflectors.

Planned changes during LS1 Upgrade of PTUs => fixed PTU voltage  Now PTU voltage is adjusted w.r.t. energy for each generator to guarantee that the MKD rising edge fits into the abort gap (GTO turn-on delay).  PTU voltage variable from 600 V up to 3000 V, measured overall rise time < 2.7 us (Not good for GTO gate current and dI/dt !)  Measurements performed in 2011 with constant PTU voltage:  PTU voltage constant at 2800 V, measured overall rise time for ABB GTO ~2.75 us => Would need increase of abort gap of 50 ns minimum to guarantee the same margin From ABB specifications: We need to increase the GTO gate current ! (We had GTOs broken because gate current is too low)  Upgrade of PTUs  Increase PTU maximum voltage from 3 kV to 4 kV (replacement of HVPS)  Replace 1.2 kV IGBT with equivalent 1.7 kV type (lower sensitivity to SEB)  Operate PTU at ~3300 V constant voltage, expected overall rise time < 2.7 us => No need to modify abort gap duration for fixed PTU voltage operation after LS1 !
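The 50 ns figure follows from comparing the two measured rise times. A minimal sketch of that arithmetic, assuming the "< 2.7 us" bound is taken as the reference rise time (the function and constant names are hypothetical):

```python
RISE_TIME_REFERENCE_US = 2.70   # "< 2.7 us" with energy-dependent PTU voltage
RISE_TIME_AT_2800V_US = 2.75    # measured with PTU voltage constant at 2800 V


def extra_abort_gap_ns(measured_us: float,
                       reference_us: float = RISE_TIME_REFERENCE_US) -> float:
    """Extra abort gap (in ns) needed to keep the same margin.

    Returns 0 when the measured rise time is within the reference.
    """
    return max(0.0, (measured_us - reference_us) * 1000.0)
```

With the 2800 V measurement this gives about 50 ns; a rise time below 2.7 us, as expected at ~3300 V, gives zero.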

Planned changes during LS1 Other changes…  Upgrade of the 30 MKD generator IPOC systems => Adding PTU current waveforms acquisition and analysis  Upgrade of the 30 MKD -300 V DCPS => Replacement of one OpAmp to solve offset problem  Upgrade of the 30 MKD generator temperature probes (Absolute temperature measurements are not precise enough) => Replacement of temperature probes (±0.3 ˚C -> ± 0.1 ˚C / 3 wires -> 4 wires)  Add more shielding in MKD cable ducts between UA and RA (Presently only ducts in front of TCDQ are filled)  Add 2 MKB magnets (1 tank) per beam => We will have nominal dilution for operation above 4 TeV  Many more smaller things here and there…

Full LBDS re-commissioning Standard commissioning Standard commissioning following procedures since 2009: EDMS 896392  Reliability run (3 months) foreseen around end 2013/beginning 2014  Commissioning without beam (3 months) around beginning 2014  Commissioning with beam: (3 months)  Other tests that we have to perform:  Effective rise time measurement with beam: How to proceed ? Scan of MKD ‘threshold’ and ‘start’ points Measurement with BTVDD, BPM, BLM & Collimators,… ?  Test dump with BLMDD: How to proceed ? (Never tested this TSU client with beam)  Update commissioning procedures ?

Full LBDS re-commissioning Procedures for ‘a non-working LBDS’  Current procedure in case of non-working dump trigger:  EDMS 1166480  Update the procedure after all changes during LS1  TSU crates modification  LBDS powering modification

Full LBDS re-commissioning Ongoing reliability studies An expert has been mandated to check LBDS reliability model: Mandate:  Analyse/classify all faults & interventions of Run 1  Confront all these faults with the failure modes defined in 2006  Update the reliability model after LS1 changes  Reconfiguration of TSU crates  Reconfiguration of TFOT crates  BIS retrigger added  …  Identify and evaluate the principal degraded modes => Give new reliability numbers based on experience

Summary Bad news:  Operational energy limited to 5 TeV due to HV generator self-triggers  NO DUMP case identified if +12V is lost in TSU VME crate  LBDS powering problems identified Good news:  ALL dump requests have been correctly executed  Good availability of LBDS over 3 years of operation Quite a lot of changes to be performed during LS1:  FULL re-commissioning of LBDS is required  Update of operational procedures is needed  Update of reliability calculation is on-going Thank you for your attention

Spare slides / Removed / Incomplete

Planned changes during LS1 LBDS powering modifications

OLD config = 1 TSU crate NEW config = 3 TSU crates

Planned changes during LS1 IPOC systems for MKD generators Upgrade of MKD generator IPOC systems:  30 IPOC systems already in place.  Adding 4 acquisition channels per generator: PTU currents  All internal currents are acquired:  2x CTs, 2x CTc, 2x CTf, 4x PTU (10 per generator, x30 MKD = 300 waveforms to acquire and analyse) Interlocks: Same approach as MKD/MKB IPOC/XPOC:  IPOC check with large limits => hardware interlock  Error = Probably important hardware problem => Immediate access needed !  XPOC check with tight limits => software forewarning  Error = Anticipate degradation of hardware => Prepare access for next opportunity
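The waveform count can be checked with a few lines. This is just the slide's arithmetic written out; the dictionary and function names are hypothetical.

```python
# Acquisition channels per MKD generator, as listed on the slide.
CHANNELS_PER_GENERATOR = {"CTs": 2, "CTc": 2, "CTf": 2, "PTU": 4}
N_MKD_GENERATORS = 30


def total_waveforms(channels=CHANNELS_PER_GENERATOR,
                    n_generators=N_MKD_GENERATORS) -> int:
    """Number of waveforms to acquire and analyse per dump."""
    return sum(channels.values()) * n_generators
```

Ten channels per generator times 30 generators gives the 300 waveforms quoted above.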

Planned changes during LS1 Changes to Heinzinger HVPS Problem of offset in regulation loop: When exchanging a broken HVPS, we have to update BEI and SCSS tracking tables because of offset difference and tight BETS limits. => Replace one OpAmp on 35 kV Heinzinger HVPS. = 30 HVPS to upgrade.

Planned changes during LS1 Changes to HV generators Problem with HV switches: Sparking in the GTO stacks. [Figure labels: Plexiglas insulation, GTO HV deflectors, GTO stack A, GTO stack B] GTO HV deflectors: E > 8 MV/m => Air ionisation. Plexiglas insulation on grounded rods => Gets charged by ionised air. Spark between deflectors and isolators => HF oscillations start between stacks. => Self trigger of GTO stack.

Planned changes during LS1 Changes to HV generators (misc.) Problem with HV switches: Sensitivity to temperature:  problem with leakage current in GTO gates (switch ratio @ 450 GeV ?)  Kick strength depends on T (Capacitor value, GTO on-resistance, …) => Switch temperature is controlled with Peltier cells. But absolute measured temperature is not precise enough (± 1 ˚C ?). => Temperature probes upgrade:  ±0.3 ˚C -> ± 0.1 ˚C  3 wires -> 4 wires  Connectors changed: many pins in parallel (contact resistance problems) All generators have to be re-cabled: Huge amount of work !

Planned changes during LS1 Update of PTUs We need to increase the GTO gate current (ABB specs):  Trigger current x7  Trigger dI/dt x10 => Upgrade of PTU during LS1:  New Heinzinger HVPS 3 kV -> 4 kV  New IGBT 1.2 kV -> 1.7 kV (3 IGBTs/PTU -> 5.1 kV) => Upgrade of PTU trigger transformers & cables (2015/2016)

Planned changes during LS1 SEB measurements on IGBTs and GTOs Single Event Burnout measurements performed in north Area (???) in 2012 ? PTU IGBTs:  Before LS1: Voltage max 3000 V / 3 IGBT => 1000 V/ IGBT (1.2 kV) => SEB = ~1e-7 cm^2 (360 IGBTs => 4 async dumps per year)  After LS1: Voltage constant 3300 V/ 3 IGBT => ~1100 V/ IGBT (1.7 kV) => SEB = ~3e-11 cm^2 (360 IGBTs => 1 async dump every 850 k years = negligible) MKD GTOs (ABB):  Before LS1: Max voltage 1.8 kV/GTO => SEB (1 async dump every >>100 years = negligible)  After LS1: Max voltage < 29 kV / 10 GTO < 2.9 kV/GTO => SEB = ~1.0e-8 cm^2 (600 GTOs in all MKDs => 0.6 async dump per year) All async dump values are computed based on 10^5 HEH cm^-2 / year. We don’t yet have the integrated radioactivity measured in UA after Run 1
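The async dump rates on this slide appear to follow from a simple product of SEB cross-section, yearly HEH fluence and number of devices. A minimal sketch under that assumption (the function name is hypothetical; the numerical inputs are the ones quoted on the slide):

```python
HEH_FLUENCE_PER_CM2_PER_YEAR = 1e5  # fluence assumed on the slide


def seb_events_per_year(cross_section_cm2: float,
                        n_devices: int,
                        fluence: float = HEH_FLUENCE_PER_CM2_PER_YEAR) -> float:
    """Expected number of single event burnouts per year.

    Assumes every device sees the same fluence and that each burnout
    leads to one asynchronous dump.
    """
    return cross_section_cm2 * fluence * n_devices


igbt_before_ls1 = seb_events_per_year(1e-7, 360)   # 1.2 kV IGBTs at 1000 V
igbt_after_ls1 = seb_events_per_year(3e-11, 360)   # 1.7 kV IGBTs at ~1100 V
gto_after_ls1 = seb_events_per_year(1.0e-8, 600)   # ABB GTOs below 2.9 kV
```

This gives about 3.6 events per year for the present IGBTs and 0.6 per year for the GTOs after LS1, in line with the quoted ~4 and 0.6. For the new IGBTs the same product gives roughly 1e-3 events per year with the rounded cross-section shown, so the interval quoted on the slide may rest on inputs not listed here.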

LBDS parameters and interlocking Energy-Scan calibration Procedure after MKD generator exchange:  Update PTU tracking tables (SCSS)  Perform an Energy-Scan (100 pulses 400 -> 7100 GeV)  Analyse the recorded waveforms (1500 per beam)  Compute, measure, mount and install new delay board  Compute new reference tables (SCSS, check total kick ±0.25%)  Compute new tracking tables (BETS / SCSS )  Compute new limits (IPOC / XPOC)  Update all MCS tables  Energy-Scans / Reliability run / Re-commissioning to validate changes ? => It takes time ! At least 24h (3 days) to exchange a generator => Complicated: Error prone… No full generator exchanged in operation (Only during TS). Only 1 stack exchanged in operation. E-Scan plot…
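The size of an Energy-Scan can be sketched as follows. The linear spacing of the energy setpoints is an assumption (the slide only gives the range and the number of pulses), and the figure of 15 generators per beam is inferred from the 30 MKD generators and the 1500 waveforms per beam mentioned in these slides.

```python
N_PULSES = 100
E_MIN_GEV = 400.0
E_MAX_GEV = 7100.0
MKD_GENERATORS_PER_BEAM = 15  # inferred, not stated explicitly on the slide


def energy_scan_setpoints(n=N_PULSES, e_min=E_MIN_GEV, e_max=E_MAX_GEV):
    """Energy setpoints in GeV, assuming equally spaced pulses."""
    step = (e_max - e_min) / (n - 1)
    return [e_min + i * step for i in range(n)]


def waveforms_per_beam(n=N_PULSES, generators=MKD_GENERATORS_PER_BEAM):
    """One kicker waveform per generator and per pulse."""
    return n * generators
```

Under these assumptions one scan produces 100 x 15 = 1500 waveforms per beam to analyse.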

LBDS parameters and interlocking Beam Energy Tracking tolerances Surveillance of GTO stack voltages:  BETS:  Fast reaction (< 1 ms)  Interlock tolerance = 0.5 %  SCSS:  Slow reaction (> 50 ms)  Interlock tolerance = 2 % Constant HVPS references, but difference between all GTO stack measured voltages: (±1.5% @ 450 GeV, ±1% @ ~1 TeV, and ±0.5% @ 3500 GeV) => We need individual limits/tolerances for each generator due to the tight tolerances (0.5%) => Tracking checks reproducibility, and not absolute error => Complicated in case of generator exchange (Needs to update PLC code, BEI tables, MCS references, …) Can we increase BETS / SCSS tolerances to have constant tracking tables (estimated >±1%) ? Taking into account the removal of HVPS offset, difference could be lower (to be verified)
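Why a common reference does not work with the present tolerances can be illustrated with a small check. This is a sketch only: the function name and the voltages are hypothetical, and only the 0.5 % and 2 % tolerances come from the slide.

```python
BETS_TOLERANCE = 0.005  # 0.5 %, fast interlock
SCSS_TOLERANCE = 0.02   # 2 %, slow interlock


def tracking_ok(measured_v: float, reference_v: float,
                tolerance: float) -> bool:
    """True if the measured voltage is within the relative tolerance."""
    return abs(measured_v - reference_v) <= tolerance * abs(reference_v)


# Hypothetical example: a generator sitting 1 % above a common reference.
common_reference_v = 10000.0
measured_v = 10100.0
bets_pass = tracking_ok(measured_v, common_reference_v, BETS_TOLERANCE)
scss_pass = tracking_ok(measured_v, common_reference_v, SCSS_TOLERANCE)
```

Here the SCSS check passes but the BETS check fails, which is why each generator currently needs its own reference table and why a BETS tolerance above ±1 % is being discussed.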

LBDS parameters and interlocking Beam Energy Tracking differences Constant HVPS references, but difference between measured GTO stack voltages: At low energy: ±1.5%. At ~1 TeV: ±1%. At high energy: ±0.5%.

LBDS parameters and interlocking Post Operation Check tolerances Surveillance of kicker current waveform shapes  IPOC: Hardware interlock  Checks absolute error (large limits, almost never updated)  Common references/tolerances for all kickers for all energies  Tolerance >5 % on strength, >500 ns on synchronization  Arming LBDS not possible if last IPOC analysis failed or IPOC not ready. Error = Probably important hardware problem =>Immediate access needed !  XPOC: Software forewarning  Reproducibility (tight limits)  Individual references/tolerances for each kicker and every energy.  Tolerance 1 % on strength, 50 ns on synchronization  Arming LBDS not possible if last XPOC analysis failed, expert needs to reset. Error = Anticipated degradation of hardware, check trends… => Mask or Increase limit and prepare access for next opportunity.
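The two-level approach can be sketched as a simple classification. This is a minimal illustration, not the actual IPOC/XPOC implementation: the names are hypothetical, the IPOC limits are set to the lower bounds quoted on the slide (">5 %", ">500 ns"), and the errors are assumed to be already computed against the applicable reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    strength_rel: float  # relative tolerance on kick strength
    sync_ns: float       # tolerance on synchronisation in ns


IPOC_LIMITS = Limits(strength_rel=0.05, sync_ns=500.0)  # large, hardware interlock
XPOC_LIMITS = Limits(strength_rel=0.01, sync_ns=50.0)   # tight, software forewarning


def within(limits: Limits, strength_err_rel: float, sync_err_ns: float) -> bool:
    return (abs(strength_err_rel) <= limits.strength_rel
            and abs(sync_err_ns) <= limits.sync_ns)


def classify(strength_err_rel: float, sync_err_ns: float) -> str:
    """Return 'IPOC_FAIL', 'XPOC_FAIL' or 'OK' for one kicker waveform."""
    if not within(IPOC_LIMITS, strength_err_rel, sync_err_ns):
        return "IPOC_FAIL"   # probable hardware problem, immediate access
    if not within(XPOC_LIMITS, strength_err_rel, sync_err_ns):
        return "XPOC_FAIL"   # degradation, expert reset and planned access
    return "OK"
```

A 2 % strength error would block arming through XPOC only, while a 10 % error would already trip the IPOC hardware interlock.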

Full re-commissioning Reliability run  System in LOCAL control  LBDS not armed  Frequent pulses at injection 450 GeV  Many Energy-Scan tests (100 pulses from 400 GeV to 7 TeV)  Simulated cycles with SCSS:  30 min at injection 450 GeV  15 min ramp-up to 7100 GeV  11h at flat-top at 7100 GeV  Dump  Ramp-down 15 min Foreseen duration of reliability run is 3 months. It could take place around end 2013/beginning 2014
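The length of one simulated cycle, and an upper bound on the number of cycles in the run, can be estimated as below. The 90-day run length and the assumption of back-to-back cycles are mine; pulses at injection and Energy-Scan tests would reduce the real number.

```python
# Phases of one simulated cycle, in minutes (the dump itself is taken as instantaneous).
CYCLE_PHASES_MIN = [
    ("injection 450 GeV", 30),
    ("ramp-up to 7100 GeV", 15),
    ("flat-top 7100 GeV", 11 * 60),
    ("ramp-down", 15),
]


def cycle_duration_hours(phases=CYCLE_PHASES_MIN) -> float:
    return sum(minutes for _, minutes in phases) / 60.0


def max_cycles(run_days: int, phases=CYCLE_PHASES_MIN) -> int:
    """Upper bound on full cycles if they run back to back."""
    return int(run_days * 24 // cycle_duration_hours(phases))
```

One cycle lasts 12 hours, so a run of about 90 days could hold at most 180 full cycles.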

Full re-commissioning Tests without beam  System in REMOTE control  BEM cards connected to BETS-Simulator  Optical switch to be installed  Local BIS loop, frequency controlled from CCC  BRF generated locally  LBDS armed from CCC  Simulated cycles with BETS-Simulator (controlled by Sequencer):  30min at injection 450 GeV  15 min ramp-up to 7100 GeV  11h at flat-top at 7100 GeV  Dump  Ramp-down 15 min => Test of retrigger pulses from BIS Foreseen duration of re-commissioning without beam is 3 months It could take place around beginning 2014

Full LBDS re-commissioning Tests with beam  Standard re-commissioning following procedure used in 2009:  EDMS: 896392 Other tests that we have to perform:  Effective rise time measurement with beam: How to proceed ?  Scan of MKD ‘threshold’ and ‘start’ points  Measurement with BTVDD, BPM, BLM & Collimators,… ?  Test dump with BLMDD: How to proceed ? (We never tested this TSU client)  …

Planned changes during LS1 Addition of 4 vertical dilution kickers Adding 2 MKBV magnets (=1 tank) per beam: Total dilution: 4x MKBH + 6x MKBV = 10 MKB total (nominal design) => We will have nominal dilution for operation above 4 TeV. => We have to update XPOC references (MKB, BTVDD, BPMD, …) Any impact on LBDS safety ?  Increased risks of self-triggers ?  4 more switches on top of 76 => ~5% more self-triggers  Increased risks of HVPS failure ?  4 more HVPS on top of 76 => ~5% more failures => No safety problem, but marginally less availability of LBDS.

Planned changes during LS1 Additional re-trigger from BIS What happens if TSU cards do not send triggers (+12V VME problem) ?
