Managing infrastructure faults to minimize accelerator down time

1 Managing infrastructure faults to minimize accelerator down time
P. Sollander, CERN 15/4/2013 ARW2013

2 Outline
Introduction to Technical Infrastructure operation at CERN
Workflow and processes, benefits
Future improvements
Summary

3 CERN Accelerator complex
Controlled from the CERN Control Centre (CCC)

4 TI operation 24*365¼
1 operator on shift
Systems: Electricity, Cooling, Ventilation, Safety Systems, Controls, Network, Hardware
Tasks: Intervention management, Corrective Maintenance, Response coordination

5 Types and typical number of events yearly
72201
~10^6 alarms
~4*10^4 calls
~10^4 minor
~10^2 major
~10^0 site wide
~10^-1 crisis
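
To give a feel for these orders of magnitude, the yearly figures can be converted into average rates. The following is a minimal sketch, assuming the rounded counts from this slide and a 365.25-day year; the names `YEARLY_EVENTS`, `per_day` and `days_between` are illustrative and do not come from the talk.

```python
# Rounded yearly counts taken from the slide (orders of magnitude only).
YEARLY_EVENTS = {
    "alarms": 1e6,
    "calls": 4e4,
    "minor": 1e4,
    "major": 1e2,
    "site wide": 1e0,
    "crisis": 1e-1,
}

DAYS_PER_YEAR = 365.25  # assumption: matches the 24*365¼ coverage


def per_day(yearly_count):
    """Average number of events per day for a given yearly count."""
    return yearly_count / DAYS_PER_YEAR


def days_between(yearly_count):
    """Average number of days between two events of this type."""
    return DAYS_PER_YEAR / yearly_count


if __name__ == "__main__":
    for name, count in YEARLY_EVENTS.items():
        print(f"{name:10s} {per_day(count):10.3f} per day, "
              f"one every {days_between(count):10.3f} days")
```

Under these assumptions the single operator on shift sees roughly 2,700 alarms per day on average, while a crisis is expected about once in ten years.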

6 ~10^4 Minor – corrective maintenance
Remote diagnostics
Specialist call
Work order creation: equipment identifier, problem description, status & follow up
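
The work order fields listed above could be modelled roughly as follows. This is only a sketch: the class name, the status values and the example identifier are hypothetical and are not taken from CERN's actual maintenance system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class Status(Enum):
    """Hypothetical life cycle of a work order."""
    OPEN = "open"
    IN_PROGRESS = "in progress"
    CLOSED = "closed"


@dataclass
class WorkOrder:
    equipment_id: str   # equipment identifier
    description: str    # problem description
    status: Status = Status.OPEN
    follow_up: list = field(default_factory=list)  # (timestamp, note) pairs

    def add_follow_up(self, note, new_status=None):
        """Record a timestamped note and optionally change the status."""
        self.follow_up.append((datetime.now(), note))
        if new_status is not None:
            self.status = new_status


if __name__ == "__main__":
    wo = WorkOrder("PUMP-001", "Low flow on cooling circuit")  # made-up example
    wo.add_follow_up("Specialist called", Status.IN_PROGRESS)
    print(wo.status.value, len(wo.follow_up))
```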

7 Minor – corrective maintenance
Statistics: what breaks most?
History: what happened earlier? Same problem? Any work on this unit?
Operator help: how was the problem fixed last time?
Keep information easy to update, and visible only where and when you need it.
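
The three questions above (what breaks most, what happened earlier, how it was fixed last time) map onto simple queries over past work orders. A minimal sketch, assuming records are stored as tuples with ISO-formatted dates; the record layout and all sample data are invented for illustration.

```python
from collections import Counter

# Hypothetical work-order records: (equipment_id, date, problem, fix).
RECORDS = [
    ("PUMP-001", "2012-03-02", "low flow", "replaced filter"),
    ("UPS-007", "2012-05-11", "battery alarm", "replaced battery"),
    ("PUMP-001", "2012-09-20", "low flow", "cleaned filter"),
]


def what_breaks_most(records, n=3):
    """Statistics: the n equipment identifiers with the most work orders."""
    return Counter(r[0] for r in records).most_common(n)


def history(records, equipment_id):
    """History: all records for one unit, oldest first (ISO dates sort as text)."""
    return sorted((r for r in records if r[0] == equipment_id),
                  key=lambda r: r[1])


def last_fix(records, equipment_id, problem):
    """Operator help: how the same problem on this unit was fixed last time."""
    matches = [r for r in history(records, equipment_id) if r[2] == problem]
    return matches[-1][3] if matches else None


if __name__ == "__main__":
    print(what_breaks_most(RECORDS))
    print(last_fix(RECORDS, "PUMP-001", "low flow"))
```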

8 ~10^2 Major – stopping “production”
Diagnostics, piquet, work order (WO) like minor, plus:
Informing affected users
Major Event report
Who was affected
Reports from everybody involved

9 Major – stopping “production”
Weekly follow up
Identify “time bombs”
Find/agree mitigation measures
Down time measured for statistics: share of the total number of major events by equipment group, in number of events and in down time
Time bomb example: ME10 current transformer
Mitigations help reduce the consequences even if the problem is not completely solved.
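
The statistic described above, each equipment group's share of major events both in number of events and in down time, can be computed as in the sketch below. The event list and group names are made up for illustration and are not figures from the talk.

```python
from collections import defaultdict

# Hypothetical major events: (equipment_group, down_time_hours).
MAJOR_EVENTS = [
    ("electricity", 4.0),
    ("cooling", 1.0),
    ("electricity", 2.0),
    ("ventilation", 1.0),
]


def shares_by_group(events):
    """Return {group: (share of events, share of down time)}."""
    counts = defaultdict(int)
    hours = defaultdict(float)
    for group, down_time in events:
        counts[group] += 1
        hours[group] += down_time
    total_events = len(events)
    total_hours = sum(hours.values())
    return {
        group: (
            counts[group] / total_events,
            hours[group] / total_hours if total_hours else 0.0,
        )
        for group in counts
    }


if __name__ == "__main__":
    for group, (by_count, by_time) in shares_by_group(MAJOR_EVENTS).items():
        print(f"{group:12s} {by_count:6.1%} of events, {by_time:6.1%} of down time")
```

Comparing the two shares shows why both are tracked: a group can cause few events but a large part of the down time, or the reverse.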

10 ~10^0 Site wide power outage
Instant alarm and phone overload
Work with priority list
  CERN Panel for Intervention Priorities
  Derived check list for TI operators
Inform all users, experiments, experts, services (on line + debrief)
Escalate to TI supervisor on duty
Further escalation to OP group leader if necessary
Investigations and follow up like Major events
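
Deriving an operator check list from an agreed priority list can be sketched as a stable sort of the affected systems. The order in `PRIORITY_LIST` below is purely hypothetical; the real priorities are set by the CERN Panel for Intervention Priorities and are not given in the slides.

```python
# Hypothetical priority order, highest priority first (NOT the real CERN list).
PRIORITY_LIST = ["safety systems", "electricity", "cooling", "ventilation"]


def build_checklist(affected, priority_list=PRIORITY_LIST):
    """Order affected systems by priority.

    Systems missing from the priority list go last, in the order
    they were reported (sorted() is stable).
    """
    rank = {name: i for i, name in enumerate(priority_list)}
    return sorted(affected, key=lambda name: rank.get(name, len(priority_list)))


if __name__ == "__main__":
    reported = ["ventilation", "lifts", "electricity", "safety systems"]
    for step, system in enumerate(build_checklist(reported), start=1):
        print(step, system)
```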

11 ~10^-1 Crisis situation
Event beyond what operators can manage
Predefined procedures do not exist
Call in Crisis Management Team to handle event
Fact finding and follow up

12 Future improvements
Too many logbooks and databases
Transformer short circuit -> 3 reports: TI major event report, Accelerator logbook, Fire brigade report
Consolidation of infrastructure and accelerator databases
Extend beyond infrastructure systems

13 Summary
Technical Infrastructure operation handles 1’000’000+ events, 10’000 incidents per year
Strict reporting and follow up
The right information is available when needed: to operator, to management
Further extension and integration necessary
