Redundancy in the Control System of DESY’s Cryogenic Facility
M. Bieler, M. Clausen, J. Penning, B. Schoeneburg, DESY
ARW 2013, Melbourne
M. Bieler | Redundancy in the Control System | April 16, 2013 | Page 2
Content
> DESY’s Cryogenic Facility
  Customers
  Requirements
  Control System
> Control System Architecture
  Technical Requirements
  Redundancy
  Selection Criteria
  Selected Components
> Operational Experience
DESY’s Cryogenic Facility
> 3 independent sectors
> 6.8 kW at 4.4 K
> originally built in the 1980s to serve the HERA proton ring
DESY’s Cryogenic Facility
> Major upgrade of two of the three systems in 2010 to serve the European XFEL
> 2.47 kW at 2 K
> Cold compressors
DESY’s Cryogenic Facility
FLASH-Coldbox Customers:
> FLASH, a 1 GeV superconducting linac at 2 K
> Accelerator Module Test Facility (AMTF): 3 module test stands, 2 cavity test stands
> Cryo Module Test Bed (CMTB): 1 module test stand
> Hall 3: 2 cavity test stands
> XMTS: magnet test stand
XFEL-Coldbox(es) Customers:
> European XFEL, a 1 km 17.5 GeV superconducting linac at 2 K
Operation 24/7, no shutdowns foreseen.
DESY’s Cryogenic Facility
Requirements:
> 24/7 operation all year round
> Changing cryogenic loads
> High availability
> Maintenance during full operation
Redundancy (wherever possible)!
DESY’s Cryogenic Facility
Control system:
> Process control system based on EPICS
  All control loops run in EPICS (no PLCs for process controls)
> Operating system: VxWorks for all processes requiring real-time control (any process with PID loops)
> VME / PC / CompactPCI (for redundant systems)
> All I/O on field buses: CAN, Modbus, Profibus (for redundant systems)
> Ethernet (100 Mb / 1 Gb), Cisco (redundant)
> PC operator consoles running Control System Studio (CSS)
Control System Architecture
Technical Requirements:
XFEL Cryogenic Plant:
  Main objective: maintenance (permanent operation for more than one year)
    Hardware maintenance
    Software maintenance
    Installing new system and application software
XFEL Tunnel Installation:
  Main objective: survive radiation damage (MTBR > 1 month)
    … same as for the cryogenic plant
Seamless switchover of:
  Process controller (IOC)
  Power supplies
Control System Architecture
[Diagram: control system layout outside and inside the XFEL tunnel. Labels: Office Network, Gateway, Control Backbone, Router, Ethernet, R-Link, IOC (redundant), FEC-IOC (redundant), PLC, Field Bus, Profibus (with ring topology), sensor labels (T, PT), Cryogenic Plant, XFEL Tunnel; multiplicities "6 X" and "10 X".]
Control System Architecture
Redundancy:
[Diagram: units labelled A and B, power feeds labelled Power A and Power B, and a private link.]
Redundant network, redundant power, redundant IOCs
Control System Architecture
Redundancy:
> redundant sensors (temperature)
> redundant front-end processors
> redundant power supplies
> redundant network connections
> well-established failover procedures
“Any redundant implementation must make the system more reliable than the non redundant one. Precaution must be taken especially for the detection of errors which shall initiate the failover. This operation should only be activated if there is no doubt that keeping the actual mastership will definitely cause more damage to the controlled system than an automatic failover.”
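The quoted principle can be read as a conservative trigger rule: a single bad reading must never cause a switch. The following is a minimal sketch of that idea, not the DESY implementation. The class `FailoverGuard`, its method `update`, and the threshold of three consecutive fatal readings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailoverGuard:
    """Hypothetical debounced failover trigger (illustration only)."""

    required_consecutive: int = 3  # assumed threshold, not taken from the talk
    _fatal_streak: int = 0

    def update(self, fatal_error_seen: bool, partner_ready: bool) -> bool:
        """Return True only when a failover is justified beyond doubt."""
        # Any clean reading resets the streak, so one glitch never switches.
        self._fatal_streak = self._fatal_streak + 1 if fatal_error_seen else 0
        # Never hand over to a partner that is not ready to take mastership.
        return partner_ready and self._fatal_streak >= self.required_consecutive


guard = FailoverGuard()
readings = [True, False, True, True, True]
decisions = [guard.update(r, partner_ready=True) for r in readings]
print(decisions)
```

In this sketch the isolated error in the first reading is forgotten after the clean second reading, and only the third consecutive error at the end requests a failover.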
Control System Architecture
Redundancy:
> Key redundancy tasks on an IOC:
  Redundancy Monitor Task
    Supervision of the tasks running on an IOC
    Switching IOCs in case of serious problems on one IOC
  Continuous Control Executive Task
    Synchronizing the continuous control processing on two IOCs
    Permanent monitoring of all changes in record processing on an IOC
> Core and main objective of any failover:
  The resulting status of an IOC after a failover must be a more stable state than the status before the failover. Diagnostic analysis programs must be activated to ensure this.
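The supervision role described for the Redundancy Monitor Task can be illustrated with a heartbeat counter per supervised task. This is a sketch under stated assumptions: the class `TaskSupervisor`, its methods, and the task names `scan` and `fieldbus_io` are hypothetical, and the real RMT is written in C and may detect problems quite differently.

```python
from typing import Dict, List


class TaskSupervisor:
    """Hypothetical stand-in for the supervision part of a redundancy monitor."""

    def __init__(self, task_names: List[str]) -> None:
        self.counters: Dict[str, int] = {name: 0 for name in task_names}
        self._last_seen: Dict[str, int] = dict(self.counters)

    def heartbeat(self, name: str) -> None:
        """Called by a supervised task each time it completes a cycle."""
        self.counters[name] += 1

    def stalled_tasks(self) -> List[str]:
        """Return tasks whose counter has not advanced since the last check."""
        stalled = [n for n, c in self.counters.items() if c == self._last_seen[n]]
        self._last_seen = dict(self.counters)
        return stalled


supervisor = TaskSupervisor(["scan", "fieldbus_io"])
supervisor.heartbeat("scan")
supervisor.heartbeat("fieldbus_io")
first_check = supervisor.stalled_tasks()   # both tasks advanced
supervisor.heartbeat("scan")               # fieldbus_io stops reporting
second_check = supervisor.stalled_tasks()
print(first_check, second_check)
```

A monitor built this way would treat a non-empty result as one input to the switching decision, not as a trigger on its own.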
Control System Architecture
Selected Components:
> Operating system: VxWorks
> Process control system: EPICS
> Processors:
  no fans, no active cooling
  low-power CPUs (~10 W)
  no hard disks
> Network:
  redundant layout of all network layers
  seamless failover
> Power:
  redundant power supplies, low power
  battery backup for power supplies (UPS)
  diesel generator backup of mains line
Operational Experience
Redundant front-end processors have been in operation for 2 ½ years:
> During the first year we were still gaining experience
  Failover conditions
    What should trigger a failover?
  Update communication over the redundant (private) link caused a lot of network traffic.
    The VxWorks network task crashed randomly.
    The traffic was reduced, and the update tasks have run stably since then.
  Identifying the ‘real’ state of the redundant partner, even without Ethernet communication, was implemented by means of the connected field bus.
    A failover should only occur when the redundant partner is in better condition than the currently active processor. The new, additional measures ensure this.
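The field-bus check described above means that a silent Ethernet link alone does not prove the partner is dead. A minimal sketch of such a two-channel classification follows. The names `PartnerState`, `classify_partner` and `partner_is_better` are hypothetical, and the rule that an IOC without field-bus access cannot control the plant is an assumption made for this example, not a statement from the talk.

```python
from enum import Enum


class PartnerState(Enum):
    HEALTHY = "seen on private link and field bus"
    LINK_SILENT = "seen on field bus only"
    BUS_SILENT = "seen on private link only"
    DOWN = "not seen on either channel"


def classify_partner(seen_on_link: bool, seen_on_fieldbus: bool) -> PartnerState:
    """Combine both channels so that one silent channel is not read as 'dead'."""
    if seen_on_link and seen_on_fieldbus:
        return PartnerState.HEALTHY
    if seen_on_fieldbus:
        return PartnerState.LINK_SILENT
    if seen_on_link:
        return PartnerState.BUS_SILENT
    return PartnerState.DOWN


def partner_is_better(active_fieldbus_ok: bool, partner: PartnerState) -> bool:
    """True only if handing over would improve on the active IOC's condition."""
    # Assumption: an IOC that cannot reach the field bus cannot control the plant.
    partner_reaches_io = partner in (PartnerState.HEALTHY, PartnerState.LINK_SILENT)
    return partner_reaches_io and not active_fieldbus_ok


state = classify_partner(seen_on_link=False, seen_on_fieldbus=True)
print(state.name, partner_is_better(active_fieldbus_ok=True, partner=state))
```

Here the partner is recognised as alive despite the silent private link, and no failover is requested because the active processor still reaches its I/O.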
Operational Experience II
Redundant front-end processors have been in operation for 2 ½ years:
> Stable operation for the last 1 ½ years
  About 6 failovers during full operation of the cryogenic plant:
    The plant continued operation without interruption.
    Note: all control loops run in EPICS on the front-end controller
    Primary cause of failover: Profibus communication (nodes were down)
  Three front-end controllers are running in redundant mode
Summary Redundant Process DESY
> Implementation took quite some effort
> The stable operation over the last 1 ½ years is proof of a successful concept
> All process controllers for XFEL operations will run in redundant mode
> The basic redundancy component (the RMT task) is written in C and is operating-system independent. It is also used on Linux systems to supervise important (Channel Access) network gateways.