1
The LHCb Run Control System
An Integrated and Homogeneous Control System
Clara Gaspar, May 2010
2
The Experiment Control System
[Diagram labels: Experiment Control System; Detector Channels; L0; Front End Electronics; Readout Network; HLT Farm; Storage; TFC; Monitoring Farm; DAQ; DCS Devices (HV, LV, GAS, Cooling, etc.); External Systems (LHC, Technical Services, Safety, etc.)]
❚ Is in charge of the Control and Monitoring of all parts of the experiment
3
Some Requirements
❚ Large number of devices/IO channels
  ➨ Need for Distributed Hierarchical Control
    ❘ Decomposition into Systems, sub-systems, …, Devices
    ❘ Local decision capabilities in sub-systems
❚ Large number of independent teams and very different operation modes
  ➨ Need for Partitioning Capabilities (concurrent usage)
❚ High Complexity & Non-expert Operators
  ➨ Need for Full Automation of:
    ❘ Standard Procedures
    ❘ Error Recovery Procedures
  ➨ And for Intuitive User Interfaces
4
Design Steps
❚ In order to achieve an integrated System:
  ❙ Promoted HW Standardization (so that common components could be re-used)
    ❘ Ex.: Mainly two control interfaces to all LHCb electronics
      〡 Credit Card sized PCs (CCPC) for non-radiation zones
      〡 A serial protocol (SPECS) for electronics in radiation areas
  ❙ Defined an Architecture
    ❘ That could fit all areas and all aspects of the monitoring and control of the full experiment
  ❙ Provided a Framework
    ❘ An integrated collection of guidelines, tools and components that allowed the development of each sub-system coherently in view of its integration in the complete system
5
Generic SW Architecture
[Diagram: control tree. Node labels: ECS; INFR.; TFC; LHC; HLT; DCS; DAQ; SubDet1 DCS, SubDet2 DCS, …, SubDetN DCS; SubDet1 DAQ, SubDet2 DAQ, …, SubDetN DAQ; SubDet1 LV, SubDet1 TEMP, SubDet1 GAS; SubDet1 FEE, SubDet1 RO; LV Dev1, LV Dev2, …, LV DevN; FEE Dev1, FEE Dev2, …, FEE DevN. Arrow labels: Commands; Status & Alarms. Legend: Control Unit, Device Unit.]
6
The Control Framework
❚ The JCOP* Framework is based on:
  ❙ SCADA System (PVSS II) for:
    ❘ Device Description (Run-time Database)
    ❘ Device Access (OPC, Profibus, drivers)
    ❘ Alarm Handling (Generation, Filtering, Masking, etc.)
    ❘ Archiving, Logging, Scripting, Trending
    ❘ User Interface Builder
    ❘ Alarm Display, Access Control, etc.
  ❙ SMI++ providing:
    ❘ Abstract behavior modeling (Finite State Machines)
    ❘ Automation & Error Recovery (Rule based system)
* JCOP: the Joint COntrols Project (between the 4 LHC experiments and the CERN Control Group)
[Diagram labels: Device Units; Control Units]
7
Device Units
❚ Provide access to “real” devices:
  ❙ The Framework provides (among others):
    ❘ “Plug and play” modules for commonly used equipment. For example:
      〡 CAEN or Wiener power supplies (via OPC)
      〡 LHCb CCPC and SPECS based electronics (via DIM)
    ❘ A protocol (DIM) for interfacing “home made” devices. For example:
      〡 Hardware devices like a calibration source
      〡 Software devices like the Trigger processes (based on GAUDI, LHCb’s offline framework)
    ❘ Each device is modeled as a Finite State Machine
[Diagram label: Device Unit]
8
Hierarchical control
❚ Each Control Unit:
  ❙ Is defined as one or more Finite State Machines
  ❙ Can implement rules based on its children’s states
  ❙ In general it is able to:
    ❘ Summarize information (for the levels above)
    ❘ “Expand” actions (to the lower levels)
    ❘ Implement specific behaviour & take local decisions
      〡 Sequence & automate operations
      〡 Recover errors
    ❘ Include/Exclude children (i.e. partitioning)
      〡 Excluded nodes can run in stand-alone mode
    ❘ User Interfacing
      〡 Present information and receive commands
[Diagram labels: DCS; Muon DCS; Tracker DCS; …; Muon LV; Muon GAS. Legend: Control Unit]
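The behaviour listed on this slide (summarizing children's states upwards, expanding commands downwards, and excluding children) can be illustrated with a minimal Python sketch. This is not SMI++ or PVSS code: the `Node` and `ControlUnit` classes, the toy reaction to `Configure`/`Reset`, and the summary rule (any ERROR wins, READY only if all included children are READY) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Leaf of the tree, standing in for a Device Unit (hypothetical class)."""
    name: str
    state: str = "NOT_READY"
    included: bool = True

    def command(self, action: str) -> None:
        # Toy device behaviour: only two actions are understood.
        if action == "Configure":
            self.state = "READY"
        elif action == "Reset":
            self.state = "NOT_READY"

    def summarize(self) -> str:
        return self.state


@dataclass
class ControlUnit(Node):
    """Inner node: its state is derived from its included children."""
    children: List[Node] = field(default_factory=list)

    def active(self) -> List[Node]:
        return [c for c in self.children if c.included]

    def summarize(self) -> str:
        # Assumed rule: ERROR dominates; READY needs every included child READY.
        states = [c.summarize() for c in self.active()]
        if "ERROR" in states:
            self.state = "ERROR"
        elif states and all(s == "READY" for s in states):
            self.state = "READY"
        else:
            self.state = "NOT_READY"
        return self.state

    def command(self, action: str) -> None:
        # "Expand" the action to the lower level, then refresh the summary.
        for child in self.active():
            child.command(action)
        self.summarize()

    def exclude(self, name: str) -> None:
        # Partitioning: an excluded child no longer receives commands
        # and no longer contributes to the parent's state.
        for child in self.children:
            if child.name == name:
                child.included = False
        self.summarize()


lv, gas = Node("Muon LV"), Node("Muon GAS")
muon = ControlUnit("Muon DCS", children=[lv, gas])
muon.command("Configure")      # both children become READY
state_after_configure = muon.state
gas.state = "ERROR"            # simulate a fault in one child
state_with_fault = muon.summarize()
muon.exclude("Muon GAS")       # the faulty child is taken out of the partition
state_after_exclude = muon.state
```

Excluding the faulty child lets the parent report READY again, which is the effect the slide describes as partitioning.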
9
Control Unit Run-Time
❚ Dynamically generated operation panels (Uniform look and feel)
❚ Configurable User Panels and Logos
❚ “Embedded” standard partitioning rules:
  ❙ Take
  ❙ Include
  ❙ Exclude
  ❙ Etc.
10
Operation Domains
❚ Three Domains have been defined:
  ❙ DCS
    ❘ For equipment whose operation and stability are normally related to a complete running period
      Example: GAS, Cooling, Low Voltages, etc.
  ❙ HV
    ❘ For equipment whose operation is normally related to the Machine state
      Example: High Voltages
  ❙ DAQ
    ❘ For equipment whose operation is related to a RUN
      Example: Readout electronics, High Level Trigger processes, etc.
11
FSM Templates
❚ DCS Domain
  States: OFF, NOT_READY, READY, ERROR
  Actions: Switch_ON, Switch_OFF, Recover
❚ HV Domain
  States: OFF, STANDBY1, STANDBY2, READY, NOT_READY, ERROR, RAMPING_STANDBY1, RAMPING_STANDBY2, RAMPING_READY
  Actions: Go_STANDBY1, Go_STANDBY2, Go_READY, Recover
❚ DAQ Domain
  States: UNKNOWN, NOT_READY, CONFIGURING, READY, RUNNING, ERROR
  Actions: Configure, Start, Stop, Reset, Recover
❚ All Devices and Sub-Systems have been implemented using one of these templates
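A template of this kind can be read as a transition table. The sketch below encodes the DAQ-domain states and actions named on the slide in plain Python. The arrows themselves are not recoverable from the transcript, so every (state, action) pair in the table is an assumption, as are the `TemplateFsm` class and the internal `_done` event used to leave the transitional CONFIGURING state.

```python
# Assumed transitions for the DAQ-domain template; only the state and
# action names come from the slide, the arrows are a guess.
DAQ_TRANSITIONS = {
    ("NOT_READY", "Configure"): "CONFIGURING",
    ("CONFIGURING", "_done"): "READY",      # hypothetical internal event
    ("READY", "Start"): "RUNNING",
    ("RUNNING", "Stop"): "READY",
    ("READY", "Reset"): "NOT_READY",
    ("ERROR", "Recover"): "NOT_READY",
}


class TemplateFsm:
    """Tiny table-driven state machine (illustrative, not SMI++)."""

    def __init__(self, transitions, initial):
        self.transitions = transitions
        self.state = initial

    def allowed(self):
        # Actions that are valid in the current state, in sorted order.
        return sorted(a for (s, a) in self.transitions if s == self.state)

    def send(self, action):
        key = (self.state, action)
        if key not in self.transitions:
            raise ValueError(f"{action!r} is not allowed in state {self.state}")
        self.state = self.transitions[key]
        return self.state


fsm = TemplateFsm(DAQ_TRANSITIONS, "NOT_READY")
fsm.send("Configure")   # NOT_READY -> CONFIGURING
fsm.send("_done")       # CONFIGURING -> READY
fsm.send("Start")       # READY -> RUNNING
```

Because every device and sub-system shares one of the three templates, a parent node only needs to know the template, not the device, to decide which actions it may forward.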
12
❚ Size of the Control Tree:
  ❙ Distributed over ~150 PCs
    ❘ ~100 Linux (50 for the HLT)
    ❘ ~50 Windows
  ❙ >2000 Control Units
  ❙ >30000 Device Units
❚ The Run Control can be seen as:
  ❙ The Root node of the tree
  ➨ If the tree is partitioned there can be several Run Controls.
[Diagram labels: ECS: Run Control; DCS; SubDet1 DCS, …, SubDetN DCS; DAQ; SubDet1 DAQ, …, SubDetN DAQ; HV; TFC; LHC; HLT; SubDet1 ECS; two links marked X]
13
DAQ Partitioning
❚ ECS Domain
  States: NOT_ALLOCATED, ALLOCATING, NOT_READY, CONFIGURING, READY, ACTIVE, RUNNING, ERROR
  Actions: Allocate, Deallocate, Configure, Reset, StartRun, StopRun, StartTrigger, StopTrigger, Recover
❚ Creating a Partition
  ❙ Allocate = Get a “slice” of:
    ❘ Timing & Fast Control (TFC)
    ❘ High Level Trigger Farm (HLT)
    ❘ Storage System
    ❘ Monitoring Farm
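The Allocate step can be pictured as reserving a share of each common resource, all or nothing. The following Python sketch is purely illustrative: the pool sizes, the request format, the `allocate`/`deallocate` helpers, and the assumption that a successful Allocate leaves the partition in NOT_READY are not taken from the slides.

```python
def allocate(free, request):
    """Reserve `request` from `free` and return the assumed resulting ECS state.

    Nothing is reserved unless every resource can satisfy the request.
    """
    if any(n > free.get(name, 0) for name, n in request.items()):
        return "NOT_ALLOCATED"
    for name, n in request.items():
        free[name] -= n
    return "NOT_READY"   # assumed state after a successful Allocate


def deallocate(free, request):
    """Give the slice back to the common pools."""
    for name, n in request.items():
        free[name] += n
    return "NOT_ALLOCATED"


# Made-up pool sizes, for illustration only.
free = {"TFC": 2, "HLT": 50, "Storage": 4, "Monitoring": 4}
muon_slice = {"TFC": 1, "HLT": 10, "Storage": 1, "Monitoring": 1}
too_big = {"TFC": 1, "HLT": 45, "Storage": 1, "Monitoring": 1}

first = allocate(free, muon_slice)   # succeeds, pools shrink
second = allocate(free, too_big)     # fails, pools are left untouched
```

Checking every resource before taking any of them keeps two concurrent partitions from ending up with half a slice each.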
14
Run Control User Interface
❚ Matrix: Domain X Sub-Detector
❚ Activity: Used for Configuring all Sub-Systems
15
DAQ Configuration
❚ Cold Start to Running takes ~4 minutes
  ❙ Slowest Sub-systems:
    ❘ MUON FE Electronics
      〡 Did not use the recommended HW interface…
    ❘ High Level Trigger
      〡 Startup and Configuration of ~10000 processes
      〡 Trigger Processes based on Offline Software
❚ But a cold start is very rare:
  ❙ Run is started well before “Stable Beams”
  ❙ “Fast Run Change” mechanism
    ❘ When “conditions” change
    ❘ Takes ~5 seconds
16
LHCb Operations
❚ Only 2 Operators on shift
  ❙ The Shift Leader has two views of the system
    ❘ Run Control UI
    ❘ LHCb Status UI
❚ Automation
  ❙ In Sub-systems
    ❘ HLT: Dead Trigger Tasks/Nodes
    ❘ HV: Tripped Channels
  ❙ Centrally
    ❘ Voltages depending on LHC State
    ❘ More will come…
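The central rule "Voltages depending on LHC State" can be sketched as a lookup from machine state to an HV-domain action. The slides do not give the actual mapping, so the rule table below, the `STABLE_BEAMS` state name, the default action, and the `CentralAutomation` class are all hypothetical; only the action names `Go_READY` and `Go_STANDBY1` appear in the HV template.

```python
# Hypothetical rule table: LHC state -> action for the HV domain.
HV_RULES = {"STABLE_BEAMS": "Go_READY"}
SAFE_DEFAULT = "Go_STANDBY1"   # assumed fallback for every other machine state


def hv_action_for(lhc_state):
    return HV_RULES.get(lhc_state, SAFE_DEFAULT)


class CentralAutomation:
    """Forwards an HV action whenever the required action changes."""

    def __init__(self, send):
        self.send = send          # callable(domain, action)
        self.last_action = None

    def on_lhc_state(self, lhc_state):
        action = hv_action_for(lhc_state)
        if action != self.last_action:   # avoid resending the same command
            self.send("HV", action)
            self.last_action = action


sent = []
auto = CentralAutomation(lambda domain, action: sent.append((domain, action)))
for lhc_state in ["INJECTION", "RAMP", "STABLE_BEAMS"]:   # invented sequence
    auto.on_lhc_state(lhc_state)
```

With a rule of this shape the operator does not have to act on the voltages by hand when the machine state changes.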
17
Conclusions
❚ LHCb has designed and implemented a coherent and homogeneous control system
❚ The Run Control:
  ❙ Can configure, monitor and operate the full experiment
  ❙ Can run any combination of sub-detectors in parallel in stand-alone mode
  ❙ Can be completely automated (when we understand the machine)
❚ Some of its main features:
  ❙ Sequencing, Automation, Error recovery, Partitioning
  ➨ Come from the usage of SMI++ (integrated with PVSS)
❚ It is being used daily for Physics data taking and other global or sub-detector activities
18
Sub-Detector Run Control
❚ “Scan” Run