The Control and Hardware Monitoring System of the CMS Level-1 Trigger
Ildefons Magrans, Computing and Software for Experiments I
IEEE Nuclear Science Symposium, 30th October 2007
1. Context  2. Concept  3. Framework  4. System  5. Services
Ildefons Magrans, Institute for High Energy Physics of the Austrian Academy of Sciences, Vienna, Austria

Context
~55 million channels, ~1 MB per event
40 MHz, ~20 events per BX
100 kHz, no dead time
100 Hz
L1 Decision Loop: HARDWARE
CMS Control System: SOFTWARE
Hardware context
3.2 µs L1-Trigger Decision Loop
Configuration: 64 crates, O(10³) boards, firmware ~15 MB/board, O(10²) registers/board, 8 independent detector partitions
Testing: O(10³) links
Integration coordination: large number of involved institutes
Software context
Run Control and Monitoring System (RCMS): overall experiment control and monitoring; the RCMS framework is implemented in Java.
Detector Control System (DCS): detector safety, gas and fluid control, cooling system, rack and crate control, high and low voltage control, and detector calibration; DCS is implemented with PVSS II.
Cross-platform Data AcQuisition middleware (XDAQ): C++ component-based distributed programming framework; used to implement the distributed event builder.
L1-Trigger Control and Hardware Monitoring System: provides machine and human interfaces to operate, test and monitor the Level-1 decision loop hardware components.
Concept
[1] Conceptual design (IEEE TNS, vol. 53, no. 2, April 2006)
[2] Prototype (IEEE NSS 2005, Puerto Rico)
Baseline Infrastructure
CMS official software frameworks to develop distributed systems: DCS, RCMS, XDAQ.
The subsystems' Online SoftWare Infrastructure (OSWI) needs to be integrated, and the infrastructure should be oriented to developing SCADA systems.
Subsystem OSWI (C++, Linux) integration effort / supervisory and control infrastructure development effort:
DCS (PVSS II, Windows): ++ / Ok
RCMS (Java): ++
XDAQ (C++, Linux): Ok / +
Conclusion: XDAQ-based baseline solution, plus additional development to reach a SCADA framework.
The Cell
Synchronous and asynchronous SOAP API.
HTTP/CGI interface: automatically generated (e.g. cell FSM operation).
Plug-ins:
Command: RPC method, SOAP API extension.
Monitoring items.
FSM.
Control panel plug-ins (e.g. GT panel, DTTF panel).
Xhannel infrastructure: designed to simplify access to web services (SOAP and HTTP/CGI, e.g. Tstore (DB), monitor collector, other cells) from operation transition methods.
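The command plug-in mechanism can be illustrated with a short sketch. The real cell is a C++ XDAQ application whose API is exposed over SOAP; the Python below is only a model of the idea, and every name in it (Cell, CommandPlugin, readRegister, and so on) is hypothetical, not the actual Trigger Supervisor API:

```python
# Hypothetical sketch: a cell whose RPC vocabulary is extended by
# registering command plug-ins. Illustrative names only; the real
# framework is C++ with a SOAP transport.

class CommandPlugin:
    """A named RPC method contributed by a subsystem expert."""
    name = "noop"

    def run(self, **params):
        return {}

class ReadRegister(CommandPlugin):
    name = "readRegister"

    def __init__(self, registers):
        self.registers = registers  # stand-in for real board access

    def run(self, **params):
        return {"value": self.registers[params["reg"]]}

class Cell:
    def __init__(self):
        self.commands = {}

    def add_command(self, plugin):
        # Registering a plug-in extends the cell's (SOAP) API.
        self.commands[plugin.name] = plugin

    def dispatch(self, name, **params):
        # In the real system this entry point is a SOAP message handler.
        return self.commands[name].run(**params)

cell = Cell()
cell.add_command(ReadRegister({"CTRL": 0x1A}))
result = cell.dispatch("readRegister", reg="CTRL")
```

Dispatch by command name is what lets each subsystem extend the shared cell without touching the framework code.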
The Trigger Supervisor Framework Components
Cell: facilitates subsystem integration and operation (additional development, next slide); 1 per crate.
Tstore: DB interface, exposes SOAP; 1 per system.
Monitor collector: polls all cell sensors; 1 per system.
Mstore: interfaces the monitor collector with Tstore; 1 per system.
Monitor sensor: cell interface to poll monitoring information; 1 per cell.
Log collector: collects log statements from cells and forwards them to consumers; 1 per system.
XS: reads the logging database; 1 per cell.
Job control: remote start-up of XDAQ applications; 1 per host.
RCMS components.
The system is based uniquely on these components.
L1 Trigger Control System
A hierarchical system enhances: distributed development, subsystem control, partial deployment, graceful degradation, centralized access to DBs.
1 crate ~ 1 cell; multicrate subsystems ~ 2 levels of subsystem cells (1 subsystem central cell).
Framework for the Configuration and Interconnection Test services.
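The hierarchy and its graceful degradation can be sketched as follows. All names are hypothetical; the only point carried over from the slides is that a central cell fans an operation out to subsystem cells, and the loss of one branch is reported rather than fatal:

```python
# Hypothetical sketch of hierarchical cell control: a central cell fans an
# operation out to its child cells; failures are collected, not fatal
# (graceful degradation). Illustrative names only.

class CrateCell:
    def __init__(self, name, alive=True):
        self.name, self.alive = name, alive

    def execute(self, op):
        if not self.alive:
            raise RuntimeError(f"{self.name} unreachable")
        return f"{self.name}:{op}:done"

class CentralCell:
    def __init__(self, children):
        self.children = children

    def execute(self, op):
        results, failures = [], []
        for child in self.children:
            try:
                results.append(child.execute(op))
            except RuntimeError as err:
                failures.append(str(err))  # degrade, don't abort
        return results, failures

gt = CrateCell("GT")
dttf = CrateCell("DTTF", alive=False)  # simulate a dead branch
central = CentralCell([gt, dttf])
results, failures = central.execute("configure")
```

The same pattern supports partial deployment: a central cell simply runs with whichever children are present.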
L1 Trigger Monitoring System
1 cell ~ 1 monitor sensor; system ~ 1 monitor collector, 1 Mstore (centralized system).
Centralized access to DBs.
Framework for the Hardware Monitoring service.
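The centralized topology, one sensor per cell with a single collector behind a store, can be modelled in a few lines. This is a sketch under assumed names (MonitorSensor, MonitorCollector, a plain list standing in for the Mstore/Tstore path), not the real components:

```python
# Hypothetical sketch of the centralized monitoring topology: one monitor
# sensor per cell, one collector polling every sensor, one store behind it.

class MonitorSensor:
    def __init__(self, cell_name, items):
        self.cell_name = cell_name
        self.items = items  # item name -> callable returning a value

    def poll(self):
        return {name: read() for name, read in self.items.items()}

class MonitorCollector:
    def __init__(self, sensors, store):
        self.sensors, self.store = sensors, store

    def collect(self):
        for sensor in self.sensors:
            for item, value in sensor.poll().items():
                # Mstore-like role: push collected values towards the DB.
                self.store.append((sensor.cell_name, item, value))

store = []
sensors = [
    MonitorSensor("GT",   {"temperature": lambda: 41.5}),
    MonitorSensor("DTTF", {"linkStatus":  lambda: "OK"}),
]
MonitorCollector(sensors, store).collect()
```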
L1 Trigger Logging and Start-up Systems
1 cell ~ 1 XS; system ~ 1 log collector (centralized system); 1 host ~ 1 job control (JC).
Auxiliary systems.
Control System Services: Configuration
Methodology for a new service:
1. Define the FSM plug-in in the central cell.
2. Define the operation transition methods (= define the expected subsystem central cell FSMs).
This doubles as a subsystem-integration coordination strategy and can be applied to multicrate subsystems.
The Configuration service: 1. configures the L1 Trigger hardware; 2. configures a partition.
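The two-step methodology can be sketched as follows: an FSM plug-in in the central cell whose transition method drives the FSMs it expects in the subsystem central cells. Class names, states, and the "key" parameter are all assumptions for illustration, not the real interface:

```python
# Hypothetical sketch of the configuration methodology: a central-cell FSM
# plug-in whose "configure" transition method defines what each subsystem
# central cell FSM must implement. Illustrative names only.

class SubsystemCell:
    def __init__(self, name):
        self.name, self.state = name, "halted"

    def configure(self, key):
        # Would program the subsystem's crates and boards here.
        self.state = "configured"
        return self.state

class CentralFSM:
    states = ("halted", "configured")

    def __init__(self, subsystems):
        self.subsystems = subsystems
        self.state = "halted"

    def configure(self, key):
        # Transition method (step 2): driving the children here is what
        # fixes the FSM interface every subsystem central cell must offer.
        for sub in self.subsystems:
            sub.configure(key)
        self.state = "configured"

fsm = CentralFSM([SubsystemCell("GT"), SubsystemCell("GMT")])
fsm.configure(key="physics_v1")
```

Because the central transition method is written first, it acts as the coordination contract the subsystem groups integrate against.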
Crate Cell Services
New command plug-ins extend the cell API.
Additional FSMs fulfil the requirements of experts during commissioning and testing operations.
Control panel plug-ins extend the default cell GUI with expert-oriented control panels.
Crate cell level services = expert-level facilities that replace standalone tools and programs.
Monitoring System Services
Making a new monitoring item visible to the central collector requires:
1. Declaring it in an XML file (shared with the central collector).
2. Defining a callback routine in the crate cell.
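The two-step recipe can be sketched end to end. The XML schema, the decorator-style registration, and every name below are made up for illustration; only the shape of the recipe (declare in shared XML, define a callback in the crate cell) comes from the slides:

```python
# Hypothetical sketch of the two-step monitoring recipe: (1) declare the
# item in an XML file shared with the central collector, (2) define the
# callback in the crate cell. The XML schema and names are invented.

import xml.etree.ElementTree as ET

ITEMS_XML = """
<monitoring cell="GT">
  <item name="boardTemperature" type="float"/>
</monitoring>
"""

class CrateCell:
    def __init__(self):
        self.callbacks = {}

    def monitoring_item(self, name):
        def register(fn):
            self.callbacks[name] = fn  # step 2: callback in the crate cell
            return fn
        return register

    def poll(self, declared):
        # Only items declared in the shared XML are visible to the collector.
        return {name: self.callbacks[name]() for name in declared
                if name in self.callbacks}

# Step 1: the collector and the cell share the declared item list.
declared = [item.get("name")
            for item in ET.fromstring(ITEMS_XML).iter("item")]

cell = CrateCell()

@cell.monitoring_item("boardTemperature")
def read_temperature():
    return 38.0  # would read the actual board sensor

values = cell.poll(declared)
```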
Summary
HW context → problem definition.
SW context → control system & available facilities.
Concept → conceptual design of the solution ~ agreement with all involved parties.
Prototype → proof of concept and better understanding of the requirements.
Framework → filling the gap between the available software facilities and the ideal framework.
System → distributed software system = flexible services provider.
Services → solution.