DQM Architecture from the Online Perspective
EvF working group, 11/10/2006
E. Meschi – CERN PH/CMD

Slide 2 – DQM Requirements
1. Primary goal: provide "fast" feedback to the shift crew and subsystem experts about the quality of the event data being taken
2. Provide global and subsystem-specific "quality flags" for each unit of event data (aka Luminosity Section)
3. Provide a uniform environment and a modular structure for DQM code (DQM code reusability)
4. Provide a common working environment for expert and generic monitoring alike
5. Integrate well into online operations (e.g. core activities started automatically by RunControl)
6. Provide a hierarchical online view of the status of the experiment
7. Provide a uniform look and feel for DQM GUIs
8. Enable seamless integration of offline DQM activities (see 3.)
9. Enable remote DQM shifts

Slide 3 – DQM Infrastructure
DQMServices
– Fully integrated with CMSSW
– Modularity of user code imposed by the framework
– Uniform interface for creation and management of DQM objects (a booking sketch follows below)
– Bookkeeping, transport and collation of DQM data
– Quality tests and status tracking
– Web interface toolkit, XDAQ integration
– Visual client integrated with Iguana
– See C.L. presentation
80% of the requirements on the previous slide are covered
– How to get the remaining 20% is one of the subjects of this workshop.
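
How the uniform booking interface looks from a subsystem's point of view can be pictured with a short sketch. The fragment below is a hypothetical CMSSW analyzer that books a histogram through a DQMStore-like service and fills it per event; the class and method names (DQMStore, MonitorElement, book1D) follow the generic DQMServices pattern, but the exact headers, booking entry point and signatures depend on the CMSSW release and are assumptions here, not a verbatim interface.

    // Minimal sketch of a subsystem DQM source (hypothetical names, see above).
    #include "FWCore/Framework/interface/EDAnalyzer.h"
    #include "FWCore/Framework/interface/Event.h"
    #include "FWCore/ParameterSet/interface/ParameterSet.h"
    #include "FWCore/ServiceRegistry/interface/Service.h"
    #include "DQMServices/Core/interface/DQMStore.h"
    #include "DQMServices/Core/interface/MonitorElement.h"

    class MySubsysDQMSource : public edm::EDAnalyzer {
    public:
      explicit MySubsysDQMSource(const edm::ParameterSet&) {
        edm::Service<DQMStore> store;                   // central booking/bookkeeping service
        store->setCurrentFolder("MySubsys/Occupancy");  // the folder is the unit of subscription/collation
        occupancy_ = store->book1D("chamberOcc", "Chamber occupancy", 100, 0., 100.);
      }
      virtual void analyze(const edm::Event& event, const edm::EventSetup&) {
        double occ = 42.;        // placeholder: derive the monitored quantity from the event data
        occupancy_->Fill(occ);
      }
    private:
      MonitorElement* occupancy_;  // owned by the DQM back-end, not by the module
    };
    // (module registration macro omitted)

The framework imposes the modularity mentioned above: the user code only books and fills, while bookkeeping, transport, collation and quality testing are handled by DQMServices.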

Slide 4 – DQMServices Use Cases
[Diagram: DQMServices use cases. Online chain: crate-controller PCs feed a collector which serves consumers via data subscriptions; the filter farm feeds DQM consumers through the event server / Storage Manager; quasi-online consumers read event data from the Storage Manager. Deployment variants shown: framework only, framework + XDAQ, standalone XDAQ/wrapped; transport over TCP/TMessage.]

Slide 5 – Frequent Questions
– Which network will I be running on?
– Can I / should I use CMSSW?
– How is my process going to be started / controlled?
– Do I get access to OMDS? To ORCON?
– Do I have access to DCS data?
– Do I have access to DAQ monitoring data?

Slide 6 – DQM Modes of Operation
Online at the crate-controller level
– Input rate: limited by VME access (*)
– Event building: no
– CPU: crate-controller PCs
– Bandwidth: consistent with the experiment network
– Delay: virtually zero
– Constraints: experiment network; can use CMSSW; can use RunControl (sub-detector); free access to DB; DCS via PSX; DAQ monitoring via DB
Online in the Filter Farm
– Input rate: up to 100 kHz
– Event building: yes
– CPU: 0–10% of the HLT CPU
– Bandwidth: 0–5% of the total bandwidth (1 GB/s)
– Delay: 0
– Constraints: experiment network; must use CMSSW; must use RunControl; limited access to DB; DCS: no; DAQ monitoring: no
Online in an Event Consumer
– Input rate: 1–10 Hz aggregate
– Event building: yes
– CPU: subsystem CPUs
– Bandwidth: consistent with the experiment network
– Delay: seconds
– Constraints: experiment or campus network; must use CMSSW; can use RunControl; free access to DB (experiment); DCS via PSX or DB; DAQ monitoring via DB

Slide 7 – DQM Modes of Operation (continued)
Quasi-online: processing a local file from the SM
– Input rate: O(10) Hz aggregate
– Event building: yes
– CPU: subsystem CPUs
– Bandwidth: consistent with the experiment network
– Delay: minutes
– Constraints: experiment or campus network; must use CMSSW; can use RunControl; free access to DB (experiment); DCS via DB; DAQ monitoring via DB
Offline processing
– Input rate: virtually all stored data (O(100) Hz)
– Event building: yes
– CPU: batch farm
– Bandwidth: consistent with the campus network
– Delay: ~1 hour
– Constraints: Grid; must use CMSSW; cannot use RunControl; access to the offline DB only; DCS indirectly via the conditions DB; DAQ monitoring: no

Slide 8 – DQM in the Filter Farm
The one and only way to see 100% of the events accepted by L1.
Embedding DQM in the HLT, however, has the following disadvantages:
1. It must be accounted for in the HLT CPU budget.
2. It affects the robustness of the HLT: DQM code run this way is subject to much stricter requirements and will not be allowed to change frequently.
3. DQM data are scattered over many sources: the bandwidth to the collector is limited, and a standard collation operation must be carried out in the collector to reduce the data volume (a minimal collation sketch follows below).
It should therefore be reserved for cases where
– the entire L1-accept rate is needed, or
– large statistics must be accumulated over a short period (e.g. at the beginning of a run).
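
What the collation operation amounts to can be shown with a small standalone sketch: many Filter Units each produce their own copy of the same histogram, and the collector reduces them to one object per path by summing. The real DQMServices collation acts on MonitorElements and handles more object types; the TH1-based structures and the function name below are assumptions used only to illustrate the idea.

    // Sum per-FU copies of the same monitor histogram into one collated object per path.
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>
    #include "TH1F.h"

    // One update from one Filter Unit: full path of the histogram plus its contents.
    struct DQMUpdate {
      std::string path;                 // e.g. "MySubsys/Occupancy/chamberOcc"
      std::shared_ptr<TH1F> histogram;  // this FU's copy
    };

    // Collate a batch of per-FU updates into a single histogram per path.
    std::map<std::string, std::unique_ptr<TH1F>>
    collate(const std::vector<DQMUpdate>& updates) {
      std::map<std::string, std::unique_ptr<TH1F>> collated;
      for (const auto& u : updates) {
        auto it = collated.find(u.path);
        if (it == collated.end()) {
          // First copy seen for this path: clone it as the accumulator.
          collated[u.path].reset(static_cast<TH1F*>(u.histogram->Clone()));
        } else {
          // Further copies: TH1::Add accumulates bin contents and entry counts.
          it->second->Add(u.histogram.get());
        }
      }
      return collated;
    }

Because only the collated result travels further (SM to proxy server to consumers), the data volume stays bounded even though many Filter Units fill the same histograms.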

Slide 9 – Filter Farm Data Operation
[Diagram: filter farm data flow. The Storage Managers act as data logger and as event/DQM server, with buffers for event data, special streams and DQM data; an event/DQM proxy/caching server holds DQM snapshot buffers and serves the event consumers and DQM consumers.]

Slide 10 – FF DQM Data Handling
First level of DQM collection in the Storage Manager
– Does the collation of the many FU copies
The proxy/caching server collects the collated updates from all SMs
– Does the final collation
– Saves a snapshot per LS (see the sketch below)
– Serves individual consumers
– Is the only point of access from outside the experiment network
Consumers of FF DQM
– Can subscribe to individual DQM "folders"
– Only have access to collated information
– Are responsible for processing the DQM information (quality tests, status variables, presentation, etc.)
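
To make "saves a snapshot per LS" and the folder subscriptions concrete, here is a small data-structure sketch of what the proxy/caching server needs to keep: the collated histograms filed by folder path and frozen per (run, luminosity section), with a consumer query returning only the entries below the subscribed folder. All class and method names are hypothetical; the slides do not specify the proxy server at this level of detail.

    // Hypothetical snapshot buffer of the proxy/caching server.
    #include <map>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>
    #include "TH1.h"

    class DQMSnapshotBuffer {
    public:
      using Key = std::pair<unsigned, unsigned>;   // (run number, luminosity section)

      // Store the final collated histogram for a given run/LS under its folder path.
      void store(unsigned run, unsigned ls, const std::string& path,
                 std::shared_ptr<TH1> histogram) {
        snapshots_[{run, ls}][path] = std::move(histogram);
      }

      // Serve a consumer subscribed to a folder: every histogram whose path starts
      // with the folder prefix, taken from the snapshot of the requested run/LS.
      std::vector<std::shared_ptr<TH1>>
      query(unsigned run, unsigned ls, const std::string& folder) const {
        std::vector<std::shared_ptr<TH1>> result;
        auto snap = snapshots_.find({run, ls});
        if (snap == snapshots_.end()) return result;
        for (const auto& entry : snap->second)
          if (entry.first.compare(0, folder.size(), folder) == 0)
            result.push_back(entry.second);
        return result;
      }

    private:
      std::map<Key, std::map<std::string, std::shared_ptr<TH1>>> snapshots_;
    };

A consumer therefore only ever sees the collated, per-LS information, in line with the access rules listed above.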

Slide 11 – Other Online Sources of DQM Data
Event and non-event DQM from crate controllers
– Should be part of the sub-detector online configuration (and thus be controlled by the sub-detector FM)
– Including collection and collation
Event consumers (using either the event server or disk streams)
– Should be controlled by RunControl
– Should be grouped into a few processes by functionality and input
– E.g. all DQM modules that use a zero-bias special stream are run by the same process
One or multiple collectors
– Collation in the case of multiple identical sources is delegated to the client

Slide 12 – DQM Clients
Two types of consumers of DQM information:
– Intelligent clients ("superclients"), a fragment of which is sketched below
  - Do data manipulation
  - Are themselves producers of DQM data
  - Can act as servers
  - Can write into the CondDB
  - Can (but do not necessarily) provide graphical feedback
  - Can (but do not necessarily) provide interactive control (e.g. switch to expert mode…)
  - Should be XDAQ applications so they can best be controlled by RunControl
  - Can be framework applications to gain access to framework services (e.g. ORCON); see S.B. talk
  - Can run unattended and provide feedback to the operator via warning/error messages
– Dumb clients (e.g. GUIs)
  - Do not add information or manipulate data
  - Cannot act as servers
  - Cannot write into the CondDB
  - Provide interactive feedback
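
The division of labour between the two client types is easiest to see in code. The fragment below is a hypothetical piece of a superclient: it takes a collated histogram, runs a quality test on it and emits a derived quality flag, i.e. it manipulates data and produces new DQM information, which a dumb client would then merely display. The test shown (mean within an allowed window) is purely illustrative; DQMServices provides its own quality-test machinery.

    // Hypothetical superclient fragment: a quality test producing a derived flag.
    #include <string>
    #include "TH1.h"

    enum class QualityFlag { Good, Warning, Error, NoData };

    struct QualityResult {
      std::string path;   // DQM folder path of the tested histogram
      QualityFlag flag;   // derived status: new DQM information produced by the superclient
      double value;       // the tested quantity (here: the histogram mean)
    };

    // Run one test; in practice the thresholds would come from configuration.
    QualityResult testMeanInRange(const std::string& path, const TH1& histogram,
                                  double lo, double hi, double margin) {
      if (histogram.GetEntries() == 0) return {path, QualityFlag::NoData, 0.};
      const double mean = histogram.GetMean();
      if (mean >= lo && mean <= hi)                   return {path, QualityFlag::Good, mean};
      if (mean >= lo - margin && mean <= hi + margin) return {path, QualityFlag::Warning, mean};
      return {path, QualityFlag::Error, mean};
    }

Such results can in turn be served to other clients or written to the CondDB, which is exactly what a dumb client is not allowed to do.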

Slide 13 – Client Operation
DQM is controlled as a separate sub-system of the DAQ (excluding DQM in the FF):
– Sources (event consumers)
– Collectors
– Intelligent clients
With a full state-machine binding for XDAQ applications (e.g. derived from DQMBaseClient)
– they receive configure and run start/stop commands.
Without an XDAQ binding, control is limited to starting/stopping the processes
– as a minimum this gives a report line to know whether a process is alive.
Control is on a "best-effort" basis, i.e. the DAQ will not stop if a DQM component crashes.
Each superclient must provide a non-graphic synoptic view of the status of the sub-system it monitors:
– Key plots (used in the status calculation) are stored in a snapshot (at every LS)
– Plus a navigable hierarchy of status information based on the folder organization (e.g. one folder per chamber, with the status calculated from the status of the contained histograms, etc.); a roll-up sketch follows after this slide
[Diagram: control hierarchy. Top node → DAQ and DQM; the FF subsystem provides the HLT as DQM source and the SM as DQM collector; the DQM subsystem comprises crate-controller DQM sources, event consumers, collectors and superclients, feeding a client concentrator and a global status display.]
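
One way the navigable status hierarchy could be derived from the folder organization is sketched below: each folder's status is the worst of the statuses of its own histograms and of its sub-folders, so a single bad chamber folder propagates up to the sub-detector summary node. The "worst wins" rule is an assumption for illustration; defining the actual policy for combining status information is one of the open standardization items of the summary slide.

    // Hypothetical roll-up of the folder-organized status hierarchy.
    #include <algorithm>
    #include <map>
    #include <string>
    #include <vector>

    enum class Status { Good = 0, Warning = 1, Error = 2 };   // ordered by increasing severity

    struct StatusFolder {
      std::string name;                       // e.g. "RPC/Wheel+1/Chamber03" (example only)
      std::vector<Status> histogramStatuses;  // quality-test results of the contained histograms
      std::vector<StatusFolder> subFolders;   // nested folders (chambers, sectors, ...)
    };

    // Compute each folder's status recursively and record it per folder path.
    Status rollUp(const StatusFolder& folder, const std::string& parentPath,
                  std::map<std::string, Status>& summary) {
      const std::string path =
          parentPath.empty() ? folder.name : parentPath + "/" + folder.name;
      Status worst = Status::Good;
      for (Status s : folder.histogramStatuses)
        worst = std::max(worst, s);
      for (const StatusFolder& sub : folder.subFolders)
        worst = std::max(worst, rollUp(sub, path, summary));
      summary[path] = worst;                  // navigable: one status entry per folder path
      return worst;
    }

The resulting map gives the kind of non-graphic synoptic view a superclient is required to provide, and the same snapshot-per-LS mechanism can store it alongside the key plots.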

Slide 14 – Organization of Online DQM
Hardware
– Online DQM PCs must be connected to the experiment network
– They are in general a responsibility of the sub-detector
– System management is carried out centrally by the DAQ team
– Disk space for monitor streams and DQM snapshots is managed centrally (as part of the Storage Manager complex)
Software
– Central XDAQ and CMSSW installations are provided
– Sub-systems can derive project trees for fast development
– NO flexibility for code running on the filter farm
– SOME flexibility for code running in "quasi-online" mode (compatible with centralized configuration/control)
– Freedom for applications under sub-system responsibility (e.g. DQM in a crate controller under sub-detector FM control)
DB
– Database access by individual DQM processes MUST happen via one of the approved mechanisms (TStore for OMDS and POOL-ORA for ORCON)
– Database access bandwidth for DQM MUST be negotiated with the DB group
– The general rule of thumb: NO DEADTIME due to the DB being stuck on a DQM access

Slide 15 – Summary
The existing infrastructure covers 80% of the DQM requirements.
Standardization of DQM data generation is achieved (using DQMServices/framework components).
Standardization of the "superclients" must be achieved:
– Enforce a hierarchy of views
– Enforce use of the quality-test and status tools
– Enforce use of standard entry points for data/status manipulation
– Define policies for combining status information
Standardization of control:
– Use RunControl to drive the DQM processes
– DQM becomes a "subsystem"
– Line of reporting for critical errors
Standardization of look and feel:
– GUI: development needed for production-level use
– Color codes, etc.