First year experience with the ATLAS online monitoring framework. Alina Corso-Radu, University of California Irvine, on behalf of the ATLAS TDAQ Collaboration.


First year experience with the ATLAS online monitoring framework. Alina Corso-Radu, University of California Irvine, on behalf of the ATLAS TDAQ Collaboration. CHEP 2009, March 23rd-27th, Prague.

2 Outline
- ATLAS trigger and data acquisition system at a glance
- Online monitoring framework
- Readiness of the online monitoring for runs with cosmic rays and the first LHC beam
- Conclusions

3 ATLAS Trigger/DAQ
- Detectors: Inner Detector (Pixel, SCT, TRT), Calorimeter (LAr, TileCal), Muon Spectrometer (MDT, RPC, TGC, CSC), read out through the Read Out Systems (ROSs).
- Bunch crossing rate 40 MHz; interaction rate ~1 GHz.
- LVL1 Trigger (hardware based): output rate <100 kHz; calorimeter and muon based, using coarse granularity data; identifies Regions of Interest.
- High Level Triggers (software based):
  - LVL2 Trigger: ~3 kHz; partial event reconstruction in the Regions of Interest using full granularity data; trigger algorithms optimized for fast rejection.
  - Event Builder (EB).
  - Event Filter: ~200 Hz; full event reconstruction seeded by LVL2; trigger algorithms similar to offline.
- Selected events are written to Data Storage.
- Farm sizes at the time: 900 farm nodes (1/3 of the final system), plus farms of ~150 PCs and ~100 PCs for the other components.
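Taken at face value, these figures correspond to approximate rate-reduction factors (a rough calculation derived only from the rates quoted above, not numbers from the slide itself):
- LVL1: 40 MHz / 100 kHz ≈ 400
- LVL2: 100 kHz / 3 kHz ≈ 33
- Event Filter: 3 kHz / 200 Hz ≈ 15
- Overall: 40 MHz / 200 Hz ≈ 200 000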

4 Online monitoring framework
- Analyze the event content and produce histograms.
- Analyze the operational conditions of hardware and software detector elements and of the trigger and data acquisition systems.
- Automatic checks of histogram and operational data; visualize and save the results; produce visual alerts.
- A set of tools to visualize the information, aimed at the shifters.
- Automatic archiving of histograms.
- Monitoring data available remotely.
- The framework has to cope with the complexity and diversity of the monitoring needs of the sub-systems.
Components (running on about 35 dedicated machines): Event Analysis Frameworks, Data Quality Analysis Framework, Data Monitoring Archiving Tools, Visualization Tools and a Web Service, exchanging event samples and operational data with the Data Flow (ROD/LVL1/HLT) through the Information Service.

5 Data Quality Monitoring Framework
- Distributed framework that provides the mechanism to execute automatic checks on histograms and to produce results according to a particular user configuration.
- Input and Output classes can be provided as plug-ins; custom plug-ins are supported.
- About 40 predefined algorithms exist (histogram empty, mean values, fits, reference comparison, etc.); custom algorithms are allowed.
- Writes DQResults automatically to the Conditions Database.
- Data path: histograms published by the event samplers, the event analysis frameworks and the Data Flow (ROD/LVL1/HLT) reach the framework via the Information Service and its Input Interface; the checks are driven by the Configuration (Configuration DB, Configuration Interface); the DQResults go through the Output Interface to the Data Quality monitoring display and the Conditions DB.
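To make the plug-in idea concrete, here is a minimal sketch of what a custom check algorithm could look like. The interface (DQAlgorithm, DQResult) and the thresholds are hypothetical, chosen for illustration only; the actual DQMF C++ plug-in API differs in its details.

```cpp
// Minimal sketch of a DQMF-style check algorithm (hypothetical interface).
// It flags a histogram whose average drifts outside configured thresholds,
// in the spirit of the predefined "mean values" check mentioned above.
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

enum class Flag { Green, Yellow, Red };          // DQMF-like color codes

struct DQResult {
    Flag flag;
    std::string message;
};

// Hypothetical base class standing in for the real DQMF algorithm plug-in API.
class DQAlgorithm {
public:
    virtual ~DQAlgorithm() = default;
    virtual DQResult execute(const std::vector<double>& binContents) const = 0;
};

// Example check: compare the average of the supplied bin contents
// (standing in for the histogram mean) against warning/error thresholds.
class MeanCheck : public DQAlgorithm {
public:
    MeanCheck(double warn, double error) : warn_(warn), error_(error) {}

    DQResult execute(const std::vector<double>& binContents) const override {
        if (binContents.empty())
            return {Flag::Red, "histogram is empty"};
        const double mean =
            std::accumulate(binContents.begin(), binContents.end(), 0.0) /
            binContents.size();
        if (mean > error_) return {Flag::Red, "mean above error threshold"};
        if (mean > warn_)  return {Flag::Yellow, "mean above warning threshold"};
        return {Flag::Green, "mean within limits"};
    }

private:
    double warn_, error_;
};

int main() {
    MeanCheck check(/*warn=*/5.0, /*error=*/10.0);
    const DQResult r = check.execute({1.0, 2.0, 3.0, 4.0});
    std::cout << "flag=" << static_cast<int>(r.flag)
              << " (" << r.message << ")\n";
}
```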

6 DQM Display
- Summary panel shows the overall color-coded DQ status produced by DQMF for each sub-system, together with the Run Control conditions and log information.
- Details panel offers access to the detailed monitored information: the checked histograms and their references, and the configuration information (algorithms, thresholds, etc.).
- History tab displays the time evolution of the DQResults.
- About 17 thousand histograms are checked; the shifter's attention is focused on the bad histograms.
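The per-sub-system summary can be thought of as an aggregation of the individual check results. The sketch below shows one plausible scheme, "worst flag wins" propagation; the flag names match the color codes mentioned above, but the aggregation rule and the interface are illustrative assumptions, not the exact DQMF configuration.

```cpp
// Sketch: derive a sub-system summary color from its individual DQ results
// using a simple "worst flag wins" rule. Illustrative only; in DQMF the
// aggregation is driven by the user configuration.
#include <algorithm>
#include <iostream>
#include <vector>

enum class Flag { Undefined = 0, Green = 1, Yellow = 2, Red = 3 };

Flag summarize(const std::vector<Flag>& results) {
    Flag worst = Flag::Undefined;
    for (Flag f : results)
        worst = std::max(worst, f);   // the enum order encodes severity
    return worst;
}

int main() {
    const std::vector<Flag> pixelChecks = {Flag::Green, Flag::Yellow, Flag::Green};
    std::cout << "Pixel summary flag: "
              << static_cast<int>(summarize(pixelChecks)) << '\n';  // 2 = Yellow
}
```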

7 DQM Display - layouts
- The DQM Display allows for a graphical representation of the sub-systems and their components using detector-like pictorial views, so bad histograms are spotted even faster.
- An expert tool, the DQM Configurator, is provided for editing the configuration, aimed at layouts and shapes:
  - starting from an existing configuration one can attach layouts and shapes;
  - these layouts are created and displayed online the same way they will appear in the DQM Display;
  - experts can tune the layout/shape parameters until they look as required.

8 Online Histogram Presenter
- The main shifter tool for checking histograms manually.
- Supports a hierarchy of tabs which contain predefined sets of histograms; reference histograms can be displayed as well.
- Sub-systems normally have several tabs with the most important histograms that have to be watched.

9 Trigger Presenter
- Presents trigger-specific information in a user-friendly way: trigger rates, trigger chain information, HLT farm status.
- Reflects the status of the HLT sub-farms using the DQMF color codes.
- Implemented as an OHP plug-in.

10 Histogram Archiving
- Almost 100 thousand histograms are currently saved at the end of a run (~200 MB per run).
- The Monitoring Data Archiving service reads histograms from the Information Service according to the given configuration and saves them to ROOT files.
- It registers the ROOT files with the Collection and Cache service, accumulates files into large (ZIP) archives and sends them to CDR.
- Archiving is done asynchronously with respect to the run states/transitions.
- Archived histograms can also be browsed with a dedicated tool (Archive Browser).
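As an illustration of the "histograms to ROOT files" step, a minimal sketch using plain ROOT is shown below. The real archiving service reads the histograms from the Information Service and also handles registration, zipping and the CDR transfer, none of which is shown here; the file naming and the helper function are invented for the example.

```cpp
// Sketch: write a set of in-memory histograms to one ROOT file per run.
// Illustrative only; the actual archiving service pulls histograms from the
// Information Service and also registers/zips/ships the files (not shown).
#include <string>
#include <vector>

#include "TFile.h"
#include "TH1F.h"

// Hypothetical helper: archive whatever histograms a provider hands us.
void archiveRun(int runNumber, const std::vector<TH1*>& histograms) {
    const std::string fileName = "run" + std::to_string(runNumber) + ".root";
    TFile out(fileName.c_str(), "RECREATE");   // one archive file per run
    for (TH1* h : histograms) {
        h->Write();                             // stored under the histogram name
    }
    out.Close();
}

int main() {
    TH1F h1("pixel_occupancy", "Pixel occupancy", 100, 0., 1.);
    TH1F h2("lar_energy", "LAr cell energy", 100, 0., 500.);
    h1.Fill(0.3);
    h2.Fill(120.);
    archiveRun(12345, {&h1, &h2});              // illustrative run number
    return 0;
}
```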

11 Operational Monitoring Display
- Each process in the system publishes its status and running statistics into the Information Service => O(1)M objects.
- Reads the IS information according to the user configuration and displays it as time-series graphs and bar charts.
- Analyses the distributions against thresholds, then groups and highlights the information for the shifter.
- Mostly used for the HLT farm status: CPU, memory, event distribution.
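A minimal sketch of the "analyse against thresholds, group and highlight" idea follows. The data source, node names and threshold are invented for the example; the real display reads these values from the Information Service.

```cpp
// Sketch: group per-node CPU usage by sub-farm and highlight nodes above a
// threshold, in the spirit of the OMD checks described above.
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct NodeStat {
    std::string subFarm;   // e.g. "EF-1" (illustrative)
    std::string node;      // e.g. "pc-ef-001" (illustrative)
    double cpuPercent;     // published running statistic
};

int main() {
    const double threshold = 90.0;                       // illustrative limit
    const std::vector<NodeStat> stats = {
        {"EF-1", "pc-ef-001", 42.0},
        {"EF-1", "pc-ef-002", 95.5},
        {"L2-3", "pc-l2-017", 60.2},
    };

    // Group by sub-farm, keeping only the nodes that exceed the threshold.
    std::map<std::string, std::vector<std::string>> alarms;
    for (const auto& s : stats)
        if (s.cpuPercent > threshold)
            alarms[s.subFarm].push_back(s.node);

    for (const auto& [subFarm, nodes] : alarms) {
        std::cout << subFarm << ": " << nodes.size()
                  << " node(s) above " << threshold << "% CPU\n";
        for (const auto& n : nodes) std::cout << "  " << n << '\n';
    }
}
```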

12 Event Displays
- Atlantis: Java-based 2D event display.
- VP1: 3D event display running in the offline reconstruction framework.
- Both Atlantis and VP1 have been used during the commissioning runs and the LHC start-up.
- Both can also be used in remote monitoring mode, browsing recent events via an HTTP server.

13 Remote access to the monitoring information
- Public: monitoring via the Web Interface; information is updated periodically; no access restrictions.
- Expert and shifter: monitoring via the mirror partition; quasi real-time information access; restricted access.

14 Web Monitoring Interface
- Generic framework which runs at Point 1 and publishes information periodically to the Web.
- The published information is provided by plug-ins, currently two:
  - Run Status: shows the status and basic parameters of all active partitions at P1.
  - Data Quality: shows the same information as the DQM Display (histograms, results, references, etc.) with an update interval of a few minutes.
- Data path: the monitoring tools at Point 1 (ATCN) feed the Web Service through the Information Service; remote users follow the results with a web browser on the CERN GPN.
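To illustrate the plug-in idea, the sketch below shows what a minimal WMI-style plug-in could look like: the framework calls each plug-in periodically and publishes the returned page. The WmiPlugin interface, the method names and the update loop are hypothetical assumptions for this example, not the actual WMI API.

```cpp
// Sketch of a hypothetical WMI-style plug-in mechanism: the framework
// periodically asks each registered plug-in for a page and publishes it.
#include <chrono>
#include <iostream>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical plug-in interface (not the real WMI API).
class WmiPlugin {
public:
    virtual ~WmiPlugin() = default;
    virtual std::string name() const = 0;
    virtual std::string render() const = 0;   // produce the page content
};

// Example plug-in in the spirit of the "Run Status" page.
class RunStatusPlugin : public WmiPlugin {
public:
    std::string name() const override { return "RunStatus"; }
    std::string render() const override {
        // A real plug-in would query the Information Service here.
        return "<h1>Run Status</h1><p>partition: ATLAS, state: RUNNING</p>";
    }
};

int main() {
    std::vector<std::unique_ptr<WmiPlugin>> plugins;
    plugins.push_back(std::make_unique<RunStatusPlugin>());

    for (int cycle = 0; cycle < 3; ++cycle) {            // periodic update loop
        for (const auto& p : plugins)
            std::cout << "[publish " << p->name() << "] " << p->render() << '\n';
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}
```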

15 Remote Monitoring via the mirror partition
- Remote users can open a remote session on one of the dedicated machines located on the CERN GPN:
  - the environment looks exactly like at P1;
  - all monitoring tool displays are available and work exactly as at P1;
  - the production setup supports up to 24 concurrent remote users.
- Almost all information from the Information Service at Point 1 (ATCN) is replicated to the mirror partition, where the visualization tools can access it with an O(1) ms delay.

16 Performance achieved
The online monitoring infrastructure is in place and functioning reliably:
- More than 150 event monitoring tasks are started per run.
- It handles more than 4 million histogram updates per minute.
- Almost 100 thousand histograms are saved at the end of a run (~200 MB).
- Data quality statuses are calculated online (about 10 thousand histograms checked per minute) and stored in the database.
- Several Atlantis event displays are always running in the ATLAS Control Room and the Satellite Control Rooms, showing events for several data streams.
- Monitoring data is replicated in real time to the mirror partition running outside P1 (with a few ms delay).
- The remote monitoring pilot system has been deployed successfully.

17 Conclusions
- The tests performed on the system indicate that the online monitoring framework architecture meets the ATLAS requirements.
- The monitoring tools have been successfully used during data taking in detector commissioning runs and during the LHC start-up.
- Further details on the DQM Display, the Online Histogram Presenter and the Gatherer are given on dedicated posters.


19 Framework components
What users have to provide for each component: C = configuration files, P = plug-ins (C++ code), JO = Job Option files.
- Event samples are taken from the Data Flow (ROD/LVL1/HLT) through EMON (Event Monitoring).
- Event analysis: GNAM (P), MonaIsa (P) and the Event Filter PT (Processing Task, JO) produce histograms published to OH (Online Histogramming) in the IS (Information Service); the Gatherer merges them.
- Presentation and checking: OHP (Online Histogram Presenter, C), TriP (Trigger Presenter), OMD (Operational Monitoring Display, C), DQMF (Data Quality Monitoring Framework, C), MDA (Monitoring Data Archiving, C), Event Display (ATLANTIS, VP1) and WMI (Web Monitoring Interface).