1
The ALICE data quality monitoring system
Barthélémy von Haller, CERN PH/AID
For the ALICE Collaboration
2
Data Quality Monitoring
- Online feedback on the quality of data
- Make sure to take and record high-quality data
- Identify and solve problems early
Data Quality Monitoring (DQM) involves
- Online gathering of data
- Analysis by user-defined algorithms
- Storage of monitoring output
- Visualization
Oct. 19, 2010 – CHEP 2010, Barthélémy von Haller, CERN PH/AID
3
Data-Acquisition architecture
[Figure: data-acquisition architecture diagram; surviving labels: Sub-event, DA, DQM]
4
The AMORE framework
AMORE: Automatic MOnitoRing Environment
A DQM framework for the ALICE experiment
5
Design & Architecture
- Publisher–Subscriber paradigm
- Notification with DIM (Distributed Information Management System)
- Published objects are encapsulated in a « MonitorObject » structure
- Plugin architecture using ROOT reflection
  – Modules are dynamic libraries loaded at runtime
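The publisher-subscriber idea on this slide can be illustrated with a minimal in-process sketch. It is not AMORE code: the `Broker` class is a hypothetical stand-in for the DIM notification layer, and the `MonitorObject` fields (`name`, `payload`, `timestamp`) are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
import time


@dataclass
class MonitorObject:
    """Hypothetical wrapper: the published payload plus some metadata."""
    name: str
    payload: object
    timestamp: float = field(default_factory=time.time)


class Broker:
    """Toy in-process stand-in for the notification layer (an assumption)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # object name -> callbacks

    def subscribe(self, name, callback):
        self._subscribers[name].append(callback)

    def publish(self, agent, obj):
        # Notify every subscriber registered for this object name.
        for callback in self._subscribers[obj.name]:
            callback(agent, obj)


received = []
broker = Broker()
broker.subscribe("occupancy", lambda agent, obj: received.append((agent, obj.payload)))

# An agent publishes two objects; only the subscribed one reaches the client.
broker.publish("MCHAgent", MonitorObject("occupancy", [0.1, 0.2, 0.9]))
broker.publish("MCHAgent", MonitorObject("unwatched", [1, 2, 3]))
```

Publishers and subscribers only share object names, so agents and clients can be started and stopped independently of each other.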
6
The Pool
Current implementation based on a MySQL database:
– Lightweight
– Freely distributable
– Reliable
– Performant
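A database-backed pool can be sketched as a single table keyed by agent and object name. This is only an illustration: `sqlite3` from the Python standard library stands in for MySQL so the example runs anywhere, and the table layout, the column names and the "keep only the latest version" policy are assumptions, not the actual AMORE schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pool ("
    " agent TEXT, name TEXT, updated REAL, data BLOB,"
    " PRIMARY KEY (agent, name))"
)


def publish(agent, name, updated, data):
    """Store a serialized object, replacing any previous version (assumed policy)."""
    conn.execute(
        "INSERT OR REPLACE INTO pool (agent, name, updated, data) VALUES (?, ?, ?, ?)",
        (agent, name, updated, data),
    )
    conn.commit()


def fetch(agent, name):
    """Return the latest serialized object, or None if nothing was published."""
    row = conn.execute(
        "SELECT data FROM pool WHERE agent = ? AND name = ?", (agent, name)
    ).fetchone()
    return row[0] if row else None


publish("TPCAgent", "clusters", 1.0, b"old")
publish("TPCAgent", "clusters", 2.0, b"new")
```

Here `fetch("TPCAgent", "clusters")` returns the second payload, since the later publication replaces the earlier one.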
7
Subscriber & User Interface
Generic GUI
– Displays any object of any running agent
– Can handle the layout automatically
– Layouts can be saved for future reuse
– Covers the users' basic need to check what the agents publish
For more complex needs, users can develop their own GUI
8
[Figure: screenshot with labelled elements: Agents, Monitor Object, Sub-directories]
9
MCH (muon chambers) example: occupancy too high -> electronics problem
10
Packaging and validation
- Subversion repositories
- Software distributed as RPMs
- Strict release procedure
  – Build and validate the modules on a test machine in a clean and controlled environment
- Nightly build
  – Identify broken code (wrong results, unable to compile)
- Coverity static build analysis to ensure code quality
11
First year experience
- November 2009: LHC restarts
- AMORE used intensively in a real-world production environment
- Up to
  – 35 agents running
  – 3400 objects published per second on average
  – 115 MB published per second on average
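The figures on this slide allow a rough back-of-the-envelope estimate of the average object size and the per-agent load. The sketch below assumes 1 MB = 10^6 bytes and that the three "up to" values were reached at the same time, which the slide does not state, so the results are orders of magnitude only.

```python
agents = 35                  # agents running (from the slide)
objects_per_second = 3400    # objects published per second, on average
megabytes_per_second = 115   # MB published per second, on average

# Average size of one published object, in kB (assuming 1 MB = 1000 kB).
kb_per_object = megabytes_per_second * 1000 / objects_per_second

# Average load per agent, assuming the load is spread evenly.
objects_per_agent = objects_per_second / agents
mb_per_agent = megabytes_per_second / agents

print(f"{kb_per_object:.1f} kB per object")
print(f"{objects_per_agent:.0f} objects/s and {mb_per_agent:.1f} MB/s per agent")
```

Under these assumptions an average object weighs roughly 34 kB, and each agent publishes on the order of a hundred objects per second.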
12
First year experience
13
First year experience
TPC: event display in the detector
14
Performance
Improvements implemented
– Execution of the agents
  - Software profiling
  - Hardware (16-core blades, E5530, 2.4 GHz)
  - Parallelization
  - Architecture (32 vs 64 bits), compiler (gcc vs icc)
– Database access
  - Server tuning
  - Query optimisation and grouping
15
Performance: Multithreading
2 threads:
– 1 thread for the analysis (monitor as many events as possible)
– 1 thread for the image production (heavy process)
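The two-thread split can be sketched as follows. This is an illustrative structure, not the AMORE implementation: the function names, the toy histogram and the text `render` function are assumptions. The analysis thread never waits for the slow image thread during the run; it offers a snapshot only when the previous one has been taken. In CPython the sketch shows the structure rather than a real speed-up.

```python
import queue
import threading


def render(histogram):
    """Stand-in for the heavy image production (here just a text summary)."""
    return " ".join(f"{k}:{v}" for k, v in sorted(histogram.items()))


def run(events):
    snapshots = queue.Queue(maxsize=1)  # at most one snapshot waiting to be drawn
    images = []
    seen = [0]

    def analysis():
        histogram = {}
        for event in events:
            histogram[event] = histogram.get(event, 0) + 1
            seen[0] += 1
            try:
                # Offer a snapshot, but never block the analysis on the renderer.
                snapshots.put_nowait(dict(histogram))
            except queue.Full:
                pass
        snapshots.put(dict(histogram))  # final state: blocking is fine here
        snapshots.put(None)             # tell the image thread to stop

    def image_production():
        while True:
            snapshot = snapshots.get()
            if snapshot is None:
                break
            images.append(render(snapshot))

    threads = [threading.Thread(target=analysis),
               threading.Thread(target=image_production)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen[0], images


events_seen, images = run([0, 1, 1, 2, 2, 2] * 100)
```

Every event is analysed, while the number of images produced depends on how fast the image thread keeps up; the last image always reflects the final histogram.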
16
Performance: 32 -> 64 bits
A performance gain is not always the result of long and detailed profiling. Here, a change of architecture is enough to run 20-30% faster.
[Figure label: -95%]
17
Plans
- Fully automate the process: comparison to reference data, identification of problems, notification, actions taken
- Add features to take full advantage of multi-core architectures
  – Multiple threads
  – Multiple processes
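An automatic comparison to reference data could look like the sketch below. The slides do not say which comparison method is planned; the largest per-bin difference between normalised histograms, the `tolerance` value and the function names are all assumptions chosen for illustration.

```python
def normalise(hist):
    """Scale bin contents so that they sum to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    return [b / total for b in hist]


def check_against_reference(hist, reference, tolerance=0.05):
    """Flag a histogram whose shape deviates from the reference (assumed criterion)."""
    if len(hist) != len(reference):
        raise ValueError("histograms must have the same binning")
    h, r = normalise(hist), normalise(reference)
    # Bin with the largest absolute difference in normalised content.
    worst = max(range(len(h)), key=lambda i: abs(h[i] - r[i]))
    deviation = abs(h[worst] - r[worst])
    return {"ok": deviation <= tolerance, "worst_bin": worst, "deviation": deviation}


reference = [100, 100, 100, 100]
good = check_against_reference([98, 102, 101, 99], reference)
bad = check_against_reference([100, 100, 400, 100], reference)  # one hot bin
```

The result names the offending bin, which is the kind of information a notification to the shifter would need.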
18
Conclusion
- AMORE has been used successfully since the LHC restart and has proved very useful
- Wide range of usages
- Capable of handling a very large number of agents, clients and objects
- The architecture is adequate
- We are ready for the heavy-ion runs
19
Questions
20
Example: a detector (PMD) monitoring its coverage. Plots must be simple for a non-expert shifter to understand.
[Figure: two example plots labelled GOOD and BAD]
21
The ALICE experiment
- CERN: European Organization for Nuclear Research
- LHC: Large Hadron Collider
- ALICE: A Large Ion Collider Experiment
  – 18 detectors
  – Trigger rate: 10 kHz (max)
  – Bandwidth to mass storage: 1.25 GB/s
22
Design & Architecture
23
Objects Producers
[Figure: data-flow diagram; surviving labels: GDC, LDC, File, Agent, Data samples, Data Pool, Monitor Objects, Clients, AliRoot QA, Histograms, Event, Detector Algorithms, High Level Trigger, Objects, HOMER, Prompt Reconstruction, Plots, ESDs, Images]
24
Database improvement: grouping of queries
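The slide does not detail how queries were grouped. One common form, shown here as an assumption, is to send many object updates in a single batched statement and transaction instead of one round trip and commit per object. `sqlite3` again stands in for MySQL, and the table and function names are hypothetical.

```python
import sqlite3

INSERT = "INSERT OR REPLACE INTO pool (name, data) VALUES (?, ?)"


def make_pool():
    """Create an in-memory table standing in for the pool database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pool (name TEXT PRIMARY KEY, data BLOB)")
    return conn


def store_one_by_one(conn, objects):
    # One statement and one commit per object.
    for name, data in objects:
        conn.execute(INSERT, (name, data))
        conn.commit()


def store_grouped(conn, objects):
    # All objects in one batch, committed once when the block exits.
    with conn:
        conn.executemany(INSERT, objects)


objects = [(f"hist{i}", bytes([i])) for i in range(50)]
separate, grouped = make_pool(), make_pool()
store_one_by_one(separate, objects)
store_grouped(grouped, objects)

query = "SELECT name, data FROM pool ORDER BY name"
rows_separate = separate.execute(query).fetchall()
rows_grouped = grouped.execute(query).fetchall()
```

Both paths leave the same rows in the table; the grouped version simply needs fewer statements and commits, which is where a client-server database saves time.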