Barthélémy von Haller, CERN PH/AID, for the ALICE Collaboration: The ALICE data quality monitoring system.


1 The ALICE data quality monitoring system
Barthélémy von Haller, CERN PH/AID, for the ALICE Collaboration

2 Data Quality Monitoring
Online feedback on the quality of data
– Make sure to take and record high-quality data
– Identify and solve problems early
Data Quality Monitoring (DQM) involves
– Online gathering of data
– Analysis by user-defined algorithms
– Storage of monitoring output
– Visualization
Oct. 19, 2010 – CHEP 2010, Barthélémy von Haller - CERN PH/AID

3 Data-Acquisition architecture
[Figure: data-acquisition architecture diagram; recoverable labels: Sub-event, DA, DQM]

4 The AMORE framework
AMORE: Automatic MOnitoRing Environment
A DQM framework for the ALICE experiment

5 Design & Architecture
Publisher-subscriber paradigm
Notification with DIM (Distributed Information Management System)
Published objects are encapsulated into a "MonitorObject" structure
Plugin architecture using ROOT reflection
– Modules are dynamic libraries loaded at runtime
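
The publisher-subscriber flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AMORE's API: AMORE is built on ROOT, and the DIM notification is replaced here by plain in-process callbacks. The names `MonitorObject` (as a Python class), `Pool`, `publish` and `subscribe` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class MonitorObject:
    """Hypothetical wrapper around a published object (e.g. a histogram)."""
    agent: str    # name of the publishing agent
    name: str     # object name, unique within the agent
    payload: Any  # the monitored data itself


@dataclass
class Pool:
    """In-memory stand-in for the pool; callbacks stand in for DIM."""
    objects: Dict[str, MonitorObject] = field(default_factory=dict)
    subscribers: Dict[str, List[Callable[[MonitorObject], None]]] = field(
        default_factory=dict
    )

    def subscribe(self, key: str, callback: Callable[[MonitorObject], None]) -> None:
        # Register a client callback for one "agent/name" key.
        self.subscribers.setdefault(key, []).append(callback)

    def publish(self, obj: MonitorObject) -> None:
        key = f"{obj.agent}/{obj.name}"
        self.objects[key] = obj  # keep the latest version of the object
        for callback in self.subscribers.get(key, []):
            callback(obj)        # notify every subscribed client


pool = Pool()
received = []
pool.subscribe("MCH/occupancy", received.append)
pool.publish(MonitorObject("MCH", "occupancy", [0.01, 0.02, 0.90]))
```

The publisher never needs to know which clients exist: it only writes to the pool, and the pool fans the notification out.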

6 The Pool
Current implementation based on a MySQL database:
– Lightweight
– Freely distributable
– Reliable
– Performant
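
A database-backed pool could look roughly like the sketch below. The slide only says the pool is implemented on MySQL; everything else here is an assumption made for illustration: SQLite (from the Python standard library) stands in for the MySQL server so the example runs anywhere, and the table layout, the "keep only the latest version" policy, and the `store` and `fetch` helpers are hypothetical.

```python
import sqlite3

# SQLite in memory stands in for the MySQL server (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monitor_objects ("
    " agent TEXT NOT NULL,"
    " name TEXT NOT NULL,"
    " updated_at REAL NOT NULL,"
    " data BLOB NOT NULL,"
    " PRIMARY KEY (agent, name))"
)


def store(agent, name, timestamp, data):
    """Write one serialized object, replacing any previous version."""
    conn.execute(
        "INSERT OR REPLACE INTO monitor_objects VALUES (?, ?, ?, ?)",
        (agent, name, timestamp, data),
    )
    conn.commit()


def fetch(agent, name):
    """Return the stored bytes for one object, or None if it is unknown."""
    row = conn.execute(
        "SELECT data FROM monitor_objects WHERE agent = ? AND name = ?",
        (agent, name),
    ).fetchone()
    return None if row is None else row[0]
```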

7 Subscriber & User Interface
Generic GUI
– Displays any object of any running agent
– Can handle the layout automatically
– Layouts can be saved for future reuse
– Fits the basic needs of users who want to check what the agents publish
For more complex needs, users can develop their own GUI

8 [Figure; recoverable labels: Agents, Monitor Object, Sub-directories]

9 MCH (muon chambers) example: too high an occupancy -> electronics problem

10 Packaging and validation
Subversion repositories
Software distributed as RPMs
Strict release procedure
– Build and validate the modules on a test machine in a clean and controlled environment
Nightly build
– Identify broken code (wrong results, unable to compile)
Coverity static build analysis to ensure code quality

11 First year experience
November 2009: LHC restarts
AMORE intensively used in a real-world production environment
Up to
– 35 agents running
– 3400 objects published per second on average
– 115 MB published per second on average
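
As a rough back-of-the-envelope reading of these figures, the short calculation below derives the implied average object size and per-agent load. It assumes that the three "up to" numbers describe the same operating conditions and that 1 MB means 10^6 bytes; neither is stated on the slide, so the results are indicative only. The function names are hypothetical.

```python
AGENTS = 35
OBJECTS_PER_SECOND = 3400
MEGABYTES_PER_SECOND = 115


def average_object_size_kb():
    """Average published object size in kB (1 MB taken as 1000 kB)."""
    return MEGABYTES_PER_SECOND * 1000 / OBJECTS_PER_SECOND


def objects_per_agent_per_second():
    """Average publication rate of a single agent."""
    return OBJECTS_PER_SECOND / AGENTS


print(f"{average_object_size_kb():.1f} kB per object")
print(f"{objects_per_agent_per_second():.1f} objects/s per agent")
```

Under those assumptions an average object is about 34 kB and each agent publishes on the order of a hundred objects per second.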

12 First year experience

13 First year experience
TPC: event display in the detector

14 Performance
Improvements implemented
– Execution of the agents
  - Software profiling
  - Hardware (blades with 16 cores, E5530, 2.4 GHz)
  - Parallelization
  - Architecture (32 vs 64 bits), compiler (gcc vs icc)
– Database access
  - Server tuning
  - Query optimisation and grouping

15 Performance: Multithreading
2 threads:
– 1 thread for the analysis (monitor as many events as possible)
– 1 thread for the image production (heavy process)
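
The two-thread split can be sketched as follows. This is an illustration of the structure only, not the AMORE implementation: the function names, the snapshot interval, and the policy of skipping an image when the renderer is still busy are assumptions. In CPython the global interpreter lock also limits CPU parallelism between threads, so the sketch shows the decoupling rather than a speed-up.

```python
import queue
import threading
from collections import Counter


def render(snapshot):
    """Placeholder for the heavy image production step."""
    return ", ".join(f"{key}={value}" for key, value in sorted(snapshot.items()))


def run_agent(events, snapshot_every=100):
    counts = Counter()
    pending = queue.Queue(maxsize=1)  # at most one snapshot waiting to be drawn
    images = []

    def analysis():
        for i, event in enumerate(events, start=1):
            counts[event] += 1  # the "analysis": count event types
            if i % snapshot_every == 0:
                try:
                    pending.put_nowait(dict(counts))  # hand over a copy
                except queue.Full:
                    pass  # renderer busy: skip this image, keep analysing
        pending.put(None)  # sentinel: no more snapshots will come

    def image_production():
        while (snapshot := pending.get()) is not None:
            images.append(render(snapshot))

    threads = [
        threading.Thread(target=analysis),
        threading.Thread(target=image_production),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts, images
```

Because the analysis thread never blocks on a full queue during the event loop, slow image production costs some images but no monitored events.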

16 Performance: 32 -> 64 bits
A performance gain is not always the result of long and detailed profiling. Here, a change of architecture is enough to run 20-30% faster.
[Figure annotation: -95%]

17 Plans
Fully automate the process: comparison to reference data, identification of problems, notification, actions taken
Add features to take full advantage of multi-core architectures
– Multiple threads
– Multiple processes
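
An automatic comparison to reference data might, in its simplest form, look like the sketch below. The slide does not say which comparison method is planned; the shape comparison on normalised histograms, the `max_deviation` threshold, and the function names are all assumptions chosen to keep the example short.

```python
def normalise(histogram):
    """Scale bin contents so that they sum to 1."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    return [content / total for content in histogram]


def check_against_reference(observed, reference, max_deviation=0.05):
    """Return ("GOOD" or "BAD", largest per-bin deviation in shape)."""
    if len(observed) != len(reference):
        raise ValueError("histograms must have the same binning")
    obs, ref = normalise(observed), normalise(reference)
    deviation = max(abs(o - r) for o, r in zip(obs, ref))
    status = "GOOD" if deviation <= max_deviation else "BAD"
    return status, deviation


reference = [100, 100, 100, 100]
print(check_against_reference([98, 102, 101, 99], reference))
print(check_against_reference([10, 10, 10, 370], reference))
```

A "BAD" result would be the trigger for the notification and follow-up actions listed in the plan.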

18 Conclusion
AMORE has been used successfully since the LHC restart and has proved to be very useful
Wide range of usages
Capable of handling a very large number of agents, clients and objects
-> The architecture is adequate
-> We are ready for the Heavy Ion runs

19 Questions

20 Example of a detector (PMD) monitoring its coverage. Plots must be simple for a non-expert shifter to understand.
[Figure: two example plots labelled GOOD and BAD]

21 The ALICE experiment
CERN: European Organization for Nuclear Research
LHC: Large Hadron Collider
ALICE: A Large Ion Collider Experiment
– 18 detectors
– Trigger rate: 10 kHz (max)
– Bandwidth to mass storage: 1.25 GB/s

22 Design & Architecture

23 Objects Producers
[Figure: data-flow diagram of the object producers; recoverable labels: LDC, GDC, File, Data samples, Agent, Detector Algorithms, AliRoot QA, Histograms, Event, Monitor Objects, Data Pool, Clients, High Level Trigger, HOMER, Prompt Reconstruction, Plots, ESDs, Images]

24 Database improvement: grouping of queries
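
The slide gives no detail on how the queries were grouped. One common way to group writes is to send many rows in a single statement batch inside one transaction instead of committing object by object; the sketch below shows that idea under stated assumptions. SQLite stands in for MySQL, and the table layout and function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical object table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE monitor_objects (agent TEXT, name TEXT, data BLOB)")
    return conn


def store_one_by_one(conn, rows):
    """Ungrouped: one statement and one commit per object."""
    for row in rows:
        conn.execute("INSERT INTO monitor_objects VALUES (?, ?, ?)", row)
        conn.commit()


def store_grouped(conn, rows):
    """Grouped: all rows in one batch, committed as a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO monitor_objects VALUES (?, ?, ?)", rows)


rows = [("TPC", f"hist_{i}", b"payload") for i in range(1000)]
```

Both functions leave the same data in the table; the grouped version simply pays the per-transaction cost once instead of once per object.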

