Validating CMS event display tools with real data
Giuseppe Zito, INFN Bari, Italy
Nikita Beliy, University of Mons-Hainaut, Belgium
CHEP'09, 17th International Conference on Computing in High Energy and Nuclear Physics, March 2009, Prague, Czech Republic

Prelude

Prelude(2) At the end of the last cosmic run I found a few events presenting this strange pattern of noise in the endcaps of the CMS tracker: clusters aligned along the beam line. Sometimes, like in this case, they were reconstructed as tracks. The clusters didn't have any special feature that would enable their labeling as noise. To understand what they were I had to look, literally, at all 300M events taken in this run. To do this I had to develop (with the help of Nikita Beliy) a system that would enable me to answer questions like these: 1) Do all runs present this problem? 2) Is the number of these patterns the same along the whole run? 3) Is the pattern present in all modules?

The Challenge 300M cosmic events were collected during CRAFT (the last cosmic run with magnetic field and all detectors on). The events were taken during two weeks in around 150 runs; a run contains up to 20M events stored in hundreds of files, each file holding around 20,000 events. Look at them to find problems in the tracker! How do you know at which runs to look, and, among the millions of events of a run, at which events to look? Once you have a list of interesting events, how do you access them quickly? The goal: having collected a run of 10M events during the night, select interactively all events satisfying some arbitrary request ("give me all events with tracks in the pixel detector"), start looking at them after a few minutes, and, provided the selection contains a reasonable number of events, actually look AT ALL EVENTS selected in the run, all without using skims.

Taming the wild beast From preliminary tests it was clear that the main problem was access to events through CMSSW (the framework containing all the software to analyze CMS events): an event display embedded in CMSSW is 10 times slower than the same event display running outside CMSSW. The only alternative to CMSSW (but only for limited tasks) is to access the data using "bare" Root with a Python/CINT interface developed for CMS, called FWLite: creating a simple plot for a quantity present in a dataset with bare ROOT + FWLite is twice as fast and requires only 1/5 of the core storage. Note that FWLite is fully embedded in CMSSW and is "lite" only if the developers keep it lite, by limiting its possibilities (in comparison to full CMSSW) to keep it fast.
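As a rough illustration of the bare ROOT + FWLite approach, here is a minimal sketch of an interactive plot macro; it assumes the FWLite libraries are already loaded in the ROOT session, and the file name and track collection label are illustrative placeholders, not the ones used in this test.

    // fwlite_plot.C - minimal sketch of a "bare ROOT + FWLite" plot:
    // number of reconstructed tracks per event.
    #include <vector>
    #include "TFile.h"
    #include "TH1F.h"
    #include "DataFormats/FWLite/interface/Event.h"
    #include "DataFormats/FWLite/interface/Handle.h"
    #include "DataFormats/TrackReco/interface/Track.h"

    void fwlite_plot() {
      TFile file("craft_run.root");               // hypothetical RECO file
      fwlite::Event ev(&file);
      TH1F h("ntracks", "tracks per event", 20, -0.5, 19.5);
      for (ev.toBegin(); !ev.atEnd(); ++ev) {
        fwlite::Handle<std::vector<reco::Track> > tracks;
        tracks.getByLabel(ev, "generalTracks");   // label depends on the reco
        h.Fill(tracks->size());
      }
      h.Draw();
    }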

How the 4 event display tools used in this test cope with CMSSW slowness In this test we used the three general purpose event display tools available to CMS users, Iguana, Fireworks and Frog, plus a specialized tool for tracker visualization, the trackermap:
1) Iguana has two versions: fully embedded in CMSSW, and "lite", outside CMSSW (*).
2) Fireworks uses FWLite to optimize event access.
3) Frog works completely outside CMSSW, like Iguana lite (*).
4) The trackermap (a synoptic view of the tracker) is implemented both inside Iguana (fully embedded) and outside it, as a class created, filled and printed without using other CMSSW services, like a Root histogram.
For this test I used mostly Frog, because it was the only general purpose event display (at that time) fast enough and with 3D; I also used the trackermap outside Iguana.
(*) In order to work outside CMSSW, Frog and Iguana lite run in parallel with a normal CMSSW task that provides the events in a special format.

General purpose event display tools performance Time to scan 551 events on the same computer, with local access to the events on disk; for each event we look at a minimum of 3 windows, including a 3D window.

    Iguana embedded    8 min 30 sec
    Iguana lite (*)    30 sec
    Fireworks (**)     2 min
    Frog               30 sec

(*) Iguana lite was tested only as a prototype: not yet available in the CMS software.
(**) Fireworks added 3D capability after I started this test.

Iguana

Fireworks

Frog

Trackermap

First exploratory tour Look at all 20,000 events in a file (around 3 hours using Frog). Results:
1. Very few events with tracks (around 1 every 50): 400 in total.
2. Among them, 10 obviously noisy events. First type, random electronic noise: already under study. Second type, aligned clusters in tracker endcaps: not yet studied.
3. Half of these noise events also contained a normal cosmic track in the tracker.
As a consequence, a tool was implemented to automate the operation of looking at the events of an arbitrary run with number of tracks > 0. Using this tool on random runs I found that events of types 1 and 2 seem to be present everywhere.

Monster events produced by random electronic noise

Managing root files The frogFilter.sh script (shown annotated on the slide): given a run (and dataset), it retrieves the list of root files (using get_files.py), performs its initialization, then processes the files, applying a Root script to each one.

A strategy to look at all 300M events, based on two "devices" 1) Visual summary of runs. Process all the events of a run, creating a synoptic view that shows what is going on during the run (a hit occupancy trackermap; see the sketch below); do it for all runs and create a visual summary of runs. This should answer the question: which run to look at? 2) Event classification. At the same time, create a kind of event database, classifying events by a few quantities relevant for tracker monitoring; use this classification to look quickly and interactively at arbitrary selections of all the events in a run. This should answer the question: which events in a run to look at?
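The bookkeeping behind device 1 boils down to one counter per tracker module, accumulated over all the events of a run and then rendered as a trackermap. A minimal sketch of that bookkeeping only; the DetId values are made up, and the actual trackermap class (not reproduced here) does the rendering, with a palette common to all runs.

    // occupancy_sketch.cc - accumulate per-module hit counts over a run,
    // then dump them (a stand-in for filling the trackermap).
    #include <cstdio>
    #include <map>

    int main() {
      std::map<unsigned int, unsigned long> occupancy;  // DetId -> hit count

      // in the real task this is done for every recHit of every event:
      unsigned int detIdsSeen[] = {402664733, 402664733, 470148232};
      for (int i = 0; i < 3; ++i) occupancy[detIdsSeen[i]] += 1;

      // end of run: hand each (module, count) pair to the rendering step
      std::map<unsigned int, unsigned long>::const_iterator it;
      for (it = occupancy.begin(); it != occupancy.end(); ++it)
        std::printf("module %u : %lu hits\n", it->first, it->second);
      return 0;
    }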

A strategy to look at all 300M events: the prototype I tested this strategy by building the visual summary and the event classification myself. This required around two months of work running slow CMSSW tasks. It took so much time also because I didn't have enough resources and knowledge to build it quickly; if you have enough resources and can fully exploit parallelism, it shouldn't take more than a week. The resulting images helped to find and solve more problems in the tracker. The event classification results were loaded into an event display web server as a root file, "ntuple.root", one for each run; each event was described by an ntuple in a Root tree.

Building the prototype: the ordeal of running CMSSW tasks for normal users 75 runs were processed (all the good runs with more than 200,000 events) with 154 batch jobs on the LSF service at CERN. Queues used: 48 1nw, 10 2nd, 60 1nd, 27 8nh, 8 1nh. 55 jobs exited with errors! CPU time: up to 82,000 sec. Max swap: from 1000 to 1870 MB. Max memory: from 700 to 1500 MB. Not having enough AFS space (1 GB), I couldn't send more than a few jobs in parallel. Solar time to get the results for a run (taking into account exited jobs): from 1 hour to two weeks! CMSSW uses an awful lot of resources even for very simple tasks, like building a simple plot on a complete run: it is like using a jet to go home a few blocks away when you need a bicycle.

Visual summary of CRAFT Each run is represented as a single image; the miniature points to a larger image. An SVG version is also kept, with the possibility to interact. Images of different runs can be visually compared. The first image is a legend; the next 4 are trackermaps obtained by processing MC events.

Single run trackermap – legend The coloring is done using the same palette for all runs (no automatic scaling).

Visual summary of CRAFT used to find tracker problems: can you spot them? Integrated signal from all 345,000 cosmic tracks in run 67647

Trackermaps in visual summary can be processed like histograms Hit occupancy trackermap integrated over most of the runs of CRAFT phase 2.

The results Comparison of the resources used to answer the query "events having aligned strings of clusters in endcaps" on a single run of about 6M events, 197 files; 209 events were selected in the whole run. We use three methods: 1) Full CMSSW: build a skim of the events presenting this pattern and then look at them; because of the amount of resources needed, this can only be run in batch using the LSF service. 2) Root + FWLite, without the classification ntuples; this can be run interactively on lxplus. 3) Create the selection using the classification ntuples, then use this list of events to look directly at the events (or produce a skim); the second step requires a CMSSW task, but uses very little CPU time and so can be run interactively on lxplus.

The results(2)

                             1 - Skim with        2 - Selection       3 - Use of
                             full CMSSW           with FWLite         selection ntuples
    Runs                     in batch on LSF      interactively       interactively
                                                  on lxplus           on lxplus
    CPU time                 14,881 sec           3,546 sec           860 sec
    Max memory/swap          1286/1599 MB         250/360 MB          1200/1400 MB
    Solar time               10 hours             4 hours             1 hour
    Can look at first event  after minutes or     after 10 min        after 1 min
                             hours (depends on
                             the queue wait)
    Time to look at all      10 hours, plus the   4 hours             1 hour
    200 events               time the job waits
                             in the queue

Why are 2 and 3 so fast? They use 3 jobs that run on three different computers in parallel. The 3 jobs are optimized to use fewer resources and to run interactively, and each one is specialized for a task. 1. The first job is specialized for selecting events. This can be done using the events directly (method 2) or using the event classification (method 3); in the latter case, producing interactively a list of all the events in a run passing an arbitrary cut (e.g. tracks in TID and in pixel) takes less than 1 minute, whereas using FWLite it needs 4 hours. 2. The Frog Analyzer, a CMSSW task creating the Frog file for the selected events, quickly extracts from the events the information needed by the Visualizer; as soon as a new event is ready it is made available to the Visualizer. In method 2 this job works in parallel with job 1 and so takes no extra time. 3. The Frog Visualizer lets me look at the first selected events after a few minutes; in an hour I can look at all the selected events of a 6M-event run with method 3 (4 hours with method 2). This job can run on any computer with an Internet connection, and at any moment I can stop the process and try a new selection.

Proposal for implementing the strategy in offline DQM The building of the two devices is done once and for all and requires CMSSW tasks; it is proposed to include them in the offline DQM processing. The trackermap can be built by harvesting DQM results, provided that these contain the information needed for each tracker module. For the event classification we propose a simple scheme consisting of building a histogram for each file, containing one bin for each event.

Creating the classification histogram (recHits_geom.cc) The analyzer declares the histogram, calculates the bin content for each event, and fills the histogram. Output: one histogram per root file (the run and first event are encoded in the histogram's name and title), one bin per event. Bin value:
1st bit: recoTracks > 0
2nd bit: recoTracks > 1
3rd bit: clusters above a threshold
4th bit: recHits in pixel
5th bit: recHits in TIB
6th bit: recHits in TID-
7th bit: recHits in TID+
8th bit: recHits in TOB
9th bit: recHits in TEC-
10th bit: recHits in TEC+
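A minimal sketch of this per-event bit encoding, under the assumption that the counts and per-subdetector flags have already been extracted from the event; the function name, variable names and the cluster threshold are illustrative, not taken from the actual recHits_geom.cc.

    // classify_sketch.cc - sketch of the per-event bit code stored in one
    // histogram bin (names and the cluster threshold are illustrative).
    #include <cstdio>

    unsigned int encodeEvent(int nTracks, int nClusters,
                             bool hitPixel, bool hitTIB,
                             bool hitTIDm, bool hitTIDp,
                             bool hitTOB, bool hitTECm, bool hitTECp) {
      const int kClusterCut = 100;                   // threshold: assumed value
      unsigned int code = 0;
      if (nTracks > 0)             code |= 1u << 0;  // recoTracks > 0
      if (nTracks > 1)             code |= 1u << 1;  // recoTracks > 1
      if (nClusters > kClusterCut) code |= 1u << 2;  // many clusters
      if (hitPixel)                code |= 1u << 3;  // recHits in pixel
      if (hitTIB)                  code |= 1u << 4;
      if (hitTIDm)                 code |= 1u << 5;
      if (hitTIDp)                 code |= 1u << 6;
      if (hitTOB)                  code |= 1u << 7;
      if (hitTECm)                 code |= 1u << 8;
      if (hitTECp)                 code |= 1u << 9;
      return code;          // stored via hist->SetBinContent(eventBin, code)
    }

    int main() {
      // a cosmic event with one track and hits in TIB and TOB:
      std::printf("code = %u\n", encodeEvent(1, 30, false, true,
                                             false, false, true, false, false));
      return 0;
    }

Packing all the flags into one integer per bin keeps the classification of a whole 20,000-event file inside a single small histogram.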

Reading the histogram and creating the event list (scan.C) The macro gets the names of the root files, extracts the run number and first event from each histogram, loads the histogram, then checks the selection bin by bin, printing the event number of each accepted event. Input: the root files; output: the event list.
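A minimal sketch of this reading side, assuming classification histograms of the kind produced above; the histogram name is a placeholder, and the cut shown (at least one track plus pixel hits) is just an example selection.

    // scan_sketch.C - read one classification histogram and print the events
    // passing a bit-mask cut. The name "classification" is a placeholder;
    // the real macro also decodes the run and first event from the
    // histogram's name and title.
    #include <cstdio>
    #include "TFile.h"
    #include "TH1.h"

    void scan_sketch(const char* fname = "ntuple.root") {
      TFile f(fname);
      TH1* h = (TH1*)f.Get("classification");
      if (!h) { std::printf("no classification histogram in %s\n", fname); return; }
      // example cut: at least one track (1st bit) AND recHits in pixel (4th bit)
      const unsigned int mask = (1u << 0) | (1u << 3);
      for (int bin = 1; bin <= h->GetNbinsX(); ++bin) {
        unsigned int code = (unsigned int)h->GetBinContent(bin);
        if ((code & mask) == mask)
          std::printf("event in bin %d selected (code %u)\n", bin, code);
      }
    }

Because this scan never opens the event data itself, an arbitrary cut over a whole run takes less than a minute, as quoted on the "Why are 2 and 3 so fast?" slide.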

Conclusion 1 During data taking in CRAFT, event display use for monitoring was almost absent, for many reasons:
- too many events: how to select the good ones?
- slow access to the selected events
- event display too slow
I showed here that this (not using the event display) wasn't a good idea, because it can make it difficult to discover detector problems, like the ones discovered only two months after data taking. A strategy was proposed and tested, consisting of creating a system that allows the user to look immediately and quickly at interesting events. To set up this system, all events must be processed once to create: 1) a synoptic view of the subdetectors (in our case the trackermap); 2) a database of events.

Conclusion 2 The synoptic view will help the user choose the interesting runs, and the database the interesting events in a run. All three CMS general purpose event display tools should be ready to be used with this system when LHC data taking starts, although during this test I was able to use only Frog. I hope that it will enable us not only to find tracker problems quickly, but also to find new physics events. Thanks for listening!

Finale

Finale(2) Once the system was working, it was easy to solve the riddle of these aligned strings of clusters in the tracker endcaps: 1) Processing a run without magnetic field showed that the pattern was absent. 2) Processing a run with MC generated data showed that the pattern was present, although in a slightly smaller percentage of events. 3) Processing 3 complete runs, I could look at a sample of 1000 events containing this pattern. It was clear that, in almost all cases, when the primary cosmic was present in the tracker, the line along the string of clusters was connected in 3D to the cosmic track. Sometimes, like in the event reported in the other slide, you could see the complete track originating from the primary track. 4) So these strings of aligned clusters are secondary tracks generated by the interaction of the primary track with the detector material; these low momentum tracks spiral along the magnetic field, generating the pattern. 5) A plot of the cluster charge confirmed this explanation.