1
Validating CMS event display tools with real data. Giuseppe Zito, INFN Bari, Italy; Beliy Nikita, University of Mons-Hainaut, Belgium. CHEP'09, 17th International Conference on Computing in High Energy and Nuclear Physics, 21-27 March 2009, Prague, Czech Republic.
2
Prelude
3
Prelude (2)
At the end of the last cosmic run I found a few events presenting this strange pattern of noise in the endcaps of the CMS tracker: clusters aligned along the beam line. Sometimes, like in this case, they were reconstructed as tracks. The clusters did not have any special feature that would enable labeling them as noise. To understand what they are I had to look literally at all 300M events taken in this run. To do this I had to develop (with the help of Nikita Beliy) a system that would enable me to answer questions like these: 1) Do all runs present this problem? 2) Is the number of these patterns the same along the whole run? 3) Is the pattern present in all modules?
4
The Challenge
300M cosmic events were collected during CRAFT (the last cosmic run with magnetic field and all detectors on). The events were taken during two weeks in around 150 runs, each containing up to 20M events stored in hundreds of files. Each file contains around 20,000 events. Look at them to find problems in the tracker! How to know which runs to look at, and, among the millions of events of a run, which events to look at? Once you have a list of interesting events, how to access them quickly? The goal: having collected a run of 10M events during the night, select interactively all events satisfying some arbitrary request ("give me all events with tracks in the pixel detector"), start looking at them after a few minutes and, provided the selection contains a reasonable number of events, actually look AT ALL EVENTS selected in the run. All this without using skims.
5
Taming the wild beast
From preliminary tests it was clear that the main problem was access to events through CMSSW (the framework containing all the software to analyze CMS events): an event display embedded in CMSSW is 10 times slower than the same event display running outside CMSSW. The only alternative to CMSSW (but only for limited tasks) is to access the data using "bare" ROOT with a Python/CINT interface developed for CMS called FWLite. Creating a simple plot for a quantity present in a dataset with bare ROOT + FWLite is twice as fast and requires only 1/5 of the memory. Note that FWLite is fully embedded in CMSSW and is "lite" only if the developers keep it lite, by limiting its possibilities (in comparison to full CMSSW) to keep it fast.
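As an illustration of the kind of lightweight access FWLite gives, here is a minimal sketch (not the macro used for the measurements) that fills a histogram of the number of reconstructed tracks per event with bare ROOT + FWLite. The input file name "reco.root" and the collection label "generalTracks" are placeholders, and the macro assumes a CMSSW environment where the FWLite libraries and dictionaries are available.

```cpp
// Minimal FWLite sketch: histogram the number of reconstructed tracks
// per event with bare ROOT + FWLite.  File name and collection label
// below are placeholders, not the actual CRAFT dataset.
#include <vector>
#include "TFile.h"
#include "TH1F.h"
#include "DataFormats/FWLite/interface/Event.h"
#include "DataFormats/FWLite/interface/Handle.h"
#include "DataFormats/TrackReco/interface/Track.h"

void plotNTracks() {
  TFile file("reco.root");  // placeholder input file
  TH1F hNTracks("hNTracks", "Reconstructed tracks per event", 20, -0.5, 19.5);

  fwlite::Event ev(&file);
  for (ev.toBegin(); !ev.atEnd(); ++ev) {
    fwlite::Handle<std::vector<reco::Track> > tracks;
    tracks.getByLabel(ev, "generalTracks");  // placeholder collection label
    hNTracks.Fill(tracks->size());
  }

  TFile out("ntracks.root", "RECREATE");
  hNTracks.Write();
  out.Close();
}
```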
6
How the 4 event display tools used in this test cope with CMSSW slowness
In this test we used the three general purpose event display tools available to CMS users, Iguana, Fireworks and Frog, and a specialized tool for tracker visualization, the trackermap:
1) Iguana has two versions: fully embedded in CMSSW and "lite" (outside CMSSW) (*)
2) Fireworks uses FWLite to optimize event access
3) Frog works completely outside CMSSW, like Iguana lite (*)
4) The trackermap (a synoptic view of the tracker) is implemented both in Iguana (fully embedded) and outside Iguana, as a class created, filled and printed without using other CMSSW services, like a ROOT histogram.
(*) In order to work outside CMSSW, Frog and Iguana lite run in parallel with a normal CMSSW task that provides the events in a special format.
For this test I mostly used Frog, because it was the only general purpose event display (at that time) fast enough and with 3D. I also used the trackermap outside Iguana.
7
General purpose event display tools: performance
Iguana embedded: 8 min 30 sec
Iguana lite (*): 30 sec
Fireworks (**): 2 min
Frog: 30 sec
Time to scan 551 events on the same computer, with local access to the events on disk. For each event we look at a minimum of 3 windows, including a 3D window.
(*) Iguana lite was tested only as a prototype: not yet available in the CMS software.
(**) Fireworks added 3D capability after I started this test.
8
Iguana https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookEventDisplay
9
Fireworks https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookFireworks
10
Frog https://twiki.cern.ch/twiki/bin/view/CMS/FROG
11
Trackermap http://webcms.ba.infn.it/cms-software/cms-grid/index.php/CMSTrackerVisualizationSoftware/TrackerMap
12
First exploratory tour
Look at all 20,000 events in a file (around 3 hours using Frog). Results:
1. Very few events with tracks (around 1 every 50), 400 in total.
2. Among them, 10 obviously noisy events. First type, random electronic noise: already under study. Second type, aligned clusters in the tracker endcaps: not yet studied.
3. Half of these noise events also presented a normal cosmic track in the tracker.
As a consequence, a tool was implemented to automate the operation of looking at the events of an arbitrary run with number of tracks > 0. Using this tool on random runs I found that events of type 1 and 2 seem to be present everywhere.
13
Monster events produced by random electronic noise
14
Managing ROOT files: frogFilter.sh
Given a run (and dataset), the script retrieves the file list (using get_files.py), performs the initialization, and then processes the files, applying a ROOT script to each file.
15
A strategy to look at all 300M events, based on two "devices"
1) Visual summary of runs. Process all events of a run creating a synoptic view that shows what is going on during the run (hit occupancy trackermap). Do it for all runs and create a visual summary of runs. This should answer the question: which run to look at?
2) Event classification. At the same time, create a kind of event database, classifying events with a few quantities relevant for tracker monitoring. Use this classification to look quickly and interactively at arbitrary selections of all events in the run. This should answer the question: which events in a run to look at?
16
A strategy to look at all 300M events: the prototype
I tested this strategy by building the visual summary and the event classification myself. This required around two months of work running slow CMSSW tasks. It took so much time also because I did not have enough resources and knowledge to build it quickly; with enough resources, fully exploiting parallelism, it should not take more than a week. The resulting images helped to find and solve more problems in the tracker: http://www.ba.infn.it/~zito/cms/craft1/craft.html
The event classification results were loaded into an event display web server as a ROOT file "ntuple.root", one for each run. Each event was described by an ntuple in a ROOT tree. http://cmstac05.cern.ch/event_display/
17
Building the prototype: the ordeal of running CMSSW tasks for normal users
75 runs processed (all good runs with more than 200,000 events) with 154 batch jobs on the LSF service at CERN. Queues used: 48 1nw, 10 2nd, 60 1nd, 27 8nh, 8 1nh. 55 jobs exited! CPU time: up to 82,000 sec. Max swap: from 1000 to 1870 MB. Max memory: from 700 to 1500 MB. Not having enough AFS space (1 GB), I could not send more than a few jobs in parallel. Solar time to get the results from a run (taking into account exited jobs): from 1 hour to two weeks! CMSSW uses an awful lot of resources even for very simple tasks, like building a simple plot over a complete run. It is like using a jet to go home a few blocks away when you need a bicycle.
18
Visual summary of CRAFT
Each run is represented as a single image. The miniature points to a larger image. An SVG version is also kept, with the possibility to interact. Images of different runs can be visually compared. The first image is a legend. The next 4 are trackermaps obtained by processing MC events.
19
Single run trackermap – legend. The coloring is done using the same palette for all runs (no automatic scaling).
20
Visual summary of CRAFT used to find tracker problems: can you spot them? Integrated signal from all 345,000 cosmic tracks in run 67647
21
Trackermaps in the visual summary can be processed like histograms. Hit occupancy trackermap integrated over most of the runs of CRAFT phase 2.
22
The results
Comparison of the resources used to answer the query "events having aligned strings of clusters in the endcaps" in run 66714 (5.9M events, 197 files; 209 events selected in the whole run). We use three methods:
1) Full CMSSW: build a skim of the events presenting this pattern and then look at them. Because of the amount of resources needed, this can be run only in batch using the LSF service.
2) ROOT + FWLite, without the classification ntuples. This can be run interactively on lxplus.
3) Create the selection using the classification ntuples, then use this list of events to look directly at the events (or produce a skim). This second step requires a CMSSW task but uses very little CPU time and so can be run interactively on lxplus.
23
The results (2)
1 – Skim with full CMSSW (runs in batch on LSF): CPU time 14,881 sec; memory max/swap 1286/1599 MB; solar time 10 hours; can look at the first event after minutes or hours (depending on how long the job waits before starting); time to look at all 200 events: 10 hours plus the time the job waits in the queue before starting.
2 – Selection with FWLite (runs interactively on lxplus): CPU time 3,546 sec; memory max/swap 250/360 MB; solar time 4 hours; first event after 10 min; all 200 events in 4 hours.
3 – Use of the selection ntuples (runs interactively on lxplus): CPU time 860 sec; memory max/swap 1200/1400 MB; solar time 1 hour; first event after 1 min; all 200 events in 1 hour.
24
Why are 2 and 3 so fast? They use 3 jobs that run on three different computers in parallel. The 3 jobs are optimized to use fewer resources and to be run interactively. Each job is also specialized for one task.
1. This job is specialized for selecting events. This can be done using the events directly (method 2) or using the event classification (method 3). In the latter case, interactively making a list of all events in a run passing an arbitrary cut (e.g. tracks in TID and in pixel) takes less than 1 minute; doing the same with FWLite takes 4 hours.
2. Frog Analyzer. A CMSSW task that creates the Frog file for the selected events, quickly extracting from each event the information needed by the Visualizer. As soon as a new event is ready, it is made available to the Visualizer. In method 2 this job works in parallel with job 1 and so takes no extra time.
3. Frog Visualizer. After a few minutes it lets me look at the first selected events. In an hour I can look at all the selected events in a 6M event run with method 3 (4 hours with method 2). This job can run on any computer with an Internet connection. At any moment I can stop the process and try a new selection.
25
Proposal for implementing the strategy in offline DQM
The building of the two devices is done once and for all and requires CMSSW tasks. It is proposed to include them in the offline DQM processing. The trackermap can be built by harvesting the DQM results, provided that these contain the information needed for each tracker module. For the event classification we propose a simple scheme consisting of building, for each file, a histogram containing one bin per event.
26
Creating the classification histogram (recHits_geom.cc)
Declare the histogram, calculate the bin content, fill the histogram. Output: one histogram per ROOT file, with the name and title identifying the run and the first event of the file, and one bin per event. Bin value (bit mask):
1st bit: recoTracks > 0
2nd bit: recoTracks > 1
3rd bit: clusters > 100
4th bit: recHits in pixel
5th bit: recHits in TIB
6th bit: recHits in TID-
7th bit: recHits in TID+
8th bit: recHits in TOB
9th bit: recHits in TEC-
10th bit: recHits in TEC+
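Purely as an illustration, here is a sketch of the bit-mask encoding just described. It is not the actual recHits_geom.cc (which is a CMSSW task); the EventFlags struct and the function names are invented for this example, standing in for whatever counters the real analyzer derives from the reconstructed tracks, clusters and rec hits.

```cpp
// Sketch of the classification bit mask: one histogram per file,
// one bin per event, bin content = packed flags.
#include "TH1D.h"

struct EventFlags {
  int nTracks;        // reconstructed tracks in the event
  int nClusters;      // strip clusters in the event
  bool inPixel, inTIB, inTIDminus, inTIDplus, inTOB, inTECminus, inTECplus;
};

// Pack the per-event flags into the bit mask used as bin content.
int classify(const EventFlags& f) {
  int mask = 0;
  if (f.nTracks > 0)     mask |= 1 << 0;  // 1st bit: recoTracks > 0
  if (f.nTracks > 1)     mask |= 1 << 1;  // 2nd bit: recoTracks > 1
  if (f.nClusters > 100) mask |= 1 << 2;  // 3rd bit: clusters > 100
  if (f.inPixel)         mask |= 1 << 3;  // 4th bit: rec hits in pixel
  if (f.inTIB)           mask |= 1 << 4;  // 5th bit: rec hits in TIB
  if (f.inTIDminus)      mask |= 1 << 5;  // 6th bit: rec hits in TID-
  if (f.inTIDplus)       mask |= 1 << 6;  // 7th bit: rec hits in TID+
  if (f.inTOB)           mask |= 1 << 7;  // 8th bit: rec hits in TOB
  if (f.inTECminus)      mask |= 1 << 8;  // 9th bit: rec hits in TEC-
  if (f.inTECplus)       mask |= 1 << 9;  // 10th bit: rec hits in TEC+
  return mask;
}

// One bin per event: store the classification mask as the bin content
// (run and first event number would go into the histogram name/title
// for later use by scan.C).
void fillClassification(TH1D& h, int bin, const EventFlags& f) {
  h.SetBinContent(bin, classify(f));
}
```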
27
Reading the histogram and creating the event list (scan.C)
Get the names of the ROOT files; get the run and the first event from the histogram; load the histogram; check the selection for each bin and, if the event is accepted, print its event number. Input: ROOT files. Output: event list.
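Again as an illustration only, a sketch of what such a selection macro could look like. It is not the actual scan.C: the file name, histogram name and firstEvent argument are placeholders, whereas in the real macro the run number and the first event of the file are recovered from the histogram name/title.

```cpp
// Sketch of the selection step: loop over the bins of a classification
// histogram and print the numbers of the events whose bit mask passes
// an arbitrary cut.
#include <iostream>
#include "TFile.h"
#include "TH1.h"

void scanSketch(const char* fileName = "classification.root",  // placeholder
                const char* histName = "evclass",              // placeholder
                long long firstEvent = 1) {                    // placeholder
  TFile f(fileName);
  TH1* h = dynamic_cast<TH1*>(f.Get(histName));
  if (!h) { std::cerr << "histogram not found\n"; return; }

  // Example cut in the spirit of "tracks in TID and in pixel":
  // rec hits in the pixel detector (4th bit) and in TID- or TID+
  // (6th or 7th bit), using the bit numbering of the previous slide.
  for (int bin = 1; bin <= h->GetNbinsX(); ++bin) {
    int mask = static_cast<int>(h->GetBinContent(bin));
    bool inPixel = mask & (1 << 3);
    bool inTID   = (mask & (1 << 5)) || (mask & (1 << 6));
    if (inPixel && inTID)
      std::cout << (firstEvent + bin - 1) << "\n";  // event number for this bin
  }
}
```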
28
Conclusion 1
During data taking in CRAFT, the use of event displays for monitoring was almost absent, for many reasons: too many events (how to select the good ones?), slow access to the selected events, event displays too slow. I showed here that this (not using the event display) was not a good idea, because it can make it difficult to discover detector problems, like the ones found only two months after data taking. A strategy was proposed and tested, consisting of creating a system that allows the user to look immediately and quickly at interesting events. To set up this system, all events should be processed once to create:
1) A synoptic view of the subdetectors (in our case the trackermap)
2) A database of events
29
Conclusion 2
The synoptic view will help the user choose the interesting runs, and the database the interesting events within a run. All three CMS general purpose event display tools should be ready to be used with this system when LHC data taking starts, although during this test I was able to use only Frog. I hope that it will not only enable us to find tracker problems quickly, but also new physics events. Thanks for listening!
30
Finale
31
Finale (2)
Once the system was working, it was easy to solve the riddle of these aligned strings of clusters in the tracker endcaps:
1) Processing a run without magnetic field showed that the pattern is absent there.
2) Processing a run with MC generated data showed that the pattern was present, although in a slightly smaller fraction of events.
3) Processing 3 complete runs, I could look at a sample of 1000 events containing this pattern. It was clear that, in almost all cases, when the primary cosmic ray was present in the tracker, the line along the string of clusters was connected in 3D to the cosmic track. Sometimes, like in the event reported in the other slide, you could see the complete track originating from the primary track.
4) So these strings of aligned clusters are secondary tracks generated by the interaction of the primary track with the detector material. These low-momentum tracks spiral along the magnetic field, generating the pattern.
5) A plot of the cluster charge confirmed this explanation.
https://hypernews.cern.ch/HyperNews/CMS/get/tk-commissioning/191.html