Use of Root I/O Trees for CMS Crossings

Slides:



Advertisements
Similar presentations
De rien. Youre welcome Je dois I have to En vacances.
Advertisements

Youre welcome. D_ R_ De rien I have to J_ d_ Je dois.
À peu près/environ. about/approximately Cest tout pour aujourdhui.
TICE 2 ième Semestre Fonctions « logiques ». Février 2006TICE 2ième Semestre - Les fonctions logiques2 Petits rappels… Une formule est toujours de la.
File Management Chapter 12. File Management File management system is considered part of the operating system Input to applications is by means of a file.
File Management Chapter 12. File Management A file is a named entity used to save results from a program or provide data to a program. Access control.
CHEP ' 2003David Chamont (CMS - LLR)1 Twelve Ways to Build CMS Crossings from Root Files Benefits and deficiencies of Root trees and clones when : - NOT.
Filesytems and file access Wahid Bhimji University of Edinburgh, Sam Skipsey, Chris Walker …. Apr-101Wahid Bhimji – Files access.
ROOT and Federated Data Stores What Features We Would Like Fons Rademakers CERN CC-IN2P3, Nov, 2011, Lyon, France.
STAR Event data storage and management in STAR V. Perevoztchikov Brookhaven National Laboratory,USA.
1 Class Diagrams. 2 Overview Class diagrams are the most commonly used diagrams in UML. Class diagrams are for visualizing, specifying and documenting.
STAR Schema Evolution Implementation in ROOT I/O V. Perevoztchikov Brookhaven National Laboratory,USA.
9/28/2005Philippe Canal, ROOT Workshop TTree / SQL Philippe Canal (FNAL) 2005 Root Workshop.
STAR Persistent Pointers in the STAR Micro-DST V. Perevoztchikov Brookhaven National Laboratory,USA.
Tree-Structured Indexes. Introduction As for any index, 3 alternatives for data entries k*: – Data record with key value k –  Choice is orthogonal to.
Densité et masse volumique. Imaginons que nous prenions deux cylindres, de même volume, constitués avec des métaux différents. En les pesant, nous trouvons.
Dynamic Host Configuration Protocol 1 DHCP. Introduction Lorsque vous connectez une machine à un réseau Ethernet TCP/IP, cette machine, pour fonctionner.
le salon / la salle de séjour
WP12 - General Development News Sandro Wenzel
CE 454 Computer Architecture
Must + Have to with the Angry Family
CS 540 Database Management Systems
Data Structure Interview Question and Answers
Azita Keshmiri CS 157B Ch 12 indexing and hashing
Chapter 12 File Management
B-Trees B-Trees.
FileSystems.
Tree-Structured Indexes
Tree-Structured Indexes
Indexing ? Why ? Need to locate the actual records on disk without having to read the entire table into memory.
Introduction to Design Patterns
Recent performance improvements in ALICE simulation/digitization
Hash-Based Indexes Chapter 11
M. Amine BENNIS Booster sa carrière: perspectives. 07/12/20171 BOOSTER SA CARRIERE.
CS 3343: Analysis of Algorithms
Point de départ Like other commonly used verbs, the verb faire (to do, to make) is irregular in the present tense. © 2015 by Vista Higher Learning,
C4SI- Workshop on impact evaluation
IN EVERY OFFICE THERE IS SOMEBODY...
Vincenzo Innocente CERN/EP/CMC
AVG 24th 2015 ADVANCED c# - part 1.
B-Trees.
Connaissez-vous cet endroit ?
Tree-Structured Indexes
CS222/CS122C: Principles of Data Management Notes #07 B+ Trees
Tree-Structured Indexes
Hash-Based Indexes Chapter 10
Automated support of STL containers
Introduction to Operating Systems
Tree-Structured Indexes
Class Diagrams.
Index tuning Hash Index.
Multiway Trees Searching and B-Trees Advanced Tree Structures
OO-Design in PHENIX PHENIX, a BIG Collaboration A Liberal Data Model
CS202 - Fundamental Structures of Computer Science II
In each office there are people......
Chapter 12 Query Processing (1)
Zooming on ROOT files and Containers
Tree-Structured Indexes
Tree-Structured Indexes
Chapter 11 Instructor: Xin Zhang
Lecture 20: Indexes Monday, February 27, 2006.
B-Trees.
Ns-3 Training Simulator core ns-3 training, June 2016.
CS222/CS122C: Principles of Data Management UCI, Fall 2018 Notes #06 B+ trees Instructor: Chen Li.
Chapter 5 File Systems -Compiled for MCA, PU
Site diaporamas carminé
New I/O navigation scheme
CS222P: Principles of Data Management UCI, Fall Notes #06 B+ trees
B+-trees In practice, B-trees are not used much as defined earlier.
Presentation transcript:

Use of Root I/O Trees for CMS Crossings Benefits and deficiencies of Root I/O trees when : - NOT dealing with TObjects, - reading the trees entries NOT sequentially, - processing them NOT one by one. ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Outline Goal & scope. Main use-case. Crossing data model. Persistent objects managers. Four persistency strategies. Storage of not-TObjects. Implementation issues. Performances. Conclusions. Wish-list to the Root team. Future work. ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Goal & scope Evaluate the use of TTree for the persistency of CMS event data (whose classes heavily rely on templates and external packages). Focus on the generation of crossings (pile-up of about 200 simulated events chosen randomly). Not covered yet : meta-data and references. ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Main Use-Case Digitizer signal event (hits) minbias event (hits) digis Pom Pom Pom files of ~ 500 events events of ~ in memory. crossing of ~ 200 minbias events. minbias events file signal events file digis file minbias events file ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Crossing Data Model The folders //root/pool/* represent the events composing the current crossing. Each event folder contains collections of RtbTrackHit, RtbCaloHit,… These collections should be kind of RtbVArray<> : RtbCArray<> (home-made C-like array). RtbClonesArray<> (wrap a TClonesArray). RtVArray is an abstract class. The choice of the concrete class if delegated to the persistency manager. ROOT ' 2002 David Chamont (CMS - LLR)

Persistent Objects Managers Persistency managers are able to transfer event data (set of RtbVArray<>) between a TFolder and a TFile. We implemented three kinds of RtbVPom : RtbMatrixPom : for each RtbVArray<>, create a TMatrixD and attach it to a branch of a TTree. RtbDirectPom : attach each RtbVArray<> of the folder to a branch of a TTree. RtbKeysPom : directly store the TFolder in the TFile, each time with a different meaningful name. I had to use old branches for RtObjArray I was forced to split TClonesArray RtMatrixPom is a reference, it does not use rootcint, but only root predefined classes ROOT ' 2002 David Chamont (CMS - LLR)

Four persistency strategies RtbMatrixPom (matrix) RtbCArray and RtbKeysPom (keys) RtbCArray and RtbDirectPom (carray) RtbClonesArray and RtbDirectPom (clones) ROOT ' 2002 David Chamont (CMS - LLR)

Storage of non-TObjects RtbCArray RtbCArray I had to use old branches for RtObjArray I was forced to split TClonesArray RtMatrixPom is a reference, it does not use rootcint, but only root predefined classes ROOT ' 2002 David Chamont (CMS - LLR)

Implementation issues Tips to use Root together with Scram/Cms add -p to rootcint (why not the default ?) ask to parse namespaces (why not the default ?) always add -fPIC for link step remove -ansi –pedantic Typical problems with Root I/O : collections sizes and operator[], storage of empty collections, ownership of objects attached to the branches, tuning of chain branchs. ROOT ' 2002 David Chamont (CMS - LLR)

200 random events read time (s) Performances Matrix Keys CArray Clones 500 events file size (Mb) 119.6 63.3 118.9 63.5 119.3 63.8 106.6 57.7 500 events write time (s) 153 74 191 95 197 96 147 72 200 random events read time (s) 4.57 6.93 5.82 5.88 files are smaller than objy ones objarray have old branches and cannot have empty arrays clones cannot be unsplit, and there is a memory leak. strategy 1 is interesting because it does not require the use of rootcint (but we must implement the serialization ourselves). strategy 2 is interesting because we do not pay the price for TObjects, yet we do not take benefit from the annouced optimization of TTree for TClonesArray ( but is this optimization real ?). strategy 3 is the most ROOT oriented, but the wrapping of objects could be counter-productive. in fact, wrapping can only rely on templates, and this currently prevent us from bypassing the streamer An interesting result is that the 3 strategies have quite similar results… I should evaluate cpu time without compression of data. Les objets de base ne sont pas splites Les collections sont precrees au sein du pool ? si j’implemente la creation de collections de facon externes a la piscine et leur placement dans la piscine par l’utilisateur, je dois me demander ce qui se passe si une des branches a son pointeur vide (je pourrais essayer avant chaque fill de desactiver les branches des pointeurs vides, mais est ce que je conserve les bons rangs ; ou alors il me faut faire au vol une collection vide (mais j’ai deja des problemes avec les collections vides ! pour le moment, je peux faire l’hypothese qu’il n’y a jamais de vide et toujours le meme nombre de collections. ? est-ce que les TClonesArray vides posent les memes problemes que les RtObjArray vides ? ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Conclusions We succeeded to read pseudo-random entries from a chain and dispatch them to few hundred folders, yet TChain/TTree/TBranch can be improved to support this use-case more efficiently. Use of specialized Root classes (TTree & TClonesArray) improve performance but increase problems with complex C++ structures and foreign classes. Best argument for TTree is the privileged connection to Root analysis features. strategy 1 is interesting because it does not require the use of rootcint (but we must implement the serialization ourselves). strategy 2 is interesting because we do not pay the price for TObjects, yet we do not take benefit from the annouced optimization of TTree for TClonesArray ( but is this optimization real ?). strategy 3 is the most ROOT oriented, but the wrapping of objects could be counter-productive. in fact, wrapping can only rely on templates, and this currently prevent us from bypassing the streamer An interesting result is that the 3 strategies have quite similar results… I should evaluate cpu time without compression of data. ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Wish-list to Root team Enable to change the value of a pointer attached to a TBranch. Enable to add a user-defined function to a TChain, to be called each time the current tree is changed. Clarify/document the subset of C++ which is supported by Root I/O, and which subset is supported within trees. ROOT ' 2002 David Chamont (CMS - LLR)

David Chamont (CMS - LLR) Future work Provide root team with unscramed demonstrations of what we observed. Implement a fifth strategy using TObjArray, for comparison with TClonesArray. Try to make the data classes directly inherit from TObject and compare performances with RtbObj<>. Look inside Root classes implementations, optimize strategies accordingly, and upgrade the conclusions. Add references between objects. ROOT ' 2002 David Chamont (CMS - LLR)