CHEP 2003, David Chamont (CMS - LLR)

Twelve Ways to Build CMS Crossings from Root Files
Benefits and deficiencies of Root trees and clones when:
– NOT dealing with TObjects,
– reading the tree entries NOT sequentially,
– processing them NOT one by one.

Outline
– Goal & scope.
– Main use-case.
– 4 kinds of containers for the crossing data model.
– 3 kinds of persistency managers.
– Results.
– Conclusions.

Goal & Scope
– Evaluate the benefits of TTree and TClonesArray for the persistency of CMS event data (whose classes rely heavily on templates and external packages).
– Focus on the generation of crossings (pile-up of about 160 simulated events chosen pseudo-randomly).
– Not covered yet: meta-data, associations between persistent objects, schema evolution.
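
The pseudo-random choice of pile-up events can be illustrated with a minimal sketch. It is not the testbed's code: the function name `pickPileupEntries`, the seed parameter, the uniform distribution, and drawing with replacement are all assumptions made for illustration. Only the figure of about 160 events per crossing comes from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw `pileup` entry indices from a pool of
// `poolSize` minimum-bias events.
// Assumption: uniform draws with replacement; the slides do not say.
std::vector<std::size_t> pickPileupEntries(std::size_t poolSize,
                                           std::size_t pileup,
                                           unsigned seed) {
    assert(poolSize > 0);
    std::mt19937 gen(seed);  // fixed seed makes a crossing reproducible
    std::uniform_int_distribution<std::size_t> dist(0, poolSize - 1);

    std::vector<std::size_t> entries;
    entries.reserve(pileup);
    for (std::size_t i = 0; i < pileup; ++i) {
        entries.push_back(dist(gen));
    }
    return entries;
}
```

Calling `pickPileupEntries(1000, 160, 42)` would give the 160 entry numbers to read for one crossing; the resulting non-sequential access pattern is what stresses the tree reading.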

Main Use-Case
[Diagram with the following elements: signal events file, minbias events files, Persistency Manager, signal event (hits), minbias event (hits), Digitizer, digis, digis file.]

Crossing Data Model
The folders //root/crossing/* represent the events composing the current crossing.
Each event folder contains a container for each kind of event object: TrackHit, CaloHit,…
The kind of container is chosen among four:
– std::vector<> (by value).
– dynamic C array (each event object is wrapped inside a class instrumented with ClassDef).
– TObjArray (each event object is wrapped inside a class derived from TObject).
– TClonesArray (each event object is wrapped inside a class derived from TObject).
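
To make the difference between these container kinds concrete, here is a rough standard-library analogue. It deliberately avoids Root: `TrackHit` is a hypothetical three-float stand-in for the real CMS class, `HitPointerArray` only mimics the pointer-based layout of a TObjArray, and `HitPool` only mimics the slot-reuse idea behind TClonesArray. None of these names come from the testbed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <vector>

// Hypothetical event object (the real CMS classes are not shown).
struct TrackHit { float x = 0, y = 0, z = 0; };

// 1. By value: the vector owns contiguous copies of the hits.
using HitVector = std::vector<TrackHit>;

// 2. Dynamic C array: the size must travel alongside the pointer.
struct HitCArray {
    std::size_t size;
    std::unique_ptr<TrackHit[]> hits;
    explicit HitCArray(std::size_t n) : size(n), hits(new TrackHit[n]()) {}
};

// 3. Pointer array (TObjArray-like): one heap allocation per hit.
using HitPointerArray = std::vector<std::unique_ptr<TrackHit>>;

// 4. Slot reuse (TClonesArray-like): objects are allocated once and
//    recycled from one event to the next.
class HitPool {
public:
    // Hand out the next slot; allocate only when no old slot is free.
    TrackHit& next() {
        if (used_ == slots_.size()) slots_.emplace_back();
        return slots_[used_++];
    }
    TrackHit& at(std::size_t i) {
        assert(i < used_);
        return slots_[i];
    }
    void clear() { used_ = 0; }  // forget the contents, keep the memory
    std::size_t entries() const { return used_; }
    std::size_t capacity() const { return slots_.size(); }
private:
    std::deque<TrackHit> slots_;  // deque keeps references stable on growth
    std::size_t used_ = 0;
};
```

After `clear()`, the pool reports zero entries but keeps its capacity, so filling the next event costs no allocation. This is the kind of saving that pays off with many objects per event and matters less with small containers.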

Persistency Managers
The task of a persistency manager is to transfer an event from memory (TFolder) to disk (TFile) and vice-versa.
Three flavors have been implemented:
– RtbPomKeys: writes the TFolder directly in the TFile, each time with a different meaningful name.
– RtbPomTreeMatrix: for each container in the folder, creates a TMatrixD and attaches it to a branch of a TTree.
– RtbPomTreeDirect: attaches each container directly to a branch of a TTree.
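
The shared role of the three flavors can be pictured as one interface with interchangeable back ends. The sketch below is an in-memory analogue only, assuming an event reduced to a list of numbers. The class names, the `event_<n>` key pattern, and the use of `std::map` and `std::vector` in place of TFile and TTree are illustrative assumptions, not the RtbPom implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: an event is just a list of values.
using Event = std::vector<double>;

// Common interface: move an event between memory and "disk".
class PersistencyManager {
public:
    virtual ~PersistencyManager() = default;
    virtual void write(const Event& event) = 0;
    virtual Event read(std::size_t entry) const = 0;
    virtual std::size_t entries() const = 0;
};

// Keys-like flavor: each event is stored whole under its own name.
class KeysManager : public PersistencyManager {
public:
    static std::string keyName(std::size_t entry) {
        return "event_" + std::to_string(entry);  // assumed naming scheme
    }
    void write(const Event& event) override {
        file_[keyName(count_++)] = event;
    }
    Event read(std::size_t entry) const override {
        auto it = file_.find(keyName(entry));
        assert(it != file_.end());
        return it->second;
    }
    std::size_t entries() const override { return count_; }
private:
    std::map<std::string, Event> file_;  // plays the role of the file
    std::size_t count_ = 0;
};

// Tree-like flavor: events are appended and addressed by entry number.
class TreeManager : public PersistencyManager {
public:
    void write(const Event& event) override { entries_.push_back(event); }
    Event read(std::size_t entry) const override {
        assert(entry < entries_.size());
        return entries_[entry];
    }
    std::size_t entries() const override { return entries_.size(); }
private:
    std::vector<Event> entries_;  // plays the role of the tree
};
```

Because the digitizer side only sees `PersistencyManager`, the same test can be rerun with each flavor, which is what makes a container-by-manager comparison possible.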

Implementation issues
Recent progress:
– can now use -ansi -pedantic.
– nice support of foreign classes.
– better and better support of templates and std containers.
Recurrent problems with Root I/O:
– must explicitly ask to parse namespaces, component types and template instances,
– multiple container sizes and misleading operator[],
– tuning of chain branches,
– it is unclear which subset of C++ is supported by Root I/O, which by TTree, and which by TClonesArray.

Configuration
– Pentium 4, 1.8 GHz.
– ~512 MB of RAM.
– IDE disk.
– RedHat Linux 7.3.
– GCC 3.2.
– Root 3.05/03.

Parameters of the Testbed
– compression level.
– size of buffers.
– split level.
– randomness: burst and jump.
– size of the containers within the events.
– number of crossings.
– direct inheritance or not from TObject, direct instrumentation or not with ClassDef.
– resetting or not the values in the empty constructors.
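
The slides do not define "burst" and "jump". One plausible reading, assumed here purely for illustration, is that a burst is a run of consecutive entries read in sequence and a jump is a random number of entries skipped between two bursts. Under that assumption, the access pattern could be generated as follows (`burstJumpEntries` and its parameters are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical generator of entry numbers.
// Assumed semantics: read `burst` consecutive entries, then skip
// between 1 and `maxJump` entries, wrapping around the pool.
std::vector<std::size_t> burstJumpEntries(std::size_t poolSize,
                                          std::size_t count,
                                          std::size_t burst,
                                          std::size_t maxJump,
                                          unsigned seed) {
    assert(poolSize > 0 && burst > 0 && maxJump > 0);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> jumpDist(1, maxJump);

    std::vector<std::size_t> entries;
    entries.reserve(count);
    std::size_t current = 0;
    while (entries.size() < count) {
        // One burst: consecutive entries.
        for (std::size_t i = 0; i < burst && entries.size() < count; ++i) {
            entries.push_back(current);
            current = (current + 1) % poolSize;
        }
        // One jump: skip a random number of entries.
        current = (current + jumpDist(gen)) % poolSize;
    }
    return entries;
}
```

With a burst as long as the crossing, the access is purely sequential; with a burst of 1 and a large maximum jump, nearly every read lands in a different region of the file. In this reading, "increasing random" means moving from the first regime toward the second.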

Best Results
[Figure: file size (Kb/event) and CPU time (s/crossing) for the containers Stl, C, ObjArray, Clones and the persistency managers Keys, Matrix, Tree.]

Remove compression
[Figure: file size (Kb/event) and CPU time (s/crossing) for the containers Stl, C, ObjArray, Clones and the persistency managers Keys, Matrix, Tree.]

Then increase random
[Figure: file size (Kb/event) and CPU time (s/crossing) for the containers Stl, C, ObjArray, Clones and the persistency managers Keys, Matrix, Tree.]

Then reduce containers /10
[Figure: file size (Kb/event) and CPU time (s/crossing) for the containers Stl, C, ObjArray, Clones and the persistency managers Keys, Matrix, Tree.]

Then remove random
[Figure: file size (Kb/event) and CPU time (s/crossing) for the containers Stl, C, ObjArray, Clones and the persistency managers Keys, Matrix, Tree.]

Other Results
– Resetting the attributes to 0 in the empty constructors of the event data does not help compression.
– The write CPU time of the different strategies compares much like the read time, with the Tree strategies a little slower.

Conclusions
– We succeeded in reading pseudo-random entries from a chain and dispatching them to a few hundred folders (although tuning the TChain branches was not straightforward).
– Support for foreign classes, templates and the C++ standard library has greatly improved.
– The magic couple TTree/TClonesArray has proved very efficient, yet it requires top-level TObjects, and the benefits can become losses with less data or a random access pattern.
– One can simply use std::vector and store the vectors directly in Root files. Their integration in a TTree is not as rewarding as that of a TClonesArray. This could change in the next release of ROOT.
– It would be interesting to retry with direct associations between objects, and to apply the testbed to POOL.