ATLAS Analysis Model
Introduction On Feb 11, 2008 the Analysis Model Forum published a report (D. Costanzo, I. Hinchliffe, S. Menke, ATL-GEN-INT ) describing the analysis model needed. The report sets out guidelines for how analysis should be done in ATLAS. Although some things might not go as planned, I think it is very helpful to see the idea behind all the tools. Comments during this lecture are very welcome, since this is an evolving subject. Heavy-ion needs were not addressed in the report; they are quite different from the needs of proton collisions.
EDM This paragraph emphasizes the need for an event data model (EDM):
Data Structure
RDO - Raw Data Object
–Content - full information on the detector response.
–Size - should be ~2 MB/evt
ESD - Event Summary Data
–Content - the detailed output of the detector reconstruction.
–Derivation - from RDO.
–Purpose - should have sufficient information for particle identification and track re-fitting.
–Size - should be ~500 kB/evt for real data. The current data size is ~20% larger.
–Format - POOL file
AOD - Analysis Object Data
–Content - summary of all the reconstructed objects.
–Derivation - from ESD
–Purpose - provide sufficient information for common analyses.
–Size - should be ~100 kB/evt for real data; however, it is now ~200 kB/evt, most of which is trigger information. MC truth should take ~60 kB/evt, so the truth information in the AOD is not full (reduction according to ATL-SOFT-INT ).
–Format - POOL file
Data Structure
DPD - Derived Physics Data
–D1PD (primary DPD)
  Content - different for different communities; defined by the relevant community.
  Derivation - from AOD (sometimes from ESD)
  Size - should be small enough to copy to Tier-3 or off-grid disks, ~10 kB/evt
  Format - POOL file
–D2PD (secondary DPD)
  Content - specific to a certain analysis (defined by the relevant group). Derived information can be added.
  Derivation - from D1PD and AOD
  Format - POOL file
–D3PD (tertiary DPD)
  Content - should contain all the information needed to produce the final plots for publication
  Format - hbook/ntuple/POOL file/other
Tags
–Content - predefined fields for quick event identification
–Size - should be ~1 kB/evt
–Format - database or ROOT files
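To get a feeling for what these per-event targets mean in practice, the short Python sketch below multiplies the target sizes quoted on the two Data Structure slides by a number of events. The sample of one million events, the decimal units (1 GB = 1e6 kB) and the function names are my own assumptions for illustration; none of them come from the report.

```python
# Target per-event sizes quoted on the Data Structure slides, in kB/evt.
TARGET_SIZE_KB = {
    "RDO": 2000.0,  # ~2 MB/evt
    "ESD": 500.0,
    "AOD": 100.0,
    "D1PD": 10.0,
    "TAG": 1.0,
}


def sample_size_gb(n_events, size_kb_per_event):
    """Total size in GB, assuming decimal units (1 GB = 1e6 kB)."""
    return n_events * size_kb_per_event / 1.0e6


def size_table(n_events):
    """Map each data format to the total size (GB) of n_events events."""
    return {fmt: sample_size_gb(n_events, kb) for fmt, kb in TARGET_SIZE_KB.items()}


if __name__ == "__main__":
    n_events = 1_000_000  # hypothetical sample size, chosen for illustration
    for fmt, gb in size_table(n_events).items():
        print(f"{fmt:5s} {gb:10.1f} GB")
```

With these targets, one million events take about 2 TB as RDO but only about 10 GB as primary DPD, which is the scale at which copying to Tier-3 or off-grid disks becomes realistic.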
Terms
Skimming – removal of events
Thinning – removal of individual objects from a container
Slimming – removal of unneeded parts of an object (in some usage also of whole containers)
Computing Model [Data-flow diagram; labels recovered: BS, DAQ + Trigger, RDO, Reconstruction, TAGs, ESD, AOD, Common Analysis, DPD, LaTeX]
Frameworks
Athena - analysis inside Athena. The analysis is done by writing algorithms and tools that use the full Athena framework.
Intermediate framework (“EventView”) - a collection of common tools for creating DPDs.
ARA (AthenaROOTAccess) - provides C++ and Python code to convert persistent data into transient data. It does not include the Athena services, so analyses that need database services (such as geometry) cannot be done in ARA, for example analyses that involve calorimeter cells and the full vertex and tracking information.
Recommendations in the report Official analyses must be done using validated tools only! So work with Athena tools as much as you can, and add your private tools to Athena. Many recommendations were made. For completeness I have copied all of them here, but I will discuss only a few of them.
Recommendations in the report
Storage format of DPDs
–Only D3PD can be an ntuple
–Use official tools for analysis.
–Put your tools in a public place
Recommendations in the report
Distribution of and access to DnPDs
ARA
–CINT is not recommended
–Python - two times faster than CINT
–Compiled C++ - two times faster than Python
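The two factors of two compound, so compiled C++ comes out about four times faster than CINT. A minimal sketch of that arithmetic follows; the 60-minute CINT job is a made-up number, and the mode labels and function name are just ones I chose for illustration.

```python
# Speed relative to CINT, as quoted on the slide:
# Python is ~2x faster than CINT, compiled C++ is ~2x faster than Python.
SPEEDUP_VS_CINT = {
    "CINT": 1.0,
    "Python": 2.0,
    "C++": 2.0 * 2.0,
}


def estimated_minutes(cint_minutes, mode):
    """Scale a (hypothetical) CINT run time to another access mode."""
    return cint_minutes / SPEEDUP_VS_CINT[mode]


if __name__ == "__main__":
    cint_minutes = 60.0  # made-up run time, for illustration only
    for mode in SPEEDUP_VS_CINT:
        print(f"{mode:7s} {estimated_minutes(cint_minutes, mode):5.1f} min")
```

These are rough factors from the slide, not measurements of any particular job, so treat the output as an order-of-magnitude guide.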
Recommendations in the report Code distribution and software infrastructure Event Data Model
Recommendations in the report EDM Back-of-the-envelope calculation: reading 1M events takes ~15 min just for reading the information. Ntuples are ten times faster than that.
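The same estimate can be written out in a few lines of Python. This is only a sketch of the arithmetic on the slide: the function names are mine, and the factor of ten is simply the slide's statement about ntuples applied to the 15-minute figure.

```python
def events_per_second(n_events, minutes):
    """Average read rate for n_events read in the given number of minutes."""
    return n_events / (minutes * 60.0)


def faster_read_minutes(minutes, speedup=10.0):
    """Read time after applying a speed-up factor (10x for ntuples on the slide)."""
    return minutes / speedup


if __name__ == "__main__":
    n_events = 1_000_000
    read_minutes = 15.0  # figure quoted on the slide
    print(f"read rate:   {events_per_second(n_events, read_minutes):.0f} events/s")
    print(f"ntuple read: {faster_read_minutes(read_minutes):.1f} min")
```

So the quoted read corresponds to a little over a thousand events per second, and the ntuple read of the same sample to about a minute and a half.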
Recommendations in the report Primary DPD content Priorities and coordination of Primary DPD production
Recommendations in the report Primary DPD production
Recommendations in the report Toolkits or analysis frameworks As I understand it, this means that you cannot build primary DPDs with EventView, but I am not sure I understand it correctly.
Recommendations in the report –EventView