Data Formats and Impact on Federated Access

ALICE Data Formats and Impact on Federated Access
Costin.Grigoras@cern.ch, Andrei.Gheata@cern.ch

General points
- All data files are in ROOT format
  - Raw data is "ROOTified" at the DAQ level
  - Reconstructed & MC data (ESDs)
  - Analysis input data (AODs)
  - Intermediate and merged analysis results, QA, etc.
- Complex detector with 18 subsystems
  - Large event size, dominated by the TPC contribution
  - Up to 5-8 MB / event (uncompressed) for Pb-Pb
- Data is accessed directly from storage with the Xrootd protocol (see the sketch below)
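To make the last point concrete, here is a minimal ROOT/C++ sketch of direct access over the Xrootd protocol; the SE host and file path are hypothetical placeholders, real locations come from the AliEn file catalogue.

#include <cstdio>
#include "TFile.h"
#include "TTree.h"

void read_remote_aod()
{
   // ROOT dispatches root:// URLs to its Xrootd plugin, so the file is read
   // directly from the storage element, without making a local copy.
   TFile *f = TFile::Open("root://some-alice-se.example//alice/data/AliAOD.root");
   if (!f || f->IsZombie()) return;

   TTree *aod = static_cast<TTree*>(f->Get("aodTree"));
   if (aod) printf("aodTree entries: %lld\n", aod->GetEntries());
   f->Close();
}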

Main containers
- ESDs: 790 branches
- AODs: 400 branches
- Most analyses can run on AODs
  - Some need the extra details available in ESDs
- Formats highly dependent on AliRoot types, with a deep object hierarchy
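As a sketch, such branch counts can be checked directly on a container file (the file name is hypothetical; this counts only top-level branches, whereas the figures above include the full split hierarchy):

#include <cstdio>
#include "TFile.h"
#include "TTree.h"
#include "TObjArray.h"

void count_branches(const char *fname, const char *tname)
{
   // Example call: count_branches("AliAOD.root", "aodTree")
   TFile *f = TFile::Open(fname);
   if (!f || f->IsZombie()) return;
   TTree *t = static_cast<TTree*>(f->Get(tname));
   if (t)
      printf("%s: %d top-level branches\n", tname, t->GetListOfBranches()->GetEntries());
   f->Close();
}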

I/O contributors
- Raw data reconstruction: read from the SE, reconstruct (=> ESDs) and filter (=> AODs), upload the results
  - 2-3 data passes
- Simulation: ESDs and AODs at low throughput
- Re-filtering: AODs from ESDs (infrequent)
- Analysis: I/O-bound reading of ESDs or AODs
  - Largest contributor to storage I/O

Data rates
- On average, a 10:1 read-to-write ratio
- 5% remote reading
- 45% remote writing
- 514 TB total read

~2 MB/s/CPU core delivered
CPU efficiency down to 50% at peak

Analysis sample 1 (plot not reproduced in the transcript)

Analysis sample 2 (plot not reproduced in the transcript)

Considerations
- Remote reading is avoided as much as possible
  - Used as a fallback if the local replica doesn't work (sketch below)
  - Or to speed up the last jobs of an analysis
  - Long-term average: 2-3% of the volume
- Previously more permissive, but it had to be limited because of job performance issues
  - Both from the CPU and the network perspective
  - Even the local storage element cannot sustain the throughput rates of analysis jobs
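The fallback policy can be pictured with a small sketch (the URLs are hypothetical): the replica on the local SE is tried first, and a remote one is opened only if that fails.

#include "TFile.h"

TFile *OpenWithFallback()
{
   const char *replicas[] = {
      "root://local-se.example//alice/data/AliAOD.root",    // preferred: replica on the local SE
      "root://remote-se.example//alice/data/AliAOD.root"    // fallback: remote replica
   };
   for (const char *url : replicas) {
      TFile *f = TFile::Open(url);
      if (f && !f->IsZombie()) return f;   // first replica that opens wins
      delete f;                            // discard a zombie handle and try the next one
   }
   return nullptr;
}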

Improvement vectors
- Combine more analyses in trains
  - Well underway
- Restrict the number of branches to the minimum needed (sketch below)
  - Hard to do, and with little gain since most of the branches are touched
- Filter to more compact formats (nanoAODs)
  - Inflexible analysis code, hard to control productions
- ROOT prefetching revisited
  - Caching, enabled by default, helped a lot; little gain from prefetching
- Flatten the AOD format
  - Reduce part of the ~20% of time spent in deserialization
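Two of these vectors (branch restriction and the ROOT read cache) can be illustrated with a short sketch, assuming a tree called aodTree and a branch name pattern "Tracks*" used purely for illustration:

#include "TTree.h"

void tune_read(TTree *aod)
{
   // Read only the branches the analysis actually needs.
   aod->SetBranchStatus("*", 0);
   aod->SetBranchStatus("Tracks*", 1);

   // The ROOT read cache ("caching enabled by default" above);
   // 30 MB is an arbitrary example size.
   aod->SetCacheSize(30 * 1024 * 1024);
   aod->AddBranchToCache("Tracks*", kTRUE);
}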

Flat AOD exercise
- Use AODtree->MakeClass() to generate a skeleton, then rework it
- Keep all AOD info, but restructure the format (sketch below)
  - Int_t fTracks.fDetPid.fTRDncls -> Int_t *fTracks_fDetPid_fTRDncls; //[ntracks_]
- More complex cases to support the I/O:
  - typedef struct { Double32_t x[10]; } vec10dbl32;
  - Double32_t fTracks.fPID[10] -> vec10dbl32 *fTracks_fPID; //[ntracks_]
  - Double32_t *fV0s.fPx //[fNprongs] -> TArrayF *fV0s.fPx; //[nv0s_]
- Convert AliAODEvent -> FlatEvent
  - Try to keep the full content AND size
- Write FlatEvent to a file
- Compare compression, file size and read speed
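A minimal sketch of the flattened layout (member names shortened and only a few fields shown, not the actual FlatEvent class; the //[ntracks_] comments are the ROOT convention for variable-size arrays with a count leaf):

#include "Rtypes.h"
#include "TArrayF.h"

typedef struct { Double32_t x[10]; } vec10dbl32;   // fixed-size helper for fTracks.fPID[10]

class FlatEvent {
public:
   Int_t       ntracks_;                    // count leaf for the track arrays
   Int_t      *fTracks_fDetPid_fTRDncls;    //[ntracks_] was fTracks.fDetPid.fTRDncls
   vec10dbl32 *fTracks_fPID;                //[ntracks_] was fTracks.fPID[10]

   Int_t       nv0s_;                       // count leaf for the V0 arrays
   TArrayF    *fV0s_fPx;                    // was Double32_t *fV0s.fPx //[fNprongs]
};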

Flat AOD results
Tested on a Pb-Pb AOD:
- aodTree (AliAOD tree): 2327 entries, Total = 2761263710 bytes, File Size = 660491257 bytes, compression factor 4.18
- AliAODFlat (flattened AliAODEvent): 2327 entries, Total = 2248164303 bytes, File Size = 385263726 bytes, compression factor 5.84
- Smaller data (no TObject overhead, TRef -> int)
- 30% better compression
- Reading speed (timing sketch below): CPU time = 103 s, real time = 120 s (original) vs CPU time = 54 s, real time = 64 s (flat)
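The read-speed comparison can be reproduced with a simple full-read loop timed by TStopwatch; the function, file and tree names are placeholders for illustration.

#include <cstdio>
#include "TFile.h"
#include "TTree.h"
#include "TStopwatch.h"

void time_full_read(const char *fname, const char *tname)
{
   TFile *f = TFile::Open(fname);
   if (!f || f->IsZombie()) return;
   TTree *t = static_cast<TTree*>(f->Get(tname));
   if (!t) return;

   TStopwatch sw;
   sw.Start();
   for (Long64_t i = 0; i < t->GetEntries(); ++i)
      t->GetEntry(i);                       // reads all active branches for this entry
   sw.Stop();
   printf("%s: CPU time = %.0f s, real time = %.0f s\n", tname, sw.CpuTime(), sw.RealTime());
   f->Close();
}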

Implications
- User analysis becomes simpler, working mostly with basic types (besides the event)
- Simplified access to data, greatly reducing the number of (virtual) calls
- ROOT-only analysis
- Backward incompatible, but migration from the old to the new format is possible
  - Sacrificing performance as a first step (built-in transient event converter)
- Much better vectorizable track loops (sketch below)
- This approach is now considered for Run3 and even Run2
  - But it would not improve the performance of I/O-bound jobs
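The "better vectorizable track loops" point comes from per-track quantities living in plain contiguous arrays; a sketch with illustrative (not actual) member names:

#include <cmath>

double SumPt(int ntracks, const float *px, const float *py)
{
   double sum = 0.;
   // Contiguous arrays of basic types, no virtual calls: the compiler can
   // auto-vectorize this loop.
   for (int i = 0; i < ntracks; ++i)
      sum += std::sqrt(px[i] * px[i] + py[i] * py[i]);
   return sum;
}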

Analysis trains
- Centrally organized user analysis
  - By Physics Working Groups
- Individual user tasks committed to AliRoot
  - Daily tags, available online each morning
- Users add their wagons to the next departing train
- Activity steered by the PWG conveners
- Job submission handled by the central framework

Trains' status
- 12% of the entire Grid activity, vs. 6% for all individual users' jobs
- ~700 trains / month
- ~8 wagons / train
- ~2/3 run on AOD, 1/3 on ESD
- 13h turnaround time

Summary
- ALICE federates all SEs and all CPUs
  - No dedicated roles (except for the tapes)
  - Network topology-aware data distribution
  - Jobs go to the data
  - Remote copies as a fallback
- Complex data formats
  - Deep object hierarchy
  - Large event sizes
- Two main directions considered for improvement
  - Flattening the object structure to reduce deserialization time
  - Increasing the CPU/event ratio