The ALICE Analysis -- News from the battlefield
Federico Carminati for the ALICE Computing Project
CHEP 2010 – Taiwan

The problem
- Data management is the major problem for our Grid infrastructure
  – Both because it is intrinsically difficult and because it took a long time to tackle it
- When data access is predictable, the problem is manageable
  – Simulation, calibration, reconstruction
- When it is not, we have a problem
  – Analysis
- Main problems
  – Data location may not follow the data access pattern
  – Multiplication of data formats
October 16,

Data location
- Send the job to the data
- Be ready when the data is not where you expected it

Sending jobs to data
[Diagram: the user submits a job with a list of LFNs (lfn1…lfn4) to the ALICE Job Catalogue; the Optimizer looks each LFN up in the ALICE File Catalogue (lfn → guid → {SEs}) and splits the job into subjobs grouped by the sites holding the data; Computing Agents at the sites fetch the matching subjobs, send back the results, and the outputs are registered in the catalogue.]
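The splitting step on this slide can be sketched in a few lines. This is an illustrative Python sketch, not the actual AliEn Optimizer code: `split_by_location` and the toy `catalogue` dictionary (standing in for the ALICE File Catalogue) are assumptions for the example.

```python
# Illustrative sketch: split a job's input LFNs into subjobs, grouped by
# the site whose storage element holds each file (first replica wins).
def split_by_location(lfns, catalogue):
    """Return {site: [lfns]} so each subjob runs next to its data."""
    subjobs = {}
    for lfn in lfns:
        for site in catalogue[lfn]:   # sites holding a replica of this LFN
            subjobs.setdefault(site, []).append(lfn)
            break                     # one site per LFN is enough here
    return subjobs

# Toy stand-in for the file catalogue's lfn -> {SEs} mapping.
catalogue = {
    "lfn1": ["SiteA"], "lfn2": ["SiteB"],
    "lfn3": ["SiteA"], "lfn4": ["SiteB"],
}
subjobs = split_by_location(["lfn1", "lfn2", "lfn3", "lfn4"], catalogue)
```

Each entry of `subjobs` then corresponds to one subjob (Job 1.1, Job 1.2, …) that a site's Computing Agent can fetch.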

LPM chains logic
- Reco (1 job/chunk) → QA (1 job/chunk) → QA merging → delete partial output
- Merge ROOT tags
- AOD (1 job/chunk) → AOD merging → delete partial output
- Error jobs are resubmitted
- When complete, Analysis ESD/AOD starts in parallel
- The same mechanism is also used for Monte Carlo productions and analysis trains on MC and RAW data
- Integrated with MonALISA
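The chain logic above (run a stage over all chunks, resubmit errors, then merge and clean up) can be sketched as follows. This is a hedged illustration only: the stage names, `run_chain` and the `flaky_once` failure model are invented for the example and do not reflect the real LPM implementation.

```python
# Sketch of the chain logic: per-chunk stages with resubmission of
# failed jobs, followed by merging and deletion of partial output.
def run_chain(chunks, stages, run_job):
    done = []
    for stage in stages:
        pending = list(chunks)
        while pending:                       # resubmit error jobs until done
            pending = [c for c in pending if not run_job(stage, c)]
        done.append(stage)
    done.append("merge")                     # merging after per-chunk stages
    done.append("delete partial output")
    return done

# Toy failure model: QA on chunk2 fails once, then succeeds on resubmit.
failures = {("QA", "chunk2")}
def flaky_once(stage, chunk):
    if (stage, chunk) in failures:
        failures.discard((stage, chunk))
        return False
    return True

log = run_chain(["chunk1", "chunk2"], ["Reco", "QA"], flaky_once)
```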

The access to the data
- Direct access to data via the TAliEn/TGrid interface
- The application asks the ALICE File Catalogue for a file by GUID, LFN or metadata: lfn → guid → (acl, size, md5)
- The catalogue builds the pfn, determines who has it, and returns SE & pfn & access envelope
- The data is then read via xrootd
- A Tag catalogue maps events (ev# + guid) to tags (tag1, tag2, tag3, …)
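The lookup chain on this slide (lfn → guid → metadata and replicas → pfn) can be sketched as below. This is a minimal illustration under stated assumptions: the dictionary layout, the `resolve` helper and the pfn format are invented stand-ins, not the real AliEn catalogue schema or envelope.

```python
# Sketch of the catalogue resolution chain: lfn -> guid -> (md, SEs) -> pfn.
catalogue = {"lfn:/alice/data/run1/AliESDs.root": "guid-0001"}
guid_index = {"guid-0001": {"size": 1200, "md5": "ab12", "ses": ["SE_A", "SE_B"]}}

def resolve(lfn):
    guid = catalogue[lfn]                     # lfn -> guid
    meta = guid_index[guid]                   # guid -> (size, md5, replicas)
    se = meta["ses"][0]                       # pick a storage element
    pfn = f"root://{se.lower()}//{guid}"      # build the physical file name
    return se, pfn, {"size": meta["size"], "md5": meta["md5"]}

se, pfn, env = resolve("lfn:/alice/data/run1/AliESDs.root")
```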

Complete “pull” model? Caching or catalogue registration
- Every centre (CERN, Centre A, … any other) runs an xrootd/cmsd cluster (all.role manager), with a global xrootd redirector acting as meta-manager (all.role meta manager)
- Missing a file? Ask the global meta-manager and get it from any other collaborating cluster
- Local clients work normally
- Remote access is only 2-3 times slower than local access!
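The pull model described above (local lookup first, global redirector on a miss, optional caching of the fetched file) can be sketched like this. The cluster names and the `open_file` helper are illustrative assumptions; the real mechanism is xrootd's cmsd redirection, not this Python.

```python
# Sketch of the pull model: local miss -> ask the global redirector,
# which finds the file in any collaborating cluster; cache it locally.
clusters = {
    "CERN":    {"file1"},
    "CentreA": {"file2"},
}

def open_file(local_site, name):
    if name in clusters[local_site]:
        return local_site, "local"
    for site, files in clusters.items():      # the global redirector's view
        if name in files:
            clusters[local_site].add(name)    # caching on fetch (optional)
            return site, "remote"
    raise FileNotFoundError(name)

src, kind = open_file("CERN", "file2")
```

After the first remote fetch the file is served locally, which is the "caching or catalogue registration" choice in the slide title.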

Opportunistic storage discovery
- A client-to-storage metric allows the automatic discovery of the closest working storage elements from every job
- Based on MonALISA; includes topology information, continuous functional tests and SE occupancy status
[Map: storage elements in France, Italy, the Nordic Countries, Russia and the USA]
October 16,
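The selection rule on this slide can be sketched as a ranking: exclude SEs that fail the functional test or are nearly full, then take the one with the best metric. The scores, field names and the 95% occupancy threshold below are invented for the example; the real metric comes from MonALISA monitoring.

```python
# Sketch: pick the closest *working* storage element for a job.
def closest_working_se(ses):
    ok = [s for s in ses if s["functional"] and s["occupancy"] < 0.95]
    return min(ok, key=lambda s: s["distance"])["name"]

ses = [
    {"name": "SE_FR", "distance": 12, "functional": True,  "occupancy": 0.50},
    {"name": "SE_IT", "distance": 5,  "functional": False, "occupancy": 0.40},  # down
    {"name": "SE_US", "distance": 80, "functional": True,  "occupancy": 0.30},
    {"name": "SE_RU", "distance": 7,  "functional": True,  "occupancy": 0.97},  # full
]
best = closest_working_se(ses)
```

Note that the nominally closest SEs lose out here because one fails the functional test and the other is full, which is exactly why the metric combines topology with live status.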

Multiple data formats
- Use one format
- But make it flexible enough to adapt to user needs

Analysis train (example of AOD production)
- AOD production will be organized in a ‘train’ of tasks
  – To maximize efficiency of full dataset processing
  – To optimize CPU/IO
  – Using the analysis framework
[Diagram: ESD and Kine inputs feed TASK 1 … TASK 4 (e.g. efficiency correction), producing the AOD]

Analysis Framework
- Basics: make use of the main ROOT event loop, initiated when processing a TChain of files (e.g. AliESDs.root) with a TSelector; the current file and current event are tracked (AliAnalysisSelector)
- Our framework uses the TSelector technology, which defines three analysis stages: initialization (Begin), event processing (Process) and termination (Terminate); a change of the current file is notified (Notify)
- The framework is steered by a manager class (AliAnalysisManager) that gets called in the different selector stages…
- AliVEventHandler defines a common interface to access input data files (MC kine, TrackRefs, ESD, AOD via AliVEvent) and to write AOD outputs
- … and AliAnalysisTask (0…n) is the interface for the user analysis, following the selector analysis stages
- A train of such analysis tasks shares the main event loop AND the input data while still hot in memory
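The manager/task pattern on this slide can be sketched compactly. This is a Python sketch of the idea only, not the actual C++ AliAnalysisManager/AliAnalysisTask classes: every task sees each event exactly once, inside a single shared event loop, mirroring the Begin/Process/Terminate stages.

```python
# Sketch of the manager/task pattern: N tasks share one event loop.
class Task:
    def __init__(self, name):
        self.name, self.seen = name, 0
    def begin(self):                 # initialization stage
        self.seen = 0
    def process(self, event):        # called once per event
        self.seen += 1
    def terminate(self):             # termination stage
        return (self.name, self.seen)

class Manager:
    def __init__(self, tasks):
        self.tasks = tasks
    def run(self, chain):
        for t in self.tasks:
            t.begin()
        for event in chain:          # the ONE shared event loop
            for t in self.tasks:
                t.process(event)
        return [t.terminate() for t in self.tasks]

results = Manager([Task("spectra"), Task("flow")]).run(range(100))
```

The point of the design is visible in the loop: the input data is read once and is "still hot in memory" for every task in the train.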

General idea (RUM07)
[Diagram: AliAODEvent contains a TList holding a standard part – AliAODHeader, AliAODTrack, AliAODVertex, AliAODCluster, AliAODJet, … – and a user part (UserInfo) – AliAODUser1, AliAODUser2, …]
- Too simple to be true… use a list

Extending the AOD (RUM07)
- Start from the standard AOD (S-AOD) and call GetList()->Add(Obj) to attach user objects (U1-AOD)
- Each step can extend the previous one in turn: S-AOD + U1-AOD → U1’-AOD, then + U2-AOD, and so on
- You got the idea…
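The list-based extension mechanism can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `AODEvent` class and its entries mimic, but are not, the real AliAODEvent/TList classes; the point is that users append objects without changing the standard schema.

```python
# Sketch of the extendable AOD: a standard part plus user-added objects,
# all living in one list attached to the event.
class AODEvent:
    def __init__(self):
        self.objects = [("header", {}), ("tracks", [])]   # standard part
    def get_list(self):                                   # ~ GetList()
        return self.objects
    def find(self, name):
        return [obj for n, obj in self.objects if n == name]

ev = AODEvent()
ev.get_list().append(("user1_jets", ["jet_a", "jet_b"]))  # U1 extension
ev.get_list().append(("user2_pid", {"pions": 42}))        # U2 extension
```

Because extensions are plain list entries, a second user can extend an already-extended AOD without knowing about the first user's objects.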

Get users into the system
- Convince the users that the system is usable

Grid load
[Plot: Grid load over time]

Grid usage – chaotic
- 227 users, ~5% of Grid resources
- 6 T1s, 67 T2s, 50/50 contribution

CAF
[Diagram: a PROOF master in front of worker nodes each running PROOF and XROOTD, fed from the T0 xrootd and the file catalogue (lfn → guid → {SEs})]
- The whole CAF becomes an xrootd cluster
- Powerful and fast machinery – very popular with users
- Allows for any use pattern, however quite often leading to contention for resources
- Can load data directly from the Grid
- CAF - CERN AF: 208 workers (26 x 8 cores), 80 TB space, 24 GB RAM on all machines
- SKAF - Slovak Kosice AF: 60 workers (15 x 4 cores), 53 TB space, 24 GB RAM on the master, 8 GB on the workers

Prompt analysis - CAF
- Data volume – over 1 PB
- User load – over 50 users on average

Conclusions
- Analysis on the Grid “just works”
  – In spite of all our fears, people are routinely doing analysis on the Grid
- This does not mean that things work well
  – A lot of optimisation work could and should be done
- Again, the commonality between experiments is large and could be exploited better
