L'analisi per l'esperimento ALICE (Analysis for the ALICE experiment)
Andrea Dainese, INFN Padova
CCR e INFN-GRID Workshop, Palau, 13.05.2009

An active analysis user of the experiment describes, step by step, how he performs the analysis, and explains which problems he encountered, which have been solved and which have not, along with the difficulties and strengths of his approach.

Outline
- Introduction to the presented analysis: heavy flavour vertexing (charm reconstruction)
  - analysis goal and strategy
  - input data
- The ALICE analysis framework
  - analysis manager and tasks
  - running modes, access to data and computing resources
- Analysis at work, step by step
- Conclusions

Heavy flavour vertexing analysis
- Goal: measure production cross sections for charm particles, using hadronic decay channels:
  D0 → K−π+, D0 → K−π+π+π−, D+ → K−π+π+, Ds+ → K+K−π+, Λc+ → pK−π+, D*+ → D0π+
- Detection strategy: reconstruct secondary decay vertices (for pairs, triplets and quadruplets of tracks) and use topological cuts to select vertices separated from the primary interaction vertex (cτ ~ 100–300 μm)
- Input data: reconstructed tracks from ESD or AOD
- This analysis is carried out jointly by Padova, Torino, Utrecht and Heidelberg, within the ALICE Heavy Flavour Physics Working Group
Figure: sketch of a D0 → K−π+ decay
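
The detection strategy above (keep only secondary vertices that are displaced from the primary vertex) can be illustrated with a small Python sketch. This is a toy model, not the AliRoot vertexing code: the function names, the choice of the two variables (decay length and pointing angle) and the cut values are assumptions made for illustration.

```python
import math

def decay_length(primary, secondary):
    """Distance between primary and secondary vertex, both given as (x, y, z) in cm."""
    return math.dist(primary, secondary)

def cos_pointing_angle(primary, secondary, momentum):
    """Cosine of the angle between the flight line and the candidate momentum."""
    flight = [s - p for p, s in zip(primary, secondary)]
    norm = math.hypot(*flight) * math.hypot(*momentum)
    if norm == 0:
        return 0.0
    return sum(f * m for f, m in zip(flight, momentum)) / norm

def passes_topological_cuts(primary, secondary, momentum,
                            min_decay_length_cm=0.01,   # hypothetical cut: 100 micrometres
                            min_cos_pointing=0.9):      # hypothetical cut
    """True if the candidate vertex is displaced and its momentum points back to the primary vertex."""
    return (decay_length(primary, secondary) >= min_decay_length_cm
            and cos_pointing_angle(primary, secondary, momentum) >= min_cos_pointing)

primary_vertex = (0.0, 0.0, 0.0)
displaced = passes_topological_cuts(primary_vertex, (0.02, 0.0, 0.01), (2.0, 0.0, 1.0))
too_close = passes_topological_cuts(primary_vertex, (0.001, 0.0, 0.0), (2.0, 0.0, 1.0))
```

Here `displaced` is accepted and `too_close` is rejected, because its decay length is below the assumed 100 μm threshold.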

Analysis strategy
1. Charm candidates “production” (vertexing)
2. Raw signal extraction
3. Corrections (efficiencies, acceptance)

Analysis strategy
On the data (now a MC sample that “plays the data”):
  charm candidates “production” (vertexing) → raw signal extraction → correct the data → compare to MC input
On the MC:
  charm candidates “production” (vertexing) → raw signal extraction → correction maps

Analysis input and output data
- ALICE offline framework: AliRoot, based on ROOT
- ESD (Event Summary Data): output of reconstruction, can be input for analysis
- AOD (Analysis Object Data): input / output for analysis
Data flow: ESD or AOD → charm candidates “production” (all channels in one go) → AOD + charm candidates → raw signal extraction → histograms
✔ Using AOD or ESD as input with the same code allows more flexibility and cross-checks
✔ Producing all candidates in one go optimises resource usage (CPU, storage)

Analysis input and output data
Data flow: ESD or AOD → charm candidates “production” (all channels in one go) → raw signal extraction
File layout:
- AliAOD.root (standard AOD): ROOT tree holding the AliAODEvent, with tracks (AliAODTrack ...), jets (AliAODJet ...), ...
- AliAOD.VertexingHF.root: friend ROOT tree in a separate file, with branches verticesHF (AliAODVertex), D0toKpi (AliAODRecoDecay ...), Dstar (AliAODRecoDecay ...), ...
✔ Writing charm candidates to a separate file makes it possible to regenerate this file if needed, without touching the standard AOD
✗ Possible “synchronization” problem: AliAOD.root and AliAOD.VertexingHF.root have to be analysed together
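
Since the two files must always be analysed together, a consistency check on a file list is a natural first safeguard. The Python sketch below is one possible way to do it, assuming that paired files sit in the same directory; the function name and the example paths are hypothetical, and no AliEn call is involved.

```python
from pathlib import PurePosixPath

STANDARD = "AliAOD.root"
FRIEND = "AliAOD.VertexingHF.root"

def find_unpaired(paths):
    """Return (dirs missing the friend file, dirs missing the standard AOD)."""
    names_by_dir = {}
    for p in paths:
        path = PurePosixPath(p)
        names_by_dir.setdefault(str(path.parent), set()).add(path.name)
    missing_friend = sorted(d for d, names in names_by_dir.items()
                            if STANDARD in names and FRIEND not in names)
    missing_standard = sorted(d for d, names in names_by_dir.items()
                              if FRIEND in names and STANDARD not in names)
    return missing_friend, missing_standard

# Hypothetical catalogue listing, for illustration only.
listing = [
    "/alice/example/001/AliAOD.root",
    "/alice/example/001/AliAOD.VertexingHF.root",
    "/alice/example/002/AliAOD.root",
    "/alice/example/003/AliAOD.VertexingHF.root",
]
missing_friend, missing_standard = find_unpaired(listing)
```

With this listing, directory `002` lacks the friend file and directory `003` lacks the standard AOD, so both would be excluded (or re-produced) before building the input chain.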

ALICE Analysis Framework
ALICE has provided an analysis framework that sits between the user analysis algorithms and the existing back-ends. It:
- provides common access to data and CPU for a “train” of analysis tasks
- optimizes CPU/IO usage and makes results reproducible
- hides the complexity of the GRID and PROOF systems and balances usage of distributed resources
Diagram (ANALYSIS TRAIN): inputs ESD (or AOD) and Monte Carlo truth → TASK 1, TASK 2, TASK ..., TASK N → output AOD: standard AOD, MC truth (for simulated data), charm candidates, ...
Source: A. Gheata (CHEP09)

ALICE Analysis Framework: Manager and Tasks
- Data-oriented model composed of independent tasks, which
  - define the output type, e.g. N histograms to a ROOT file (via containers)
  - implement single-event analysis
- Tasks are owned by a manager class, which
  - steers the event loop and provides each event to each task
  - hides code that depends on the computing scheme (same approach for LOCAL, PROOF and GRID modes)
- Functionality is provided for single- and multi-event analysis
Class diagram (labels):
- Manager: AliAnalysisManager
- Event handlers: AliVEventHandler, AliESDInputHandler, AliAODInputHandler, AliMCEventHandler, AliAODHandler (output)
- Data: AliVEvent, AliESDEvent, AliAODEvent, AliMCEvent, AliVParticle, AliESDtrack, AliAODTrack, AliMCParticle
- Tasks: AliAnalysisTask, AliAnalysisTaskSE, UserANALYSISTask
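
The manager/task split can be sketched in a few lines of Python. This is a toy model of the pattern only (one event loop, every event handed to every task), not the AliAnalysisManager API: the class names, the `track_pt` event layout and the threshold are assumptions.

```python
class Task:
    """A task sees one event at a time and reports a result at the end."""
    def __init__(self, name):
        self.name = name

    def process(self, event):
        raise NotImplementedError

    def terminate(self):
        raise NotImplementedError

class TrackCounter(Task):
    """Counts all tracks in the sample."""
    def __init__(self):
        super().__init__("n_tracks")
        self.total = 0

    def process(self, event):
        self.total += len(event["track_pt"])

    def terminate(self):
        return self.total

class HighPtEventCounter(Task):
    """Counts events containing at least one track above a pT threshold."""
    def __init__(self, min_pt):
        super().__init__("n_high_pt_events")
        self.min_pt = min_pt
        self.count = 0

    def process(self, event):
        if any(pt > self.min_pt for pt in event["track_pt"]):
            self.count += 1

    def terminate(self):
        return self.count

class Manager:
    """Owns the tasks and steers a single event loop shared by all of them."""
    def __init__(self):
        self.tasks = []

    def add_task(self, task):
        self.tasks.append(task)

    def start_analysis(self, events):
        for event in events:            # each event is read once ...
            for task in self.tasks:     # ... and provided to every task
                task.process(event)
        return {task.name: task.terminate() for task in self.tasks}

events = [{"track_pt": [0.4, 1.2, 3.5]}, {"track_pt": [0.7]}, {"track_pt": []}]
manager = Manager()
manager.add_task(TrackCounter())
manager.add_task(HighPtEventCounter(min_pt=2.0))
results = manager.start_analysis(events)
```

The point of the design is that the input is read once for the whole train, so adding a task costs CPU but no extra I/O.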

A “transparent” approach
The user can do everything with ROOT: AliAnalysisManager and AliEn provide access to computing and storage (via xrootd) resources.
LOCAL (my machine): MyAnalysis.C → StartAnalysis("local") → MyResults.root
Input: local files, or files from any grid SE, via xrootd

A “transparent” approach
GRID (my machine + AliEn setup): MyAnalysis.C → create and configure the grid plugin → StartAnalysis("grid") → MyResults.root
Without the plugin, the user has to: create dataset(s), write a fully customized JDL, write executable and validation scripts, copy all dependency files to the AliEn FC, handle merging, ...
The AliEn plugin for the ALICE analysis framework was developed to:
→ keep the user in ROOT
→ generate all needed files, submit the job and collect the results
→ do everything via the AliEn API, using the ROOT TGrid interface

A “transparent” approach
PROOF (Parallel ROOT Facility; my machine + PROOF setup):
  MyAnalysis.C
  gProof->UploadPackage("pack.par")
  gProof->EnablePackage("pack")
  ....
  StartAnalysis("proof") → MyResults.root
- The same local analysis can be run in PROOF with minor changes
- Fast response
- Limited-size datasets (~few 100 GB)

Analysis step by step
On the “data” (MC sample of 10^8 pp events):
  charm candidates “production” (vertexing) → raw signal extraction → correct the data → compare to MC input
On the “MC” (5×10^6 pp events with charm):
  charm candidates “production” (vertexing) → raw signal extraction → correction maps

Analysis step by step: 1) Production of charm candidates
Status label on the workflow diagram: in progress
This is an organized analysis.
Analysis Train (ALICE-wide):
- 1st wagon: create AliAOD.root from ESD → write MC truth to AOD
  ✔ allows comparing MC level and reco level while reading only one file → reduces the number of parallel accesses to the SE
- ...
- i-th wagon: create AliAOD.VertexingHF.root from AOD
- ...
Input: ~100 ESD files; output: 1 AOD file (~10000 pp events); ~1 TB for 10^8 events
Run on the grid (after validation on a small PROOF dataset):
- ESDs are spread over several SEs (mainly T1) → use the corresponding CE
- AODs are written in a few replicas (typically 3) to selected SEs (T2)
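
The production figures quoted on this slide imply a few derived numbers, worked out in the Python snippet below. Two assumptions are made that the slide does not state explicitly: that the ~1 TB refers to the total AOD output for the 10^8 events, and that 1 TB is taken as 10^12 bytes. All values are order-of-magnitude estimates.

```python
n_events = 10**8                # "data" sample size quoted on the slide
events_per_aod_file = 10_000    # ~10000 pp events per AOD file
esd_files_per_aod_file = 100    # ~100 ESD files merged into one AOD file
total_aod_bytes = 10**12        # ~1 TB, assumed to be the total AOD size

n_aod_files = n_events // events_per_aod_file                     # AOD files in the sample
n_esd_files = n_aod_files * esd_files_per_aod_file                # ESD files to be read
events_per_esd_file = events_per_aod_file // esd_files_per_aod_file
bytes_per_event = total_aod_bytes / n_events                      # average AOD size per event
aod_file_size_mb = bytes_per_event * events_per_aod_file / 1e6    # average AOD file size
```

Under these assumptions the sample corresponds to about 10^4 AOD files of roughly 100 MB each (about 10 kB per event), produced from about 10^6 ESD files of roughly 100 events each.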

Analysis step by step: 2) Analysis of candidates from AOD
This is a semi-chaotic analysis:
- in principle, end-user analysis on AOD
- PWGs try to give some organisation: a common analysis train collecting analysis tasks from different users
  → D0 → K−π+ selection, like-sign background analysis, D* analysis, ...
Analysis train submitted to the grid using the AliEn plugin:
- one or more CEs can be selected in the JDL; input data will be taken from the “close SE”
- use CE selection to get data from more stable SEs
✔ expert users can share the work of “taking care of the jobs”
✔ non-expert users can attach their task and get results “for free”
✗ input for analysis: pairs of files AliAOD.root + AliAOD.VertexingHF.root → synchronization issues (a problem if the second file is missing in some cases; the two files should be on the same SE, which may not be the case if one of the two is replicated after production)

Analysis step by step: 2) Analysis of candidates from AOD
Input data selection:
- the input event chain (ROOT trees from a set of files) is built from an XML collection with file URLs in the AliEn file catalogue
- the XML collection can be generated in two ways:
  → using a simple “find” command on the AliEn file catalogue inside aliensh
  → using a system of tags (ROOT files) that allow querying the event metadata (collision type and energy, trigger, etc.) and some event properties (number of tracks, presence of high-pT tracks or jets, etc.); in this case the XML collection is created directly from ROOT
Analysis software:
- possibility to use “par” files: instead of using one of the official AliRoot versions distributed on the Grid, use ROOT and compile at run time, on the WN, the needed AliRoot libraries from source files sent in a tar archive (par file) with the job input data
  ✔ unique possibility for testing new code developments on “the real thing”
  ✗ sometimes, problems with compilers on WNs...
  ✔ solution: deprecate (ab)use of par files for grid jobs, and rather provide a fast release cycle for analysis software → latest developments available on the grid with “tagged” official versions of AliRoot
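
To make the “chain built from an XML collection” step concrete, the Python sketch below extracts file URLs from a small collection. The XML layout used here (`event` elements containing `file` elements with `name` and `turl` attributes) is an assumption made for illustration and may differ from what AliEn actually writes; the URLs are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: the layout and the URLs are assumptions, not AliEn output.
EXAMPLE_COLLECTION = """
<alien>
  <collection name="example">
    <event name="1">
      <file name="AliAOD.root" turl="alien:///alice/example/001/AliAOD.root" />
      <file name="AliAOD.VertexingHF.root" turl="alien:///alice/example/001/AliAOD.VertexingHF.root" />
    </event>
    <event name="2">
      <file name="AliAOD.root" turl="alien:///alice/example/002/AliAOD.root" />
    </event>
  </collection>
</alien>
"""

def files_by_event(xml_text):
    """Map each event name to a {file name: URL} dictionary."""
    root = ET.fromstring(xml_text)
    return {event.get("name"): {f.get("name"): f.get("turl") for f in event.findall("file")}
            for event in root.iter("event")}

def chain_urls(xml_text, filename):
    """URLs of all files with the given name, in collection order."""
    return [files[filename] for files in files_by_event(xml_text).values()
            if filename in files]

aod_urls = chain_urls(EXAMPLE_COLLECTION, "AliAOD.root")
friend_urls = chain_urls(EXAMPLE_COLLECTION, "AliAOD.VertexingHF.root")
```

In this example the second entry has no friend file, so `aod_urls` and `friend_urls` differ in length, which is exactly the kind of mismatch that has to be caught before the two trees are analysed together.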

Analysis step by step: 3) Extraction of corrections (from MC)
- Similar to the previous step, but with access also to MC info
- Compute correction maps for the signal using the AliRoot Correction Framework
- The dataset is small (~100 GB) → explore also PROOF analysis on CAF
  ✔ fast response for limited statistics
  ✗ possible issue with CPU/disk quota/priority once real data arrive

Analysis step by step: 4) Correct the data
Figure: plots for D0 → K−π+ and D+ → K−π+π+ comparing “MC input” with “Reco corrected”
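
The correction step can be illustrated with a one-dimensional toy example in Python: an efficiency is computed per bin from the MC sample and then used to correct the raw yield. This is a simplified sketch, not the AliRoot Correction Framework; the function names, the binning and all numbers are invented for illustration.

```python
def efficiency_map(generated, reconstructed):
    """Per-bin efficiency (including acceptance) from MC: reconstructed / generated."""
    return [r / g if g > 0 else 0.0 for g, r in zip(generated, reconstructed)]

def correct_yield(raw, efficiency):
    """Per-bin corrected yield: raw / efficiency (bins with zero efficiency are set to 0)."""
    return [n / e if e > 0 else 0.0 for n, e in zip(raw, efficiency)]

# Invented numbers, three bins (e.g. in pT).
mc_generated = [1000, 800, 400]      # charm signal generated in the MC sample
mc_reconstructed = [50, 80, 60]      # signal surviving reconstruction and selection
raw_signal = [25, 40, 30]            # raw signal extracted from the "data" sample

efficiency = efficiency_map(mc_generated, mc_reconstructed)
corrected = correct_yield(raw_signal, efficiency)
```

With these inputs the efficiencies are 5%, 10% and 15%, and the corrected yields come out at about 500, 400 and 200. In a closure test such as the one on this slide, where a MC sample “plays the data”, the corrected yields are then compared with the MC input.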

Conclusions
- ALICE provides a user-friendly and versatile framework for analysis
  - regular tutorials at CERN (attended by >400 people in total) → a new student runs his/her code on the Grid within ~1 week of getting a certificate
  - documentation on the web
- Stable access to data is crucial
  - SE stability/reliability is steadily improving
- In my case, the input for analysis is pairs of files, AliAOD.root + AliAOD.VertexingHF.root
  - possible synchronization issues
  - at the moment I check this “by hand”
- Reporting and tracking of problems could be improved:
  - the user reports a problem to the ALICE-analysis mailing list
  - only a few people at CERN have the “mandate/expertise” to solve these problems → they investigate, then contact the site admin if needed
  - in some cases, the site response is not “prompt” (but we know that grid users are not patient...)
  - the user cannot follow the state of the investigation/debugging

EXTRA SLIDES

GRID analysis via plugin
Diagram (labels):
- CLIENT (laptop): MyAnalysis.C; Analysis Manager (AM) with task1, task2, task3, ..., taskN; AliEn grid plugin configured with SetGridDataDir(), AddRunNumber(), SetAdditionalLibs(), SetOutputFile(); AM->StartAnalysis("grid")
- Files shown with the plugin: MyAnalysis.jdl, MyAnalysis.root, AnalysisPlayer.C, Dataset.xml
- Submission through TAlien to the AliEn UI and file catalog
- On each WN, next to an SE: AM->StartAnalysis("local"), producing outputs
- Outputs are collected on the client; Terminate()
Source: A. Gheata (CHEP09)
