CCR e INFN-GRID Workshop, Palau, Andrea Dainese
Analysis for the ALICE experiment
Andrea Dainese, INFN Padova
An active analysis user of the experiment describes, step by step, how the analysis is done and explains which problems were encountered, which have been solved and which have not yet, and the difficulties and strengths of this way of working.
Outline
- Introduction to the presented analysis: heavy flavour vertexing (charm reconstruction)
  - analysis goal and strategy
  - input data
- The ALICE analysis framework
  - analysis manager and tasks
  - running modes, access to data and computing resources
- Analysis at work, step by step
- Conclusions
Heavy flavour vertexing analysis
- Goal: measure production cross sections for charm particles, using hadronic decay channels:
  D⁰ → K⁻π⁺, D⁰ → K⁻π⁺π⁺π⁻, D⁺ → K⁻π⁺π⁺, Ds⁺ → K⁺K⁻π⁺, Λc⁺ → pK⁻π⁺, D*⁺ → D⁰π⁺
- Detection strategy: reconstruct secondary decay vertices (for pairs, triplets and quadruplets of tracks) and use topological cuts to select vertices separated from the primary interaction vertex (cτ of order 100 μm)
- Input data: reconstructed tracks from ESD or AOD
- This analysis is done in collaboration by Padova, Torino, Utrecht and Heidelberg, within the ALICE Heavy Flavour Physics Working Group
[Figure: sketch of a D⁰ → K⁻π⁺ decay]
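The idea of a topological cut on the separation between secondary and primary vertex can be illustrated with a minimal sketch. The `Point3D` struct, the function names and the "n sigma" form of the cut are assumptions made for illustration only; they are not AliRoot classes and not the cuts used in the actual analysis.

```cpp
#include <cmath>

// Hypothetical minimal 3D point, coordinates in cm (not an AliRoot class).
struct Point3D {
    double x;
    double y;
    double z;
};

// Distance between the primary vertex and a candidate secondary vertex.
double DecayLength(const Point3D& primary, const Point3D& secondary) {
    const double dx = secondary.x - primary.x;
    const double dy = secondary.y - primary.y;
    const double dz = secondary.z - primary.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Illustrative topological cut: keep the candidate only if its decay length
// exceeds nSigma times the (assumed known) decay-length resolution.
bool PassesDisplacementCut(const Point3D& primary, const Point3D& secondary,
                           double resolutionCm, double nSigma) {
    return DecayLength(primary, secondary) > nSigma * resolutionCm;
}
```

With a resolution of 100 μm (0.01 cm) and nSigma = 3, a vertex displaced by 500 μm passes the cut, while one displaced by 50 μm is rejected.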
Analysis strategy
Charm candidates “production” (vertexing) → Raw signal extraction → Corrections (efficiencies, acceptance)
Analysis strategy
- On the data (now a MC sample that “plays the data”): charm candidates “production” (vertexing) → raw signal extraction → correct the data
- On the MC: charm candidates “production” (vertexing) → raw signal extraction → correction maps (efficiencies, acceptance)
- The correction maps are used to correct the data; the corrected result is compared to the MC input
Analysis input and output data
- ALICE offline framework: AliRoot, based on ROOT
- ESD (Event Summary Data): output of reconstruction, can be input for analysis
- AOD (Analysis Object Data): input / output for analysis
Data flow: ESD or AOD → charm candidates “production” (all channels in one go) → AOD + charm candidates → raw signal extraction → histograms
✔ Being able to use AOD or ESD as input with the same code gives more flexibility and allows cross-checks
✔ Producing all candidates in one go optimises resource usage (CPU, storage)
Analysis input and output data
Data flow: ESD or AOD → charm candidates “production” (all channels in one go) → raw signal extraction
File layout:
- AliAOD.root (standard AOD): ROOT tree with the AliAODEvent: tracks (AliAODTrack...), jets (AliAODJet...), ...
- AliAOD.VertexingHF.root (separate file): friend ROOT tree with verticesHF (AliAODVertex), D0toKpi (AliAODRecoDecay...), Dstar (AliAODRecoDecay...), ...
✔ Writing the charm candidates to a separate file makes it possible to regenerate that file, if needed, without touching the standard AOD
✗ Possible “synchronization” problem: AliAOD.root and AliAOD.VertexingHF.root have to be analysed together
ALICE Analysis Framework
ALICE has provided an analysis framework that sits between the user analysis algorithms and the existing back-ends:
- Provides common access to data and CPU for a “train” of analysis tasks
- Optimizes CPU/IO usage and makes results reproducible
- Hides the complexity of the GRID and PROOF systems and balances the usage of distributed resources
[Diagram, A. Gheata (CHEP09): ANALYSIS TRAIN. Input: ESD (or AOD) and Monte Carlo truth → TASK 1, TASK 2, TASK …, TASK N → output AOD: standard AOD, MC truth (for simulated data), charm candidates, ...]
ALICE Analysis Framework: Manager and Tasks
- Data-oriented model composed of independent tasks; each task
  - defines its output type, e.g. N histograms to a ROOT file (via containers)
  - implements the single-event analysis
- Tasks are owned by a manager class, which
  - steers the event loop and provides each event to each task
  - hides the code that depends on the computing scheme (same approach for LOCAL, PROOF and GRID modes)
- Functionality is provided for single- and multi-event analysis
[Class diagram: AliAnalysisManager; event handlers (AliVEventHandler: AliESDInputHandler, AliAODInputHandler, AliMCEventHandler, AliAODHandler for output); data (AliVEvent: AliESDEvent, AliAODEvent, AliMCEvent; AliVParticle: AliESDtrack, AliAODTrack, AliMCParticle); tasks (AliAnalysisTask, AliAnalysisTaskSE, user analysis task)]
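The manager/task pattern described on this slide can be sketched in plain C++. This is a toy model, not the AliRoot API: `ToyEvent`, `ToyTask`, `ToyManager` and the two example tasks are hypothetical names, and the real framework additionally handles input/output containers, event handlers and the LOCAL/PROOF/GRID back-ends.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event (the real framework uses AliVEvent).
struct ToyEvent {
    int nTracks;
};

// Interface of a single-event analysis task.
class ToyTask {
public:
    explicit ToyTask(std::string name) : fName(std::move(name)) {}
    virtual ~ToyTask() = default;
    virtual void ProcessEvent(const ToyEvent& event) = 0;
    const std::string& Name() const { return fName; }
private:
    std::string fName;
};

// Example wagon 1: sums the number of tracks over all events.
class CountTracksTask : public ToyTask {
public:
    CountTracksTask() : ToyTask("countTracks") {}
    void ProcessEvent(const ToyEvent& event) override { fTotal += event.nTracks; }
    long Total() const { return fTotal; }
private:
    long fTotal = 0;
};

// Example wagon 2: counts the events it has seen.
class CountEventsTask : public ToyTask {
public:
    CountEventsTask() : ToyTask("countEvents") {}
    void ProcessEvent(const ToyEvent&) override { ++fTotal; }
    long Total() const { return fTotal; }
private:
    long fTotal = 0;
};

// The manager owns the tasks and steers the event loop:
// every event is read once and handed to every task (the "train").
class ToyManager {
public:
    // Takes ownership; returns a non-owning pointer for reading results later.
    template <typename T>
    T* AddTask(std::unique_ptr<T> task) {
        T* raw = task.get();
        fTasks.push_back(std::move(task));
        return raw;
    }
    void Run(const std::vector<ToyEvent>& events) {
        for (const ToyEvent& event : events) {
            for (auto& task : fTasks) {
                task->ProcessEvent(event);
            }
        }
    }
private:
    std::vector<std::unique_ptr<ToyTask>> fTasks;
};
```

The point of the design is that each event is loaded once and shared by all tasks, so adding a task to the train adds CPU work but no extra I/O.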
A “transparent” approach
User can do everything with ROOT: AliAnalysisManager and AliEn provide access to computing and storage resources (via xrootd)
LOCAL (my machine): MyAnalysis.C → StartAnalysis("local") → MyResults.root
Input: local files, or files from any grid SE, via xrootd
A “transparent” approach: GRID
My machine + AliEn setup: MyAnalysis.C → create and configure the grid plugin → StartAnalysis("grid") → MyResults.root
On the GRID, without the plugin, the user has to: create dataset(s), write a fully customized JDL, write executable and validation scripts, copy all dependency files to the AliEn FC, handle merging, …
The AliEn plugin for the ALICE analysis framework was developed to:
- keep the user in ROOT
- generate all needed files, submit the job and collect the results
Everything is done via the AliEn API, using the ROOT TGrid interface
A “transparent” approach: PROOF
My machine + PROOF setup: MyAnalysis.C → gProof->UploadPackage("pack.par"), gProof->EnablePackage("pack"), ... → StartAnalysis("proof") → MyResults.root
PROOF (Parallel ROOT Facility):
- the same local analysis can be run in PROOF with minor changes
- fast response
- limited-size datasets (~few 100 GB)
Analysis step by step
- On the “data” (MC sample of 10^8 pp events): charm candidates “production” (vertexing) → raw signal extraction → correct the data
- On the “MC” (5×10^6 pp events with charm): charm candidates “production” (vertexing) → raw signal extraction → correction maps
- The correction maps are used to correct the data; the corrected result is compared to the MC input
Analysis step by step: 1) Production of charm candidates
[Analysis flow diagram repeated: “data” = MC sample of 10^8 pp events, “MC” = 5×10^6 pp events with charm; status label: in progress]
Analysis step by step: 1) Production of charm candidates
This is an organized analysis
- Analysis Train (ALICE-wide):
  - 1st wagon: create AliAOD.root from ESD, write MC truth to AOD
  - ...
  - i-th wagon: create AliAOD.VertexingHF.root from AOD
  - ...
- Input: ~100 ESD files; output: 1 AOD file (~10000 pp events); ~1 TB for 10^8 events
- Run on the grid (after validation on a small PROOF dataset)
- ESDs are spread over several SEs (mainly T1): use the corresponding CE
- AODs are written in a few replicas (typically 3) to selected SEs (T2)
✔ allows comparing MC level and reco level by reading only one file, which reduces the number of parallel accesses to the SE
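The numbers quoted on this slide imply a few derived quantities, worked out in the sketch below. Two assumptions are made that the slide does not state explicitly: that the ~1 TB refers to the total AOD output for the 10^8 events, and that 1 TB = 10^12 bytes. The struct and function names are hypothetical.

```cpp
#include <cstdint>

// Derived sizes of a production (all names are illustrative).
struct ProductionSize {
    std::int64_t nAodFiles;   // number of output AOD files
    std::int64_t nEsdFiles;   // number of input ESD files read
    double bytesPerEvent;     // average output size per event
    double bytesPerAodFile;   // average size of one AOD file
};

ProductionSize EstimateProduction(std::int64_t nEvents,
                                  std::int64_t eventsPerAodFile,
                                  std::int64_t esdFilesPerAodFile,
                                  double totalBytes) {
    ProductionSize size;
    // Round up: a partially filled last file still counts as a file.
    size.nAodFiles = (nEvents + eventsPerAodFile - 1) / eventsPerAodFile;
    size.nEsdFiles = size.nAodFiles * esdFilesPerAodFile;
    size.bytesPerEvent = totalBytes / static_cast<double>(nEvents);
    size.bytesPerAodFile = totalBytes / static_cast<double>(size.nAodFiles);
    return size;
}
```

For 10^8 events, ~10000 events per AOD file, ~100 ESD files per AOD file and 10^12 bytes in total, this gives about 10^4 AOD files, about 10^6 ESD files read, ~10 kB per event and ~100 MB per AOD file.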
Analysis step by step: 2) Analysis of candidates from AOD
[Analysis flow diagram repeated: “data” = MC sample of 10^8 pp events, “MC” = 5×10^6 pp events with charm]
Analysis step by step: 2) Analysis of candidates from AOD
This is a semi-chaotic analysis
- in principle, end-user analysis on AOD
- PWGs try to give some organisation: a common analysis train collecting analysis tasks from different users: D⁰ → K⁻π⁺ selection, like-sign background analysis, D* analysis, ...
  ✔ expert users can share the work of “taking care of the jobs”
  ✔ non-expert users can attach their task and get results “for free”
- Analysis train submitted to the grid using the AliEn plugin
  - one or more CEs can be selected in the JDL; input data will be taken from the “close SE”
  - use CE selection to get data from more stable SEs
✗ Input for analysis: pairs of files AliAOD.root + AliAOD.VertexingHF.root, with synchronization issues (it is a problem if the second file is missing in some cases; the two files should be on the same SE, which may not be the case if one of the two is replicated after production)
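The file-pair synchronization check could be automated along the following lines. This is only a sketch under stated assumptions: the input is a plain list of file paths (for example extracted from a collection), the function names and the example paths are hypothetical, and only the presence of both files in the same directory is checked, not whether they sit on the same SE.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Directory part of a path ("" if there is no '/').
std::string DirName(const std::string& path) {
    const std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? std::string() : path.substr(0, slash);
}

// File-name part of a path.
std::string BaseName(const std::string& path) {
    const std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Returns the (sorted) directories that contain only one of the two files
// AliAOD.root / AliAOD.VertexingHF.root, i.e. the unsynchronized pairs.
std::vector<std::string> FindUnpairedDirs(const std::vector<std::string>& paths) {
    std::set<std::string> withAod;
    std::set<std::string> withVertexingHF;
    for (const std::string& path : paths) {
        const std::string base = BaseName(path);
        if (base == "AliAOD.root") {
            withAod.insert(DirName(path));
        } else if (base == "AliAOD.VertexingHF.root") {
            withVertexingHF.insert(DirName(path));
        }
    }
    std::vector<std::string> unpaired;
    std::set_symmetric_difference(withAod.begin(), withAod.end(),
                                  withVertexingHF.begin(), withVertexingHF.end(),
                                  std::back_inserter(unpaired));
    return unpaired;
}
```

Directories returned by `FindUnpairedDirs` would then be excluded from the input chain, or flagged so that the missing file can be regenerated.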
Analysis step by step: 2) Analysis of candidates from AOD
Input data selection:
- the input event chain (ROOT trees from a set of files) is built from an XML collection with file URLs in the AliEn file catalogue
- the XML collection can be generated in two ways:
  - using a simple “find” command on the AliEn file catalogue, inside aliensh
  - using a system of tags (ROOT files) that allow querying the event metadata (collision type and energy, trigger, etc.) and some event properties (number of tracks, presence of high-pT tracks or jets, etc.); in this case the XML collection is created directly from ROOT
Analysis software:
- possibility to use “par” files: instead of using one of the official AliRoot versions distributed on the Grid, use ROOT and compile at run time, on the WN, the needed AliRoot libraries from source files sent in a tar archive (par file) with the job input data
  ✔ unique possibility for testing new code developments on “the real thing”
  ✗ sometimes, problems with compilers on the WNs...
  ✔ solution: deprecate the (ab)use of par files for grid jobs and rather provide a fast release cycle for the analysis software, so that the latest developments are available on the grid with “tagged” official versions of AliRoot
Analysis step by step: 3) Extraction of corrections (from MC)
- Similar to the previous step, but with access also to the MC information
- Compute correction maps for the signal using the AliRoot Correction Framework
- The dataset is small (~100 GB): explore also PROOF analysis on the CAF
  ✔ fast response for limited statistics
  ✗ possible issue with CPU/disk quota/priority when real data arrive
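The logic of a correction map can be illustrated with a bin-by-bin efficiency correction. This is a deliberately simplified sketch and not the AliRoot Correction Framework: the function names are hypothetical, a "map" is reduced to a one-dimensional vector of bins, and statistical uncertainties are ignored.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Efficiency per bin from MC: reconstructed / generated.
// Bins with no generated entries get efficiency 0.
std::vector<double> ComputeEfficiency(const std::vector<double>& generated,
                                      const std::vector<double>& reconstructed) {
    if (generated.size() != reconstructed.size()) {
        throw std::invalid_argument("generated and reconstructed binning differ");
    }
    std::vector<double> efficiency(generated.size(), 0.0);
    for (std::size_t i = 0; i < generated.size(); ++i) {
        if (generated[i] > 0.0) {
            efficiency[i] = reconstructed[i] / generated[i];
        }
    }
    return efficiency;
}

// Corrected yield per bin: raw / efficiency.
// Bins with zero efficiency cannot be corrected and are set to 0.
std::vector<double> CorrectYield(const std::vector<double>& raw,
                                 const std::vector<double>& efficiency) {
    if (raw.size() != efficiency.size()) {
        throw std::invalid_argument("raw yield and efficiency binning differ");
    }
    std::vector<double> corrected(raw.size(), 0.0);
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (efficiency[i] > 0.0) {
            corrected[i] = raw[i] / efficiency[i];
        }
    }
    return corrected;
}
```

Applying `CorrectYield` to the reconstructed MC yield itself returns the generated yield, which is the closure test behind the "compare to MC input" step of the strategy.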
Analysis step by step: 4) Correct the data
[Figure: panels for the D⁰ and D⁺ decay channels, comparing “MC input” with “Reco corrected”]
Conclusions
- ALICE provides a user-friendly and versatile framework for analysis
  - regular tutorials at CERN (attended by >400 people in total)
  - a new student runs his/her code on the Grid within ~1 week from getting a certificate
  - documentation on the web:
- Stable access to data is crucial
  - SE stability/reliability is steadily improving
  - in my case, the input for analysis is pairs of files, AliAOD.root + AliAOD.VertexingHF.root, with possible synchronization issues; at the moment I check this “by hand”
- Reporting and tracking of problems could be improved:
  - the user reports the problem to the ALICE-analysis mailing list
  - only a few people at CERN have the “mandate/expertise” to solve these problems: they investigate, then contact the site admin if needed
  - in some cases, the site response is not “prompt” (but we know that grid users are not patient...)
  - the user cannot follow the state of the investigation/debugging
EXTRA SLIDES
GRID analysis via plugin
[Diagram, A. Gheata (CHEP09):
- CLIENT (laptop): MyAnalysis.C with the Analysis Manager (task1, task2, task3, ..., taskN) and the AliEn grid plugin, configured with SetGridDataDir(), AddRunNumber(), SetAdditionalLibs(), SetOutputFile()
- AM->StartAnalysis("grid"): through TAlien, the files MyAnalysis.jdl, MyAnalysis.root, AnalysisPlayer.C and Dataset.xml (from the file catalog) are prepared and the job is submitted via the AliEn UI
- on each WN, an Analysis Manager (AM) reads from an SE and writes its outputs
- the outputs come back to the client: AM->StartAnalysis("local"), Terminate()]