INFN-GRID Workshop, Bari, October 26, 2004
"ALICE @ ARDA": Interactive Analysis on a GRID
P. Cerello, on behalf of the ALICE Offline Team
(Presented by Stefano Bagnasco)

Outline
- The ALICE Physics Data Challenge
- Summary of Phase I / II
- What? How? When?
- Summary & Outlook

The ALICE Physics Data Challenge
- Phase 1: production of underlying events using heavy-ion MC generators
  - Status: 100% complete (Mar-May 2004)
  - Basic statistics: ~1.3 million files, 26 TB data volume
- Phase 2: mixing of signal events into the underlying events
  - Status: 100% complete (Jun-Sep 2004)
- Phase 3: analysis of signal + underlying events
  - Goal: to test the data analysis model of ALICE
  - Status: to be started

Phase I & II layout: a “Meta-Grid” AliEn CE/SE Master job queue AliEn CE/SE Submission AliEn CE/SE Server LCG CE/SE AliEn CE LCG UI Catalogue LCG CE/SE LCG RB Catalogue LCG CE/SE INFN-GRID Workshop, Bari, October, 26, 2004 - 4

Interfacing AliEn and LCG
(Diagram: the AliEn server submits jobs to an interface site, where an AliEn CE acts as an LCG UI in front of the LCG RB; jobs run on the WNs of EDG CEs at LCG sites and report their status back. Output data are registered twice: in the AliEn Data Catalogue, with the LFN and PFN = GUID, and in the LCG Replica Catalogue, which maps the GUID to the physical file on the LCG SE. The interface site also runs an AliEn SE. A sketch of this dual registration follows.)
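
The dual registration can be illustrated with a short, self-contained Python sketch. The catalogue classes and the register_output helper below are hypothetical stand-ins, not the real AliEn or LCG RLS client APIs; only the LFN / GUID / PFN bookkeeping ("PFN = GUID") reflects the slide, and the file names are made up for the example.

from dataclasses import dataclass, field
from typing import Dict
import uuid


@dataclass
class AliEnCatalogue:
    """Hypothetical stand-in: maps a logical file name (LFN) to its PFN string."""
    entries: Dict[str, str] = field(default_factory=dict)

    def register(self, lfn: str, pfn: str) -> None:
        self.entries[lfn] = pfn


@dataclass
class LcgReplicaCatalogue:
    """Hypothetical stand-in: maps a GUID to the physical replica on an LCG SE."""
    replicas: Dict[str, str] = field(default_factory=dict)

    def register(self, guid: str, surl: str) -> None:
        self.replicas[guid] = surl


def register_output(lfn: str, surl: str,
                    alien: AliEnCatalogue, rls: LcgReplicaCatalogue) -> str:
    """Register one output file produced on an LCG worker node (illustrative only).

    The file sits on an LCG SE and is entered in the LCG Replica Catalogue
    under a GUID; the AliEn Data Catalogue then records that GUID as the
    file's PFN, which is the "PFN = GUID" link shown in the diagram.
    """
    guid = str(uuid.uuid4())   # GUID assigned at registration time
    rls.register(guid, surl)   # GUID -> physical replica on the LCG SE
    alien.register(lfn, guid)  # LFN -> GUID, stored as the AliEn PFN
    return guid


if __name__ == "__main__":
    alien, rls = AliEnCatalogue(), LcgReplicaCatalogue()
    register_output("/alice/pdc3/example/galice.root",           # made-up LFN
                    "srm://some-lcg-se.example.org/galice.root",  # made-up SURL
                    alien, rls)
    print(alien.entries)
    print(rls.replicas)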

Phase I - Simulation
- Signal-free events
- Small input with the interaction conditions
- Large distributed output (1 GB/event) with the simulated detector response
- Long execution time (10 hours/event)

Phase I results
- Number of jobs:
  - Central 1 (long, 12 hours): 20 K
  - Peripheral 1 (medium, 6 hours): 20 K
  - Peripheral 2 to 5 (short, 1 to 3 hours): 16 K
- Number of files:
  - AliEn file catalogue: 3.8 million (no degradation in performance observed)
  - CERN Castor: 1.3 million
  - We did not use the LCG Data Management tools, as they were not mature or not yet deployed when we started Phase 1
- File size:
  - Total: 26 TB
- Resources provided by 27 active production centres, 12 of them workhorses
  - Total: 285 MSI-2K hours
  - LCG: 67 MSI-2K hours (24%; could reach 50% at full resource exploitation)
  - Some exotic centres as proof of concept (e.g. in Pakistan), an Itanium farm in Houston

Phase II - Merging & Reconstruction
- Large distributed input (1 GB/event): signal-free events
- Signal events mixed into the underlying events
- Fairly large distributed output (100 MB/event, 7 MB files) with the reconstructed events

Phase II results
- Number of jobs: 180 K, 5 M events
  - Jobs running: 430 on average, 1150 at maximum
  - Average duration: 4.5 hours
  - Total CPU work: 780 MSI-2K hours
- Number of files:
  - AliEn file catalogue: 5.4 M (+3.8 M from Phase I, no degradation in performance observed)
  - About 0.5 M files, generated on LCG, are also registered in the LCG RLS
- File size:
  - Total: 5.5 TB, all on remote SEs
- Resources provided by 17 AliEn sites + 12 LCG sites
  - Total: 780 MSI-2K hours
  - LCG: 80 MSI-2K hours (10%)
- Thanks to LCG (P. Mendez in particular) for the excellent support

Phase III - (Interactive) Analysis
(Diagram: large distributed input (100 MB/event), signal-free events with mixed signal, is analysed on the GRID, steered from a server; the merged output is fairly small.)

Phase III - Design strategy
- Distributed input: 5.4 M files, on about 30 different SEs; do not move the input
- The algorithm is defined by a user on site S
- On site S, the input query is made to the Data Catalogue, based on the selected metadata
- The input files typically come from many SEs, so (see the sketch after this slide):
  - Split them into N subgroups, each defined by the files stored on a given SE
  - Split the task into N sub-tasks, to be run in parallel on the CEs associated with the SEs containing a fraction of the input files
  - Run the N sub-tasks in parallel
  - Merge the output on the user's site S
- How? From a ROOT shell on the user's site, hopefully interactively with PROOF
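
The split-by-SE step can be made concrete with a minimal Python sketch. The split_by_se and build_subtasks helpers, the (LFN, SE) query-result format and the file and SE names are all assumptions made for illustration; only the grouping of the input files per SE and the one-sub-task-per-SE idea come from the slide.

from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_by_se(query_result: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group the LFNs returned by the metadata query by the SE that stores them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for lfn, se_name in query_result:
        groups[se_name].append(lfn)
    return dict(groups)


def build_subtasks(groups: Dict[str, List[str]], macro: str) -> List[dict]:
    """One sub-task per SE: same analysis macro, different input list."""
    return [{"run_near_se": se, "inputs": lfns, "macro": macro}
            for se, lfns in groups.items()]


if __name__ == "__main__":
    # Toy result of a catalogue query: (LFN, SE) pairs with made-up names.
    query_result = [
        ("/alice/pdc3/evt001.root", "SE_Torino"),
        ("/alice/pdc3/evt002.root", "SE_CERN"),
        ("/alice/pdc3/evt003.root", "SE_Torino"),
    ]
    for subtask in build_subtasks(split_by_se(query_result), "analysis.C"):
        print(subtask)  # N sub-tasks, run in parallel; outputs merged at site S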

Phase III - Original Plan
(Diagram: the central servers run the master job submission, a Job Optimizer that splits the job into N sub-jobs, the RB, the file catalogue, process monitoring and control, and an SE. The user query selects LFNs via the metadata catalogue; the job splitter gets the PFNs from the file catalogue, where PFN = (LCG SE:) LCG LFN or PFN = AliEn PFN. Sub-jobs are dispatched, through the AliEn-LCG interface, to the CEs for processing; the input files are read from the local SEs holding the primary copies.)

Phase III - Execution Strategy
- The original plan is very labour intensive
- The status of the LCG DMS is not brilliant
- It does not "leverage" the (excellent!) work done in JRA-1 and ARDA
- So… why not do it with gLite?
- Advantages
  - Uniform configuration: gLite on EGEE/LCG-managed sites and on ALICE-managed sites
  - If we have to go that way, the sooner the better
  - AliEn is in any case "frozen", as all its developers are working on gLite
- Disadvantages
  - It may introduce a delay with respect to using the present, available AliEn/LCG configuration
  - But we believe it will pay off in the medium term

Phase III - Layout
(Diagram: a user query reaches the server and its catalogue; the input LFNs are distributed across gLite/A, gLite/E and gLite/L CE/SEs, each holding a subset of the files.)

Phase III - The Plan
- ALICE is ready to play the guinea pig for a large-scale deployment, i.e. on all ALICE resources and on as many existing LCG resources as possible
  - We have experience in deploying AliEn at most centres; we can redo the exercise with gLite
  - Even at most LCG centres we have a parallel AliEn installation
  - Many ALICE site managers are ready to try it, and we would need little help
- We need a gLite (beta) as soon as possible, by the beginning of November
- Installation and configuration of sites must be as simple as possible, i.e. must not require root access
- We expect LCG/EGEE to help us configure and maintain the ALICE gLite server, which runs the common services

Phase III - Steps to get started: Data Management Services
- Input data
  - Already registered in the AliEn & LCG Data Catalogues and stored on AliEn & LCG Storage Elements
  - Access the AliEn & LCG catalogues from gLite …
  - … or translate the AliEn & LCG catalogues into a gLite catalogue instance (a sketch of such a translation follows this slide)
  - The input data are scattered over about 30 sites: we need ALL of them to become gLite sites …
  - … or we may do some data movement
- Output data
  - Must be made available at the user's site for merging and, optionally, registration in the gLite Data Catalogue
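
A minimal sketch of the second option, translating the existing catalogues into a gLite catalogue instance, is given below. The translate_catalogues function and the dictionary-based catalogue snapshots are hypothetical, not real AliEn, LCG RLS or gLite client calls; only the LFN -> GUID -> replica mapping and the possible need for data movement reflect the slide.

from typing import Dict, List, Tuple


def translate_catalogues(
        alien_entries: Dict[str, str],       # LFN -> GUID (the "PFN = GUID" entries)
        rls_replicas: Dict[str, List[str]],  # GUID -> physical replicas on LCG SEs
) -> List[Tuple[str, str, List[str]]]:
    """Build the (LFN, GUID, replicas) records a gLite catalogue instance would need.

    Entries whose GUID has no replica in the LCG RLS are kept with an empty
    replica list, so they can be flagged for possible data movement.
    """
    records = []
    for lfn, guid in alien_entries.items():
        records.append((lfn, guid, rls_replicas.get(guid, [])))
    return records


if __name__ == "__main__":
    # Made-up snapshots of the two source catalogues.
    alien = {"/alice/pdc3/evt001.root": "guid-0001",
             "/alice/pdc3/evt002.root": "guid-0002"}
    rls = {"guid-0001": ["srm://se1.example.org/alice/evt001.root"]}
    for record in translate_catalogues(alien, rls):
        print(record)  # evt002.root has no LCG replica: candidate for data movement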

Phase III - Steps to get started: Workload Management Services
- No special requirements, as the job distribution is intrinsically defined by the input location, but:
  - We need the functionality to split a job into sub-jobs according to the input distribution -- gLite has it!
- We would like to use the same paradigm as in Phases I & II:
  - Be a "dumb" user
  - Test the system as a whole
  - Use (and therefore test) ALL of the available tools and modules