ATLAS Distributed Analysis: Distributed Analysis Jobs with the ATLAS Production System

S. González, D. Liko, A. Nairz, G. Mair, F. Orellana, L. Goossens - CERN, European Organization for Nuclear Research, 1211 Genève 23, Switzerland
Silvia Resconi - INFN, Istituto Nazionale di Fisica Nucleare, Sezione di Milano, Italy
Alessandro De Salvo - INFN, Istituto Nazionale di Fisica Nucleare, Sezione di Roma, Italy
International Conference on Computing in High Energy and Nuclear Physics (CHEP 2006), February 2006, Mumbai, India

Introduction
ATLAS is a detector for the study of high-energy proton-proton collisions. The offline computing will have to deal with an output event rate of 200 Hz, i.e. about 10^9 events per year, with an average event size of 1.6 MB.
Storage:
- Raw recording rate of 320 MB/s
- Accumulating 5-8 PB/year
- 20 PB of disk
- 10 PB of tape
Processing:
- 40,000 of today's fastest PCs
In 2002 ATLAS computing planned a first series of Data Challenges (DCs) in order to validate its:
- Computing Model
- Software
- Data Model
The ATLAS collaboration decided to perform the DCs using the Grid middleware developed in several Grid projects (Grid flavours):
- the LHC Computing Grid project (LCG), to which CERN is committed
- GRID3/OSG
- NorduGrid

ATLAS Production System
In order to handle the task of the ATLAS Data Challenges, an automated production system was designed. It consists of four components:
- the production database, which contains the abstract job definitions;
- the Eowyn supervisor, which reads job definitions from the production database and presents them to the different Grid executors in an easy-to-parse XML format;
- the executors, one for each Grid flavour, which receive the job definitions in XML format and convert them to the job description language of that particular Grid;
- Don Quijote II, the ATLAS Distributed Data Management System (DDM), which moves files from their temporary output locations to their final destination on a Storage Element and registers them in the Replica Location Service of that Grid.
[Figure: Production System architecture - production database, Eowyn supervisor, Grid executors (Lexor/Condor-G, Panda/OSG), Don Quijote II]
The ATLAS production system has been used successfully to run ATLAS production jobs at an unprecedented scale; on successful days many thousands of jobs were processed by the system. The experience obtained operating the system, which spans several Grid infrastructures, is considered essential also for performing analysis with Grid resources.

Setup for Distributed Analysis
The latest version of the Production System is used:
- Supervisor: Eowyn
- Executors: Condor-G and Lexor (currently under testing)
- Data Management: plain LCG tools with the LFC catalogue; DQ2 is currently being integrated
- Development database (devdb); a dedicated Distributed Analysis database is already in place
A generic analysis transformation has been created which:
- compiles the user code/package on the worker node,
- processes Analysis Object Data (AOD) input files,
- produces a histogram file and an n-tuple file as outputs.
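The transformation script itself is not part of the transcript; the sketch below only illustrates, under stated assumptions, the sequence of steps listed above: unpack and compile the user package on the worker node, run the job over the AOD input list, and leave the histogram and n-tuple files as declared outputs. The package name, tarball, job-options file, output names and the -c option used to pass the input list are hypothetical placeholders, not the actual ATLAS transformation interface.

```python
#!/usr/bin/env python
# Illustrative sketch of a generic analysis transformation
# (hypothetical names, not the actual ATLAS transformation script).
import subprocess
import sys

def run(cmd, **kwargs):
    """Run a shell command on the worker node and stop the job if it fails."""
    print("+ %s" % cmd)
    rc = subprocess.call(cmd, shell=True, **kwargs)
    if rc != 0:
        sys.exit(rc)

def main(user_package, job_options, aod_files):
    # 1. Compile the user code/package shipped with the job on the worker node
    #    (in reality this builds the package against the installed ATLAS release).
    run("tar xzf %s.tar.gz" % user_package)
    run("make", cwd=user_package)

    # 2. Process the AOD input files with the user's algorithm; the way the
    #    input list is passed here is only indicative.
    run("athena.py -c \"InputFiles=%r\" %s" % (aod_files, job_options))

    # 3. The histogram and n-tuple files produced by the job options are the
    #    declared outputs, later handled by the production system.
    print("outputs: histograms.root ntuple.root")

if __name__ == "__main__":
    main("UserAnalysis", "AnalysisJobOptions.py", sys.argv[1:])
```

In the real system a transformation of this kind is what the executor runs on the worker node, with the concrete parameters taken from the job definition.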
User Interface: AtCom4
The ATLAS Commander (AtCom) is used as the graphical user interface. It communicates directly with the Oracle production database and provides support for defining jobs and monitoring their status. It is currently used for task and job definitions:
- task: contains summary information about the jobs to be run (input/output datasets, transformation parameters, resource and environment requirements, etc.);
- job: the concrete parameters needed for running, but no Grid specifics, following the ProdDB schema and XML description.
VO-based user authentication in AtCom:
- the user certificate and proxy are used to enable database access;
- there is no proxy delegation, following the current ProdSys strategy (jobs still run under the IDs of a few production managers).
The two algorithms of choice were a ttbar reconstruction and a SUSY reconstruction algorithm provided by the physics group. The ttbar reconstruction algorithm is already part of the release and was used as installed on the Grid sites. For the SUSY algorithm the source code was shipped together with the job and the shared library was obtained by compilation on the worker node.

Experience running the Analysis
- The analysis was run with our own supervisor and Lexor/Condor-G instance.
- Delays due to data transfer are no longer an issue, because the AOD input is available on-site and jobs are sent only to those sites.
- The system setup is not yet able to support long (simulation) and short (analysis) queues in parallel: the queues fill up with simulation jobs, which leads to long pending times for the analysis jobs.
- The analysis was launched both for the ttbar reconstruction and for the SUSY case. Two datasets were used, with 50 events per file: 7000 files for the first dataset and 1000 for the second.
- 80 jobs (70 for the first dataset and 10 for the second), each with 100 input files, were defined with AtCom. These 80 jobs were submitted with Eowyn-Lexor/Condor-G and ran at several LCG sites. Each job produced three output files (n-tuple, histogram and log file), stored on Castor (CERN, IFIC-Valencia, Taiwan).
- With free resources, the system was able to process 10k-event jobs in about 10 minutes (total).
- The W->jj and Z reconstructed masses were produced through the ATLAS production system ("a la Grid") after merging the histogram files.
- ROOT was used to merge the histogram output files in a post-processing step.
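The poster only states that ROOT was used for the merging. One straightforward way to perform this post-processing step, shown as a sketch below, is to drive ROOT's hadd utility from a small script; the file-name pattern and output name are hypothetical.

```python
#!/usr/bin/env python
# Post-processing sketch: merge the per-job histogram files with ROOT's
# 'hadd' utility (file and directory names are hypothetical).
import glob
import subprocess
import sys

def merge_histograms(output="merged.histograms.root",
                     pattern="job_*.histograms.root"):
    inputs = sorted(glob.glob(pattern))
    if not inputs:
        sys.exit("no histogram files matching %r" % pattern)
    # hadd <target> <source1> <source2> ... sums histograms with the same
    # name across all input files into a single output file; -f overwrites
    # an existing target.
    rc = subprocess.call(["hadd", "-f", output] + inputs)
    if rc != 0:
        sys.exit("hadd failed with exit code %d" % rc)
    print("merged %d files into %s" % (len(inputs), output))

if __name__ == "__main__":
    merge_histograms()
```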