ATLAS Data Challenge on NorduGrid
CHEP2003 – UCSD
Anders Wäänänen

Slide 2: NorduGrid project
- Launched in spring of 2001 with the aim of creating a Grid infrastructure in the Nordic countries
- The idea was a MONARC-style architecture with a common Tier 1 centre
- Partners from Denmark, Norway, Sweden, and Finland
- Initially meant to be the Nordic branch of the EU DataGrid (EDG) project
- 3 full-time researchers plus a few externally funded people

Slide 3: Motivations
- NorduGrid was initially meant to be a pure deployment project
- One goal was to have the ATLAS Data Challenge running by May 2002
- Should be based on the Globus Toolkit™
- Available Grid middleware:
  - The Globus Toolkit™: a toolbox, not a complete solution
  - European DataGrid software: not mature for production at the beginning of 2002; architecture problems

Slide 4: A Job Submission Example
[Diagram of the EDG job submission chain: the User Interface (UI) sends a JDL job description and input "sandbox" through the Resource Broker and Job Submission Service to a Compute Element, consulting the Information Service, Replica Catalogue, Logging & Bookkeeping, and authorization/authentication services; the output "sandbox" and job status flow back to the user, with data held on a Storage Element.]

Slide 5: Architecture requirements
- No single point of failure
- Should be scalable
- Resource owners should have full control over their resources
- As few site requirements as possible:
  - Local cluster installation details should not be dictated (method, OS version, configuration, etc.)
  - Compute nodes should not be required to be on the public network
  - Clusters need not be dedicated to the Grid

Slide 6: User interface
The NorduGrid user interface provides a set of commands for interacting with the Grid:
- ngsub – submit jobs
- ngstat – show the states of jobs and clusters
- ngcat – see stdout/stderr of running jobs
- ngget – retrieve the results of finished jobs
- ngkill – kill running jobs
- ngclean – delete finished jobs from the system
- ngcopy – copy files to, from and between file servers and replica catalogs
- ngremove – delete files from file servers and replica catalogs
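The client tools above are plain command-line programs, so they are easy to drive from scripts. The sketch below is a hypothetical Python helper (not part of the NorduGrid toolkit) that only assembles argv lists for the ng* commands; the `-c` cluster option is an illustrative assumption.

```python
# Hypothetical convenience wrapper around the NorduGrid command-line client.
# The command names (ngsub, ngstat, ...) come from the slide; the wrapper
# itself and the option handling are illustrative only.

NG_TOOLS = {"ngsub", "ngstat", "ngcat", "ngget",
            "ngkill", "ngclean", "ngcopy", "ngremove"}

def build_ng_command(tool, *args, **options):
    """Return an argv list for one of the ng* client tools."""
    if tool not in NG_TOOLS:
        raise ValueError(f"unknown NorduGrid tool: {tool}")
    argv = [tool]
    for name, value in options.items():
        argv.append("-" + name)   # e.g. -c <cluster> (assumed option name)
        argv.append(str(value))
    argv.extend(args)
    return argv

# Example: submit an xRSL job description file
print(" ".join(build_ng_command("ngsub", "job.xrsl")))
```

Such a wrapper could then be handed to `subprocess.run` to drive bulk submissions from a production script.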

Slide 7: ATLAS Data Challenges
- A series of computing challenges within ATLAS of increasing size and complexity
- Preparing for data-taking and analysis at the LHC
- Thorough validation of the complete ATLAS software suite
- Introduction and use of Grid middleware as fast and as much as possible

Slide 8: Data Challenge 1
Main goals:
- Produce data for the High Level Trigger (HLT) and physics groups
  - Study the performance of the Athena framework and algorithms for use in the HLT
  - High statistics needed
- A few samples of up to 10^7 events in days, O(1000) CPUs
- Simulation & pile-up
- Reconstruction & analysis on a large scale
  - Learn about the data model and I/O performance; identify bottlenecks, etc.
- Data management
  - Use/evaluate persistency technology (AthenaRoot I/O)
  - Learn about distributed analysis
- Involvement of sites outside CERN
- Use of the Grid as and when possible and appropriate

Slide 9: DC1, phase 1: Task Flow
Example: one sample of di-jet events.
[Diagram: PYTHIA (Pythia6) event generation produces 1.5 × 10^7 events, split into partitions (read: ROOT files) of 5000 events each, written via Athena-Root I/O (HepMC). Detector simulation runs 20 jobs per partition through Atlsim/Geant3 + filter (~450 events per job out of 10^5 filtered), producing hits/digits and MC truth in ZEBRA output.]

Slide 10: DC1, phase 1: Summary
- July-August 2002
- 39 institutes in 18 countries
- 3200 CPUs, approx. 110 kSI95 CPU-days
- 5 × 10^7 events generated
- 1 × 10^7 events simulated
- 30 TB produced
- files of output

Slide 11: DC1, phase 1 for NorduGrid
- Simulation
- Datasets 2000 & 2003 (different event generation) assigned to NorduGrid
- Total number of fully simulated events: (1.15 × 10^7 input events)
- Total output size: 762 GB
- All files uploaded to a Storage Element (University of Oslo) and registered in the Replica Catalog

Slide 12: Job xRSL script

  &(executable="ds2000.sh")
   (arguments="1244")
   (stdout="dc simul hlt.pythia_jet_17.log")
   (join="yes")
   (inputfiles=("ds2000.sh" "
   (outputfiles=
     ("atlas zebra" "rc://dc1.uio.no/2000/log/dc simul hlt.pythia_jet_17.zebra")
     ("atlas his" "rc://dc1.uio.no/2000/log/dc simul hlt.pythia_jet_17.his")
     ("dc simul hlt.pythia_jet_17.log" "rc://dc1.uio.no/2000/log/dc simul hlt.pythia_jet_17.log")
     ("dc simul hlt.pythia_jet_17.AMI" "rc://dc1.uio.no/2000/log/dc simul hlt.pythia_jet_17.AMI")
     ("dc simul hlt.pythia_jet_17.MAG" "rc://dc1.uio.no/2000/log/dc simul hlt.pythia_jet_17.MAG"))
   (jobname="dc simul hlt.pythia_jet_17")
   (runtimeEnvironment="DC1-ATLAS")
   (replicacollection="ldap://grid.uio.no:389/lc=ATLAS,rc=NorduGrid,dc=nordugrid,dc=org")
   (maxCPUTime=2000)
   (maxDisk=1200)
   (notify="e
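For large productions, scripts like this were generated per partition rather than written by hand. A minimal Python sketch of such a generator is shown below; the helper function, the job name, and the output URL are invented for illustration, and only a few of the xRSL attributes from the slide are emitted.

```python
# Illustrative generator for a minimal xRSL job description.
# The '&(attr=value)(attr=value)...' syntax follows the slide;
# make_xrsl() and the example file names are hypothetical.

def make_xrsl(executable, arguments, jobname, outputs, max_cpu_minutes):
    """Build an xRSL string with a small subset of attributes."""
    clauses = [
        '(executable="{}")'.format(executable),
        '(arguments="{}")'.format(arguments),
        '(jobname="{}")'.format(jobname),
        '(runtimeEnvironment="DC1-ATLAS")',
        '(maxCPUTime={})'.format(max_cpu_minutes),
    ]
    # outputfiles maps a local file name to a replica-catalog destination URL
    out = "".join('("{}" "{}")'.format(name, url)
                  for name, url in outputs.items())
    clauses.append('(outputfiles={})'.format(out))
    return "&" + "".join(clauses)

xrsl = make_xrsl(
    "ds2000.sh", "1244", "example_dc1_job",
    {"atlas.zebra": "rc://dc1.uio.no/2000/zebra/example_dc1_job.zebra"},
    2000)
print(xrsl)
```

A production driver would write one such string per partition number and feed each file to ngsub.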

Slide 13: NorduGrid job submission
1. The user submits an xRSL file specifying the job options.
2. The xRSL file is processed by the User Interface.
3. The User Interface queries the NorduGrid Information System for resources and the NorduGrid Replica Catalog for the location of input files, and submits the job to the selected resource.
4. There the job is processed by the Grid Manager, which downloads or links files to the local session directory.
5. The Grid Manager submits the job to the local resource management system.
6. After the simulation finishes, the Grid Manager moves the requested output to Storage Elements and registers it in the NorduGrid Replica Catalog.
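The brokering in step 3 can be sketched as a simple matchmaking loop: filter the advertised clusters by the job's requirements, then rank the survivors. The cluster records and the ranking rule below are invented for illustration; the real matchmaking in the NorduGrid User Interface is considerably richer.

```python
# Rough sketch of the brokering step: pick a cluster that advertises the
# required runtime environment and has free capacity. All cluster data
# here is made up; real information comes from the MDS information system.

clusters = [
    {"name": "grid.uio.no",      "free_cpus": 0,  "runtime_envs": {"DC1-ATLAS"}},
    {"name": "grid.quark.lu.se", "free_cpus": 12, "runtime_envs": {"DC1-ATLAS"}},
    {"name": "grid.nbi.dk",      "free_cpus": 30, "runtime_envs": set()},
]

def select_cluster(clusters, required_env):
    """Return the name of the best matching cluster, or None."""
    candidates = [c for c in clusters
                  if required_env in c["runtime_envs"] and c["free_cpus"] > 0]
    if not candidates:
        return None
    # Toy ranking: prefer the cluster with the most free CPUs
    return max(candidates, key=lambda c: c["free_cpus"])["name"]

print(select_cluster(clusters, "DC1-ATLAS"))  # grid.quark.lu.se
```

Note how grid.uio.no is skipped despite having the runtime environment (no free CPUs) and grid.nbi.dk despite its capacity (no DC1-ATLAS installation), mirroring the two kinds of constraints in the slide.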

Slide 14: NorduGrid job submission
[Diagram of the submission flow: the client passes RSL to the Gatekeeper and Grid Manager, consulting the MDS information system and the Replica Catalog (RC); file transfers use GridFTP.]

Slide 15: NorduGrid Production sites

Slide 17: NorduGrid Pile-up
- DC1 pile-up: low-luminosity pile-up for the phase 1 events
- Number of jobs: 1300
  - dataset 2000: 300
  - dataset 2003: 1000
- Total output size: 1083 GB
  - dataset 2000: 463 GB
  - dataset 2003: 620 GB

Slide 18: Pile-up procedure
- Each job downloaded one ZEBRA file from dc1.uio.no of approximately:
  - 900 MB for dataset 2000
  - 400 MB for dataset 2003
- Locally present minimum-bias ZEBRA files were used to pile up events on top of the original simulated ones in the downloaded file. The output of each job was about 50% bigger than the original downloaded file, i.e.:
  - 1.5 GB for dataset 2000
  - 600 MB for dataset 2003
- Output files were uploaded to the dc1.uio.no and dc2.uio.no Storage Elements
- and registered in the Replica Catalog
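The size figures above can be cross-checked with the stated "about 50% bigger" rule. The tiny estimator below encodes that growth factor; the function name is ours, and the factor is only the slide's approximation (for dataset 2000, 900 MB × 1.5 = 1350 MB, which the slide rounds to ~1.5 GB).

```python
# Back-of-the-envelope check of the pile-up output sizes quoted above.
# The 1.5x growth factor comes from the slide's "about 50% bigger";
# the helper itself is illustrative.

PILEUP_GROWTH = 1.5  # output size ≈ input size + 50%

def estimated_output_mb(input_mb):
    """Estimate the pile-up output size for a given input ZEBRA file."""
    return input_mb * PILEUP_GROWTH

print(estimated_output_mb(900))  # dataset 2000: 1350 MB (slide quotes ~1.5 GB)
print(estimated_output_mb(400))  # dataset 2003: 600 MB
```

The dataset 2003 estimate (600 MB per file, ~1000 jobs) is also consistent with the ~620 GB total on the previous slide.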

Slide 19: Other details
- At peak production, up to 200 jobs were managed by NorduGrid at the same time
- Most Scandinavian production clusters participate (2 of them are in the Top 500)
- However, not all of them allow installation of the ATLAS software
- The ATLAS job manager Atlas Commander supports the NorduGrid toolkit
- Issues:
  - Replica Catalog scalability problems
  - MDS / OpenLDAP hangs – solved
  - Software threading problems – partly solved (problems partly in Globus libraries)

Slide 20: NorduGrid DC1 timeline
- April 5th, 2002: first ATLAS job submitted (Athena Hello World)
- May 10th, 2002: first pre-DC1 validation job submitted (ATLSIM test using ATLAS release 3.0.1)
- End of May 2002: clear by now that NorduGrid was mature enough to handle real production
- Spring 2003 (now): keep running Data Challenges and improve the toolkit

Slide 21: Quick client installation / job run
As a normal user (no system privileges required):
1. Retrieve nordugrid-standalone rh72.i386.tgz
2. tar xfz nordugrid-standalone rh72.i386.tgz
3. cd nordugrid-standalone
4. source ./setup.sh
5. Get a personal certificate: grid-cert-request
6. Install the certificate per the instructions
7. Get authorized on a cluster
8. Run a job:
   grid-proxy-init
   ngsub '&(executable=/bin/echo)(arguments="Hello World")'

Slide 22: Resources
- Documentation and source code are available for download
- Main Web site
- ATLAS DC1 with NorduGrid
- Software repository: ftp://ftp.nordugrid.org/pub/nordugrid/

Slide 23: The NorduGrid core group
- Aleksandr Konstantinov (Александр Константинов)
- Balázs Kónya
- Mattias Ellert
- Oxana Smirnova (Оксана Смирнова)
- Jakob Langgaard Nielsen
- Trond Myklebust
- Anders Wäänänen