ATLAS Activities at Tier-2s
Dario Barberis (CERN & Genoa University)
Tier-2 Workshop, 12-14 June 2006

Computing Model: central operations

Tier-0:
- Copy RAW data to Castor tape for archival.
- Copy RAW data to Tier-1s for storage and reprocessing.
- Run first-pass calibration/alignment (within 24 hours).
- Run first-pass reconstruction (within 48 hours).
- Distribute reconstruction output (ESDs, AODs and TAGs) to Tier-1s.

Tier-1s:
- Store and take care of a fraction of the RAW data.
- Run "slow" calibration/alignment procedures.
- Rerun reconstruction with better calibration/alignment and/or algorithms.
- Distribute reconstruction output to Tier-2s.
- Keep current versions of ESDs and AODs on disk for analysis.

Tier-2s:
- Run simulation.
- Run calibration/alignment procedures.
- Keep current versions of AODs on disk for analysis.
- Run user analysis jobs.

Computing Model and Resources

- The ATLAS Computing Model is still the same as in the Computing TDR (June 2005) and basically the same as in the Computing Model document (December 2004) submitted for the LHCC review in January 2005.
- The sum of the Tier-2s will provide ~40% of the total ATLAS computing and disk storage capacity:
  - CPUs for full simulation production and user analysis jobs (on average split 1:2 between central simulation and analysis jobs).
  - Disk for AODs, samples of ESDs and RAW data, and most importantly for selected event samples for physics analysis.
- We do not ask Tier-2s to run any particular service for ATLAS in addition to providing the Grid infrastructure (CE, SE, etc.): all data management services (catalogues and transfers) are run from Tier-1s.
- Some "larger" Tier-2s may choose to run their own services instead of depending on a Tier-1; in this case they should contact us directly.
- Depending on local expertise, some Tier-2s will specialise in one particular task, such as calibrating a very complex detector that needs special access to particular datasets.

[Diagram: the ATLAS tier organization: Tier-0, Tier-1s, Tier-2s and the CERN Analysis Facility.]

ATLAS Distributed Data Management

- ATLAS reviewed all of its own Grid distributed systems (data management, production, analysis) during the first half of 2005, in parallel with the LCG BSWG activity.
- A new Distributed Data Management system (DDM) was designed, based on:
  - a hierarchical definition of datasets
  - central dataset catalogues
  - data blocks as units of file storage and replication
  - distributed file catalogues
  - automatic data transfer mechanisms using distributed services (the dataset subscription system; a toy sketch follows below)
- The DDM system allows the implementation of the basic ATLAS Computing Model concepts, as described in the Computing TDR (June 2005):
  - distribution of raw and reconstructed data from CERN to the Tier-1s
  - distribution of AODs (Analysis Object Data) to Tier-2 centres for analysis
  - storage of simulated data (produced by Tier-2s) at Tier-1 centres for further distribution and/or processing
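To make the building blocks above concrete, here is a toy Python sketch of datasets, a central dataset catalogue and site subscriptions. All class, dataset and site names are hypothetical illustrations; this is not the real DQ2 API.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    """A named collection of files; datasets can in turn be grouped hierarchically."""
    name: str
    files: list = field(default_factory=list)   # logical file names in this dataset

class DatasetCatalogue:
    """Central catalogue: knows which files belong to which dataset and which
    sites have subscribed to (i.e. asked to receive a copy of) each dataset."""
    def __init__(self):
        self.datasets = {}
        self.subscriptions = {}                  # dataset name -> set of destination sites

    def register(self, dataset: Dataset):
        self.datasets[dataset.name] = dataset

    def subscribe(self, dataset_name: str, site: str):
        if dataset_name not in self.datasets:
            raise KeyError(f"unknown dataset {dataset_name!r}")
        self.subscriptions.setdefault(dataset_name, set()).add(site)

# Usage: register an AOD dataset and subscribe a Tier-2 to it.
catalogue = DatasetCatalogue()
aod = Dataset("mc.004100.AOD.v1", files=["AOD._00001.pool.root", "AOD._00002.pool.root"])
catalogue.register(aod)
catalogue.subscribe(aod.name, "SOME-TIER2-SITE")
```

In the real system a subscription is honoured asynchronously by the distributed site services, which trigger the actual file transfers to the subscribed site.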

ATLAS DDM Organization

Central vs Local Services

- The DDM system now has a central role with respect to the ATLAS Grid tools.
- One fundamental feature is the presence of distributed file catalogues and, above all, auxiliary services:
  - Clearly we cannot ask every single Grid centre to install ATLAS services.
  - We decided to install "local" catalogues and services at the Tier-1 centres.
  - We then defined "regions", each consisting of a Tier-1 and all the other Grid computing centres that are well connected (network-wise) to this Tier-1 and depend on it for ATLAS services, including the file catalogue (a toy site-to-region lookup is sketched below).
- We believe that this architecture scales to our needs for the LHC data-taking era:
  - moving several tens of thousands of files per day
  - supporting up to … organized production jobs per day
  - supporting the analysis work of >1000 active ATLAS physicists
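A minimal sketch of the site-to-region association, assuming invented groupings; the Tier-1 and site names are generic examples and do not represent the actual 2006 ATLAS cloud assignments.

```python
# Illustrative only: which Tier-1's catalogue and DDM services serve a given site.
REGIONS = {
    # Tier-1      other sites in this Tier-1's region ("cloud")
    "Tier1-A": ["Tier2-A1", "Tier2-A2"],
    "Tier1-B": ["Tier2-B1"],
}

def tier1_for(site: str) -> str:
    """Return the Tier-1 whose local file catalogue and services this site depends on."""
    for tier1, sites in REGIONS.items():
        if site == tier1 or site in sites:
            return tier1
    raise KeyError(f"site {site!r} is not associated with any region")

print(tier1_for("Tier2-A2"))   # -> Tier1-A
```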

Tiers of ATLAS

[Diagram: the Tier-0, Tier-1s and Tier-2s arranged in "clouds". FTS servers run at the Tier-0 and at each Tier-1, together with a VO box; the LFC file catalogue is local within each cloud; all Storage Elements are accessed through SRM.]

ATLAS Data Management Model

- Tier-1s send AOD data to Tier-2s.
- Tier-2s produce simulated data and send them to Tier-1s.
- In an ideal world (perfect network communication hardware and software) we would not need to define default Tier-1 to Tier-2 associations.
- In practice it turns out to be convenient (and more robust) to partition the Grid so that there are default, but not compulsory, data paths between Tier-1s and Tier-2s:
  - FTS channels are installed for these data paths for production use (a toy routing decision is sketched below).
  - All other data transfers go through normal network routes.
- In this model, a number of data management services are installed only at the Tier-1s and act also on their "associated" Tier-2s:
  - VO box
  - FTS channel server (both directions)
  - local file catalogue (part of DDM/DQ2)
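To illustrate the default data path idea, here is a toy routing decision; the channel list is hypothetical and is not a real FTS configuration.

```python
# Site pairs with a dedicated production FTS channel (invented examples).
FTS_CHANNELS = {
    ("Tier0-CERN", "Tier1-A"),
    ("Tier1-A", "Tier2-A1"),     # default Tier-1/Tier-2 association, both directions
    ("Tier2-A1", "Tier1-A"),
}

def transfer_route(source: str, destination: str) -> str:
    """Say which path a file transfer would take in this toy model."""
    if (source, destination) in FTS_CHANNELS:
        return "dedicated FTS channel (default production data path)"
    return "normal network route (no dedicated channel)"

print(transfer_route("Tier1-A", "Tier2-A1"))   # dedicated FTS channel
print(transfer_route("Tier2-A1", "Tier2-B1"))  # normal network route
```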

Data Management Considerations

- It is therefore "obvious" that the associations must be between computing centres that are "close" from the point of view of:
  - network connectivity (robustness of the infrastructure)
  - geographical location (round-trip time)
- Rates are not a problem:
  - AOD rates (for a full set) from a Tier-1 to a Tier-2 are nominally 20 MB/s for primary production during data taking, plus the same again for reprocessing from 2008 onwards, and more later on as there will be more accumulated data to reprocess.
  - Upload of simulated data for an "average" Tier-2 (3% of the ATLAS Tier-2 capacity) is constant: 0.03 × 0.2 × 200 Hz × 2.6 MB ≈ 3.1 MB/s continuously (the arithmetic is written out below).
- Total storage (and reprocessing!) capacity for simulated data is a concern:
  - The Tier-1s must store and reprocess simulated data matching their overall share of ATLAS.
  - Some optimization is always possible between real and simulated data, but only within a small range of variations.
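For reference, the upload estimate written out as a short calculation; the labels attached to the individual factors are our reading of the slide (Tier-2 share, simulation fraction, event rate, event size) rather than definitions stated on it.

```python
tier2_share  = 0.03   # an "average" Tier-2: 3% of the total ATLAS Tier-2 capacity
sim_fraction = 0.2    # second factor quoted on the slide (simulation relative to the real data rate)
event_rate   = 200.0  # Hz, nominal ATLAS event rate
event_size   = 2.6    # MB per simulated event

upload_rate = tier2_share * sim_fraction * event_rate * event_size
print(f"continuous simulated-data upload: {upload_rate:.1f} MB/s")   # ~3.1 MB/s
```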

Job Management: Productions

- Once data are distributed in the correct way (rather than sometimes hidden in the guts of automatic mass storage systems), we can rework the distributed production system to optimise job distribution by sending jobs to the data, or as close as possible to them.
  - This was not the case previously: jobs were sent to free CPUs and had to copy the input file(s) to the local worker node from wherever in the world the data happened to be.
- Next: make better use of the task and dataset concepts.
  - A "task" acts on a dataset and produces more datasets.
  - Use the bulk submission functionality to send all jobs of a given task to the location of their input datasets (a toy brokering function is sketched below).
  - Minimise the dependence on file transfers and the waiting time before execution.
  - Collect the output files belonging to the same dataset on the same SE and transfer them asynchronously to their final locations.
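A toy brokering function in the spirit of "send jobs to the data"; the replica locations, site names and free-CPU counts are invented, and the real production system logic is of course far richer.

```python
DATASET_REPLICAS = {
    "mc.004100.AOD.v1": ["Tier1-A", "Tier1-B"],   # sites holding a complete replica
}

FREE_CPUS = {"Tier1-A": 120, "Tier1-B": 40, "Tier2-A1": 300}

def choose_site(input_dataset: str) -> str:
    """Among the sites that already host the input dataset, pick the one with the
    most free CPUs, so no input files need to be copied over the WAN before running."""
    candidates = DATASET_REPLICAS.get(input_dataset, [])
    if not candidates:
        raise RuntimeError(f"no replica of {input_dataset!r} is registered anywhere")
    return max(candidates, key=lambda site: FREE_CPUS.get(site, 0))

# All jobs of a task reading this dataset would be bulk-submitted to Tier1-A.
print(choose_site("mc.004100.AOD.v1"))
```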

ATLAS Production System (2006)

[Diagram: the 2006 production system. Task and job definitions in the production database (ProdDB) are picked up by the Eowyn supervisor and dispatched to Grid-specific executors: Lexor and Lexor-CG for EGEE, Dulcinea for NorduGrid, PanDA for OSG, plus an LSF executor and T0MS at CERN; data handling goes through the DDM (DQ2) data management system.]

Job Management: Analysis

- A system based on a central database (job queue) is good for scheduled productions, as it allows proper priority settings, but it is too heavy for user tasks such as analysis.
- Lacking a global way to submit jobs, a few tools have been developed in the meantime to submit Grid jobs:
  - LJSF (Lightweight Job Submission Framework) can submit ATLAS jobs to the LCG/EGEE Grid; it was derived initially from the framework developed to install ATLAS software at EDG Grid sites.
  - pathena can generate ATLAS jobs that act on a dataset and submit them to PanDA on the OSG Grid.
- The ATLAS baseline tool to help users submit Grid jobs is Ganga (a schematic example follows below):
  - job splitting and book-keeping
  - several submission possibilities
  - collection of output files
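A schematic Ganga-style script for this use case, to be run inside a Ganga session. The generic Job/backend/submit pattern is genuine Ganga, but the ATLAS-specific plugin and attribute names below vary between releases, so they should be read as placeholders rather than an exact recipe.

```python
# Schematic only; plugin names (Athena, DQ2Dataset, AthenaSplitterJob) and the
# dataset and file names are placeholders and may differ in a given Ganga release.
j = Job()
j.application = Athena()                                # run an Athena analysis
j.application.option_file = 'MyAnalysis_jobOptions.py'  # hypothetical job options file
j.inputdata = DQ2Dataset()                              # input taken from a DDM dataset
j.inputdata.dataset = 'some.dataset.AOD.v1'             # hypothetical dataset name
j.splitter = AthenaSplitterJob()                        # split into many Grid sub-jobs
j.backend = LCG()                                       # submit to the LCG/EGEE Grid
j.submit()                                              # Ganga does the book-keeping
                                                        # and collects sub-job outputs
```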

ATLAS Analysis Work Model

1. Job preparation: on the local system (shell), prepare the JobOptions, run Athena (interactively or in batch) and get the output.
2. Medium-scale testing: on the local system (Ganga), prepare the JobOptions, find the dataset through DDM and generate & submit the jobs; Athena runs on the Grid; back on the local system (Ganga), do the job book-keeping, access the output from the Grid and merge the results.
3. Large-scale running: on the local system (Ganga), prepare the JobOptions, find the dataset through DDM and generate & submit the jobs; ProdSys runs Athena on the Grid and stores the output on the Grid; the local system (Ganga) keeps the job book-keeping and retrieves the output.

Analysis Jobs at Tier-2s

- Analysis jobs must run where the input data files are, as transferring data files from other sites may take longer than actually running the job.
- Most analysis jobs will take AODs as input for complex calculations and event selections, and will most likely output Athena-Aware Ntuples (AANs, to be stored on a nearby SE) and histograms (to be sent back to the user).
- We assume that people will develop their analyses and run them on reduced samples many times before launching runs on a complete dataset; there will be a large number of failures due to people's code!
- In order to ensure execution of analysis jobs with a reasonable turn-around time, we have to set up a priority system that separates centrally organised productions from analysis tasks (a toy illustration follows below); more on this in D. Liko's talk tomorrow afternoon.
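A toy illustration of such a priority split, using the 1:2 central-simulation-to-analysis share quoted earlier purely as an example; real sites would implement this in their batch system fair-share configuration (for instance keyed on VOMS roles), not in application code.

```python
# Target share of running slots per activity (example numbers only).
SHARES  = {"production": 1 / 3, "analysis": 2 / 3}
RUNNING = {"production": 180, "analysis": 20}     # currently running jobs per activity

def next_activity_to_start() -> str:
    """Start the next job from the activity furthest below its target share, so
    analysis keeps a reasonable turn-around even when production fills the farm."""
    total = sum(RUNNING.values()) or 1
    deficit = {a: SHARES[a] - RUNNING[a] / total for a in SHARES}
    return max(deficit, key=deficit.get)

print(next_activity_to_start())   # -> "analysis" (far below its target share)
```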

Conclusions

- ATLAS operations at Tier-2s are well defined in the Computing Model.
- We thank all Tier-2 managers and funding authorities for providing ATLAS with the much needed capacity.
- We are trying not to impose any particular load on Tier-2 managers by running the distributed services at Tier-1s, although this concept breaks the symmetry and forces us to set up default Tier-1 to Tier-2 associations.
- All that is required of Tier-2s is to set up the Grid environment:
  - including whichever job queue priority scheme will be found most useful
  - and SRM Storage Elements with (when available) a correct implementation of the space reservation and accounting system.