ATLAS Computing Technical Design Report
Dario Barberis (CERN & Genoa University), on behalf of the ATLAS Collaboration
LHCC - 29 June 2005

Computing TDR

Computing TDR structure
The TDR describes the whole Software & Computing Project as defined within the ATLAS organization:
- Major activity areas within the S&C Project
- Liaisons to other ATLAS projects

Computing TDR structure
The chapter structure follows closely the organization of the Software & Computing Project:
- Chapter 1: Introduction
- Chapter 2: Computing Model
- Chapter 3: Offline Software
- Chapter 4: Databases and Data Management
- Chapter 5: Grid Tools and Services
- Chapter 6: Computing Operations
- Chapter 7: Resource Requirements
- Chapter 8: Project Organization and Planning
- Appendix: Glossary

Computing Model: event data flow from the Event Filter
- Events are written in "ByteStream" format by the Event Filter farm in 2 GB files
  - ~1000 events/file (nominal size is 1.6 MB/event)
  - 200 Hz trigger rate (independent of luminosity)
  - Currently 4 streams are foreseen:
    - Express stream with the "most interesting" events
    - Calibration events (including some physics streams, such as inclusive leptons)
    - "Trouble maker" events (for debugging)
    - Full (undivided) event stream
  - One 2 GB file every 5 seconds will be available from the Event Filter
  - Data will be transferred to the Tier-0 input buffer at 320 MB/s (average)
- The Tier-0 input buffer will have to hold raw data waiting for processing
  - and also cope with possible backlogs
  - ~125 TB will be sufficient to hold 5 days of raw data on disk (see the check below)
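A quick back-of-the-envelope check of the rates and buffer size quoted above (illustrative only; the figures in the TDR are the reference):

```python
# Back-of-the-envelope check of the Event Filter -> Tier-0 data flow numbers
# quoted above (illustrative only; the TDR figures are the reference).

event_size_mb = 1.6        # nominal RAW event size [MB]
trigger_rate_hz = 200      # Event Filter output rate [Hz]
events_per_file = 1000     # events per ByteStream file

rate_mb_s = event_size_mb * trigger_rate_hz           # 320 MB/s average into Tier-0
file_size_gb = event_size_mb * events_per_file / 1e3  # ~1.6 GB, written as 2 GB files
file_period_s = events_per_file / trigger_rate_hz     # one file every ~5 s

buffer_days = 5
buffer_tb = rate_mb_s * 86400 * buffer_days / 1e6     # ~138 TB for 5 days of RAW data

print(rate_mb_s, file_size_gb, file_period_s, round(buffer_tb))
```

The 5-day buffer comes out at about 138 TB in decimal units, i.e. roughly the ~125 TB quoted above once binary terabytes are used.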

Computing Model: central operations
- Tier-0:
  - Copy RAW data to Castor tape for archival
  - Copy RAW data to Tier-1s for storage and reprocessing
  - Run first-pass calibration/alignment (within 24 hours)
  - Run first-pass reconstruction (within 48 hours)
  - Distribute reconstruction output (ESDs, AODs and TAGs) to Tier-1s
- Tier-1s:
  - Store and take care of a fraction of the RAW data
  - Run "slow" calibration/alignment procedures
  - Rerun reconstruction with better calibration/alignment and/or algorithms
  - Distribute reconstruction output to Tier-2s
  - Keep current versions of ESDs and AODs on disk for analysis
- Tier-2s:
  - Run simulation
  - Keep current versions of AODs on disk for analysis

Event Data Model
- RAW:
  - "ByteStream" format, ~1.6 MB/event
- ESD (Event Summary Data):
  - Full output of reconstruction in object (POOL/ROOT) format: tracks (and their hits), calorimeter clusters, calorimeter cells, combined reconstruction objects, etc.
  - Nominal size 500 kB/event; currently 2.5 times larger. Contents and technology are under revision, following feedback on the first prototype implementation.
- AOD (Analysis Object Data):
  - Summary of event reconstruction with "physics" (POOL/ROOT) objects: electrons, muons, jets, etc.
  - Nominal size 100 kB/event; currently 70% of that. Contents and technology are under revision, following feedback on the first prototype implementation.
- TAG:
  - Database used to quickly select events in AOD and/or ESD files
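To give an idea of what these per-event sizes imply, here is a rough yearly-volume sketch, assuming the 200 Hz rate from the previous slide and the conventional ~10^7 live seconds of data taking per year (an assumption for illustration, not a number from this slide):

```python
# Rough yearly data volumes implied by the per-event sizes above, assuming
# 200 Hz and ~1e7 live seconds per year (assumed for illustration only).

rate_hz = 200
live_seconds = 1e7
events_per_year = rate_hz * live_seconds   # 2e9 events/year

sizes_mb = {"RAW": 1.6, "ESD (nominal)": 0.5, "AOD (nominal)": 0.1}
for fmt, size in sizes_mb.items():
    volume_pb = events_per_year * size / 1e9   # MB -> PB (decimal)
    print(f"{fmt}: {volume_pb:.1f} PB/year")   # RAW ~3.2, ESD ~1.0, AOD ~0.2
```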

Offline Software: Architecture
- The architecture of the Athena framework is based on Gaudi (a schematic sketch follows below):
  - Separation of data from algorithms
  - Separation of transient (in-memory) from persistent (in-file) data
  - Extensive use of abstract interfaces to decouple the various components
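As a purely schematic illustration of this pattern, the sketch below shows algorithms that hold no event data themselves and exchange objects through a transient event store. This is not the real Athena/Gaudi C++ API; the class names (EventStore, TrackFitExample) are invented for the example.

```python
# Schematic sketch of the Gaudi/Athena pattern described above: algorithms
# read/write data objects through a transient event store and are driven by
# the framework via an abstract interface. Names are illustrative only.

class EventStore:
    """Transient, in-memory event store (stand-in for a StoreGate-like service)."""
    def __init__(self):
        self._objects = {}
    def record(self, key, obj):
        self._objects[key] = obj
    def retrieve(self, key):
        return self._objects[key]

class Algorithm:
    """Abstract interface: the framework calls initialize/execute/finalize."""
    def initialize(self): pass
    def execute(self, store): raise NotImplementedError
    def finalize(self): pass

class TrackFitExample(Algorithm):
    """Toy algorithm: reads hits, writes 'tracks'; holds no event data itself."""
    def execute(self, store):
        hits = store.retrieve("Hits")
        tracks = [{"n_hits": len(hits)}]     # placeholder 'reconstruction'
        store.record("Tracks", tracks)

# One event-loop iteration: the data live in the store, not in the algorithm.
store = EventStore()
store.record("Hits", [(1.0, 2.0), (1.5, 2.7), (2.1, 3.4)])
alg = TrackFitExample()
alg.initialize(); alg.execute(store); alg.finalize()
print(store.retrieve("Tracks"))
```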

Offline Software: Geometry
- The GeoModel detector description system provides us with an application-independent way to describe the geometry
- In this way Simulation, Reconstruction, Event Display, etc. use by definition the same geometry
- Geometry data are stored in a database with a hierarchical versioning system
- Alignment corrections are applied with reference to a given baseline geometry
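A minimal sketch of the last two points: a geometry version selected by a tag, with alignment corrections stored as deltas against that baseline. The tag names, volume paths and numbers are invented; this is not the GeoModel data model.

```python
# Illustrative sketch: geometry versions are selected by a hierarchical tag and
# alignment corrections are deltas relative to a baseline geometry tag.
# Tag names, volume paths and values are invented for the example.

baseline_geometry = {
    "tag": "ATLAS-BASELINE-01",                                   # hypothetical tag
    "InnerDetector/Pixel/Module_0": {"x": 0.0, "y": 0.0, "z": 0.0},
}

alignment_deltas = {
    "baseline_tag": "ATLAS-BASELINE-01",                          # derived against this tag
    "InnerDetector/Pixel/Module_0": {"dx": 0.012, "dy": -0.003, "dz": 0.0},  # mm
}

def aligned_position(volume):
    """Apply the stored alignment delta on top of the baseline position."""
    assert alignment_deltas["baseline_tag"] == baseline_geometry["tag"]
    base, d = baseline_geometry[volume], alignment_deltas[volume]
    return {"x": base["x"] + d["dx"], "y": base["y"] + d["dy"], "z": base["z"] + d["dz"]}

print(aligned_position("InnerDetector/Pixel/Module_0"))
```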

Offline Software: Simulation
- The event generator framework interfaces multiple packages
  - including the Genser distribution provided by LCG-AA
- Simulation has been based on Geant4 since early 2004
  - automatic geometry build from GeoModel
  - >25M events fully simulated since mid-2004, with only a handful of crashes
- Digitization tested and tuned with Test Beam data
- Fast simulation also used for preliminary large-statistics (physics) background level studies

Offline Software: Reconstruction
- Separation of data and algorithms:
  - Tracking code
  - Calorimetry code
- Resource needs (memory and CPU) are currently larger than the target values
  - Optimization and performance, rather than functionality, will be the focus of development until detector turn-on

Offline Software: Physics Analysis Tools
- The Physics Analysis Tools group develops common utilities for analysis based on the Athena framework:
  - classes for selection, sorting, combination, etc. of data objects
  - constituent navigation (e.g. jets to clusters) and back-navigation (e.g. AOD to ESD)
  - the UserAnalysis package in Athena
  - interactive analysis in Athena
  - analysis in Python
  - interfaces to event displays
  - testing the concept of the "Event View": a coherent list of physics objects that are mutually exclusive, so that any object appears only once in the list of reconstructed objects available for analysis (illustrated in the sketch below)
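A toy sketch of the mutual-exclusivity idea behind an "Event View": overlapping candidates are removed so that each reconstructed object appears only once. The objects, the priority ordering and the crude angular overlap criterion are invented for the example; this is not the actual EventView implementation.

```python
# Toy illustration of the "Event View" idea above: build a list of physics
# objects in which each reconstructed object appears only once, by dropping
# candidates that overlap with an object already accepted. Objects, priorities
# and the overlap criterion are invented for the example.

from math import hypot

candidates = [
    {"type": "electron", "eta": 0.50, "phi": 1.20},
    {"type": "jet",      "eta": 0.52, "phi": 1.18},   # overlaps the electron
    {"type": "jet",      "eta": -1.3, "phi": 2.90},
    {"type": "muon",     "eta": 0.90, "phi": -0.40},
]

priority = {"electron": 0, "muon": 1, "jet": 2}        # which type "wins" an overlap

def overlaps(a, b, dr_max=0.2):
    """Crude angular-distance overlap test (ignores phi wrap-around for brevity)."""
    return hypot(a["eta"] - b["eta"], a["phi"] - b["phi"]) < dr_max

event_view = []
for obj in sorted(candidates, key=lambda o: priority[o["type"]]):
    if not any(overlaps(obj, kept) for kept in event_view):
        event_view.append(obj)

print(event_view)   # electron, muon and the far jet kept; the overlapping jet is removed
```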

Databases and Data Management
- The DB/DM project takes care of all types of ATLAS data
- Event data:
  - file organization (using the LCG POOL/ROOT data storage)
  - cataloguing
  - book-keeping
  - data access, converters, schema evolution
- Non-event data:
  - Technical Coordination, Production, Installation and Survey DBs
  - Online conditions and configuration data
  - Geometry and calibration/alignment DBs (using the LCG COOL DB)
  - DB access methods, data distribution
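Conditions data such as calibration and alignment constants are keyed by intervals of validity. The sketch below illustrates that lookup pattern in a highly simplified form; it is not the COOL API, and the folder contents and payloads are invented.

```python
# Simplified illustration of interval-of-validity (IOV) conditions access, the
# pattern used for the calibration/alignment data mentioned above. This is not
# the COOL API; the IOVs and payloads are invented for the example.

conditions_folder = [
    # (first_valid_run, last_valid_run, payload)
    (1,    999,  {"pixel_t0_offset_ns": 0.0}),
    (1000, 1999, {"pixel_t0_offset_ns": 0.7}),
    (2000, 9999, {"pixel_t0_offset_ns": 0.4}),
]

def get_condition(run_number):
    """Return the payload whose interval of validity contains this run."""
    for first, last, payload in conditions_folder:
        if first <= run_number <= last:
            return payload
    raise KeyError(f"no conditions valid for run {run_number}")

print(get_condition(1500))   # -> {'pixel_t0_offset_ns': 0.7}
```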

Distributed Data Management
- Accessing distributed data on the Grid is not a simple task
- Several central databases are needed to hold dataset information
- "Local" catalogues hold information on local data storage (see the sketch below)
- The new DDM system is under test this summer
- It will be used for all ATLAS data from October on (LCG Service Challenge 3)
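The split between central dataset catalogues and per-site local catalogues can be illustrated with a minimal lookup sketch. The dataset, file and site names are invented, and this is not the actual DDM implementation.

```python
# Minimal sketch of the catalogue split described above: central catalogues map
# datasets to their files and to the sites holding replicas, while each site's
# local catalogue maps files to physical storage paths. All names are invented.

central_dataset_catalogue = {
    "mc.rome.higgs.recon.v1": ["file_0001.pool.root", "file_0002.pool.root"],
}
central_location_catalogue = {
    "mc.rome.higgs.recon.v1": ["CERN", "RAL"],        # sites with a (possibly partial) replica
}
local_site_catalogues = {
    "CERN": {"file_0001.pool.root": "/castor/atlas/rome/file_0001.pool.root",
             "file_0002.pool.root": "/castor/atlas/rome/file_0002.pool.root"},
    "RAL":  {"file_0001.pool.root": "/dpm/atlas/rome/file_0001.pool.root"},
}

def locate(dataset, preferred_site):
    """Resolve a dataset to the physical paths of the files present at one site."""
    files = central_dataset_catalogue[dataset]
    sites = central_location_catalogue[dataset]
    site = preferred_site if preferred_site in sites else sites[0]
    local = local_site_catalogues[site]
    return site, [local[f] for f in files if f in local]   # only files actually at the site

print(locate("mc.rome.higgs.recon.v1", "RAL"))
```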

Distributed Production System
- ATLAS has already run several large-scale distributed production exercises
  - DC1, DC2 in 2004, and the "Rome production" in 2005
  - several tens of millions of events fully simulated and reconstructed
- It has not been an easy task, despite the availability of three Grids
  - DC2 and the Rome production were run entirely on Grids (LCG/EGEE, Grid3/OSG, NorduGrid)
- A second version of the distributed production system (ProdSys) is in preparation, keeping the same architecture (sketched below):
  - ProdDB (production database)
  - common supervisor
  - one executor per Grid
  - interface to DDM
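A schematic sketch of that supervisor/executor architecture is shown below. The class names, job records and return values are placeholders; the real ProdSys components and interfaces differ in detail.

```python
# Schematic sketch of the production-system architecture listed above: a common
# supervisor pulls job definitions from the production database and hands each
# one to the executor for its Grid flavour. Names and interfaces are placeholders.

class Executor:
    """Abstract interface: one concrete executor per Grid flavour."""
    def submit(self, job): raise NotImplementedError

class LCGExecutor(Executor):
    def submit(self, job): return f"submitted job {job['id']} to LCG/EGEE"

class OSGExecutor(Executor):
    def submit(self, job): return f"submitted job {job['id']} to Grid3/OSG"

class NorduGridExecutor(Executor):
    def submit(self, job): return f"submitted job {job['id']} to NorduGrid"

prod_db = [                       # stand-in for the ProdDB job table
    {"id": 1, "grid": "LCG", "transformation": "simul", "dataset": "dc2.simul.A"},
    {"id": 2, "grid": "OSG", "transformation": "recon", "dataset": "rome.recon.B"},
    {"id": 3, "grid": "NG",  "transformation": "simul", "dataset": "dc2.simul.C"},
]

executors = {"LCG": LCGExecutor(), "OSG": OSGExecutor(), "NG": NorduGridExecutor()}

def supervisor_pass(jobs):
    """One supervisor cycle: dispatch each pending job to its Grid's executor."""
    return [executors[job["grid"]].submit(job) for job in jobs]

for line in supervisor_pass(prod_db):
    print(line)
```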

Distributed Analysis System
- Several groups started independently to work on prototypes of distributed analysis systems
- LCG RTAG 11 did not produce in 2003 the common analysis system project that had been hoped for. ATLAS therefore planned to combine the strengths of the various existing prototypes:
  - GANGA to provide a user interface to the Grid for Gaudi/Athena jobs
  - DIAL to provide fast, quasi-interactive access to large local clusters
  - the ATLAS Production System to interface to the three Grid flavours
- The combined system did not provide users with the expected functionality for the Rome Physics Workshop (June 2005)
- We are currently reviewing this activity in order to define a baseline for the development of a performant Distributed Analysis System
- In the meantime, other projects have appeared on the market:
  - DIANE to submit interactive jobs to a local computing cluster
  - LJSF to submit batch jobs to the LCG/EGEE Grid (using part of ProdSys)
- All of this has to work together with the DDM system described earlier
- If we decide on a baseline "now", we can have a testable system by this autumn

Computing Operations
The Computing Operations organization has to provide for:
a) CERN Tier-0 operations
   - from the output of the Event Filter to data distribution to the Tier-1s, including calibration/alignment and reconstruction procedures
b) World-wide operations:
   - simulation job distribution
   - reprocessing of real and simulated data at the Tier-1s
   - data distribution and placement
c) Software distribution and installation
d) Site and software installation validation and monitoring
e) Coordination of Service Challenges
f) User support
... all along the guidelines of the Computing Model.
Some of the needed components already exist and have been tested during the Data Challenges.

ATLAS Virtual Organization
- Right now the Grid is "free for all":
  - no CPU or storage accounting
  - no priorities
  - no storage space reservation
- We have already seen competition for resources between "official" Rome productions and "unofficial", but organized, productions
  - B-physics, flavour tagging, ...
- The latest release of the VOMS (Virtual Organisation Management Service) middleware package allows the definition of user groups and roles within the ATLAS Virtual Organisation
  - and is used by all three Grid flavours!
- Once groups and roles are set up, we have to use this information
- Relative priorities are easy to enforce if all jobs go through the same queue (or database); a simple share-based scheme is sketched below
- In the case of a distributed submission system, it is up to the resource providers to:
  - agree on the policies of each site with ATLAS
  - publish and enforce the agreed policies
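To make the "groups and roles drive priorities" idea concrete, here is a minimal sketch of share-based ordering of jobs in a single central queue. The group names, shares and attribute format are invented; they are not real ATLAS VOMS attributes or agreed policies.

```python
# Minimal sketch of using VO group information to order jobs in a single central
# queue, as discussed above. Group names, shares and the attribute format are
# invented; they are not real ATLAS VOMS attributes or policies.

group_shares = {                       # hypothetical relative shares per group
    "/atlas/production": 0.50,
    "/atlas/physics.higgs": 0.30,
    "/atlas/user": 0.20,
}

def job_priority(job, recent_usage):
    """Higher priority for groups further below their nominal share."""
    group = job["voms_group"]
    share = group_shares.get(group, 0.05)      # small default share for other groups
    used = recent_usage.get(group, 0.0)        # fraction of recent CPU consumed
    return share - used                        # under-served groups float to the top

queue = [
    {"id": "a1", "voms_group": "/atlas/user"},
    {"id": "p7", "voms_group": "/atlas/production"},
    {"id": "h3", "voms_group": "/atlas/physics.higgs"},
]
recent_usage = {"/atlas/production": 0.55, "/atlas/physics.higgs": 0.10, "/atlas/user": 0.05}

for job in sorted(queue, key=lambda j: job_priority(j, recent_usage), reverse=True):
    print(job["id"], job["voms_group"])    # h3, a1, p7 for these example numbers
```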

Computing Model and Resources
- The Computing Model in the TDR is basically the same as in the Computing Model document (December 2004) submitted for the LHCC review in January 2005
- Main differences:
  - inclusion of Heavy Ion runs from 2008 onwards
    - adds ~10% to the data storage needs
    - the increase in CPU needed at the Tier-1s is limited to <10% by a different model for data reprocessing
  - decrease of the nominal run time in 2007 from 100 to 50 days
    - after agreement with the LHCC referees (McBride/Forti)
- The immediate consequence of these changes on the resource needs is a slower ramp-up, followed by a steeper increase as soon as Heavy Ion running starts (of order 10% for storage everywhere and for CPU at the Tier-1s)

[Diagram: Tier-0, CERN Analysis Facility, Tier-1s and Tier-2s]

Massive productions on 3 Grids (1)
- July-September 2004: DC2 Geant4 simulation (long jobs)
  - 40% on the LCG/EGEE Grid, 30% on Grid3 and 30% on NorduGrid
- October-December 2004: DC2 digitization and reconstruction (short jobs)
- February-May 2005: Rome production (a mix of jobs, as digitization and reconstruction were started as soon as samples had been simulated)
  - 65% on the LCG/EGEE Grid, 24% on Grid3, 11% on NorduGrid

Massive productions on 3 Grids (2)
- 73 datasets containing 6.1M events simulated and reconstructed (without pile-up)
- Total simulated data: 8.5M events
- Pile-up added later (done for 1.3M events up to last week)

Computing System Commissioning Goals
- We have recently discussed and defined the high-level goals of the Computing System Commissioning operation during the first half of 2006
  - formerly called "DC3"
  - more a running-in of continuous operation than a stand-alone challenge
- The main aim of Computing System Commissioning will be to test the software and computing infrastructure that we will need at the beginning of 2007:
  - calibration and alignment procedures and the conditions DB
  - the full trigger chain
  - Tier-0 reconstruction and data distribution
  - distributed access to the data for analysis
- At the end (summer 2006) we will have a working and operational system, ready to take data with cosmic rays at increasing rates

Computing System Commissioning Tests
- Sub-system tests with well-defined goals, preconditions, clients and quantifiable acceptance tests:
  - Full Software Chain
  - Tier-0 Scaling
  - Calibration & Alignment
  - Trigger Chain & Monitoring
  - Distributed Data Management
  - Distributed Production (Simulation & Reprocessing)
  - Physics Analysis
  - Integrated TDAQ/Offline (complete chain)
- Each sub-system is decomposed into components
  - e.g. Generators, Reconstruction (ESD creation)
- The goal is to minimize coupling between sub-systems and components and to perform focused and quantifiable tests
- Detailed planning is being discussed now

Project Milestones
- July 2005: test the TAG and event collection infrastructure for data access for analysis (important for the Computing Model and the Event Data Model)
- July 2005: decision on the future baseline for the Distributed Analysis System
- September 2005: new version of the Production System fully tested and deployed
- September 2005: delivery of software and infrastructure for ATLAS Commissioning and for Computing System Commissioning; start of integration testing
- October 2005: review of the implementation of the Event Data Model for reconstruction
- October 2005: Distributed Data Management system ready for operation with LCG Service Challenge 3
- November 2005: start of simulation production for Computing System Commissioning
- January 2006: production release for Computing System Commissioning and early Cosmic Ray studies; completion of the implementation of the Event Data Model for reconstruction
- February 2006: start of DC3 (Computing System Commissioning)
- April 2006: integration of ATLAS components with LCG Service Challenge 4
- July 2006: production release for the main Cosmic Ray runs (starting in Autumn 2006)
- December 2006: production release for early real proton data

Conclusions
- Data taking for ATLAS is going to start "soon"
  - as a matter of fact, cosmic ray runs are already starting in the pit
- The Software & Computing Project is going to provide a working system and infrastructure for the needs of first ATLAS data
  - indeed, already now for detector commissioning and cosmic ray runs
- Offline software and databases are already at a good level of functionality
  - work will concentrate, from now on, on usability, optimization and performance
- The distributed systems still need a lot of testing and running-in
  - this is true for Distributed Production, Distributed Data Management and Distributed Analysis
- The Computing System Commissioning activity in the first half of 2006 will test an increasingly complex and functional system and provide us, in summer 2006, with an operational infrastructure