Advanced Grid Technologies in ATLAS Data Management
Alexandre Vaniachine, Argonne National Laboratory
Invited talk at NEC'2003, XIX International Symposium on Nuclear Electronics & Computing, Varna, Bulgaria, 15-20 September 2003
Overview
- ATLAS computing challenge
- Core software domains
- Grid technologies deployed
- DC1 production experience
- Data management architecture
ATLAS Computing Challenge
- Event size: 1-1.5 MB
- After on-line selection, events will be written to permanent storage at a rate of 100-200 Hz
- Raw data: 1 PB/year; with reconstructed and simulated data the total is ~10 PB/year
- ATLAS depends on computing as much as it depends on the trigger or the hadron calorimeter
- These data start coming at the full rate at the end of 2006
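The raw-data figure follows directly from the trigger rate and event size. A back-of-the-envelope check, assuming the lower-end numbers from the slide and the usual ~1e7 seconds of accelerator running time per year:

```python
# Sanity check of the quoted 1 PB/year raw-data volume.
# Assumptions: 100 Hz selection rate, 1.0 MB/event, 1e7 s of beam per year.
event_size_mb = 1.0
rate_hz = 100
seconds_per_year = 1e7

raw_pb_per_year = event_size_mb * rate_hz * seconds_per_year / 1e9  # MB -> PB
print(f"{raw_pb_per_year:.1f} PB/year")  # -> 1.0 PB/year
```

With the upper-end numbers (1.5 MB, 200 Hz) the same formula gives 3 PB/year, so the slide's 1 PB/year is the conservative end of the range.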
Planetary Computing Model
- The problem: a larger and more distributed collaboration — >2000 collaborators, 151 institutions, 34 countries
- The decision: CERN will supply only a fraction of the computing, with the rest supplied by collaborators
- The result of the unprecedented data sizes and the distributed nature of physicists and computing is the need for multiple advances in computing tools
- Computing infrastructure, which was centralized in the past, will now be distributed (for experiments the trend is the reverse)
Software Framework: Athena
Athena features:
- Common code base with the Gaudi framework (LHCb)
- Separation of data and algorithms
- Memory management
- Transient/persistent data split
Athena is the backbone of the ATLAS Computing Model data flow.
Core Computing Domains
The separation of transient and persistent data in the ATLAS software architecture determines three core computing domains:
- Software framework for data processing algorithms
- Scalable solutions for data persistency
- Grid computing for data processing and analysis
My presentation focuses on advances in computing technologies integrating Grid Computing and Data Management — the two core software domains providing the foundation for the ATLAS Software Framework.
Interfacing Athena to the Grid
GANGA: Gaudi/Athena aNd Grid Alliance
- Athena/GAUDI application: virtual data, algorithms
- Grid services: histograms, monitoring, results
- Job: configuration, monitoring, scheduling
- Resource: estimation, booking
ATLAS Database Architecture
Described in the ATLAS Database Architecture document. Database content flows between sites through operations such as "extract & transform", "just extract", "transport & install", and "transport, transform & install". The architecture is ready for Grid integration and is independent of persistency technology.
Technology Independence
- Ensuring that the 'application' software is independent of the underlying persistency technology is one of the defining characteristics of the ATLAS software architecture (the "transient/persistent" split)
- Changing the persistency mechanism (e.g., Objectivity -> ROOT I/O) requires a change of "converter", but of nothing else
- The ease of the baseline change demonstrates the benefits of decoupling transient and persistent representations
- Integrated operation of the framework and data management domains demonstrated the capability of reading the same data from different frameworks, switching between persistency technologies: Objectivity/DB and ROOT I/O persistency in ATLAS DC0; an ATLAS-specific temporary solution (AthenaROOT) in DC1
- An important milestone towards DC2 has been achieved recently: the LHC-wide hybrid ROOT-based persistency technology POOL for DC2 was delivered in the latest ATLAS software release 7.0.0 (AthenaPOOL)
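The converter idea can be sketched in a few lines. This is an illustrative toy, not ATLAS code: the "algorithm" sees only the transient object, each persistency technology supplies its own converter, and swapping technologies means swapping converters and nothing else (here, JSON and pickle stand in for ROOT I/O and Objectivity):

```python
import json
import pickle

class Event:  # transient representation, seen by all algorithms
    def __init__(self, run, data):
        self.run, self.data = run, data

class JsonConverter:  # stand-in for, e.g., a ROOT I/O converter
    def write(self, ev):
        return json.dumps({"run": ev.run, "data": ev.data}).encode()
    def read(self, blob):
        d = json.loads(blob)
        return Event(d["run"], d["data"])

class PickleConverter:  # stand-in for, e.g., an Objectivity converter
    def write(self, ev):
        return pickle.dumps(ev)
    def read(self, blob):
        return pickle.loads(blob)

def roundtrip(ev, converter):
    # An "algorithm": it never touches the persistent format directly.
    return converter.read(converter.write(ev))

for conv in (JsonConverter(), PickleConverter()):
    assert roundtrip(Event(7, [1, 2]), conv).run == 7  # same code, either backend
```

The point of the pattern is visible in the last loop: the calling code is identical for both backends.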
LHC Common Persistence Infrastructure (POOL)
- During the past year a new effort emerged: the LHC-wide Computing Grid Project (LCG)
- The LCG Requirements Technical Assessment Group (RTAG) on persistence recommended a common infrastructure: an object streaming layer based upon ROOT, plus a relational database layer for file management and higher-level services
- Based on the RTAG recommendations, a common development project was launched: POOL
- ATLAS is committed to this effort and has adopted the POOL technology
- To be clear: the common-project infrastructure that POOL will provide is our baseline event store technology
ATLAS Data Challenges
In a recent worldwide collaborative effort — Data Challenge 1 (DC1) — spanning 56 prototype tier centers in 21 countries on four continents, ATLAS produced more than 60 TB of data for physics studies. DC1 provided a testbed for the integration and testing of advanced Grid computing components in a production environment.
DC1 Production on the Grid
A significant fraction of the DC1 data was produced on:
- NorduGrid
- the US ATLAS Grid Testbed
DC1 jobs were also successfully tested on:
- EDG
- Grid3 (US ATLAS, US CMS, LIGO, SDSS sites)
Innovative Technologies
Several novel Grid technologies were used in ATLAS data production and data management for the first time. My presentation describes the new Grid technologies introduced into the HEP production environment:
- the Chimera Virtual Data System, automating data derivation
- Virtual Data Cookbook services managing templated production recipes
- efficient Grid certificate authorization technologies for virtual data access control
- virtual database services delivery for reconstruction on Grid clusters behind closed firewalls
Centralized Management
- For efficiency of the large production tasks distributed worldwide, it is essential to establish shared production management tools
- The ATLAS Metadata Catalogue AMI and the Replica Catalogue MAGDA exemplify such Grid tools deployed in DC1
- To complete the data management architecture for distributed production, ATLAS prototyped Virtual Data services
MAGDA Architecture
MAGDA (MAnager for Grid-based DAta) is the replica catalogue.
AMI Architecture
AMI (ATLAS Metadata Interface) is the metadata catalogue.
Introducing Virtual Data
- The prevailing views in HEP computing have been data-centric: we need to produce the data (ASAP), with the production recipes being just tools used in the process by the "production gurus". The value of the production recipes has not been fully appreciated.
- Preparing recipes for data production requires significant effort and encapsulates considerable expert knowledge
- Because the production recipes have to be fully validated, their development is an iterative, time-consuming process similar to fundamental knowledge discovery
- The GriPhyN project (www.griphyn.org) introduced a different perspective: recipes are as valuable as the data
- If you have the recipes you may not even need the data: you can reproduce the data on demand
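The "reproduce the data on demand" idea can be illustrated with a toy sketch (not GriPhyN code; all names here are hypothetical): a dataset is defined by its recipe, and is materialized lazily, recursing through the recipes of its inputs, only when someone asks for it.

```python
recipes = {}  # dataset name -> (producing function, names of input datasets)
cache = {}    # datasets that have actually been materialized

def define(name, func, *inputs):
    """Register a recipe: the dataset is 'virtual' until requested."""
    recipes[name] = (func, inputs)

def materialize(name):
    """Return the dataset, (re)producing it from its recipe if absent."""
    if name not in cache:
        func, inputs = recipes[name]
        cache[name] = func(*(materialize(i) for i in inputs))
    return cache[name]

define("raw", lambda: list(range(5)))
define("reco", lambda raw: [x * 2 for x in raw], "raw")
print(materialize("reco"))  # -> [0, 2, 4, 6, 8], produced on demand
```

Because the recipes are recorded, deleting `cache` loses nothing: any dataset can be regenerated, which is exactly the trade the slide describes.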
VDC Architecture
Virtual Data in DC1 Production
To deliver a scalable data management solution, ATLAS implemented innovative computer science concepts in practice: the first use of Virtual Data technologies in DC1 production. Two concepts are implemented in the ATLAS Virtual Data System operation:
- Production workflow became computerized: acyclic data-dependency tracking using GriPhyN and iVDGL software, providing Data Provenance services — the first use of the Chimera Virtual Data system in production
- Production recipes became templatized: a templated recipes repository, the Cookbook, providing Data Providence* services; about half of the more than two hundred DC1 datasets were serviced
* prov·i·dence n. 1. Care or preparation in advance; foresight. The American Heritage Dictionary of the English Language
Acyclic Portion of DC1 Workflow
The Chimera Virtual Data system eliminates "manual" tracking of the data dependencies between independent production steps and enables multi-step compound data transformations on demand. [Workflow diagram: Athena generators produce HepMC.root; atlsim simulation and pile-up produce digis.zebra and digis.root; Athena reconstruction produces recon.root; Athena conversion turns geometry.zebra into geometry.root; Atlfast produces Atlfast.root and filtering.ntuple; Athena QA steps produce QA.ntuple files.] The feedback loop introduced in ATLAS by physics validation is omitted.
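What "eliminating manual dependency tracking" buys can be sketched with the standard library: given the acyclic dependency graph, a valid production order falls out automatically. The graph below is a simplified chain taken from the file names in the diagram; it is illustrative, not the full DC1 workflow.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Simplified slice of the acyclic DC1 dependency graph:
# each product maps to the set of products it is derived from.
deps = {
    "HepMC.root":  set(),               # Athena generators
    "digis.zebra": {"HepMC.root"},      # atlsim simulation
    "digis.root":  {"digis.zebra"},     # pile-up
    "recon.root":  {"digis.root"},      # Athena reconstruction
    "QA.ntuple":   {"recon.root"},      # Athena QA
}

order = list(TopologicalSorter(deps).static_order())
print(order)  # a valid multi-step production order, derived automatically
```

For this chain the order is unique; in the real branching graph any topological order is a legal production schedule, which is what lets Chimera run compound transformations on demand.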
Chimera in DC1 Reconstruction
- Installed ATLAS releases 6.0.2+ (Pacman cache) on selected US ATLAS testbed sites
- 2x520 partitions of DataSet 2001 (lumi10) were reconstructed at the JAZZ cluster (Argonne), LBNL, IU and BU, and BNL (test)
- 2x520 Chimera derivations, ~200,000 events reconstructed
- Submit hosts: LBNL; others: Argonne, UC, IU
- RLS servers at the University of Chicago and BNL
- Storage host and Magda cache at BNL
- Group-level Magda registration of output
- Output transferred to BNL and CERN/Castor
Uncharted OGSA Area
- Interest in the X509 authorization capabilities of MySQL was prompted by Doug Olson's announcement to the PPDG mailing list
- Numerous e-mail exchanges and discussions with interested PPDG participants on grid-enabling MySQL; a Grid example by Kate Keahey
- Database services on the grid are an uncharted OGSA area
- At CHEP'03 MySQL emerged as the most popular database
Database Access on the Grid
Different security models:
- A separate server does the grid authorization:
  - Spitfire (EDG WP2): SOAP/XML text-only data transport
  - DAI (IBM UK): Spitfire technologies + XML binary extensions
  - Perl DBI database proxy (ALICE): SQL data transport
  - Oracle 10g (separate authorization layer)
- Authorization is integrated in the database server:
  - on a higher level: GSS API (work by Richard Casella, BNL)
  - on a lower level: certificate verification (my current work)
Grid-enabling MySQL
- Tested MySQL X509 certificate authorization technology: validated with DOE, CERN and NorduGrid certificates; a potential problem with host certificates issued at CERN
- Developed solutions for MySQL security problems, adopted in MySQL 4.0.13
- Increased MySQL AB's awareness of grid computing needs
- Set up a grid-enabled server prototype for ATLAS, used in ATLAS Data Challenge 1 production for Chimera-based reconstruction
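The low-level "certificate verification" approach amounts to mapping a verified certificate subject DN onto a database account. A toy sketch of that mapping step (not the actual MySQL patch; the DN, account names and table are all hypothetical):

```python
# Map a verified X509 subject DN to a database account.
# The DN format shown is the OpenSSL-style '/key=value' string.

ACCOUNTS = {  # subject DN -> (db user, allowed database); illustrative only
    "/DC=org/DC=doegrids/OU=People/CN=Production Manager":
        ("atlas_prod", "dc1_recipes"),
}

def parse_dn(dn):
    """Split an OpenSSL-style subject string into a component dict."""
    parts = (field.split("=", 1) for field in dn.strip("/").split("/"))
    return {key: value for key, value in parts}

def authorize(dn):
    """Return the (user, database) pair for a verified subject, or None."""
    return ACCOUNTS.get(dn)

dn = "/DC=org/DC=doegrids/OU=People/CN=Production Manager"
assert parse_dn(dn)["CN"] == "Production Manager"
assert authorize(dn) == ("atlas_prod", "dc1_recipes")
```

In the real server this table would sit behind the SSL handshake: only after the certificate chain verifies against the trusted CAs does the DN lookup decide which privileges the grid client gets.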
Production Experience
Collected production experience with the grid security model:
- need to expand backward compatibility of grid proxy tools
- need to add the server purpose to grid host certificates
- need to initiate the grid proxy upon login (similar to an AFS token)
- need for shared grid certificates, similar to the privileged accounts traditionally shared in HENP computing for production, librarian, data management and database administration tasks
More information was presented at PPDG (All-hands meeting) and Grid3 (production experience reported).
Coherent Approach
MySQL simplified the delivery of the extract-transport-install components of the ATLAS database architecture (main server to replica servers, via extract & transport, then transport & install), providing the database services needed for the DC1 reconstruction on sites with Grid Compute Elements behind closed firewalls (e.g., NorduGrid).
Roadmap to Success
- ATLAS computing is steadily progressing towards a highly functional software suite, plus a worldwide computing model
- During the past year, Data Challenges have provided both an impetus and a testbed for bringing coherence to developments in all core software domains
- Several advanced Grid computing technologies were successfully tested and deployed in the ATLAS Data Challenge 1 production environment