Database Access Patterns in ATLAS Computing Model. G. Gieraltowski, J. Cranshaw, K. Karr, D. Malon, A. Vaniachine (ANL); P. Nevski, Yu. Smirnov, T. Wenaus.


Database Access Patterns in ATLAS Computing Model
G. Gieraltowski, J. Cranshaw, K. Karr, D. Malon, A. Vaniachine (ANL)
P. Nevski, Yu. Smirnov, T. Wenaus (BNL)
N. Barros, L. Goossens, R. Hawkings, A. Nairz, G. Poulard, Yu. Shapiro, F. Zema (CERN)
XV International Conference on Computing in High Energy and Nuclear Physics
T.I.F.R., Mumbai, India, February 13-17, 2006

CHEP06, Mumbai, India, February 13-17, 2006. Alexandre Vaniachine (ANL)

Outline
1) Emphasis on the early days of LHC running:
   - Calibration/Alignment is a priority
     - Must be done before the reconstruction start
   - ATLAS 2006 Computing System Commissioning:
     - Calibration/Alignment procedures are included in acceptance tests
2) Real experience in prototypes and production systems
   - General issues encountered:
     - Increased fluctuations in database server load
     - Connections count limitations
3) Development of the ATLAS distributed computing model:
   - Server-side developments:
     - Deployment: LCG3D Project and OSG Edge Services Framework Activity
     - Technology: Grid-enabled server technology (Project DASH)
   - Application-side technology developments:
     - Deployment: Integration with Production System database (Conditions data slices)
     - Technology: ATLAS Database Client Library (now adopted by COOL/POOL/CORAL)

ATLAS Computing Model
- In the ATLAS Computing Model, widely distributed applications require access to terabytes of data stored in relational databases
- A realistic database services data flow, including Calibration & Alignment, is presented in the Computing Technical Design Report
- Preparations are on track towards Computing System Commissioning to exercise a realistic database data flow

ATLAS CSC Goals
- 2006 is the year of the ATLAS Computing System Commissioning (CSC)
- The first goal of the CSC is to exercise calibration and alignment procedures
- ConditionsDB is included in CSC acceptance tests

Towards the Early Days of LHC Running
- Calibration/Alignment is a priority
  - Must be done before the reconstruction start
- Calibration/Alignment is a part of the overall Computing System Commissioning activity to:
  - Demonstrate the calibration ‘closed loop’: iterate and improve reconstruction
  - Exercise the conditions DB access and distribution infrastructure
  - Encourage development of subdetector calibration algorithms
- Initially focussed on ‘steady-state’ calibration
  - Assuming required samples are available and can be selected
- Also want to look at initial 2007/2008 running at low luminosity

Calibration Data Flow
[Figure: calibration data flow diagram]

Prerequisites for Success
- Simulation: ability to simulate a realistic, misaligned, miscalibrated detector
- Reconstruction: use of calibration data in reconstruction; ability to handle time-varying calibration
- Calibration Algorithms: algorithms in Athena, running from standard ATLAS data
- Data Preparation: organisation and bookkeeping (run number ranges, production system,…)

Production System Enhancements
- To prepare for new challenges, the first ATLAS Database Services Workshop was organized in December
- Among the Workshop recommendations:
  - A tighter integration of the production system database, task definition, Distributed Data Management and conditions data tags
- Implementation opportunities:
  - Distribute (push) snapshots via pacman
  - Use DDM for large payload files
  - Try Oracle 10g file management for external files
  - Expand the existing ServersCatalog with top tags

ATLAS DB Applications
- In preparation for data taking, the ATLAS experiment has run a series of large-scale computational exercises to test and validate multi-tier distributed data grid solutions under development
- Real experience in prototypes and production systems was collected with three major ATLAS database applications:
  - Geometry DB
  - Conditions DB
  - TAG databases
- ATLAS computational exercises run on a world-wide federation of computational grids

Data Mining of Operations
- Data mining of the collected operations data reveals a striking feature: a very high degree of correlation between failures:
  - If a job submitted to some cluster failed, there is a high probability that the next job submitted to that cluster will fail too
  - If the submit host failed, all the jobs scattered over different clusters will fail too
- Taking these correlations into account is not yet automated by the grid middleware
- That is why production databases and grid monitoring data, which provide immediate feedback on the Data Challenge operations to the production operators, are very important for efficient utilization of Grid capacities
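
The failure correlation described on this slide can be checked directly from an operations log. The sketch below is illustrative only: the record layout (cluster name and success flag, in submission order) and the function name `repeat_failure_rate` are assumptions, not part of the ATLAS production database schema. It compares the probability that a job fails right after a failure on the same cluster with the overall failure rate.

```python
def repeat_failure_rate(jobs):
    """Compare failure-after-failure probability with the overall failure rate.

    jobs: iterable of (cluster, succeeded) tuples in submission order
          (a hypothetical record layout chosen for this sketch).
    Returns (conditional, overall); either value is None when undefined.
    """
    last_failed = {}      # cluster -> did the previous job on it fail?
    after_fail = 0        # jobs submitted right after a failure on the same cluster
    fail_after_fail = 0   # ... of which also failed
    total = 0
    failures = 0
    for cluster, succeeded in jobs:
        failed = not succeeded
        total += 1
        failures += failed
        if last_failed.get(cluster):
            after_fail += 1
            fail_after_fail += failed
        last_failed[cluster] = failed
    conditional = fail_after_fail / after_fail if after_fail else None
    overall = failures / total if total else None
    return conditional, overall


if __name__ == "__main__":
    log = [("a", False), ("a", False), ("a", False),
           ("b", True), ("b", True), ("a", True)]
    print(repeat_failure_rate(log))
```

A conditional rate well above the overall rate is the signature of correlated failures, and an operator (or a scheduler) could use it to stop sending jobs to a cluster after its first failure.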

Production Rate Growth and Daily Fluctuations
[Figure: production rate growth and daily fluctuations in 2005; annotation: Database Capacities Bottleneck]

Lessons Learned
- Among the lessons learned is the increase in fluctuations in database server workloads due to the chaotic nature of grid computations
- The observed fluctuations in database access patterns are of a general nature and must be addressed through services enabling dynamic and flexibly managed provisioning of database resources
- In many cases the connections count happens to be the limiting resource

Opportunistic Grids
- Campus computing grids like GLOW utilize spare cycles to run jobs
  - The resource owner has priority
  - ATLAS jobs are often put into hibernation
- Thus optimal jobs are shorter, i.e. only a few events
  - This results in an order of magnitude more frequent database access
- Jobs put into hibernation during the initialization phase overload CERN database resources by keeping database connections open for days
- This problem was resolved by deploying dedicated replica servers in the US and at CERN to support the GLOW grid
- In comparison to production grids, opportunistic grids require extra development and support efforts that are not sustainable in the long run
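
The slide reports that the hibernation problem was resolved with dedicated replica servers. A complementary application-side idea is to read everything a job needs in one short step and close the connection before event processing begins. The sketch below illustrates that idea under stated assumptions and is not the ATLAS implementation: Python's built-in `sqlite3` stands in for the real database server, and the `conditions` table, its columns, and both function names are hypothetical.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def make_demo_db():
    """Create a throwaway SQLite file with a hypothetical 'conditions' table."""
    path = os.path.join(tempfile.mkdtemp(), "conditions_demo.db")
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE conditions (run INTEGER, name TEXT, value REAL)")
        conn.execute("INSERT INTO conditions VALUES (1, 'gain', 1.5)")
        conn.commit()
    return path


def load_conditions(db_path, run_number):
    """Fetch every value the job needs, then close the connection immediately."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name, value FROM conditions WHERE run = ?", (run_number,)
        ).fetchall()
    # The connection is closed at this point, so a job that is suspended
    # later in its event loop holds no connection slot on the server.
    return dict(rows)


if __name__ == "__main__":
    print(load_conditions(make_demo_db(), 1))
```

A job hibernated in the middle of `load_conditions` would still hold a connection, so this shortens the exposure window without eliminating it.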

Client Library
- To improve robustness of database access in a data grid environment, we developed an application-side solution: a software component abstracting the database and/or middleware connectivity concerns in a generalized Database Client Library

Server Indirection
- One of the lessons learnt in ATLAS Data Challenges is that the database server address should NOT be hardwired in data processing transformations
- The logical-physical indirection for database servers is now introduced in ATLAS
  - Similar to the logical-physical file indirection of the Replica Location Service in the Grid file catalogs
  - Supported by the ATLAS Client Library
  - Now adopted by the LHC POOL project
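
The logical-physical indirection can be pictured as a lookup table from a logical database name to an ordered list of physical replicas, with the client trying each in turn. The following is a minimal sketch of that idea only; it is not the ATLAS Database Client Library API. The catalog contents, the host names, and the functions `connect_logical` and `fake_connect` are all hypothetical.

```python
def connect_logical(logical_name, catalog, connect):
    """Resolve a logical database name and connect to the first replica that answers.

    catalog: dict mapping logical names to ordered lists of physical servers.
    connect: callable taking a server address; raises ConnectionError on failure.
    Returns (server, connection).
    """
    errors = []
    for server in catalog.get(logical_name, []):
        try:
            return server, connect(server)
        except ConnectionError as exc:
            errors.append(f"{server}: {exc}")  # remember why this replica was skipped
    raise ConnectionError(f"no replica available for {logical_name!r}: {errors}")


# Hypothetical catalog: transformations refer only to the logical name.
CATALOG = {"atlas_conditions": ["db1.example.org", "db2.example.org"]}


def fake_connect(server):
    """Stand-in for a real driver call; the first replica is 'overloaded'."""
    if server == "db1.example.org":
        raise ConnectionError("too many connections")
    return f"connection to {server}"


if __name__ == "__main__":
    print(connect_logical("atlas_conditions", CATALOG, fake_connect))
```

Because the job names only `atlas_conditions`, operators can add, remove, or reorder replicas in the catalog without touching the data processing transformations.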

Tier-0 Operations
- In addition to distributed operations, ATLAS database services are relevant to local CERN data taking operations, including the conditions data flow of ATLAS Combined Test Beam operations, prototype Tier-0 scalability tests and event tag database operations
[Figure labels: Data acquisition programs; Online server (atlobk01); Offline server (atlobk02); Browsing applications, Athena programs (other browsing applications); DB replication; OBK DBs, Test DBs, CondDB, CTB DBs, NOVA DBs, POOLcat]

TAG Database Access
- TAG replication is a part of the SC4 Tier-0 test
  - Loading TAGs into the relational database at CERN
  - Replicating it using Oracle Streams from Tier-0 to Tier-1s and to Tier-2s
- Also, as an independent test, using TAG files that have already been generated

Participation in LCG 3D
- ATLAS is fully committed to using the Distributed Database Deployment infrastructure developed in collaboration with the LCG 3D Project

Participation in OSG ESF
- US ATLAS is participating in the OSG Edge Services Framework Activity to enhance the traditional database services infrastructure deployed in 3D with dynamic database services deployment capabilities

Project DASH
- To grid-enable the MySQL database server, ATLAS is participating in Project DASH
- A new collaborative project has just started at Argonne to grid-enable the PostgreSQL database
- Both projects target integration with OGSA-DAI
- Please contact us if you are interested in contributing to these projects

Conclusions
- As grid computing technologies mature, development must focus on database and grid integration
  - New technologies are required to bridge the gap between data accessibility and the increasing power of grid computing used for distributed event production and processing
  - Changes must happen both on the server side and on the client side
- Server technology
  - Must support dynamic deployment of capacities
  - Must support replication on a lower granularity level: Conditions DB slices
  - Must be coordinated with the production system
  - Must support grid authorization (Project DASH)
- Client technology
  - Must support database server indirection
  - Must support a coordinated client-side solution: ATLAS Database Client Library (now a part of COOL/POOL/CORAL)