ADC technical interchange meeting, Tokyo, May 2013
Database aspects of ATLAS distributed computing
Gancho Dimitrov (CERN)


Outline
 Current ATLAS database topology, HW specifications, metrics
 PanDA accounts, roles and privileges organization
 PanDA data flow and management, volume and trends, improvements
 New developments: JEDI (Job Execution and Definition Interface) and the Rucio data management system
 Newer hardware specs for the ADCR database
 Conclusions

ATLAS databases topology
The ADC applications PanDA, DQ2, LFC, PS, AKTR and AGIS are hosted on the ADCR database.
ADCR database cluster HW specifications:
 4 machines
 2 quad-core Intel 2.53 GHz CPUs per machine
 48 GB RAM
 10 GigE for storage and cluster access
 NetApp NAS storage with 512 GB SSD cache

PanDA job management system: accounts, roles, privileges
 The PanDA server is allowed to modify the PANDA, PANDAMETA and PANDAARCH schemas
 The PanDA monitor is allowed to modify only the PANDAMETA data
 DEfT (Database Engine for Tasks) can modify only its own tables via direct privileges (PanDA gets a direct write privilege on a single DEfT table)
 All owner accounts grant the SELECT privilege on their objects to ATLAS_PANDA_READROLE
 All application writer and reader accounts get read privilege on all objects by being granted ATLAS_PANDA_READROLE
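The privilege scheme above can be sketched in Oracle SQL as follows. Only the role name ATLAS_PANDA_READROLE comes from the slide; the account and table names are illustrative assumptions.

```sql
-- Sketch of the role-based read access described above; account and
-- table names are assumptions, only ATLAS_PANDA_READROLE is from the slide.
CREATE ROLE atlas_panda_readrole;

-- Each owner account exposes its objects read-only through the role:
GRANT SELECT ON atlas_panda.jobsactive4 TO atlas_panda_readrole;

-- Writer and reader accounts obtain read access on all objects
-- indirectly, by being granted the role itself:
GRANT atlas_panda_readrole TO atlas_panda_writer;
GRANT atlas_panda_readrole TO atlas_panda_reader;

-- DEfT gives the PanDA server a direct write privilege on one table:
GRANT INSERT, UPDATE ON atlas_deft.t_task TO atlas_panda_server;
```

Granting reads through a single role keeps the privilege set auditable: adding a new reader account is one GRANT, not one per table.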

PanDA data organization
 ATLAS_PANDA: the JOB, PARAMS, META and FILES tables are partitioned on the 'modificationtime' column; each partition covers a time range of one day
 ATLAS_PANDAARCH: the archive tables are partitioned on the same 'modificationtime' column; some tables have partitions each covering a three-day window, others a time range of a month
 A scheduler job runs daily and copies into the archive the data of the last complete day. Another job runs every Monday and creates seven new empty partitions for the coming week, while a daily Oracle scheduler job drops the partitions that have fallen out of the sliding window
Certain PanDA tables have a defined data sliding window of three days, others of 30 days. Data removal at partition level happens only after the data has been copied to the PANDAARCH schema. This natural approach has proven adequate and not resource demanding!
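The daily range partitioning and the Monday partition pre-creation described above can be sketched as follows; the table layout, partition names and the add_daily_partitions procedure are assumptions, only the one-day-per-partition scheme and the weekly schedule come from the slide.

```sql
-- One range partition per day on 'modificationtime' (names illustrative):
CREATE TABLE jobsactive4_sketch (
  pandaid          NUMBER        NOT NULL,
  jobstatus        VARCHAR2(15),
  modificationtime DATE          NOT NULL
  -- ... further job attributes ...
)
PARTITION BY RANGE (modificationtime) (
  PARTITION p_20130513 VALUES LESS THAN (TO_DATE('2013-05-14','YYYY-MM-DD')),
  PARTITION p_20130514 VALUES LESS THAN (TO_DATE('2013-05-15','YYYY-MM-DD'))
);

-- Every Monday, pre-create the seven empty partitions for the coming
-- week (add_daily_partitions is a hypothetical helper procedure):
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'ADD_WEEK_OF_PARTITIONS',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN add_daily_partitions(7); END;',
    repeat_interval => 'FREQ=WEEKLY; BYDAY=MON',
    enabled         => TRUE);
END;
/
```

Dropping a whole partition is a cheap metadata operation, which is why the sliding-window cleanup is "not resource demanding" compared with row-by-row DELETEs.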

Trend in number of daily PanDA jobs (plots covering Jan–Dec 2011, Jan–Dec 2012 and Jan–Apr 2013)

Trend in PanDA segments growth: Jan–Dec 2011: 1.3 TB; Jan–Dec 2012: 1.7 TB; Jan–Apr 2013: 0.7 TB (after the removal of unnecessary indexes)

Improvements on the current PanDA system
 Despite the increased activity on the grid, thanks to several tuning techniques the server resource usage stayed in the low range, e.g. CPU usage in the range 20-30%
 Thorough studies on the WHERE clauses of the PanDA server queries resulted in:
- revision of the indexes: removal, or replacement with more appropriate multi-column ones
- weekly index maintenance (rebuilds triggered by a scheduler job) on the tables with high transaction activity
- query tuning
- an auxiliary table for the PanDA ID => modification time mapping
 In the light of Rucio and DQ2, PanDA now works with files and datasets with a defined scope
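A minimal sketch of the two index-related measures, with illustrative index and column names (the actual indexes are not named on the slide):

```sql
-- Replace single-column indexes with one multi-column index that
-- matches the WHERE clause of the frequent queries (names illustrative):
DROP INDEX jobs_produsername_idx;
DROP INDEX jobs_jobstatus_idx;
CREATE INDEX jobs_user_status_idx
  ON jobsactive4 (produsername, jobstatus);

-- Weekly maintenance: rebuild indexes on high-transaction tables
-- without blocking concurrent DML:
ALTER INDEX jobs_user_status_idx REBUILD ONLINE;
```

A composite index lets one range scan answer a query that would otherwise combine two single-column indexes, and periodic rebuilds reclaim space in indexes fragmented by heavy insert/delete churn.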

PanDA complete archive
 The full PanDA archive now hosts information on 800 million jobs: all jobs since the job system started in 2006
 Studies on the WHERE clauses of the PanDA monitor queries resulted in:
- revision of the indexes: replacement with more appropriate multi-column ones, so that fewer table blocks get accessed
- dynamic re-definition of time-range views, in order to protect the large PanDA archive tables from queries with unconstrained time ranges and to avoid time comparison at row level. Each view comprises the set of partitions covering a range of 7, 15, 30, 60, 90, 180 or 365 days. These views are shifting windows with defined ranges, and the underlying partition list is updated automatically once a day by a scheduler job.
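One way the shifting views can be realized, consistent with the description above, is a UNION ALL over an explicit partition list that a daily scheduler job re-creates; the table and partition names below are illustrative assumptions.

```sql
-- A '7 days' view built only from the daily partitions that currently
-- fall inside the window; Oracle then never touches the older
-- partitions, and no per-row time comparison is needed:
CREATE OR REPLACE VIEW jobsarchived_last7days AS
  SELECT * FROM jobsarchived PARTITION (p_20130510)
  UNION ALL
  SELECT * FROM jobsarchived PARTITION (p_20130511)
  -- ... one branch per remaining day in the window ...
  UNION ALL
  SELECT * FROM jobsarchived PARTITION (p_20130516);
```

The monitor queries the view instead of the base table, so even a query without a time predicate is physically confined to the window's partitions.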

Potential use of ADG from the PanDA monitor
 The ADCR database has two standby databases:
- a Data Guard for disaster recovery and backup offloading
- an Active Data Guard (ADCR_ADG) serving as a read-only replica
 The PanDA monitor can benefit from the Active Data Guard (ADG) resources. The plan is for the PanDA monitor to sustain two connection pools:
- one to the primary database ADCR
- one to ADCR's ADG
The idea is that queries spanning time ranges larger than a certain threshold are resolved from the ADG, where we can afford several parallel slave processes per user query.

New development: JEDI (a component of PanDA)
 JEDI is a new component of the PanDA server which dynamically defines jobs from a task definition. The main goal is to make PanDA task-oriented.
 The tables of the initial relational model of the new JEDI schema (documented by T. Maeno) complement the existing PanDA tables on the INTR database
 Ongoing activities:
- understand the new data flow, requirements and access patterns
- studies for the best possible data organization (partitioning) from the manageability and performance points of view
- address the requirement of storing information at event level for keeping track of the active jobs' progress; this data will be transient, but writing and reading it must be highly optimised
- get to the most appropriate physical implementation of the agreed relational model
- tests with a representative data volume

JEDI database relational schema

Transition from the current PanDA schema to the new one
 The idea is for the transition from the current PanDA server to the new one, including its DB backend objects, to be transparent to the users.
 The JEDI tables are complementary to the existing PanDA tables. The current schema and the PANDA => PANDAARCH data copying should not change much.
 However, relations between the existing and the new sets of tables have to exist. In particular:
- a relation between JEDI's Task and PanDA's Job, via a foreign key in all JOBS* tables to the JEDI_TASKS table
- a relation between JEDI's Work queue (for different shares of workload) and PanDA's Job, via a foreign key in all JOBS* tables to the JEDI_WORK_QUEUE table
- a relation between JEDI's Task, Dataset and File (new sequential ID) and the PanDA Job processing the file (or a fraction of it), via a foreign key in the FILESTABLE4 table to the JEDI_DATASET_CONTENTS table (when a task tries a file multiple times there are multiple rows in PanDA's FILESTABLE4, while there is only one parent row in JEDI's DATASET_CONTENTS table)
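The three relations above can be sketched as foreign keys; the table names come from the slide, but the column names are assumptions for illustration.

```sql
-- Link each PanDA job to its JEDI task and work queue (column names
-- are hypothetical, modeled on the table names given above):
ALTER TABLE jobsactive4
  ADD CONSTRAINT jobs_jeditaskid_fk FOREIGN KEY (jeditaskid)
  REFERENCES jedi_tasks (jeditaskid);

ALTER TABLE jobsactive4
  ADD CONSTRAINT jobs_workqueue_fk FOREIGN KEY (workqueue_id)
  REFERENCES jedi_work_queue (queue_id);

-- Many file attempts in FILESTABLE4 may reference one parent row in
-- JEDI_DATASET_CONTENTS (a many-to-one relation):
ALTER TABLE filestable4
  ADD CONSTRAINT files_jedi_contents_fk FOREIGN KEY (fileid)
  REFERENCES jedi_dataset_contents (fileid);
```

Declaring the relations as foreign keys lets the database enforce the integrity mentioned in the test plan, rather than leaving it to the application.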

Steps in the JEDI database tests
 Step 1: export the ATLAS_PANDA schema and data from production and import them on the INTR testbed database - done
 Step 2: create the new set of JEDI tables in the imported schema; alter tables where necessary; set relations between the existing tables and the new ones, enforcing integrity where possible - done
 Step 3: DEfT (Database Engine for Tasks) schema validation; test the JEDI and DEfT interaction and access rights
 Step 4: validate the access patterns to the new objects; study the efficiency of the chosen partitioning and index strategy; reach agreement on the data archiving policy
 Step 5: if needed, introduce changes to the PANDA => PANDAARCH data flow; activate the scheduler jobs responsible for the data copying and the maintenance of the PANDA sliding window
 Step 6: test with a representative workload and repeat steps 2-5 until reaching an acceptable state
 Step 7: import the PANDA, PANDAARCH and PANDAMETA data from production; repeat step 2 to validate the SQL script again; perform a pre-production test with all PanDA, JEDI and DEfT components

New development: the Rucio system
Rucio is the new ATLAS file management system, meant as the successor of DQ2. The validation performed on the first Rucio relational schema, initially created by the DDM team, resulted in:
- DB objects (tables, constraints, indices, triggers, sequences) named according to an agreed naming convention, so that each object's role and its relations to the others are easy to identify
- studies of different partitioning techniques; a promising one is list partitioning based on a virtual column: 'scope' concatenated with the data type (file, dataset, container, del_file, del_dataset, del_container). A home-made logic for automatic partition creation was tested and showed to be reliable
- special check constraint logic for enforcing important policies
- explored techniques for guaranteeing the uniqueness of any object name within each scope for the full lifetime of the system
- a test with a representative data volume is still needed
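The virtual-column list partitioning and the per-scope name uniqueness described above can be sketched like this; all table, column and partition names, as well as the single-letter type codes, are illustrative assumptions.

```sql
-- List partitioning on a virtual column concatenating scope and data
-- identifier type (names and codes are hypothetical):
CREATE TABLE rucio_dids_sketch (
  scope      VARCHAR2(25)  NOT NULL,
  name       VARCHAR2(255) NOT NULL,
  did_type   CHAR(1)       NOT NULL,  -- e.g. 'F' file, 'D' dataset, 'C' container
  scope_type VARCHAR2(30)
      GENERATED ALWAYS AS (scope || '#' || did_type) VIRTUAL,
  -- object names must stay unique within a scope for the full
  -- lifetime of the system:
  CONSTRAINT dids_scope_name_uq UNIQUE (scope, name)
)
PARTITION BY LIST (scope_type) (
  PARTITION p_data13_files    VALUES ('data13#F'),
  PARTITION p_data13_datasets VALUES ('data13#D'),
  PARTITION p_catch_all       VALUES (DEFAULT)
);
```

Because the partition key is derived from 'scope' and 'did_type', queries that filter on both are pruned to a single partition, while the home-made partition creation logic only has to add a new list value when a new scope appears.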

New development: Rucio DB relational schema. The tables are partitioned by scope and data identifier type.

New HW for the ADCR cluster
In Q4 of 2013 the ADCR hardware will be replaced with newer machines:
=> CPU: 2 x 6 cores of a newer type (about 2x more CPU power than now)
=> RAM: 128 GB (3x more than now, great for caching more data blocks)
=> 10 GigE for storage and cluster access (same as now)
=> Storage: newer-generation, higher-spec NetApp NAS with more SSD cache and more RAM (about 2x the performance for random reads and writes)

What’s next?
 The new HW will give room for increased performance and throughput.
 However, a big challenge and responsibility for many of us is to design and develop DB applications with scalability in mind, to serve an ever-increasing workload for many years ahead.
 A much tighter collaboration between SW developers, DB application admins and DB infrastructure DBAs is needed.

Conclusions
 The ADCR database resource usage stayed in the green zone for the whole of 2012 and Q1 of 2013. All of that is due to the hard work of many experts on keeping the DB HW and SW in good state, the use of new Oracle 11g features in production, and the improvements and maturity of the ADCR applications.
 In focus now is the development of a few new systems. Collaborative work is much appreciated.
THANK YOU!