07-June-2011 1 Database futures workshop, June 2011 CERN Enhance the ATLAS database applications by using the new Oracle 11g features Gancho Dimitrov.

Slides:



Advertisements
Similar presentations
Module 13: Performance Tuning. Overview Performance tuning methodologies Instance level Database level Application level Overview of tools and techniques.
Advertisements

Common Mistakes Developers Make By Bryan Oliver SQL Server Mentor at SolidQ.
SQL Server 2005 features for VLDBs. SQL Server 2005 features for VLDBs aka (it’s fixed in the next release)
Database Optimization & Maintenance Tim Richard ECM Training Conference#dbwestECM Agenda SQL Configuration OnBase DB Planning Backups Integrity.
Calendar Browser is a groupware used for booking all kinds of resources within an organization. Calendar Browser is installed on a file server and in a.
12 Copyright © 2005, Oracle. All rights reserved. Proactive Maintenance.
Backup Concepts. Introduction Backup and recovery procedures protect your database against data loss and reconstruct the data, should loss occur. The.
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
DB Audit Expert v1.1 for Oracle Copyright © SoftTree Technologies, Inc. This presentation is for DB Audit Expert for Oracle version 1.1 which.
CERN IT Department CH-1211 Genève 23 Switzerland t Streams new features in 11g Zbigniew Baranowski.
CERN - IT Department CH-1211 Genève 23 Switzerland t Partitioning in COOL Andrea Valassi (CERN IT-DM) R. Basset (CERN IT-DM) Distributed.
Hands-On Microsoft Windows Server 2008 Chapter 5 Configuring, Managing, and Troubleshooting Resource Access.
12 Copyright © 2007, Oracle. All rights reserved. Database Maintenance.
CERN - IT Department CH-1211 Genève 23 Switzerland t Monitoring the ATLAS Distributed Data Management System Ricardo Rocha (CERN) on behalf.
05-June CERN Oracle tutorials 2013 Database solutions to ATLAS specific application requirements (real-life examples) Gancho Dimitrov (CERN) G.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Database Performance Tuning and Query Optimization.
IT The Relational DBMS Section 06. Relational Database Theory Physical Database Design.
Database Administration TableSpace & Data File Management
Chapter Oracle Server An Oracle Server consists of an Oracle database (stored data, control and log files.) The Server will support SQL to define.
Oracle on Windows Server Introduction to Oracle10g on Microsoft Windows Server.
1 Oracle Database 11g – Flashback Data Archive. 2 Data History and Retention Data retention and change control requirements are growing Regulatory oversight.
March 19981© Dennis Adams Associates Tuning Oracle: Key Considerations Dennis Adams 25 March 1998.
Informix IDS Administration with the New Server Studio 4.0 By Lester Knutsen My experience with the beta of Server Studio and the new Informix database.
OSG Area Coordinator’s Report: Workload Management February 9 th, 2011 Maxim Potekhin BNL
The protection of the DB against intentional or unintentional threats using computer-based or non- computer-based controls. Database Security – Part 2.
1 Chapter 7 Optimizing the Optimizer. 2 The Oracle Optimizer is… About query optimization Is a sophisticated set of algorithms Choosing the fastest approach.
Stanford Linear Accelerator Center R. D. Hall1 EPICS Collaboration Mtg Oct , 2007 Oracle Archiver Past Experience Lessons Learned for Future EPICS.
CERN IT Department CH-1211 Genève 23 Switzerland t LCG Gridview / LCG SAM use cases Miguel Anjo 8 th July 2008 Database Developers’ Workshop.
1 All Powder Board and Ski Oracle 9i Workbook Chapter 9: Database Administration Jerry Post Copyright © 2003.
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
Digas Digital Archiving System. Digas is the database program used for research and fact checking in the Research Department (“Dokumentation”, ~ 60 researchers)
ATLAS Detector Description Database Vakho Tsulaia University of Pittsburgh 3D workshop, CERN 14-Dec-2004.
7 Copyright © 2005, Oracle. All rights reserved. Managing Undo Data.
1 Oracle Enterprise Manager Slides from Dominic Gélinas CIS
To Tune or not to Tune? A Lightweight Physical Design Alerter Nico Bruno, Surajit Chaudhuri DMX Group, Microsoft Research VLDB’06.
CERN - IT Department CH-1211 Genève 23 Switzerland t COOL Conditions Database for the LHC Experiments Development and Deployment Status Andrea.
08-Nov Database TEG workshop, Nov 2011 ATLAS Oracle database applications and plans for use of the Oracle 11g enhancements Gancho Dimitrov.
17-Oct CHEP conference, Amsterdam Oct 2013 Next generation database relational solutions for ATLAS Distributed Computing Gancho Dimitrov (CERN)
Copyright 2007, Information Builders. Slide 1 Machine Sizing and Scalability Mark Nesson, Vashti Ragoonath June 2008.
The Self-managing Database: Proactive Space and Schema Object Management Amit Ganesh Director, Data, Space and Transaction Processing Oracle Corporation.
Over view  Why Oracle Forensic  California Breach security Act  Oracle Logical Structure  Oracle System Change Number  Oracle Data Block Structure.
16-May ADC technical interchange meeting Tokyo, May 2013 Database aspects of ATLAS distributed computing Gancho Dimitrov (CERN) G. Dimitrov.
Distributed Logging Facility Castor External Operation Workshop, CERN, November 14th 2006 Dennis Waldron CERN / IT.
20 Copyright © 2008, Oracle. All rights reserved. Cache Management.
CERN IT Department CH-1211 Genève 23 Switzerland t COOL performance optimization using Oracle hints Andrea Valassi and Romain Basset (IT-DM)
Session id: Darrell Hilliard Senior Delivery Manager Oracle University Oracle Corporation.
Chap 5. Disk IO Distribution Chap 6. Index Architecture Written by Yong-soon Kwon Summerized By Sungchan IDS Lab
11-Nov Distr. DB Operations workshop - November 2008 The PVSS Oracle DB Archive in ATLAS ( life cycle of the data ) Gancho Dimitrov (LBNL)
ORACLE & VLDB Nilo Segura IT/DB - CERN. VLDB The real world is in the Tb range (British Telecom - 80Tb using Sun+Oracle) Data consolidated from different.
Maria Girone CERN - IT Tier0 plans and security and backup policy proposals Maria Girone, CERN IT-PSS.
Scalable data access with Impala Zbigniew Baranowski Maciej Grzybek Daniel Lanza Garcia Kacper Surdy.
Database Systems, 8 th Edition SQL Performance Tuning Evaluated from client perspective –Most current relational DBMSs perform automatic query optimization.
Introduction to Utilities for New DBAs Session #332 4/19/2008 Erik Hobbs Introduction to Utilities for New DBAs.
OSG Area Coordinator’s Report: Workload Management February 9 th, 2011 Maxim Potekhin BNL
October 15-18, 2013 Charlotte, NC Accelerating Database Performance Using Compression Joseph D’Antoni, Solutions Architect Anexinet.
SQL Server Statistics DEMO SQL Server Statistics SREENI JULAKANTI,MCTS.MCITP,MCP. SQL SERVER Database Administration.
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
9 Copyright © 2004, Oracle. All rights reserved. Getting Started with Oracle Migration Workbench.
Session Name Pelin ATICI SQL Premier Field Engineer.
10 Copyright © 2007, Oracle. All rights reserved. Managing Undo Data.
Data, Space and Transaction Processing
Oracle structures on database applications development
How To Pass Oracle 1z0-060 Exam In First Attempt?
Elizabeth Gallas - Oxford ADC Weekly September 13, 2011
Common Analysis Framework, June 2013
Case studies – Atlas and PVSS Oracle archiver
Hitting the SQL Server “Go Faster” Button
Data Lifecycle Review and Outlook
How to Thrive as a DBA in an Oracle10g World
Index Index.
Presentation transcript:

07-June Database futures workshop, June 2011 CERN Enhance the ATLAS database applications by using the new Oracle 11g features Gancho Dimitrov

07-June Outline  The ATLAS databases based on the Oracle backend  Some interesting facts about the ATLAS databases content at CERN  New Oracle database 11g features for development that I consider applicable for ATLAS  Conclusions Gancho Dimitrov

07-June The ATLAS databases Gancho Dimitrov At the T0 site  ATONR - database for the ATLAS detector online datataking  ATLR - database for the offline (post datataking) analysis  ADCR - database for the grid jobs and files distribution managent  ATLARC - database for the ATLAS event metadata (TAGS) and also a place for archiving the old data from all database accounts

07-June Interaction database admininstrators developers Gancho Dimitrov 70 ATLAS developers 2 ATLAS DBAs 8 PhyDB DBAs - The database developers community in ATLAS is a large one. About 70 people are responsible for different database applications. - To help them, ATLAS has a team of two DB application administrators with more privileges on the databases level. - The mentioned above people rely on the database service provided from the PhyDB DBAs team. - And for having a success, we all need to work interactively

07-June The ATLAS databases at CERN (facts) Gancho Dimitrov Database / Statistic ATONRATLRADCRATLARC # DB schemes (incl 23 arch) # tables - Non partitioned - Partitioned Volume - Table segments - Index segments - LOB segments 2,14 TB 2,15 TB 0,3 TB 4 TB 5,2 TB 0,4 TB 4,1 TB 2,1 TB (incl. IOTs) 0,3 TB 3,1 TB 7,5 TB - Largest table in # rows 1 billion rows in a range partitioned table (65 GB) 1,2 billion rows in non- partitioned table (18 GB) 2,1 billion rows in a range partitioned table (650 GB) 160 million rows in partitioned table (20 GB) Note: the data is compressed

07-June Toward the newest Oracle DB version …  The ATLAS databases need to support different type of applications and access patterns and they do it well, but still there are many annoying issues that need to be addresed. Some of them are : - Unaccurate or stale statistics on the tables cause non-optimal and resource wasting access plans (some of the large tables cannot get even analized) - Sudden and undesirable execution plan changes - Lack of intrumentation in changes validation against a real production workload on a test system  I hope we really be happier with 11g and in the next slides will mention some of the 11g features that sound promising Gancho Dimitrov

07-June (1) Object statistics gathering 1. Incremental for the partitioned tables In 10g we have problems with statistics gathering for large partitioned tables. We had to lock the stats for some of them as practically the process was never ending (MDT_DCS, ProdSys and DQ2 tables) 2. Configurable stale percent threshold In 10g it is fixed to 10 % and this causes for the large tables, the stats gathering kick in to happen more and more saldom ( PanDa, DQ2, PVSS and other large systems) 3. Multicolumn statistics: Oracle will be aware of the logical relation on a set of coulumns. In 10g the lack of such was causing not optimal data set joins to take place 4. Statistics can be published manually after an ‘approval’ from the DBA. Gancho Dimitrov

07-June (1) Object statistics gathering (cont.) 5. Statistics on expressions For example on the UPPER (dataset_name) from the DQ2 system 6. The issue with stale statistics for the max value on columns of type DATE or TIMESTAMP is real problem for which there seem to be not a solution in 11g. Workaround: In ATLAS we will stick to our home made solution on updating the relevent statistics values regularly via a daily job. Gancho Dimitrov

07-June (2) SQL plan management 1. Bind-aware peeking Oracle may construct and sustain several execution plans that reflect the bind values In 10g it was a single plan which in many cases was not appropriate, because the data in is not evenly distributed ( a lot of problems with the DQ2 system) 2. SQL plan baselines (the idea is to avoid SQL plan degradation) I do not believe that we will set automatic capture, evaluation and evolving of the plans (the default setting in ‘manual’) as this sounds like bringing more instability to the system. For some particular use cases we could use this new feature for manually fixing the execution plan (with other words, forbid plan evolving for good) Gancho Dimitrov

07-June (3) Automatic partitions creation 1. Automatic partition creation for tables with range based partitioning on columns of type DATE/TIMESTAMP or NUMBER In 10g, the partitions had to be pre-created from the developer and had to keep in mind to create new ones with the time (the MDT_DCS, PanDA, ProdSys, Dq2 and other systems ) Currently for the PanDa schema, the sliding window with daily partitions is home made solution of mine using DBMS_SCHEDULER jobs (details in the next slide) Gancho Dimitrov

The PanDA ( jobs submission system) ‘live’ and ‘archive’ data 07-June ATLAS_PANDA.JOBSARCHIVED4 table: partitioned on a ‘modificationtime’ column. Each partition covers a time range of a day (the same is applied for other tables as well) ATLAS_PANDAARCH.JOBSARCHIVED: range partitioned based on the ‘modificationtime’ column. Inserts the data of the last complete day A scheduler job daily Filled partitions Empty partitions relevant to the future. A job is scheduled to run every Monday for creating seven new partitions Partitions that can be dropped. An Oracle scheduler job is taking care of that daily Gancho Dimitrov Panda server Time line Important: For being sure that job information will be not erased without being copied in the PANDAARCH schema, a special verification PLSQL procedure is taking place before the partition drop event!

07-June (4) Result set caching 1. Caching of the result set of queires, sub-queries and PL/SQL functions. A quite appealing new feature. However I do not see it as being activated for all quieries, but rather per specific use cases. A particular interest could for the DQ2, PanDa and TAGs systems As you know, the best metric to consider when tuning queries, is the number of Oracle block reads. Queries for which result has been gotten from the result cache, would show ‘block reads = 0’ Gancho Dimitrov

07-June (5) Less space wastage 1. Disk space can be re-claimed by setting index partitions as unusable Now in the production we have couple of use cases with partitioned indexes on DATE or TIMESTAMP columns that are there only to satisfy searches in the most recent rows. By setting the older partitions as unusable, Oracle scratches the relevent segment and frees the space in the tablespace 2. Data compression. Can be activated for the normal (conventional) insertion (OLTP). Would be benficial for systems as PVSS, PanDA, DQ2,, Alert log messing systems and few others … The advantage is not just saving space, but the real one is saving IO by having fewer block read when quering the data. In 10g, compression was only possible for bulk insert operation Gancho Dimitrov

07-June (6) Flashback data archive 1. Complete history of the changes can be kept in an archive with a desired retention. Thus, the client can have a data snapshot representation from any given time in the past. A good candidate for using this is the ATLAS authorship qualification list and probably DQ2 systems Gancho Dimitrov

07-June (7) And few more … 1. Invisible indexes (‘invisible’ because the Optimizer does not see it) The importance of an index (used or not used at all) can be verified by playing with its visibility from the Oracle Oprimizer 2. Read only tables 3. Virtual Columns (computed on a fly from a user defined expression ) 4. Resuming suspended statements Gancho Dimitrov

07-June Conclusions  From developer’s point of view there are couple of new features in the Oracle 11g that sound appealing.  Plus, there is: - More flexibility - More instrumentation to intervene - More sophisticated monitoring and testing tools  The next months would show us what from those we can really put in production in  Would be good to know the plans of the other DB groups and experiments and the experience they have got so far with 11g! Gancho Dimitrov