Database Services for Physics @ CERN: Deployment and Monitoring

Presentation transcript:

Database Services for Physics @ CERN: Deployment and Monitoring
Radovan Chytracek, CERN IT Department

Outline
- Database services for physics
- Status today
- How do we run the services tomorrow?
- Performance tuning
- Monitoring
- Oracle monitoring sensors
- Status of monitoring
- Conclusions
1. Describe mandate 2. Experiments’ SW 3. GRID service challenges 4.
18 November 2018 Radovan Chytracek, CERN IT Department

Database Services for Physics
- Mandate
  - DB application deployment, DB administration, consulting
  - Data challenges, data distribution between CERN and T1 centers (3D project)
- Scope
  - LHC and non-LHC experiments: ATLAS, CMS, LHCb, ALICE, plus COMPASS and HARP
  - LHC Grid applications: FTS, LFC, VOMS, GridView
- Preparation activities for LHC start-up increase the requirements on the DB service
  - Number of DB servers, data volume, availability, scalability, automated deployment procedures
- Having a scalable and reliable service is the priority

Database Services for Physics Today
- HW infrastructure
  - Oracle Sun cluster
  - Set of single DB instances
- Sun cluster overloaded
  - Difficult to isolate existing applications
  - Not the fastest storage
  - Still on Oracle 9i
- Maintenance issues for single DB instances
  - Many were run as a stop-gap to off-load the Sun cluster; now phased out
  - No load balancing or fail-over
  - Complex maintenance and backup

Towards a Scalable Service for LHC
- Deploying Oracle 10g RAC on Linux
  - Isolation (10g services), scalability (CPU and storage), reliability (failover), manageability (easier to administer)
- Coordinating the work plan across several IT groups
  - Hardware now in place and acceptance tested
  - RAC configuration and functionality tests in progress
  - Working on automated DB server installation, integrated with the software installation tools used for the OS (thanks to IT/FIO)
- Setting up several RAC systems
  - 4 x 2-node RAC for LHC experiments
  - 2-node integration + 4-node testing RAC
- Migrating applications from the Sun cluster to RAC by the end of 2005

Oracle RAC Architecture

Steps Towards a Reliable Service
- Well-defined, pro-active deployment process
  - Proper planning of database capacity (volume and CPU)
  - Ensure that key applications are optimized before production starts
- Classified database application types
  - Resource-consuming applications
    - Guarantee of resources
    - Start low, increase as needed
  - Standard applications
    - Smaller database applications which can run in a shared service
- Layered service implemented
  - Development Service (code development, low data volumes, no backup)
  - Integration and Validation Service (for key apps)
    - Enough resources for larger tests, consulting available, booking 2 months in advance
  - Production Service
    - Full production-quality service (backup, monitoring, on-call service)
    - Monitoring to detect new resource-consuming applications or changes in access patterns

Performance tuning
- Constant fight on three front lines
  - HW (CPU, network, storage)
  - Server side (OS, DB, schema design)
  - Client side (bugs, wrong practices, queries)
- HW can be improved by better “iron”
- SW should be safe by not making mistakes
  - New or upgraded apps have the same or new bugs
  - Good schema design is often difficult
  - Following good practices seems to be a tough job too
- DBAs are inevitable
  - Spit out and analyze the bad things and give advice 24/7

Tracing
- Server side
  - Various levels; session tracing is the most used one
  - Must ship the server trace file back to the user
    - Security issues, some development effort required
  - Supported by LCG SW (POOL Oracle plug-in)
- Client side
  - Required to make the whole picture complete
  - Does not exist out of the box
    - Application code instrumentation needed
    - Often connected to monitoring systems
  - Support being built into LCG SW
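Client-side tracing through application code instrumentation can be illustrated with a small timing wrapper. This is only a sketch, not the LCG or POOL implementation: the names `traced`, `TraceRecord`, `TRACE_LOG` and `fetch_rows` are hypothetical, and a real setup would forward the records to a monitoring system instead of keeping them in a list.

```python
import functools
import time
from typing import Callable, List, NamedTuple


class TraceRecord(NamedTuple):
    label: str      # name chosen by the developer for the traced call
    seconds: float  # wall-clock duration of the call
    ok: bool        # False if the call raised an exception


# Hypothetical in-memory sink; stands in for a real monitoring backend.
TRACE_LOG: List[TraceRecord] = []


def traced(label: str) -> Callable:
    """Decorator that records the duration and outcome of each call."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            ok = False
            try:
                result = func(*args, **kwargs)
                ok = True
                return result
            finally:
                # Runs on success and on failure, so failed calls are traced too.
                elapsed = time.perf_counter() - start
                TRACE_LOG.append(TraceRecord(label, elapsed, ok))
        return wrapper
    return decorator


@traced("fetch_rows")
def fetch_rows(limit: int) -> List[int]:
    """Hypothetical stand-in for a database call made by the application."""
    return list(range(limit))


if __name__ == "__main__":
    fetch_rows(3)
    for record in TRACE_LOG:
        print(record)
```

Keeping the instrumentation in a decorator leaves the application logic untouched, which matters when tracing has to be added to existing code.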

Monitoring
- Allow DBAs and developers to inspect the current state of a database instance easily, without the need for complex software
- The goal is to enable database-level and application-level monitoring in a way that is coherent with the existing OS-level monitoring provided by LEMON
- Easy access via a web interface to quantities and trends describing the current behavior of a database instance, with their history kept and the possibility to zoom in on a given time period

Monitoring Metrics
- Considered the OEM repository, but it requires the OEM infrastructure to be in place and not all instances are in OEM
  - What if OEM is down?
  - Data kept for only 1 month
- Source: the instance’s SYS.V$... performance views
- The baseline DB metrics are extracted from the SYS.V$SYSSTAT dynamic performance view
  - Recalculated exactly the same way as in OEM
  - Examples: SQL*Net in/out data rate, logical I/O, physical I/O, SQL per second…
- Application level monitored via the SYS.V$SESSION… views
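V$SYSSTAT exposes cumulative counters, so a rate such as "SQL per second" has to be derived from two snapshots. The slides do not give the formulas used by OEM or by the sensor; the sketch below assumes the simplest approach (counter delta divided by elapsed time), and the `Snapshot` type and `rates_per_second` function are hypothetical names. The statistic names in the example are ones that exist in V$SYSSTAT, but whether the sensor uses exactly these is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Snapshot:
    taken_at: float           # sample time in seconds
    counters: Dict[str, int]  # statistic name -> cumulative value


def rates_per_second(prev: Snapshot, curr: Snapshot) -> Dict[str, float]:
    """Turn two cumulative-counter snapshots into per-second rates."""
    elapsed = curr.taken_at - prev.taken_at
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")
    rates: Dict[str, float] = {}
    for name, value in curr.counters.items():
        if name not in prev.counters:
            continue  # statistic has no earlier value to compare against
        delta = value - prev.counters[name]
        if delta < 0:
            continue  # counter went backwards, e.g. after an instance restart
        rates[name] = delta / elapsed
    return rates


if __name__ == "__main__":
    # Two samples taken 5 minutes apart, matching the sensor interval.
    before = Snapshot(0.0, {"execute count": 1000, "physical reads": 100})
    after = Snapshot(300.0, {"execute count": 4000, "physical reads": 400})
    print(rates_per_second(before, after))
```

Skipping negative deltas avoids reporting a spurious negative rate when the counters are reset between two samples.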

DB sensor for LEMON version I
- SQL script executed via SQL*Plus
  - Connecting to the locally detected database
- Shell driver script executed by a simple Perl sensor in the LEMON framework
  - Detects local DB settings from the /etc/oratab file and the names of local Oracle daemons (pmon…)
Data flow (from the slide diagram):
- LEMON framework: activates every 5 minutes; communication via pipe
- LEMON DB
- DB sensor: captures stdout from the driver script
- lemon_sensor.sh: executes the query via SQL*Plus and writes data to stdout
- Monitored DB instance: query from lemon_sensor.sql
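Detecting local databases from /etc/oratab amounts to parsing colon-separated lines of the form `SID:ORACLE_HOME:Y|N`, with `#` starting a comment. The sensor itself is a shell and Perl implementation that is not shown in the slides; the Python sketch below only illustrates the parsing step, and `OratabEntry` and `parse_oratab` are hypothetical names.

```python
from typing import List, NamedTuple


class OratabEntry(NamedTuple):
    sid: str          # Oracle instance identifier
    oracle_home: str  # installation directory for that instance
    autostart: bool   # True when the third field is "Y"


def parse_oratab(text: str) -> List[OratabEntry]:
    """Parse the contents of an oratab file into a list of entries."""
    entries: List[OratabEntry] = []
    for raw_line in text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split(":")
        if len(fields) < 2:
            continue  # malformed line, ignore it
        flag = fields[2].strip().upper() if len(fields) > 2 else "N"
        entries.append(OratabEntry(fields[0], fields[1], flag == "Y"))
    return entries


if __name__ == "__main__":
    # Made-up sample content; real files live at /etc/oratab.
    sample = "# local instances\nPHYS1:/opt/oracle/10g:Y\nTEST:/opt/oracle/9i:N\n"
    for entry in parse_oratab(sample):
        print(entry.sid, entry.oracle_home, entry.autostart)
```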

DB sensor for LEMON version II
- SQL queries still executed via SQL*Plus
  - Connecting to the locally detected or a remote database
- SQL*Plus tool wrapped in a Perl class module
  - Allows keeping only a single permanent connection
  - DB instance and SQL*Plus tool auto-detection
  - API provided for DDL, DML and queries
Data flow (from the slide diagram):
- LEMON framework: activates every 5 minutes; communication via pipe
- LEMON DB
- DB sensor: captures output from the Oracle sensor
- SQLPlus.pm, oracle_sensor.pl: executes the query via a SQL*Plus instance
- Monitored DB instance, SQL*Plus instance: query, result
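The idea of wrapping a command-line tool so that one long-lived process serves many queries can be sketched as follows. This is not the SQLPlus.pm module, whose API is not shown in the slides; `PersistentTool`, `END_MARKER` and `ECHO_CHILD` are hypothetical names. The sketch assumes the child can be told to print a marker line after each result. With SQL*Plus one would presumably start something like `sqlplus -S /nolog` and use its `PROMPT` command to print the marker, but that combination is untested here, so the demonstration uses a trivial echo process as a stand-in.

```python
import subprocess
import sys
from typing import List, Sequence

END_MARKER = "__END_OF_RESULT__"

# Stand-in child for demonstration only: echoes every input line back.
ECHO_CHILD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    print(line, end='', flush=True)",
]


class PersistentTool:
    """Keep one child process alive and exchange statements over pipes."""

    def __init__(self, command: Sequence[str], marker_command: str,
                 end_marker: str = END_MARKER) -> None:
        self._marker_command = marker_command  # makes the child print the marker
        self._end_marker = end_marker
        self._proc = subprocess.Popen(
            list(command),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, statement: str) -> List[str]:
        """Send one statement and return its output lines."""
        self._proc.stdin.write(statement + "\n" + self._marker_command + "\n")
        self._proc.stdin.flush()
        lines: List[str] = []
        while True:
            line = self._proc.stdout.readline()
            if line == "":
                raise RuntimeError("child process closed its output")
            line = line.rstrip("\n")
            if line == self._end_marker:
                return lines  # marker reached: the result is complete
            lines.append(line)

    def close(self) -> None:
        """Close the pipes and wait for the child to exit."""
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


if __name__ == "__main__":
    # For the echo stand-in, sending the marker itself makes it appear in the output.
    tool = PersistentTool(ECHO_CHILD, marker_command=END_MARKER)
    print(tool.run("select 1 from dual;"))
    print(tool.run("select 2 from dual;"))
    tool.close()
```

Reusing one process avoids paying the start-up and login cost on every 5-minute activation, which is the benefit the slide attributes to the single permanent connection.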

Monitoring status
- DB LEMON sensors tested on various systems
  - Single DB instances, Oracle 9i/10g
  - LEMON databases
  - RAC systems, Oracle 10g
- Web display and metrics deployed in the LEMON development version
  - Little development needed
  - DB metadata read from the OEM repository
  - Clicking a metric graph in the detailed view jumps to a zoomable time-period view similar to OEM
  - RAC cluster databases shown as a computer cluster in LEMON

Next steps
- Monitoring of the WAIT events is in progress
  - Performance tuning is difficult without these
- Deployment of the new LEMON DB sensor on all physics databases
  - Currently running on selected instances and a few RAC nodes
- Oracle installation procedures need to be updated to include the proper monitoring settings

Summary
- Building DB services for LHC is a challenge
- A well-defined, pro-active service is required
- Performance tuning and testing are essential for resource planning
- Save some resources by proper monitoring
- For details about the LEMON system, see the talk by Miroslav Siket later this afternoon

LCG Database Deployment
- For the curious: check out the upcoming LCG Database Deployment and Persistency Workshop, 17 October - 19 October 2005
- http://agenda.cern.ch/fullAgenda.php?ida=a055549