Presentation is loading. Please wait.

Presentation is loading. Please wait.

Database Services for CERN Deployment and Monitoring

Similar presentations


Presentation on theme: "Database Services for CERN Deployment and Monitoring"— Presentation transcript:

1 Database Services for Physics @ CERN Deployment and Monitoring
Radovan Chytracek CERN IT Department

2 Radovan Chytracek, CERN IT Department
Outline Database services for physics Status today How we do the services tomorrow? Performance tuning Monitoring Oracle monitoring sensors Status of monitoring Conclusions 1. Describe mandate 2. Experiments’ SW 3. GRID service challenges 4. 18 November 2018 Radovan Chytracek, CERN IT Department

3 Database Services for Physics
Mandate DB Applications deployment, DB administration, Consulting Data challenges, data distribution between CERN & T1 centers (3D project) Scope LHC & non-LHC experiments ATLAS, CMS, LHCb, Alice + COMPASS & HARP LHC Grid applications FTS, LFC, VOMS, GridView Preparation activities for LHC start-up increase requirements for DB service Number of DB servers, data volume, availability, scalability, automated deployment procedures Having scalable and reliable service is the priority 1. Describe mandate 2. Experiments’ SW 3. GRID service challenges 4. 18 November 2018 Radovan Chytracek, CERN IT Department

4 Database Services for Physics Today
HW infrastructure Oracle Sun cluster Set of single DB instances SUN cluster overloaded Difficulties to isolate existing applications Not the fastest storage Still on Oracle 9i Maintenance issues for single DB instances Many used to run as stop-gap to off-load SUN cluster, now phased out No load balancing & fail-over Complex maintenance & backup 18 November 2018 Radovan Chytracek, CERN IT Department

5 Towards a Scalable Service for LHC
Deploying Oracle 10g RAC/Linux Isolation (10g services), Scalability (CPU & storage), Reliability (failover), Manageability (easier to administer) Coordinating work-plan across several IT groups Hardware now in place and acceptance tested RAC configuration and functionality tests going on now Working on automated DB Server install integrated with s/w installation tools used for OS (thanks to IT/FIO) Setting-up several RAC systems 4 x 2-node RAC for LHC experiments 2-node integration + 4-node testing RAC Migrating apps from SUN cluster to RAC by end of 2005 18 November 2018 Radovan Chytracek, CERN IT Department

6 Oracle RAC Architecture
18 November 2018 Radovan Chytracek, CERN IT Department

7 Steps Towards a Reliable Service
Well defined pro-active deployment process Proper planning of database capacity (volume & CPU) Insure the optimization of key applications before production starts Classified database application types Resource consuming applications Guarantee of resources Start low, increase as needed Standard applications Smaller database applications which can run in a shared service Layered service implemented Development Service (code development, low data volumes, no backup) Integration and Validation Service (for key apps) Enough resources for larger tests, consulting available, booking 2 months in advance Production Service Full production quality service (backup, monitoring, on call service) Monitoring to detect new resource consuming applications or changes in access patterns 18 November 2018 Radovan Chytracek, CERN IT Department

8 Radovan Chytracek, CERN IT Department
Performance tuning Constant fight on three front-lines HW (CPU, network, storage) Server side (OS, DB, schema design) Client side (bugs, wrong practices, queries) HW can be improved by better “iron” SW should be safe by not making mistakes New or upgraded apps have the same or new bugs Good schema designs is often difficult Following good practices seems to be tough job too DBAs are inevitable Spit out & analyze the bad things & give advices 24/7 18 November 2018 Radovan Chytracek, CERN IT Department

9 Radovan Chytracek, CERN IT Department
Tracing Server side Various levels, session tracing is the most used one Must ship the server trace file back to user Security issues, some development effort required Supported by LCG SW (POOL Oracle plug-in) Client side Required to make the whole picture complete Does not exist out-of-the-box Application code instrumentation needed Often connected to monitoring systems Support being built into LCG SW 18 November 2018 Radovan Chytracek, CERN IT Department

10 Radovan Chytracek, CERN IT Department
Monitoring Allow DBAs and developers inspect the current state of a database instance in an easy way without a need for complex software Goal is to enable database & application level monitoring in coherent way with the existing OS level monitoring provided by LEMON Easy access via web interface to quantities and trends describing current database instance behavior with keeping their history and possibility to zoom in a given time period 18 November 2018 Radovan Chytracek, CERN IT Department

11 Radovan Chytracek, CERN IT Department
Monitoring Metrics Considered OEM repository but requires OEM infrastructure in place and not all instances are in OEM What if OEM is down? Data kept only 1 month Source: instance’s SYS.V$... performance views The baseline DB metrics extracted from SYS.V$SYSSTAT dynamic performance view Recalculated exactly the same way as done in OEM Examples: SQLNet in/out data rate, logical I/O, physical I/O, SQL per second… Application level monitored via SYS.V$SESSION… views 18 November 2018 Radovan Chytracek, CERN IT Department

12 DB sensor for LEMON version I
SQL script executed via SQLPlus Connecting to the locally detected database Shell driver script executed by a simple Perl sensor in LEMON framework Detects local DB settings from /etc/oratab file and names of local oracle daemons (pmon…) LEMON framework Activates each 5 mins Communication via pipe LEMON DB DB sensor Captures stdout from driver script lemon_sensor.sh Executes query via Sqlplus and writes data to stdout Monitored DB instance query lemon_sensor.sql 18 November 2018 Radovan Chytracek, CERN IT Department

13 DB sensor for LEMON version II
SQL queries still executed via SQLPlus Connecting to the locally detected or remote database SQL*Plus tool wrapped in Perl class module Allows to keep single permanent connection only DB instance & SQL*Plus tool auto detection DDL & DML and queries API provided LEMON framework Activates each 5 mins Communication via pipe LEMON DB DB sensor Captures output from Oracle sensor SQLPlus.pm oracle_sensor.pl Executes query via Sqlplus instance Monitored DB instance query SQLPlus instance result 18 November 2018 Radovan Chytracek, CERN IT Department

14 Radovan Chytracek, CERN IT Department
Monitoring status DB LEMON sensors tested on various systems Single DB instances, Oracle 9i/10g LEMON databases RAC systems, Oracle 10g Web display & metrics deployed in LEMON development version Little development needed DB metadata read from OEM repository Clicking a metrics graph in detailed view jumps to zoomable time period view similar to OEM RAC cluster databases shown as computer cluster in LEMON 18 November 2018 Radovan Chytracek, CERN IT Department

15 Radovan Chytracek, CERN IT Department
Next steps The monitoring of the WAIT events in progress Performance tuning is difficult without having these Deployment of the new LEMON DB sensor on all physics databases Currently running on selected instances and few RAC nodes Oracle installation procedures need to be updated to include proper monitoring settings 18 November 2018 Radovan Chytracek, CERN IT Department

16 Radovan Chytracek, CERN IT Department
Summary Building DB services for LHC is a challenge Well defined pro-active service is required Performance tuning and testing are essential for the resource planning Save some resources by proper monitoring For details about the LEMON system, see talk by Miroslav Siket later this afternoon 18 November 2018 Radovan Chytracek, CERN IT Department

17 LCG Database Deployment
For the Curious Check out the upcoming LCG Database Deployment And Persistency Workshop 17 October - 19 October 2005 18 November 2018 Radovan Chytracek, CERN IT Department


Download ppt "Database Services for CERN Deployment and Monitoring"

Similar presentations


Ads by Google