LCG 3D and Oracle Cluster

LCG 3D and Oracle Cluster
Storage Workshop, 20-21 March 2006
Barbara Martelli, INFN-CNAF
Gianluca Peco, INFN Bologna

Overview
- LCG 3D project
- LCG 3D deployment plans
- Oracle technologies deployed at CNAF
- Experience with Oracle Real Application Clusters
- RAC test results

LCG 3D Goals
LCG 3D is a joint project between the service providers (CERN and LCG sites) and the service users (experiments and grid projects), aimed at:
- Defining distributed database services and application access, allowing LCG applications and services to find the relevant database back-ends, authenticate, and use the provided data in a location-independent way.
- Helping to avoid the costly parallel development of data distribution, backup and high-availability mechanisms in each experiment or grid site, in order to limit support costs.
- Enabling a distributed deployment of an LCG database infrastructure with a minimal number of LCG database administration personnel.

LCG 3D Non-Goals
- Storing all database data: experiments are free to deploy databases and replicate data under their own responsibility.
- Setting up a single monolithic distributed database system: given constraints such as WAN connections, one cannot assume that a single synchronously updated database would work or give sufficient availability.
- Setting up a single-vendor system: technology independence and a multi-vendor implementation will be required to minimize long-term risks and to adapt to the different requirements and constraints at the different tiers.

LCG 3D Project Structure
- WP1 - Data Inventory and Application Requirements Working Group: members are software providers from experiments and grid services based on RDBMS data. They gather data properties (volume, ownership, access patterns) and requirements, and integrate the provided service into their software.
- WP2 - Service Definition and Implementation Working Group: members are site technology and deployment experts. They propose an agreeable deployment setup and common deployment procedures, and deploy the DB service according to the agreed 3D policies. CNAF is involved in this working group.

Proposed Service Architecture
[Diagram: replication flow across the tiers - Oracle Streams between T0 and the T1 sites, cross-vendor extraction towards MySQL at T2, and files with proxy caches towards T3/4]
- T0: autonomous
- T1: database backbone, all data replicated, reliable service
- T2: local database cache, subset of the data, only local service
Slide by Dirk Duellmann, CERN-IT

Proposed Service Structure
Three separate environments with different service levels and different hardware resources:
- Development: shared hardware setup, limited DBA support (via email), 8/5 monitoring and availability.
- Integration: a dedicated node for a defined slot (usually a week) to perform performance and functionality tests; DBA support via email or phone.
- Production: 24/7 monitoring and availability, backups every 10 minutes, a limited number of scheduled interventions.

Present CNAF DB Service Structure
The proposed 3D service policy will be met in steps. At present we have:
- A test environment, used for testing new Oracle-based technologies such as RAC, Grid Control and Streams.
- A preproduction environment composed of two dual-node RACs, each with 12 FC disks of shared storage, allocated to LHCb and ATLAS, plus one HP ProLiant DL380 G4 for service instances such as the Castor2 stager and FTS.
By the end of April the preproduction environment will be moved to production. When the RAC tests are finished, the test machines will become our development/integration environment.

3D Milestones
- 31.03.06 - Tier-1 service starts: milestone for early-production Tier-1 sites. CNAF will start with two dual-node RACs (LHCb, ATLAS) hosting the Bookkeeping DB replica, the LFC file catalog replica and the VOMS replica.
- 31.05.06 - Service review workshop: hardware defined for full production. Experiment and site reports after the first 3 months of service deployment; DB resource requirements defined for the full service. Milestone for experiments and all Tier-1 sites.
- 30.09.06 - Full LCG database service in place: milestone for all Tier-1 sites.

3D - Oracle Technologies
The 3D project leverages several Oracle technologies to guarantee the scalability and reliability of the provided DB services:
- Oracle Streams Replication
- Oracle Real Application Clusters
- Oracle Enterprise Manager / Grid Control

Oracle Streams Replication
A Streams flow has three stages: capture, staging, and consumption.
- Streams captures events either implicitly (log-based capture of DML and DDL) or explicitly (direct enqueue of user messages).
- Captured events are published into a staging area, implemented as a queue. Messages remain in the staging area until consumed by all subscribers.
- Other staging areas can subscribe to events in the same database or in a remote database, so events can be routed through a series of staging areas.
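As an illustration of the capture side, a minimal setup with Oracle's DBMS_STREAMS_ADM package might look like the sketch below. All names (the strmadmin schema, queue, stream and table names) are hypothetical, and a real deployment also needs supplemental logging plus the propagation and apply sides.

  -- Hedged sketch: create a staging queue and a log-based capture
  -- process for DML changes on a single table.
  BEGIN
    DBMS_STREAMS_ADM.SET_UP_QUEUE(
      queue_table => 'strmadmin.streams_queue_table',
      queue_name  => 'strmadmin.streams_queue');

    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name   => 'app.table1',        -- hypothetical source table
      streams_type => 'capture',
      streams_name => 'capture_stream',
      queue_name   => 'strmadmin.streams_queue',
      include_dml  => TRUE,
      include_ddl  => FALSE);
  END;
  /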

- Transformations can be performed as events enter, leave, or propagate between staging areas.
- Staged events are consumed by subscribers, either implicitly by an apply process (default apply or user-defined apply) or explicitly by application dequeue via an API (C++, Java, ...).
- The default apply engine directly applies the DML or DDL represented in the LCR: apply to a local Oracle table, or apply via a DB link to a non-Oracle table.
- Automatic conflict detection with optional resolution; unresolved conflicts are placed in an exception queue.
- Rule-based configuration, expressed as a "WHERE" clause.
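On the destination side, the matching apply process can be created with the same package, and DBMS_APPLY_ADM provides prebuilt update conflict handlers such as OVERWRITE (incoming change wins). A hedged sketch, with all names hypothetical:

  -- Hedged sketch: apply-side rules plus an OVERWRITE conflict handler
  -- on one column of the replicated table.
  DECLARE
    cols DBMS_UTILITY.NAME_ARRAY;
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_RULES(
      table_name      => 'app.table1',
      streams_type    => 'apply',
      streams_name    => 'apply_stream',
      queue_name      => 'strmadmin.streams_queue',
      include_dml     => TRUE,
      include_ddl     => FALSE,
      source_database => 'SRCDB.WORLD');   -- hypothetical source DB

    cols(1) := 'field1';
    DBMS_APPLY_ADM.SET_UPDATE_CONFLICT_HANDLER(
      object_name       => 'app.table1',
      method_name       => 'OVERWRITE',
      resolution_column => 'field1',
      column_list       => cols);
  END;
  /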

Example of Streams Replication
A user executes an update statement at the source node:
  update table1 set field1 = 'value3' where table1id = 'id1';
[Diagram: the capture process reads the change from the redo log at the source node and enqueues it as an LCR; the LCRs are propagated to the destination queue, applied to table1 by the apply process, and acknowledged (ACK) back to the source. After apply, the destination copy of table1 holds field1 = 'value3' for row id1.]
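The propagation step between the two queues in this example could be configured with ADD_TABLE_PROPAGATION_RULES. Again a hedged sketch: the queue names, database link and stream name are all illustrative, not taken from the original slides.

  -- Hedged sketch: propagate captured changes for app.table1 from the
  -- source queue to the destination queue over a database link.
  BEGIN
    DBMS_STREAMS_ADM.ADD_TABLE_PROPAGATION_RULES(
      table_name             => 'app.table1',
      streams_name           => 'prop_stream',
      source_queue_name      => 'strmadmin.streams_queue',
      destination_queue_name => 'strmadmin.streams_queue@DSTDB.WORLD',
      include_dml            => TRUE,
      include_ddl            => FALSE,
      source_database        => 'SRCDB.WORLD');
  END;
  /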

Oracle Real Application Clusters
The Oracle Real Application Clusters (RAC) technology allows several database servers to share a single database. All datafiles, control files, PFILEs and redo log files in RAC environments must reside on cluster-aware shared disks so that all of the cluster database instances can access them. RAC aims to provide highly available, fault-tolerant and scalable database services.
[Diagram: database servers connected through the network to shared disks on a cluster filesystem]
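In a running cluster, a quick way to confirm that several instances are serving the same shared database is to query the global dynamic performance views; a minimal example using the standard GV$INSTANCE view:

  -- List all instances currently open on the shared database,
  -- one row per cluster node.
  SELECT inst_id, instance_name, host_name, status
  FROM   gv$instance
  ORDER  BY inst_id;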

Shared Storage on RAC
There are various choices for shared storage management in RAC:
- Raw devices.
- Network filesystems: supported only on certified devices.
- Oracle Cluster File System v2 (OCFS2): POSIX-compliant and general purpose (it can also be used for Oracle Homes); automatically configured to use direct I/O; enable async I/O by setting filesystemio_options = SETALL (see the sketch below).
- Automatic Storage Management (ASM): a logical volume manager with striping and mirroring, providing dynamic data distribution within and between storage arrays.
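The filesystemio_options setting mentioned above is a standard Oracle initialization parameter; a minimal example of enabling it follows. It is a static parameter, so the change only takes effect after an instance restart.

  -- Enable both direct I/O and asynchronous I/O for datafile access.
  ALTER SYSTEM SET filesystemio_options = SETALL SCOPE = SPFILE;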

Preproduction Environment Setup
[Diagram: two dual-node RACs (rac-lhcb-01/02 and rac-atlas-01/02), with private LHCb and ATLAS interconnect links behind two Gigabit switches, and one Dell 224F array of 14 x 73 GB disks per cluster]
Each node: dual Xeon 3.2 GHz, 4 GB memory, 2 x 73 GB disks in RAID-1.

RAC Testbed
[Diagram: four RAC nodes (ORA-RAC-01 to ORA-RAC-04) serving clients on the public and VIP network through two Gigabit switches, with a private network for interconnect traffic and a Fibre Channel switch carrying disk I/O traffic to an IBM FAStT900 FC RAID controller]
- Storage: 1.2 TB RAID-5 disk array formatted with OCFS2.
- Nodes: 4 x dual Xeon 2.8 GHz, 4 GB RAM, Red Hat Enterprise 4 on RAID-1 disks, 2 x Intel PRO/1000 NICs, 1 QLogic 2312 FC HBA with 2 x 2 Gb/s links.

RAC Test: AS3AP, 1-4 Nodes
[Plot: select-query throughput with a 1 GB database cache]

RAC Test: AS3AP, 1-4 Nodes
[Plot: select-query throughput with 8 GB of data and no database cache]

RAC Test: OLTP, 4 Nodes

RAC Test: OLTP, 1-2-4 Nodes
With an OLTP application the scalability of the system is less evident. The shared-disk storage used is probably not adequate for the application's data-access pattern.

RAC Test: OLTP, 4 Nodes
[Plots: transactions per minute with an OLTP (Order Entry) workload, one run with O_DIRECT and ASYNC_IO enabled and one with both disabled]