Experience and Lessons Learnt from Running High Availability Databases on Network Attached Storage – Ruben Gaspar, Manuel Guijarro et al., IT/DES

Presentation transcript:

Experience and Lessons Learnt from Running High Availability Databases on Network Attached Storage – Ruben Gaspar, Manuel Guijarro et al., IT/DES

Outline
– Why a NAS system for the DB infrastructure
– Topology
– Bonding
– NAS configuration
– Server node configuration
– Oracle software installation
– Performance
– Maintenance & monitoring
– Conclusion

What is a NAS?
– Network-attached storage (NAS) is dedicated data storage technology connected directly to a computer network to provide centralized data access and storage to heterogeneous network clients.
– The operating system and other software on the NAS unit provide only data storage, data access and the management of these functions.
– Several file transfer protocols are supported (NFS, SMB, etc.).
– In contrast to a SAN (Storage Area Network), remote devices do not appear as locally attached: clients request portions of a file rather than performing block-level access.

Why NAS for the DB infrastructure?
– To ease the file sharing needed for Oracle RAC: much simpler than using raw devices, since the complexity is managed by the appliance instead of by the server
– To ease relocation of services between server nodes
– To use NAS-specific features: snapshots, RAID, failover based on NAS partnership, dynamic resizing of file systems, remote sync to an off-site NAS, etc.
– Ethernet is much simpler to manage than Fibre Channel
– To ease file server management: automatic failure reporting to the vendor, etc.
– To reduce cost (no HBAs, no FC switches)
– To simplify database administration

DES cluster cabling (diagram): filers dbnasd101 and dbnasd102 with disk shelves 1–4

CASTOR cluster cabling (diagram): filers dbnasc101 and dbnasc102 with disk shelves 1 and 2

Bonding modes used
– Bonding (a.k.a. trunking): aggregating multiple network interfaces into a single logical bonded interface, either for load balancing (using multiple network ports in parallel to increase the link speed beyond the limits of a single port) or for automatic failover.
– Each NAS has a 2nd-level VIF (Virtual Interface) made of two 1st-level VIFs of 4 NICs each; only one of the 1st-level VIFs is active:
  vif create multi -b ip dbnasc101_vif1 e0a e0b e0c e0d
– Active-backup bonding mode with a monitoring frequency of 100 ms is used on the server node side (dbsrvdXXX and dbsrvcXXX machines).
– IEEE 802.3ad dynamic link aggregation is used for the trunks between the NAS filers and the switches and for the connections between switches.
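As a minimal sketch of the server-side active-backup bonding described above (interface names, IP address and the second slave are assumptions, not taken from the slides), the 100 ms link monitoring could be configured on RHEL 4 roughly as follows:

  # /etc/modprobe.conf – load the bonding driver in active-backup mode with 100 ms MII monitoring
  alias bond0 bonding
  options bond0 mode=active-backup miimon=100

  # /etc/sysconfig/network-scripts/ifcfg-bond0 – the bonded interface (placeholder IP)
  DEVICE=bond0
  IPADDR=192.0.2.10
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 – one slave; repeat for the second member (e.g. eth1)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none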

NAS Configuration
– Data ONTAP 7.2.2; the disks in each disk shelf are all placed in a single disk aggregate
– NetApp RAID_DP (Double Parity): an improved RAID 6 implementation
– Several FlexVols in each aggregate
– AutoSize volume option
– Snapshots: only used before DB maintenance operations for the time being
– Quattor cannot be used, since no agent can be installed in Data ONTAP 7.2.2
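As an illustration of the snapshot usage mentioned above (volume and snapshot names are hypothetical), a manual Data ONTAP snapshot taken before a maintenance operation could look like:

  dbnasc101> snap create dbvol01 before_maintenance   # create a named snapshot of the volume
  dbnasc101> snap list dbvol01                        # verify it is present before starting the intervention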

NAS Configuration – Aggregate (13 disks + 1 spare disk)
  dbnasc101> sysconfig -r
  Aggregate dbdskc101 (online, raid_dp) (block checksums)
    Plex /dbdskc101/plex0 (online, normal, active)
      RAID group /dbdskc101/plex0/rg0 (normal)
        RAID Disk  Device  HA  SHELF  BAY  CHAN  Pool  Type  RPM  Used (MB/blks)  Phys (MB/blks)
        dparity    0a.16   0a  1      0    FC:A  -     FCAL  …    …/…             …/…
        parity     0a.17   0a  1      1    FC:A  -     FCAL  …    …/…             …/…
        data       0a.18   0a  1      2    FC:A  -     FCAL  …    …/…             …/…
– RAID_DP protects against the simultaneous failure of two disks

NAS Configuration – Flexible volumes
  vol create volname aggregate 1g
  vol autosize volname -m 20g -i 1g on
  vol options volname nosnap on
  vol options volname no_atime_update on
  snap delete -a volname
– Redundant access to the disks in the storage system (dual loop):
  dbnasc101> storage show disk -p
  PRIMARY  PORT  SECONDARY  PORT  SHELF  BAY
  0d.32    B     0b.32      A     2      0
  0b.33    A     0d.33      B     2      1
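The slides do not show how these volumes are exported to the database nodes; as a sketch under the assumption of a Data ONTAP 7-mode /etc/exports entry (volume and host names are hypothetical), an NFS export restricted to two server nodes might look like:

  # /etc/exports on the filer – export the FlexVol read/write to the two RAC nodes only
  /vol/dbvol01 -sec=sys,rw=dbsrvc101:dbsrvc102,root=dbsrvc101:dbsrvc102,nosuid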

Server Node Configuration
Quattor-based installation and configuration covering:
– Basic RHEL 4 x86_64 installation
– Network topology setup (bonding): modprobe and network NCM components
– Squid and Sendmail setup (HTTP proxy for the NAS report system) using NCM components
– Definition of all hardware components in CDB (including NAS boxes and disk shelves)
– RPMs for Oracle-RDBMS, Oracle-RACRDBMS and Oracle-CRS
– NCM RAC component
– TDPO and TSM components used to configure Tivoli
– Lemon-based monitoring for all server node resources
Always using CERN's CC infrastructure (installations performed by the SysAdmin support team, CDB)

Server Node Configuration: standard DB set-up
Four volumes created, with the critical files split among them (RAC databases):
– /ORA/dbs00: redo log files, archived redo logs and a copy of the control file
– /ORA/dbs02: a copy of the control file and the voting disk
– /ORA/dbs03: spfile and datafiles, plus a copy of the control file, the voting disk and the registry file
– /ORA/dbs04: a copy of the control file, the voting disk and the registry, placed in a different aggregate (i.e. a different disk shelf and a different NAS filer)
Volumes are accessed via NFS from the server machines, with mount options:
  rw,bg,hard,nointr,tcp,vers=3,actimeo=0,timeo=600,rsize=32768,wsize=32768
Databases in production on NAS – RAC instances in production: Castor Name Server, various Castor 2 stagers (for ATLAS, CMS, LHCb, ALICE, …), Lemon, etc.
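As a minimal sketch of such a mount (the export path is an assumption; the filer name, mount point and mount options are taken from the slides), an /etc/fstab entry on a server node could look like:

  # NFS mount of the first database volume from the filer
  dbnasc101:/vol/oradbs00  /ORA/dbs00  nfs  rw,bg,hard,nointr,tcp,vers=3,actimeo=0,timeo=600,rsize=32768,wsize=32768  0 0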

Oracle Software Installation
CRS and RDBMS are packaged in RPMs (Oracle Universal Installer + cloning)
– BUT the dependencies in the RPMs are not usable: inclusion of strange packages (perl for kernel 2.2!) and references in some libraries to SUN (?!) packages
– BUT package verification cannot be used, as the cloning procedure does a re-link (checksums change)
– THUS, the RPMs can only be used for installation of the files
A Quattor component then configures RAC accordingly and starts the daemons successfully
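To illustrate the limitations described above (the package name and version placeholder are hypothetical, not the actual CERN RPM names), installation ignores the unusable dependency metadata and verification flags the re-linked binaries as modified even though the installation is correct:

  rpm -ivh --nodeps oracle-rdbms-<version>.x86_64.rpm   # files are laid down; dependencies are not relied upon
  rpm -V oracle-rdbms                                   # reports '5' (MD5 checksum) on binaries re-linked by the cloning step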

Performance
Oracle uses direct I/O over NFS:
– avoids external caching in the OS page cache
– performs better than buffered I/O for typical Oracle database I/O workloads (roughly a factor of two)
– requires enabling direct I/O at the database level:
  alter system set FILESYSTEMIO_OPTIONS = directIO scope=spfile sid='*';
Measurements
– Stress test: 8 KB read operations issued by multiple threads against a set of aggregates (12 data disks in total): a maximum of ~150 I/O operations per second per disk.
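As a small usage sketch (the connection is a placeholder; the ALTER SYSTEM statement itself is the one from the slide), the parameter is set in the spfile and can be checked after the instances are restarted:

  $ sqlplus / as sysdba
  SQL> alter system set FILESYSTEMIO_OPTIONS = directIO scope=spfile sid='*';
  SQL> -- restart the instances for the spfile change to take effect, then verify:
  SQL> show parameter filesystemio_options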

Performance

Maintenance and Monitoring
Some interventions already performed (NO downtime):
– Replacement of failed disks
– HA change
– NAS filer OS update: Data ONTAP upgrade (rolling forward)
– Full automatic reinstallation of a RAC member
– PSU replacement (both in NAS filers and server nodes)
– Ethernet network cable replacement
Several hardware failures successfully tested:
– Powering off: network switch(es), NAS filer, RAC member, …
– Unplugging: FC cables connecting the NAS filer to a disk shelf, cable(s) of active bond members of the private interconnect, VIFs, … (see the sketch below)
Monitoring uses Oracle Enterprise Manager; a Lemon sensor on the server nodes could be developed.
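A minimal way to observe the active-backup failover during the cable-pull tests mentioned above (the bond interface name is an assumption) is to watch the kernel bonding status on the server node:

  watch -n 1 cat /proc/net/bonding/bond0   # "Currently Active Slave" switches when the active member's cable is unplugged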

Maintenance and Monitoring

Conclusion
– The NAS-based infrastructure is easy to maintain
– All interventions so far were done with zero downtime
– Good performance for Oracle RAC databases
– Several pieces of infrastructure (NCM components, Oracle RPMs, …) were developed to automate the installation and configuration of databases within the Quattor framework
– Ongoing migration of legacy databases to the NAS-based infrastructure
– Strategic deployment platform: reduces the number of storage technologies to be maintained and avoids the complexity of a SAN-based infrastructure (multipathing, Oracle ASM, etc.)

QUESTIONS