ACFS Under Scrutiny Luca Canali, CERN Dawid Wojcik, CERN UKOUG Conference, Birmingham, Nov 2010.


Outline
- ACFS – cluster file system for 11gR2 ASM
  - Use cases
  - Architecture
  - Installation and setup
  - Some investigations of the internals
- ACFS at CERN
  - Use cases
  - Deployment
  - Performance tests
- Conclusions

ACFS Use Cases
- Use cases for ACFS with Oracle
  - Automatic Diagnostic Repository (ADR) for 11g RDBMS
    - Unified logging structure for RAC
  - RDBMS-related usage: BFILEs, Data Pump dump files, ETL files, miscellaneous log files (RMAN), etc.
  - Can also be used for Oracle RDBMS home binaries, shared or non-shared
    - Allows taking snapshots before applying a patch or a patchset
- Use cases of ACFS as a generic file system
  - Can be deployed for custom applications and application servers
  - No RDBMS installation needed, only Clusterware 11.2
  - Performance, maintenance and high availability
  - DBAs will find it easy to use

ASM and ACFS – Architecture
[Diagram; component labels: Applications; Oracle RAC or Single Instance DBs; Oracle Database Files; ASM Cluster File System (ACFS); Third Party File System (optional); ASM Dynamic Volume Manager (ADVM); ASM; Clusterware 11.2; Grid Infrastructure]

ASM and ACFS
- ASM: volume manager and cluster file system for Oracle DB files
- ACFS: POSIX-compliant cluster file system
  - Built on top of ASM disk groups
  - For ‘all other files’ (DB files not supported on ACFS yet)
- ACFS leverages ASM and CRS
  - Performance
  - Manageability
  - High availability
- Ref: ACFS Technical Overview and Deployment Guide [ID ]

Automatic Storage Management
- ASM (Automatic Storage Management)
  - Oracle’s cluster file system and volume manager for Oracle databases
  - HA: fault tolerant, online storage reorganization/addition
  - Performance: stripe and mirror everything
  - Commodity HW: Physics databases at CERN use ASM normal redundancy (similar to RAID 1+0 across multiple disks and storage arrays)
[Diagram: DATA_DiskGroup and RECOVERY_DiskGroup spread across Failgroup1, Failgroup2, Failgroup3 and Failgroup4]

ASM Dynamic Volume Manager
- ASM Dynamic Volume Manager (ADVM)
  - New feature in Oracle Clusterware 11.2
  - Volumes are implemented as ASM files
    - Exposed to the OS as block devices: /dev/asm/volume_name-x
    - Configurable redundancy, stripe width and stripe columns
    - A dirty region logging file is created if redundancy is mirror or high
  - Any file system (ext3, ACFS, ...) can be created on top of an ADVM volume
  - Volumes can be resized online
    - The file system must also support online resize (ACFS; grow only: ext2, ext3, ext4)
- Further investigations on internals:
  - v$asm_volume, v$asm_file, x$kffxp
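The online resize path above can be sketched as a small dry-run helper. This is an illustration under stated assumptions, not the authors' procedure: the function name `grow_plan` and all argument values are hypothetical. The sketch assumes that `acfsutil size` adjusts the underlying ADVM volume together with the ACFS file system, and that for ext2/3/4 the volume is first grown with `asmcmd volresize` and the file system then grown with `resize2fs`. Nothing is executed; the commands are only printed for review.

```shell
# grow_plan FSTYPE DISKGROUP VOLUME NEWSIZE DEVICE MOUNTPOINT
# Prints (does not run) the commands for growing a file system that sits
# on an ADVM volume. NEWSIZE is the new absolute size, e.g. 100G.
grow_plan() {
    case $1 in
        acfs)
            # Assumption: acfsutil resizes the ADVM volume along with ACFS.
            echo "acfsutil size $4 $6"
            ;;
        ext2|ext3|ext4)
            # Grow the volume first, then the file system on top of it.
            echo "asmcmd volresize -G $2 -s $4 $3"
            echo "resize2fs $5"
            ;;
        *)
            echo "unsupported file system type: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `grow_plan acfs DATA testvol 100G /dev/asm/testvol-123 /mnt/acfs` prints a single `acfsutil size` command; the disk group, volume, device and mount point names here are placeholders.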

ASM Cluster File System
- What is ACFS
  - ASM-based Cluster File System, new in Oracle 11.2
    - Built on top of ADVM volumes
    - Can be used cluster-wide or single-node only
  - Multi-platform support ( )
  - Can be shared using NFS, CIFS, …
  - Online file system expansion / shrink
  - Mirror protection when using NORMAL or HIGH redundancy diskgroups/volumes
  - Read-only snapshots built in
  - Replication, security realms, encryption and tagging introduced in

ACFS Integration with the Oracle Software Stack
- ASM, ADVM and ACFS are integrated with Oracle 11gR2 clusterware
  - ASM and clusterware are tightly integrated in 11gR2
    - A single ‘GRID HOME’ is used
    - Notable: administration is simplified by storing OCR and voting disk(s) in ASM
  - ADVM and ACFS resources are managed by clusterware
  - Eases maintenance and the learning curve for the DBA

ACFS Crash Recovery
- In case of a node crash or forced dismount of ACFS, recovery is needed (three levels)
  - ASM in RAC will use surviving nodes to recover
    - ASM uses ‘internal files’ such as ACD (Active Change Directory) and COD (Continuing Operation Directory) for this purpose
  - ADVM volumes with normal or high redundancy have an associated dirty region logging file (high redundancy by default); recovery is run by ASM processes
  - ACFS utilizes a Metadata Transaction Log

ACFS setup
- Setting up ACFS for the first time
  - The quick way: asmca
  - The alternative CLI setup
    - Create and enable volume (enabled on all nodes by default)
      - asmcmd: volcreate -G {diskgroup_name} -s {size} {vol_name}
      - /dev/asm/{vol_name}-x device is created (Linux)
    - Create ACFS file system
      - mkfs -t acfs /dev/asm/{vol_name}-x
    - Register the ACFS general-purpose file system with CRS (allows the file system to be mounted automatically by CRS)
      - acfsutil registry -a -f {vol_path} {acfs_mount_point}
      - If ACFS will be used for an Oracle home, use this instead:
        - srvctl add filesystem -d {vol_path} -v {volume_name} -g {diskgroup_name}
        - Allows maintaining ACFS, ASM and DB dependencies
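The CLI setup sequence can be laid out as a dry-run script so the three steps are reviewed before anything is run. This is a minimal sketch: the function name `acfs_setup_plan` and every value passed to it are hypothetical, and the asmcmd subcommand is written as `volcreate`, which is an assumption about the exact spelling in the installed release. The function only prints the commands; in practice they would be run by the grid infrastructure owner (asmcmd) and by root (mkfs, acfsutil).

```shell
# acfs_setup_plan DISKGROUP VOLUME SIZE DEVICE MOUNTPOINT
# Prints the general-purpose ACFS setup steps in order. DEVICE is the
# /dev/asm/<volume>-<n> path that appears once the volume is created.
acfs_setup_plan() {
    dg=$1; vol=$2; size=$3; dev=$4; mnt=$5

    # 1. Create the ADVM volume inside the ASM disk group.
    echo "asmcmd volcreate -G $dg -s $size $vol"
    # 2. Create the ACFS file system on the volume's block device.
    echo "mkfs -t acfs $dev"
    # 3. Register the file system so that CRS mounts it automatically.
    echo "acfsutil registry -a -f $dev $mnt"
}
```

A call such as `acfs_setup_plan DATA testvol 80G /dev/asm/testvol-123 /mnt/acfs` prints the three commands with those placeholder values filled in. For an Oracle home, step 3 would be replaced by the `srvctl add filesystem` form given on the slide.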

ASM and ACFS internals
- New ASM background processes in 11gR2
  - Used to manage the interaction between ADVM and ASM, IO fencing and clusterware membership
    - One can see them with ps -elf | grep ASM
  - Volume Driver Background (VDBG)
  - Volume Background (VBG#) processes
  - Volume Membership Background (VMB)
  - ACFS background process
- More details in Metalink note [ID ]
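To narrow the `ps` output down to the processes listed above, a filter along the following lines may help. It is a sketch that assumes the process names follow the usual `asm_<name>_+ASM<n>` pattern (for example a name starting with `asm_vdbg_`); the function name `filter_advm_processes` is hypothetical, and the pattern should be checked against a real system.

```shell
# Reads process names (one per line) on stdin and prints only those that
# look like the ADVM/ACFS-related ASM background processes.
# Assumption: names start with asm_vdbg_, asm_vbg<digits>_, asm_vmb_ or asm_acfs_.
filter_advm_processes() {
    grep -E '^asm_(vdbg|vbg[0-9]+|vmb|acfs)_'
}
```

Typical use would be `ps -e -o args= | filter_advm_processes` on a node where the ASM instance is running.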

ACFS, ADVM and Linux
- Kernel modules needed for ADVM and ACFS
  - oracleacfs, oracleadvm, oracleoks
    - Can be seen at OS level with lsmod | grep oracle
  - Binaries in $GRID_HOME/install/usm/
    - One can check the location with acfsroot version_check
- How to remove ACFS-related kernel modules
  - Modules are proprietary (non-GPL) and trigger a kernel tainting message in /var/log/messages
  - If you don’t want to use ACFS or are concerned about kernel tainting
    - acfsroot uninstall
  - See also note [ID ]
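The `lsmod | grep oracle` check can be made a little more explicit by reporting each of the three modules separately. The helper below is a sketch with a hypothetical name, `check_usm_modules`; it only parses text in `lsmod` format from standard input and does not load or unload anything.

```shell
# Reads `lsmod` output on stdin and prints, for each module named on the
# slide, whether it appears in the first column (i.e. is loaded).
check_usm_modules() {
    loaded=$(awk '{ print $1 }')
    for m in oracleacfs oracleadvm oracleoks; do
        if printf '%s\n' "$loaded" | grep -qx "$m"; then
            echo "$m loaded"
        else
            echo "$m missing"
        fi
    done
}
```

On a cluster node this would be called as `lsmod | check_usm_modules`.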

ACFS Command Line Tools
- Main command line interface: acfsutil
  - Display file system information, resize file system, register mount points, create snapshots, …
  - Can be used to configure the new features: security realms, encryption, replication and tagging
  - Most operations can also be done via the GUI tool asmca
- Other utilities
  - Typical Linux/UNIX: fsck, mkfs, mount, umount
  - acfsdbg: debug tool
  - advmutil: display ADVM information, tune ADVM

ACFS Snapshots
- ACFS snapshots provide point-in-time images
  - Can be used for consistent backups
  - Performed online
  - Copy-on-first-write mechanism (before-images shared between snapshots)
  - Snapshots reside within the same file system
  - Snapshots are visible in the CLI or in V$ASM_ACFSSNAPSHOTS
  - Snapshots can be read in /mount_point/.ACFS/snaps/
  - Limited to 63 snapshots per ACFS file system
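A pre-patch snapshot could be planned along these lines. This is a dry-run sketch: the function name `snapshot_plan`, the snapshot name and the mount point are hypothetical, and the snapshot count is passed in by the caller rather than queried. The 63-snapshot limit and the `.ACFS/snaps` location are taken from the slide.

```shell
ACFS_MAX_SNAPSHOTS=63   # per-file-system limit stated on the slide

# snapshot_plan MOUNTPOINT SNAPNAME EXISTING_COUNT
# Prints the command that creates the snapshot and the command that lists
# its contents, or fails if the file system is already at the limit.
snapshot_plan() {
    if [ "$3" -ge "$ACFS_MAX_SNAPSHOTS" ]; then
        echo "snapshot limit of $ACFS_MAX_SNAPSHOTS reached on $1" >&2
        return 1
    fi
    echo "acfsutil snap create $2 $1"
    echo "ls $1/.ACFS/snaps/$2"
}
```

For instance, `snapshot_plan /mnt/acfs before_patch 3` prints the create command followed by the path under which the read-only image becomes visible.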

ACFS Replication
- File-system-level replication from a primary site to another site
  - Can replicate a whole ACFS file system or only tagged fragments
  - Changes on the primary system are captured in replication logs
  - Replication logs are sent by background processes to the destination cluster and replayed there
  - Logs are deleted from both systems after being applied
  - Replication logs are stored in the same file system: check for free space!
- Replication can be set up with acfsutil
- Possible use case:
  - Replicate ACFS file system data in a Data Guard setup

CERN experience with ACFS

Physics DB HW, a typical setup
- Dual-CPU quad-core blade servers, 24GB memory, Intel Nehalem low power, 2.26GHz clock
- Redundant power, mirrored local disks, 4 NICs (2 private / 2 public), dual HBAs, “RAID 1+0 like” with ASM

ACFS and ASM on low-cost storage
- Advantages
  - High performance
  - Reliability
  - Low cost
    - Normal redundancy ASM disk groups instead of complicated RAIDs
    - Cheap SATA disks rather than more expensive enterprise solutions
    - Can provide mirroring across storage arrays
  - Online operations (grow/shrink, add/remove disks)
- Disadvantages
  - Can only be used on nodes with clusterware installed
    - Unless exported via NFS
  - Some operations cause a cluster-wide sync, with a performance impact
  - Async IO not supported: DB data files cannot be put on ACFS

ACFS use cases at CERN
- ACFS is used in production at CERN
  - General-purpose cluster file system for the backup & monitoring cluster: fast and reliable
  - Repository of Oracle binaries
  - Temporary storage for large exports/imports
- Other usages expected after moving to Oracle 11.2
  - Automatic Diagnostic Repository (ADR)
  - Export/import directory for each cluster DB
  - Local Oracle homes (?): snapshots can be used before patching

ACFS test setup
- 64 SATA II, 7k2 RPM, 400GB lower-end disks
  - JBOD configuration, visible to ASM as 64 LUNs
- 45% of each disk’s capacity used for DATA diskgroup
  - For improved IOPS and throughput (OS-level partitioning)
  - ASM normal redundancy used: 10.5TB diskgroup
- Two ADVM normal redundancy volumes, 800GB and 80GB, created for tests
  - ACFS, ext2 and ext3 file systems created on ADVM volumes
  - No difference in speed between small and large file systems in any of the tests (80GB vs 800GB)
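The disk group size can be cross-checked with a short calculation. The sketch below assumes that the 400GB disk size is in decimal gigabytes and that the 10.5TB figure is the raw disk group size expressed in binary units (TiB), before mirroring; both are assumptions, and the function names are hypothetical. Under those assumptions, 64 disks at 45% give 11520 GB, which is roughly 10.5 TiB, and normal redundancy would leave about half of that as usable space.

```shell
# raw_gb DISKS DISK_GB USED_PCT
# Capacity handed to the disk group, in decimal GB (integer arithmetic).
raw_gb() {
    echo $(( $1 * $2 * $3 / 100 ))
}

# gb_to_tib GB
# Converts decimal gigabytes to binary terabytes (TiB), one decimal place.
gb_to_tib() {
    awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1e9 / 1099511627776 }'
}
```

Calling `gb_to_tib "$(raw_gb 64 400 45)"` reproduces the figure for the setup described on the slide.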

Tests conducted
- Tests conducted using
  - dd tool: different operations and different operation block sizes (all file sizes of 70GB)
  - bonnie++: generic file system tests (ver. 1.96)
- Tests presented
  - Comparing ACFS, ext2, ext3 and encrypted ACFS (AES 192-bit)
  - ADVM used in all tests
  - dd command sequential write (synchronous and asynchronous)
  - dd command sequential read (synchronous)
  - bonnie++: file system block write, rewrite and read; file creation and deletion speed
  - Multithread tests: workload run from 1 node and 2 nodes
    - Running same workload (2 threads)
    - Running one workload, listing directories on the second node (10x/s)
    - Streaming tests: one thread writing, second thread reading
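The slides do not give the exact dd invocations, so the following is only a sketch of what sequential write and read tests of this kind commonly look like. It assumes GNU dd, where `oflag=sync` requests synchronous writes; the function name `dd_plan`, the file path and the block size are hypothetical. The function prints the command rather than running it, since a real run writes tens of gigabytes.

```shell
# dd_plan MODE FILE BLOCKSIZE COUNT
# MODE is one of: write-async, write-sync, read.
# Prints the dd command for the requested sequential test.
dd_plan() {
    case $1 in
        write-async)
            echo "dd if=/dev/zero of=$2 bs=$3 count=$4"
            ;;
        write-sync)
            # Assumption: GNU dd; oflag=sync makes each write synchronous.
            echo "dd if=/dev/zero of=$2 bs=$3 count=$4 oflag=sync"
            ;;
        read)
            echo "dd if=$2 of=/dev/null bs=$3 count=$4"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}
```

With a 1M block size, a 70GB file corresponds to `count=71680`, for example `dd_plan write-sync /mnt/acfs/testfile 1M 71680`. Varying the block size argument covers the "different operation block sizes" mentioned above.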

Write test results (results in our environment)

Read and write results (results in our environment)

bonnie++ test results (results in our environment)

Multithread test results (results in our environment)

Conclusions
- ADVM and ACFS
  - Cluster file system integrated in 11.2
  - Leverages ASM features for non-RDBMS files
- ACFS usage at CERN
  - Positive experience
    - Currently used to provide a cluster file system for our custom DB monitoring
  - Being considered for the 11g RDBMS deployments
    - To support log files on ADR, …
- ACFS performance tests
  - Positive results, more tests in progress

Acknowledgments
- CERN-IT DB group and in particular:
  - Jacek Wojcieszuk
- More info: