ASM-based storage to scale out the Database Services for Physics
Database Service Meeting - April 11th, 2006
Luca Canali, CERN
Outline
- Storage for Oracle 10g
- A solution based on cost/performance
- Some performance measurements and benchmarks
- ASM at CERN, lessons learned
Architectural Goal
- High-end performance, scalability and HA at low cost
- Conventional approach -> scale up; grid-like approach (RAC + ASM) -> scale out
Storage Solutions for RAC
- RAC implements 'shared everything' clustering
- Common storage solutions for RAC:
  - High-end SAN (ex: EMC Symmetrix, Hitachi)
  - Specialized NAS (ex: NetApp filer)
  - SAN + low-cost storage + ASM:
    - Infortrend storage arrays with FC controllers and SATA disks
    - QLogic HBAs and FC switches
- Less common solutions:
  - Direct-attached storage
  - Solid state disks
Automatic Storage Management (ASM)
- ASM is a volume manager and cluster filesystem for Oracle DB files
  - Raw IO, direct IO, asynchronous IO
- Implements S.A.M.E. (stripe and mirror everything)
- 'The software glue' to scale out for performance and HA
- Online storage reconfiguration (ex: in case of disk failure)
- ASM 'filesystems' -> disk groups (ex: DiskGrp1, DiskGrp2), with striping across all disks and mirroring across failure groups
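As an illustration of how such a disk group is defined, a minimal sketch in SQL (disk group name, failure group names and raw device paths are hypothetical, not our actual configuration):

  -- Normal-redundancy disk group: ASM stripes extents across all disks
  -- and mirrors each extent across the two failure groups
  CREATE DISKGROUP DiskGrp1 NORMAL REDUNDANCY
    FAILGROUP fgrp1 DISK '/dev/raw/raw1', '/dev/raw/raw2'
    FAILGROUP fgrp2 DISK '/dev/raw/raw3', '/dev/raw/raw4';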
Performance, Capacity and Cost
- ASM proven to 'scale out' low-cost storage:
  - I/Os per second: tests at CERN showed good scalability up to the maximum tested (64 HDs), ~100 IOPS per disk (SATA disks, small random IO)
  - Sequential throughput: scales out, but limited by the fabric to 2 Gbps per HBA; tests on a 4-node RAC at CERN -> ~800 MB/s for sequential read
  - High capacity: leverages SATA disks (typical DB size 5-10 TB)
- Comparison with the top performer, solid state disks (SSD):
  - SSD has the highest performance: ~100K IOPS, latency < 1 ms
  - BUT cost/capacity (SSD vs. SATA) > 1000, while cost/IOPS ~ 1
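A rough consistency check of these figures (assuming ~100 IOPS per SATA disk for small random IO and ~200 MB/s of usable payload per 2 Gb/s FC link):

\[ 64 \text{ disks} \times \sim 100 \text{ IOPS/disk} \approx 6\,400 \text{ random IOPS} \]
\[ 4 \text{ nodes} \times \sim 200 \text{ MB/s per 2 Gb/s HBA} \approx 800 \text{ MB/s sequential read} \]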
Orion 10.2, Sequential IO (RO)
- Bottleneck: disk array controller = 2 Gb/s (from other tests, HBA = 2 + 2 Gb/s)
Sequential IO measured with SQL
- 4 disk arrays = 4 x 2 Gb/s (measured with parallel query)
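A sketch of the kind of query used to drive sequential reads from all nodes (table name and parallel degree are illustrative, not the exact test setup):

  -- Parallel full scan to generate large sequential reads across the cluster
  SELECT /*+ FULL(t) PARALLEL(t, 16) */ COUNT(*) FROM big_table t;

  -- Aggregate read volume per instance can then be derived from the IO statistics
  SELECT inst_id, value FROM GV$SYSSTAT WHERE name = 'physical read total bytes';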
Orion 10.2, Small Random IO (RO)
Small Random IO, with SQL
- 8,675 IOPS total, ~135 IOPS per disk (uniformly distributed)
- Extra tuning: equivalent to using only the outer edges of the disks
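One way to verify that the IO load is indeed spread uniformly across the disks is to compare per-disk counters in the ASM views; a sketch (column aliases are illustrative):

  -- Per-disk read/write counters; with S.A.M.E. they should be roughly equal
  SELECT dg.name AS diskgroup, d.name AS disk, d.reads, d.writes
  FROM   v$asm_disk d, v$asm_diskgroup dg
  WHERE  d.group_number = dg.group_number
  ORDER  BY dg.name, d.name;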
Lessons Learned
- Performance: ASM and ASMLib scale out, tested with 64 HDs. IO is spread uniformly across disks and mirror pairs for HA and performance (no data yet on write activity, but no issues expected).
- Administration:
  - We experienced a few stability issues with 10.1, but all were fixed in 10.2. No pending issues with ASM.
  - 'ASM DBAs' need storage admin and sysadmin skills. Ex: we configure HBA multipathing, LUN mapping and SAN switch zoning ourselves.
  - We don't use the arrays' RAID, but deploy a custom ASM config using JBOD (HA and performance in return for the added complexity); a sketch of a typical failed-disk replacement follows below.
  - Single disk failure rate so far as expected: MTBF = 60 years.
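A minimal sketch of the failed-disk replacement mentioned above (disk group, failure group, disk name and device path are hypothetical):

  -- Drop the failed disk; ASM rebalances its data onto the surviving disks online
  ALTER DISKGROUP DiskGrp1 DROP DISK DGRP1_0007 FORCE;
  -- After the physical swap, add the new disk back into the same failure group
  ALTER DISKGROUP DiskGrp1 ADD FAILGROUP fgrp2 DISK '/dev/raw/raw8' NAME DGRP1_0007;
  -- Optionally raise the rebalance speed to finish the resync sooner
  ALTER DISKGROUP DiskGrp1 REBALANCE POWER 5;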
Conclusions
- Storage for RAC on Linux at PSS successfully consolidated using Infortrend disk arrays and Oracle ASM.
- Scalability and performance tests are positive.
- Extra effort for administration:
  - Specialized operations required mainly during installation and replacement of failed disks
  - DBAs occasionally need to wear the storage admin hat
- More info on the wiki: https://twiki.cern.ch/twiki/bin/view/PSSGroup/HAandPerf