
CERN IT Department, CH-1211 Genève 23, Switzerland – www.cern.ch/it

Storage for Data Management and Physics Databases
Luca Canali, IT-DM
After-C5 Presentation, CERN, May 8th, 2009

Outline
– An overview of the storage challenges for data management and physics databases
  – Description of the main architectural details
  – Production experience
  – Our working model for storage sizing, architecture and rollout
– Our activities in testing new storage solutions
  – Results detailed for FC and iSCSI

Data management and storage
Service managers handle a 'deep technology stack'. For example, DBAs at CERN:
– are more involved in the lower layers of the stack than 'traditional' DBAs
– run the DB service (match users' requirements, help application developers optimize their applications)
– keep the systems up, tune SQL execution, handle backup, security, replication
– install, monitor and patch the database software
– are involved in activities related to HW provisioning, setup, tuning and monitoring: servers, storage, network

Enterprise-class vs. commodity HW
– (diagram) RAC + ASM: grid-like, scale-out approach

A real-world example, RAC7

Why storage is a very interesting area in the coming years
– The storage market is very conservative
  – Few vendors share the market for large enterprise solutions
  – Enterprise storage typically carries a high premium
– Opportunities
  – Commodity HW / grid-like solutions provide an order-of-magnitude gain in cost/performance
  – New products coming to the market promise many changes: solid state disks, high-capacity disks, high-performance and low-cost interconnects

Commodity HW for critical data-handling services
– High-end and low-end storage can both be bought and used out of the box
– Low-end storage for critical services, however, requires customization
– Recipe for production rollout:
  – Understand the requirements
  – Consult storage and HW procurement experts
  – Decide on a suitable architecture
  – Test and measure (and learn from production)
  – Deploy the 'right' hardware and software to achieve the desired level of high availability and performance, and share with Tier1s and online

HW layer – the HD, the basic element
– Hard disk technology has been the basic building block of storage for 40 years
– Main intrinsic limitation: latency

HD specs
HDs are limited:
– Seek time in particular is unavoidable (7.2k to 15k rpm, ~2–10 ms), capping random IOPS (a rough estimate follows)
– Throughput ~100 MB/s, typically limited by the interface
– Capacity range 300 GB – 2 TB
– Failure modes: mechanical, electrical, magnetic, firmware issues
– In our experience with ~2000 disks in production, we see about 1 disk failure per week
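
A rough back-of-the-envelope check of where those limits come from (a sketch with assumed seek times per drive class, not figures from the slide): per-disk random IOPS follow from seek time plus half a rotation.

    # Per-disk random-IOPS estimate: service time = seek + half a rotation.
    # Seek times here are illustrative assumptions for each drive class.
    for rpm, seek_ms in [(7200, 8.5), (10000, 4.5), (15000, 3.5)]:
        half_rotation_ms = 0.5 * 60_000 / rpm   # half a revolution, in ms
        service_ms = seek_ms + half_rotation_ms
        print(f"{rpm:>5} rpm: ~{1000 / service_ms:.0f} IOPS")
    # -> roughly 80 IOPS at 7200 rpm, up to ~180 IOPS at 15000 rpm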

Enterprise disks
– Performance
  – Enterprise disks offer more 'performance': they spin faster and use better interconnect protocols (e.g. SAS vs. SATA)
  – Typically of low capacity
  – Our experience: often not competitive in cost/performance vs. SATA
– Reliability
  – Evidence suggests that low-end and high-end disks do not differ significantly

HD wrap-up
– HD is an old but evergreen technology
– Disk capacities have increased by an order of magnitude in just a few years
– At the same time, prices have gone down (below 0.1 USD per GB for consumer products)
– 1.5 TB consumer disks and 450 GB enterprise disks are common
– 2.5'' drives are becoming standard, to reduce power consumption

Scaling out the disk
– The challenge for storage systems: scale out single-disk performance to meet demands
  – Throughput
  – IOPS
  – Latency
  – Capacity
– Sizing storage systems
  – Must focus on the critical metric(s)
  – Avoid the 'capacity trap' (see the sketch below)
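
A minimal sizing sketch of the 'capacity trap' (the requirement figures below are invented for illustration; the per-disk figures follow the HD specs above): the disk count is set by whichever metric binds, and for databases that is usually IOPS, not capacity.

    # Hypothetical sizing: compute disks needed per metric, take the maximum.
    req_iops, req_mbps, req_tb = 20_000, 2_000, 50   # assumed requirements
    disk_iops, disk_mbps, disk_tb = 100, 100, 1.0    # rough per-SATA-drive figures

    need = {
        "IOPS":       req_iops / disk_iops,          # 200 disks
        "throughput": req_mbps / disk_mbps,          # 20 disks
        "capacity":   req_tb / disk_tb,              # 50 disks
    }
    binding = max(need, key=need.get)
    print(need, "-> size for", binding)
    # Sizing by capacity alone would give 4x too few spindles for the IOPS.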

RAID and redundancy
– Storage arrays with RAID are the traditional approach to protecting data
  – Parity-based: RAID5, RAID6
  – Stripe and mirror: RAID10
– Scalability problem of RAID
  – For very large configurations, the expected time between two disk failures can come close to the RAID volume rebuild time (!) – see the estimate below
  – That is also why RAID6 is becoming more popular than RAID5
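
A rough illustration of that arithmetic (the rebuild time is an assumption; the fleet failure rate comes from the HD specs slide above):

    # Expected disk failures elsewhere in the fleet during one rebuild window.
    n_disks = 2000              # fleet size, as on the HD specs slide
    failures_per_week = 1.0     # observed fleet-wide failure rate
    rebuild_hours = 24.0        # assumed rebuild time for a large SATA volume

    rate_per_hour = failures_per_week / (7 * 24)
    print(f"~{rate_per_hour * rebuild_hours:.2f} expected failures per rebuild")
    # ~0.14: about one rebuild in seven overlaps a further failure somewhere.
    # Only a failure in the same RAID group loses data, but the odds grow with
    # array size -- hence two-parity RAID6 over RAID5.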

Beyond RAID
– Google and Amazon don't use RAID
– Main idea:
  – Divide data into 'chunks'
  – Write multiple copies of each chunk
  – Examples: Google File System, Amazon S3
– Additional advantages:
  – Removes the constraint of storing all redundancy locally inside one storage array
  – Facilitates moving, refreshing and relocating data chunks
  – Allows deploying low-cost arrays

Our experience
– Physics DB storage uses Oracle ASM
  – Volume manager and cluster file system integrated with Oracle
  – Soon to serve also as a general-purpose cluster file system (we are involved in 11gR2 beta testing)
  – Oracle files are divided into chunks
  – Chunks are distributed evenly across the storage
  – Chunks are written in multiple copies (2 or 3, depending on file type and configuration)
  – This allows the use of low-cost storage arrays: no RAID support is needed (JBOD is enough); a toy sketch of the placement idea follows
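
A toy sketch of the placement principle (not ASM's actual algorithm): every chunk is copied to two distinct arrays, so losing any single array never removes both copies, and no array-level RAID is required.

    import random

    # Toy mirrored-chunk placement across JBOD arrays (illustration only).
    arrays = [f"array{i}" for i in range(8)]

    def place_chunk(chunk_id, copies=2):
        """Pick `copies` distinct arrays for one chunk of a file."""
        rng = random.Random(chunk_id)        # deterministic per chunk
        return rng.sample(arrays, copies)    # distinct arrays per chunk

    placement = {c: place_chunk(c) for c in range(6)}
    print(placement)  # each chunk survives the loss of any one array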

The interconnect
– Several technologies are available, with different characteristics:
  – SAN
  – NAS
  – iSCSI
  – Direct attach

The interconnect
– The throughput challenge: it takes about a day to copy/backup 10 TB over a 1 Gbps network (see the arithmetic below)
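
The arithmetic behind that claim (assuming the full nominal line rate is usable):

    # Time to move 10 TB over a 1 Gbps link at nominal line rate.
    data_bytes = 10e12               # 10 TB
    link_bytes_per_s = 1e9 / 8       # 1 Gbps = 125 MB/s
    hours = data_bytes / link_bytes_per_s / 3600
    print(f"~{hours:.0f} hours")     # ~22 hours, i.e. about one day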

Fibre channel SAN
– FC SAN is currently the most used architecture for enterprise-level storage
  – Fast, with low overhead on the server CPU
– Used for the physics DBs and at Tier1s
– SAN networks with up to 64 ports at low cost
– Measured: 8 Gbps transfer rate (4+4 Gbps dual-ported HBAs, for redundancy and load balancing)
– Proof of concept: LAN-free FC backup reached full utilization of the tape heads
– Scalable: proof-of-concept 'Oracle supercluster' of 410 SATA disks and 14 dual quad-core servers

Case study: the largest cluster I have ever installed, RAC5
– The test used 14 servers

Multipathed fibre channel
– 8 FC switches: 4 Gbps (10 Gbps uplink)

Many spindles
– 26 storage arrays (16 SATA disks each)

Case study: I/O metrics for the RAC5 cluster
– Measured, sequential I/O
  – Read: 6 GB/s
  – Read-write: 3+3 GB/s
– Measured, small random I/O
  – Read: 40K IOPS (8 KB read operations)
– Note (a per-disk breakdown follows):
  – 410 SATA disks, 26 HBAs on the storage arrays
  – Servers: 14 x 4+4 Gbps HBAs, 112 cores, 224 GB of RAM
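
Dividing the aggregates by the disk count shows they are consistent with the per-disk limits discussed earlier:

    # Per-disk share of the measured RAC5 aggregates (410 SATA disks).
    n_disks = 410
    print(f"{6000 / n_disks:.1f} MB/s per disk, sequential read")    # ~14.6 MB/s
    print(f"{40000 / n_disks:.0f} IOPS per disk, 8 KB random read")  # ~98 IOPS
    # ~100 random IOPS per drive matches the single-disk estimate; sequential
    # throughput is capped by the shared FC paths rather than by the disks.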

Testing storage
– ORION
  – Oracle provides a testing utility that has proven to give the same results as more complex SQL-based tests
  – Sharing our experience: it is not just a DBA tool, it can be used to test storage for other purposes
  – Used for stress testing (our experience: it identified controller problems in RAC5 in 2008)
– In the following, some examples of results:
  – Metrics measured for various disk types
  – FC results
  – iSCSI 1 Gbps and 10 GigE results

Metrics of interest
– Basic I/O metrics measured by ORION:
  – IOPS for random I/O (8 KB)
  – MBPS for sequential I/O (in chunks of 1 MB)
  – Latency associated with the I/O operations
– Simple to use (a minimal DIY equivalent follows)
  – Getting started: ./orion_linux_em64t -run simple -testname mytest -num_disks 2
  – More info: /PSSGroup/OrionTests
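
Where ORION is not available, the 8 KB random-read measurement can be approximated directly. A minimal Linux-only sketch (the device path is a placeholder; O_DIRECT needs page-aligned buffers and block-aligned offsets):

    import mmap, os, random, time

    # Minimal 8 KB random-read benchmark in the spirit of ORION's small-I/O
    # test. Linux-only; /dev/sdb is a placeholder for a test device.
    path, io_size, n_ios = "/dev/sdb", 8192, 10_000

    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)   # bypass the page cache
    dev_size = os.lseek(fd, 0, os.SEEK_END)
    buf = mmap.mmap(-1, io_size)                    # page-aligned buffer

    t0 = time.monotonic()
    for _ in range(n_ios):
        off = random.randrange(dev_size // io_size) * io_size  # aligned offset
        os.preadv(fd, [buf], off)
    dt = time.monotonic() - t0
    print(f"{n_ios / dt:.0f} IOPS, {1000 * dt / n_ios:.2f} ms average latency")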

ORION output, an example

ORION results, small random read IOPS

    Disks used                    Array               IOPS   IOPS/disk   Mirrored capacity
    128x SATA                     Infortrend 16-bay    -        -          - TB
    120x Raptor 2.5''             Infortrend 12-bay    -        -          - TB
    144x WD 'Green disks'         Infortrend 12-bay    -        -          22 TB
    96x Raptor 3.5'' (cmsonline)  Infortrend 16-bay    -        -          - TB
    80x SAS (Pics)                NetApp RAID-DP       -        -          - TB

(The numeric IOPS, IOPS/disk and most capacity values did not survive transcription.)

iSCSI
– iSCSI is interesting for cost reduction
  – Get rid of the 'specialized' FC network
– Many concerns on performance though, due to:
  – IP interconnect throughput
  – CPU usage
– Adoption seems to be limited to low-end systems at the moment
– 10 GigE tests very promising

iSCSI 1 Gbps, Infortrend – scalability tests, IOPS, FC vs. iSCSI (data: D. Wojcik)

iSCSI 1 Gbps, Infortrend – scalability tests, IOPS, FC vs. iSCSI, continued (data: D. Wojcik)

iSCSI tests, 10 GigE
– Recent ORION tests on 10 GigE iSCSI
  – 'CERN-made' disk servers that export storage as iSCSI over 10 GigE
  – Details of the HW on the next slide
  – ORION tests up to 3 disk arrays (of 14 drives each)
  – Almost linear scalability
  – Up to 42 disks tested -> 4000 IOPS at saturation
  – 85% CPU idle during the test
  – IOPS of a single disk: ~110 IOPS
  – Overall, these are preliminary test data
– Data: A. Horvath

iSCSI on 10 GigE, HW details
– Test HW installed by IT-FIO and IT-CS:
  – 2 Clovertown quad-core processors at 2.00 GHz
  – Blackford mainboard
  – 8 GB of RAM
  – 16 SATA-2 drives of 500 GB, 7200 rpm
  – RAID controller 3ware 9650SE-16ML
  – Intel 10GigE dual-port server adapter, PCI Express (EXPX9502CX4 – Oplin)
  – HP ProCurve 10GigE switch
– Data: H. Meinhard

NAS
– IT-DES has experience of using NAS for databases
– A NetApp filer can use several protocols, the main one being NFS
  – Compared to FC, throughput is limited by Gbps Ethernet; trunking or 10 GigE are also possible
– Overall a different solution from SAN and iSCSI:
  – The filer contains a server with CPU and OS
  – In particular, the proprietary WAFL filesystem can create read-only snapshots
  – The proprietary Data ONTAP OS runs on the filer box
  – Higher cost, due to the high-end features

The quest for ultimate latency reduction
– Solid state disks provide unique specs:
  – Seek times at least one order of magnitude better than the best HDs
  – A single disk can provide >10k random read IOPS
  – High read throughput

SSD (flash) problems
– Flash-based SSDs still suffer from major problems for enterprise solutions:
  – Cost/GB: more than 10 times that of 'normal' HDs
  – Small capacity compared to HDs
  – Several issues with write performance
  – Limited number of erase cycles (a rough endurance estimate follows)
  – Need to write entire cells (an issue for transactional activities)
  – Workarounds for write performance and cell lifetime are being implemented; quality differs by vendor and grade
  – A field in rapid evolution
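
To see why erase cycles matter, a back-of-the-envelope endurance estimate (every figure below is an illustrative assumption, not a vendor spec):

    # Rough SSD lifetime under a sustained write load (illustrative numbers).
    capacity_gb = 128           # assumed drive capacity
    erase_cycles = 10_000       # assumed cycles per cell, MLC-class flash
    write_amplification = 2.0   # extra internal writes from whole-cell rewrites
    writes_gb_per_day = 100     # assumed sustained database write load

    total_writable_gb = capacity_gb * erase_cycles / write_amplification
    years = total_writable_gb / writes_gb_per_day / 365
    print(f"~{years:.1f} years of life")
    # ~17.5 years here, but a more random write pattern pushes write
    # amplification up and shrinks the lifetime quickly.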

Conclusions
– Storage technologies are in a very interesting evolution phase
– On one side, 'old-fashioned' storage technologies give more capacity and performance for a lower price every year
  – Currently used for production by the physics DB services (offline and online) and the Tier1s
– New ideas and implementations are emerging for scaling out very large data sets without RAID
  – Google File System, Amazon S3, Sun's ZFS
  – Oracle's ASM (in production at CERN and the Tier1s)
– 10 GigE Ethernet and SSDs are new players in the storage game, with high potential
  – 10 GigE iSCSI tests with FIO and CS are very promising

Acknowledgments
Many thanks to Dawid, Jacek, Maria, Andras, Andreas, Helge, Tim Bell, Bernd, Eric, Nilo