CERN Disk Storage Technology Choices. LCG-France Meeting, April 8th 2005. CERN.ch


CERN.ch 2 History
• 99/00 – EIDE disk server evaluation
• 2001 – Problem with IBM disks
  – But no serious worries about the model at that stage
• 2002 – Continued expansion
• 2003 – Major problem with new servers
  – Significant impact on servers (and support staff!)
  – Entire low-cost disk server model questioned
• 2004 – Western Digital admit disk problem
  – 1224 disks replaced
  – Confidence in low-cost model restored
    » Added 75 servers with 1800 SATA disks for 175TB usable capacity
    » Installed base exceeds 500TB across ~350 servers
• 2005 – Plan to buy <3CHF/GB (usable)

CERN.ch 3 Cost Evolution
[Figure: cost evolution, usable (RAID 5) and gross capacity; series: Jumbos, 4U…8U rackmount, FC attached disk array]

CERN.ch 4 (Some) Options
• SAN vs NAS
• SCSI vs FC vs SATA
• “In a box” vs server and trays
• “White box” vs major vendor

CERN.ch 5 (Some) Options
• SAN vs NAS
  – SAN-style solutions are not obviously advantageous for the HEP use pattern, and require expensive infrastructure.
  – iSCSI maturing, but not there yet.
  – Could be great with a global file system, but that technology is not mature either.
  – CERN choice: large scale storage as NAS
    » But exploring SAN for some special uses, e.g. tape->disk transfer
• SCSI vs FC vs SATA
• “In a box” vs server and trays
• “White box” vs major vendor

CERN.ch 6 (Some) Options
• SAN vs NAS
• SCSI vs FC vs EIDE/SATA
  – Common view is that EIDE/SATA disks are less reliable.
    » Most reliable platters integrated with higher value electronics
  – No evidence for lower reliability of EIDE vs SCSI at CERN.
    » MTBF for both ~ ,000hrs
    » Historically, bad batch of disks (SCSI or EIDE) every 2-3 years
  – But, note some SATA disks are rated for intermittent operation, others for 24x7 operation.
  – CERN choice: SATA disks rated for 24x7 operation
    » (high capacity 7,200rpm, not lower capacity 10,000rpm)
• “In a box” vs server and trays
• “White box” vs major vendor
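To get a feel for what an MTBF figure means at the scale of a disk server farm, the following is a minimal sketch that converts MTBF into an expected number of disk failures per year. It assumes a constant failure rate (exponential lifetimes) and prompt replacement of failed disks. The MTBF value used in the example run is a hypothetical placeholder, since the figure on the slide did not survive in the transcript, and the function names are illustrative only.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous (24x7) operation


def annual_failure_probability(mtbf_hours: float) -> float:
    """Probability that one disk fails within a year of 24x7 operation,
    assuming a constant failure rate (exponential lifetime model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(n_disks: int, mtbf_hours: float) -> float:
    """Expected number of disk failures per year across a population,
    assuming failed disks are replaced promptly."""
    return n_disks * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    # Hypothetical placeholder value, NOT the figure from the slide.
    ASSUMED_MTBF_HOURS = 500_000.0
    for n_disks in (1_800, 10_000):
        failures = expected_failures_per_year(n_disks, ASSUMED_MTBF_HOURS)
        print(f"{n_disks} disks: about {failures:.1f} failures per year")
```

A model like this ignores the "bad batch" effect mentioned on the slide, so real failure counts in an affected year could be far higher than the steady-state estimate.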

CERN.ch 7 (Some) Options
• SAN vs NAS
• SCSI vs FC vs SATA
• “In a box” vs server and trays
  – Specialist disk trays seen by some as better quality than trays in PC server chassis.
  – Possibly true, but who is responsible if there are communication problems between the disk tray and the server?
  – CERN choice: integrated system from one vendor
    » “storage in a box” has won every tender to date.
    » work specifications to ensure high quality chassis & trays.
• “White box” vs major vendor

CERN.ch 8 (Some) Options
• SAN vs NAS
• SCSI vs FC vs SATA
• “In a box” vs server and trays
• “White box” vs major vendor
  – Major vendors claim better reliability…
  – … but are unable to explain how they achieve this
    » the underlying components are generally identical
  – CERN “choice”: free competition and white boxes win
    » but some white boxes are more equal than others; unfortunately CERN rules make prior selection of companies based on proven past performance rather difficult
    » Long term relationship with at least 3 suppliers would be good.

CERN.ch 9 RAID and filesystems
• Originally mirrored the disks; redundancy with maximum performance (#independent spindles)
  – mirrored EIDE still cheaper than SCSI per usable byte!
• Gradually became less worried about disk performance
  – required I/O bandwidth per TB falls with each tender;
    » current systems can saturate GigE interface
    » disk sizes continue to increase
    » observed performance still below server capability
• Current CERN choice: hardware RAID 5 with XFS
  – Hardware RAID 5 performance has greatly improved
  – ReiserFS still immature
  – Some tests of hardware RAID 5 with software RAID 0; performance poor.
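The trade-off between mirroring and RAID 5, and the falling network bandwidth per TB behind a single GigE interface, can be illustrated with a small sketch. The disk counts and sizes in the example run are hypothetical, not CERN server configurations, and the function names are made up for illustration. The 125 MB/s figure is simply the raw Gigabit Ethernet line rate before protocol overhead.

```python
GIGE_MB_PER_S = 125.0  # 1 Gbit/s raw line rate in MB/s, before protocol overhead


def usable_tb(n_disks: int, disk_gb: float, layout: str) -> float:
    """Usable capacity in TB (1 TB = 1000 GB) for a single array.

    "mirror": every disk has a mirror partner, so half the disks hold data.
    "raid5":  one disk's worth of capacity is used for parity.
    """
    if layout == "mirror":
        if n_disks < 2 or n_disks % 2 != 0:
            raise ValueError("mirroring needs an even number of disks")
        data_disks = n_disks // 2
    elif layout == "raid5":
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        data_disks = n_disks - 1
    else:
        raise ValueError(f"unknown layout: {layout}")
    return data_disks * disk_gb / 1000.0


def network_mb_per_s_per_tb(usable_capacity_tb: float) -> float:
    """Network bandwidth available per usable TB behind one GigE interface."""
    return GIGE_MB_PER_S / usable_capacity_tb


if __name__ == "__main__":
    # Hypothetical server: 12 disks of 250 GB each.
    for layout in ("mirror", "raid5"):
        capacity = usable_tb(12, 250.0, layout)
        print(layout, capacity, "TB usable,",
              round(network_mb_per_s_per_tb(capacity), 1), "MB/s per TB")
```

With the interface speed fixed, any increase in disk size or data-disk count lowers the bandwidth available per TB, which is consistent with the slide's point that per-TB bandwidth requirements have had to fall with each tender.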

CERN.ch 10 Hardware will fail

CERN.ch 11 Hardware will fail
• On delivery or due to systematic h/w problem
  – CERN choice: dual source major procurements
• In service
  – RAIDx
  – Hot spares
    » Probability of 2nd disk failure during RAID array rebuild is
      ◦ a concern for 250GB disks
      ◦ likely a significant problem for 400GB disks
      ◦ a certainty for 1TB disks in large scale installations?
• However, this is a concern for any architecture with an equivalent number of disks.
  – Remember: CERN sees equivalent MTBF figures for SCSI and EIDE disks.
    » Although SCSI disks are lower capacity and higher bandwidth, so reducing the window for a 2nd failure.
  – Be prepared…
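The growth of the second-failure risk with disk size can be sketched with a simple model: the rebuild window scales with disk capacity, and during that window any of the surviving disks may fail. The sketch below assumes independent failures at a constant rate; the MTBF, rebuild rate, and array size in the example run are hypothetical placeholders, not values from the slides. It does not model correlated failures (bad batches) or unrecoverable read errors during the rebuild, so it probably understates the real risk.

```python
import math


def rebuild_hours(disk_gb: float, rebuild_mb_per_s: float) -> float:
    """Time to rebuild one failed disk at a sustained rebuild rate."""
    return disk_gb * 1000.0 / rebuild_mb_per_s / 3600.0


def p_second_failure(n_disks_in_array: int, disk_gb: float,
                     mtbf_hours: float, rebuild_mb_per_s: float) -> float:
    """Probability that at least one surviving disk fails during the rebuild,
    assuming independent exponential lifetimes with the given MTBF."""
    surviving = n_disks_in_array - 1
    window = rebuild_hours(disk_gb, rebuild_mb_per_s)
    return 1.0 - math.exp(-surviving * window / mtbf_hours)


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    ASSUMED_MTBF_HOURS = 500_000.0
    ASSUMED_REBUILD_MB_PER_S = 10.0
    ASSUMED_DISKS_PER_ARRAY = 12
    for disk_gb in (250.0, 400.0, 1000.0):
        p = p_second_failure(ASSUMED_DISKS_PER_ARRAY, disk_gb,
                             ASSUMED_MTBF_HOURS, ASSUMED_REBUILD_MB_PER_S)
        print(f"{disk_gb:.0f} GB disks: rebuild {rebuild_hours(disk_gb, ASSUMED_REBUILD_MB_PER_S):.1f} h, "
              f"P(second failure) = {p:.2e}")
```

The per-rebuild probability has to be multiplied by the number of rebuilds a large installation performs each year, which is why the slide frames the problem in terms of large scale installations.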

CERN.ch 12 Summary
• CERN
  – will have a (Gigabit-)Ethernet based NAS configuration for bulk disk storage for LHC
  – is not convinced TCO concerns justify a higher initial purchase cost for SCSI/FC disk
  – buys (and will buy) SATA disk from the lowest bidder, but
    » with as much pre-selection of bidders as we are allowed,
    » dual sourcing all purchases to minimise risk of major problems due to systematic failures, and
    » with warranty (3 years) to encourage initial quality
  – is focussing strongly on
    » redundancy,
    » rigour and organisation in operational procedures, and on
    » anonymity for disk servers, just as for CPU servers.

CERN.ch 13 Summary
• The summary points on the previous slide are valid whatever the disk technology!