Low-cost BYO Mass Storage Project

Slides:



Advertisements
Similar presentations
Archive Task Team (ATT) Disk Storage Stuart Doescher, USGS (Ken Gacke) WGISS-18 September 2004 Beijing, China.
Advertisements

Computer Basics Whats that thingamagige?. Parts of a computer.
Hardware specification. The “brain” The brain processes instructions, performs calculations, and manages the flow of information through the body The.
NAS vs. SAN 10/2010 Palestinian Land Authority IT Department By Nahreen Ameen 1.
Preparing For Server Installation Instructor: Enoch E. Damson.
1 CSC 486/586 Network Storage. 2 Objectives Familiarization with network data storage technologies Understanding of RAID concepts and RAID levels Discuss.
MCITP Guide to Microsoft Windows Server 2008 Server Administration (Exam #70-646) Chapter 11 Windows Server 2008 Virtualization.
1 System Software “Background software”, manages the computer’s internal resources.
Chapter 3 Chapter 3: Server Hardware. Chapter 3 Learning Objectives n Describe the base system requirements for Windows NT 4.0 Server n Explain how to.
1 DOS with Windows 3.1 and 3.11 Operating Environments n Designed to allow applications to have a graphical interface DOS runs in the background as the.
© 2009 IBM Corporation Statements of IBM future plans and directions are provided for information purposes only. Plans and direction are subject to change.
2 © 2004, Cisco Systems, Inc. All rights reserved. IT Essentials I v. 3 Module 9 Advanced Hardware Fundamentals for Servers.
Data Storage Willis Kim 14 May Types of storages Direct Attached Storage – storage hardware that connects to a single server Direct Attached Storage.
Virtual Network Servers. What is a Server? 1. A software application that provides a specific one or more services to other computers  Example: Apache.
Secondary Storage Unit 013: Systems Architecture Workbook: Secondary Storage 1G.
BACKUP/MASTER: Immediate Relief with Disk Backup Presented by W. Curtis Preston VP, Service Development GlassHouse Technologies, Inc.
Database Storage Considerations Adam Backman White Star Software DB-05:
Adam Leidigh Brandon Pyle Bernardo Ruiz Daniel Nakamura Arianna Campos.
Elements of a Computer System Dr Kathryn Merrick Thursday 4 th June, 2009.
Servers Last Update Copyright Kenneth M. Chipps Ph.D. 1.
Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides.
Chapter 8 Implementing Disaster Recovery and High Availability Hands-On Virtual Computing.
Planning and Designing Server Virtualisation.
1 Selecting LAN server (Week 3, Monday 9/8/2003) © Abdou Illia, Fall 2003.
1 U.S. Department of the Interior U.S. Geological Survey Contractor for the USGS at the EROS Data Center EDC CR1 Storage Architecture August 2003 Ken Gacke.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
By : Reem Hasayen. A storage device is a hardware device capable of storing information. There are two types of storage devices used in computers 1. Primary.
 The End to the Means › (According to IBM ) › 03.ibm.com/innovation/us/thesmartercity/in dex_flash.html?cmp=blank&cm=v&csr=chap ter_edu&cr=youtube&ct=usbrv111&cn=agus.
The motherboard was made to hold the CPU and allow the owner to upgrade the machine on their own.
Tackling I/O Issues 1 David Race 16 March 2010.
Jérôme Jaussaud, Senior Product Manager
Seagate Confidential Battle Card: Seagate Business Storage 8-bay Rackmount NAS THE BASICS Target Customers & Relevant Value KEEP IN MIND “Need-To-Know”
Extending Auto-Tiering to the Cloud For additional, on-demand, offsite storage resources 1.
Computer Hardware. Focus Items  Design systems that meet business needs  Hardware industry trends  Problems Legacy hardware (and software) Dealing.
Server Performance, Scaling, Reliability and Configuration Norman White.
Testing the Zambeel Aztera Chris Brew FermilabCD/CSS/SCS Caveat: This is very much a work in progress. The results presented are from jobs run in the last.
Enterprise Vitrualization by Ernest de León. Brief Overview.
System Storage TM © 2007 IBM Corporation IBM System Storage™ DS3000 Series Jüri Joonsaar Tartu.
Computer Hardware What is a CPU.
Open-E Data Storage Software (DSS V6)
Managing Explosive Data Growth
Virtualization for Cloud Computing
Storage Area Networks The Basics.
Integrating Disk into Backup for Faster Restores
Video Security Design Workshop:
Processing Device and Storage Devices
iSCSI Storage Area Network
Installation 1. Installation Sources
Chapter 1: Minimizing Power Usage
Computer Hard Drive.
Principles of Information Technology
Cluster Active Archive
SAN and NAS.
Introduction to Networks
Storage Virtualization
Computer System Basics- The Pieces & Parts
Chapter 7.
Distinguish between primary and secondary storage.
CSE 451: Operating Systems Spring 2006 Module 18 Redundant Arrays of Inexpensive Disks (RAID) John Zahorjan Allen Center.
File Manager for Microsoft Office 365, SharePoint, and OneDrive: Extensible Via Custom Connectors in Enterprise Deployments, Ideal for End Users OFFICE.
Web Server Administration
Outline Module 1 and 2 dealt with processes, scheduling and synchronization Next two modules will deal with memory and storage Processes require data to.
What is iSCSI and why is it a major selling point for NAS?
Chapter 2: Operating-System Structures
Cost Effective Network Storage Solutions
Cloud? Computing? noSQL vs SQL RAID 0,1,5,6,10
Year 10 Computer Science Hardware - CPU and RAM.
Hard Drives & RAID PM Video 10:28
Chapter 2: Operating-System Structures
CS 295: Modern Systems Organizing Storage Devices
Presentation transcript:

Low-cost BYO Mass Storage Project 8/19/2009 Low-cost BYO Mass Storage Project James Cizek Unix Systems Manager Academic Computing and Networking Services

CHECO - Fall Conference The Problem Reduced Budget Storage needs growing Storage needs changing (Tiered Storage) I NEED MORE DISK SPACE! (DBA’s!!) Current commercial offerings are not addressing this problem without major budget implications September 21, 2010 CHECO - Fall Conference

Projected Needs (2009 Survey) September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference The Goal Find a mass storage solution that won’t break the bank CSU attempted NSF grant to meet this need ($1 million for 500 TB x 2), but were not awarded the grant (1,000 1 TB drives!!!) Vendors sell high-speed, costly systems (suitable for Amazon, Google, etc.), but we want slower, low-cost Looking at vendor offerings, we decided to “roll our own” Maximize TB/$$ with reasonable assurance that data are redundant and safe September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Some Understandings Project approached as “Secondary” or “Tier 2” type storage, not intended to replace extremely fast, ultra-reliable, expensive disk systems Device management, support, and component failure need to be addressed September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference A starting point Online backup company “Backblaze” open- sourced their storage pod design, see https://www.backblaze.com/petabytes-on-a-budget-how-to-build-cheap-cloud-storage.html Starting with a proven design would eliminate many unknowns and speed up our design process Turned out to be helpful, but ran into many of our own headaches September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference The BackBlaze design September 21, 2010 CHECO - Fall Conference

BackBlaze vs. CSU design goals Realized that the BackBlaze design didn’t exactly meet our requirements No redundant power supplies Cheap SATA cards didn’t take advantage of performance available by having large number of spinning hard drives Case too small to accommodate server-class motherboard Single “system” hard drive is single point of failure. Realized the need to over-engineer cooling and vibration reduction (2 major contributors to drive failure) Chassis was red instead of CSU green! September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference CSU design changes Lengthened case by 3 inches to accommodate dual CPU server-class motherboard Added more RAM for file system buffering (6 GB compared to BackBlaze 4GB) Added larger, redundant power supplies - individual supply can run entire case Used “Enterprise” grade drives instead of consumer grade, after much research Drives selected have vibration sense / damping Replaced cheap SATA cards with high- performance PCI-e cards September 21, 2010 CHECO - Fall Conference

CSU chassis nearing completion September 21, 2010 CHECO - Fall Conference

CSU chassis nearing completion 8/19/2009 CSU chassis nearing completion September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Costs Case: $700 1 TB Drives: $100 x 45 ($4,500) Drives were purchased earlier this year, now 1.5TB for $100 Motherboard / Processors / Memory: $900 Power Supplies: $200 SATA cards: $300 Ethernet card with iSCSI offload: $350 SATA Multipliers: $45 x 9 ($405) Fans/Cables/Hardware/DVD/Mounts/etc.: $1,000 Total: 45 Raw TB for $8,355! September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Testing Environment Testing was done with both small files and large files (Larger than largest memory buffer) Same data was used for all tests. Allowed us to validate results from various benchmark utilities against each other All RAID configurations were done in multiples of 3 to spread load across as many backplanes as possible All test data below assume worst case (Reads all random, writes all continuous) Highest recorded temperature (excluding CPU exhaust fan) under full load is 100F (ambient office temperature at input, should see even more improvement in datacenter) September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Initial Performance 9-Drive Raid 6 Write – 100.32 MB/s 9-Drive Raid 6 Read – 107 MB/s 9-Drive Raid 5 Write – 103.2 MB/s 9-Drive Raid 5 Read – 106.8 MB/s 18-Drive Raid 6 Write – 98.26 MB/s 18-Drive Raid 6 Read – 106 MB/s RAID sets less than 6 drives showed degraded performance, RAID sets above 18 drives showed only small performance benefits September 21, 2010 CHECO - Fall Conference

Cost / Performance Comparison We are using IBM DS4300 and DS4700 Fibre channel disk systems as Tier 1 disk in the unix environment. These use 18U and nearly $100K We are using Equallogic (various models) iSCSI arrays for Tier 1 disk in windows environment. P6500E model hold 48 TB but runs near $80K We are using “Jetstor” SATA based products for Tier 2. 16TB capacity for $8000 Although Fibre channel capable, have no ability to present disk space standalone (i.e. must be connected to a server) DIY disk box is 45 (67) TB for $8300 in 4U September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Configuration Much was learned during testing RAID levels, 5 & 6 tested, 5 faster, but not enough to disregard the added safety of 6. 1 & 10 not considered Operating system – Debian 64bit Server Performance testing – unix DD, IOmeter, IOzone Consistent data obtained from all tools Connection offerings (Fibre, iSCSI, NFS, AOE) Fibre iSCSI NFS (SLOW!!!) AOE (Working out kinks) September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Challenges ahead Support management (What happens when a disk fails?) Backup and protection of stored data Mirroring units Avoid backing up to enterprise backup system Data storage and protection policies Parallel file system September 21, 2010 CHECO - Fall Conference

Where will this be useful? Library digital repository Research computing HPC, tier 2 Campus wide “Cloud” storage Second or Third Tier storage for your Enterprise backups Email/File archiving Database “snapshots” kept for long term (LMS) September 21, 2010 CHECO - Fall Conference

What other possibilities? Very large JBOD (Directly attached to server) Linux server offering CIFS/NFS NAS capability iSCSI target Direct connection via FibreChannel (HPC) VMWare ESXi Standalone VM cluster with massive attached storage 4 U server running Windows/Linux/FreeNAS/OpenFiler September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Where are we today? September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Next steps at CSU Collection of final “parts list” for a complete build Documentation Put it into “semi production” and see how it performs under real-world situations September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Resources http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ (Original BackBlaze project) http://www.ctcustomfab.com (Cases) http://www.chyangfun.com (SATA multipliers) http://www.colostate.edu/curtisb/mass_storage (Wiki on CSU progress) September 21, 2010 CHECO - Fall Conference

CHECO - Fall Conference Questions? james@colostate.edu September 21, 2010 CHECO - Fall Conference