Storage Spaces: Affordable and Scalable Shared Storage

Storage Spaces: Affordable and Scalable Shared Storage
William Osborne, System Engineer – Central Virginia Community College

Agenda
What is Storage Spaces?
History
Features
Technical Implementation
Performance considerations
New Features in Server 2016
Storage Spaces in the real world

What is Storage Spaces?
A software solution developed by Microsoft for providing enterprise-grade storage using commodity off-the-shelf hardware.
Originally developed for in-house use.
Designed to scale in both capacity and performance.
Included in Windows Server Standard and greater at no additional cost: most features are included in Server 2012 Standard and Server 2016 Standard; certain advanced features require Server 2016 Datacenter edition.

History of Storage Spaces
Developed for the Windows Release Team.
Problems: capacity, cost, resiliency.
Needs:
Support over 1 PB of storage
Dramatically lower cost per TB
A scalable platform for future expansion in both storage capacity and I/O volume
A storage foundation for the Azure service platform

History of Storage Spaces
Raw storage is cheap.
High-speed (10 Gbps or better) NICs are readily available.
The primary difference between a bunch of raw disks and a SAN is software and support.

History of Storage Spaces
In-house development of a pure software storage solution was started by the Windows Server team.
Instead of continuing to pay for storage from EMC and NetApp, funds were redirected to development.
Used to support the development of Windows 8, Server 2012, and beyond.
Used to support workloads on Azure today.

“We’re able to use commodity hardware to achieve the same functionality [of traditional storage] at a far lower cost.”
– Jeremy Russell, Senior Development Lead, Microsoft

History of Storage Spaces
Results (these numbers are from 2012):
2x increase in raw storage throughput
3x increase in raw capacity with the same budget (traditional storage: $1,350/TB; Storage Spaces: less than $450/TB)
5x increase in effective capacity through more efficient dedupe algorithms
6x reduction in the number of file servers; many workloads can be serviced directly from the storage nodes

Hardware Platform
Storage is managed separately from compute, just like in a traditional SAN, so storage can be managed and scaled independently.
Commodity servers, networking, and storage hardware:
Inexpensive IP network hardware
Inexpensive shared JBOD storage
[Diagram: Hyper-V clusters connect over SMBv3 to Scale-Out File Server clusters running Storage Spaces for virtualization and resiliency, backed by shared JBOD storage and managed with PowerShell & SCVMM.]

Hardware Platform
[Diagram comparing two architectures:]
Traditional storage with FC/iSCSI: VM host / compute nodes connect over FC/iSCSI to a SAN/NAS storage array built from embedded CPUs and controllers (proprietary hardware) with FC/SAS/SATA disk shelves.
Windows File Server Cluster with Storage Spaces: VM host / compute nodes connect over SMBv3 to a Windows file server cluster (commodity hardware) running Storage Spaces as a Scale-Out File Server on shared SAS JBOD storage.

Components
Standard servers with SAS HBAs (not RAID controllers)
Commodity storage hardware:
Dual-domain SAS JBOD arrays with enterprise-grade redundancy
Dual-port SAS drives
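
A quick way to confirm the hardware is visible to Storage Spaces, sketched in PowerShell (output fields vary by version; this is illustrative, not the full deployment procedure):

    # List disks that are eligible to join a storage pool, with their bus type.
    Get-PhysicalDisk -CanPool $true |
        Select-Object FriendlyName, BusType, MediaType, Size, CanPool

    # On Server 2012 R2 and newer, SES-capable JBODs appear as storage enclosures.
    Get-StorageEnclosure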

Platform Design
[Diagram: VM hosts / compute nodes access clustered file servers over a high-speed SMB network (10GbE or better); the design is architecturally similar to other storage systems.]
A unified Cluster Shared Volume namespace (e.g. \\SRV\VDI, \\SRV\CRM, \\SRV\DB) aggregates the namespace for data access across volumes.
The clustered file servers host clustered Storage Spaces (mirror, simple, and parity spaces), connected by SAS links to shared SAS JBOD arrays.

Basic Capabilities
Pooling of disks
Flexible, resilient storage options
Native data striping
Enclosure aware (SES)
Parallelized rebuild utilizes spare capacity of the entire storage pool
[Diagram: a storage pool containing a mirror space (data copy 1 and data copy 2) and a parity space.]
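
A minimal pooling sketch in PowerShell; the pool name is an assumption, and the subsystem wildcard matches the default standalone storage subsystem:

    # Gather every disk that is eligible for pooling and create one pool from them.
    $disks = Get-PhysicalDisk -CanPool $true
    New-StoragePool -FriendlyName "Pool1" `
        -StorageSubSystemFriendlyName "Windows Storage*" `
        -PhysicalDisks $disks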

Basic Capabilities
Data integrity scanning: a periodic background scan detects data corruption and auto-corrects where possible; uncorrectable failures are logged. Requires the ReFS file system for best results.
Simple management interface: PowerShell, Server Manager, and SCVMM are all supported.
Continuous Availability when used in a SoFS configuration.
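
For example, a mirrored ReFS volume lets the integrity scanner repair corrupt blocks from the healthy copy (the pool, name, and size below are assumptions):

    # Carve a mirrored, ReFS-formatted volume out of the pool.
    New-Volume -StoragePoolFriendlyName "Pool1" -FriendlyName "Data1" `
        -ResiliencySettingName Mirror -FileSystem ReFS -Size 2TB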

Enterprise-Grade Features
Traditional storage: storage tiering, data deduplication, RAID groups, disk pooling, High Availability, write-back cache, snapshots, replication.
Storage Spaces: storage tiering, data deduplication, flexible resiliency options, disk pooling, Continuous Availability, write-back cache, SMB copy offload, snapshots, replication.
There is a difference between HA and CA. CA relies on key features of the SMBv3 protocol to provide resiliency in the event of a node or network failure, and the application must be SMBv3-aware to handle failures without any interruption at all. HA typically doesn't require the application to be aware of anything special. Fortunately, SMBv3 support is built in on Server 2012 and newer: any application relying on the OS storage routines will have CA support by default. Hyper-V also has special support for CA.
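
Exposing a clustered volume as a continuously available SMBv3 share might look like this sketch (the share name, path, and group are assumptions):

    # CA shares transparently reconnect clients after a node failover.
    New-SmbShare -Name "VMStore" -Path "C:\ClusterStorage\Volume1" `
        -ContinuouslyAvailable $true -FullAccess "DOMAIN\HyperVHosts"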

Planning
Choose hardware for a DIY solution or purchase a pre-configured cluster-in-a-box.
Resiliency needs: How many drive failures do you need to withstand? Is tolerating the failure of an entire enclosure necessary?
Capacity: different resiliency levels have different capacity overhead.
Performance needs: workload dependent; can be measured based on raw IOPS and throughput.

Planning
Plan additional capacity for:
Parallel rebuilds (having 1-2 extra physical disks per pool ensures rebuilds complete rapidly)
Unallocated SSDs for caching
Consider using SSDs for tiered storage to contain frequently accessed or changed data, or SSD-only pools for high-performance workloads.

Technical Implementation
Raw disks are assigned to storage pools by a storage administrator.
One or more volumes are created inside a storage pool. Each volume has its own resiliency settings.
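
A sketch of that layering, with two volumes of different resiliency in the same pool (names and sizes are illustrative):

    # Each virtual disk picks its own resiliency, independent of its neighbors.
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "FastMirror" `
        -ResiliencySettingName Mirror -Size 1TB
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "ArchiveParity" `
        -ResiliencySettingName Parity -Size 4TB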

Technical Implementation
Data for a volume is divided into blocks or chunks. These blocks are typically 256 MB by default, but the size can be customized.
Blocks are distributed across drives in a pool as required to meet the resiliency requirements of the volume.
Parity blocks are created for volumes that have the parity resiliency type.

Technical Implementation
Storage Spaces is NOT a RAID implementation.
It is tempting to think of simple volumes as RAID 0, mirror volumes as RAID 1, and parity volumes as RAID 5 or 6, but the actual implementation is different.
There are typically no guarantees about exactly which blocks a given drive may hold at any given time; blocks are moved around automatically as needed.

Understanding Mirror and Simple Spaces
Performance and resiliency with mirroring can be understood with two numbers: the number of data columns and the number of data copies.
Max read perf* = columns x copies x single-disk perf
Max write perf* = columns x single-disk perf
Minimum number of drives required to expand a storage space = columns x copies
[Diagram: a 2-column, 2-way mirrored space distributing I/O among eight physical disks.]
*Ideal performance of a single I/O operation; aggregate performance can be higher depending on queue depth (parallelism) and block distribution.
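
A worked example of that math, assuming roughly 150 MB/s per spindle: a 2-column, 2-way mirror reads at up to 2 x 2 x 150 = 600 MB/s, writes at up to 2 x 150 = 300 MB/s, and needs 2 x 2 = 4 drives to expand. Creating such a space explicitly (pool and name assumed):

    # Pin the column and copy counts rather than letting Storage Spaces choose.
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Mirror2x2" `
        -ResiliencySettingName Mirror -NumberOfDataCopies 2 `
        -NumberOfColumns 2 -Size 1TB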

Understanding Mirror and Simple Spaces
Remember: Storage Spaces ≠ RAID. The data from the previous example could be laid out differently.
[Diagram: the same 2-column, 2-way mirrored space with its blocks placed in a different arrangement across the eight physical disks.]

Drive Replacements
You must inform Storage Spaces that you are swapping out a bad drive before you do it.
It is possible that only part of the drive is bad and the drive may still have usable data blocks on it.
Even if the drive appears to be totally dead, marking it for replacement first prevents errors from occurring when a replacement drive is installed.
Marking the drive for replacement ensures that all blocks have been moved to another drive before the drive is physically removed from the storage system.
It is possible to recover from a “surprise” drive replacement, but it is best to do the replacement properly in the first place.
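
A hedged sketch of a proper replacement ("PhysicalDisk5" and "Pool1" are assumed names):

    # Retire the failing disk so no new allocations land on it.
    Set-PhysicalDisk -FriendlyName "PhysicalDisk5" -Usage Retired
    # Rebuild affected spaces onto the remaining disks.
    Get-VirtualDisk | Repair-VirtualDisk
    # Only now remove it from the pool and pull it from the enclosure.
    $bad = Get-PhysicalDisk -FriendlyName "PhysicalDisk5"
    Remove-PhysicalDisk -PhysicalDisks $bad -StoragePoolFriendlyName "Pool1"
    # After installing the new drive, add it to the pool.
    Add-PhysicalDisk -StoragePoolFriendlyName "Pool1" `
        -PhysicalDisks (Get-PhysicalDisk -CanPool $true)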

Mirror: General Purpose Storage
A 2-way mirror ensures every block is on at least 2 disks, in different enclosures if possible.
Provides resiliency for at least 1 disk failure, possibly more.
Can withstand the failure of an entire enclosure if properly configured with SES-aware enclosures.
3-way and 4-way mirrors are supported for even more redundancy.
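
Enclosure awareness is a pool-level setting; a hedged sketch (names are assumptions):

    # With enclosure awareness on, mirror copies are spread across SES-capable JBODs.
    New-StoragePool -FriendlyName "Pool1" `
        -StorageSubSystemFriendlyName "Windows Storage*" `
        -PhysicalDisks (Get-PhysicalDisk -CanPool $true) `
        -EnclosureAwareDefault $true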

Mirror: General Purpose Storage
Reads are served from any available disk; more copies increase read speed.
Writes complete on all copies roughly simultaneously. Write speed depends on the number of disks; more copies decrease write speed on average.

Parity and Dual Parity: Cheap but Slow
Parity provides resiliency for a single disk failure; dual parity provides resiliency for two disk failures.
Designed for archival workloads; not suitable for high-speed I/O requirements.
Fast rebuild times; requires fewer disks than mirrors.
Disks are cheap and becoming cheaper as time goes on.
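
Dual parity is selected by the redundancy level rather than a separate resiliency name; a hedged sketch (pool, name, and size assumed):

    # PhysicalDiskRedundancy 2 = dual parity (tolerates two simultaneous disk failures).
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Archive" `
        -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -Size 8TB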

Simple: Basic Striping
Simple spaces stripe data across all disks in the pool.
Cannot tolerate any disk or enclosure failures; avoid for applications where data integrity is critical.
Useful for specialized applications that need raw speed over anything else (scratch disks, temporary work areas, etc.).

Performance Considerations
Mirror – maintains 2 data copies (two-way mirror) or 3 (three-way mirror). Suitable for most workloads; fast reads, moderately fast writes.
Parity – maintains 2 data copies (single parity) or 3 (dual parity). Sequential workloads that are rarely accessed, such as archival; very slow writes, moderately slow reads.
Simple – maintains 1 data copy. Workloads which do not need resiliency or provide an alternate resiliency mechanism; fastest write speeds.

Performance Planning
Performance scales linearly as more drives are added to a pool.
Ensure performance is measured at latencies which are suitable for your application/workload.

Performance Planning
SSD-only pools can be created for high-performance workloads.
SSDs can also be assigned to a pool as a write-back cache (WBC). Writes go to the SSDs first, and blocks are then moved to the other disks during disk idle time.
Ensure that you have enough SSDs to match the maximum number of copies used for the volumes hosted in the pool, or WBC can't be enabled.
Minimum number of SSDs for WBC: simple – 1; two-way mirror – 2; three-way mirror – 3; single parity – 2; dual parity – 3.
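
A hedged sketch of enabling WBC when a space is created (the cache size is illustrative; the default is much smaller):

    # Give this mirrored space a 10 GB SSD write-back cache.
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "FastData" `
        -ResiliencySettingName Mirror -Size 1TB -WriteCacheSize 10GB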

Tiered Storage Spaces
SSDs can be used as more than just a cache.
Pools don't limit the maximum number of SSDs or SSD capacity.
Tiering requires enough SSDs to accommodate all columns: a two-way mirror needs 4 SSDs at 2 columns or 8 SSDs at 4 columns; a three-way mirror needs 6 SSDs at 2 columns or 12 SSDs at 4 columns.
Tiering is enabled on a per-volume basis.
If using tiering, match your SSD capacity to the size of your most active data/working set.
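
Creating a tiered space, sketched with assumed names and sizes (tiering requires fixed provisioning):

    # Define an SSD tier and an HDD tier inside the pool, then build a space from both.
    $ssd = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "SSDTier" -MediaType SSD
    $hdd = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "HDDTier" -MediaType HDD
    New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Tiered1" `
        -ResiliencySettingName Mirror -StorageTiers $ssd, $hdd `
        -StorageTierSizes 100GB, 900GB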

Tiered Storage Spaces
Data is moved daily between tiers based on disk activity.
Tiering optimization can be scheduled to run more frequently for volumes that have a high amount of data churn.
If particular files have a high access rate, you can manually pin them to the SSD tier to ensure they are never evicted from it.
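
Pinning, sketched with an assumed file path and tier name (the tier object must belong to the volume's virtual disk):

    # Pin a hot file to the SSD tier, then apply the change with an optimization pass.
    $ssdTier = Get-StorageTier -FriendlyName "SSDTier"
    Set-FileStorageTier -FilePath "D:\VMs\gold-image.vhdx" -DesiredStorageTier $ssdTier
    Optimize-Volume -DriveLetter D -TierOptimize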

Tiered Storage Spaces
A Storage Tier Optimization Report can be generated from PowerShell. It shows:
Total I/O tracked by the tiering engine (indicates the majority of I/O)
The ideal I/O distribution with the current SSD tier size for the current workloads
The current I/O distribution post-optimization
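
Tier optimization can also be run on demand; a sketch (the scheduled task name comes from the in-box "Storage Tiers Management" tasks):

    # Run a tier optimization pass now instead of waiting for the nightly task.
    Optimize-Volume -DriveLetter D -TierOptimize -Verbose
    # Inspect the built-in scheduled task that normally drives this.
    Get-ScheduledTask -TaskName "Storage Tiers Optimization" | Get-ScheduledTaskInfo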

New Features in Server 2016
Storage Replica
Allows entire volumes to be replicated in real time across servers or across clusters.
Completely hardware agnostic; the pool configuration doesn't have to be the same on both sides.
Designed to be bandwidth efficient; data is compressed and deduplicated.
Storage Quality of Service
Can specify minimum and maximum performance characteristics for various storage objects.
Policies can be applied to a pool, a volume, or individual files in a volume.
Tiering and caching behavior is adjusted dynamically to meet the performance policies.
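
Hedged sketches of both features (all names, paths, and limits below are assumptions):

    # Storage QoS: keep a workload between 500 and 5,000 IOPS.
    New-StorageQosPolicy -Name "GoldVM" -MinimumIops 500 -MaximumIops 5000

    # Storage Replica: pair a volume and its log between two servers.
    New-SRPartnership -SourceComputerName "SRV1" -SourceRGName "RG1" `
        -SourceVolumeName "D:" -SourceLogVolumeName "L:" `
        -DestinationComputerName "SRV2" -DestinationRGName "RG2" `
        -DestinationVolumeName "D:" -DestinationLogVolumeName "L:"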

New Features in Server 2016
Storage Spaces Direct
Allows Storage Spaces to be used in a SoFS configuration without the need for each node to have access to a shared SAS JBOD or fabric.
Makes it possible to use SATA disks; SATA SSDs are cheaper and more plentiful than SAS SSDs.
Data is replicated between nodes over an IP network (using SMBv3).
Higher network bandwidth requirements (especially between storage nodes) but greater flexibility.
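
Enabling it on an existing Server 2016 failover cluster is a one-liner; a sketch (run from any cluster node):

    # Claims eligible local disks on every node and builds the pool automatically.
    Enable-ClusterStorageSpacesDirect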

Storage Spaces in the Real World
What's good:
Low cost (commodity hardware, no support contracts required)
Easy to expand (just add more disks or more servers)
Integrates nicely with other Microsoft solutions, particularly Hyper-V
Powerful management capability via PowerShell
What's not:
No single vendor to call in the event of a problem (if you build it yourself)
Must deal with drive vendors directly for drive replacements under warranty; keep spare drives on hand
Limited support for NFS or iSCSI; Storage Spaces really works best with SMBv3 and only SMBv3

Deployment Options: DIY or Pre-configured?
You can easily build a Storage Spaces SoFS cluster yourself with commodity servers and off-the-shelf hardware.
There is a large number of JBOD enclosures and drives to choose from. Always use Windows Server certified enclosures to ensure SES works!
Only use HBAs; do not attempt to use RAID controllers unless they have a pass-through mode.
This is the most cost-effective option, but a good working knowledge of SAS and SMBv3 is crucial.
You must deal with vendors directly for any hardware compatibility or support issues.

Deployment Options: Dell DSMS
OEMs are beginning to offer pre-configured storage solutions that utilize Storage Spaces, such as Dell DSMS.
Covered by Dell service contracts; a single point of contact for hardware failures and drive replacements.
Can be configured as a storage-only solution or as a converged cluster capable of running Hyper-V VMs on storage backed by Storage Spaces.
Learn more at http://dellstoragevr.dell.com/dsms

Storage Spaces in the Real World: CVCC's Story
CVCC migrated all virtual machines from VMware ESX to Hyper-V because Hyper-V had all of the functionality we were using from VMware for a fraction of the licensing cost.
Hyper-V nodes were being served from an EMC SAN.
The EMC SAN was nearing end of life, and new projects on the horizon needed more storage than we had.

Storage Spaces in the Real World: CVCC's Story
We set up a small-scale Storage Spaces cluster with enough storage to support the new project.
Early performance results were impressive, easily exceeding the performance of the existing EMC SAN even though the cluster only had 7200 RPM disks and no SSDs. (The EMC SAN was quite old and had 4 Gbps FC disks, versus the 6 Gbps SAS2 disks in the Storage Spaces cluster.)
The entire proof-of-concept storage system cost less than even a single disk shelf from EMC!

Storage Spaces in the Real World: CVCC's Story
We calculated the cost to add enough capacity to move all storage from the EMC SAN to Storage Spaces.
The yearly maintenance cost for the EMC SAN was roughly half the total cost of the expansion, and all components in the expansion would have a 5-year warranty. Storage Spaces would effectively pay for itself in 2 years!
We migrated everything to Storage Spaces and have had far fewer reliability problems than we had with EMC.

Storage Spaces in the Real World: Is It Worth It?
If you are currently using Hyper-V with a traditional shared storage model, Storage Spaces is a very attractive option.
If you are currently using VMware and hosting primarily non-Windows VMs, then Storage Spaces is probably not for you.
If you are using VMware and hosting mostly Windows VMs, take a look at Hyper-V. It may be included in the license cost you are already paying. Hyper-V was improved dramatically in Server 2012, and several handy features were introduced in Server 2012 R2 and Server 2016.
Hyper-V works decently for Linux VMs if you use a supported Linux distribution, but not so well if you use an unsupported one.

Questions?
William Osborne
System Engineer, Central Virginia Community College
osbornew@centralvirginia.edu
(434) 832-7644