Storage Spaces: Affordable and Scalable Shared Storage
William Osborne System Engineer – Central Virginia Community College
Agenda
What is Storage Spaces?
History
Features
Technical Implementation
Performance Considerations
New Features in Server 2016
Storage Spaces in the Real World
What is Storage Spaces?
A software solution developed by Microsoft that provides enterprise-grade storage using commodity off-the-shelf hardware
Originally developed for in-house use
Designed to scale in both capacity and performance
Included in Windows Server Standard and higher editions at no additional cost
Most features are included in Server 2012 Standard and Server 2016 Standard; certain advanced features require Server 2016 Datacenter edition
History of Storage Spaces
Developed for the Windows Release Team
Problems: capacity, cost, resiliency
Needs:
Support for over 1 PB of storage
Dramatically lower cost per TB
A scalable platform for future expansion in both storage capacity and I/O volume
A storage foundation for the Azure service platform
History of Storage Spaces
Raw storage is cheap
High-speed NICs (10 Gbps or better) are readily available
The primary difference between a bunch of raw disks and a SAN is software and support
History of Storage Spaces
In-house development of a pure software storage solution was started by the Windows Server team
Instead of continuing to pay for storage from EMC and NetApp, funds were redirected to development
Used to support the development of Windows 8, Server 2012, and beyond
Used to support workloads on Azure today
“We’re able to use commodity hardware to achieve the same functionality [of traditional storage] at a far lower cost.”
Jeremy Russell, Senior Development Lead, Microsoft
History of Storage Spaces
Results (figures from 2012):
2x increase in raw storage throughput
3x increase in raw capacity with the same budget (traditional storage: $1,350/TB; Storage Spaces: less than $450/TB)
5x increase in effective capacity through more efficient dedupe algorithms
6x reduction in the number of file servers; many workloads can be serviced directly from the storage nodes
Hardware Platform
Storage is managed separately from compute, just like in a traditional SAN
Storage can be managed and scaled independently
Commodity servers, networking, and storage hardware
Inexpensive IP network hardware
Inexpensive shared JBOD storage
[Diagram: Hyper-V clusters connect over SMBv3 to Scale-Out File Server clusters, which provide virtualization and resiliency with Storage Spaces on shared JBOD storage; management via PowerShell and SCVMM]
Hardware Platform
[Diagram: side-by-side comparison. Traditional storage: VM host/compute nodes connect over FC/iSCSI to a SAN/NAS storage array built from embedded CPUs and controllers (proprietary hardware) with FC/SAS/SATA disk shelves. Storage Spaces: VM host/compute nodes connect over SMBv3 to a Windows file server cluster (commodity hardware) running Storage Spaces and SoFS over shared SAS JBOD storage]
Components
Standard servers with SAS HBAs (not RAID controllers)
Commodity storage hardware:
Dual-domain SAS JBOD arrays with enterprise-grade redundancy
Dual-port SAS drives
Platform Design
[Diagram: VM hosts/compute nodes reach the storage layer over a high-speed SMB network (10GbE or better); architecturally similar to other storage systems. A unified Cluster Shared Volume namespace (\\SRV\VDI, \\SRV\CRM, \\SRV\DB) aggregates data access across volumes. Clustered file servers host clustered Storage Spaces (mirror, simple, and parity spaces) and connect via SAS links to shared SAS JBOD arrays]
Basic Capabilities
Pooling of disks
Flexible, resilient storage options
Native data striping
Enclosure awareness (SES, SCSI Enclosure Services)
Parallelized rebuild utilizes spare capacity of the entire storage pool
[Diagram: physical disks form a storage pool from which Storage Spaces are carved, e.g. a mirror space holding data copies 1 and 2, and a parity space]
Basic Capabilities
Data integrity scanning
Periodic background scans detect data corruption and auto-correct where possible
Uncorrectable failures are logged
Requires the ReFS file system for best results (see the sketch after this list)
Simple management interface
PowerShell, Server Manager, and SCVMM are all supported
Continuous availability when used in a SoFS configuration
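The auto-correct behavior builds on ReFS integrity streams. A minimal sketch, assuming a hypothetical drive letter E: and file path, of enabling and checking them:

  # Format a Storage Spaces volume with ReFS and integrity streams on
  Format-Volume -DriveLetter E -FileSystem ReFS -SetIntegrityStreams $true

  # Inspect or enable integrity for an individual file
  Get-FileIntegrity -FileName 'E:\Data\archive.vhdx'
  Set-FileIntegrity -FileName 'E:\Data\archive.vhdx' -Enable $true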
Enterprise-Grade Features
Traditional storage: storage tiering, data deduplication, RAID groups, disk pooling, high availability, write-back cache, snapshots, replication
Storage Spaces: storage tiering, data deduplication, flexible resiliency options, disk pooling, continuous availability, write-back cache, SMB copy offload, snapshots, replication
Note the difference between HA and CA: CA relies on key features of the SMBv3 protocol to provide resiliency in the event of a node or network failure, and the application must be SMBv3-aware to handle failures without any interruption at all. HA typically doesn't require the application to be aware of anything special. Fortunately, SMBv3 support is built in on Server 2012 and newer, so any application relying on the OS storage routines gets CA support by default; Hyper-V also has special support for CA.
Planning
Choose hardware for a DIY solution, or purchase a pre-configured cluster-in-a-box
Resiliency needs: how many drive failures do you need to withstand? Is tolerating the failure of an entire enclosure necessary?
Capacity: different resiliency levels have different capacity overhead
Performance needs: workload dependent; can be measured in terms of raw IOPS and throughput
Planning
Plan additional capacity for:
Parallel rebuilds (having 1-2 extra physical disks per pool ensures rebuilds complete rapidly)
Unallocated SSDs for caching
Consider using SSDs for tiered storage to hold frequently accessed or changed data, or SSD-only pools for high-performance workloads
Technical Implementation
Raw disks are assigned to storage pools by a storage administrator
One or more volumes are created inside a storage pool; each volume has its own resiliency settings (a minimal sketch follows below)
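A minimal PowerShell sketch of that workflow, assuming hypothetical pool and volume names (the storage subsystem name varies by machine):

  # Find disks that are eligible for pooling and create a pool from them
  $disks = Get-PhysicalDisk -CanPool $true
  New-StoragePool -FriendlyName 'Pool01' `
      -StorageSubSystemFriendlyName (Get-StorageSubSystem -FriendlyName 'Windows Storage*').FriendlyName `
      -PhysicalDisks $disks

  # Create a two-way mirrored volume inside the pool
  New-Volume -StoragePoolFriendlyName 'Pool01' -FriendlyName 'Volume01' `
      -FileSystem ReFS -ResiliencySettingName Mirror -Size 2TB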
Technical Implementation
Data for a volume is divided into blocks, or chunks, typically 256 MB by default; the size can be customized
Blocks are distributed across drives in a pool as required to meet the resiliency requirements of the volume
Parity blocks are created for volumes that use the parity resiliency type
Technical Implementation
Storage Spaces is NOT a RAID implementation
It is tempting to think of simple volumes as RAID 0, mirror volumes as RAID 1, and parity volumes as RAID 5 or 6, but the actual implementation is different
There are typically no guarantees about exactly which blocks a given drive holds at any given time; blocks are moved around automatically as needed
Understanding Mirror and Simple Spaces
Performance and resiliency with mirroring can be understood in terms of two numbers: data columns and data copies
Max read perf* = columns x copies x single-disk perf
Max write perf* = columns x single-disk perf
Minimum number of drives required to expand a storage space = columns x copies
A worked example follows below.
[Diagram: a 2-column, 2-way mirrored space; blocks 1-8 are distributed among eight physical disks]
*Ideal performance of a single I/O operation; aggregate performance can be higher depending on queue depth (parallelism) and block distribution.
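To make the arithmetic concrete, assume (hypothetically) disks that each deliver 150 MB/s: a 2-column, 2-way mirror reads at up to 2 x 2 x 150 = 600 MB/s, writes at up to 2 x 150 = 300 MB/s, and needs 2 x 2 = 4 drives to expand. A hedged sketch of creating such a space, with hypothetical names:

  # Two columns, two copies: each write stripes across 2 disks and is mirrored
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'Mirror2x2' `
      -ResiliencySettingName Mirror -NumberOfColumns 2 -NumberOfDataCopies 2 -Size 1TB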
Understanding Mirror and Simple Spaces
Remember: Storage Spaces ≠ RAID
The data from the previous example could equally be laid out differently across the same disks
[Diagram: the same 2-column, 2-way mirrored space with blocks 1-8 in a different arrangement across the eight physical disks]
Drive Replacements
You must inform Storage Spaces that you are swapping out a bad drive before you do it
It is possible that only part of the drive is bad, and the drive may still hold usable data blocks
Even if the drive appears to be totally dead, marking it for replacement first prevents errors from occurring when a replacement drive is installed
Marking the drive for replacement ensures that all blocks have been moved to another drive before it is physically removed from the storage system
It is possible to recover from a "surprise" drive replacement, but it is best to do the replacement properly in the first place
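A hedged sketch of the retire-then-replace sequence, assuming hypothetical disk and pool names:

  # Tell Storage Spaces the disk is going away so allocations stop
  Set-PhysicalDisk -FriendlyName 'PhysicalDisk7' -Usage Retired

  # Rebuild affected virtual disks onto the remaining drives
  Get-VirtualDisk | Repair-VirtualDisk

  # After repairs complete, drop the disk from the pool, then pull it
  Remove-PhysicalDisk -StoragePoolFriendlyName 'Pool01' `
      -PhysicalDisks (Get-PhysicalDisk -FriendlyName 'PhysicalDisk7')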
Mirror: General-Purpose Storage
A 2-way mirror ensures every block is on at least 2 disks, in different enclosures where possible
Provides resiliency against at least 1 disk failure, possibly more
Can withstand the failure of an entire enclosure if properly configured with SES-aware enclosures (see the sketch below)
3-way and 4-way mirrors are supported for even more redundancy
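A hedged sketch of opting into enclosure awareness, assuming certified SES enclosures and hypothetical names:

  # Make new spaces in this pool enclosure-aware by default
  New-StoragePool -FriendlyName 'Pool01' `
      -StorageSubSystemFriendlyName (Get-StorageSubSystem -FriendlyName 'Windows Storage*').FriendlyName `
      -PhysicalDisks (Get-PhysicalDisk -CanPool $true) -EnclosureAwareDefault $true

  # Or request it for an individual space
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'Mirror01' `
      -ResiliencySettingName Mirror -IsEnclosureAware $true -UseMaximumSize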
Mirror: General-Purpose Storage
Reads are served from any available disk; more copies increase read speed
Writes complete on all copies roughly simultaneously; write speed depends on the number of disks, and more copies decrease it on average
Parity and Dual Parity: Cheap but Slow
Parity provides resiliency against a single disk failure; dual parity provides resiliency against two disk failures
Designed for archival workloads
Not suitable for high-speed I/O requirements
Fast rebuild times; requires fewer disks than mirrors for the same usable capacity
Disks are cheap and becoming cheaper as time goes on
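A hedged sketch of both parity levels (pool and disk names hypothetical; dual parity needs at least seven physical disks):

  # Single parity: survives one disk failure
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'Archive1' `
      -ResiliencySettingName Parity -PhysicalDiskRedundancy 1 -Size 10TB

  # Dual parity: survives two disk failures
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'Archive2' `
      -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -Size 10TB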
Simple: Basic Striping
Simple spaces stripe data across all disks in the pool
Cannot tolerate any disk or enclosure failures
Avoid for applications where data integrity is critical
Useful for specialized applications that need raw speed above all else (scratch disks, temporary work areas, etc.)
Performance Considerations
Mirror: maintains 2 data copies (two-way mirror) or 3 (three-way mirror). Suitable for most workloads; fast reads, moderately fast writes.
Parity: maintains 2 data copies (single parity) or 3 (dual parity). Sequential workloads that are rarely accessed, such as archival; very slow writes, moderately slow reads.
Simple: maintains 1 data copy. Workloads that do not need resiliency or that provide an alternate resiliency mechanism; fastest write speeds.
Performance Planning
Performance scales linearly as more drives are added to a pool
Ensure performance is measured at latencies suitable for your application/workload
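Microsoft's DISKSPD tool is one way to measure IOPS, throughput, and latency together; a hedged example run (the target file and parameter choices are hypothetical, tune them to mimic your workload):

  # 60-second random I/O test: 8 KB blocks, 8 threads, 32 outstanding I/Os,
  # 30% writes, with per-I/O latency statistics (-L)
  diskspd.exe -b8K -d60 -t8 -o32 -r -w30 -L E:\testfile.dat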
Performance Planning
SSD-only pools can be created for high-performance workloads
SSDs can also be assigned to a pool as a write-back cache (WBC): writes land on the SSDs first, and blocks are moved to the other disks during idle time (a sketch follows below)
Ensure you have enough SSDs to match the maximum number of copies used by the volumes hosted in the pool, or WBC can't be enabled
Minimum number of SSDs for WBC:
Simple: 1
Two-way mirror and single parity: 2
Three-way mirror and dual parity: 3
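A hedged sketch of requesting a write-back cache explicitly when creating a space (the size and names are hypothetical; a small WBC may also be created automatically when suitable SSDs are present):

  # Ask for a 1 GB write-back cache backed by the pool's SSDs
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'FastVol' `
      -ResiliencySettingName Mirror -NumberOfDataCopies 2 `
      -WriteCacheSize 1GB -UseMaximumSize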
Tiered Storage Spaces
SSDs can be used as more than just a cache
Pools don't limit the maximum number of SSDs or SSD capacity
Tiering requires enough SSDs to accommodate all columns (a sketch follows below)
Tiering is enabled on a per-volume basis
If using tiering, match your SSD capacity to the size of your most active data/working set
Minimum number of SSDs by resiliency type and column count:
Two-way mirror: 4 SSDs with 2 columns, 8 SSDs with 4 columns
Three-way mirror: 6 SSDs with 2 columns, 12 SSDs with 4 columns
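A hedged sketch of defining tiers and creating a tiered space (tier names and sizes are hypothetical; tiered volumes used NTFS in this era):

  # Define an SSD tier and an HDD tier within the pool
  $ssd = New-StorageTier -StoragePoolFriendlyName 'Pool01' -FriendlyName 'SSDTier' -MediaType SSD
  $hdd = New-StorageTier -StoragePoolFriendlyName 'Pool01' -FriendlyName 'HDDTier' -MediaType HDD

  # Create a mirrored space spanning both tiers
  New-VirtualDisk -StoragePoolFriendlyName 'Pool01' -FriendlyName 'TieredDisk' `
      -ResiliencySettingName Mirror `
      -StorageTiers $ssd, $hdd -StorageTierSizes 200GB, 2TB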
Tiered Storage Spaces
Data is moved between tiers daily based on disk activity
Tiering optimization can be scheduled to run more frequently for volumes with a high amount of data churn
If particular files have a high access rate, you can manually pin them to the SSD tier to ensure they are never evicted
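A hedged sketch of pinning a hot file and forcing an immediate optimization pass (the path, drive letter, and tier name are hypothetical):

  # Pin a hot file to the SSD tier so it is never evicted
  Set-FileStorageTier -FilePath 'D:\VMs\busy.vhdx' `
      -DesiredStorageTierFriendlyName 'SSDTier'

  # Run tier optimization now instead of waiting for the nightly task
  Optimize-Volume -DriveLetter D -TierOptimize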
Tiered Storage Spaces
A storage tier optimization report can be generated from PowerShell, showing:
Total I/O tracked by the tiering engine (indicating the majority of I/O)
The ideal I/O distribution with the current SSD tier size for the current workloads
The current I/O distribution post-optimization
New Features in Server 2016
Storage Replica
Allows entire volumes to be replicated in real time across servers or across clusters
Completely hardware agnostic; pool configuration doesn't have to be the same on both sides
Designed to be bandwidth efficient; data is compressed and deduplicated
Storage Quality of Service
Can specify minimum and maximum performance characteristics for various storage objects
Policies can be applied to a pool, a volume, or individual files in a volume
Tiering and caching behavior is adjusted dynamically to meet the performance policies
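A hedged sketch of Storage QoS applied to a Hyper-V VM's disks (the policy name, limits, and VM name are hypothetical):

  # Define a policy with an IOPS floor and ceiling
  $policy = New-StorageQosPolicy -Name 'Gold' -MinimumIops 200 -MaximumIops 5000

  # Attach the policy to every virtual hard disk of a VM
  Get-VM -Name 'CRM01' | Get-VMHardDiskDrive |
      Set-VMHardDiskDrive -QoSPolicyID $policy.PolicyId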
New Features in Server 2016
Storage Spaces Direct
Allows Storage Spaces to be used in a SoFS configuration without requiring each node to have access to a shared SAS JBOD or fabric
Makes it possible to use SATA disks; SATA SSDs are cheaper and more plentiful than SAS SSDs
Data is replicated between nodes over an IP network (using SMBv3)
Higher network bandwidth requirements (especially between storage nodes) but greater flexibility
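A hedged sketch of standing up Storage Spaces Direct (node and cluster names are hypothetical; validate before enabling):

  # Validate the nodes, then build a cluster with no shared storage
  Test-Cluster -Node 'Node1','Node2','Node3','Node4' -Include 'Storage Spaces Direct'
  New-Cluster -Name 'S2DCluster' -Node 'Node1','Node2','Node3','Node4' -NoStorage

  # Claim each node's local drives into one cluster-wide storage pool
  Enable-ClusterStorageSpacesDirect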
Storage Spaces in the Real World
What's good:
Low cost (commodity hardware, no support contracts required)
Easy to expand (just add more disks or more servers)
Integrates nicely with other Microsoft solutions, particularly Hyper-V
Powerful management capability via PowerShell
What's not:
No single vendor to call in the event of a problem (if you build it yourself)
Must deal with drive vendors directly for replacements under warranty; keep spare drives on hand
Limited support for NFS and iSCSI; Storage Spaces really works best with SMBv3 and only SMBv3
Deployment Options: DIY or Pre-configured?
You can build a Storage Spaces SoFS cluster yourself with commodity servers and off-the-shelf hardware
Large number of JBOD enclosures and drives to choose from
Always use Windows Server certified enclosures to ensure SES works!
Only use HBAs; do not attempt to use RAID controllers unless they have a pass-through mode
Most cost-effective option, but a good working knowledge of SAS and SMBv3 is crucial
Must deal with vendors directly for any hardware compatibility or support issues
Deployment Options: Dell DSMS
OEMs are beginning to offer pre-configured storage solutions that utilize Storage Spaces, such as Dell DSMS
Covered by Dell service contracts; single point of contact for hardware failures and drive replacements
Can be configured as a storage-only solution or as a converged cluster capable of running Hyper-V VMs on storage backed by Storage Spaces
Storage Spaces in the Real World
CVCC's Story
CVCC migrated all virtual machines from VMware ESX to Hyper-V because Hyper-V offered all of the functionality we were using in VMware at a fraction of the licensing cost
Hyper-V nodes were being served from an EMC SAN
The EMC SAN was nearing end of life
New projects on the horizon needed more storage than we had
Storage Spaces in the Real World
CVCC's Story
Set up a small-scale Storage Spaces cluster with enough storage to support the new project
Early performance results were impressive, easily exceeding the existing EMC SAN even though the cluster had only 7200 RPM disks and no SSDs
(The EMC SAN was quite old, with 4 Gbps FC disks versus the 6 Gbps SAS2 disks in the Storage Spaces cluster)
The entire proof-of-concept storage system cost less than a single disk shelf from EMC!
Storage Spaces in the Real World
CVCC's Story
Calculated the cost to add enough capacity to move all storage from the EMC SAN to Storage Spaces
Yearly maintenance for the EMC SAN was roughly half the total cost of the expansion, and all components in the expansion carried a 5-year warranty
Storage Spaces would effectively pay for itself in 2 years!
Migrated everything to Storage Spaces and have had far fewer reliability problems than we had with EMC
Storage Spaces in the Real World
Is it worth it?
If you are currently using Hyper-V with a traditional shared storage model, Storage Spaces is a very attractive option
If you are currently using VMware and hosting primarily non-Windows VMs, Storage Spaces is probably not for you
If you are using VMware and hosting mostly Windows VMs, take a look at Hyper-V; it may be included in the license cost you are already paying
Hyper-V improved dramatically in Server 2012, and several handy features arrived in Server 2012 R2 and Server 2016
Hyper-V works decently for Linux VMs on a supported distribution, less well on an unsupported one
Questions?
William Osborne, System Engineer
Central Virginia Community College (434)