Microsoft Storage Spaces Direct – the future of Hyper-V and Azure Stack (EN) Carsten Rachfahl Microsoft Cloud & Datacenter MVP Microsoft Regional Director Germany
Carsten Rachfahl: Microsoft CDM MVP; Microsoft Regional Director; organizer of the Cloud & Datacenter Conference Germany (http://cdc-gemany.de); @hypervserver; one of the Hyper-V Amigos. I blog and do screencasts and interviews at https://www.hyper-v-server.de
Agenda: S2D overview; S2D in depth; Deployment options; Performance; Demo; Q&A
Storage Overview
Traditional Storage Array 1/26/2018 1:09 AM Compute Virtual Machines Virtualization Host Connectivity Fibre Channel / iSCSI / FCoE / SAS Storage Array SAN © 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Traditional Storage Array Compute Virtual Machines Virtualization Host Connectivity Fibre Channel / iSCSI / FCoE / SAS Storage Array SAN Controller Controller Storage Software Storage Software Disk Connectivity Backplane Disks Raw Storage
Shared Storage Spaces Virtual Machines Compute Virtual Machines Virtualization Host Connectivity SMB3 Scale-out File Server NAS Storage Software SAS Connectivity SAS Enclosure (JBOD) Raw Storage
S2D Overview
Converged with Storage Spaces Direct Compute Virtual Machines Virtualization Host Connectivity SMB3 Scale-out File Server NAS Storage Software Raw Storage
Hyper-converged with Storage Spaces Direct Compute and Storage Virtual Machines Virtualization and Storage Host Storage Software
Microsoft Storage Spaces Direct
What is Storage Spaces Direct?
- Software-defined storage
- Highly available and scalable
- Storage for Hyper-V and Private Cloud
Why Storage Spaces Direct?
- Servers with local storage
- Industry standard hardware
- Lower cost flash with SATA SSDs
- Better flash performance with NVMe SSDs
- Ethernet/RDMA network as storage fabric
Diagram: Hyper-V cluster with local attached storage
S2D in Depth
Storage Stack
- File System (CSVFS with ReFS)
  - Fast VHDX creation, expansion and checkpoints
  - Cluster-wide data access
- Storage Spaces
  - Scalable pool with all disk devices
  - Resilient virtual disk
- Software Storage Bus
  - Storage Bus Cache
  - Leverages SMB3 and SMB Direct
- Servers with local disks
  - SATA, SAS and NVMe
Diagram: (1) Hyper-Converged: Virtual Machines; (2) Converged: Scale-Out File Server, reached over SMB 3. Stack layers: CSVFS (Cluster File System), ReFS (On-Disk File System), Storage Spaces Virtual Disks, Storage Spaces Storage Pool, Software Storage Bus
Software Storage Bus
- Virtual storage bus spanning all servers
  - Virtualizes physical disks and enclosures
- Consists of:
  - ClusPort: Initiator (virtual HBA)
  - ClusBflt: Target (virtual disk / enclosures)
- SMB3/SMB Direct transport
  - RDMA enabled networks for latency and CPU
- Bandwidth management
  - Fair device access from any server
  - IO prioritization (App vs System)
  - De-randomization of random IO
  - Drives sequential IO pattern on rotational media
Diagram (Node 1 and Node 2, each): Application, Cluster Shared Volumes File System (CSVFS), File System, Virtual Disks, SpacePort, ClusPort, ClusBflt, Physical Devices; the nodes are linked by block over SMB
Built-In Cache
- Integral part of Software Storage Bus
  - Cache scoped to local machine
  - Agnostic to storage pools and virtual disks
- Automatic configuration when enabling S2D
  - Special partition on each caching device
  - Leaves 32 GB for pool and virtual disk metadata
  - Round-robin binding of SSD to HDD
  - Rebinding with topology change
- Cache behavior
  - All writes up to 256 KB are cached
  - Reads of 64 KB or less are cached on first miss
  - Reads of 64+ KB are cached on second miss (<10 minutes)
  - Sequential reads of 32+ KB are not cached
  - Write cache only on all-flash systems
Diagram: SATA SSDs as caching devices, SATA HDDs as capacity devices
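The cache behavior rules can be read as a small decision function. The following Python sketch only illustrates the rules as stated on the slide; the function name `should_cache`, its parameters, and the `prior_misses` bookkeeping are hypothetical and do not describe the actual Storage Bus Cache implementation. It also assumes that "write cache only on all-flash systems" means reads are never cached there.

```python
KB = 1024


def should_cache(op, size_bytes, *, sequential=False, prior_misses=0, all_flash=False):
    """Return True if an I/O would be cached under the rules listed on the slide.

    op           -- "read" or "write"
    prior_misses -- hypothetical counter: misses seen for this data
                    within the last 10 minutes
    all_flash    -- True if the system has no rotational capacity devices
    """
    if op == "write":
        # All writes up to 256 KB are cached.
        return size_bytes <= 256 * KB
    if op != "read":
        raise ValueError("op must be 'read' or 'write'")
    if all_flash:
        # Assumption: write cache only, so reads are not cached.
        return False
    if sequential and size_bytes >= 32 * KB:
        # Sequential reads of 32 KB or more are not cached.
        return False
    if size_bytes <= 64 * KB:
        # Small reads are cached on the first miss.
        return True
    # Larger reads are cached on the second miss within 10 minutes.
    return prior_misses >= 1


print(should_cache("read", 4 * KB))                     # small random read
print(should_cache("read", 128 * KB, prior_misses=0))   # large read, first miss
print(should_cache("read", 128 * KB, prior_misses=1))   # large read, second miss
```

Keeping the rules in one function makes the precedence explicit: the all-flash and sequential checks come before the size-based checks.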
Storage Pool
- Metadata on select devices
  - Improves pool scalability
  - Improved pool update performance
- Device selection
  - Faster media is preferred
  - Metadata on up to 10 devices
  - Evenly spread across fault domains
  - Dynamic update on node or device failure
Diagram: SATA SSDs and SATA HDDs as potential metadata devices
Volume Types
- Performance volumes (mirror)
  - Usually 3-way mirror or 2-way mirror
- Capacity volumes (parity)
  - Should be double parity
- Hybrid volumes
  - Hybrid of 3-way mirror and double parity
Diagram: Mirror, Parity, Hybrid Volume (Mirror + Parity)
Hybrid Volumes
- Volume with mirror and parity
  - Requires at least 4 nodes
  - Requires ReFS
- Mirror for hot data
  - Optimized for write performance
  - Little CPU or storage churn
- Parity for cold data
  - Erasure coding storage efficiency
  - CPU or storage churn only on cold data
  - Local Reconstruction Codes (LRC) algorithm
Table (Nodes; Mirror Efficiency; Parity Efficiency for SSD + HDD and for All-Flash; Resiliency): 4 nodes: 33%, 50%, 2 node; 8 nodes: 66%; 12 nodes: 72%, 75%; 16 nodes: 80%
Diagram: Servers 1 to 8; mirror copies A, A', A'' and B, B', B''; parity Group 1 (X1, X2, PX), Group 2 (Y1, Y2, PY) and Q
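Storage efficiency is the ratio of usable data to raw capacity consumed. The sketch below is a minimal illustration of that arithmetic; the specific layouts (1 data + 2 copies for a 3-way mirror, 2 data + 2 parity, 4 data + 2 parity) and the 20% mirror share are assumptions chosen to show how percentages like those in the table can arise, and the function names are hypothetical.

```python
def efficiency(data_units: int, redundancy_units: int) -> float:
    """Usable fraction of raw capacity for a layout with the given units."""
    return data_units / (data_units + redundancy_units)


def hybrid_efficiency(mirror_share: float, mirror_eff: float, parity_eff: float) -> float:
    """Overall efficiency of a volume whose usable capacity is split
    between a mirror tier and a parity tier.

    mirror_share is the fraction of usable capacity held in the mirror tier.
    """
    raw_per_usable = mirror_share / mirror_eff + (1 - mirror_share) / parity_eff
    return 1 / raw_per_usable


three_way_mirror = efficiency(1, 2)   # one copy of data plus two extra copies
parity_2_plus_2 = efficiency(2, 2)    # assumed layout: 2 data + 2 parity
parity_4_plus_2 = efficiency(4, 2)    # assumed layout: 4 data + 2 parity

print(f"3-way mirror: {three_way_mirror:.0%}")
print(f"2+2 parity:   {parity_2_plus_2:.0%}")
print(f"4+2 parity:   {parity_4_plus_2:.1%}")
print(f"hybrid, 20% mirror over 2+2 parity: "
      f"{hybrid_efficiency(0.2, three_way_mirror, parity_2_plus_2):.1%}")
```

The hybrid figure is always between the mirror and parity efficiencies, which is why a small mirror tier costs relatively little capacity.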
LRC data reconstruction
- Most common failure is 1 fault domain
  - 1 disk failure (X2)
  - Read X1 and PX
  - Recalculate X2
  - Write X2 to a different disk
  - Total of 2 reads and 1 write
  - Traditional Reed-Solomon (4 data, 2 parity): total of 4 reads and 2 writes
  - LRC requires 50% less disk IO
- Tolerant to failure of 2 fault domains
  - 2 disk failures (X1 and X2)
  - Read PX, Y1, Y2 and Q
  - Recalculate and write X1 to a different disk
  - Recalculate and write X2 to a different disk
  - Total of 4 reads and 2 writes
  - Compared with traditional Reed-Solomon (4 data and 2 parity), LRC and RS require the same disk IO
Diagram: Group 1 holds X1, X2 and PX on Servers 1 to 3; Group 2 holds Y1, Y2 and PY on Servers 4 to 6; Q is on Server 7; Server 8 holds none of these
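The single-failure case can be sketched with a plain XOR local parity. This is a simplification for illustration only: it assumes PX is the XOR of X1 and X2, does not model the global parity Q, and is not the coding arithmetic Storage Spaces Direct actually uses. The helper name `xor_bytes` and the sample data are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Group 1: two data blocks and their local parity (assumed to be XOR).
x1 = b"hot data"
x2 = b"cold dat"
px = xor_bytes(x1, x2)

# The disk holding X2 fails: read X1 and PX (2 reads), recalculate X2,
# then write it to a different disk (1 write).
recovered_x2 = xor_bytes(x1, px)
print(recovered_x2 == x2)
```

Only the two surviving members of the local group are read, which is where the saving over reading all four data blocks comes from.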
ReFS Real-Time Tiering
- Writes go to mirror tier (hot data)
- Rotate data into parity tier as needed (cold data)
  - Erasure code calculation only on rotation
- Updates to data stored in parity tier
  - Updated data is written to mirror tier
  - Old data in parity tier is invalidated (metadata operation)
Diagram, logical view (mirror tier and parity tier):
- ReFS writes into mirror tier
- ReFS data rotation into parity as needed
Diagram, physical view (SATA SSD and SATA HDD):
- ReFS writes land in write cache (SSD)
- Cache destages as needed onto HDD
- ReFS data rotation bypasses the cache
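The tiering rules can be modeled with two dictionaries. The toy class below is a sketch under stated assumptions: `TieredVolume`, its methods, and the `encodes` counter are hypothetical names, and the caller decides which blocks count as cold. It shows only the logical behavior described on the slide, not how ReFS stores data.

```python
class TieredVolume:
    """Toy model of a volume with a mirror tier (hot) and a parity tier (cold)."""

    def __init__(self):
        self.mirror = {}   # block id -> data, hot tier
        self.parity = {}   # block id -> data, cold tier
        self.encodes = 0   # number of erasure-code calculations performed

    def write(self, block, data):
        # Every write, including an update to cold data, lands in the mirror tier.
        self.mirror[block] = data
        # The old copy in the parity tier is only invalidated (metadata operation).
        self.parity.pop(block, None)

    def rotate(self, blocks):
        # Erasure coding cost is paid only when data rotates into parity.
        for block in blocks:
            if block in self.mirror:
                self.parity[block] = self.mirror.pop(block)
                self.encodes += 1

    def read(self, block):
        if block in self.mirror:
            return self.mirror[block]
        return self.parity[block]


vol = TieredVolume()
vol.write(1, "v1")
vol.rotate([1])        # block 1 becomes cold
vol.write(1, "v2")     # update goes to mirror, parity copy is invalidated
print(vol.read(1), vol.encodes)
```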
ReFS VM Optimizations
- Basics
  - Metadata checksums with optional user data checksum
  - Data corruption detection and repair
  - On-volume backup of critical metadata with online repair
- Efficient VM Checkpoints and Backup
  - VHD(X) checkpoints cleaned up without physical data copies
  - Data migrated between parent and child VHD(X) files as a ReFS metadata operation
  - Reduction of I/O to disk
  - Increased speed
  - Reduces impact of checkpoint clean-up on foreground workloads
- Accelerated Fixed VHD(X) Creation
  - Fixed VHD(X) files zeroed with metadata operations
  - Minimal impact on workloads
  - Decreases VM deployment time
- Quick Dynamic VHD(X) Expansion
  - Dynamic VHD(X) files zeroed with metadata operations
  - Reduces latency spike for foreground workloads
S2D Deployment Options
Scale
- 2 nodes (minimum)
  - Only 2-way mirror
- 3 nodes
  - 2-way and 3-way mirror
- 4 to 16 nodes (maximum)
  - 2-way and 3-way mirror
  - Parity possible
  - Hybrid
- Disks
  - Minimum 6 devices (2 cache + 4 capacity drives)
  - With 16 nodes, a maximum of 416 devices
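The scale limits map directly to a lookup by node count. The sketch below encodes only what the slide states; the function name `resiliency_options` is hypothetical, and the per-node device figure is simply 416 divided by 16.

```python
def resiliency_options(nodes: int) -> list[str]:
    """Resiliency types available for a cluster of the given size, per the slide."""
    if not 2 <= nodes <= 16:
        raise ValueError("the slide lists 2 to 16 nodes")
    options = ["2-way mirror"]
    if nodes >= 3:
        options.append("3-way mirror")
    if nodes >= 4:
        options += ["parity", "hybrid"]
    return options


MAX_NODES = 16
MAX_DEVICES = 416
devices_per_node = MAX_DEVICES // MAX_NODES

for n in (2, 3, 4, 16):
    print(n, resiliency_options(n))
print("devices per node at maximum scale:", devices_per_node)
```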
Deployment Options: SQL 2016. SQL 2016 and storage resources together. Easy deployment and management (I hope).
Deployment Options Hyper-Converged Compute and storage resources together Easy deployment and management
Deployment Options
- Hyper-Converged
  - Compute and storage resources together
  - Easy deployment and management
- Converged SDS
  - Compute and storage resources separate
  - Scaling for larger deployments
Diagram: Hyper-Converged; Converged SDS with SMB 3 Fabric
Vendors who are committed to Storage Spaces Direct
- Cisco UCS C240 M4
- DataON S2D-3110
- DELL PowerEdge R730XD
- Fujitsu Primergy RX2540 M2
- HPE ProLiant DL380 Gen9
- Inspur NF5280M4
- Intel MCB2224TAF3
- Lenovo X3650 M5
- NEC Express5800 R120f-2M
- Quanta D51B-2U (MSW6000)
- RAID Inc. Ability™ HCI Series S2D200
- SuperMicro SYS-2028U-TRT+
2 Node PoC Project Kepler-47
- Mini-ITX motherboard
- Intel Xeon E3-1235L v5, 4C, 2.00 GHz
- 2 x 16 GB ECC DDR4
- U-NAS NSC-800
- 6 x 4TB SATA HDD
- 2 x 200GB SATA SSD
- USB3 DOM
2 Node PoC Project Kepler-47 Server and drive fault tolerance 20+ TB of mirrored storage capacity 50+ GB of memory for 5-10 mid-sized VMs Great for remote/branch office!
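The capacity and memory claims follow from the per-node parts list for Kepler-47 (6 x 4TB HDD and 2 x 16 GB DDR4 in each of 2 nodes). The arithmetic below is a rough check that assumes a 2-way mirror, the only option with 2 nodes; the gap between the result and the "20+ TB" and "50+ GB" figures is presumably reserve and host overhead, which the slides do not break down.

```python
NODES = 2
HDDS_PER_NODE, HDD_TB = 6, 4
DIMMS_PER_NODE, DIMM_GB = 2, 16

raw_tb = NODES * HDDS_PER_NODE * HDD_TB          # total raw HDD capacity
mirrored_tb = raw_tb / 2                         # 2-way mirror keeps two copies
total_memory_gb = NODES * DIMMS_PER_NODE * DIMM_GB

print(f"raw capacity:      {raw_tb} TB")
print(f"2-way mirrored:    {mirrored_tb:.0f} TB before reserve and overhead")
print(f"memory in cluster: {total_memory_gb} GB before host overhead")
```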
S2D Performance
Microsoft and Intel showcase at IDF’15
- Showcase hardware
  - 16 Intel® Server System S2600WT (2U) nodes
  - Dual Intel® Xeon® processor E5-2699 v3
  - 128GB memory (16GB DDR4-2133 1.2V DR x4 RDIMM)
- Storage per server
  - 4 Intel® SSD DC P3700 Series (800 GB, 2.5” SFF)
  - Boot drive: 1 Intel® SSD DC S3710 Series (200 GB, 2.5” SFF)
- Network per server
  - 1 Chelsio® 10GbE iWARP RDMA Card (CHELT520CRG1P10)
  - Intel® Ethernet Server Adapter X540-AT2 for management
- Load generator (8 VMs per compute node => 128 VMs)
  - 8 virtual cores and 7.5 GB memory
  - DISKSPD with 8 threads and queue depth of 20 per thread

Load Profile            Total IOPS   IOPS/Server
100% 4K Read            4.2M IOPS    268K IOPS
90%/10% 4K Read/Write   3.5M IOPS    218K IOPS
70%/30% 4K Read/Write   2.3M IOPS    143K IOPS
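The load generator settings determine how many I/Os are in flight at once. The short calculation below multiplies out the figures from the slide; the function name `outstanding_ios` is hypothetical, and it assumes every VM runs DISKSPD with the same settings.

```python
def outstanding_ios(nodes: int, vms_per_node: int, threads_per_vm: int,
                    queue_depth_per_thread: int) -> int:
    """Total I/Os in flight across the cluster for a uniform load generator."""
    return nodes * vms_per_node * threads_per_vm * queue_depth_per_thread


NODES, VMS_PER_NODE = 16, 8
THREADS_PER_VM, QUEUE_DEPTH = 8, 20

total_vms = NODES * VMS_PER_NODE
in_flight = outstanding_ios(NODES, VMS_PER_NODE, THREADS_PER_VM, QUEUE_DEPTH)

print("VMs:", total_vms)
print("outstanding I/Os, cluster:", in_flight)
print("outstanding I/Os, per node:", in_flight // NODES)
```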
Performance video on Channel 9
Configuration: 4x Dell R730XD
- 2x Xeon E5-2660v3 2.6 GHz (10c20t)
- 256GB DRAM (16x 16GB DDR4 2133 MHz DIMM)
- 4x Samsung PM1725 3.2TB NVMe SSD (PCIe 3.0 x8 AIC)
- Dell HBA330
  - 4x Intel S3710 800GB SATA SSD
  - 12x Seagate 4TB Enterprise Capacity 3.5” SATA HDD
- 2x Mellanox ConnectX-4 100Gb (Dual Port 100Gb PCIe 3.0 x16)
  - Mellanox FW v. 12.14.2036
  - Mellanox ConnectX-4 Driver v. 1.35.14894
  - Device PSID MT_2150110033
  - Single port connected per adapter
My own Benchmarks
- Microsoft VMFleet with 60 VMs on 4 nodes
- Diskspd testing with 64 KB blocks and 70% read / 30% write
- Top: Fujitsu, 2x E5-2680 CPUs with 2x 800GB NVMe + 4x 1.9TB SSD
- Mid: Dell, 2x E5-2640 with 18x 800GB SSDs
- Bottom: HPE, 2x E5-2660 with 2x 800GB SSDs + 4x 4TB HDD
S2D Demo
DEMO
Q&A
Thank you!
S2D and Azure Stack
Azure Stack Integrated System
- Integrated System: Software, Hardware, Support, Services
- Architecture, hardware, and topology
- Security and privacy
- Deployment, configuration, provisioning
- Validation
- Monitoring, diagnostics
- Business continuity
- Patching and updating
- Field replacement of parts
Diagram: BMC Switch, ToR Switch, Server; Azure Stack on a 4-node hyper-converged S2D cluster
Next session, 16:00 – 17:00: Get your work/life balance in check with Hyper-V 24/7/365 High Availability. Didier Van Hoye, Microsoft Cloud & Datacenter Management