Migrating Server Storage to SSDs: Analysis of Tradeoffs
Dushyanth Narayanan, Eno Thereska, Austin Donnelly, Sameh Elnikety, Antony Rowstron
Microsoft Research

Presentation transcript:

Migrating Server Storage to SSDs: Analysis of Tradeoffs
Dushyanth Narayanan, Eno Thereska, Austin Donnelly, Sameh Elnikety, Antony Rowstron
Microsoft Research Cambridge, UK

Solid-state drive (SSD)
NAND Flash memory
Flash Translation Layer (FTL)
Block storage interface
Persistent
Random-access
Low power
Cost, Parallelism, FTL complexity
USB drive / Laptop SSD / “Enterprise” SSD

Enterprise storage is different
Enterprise storage
– High-end disks, RAID
– Fault tolerance
– Throughput under load
– Capacity
– Energy ($)
Laptop storage
– Low speed disks
– Form factor
– Responsiveness
– Ruggedness
– Battery life

Replacing disks with SSDs
Disks: $$
Flash, match performance: $
Flash, match capacity: $$$$$

SSD as intermediate tier?
[Figure: DRAM buffer cache; read cache + write-ahead log; labels: Capacity, Performance, $$$$, $]

Other options?
Hybrid drives?
– Flash inside the disk → can pin hot blocks
– Volume-level tier more sensible for enterprise
Modify file system?
We want to plug in SSDs transparently
– Replace disks by SSDs
– Add SSD tier for caching and/or write logging

Challenge
Given a workload
– Which device type, how many, 1 or 2 tiers?
We benchmarked enterprise SSDs, disks
We traced many real enterprise workloads
And built an automated provisioning tool
– Takes workload, device models
– And computes best configuration for workload

High-level design

Devices (2008)
Device | Price | Size | Sequential throughput | Random-access throughput
Seagate Cheetah 10K | $ | GB | 85 MB/s | 288 IOPS
Seagate Cheetah 15K | $ | GB | 88 MB/s | 384 IOPS
Memoright MR25.2 | $739 | 32 GB | 121 MB/s | 6450 IOPS
Intel X25-E (2009) | $415 | 32 GB | 250 MB/s | 35000 IOPS
Seagate Momentus 7200 | $53 | 160 GB | 64 MB/s | 102 IOPS

Characterizing devices
Sequential vs random, read vs write
– Some SSDs have slow random writes
– Newer SSDs remap internally to sequential
– We model both “vanilla” and “remapped”
Multiple capacity versions per device
– Different cost/capacity/performance tradeoffs

Device metrics
Metric | Unit | Source
Price | $ | Retail
Capacity | GB | Vendor
Random-access read rate | IOPS | Measured
Random-access write rate | IOPS | Measured
Sequential read rate | MB/s | Measured
Sequential write rate | MB/s | Measured
Power | W | Vendor

Enterprise workload traces
I/O traces from live production servers
– Exchange server (5000 users): 24 hr trace
– MSN back-end file store: 6 hr trace
– 13 servers from MSRC DC: 1 week
  File servers, web server, web cache, etc.
15 servers, 49 volumes, 313 disks, 14 TB
– Volumes are RAID-1, RAID-10, or RAID-5

Enterprise workload traces
Traces are at volume (block device) level
– Below buffer cache, above RAID controller
– Timestamp, LBN, size, read/write
Each volume’s trace is a workload
– We consider each volume separately

Workload metrics
Metric | Unit
Capacity | GB
Peak random-access read rate | IOPS
Peak random-access write rate | IOPS
Peak random-access I/O rate (reads+writes) | IOPS
Peak sequential read rate | MB/s
Peak sequential write rate | MB/s
Fault tolerance | Redundancy level

Workload trace → metrics
Capacity
– Largest LBN accessed in trace
Performance = peak (or 99th percentile) load
– Highest observed IOPS of random I/Os
– Highest observed transfer rate (MB/s)
Fault tolerance
– Same as current (= 1 redundant device)
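The trace-to-metrics step above can be sketched in a few lines. This is a minimal illustration, not the authors' tool: the `Request` record, the 512-byte sector size, the one-second measurement window, and the rule that a request is "random" when it does not start where the previous one ended are all assumptions made for the example.

```python
from collections import Counter
from typing import NamedTuple

SECTOR_BYTES = 512  # assumption: LBNs address 512-byte sectors


class Request(NamedTuple):
    timestamp: float  # seconds since start of trace
    lbn: int          # starting logical block number
    size: int         # request size in bytes
    is_read: bool


def workload_metrics(trace, window=1.0):
    """Derive capacity and peak random-access rates from one volume's trace."""
    max_end = 0
    reads, writes = Counter(), Counter()
    next_lbn = None  # where a sequential successor would start
    for req in sorted(trace, key=lambda r: r.timestamp):
        sectors = -(-req.size // SECTOR_BYTES)  # ceiling division
        max_end = max(max_end, req.lbn + sectors)
        if req.lbn != next_lbn:  # assumption: non-contiguous means random
            bucket = int(req.timestamp // window)
            (reads if req.is_read else writes)[bucket] += 1
        next_lbn = req.lbn + sectors
    total = reads + writes
    return {
        "capacity_gb": max_end * SECTOR_BYTES / 1e9,
        "peak_read_iops": max(reads.values(), default=0) / window,
        "peak_write_iops": max(writes.values(), default=0) / window,
        "peak_io_iops": max(total.values(), default=0) / window,
    }


# Hypothetical four-request trace, for illustration only.
trace = [
    Request(0.0, 0, 4096, True),
    Request(0.1, 1000, 4096, True),
    Request(0.2, 1008, 4096, True),   # contiguous with the previous request
    Request(1.5, 50, 512, False),
]
metrics = workload_metrics(trace)
```

A 99th-percentile variant would sort the per-window counts and pick the value at that rank instead of taking the maximum.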

What is the best config?
Cheapest one that meets requirements
– Capacity, perf, fault-tolerance
Re-run/replay trace?
– Cannot provision h/w just to ask “what if”
– Simulators not always available/reliable
First-order models of device performance
– Input is device metrics, workload metrics

Solver
For each workload, device type
– Compute #devices needed in RAID array
  Throughput, capacity scaled linearly with #devices
– To match every workload requirement
  “Most costly” workload metric determines #devices
– Add devices for fault tolerance
– Compute total cost
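The solver steps can be sketched as follows. This is a simplified reading of the slide rather than the actual tool: the class and function names are hypothetical, the combined read+write IOPS metric is omitted, and fault tolerance is modelled as a fixed number of extra devices. The two device entries use the figures from the device characteristics slide; the workload numbers are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    price: float        # $
    capacity_gb: float
    read_iops: float    # random-access reads
    write_iops: float   # random-access writes
    read_mbps: float    # sequential reads
    write_mbps: float   # sequential writes


@dataclass
class Workload:
    capacity_gb: float
    read_iops: float
    write_iops: float
    read_mbps: float
    write_mbps: float
    redundancy: int = 1  # assumption: extra devices added for fault tolerance


METRICS = ["capacity_gb", "read_iops", "write_iops", "read_mbps", "write_mbps"]


def devices_needed(w, d):
    """Return (#devices, limiting metric), assuming linear scaling with #devices."""
    per_metric = {m: math.ceil(getattr(w, m) / getattr(d, m)) for m in METRICS}
    limiting = max(per_metric, key=per_metric.get)  # the "most costly" metric
    return max(1, per_metric[limiting]) + w.redundancy, limiting


def cheapest(w, devices):
    """Pick the device type with the lowest total cost for this workload."""
    return min(devices, key=lambda d: devices_needed(w, d)[0] * d.price)


memoright = Device("Memoright SSD", 739, 32, 6450, 351, 121, 126)
cheetah_10k = Device("Cheetah 10K", 339, 300, 277, 256, 85, 84)
workload = Workload(500, 1000, 200, 50, 20)  # hypothetical workload

for d in (memoright, cheetah_10k):
    n, limiting = devices_needed(workload, d)
    print(f"{d.name}: {n} devices (limited by {limiting}), ${n * d.price:.0f}")
```

For this hypothetical workload the disk array is sized by read IOPS and the SSD array by capacity, which mirrors the pattern reported in the single-tier results.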

Two-tier model

Solving for two-tier

Solving for two-tier model
Iterate over cache sizes, policies
– Write-back, write-through for logging
– LRU, LTR (long-term random) for caching
Inclusive cache model
– Can also model exclusive (partitioning)
– More complexity, negligible capacity savings
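To give a feel for the "iterate over cache sizes" step with the LRU policy, the sketch below replays a sequence of block accesses through an LRU cache of a given size and reports the hit rate. It is an illustrative stand-in, not the talk's model: the function name and the block-granularity access list are assumptions, and the LTR policy and write logging are not covered.

```python
from collections import OrderedDict


def lru_hit_rate(block_accesses, cache_blocks):
    """Fraction of accesses served by an LRU cache holding `cache_blocks` blocks."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    total = 0
    for block in block_accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


# Hypothetical access pattern; sweep a few candidate cache sizes.
accesses = [1, 2, 1, 3, 2, 4, 1]
rates = {size: lru_hit_rate(accesses, size) for size in (0, 2, 3)}
```

Accesses that miss in the cache are the ones the disk tier still has to serve, so a sweep like this indicates how much load each candidate cache size would remove from the disks.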

Model assumptions
First-order models
– Ok for provisioning → coarse-grained
– Not for detailed performance modelling
Open-loop traces
– I/O rate not limited by traced storage h/w
– Traced volumes are well-provisioned

Roadmap
Introduction
Devices and workloads
Finding the best configuration
Analysis results

Single-tier results
Cheetah 10K best device for all workloads!
SSDs cost too much per GB
Capacity or read IOPS determines cost
– Not read MB/s, write MB/s, or write IOPS
– For SSDs, always capacity
Read IOPS vs. GB is the key tradeoff

Workload IOPS vs GB

When will SSDs win?
When IOPS dominates cost
Break even $/GB for SSD when
– Cost of GB (SSD) = Cost of IOPS (disk)
Our tool also computes this point
– New SSD → compare its $/GB to break-even
– Then decide whether to buy it
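The break-even condition can be worked through numerically. The sketch below assumes the SSD array is sized purely by capacity, ignores the extra devices for fault tolerance, and uses a hypothetical workload (500 GB, 1000 peak IOPS); the function name is also an assumption. The disk figures are the Cheetah 10K values from the device characteristics slide.

```python
import math


def break_even_ssd_price_per_gb(capacity_gb, peak_iops,
                                disk_price, disk_gb, disk_iops):
    """SSD $/GB at which a capacity-bound SSD array costs the same as the disks."""
    n_disks = max(math.ceil(capacity_gb / disk_gb),
                  math.ceil(peak_iops / disk_iops))
    disk_cost = n_disks * disk_price
    return disk_cost / capacity_gb


# Hypothetical workload on Cheetah 10K disks ($339, 300 GB, 277 IOPS).
break_even = break_even_ssd_price_per_gb(500, 1000, 339, 300, 277)
memoright_price_per_gb = 739 / 32  # Memoright SSD: $739 for 32 GB

print(f"break-even: ${break_even:.2f}/GB, "
      f"Memoright: ${memoright_price_per_gb:.2f}/GB")
```

Under these assumptions four disks are needed (the array is IOPS-bound), so the break-even is about $2.71/GB, while the Memoright SSD costs about $23/GB.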

Break-even point CDF

Capacity limits SSD
On performance, SSD already beats disk
$/GB too high by 1-3 orders of magnitude
– Except for small (system boot) volumes
SSD price has gone down but
– This is per-device price, not per-byte price
– Raw flash $/GB also needs to drop a lot

SSD as intermediate tier
Read caching of little benefit
– Servers already cache in DRAM
Persistent write-ahead log is useful
– Can improve write latency with a little flash
– But does not reduce disk tier provisioning
– Because writes are not the limiting factor

Power and wear
SSDs use less power than Cheetahs
– But $ savings << cost difference
Flash wear is not an issue
– SSDs have finite #write cycles
– But will last well beyond 5 years
  Workloads’ long-term write rate not that high
  You will upgrade before you wear device out
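A back-of-the-envelope wear-out estimate shows why a modest long-term write rate gives a long device life. All inputs here are illustrative assumptions, not figures from the talk: 10,000 write cycles per cell, a sustained write rate of 1 MB/s, perfect wear-levelling, and no write amplification. Only the 32 GB capacity matches a device listed in the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def wear_out_years(capacity_bytes, write_cycles, write_bytes_per_sec):
    """Years until every cell has used its write cycles (ideal wear-levelling)."""
    total_writable_bytes = capacity_bytes * write_cycles
    return total_writable_bytes / write_bytes_per_sec / SECONDS_PER_YEAR


# Assumed: 32 GB device, 10,000 cycles, 1 MB/s long-term average write rate.
years = wear_out_years(32e9, 10_000, 1e6)
print(f"estimated wear-out time: {years:.1f} years")
```

With these assumed inputs the estimate is about ten years; a real device with write amplification would wear faster, and a workload with a lower average write rate would wear it more slowly.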

Conclusion
Capacity limits flash SSD in enterprise
– Not performance, not wear
Workload IOPS/GB ratio is key metric
Might never get cheap enough [Hetzler2008]
– All Si capacity today = 12% of HDD market
– There are more profitable uses of Si capacity
– Need higher density technologies (PCM?)

What are SSDs good for?
Mobile, laptop, desktop
Maybe niche apps for enterprise SSD
– Too big for DRAM, small enough for flash
  And huge appetite for IOPS
– Single-request latency
– Power
– Fast persistence (write log)

Assumptions that favour flash
IOPS = peak IOPS
– Most of the time, load << peak
  Faster storage will not help: already underutilized
Disk = enterprise disk
– Low power disks have lower $/GB, $/IOPS
LTR caching uses knowledge of future
– Looks through entire trace for randomly-accessed blocks

Supply-side analysis [Hetzler2008]
Disks: 14,000 PB/year, fab cost $1B
MLC NAND flash: 390 PB/year, $3.4B
If all Si capacity moved to MLC flash today
– Will only match 12% of HDD production
Revenue: $35B HDD, $280B Silicon
– No economic incentive to use fabs for flash

Device characteristics
Device | Memoright SSD | Cheetah 10K | Cheetah 15K | Momentus 7200
Price | $739 | $339 | $172 | $150
Capacity | 32 GB | 300 GB | 146 GB | 200 GB
Power | 1.0 W | 10.1 W | 12.5 W | 0.8 W
Read (seq) | 121 MB/s | 85 MB/s | 88 MB/s | 64 MB/s
Write (seq) | 126 MB/s | 84 MB/s | 85 MB/s | 54 MB/s
Read (random) | 6450 IOPS | 277 IOPS | 384 IOPS | 102 IOPS
Write (random) | 351 IOPS | 256 IOPS | 269 IOPS | 118 IOPS

9 of 49 benefit from caching

Energy savings << SSD cost

Wear-out times