PrincetonUniversity From application requests to Virtual IOPs: Provisioned key-value storage with Libra David Shue * and Michael J. Freedman (*now at Google)

Slides:

Advertisements

Similar presentations

Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky David G. Andersen * Princeton, Intel Labs, CMU Dont Settle for Eventual : Scalable Causal Consistency.

Advertisements

Gecko Storage System Tudor Marian, Lakshmi Ganesh, and Hakim Weatherspoon Cornell University.

Sweet Storage SLOs with Frosting Andrew Wang, Shivaram Venkataraman, Sara Alspaugh, Ion Stoica, Randy Katz.

PrincetonUniversity Performance Isolation and Fairness for Multi-Tenant Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM.

Green Cloud Computing Hadi Salimi Distributed Systems Lab, School of Computer Engineering, Iran University of Science and Technology,

Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage Wei Zhang, Tao Yang, Gautham Narayanasamy University of California at Santa Barbara.

Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds School of Computer Engineering Nanyang Technological University,

Performance Engineering Methodology Chapter 4. Performance Engineering Performance engineering analyzes the expected performance characteristics of a.

What will my performance be? Resource Advisor for DB admins Dushyanth Narayanan, Paul Barham Microsoft Research, Cambridge Eno Thereska, Anastassia Ailamaki.

FAWN: A Fast Array of Wimpy Nodes Presented by: Aditi Bose & Hyma Chilukuri.

Fair Scheduling in Web Servers CS 213 Lecture 17 L.N. Bhuyan.

ENFORCING PERFORMANCE ISOLATION ACROSS VIRTUAL MACHINES IN XEN Diwaker Gupta, Ludmila Cherkasova, Rob Gardner, Amin Vahdat Middleware '06 Proceedings of.

Scale-out File Server Cluster Hyper-V Cluster Virtual Machines SMB3 Storage Network Fabric.

Bandwidth Allocation in a Self-Managing Multimedia File Server Vijay Sundaram and Prashant Shenoy Department of Computer Science University of Massachusetts.

Everest: scaling down peak loads through I/O off-loading D. Narayanan, A. Donnelly, E. Thereska, S. Elnikety, A. Rowstron Microsoft Research Cambridge,

IOFlow: A Software-defined Storage Architecture Eno Thereska, Hitesh Ballani, Greg O’Shea, Thomas Karagiannis, Antony Rowstron, Tom Talpey, Richard Black,

Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.

Module 8: Monitoring SQL Server for Performance. Overview Why to Monitor SQL Server Performance Monitoring and Tuning Tools for Monitoring SQL Server.

Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives Junghee Lee, Youngjae Kim, Sarp Oral, Galen M. Shipman, David A. Dillow,

Measuring zSeries System Performance Dr. Chu J. Jong School of Information Technology Illinois State University 06/11/2012 Sponsored in part by Deer &

Copyright © 2010 Platform Computing Corporation. All Rights Reserved.1 The CERN Cloud Computing Project William Lu, Ph.D. Platform Computing.

Integrated Risk Analysis for a Commercial Computing Service Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS) Lab. Dept.

Introduction To Windows Azure Cloud

Adaptive Control of Virtualized Resources in Utility Computing Environments HP Labs: Xiaoyun Zhu, Mustafa Uysal, Zhikui Wang, Sharad Singhal University.

Key Perf considerations & bottlenecks Windows Azure VM characteristics Monitoring TroubleshootingBest practices.

A Bandwidth-aware Memory-subsystem Resource Management using Non-invasive Resource Profilers for Large CMP Systems Dimitris Kaseridis, Jeffery Stuecheli,

Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis [1] 4/24/2014 Presented by: Rakesh Kumar [1 ]

Logging in Flash-based Database Systems Lu Zeping

Storage Management in Virtualized Cloud Environments Sankaran Sivathanu, Ling Liu, Mei Yiduo and Xing Pu Student Workshop on Frontiers of Cloud Computing,

Improving Disk Latency and Throughput with VMware Presented by Raxco Software, Inc. March 11, 2011.

Profile Driven Component Placement for Cluster-based Online Services Christopher Stewart (University of Rochester) Kai Shen (University of Rochester) Sandhya.

1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.

Speaker: 吳晋賢 (Chin-Hsien Wu) Embedded Computing and Applications Lab Department of Electronic Engineering National Taiwan University of Science and Technology,

Tag line, tag line Power Management in Storage Systems Kaladhar Voruganti Technical Director CTO Office, Sunnyvale June 12, 2009.

Abdulelah Alwabel Fault-Aware VM Allocation Mechanism For Desktop Clouds.

Quality of Service Karrie Karahalios Spring 2007.

PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research.

The Owner Share scheduler for a distributed system 2009 International Conference on Parallel Processing Workshops Reporter: 李長霖.

Quantifying and Improving I/O Predictability in Virtualized Systems Cheng Li, Inigo Goiri, Abhishek Bhattacharjee, Ricardo Bianchini, Thu D. Nguyen 1.

A Cyclic-Executive-Based QoS Guarantee over USB Chih-Yuan Huang,Li-Pin Chang, and Tei-Wei Kuo Department of Computer Science and Information Engineering.

Embedded System Lab. 정범종 A_DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters H. Wang et al. VEE, 2015.

Arne Wiebalck -- VM Performance: I/O

Scalability == Capacity * Density.

Enabling the Cloud OS Today  New high-density Web Sites with elastic cloud scaling and complete dev-ops experiences  New rich IaaS experience for self-service.

Introduction to Exadata X5 and X6 New Features

Architecture for Resource Allocation Services Supporting Interactive Remote Desktop Sessions in Utility Grids Vanish Talwar, HP Labs Bikash Agarwalla,

 The emerged flash-memory based solid state drives (SSDs) have rapidly replaced the traditional hard disk drives (HDDs) in many applications.  Characteristics.

Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.

Dynamic Resource Allocation for Shared Data Centers Using Online Measurements By- Abhishek Chandra, Weibo Gong and Prashant Shenoy.

Microsoft Virtual Academy

Running virtualized Hadoop, does it make sense?

Scaling In e Scaling Out através do elastic pool

Couchbase Server is a NoSQL Database with a SQL-Based Query Language

Data Warehouse in the Cloud – Marketing or Reality?

Analyzing Security and Energy Tradeoffs in Autonomic Capacity Management Wei Wu.

Windows Azure Migrating SQL Server Workloads

Azure SQL Database – Scaling in and Scaling out with elastic pool

Azure SQL Database – Scaling in and Scaling out with elastic pool

VNX Storage Report Project: Sample VNX Report Project ID:

COS 518: Advanced Computer Systems Lecture 11 Michael Freedman

XtremIO Storage Array Asset Profile

HashKV: Enabling Efficient Updates in KV Storage via Hashing

Building a Database on S3

Arash Tavakkol, Mohammad Sadrosadati, Saugata Ghose,

Cloud Computing Architecture

Cloud Computing Architecture

Request Units & Billing

COS 518: Advanced Computer Systems Lecture 12 Michael Freedman

Design Tradeoffs for SSD Performance

Presentation transcript:

PrincetonUniversity From application requests to Virtual IOPs: Provisioned key-value storage with Libra David Shue * and Michael J. Freedman (*now at Google)

Shared Cloud Tenant ATenant B VM Tenant C VM

Shared Cloud Storage Tenant ATenant B VM Key-Value Storage Block Storage SQL Database Tenant C VM

Unpredictable Shared Cloud Storage Tenant ATenant B VM Key-Value Storage Block Storage SQL Database Tenant C VM Disk IO-bound Tenants SSD-backed storage Key-Value Storage

Provisioned Shared Key- Value Storage Tenant A VM Tenant B VM Tenant C VM Application Requests Shared Key-Value Storage Low-level IO SSD GET/s PUT/s IOPs BW Reservation A Reservation B Reservation C (1KB normalized)

6 Libra Contributions Libra IO Scheduler - Provisions low-level IO allocations for app-request reservations w/ high utilization. - Supports arbitrary object distributions and workloads. 2 key mechanisms - Track per-tenant app-request resource profiles. - Model IO resources with Virtual IOPs.

7 Related Work Storage Type App- request s Work Conserving Medi a MaestroBlockNNHDD mClockBlockNYHDD FlashFQBlockNYSSD DynamoD B Key- Value YNSSD

Provisioned Distributed Key-Value Storage Tenant ATenant B VM Reservation A Reservation B Tenant B VM Storage Node N Storage Node 1... Shared Key-Value Storage Global Reservation Problem Local Reservation Problem [Pisces OSD1 ’12]

9 Storage Node N Provisioned Distributed Key-Value Storage Reservation A Reservation B Storage Node 1 data partitions demand... Tenant A VM Tenant B VM

10 Storage Node N Storage Node 1 Provisioned Distributed Key-Value Storage Reservation A Reservation B... Reservation A Reservation B

11 Storage Node N Storage Node 1 Provisioned Distributed Key-Value Storage... Res An Res Bn Res A1 Res B1 Reservation A Reservation B Reservation A Reservation B

12 Key-Value Protocol Persistence Engine IO Scheduler Physical Disk Retrieve K Read l337 IO operation Provisioned Local Key-Value Storage GET K GET Libra IO Scheduler

13 blah Libra Design Libra Provisioning Policy Libra IO Scheduler GET PUT How much IO to consume? Reservation Distribution Policy How much IO to provision? Persistence Engine Physical Disk DRR Determine the cost of a tenant request Provision IO allocations for tenant app-request reservations

14 Provisioning App-request Reservations is Hard Reservation Bn IO Amplification IO Interference Non-linear IO Performance 1 KB PUT ≥ 1 KB Write Variable IO throughput Underestimate provisionable IO Non-linear cost per KB Model IO with Virtual IOPs Track tenant app-request resource profiles

15 Workload-dependent IO Amplification PUT PUT K,V 3 V3V3 V2V2 FLUSH V3V3 index A M LevelDB (LSM-Tree) COMPACT V1V1 index HV V0V0 FO V3V3 A O

16 Workload-dependent IO Amplification PUT FLUSH COMPACT PUT K,V 3 V3V3 V2V2 V3V3 index A M V1V1 HV V0V0 FO V3V3 A O LevelDB (LSM-Tree)

17 Workload-dependent IO Amplification PUT FLUSH COMPACT PUT K,V 3 V3V3 V2V2 V3V3 index A M V1V1 HV V0V0 FO V3V3 A O LevelDB (LSM-Tree)

18 Workload-dependent IO Amplification GET K index A M Workload-dependent IO Amplification index HV FO K GZ

19 Libra Tracks App-request IO Consumption to Determine IO Allocations Libra IO Scheduler 5 GET 25 PUT FLUSH COMPACT Per-PUT Per-GET EWMA per-request averages Compute app- request IO profiles GET x x PUT IO = Tenant A 5 IO units PUT FLUSH Track IO consumption Provision IO allocations blah Libra Provisioning Policy IO Tenant A 500 IO/s

20 Unpredictable IO Interference Die-level parallelism, low latency IOPs 4 read/4 write tenants Shared-controller and bus contention Erase-before-write overhead FTL and read-modify- write garbage colleciton

21 Unpredictable IO Interference 21 Unpredictable IO Interference

22 Libra Underestimates IO Capacity to Ensure Provisionable Throughput Provisionable IO throughput = floor(workloads) (18 Kop/s) Provisionable IO limit

23 Libra Underestimates IO Capacity to Ensure Provisionable Throughput Provisionable IO throughput = floor(workloads) (18 Kop/s) Provisionable IO limit

24 Libra Underestimates IO Capacity to Ensure Provisionable Throughput Provisionable IO throughput = floor(workloads) (18 Kop/s) Provisionable IO limit

25 Non-linear IO Performance IOP Cost = linear decrease until 6 KB, then constant IOP Cost = constant until 6 KB, then linear increase Max BW Max IOP/s

26 Non-linear IO Performance Max BW Max IOP/s

27 Libra Uses Virtual IOPs to Model IO Resources Unifies IO cost into a single metric Captures non-linear IO performance Provides IO insulation 2 equal-allocation tenants IO Insulation = 1/2 Max Read/Write

28 Libra Uses Virtual IOPs to Model IO Resources 2 equal-allocation tenants IO Insulation = 1/2 Max Read/Write

29 blah Libra Design Libra Provisioning Policy Libra IO Scheduler Persistence Engine Physical Disk Determine the cost of a tenant request Provision IO allocations for tenant app-request reservations Charge tenant IOPs based on VOP cost Track app-request VOP consumption Provision VOPs within provisionable limit Update tenant VOP allocations

30 Evaluation Does Libra's IO resource model achieve accurate resource allocations? Does Libra's IO threshold make an acceptable tradeoff of performance for predictability in a real storage stack? Can Libra ensure per-tenant app-request reservations while achieving high utilization?

31 Interference-free Ideal Libra Achieves Accurate IO Allocations Throughput Ratio = Actual / Expected (IO Insulation) Read 1 KB even

32 Interference-free Ideal Libra Achieves Accurate IO Allocations Write KB Throughput Ratio = Actual / Expected (IO Insulation)

33 Libra Achieves Accurate IO Allocations

34 Libra Achieves Accurate IO Allocations

35 Libra Achieves Accurate IO Allocations Min-Max Ratio = Min Throughput Ratio / Max Throughput Ratio

36 Libra Trades-off Nominal IO Throughput For Predictability

37 Libra Trades-off Nominal IO Throughput For Predictability < 10th percentile covered by SLA and higher-level policies Percentile Workload10th50th80thAll 99:1 1.6%30.5%40.5%45.8% 25:75 1.4%14.9%25.0%34.7% 1:99 0.7%12.2%19.5%28.1% Unprovisionable Throughput As a Percentage of Total Throughput

38 Libra Achieves App-request Reservations Read HeavyMixedWrite Heavy 1.5x0.5x Work-conserving consumption of unprovisioned resources Fully provisioned allocations

39 Libra Achieves App-request Reservations Read HeavyMixedWrite Heavy

40 Libra IO Scheduler  Provisions IO allocations for app-request reservations w/ high utilization.  Supports arbitrary object distributions and workloads. 2 key mechanisms  Track per-tenant app-request resource profiles.  Model IO resources with Virtual IOPs. Evaluation  Achieves accurate low-level IO allocations.  Provisions the majority of IO resources over a wide range of workloads  Satisfies app-request reservations w/ high utilization. Conclusion