Frequency Governors for Cloud Database OLTP Workloads

Rathijit Sen and Alan Halverson
Gray Systems Lab, Microsoft Corporation

Overview
- Goal: maximize power savings while serving the offered load
  - Reduction in COGS
  - Meet customer SLOs
- Mechanism: per-core frequency control
- Decision metric: effective utilization
  - Allows more aggressive operation than traditional utilization
  - New reactive governor leverages this metric
- Study: Microsoft SQL Server on Linux, transactional cloud workload

Outline
- Cloud database workload overview
- Existing frequency governors
- Reactive governor motivation and design
- Experiments
- Conclusion

Cloud Workload: Schema
- Six tables
  - Fixed size: 2
  - Scaling: 3
  - Growing: 1
- Data types: integer, numeric, character, date/time
- Constraints: primary and secondary keys, no foreign key
- Synthetic data: distributions, permutations of sets of values, weighted lists of words, etc.
- Designed to support a broad range of operations

Cloud Workload: Transaction Mix
- Read Lite: SELECT; in-memory; read-only
- Read Medium: SELECT; mostly in-memory; read-only
- Read Heavy: SELECT; mostly not in-memory; read-only
- Update Lite: UPDATE; in-memory; read-write
- Update Heavy: UPDATE; mostly not in-memory; read-write
- Insert Lite: INSERT; in-memory; read-write
- Insert Heavy: INSERT; mostly not in-memory; read-write
- Delete: DELETE; mix of in-memory and not in-memory; read-write
- CPU Heavy: SELECT; in-memory; relatively heavy CPU load; read-only

Cloud Workload: Pacing Delay (D)
- "Think time" between transactions
- Drawn from a negative exponential distribution with mean D, capped at 10x the mean
- TPS increases as D decreases
- We consider D = 1, 0.5, 0.2, 0.1, 0.05, 0.04, 0
  - Other values could be chosen
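The pacing rule can be illustrated with a minimal Python sketch. The function name `pacing_delay` and its parameters are hypothetical, the unit of D is not stated on the slide, and treating D = 0 as "no think time" is an assumption.

```python
import random

def pacing_delay(mean_delay, rng=random):
    """Sample one think time: exponential with the given mean, capped at 10x the mean."""
    if mean_delay <= 0:
        # Assumption: D = 0 means transactions are issued back to back.
        return 0.0
    # expovariate takes the rate (1 / mean) of the exponential distribution.
    sample = rng.expovariate(1.0 / mean_delay)
    return min(sample, 10.0 * mean_delay)

if __name__ == "__main__":
    rng = random.Random(42)
    for d in (1, 0.5, 0.2, 0.1, 0.05, 0.04, 0):
        print(d, pacing_delay(d, rng))
```

A smaller D shortens the average gap between transactions, which is why TPS rises as D falls.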

Linux Frequency Governors
- Intel P-state
  - Performance: static, highest available frequency
  - Powersave: dynamic, load-dependent
- Cpufreq
  - Ondemand: dynamic, load-dependent
  - Conservative: dynamic, load-dependent
  - Powersave: static, lowest frequency
  - Userspace: up to the administrator
- Reactive: dynamic, load-dependent; operates with Cpufreq Userspace enabled
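For context, a user-level governor running on top of Cpufreq Userspace typically sets a core's frequency through the standard Linux cpufreq sysfs files `scaling_governor` and `scaling_setspeed`. The sketch below is an illustration, not the authors' implementation; the helper name `set_core_frequency` and the `sysfs_root` parameter are hypothetical. On a real host it needs root privileges and a driver that offers the userspace governor.

```python
from pathlib import Path

def set_core_frequency(cpu, freq_khz, sysfs_root="/sys/devices/system/cpu"):
    """Select the userspace governor for one core and request a frequency in kHz."""
    cpufreq = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    # Hand frequency control for this core to user space.
    (cpufreq / "scaling_governor").write_text("userspace")
    # scaling_setspeed expects the target frequency in kHz.
    (cpufreq / "scaling_setspeed").write_text(str(freq_khz))
```

Making the sysfs root a parameter lets the function be exercised against a temporary directory without touching the real hardware settings.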

Design Constraint: No Application Knowledge
- Layers: host server, VM, Docker, app
- What application is running?
- Is it transactional?
- When do transactions start/end?
- How many queries/responses per transaction?
- Which guest thread does this request correspond to?
- …

Reactive Governor Design
- Goal: react to load, but use minimum power
- Constraint: only host server (and host OS) counters available
- %C0 is not enough!
  - Other idle states such as C1, C3 may be occupied even under full load
    - Write-ahead logging
    - Network processing
    - Scheduling delays
    - …
  - %C0 (and performance) can decrease if frequency is erroneously lowered

Example execution at max. frequency

Reactive Governor Heuristics/Assumptions
- %C0 scales inversely with frequency
  - Deviations in practice due to memory stalls
- %C6 decreases as load increases or as frequency decreases
  - %C0 has opposite behavior
  - %C6 may be fully eliminated
- %C1 + … + %C5 may not change with frequency at a fixed load
  - As long as system is not over-committed
  - Deviations in practice, e.g., due to kernel processing

Effective Utilization

Reactive Governor Operations
- Is eff > Threshold?
  - Yes: increase current frequency by [increment not shown], subject to some cap
  - No: new frequency = eff x current frequency
- Repeat
- Parameter choices can trade off power savings against performance
- Targeted for 10 ms intervals, invoked about 70 times per second
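One decision step of this loop can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' code: the function name `next_frequency`, the threshold of 0.9, and the 100 MHz step are hypothetical placeholders, and the 1.2 to 2.1 GHz clamp simply reuses the non-turbo range from the experimental setup. The effective utilization `eff` is taken as an input because its definition is not preserved in the slide text.

```python
def next_frequency(eff, current_khz,
                   threshold=0.9,        # hypothetical value
                   step_khz=100_000,     # hypothetical increment
                   min_khz=1_200_000,    # 1.2 GHz, from the setup slide
                   max_khz=2_100_000):   # 2.1 GHz, used here as the cap
    """Return the frequency (kHz) to request for the next interval."""
    if eff > threshold:
        # Load is high relative to the current frequency: step up.
        proposed = current_khz + step_khz
    else:
        # Otherwise scale the frequency in proportion to effective utilization.
        proposed = eff * current_khz
    # Keep the request inside the range the hardware supports.
    return int(round(min(max(proposed, min_khz), max_khz)))

if __name__ == "__main__":
    freq = 2_100_000
    for eff in (0.5, 0.7, 0.95, 0.95):
        freq = next_frequency(eff, freq)
        print(eff, freq)
```

A lower threshold or a larger step makes the governor ramp up sooner, trading some power savings for performance.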

Experimental Setup
- Client-server setup, transactional cloud workload
- E5-2620 v4 (Broadwell Xeon)
  - Dual socket, 8 cores/socket x 2 (hyperthreading enabled)
  - 1.2 GHz – 2.1 GHz
  - Turbo: up to 3.0 GHz; all cores: 2.3 GHz
  - Per-core frequency control
- 64 GB DDR4
- 2 x 512 GB SSD
- BIOS setting favors energy-efficient operation
- Microsoft SQL Server on Linux (Ubuntu 16.04.1 LTS), CTP 1.1

Power-Performance: Intel P-state & Reactive
- Approx. 30 W savings at D=0.1 over P-state Performance
- Approx. 20 W savings at D=0.1 over P-state Powersave

Power-Performance: Cpufreq & Reactive
- Reactive is the most power-efficient among governors that serve full load
- Approx. 7% performance loss at D=0.05, 0.04

Service Level Objectives (SLOs)
- Dependent on class of service; resource allocations also vary
- Premium: 95th percentile ≤ 0.5 second
- Standard: 90th percentile ≤ 1.0 second
- Basic: 80th percentile ≤ 2.0 second
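Checking a run against these tiers could look like the Python sketch below. It assumes the thresholds apply to transaction response times in seconds and uses the nearest-rank percentile definition. Both are assumptions, since the slide does not say how the percentile is computed. The names `SLOS`, `percentile`, and `meets_slo` are hypothetical.

```python
# Tier -> (percentile, limit in seconds), taken from the SLO slide.
SLOS = {"Premium": (95, 0.5), "Standard": (90, 1.0), "Basic": (80, 2.0)}

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def meets_slo(response_times, tier):
    """True if the tier's percentile of response times is within its limit."""
    pct, limit = SLOS[tier]
    return percentile(response_times, pct) <= limit

if __name__ == "__main__":
    times = [0.1] * 94 + [3.0] * 6
    for tier in SLOS:
        print(tier, meets_slo(times, tier))
```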

Conclusion
- Hosting database workloads at cloud scale brings the need to manage both COGS and customer SLOs
- We propose a reactive per-core frequency governor to reduce power and power-related costs while meeting SLOs
- The key concept is effective utilization (eff); frequency changes are based on this metric