Towards Dynamic Green-Sizing for Database Servers
Mustafa Korkmaz, Alexey Karyakin, Martin Karsten, Kenneth Salem
University of Waterloo
Data Center Power Consumption
- Keeps increasing
- About 2% of all US electricity
[Chart: US servers, in millions]
Source: Data Center Efficiency Assessment, Natural Resources Defense Council
Inside a Data Center
- Direct consumption by the server is the largest component
- Servers must also be cooled
Source: Energy Logic: Reducing Data Center Energy Consumption by Creating Savings that Cascade Across Systems, Emerson Network Power, 2010
Our Goal
- Improve Power Efficiency in a DBMS
- In-Memory Transactional Workload
- Two Parts:
  - CPU Power Efficiency
  - Memory Power Efficiency
Source: Analyzing the Energy Efficiency of a Database Server, Tsirogiannis et al., SIGMOD '10
Improving CPU Power Efficiency
- DBMS-Managed Dynamic Voltage & Frequency Scaling
  - Slow the CPU at low load to save energy
  - Speed the CPU at high load to maintain performance
Improving Memory Power Efficiency (Details Are in the Paper)
- Reduce Memory Power Consumption by Allowing Unneeded Memory to Idle
  - Example: 8 GB DB in a 64 GB server, so up to 56 GB of memory can idle
- Not Trivial
  - Must control the virtual-to-physical DIMM mapping to use as few DIMMs as possible
- Estimation: 8 GB DB on a 64 GB server
  - 40% power reduction over the default configuration
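As a concrete (hypothetical) configuration: a 64 GB server built from eight 8 GB DIMMs could hold the entire 8 GB database on a single DIMM, letting the other seven idle in a low-power state; the 40% figure above is the estimated saving from this kind of consolidation.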
Talk Outline
- Motivation & Introduction
- DBMS-Managed Dynamic Voltage Frequency Scaling
  - Background
  - Proposed Work
  - Results
- Conclusion & Future Work
Why Power Management in DBMS?
- Power Is Already Managed at the Hardware & Kernel Level
- DBMS Has Unique Information
  - Workload characteristics
    - Quality of Service (QoS): latency budget
  - Database characteristics
    - Size, locality
Database Workload
- Workload Is Not Steady
  - Patterns, fluctuations, bursts
- Systems Are Over-provisioned
  - Configured for the peak load
- Lower Loads? Scale Power
Dynamic Voltage Frequency Scaling (DVFS)
- Recent CPUs Support Multiple Frequency Levels
- Can Be Adjusted Dynamically

AMD FX-6300:
P-State | Voltage | Frequency
P0      | 1.4 V   | 3.5 GHz
P1      |         | 3.0 GHz
P2      |         | 2.5 GHz
P3      |         | 2.0 GHz
P4      | 0.9 V   | 1.4 GHz
Existing DVFS Management
- The Linux Kernel Supports DVFS Governors
  - Static and dynamic governors
- Dynamic Governors
  - Sample CPU utilization
  - Use the difference between samples to make scaling decisions
DBMS-Managed DVFS
- Varying Load Means Varying Transaction Latency
- Our Approach: Exploit the Latency Budget
  - Except at peak load, slow down execution
  - Stay under the latency budget
How Slowing Helps
- Low Frequency Is More Power Efficient
[Figure: the same work costs 0.07 joules at high frequency vs. 0.04 joules at low frequency]
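A rough back-of-the-envelope using the AMD FX-6300 table above: dynamic CPU power scales roughly with V²·f, so P4 relative to P0 draws about (0.9/1.4)² × (1.4/3.5) ≈ 0.17 of the power, while CPU-bound work takes 3.5/1.4 = 2.5× longer, for roughly 0.4× the energy. Static and background power narrow the gap, which is consistent with the measured 0.04 vs. 0.07 joules.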
How to Scale Power in a DBMS
- Set the Frequency Before a Transaction Executes
  - Predict the response time of each waiting transaction
  - Select the CPU frequency level
    - Slowest possible that stays under the latency budget
- Emergency: High Number of Waiting Transactions
  - Set the maximum frequency
DVFS in Shore-MT
- Each Worker Thread
  - Has a transaction wait queue
  - Is pinned to a core
  - Controls that core's frequency level
[Diagram: Workers 1-6, each pinned to Cores 1-6]
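A minimal C++ sketch of the per-worker mechanics on Linux, assuming root privileges and the "userspace" cpufreq governor; pinToCore and setFrequencyKHz are illustrative helpers, not Shore-MT code (compile with -pthread):

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for pthread_setaffinity_np
#endif
#include <pthread.h>
#include <sched.h>
#include <cstdio>
#include <string>

// Pin the calling worker thread to a single core.
void pinToCore(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

// Ask the kernel to run the core at a given frequency through the
// cpufreq sysfs interface (requires the "userspace" governor and root).
void setFrequencyKHz(int core, long khz) {
    std::string path = "/sys/devices/system/cpu/cpu" + std::to_string(core)
                     + "/cpufreq/scaling_setspeed";
    if (FILE* f = std::fopen(path.c_str(), "w")) {
        std::fprintf(f, "%ld", khz);
        std::fclose(f);
    }
}
```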
Latency Aware P-State Selection (LAPS)

Example setup:
- Latency budget: 600
- Wait queue: Trx1 (waited 150), Trx2 (waited 130), Trx3 (waited 60)
- Service time prediction: P0 = 100, P1 = 120, P2 = 150, P3 = 200, P4 = 270

A transaction's predicted latency is its wait so far, plus one predicted service time for each transaction ahead of it in the queue, plus its own. Starting from the slowest P-state (P4):
1. Trx1 at P4: 150 + 270 = 420 ≤ 600. P4 is fast enough for Trx1; check the next transaction.
2. Trx2 at P4: 130 + 2 × 270 = 670 > 600. P4 is not fast enough; try the next frequency level.
3. Trx2 at P3: 130 + 2 × 200 = 530 ≤ 600. P3 is fast enough for Trx2; set the next P-state to P3 and check the next transaction.
4. Trx3 at P3: 60 + 3 × 200 = 660 > 600. P3 is not fast enough; try the next frequency level.
5. Trx3 at P2: 60 + 3 × 150 = 510 ≤ 600. P2 is fast enough for Trx3; set the next P-state to P2.
6. All transactions visited: change the core to P2 and execute Trx1 under P2.
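A compact C++ sketch of this selection loop, using the numbers from the walkthrough. The names (selectPState, kServiceTime, kBudget) and structure are illustrative, not taken from the Shore-MT implementation; the latency model is the simple one above, in which every queued transaction is assumed to run at the candidate P-state:

```cpp
#include <cstdio>
#include <vector>

// Predicted service time at each P-state (P0..P4), from the walkthrough.
const std::vector<int> kServiceTime = {100, 120, 150, 200, 270};
const int kBudget = 600;  // latency budget

// Return the slowest P-state index that keeps every queued transaction
// within the latency budget. If even P0 is too slow, 0 is returned,
// matching the "emergency: set maximum frequency" rule.
int selectPState(const std::vector<int>& waitTimes) {
    int p = static_cast<int>(kServiceTime.size()) - 1;  // start at the slowest
    for (size_t i = 0; i < waitTimes.size(); ++i) {
        // Transaction i waits behind i others, then runs itself:
        // (i + 1) executions at the candidate P-state's service time.
        while (p > 0 &&
               waitTimes[i] + static_cast<int>(i + 1) * kServiceTime[p] > kBudget) {
            --p;  // not fast enough; try the next faster level
        }
    }
    return p;
}

int main() {
    std::vector<int> waits = {150, 130, 60};  // Trx1, Trx2, Trx3
    std::printf("Selected P-state: P%d\n", selectPState(waits));  // prints P2
}
```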
Experimental Setup
- System: AMD FX-6300, 6 cores, 5 P-states, Ubuntu 14.04, Kernel 3.13
  - "Watts up?" power meter
- TPC-C
  - 12 warehouses, single transaction type: NEW_ORDER
- Shore-MT
  - 12 clients, each issuing requests for a different warehouse
  - 6 workers, one worker per core, 12 GB buffer pool
- Experiment Workloads
  - High, medium, and low offered load
Results – Medium Load
[Graph: medium-load power, with 23 W and 42 W marked]
Results – Frequency Residency
[Graph]
Results – Low Load
[Graph]
Results – High Load
[Graph]
Conclusion
- DBMS-Managed DVFS
  - Exploited workload characteristics: the transaction latency budget
  - Reduces CPU power while ensuring performance
Future Work
- DBMS-Managed CPU Power
  - Better prediction
  - Scheduling
- DBMS-Managed Memory Power
  - Workload-related capacity/performance decisions
- CPU/Memory Hybrid Approach
Thank You
Questions?
Power Model
- Operation Power
  - Memory access operations: ACTIVATE, READ, WRITE
  - Optimization is in the CPU domain (cache awareness, algorithm design)
- Background Power
  - STANDBY(ACTIVE), POWER-DOWN, SELF-REFRESH
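To make the background-power term concrete, a toy C++ estimate; the per-state wattages are illustrative placeholders (real values come from the DRAM datasheet), not measurements from this work:

```cpp
#include <cstdio>

// Illustrative per-rank background power by state (placeholder values,
// not from the paper; consult the DRAM datasheet for real numbers).
const double kStandbyW     = 0.9;  // STANDBY(ACTIVE)
const double kPowerDownW   = 0.3;  // POWER-DOWN
const double kSelfRefreshW = 0.1;  // SELF-REFRESH

// Background power of one rank, given the fraction of time it spends
// in each state (fractions must sum to 1).
double backgroundPower(double standby, double powerDown, double selfRefresh) {
    return standby * kStandbyW + powerDown * kPowerDownW
         + selfRefresh * kSelfRefreshW;
}

int main() {
    // An actively used rank vs. a rank parked in self-refresh:
    std::printf("busy rank: %.2f W\n", backgroundPower(0.8, 0.2, 0.0));
    std::printf("idle rank: %.2f W\n", backgroundPower(0.0, 0.0, 1.0));
}
```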
Memory Control Challenges
- Default Memory Access Is Interleaved
  - Uses all ranks; data is spread across them
  - Enables concurrent, multi-rank reads/writes
- Memory Address Mapping
  - Which physical memory ranks back the application's pages
Proposed Work
- Our Approach: Exploit the Opportunity to Scale Background Power
  - Keep unused memory ranks in their lowest power state
- Non-interleaved Placement
  - Store data only in selected ranks
  - Activate additional ranks as memory use grows
- Possible Performance Degradation
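On stock Linux, the closest readily available control is NUMA-node granularity via libnuma; confining the buffer pool to one node is only an analogy for the rank-level placement proposed here, which needs finer-grained OS/firmware support. A minimal sketch (link with -lnuma; the 8 GB size echoes the earlier example DB):

```cpp
#include <numa.h>
#include <cstdio>

int main() {
    if (numa_available() < 0) {
        std::fprintf(stderr, "NUMA is not supported on this system\n");
        return 1;
    }
    // Place the whole pool on node 0 so memory on other nodes can idle.
    const size_t kPoolBytes = 8UL << 30;  // 8 GB, the example DB size
    void* pool = numa_alloc_onnode(kPoolBytes, 0);
    if (pool == nullptr) {
        std::fprintf(stderr, "allocation failed\n");
        return 1;
    }
    // ... use pool as the DBMS buffer pool ...
    numa_free(pool, kPoolBytes);
    return 0;
}
```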
Results – DRAM Power
[Graph]
DVFS in Shore-MT
- Each Worker
  - Has a transaction wait queue
  - Is pinned to a core
  - Controls that core's frequency level
- Clients
  - Submit requests to workers
  - All pinned to a core