E-MOS: Efficient Energy Management Policies in Operating Systems

Slides:



Advertisements
Similar presentations
Chapter 9 Contributed by Alex Turek
Advertisements

Dynamic Thread Assignment on Heterogeneous Multiprocessor Architectures Pree Thiengburanathum Advanced computer architecture Oct 24,
ECOS: Leveraging Software-Defined Networks to Support Mobile Application Offloading Aaron Gember, Christopher Dragga, Aditya Akella University of Wisconsin-Madison.
1 MemScale: Active Low-Power Modes for Main Memory Qingyuan Deng, David Meisner*, Luiz Ramos, Thomas F. Wenisch*, and Ricardo Bianchini Rutgers University.
Institute of Networking and Multimedia, National Taiwan University, Jun-14, 2014.
Introduction Operating Systems’ Concepts and Structure Lecture 1 ~ Spring, 2008 ~ Spring, 2008TUCN. Operating Systems. Lecture 1.
SECTION 1: INTRODUCTION TO SIMICS Scott Beamer CS152 - Spring 2009.
1 The Problem of Power Consumption in Servers L. Minas and B. Ellison Intel-Lab In Dr. Dobb’s Journal, May 2009 Prepared and presented by Yan Cai Fall.
By- Jaideep Moses, Ravi Iyer , Ramesh Illikkal and
Energy Model for Multiprocess Applications Texas Tech University.
Hystor : Making the Best Use of Solid State Drivers in High Performance Storage Systems Presenter : Dong Chang.
Power Management of Online Data Intensive Services Meisner et al.
CS 423 – Operating Systems Design Lecture 22 – Power Management Klara Nahrstedt and Raoul Rivas Spring 2013 CS Spring 2013.
The Impact of Performance Asymmetry in Multicore Architectures Saisanthosh Ravi Michael Konrad Balakrishnan Rajwar Upton Lai UW-Madison and, Intel Corp.
A Comparison of Software and Hardware Techniques for x86 Virtualization Keith Adams Ole Agesen Oct. 23, 2006.
Folklore Confirmed: Compiling for Speed = Compiling for Energy Tomofumi Yuki INRIA, Rennes Sanjay Rajopadhye Colorado State University 1.
1 Overview 1.Motivation (Kevin) 1.5 hrs 2.Thermal issues (Kevin) 3.Power modeling (David) Thermal management (David) hrs 5.Optimal DTM (Lev).5 hrs.
Critical Power Slope Understanding the Runtime Effects of Frequency Scaling Akihiko Miyoshi, Charles Lefurgy, Eric Van Hensbergen Ram Rajamony Raj Rajkumar.
Green Computing Power Management Standards Maziar Goudarzi.
Copyright © 2002, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners
Energy Savings with DVFS Reduction in CPU power Extra system power.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Operating Systems Overview Part 2: History (continued)
1 Tuning Garbage Collection in an Embedded Java Environment G. Chen, R. Shetty, M. Kandemir, N. Vijaykrishnan, M. J. Irwin Microsystems Design Lab The.
Srihari Makineni & Ravi Iyer Communications Technology Lab
An Overlay Network Providing Application-Aware Multimedia Services Maarten Wijnants Bart Cornelissen Wim Lamotte Bart De Vleeschauwer.
Energy Management in Virtualized Environments Gaurav Dhiman, Giacomo Marchetti, Raid Ayoub, Tajana Simunic Rosing (CSE-UCSD) Inside Xen Hypervisor Online.
LINUX System : Lecture 7 Bong-Soo Sohn Lecture notes acknowledgement : The design of UNIX Operating System.
Processes Introduction to Operating Systems: Module 3.
Towards Dynamic Green-Sizing for Database Servers Mustafa Korkmaz, Alexey Karyakin, Martin Karsten, Kenneth Salem University of Waterloo.
Embedded System Lab. 김해천 The TURBO Diaries: Application-controlled Frequency Scaling Explained.
VGreen: A System for Energy Efficient Manager in Virtualized Environments G. Dhiman, G Marchetti, T Rosing ISLPED 2009.
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015.
Heracles: Improving Resource Efficiency at Scale ISCA’15 Stanford University Google, Inc.
Dynamic Voltage Frequency Scaling for Multi-tasking Systems Using Online Learning Gaurav DhimanTajana Simunic Rosing Department of Computer Science and.
Understanding Performance, Power and Energy Behavior in Asymmetric Processors Nagesh B Lakshminarayana Hyesoon Kim School of Computer Science Georgia Institute.
Embedded System Lab. 정범종 A_DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters H. Wang et al. VEE, 2015.
UNIX Unit 1- Architecture of Unix - By Pratima.
Lev Finkelstein ISCA/Thermal Workshop 6/ Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)
Critical Power Slope: Understanding the Runtime Effects of Frequency Scaling Akihiko Miyoshi †,Charles Lefurgy ‡, Eric Van Hensbergen ‡, Ram Rajamony ‡,
Energy-Aware Resource Adaptation in Tessellation OS 3. Space-time Partitioning and Two-level Scheduling David Chou, Gage Eads Par Lab, CS Division, UC.
Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS
Full and Para Virtualization
Technical Seminar Presentation 2004 Presented by- Geetanjali Konhar EE O81 1 Dynamic power management for embedded system “ Dynamic power management.
Embedded System Lab. 오명훈 Addressing Shared Resource Contention in Multicore Processors via Scheduling.
An Integrated GPU Power and Performance Model (ISCA’10, June 19–23, 2010, Saint-Malo, France. International Symposium on Computer Architecture)
Shouqing Hao Institute of Computing Technology, Chinese Academy of Sciences Processes Scheduling on Heterogeneous Multi-core Architecture.
Tackling I/O Issues 1 David Race 16 March 2010.
IMPLEMENTING A LEADING LOADS PERFORMANCE PREDICTOR ON COMMODITY PROCESSORS BO SU † JOSEPH L. GREATHOUSE ‡ JUNLI GU ‡ MICHAEL BOYER ‡ LI SHEN † ZHIYING.
OPERATING SYSTEMS DO YOU REQUIRE AN OPERATING SYSTEM IN YOUR SYSTEM?
Rakesh Kumar Keith Farkas Norman P Jouppi,Partha Ranganathan,Dean M.Tullsen University of California, San Diego MICRO 2003 Speaker : Chun-Chung Chen Single-ISA.
Matthew Locke November 2007 A Linux Power Management Architecture.
Evaluation of Advanced Power Management for ClassCloud based on DRBL Rider Grid Technology Division National Center for High-Performance Computing Research.
Computer System Structures
Measuring Performance II and Logic Design
Overview Motivation (Kevin) Thermal issues (Kevin)
CS203 – Advanced Computer Architecture
Performance Comparison of Virtual Machines and Containers with Unikernels Nagashree N Suprabha S Rajat Bansal.
Green cloud computing 2 Cs 595 Lecture 15.
The Multikernel: A New OS Architecture for Scalable Multicore Systems
Reducing OS noise using offload driver on Intel® Xeon Phi™ Processor
Frequency Governors for Cloud Database OLTP Workloads
Smart Batteries Spring 2016 – Update 1.
Overview Introduction VPS Understanding VPS Architecture
Section 1: Introduction to Simics
Haishan Zhu, Mattan Erez
Processes and operating systems
LINUX System : Lecture 7 Lecture notes acknowledgement : The design of UNIX Operating System.
Operating Systems: A Modern Perspective, Chapter 3
A very basic introduction
Presentation transcript:

E-MOS: Efficient Energy Management Policies in Operating Systems Vinay Gangadhar, Clint Lestourgeon, Jay Yang, Varun Dattatreya CS736: Advanced Operating Systems Spring 2015

System Power Management Computing Platforms  Power Management (PM) problem Mobile and Embedded Devices Desktop Systems Server Environment Takeaway Better Control on System Power Consumption Policy decisions favoring Energy Efficient Computing But wait!!  Who has the control ? Power management in computing systems is an important prevalent problem In mobile domain, its mostly the battery reosurce to conserve Desktops  CPU overheating problem Servers  Cost you pay for cooling systems and computation resources for demand

Levels of Power Management Coarse Fine Control KERNEL POWER GOVERNORS P-STATE DRIVERS HARDWARE Run the workload to completion at the maximum performance setting and then transition into a low-power mode Assumes that the highest energy savings can be achieved by running at the lowest performance setting Workload Awareness High Low

Our Project Evaluate existing Linux PM mechanisms Application-aware energy efficient policy decisions  E-MOS model Evaluate E-MOS using User-Space power governors

Outline Introduction Linux Power Governors Case Study Application Categorization & Analysis E-MOS Model Evaluation & Results Lessons Learnt & Conclusion

Core frequency setting Core frequency setting Linux Power Governors CPUFreq Intel P-State Governors Governors Ondemand Conservative PowerSave PowerSave Performance Performance Run the workload to completion at the maximum performance setting and then transition into a low-power mode Assumes that the highest energy savings can be achieved by running at the lowest performance setting OS PM uses ACPI interface to communicate with CPUFreq driver C-States are Processor power states P-states are performance states for a particular operating point  Freq, Voltage Userspace Architecture Independent CPUFreq module Architecture Dependent Core frequency setting Intel P-State driver Core frequency setting

Power Governors Contd.

Applications Compute Intensive  CPU Core Cache Sensitive  Memory Intensive I/O Intensive  Takeaway Cannot apply same policies to all workloads !! Need Application-aware energy management policy decisions Each of these applications demand different CPU and memory demands

Imperfect Scaling Case study Expectation: High CPU usage  Performance should scale with CPU frequency Problem: Cache misses Memory is fast But not fast enough Existing Solutions Better Programming skills Modern Computer Architecture improvements For the programmers part, note that cache misses are already considered harmful, so usually programmers avoid them anyways

Imperfect Scaling Case study For the programmers part, note that cache misses are already considered harmful, so usually programmers avoid them anyways

Application Categorization PIN Based Modeling Could not model the I/O operations SPEC2K6  Compute SPLASH2  Compute + Cache MICRO-BENCH  Cache + Memory

Power Governors’ Analysis SPEC2K6 – Compute Intensive SPLASH2 – Cache Sensitive MICRO-BENCH – Memory Intensive - All governors are optimized for compute workloads and they mainly aim for ‘race to halt condition’ - But this doesn’t work for cache sensitive and memory intensive workloads as it simply wastes the energy of core

Power Governors’ Analysis SPEC2K6 – Compute Intensive SPLASH2 – Cache Sensitive Takeaway – Default power governors Don’t provide better energy efficiency  Optimized only for Compute workloads!! Don’t rely on application characteristics to make better policy decisions OS can relinquish freedom to User space for better energy management MICRO-BENCH – Memory Intensive - All governors are optimized for compute workloads and they mainly aim for ‘race to halt condition’ - But this doesn’t work for cache sensitive and memory intensive workloads as it simply wastes the energy of core

E-MOS Analytical Model RAPL – Running Average Power Limit Perf – Linux utility for performance measurements CPUPower – Utility for frequency measurement Profiled Application Information RAPL + Perf + CPUPower Application-aware Energy Policy E-MOS Analytical Model RAPL – interface to monitor SOC power consumption (sysfs layout) -- Based on the estimates, user can then use a spectrum of scaled frequency for better energy efficiency Reduce Freq. by 5%, 10%, …, 25% steps Increase Freq. by 5%, 10%, …, 25% steps - + User-Space Power Governor Energy Estimates for CPU cores

E-MOS Decision Table Example Reduce Core Freq. by 25% Increase Core Freq. by 25% DRAM frequency scaling needs BIOS settings modifications. But modern RAPL interface to provide some way to limit power of DRAM but not frequency And we test each application with possible scalable frequencies and test how energy efficient it is compared to Ondemand governor Application Objective CPU Freq. DRAM Freq. Compute Intensive Energy Maintain/Increase Decrease Performance Increase Maintain Cache Sensitive Memory Intensive

Evaluation Platform: Kernel: Linux 3.14 CPU: Intel core i7 3630M Scaling frequencies (Ghz): 0.8, 1.2, 1.5, 1.8, 2.1, 2.4, 3.4 (turbo) Energy & timing measurement: Perf (RAPL interface) – Power and Performance measurements CPUPower – Frequency Setting

Results - 1 Compute Intensive workloads -- SPEK2K6 E-MOS Frequency Setting chosen  1.8Ghz to 2.4Ghz -- It should be flexible enough to accommodate research studies of various types With 2.4Ghz  1.44x (44%) Energy Efficiency with Performance loss of 3%

Results - 2 Compute Intensive + Cache Sensitive workloads -- SPLASH2 E-MOS Frequency Setting chosen  1.8Ghz to 2.1Ghz - For cache sensitive, you want to decrease the frequency but not to extent that the perf gets affected too much because of slowing down of core With 1.8Ghz  1.61x (61%) Energy Efficiency with Performance loss of 9%

Results - 3 Memory Intensive workloads -- MICRO-BENCH E-MOS Frequency Setting chosen  1.2Ghz to 1.8Ghz - Interesting that the performance remains the same even with reducing the frequency With 1.2Ghz  2x (200%) Energy Efficiency with Performance loss of 13%

Lessons Learnt Power estimation is challenging but the tools are getting better Performance benchmarks may not be the best way to evaluate power management Based on our results, P-state drivers may not be as awesome as Intel would have you believe Intel’s most useful P-state documentation is a Google+ post !!

Conclusion Application-aware energy management  Energy as a first class resource EMOS achieves upto 2x energy efficiency with performance loss of 13% Power governors can take advantage of reducing CPU frequency for Memory bound applications User-space power governors with more runtime library support can be more efficient than Ondemand

Back Up Slides -- It should be flexible enough to accommodate research studies of various types

Power Governors – Optimized for Compute workloads Energy = Power X Time Benchmarks: SPEC CPU2006, Splash2, Micro benchmarks

ACPI Interface Applications OS Power Management ACPI Software drivers Hardware: CPU, BIOS etc.

Problem with Power Governors Power management tends to be simplistic Most policies use “race to halt” approach Run the workload to completion at the maximum performance setting and then transition into a low-power mode Assumes that the highest energy savings can be achieved by running at the lowest performance setting Either statically scale the CPU frequency or provide limited CPU capability to define energy requirements of an application Use only CPU usage as source of information What about memory bound applications? Do not provide a fine-grained control and management over the energy utilized by the applications