Energy Efficiency through Burstiness Athanasios E. Papathanasiou and Michael L. Scott University of Rochester, Computer Science Department Rochester, NY.


Energy Efficiency through Burstiness. Athanasios E. Papathanasiou and Michael L. Scott, University of Rochester, Computer Science Department, Rochester, NY 14623, U.S.A. WMCSA'03: 5th IEEE Workshop on Mobile Computing Systems and Applications. Presented by Hsu Hao Chen.

Outline Introduction Design Guidelines Prototype Experimental Evaluation Conclusion

Introduction Smoothness: OS resource management policies traditionally employ buffering to smooth I/O, maximizing overall throughput and minimizing the latency of individual requests, but offering little energy efficiency. Burstiness: clustering requests into bursts can improve energy efficiency without a significant impact on performance.

Power-efficient devices: energy consumption parameters for various disks.

Linux File System Behavior (1/2) CD copy (1359 sec). Disk idle time: 1191 seconds; 92% of idle intervals are shorter than 5 seconds.

Linux File System Behavior (2/2) MP3 playback (300 sec). Disk idle time: 292 seconds; 66% of idle intervals are shorter than 8 seconds, and only 6% are longer than the 16-second break-even time.
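The break-even time is the shortest idle interval for which spinning the disk down actually saves energy: the energy of staying active for the interval must exceed the spin-down/spin-up transition cost plus standby power for the remainder. A minimal sketch, with hypothetical parameter values (illustrative only, not the Hitachi DK23DA's actual figures):

```python
def break_even_time(p_active, p_standby, e_down, e_up, t_down, t_up):
    """Shortest idle interval t (seconds) for which spin-down wins.
    Solves: p_active * t = e_down + e_up + p_standby * (t - t_down - t_up),
    i.e. staying active costs as much as transitioning plus standby."""
    transition = e_down + e_up - p_standby * (t_down + t_up)
    return transition / (p_active - p_standby)

# Hypothetical laptop-disk values: watts for powers, joules for
# transition energies, seconds for transition times
t_be = break_even_time(p_active=0.9, p_standby=0.25,
                       e_down=1.5, e_up=5.0, t_down=0.5, t_up=1.6)
```

Any idle interval shorter than this threshold is cheaper to ride out with the disk spinning, which is why the many sub-5-second gaps above yield no savings.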

Disk Energy Savings vs. Memory Size Example: MPEG playback. Increasing memory from 64MB to 496MB yields only a 3.4% increase in disk energy savings.

Design Guidelines (1/3) Maximize idle phases: aggressive, speculative prefetching, with hints to improve accuracy; bursty periodic update; coordination across applications, arranging for all applications to run out of data at the same time (prefetch daemon).
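The "run out of data at the same time" idea can be sketched as follows: given each application's data consumption rate, size each prefetch so that every buffer is expected to drain at the same future instant. The function and parameter names are illustrative, not the paper's actual daemon interface:

```python
def coordinated_prefetch(apps, horizon):
    """apps: {name: (consumption_rate_bytes_per_s, buffered_bytes)}.
    Return bytes to prefetch per app so all buffers drain together
    at `horizon` seconds from now, aligning the next disk burst."""
    plan = {}
    for name, (rate, buffered) in apps.items():
        need = rate * horizon - buffered
        plan[name] = max(0, need)  # never prefetch a negative amount
    return plan

plan = coordinated_prefetch(
    {"mpeg": (1_000_000, 2_000_000),   # 1 MB/s, 2 MB buffered
     "mp3":  (128_000, 5_000_000)},    # already buffered past horizon
    horizon=30)
# mpeg needs 30 MB for 30 s but holds 2 MB -> prefetch 28 MB; mp3 -> 0
```

Aligning drain times this way merges what would otherwise be several small disk wake-ups into one burst followed by one long idle phase.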

Design Guidelines (2/3)

Design Guidelines (3/3) Maintaining responsiveness: responsiveness may decrease because of increased disk congestion due to burstiness and the latency penalty for disk power-up. Solution: preactivate the disk. Monitor application progress and file system cache state (data consumption and production rates) and initiate a prefetch cycle before applications run out of data. Interactive responsiveness: prioritized disk queues, required to quickly service unpredicted demand misses during periods of high disk congestion.
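The preactivation rule above can be sketched: from each application's buffer level and consumption rate, compute when the first buffer will drain, then start spinning the disk up early enough to hide the spin-up latency. Names and the latency value are assumptions for illustration:

```python
def spin_up_deadline(apps, spin_up_latency):
    """apps: {name: (consumption_rate_bytes_per_s, buffered_bytes)}.
    Return seconds from now at which disk spin-up must begin so the
    disk is ready before the first application runs out of data."""
    # Time until each app drains its buffer at its current rate
    drain_times = [buffered / rate
                   for rate, buffered in apps.values() if rate > 0]
    earliest = min(drain_times)
    # Preactivate early enough to hide the power-up latency penalty
    return max(0.0, earliest - spin_up_latency)

d = spin_up_deadline({"mpeg": (1_000_000, 8_000_000)},
                     spin_up_latency=1.6)
# 8 MB buffer drains in 8 s; begin spin-up at t = 6.4 s
```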

Prototype (1/5) An epoch-based algorithm built into the basic memory management mechanisms of the Linux kernel. Each epoch consists of two phases: a request generation phase and an idle phase.

Prototype (2/5) Request generation phase: flush all dirty buffers; predict future data accesses; compute the amount of memory that can be used for prefetching and for storing new data; free the required amount of memory; prefetch.

Prototype (3/5) Idle phase: estimate the time to the next request and spin the disk down if the predicted idle time is higher than the disk's break-even time. A new epoch is triggered when a new prefetching cycle has to be initiated, a demand miss takes place, one or more dirty buffers expire and must be flushed, or system memory runs low.
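The two-phase epoch can be sketched as a loop body; the `Disk` class and the callback names are hypothetical stand-ins, not the actual kernel code:

```python
class Disk:
    def __init__(self):
        self.spinning = True
    def spin_up(self):
        self.spinning = True
    def spin_down(self):
        self.spinning = False

def run_epoch(disk, flush, prefetch, idle_estimate, break_even):
    """One epoch: a request generation phase, then an idle phase.
    Epoch-triggering events (demand miss, expired dirty buffers,
    low memory) are assumed to be handled by the caller."""
    # Request generation phase: batch all outstanding disk work
    disk.spin_up()
    flush()       # write out all dirty buffers
    prefetch()    # fetch predicted future accesses
    # Idle phase: spin down only if the predicted idle interval
    # exceeds the disk's break-even time
    if idle_estimate > break_even:
        disk.spin_down()
    return disk.spinning

d = Disk()
spinning = run_epoch(d, flush=lambda: None, prefetch=lambda: None,
                     idle_estimate=40.0, break_even=16.0)
# 40 s predicted idle > 16 s break-even -> disk is spun down
```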

Prototype (4/5) The prefetch cache. The prefetch cache should be large enough to contain all predicted data accesses without causing eviction of useful data. The type of the first miss determines the prefetch cache size for the next epoch: compulsory miss, prefetch miss, or eviction miss.
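One plausible reading of this feedback policy can be sketched as follows. The grow/shrink directions and the step size are my assumptions (the slide only names the three miss types): a prefetch miss suggests the cache was too small, an eviction miss suggests it was too large, and a compulsory miss was unpredictable either way.

```python
def next_prefetch_cache_size(current_mb, first_miss, step_mb=16):
    """Adjust the prefetch cache size for the next epoch from the
    type of the first miss seen this epoch (sketch; directions and
    step are hypothetical)."""
    if first_miss == "prefetch":
        # Data was predicted but not fetched in time: grow the cache
        return current_mb + step_mb
    if first_miss == "eviction":
        # Prefetched data displaced useful pages: shrink the cache
        return max(step_mb, current_mb - step_mb)
    # Compulsory miss: data could not have been predicted; keep size
    return current_mb
```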

Prototype (5/5) Update policy: the update daemon flushes all dirty buffers once per minute. Modified open system call: postpone write-behind until the file is closed, for applications without strict reliability constraints. Examples: copying a file or MP3-encoding a CD track.
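The deferred write-behind idea can be sketched with a user-level analogue: buffer writes in memory and issue them as one burst at close time. The class and the in-memory `store` standing in for the disk are illustrative, not the modified kernel interface; as the slide notes, this trades reliability for burstiness and suits only data that can be regenerated.

```python
class DeferredFile:
    """Sketch of close-time write-behind: no disk traffic until
    close(), so the disk sees one burst per file."""
    def __init__(self, path, backing):
        self.path = path
        self.backing = backing   # dict standing in for the disk
        self.pending = []
    def write(self, data):
        self.pending.append(data)        # buffered in memory only
    def close(self):
        # Single burst of "disk" activity at close time
        self.backing[self.path] = b"".join(self.pending)
        self.pending = []

store = {}
f = DeferredFile("track01.mp3", store)
f.write(b"abc")
f.write(b"def")
assert "track01.mp3" not in store   # nothing written back yet
f.close()
# store["track01.mp3"] is now b"abcdef"
```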

Experimental Evaluation (1/7) Dell Inspiron 4100 laptop with 512MB of memory and a Hitachi DK23DA hard disk. Workload scenarios: MPEG playback (two 76MB files); concurrent MPEG playback and MP3 encoding (MPEG player input: two 76MB files; MP3 encoder input: 10 WAV files, 626MB; MP3 encoder output: 10 MP3 files, 42.9MB). Power management policies: Linux, a 10-second fixed threshold; Bursty, a predictive algorithm that monitors application progress and file system cache state.

Experimental Evaluation (2/7) Cumulative distribution of disk idle time intervals during MPEG playback.

Experimental Evaluation (3/7) Distribution of disk idle time intervals during concurrent MPEG playback and MP3 encoding.

Experimental Evaluation (4/7)

Experimental Evaluation (5/7) Execution of the MPEG player on Linux with 492MB of memory.

Experimental Evaluation (6/7) Execution of the MPEG player on Bursty with 64MB and 128MB of memory.

Experimental Evaluation (7/7)

Conclusion The Bursty system works well with predictable applications, and its energy savings scale with memory size: up to 78.5% disk energy savings, with less than a 5% performance penalty across all workloads and memory sizes.