Energy Efficient Prefetching and Caching. Athanasios E. Papathanasiou and Michael L. Scott, University of Rochester. Proceedings of the 2004 USENIX Annual Technical Conference.

Presentation transcript:

Energy Efficient Prefetching and Caching. Athanasios E. Papathanasiou and Michael L. Scott, University of Rochester. Proceedings of the 2004 USENIX Annual Technical Conference. Presenter: Ningfang Mi, July 01, 2005

Outline: Motivation; New energy-aware prefetching algorithm (basic idea, key challenges); Implementation in the Linux kernel; Evaluation results; Conclusion

Motivation Prefetching and caching in modern OSes: a smooth access pattern improves performance by increasing throughput and decreasing latency. What about energy efficiency? A smooth access pattern produces relatively short idle intervals: idle times are too short to save energy, and spin-up time is not free.

New Design Goal Maximize energy efficiency: create a bursty access pattern, maximize idle interval length, maximize utilization when the disk is active, and do not degrade performance. The focus is on hard disks.

Background (1) -- Fetch-on-Demand [timeline figure] Stream: A B C D E F G ... Access: 10 time units per block; Fetch: 1 time unit. With fetch-on-demand, each block is fetched only when referenced, producing 6 idle intervals of 10 time units each.

Background (2) -- Traditional Prefetching (Cao '95) Aim: minimize execution time. Four rules:
1. Optimal Prefetching: prefetch the next referenced block that is not in the cache.
2. Optimal Replacement: discard the block whose next reference is farthest in the future.
3. Do No Harm: never replace block A with block B when A will be referenced before B.
4. First Opportunity: never delay a prefetch-and-replace that could be performed now.
Together the rules answer what to prefetch or discard, and when to prefetch; a sketch of the replacement check appears below.
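
A minimal sketch of how rules 1-3 combine into a single safety check, assuming an offline oracle next_ref[] that gives each block's next reference time (the function and array names are illustrative, not from the paper):

```c
#include <stdbool.h>

/* next_ref[b] holds the time of block b's next reference. */
static bool prefetch_is_safe(int missing_block, int victim_block,
                             const long *next_ref)
{
    /* Rules 1-2 choose missing_block (the next referenced block not
     * in cache) and victim_block (the cached block whose next
     * reference is farthest away). Rule 3 ("do no harm"): replace
     * the victim only if it is referenced after the prefetched
     * block. */
    return next_ref[victim_block] > next_ref[missing_block];
}
```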

Background (2) -- Traditional Prefetching (Cao '95) [timeline figure] Stream: A B C D E F G H I ... Access: 10 time units per block; Fetch: 1 time unit. Prefetching at the first opportunity finishes in 61 time units and produces 5 idle intervals of 9 time units each plus 1 idle interval of 8.

Background (3) -- Energy-Conscious Prefetching Replace "First Opportunity" with two rules: Maximize Disk Utilization (always initiate a prefetch when there are blocks available for replacement) and Respect Idle Time (never interrupt a period of inactivity with a prefetch operation unless prefetching is urgent). A sketch of the combined rule appears below.
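
A minimal sketch of the two rules as a single gating predicate, assuming the disk state and urgency are available as inputs (names are illustrative):

```c
#include <stdbool.h>

static bool may_prefetch_now(bool disk_active,
                             bool replaceable_blocks_available,
                             bool prefetch_urgent)
{
    if (disk_active)
        /* Maximize Disk Utilization: while the disk is spinning,
         * keep fetching as long as blocks can be replaced. */
        return replaceable_blocks_available;

    /* Respect Idle Time: never break a period of inactivity with a
     * prefetch unless it is urgent. */
    return prefetch_urgent;
}
```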

Background (3) -- Energy-Conscious Prefetching [timeline figure] Stream: A B C D E F G ... Access: 10 time units per block; Fetch: 1 time unit. Batching the fetches turns many short idle gaps into long ones: one idle interval of 27 time units (roughly t = 3 to t = 30) followed by another long interval. The arithmetic is checked below.
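
A back-of-the-envelope check of the slide's timeline, assuming a 3-block batch, 1 time unit per fetch, and 10 time units per access (a simplified model, not the paper's simulator):

```c
#include <stdio.h>

int main(void)
{
    const int batch = 3, fetch_cost = 1, access_cost = 10;

    /* The disk fetches a 3-block batch back to back (3 time units),
     * then idles while the application consumes the batch (30 time
     * units), prefetching the next batch just in time. */
    int idle = batch * access_cost - batch * fetch_cost;

    printf("idle interval per batch = %d time units\n", idle); /* 27 */
    return 0;
}
```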

Energy-Aware Prefetching -- Basic Idea Design guideline: fetch as many blocks as possible while the disk is active, and do not prefetch until the last opportunity while the disk is idle. Epoch-based extensions to the Linux memory management system divide time into epochs, each consisting of an active phase and an idle phase.

Key Challenges When to prefetch? What to prefetch? How much to prefetch?

Key Challenges (1) -- When to Prefetch? Each epoch proceeds as follows: 1. predict future accesses; 2. estimate the memory needed for prefetching, free that amount, and prefetch the new data (the active phase); 3. predict the idle period; 4. if possible, put the disk to sleep; 5. wake up on a demand miss, an urgent prefetch, or low memory. A sketch of one epoch follows.
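
A sketch of one epoch following the five steps above; every helper is a hypothetical stand-in for kernel machinery, not a real Linux interface:

```c
#include <stddef.h>

extern void   predict_future_accesses(void);   /* step 1 */
extern size_t estimate_prefetch_memory(void);
extern void   free_pages_for_prefetch(size_t n);
extern void   prefetch_new_data(void);         /* step 2 */
extern long   predict_idle_period(void);       /* step 3 */
extern long   standby_breakeven_time(void);
extern void   disk_standby(void);              /* step 4 */
extern void   wait_for_epoch_end(void);        /* step 5 */

static void run_epoch(void)
{
    /* Active phase: predict, make room, and fetch in one burst. */
    predict_future_accesses();
    free_pages_for_prefetch(estimate_prefetch_memory());
    prefetch_new_data();

    /* Idle phase: spin down only if the predicted idle period
     * justifies the cost of spinning back up. */
    if (predict_idle_period() > standby_breakeven_time())
        disk_standby();

    /* The epoch ends on a demand miss, an urgent prefetch, or
     * memory pressure; then the next active phase begins. */
    wait_for_epoch_end();
}
```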

Key Challenges (2) -- What to Prefetch? Prediction is based on hints. The hint interface combines a file specifier with a pattern specifier plus time information. New applications submit hints to the OS using new system calls; a monitor daemon provides hints automatically on behalf of unmodified applications by tracking file activity (access analysis followed by hint generation). One possible shape of the interface is sketched below.
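
A hypothetical sketch of what such a hint might look like; the struct layout, enum values, and system call name are illustrative, not the paper's actual interface:

```c
enum hint_pattern { HINT_SEQUENTIAL, HINT_LOOP, HINT_RANDOM };

struct prefetch_hint {
    int               fd;              /* file specifier    */
    enum hint_pattern pattern;         /* pattern specifier */
    unsigned long     first_access_ms; /* time information  */
};

/* An application, or the monitor daemon acting on its behalf, would
 * submit hints through a new system call, for example:
 *     syscall(SYS_submit_hint, &hint);
 */
```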

Key Challenges (3) -- How Much to Prefetch? Decide the number of pages to free in the active phase: the reserved memory must be large enough to hold all predicted data accesses, and prefetching must not evict pages that will be accessed sooner than the prefetched data. The first miss during an idle phase is classified as one of: a Compulsory Miss (a miss on a page with no prior information), a Prefetch Miss (a miss on a page that had a prediction/hint), or an Eviction Miss (a miss on a page evicted to make room for prefetching). A classification sketch follows.
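
A direct translation of the taxonomy into code; the two boolean inputs stand in for per-page metadata the kernel would track:

```c
#include <stdbool.h>

enum miss_type { COMPULSORY_MISS, PREFETCH_MISS, EVICTION_MISS };

static enum miss_type classify_first_miss(bool had_hint,
                                          bool evicted_for_prefetch)
{
    if (evicted_for_prefetch)
        return EVICTION_MISS;  /* prefetching pushed it out too early */
    if (had_hint)
        return PREFETCH_MISS;  /* predicted, but not fetched in time  */
    return COMPULSORY_MISS;    /* no prior information about the page */
}
```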

Implementation In the Linux kernel  Hinted files  Prefetch thread  Prefetch cache  Eviction Cache  Handling write activity  Power management policy

Hinted Files Disclosed by the monitor daemon or by applications, or by the kernel itself for long sequential file accesses. Maintained in a doubly linked list sorted by estimated first access time (see the sketch below).
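
A minimal user-space sketch of that sorted list; the node fields and function names are illustrative:

```c
#include <stddef.h>

struct hinted_file {
    struct hinted_file *prev, *next;
    unsigned long       first_access;  /* estimated first access time */
    int                 fd;
};

/* Insert a node, keeping the list in ascending first_access order. */
static void hinted_list_insert(struct hinted_file **head,
                               struct hinted_file *f)
{
    struct hinted_file *cur = *head, *last = NULL;

    while (cur && cur->first_access <= f->first_access) {
        last = cur;
        cur = cur->next;
    }
    f->prev = last;
    f->next = cur;
    if (cur)
        cur->prev = f;
    if (last)
        last->next = f;
    else
        *head = f;
}
```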

Prefetch Thread Coordinates prefetching across applications, since a lack of coordination limits idle interval length; the goal is to issue reads and writes from concurrently running applications within the same small window of time. The thread generates prefetch requests for all running applications and coordinates the I/O activity of three daemons: writes (the update daemon), page-outs (the swap daemon), and prefetches/reads (the prefetch daemon).

Prefetch Cache & Eviction Cache The LRU list is extended with a Prefetch Cache containing pages requested by the prefetch daemon, each stamped with the time it is expected to be accessed; when a page is referenced or its timestamp expires, it moves to the standard LRU list. The Eviction Cache stores eviction history: the metadata of recently evicted pages, together with an eviction number counting how many pages have been evicted. When an eviction miss occurs, the page's eviction number minus the epoch's starting eviction number gives the number of pages that were evicted without causing an eviction miss, which is used to estimate the prefetch cache size for the next epoch (see the sketch below).
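
The eviction-number arithmetic in code form; the parameter names are assumptions:

```c
/* Pages evicted since the epoch began, up to the page that finally
 * missed: these evictions were "safe", and their count feeds the
 * prefetch-cache size estimate for the next epoch. */
static unsigned long safe_evictions(unsigned long page_eviction_no,
                                    unsigned long epoch_start_eviction_no)
{
    return page_eviction_no - epoch_start_eviction_no;
}
```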

Handle Write Activity In the original kernel, the update daemon runs every 5 seconds and flushes dirty buffers older than 30 seconds, which caps idle intervals at roughly 5 seconds. The modified update daemon flushes dirty buffers once per minute. In addition, a flag in the extended open system call indicates that writing back dirty buffers can be delayed until the corresponding file is closed or until the process that opened it exits; the monitor daemon advises the OS whether to use "flush-on-close" or "flush-on-exit" (sketched below).
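
A hypothetical sketch of such open flags; the names and bit values are placeholders, not real Linux flags:

```c
#define O_FLUSH_ON_CLOSE 0x01000000 /* delay writeback until close     */
#define O_FLUSH_ON_EXIT  0x02000000 /* delay writeback until proc exit */

/* Guided by the monitor daemon, an application might open a file as:
 *     int fd = open("out.dat", O_WRONLY | O_CREAT | O_FLUSH_ON_CLOSE,
 *                   0644);
 */
```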

Power Management Policy The policy is driven by a prediction of the next idle interval's length: set the disk to Standby within 1 second of going idle if the predicted length exceeds the Standby breakeven time. To cope with mispredictions (actual idle time shorter than the breakeven time), the system falls back to a dynamic-threshold spin-down policy, ignoring predictions until their accuracy improves and thereby avoiding harmful spin-down operations. A sketch follows.
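
A sketch of the decision with an accuracy-based fallback; the 0.5 threshold and helper names are illustrative:

```c
extern void disk_standby(void);
extern void dynamic_threshold_spindown(void);

static void on_disk_idle(long predicted_idle, long breakeven,
                         double prediction_accuracy)
{
    if (prediction_accuracy < 0.5) {
        /* Too many mispredictions: ignore predictions and fall back
         * to a dynamic-threshold spin-down policy until accuracy
         * recovers. */
        dynamic_threshold_spindown();
        return;
    }
    /* Spin down within ~1 s of going idle when the predicted idle
     * interval exceeds the Standby breakeven time. */
    if (predicted_idle > breakeven)
        disk_standby();
}
```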

Evaluation Used a Hitachi hard disk with three low-power modes. Workloads: MPEG playback (MPEG), MP3 encoding with concurrent MPEG playback (Concurrent), kernel compilation (Make), and a speech recognition system (SPHINX). Metrics: length of idle periods (longer is better), energy savings, and slowdown (performance penalties should be minimal).

Results (1) -- Idle Time Intervals [chart: MPEG, Concurrent, SPHINX, Make] With the standard kernel, 100% of idle intervals are shorter than 1 second, independent of memory size. With the bursty system, larger memory sizes lead to longer idle intervals; in the best case, about 80% of idle time falls in intervals longer than 200 s.

Results (2) -- Energy Savings [chart] With the standard Linux kernel (base case, 64 MB), energy consumption is independent of memory size. With the bursty system, savings depend on memory size and become significant when memory is large: 78.5%, 77.4%, 62.5%, and 66.6% across the four workloads.

Results (3) -- Execution Time [chart] The system successfully avoids delays caused by disk spin-up operations, and an increased cache hit ratio can even speed up execution. Slowdowns stay small for most workloads (under 2.8%, under 1.6%, and under 5%); the worst cases (4.8% and 15%) are due to increased paging and disk congestion.

Conclusion The energy-conscious prefetching algorithm maximizes idle interval length and energy efficiency while minimizing performance penalties. Experimental results show increased idle interval lengths and disk energy savings of 60-80%. The work received the USENIX '04 Best Paper Award (the system is also known as BurstyFS).