Adaptive Cache Replacement Policy
By: Neslisah Torosdagli, Awrad Mohammed Ali, Stacy Gramajo
Spring 2015
Introduction
What is Adaptive Cache Replacement?
- Given more than one cache replacement algorithm, one is chosen for each memory access based on recent memory accesses
Motivations
- Cache performance is critical for overall system performance
- Changing the hardware (e.g., increasing the cache size) can only go so far
- Different replacement policies work best in different situations
- Combining different replacement policies can improve performance
Research Paper Implementation
Author: Yannis Smaragdakis, ACM 2004 [1]
Hardware Modifications
- First structure: two parallel tag arrays
  - Each is the same size as the regular tag array of the adaptive cache
  - Each tracks the cache contents that one replacement policy would keep
- Second structure: miss history buffer
  - Tracks the past miss performance of each replacement policy
Replacement Algorithm
- On every memory reference, the parallel tag arrays and the miss history buffer are updated
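A minimal C sketch of this mechanism, assuming hypothetical helpers access_lru() and access_lfu() that simulate one policy on its parallel (shadow) tag array and return 1 on a miss; this is an illustration of the idea, not the paper's hardware or our SimpleScalar code:

```c
#include <stdint.h>

enum policy { POLICY_A_LRU = 0, POLICY_B_LFU = 1 };

/* Hypothetical per-policy simulators on the shadow tag arrays: each
 * updates its own tag array for this reference and returns 1 on a
 * miss, 0 on a hit. */
extern int access_lru(uint64_t addr);
extern int access_lfu(uint64_t addr);

static uint64_t miss_count[2];  /* miss history, one counter per policy */

/* Called on every memory reference; returns the policy that should
 * drive the real cache's eviction decision. */
enum policy adaptive_access(uint64_t addr)
{
    /* Both shadow tag arrays are updated on every reference. */
    miss_count[POLICY_A_LRU] += access_lru(addr);
    miss_count[POLICY_B_LFU] += access_lfu(addr);

    /* Prefer the policy with fewer accumulated misses. */
    return (miss_count[POLICY_A_LRU] <= miss_count[POLICY_B_LFU])
               ? POLICY_A_LRU : POLICY_B_LFU;
}
```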
Research Paper Extension
Assume we are running a program of 2000 instructions in which:
- the first 1000 instructions perform well with policy A
- the last 1000 instructions perform well with policy B
At instruction 1000, the miss counts are:
- policy A: 250
- policy B: 750
Since policy A's miss count is lower, policy A will keep being selected for instructions 1001, 1002, ..., 1500, even though policy B now performs better: the accumulated success of policy A masks the phase change.
Can we use an algorithm similar to the tournament predictor from branch prediction to solve this unfair selection?
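A sketch of what such a tournament-style chooser could look like, borrowing the 2-bit saturating counter from branch prediction; hit_a and hit_b are the per-reference outcomes from the two shadow tag arrays, and all names are illustrative:

```c
#include <stdint.h>

enum policy { POLICY_A = 0, POLICY_B = 1 };

static int chooser = 1;  /* 0-1 favor policy A, 2-3 favor policy B */

/* Called once per reference with each policy's hit/miss outcome.
 * The counter moves toward whichever policy did better, so a phase
 * change flips the selection within a few references instead of
 * waiting for cumulative miss counts to cross. */
enum policy tournament_select(int hit_a, int hit_b)
{
    if (hit_a && !hit_b && chooser > 0) chooser--;
    if (hit_b && !hit_a && chooser < 3) chooser++;
    return (chooser >= 2) ? POLICY_B : POLICY_A;
}
```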
Our Implementation - Software
- Used SimpleScalar
- Implemented the adaptive cache policy, LFU, and branch-prediction-like selection
Added Features
- LRU tag array
- LFU tag array
- Adaptive tag array
- Local histories: each records the hit/miss history of one policy
- Global history: keeps track of which replacement algorithm was used
- Miss counts
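A rough sketch of the bookkeeping state these features add; the field names are illustrative, not the identifiers from our SimpleScalar patch:

```c
#include <stdint.h>

enum { N_POLICIES = 2 };  /* policy A (LRU) and policy B (LFU) */

struct adaptive_state {
    uint32_t global_history;            /* which policy was used, 1 bit per reference */
    uint32_t local_history[N_POLICIES]; /* hit (1) / miss (0), 1 bit per reference */
    uint64_t miss_count[N_POLICIES];    /* cumulative misses per policy */
};
```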
Research Paper Extension
Global History Buffer (32 bits)
- 0: policy A was used
- 1: policy B was used
Local History Buffer (32 bits, one per policy)
- 0: the policy missed
- 1: the policy hit
[Figure: bit-level layout of the 32-bit global and local history buffers]
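Under the encodings above, each buffer can be kept as a 32-bit shift register; a minimal sketch, with illustrative names:

```c
#include <stdint.h>

static uint32_t global_history;     /* bit = which policy was used (0 = A, 1 = B) */
static uint32_t local_history[2];   /* bit = hit (1) or miss (0), per policy */

/* Called once per memory reference: shift the newest outcome in at
 * bit 0, letting the oldest of the 32 recorded outcomes fall off. */
void record_reference(int policy_used, int hit_a, int hit_b)
{
    global_history   = (global_history   << 1) | (policy_used & 1);
    local_history[0] = (local_history[0] << 1) | (hit_a & 1);
    local_history[1] = (local_history[1] << 1) | (hit_b & 1);
}
```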
Issues
- In the initial implementation, the history buffers were initialized to 0: since 0 refers to policy A, voting unfairly selected policy A.
- Next, the history buffers were filled randomly: this gave inconsistent results, directly proportional to the quality of the randomization function.
- Finally, "history ready" flags were added to the implementation:
  - The voting algorithm uses only the miss counts until the history buffers are completely filled.
  - Once the history buffers are ready, the voting algorithm uses both the history buffers and the miss counts to make a decision.
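A sketch of the resulting two-stage vote, assuming the state from the earlier sketches; counting the set bits of a local history approximates that policy's recent hit rate (__builtin_popcount is a GCC builtin, and all names are illustrative):

```c
#include <stdint.h>

enum policy { POLICY_A = 0, POLICY_B = 1 };

enum policy vote(uint64_t miss_count[2], uint32_t local_history[2],
                 int history_ready)
{
    if (!history_ready) {
        /* Warm-up: fall back on cumulative miss counts alone. */
        return (miss_count[POLICY_A] <= miss_count[POLICY_B])
                   ? POLICY_A : POLICY_B;
    }
    /* Recent hits per policy, from the 32-bit local histories. */
    int recent_a = __builtin_popcount(local_history[POLICY_A]);
    int recent_b = __builtin_popcount(local_history[POLICY_B]);
    if (recent_a != recent_b)
        return (recent_a > recent_b) ? POLICY_A : POLICY_B;
    /* Tie on recent behavior: decide by cumulative miss counts. */
    return (miss_count[POLICY_A] <= miss_count[POLICY_B])
               ? POLICY_A : POLICY_B;
}
```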
Our Implementation - Software
Added three additional arrays:
- Global history ready (one array)
- Local history ready (two arrays, one per policy)
These ready flags prevent the randomization problem from occurring.
Adaptive Replacement UML Diagram
Policies
The adaptive cache mimics either LFU or the user-selected cache mechanism, chosen via SimpleScalar command-line arguments:
LRU: evicts the block that was least recently used
- Strong when accesses are made mainly to the most recent items, such as an application computing the average temperature of the last 2 hours
LFU: evicts the block with the lowest reference frequency, tracked with a per-block counter
- Strong when large regions of blocks are used only once alongside commonly accessed data
- Fails when an item was accessed frequently in the past but is not accessed anymore: LFU does not evict it, so newly added blocks are more likely to be evicted instead
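A minimal sketch contrasting the two victim-selection rules on a single cache set; the metadata fields are illustrative, not SimpleScalar's:

```c
#include <stdint.h>

struct way {
    uint64_t last_use;   /* timestamp of most recent access (for LRU) */
    uint64_t ref_count;  /* number of accesses since fill (for LFU) */
};

/* LRU: evict the way touched longest ago. */
int lru_victim(struct way *set, int assoc)
{
    int v = 0;
    for (int i = 1; i < assoc; i++)
        if (set[i].last_use < set[v].last_use)
            v = i;
    return v;
}

/* LFU: evict the way with the fewest references; a block that was
 * hot long ago keeps a high count and is never chosen, which is
 * exactly the failure mode noted above. */
int lfu_victim(struct way *set, int assoc)
{
    int v = 0;
    for (int i = 1; i < assoc; i++)
        if (set[i].ref_count < set[v].ref_count)
            v = i;
    return v;
}
```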
Configurations for Simulations
First Configuration
- L1 data and instruction caches: 16 KB, 64 B blocks, 4-way associative
- Unified L2 cache: 512 KB, 64 B blocks, 8-way associative
Second Configuration
- L1 data and instruction caches: 16 KB, 64 B blocks, 2-way associative
- Unified L2 cache: 512 KB, 64 B blocks, 4-way associative
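For reference, the first configuration maps onto SimpleScalar's standard cache specifier name:nsets:bsize:assoc:repl roughly as below (16 KB / 64 B / 4-way gives 64 sets; 512 KB / 64 B / 8-way gives 1024 sets). The trailing l selects SimpleScalar's stock LRU; the letter that selects our adaptive policy is part of our patch and is not shown on the slides, so it is omitted here:

```
sim-cache \
  -cache:il1 il1:64:64:4:l \
  -cache:dl1 dl1:64:64:4:l \
  -cache:il2 dl2 \
  -cache:dl2 ul2:1024:64:8:l \
  benchmark
```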
Results: Comparison
MPKI (Misses Per Thousand Instructions) = (# accesses / # instructions) × miss rate × 1000
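A quick worked example with hypothetical numbers (not taken from our results): 2,000,000 cache accesses over 5,000,000 instructions with a 5% miss rate give

\[
\mathrm{MPKI} = \frac{2{,}000{,}000}{5{,}000{,}000} \times 0.05 \times 1000 = 20
\]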
More Results
Success and Difficulties
- The project was implemented in an iterative manner, shaped by the issues we faced
- SimpleScalar is slow, and some benchmarks do not work correctly
- Our results are consistent with those reported in the paper
Conclusion
- The adaptive cache replacement policy delivered good performance with a small percentage of error
- We ran our experiments using different configurations on different benchmarks to show that the adaptive policy performs well across multiple setups
Future Recommendations
- Implement other replacement algorithms, such as Pseudo-LRU (PLRU) and Segmented LRU (SLRU)
- Adaptively decide among more than two replacement policies
References
[1] Y. Smaragdakis. "General Adaptive Replacement Policies." In Proceedings of the 4th International Symposium on Memory Management (ISMM), ACM, 2004.
[2] R. Subramanian, Y. Smaragdakis, and G. H. Loh. "Adaptive Caches: Effective Shaping of Cache Behavior to Workloads." In Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), IEEE Computer Society, 2006.
[3] E. G. Hallnor and S. K. Reinhardt. "A Fully Associative Software-Managed Cache Design." In Proceedings of the 27th International Symposium on Computer Architecture (ISCA), Vancouver, Canada, June 2000.
Questions?