University of Toronto, Department of Electrical and Computer Engineering. Jason Zebchuk. RegionTracker: Optimizing On-Chip Cache Lookups.

Presentation transcript:

RegionTracker: Optimizing On-Chip Cache Lookups
Jason Zebchuk
University of Toronto, Department of Electrical and Computer Engineering

June 9, What is a Cache?
[Figure: two diagrams, one with a CPU and Memory, one with a CPU, a Cache, and Memory. Labels: "Either Big & Slow or Small & Fast"; "Big & Slow".]

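How Does a Cache Work?
Address: Tag | Index | Offset
Tag Array: Tag 1, Tag 2, Tag 3, Tag 4
Data Array: Data 1, Data 2, Data 3, Data 4
[Figure: each stored tag is compared (=) with the tag field of the address, and the comparison results select the matching data block.]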
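The lookup on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the presentation: the block size, the number of sets, and the names `split_address` and `Cache` are assumptions. The four ways per set mirror the four tag/data pairs in the figure.

```python
BLOCK_BYTES = 64   # assumed block size
NUM_SETS = 256     # assumed number of sets
WAYS = 4           # four tag/data pairs per set, as drawn on the slide

OFFSET_BITS = BLOCK_BYTES.bit_length() - 1  # log2(64) = 6
INDEX_BITS = NUM_SETS.bit_length() - 1      # log2(256) = 8


def split_address(addr):
    """Split an address into its (tag, index, offset) fields."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class Cache:
    """Hypothetical set-associative cache with separate tag and data arrays."""

    def __init__(self):
        self.tags = [[None] * WAYS for _ in range(NUM_SETS)]
        self.data = [[None] * WAYS for _ in range(NUM_SETS)]

    def fill(self, addr, block, way):
        """Place a block of BLOCK_BYTES bytes into the given way."""
        tag, index, _ = split_address(addr)
        self.tags[index][way] = tag
        self.data[index][way] = block

    def lookup(self, addr):
        """Return the addressed byte on a hit, or None on a miss."""
        tag, index, offset = split_address(addr)
        for way in range(WAYS):
            if self.tags[index][way] == tag:        # the "=" comparators
                return self.data[index][way][offset]  # the "Select" step
        return None
```

For example, after `cache.fill(0x12345, bytes(range(64)), way=2)`, a call to `cache.lookup(0x12345)` hits, while an address with the same index but a different tag misses.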

Towards Large On-Chip Caches
[Figure: CPU and memory diagrams; labels: CPU, Memory, Cache.]

Towards Large On-Chip Caches
[Figure: a Cache shared by two CPUs.]
Larger, slower caches -> How can we make caches fast again?

Tag Lookup Plays Critical Role
Larger caches are slower and use more power.
Serial tag-data lookup to reduce power:
1. Search for the tag in the tag array.
2. Then read the desired block from the data array.
[Figure: CPU, Tag Array, Data Array.]
-> Lots of fine-grain information
-> Can we improve this?
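A rough way to see why the serial order saves power is to count data-array reads per lookup. The sketch below compares it with a parallel lookup that reads every way while the tags are being compared. The parallel baseline, the 4-way associativity, the function names, and the use of read counts as a stand-in for energy are all assumptions made for illustration; the slide names only the serial scheme.

```python
WAYS = 4  # assumed associativity


def parallel_lookup(set_tags, set_data, tag):
    """Read all data ways up front, then keep the one whose tag matches.

    Returns (block or None, number of data-array reads).
    """
    candidates = [set_data[way] for way in range(WAYS)]  # one read per way
    data_reads = WAYS
    for way in range(WAYS):
        if set_tags[way] == tag:
            return candidates[way], data_reads
    return None, data_reads


def serial_lookup(set_tags, set_data, tag):
    """Step 1: search the tag array. Step 2: read only the matching block.

    Returns (block or None, number of data-array reads).
    """
    for way in range(WAYS):
        if set_tags[way] == tag:
            return set_data[way], 1  # a single data-array read on a hit
    return None, 0                   # no data-array read on a miss
```

Under these assumptions the serial lookup touches the data array at most once per access, but the data read cannot start until the tag search finishes, which is why the tag lookup sits on the critical path.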

Key Insight
Only a few regions are in the cache at a time.
Can we use something small and fast to track these regions?
[Figure: Memory, Cache.]

RegionTracker Design
- Cached Block Vector (CBV)
- Cached Region Hash (CRH) (not shown)
- Track which regions are currently cached
- Allocate a new CBV entry on access to a new region
- Combine coarse-grain and fine-grain information
[Figure: a CBV entry; labels: block 0, block n-1, region, location, 4, 28.]
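One way to picture a structure of this kind is a small table keyed by region number, where each entry records the location of every block of that region that is known to be cached. The sketch below is a loose illustration under stated assumptions and not the design from the presentation: the region size, the table capacity, the oldest-first replacement, allocating on fill, and the name `RegionTrackerSketch` are all hypothetical, and the Cached Region Hash is left out.

```python
from collections import OrderedDict

BLOCK_BYTES = 64        # assumed block size
BLOCKS_PER_REGION = 16  # assumed region size (16 blocks = 1 KiB)
MAX_REGIONS = 4         # assumed capacity of the small tracking table


def region_and_block(addr):
    """Return (region number, block index within that region)."""
    block_number = addr // BLOCK_BYTES
    return block_number // BLOCKS_PER_REGION, block_number % BLOCKS_PER_REGION


class RegionTrackerSketch:
    """Hypothetical table: region -> per-block cache way (None if unknown)."""

    def __init__(self):
        self.entries = OrderedDict()

    def record_fill(self, addr, way):
        """Note that the block at addr was placed in the given way."""
        region, block = region_and_block(addr)
        if region not in self.entries:
            if len(self.entries) >= MAX_REGIONS:
                self.entries.popitem(last=False)  # drop the oldest region
            self.entries[region] = [None] * BLOCKS_PER_REGION
        self.entries[region][block] = way

    def record_evict(self, addr):
        """Note that the block at addr left the cache."""
        region, block = region_and_block(addr)
        if region in self.entries:
            self.entries[region][block] = None

    def lookup(self, addr):
        """Return the way holding the block, or None to fall back to the tag array."""
        region, block = region_and_block(addr)
        entry = self.entries.get(region)
        return None if entry is None else entry[block]
```

In this sketch a `None` result never means "definitely not cached"; it only means the table cannot answer, so the ordinary tag array is still consulted, in the same spirit as the later remark that the tag array is used when RegionTracker misses.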

Results and Insights
- Takes advantage of common program behavior
- Programs access a few large, contiguous areas of memory at a time
- Simple, small structures
- Compact information encoding and fast access
- Still uses the tag array when RegionTracker misses
- Reduces cache latency
- Can reduce cache lookup power
- Software does NOT change