March 2005, R. Smith - University of St Thomas - Minnesota
ENGR 330: Today's Class: Caches, direct mapped cache, set associative cache


ENGR 330: Today's Class
Caches
Direct mapped cache
Set associative cache
Magic: fully associative cache
Four Questions / 3 C's

Caches
Make computers faster via bits of extra RAM
– CPU "sees" RAM through the MAR and MDR
– Cache sits "behind" the MAR/MDR, providing the data
"Local" data is saved in the faster cache storage
– Fast to retrieve
– Handles most cases
"Other" data lives in the regular RAM
– Slower to retrieve
– Once fetched, it stays in the cache in case it's used again soon

Direct Mapped Cache
The basis of today's designs
– A collection of high speed RAM locations
– Broken into individually addressed "cache entries"
– Part of the RAM address chooses the cache entry ("direct mapping")
A cache entry
– "Index" is its address in the cache
– Valid bit: true if the entry contains valid RAM data
– "Tag" holds the high address bits not used by the index or offset
– Data area: where the stored data resides; it stores multiple words (spatial locality)
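The index/tag/offset split described above can be sketched in a few lines of code. This is a minimal illustration, not part of the original slides; the field widths assume 16-byte blocks, 64 entries, and 32-bit addresses (the numbers used in the example that follows):

```python
# Direct-mapped address decomposition (illustrative sketch).
BLOCK_BITS = 4    # log2(16-byte block) -> byte offset within the block
INDEX_BITS = 6    # log2(64 entries)    -> which cache entry to check
TAG_BITS = 32 - INDEX_BITS - BLOCK_BITS  # remaining high bits (22 here)

def split_address(addr):
    """Return (tag, index, offset) for a 32-bit address."""
    offset = addr & ((1 << BLOCK_BITS) - 1)
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BLOCK_BITS + INDEX_BITS)
    return tag, index, offset

# A hit means: the entry at `index` is valid AND its stored tag equals `tag`.
```

Since every address maps to exactly one entry, the lookup needs only one comparison, which is what makes direct mapping cheap.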

Example
32 bit RAM addresses
64 cache entries, each containing 16 bytes
– How do we resolve cache addresses?
– How big is the tag field?
– How much RAM does the cache need, in bits, per entry?
– How much for the whole cache?
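One way to work this example (a sketch added here, not an answer key from the original deck): 16-byte blocks use 4 offset bits, 64 entries use 6 index bits, so the tag gets the remaining 22 bits; each entry then holds a valid bit, the tag, and 128 data bits.

```python
import math

addr_bits = 32
entries = 64
block_bytes = 16

offset_bits = int(math.log2(block_bytes))        # 4 bits of byte offset
index_bits = int(math.log2(entries))             # 6 bits of index
tag_bits = addr_bits - index_bits - offset_bits  # 22 bits of tag

bits_per_entry = 1 + tag_bits + block_bytes * 8  # valid + tag + data = 151
total_bits = entries * bits_per_entry            # 64 * 151 = 9664 bits
```

Note the overhead: of the 9664 bits, only 8192 hold actual data; the rest is bookkeeping.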

CPU and Cache Handling
What happens on a cache hit?
What happens on a cache miss?
– A stall, like a pipeline stall, but simpler
– We stall the whole CPU: inefficient, but the simplest workable approach
How do we replace a word in the cache?
– Pick one entry to replace
– Option: pick at random
  Easy to implement
  Not always optimal
– Option: LRU (least recently used)
  Close to optimal
  Hard to implement, so it's usually just approximated
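The LRU option above can be modeled in software. Below is a toy model of one cache set with a made-up tag-only interface (real hardware only approximates this bookkeeping, since tracking exact recency across many ways is expensive):

```python
from collections import OrderedDict

class LRUSet:
    """Toy LRU replacement for a single cache set (illustrative only)."""
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> data, ordered oldest-first

    def access(self, tag, data=None):
        """Return True on hit, False on miss (filling the block on a miss)."""
        if tag in self.blocks:             # hit: mark as most recently used
            self.blocks.move_to_end(tag)
            return True
        if len(self.blocks) >= self.ways:  # set full: evict least recently used
            self.blocks.popitem(last=False)
        self.blocks[tag] = data            # fill the missing block
        return False
```

For example, in a 2-way set, accessing tags 1, 2, 1, 3 in order evicts tag 2, because tag 1 was touched more recently.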

What happens when we write data?
Option: write through
– Do the RAM write in the "background" after the data hits the cache
– Often needs a write buffer to hold the data being written
– The usual choice in caches
Option: write back
– Save the updated data only in the cache
– Write the data back to RAM only when replacing that cache entry
– Makes it much slower to replace a modified cache entry
  We have to wait for the write to finish
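The traffic difference between the two policies can be made concrete with a toy count (a hypothetical model added here for illustration): repeated stores to the same cached word cost one RAM write each under write-through, but only one total, at eviction, under write-back.

```python
def ram_writes(policy, stores_to_same_word):
    """Count RAM write operations for repeated stores to one cached word.

    write_through: every store also updates RAM (a write buffer hides the
    latency, but the memory traffic still happens).
    write_back: the word is marked dirty and written once, on eviction.
    """
    if policy == "write_through":
        return stores_to_same_word
    if policy == "write_back":
        return 1 if stores_to_same_word > 0 else 0
    raise ValueError(f"unknown policy: {policy}")
```

This is why write-back wins on memory bandwidth, at the price of the slower, dirty-entry replacement noted above.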

Set Associative Caches
That 2-way, 4-way, 8-way stuff
Provides multiple candidate entries per index (a "set")
Problem:
– Calculate size information for a set associative cache
Attributes
– Address size
– Block size
– Number of lines
– N-way

A specific problem
We are building an 8-way set associative cache to handle 32 bit addresses.
– We will use 32 byte blocks.
– We have 256K bytes of high speed RAM we can use for the data space.
– How much extra space do we need for address tags? How large are the address tags in bits?
– How many "valid" bits do we need?
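A worked sketch of this problem (added here, under the parameters stated on the slide): 256K bytes of data in 32-byte blocks gives 8192 blocks; 8 ways per set gives 1024 sets, hence 10 index bits and 5 offset bits, leaving 17 tag bits.

```python
import math

addr_bits = 32
ways = 8
block_bytes = 32
data_bytes = 256 * 1024

blocks = data_bytes // block_bytes               # 8192 blocks total
sets = blocks // ways                            # 1024 sets
offset_bits = int(math.log2(block_bytes))        # 5 bits of byte offset
index_bits = int(math.log2(sets))                # 10 bits select the set
tag_bits = addr_bits - index_bits - offset_bits  # 17 bits per tag

tag_storage_bits = blocks * tag_bits             # 139264 bits of tag space
valid_bits = blocks                              # one valid bit per block
```

Note that the index selects a set, not a single entry, which is why associativity shrinks the index and grows the tag.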

Fully associative cache
"Association list" approach
– Accepts an address
– Returns the data
Not a plain RAM: it stores tags along with data
– Tag field = the full address minus the block offset bits
– Data field = the data block
Parallel tag field checking
– Automatically matches and retrieves the data whose tag matches
– Expensive in terms of logic
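In software terms, a fully associative cache behaves like a dictionary keyed by tag: a block can live anywhere, and lookup asks "does any stored tag match?" A toy stand-in (the function names and 16-byte block size are assumptions for this sketch; real hardware compares every tag simultaneously with content-addressable logic, which is the expensive part):

```python
BLOCK_BYTES = 16   # assumed block size for this sketch

cache = {}  # tag -> data block; no index field, any block can go anywhere

def fa_lookup(addr):
    tag = addr // BLOCK_BYTES  # tag = full address minus the offset bits
    return cache.get(tag)      # software stand-in for the parallel compare

def fa_fill(addr, block):
    cache[addr // BLOCK_BYTES] = block
```

Because there is no index, there are no conflict misses, which is the "magic" promised in the title and the reason the Three C's slide below exempts fully associative caches from collisions.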

Four Questions
A general framework for memory hierarchies
1. Where can a block be placed?
– Different schemes have different restrictions
– Some have no restrictions (fully associative)
2. How is a block found?
– Fully associative: logic does all the work in one cycle
– Direct addressing does much of the work
3. How do we choose a block to replace?
– Option: randomly
– Option: LRU
4. What happens during a write?
– Write-back
– Write-through

Types of Misses (Three C's)
Compulsory misses (cold start misses)
– Occur when a block is first accessed by the program
– Impossible to eliminate entirely
– The right block size can reduce their number
Capacity misses
– The cache can't contain all the blocks the program needs
– i.e., the program keeps pulling blocks back in after they've been replaced by other referenced blocks
– Suggests the cache isn't big enough
Conflict misses (collision misses)
– Occur when multiple blocks compete for the same set or location
– Happen in set associative and direct mapped caches
– Don't happen in a fully associative cache

That's it.
Questions?

Creative Commons License: This work is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit the Creative Commons website or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.