Additional Slides by Professor Mary Jane Irwin, Pennsylvania State University, Group 3


Consider again the simple direct mapped cache
[Figure: a 64-byte main memory of 16 one-word blocks, Block 0 through Block 15, at addresses 0000xx through 1111xx (the xx bits select the byte in the word), and a direct mapped cache of 4 blocks, each with Valid, Tag, and Data fields. Each memory block maps to cache block (block #) modulo 4, so memory blocks 0, 4, 8, and 12 all map to cache block 0, and so on.]

Another Reference String Mapping
Consider the main memory word reference string 0 4 0 4 0 4 0 4. Start with an empty cache, all blocks initially marked as not valid. Words 0 and 4 both map to cache block 0 (0 mod 4 = 4 mod 4 = 0), so every access misses and evicts the block brought in by the previous access: Mem(0) and Mem(4) alternately replace one another in cache block 0, giving 8 requests and 8 misses. The simulation sketch below confirms the count.
This ping pong effect is due to conflict misses: two memory blocks that map into the same cache block.
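This behavior can be checked with a minimal C sketch (hypothetical code, not from the slides) that simulates the 4-block direct mapped cache of one-word blocks on this reference string:

    #include <stdio.h>

    #define NUM_BLOCKS 4   /* direct mapped: 4 one-word blocks */

    int main(void) {
        int valid[NUM_BLOCKS] = {0};
        int tag[NUM_BLOCKS];                    /* tag = block address / NUM_BLOCKS */
        int refs[] = {0, 4, 0, 4, 0, 4, 0, 4};  /* word address == block address for one-word blocks */
        int misses = 0;

        for (int i = 0; i < 8; i++) {
            int index = refs[i] % NUM_BLOCKS;   /* which cache block the word maps to */
            int t = refs[i] / NUM_BLOCKS;       /* remaining high order bits are the tag */
            if (valid[index] && tag[index] == t) {
                printf("ref %d: hit\n", refs[i]);
            } else {
                printf("ref %d: miss\n", refs[i]);
                valid[index] = 1;
                tag[index] = t;                 /* evict whatever was there */
                misses++;
            }
        }
        printf("%d misses out of 8 requests\n", misses);  /* prints 8: the ping pong effect */
        return 0;
    }

Words 0 and 4 carry different tags (0 and 1) for the same index, so each reference throws the other out.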

Reducing Cache Miss Rates, Approach #1: Allow more flexible block placement
- In a direct mapped cache a memory block maps to exactly one cache block
- At the other extreme, a memory block could be allowed to map to any cache block: a fully associative cache
- A compromise is to divide the cache into sets, each of which consists of n "ways" (n-way set associative). A memory block maps to a unique set (specified by the index field) and can be placed in any way of that set (so there are n choices), as the sketch below illustrates:
  Set # = (block address) modulo (# sets in the cache)
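A minimal illustration of the formula (the helper names are hypothetical, and a power-of-two number of sets is assumed):

    #include <stdio.h>

    /* Set # = (block address) modulo (# sets in the cache) */
    unsigned set_index(unsigned block_address, unsigned num_sets) {
        return block_address % num_sets;
    }

    /* The remaining high order bits form the tag stored in the chosen way */
    unsigned tag_bits(unsigned block_address, unsigned num_sets) {
        return block_address / num_sets;
    }

    int main(void) {
        /* 4-block cache organized as 2-way set associative => 2 sets */
        unsigned num_sets = 2;
        for (unsigned block = 0; block < 8; block++)
            printf("block %u -> set %u, tag %u\n",
                   block, set_index(block, num_sets), tag_bits(block, num_sets));
        return 0;
    }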

Set Associative Cache Example
[Figure: a 16-word main memory with one-word blocks (addresses 0000xx through 1111xx; the two low order address bits define the byte in the word) and a 4-block, 2-way set associative cache: 2 sets, each with Way 0 and Way 1, and Valid, Tag, and Data fields per way.]
Q1: How do we find where to look in the cache for a memory block?
A1: Use the next low order memory address bit (above the byte offset) to determine the cache set: block # modulo the number of sets in the cache, here block # modulo 2.
Q2: Is it there?
A2: Compare the high order 3 memory address bits to the tags of both ways in the set to tell whether the memory block is in the cache.
Consider the main memory word reference string 0 4 0 4 0 4 0 4 again. Is there a ping pong effect now? The sketch below answers this.
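Here is a minimal C sketch (hypothetical code, analogous to the direct mapped simulation above) of the 2-set, 2-way cache on the same reference string:

    #include <stdio.h>

    #define NUM_SETS 2
    #define NUM_WAYS 2

    int main(void) {
        int valid[NUM_SETS][NUM_WAYS] = {{0}};
        int tag[NUM_SETS][NUM_WAYS];
        int lru[NUM_SETS] = {0};        /* which way in each set is least recently used */
        int refs[] = {0, 4, 0, 4, 0, 4, 0, 4};
        int misses = 0;

        for (int i = 0; i < 8; i++) {
            int set = refs[i] % NUM_SETS;   /* block # modulo the number of sets */
            int t = refs[i] / NUM_SETS;     /* high order bits are the tag */
            int hit = 0;
            for (int w = 0; w < NUM_WAYS; w++) {
                if (valid[set][w] && tag[set][w] == t) {
                    hit = 1;
                    lru[set] = 1 - w;       /* the other way is now LRU */
                }
            }
            if (!hit) {                     /* miss: fill the LRU way */
                int w = lru[set];
                valid[set][w] = 1;
                tag[set][w] = t;
                lru[set] = 1 - w;
                misses++;
            }
            printf("ref %d: %s\n", refs[i], hit ? "hit" : "miss");
        }
        printf("%d misses out of 8 requests\n", misses);  /* prints 2 */
        return 0;
    }

Only the first two (compulsory) misses occur: Mem(0) and Mem(4) now live side by side in the two ways of set 0, so the ping pong effect is gone.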

Four-Way Set Associative Cache
2^8 = 256 sets, each with four ways; block size = 1 word.
[Figure: the 32-bit address splits into a 22-bit tag, an 8-bit index that selects the set, and a byte offset. Each way (Way 0 through Way 3) has its own Valid, Tag, and Data arrays; four comparators check the address tag against all four ways in parallel, and a 4x1 select multiplexor steers the 32-bit data from the hitting way to the output along with the Hit signal.]

Range of Set Associative Caches
For a fixed size cache, each increase by a factor of two in associativity will:
- double the number of blocks per set (i.e., the number of ways), AND
- halve the number of sets, which decreases the size of the index by 1 bit and increases the size of the tag by 1 bit
[Address fields, high to low: Tag (used for the tag compare) | Index (selects the set) | Block offset (selects the word in the block) | Byte offset]
Decreasing associativity ends at direct mapped (only one way): smaller tags and only a single comparator. Increasing associativity ends at fully associative (only one set): the tag is all the bits except the block and byte offsets. A worked illustration of the bit shifting follows below.
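As a worked illustration (a hypothetical calculation, assuming a 16 KB cache of one-word (4-byte) blocks and 32-bit byte addresses):

    #include <stdio.h>

    static int log2i(unsigned x) { int n = 0; while (x > 1) { x >>= 1; n++; } return n; }

    int main(void) {
        unsigned cache_bytes = 16 * 1024, block_bytes = 4;  /* assumed geometry */
        for (unsigned ways = 1; ways <= 8; ways *= 2) {
            unsigned num_blocks = cache_bytes / block_bytes;
            unsigned num_sets = num_blocks / ways;
            int byte_offset = 2;                        /* 4-byte words */
            int block_offset = log2i(block_bytes / 4);  /* words per block (0 bits here) */
            int index = log2i(num_sets);
            int tag = 32 - index - block_offset - byte_offset;
            printf("%u-way: %4u sets, index = %2d bits, tag = %2d bits\n",
                   ways, num_sets, index, tag);
        }
        return 0;
    }

Each doubling of associativity shifts exactly one bit from the index to the tag, as stated above.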

Costs of Set Associative Caches
- When a miss occurs, which way's block do we pick for replacement?
  - Least Recently Used (LRU): the block replaced is the one that has been unused for the longest time
    - Must have hardware to keep track of when each way's block was used relative to the other blocks in the set
    - For 2-way set associative, this takes one bit per set: set the bit when a block is referenced (and reset the other way's bit); a minimal sketch of the update rule follows below
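In C-like terms, a minimal sketch of the one-bit-per-set policy (the convention that the bit stores the number of the LRU way is an assumption of this sketch):

    /* One LRU bit per set for a 2-way set associative cache:
     * lru_bit[set] holds the number (0 or 1) of the least recently used way. */
    void touch(unsigned char *lru_bit, unsigned set, unsigned way_referenced) {
        lru_bit[set] = 1 - way_referenced;   /* the other way becomes LRU */
    }

    unsigned victim(const unsigned char *lru_bit, unsigned set) {
        return lru_bit[set];                 /* on a miss, replace the LRU way */
    }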

Costs of Set Associative Caches (continued)
- N-way set associative cache costs:
  - N comparators (delay and area)
The next two figures, a direct mapped cache and a four-way set associative cache, show where these costs come from.

MIPS Direct Mapped Cache Example
One word blocks, cache size = 1K words (or 4KB).
[Figure: the 32-bit address splits into a 20-bit tag, a 10-bit index, and a byte offset. The index selects one of 1024 entries (Valid, Tag, Data); a single comparator checks the stored 20-bit tag against the address tag to produce Hit, and the 32-bit data word is read out directly.]

Four-Way Set Associative Cache
2^8 = 256 sets, each with four ways; block size = 1 word. (The same four-way figure shown earlier, repeated for comparison: four tag comparators and a 4x1 data multiplexor versus the single comparator above.)

Costs of Set Associative Caches (continued)
- N-way set associative cache costs:
  - N comparators (delay and area)
  - A MUX to select among the blocks of a set before the data is available
    - Hence an N-way set associative cache will also be slower than a direct mapped cache because of this extra multiplexer delay
  - Data is available only after the way selection and Hit/Miss decision; in a direct mapped cache, the cache block is available before the Hit/Miss decision
    - So in a set associative cache it is not possible to just assume a hit, continue, and recover later if it was a miss (in a direct mapped cache, this is possible); see the lookup sketch below
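These costs show up directly in a generic lookup sketch (hypothetical C, with an assumed 4-way set and a simple struct layout):

    #include <stdbool.h>

    #define NUM_WAYS 4

    struct way { bool valid; unsigned tag; unsigned data; };

    /* One set of an N-way cache: N tag comparisons happen in parallel in
     * hardware (a loop here), then an N-to-1 mux selects the hitting way. */
    bool lookup(const struct way set[NUM_WAYS], unsigned addr_tag, unsigned *data_out) {
        int hit_way = -1;
        for (int w = 0; w < NUM_WAYS; w++)         /* N comparators (delay and area) */
            if (set[w].valid && set[w].tag == addr_tag)
                hit_way = w;
        if (hit_way < 0)
            return false;                          /* miss: no data to forward */
        *data_out = set[hit_way].data;             /* N-to-1 mux after the compare */
        return true;                               /* data valid only after the Hit decision */
    }

In a direct mapped cache the data array can be read with the index alone, in parallel with the single tag compare, which is why its data is available before the Hit/Miss decision.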

Benefits of Set Associative Caches
- The choice of direct mapped or set associative depends on the cost of a miss versus the cost of implementation
- As cache sizes grow, the relative improvement from associativity increases only slightly: since the overall miss rate of a larger cache is lower, the opportunity for improving the miss rate decreases
- For a given cache size, the largest gains come from going from direct mapped to 2-way (more than a 20% reduction in miss rate)