Cache Organization. Computer Organization II, ©2005-2013 CS:APP & McQuain. Cache Memory and Performance.


Cache Organization, slide 1 (Computer Organization II, © CS:APP & McQuain): Cache Memory and Performance
Many of the following slides are taken with permission from Complete Powerpoint Lecture Notes for Computer Systems: A Programmer's Perspective (CS:APP) by Randal E. Bryant and David R. O'Hallaron. The book is used explicitly in CS 2505 and CS 3214 and as a reference in CS 2506.

Slide 2: Cache Memories
Cache memories are small, fast SRAM-based memories managed automatically in hardware.
- They hold frequently accessed blocks of main memory.
The CPU looks first for data in the caches (e.g., L1, L2, and L3), then in main memory.
[Figure: typical system structure. Labels: CPU chip (register file, ALU, cache memories, bus interface), system bus, I/O bridge, memory bus, main memory.]

Slide 3: General Cache Organization (S, E, B)
- S = 2^s sets
- E = 2^e lines (blocks) per set
- B = 2^b bytes per cache block (the data)
- Each line holds a valid bit (v), a tag, and data bytes 0, 1, 2, ..., B-1.
- Cache size: C = S x E x B data bytes

Slide 4: Cache Defines View of DRAM
The "geometry" of the cache is defined by:
- S = 2^s, the number of sets in the cache
- E = 2^e, the number of lines (blocks) in a set
- B = 2^b, the number of bytes in a line (block)
These values define a related way to think about the organization of DRAM:
- DRAM consists of a sequence of blocks of B bytes. The bytes in a block (line) can be indexed by using b bits.
- DRAM consists of a sequence of groups of S blocks (lines). The blocks (lines) in a group can be indexed by using s bits.
- Each group contains S x B bytes, which can be indexed by using s + b bits.

Slide 5: Cache (8, 2, 4) and 256-Byte DRAM
- S = 2^3 sets
- E = 2^1 blocks (lines) per set
- B = 2^2 bytes per cache block (the data)
- Each line holds a valid bit (v) and a tag.
- Cache size: C = S x E x B = 64 data bytes
[Figure: the cache drawn beside the 256-byte DRAM.]
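The size formula can be checked with a few lines of Python. This is only a sketch; the function name `cache_data_bytes` is made up for illustration and does not come from the slides.

```python
def cache_data_bytes(s: int, e: int, b: int) -> int:
    """Data capacity C = S * E * B, where S = 2**s, E = 2**e, B = 2**b."""
    S, E, B = 1 << s, 1 << e, 1 << b
    return S * E * B

# The (8, 2, 4) cache from the slide: s = 3, e = 1, b = 2.
print(cache_data_bytes(3, 1, 2))  # 64
```

Note that C counts data bytes only; the valid bits and tags are extra storage on top of it.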

Slide 6: Example of Cache View of DRAM
Assume a cache has the following geometry:
- S = 2^3 = 8, the number of sets in the cache
- E = 2^1 = 2, the number of lines (blocks) in a set
- B = 2^2 = 4, the number of bytes in a line (block)
Suppose that DRAM consists of 256 bytes, so we have 8-bit addresses. Then DRAM consists of:
- 64 blocks, each holding 4 bytes
- 8 groups, each holding 8 blocks
[Figure: DRAM addresses, with group 0 divided into block 0, block 1, ..., block 7.]
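The block and group counts follow directly from the DRAM size and the cache geometry. A minimal sketch of that arithmetic, assuming the DRAM size is a power of two (the helper name `dram_view` is hypothetical):

```python
def dram_view(dram_bytes: int, S: int, B: int) -> dict:
    """Describe DRAM as blocks of B bytes and groups of S blocks."""
    return {
        "address_bits": (dram_bytes - 1).bit_length(),  # bits needed for the highest address
        "blocks": dram_bytes // B,                      # B bytes per block
        "groups": dram_bytes // (S * B),                # S blocks per group
        "bytes_per_group": S * B,
    }

view = dram_view(256, S=8, B=4)
print(view)
```

For the 256-byte DRAM this gives 8 address bits, 64 blocks, and 8 groups of 32 bytes each.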

Slide 7: Example of Cache View of DRAM
Pick an address and divide it into group, block, and byte fields.
[Figure: DRAM addresses and bytes, with blocks 000, 100, 101, and 110 marked.]
Bits a_1 a_0 give the byte number, which is the offset of the byte in the block.

Slide 8: Example of Cache View of DRAM
Pick an address and divide it into group, block, and byte fields.
[Figure: DRAM addresses and bytes, with blocks 000, 100, 101, and 110 marked.]
Bits a_4 a_3 a_2 give the block number.
a_4 a_3 a_2 00 equals the offset of the block in the group, because a_4 a_3 a_2 00 = a_4 a_3 a_2 x 2^2 = a_4 a_3 a_2 x (size of a block).

Slide 9: Example of Cache View of DRAM
Pick an address and divide it into group, block, and byte fields.
[Figure: DRAM addresses and bytes, with blocks 000, 100, 101, and 110 marked.]
Bits a_7 a_6 a_5 give the group number.
a_7 a_6 a_5 00000 equals the offset of the group in the DRAM. Why? Appending five zero bits multiplies the group number by 2^5 = 32 = S x B, the size of a group.

Slide 10: The BIG Picture
[Figure: the four bytes of block 101 in group 011, and group 011 within the DRAM.]
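The three-field view of an 8-bit address can be sketched in Python as follows. Group 011 and block 101 are taken from the Big Picture slide; the byte offset 11 and the function name `split_address` are assumptions made for the example.

```python
def split_address(addr: int):
    """Split an 8-bit address into (group, block, byte) for S = 8, B = 4."""
    byte = addr & 0b11            # a1 a0: offset of the byte in its block
    block = (addr >> 2) & 0b111   # a4 a3 a2: block number within the group
    group = (addr >> 5) & 0b111   # a7 a6 a5: group number within DRAM
    return group, block, byte

addr = 0b011_101_11               # group 011, block 101, byte 11 (byte offset assumed)
group, block, byte = split_address(addr)

block_offset_in_group = block << 2   # block number x 4 bytes per block
group_offset_in_dram = group << 5    # group number x 32 bytes per group

# The three offsets add back up to the original address.
print(group_offset_in_dram + block_offset_in_group + byte == addr)  # True
```

Shifting left by 2 or by 5 is the same as appending two or five zero bits, which is how the offsets on the preceding slides are written.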

Slide 11: Example of Cache View of DRAM
[Figure: DRAM addresses and bytes, with blocks 100, 101, and 110 marked.]
Pick an address. How does this address map into the cache?
- The DRAM block number determines the cache set used to store the block.
- Note this means that two different DRAM blocks from the same DRAM group cannot map into the same cache set.
- So the chosen address, whose block number is 101, maps to set 101 in the cache.
- Each set in our cache can hold 2 blocks. This block could be stored at either location within the corresponding set.

Slide 12: Example of Cache View of DRAM
[Figure: the DRAM block containing the chosen address and the cache set it maps to. The block may occupy either line of that set; each line shows its valid bit and tag.]

Slide 13: Cache View of DRAM
So, to generalize, suppose a cache has:
- S = 2^s sets
- E = 2^e blocks (lines) per set
- B = 2^b bytes per block (line)
And suppose that DRAM uses N-bit addresses. Then for any address:
a_{N-1} ... a_{s+b} | a_{s+b-1} ... a_b | a_{b-1} ... a_0
- Bits a_{b-1}:a_0 give the byte index within the block.
- Bits a_{b+s-1}:a_b give the set index.
- Bits a_{N-1}:a_{s+b} become the tag for the data.
Note that the tag bits are only the same for blocks that are within the same DRAM group.
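The general split into tag, set index, and byte index is a pair of shifts and masks. A minimal sketch (the name `decompose` is illustrative only):

```python
def decompose(addr: int, s: int, b: int):
    """Return (tag, set_index, byte_index) for a cache with 2**s sets and 2**b bytes per block."""
    byte_index = addr & ((1 << b) - 1)        # bits a_{b-1} : a_0
    set_index = (addr >> b) & ((1 << s) - 1)  # bits a_{b+s-1} : a_b
    tag = addr >> (s + b)                     # bits a_{N-1} : a_{s+b}
    return tag, set_index, byte_index

# Two addresses with s = 2 and b = 1, i.e. 4 sets and 2-byte blocks.
print(decompose(7, s=2, b=1))  # (0, 3, 1)
print(decompose(8, s=2, b=1))  # (1, 0, 0)
```

The number of address bits N does not appear in the code because the tag is simply everything above the low s + b bits.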

Slide 14: Cache Read
Address of word: [ t bits (tag) | s bits = K (set index) | b bits = J (block offset) ]
1. Locate the set (set K among sets 0 ... 2^s - 1).
2. Check if any line in the set has a matching tag.
3. Yes + line valid: hit.
4. Locate the data starting at offset J (the data begins at this offset among bytes 0 ... 2^b - 1 of the block).
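The four read steps can be sketched in software. The data layout here (a list of sets, each a list of dictionaries with `valid`, `tag`, and `data` keys) and the name `cache_read` are assumptions for illustration; real hardware performs the tag checks in parallel rather than in a loop.

```python
def cache_read(cache, addr, s, b):
    """Return the byte stored at addr on a hit, or None on a miss."""
    offset = addr & ((1 << b) - 1)
    set_index = (addr >> b) & ((1 << s) - 1)
    tag = addr >> (s + b)

    lines = cache[set_index]                        # 1. locate the set
    for line in lines:                              # 2. look for a matching tag
        if line["valid"] and line["tag"] == tag:    # 3. match + valid: hit
            return line["data"][offset]             # 4. data starts at the offset
    return None                                     # miss

# Hypothetical cache with s = 1, b = 2, E = 1: only set 1 holds a valid block.
cache = [
    [{"valid": False, "tag": 0, "data": bytes(4)}],
    [{"valid": True, "tag": 0b10, "data": bytes([10, 11, 12, 13])}],
]
addr = (0b10 << 3) | (1 << 2) | 3   # tag 10, set 1, offset 3
print(cache_read(cache, addr, s=1, b=2))  # 13
print(cache_read(cache, 0, s=1, b=2))     # None (set 0 holds no valid line)
```

A miss returns `None` here; fetching the block from DRAM and choosing a line to replace are left out of the sketch.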

Slide 15: Example: Direct Mapped Cache (E = 1)
Direct mapped: one line per set. S = 2^s sets.
Assume: cache block size 8 bytes.
Address of int: [ t bits | 0…01 | 100 ]
[Figure: four sets, each with one line holding a valid bit (v), a tag, and bytes 0 to 7.]
Step: find the set.

Slide 16: Example: Direct Mapped Cache (E = 1)
Direct mapped: one line per set.
Assume: cache block size 8 bytes.
Address of int: [ t bits | 0…01 | 100 ]
[Figure: the selected line, with its valid bit (v), tag, and bytes 0 to 7; the block offset is marked.]
Step: valid? + matching tags -> hit.

Slide 17: Example: Direct Mapped Cache (E = 1)
Direct mapped: one line per set.
Assume: cache block size 8 bytes.
Address of int: [ t bits | 0…01 | 100 ]
[Figure: the selected line, with its valid bit (v), tag, and bytes 0 to 7; the int (4 bytes) is at the block offset.]
Step: valid? + matching tags -> hit.
No match: the old line (block) is evicted and replaced by the requested block from DRAM.

Slide 18: Direct-Mapped Cache Simulation
M = 16 byte addresses, B = 2 bytes/block, S = 4 sets, E = 1 block/set
Address bits: t = 1, s = 2, b = 1
Address trace (reads, one byte per read):

Address  Binary  Result  Line loaded (set: v, tag, block)
0        0000    miss    set 0: 1, 0, M[0-1]
1        0001    hit
7        0111    miss    set 3: 1, 0, M[6-7]
8        1000    miss    set 0: 1, 1, M[8-9]
0        0000    miss    set 0: 1, 0, M[0-1]
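The trace above can be replayed with a small simulator. This is a sketch that tracks only valid bits and tags, not the data; the function name `simulate_direct_mapped` is made up for the example.

```python
def simulate_direct_mapped(trace, s, b):
    """Replay byte reads against a direct-mapped cache (E = 1) and report hit/miss."""
    num_sets = 1 << s
    valid = [False] * num_sets
    tags = [0] * num_sets
    results = []
    for addr in trace:
        set_index = (addr >> b) & (num_sets - 1)
        tag = addr >> (s + b)
        if valid[set_index] and tags[set_index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            valid[set_index] = True   # the old line, if any, is evicted
            tags[set_index] = tag     # and replaced by the requested block
    return results

print(simulate_direct_mapped([0, 1, 7, 8, 0], s=2, b=1))
# ['miss', 'hit', 'miss', 'miss', 'miss']
```

The final read of address 0 misses because address 8 maps to the same set and evicted the block holding M[0-1], even though sets 1 and 2 were never used.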

Slide 19: E-way Set Associative Cache (Here: E = 2)
E = 2: two lines per set.
Assume: cache block size 8 bytes.
Address of short int: [ t bits | 0…01 | 100 ]
[Figure: four sets, each with two lines holding a valid bit (v), a tag, and bytes 0 to 7.]
Step: find the set.

Slide 20: E-way Set Associative Cache (Here: E = 2)
E = 2: two lines per set.
Assume: cache block size 8 bytes.
Address of short int: [ t bits | 0…01 | 100 ]
[Figure: the two lines of the selected set, each with its valid bit (v), tag, and bytes 0 to 7; the block offset is marked.]
Step: compare both lines. valid? + matching tags -> hit.

Slide 21: E-way Set Associative Cache (Here: E = 2)
E = 2: two lines per set.
Assume: cache block size 8 bytes.
Address of short int: [ t bits | 0…01 | 100 ]
[Figure: the two lines of the selected set, each with its valid bit (v), tag, and bytes 0 to 7; the short int (2 bytes) is at the block offset.]
Step: compare both lines. valid? + matching tags -> hit.
No match:
- One line in the set is selected for eviction and replacement.
- Replacement policies: random, least recently used (LRU), …

Slide 22: 2-Way Set Associative Cache Simulation
M = 16 byte addresses, B = 2 bytes/block, S = 2 sets, E = 2 blocks/set
Address bits: t = 2, s = 1, b = 1
Address trace (reads, one byte per read):

Address  Binary  Result  Line loaded (set: v, tag, block)
0        0000    miss    set 0: 1, 00, M[0-1]
1        0001    hit
7        0111    miss    set 1: 1, 01, M[6-7]
8        1000    miss    set 0: 1, 10, M[8-9]
0        0000    hit
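The same trace can be replayed against a set-associative cache. The slide does not say which replacement policy its simulation uses, so this sketch assumes LRU; with this short trace no set ever fills up, so the policy does not affect the outcome. The function name is hypothetical.

```python
def simulate_set_associative(trace, s, b, E):
    """Replay byte reads against an E-way set-associative cache with LRU replacement (assumed)."""
    num_sets = 1 << s
    # Each set is a list of tags, ordered from least to most recently used.
    sets = [[] for _ in range(num_sets)]
    results = []
    for addr in trace:
        set_index = (addr >> b) & (num_sets - 1)
        tag = addr >> (s + b)
        lines = sets[set_index]
        if tag in lines:
            results.append("hit")
            lines.remove(tag)      # re-appended below as most recently used
        else:
            results.append("miss")
            if len(lines) == E:
                lines.pop(0)       # set is full: evict the least recently used tag
        lines.append(tag)
    return results

print(simulate_set_associative([0, 1, 7, 8, 0], s=1, b=1, E=2))
# ['miss', 'hit', 'miss', 'miss', 'hit']
```

Compared with the direct-mapped run of the same trace, the last read now hits: blocks M[0-1] and M[8-9] both map to set 0, and the two lines of that set can hold them together.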

Slide 23: Cache Organization Types
The "geometry" of the cache is defined by:
- S = 2^s, the number of sets in the cache
- E = 2^e, the number of lines (blocks) in a set
- B = 2^b, the number of bytes in a line (block)
The organization types are:
- Direct-mapped cache: E = 1 (e = 0). There is only one possible location in the cache for each DRAM block.
- K-way associative cache: S > 1 and E = K > 1. There are K possible locations (in the same cache set) for each DRAM block.
- Fully-associative cache: S = 1 (only one set) and E = # of cache blocks. Each DRAM block can be at any location in the cache.
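The three cases can be written as a small classifier. This is an illustrative sketch (the name `organization` is not from the slides). A cache with a single line (S = 1, E = 1) satisfies both the direct-mapped and the fully-associative definitions; the function reports it as direct-mapped.

```python
def organization(S: int, E: int) -> str:
    """Name the cache organization implied by S sets and E lines per set."""
    if E == 1:
        return "direct-mapped"              # one possible location per DRAM block
    if S == 1:
        return "fully associative"          # a block can go anywhere in the cache
    return f"{E}-way set associative"       # E possible locations, all in one set

print(organization(4, 1))   # direct-mapped
print(organization(2, 2))   # 2-way set associative
print(organization(1, 8))   # fully associative
```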

Slide 24: Searching a Set
If we have an associative cache (K-way or fully), how do we determine if a given DRAM block occurs within a set?
- Compare the tag we're trying to match to all of the tags for blocks in the relevant set at the same time!
- Then factor in the valid bits, also in parallel.
- And employ a MUX
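A software analogue of the parallel search might look like the sketch below. Hardware evaluates every comparison simultaneously; the list comprehension only models the resulting vector of match signals. Using those signals to select the matching line's data is an assumption about the MUX's role, and the names `search_set`, `valid`, `tag`, and `data` are hypothetical.

```python
def search_set(lines, tag):
    """Return (hit, data) for one set, modelling parallel tag compare plus a MUX."""
    # One match signal per line: tag equality ANDed with the valid bit.
    match = [line["valid"] and line["tag"] == tag for line in lines]
    hit = any(match)
    # MUX (assumed role): the match signals select which line's data is forwarded.
    data = next((line["data"] for line, m in zip(lines, match) if m), None)
    return hit, data

two_way_set = [
    {"valid": False, "tag": 0b01, "data": b"stale"},  # tag matches but line is invalid
    {"valid": True, "tag": 0b01, "data": b"fresh"},
]
print(search_set(two_way_set, 0b01))  # (True, b'fresh')
print(search_set(two_way_set, 0b10))  # (False, None)
```

The first line shows why the valid bit must be factored in: its tag matches, but its contents are not meaningful, so it must not produce a hit.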