Cache Organization
Topics: Background; Simple examples



Typical Cache Organization
[Figure: the address is split into tag | index | offset fields; the index selects an entry in the tag array and the data array, the stored tag is compared (=?) against the address tag, and the data array supplies the cache line.]

Cache Organization Details (S, E, B)
S = 2^s sets
E = 2^e blocks (lines) per set
B = 2^b bytes per cache block (the data)
Each line holds a valid bit (v), a tag, and data bytes 0 .. B-1.
Cache size: S x E x B data bytes

Example: Direct-Mapped Cache (E = 1)
Direct mapped: one block per set. Assume: cache block size 8 bytes.
Address of int: [t tag bits | set index | offset 100]
Step 1: use the set index bits to find the set among the S = 2^s sets.

Example: Direct-Mapped Cache (E = 1)
Direct mapped: one block per set. Assume: cache block size 8 bytes.
Address of int: [t tag bits | set index | offset 100]
Step 2: check the valid bit and compare the stored tag against the address tag (match: assume yes = hit); the block offset then selects the word within the line.

Example: Direct-Mapped Cache (E = 1)
Direct mapped: one block per set. Assume: cache block size 8 bytes.
Address of int: [t tag bits | set index | offset 100]
Match (hit): the int (4 bytes) is at block offset 100 within the line.
No match: the old line is evicted and replaced.

E-way Set Associative Cache (E = 2)
E = 2: two lines per set. Assume: cache block size 8 bytes.
Address of short int: [t tag bits | set index | offset 100]
Step 1: use the set index bits to find the set.

E-way Set Associative Cache (E = 2)
E = 2: two lines per set. Assume: cache block size 8 bytes.
Address of short int: [t tag bits | set index | offset 100]
Step 2: compare both tags in the set (match: yes = hit); the block offset selects the data.

E-way Set Associative Cache (E = 2)
E = 2: two lines per set. Assume: cache block size 8 bytes.
Address of short int: [t tag bits | set index | offset 100]
Match (hit): the short int (2 bytes) is at block offset 100.
No match: one line in the set is selected for eviction and replacement.
Replacement policies: random, least recently used (LRU), …

Assumed Simple Cache
2 ints per block
2-way set associative
2 blocks total, 1 set (i.e., the same thing as fully associative)
Replacement policy: least recently used (LRU)
[Figure: the cache holds Block 0 and Block 1.]

Array Access: Key Questions
How many array elements are there per block?
Does the data structure fit in the cache?
Do I re-use blocks over time?
In what order am I accessing blocks?

Simple Array
for (i = 0; i < N; i++) {
    … = A[i];
}
Miss rate = #misses / #accesses = ?

Simple Array
for (i = 0; i < N; i++) {
    … = A[i];
}
Miss rate = #misses / #accesses = (N/2) / N = 1/2 = 50%
(With 2 ints per block, every even-indexed access misses and the following access hits in the same block.)

Simple Array with Outer Loop
for (k = 0; k < P; k++) {
    for (i = 0; i < N; i++) {
        … = A[i];
    }
}
Assume A[] fits in the cache:
Miss rate = #misses / #accesses = ?
Lesson: for sequential accesses with re-use, if the data fits in the cache, the first visit suffers all the misses.

Simple Array with Outer Loop
for (k = 0; k < P; k++) {
    for (i = 0; i < N; i++) {
        … = A[i];
    }
}
Assume A[] fits in the cache:
Miss rate = #misses / #accesses = (N/2) / (P*N) = 1/(2P)
Lesson: for sequential accesses with re-use, if the data fits in the cache, the first visit suffers all the misses and every later pass hits.

Simple Array
for (i = 0; i < N; i++) {
    … = A[i];
}
Assume A[] does not fit in the cache:
Miss rate = #misses / #accesses = ?

Simple Array
for (i = 0; i < N; i++) {
    … = A[i];
}
Assume A[] does not fit in the cache:
Miss rate = #misses / #accesses = (N/2) / N = 1/2 = 50%
Lesson: for sequential accesses with no re-use, it doesn't matter whether the data structure fits.

Simple Array with Outer Loop
for (k = 0; k < P; k++) {
    for (i = 0; i < N; i++) {
        … = A[i];
    }
}
Assume A[] does not fit in the cache:
Miss rate = #misses / #accesses = (P*N/2) / (P*N) = 1/2 = 50%
Lesson: for sequential accesses with re-use, if the data structure doesn't fit, each pass misses afresh and the miss rate is the same as with no re-use.

2D Array
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        … = A[i][j];
    }
}
Assume A[] fits in the cache:
Miss rate = #misses / #accesses = (N*N/2) / (N*N) = 1/2 = 50%

2D Array
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        … = A[i][j];
    }
}
Assume A[] does not fit in the cache:
Miss rate = #misses / #accesses = 50%
Lesson: for 2D accesses in row order with no re-use, the miss rate is the same as for sequential access, whether the array fits or not.

2D Array
for (j = 0; j < N; j++) {
    for (i = 0; i < N; i++) {
        … = A[i][j];
    }
}
Assume A[] fits in the cache:
Miss rate = #misses / #accesses = (N*N/2) / (N*N) = 1/2 = 50%
Lesson: for 2D accesses in column order with no re-use, the miss rate is the same as for sequential access, provided the blocks of an entire column fit in the cache.

2D Array
for (j = 0; j < N; j++) {
    for (i = 0; i < N; i++) {
        … = A[i][j];
    }
}
Assume A[] does not fit in the cache:
Miss rate = #misses / #accesses = 100%
Lesson: for 2D accesses in column order, if the entire column doesn't fit, the miss rate is 100%: the block holding an element is evicted before the next column returns to it.
