Cache Misses - The 3 C's (Ashish Sabharwal, 5/27/99)


5/27/99  Ashish Sabharwal  Slide 1: Cache Misses - The 3 C's
- Compulsory misses
  - cold start: the very first reference to a block must miss
  - unavoidable, except that increasing the block size reduces the number of distinct blocks that are requested
- Capacity misses
  - the cache is much smaller than the total addressable memory, so blocks are evicted simply because everything does not fit
- Conflict misses
  - collision within a set
  - the requested block was thrown out when some other block wanted to occupy the same position
  - can be reduced by increasing associativity (each block has more placement options)
- What can I measure directly?
  - Compulsory: yes (check for the first reference to a block)
  - Conflict: yes (remember whether a block was thrown out because of some other block)
  - Capacity: no, but Capacity = Total - Compulsory - Conflict
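A minimal sketch of the counting rule this slide describes, written in C rather than the MIPS assembly the course uses. The table size and the names (seen, evicted, record_miss) are illustrative assumptions, not part of the original assignment; the eviction path of the simulator (not shown) would set evicted[victim_block] when a collision pushes a block out.

```c
#include <stdbool.h>

#define MAX_BLOCKS (1 << 20)           /* assumed bound on distinct block numbers */

static bool seen[MAX_BLOCKS];          /* block referenced at some earlier point? */
static bool evicted[MAX_BLOCKS];       /* block thrown out by a colliding block?  */
static long total_misses, compulsory_misses, conflict_misses;

/* Call this on every miss, with the block number of the missing block. */
void record_miss(unsigned long block)
{
    total_misses++;
    if (!seen[block])                  /* cold start: first reference ever */
        compulsory_misses++;
    else if (evicted[block])           /* it was here before, but a collision pushed it out */
        conflict_misses++;
    seen[block] = true;
    evicted[block] = false;            /* it is about to be (re)loaded */
}

/* Capacity misses are not detected directly; use the slide's identity. */
long capacity_misses(void)
{
    return total_misses - compulsory_misses - conflict_misses;
}
```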

5/27/99  Ashish Sabharwal  Slide 2: The Assignment
- Read the input using system calls
  - Input:
  - read both the operation and the address using system call 5 (read_int)
- Maintain a cache (without data)
  - you will need valid/tag/dirty bits
- Keep counters for the 3 C's
- How to decide whether a miss is compulsory or capacity?
  - compulsory misses are cold-start misses
  - keep track of everything you have already seen
  - How much memory should you allocate for this? Use the stack!
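The assignment itself reads its input with MIPS system call 5 (read_int); the C sketch below only illustrates the bookkeeping structure, with scanf standing in for the system call. The geometry (256 direct-mapped lines, 16-byte blocks) and the operation encoding are assumptions for the example, not the assignment's required configuration.

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_LINES   256        /* example: 256 direct-mapped lines */
#define BLOCK_BYTES 16         /* example: 4-word blocks           */

struct line {                  /* no data array: only the metadata bits */
    bool valid;
    bool dirty;
    unsigned long tag;
};

static struct line cache[NUM_LINES];
static long total_misses;

int main(void)
{
    int op;                    /* assumed encoding: 0 = read, 1 = write */
    unsigned long addr;

    while (scanf("%d %lx", &op, &addr) == 2) {
        unsigned long block = addr / BLOCK_BYTES;
        unsigned long index = block % NUM_LINES;
        unsigned long tag   = block / NUM_LINES;

        if (cache[index].valid && cache[index].tag == tag) {
            if (op == 1) cache[index].dirty = true;   /* write hit */
            continue;                                 /* hit: nothing to count */
        }
        total_misses++;        /* classify into the 3 C's here (see earlier sketch) */
        cache[index].valid = true;
        cache[index].tag   = tag;
        cache[index].dirty = (op == 1);
    }
    printf("misses: %ld\n", total_misses);
    return 0;
}
```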

5/27/99  Ashish Sabharwal  Slide 3: Cache Replacement Policies
- Not needed for a direct mapped cache
  - why? (each block can go in exactly one position, so there is no choice to make)
- Ideally: evict the block Least Likely to be Used
  - the natural choice, but it requires knowing the future
- LRU (Least Recently Used)
  - for each set, maintain the order in which the elements were referenced
  - Why does this work? Given temporal locality, it is a good approximation to the ideal policy
- Random
  - for small associativity (~2), the choice doesn't really matter
  - for large associativity, it may accidentally (and unnecessarily) throw away a block that is being used right now
  - but it might also approximate the ideal policy better!
  - works well in practice...
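A small sketch of LRU victim selection for one set of an A-way associative cache. Keeping a per-way timestamp of the most recent reference is one way to maintain the ordering the slide mentions; the struct and names are illustrative.

```c
#define WAYS 4                         /* example associativity */

struct way {
    int valid;
    unsigned long tag;
    unsigned long last_used;           /* timestamp of the most recent reference */
};

/* Pick the victim way: an invalid way if one exists, otherwise the way
 * whose last reference is oldest (i.e., the least recently used one). */
int choose_victim(struct way set[WAYS])
{
    int victim = 0;
    for (int i = 0; i < WAYS; i++) {
        if (!set[i].valid)
            return i;                  /* free slot: no eviction needed */
        if (set[i].last_used < set[victim].last_used)
            victim = i;
    }
    return victim;
}
```

Random replacement, by contrast, would simply pick a way such as rand() % WAYS, which needs no per-way bookkeeping at all.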

5/27/99  Ashish Sabharwal  Slide 4: Computing Cache Implementation Bits
- Total implementation bits = data bits + valid bits + tag bits
- (7.24) Address size = k bits, cache size = S bytes, block size = B = 2^b bytes, associativity = A
  1. data bits = S x 8 = 8S bits
  2. valid bits = 1 x (number of blocks) = S/B bits
  3. tag bits = (tag bits per address) x (blocks per cache)
     - tag bits per address = log2( (memory size / cache size) x A ) = k - log2(S/A)
     - hence, total tag bits = ( k - log2(S/A) ) x ( S/B )
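The slide's formula, plugged into a small C function with one worked configuration. The 32-bit address, 16 KB, 2-way, 16-byte-block example in main is an arbitrary illustration, not a case from the lecture; S, B, and A are assumed to be powers of two.

```c
#include <math.h>
#include <stdio.h>

/* k = address bits, S = cache size in bytes, B = block size in bytes,
 * A = associativity. Returns total implementation bits. */
long total_cache_bits(int k, long S, long B, long A)
{
    long data_bits     = 8 * S;                              /* S x 8           */
    long blocks        = S / B;                              /* number of blocks */
    long valid_bits    = blocks;                             /* 1 per block      */
    long tag_per_block = k - (long)log2((double)(S / A));    /* k - log2(S/A)    */
    return data_bits + valid_bits + tag_per_block * blocks;
}

int main(void)
{
    /* 32-bit addresses, 16 KB cache, 16-byte blocks, 2-way associative:
     * 131072 data + 1024 valid + 19 x 1024 = 19456 tag => 151552 bits. */
    printf("%ld bits\n", total_cache_bits(32, 16 * 1024, 16, 2));
    return 0;
}
```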