
Caches in Systems COMP25212 Cache 4

Learning Objectives
To understand:
– the "3 Cs" model of cache performance
– time penalties for starting with an empty cache
– system interconnect issues with caching – and solutions!
– caching and Virtual Memory

Describing Cache Misses – the "3 Cs"
– Compulsory misses: the first access to a block can never hit (also called cold-start misses)
– Capacity misses: the working set is larger than the cache, so blocks are evicted and later re-fetched
– Conflict misses: limited associativity forces blocks that map to the same set to evict each other

Cache Performance again
With today's caches, how long does it take:
a) to fill the L3 cache? (8 MB)
b) to fill the L2 cache? (256 KB)
c) to fill the L1 data cache? (32 KB)
Number of lines = (cache size) / (line size)
e.g. for L1: 32 KB / 64 B = 512 lines
512 memory accesses at 20 ns each ≈ 10 µs ≈ 20,000 clock cycles at 2 GHz
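As a quick check on these numbers, here is a minimal sketch that repeats the same arithmetic for all three caches, assuming one 20 ns memory access per 64-byte line and a 2 GHz clock (the figures used above):

```c
#include <stdio.h>

/* Fill-time arithmetic from the slide: one memory access (~20 ns)
 * per 64-byte line, clock at 2 GHz, cache sizes as quoted above. */
int main(void) {
    const double line_bytes    = 64.0;
    const double mem_access_ns = 20.0;
    const double clock_ghz     = 2.0;   /* cycles per nanosecond */

    const char  *name[]  = { "L1 D (32KB)", "L2 (256KB)", "L3 (8MB)" };
    const double bytes[] = { 32.0 * 1024, 256.0 * 1024, 8.0 * 1024 * 1024 };

    for (int i = 0; i < 3; i++) {
        double lines   = bytes[i] / line_bytes;       /* number of lines      */
        double fill_ns = lines * mem_access_ns;       /* one access per line  */
        printf("%-12s %8.0f lines  fill ~%7.1f us  (~%.0f cycles)\n",
               name[i], lines, fill_ns / 1000.0, fill_ns * clock_ghz);
    }
    return 0;
}
```

For the 32 KB L1 this reproduces the slide's figures: 512 lines, roughly 10 µs, roughly 20,000 cycles.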

Caches in Systems
[Block diagram: the CPU with split on-chip L1 instruction and data caches and a unified L2, connected through the system interconnect to RAM and to input/output devices (e.g. disk, network). How often does I/O traffic occur?]

Cache Consistency Problem 1
Problem: I/O writes to memory; the cache still holds an outdated copy.
[Diagram: the system as before, with an I/O device writing new data into RAM while the data cache retains the old value.]

Cache Consistency Problem 2
Problem: I/O reads from memory; the cache holds a newer value that has not yet been written back.
[Diagram: the system as before, with an I/O device reading stale data from RAM while the up-to-date copy sits in the data cache.]

Cache Consistency: Software Solutions
The O/S knows where I/O takes place in memory
– mark the I/O areas as non-cacheable (how?)
The O/S knows when I/O starts and finishes
– clear (flush/invalidate) the caches before & after I/O?
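A hedged sketch of the "clear caches before & after I/O" idea is shown below. The primitives cache_clean_range, cache_invalidate_range, dma_to_device and dma_from_device are hypothetical names (not from the lecture); real systems provide analogous privileged operations. Here they are stubs so the sketch compiles.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical cache-maintenance primitives (names assumed). */
static void cache_clean_range(const void *addr, size_t len) {
    /* write any dirty cached copies of [addr, addr+len) back to RAM */
    (void)addr; printf("clean      %zu bytes\n", len);
}
static void cache_invalidate_range(const void *addr, size_t len) {
    /* discard cached copies so the next CPU read refetches from RAM */
    (void)addr; printf("invalidate %zu bytes\n", len);
}

/* Hypothetical DMA operations (names assumed). */
static void dma_to_device(const void *buf, size_t len)  { (void)buf; (void)len; } /* device reads RAM  */
static void dma_from_device(void *buf, size_t len)      { (void)buf; (void)len; } /* device writes RAM */

/* Output transfer (Problem 2): the cache may hold newer data than RAM,
 * so clean it to RAM before the device reads memory. */
void send_buffer(const void *buf, size_t len) {
    cache_clean_range(buf, len);
    dma_to_device(buf, len);
}

/* Input transfer (Problem 1): the device writes RAM behind the cache's
 * back, so invalidate afterwards to avoid reading stale cached lines. */
void receive_buffer(void *buf, size_t len) {
    dma_from_device(buf, len);
    cache_invalidate_range(buf, len);
}

int main(void) {
    char buf[256];
    receive_buffer(buf, sizeof buf);   /* e.g. read a disk block into buf */
    send_buffer(buf, sizeof buf);      /* e.g. write it out again         */
    return 0;
}
```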

Hardware Solutions: 1
[Block diagram as before, illustrating the first hardware approach to keeping the cache and memory consistent.]
Unfortunately: it tends to slow the cache down.

Hardware Solutions: 2 – Snooping
Snoop logic in the cache observes every memory cycle on the interconnect.
The L2 keeps track of the L1 contents, so most snoops can be handled without disturbing the L1.
[Diagram: the system as before, with snoop logic attached to the caches watching the interconnect traffic.]
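The sketch below is a minimal, illustrative model (a direct-mapped cache with 512 lines of 64 bytes is assumed, not taken from the lecture) of what the snoop logic does: for every write it observes on the interconnect, such as a DMA transfer from an I/O device, it invalidates any matching line so the CPU cannot read stale data.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES   512          /* e.g. 32 KB direct-mapped (assumed) */
#define LINE_BYTES  64

/* Simplified direct-mapped cache model (illustrative only). */
struct cache_line {
    bool     valid;
    uint32_t tag;
};
static struct cache_line cache[NUM_LINES];

/* Snoop logic: called for every write observed on the memory interconnect.
 * If the written address is currently cached, that copy is now stale,
 * so the line is invalidated. */
void snoop_write(uint32_t phys_addr) {
    uint32_t index = (phys_addr / LINE_BYTES) % NUM_LINES;
    uint32_t tag   = (phys_addr / LINE_BYTES) / NUM_LINES;

    if (cache[index].valid && cache[index].tag == tag)
        cache[index].valid = false;      /* discard the stale line */
}
```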

Caches and Virtual Addresses
– CPU addresses are virtual
– memory addresses are physical
– recap: the TLB translates virtual to physical addresses
– so which addresses should the cache use?
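As a recap of the virtual-to-physical step, here is a small sketch of a TLB lookup. The structure (a 64-entry fully-associative TLB, 4 KB pages) is assumed for illustration; a real TLB does this search in hardware in a single cycle.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS   12           /* 4 KB pages (assumed)        */
#define TLB_ENTRIES 64           /* fully associative (assumed) */

struct tlb_entry {
    bool     valid;
    uint32_t vpn;                /* virtual page number   */
    uint32_t pfn;                /* physical frame number */
};
static struct tlb_entry tlb[TLB_ENTRIES];

/* Translate a virtual address to a physical one via the TLB.
 * Returns true on a TLB hit; a miss would trigger a page-table walk
 * followed by a TLB refill. */
bool translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn    = vaddr >> PAGE_BITS;
    uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);

    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *paddr = (tlb[i].pfn << PAGE_BITS) | offset;  /* offset unchanged */
            return true;
        }
    }
    return false;   /* TLB miss */
}
```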

Option 1: Cache by Physical Addresses
[Diagram: CPU → TLB → physically-addressed cache → RAM.]
BUT: address translation sits in series with the cache lookup – SLOW.

Option 2: Cache by Virtual Addresses
[Diagram: CPU → virtually-addressed cache, with the TLB used only on the way out to RAM.]
BUT: how does snooping work, given that the interconnect carries physical addresses? And what about aliasing, where two virtual addresses map to the same physical location? More functional difficulties.

Option 3: Translate in Parallel with the Cache Lookup
– translation only affects the high-order bits of the address; the address within the page is unchanged
– so the low-order bits of the physical address = the low-order bits of the virtual address
– select the cache "index" field from within those low-order bits
– then only the "tag" bits are changed by translation

Option 3 in operation:
[Diagram: the virtual address is split into virtual page number, index and within-line offset. The index selects a cache line (tag + data) while, in parallel, the TLB translates the virtual page number into a physical address. The translated tag is compared ("= ?") with the stored tag to produce Hit?, and a multiplexer selects the required data from the line.]
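The address-splitting behind Option 3 can be sketched as below, assuming 4 KB pages, 64-byte lines and 6 index bits (so index + offset fit inside the page offset); these parameters are illustrative, not the lecture's. Because the index bits lie within the page offset, the cache read can begin from the virtual address while the TLB translates; only the tag comparison needs the physical address.

```c
#include <stdint.h>

#define PAGE_BITS   12   /* 4 KB pages: low 12 bits unchanged by translation */
#define OFFSET_BITS  6   /* 64-byte lines                                    */
#define INDEX_BITS   6   /* 64 sets: index+offset = 12 bits <= PAGE_BITS     */

/* The index can be taken from the VIRTUAL address immediately,
 * because it lies entirely within the page offset. */
uint32_t cache_index(uint32_t vaddr) {
    return (vaddr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
}

/* The tag comes from the PHYSICAL address, i.e. after the TLB has
 * translated the virtual page number - but that translation runs in
 * parallel with the indexed read of the cache line. */
uint32_t cache_tag(uint32_t paddr) {
    return paddr >> (OFFSET_BITS + INDEX_BITS);
}
```

With 4 KB pages this arrangement limits each way of the cache to 4 KB, which is one reason a 32 KB L1 indexed this way is typically 8-way set-associative.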

The Last Word on Caching?
[Diagram: a multi-core system – several chips, each containing two CPU cores with their own L1 instruction and data caches and private L2 caches, sharing an on-chip L3; the chips connect through the interconnect to RAM and I/O.]
You ain't seen nothing yet!