Caches in Systems Feb 2013 COMP25212 Cache 4.

Learning Objectives
To understand:
- the "3 Cs" model of cache performance
- time penalties for starting with an empty cache
- system interconnect issues with caching, and their solutions
- caching and virtual memory

Describing Cache Misses
- Compulsory misses
- Capacity misses
- Conflict misses
How can we avoid them?
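The three categories can be told apart mechanically. A minimal sketch, using the standard trick of comparing a direct-mapped cache against a fully-associative LRU cache of the same total size (the line size, line count, and example trace below are my assumptions, not from the slide):

```python
# Toy "3 Cs" classifier. A miss is compulsory if the block was never seen
# before; otherwise it is a conflict miss if a fully-associative LRU cache
# of the same size would have hit, and a capacity miss if it would not.
from collections import OrderedDict

LINE = 64        # bytes per line (assumed)
NLINES = 8       # total lines (tiny, to make misses easy to provoke)

def classify(addresses):
    seen = set()              # blocks ever referenced
    direct = {}               # index -> tag (direct-mapped contents)
    lru = OrderedDict()       # fully-associative LRU of NLINES blocks
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}
    for a in addresses:
        block = a // LINE
        index, tag = block % NLINES, block // NLINES
        dm_hit = direct.get(index) == tag
        fa_hit = block in lru
        if fa_hit:
            lru.move_to_end(block)
        else:
            lru[block] = None
            if len(lru) > NLINES:
                lru.popitem(last=False)    # evict least recently used
        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1
        elif fa_hit:
            counts["conflict"] += 1        # only the mapping was at fault
        else:
            counts["capacity"] += 1        # the cache was simply too small
        seen.add(block)
        direct[index] = tag
    return counts

# Two blocks that collide on the same direct-mapped index:
trace = [0, 8 * LINE, 0, 8 * LINE]
print(classify(trace))   # two compulsory misses, then two conflict misses
```

Alternating between two colliding blocks shows why these are conflict (not capacity) misses: the fully-associative cache holds both comfortably.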

Cache Performance Again
For today's caches, how long does it take:
a) to fill the L3 cache (8 MB)?
b) to fill the L2 cache (256 KB)?
c) to fill the L1 data cache (32 KB)?
Number of lines = (cache size) / (line size), e.g. for L1: 32 KB / 64 B = 512 lines.
- L1: 512 memory accesses at 20 ns each ≈ 10 µs, i.e. about 20,000 clock cycles at 2 GHz
- L2: 4,096 lines × 20 ns ≈ 82 µs
- L3: 131,072 lines × 20 ns ≈ 2.6 ms
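The slide's arithmetic can be reproduced directly. A quick check, assuming (as the slide does) 64-byte lines, a 20 ns memory access per line, and a 2 GHz clock:

```python
# Fill time = (cache size / line size) * memory access time.
LINE_SIZE = 64        # bytes per line (slide's assumption)
ACCESS_NS = 20        # ns per memory access (slide's assumption)
CLOCK_GHZ = 2.0

caches = [("L1 D", 32 * 1024), ("L2", 256 * 1024), ("L3", 8 * 1024 * 1024)]
for name, size in caches:
    lines = size // LINE_SIZE
    fill_ns = lines * ACCESS_NS
    cycles = fill_ns * CLOCK_GHZ
    print(f"{name}: {lines} lines, {fill_ns / 1000:.1f} us, {cycles:.0f} cycles")
```

This gives roughly 10 µs for L1, 82 µs for L2 and 2.6 ms for L3: a cold cache costs tens of thousands to millions of clock cycles to warm up.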

Caches in Systems
[Diagram: CPU with L1 instruction and L1 data caches, connected through the on-chip interconnect to RAM and to input/output devices, e.g. disk and network.]
How often does I/O data arrive?
- SATA at 300 MB/s: one byte every ~3.3 ns, i.e. 64 bytes (a cache line) every ~213 ns
- 1 Gb Ethernet: one bit every ns, i.e. 64 bytes every 512 ns
- 10 Gb Ethernet: ten times faster again
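The arrival rates above follow from the device bandwidths quoted on the slide. A small check of the time per 64-byte cache line:

```python
# Time for one 64-byte cache line to arrive at a given device bandwidth.
LINE_BYTES = 64

def ns_per_line(bytes_per_sec):
    return LINE_BYTES / bytes_per_sec * 1e9

print(ns_per_line(300e6))      # SATA 300 MB/s  -> ~213 ns per line
print(ns_per_line(1e9 / 8))    # 1 Gb Ethernet  -> 512 ns per line
print(ns_per_line(10e9 / 8))   # 10 Gb Ethernet -> 51.2 ns per line
```

At 10 Gb Ethernet a full cache line arrives every ~51 ns, comparable to a main-memory access time, so I/O traffic is no longer rare from the cache's point of view.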

Cache Consistency Problem 1
[Diagram: CPU with instruction cache, data cache and L2; an item "I" is held in the data cache, in L2, and in RAM, with I/O attached via the on-chip interconnect.]
Problem: I/O writes to memory, so the cached copies of "I" become out of date.

Cache Consistency Problem 2
[Diagram: as before, but the data cache holds a newer value of "I" than RAM.]
Problem: I/O reads from memory, but the cache holds a newer value than the one memory supplies.

Cache Consistency: Software Solutions
- The OS knows where I/O takes place in memory: mark those areas as non-cacheable (how?)
- The OS knows when I/O starts and finishes: clear (invalidate) the caches before and after each I/O operation?
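The second option can be illustrated with a toy model: a cache modelled as a dict, and a simulated DMA transfer that writes memory behind the cache's back (all names here, `dma_write`, `invalidate_cache`, the address `0x1000`, are illustrative, not a real API):

```python
# Toy model of "clear caches after I/O". A DMA write goes straight to
# memory, so a previously cached copy is stale until the OS invalidates it.
memory = {0x1000: "old"}
cache = {}

def read(addr):
    if addr not in cache:      # miss: fetch from memory
        cache[addr] = memory[addr]
    return cache[addr]

def dma_write(addr, value):    # I/O device writes memory directly,
    memory[addr] = value       # bypassing the cache entirely

def invalidate_cache():        # what the OS must do after I/O completes
    cache.clear()

read(0x1000)                   # brings "old" into the cache
dma_write(0x1000, "new")
print(read(0x1000))            # "old" -- stale copy: this is Problem 1
invalidate_cache()
print(read(0x1000))            # "new" -- consistent again
```

The cost of this approach is that invalidation throws away *all* cached data, not just the lines the I/O touched, so the cache restarts cold.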

Hardware Solutions: 1
[Diagram: I/O traffic is routed through the cache hierarchy (L1/L2) rather than directly to RAM.]
Issues? Unfortunately, this tends to slow down the cache.

Hardware Solutions: 2 - Snooping
[Diagram: as before, with snoop logic attached to the on-chip interconnect.]
- L2 keeps track of the L1 contents
- Snoop logic in the cache observes every memory cycle, so it can invalidate a cached line when I/O writes to the corresponding location
Issues?
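The snooping idea, in the same toy style as the software-solution model above (the names and the address `0x2000` are illustrative): the snoop logic sees every write on the bus and invalidates its own copy, so no explicit OS flush is needed.

```python
# Toy snooping sketch: every memory-bus write is observed by the cache's
# snoop logic, which invalidates the matching line automatically.
memory = {0x2000: "old"}
cache = {}

def read(addr):
    if addr not in cache:
        cache[addr] = memory[addr]
    return cache[addr]

def bus_write(addr, value):
    memory[addr] = value
    cache.pop(addr, None)    # snoop: saw a write to addr, drop our copy

read(0x2000)                 # cache now holds "old"
bus_write(0x2000, "new")     # I/O write, seen by the snoop logic
print(read(0x2000))          # "new" -- the stale line was invalidated
```

Unlike the software solution, only the affected line is discarded; the price is that the snoop logic must watch (and arbitrate for) the cache tags on every memory cycle.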

Caches and Virtual Addresses
- CPU addresses are virtual; memory addresses are physical
- Recap: the TLB translates virtual to physical addresses
- Which addresses should the cache use?

Option 1: Cache by Physical Addresses
[Diagram: CPU → TLB → cache → RAM, with the TLB and cache on-chip.]
But: address translation is in series with the cache lookup, making every access SLOW.

Option 2: Cache by Virtual Addresses
[Diagram: CPU → cache → TLB → RAM, with the cache before the TLB.]
But: how does snooping work? What about aliasing (two virtual addresses for one physical location)? More functional difficulties.

Option 3: Translate in Parallel with the Cache Lookup
- Translation only affects the high-order bits of the address; the address within a page is unchanged
- So the low-order bits of the physical address = the low-order bits of the virtual address
- Select the cache "index" field from within those low-order bits
- Then only the "tag" bits are changed by translation

Option 3 in Operation
[Diagram: a 32-bit virtual address split into a 20-bit virtual page number, a 5-bit index and a 7-bit offset within the line. The page number goes to the TLB while the index selects a cache line; the TLB output is compared with the stored tag, via a multiplexer, to produce Hit? and select the data.]
What are my assumptions?
- Line size = 2^7 = 128 bytes
- Index = 5 bits, so the cache has 2^5 = 32 lines: a 4 KB cache
- Page offset = 12 bits: 4 KB pages
How can we increase the cache size? Multi-way set-associativity: 8-way set-associativity would give a 32 KB cache.
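The slide's field widths can be checked, and the address split made concrete. A sketch using the slide's 20/5/7 split (the `split` helper and example address are mine, for illustration):

```python
# Virtually-indexed, physically-tagged arithmetic for a 32-bit address:
# 20-bit virtual page number | 5-bit index | 7-bit offset within the line.
OFFSET_BITS = 7     # line size = 2**7 = 128 bytes
INDEX_BITS = 5      # cache has 2**5 = 32 lines

line_size = 2 ** OFFSET_BITS
num_lines = 2 ** INDEX_BITS
cache_size = line_size * num_lines
page_offset_bits = INDEX_BITS + OFFSET_BITS   # all untranslated bits

print(cache_size)              # 4096 bytes: a 4 KB direct-mapped cache
print(2 ** page_offset_bits)   # 4096: matches the 4 KB page size exactly
print(8 * cache_size)          # 32768: 8-way set-associative gives 32 KB

def split(vaddr):
    """Split a virtual address into (page number, index, line offset)."""
    offset = vaddr & (line_size - 1)
    index = (vaddr >> OFFSET_BITS) & (num_lines - 1)
    vpn = vaddr >> page_offset_bits
    return vpn, index, offset

print(split(0x12345ABC))
```

The key constraint is visible in the numbers: index + offset bits must fit inside the page offset, which caps a direct-mapped cache at one page (4 KB here). Adding ways multiplies capacity without adding index bits.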

The Last Word on Caching?
[Diagram: several CPUs, each with its own on-chip L1 instruction and data caches, L2 and L3, all sharing RAM and input/output.]
You ain't seen nothing yet!