COMP25212 – SYSTEM ARCHITECTURE
CACHES IN SYSTEMS
Antoniu Pop (Antoniu.Pop@manchester.ac.uk)
Jan/Feb 2015 – COMP25212 Lecture 4

Learning Objectives
To understand:
- The “3 Cs” model of cache performance
- Time penalties for starting with an empty cache
- System interconnect issues with caching, and their solutions
- Caching and virtual memory

Describing Cache Misses
- Compulsory misses: the first access to a block always misses (“cold start”)
- Capacity misses: even with full associativity, the cache cannot contain all the blocks the program uses
- Conflict misses: multiple blocks compete for the same set; this would not happen in a fully associative cache. How can we avoid them?
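The difference between conflict and compulsory misses can be seen in a toy simulation (illustrative Python, not part of the lecture): a direct-mapped cache and a fully associative LRU cache of the same size run the same access pattern, and the extra misses in the direct-mapped run are conflict misses.

```python
# Toy simulator contrasting a direct-mapped cache with a fully
# associative (LRU) cache of the same size. Extra misses in the
# direct-mapped run on this pattern are conflict misses.
from collections import OrderedDict

NUM_LINES = 4  # tiny cache: 4 lines, 1 block per line

def direct_mapped_misses(blocks):
    lines = [None] * NUM_LINES
    misses = 0
    for b in blocks:
        idx = b % NUM_LINES          # index = block number mod #lines
        if lines[idx] != b:
            misses += 1
            lines[idx] = b
    return misses

def fully_assoc_misses(blocks):
    cache = OrderedDict()            # ordered by recency of use
    misses = 0
    for b in blocks:
        if b in cache:
            cache.move_to_end(b)     # mark as most recently used
        else:
            misses += 1
            if len(cache) == NUM_LINES:
                cache.popitem(last=False)  # evict least recently used
            cache[b] = True
    return misses

# Blocks 0 and 4 map to the same direct-mapped line (0 mod 4 == 4 mod 4)
pattern = [0, 4, 0, 4, 0, 4]
print(direct_mapped_misses(pattern))  # 6: every access is a conflict miss
print(fully_assoc_misses(pattern))    # 2: only the compulsory misses
```

Only two blocks are touched, so the cache is never full: the four extra misses in the direct-mapped run are pure conflict misses, which more associativity removes.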

Cache Performance Again
With today's caches, how long does it take:
a) to fill the L3 cache? (8 MB)
b) to fill the L2 cache? (256 KB)
c) to fill the L1 data cache? (32 KB)
Number of lines = (cache size) / (line size). Assuming 64-byte lines and one 20 ns memory access per line:
- L1: 32 KB / 64 = 512 lines; 512 x 20 ns ≈ 10 µs (20,000 clock cycles at 2 GHz)
- L2: 256 KB / 64 = 4,096 lines; 4,096 x 20 ns ≈ 82 µs
- L3: 8 MB / 64 = 131,072 lines; 131,072 x 20 ns ≈ 2.6 ms
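The arithmetic above can be sketched as a short calculation (a rough model assuming one 20 ns memory access per 64-byte line; real hardware overlaps accesses, so treat these as upper bounds):

```python
# Rough fill-time estimate per cache level: one 20 ns memory access
# fetches one 64-byte line (figures from the slide).
LINE_SIZE = 64        # bytes per cache line
ACCESS_NS = 20        # ns per memory access
CLOCK_GHZ = 2.0       # 2 GHz core clock

def fill_time_ns(cache_bytes):
    lines = cache_bytes // LINE_SIZE
    return lines * ACCESS_NS

for name, size in [("L1", 32 * 1024), ("L2", 256 * 1024), ("L3", 8 * 1024 * 1024)]:
    ns = fill_time_ns(size)
    cycles = int(ns * CLOCK_GHZ)   # ns * cycles-per-ns
    print(f"{name}: {size // LINE_SIZE} lines, {ns / 1000:.1f} us, {cycles} cycles")
```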

Caches in Systems
[Diagram: CPU with L1 instruction and data caches, a shared L2, on-chip interconnect, RAM and input/output devices]
How often does I/O traffic arrive (what bandwidth is required)? e.g. disk, network:
- SATA ≈ 300 MB/s: one byte every ~3 ns, or 64 bytes every ~213 ns
- 1 Gb Ethernet: 1 bit every ns, or 64 bytes (512 bits) every 512 ns
- 10 Gb Ethernet?
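As a sanity check on those figures, the line-arrival arithmetic can be written out (illustrative, assuming the device rates quoted above):

```python
# How often one 64-byte cache line's worth of data arrives from each
# I/O device, given the bandwidths quoted on the slide.
def ns_per_line(bytes_per_second, line_bytes=64):
    return line_bytes / bytes_per_second * 1e9   # seconds -> ns

sata_bps = 300e6            # SATA ~ 300 MB/s
gige_bps = 1e9 / 8          # 1 Gb/s Ethernet = 125 MB/s
ten_gige_bps = 10e9 / 8     # 10 Gb/s Ethernet

print(f"SATA:  one 64 B line every {ns_per_line(sata_bps):.0f} ns")
print(f"1GbE:  one 64 B line every {ns_per_line(gige_bps):.0f} ns")
print(f"10GbE: one 64 B line every {ns_per_line(ten_gige_bps):.1f} ns")
```

At 10 Gb Ethernet a full cache line arrives roughly every 51 ns, which is why the interconnect and caches must cope with I/O traffic efficiently.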

Cache Consistency Problem 1
[Diagram: I/O writes a new value (“5”) into RAM, but the CPU's data cache still holds the old value (“3”)]
Problem: I/O writes to memory; the cached copy is now out of date.

Cache Consistency Problem 2
[Diagram: the CPU has updated a value (“5”) in its data cache, but RAM still holds the old value (“3”)]
Problem: I/O reads memory; the cache holds newer data than RAM.

Cache Consistency: Software Solutions
- The OS knows where I/O takes place in memory: mark those areas as non-cacheable (how?)
- The OS knows when I/O starts and finishes: clear/flush the caches before and after I/O?
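A toy write-back cache (illustrative Python; `ToyCache`, `flush` and the addresses are made up, not a real OS API) shows why the OS must write dirty lines back before a device reads memory, and invalidate cached copies after a device writes it:

```python
# Toy write-back cache: writes stay in the cache until flushed, so a
# device reading RAM sees stale data unless the OS flushes first, and
# the CPU sees stale data after DMA unless cached lines are invalidated.
class ToyCache:
    def __init__(self, memory):
        self.memory = memory          # backing "RAM": dict addr -> value
        self.lines = {}               # addr -> (value, dirty)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = (self.memory[addr], False)  # fill on miss
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = (value, True)   # write-back: RAM not updated yet

    def flush(self):
        # Write all dirty lines back to RAM, then drop every line
        # (so this both flushes and invalidates).
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                self.memory[addr] = value
        self.lines.clear()

ram = {0x100: 3}
cache = ToyCache(ram)
cache.write(0x100, 5)          # CPU writes; only the cache sees "5"
assert ram[0x100] == 3         # a device reading RAM now gets stale data
cache.flush()                  # OS flushes before starting the I/O
assert ram[0x100] == 5         # device now sees the up-to-date value

ram[0x100] = 7                 # device (DMA) writes new data into RAM
assert cache.read(0x100) == 7  # line was invalidated, so the CPU refills
```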

Hardware Solutions: 1
[Diagram: L1 caches, L2 and RAM all consistent, holding “5”]
Issues?
Disadvantage: tends to slow down the cache.

Hardware Solutions: 2 – Snooping
[Diagram: caches and RAM all consistent, holding “5”]
- L2 keeps track of the L1 contents
- Snoop logic in the cache observes every memory cycle
Issues?

Caches and Virtual Addresses
- CPU addresses are virtual; memory addresses are physical
- Recap: a Translation Lookaside Buffer (TLB) translates virtual addresses to physical (V-to-P)
- Which addresses should the cache hold?

Option 1: Cache by Physical Address
[Diagram: CPU → TLB → cache → RAM]
BUT: address translation is in series with the cache lookup: SLOW.

Option 2: Cache by Virtual Address
[Diagram: CPU → cache → TLB → RAM]
BUT: how does snooping work? What about aliasing? More functional difficulties.

Option 3: Translate in Parallel with the Cache Lookup
- Translation only affects the high-order bits of the address; the address within a page is unchanged
- So the low-order bits of the physical address equal the low-order bits of the virtual address
- Select the cache “index” field from within those low-order bits
- Only the “tag” bits are changed by translation

Option 3 in Operation
[Diagram: the virtual address is split 20 | 5 | 7 bits into virtual page number | index | offset within line; the TLB translates the page number while the index selects a cache line, then the translated tag is compared with the stored tag to decide the hit]
What are my assumptions?
- Line size = 2^7 bytes: 128 bytes
- Index = 5 bits, so the cache has 2^5 = 32 lines: a 4 KB cache
- Page offset = 12 bits: 4 KB pages
How can we increase cache size? Multi-way set associativity: 8-way set associativity would give a 32 KB cache.
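The bit-slicing above can be sketched in code (a minimal model using the slide's assumptions: 7 offset bits, 5 index bits, 12-bit page offset, 32-bit addresses):

```python
# Bit-slicing for the "translate in parallel" lookup: the index and
# offset lie entirely inside the 12-bit page offset, so they are the
# same in the virtual and physical address. Cache indexing can start
# before the TLB answers; only the tag needs the translated page number.
OFFSET_BITS = 7                               # 128-byte lines
INDEX_BITS = 5                                # 32 lines
PAGE_OFFSET_BITS = OFFSET_BITS + INDEX_BITS   # 12 bits: 4 KB pages

def split_vaddr(vaddr):
    offset = vaddr & ((1 << OFFSET_BITS) - 1)           # byte within line
    index = (vaddr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # cache line
    vpn = vaddr >> PAGE_OFFSET_BITS                     # page number -> TLB
    return vpn, index, offset

vpn, index, offset = split_vaddr(0x12ABC)
print(vpn, index, offset)   # the three fields of the virtual address
```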

The Last Word on Caching?
[Diagram: several CPUs, each with its own on-chip L1 instruction and data caches, L2 and L3, all sharing RAM and input/output]
You ain't seen nothing yet!

Summary
- “3 Cs” model of cache performance
- System interconnect issues with caching, and their solutions: non-cacheable areas, cache flushing, snooping
- Caching and virtual memory: virtual-to-physical translation (TLB), cache architectures that support it