Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Lecture 20 - Memory.

Presentation transcript:

Prof. John Nestor, ECE Department, Lafayette College, Easton, Pennsylvania. ECE 313 - Computer Organization, Lecture 20 - Memory Hierarchy 1, Fall 2004. Reading:
Portions of these slides are derived from: textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved; Tod Amon's COD2e slides © 1998 Morgan Kaufmann Publishers, all rights reserved; Dave Patterson's CS 152 slides, Fall 1997, © UCB; Rob Rutenbar's slides, Fall 1999, CMU; other sources as noted.

Roadmap for the term: major topics
• Overview / Abstractions and Technology
• Instruction sets
• Logic & arithmetic
• Performance
• Processor Implementation
  - Single-cycle implementation
  - Multicycle implementation
  - Pipelined implementation
• Memory systems (current topic)
• Input/Output

Outline - Memory Systems
• Overview (current topic)
  - Motivation
  - General Structure and Terminology
• Memory Technology
  - Static RAM
  - Dynamic RAM
  - Disks
• Cache Memory
• Virtual Memory

Memory Systems - the Big Picture
• Memory provides the processor with instructions and data
• Problem: memory is too slow and too small
[Figure: the "Five Classic Components" picture - processor (control and datapath), memory, input, and output - with instructions and data flowing between memory and the processor]

Memory Hierarchy - the Big Picture
• Problem: memory is too slow and too small
• Solution: memory hierarchy
[Figure: the hierarchy from processor registers through L1 on-chip cache, L2 off-chip cache, and main memory (DRAM) down to secondary storage (disk); moving down the hierarchy, speed goes from fastest to slowest, size from smallest to biggest, and cost per byte from highest to lowest]

Why Hierarchy Works
• The principle of locality: programs access a relatively small portion of the address space at any instant of time
  - Temporal locality: recently accessed data is likely to be used again
  - Spatial locality: data near recently accessed data is likely to be used soon (both kinds are illustrated in the sketch below)
• Result: the illusion of large, fast memory
[Figure: probability of reference plotted across the address space from 0 to 2^n - 1]
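To make the two kinds of locality concrete, here is a minimal C sketch (not from the slides): the array sweep exhibits spatial locality, while the repeatedly reused accumulator and loop counter exhibit temporal locality.

```c
#include <stdio.h>

#define N 1024

int main(void) {
    int a[N];
    int sum = 0;

    /* Spatial locality: consecutive elements of a[] sit in the same or
       neighboring cache blocks, so after the first (miss) access to a
       block the remaining accesses to that block are hits. */
    for (int i = 0; i < N; i++)
        a[i] = i;

    /* Temporal locality: sum and i are touched on every iteration, so
       they stay in registers or the topmost cache level. */
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %d\n", sum);
    return 0;
}
```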

Memory Hierarchy - Speed vs. Size
[Figure: the same hierarchy (registers, L1 on-chip cache, L2 off-chip cache, main memory (DRAM), secondary storage (disk)) annotated with speed and size: access times grow from about a nanosecond near the processor to roughly 5,000,000 ns (5 ms) for disk, while capacities grow from less than 1 KB of registers, through caches of less than 16 MB and main memory of less than 16 GB, to more than 100 GB of disk]

Memory Hierarchy - Terminology
[Figure: the processor above an upper and a lower memory level that exchange blocks of data; a hit means the data is in the upper level, a miss means it is not]

Memory Hierarchy Terminology (cont'd)
• Hit: data appears in some block in the upper level (green block)
  - Hit Rate: the fraction of memory accesses that "hit"
  - Hit Time: time to access the upper level (time to determine hit/miss + access time)
• Miss: data must be retrieved from a block in the lower level (orange block)
  - Miss Rate = 1 - (Hit Rate)
  - Miss Penalty: time to replace the block in the upper level + time to deliver the data to the processor
• Note that Hit Time << Miss Penalty (see the average-access-time sketch below)
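These definitions combine into the standard average memory access time formula, AMAT = Hit Time + Miss Rate × Miss Penalty. The formula is standard but not spelled out on the slide, and the numbers below are illustrative assumptions rather than values from the lecture.

```c
#include <stdio.h>

/* Average Memory Access Time: AMAT = hit_time + miss_rate * miss_penalty.
   The values below are illustrative assumptions, not from the slides. */
int main(void) {
    double hit_time     = 1.0;   /* cycles to access the upper level     */
    double miss_rate    = 0.05;  /* fraction of accesses that miss       */
    double miss_penalty = 20.0;  /* cycles to bring the block from below */

    double amat = hit_time + miss_rate * miss_penalty;
    printf("AMAT = %.2f cycles\n", amat);   /* 1 + 0.05 * 20 = 2.00 cycles */
    return 0;
}
```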

Typical Memory Hierarchy - Details
• Registers - small, fastest on-chip storage
  - Managed by compiler and run-time system
• Cache - small, fast on-chip storage
  - Associative lookup - managed by hardware
• Memory - slower, larger off-chip storage
  - Limited size (<16 GB) - managed by hardware and OS
• Disk - slowest, largest off-chip storage
  - Virtual memory - simulate a large memory using disk, hardware, and the operating system
  - File storage - store data files using the operating system

Outline - Memory Systems
• Overview
  - Motivation
  - General Structure and Terminology
• Memory Technology (current topic)
  - Static RAM
  - Dynamic RAM
• Cache Memory
• Virtual Memory

Memory Types
• Static RAM
  - Storage using latch circuits
  - Values saved while power is on
• Dynamic RAM
  - Storage using capacitors
  - Values must be refreshed
[Figure: an SRAM cell with a word line and complementary bit lines, and a DRAM cell with a row-select line, a bit line, and a storage capacitor C]

Tradeoffs - Static vs. Dynamic RAM
• Static RAM (SRAM) - used for L1, L2 cache
  - Fast access time, a few ns (less for on-chip)
  - Larger, more expensive
  - Higher power consumption
• Dynamic RAM (DRAM) - used for PC main memory
  - Slower access time, tens of ns*
  - Smaller, cheaper
  - Lower power consumption

DRAM Organization
[Figure: the DRAM array, with a row decoder driving row-select lines and a column selector / latch / IO block on the bit (data) lines; the row address and column address are presented with the /RAS and /CAS strobes, and the selected bit appears on DATA]

DRAM Read Operation
[Figure: read timing for the same array - the row address is applied and latched by asserting /RAS, then the column address is latched by asserting /CAS, and the selected data bit is driven onto DATA. See the address-splitting sketch below.]
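The /RAS-then-/CAS protocol exists because a single DRAM address bus carries the row and column halves of the address one after the other. A minimal sketch of that split, assuming a hypothetical 1M x 1 part with 10 row bits and 10 column bits (the field widths are assumptions, not from the slides):

```c
#include <stdio.h>

/* Hypothetical 1M x 1 DRAM: a 20-bit address is split into a 10-bit row
   address (latched with /RAS) and a 10-bit column address (latched with
   /CAS).  The field widths are assumptions chosen for illustration. */
int main(void) {
    unsigned addr = 0x2F3A5;              /* 20-bit DRAM address              */
    unsigned row  = (addr >> 10) & 0x3FF; /* upper 10 bits -> row decoder     */
    unsigned col  = addr & 0x3FF;         /* lower 10 bits -> column selector */

    printf("row = 0x%03X, col = 0x%03X\n", row, col);  /* row 0x0BC, col 0x3A5 */
    return 0;
}
```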

DRAM Trends
• RAM size: 4X every 3 years
• RAM speed: 2X every 10 years

  Year    DRAM Size   Cycle Time
  1980       64 Kb      250 ns
  1983      256 Kb      220 ns
  1986        1 Mb      190 ns
  1989        4 Mb      165 ns
  1992       16 Mb      145 ns
  1995       64 Mb      120 ns
  1997?     128 Mb       ?? ns
  1999?     256 Mb       ?? ns

• Size change: 1000:1! Speed change: 2:1!

The Processor/Memory Speed Gap
[Figure: performance plotted against time from about 1982 - processor performance ("Moore's Law") climbs far faster than DRAM speed, which improves only about 9% per year (2X every 10 years), so the processor-memory performance gap grows about 50% per year]
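A quick sanity check on those growth rates (a sketch with assumed starting points, not data from the slide): 9% per year compounds to roughly 2x in a decade, while a gap growing 50% per year explodes.

```c
#include <stdio.h>
#include <math.h>

/* Compound-growth check on the rates quoted in the figure.
   Starting values are arbitrary; only the ratios matter. */
int main(void) {
    /* DRAM at ~9%/year over 10 years is roughly the quoted 2X per decade. */
    printf("DRAM speedup after 10 years at 9%%/yr: %.2fx\n", pow(1.09, 10.0));

    /* A processor-memory gap growing ~50%/year compounds dramatically. */
    double gap = 1.0;
    for (int year = 0; year <= 20; year += 5) {
        printf("after %2d years the gap is %8.1fx\n", year, gap);
        gap *= pow(1.50, 5.0);
    }
    return 0;
}
```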

Addressing the Speed Gap
• Latency depends on physical limitations
• Bandwidth can be increased using:
  - Parallelism - transfer more bits / word
  - Burst transfers - transfer successive words on each cycle (see the sketch below)
• So... use bandwidth to support the memory hierarchy!
  - Use cache to support locality of reference
  - Design the hierarchy to transfer large blocks of memory
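The payoff from burst transfers can be seen with a little arithmetic. The timings below are assumptions for illustration: one full access latency for the first word of a block, then one short burst cycle per additional word.

```c
#include <stdio.h>

/* Why bursts help: a 4-word block costs one full access plus three short
   burst cycles, instead of four full accesses.  Timings are assumptions. */
int main(void) {
    int words_per_block = 4;
    int full_access_ns  = 60;  /* first word: row + column access time  */
    int burst_cycle_ns  = 10;  /* each additional word within the burst */

    int separate = words_per_block * full_access_ns;
    int burst    = full_access_ns + (words_per_block - 1) * burst_cycle_ns;

    printf("separate accesses: %d ns, burst transfer: %d ns\n", separate, burst);
    return 0;
}
```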

Current DRAM Parts
• Synchronous DRAM (SDRAM) - clocked transfer of bursts of data starting at a specific address
• Double-Data-Rate (DDR) SDRAM - transfers two bits per clock cycle
• Quad-Data-Rate (QDR) SDRAM - transfers four bits per clock cycle
• Rambus RDRAM - high-speed interface for fast transfers
• Current PCs use some form of SDRAM/RDRAM (peak bandwidths are sketched below)
  - SDRAM w/ PC100 or PC133 memory bus
  - RDRAM w/ PC800 memory bus
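For a rough sense of what those bus names mean in bandwidth terms, the sketch below assumes the usual 64-bit (8-byte) SDRAM DIMM data path; that width and the doubling for DDR are standard figures used here as assumptions rather than numbers taken from the slide.

```c
#include <stdio.h>

/* Peak bandwidth of an 8-byte-wide memory bus at PC100/PC133 clock rates.
   Bus width and the DDR doubling are assumptions for illustration. */
int main(void) {
    double bus_bytes    = 8.0;               /* 64-bit SDRAM DIMM     */
    double clock_mhz[2] = { 100.0, 133.0 };  /* PC100 and PC133 buses */

    for (int i = 0; i < 2; i++) {
        double sdr = clock_mhz[i] * bus_bytes;  /* one transfer per clock  */
        double ddr = 2.0 * sdr;                 /* two transfers per clock */
        printf("%3.0f MHz bus: SDR %4.0f MB/s, DDR %4.0f MB/s\n",
               clock_mhz[i], sdr, ddr);
    }
    return 0;
}
```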

Memory Configuration in Current PCs
[Figure: block diagram of the processor (with on-chip L1 cache), the system controller, L2/L3 cache (SRAM), main memory (DRAM), and the I/O bus]

Outline - Memory Systems
• Overview
  - Motivation
  - General Structure and Terminology
• Memory Technology
  - Static RAM
  - Dynamic RAM
• Cache Memory (current topic)
• Virtual Memory

Cache Operation
• Insert between the CPU and main memory
• Implement with fast static RAM
• Holds some of a program's data and instructions
• Operation: Hit - data in cache (no penalty); Miss - data not in cache (miss penalty)
[Figure: the processor exchanging address/data with the cache, which in turn exchanges address/data with DRAM main memory]

Four Key Cache Questions:
1. Where can a block be placed in the cache? (block placement)
2. How can a block be found in the cache? (block identification)
3. Which block should be replaced on a miss? (block replacement)
4. What happens on a write? (write strategy)

Basic Cache Design
• Organized into blocks or lines
• Block contents:
  - tag - extra bits to identify the block (part of the block address)
  - data - data or instruction words from contiguous memory locations
• Our example (used in the sketch and walkthrough below):
  - One-word (4-byte) block size
  - 30-bit tag
  - Two blocks in the cache (b0 and b1)
[Figure: a two-entry cache (tag 0 / data 0 and tag 1 / data 1) alongside main memory locations 0x00, 0x04, 0x08, 0x0C, ...]
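As a sketch of how an address is carved into fields for this cache, the code below assumes a direct-mapped organization with 2 byte-offset bits (one 4-byte word per block), 1 index bit (two blocks), and the remaining upper bits as the tag; the slide quotes a 30-bit tag, so the exact split here is an assumption made for illustration.

```c
#include <stdio.h>

/* Address field split for a tiny direct-mapped cache: two one-word blocks.
   offset = bits [1:0], index = bit [2], tag = the remaining upper bits.
   The split is an assumption made for illustration. */
int main(void) {
    unsigned addrs[4] = { 0x00, 0x04, 0x08, 0x0C };

    for (int i = 0; i < 4; i++) {
        unsigned a      = addrs[i];
        unsigned offset = a & 0x3;         /* byte within the one-word block */
        unsigned index  = (a >> 2) & 0x1;  /* selects cache block b0 or b1   */
        unsigned tag    = a >> 3;          /* identifies which block is held */
        printf("addr 0x%02X -> tag 0x%X, index %u (b%u), offset %u\n",
               a, tag, index, index, offset);
    }
    return 0;
}
```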

Cache Example (2)
• Assume: r1 == 0, r2 == 1, r4 == 2; 1 cycle for a cache access; 5 cycles for a main memory access; 1 cycle for instruction execution
• At cycle 1 - PC = 0x00: fetch the instruction from memory; look in the cache - MISS - fetch from main memory (5-cycle penalty)
[Figure, repeated on the following slides: the initially empty two-block cache (b0, b1) beside main memory holding the program L: add r1,r1,r2 at 0x00, bne r4,r1,L at 0x04, sub r1,r1,r1 at 0x08, and j L at 0x0C, plus a running cycle-by-cycle trace table]

Cache Example (3)
• At cycle 6: execute instr. add r1,r1,r2 (r1 becomes 1)
• The fetched instruction (add r1,r1,r2, tag 0x…0) now occupies cache block b0
• Trace so far: cycles 1-5 FETCH 0x…0; cycle 6 add r1,r1,r2, r1 = 1

Cache Example (4)
• At cycle 6 - PC = 0x04: fetch the instruction from memory; look in the cache - MISS - fetch from main memory (5-cycle penalty)
• Trace: cycles 6-10 FETCH 0x…4

Cache Example (5)
• At cycle 11: execute instr. bne r4,r1,L (r4 = 2, r1 = 1, so the branch is taken back to L)
• The fetched bne r4,r1,L (tag 0x…1) now occupies cache block b1
• Trace: cycles 6-10 FETCH 0x…4; cycle 11 bne r4,r1,L

Cache Example (6)
• At cycle 11 - PC = 0x00: fetch the instruction from memory - HIT - the instruction is already in the cache (no penalty)
• Trace: cycle 11 FETCH 0x…0

Cache Example (7)
• At cycle 12: execute add r1,r1,r2 (r1 becomes 2)
• Trace: cycle 12 add r1,r1,r2, r1 = 2

Cache Example (8)
• At cycle 12 - PC = 0x04: fetch the instruction from memory - HIT - the instruction is already in the cache
• Trace: cycle 12 FETCH 0x…4

Cache Example (9)
• At cycle 13: execute instr. bne r4,r1,L - branch not taken (r4 == r1 == 2)

Cache Example (10)
• At cycle 13 - PC = 0x08: fetch the instruction from memory - MISS - not in the cache (5-cycle penalty)
• Trace: cycles 13-17 FETCH 0x…8

Cache Example (11)
• At cycle 17 - PC = 0x08: put the fetched instruction into the cache, replacing the existing instruction (the cache now holds sub r1,r1,r1 with tag 0x…2)

Cache Example (12)
• At cycle 18: execute sub r1,r1,r1 (r1 becomes 0)

Cache Example (13)
• At cycle 18 - PC = 0x0C: fetch the instruction from memory - MISS - not in the cache (5-cycle penalty)
• Trace: cycles 18-22 FETCH 0x…C

Cache Example (14)
• At cycle 22: put the fetched instruction into the cache, replacing the existing instruction (the cache now holds sub r1,r1,r1 with tag 0x…2 and j L with tag 0x…3)

Cache Example (15)
• At cycle 23: execute j L
• Complete trace (the hit/miss pattern is replayed by the sketch below):

  Cycle   Address   Op/Instr.       r1
  1-5               FETCH 0x…0
  6       0x…0      add r1,r1,r2    1
  6-10              FETCH 0x…4
  11      0x…4      bne r4,r1,L     1
  11                FETCH 0x…0
  12      0x…0      add r1,r1,r2    2
  12                FETCH 0x…4
  13      0x…4      bne r4,r1,L     2
  13-17             FETCH 0x…8
  18      0x…8      sub r1,r1,r1    0
  18-22             FETCH 0x…C
  23      0x…C      j L             0
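The whole walkthrough can be replayed by a few lines of C. This is a sketch under the slide's assumptions (two-block direct-mapped cache, one-word blocks, 1-cycle hit, 5-cycle main-memory fetch); the cycle printed for each fetch is the cycle in which that fetch completes, which matches the trace table above.

```c
#include <stdio.h>
#include <stdbool.h>

/* Replay the example's instruction-fetch stream against a two-block,
   direct-mapped, one-word-per-block cache.  Expected hit/miss pattern:
   miss, miss, hit, hit, miss, miss; the final fetch completes in cycle 22,
   and the last instruction executes in cycle 23. */
typedef struct { bool valid; unsigned tag; } Block;

int main(void) {
    Block cache[2] = { { false, 0 }, { false, 0 } };
    unsigned fetches[6] = { 0x00, 0x04, 0x00, 0x04, 0x08, 0x0C };
    const int HIT_TIME = 1, MISS_PENALTY = 5;
    int cycle = 0;

    for (int i = 0; i < 6; i++) {
        unsigned a     = fetches[i];
        unsigned index = (a >> 2) & 0x1;   /* which of the two blocks     */
        unsigned tag   = a >> 3;           /* identifies the memory block */
        bool hit = cache[index].valid && cache[index].tag == tag;

        cycle += hit ? HIT_TIME : MISS_PENALTY;
        if (!hit) {                        /* fetched block replaces the old one */
            cache[index].valid = true;
            cache[index].tag   = tag;
        }
        printf("FETCH 0x%02X: %s (completes in cycle %d)\n",
               a, hit ? "HIT " : "MISS", cycle);
    }
    printf("last instruction executes in cycle %d\n", cycle + 1);
    return 0;
}
```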

Compare No-cache vs. Cache

• NO CACHE (every fetch takes 5 cycles; the last instruction executes in cycle 31):
  1-5 FETCH 0x…0; 6 add r1,r1,r2
  6-10 FETCH 0x…4; 11 bne r4,r1,L
  11-15 FETCH 0x…0; 16 add r1,r1,r2
  16-20 FETCH 0x…4; 21 bne r4,r1,L
  21-25 FETCH 0x…8; 26 sub r1,r1,r1
  26-30 FETCH 0x…C; 31 j L

• CACHE (hits on the repeated fetches of 0x…0 and 0x…4; the last instruction executes in cycle 23):
  1-5 FETCH 0x…0 (M); 6 add r1,r1,r2
  6-10 FETCH 0x…4 (M); 11 bne r4,r1,L
  11 FETCH 0x…0 (H); 12 add r1,r1,r2
  12 FETCH 0x…4 (H); 13 bne r4,r1,L
  13-17 FETCH 0x…8 (M); 18 sub r1,r1,r1
  18-22 FETCH 0x…C (M); 23 j L
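The totals follow from a little arithmetic: the fetch of the next instruction overlaps with execution of the current one, so total time is the sum of the fetch times plus one cycle for the final execute. A sketch using the figures from the two traces above:

```c
#include <stdio.h>

/* Cycle totals for the comparison above: 6 instruction fetches, 5 cycles
   per memory fetch, 1 cycle per cache hit; with the cache, 4 fetches miss
   and 2 hit.  Total time = sum of fetch times + 1 cycle to execute the
   final instruction (fetch and execute overlap otherwise). */
int main(void) {
    int fetches = 6, miss_time = 5, hit_time = 1;
    int misses = 4, hits = 2;

    int no_cache   = fetches * miss_time + 1;                  /* 31 cycles */
    int with_cache = misses * miss_time + hits * hit_time + 1; /* 23 cycles */

    printf("no cache: %d cycles, with cache: %d cycles (%.2fx speedup)\n",
           no_cache, with_cache, (double)no_cache / with_cache);
    return 0;
}
```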

Cache Miss and the MIPS Pipeline
• Instruction fetch: the cache's tag compare happens in clock cycle 1, the miss is detected in cycle 2, and the pipeline stalls until the fetch completes and the pipeline restarts
[Figure: pipeline diagram showing clock cycle 1 followed, after an N-cycle stall, by clock cycles 2+N through 6+N]

Cache Miss and the MIPS Pipeline (cont'd)
• Load instruction: the tag compare happens in clock cycle 4, the miss is detected in cycle 5, and the pipeline stalls until the load completes and the pipeline restarts (the impact on CPI is sketched below)
[Figure: pipeline diagram showing clock cycles 1-5 followed, after an N-cycle stall, by clock cycles 5+N and 6+N]
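Those N stall cycles show up directly in the pipeline's effective CPI. The formula CPI = base CPI + misses per instruction × miss penalty is standard; the miss rates, instruction mix, and penalty below are assumptions chosen for illustration, not numbers from the slides.

```c
#include <stdio.h>

/* Effect of cache-miss stalls on a pipelined CPI.
   CPI = base CPI + (instruction misses + data misses) per instruction
                    * miss penalty.  All numbers are assumptions. */
int main(void) {
    double base_cpi     = 1.0;   /* ideal pipelined CPI                       */
    double i_miss_rate  = 0.02;  /* instruction-fetch misses per instruction  */
    double d_miss_rate  = 0.04;  /* data misses per load/store                */
    double mem_fraction = 0.35;  /* fraction of instructions that access data */
    double miss_penalty = 40.0;  /* stall cycles per miss (the N above)       */

    double cpi = base_cpi
               + i_miss_rate * miss_penalty
               + mem_fraction * d_miss_rate * miss_penalty;
    printf("effective CPI = %.2f\n", cpi);   /* 1.0 + 0.8 + 0.56 = 2.36 */
    return 0;
}
```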

Coming Up: Four Key Cache Questions
1. Where can a block be placed in the cache? (block placement)
2. How can a block be found in the cache? …using a tag (block identification)
3. Which block should be replaced on a miss? (block replacement)
4. What happens on a write? (write strategy)