Exploiting Memory Hierarchy Chapter 7


Exploiting Memory Hierarchy Chapter 7 B. Ramamurthy

Direct Mapped Cache: the Idea
In a direct-mapped cache, each main-memory address maps to exactly one cache slot, selected by its low-order address bits: all addresses whose low-order bits are 001 map to one cache slot, all addresses whose low-order bits are 101 map to another, and so on.

Cache Organization
Content addressable memory
Fully associative
Set associative
[Fig. 7.7: cache memory organization (address and data paths) contrasted with a regular memory organization.]
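To make these organizations concrete, here is a minimal sketch of a set-associative lookup (the 2-way geometry, one-word blocks, and all names below are illustrative, not taken from the figure). A direct-mapped cache is the special case of one way per set; a fully associative cache is the special case of a single set whose every way is searched.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_SETS 4
    #define WAYS     2    /* 2-way set associative */

    struct line { bool valid; uint32_t tag, data; };
    static struct line cache[NUM_SETS][WAYS];

    /* Look up a word address: pick the set, then compare every way's tag. */
    bool lookup(uint32_t word_addr, uint32_t *out) {
        uint32_t set = word_addr % NUM_SETS;
        uint32_t tag = word_addr / NUM_SETS;
        for (int way = 0; way < WAYS; way++) {
            if (cache[set][way].valid && cache[set][way].tag == tag) {
                *out = cache[set][way].data;   /* hit */
                return true;
            }
        }
        return false;                          /* miss */
    }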

Multi-word Cache Block
With ordinary one-word blocks, the word address splits into Tag | Index (block #) | byte # within word. With multi-word blocks, the address splits into Tag | Index (block #) | byte # within block: the index performs block selection, the stored tag (with its valid bit) is compared against the address tag, and the byte offset selects the word within the data block.
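As a concrete illustration, the fields can be pulled out of an address with shifts and masks. This is a minimal sketch; the field widths assume 32-bit addresses, 4-byte words, 16-word (64-byte) blocks, and 256 cache blocks (the FastMath geometry used later in this deck).

    #include <stdint.h>

    /* Assumed geometry: 64-byte blocks -> 6 offset bits,
       256 blocks -> 8 index bits, leaving 18 tag bits. */
    #define OFFSET_BITS 6
    #define INDEX_BITS  8

    static uint32_t block_offset(uint32_t addr) {
        return addr & ((1u << OFFSET_BITS) - 1);                 /* byte # within block */
    }

    static uint32_t cache_index(uint32_t addr) {
        return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); /* block selection */
    }

    static uint32_t cache_tag(uint32_t addr) {
        return addr >> (OFFSET_BITS + INDEX_BITS);               /* compared to stored tag */
    }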

Address → Cache block#
block# in main memory = floor(Address / #bytes per block)
cache block# = (block# in memory) % (#blocks in cache)
Example: with 4-byte blocks and an 8-block cache, floor(457/4) = 114 and 114 % 8 = 2, so address 457 maps to cache block 2.
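The same computation in code, using the slide's example values (4-byte blocks, an 8-block cache, address 457); a sketch, not production code:

    #include <stdio.h>

    int main(void) {
        unsigned address         = 457;
        unsigned bytes_per_block = 4;
        unsigned blocks_in_cache = 8;

        unsigned memory_block = address / bytes_per_block;       /* floor(457/4) = 114 */
        unsigned cache_block  = memory_block % blocks_in_cache;  /* 114 % 8 = 2 */

        printf("memory block %u -> cache block %u\n", memory_block, cache_block);
        return 0;
    }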

Handling Cache Misses
Send the original PC value (current PC – 4) to the memory.
Perform a read on main memory.
Write the cache entry: put the data from memory in the data portion of the entry, write the upper bits of the address into the tag field, and turn the valid bit on.
Restart the missed instruction.
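A software model of the fill step might look like the following. This is a sketch, not the hardware: the one-word blocks, the memory[] array, and all names are assumptions made for illustration.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_BLOCKS 8

    struct cache_line {
        bool     valid;
        uint32_t tag;
        uint32_t data;                       /* one-word blocks for simplicity */
    };

    static struct cache_line cache[NUM_BLOCKS];
    static uint32_t          memory[1024];   /* stand-in for main memory */

    /* On a miss: read the word from main memory, install the data,
       write the upper address bits into the tag, and set the valid bit. */
    uint32_t handle_miss(uint32_t addr) {
        uint32_t word  = addr >> 2;          /* byte address -> word address */
        uint32_t index = word % NUM_BLOCKS;  /* which cache entry */
        uint32_t tag   = word / NUM_BLOCKS;  /* upper bits */

        cache[index].data  = memory[word];   /* read from main memory */
        cache[index].tag   = tag;
        cache[index].valid = true;
        return cache[index].data;            /* the missed access can now restart */
    }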

Handling Writes
Write-through: a scheme in which writes always update both the cache and the memory, ensuring that the data is always consistent between the two.
Write-back: a scheme that handles writes by updating only the block in the cache, then writing the modified block back to main memory when the block is replaced.
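A minimal sketch contrasting the two policies on a direct-mapped, one-word-block cache (cache[], memory[], and the dirty bit used to track modified blocks are illustrative, not the slides' hardware):

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_BLOCKS 8

    struct line { bool valid, dirty; uint32_t tag, data; };
    static struct line cache[NUM_BLOCKS];
    static uint32_t    memory[1024];   /* word addresses assumed < 1024 */

    /* Write-through: update both the cache and memory on every store. */
    void store_write_through(uint32_t word_addr, uint32_t value) {
        struct line *l = &cache[word_addr % NUM_BLOCKS];
        l->valid = true;
        l->tag   = word_addr / NUM_BLOCKS;
        l->data  = value;
        memory[word_addr] = value;     /* memory stays consistent */
    }

    /* Write-back: update only the cache; memory is written only when
       a dirty block is evicted by a conflicting address. */
    void store_write_back(uint32_t word_addr, uint32_t value) {
        struct line *l  = &cache[word_addr % NUM_BLOCKS];
        uint32_t tag    = word_addr / NUM_BLOCKS;
        if (l->valid && l->dirty && l->tag != tag)   /* evict old block */
            memory[l->tag * NUM_BLOCKS + (word_addr % NUM_BLOCKS)] = l->data;
        l->valid = true;
        l->dirty = true;
        l->tag   = tag;
        l->data  = value;
    }

Write-back trades a dirty bit per block for less memory traffic when the same block is written repeatedly; write-through keeps memory simple and consistent at the cost of a memory access per store.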

Example SPEC2000
CPI is 1.0 with no misses. Each miss incurs 100 extra cycles, and a miss occurs on 10% of instructions.
Average CPI: 1 + 100 × 0.1 = 1 + 10 = 11 (not good!)
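The arithmetic generalizes to: average CPI = base CPI + misses per instruction × miss penalty. A quick check of the slide's numbers:

    #include <stdio.h>

    int main(void) {
        double base_cpi     = 1.0;
        double miss_rate    = 0.10;   /* misses per instruction */
        double miss_penalty = 100.0;  /* extra cycles per miss */

        double cpi = base_cpi + miss_rate * miss_penalty;
        printf("average CPI = %.1f\n", cpi);   /* prints 11.0 */
        return 0;
    }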

An Example Cache: The Intrinsity FastMath processor
12-stage pipeline
When operating at peak speed, the processor can request both an instruction and a data word on every clock.
Separate instruction and data caches are used. Each cache is 16 KB (4K words) with 16-word blocks.

[Fig. 7.9: the FastMath cache, 256 blocks with 16 words per block.]
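The figure's geometry follows from the sizes on the previous slide; a quick sanity check, assuming 32-bit addresses and 4-byte words:

    #include <stdio.h>

    int main(void) {
        unsigned cache_bytes     = 16 * 1024;  /* 16 KB per cache */
        unsigned words_per_block = 16;
        unsigned bytes_per_word  = 4;

        unsigned block_bytes = words_per_block * bytes_per_word;  /* 64 */
        unsigned num_blocks  = cache_bytes / block_bytes;         /* 256 */
        unsigned offset_bits = 6;    /* log2(64)  */
        unsigned index_bits  = 8;    /* log2(256) */
        unsigned tag_bits    = 32 - offset_bits - index_bits;     /* 18 */

        printf("%u blocks, %u offset bits, %u index bits, %u tag bits\n",
               num_blocks, offset_bits, index_bits, tag_bits);
        return 0;
    }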