Chapter 5 Memory CSE 820.

Presentation transcript:

Chapter 5 Memory CSE 820

It’s the latency, stupid!
[Figure: CPU and memory performance plotted against time]
Michigan State University Computer Science and Engineering

Vocabulary: cache, virtual memory, memory stall cycles, direct mapped, valid bit, block address, write through

Vocabulary: instruction cache, average memory access time, cache hit, page, miss penalty, fully associative, dirty bit

Vocabulary: block offset, write back, data cache, hit time, cache miss, page fault, miss rate

Vocabulary: N-way set associative, least-recently used, tag field, write allocate, unified cache, misses per instruction, block

Vocabulary: locality, address trace, set, random replacement, index field, no-write allocate, write buffer, write stall

Equations
CPU execution time = (CPU cycles + Memory-stall cycles) x Clock cycle time
Memory-stall cycles = Misses x Miss penalty
= IC x (Misses / Instruction) x Miss penalty
= IC x (Memory accesses / Instruction) x Miss rate x Miss penalty
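The two equations above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and every number in the example (instruction count, accesses per instruction, miss rate, penalty, clock period) are assumptions chosen for readability, not values from the slides.

```python
def memory_stall_cycles(instruction_count, mem_accesses_per_instr, miss_rate, miss_penalty):
    """Memory-stall cycles = IC x (memory accesses / instruction) x miss rate x miss penalty."""
    return instruction_count * mem_accesses_per_instr * miss_rate * miss_penalty


def cpu_execution_time(cpu_cycles, stall_cycles, clock_cycle_time):
    """CPU execution time = (CPU cycles + memory-stall cycles) x clock cycle time."""
    return (cpu_cycles + stall_cycles) * clock_cycle_time


# Hypothetical workload: 1M instructions, base CPI of 1.0, 1.5 memory accesses
# per instruction, 2% miss rate, 100-cycle miss penalty, 1 ns clock.
ic = 1_000_000
stalls = memory_stall_cycles(ic, mem_accesses_per_instr=1.5, miss_rate=0.02, miss_penalty=100)
time_s = cpu_execution_time(cpu_cycles=ic * 1.0, stall_cycles=stalls, clock_cycle_time=1e-9)
print(f"stall cycles: {stalls:.0f}, execution time: {time_s * 1e3:.3f} ms")
```

With these assumed numbers the stall cycles are three times the base CPU cycles, which is the point of the opening slide: memory latency, not the processor, dominates execution time.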

Hierarchy Questions
Block placement: Where can a block be placed?
Block identification: How is a block found?
Block replacement: Which block should be replaced on a miss?
Write strategy: What happens on a write?

Cache Q1: Where to place?
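The placement question has three standard answers, all of which appear in the vocabulary list: direct mapped, N-way set associative, and fully associative. A minimal sketch, assuming frames are numbered consecutively within each set and that the set is chosen as block address mod number of sets; the function name `candidate_frames` is hypothetical.

```python
def candidate_frames(block_address, num_blocks, associativity):
    """Return the cache frames in which a memory block may be placed.

    associativity == 1          -> direct mapped (exactly one frame)
    associativity == num_blocks -> fully associative (any frame)
    otherwise                   -> N-way set associative (any frame of one set)
    """
    num_sets = num_blocks // associativity
    set_index = block_address % num_sets      # set = block address mod number of sets
    first = set_index * associativity         # first frame belonging to that set
    return list(range(first, first + associativity))


# Memory block 12 in an 8-block cache under the three organizations.
print(candidate_frames(12, 8, 1))   # direct mapped
print(candidate_frames(12, 8, 2))   # 2-way set associative
print(candidate_frames(12, 8, 8))   # fully associative
```

Direct mapped and fully associative are the two extremes of the same scheme: one frame per set, or one set holding every frame.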

Cache Q2: How to find it?
Address fields: tag | index | block offset (tag and index together form the block address)
Index selects the set
Tag is compared against all blocks in the set
Offset selects the data within the block
Valid bit marks whether an entry holds valid data
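The field split can be sketched with shifts and masks. This is an illustration under stated assumptions: block size and number of sets are powers of two, and the names `split_address` and `is_hit` as well as the example geometry (64-byte blocks, 512 sets) are hypothetical, not taken from the slides.

```python
def split_address(address, block_size, num_sets):
    """Split an address into (tag, index, offset); sizes must be powers of two."""
    offset_bits = block_size.bit_length() - 1      # log2(block_size)
    index_bits = num_sets.bit_length() - 1         # log2(num_sets)
    offset = address & (block_size - 1)            # selects data within the block
    index = (address >> offset_bits) & (num_sets - 1)   # selects the set
    tag = address >> (offset_bits + index_bits)    # compared against the set's tags
    return tag, index, offset


def is_hit(cache_set, tag):
    """cache_set is a list of (valid, stored_tag) pairs, one per way."""
    return any(valid and stored_tag == tag for valid, stored_tag in cache_set)


tag, index, offset = split_address(0x12345, block_size=64, num_sets=512)
print(tag, index, offset)
print(is_hit([(True, 2), (False, 7)], tag))
```

Note that an entry whose valid bit is clear never produces a hit, even if its stored tag happens to match.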

Cache Q3: Replacement?
Random: simplest
FIFO
LRU: approximation; outperforms others for large caches
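To make the LRU policy concrete, here is a small software model of one cache set. It is a sketch only: real hardware usually approximates LRU rather than tracking exact order, and the class name `LRUSet` and the access trace are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with exact least-recently-used replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()    # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then brought in)."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)       # mark as most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)    # evict the least recently used block
        self.blocks[tag] = None
        return False


# Hypothetical trace of tags mapping to the same 2-way set.
trace = [1, 2, 1, 3, 2]
lru_set = LRUSet(ways=2)
hits = [lru_set.access(t) for t in trace]
print(hits, list(lru_set.blocks))
```

Swapping the eviction line for a random choice or for oldest-inserted order would model the random and FIFO policies with the same interface.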

Cache Q4: What happens on a write?
Reads are most common and easiest
Write through
Write back: the block is written to memory on replacement; a dirty bit marks modified blocks
Write allocate
No-write allocate
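The difference between write through and write back shows up in how many writes reach memory. The following sketch counts them for repeated stores that hit a single cached block, followed by that block's replacement. The function name and the scenario are hypothetical, and write buffers and allocation policy are deliberately left out.

```python
def memory_writes(policy, stores_to_same_block):
    """Count writes reaching memory for stores that hit one cached block,
    followed by replacement of that block."""
    dirty = False
    writes = 0
    for _ in range(stores_to_same_block):
        if policy == "write-through":
            writes += 1        # every store also updates memory
        elif policy == "write-back":
            dirty = True       # only the cached copy changes; set the dirty bit
        else:
            raise ValueError(f"unknown policy: {policy}")
    if dirty:
        writes += 1            # on replacement, a dirty block is written back once
    return writes


print(memory_writes("write-through", 3), memory_writes("write-back", 3))
```

A clean block (dirty bit never set) costs no memory write on replacement, which is why the dirty bit is worth its storage.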

Alpha 21264 Cache: Go through it on your own

Equations
Fig 5.9 has 12 memory performance equations. Most are variations on:
CPU time = IC x CPI x Clock cycle time
Average memory access time = Hit time + Miss rate x Miss penalty
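A quick numeric check of the average memory access time formula. The function name and the numbers are assumptions for illustration; times are in clock cycles.

```python
def avg_mem_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
amat = avg_mem_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)
print(f"AMAT = {amat:.2f} cycles")
```

Each of the three terms is a separate lever: lowering hit time, miss rate, or miss penalty all reduce the average.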

Average memory access time = Hit time + Miss rate x Miss penalty
17 Cache Optimizations
Reduce miss penalty: multilevel caches, critical word first, read misses before writes, merging writes, victim cache
Reduce miss rate: larger blocks, larger caches, higher associativity, way prediction, compiler optimizations
Reduce miss rate and penalty with parallelism: non-blocking caches, hardware prefetch, software prefetch
Reduce hit time: small and simple caches, avoiding address translation, pipelined cache access, trace caches
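As one example of reducing miss penalty, a second-level cache replaces the L1 miss penalty with its own hit time plus its own (local) miss rate times the memory penalty. The sketch below applies the access-time formula twice. All names and numbers are assumptions for illustration, with times in clock cycles.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_time + miss_rate * miss_penalty


def two_level_amat(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, memory_penalty):
    """The L1 miss penalty is the average access time of the L2 cache."""
    l1_miss_penalty = amat(l2_hit, l2_local_miss_rate, memory_penalty)
    return amat(l1_hit, l1_miss_rate, l1_miss_penalty)


# Hypothetical hierarchy: L1 hits in 1 cycle with a 5% miss rate; L2 hits in
# 10 cycles with a 50% local miss rate; main memory costs 100 cycles.
single = amat(1, 0.05, 100)
multi = two_level_amat(1, 0.05, 10, 0.5, 100)
print(f"single level: {single:.2f} cycles, two levels: {multi:.2f} cycles")
```

Even with an L2 that misses half the time, the assumed numbers cut the average access time noticeably, because the L2 shortens the penalty seen by every L1 miss.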