ECE 313 - Computer Organization: Memory Hierarchy 2
Prof. John Nestor, ECE Department, Lafayette College, Easton, Pennsylvania 18042


ECE 313 - Computer Organization: Memory Hierarchy 2
Prof. John Nestor, ECE Department, Lafayette College, Easton, Pennsylvania
Feb 2005
Reading: , 7.9*
Homework: Look over , 7.9, 7.12 for discussion on Friday
Portions of these slides are derived from:
- Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
- Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
- Dave Patterson's CS 152 Slides - Fall 1997 © UCB
- Rob Rutenbar's Slides - Fall 1999 CMU
- Other sources as noted

Outline - Memory Systems
- Overview
  - Motivation
  - General Structure and Terminology
- Memory Technology
  - Static RAM
  - Dynamic RAM
  - Disks
- Cache Memory
- Virtual Memory

Four Key Cache Questions:
1. Where can a block be placed in the cache? (block placement)
2. How can a block be found in the cache? ...using a tag (block identification)
3. Which block should be replaced on a miss? (block replacement)
4. What happens on a write? (write strategy)

Q1: Block Placement
Where can a block be placed in the cache?
- In one predetermined place - direct-mapped
  - Use a fragment of the address to calculate the block location in the cache
  - Compare the cache block tag to test if the block is present
- Anywhere in the cache - fully associative
  - Compare the tag to every block in the cache
- In a limited set of places - set-associative
  - Use an address fragment to calculate the set (like direct-mapped)
  - Place in any block within the set
  - Compare the tag to every block in the set
  - Hybrid of direct-mapped and fully associative

Direct Mapped Block Placement
[Figure: memory blocks mapping into a direct-mapped cache]
Each memory address maps to exactly one cache block:
location = (block address) MOD (# blocks in cache)

Fully Associative Block Placement
[Figure: memory blocks mapping anywhere in a fully associative cache]
Arbitrary block mapping:
location = any cache block

Set-Associative Block Placement
[Figure: memory blocks mapping into a cache organized as Sets 0-3]
Each address maps to a set:
location = (block address) MOD (# sets in cache), at any (arbitrary) location within that set
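The three placement policies on the last few slides reduce to simple modular arithmetic. The C sketch below shows how each one narrows down where a block may live; the block count, set count, and example address are illustrative assumptions, not values from the slides.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative cache geometry (assumed, not from the slides). */
#define NUM_BLOCKS 8   /* total cache blocks                 */
#define NUM_SETS   4   /* sets for the set-associative case  */

int main(void) {
    uint32_t block_addr = 0x1C;   /* example memory block address */

    /* Direct-mapped: exactly one legal location. */
    uint32_t dm_index = block_addr % NUM_BLOCKS;

    /* Set-associative: one legal set, any way within it. */
    uint32_t set_index = block_addr % NUM_SETS;

    /* Fully associative: any of the NUM_BLOCKS blocks is legal. */

    printf("direct-mapped block = %u\n", (unsigned)dm_index);
    printf("set-associative set = %u (any way in the set)\n", (unsigned)set_index);
    printf("fully associative   = any of %d blocks\n", NUM_BLOCKS);
    return 0;
}
```

Note that direct-mapped is the special case of one block per set, and fully associative is the special case of a single set holding every block.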

Q2: Block Identification
- Every cache block has an address tag that identifies its location in memory
- Hit when the tag and the address of the desired word match (comparison by hardware)
- Q: What happens when a cache block is empty?
  A: Mark this condition with a valid bit
Example entry:  Valid = 1  Tag = 0x00001C0  Data = 0xff083c2d

Direct-Mapped Cache Design
[Figure: cache SRAM organization. The address is split into Tag, Cache Index, and Byte Offset fields; the index selects one SRAM entry holding a Valid bit, a Tag, and a 32-bit Data word; a comparator checks the stored tag against the address tag, and HIT is asserted when the tags match and Valid = 1.]
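To make the datapath in the figure concrete, here is a minimal C sketch of a direct-mapped lookup. The geometry (16 lines of one 4-byte word) and the structure and field names are assumptions for illustration, not the slide's actual design.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed toy geometry: 16 lines of one 4-byte word each. */
#define NUM_LINES   16
#define OFFSET_BITS 2          /* byte offset within a 4-byte word */
#define INDEX_BITS  4          /* log2(NUM_LINES)                  */

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data;
} cache_line_t;

static cache_line_t cache[NUM_LINES];

/* Returns true on a hit and writes the cached word to *out. */
bool cache_lookup(uint32_t addr, uint32_t *out) {
    uint32_t index = (addr >> OFFSET_BITS) & (NUM_LINES - 1);
    uint32_t tag   = addr >> (OFFSET_BITS + INDEX_BITS);

    cache_line_t *line = &cache[index];
    if (line->valid && line->tag == tag) {   /* comparator AND valid bit */
        *out = line->data;
        return true;                         /* HIT  */
    }
    return false;                            /* MISS */
}
```

The index decode, tag compare, and valid-bit check correspond directly to the SRAM addressing, comparator, and AND gate in the figure.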

Set-Associative Cache Design
- Key idea:
  - Divide the cache into sets
  - Allow a block to go anywhere within its set
- Advantage:
  - Better hit rate
- Disadvantages:
  - More tag bits
  - More hardware
  - Higher access time
[Figure: A Four-Way Set-Associative Cache (Fig. 7.17)]

Fully Associative Cache Design
- Key idea: a single set containing every block (set size = number of blocks in the cache)
- One comparator required for each block
- No address decoding
- Practical only for small caches due to hardware demands
[Figure: tag-in/data-out array with a tag field, data field, and comparator per block]

Q3: Block Replacement
- On a miss, data must be read from memory. So, where do we put the new data?
- Direct-mapped cache: must place it in the fixed location
- Set-associative, fully associative: can pick within the set
  - Random: replace an arbitrary block
  - Least recently used (LRU): replace the "least popular" block (best way)
    - Easy for 2-way set-associative: one bit per set
    - Harder for n-way set-associative: often "pseudo-LRU"
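As a concrete illustration of the one-bit-per-set LRU mentioned above, here is a hedged C sketch for a 2-way set-associative cache. The structures and names are assumptions for illustration; only the tags are tracked, and data handling is omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS 64   /* assumed geometry for illustration */

typedef struct {
    bool     valid;
    uint32_t tag;
} way_t;

typedef struct {
    way_t way[2];
    int   lru;        /* index of the least recently used way (0 or 1) */
} set_t;

static set_t sets[NUM_SETS];

/* Access a block; returns true on a hit. On a miss, an invalid way is
 * filled if available, otherwise the LRU way is replaced. */
bool access_block(uint32_t set_index, uint32_t tag) {
    set_t *s = &sets[set_index];

    for (int w = 0; w < 2; w++) {
        if (s->way[w].valid && s->way[w].tag == tag) {
            s->lru = 1 - w;          /* the other way is now least recent */
            return true;             /* hit */
        }
    }

    /* Miss: pick a victim and update the single LRU bit. */
    int victim = !s->way[0].valid ? 0 : (!s->way[1].valid ? 1 : s->lru);
    s->way[victim].valid = true;
    s->way[victim].tag   = tag;
    s->lru = 1 - victim;
    return false;
}
```

With only two ways, one bit per set is enough to record exact LRU order, which is why the slide calls the 2-way case easy.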

Q4: Write Strategy
What happens on a write?
- Write-through: write to memory, stall the processor until done
- Write buffer: place the write in a buffer (allows the pipeline to continue*)
- Write-back: delay the write to memory until the block is replaced in the cache
- Special considerations when using DMA or multiprocessors (coherence between caches)
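The difference between the two policies shows up on a store and on an eviction. The C sketch below contrasts them; the line structure, dirty bit, and memory_write stub are illustrative assumptions, not the slides' design.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative cache line holding one word. */
typedef struct {
    bool     valid;
    bool     dirty;      /* only meaningful for write-back */
    uint32_t tag;
    uint32_t data;
} line_t;

/* Stand-in for the memory interface, for illustration only. */
static void memory_write(uint32_t addr, uint32_t value) {
    printf("memory[0x%08x] <- 0x%08x\n", (unsigned)addr, (unsigned)value);
}

/* Write-through: update the cache line and memory on every store;
 * the processor stalls unless a write buffer absorbs the write. */
void store_write_through(line_t *line, uint32_t addr, uint32_t value) {
    line->data = value;
    memory_write(addr, value);
}

/* Write-back: update only the cache line now and mark it dirty. */
void store_write_back(line_t *line, uint32_t value) {
    line->data  = value;
    line->dirty = true;
}

/* The delayed memory write happens when a dirty line is replaced. */
void evict_write_back(line_t *line, uint32_t addr) {
    if (line->valid && line->dirty) {
        memory_write(addr, line->data);
        line->dirty = false;
    }
}
```

The dirty bit is what lets write-back defer memory traffic, and it is also why DMA devices and other processors need coherence machinery to see up-to-date data.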

Example: DECStation 3100 Cache
- MIPS R2000 workstation
- Pipelined implementation
- Separate instruction and data caches to allow concurrent access
- Direct-mapped cache
- Size: 64 KB (16K words)
- Write buffer (4-word buffer)
[Old Fig. 7.8]

Miss Rates - DECStation 3100 Cache (Old Fig. 7.10)

Program   Instr. Miss Rate   Data Miss Rate   Combined Miss Rate
gcc       6.1%               2.1%             5.4%
spice     1.2%               1.3%             1.2%
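The combined column is an access-weighted average of the instruction and data miss rates. The weight is not given on the slide; backing it out of the gcc row suggests instruction references are roughly 83% of all memory accesses in that workload. A minimal C sketch of the calculation, with that fraction labeled as an inferred assumption:

```c
#include <stdio.h>

int main(void) {
    double instr_miss = 0.061, data_miss = 0.021;  /* gcc, from the table          */
    double instr_frac = 0.83;                      /* assumed, inferred from table */

    double combined = instr_miss * instr_frac + data_miss * (1.0 - instr_frac);
    printf("combined miss rate ~ %.1f%%\n", combined * 100.0);   /* about 5.4%    */
    return 0;
}
```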

Miss Rates vs. Block Size - DECStation 3100 (Fig.)

Program   Block Size (Words)   Instr. Miss Rate   Data Miss Rate   Combined Miss Rate
gcc       1                    6.1%               2.1%             5.4%
gcc       4                    2.0%               1.7%             1.9%
spice     1                    1.2%               1.3%             1.2%
spice     4                    0.3%               0.6%             0.4%

Variation: Larger Block Size
- Key advantage: takes advantage of spatial locality
- Disadvantages: more competition for the cache (fewer blocks fit), more complicated writes
[Fig. 7.10]

Example - Intrinsity FastMATH Cache
- Separate 16 KB instruction and data caches
- 16-word blocks (see previous page)
- Results on the SPECINT2000 benchmark (Fig.):

  Instr. Miss Rate   Data Miss Rate   Combined Miss Rate
  0.4%               11.4%            3.2%

Miss Rates vs. Block Size
- Note that the miss rate can increase for larger block sizes
[Fig.]

Example: Caches in the Pentium 4
- L1 Trace cache: holds decoded instructions
- L1 Data cache: 64-byte block size, write-through
- L2 cache: 128-byte block size, write-back
Source: "The Microarchitecture of the Pentium® 4 Processor", Intel Technology Journal, First Quarter

Summary: Cache Memory
- Speeds up access by storing recently used data
- Structure has a strong impact on performance
- Modern microprocessors use on-chip cache (sometimes multilevel caches)

Outline - Memory Systems
- Overview
  - Motivation
  - General Structure and Terminology
- Memory Technology
  - Static RAM
  - Dynamic RAM
- Cache Memory
- Virtual Memory

Virtual Memory
- Key idea: simulate a larger physical memory than is actually available
- General approach:
  - Break the address space up into pages
  - Each program accesses a working set of pages
  - Store pages:
    - In physical memory as space permits
    - On disk when no space is left in physical memory
  - Access pages using a virtual address
[Figure: virtual address split into Page Number and Offset, selecting among Page 0, Page 1, Page 2, ...]
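A minimal C sketch of the address split shown in the figure, assuming 4 KB pages and 32-bit virtual addresses (both are assumptions for illustration; the slides do not fix a page size):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                     /* assumed 4 KB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)     /* 4096 bytes         */

int main(void) {
    uint32_t vaddr  = 0x00403A2Cu;               /* arbitrary example address */
    uint32_t vpn    = vaddr >> PAGE_SHIFT;       /* virtual page number       */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);   /* offset within the page    */

    printf("vaddr=0x%08x -> VPN=0x%05x offset=0x%03x\n",
           (unsigned)vaddr, (unsigned)vpn, (unsigned)offset);
    return 0;
}
```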

Virtual Memory
Why do this?
- So a program can run as if it has a larger memory
- So multiple programs can run in the same memory with protected address spaces
[Figure: virtual addresses translated to physical addresses or disk addresses via Address Translation]

Virtual Memory
- Mapping from virtual to physical address
[Figure (Fig. 7.20): the virtual page number is translated into a physical page number; the page offset passes through unchanged]
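Continuing the earlier address-split sketch, translation replaces the VPN with a physical page number (PPN) and reattaches the unchanged offset. The PPN value below is a made-up example standing in for a page-table lookup:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12   /* assumed 4 KB pages, as in the earlier sketch */

int main(void) {
    uint32_t vaddr  = 0x00403A2Cu;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    uint32_t ppn    = 0x1F2u;                       /* example physical page */

    uint32_t paddr  = (ppn << PAGE_SHIFT) | offset; /* physical address */
    printf("0x%08x -> 0x%08x\n", (unsigned)vaddr, (unsigned)paddr);
    return 0;
}
```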

Virtual Address Translation
[Fig. 7.21]

Virtual Address Translation
- What happens during a memory access?
  - Map the virtual address into a physical address using the page table
  - If the page is in memory: access physical memory
  - If the page is on disk: page fault
    - Suspend the program
    - Get the operating system to load the page from disk
- The page table is in memory - this slows down access!
- Translation lookaside buffer (TLB): a special cache of translated addresses (speeds access back up)
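A hedged C sketch of that flow: try the TLB first, fall back to the in-memory page table, and take a page fault if the page is not resident. The structure layouts, sizes, refill policy, and function names are illustrative assumptions, not MIPS or textbook definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12        /* assumed 4 KB pages         */
#define TLB_ENTRIES 16        /* assumed fully associative  */
#define NUM_VPAGES  1024      /* assumed virtual page count */

typedef struct { bool valid; uint32_t vpn, ppn; } tlb_entry_t;
typedef struct { bool present; uint32_t ppn; }   pte_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static pte_t       page_table[NUM_VPAGES];   /* lives in main memory */

/* Stand-in for the OS page-fault handler (would load the page from disk). */
static void page_fault(uint32_t vpn) { (void)vpn; }

/* Translate a virtual address into a physical address. */
uint32_t translate(uint32_t vaddr) {
    uint32_t vpn    = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

    /* 1. TLB lookup: fast path, no extra memory access for the page table. */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (tlb[i].ppn << PAGE_SHIFT) | offset;

    /* 2. TLB miss: walk the in-memory page table (slow). */
    pte_t pte = page_table[vpn];
    if (!pte.present) {
        page_fault(vpn);            /* 3. suspend program; OS loads the page */
        pte = page_table[vpn];      /* retry the lookup after the reload     */
    }

    /* Refill one TLB entry (entry 0 here for simplicity) and finish. */
    tlb[0] = (tlb_entry_t){ .valid = true, .vpn = vpn, .ppn = pte.ppn };
    return (pte.ppn << PAGE_SHIFT) | offset;
}
```

The common case never reaches step 2, which is how the TLB hides the cost of keeping the page table in memory.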

TLB Structure
[Fig. 7.23]

TLB / Cache Interaction
[Fig. 7.24]

Virtual Memory and Protection
- Important function of virtual memory: protection
  - Allow sharing of a single main memory by multiple processes
  - Provide each process with its own address space
  - Protect each process from memory accesses by other processes
- Basic mechanism: two modes of operation
  - User mode: allows access only to the user address space
  - Supervisor (kernel) mode: allows access to the OS address space
  - System call: allows the processor to change mode
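One common way such protection is enforced is to tag each page-table entry with permission bits and check them against the current mode on every translation. This is a conceptual sketch with hypothetical field names, not the specific mechanism described in the textbook.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODE_USER, MODE_SUPERVISOR } cpu_mode_t;

/* Hypothetical PTE with protection bits; field names are illustrative. */
typedef struct {
    bool     present;
    bool     user_accessible;   /* page may be touched in user mode */
    bool     writable;          /* page may be written              */
    uint32_t ppn;
} pte_t;

/* Returns true if the access is allowed; otherwise the hardware would
 * raise an exception and trap to the operating system. */
bool access_allowed(const pte_t *pte, cpu_mode_t mode, bool is_write) {
    if (!pte->present)
        return false;                        /* page fault instead          */
    if (mode == MODE_USER && !pte->user_accessible)
        return false;                        /* user touching OS space      */
    if (is_write && !pte->writable)
        return false;                        /* write to a read-only page   */
    return true;
}
```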

Summary - Virtual Memory
- Bottom level of the memory hierarchy for programs
- Used in all general-purpose architectures
- Relies heavily on the OS for support

Roadmap for the term: major topics
- Overview / Abstractions and Technology
- Instruction sets
- Logic & arithmetic
- Performance
- Processor implementation
  - Single-cycle implementation
  - Multicycle implementation
  - Pipelined implementation
- Memory systems
- Input/Output