Multilevel Caches and Virtual Memory

Multilevel Caches
Microprocessors are getting faster and include a small, high-speed cache on the same chip. A second-level cache, off chip, larger and slower, interfaces with main memory. The same cache techniques can be used.
Average memory access time = Hit time(L1) + Miss rate(L1) × Miss penalty(L1)
Miss penalty(L1) = Hit time(L2) + Miss rate(L2) × Miss penalty(L2)
Average memory access time = Hit time(L1) + Miss rate(L1) × Hit time(L2) + Miss rate(L1) × Miss rate(L2) × Miss penalty(L2)

Example: Hit time(L1) = 1 clock cycle; Miss rate(L1) = 10%; Hit time(L2) = 10 clock cycles; Miss rate(L2) = 20%; Miss penalty(L2) = 100 clock cycles.
With no secondary cache: Average memory access time = 1 + 0.1 × 100 = 11 clock cycles.
With the secondary cache: Average memory access time = 1 + 0.1 × 10 + 0.1 × 0.2 × 100 = 4 clock cycles.
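To double-check the arithmetic, the two formulas can be evaluated directly. A minimal sketch (the function names are ours, not from the lecture):

```python
def amat_one_level(hit_l1, miss_rate_l1, miss_penalty_l1):
    # Average memory access time with only an L1 cache.
    return hit_l1 + miss_rate_l1 * miss_penalty_l1

def amat_two_level(hit_l1, miss_rate_l1, hit_l2, miss_rate_l2, miss_penalty_l2):
    # Substitute miss penalty(L1) = hit time(L2) + miss rate(L2) * miss penalty(L2).
    return (hit_l1
            + miss_rate_l1 * hit_l2
            + miss_rate_l1 * miss_rate_l2 * miss_penalty_l2)

print(amat_one_level(1, 0.10, 100))            # ~11 cycles without an L2
print(amat_two_level(1, 0.10, 10, 0.20, 100))  # ~4 cycles with an L2
```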

Characteristics of programs that make efficient use of cache memory:
- Code has tight loops with lots of reuse.
- Code minimizes jumps and branches to far-away addresses.
- Data marches through arrays (sequential access).

Processor memory hierarchy: Processor ↔ Cache memory (SRAM, transfer in 1 clock cycle) ↔ Main memory (DRAM, transfer in > 10 clock cycles).

Virtual memory hierarchy: Processor ↔ Cache ↔ Main memory (DRAM, transfer in 10+ clock cycles) ↔ Hard disk (transfer in > 100,000 clock cycles).

Virtual Memory: The Illusion of an Unlimited Amount of Memory
1. Invented to let programmers use the full addressing capability of the processor, not limited by physical RAM.
2. Enabled multiprogramming: the computer runs multiple processes (programs and environments), each with its own address space (time sharing).
3. Provides a structure for protection: pages can be restricted to assigned processes, with types and levels of access.
4. Provides simple relocation by mapping to pages in physical memory.

Virtual address (4 GB): virtual page number | page offset. Translation maps the virtual page number to a physical page number; the page offset (page = 2^12 = 4 KB, so 12 bits) passes through unchanged. Physical address (512 MB): physical page number | page offset.
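Translation touches only the upper address bits; the page offset passes through. A sketch using the 12-bit offset above (the page-table mapping here is made up for illustration):

```python
PAGE_OFFSET_BITS = 12                # page = 2**12 = 4 KB
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def translate(virtual_addr, page_table):
    # Split the virtual address into (virtual page number, page offset).
    vpn = virtual_addr >> PAGE_OFFSET_BITS
    offset = virtual_addr & (PAGE_SIZE - 1)
    ppn = page_table[vpn]            # a KeyError here would model a page fault
    # Reassemble: physical page number concatenated with the unchanged offset.
    return (ppn << PAGE_OFFSET_BITS) | offset

page_table = {0x12345: 0x042}        # hypothetical VPN -> PPN mapping
print(hex(translate(0x12345678, page_table)))  # 0x42678
```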

Virtual address (virtual page number | page offset) → page table → physical address in main memory, backed by the hard disk. A page corresponds to a cache block. What is the miss penalty? A miss here is called a page fault.

1. Where can a page (block) be placed in main memory?
The miss penalty is very high, so minimize the miss rate. The long access time of the hard disk gives the OS time to control where pages are placed in main memory. The lowest miss rate results from allowing pages to be located anywhere in main memory (fully associative).

2. How is a page (block) found if it is in main memory?
A page is located through the page table, which holds the physical address of the page in main memory in the entry indexed by the virtual page number. (The mapping is managed in software via the page table, and no tag is required.) The page table has an entry for every virtual page number, which can mean 1 M entries, so the page table is typically stored in main memory. Accessing main memory twice per reference takes too long, so another cache must be added.

3. Which page (block) should be replaced on a virtual memory miss (page fault)?
Replacement is under OS control, so clever algorithms are feasible. The OS usually tries to approximate a least recently used (LRU) strategy, and may keep a use bit, set whenever a page is accessed, to help estimate LRU.

4. What happens on a write?
Write-through is not feasible because of the very slow hard disk, so the write strategy is always write-back:
- The word is written only to the page in main memory (the "cache").
- The modified page is written to the hard disk only when it is replaced; a dirty bit indicates whether a page has been altered.
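The write-back behavior can be sketched as follows; the Page class and the disk hook are invented for illustration:

```python
class Page:
    def __init__(self, words):
        self.words = words
        self.dirty = False                 # has this page been altered?

def write_word(page, offset, value):
    # Writes go only to the copy in main memory; mark the page dirty.
    page.words[offset] = value
    page.dirty = True

def replace(page, write_to_disk):
    # On replacement, the page goes back to disk only if it was modified.
    if page.dirty:
        write_to_disk(page.words)
        page.dirty = False

p = Page([0, 0, 0, 0])
write_word(p, 2, 99)
replace(p, write_to_disk=print)            # dirty, so the page is written out
```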

Selecting a Page Size
Reasons pages should be larger:
1. A lower miss rate (though beyond a point larger pages give no improvement, or even an increase).
2. A smaller page table.
3. More efficient transfers to and from the hard disk (and over networks).
Reasons pages should be smaller:
1. Less wasted space, since each process requires several primary pages.
2. Shorter start-up time for small processes.
Typical page sizes are 4 KB to 64 KB.

Page table entry: physical page address plus valid, use, and dirty bits. The page table maps the virtual page number either to a page in main memory or to its address on the hard disk.

The page table is large and usually sits in DRAM main memory, which is too slow to consult on every reference. So provide a cache for the page table: the TLB (translation-lookaside buffer).

TLB – translation-lookaside buffer. Each TLB entry holds a tag, valid/dirty/use bits, and a physical page address; on a lookup, the virtual page number is compared against the tags. The page table register plus the virtual page number addresses the full page table in main memory, backed by the hard disk.
- Hit, read: read main memory and set the use bit in the TLB.
- Hit, write: write main memory and set the dirty bit in the TLB.
- Miss: either a TLB miss (translation not cached) or a page fault (page not in main memory).

TLB Miss –
1. Select the TLB entry to be replaced (LRU or random).
2. Write the replaced entry's use and dirty bits back to its page table entry.
3. Access the requested entry in the page table:
- If the page is in main memory (valid), load the translation from the page table into the TLB and try again.
- If the page is not in main memory (not valid), it is a page fault: write the replaced page back to disk if dirty, move the requested page from disk to main memory, update the page table and TLB, and try again.
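The steps above can be sketched in Python; every data structure here (the TLB as a dict, the page-table entry class, the disk routine) is invented for illustration, not taken from the lecture:

```python
import random

class PageTableEntry:
    def __init__(self, frame=None):
        self.valid = frame is not None    # is the page resident in main memory?
        self.frame = frame                # physical page number when valid
        self.use = False
        self.dirty = False

def fetch_page_from_disk(vpn):
    return vpn % 64                       # stand-in for real frame allocation

def handle_tlb_miss(vpn, tlb, page_table, tlb_capacity=4):
    # 1. Select the TLB entry to be replaced (random here; LRU also works).
    if len(tlb) >= tlb_capacity:
        victim = random.choice(list(tlb))
        use, dirty = tlb.pop(victim)
        # 2. Write the victim's use and dirty bits back to the page table.
        page_table[victim].use = use
        page_table[victim].dirty = dirty
    # 3. Access the requested page table entry.
    pte = page_table[vpn]
    if not pte.valid:
        # Page fault: fetch the page from disk and update the page table.
        pte.frame = fetch_page_from_disk(vpn)
        pte.valid = True
    tlb[vpn] = (True, pte.dirty)          # load translation (use bit set), retry
    return pte.frame
```

A real handler would also write a dirty victim page back to disk before reusing its frame; that bookkeeping is omitted here.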

TLB organization: the virtual address splits into a 20-bit virtual page number and a 12-bit page offset. Each TLB entry holds a tag, valid and dirty bits, and a 20-bit physical page number; the tags are compared (=) in parallel to produce a TLB hit, and the physical page number is concatenated with the page offset to form the physical address sent to main memory. For the cache, the physical address is then divided into tag (m bits), index (k bits), block offset, and byte offset.
Design goals: hit time of 1 clock cycle; a small TLB miss penalty; a TLB miss rate under 1%.

Example: 40-bit virtual byte address, 16 KB pages, 36-bit physical address.
What is the size of the page table, assuming no disk address and 4 extra bits (valid, protection, dirty, use) per entry?
The page offset is 14 bits (2^14 = 16 KB), leaving a 26-bit virtual page number and a 36 − 14 = 22-bit physical page number. Each entry is 22 + 4 = 26 bits, so the page table is 2^26 entries × 26 bits ≈ 208 MiB.
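The sizing follows mechanically from the bit widths; a quick recomputation:

```python
va_bits, pa_bits, page_bytes = 40, 36, 16 * 1024

offset_bits = (page_bytes - 1).bit_length()   # 14-bit page offset
vpn_bits = va_bits - offset_bits              # 26-bit virtual page number
ppn_bits = pa_bits - offset_bits              # 22-bit physical page number
entry_bits = ppn_bits + 4                     # + valid/protection/dirty/use bits
table_bytes = (1 << vpn_bits) * entry_bits // 8

print(offset_bits, vpn_bits, entry_bits)      # 14 26 26
print(table_bytes // 2**20, "MiB")            # 208 MiB
```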

Continuing the example: the TLB is 2-way set associative with 256 entries. What are the index and tag widths?
256 entries in 2 ways give 128 sets, so the index is 7 bits and the tag is 26 − 7 = 19 bits. (Reading the 256 as the number of sets instead gives an 8-bit index and an 18-bit tag.)
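The index/tag split follows the usual set-associative arithmetic (here treating the 256 as total entries, i.e. 128 two-way sets):

```python
vpn_bits = 26                            # from the 40-bit address / 16 KB pages
entries, ways = 256, 2

sets = entries // ways                   # 128 sets
index_bits = (sets - 1).bit_length()     # 7-bit index
tag_bits = vpn_bits - index_bits         # 19-bit tag

print(sets, index_bits, tag_bits)        # 128 7 19
```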

Typical microprocessors: CPU → TLB → on-chip cache → secondary cache → DRAM main memory → hard disk, with 4 KB pages.
TLB: split data and instruction, 4-way set associative, 64–128 entries.
On-chip cache: split data and instruction, 8 KB–16 KB each, 4-way set associative, 32 bytes per block.

Implementing Protection with Virtual Memory
Add protection bits to the TLB and page table: read/write access; user(s)/supervisor. The CPU supplies read/write and user/supervisor signals with each access; the comparison can be made in the TLB, and a mismatch can cause an exception. User programs cannot modify the protection bits in the page table or TLB.