CS 61C: Great Ideas in Computer Architecture

Slides:



Advertisements
Similar presentations
Virtual Memory In this lecture, slides from lecture 16 from the course Computer Architecture ECE 201 by Professor Mike Schulte are used with permission.
Advertisements

Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 34 – Virtual Memory II 11/19/2014: Primary Data – A new startup came out of stealth.
1 Lecture 13: Cache and Virtual Memroy Review Cache optimization approaches, cache miss classification, Adapted from UCB CS252 S01.
Virtual Memory main memory can act as a cache for secondary storage motivation: Allow programs to use more memory that there is available transparent to.
Computer Organization CS224 Fall 2012 Lesson 44. Virtual Memory  Use main memory as a “cache” for secondary (disk) storage l Managed jointly by CPU hardware.
Virtual Memory Hardware Support
COMP 3221: Microprocessors and Embedded Systems Lectures 27: Virtual Memory - III Lecturer: Hui Wu Session 2, 2005 Modified.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
Virtual Memory Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley.
CSCE 212 Chapter 7 Memory Hierarchy Instructor: Jason D. Bakos.
Virtual Memory Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley and Rabi Mahapatra & Hank Walker.
S.1 Review: The Memory Hierarchy Increasing distance from the processor in access time L1$ L2$ Main Memory Secondary Memory Processor (Relative) size of.
CS 61C L25 VM III (1) Garcia, Spring 2004 © UCB TA Chema González www-inst.eecs/~cs61c-td inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
Principle Behind Hierarchical Storage  Each level memorizes values stored at lower levels  Instead of paying the full latency for the “furthermost” level.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 35 – Virtual Memory III Researchers at three locations have recently added.
ECE 232 L27.Virtual.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 27 Virtual.
Translation Buffers (TLB’s)
CS61C L34 Virtual Memory II (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UC Berkeley.
Chapter 9 Virtual Memory Produced by Lemlem Kebede Monday, July 16, 2001.
Virtual Memory and Paging J. Nelson Amaral. Large Data Sets Size of address space: – 32-bit machines: 2 32 = 4 GB – 64-bit machines: 2 64 = a huge number.
A. Frank - P. Weisberg Operating Systems Simple/Basic Paging.
CS61C L37 VM III (1)Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
Topics covered: Memory subsystem CSE243: Introduction to Computer Architecture and Hardware/Software Interface.
CSE431 L22 TLBs.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 22. Virtual Memory Hardware Support Mary Jane Irwin (
Lecture 19: Virtual Memory
Instructor: Dan Garcia 1 CS 61C: Great Ideas in Computer Architecture Virtual Memory III.
Virtual Memory Muhamed Mudawar CS 282 – KAUST – Spring 2010 Computer Architecture & Organization.
Virtual Memory Part 1 Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology May 2, 2012L22-1
The Three C’s of Misses 7.5 Compulsory Misses The first time a memory location is accessed, it is always a miss Also known as cold-start misses Only way.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 35 – Virtual Memory III ”The severity of the decline in the market is further evidence.
Virtual Memory. Virtual Memory: Topics Why virtual memory? Virtual to physical address translation Page Table Translation Lookaside Buffer (TLB)
1 Memory Management. 2 Fixed Partitions Legend Free Space 0k 4k 16k 64k 128k Internal fragmentation (cannot be reallocated) Divide memory into n (possible.
Virtual Memory Additional Slides Slide Source: Topics Address translation Accelerating translation with TLBs class12.ppt.
Review °Apply Principle of Locality Recursively °Manage memory to disk? Treat as cache Included protection as bonus, now critical Use Page Table of mappings.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Demand Paging.
Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.
Virtual Memory.  Next in memory hierarchy  Motivations:  to remove programming burdens of a small, limited amount of main memory  to allow efficient.
1  2004 Morgan Kaufmann Publishers Chapter Seven Memory Hierarchy-3 by Patterson.
CS.305 Computer Architecture Memory: Virtual Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available.
Princess Sumaya Univ. Computer Engineering Dept. Chapter 5:
Constructive Computer Architecture Virtual Memory: From Address Translation to Demand Paging Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
Instructor: Justin Hsia 7/30/2012Summer Lecture #241 CS 61C: Great Ideas in Computer Architecture Virtual Memory.
LECTURE 12 Virtual Memory. VIRTUAL MEMORY Just as a cache can provide fast, easy access to recently-used code and data, main memory acts as a “cache”
CS203 – Advanced Computer Architecture Virtual Memory.
CS161 – Design and Architecture of Computer
Bernhard Boser & Randy Katz
Virtual Memory Chapter 7.4.
ECE232: Hardware Organization and Design
Memory COMPUTER ARCHITECTURE
CS161 – Design and Architecture of Computer
Instructor: Justin Hsia
Lecture 12 Virtual Memory.
Section 9: Virtual Memory (VM)
From Address Translation to Demand Paging
From Address Translation to Demand Paging
CS 704 Advanced Computer Architecture
Day 21 Virtual Memory.
Day 22 Virtual Memory.
Memory Hierarchy Virtual Memory, Address Translation
Guest Lecturer: Justin Hsia
Chapter 8 Digital Design and Computer Architecture: ARM® Edition
Lecture 14 Virtual Memory and the Alpha Memory Hierarchy
Evolution in Memory Management Techniques
Page that info back into your memory!
Morgan Kaufmann Publishers Memory Hierarchy: Virtual Memory
Virtual Memory Prof. Eric Rotenberg
Translation Buffers (TLB’s)
CSC3050 – Computer Architecture
Translation Buffers (TLBs)
Review What are the advantages/disadvantages of pages versus segments?
Presentation transcript:

CS 61C: Great Ideas in Computer Architecture Virtual Memory III Miki Lustig

Where Are TLBs Located? Which should we check first: Cache or TLB? Can cache hold requested data if corresponding page is not in physical memory? With TLB first, does cache receive VA or PA? No PA Cache VA PA miss hit data CPU Main Memory TLB Page Table Notice that it is now the TLB that does translation, not the Page Table!

Address Translation Using TLB VPN TLB Tag TLB Index Page Offset Virtual Address TLB TLB Tag PPN (used just like in a cache) . . . PA split two different ways! PPN Page Offset Physical Address Data Cache Tag Block Data . . . Tag Index Offset Note: TIO for VA & PA unrelated

Question: How many bits wide are the following? 16 KiB pages 40-bit virtual addresses 64 GiB physical memory 2-way set associative TLB with 512 entries Valid Dirty Ref Access Rights TLB Tag PPN X XX 12 14 38 A) 18 8 45 B) 14 12 40 C) 17 9 43 D) TLB Tag TLB Index TLB Entry

Question: How many bits wide are the following? 16 KiB pages 40-bit virtual addresses 64 GiB physical memory 2-way set associative TLB with 512 entries Valid Dirty Ref Access Rights TLB Tag PPN X XX 12 14 38 A) 18 8 45 B) 14 12 40 C) 17 9 43 D) TLB Tag TLB Index TLB Entry

Question: How many bits wide are the following? 16 KiB pages 40-bit virtual addresses 64 GiB physical memory 2-way set associative TLB with 512 entries Valid Dirty Ref Access Rights TLB Tag PPN 1 2 18 22 VPN TLB Tag TLB Index Page Offset 16KiB = 14 bits 40 Bits = 1 TeBi 64GiB = 2^30 + 2^6 = 36 bits VPN = 40 – 14 = 26 PPN = 36 – 14 = 22 TLB gets VPN = 26 bits TLB Index = 512/2 = 256 = 2^8 = 8 bits TLB Tag = 26 -8 = 18 Total: 18 + 22 + 5 = 45 Recall VPN = 26, PPN = 22 TLB gets VPN as input = 26 bits TLB Index = 512/2 = 256 = 8bits TLB Tag = 26 – 8 = 18bits Total = 1+1+1+2 + 18 + 22 = 45

Fetching Data on a Memory Read Check TLB (input: VPN, output: PPN) TLB Hit: Fetch translation, return PPN TLB Miss: Check page table (in memory) Page Table Hit: Load page table entry into TLB Page Table Miss (Page Fault): Fetch page from disk to memory, update corresponding page table entry, then load entry into TLB Check cache (input: PPN, output: data) Cache Hit: Return data value to processor Cache Miss: Fetch data value from memory, store it in cache, return it to processor

Page Faults Load the page off the disk into a free page of memory Switch to some other process while we wait Interrupt thrown when page loaded and the process' page table is updated When we switch back to the task, the desired data will be in memory If memory full, replace page (LRU), writing back if necessary, and update both page table entries Continuous swapping between disk and memory called “thrashing”

Performance Metrics VM performance also uses Hit/Miss Rates and Miss Penalties TLB Miss Rate: Fraction of TLB accesses that result in a TLB Miss Page Table Miss Rate: Fraction of PT accesses that result in a page fault Caching performance definitions remain the same Somewhat independent, as TLB will always pass PA to cache regardless of TLB hit or miss

Data Fetch Scenarios Cache VA PA miss hit data CPU Main Memory TLB Page Table Are the following scenarios for a single data access possible? TLB Miss, Page Fault TLB Hit, Page Table Hit TLB Miss, Cache Hit Page Table Hit, Cache Miss Page Fault, Cache Hit Yes No

Question: A program tries to load a word at X that causes a TLB miss but not a page fault. Are the following statements TRUE or FALSE? The page table does not contain a valid mapping for the virtual page corresponding to the address X The word that the program is trying to load is present in physical memory F F A) F T B) T F C) T T D) 1 2

Question: A program tries to load a word at X that causes a TLB miss but not a page fault. Are the following statements TRUE or FALSE? The page table does not contain a valid mapping for the virtual page corresponding to the address X The word that the program is trying to load is present in physical memory F F A) F T B) T F C) T T D) 1 2

VM Performance Virtual Memory is the level of the memory hierarchy that sits below main memory TLB comes before cache, but affects transfer of data from disk to main memory Previously we assumed main memory was lowest level, now we just have to account for disk accesses Same CPI, AMAT equations apply, but now treat main memory like a mid-level cache

Typical Performance Stats secondary memory primary memory primary memory CPU CPU cache Caching Demand paging cache entry page frame cache block (≈32 bytes) page (≈4Ki bytes) cache miss rate (1% to 20%) page miss rate (<0.001%) cache hit (≈1 cycle) page hit (≈100 cycles) cache miss (≈100 cycles) page miss (≈5M cycles) CS252 S05

Impact of Paging on AMAT (1/2) Memory Parameters: L1 cache hit = 1 clock cycles, hit 95% of accesses L2 cache hit = 10 clock cycles, hit 60% of L1 misses DRAM = 200 clock cycles (≈100 nanoseconds) Disk = 20,000,000 clock cycles (≈10 milliseconds) Average Memory Access Time (no paging): 1 + 5%×10 + 5%×40%×200 = 5.5 clock cycles Average Memory Access Time (with paging): 5.5 (AMAT with no paging) + ?

Impact of Paging on AMAT (2/2) Average Memory Access Time (with paging) = 5.5 + 5%×40%× (1-HRMem)×20,000,000 AMAT if HRMem = 99%? 5.5 + 0.02×0.01×20,000,000 = 4005.5 (≈728x slower) 1 in 20,000 memory accesses goes to disk: 10 sec program takes 2 hours! AMAT if HRMem = 99.9%? 5.5 + 0.02×0.001×20,000,000 = 405.5 AMAT if HRMem = 99.9999% 5.5 + 0.02×0.000001×20,000,000 = 5.9

Impact of TLBs on Performance Each TLB miss to Page Table ~ L1 Cache miss TLB Reach: Amount of virtual address space that can be simultaneously mapped by TLB: TLB typically has 128 entries of page size 4-8 KiB 128 × 4 KiB = 512 KiB = just 0.5 MiB What can you do to have better performance? Multi-level TLBs Variable page size (segments) Special situationally-used “superpages” Conceptually same as multi-level caches Not covered here

Aside: Context Switching How does a single processor run many programs at once? Context switch: Changing of internal state of processor (switching between processes) Save register values (and PC) and change value in Page Table Base register What happens to the TLB? Current entries are for different process Set all entries to invalid on context switch

Virtual Memory Summary User program view: Contiguous memory Start from some set VA “Infinitely” large Is the only running program Reality: Non-contiguous memory Start wherever available memory is Finite size Many programs running simultaneously Virtual memory provides: Illusion of contiguous memory All programs starting at same set address Illusion of ~ infinite memory (232 or 264 bytes) Protection , Sharing Implementation: Divide memory into chunks (pages) OS controls page table that maps virtual into physical addresses memory as a cache for disk TLB is a cache for the page table