Computer Architecture 2010 – VM 1 Computer Architecture Virtual Memory Dr. Lihu Rappoport.

Presentation transcript:

Computer Architecture 2010 – VM 1 Computer Architecture Virtual Memory Dr. Lihu Rappoport

Computer Architecture 2010 – VM 2 Virtual Memory
• Provides the illusion of a large memory
• Different machines have different amounts of physical memory
  – Allows programs to run regardless of the actual physical memory size
• The amount of memory consumed by each process is dynamic
  – Allows adding memory as needed
• Many processes can run on a single machine
  – Provides each process with its own memory space
  – Prevents a process from accessing the memory of other processes running on the same machine
  – Allows the sum of the memory spaces of all processes to be larger than physical memory
• Basic terminology
  – Virtual Address Space: the address space used by the programmer
  – Physical Address Space: the actual physical memory address space

Computer Architecture 2010 – VM 3 Virtual Memory: Basic Idea
• Divide memory (virtual and physical) into fixed-size blocks
  – Pages in virtual space, frames in physical space
  – Page size = frame size
  – Page size is a power of 2: page size = 2^k
• All pages in the virtual address space are contiguous
• Pages can be mapped into physical frames in any order
• Some of the pages are in main memory (DRAM), some of the pages are on disk
• All programs are written using the virtual memory address space
• The hardware does on-the-fly translation between virtual and physical address spaces
  – Uses a page table to translate between virtual and physical addresses

Computer Architecture 2010 – VM 4 Virtual Memory
• Main memory can act as a cache for the secondary storage (disk)
• Advantages:
  – Illusion of having more physical memory
  – Program relocation
  – Protection
[Figure: virtual addresses pass through address translation and become physical addresses or disk addresses]

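Computer Architecture 2010 – VM 5 Virtual to Physical Address Translation
• Page size: 2^12 bytes = 4 KB
• Virtual address: virtual page number in bits [31:12], page offset in bits [11:0]
• Physical address: physical frame number in bits [29:12], page offset in bits [11:0]
• Page table entry fields: V (valid bit), D (dirty bit), AC (access control), frame number
[Figure: the page table base register and the virtual page number select a page table entry; its frame number is joined with the untranslated page offset to form the physical address]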
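
The bit split on this slide can be sketched in a few lines of Python. This is only an illustration under the slide's 4 KB page size; the function names and the frame number 0x1F2 are made up for the example.

```python
PAGE_OFFSET_BITS = 12                 # page size = 2**12 bytes = 4 KB (from the slide)
PAGE_SIZE = 1 << PAGE_OFFSET_BITS


def split_virtual_address(va):
    """Return (virtual page number, page offset) of a virtual address."""
    vpn = va >> PAGE_OFFSET_BITS      # upper bits select the page
    offset = va & (PAGE_SIZE - 1)     # low 12 bits are not translated
    return vpn, offset


def make_physical_address(frame_number, offset):
    """Join a frame number with the untranslated page offset."""
    return (frame_number << PAGE_OFFSET_BITS) | offset


vpn, offset = split_virtual_address(0x00403ABC)
# Frame 0x1F2 is a hypothetical mapping, standing in for a page table lookup.
pa = make_physical_address(0x1F2, offset)
print(hex(vpn), hex(offset), hex(pa))
```

Only the page number changes during translation; the offset is copied through as is.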

Computer Architecture 2010 – VM 6 Page Tables
[Figure: the virtual page number indexes the page table; each entry holds a valid bit and a physical page or disk address, pointing either into physical memory or to the disk]

Computer Architecture 2010 – VM 7 Address Mapping Algorithm
• If V = 1, the page is in main memory at the frame address stored in the table → fetch data
• Else (page fault), the page must be fetched from disk → causes a trap, usually accompanied by a context switch: the current process is suspended while the page is fetched from disk
• Access control (R = read-only, R/W = read/write, X = execute-only)
  – If the kind of access is not compatible with the specified access rights, then protection_violation_fault → causes a trap to a hardware or software fault handler
• A missing item is fetched from secondary memory only on the occurrence of a fault → demand load policy
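
A minimal Python sketch of the mapping algorithm above, assuming 4 KB pages and a page table modeled as a dictionary. The class names, the `translate` function and the exception types are hypothetical; real hardware raises traps, not Python exceptions.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12  # assumes the 4 KB pages used earlier in the deck


class PageFault(Exception):
    """Stands in for the trap taken when V = 0."""


class ProtectionViolation(Exception):
    """Stands in for protection_violation_fault."""


@dataclass
class PTE:
    valid: bool
    frame: int = 0
    access: str = "R/W"   # one of "R", "R/W", "X", as on the slide


# Which kinds of access each access-control setting permits.
ALLOWED = {"R": {"read"}, "R/W": {"read", "write"}, "X": {"execute"}}


def translate(page_table, va, kind):
    """Translate va for an access of the given kind ('read', 'write', 'execute')."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    pte = page_table.get(vpn)
    if pte is None or not pte.valid:
        raise PageFault(vpn)              # the OS would now fetch the page from disk
    if kind not in ALLOWED[pte.access]:
        raise ProtectionViolation(vpn)    # access incompatible with the access rights
    return (pte.frame << PAGE_OFFSET_BITS) | offset


page_table = {0x1: PTE(valid=True, frame=0x20, access="R")}
print(hex(translate(page_table, 0x1234, "read")))
```

Because the page is loaded only inside the page-fault path, the sketch follows the demand load policy: nothing is fetched until a fault occurs.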

Computer Architecture 2010 – VM 8 Page Replacement Algorithm
• Not Recently Used (NRU)
  – Each page has a reference flag: ref flag = 1 if the page has been referenced in the recent past
• If replacement is needed, choose any page frame whose reference bit is 0
  – This is a page that has not been referenced in the recent past
• Clock implementation of NRU:
    while (PT[LRP].NRU) {
        PT[LRP].NRU = 0
        LRP++ (mod table size)
    }
• Possible optimization: search for a page that is both not recently referenced AND not dirty
[Figure: page table entries with ref bits 1 and 0]
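
The clock loop can be tried out with a short Python sketch. It assumes the reference bits are kept in a plain list and that the hand (LRP on the slide) moves past the victim after it is chosen; that last step is a choice made for this example and is not stated on the slide.

```python
def clock_select_victim(ref_bits, hand):
    """Pick a victim frame with the clock (NRU) policy.

    ref_bits: list of 0/1 reference bits, one per frame (modified in place).
    hand:     index where the search starts (LRP on the slide).
    Returns (victim index, new hand position).
    """
    n = len(ref_bits)
    while ref_bits[hand]:
        ref_bits[hand] = 0            # give the page a second chance
        hand = (hand + 1) % n         # LRP++ (mod table size)
    victim = hand
    # Assumption for this sketch: the hand advances past the chosen victim.
    return victim, (hand + 1) % n


bits = [1, 1, 0, 1]
victim, hand = clock_select_victim(bits, 0)
print(victim, hand, bits)
```

The loop always terminates: every bit it passes is cleared, so after at most one full sweep it reaches a frame whose bit is 0.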

Computer Architecture 2010 – VM 9 Page Faults
• Page fault: the data is not in memory → retrieve it from disk
  – The CPU must detect the situation
  – The CPU cannot remedy the situation (it has no knowledge of the disk)
  – The CPU must trap to the operating system so that it can remedy the situation
  – Pick a page to discard (possibly writing it to disk)
  – Load the page in from disk
  – Update the page table
  – Resume the program so the hardware will retry and succeed
• A page fault incurs a huge miss penalty
  – Pages should be fairly large (e.g., 4 KB)
  – Faults can be handled in software instead of hardware
  – A page fault causes a context switch
  – Write-through is too expensive, so write-back is used

Computer Architecture 2010 – VM 10 Optimal Page Size
• Minimize wasted storage
  – Small pages minimize internal fragmentation
  – Small pages increase the size of the page table
• Minimize transfer time
  – Large pages (multiple disk sectors) amortize the access cost
  – Sometimes transfer unnecessary info
  – Sometimes prefetch useful data
  – Sometimes discard useless data early
• General trend toward larger pages because of
  – Big, cheap RAM
  – Increasing memory/disk performance gap
  – Larger address spaces
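
To make the page-table side of this trade-off concrete, the sketch below computes the size of a single-level page table for a given page size. It assumes a 32-bit virtual address space and 4-byte entries; both figures are assumptions for the example, and the function name is made up.

```python
def flat_page_table_bytes(va_bits, page_bytes, pte_bytes=4):
    """Size of a single-level page table covering the whole virtual space."""
    entries = (1 << va_bits) // page_bytes   # one entry per virtual page
    return entries * pte_bytes


for page_bytes in (512, 4096, 4 * 1024 * 1024):
    size = flat_page_table_bytes(32, page_bytes)
    print(f"page size {page_bytes:>8} B -> page table {size:>9} B")
```

Smaller pages waste less space at the end of each allocation but multiply the number of entries, which is the tension the slide describes.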

Computer Architecture 2010 – VM 11 Translation Lookaside Buffer (TLB)
• The page table resides in memory → each translation requires a memory access
• TLB
  – Caches recently used PTEs
  – Speeds up translation
  – Typically 128 to 256 entries
  – Usually 4- to 8-way associative
  – TLB access time is comparable to L1 cache access time
[Figure: flowchart. Virtual address → TLB access → TLB hit? Yes: physical address. No: access the page table, then physical address]
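
The hit/miss flow can be sketched as a small cache in front of the page table. For brevity this sketch is fully associative with LRU replacement, which is a simplification of the 4- to 8-way set-associative organization named on the slide; the class and its fields are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB model with LRU replacement (a simplification)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> frame number, oldest entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)      # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        frame = page_table[vpn]                # the extra memory access on a miss
        self.entries[vpn] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used entry
        return frame


page_table = {1: 10, 2: 20, 3: 30}             # made-up mappings
tlb = TLB(capacity=2)
for vpn in (1, 1, 2, 3, 1):
    tlb.lookup(vpn, page_table)
print(tlb.hits, tlb.misses)
```

Each miss costs one page table access; each hit avoids it, which is where the speed-up comes from.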

Computer Architecture 2010 – VM 12 Making Address Translation Fast
• The TLB is a cache for recent address translations

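Computer Architecture 2010 – VM 13 TLB Access
[Figure: 4-way set-associative TLB lookup. The virtual page number is split into tag and set; the set number selects one entry in each of ways 0 to 3, the stored tags are compared (=) with the tag, and the way MUX outputs hit/miss and the PTE; the page offset is not used in the lookup]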

Computer Architecture 2010 – VM 14 Virtual Memory And Cache
• TLB access is serial with cache access
• Page table entries can be cached in the L2 cache (as data)
[Figure: flowchart. Virtual address → access TLB → TLB hit? No: access the page table (L2 cache hit? No: access the page table in memory). Yes: physical address → access cache → L1 cache hit? No: L2 cache hit? No: access memory. Result: data]

Computer Architecture 2010 – VM 15 Overlapped TLB & Cache Access
• Virtual memory view of a physical address: page offset in bits [11:0], physical page number above it
• Cache view of a physical address: disp in bits [5:0], set in bits [13:6], tag above it
• #Set is not contained within the page offset
  – The #Set is not known until the physical page number is known
  – The cache can be accessed only after address translation is done

Computer Architecture 2010 – VM 16 Overlapped TLB & Cache Access (cont)
• In this example #Set is contained within the page offset
  – The #Set is known immediately
  – The cache can be accessed in parallel with address translation
  – Once translation is done, match the upper bits with the tags
• Limitation: cache size ≤ (page size × associativity)
[Figure: virtual memory view (page offset in bits [11:0], physical page number) and cache view (disp, set, tag) of a physical address, with disp and set both inside the page offset]

Computer Architecture 2010 – VM 17 Overlapped TLB & Cache Access (cont)
• Assume 4 KB per page → bits [11:0] are not translated
• Assume the cache is 32 KB, 2-way set-associative, 64 bytes/line
  – (2^15 / 2 ways) / (2^6 bytes/line) = 2^(15-1-6) = 2^8 = 256 sets
• Physical_addr[13:12] may be different from virtual_addr[13:12]
  – The tag comprises bits [31:12] of the physical address
  – The tag may mismatch in bits [13:12] of the physical address
  – Cache miss → allocate the missing line according to its virtual set address and physical tag
[Figure: page offset in bits [11:0] and physical page number; cache view with disp in bits [5:0], set in bits [13:6], and tag]
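
The arithmetic on this slide, and the limitation cache size ≤ page size × associativity, can be checked with a short sketch. It assumes all sizes are powers of two; the function names are made up for the example.

```python
def cache_geometry(cache_bytes, ways, line_bytes):
    """Return (number of sets, disp bits, set-index bits); sizes must be powers of 2."""
    sets = cache_bytes // ways // line_bytes
    disp_bits = line_bytes.bit_length() - 1   # log2(line size)
    set_bits = sets.bit_length() - 1          # log2(number of sets)
    return sets, disp_bits, set_bits


def index_bits_beyond_page_offset(cache_bytes, ways, line_bytes, page_bytes):
    """How many set-index bits fall in the translated part of the address."""
    _, disp_bits, set_bits = cache_geometry(cache_bytes, ways, line_bytes)
    page_offset_bits = page_bytes.bit_length() - 1
    return max(0, disp_bits + set_bits - page_offset_bits)


# The slide's example: 32 KB, 2-way, 64-byte lines, 4 KB pages.
print(cache_geometry(32 * 1024, 2, 64))
print(index_bits_beyond_page_offset(32 * 1024, 2, 64, 4096))
```

For the slide's cache, two index bits (bits [13:12]) lie above the page offset, which is why they may differ between the virtual and the physical address. An 8 KB 2-way cache with the same line and page size has none, so it satisfies the limitation.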

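Computer Architecture 2010 – VM 18 Overlapped TLB & Cache Access (cont)
[Figure: TLB and cache accessed side by side. TLB: the virtual page number is split into tag and set, the set number selects the ways, tag comparisons (=) and the way MUX produce hit/miss and the physical page number. Cache: set and disp select the ways, the physical page number is compared (=) with the cache tags, and the way MUX produces hit/miss and the data]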

Computer Architecture 2010 – VM 19 More On Page Swap-out
• DMA copies the page to the disk controller
  – Reads each byte:
      · Executes a snoop-invalidate for each byte in the cache (both L1 and L2)
      · If the byte resides in the cache: if it is modified, reads its line from the cache into memory; invalidates the line
  – Writes the byte to the disk controller
  – This means that when a page is swapped out of memory
      · All data in the caches that belongs to that page is invalidated
      · The page on the disk is up to date
• The TLB is snooped
  – If the TLB hits for the swapped-out page, the TLB entry is invalidated
• In the page table
  – The valid bit in the PTE of the swapped-out page is set to 0
  – All the rest of the bits in the PTE may be used by the operating system to keep the location of the page on the disk

Computer Architecture 2010 – VM 20 Context Switch
• Each process has its own address space
  – Each process has its own page table
  – The OS allocates frames in physical memory to each process and updates the page table of each process
  – A process cannot access physical memory allocated to another process
      · Unless the OS deliberately allocates the same physical frame to 2 processes (for memory sharing)
• On a context switch
  – Save the current architectural state to memory
      · Architectural registers
      · Register that holds the page table base address in memory
  – Flush the TLB
  – Load the new architectural state from memory
      · Architectural registers
      · Register that holds the page table base address in memory
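
The three context-switch steps can be sketched with a toy CPU model. Everything here (the `CPU` class, its fields, the `saved_state` dictionary, the register values) is hypothetical and only mirrors the order of operations on the slide.

```python
class CPU:
    """Toy model holding only the state named on the slide."""

    def __init__(self):
        self.registers = {}           # architectural registers
        self.page_table_base = None   # register holding the page table base address
        self.tlb = {}                 # vpn -> frame translations of the running process


def context_switch(cpu, saved_state, old_pid, new_pid):
    # 1. Save the current architectural state to memory.
    saved_state[old_pid] = (dict(cpu.registers), cpu.page_table_base)
    # 2. Flush the TLB: its entries belong to the old address space.
    cpu.tlb.clear()
    # 3. Load the new architectural state from memory.
    registers, base = saved_state[new_pid]
    cpu.registers = dict(registers)
    cpu.page_table_base = base


cpu = CPU()
cpu.registers = {"r0": 1}
cpu.page_table_base = 0x1000
cpu.tlb = {0x403: 0x1F2}
saved = {2: ({"r0": 7}, 0x2000)}      # previously saved state of process 2
context_switch(cpu, saved, old_pid=1, new_pid=2)
print(cpu.registers, hex(cpu.page_table_base), cpu.tlb)
```

Without the flush, a stale entry could translate the new process's virtual page to a frame owned by the old process.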

Computer Architecture 2010 – VM 21 VM in VAX: Address Format
• Page size: 2^9 bytes = 512 bytes
• Virtual address: region select in bits [31:30], virtual page number in bits [29:9], page offset in bits [8:0]
  – 00: P0 process space (code and data)
  – 01: P1 process space (stack)
  – 10: S0 system space
  – 11: S1
• Physical address: physical frame number in bits [29:9], page offset in bits [8:0]

Computer Architecture 2010 – VM 22 VM in VAX: Virtual Address Spaces
• P0: process code & global vars, grows upward
• P1: process stack & local vars, grows downward
• S0: system space, grows upward, generally static
[Figure: address-space maps of processes 0 to 3; the process region spans addresses 0 to 7FFFFFFF]

Computer Architecture 2010 – VM 23 Page Table Entry (PTE)
• Fields: V | PROT | M | Z | OWN | S | S | 0 | Physical Frame Number
  – Valid bit (V): = 1 if the page is mapped to main memory; otherwise page fault: the page is in the disk swap area, and the address indicates the page location on the disk
  – 4 protection bits (PROT)
  – Modified bit (M)
  – 3 ownership bits (OWN)
  – Indicates if the line was cleaned (zero)

Computer Architecture 2010 – VM 24 System Space Address Translation
• Virtual address: VPN | page offset
• PTE physical address = SBR (system page table base physical address) + VPN × 4 (the VPN followed by 00)
• Get the PTE
• Physical address = PFN (from the PTE) | page offset

Computer Architecture 2010 – VM 25 System Space Address Translation
[Figure: virtual address = 10 | VPN | offset; the PTE at SBR + VPN*4 supplies the PFN; physical address = PFN | offset]

Computer Architecture 2010 – VM 26 P0 Space Address Translation
• Virtual address: 00 | VPN | page offset
• PTE S0 space virtual address = P0BR (P0 page table base virtual address) + VPN × 4 (the VPN followed by 00)
• Get the PTE using the system space translation algorithm
• Physical address = PFN (from the PTE) | page offset

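Computer Architecture 2010 – VM 27 P0 Space Address Translation (cont)
[Figure: two-step translation. The virtual address 00 | VPN | offset gives the PTE virtual address P0BR + VPN*4, which is split into VPN' and Offset'; the system PTE at SBR + VPN'*4 supplies PFN'; PFN' | Offset' is the physical address of the PTE; that PTE supplies the PFN, and the physical address is PFN | offset]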
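
The two translation algorithms can be sketched together, since P0 translation uses system space translation to find its own PTE. Assumptions in this sketch: physical memory is a dictionary from address to 32-bit PTE word, the valid bit is bit 31, and the PFN occupies the low 21 bits of the PTE; the function names and all addresses and PTE values in the example are made up.

```python
PAGE_OFFSET_BITS = 9                  # 512-byte pages (from the VAX address format)
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1
VPN_MASK = (1 << 21) - 1              # VPN is bits [29:9] of the virtual address
PFN_MASK = (1 << 21) - 1              # assumption: PFN is the low 21 bits of a PTE
VALID = 1 << 31                       # assumption: V is the top bit of a PTE


def read_pfn(memory, pte_pa):
    pte = memory[pte_pa]
    if not pte & VALID:
        raise LookupError(f"page fault: invalid PTE at {pte_pa:#x}")
    return pte & PFN_MASK


def translate_system(memory, sbr, va):
    """System space: PTE physical address = SBR + VPN*4."""
    vpn = (va >> PAGE_OFFSET_BITS) & VPN_MASK
    pfn = read_pfn(memory, sbr + 4 * vpn)
    return (pfn << PAGE_OFFSET_BITS) | (va & OFFSET_MASK)


def translate_p0(memory, sbr, p0br, va):
    """P0 space: the PTE lives at S0 virtual address P0BR + VPN*4."""
    vpn = (va >> PAGE_OFFSET_BITS) & VPN_MASK
    pte_va = p0br + 4 * vpn                          # virtual address in S0
    pte_pa = translate_system(memory, sbr, pte_va)   # first step: locate the PTE
    pfn = read_pfn(memory, pte_pa)                   # second step: read the PTE
    return (pfn << PAGE_OFFSET_BITS) | (va & OFFSET_MASK)


SBR, P0BR = 0x1000, 0x80000400        # hypothetical base registers
memory = {
    0x1008: VALID | 0x30,             # system PTE for S0 page 2 -> frame 0x30
    0x6014: VALID | 0x07,             # process PTE for P0 page 5 -> frame 0x07
}
print(hex(translate_p0(memory, SBR, P0BR, 0x00000A34)))
```

A P0 reference therefore costs two memory accesses for PTEs before the data access itself, which is what the process and system TLBs on the next slide are meant to avoid.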

Computer Architecture 2010 – VM 28 P0 Space Address Translation Using TLB
[Figure: flowchart for a virtual address 00 | VPN | offset]
• Process TLB access
• Process TLB hit? Yes: get the PTE of the requested page from the process TLB
• No: calculate the PTE virtual address (in S0): P0BR + 4*VPN, then access the system TLB
  – System TLB hit? Yes: get the PTE from the system TLB
  – No: access the system page table at SBR + 4*VPN(PTE) (memory access)
  – Get the PTE of the requested page from the process page table (memory access)
• Calculate the physical address from the PFN
• Access memory

Computer Architecture 2010 – VM 29 Paging in x86
• 2-level hierarchical mapping
  – Page directory and page tables
  – All pages and page tables are 4K
• Linear address divided into:
  – Dir: 10 bits
  – Table: 10 bits
  – Offset: 12 bits
  – Dir/Table serve as an index into a page table
  – Offset serves as a pointer into a data page
• A page entry points to a page table or a page
• Performance issues: TLB
[Figure: linear address with DIR in bits [31:22], TABLE in bits [21:12], OFFSET in bits [11:0]; CR3 points to the 4K page directory, the directory entry points to a 4K page table, and the page table entry points to the 4K page frame holding the operand]
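
A minimal sketch of the 2-level walk, assuming 4-byte entries, physical memory modeled as a dictionary from address to 32-bit entry, and the present flag in bit 0. The function names and the example addresses are made up.

```python
PRESENT = 1 << 0                      # P flag, bit 0 of an entry
FRAME_MASK = 0xFFFFF000               # bits [31:12]: 4K-aligned address


def split_linear(la):
    """Return (dir, table, offset): 10 + 10 + 12 bits."""
    return (la >> 22) & 0x3FF, (la >> 12) & 0x3FF, la & 0xFFF


def walk(memory, cr3, la):
    """Translate linear address la using a 2-level walk starting at CR3."""
    d, t, offset = split_linear(la)
    pde = memory[(cr3 & FRAME_MASK) + 4 * d]          # page directory entry
    if not pde & PRESENT:
        raise LookupError("page fault: page table not present")
    pte = memory[(pde & FRAME_MASK) + 4 * t]          # page table entry
    if not pte & PRESENT:
        raise LookupError("page fault: page not present")
    return (pte & FRAME_MASK) | offset


cr3 = 0x10000                                          # hypothetical directory base
memory = {
    0x10000 + 4 * 1: 0x20000 | PRESENT,                # dir entry 1 -> table at 0x20000
    0x20000 + 4 * 3: 0x05000 | PRESENT,                # table entry 3 -> frame at 0x5000
}
print(split_linear(0x00403ABC), hex(walk(memory, cr3, 0x00403ABC)))
```

Each translation needs two memory reads before the data access, which is the performance issue the TLB addresses.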

Computer Architecture 2010 – VM 30 x86 Page Translation Mechanism
• CR3 points to the current page directory (may be changed per process)
• Usually, a page directory entry (covers 4 MB) points to a page table that covers data of the same type/usage
• Can allocate different physical pages for the same linear address (e.g., 2 copies of the same code)
• Sharing can alias pages from different processes to the same physical page (e.g., OS)
[Figure: CR3 → page directory → page tables for code, data, stack, and OS → 4K pages in physical memory; the linear address is split into DIR, TABLE, OFFSET]

Computer Architecture 2010 – VM 31 x86 Page Entry Format
• 20-bit pointer to a 4K-aligned address
• 12 bits of flags
• Virtual memory
  – Present
  – Accessed, Dirty
• Protection
  – Writable (R#/W)
  – User (U/S#)
  – 2 levels/type only
• Caching
  – Page WT
  – Page Cache Disabled
• 3 bits for OS usage
Figure: Format of Page Directory and Page Table Entries for 4K Pages
  – Page directory entry: Page Frame Address [31:12], AVAIL, Page Size (0: 4 Kbyte), A, PCD, PWT, U, W, P
  – Page table entry: Page Frame Address [31:12], AVAIL, D, A, PCD, PWT, U, W, P
  – P = Present, W = Writable, U = User, PWT = Write-Through, PCD = Cache Disable, A = Accessed, D = Dirty, AVAIL = Available for OS use
  – Bits marked 0 are reserved by Intel for future use (should be zero)
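
Decoding an entry is a matter of masking. The sketch below assumes the standard 32-bit, 4K-page entry layout with P in bit 0, W in bit 1, U in bit 2, PWT in bit 3, PCD in bit 4, A in bit 5, D in bit 6, and the OS-available bits in [11:9]; the function name and the sample entry are made up.

```python
FLAG_BITS = {
    "present": 0,
    "writable": 1,
    "user": 2,
    "write_through": 3,
    "cache_disable": 4,
    "accessed": 5,
    "dirty": 6,          # meaningful in page table entries only
}


def decode_entry(entry):
    """Split a 32-bit page table entry into its frame address and flags."""
    fields = {name: bool((entry >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["frame_address"] = entry & 0xFFFFF000    # 20-bit pointer, 4K aligned
    fields["avail"] = (entry >> 9) & 0x7            # 3 bits for OS usage
    return fields


print(decode_entry(0x12345067))
```

The sample value 0x12345067 decodes to a present, writable, user-accessible page that has been accessed and written, with caching left at its defaults.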

Computer Architecture 2010 – VM 32 x86 Paging – Virtual memory
• A page can be
  – Not yet loaded
  – Loaded
  – On disk
• A loaded page can be
  – Dirty
  – Clean
• When a page is not loaded (P bit clear) => page fault occurs
  – It may require evicting a loaded page to insert the new one
      · The OS prioritizes eviction by LRU and the dirty/clean/avail bits
      · A dirty page should be written to disk; a clean page need not be
  – The new page is either loaded from disk or “initialized”
  – The CPU sets the page “accessed” flag when the page is accessed and the “dirty” flag when it is written

Computer Architecture 2010 – VM 33 x86 Virtual to Physical Translation
• [45:12] – from the higher-level page table entry, or from CR3 for the upper-level page table
• [11:3] – from the appropriate virtual address bits
• [2:0] – 000

Computer Architecture 2010 – VM 34 Virtually-Addressed Cache
• The cache uses virtual addresses (tags are virtual)
• Only requires address translation on a cache miss
  – The TLB is not in the path to a cache hit
• Aliasing: 2 different virtual addresses mapped to the same physical address
  – Two different cache entries hold data for the same physical address
  – Must update all cache entries with the same physical address
[Figure: the CPU sends the VA to the cache; on a hit the data returns directly; otherwise the VA goes through translation and the PA is sent to main memory]

Computer Architecture 2010 – VM 35 Virtually-Addressed Cache (cont)
• The cache must be flushed at task switch
  – Solution: include the process ID (PID) in the tag
• How to share memory among processes
  – Permit multiple virtual pages to refer to the same physical frame
• Problem: incoherence if they point to different physical pages
  – Solution: require sufficiently many common virtual LSBs
  – With a direct-mapped cache, it is guaranteed that they all point to the same physical page

Computer Architecture 2010 – VM 36 Backup

Computer Architecture 2010 – VM 37 Why virtual memory?
• Generality
  – Ability to run programs larger than the size of physical memory
• Storage management
  – Allocation/deallocation of variable-sized blocks is costly and leads to (external) fragmentation
• Protection
  – Regions of the address space can be R/O, Ex, ...
• Flexibility
  – Portions of a program can be placed anywhere, without relocation
• Storage efficiency
  – Retain only the most important portions of the program in memory
• Concurrent I/O
  – Execute other processes while loading/dumping a page
• Expandability
  – Can leave room in the virtual address space for objects to grow
• Performance

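Computer Architecture 2010 – VM 38 Address Translation with a TLB
[Figure: the virtual address (bits n–1 to 0) is split into a virtual page number and a page offset (bits p–1 to 0); the TLB (valid, tag, physical page number) is searched with the virtual page number, giving a TLB hit and the physical page number; the resulting physical address is split into tag, index, and byte offset to access the cache (valid, tag, data), giving a cache hit and the data]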