Virtual Memory
Outline
- Virtual space
- Address translation
- Accelerating translation with a TLB (translation lookaside buffer)
- Multilevel page tables
- Different points of view
Suggested reading: 10.1~10.6
10.1 Physical and Virtual Addressing
Physical Addressing
Attributes of the main memory
- Organized as an array of M contiguous byte-sized cells
- Each byte has a unique physical address (PA), starting from 0
Physical addressing
- The CPU uses physical addresses to access memory
- Examples: early PCs, DSPs, embedded microcontrollers, and Cray supercomputers
Note: contiguous = adjacent
Physical Addressing Figure 10.1 P693
Virtual Addressing
- The CPU accesses main memory by a virtual address (VA)
- The virtual address is converted to the appropriate physical address before it reaches memory
Virtual Addressing: Address Translation
- Converting a virtual address to a physical one requires close cooperation between the CPU hardware and the operating system
- The memory management unit (MMU): dedicated hardware on the CPU chip that translates virtual addresses on the fly
- A look-up table: stored in main memory, with contents managed by the operating system
Figure 10.2 P694
10.2 Address Space
Address Space
- Address space: an ordered set of nonnegative integer addresses
- Linear address space: the integers in the address space are consecutive
- N-bit address space
Address Space
K = 2^10 (Kilo), M = 2^20 (Mega), G = 2^30 (Giga), T = 2^40 (Tera), P = 2^50 (Peta), E = 2^60 (Exa)
# virtual address bits (n)   # virtual addresses (N)   Largest possible virtual address
8                            256                       255
16                           64K                       64K-1
32                           4G                        4G-1
48                           256T                      256T-1
64                           16E                       16E-1
Practice Problem 10.1 P695
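The table rows can be checked mechanically. A minimal sketch follows; the helper names `address_space` and `pretty` are made up for illustration, and `pretty` assumes its argument is an exact power of two.

```python
UNITS = ["", "K", "M", "G", "T", "P", "E"]   # binary prefixes: K = 2^10, M = 2^20, ...

def address_space(n_bits):
    """Return (N, largest address) for an n-bit address space."""
    size = 1 << n_bits            # N = 2^n addresses
    return size, size - 1         # addresses run from 0 to N-1

def pretty(power_of_two):
    """Format an exact power of two with a binary prefix, e.g. 2^16 -> '64K'."""
    exponent = power_of_two.bit_length() - 1
    unit, remainder = divmod(exponent, 10)
    return f"{1 << remainder}{UNITS[unit]}"

for n in (8, 16, 32, 48, 64):
    size, largest = address_space(n)
    print(n, pretty(size), largest)
```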
Address Space: Data objects and their attributes
- Bytes vs. addresses
- Each data object can have multiple independent addresses
10.3 VM as a Tool for Caching
Using Main Memory as a Cache P695 (figure: SRAM cache, DRAM main memory, and disk)
10.3.1 DRAM Cache Organization
Using Main Memory as a Cache
- The DRAM vs. disk gap is more extreme than the SRAM vs. DRAM gap
- Access latencies: DRAM is ~10X slower than SRAM; disk is ~100,000X slower than DRAM
- Bottom line: design decisions for DRAM caches are driven by the enormous cost of misses
Design Considerations
- Line size? Large, since disk is better at transferring large blocks
- Associativity? High, to minimize miss rate
- Write through or write back? Write back, since we can't afford to perform small writes to disk
- Write back: defers the update as long as possible by writing the updated block to the lower level only when the replacement algorithm evicts it from the cache
10.3.2 Page Tables
Page: Virtual memory
- Conceptually organized as an array of contiguous byte-sized cells stored on disk
- Each byte has a unique virtual address that serves as an index into the array
- The contents of the array on disk are cached in main memory
Page P695
- The data on disk is partitioned into blocks
- Blocks serve as the transfer units between the disk and the main memory
- Virtual pages (VPs)
- Physical pages (PPs), also referred to as page frames
Page Attributes P695
1) Unallocated: pages that have not yet been allocated (or created) by the VM system. They do not have any data associated with them and do not occupy any space on disk.
2) Cached: allocated pages that are currently cached in physical memory.
3) Uncached: allocated pages that are not cached in physical memory.
Page Figure 10.3 P696
Page Table
- Each allocated page of virtual memory has an entry in the page table
- The table maps virtual pages to physical pages, taking a page from uncached form to cached form
- A page has a page table entry even if it is not in memory; the entry then specifies the disk address
- The OS retrieves the information
Page Table (figure: the page table as a lookup from object name to location; entries 0 to N-1 point either into the "cache" or to a location on disk)
Page Table (figure: a memory-resident page table indexed by virtual page number; each entry holds a valid bit plus a physical page number or disk address, pointing into physical memory or into disk storage, i.e. a swap file or a regular file system file)
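The three page attributes (unallocated, cached, uncached) can be read off a page table entry. A minimal sketch, assuming a hypothetical entry layout with a valid bit, a physical page number, and a disk address; real page table entries are packed bit fields defined by the hardware.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PTE:
    valid: bool = False               # True: the page is cached in DRAM
    ppn: Optional[int] = None         # physical page number, meaningful when valid
    disk_addr: Optional[int] = None   # where the page lives on disk, if allocated

def page_state(pte):
    """Classify a page table entry into one of the three page attributes."""
    if pte.valid:
        return "cached"
    if pte.disk_addr is not None:
        return "uncached"             # allocated, but currently only on disk
    return "unallocated"              # no data, no disk space

print(page_state(PTE()))
print(page_state(PTE(valid=True, ppn=3, disk_addr=0x1000)))
print(page_state(PTE(disk_addr=0x2000)))
```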
10.3.3 Page Hits
Page Hits Figure 10.5 P698
- Memory address translation: hardware converts virtual addresses to physical addresses via an OS-managed lookup table (the page table)
- (figure: the CPU issues virtual addresses; the page table, entries 0 to N-1, maps them to physical addresses in memory, locations 0 to P-1, or to disk)
10.3.4 Page Faults
Page Faults
- The page table entry indicates that the virtual address is not in memory
- The OS exception handler is invoked to move data from disk into memory
- The current process suspends; other processes can resume
- The OS has full control over placement, etc.
Note: suspend = pause temporarily
Page Faults Figure 10.6 P699 Figure 10.7 P699
- Swapping or paging: moving pages between DRAM and disk
- Swapped out or paged out: moved from DRAM to disk
- Demand paging: waiting until the last moment to swap in a page, when a miss occurs
- (figures: CPU, page table, memory, and disk, before the fault and after the fault)
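The hit and fault paths can be sketched as a toy simulation. This is an illustration only: the class name `DemandPager`, the `[valid, ppn]` entry layout, and the FIFO victim choice are assumptions, since the slides do not specify a replacement policy.

```python
class DemandPager:
    """Toy demand pager: pages are loaded only when a miss (page fault) occurs."""

    def __init__(self, num_virtual_pages, num_frames):
        # One entry per virtual page: [valid, ppn]. Every page starts on disk.
        self.page_table = [[False, None] for _ in range(num_virtual_pages)]
        self.free_frames = list(range(num_frames))
        self.resident = []            # VPNs in load order, oldest first
        self.faults = 0

    def access(self, vpn):
        valid, ppn = self.page_table[vpn]
        if valid:
            return ppn                # page hit: no disk traffic
        self.faults += 1              # page fault: the handler runs
        if self.free_frames:
            frame = self.free_frames.pop(0)
        else:
            victim = self.resident.pop(0)            # FIFO victim (assumption)
            frame = self.page_table[victim][1]
            self.page_table[victim] = [False, None]  # victim is paged out
        self.page_table[vpn] = [True, frame]         # page is swapped in
        self.resident.append(vpn)
        return frame

pager = DemandPager(num_virtual_pages=4, num_frames=2)
for vpn in (0, 1, 0, 2):
    print("VP", vpn, "-> PP", pager.access(vpn), "faults so far:", pager.faults)
```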
Servicing a Page Fault (1) Initiate Block Read
- The processor signals the I/O controller
- Request: read a block of length P starting at disk address X and store it starting at memory address Y
- (figure: processor with registers and cache, memory, and an I/O controller with disks, all attached to the memory-I/O bus)
Servicing a Page Fault (2) DMA Transfer
- The read occurs
- Direct Memory Access (DMA), under control of the I/O controller
Servicing a Page Fault (3) Read Done
- The I/O controller signals completion
- It interrupts the processor
- The OS resumes the suspended process
Note: resume = continue, start again
10.3.5 Allocating Pages
Allocating Pages P700 The operating system allocates a new page of virtual memory, for example, as a result of calling malloc. Figure 10.8 P700
10.3.6 Locality to the Rescue Again
Locality P700
- The principle of locality promises that at any point in time programs will tend to work on a smaller set of active pages, known as the working set or resident set.
- After an initial overhead where the working set is paged into memory, subsequent references to the working set result in hits, with no additional disk traffic.
Locality-2 P700
- If the working set size exceeds the size of physical memory, then the program can produce an unfortunate situation known as thrashing, where pages are swapped in and out continuously.
Note: thrash = flail, beat repeatedly
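The difference between a working set that fits and one that does not can be seen by counting faults on a looping reference pattern. A minimal sketch, assuming LRU replacement (the slides do not name a policy); `count_faults` is a made-up helper.

```python
from collections import OrderedDict

def count_faults(references, num_frames):
    """Count page faults for a reference string under LRU replacement."""
    resident = OrderedDict()              # VPN -> True, least recently used first
    faults = 0
    for vpn in references:
        if vpn in resident:
            resident.move_to_end(vpn)     # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= num_frames:
                resident.popitem(last=False)   # evict the least recently used page
            resident[vpn] = True
    return faults

loop = [0, 1, 2, 3] * 100                 # working set of 4 pages, 400 references
print(count_faults(loop, num_frames=4))   # working set fits: only the initial faults
print(count_faults(loop, num_frames=3))   # working set does not fit: thrashing
```

With four frames only the first pass over the working set faults; with three frames every reference evicts the page that is needed next, so every reference faults.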
10.4 VM as a Tool for Memory Management
A Tool for Memory Management
- Separate virtual address spaces: each process has its own virtual address space
- This simplifies linking, sharing, loading, and memory allocation
A Tool for Memory Management Figure 10.9 P701 (figure: the virtual address spaces of process 1 and process 2, each containing VP 1 and VP 2 in the range 0 to N-1, are mapped by address translation onto physical pages PP 2, PP 7, and PP 10 in the physical address space (DRAM), range 0 to M-1; one physical page is shared, e.g., read-only library code)
10.4.1 Simplifying Linking
A Tool for Memory Management Figure 10.10 P702
Linux/x86 process memory image, from high addresses to low:
- kernel virtual memory (memory invisible to user code), from 0xc0000000 up
- stack, growing down from 0xbfffffff (top of stack in %esp)
- memory mapped region for shared libraries, at 0x40000000
- runtime heap (via malloc), growing up; its top is the "brk" ptr
- uninitialized data (.bss)
- initialized data (.data)
- program text (.text), starting at 0x08048000
- forbidden, from address 0 up to the program text
10.4.2 Simplifying Sharing
Simplifying Sharing
- In some instances, it is desirable for processes to share code and data:
  - every process runs the same operating system kernel code
  - every C program makes calls to routines in the standard C library
- The operating system can arrange for multiple processes to share a single copy of this code by mapping the appropriate virtual pages in different processes to the same physical pages
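Sharing falls out of the mapping itself: two page tables may simply contain the same physical page number. A minimal sketch; the concrete VP-to-PP assignments below are an assumption loosely modeled on Figure 10.9, and `shared_physical_pages` is a made-up helper.

```python
# Per-process page tables as VPN -> PPN dictionaries (assumed mapping).
page_table_p1 = {1: 2, 2: 7}     # process 1: VP 1 -> PP 2, VP 2 -> PP 7
page_table_p2 = {1: 7, 2: 10}    # process 2: VP 1 -> PP 7, VP 2 -> PP 10

def shared_physical_pages(pt_a, pt_b):
    """Physical pages that both processes map, i.e. the shared copies."""
    return set(pt_a.values()) & set(pt_b.values())

print(shared_physical_pages(page_table_p1, page_table_p2))
```

Only one copy of the shared page (for example read-only library code) occupies DRAM, even though each process sees it at its own virtual page number.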
10.4.3 Simplifying Memory Allocation
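Simplifying Memory Allocation
- Virtual memory provides a simple mechanism for allocating additional memory to user processes
- The page table does the work: contiguous virtual pages can be mapped to physical pages located anywhere in physical memory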
10.4.4 Simplifying Loading
Simplifying Loading
- Load executable and shared object files into memory
- Memory mapping: mmap
10.5 VM as a Tool for Memory Protection
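A Tool for Memory Protection
- Each page table entry contains access rights information
- Hardware enforces this protection (trap into the OS if a violation occurs)
A Tool for Memory Protection (figure: page tables for process i and process j; each entry, VP 0, VP 1, VP 2, ..., holds a physical address plus Read? and Write? bits, for example an entry for PP 9 with Read = Yes and Write = No; the entries point to pages such as PP 4, PP 6, and PP 9 in memory, locations 0 to N-1)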
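The permission check can be sketched as follows. The names `PTE`, `ProtectionFault`, and `check_access` are hypothetical; in real hardware the MMU performs this check during translation and traps into the OS on a violation.

```python
from dataclasses import dataclass

class ProtectionFault(Exception):
    """Stands in for the trap into the OS on an access violation."""

@dataclass(frozen=True)
class PTE:
    ppn: int        # physical page number
    read: bool      # Read? bit
    write: bool     # Write? bit

def check_access(pte, op):
    """Return the PPN if op ('read' or 'write') is permitted, else raise."""
    if op not in ("read", "write"):
        raise ValueError(f"unknown operation: {op}")
    allowed = pte.read if op == "read" else pte.write
    if not allowed:
        raise ProtectionFault(f"{op} not permitted on PP {pte.ppn}")
    return pte.ppn

entry = PTE(ppn=9, read=True, write=False)   # the PP 9 row: Read = Yes, Write = No
print(check_access(entry, "read"))
try:
    check_access(entry, "write")
except ProtectionFault as fault:
    print("trap:", fault)
```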
A Tool for Memory Protection Figure 10.11 P704
10.6 Address Translation
Address Translation (figure: the processor issues virtual address a; the hardware address translation mechanism, part of the on-chip memory management unit (MMU), produces physical address a' for main memory; on a page fault the fault handler runs and the OS performs the transfer from secondary memory, only if there is a miss)
Address Translation Figure 10.12 P705
Parameters
- P = 2^p = page size (bytes)
- N = 2^n = virtual address limit
- M = 2^m = physical address limit
Address Translation P705
- Virtual address: virtual page number (bits n-1 to p), page offset (bits p-1 to 0)
- Physical address: physical page number (bits m-1 to p), page offset (bits p-1 to 0)
- Address translation maps the virtual page number to the physical page number
- Notice that the page offset bits don't change as a result of translation
Address Translation via Page Table Figure 10.13 P705
- The page table base register (PTBR) points to the page table
- The virtual page number (VPN) acts as the table index
- Each entry holds a valid bit, access rights, and a physical page number (PPN)
- If valid = 0, then the page is not in memory
- The physical address is the PPN (bits m-1 to p) followed by the unchanged page offset (bits p-1 to 0)
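Translation through a page table can be sketched in a few lines. This is an illustration under assumptions: the page table is a dictionary from VPN to a `{"valid", "ppn"}` record, and `PageFault` and `translate` are made-up names. The sample values are the ones used in the worked example later in this section (6 offset bits, VPN 0x0F mapped to PPN 0x0D).

```python
class PageFault(Exception):
    """Raised when the entry is missing or its valid bit is 0."""

def translate(va, page_table, p):
    """Translate virtual address va using page size 2^p bytes."""
    vpn = va >> p                     # upper bits: virtual page number
    offset = va & ((1 << p) - 1)      # lower p bits: page offset, unchanged
    entry = page_table.get(vpn)       # the VPN acts as the table index
    if entry is None or not entry["valid"]:
        raise PageFault(f"VPN {vpn:#x} is not in memory")
    return (entry["ppn"] << p) | offset

page_table = {0x0F: {"valid": True, "ppn": 0x0D}}
print(hex(translate(0x03D4, page_table, p=6)))
```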
Page Hits Figure 10.14 (a) P706 VA: virtual address. PTEA: page table entry address. PTE: page table entry. PA: physical address.
Page Faults Figure 10.14 (b) P706
10.6.1 Integrating Caches and VM
Integrating Caches and VM (figure: the CPU issues a VA; translation produces a PA; the cache is looked up with the PA; on a hit the cache returns the data, on a miss the request goes to main memory)
Integrating Caches and VM
- Most caches are "physically addressed": accessed by physical addresses
- Allows multiple processes to have blocks in the cache at the same time
- Allows multiple processes to share pages
- The cache doesn't need to be concerned with protection issues: access rights are checked as part of address translation
Integrating Caches and VM
- Perform address translation before the cache lookup
- But translation could involve a memory access itself (to fetch the PTE)
- Of course, page table entries can also become cached
Figure 10.15 P708
10.6.2 Speeding up Address Translation with a TLB
Speeding up Translation with a TLB
- "Translation Lookaside Buffer" (TLB)
- Small hardware cache in the MMU
- Maps virtual page numbers to physical page numbers
Figure 10.17 (a) P709
Speeding up Translation with a TLB Figure 10.16 P708
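Speeding up Translation with a TLB (figure: the virtual page number of the virtual address is compared against the TLB entries, each with valid bit, tag, and physical page number; on a TLB hit the physical page number and the page offset form the physical address; the physical address is then split into tag, index, and byte offset and compared against the cache lines, each with valid bit, tag, and data; on a cache hit the data is returned)
Simple Memory System TLB Figure 10.21 (a) P712
- 16 entries, 4-way set associative
- Virtual address bits 13 to 0: VPN in bits 13-6, VPO in bits 5-0
- Within the VPN: TLBT (tag) in the upper bits, TLBI (set index) in the lower bits
- Each set holds four entries of Tag, PPN, Valid (see the figure for the table contents)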
Address Translation Example P714
Virtual address: 0x03D4 (bits 13 to 0; VPN in bits 13-6, VPO in bits 5-0)
- VPN: 0x0F
- VPO: 0x14
- TLBI: 0x03
- TLBT: 0x03
- TLB hit? Yes
- Page fault? No
- PPN: 0x0D
Physical address (bits 11 to 0; PPN in bits 11-6, PPO in bits 5-0)
- PA: 0x354
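The field values in this example can be reproduced with shifts and masks. A minimal sketch: the bit widths follow the simple memory system (6 offset bits; 16 entries in 4 ways give 4 sets, hence 2 index bits), and the TLB below contains only the single entry the example is known to hit, not the full table from the figure. `split_va` and `tlb_lookup` are made-up names.

```python
VPO_BITS = 6          # 6 offset bits
TLB_INDEX_BITS = 2    # 16 entries / 4 ways = 4 sets

def split_va(va):
    """Split a 14-bit virtual address into (VPN, VPO, TLBI, TLBT)."""
    vpo = va & ((1 << VPO_BITS) - 1)
    vpn = va >> VPO_BITS
    tlbi = vpn & ((1 << TLB_INDEX_BITS) - 1)   # low bits of the VPN pick the set
    tlbt = vpn >> TLB_INDEX_BITS               # remaining bits form the tag
    return vpn, vpo, tlbi, tlbt

def tlb_lookup(tlb, va):
    """Return the physical address on a TLB hit, or None on a miss."""
    _, vpo, tlbi, tlbt = split_va(va)
    for tag, ppn, valid in tlb.get(tlbi, []):
        if valid and tag == tlbt:
            return (ppn << VPO_BITS) | vpo
    return None

# Only the entry used by the example: set 3, tag 0x03, PPN 0x0D, valid.
tlb = {3: [(0x03, 0x0D, True)]}

vpn, vpo, tlbi, tlbt = split_va(0x03D4)
print(hex(vpn), hex(vpo), hex(tlbi), hex(tlbt))
print(hex(tlb_lookup(tlb, 0x03D4)))
```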
Simple Memory System Cache
- 16 lines
- 4-byte line size
- Direct mapped
Simple Memory System Cache Figure 10.21 (c) P713
- Physical address bits 11 to 0: CT (tag) in bits 11-6, CI (index) in bits 5-2, CO (offset) in bits 1-0
- Each line, Idx 0 to F, holds Tag, Valid, and data bytes B0, B1, B2, B3 (see the figure for the table contents)
Address Translation Example P714
PA: 0x354 (bits 11 to 0)
- Offset (CO): 0x0
- CI: 0x05
- CT: 0x0D
- Hit? Yes
- Byte: 0x36
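The cache fields of the physical address follow from the cache geometry: a 4-byte line gives 2 offset bits, and 16 direct-mapped lines give 4 index bits. A minimal sketch with a made-up helper `split_pa`:

```python
CO_BITS = 2    # 4-byte lines
CI_BITS = 4    # 16 lines, direct mapped

def split_pa(pa):
    """Split a 12-bit physical address into (CT, CI, CO)."""
    co = pa & ((1 << CO_BITS) - 1)               # byte offset within the line
    ci = (pa >> CO_BITS) & ((1 << CI_BITS) - 1)  # which cache line to check
    ct = pa >> (CO_BITS + CI_BITS)               # tag to compare with that line
    return ct, ci, co

ct, ci, co = split_pa(0x354)
print(hex(ct), hex(ci), hex(co))
```

The access hits when line CI is valid and its stored tag equals CT; the byte at offset CO is then returned.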
10.6.3 Multi Level Page Tables
Multi-Level Page Tables
Given:
- 4 KB (2^12) page size
- 32-bit address space
- 4-byte PTE
Problem:
- Would need a 4 MB page table: 2^20 * 4 bytes
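The 4 MB figure is just the number of virtual pages times the entry size. A small sketch of the arithmetic, with a made-up helper name:

```python
def single_level_table_bytes(va_bits, page_bits, pte_bytes):
    """Size of a flat page table with one PTE per virtual page."""
    num_pages = 1 << (va_bits - page_bits)   # 2^(32-12) = 2^20 virtual pages
    return num_pages * pte_bytes

size = single_level_table_bytes(va_bits=32, page_bits=12, pte_bytes=4)
print(size, "bytes =", size // (1 << 20), "MB")
```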
Multi-Level Page Tables
Common solution: multi-level page tables, e.g., a 2-level table (P6)
- Level 1 table: 1024 entries, each of which points to a Level 2 page table
- Level 2 tables: 1024 entries each, each of which points to a page
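With 1024 entries per level and 4 KB pages, a 32-bit virtual address splits into 10 + 10 + 12 bits. A minimal sketch of the two-level walk, assuming nested dictionaries stand in for the tables so that unused level 2 tables simply do not exist; `split_two_level` and `walk` are made-up names.

```python
def split_two_level(va):
    """Split a 32-bit virtual address into (VPN1, VPN2, VPO)."""
    vpo = va & 0xFFF             # low 12 bits: offset within the 4 KB page
    vpn2 = (va >> 12) & 0x3FF    # next 10 bits: index into a level 2 table
    vpn1 = (va >> 22) & 0x3FF    # top 10 bits: index into the level 1 table
    return vpn1, vpn2, vpo

def walk(level1, va):
    """Return the physical address, or None if any level has no mapping."""
    vpn1, vpn2, vpo = split_two_level(va)
    level2 = level1.get(vpn1)    # missing: the whole 4 MB region is unmapped
    if level2 is None:
        return None
    ppn = level2.get(vpn2)
    if ppn is None:
        return None
    return (ppn << 12) | vpo

# One level 2 table with one mapped page: only two tables exist in total.
level1 = {1: {2: 0x55}}
va = (1 << 22) | (2 << 12) | 0xABC
print(hex(va), "->", hex(walk(level1, va)))
```

The saving comes from the sparse level 1 table: a level 2 table is needed only for regions of the address space that are actually in use.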
Multi-Level Page Tables Figure 10.18 P710
Multi-Level Page Tables Figure 10.19 P711
10.6.4 Putting it Together: End-to-End Address Translation
Simple Memory System Example Figure 10.20 P712
Addressing
- 14-bit virtual addresses
- 12-bit physical addresses
- Page size = 64 bytes
Virtual address (bits 13 to 0): VPN (Virtual Page Number) in bits 13-6, VPO (Virtual Page Offset) in bits 5-0
Physical address (bits 11 to 0): PPN (Physical Page Number) in bits 11-6, PPO (Physical Page Offset) in bits 5-0
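Simple Memory System Page Table Figure 10.21 (b) P712
- Only the first 16 entries are shown (VPN 00 to 0F); columns are VPN, PPN, Valid
- Entries legible in this extraction: VPN 00 -> PPN 28, VPN 02 -> PPN 33, VPN 05 -> PPN 16, VPN 08 -> PPN 13, VPN 09 -> PPN 17, VPN 0D -> PPN 2D, VPN 0E -> PPN 11; VPN 01 has no PPN (-)
- See the figure for the remaining entries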
Address Translation Example P714
Virtual address: 0x03D4 (bits 13 to 0; VPN in bits 13-6, VPO in bits 5-0)
- VPN: 0x0F
- VPO: 0x14
- Page fault? No
- PPN: 0x0D
Physical address (bits 11 to 0; PPN in bits 11-6, PPO in bits 5-0)
- PA: 0x354