
1 Above: The Burroughs B5000 computer
Above: The Burroughs B5000 computer, the first commercial machine with virtual memory. Right: The Manchester Atlas computer, the first experimental virtual memory, backed by a magnetic drum.

2 COMP 740: Computer Architecture and Implementation
Montek Singh, Sep 26, 2016. Topic: Virtual Memory

3 Virtual Memory (App. B) Several purposes:
Main purpose: allowing software to address more memory than is physically present
Other benefits:
Provides protection; facilitates multi-processing
Enables relocation
Enables programs to begin running before being fully loaded (in some implementations)
Previously, programmers used overlays to manually control loading/unloading

4 Virtual Memory Hard disk becomes an extension of the main memory
Currently used data resides in main memory (e.g., pages A-C); the rest is "swapped out" to disk (e.g., page D)
Figure B.19: The logical program in its contiguous virtual address space is shown on the left. It consists of four pages, A, B, C, and D. Three of the pages reside in physical main memory and the fourth is located on disk.

5 Characteristics: Cache vs. VM “speed gap”

6 Segmentation and Paging
Paging system has a flat, linear address space
32-bit VA = (10-bit VPN1, 10-bit VPN2, 12-bit offset)
If, for a given VPN1, we reach the max value of VPN2 and add 1, we reach the next page, at address (VPN1+1, 0)
Segmented version has a two-dimensional address space

7 Segmentation and Paging
Segmented version has a 2-D address space
32-bit VA = (10-bit segment #, 10-bit page #, 12-bit offset)
If, for a given segment #, we reach the max page number and add 1, we get an undefined value
Segments are not contiguous
Segments do not need to have the same size; size can even vary dynamically
Implemented by storing an upper bound for each segment and checking every reference against it
Pure segmentation is not used today; however, variable page sizes have been used to get some of the locality advantages of segmentation
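To make the bounds check concrete, here is a minimal C sketch of a segment-table lookup; the descriptor layout, field widths, and function names are our own illustration, not from the slides.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical segment descriptor: base address and length (upper bound). */
typedef struct {
    uint32_t base;   /* physical base of the segment */
    uint32_t limit;  /* segment length in bytes */
} segment_desc;

/* Translate a (segment #, offset) pair, faulting if offset >= limit. */
int seg_translate(const segment_desc *table, uint32_t seg, uint32_t offset,
                  uint32_t *pa)
{
    if (offset >= table[seg].limit)
        return -1;                     /* segmentation fault */
    *pa = table[seg].base + offset;
    return 0;
}

int main(void)
{
    segment_desc table[2] = { {0x10000, 0x2000}, {0x40000, 0x800} };
    uint32_t pa;
    if (seg_translate(table, 1, 0x900, &pa) != 0)   /* 0x900 >= 0x800 */
        printf("segment fault: offset out of bounds\n");
    return 0;
}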

8 Segmentation and Paging
Pros and Cons

9 Paged Virtual Memory Addressing
Always a many-to-one mapping: different virtual pages can map to the same physical page
Example:
Assume 4GB VM (32-bit address space) composed of 2^20 4KB pages
64MB DRAM main memory composed of 16K page frames (of the same size)
Only those pages (of the 2^20) that are not empty actually exist; each is either in main memory or on disk
Can be located with two mappings (implemented with tables)
Virtual address = (virtual page number, page offset): VA = (VPN, offset); 32 bits = (20 bits + 12 bits)
Physical address = (real page number, page offset): PA = (RPN, offset); 26 bits = (14 bits + 12 bits)
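A minimal C sketch of the bit split described above, assuming the slide's 4KB pages (12-bit offset); the variable names and the example address are ours.

#include <stdint.h>
#include <stdio.h>

#define OFFSET_BITS 12                     /* 4KB pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1)

int main(void)
{
    uint32_t va  = 0x00ABC123;             /* example virtual address */
    uint32_t vpn = va >> OFFSET_BITS;      /* upper 20 bits */
    uint32_t off = va & OFFSET_MASK;       /* lower 12 bits */
    printf("VPN = 0x%05x, offset = 0x%03x\n", vpn, off);
    /* A physical address is rebuilt the same way: PA = (RPN << 12) | off */
    return 0;
}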

10 Address Translation
VA → PA: (VPN, offset within page) → (RPN, offset within page), or VA → disk address
RPN = f_M(VPN)
In reality, implemented as a table lookup: the VPN is mapped to a page table entry (PTE), which contains the RPN…
…as well as miscellaneous control information (e.g., valid bit, dirty bit, replacement information, access control)

11 Single-Level Direct Page Table in MM
Fully associative mapping: when a VM page is brought in from disk to MM, it may go into any of the real page frames, for maximum flexibility and fewest conflicts
Simplest addressing scheme: one-level, direct page table
(page table base address + VPN) → PTE, or page fault
Assume that PTE size is 4 bytes; then the whole table requires 4 × 2^20 bytes = 4MB of main memory
Disadvantage: 4MB of main memory must be reserved for page tables…
…even when the VM space is almost empty; this requirement goes up substantially when we move to a 64-bit address space!
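A sketch of the one-level lookup in C, under the slide's parameters (2^20 PTEs of 4 bytes each); the PTE field layout and names are illustrative assumptions.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical 4-byte PTE layout (field names are ours). */
typedef struct {
    uint32_t valid : 1;
    uint32_t dirty : 1;
    uint32_t rpn   : 14;   /* real page number */
} pte_t;

#define NUM_PTES (1u << 20)   /* one PTE per virtual page: 4 x 2^20 = 4MB */

/* One-level lookup: index the table directly with the VPN. */
int translate(const pte_t *pt, uint32_t vpn, uint32_t offset, uint32_t *pa)
{
    if (!pt[vpn].valid)
        return -1;                              /* page fault */
    *pa = ((uint32_t)pt[vpn].rpn << 12) | offset;
    return 0;
}

int main(void)
{
    pte_t *pt = calloc(NUM_PTES, sizeof *pt);   /* the whole 4MB table */
    if (!pt) return 1;
    uint32_t va = 0x00ABC123, pa;
    pt[va >> 12] = (pte_t){ .valid = 1, .rpn = 0x2A };
    if (translate(pt, va >> 12, va & 0xFFF, &pa) == 0)
        printf("PA = 0x%07x\n", pa);            /* 0x002a123 */
    free(pt);
    return 0;
}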

12 Single-Level Direct Page Table in VM
To avoid tying down 4MB of physical memory, put the page tables themselves in VM ("paging the page tables")
Bring into MM only those page-table pages that are actually needed
Needs only 1K PTEs permanently in main memory (the 4MB table spans 1K VM pages), rather than 4MB
But: slows down access to VM pages by possibly requiring disk accesses for the PTEs themselves

13 Multi-Level Direct Page Table in MM
Another solution to the storage problem
Break the 20-bit VPN into two 10-bit parts: VPN = (VPN1, VPN2)
This turns the original one-level page table into a tree structure:
(1st-level base address + VPN1) → 2nd-level base address
(2nd-level base address + VPN2) → PTE, or page fault
Storage situation much improved:
Always need the root node (1K 4-byte entries = 1 VM page)
Need only a few of the second-level nodes (due to locality); allocated on demand; can be anywhere in main memory
Negative: access time to the PTE has doubled
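A C sketch of the two-level walk; the struct layout and the use of NULL pointers for unallocated second-level nodes are our illustration of "allocated on demand", not a specific machine's format.

#include <stdint.h>
#include <stdio.h>

#define L1_BITS 10
#define L2_BITS 10
#define PAGE_BITS 12

typedef struct { uint32_t valid : 1; uint32_t rpn : 14; } pte_t;

/* Root node: 1K pointers to 2nd-level nodes; NULL = subtree not allocated. */
typedef struct { pte_t *l2[1u << L1_BITS]; } root_t;

/* Two lookups: root node via VPN1, then 2nd-level node via VPN2. */
int translate2(const root_t *root, uint32_t va, uint32_t *pa)
{
    uint32_t vpn1   = va >> (L2_BITS + PAGE_BITS);
    uint32_t vpn2   = (va >> PAGE_BITS) & ((1u << L2_BITS) - 1);
    uint32_t offset = va & ((1u << PAGE_BITS) - 1);
    const pte_t *l2 = root->l2[vpn1];

    if (l2 == NULL || !l2[vpn2].valid)
        return -1;                    /* page fault (or unallocated node) */
    *pa = ((uint32_t)l2[vpn2].rpn << PAGE_BITS) | offset;
    return 0;
}

int main(void)
{
    static root_t root;               /* only the root is always resident */
    static pte_t node[1u << L2_BITS]; /* one 2nd-level node, on demand */
    node[0x2BC] = (pte_t){ .valid = 1, .rpn = 0x2A };
    root.l2[0x002] = node;

    uint32_t pa;
    if (translate2(&root, 0x00ABC123, &pa) == 0)  /* VPN1=0x002, VPN2=0x2BC */
        printf("PA = 0x%07x\n", pa);
    return 0;
}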

14 Inverted Page Tables
Virtual address spaces may be vastly larger (and more sparsely populated) than real address spaces, so the less-than-full utilization of tree nodes in a multi-level direct page table becomes more significant
Ideal (i.e., smallest possible) page table would have one entry for every VM page actually in main memory
Need 4 × 16K = 64KB of main memory to store this ideal page table; storage overhead = 0.1%
Inverted page table implementations are approximations to this ideal page table:
Associative inverted page table in special hardware (Atlas)
Hashed inverted page table in MM (IBM, HP PA-RISC)
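A hedged sketch of a hashed inverted page table in C, in the spirit of the IBM/PA-RISC approach mentioned above; the hash function, entry layout, and collision-chaining scheme are all assumptions for illustration.

#include <stdint.h>
#include <stdio.h>

#define FRAMES 16384u         /* 16K real page frames = 64MB / 4KB */

/* One entry per real frame; collisions chained through 'next'. */
typedef struct { uint32_t vpn; int32_t next; int used; } ipte_t;

static ipte_t  ipt[FRAMES];
static int32_t bucket[FRAMES];        /* hash -> first frame in chain */

static uint32_t hash_vpn(uint32_t vpn) { return (vpn * 2654435761u) % FRAMES; }

/* The index of the matching entry IS the real page number. */
int32_t ipt_lookup(uint32_t vpn)
{
    for (int32_t i = bucket[hash_vpn(vpn)]; i >= 0; i = ipt[i].next)
        if (ipt[i].used && ipt[i].vpn == vpn)
            return i;
    return -1;                        /* page fault */
}

int main(void)
{
    for (uint32_t h = 0; h < FRAMES; h++) bucket[h] = -1;
    /* Install VPN 0xABC in frame 42. */
    uint32_t h = hash_vpn(0xABC);
    ipt[42] = (ipte_t){ .vpn = 0xABC, .next = bucket[h], .used = 1 };
    bucket[h] = 42;
    printf("RPN = %d\n", ipt_lookup(0xABC));   /* 42 */
    return 0;
}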

15 Translation Lookaside Buffer (TLB)
To avoid two or more MM accesses for each VM access, use a small cache to store (VPN, PTE) pairs; the PTE contains the RPN, from which the RA can be constructed
This cache is the TLB, and it exploits locality
Examples: DEC Alpha (32 entries, fully associative); Amdahl V/8 (512 entries, 2-way set-associative)
Processor issues VA:
TLB hit → send RA to main memory
TLB miss → make two or more MM accesses to the page tables to retrieve the RA, then send RA to MM
(Any of these may cause a page fault)
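A software model of the fully associative lookup; real TLBs perform all comparisons in parallel in hardware, and the entry layout and names here are ours.

#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 32   /* e.g., the fully associative DEC Alpha TLB above */

typedef struct { uint32_t vpn, rpn; int valid; } tlb_e;

/* Fully associative lookup: compare the VPN against every entry.
   This sequential loop is the software analogue of parallel hardware. */
int tlb_lookup(const tlb_e *tlb, uint32_t vpn, uint32_t *rpn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) { *rpn = tlb[i].rpn; return 1; }
    return 0;            /* miss: walk the page tables, then refill the TLB */
}

int main(void)
{
    tlb_e tlb[TLB_ENTRIES] = { { .vpn = 0xABC, .rpn = 0x2A, .valid = 1 } };
    uint32_t rpn = 0;
    if (tlb_lookup(tlb, 0xABC, &rpn))
        printf("hit: RPN = 0x%x\n", rpn);
    return 0;
}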

16 TLB Misses
Causes of a TLB miss:
VM page is not in main memory
VM page is in main memory, but its entry has not yet been entered into the TLB
VM page is in main memory, but its TLB entry has been removed for some reason (evicted as LRU, invalidated because the page table was updated, etc.)
Miss rates are remarkably low (~0.1%)
Miss rate depends on the size of the TLB and on the VM page size (coverage)
Miss penalty varies from a single cache access to several page faults
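As a back-of-the-envelope illustration of why this matters (the 0.1% miss rate is from the slide; the 1-cycle hit time and 30-cycle table-walk penalty are assumed numbers, not from the slides): average translation time ≈ 0.999 × 1 + 0.001 × 30 ≈ 1.03 cycles, so with such low miss rates translation is nearly free as long as no page fault occurs.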

17 Dirty Bits and TLB: Two Solutions
Solution 1: TLB is a read-only cache
Dirty bit is contained only in the page table in MM; the TLB contains only a write-access bit
Initially set to zero (denying writes to the page)
On the first attempt to write the VM page, an exception is raised, which sets the dirty bit in the page table in MM and sets the write-access bit to 1 in the TLB
Solution 2: TLB is a read-write cache
Dirty bit present in both the TLB and the page table in MM
On the first write to a VM page, only the dirty bit in the TLB is set
Dirty bit in the page table is brought up to date when the TLB entry is evicted, or when the VM page and its PTE are evicted
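A minimal C model of the first solution (read-only TLB with a write-access bit); the hardware trap is modeled as an ordinary branch, and all structure names are ours.

#include <stdint.h>

/* Dirty bit lives only in the MM page table; TLB has a write-access bit. */
typedef struct { uint32_t rpn; int write_ok; } tlb_e;
typedef struct { uint32_t rpn; int dirty; }    pte_e;

static pte_e page_table[1u << 20];

/* Called on every store; the 'if' models the write-protection exception. */
void on_store(tlb_e *t, uint32_t vpn)
{
    if (!t->write_ok) {             /* first write: trap to the OS, which... */
        page_table[vpn].dirty = 1;  /* ...sets the dirty bit in MM, and */
        t->write_ok = 1;            /* ...enables further writes in the TLB */
    }
    /* ...then the store proceeds normally */
}

int main(void)
{
    tlb_e e = { .rpn = 0x2A, .write_ok = 0 };
    on_store(&e, 0xABC);            /* first store: traps, sets dirty bit */
    on_store(&e, 0xABC);            /* subsequent stores: no trap */
    return 0;
}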

18 Virtual Memory Access Time
Assume existence of TLB, physical cache, MM, disk
Processor issues VA:
TLB hit → send RA to cache
TLB miss → exception: access page tables, update TLB, retry
A memory reference may thus involve accesses to the TLB, the page table in MM, the cache, and the page in MM
Each of these can be a hit or a miss → 16 possible combinations

19 Virtual Memory Access Time (2)
Constraints among these accesses:
Hit in TLB → hit in page table in MM
Hit in cache → hit in page in MM
Hit in page in MM ↔ hit in page table in MM (a page-table hit means a valid translation for a resident page, so these two imply each other)
These constraints eliminate eleven of the 16 combinations, leaving five
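The counting can be checked mechanically. The sketch below enumerates all 16 combinations under the constraints above, reading "hit in page table" as "valid translation for a resident page" (that reading is our assumption); it prints the five surviving combinations.

#include <stdio.h>

int main(void)
{
    int valid = 0;
    for (int tlb = 0; tlb < 2; tlb++)
    for (int pt  = 0; pt  < 2; pt++)
    for (int c   = 0; c   < 2; c++)
    for (int pg  = 0; pg  < 2; pg++) {
        if (tlb && !pt) continue;   /* TLB hit implies page-table hit */
        if (c && !pg)   continue;   /* cache hit implies page in MM   */
        if (pg != pt)   continue;   /* resident page <=> valid PTE    */
        printf("TLB %s, PT %s, cache %s, page %s\n",
               tlb ? "hit" : "miss", pt ? "hit" : "miss",
               c ? "hit" : "miss", pg ? "hit" : "miss");
        valid++;
    }
    printf("%d possible combinations\n", valid);  /* prints 5 */
    return 0;
}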

20 Virtual Memory Access Time (3)
Number of MM accesses depends on the page table organization
MIPS R2000/R4000 accomplishes table walking with CPU instructions (eight instructions per page table level)
Several CISC machines implement this in microcode; the MC88200 has dedicated hardware for it
RS/6000 implements it completely in hardware
TLB miss penalty is dominated by having to go to main memory (page tables may not be in cache)
Further increase in miss penalty if the page table organization is complex

21 Page Size Choices
Fixed at design time (most early VM systems)
Statically configurable: at any moment, only pages of one size exist in the system; the MC68030 allowed page sizes between 256B and 32KB this way
Dynamically configurable: pages of different sizes coexist in the system
Alpha 21164, UltraSPARC: 8KB, 64KB, 512KB, 4MB
MIPS R10000, PA-8000: 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB
All pages are aligned
Dynamic configuration is a sophisticated way to decrease the TLB miss rate:
Increasing the number of TLB entries increases processor cycle time
Increasing the VM page size increases internal memory fragmentation
Requires fully associative TLBs
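A worked illustration of the coverage argument (the 64-entry TLB size is an assumed number; the page sizes are the MIPS/PA-8000 ones from the slide): TLB coverage = number of entries × page size, so a 64-entry TLB covers 64 × 4KB = 256KB with the smallest pages but 64 × 16MB = 1GB with the largest. Larger pages thus stretch the reach of a fixed-size TLB without lengthening the processor cycle.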

22 Example 1: Opteron TLB
Steps: the VPN is extracted; validity and protections are checked; one of the 40 entries is muxed out (or a miss is registered); the physical page address is combined with the offset to generate the real address
Figure B.24: Operation of the Opteron data TLB during address translation. The four steps of a TLB hit are shown as circled numbers. This TLB has 40 entries. Section B.5 describes the various protection and access fields of an Opteron page table entry.

23 Example 2: Hypothetical VM (Figure B.25)
Page size: 8KB
TLB: direct-mapped, 256 entries
Cache block size: 64 bytes
L1 cache: 8KB, direct-mapped, virtually indexed
L2 cache: 4MB, direct-mapped
Virtual address: 64 bits; physical address: 41 bits
Figure B.25: The overall picture of a hypothetical memory hierarchy going from virtual address to L2 cache access. The primary difference between this simple figure and a real cache is replication of pieces of the figure.
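The field widths in Figure B.25 follow from these parameters alone; a small C program makes the arithmetic explicit (the inputs are from the slide, the printed widths are derived).

#include <stdio.h>

/* Derive the address-field widths from the Figure B.25 parameters. */
int main(void)
{
    int va_bits = 64, pa_bits = 41;
    int page_offset  = 13;    /* log2(8KB page) */
    int block_offset = 6;     /* log2(64B block) */
    int tlb_index    = 8;     /* log2(256 TLB entries, direct-mapped) */
    int l1_index     = 7;     /* log2(8KB / 64B blocks) */
    int l2_index     = 16;    /* log2(4MB / 64B blocks) */

    printf("virtual page number: %d bits\n", va_bits - page_offset);  /* 51 */
    printf("TLB tag: %d bits\n", va_bits - page_offset - tlb_index);  /* 43 */
    printf("L1 tag: %d bits\n", pa_bits - l1_index - block_offset);   /* 28 */
    printf("L2 tag: %d bits\n", pa_bits - l2_index - block_offset);   /* 19 */
    return 0;
}

Note that the L1 index plus block offset (7 + 6 = 13 bits) exactly equals the page offset, which is what lets this L1 be indexed with virtual-address bits while the TLB lookup proceeds in parallel.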

24 Protection via Virtual Memory
Keeps processes in their own memory space
Role of architecture:
Provide user mode and supervisor mode
Protect certain aspects of CPU state
Provide mechanisms for switching between user mode and supervisor mode
Provide mechanisms to limit memory accesses
Provide a TLB to translate addresses

25 Virtual Machines
Support isolation and security
Sharing a computer among many unrelated users
Enabled by the raw speed of processors, making the overhead more acceptable
Allow different ISAs and operating systems to be presented to user programs ("system virtual machines")
SVM software is called a "virtual machine monitor" or "hypervisor"
Individual virtual machines run under the monitor are called "guest VMs"

26 Impact of Virtual Machines on Virtual Memory
Each guest OS maintains its own set of page tables
VMM adds a level of memory between physical and virtual memory, called "real memory"
VMM maintains a shadow page table that maps guest virtual addresses to physical addresses
Requires the VMM to detect the guest's changes to its own page table; occurs naturally if accessing the page table pointer is a privileged operation

27 Reading
Textbook App. B.4
App. B.5 (self-study)
Ch. 2.4 (self-study)

