
Slide 1: COMP 206: Computer Architecture and Implementation. Montek Singh. Mon., Nov. 17, 2003. Topic: Virtual Memory.

Slide 2: Outline
 Introduction
 Address Translation
 VM Organization
 Examples
Reading: HP3 Section 5.10. For background, refer to PH (Comp. Org.).

Slide 3: Characteristics

Slide 4: Addressing
 Always a congruence mapping
 Assume: 4 GB VM composed of 2^20 4 KB pages; 64 MB DRAM main memory composed of 16384 page frames (of the same size)
 Only those pages (of the 2^20) that are not empty actually exist
  – Each is either in main memory or on disk
  – Can be located with two mappings (implemented with tables)
 Virtual address = (virtual page number, page offset); VA = (VPN, offset); 32 bits = (20 bits + 12 bits)
 Physical address = (real page number, page offset); PA = (RPN, offset); 26 bits = (14 bits + 12 bits)
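The address split on this slide can be sketched in Python. The field widths come from the slide; the constant and function names are my own:

```python
# Field widths from the slide: 4 KB pages, 2**20 virtual pages (4 GB VM),
# 2**14 = 16384 page frames (64 MB of DRAM).
PAGE_BITS = 12
VPN_BITS = 20
RPN_BITS = 14

def split_va(va: int):
    """Split a 32-bit virtual address into (VPN, offset)."""
    return va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)

def make_pa(rpn: int, offset: int) -> int:
    """Build a 26-bit physical address from (RPN, offset)."""
    return (rpn << PAGE_BITS) | offset
```

Note that the offset passes through translation unchanged; only the page number is mapped.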

Slide 5: Address Translation
 RPN = f_M(VPN)
  – In reality, VPN is mapped to a page table entry (PTE), which contains the RPN as well as miscellaneous control information (e.g., valid bit, dirty bit, replacement information, access control)
 VA → PA: (VPN, offset within page) → (RPN, offset within page)
 VA → disk address
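A PTE as described here can be modeled as an RPN plus control bits. The field names and layout below are illustrative assumptions, not any real machine's format:

```python
# Illustrative PTE: an RPN plus the control bits named on the slide.
# The layout here is an assumption for the sketch, not a real format.
from dataclasses import dataclass

@dataclass
class PTE:
    valid: bool      # page resident in main memory?
    dirty: bool      # page modified since it was loaded?
    writable: bool   # access control
    rpn: int         # 14-bit real page number

def to_pa(pte: PTE, offset: int):
    """(VPN, offset) -> (RPN, offset); None means consult the disk map."""
    if not pte.valid:
        return None                  # page fault path: VA -> disk address
    return (pte.rpn << 12) | offset
```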

Slide 6: Single-Level Direct Page Table in MM
 Fully associative mapping: when a VM page is brought in from disk to MM, it may go into any of the real page frames
 Simplest addressing scheme: one-level direct page table
  – (page table base address + VPN) → PTE or page fault
  – Assume the PTE size is 4 bytes; then the whole table requires 4 × 2^20 bytes = 4 MB of main memory
 Disadvantage: 4 MB of main memory must be reserved for page tables, even when the VM space is almost empty
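A minimal sketch of the single-level scheme, with a dict standing in for the 2^20-entry array at the table's base address (absent keys play the role of invalid PTEs):

```python
# Hypothetical single-level direct page table. A real table is a 2**20-entry
# array in MM; here absent dict keys model invalid entries (page faults).
class PageFault(Exception):
    pass

page_table = {}                    # VPN -> RPN (each PTE is 4 bytes)

def lookup(vpn: int) -> int:
    if vpn not in page_table:      # (base address + VPN) -> invalid entry
        raise PageFault(vpn)
    return page_table[vpn]

# Storage cost of the full table, as computed on the slide:
table_bytes = 4 * 2**20            # 4 MB, reserved even if VM is nearly empty
```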

Slide 7: Single-Level Direct Page Table in VM
 To avoid tying down 4 MB of physical memory:
  – Put the page tables in VM
  – Bring into MM only those that are actually needed ("paging the page tables")
 Needs only 1K PTEs in main memory, rather than 4 MB
 Slows down access to VM pages by possibly needing disk accesses for the PTEs

Slide 8: Multi-Level Direct Page Table in MM
 Another solution to the storage problem
 Break the 20-bit VPN into two 10-bit parts: VPN = (VPN1, VPN2)
 This turns the original one-level page table into a tree structure:
  – (1st-level base address + VPN1) → 2nd-level base address
  – (2nd-level base address + VPN2) → PTE or page fault
 Storage situation much improved:
  – Always need the root node (1K 4-byte entries = 1 VM page)
  – Need only a few of the second-level nodes; allocated on demand, and can be anywhere in main memory
 Access time to the PTE has doubled
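The two-level walk above can be sketched directly, with second-level nodes created only on demand:

```python
# Sketch of the two-level walk: VPN = (VPN1, VPN2), 10 bits each.
def split_vpn(vpn: int):
    return vpn >> 10, vpn & 0x3FF

root = {}                            # 1st level: VPN1 -> 2nd-level node

def walk(vpn: int):
    vpn1, vpn2 = split_vpn(vpn)
    second = root.get(vpn1)          # (1st-level base + VPN1)
    if second is None or vpn2 not in second:
        return None                  # page fault
    return second[vpn2]              # (2nd-level base + VPN2) -> PTE

# Allocate a second-level node only when first needed; map VPN (0, 5):
root.setdefault(0, {})[5] = 42
```

The doubled access time is visible in the code: resolving one VPN touches `root` and then one second-level node.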

Slide 9: Inverted Page Tables
 Virtual address spaces may be vastly larger (and more sparsely populated) than real address spaces, so less-than-full utilization of tree nodes in a multi-level direct page table becomes more significant
 The ideal (i.e., smallest possible) page table would have one entry for every VM page actually in main memory
  – Need 4 × 16K = 64 KB of main memory to store this ideal page table
  – Storage overhead = 0.1%
 Inverted page table implementations are approximations to this ideal page table:
  – Associative inverted page table in special hardware (ATLAS)
  – Hashed inverted page table in MM (IBM, HP PA-RISC)
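A hashed inverted page table, in the spirit of the IBM / PA-RISC variant mentioned here, can be sketched as a frame-sized hash table with chaining. The chaining scheme below is one plausible choice, not a specific machine's design:

```python
# Hashed inverted page table, sized by frames (16384 for the slide's
# 64 MB / 4 KB example). VPNs hash into per-bucket chains.
NFRAMES = 16384
buckets = [[] for _ in range(NFRAMES)]

def ipt_insert(vpn: int, rpn: int):
    buckets[hash(vpn) % NFRAMES].append((vpn, rpn))

def ipt_lookup(vpn: int):
    for v, r in buckets[hash(vpn) % NFRAMES]:
        if v == vpn:
            return r
    return None                       # not resident: page fault

# The "ideal" table size from the slide: one 4-byte entry per frame.
ideal_bytes = 4 * NFRAMES             # 64 KB, about 0.1% of 64 MB
```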

Slide 10: Translation Lookaside Buffer (TLB)
 To avoid two or more MM accesses for each VM access, use a small cache to store (VPN, PTE) pairs; the PTE contains the RPN, from which the RA can be constructed
 This cache is the TLB, and it exploits locality
  – DEC Alpha: 32 entries, fully associative
  – Amdahl V/8: 512 entries, 2-way set-associative
 Processor issues a VA:
  – TLB hit → send RA to main memory
  – TLB miss → make two or more MM accesses to page tables to retrieve the RA, then send the RA to MM (any of these may cause a page fault)
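A minimal fully associative TLB can be sketched with an ordered map. The 32-entry size matches the DEC Alpha figure on the slide; the LRU replacement policy is my choice for the sketch:

```python
# Fully associative TLB with LRU replacement (policy assumed, not given).
from collections import OrderedDict

class TLB:
    def __init__(self, entries: int = 32):
        self.entries = entries
        self.map = OrderedDict()          # VPN -> PTE (here just an RPN)

    def lookup(self, vpn: int):
        pte = self.map.get(vpn)
        if pte is not None:
            self.map.move_to_end(vpn)     # hit: refresh LRU position
        return pte                        # None -> walk the page tables

    def insert(self, vpn: int, pte: int):
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used
        self.map[vpn] = pte
```

The "coverage" mentioned on the next slide is just entries times page size: 32 entries of 4 KB pages cover 128 KB of address space.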

Slide 11: TLB Misses
 Causes of a TLB miss:
  – VM page is not in main memory
  – VM page is in main memory, but its entry has not yet been entered into the TLB
  – VM page is in main memory, but its TLB entry has been removed for some reason (evicted as LRU, invalidated because the page table was updated, etc.)
 Miss rates are remarkably low (~0.1%); the miss rate depends on the size of the TLB and on the VM page size (coverage)
 The miss penalty varies from a single cache access to several page faults

Slide 12: Dirty Bits and TLB: Two Solutions
Solution 1: TLB is a read-only cache
 The dirty bit is contained only in the page table in MM; the TLB contains only a write-access bit, initially set to zero (denying writing of the page)
 On the first attempt to write a VM page, an exception is raised, which sets the dirty bit in the page table in MM and sets the write-access bit to 1 in the TLB
Solution 2: TLB is a read-write cache
 The dirty bit is present in both the TLB and the page table in MM
 On the first write to a VM page, only the dirty bit in the TLB is set
 The dirty bit in the page table is brought up to date when the TLB entry is evicted, or when the VM page and its PTE are evicted

Slide 13: Virtual Memory Access Time
 Assume the existence of a TLB, a physical cache, MM, and disk
 Processor issues a VA:
  – TLB hit → send RA to cache
  – TLB miss → exception: access page tables, update TLB, retry
 A memory reference may involve accesses to:
  – the TLB
  – the page table in MM
  – the cache
  – the page in MM
 Each of these can be a hit or a miss: 16 possible combinations

Slide 14: Virtual Memory Access Time (2)
 Constraints among these accesses:
  – Hit in TLB ⇒ hit in page table in MM
  – Hit in cache ⇒ hit in page in MM
  – Hit in page in MM ⇒ hit in page table in MM
 These constraints eliminate eleven combinations
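The elimination can be checked by brute force. One extra assumption beyond the three stated implications is needed to reach exactly eleven eliminated combinations: a page-table hit here means a valid PTE for a resident page, so "hit in page table" and "hit in page in MM" coincide.

```python
from itertools import product

# Stated constraints: TLB hit => PT hit; cache hit => page in MM;
# page in MM => PT hit.
# Extra assumption (mine): PT hit => page in MM, i.e. PT hit <=> resident.
def legal(tlb, pt, cache, page):
    if tlb and not pt:
        return False
    if cache and not page:
        return False
    if page != pt:
        return False
    return True

combos = [c for c in product([True, False], repeat=4) if legal(*c)]
```

Enumerating all 16 hit/miss patterns for (TLB, page table, cache, page) leaves five legal ones.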

Slide 15: Virtual Memory Access Time (3)
 The number of MM accesses depends on the page table organization:
  – MIPS R2000/R4000 accomplishes table walking with CPU instructions (eight instructions per page table level)
  – Several CISC machines implement this in microcode, with the MC88200 having dedicated hardware for it
  – RS/6000 implements this completely in hardware
 The TLB miss penalty is dominated by having to go to main memory:
  – Page tables may not be in the cache
  – The miss penalty increases further if the page table organization is complex
  – TLB misses can have a very damaging effect on physical caches

Slide 16: Page Size
 Choices:
  – Fixed at design time (most early VM systems)
  – Statically configurable: at any moment, only pages of the same size exist in the system; the MC68030 allowed page sizes between 256 B and 32 KB this way
  – Dynamically configurable: pages of different sizes coexist in the system
 Alpha 21164, UltraSPARC: 8 KB, 64 KB, 512 KB, 4 MB
 MIPS R10000, PA-8000: 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB
 All pages are aligned
 Dynamic configuration is a sophisticated way to decrease the TLB miss rate:
  – Increasing the number of TLB entries increases processor cycle time
  – Increasing the VM page size increases internal memory fragmentation
  – It requires fully associative TLBs
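The fragmentation trade-off above is simple arithmetic: a request is rounded up to a whole number of pages, wasting the unused tail of the last page. A short sketch:

```python
# Internal fragmentation grows with page size: the tail of the last page
# of each allocation is wasted.
def pages_needed(nbytes: int, page_bytes: int) -> int:
    return -(-nbytes // page_bytes)       # ceiling division

def internal_frag(nbytes: int, page_bytes: int) -> int:
    return pages_needed(nbytes, page_bytes) * page_bytes - nbytes
```

For example, a 10000-byte region wastes 2288 bytes with 4 KB pages but 55536 bytes with 64 KB pages, which is why large pages are attractive only when regions are large.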

Slide 17: Segmentation and Paging
 Paged segments: segments are made up of pages
 A paging system has a flat, linear address space:
  – 32-bit VA = (10-bit VPN1, 10-bit VPN2, 12-bit offset)
  – If, for a given VPN1, we reach the max value of VPN2 and add 1, we reach the next page, at address (VPN1+1, 0)
 The segmented version has a two-dimensional address space:
  – 32-bit VA = (10-bit segment #, 10-bit page number, 12-bit offset)
  – If, for a given segment #, we reach the max page number and add 1, we get an undefined value
 Segments are not contiguous
 Segments do not need to have the same size; the size can even vary dynamically
  – Implemented by storing an upper bound for each segment and checking every reference against it
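The bound check described on this slide can be sketched as follows; the segment table contents are made up for illustration:

```python
# Bounds check for a segmented address: each segment stores an upper
# bound, and every reference is checked against it.
class SegmentViolation(Exception):
    pass

seg_limit = {0: 3, 1: 1023}          # segment # -> largest valid page number

def check_ref(seg: int, page: int, offset: int):
    limit = seg_limit.get(seg)
    if limit is None or page > limit:
        raise SegmentViolation((seg, page))
    return (seg, page, offset)       # a 2-D address, not one flat integer
```

Stepping past a segment's last page raises a violation rather than silently rolling into a neighbor, which is exactly the difference from the flat paged space above.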

Slide 18: Example 1: Alpha 21264 TLB (Figure 5.36)

Slide 19: Example 2: Hypothetical Virtual Memory (Figure 5.37)

