Computer Architecture 2011 – VM x86 1 Computer Architecture Virtual Memory (VM) – x86 By Dan Tsafrir, 30/5/2011 Presentation based on slides by Lihu Rappoport.


Computer Architecture 2011 – VM x86 2 (funny beginning)

Computer Architecture 2011 – VM x86 3 Reminder: VM motivation
• VM provides
  – Illusion of large memory
  – Illusion of contiguity
  – Ability to overcommit
  – Process isolation

Computer Architecture 2011 – VM x86 4 Reminder: page table translates VA=>PA
[Figure: the virtual page number indexes the page table; each entry has a valid bit and points to a memory frame in physical memory or to a disk address]
• Think of it as a hash table that maps VA to PA

Computer Architecture 2011 – VM x86 5 Reminder: TLB accelerates translation
• TLB is a VA => PA cache

Computer Architecture 2011 – VM x86 6 Reminder: VM concepts
• A page can be
  – Not yet loaded
  – Loaded
  – On disk
• A loaded page can be
  – Dirty
  – Clean
• When a page is not loaded (P bit clear) => a page fault occurs
  – It may require evicting a loaded page to make room for the new one
    • The OS prioritizes eviction candidates by LRU and the dirty/clean/avail bits
    • A dirty page must be written to disk; a clean one need not be
  – New page is either loaded from disk or “initialized”
  – CPU will set page “access” flag when accessed, “dirty” when written

Computer Architecture 2011 – VM x86 7 Goal
• In the context of x86…
• Provide a method to map
  – From: virtual address (used by program)
  – To: physical address
• Method should be efficient
  – Can generally be exercised by HW alone
  – Typically no SW involvement

Computer Architecture 2011 – VM x86 8 32-BIT X86 REGULAR PAGING

Computer Architecture 2011 – VM x86 9 Hierarchical translation
• x86 supports 4KB & 4MB pages
  – Q: why would we want a 4MB page (called a “super-page”)?
  – A: the TLB is small, so each entry should cover as much memory as possible
• Page directory
  – Each process has its own page directory (but threads share)
    • CR3 points to the page directory of the current process
  – Holds 1024 PDEs (page-directory entries), each is 32 bits
  – Each PDE contains a PS (“page size”) flag
    • PS=1: PDE points directly to a 4MB (super)page
    • PS=0: PDE points to a “page table” whose entries point to 4KB pages
• Page table
  – Holds 1024 PTEs (page-table entries), each is 32 bits
  – Each PTE points to a 4KB page in physical memory

Computer Architecture 2011 – VM x86 10 Mapping only 4KB pages (typical)
• 2-level hierarchy
  – All pages are 4KB aligned
  – Total of 2^20 (=1M) 4KB pages = 4GB
• DIR (10 bits)
  – Points to a PDE in the page directory
  – We assume all PDEs have PS=0 (no superpaging)
  – => Each PDE provides the 20-bit, 4KB-aligned base physical address of a 4KB page table
• TABLE (10 bits)
  – Points to a PTE in the page table
  – PTE provides the 20-bit, 4KB-aligned base physical address of a 4KB page
• OFFSET (12 bits)
  – Offset within the selected 4KB page
[Figure: 32-bit linear address split into DIR (bits 31:22), TABLE (bits 21:12) and OFFSET (bits 11:0). CR3 (PDBR) points to the 4KB, 1K-PDE page directory; the selected PDE points to a 4KB, 1K-PTE page table; the selected PTE supplies the 20-bit page base, which is combined with the 12-bit offset into a 32-bit physical address.]
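
The 10/10/12 split described on this slide can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic; the function names `split_linear_4k` and `frame_plus_offset` are made up for this example and do not come from the slides.

```python
def split_linear_4k(va: int) -> tuple:
    """Split a 32-bit linear address into (DIR, TABLE, OFFSET)."""
    if not 0 <= va < 2**32:
        raise ValueError("expected a 32-bit linear address")
    dir_index = (va >> 22) & 0x3FF    # bits 31:22 select a PDE
    table_index = (va >> 12) & 0x3FF  # bits 21:12 select a PTE
    offset = va & 0xFFF               # bits 11:0 select a byte in the page
    return dir_index, table_index, offset


def frame_plus_offset(page_base_20bit: int, offset: int) -> int:
    """Combine the 20-bit page base from a PTE with the 12-bit offset."""
    return (page_base_20bit << 12) | offset


d, t, o = split_linear_4k(0x12345678)
print(hex(d), hex(t), hex(o))
```

The three fields together always use exactly 10 + 10 + 12 = 32 bits, which is why 1024 PDEs times 1024 PTEs times 4KB covers the whole 4GB linear space.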

Computer Architecture 2011 – VM x86 11 Mapping only 4MB pages
• 1-level hierarchy
  – All pages are 4MB aligned
  – Total of 2^10 (=1K) 4MB pages = 4GB
• DIR (10 bits)
  – Points to a PDE in the page directory
  – We assume all PDEs have PS=1 (superpages only)
  – => Each PDE provides the 10-bit, 4MB-aligned base physical address of a 4MB page
• TABLE (10 bits)
  – None! (moved to offset)
• OFFSET (22 bits)
  – Offset within the selected 4MB page
• Fine print
  – Must set the PSE flag in CR4 for 4MB support to work
  – Otherwise, PS=1 flag settings are ignored
[Figure: 32-bit linear address split into DIR (bits 31:22) and OFFSET (bits 21:0). CR3 (PDBR) points to the 4KB, 1K-PDE page directory; the selected PDE supplies the 10-bit page base, which is combined with the 22-bit offset into a 32-bit physical address.]

Computer Architecture 2011 – VM x86 12 Mixing 4KB & 4MB pages
• Works “out of the box”
  – When CR4.PSE=1
  – Alignment constraints: 4MB for superpages, 4KB for regular pages
• TLB issues?
  – No, as the CPU keeps 4MB and 4KB translations in separate TLBs
• Benefits
  – Superpages are often used for frequently used kernel code
  – Frees up 4KB TLB entries
  – Reduces TLB misses => improves overall system performance
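
To make the mixed 4KB/4MB walk concrete, here is a minimal software model of what the MMU does in 32-bit regular paging. It is a sketch under simplifying assumptions: physical memory is modeled as a Python dict from a table's base address to a list of 1024 entries, protection bits are ignored, and all names (`translate`, `PageFault`, `phys_tables`) are hypothetical.

```python
PRESENT = 1 << 0  # P flag (bit 0)
PS = 1 << 7       # page-size flag in a PDE (bit 7)


class PageFault(Exception):
    """Raised when a PDE or PTE has P=0."""


def translate(cr3, phys_tables, va, pse_enabled=True):
    """Walk the (simulated) page directory and page table for linear address va."""
    pde = phys_tables[cr3][(va >> 22) & 0x3FF]
    if not pde & PRESENT:
        raise PageFault(hex(va))
    if pse_enabled and (pde & PS):
        # 4MB superpage: 10-bit base from the PDE plus a 22-bit offset
        return (pde & 0xFFC00000) | (va & 0x3FFFFF)
    pte = phys_tables[pde & 0xFFFFF000][(va >> 12) & 0x3FF]
    if not pte & PRESENT:
        raise PageFault(hex(va))
    # 4KB page: 20-bit base from the PTE plus a 12-bit offset
    return (pte & 0xFFFFF000) | (va & 0xFFF)


# Example setup: one 4KB mapping and one 4MB mapping in the same directory.
page_dir = [0] * 1024
page_tbl = [0] * 1024
page_dir[0] = 0x2000 | PRESENT            # PDE 0 -> page table at 0x2000
page_tbl[1] = 0x5000 | PRESENT            # linear page 1 -> physical page 0x5000
page_dir[1] = 0x00800000 | PS | PRESENT   # PDE 1 -> 4MB page at 0x800000
tables = {0x1000: page_dir, 0x2000: page_tbl}

print(hex(translate(0x1000, tables, 0x00001234)))
print(hex(translate(0x1000, tables, 0x00412345)))
```

The only difference between the two paths is whether the walk stops at the PDE, which is why both page sizes can coexist in one page directory.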

Computer Architecture 2011 – VM x86 13 PDE & PTE format
• 20-bit physical address
  – 4K-aligned pointer
• 12 flag bits
  – Virtual memory
    • Present, accessed, dirty
  – Protection
    • Read, write, user, privileged
  – Caching
    • WB, WT, disable
  – 3 bits for OS usage
[Figure: Page Dir Entry layout: Page Frame Address (bits 31:12), AVAIL (available for OS use), Page Size (0: 4 Kbyte), A (accessed), PCD (cache disable), PWT (write-through), U (user), W (writable), P (present). Page Table Entry layout: Page Frame Address (bits 31:12), AVAIL, D (dirty), A, PCD, PWT, U, W, P. The remaining bits are reserved for future use (should be zero).]

Computer Architecture 2011 – VM x86 14 4KB-page PTE format
[Figure: PTE bit layout, from high to low bits: Page Base Address (31:12), AVAIL (available for OS use), G (global page), PAT (page attribute table index), D (dirty), A (accessed), PCD (cache disable), PWT (write-through), U/S (user / supervisor), R/W (writable), P (present)]
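
As an illustration of this layout, the sketch below unpacks a 32-bit PTE into its named fields. The bit positions follow the standard IA-32 4KB PTE layout (P in bit 0 up to G in bit 8, AVAIL in bits 11:9); the helper name `decode_pte` and the dictionary representation are choices made for this example only.

```python
# Bit position of each single-bit flag in a 4KB-page PTE.
PTE_FLAG_BITS = {
    "P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4,
    "A": 5, "D": 6, "PAT": 7, "G": 8,
}


def decode_pte(pte: int) -> dict:
    """Return the flags, the 3 OS-available bits and the page base of a PTE."""
    fields = {name: (pte >> bit) & 1 for name, bit in PTE_FLAG_BITS.items()}
    fields["AVAIL"] = (pte >> 9) & 0x7   # bits 11:9, free for OS use
    fields["base"] = pte & 0xFFFFF000    # bits 31:12, 4KB-aligned page base
    return fields


print(decode_pte(0x00ABC067))
```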

Computer Architecture 2011 – VM x86 15 4KB-page PDE format
[Figure: PDE bit layout, from high to low bits: Page Table Base Address (31:12), AVAIL (available for OS use), G (global page, ignored), PS (page size, 0 indicates 4 Kbytes), AVL, A (accessed), PCD (cache disable), PWT (write-through), U/S (user / supervisor), R/W (writable), P (present)]

Computer Architecture 2011 – VM x86 16 4MB-page PDE format
[Figure: PDE bit layout, from high to low bits: Page Base Address (31:22), reserved bits, PAT (page attribute table index, bit 12), AVAIL (available for OS use), G (global page), PS (page size, 1 indicates 4 Mbytes), D (dirty), A (accessed), PCD (cache disable), PWT (write-through), U/S (user / supervisor), R/W (writable), P (present)]

Computer Architecture 2011 – VM x86 17 VM attributes: present flag (P)
• Set => page is in physical memory
  – Translation is carried out by the MMU (memory management unit)
• Clear => page is not in physical memory
  – When encountered by the MMU => generates a page-fault exception
  – Faulting address is available to the SW exception handler
• MMU does not set/clear this flag (only reads it)
  – It’s up to the OS
• Upon a page-fault exception => the OS typically does the following:
  1. Copy page from disk to memory (unless already in buffer cache)
  2. Update PTE/PDE with page RAM address
  3. P = 1; dirty = accessed = 0; etc.
  4. Invalidate associated PTE in TLB
  5. Resume program at the faulting instruction
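
The five handler steps above can be mimicked with a toy model. This is a sketch, not how any real OS is structured: it assumes an unlimited supply of free frames (so no eviction is needed), models the disk as a dict, and uses invented names (`ToyVM`, `handle_page_fault`, `load`).

```python
class ToyVM:
    """Toy demand-paging model: pages are loaded on first touch."""

    def __init__(self, disk):
        self.disk = disk      # vpn -> 4096-byte page image on "disk"
        self.ram = {}         # frame number -> page contents
        self.pte = {}         # vpn -> {"frame", "P", "A", "D"}
        self.tlb = {}         # vpn -> frame (cached translations)
        self.next_frame = 0   # assumption: free frames never run out
        self.faults = 0

    def handle_page_fault(self, vpn):
        self.faults += 1
        frame = self.next_frame
        self.next_frame += 1
        # Step 1: copy the page from disk (or "initialize" a zero page).
        self.ram[frame] = self.disk.get(vpn, bytes(4096))
        # Steps 2-3: record the frame, set P, clear accessed and dirty.
        self.pte[vpn] = {"frame": frame, "P": 1, "A": 0, "D": 0}
        # Step 4: drop any stale cached translation.
        self.tlb.pop(vpn, None)

    def load(self, va):
        vpn, offset = va >> 12, va & 0xFFF
        while True:
            entry = self.pte.get(vpn)
            if entry is None or not entry["P"]:
                self.handle_page_fault(vpn)
                continue      # Step 5: retry the faulting access
            entry["A"] = 1    # the "MMU" marks the page as accessed
            self.tlb[vpn] = entry["frame"]
            return self.ram[entry["frame"]][offset]


vm = ToyVM(disk={1: bytes([7]) * 4096})
print(vm.load(0x1005), vm.faults)
```

The retry loop mirrors the fact that the faulting instruction is re-executed from scratch, so the second attempt finds P=1 and proceeds normally.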

Computer Architecture 2011 – VM x86 18 VM attributes: page size flag (PS)
• In PDEs only
• Determines the page size
  – Clear => page size = 4KB (& PDE points to a page table)
  – Set => page size = 4MB (& PDE points to a superpage)

Computer Architecture 2011 – VM x86 19 VM attributes: accessed (A) & dirty (D)
• MMU sets A-flag
  – The first time a page (or page table) is accessed (load or store)
• MMU sets D-flag
  – The first time a page (or PT) is written (store only)
• A & D are sticky
  – Once set, MMU (=HW) never clears them
  – Only SW does
• OS clears them
  – When initially loading PTE
  – Possibly from time to time as part of LRU approximation (used to decide which pages to swap out and which to keep)
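
One common way to build an LRU approximation on top of a sticky accessed flag is the clock (second-chance) scan. The slides do not say which policy is meant, so the sketch below is only one plausible instance; `pick_victim` and the list-of-dicts page representation are invented for the example, and the list must be non-empty.

```python
def pick_victim(pages, hand=0):
    """Clock scan over pages (dicts with an 'A' flag).

    Returns (victim_index, new_hand). Pages with A=1 get a second chance:
    software clears the flag and moves on; the first page found with A=0
    has not been touched since the last sweep and is chosen for eviction.
    """
    n = len(pages)
    while True:
        page = pages[hand]
        if page["A"]:
            page["A"] = 0              # only SW clears the sticky flag
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n


pages = [{"A": 1}, {"A": 0}, {"A": 1}]
print(pick_victim(pages))
```

Because the hardware sets A again on the next access, pages that are in active use keep surviving sweeps, while idle pages are eventually selected.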

Computer Architecture 2011 – VM x86 20 VM attributes: global flag (G)
• Has effect only when PGE=1 in CR4
• When set, indicates page is “global”
  – Not flushed from TLB when CR3 is loaded
  – Ignored for PDEs with PS=0 (that point to page tables)
• Used to improve performance
  – Keeps important pages of OS in TLB across context switches
• Only software can set or clear this flag

Computer Architecture 2011 – VM x86 21 Cache attributes: PWT
• PWT
  – Means “page-level write-through”
• Controls write-through / write-back caching policy of page / PT
  – 1: enable write-through caching
  – 0: disable write-through => enable write-back caching
• Ignored if
  – The CD (“cache disable”) flag is set in CR0
  – The associated PCD is on

Computer Architecture 2011 – VM x86 22 Cache attributes: PCD
• PCD
  – Means “page-level cache disable” flag
• Controls caching of individual pages / PTs
  – 1: caching of the associated page/PT is prevented
  – 0: caching allowed
• Used
  – When caching doesn’t help performance (e.g., streaming)
  – For memory-mapped I/O ports used to communicate with devices
• Assumed as set (regardless of actual value)
  – If the CD (“cache disable”) flag in CR0 is set
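
The interplay of PWT, PCD and CR0.CD described on the last two slides reduces to a small decision function. The sketch deliberately ignores PAT and any other source of memory-type information, and the name `page_cache_policy` and its string results are invented for illustration.

```python
def page_cache_policy(pwt: bool, pcd: bool, cr0_cd: bool = False) -> str:
    """Caching policy implied by the page-level flags alone."""
    if cr0_cd or pcd:
        # CR0.CD set means PCD is treated as set; PWT is then ignored.
        return "uncached"
    return "write-through" if pwt else "write-back"


for pwt in (False, True):
    for pcd in (False, True):
        print(pwt, pcd, page_cache_policy(pwt, pcd))
```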

Computer Architecture 2011 – VM x86 23 Cache attributes: PAT
• PAT
  – Means “page attribute table index” flag
• If on, used along with PCD & PWT flags to select an entry in the PAT
  – Which in turn selects the memory type for the page
  – PAT is a 64-bit register
  – (Not going into the details)
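
For readers who do want a little of the detail, the sketch below shows how the three flags can index the eight one-byte entries of the 64-bit PAT register. The index order (PAT, PCD, PWT from high to low bit), the memory-type encodings and the power-on register value are taken from the Intel architecture manuals rather than from these slides; the function name `pat_memory_type` is hypothetical.

```python
# Memory-type encodings used in PAT entries (per the Intel manuals).
MEMORY_TYPES = {0x00: "UC", 0x01: "WC", 0x04: "WT",
                0x05: "WP", 0x06: "WB", 0x07: "UC-"}

# Power-on default of the PAT register (assumption: not reprogrammed by the OS).
DEFAULT_PAT = 0x0007040600070406


def pat_memory_type(pat: int, pcd: int, pwt: int, pat_reg: int = DEFAULT_PAT) -> str:
    """Select one of the 8 PAT entries with the 3-bit index PAT:PCD:PWT."""
    index = (pat << 2) | (pcd << 1) | pwt
    entry = (pat_reg >> (8 * index)) & 0x07   # low 3 bits of the chosen byte
    return MEMORY_TYPES[entry]


print(pat_memory_type(0, 0, 0), pat_memory_type(0, 0, 1))
```

With the default register value, the PCD/PWT combinations map to the same policies they select without PAT, which keeps older page tables working unchanged.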

Computer Architecture 2011 – VM x86 24 Protection attributes: R/W & U/S
• Read/write (R/W) flag
  – Specifies read-write privileges for
    • a page (if PTE)
    • a group of pages (if PDE)
  – 0 = read only
  – 1 = read & write
• User/supervisor (U/S) flag
  – Specifies privileges for a page (PTE) or group of pages (in case of a PDE that points to a page table)
  – 0 = supervisor privilege level
  – 1 = user privilege level
  – User-mode code accessing a supervisor page triggers an exception (a page fault)
    • Typically resulting in the termination of the program
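
Because both the PDE and the PTE carry R/W and U/S flags, an access has to pass both levels. The sketch below shows that combination for user-mode accesses; it is a simplification that lets supervisor code access everything (roughly the behaviour with CR0.WP clear), and `access_allowed` is a name made up for this example.

```python
RW = 1 << 1  # R/W flag (bit 1): 1 = writable
US = 1 << 2  # U/S flag (bit 2): 1 = user-accessible


def access_allowed(pde: int, pte: int, user_mode: bool, is_write: bool) -> bool:
    """Check a 4KB-page access against the PDE and PTE protection flags."""
    if not user_mode:
        return True   # simplification: supervisor accesses are never blocked
    if not (pde & US and pte & US):
        return False  # a supervisor-only level blocks user code
    if is_write and not (pde & RW and pte & RW):
        return False  # a read-only level blocks user writes
    return True


print(access_allowed(US | RW, US, user_mode=True, is_write=True))
```

In a real system a `False` result corresponds to the page-fault exception mentioned on the slide, with the OS deciding whether to terminate the program.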

Computer Architecture 2011 – VM x86 25 Misc issues
• Memory aliasing/sharing
  – When two (or more) PDEs point to a common page table
  – When two (or more) PTEs point to a common page
  – But SW must maintain consistency of the accessed & dirty bits in these PDEs & PTEs
• Base address of page directory
  – Physical address of the current page directory is stored in CR3
    • Also called the page-directory base register (PDBR)
  – PDBR is typically reloaded upon task switches
  – Page directory must remain in memory as long as the task is active

Computer Architecture 2011 – VM x86 26 32-BIT X86 EXTENDED PAGING

Computer Architecture 2011 – VM x86 27 PAE – Physical Address Extension
• 32-bit address imposes a limit
  – Means we can use memory <= 2^32 = 4GB
  – Too small for many systems
• PAE (physical address extension) support
  – Allows access to 2^36 bytes of RAM (= 64 GB)
  – But not directly (the linear address remains 32-bit)
• Only applicable when paging is enabled
  – And when PAE is also turned on in CR4
  – Support for 4KB and 2MB pages (rather than 4MB)

Computer Architecture 2011 – VM x86 28 PAE – Physical Address Extension
• Relies on an additional Page Directory Pointer Table
  – Lies above the page directory in the translation hierarchy
  – Has 4 entries of 64 bits each to support up to 4 page directories
  – PTEs are increased to 64 bits to accommodate 36-bit base physical addresses
  – Each 4KB page directory and page table can thus have up to 512 entries
  – CR3 contains the page-directory-pointer-table base address

Computer Architecture 2011 – VM x86 29 4KB Page Mapping with PAE
• Linear address divided to
  – Page-directory-pointer-table entry
    • Indexed by bits 31:30 of the linear address
    • Provides an offset to one of 4 entries in the page-directory-pointer table
    • The selected entry provides the base physical address of a page directory
  – Dir (9 bits) – points to a PDE in the page directory
    • PS in the PDE = 0
    • PDE provides a 24-bit, 4KB-aligned base physical address of a page table
  – Table (9 bits) – points to a PTE in the page table
    • PTE provides a 24-bit, 4KB-aligned base physical address of a 4KB page
  – Offset (12 bits) – offset within the selected 4KB page
[Figure: 32-bit linear address (4K page) split into Dir ptr (bits 31:30), DIR (bits 29:21), TABLE (bits 20:12) and OFFSET (bits 11:0). CR3 (PDPTR) holds the 27-bit, 32B-aligned base of the page directory pointer table; the Dir ptr entry selects a 512-entry page directory; the PDE selects a 512-entry page table; the PTE selects the 4KByte page.]
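
The 2/9/9/12 split used with PAE and 4KB pages can be written out as follows. The function names are hypothetical, and the code only illustrates the bit arithmetic, including how a 24-bit page base and a 12-bit offset form a 36-bit physical address.

```python
def split_pae_4k(va: int) -> tuple:
    """Split a 32-bit linear address into (DIR_PTR, DIR, TABLE, OFFSET)."""
    dir_ptr = (va >> 30) & 0x3    # bits 31:30, one of 4 page directories
    dir_index = (va >> 21) & 0x1FF  # bits 29:21, one of 512 PDEs
    table_index = (va >> 12) & 0x1FF  # bits 20:12, one of 512 PTEs
    offset = va & 0xFFF           # bits 11:0
    return dir_ptr, dir_index, table_index, offset


def pae_phys_4k(page_base_24bit: int, offset: int) -> int:
    """24-bit page base + 12-bit offset = 36-bit physical address."""
    return (page_base_24bit << 12) | offset


print(split_pae_4k(0xC0201003))
```

Each table shrinks from 1024 to 512 entries because the entries double in size while the tables stay 4KB, so one more level (the 4-entry pointer table) is needed to cover the same 32-bit linear space.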

Computer Architecture 2011 – VM x86 30 2MB Page Mapping with PAE
• Linear address divided to
  – Page-directory-pointer-table entry
    • Indexed by bits 31:30 of the linear address
    • Provides an offset to one of 4 entries in the page-directory-pointer table
    • The selected entry provides the base physical address of a page directory
  – Dir (9 bits) – points to a PDE in the page directory
    • PS in the PDE = 1
    • PDE provides a 15-bit, 2MB-aligned base physical address of a 2MB page
  – Offset (21 bits) – offset within the selected 2MB page
[Figure: 32-bit linear address (2MB page) split into Dir ptr (bits 31:30), DIR (bits 29:21) and OFFSET (bits 20:0). CR3 (PDPTR) holds the 27-bit, 32B-aligned base of the page directory pointer table; the Dir ptr entry selects a page directory; the PDE selects the 2MByte page.]
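
The corresponding 2/9/21 split for 2MB pages, again as a hedged sketch with invented function names; it also shows that a 15-bit page base plus a 21-bit offset yields the same 36-bit physical address width as in the 4KB case.

```python
def split_pae_2m(va: int) -> tuple:
    """Split a 32-bit linear address into (DIR_PTR, DIR, OFFSET)."""
    dir_ptr = (va >> 30) & 0x3      # bits 31:30
    dir_index = (va >> 21) & 0x1FF  # bits 29:21
    offset = va & 0x1FFFFF          # bits 20:0, offset in the 2MB page
    return dir_ptr, dir_index, offset


def pae_phys_2m(page_base_15bit: int, offset: int) -> int:
    """15-bit page base + 21-bit offset = 36-bit physical address."""
    return (page_base_15bit << 21) | offset


print(split_pae_2m(0x40345678))
```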

Computer Architecture 2011 – VM x86 31 PTE/PDE/PDP Entry Format with PAE
• The major differences in these entries are as follows:
  – A page-directory-pointer-table entry is added
  – The size of the entries is increased from 32 bits to 64 bits
  – The maximum number of entries in a page directory or page table is 512
  – The base physical address field in each entry is extended to 24 bits

Computer Architecture 2011 – VM x86 32 Paging in 64 bit Mode
• PAE paging structures expanded
  – Potentially support mapping a 64-bit linear address to a 52-bit physical address
  – First implementation supports mapping a 48-bit linear address into a 40-bit physical address
• A 4th page mapping table added: the page map level 4 table (PML4)
  – The base physical address of the PML4 is stored in CR3
  – A PML4 entry contains the base physical address of a page directory pointer table
• The page directory pointer table is expanded to 512 eight-byte entries
  – Indexed by 9 bits of the linear address
• The size of the PDE/PTE tables remains 512 eight-byte entries
  – Each indexed by nine linear-address bits
• The total of translated linear-address bits becomes 48 (4 × 9 index bits + 12 offset bits)
• PS flag in PDEs selects between 4-KByte and 2-MByte page sizes
  – CR4.PSE bit is ignored
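
The four 9-bit indices and the 12-bit offset of a 48-bit linear address can be extracted as sketched below. The canonical-form check reflects the sign extension of the upper 16 bits in 64-bit mode; the function names are hypothetical and only 4KB pages are handled.

```python
def is_canonical(va: int) -> bool:
    """True if bits 63:47 are all equal (bit 47 sign-extended upward)."""
    top = va >> 47  # the 17 bits 63:47
    return top == 0 or top == 0x1FFFF


def split_long_mode_4k(va: int) -> tuple:
    """Split a 64-bit linear address into (PML4, PDP, DIR, TABLE, OFFSET)."""
    if not 0 <= va < 2**64 or not is_canonical(va):
        raise ValueError("non-canonical linear address")
    indices = tuple((va >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return indices + (va & 0xFFF,)


print(split_long_mode_4k(0x00007FFFFFFFFFFF))
print(split_long_mode_4k(0xFFFF800000000000))
```

The two printed addresses are the last address of the lower canonical half and the first address of the upper half; everything in between is rejected by the check.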

Computer Architecture 2011 – VM x86 33 4KB Page Mapping in 64 bit Mode
[Figure: linear address (4K page) with sign-extended upper bits, followed by PML4 (9 bits), PDP (9 bits), DIR (9 bits), TABLE (9 bits) and OFFSET (12 bits). CR3 (40-bit physical address, 4KB aligned) points to the PML4 table; the PML4 entry selects a page directory pointer table; the PDP entry selects a 512-entry page directory; the PDE selects a page table; the PTE selects the 4KByte page.]

Computer Architecture 2011 – VM x86 34 2MB Page Mapping in 64 bit Mode
[Figure: linear address (2M page) with sign-extended upper bits, followed by PML4 (9 bits), PDP (9 bits), DIR (9 bits) and OFFSET (21 bits). CR3 (40-bit physical address, 4KB aligned) points to the PML4 table; the PML4 entry selects a page directory pointer table; the PDP entry selects a page directory; the PDE selects the 2MByte page.]

Computer Architecture 2011 – VM x86 35 PTE/PDE/PDP/PML4 Entry Format – 4KB Pages

Computer Architecture 2011 – VM x86 36 TLBs
• The processor saves most recently used PDEs and PTEs in TLBs
  – Separate TLBs for data and for instructions
  – Separate TLBs for 4-KByte and 2/4-MByte page sizes
• OS running at privilege level 0 can invalidate TLB entries
  – INVLPG instruction invalidates a specific PTE in the TLB
    • This instruction ignores the setting of the G flag
  – Whenever a PDE/PTE is changed (including when the present flag is set to zero), the OS must invalidate the corresponding TLB entry
  – All (non-global) TLB entries are automatically invalidated when CR3 is loaded
• The global (G) flag prevents frequently used pages from being automatically invalidated on a task switch
  – The entry remains in the TLB indefinitely
  – Only INVLPG can invalidate a global page entry
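
The invalidation rules on this slide can be captured in a toy TLB model. It is a sketch that assumes CR4.PGE=1, 4KB pages only and unlimited capacity; the class `ToyTLB` and its method names are made up, with `load_cr3` and `invlpg` standing in for the effects of the corresponding hardware operations.

```python
class ToyTLB:
    """Toy TLB: caches vpn -> (frame, is_global) with no capacity limit."""

    def __init__(self):
        self.entries = {}

    def insert(self, vpn, frame, is_global=False):
        self.entries[vpn] = (frame, is_global)

    def lookup(self, vpn):
        """Return the cached frame, or None on a TLB miss."""
        hit = self.entries.get(vpn)
        return None if hit is None else hit[0]

    def load_cr3(self):
        """Task switch: drop every non-global entry, keep global ones."""
        self.entries = {vpn: e for vpn, e in self.entries.items() if e[1]}

    def invlpg(self, va):
        """Drop the entry for one page, whether or not it is global."""
        self.entries.pop(va >> 12, None)


tlb = ToyTLB()
tlb.insert(0x10, 0x111)                  # ordinary user page
tlb.insert(0x20, 0x222, is_global=True)  # kernel page marked global
tlb.load_cr3()
print(tlb.lookup(0x10), tlb.lookup(0x20))
```

After the simulated CR3 reload only the global translation survives, which is the performance benefit the G flag is meant to provide across context switches.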