Computer Architecture 2011 – VM x86 1 Computer Architecture Virtual Memory (VM) – x86 By Dan Tsafrir, 30/5/2011 Presentation based on slides by Lihu Rappoport.

1 Computer Architecture 2011 – VM x86 1 Computer Architecture Virtual Memory (VM) – x86 By Dan Tsafrir, 30/5/2011 Presentation based on slides by Lihu Rappoport

2 http://www.youtube.com/watch?v=3ye2OXj32DM (funny beginning)

3 Reminder: VM motivation
• VM provides
  – Illusion of large memory
  – Illusion of contiguity
  – Ability to overcommit
  – Process isolation

4 Reminder: page table translates VA=>PA
[Figure: the virtual page number indexes the page table; each entry holds a valid bit and points to a physical memory frame or to a disk address]
• Think of it as a hash table that maps VA to PA

5 Reminder: TLB accelerates translation
• TLB is a VA => PA cache

6 Reminder: VM concepts
• A page can be
  – Not yet loaded
  – Loaded
  – On disk
• A loaded page can be
  – Dirty
  – Clean
• When a page is not loaded (P bit clear) => a page fault occurs
  – It may require evicting a loaded page to make room for the new one
    • The OS prioritizes eviction by LRU and the dirty/clean/avail bits
    • A dirty page must be written to disk before eviction; a clean page need not be
  – The new page is either loaded from disk or "initialized"
  – The CPU sets the page's "accessed" flag when the page is accessed, and its "dirty" flag when it is written

7 Goal
• In the context of x86…
• Provide a method to map
  – From: virtual address (used by program)
  – To: physical address
• Method should be efficient
  – Can generally be exercised by HW alone
  – Typically no SW involvement

8 32BIT X86 REGULAR PAGING

9 Hierarchical translation
• x86 supports 4KB & 4MB pages
  – Q: why would we want a 4MB page (called "super-page")?
  – A: TLB is small…
• Page directory
  – Each process has its own page directory (but threads share)
    • CR3 points to the page directory of the current process
  – Holds 1024 PDEs (page-directory entries), each 32 bits wide
  – Each PDE contains a PS ("page size") flag
    • PS=1: PDE points directly to a 4MB (super)page
    • PS=0: PDE points to a "page table" whose entries point to 4KB pages
• Page table
  – Holds 1024 PTEs (page-table entries), each 32 bits wide
  – Each PTE points to a 4KB page in physical memory

10 Mapping only 4KB pages (typical)
• 2-level hierarchy
  – All pages are 4KB aligned
  – Total of 2^20 (=1M) 4KB pages = 4GB
• DIR (10 bits)
  – Points to a PDE in the page directory
  – We assume all PDEs have PS=0
  – => Each PDE provides the 20-bit, 4KB-aligned base physical address of a 4KB page table (no superpaging)
• TABLE (10 bits)
  – Points to a PTE in the page table
  – PTE provides the 20-bit, 4KB-aligned base physical address of a 4KB page
• OFFSET (12 bits)
  – Offset within the selected 4KB page
[Figure: 32-bit linear address = DIR (bits 31:22) | TABLE (bits 21:12) | OFFSET (bits 11:0); CR3 (PDBR) => 4KB page directory of 1K PDEs => 4KB page table of 1K PTEs => 4KB page; 20-bit base + 12-bit offset = 32-bit physical address]
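The 10/10/12 split on this slide can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic; the function name `split_linear_4k` is made up for the example and is not part of the original slides.

```python
def split_linear_4k(va: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into (DIR, TABLE, OFFSET)."""
    va &= 0xFFFFFFFF                 # keep only 32 bits
    dir_idx = va >> 22               # bits 31:22 index the page directory
    table_idx = (va >> 12) & 0x3FF   # bits 21:12 index the page table
    offset = va & 0xFFF              # bits 11:0 are the byte offset in the page
    return dir_idx, table_idx, offset


# 0x12345678 -> DIR 0x48, TABLE 0x345, OFFSET 0x678
print([hex(x) for x in split_linear_4k(0x12345678)])
```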

11 Mapping only 4MB pages
• 1-level hierarchy
  – All pages are 4MB aligned
  – Total of 2^10 (=1K) 4MB pages = 4GB
• DIR (10 bits)
  – Points to a PDE in the page directory
  – We assume all PDEs have PS=1
  – => Each PDE provides the 10-bit, 4MB-aligned base physical address of a 4MB page (no page table involved)
• TABLE (10 bits)
  – None! (moved to offset)
• OFFSET (22 bits)
  – Offset within the selected 4MB page
• Fine print
  – Must set the PSE flag in CR4 for 4MB support to work
  – Otherwise, PS=1 flag settings are ignored
[Figure: 32-bit linear address = DIR (bits 31:22) | OFFSET (bits 21:0); CR3 (PDBR) => 4KB page directory of 1K PDEs => 4MB page; 10-bit base + 22-bit offset = 32-bit physical address]
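As a rough sketch of the 4MB case, the physical address is simply the PDE's top 10 bits concatenated with the low 22 bits of the linear address. The helper name `translate_4m` and the sample PDE value are assumptions made for illustration.

```python
def translate_4m(pde: int, va: int) -> int:
    """Translate through a PS=1 PDE (assumes CR4.PSE=1 and P=1)."""
    base = pde & 0xFFC00000    # bits 31:22: 4MB-aligned page base
    offset = va & 0x003FFFFF   # bits 21:0: offset within the 4MB page
    return base | offset


# Hypothetical PDE: page base 0x00C00000, flags P | R/W | PS = 0x83
print(hex(translate_4m(0x00C00083, 0x12345678)))  # 0xf45678
```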

12 Mixing 4KB & 4MB pages
• Works "out of the box"
  – When CR4.PSE=1
  – Alignment constraints: 4MB for superpages, 4KB for regular pages
• TLB issues?
  – No, as the CPU keeps 4MB and 4KB translations in separate TLBs
• Benefits
  – Superpages are often used for frequently used kernel code
  – Frees up 4KB TLB entries
  – Reduces TLB misses => improves overall system performance
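A minimal software model of the hardware walk with mixed page sizes might look as follows. It assumes CR4.PSE=1, models physical memory as a Python dict from physical address to 32-bit word, and ignores protection and A/D updates; `walk`, `PageFault`, and the sample addresses are all hypothetical.

```python
P = 1 << 0    # present flag
PS = 1 << 7   # page-size flag (meaningful in PDEs only)


class PageFault(Exception):
    """Raised when a PDE or PTE on the walk has P=0."""


def walk(mem: dict[int, int], cr3: int, va: int) -> int:
    """Translate a 32-bit linear address; mem maps physical address -> 32-bit word."""
    pde = mem.get((cr3 & 0xFFFFF000) + 4 * (va >> 22), 0)
    if not pde & P:
        raise PageFault(hex(va))
    if pde & PS:  # 4MB superpage: PDE maps the page directly
        return (pde & 0xFFC00000) | (va & 0x3FFFFF)
    pte = mem.get((pde & 0xFFFFF000) + 4 * ((va >> 12) & 0x3FF), 0)
    if not pte & P:
        raise PageFault(hex(va))
    return (pte & 0xFFFFF000) | (va & 0xFFF)


cr3 = 0x1000
mem = {
    0x1000: 0x00002001,  # PDE 0: page table at 0x2000, P
    0x1004: 0x00800081,  # PDE 1: 4MB page at 0x800000, P | PS
    0x2004: 0x00005001,  # PTE 1 of that table: 4KB frame at 0x5000, P
}
print(hex(walk(mem, cr3, 0x00001234)))  # 0x5234 (via page table)
print(hex(walk(mem, cr3, 0x00400010)))  # 0x800010 (via superpage)
```

Any address whose PDE or PTE is missing from `mem` raises `PageFault`, mirroring a clear P bit.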

13 PDE & PTE format
• 20-bit physical address
  – 4KB-aligned pointer
• 12 flag bits
  – Virtual memory: present, accessed, dirty
  – Protection: read, write, user, privileged
  – Caching: WB, WT, disable
  – 3 bits for OS usage
• Page directory entry
  – Bits 31:12: page frame address
  – Bits 11:9: AVAIL (available for OS use)
  – Bit 8: reserved (0)
  – Bit 7: page size (0: 4 Kbyte)
  – Bit 6: reserved (0)
  – Bit 5: A (accessed)
  – Bit 4: PCD (cache disable)
  – Bit 3: PWT (write-through)
  – Bit 2: U (user)
  – Bit 1: W (writable)
  – Bit 0: P (present)
• Page table entry
  – Bits 31:12: page frame address
  – Bits 11:9: AVAIL (available for OS use)
  – Bits 8:7: reserved (0)
  – Bit 6: D (dirty)
  – Bit 5: A (accessed)
  – Bit 4: PCD (cache disable)
  – Bit 3: PWT (write-through)
  – Bit 2: U (user)
  – Bit 1: W (writable)
  – Bit 0: P (present)
• Reserved bits are kept for future use and should be zero

14 4KB-page PTE format
• Bits 31:12: page base address
• Bits 11:9: AVAIL (available for OS use)
• Bit 8: G (global page)
• Bit 7: PAT (page attribute table index)
• Bit 6: D (dirty)
• Bit 5: A (accessed)
• Bit 4: PCD (cache disable)
• Bit 3: PWT (write-through)
• Bit 2: U/S (user / supervisor)
• Bit 1: R/W (writable)
• Bit 0: P (present)
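To make the layout concrete, here is a small decoder for a 32-bit 4KB-page PTE. The bit positions follow the slide; the function `decode_pte`, the table `PTE_FLAGS`, and the sample value are illustrative assumptions.

```python
# Bit position -> flag name, per the 4KB-page PTE layout
PTE_FLAGS = {0: "P", 1: "R/W", 2: "U/S", 3: "PWT", 4: "PCD",
             5: "A", 6: "D", 7: "PAT", 8: "G"}


def decode_pte(pte: int) -> dict:
    """Return the page base, the OS-available bits, and the names of set flags."""
    return {
        "base": pte & 0xFFFFF000,        # bits 31:12
        "avail": (pte >> 9) & 0x7,       # bits 11:9
        "flags": [name for bit, name in PTE_FLAGS.items() if (pte >> bit) & 1],
    }


# Hypothetical PTE: frame 0x12345000, present, writable, user, accessed, dirty
print(decode_pte(0x12345067))
```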

15 4KB-page PDE format
• Bits 31:12: page table base address
• Bits 11:9: AVAIL (available for OS use)
• Bit 8: G (global page; ignored)
• Bit 7: PS (page size; 0 indicates 4 Kbytes)
• Bit 6: AVL (available)
• Bit 5: A (accessed)
• Bit 4: PCD (cache disable)
• Bit 3: PWT (write-through)
• Bit 2: U/S (user / supervisor)
• Bit 1: R/W (writable)
• Bit 0: P (present)

16 4MB-page PDE format
• Bits 31:22: page base address
• Bits 21:13: reserved
• Bit 12: PAT (page attribute table index)
• Bits 11:9: AVAIL (available for OS use)
• Bit 8: G (global page)
• Bit 7: PS (page size; 1 indicates 4 Mbytes)
• Bit 6: D (dirty)
• Bit 5: A (accessed)
• Bit 4: PCD (cache disable)
• Bit 3: PWT (write-through)
• Bit 2: U/S (user / supervisor)
• Bit 1: R/W (writable)
• Bit 0: P (present)

17 VM attributes: present flag (P)
• Set => page is in physical memory
  – Translation is carried out by the MMU (memory management unit)
• Clear => page is not in physical memory
  – When encountered by the MMU => generates a page-fault exception
  – Faulting address is available to the SW exception handler
• MMU does not set/clear this flag (only reads it)
  – It's up to the OS
• Upon a page-fault exception => the OS typically does the following:
  1. Copy the page from disk to memory (unless already in the buffer cache)
  2. Update the PTE/PDE with the page's RAM address
  3. P = 1; dirty = accessed = 0; etc.
  4. Invalidate the associated PTE in the TLB
  5. Resume the program at the faulting instruction
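The five OS steps listed on this slide can be mimicked with a toy model. This is a sketch only: the page table and TLB are plain dicts keyed by virtual page number, the disk copy is left as a comment, and every name here (`handle_page_fault`, `free_frames`, and so on) is hypothetical rather than taken from any real kernel.

```python
P = 1 << 0  # present flag; A (bit 5) and D (bit 6) start out clear


def handle_page_fault(page_table: dict[int, int], tlb: dict[int, int],
                      free_frames: list[int], vpn: int) -> int:
    """Toy handler: returns the VPN whose access should be retried."""
    frame = free_frames.pop()              # step 1: pick a frame (disk copy omitted)
    page_table[vpn] = (frame << 12) | P    # steps 2-3: new base, P=1, A=D=0
    tlb.pop(vpn, None)                     # step 4: drop any stale TLB entry
    return vpn                             # step 5: caller re-executes the access


page_table, tlb, free_frames = {}, {7: 0xDEAD}, [3, 9]
handle_page_fault(page_table, tlb, free_frames, 7)
print(hex(page_table[7]), tlb, free_frames)  # 0x9001 {} [3]
```

A real handler would also have to evict a page when no free frame exists, which this sketch does not attempt.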

18 VM attributes: page size flag (PS)
• In PDEs only
• Determines the page size
  – Clear => page size = 4KB (& PDE points to a page table)
  – Set => page size = 4MB (& PDE points to a superpage)

19 VM attributes: accessed (A) & dirty (D)
• MMU sets the A-flag
  – Upon the first time a page (or page table) is accessed (load or store)
• MMU sets the D-flag
  – Upon the first time a page is written (store only)
• A & D are sticky
  – Once set, the MMU (=HW) never clears them
  – Only SW does
• OS clears them
  – When initially loading the PTE
  – Possibly from time to time as part of an LRU approximation (used to decide which pages to swap out and which to keep)
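One common way software can use the sticky A flag for an LRU approximation is the second-chance ("clock") policy. The slides do not say which policy an OS uses, so the sketch below is an assumed example, with a hypothetical `pick_victim` helper operating on a list of PTE values.

```python
A = 1 << 5  # accessed flag


def pick_victim(ptes: list[int], hand: int) -> tuple[int, int]:
    """Second-chance scan: returns (victim index, next hand position).

    Entries with A=1 get A cleared by software and are skipped once;
    the first entry found with A=0 is chosen for eviction.
    """
    n = len(ptes)
    while True:
        if ptes[hand] & A:
            ptes[hand] &= ~A          # SW clears A; HW sets it again on next access
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n


ptes = [0x1021, 0x2001, 0x3021]       # entries 0 and 2 were recently accessed
print(pick_victim(ptes, 0))           # (1, 2)
```

In a real system, clearing A in a PTE would also require invalidating the matching TLB entry, otherwise the MMU may not set the flag again.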

20 VM attributes: global flag (G)
• Has effect only when PGE=1 in CR4
• When set, indicates the page is "global"
  – Not flushed from the TLB when CR3 is loaded
  – Ignored for PDEs with PS=0 (that point to page tables)
• Used to improve performance
  – Keeps important OS pages in the TLB across context switches
• Only software can set or clear this flag

21 Cache attributes: PWT
• PWT
  – Means "page-level write-through"
• Controls the write-through / write-back caching policy of a page / PT
  – 1: enable write-through caching
  – 0: disable write-through => enable write-back caching
• Ignored if
  – The CD ("cache disable") flag is set in CR0
  – The associated PCD is on

22 Cache attributes: PCD
• PCD
  – Means "page-level cache disable" flag
• Controls caching of individual pages / PTs
  – 1: caching of the associated page/PT is prevented
  – 0: caching allowed
• Used
  – When caching doesn't help performance (e.g., streaming)
  – For memory-mapped I/O ports used to communicate with devices
• Assumed to be set (regardless of actual value)
  – If the CD ("cache disable") flag in CR0 is set
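The interaction between PWT, PCD, and CR0.CD described on these two slides can be summarized as a small decision function. It is a simplification that deliberately ignores PAT and any other source of memory typing; the name `cache_policy` is made up for the example.

```python
def cache_policy(pwt: bool, pcd: bool, cr0_cd: bool = False) -> str:
    """Effective caching policy for a page, per the PWT/PCD rules only."""
    if cr0_cd or pcd:      # CR0.CD set => PCD is treated as set; PWT is ignored
        return "uncached"
    return "write-through" if pwt else "write-back"


for pwt in (False, True):
    for pcd in (False, True):
        print(f"PWT={int(pwt)} PCD={int(pcd)} -> {cache_policy(pwt, pcd)}")
```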

23 Cache attributes: PAT
• PAT
  – Means "page attribute table index" flag
• If on, used along with the PCD & PWT flags to select an entry in the PAT
  – Which in turn selects the memory type for the page
  – The PAT is a 64-bit register
  – (Not going into the details)

24 Protection attributes: R/W & U/S
• Read/write (R/W) flag
  – Specifies read-write privileges for
    • a page (if PTE),
    • a group of pages (if PDE)
  – 0 = read only
  – 1 = read & write
• User/supervisor (U/S) flag
  – Specifies privileges for a page (PTE) or a group of pages (in case of a PDE that points to a page table)
  – 0 = supervisor privilege level
  – 1 = user privilege level
  – A user-mode access to a supervisor page triggers a page-fault exception
    • Typically resulting in the termination of the program
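A hedged sketch of the resulting permission check for a 4KB mapping follows. It assumes the effective rights are the AND of the PDE and PTE flags, and it adds a `wp` parameter standing in for CR0.WP (whether supervisor writes honor read-only pages), which the slides do not cover; `access_allowed` is a hypothetical name.

```python
RW = 1 << 1   # 1 = read & write, 0 = read only
US = 1 << 2   # 1 = user, 0 = supervisor


def access_allowed(pde: int, pte: int, user: bool, write: bool,
                   wp: bool = True) -> bool:
    """True if the access passes the R/W and U/S checks (P bits assumed set)."""
    eff = pde & pte                      # a right must be granted at both levels
    if user and not eff & US:
        return False                     # user touching a supervisor page
    if write and not eff & RW:
        return (not user) and (not wp)   # only supervisor with WP off may write
    return True


print(access_allowed(0x007, 0x005, user=True, write=False))  # True: user read
print(access_allowed(0x007, 0x005, user=True, write=True))   # False: read-only page
print(access_allowed(0x007, 0x003, user=True, write=False))  # False: supervisor page
```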

25 Misc issues
• Memory aliasing/sharing
  – When two (or more) PDEs point to a common page table
  – When two (or more) PTEs point to a common page
  – But SW must maintain consistency of the accessed & dirty bits in these PDEs & PTEs
• Base address of the page directory
  – Physical address of the current page directory is stored in CR3
    • Also called the page-directory-base-register (PDBR)
  – PDBR is typically reloaded upon task switches
  – The page directory must remain in memory as long as the task is active

26 32BIT X86 EXTENDED PAGING

27 PAE – Physical Address Extension
• A 32-bit address imposes a limit
  – Means we can use memory <= 2^32 = 4GB
  – Too small for many systems
• PAE (physical address extension) support
  – Allows access to 2^36 bytes of RAM (= 64 GB)
  – But not directly (the linear address remains 32 bits)
• Only applicable when paging is enabled
  – And when PAE is also turned on in CR4
  – Supports 4KB and 2MB pages (rather than 4MB)

28 PAE – Physical Address Extension
• Relies on an additional Page Directory Pointer Table
  – Lies above the page directory in the translation hierarchy
  – Has 4 entries of 64 bits each, to support up to 4 page directories
  – PDEs and PTEs are widened to 64 bits to accommodate 36-bit physical addresses
  – Each 4KB page directory and page table thus holds up to 512 entries (4KB / 8 bytes)
  – CR3 contains the page-directory-pointer-table base address

29 4KB Page Mapping with PAE
• Linear address divided into
  – Page-directory-pointer-table entry
    • Indexed by bits 31:30 of the linear address
    • Provides an offset to one of the 4 entries in the page-directory-pointer table
    • The selected entry provides the base physical address of a page directory
  – Dir (9 bits): points to a PDE in the page directory
    • PS in the PDE = 0
    • PDE provides a 24-bit, 4KB-aligned base physical address of a page table
  – Table (9 bits): points to a PTE in the page table
    • PTE provides a 24-bit, 4KB-aligned base physical address of a 4KB page
  – Offset (12 bits): offset within the selected 4KB page
[Figure: 32-bit linear address = Dir ptr (bits 31:30) | DIR (bits 29:21) | TABLE (bits 20:12) | OFFSET (bits 11:0); CR3 (PDPTR, 32 bits, 32-byte aligned) => 4-entry page directory pointer table => 512-entry page directory => 512-entry page table => 4KB page]
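The 2/9/9/12 division of the linear address under PAE can be written out as below. This is illustrative bit arithmetic only; `split_pae_4k` is a made-up name.

```python
def split_pae_4k(va: int) -> tuple[int, int, int, int]:
    """Split a 32-bit linear address into (DIR PTR, DIR, TABLE, OFFSET) under PAE."""
    va &= 0xFFFFFFFF
    dir_ptr = va >> 30              # bits 31:30 pick 1 of 4 PDPT entries
    dir_idx = (va >> 21) & 0x1FF    # bits 29:21 pick 1 of 512 PDEs
    table_idx = (va >> 12) & 0x1FF  # bits 20:12 pick 1 of 512 PTEs
    offset = va & 0xFFF             # bits 11:0
    return dir_ptr, dir_idx, table_idx, offset


# 0x12345678 -> DIR PTR 0, DIR 0x91, TABLE 0x145, OFFSET 0x678
print([hex(x) for x in split_pae_4k(0x12345678)])
```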

30 2MB Page Mapping with PAE
• Linear address divided into
  – Page-directory-pointer-table entry
    • Indexed by bits 31:30 of the linear address
    • Provides an offset to one of the 4 entries in the page-directory-pointer table
    • The selected entry provides the base physical address of a page directory
  – Dir (9 bits): points to a PDE in the page directory
    • PS in the PDE = 1
    • PDE provides a 15-bit, 2MB-aligned base physical address of a 2MB page
  – Offset (21 bits): offset within the selected 2MB page
[Figure: 32-bit linear address = Dir ptr (bits 31:30) | DIR (bits 29:21) | OFFSET (bits 20:0); CR3 (PDPTR, 32 bits, 32-byte aligned) => page directory pointer table => page directory => 2MB page]

31 PTE/PDE/PDP Entry Format with PAE
• The major differences in these entries are as follows:
  – A page-directory-pointer-table entry is added
  – The size of the entries is increased from 32 bits to 64 bits
  – The maximum number of entries in a page directory or page table is 512
  – The base physical address field in each entry is extended to 24 bits

32 Paging in 64 bit Mode
• PAE paging structures expanded
  – Potentially support mapping a 64-bit linear address to a 52-bit physical address
  – First implementation supports mapping a 48-bit linear address into a 40-bit physical address
• A 4th page mapping table added: the page map level 4 table (PML4)
  – The base physical address of the PML4 is stored in CR3
  – A PML4 entry contains the base physical address of a page directory pointer table
• The page directory pointer table is expanded to 512 8-byte entries
  – Indexed by 9 bits of the linear address
• The size of the PDE/PTE tables remains 512 eight-byte entries
  – Each indexed by nine linear-address bits
• The total number of linear-address bits used becomes 48 (4 x 9 index bits + 12 offset bits)
• PS flag in PDEs selects between 4-KByte and 2-MByte page sizes
  – CR4.PSE bit is ignored

33 4KB Page Mapping in 64 bit Mode
[Figure: 64-bit linear address = sign extension (bits 63:48) | PML4 (bits 47:39) | PDP (bits 38:30) | DIR (bits 29:21) | TABLE (bits 20:12) | OFFSET (bits 11:0); CR3 (40 bits, 4KB aligned) => 512-entry PML4 table => 512-entry page directory pointer table => 512-entry page directory => 512-entry page table => 4KB page]
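For the 48-bit case, the four 9-bit indices and the sign-extension requirement can be sketched as follows. The helper names `is_canonical` and `split_4level` are assumptions for illustration, and addresses are taken to be unsigned 64-bit values.

```python
def is_canonical(va: int) -> bool:
    """True if bits 63:48 are a sign extension of bit 47."""
    top = (va & 0xFFFFFFFFFFFFFFFF) >> 47   # bits 63:47 must be all 0s or all 1s
    return top == 0 or top == 0x1FFFF


def split_4level(va: int) -> tuple[int, int, int, int, int]:
    """Return (PML4, PDP, DIR, TABLE, OFFSET) for a 4KB page mapping."""
    idx = tuple((va >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return idx + (va & 0xFFF,)


print(split_4level(0x00007FFFFFFFF000))   # (255, 511, 511, 511, 0)
print(is_canonical(0xFFFF800000000000))   # True
print(is_canonical(0x0000800000000000))   # False
```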

34 2MB Page Mapping in 64 bit Mode
[Figure: 64-bit linear address = sign extension (bits 63:48) | PML4 (bits 47:39) | PDP (bits 38:30) | DIR (bits 29:21) | OFFSET (bits 20:0); CR3 (40 bits, 4KB aligned) => 512-entry PML4 table => 512-entry page directory pointer table => 512-entry page directory => 2MB page]

35 PTE/PDE/PDP/PML4 Entry Format – 4KB Pages

36 TLBs
• The processor saves the most recently used PDEs and PTEs in TLBs
  – Separate TLBs for data and instructions
  – Separate TLBs for 4-KByte and 2/4-MByte page sizes
• OS running at privilege level 0 can invalidate TLB entries
  – The INVLPG instruction invalidates a specific PTE in the TLB
    • This instruction ignores the setting of the G flag
  – Whenever a PDE/PTE is changed (including when the present flag is set to zero), the OS must invalidate the corresponding TLB entry
  – All (non-global) TLB entries are automatically invalidated when CR3 is loaded
• The global (G) flag prevents frequently used pages from being automatically invalidated on a task switch
  – The entry survives CR3 reloads
  – INVLPG can still invalidate a global page entry

