1
Introduction to the Intel x86’s support for “virtual” memory
IA32 Paging Scheme
2
What is ‘paging’?
It’s a scheme for dynamically remapping addresses for fixed-size memory-blocks.
[Figure: blocks of the virtual address-space remapped onto blocks of the physical address-space]
3
What’s ‘paging’ good for?
For efficient ‘time-sharing’ among multiple tasks, an operating system needs to have several programs residing in main memory at the same time.
To accomplish this using actual physical memory-addressing would require doing address-relocation calculations each time a program was loaded (to avoid conflicting with any addresses already being used).
4
Why use ‘Paging’?
Use of ‘paging’ allows ‘relocations’ to be done just once (by the linker), and every program can ‘reuse’ the same addresses.
[Figure: Task #1, Task #2 and Task #3 each mapped into a different region of physical memory]
5
How to enable paging
Control Register CR0 (fields shown, from bit 31 down to bit 0): PG CD NW AM WP NE ET TS EM MP PE
‘Protected-Mode’ must be enabled (PE=1). Then ‘Paging’ can be enabled (set PG=1).
# Here is how you can enable paging (if CPU is in protected-mode)
    mov %cr0, %eax    # get current machine status
    bts $31, %eax     # turn on the PG-bit’s image
    mov %eax, %cr0    # put modified status in CR0
    jmp               # now flush the prefetch queue
# but you had better prepare the ‘mapping’ beforehand!
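The bit manipulation in the fragment above can be modeled in C on an ordinary 32-bit value. This is only a sketch: the names CR0_PE, CR0_PG and cr0_enable_paging are hypothetical, and real system code would read and write CR0 itself with the privileged mov instructions shown above.

```c
#include <stdint.h>
#include <stdbool.h>

#define CR0_PE (1u << 0)    /* Protection Enable: bit 0 of CR0 */
#define CR0_PG (1u << 31)   /* Paging: bit 31, the bit that 'bts $31' sets */

/* Compute the CR0 image that turns paging on (hypothetical helper).
 * Returns false and leaves *cr0 unchanged when PE is clear, because
 * paging can only be enabled while the CPU is in protected-mode. */
bool cr0_enable_paging(uint32_t *cr0)
{
    if ((*cr0 & CR0_PE) == 0)
        return false;
    *cr0 |= CR0_PG;
    return true;
}
```

Only the PG bit changes; every other bit of the status image is preserved, which is why the assembly version uses bts rather than loading a constant.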
6
Several ‘paging’ schemes
Intel’s design for ‘paging’ has continued to evolve since its introduction in the 386 CPU.
Our Core-2 Quad CPUs support the initial ‘paging’ design (plus several extensions).
Here we shall describe the initial design (it’s the simplest, and it remains the ‘default’).
It is based on subdividing the entire 4GB physical address-space into 4KB blocks.
7
Terminology
The 4KB memory-blocks are called ‘page frames’ -- and they are non-overlapping.
Therefore each page-frame begins at a memory-address which is a multiple of 4K.
Remember: 4K = 4 x 1024 = 4096 = 2^12
So the address of any page-frame will have its lowest 12 bits equal to zeros.
Example: page six begins at 0x00006000
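The alignment arithmetic can be written out in a few lines of C. This is a minimal sketch; the macro and function names (PAGE_SHIFT, frame_base, and so on) are illustrative and are not taken from the course materials.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12u                    /* 4K = 2^12 */
#define PAGE_SIZE  (1u << PAGE_SHIFT)     /* 4096 bytes per page-frame */
#define PAGE_MASK  (PAGE_SIZE - 1u)       /* the lowest 12 bits: 0xFFF */

/* Starting address of page-frame number n. */
uint32_t frame_base(uint32_t n)           { return n << PAGE_SHIFT; }

/* Number of the page-frame that contains a given address. */
uint32_t frame_number_of(uint32_t addr)   { return addr >> PAGE_SHIFT; }

/* True when the lowest 12 bits are all zeros. */
bool is_page_aligned(uint32_t addr)       { return (addr & PAGE_MASK) == 0; }
```

For instance, frame_base(6) evaluates to 0x00006000, and is_page_aligned(0x000B8000) is true.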
8
Control Register CR3
Register CR3 is used by the CPU to find the paging-tables in memory which define its ‘virtual-to-physical’ address-translation.
Specifically, CR3 points to a page-frame, called the Page Directory, which contains addresses of frames called Page Tables.
An address in CR3 must be ‘page aligned’.
CR3 = Physical Address of the Page-Directory
9
Two-Level Translation Scheme
[Figure: CR3 points to the PAGE DIRECTORY; its entries point to PAGE TABLES; their entries point to PAGE FRAMES]
10
Page-Directory
The Page-Directory occupies one frame, so it has room for 1024 4-byte entries.
Each page-directory entry can contain a pointer to a further data-structure, called a Page-Table (also page-aligned and 4KB in size).
Each Page-Table occupies one frame and has enough room for 1024 4-byte entries.
Page-Table entries can contain pointers to Page-Frames.
11
Address-translation
The CPU examines any virtual address it encounters, subdividing it into three fields:
index into page-directory (10 bits): this field selects one of the 1024 array-entries in the Page-Directory
index into page-table (10 bits): this field selects one of the 1024 array-entries in that Page-Table
offset into page-frame (12 bits): this field provides the offset to one of the 4096 bytes in that Page-Frame
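The three-field split amounts to two shifts and two masks. The sketch below shows one way to express it in C; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address (names are illustrative). */
struct va_fields {
    uint32_t dir_index;     /* bits 31..22: index into page-directory */
    uint32_t table_index;   /* bits 21..12: index into page-table     */
    uint32_t offset;        /* bits 11..0 : offset into page-frame    */
};

struct va_fields split_virtual_address(uint32_t va)
{
    struct va_fields f;
    f.dir_index   = va >> 22;              /* top 10 bits    */
    f.table_index = (va >> 12) & 0x3FFu;   /* middle 10 bits */
    f.offset      = va & 0xFFFu;           /* low 12 bits    */
    return f;
}
```

As a worked case, the virtual address 0xC0401234 splits into directory index 0x301, table index 0x001 and offset 0x234.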
12
Identity-mapping
When the CPU first turns on the ‘paging’ capability, it must be executing code from an ‘identity-mapped’ page (or it crashes!)
[Figure: the code page sits at the same address in physical memory and in ‘virtual’ memory (identity-mapping)]
13
Additional mappings
Besides having at least one page that is ‘identity-mapped’ (for turning ‘paging’ on), there can be multiple other mappings.
[Figure: the code page is identity-mapped, while data pages in ‘virtual’ memory are mapped to data pages at other addresses in physical memory]
14
Demo program
We wrote a very simple demo-program showing how to create a Page-Directory and a Page-Table for an identity-mapping of the page-frame that contains program-code, plus a non-identity mapping for the initial page of the video display memory.
This demo is named ‘vrampage.s’ (you can find it on our CS 630 course website).
15
Demo’s page-mapping
[Figure: CR3 points to the page-directory; physical memory holds the program arena, one page-table, the page-directory and video memory; the rest is unused]
Our ‘vrampage.s’ demo-program uses only four page-frames of physical memory (16K):
1) the program’s arena (at 0x )
2) the page-directory (at 0x )
3) only one page-table (at 0x )
4) one page of vram (at 0x000B8000)
16
Virtual-to-Physical
[Figure: the ‘virtual’ address-space (code and data, video memory) mapped onto the physical address-space (code and data, page-directory, page-table, video memory at 0x000B8000)]
17
The demo’s table-entries
Our page-directory uses only one entry:
pgdir[0x000]: 0x
And our page-table uses only two entries:
pgtbl[0x000]: 0x000B8 003
pgtbl[0x010]: 0x (identity-mapping)
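To see how these three entries drive a translation, the two-level lookup can be simulated in C over a small array standing in for physical memory. This is a sketch under stated assumptions, not the ‘vrampage.s’ code: the table locations PGDIR_PA and PGTBL_PA are hypothetical, and the attribute value 3 (present and writable) on the directory entry and on the identity-mapped entry is assumed.

```c
#include <stdint.h>
#include <stdbool.h>

#define SIM_FRAMES 32u                          /* simulated RAM: 32 frames = 128 KB */
static uint32_t sim_ram[SIM_FRAMES * 1024u];    /* 1024 longwords per 4-KB frame */

#define ENTRY_P 0x001u                          /* Present  */
#define ENTRY_W 0x002u                          /* Writable */
#define BASE_MASK 0xFFFFF000u                   /* upper 20 bits: frame address */

#define PGDIR_PA 0x00018000u                    /* hypothetical page-directory frame */
#define PGTBL_PA 0x00019000u                    /* hypothetical page-table frame     */

static uint32_t phys_read32(uint32_t pa)              { return sim_ram[pa / 4u]; }
static void     phys_write32(uint32_t pa, uint32_t v) { sim_ram[pa / 4u] = v; }

/* One directory entry and two table entries, as in the demo. */
void build_demo_tables(void)
{
    phys_write32(PGDIR_PA + 4u * 0x000u, PGTBL_PA    | ENTRY_P | ENTRY_W);
    phys_write32(PGTBL_PA + 4u * 0x000u, 0x000B8000u | ENTRY_P | ENTRY_W); /* vram   */
    phys_write32(PGTBL_PA + 4u * 0x010u, 0x00010000u | ENTRY_P | ENTRY_W); /* assumed identity entry */
}

/* Walk directory, then table; returns false where the CPU would page-fault. */
bool translate(uint32_t cr3, uint32_t va, uint32_t *pa)
{
    uint32_t pde = phys_read32((cr3 & BASE_MASK) + 4u * (va >> 22));
    if (!(pde & ENTRY_P))
        return false;
    uint32_t pte = phys_read32((pde & BASE_MASK) + 4u * ((va >> 12) & 0x3FFu));
    if (!(pte & ENTRY_P))
        return false;
    *pa = (pte & BASE_MASK) | (va & 0xFFFu);
    return true;
}
```

With these tables, virtual address 0x00000123 translates to physical 0x000B8123 (the non-identity mapping of video memory), 0x00010ABC translates to itself, and any other page is reported as not present.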
18
The segment descriptors
Our demo’s GDT uses three descriptors:
‘executable’ segment at virtual-address 0x : 0x A010000FFFF
‘writable’ segment at virtual-address 0x : 0x FFFF
‘writable’ segment at virtual-address 0x : 0x FFFF
19
Page-Level ‘protection’
Each entry in a Page-Table can assign a collection of ‘attributes’ to the Page-Frame it points to; for example:
The P-bit (page is ‘present’) can be used by an operating system to implement “demand paging” and “memory-mapping” of disk-files.
The W/R-bit can be used to mark a page as either ‘Writable’ (=1) or as ‘Read-Only’ (=0).
The U/S-bit can be used to mark a page as ‘User-accessible’ or as ‘Supervisor-only’.
20
Format of a Page-Table entry
Fields, from bit 31 down to bit 0: PAGE-FRAME BASE ADDRESS | AVAIL | D | A | PCD | PWT | U | W | P
LEGEND
P = Present (1 = yes, 0 = no)
W = Writable (1 = yes, 0 = no)
U = User (1 = yes, 0 = no)
A = Accessed (1 = yes, 0 = no)
D = Dirty (1 = yes, 0 = no)
PWT = Page Write-Through (1 = yes, 0 = no)
PCD = Page Cache-Disable (1 = yes, 0 = no)
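The entry layout translates directly into bit masks. The following C sketch packs and unpacks a Page-Table entry; the macro and function names are illustrative, while the bit positions are the standard ones for 4KB pages (P is bit 0, D is bit 6, and the frame address occupies bits 31..12).

```c
#include <stdint.h>
#include <stdbool.h>

#define PTE_P    (1u << 0)     /* Present            */
#define PTE_W    (1u << 1)     /* Writable           */
#define PTE_U    (1u << 2)     /* User               */
#define PTE_PWT  (1u << 3)     /* Page Write-Through */
#define PTE_PCD  (1u << 4)     /* Page Cache-Disable */
#define PTE_A    (1u << 5)     /* Accessed           */
#define PTE_D    (1u << 6)     /* Dirty              */
#define PTE_BASE_MASK 0xFFFFF000u

/* Combine a page-aligned frame address with attribute bits. */
uint32_t make_pte(uint32_t frame_base, uint32_t flags)
{
    return (frame_base & PTE_BASE_MASK) | (flags & 0xFFFu);
}

/* Recover the page-frame base address from an entry. */
uint32_t pte_frame_base(uint32_t pte)
{
    return pte & PTE_BASE_MASK;
}

/* A write can only succeed through an entry that is present and writable. */
bool pte_allows_write(uint32_t pte)
{
    return (pte & (PTE_P | PTE_W)) == (PTE_P | PTE_W);
}
```

For example, make_pte(0x000B8000, PTE_P | PTE_W) yields 0x000B8003, the frame address 0x000B8 followed by the attribute value 003.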
21
Format of a Page-Directory entry
Fields, from bit 31 down to bit 0: PAGE-TABLE BASE ADDRESS | AVAIL | PS | A | PCD | PWT | U | W | P
LEGEND
P = Present (1 = yes, 0 = no)
W = Writable (1 = yes, 0 = no)
U = User (1 = yes, 0 = no)
A = Accessed (1 = yes, 0 = no)
PS = Page-Size (0 = 4KB, 1 = 4MB)
PWT = Page Write-Through (1 = yes, 0 = no)
PCD = Page Cache-Disable (1 = yes, 0 = no)
NOTE: The PS-bit is only meaningful when the PSE-bit in register CR4 is set
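When PS=1 takes effect, the directory entry maps a 4MB page directly, so the virtual address splits into a 10-bit directory index and a 22-bit offset with no Page-Table in between. The C sketch below illustrates that case under simplifying assumptions: the names are hypothetical, and it ignores any use of the entry's lower address bits for physical addresses above 4GB.

```c
#include <stdint.h>
#include <stdbool.h>

#define PDE_P   (1u << 0)   /* Present */
#define PDE_PS  (1u << 7)   /* Page-Size: 0 = 4KB, 1 = 4MB */

/* The PS-bit only counts when the PSE-bit in CR4 is set. */
bool pde_maps_4mb_page(uint32_t pde, bool cr4_pse)
{
    return cr4_pse && (pde & PDE_P) && (pde & PDE_PS);
}

/* For a 4MB page: the upper 10 bits come from the entry,
 * the lower 22 bits come from the virtual address. */
uint32_t translate_4mb(uint32_t pde, uint32_t va)
{
    return (pde & 0xFFC00000u) | (va & 0x003FFFFFu);
}
```

With an entry of 0x10000083 (present, writable, PS=1), the virtual address 0x00412345 would translate to 0x10012345.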
22
Violations
When a task violates the page-attributes of any Page-Frame, the CPU will generate a ‘Page-Fault’ Exception (interrupt 0x0E).
Then the operating system’s page-fault exception-handler gets control and can take whatever action it deems suitable.
The CPU will provide help to the OS in determining why a Page-Fault occurred.
23
The Error-Code format
The CPU will push an Error-Code onto the operating system’s stack.
Fields, from the high-order bits down to bit 0: reserved (=0) | U/S | W/R | P
Legend:
P (Present): 0 = attempted access was to a ‘not-present’ page
W/R (Write/Read): 1 = attempted to write to a ‘read-only’ page
U/S (User/Supervisor): 1 = user attempted access to a ‘supervisor’ page
NOTE: ‘User’ means that CPL = 3; ‘Supervisor’ means that CPL = 0, 1, or 2
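A page-fault handler typically begins by unpacking these three bits. The following C sketch shows one possible decoding; the struct and function names are hypothetical, and the bit positions follow the layout above (P is bit 0, W/R is bit 1, U/S is bit 2).

```c
#include <stdint.h>
#include <stdbool.h>

#define PF_ERR_P   (1u << 0)   /* 0 = page was 'not-present'       */
#define PF_ERR_WR  (1u << 1)   /* 1 = the access was a write       */
#define PF_ERR_US  (1u << 2)   /* 1 = the access came from CPL = 3 */

/* Decoded view of a Page-Fault Error-Code (illustrative names). */
struct pf_info {
    bool page_was_present;   /* false: fault on a 'not-present' page */
    bool was_write;          /* true: write access, false: read      */
    bool from_user;          /* true: CPL = 3, false: CPL = 0, 1, 2  */
};

struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;
    info.page_was_present = (code & PF_ERR_P)  != 0;
    info.was_write        = (code & PF_ERR_WR) != 0;
    info.from_user        = (code & PF_ERR_US) != 0;
    return info;
}
```

An Error-Code of 0x7, for example, describes a user-mode write to a page that is present, which is the signature of a protection violation rather than of a missing page.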
24
Control Register CR2
Whenever a ‘Page-Fault’ is encountered, the CPU will save the virtual-address that caused that fault into the CR2 register.
If the CPU was trying to modify the value of an operand in a ‘read-only’ page, then that operand’s virtual address is written into CR2.
If the CPU was trying to read the value of an operand in a supervisor-only page (or was trying to fetch-and-execute an instruction) while CPL=3, the relevant virtual address will be written into CR2.
25
CR3 and Task-Switching
[Figure: the 32-bit Task-State Segment, 26 longwords; byte offset followed by field]
  0: link     4: esp0     8: ss0     12: esp1    16: ss1     20: esp2    24: ss2
 28: PTDB    32: EIP     36: EFLAGS  40: EAX     44: ECX     48: EDX     52: EBX
 56: ESP     60: EBP     64: ESI     68: EDI     72: ES      76: CS      80: SS
 84: DS      88: FS      92: GS      96: LDTR   100: IOMAP | TRAP
(the I/O permission bitmap follows; the figure’s legend marks each field as ‘static’, ‘volatile’ or ‘reserved’)
PTDB = Page-Table Directory Base
This value will get loaded into register CR3 as part of the context-switching mechanism when paging has been enabled (PG=1).
So the ‘incoming task’ will automatically have its own individual mapping of its ‘virtual’ address-space to page-frames in the CPU’s ‘physical’ address-space.
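The position of the Page-Table Directory Base within the Task-State Segment can be checked with a C struct whose fields follow the standard 32-bit TSS layout. This is a sketch: the struct name, field names and the helper tss_set_page_directory are hypothetical, and masking off the low 12 bits is a simplification that treats the stored value as a pure page-aligned address.

```c
#include <stdint.h>
#include <stddef.h>

/* 32-bit Task-State Segment: 26 longwords (104 bytes). */
struct tss32 {
    uint32_t link;                                   /* offset 0        */
    uint32_t esp0, ss0, esp1, ss1, esp2, ss2;        /* offsets 4..24   */
    uint32_t cr3;                                    /* offset 28: PTDB */
    uint32_t eip, eflags;                            /* offsets 32, 36  */
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi; /* offsets 40..68  */
    uint32_t es, cs, ss, ds, fs, gs;                 /* offsets 72..92  */
    uint32_t ldtr;                                   /* offset 96       */
    uint16_t trap;                                   /* offset 100      */
    uint16_t iomap_base;                             /* offset 102      */
};

/* Give a task its own page-directory; the CPU loads this field into CR3
 * on a task-switch when paging is enabled. */
void tss_set_page_directory(struct tss32 *tss, uint32_t pgdir_phys)
{
    tss->cr3 = pgdir_phys & 0xFFFFF000u;   /* must be page-aligned */
}
```

Here offsetof(struct tss32, cr3) is 28, matching the slot labeled PTDB, and sizeof(struct tss32) is 104 bytes, which is 26 longwords.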
26
Extensions to ‘paging’ scheme
The Core-2 Quad CPU provides several enhancements to the original 386 paging.
These enhancements are ‘optional’ and must be selectively enabled by software.
Control Register CR4 implements bits to “turn on” the desired ‘paging-extension’ and some other enhancements that are unrelated to the ‘paging’ architectures.
27
Control Register CR4
Fields shown, from the high-order end down to bit 0: VMXE | PCE | PGE | MCE | PAE | PSE | DE | TSD | PVI | VME
Legend (for paging-related extensions):
PSE = Page-Size Extension is enabled (1 = yes, 0 = no)
PAE = Physical-Address Extension is enabled (1 = yes, 0 = no)
PGE = Page-Global Extension is enabled (1 = yes, 0 = no)
28
What about efficiency?
When paging is enabled, every reference to memory requires the CPU to ‘translate’ the virtual-address into a physical-address.
That ‘translation’ is based on table-lookups.
These lookups must be done ‘sequentially’.
So ‘address-translation’ could be costly in terms of CPU speed, since a high percentage of instructions typically refer to memory.
29
The ‘TLB’ solution
When the CPU has performed the table lookups that map a virtual-address to a physical-address, it “remembers” that relationship by saving the pair of page-addresses (virtual-page, physical-page) in a special CPU cache known as the TLB (“Translation Look-aside Buffer”).
Subsequent references to this same page can be resolved quickly -- via that cache!
30
4-way set-associative
The TLB is implemented as a ‘4-way set-associative’ cache -- it’s like a parallelized version of a Hash Table (with ‘evictions’).
Due to the ‘locality of reference’ principle, the TLB concept generally works well in most common programming contexts as an efficient ‘speedup’ of the page-address table-lookup translation-mechanism.
Modifying CR3 invalidates the TLB cache.
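The ‘Hash Table with evictions’ analogy can be made concrete with a small software model in C. This is only a sketch of the idea: the number of sets (16), the round-robin choice of victim and all names are assumptions made for illustration, and a real TLB compares the four ways of a set in parallel in hardware.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define TLB_SETS 16u   /* hypothetical number of sets */
#define TLB_WAYS 4u    /* 4-way set-associative       */

struct tlb_entry { bool valid; uint32_t vpn; uint32_t pfn; };

struct tlb {
    struct tlb_entry e[TLB_SETS][TLB_WAYS];
    uint32_t next_victim[TLB_SETS];   /* round-robin eviction (assumed policy) */
};

/* What a write to CR3 does to this model: every entry becomes invalid. */
void tlb_flush(struct tlb *t) { memset(t, 0, sizeof *t); }

/* The virtual page number 'hashes' to one set; only its 4 ways are searched. */
bool tlb_lookup(const struct tlb *t, uint32_t va, uint32_t *pa)
{
    uint32_t vpn = va >> 12;
    const struct tlb_entry *set = t->e[vpn % TLB_SETS];
    for (uint32_t w = 0; w < TLB_WAYS; w++) {
        if (set[w].valid && set[w].vpn == vpn) {
            *pa = (set[w].pfn << 12) | (va & 0xFFFu);
            return true;                       /* hit: no table lookups needed */
        }
    }
    return false;                              /* miss: walk the page tables */
}

/* Remember a (virtual-page, physical-page) pair after a table walk. */
void tlb_insert(struct tlb *t, uint32_t vpn, uint32_t pfn)
{
    uint32_t s = vpn % TLB_SETS;
    struct tlb_entry *set = t->e[s];
    uint32_t w;
    for (w = 0; w < TLB_WAYS; w++)             /* refresh an existing entry */
        if (set[w].valid && set[w].vpn == vpn)
            break;
    if (w == TLB_WAYS)                         /* else take a free way */
        for (w = 0; w < TLB_WAYS; w++)
            if (!set[w].valid)
                break;
    if (w == TLB_WAYS) {                       /* else evict round-robin */
        w = t->next_victim[s];
        t->next_victim[s] = (w + 1u) % TLB_WAYS;
    }
    set[w].valid = true;
    set[w].vpn = vpn;
    set[w].pfn = pfn;
}
```

In this model a fifth page that lands in an already full set pushes out one of the four earlier translations, and the next reference to the evicted page has to repeat the table lookups.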