Virtual Memory
Names, Virtual Addresses & Physical Addresses
[Figure: a source program's name space is bound into an absolute module and executable image, giving p_i's virtual address space; at run time the map τ_t binds the virtual address space to the physical address space.]
Locality
Address space is logically partitioned
–Text, data, stack
–Initialization, main, error handlers
Different parts have different reference patterns
[Figure: address space for p_i showing initialization code (used once), code fragments 1-3, error-handling fragments 1-3, and data & stack, annotated with the fraction of execution time spent in each, e.g. 30%, 20%, 35%, 15%, <1%.]
Virtual Memory
The complete virtual address space of each process (p_i, p_j, p_k, …) is stored in secondary memory
Fragments of the virtual address space are dynamically loaded into primary memory (physical addresses 0 to n-1) at any given time
Each address space is fragmented
Virtual Memory
Every process has code and data locality
–Code tends to execute in a few fragments at one time
–Tends to reference the same set of data structures
Dynamically load/unload currently-used address space fragments as the process executes
Uses dynamic address relocation/binding
–Generalization of base-limit registers
–The physical address corresponding to a compile-time address is not bound until run time
Virtual Memory (cont)
Since the binding changes with time, use a dynamic virtual address map, τ_t
[Figure: τ_t maps the virtual address space to primary memory]
Address Formation
The translation system creates an address space, but its addresses are virtual instead of physical
A virtual address, x:
–Is mapped to physical address y = τ_t(x) if x is loaded at physical address y
–Is mapped to the null address Ω if x is not loaded
The map, τ_t, changes as the process executes -- it is "time varying"
τ_t: Virtual Address → Physical Address ∪ {Ω}
Size of Blocks of Memory Virtual memory system transfers “blocks” of the address space to/from primary memory Fixed size blocks: System-defined pages are moved back and forth between primary and secondary memory Variable size blocks: Programmer-defined segments – corresponding to logical fragments – are the unit of movement Paging is the commercially dominant form of virtual memory today
Paging
A page is a fixed-size block of 2^h virtual addresses
A page frame is a fixed-size block of 2^h physical addresses (the same size as a page)
When a virtual address, x, in page i is referenced by the CPU
–If page i is loaded at page frame j, the virtual address is relocated to page frame j
–If page i is not loaded, the OS interrupts the process and loads the page into a page frame
Addresses
Suppose there are G = 2^g · 2^h = 2^(g+h) virtual addresses and H = 2^(j+h) physical addresses assigned to a process
–Each page/page frame holds 2^h addresses
–There are 2^g pages in the virtual address space
–2^j page frames are allocated to the process
–Rather than map individual addresses, τ maps the 2^g pages to the 2^j page frames
That is, page_frame_j = τ(page_i)
Address k in page_i corresponds to address k in page_frame_j
Page-Based Address Translation
Let N = {d_0, d_1, …, d_(n-1)} be the pages
Let M = {b_0, b_1, …, b_(m-1)} be the page frames
A virtual address, i, satisfies 0 ≤ i < G = 2^(g+h)
A physical address, k = U·2^h + V (0 ≤ V < 2^h)
–U is the page frame number
–V is the line number within the page
–τ: [0:G-1] → [0:H-1] ∪ {Ω}
–Since every page is size c = 2^h
page number = U = ⌊i/c⌋
line number = V = i mod c
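The last two formulas can be sketched directly. The page size below (h = 10 bits, i.e. c = 1024) is a hypothetical choice for illustration:

```python
# Decompose a virtual address i into (page number U, line number V),
# assuming h = 10 offset bits (hypothetical page size c = 2^h = 1024).
h = 10
c = 2 ** h

def split(i):
    """Return (U, V) where U = i // c and V = i mod c."""
    return i // c, i % c
```

For example, virtual address 3000 falls at line 952 of page 2, since 3000 = 2·1024 + 952.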
Address Translation (cont)
[Figure: the CPU issues a virtual address split into a page # (g bits) and a line # (h bits); the "page table" map τ translates the page # into a frame # (j bits), or signals a missing page; the frame # and line # (h bits) form the physical address loaded into the MAR.]
Demand Paging Algorithm
1. Page fault occurs
2. Process with missing page is interrupted
3. Memory manager locates the missing page
4. Page frame is unloaded (replacement policy)
5. Page is loaded in the vacated page frame
6. Page table is updated
7. Process is restarted
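The steps above can be sketched as a fault handler. This is a minimal sketch with hypothetical stand-ins for the page table, frame pool, and replacement policy:

```python
# Hypothetical sketch of the demand-paging steps; choose_victim stands in
# for whatever replacement policy the system uses (step 4).
def handle_page_fault(page, page_table, free_frames, loaded, choose_victim):
    if not free_frames:                    # no empty frame available
        victim = choose_victim(loaded)     # 4. pick a page to unload
        free_frames.append(page_table.pop(victim))
        loaded.remove(victim)
    frame = free_frames.pop()              # 5. load page into vacated frame
    page_table[page] = frame               # 6. update the page table
    loaded.append(page)                    # 7. the process can be restarted
    return frame
```

Passing a different `choose_victim` (FIFO, LRU, …) changes only step 4, mirroring the point made later that the replacement policy uniquely identifies the paging policy.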
Modeling Page Behavior
Let ω = r_1, r_2, r_3, …, r_i, … be a page reference stream
–r_i is the i-th page # referenced by the process
–The subscript is the virtual time for the process
Given a page frame allocation of m, the memory state at time t, S_t(m), is the set of pages loaded
–S_t(m) = S_(t-1)(m) ∪ X_t − Y_t
X_t is the set of fetched pages at time t
Y_t is the set of replaced pages at time t
More on Demand Paging
If r_t was loaded at time t-1, S_t(m) = S_(t-1)(m)
If r_t was not loaded at time t-1 and there were empty page frames
–S_t(m) = S_(t-1)(m) ∪ {r_t}
If r_t was not loaded at time t-1 and there were no empty page frames
–S_t(m) = S_(t-1)(m) ∪ {r_t} − {y}
The alternative is prefetch paging
Static Allocation, Demand Paging
Number of page frames is static over the life of the process
Fetch policy is demand
Since S_t(m) = S_(t-1)(m) ∪ {r_t} − {y}, the replacement policy must choose y -- which uniquely identifies the paging policy
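The observation that only the choice of y distinguishes policies can be sketched as a single fault-counting simulator parameterized by the victim selector; the reference streams fed to it are hypothetical:

```python
# Demand paging with a static allocation of m frames. The policy is
# entirely captured by choose_y, which selects the victim page y.
def simulate(stream, m, choose_y):
    loaded, faults = [], 0
    for t, r in enumerate(stream):
        if r in loaded:                 # page already resident: no fault
            continue
        faults += 1                     # page fault
        if len(loaded) == m:            # no empty frame: replace some y
            loaded.remove(choose_y(t, loaded, stream))
        loaded.append(r)
    return faults
```

Random replacement, for instance, is just `choose_y = lambda t, loaded, stream: random.choice(loaded)`.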
Random Replacement
Replaced page, y, is chosen from the m loaded page frames with probability 1/m
[Reference stream and frame-allocation table omitted]
No knowledge of ω ⇒ does not perform well
Easy to implement
13 page faults
Belady’s Optimal Algorithm
Replace the page with maximal forward distance: y_t = max_{x ∈ S_(t-1)(m)} FWD_t(x)
[Reference stream and frame-allocation table omitted]
At t = 4: FWD_4(2) = 1, FWD_4(0) = 2, FWD_4(3) = 3
At t = 7: FWD_7(2) = 2, FWD_7(0) = 3, FWD_7(1) = 1
At t = 10: FWD_10(2) = ?, FWD_10(3) = 2, FWD_10(1) = 3
At t = 13: FWD_13(0) = ?, FWD_13(3) = ?, FWD_13(1) = ?
Perfect knowledge of ω ⇒ perfect performance
Impossible to implement
10 page faults
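A sketch of the optimal policy follows, with FWD_t(x) taken as infinite when x is never referenced again; the test stream is hypothetical, not the one from the original figure:

```python
# Sketch of Belady's optimal (MIN/OPT) replacement.
def fwd(stream, t, x):
    """Forward distance of page x at time t (inf if never referenced again)."""
    for d, r in enumerate(stream[t + 1:], start=1):
        if r == x:
            return d
    return float("inf")

def belady_faults(stream, m):
    loaded, faults = [], 0
    for t, r in enumerate(stream):
        if r in loaded:
            continue
        faults += 1
        if len(loaded) == m:            # evict the page with maximal FWD_t
            loaded.remove(max(loaded, key=lambda x: fwd(stream, t, x)))
        loaded.append(r)
    return faults
```

The look at `stream[t + 1:]` is exactly why the algorithm is impossible to implement online: it needs the future of ω.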
Least Recently Used (LRU)
Replace the page with maximal backward distance: y_t = max_{x ∈ S_(t-1)(m)} BKWD_t(x)
[Reference stream and frame-allocation table omitted]
At t = 4: BKWD_4(2) = 3, BKWD_4(0) = 2, BKWD_4(3) = 1
At t = 5: BKWD_5(1) = 1, BKWD_5(0) = 3, BKWD_5(3) = 2
At t = 6: BKWD_6(1) = 2, BKWD_6(2) = 1, BKWD_6(3) = 3
Backward distance is a good predictor of forward distance -- locality
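LRU can be sketched without computing BKWD_t explicitly: keep the resident pages ordered from least- to most-recently used, so the head of the list always has maximal backward distance. The test stream is hypothetical:

```python
# Sketch of LRU with a recency-ordered list of resident pages.
def lru_faults(stream, m):
    loaded, faults = [], 0              # loaded: least- to most-recent
    for r in stream:
        if r in loaded:
            loaded.remove(r)            # refresh recency on a hit
            loaded.append(r)
            continue
        faults += 1
        if len(loaded) == m:
            loaded.pop(0)               # evict maximal backward distance
        loaded.append(r)
    return faults
```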
Least Frequently Used (LFU)
Replace the page with minimum use: y_t = min_{x ∈ S_(t-1)(m)} FREQ(x)
[Reference stream and frame-allocation table omitted]
At t = 4: FREQ_4(2) = 1, FREQ_4(0) = 1, FREQ_4(3) = 1
At t = 6: FREQ_6(2) = 2, FREQ_6(1) = 1, FREQ_6(3) = 1
At t = 7: FREQ_7(2) = ?, FREQ_7(1) = ?, FREQ_7(0) = ?
First In First Out (FIFO)
Replace the page that has been in memory the longest: y_t = max_{x ∈ S_(t-1)(m)} AGE(x)
[Reference stream and frame-allocation table omitted]
At t = 4: AGE_4(2) = 3, AGE_4(0) = 2, AGE_4(3) = 1
At t = 5: AGE_5(1) = ?, AGE_5(0) = ?, AGE_5(3) = ?
Belady’s Anomaly
[Reference stream and frame-allocation tables omitted]
FIFO with m = 3 has 9 faults
FIFO with m = 4 has 10 faults
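The anomaly is easy to reproduce. The stream below is the classic demonstration stream, not necessarily the one from the original figure, but it yields exactly the 9-vs-10 fault counts quoted above:

```python
from collections import deque

# FIFO fault counter: evict the page that arrived earliest.
def fifo_faults(stream, m):
    loaded, order, faults = set(), deque(), 0
    for r in stream:
        if r in loaded:
            continue
        faults += 1
        if len(loaded) == m:
            loaded.remove(order.popleft())   # evict the oldest arrival
        loaded.add(r)
        order.append(r)
    return faults

anomaly = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
```

Adding a fourth frame makes FIFO perform worse on this stream, which is possible precisely because FIFO is not a stack algorithm.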
Stack Algorithms
Some algorithms are well-behaved
Inclusion Property: the pages loaded at time t with m frames are also loaded at time t with m+1 frames
[Frame-allocation tables omitted]
LRU satisfies the inclusion property; FIFO does not -- hence Belady’s anomaly
Implementation
LRU has become the preferred algorithm
Difficult to implement
–Must record when each page was referenced
–Difficult to do in hardware
Approximate LRU with a reference bit
–Periodically reset
–Set for a page when it is referenced
A dirty bit similarly records whether the page has been modified and must be written back
Dynamic Paging Algorithms
The amount of physical memory -- the number of page frames -- varies as the process executes
How much memory should be allocated?
–Fault rate must be “tolerable”
–Will change according to the phase of the process
Need to define a placement & replacement policy
Contemporary models are based on the working set
Working Set
Intuitively, the working set is the set of pages in the process’s locality
–Somewhat imprecise
–Time varying
Given k processes in memory, let m_i(t) be the # of page frames allocated to p_i at time t
–m_i(0) = 0
–Σ_{i=1..k} m_i(t) ≤ |primary memory|
Also have S_t(m_i(t)) = S_t(m_i(t-1)) ∪ X_t − Y_t
Or, more simply, S(m_i(t)) = S(m_i(t-1)) ∪ X_t − Y_t
Placed/Replaced Pages
S(m_i(t)) = S(m_i(t-1)) ∪ X_t − Y_t
For the missing page
–Allocate a new page frame
–X_t = {r_t} in the new page frame
How should Y_t be defined? Consider a parameter, Δ, called the window size
–Determine BKWD_t(y) for every y ∈ S(m_i(t-1))
–if BKWD_t(y) ≥ Δ, unload y and deallocate its frame
–if BKWD_t(y) < Δ, do not disturb y
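The window rule above says a page stays resident exactly when its backward distance is below the window size, i.e. it appears among the last Δ references. A minimal sketch, with a hypothetical reference stream:

```python
# Window-based working set: the set of pages referenced in the last
# `delta` references ending at (0-indexed) time t.
def working_set(stream, t, delta):
    return set(stream[max(0, t - delta + 1): t + 1])
```

Pages that fall out of this set are exactly those with BKWD_t(y) ≥ Δ, so their frames can be deallocated.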
Working Set Principle
Process p_i should only be loaded and active if it can be allocated enough page frames to hold its entire working set
The size of the working set is estimated using Δ
–Unfortunately, a “good” value of Δ depends on the size of the locality
–Empirically, this works with a fixed Δ
Example (Δ = 3)
[Frame-allocation table omitted]
Example (Δ = 4)
[Frame-allocation table omitted]
Implementing the Working Set
Global LRU will behave similarly to a working set algorithm
–Page fault: add a page frame to one process, take away a page frame from another process
–Use the LRU implementation idea
A reference bit for every page frame
Cleared periodically, set with each reference
Change the allocation of some page frame with a clear reference bit
–Clock algorithms use this technique by searching for cleared reference bits in a circular fashion
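The circular scan can be sketched as follows; the frame list, reference-bit map, and hand position are hypothetical stand-ins:

```python
# Sketch of the clock (second-chance) scan: frames with a set reference
# bit get the bit cleared and are skipped; the first frame found with a
# clear bit is the victim.
def clock_victim(frames, ref_bit, hand):
    while True:
        f = frames[hand]
        hand = (hand + 1) % len(frames)  # advance the hand circularly
        if ref_bit[f]:
            ref_bit[f] = False           # second chance: clear and move on
        else:
            return f, hand               # victim frame, new hand position
```

Because every pass clears the bits it skips, the scan is guaranteed to terminate within two sweeps of the frame list.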
Segmentation
The unit of memory movement is:
–Variably sized
–Defined by the programmer
Two-component addresses, ⟨segment, offset⟩
Address translation is more complex than paging
τ: segments × offsets → physical addresses ∪ {Ω}
τ(i, j) = k
Segment Address Translation
τ: segments × offsets → physical addresses ∪ {Ω}
–τ(i, j) = k
A name map (written σ here, since the original symbol was lost) binds segment names to segment addresses:
–σ: segment names → segment addresses
–τ(σ(segName), j) = k
Likewise for offset names (written λ here):
–λ: offset names → offset addresses
–τ(σ(segName), λ(offsetName)) = k
Read the implementation in Section …
Address Translation
[Figure: a virtual address ⟨segment #, offset⟩ indexes a segment descriptor holding P (present), Limit, and Base/Relocation fields; a missing segment traps to the OS; otherwise the offset is compared against the limit and added to the relocation base to form the value sent to the Memory Address Register.]
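The checks in the figure can be sketched directly; the segment-table layout `(base, limit, present)` is a hypothetical stand-in for the descriptor format:

```python
# Sketch of segment address translation with present and limit checks.
def translate(seg_table, s, offset):
    base, limit, present = seg_table[s]
    if not present:
        raise LookupError("missing segment")   # trap: OS loads the segment
    if offset >= limit:
        raise IndexError("limit violation")    # protection trap
    return base + offset                       # value sent to the MAR
```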
Implementation
Segmentation requires special hardware
–Segment descriptor support
–Segment base registers (segment, code, stack)
–Translation hardware
Some of the translation can be static
–No dynamic offset name binding
–Limited protection
Multics
Old, but still state-of-the-art segmentation
Uses linkage segments to support sharing
Uses dynamic offset name binding
Requires a sophisticated memory management unit
See pp. …