Avishai Wool lecture Introduction to Systems Programming Lecture 6 Memory Management
Memory Management
  Ideally, programmers want memory that is
    – large
    – fast
    – non-volatile (does not get erased when the power goes off)
  Memory hierarchy
    – a small amount of fast, expensive memory (cache)
    – some medium-speed, medium-price main memory
    – gigabytes of slow, cheap disk storage
  The memory manager handles the memory hierarchy
The Memory Hierarchy
  Level                  Access time   Capacity
  Registers              1 nsec        < 1 KB
  On-chip cache          2 nsec        4 MB
  Main memory            10 nsec       512 MB-2 GB
  Magnetic (hard) disk   10 msec       200 GB-1000 GB
  Magnetic tape          100 sec       multi-TB
  Other types of memory: ROM, EEPROM, Flash RAM
Basic Memory Management
  An operating system with one user process
  [Figure: example memory layouts, labeled (Palm computers), (MS-DOS), BIOS]
Why is multi-programming good?
  Running several processes in parallel seems to “let users get more done”
  Can we show a model that quantifies this?
  From the system's perspective: multi-programming improves utilization
Modeling Multiprogramming
  A process waits for I/O a fraction p of the time
    – (1-p) of the time is spent in CPU bursts
  Degree of multiprogramming: the number n of processes in memory
  Pr(CPU busy running processes) = utilization
  Utilization = 1 - p^n
    – p^n is the probability that all n processes wait for I/O at the same time, assuming they wait independently
  For an interactive process, p = 80% is realistic
[Figure: CPU utilization as a function of the number of processes in memory; x-axis: degree of multiprogramming]
Using the simple model
  Assume 32MB of memory
  The OS uses 16MB; each user process uses 4MB
  4-way multi-programming is possible
  With p = 80%, the model predicts utilization = 1 - 0.8^4 ≈ 60%
  If we add another 16MB: 8-way multi-programming
    – utilization = 1 - 0.8^8 ≈ 83%
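The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the simple model; the function name `cpu_utilization` is made up for illustration, and the model assumes the processes wait for I/O independently.

```python
def cpu_utilization(p, n):
    """Utilization = 1 - p^n: the CPU is idle only when all n processes wait for I/O."""
    return 1.0 - p ** n


if __name__ == "__main__":
    # p = 0.8 is the I/O-wait fraction used on the slide.
    for n in (4, 8):
        print(n, "processes:", round(cpu_utilization(0.8, n), 3))
```

Going from 4 to 8 processes raises the predicted utilization from about 0.59 to about 0.83, which is where the slide's 60% and 83% come from.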
Real-Memory Partitioning
Multiprogramming with Fixed Partitions
  (a) Separate input queues for each partition (used in IBM OS/360)
  (b) Single input queue
Problems with Fixed Partitions
  Separate queues: memory is not used efficiently if there are many processes in one class and few in another
  Single queue: small processes can occupy a big partition, so again memory is not used efficiently
Basic issues in multi-programming
  The programmer and the compiler cannot be sure where a process will be loaded in memory
    – address locations of variables and code routines cannot be absolute
  Relocation: the mechanism for fixing up memory references once the load address is known
  Protection: one process should not be able to access another process's memory partition
Relocation in Software: Compiler+OS
  The compiler assumes the program is loaded at address 0.
  The compiler/linker inserts a relocation table into the binary file:
    – positions in the code that contain memory addresses
  At load time (part of process creation):
    – the OS computes offset = lowest memory address for the process
    – the OS modifies the code: adds offset to the address stored at every position listed in the relocation table
Relocation example
  Instruction format: mov | reg | address (4 bytes)
  Relocation table: 6, 12, …
  Compile time:
    mov ax, *200
    mov bx, *
  Load (CreateProcess), relocate by adding 1024:
    mov ax, *1224
    mov bx, *
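The load-time fix-up can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the function name `relocate`, the tiny 8-byte image, and the choice of little-endian 4-byte addresses are not taken from the slides.

```python
import struct


def relocate(image, relocation_table, offset):
    """Add `offset` to the 4-byte address stored at each listed position."""
    for pos in relocation_table:
        (addr,) = struct.unpack_from("<I", image, pos)     # read the address field
        struct.pack_into("<I", image, pos, addr + offset)  # write it back, shifted


# Hypothetical image: 4 bytes standing in for opcode/register, then the address 200.
image = bytearray(b"\x00\x00\x00\x00" + struct.pack("<I", 200))
relocate(image, [4], 1024)  # the process is loaded at address 1024
print(struct.unpack_from("<I", image, 4)[0])
```

Only the positions named in the relocation table are touched, so the loader never has to parse the instructions themselves.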
Protection – Hardware Support
  Memory partitions have an ID (protection code)
  The PSW has a “protection code” field (e.g. 4 bits)
    – saved in the PCB as part of the process state
  The CPU checks each memory access: if protection code of address != protection code of process, it raises an error
Alternative hardware support: Base and Limit Registers
  Special CPU registers: “base” and “limit”
  Address locations are added to the base value to map to a physical address
    – replaces software relocation
    – the OS sets the base & limit registers during CreateProcess
  Access to an address location over the limit value raises a CPU exception
    – solves protection too
  The Intel 8088 used a weak version of this: base register but no limit
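A minimal sketch of the check the hardware performs on every access. The names `translate` and `ProtectionFault` are hypothetical, and the sketch assumes that `limit` holds the size of the partition, so valid addresses are 0 to limit-1.

```python
class ProtectionFault(Exception):
    """Stands in for the CPU exception raised on an out-of-range access."""


def translate(address, base, limit):
    """Map a program address to a physical address using base and limit."""
    if not 0 <= address < limit:
        raise ProtectionFault(f"address {address} is outside the partition")
    return base + address


print(translate(200, base=1024, limit=4096))
```

Because the addition happens on every access, the program can be loaded anywhere without patching its code.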
Swapping
Swapping
  Fixed partitions are too inflexible and waste memory
  Next step up in complexity: dynamic partitions
  Allocate as much memory as each process needs
  Swap processes out to disk to allow more multi-programming
Swapping - example
  Memory allocation changes as processes
    – come into memory
    – leave memory
  [Figure: shaded regions are unused memory]
How much memory to allocate?
  (a) Allocating space for a growing data segment
  (b) Allocating space for a growing stack & data segment
Issues in Swapping
  When a process terminates: compact memory?
    – Move all processes above the hole down in memory.
    – Can be very slow: with 256MB of memory and 4 bytes copied every 40ns, compacting memory takes about 2.7 sec
    – Almost never used
  Result: the OS needs to keep track of holes.
  Problem to avoid: memory fragmentation.
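The 2.7 sec estimate follows directly from the numbers on the slide. A quick check, with the hypothetical helper `compaction_seconds` and the worst-case assumption that all of memory has to be copied:

```python
def compaction_seconds(memory_bytes, word_bytes=4, copy_ns=40):
    """Time to copy all of memory, one word every copy_ns nanoseconds."""
    copies = memory_bytes // word_bytes
    return copies * copy_ns * 1e-9


print(round(compaction_seconds(256 * 2**20), 2))
```

256MB is 64M four-byte words, and 64M copies at 40ns each come to roughly 2.7 seconds.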
Swapping Data Structure: Bit Maps
  Part of memory with 5 processes, 3 holes
    – tick marks show allocation units
    – shaded regions are free
  Corresponding bit map
Properties of Bit-Map Swapping
  Memory of M bytes, allocation unit of k bytes
  The bitmap uses M/k bits = M/(8k) bytes. Could be quite large.
  E.g., if the allocation unit is 4 bytes, the bit map uses 1/32 of memory
  Searching the bit map for a hole is slow
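Both properties can be sketched in Python. The helper names are hypothetical, and the sketch assumes the convention that a 0 bit marks a free allocation unit and a 1 bit marks a used one.

```python
def bitmap_bytes(memory_bytes, unit_bytes):
    """Size of the bitmap: one bit per allocation unit, 8 bits per byte."""
    return memory_bytes // (8 * unit_bytes)


def find_hole(bitmap, units_needed):
    """Return the index of the first run of `units_needed` free units, or -1."""
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i          # a new run of free units begins here
            run_len += 1
            if run_len == units_needed:
                return run_start
        else:
            run_len = 0                # a used unit breaks the run
    return -1


print(bitmap_bytes(32 * 2**20, 4))                  # bitmap size for 32MB, 4-byte units
print(find_hole([1, 1, 0, 0, 1, 0, 0, 0, 1], 3))    # first run of 3 free units
```

The search has to scan bit by bit for a long enough run of zeros, which is why the slide calls it slow.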
Swapping Data Structure: Linked Lists
  Variant #1: keep a list of blocks (process = P, hole = H)
What Happens When a Process Terminates?
  Merge neighboring holes to create a bigger hole
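A sketch of the merge step, using a plain Python list in place of a linked list. The block representation `(kind, start, length)` and the function name `free_block` are assumptions made for illustration.

```python
def free_block(blocks, index):
    """Turn blocks[index] into a hole and merge it with neighboring holes.

    Each block is a tuple (kind, start, length), kind "P" (process) or "H" (hole),
    and the list is kept in address order.
    """
    _, start, length = blocks[index]
    blocks[index] = ("H", start, length)
    # Merge with the following block first, so `index` stays valid.
    if index + 1 < len(blocks) and blocks[index + 1][0] == "H":
        _, _, next_len = blocks.pop(index + 1)
        blocks[index] = ("H", start, length + next_len)
    # Then merge with the preceding block.
    if index > 0 and blocks[index - 1][0] == "H":
        _, prev_start, prev_len = blocks[index - 1]
        _, _, cur_len = blocks.pop(index)
        blocks[index - 1] = ("H", prev_start, prev_len + cur_len)
    return blocks


blocks = [("P", 0, 5), ("H", 5, 3), ("P", 8, 6), ("H", 14, 4), ("P", 18, 2)]
print(free_block(blocks, 2))
```

When the terminating process sits between two holes, the three blocks collapse into one hole covering all of them.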
Variant #2
  Keep separate lists for processes and for holes
  E.g., process information can be in the PCB
  Maintain the hole list inside the holes
  [Figure: Process A between Hole 1 and Hole 2; each hole stores its size and prev/next pointers]
Hole Selection Strategy
  We have a list of holes of sizes 10, 20, 10, 50, 5
  A process needs size 4. Which hole to use?
  First fit: pick the 1st hole that's big enough (use the first hole of size 10)
    – Break up the hole into a used piece and a hole of size 10 - 4 = 6
    – Simple and fast
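First fit on the slide's hole list can be sketched as below. The function name `first_fit` and the representation of holes as a plain list of sizes are assumptions; a zero-sized leftover is simply kept in the list to keep the sketch short.

```python
def first_fit(holes, size):
    """Use the first hole that is big enough; return its index, or -1 if none fits."""
    for i, hole in enumerate(holes):
        if hole >= size:
            holes[i] = hole - size   # the leftover piece remains a (smaller) hole
            return i
    return -1


holes = [10, 20, 10, 50, 5]
print(first_fit(holes, 4), holes)
```

The search stops at the first match, so it rarely has to walk the whole list.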
Best Fit
  For a process of size s, use the smallest hole that has size(hole) >= s.
  In the example, use the last hole, of size 5.
  Problems:
    – Slower (needs to search the whole list)
    – Creates many tiny holes that fragment memory
  Can be made as fast as first fit if blocks are sorted by size (but then termination processing is slower)
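A sketch of best fit over an unsorted list of hole sizes. The function name `best_fit` and the list-of-sizes representation are assumptions made for illustration.

```python
def best_fit(holes, size):
    """Use the smallest hole that is big enough; return its index, or -1 if none fits."""
    best = -1
    for i, hole in enumerate(holes):          # must examine every hole
        if hole >= size and (best == -1 or hole < holes[best]):
            best = i
    if best != -1:
        holes[best] -= size                   # the leftover is often a tiny hole
    return best


holes = [10, 20, 10, 50, 5]
print(best_fit(holes, 4), holes)
```

For the size-4 request the hole of size 5 is chosen, leaving a hole of size 1: an instance of the tiny-hole problem listed on the slide.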
Other Options
  Worst fit: use the biggest hole that fits.
    – Simulations show that this is not very good
  Quick fit: maintain separate lists for common block sizes.
    – Improved performance of the “find-hole” operation
    – More complicated termination processing
Related Problems
  The hole-list system is used in other places:
  C language dynamic memory runtime system
    – malloc() / calloc(), or the C++ “new” keyword
    – free()
  File systems can use this type of system to maintain free and used blocks on the disk.
Virtual Memory
Main Idea
  Processes use a virtual address space (e.g., 00000000-FFFFFFFF for 32-bit addresses).
  Every process has its own address space
  The address space of each process can be larger than physical memory.
Memory Mapping
  Only part of the virtual address space is mapped to physical memory at any time.
  Parts of a process's memory contents are on disk.
  Hardware & OS collaborate to move memory contents to and from disk.
Advantages of Virtual Memory
  No need for software relocation: process code uses virtual addresses.
  Solves the protection requirement: it is impossible for a process to refer to another process's memory.
  For virtual memory protection to work:
    – Per-process memory mapping (page table)
    – Only the OS can modify the mapping
Hardware support: the MMU (Memory Management Unit)
Example
  16-bit memory addresses
  Virtual address space size: 64 KB
  Physical memory: 32 KB (15 bit)
  Virtual address space is split into 4KB pages.
    – 16 pages
  Physical memory is split into 4KB page frames.
    – 8 frames
Paging
  The relation between virtual addresses and physical memory addresses is given by the page table
    – The OS maintains the table
    – One page table per process
    – The MMU uses the table
Example (cont)
  The CPU executes the command mov rx, *5
  The MMU gets the address “5”.
  Virtual address 5 is in page 0 (addresses 0-4095)
  Page 0 is mapped to frame 2 (physical addresses 8192-12287).
  The MMU puts the address 8197 (= 8192 + 5) on the bus.
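The translation in this example can be sketched as below. The names `translate` and `PageFault` and the use of a dictionary as the page table are assumptions; only the mapping stated on the slide (page 0 to frame 2) is filled in.

```python
PAGE_SIZE = 4096  # 4KB pages, as in the example


class PageFault(Exception):
    """Stands in for the interrupt raised when a page is not mapped."""


def translate(vaddr, page_table):
    """Map a virtual address to a physical address through the page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        raise PageFault(page)
    return frame * PAGE_SIZE + offset


page_table = {0: 2}               # page 0 is mapped to frame 2
print(translate(5, page_table))
```

The offset within the page (5) is carried over unchanged; only the page number is replaced by the frame number.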
Page Faults
  What if the CPU issues mov rx, *32780?
  That page (page 8) is un-mapped (not in any frame)
  The MMU causes a page fault (interrupt to the CPU)
  The OS handles the page fault:
    – Evict some page from a frame
    – Copy the requested page from disk into the frame
    – Re-execute the instruction
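The three handling steps can be sketched as a toy simulation. The function name `access`, the dictionary page table and the eviction choice (the oldest entry in the dictionary) are all placeholders for illustration; the slides only say "evict some page", and the disk copy is left as a comment.

```python
PAGE_SIZE = 4096


def access(vaddr, page_table, free_frames):
    """Return the physical address for vaddr, handling a page fault if needed."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page not in page_table:                  # the MMU would raise a page fault here
        if free_frames:
            frame = free_frames.pop()
        else:
            victim = next(iter(page_table))     # placeholder policy: oldest mapping
            frame = page_table.pop(victim)      # evict the victim page from its frame
        # (a real OS would now copy the requested page from disk into `frame`)
        page_table[page] = frame
    return page_table[page] * PAGE_SIZE + offset  # the re-executed access succeeds


page_table = {0: 2}
print(access(32780, page_table, free_frames=[]), page_table)
```

After the fault, page 8 occupies the frame that page 0 used to hold, and the same address resolves without a fault the next time.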
How the MMU Works
  Splits a 32-bit virtual address into
    – A k-bit page number: the top k bits (MSBs)
    – A (32-k)-bit offset
  Uses the page number as an index into the page table to find the frame, and adds the offset.
  The page table has 2^k entries, one per page. Each page is of size 2^(32-k) bytes.
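The split is a shift and a mask. A small sketch, with the hypothetical helper `split_address` and a `bits` parameter added so the 16-bit example from the earlier slides can be checked too:

```python
def split_address(vaddr, k, bits=32):
    """Split a virtual address into (page number, offset); k = page-number bits."""
    offset_bits = bits - k
    page = vaddr >> offset_bits                  # top k bits
    offset = vaddr & ((1 << offset_bits) - 1)    # low (bits - k) bits
    return page, offset


print(split_address(0x12345678, k=20))       # 32-bit address, 4KB pages
print(split_address(32780, k=4, bits=16))    # the 16-bit example: page 8, offset 12
```

No division is needed because the page size is a power of two.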
[Figure: MMU operation, showing the page-number bits of the virtual address]
Issues with Virtual Memory
  The page table can be very large:
    – 32-bit addresses, 4KB pages (12-bit offsets): over 1 million pages
    – Each process needs its own page table
  Page lookup has to be very fast:
    – If an instruction takes 4ns, a page table lookup should take around 1ns
  The page fault rate has to be very low.
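The "over 1 million pages" figure can be checked with a short calculation. The helper name `page_table_entries` is hypothetical, and the 4-byte entry size used for the table size is an assumption, not a number from the slides.

```python
def page_table_entries(address_bits, page_bytes):
    """Number of pages in the address space; page_bytes must be a power of two."""
    offset_bits = page_bytes.bit_length() - 1
    return 1 << (address_bits - offset_bits)


entries = page_table_entries(32, 4096)
print(entries)                    # 2^20 pages
print(entries * 4 // 2**20)       # table size in MB, assuming 4 bytes per entry
```

Under that assumption a flat page table would take 4MB per process, which is why its size is listed as an issue.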
Concepts for review
  Degree of multi-programming
  Processor utilization
  Fixed partitions
  Code relocation
  Memory protection
  Dynamic partitions – Swapping
  Memory fragmentation
  Data structures: bitmaps; list of holes
  First-fit/worst-fit/best-fit
  Virtual memory
  Address space
  MMU
  Pages and Frames
  Page table
  Page fault
  Page lookup