Operating System Main Memory
Background Program must be brought (from disk) into memory and placed within a process for it to be run Main memory and registers are only storage CPU can access directly Register access in one CPU clock (or less) Main memory can take many cycles Cache sits between main memory and CPU registers Protection of memory required to ensure correct operation
The Problem There is a finite amount of RAM on any machine. There is a bus with finite speed, non-zero latency. There is a CPU with a given clock speed. There is a disk with a given speed. How do we manage memory effectively? Mainly, that means preventing CPU idle time, maintaining good response time, etc. While maintaining isolation and security.
Multiprogramming and Time-Sharing Multiple concurrent processes Batch multiprogramming Timesharing Could switch by swapping. Fast or slow? Why? Instead, leave in memory. Good? Bad?
Base and Limit Registers A pair of base and limit registers define the logical address space
Binding of Instructions and Data to Memory Address binding of instructions and data to memory addresses can happen at three different stages Compile time: If memory location known a priori, absolute code can be generated; must recompile code if starting location changes Load time: Must generate relocatable code if memory location is not known at compile time Execution time: Binding delayed until run time if the process can be moved during its execution from one memory segment to another. Need hardware support for address maps (e.g., base and limit registers)
Logical vs. Physical Address Space The concept of a logical address space that is bound to a separate physical address space is central to proper memory management Logical address – generated by the CPU; also referred to as virtual address Physical address – address seen by the memory unit Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in execution-time address-binding scheme
Memory-Management Unit (MMU) Hardware device that maps virtual to physical address In MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory The user program deals with logical addresses; it never sees the real physical addresses
Dynamic relocation using a relocation register
Swapping A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution Roll out, roll in – swapping variant used for priority-based scheduling algorithms; lower-priority process is swapped out so higher-priority process can be loaded and executed Modified versions of swapping are found on many systems (i.e., UNIX, Linux, and Windows)
Schematic View of Swapping
Contiguous Allocation Multiple-partition allocation Hole – block of available memory; holes of various size are scattered throughout memory When a process arrives, it is allocated memory from a hole large enough to accommodate it Operating system maintains information about: a) allocated partitions b) free partitions (hole) OS OS OS OS process 5 process 5 process 5 process 5 process 9 process 9 process 8 process 10 process 2 process 2 process 2 process 2
Dynamic Storage-Allocation Problem How to satisfy a request of size n from a list of free holes First-fit: Allocate the first hole that is big enough Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size Produces the smallest leftover hole Worst-fit: Allocate the largest hole; must also search entire list Produces the largest leftover hole
Fragmentation External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used
Old technique #1: Fixed partitions Physical memory is broken up into fixed partitions partitions may have different sizes, but partitioning never changes hardware requirement: base register, limit register physical address = virtual address + base register base register loaded by OS when it switches to a process how do we provide protection? if (physical address > base + limit) then… ? Advantages Simple Problems internal fragmentation: the available partition is larger than what was requested external fragmentation: two small partitions left, but one big job – what sizes should the partitions be??
Mechanics of fixed partitions physical memory partition 0 limit register base register 2K 2K P2’s base: 6K partition 1 6K yes partition 2 <? + offset 8K virtual address partition 3 no raise protection fault 12K
Old technique #2: Variable partitions Obvious next step: physical memory is broken up into partitions dynamically – partitions are tailored to programs hardware requirements: base register, limit register physical address = virtual address + base register how do we provide protection? if (physical address > base + limit) then… ? Advantages no internal fragmentation simply allocate partition size to be just big enough for process (assuming we know what that is!) Problems external fragmentation as we load and unload jobs, holes are left scattered throughout physical memory slightly different than the external fragmentation for fixed partition systems
Mechanics of variable partitions physical memory partition 0 limit register base register P3’s size P3’s base partition 1 partition 2 yes <? + offset partition 3 virtual address no partition 4 raise protection fault
Dealing with fragmentation Compact memory by copying Swap a program out Re-load it, adjacent to another Adjust its base register “Lather, rinse, repeat” Ugh partition 0 partition 0 partition 1 partition 1 partition 2 partition 2 partition 3 partition 4 partition 3 partition 4
Modern technique: Paging Solve the external fragmentation problem by using fixed sized units in both physical and virtual memory Solve the internal fragmentation problem by making the units small virtual address space physical address space page 0 frame 0 page 1 frame 1 page 2 frame 2 page 3 … … frame Y page X
Paging Logical address space of a process can be noncontiguous; process is allocated physical memory whenever the latter is available Divide physical memory into fixed-sized blocks called frames (size is power of 2, between 512 bytes and 8,192 bytes) Divide logical memory into blocks of same size called pages Set up a page table to translate logical to physical addresses Internal fragmentation
Address Translation Scheme Address generated by CPU is divided into: Page number (p) – used as an index into a page table which contains base address of each page in physical memory Page offset (d) – combined with base address to define the physical memory address that is sent to the memory unit For given logical address space 2m and page size 2n page number page offset p d m - n n
Paging Hardware
Paging Model of Logical and Physical Memory
Implementation of Page Table Page table is kept in main memory Page-table base register (PTBR) points to the page table Page-table length register (PRLR) indicates size of the page table In this scheme every data/instruction access requires two memory accesses. One for the page table and one for the data/instruction. The two memory access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffers (TLBs) Some TLBs store address-space identifiers (ASIDs) in each TLB entry – uniquely identifies each process to provide address-space protection for that process
Memory Protection Memory protection implemented by associating protection bit with each frame Valid-invalid bit attached to each entry in the page table: “valid” indicates that the associated page is in the process’ logical address space, and is thus a legal page “invalid” indicates that the page is not in the process’ logical address space
Valid (v) or Invalid (i) Bit In A Page Table
Shared Pages Shared code One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers, window systems). Shared code must appear in same location in the logical address space of all processes Private code and data Each process keeps a separate copy of the code and data The pages for the private code and data can appear anywhere in the logical address space
Shared Pages Example
Page Table Entries – an opportunity! As long as there’s a PTE lookup per memory reference, we might as well add some functionality We can add protection A virtual page can be read-only, and result in a fault if a store to it is attempted Some pages may not map to anything – a fault will occur if a reference is attempted We can add some “accounting information” Can’t do anything fancy, since address translation must be fast Can keep track of whether or not a virtual page is being used, though This will help the paging algorithm, once we get to paging
Page Table Entries (PTE’s) 1 1 1 2 20 V R M prot page frame number PTE’s control mapping the valid bit says whether or not the PTE can be used says whether or not a virtual address is valid it is checked each time a virtual address is used the referenced bit says whether the page has been accessed it is set when a page has been read or written to the modified bit says whether or not the page is dirty it is set when a write to the page has occurred the protection bits control which operations are allowed read, write, execute the page frame number determines the physical page physical page start address = PFN
Paging advantages Easy to allocate physical memory physical memory is allocated from free list of frames to allocate a frame, just remove it from the free list external fragmentation is not a problem managing variable-sized allocations is a huge pain in the neck “buddy system” Leads naturally to virtual memory entire program need not be memory resident take page faults using “valid” bit all “chunks” are the same size (page size) but paging was originally introduced to deal with external fragmentation, not to allow programs to be partially resident
Paging disadvantages Can still have internal fragmentation Process may not use memory in exact multiples of pages But minor because of small page size relative to address space size Memory reference overhead 2 references per address lookup (page table, then memory) Solution: use a hardware cache to absorb page table lookups translation lookaside buffer (TLB) – next class Memory required to hold page tables can be large need one PTE per page in virtual address space 32 bit AS with 4KB pages = 220 PTEs = 1,048,576 PTEs 4 bytes/PTE = 4MB per page table OS’s have separate page tables per process 25 processes = 100MB of page tables Solution: page the page tables (!!!) (ow, my brain hurts…more later)
Segmentation Memory-management scheme that supports user view of memory A program is a collection of segments A segment is a logical unit such as: main program procedure function method object local variables, global variables common block stack symbol table arrays
User’s View of a Program
Segmentation Architecture Logical address consists of a two tuple: <segment-number, offset>, Segment table – maps two-dimensional physical addresses; each table entry has: base – contains the starting physical address where the segments reside in memory limit – specifies the length of the segment Segment-table base register (STBR) points to the segment table’s location in memory Segment-table length register (STLR) indicates number of segments used by a program; segment number s is legal if s < STLR
Segment lookups <? + segment 0 segment 1 segment 2 yes segment 3 no segment table base limit physical memory segment 0 segment # offset segment 1 virtual address segment 2 yes <? + segment 3 no segment 4 raise protection fault
Segmentation Architecture (Cont.) Protection With each entry in segment table associate: validation bit = 0 illegal segment read/write/execute privileges Protection bits associated with segments; code sharing occurs at segment level Since segments vary in length, memory allocation is a dynamic storage-allocation problem A segmentation example is shown in the following diagram
Variable Partitions and Fragmentation Memory wasted by External Fragmentation 1 2 3 4 5 Do you know about disk de-fragmentation? It can improve your system performance!
Combining segmentation and paging Can combine these techniques x86 architecture supports both segments and paging Use segments to manage logical units segments vary in size, but are typically large (multiple pages) Use pages to partition segments into fixed-size chunks each segment has its own page table there is a page table per segment, rather than per user address space memory allocation becomes easy once again no contiguous allocation, no external fragmentation Segment # Page # Offset within page Offset within segment
Linux: 1 kernel code segment, 1 kernel data segment 1 user code segment, 1 user data segment all of these segments are paged Note: this is a very limited/boring use of segments!
Example: The Intel Pentium Supports both segmentation and segmentation with paging CPU generates logical address Given to segmentation unit Which produces linear addresses Linear address given to paging unit Which generates physical address in main memory Paging units form equivalent of MMU
Pentium Paging Architecture
Logical to Physical Address Translation in Pentium
Intel Pentium Segmentation
Pentium Paging Architecture
Locality Locality is a fundamental property of computer systems. Spatial locality: Near by memory addresses are often accessed together. Temporal locality: Memory that is accessed now is likely to be accessed again in the near future.