Last lecture: VM Implementation


W4118 Operating Systems
Instructor: Junfeng Yang

Last lecture: VM Implementation
- Operations the OS and hardware must provide to support virtual memory
  - Page fault: locate the page, bring it into memory, continue the process
- OS decisions
  - When to bring an on-disk page into memory? Demand paging, request paging, prepaging
  - What page to throw out to disk? OPT, RANDOM, FIFO, LRU, MRU
- Segmentation advantages
  - Sharing of segments
  - Easier to relocate a segment than an entire program
  - Avoids allocating unused memory
  - Flexible protection
  - Efficient translation: the segment table is small and fits in the MMU
- Segmentation disadvantages
  - Segments have variable lengths, so allocation is dynamic and costly (best fit? first fit?)
  - External fragmentation: wasted memory
  - Segments can be large

Today: Virtual Memory Implementation and Linux Memory Management
- Implementing LRU
- How to approximate LRU efficiently?
- Linux memory management

Implementing LRU: hardware
- A counter for each page
- Every time a page is referenced, save the system clock into the page's counter
- Page replacement: scan through the pages to find the one with the oldest clock value
- Problem: have to search all pages/counters!
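As a sketch of this counter-per-page scheme, here is a minimal Python simulation (the class name and structure are invented for illustration; real hardware would update the counter in the page-table entry):

```python
class CounterLRU:
    """Hardware-style LRU: a timestamp "counter" per resident page."""

    def __init__(self, nframes):
        self.nframes = nframes
        self.clock = 0             # stands in for the system clock
        self.frames = {}           # page -> time of last reference

    def reference(self, page):
        """Reference a page; return the evicted page, or None."""
        self.clock += 1
        if page in self.frames:                # hit: update the counter
            self.frames[page] = self.clock
            return None
        victim = None
        if len(self.frames) >= self.nframes:
            # replacement: scan every counter for the oldest clock value
            victim = min(self.frames, key=self.frames.get)
            del self.frames[victim]
        self.frames[page] = self.clock
        return victim
```

The `min` over all counters is exactly the "search all pages" cost the slide complains about.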

Implementing LRU: software
- A doubly linked list of pages
- Every time a page is referenced, move it to the front of the list
- Page replacement: remove the page at the back of the list
- Avoids scanning all pages
- Problem: too expensive
  - Requires 6 pointer updates for each page reference
  - High contention on multiprocessors

Example software LRU implementation
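As an example of the list-based scheme, a minimal Python sketch; `collections.OrderedDict` stands in for the intrusive doubly linked list (its `move_to_end` is the "move to front" pointer surgery):

```python
from collections import OrderedDict

class ListLRU:
    """List-based LRU: recency order maintained on every reference."""

    def __init__(self, nframes):
        self.nframes = nframes
        self.pages = OrderedDict()   # oldest page first, newest last

    def reference(self, page):
        """Reference a page; return the evicted page, or None."""
        if page in self.pages:
            self.pages.move_to_end(page)   # the pointer updates per reference
            return None
        victim = None
        if len(self.pages) >= self.nframes:
            # page replacement: remove from the back (least recently used)
            victim, _ = self.pages.popitem(last=False)
        self.pages[page] = True
        return victim
```

Replacement is O(1), but note that every single memory reference mutates the list, which is the contention problem the slide describes.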

LRU: Concept vs. Reality
- LRU is considered a reasonably good algorithm
- The problem is implementing it efficiently
  - Hardware implementation: a counter per page, written on every memory reference; replacement must search all pages for the oldest counter
  - Software implementation: no search, but pointer swaps on every memory reference and high contention
- In practice, settle for an efficient approximate LRU
  - Find an old page, not necessarily the oldest
  - LRU is itself only an approximation of the optimal algorithm, so approximate a bit more

Clock (second-chance) Algorithm
- Goal: remove a page that has not been referenced recently (a good LRU-approximating algorithm)
- Idea: combine FIFO and LRU
- Keep a reference bit per page
  - On a memory reference: hardware sets the bit to 1
  - On page replacement: the OS finds a page whose reference bit is cleared
  - The OS traverses pages, clearing bits over time
- Each reference effectively accesses memory twice (the data and the reference bit); why is that okay?

Clock Algorithm Implementation
- The OS circulates through the pages, clearing reference bits, until it finds a page with the reference bit already 0
- Keep the pages in a circular list: the "clock"
- The pointer to the next victim is the clock hand
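The hand-sweeping scheme above can be simulated in a few lines of Python (an illustrative sketch; the class name and list-of-pairs representation are invented here):

```python
class Clock:
    """Clock / second-chance: circular frame list with reference bits."""

    def __init__(self, nframes):
        self.nframes = nframes
        self.frames = []           # list of [page, reference_bit]
        self.hand = 0              # index of the next victim candidate

    def reference(self, page):
        """Reference a page; return the evicted page, or None."""
        for entry in self.frames:
            if entry[0] == page:
                entry[1] = 1       # hardware would set the reference bit
                return None
        if len(self.frames) < self.nframes:
            self.frames.append([page, 1])
            return None
        # sweep the hand, clearing bits, until a cleared bit is found
        while self.frames[self.hand][1] == 1:
            self.frames[self.hand][1] = 0      # second chance granted
            self.hand = (self.hand + 1) % self.nframes
        victim = self.frames[self.hand][0]
        self.frames[self.hand] = [page, 1]
        self.hand = (self.hand + 1) % self.nframes
        return victim
```

When every bit is set, the sweep clears them all and the algorithm degenerates to FIFO, which is exactly the "combine FIFO and LRU" idea.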

Second-Chance (clock) Page-Replacement Algorithm

Clock Algorithm
- Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
- [Figure on slide: frame contents and reference bits after each access in the trace]

Clock Algorithm Extension
- Problem with the clock algorithm: it does not differentiate dirty vs. clean pages
- Dirty page: a page that has been modified and must be written back to disk
- Replacing a dirty page is more expensive than replacing a clean one: one extra disk write (~5 ms)

Clock Algorithm Extension
- Use the dirty bit to give preference to dirty pages
- On a page reference
  - Read: hardware sets the reference bit
  - Write: hardware sets the dirty bit
- On page replacement, at the page under the hand:
  - reference = 0, dirty = 0: victim page
  - reference = 0, dirty = 1: skip (don't change the bits)
  - reference = 1, dirty = 0: set reference = 0, dirty = 0
  - reference = 1, dirty = 1: set reference = 0, dirty = 1
  - Advance the hand and repeat
- If no victim page is found, run the swap daemon to flush unreferenced dirty pages to disk, then repeat
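The four-case table above can be captured directly as a decision function (a hypothetical helper written for this sketch, not kernel code):

```python
def clock_step(ref, dirty):
    """One hand step of the extended clock at a single frame.

    Returns (action, new_ref, new_dirty) per the four-case table:
    clean and unreferenced pages are evicted; dirty unreferenced pages
    are skipped; referenced pages lose their reference bit only.
    """
    if ref == 0 and dirty == 0:
        return ("evict", 0, 0)     # cheapest victim: no writeback needed
    if ref == 0 and dirty == 1:
        return ("skip", 0, 1)      # prefer clean pages; bits unchanged
    # referenced: clear the reference bit, keep the dirty bit as-is
    return ("advance", 0, dirty)
```

A full replacement loop would apply `clock_step` at each frame, advancing the hand until it returns `"evict"`.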

Problem with LRU-based Algorithms
- LRU does not handle repeated scans well when the data set is bigger than memory
  - Example: a 4-frame memory scanning 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 faults on every reference under LRU
- Solution: the Most Recently Used (MRU) algorithm
  - Replace the most recently used page
  - Best for repeated scans

Problem with LRU-based Algorithms (cont.)
- LRU ignores frequency
  - Intuition: a frequently accessed page is more likely to be accessed in the future than a page accessed just once
  - Problematic workload: scanning a large data set
    - 1 2 3 1 2 3 1 2 3 1 2 3 ... (pages frequently used)
    - 4 5 6 7 8 9 10 11 12 ... (pages used just once)
- Solution: track access frequency
  - Least Frequently Used (LFU) algorithm: too expensive in its exact form
  - Approximate LFU instead:
    - LRU-k: evict the page with the oldest timestamp for its k-th most recent reference
    - LRU-2Q: don't promote a page on a single reference; try to differentiate hot and cold pages

Linux Page Replacement Algorithm
- Similar to the LRU-2Q algorithm
- Two LRU lists
  - Active list: hot pages, recently referenced
  - Inactive list: cold pages, not recently referenced
- Page replacement: select victims from the inactive list
- Moving a page between the active and inactive lists requires two references (or two missed reference checks)
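A toy model of this two-list scheme, in the same Python style as the earlier sketches (this is an invented simplification to show the promote/demote flow, not the kernel's actual logic):

```python
from collections import OrderedDict

class TwoListLRU:
    """2Q-like replacement: new pages start cold, promote on reuse."""

    def __init__(self, nframes):
        self.nframes = nframes
        self.active = OrderedDict()     # hot pages, referenced at least twice
        self.inactive = OrderedDict()   # cold pages: replacement candidates

    def reference(self, page):
        """Reference a page; return the evicted page, or None."""
        victim = None
        if page in self.active:
            self.active.move_to_end(page)          # still hot
        elif page in self.inactive:
            del self.inactive[page]                # second reference:
            self.active[page] = True               # promote to active
        else:
            if len(self.active) + len(self.inactive) >= self.nframes:
                if not self.inactive:
                    # refill inactive by demoting the oldest active page
                    old, _ = self.active.popitem(last=False)
                    self.inactive[old] = True
                # victims always come from the inactive list
                victim, _ = self.inactive.popitem(last=False)
            self.inactive[page] = True             # new pages start cold
        return victim
```

Note how a page scanned once never displaces a hot page: it enters and leaves through the inactive list, which is the point of distinguishing hot from cold.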

Allocating Memory to Processes
- Split pages among processes
- Split pages among users
- Global allocation

Thrashing Example
- A set of processes frequently references 5 pages, but physical memory has only 4 frames
- The system repeats:
  - Reference a page that is not in memory
  - Replace a page in memory with the newly referenced page
- Thrashing: the system is busy reading and writing pages instead of executing useful instructions
  - CPU utilization is low
  - Average memory access time approaches disk access time
  - The illusion breaks: memory appears as slow as disk, rather than disk appearing as fast as memory
- Adding more processes makes thrashing worse

Working Set: Informal Definition
- The collection of pages a process is referencing frequently
- Equivalently, the collection of pages that must be resident to avoid thrashing
- Methods exist to estimate a process's working set and thereby avoid thrashing

Memory Management Summary
- "All problems in computer science can be solved by another level of indirection" (David Wheeler)
- Memory management techniques: contiguous allocation, paging, segmentation, paging + segmentation
  - In practice, hierarchical paging is most widely used; segmentation is losing popularity
  - Some RISC architectures do not even support segmentation
- Virtual memory: the OS and hardware exploit locality to provide the illusion of fast memory as large as disk
  - A similar technique is used throughout the memory hierarchy
- Page replacement algorithms
  - LRU: the past predicts the future
  - No silver bullet: choose the algorithm based on the workload

Current Trends
- Memory management is less critical now
  - Personal computers vs. time-sharing machines
  - Memory is cheap, so physical memories are larger
- Segmentation is becoming less popular
  - Some RISC chips don't even support segmentation
- Larger page sizes (even multiple page sizes)
  - Better TLB coverage
  - Smaller page tables, fewer pages to manage
  - Cost: internal fragmentation
- Larger virtual address spaces
  - 64-bit address spaces
  - Sparse address spaces
- File I/O through the virtual memory system
  - Memory-mapped I/O: mmap()
  - Use VM for file caching: access file data as if it were memory; page-fault handling drives file reads/writes
  - madvise(): gives hints the OS may ignore

Today: Virtual Memory Implementation and Linux Memory Management
- Implementing LRU
- How to approximate LRU efficiently?
- Linux memory management
  - Page replacement
  - Segmentation and paging
  - Dynamic memory allocation

Page Descriptor
- Keeps track of the status of each page frame: struct page, include/linux/mm.h
- Each descriptor has two bits relevant to the page replacement policy:
  - PG_active: is the page on the active list?
  - PG_referenced: was the page referenced recently?

Memory Zones
- Pages are tracked in different zones: struct zone, include/linux/mmzone.h
  - ZONE_DMA: below 16 MB
  - ZONE_NORMAL: 16 MB to 896 MB
  - ZONE_HIGHMEM: above 896 MB
- Each zone keeps two LRU lists of pages: active_list and inactive_list

Functions
- lru_cache_add*(): add a page to the page cache
- mark_page_accessed(): move a page from the inactive to the active list
- page_referenced(): test whether a page has been referenced
- refill_inactive_zone(): move pages from the active to the inactive list
- When are pages replaced? Usually from free_more_memory(); the call chain is:
  free_more_memory -> try_to_free_pages -> shrink_caches -> shrink_zone -> refill_inactive_zone / shrink_cache -> shrink_list -> pageout()

Today: Virtual Memory Implementation and Linux Memory Management
- Implementing LRU
- How to approximate LRU efficiently?
- Linux memory management
  - Page replacement
  - Segmentation and paging
  - Dynamic memory allocation

Recall: x86 Segmentation and Paging Hardware
- The CPU generates a logical address, which goes to the segmentation unit
  - The segmentation unit produces a linear address
- The linear address goes to the paging unit
  - The paging unit generates the physical address in main memory
- Together, the segmentation and paging units form the equivalent of the MMU

Recall: Linux Process Address Space
- The 4 GB address space is split at 0xC0000000: user space below (user mode), kernel space above (kernel mode)
- User space (0 to 3 GB): the task's code and data, shared runtime libraries, and the user-mode stack area
- Kernel space (3 GB to 4 GB): the process descriptor and the kernel-mode stack
- Kernel space is also mapped into every user address space, so entering kernel mode from user mode requires no address-space switch

Linux Segmentation
- Linux makes minimal use of segmentation
  - More portable: some RISC architectures don't support segmentation
  - Hierarchical paging is flexible enough
- x86 segmentation hardware cannot be disabled, so Linux simply sets up a flat segment table (arch/i386/kernel/head.S)
  - Base = 0x00000000, limit = 0xffffffff
  - As a result, logical addresses == linear addresses
- Protection: the Descriptor Privilege Level (DPL) indicates privileged vs. user mode
  - User code segment: DPL = 3
  - Kernel code segment: DPL = 0

Linux Paging
- Linux uses paging to translate linear addresses to physical addresses
- The generic page model splits a linear address into five parts: global dir, upper dir, middle dir, table, offset
- The translation walks: cr3 -> page global dir -> page upper dir -> page middle dir -> page table -> page
- On 32-bit x86 without Physical Address Extension (PAE, a hardware feature that enlarges the addressable physical memory range), only the global dir, table, and offset are used
- Questions: does cr3 hold a physical or a logical address? What about the entries stored in the page global directory?

Kernel Address Space Layout (0xC0000000 to 0xFFFFFFFF)
- Physical memory mapping: up to 896 MB; the mapping is simple: physical address = virtual address - 0xC0000000
- vmalloc areas
- Persistent high-memory mappings
- Fix-mapped linear addresses

Linux Page Table Operations
- include/asm-i386/pgtable.h
- arch/i386/mm/hugetlbpage.c
- Example: mk_pte

TLB Flush Operations
- include/asm-i386/tlbflush.h
- Flushing the TLB on x86:
  - Loading cr3 flushes all TLB entries
  - invlpg addr flushes a single TLB entry

Today: Virtual Memory Implementation and Linux Memory Management
- Implementing LRU
- How to approximate LRU efficiently?
- Linux memory management
  - Page replacement
  - Segmentation and paging
  - Dynamic memory allocation

Linux Page Allocation
- Linux uses a buddy allocator for page allocation
- Buddy allocator: fast, simple allocation for blocks of 2^n pages [Knuth 1968]
- Allocation restriction: blocks of 2^n pages
- Allocating k pages:
  - Round k up to the nearest power of two
  - Search the free lists for a block of that size
  - If only larger blocks are free, recursively split one in half until a block of the right size is produced; the split halves are "buddy" blocks
- Freeing: recursively coalesce a block with its buddy whenever the buddy is also free
- Example: allocating a 256-page block
- Code: mm/page_alloc.c; 11 free lists (block sizes 1, 2, 4, ..., 512, 1024 pages); __alloc_pages
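The power-of-two arithmetic at the heart of the buddy scheme fits in a few helper functions; this is an illustrative sketch (all three function names are invented, and real allocators track per-order free lists rather than just indices):

```python
def buddy_order(k):
    """Smallest order n with 2**n >= k: the block order for k pages."""
    n = 0
    while (1 << n) < k:
        n += 1
    return n

def buddy_index(block, order):
    """Page index of a block's buddy: flip bit `order` of its index.

    Two 2**order-page blocks are buddies iff their indices differ
    only in that bit, which is what makes coalescing O(1) to test.
    """
    return block ^ (1 << order)

def split_down(order, target):
    """Splitting block 0 of size 2**order down to size 2**target:
    returns the page indices of the upper halves freed along the way."""
    freed = []
    while order > target:
        order -= 1
        freed.append(1 << order)   # upper half goes back on a free list
    return freed
```

For example, a request for 5 pages is rounded up to order 3 (8 pages), and splitting an order-3 block down to order 0 releases blocks at indices 4, 2, and 1 onto the smaller free lists.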

Advantages and Disadvantages of Buddy Allocation
- Fast and simple compared to general dynamic memory allocation
- Avoids external fragmentation by keeping free pages physically contiguous
- Why not just remap with paging instead? Three problems:
  - DMA bypasses paging
  - Modifying the page table forces TLB flushes
  - Non-contiguous frames cannot back "super pages" that increase TLB coverage
- Disadvantages: internal fragmentation
  - When allocating k pages with k != 2^n
  - When allocating small objects (smaller than a page)

The Linux Slab Allocator
- For objects smaller than a page; implemented on top of the page allocator
- Each memory region is called a cache
- Two types of slab allocator:
  - Fixed-size: a cache contains objects of one size, for frequently allocated object types
  - General-purpose: caches contain objects of size 2^n, for less frequently allocated objects
    - An allocation of size k is rounded up to the nearest 2^n
- Code: mm/slab.c
  - kmem_cache_init: initializes the general-purpose caches (sizes listed in include/linux/kmalloc_sizes.h)
  - kmem_cache_create: creates a new cache
  - __kmalloc: general-purpose allocation entry point
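The round-up rule for the general-purpose caches can be sketched as follows (the size-class list here is illustrative only, not Linux's exact kmalloc_sizes.h table, and the function name is invented):

```python
# Illustrative power-of-two size classes; NOT the kernel's exact list.
SIZE_CLASSES = (32, 64, 128, 256, 512, 1024, 2048, 4096)

def kmalloc_class(size):
    """Round a request up to the nearest general-purpose size class.

    An object of size k is served from the smallest cache whose
    object size is >= k; the slack is internal fragmentation.
    """
    for c in SIZE_CLASSES:
        if size <= c:
            return c
    raise ValueError("large request: use the page allocator instead")
```

A 100-byte request lands in the 128-byte cache, wasting 28 bytes, which is the internal-fragmentation cost listed on the next slide.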

Advantages and Disadvantages of Slab Allocation
- Fast: no page frames allocated and freed on every request
  - Allocation: no size search in the fixed-size allocator; only a simple search in the general-purpose allocator
  - Free: no merging with adjacent free blocks
- Reduces internal fragmentation: many objects fit in one page
- Disadvantages:
  - Memory overhead for bookkeeping
  - Internal fragmentation within size classes