Linux Kernel Development: Memory Management. Pavel Sorokin, Gyeongsang National University, 2007-05-04.

2 Overview
Unlike user space, the kernel is not always afforded the capability to allocate memory easily.
Topics:
  Pages
  Zones
  Memory allocation procedures
  Slab layer

3 Pages
Physical pages are the basic unit of memory management.
Each architecture enforces its own page size:
  many 32-bit architectures use 4 KB pages
  some 64-bit architectures use 8 KB pages (x86-64, however, uses 4 KB)
The kernel represents every physical page on the system with a struct page structure:

    struct page {
        page_flags_t          flags;
        atomic_t              _count;
        atomic_t              _mapcount;
        unsigned long         private;
        struct address_space  *mapping;
        pgoff_t               index;
        struct list_head      lru;
        void                  *virtual;
    };

  flags: stores the status of the page
  _count: how many references there are to this page
  virtual: stores the page's virtual address
The goal of the page structure is to describe physical memory, not the data contained therein: the data may be moved or swapped out, so it is not permanently tied to one physical page.

4 Zones
Because of hardware limitations, the kernel cannot treat all pages as identical, so it divides pages into zones:
  ZONE_DMA: pages capable of undergoing DMA
  ZONE_NORMAL: normal, regularly mapped pages
  ZONE_HIGHMEM: "high memory", pages not permanently mapped into the kernel's address space
Zones have no physical relevance; they are simply logical groupings the kernel uses to keep track of pages.
Each zone is represented by a struct zone, whose fields include:
  lock: spin lock protecting the structure from concurrent access
  free_pages: number of free pages in the zone
  name: NULL-terminated string holding the name of the zone

5 Zones
ZONE_DMA: some architectures cannot perform DMA (direct memory access) to all memory addresses.
ZONE_HIGHMEM: some architectures can physically address more memory than they can map directly into the kernel's address space.
ZONE_NORMAL: whatever is left over after ZONE_DMA and ZONE_HIGHMEM.
The zone layout varies by architecture. On x86:
  ZONE_DMA consists of memory from 0 to 16 MB
  ZONE_NORMAL consists of memory from 16 to 896 MB
  ZONE_HIGHMEM consists of memory above 896 MB
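The x86 boundaries above can be read as a simple classification of physical addresses. The following is a minimal userspace sketch, not kernel code: the `demo_` names are hypothetical, and the 16 MB and 896 MB limits are taken directly from the slide.

```c
#include <stdint.h>

/* Hypothetical names for illustration; the kernel's own zone types differ. */
enum demo_zone { DEMO_ZONE_DMA, DEMO_ZONE_NORMAL, DEMO_ZONE_HIGHMEM };

#define DEMO_MB (1024ull * 1024ull)

/* Classify a physical address using the x86 limits quoted on the slide. */
enum demo_zone demo_zone_for_phys(uint64_t phys_addr)
{
    if (phys_addr < 16 * DEMO_MB)
        return DEMO_ZONE_DMA;      /* 0 to 16 MB */
    if (phys_addr < 896 * DEMO_MB)
        return DEMO_ZONE_NORMAL;   /* 16 to 896 MB */
    return DEMO_ZONE_HIGHMEM;      /* above 896 MB */
}
```

For example, `demo_zone_for_phys(0x100000)` (1 MB) falls in the DMA zone, while an address at exactly 896 MB is the first one classified as high memory.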

6 Memory Allocation Procedures
The kernel provides one low-level mechanism for requesting memory, along with several interfaces for allocating and freeing it.
Core function: allocates 2^order contiguous physical pages and returns a pointer to the first page's page structure, or NULL on error:
    struct page *alloc_pages(unsigned int gfp_mask, unsigned int order)
Converts a given page to its logical address:
    void *page_address(struct page *page)
Allocates pages like alloc_pages(), but returns the logical address of the first page:
    unsigned long __get_free_pages(unsigned int gfp_mask, unsigned int order)
When only one page is needed:
    struct page *alloc_page(unsigned int gfp_mask)
    unsigned long __get_free_page(unsigned int gfp_mask)
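The order argument is a power-of-two exponent: an order of n requests 2^n contiguous pages. As a rough userspace sketch of how a byte count maps to an order, assuming a 4 KB page size (the `demo_` names and `DEMO_PAGE_SIZE` are hypothetical, not kernel symbols):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Number of pages covered by an allocation of the given order. */
size_t demo_pages_for_order(unsigned int order)
{
    return (size_t)1 << order;
}

/* Smallest order whose 2^order pages can hold `size` bytes. */
unsigned int demo_order_for_size(size_t size)
{
    size_t pages;
    unsigned int order = 0;

    assert(size > 0);
    /* Round the byte count up to whole pages. */
    pages = (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
    /* Round the page count up to the next power of two. */
    while (demo_pages_for_order(order) < pages)
        order++;
    return order;
}
```

A request for three pages' worth of bytes therefore needs order 2 (four pages), which is why page-level allocation can waste memory for sizes that are not close to a power of two.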

7 Memory Allocation Procedures
When pages are no longer needed, they should be freed. A family of functions frees allocated pages:
    void __free_pages(struct page *page, unsigned int order)
    void free_pages(unsigned long addr, unsigned int order)
    void free_page(unsigned long addr)
Take care to free only pages you allocated: passing the wrong page structure, address, or order can result in corruption.

8 Memory Allocation Procedures
For more general byte-sized allocations the kernel provides further functions.
Allocates a byte-sized chunk of physically contiguous memory:
    void *kmalloc(size_t size, int flags)
Allocates a byte-sized chunk that is only virtually contiguous:
    void *vmalloc(unsigned long size)
The matching functions that free the memory:
    void kfree(const void *ptr)
    void vfree(const void *ptr)

9 Flags of Memory Allocation
The gfp_mask flags are broken up into three categories:
  action modifiers
  zone modifiers
  types
All the flags are declared in

10 Flags of Memory Allocation
Action modifiers specify how the kernel is supposed to allocate the requested memory.

Flag            Description
__GFP_WAIT      The allocator can sleep
__GFP_HIGH      The allocator can access emergency pools
__GFP_IO        The allocator can start disk I/O
__GFP_FS        The allocator can start filesystem I/O
__GFP_COLD      The allocator should use cache-cold pages
__GFP_NOWARN    The allocator will not print failure messages
__GFP_REPEAT    The allocator will repeat the allocation if it fails, but the allocation can still fail
__GFP_NOFAIL    The allocator will repeat the allocation indefinitely; the allocation cannot fail
__GFP_NORETRY   The allocator will never retry if the allocation fails
__GFP_NO_GROW   Used internally by the slab layer
__GFP_COMP      Add compound page metadata; used internally by the hugetlb code

11 Flags of Memory Allocation
Zone modifiers specify from which memory zone the allocation should originate.

Flag            Description
__GFP_DMA       Allocate only from ZONE_DMA
__GFP_HIGHMEM   Allocate only from ZONE_HIGHMEM or ZONE_NORMAL

By default the kernel allocates memory from ZONE_NORMAL.
If neither flag is specified, the kernel fulfills the allocation from either ZONE_DMA or ZONE_NORMAL, with a preference for ZONE_NORMAL.

12 Flags of Memory Allocation
Type flags specify the action and zone modifiers required to fulfill a particular type of transaction.

Flag           Modifier flags                                            Description
GFP_ATOMIC     __GFP_HIGH                                                Allocation is high priority and must not sleep
GFP_NOIO       __GFP_WAIT                                                Allocation can block, but must not initiate disk I/O
GFP_NOFS       (__GFP_WAIT | __GFP_IO)                                   Allocation can block and initiate disk I/O, but will not initiate a filesystem operation
GFP_KERNEL     (__GFP_WAIT | __GFP_IO | __GFP_FS)                        Normal allocation; might block
GFP_USER       (__GFP_WAIT | __GFP_IO | __GFP_FS)                        Normal allocation; might block. For user-space processes
GFP_HIGHUSER   (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HIGHMEM)        Allocation from ZONE_HIGHMEM; might block. For user-space processes
GFP_DMA        __GFP_DMA                                                 Allocation from ZONE_DMA. For device drivers
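The type flags are bitwise ORs of action modifiers, so code can ask what an allocation is permitted to do by testing individual bits. The sketch below mirrors the compositions in the table; the `DEMO_` names and the bit values are made up for illustration and are not the kernel's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_WAIT 0x01u  /* may sleep */
#define DEMO_GFP_HIGH 0x02u  /* may use emergency pools */
#define DEMO_GFP_IO   0x04u  /* may start disk I/O */
#define DEMO_GFP_FS   0x08u  /* may start filesystem I/O */

/* Type flags composed as in the table above. */
#define DEMO_GFP_ATOMIC (DEMO_GFP_HIGH)
#define DEMO_GFP_NOIO   (DEMO_GFP_WAIT)
#define DEMO_GFP_NOFS   (DEMO_GFP_WAIT | DEMO_GFP_IO)
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)

bool demo_may_sleep(unsigned int mask)
{
    return (mask & DEMO_GFP_WAIT) != 0;
}

bool demo_may_start_disk_io(unsigned int mask)
{
    return (mask & DEMO_GFP_IO) != 0;
}

bool demo_may_start_fs_io(unsigned int mask)
{
    return (mask & DEMO_GFP_FS) != 0;
}
```

Read this way, the table's descriptions follow from the bits: the atomic mask lacks the wait bit and so must not sleep, and the no-FS mask keeps disk I/O but drops filesystem I/O.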

13 Flags of Memory Allocation
Which flag to use when: in most cases, use GFP_KERNEL or GFP_ATOMIC.

Situation                             Solution
Process context, can sleep            Use GFP_KERNEL
Process context, cannot sleep         Use GFP_ATOMIC, or perform your allocations with GFP_KERNEL at an earlier or later point when you can sleep
Interrupt handler                     Use GFP_ATOMIC
Softirq                               Use GFP_ATOMIC
Tasklet                               Use GFP_ATOMIC
Need DMA-able memory, can sleep       Use (GFP_DMA | GFP_KERNEL)
Need DMA-able memory, cannot sleep    Use (GFP_DMA | GFP_ATOMIC), or perform your allocation at an earlier point when you can sleep
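The table reduces to one question (can the caller sleep?) plus an optional DMA requirement. A small userspace sketch of that decision follows; the enum, struct, and function names are hypothetical, and the result only names which flags the table recommends rather than producing a real gfp_mask.

```c
#include <stdbool.h>

/* Hypothetical description of the calling context. */
enum demo_context {
    DEMO_CTX_PROCESS_CAN_SLEEP,
    DEMO_CTX_PROCESS_NO_SLEEP,
    DEMO_CTX_INTERRUPT,
    DEMO_CTX_SOFTIRQ,
    DEMO_CTX_TASKLET
};

enum demo_gfp_type { DEMO_USE_GFP_KERNEL, DEMO_USE_GFP_ATOMIC };

struct demo_gfp_choice {
    enum demo_gfp_type type;  /* GFP_KERNEL or GFP_ATOMIC */
    bool with_gfp_dma;        /* true: OR the type with GFP_DMA */
};

/* Pick the recommendation from the table for a context and DMA need. */
struct demo_gfp_choice demo_choose_gfp(enum demo_context ctx, bool need_dma)
{
    struct demo_gfp_choice choice;

    /* Only process context that may sleep gets GFP_KERNEL. */
    choice.type = (ctx == DEMO_CTX_PROCESS_CAN_SLEEP)
                      ? DEMO_USE_GFP_KERNEL
                      : DEMO_USE_GFP_ATOMIC;
    choice.with_gfp_dma = need_dma;
    return choice;
}
```

The sketch does not model the table's other option for contexts that cannot sleep, namely moving the allocation to a point where sleeping is allowed.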

14 Slab Layer
A free list facilitates frequent allocation and deallocation of data.
A free list contains a block of available, already allocated data structures.
When code requires a new instance of a data structure, it can grab one of the structures off the free list rather than allocate a new one.
When the data structure is no longer needed, it is returned to the free list instead of being deallocated.
The main problem is that there is no global control of free lists.
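The free-list idea can be sketched in ordinary userspace C, with malloc() standing in for the underlying allocator. All names here (`demo_obj`, `demo_free_list`, and the three functions) are hypothetical; this shows the pattern, not how any particular kernel subsystem implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_obj {
    int payload;
    struct demo_obj *next_free;  /* link used only while on the free list */
};

struct demo_free_list {
    struct demo_obj *head;
    size_t cached;               /* number of objects waiting for reuse */
};

/* Take an object off the free list, allocating only if the list is empty. */
struct demo_obj *demo_obj_get(struct demo_free_list *fl)
{
    struct demo_obj *obj = fl->head;

    if (obj) {
        fl->head = obj->next_free;
        fl->cached--;
    } else {
        obj = malloc(sizeof(*obj));
        if (!obj)
            return NULL;
    }
    obj->payload = 0;
    obj->next_free = NULL;
    return obj;
}

/* Return an object to the free list instead of deallocating it. */
void demo_obj_put(struct demo_free_list *fl, struct demo_obj *obj)
{
    assert(obj != NULL);
    obj->next_free = fl->head;
    fl->head = obj;
    fl->cached++;
}

/* Release everything still cached on the list. */
void demo_free_list_drain(struct demo_free_list *fl)
{
    while (fl->head) {
        struct demo_obj *obj = fl->head;
        fl->head = obj->next_free;
        free(obj);
    }
    fl->cached = 0;
}
```

Nothing outside the list's owner knows how many objects are cached or can ask for them back, which is the lack of global control that the slab layer addresses.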

15 Slab Layer
The slab layer was created to solve the problem of global free-list control. Its design rests on these ideas:
  Frequently used data structures tend to be allocated and freed often, so cache them.
  Free lists are arranged contiguously to prevent memory fragmentation.
  Free lists improve performance for data structures that are allocated and freed often.
  If part of the cache is made per-processor, allocations and frees can be performed without an SMP lock.

16 Slab Layer
Design of the slab layer:
  The slab layer divides different objects into groups called caches, one cache per object type.
  The caches are then divided into slabs.
  Each slab contains some number of objects.
Each slab is in one of three states:
  full: all objects in the slab are allocated; there are no free objects
  partial: the slab has some allocated objects and some free objects
  empty: no objects in the slab are allocated; all objects are free
When the kernel requests a new object:
  the request is satisfied from a partial slab, if one exists
  otherwise, the request is satisfied from an empty slab, if one exists
  otherwise, a new empty slab is allocated
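The three slab states and the lookup order for a new object can be modelled with per-slab counters. This is a deliberately simplified userspace sketch: it tracks only how many objects are in use, uses a fixed array instead of the kernel's lists, and every name and constant in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OBJS_PER_SLAB 2u  /* assumed capacity of one slab */
#define DEMO_MAX_SLABS     4u  /* assumed upper bound on slabs per cache */

enum demo_slab_state { DEMO_SLAB_EMPTY, DEMO_SLAB_PARTIAL, DEMO_SLAB_FULL };

struct demo_slab {
    int present;         /* non-zero once the slab has been created */
    unsigned int inuse;  /* allocated objects in this slab */
};

struct demo_cache {
    struct demo_slab slabs[DEMO_MAX_SLABS];
};

void demo_cache_init(struct demo_cache *c)
{
    for (size_t i = 0; i < DEMO_MAX_SLABS; i++) {
        c->slabs[i].present = 0;
        c->slabs[i].inuse = 0;
    }
}

enum demo_slab_state demo_slab_get_state(const struct demo_slab *s)
{
    if (s->inuse == 0)
        return DEMO_SLAB_EMPTY;
    if (s->inuse == DEMO_OBJS_PER_SLAB)
        return DEMO_SLAB_FULL;
    return DEMO_SLAB_PARTIAL;
}

/* Allocate one object; returns the index of the slab used, or -1. */
int demo_cache_alloc(struct demo_cache *c)
{
    int empty = -1, absent = -1;

    for (size_t i = 0; i < DEMO_MAX_SLABS; i++) {
        struct demo_slab *s = &c->slabs[i];

        if (!s->present) {
            if (absent < 0)
                absent = (int)i;
            continue;
        }
        if (demo_slab_get_state(s) == DEMO_SLAB_PARTIAL) {
            s->inuse++;              /* first choice: a partial slab */
            return (int)i;
        }
        if (demo_slab_get_state(s) == DEMO_SLAB_EMPTY && empty < 0)
            empty = (int)i;
    }
    if (empty < 0) {
        if (absent < 0)
            return -1;               /* cache cannot grow any further */
        c->slabs[absent].present = 1; /* last resort: create a new slab */
        c->slabs[absent].inuse = 0;
        empty = absent;
    }
    c->slabs[empty].inuse++;         /* second choice: an empty slab */
    return empty;
}

/* Free one object that was allocated from the given slab. */
void demo_cache_free(struct demo_cache *c, int slab_index)
{
    assert(slab_index >= 0 && (size_t)slab_index < DEMO_MAX_SLABS);
    assert(c->slabs[slab_index].present && c->slabs[slab_index].inuse > 0);
    c->slabs[slab_index].inuse--;
}
```

With a capacity of two objects per slab, the first two allocations fill slab 0 and the third creates slab 1; after slab 1's object is freed, the next allocation reuses that empty slab rather than creating another.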

17 Slab Layer
Relationship between caches, slabs, and objects: a cache contains slabs, and each slab contains objects.
Each cache is represented by a kmem_cache_s structure.
The kmem_cache_s structure contains three lists: slabs_full, slabs_partial, and slabs_empty.
Each slab is represented by a struct slab:

    struct slab {
        struct list_head  list;       /* full, partial, or empty list */
        unsigned long     colouroff;  /* offset for the slab coloring */
        void              *s_mem;     /* first object in the slab */
        unsigned int      inuse;      /* allocated objects in the slab */
        kmem_bufctl_t     free;       /* first free object, if any */
    };

18 Slab Layer
Slab layer memory management:
  memory for new slabs is allocated with __get_free_pages()
  memory for slabs is released with kmem_freepages()
The slab layer allocates memory only when a given cache has no partial or empty slabs.
The slab layer is managed on a per-cache basis through a simple interface, which is exported to the entire kernel.
The interface allows the creation and destruction of caches and the allocation and freeing of objects within the caches.