Memory Management

Pages
The page is the basic unit of memory management: the MMU typically deals in pages and manages the system's page tables with page-sized granularity. Many architectures even support multiple page sizes. Most 32-bit architectures use 4KB pages, while some 64-bit architectures (Alpha, for example) use 8KB pages. A machine with 4KB pages and 1GB of physical memory therefore has 1GB / 4KB = 262,144 distinct physical pages.

struct page

    struct page {
        page_flags_t flags;             /* status: dirty? locked? etc. */
        atomic_t _count;                /* usage count */
        atomic_t _mapcount;
        unsigned long private;
        struct address_space *mapping;
        pgoff_t index;
        struct list_head lru;
        void *virtual;                  /* page's virtual address */
    };

Zones
Zones exist to deal with two hardware shortcomings with respect to memory addressing:
Some devices can perform DMA (direct memory access) only to certain memory addresses. On the x86 architecture, for example, ISA devices can access only the first 16MB of physical memory.
Some architectures can physically address more memory than they can virtually address, so some physical memory cannot be permanently mapped into the kernel address space.

Zones on x86
Linux has three memory zones on x86:
ZONE_DMA contains pages capable of undergoing DMA (below 16MB).
ZONE_NORMAL contains normal, regularly mapped pages (16MB to 896MB).
ZONE_HIGHMEM contains high memory: pages not permanently mapped into the kernel's address space (above 896MB).

Getting Pages
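All the low-level page allocators deal in powers of two: they allocate 2^order contiguous physical pages. As a sketch, the core 2.6-era interfaces look like this:

    /* allocate 2^order physically contiguous pages;
       returns the first struct page, or NULL on failure */
    struct page *alloc_pages(gfp_t gfp_mask, unsigned int order);

    /* same allocation, but returns the logical (virtual) address of the first page */
    unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);

    /* single-page (order-zero) convenience variants */
    struct page *alloc_page(gfp_t gfp_mask);
    unsigned long __get_free_page(gfp_t gfp_mask);

    /* a single zero-filled page -- useful for memory handed back to user space */
    unsigned long get_zeroed_page(gfp_t gfp_mask);

    /* convert a struct page to its logical address */
    void *page_address(struct page *page);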

Freeing pages
    void __free_pages(struct page *page, unsigned int order)
    void free_pages(unsigned long addr, unsigned int order)
    void free_page(unsigned long addr)
Be careful to free only pages you allocate! Passing the wrong struct page, address, or order can corrupt memory.

Example
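As a minimal sketch of these interfaces in use, allocating and later freeing eight contiguous pages:

    unsigned long page;

    /* request 2^3 = 8 physically contiguous pages */
    page = __get_free_pages(GFP_KERNEL, 3);
    if (!page) {
        /* insufficient memory: the caller must handle the error */
        return -ENOMEM;
    }

    /* 'page' holds the logical address of the first of the eight pages */

    /* free them with the same order used to allocate */
    free_pages(page, 3);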

kmalloc()
The kmalloc() function is a simple interface for obtaining kernel memory in byte-sized chunks:
    void *kmalloc(size_t size, int flags)
To free a block of memory obtained from kmalloc():
    void kfree(const void *ptr)
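A typical use, sketched with a hypothetical struct dog:

    struct dog *p;

    /* allocate room for one struct dog; GFP_KERNEL may sleep */
    p = kmalloc(sizeof(struct dog), GFP_KERNEL);
    if (!p)
        return -ENOMEM;   /* the allocation failed */

    /* ... use p ... */

    kfree(p);             /* return the memory when finished */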

gfp_mask Flags
The flags fall into three categories: action modifiers, zone modifiers, and type flags.
Action modifiers specify how the kernel is supposed to allocate the requested memory. For example, interrupt handlers must instruct the kernel not to sleep in the course of allocating memory.
Zone modifiers specify from where to allocate memory.
Type flags are combinations of action and zone modifiers that simplify specifying numerous modifiers at once.

Action Modifiers
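The commonly listed 2.6-era action modifiers (a partial list) are:
__GFP_WAIT – the allocator can sleep.
__GFP_HIGH – the allocator can access emergency pools.
__GFP_IO – the allocator can start disk I/O.
__GFP_FS – the allocator can start filesystem I/O.
__GFP_COLD – the allocator should use cache-cold pages.
__GFP_NOWARN – the allocator does not print failure warnings.
__GFP_REPEAT – retry on failure (the allocation may still fail).
__GFP_NOFAIL – retry indefinitely; the allocation cannot fail.
__GFP_NORETRY – never retry on failure.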

Zone Modifiers
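On 2.6-era x86 there are only two zone modifiers:
__GFP_DMA – allocate only from ZONE_DMA.
__GFP_HIGHMEM – allocate from ZONE_HIGHMEM or ZONE_NORMAL.
Specifying neither modifier allocates from ZONE_DMA or ZONE_NORMAL, with a strong preference for ZONE_NORMAL.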

Type Flags (cont)
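The usual type flags and when to use them:
GFP_ATOMIC – high priority, must not sleep: interrupt handlers, bottom halves, and code holding a spinlock.
GFP_NOIO – can block, but must not initiate disk I/O.
GFP_NOFS – can block and start disk I/O, but must not start filesystem I/O.
GFP_KERNEL – the normal allocation; may sleep; for process context that can safely sleep.
GFP_USER – a normal allocation for user-space pages; may sleep.
GFP_HIGHUSER – an allocation from ZONE_HIGHMEM for user-space pages; may sleep.
GFP_DMA – an allocation from ZONE_DMA, used together with one of the above.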

Modifiers Behind Each Type Flag
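For reference, the type flags expand as follows (a sketch matching the 2.6-era <linux/gfp.h>):

    GFP_ATOMIC   = __GFP_HIGH
    GFP_NOIO     = __GFP_WAIT
    GFP_NOFS     = __GFP_WAIT | __GFP_IO
    GFP_KERNEL   = __GFP_WAIT | __GFP_IO | __GFP_FS
    GFP_USER     = __GFP_WAIT | __GFP_IO | __GFP_FS
    GFP_HIGHUSER = __GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HIGHMEM
    GFP_DMA      = __GFP_DMA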

vmalloc()
kmalloc() vs. vmalloc():
kmalloc() guarantees that the pages are physically contiguous (and therefore also virtually contiguous).
vmalloc() allocates memory that is only virtually contiguous, not necessarily physically contiguous: the pages it returns are contiguous within the kernel's virtual address space, but there is no guarantee that they are actually contiguous in physical RAM.
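The declarations, for reference (a sketch of the 2.6-era <linux/vmalloc.h>):

    /* allocate 'size' bytes of virtually contiguous memory; may sleep */
    void *vmalloc(unsigned long size);

    /* free a region previously obtained from vmalloc() */
    void vfree(void *addr);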

Memory for Hardware Devices
Mostly, only hardware devices require physically contiguous memory allocations. Hardware devices often live on the other side of the memory management unit and thus do not understand virtual addresses, so memory buffers shared with such devices must exist as physically contiguous blocks, not merely virtually contiguous ones.

But kmalloc() Is Preferred…
For performance, most kernel code uses kmalloc() rather than vmalloc() to obtain memory. To make non-physically contiguous pages contiguous in the virtual address space, vmalloc() must specifically set up the page table entries. Worse, pages obtained via vmalloc() must be mapped one page at a time (because they are not physically contiguous), which results in much greater TLB thrashing than when directly mapped memory is used. Consequently, vmalloc() is used only when absolutely necessary, typically to obtain very large regions of memory. For example, when modules are dynamically inserted into the kernel, they are loaded into memory obtained via vmalloc().

Slab and Free Lists
To facilitate frequent allocations and deallocations of data structures, programmers often introduce free lists. A free list contains a block of available, already allocated data structures. When code requires a new instance of a data structure, it can grab one off the free list rather than allocate a sufficient amount of memory and set it up. Later, when the data structure is no longer needed, it is returned to the free list instead of being deallocated. In this sense, the free list acts as an object cache, caching a frequently used type of object.

Slab Layer
The slab allocator was first implemented in SunOS 5.4; Linux shares the same name and basic design. It is built on several observations:
Frequently used data structures tend to be allocated and freed often, so cache them.
Frequent allocation and deallocation can result in memory fragmentation. A free list provides improved performance, because a freed object can be immediately returned by the next allocation.
If the allocator is aware of concepts such as object size, page size, and total cache size, it can make more intelligent decisions.
If part of the cache is made per-processor, allocations and frees can be performed without an SMP lock.
If the allocator is NUMA-aware, it can fulfill allocations from the same memory node as the requestor.
Stored objects can be colored to prevent multiple objects from mapping to the same cache lines.

The slab allocator components
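In outline: each cache is divided into slabs, and each slab is composed of one or more physically contiguous pages holding some number of objects of the cached type. A slab is in one of three states: full, partial, or empty. Allocations are satisfied from partial slabs first, then from empty slabs; a brand-new slab is allocated (via kmem_getpages()) only when neither exists.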

cache and slab descriptors

Creating a Cache
A new cache is created via:

    kmem_cache_t *kmem_cache_create(const char *name,
                                    size_t size,
                                    size_t align,
                                    unsigned long flags,
                                    void (*ctor)(void *, kmem_cache_t *, unsigned long),
                                    void (*dtor)(void *, kmem_cache_t *, unsigned long));

In practice, caches in the Linux kernel do not often utilize a constructor (ctor) or destructor (dtor); NULL may be passed for both.

Destroying a Cache
To destroy a cache, call:
    int kmem_cache_destroy(kmem_cache_t *cachep);

Useful SLAB Flags (1/2)
SLAB_HWCACHE_ALIGN: align each object within a slab to a cache line. This prevents false sharing (two or more objects sharing a cache line despite living at different addresses). It improves performance but comes at the cost of an increased memory footprint. For frequently used caches in performance-critical code, setting this option is a good idea; otherwise, think twice.

Useful SLAB Flags (2/2)
SLAB_POISON: fill the slab with a known value (a5a5a5a5). Poisoning is useful for catching access to uninitialized memory.
SLAB_RED_ZONE: insert "red zones" around the allocated memory to help detect buffer overruns.
SLAB_PANIC: panic if the allocation fails. This flag is useful when the allocation must not fail, e.g. allocating the VMA structure cache during bootup.
SLAB_CACHE_DMA: allocate each slab in DMA-able memory (ZONE_DMA).

Allocating a Cache Object
After a cache is created, an object is obtained from it via:
    void *kmem_cache_alloc(kmem_cache_t *cachep, int flags);
This returns a pointer to an object from the given cache cachep. If no free objects are in any slab in the cache, the slab layer must obtain new pages via kmem_getpages(), and the value of flags is passed on to __get_free_pages(). These are the same flags discussed earlier; you probably want GFP_KERNEL or GFP_ATOMIC.

Freeing a Cache Object
To later free an object and return it to its originating slab, use:
    void kmem_cache_free(kmem_cache_t *cachep, void *objp);

Allocating a SLAB: kmem_getpages()

    static inline void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags)
    {
        void *addr;

        flags |= cachep->gfpflags;
        addr = (void *)__get_free_pages(flags, cachep->gfporder);

        return addr;
    }

kmem_getpages() & kmem_freepages()
Since the point of the slab layer is to refrain from allocating and freeing pages:
kmem_getpages() is invoked only when no partial or empty slab exists in the given cache.
kmem_freepages() is called only when available memory grows low and the system is attempting to free memory, or when a cache is explicitly destroyed; it calls free_pages() to release the given cache's pages.

SLAB Example – fork (1/2)
A pointer to the task_struct cache:
    kmem_cache_t *task_struct_cachep;
During kernel initialization, in fork_init(), the cache is created:

    task_struct_cachep = kmem_cache_create("task_struct",
                                           sizeof(struct task_struct),
                                           ARCH_MIN_TASKALIGN,
                                           SLAB_PANIC,
                                           NULL, NULL);

ARCH_MIN_TASKALIGN is an architecture-specific alignment, typically defined as L1_CACHE_BYTES, the size in bytes of an L1 cache line. Because SLAB_PANIC is given, the slab allocator calls panic() if the allocation fails; if you do not provide this flag, you must check the return value!

SLAB Example – fork (2/2)
Creating a new process descriptor, in fork() → do_fork() → dup_task_struct():

    struct task_struct *tsk;

    tsk = kmem_cache_alloc(task_struct_cachep, GFP_KERNEL);
    if (!tsk)
        return NULL;

Freeing a process descriptor is done in free_task_struct():
    kmem_cache_free(task_struct_cachep, tsk);
The slab layer handles all the low-level alignment, coloring, allocation, freeing, and reaping during low-memory conditions. If you frequently create many objects of the same type, consider using a slab cache; definitely do not implement your own free list!

Kernel Stack
The size of the kernel stack is limited: each process's entire kernel-mode call chain has to fit in it.
Default size on 32-bit x86: 8KB (pre-2.6); 4KB as an option in 2.6.
Shared with interrupt handlers? Yes (pre-2.6); 2.6 introduced separate interrupt stacks.

Kernel Stack: Usage
You must keep stack usage to a minimum. Performing a large static allocation on the stack, such as a big array or structure, is dangerous: a stack overflow will crash the system! It is wise to use a dynamic allocation instead.
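A sketch of the difference:

    /* dangerous: this buffer alone consumes an entire 4KB kernel stack */
    char bad_buf[4096];

    /* safer: take the buffer from the heap instead */
    char *buf = kmalloc(4096, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;
    /* ... use buf ... */
    kfree(buf);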

Per-CPU Allocations (1/3)
Modern SMP-capable operating systems make extensive use of per-CPU data, data that is unique to a given processor. Typically, per-CPU data is stored in an array indexed by processor number:

    unsigned long my_percpu[NR_CPUS];
    int cpu;

    cpu = get_cpu();    /* get current CPU and disable kernel preemption */
    my_percpu[cpu]++;   /* ... or whatever */
    printk("my_percpu on cpu=%d is %lu\n", cpu, my_percpu[cpu]);
    put_cpu();          /* enable kernel preemption */

No lock is required, so long as no processor touches this data except the current one.

Per-CPU Allocations (2/3)
Kernel preemption poses two problems:
If your code is preempted and rescheduled on another processor, the cpu variable is no longer valid, because it refers to the wrong processor.
If another task preempts your code, it can concurrently access my_percpu on the same processor: a race condition.
This is why, in the example above, kernel preemption is disabled between get_cpu() and put_cpu().

Per-CPU Allocations (3/3) Note that if you use a call to smp_processor_id() to get the current processor number, kernel preemption is not disabled. Always use the aforementioned methods to remain safe.

The New percpu Interface
Define a per-CPU variable at compile time:
    DEFINE_PER_CPU(type, name);
Manipulating the variables: get_cpu_var() returns an lvalue for the given variable on the current processor and disables kernel preemption, which put_cpu_var() correspondingly re-enables.

    get_cpu_var(name)++;  /* increment name on this processor */
    put_cpu_var(name);    /* done; enable kernel preemption */

To reach another processor's per-CPU data:
    per_cpu(name, cpu)++; /* increment name on the given processor */
Be careful: per_cpu() neither disables kernel preemption nor provides any sort of locking mechanism.
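The interface also supports per-CPU data allocated at run time. A sketch of the 2.6-era calls, using a hypothetical struct my_struct:

    struct my_struct *ptr, *s;

    /* allocate one instance of struct my_struct for every processor */
    ptr = alloc_percpu(struct my_struct);
    if (!ptr)
        return -ENOMEM;

    s = get_cpu_ptr(ptr);   /* this processor's copy; disables preemption */
    /* ... work with s ... */
    put_cpu_ptr(ptr);       /* done; re-enables preemption */

    free_percpu(ptr);       /* free every processor's copy */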

Per-CPU Data - Benefits
Reduced locking requirements: you might not need any locking at all, if you can ensure that the local processor accesses only its own data.
Greatly reduced cache invalidation: the percpu interface cache-aligns all data, ensuring that accessing one processor's data does not pull another processor's data onto the same cache line.

Per-CPU Data - Usage
The only safety requirement is disabling kernel preemption, which is much cheaper than locking, and the interface does so automatically. Per-CPU data can safely be used from either interrupt or process context, but you cannot sleep in the middle of accessing it (or you might wake up on a different processor). The new interface is much easier to use than the static array approach and might gain additional optimizations in the future; however, it is not backward compatible with earlier kernels.

Memory Allocation - Summary
Need contiguous physical pages → low-level page allocators or kmalloc(). This is standard practice within the kernel. Use GFP_ATOMIC for a high-priority allocation that will not sleep (e.g. in interrupt handlers); use GFP_KERNEL for code that can sleep.
No need for physically contiguous pages → vmalloc(). Maps chunks of physical memory into a contiguous logical region, at a slight performance hit relative to kmalloc().
Allocating from high memory → alloc_pages(). Because high memory might not be mapped, the only way to access it might be via the corresponding struct page structure. To obtain an actual pointer, use kmap() to map the high memory into the kernel's logical address space.
Creating and destroying many large data structures → slab cache. The slab layer maintains a per-processor object cache (a free list) that can greatly enhance allocation and deallocation performance.
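A brief sketch of the kmap() path mentioned above (process context is assumed, since kmap() may sleep):

    struct page *page;
    void *addr;

    /* allocate one page that may come from high memory */
    page = alloc_pages(GFP_HIGHUSER, 0);
    if (!page)
        return -ENOMEM;

    addr = kmap(page);      /* map into the kernel's address space; may sleep */
    /* ... access the page through addr ... */
    kunmap(page);           /* release the temporary mapping */

    __free_pages(page, 0);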

Buddy System

The Concept
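In brief: the buddy allocator keeps free memory in blocks of 2^order pages. A request is rounded up to the nearest power of two; if no free block of the requested order exists, a larger block is split in half repeatedly, and each unused half (the "buddy") is placed on the free list of the next-smaller order. When a block is freed, it is merged with its buddy whenever that buddy is also entirely free, and the merging cascades upward, which keeps external fragmentation low. For example, to satisfy an order-1 (2-page) request when only an order-3 (8-page) block is free: split 8 into 4 + 4, split one 4 into 2 + 2, hand out one 2-page block, and place the leftover order-2 and order-1 blocks on their respective free lists.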

The Data Structure Used by Linux
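A sketch of what Linux keeps per zone (matching the 2.6-era <linux/mmzone.h>): an array of free lists, one entry per block order:

    /* one free list per order of block size */
    struct free_area {
        struct list_head free_list;  /* free blocks of 2^order pages */
        unsigned long nr_free;       /* how many such blocks are free */
    };

    struct zone {
        /* ... */
        struct free_area free_area[MAX_ORDER];  /* MAX_ORDER is 11 by default */
        /* ... */
    };

An order-k allocation searches free_area[k] first and then successively higher orders, splitting blocks as needed; freeing reverses the process, coalescing buddies back into higher-order entries.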