Presentation transcript:

Simon Jackson, James Sleeman, Pete Hemery

Simon Jackson

 Physical memory is divided into a number of “zones”  ZONE_DMA : 0 – 16MB  ZONE_NORMAL : 16MB – 896MB  ZONE_HIGHMEM : 896MB – 4GB  Most kernel operations may only take place in ZONE_NORMAL  Organised into pages; x86 uses 4KB pages  See include/linux/mm_types.h

 Each page has a struct page associated with it  The kernel maintains one or more arrays of these that track all of the physical memory on the system  Functions and macros are defined for translating between struct page pointers and virtual addresses  struct page *virt_to_page(void *kaddr);  struct page *pfn_to_page(int pfn);  void *page_address(struct page *page);
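A minimal sketch of these helpers in use (assuming a 2.6-era x86 kernel; the function name and buffer are made up for illustration):

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/slab.h>

static void page_translation_demo(void)
{
	void *kaddr = kmalloc(PAGE_SIZE, GFP_KERNEL);	/* low-memory (logical) address */
	struct page *pg;

	if (!kaddr)
		return;

	pg = virt_to_page(kaddr);			/* logical address -> struct page */
	printk(KERN_INFO "page is mapped at %p\n", page_address(pg));

	kfree(kaddr);
}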

 0 – 16MB  Used for Direct Memory Access (DMA)  Legacy ISA devices can only access the first 16MB of memory, so the kernel tries to dedicate this area to them

 16MB – 896MB  AKA low memory  The region normally addressable by the kernel  Kernel addresses that map it are called logical addresses and have a constant offset from their physical addresses

 896MB – 4GB  The kernel can only access it by mapping it into ZONE_NORMAL  This results in a virtual address, not a logical one  kmap() first checks to see if the page is already in low memory  kmap() uses a page table called pkmap_page_table to track mapped memory; it is located at PKMAP_BASE and set up during system initialisation
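A rough sketch of kmap()/kunmap() on a freshly allocated high-memory page (illustrative only; a real driver would usually map a page it already owns):

#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/string.h>

static void highmem_map_demo(void)
{
	struct page *pg = alloc_page(GFP_HIGHUSER);	/* may land in ZONE_HIGHMEM */
	void *vaddr;

	if (!pg)
		return;

	vaddr = kmap(pg);		/* may sleep; returns a kernel virtual address */
	memset(vaddr, 0, PAGE_SIZE);
	kunmap(pg);			/* release the temporary mapping */

	__free_page(pg);
}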

 Virtual addresses are mapped to physical memory by page tables  Each process has its own page tables  Once the MMU is enabled, virtual memory applies to all programs, including the kernel  The kernel doesn’t necessarily use that much physical memory; it just has that address space available to map physical memory

 Kernel space is constantly present and maps the same physical memory in all processes – it is resident  Marked as exclusive to privileged code in the page tables, i.e. kernel only  The mapping for user-land VM changes whenever a process switch happens

 For devices that cannot access the full address range, such as 32-bit devices on 64-bit systems  The bounce buffer sits in memory low enough for the device to address  Its contents are copied to the desired page in high memory  Used as buffer pages for DMA to and from the device  Data is copied via the bounce buffer differently depending on whether it is a read or a write buffer  The buffer can be reclaimed once the IO is done

 In 2.4, the high memory manager was the only subsystem that maintained emergency pools of pages  In 2.6, memory pools are implemented as a generic concept for cases where a minimum amount of memory is needed even when memory is low  Two emergency pools are maintained for the express use of bounce buffers

 Maintains a three-level, architecture-independent page table to handle 64-bit addresses  Architectures that manage their MMU differently emulate the three-level page tables  Each process has a pointer to its own Page Global Directory (PGD), which is a physical page  Each active PGD entry points to a page containing an array of Page Middle Directory (PMD) entries  Each PMD entry points to a page of Page Table Entries (PTEs), which in turn point at pages of actual data

 Linear addresses may be broken up into parts to yield offsets within these three page table levels, plus an offset within the actual page  Macro definitions on x86 (see the sketch below)
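The slide's x86 macro listing is not reproduced in this transcript; the following is only a sketch of how those shift and size macros decompose a linear address (exact values depend on the architecture and on whether PAE is enabled):

#include <linux/kernel.h>
#include <asm/pgtable.h>

static void print_address_parts(unsigned long addr)
{
	printk(KERN_INFO "pgd index:   %lu\n", (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
	printk(KERN_INFO "pmd index:   %lu\n", (addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
	printk(KERN_INFO "pte index:   %lu\n", (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
	printk(KERN_INFO "page offset: %lu\n", addr & ~PAGE_MASK);
}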

James Sleeman

 Slab allocation  Buddy allocation  Mempools  Lookaside buffers

 The main motivation for slab allocation is that the cost of initialising and freeing kernel data objects can outweigh the cost of allocating them.  With slab allocation, memory chunks suitable to fit data objects of a certain type or size are preallocated.
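A hedged sketch of a slab cache for a made-up struct foo (the cache name and structure are illustrative; the exact kmem_cache_create() prototype varies slightly between kernel versions):

#include <linux/slab.h>
#include <linux/errno.h>

struct foo {
	int id;
	char name[32];
};

static struct kmem_cache *foo_cache;

static int foo_cache_init(void)
{
	/* Preallocate suitably sized, cache-aligned objects for struct foo */
	foo_cache = kmem_cache_create("foo_cache", sizeof(struct foo),
				      0, SLAB_HWCACHE_ALIGN, NULL);
	return foo_cache ? 0 : -ENOMEM;
}

static void foo_cache_use(void)
{
	struct foo *f = kmem_cache_alloc(foo_cache, GFP_KERNEL);

	if (f)
		kmem_cache_free(foo_cache, f);
}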

 A fast memory allocation technique that divides memory into power-of-2 partitions and attempts to allocate memory with a best-fit approach  When memory is freed by the user, the block's buddy is checked to see if it has also been freed; if so, the two blocks are combined to minimise fragmentation
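A minimal sketch of asking the buddy allocator directly for a 2^2 = 4 page contiguous block (illustrative only):

#include <linux/gfp.h>
#include <linux/mm.h>

static void buddy_demo(void)
{
	unsigned int order = 2;				/* 2^2 = 4 contiguous pages */
	struct page *pg = alloc_pages(GFP_KERNEL, order);

	if (!pg)
		return;

	/* ... use page_address(pg) ... */

	__free_pages(pg, order);			/* return the block; buddies may coalesce */
}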

 A memory pool has the type mempool_t, defined in <linux/mempool.h>
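A hedged sketch of a mempool backed by a slab cache (foo_cache and the minimum object count are illustrative):

#include <linux/mempool.h>
#include <linux/slab.h>
#include <linux/errno.h>

#define MIN_POOL_OBJS 4		/* objects kept in reserve even under memory pressure */

static mempool_t *foo_pool;

static int foo_pool_init(struct kmem_cache *foo_cache)
{
	foo_pool = mempool_create(MIN_POOL_OBJS, mempool_alloc_slab,
				  mempool_free_slab, foo_cache);
	return foo_pool ? 0 : -ENOMEM;
}

/* Objects then come from mempool_alloc(foo_pool, GFP_KERNEL)
 * and go back with mempool_free(obj, foo_pool). */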

 kmalloc() is a memory allocation function that returns physically contiguous memory from kernel space  void *kmalloc(size_t size, int flags);  buf = kmalloc(BUF_SIZE, GFP_DMA | GFP_KERNEL);  void kfree(const void *ptr);  kfree(buf);  Declared in <linux/slab.h>; the GFP flags are in <linux/gfp.h>

/* On the kernel stack: risky, because kernel stacks are small */
#define BUF_LEN 2048

void function(void)
{
	char buf[BUF_LEN];
	/* Do stuff with buf */
}

/* With kmalloc(): the buffer comes from the heap instead */
#define BUF_LEN 2048

void function(void)
{
	char *buf;

	buf = kmalloc(BUF_LEN, GFP_KERNEL);
	if (!buf)
		return;		/* error! */
	/* Do stuff with buf */
	kfree(buf);
}

 All flags are listed in include/linux/gfp.h  Type flags:  GFP_ATOMIC  GFP_NOIO  GFP_NOFS  GFP_KERNEL  GFP_USER  GFP_HIGHUSER  GFP_DMA

 unsigned long get_zeroed_page(int flags);  unsigned long __get_free_page(int flags);  unsigned long __get_free_pages(int flags, unsigned long order);  unsigned long __get_dma_pages(int flags, unsigned long order);
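A minimal sketch of two of these whole-page allocators in use (sizes and usage are illustrative):

#include <linux/gfp.h>

static void page_alloc_demo(void)
{
	unsigned long addr;

	/* One page, already zeroed */
	addr = get_zeroed_page(GFP_KERNEL);
	if (addr)
		free_page(addr);

	/* 2^3 = 8 physically contiguous pages */
	addr = __get_free_pages(GFP_KERNEL, 3);
	if (addr)
		free_pages(addr, 3);
}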

 #include <linux/percpu.h>  DEFINE_PER_CPU(type, name);  get_cpu_var(sockets_in_use)++;  put_cpu_var(sockets_in_use);  per_cpu(variable, int cpu_id);  cpu = get_cpu();  ptr = per_cpu_ptr(per_cpu_var, cpu);
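A hedged sketch of a per-CPU counter (pkt_count is a made-up name, not the sockets_in_use variable from the slide):

#include <linux/percpu.h>

static DEFINE_PER_CPU(long, pkt_count);

static void count_packet(void)
{
	get_cpu_var(pkt_count)++;	/* disables preemption and bumps this CPU's copy */
	put_cpu_var(pkt_count);		/* re-enables preemption */
}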

sudo cat /proc/slabinfo | awk '{printf "%5d MB %s\n", $3*$4/(1024*1024), $1}' | sort -n
    0 MB vm_area_struct
    1 MB dentry
    2 MB ext4_inode_cache
    2 MB inode_cache
    8 MB buffer_head

 Some of the causes of OOM:  The kernel is really out of memory; it has used more memory than the system has in RAM and swap  Kernel memory leaks  Deadlocks, sort of: writing data out to disk may itself require memory allocation  OOM killer: mm/oom_kill.c  vm_enough_memory();  out_of_memory();

 Thomas Habets had an unfortunate experience recently. His Linux system ran out of memory, and the dreaded "OOM killer" was loosed upon the system's unsuspecting processes. One of its victims turned out to be his screen locking program.

 DMA is a feature of modern processors and microcontrollers that allows other hardware subsystems to access system memory independently of the CPU.  Without DMA, a large number of CPU cycles is taken up, and with PIO the CPU can be tied up for the entire duration of the read or write.

Useful websites:  Kmalloc and more: lwn.net/images/pdf/LDD3/ch08.pdf  slab-allocator/

Pete Hemery

How does the CPU know when a device is ready?  Programmed I/O (polling): simplest method but inefficient  Interrupt-driven I/O: interrupt service routine in the device driver

Direct Memory Access: bypasses the CPU to get to system memory

 A DMA controller deals with physical addresses, so:  Programming a DMA transfer requires retrieving a physical address at some point (virtual addresses are usually used in drivers)  The memory accessed by the DMA must be physically contiguous  The CPU accesses memory through a data cache  Using the cache can be more efficient (accesses to the cache are faster than to the bus)  But the DMA engine does not access the CPU cache, so care needs to be taken with cache coherency (cache contents vs. memory contents)  Either flush or invalidate the cache lines corresponding to the buffer shared by the DMA engine and the processor at strategic times

 Need to use memory that is contiguous in physical space.  Can use any memory allocated by kmalloc() (up to 128 KB) or __get_free_pages() (up to 8MB).  Can use block I/O and networking buffers, which are designed to support DMA.  Cannot use vmalloc() memory (you would have to set up DMA on each individual physical page).
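One hedged option is a coherent (consistent) DMA buffer, which is physically contiguous and avoids the cache issues discussed next; dev and BUF_SIZE below are illustrative:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/errno.h>

#define BUF_SIZE 4096

static void *buf;			/* CPU virtual address of the buffer */
static dma_addr_t buf_bus;		/* bus address to hand to the device */

static int dma_buf_setup(struct device *dev)
{
	buf = dma_alloc_coherent(dev, BUF_SIZE, &buf_bus, GFP_KERNEL);
	return buf ? 0 : -ENOMEM;
}

static void dma_buf_teardown(struct device *dev)
{
	dma_free_coherent(dev, BUF_SIZE, buf, buf_bus);
}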

Memory caching could interfere with DMA  Before DMA to the device:  Need to make sure that all writes to the DMA buffer are committed to memory (flush the cache)  After DMA from the device:  Before the driver reads from the DMA buffer, need to make sure the corresponding cache lines are invalidated  Bidirectional DMA:  Need to flush and invalidate caches before and after the DMA transfer
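A sketch of a streaming DMA mapping, where the DMA mapping API performs the flush/invalidate described above (dev and buf would be the driver's own; the error code is illustrative):

#include <linux/dma-mapping.h>
#include <linux/errno.h>

static int send_buffer(struct device *dev, void *buf, size_t len)
{
	/* Map for device reads; the API flushes the CPU cache as needed */
	dma_addr_t bus = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, bus))
		return -EIO;

	/* ... program the device with 'bus' and wait for the transfer ... */

	dma_unmap_single(dev, bus, len, DMA_TO_DEVICE);
	return 0;
}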

 The ARM Cortex™-A8 processor is based on the ARMv7 architecture and has the ability to scale in speed from 600MHz to greater than 1GHz. The Cortex-A8 processor can meet the requirements of power-optimized mobile devices needing operation in less than 300mW, and of performance-optimized consumer applications requiring 2000 Dhrystone MIPS.  Cortex-A8 netbook

Arbitration “The process by which the parties to a dispute submit their differences to the judgment of an impartial person or group appointed by mutual consent or statutory provision.”

 Sitara™ ARM® Microprocessors  Welcome to the Sitara™ ARM® Microprocessors Section of the TI E2E Support Community. Ask questions, share knowledge, explore ideas, and help solve problems with fellow engineers. To post a question, click on the forum tab then "New Post".  This group contains forums for discussion on Cortex A8 based AM35x, AM37x and AM335x processors and ARM9 based AM1x processors. For faster response please be sure to tag your post.

I am currently working on getting WLAN up and running. It seems that the SDIO driver is broken for libertas_sdio:
libertas_sdio: probe of mmc1:0001:1 failed with error -16
A second problem is the USB Host interface. It seems to be completely broken. Hotplugging USB mouse:
[ ] drivers/hid/usbhid/hid-core.c: can't reset device, ehci-omap.0-2.3/input0, status -71
Adding a webcam:
[ ] Linux video capture interface: v2.00
[ ] gspca: main v2.9.0 registered
[ ] gspca: probing 046d:08da
[ ] twl_rtc twl_rtc: rtc core: registered twl_rtc as rtc0
[ ] lib80211: common routines for IEEE drivers
[ ] lib80211_crypt: registered algorithm 'NULL'
[ ] ads7846 spi1.0: touchscreen, irq 274
[ ] input: ADS7846 Touchscreen as /devices/platform/omap2_mcspi.1/spi1.0/input/input1
[ ] cfg80211: Calling CRDA to update world regulatory domain
[ ] libertas_sdio: Libertas SDIO driver
[ ] libertas_sdio: Copyright Pierre Ossman
[ ] zc3xx: probe 2wr ov vga 0x0000
[ ] zc3xx: probe sensor -> 0011
[ ] zc3xx: Find Sensor HV7131R(c)
[ ] input: zc3xx as /devices/platform/ehci-omap.0/usb1/1-2/1-2.3/input/input2
[ ] gspca: video0 created
[ ] gspca: found int in endpoint: 0x82, buffer_len=8, interval=10
[ ] kernel BUG at arch/arm/mm/dma-mapping.c:409!
[ ] Unable to handle kernel NULL pointer dereference at virtual address
[ ] libertas_sdio: probe of mmc1:0001:1 failed with error -16
[ ] cfg80211: World regulatory domain updated:
[ ] (start_freq - bandwidth), (max_antenna_gain, max_eirp)
[ ] ( KHz KHz), (300 mBi, 2000 mBm)
[ ] ( KHz KHz), (300 mBi, 2000 mBm)
[ ] ( KHz KHz), (300 mBi, 2000 mBm)
[ ] ( KHz KHz), (300 mBi, 2000 mBm)
[ ] ( KHz KHz), (300 mBi, 2000 mBm)
[ ] pgd = cff58000
[ ]
[ ] *pgd=8ff36031, *pte= , *ppte=
[ ] Internal error: Oops: 817 [#1] PREEMPT
[ ] last sysfs file: /sys/devices/platform/ehci-omap.0/usb1/1-2/1-2.3/bcdDevice
[ ] Modules linked in: libertas_sdio libertas cfg80211 joydev rfkill ads7846 mailbox_mach lib80211 mailbox rtc_twl gspca_zc3xx(+) rtc_core gspca_main videodev v4l1_compat
[ ] CPU: 0 Not tainted ( #1)

 This is a case where a thorough knowledge of the hardware is essential to making the software work. DMA is almost impossible to troubleshoot without using a logic analyzer.  No matter what mode the transfers will ultimately use, and no matter what the source and destination devices are, I always first write a routine to do a memory-to-memory DMA transfer. This is much easier to troubleshoot than DMA to a complex I/O port. You can use your ICE to see if the transfer happened (by looking at the destination block), and to see if exactly the right number of bytes were transferred.  At some point you'll have to recode to direct the transfer to your device. Hook up a logic analyzer to the DMA signals on the chip to be sure that the addresses and byte count are correct. Check this even if things seem to work - a slight mistake might trash part of your stack or data space.  Some high-integration CPUs with internal DMA controllers do not produce any sort of cycle that you can flag as being associated with DMA. This drives me nuts - one lousy extra pin would greatly ease debugging. The only way to track these transfers is to trigger the logic analyzer on address ranges associated with the transfer, but unfortunately these ranges may also have non-DMA activity in them.  Be aware that DMA will destroy your timing calculations. Bit-banging UARTs will not be reliable; carefully crafted timing loops will run slower than expected. In the old days we all counted T-states to figure how long a loop ran, but DMA, prefetchers, cache, and all sorts of modern exoticness make it almost impossible to calculate real execution time.

linux/drivers/mmc/host/omap_hsmmc.c

Modified Version of omap_hsmmc_start_dma_transfer

 M33x_Announcement&HQS=am335x  65.aspx