Memory Management and RMAP VM of 2.6
By A. R. Karthick
Memory Hierarchies
[Figure: memory hierarchy of L1 cache, L2 cache, RAM and hard disk, plotted against access time]
Page Tables
- Define the virtual-to-physical mapping.
- The page directory (PGD), page middle directory (PMD) and page table entry (PTE) define the course of translation.
- Example on 32-bit i386: PGD index 10 bits, PTE index 10 bits, page offset 12 bits; the PMD is folded into the PGD.
- Virtual address 0x00080c0f: pgd_index = 0, pte_index = 1 << 7 (128), page offset = 0xc0f.
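The split in the example above can be checked with a few shifts and masks. This is a minimal user-space sketch, assuming the two-level 10/10/12 layout from the slide; the macro names mirror kernel ones but the helper functions are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Two-level i386 layout from the slide: 10-bit PGD index, 10-bit PTE index,
 * 12-bit page offset. User-space illustration only, not kernel code. */
#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22
#define PTRS_PER_PTE 1024u

/* Top 10 bits select the page directory entry. */
static unsigned int pgd_index_of(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Middle 10 bits select the entry inside the page table. */
static unsigned int pte_index_of(uint32_t vaddr)
{
    return (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1u);
}

/* Low 12 bits are the byte offset inside the page. */
static unsigned int page_offset_of(uint32_t vaddr)
{
    return vaddr & ((1u << PAGE_SHIFT) - 1u);
}

/* Rebuild the address from its three fields (inverse of the split). */
static uint32_t vaddr_from(unsigned int pgd, unsigned int pte, unsigned int off)
{
    assert(pgd < 1024u && pte < PTRS_PER_PTE && off < (1u << PAGE_SHIFT));
    return ((uint32_t)pgd << PGDIR_SHIFT) | ((uint32_t)pte << PAGE_SHIFT) | off;
}
```

For 0x00080c0f these helpers give a PGD index of 0, a PTE index of 128 and an offset of 0xc0f, which matches the slide.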
Page Table Entry Status Bits (PTE Entry)
- PAGE_PRESENT
- PAGE_RW
- PAGE_USER
- PAGE_RESERVED
- PAGE_ACCESSED
- PAGE_DIRTY
- INTERNAL_STATUS
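As a rough illustration of how such status bits are tested and changed, the sketch below uses the bit positions of the x86 (non-PAE) page table entry for the present, read/write, user, accessed and dirty bits. The PTE_* names and helper functions are hypothetical stand-ins chosen for this example; the reserved and internal status bits from the list are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of the x86 (non-PAE) page table entry.
 * The names are illustrative, not the kernel's own macros. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_USER     0x004u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u
#define PTE_FLAGS    0xfffu   /* low 12 bits carry status, the rest the frame */

/* Combine a page-aligned frame address with status bits. */
static uint32_t pte_make(uint32_t frame_addr, uint32_t flags)
{
    assert((frame_addr & PTE_FLAGS) == 0u);
    return frame_addr | (flags & PTE_FLAGS);
}

static int pte_is_present(uint32_t pte)  { return (pte & PTE_PRESENT) != 0u; }
static int pte_is_writable(uint32_t pte) { return (pte & PTE_RW) != 0u; }
static int pte_is_young(uint32_t pte)    { return (pte & PTE_ACCESSED) != 0u; }
static int pte_is_dirty(uint32_t pte)    { return (pte & PTE_DIRTY) != 0u; }

/* Clear the accessed bit, as a page ageing pass would. */
static uint32_t pte_make_old(uint32_t pte)      { return pte & ~PTE_ACCESSED; }

/* Clear the write bit, as done when setting up copy-on-write. */
static uint32_t pte_write_protect(uint32_t pte) { return pte & ~PTE_RW; }

/* Physical frame address stored in the entry. */
static uint32_t pte_frame(uint32_t pte)         { return pte & ~PTE_FLAGS; }
```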
Page Fault
- Processor exception raised when there is a problem mapping a virtual address to a physical address.
- Handled by do_page_fault in arch/i386/mm/fault.c.
- Write protection faults (COW faults) map to do_wp_page.
- For pages in swap, do_swap_page is called.
- For pages not found, do_no_page is called; it faults in either an anonymous zero page or an existing page.
- Page faults populate the LRU cache.
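The choice between the three handlers named above can be sketched as a small decision function. This is a simplified user-space model, assuming the decision depends only on whether the entry is present, empty, or write-protected; the struct and enum are hypothetical, and the real kernel path contains more cases than shown here.

```c
#include <assert.h>

/* Which handler from the slide would be picked for a fault. */
enum fault_handler {
    HANDLER_DO_NO_PAGE,    /* nothing mapped yet */
    HANDLER_DO_SWAP_PAGE,  /* entry refers to a page in swap */
    HANDLER_DO_WP_PAGE,    /* write to a write-protected page (COW) */
    HANDLER_NONE           /* entry is fine; only status bits need updating */
};

/* Hypothetical, reduced view of a page table entry. */
struct pte_view {
    int present;   /* page is in memory */
    int none;      /* entry is completely empty */
    int writable;  /* write bit is set */
};

static enum fault_handler pick_fault_handler(struct pte_view pte, int write_access)
{
    /* An entry cannot be both present and empty. */
    assert(!(pte.present && pte.none));

    if (!pte.present) {
        if (pte.none)
            return HANDLER_DO_NO_PAGE;
        return HANDLER_DO_SWAP_PAGE;   /* not present but not empty: swapped out */
    }
    if (write_access && !pte.writable)
        return HANDLER_DO_WP_PAGE;
    return HANDLER_NONE;
}
```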
Page Replacement Algorithms
- Optimal Replacement: not possible in practice
- Not Recently Used (NRU): crude hack
- FIFO: inefficient
- Second Chance: better than the above
- Clock Replacement: more efficient than the above
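To make the clock algorithm from the list concrete, here is a small textbook-style simulation. It is a sketch under stated assumptions: a fixed number of frames, one referenced bit per frame, and newly loaded pages starting with the bit set. All names are hypothetical and it does not reflect how the Linux kernel organises its lists.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16

struct clock_state {
    int page[MAX_FRAMES];        /* page held by each frame */
    int referenced[MAX_FRAMES];  /* referenced bit per frame */
    size_t nframes;              /* frames available */
    size_t used;                 /* frames filled so far */
    size_t hand;                 /* position of the clock hand */
};

static void clock_init(struct clock_state *c, size_t nframes)
{
    assert(nframes > 0 && nframes <= MAX_FRAMES);
    c->nframes = nframes;
    c->used = 0;
    c->hand = 0;
}

/* Touch a page. Returns 1 on a page fault, 0 on a hit. */
static int clock_access(struct clock_state *c, int page)
{
    for (size_t i = 0; i < c->used; i++) {
        if (c->page[i] == page) {
            c->referenced[i] = 1;      /* hit: give the page another chance */
            return 0;
        }
    }
    if (c->used < c->nframes) {        /* free frame still available */
        c->page[c->used] = page;
        c->referenced[c->used] = 1;
        c->used++;
        return 1;
    }
    /* Sweep: clear referenced bits until an unreferenced frame is found. */
    while (c->referenced[c->hand]) {
        c->referenced[c->hand] = 0;
        c->hand = (c->hand + 1) % c->nframes;
    }
    c->page[c->hand] = page;           /* evict and replace the victim */
    c->referenced[c->hand] = 1;
    c->hand = (c->hand + 1) % c->nframes;
    return 1;
}

/* Count faults for a whole reference string. */
static size_t clock_count_faults(const int *refs, size_t n, size_t nframes)
{
    struct clock_state c;
    size_t faults = 0;

    clock_init(&c, nframes);
    for (size_t i = 0; i < n; i++)
        faults += (size_t)clock_access(&c, refs[i]);
    return faults;
}
```

With three frames and the reference string 1, 2, 3, 1, 4, 5, this sketch counts five faults. Unlike FIFO, it never moves pages around a queue; only the hand and the referenced bits change.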
Page Replacement Algorithms (continued)
- LRU: Least Recently Used replacement
- NFU: Not Frequently Used replacement
- Page ageing based replacement
- Working Set algorithm, based on locality of reference per process
- Working Set based clock algorithms
- LRU with ageing and the Working Set algorithms are efficient and commonly used
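Page ageing can be illustrated with the textbook scheme of one 8-bit counter per page: on every tick the counter is shifted right and the referenced bit is moved into the top bit, so recently used pages keep large values. The sketch below assumes that scheme; the struct and function names are hypothetical and it is not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct aged_page {
    uint8_t age;     /* larger means more recently or frequently used */
    int referenced;  /* set when the page was touched since the last tick */
};

/* One ageing tick: shift history right, record this interval in the top bit. */
static void age_tick(struct aged_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t top = pages[i].referenced ? 0x80u : 0u;
        pages[i].age = (uint8_t)((pages[i].age >> 1) | top);
        pages[i].referenced = 0;
    }
}

/* The page with the smallest age is the replacement candidate.
 * Ties go to the lowest index. */
static size_t age_pick_victim(const struct aged_page *pages, size_t n)
{
    size_t victim = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++) {
        if (pages[i].age < pages[victim].age)
            victim = i;
    }
    return victim;
}
```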
Page replacement handling in Linux Kernel
Page Cache
- Pages are added to the page cache for fast lookup.
- Page cache pages are indexed by their address space and page index.
- Inode or disk block pages, shared pages and anonymous pages form the page cache.
- Swap-cache pages are also part of the page cache and represent the swapped pages.
- Anonymous pages enter the swap cache at swap-out time; shared pages enter when they become dirty.
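The idea of finding a cached page from its address space and page index can be sketched with a toy lookup table keyed on that pair. A small chained hash table is used here purely for brevity; the kernel's own data structure is more elaborate, and every name below is a hypothetical stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BUCKETS 64u

struct cached_page {
    const void *mapping;       /* stand-in for the owning address space */
    unsigned long index;       /* page offset within that address space */
    struct cached_page *next;  /* next page in the same bucket */
};

struct page_cache_sketch {
    struct cached_page *bucket[CACHE_BUCKETS];
};

/* Mix the owner pointer and the page index into a bucket number. */
static unsigned int cache_hash(const void *mapping, unsigned long index)
{
    uintptr_t h = (uintptr_t)mapping / sizeof(void *) + index;
    return (unsigned int)(h % CACHE_BUCKETS);
}

/* Insert a page at the head of its bucket. */
static void cache_add(struct page_cache_sketch *pc, struct cached_page *pg)
{
    unsigned int b;

    assert(pc != NULL && pg != NULL);
    b = cache_hash(pg->mapping, pg->index);
    pg->next = pc->bucket[b];
    pc->bucket[b] = pg;
}

/* Return the cached page for (mapping, index), or NULL on a cache miss. */
static struct cached_page *cache_find(const struct page_cache_sketch *pc,
                                      const void *mapping, unsigned long index)
{
    struct cached_page *p = pc->bucket[cache_hash(mapping, index)];

    for (; p != NULL; p = p->next) {
        if (p->mapping == mapping && p->index == index)
            return p;
    }
    return NULL;
}
```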
LRU Cache
- The LRU cache is made up of per-zone active lists and inactive lists.
- Per-CPU LRU active and inactive page vectors make LRU cache additions faster.
- These lists are populated during page faults and when page cache pages are accessed or referenced.
- kswapd is the per-node page-out kernel thread; it balances the LRU cache and trickles out pages based on an approximation to the LRU algorithm.
- Page stealing is performed a page vector at a time, that is, in batches.
- Page states: active, inactive dirty, inactive clean, per-CPU cold pages.
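The benefit of page vectors is that many pages are moved onto a list for the price of one lock acquisition. The sketch below models that batching in user space. The batch size of 16, the counter standing in for a lock, and all names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define PAGEVEC_SKETCH_SIZE 16   /* assumed batch size */
#define LRU_CAPACITY        256

/* Stand-in for a zone LRU list protected by a lock. */
struct lru_sketch {
    int pages[LRU_CAPACITY];
    size_t count;
    unsigned long lock_acquisitions;   /* how often the "lock" was taken */
};

/* Small per-CPU style buffer of pages waiting to be added to the LRU. */
struct pagevec_sketch {
    int pages[PAGEVEC_SKETCH_SIZE];
    size_t nr;
};

/* Move every buffered page to the LRU under a single lock acquisition. */
static void pagevec_drain(struct pagevec_sketch *pv, struct lru_sketch *lru)
{
    if (pv->nr == 0)
        return;
    lru->lock_acquisitions++;          /* one lock round trip per batch */
    for (size_t i = 0; i < pv->nr; i++) {
        assert(lru->count < LRU_CAPACITY);
        lru->pages[lru->count++] = pv->pages[i];
    }
    pv->nr = 0;
}

/* Buffer a page; drain automatically once the vector is full. */
static void pagevec_add(struct pagevec_sketch *pv, struct lru_sketch *lru, int page)
{
    pv->pages[pv->nr++] = page;
    if (pv->nr == PAGEVEC_SKETCH_SIZE)
        pagevec_drain(pv, lru);
}
```

Adding 40 pages this way takes the lock twice for the first 32 pages and once more when the remaining 8 are drained, instead of 40 times.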
Zone Balancing
- kswapd performs zone balancing based on pages_high, pages_low and pages_min.
- A zone is considered balanced when its free pages are above pages_high.
- The page-out process takes pages by scanning inactive pages in batches.
- Batch page stealing scales well for large physical memory.
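The three watermarks can be read as thresholds on a zone's free page count. The sketch below classifies a zone against them. Only the "balanced above pages_high" rule comes from the slide; treating pages_low as the point where kswapd is woken and pages_min as the point where the allocating task reclaims directly is an assumption of this example, as are the exact comparisons and all names.

```c
#include <assert.h>

enum zone_pressure {
    ZONE_IS_BALANCED,       /* free pages above pages_high */
    ZONE_ABOVE_LOW,         /* between pages_low and pages_high */
    ZONE_WAKE_KSWAPD,       /* below pages_low: background reclaim (assumed) */
    ZONE_DIRECT_RECLAIM     /* below pages_min: caller reclaims (assumed) */
};

struct zone_sketch {
    unsigned long free_pages;
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

static enum zone_pressure zone_classify(const struct zone_sketch *z)
{
    assert(z->pages_min <= z->pages_low && z->pages_low <= z->pages_high);

    if (z->free_pages > z->pages_high)
        return ZONE_IS_BALANCED;
    if (z->free_pages >= z->pages_low)
        return ZONE_ABOVE_LOW;
    if (z->free_pages >= z->pages_min)
        return ZONE_WAKE_KSWAPD;
    return ZONE_DIRECT_RECLAIM;
}

/* Pages that must be freed before the zone counts as balanced again. */
static unsigned long zone_pages_to_reclaim(const struct zone_sketch *z)
{
    if (z->free_pages > z->pages_high)
        return 0;
    return z->pages_high - z->free_pages + 1;
}
```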
RMAP
- Maintains a mapping from a page to the ptes (virtual addresses) that map it.
- Greatly speeds up the page unmap path, since the process virtual address space no longer has to be scanned.
- Unmapping of shared pages is greatly improved because the pte mappings for shared pages are available.
- Page faults are reduced because pte entries are unmapped only when required.
- Reduced search space during page replacement, as only inactive pages are touched.
- Low overhead for adding reverse mappings in the fork, page fault, mmap and exit paths.
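The core idea, finding every pte of a page directly from the page, can be sketched as follows. This is a deliberately simplified model with a fixed-size array of pte pointers per page; the names and the capacity limit are hypothetical and it does not reproduce the kernel's chained layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPPERS 8   /* assumed limit, for the sketch only */

/* A physical page that remembers which ptes point at it. */
struct rmap_page {
    uint32_t *pte[MAX_MAPPERS];
    size_t nr;
};

/* Record a new reverse mapping. Returns 0 on success, -1 if full. */
static int rmap_add(struct rmap_page *pg, uint32_t *pte)
{
    assert(pte != NULL);
    if (pg->nr == MAX_MAPPERS)
        return -1;
    pg->pte[pg->nr++] = pte;
    return 0;
}

/* Clear every pte that maps the page, without walking any address space.
 * Returns the number of ptes that were unmapped. */
static size_t rmap_unmap_all(struct rmap_page *pg)
{
    size_t unmapped = pg->nr;

    for (size_t i = 0; i < pg->nr; i++)
        *pg->pte[i] = 0;
    pg->nr = 0;
    return unmapped;
}
```

The cost of unmapping is proportional to the number of mappers of the page, not to the size of the processes that map it.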
RMAP
    struct pte_chain {
        unsigned long next_and_idx;
        pte_addr_t ptes[NRPTE];
    } ____cacheline_aligned;
- The next_and_idx field holds both the index of the next pte slot within this chain and a pointer to the next pte chain, which makes pte chaining fast.
- A pte chain keeps its free slots at the top (head) of the chain, and additions fill in from the tail.
- The process mm_struct pointer is kept in the page's address space field and is used at swap-out time.
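Packing a pointer and a small index into one word works because a cache-line-aligned structure always has zero low address bits. The sketch below shows one way this could be done, assuming a 64-byte cache line; the type names, the mask, and the helper functions are hypothetical, and the kernel's exact encoding may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u   /* assumed cache line size */

typedef uintptr_t pte_addr_sketch_t;

/* Fill the rest of the cache line with pte slots. */
#define NRPTE_SKETCH ((CACHE_LINE - sizeof(uintptr_t)) / sizeof(pte_addr_sketch_t))

struct pte_chain_sketch {
    _Alignas(64) uintptr_t next_and_idx;   /* pointer bits | slot index */
    pte_addr_sketch_t ptes[NRPTE_SKETCH];
};

/* Low bits that are always zero in an aligned chain address. */
#define IDX_MASK ((uintptr_t)CACHE_LINE - 1u)

/* Combine the next-chain pointer and a slot index into one word. */
static uintptr_t pack_next_and_idx(const struct pte_chain_sketch *next, unsigned int idx)
{
    assert(((uintptr_t)next & IDX_MASK) == 0u);   /* pointer must be aligned */
    assert(idx <= NRPTE_SKETCH);                  /* index must fit the array */
    return (uintptr_t)next | (uintptr_t)idx;
}

/* Recover the pointer by masking off the index bits. */
static struct pte_chain_sketch *chain_next(uintptr_t next_and_idx)
{
    return (struct pte_chain_sketch *)(next_and_idx & ~IDX_MASK);
}

/* Recover the index from the low bits. */
static unsigned int chain_idx(uintptr_t next_and_idx)
{
    return (unsigned int)(next_and_idx & IDX_MASK);
}
```

One word then serves both purposes, so the whole structure still fits in a single cache line.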
VM Overcommit Policies
- Overcommit means committing more address space to processes than the memory actually available, swap space included.
- The policy is set through the sysctls vm.overcommit_memory and vm.overcommit_ratio.
- 0: heuristic overcommit; obvious overcommits are refused.
- 1: always overcommit.
- 2: strict accounting; the commit limit is the total swap pages plus overcommit_ratio percent of the total RAM pages.
- mmap, mprotect, munmap, brk and shared memory all affect the committed total.
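For mode 2, the limit is simple arithmetic on page counts. The sketch below computes it and checks a request against it. The function names are hypothetical, the heuristic of mode 0 is not modelled, and the comparison is a simplification of whatever accounting the kernel performs.

```c
#include <assert.h>
#include <limits.h>

/* Commit limit in pages for strict accounting (mode 2):
 * all of swap plus overcommit_ratio percent of RAM. */
static unsigned long commit_limit_pages(unsigned long ram_pages,
                                        unsigned long swap_pages,
                                        unsigned int overcommit_ratio)
{
    unsigned long long ram_share =
        (unsigned long long)ram_pages * overcommit_ratio / 100u;

    return (unsigned long)ram_share + swap_pages;
}

/* Would a new request of request_pages still fit under the limit? */
static int strict_commit_ok(unsigned long committed_pages,
                            unsigned long request_pages,
                            unsigned long limit_pages)
{
    assert(request_pages <= ULONG_MAX - committed_pages);   /* no wrap-around */
    return committed_pages + request_pages <= limit_pages;
}
```

With 1000 RAM pages, 500 swap pages and a ratio of 50, the limit is 500 + 500 = 1000 pages.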
References
- Primarily the Linux kernel source code, 2.6
- Rik van Riel, "Towards an O(1) VM", Proceedings of the Linux Symposium, Ottawa