
A Locality-Improving Dynamic Memory Allocator
Yi Feng & Emery Berger
University of Massachusetts, Amherst, Department of Computer Science

Slide 2: motivation
- Memory performance: a bottleneck for many applications
- Heap data often dominates
- Dynamic allocators dictate the spatial locality of heap objects

Slide 3: related work
Previous work on dynamic allocation:
- Reducing fragmentation [survey: Wilson et al., Wilson & Johnstone]
- Improving locality:
  - Search inside the allocator [Grunwald et al.]
  - Programmer-assisted [Chilimbi et al., Truong et al.]
  - Profile-based [Barrett & Zorn, Seidl & Zorn]

Slide 4: this work
A replacement allocator called Vam:
- Reduces fragmentation
- Improves allocator & application locality, at the cache and page level
- Automatic and transparent

Slide 5: outline
- Introduction
- Designing Vam
- Experimental Evaluation
  - Space Efficiency
  - Run Time
  - Cache Performance
  - Virtual Memory Performance

Slide 6: Vam design
Builds on previous allocator designs:
- DLmalloc (Doug Lea; default allocator in Linux/GNU libc)
- PHKmalloc (Poul-Henning Kamp; default allocator in FreeBSD)
- Reap [Berger et al. 2002]
Combines their best features.

Slide 7: DLmalloc
Goal: reduce fragmentation
Design:
- Best-fit
- Small objects: fine-grained, cached
- Large objects: coarse-grained, coalesced, sorted by size, searched
- Object headers ease deallocation and coalescing
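As a sketch of the last point (simplified, not DLmalloc's actual layout): a DLmalloc-style allocator keeps a small header, a "boundary tag," before each block. The size field lets free() find the block's extent, and a flag bit records whether the neighboring block is in use, which makes coalescing cheap.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified boundary-tag header: block size with a flag packed
 * into the low bit (sizes are multiples of 8, so low bits are free). */
typedef struct {
    size_t size_and_flags;   /* size in bytes; low bit = previous block in use */
} chunk_header_t;

#define PREV_INUSE ((size_t)0x1)

static inline size_t chunk_size(const chunk_header_t *h) {
    return h->size_and_flags & ~PREV_INUSE;
}

static inline int prev_inuse(const chunk_header_t *h) {
    return (int)(h->size_and_flags & PREV_INUSE);
}
```

On free(), the allocator reads the header to recover the block size, and checks the flag to decide whether it can merge with the preceding block without scanning the heap.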

Slide 8: PHKmalloc
Goal: improve page-level locality
Design:
- Page-oriented design
- Coarse size classes: 2^x or n * page size
- Page divided into equal-size chunks, with a bitmap tracking allocation
- Objects share headers at the page start (BIBOP: "big bag of pages")
- Discards free pages via madvise
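A minimal sketch of the bitmap scheme (illustrative only, with hypothetical names; PHKmalloc's real page layout differs): every chunk on a page has the same size, and a bitmap at the page start records which chunks are in use, so no object carries its own header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* One page holding equal-size chunks; metadata lives at the page start. */
typedef struct {
    uint16_t chunk_size;             /* size of every chunk on this page */
    uint32_t bitmap;                 /* 1 bit per chunk (up to 32 chunks) */
    char     data[PAGE_SIZE - 8];    /* chunk storage */
} page_t;

/* Find a free chunk, mark it allocated, return its byte offset (-1 if full). */
int page_alloc(page_t *p) {
    int nchunks = (int)(sizeof p->data / p->chunk_size);
    for (int i = 0; i < nchunks && i < 32; i++) {
        if (!(p->bitmap & (1u << i))) {
            p->bitmap |= 1u << i;
            return i * p->chunk_size;
        }
    }
    return -1;
}

/* Freeing is a bit clear: the offset alone identifies the chunk. */
void page_free(page_t *p, int offset) {
    p->bitmap &= ~(1u << (offset / p->chunk_size));
}
```

Because the page address determines the metadata location, free() needs only pointer arithmetic, no per-object header lookup.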

Slide 9: Reap
Goal: capture the speed and locality advantages of region allocation while providing individual frees
Design:
- Pointer-bumping allocation
- Reclaims freed objects on the associated heap
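Pointer-bumping allocation, the region-style technique Reap builds on, can be sketched in a few lines (this sketch omits Reap's individual frees; names are illustrative): allocation is just advancing a pointer through a buffer, which keeps consecutively allocated objects adjacent in memory.

```c
#include <assert.h>
#include <stddef.h>

/* A region: allocation bumps `next` toward `end`. */
typedef struct {
    char *next;   /* bump pointer */
    char *end;    /* one past the region's buffer */
} region_t;

void *region_alloc(region_t *r, size_t size) {
    size = (size + 7) & ~(size_t)7;   /* keep 8-byte alignment */
    if (r->next + size > r->end)
        return NULL;                   /* region exhausted */
    void *p = r->next;
    r->next += size;
    return p;
}
```

The fast path is two arithmetic operations and a compare, which is why region allocation is both fast and spatially local.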

Slide 10: Vam overview
Goal: improve application performance across a wide range of available RAM
Highlights:
- Page-based design
- Fine-grained size classes
- No headers for small objects
- Implemented in Heap Layers using C++ templates [Berger et al. 2001]

Slide 11: page-based heap
- Virtual space divided into pages
- Page-level management:
  - maps pages from the kernel
  - records page status
  - discards freed pages

Slide 12: page-based heap
[Diagram: heap space with a page descriptor table; free pages are discarded]

Slide 13: fine-grained size classes
- Small (8-128 bytes) and medium (136-496 bytes) sizes:
  - 8 bytes apart, exact-fit
  - dedicated per-size page blocks (groups of pages): 1 page for small sizes, 4 pages for medium sizes
  - each block is either available or full
  - Reap-like allocation inside a block
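With classes spaced 8 bytes apart, mapping a request to its exact-fit class is a single round-up and divide. A sketch (function names are illustrative, not Vam's actual code):

```c
#include <assert.h>
#include <stddef.h>

/* Class i holds objects of exactly 8*(i+1) bytes:
 * class 0 -> 8 bytes, class 1 -> 16 bytes, ... */
static inline int size_class(size_t size) {
    return (int)((size + 7) / 8) - 1;   /* round up to a multiple of 8 */
}

static inline size_t class_size(int cls) {
    return 8 * (size_t)(cls + 1);
}
```

Exact fit at 8-byte granularity is what lets Vam avoid the internal fragmentation of PHKmalloc's coarse power-of-two classes.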

Slide 14: fine-grained size classes
- Large sizes (504 bytes - 32KB):
  - also 8 bytes apart, best-fit
  - collocated in contiguous pages
  - aggressive coalescing
- Extremely large sizes (above 32KB) use mmap/munmap
[Diagram: free list table over contiguous pages; freed blocks are coalesced and empty pages discarded]

Slide 15: header elimination
- Object headers simplify deallocation & coalescing, but they cost:
  - space overhead
  - cache pollution
- Eliminated in Vam for small objects: per-page metadata replaces per-object headers

Slide 16: header elimination
- Need to distinguish "headered" from "headerless" objects in free()
- Solution: heap address space partitioning
  - a partition table maps each 16MB area of the address space (holding homogeneous objects) to its kind
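The partition-table lookup can be sketched as follows (a sketch with illustrative names, assuming a 32-bit address space for simplicity): since each 16MB area holds only one kind of object, free() classifies any pointer with a shift and a table read.

```c
#include <assert.h>
#include <stdint.h>

#define AREA_SHIFT 24                          /* 2^24 bytes = 16MB areas */
#define NUM_AREAS  (1u << (32 - AREA_SHIFT))   /* 256 areas in a 32-bit space */

/* One entry per 16MB area: 0 = headered objects, 1 = headerless. */
static uint8_t partition_table[NUM_AREAS];

static inline int is_headerless(uintptr_t addr) {
    return partition_table[(addr >> AREA_SHIFT) & (NUM_AREAS - 1)];
}
```

Because the table is tiny and read-mostly, the classification adds almost nothing to the cost of free().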

Slide 17: outline
- Introduction
- Designing Vam
- Experimental Evaluation
  - Space efficiency
  - Run time
  - Cache performance
  - Virtual memory performance

Slide 18: experimental setup
- Dell Optiplex 270
  - Intel Pentium 4, 3.0GHz
  - 8KB L1 (data) cache, 512KB L2 cache, 64-byte cache lines
  - 1GB RAM
  - 40GB 5400RPM hard disk
- Linux
- Uses the perfctr patch and perfex tool to read Intel performance counters (instructions, caches, TLB)

Slide 19: benchmarks
Memory-intensive SPEC CPU2000 benchmarks (custom allocators removed in gcc and parser):

                            176.gcc     197.parser   253.perlbmk  255.vortex
  Execution Time            24 sec      275 sec      43 sec       62 sec
  Instructions              40 billion  424 billion  114 billion  102 billion
  VM Size                   130MB       15MB         120MB        65MB
  Max Live Size             110MB       10MB         90MB         45MB
  Total Allocations         9M          788M         5.4M         1.5M
  Average Object Size       52 bytes    21 bytes     285 bytes    471 bytes
  Alloc Rate (#/sec)        373K        2813K        129K         30K
  Alloc Interval (# of inst) 4.4K       0.5K         21K          68K

Slide 20: space efficiency
Fragmentation = max (physical) memory in use / max live data of the application
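As a worked example of the metric (using figures from the slide 19 table purely for illustration, not a measured fragmentation result): if gcc's maximum memory in use were its 130MB VM size against its 110MB maximum live size, the ratio would be about 1.18.

```c
#include <assert.h>

/* Fragmentation as defined on this slide:
 * max (physical) memory in use / max live data. */
double fragmentation(double max_mem_in_use, double max_live) {
    return max_mem_in_use / max_live;
}
```

A ratio of 1.0 would mean the allocator never holds more physical memory than the application's live data; the excess above 1.0 is the allocator's overhead.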

Slide 21: total execution time

Slide 22: total instructions

Slide 23: cache performance
L2 cache misses are closely correlated with run-time performance.

Slide 24: VM performance
- Application performance degrades with reduced RAM
- Better page-level locality produces better paging performance and smoother degradation

Slide 25: [graph only]

Slide 26: Vam summary
- Outperforms other allocators, both with enough RAM and under memory pressure
- Improves application locality at the cache level and the page level (VM)
- See the paper for more analysis

Slide 27: the end
- Heap Layers is publicly available
- Vam to be included soon

Slide 28: backup slides

Slide 29: TLB performance

Slide 30: average fragmentation
Fragmentation = average of memory in use / live data of the application