University of Massachusetts, Department of Computer Science. Reconsidering Custom Memory Allocation. Emery Berger, Ben Zorn, Kathryn McKinley.

Presentation transcript:

Slide 1: Reconsidering Custom Memory Allocation
  Emery Berger, Ben Zorn, Kathryn McKinley

Slide 2: Custom Memory Allocation
  Very common practice
    Apache, gcc, lcc, STL, database servers…
    Language-level support in C++
  Widely recommended: “Use custom allocators”
  Programmers replace new / delete, bypassing the system allocator
    Reduce runtime: often
    Expand functionality: sometimes
    Reduce space: rarely

Slide 3: Drawbacks of Custom Allocators
  Avoiding the system allocator:
    More code to maintain & debug
    Can’t use memory debuggers
    Not modular or robust: mixing memory from custom and general-purpose allocators → crash!
  ⇒ Increased burden on programmers
  Are custom allocators really a win?

Slide 4: Overview
  Introduction
  Perceived benefits and drawbacks
  Three main kinds of custom allocators
  Comparison with general-purpose allocators
  Advantages and drawbacks of regions
  Reaps: generalization of regions & heaps

Slide 5: (I) Per-Class Allocators
  Recycle freed objects from a free list
  Example (figure shows objects a, b, c):
    a = new Class1; b = new Class1; c = new Class1;
    delete a; delete b; delete c;
    a = new Class1; b = new Class1; c = new Class1;
  + Fast
    + Linked list operations
  + Simple
    + Identical semantics
    + C++ language support
  - Possibly space-inefficient
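The free-list recycling on this slide can be sketched with C++ class-specific operator new / delete. This is a minimal illustration, not code from the talk: only the name Class1 comes from the slide; the member `value`, the `FreeNode` helper, and the single-threaded design are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Class1 is the name used on the slide; everything inside it is assumed.
class Class1 {
public:
    int value = 0;

    // Pop a recycled block from the free list if one exists,
    // otherwise fall back to the global (system) allocator.
    static void* operator new(std::size_t size) {
        assert(size == sizeof(Class1));  // sketch does not handle derived classes
        if (free_list != nullptr) {
            FreeNode* node = free_list;
            free_list = node->next;
            return node;
        }
        // Every block must be large enough to hold a free-list link.
        return ::operator new(size < sizeof(FreeNode) ? sizeof(FreeNode) : size);
    }

    // Push the block onto the free list instead of returning it to the system.
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        FreeNode* node = static_cast<FreeNode*>(p);
        node->next = free_list;
        free_list = node;
    }

private:
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list;  // not thread-safe in this sketch
};

Class1::FreeNode* Class1::free_list = nullptr;
```

Memory on the free list is never handed back to the system allocator, which is one way the "possibly space-inefficient" drawback can arise.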

Slide 6: (II) Custom Patterns
  Tailor-made to fit allocation patterns
  Example: 197.parser (natural language parser)
    char[MEMORY_LIMIT]
    a = xalloc(8); b = xalloc(16); c = xalloc(8);
    xfree(b); xfree(c);
    d = xalloc(8);
    (figure: blocks a, b, c, d laid out in the array, with an end_of_array marker)
  + Fast
    + Pointer-bumping allocation
  - Brittle
    - Fixed memory size
    - Requires stack-like lifetimes
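One way to read the xalloc / xfree example is as a stack-like bump allocator over a fixed array. The sketch below is a guess at the pattern, not the actual 197.parser source: the names xalloc, xfree, MEMORY_LIMIT and end_of_array appear on the slide, while `BlockHeader`, the limit of 1024 bytes, and the "mark freed, pop when topmost" rule are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

constexpr std::size_t MEMORY_LIMIT = 1024;  // illustrative value only

// Assumed per-block header; the slide does not show the real layout.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t size;   // payload size, rounded up for alignment
    bool freed;         // set by xfree
    BlockHeader* prev;  // block allocated just before this one
};

alignas(std::max_align_t) static char memory[MEMORY_LIMIT];
static std::size_t end_of_array = 0;       // offset of the first unused byte
static BlockHeader* last_block = nullptr;  // topmost block on the stack

void* xalloc(std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    if (size > MEMORY_LIMIT) return nullptr;
    size = (size + align - 1) / align * align;
    // Fixed memory size: fail rather than grow.
    if (end_of_array + sizeof(BlockHeader) + size > MEMORY_LIMIT) return nullptr;
    BlockHeader* h = new (memory + end_of_array) BlockHeader{size, false, last_block};
    last_block = h;
    end_of_array += sizeof(BlockHeader) + size;  // pointer bump
    return h + 1;
}

void xfree(void* p) {
    if (p == nullptr) return;
    BlockHeader* h = static_cast<BlockHeader*>(p) - 1;
    h->freed = true;
    // Space is reclaimed only when the topmost block is free,
    // which is why lifetimes must be stack-like.
    while (last_block != nullptr && last_block->freed) {
        end_of_array = static_cast<std::size_t>(reinterpret_cast<char*>(last_block) - memory);
        last_block = last_block->prev;
    }
}
```

Under these assumptions, the slide's sequence frees b (not yet reclaimable, since c sits above it), then c, which pops both blocks, so d lands where b used to be.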

Slide 7: (III) Regions
  Separate areas, deletion only en masse
    regioncreate(r)
    regionmalloc(r, sz)
    regiondelete(r)
  + Fast
    + Pointer-bumping allocation
    + Deletion of chunks
  + Convenient
    + One call frees all memory
  - Risky
    - Dangling references
    - Too much space
  Increasingly popular custom allocator
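A region with the three calls named on this slide might look roughly like the following. It is a sketch under assumptions: the function names come from the slide, but the `Region` and `Chunk` types, the 4096-byte chunk size, and the use of malloc for chunks are illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr std::size_t kChunkSize = 4096;  // assumed chunk size

// Assumed chunk header; the payload follows it directly in memory.
struct alignas(std::max_align_t) Chunk {
    Chunk* next;
    std::size_t used;
    std::size_t capacity;
};

struct Region {
    Chunk* head;  // most recently allocated chunk
};

void regioncreate(Region& r) { r.head = nullptr; }

void* regionmalloc(Region& r, std::size_t sz) {
    assert(sz > 0);  // zero-sized requests are out of scope for this sketch
    const std::size_t align = alignof(std::max_align_t);
    sz = (sz + align - 1) / align * align;
    if (r.head == nullptr || r.head->capacity - r.head->used < sz) {
        // Current chunk is full: get a new one from the system allocator.
        std::size_t cap = sz > kChunkSize ? sz : kChunkSize;
        void* raw = std::malloc(sizeof(Chunk) + cap);
        if (raw == nullptr) return nullptr;
        r.head = new (raw) Chunk{r.head, 0, cap};
    }
    char* base = reinterpret_cast<char*>(r.head + 1);
    void* p = base + r.head->used;
    r.head->used += sz;  // pointer bump; no per-object bookkeeping
    return p;
}

// One call frees every object; there is no per-object free.
void regiondelete(Region& r) {
    while (r.head != nullptr) {
        Chunk* next = r.head->next;
        std::free(r.head);
        r.head = next;
    }
}
```

Any pointer obtained from regionmalloc dangles after regiondelete, and nothing is reclaimed before that call, which matches the two risks listed on the slide.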

Slide 8: Overview
  Introduction
  Perceived benefits and drawbacks
  Three main kinds of custom allocators
  Comparison with general-purpose allocators
  Advantages and drawbacks of regions
  Reaps: generalization of regions & heaps

Slide 9: Custom Allocators Are Faster…
  As good as and sometimes much faster than Win32

Slide 10: Not So Fast…
  DLmalloc: as fast or faster for most benchmarks

Slide 11: The Lea Allocator (DLmalloc 2.7.0)
  Mature public-domain general-purpose allocator
  Optimized for common allocation patterns
    Per-size quicklists ≈ per-class allocation
    Deferred coalescing (combining adjacent free objects)
  ⇒ Highly optimized fast path
  Space-efficient

Slide 12: Space Consumption: Mixed Results

Slide 13: Overview
  Introduction
  Perceived benefits and drawbacks
  Three main kinds of custom allocators
  Comparison with general-purpose allocators
  Advantages and drawbacks of regions
  Reaps: generalization of regions & heaps

Slide 14: Regions: Pros and Cons
  + Fast, convenient, etc.
  + Avoid resource leaks (e.g., Apache)
    Tear down memory for terminated connections
  - No individual object deletion
    ⇒ Unbounded memory consumption (producer-consumer, long-running computations, off-the-shelf programs)
    ⇒ Apache: vulnerable to DoS, memory leaks

Slide 15: Reap Hybrid Allocator
  Reap = region + heap
  Adds individual object deletion & heap
    reapcreate(r)
    reapmalloc(r, sz)
    reapfree(r, p)
    reapdelete(r)
  + Can reduce memory consumption
  + Fast
    + Adapts to use (region or heap style)
    + Cheap deletion
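To make the "region + heap" idea concrete, here is a deliberately simplified sketch of the four calls on this slide. The slides do not describe Reap's internals, so the design is an assumption: objects are bump-allocated as in a region, each with a size header (`ObjHeader`, hypothetical), and reapfree puts an object on a first-fit free list that reapmalloc consults before bumping. The real Reap in Heap Layers is likely more sophisticated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr std::size_t kReapChunkSize = 4096;  // assumed chunk size

// Assumed per-object header so that freed objects can be reused.
struct alignas(std::max_align_t) ObjHeader {
    std::size_t size;      // payload size in bytes
    ObjHeader* next_free;  // link used only while on the free list
};

struct alignas(std::max_align_t) ReapChunk {
    ReapChunk* next;
    std::size_t used;
    std::size_t capacity;
};

struct Reap {
    ReapChunk* chunks;     // region part: bump-allocated chunks
    ObjHeader* free_list;  // heap part: individually freed objects
};

void reapcreate(Reap& r) { r.chunks = nullptr; r.free_list = nullptr; }

void* reapmalloc(Reap& r, std::size_t sz) {
    assert(sz > 0);  // zero-sized requests are out of scope for this sketch
    const std::size_t align = alignof(std::max_align_t);
    sz = (sz + align - 1) / align * align;
    // Heap style: reuse the first freed object that is large enough.
    for (ObjHeader** link = &r.free_list; *link != nullptr; link = &(*link)->next_free) {
        if ((*link)->size >= sz) {
            ObjHeader* h = *link;
            *link = h->next_free;
            return h + 1;
        }
    }
    // Region style: bump-allocate from the current chunk.
    const std::size_t need = sizeof(ObjHeader) + sz;
    if (r.chunks == nullptr || r.chunks->capacity - r.chunks->used < need) {
        std::size_t cap = need > kReapChunkSize ? need : kReapChunkSize;
        void* raw = std::malloc(sizeof(ReapChunk) + cap);
        if (raw == nullptr) return nullptr;
        r.chunks = new (raw) ReapChunk{r.chunks, 0, cap};
    }
    char* base = reinterpret_cast<char*>(r.chunks + 1);
    ObjHeader* h = new (base + r.chunks->used) ObjHeader{sz, nullptr};
    r.chunks->used += need;
    return h + 1;
}

// Individual deletion: a constant-time push onto the free list.
void reapfree(Reap& r, void* p) {
    if (p == nullptr) return;
    ObjHeader* h = static_cast<ObjHeader*>(p) - 1;
    h->next_free = r.free_list;
    r.free_list = h;
}

// Region-style teardown: everything goes at once.
void reapdelete(Reap& r) {
    while (r.chunks != nullptr) {
        ReapChunk* next = r.chunks->next;
        std::free(r.chunks);
        r.chunks = next;
    }
    r.free_list = nullptr;
}
```

A program that never calls reapfree pays only for the extra header and an empty free-list check, so it behaves like a region; a program that frees objects gets their space back, which is the mechanism by which memory consumption can drop.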

Slide 16: Reap Runtime

Slide 17: Reap Space

Slide 18: Reap: Best of Both Worlds
  Allows mixing of regions and new / delete
  Case study: new Apache module “mod_bc”
    bc: C-based arbitrary-precision calculator
    Changed 20 lines out of 8000
    Benchmark: compute 1000th prime
      With Reap: 240K
      Without Reap: 7.4MB

Slide 19: Conclusions and Future Work
  Empirical study of custom allocators
    Lea allocator often as fast or faster
    Non-region custom allocation ineffective
  Reap: region performance without drawbacks
  Future work:
    Reduce space with per-page bitmaps
    Combine with scalable general-purpose allocator (e.g., Hoard)

Slide 20: Software
  (Reap: part of Heap Layers distribution)
  (DLmalloc 2.7.0)

Slide 21: If You Can Read This, I Went Too Far

Slide 22: Backup Slides

Slide 23: Experimental Methodology
  Comparing to general-purpose allocators
    Same semantics: no problem
      E.g., disable per-class allocators
    Different semantics: use emulator
      Uses general-purpose allocator
      Adds bookkeeping to support region semantics
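The emulator idea can be sketched as a thin layer that forwards each request to the general-purpose allocator and records the returned pointer, so that a region-style delete can free everything later. This is an assumed shape, not the emulator used in the study; the names `EmulatedRegion`, `emu_regionmalloc` and `emu_regiondelete` are hypothetical, and a real emulator would probably keep its bookkeeping inside the allocated blocks rather than in a separate vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Bookkeeping that gives malloc/free region semantics (assumed design).
struct EmulatedRegion {
    std::vector<void*> objects;  // every live allocation made through this region
};

void* emu_regionmalloc(EmulatedRegion& r, std::size_t sz) {
    assert(sz > 0);  // zero-sized requests are out of scope for this sketch
    void* p = std::malloc(sz);  // the general-purpose allocator does the real work
    if (p != nullptr) r.objects.push_back(p);
    return p;
}

// Region delete becomes a loop of individual frees.
void emu_regiondelete(EmulatedRegion& r) {
    for (void* p : r.objects) std::free(p);
    r.objects.clear();
}
```

With a layer like this, a program written against the region interface can be run on any general-purpose allocator, at the cost of the extra bookkeeping.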

Slide 24: Why Did They Do That?
  Recommended practice
  Premature optimization
    Microbenchmarks vs. actual performance
  Drift
    Not bottleneck anymore
  Improved competition
    Modern allocators are better

Slide 25: Reaps as Regions: Runtime
  Reap performance nearly matches regions

Slide 26: Using Reap as Regions
  Reap performance nearly matches regions

Slide 27: Drawbacks of Regions
  Can’t reclaim memory within regions
    Bad for long-running computations, producer-consumer patterns, “malloc/free” programs
    ⇒ Unbounded memory consumption
  Current situation for Apache:
    Vulnerable to denial-of-service
    Limits runtime of connections
    Limits module programming

Slide 28: Use Custom Allocators?
  Strongly recommended by practitioners
  Little hard data on performance/space improvements
    Only one previous study [Zorn 1992]
    Focused on just one type of allocator
  Custom allocators: waste of time
    Small gains, bad allocators
  Different allocators better? Trade-offs?

Slide 29: Kinds of Custom Allocators
  Three basic types of custom allocators
    Per-class
      Fast
    Custom patterns
      Fast, but very special-purpose
    Regions
      Fast, possibly more space-efficient
      Convenient
      Variants: nested, obstacks

Slide 30: Optimization Opportunity