Reconsidering Custom Memory Allocation Emery D. Berger Benjamin G. Zorn Kathryn S. McKinley November 2002 Proceedings of the Conference on Object-Oriented.

Reconsidering Custom Memory Allocation Emery D. Berger Benjamin G. Zorn Kathryn S. McKinley November 2002 Proceedings of the Conference on Object-Oriented Programming: Systems, Languages, and Applications (OOPSLA) 2002

Lecture Topics: custom memory allocators; general-purpose allocators; regions (good performance); reaps (very good performance and more); results and conclusions.

Key Contributions of the paper: a comprehensive evaluation of custom allocators; a comparison of custom allocators with general-purpose allocators (memory consumption and performance); most programmers seeking faster memory allocation should use the Lea allocator rather than writing their own.

Key Contributions of the paper – Cont. The custom allocators that do provide higher performance use regions. Reaps are even better.

Key Contributions of the paper – Cont. If you need fast regions, use reaps. Otherwise use the Lea allocator rather than any custom allocator.

Related Works: articles in the trade press claim that custom allocators are a good idea (“Effective C++”, “The C++ Programming Language”); Benjamin Zorn claimed in 1993 that they are a waste of time; articles on region allocation (arenas, groups, zones). We find that all of them are true.

General-purpose memory allocators: the Windows XP allocator; the Lea allocator (Linux).

Lea Allocator: an approximate best-fit allocator whose behavior depends on object size. Small objects (<64 bytes) are allocated from exact-size quicklists; requests for medium objects (<128K) trigger coalescing of the quicklists; large objects are allocated and freed with mmap. Considered among the best general-purpose allocators known.

Our Benchmarks

Emulating Custom Semantics: custom allocators often support semantics that differ from the C interface. The region emulator provides full region semantics on top of a general-purpose allocator: it records a pointer to each allocated object so that the whole region can be deleted. The pointers are recorded in an out-of-band array, so they have no impact on drag.
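The emulator idea above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name EmulatedRegion and its methods are hypothetical, and std::malloc/std::free stand in for whichever general-purpose allocator is being measured.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical region emulator: every object comes from the general-purpose
// allocator; the region only remembers the pointers it handed out.
class EmulatedRegion {
public:
    EmulatedRegion() = default;
    EmulatedRegion(const EmulatedRegion&) = delete;
    EmulatedRegion& operator=(const EmulatedRegion&) = delete;
    ~EmulatedRegion() { freeAll(); }

    void* allocate(std::size_t size) {
        // Reserve the bookkeeping slot first so a failed push cannot leak p.
        live_.push_back(nullptr);
        void* p = std::malloc(size);
        if (p == nullptr) {
            live_.pop_back();
            return nullptr;
        }
        // Out-of-band record: the object itself carries no extra header.
        live_.back() = p;
        return p;
    }

    // Region deletion: free every object allocated since the last freeAll().
    void freeAll() {
        for (void* p : live_) std::free(p);
        live_.clear();
    }

    std::size_t liveObjects() const { return live_.size(); }

private:
    std::vector<void*> live_;  // the out-of-band pointer array
};
```

Because the bookkeeping lives in a separate array, the objects have the same size and layout they would have under the plain general-purpose allocator.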

Custom memory allocators - Definition: a memory allocation mechanism that differs from a general-purpose allocator in at least one of two ways: it may provide more than one object for every allocated chunk of memory; it may not immediately return objects to the system or general-purpose allocator. Simple wrappers around the general-purpose allocator are excluded.

Custom allocators – widespread use: recommended as an optimization technique in the trade press; used in the Apache web server, GCC, and the C++ STL; supported directly by C++ (by overloading the new and delete operators).

Why do programmers use custom allocators? Improving runtime performance; reducing memory consumption; improving software engineering (?).

Improving runtime performance: programs spend 16% of their run time (on average) in the memory allocator. Performance is the reason given for most of our benchmarks. The cause is the per-operation cost of general-purpose allocators in programs that use the allocator intensively.

Improving runtime performance – Cont.

Reducing memory consumption

Improving software engineering (?): memory allocated by a custom allocator can't be managed by another allocator; calling free on a custom-allocated object may cause a segmentation fault; the source of memory consumption in the program becomes difficult to understand; no Purify; no parallel allocator for SMP scalability; no GC; no shared multi-language heap.

Improving software engineering (!): a region-based allocator simplifies memory management; a memory area can be deleted by a single call; separate memory areas; regions are good for multithreaded server applications; memory space isolation; memory leak prevention; Apache web server.

A Taxonomy of Custom Allocators: apply your knowledge about some set of objects; use regions to free objects that die at the same time; take advantage of object sizes; use known allocation patterns.

Benchmark allocators characteristics: per-class allocators; regions; nested regions; obstack; custom patterns.

Per-class allocators: objects of the same size (type); size checks elided; a freelist holding objects of the specific type; the same API as malloc and free.

Regions: allocation by incrementing a pointer into a large chunk of memory; only entire-region deletion (freeAll function), no deletion of individual objects. Nested regions: nested object lifetimes. Obstack (“Object Stack”): deletion of every object allocated after a certain object.
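A region's pointer-bumping allocation and freeAll can be sketched as below. This is an illustrative sketch only: the Region class, its 4096-byte default chunk size, and the use of std::malloc as the chunk source are assumptions, not details taken from the benchmarked allocators.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical region: bump a pointer through large chunks, free all at once.
class Region {
public:
    explicit Region(std::size_t chunkSize = 4096) : chunkSize_(chunkSize) {}
    Region(const Region&) = delete;
    Region& operator=(const Region&) = delete;
    ~Region() { freeAll(); }

    void* allocate(std::size_t size) {
        // Round up so that every returned pointer stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        if (size == 0) size = 1;
        size = (size + align - 1) / align * align;
        if (size > remaining_) {
            // Current chunk exhausted: grab a new one (oversized if needed).
            const std::size_t bytes = size > chunkSize_ ? size : chunkSize_;
            void* chunk = std::malloc(bytes);
            if (chunk == nullptr) return nullptr;
            chunks_.push_back(chunk);
            cursor_ = static_cast<char*>(chunk);
            remaining_ = bytes;
        }
        void* result = cursor_;
        cursor_ += size;      // the whole allocation cost: one pointer bump
        remaining_ -= size;
        return result;
    }

    // The only way to reclaim memory: delete the entire region.
    void freeAll() {
        for (void* c : chunks_) std::free(c);
        chunks_.clear();
        cursor_ = nullptr;
        remaining_ = 0;
    }

    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::size_t chunkSize_;
    std::vector<void*> chunks_;
    char* cursor_ = nullptr;
    std::size_t remaining_ = 0;
};
```

Note that there is deliberately no per-object free: an object that dies early keeps its memory until freeAll is called.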

Custom patterns: a general-purpose allocator optimized for a particular pattern of object behavior.

Custom allocators characteristics – Cont.

Problems with regions: excessive memory retention; unbounded memory consumption with unbounded buffers, dynamic arrays, and producer-consumer patterns; complicated programming of server applications (Apache).

The ideal allocator: Region Semantics + General-Purpose Allocation (heap) = Reaps

Heaps: malloc, free. Regions: malloc, freeAll. Reaps: malloc, free, freeAll.

Reaps - Example: reapCreate(r); x1 = reapMalloc(r, 8); x2 = reapMalloc(r, 8); x3 = reapMalloc(r, 16); x4 = reapMalloc(r, 8); reapFree(r, x3); (Figure: reap r with its region chunks and its associated heap.)

Implementation Issues: initially, region-like behavior; allocation by bumping a pointer; geometrically increasing chunks of memory threaded onto a linked list; a header for every allocated object; freed objects (reapFree) are placed in an associated heap; subsequent allocations use memory from this heap.
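The behavior listed above might look roughly like the following sketch. It is a simplification under stated assumptions: the class name Reap, its method names, and the 1024-byte first chunk are hypothetical, and a first-fit free list stands in for the full associated heap (no splitting or coalescing).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical, simplified reap: region-style bump allocation plus a free
// list standing in for the associated heap of freed objects.
class Reap {
public:
    Reap() = default;
    Reap(const Reap&) = delete;
    Reap& operator=(const Reap&) = delete;
    ~Reap() { freeAll(); }

    void* allocate(std::size_t size) {
        size = roundUp(size == 0 ? 1 : size);
        // First try memory returned by release() (first fit, no splitting).
        for (Header** link = &freeList_; *link != nullptr; link = &(*link)->next) {
            if ((*link)->size >= size) {
                Header* h = *link;
                *link = h->next;
                return h + 1;
            }
        }
        // Otherwise bump a pointer, growing by geometrically larger chunks.
        const std::size_t needed = sizeof(Header) + size;
        if (needed > remaining_) {
            while (nextChunk_ < needed) nextChunk_ *= 2;
            void* chunk = std::malloc(nextChunk_);
            if (chunk == nullptr) return nullptr;
            chunks_.push_back(chunk);
            cursor_ = static_cast<char*>(chunk);
            remaining_ = nextChunk_;
            nextChunk_ *= 2;
        }
        Header* h = new (cursor_) Header{size, nullptr};  // per-object header
        cursor_ += needed;
        remaining_ -= needed;
        return h + 1;
    }

    // Individual deletion: the object goes onto the free list for reuse.
    void release(void* p) {
        if (p == nullptr) return;
        Header* h = static_cast<Header*>(p) - 1;
        h->next = freeList_;
        freeList_ = h;
    }

    // Region-style deletion: everything goes at once.
    void freeAll() {
        for (void* c : chunks_) std::free(c);
        chunks_.clear();
        freeList_ = nullptr;
        cursor_ = nullptr;
        remaining_ = 0;
        nextChunk_ = kFirstChunk;
    }

    std::size_t chunkCount() const { return chunks_.size(); }

private:
    struct alignas(std::max_align_t) Header {
        std::size_t size;  // usable bytes that follow the header
        Header* next;      // link used only while the object is free
    };
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    static constexpr std::size_t kFirstChunk = 1024;

    std::vector<void*> chunks_;
    Header* freeList_ = nullptr;
    char* cursor_ = nullptr;
    std::size_t remaining_ = 0;
    std::size_t nextChunk_ = kFirstChunk;
};
```

Until the first release() the reap behaves exactly like a region; the per-object header is the price paid for being able to free individual objects later.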

Reap allocation interface
void reapCreate (void ** reap, void ** parent);
void reapDestroy (void ** reap);
void reapFreeAll (void ** reap); // clear
void * reapMalloc (void ** reap, size_t size);
void reapFree (void ** reap, void * object);

Design issues: Heap Layers; mixins.

Design issues – Cont. RegionHeap, CoalesceableHeap, ClearOptimizedHeap, NestedHeap, LeaHeap, Sbrk.

Design issues – Layers. LeaHeap layer: high speed, low fragmentation. NestedHeap layer. ClearOptimizedHeap layer: nothingOnHeap flag; fast allocations by pointer bumping on the first heap; the second heap is used after an object has been freed. CoalesceableHeap layer: adds per-object metadata. RegionHeap layer: linked list of allocated objects; clear().

Benchmark allocation statistics

Benchmark allocation statistics – Cont. Programs with general-purpose allocators: not allocation-intensive; spend little time in the memory allocator. Programs with custom allocators: tend to allocate many small objects; spend more time in the memory allocator; correctly pinpoint the memory manager as a significant factor in performance.

Results: different memory management policies compared (general, custom, reaps) on execution time and memory consumption.

Results - technicalities: runtime is the best of three runs; compiled with Visual C++; run on a 600 MHz Pentium III with 320 MB of RAM under Windows XP.

Runtime Performance

Runtime Performance – Cont. Custom vs. Windows: the results justify the use of custom allocators. Lea provides almost the same performance as the custom allocators, except for regions. Reaps are comparable to Lea and to the custom allocators.

Memory Consumption

Memory Consumption – Cont. No Windows XP results: there is no equivalent way to keep track of memory consumption. Reaps: individual deletion is not used. Mixed results. The region space advantage is misleading.

Evaluating Region Allocation: total drag is an average ratio of heap sizes with and without immediate object deallocation; freeing every dead object immediately gives a total drag of 1; non-region allocators show minimal drag; region allocators show high drag, that is, a substantial increase in memory consumption.
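As a worked illustration of the drag metric, the sketch below computes the ratio of the average heap size under region-style deletion to the average heap size under immediate deallocation. This is one assumed reading of the definition above; the function name totalDrag and the idea of sampling both heaps at the same points in the run are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Assumed reading of "total drag": average heap size with region-style
// deletion divided by average heap size with immediate frees.
// Returns 0.0 when the inputs are unusable (empty trace or zero baseline).
double totalDrag(const std::vector<double>& heapWithRegions,
                 const std::vector<double>& heapWithImmediateFree) {
    if (heapWithRegions.empty() || heapWithImmediateFree.empty()) return 0.0;
    const double avgRegions =
        std::accumulate(heapWithRegions.begin(), heapWithRegions.end(), 0.0) /
        static_cast<double>(heapWithRegions.size());
    const double avgImmediate =
        std::accumulate(heapWithImmediateFree.begin(),
                        heapWithImmediateFree.end(), 0.0) /
        static_cast<double>(heapWithImmediateFree.size());
    if (avgImmediate == 0.0) return 0.0;
    return avgRegions / avgImmediate;
}
```

For example, a region heap sampled at 10, 20, and 30 units against an immediate-free heap that stays at 10 units averages 20 against 10, a drag of 2: the region version holds twice the memory on average.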

Evaluating Region Allocation – Cont.

Experimental Comparison to previous work

Reaps in Apache: exploiting the space-consumption advantage of individual deletion. bc is an arbitrary-precision calculator language. Apache's region calls are rerouted to reaps, with an added reapFree (ap_pfree) call; malloc and free are redefined in bc. Computing the 1000th prime consumes 7.4 MB without ap_pfree and 240 KB with it.

Why programmers use custom allocators to no effect: recommended practice; premature optimization; drift; improved competition.

Conclusions: despite widespread belief, custom allocators don't always improve performance; the Lea allocator is as fast or even faster; the exception is region-based allocators; reaps offer high performance and a reduction in memory consumption.

Future plans: integrating reaps with the Hoard scalable memory allocator; integrating reaps into a garbage-collected setting.

Questions ?

The End

Custom Allocator implementation: the standard C++ way (inheritance) has significant overhead from virtual method dispatch; it limits compiler optimizations; fixed relations between classes and a single inheritance structure make reuse difficult.

Mixins: can be reparented: template <class Super> class Mixin : public Super {}; No single class hierarchy: class Composition1 : public A {}; class Composition2 : public A {};
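A small sketch of a mixin layer may make the reparenting idea concrete. The names MallocHeap and CountingHeap are hypothetical and chosen only for illustration; the layer's parent is a template parameter, so the same layer can sit on top of any heap that provides malloc and free, with no virtual dispatch involved.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical bottom layer: forwards to the system allocator.
class MallocHeap {
public:
    void* malloc(std::size_t sz) { return std::malloc(sz); }
    void free(void* p) { std::free(p); }
};

// Hypothetical mixin layer: its superclass is chosen at compile time.
// It counts live objects and delegates the real work to Super.
template <class Super>
class CountingHeap : public Super {
public:
    void* malloc(std::size_t sz) {
        void* p = Super::malloc(sz);  // statically bound, can be inlined
        if (p != nullptr) ++live_;
        return p;
    }
    void free(void* p) {
        if (p == nullptr) return;
        --live_;
        Super::free(p);
    }
    std::size_t live() const { return live_; }

private:
    std::size_t live_ = 0;
};
```

Writing CountingHeap<MallocHeap> composes the two layers; substituting a different parent type reuses the counting layer unchanged.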

Heap Layers: a mixin that provides malloc and free. Coding guidelines: handle NULL returned by malloc() correctly; the destructor must free any memory held by the layer. Top heaps: wrappers around system-provided memory.

Example – Composing a Per-Class Allocator: a per-class pool of memory for same-sized objects; a singly-linked freelist for memory management; no change to the source code of the original class. PerClassHeap: a utility class that adapts a class to use a heap layer as its allocator. FreeListHeap: a heap layer.

Example - PerClassHeap
template <class Object, class SuperHeap>
class PerClassHeap : public Object {
public:
  // Route allocation of this class through its own heap.
  inline void * operator new (size_t sz) { return getHeap().malloc (sz); }
  inline void operator delete (void * ptr) { getHeap().free (ptr); }
private:
  // One heap instance shared by all objects of this class.
  static SuperHeap& getHeap () { static SuperHeap theHeap; return theHeap; }
};

Example - FreeListHeap

Example - Combination: a Foo subclass that uses per-class pools: class FasterFoo : public PerClassHeap<Foo, FreeListHeap<…> > {};
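The whole per-class composition can be sketched end to end as below. Everything here is a hedged illustration rather than the original slide code: MallocHeap, the body of FreeListHeap, the contents of Foo, and the template arguments given to PerClassHeap are assumptions. The freelist layer assumes every request has the same size, which holds as long as only FasterFoo itself is allocated through it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical top heap: a wrapper around system-provided memory.
class MallocHeap {
public:
    void* malloc(std::size_t sz) { return std::malloc(sz); }
    void free(void* p) { std::free(p); }
};

// Hypothetical freelist layer: freed blocks are kept on a singly-linked
// list and handed back on the next request (same-sized objects assumed).
template <class Super>
class FreeListHeap : public Super {
public:
    ~FreeListHeap() {
        // The destructor returns any memory still held by this layer.
        while (head_ != nullptr) {
            Node* n = head_;
            head_ = n->next;
            Super::free(n);
        }
    }
    void* malloc(std::size_t sz) {
        if (head_ != nullptr) {
            Node* n = head_;
            head_ = n->next;
            return n;
        }
        // Make sure a freed block can always hold the list link.
        return Super::malloc(sz < sizeof(Node) ? sizeof(Node) : sz);
    }
    void free(void* p) {
        if (p == nullptr) return;
        head_ = new (p) Node{head_};  // reuse the dead object's storage
    }

private:
    struct Node { Node* next; };
    Node* head_ = nullptr;
};

// Adapter: gives Object class-specific operator new/delete backed by SuperHeap.
template <class Object, class SuperHeap>
class PerClassHeap : public Object {
public:
    static void* operator new(std::size_t sz) {
        void* p = getHeap().malloc(sz);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { getHeap().free(p); }

private:
    static SuperHeap& getHeap() {
        static SuperHeap theHeap;
        return theHeap;
    }
};

struct Foo { int value = 0; };  // the original class, left unchanged

class FasterFoo : public PerClassHeap<Foo, FreeListHeap<MallocHeap>> {};
```

With this composition, new FasterFoo and delete go through the per-class freelist, so a freed object's storage is recycled by the next allocation of the same class.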

The End!!!