© Richard Jones, Eric Jul, 1999-2004. mmnet GC & MM Summer School, 20-21 July 2004. A Rapid Introduction to Garbage Collection. Richard Jones, Computing Laboratory.


A Rapid Introduction to Garbage Collection
Richard Jones, Computing Laboratory, University of Kent at Canterbury
mm-net Garbage Collection & Memory Management Summer School, Tuesday 20 July 2004
© Richard Jones, 2004. All rights reserved.

PART 1: Introduction
Motivation for garbage collection
What to look for

Why garbage collect?
Finite storage requirement
  – computers have finite, limited storage
Language requirement
  – many OO languages assume GC, e.g. allocated objects may survive much longer than the method that created them
Problem requirement
  – the nature of the problem may make it very hard or impossible to determine when something is garbage

Why automatic garbage collection?
Because human programmers just can't get it right. Either too little is collected, leading to memory leaks, or too much is collected, leading to broken programs. Explicit memory management conflicts with the software engineering principles of abstraction and modularity.
It's not a silver bullet
  – some memory management problems cannot be solved by automatic GC, e.g. if you forget to drop references to objects that you no longer need
  – some environments are inimical to garbage collection: embedded systems with limited memory, hard real-time systems

PART 2: The Basics
What is garbage?
The concept of liveness by reachability
The basic algorithms
The cost of garbage collection

What is garbage?
Almost all garbage collectors assume the following definition of live objects, called liveness by reachability: if you can get to an object, then it is live.
More formally, an object is live if and only if:
  – it is referenced by a predefined variable called a root, or
  – it is referenced by a variable contained in a live object (i.e. it is transitively referenced from a root)
Non-live objects are called dead objects, i.e. garbage.
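Liveness by reachability is exactly graph reachability. A minimal Python sketch of this definition follows; the `Obj` class and its field names are illustrative, not part of any real runtime.

```python
# A toy heap illustrating liveness by reachability.

class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []          # outgoing references to other objects

def reachable(roots):
    """Return the set of live objects: everything transitively
    referenced from a root."""
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(obj.refs)
    return live

# Build a small object graph: a -> b -> c, plus an object d -> b that
# no root reaches.
a, b, c, d = (Obj(n) for n in "abcd")
a.refs.append(b)
b.refs.append(c)
d.refs.append(b)                # d references a live object, but d itself
                                # is unreachable from any root: d is dead
live = reachable([a])           # a is the only root
```

Note that d holds a reference to a live object and is still dead: liveness depends on being referenced *from* the roots, not on what you reference.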

Roots
Objects and references can be considered a directed graph. Live objects are those reachable from a root.
A process executing a computation is called a mutator — it simply modifies the object graph dynamically.
Determining the roots of a computation is, in general, language-dependent. In common language implementations, roots include:
  – words in the static area
  – registers
  – words on the execution stack that point into the heap

The basic algorithms
Reference counting: keep a note on each object in your garage, indicating the number of live references to the object. If an object's reference count goes to zero, throw the object out (it's dead).
Mark-Sweep: put a note on objects you need (roots). Then recursively put a note on anything needed by a live object. Afterwards, check all objects and throw out objects without notes.
Mark-Compact: put notes on objects you need (as above). Move anything with a note on it to the back of the garage. Burn everything at the front of the garage (it's all dead).
Copying: move objects you need to a new garage. Then recursively move anything needed by an object in the new garage. Afterwards, burn down the old garage (any objects in it are dead)!

Reference counting
The simplest form of garbage collection is reference counting.
Basic idea: count the number of references from live objects.
Each object has a reference count (RC):
  – when a reference is copied, the referent's RC is incremented
  – when a reference is deleted, the referent's RC is decremented
  – an object can be reclaimed when its RC = 0
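The increment/decrement rules above amount to a write barrier on every pointer update. A minimal sketch, assuming a toy heap: the `Cell` class, the `update` barrier and the `free_list` are illustrative names, not a real allocator.

```python
# A minimal reference-counting sketch with a write barrier.

free_list = []                  # stands in for returning memory to the heap

class Cell:
    def __init__(self):
        self.rc = 0             # the reference count
        self.fields = {}        # named references to other cells

def inc_rc(cell):
    if cell is not None:
        cell.rc += 1

def dec_rc(cell):
    if cell is not None:
        cell.rc -= 1
        if cell.rc == 0:        # no references remain: reclaim,
            delete(cell)        # recursively freeing children

def delete(cell):
    for child in cell.fields.values():
        dec_rc(child)
    free_list.append(cell)

def update(cell, field, new_target):
    """Write barrier: every pointer write adjusts both counts."""
    inc_rc(new_target)          # increment first, so that writing the
    dec_rc(cell.fields.get(field))  # same value twice is safe
    cell.fields[field] = new_target

root = Cell(); inc_rc(root)     # roots count as references too
x = Cell()
update(root, "left", x)         # x.rc becomes 1
update(root, "left", None)      # x.rc drops to 0: x is reclaimed at once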

Advantages of reference counting
Simple to implement
Costs distributed throughout the program
Good locality of reference: only the old and new targets' RCs are touched
Works well because few objects are shared and many are short-lived
Zombie time minimised: the zombie time is the time from when an object becomes garbage until it is collected
Immediate finalisation is possible (due to near-zero zombie time)

Disadvantages of reference counting
Not comprehensive (does not collect all garbage): cannot reclaim cyclic data structures
High cost of manipulating RCs: the cost is ever-present, even if no garbage is collected
Bad for concurrency — need Compare&Swap
Tightly coupled interface to the mutator
High space overheads
Recursive freeing cascades
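The cyclic-data-structure problem can be seen in a few lines. In this sketch the `Node` class and the hand-maintained counts are illustrative; a real collector would adjust them in a write barrier.

```python
# Why reference counting cannot reclaim cycles.

class Node:
    def __init__(self):
        self.rc = 0
        self.next = None

a, b = Node(), Node()
a.next = b; b.rc += 1           # a -> b
b.next = a; a.rc += 1           # b -> a  (a cycle)
a.rc += 1                       # a root also references a

# Now the root drops its reference to a:
a.rc -= 1

# Each node is still referenced by the other, so neither count reaches
# zero -- yet nothing outside the cycle can reach a or b. They are
# garbage that pure reference counting will never reclaim.
```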

Mark-Sweep
Mark-sweep is a tracing algorithm — it works by following (tracing) references from live objects to find other live objects.
Implementation: each object has a mark-bit associated with it. There are two phases:
Mark phase: starting from the roots, the graph is traced and the mark-bit is set in each unmarked object encountered. At the end of the mark phase, unmarked objects are garbage.
Sweep phase: starting from the bottom, the heap is swept
  – mark-bit not set: the object is reclaimed
  – mark-bit set: the mark-bit is cleared
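The two phases can be sketched over a toy heap (a plain Python list of objects); the `Obj` class and the `mark`/`sweep` helpers are illustrative names, not a real collector.

```python
# A mark-sweep sketch: trace from the roots, then sweep the whole heap.

class Obj:
    def __init__(self):
        self.marked = False     # the per-object mark-bit
        self.refs = []

def mark(roots):
    stack = list(roots)
    while stack:                # mark phase: trace from the roots
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def sweep(heap):
    survivors = []
    for obj in heap:            # sweep phase: visit every object
        if obj.marked:
            obj.marked = False  # clear the bit for the next collection
            survivors.append(obj)
        # unmarked objects are simply dropped (reclaimed)
    return survivors

a, b, c = Obj(), Obj(), Obj()
a.refs.append(b)                # a -> b; c is unreachable
heap = [a, b, c]
mark([a])                       # a is the only root
heap = sweep(heap)
```

Note how the sweep visits every object, live or dead, which is exactly the O(heap) cost discussed below, while the mark phase visits only live ones.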

A simple mark-sweep example [diagram omitted]

Advantages of mark-sweep
Comprehensive: cyclic garbage is collected naturally
No run-time overhead on pointer manipulations
Loosely coupled to the mutator
Does not move objects
  – does not break any mutator invariants
  – optimiser-friendly
  – requires only one reference to each live object to be discovered (rather than having to find every reference)

Disadvantages of mark-sweep
Stop/start nature leads to disruptive pauses and long zombie times
Complexity is O(heap) rather than O(live)
  – every live object is visited in the mark phase
  – every object, live or dead, is visited in the sweep phase
Degrades with residency (heap occupancy): the collector needs headroom in the heap to avoid thrashing
Fragmentation and mark-stack overflow are issues
Tracing collectors must be able to find roots (unlike reference counting)

Fast allocation?
Problem: non-moving memory managers fragment the heap
  – mark-sweep
  – reference counting
A compacted heap
  – offers better spatial locality, e.g. better virtual memory and cache performance
  – allows fast allocation: merely bump a pointer
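In a compacted heap, allocation really is just one comparison and one addition. A sketch, with `HEAP_SIZE` and the byte-array heap as illustrative stand-ins for a real memory region:

```python
# Bump-pointer allocation: all the allocator state is one free pointer.

HEAP_SIZE = 1024
heap = bytearray(HEAP_SIZE)     # a stand-in for the compacted region
free = 0                        # next free byte

def alloc(nbytes):
    """Allocate by bumping the free pointer."""
    global free
    if free + nbytes > HEAP_SIZE:   # out-of-space check: one comparison
        raise MemoryError("heap full: collect and retry")
    addr = free
    free += nbytes                  # bump: one addition
    return addr

p = alloc(16)                   # first object at offset 0
q = alloc(32)                   # next object immediately after it
```

Contrast this with a free-list allocator, which must search for a hole of a suitable size on every allocation.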

Copying garbage collection
Divide the heap into two halves, called semi-spaces, named Fromspace and Tospace
Allocate objects in Tospace
When Tospace is full:
  – flip the roles of the semi-spaces
  – pick out all live data in Fromspace and copy it to Tospace
  – preserve sharing by leaving a forwarding address in the Fromspace replica
  – use Tospace objects as a work queue
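This scheme, with Tospace doubling as the work queue, can be sketched in Cheney's style using a Python list as a toy Tospace. The dict-based objects and the `"forward"` key are illustrative stand-ins for heap cells and forwarding addresses.

```python
# A Cheney-style copying collection over a toy heap.

def collect(roots):
    """Copy everything live into a fresh tospace; return it together
    with the updated roots."""
    tospace = []                # also serves as the work queue
    scan = 0                    # tospace[:scan] is fully processed

    def copy(obj):
        if "forward" not in obj:        # not yet copied
            replica = {"refs": list(obj["refs"])}
            obj["forward"] = replica    # leave a forwarding address
            tospace.append(replica)     # enqueue for later scanning
        return obj["forward"]           # sharing is preserved

    new_roots = [copy(r) for r in roots]
    while scan < len(tospace):          # Cheney scan: breadth-first
        replica = tospace[scan]
        replica["refs"] = [copy(c) for c in replica["refs"]]
        scan += 1                       # scan == free: collection done
    return tospace, new_roots

a = {"refs": []}
b = {"refs": []}
a["refs"].append(b)             # a -> b
dead = {"refs": [a]}            # references a live object, but is dead
tospace, roots = collect([a])   # copies a and b; never touches dead
```

Note that `dead` is never visited at all: the collector's work is proportional to the live data, not to the size of Fromspace.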

Copying GC Example
copy root A and update the pointer, leaving a forwarding address
scan A': copy B and C, leaving forwarding addresses
scan B': copy D and E, leaving forwarding addresses
scan C': copy F and G, leaving forwarding addresses
scan D' and E': nothing to do
scan F': use A's forwarding address
scan G': nothing to do
scan = free, so collection is complete

Advantages of copying GC
Compaction for free
Allocation is very cheap for all object sizes
  – out-of-space check is a pointer comparison
  – simply increment the free pointer to allocate
Only live data is processed (commonly a small fraction of the heap)
Fixed space overheads
  – free and scan pointers
  – forwarding addresses can be written over user data
Comprehensive: cyclic garbage is collected naturally
Simple to implement a reasonably efficient copying GC

Disadvantages of copying GC
Stop-and-copy may be disruptive
Degrades with residency
Requires twice the address space of other simple collectors
  – touches twice as many pages
  – trade-off against fragmentation
Cost of copying large objects
Long-lived data may be repeatedly copied
All references must be updated
Moving objects may break mutator invariants
Breadth-first copying may disturb locality patterns

Mark-compact collection
Mark-compact collectors make at least two passes over the heap after marking:
  – to relocate objects
  – to update references
(not necessarily in this order)
Issues
  – how many passes?
  – compaction style:
      sliding: preserve the original order of objects
      linearising: objects that reference each other are placed adjacently (as far as possible)
      arbitrary: objects moved without regard for original order or referential locality
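A sliding compactor in the style of the classic "Lisp 2" algorithm makes three passes after marking: compute each live object's new address, update all references, then move the objects. A sketch over a toy heap; the dict-based objects and indices-as-addresses are illustrative, and marking is assumed to have already happened.

```python
# A sliding (order-preserving) mark-compact sketch, Lisp-2 style.

def compact(heap):
    """heap: a list of {'marked': bool, 'refs': [index, ...]} objects.
    Returns the heap with all live objects slid to the front, in order."""
    # Pass 1: compute new addresses (here, indices) for marked objects.
    new_addr, next_free = {}, 0
    for i, obj in enumerate(heap):
        if obj["marked"]:
            new_addr[i] = next_free
            next_free += 1
    # Pass 2: update every reference to point at the new address.
    for obj in heap:
        if obj["marked"]:
            obj["refs"] = [new_addr[r] for r in obj["refs"]]
    # Pass 3: slide the live objects down, preserving original order.
    compacted = [None] * next_free
    for i, obj in enumerate(heap):
        if obj["marked"]:
            obj["marked"] = False       # clear for the next collection
            compacted[new_addr[i]] = obj
    return compacted

heap = [
    {"marked": True,  "refs": [2]},   # 0: references the object at 2
    {"marked": False, "refs": []},    # 1: dead; its slot is squeezed out
    {"marked": True,  "refs": [0]},   # 2: references the object at 0
]
heap = compact(heap)
```

The sliding style preserves allocation order, which tends to be good for locality; the price is the extra passes compared with copying collection.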

© Richard Jones, Eric Jul, mmnet GC & MM Summer School, July Cost metrics Many cost metrics can be interesting (albeit not necessarily at the same time). These cost metrics cover different types of concerns that may apply. The metrics are partially orthogonal, partially overlapping, and certainly also partially contradictory. In general it is not possible to identify one particular metric as the most important in all cases — it is application dependent. Because different GC algorithms emphasise different metrics, it is also, in general, not possible to point out one particular GC algorithm as “the best”. In the following, we present the most important metrics to consider when choosing a collector algorithm.

GC Metrics
Execution time
  – total execution time
  – distribution of GC execution time
  – time to allocate a new object
Memory usage
  – additional memory overhead
  – fragmentation
  – virtual memory and cache performance
Delay time
  – length of disruptive pauses
  – zombie times
Other important metrics
  – comprehensiveness
  – implementation simplicity and robustness

Execution time metrics
Total execution time
  – relevant for applications such as batch processing
  – can be less important for some applications, e.g. where there is much idle time (interactive applications)
Distribution of GC execution time
  – the absolute amount of execution time consumed may be less important than the amortisation of that cost over the mutator's execution
Time to allocate a new object
  – for some applications it may be important to be able to allocate new objects fast

Delay time metrics
Length of disruptive pauses
  – for applications requiring rapid response, e.g. most interactive applications, the length of the disruptive pauses introduced by the collector may be the most relevant metric
Zombie times
  – the delay from when an object becomes garbage until the memory allocated to it is actually collected
  – long zombie times require more memory to be available (to house the dead, as yet uncollected, objects)

Memory metrics
Amount of extra memory consumed
  – some algorithms work better with (or simply require) large amounts of extra memory
Memory fragmentation
  – some algorithms result in much fragmentation of memory, while others actually reduce fragmentation
Virtual memory and cache performance
  – the interaction between virtual memory, caches, and the garbage collector can be quite important
  – some algorithms touch all parts of allocated memory (live as well as dead objects, and even unallocated memory) while others touch only limited amounts (e.g. live objects only)

Other important metrics
Comprehensiveness: does it find all garbage?
  – some collectors are comprehensive: they collect all garbage
  – others are conservative: they leave some garbage uncollected; for example, some do not collect cyclic object structures, while others retain some dead objects because they cannot clearly identify all garbage
Implementation simplicity and robustness
  – at times, simplicity of implementation is most important (get the job done!)
  – is the garbage collector robust? How tightly coupled to the mutator is it?