Garbage Collection and Memory Management CS 480/680 – Comparative Languages

Motivation for GC

    int mysub(int times) {
        myClass *myObject;
        myObject = new myClass;
        int a = 0, b = 1;
        for (int i = 0; i < times; i++) {
            a = myObject->firstMethod();
            b = myObject->secondMethod();
        }
        return a / b;
    }

What’s wrong with this code?

Memory Management
- Explicit memory management
    - C++: new/delete
    - C: malloc/free
- Garbage collection
    - Java: new Class
    - Ruby: Class.new
    - Perl
    - Smalltalk
    - Etc…
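
To make the explicit side of this comparison concrete, here is a minimal C++ sketch. The class `Widget`, its `live` counter, and the function `useAndRelease` are hypothetical names introduced only for illustration; the counter exists so that reclamation can be observed.

```cpp
// Hypothetical class that counts how many instances are currently alive.
struct Widget {
    static int live;              // number of Widgets not yet destroyed
    Widget()  { ++live; }
    ~Widget() { --live; }
    int value() const { return 42; }
};
int Widget::live = 0;

// Explicit management: whoever calls `new` is responsible for `delete`.
int useAndRelease() {
    Widget *w = new Widget;       // allocate on the heap
    int v = w->value();
    delete w;                     // omit this line and the Widget is never reclaimed
    return v;
}
```

After `useAndRelease()` returns, `Widget::live` is back to 0. In a garbage-collected language the `delete` line has no counterpart; the runtime decides when the object can be reclaimed.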

What is Garbage Collection?
- Ideally, the garbage collector would find all objects that will not be accessed in the future.
- Of course, this is impossible in general, but a good approximation is the set of objects that have no references to them remaining in scope.
- When to invoke GC?
    - When memory is low
    - When the local scope changes
    - When the system is idle
    - When a reference is destroyed
    - Others?

Advantages of Garbage Collection
- Easier on the programmer
    - Create objects when needed
    - No need to de-allocate space
    - No need for destructors and other special memory considerations
- Fewer errors
    - Fewer memory leaks
    - No dangling pointers

GC Strategies
Two of the most common strategies are:
- Tracing
    - Includes the very common Mark & Sweep method
- Reference Counting

Tracing Garbage Collectors
- Start with all objects directly accessible to the main program (called roots)
    - Registers
    - Stack
    - Instruction pointer
    - Global variables
- Find all objects reachable from the roots

Roots
[Figure: the root set, showing the CPU registers (D0, D1, D2, A0, A1, A7), the program counter (PC), a stack frame starting at $7000 (return address, old A6, var1, var2, param, ptr1, ptr2), and the global variable list.]

Reference Chains
[Figure: the same stack frame, with ptr1 and ptr2 leading to heap objects labeled myObject and acctList, and a node with Value and Next fields.]

Tri-color Algorithm
Every object in the system is “marked” in one of three colors:
- White: candidate for recycling (condemned)
- Black: safe from recycling; holds no references to objects in the white set
- Grey: safe from recycling; may refer to objects in the white set

Initialization
- White set: everything except the roots
- Black set: empty
- Grey set: the roots (already safe from recycling)

Tracing
Repeat:
1. Pick an object from the grey set
2. Grey all white objects that this object refers to
3. Move the object to the black set
Until: the grey set is empty
Now recycle all objects remaining in the white set.
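
The loop above can be sketched in C++ as follows. This is an illustration under simplifying assumptions, not a real collector: the `Heap` type, which models object `i` as a list of indices of the objects it refers to, and the function name `trace` are hypothetical.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

enum class Color { White, Grey, Black };

// Hypothetical heap model: object i refers to the objects listed in refs[i].
struct Heap {
    std::vector<std::vector<std::size_t>> refs;
};

// Runs the tri-color trace and returns the final color of every object.
// Objects that are still White at the end are garbage.
std::vector<Color> trace(const Heap &heap, const std::vector<std::size_t> &roots) {
    std::vector<Color> color(heap.refs.size(), Color::White);
    std::deque<std::size_t> grey;            // worklist holding the grey set

    // Initialization: the roots start out grey, everything else white.
    for (std::size_t r : roots) {
        if (color[r] == Color::White) {
            color[r] = Color::Grey;
            grey.push_back(r);
        }
    }

    while (!grey.empty()) {
        std::size_t obj = grey.front();      // 1. pick an object from the grey set
        grey.pop_front();
        for (std::size_t target : heap.refs[obj]) {
            if (color[target] == Color::White) {   // 2. grey its white referents
                color[target] = Color::Grey;
                grey.push_back(target);
            }
        }
        color[obj] = Color::Black;           // 3. move the object to the black set
    }
    return color;
}
```

Note that two unreachable objects that refer only to each other stay white, so a tracing collector reclaims cycles without special handling.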

Example, Steps (1) to (4)
[Figures: four successive snapshots of the tri-color trace over the same stack frame (ptr1, ptr2) and two heap nodes, each with Value and Next fields.]

An Important Observation
- By following this method, an object in the black set can never refer to an object in the white set.
    - By the time an object is moved to the black set, all the white objects it points to have been moved to the grey set.
- At the end, none of the objects in the white set has a live reference to it, so all of them can safely be removed.

Variations of Tracing GC
- To move or not to move…
- Mark & Sweep: keep a few bits with each object to indicate whether it is in the black, grey, or white set. After marking, recycle every object in memory that is in the white set.
- Copying GC: relocate objects in the black set to a “safe” area of memory, then recycle everything else.
    - This is nice for creating good locality of reference!
    - Also reduces fragmentation.
    - Can be slower.
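
A copying collector can be sketched as below. This is one possible way to do the relocation, using a scan pointer over the new area, and is not taken from the slides. The `Object` layout, the index-based references, and the name `copyCollect` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical object layout: a payload plus indices of the objects it refers to.
struct Object {
    int value;
    std::vector<std::size_t> refs;
};

// Copies every object reachable from `roots` out of `fromSpace` into a fresh,
// compact to-space. `roots` is rewritten to hold to-space indices.
std::vector<Object> copyCollect(const std::vector<Object> &fromSpace,
                                std::vector<std::size_t> &roots) {
    const std::size_t NOT_COPIED = static_cast<std::size_t>(-1);
    std::vector<std::size_t> forward(fromSpace.size(), NOT_COPIED);
    std::vector<Object> toSpace;

    // Copies one object on first visit and returns its new index.
    auto relocate = [&](std::size_t old) {
        if (forward[old] == NOT_COPIED) {
            forward[old] = toSpace.size();
            toSpace.push_back(fromSpace[old]);   // refs still hold old indices
        }
        return forward[old];
    };

    for (std::size_t &r : roots) r = relocate(r);

    // Objects before `scan` have had their references fixed up (black);
    // objects from `scan` onward are copied but not yet scanned (grey).
    for (std::size_t scan = 0; scan < toSpace.size(); ++scan) {
        for (std::size_t i = 0; i < toSpace[scan].refs.size(); ++i) {
            std::size_t moved = relocate(toSpace[scan].refs[i]);
            toSpace[scan].refs[i] = moved;       // re-index: toSpace may have grown
        }
    }
    return toSpace;   // anything left only in fromSpace is garbage
}
```

Because live objects end up packed next to each other, the result has no gaps, which is where the locality and fragmentation benefits come from. The price is that every live object is copied and every reference to it is rewritten.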

More Variations
- Can the collector identify which parts of objects are references (pointers) and which are not?
    - Yes: “precise” GC
    - No: “conservative” GC
- What if pointers are encrypted, scrambled, or stored in some other “funny” way?
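
The conservative case can be illustrated with a small C++ sketch. It assumes a hypothetical collector that knows the start address of every heap object but cannot tell which stack words are pointers; the function name `conservativeRoots` and its parameters are made up for this example.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical model: the collector knows where every heap object starts, but
// not which stack words are pointers and which are plain integers.
std::set<std::uintptr_t> conservativeRoots(
        const std::vector<std::uintptr_t> &stackWords,
        const std::set<std::uintptr_t> &objectAddresses) {
    std::set<std::uintptr_t> pinned;
    for (std::uintptr_t word : stackWords) {
        // Any word whose bit pattern matches an object address is treated as
        // a reference, even if the program meant it as an integer.
        if (objectAddresses.count(word) != 0) {
            pinned.insert(word);
        }
    }
    return pinned;
}
```

An integer that happens to equal an object address keeps that object alive, so a conservative collector may retain some garbage. It also cannot safely move objects, since rewriting a word that was really an integer would corrupt the program.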

Still More Variations
- Can the GC mechanism run incrementally?
    - No: stop the rest of the system, do GC, then start up again
    - Yes: interleave units of GC work with work units from the rest of the system
- Can the GC mechanism run in a parallel thread?
- Note: all methods must at least scan the root set all at once
    - Why?
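
One way to picture incremental collection is a marker that does a bounded amount of work per call. The class `IncrementalMarker` and its `step(budget)` interface are hypothetical. The sketch also assumes the program does not change any references between steps, which a real incremental collector cannot assume.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical incremental marker over a heap where object i refers to refs[i].
// Assumes references do not change between steps; a real incremental
// collector needs extra bookkeeping to cope with a running program.
class IncrementalMarker {
public:
    IncrementalMarker(const std::vector<std::vector<std::size_t>> &refs,
                      const std::vector<std::size_t> &roots)
        : refs_(refs), marked_(refs.size(), false) {
        // The root set is scanned in one go, before any incremental step.
        for (std::size_t r : roots) shade(r);
    }

    // Processes at most `budget` grey objects; returns true when marking is done.
    bool step(std::size_t budget) {
        while (budget > 0 && !grey_.empty()) {
            std::size_t obj = grey_.front();
            grey_.pop_front();
            for (std::size_t target : refs_[obj]) shade(target);
            --budget;
        }
        return grey_.empty();
    }

    bool isMarked(std::size_t obj) const { return marked_[obj]; }

private:
    void shade(std::size_t obj) {
        if (!marked_[obj]) {
            marked_[obj] = true;      // grey or black: safe from recycling
            grey_.push_back(obj);
        }
    }

    std::vector<std::vector<std::size_t>> refs_;
    std::vector<bool> marked_;
    std::deque<std::size_t> grey_;
};
```

The rest of the system would call `step` with a small budget between its own work units, and sweep only once `step` returns true.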

Generational GC
- It is not strictly necessary to place everything (except the root objects) in the white set
- Faster GC can be performed by limiting the size of the white set
    - What should go there?
- Statistically speaking, the newest objects are also the most likely to go out of scope soon
    - Referred to as infant mortality or the generational hypothesis
- Divide run time into generations, and put only objects created in the current generation into the white set
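
A collection restricted to the current generation might look like the following sketch. The `Obj` layout with its `young` flag and the function `minorCollect` are hypothetical. For simplicity every old object is treated as grey from the start, which is safe but means all old objects are still scanned.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical object: a generation flag plus indices of referenced objects.
struct Obj {
    bool young;                        // created in the current generation?
    std::vector<std::size_t> refs;
};

// Only young objects are placed in the white set.
// Returns the indices of young objects found to be garbage.
std::vector<std::size_t> minorCollect(const std::vector<Obj> &heap,
                                      const std::vector<std::size_t> &roots) {
    std::vector<bool> safe(heap.size(), false);
    std::deque<std::size_t> grey;

    auto shade = [&](std::size_t i) {
        if (!safe[i]) { safe[i] = true; grey.push_back(i); }
    };

    // Old objects are never condemned here, so they start grey like the
    // roots. This simplification scans every old object; it is only a sketch.
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (!heap[i].young) shade(i);
    for (std::size_t r : roots) shade(r);

    while (!grey.empty()) {
        std::size_t obj = grey.front();
        grey.pop_front();
        for (std::size_t target : heap[obj].refs) shade(target);
    }

    std::vector<std::size_t> garbage;
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (!safe[i]) garbage.push_back(i);   // necessarily young
    return garbage;
}
```

An old object that has itself become unreachable is kept by this pass; it is only reclaimed when a collection puts older generations into the white set as well. Production collectors typically avoid scanning all old objects by recording which of them refer to young ones.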

Disadvantages of Tracing GC
- Can be invoked at any time, and can be slow
    - Not a good thing for Real Time (RT) computing
- The GC thread violates locality of reference by intentionally looking at memory areas that haven’t been accessed recently
    - So what?

Open Research
- Is all memory that is reachable still in use?
- Consider ARGV & ARGC
    - Usually read, parsed, and then ignored for the rest of the process
    - Do we really need to keep them around?
- Research area: finding objects that will never be used again…

Reference Counting
- Keep a reference count with every object
- Advantages:
    - Easy to implement
    - Fast and incremental GC
- Disadvantages:
    - Must update the reference count whenever a reference is created or deleted: SLOW
    - Need extra space in every object
    - Does not reclaim space if there are reference cycles (as in a doubly linked list)
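
The cycle problem can be observed with C++'s reference-counted `std::shared_ptr`. The `Node` type, its `live` counter, and the function `liveAfterDroppingCycle` are hypothetical names for this sketch. The cycle is broken by hand at the end so the demonstration itself does not leak.

```cpp
#include <memory>

// Hypothetical list node that counts live instances so reclamation is observable.
struct Node {
    static int live;
    std::shared_ptr<Node> next;    // owning (counted) reference
    std::shared_ptr<Node> prev;    // owning back reference: creates a cycle
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds a two-node doubly linked list, drops all outside references, and
// returns how many nodes are still alive afterwards.
int liveAfterDroppingCycle() {
    std::weak_ptr<Node> observer;          // does not contribute to the count
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->next = b;                       // b's count becomes 2
        b->prev = a;                       // a's count becomes 2
        observer = a;
    }                                      // a and b leave scope: counts drop to 1, not 0
    int stillAlive = Node::live;           // both nodes survive

    // Break the cycle by hand so this demo does not actually leak.
    if (std::shared_ptr<Node> a = observer.lock()) {
        a->next.reset();
    }
    return stillAlive;
}
```

Neither count reaches zero on its own because each node holds a counted reference to the other. In C++ the usual remedy is to make the back reference a `std::weak_ptr`; a tracing collector needs no such care, since it would simply find both nodes unreachable.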