Lecture 10: Heap Management CS 540 GMU Spring 2009

Process Address Space
Each process has its own address space:
–Text section (text segment) contains the executable code
–Data section (data segment) contains the global variables
–Stack contains temporary data (local variables, return addresses, ...)
–Heap contains memory that is dynamically allocated at run-time
[Figure: address space layout with regions labeled text, data, heap, stack]

Heap Management
Heap is used to allocate space for dynamic objects
May be managed by the user or by the runtime system
In a perfect world, live objects are never deleted and dead objects are always deleted:

              live    dead
not deleted    ok     ----
deleted       ----     ok

User Heap Management
User library manages memory; programmer decides when and where to allocate and deallocate
–void* malloc(long n)
–void free(void* addr)
Library calls OS for more pages when necessary
How does malloc/free work?
–Blocks of unused memory are stored on a free list
–malloc: search the free list for a usable memory block
–free: put the block onto the head of the free list
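
The malloc/free scheme on this slide can be sketched in a few lines of C. This is a minimal illustration, not how any particular library does it: the names `Block`, `sketch_malloc` and `sketch_free` are hypothetical, a fixed static arena stands in for pages requested from the OS, and first-fit search with block splitting is one assumed policy among several.

```c
#include <assert.h>
#include <stddef.h>

/* Header stored in front of every block (hypothetical layout). */
typedef struct Block {
    size_t size;          /* usable bytes that follow this header */
    struct Block *next;   /* next free block; meaningful only while on the free list */
} Block;

#define ARENA_BLOCKS 256
static Block arena[ARENA_BLOCKS];   /* assumption: fixed arena instead of OS pages */
static Block *free_list = NULL;
static int arena_ready = 0;

void *sketch_malloc(size_t n) {
    if (!arena_ready) {              /* first call: one big free block */
        free_list = arena;
        free_list->size = sizeof arena - sizeof(Block);
        free_list->next = NULL;
        arena_ready = 1;
    }
    /* Round up so that headers created by splitting stay aligned. */
    n = (n + sizeof(Block) - 1) / sizeof(Block) * sizeof(Block);

    Block **link = &free_list;       /* pointer to the link that leads to b */
    for (Block *b = free_list; b != NULL; link = &b->next, b = b->next) {
        if (b->size < n)
            continue;                /* too small: keep searching (first fit) */
        if (b->size >= n + 2 * sizeof(Block)) {
            /* Split: the tail of b becomes a new free block. */
            Block *rest = (Block *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(Block);
            rest->next = b->next;
            b->size = n;
            *link = rest;
        } else {
            *link = b->next;         /* use the whole block */
        }
        return b + 1;                /* memory handed out starts after the header */
    }
    return NULL;  /* a real library would ask the OS for more pages here */
}

void sketch_free(void *addr) {
    if (addr == NULL)
        return;
    assert(arena_ready);
    Block *b = (Block *)addr - 1;    /* step back to the header */
    b->next = free_list;             /* put the block onto the head of the free list */
    free_list = b;
}
```

Because free pushes onto the head of the list, a block that was just freed is the first candidate examined by the next search. Note that this sketch never merges neighbouring free blocks.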

User Heap Management Drawbacks
malloc is not free: we might have to do a search to find a big enough block
As the program runs, the heap fragments, leaving many small, unusable pieces
Have to have a way to reclaim and merge blocks when freed
Memory management always has a cost. We want to minimize costs and, these days, maximize reliability
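
One common way to merge blocks when they are freed is to keep the free list sorted by address and coalesce a freed block with any neighbour it touches. The sketch below models blocks as (start, size) records rather than real memory; `Span` and `insert_and_merge` are hypothetical names, and the address-ordered list is an assumption, not something the slide prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* A free region, described by its start offset and length (model only). */
typedef struct Span {
    size_t start;
    size_t size;
    struct Span *next;
} Span;

/* Insert s into an address-ordered free list, merging it with adjacent
 * regions. Returns the (possibly new) head of the list. */
Span *insert_and_merge(Span *head, Span *s) {
    assert(s != NULL);
    Span *prev = NULL;
    Span *cur = head;
    while (cur != NULL && cur->start < s->start) {   /* find the insertion point */
        prev = cur;
        cur = cur->next;
    }
    s->next = cur;
    if (prev != NULL)
        prev->next = s;
    else
        head = s;

    /* Merge with the successor if s ends exactly where it begins. */
    if (cur != NULL && s->start + s->size == cur->start) {
        s->size += cur->size;
        s->next = cur->next;
    }
    /* Merge with the predecessor if it ends exactly where s begins. */
    if (prev != NULL && prev->start + prev->size == s->start) {
        prev->size += s->size;
        prev->next = s->next;
    }
    return head;
}
```

Freeing three touching 16-byte regions in any order leaves a single 48-byte region on the list, which is what lets a later large request succeed despite earlier fragmentation. The price is a linear search on every free.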

User Heap Management
Pro: performance
Con: safety

              live                  dead
not deleted    ok                   memory leak
deleted       dangling reference     ok

Automatic Heap Management
Garbage collection – reclamation of chunks of storage holding objects that can no longer be accessed by a program
Originated with Lisp
New languages tend to have automatic management
Less error prone, but with a performance penalty
Assumptions:
–type info is available at runtime (how large a block is, where the pointers are)
–pointers are always to the start of a block

Automatic Heap Management
Issues:
–How much does this increase the runtime of a program?
–Space – need to manage free/used space
–Pause time – incremental vs. ‘stop the world’
–Influences on data locality
–Different types/sizes of objects

Locating Garbage
[Figures: two diagrams of a stack holding references r1 and r2]

Object reachability
The set of reachable objects changes as a program executes:
–Object allocations
–Parameters & return values
–Reference assignments
–Stack-based variables

Reference Counting
Keep a count of pointers to each object
–performance overhead each time the reachable set can change
–storage overhead since each object needs a count field
Zero references => garbage that can be removed (applied transitively)
> zero references => not garbage?? (not necessarily: objects in an unreachable cycle keep nonzero counts)
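
A small C sketch can make the counting rules concrete. All names here (`RcObj`, `rc_new`, `rc_release`, `rc_set_field`, `live_objects`) are hypothetical, and each object is assumed to hold a single outgoing pointer to keep the code short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct RcObj {
    int refcount;          /* the per-object count field (storage overhead) */
    struct RcObj *field;   /* assumption: one outgoing pointer per object */
} RcObj;

static int live_objects = 0;   /* bookkeeping for the demonstration only */

/* A new object starts with one reference: the one returned to the caller. */
RcObj *rc_new(void) {
    RcObj *o = malloc(sizeof *o);
    if (o == NULL)
        abort();
    o->refcount = 1;
    o->field = NULL;
    live_objects++;
    return o;
}

void rc_retain(RcObj *o) {
    if (o != NULL)
        o->refcount++;
}

/* Drop one reference. When a count reaches zero the object is garbage,
 * and the reference it held is dropped in turn (applied transitively). */
void rc_release(RcObj *o) {
    while (o != NULL) {
        assert(o->refcount > 0);
        if (--o->refcount > 0)
            return;
        RcObj *child = o->field;
        free(o);
        live_objects--;
        o = child;
    }
}

/* Every pointer assignment has to adjust two counts (runtime overhead). */
void rc_set_field(RcObj *o, RcObj *target) {
    RcObj *old = o->field;
    rc_retain(target);
    o->field = target;
    rc_release(old);
}
```

With a chain a -> b, releasing the program's references to b and then a frees both objects. With a cycle x -> y -> x, releasing the program's references leaves both counts at 1, so neither object is ever reclaimed even though neither is reachable.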

Reference Counting
[Figure: stack holding references, including r1]

What if r1 is removed?
[Figure: the same diagram]

Trace Collecting
When the heap starts to fill, pause the program and reclaim
‘Stop the world’ – pauses can be significant
Lots of versions
–Mark & sweep
–Mark & compact
–…

Mark & Sweep
Basic idea:
Maintain a linked list, called the free list, that contains all of the heap blocks that can be allocated
If the program requests a heap block but the free list is empty, invoke the garbage collector
–starting from the pointers in the stack, data area and registers*, trace and mark all reachable heap blocks
–sweep and collect all unmarked heap blocks, putting each one back on the free list (merging as appropriate)
–sweep again, unmarking all heap blocks
First developed by McCarthy in 1960 for Lisp
* detecting what is a pointer is easier said than done
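
The collector described on this slide can be sketched over a toy heap of fixed-size cells. Everything named here is hypothetical (`Cell`, `roots`, `heap_init`, `collect`, `cell_alloc`). The sketch makes three simplifying assumptions: an explicit `roots` array stands in for the stack, data area and registers; every cell has exactly two pointer fields, so finding pointers is trivial; and unmarking is folded into the sweep instead of using a second pass. Fixed-size cells also mean there is nothing to merge.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 8
#define MAX_ROOTS 4

typedef struct Cell {
    int marked;                  /* the mark bit kept in every object */
    int allocated;
    struct Cell *left, *right;   /* assumption: exactly two pointer fields */
    struct Cell *next_free;      /* link used while on the free list */
} Cell;

static Cell heap[HEAP_CELLS];
static Cell *free_list = NULL;
static Cell *roots[MAX_ROOTS];   /* stands in for stack, data area, registers */

void heap_init(void) {
    free_list = NULL;
    for (int i = HEAP_CELLS - 1; i >= 0; i--) {
        heap[i] = (Cell){0};
        heap[i].next_free = free_list;
        free_list = &heap[i];
    }
    for (int i = 0; i < MAX_ROOTS; i++)
        roots[i] = NULL;
}

/* Mark phase: trace everything reachable from c. */
static void mark(Cell *c) {
    if (c == NULL || c->marked)
        return;
    assert(c->allocated);        /* reachable cells must be allocated */
    c->marked = 1;
    mark(c->left);
    mark(c->right);
}

/* Returns the number of cells put back on the free list. */
int collect(void) {
    int reclaimed = 0;
    for (int i = 0; i < MAX_ROOTS; i++)
        mark(roots[i]);
    for (int i = 0; i < HEAP_CELLS; i++) {   /* sweep phase */
        Cell *c = &heap[i];
        if (c->allocated && !c->marked) {
            c->allocated = 0;
            c->left = c->right = NULL;
            c->next_free = free_list;
            free_list = c;
            reclaimed++;
        }
        c->marked = 0;           /* unmark, ready for the next collection */
    }
    return reclaimed;
}

Cell *cell_alloc(void) {
    if (free_list == NULL)
        collect();               /* free list empty: invoke the collector */
    if (free_list == NULL)
        return NULL;             /* everything is still reachable */
    Cell *c = free_list;
    free_list = c->next_free;
    c->allocated = 1;
    c->marked = 0;
    c->left = c->right = NULL;
    return c;
}
```

Unlike reference counting, tracing from the roots reclaims unreachable cycles, since a cycle that no root leads to is never marked. The recursive `mark` is the simplest form; it uses stack space proportional to the longest pointer chain.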

Mark & Sweep
Issues:
–How to detect pointers?
–Need to keep a bit that we can use for the algorithm in every object
–In Lisp, the program execution would pause while the system did garbage collecting

Mark & Compact
Basic idea (Fig 7.27 in text): To garbage collect
1. Starting from the pointers in the stack, data area and registers, find and mark all reachable heap blocks
2. Scan the marked nodes and compute a new address for each, based on where it should be relocated to have all of the blocks contiguous
3. Move each block to its new location, updating any internal pointers into the heap based on step 2. Update any program & stack variables based on the new assignments as well.
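
The three steps above can be sketched over a toy heap in which objects live in array slots and "addresses" are slot indices. This is an illustration under stated assumptions, not the algorithm of Fig 7.27: the names `Obj`, `slots`, `obj_alloc` and `compact` are hypothetical, every object is the same size, and each object holds at most one reference.

```c
#include <assert.h>

#define N_SLOTS 8
#define NONE (-1)

typedef struct Obj {
    int in_use;
    int marked;
    int forward;   /* new slot index, computed in step 2 */
    int ref;       /* slot index of the referenced object, or NONE */
    int payload;
} Obj;

static Obj slots[N_SLOTS];
static int next_free = 0;   /* slots [0, next_free) are occupied */

/* Bump allocation: take the next slot, or NONE if the heap is full. */
int obj_alloc(int payload) {
    if (next_free == N_SLOTS)
        return NONE;
    Obj *o = &slots[next_free];
    o->in_use = 1;
    o->marked = 0;
    o->forward = NONE;
    o->ref = NONE;
    o->payload = payload;
    return next_free++;
}

void compact(int *roots, int n_roots) {
    /* Step 1: mark everything reachable from the roots. */
    for (int r = 0; r < n_roots; r++)
        for (int i = roots[r]; i != NONE && !slots[i].marked; i = slots[i].ref)
            slots[i].marked = 1;

    /* Step 2: compute a new, contiguous address for each marked object. */
    int dest = 0;
    for (int i = 0; i < N_SLOTS; i++)
        if (slots[i].in_use && slots[i].marked)
            slots[i].forward = dest++;

    /* Step 3a: rewrite internal pointers and roots using the new addresses. */
    for (int i = 0; i < N_SLOTS; i++)
        if (slots[i].in_use && slots[i].marked && slots[i].ref != NONE)
            slots[i].ref = slots[slots[i].ref].forward;
    for (int r = 0; r < n_roots; r++)
        if (roots[r] != NONE)
            roots[r] = slots[roots[r]].forward;

    /* Step 3b: slide live objects down. Since forward <= i, moving in
     * ascending order never overwrites a live object that has not moved. */
    for (int i = 0; i < N_SLOTS; i++) {
        if (slots[i].in_use && slots[i].marked) {
            Obj moved = slots[i];
            assert(moved.forward <= i);
            moved.marked = 0;
            slots[i].in_use = 0;
            slots[moved.forward] = moved;
        } else {
            slots[i].in_use = 0;   /* garbage (or already empty) */
        }
    }
    next_free = dest;   /* all free space is now one contiguous region */
}
```

After compaction the free space is a single region, so allocation is just a pointer bump and there is no fragmentation. The cost is that every pointer to a moved object, including the roots, has to be found and rewritten.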

Generational Collection
–Idea: most objects die young, so an object that survives its first round of garbage collection is likely to survive later rounds as well

A Generational algorithm
Memory is divided into N partitions
New objects are always allocated into partition 0
When partition 0 fills, it is garbage collected (via some technique). Anything that survives is moved to partition 1 (leaving 0 empty).
Keep doing this – eventually partition i (> 0) will fill. Once it fills, garbage collect it and put surviving elements into partition i+1.
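
The promotion scheme can be sketched as follows. The names (`GObj`, `Partition`, `gen_alloc`, `collect_partition`) are hypothetical, and two things are assumed that the slide leaves open: a per-object `live` flag stands in for whatever technique decides reachability, and survivors of the oldest partition simply stay where they are.

```c
#include <assert.h>

#define N_PARTS 3
#define PART_CAP 4

typedef struct GObj {
    int id;
    int live;   /* assumption: stands in for "reachable", set by the caller */
} GObj;

typedef struct Partition {
    GObj objs[PART_CAP];
    int count;
} Partition;

static Partition parts[N_PARTS];

static void collect_partition(int i);

/* Place o in partition i, collecting that partition first if it is full.
 * Returns 1 on success, 0 if there is still no room. */
static int place(int i, GObj o) {
    assert(i >= 0 && i < N_PARTS);
    if (parts[i].count == PART_CAP)
        collect_partition(i);
    if (parts[i].count == PART_CAP)
        return 0;
    parts[i].objs[parts[i].count++] = o;
    return 1;
}

/* Garbage collect partition i: dead objects are dropped and survivors
 * are moved to partition i+1. */
static void collect_partition(int i) {
    Partition *p = &parts[i];
    GObj old[PART_CAP];
    int n = p->count;
    for (int k = 0; k < n; k++)
        old[k] = p->objs[k];
    p->count = 0;
    for (int k = 0; k < n; k++) {
        if (!old[k].live)
            continue;                        /* garbage: reclaimed */
        if (i + 1 < N_PARTS && place(i + 1, old[k]))
            continue;                        /* promoted */
        p->objs[p->count++] = old[k];        /* nowhere older to go: stays */
    }
}

/* New objects are always allocated into partition 0. */
int gen_alloc(int id, int live) {
    GObj o = { id, live };
    return place(0, o);
}
```

Most collections touch only partition 0, which is small and mostly garbage if objects do tend to die young. Older partitions are collected only when promotion fills them. A real collector would also have to track pointers from older partitions into younger ones; this sketch sidesteps that by using the `live` flag.
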
Generational Garbage Collection