An Efficient Machine-Independent Procedure for Garbage Collection in Various List Structures
Schorr and Waite, CACM, August 1967, pp
Curtis Dunham
CS 395T Memory Management, Spring 2011
The technique is not called “Mark-Sweep” in this work
The SCHEME-79 paper (Holloway, Steele, Sussman, and Bell) was the first to use this label
The verbs “trace” and “mark” are used, however
Credit for the idea is given to John McCarthy (Lisp, 1960)
Section 4c of his Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I describes the same process: use the sign bit to mark memory cells, then scan the heap
The algorithm was independently discovered by L. Peter Deutsch
Words in memory are called “registers”
IBM 7094: 36-bit words; 144 KB
0.5 MHz, or more charitably, 500 kHz
Three options for reclaiming memory:
1. Programmer does it manually
2. Reference counting
3. “Garbage collection”, as McCarthy suggested
McCarthy pretty much solved this, right?
Three problems, our authors claim:
1. Algorithms for “tracing the lists” (today, traversing the directed graph) at that time were expensive in either time or space
2. Signed numbers, or, “Hey, I was using that sign bit”
3. Heap objects occupying multiple words
3 bits of the 36-bit words in a 7094 are for a prefix field
Use one of those bits to indicate what is actually stored in that word:
1. the number of words that make up the heap object, or
2. the heap object itself
[Figure: heap object layout, with labels “3”, “A word”, “Another word”, and “data”]
Use a large heap object; its “interior” words then have usable sign bits
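As a rough illustration of the two ideas above (a prefix bit that flags a word holding a length, and interior words that keep their sign bits), here is a minimal Python sketch. The bit positions, the 15-bit length field, and the names `MARK_BIT`, `HEADER_BIT`, and `make_object` are all assumptions made for illustration; they are not the 7094's or the paper's actual layout.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1

# Hypothetical bit assignments, chosen only for this sketch.
MARK_BIT = 1 << 35            # top ("sign") bit, used as the mark
HEADER_BIT = 1 << 34          # prefix bit: "this word holds a word count"
LENGTH_MASK = (1 << 15) - 1   # assumed width of the count field


def make_object(payload):
    """Build a multi-word object: one header word, then the payload words.

    Payload words are stored unchanged, so all 36 of their bits
    (including the sign bit) remain available to the program.
    """
    total = len(payload) + 1
    assert total <= LENGTH_MASK
    assert all(0 <= w <= WORD_MASK for w in payload)
    return [HEADER_BIT | total] + list(payload)


def set_mark(obj):
    """Mark the object by setting the mark bit of its header word only."""
    obj[0] |= MARK_BIT


def is_marked(obj):
    return bool(obj[0] & MARK_BIT)


def length(obj):
    """Number of words in the object, read from the header word."""
    return obj[0] & LENGTH_MASK
```

Only the header word gives up its top bit to the collector in this sketch; a payload word that happens to have its sign bit set is left untouched by marking.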
Must run in fixed space: 3 words
  No queue for breadth-first
  No stack growth for recursive techniques, e.g. depth-first
Must trace efficiently: 2 visits
  Bounded visits per object
Must handle circular lists: That, too.
Mutator and collector do not run concurrently
  The heap can be arbitrarily modified, so long as it is correct when the mutator resumes
In a depth-first recursive scheme, the stack provides “breadcrumbs” that allow us to find our way back
Reverse pointers as they are traversed
  Easy to do, easy to undo, and provides “breadcrumbs” for free (in terms of space)
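The pointer-reversal idea can be sketched on a plain one-way list. This is a minimal Python illustration, not the paper's code: the `Cell` class and `walk_and_return` are hypothetical names, and the two Python lists exist only to record the visit order for inspection (the traversal itself needs just a few variables). Mark bits, which are what make cycles safe, are left out here.

```python
class Cell:
    """Hypothetical list cell with a single tail link (illustration only)."""

    def __init__(self, name, tail=None):
        self.name = name
        self.tail = tail


def walk_and_return(head):
    """Walk to the end of a list and back without a stack.

    Going forward, each tail pointer is reversed to point at the previous
    cell. Going back, each reversal is undone, restoring the original list.
    """
    forward, backward = [], []   # bookkeeping for the demo only

    # Forward pass: reverse each link as it is traversed.
    prev, cur = None, head
    while cur is not None:
        forward.append(cur.name)
        nxt = cur.tail
        cur.tail = prev          # leave a "breadcrumb" pointing back
        prev, cur = cur, nxt

    # Backward pass: follow the breadcrumbs, restoring each link.
    cur, nxt = prev, None
    while cur is not None:
        backward.append(cur.name)
        back = cur.tail
        cur.tail = nxt           # put the original pointer back
        nxt, cur = cur, back

    return forward, backward
```

For a list a, b, c, the forward pass visits a, b, c and the backward pass visits c, b, a, after which every tail pointer holds its original value.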
Mark (sign bit), Depth (prefix bit), Head (car), Tail (cdr)
Trace through (follow) Tail elements:
1. Reverse pointers
2. Set mark bits
Trace through (follow) Tail elements again, only this time the traversal is reversed:
1. Reverse pointers (again)
2. Check Head fields for pointers. If any are found, recursively apply this procedure.
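The steps above can be sketched in Python as a tail-first pointer-reversal marker. This follows the common textbook formulation of Schorr-Waite marking rather than the paper's exact control flow. The `Cons` class, the function name, and the `in_head` flag are hypothetical; the flag records that a cell's head field currently holds the back pointer, which may be the job of the Depth bit on the slide, though that is a guess. Anything that is not a `Cons` is treated as an atom.

```python
class Cons:
    """Hypothetical two-field cell; non-Cons values are treated as atoms."""

    def __init__(self, head=None, tail=None):
        self.head = head        # car
        self.tail = tail        # cdr
        self.mark = False       # set once the cell has been reached
        self.in_head = False    # True while head holds the back pointer


def mark_reachable(root):
    """Mark every cell reachable from root using only prev/cur and temps."""
    prev, cur = None, root
    while True:
        # Follow the tail chain: mark each cell and reverse its tail pointer.
        while isinstance(cur, Cons) and not cur.mark:
            cur.mark = True
            nxt = cur.tail
            cur.tail = prev
            prev, cur = cur, nxt

        # Back out of cells whose head has already been traced,
        # restoring each head pointer on the way.
        while prev is not None and prev.in_head:
            back = prev.head
            prev.head = cur
            prev.in_head = False
            cur, prev = prev, back

        if prev is None:
            return              # backed out past the root: done

        # prev's tail chain is finished: restore its tail pointer,
        # move the back pointer into head, and trace the old head.
        back = prev.tail
        prev.tail = cur
        nxt = prev.head
        prev.head = back
        prev.in_head = True
        cur = nxt
```

Because a marked cell is never entered a second time, shared and circular structure is handled without extra bookkeeping, and every head and tail field holds its original value again when the function returns.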
This technique will mark everything on the heap that is reachable.
A sweep of the entire heap is necessary to find all the unmarked words, which are returned to the free list.
This algorithm solves “one-way list” tracing. With modification, it can trace others (see Section 7 of the paper).
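A sweep over the whole heap might look like the following minimal Python sketch. The `Cell` class, the `next_free` link, and modeling the heap as a Python list are assumptions for illustration. Clearing the mark on surviving cells so they are ready for the next collection is a common convention, not something the slides state.

```python
class Cell:
    """Hypothetical heap cell with a mark bit and a free-list link."""

    def __init__(self, name):
        self.name = name
        self.mark = False
        self.next_free = None


def sweep(heap, free_list=None):
    """Scan every cell; push unmarked cells onto the free list.

    Returns the new head of the free list.
    """
    for cell in heap:
        if cell.mark:
            cell.mark = False            # survivor: reset for next cycle
        else:
            cell.next_free = free_list   # garbage: link into free list
            free_list = cell
    return free_list
```

Unlike the marking phase, the cost of this pass is proportional to the size of the heap, not to the amount of live data.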
Testbed: pathological 5-list, 20K-element depth-12 “mess”
  1.85 s for this work
  2.75 s for Wilkes
  0.448 s for breadth-first with a 48-element queue
Authors’ opinion
  If space is available: use the breadth-first approach
  Otherwise: this approach takes essentially no space; use it instead
Cannot handle circular lists
The authors mention a concern for portability
The final sweep through the heap is machine-dependent, but with just 6 special functions, the tracer/marking algorithm can be written in the higher-level language (Wisp)
Analogous to MMTk’s Java GC in Java
(other than this paper, cited on title slide)
Gries, David. The Schorr-Waite graph marking algorithm
McCarthy, John. Recursive functions of symbolic expressions and their computation by machine, Part I. CACM April
Holloway, Jack; Steele, Guy Lewis, Jr.; Sussman, Gerald Jay; Bell, Alan. The SCHEME-79 Chip
Previous slides by Sowmiya Chocka Narayanan
Experimental methodology
Performance of pointer reversals
Performance
  Human-noticeable delay in stop-the-world design
  Motivation for incrementality, not waiting for heap to fill up, etc.?
Significance of “no-space” design today