Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms
Martin T. Vechev, Eran Yahav, David F. Bacon
University of Cambridge; IBM T.J. Watson Research Center
PLDI, June 2006
Why Concurrent Garbage Collection?
- Java and C#: garbage-collected languages are prevalent
- Multicore: concurrency is becoming prevalent
- Cheap RAM: large heaps are becoming prevalent
- Real-time systems: more widely used
Existing Way to Create a Concurrent GC
- ENVIRONMENT: memory model, thread model, concurrency primitives, CPU primitives
- REQUIREMENTS: throughput, memory consumption, pause time
- TECHNIQUES: tracing / reference counting; moving; allocate white / black; Dijkstra / Steele / Yuasa barrier; atomic / incremental stack snapshot; write barrier atomic / non-atomic; color toggle, stacklets, etc.
- ?? -> Implementation: hard to verify/test, often buggy
- Did the monkey choose well??
Concurrent GC algorithms and proofs are hard
[Figure: ALGORITHMS and PROOFS, grouped under FAMILY and THEOREM PROVING. Legend: Incorrect, Correct, (C) Corrected. Entries as labeled on the slide: Ben-Ari Base '84, Dijkstra(C) '78, Doligez(C) '93, Azatchi '03, Domani '03, Yuasa '90, Pixley '88, Ben-Ari Base '84, Doligez '94, Ben-Ari Extended '84, Steele(C) '75, Boehm '91, Barabash '03, and one entry labeled only '03.]
Our Research Vision
- ENVIRONMENT (declarative specification): memory model, thread model, concurrency primitives, CPU primitives
- REQUIREMENTS: throughput, memory consumption, pause time
- Formally defined techniques
- Automated system -> optimal correct implementation
In This Work
- FIXED ENVIRONMENT: memory model, thread model, concurrency primitives, CPU primitives
- REQUIREMENTS: throughput, pause time, memory consumption
- Formally defined techniques for tracing, non-moving GC
- Automated system -> Algorithm 1, Algorithm 2, Algorithm 3, …, Algorithm N (drawn with "<" between successive algorithms)
Problem: Interference
SYSTEM = MUTATOR || GC
[Figure: objects A, B and C, shaded Traced / Not Traced, redrawn after each step]
1. GC traced B
2. Mutator links C to B
3. Mutator unlinks C from A
4. GC traced A
Result: C LOST
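The four steps above can be replayed on a toy heap. The following is a minimal sketch, not the authors' model: the dictionary-based heap, the single field name `f`, and the `trace` helper are assumptions made for illustration.

```python
# Toy heap (hypothetical representation): object name -> {field: target or None}.
heap = {"A": {"f": "C"}, "B": {"f": None}, "C": {}}
roots = ["A", "B"]

def trace(start, marked):
    """Mark everything reachable from `start` in the heap as it is right now."""
    stack = [start]
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(t for t in heap[obj].values() if t is not None)

collector_marked = set()
trace("B", collector_marked)   # 1. GC traces B; B.f is still empty
heap["B"]["f"] = "C"           # 2. mutator links C to B
heap["A"]["f"] = None          # 3. mutator unlinks C from A
trace("A", collector_marked)   # 4. GC traces A; its pointer to C is gone

# Ground truth: what is actually reachable from the roots in the final heap.
truly_live = set()
for root in roots:
    trace(root, truly_live)

lost = truly_live - collector_marked
print(sorted(lost))  # ['C']
```

C is reachable through B at the end, yet the collector never marked it, so a collector without any barrier would reclaim a live object.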
The 3 Families of Concurrent GC Algorithms
1. Mark C when C is linked to B (DIJKSTRA)
2. Mark C when the link to C is removed (YUASA)
3. Rescan B when C is linked to B (STEELE)
Solutions are applied uniformly for all objects
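The three rules can be sketched as write barriers on the same toy scenario (GC traces B, mutator links C to B, mutator unlinks C from A, GC traces A). This is an illustrative simplification under assumed names (`run`, `shade`, `write`, the string policy names); real collectors differ in many details such as atomicity and root scanning.

```python
def run(barrier):
    """Replay the interference scenario under one barrier policy.

    `barrier` is one of "none", "dijkstra", "yuasa", "steele"
    (hypothetical policy names used only in this sketch).
    Returns the set of objects the collector ends up marking.
    """
    heap = {"A": {"f": "C"}, "B": {"f": None}, "C": {}}
    marked, worklist = set(), []

    def shade(obj):
        # Mark an object and queue it so its fields get scanned later.
        if obj is not None and obj not in marked:
            marked.add(obj)
            worklist.append(obj)

    def scan(obj):
        for target in heap[obj].values():
            shade(target)

    def write(src, field, new):
        old = heap[src][field]
        if barrier == "dijkstra":
            shade(new)                 # mark the newly linked target
        elif barrier == "yuasa":
            shade(old)                 # mark the target being unlinked
        elif barrier == "steele" and src in marked:
            worklist.append(src)       # schedule the modified source for rescan
        heap[src][field] = new

    marked.add("B"); scan("B")         # 1. GC traces B
    write("B", "f", "C")               # 2. mutator links C to B
    write("A", "f", None)              # 3. mutator unlinks C from A
    marked.add("A"); scan("A")         # 4. GC traces A
    while worklist:                    # collector drains remaining work
        scan(worklist.pop())
    return marked

for policy in ("none", "dijkstra", "yuasa", "steele"):
    print(policy, sorted(run(policy)))
```

With no barrier the collector marks only A and B; each of the three policies also marks C, by a different route.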
Contributions
- Systematic exploration: a new parametric model of concurrent GC
- Better understanding
- New algorithms, potentially useful
- Formal relationship between algorithms
- Space: relative precision between algorithms
- Sharing proof burden: correctness-preserving "transformations"
A Parametric Concurrent GC Skeleton
Intuition: factor out as much as possible
- Record the interaction history between collector and mutator during tracing
- Collector exposes "hidden objects" based on the entire interaction history
A Parametric Concurrent GC Skeleton
[Figure: one complete garbage collection. COLLECTOR: mark, Expose(L,D), mark, Expose(L,D), …, reclaim. MUTATOR: Change Heap, interleaved with the collector steps.]
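The mark / Expose / reclaim loop can be sketched as follows. This is a simplified illustration, not the authors' skeleton: the mutator is assumed to have stopped, the log L is a plain list of pointer writes, and the dimensions D are collapsed into a single hypothetical `expose` policy function. All names here are assumptions.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Heap = Dict[str, Dict[str, Optional[str]]]
# One log entry per pointer write: (source object, old target, new target).
Event = Tuple[str, Optional[str], Optional[str]]
Expose = Callable[[List[Event], Set[str]], Set[str]]

def mark(heap: Heap, marked: Set[str], pending: List[str]) -> None:
    """Mark everything reachable from the pending objects."""
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue
        marked.add(obj)
        pending.extend(t for t in heap[obj].values() if t is not None)

def finish_cycle(heap: Heap, marked: Set[str], log: List[Event],
                 expose: Expose) -> Set[str]:
    """Alternate expose and mark until nothing new is exposed, then reclaim."""
    while True:
        hidden = expose(log, marked) - marked
        if not hidden:
            break
        mark(heap, marked, sorted(hidden))
    garbage = set(heap) - marked
    for obj in garbage:
        del heap[obj]
    return garbage

def expose_linked(log: List[Event], marked: Set[str]) -> Set[str]:
    """Policy sketch: expose every object that was the target of a new link."""
    return {new for (_src, _old, new) in log if new is not None}

def expose_unlinked(log: List[Event], marked: Set[str]) -> Set[str]:
    """Policy sketch: expose every object whose incoming link was overwritten."""
    return {old for (_src, old, _new) in log if old is not None}

# State after the interference scenario, plus an unreachable object G.
heap: Heap = {"A": {"f": None}, "B": {"f": "C"}, "C": {}, "G": {}}
marked = {"A", "B"}
log: List[Event] = [("B", None, "C"), ("A", "C", None)]

garbage = finish_cycle(heap, marked, log, expose_linked)
print(sorted(garbage), sorted(heap))  # ['G'] ['A', 'B', 'C']
```

Swapping `expose_linked` for `expose_unlinked` gives the same result on this log, since C is both the newly linked and the unlinked target.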
Dimensions: an intuition
The effect of each mutator/GC action is controlled by a dimension:
- Collector scans pointer: Wavefront Granularity
- Mutator allocates object: Allocation Color
- Mutator creates pointer: Counting
- Mutator overwrites pointer: Snapshot
Implementation Choice: Wavefront
- Per-Field Wavefront: exact information; one bit per field; more expensive; more synchronization; more garbage collected
- Per-Object Wavefront: approximate information; one bit per object; less expensive; less synchronization; less garbage collected
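The trade-off can be illustrated with two small wavefront trackers. This sketch assumes that a pointer store needs recording when it lands behind the wavefront (in a field the collector has already passed), and that the per-object variant conservatively treats the whole object as passed once any of its fields has been scanned. Both assumptions, and all class and function names, are illustrative rather than taken from the slides.

```python
class FieldWavefront:
    """Exact: one bit per (object, field) the collector has scanned."""
    def __init__(self):
        self.scanned = set()

    def pass_field(self, obj, field):
        self.scanned.add((obj, field))

    def is_behind(self, obj, field):
        return (obj, field) in self.scanned

class ObjectWavefront:
    """Approximate: one bit per object, set once scanning of it has started."""
    def __init__(self):
        self.started = set()

    def pass_field(self, obj, field):
        self.started.add(obj)

    def is_behind(self, obj, field):
        # The field is ignored: the whole object counts as already scanned.
        return obj in self.started

def must_record(wavefront, obj, field):
    """A store behind the wavefront is invisible to the collector, so log it."""
    return wavefront.is_behind(obj, field)

per_field, per_object = FieldWavefront(), ObjectWavefront()
for wf in (per_field, per_object):
    wf.pass_field("B", "f1")   # collector has scanned B.f1 but not yet B.f2

# Mutator now stores a pointer into B.f2.
print(must_record(per_field, "B", "f2"))   # False
print(must_record(per_object, "B", "f2"))  # True
```

The per-object tracker records a store the collector would have seen anyway, which is one way the coarser choice can end up retaining more objects.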
Choice: Record on Link or Unlink
- Record on Link: more synchronization, more garbage collected
- Record on Unlink: less synchronization, less garbage collected
Combined Choices
[Figure: grid of examples on objects A and B, combining Record on Link / Record on Unlink with Per-Field WF / Per-Object WF]
Combined Choices Per Object
[Figure: examples on objects A and B with the choice made separately for each object]
- Recording: Rec. Link A, Rec. Link B; Rec. Link A, Unlink B; Rec. Unlink A, Rec. Link B; Rec. Unlink A, Rec. Unlink B
- Wavefront: Per-Field A, Per-Field B; Per-Field A, Per-Obj B; Per-Obj A, Per-Field B; Per-Obj A, Per-Obj B
Correctness
Transformations = Proof Steps
START WITH A CORRECT ALGORITHM
[Figure: diagram of algorithms ordered from RETAIN LESS GARBAGE to RETAIN MORE GARBAGE. Nodes: APEX (U, U, U, U, {}), STEELE, DIJKSTRA (stacks,U,{},U,{}), STEELE-D, STEELE-YC, STEELE-D-YC, DIJKSTRA-OLD, DIJKSTRA-YC, STEELE-BC, HYBRID-YC (stacks,A,{},{},{}), STEELE-D-BC, DIJKSTRA-BC, YUASA (stacks, A, {}, {}, U).]
Relative Precision
- Intuition: an algorithm is more precise than another if it collects more garbage
- An algorithm that is less precise (more conservative) than a correct algorithm is guaranteed to be correct
- Should be a reference point for practical comparisons: no ad-hoc methods
- Hard to do manually: need a tool to provide insights
- Finding the "right" definition was harder than proving safety, yet simpler than "relative concurrency"
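The intuition can be illustrated by comparing, run by run, the sets of objects two collectors retain. The slides do not give the formal definition, so the subset-per-run comparison below is only an assumed reading of the intuition; the collectors P and Q and all data are made up for the example.

```python
from typing import Dict, Set

Runs = Dict[str, Set[str]]

def at_least_as_precise(retained_p: Runs, retained_q: Runs) -> bool:
    """P is at least as precise as Q if, on every run, P retains a subset of Q's objects."""
    return all(retained_p[run] <= retained_q[run] for run in retained_p)

def is_safe(retained: Runs, live: Runs) -> bool:
    """A collector is safe if it retains every live object on every run."""
    return all(live[run] <= retained[run] for run in live)

# Hypothetical data: live objects and retained objects for two runs.
live: Runs = {"run1": {"A", "B"}, "run2": {"A"}}
retained_p: Runs = {"run1": {"A", "B"}, "run2": {"A", "C"}}
retained_q: Runs = {"run1": {"A", "B", "C"}, "run2": {"A", "C"}}

print(at_least_as_precise(retained_p, retained_q))  # True
print(at_least_as_precise(retained_q, retained_p))  # False
print(is_safe(retained_p, live), is_safe(retained_q, live))  # True True
```

Because Q retains a superset of what P retains on every run, Q keeps every live object that P keeps, which is the sense in which a less precise variant of a safe collector stays safe in this toy model.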
Precision
[Figure: diagram of algorithms ordered from MORE PRECISE to LESS PRECISE. Nodes: APEX (U, U, U, U, {}), STEELE, DIJKSTRA (stacks,U,{},U,{}), STEELE-D, STEELE-YC, STEELE-D-YC, DIJKSTRA-OLD, DIJKSTRA-YC, STEELE-BC, HYBRID-YC (stacks,A,{},{},{}), STEELE-D-BC, DIJKSTRA-BC, YUASA (stacks, A, {}, {}, U).]
Conclusions
- Systematic exploration of an algorithm space
- Useful new algorithms
- Formal definition of relative precision between algorithms
- A first step towards automatic derivation of concurrent garbage collectors