Published by Paul McBride. Modified over 6 years ago.
1
High Performance Reference Counting and Conservative Garbage Collection
Rifat Shahriyar, Stephen M. Blackburn (Australian National University)
Kathryn S. McKinley (Microsoft Research)
2
Down for the Count? Getting Reference Counting Back in the Ring ISMM’12
What happened 53 years ago?
3
Why Reference Counting?
Advantages: immediacy; object-local; basic RC is easy
Disadvantages: cycles; performance
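The trade-off on this slide can be sketched with a toy reference counter. This is a minimal illustration, not the collector from the talk: the `Obj` class, the `write` barrier, and the `freed` log are hypothetical names introduced here. It shows immediacy (objects are reclaimed the moment their count reaches zero) and the cycle problem (two objects that refer to each other are never reclaimed).

```python
class Obj:
    """A toy heap object with an explicit reference count (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}

freed = []  # names of reclaimed objects, in reclamation order

def inc(obj):
    if obj is not None:
        obj.rc += 1

def dec(obj):
    """Drop one reference; reclaim immediately when the count reaches zero."""
    if obj is None:
        return
    obj.rc -= 1
    if obj.rc == 0:
        freed.append(obj.name)
        for child in obj.fields.values():
            dec(child)          # reclamation is object-local and recursive
        obj.fields.clear()

def write(src, field, new):
    """Write barrier: adjust counts whenever a reference field is overwritten."""
    old = src.fields.get(field)
    inc(new)
    src.fields[field] = new
    dec(old)

# Acyclic case: dropping the only root reference reclaims both objects at once.
a, b = Obj("a"), Obj("b")
inc(a)                 # a root refers to a
write(a, "next", b)    # a -> b
dec(a)                 # root dropped

# Cyclic case: x and y keep each other alive after the root is dropped.
x, y = Obj("x"), Obj("y")
inc(x)
write(x, "next", y)
write(y, "next", x)
dec(x)
```

After this runs, `freed` holds only `a` and `b`; `x` and `y` each still have a count of one, which is why a reference counting collector needs some additional mechanism for cycles.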
4
Can we get RC back in the ring?
Problem:
  One of the two fundamental GC algorithms
  Many advantages
  Neglected by performance-conscious VMs
So how much slower is it? 30%
5
RC vs. MS New RC ≈ MS
6
Summary
Old RC (< 2012)
  Performance: 30% slower than MS; 40% slower than production
New RC (2012)
  Limited bit count
  Optimization for new objects
  Performance: matches MS; still 10% slower than production
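The "limited bit count" item can be illustrated with a saturating counter. This is a sketch under stated assumptions: the two-bit width, the function names, and the idea that a backup tracing pass repairs saturated counts are choices made for this example, not details given on the slide.

```python
RC_BITS = 2                      # assumed width, for illustration only
STUCK = (1 << RC_BITS) - 1       # the saturated ("stuck") value, here 3

def inc(count):
    """Increment, saturating at STUCK."""
    return count if count == STUCK else count + 1

def dec(count):
    """Decrement, unless the count has stuck; a stuck count is no longer exact."""
    if count == STUCK:
        return STUCK
    return count - 1

def backup_trace_count(incoming_refs):
    """Assumed repair step: a tracing pass recomputes the count from the
    references it actually finds, clamped to the field width."""
    return min(incoming_refs, STUCK)

count = 0
for _ in range(5):
    count = inc(count)           # five references arrive; the field saturates
after_incs = count
for _ in range(5):
    count = dec(count)           # all five go away, but the field stays stuck
after_decs = count
repaired = backup_trace_count(0) # the trace finds no references
```

The design choice is a space/precision trade: a few header bits suffice when most objects have few references, at the cost of needing some other mechanism for the rare objects whose counts saturate.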
7
Taking Off the Gloves with Reference Counting Immix OOPSLA’13
8
Why So Slow?
Figure panels: GC; Total; Mutator
9
Looking a Little Deeper…
Figure panels: L1 D cache misses; instructions retired; time
Using Managed Runtime Systems to Tolerate Holes in Wearable Memories
10
Looking a Little Deeper…
Let's see which GC uses which allocator:
  RC and MS: free list
  SS and Immix: bump pointer
Figure panels: L1 D cache misses; instructions retired; time (free list vs. bump pointer)
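The two allocator styles named above can be contrasted with a small sketch. Both classes are hypothetical simplifications written for this example (fixed-size cells, no size classes, addresses as plain integers); they only show why consecutive allocations land next to each other with a bump pointer and can be scattered with a free list.

```python
class BumpAllocator:
    """Allocates contiguously by advancing a cursor through a region."""
    def __init__(self, start, size):
        self.cursor = start
        self.limit = start + size

    def alloc(self, nbytes):
        if self.cursor + nbytes > self.limit:
            return None          # region exhausted
        addr = self.cursor
        self.cursor += nbytes
        return addr

class FreeListAllocator:
    """Allocates fixed-size cells from a list of free cell addresses."""
    def __init__(self, free_cells):
        self.free = list(free_cells)

    def alloc(self):
        return self.free.pop(0) if self.free else None

    def release(self, addr):
        self.free.append(addr)

bump = BumpAllocator(start=0, size=64)
bump_addrs = [bump.alloc(16) for _ in range(4)]

# A free list after some churn: the free cells are scattered through the heap.
fl = FreeListAllocator([48, 0, 112, 16])
fl_addrs = [fl.alloc() for _ in range(4)]
```

Here the bump allocator returns four adjacent addresses, while the free list returns them in whatever order cells were released, which is the locality difference the cache-miss panel points at.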
11
RC Immix
Combines RC and Immix
Line/block reclamation:
  Line live-object count with object reference count
Exploit Immix's opportunistic copy:
  Observe that new objects can be copied by the first GC
  Observe that old objects can be copied by the backup GC
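The pairing of a per-line live-object count with per-object reference counts can be sketched as follows. This is a simplified illustration, not the RC Immix implementation: the `LineCounts` class and its method names are hypothetical, the 256-byte line size is an assumed value, and objects that span several lines are ignored.

```python
LINE_SIZE = 256  # bytes per line; an assumed value for this sketch

class LineCounts:
    """Tracks how many live objects start in each line (hypothetical helper)."""
    def __init__(self, num_lines):
        self.live = [0] * num_lines

    def line_of(self, addr):
        return addr // LINE_SIZE

    def object_became_live(self, addr):
        self.live[self.line_of(addr)] += 1

    def object_died(self, addr):
        """Call when an object's reference count drops to zero.
        Returns True if its line now holds no live objects."""
        line = self.line_of(addr)
        self.live[line] -= 1
        return self.live[line] == 0

    def free_lines(self):
        return [i for i, n in enumerate(self.live) if n == 0]

counts = LineCounts(num_lines=4)
for addr in (0, 64, 300):        # two objects in line 0, one in line 1
    counts.object_became_live(addr)

first = counts.object_died(0)    # line 0 still holds the object at 64
second = counts.object_died(64)  # line 0 is now empty and can be reused
reusable = counts.free_lines()
```

The point of the sketch is that reclamation happens at line granularity: memory becomes available to the bump-pointer allocator only when every object in a line is dead, rather than being returned cell by cell to a free list.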
12
Total time: 3% faster than Gen Immix (+6% worst case, -21% best case)
13
Summary: RC Immix
Object-local collection
Excellent mutator locality
Copying with RC
Great performance: outperforms fastest production (-3%)
Transforms RC
14
Fast Conservative Garbage Collection OOPSLA’14
What happened 53 years ago?
15
GC is Ubiquitous
GC implementations: exact and conservative
GC needs to find all live/dead objects, starting from the roots
Roots: all references into the heap held by the runtime, including stacks, registers, statics, and JNI
High-performance systems use exact GC
Conservative GC is popular, but is generally used in less performant systems
Figure: roots and heap under exact and conservative GC
16
Root Conservative GC
We are interested in root conservative GC, where references in the roots are not precisely known but references in heap objects are precisely known.
17
Why Conservative GC
Advantages:
  No cooperation needed from compiler and runtime (engineering accurate stack maps is challenging)
  Enables some compiler optimizations
Disadvantages:
  Must handle ambiguous references
  Performance
We are interested in managed languages.
Reference counting has some interesting advantages. Our goal is to make it faster than production. Zoom in on the result.
18
Performance of Conservative GC
BDW suffers 12% and MCC suffers 45% overhead
19
Ambiguous References
Pointers? Retain their referents and all transitively reachable objects (excess retention)
Values? Do not modify them, and pin the referents (pinning)
Corrupt heap? Guarantee validation before updating per-object metadata (filtering)
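The three obligations above (retain, pin, filter) can be sketched as a single pass over the words found in the roots. This is a minimal illustration under stated assumptions: the heap bounds, the 8-byte alignment, the `filter_ambiguous_roots` function, and the use of a set of object start addresses as the validation structure are all hypothetical, and interior pointers are not handled.

```python
HEAP_START, HEAP_END = 0x1000, 0x2000   # assumed heap bounds
ALIGN = 8                               # assumed object alignment

def filter_ambiguous_roots(words, object_starts):
    """Return the addresses to treat as pinned, live referents."""
    pinned = []
    for w in words:
        if not (HEAP_START <= w < HEAP_END):
            continue            # outside the heap: plainly not a reference
        if w % ALIGN != 0:
            continue            # misaligned: cannot be an object start
        if w not in object_starts:
            continue            # no object begins here: never touch its "metadata"
        pinned.append(w)        # plausible reference: retain and pin, never move
    return pinned

object_starts = {0x1000, 0x1040, 0x1800}
stack_words = [0x1040, 42, 0x1044, 0x1808, 0x1800, 0x3000]
pinned = filter_ambiguous_roots(stack_words, object_starts)
```

Only `0x1040` and `0x1800` survive filtering. Because either word might really be an integer, the collector must leave the word itself unmodified and keep the object where it is.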
20
Non-moving: Boehm-Demers-Weiser (BDW), widely used
Free-list allocator
Mark-sweep trace to reclaim garbage
Problems:
  Free-list allocation has worse locality than contiguous allocation
  With object type precision, an overly restrictive design
21
Mostly copying, aka Bartlett style, with many variants
Two twists over the classic semi-space:
  To-space and from-space are linked lists of discontiguous pages
  Promotes pages referenced by ambiguous roots
Problems:
  Semi-space suffers from huge collection cost
  Space waste due to page-level pinning
  Objects can't span pages and the allocator can't use pinned pages
22
Total time: RC Immixcons matches production Gen Immix
23
Summary
Conservative GC: dominated by BDW and MCC; significant overheads; heap organization is key to performance
New designs: low-overhead object map; Immix line-based pinning
Conservative RC Immix: matches fastest production
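The difference between page-level pinning and line-based pinning can be made concrete with a little arithmetic. This is an illustrative sketch only: the `pinned_bytes` helper, the 4096-byte page, the 256-byte line, and the three example objects are assumptions chosen for the example, not measurements from the talk.

```python
def pinned_bytes(pinned_objects, granule):
    """pinned_objects: list of (addr, size) pairs for pinned objects.
    Returns how many bytes become immovable at the given pinning granularity."""
    granules = set()
    for addr, size in pinned_objects:
        first = addr // granule
        last = (addr + size - 1) // granule
        granules.update(range(first, last + 1))
    return len(granules) * granule

# Three small objects that happen to be referenced by ambiguous roots.
pinned = [(0, 32), (5000, 64), (9000, 300)]

page_level = pinned_bytes(pinned, 4096)   # whole pages are pinned
line_level = pinned_bytes(pinned, 256)    # only the covering lines are pinned
```

With these inputs, pinning by page immobilizes three pages (12288 bytes) to protect 396 bytes of objects, while pinning by line immobilizes four lines (1024 bytes). A finer granule leaves far more of the heap free for copying and reuse.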
24
Conclusion