
Connectivity-Based Garbage Collection Martin Hirzel University of Colorado at Boulder Collaborators: Amer Diwan, Michael Hind, Hal Gabow, Johannes Henkel,


1 Connectivity-Based Garbage Collection
Martin Hirzel, University of Colorado at Boulder
Collaborators: Amer Diwan, Michael Hind, Hal Gabow, Johannes Henkel, Matthew Hertz

2 Garbage Collection Benefits
Garbage collection leads to simpler
  Design → no complex deallocation protocols
  Implementation → automatic deallocation
  Maintenance → fewer bugs
Benefits are widely accepted: Java, C#, Python, …

3 Garbage Collection: Haven’t we solved this problem yet?
For a state-of-the-art garbage collector:
  – time: ~14% of execution time
  – space: 3x high watermark
  – pauses: 0.8 seconds
Can reduce any one cost
Challenge: reduce all three costs

4 Example Heap
[Figure: example heap with objects o1 to o15, stack variables s1 and s2, and global variable g]
Boxes: heap objects
Arrows: pointers
Long box: stack + global variables

5 Thesis
[Figure: example heap with objects o1 to o15, plus stack + globals]
1. Objects form distinct data structures
2. Connected objects die together
3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently

6 Experimental Infrastructure
JikesRVM Research Virtual Machine
  – From IBM Research
  – Written in Java
  – Application and runtime system share heap → good garbage collection even more important
Benchmarks
  – SPECjvm98 suite and SPECjbb2000
  – Java Olden suite
  – xalan, ipsixql, nfc, jigsaw

7 Outline
Garbage Collector Design Principles
Family of Garbage Collectors
Design Space Exploration
Pointer Analysis for Java

8 Garbage Collector Design Principles: “Do partial collections.”
Don’t collect the full heap every time → shorter pause times
[Figure: example heap with objects o1 to o15, plus stack + globals]

9 Garbage Collector Design Principles: “Predict lifetime based on age.”
Generational hypothesis: most objects die young
Generational garbage collection:
  – Partition by age
  – Collect young objects most often → low time overhead
That’s the state of the art.
[Figure: example heap split into a young generation and an old generation, plus stack + globals]

10 Garbage Collector Design Principles: Generational GC Problems
Regular full collections → long peak pause
Old-to-young pointers → need bookkeeping
~37.5% long-lived objects → pig in the python
[Figure: example heap split into a young generation and an old generation, plus stack + globals]

11 Garbage Collector Design Principles: “Collect connected objects together.”
Likelihood that two objects die at the same time:

Connectivity          Likelihood
Any pair              33.1%
Weakly connected      46.3%
Strongly connected    72.4%
Direct pointer        76.4%

[Figure: the Example column showed a small diagram of objects o1 and o2 for each kind of connectivity]

12 Garbage Collector Design Principles: “Focus on objects with few ancestors.”
→ Short-lived objects are easy to collect

Lifetime    Median number of ancestor objects
Short       2 objects
Long        83,324 objects

13 Garbage Collector Design Principles: “Predict lifetime based on roots.”
[Figure: objects o1 to o4 reached from a stack variable s and a global variable g]

                              Lifetime
Objects reachable …           Short     Long
indirectly from stack         25.6%     16.2%
only directly from stack      32.9%      0.8%
from globals                   4.0%     20.5%
Total                         62.5%     37.5%

For details, see our [ISMM’02] paper.
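
The root-based categories in the table can be approximated with a reachability pass. What follows is a minimal Python sketch, not the measurement code behind the [ISMM’02] numbers. The function names, the dictionary encoding of the heap, and the rule that reachability from globals takes precedence over reachability from the stack are all assumptions made for illustration.

```python
from collections import deque


def reachable(starts, heap):
    """Return every object reachable from `starts` by following pointers."""
    seen = set(starts)
    work = deque(starts)
    while work:
        obj = work.popleft()
        for succ in heap.get(obj, ()):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen


def classify(heap, stack_roots, global_roots):
    """Label each live object by the kind of root that reaches it.

    heap maps an object name to the list of objects it points to.
    Assumption: "globals" wins over the two stack categories.
    """
    from_globals = reachable(global_roots, heap)
    from_stack = reachable(stack_roots, heap)
    live = from_globals | from_stack
    # Objects that have at least one incoming pointer from a live heap object.
    pointed_to = {succ for obj in live for succ in heap.get(obj, ())}
    labels = {}
    for obj in live:
        if obj in from_globals:
            labels[obj] = "globals"
        elif obj in pointed_to:
            labels[obj] = "stack-indirect"
        else:
            labels[obj] = "stack-direct-only"
    return labels


# Hypothetical heap loosely shaped like the slide's figure.
example_heap = {"o1": ["o2"], "o2": [], "o3": [], "o4": []}
example_labels = classify(example_heap, stack_roots=["o1", "o3"], global_roots=["o4"])
```

Under the slide's numbers, objects in the "stack-direct-only" class would be the best candidates for frequent collection, since only 0.8% of all objects are both long-lived and in that class.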

14 Outline
Garbage Collector Design Principles
Family of Garbage Collectors
Design Space Exploration
Pointer Analysis for Java

15 Family of Garbage Collectors: Connectivity-Based Garbage Collection (CBGC)
[Figure: example heap divided into partitions p1 to p4, plus stack + globals]
Do partial collections.
Collect connected objects together.
Predict lifetime based on age.
Focus on objects with few ancestors.
Predict lifetime based on roots.

16 Family of Garbage Collectors: Components of CBGC
Before allocation:
  1. Partitioning: decide into which partition to put each object
Collection algorithm:
  2. Estimator: estimate dead + live objects for each partition
  3. Chooser: choose “good” set of partitions
  4. Partial collection: collect chosen partitions

17 Family of Garbage Collectors: Partitioning Problem
Find fine-grained partitions, where
  – Partition edges respect pointers
  – Objects don’t move between partitions
[Figure: example heap divided into partitions p1 to p4, plus stack + globals]

18 Family of Garbage Collectors: Partitioning Solutions
Pointer analysis
Type-based [Harris]
  – o1 may point to o2 if o1 has a field of a type compatible with o2
Constraint-based [Andersen]
  – We will discuss this later in the talk
[Figure: example heap divided into partitions p1 to p4, plus stack + globals]
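
A type-based analysis of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the partitioning code used in the talk: the inputs `field_types` and `subtypes` are hypothetical, and so is the choice to merge mutually reachable types into one partition via strongly connected components so that the partition graph has no cycles.

```python
def type_edges(field_types, subtypes):
    """Type A may point to type B if A declares a field whose type is compatible with B."""
    edges = {cls: set() for cls in field_types}
    for cls, declared_fields in field_types.items():
        for declared in declared_fields:
            # Without subtype information, a field can only hold its declared type.
            for target in subtypes.get(declared, {declared}):
                edges[cls].add(target)
                edges.setdefault(target, set())
    return edges


def strongly_connected_components(edges):
    """Tarjan's algorithm (recursive, so intended for small graphs only)."""
    index, low, stack, on_stack, components = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in edges[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(frozenset(component))

    for v in list(edges):
        if v not in index:
            visit(v)
    return components


def partition_types(field_types, subtypes=None):
    """Map every type to its partition (a frozenset of mutually reachable types)."""
    edges = type_edges(field_types, subtypes or {})
    return {cls: comp for comp in strongly_connected_components(edges) for cls in comp}


# Hypothetical classes: Item and Owner point at each other, so they share a partition.
example_partitions = partition_types(
    {"List": ["Node"], "Node": ["Node", "Item"], "Item": ["Owner"], "Owner": ["Item"]}
)
```

Because an object's type is fixed at allocation, a partitioning derived from types satisfies the requirement that objects do not move between partitions.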

19 Family of Garbage Collectors: Estimator Problem
For each partition, guess
dead
  – Objects that can be reclaimed
  – Pay-off
live
  – Objects that must be traversed
  – Cost
[Figure: partitions p1 to p4 and stack + globals, with the partitions annotated “3 dead + 3 live”, “1 dead + 2 live”, “2 dead + 0 live”, and “2 dead + 2 live”]

20 Family of Garbage Collectors: Estimator Solutions
Heuristics
  – Connected objects die together
  – Most objects die young
  – Objects reachable from globals live long
  – The past predicts the future
[Figure: partitions p1 to p4 and stack + globals, with the partitions annotated “3 dead + 3 live”, “1 dead + 2 live”, “2 dead + 0 live”, and “2 dead + 2 live”]
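
One way the heuristics above might combine into a per-partition estimate is sketched below. Everything here is hypothetical: the `PartitionStats` fields, the default `young_death_rate` of 0.9, and the way the heuristics are weighted are assumptions for illustration, not the estimator evaluated in the talk.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    total_objects: int            # objects currently in the partition
    allocated_since_gc: int       # "young" objects, allocated since the last collection
    reachable_from_globals: bool  # does a global variable reach this partition?
    past_survival_rate: float     # fraction of objects that survived the last collection


def estimate(stats, young_death_rate=0.9):
    """Return a (dead, live) guess for one partition."""
    # "Objects reachable from globals live long": expect no pay-off.
    if stats.reachable_from_globals:
        return 0, stats.total_objects
    old_objects = stats.total_objects - stats.allocated_since_gc
    # "Most objects die young": assume a fixed share of new objects is dead.
    dead_young = stats.allocated_since_gc * young_death_rate
    # "The past predicts the future": older objects die at the previously observed rate.
    dead_old = old_objects * (1.0 - stats.past_survival_rate)
    dead = round(dead_young + dead_old)
    return dead, stats.total_objects - dead
```

The first heuristic, “connected objects die together”, is reflected in the granularity: estimates are made per partition rather than per object.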

21 Family of Garbage Collectors: Chooser Problem
Pick subset of partitions
  – Maximize total dead
  – Minimize total live
  – Closed under predecessor relation → no bookkeeping for external pointers
[Figure: partitions p1 to p4 and stack + globals, with the partitions annotated “3 dead + 3 live”, “1 dead + 2 live”, “2 dead + 0 live”, and “2 dead + 2 live”; the chosen subset totals “7 dead + 5 live”]

22 Family of Garbage Collectors: Chooser Solutions
Optimal algorithm based on network flow [TR]
Simpler, greedy algorithm
[Figure: partitions p1 to p4 and stack + globals, with the partitions annotated “3 dead + 3 live”, “1 dead + 2 live”, “2 dead + 0 live”, and “2 dead + 2 live”; the chosen subset totals “7 dead + 5 live”]
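
The slides do not spell out the greedy algorithm, so the following Python sketch shows only one plausible greedy strategy. The objective (estimated dead minus estimated live), the function names, and the partition edges in the example are assumptions. The dead/live counts reuse the labels from the figure, but their assignment to p1 to p4 is a guess.

```python
def predecessor_closure(partition, preds):
    """Return `partition` together with every partition that can point into it."""
    closure, work = {partition}, [partition]
    while work:
        current = work.pop()
        for pred in preds.get(current, ()):
            if pred not in closure:
                closure.add(pred)
                work.append(pred)
    return closure


def choose(estimates, preds):
    """Greedily grow a predecessor-closed set while the net gain stays positive.

    estimates maps a partition to its (dead, live) guess;
    preds maps a partition to the partitions that may point into it.
    """
    chosen = set()
    while True:
        best_extra, best_gain = None, 0
        for partition in estimates:
            if partition in chosen:
                continue
            # Adding a partition forces adding all of its predecessors.
            extra = predecessor_closure(partition, preds) - chosen
            gain = sum(estimates[p][0] - estimates[p][1] for p in extra)
            if gain > best_gain:
                best_extra, best_gain = extra, gain
        if best_extra is None:
            return chosen
        chosen |= best_extra


# Hypothetical partition graph: p1 points into p2 and p3, and p3 points into p4.
example_estimates = {"p1": (3, 3), "p2": (1, 2), "p3": (2, 0), "p4": (2, 2)}
example_preds = {"p2": ["p1"], "p3": ["p1"], "p4": ["p3"]}
example_choice = choose(example_estimates, example_preds)
```

With this particular objective the sketch picks p1 and p3. A different weighting of pay-off against cost would pick a different set, which is the kind of trade-off the optimal network-flow formulation resolves exactly.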

23 Family of Garbage Collectors: Partial Collection Problem
Look only at chosen partitions
Traverse reachable objects
Reclaim unreachable objects
[Figure: chosen partitions p2, p3, and p4 with their objects, the rest of the heap, and stack + globals]

24 Family of Garbage Collectors: Partial Collection Solutions
Generalize canonical full-heap algorithms
  – Mark and sweep [McCarthy’60]
  – Semi-space copying [Cheney’70]
  – Treadmill [Baker’92]
[Figure: chosen partitions p2, p3, and p4 with their objects, the rest of the heap, and stack + globals]
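
As an illustration of generalizing a full-heap algorithm, here is a minimal Python sketch of mark and sweep restricted to a chosen set of partitions. The data layout (`heap`, `partition_of`) and the function name are assumptions. The sketch relies on the chooser's guarantee that the chosen set is closed under the predecessor relation, so no object outside the chosen partitions points into them.

```python
def partial_mark_sweep(heap, partition_of, roots, chosen):
    """Collect only the partitions in `chosen`; return the set of reclaimed objects.

    heap maps each object to the list of objects it points to.
    partition_of maps each object to its partition.
    roots are the objects referenced from stack + globals.
    """
    marked = set()
    # Mark: start from roots that land in chosen partitions and never leave them.
    work = [obj for obj in roots if partition_of[obj] in chosen]
    while work:
        obj = work.pop()
        if obj in marked:
            continue
        marked.add(obj)
        for succ in heap[obj]:
            if partition_of[succ] in chosen and succ not in marked:
                work.append(succ)
    # Sweep: unmarked objects in chosen partitions are unreachable.
    garbage = {obj for obj in heap if partition_of[obj] in chosen} - marked
    for obj in garbage:
        del heap[obj]
    return garbage


# Hypothetical heap: p1 points into p2, and p2 points into the unchosen p3.
example_heap = {"o1": ["o3"], "o2": [], "o3": ["o5"], "o4": ["o5"], "o5": []}
example_partition_of = {"o1": "p1", "o2": "p1", "o3": "p2", "o4": "p2", "o5": "p3"}
example_garbage = partial_mark_sweep(
    example_heap, example_partition_of, roots=["o1", "o5"], chosen={"p1", "p2"}
)
```

Objects in the rest of the heap, such as o5 here, are neither traversed nor reclaimed, which is what keeps the pause proportional to the chosen partitions.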

25 Outline
Garbage Collector Design Principles
Family of Garbage Collectors
Design Space Exploration
Pointer Analysis for Java

26 Design Space Exploration: Questions
How good is a naïve CBGC?
How good could CBGC be in 20 years?
How well does CBGC do in a JVM?

27 Design Space Exploration: Simulator Methodology
Garbage collection simulator (under GPL)
  – Uses traces of allocations and pointer writes from our benchmark runs
Simulator advantages
  – Easier to implement variety of collector algorithms
  – Know entire trace beforehand: can use that for “in 20 years” experiments
Simulator disadvantages
  – No bottom-line performance numbers
Currently adding CBGC to JikesRVM

28 Design Space Exploration: How good is a naïve CBGC?
[Figure: three bar charts (cost in time, cost in space, pause times) for the benchmarks jack, xalan, jbb, and javac, comparing three collectors]
  – Full-heap: semi-space copying
  – CBGC-naïve: type-based partitioning [Harris], heuristics estimator
  – Appel: copying generational

29 Design Space Exploration: How good could CBGC be in 20 years?
[Figure: three bar charts (cost in time, cost in space, pause times) for the benchmarks jack, xalan, jbb, and javac, comparing three collectors]
  – Full-heap: semi-space copying
  – CBGC-oracles: partitioning and estimator based on trace
  – Appel: copying generational

30 Design Space Exploration: How good could CBGC be in 20 years?
CBGC with oracles beats Appel
  – We did not find a “performance wall”
  – CBGC has potential
The performance gap between CBGC with oracles and naïve CBGC is large
  – Research challenges

31 How well does CBGC do in a Java virtual machine?
Implementation in progress
Need a pointer analysis for the partitioning

32 Outline
Garbage Collector Design Principles
Family of Garbage Collectors
Design Space Exploration
Pointer Analysis for Java

33 Pointer Analysis for Java: Which analysis do we need?
[Figure: bar chart of cost in time for the benchmarks jack, xalan, jbb, and javac, comparing five collector configurations]
  – Full-heap: semi-space copying
  – CBGC: type-based partitioning [Harris]
  – CBGC: type-based partitioning (oracles)
  – CBGC: allocation site partitioning (oracles) [Andersen]
  – Appel: copying generational

34 Pointer Analysis for Java: Andersen’s Analysis

                          What                         When
Constraint generation     Model flow of pointers       Ahead-of-time compilation
Constraint propagation    Find fixed-point solution    Ahead-of-time compilation

Properties:
  – Allocation-site granularity
  – Set-inclusion constraints
  – Flow and context insensitive
Problem: can’t analyze Java ahead of time!
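
The two phases in the table can be illustrated with a small set-inclusion solver. This is a textbook-style Python sketch of an Andersen-style analysis, not the implementation from the talk. The tuple encodings of the four statement kinds and the function name `andersen` are assumptions, and the naive iterate-until-stable loop stands in for a real worklist algorithm.

```python
from collections import defaultdict


def andersen(allocs, copies, loads, stores):
    """Solve set-inclusion constraints to a fixed point.

    allocs: (p, site)    for  p = new T()  at allocation site `site`
    copies: (p, q)       for  p = q
    loads:  (p, q, f)    for  p = q.f
    stores: (p, f, q)    for  p.f = q
    Returns points-to sets for variables and for (site, field) pairs.
    """
    pts = defaultdict(set)
    changed = True

    def include(dst, new_sites):
        """Enforce pts[dst] ⊇ new_sites and remember whether anything grew."""
        nonlocal changed
        before = len(pts[dst])
        pts[dst] |= new_sites
        if len(pts[dst]) != before:
            changed = True

    for p, site in allocs:
        pts[p].add(site)

    # Flow-insensitive: statement order is ignored, so iterate until nothing changes.
    while changed:
        changed = False
        for p, q in copies:
            include(p, pts[q])
        for p, q, f in loads:
            for site in list(pts[q]):
                include(p, pts[(site, f)])
        for p, f, q in stores:
            for site in list(pts[p]):
                include((site, f), pts[q])
    return pts


# a = new A(); b = new B(); a.f = b; c = a; d = c.f
example_pts = andersen(
    allocs=[("a", "A1"), ("b", "B1")],
    copies=[("c", "a")],
    loads=[("d", "c", "f")],
    stores=[("a", "f", "b")],
)
```

Objects are abstracted by their allocation site (here the hypothetical labels A1 and B1), which matches the allocation-site granularity listed among the properties.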

35 Pointer Analysis for Java: Andersen for all of Java

Constraint generation
  What: model flow of pointers
  When:
    – VM build and start-up
    – Class loading
    – Type resolution
    – Method compilation (JIT)
    – Execution of reflection
    – Execution of native code
Constraint propagation
  What: find next fixed-point solution
  When: points-to information used (before garbage collection)

Do as little as possible as late as possible
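
The “as little as possible, as late as possible” policy can be sketched as a solver that only records constraints when events occur and propagates when a client asks for results. This Python sketch is deliberately reduced to allocation and copy constraints. The class name, the method names, and the `propagations` counter are hypothetical and are not part of JikesRVM or the analysis presented in the talk.

```python
from collections import defaultdict


class LazyPointsTo:
    """Record constraints eagerly, propagate only when results are needed."""

    def __init__(self):
        self.pts = defaultdict(set)
        self.copies = set()      # (dst, src) means pts[dst] ⊇ pts[src]
        self.dirty = False       # have constraints changed since the last propagation?
        self.propagations = 0    # how often a fixed point was computed

    def on_alloc(self, var, site):
        """Event such as method compilation finding `var = new T()` at `site`."""
        if site not in self.pts[var]:
            self.pts[var].add(site)
            self.dirty = True

    def on_copy(self, dst, src):
        """Event such as method compilation finding `dst = src`."""
        if (dst, src) not in self.copies:
            self.copies.add((dst, src))
            self.dirty = True

    def query(self, var):
        """Called when points-to information is used, e.g. before a collection."""
        if self.dirty:
            self._propagate()
        return set(self.pts[var])

    def _propagate(self):
        """Find the next fixed point, starting from the previous solution."""
        self.propagations += 1
        changed = True
        while changed:
            changed = False
            for dst, src in self.copies:
                new_sites = self.pts[src] - self.pts[dst]
                if new_sites:
                    self.pts[dst] |= new_sites
                    changed = True
        self.dirty = False
```

Because each propagation starts from the previous solution, a query after a quiet period costs nothing, which is consistent with the claim that costs diminish once program behavior stabilizes.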

36 Pointer Analysis for Java: Correctness Properties
[Figure: timeline on which constraint generation and constraint propagation alternate]
If there is a pointer, then the results predict it
Cannot do any better for Java!

37 Pointer Analysis for Java: Analysis Cost

              Constraint    Constraint propagation
              generation    Eager               At GC               At End
Benchmark     Seconds       Count    Seconds    Count    Seconds    Count    Seconds
compress      21.4            130        3.2        5       40.4        1       67.4
db            20.1            143        3.6        5       42.9        1       71.4
mtrt          20.3            265        2.1        5       46.2        1       68.1
mpegaudio     20.6            319        2.2        5       46.1        1       66.6
jack          21.2            397        4.2        7       49.0        1       78.2
jess          22.3            733        6.8        8       49.7        1       85.7
javac         21.1          1,107        5.9       10       87.4        1      187.6
xalan         20.1          1,728        4.9        8       85.7        1      215.7

→ Expensive, but once behavior stabilizes, costs diminish to zero

38 Pointer Analysis for Java: Validation
Lots of corner cases
  – Dynamic class loading
  – Reflection
  – Native code
Missing any one leads to nasty bugs
  – CBGC relies on conservative results
We performed validation runs
  – Check analysis results against pointers in heap during garbage collection
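
A validation run of the kind described above amounts to checking every concrete heap pointer against the analysis results. The Python sketch below assumes a hypothetical representation: heap pointers as `(source, field, target)` triples, a `site_of` map from objects to allocation sites, and points-to sets keyed by `(site, field)`.

```python
def validate(heap_pointers, site_of, pts):
    """Return the heap pointers that the analysis failed to predict.

    heap_pointers: iterable of (src_obj, field, dst_obj) found while tracing the heap
    site_of:       maps each object to the allocation site that created it
    pts:           maps (site, field) to the set of sites the field may point to
    """
    violations = []
    for src, field, dst in heap_pointers:
        predicted = pts.get((site_of[src], field), set())
        if site_of[dst] not in predicted:
            violations.append((src, field, dst))
    return violations


# Hypothetical data: the pointer o1.next -> o3 is not covered by the analysis results.
example_site_of = {"o1": "A1", "o2": "B1", "o3": "C1"}
example_pts = {("A1", "f"): {"B1"}}
example_violations = validate(
    [("o1", "f", "o2"), ("o1", "next", "o3")], example_site_of, example_pts
)
```

An empty result means the analysis was conservative for this heap snapshot. Any entry points at a missed corner case such as reflection or native code.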

39 Wrapping Up

40 Related Work: Using Program Analysis for Garbage Collection
Stack allocation [ParkGoldberg’92, …]
Regions [TofteTalpin’97, …]
Liveness analysis [AgesenDetlefsMoss’98, …]
Early reclamation [Harris’99]
Thread-local heaps [Steensgaard’00, …]
Object inlining [DolbyChien’00]
Write-barrier removal [ZeeRinard’02, Shuf’02]
…

41 Related Work: Pointer analyses for Java
Andersen’s analysis for “static Java”
  – [RountevMilanovaRyder’01]
  – [LiangPenningsHarrold’01]
  – [WhaleyLam’02]
  – [LhotakHendren’03]
Weaker analyses with dynamic class loading
  – DOIT: [PechtchanskiSarkar’01]
  – XTA: [QianHendren’04]
  – Ruf’s escape analysis: [BogdaSingh’01, King’03]
Demand-driven / incremental analysis

42 Other Research Interests
Accuracy of Garbage Collection [M.S.Thesis, ISMM’00, ECOOP’01, TOPLAS’02]
Profiling [FDDO’01, Patent’01a]
Dynamic Optimizations, Prefetching [PLDI’02, Patent’02b]
Future directions:
  – More techniques for performance improvement
  – Reducing bugs, improving productivity

43 Contributions presented in this talk
Connectivity-based GC design principles [ISMM’02]
CBGC, a new family of garbage collectors; design space exploration with simulator [OOPSLA’03]
First non-trivial pointer analysis for Java [ECOOP’04 (to appear)]
http://www.cs.colorado.edu/~hirzel

