Mark Marron (1), Deepak Kapur (2), Manuel Hermenegildo (1)
(1) IMDEA Software (Spain)
(2) University of New Mexico
Want to identify regions (sets of objects) that are conceptually related
Conceptually related:
  Same recursive data structure
  Stored in equivalent locations (e.g., same array)
Extract information via static analysis
Apply memory optimizations on regions instead of over the entire heap:
  Region Allocation/Collection
  Region/Parallel GC
  Optimized Layout
Must be Dynamic
  Variable-based partitions are too coarse and do not represent composition well.
  Allocation-site-based partitions are too imprecise and can cause spurious grouping of objects.
Must be Repartitionable
  Want to track the program splitting and merging regions: list append, subset operations.
Base on storage shape graph
  Nodes represent sets of objects (or recursive data structures); edges represent sets of pointers
  Has a natural representation of heap regions and the relations between them
  Efficient
Annotate nodes and edges with additional instrumentation properties
  For region identification only type information is needed
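The storage shape graph described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names (ShapeNode, ShapeEdge, ShapeGraph), the field labels, and the type labels in the usage below are hypothetical and do not come from the analysis implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a node abstracts a set of objects (a region).
final class ShapeNode {
    final int id;
    // Type information, the only annotation needed for region identification.
    final Set<String> types = new TreeSet<>();

    ShapeNode(int id, String... typeNames) {
        this.id = id;
        types.addAll(Arrays.asList(typeNames));
    }
}

// Hypothetical sketch: an edge abstracts a set of pointers stored in one field.
final class ShapeEdge {
    final ShapeNode from;
    final String field;
    final ShapeNode to;

    ShapeEdge(ShapeNode from, String field, ShapeNode to) {
        this.from = from;
        this.field = field;
        this.to = to;
    }
}

final class ShapeGraph {
    final List<ShapeNode> nodes = new ArrayList<>();
    final List<ShapeEdge> edges = new ArrayList<>();

    ShapeNode addNode(String... typeNames) {
        ShapeNode n = new ShapeNode(nodes.size(), typeNames);
        nodes.add(n);
        return n;
    }

    void addEdge(ShapeNode from, String field, ShapeNode to) {
        edges.add(new ShapeEdge(from, field, to));
    }

    // Regions directly pointed to from the given region.
    List<ShapeNode> successors(ShapeNode n) {
        List<ShapeNode> out = new ArrayList<>();
        for (ShapeEdge e : edges) {
            if (e.from == n) out.add(e.to);
        }
        return out;
    }
}
```

A region for a tree and a region for the bodies it refers to would be two nodes joined by one edge, e.g. `g.addEdge(tree, "children", bodies)`.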
Recursive Structures
  Group objects representing the same recursive structure, keep distinct from other recursive structures
References
  Group objects stored in similar sets of locations together (objects in A, in B, in both A and B)
Composite Structures
  Group objects in each subcomponent, group similar components hierarchically
The general approach to identifying recursive data structures is well known
  Look at type information to determine which objects may be part of a recursive structure
  Based on connectivity, group these recursive objects together
Two subtle distinctions made in this work
  Only group objects in a complete recursive structure
  Ignore back pointers when computing complete recursive structures
class Enode {
    Enode[] fromN;
    …
}
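The type-based step (deciding which objects may be part of a recursive structure) can be sketched as a reachability check over declared field types. This is a minimal sketch, assuming the type information is available as a map from class name to the class names its fields or array elements may point to; the class RecursiveTypes and that map are hypothetical, and the sketch does not model the handling of back pointers.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecursiveTypes {
    // Returns true if 'type' can reach itself by following field types,
    // i.e. objects of this type may form a recursive structure.
    static boolean isRecursive(String type, Map<String, List<String>> fieldTypes) {
        Deque<String> work = new ArrayDeque<>(
                fieldTypes.getOrDefault(type, Collections.<String>emptyList()));
        Set<String> seen = new HashSet<>();
        while (!work.isEmpty()) {
            String t = work.pop();
            if (t.equals(type)) return true;   // found a cycle back to 'type'
            if (seen.add(t)) {                 // first visit: follow its fields too
                work.addAll(fieldTypes.getOrDefault(t, Collections.<String>emptyList()));
            }
        }
        return false;
    }
}
```

For the Enode class above, the map entry would be `"Enode" -> ["Enode"]` (through the fromN array), so Enode is reported as recursive.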
Grouping objects that are in the same container or in related composite structures is more difficult
Given regions R, R’, when do they represent conceptually equivalent sets of objects?
  Stored in the same types of locations (variables, collections, referred to by the same object fields)
  Have the same type of recursive signature (can split leaf contents of recursive structures from the internal recursive component)
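One way to read the two criteria above is as a signature that regions must share before they are grouped. The following is a rough sketch under that assumption; RegionSignature, RegionGrouping, and the string encoding of locations (such as "field:Body.acc") are hypothetical names chosen for illustration, not the analysis's actual representation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

final class RegionSignature {
    final Set<String> locations;       // kinds of locations that refer into the region
    final Set<String> recursiveTypes;  // types in the recursive core (empty if none)

    RegionSignature(Set<String> locations, Set<String> recursiveTypes) {
        this.locations = new TreeSet<>(locations);
        this.recursiveTypes = new TreeSet<>(recursiveTypes);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RegionSignature)) return false;
        RegionSignature s = (RegionSignature) o;
        return locations.equals(s.locations) && recursiveTypes.equals(s.recursiveTypes);
    }

    @Override
    public int hashCode() {
        return Objects.hash(locations, recursiveTypes);
    }
}

final class RegionGrouping {
    // Groups region ids with equal signatures; each group is a candidate merged region.
    static Collection<List<Integer>> group(Map<Integer, RegionSignature> regions) {
        Map<RegionSignature, List<Integer>> groups = new LinkedHashMap<>();
        for (Map.Entry<Integer, RegionSignature> e : regions.entrySet()) {
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return groups.values();
    }
}
```

Two regions referred to only by the same field and with the same (possibly empty) recursive core land in one group; a region reached from a different location or with a different recursive core stays separate.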
N-Body Simulation in 3 dimensions
Uses Fast Multi-Pole method with space decomposition tree
  For nearby bodies use naive n^2 algorithm
  For distant bodies compute center of mass of many bodies and treat as a single point mass
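The distant-body approximation replaces a group of bodies by one point mass located at their mass-weighted average position, with the group's total mass. A minimal sketch of that computation follows; the PointMass class is hypothetical and is not the benchmark's own body or cell type.

```java
// Hypothetical helper type: a mass at a position in 3 dimensions.
final class PointMass {
    final double mass, x, y, z;

    PointMass(double mass, double x, double y, double z) {
        this.mass = mass;
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Collapses a non-empty group of bodies (with positive total mass)
    // into a single point mass at the group's center of mass.
    static PointMass centerOfMass(PointMass[] bodies) {
        double m = 0, cx = 0, cy = 0, cz = 0;
        for (PointMass b : bodies) {
            m += b.mass;
            cx += b.mass * b.x;
            cy += b.mass * b.y;
            cz += b.mass * b.z;
        }
        return new PointMass(m, cx / m, cy / m, cz / m);
    }
}
```

For example, a body of mass 1 at the origin and a body of mass 3 at (4, 0, 0) collapse to a point of mass 4 at (3, 0, 0).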
for(…) {
    // rebuild the space decomposition tree
    root = null;
    makeTree();

    // first pass over the bodies
    Iterator bm = this.bodyTabRev.iterator();
    while(bm.hasNext())
        bm.next().hackGravity(root);

    // second pass over the bodies
    Iterator bp = this.bodyTabRev.iterator();
    while(bp.hasNext())
        bp.next().propUpdatedAccel();
}
Statically collect the space decomposition tree and all MathVector/double[] objects (11% of GC work).
GC objects reachable from the acc/vel fields in parallel with the hackGravity method (no overhead).
Inline double[] into MathVector objects: 23% serial speedup and 37% reduction in memory use.
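At the source level, the effect of this inlining can be pictured as replacing the separately allocated array with fields stored directly in the vector object. The sketch below is only an illustration: the class names MathVectorBoxed and MathVectorInlined, the field names, and the dot method are hypothetical, and the optimization itself would be applied by the compiler or runtime rather than by hand.

```java
// Before: every vector points to a second heap object holding its components.
final class MathVectorBoxed {
    final double[] data = new double[3];

    double dot(MathVectorBoxed o) {
        double s = 0;
        for (int i = 0; i < 3; i++) s += data[i] * o.data[i];
        return s;
    }
}

// After: the three components live in the vector object itself,
// so there is one allocation and no extra pointer dereference.
final class MathVectorInlined {
    double x, y, z;

    double dot(MathVectorInlined o) {
        return x * o.x + y * o.y + z * o.z;
    }
}
```

Both layouts compute the same results; only the number of heap objects and pointer indirections per vector changes.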
Benchmark   LOC   Analysis Time   Analysis Memory   Region Ok
tsp               s               <30 MB            Y
em3d              s               <30 MB            Y
voronoi           s               <30 MB            Y
bh                s               <30 MB            Y
db                s               <30 MB            Y
raytrace          s               38 MB             Y
exp               s               48 MB             Y
debug             s               122 MB            Y
Simple interpreter and debug environment for a large subset of the Java language
  14,000+ LOC (in normalized form), 90 classes
  Additional 1500 LOC for specialized standard library handling stubs
Large recursive call structures, large inheritance trees with numerous virtual method implementations
Wide range of data structure types, extensive use of java.util collections; heap contains both shared and unshared structures
Region information provides an excellent basis for driving many memory optimizations and supporting other analysis work
A simple set of heuristics (when taking into account a few subtleties) is sufficient for grouping memory objects
Recent work shows excellent scalability on non-trivial programs
Further work: developing a robust infrastructure for further evaluation and applications