Using Generational Garbage Collection To Implement Cache-conscious Data Placement
Trishul M. Chilimbi & James R. Larus
Presenter: ראובן ביק
Introduction
• Main memory access cost is increasing
• Goal: to improve cache locality
• Introduces a technique for using a (copying) generational GC to reorganize data, so that objects with high temporal affinity are placed next to each other and are therefore likely to reside in the same cache block
Contents
• Background
  – copying GC
  – generational GC
• The method
  – profiling instrumentation
  – affinity graph
  – algorithm steps
• Results & conclusions
Copying GC
• Two memory areas, FROMSPACE and TOSPACE
• When FROMSPACE is full, the collector moves all live objects from FROMSPACE to TOSPACE
Copying GC (Cheney algorithm)
• Breadth-first scan of the tree of live objects
• One continuous scan of TOSPACE
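The breadth-first copy described on the Cheney slide can be sketched in a few lines of Python. This is a toy model, not the collector from the talk: `Obj`, `cheney_collect`, and the use of a Python list as TOSPACE are assumptions made for illustration. The list index `scan` plays the role of the unprocessed pointer, and the end of the list plays the role of the free pointer.

```python
class Obj:
    """A toy heap object: a name plus a list of references to other objects."""
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)
        self.forward = None  # forwarding pointer, set once the object is copied


def cheney_collect(roots):
    """Copy everything reachable from roots into a fresh TOSPACE list."""
    tospace = []

    def copy(obj):
        # Copy obj on first visit; afterwards just follow its forwarding pointer.
        if obj.forward is None:
            obj.forward = Obj(obj.name, obj.fields)  # fields still point into FROMSPACE
            tospace.append(obj.forward)              # advances the free pointer
        return obj.forward

    new_roots = [copy(r) for r in roots]
    scan = 0                                # the "unprocessed" pointer
    while scan < len(tospace):              # len(tospace) is the free pointer
        obj = tospace[scan]
        obj.fields = [copy(child) for child in obj.fields]
        scan += 1
    return new_roots, tospace


# A small tree: A -> (B, C), B -> (D, E), C -> (F); G is unreachable garbage.
d, e, f = Obj("D"), Obj("E"), Obj("F")
b, c = Obj("B", [d, e]), Obj("C", [f])
a = Obj("A", [b, c])
garbage = Obj("G")
new_roots, tospace = cheney_collect([a])
order = [o.name for o in tospace]
```

In this example the objects land in TOSPACE level by level, and the unreachable object is never copied. A parent and its grandchildren can therefore end up far apart, which is the layout the cache-conscious variant tries to improve on.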
Why Generational GC?
• Most objects live a very short time, while a small percentage of them live much longer
• Problem: repeated copying of old objects
Generational GC
• Segregate objects into multiple areas by age
• Scavenge older objects less frequently
• Copy long-surviving objects to older generations
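As a rough illustration of promotion by age, the sketch below models one scavenge of the young generation. Everything here is hypothetical: the `PROMOTE_AFTER` threshold, the `scavenge_young` function, and the representation of generations as a dict and a set are assumptions chosen to keep the example short, and real collectors use different promotion policies.

```python
PROMOTE_AFTER = 2  # hypothetical threshold: scavenges survived before promotion


def scavenge_young(young, old, live):
    """One scavenge of the young generation.

    young: dict mapping object name -> number of scavenges survived so far
    old:   set of names already in the older generation (mutated in place)
    live:  set of names that are still reachable
    Returns the new young generation.
    """
    survivors = {}
    for name, age in young.items():
        if name not in live:
            continue                   # garbage: simply not copied
        if age + 1 >= PROMOTE_AFTER:
            old.add(name)              # long-lived: promote to the older generation
        else:
            survivors[name] = age + 1  # stays young, one scavenge older
    return survivors


old_gen = set()
young_gen = scavenge_young({"a": 0, "b": 0, "c": 1}, old_gen, live={"a", "c"})
```

Here `b` is dropped as garbage, `a` stays young with its age incremented, and `c` is promoted, so later young-generation scavenges no longer copy it.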
Real time data access profiling
• Real time profiling is more effective than an earlier training run
• Must be low overhead
• Done by a modified compiler
• Upon each access, the object address is written into an object access buffer
Profiling is low-overhead
• Implemented at object, not field, granularity
• Most object accesses are not lightweight to begin with, so the extra buffer write adds relatively little
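The instrumentation idea can be sketched as follows. This only imitates in Python what the modified compiler would emit. `AccessBuffer`, `load_field`, the use of `id()` as a stand-in for an object address, and the overwrite-oldest policy when the buffer fills are all assumptions, since the slides do not say how a full buffer is handled.

```python
from collections import deque
from types import SimpleNamespace


class AccessBuffer:
    """Hypothetical object access buffer of fixed capacity."""
    def __init__(self, capacity):
        # Assumption: when full, the oldest entries are overwritten.
        self.entries = deque(maxlen=capacity)

    def record(self, obj):
        # One entry per object access; the accessed field is deliberately not logged.
        self.entries.append(id(obj))


def load_field(buffer, obj, field):
    """What an instrumented field load might do: log the object, then read."""
    buffer.record(obj)
    return getattr(obj, field)


buf = AccessBuffer(capacity=4)
p, q = SimpleNamespace(x=1, y=2), SimpleNamespace(x=3, y=4)
total = load_field(buf, p, "x") + load_field(buf, p, "y") + load_field(buf, q, "x")
```

Reading two different fields of `p` produces two identical entries, which reflects object-level rather than field-level granularity.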
Affinity graph
• Based on the object access buffer
• Created prior to a scavenge
• A separate graph for each generation
• Nodes = objects
• Edges = affinity between objects
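Affinity graph example (four-step figure build; the graph drawings themselves are not recoverable)
• Object access buffer: A D F G D A C C A F D G A C
• Step 1: nodes A and D, joined by an edge of weight 1
• Step 2: node F is added, with two new edges of weight 1
• Step 3: node G is added, with two new edges of weight 1
• Step 4: two of the existing edge weights rise from 1 to 2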
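One way to build such a graph is to slide a small window over the access buffer and strengthen the edge between each newly accessed object and the objects seen just before it. The sketch below is an assumption rather than the authors' exact procedure. The function name `build_affinity_graph`, the window size of 3, and the choice to ignore repeated accesses to the same object are all illustrative. A window of 3 does, however, reproduce the edge counts shown in the four example steps.

```python
from collections import Counter, deque


def build_affinity_graph(access_buffer, window=3):
    """Return {frozenset({a, b}): weight} for objects accessed close together."""
    weights = Counter()
    recent = deque(maxlen=window - 1)  # distinct objects seen just before the current one
    for obj in access_buffer:
        for other in recent:
            if other != obj:
                weights[frozenset((obj, other))] += 1
        if obj in recent:
            recent.remove(obj)         # keep each object at most once in the window
        recent.append(obj)
    return weights


# The first five accesses from the example buffer: A D F G D
graph = build_affinity_graph("ADFGD")
```

After these five accesses the graph has five edges. A-D, A-F and F-G have weight 1, while D-F and D-G have weight 2, because the second access to D falls within the window of both F and G.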
Cache-conscious copying algorithm, Step 1
• From the set of roots, pick the one with the highest-weight affinity edge
• Perform a greedy depth-first traversal of the affinity graph starting from this node
• While traversing, copy each visited object to TOSPACE
Cache-conscious copying algorithm, Step 2
• Process all objects between the unprocessed and free pointers, using the Cheney algorithm
Cache-conscious copying algorithm, Step 3
• Cleanup: copy any roots not yet present in TOSPACE, and process them using the Cheney algorithm
Note
• This algorithm is not used in the youngest generation (where new objects are allocated and most of the garbage is generated)
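The three steps can be put together in a small simulation that records only the order in which objects reach TOSPACE. This is a sketch under several assumptions, not the implementation from the talk. The heap is a dict of names, the affinity graph is a dict keyed by `frozenset` pairs, ties are broken alphabetically, and every node in the affinity graph is assumed to be live. The names `cache_conscious_copy`, `neighbours`, and `best_edge` are hypothetical.

```python
def cache_conscious_copy(heap, roots, affinity):
    """heap: {name: [child names]}; affinity: {frozenset({a, b}): weight}.

    Returns the order in which objects are copied to TOSPACE.
    Assumption: every object named in the affinity graph is live.
    """
    tospace, copied = [], set()

    def copy(name):
        if name not in copied:
            copied.add(name)
            tospace.append(name)

    def neighbours(name):
        # Affinity-graph neighbours of name, strongest edge first.
        edges = [(w, next(iter(pair - {name})))
                 for pair, w in affinity.items() if name in pair]
        return [n for w, n in sorted(edges, key=lambda e: (-e[0], e[1]))]

    def best_edge(name):
        return max((w for pair, w in affinity.items() if name in pair), default=0)

    # Step 1: greedy depth-first walk of the affinity graph from the hottest root.
    stack = [max(roots, key=best_edge)]
    while stack:
        name = stack.pop()
        if name in copied:
            continue
        copy(name)
        stack.extend(reversed(neighbours(name)))  # strongest neighbour is popped first

    scan = 0

    def cheney_scan():
        # Process everything between the unprocessed (scan) and free pointers.
        nonlocal scan
        while scan < len(tospace):
            for child in heap[tospace[scan]]:
                copy(child)
            scan += 1

    cheney_scan()      # Step 2
    for root in roots:  # Step 3: roots that were never reached
        copy(root)
    cheney_scan()
    return tospace


heap = {"R1": ["A", "B"], "R2": ["C"], "A": ["D"], "B": [], "C": [], "D": []}
affinity = {frozenset(("R1", "D")): 3, frozenset(("D", "B")): 2, frozenset(("R2", "C")): 1}
layout = cache_conscious_copy(heap, ["R1", "R2"], affinity)
```

In this example R1, D and B are placed next to each other because of their affinity edges, even though D is a grandchild of R1. A plain breadth-first copy from the same roots would have placed R2, A, B and C between R1 and D.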
Results
• Tested on five programs written in the Cecil language
• Run on a Sun machine with 2 GB of memory and a two-level cache, running Solaris
Conclusions
• This is an attractive technique: it reduces cache miss rates by 21-42% and improves program performance by 14-37% compared with the commonly used alternative, Cheney's algorithm