Download presentation
Presentation is loading. Please wait.
Published byWilfrid Chandler Modified over 9 years ago
1
380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads & experimental methodology –Alias analysis –Dependence analysis –Loop transformations –EDGE architectures 1
2
2 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Write barriers: Friend or foe? Generational Beltway –More performance
3
3 Basic VM Structure Executing Program Program/Bytecode Dynamic Compilation Subsystem Class Loader Verifier, etc. Heap Thread Scheduler Garbage Collector
4
4 True or False? Real programmers use languages with explicit memory management. –I can optimize my memory management much better than any garbage collector
5
5 True or False? Real programmers use languages with explicit memory management. –I can optimize my memory management much better than any garbage collector –Scope of effort?
6
6 Why Use Garbage Collection? Software engineering benefits –Less user code compared to explict memory management (MM) –Less user code to get correct –Protects against some classes of memory errors No free(), thus no premature free(), no double free(), or forgetting to free() Not perfect, memory can still leak –Programmers still need to eliminate all pointers to objects the program no longer needs Performance: space time tradeoff –Time proportional to dead objects (explicit mm, reference counting) or live objects (semispace, marksweep) –Throughput versus pause time Less frequent collection, typically reduces total time but increases space requirements and pause times –Hidden locality benefits?
7
7 What is Garbage? In theory, any object the program will never reference again –But compiler & runtime system cannot figure that out In practice, any object the program cannot reach is garbage –Approximate liveness with reachability Managed languages couple GC with “safe” pointers –Programs may not access arbitrary addresses in memory –The compiler can identify and provide to the garbage collector all the pointers, thus –“Once garbage, always garbage” –Runtime system can move objects by updating pointers –“Unsafe” languages can do non-moving GC by assuming anything that looks like a pointer is one.
8
8 Reachability stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... Compiler produces a stack-map at GC safe-points and Type Information Blocks GC safe points: new(), method entry, method exit, & back- edges (thread switch points) Stack-map: enumerate global variables, stack variables, live registers -- This code is hard to get right! Why? Type Information Blocks: identify reference fields in objects
9
9 Reachability stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... Compiler produces a stack-map at GC safe-points and Type Information Blocks Type Information Blocks: identify reference fields in objects for each type i (class) in the program, a map 302 TIB i
10
10 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark
11
11 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark
12
12 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark
13
13 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them All unmarked objects are dead, and can be reclaimed stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark
14
14 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them All unmarked objects are dead, and can be reclaimed stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... sweep
15
15 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Write barriers: Friend or foe? Generational Beltway –More performance
16
16 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space
17
17 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space
18
18 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space
19
19 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space
20
20 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers heap from spaceto space
21
21 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space
22
22 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space
23
23 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space
24
24 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse heap from spaceto space
25
25 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse –start allocating again into “to space” heap from spaceto space
26
26 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse –start allocating again into “to space” heap from spaceto space
27
27 Semispace Notice: 4fast allocation 4locality of contemporaneously allocated objects 4locality of objects connected by pointers 8wasted space heap from spaceto space
28
28 Marksweep Free-lists organized by size –blocks of same size, or –individual objects of same size Most objects are small < 128 bytes 4 8 12 16... 128 free lists... heap
29
29 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap
30
30 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap
31
31 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap
32
32 Marksweep Allocation –Grab a free object off the free list –No more memory of the right size triggers a collection –Mark phase - find the live objects –Sweep phase - put free ones on the free list 4 8 12 16... 128 free lists... heap
33
33 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap
34
34 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap
35
35 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap
36
36 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list –can be made incremental by organizing the heap in blocks and sweeping one block at a time on demand 4 8 12 16... 128 free lists... heap
37
37 Marksweep 4space efficiency 4Incremental object reclamation 8relatively slower allocation time 8poor locality of contemporaneously allocated objects 4 8 12 16... 128 free lists... heap
38
38 How do these differences play out in practice? Marksweep 4space efficiency 4Incremental object reclamation 8relatively slower allocation time 8poor locality of contemporaneously allocated objects Semispace 4fast allocation 4locality of contemporaneously allocated objects 4locality of objects connected by pointers 8wasted space
39
39 Methodology [SIGMETRICS 2004] Compare Marksweep (MS) and Semispace (SS) Mutator time, GC time, total time Jikes RVM & MMTk replay compilation measure second iteration without compilation Platforms 1.6GHz G5 (PowerPC 970) 1.9GHz AMD Athlon 2600+ 2.6GHz Intel P4 Linux 2.6.0 with perfctr patch & libraries – Separate accounting of GC & Mutator counts SPECjvm98 & pseudojbb
40
40 Allocation Mechanism Bump pointer – ~70 bytes IA32 instructions, 726MB/s Free list – ~140 bytes IA32 instructions, 654MB/s Bump pointer 11% faster in tight loop – < 1% in practical setting – No significant difference (?)
41
41 Mutator Time
42
42 jess
43
43 jess
44
44 jess
45
45 jess
46
46 javac
47
47 pseudojbb
48
48 Geometric Mean Mutator Time
49
49 Garbage Collection Time
50
50 Garbage Collection Time Geometric mean pseudojbb jess javac
51
51 Total Time
52
52 Total Time pseudojbb Geometric mean jess javac
53
53 MS/SS Crossover: 1.6GHz PPC
54
54 MS/SS Crossover: 1.9GHz AMD
55
55 MS/SS Crossover: 2.6GHz P4
56
56 MS/SS Crossover: 3.2GHz P4
57
57 MS/SS Crossover 2.6GHz 1.9GHz 1.6GHz localityspace 3.2GHz
58
58 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Enabling mechanisms –write barrier & remembered sets Heap organizations –Generational –Beltway –Performance comparisons
59
59 One Big Heap? Pause times –it takes to long to trace the whole heap at once Throughput –the heap contains lots of long lived objects, why collect them over and over again? Incremental collection –divide up the heap into increments and collect one at a time. Increment 1 Increment 2 to spacefrom spaceto spacefrom space
60
60 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose to spacefrom spaceto spacefrom space Increment 1 Increment 2
61
61 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose to spacefrom spaceto spacefrom space Increment 1 Increment 2
62
62 Incremental Collection to spacefrom spaceto spacefrom space Increment 1 Increment 2 Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose
63
63 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose Mechanism: Write barrier records pointers between increments when the mutator installs them, conservative approximation of reachability to spacefrom spaceto spacefrom space Increment 1 Increment 2
64
64 Write barrier compiler inserts code that records pointers between increments when the mutator installs them // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) U p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g} a b c d e f gt u v w x y z
65
65 Write barrier Install new pointer d -> v // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) U p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g} a b c d e f gt u v w x y z
66
66 Write barrier Install new pointer d -> v, then update d-> y // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) = p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d} a b c d e f gt u v w x y z
67
67 Write barrier Install new pointer d -> v, then update d-> y // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) = p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z
68
68 Write barrier At collection time collector re-examines all entries in the remset for the increment, treating them like roots Collect Increment 2 to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z
69
69 Write barrier At collection time collector re-examines all entries in the remset for the increment, treating them like roots Collect Increment 2 to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z
70
70 Summary of the costs of incremental collection write barrier to catch pointer stores crossing boundaries remsets to store crossing pointers processing remembered sets at collection time excess retention to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z
71
71 Heap Organization What objects should we put where? Generational hypothesis –young objects die more quickly than older ones [Lieberman & Hewitt’83, Ungar’84] –most pointers are from younger to older objects [Appel’89, Zorn’90] ÜOrganize the heap in to young and old, collect young objects preferentially to space from space Young Old
72
72 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
73
73 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
74
74 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
75
75 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
76
76 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
77
77 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
78
78 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
79
79 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
80
80 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
81
81 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
82
82 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old
83
83 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to spacefrom spaceto space Young Old
84
84 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces - ignore remembered sets –Generalizing to m generations if space n < m fills up, collect n through n-1 to spacefrom spaceto space Young Old
85
85 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces - ignore remembered sets –Generalizing to m generations if space n < m fills up, collect 1 through n-1 to spacefrom spaceto space Young Old
86
86 Generational Write Barrier Unidirectional barrier 4record only older to younger pointers 8no need to record younger to older pointers, since we never collect the old space independently most pointers are from younger to older objects [Appel’89, Zorn’90] track the barrier between young objects and old spaces to space from space Young Old address barrier
87
87 Generational Write Barrier to space from space Young Old unidirectional boundary barrier // original program // compiler support for incremental collection p.f = o; if (p > barrier && o < barrier) { remset nursery U p.f; } p.f = o;
88
88 Generational Write Barrier Unidirectional 4record only older to younger pointers 8no need to record younger to older pointers, since we never collect the old space independently –most pointers are from younger to older objects [Appel’89, Zorn’90] –most mutations are to young objects [Stefanovic et al.’99] to space from space Young Old
89
89 Results
90
90 Garbage Collection Time
91
91 Mutator Time
92
92 Total Time
93
McKinley, UT Recap Copying improves locality Incrementality improves responsiveness Generational hypothesis –Young objects: Most very short lived Infant mortality: ~90% die young (within 4MB of alloc) –Old objects: most very long lived (bimodal) Mature morality: ~5% die each 4MB of new allocation Help from pointer mutations –In Java, pointers go in both directions, but older to younger pointers across many objects are rare less than 1% –Most mutations among young objects 92 to 98% of pointer mutations
94
380C Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Can we get mutator locality, space efficiency, and collector efficiency all in one collector? –Read: Blackburn and McKinley, Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 22-32, Tucson AZ, June 2008. –Why you need to care about workloads & methodology –Alias analysis –Dependence analysis –Loop transformations –EDGE architectures 94
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.