Published by Allan Walters. Modified over 9 years ago.
1
Cork: Dynamic Memory Leak Detection with Garbage Collection
Maria Jump, Kathryn S. McKinley
{mjump,mckinley}@cs.utexas.edu
Department of Computer Sciences
2
Department of Computer Sciences, 11-Dec-2006, UMCP
Best case: increases GC workload
Worst case: systematic heap growth causes a crash after days of execution
A memory leak in a garbage-collected language occurs when a program inadvertently maintains references to objects that it no longer needs, preventing the collector from reclaiming space.
Cork accurately pinpoints systematic heap growth completely online
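The definition above can be made concrete with a small sketch. It is written in Python rather than Java purely for brevity, and the names `Order`, `processed_orders`, `process`, and `leak_demo` are illustrative assumptions, not taken from the talk. A weak reference is used only as a probe to observe whether the collector could reclaim the object.

```python
import gc
import weakref


class Order:
    """Stand-in for an application object; the name is illustrative."""


# Hypothetical long-lived registry that the program forgets to prune.
processed_orders = []


def process(order):
    # The bug: the order is finished after this call, but the list
    # still references it, so the collector must treat it as live.
    processed_orders.append(order)


def leak_demo():
    order = Order()
    probe = weakref.ref(order)        # observes liveness without keeping the object alive
    process(order)
    del order                         # the program no longer needs the object...
    gc.collect()
    still_live = probe() is not None  # ...yet it is still reachable
    processed_orders.clear()          # drop the stale reference
    gc.collect()
    reclaimed = probe() is None       # now the collector can reclaim it
    return still_live, reclaimed
```

Repeating `process` in a loop without ever clearing the registry is what produces the systematic heap growth described on this slide.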
3
Cork’s Solution
1. Summarize heap growth by calculating a type points-from graph (TPFG)
   Piggybacks on the full-heap object scan
   Summarizes the heap by type
2. Interpret the summarization using differencing
3. Generate debugging reports
   Candidate report
   Slice report
   Allocation site report
4
Type Points-From Graph
[Figure: a heap of object instances and the corresponding TPFG, with numeric annotations. Legend: instance, type. Types shown: HashTable, Queue, PQueue, Company, People.]
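A minimal sketch of how such a graph might be built during an object scan follows. It assumes that each node is annotated with the total bytes of live instances of that type, and that each points-from edge runs from the referenced type to the referring type and accumulates the bytes of the referenced objects. `HeapObject` and `build_tpfg` are hypothetical names, and the object sizes are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class HeapObject:
    type_name: str
    size: int                                  # bytes occupied by this instance
    refs: list = field(default_factory=list)   # objects this instance points to


def build_tpfg(live_objects):
    """Summarize live objects by type: node volumes and points-from edge volumes."""
    volume = defaultdict(int)
    edges = defaultdict(int)
    for obj in live_objects:       # one pass, as a collector's object scan would make
        volume[obj.type_name] += obj.size
        for target in obj.refs:
            # Points-from edge: (referenced type, referring type).
            edges[(target.type_name, obj.type_name)] += target.size
    return dict(volume), dict(edges)


# Toy heap using type names from the slide's legend.
people = [HeapObject("People", 16) for _ in range(3)]
companies = [
    HeapObject("Company", 32, refs=[people[0], people[1]]),
    HeapObject("Company", 32, refs=[people[2]]),
]
table = HeapObject("HashTable", 64, refs=companies)
volume, edges = build_tpfg([table, *companies, *people])
```

Because the summary is keyed by type rather than by object, its size depends on the number of types and type-to-type relationships, not on the number of live objects.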
5
Differencing TPFGs
[Figure: three successive graphs, TPFG i, TPFG i+1, and TPFG i+2, with numeric node and edge annotations.]
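Differencing successive summaries can be sketched as below. This simplified version assumes each TPFG is reduced to a mapping from type name to volume, and it treats a type as growing only if its volume strictly increases between every pair of consecutive collections. That strict rule is an assumption made for the example, and the function names and volumes are hypothetical.

```python
def diff_volumes(prev, curr):
    """Per-type change in volume between two consecutive summaries."""
    types = set(prev) | set(curr)
    return {t: curr.get(t, 0) - prev.get(t, 0) for t in types}


def growing_types(snapshots):
    """Types whose volume increased across every consecutive pair of snapshots."""
    growing = set(snapshots[-1])
    for prev, curr in zip(snapshots, snapshots[1:]):
        delta = diff_volumes(prev, curr)
        growing = {t for t in growing if delta.get(t, 0) > 0}
    return growing


tpfg_i = {"HashTable": 64, "Company": 64, "People": 48}
tpfg_i1 = {"HashTable": 64, "Company": 96, "People": 48}
tpfg_i2 = {"HashTable": 64, "Company": 128, "People": 32}
```

With these three snapshots only `Company` grows at every step, so it is the only type that would move on to ranking.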
6
Finding Growth (SRT: Slope Ranking Technique)
Rank growing nodes
  Rank all growing nodes
  Designate node as a candidate if [condition missing]
[Figure: TPFG with numeric annotations]
7
Reported Candidates
[Figure: number of candidates reported by SRT for jess, SPECjbb, and fop]
8
Finding Growth (RRT: Ratio Ranking Technique)
Find nodes that are growing
  Rank all growing nodes
  Designate node as a candidate if [condition missing]
[Figure: TPFG with numeric annotations]
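The candidate condition did not survive on this slide, so the following is only a sketch of a ratio-based ranking in the spirit of RRT, not Cork's formula. It assumes a type's rank accumulates (ratio - 1) for each collection in which its volume grew, that small dips are tolerated, and that a large drop resets the rank. The `decay` and `threshold` parameters, their default values, and the function names are all hypothetical.

```python
def ratio_rank(volumes, decay=0.15):
    """Rank one type from its volume history (oldest first)."""
    rank = 0.0
    for prev, curr in zip(volumes, volumes[1:]):
        if prev <= 0:
            continue
        if curr > prev:
            rank += curr / prev - 1.0   # reward growth in proportion to the ratio
        elif curr < (1.0 - decay) * prev:
            rank = 0.0                  # a large drop: no longer considered growing
        # otherwise: a small dip or plateau, tolerated without changing the rank
    return rank


def candidates(history, threshold=1.0, decay=0.15):
    """Types whose rank exceeds the threshold, highest rank first."""
    ranked = {t: ratio_rank(v, decay) for t, v in history.items()}
    return sorted((t for t, r in ranked.items() if r > threshold),
                  key=lambda t: ranked[t], reverse=True)


history = {
    "Order": [100, 200, 400, 800],   # doubles at every collection
    "Date": [100, 100, 100, 100],    # flat
    "Queue": [100, 200, 50, 60],     # grows, then collapses
}
```

A threshold on the accumulated rank is one way to keep types that grow briefly and then shrink, such as `Queue` here, out of the candidate report.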
9
Reported Candidates
[Figure: number of candidates reported by SRT and RRT for jess, SPECjbb, and fop]
10
Finding Data Structure
Type is not enough
Growing edges identify the data structure
  Rank edges
Calculate a slice from each candidate
  Set of all paths (n_0 … n_n) such that [condition missing]
  “Sees” beyond non-candidate nodes
[Figure: TPFG with numeric annotations]
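The path condition is missing from this slide, so the sketch below assumes one plausible reading: a slice is everything reachable from a candidate by following only edges with positive rank. The function name, the edge set, and the rank values are hypothetical. The type names are borrowed from the SPECjbb2000 slice diagram later in the talk, but the edges drawn between them here are invented for illustration.

```python
def slice_from(candidate, edge_rank):
    """Edges reachable from `candidate` by following only positive-rank edges.

    `edge_rank` maps (source type, destination type) to that edge's rank.
    """
    slice_edges = set()
    visited = {candidate}
    worklist = [candidate]
    while worklist:
        node = worklist.pop()
        for (src, dst), rank in edge_rank.items():
            if src == node and rank > 0:
                slice_edges.add((src, dst))
                if dst not in visited:   # continue through any node, candidate or not
                    visited.add(dst)
                    worklist.append(dst)
    return slice_edges


edge_rank = {
    ("Order", "Object[]"): 3.0,
    ("Object[]", "longBTreeNode"): 3.0,
    ("longBTreeNode", "longBTree"): 2.0,
    ("Order", "Date"): 0.0,        # not growing, so the slice stops here
    ("Date", "NewOrder"): 1.0,     # unreachable through growing edges
}
order_slice = slice_from("Order", edge_rank)
```

The traversal passes through `Object[]` even if it is not itself a candidate, which is one way to read the slide's remark that a slice "sees" beyond non-candidate nodes.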
11
Implementation and Methodology
Jikes RVM with MMTk
Benchmarks: SPECjvm98, DaCapo, SPECjbb2000
Eclipse 3.1.2
Garbage collector
  Generational with 4MB bounded nursery
For performance, report application only
  Replay compilation
  2nd run methodology
12
Efficiency and Scalability
Node/type data stored in the type information block (TIB), adding 5 words
  1 word for type volume and edge list pointer for each of the previous 4 collections
  1 word for # of phases (p)
Edge data stored in lists
Prune parts of TPFG that are non-growing
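The per-type bookkeeping can be pictured with a small record like the one below. It is a sketch only: `TypeRecord`, its fields, and the rule used to advance the phase counter are assumptions made for illustration, and how volume and edge-list pointer share a word is not modeled. The bounded history mirrors the slide's four retained collections plus one word for the phase count.

```python
from collections import deque
from dataclasses import dataclass, field

HISTORY = 4   # the slide keeps data for the previous 4 collections


@dataclass
class TypeRecord:
    # One slot per retained collection: (volume, edge list).
    history: deque = field(default_factory=lambda: deque(maxlen=HISTORY))
    phases: int = 0   # assumed here to count collections in which volume grew

    def record_collection(self, volume, edge_list):
        if self.history and volume > self.history[-1][0]:
            self.phases += 1
        self.history.append((volume, edge_list))   # oldest slot is overwritten

    def words(self):
        return HISTORY + 1   # 4 history words plus 1 word for the phase count


record = TypeRecord()
for volume in (10, 20, 30, 40, 50, 60):
    record.record_collection(volume, edge_list=[])
```

Keeping a fixed-size history is what makes the per-type cost constant regardless of how long the program runs.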
13
Space Overhead
                      jess     Eclipse   Geomean
# of types
  bm+VM               1744     3365      1747
  TPFG avg            318      667       334
  TPFG max            319      775       346
# of edges
  TPFG avg            844      4090      904
  TPFG max            861      7585      1142
% pruned              66%      42%       60%
Increased alloc %     0.094%   0.167%    0.233%
Callouts: 19%, 2.7X, 0.233% (consistent with the Geomean column: 334/1747 ≈ 19% of types appear in the TPFG, and 904/334 ≈ 2.7 edges per node)
14
Time Overhead
[Figure: normalized total time vs. heap size relative to minimum]
15
Benchmarks on Cork
Cork identified:
  Systematic heap growth
  Growing types
  Growing data structure
Analysis:
  fop: application design
  jess: memory leak
  SPECjbb2000: memory leak
[Figure: panels for SPECjbb, fop, and jess]
16
SPECjbb2000
[Figure: heap occupancy (MB) vs. time (MB of allocation)]
17
Slice Diagram: SPECjbb2000
[Figure: slice over the types Order, Orderline, Date, NewOrder, Object[], longBTreeNode, longBTree, and longStaticBTree, with each node marked as candidate or non-candidate]
Types: 1663 (71)
Nodes: 318
Edges: 904
19
Eclipse 3.1.2 on Cork
IDE
Big, complex, and open-source
Bug repository details known memory leaks and how to reproduce them
#115789: Memory leak
  Comparing 2 source trees or jar files
  Manually repeat while running Cork
20
Eclipse 115789
[Figure: heap occupancy (MB) vs. time (MB of allocation)]
21
Slice Diagram: Eclipse 115789
[Figure: slice over the types Path, Folder, File, ResourceCompareInput$FilteredBufferedResourceNode, ArrayList, Object[], ListenerList, RuleBasedCollator, ResourceCompareInput$MyDiffNode, HashMap$HashEntry, HashMap$HashEntry[], HashMap, HashMap$HashIterator, ResourceCompareInput, ElementTree$ChildIDsCache, and ElementTree, with each node marked as candidate or non-candidate]
Types: 3365 (1773)
Nodes: 667
Edges: 4090
25
Cork’s Contributions
Very low-overhead technique
  <0.5% space overhead
  ~2% time overhead
Accurately identifies
  Systematic heap growth
  Data structure containing the growth
First mechanism for detecting memory leaks in production systems
26
Thank You!
mjump@cs.utexas.edu
http://www.cs.utexas.edu/~mjump
27
Second Run Methodology
Replay compilation
  Profiling run chooses hot methods
  Deterministically applies optimizing compiler
  Mixture of optimized & unoptimized code
Measure 2nd run
  First run applies replay compilation
  Turn off compilation
  Flush compiler objects from heap
  Measure second run
28
Gartner report predicts that by 2010, 80% of all new software will be in Java or C#
C++                                 Java
Execution efficiency                Developer productivity
Trusts the programmer               Protects the programmer
Arbitrary memory access possible    Memory access only through objects
Can arbitrarily override types      Type safety
Procedural or object-oriented       Object-oriented
Operator overloading                Meaning of operators immutable
Powerful capabilities of language   Feature-rich, easy-to-use standard library
Explicit memory control             Automatic memory management
[Wikipedia: Comparison of Java and C++, Dec 2006]
29
Panacea for Bugs?
PMD, FindBugs, JLint, …
ESC/Java, Bandera, …
HPROF, JProbe, HAT, Leakbot, …
Microsoft reports that, even in C#, 75% of development time is spent in debugging
Provide a good start
Programs still ship with memory and semantic errors
30
My Research Focus
PROBLEM: Dynamically detect statistical and anomalous per-object behavior in production systems
  Low overhead and high accuracy
SOLUTION: Exploit GC and underlying runtime system
  Focus only on interesting objects
  Find ways to summarize object properties
31
Outline
Motivation: Programs have bugs
Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages
  Summarize using a type points-from graph
  Interpret the summarization
  Find memory leaks with Cork
How to focus only on interesting objects
Heap summarization with focus
Conclusions and future work
32
Memory-Related Bugs with GC
Lost pointer: losing the pointer to memory before freeing it. With GC: reclaims automatically
Dangling pointer: dereferencing a pointer to memory previously freed. With GC: object is live
Unnecessary reference: keeping a pointer to memory no longer needed. With GC: objects are live, cannot reclaim
33
Heap Occupancy Graph
[Figure: heap occupancy (MB) vs. time (MB of allocation)]
34
Related Work
Offline techniques:
  Static analysis [Heine et al. 03]
  Heap differencing [JProbe, DePauw et al. 98, 99, 00]
  Allocation and/or usage tracking [OptimizeIt, Rational Purify, HAT, HPROF, Shaham et al. 00]
Online techniques:
  Leakbot (partially online) [Mitchell et al. 03]
  Adaptive usage tracking [Chilimbi et al. 04, Bond et al. 06]
Cork accurately pinpoints systematic heap growth completely online
35
Outline
Motivation: Programs have bugs
Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages
  Summarize using a type points-from graph
  Interpret the summarization
  Find memory leaks with Cork
How to focus only on interesting objects
Heap summarization with focus
Conclusions and future work
36
What do we know?
Objects have special properties
  Lifetime, allocation site, last-use site, calling context, thread usage, etc.
Tracking individual object properties is useful for debugging
Can use dynamic object sampling to gather fine-grained object statistics at very low overhead [Jump et al. 04]
37
Dynamic Object Sampling
Tag objects with special properties
  One bit in the header indicates a tag
  Sample tag encodes object properties
Examples:
  Allocation site
  Last-use site
  Lifetime
  Which data structure
38
Dynamic Object Sampling
For example, modify a bump-pointer allocator
[Figure: allocated objects in a bump-pointer region, one carrying a sample tag]
39
During Garbage Collection
Gather object statistics
Piggyback on object scanning
Sample tag found:
  1. Examine tag
  2. Collect statistics
[Figure: survivors]
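The two dynamic object sampling slides can be tied together with a toy model. It is a sketch under several assumptions: the heap is a list of records rather than raw memory, every Nth allocation is sampled, a sampled object costs one extra word for its tag, and the tag stores only the allocation site. `TAG_BIT`, `BumpAllocator`, `scan_survivors`, and the site names are hypothetical.

```python
from collections import Counter

TAG_BIT = 0x1   # hypothetical header bit marking a sampled object


class BumpAllocator:
    def __init__(self, sample_every):
        self.cursor = 0             # next free word in the region
        self.allocated = 0
        self.sample_every = sample_every
        self.heap = []              # stand-in for the objects laid out in memory

    def alloc(self, size_words, site):
        self.allocated += 1
        sampled = self.allocated % self.sample_every == 0
        record = {
            "addr": self.cursor,
            "header": TAG_BIT if sampled else 0,
            "sample_tag": site if sampled else None,
        }
        # Bump the pointer past the object, plus one word for the tag if sampled.
        self.cursor += size_words + (1 if sampled else 0)
        self.heap.append(record)
        return record


def scan_survivors(survivors):
    """During the collector's scan, gather statistics from tagged objects only."""
    stats = Counter()
    for obj in survivors:
        if obj["header"] & TAG_BIT:         # 1. examine the tag
            stats[obj["sample_tag"]] += 1   # 2. collect statistics
    return stats


allocator = BumpAllocator(sample_every=4)
for i in range(8):
    allocator.alloc(size_words=2, site="siteA" if i < 4 else "siteB")
stats = scan_survivors(allocator.heap)
```

Untagged objects cost nothing beyond the header bit check, so the space and time cost scales with the fraction of objects sampled.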
40
Focus DOS Overhead
Sampling every object
  12% space overhead
  6-7% time overhead
What is interesting depends on the application
  Memory leak detection … candidate types
  Malformed data structures … nodes
  Dynamic pretenuring … random sampling
Focus only on 6% of objects
  0.8% space overhead
  2-3% time overhead
41
DOS in Cork
Encode allocation site and lifetime for candidates
  <1.3% space overhead, ~4% time overhead
  Find specific allocation sites causing growth
Future work
  Encode last-use site in sample tag
  Requires read/write barrier for candidates
  Will overhead still be low enough for use in production systems?
42
Conclusions
Developed two synergistic techniques
  Dynamic object sampling
  Points-from graphs
See detailed object characteristics in high-level summarizations
Unique ways to debug software in production systems