Guoqing Xu, Atanas Rountev, Ohio State University. Oct 9th, 2008. Presented by Eun Jung Park.


• Example of a memory leak and a dangling pointer in C/C++
• How about in Java?
  ◦ The GC (garbage collector) will handle this!
• Then what is the memory leak problem in Java?
    #include <stdlib.h>
    int *pi;
    void foo() {
        pi = (int*) malloc(8*sizeof(int)); // oops, memory leak of 4 ints
        // use pi
        free(pi); // foo() is done with pi
    }
    int main() {
        pi = (int*) malloc(4*sizeof(int));
        foo();
        pi[0] = 10; // oops, pi is now a dangling pointer
        return 0;
    }
Above example is from

• What is a Java memory leak?
  ◦ Object references that are no longer needed are unnecessarily maintained, so the GC never reclaims the objects they refer to.
• Why is a Java memory leak bad?
  ◦ It can degrade performance.
  ◦ It can eventually exhaust memory and crash the program.
  ◦ It is difficult to find.
    public void slowlyLeakingVector(int iter, int count) {
        for (int i=0; i<iter; i++) {
            for (int n=0; n<count; n++) {
                myVector.add(Integer.toString(n+i));
            }
            for (int n=count-1; n>0; n--) { // Oops, it should be n>=0
                myVector.removeElementAt(n);
            }
        }
    }
Above example is from
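
The off-by-one in the example above can be checked directly. The following is a minimal sketch, assuming a hypothetical `LeakDemo` class that owns `myVector`; the `nonLeakingVector` variant and the `size()` accessor are added here for comparison only and are not part of the original example.

```java
import java.util.Vector;

// Hypothetical demo class; only slowlyLeakingVector follows the slide's code.
class LeakDemo {
    private final Vector<String> myVector = new Vector<>();

    // Buggy version: the removal loop stops at n > 0, so index 0 is never removed.
    void slowlyLeakingVector(int iter, int count) {
        for (int i = 0; i < iter; i++) {
            for (int n = 0; n < count; n++) {
                myVector.add(Integer.toString(n + i));
            }
            for (int n = count - 1; n > 0; n--) {
                myVector.removeElementAt(n);
            }
        }
    }

    // Corrected version: n >= 0 also removes the element at index 0.
    void nonLeakingVector(int iter, int count) {
        for (int i = 0; i < iter; i++) {
            for (int n = 0; n < count; n++) {
                myVector.add(Integer.toString(n + i));
            }
            for (int n = count - 1; n >= 0; n--) {
                myVector.removeElementAt(n);
            }
        }
    }

    int size() {
        return myVector.size();
    }

    public static void main(String[] args) {
        LeakDemo leaky = new LeakDemo();
        leaky.slowlyLeakingVector(5, 10);
        LeakDemo fixed = new LeakDemo();
        fixed.nonLeakingVector(5, 10);
        // One element survives per outer iteration in the buggy version.
        System.out.println("leaky size = " + leaky.size() + ", fixed size = " + fixed.size());
    }
}
```

Each outer iteration of the buggy version leaves one element behind, so the vector grows linearly with `iter` even though every element looks "removed" at a glance.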

• Static methods using compiler or code analysis
  ◦ Not precise: they usually cannot precisely identify the unnecessary references.
  ◦ Not scalable: they do not work well for large applications.
• Dynamic methods using fine-grained runtime information about individual objects, based on a single kind of information (memory contribution or staleness contribution)
  ◦ Not precise: they use a from-symptom-to-cause approach, which makes it difficult to locate the source of the leak and leads to imprecise leak reports (possible false positives).
  ◦ Hard to interpret and not sufficient for programmers: the output is too complex to interpret, lacks precision, and does not give programmers enough information to locate the bug.

• Dynamic method with container-based heap tracking
  ◦ Instead of a from-symptom-to-cause approach: track only containers to directly identify the source of the leak.
  ◦ Instead of a single kind of information: compute a heuristic confidence value for each container based on the combination of
    ▪ Overall memory consumption
    ▪ Each container’s memory consumption
    ▪ Each container’s staleness contribution
• What is the definition of a container? An abstract data type (ADT) with a set of data elements and three basic operations: ADD, GET, and REMOVE (e.g., hash table, graphical element).
• Why are containers suspicious? Containers cause many memory leaks in Java!

• Contribution 1: Computing a Confidence Value
• Contribution 2: Java Memory Leak Detection
• Contribution 3: Implementation
• Contribution 4: Empirical Evaluation

• Define Memory Leak Symptom
• Define Memory Leak Free
• Choose Non Memory-Leak-Free Containers
• Calculate Memory Contribution
• Calculate Staleness Contribution
• Put them together! Calculate Leaking Confidence

• A program written in a garbage-collected language has a memory leak symptom within a time region if
  ◦ (1) Memory consumption at the moment immediately after gc-events in the region,
  ◦ (2) There exists a subsequence of gc-events such that memory consumption at each gc-event keeps growing.
• How to define the boundaries of the region?
  ◦ Offline: the end of the program or the out-of-memory error.
  ◦ Online: user-defined; gc-events serve as checkpoints.
  ◦ Choose the smallest user-defined ratio to get the longest region and a more precise analysis.
• This helps to identify the appropriate time region to analyze.

• A container is memory-leak-free if
  ◦ (1) at the end of the leak region, the number of elements is 0, and
  ◦ (2) all elements that were added were removed and garbage collected within the leak region. This means that # of ADD = # of REMOVE.
• Why do we need this definition? Containers that are not memory-leak-free will "possibly" contribute to the memory leak symptom and are considered for further evaluation.
• We choose the containers that are not memory-leak-free, and we are ready to go to the next step!
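
The two conditions above can be sketched as a small bookkeeping wrapper. This is an illustration only, assuming a hypothetical `CountingContainer` class; it checks emptiness and matching ADD/REMOVE counts, and it cannot observe whether removed elements were actually garbage collected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper that counts ADD and REMOVE operations on a container.
class CountingContainer<E> {
    private final List<E> elements = new ArrayList<>();
    private long adds = 0;
    private long removes = 0;

    // ADD operation: store the element and count it.
    void add(E e) {
        elements.add(e);
        adds++;
    }

    // REMOVE operation: count it only if the element was actually present.
    boolean remove(E e) {
        boolean removed = elements.remove(e);
        if (removed) {
            removes++;
        }
        return removed;
    }

    // Evaluated at the end of the leak region: the container must be empty
    // and every ADD must have been matched by a REMOVE.
    boolean isMemoryLeakFree() {
        return elements.isEmpty() && adds == removes;
    }
}
```

A container for which `isMemoryLeakFree()` returns false at the end of the region would be kept as a candidate for the MC and SC computations.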

• A memory time graph is used to capture a container's memory footprint.
  ◦ x-axis: the relative time of program execution
  ◦ y-axis: the relative memory consumption of a container
  ◦ Starting point: the relative time of max(start of the leak region, allocation time of the container)
  ◦ Ending point: the relative time of min(end of the leak region, deallocation time of the container)
• A container’s memory contribution (MC) is defined as the area covered by the memory consumption curve in the graph.
[Figure: memory time graph. Labels: memory consumption of all reachable objects from the container; total amount of memory consumption of a container.]
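
Given a handful of sampled points on the memory time graph, the area can be approximated numerically. The sketch below uses the trapezoid rule, which is an assumption made here for illustration; `MemoryContribution` and its parameters are hypothetical names, and both axes are assumed to be already normalized to [0, 1].

```java
// Hypothetical helper: approximates memory contribution (MC) as the area
// under a sampled memory time graph using the trapezoid rule.
class MemoryContribution {

    // times[i] is the relative time of sample i (ascending, in [0, 1]);
    // memory[i] is the container's relative memory consumption at that time.
    static double area(double[] times, double[] memory) {
        if (times.length != memory.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double area = 0.0;
        for (int i = 1; i < times.length; i++) {
            double width = times[i] - times[i - 1];
            // Average height of the two neighbouring samples times the interval width.
            area += width * (memory[i] + memory[i - 1]) / 2.0;
        }
        return area;
    }
}
```

With both axes in [0, 1], the result also stays in [0, 1], which makes it convenient to combine with other normalized measures.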

• Staleness: the time since the object's last use.
• How to calculate staleness? It is the time difference between two moments, where
  ◦ the later one is the moment the element was removed from a container;
  ◦ the earlier one is the moment the element was added to a container or retrieved from a container.
  ◦ Condition: no retrieval of the element between the two moments.
  ◦ If < ?
  ◦ If an element is never removed from a container?
• How to calculate the staleness contribution (SC)? It is computed from the staleness of the elements in a container.
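
As a rough illustration of the staleness bookkeeping, the sketch below computes per-element staleness and an averaged, normalized contribution. Several choices are assumptions made here and are not taken from the slides: a last use before the region start is clamped to the region start, an element that is never removed is treated as stale until the region end, and SC is taken as the mean staleness divided by the region length. All names are hypothetical.

```java
// Hypothetical helper for staleness computations over a leak region.
class Staleness {
    // Marker for "this element was never removed" (assumption of this sketch).
    static final double NEVER_REMOVED = Double.NaN;

    // lastUse: time of the last ADD or GET of the element.
    // removal: time of the REMOVE, or NEVER_REMOVED.
    static double ofElement(double lastUse, double removal,
                            double regionStart, double regionEnd) {
        // Assumption: uses before the region start count from the region start.
        double from = Math.max(lastUse, regionStart);
        // Assumption: elements never removed stay stale until the region end.
        double to = Double.isNaN(removal) ? regionEnd : Math.min(removal, regionEnd);
        return Math.max(0.0, to - from);
    }

    // Assumed SC: mean element staleness, normalized by the region length.
    static double contribution(double[] lastUses, double[] removals,
                               double regionStart, double regionEnd) {
        if (lastUses.length != removals.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        if (lastUses.length == 0 || regionEnd <= regionStart) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < lastUses.length; i++) {
            sum += ofElement(lastUses[i], removals[i], regionStart, regionEnd);
        }
        return sum / lastUses.length / (regionEnd - regionStart);
    }
}
```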

• Combining MC and SC, we get the Leaking Confidence (LC).
  ◦ Why is LC an exponential function of SC? SC is more important than MC in determining a memory leak.
• Desirable properties
  ◦ These relate the values of MC, SC, and LC.
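
To make the idea of an exponential dependence on SC concrete, here is a minimal sketch that assumes the combining function LC = SC × MC^(1 − SC) with both inputs in [0, 1]. This specific form is an assumption of the sketch, chosen because SC appears in the exponent and therefore dominates; the class and method names are hypothetical.

```java
// Hypothetical helper: one possible way to combine MC and SC into LC.
class LeakingConfidence {

    // Assumed form: LC = SC * MC^(1 - SC), with mc and sc in [0, 1].
    static double of(double mc, double sc) {
        if (mc < 0.0 || mc > 1.0 || sc < 0.0 || sc > 1.0) {
            throw new IllegalArgumentException("mc and sc must be in [0, 1]");
        }
        return sc * Math.pow(mc, 1.0 - sc);
    }
}
```

Under this assumed form, a container that is never stale (SC = 0) gets LC = 0 regardless of its size, while a fully stale container (SC = 1) gets LC = 1 regardless of MC, which is one way to express that staleness matters more than memory footprint.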

[Figure: tool workflow. Stages: Container Modeling, Code Instrumentation, Profiling, Data Analysis. Labels: instrumented code with glue class; leak symptom, leak free, MC, SC, LC; non-leak-free containers; information on potentially leaking containers; leaking call sites.]

• Each container has a corresponding glue class.
  ◦ Glue classes are provided for all types in the Java collections framework.
  ◦ User annotations are required for user-defined containers.
• The glue methods call the profiling library and pass
  ◦ For the instrumentation step: the call site ID
  ◦ For SC computation:
    ▪ the container object
    ▪ the element object
    ▪ the number of elements in the container before the operation is performed
    ▪ the operation type
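
A glue class along these lines might look like the following sketch. Everything here is hypothetical: `Profiler`, `Event`, `Op`, and `HashMapGlue` are illustrative names, the in-memory event list stands in for a real profiling library, and the choice to pass the value (not the key) as the element object is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the profiling library that glue methods call.
class Profiler {
    enum Op { ADD, GET, REMOVE }

    static final class Event {
        final int callSiteId;
        final Object container;
        final Object element;
        final int sizeBefore;
        final Op op;

        Event(int callSiteId, Object container, Object element, int sizeBefore, Op op) {
            this.callSiteId = callSiteId;
            this.container = container;
            this.element = element;
            this.sizeBefore = sizeBefore;
            this.op = op;
        }
    }

    static final List<Event> events = new ArrayList<>();

    static void record(int callSiteId, Object container, Object element,
                       int sizeBefore, Op op) {
        events.add(new Event(callSiteId, container, element, sizeBefore, op));
    }
}

// Hypothetical glue class mapping Map methods to ADD, GET, and REMOVE.
class HashMapGlue {
    static <K, V> V put(Map<K, V> map, K key, V value, int callSiteId) {
        // Record the size before the operation is performed.
        Profiler.record(callSiteId, map, value, map.size(), Profiler.Op.ADD);
        return map.put(key, value);
    }

    static <K, V> V get(Map<K, V> map, K key, int callSiteId) {
        int sizeBefore = map.size();
        V value = map.get(key);
        Profiler.record(callSiteId, map, value, sizeBefore, Profiler.Op.GET);
        return value;
    }

    static <K, V> V remove(Map<K, V> map, K key, int callSiteId) {
        int sizeBefore = map.size();
        V value = map.remove(key);
        Profiler.record(callSiteId, map, value, sizeBefore, Profiler.Op.REMOVE);
        return value;
    }
}
```

An instrumented call site would then invoke, for example, `HashMapGlue.put(orderTable, id, order, 42)` instead of `orderTable.put(id, order)`, where 42 is a made-up call site ID.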

• The Soot analysis framework is used.
• Calls to the corresponding glue methods are inserted before and/or after each call site.
• Code is inserted after a container is allocated in order to track its allocation time.
• Escape analysis: thread-local and method-local containers are not included, since their lifetime is limited to their allocating methods.

• Perform profiling with JVMTI
  ◦ Data for MC values
  ◦ Data for SC values
• How does JVMTI help with profiling?
  ◦ It activates an object graph traversal thread.
  ◦ It calculates the deallocation time of a tagged container.
  ◦ It activates a dumping thread so that too much profiling data does not accumulate in memory and hurt performance.

• When the end of the leak region is reached, the tool starts an offline analysis to
  ◦ Determine the leaking region
  ◦ Approximate the memory time graph and the MC value
  ◦ Compute SC

• For the elements in a container, the tool calculates the average staleness for each call site.
• The tool reports to programmers (testers)
  ◦ potentially leaking containers, sorted by LC value
  ◦ potentially leaking call sites, sorted by average staleness, within each container
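
The two sorted lists in the report can be sketched as follows. This is an illustration under assumptions: `ContainerReport` and `LeakReport` are hypothetical names, the inputs are assumed to be already computed LC values and per-element staleness samples tagged with call site IDs, and descending order is assumed for both lists.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record of one container and its leaking confidence.
class ContainerReport {
    final String name;
    final double lc;

    ContainerReport(String name, double lc) {
        this.name = name;
        this.lc = lc;
    }
}

class LeakReport {
    // Containers sorted by LC, highest confidence first.
    static List<ContainerReport> rankContainers(List<ContainerReport> containers) {
        List<ContainerReport> sorted = new ArrayList<>(containers);
        sorted.sort(Comparator.comparingDouble((ContainerReport c) -> c.lc).reversed());
        return sorted;
    }

    // Call sites of one container sorted by average staleness, highest first.
    // callSiteIds[i] is the call site that produced staleness sample staleness[i].
    static List<Map.Entry<Integer, Double>> rankCallSites(int[] callSiteIds,
                                                          double[] staleness) {
        Map<Integer, double[]> acc = new HashMap<>(); // value = {sum, count}
        for (int i = 0; i < callSiteIds.length; i++) {
            double[] a = acc.computeIfAbsent(callSiteIds[i], k -> new double[2]);
            a[0] += staleness[i];
            a[1] += 1;
        }
        List<Map.Entry<Integer, Double>> result = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : acc.entrySet()) {
            double average = e.getValue()[0] / e.getValue()[1];
            result.add(new AbstractMap.SimpleEntry<Integer, Double>(e.getKey(), average));
        }
        result.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        return result;
    }
}
```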

• Hardware platform: 2.4 GHz dual-core, 2 GB RAM
• Three memory leak bugs
  ◦ Two are from the Sun Bug Repository
  ◦ One is from SPECjbb
• Method
  ◦ Check how successfully the tool can locate a memory leak bug at three different sampling/dumping rates (1/15gc, 1/50gc, 1/85gc).
  ◦ Check overhead and performance by measuring the instrumentation overhead, the runtime with different heap sizes at different sampling rates, and the overhead of using the tool.
• What do they want to show here?
  ◦ The tool achieves high precision in reporting the causes of memory leak bugs, with overhead that is acceptable for practical use!

1. The tool gives programmers enough information to locate the bugs.
2. Sampling rate: 1/15gc and 1/50gc are better than 1/85gc. 1/50gc is the best tradeoff between performance and precision.
JDK bug #: Image = (VolatileImage)volatileMap.get(config)
JDK bug #: addElement(weakWindow)

1. Requires a user-defined container glue class. Before this was added, the tool could not locate the memory leak bug.
2. After modeling, it found the correct place for the memory leak bug.
3. 1/50gc showed the best performance and precision.
orderTable.put(anOrder.getId(), anOrder)

Static overhead
- # of call sites instrumented
- Static overhead of the tool
Dynamic overhead
- # of gc-events and runtime with the default vs. a large heap size at two different sampling rates
Overhead of using this tool
1. Applying escape analysis reduces the number of instrumented call sites.
2. At the same sampling rate, a larger initial heap size gives a shorter running time.
3. Decreasing the sampling rate reduces the runtime overhead.

• How do they differ from existing dynamic methods?
  ◦ Instead of focusing on arbitrary objects, they focus only on containers, the main contributors to memory leak problems.
  ◦ They consider the combination of MC and SC, not a single measure.
  ◦ They locate a bug more precisely.
  ◦ Programmers or testers only need to learn how to add a glue class and can then use the tool easily, instead of learning how to interpret the complex output of existing tools.
• Contributions
  ◦ Contribution 1: Computing a Confidence Value
  ◦ Contribution 2: Java Memory Leak Detection
  ◦ Contribution 3: Implementation
  ◦ Contribution 4: Empirical Evaluation

• How about overhead?
  ◦ Optimization is needed to reduce overhead.
  ◦ Using JikesRVM to avoid JVMTI
• Automated tool
  ◦ Automate the mapping from container methods to the ADT operations.
• An alternative definition of LC for more precise information
• More context information about containers and call sites that can be useful for programmers