Presentation transcript:

Heap Growth Detection in C++: GrowthTracker

Heap Growth Detection in C++: Motivation
Scalable City needs to run continuously
– Many months without intervention/access
– Had slow growth of memory
  – Caused a crash after several weeks
– Available analysis tools reported no leaks!
  – Software frees all memory correctly!
  – Had a different kind of undetected memory issue

What is a Memory Leak?

"Memory leak" has become a broad term for memory mismanagement
The definition depends on the programming language
– Does it have garbage collection?

What is a Memory Leak?
C/C++ programmers generally subscribe to the traditional definition:
Def. 1: A memory leak occurs iff all references to a region of memory are lost before the memory is freed.

What is a Memory Leak?
Java/C# programmers generally have a different definition:
Def. 2: A memory leak occurs when a program's memory grows unexpectedly.
– Under Def. 1, Java/C# programs don't have memory leaks
  – The garbage collector frees memory that is no longer referenced

What is a Memory Leak?
[Figure: diagram comparing C/C++ and Java/C# under Def. 1 (lost refs) and Def. 2 (unbounded growth).]

What is a Memory Leak?
We need new terminology!
– Java/C# programmers
  – Concerned with unbounded memory growth, not leaks: references exist to all of the growing memory
  – Have a clearer understanding of their problem
– C/C++ programmers
  – Multiple definitions cause confusion
  – False sense of security when tools report no memory leaks: unbounded memory growth may still exist!
A term for unbounded memory growth that is not caused by memory leaks would clearly separate the two problems

New “Leak” Terminology
Properties of the new term
– Unbounded heap growth
– Reference(s) retained
– Manifests as a dynamically growing data structure
– Will eventually kill the process
– Doesn't fit the leaky-pipe metaphor
Memory Tumor
– A structure of cells that exhibits unbounded growth

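Memory Tumor Example

    #include <iostream>
    #include <queue>

    int main() {
        const char ESC = 27;   // ASCII code of the ESC key
        std::queue<int> q;
        char key;
        while (std::cin.get(key) && key != ESC)  // exits when ESC is read
            q.push(0);                           // q grows on every pass
    }

q grows on every pass through the loop for as long as the program runs
There are no memory leaks
– If the user hits ESC, all memory is freed when q goes out of scope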

Leak vs. Tumor
[Figure: two heap diagrams, one labeled "Leak" and one labeled "Tumor".]

Separation of Concerns
[Figure: for C/C++/Java/C#, "Memory Leak (Def. 1: lost refs)" and "Memory Tumor" shown as separate concerns.]

Current C++ Tumor Detection Tools
SWAT and Hound
– The only ones we're aware of
– Both use staleness detection
  – Misses some tumors by design
  – Memory can be accessed, but not needed
– Investigate at the allocation level
  – Don't need to modify source code
– Not open source or commercially available

Memory Tumor Detection Theory
Tumors
– Data structures that grow without bound
Healthy data structures
– Will grow
– Maximum size stabilizes

Memory Tumor Detection Challenges
Detect all growth that doesn't stabilize
– Don't dismiss non-stale growth
Tests must exhibit the growth that exists in a program's implementation
Support multithreaded programs

Memory Tumor Detection Approach
Growth Tracker tool
– Container tracking
  – Keep references to all data structures in memory
– Growth tracking
  – Track data structure size changes over time
  – Identify those with unbounded growth
Automated test
– Cyclically execute all code paths (user created)

Growth Tracker Tool
Container tracking
– CAT (Central Aggregate Tracker)
  – Maintains references to all aggregates in the system
– Create wrappers for each aggregate type in the system
  – Templated constructors, multiple inheritance
  – Add to the CAT on construction, remove on destruction
– Namespace replacement to enable wrappers
  – Find and replace to apply the new namespace
  – Wrappers disabled with a compile-time flag
  – Example: trak::std::vector
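The container-tracking idea can be sketched roughly as follows. This is a minimal illustration, not the GrowthTracker source: the names `TrackedAggregate`, `CAT::instance`, `liveCount`, and `trackedSize` are assumptions, the wrapper lives in `trak::vector` rather than the slide's `trak::std::vector`, and the compile-time disable flag is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_set>
#include <utility>
#include <vector>

namespace trak {

// Hypothetical interface each tracked aggregate exposes to the tracker.
struct TrackedAggregate {
    virtual std::size_t trackedSize() const = 0;
    virtual ~TrackedAggregate() = default;
};

// Central Aggregate Tracker: keeps a pointer to every live tracked aggregate.
class CAT {
public:
    static CAT& instance() {
        static CAT cat;
        return cat;
    }
    void add(const TrackedAggregate* a) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_.insert(a);
    }
    void remove(const TrackedAggregate* a) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t erased = live_.erase(a);
        assert(erased == 1 && "aggregate was never registered");
        (void)erased;
    }
    std::size_t liveCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_set<const TrackedAggregate*> live_;
};

// Wrapper using multiple inheritance: it is a std::vector<T> and a
// TrackedAggregate, and registers itself with the CAT for its lifetime.
template <typename T>
class vector : public std::vector<T>, public TrackedAggregate {
public:
    // Templated constructor forwards any argument list to std::vector<T>.
    template <typename... Args>
    explicit vector(Args&&... args)
        : std::vector<T>(std::forward<Args>(args)...) {
        CAT::instance().add(this);
    }
    vector(const vector& other) : std::vector<T>(other) {
        CAT::instance().add(this);
    }
    vector& operator=(const vector&) = default;
    ~vector() override { CAT::instance().remove(this); }

    std::size_t trackedSize() const override { return this->size(); }
};

}  // namespace trak
```

Declaring `trak::vector<int> v;` in a scope raises `CAT::instance().liveCount()` by one, and leaving the scope lowers it again. The single mutex here corresponds to the per-constructor locking that the later slides name as a multi-threaded bottleneck.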

Growth Tracker Tool
Growth tracking
– Take periodic samples of the CAT
– Exponentially increasing interval sizes
  – Reduces false positives and negatives over time
– Report growing aggregates at each sample
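As a small illustration of exponentially increasing intervals, the sketch below computes when samples would be taken if each interval is a fixed multiple of the previous one. The function name `sampleTimes`, the tick units, and the choice of growth factor are assumptions; the slides do not state the factor the tool uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the times (in arbitrary ticks) of the first `count` samples when
// each sampling interval is `factor` times as long as the one before it.
std::vector<std::uint64_t> sampleTimes(std::uint64_t firstInterval,
                                       std::uint64_t factor,
                                       std::size_t count) {
    assert(firstInterval > 0 && factor > 1);
    std::vector<std::uint64_t> times;
    std::uint64_t now = 0;
    std::uint64_t interval = firstInterval;
    for (std::size_t i = 0; i < count; ++i) {
        now += interval;        // the sample is taken at the end of the interval
        times.push_back(now);
        interval *= factor;     // the next interval is longer
    }
    return times;
}
```

With a first interval of 1 tick and a factor of 2, the intervals are 1, 2, 4, 8 ticks long, so samples fall at ticks 1, 3, 7, and 15.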

Growth Tracking Heuristic
Take periodic samples of the CAT
Two-interval analysis
– 1st interval establishes aggregate age, gives time to stabilize
– 2nd interval proves stability; non-tumors shouldn't grow
– 2nd interval becomes the 1st for the next, more accurate test
Exponentially increasing interval sizes
– Reduces false positives and negatives over time
Monitor size maximums
– Reduces false positives from size fluctuation
At each interval, report all aggregates that:
– Increased their size maximum
– Have existed for two full intervals
Prioritize results by size and reporting frequency
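The reporting rule above can be sketched as follows. This is an illustrative reading of the heuristic, not the tool's code: `GrowthSampler`, `Record`, and the string ids are hypothetical, each call to `sample` stands for the end of one interval, and the maximum is taken over sizes seen at sample points only. Result prioritization is left out.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class GrowthSampler {
public:
    // `sizes` maps each live aggregate's id to its current size.
    // Returns the ids reported as growing at this sample.
    std::vector<std::string> sample(
            const std::map<std::string, std::size_t>& sizes) {
        std::vector<std::string> reported;
        for (const auto& entry : sizes) {
            const std::string& id = entry.first;
            const std::size_t size = entry.second;
            auto it = records_.find(id);
            if (it == records_.end()) {
                // First sighting: created during the interval that just ended.
                records_[id] = Record{sampleIndex_, size};
                continue;
            }
            Record& r = it->second;
            const bool raisedMax = size > r.maxSize;
            if (raisedMax) r.maxSize = size;
            // Two later samples mean two full intervals of existence.
            const bool oldEnough = sampleIndex_ - r.firstSample >= 2;
            if (raisedMax && oldEnough) reported.push_back(id);
        }
        // Forget aggregates that were destroyed since the last sample.
        for (auto it = records_.begin(); it != records_.end();) {
            if (sizes.count(it->first) == 0) it = records_.erase(it);
            else ++it;
        }
        ++sampleIndex_;
        return reported;
    }

private:
    struct Record {
        std::size_t firstSample;  // index of the sample that first saw it
        std::size_t maxSize;      // largest size seen at any sample so far
    };
    std::map<std::string, Record> records_;
    std::size_t sampleIndex_ = 0;
};
```

An aggregate whose size fluctuates below an earlier peak is never reported, because only a new maximum counts as growth. An aggregate that keeps setting new maximums is reported at every sample once it is two intervals old.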

Growth Tracking Heuristic: two-interval analysis
[Figure: a data structure's memory footprint over time (axes: time, memory). Not reported (growth stabilized).]

Diagnosing Unbounded Heap Growth in C++: Detection Approach
Two-interval analysis
[Figure: memory vs. time, with labels "Reported as tumor (false positive)" and "Not reported (growth stabilized)".]

Diagnosing Unbounded Heap Growth in C++: Detection Approach
Growth tracking
– Exponentially increasing interval size
[Figure: memory vs. time. In this example, constant intervals would not report growth half the time.]

Diagnosing Unbounded Heap Growth in C++: Detection Approach
Growth tracking
– Max size variable
[Figure: memory vs. time with a size ceiling. Growth would be reported without the max size.]

Growth Tracker Targets
Multi-threaded applications
– Initial CAT implementation works
  – Requires locking for each aggregate constructor
  – Potential to diminish multi-threaded performance
  – Good starting point
– Need new CAT implementation
  – Eliminate locks
    – Multiple-bucket approach
    – Map aggregate construction from different threads to buckets
  – Design can accelerate the sampling process as well
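One way to picture the multiple-bucket idea is the sketch below. It is an approximation under stated assumptions: the slides aim to eliminate locks, whereas this version keeps one mutex per bucket and only reduces contention by mapping each constructing thread to a bucket. `BucketedCAT`, `kBuckets`, and the hash-of-thread-id mapping are hypothetical choices, not the proposed design.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <unordered_set>

class BucketedCAT {
public:
    static constexpr std::size_t kBuckets = 16;  // assumed bucket count

    // Registers an aggregate in the calling thread's bucket and returns the
    // bucket index, which the caller keeps so that removal finds the entry.
    std::size_t add(const void* aggregate) {
        const std::size_t b =
            std::hash<std::thread::id>{}(std::this_thread::get_id()) % kBuckets;
        std::lock_guard<std::mutex> lock(buckets_[b].mutex);
        buckets_[b].live.insert(aggregate);
        return b;
    }

    // Removal may happen on a different thread, so the bucket is passed in.
    void remove(const void* aggregate, std::size_t bucket) {
        std::lock_guard<std::mutex> lock(buckets_[bucket].mutex);
        buckets_[bucket].live.erase(aggregate);
    }

    // A sampler visits one bucket at a time and never holds a global lock.
    std::size_t liveCount() {
        std::size_t total = 0;
        for (Bucket& b : buckets_) {
            std::lock_guard<std::mutex> lock(b.mutex);
            total += b.live.size();
        }
        return total;
    }

private:
    struct Bucket {
        std::mutex mutex;
        std::unordered_set<const void*> live;
    };
    std::array<Bucket, kBuckets> buckets_;
};
```

Threads that hash to different buckets never contend on construction, and the buckets could in principle be sampled independently, which is one reading of the slide's remark that the design can accelerate sampling.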

Growth Tracker Drawbacks
– Source code modification
  – Tracking requires compilation with our wrappers
  – Allows consideration of objects, not just allocations
– Limited information about identified tumors
  – Full type string and allocation number
  – Code location possible
    – Requires stack tracing (slower)
– Reliance on the user
  – Must identify custom data structures
  – Must run a feature-complete and cyclic test

Growth Tracker Drawbacks
– Potential multi-threaded slowdown
– Persistent buckets
  – Example: linear hash table with std::vector buckets
  – More useful to include child bucket sizes in the parent's output and stop reporting individual children
– Multiple instances of the same tumor reported
  – A parent report that includes its children would resolve this
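To make the persistent-bucket point concrete, the sketch below computes a single parent-level size for a table whose buckets are `std::vector`s, so that growth spread across many child buckets shows up as one number. The function name `elementsIncludingChildren` is hypothetical, and how the tool would actually discover parent/child relationships is left open, as in the slides.

```cpp
#include <cstddef>
#include <vector>

// Total number of elements held by a parent table, summed over its child
// buckets. Reporting this one figure replaces one report per bucket.
template <typename T>
std::size_t elementsIncludingChildren(
        const std::vector<std::vector<T>>& table) {
    std::size_t total = 0;
    for (const std::vector<T>& bucket : table) {
        total += bucket.size();
    }
    return total;
}
```

For a table with buckets holding 2, 0, and 1 elements, the parent-level size is 3.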

Growth Tracker Results
– Scalable City
  – Identified tumor
  – Eliminated memory growth
– Ogre3D Rendering Engine
  – Identified 2 tumors
  – Our fix integrated into their code base
– Bullet Physics Engine
  – Tests revealed no tumors in Core
  – 1 tumor found in demo framework

Growth Tracker Results
– Google Chrome / Chromium
  – Identified 21 tumors
  – Fixed the fastest-growing tumor ourselves
– WebKit (Safari Browser, etc.)
  – Identified 2 tumors
  – Submitted fix to code base

Growth Tracker Paper
Recently accepted for publication
– IEEE International Conference on Software Testing, Verification and Validation (ICST 2013)

Growth Tracker Proposed Work
– Resolve multithreaded locking limitations
  – Solution designed, needs implementation
– Reduce tracking of temporaries
  – Detect stack-based data structures
  – Multi-layer CAT to separate entries by age
    – Will reduce overhead of CAT insertion/removal

Growth Tracker Proposed Work
Automation
– Reduce reliance on the user
– Detect custom data structures
  – Automatically create wrappers when possible
– Improvements to the code transformation process
  – After initial code conversion, detect when a wrapper is forgotten

Growth Tracker Proposed Work
Prioritize tracking parent data structures
– Would address the persistent bucket problem
– Would reduce reports of multiple instances of the same tumor
– Must identify relationships between data structures