1
HeapMD: Identifying Heap-based Bugs using Anomaly Detection
Trishul M. Chilimbi, Microsoft Research, Redmond, WA, trishulc@microsoft.com
Vinod Ganapathy, University of Wisconsin, Madison, WI, vg@cs.wisc.edu
ASPLOS XII, October 2006, San Jose, California
2
ASPLOS'06 Chilimbi/Ganapathy - HeapMD: Identifying Heap-based Bugs using Anomaly Detection
A motivating example
    pNewAsset = Initialize(pAssetParams);
    ...
    if (pAssetList->next != NULL) {
        pNewAsset->next = pAssetList->next;
        pAssetList->next = pNewAsset;
    }
[Figure: linked list, with pAssetList and pNewAsset marked]
4
Noteworthy points
Violation of implicit data-structure invariant
  – Only extremal nodes must have indegree=1
Malformed, but pointer-correct structure
  – Bug does not necessarily result in a crash
[Figure: linked list, with pAssetList and pNewAsset marked]
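To make the violated invariant concrete, here is a minimal sketch. It assumes the asset list is doubly linked, which the slides suggest (only extremal nodes have indegree 1) but do not state. The Asset type, the prev field, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: the slides do not show the declaration. */
typedef struct Asset {
    struct Asset *next;
    struct Asset *prev;
} Asset;

/* Insertion as on the slide: only the next pointers are updated. */
void insert_after_head_buggy(Asset *head, Asset *node)
{
    assert(head != NULL && node != NULL);
    if (head->next != NULL) {
        node->next = head->next;
        head->next = node;
    }
}

/* Assumed repair: also maintain the prev pointers. */
void insert_after_head_fixed(Asset *head, Asset *node)
{
    assert(head != NULL && node != NULL);
    if (head->next != NULL) {
        node->next = head->next;
        node->prev = head;
        head->next->prev = node;
        head->next = node;
    }
}

/* Count the next/prev pointers in nodes[0..n-1] that point at target. */
int indegree(Asset *const *nodes, size_t n, const Asset *target)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (nodes[i]->next == target) count++;
        if (nodes[i]->prev == target) count++;
    }
    return count;
}
```

Under this assumption, a node added by the buggy version sits in the interior of the list with indegree 1, while the repaired version gives it indegree 2. Neither version dereferences a bad pointer, so nothing crashes.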
5
Key challenges
Programmers rarely write down invariants for heap data structures
Bugs may not be immediately apparent
Can we infer invariants for the heap and use them for bug detection?
6
Presenting HeapMD
A tool to monitor the health of the heap
Highlights of results: HeapMD found 40 bugs (31 new) in 5 large commercial applications
7
Talk outline
Motivation
Stability of the heap
Application to bug-finding
Related work and conclusion
8
Heap data: a programmer’s perspective
1. Invisible
  – No tangible representation
2. Can be arbitrarily modified
  – Akin to self-modifying code
3. Can have arbitrary structure
  – Akin to programming with gotos
If this were true in practice, building large software would be untenable
9
Heap data: the reality
1. Invisible
  – No tangible representation
2. Can be arbitrarily modified
  – Often only a small fraction is modified
3. Can have arbitrary structure
  – In practice, structure is simple
In practice, the heap has a simple, stable structure
10
Stability of pointer-valued locations
[Chart: number of pointer-valued heap locations, by how many values they store during their lifetime]
  – Store only NULL
  – Store only one value
  – Store only two values
  – Store more than two values
Chart annotation: 91.22%
Strong evidence that a large fraction of the heap is not modified
11
Simple structure of the heap
Most data structures in large programs have low in- and out-degrees
  – Linked lists
  – Trees
  – Hash tables
Can we quantify the simplicity and stability of the heap?
12
Simple metrics suffice
% nodes with indegree = 0 (root nodes)
% nodes with outdegree = 0 (leaf nodes)
% nodes with indegree = 1
% nodes with indegree = 2
% nodes with outdegree = 1
% nodes with outdegree = 2
% nodes with indegree = outdegree
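The seven metrics above can be computed in one pass over a heap graph. The sketch below assumes the graph is already available as an edge list over n numbered objects. The Edge and DegreeMetrics types and the function name are hypothetical, and HeapMD itself tracks these counts through binary instrumentation rather than from a snapshot like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical snapshot: one edge per pointer from object `from` to `to`. */
typedef struct { size_t from, to; } Edge;

/* The seven degree metrics, each as a percentage of all vertices. */
typedef struct {
    double in0, out0, in1, in2, out1, out2, in_eq_out;
} DegreeMetrics;

/* Returns 0 on success, -1 on bad input or allocation failure. */
int compute_degree_metrics(size_t n, const Edge *edges, size_t m,
                           DegreeMetrics *out)
{
    assert(out != NULL);
    if (n == 0) return -1;
    for (size_t i = 0; i < m; i++)
        if (edges[i].from >= n || edges[i].to >= n) return -1;

    size_t *indeg = calloc(n, sizeof *indeg);
    size_t *outdeg = calloc(n, sizeof *outdeg);
    if (indeg == NULL || outdeg == NULL) {
        free(indeg);
        free(outdeg);
        return -1;
    }
    for (size_t i = 0; i < m; i++) {
        outdeg[edges[i].from]++;
        indeg[edges[i].to]++;
    }

    size_t c[7] = {0};
    for (size_t v = 0; v < n; v++) {
        c[0] += (indeg[v] == 0);
        c[1] += (outdeg[v] == 0);
        c[2] += (indeg[v] == 1);
        c[3] += (indeg[v] == 2);
        c[4] += (outdeg[v] == 1);
        c[5] += (outdeg[v] == 2);
        c[6] += (indeg[v] == outdeg[v]);
    }
    free(indeg);
    free(outdeg);

    const double scale = 100.0 / (double)n;
    out->in0 = scale * (double)c[0];
    out->out0 = scale * (double)c[1];
    out->in1 = scale * (double)c[2];
    out->in2 = scale * (double)c[3];
    out->out1 = scale * (double)c[4];
    out->out2 = scale * (double)c[5];
    out->in_eq_out = scale * (double)c[6];
    return 0;
}
```

For a four-node singly linked list (edges 0->1, 1->2, 2->3), this reports 25% roots, 25% leaves, and 50% of vertices with indegree = outdegree.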
13
Gathering metrics: Setup
Instrument program (x86) to track metrics
Execute instrumented program on a set of inputs
Gather at metric-computation points
  – Function entry points
  – Frequency is tunable: currently 1/100,000
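One way to realize a tunable sampling frequency is a countdown that fires at a fixed period. This sketch assumes that "1/100,000" means one metric computation per 100,000 function entries, which the slide does not spell out. The Sampler type and function names are hypothetical and are not HeapMD's actual instrumentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed reading of the slide: sample one in every 100,000 entries. */
#define SAMPLE_PERIOD 100000u

typedef struct {
    uint32_t period;        /* function entries between two samples */
    uint32_t countdown;     /* entries left until the next sample   */
    uint64_t samples_taken; /* number of metric-computation points  */
} Sampler;

void sampler_init(Sampler *s, uint32_t period)
{
    assert(s != NULL && period > 0);
    s->period = period;
    s->countdown = period;
    s->samples_taken = 0;
}

/* Call at every instrumented function entry.
 * Returns 1 when the metrics should be computed at this entry, else 0. */
int sampler_on_function_entry(Sampler *s)
{
    assert(s != NULL);
    if (--s->countdown > 0)
        return 0;
    s->countdown = s->period;
    s->samples_taken++;
    return 1;
}
```

A countdown keeps the common path to one decrement and one branch, which matters when the hook runs at every function entry.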
14
[Plots: % indegree = outdegree; % indegree = 2; % outdegree = 0 (leaves)]
15
Algorithm to find stable metrics
[Flowchart]
Program + calibration inputs -> Compute metrics -> Metric reports -> Determine stability
Is stable for 40% inputs?
  – No: metric unstable (discard)
  – Yes: find range of metric -> stable metrics and their ranges
Other 60%: check for range violation
Can potentially find bugs exercised by calibration inputs
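The calibration step can be sketched as follows, under several assumptions that the slides leave open. Fluctuation is taken to be the percent change between consecutive samples. A run counts as stable when the average fluctuation is within ±1% and its standard deviation is at most 5 (the thresholds from the "Metric fluctuation" slide). A metric is kept when it is stable on at least a given fraction of calibration runs. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double lo, hi; } Range;

/* 1 if one run's samples fluctuate within the given limits, else 0.
 * Assumption: fluctuation = percent change between consecutive samples. */
int is_stable_run(const double *s, size_t n, double avg_limit, double sd_limit)
{
    double sum = 0.0, sumsq = 0.0;
    size_t k = 0;
    for (size_t i = 1; i < n; i++) {
        if (s[i - 1] == 0.0) continue; /* percent change undefined */
        double pct = 100.0 * (s[i] - s[i - 1]) / s[i - 1];
        sum += pct;
        sumsq += pct * pct;
        k++;
    }
    if (k == 0) return 0;
    double mean = sum / (double)k;
    double var = sumsq / (double)k - mean * mean;
    if (var < 0.0) var = 0.0; /* guard against rounding */
    double abs_mean = mean < 0.0 ? -mean : mean;
    /* Compare variances to avoid needing sqrt from libm. */
    return abs_mean <= avg_limit && var <= sd_limit * sd_limit;
}

/* Returns 1 and fills *range with [min, max] over the stable runs if the
 * metric is stable on at least min_fraction of the runs; otherwise 0. */
int calibrate_metric(const double *const *runs, const size_t *lens,
                     size_t nruns, double min_fraction, Range *range)
{
    assert(range != NULL);
    size_t stable = 0;
    int have = 0;
    for (size_t r = 0; r < nruns; r++) {
        if (!is_stable_run(runs[r], lens[r], 1.0, 5.0)) continue;
        stable++;
        for (size_t i = 0; i < lens[r]; i++) {
            double v = runs[r][i];
            if (!have || v < range->lo) range->lo = v;
            if (!have || v > range->hi) range->hi = v;
            have = 1;
        }
    }
    if (stable == 0 || (double)stable < min_fraction * (double)nruns)
        return 0;
    return 1;
}
```

Calling calibrate_metric with min_fraction = 0.4 mirrors the 40% test in the flowchart. The resulting range is what the remaining inputs are checked against.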
16
Stable metrics exist: SPEC
SPEC b/m    # Inputs    # Stable
twolf       3           6
crafty      3           2
mcf         3           4
vpr         6           1
vortex      5           1
gzip        100         2
parser      100         3
gcc         100         2
Callout: % of vertices with indegree = outdegree, range = [14.2%, 17.2%]
17
Stable metrics exist: Commercial
Benchmark               # Inputs    # Stable
Multimedia              50          2
Interactive web-app.    50          2
PC-game (simulation)    50          2
PC-game (action)        50          1
Productivity            50          2
18
Extending the observation
Several degree-based metrics of the heap-graph remain stable as the heap evolves
Stable heap metrics exist even across different development versions of a program
In fact, we observed that …
19
Metric stability across versions
Benchmark               # Versions    # Inputs    # Stable
Multimedia              5             10          2
Interactive web-app.    5             10          2
PC-game (simulation)    5             10          2
PC-game (action)        5             10          1
Productivity            5             10          2
20
Talk outline
Motivation
Stability of the heap
Application to bug-finding
Related work and conclusion
21
Architecture of HeapMD
[Diagram]
Binary instrumenter: input.exe -> output.exe
Calibration: calibration input set -> metric gatherer -> observed metrics -> metric analyzer -> stable metrics and their ranges
  – Goal: identify a subset of observed metrics as stable metrics
Bug-finding: “on the field” input set -> anomaly detector -> metric range violations
  – Goal: check the metrics identified by training for range violation
22
Finding bugs using HeapMD
Key intuition: range violation => likely bug
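The range check itself is small. The sketch below assumes a calibrated range per stable metric and a series of observed values. MetricRange and first_range_violation are hypothetical names. The example range [14.2, 17.2] in the usage note is the one shown for % of vertices with indegree = outdegree on the SPEC slide.

```c
#include <assert.h>
#include <stddef.h>

/* Calibrated range of one stable metric, in percent. */
typedef struct { double lo, hi; } MetricRange;

/* Returns the index of the first sample outside [lo, hi], or -1 if every
 * sample stays within the calibrated range. */
long first_range_violation(const MetricRange *range,
                           const double *samples, size_t n)
{
    assert(range != NULL && range->lo <= range->hi);
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < range->lo || samples[i] > range->hi)
            return (long)i;
    }
    return -1;
}
```

With the range {14.2, 17.2}, observed values 15.0, 16.9, 17.5 are flagged at index 2. A real detector would record diagnostic context at that point, such as the call stack.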
23
Recall our motivating example
    pNewAsset = Initialize(pAssetParams);
    ...
    if (pAssetList->next != NULL) {
        pNewAsset->next = pAssetList->next;
        pAssetList->next = pNewAsset;
    }
[Figure: linked list, with pAssetList and pNewAsset marked]
Violated invariant: only extremal nodes have indegree=1
24
Range violation for this example
Log the call stack for diagnostics
25
Summary of bugs found
Benchmark               # Bugs    # New bugs    # Invariant violations
Multimedia              8         6             3
Interactive web-app     10        6             5
PC-game (simulation)    9         6             2
PC-game (action)        8         8             3
Productivity            5         5             4
Total                   40        31            17
26
Kinds of bugs found (1)
Erroneous insertion into linked list
  – % vertices with indegree = 0, 1
  – % vertices with outdegree = 0, 1
  – % vertices with indegree = outdegree
27
Kinds of bugs found (2)
Shared data structure manipulation errors
[Figure: Delete ??]
  – % vertices with outdegree = 0
  – % vertices with indegree = 2
  – % vertices with indegree = outdegree
28
Kinds of bugs found (3)
Data structure invariants
  – % vertices with outdegree = 1, 2
  – % vertices with indegree = 1, 2
29
Kinds of bugs found (3)
Data structure invariants
  – % vertices with indegree = outdegree
  – % vertices with indegree = 1
See the paper for several more examples
30
Comparison with SWAT [Chilimbi and Hauswirth, ASPLOS’04]
                        SWAT             HeapMD
Benchmark               Leaks    # FP    Leaks    # FP    Other bugs
Multimedia              4        0       2        0       6
Interactive web-app.    9        1       4        0       6
PC-Game (simulation)    4        1       3        0       6
31
Characteristics of bugs found
Systemic bugs: repeated often enough to affect heap-graph metric
HeapMD cannot find “one-off” bugs
  – Temporary data structure invariant violations
32
False positives and negatives
In our experiments: 0 false positives
  – Metrics are computed for the whole heap rather than per data structure
  – Anomaly detector only looks for range violations, not stability
HeapMD cannot find all bugs
  – Bugs with no effect on heap-graph metrics
  – Bugs that affect heap-graph metrics, but do not violate calibrated range
Comes at a cost: false negatives
33
Talk outline
Motivation
Stability of the heap
Application to bug-finding
Related work and conclusion
34
Related work
Tools for special classes of bugs
  – Purify, Valgrind, SWAT, …
  – Not built for detecting invariant violations
Tools to find invariants
  – Daikon [Ernst], DIDUCE [Hangal and Lam, ICSE 2002]
  – HeapMD complements these in the types of invariants found
Shape analysis algorithms and tools
  – Can find invariant violations given a correctness specification
35
Points to take home
Stable heap-graph metrics exist. Their ranges serve as invariants
Range violation => bug likely exercised
Can find non-crashing bugs, e.g., invariant violations
36
HeapMD: Identifying Heap-based Bugs using Anomaly Detection
Trishul M. Chilimbi, Microsoft Research, Redmond, WA, trishulc@microsoft.com
Vinod Ganapathy, University of Wisconsin, Madison, WI, vg@cs.wisc.edu
Questions?
37
Discarded slides
38
Presenting HeapMD
A tool to monitor the health of the heap
39
Presenting HeapMD
A tool to monitor the health of the heap
Key feature: HeapMD can detect bugs that don’t cause crashes
Highlights of results: HeapMD found 40 bugs (31 new) in 5 large commercial applications
Key insight: in practice, the heap has a simple, stable structure
40
Heap-graph metrics
Quantify the graph structure of the heap
Stability of heap will reflect in values of metrics
Calibrate to obtain “normal” range
Bug finding using anomaly detection: violation of “normal” range => likely a bug
41
Gathering metrics: Simple example
[Figure label: Program startup]
42
Metric fluctuation
[Two plots of metric fluctuation]
  – Plot 1: Average = 2.47%, Std. dev. = 24.80
  – Plot 2: Average = -0.10%, Std. dev. = 1.72
Thresholds: Average = ± 1%, Std. dev. = 5