1 HeapMD: Identifying Heap-based Bugs using Anomaly Detection
Trishul M. Chilimbi, Microsoft Research, Redmond, WA, trishulc@microsoft.com
Vinod Ganapathy, University of Wisconsin, Madison, WI, vg@cs.wisc.edu
ASPLOS XII, October 2006, San Jose, California

2 ASPLOS'06 Chilimbi/Ganapathy - HeapMD: Identifying Heap-based Bugs using Anomaly Detection
A motivating example
    pNewAsset = Initialize(pAssetParams);
    ...
    if (pAssetList->next != NULL) {
        pNewAsset->next = pAssetList->next;
        pAssetList->next = pNewAsset;
    }
[Figure: list reached from pAssetList, with pNewAsset being inserted]


4 Noteworthy points
Violation of implicit data-structure invariant
 – Only extremal nodes must have indegree = 1
Malformed, but pointer-correct structure
 – Bug does not necessarily result in a crash
[Figure: the resulting list, with pAssetList and pNewAsset]
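
The invariant on this slide can be made concrete with a small sketch. It assumes the asset list is doubly linked: the `prev` field, the `Asset` type, and every function name below are hypothetical, since the slide shows only the `next` updates. Under that assumption the insertion leaves the new node with a single incoming pointer even though it sits in the interior of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type. The slide shows only `next`; the `prev` field is
   an assumption, chosen so that interior nodes normally have indegree 2. */
typedef struct Asset {
    struct Asset *next;
    struct Asset *prev;
} Asset;

/* Insertion as written on the slide: only `next` pointers are updated,
   so node->prev and the successor's prev are left stale. */
void insert_as_on_slide(Asset *list, Asset *node) {
    assert(list != NULL && node != NULL);
    if (list->next != NULL) {
        node->next = list->next;
        list->next = node;
    }
}

/* Counts the pointers stored in nodes[0..n) that refer to target. */
size_t indegree(const Asset *nodes, size_t n, const Asset *target) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].next == target) count++;
        if (nodes[i].prev == target) count++;
    }
    return count;
}

/* Builds the well-formed list 0 <-> 1 <-> 2, optionally inserts node 3
   after node 0 with the slide's code, and returns how many of the four
   nodes have indegree 1. */
size_t nodes_with_indegree_one(int do_insert) {
    Asset nodes[4] = {{NULL, NULL}};
    nodes[0].next = &nodes[1];
    nodes[1].prev = &nodes[0];
    nodes[1].next = &nodes[2];
    nodes[2].prev = &nodes[1];
    if (do_insert) insert_as_on_slide(&nodes[0], &nodes[3]);

    size_t count = 0;
    for (size_t i = 0; i < 4; i++)
        if (indegree(nodes, 4, &nodes[i]) == 1) count++;
    return count;
}
```

In this sketch the count of indegree-1 nodes rises from 2 (the two ends of the list) to 3 after the insertion, and no invalid pointer is ever dereferenced, which is why such a bug need not crash.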

5 Key challenges
Programmers rarely write down invariants for heap data structures
Bugs may not be immediately apparent
Can we infer invariants for the heap and use them for bug detection?

6 Presenting HeapMD
A tool to monitor the health of the heap
Highlights of results
 – HeapMD found 40 bugs (31 new) in 5 large commercial applications

7 Talk outline
 – Motivation
 – Stability of the heap
 – Application to bug-finding
 – Related work and conclusion

8 Heap data: A programmer’s perspective
1. Invisible → No tangible representation
2. Can be arbitrarily modified → Akin to self-modifying code
3. Can have arbitrary structure → Akin to programming with gotos
If this were true in practice, building large software would be untenable

9 Heap data: The reality
1. Invisible → No tangible representation
2. Can be arbitrarily modified → Often only a small fraction is modified
3. Can have arbitrary structure → In practice, structure is simple
In practice, the heap has a simple, stable structure

10 Stability of pointer-valued locations
[Chart legend]
 – Number of pointer-valued heap locations that store only NULL during their lifetime
 – Number of pointer-valued heap locations that store only one value during their lifetime
 – Number of pointer-valued heap locations that store only two values during their lifetime
 – Number of pointer-valued heap locations that store more than two values during their lifetime
[Chart callout: 91.22%]
Strong evidence that a large fraction of the heap is not modified
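
One way to arrive at the four categories in the legend is to keep a short history per pointer-valued location. The sketch below is illustrative only: the type and function names are hypothetical, and it assumes that NULL stores are ignored when counting distinct values, which the slide does not specify.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ONLY_NULL, ONE_VALUE, TWO_VALUES, MORE_THAN_TWO } StabilityClass;

/* Per-location history: remembers up to two distinct non-NULL values.
   distinct is 0, 1, 2, or 3, where 3 means "more than two". */
typedef struct {
    const void *seen[2];
    int distinct;
} LocationHistory;

/* Called for every store of a pointer value into the location. */
void record_store(LocationHistory *h, const void *value) {
    assert(h != NULL);
    if (value == NULL || h->distinct > 2) return;   /* assumption: NULL not counted */
    for (int i = 0; i < h->distinct; i++)
        if (h->seen[i] == value) return;            /* value already seen */
    if (h->distinct < 2) h->seen[h->distinct] = value;
    h->distinct++;
}

/* Maps the history onto one of the four legend categories. */
StabilityClass classify(const LocationHistory *h) {
    assert(h != NULL);
    switch (h->distinct) {
    case 0:  return ONLY_NULL;
    case 1:  return ONE_VALUE;
    case 2:  return TWO_VALUES;
    default: return MORE_THAN_TWO;
    }
}
```

Keeping only two remembered values bounds the per-location cost, since anything beyond two distinct values lands in the same category.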

11 Simple structure of the heap
Most data structures in large programs have low in- and out-degrees
 – Linked lists
 – Trees
 – Hash tables
Can we quantify the simplicity and stability of the heap?

12 Simple metrics suffice
 – % nodes with indegree = 0 (root nodes)
 – % nodes with outdegree = 0 (leaf nodes)
 – % nodes with indegree = 1
 – % nodes with indegree = 2
 – % nodes with outdegree = 1
 – % nodes with outdegree = 2
 – % nodes with indegree = outdegree
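
The seven metrics listed above can be computed in one pass once each heap-graph vertex has an indegree and an outdegree. The following is a minimal sketch: the array-based representation and all identifiers are assumptions made for illustration, not the tool's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* One index per metric on the slide, in the slide's order. */
enum { M_IN0, M_OUT0, M_IN1, M_IN2, M_OUT1, M_OUT2, M_IN_EQ_OUT, NUM_METRICS };

/* Fills out[] with the percentage of the n vertices satisfying each metric. */
void heap_metrics(const unsigned *indeg, const unsigned *outdeg, size_t n,
                  double out[NUM_METRICS]) {
    assert((indeg != NULL && outdeg != NULL) || n == 0);
    size_t counts[NUM_METRICS] = {0};
    for (size_t i = 0; i < n; i++) {
        if (indeg[i] == 0)  counts[M_IN0]++;
        if (outdeg[i] == 0) counts[M_OUT0]++;
        if (indeg[i] == 1)  counts[M_IN1]++;
        if (indeg[i] == 2)  counts[M_IN2]++;
        if (outdeg[i] == 1) counts[M_OUT1]++;
        if (outdeg[i] == 2) counts[M_OUT2]++;
        if (indeg[i] == outdeg[i]) counts[M_IN_EQ_OUT]++;
    }
    for (int m = 0; m < NUM_METRICS; m++)
        out[m] = n ? 100.0 * (double)counts[m] / (double)n : 0.0;
}
```

For a singly linked list of four nodes, for example, this gives 25% roots, 25% leaves, 75% of nodes with indegree 1, and 50% with indegree equal to outdegree.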

13 Gathering metrics: Setup
Instrument program (x86) to track metrics
Execute instrumented program on a set of inputs
Gather at metric-computation points
 – Function entry points
 – Frequency is tunable: currently 1/100,000
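
A sampled metric-computation point might look like the sketch below. It reads "1/100,000" as one computation per 100,000 function entries, which is an interpretation rather than something the slide states, and the `Sampler` type and `on_function_entry` hook are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed meaning of the slide's "1/100,000": sample every 100,000th entry. */
#define SAMPLE_PERIOD 100000UL

typedef struct {
    unsigned long entries;   /* function entries seen so far */
    unsigned long samples;   /* metric computations triggered so far */
} Sampler;

/* Hypothetical hook run at each instrumented function entry.
   Returns 1 when the metrics should be computed at this entry. */
int on_function_entry(Sampler *s) {
    assert(s != NULL);
    s->entries++;
    if (s->entries % SAMPLE_PERIOD != 0) return 0;
    s->samples++;
    return 1;
}
```

Making `SAMPLE_PERIOD` a parameter would correspond to the "tunable" frequency mentioned on the slide.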

14 [Figure: plots of % indegree = outdegree, % indegree = 2, and % outdegree = 0 (leaves)]

15 Algorithm to find stable metrics
[Flowchart]
Program + Calibration inputs → Compute metrics → Metric reports → Determine stability
Is stable for 40% inputs?
 – No: Metric unstable (discard)
 – Yes: Find range of metric → Stable metrics and their ranges
Other 60%: Check for range violation
 – Can potentially find bugs exercised by calibration inputs
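
The decision in the flowchart can be sketched for a single metric as follows. The sketch assumes that each calibration run has already been summarized as stable or not, together with the minimum and maximum value observed; the struct layout and the name `calibrate` are hypothetical, and the 40% threshold is taken from the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of one metric on one calibration input (hypothetical layout). */
typedef struct {
    int stable_on_input;   /* 1 if the metric was stable during this run */
    double min, max;       /* smallest and largest value observed in the run */
} InputSummary;

typedef struct {
    int is_stable;         /* 1 if stable on at least 40% of the inputs */
    double lo, hi;         /* calibrated range, taken over the stable runs */
} MetricRange;

MetricRange calibrate(const InputSummary *runs, size_t n) {
    assert(runs != NULL || n == 0);
    MetricRange r = {0, 0.0, 0.0};
    size_t stable_runs = 0;
    for (size_t i = 0; i < n; i++) {
        if (!runs[i].stable_on_input) continue;
        if (stable_runs == 0 || runs[i].min < r.lo) r.lo = runs[i].min;
        if (stable_runs == 0 || runs[i].max > r.hi) r.hi = runs[i].max;
        stable_runs++;
    }
    /* Integer form of stable_runs / n >= 40%. */
    r.is_stable = n > 0 && stable_runs * 100 >= n * 40;
    return r;
}
```

A metric that fails the 40% test is discarded. For one that passes, the remaining calibration runs can be checked against `[lo, hi]`, matching the "Other 60%" branch of the flowchart.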

16 Stable metrics exist: SPEC
SPEC b/m    # Inputs    # Stable
twolf       3           6
crafty      3           2
mcf         3           4
vpr         6           1
vortex      5           1
gzip        100         2
parser      100         3
gcc         100         2
[Callout: % of vertices with Indegree = Outdegree, Range = [14.2%, 17.2%]]

17 Stable metrics exist: Commercial
Benchmark                # Inputs    # Stable
Multimedia               50          2
Interactive web-app.     50          2
PC-game (simulation)     50          2
PC-game (action)         50          1
Productivity             50          2

18 Extending the observation
Several degree-based metrics of the heap-graph remain stable as the heap evolves
In fact, we observed that …
Stable heap metrics exist even across different development versions of a program

19 Metric stability across versions
Benchmark                # Versions    # Inputs    # Stable
Multimedia               5             10          2
Interactive web-app.     5             10          2
PC-game (simulation)     5             10          2
PC-game (action)         5             10          1
Productivity             5             10          2

20 Talk outline
 – Motivation
 – Stability of the heap
 – Application to bug-finding
 – Related work and conclusion

21 Architecture of HeapMD
[Diagram]
input.exe → Binary instrumenter → output.exe
Calibration: Calibration input set → Metric gatherer → Observed metrics → Metric analyzer → Stable metrics and their ranges
 – Goal: Identify a subset of observed metrics as stable metrics
Bug-finding: “On the field” input set → Anomaly detector → Metric range violations
 – Goal: Check the metrics identified by training for range violation

22 Finding bugs using HeapMD
Key intuition: Range violation → Likely bug
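
The key intuition reduces to a range check on each sampled value of a stable metric. The sketch below is an assumed shape for that check, not the tool's anomaly detector; `Range` and `count_range_violations` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Calibrated range of one stable metric, in percent. */
typedef struct {
    double lo, hi;
} Range;

/* Returns how many samples fall outside [r.lo, r.hi]. If any do and
   first_violation is non-NULL, stores the index of the first one there. */
size_t count_range_violations(const double *samples, size_t n, Range r,
                              size_t *first_violation) {
    assert(samples != NULL || n == 0);
    size_t violations = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < r.lo || samples[i] > r.hi) {
            if (violations == 0 && first_violation != NULL)
                *first_violation = i;
            violations++;
        }
    }
    return violations;
}
```

The index of the first violation marks the metric-computation point at which a report would be raised, which is where diagnostic context is most useful.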

23 Recall our motivating example
    pNewAsset = Initialize(pAssetParams);
    ...
    if (pAssetList->next != NULL) {
        pNewAsset->next = pAssetList->next;
        pAssetList->next = pNewAsset;
    }
[Figure: list reached from pAssetList, with pNewAsset inserted]
Violated invariant: Only extremal nodes have indegree = 1

24 Range violation for this example
Log the call stack for diagnostics

25 Summary of bugs found
Benchmark                # Bugs    # New bugs    # Invariant violations
Multimedia               8         6             3
Interactive web-app      10        6             5
PC-game (simulation)     9         6             2
PC-game (action)         8         8             3
Productivity             5         5             4
Total                    40        31            17

26 Kinds of bugs found (1)
Erroneous insertion into linked list
 – % vertices with indegree = 0, 1
 – % vertices with outdegree = 0, 1
 – % vertices with indegree = outdegree

27 Kinds of bugs found (2)
Shared data structure manipulation errors
[Figure annotation: Delete ??]
 – % vertices with outdegree = 0
 – % vertices with indegree = 2
 – % vertices with indegree = outdegree

28 Kinds of bugs found (3)
Data structure invariants
 – % vertices with outdegree = 1, 2
 – % vertices with indegree = 1, 2

29 Kinds of bugs found (3)
Data structure invariants
 – % vertices with indegree = outdegree
 – % vertices with indegree = 1
See the paper for several more examples

30 Comparison with SWAT [Chilimbi and Hauswirth, ASPLOS’04]
                         SWAT            HeapMD
Benchmark                Leaks    # FP   Leaks    # FP    Other bugs
Multimedia               4        0      2        0       6
Interactive web-app.     9        1      4        0       6
PC-Game (simulation)     4        1      3        0       6

31 Characteristics of bugs found
Systemic bugs: Repeated often enough to affect heap-graph metric
HeapMD cannot find “one-off” bugs
 – Temporary data structure invariant violations

32 False positives and negatives
In our experiments: 0 false positives
 – Metrics are computed for the whole heap rather than per data structure
 – Anomaly detector only looks for range violations, not stability
HeapMD cannot find all bugs
 – Bugs with no effect on heap-graph metrics
 – Bugs that affect heap-graph metrics, but do not violate calibrated range
Comes at a cost: False negatives

33 Talk outline
 – Motivation
 – Stability of the heap
 – Application to bug-finding
 – Related work and conclusion

34 Related work
Tools for special classes of bugs
 – Purify, Valgrind, SWAT, …
 – Not built for detecting invariant violations
Tools to find invariants
 – Daikon [Ernst], DIDUCE [Hangal and Lam, ICSE 2002]
 – HeapMD complements these in the types of invariants found
Shape analysis algorithms and tools
 – Can find invariant violations given a correctness specification

35 Points to take home
Stable heap-graph metrics exist. Their ranges serve as invariants
Range violation → Bug likely exercised
Can find non-crashing bugs, e.g., invariant violations

36 HeapMD: Identifying Heap-based Bugs using Anomaly Detection
Trishul M. Chilimbi, Microsoft Research, Redmond, WA, trishulc@microsoft.com
Vinod Ganapathy, University of Wisconsin, Madison, WI, vg@cs.wisc.edu
Questions?

37 Discarded slides

38 Presenting HeapMD
A tool to monitor the health of the heap

39 Presenting HeapMD
A tool to monitor the health of the heap
Key Feature
 – HeapMD can detect bugs that don’t cause crashes
Highlights of results
 – HeapMD found 40 bugs (31 new) in 5 large commercial applications
Key Insight
 – In practice, the heap has a simple, stable structure

40 Heap-graph metrics
Quantify the graph structure of the heap
Stability of heap will reflect in values of metrics
Calibrate to obtain “normal” range
Bug finding using anomaly detection: Violation of “normal” range → Likely a bug

41 Gathering metrics: Simple example
Program startup

42 Metric fluctuation
[Figure: two metric-fluctuation plots]
 – Average = 2.47%, Std. dev. = 24.80
 – Average = -0.10%, Std. dev. = 1.72
Thresholds: Average = ± 1%, Std. dev. = 5
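
The thresholds on this slide suggest a per-input stability test along the following lines. The sketch assumes that "fluctuation" means the percentage change between consecutive metric samples, which the slide does not define, and the function name is hypothetical. It compares the variance against the squared threshold so that no square root is needed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ABS_MEAN 1.0   /* threshold on the average change, in percent */
#define MAX_STD_DEV  5.0   /* threshold on the standard deviation */

/* Returns 1 if the metric's sample-to-sample percentage change has an
   average within +/-1% and a standard deviation of at most 5. */
int is_stable_on_input(const double *samples, size_t n) {
    assert(samples != NULL || n == 0);
    if (n < 3) return 0;                       /* need at least two changes */
    double sum = 0.0, sum_sq = 0.0;
    for (size_t i = 1; i < n; i++) {
        if (samples[i - 1] == 0.0) return 0;   /* percent change undefined */
        double change = 100.0 * (samples[i] - samples[i - 1]) / samples[i - 1];
        sum += change;
        sum_sq += change * change;
    }
    double k = (double)(n - 1);
    double mean = sum / k;
    double variance = sum_sq / k - mean * mean;   /* population variance */
    return mean >= -MAX_ABS_MEAN && mean <= MAX_ABS_MEAN
        && variance <= MAX_STD_DEV * MAX_STD_DEV;
}
```

Read this way, the first plot on the slide (average 2.47%, standard deviation 24.80) would fail both thresholds, while the second (average -0.10%, standard deviation 1.72) would pass.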

