Download presentation
Presentation is loading. Please wait.
Published byArnold Allen Modified over 9 years ago
1
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT Lincoln Laboratory We don't make a lot of the bug detectors you use. We make a lot of the bug detectors you use better. Samuel Z. Guyer Tufts University
2
Example: Dynamic data race detector Thread AThread B write x unlock m lock m write x read x
3
Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race!
4
Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race! T@AT@A T@AT@A T’ @ B T’’ @ A
5
Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race! How is this race reported? T@AT@A T@AT@A T’ @ B T’’ @ A
6
Reporting a race Thread AThread B write x unlock m lock m write x read x T@AT@A T@AT@A T’ @ B T’’ @ A loc1 loc2 loc3 race!
7
Reporting a race Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.storeStrings():536
8
Reporting a race Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.storeStrings():536 Problem : not much information
9
Full stack traces Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76...
10
Context sensitivity Big impact on static analysis Better information Better precision Critical in modern software: Intensive code reuse (e.g., frameworks) Many small methods Highly dynamic behavior What about dynamic analysis?
11
How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76...
12
How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76... EASY Race discovered here
13
How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76... EASY HARD Previously recorded information
14
Challenge Many events might need context information e.g., race detector: every read and write (!) Existing approaches Walk the stack: up to 100X slowdown Build calling context tree: 2-3X, plus space Context BUG
15
Goal Compact representation of calling contexts Fast correct execution Print out stack trace when bug detected Efficient context sensitivity for dynamic bug detectors
16
Starting point Represent a calling context in 1 word PCC value Computed online, low overhead <5% BUT, no way to decode a PCC value Probabilistic Calling Context Bond and McKinley OOPSLA 07 ✓ ✓ ✘
17
With PCC: analysis is context sensitive Thread AThread B write x unlock m lock m write x read x race! T@AT@A T@AT@A T’ @ B T’’ @ A pcc1 pcc2 pcc3 0xFE9A651B 0x59C2DF08
18
How PCC works Caller Callee m() k() k(); j(); h(); current PCC callsite ID p’ = f (p, c) p’ = f (p, c) new PCC = ( 3 p + c) mod 2 32 … …… At each call site…
19
Caller Callee m() k() k(); j(); h(); current PCC callsiteID p’ = f (p, c) p’ = f (p, c) new PCC = ( 3 p + c) mod 2 32 … …… p = 0 in main() … p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) p = 0 in main() … p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) How PCC works
20
Breadcrumbs Problem: decode PCC value Find a sequence of callsite IDs such that p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) i.e., invert the hash function
21
Breadcrumbs Problem: decode PCC value p Find a sequence of callsite IDs such that p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) i.e., invert the hash function Key: f is invertible Given p’ and c unique p such that p’ = f(p, c) “Peel off” callsites until we reach 0 (main) 3 and 2 32 relatively prime
22
Decode stack trace bottom-up PCC value = 0x5A93CF09
23
Decode stack trace bottom-up PCC value = 0x5A93CF09 PCC value = 0x089C3A02 f -1 ( 0x5A93CF09, g():2)
24
Decode stack trace bottom-up PCC value = 0x5A93CF09 PCC value = 0x0 PCC value = 0x089C3A02 PCC value = 0x59C2DF08 f -1 ( 0x5A93CF09, g():2) f -1 ( 0x5A93CF09, d():9)
25
Decode stack trace bottom-up PCC value = 0x5A93CF09 … PCC value = 0x0 PCC value = 0x089C3A02 PCC value = 0x59C2DF08 f -1 ( 0x5A93CF09, g():2) f -1 ( 0x5A93CF09, d():9) …
26
Problem: blind search of call graph … … PCC value = 0x0 … … … … PCC value = 0x5A93CF09 Need more information
27
Idea: record per-callsite PCC values … … … … … … 0x089C3A02
28
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0
29
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 f -1 ( 0x5A93CF09, g():2) 0x089C3A02
30
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 ✓ ✘ f -1 ( 0x5A93CF09, g():2) 0x089C3A02
31
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02
32
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 ✓ ✘ 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02
33
Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02 …
34
antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead With per-callsite sets JikesRVM DaCapo benchmarks # set ops 528m 201m 857m 21m 158m 3,624m 217m 270m 738m 137m PCC only
35
Observation … … … … … …
36
Idea: stop tracking hot call sites … … … … …
37
antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 PCC only Is it enough information? Tunable “hotness” threshold
38
Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09
39
Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09 ✓ f -1 ( 0x5A93CF09, g():2) 0x089C3A02
40
Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02
41
Heuristic search (see paper) … … … … … … PCC value = 0x5A93CF09 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02
42
antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb 020406080100120140160 % overhead 100 % 47 % 82 % 95 % 100 % 89 % 95 % 97 % Race detection results (go to Pacer talk tomorrow!) t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 geomean
43
Summary Make any dynamic bug detector context sensitive More in the paper: Description of search algorithm What kinds of bug detectors will benefit Results for two real bug detectors (both quantitative and qualitative) Available as patch to JikesRVM
44
Related work Reconstruct contexts from PC and SP [Mytkowicz et al. 2009] [Inoue and Nakatani 2009] Very low overhead, but little entropy in these values Path profiling approach [Sumner et al. 2010] Uses multiple integers to represent calling context exactly Both require offline training, pre-computed info Challenge for complex, highly dynamic software
45
Thank You Questions?
46
Goals Represent calling context compactly Easily take place of static program locations Fast correct execution For deployed or field-testing environment Decode back into stack trace when needed Could expensive, but cost paid offline
47
Calling context representation Calling context stored in 1 word PCC value Essentially a hash of sequence of call site IDs Computed online, low overhead <5% PCC values computed incrementally, at each call site BUT, no way to decode a PCC value Can distinguish, but not identify calling contexts ✓ ✓ ✘ Started with Probabilistic Calling Context Bond and McKinley OOPSLA 07
48
Summary Make any dynamic bug detector context sensitive Tunable overhead/precision tradeoff Sweet spot: 10% to 20% overhead at threshold 1,000 to 10,000 Challenges Long sequences of hot callsites Deep recursion Available as patch to JikesRVM
49
antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead Tradeoff: cost vs decoding t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 PCC only
50
100 % antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead 47 % 82 % 95 % 100 % 89 % 95 % 97 %
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.