Presentation is loading. Please wait.

Presentation is loading. Please wait.

Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT.

Similar presentations


Presentation on theme: "Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT."— Presentation transcript:

1 Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT Lincoln Laboratory We don't make a lot of the bug detectors you use. We make a lot of the bug detectors you use better. Samuel Z. Guyer Tufts University

2 Example: Dynamic data race detector Thread AThread B write x unlock m lock m write x read x

3 Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race!

4 Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race! T@AT@A T@AT@A T’ @ B T’’ @ A

5 Example: dynamic data race detector Thread AThread B write x unlock m lock m write x read x race! How is this race reported? T@AT@A T@AT@A T’ @ B T’’ @ A

6 Reporting a race Thread AThread B write x unlock m lock m write x read x T@AT@A T@AT@A T’ @ B T’’ @ A loc1 loc2 loc3 race!

7 Reporting a race Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.storeStrings():536

8 Reporting a race Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.storeStrings():536 Problem : not much information

9 Full stack traces Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76...

10 Context sensitivity  Big impact on static analysis  Better information  Better precision  Critical in modern software:  Intensive code reuse (e.g., frameworks)  Many small methods  Highly dynamic behavior What about dynamic analysis?

11 How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76...

12 How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76... EASY Race discovered here

13 How hard is this? Thread AThread B write x read x race! T’ @ B T’’ @ A write x unlock m lock m T@AT@A T@AT@A loc1 loc2 loc3 AbstractDataTreeNode.indexOfChild():426 AbstractDataTreeNode.childAtOrNull():212 DeltaDataTree.lookup():666 ElementTree.includes():528 Workspace.getResourceInfo():1135 Resource.getResourceInfo():973 Project.hasNature():479 JavaProject.hasJavaNature():224 JavaProject.computeExpandedClasspath():430 JavaProject.getExpandedClasspath():1444... EclipseStarter.run():376... AbstractDataTreeNode.storeStrings():536 DataTreeNode.storeStrings():343 AbstractDataTreeNode.storeStrings():541 DataTreeNode.storeStrings():343... ElementTree.shareStrings():706 SaveManager.shareStrings():1154... StringPoolJob.shareStrings():124... Worker.run():76... EASY HARD Previously recorded information

14 Challenge  Many events might need context information e.g., race detector: every read and write (!)  Existing approaches  Walk the stack: up to 100X slowdown  Build calling context tree: 2-3X, plus space Context BUG

15 Goal Compact representation of calling contexts Fast correct execution Print out stack trace when bug detected Efficient context sensitivity for dynamic bug detectors

16 Starting point Represent a calling context in 1 word PCC value Computed online, low overhead <5% BUT, no way to decode a PCC value Probabilistic Calling Context Bond and McKinley OOPSLA 07 ✓ ✓ ✘

17 With PCC: analysis is context sensitive Thread AThread B write x unlock m lock m write x read x race! T@AT@A T@AT@A T’ @ B T’’ @ A pcc1 pcc2 pcc3 0xFE9A651B 0x59C2DF08

18 How PCC works Caller Callee m() k() k(); j(); h(); current PCC callsite ID p’ = f (p, c) p’ = f (p, c) new PCC = ( 3 p + c) mod 2 32 … …… At each call site…

19 Caller Callee m() k() k(); j(); h(); current PCC callsiteID p’ = f (p, c) p’ = f (p, c) new PCC = ( 3 p + c) mod 2 32 … …… p = 0 in main() … p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) p = 0 in main() … p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) How PCC works

20 Breadcrumbs  Problem: decode PCC value Find a sequence of callsite IDs such that p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) i.e., invert the hash function

21 Breadcrumbs  Problem: decode PCC value p Find a sequence of callsite IDs such that p = f(…f( f( f(0, c 0 ), c 1 ), c 2 )…, c n ) i.e., invert the hash function  Key: f is invertible  Given p’ and c unique p such that p’ = f(p, c)  “Peel off” callsites until we reach 0 (main) 3 and 2 32 relatively prime

22 Decode stack trace bottom-up PCC value = 0x5A93CF09

23 Decode stack trace bottom-up PCC value = 0x5A93CF09 PCC value = 0x089C3A02 f -1 ( 0x5A93CF09, g():2)

24 Decode stack trace bottom-up PCC value = 0x5A93CF09 PCC value = 0x0 PCC value = 0x089C3A02 PCC value = 0x59C2DF08 f -1 ( 0x5A93CF09, g():2) f -1 ( 0x5A93CF09, d():9)

25 Decode stack trace bottom-up PCC value = 0x5A93CF09 … PCC value = 0x0 PCC value = 0x089C3A02 PCC value = 0x59C2DF08 f -1 ( 0x5A93CF09, g():2) f -1 ( 0x5A93CF09, d():9) …

26 Problem: blind search of call graph … … PCC value = 0x0 … … … … PCC value = 0x5A93CF09 Need more information

27 Idea: record per-callsite PCC values … … … … … … 0x089C3A02

28 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0

29 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 f -1 ( 0x5A93CF09, g():2) 0x089C3A02

30 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 ✓ ✘ f -1 ( 0x5A93CF09, g():2) 0x089C3A02

31 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02

32 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 ✓ ✘ 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02

33 Very easy search … … … … … … PCC value = 0x5A93CF09 PCC value = 0x0 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02 …

34 antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead With per-callsite sets JikesRVM DaCapo benchmarks # set ops 528m 201m 857m 21m 158m 3,624m 217m 270m 738m 137m PCC only

35 Observation … … … … … …

36 Idea: stop tracking hot call sites … … … … …

37 antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 PCC only Is it enough information? Tunable “hotness” threshold

38 Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09

39 Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09 ✓ f -1 ( 0x5A93CF09, g():2) 0x089C3A02

40 Decoding: hybrid search … … … … … … PCC value = 0x5A93CF09 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02

41 Heuristic search (see paper) … … … … … … PCC value = 0x5A93CF09 0x59C2DF08 f -1 ( 0x5A93CF09, d():9) f -1 ( 0x5A93CF09, g():2) 0x089C3A02

42 antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb 020406080100120140160 % overhead 100 % 47 % 82 % 95 % 100 % 89 % 95 % 97 % Race detection results (go to Pacer talk tomorrow!) t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 geomean

43 Summary  Make any dynamic bug detector context sensitive  More in the paper:  Description of search algorithm  What kinds of bug detectors will benefit  Results for two real bug detectors (both quantitative and qualitative)  Available as patch to JikesRVM

44 Related work  Reconstruct contexts from PC and SP [Mytkowicz et al. 2009] [Inoue and Nakatani 2009] Very low overhead, but little entropy in these values  Path profiling approach [Sumner et al. 2010] Uses multiple integers to represent calling context exactly  Both require offline training, pre-computed info Challenge for complex, highly dynamic software

45 Thank You Questions?

46 Goals Represent calling context compactly Easily take place of static program locations Fast correct execution For deployed or field-testing environment Decode back into stack trace when needed Could expensive, but cost paid offline

47 Calling context representation Calling context stored in 1 word PCC value Essentially a hash of sequence of call site IDs Computed online, low overhead <5% PCC values computed incrementally, at each call site BUT, no way to decode a PCC value Can distinguish, but not identify calling contexts ✓ ✓ ✘ Started with Probabilistic Calling Context Bond and McKinley OOPSLA 07

48 Summary  Make any dynamic bug detector context sensitive  Tunable overhead/precision tradeoff Sweet spot: 10% to 20% overhead at threshold 1,000 to 10,000  Challenges  Long sequences of hot callsites  Deep recursion  Available as patch to JikesRVM

49 antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead Tradeoff: cost vs decoding t = 100,000 t = 100 t = 10,000 No threshold t = 1,000 PCC only

50 100 % antlr chart eclipse fop hsqldb jython luindex pmd xalan jbb geomean 020406080100120140160 % overhead 47 % 82 % 95 % 100 % 89 % 95 % 97 %


Download ppt "Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT."

Similar presentations


Ads by Google