A Plethora of Paths. Eric Larson, Seattle University, May 18, 2009.


1 A Plethora of Paths
Eric Larson
May 18, 2009
Seattle University

2 Paths
Paths are commonly used in static analysis techniques.
Symbolic path simulation: simulate each path with symbolic data values.
Issues:
  - Path explosion
  - Illegal paths
[Figure: example control flow graph with nodes A through G]

3 Format of Talk
Research Questions
Implementation
  - Analysis framework
  - Program slicing
  - Path counting algorithm
  - Shortcomings
Results
  - Quantitative
  - Qualitative
Conclusion
  - Answers to the research questions
  - Future work

4 Research Questions: Single Run / Individual Operations
1. When employing high-quality static software bug detection techniques, is it better to analyze the entire program in a single run or to look at dangerous operations individually?
High-quality static software bug detection techniques:
  - catch most (ideally all) bugs
  - produce few (ideally no) false bug reports
Dangerous operation: any operation that needs to be checked for potential errors.
  - In this study, we consider operations that access memory to be dangerous operations.

5 Single Run / Individual Operations: Tradeoffs
Entire program:
  - Only one run
  - Most of the program is relevant
  - Big-O: 2^n
Individual operations:
  - Many runs
  - More of the program is irrelevant (can be ignored)
  - Big-O: s x 2^m
Key question: To what extent is m < n?
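As a rough illustration of the tradeoff on this slide, the sketch below compares the two cost models. It reads n as the number of branch points that matter to a whole-program run, m as the number that matter to one individual run, and s as the number of individual runs. That reading of the symbols, the function names, and the sample numbers are assumptions; the slide gives only the Big-O expressions.

```python
import math


def whole_program_cost(n: int) -> int:
    """Cost model for a single whole-program run: 2^n paths."""
    return 2 ** n


def individual_runs_cost(s: int, m: int) -> int:
    """Cost model for s individual runs with 2^m paths each."""
    return s * 2 ** m


def individual_runs_cheaper(n: int, s: int, m: int) -> bool:
    """True when s * 2^m < 2^n, which is the same as m < n - log2(s)."""
    return individual_runs_cost(s, m) < whole_program_cost(n)


# Hypothetical numbers: 40 relevant branch points overall, 1,000 dangerous operations.
n, s = 40, 1000
break_even = n - math.log2(s)  # individual runs win only if m is below this value
```

Under this model, individual runs pay off only when slicing removes at least log2(s) branch points from a typical run, which is one way to read the slide's key question.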

6 Research Questions: Program Slicing
2. What is the effectiveness of program slicing in reducing the number of paths?
Program slicing removes statements not relevant to the property.
Obtain path counts with different slicing criteria:
  - all statements (no slicing)
  - all dangerous statements
  - all dangerous statements within a function
  - one individual dangerous statement

7 Research Questions: Path Explosion
3. What types of tasks lead to path explosion? Is slicing more or less effective on particular tasks?
Quantitative and qualitative analysis across 15 different programs.

8 Analysis Framework
Uses a modified version of SUDS (SCAM 2007)
  - Operates on the whole program
  - Analyzes programs written in C
1. Performs traditional analyses
  - Simplification
  - Control flow graph / call graph
  - Pointer analysis (flow-sensitive)
  - Data flow analysis
2. Program slicing (next slide)
3. Path counting (slide after next)

9 Program Slicing
Backwards, context-insensitive slicing algorithm
  - Prevents the slice from propagating into a function that is clearly not in the slice
Indirect uses from control statements are not part of the slice
  - Path counting will follow both directions regardless of condition
No attempt to make the slice executable
  - Used for analysis only
Slicing criterion varies by experiment:
  - No slicing
  - All dangerous statements
  - All dangerous statements in a function
  - One dangerous statement
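The idea of a backward slice can be sketched as a worklist traversal over data dependences. This is a minimal illustration, not the SUDS algorithm: the flat `depends_on` map, the statement ids, and the five-statement program are all hypothetical, and the function boundary handling mentioned on the slide is not modeled.

```python
from collections import deque


def backward_slice(depends_on, criterion):
    """Return every statement the criterion statements transitively depend on.

    depends_on maps a statement id to the ids of statements whose values it
    uses. Only data dependences are followed; control dependences are left
    out, in the spirit of the slide's note about control statements.
    """
    in_slice = set(criterion)
    worklist = deque(criterion)
    while worklist:
        stmt = worklist.popleft()
        for dep in depends_on.get(stmt, ()):
            if dep not in in_slice:
                in_slice.add(dep)
                worklist.append(dep)
    return in_slice


# Hypothetical five-statement program:
#   s1: i = read()      s2: n = read()      s3: msg = format(n)
#   s4: print(msg)      s5: buf[i] = 0      (the dangerous memory access)
deps = {"s3": ["s2"], "s4": ["s3"], "s5": ["s1"]}
slice_for_s5 = backward_slice(deps, ["s5"])
```

With `s5` as the only criterion, the formatting and printing statements fall outside the slice, so any branching they contain would no longer contribute paths.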

10 Path Counting
Control flow graph is collapsed after slicing
Path count is computed interprocedurally
  - Total paths is the sum of each function
Loops introduce two new paths:
  - One for the loop not taken
  - One for the loop taken once
  - Assumes fixed-point analysis summarizes the loop
Goto statements end a path
  - Not too many gotos in the programs used
  - Functions with gotos have a lot of paths even with this simplification
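A per-function path count of this kind can be sketched as a memoized traversal of an acyclic control flow graph. The graph encoding, node names, and example functions below are hypothetical, and the sketch assumes each loop has already been rewritten as a two-way choice (skip the body, or run it once), so no back edges remain. Goto handling is not modeled.

```python
from functools import lru_cache


def count_paths(cfg, entry, exit_node):
    """Count entry-to-exit paths in an acyclic control flow graph.

    cfg maps each node to a list of its successor nodes.
    """
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == exit_node:
            return 1
        # Each successor contributes its own paths to the exit.
        return sum(paths_from(succ) for succ in cfg.get(node, ()))

    return paths_from(entry)


# Hypothetical function: one if/else followed by one loop.
# The loop at "L" is modeled as: go straight to "X", or run "body" once.
cfg = {
    "A": ["B", "C"],      # if / else
    "B": ["L"],
    "C": ["L"],
    "L": ["X", "body"],   # loop not taken / loop taken once
    "body": ["X"],
}
paths_in_function = count_paths(cfg, "A", "X")

# Program total as the sum over functions; the second (hypothetical)
# function is straight-line code with a single path.
total_paths = paths_in_function + 1
```

Because every two-way choice in sequence doubles the count, a function with k consecutive branches has 2^k paths under this scheme, which is where the very large per-function counts come from.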

11 Shortcomings
Processing of loops and goto statements
Not all paths are equal
  - length of path
  - complexity of state
Intraprocedural path count depends on how the program is divided into functions
Amount of work to reduce the number of paths varies widely
  - Depends on factors such as loop depth

12 Results: Programs Used

Program    Description              Functions  Statements
bc         calculator               105        14,491
betaftpd   file transfer daemon     73         4,791
diff3      compares three files     32         4,016
find       file finder              398        31,098
flex       lexical analyzer         140        22,453
ft         spanning tree            33         1,879
ghttpd     web server               19         2,663
gnuchess   chess game               243        39,443
gzip       compression utility      106        11,380
indent     source code indenter     114        19,605
ks         graph partitioning       16         1,325
othello    othello game             11         1,055
space      specialized interpreter  127        11,652
thttpd     web server               130        12,500
yacr2      channel router           59         5,606

13 Results: Single Run, No Slicing

Program    Total paths  Paths in worst function  Functions with <= 100 paths  Functions with > 100,000 paths
bc         2,653,007    2,144,737 (80.8%)        87 (82.9%)                   3 (2.9%)
betaftpd   68,365       55,297 (80.9%)           66 (90.4%)                   0 (0.0%)
diff3      2,067,345    1,558,324 (75.4%)        23 (71.9%)                   3 (9.4%)
find       22,453,011   21,748,720 (96.9%)       366 (92.0%)                  3 (0.8%)
flex       7.33E+11     7.22E+11 (98.4%)         123 (87.9%)                  7 (5.0%)
ft         10,498       10,082 (96.0%)           31 (93.9%)                   0 (0.0%)
ghttpd     91,580       91,082 (99.5%)           16 (84.2%)                   0 (0.0%)
gnuchess   2.35E+16     2.32E+16 (98.9%)         202 (83.1%)                  12 (4.9%)
gzip       3.49E+11     3.44E+11 (98.8%)         80 (75.5%)                   9 (8.5%)
indent     2.12E+17     2.12E+17 (100.0%)        94 (82.5%)                   7 (6.1%)
ks         25,371       23,100 (91.0%)           14 (87.5%)                   0 (0.0%)
othello    42,802       25,057 (58.5%)           6 (54.5%)                    0 (0.0%)
space      5,853        3,900 (66.6%)            123 (96.9%)                  0 (0.0%)
thttpd     1.57E+14     1.57E+14 (100.0%)        108 (83.1%)                  3 (2.3%)
yacr2      3,666,900    2,991,744 (81.6%)        40 (67.8%)                   2 (3.4%)

14 Results: Single Run, Slicing

Program    Total paths  % Decr.  Paths in worst function  Funcs with <= 100 paths  Funcs with > 100,000 paths
bc         2,268,432    14.5%    2,144,736 (94.5%)        91 (86.7%)               1 (1.0%)
betaftpd   5,212        92.4%    1,980 (38.0%)            70 (95.9%)               0 (0.0%)
diff3      40,423       98.0%    20,412 (50.5%)           26 (81.3%)               0 (0.0%)
find       4,146,604    81.5%    4,057,361 (97.8%)        382 (96.0%)              1 (0.3%)
flex       7.22E+11     1.6%     7.22E+11 (100.0%)        128 (91.4%)              4 (2.9%)
ft         257          97.6%    194 (75.5%)              32 (97.0%)               0 (0.0%)
ghttpd     2,701        97.1%    2,520 (93.3%)            18 (94.7%)               0 (0.0%)
gnuchess   3.41E+14     98.5%    2.66E+14 (77.9%)         214 (88.1%)              11 (4.5%)
gzip       8.26E+08     99.8%    8.26E+08 (100.0%)        91 (85.8%)               1 (0.9%)
indent     8.00E+13     100%     8.00E+13 (100.0%)        96 (84.2%)               6 (5.3%)
ks         1,519        94.0%    1,400 (92.2%)            15 (93.8%)               0 (0.0%)
othello    3,462        91.9%    3,249 (93.8%)            10 (90.9%)               0 (0.0%)
space      1,892        67.7%    346 (18.3%)              124 (97.6%)              0 (0.0%)
thttpd     4.19E+12     97.3%    4.19E+12 (100.0%)        111 (85.4%)              2 (1.5%)
yacr2      287,639      92.2%    259,328 (90.2%)          46 (78.0%)               1 (1.7%)

15 Results: Individual Statement Runs
One run for each dangerous operation.
The runs are sorted by the number of paths from smallest to largest.
Graphs show the cumulative percentage of runs that have fewer than n paths.
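The cumulative curve described on this slide can be computed along the following lines. This is a minimal sketch: the function name, the thresholds, and the sample path counts are hypothetical and are not data from the talk.

```python
from bisect import bisect_left


def cumulative_percentages(path_counts, thresholds):
    """For each threshold n, return the percentage of runs with fewer than n paths."""
    ordered = sorted(path_counts)  # runs sorted from smallest to largest
    total = len(ordered)
    return {
        # bisect_left gives the number of runs strictly below n
        n: 100.0 * bisect_left(ordered, n) / total
        for n in thresholds
    }


# Hypothetical path counts, one per dangerous operation.
runs = [3, 8, 12, 40, 95, 150, 2_000, 90_000]
curve = cumulative_percentages(runs, [10, 100, 1_000, 100_000])
```

Plotting `curve` with a logarithmic x-axis would give one line per program in the style the slide describes.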

16 Results: Individual Statement Runs

17 Results: Individual Function Runs

18 Results: Worst Case Comparison

Program    Total paths       Worst-case run total paths  Worst-case run total paths
           (slicing - all)   (slicing - stmt)            (slicing - func)
bc         2,268,432         617,992                     1,106,152
betaftpd   5,212             615                         2,341
diff3      40,423            3,256                       20,788
find       4,146,604         171,394                     4,058,603
flex       7.22E+11          6.44E+10 *
ft         257               244 *
ghttpd     2,701             132                         1,614
gnuchess   3.41E+14          1.11E+13                    2.66E+14
gzip       8.26E+08          6.19E+08 *
indent     8.00E+13          7.36E+12                    4.99E+13
ks         1,519             76                          1,467
othello    3,462             3,286                       3,290
space      1,892             1,231 *
thttpd     4.19E+12          9.04E+08                    1.86E+11
yacr2      287,639           818                         259,518

* Only one worst-case value appears for this program; which of the two worst-case columns it belongs to is unclear.

19 Qualitative Analysis
Look deeper at each program
  - What tasks lead to path explosion?
  - What does slicing do?
Example analysis: find
  - Function quotearg_buffer_restyled has the most paths (21 million)
  - Modifies and buffers a string
  - Many options and special character processing
  - After slicing, 4 million paths remain
  - Function consider_visiting has the second most paths
  - Individual runs are effective for operations not in either of the above two functions
See the paper for analysis of the other 14 programs.

20 Qualitative Analysis
Common tasks for path explosion:
  - Input processing functions (often not sliced away)
  - Parsing functions (often not sliced away)
  - Stylized output functions (often sliced away)
Other program-specific tasks suffered from path explosion:
  - divide in bc
  - finite state automata conversion in flex
  - finding the best move in gnuchess

21 Conclusions
1. When employing high-quality static software bug detection techniques, is it better to attempt to use the entire program in a single run or to look at dangerous operations individually?
Worst case individual run ≈ single run
  - But there are exceptions
Individual runs were effective for many operations
  - Especially those that were not from a function that suffered from path explosion

22 Conclusion
2. What is the effectiveness of program slicing in reducing the number of paths?
Slicing did reduce the number of paths, but not enough in the worst cases of path explosion.
3. What types of tasks lead to path explosion? Is slicing more or less effective on particular tasks?
Input processing, parsing, and stylized output functions often suffered from path explosion.
Path explosion still existed in these functions after slicing.
Slicing was helpful for stylized output functions since little to no code was dependent on their results.

23 Future Work
Use the results to improve static bug detection:
  - Look at task-specific techniques to address path explosion
  - Incorporate some level of guidance from the user
Extend the study
  - Address shortcomings: loops, interprocedural analysis
  - Programs in different languages

24 Questions

