CS4723 Software Engineering Lecture 10 Debugging and Fault Localization.

2 Debugging
- What we do when testing finds a bug
- Basic process
  - Reproduce the bug
  - Locate the fault
  - Fix it

3 Debugging
- Sometimes the software is too large to debug directly
- Before we can fix the bug
  - Narrow down the relevant input
    - Delta debugging
  - Narrow down the relevant code
    - Statistical debugging
    - Dynamic slicing

4 Debugging
- The inputs can be very complex
  - Quite common in the real world (compilers, office suites, browsers, databases, operating systems, …)
- It is important to locate just the relevant inputs
  - Shortens the execution that has to be debugged
  - Filters out the noise
  - Makes the root cause of the bug easier to identify

5 Consider Mozilla Firefox
- Takes HTML pages as input
- A large number of bugs are related to loading certain HTML pages
  - Corner cases in HTML syntax
  - Incompatibility between browsers
  - Corner cases in JavaScript, CSS, …
  - Error handling for incorrect HTML, JavaScript, CSS, …

6 How do we go from this
All Windows 3.1 Windows 95<OPTION VALUE="Windows 98">Windows 98 Windows ME Windows 2000 Windows NT Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5 Mac System 7.6.1 Mac System 8.0 Mac System 8.5 Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x MacOS X Linux<OPTION VALUE="BSDI">BSDI FreeBSD NetBSD<OPTION VALUE="OpenBSD">OpenBSD AIX BeOS HPUX<OPTION VALUE="IRIX">IRIX Neutrino OpenVMS<OPTION VALUE="OS/2">OS/2 OSF/1 Solaris<OPTION VALUE="SunOS">SunOS other
-- P1 P2 P3<OPTION VALUE="P4">P4 P5
blocker critical major<OPTION VALUE="normal">normal minor trivial<OPTION VALUE="enhancement">enhancement<

7 To this…

8 Motivation
- Turn bug reports that contain real web pages into minimized test cases
  - The minimized test case must still reveal the bug
- Benefits of simplification
  - Easier to communicate
  - Removes duplicates
  - Easier debugging
    - Involves less potentially buggy code
    - Shorter execution time

9 Delta Debugging
- Problem definition
  - A program exhibits an error for an input
  - The input is a set of elements
    - E.g., a sequence of API calls, a text file, a serialized object, …
- Problem:
  - Find a smaller subset of the input that still causes the failure

10 A generic algorithm
- How do people handle this problem?
- Binary search
  - Cut the input into halves
  - Try to reproduce the bug
  - Iterate

11 Delta Debugging Version 1
- The set of elements in the bug-revealing input is I
- Assumptions
  - Each subset of I is a valid input: each subset of I -> success / fail
  - A single input element E causes the failure
  - E causes the failure in every case, i.e. when combined with any other elements (monotonicity)

12 Solution is simple
- Follow the binary search process
- Throw away half of the input elements if the remaining input elements still cause the failure

13 Solution is simple
- Follow the binary search process
- Throw away half of the input elements if the remaining input elements still cause the failure
- When a single element remains, we are done!
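
The binary search process can be sketched in Python as follows. This is only an illustration under the Version 1 assumptions (every subset is a valid input, and exactly one element causes the failure no matter what it is combined with). The function name `minimize_single` and the `fails` oracle are hypothetical names introduced here, not part of the lecture.

```python
from typing import Callable, List, Sequence


def minimize_single(elements: Sequence[int],
                    fails: Callable[[List[int]], bool]) -> List[int]:
    """Narrow a failing input down to one element by binary search.

    Assumes (Version 1) that a single element causes the failure and
    that it does so in combination with any other elements.
    """
    current = list(elements)
    if not fails(current):
        raise ValueError("the full input must reproduce the failure")
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        # Under the assumptions exactly one half still fails,
        # so the other half can be thrown away.
        current = first if fails(first) else second
    return current


# Hypothetical oracle: the run fails whenever element 6 is present.
print(minimize_single(range(1, 9), lambda subset: 6 in subset))  # [6]
```

Each iteration runs the oracle once and halves the input, so an input of n elements needs about log2(n) test runs.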

14 Example

15 Delta Debugging Version 1
- This is just binary search: easy to automate
- The assumptions do not always hold
- Let's look at the assumptions:
  - (I1 ∪ I2) = fail -> (I1 = pass and I2 = fail) or (I1 = fail and I2 = pass)
- It is interesting to see what happens when this is not the case

16 Case I: multiple failing branches
- What happens if I1 = fail and I2 = fail?
  - A subset of I1 fails and a subset of I2 also fails
  - We can simply continue to search both I1 and I2
  - We then find two failure-causing elements
  - They may or may not be due to the same bug

17 Case II: Interference
- What happens if I1 = pass and I2 = pass?
  - This means that a subset of I1 and a subset of I2 cause the failure only when they are combined
  - This is called interference

18 Handling Interference
- The cute trick
  - Consider I1 = pass and I2 = pass, but I1 ∪ I2 = fail
  - An element D1 in I1 and an element D2 in I2 together cause the failure
  - Do a binary search in I2 while keeping all of I1
    - Split I2 into P1 and P2, then try I1 ∪ P1 and I1 ∪ P2
    - Continue until you find D2 such that I1 ∪ D2 causes the failure
  - Then do a binary search in I1 while keeping D2, until D1 is found
  - Return D1 ∪ D2

19 Example I: Handle interference
Consider 8 input elements, of which 3 and 7 cause the failure when they are applied together.

Configuration      Result
1 2 3 4            pass
5 6 7 8            pass    Interference!
1 2 3 4 5 6        pass
1 2 3 4 7 8        fail
1 2 3 4 7          fail
1 2 7              pass
3 4 7              fail
3 7                fail

20 Example II: Handle multiple interference
Consider 8 input elements, of which 3, 5 and 7 cause the failure when they are applied together.

Configuration      Result
1 2 3 4            pass
5 6 7 8            pass    Interference!
1 2 3 4 5 6        pass
1 2 3 4 7 8        pass    Second interference! What to do? Go on with I1 ∪ P1!
1 2 3 4 5 6 7      fail
1 2 3 4 5 7        fail
1 2 5 7            pass
3 4 5 7            fail
3 5 7              fail

21 Delta Debugging Version 2
- The set of elements in the bug-revealing input is I
- New assumptions
  - Each subset of I is a valid input
  - A subset of input elements E causes the failure
  - E causes the failure in every case, i.e. when combined with any other elements

22 Delta Debugging Version 2
- Algorithm
  - Split I into I1 and I2
  - Case I: I1 = fail and I2 = pass
    - Try I1
  - Case I: I1 = pass and I2 = fail
    - Try I2
  - Case I: I1 = fail and I2 = fail
    - Try both I1 and I2
  - Case II: I1 = pass and I2 = pass
    - Handle interference for I1 and I2
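
The Version 2 algorithm can be sketched as a short recursive function. This is a hedged illustration, not the lecture's own code: it follows the usual recursive formulation of delta debugging, the names `dd`, `candidates`, `fixed` and `fails` are hypothetical, and as a simplification it follows only the first failing half when both halves fail (the slide suggests searching both).

```python
from typing import Callable, List


def dd(candidates: List[int], fixed: List[int],
       fails: Callable[[List[int]], bool]) -> List[int]:
    """Return failure-causing elements among `candidates`.

    `fixed` holds elements that are always kept in the tested input;
    this is how interference between the two halves is handled.
    """
    if len(candidates) == 1:
        return list(candidates)
    mid = len(candidates) // 2
    i1, i2 = candidates[:mid], candidates[mid:]
    if fails(i1 + fixed):          # Case I: the first half fails
        return dd(i1, fixed, fails)
    if fails(i2 + fixed):          # Case I: the second half fails
        return dd(i2, fixed, fails)
    # Case II: interference. Search each half while keeping the other.
    return dd(i1, fixed + i2, fails) + dd(i2, fixed + i1, fails)


def needs(*culprits: int) -> Callable[[List[int]], bool]:
    """Hypothetical oracle: fail only if all culprits are present."""
    return lambda subset: set(culprits) <= set(subset)


elements = list(range(1, 9))
print(dd(elements, [], needs(3, 7)))      # [3, 7]
print(dd(elements, [], needs(3, 5, 7)))   # [3, 5, 7]
```

The two calls correspond to the two interference examples with 8 input elements.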

23 Real example: GNU Compiler
- This input program (bug.c) causes GCC 2.95.2 to crash when all optimizations are enabled
- Minimize it to debug GCC
- Consider each character as an element

24 Real example: GNU Compiler
- Our delta debugging process
  - Create the appropriate subset of bug.c
  - Feed it to GCC
  - Continue according to whether GCC crashes

25 GCC compiler example
- The minimized code:
  t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
- The test case is 1-minimal
  - No single character can be removed
  - Even every space has been removed
  - The function name has been shortened from mult to a single t
- GCC was executed 700+ times
- The input was reduced to 10% of the initial input
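
What "1-minimal" means can be checked mechanically: the input fails, and removing any single element makes the failure disappear. The sketch below is an illustration only; `is_one_minimal` and the toy oracle are hypothetical names, and a real check against GCC would need one compiler run per removed character.

```python
from typing import Callable, List


def is_one_minimal(elements: List[int],
                   fails: Callable[[List[int]], bool]) -> bool:
    """True if `elements` fails and no single element can be removed
    without losing the failure."""
    if not fails(elements):
        return False
    for i in range(len(elements)):
        without_i = elements[:i] + elements[i + 1:]
        if fails(without_i):
            return False  # element i was not needed for the failure
    return True


# Hypothetical oracle: the failure needs elements 3 and 7 together.
def fails(subset: List[int]) -> bool:
    return {3, 7} <= set(subset)


print(is_one_minimal([3, 7], fails))     # True
print(is_one_minimal([3, 4, 7], fails))  # False, 4 can be removed
```

Note that 1-minimality is a local property: it does not rule out that removing two or more elements at once would still reproduce the failure.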

26 Another example: GDB
- GDB is the debugger from GNU
- It was updated from 4.16 to 4.17
- Version 4.17 is no longer compatible with DDD (a GUI for GNU software development tools)
- 178,000 lines of code changed from 4.16
- How do we know which code change(s) cause the failure?

27 Results
- After a lot of work (by machine)
  - The 178 KLOC of changes were grouped into 8700 groups (commits)
  - Delta debugging was applied to the groups
  - It was worked out in 470 tests
  - It took 48 hours
- Doing this by hand would be a nightmare!

28 Importance of input elements
- It is important to define input elements well
  - So that subsets of the input elements are valid inputs
  - So that the number of input elements is small
- Consider the examples
  - GCC example: we use characters as elements, which is simple but not ideal. If the bug happens after the parser, the failure is not monotonic because many subsets have syntax errors
  - GDB example: we group the changed lines into groups, which reduces the input size to 5% of the original. Two days is acceptable, but what about 40 days?

29 Limitations of Delta debugging
- Relies on the assumptions
  - Monotonicity does not always hold
- Relies on good input elements
  - Always providing valid inputs improves efficiency
- Requires automatic test oracles
  - Good for regression testing
  - Not good for development-time testing

30 Statistical Debugging
- Delta debugging
  - Narrows down the input to be considered
- Statistical debugging
  - Narrows down the code to be considered

31 Statistical Debugging
- Basic idea
  - Consider a number of test cases, some of which pass and some of which fail
  - If a statement is covered mostly by failed test cases, it is highly likely to be the buggy part of the code

32 Tarantula
- A classic tool for statistical debugging
- Uses the following formulas
  - Color = red + pass / (fail + pass) * (green)
  - Brightness = max(pass, fail)
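
The two formulas can be turned into a small calculation. The sketch below assumes that pass and fail stand for the fraction of all passing tests and the fraction of all failing tests that cover the statement, which is how Tarantula is usually described; the slide does not spell this out. The function name `tarantula` and its parameters are hypothetical. The color is returned as a hue between 0.0 (red, suspicious) and 1.0 (green, likely safe).

```python
from typing import Optional, Tuple


def tarantula(passed_cov: int, failed_cov: int,
              total_passed: int, total_failed: int
              ) -> Optional[Tuple[float, float]]:
    """Return (hue, brightness) for one statement, or None if no test
    covers it.

    passed_cov / failed_cov: passing / failing tests covering the statement.
    total_passed / total_failed: all passing / failing tests in the suite.
    """
    pct_pass = passed_cov / total_passed if total_passed else 0.0
    pct_fail = failed_cov / total_failed if total_failed else 0.0
    if pct_pass + pct_fail == 0:
        return None  # statement never executed, nothing to color
    hue = pct_pass / (pct_pass + pct_fail)   # 0.0 = red, 1.0 = green
    brightness = max(pct_pass, pct_fail)     # how much evidence we have
    return hue, brightness


# Suite with 3 passing and 2 failing tests (made-up numbers).
print(tarantula(0, 2, 3, 2))  # (0.0, 1.0): covered only by failing tests
print(tarantula(3, 2, 3, 2))  # (0.5, 1.0): covered by every test
print(tarantula(3, 0, 3, 2))  # (1.0, 1.0): covered only by passing tests
```

Statements are then inspected from the reddest to the greenest, with brighter statements given more weight because more tests support their color.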

33 Tarantula: Illustration

34 Context-based statistical debugging
- Do not consider a statement in isolation
- Runtime control flow graph
  - Also consider connections
    - Outcomes of branches
    - Connections on a runtime CFG

35 Runtime Control Flow Graph
void replaceFirst(int sx, int sy) {
    /* arr, len and sz are assumed to be defined elsewhere */
    for (int i = 0; i < len; i++) {
        if (arr[i] == sx) {
            arr[i] = sz;
            // should break;
        }
        if (arr[i] == sy) {
            arr[i] = sz;
            // should break;
        }
    }
}
(Figure: runtime control flow graphs of a passing run and a failing run)

36 Limitations
- Questions:
  - If a statement is covered only by passed test cases, can it still be the root cause of the bug found?
  - If a statement is covered only by failed test cases, must it be the root cause of the bug found?

37 Example
void f(int a, int b) {
    if (a > 0) {   // error: should be >=
        /* do something */
    }
    if (b < 0) {
        /* do something */
    }
}
Test Cases: 3, 2 2, 1, 0, -1 2, 0

38 Dynamic Slicing
- Another way to narrow down the code to be considered in debugging
- Recall static slicing
  - All code elements that affect or are affected by a certain variable
  - Generate a large dependency graph for the code
  - Do reachability analysis

39 Data Dependencies
- A data dependency runs from the use of a variable to the definition of that variable
- Example:
  s1: x = 3;
  s2: if (y > 5) {
  s3:   y = y + x;   // data-dependent on x in s1
  s4: }

40 Control Dependencies
- A control dependency runs from the statements in the basic blocks of a branch to the predicate of that branch
- Example:
  s1: x = 3;
  s2: if (y > 5) {
  s3:   y = y + x;   // control-dependent on the predicate on y in s2
  s4: }
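
Static slicing as reachability over a dependency graph can be sketched for the small s1 to s3 example. The graph below is written by hand from the two dependencies named on the slides (s3 is data-dependent on s1 and control-dependent on s2); `DEPS` and `backward_slice` are hypothetical names used only for this illustration.

```python
from typing import Dict, List, Set

# Edges point from a statement to the statements it depends on.
DEPS: Dict[str, List[str]] = {
    "s3": ["s1",   # data dependency: s3 uses x, defined in s1
           "s2"],  # control dependency: s3 runs only if s2 is true
}


def backward_slice(deps: Dict[str, List[str]], criterion: str) -> Set[str]:
    """Collect every statement the criterion transitively depends on."""
    seen: Set[str] = set()
    stack = [criterion]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps.get(node, []))
    return seen


print(sorted(backward_slice(DEPS, "s3")))  # ['s1', 's2', 's3']
```

For a real program the graph is much larger, which is why static slices tend to include a large part of the code.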

41 Program slicing for sum = 0 -> sum = 1

42 Dynamic Slicing
- Also describes dependencies among code elements
  - If a variable has an incorrect value, the bug should be in its backward dynamic slice
- Like the runtime control flow graph
  - A mapping from the static slice to the code that was actually executed

Dynamic Slicing Example
Program:
  1: b=0
  2: a=2
  3: for i = 1 to N do
  4:   if ((i++)%2==1) then
  5:     a = a+1
       else
  6:     b = a*2
       endif
     done
  7: z = a+b
  8: print(z)
Trace for input N=2 (k^n denotes the n-th execution of statement k):
  1^1: b=0 [b=0]
  2^1: a=2
  3^1: for i = 1 to N do [i=1]
  4^1: if ((i++)%2 == 1) then [i=1]
  5^1: a=a+1 [a=3]
  3^2: for i = 1 to N do [i=2]
  4^2: if (i%2 == 1) then [i=2]
  6^1: b=a*2 [b=6]
  7^1: z=a+b [z=9]
  8^1: print(z) [z=9]

Algorithm I
- Uses a static dependence graph in which all executed nodes are marked dynamically. When the graph is traversed during slicing, unmarked nodes are skipped because they cannot be part of the dynamic slice.
- Uses limited dynamic information: fast but imprecise (still more precise than static slicing)

Algorithm I Example
  1: b=0
  2: a=2
  3: 1 <= i <= N
  4: if ((i++)%2==1)
  5: a=a+1
  6: b=a*2
  7: z=a+b
  8: print(z)
For input N=1, the trace is: 1^1 2^1 3^1 4^1 5^1 3^2 7^1 8^1
(Figure: dependence graph with T/F branch edges; the executed nodes are marked)
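
Algorithm I can be sketched as a graph traversal that refuses to enter unexecuted nodes. The static dependence graph below is hand-built from one reading of the example program and is an assumption of this sketch, as are the names `STATIC_DEPS` and `slice_marked`; statement 4 is treated as only reading i, matching the load/save view used for this example.

```python
from typing import Dict, List, Set

# Hand-built static dependence graph (data and control edges).
# An edge k -> d means statement k may depend on statement d.
STATIC_DEPS: Dict[int, List[int]] = {
    3: [3],           # loop counter i is updated by the loop itself
    4: [3],           # reads i; runs under the loop predicate
    5: [2, 5, 4],     # reads a (from 2 or 5); controlled by 4
    6: [2, 5, 4],     # reads a (from 2 or 5); controlled by 4
    7: [2, 5, 1, 6],  # reads a (from 2 or 5) and b (from 1 or 6)
    8: [7],           # reads z
}


def slice_marked(static_deps: Dict[int, List[int]],
                 executed: Set[int], criterion: int) -> Set[int]:
    """Backward slice over the static graph, skipping unmarked nodes."""
    seen: Set[int] = set()
    stack = [criterion]
    while stack:
        node = stack.pop()
        if node in seen or node not in executed:
            continue  # unexecuted statements cannot be in the slice
        seen.add(node)
        stack.extend(static_deps.get(node, []))
    return seen


# For N=1 the trace executes every statement except 6.
executed_n1 = {1, 2, 3, 4, 5, 7, 8}
print(sorted(slice_marked(STATIC_DEPS, executed_n1, 8)))
# [1, 2, 3, 4, 5, 7, 8]
```

Statement 6 drops out because it never ran, but every other static edge is still followed, which is where the imprecision comes from.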

Algorithm II
- A dependence edge is introduced from a load to a store if, at least once during execution, the value written by the store is actually read by the load (the dependence edge is marked)
- No static analysis is needed

Algorithm II Example
  1: b=0
  2: a=2
  3: 1 <= i <= N
  4: if ((i++)%2==1)
  5: a=a+1
  6: b=a*2
  7: z=a+b
  8: print(z)
For input N=1, the trace is: 1^1 2^1 3^1 4^1 5^1 3^2 7^1 8^1
(Figure: dependence graph with T/F branch edges; only the exercised dependence edges are marked)

Algorithm II Example
  1: b=0
  2: a=2
  3: 1 <= i <= N
  4: if ((i++)%2==1)
  5: a=a+1
  6: b=a*2
  7: z=a+b
  8: print(z)
For input N=2, the trace is:
  1^1: save b
  2^1: save a
  3^1: save i
  4^1: load i
  5^1: load/save a
  3^2: load/save i
  4^2: load i
  6^1: load a / save b
  7^1: load a, b / save z
  8^1: load z
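
The load/save trace for N=2 is enough to build the exercised data dependence edges. The sketch below is an illustration with hypothetical names (`TRACE_N2`, `dynamic_data_slice`); it handles data dependences only, so the loop and branch statements 3 and 4, which would enter through control dependences, are deliberately left out.

```python
from typing import Dict, List, Set, Tuple

# (statement, variables loaded, variables saved), in execution order.
TRACE_N2: List[Tuple[int, List[str], List[str]]] = [
    (1, [], ["b"]),
    (2, [], ["a"]),
    (3, [], ["i"]),
    (4, ["i"], []),
    (5, ["a"], ["a"]),
    (3, ["i"], ["i"]),
    (4, ["i"], []),
    (6, ["a"], ["b"]),
    (7, ["a", "b"], ["z"]),
    (8, ["z"], []),
]


def dynamic_data_slice(trace: List[Tuple[int, List[str], List[str]]],
                       criterion: int) -> Set[int]:
    """Slice over data dependence edges that were exercised in the trace."""
    last_writer: Dict[str, int] = {}
    edges: Dict[int, Set[int]] = {}
    for stmt, loads, saves in trace:
        for var in loads:
            # Add an edge only if this load really read that store's value.
            if var in last_writer:
                edges.setdefault(stmt, set()).add(last_writer[var])
        for var in saves:
            last_writer[var] = stmt
    seen: Set[int] = set()
    stack = [criterion]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return seen


print(sorted(dynamic_data_slice(TRACE_N2, 8)))  # [2, 5, 6, 7, 8]
```

Statement 1 (b=0) is not in the result because its value is overwritten by statement 6 before statement 7 reads b, whereas a traversal that only checks whether statement 1 was executed would keep it.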

Algorithm II – Compare to Algorithm I
- More precise
(Figure: a definition b=… and a use …=b, shown once under Algorithm I and once under Algorithm II)

Efficiency: Summary
- For an execution of 130M instructions:
  - Space requirement: about 1.5 GB
  - Time requirement: about 10 min
- JSlice
  - http://jslice.sourceforge.net/

Dynamic Dependence Graph Sizes

Program       Statements Executed (Millions)   Dynamic Dependence Graph Size (MB)
300.twolf     140                              1,568
256.bzip2     67                               1,296
255.vortex    108                              1,442
197.parser    123                              1,816
181.mcf       118                              1,535
134.perl      220                              1,954
130.li        124                              1,745
126.gcc       131                              1,534
099.go        138                              1,707

Classic Dynamic Slicing in Debugging

Buggy Runs         LOC      EXEC (%LOC)      BS (%EXEC)
flex 2.5.31(a)     26754    1871 (6.99%)     695 (37.2%)
flex 2.5.31(b)     26754    2198 (8.2%)      272 (12.4%)
flex 2.5.31(c)     26754    2053 (7.7%)      50 (2.4%)
grep 2.5           8581     1157 (13.5%)     NA
grep 2.5.1(a)      8587     509 (5.9%)       NA
grep 2.5.1(b)      8587     1123 (13.1%)     NA
grep 2.5.1(c)      8587     1338 (15.6%)     NA
make 3.80(a)       29978    2277 (7.6%)      981 (43.1%)
make 3.80(b)       29978    2740 (9.1%)      1290 (47.1%)
gzip-1.2.4         8164     118 (1.5%)       34 (28.8%)
ncompress-4.2.4    1923     59 (3.1%)        18 (30.5%)
polymorph-0.4.0    716      45 (6.3%)        21 (46.7%)
tar 1.13.25        25854    445 (1.7%)       105 (23.6%)
bc 1.06            8288     636 (7.7%)       204 (32.1%)
Tidy               31132    1519 (4.9%)      554 (36.5%)

BS ranges from 2.4% to 47.1% of EXEC; the average is 30.9%.

Advantages compared with Statistical Debugging
- Error-related code is guaranteed to appear in the slice
- Only requires the test case that reveals the bug
  - This is a large advantage for field bugs reported by users

Issues about Dynamic Slicing
- Slices are usually not very small (about 30% of the executed code)
- The running history is very big (gigabytes)
- The algorithms for computing a dynamic slice are slow and have very high space requirements
  - On average, for an execution of 130M instructions, the constructed dependence graph requires 1.5 GB of space

Review of Debugging
- Debugging is the process that follows testing
- Steps:
  - Reproduce, localize, fix
- Approaches to localization
  - Delta debugging
  - Statistical debugging
  - Dynamic slicing
