1 Delta Debugging Koushik Sen EECS, UC Berkeley
2 All Windows 3.1 Windows 95 Windows 98 Windows ME Windows 2000 Windows NT Mac System 7 Mac System 7.5 Mac System Mac System 8.0 Mac System 8.5 Mac System 8.6 Mac System 9.x MacOS X Linux BSDI FreeBSD NetBSD OpenBSD AIX BeOS HP- UX IRIX Neutrino OpenVMS OS/2 OSF/1 Solaris SunOS other -- P1 P2 P3 P4 P5 blocker critical major normal minor trivial enhancement How do we go from this FilePrintSegmentation Fault
3 into this FilePrintSegmentation Fault
Simplification Once one has reproduced a problem, one must find out what’s relevant: –Does the problem really depend on 10,000 lines of input? –Does the failure really require this exact schedule? –Do we need this sequence of calls? 4
Broken computer How do you identify the problem? 5
Why Simplify? Ease of communication. A simplified test case is easier to communicate. Easier debugging. Smaller test cases result in smaller states and shorter executions. Identify duplicates. Simplified test cases subsume several duplicates. 6
7 Motivation The Mozilla open-source web browser project receives several dozens bug reports a day. Each bug report has to be simplified –Eliminate all details irrelevant to producing the failure To facilitate debugging To make sure it does not replicate a similar bug report In July 1999, Bugzilla listed more than 370 open bug reports for Mozilla. –These were not even simplified –Mozilla engineers were overwhelmed with work –They created the Mozilla BugAThon: a call for volunteers to process bug reports
8 Your Solution How do you solve these problems? Binary search –Cut the test case in half –Iterate Brilliant idea: Why not automate this?
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 9
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 10 ✘
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 11
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 12 ✘
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 13
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 14 ✘
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 15
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 16 ✔
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 17
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 18 ✘
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 19 ✔
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 20 ✔
Binary Search Proceed by binary search. Throw away half the input and see if the output is still wrong. If not, go back to the previous state and discard the other half of the input. 21 Simplified input
22 All Windows 3.1 Windows 95 Windows 98 Windows ME Windows 2000 Windows NT Mac System 7 Mac System 7.5 Mac System Mac System 8.0 Mac System 8.5 Mac System 8.6 Mac System 9.x MacOS X Linux BSDI FreeBSD NetBSD OpenBSD AIX BeOS HP- UX IRIX Neutrino OpenVMS OS/2 OSF/1 Solaris SunOS other -- P1 P2 P3 P4 P5 blocker critical major normal minor trivial enhancement Complex Input FilePrintSegmentation Fault
Simplified Input Simplified from 896 lines to one single line Required 12 tests only 23
Binary Search 24
Binary Search 25 ✘
Binary Search 26 ✘ ✔
Binary Search 27 ✘ ✔ ✔
Binary Search What do we do if both halves pass? 28 ✘ ✔ ✔
29 Change Granularity Increase Granularity of test case Decrease Granularity of test case Failure of the test case More chancesLess chances Progress of the search Slower, Reduced by amount < ½ Faster, Reduced by amount > ½
30 General DD Algorithm Basic idea: 1.Start with few & large changes first 2.If all alternatives pass or are unresolved, apply more & smaller changes. ∆1 ∆2 ∆1 ∆2 ∆3 ∆4 ∆5 ∆6 ∆7 ∆8
Binary Search 31 ✘ ✔ ✔
Binary Search 32 ✘ ✔ ✔ ✔
Binary Search 33 ✘ ✔ ✔ ✔ ✘
Binary Search 34 ✘ ✔ ✔ ✔ ✘ ✘
Binary Search 35 ✘ ✔ ✔ ✔ ✔ ✘ ✘
36 Inputs and Failures Let E be the set of possible inputs r P E corresponds to an input that passes r F E corresponds to an input that fails
Binary Search Example of r F and r P ? 37 ✘ ✔ ✔ ✔ ✔ ✘ ✘
Binary Search Example of r F and r P ? r P = r F = <SELECT NAME="priori 38 ✘ ✔ ✔ ✔ ✔ ✘ ✘
39 Changes We can go from one input to another by changes A change is a mapping : E E which takes one input and changes it to another input r1 = ’ = insert ME="priori at appropriate position What is ’(r1)?
40 Changes We can go from one input to another by changes A change is a mapping : E E which takes one input and changes it to another input r1 = ’ = insert ME="priori at appropriate position What is ’(r1)?
41 Decomposing Changes A change can be decomposed to a number of elementary changes 1, 2,..., n where = 1 o 2 o... o n –where ( i o j )(r) = i ( j (r)) For example, deleting a part of the input file can be decomposed to deleting characters one be one from the input file –another way to say it: by composing deleting of single characters we can get a change that deletes part of the input file ’ = insert ME="priori ’ = 1 o 2 o 3 o 4 o 5 o 6 o 7 o 8 o 9 o 10 – 1 = insert M – 2 = insert E
42 Summary We have an input with failure: r F We have an input without failure: r P We have a set of changes c F = { 1, 2,..., n } such that r F = ( 1 o 2 o... o n )(r P ) where r F is a run with failure Each subset c of c F is a test case
43 Testing Test Cases Given a test case c, we would like to know if the input generated by changing r P by the changes in c is an input that causes a failure We define a function test: Powerset(c F ) {P, F, ?} such that, given c={ 1, 2,..., m } c F test(c) = F iff ( 1 o 2 o... o m )(r P ) is a failing input
44 Minimizing Test Cases Now the question is: Can we find the minimal test case c such that test(c) = F? A test case c c F is called the global minimum of c F if for all c’ c F, |c’| < |c| test(c’) F Global minimum is the smallest set of changes which will make the program fail
45 Minimizing Test Cases Now the question is: Can we find the minimal test case c such that test(c) = F? A test case c c F is called the global minimum of c F if for all c’ c F, |c’| < |c| test(c’) F Global minimum is the smallest set of changes which will make the program fail Finding the global minimum may require us to perform exponential number of tests
Search for 1-minimal Input Different problem formulation Find a set of changes that cause the failure, but removing any change causes the failure to go away This is 1-minimality 46
47 Minimizing Test Cases A test case c c F is called a local minimum of c F if for all c’ c, test(c’) F A test case c c F is n-minimal if for all c’ c, |c| |c’| n test(c’) F A test case is 1-minimal if for all i c, test(c – { i }) F
48 Naïve Algorithm To find a 1-minimal subset of C, simply Remove one element from C If c – { } = X, recurse with smaller set If C – { } X, C is 1-minimal
49 Analysis In the worst case, –We remove one element from the set per iteration –After trying every other element Work is potentially N + (N-1) + (N-2) + … This is O(N 2 )
50 Work Smarter, Not Harder We can often do better Silly to start out removing 1 element at a time –Try dividing change set in 2 initially –Increase # of subsets if we can’t make progress –If we get lucky, search will converge quickly
51 Minimization Algorithm The delta debugging algorithm finds a 1-minimal test case It partitions the set c F to 1, 2,... n – 1, 2,... n are pairwise disjoint –c F = 1 2 ... n Define the complement of i as i = c F i Start with n = 2 Tests each test case defined by the partition and their complements Reduce the test case if a smaller failure inducing set is found –otherwise refine the partition, i.e. n = n*2
52 Each Step of the Minimization Algorithm Let n = 2 (A) Start with as test set (B) Test each 1, 2,... n and each 1, 2,..., n (C) There are four possible outcomes 1.Some i causes failure –Go to step (A) with = i and n = 2 2.Some i causes failure –Go to step (A) with = i and n = n 1 3.No test causes failure –Increase granularity: Go to (A) with = and n = 2n 4.The granularity can no longer be increased –Done, found the 1-minimal subset
Delta Debugging 53
54 Analysis Worst case is still quadratic Subdivide until each set is of size 1 –Reduced to the naïve algorithm Good news –For single failure, converges in log N –Binary search again
55 The GNU C Compiler This program ( bug.c ) causes GCC to crash when optimization is enabled We would like to minimize this program in order to file a bug report In the case of GCC, a passing program run is the empty input For the sake of simplicity, we model change as the insertion of a single character –r is running GCC with an empty input –r means running GCC with bug.c –each change i inserts the i th character of bug.c #define SIZE 20 double mult(double z[], int n) { int i, j; i = 0; for (j = 0; j < n; j++) { i = i + j + 1; z[i] = z[i] * (z[0] + 1.0); } return z[n]; } void copy(double to[], double from[], int count) { int n = (count + 7) / 8; switch (count % 8) do { case 0: *to++ = *from++; case 7: *to++ = *from++; case 6: *to++ = *from++; case 5: *to++ = *from++; case 4: *to++ = *from++; case 3: *to++ = *from++; case 2: *to++ = *from++; case 1: *to++ = *from++; } while (--n > 0); return mult(to, 2); } int main(int argc, char *argv[]) { double x[SIZE], y[SIZE]; double *px = x; while (px < x + SIZE) *px++ = (px – x) * (SIZE + 1.0); return copy(y, x, SIZE); }
56 The GNU C Compiler The test procedure would –create the appropriate subset of bug.c –feed it to GCC –return iff GCC had crashed, and otherwise
57 The GNU C Compiler The minimized code is The test case is 1-minimal –No single character can be removed without removing the failure –Even every superfluous whitespace has been removed –The function name has shrunk from mult to a single t –This program actually has a semantic error (infinite loop), but GCC still isn't supposed to crash So where could the bug be? –We already know it is related to optimization –If we remove the –O option to turn off optimization, the failure disappears t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
58 The GNU C Compiler The GCC documentation lists 31 options to control optimization on Linux It turns out that applying all of these options causes the failure to disappear –Some option(s) prevent the failure –ffloat-store–fno-default-inline–fno-defer-pop –fforce-mem–fforce-addr–fomit-frame-pointer –fno-inline–finline-functions–fkeep-inline-functions –fkeep-static-consts–fno-function-cse–ffast-math –fstrength-reduce–fthread-jumps–fcse-follow-jumps –fcse-skip-blocks–frerun-cse-after-loop–frerun-loop-opt –fgcse–fexpensive-optimizations–fschedule-insns –fschedule-insns2–ffunction-sections–fdata-sections –fcaller-saves–funroll-loops–funroll-all-loops –fmove-all-movables–freduce-all-givs–fno-peephole –fstrict-aliasing
59 The GNU C Compiler We can use test case minimization in order to find the preventing option(s) –Each i stands for removing a GCC option –Having all i applied means to run GCC with no option (failing) –Having no i applied means to run GCC with all options (passing) After seven tests, the single option -ffast-math is found which prevents the failure –Unfortunately, it is a bad candidate for a workaround because it may alter the semantics of the program –Thus, we remove -ffast-math from the list of options and make another run –Again after seven tests, it turn out that -fforce-addr also prevents the failure –Further examination shows that no other option prevents the failure
60 The GNU C Compiler So, this is what we can send to the GCC maintainers –The minimal test case –“The failure only occurs with optimization” –“ -ffast-math and -fforce-addr prevent the failure”
61 Case Studies Reducing a Mozilla bug to –Print the html file containing Minimizing fuzz input –Fuzzing: Take a program, send it randomly generated input and see it crashes –Used delta debugging to minimize fuzz input
62 Another Application Yesterday, my program worked. Today, it does not. Why? –The new release 4.17 of GDB changed 178,000 lines –it no longer integrated properly with DDD (a graphical front- end) –How to isolate the change that caused the failure.
63 Summary Delta Debugging is a technique, not a tool Bad News: –Probably must be reimplemented for each significant system –To exploit knowledge of changes Good News: –Relatively simple algorithm, significant payoff –It’s worth reimplementing