White-Box Testing
Two Classes of Adequacy Criteria: Black Box and White Box

Black Box (aka Spec-Based, aka Functional)
    - Tests derived from functional requirements
    - Input/output driven
    - Internal nature of the software is not relevant to designing the tests

White Box (aka Code-Based, aka Structural)
    - Tests derived from code structure
    - Tests are evaluated in terms of coverage of the code structures

Many others in between
Some White Box Adequacy Criteria
    - Statement coverage
    - Decision coverage
    - Condition coverage
    - Path coverage
    - Dataflow coverage
Statement Coverage Adequacy
Statement coverage: find test cases that ensure coverage of every executable program statement.

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum

Try:
    i = -1, j = 1      : 1,2,3,4,9
    i = 10, j = 1,2    : 1,2,3,4,5,6,7,8,9
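The two test cases above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `run` and the choice to pass the inputs as arguments (instead of reading them) are assumptions made for illustration. It records which numbered statements execute for a given test.

```python
def run(i, j_values):
    """Run the example program; return (sum, set of executed statement numbers)."""
    inputs = iter(j_values)
    executed = {1}                 # 1. read i (supplied as an argument here)
    j = next(inputs)               # 2. read j
    executed.add(2)
    total = 0                      # 3. sum = 0
    executed.add(3)
    while True:
        executed.add(4)            # 4. loop condition is evaluated
        if not (i > 0 and i <= 10):
            break
        executed.add(5)            # 5. if (j > 0)
        if j > 0:
            total += j             # 6. sum = sum + j
            executed.add(6)
        i += 1                     # 7. i = i + 1
        executed.add(7)
        j = next(inputs)           # 8. read j
        executed.add(8)
    executed.add(9)                # 9. print sum (returned instead of printed)
    return total, executed


if __name__ == "__main__":
    for i, js in [(-1, [1]), (10, [1, 2])]:
        total, executed = run(i, js)
        print(f"i={i}, j={js}: statements {sorted(executed)}")
```

The second test alone already reaches all nine statements; the first is kept here only because the slide lists it.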
Greedy Algorithm for Achieving Statement Coverage
Input: program P
Output: test suite T

    Instrument P to report statement coverage
    Repeat
        create test t, execute P with t
        measure coverage of P by t
        if t adds to cumulative coverage, add it to T
    Until testing is statement-coverage-adequate

Is T necessarily “minimal”?
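As a rough illustration of the greedy loop, here is a Python sketch applied to the nine-statement example program. The names `statements_covered` and `greedy_suite` are hypothetical, and "create test t" is simplified to drawing tests from a fixed candidate list, which is an assumption; real test creation is manual or tool-driven.

```python
def statements_covered(i, j_values):
    """Instrumented example program: return the set of statements executed."""
    inputs = iter(j_values)
    covered = {1, 2, 3}
    j = next(inputs)
    while True:
        covered.add(4)
        if not (0 < i <= 10):
            break
        covered.add(5)
        if j > 0:
            covered.add(6)
        i += 1
        j = next(inputs)
        covered.update({7, 8})
    covered.add(9)
    return covered


ALL_STATEMENTS = set(range(1, 10))


def greedy_suite(candidates):
    """Keep each candidate test that adds to cumulative statement coverage."""
    suite, cumulative = [], set()
    for test in candidates:
        if cumulative == ALL_STATEMENTS:      # statement-coverage-adequate
            break
        gained = statements_covered(*test) - cumulative
        if gained:
            suite.append(test)
            cumulative |= gained
    return suite, cumulative


if __name__ == "__main__":
    print(greedy_suite([(-1, [1]), (10, [1, 2])])[0])   # keeps both tests
    print(greedy_suite([(10, [1, 2]), (-1, [1])])[0])   # keeps only the first
```

With the candidates in the first order the suite holds two tests, although the second test alone is adequate, so the result of the greedy loop depends on test order and need not be minimal.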
Statement Coverage Issues

What about unreachable statements?

    Bar(int x)
        …..
        if (x==0)
            print "Bar: divide by zero"
        else
            x = 1/x
        endif

    Foo(int x)
        …..
        if (x>0)
            Bar(x)
        endif

What about hard-to-reach statements?

    int *ptr = malloc(sizeof(int) * 5000);
    if (ptr == NULL) {
        garbage_collect();
        printf("Unrecoverable system error\n");
        exit(1);
    }

How do you choose test cases to use?
    - Manually
    - With automated test case generation tools
    - Start with functional, then augment until coverage is achieved
Code Instrumentation
Code instrumentation involves inserting probe statements into code that report data about its execution.
How is it done, and what issues does it involve?
Code Instrumentation
How would you instrument this using print statements?

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum

With one probe per statement:

    read i
    print "1 executed"
    read j
    print "2 executed"
    sum = 0
    print "3 executed"
    print "4 executed"
    while (i > 0) and (i <= 10) do
        print "5 executed"
        if (j > 0)
            sum = sum + j
            print "6 executed"
        endif
        i = i + 1
        print "7 executed"
        read j
        print "8 executed"
    endwhile
    print sum
    print "9 executed"
Code Instrumentation
Can you reduce the number of probe statements (or minimize them) and still get the same information? How?
What’s being assumed about program execution in this case?

    read i
    read j
    sum = 0
    print "1, 2, 3, 4 executed"
    while (i > 0) and (i <= 10) do
        print "5 executed"
        if (j > 0)
            sum = sum + j
            print "6 executed"
        endif
        i = i + 1
        read j
        print "7, 8 executed"
    endwhile
    print sum
    print "9 executed"
Other Issues
    - Using function calls instead of prints (how?)
    - Effects on run-time and determinism?
    - Instrumenting source code vs binary code. Tradeoffs?
    - Profiling interpreted code in the interpreter (e.g., the JVM). Drawbacks?
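One possible way to use function calls instead of prints is sketched below in Python. The block labels, the `BLOCKS` table, and the functions `probe` and `measure` are all hypothetical names chosen for this illustration. Each probe call records a block of statements in a set, following the reduced-probe grouping (1-4, 5, 6, 7-8, 9), and the statement numbers are recovered afterwards. The sketch assumes that once a block is entered, every statement in it runs to completion (no exception or abort in the middle of a block).

```python
# Map each probe to the statements it stands for (grouping assumed from the
# reduced-probe version of the example program).
BLOCKS = {
    "A": (1, 2, 3, 4),
    "B": (5,),
    "C": (6,),
    "D": (7, 8),
    "E": (9,),
}


def measure(i, j_values):
    """Run the instrumented program; return the sorted covered statements."""
    hits = set()

    def probe(block):
        # A function-call probe: record the block instead of printing.
        hits.add(block)

    inputs = iter(j_values)
    probe("A")
    j = next(inputs)
    total = 0
    while 0 < i <= 10:
        probe("B")
        if j > 0:
            probe("C")
            total += j
        probe("D")
        i += 1
        j = next(inputs)
    probe("E")
    return sorted(s for block in hits for s in BLOCKS[block])


if __name__ == "__main__":
    print(measure(-1, [1]))
    print(measure(10, [1, 2]))
```

Recording into a set rather than printing keeps the probe cheap and avoids flooding the program's output, though any probe still adds run-time overhead.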
Statement Coverage: A Weakness
What’s the weakness?

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum
Decision Coverage Adequacy
Decision coverage: find test cases that ensure coverage of every outcome of each predicate (decision) in the program.

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum

Try:
    i = -1, j = 1       : 4F
    i = 10, j = 1, 2    : 4T, 5T, 4F
    i = 9,  j = 1, 0    : 4T, 5T, 5F, 4F
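The decision outcomes listed above can be reproduced with a small Python sketch. The function name `decision_outcomes` is hypothetical, and outcomes are recorded as (statement number, truth value) pairs. One assumption to note: with i = 9 the loop body runs twice, so statement 8 reads j a third time; the sketch supplies an extra 0 for that read, which the slide does not list.

```python
def decision_outcomes(i, j_values):
    """Return the set of (decision, outcome) pairs exercised by one test."""
    inputs = iter(j_values)
    outcomes = set()
    j = next(inputs)
    total = 0
    while True:
        d4 = i > 0 and i <= 10          # decision at statement 4
        outcomes.add((4, d4))
        if not d4:
            break
        d5 = j > 0                      # decision at statement 5
        outcomes.add((5, d5))
        if d5:
            total += j
        i += 1
        j = next(inputs)
    return outcomes


if __name__ == "__main__":
    tests = [(-1, [1]), (10, [1, 2]), (9, [1, 0, 0])]   # third j for i=9 is assumed
    cumulative = set()
    for i, js in tests:
        seen = decision_outcomes(i, js)
        cumulative |= seen
        print(f"i={i}, j={js}: {sorted(seen)}")
    print("all four outcomes covered:", len(cumulative) == 4)
```

The first two tests are statement-coverage-adequate but never take the false outcome of decision 5, which is the gap the third test closes.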
Greedy Algorithm for Achieving C-Coverage
Replace “C” with “decision”, “condition”, “path”, …..

Input: program P
Output: test suite T

    Instrument P to report C-coverage
    Repeat
        create test t, execute P with t
        measure coverage of P by t
        if t adds to cumulative C-coverage, add it to T
    Until testing is C-coverage-adequate
Decision Coverage Issues
    - What about unreachable decisions?
    - What about hard-to-reach decisions?
    - How do you choose test cases to use?
    - How to instrument code to measure decision coverage?
    - What to do about “switch” statements?
Decision Coverage: A Weakness
What’s the weakness?

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum
Condition Coverage Adequacy
Condition coverage: find test cases that ensure coverage of every outcome of each predicate (decision) in the program, under each assignment of truth values to the individual conditions in that predicate.

    1. read i
    2. read j
    3. sum = 0
    4. while (i > 0) and (i <= 10) do
    5.     if (j > 0)
    6.         sum = sum + j
           endif
    7.     i = i + 1
    8.     read j
       endwhile
    9. print sum

Try:
    i = -1, j = 1       : 4(F,T)
    i = 10, j = 1, 2    : 4(T,T), 5(T), 4(T,F)
    i = 9,  j = 1, 0    : 4(T,T), 5(T), 5(F), 4(T,F)

4(F,F) is not possible in this case.
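A Python sketch of how the condition-level outcomes might be recorded follows. The name `condition_outcomes` is hypothetical. Both conditions of the loop predicate are evaluated explicitly so that pairs such as 4(F,T) can be observed even in a language that short-circuits `and`. As an assumption, the i = 9 test is given a third j value (0) for the final read at statement 8.

```python
def condition_outcomes(i, j_values):
    """Return the set of (decision, condition-value tuple) pairs for one test."""
    inputs = iter(j_values)
    seen = set()
    j = next(inputs)
    total = 0
    while True:
        c1, c2 = i > 0, i <= 10         # the two conditions in decision 4
        seen.add((4, (c1, c2)))
        if not (c1 and c2):
            break
        c3 = j > 0                      # the single condition in decision 5
        seen.add((5, (c3,)))
        if c3:
            total += j
        i += 1
        j = next(inputs)
    return seen


if __name__ == "__main__":
    for i, js in [(-1, [1]), (10, [1, 2]), (9, [1, 0, 0])]:
        print(f"i={i}, j={js}: {sorted(condition_outcomes(i, js))}")
    # No integer is both <= 0 and > 10, so (F,F) never shows up; this spot
    # check over a sample range illustrates that, it does not prove it.
    print(any((i > 0, i <= 10) == (False, False) for i in range(-1000, 1000)))
```

The combination (F,F) is infeasible because it would require i <= 0 and i > 10 at the same time, so an adequacy measure has to exclude it from the coverage target.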
Condition Coverage Issues
    - What about unreachable conditions?
    - What about hard-to-reach conditions?
    - How do you choose tests to use?
    - How to instrument code to measure condition coverage?
    - How many tests are needed for
          (P || Q)
          (P || Q || R)
          (P || Q || R || S)
      ?
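One way to explore the last question is to count combinations directly. The Python sketch below uses the hypothetical helper names `full_assignments` and `short_circuit_evaluations` and compares two readings of the requirement: every truth assignment to n conditions, versus the distinct left-to-right evaluations that can actually be observed when `||` short-circuits (evaluation stops at the first true condition). Which reading applies depends on the language semantics and on how the criterion is defined, so treat the counts as illustrations of each reading.

```python
from itertools import product


def full_assignments(n):
    """All truth assignments to n independent conditions."""
    return list(product([False, True], repeat=n))


def short_circuit_evaluations(n):
    """Distinct condition sequences observable for c1 || c2 || ... || cn
    when evaluation stops at the first true condition."""
    seen = set()
    for values in product([False, True], repeat=n):
        evaluated = []
        for v in values:
            evaluated.append(v)
            if v:               # later conditions are never evaluated
                break
        seen.add(tuple(evaluated))
    return seen


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, len(full_assignments(n)), len(short_circuit_evaluations(n)))
```

Under the first reading the count doubles with each added condition; under the short-circuit reading it grows by one.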
Discussion
Are code coverage criteria “partitioning strategies”? Why or why not? If yes, how so?
Graph-Based Adequacy Criteria
Control Flow Graphs (CFGs)

    PROGRAM GCD
    begin
    1   read(x)
    2   read(y)
    3   while (x <> y) do
    4       if (x > y) then
    5           x = x - y
            else
    6           y = y - x
            endif
        endwhile
    7   print x
    end

[Figure: control flow graph for GCD, with nodes Entry, read(x), read(y), while x <> y, if x > y, x = x - y, y = y - x, endif, endwhile, print x, and Exit; edges out of predicate nodes are labeled T and F]
Control Flow Graph Issues
    - Entry and exit nodes
    - Predicate nodes
    - Edge labels
    - Back edges
    - “Join” nodes
    - Basic blocks
    - Node labeling
    - How to handle “switch” statements?
    - How to represent interprocedural control flow?

[Figure: control flow graph of the GCD program]
Procedure for Constructing CFGs
    - Determine set of nodes
    - Create entry node E and exit node X
    - Add edge from E to node representing first statement
    - Add edges from all nodes associated with statements from which control exits the routine to X
    - For all nodes n1 and n2 other than E and X, add an edge from n1 to n2 if control may pass from the statement corresponding to n1 to the statement corresponding to n2
    - Label edges out of predicate nodes with the value to which the predicate must evaluate in order for control to flow that way
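The procedure above can be sketched in Python for the GCD program. This is an illustration under assumptions: the `SUCCESSORS` table (which statement may follow which) is written by hand rather than derived by a parser, the function name `build_cfg` is hypothetical, nodes are the numbered statements 1 to 7, and the separate endif and endwhile join nodes of the drawn graph are omitted for brevity.

```python
# Hand-written "control may pass from n1 to n2" relation for GCD.
# Each entry is (successor, edge label); labels appear only on predicate nodes.
SUCCESSORS = {
    1: [(2, None)],               # read(x)
    2: [(3, None)],               # read(y)
    3: [(4, "T"), (7, "F")],      # while (x <> y)
    4: [(5, "T"), (6, "F")],      # if (x > y)
    5: [(3, None)],               # x = x - y, back to loop test
    6: [(3, None)],               # y = y - x, back to loop test
    7: [],                        # print x, control exits the routine
}


def build_cfg(successors, first):
    """Return a list of (source, target, label) edges including E and X."""
    edges = [("E", first, None)]                 # entry edge
    for node, succs in successors.items():
        if not succs:                            # control exits the routine
            edges.append((node, "X", None))
        for target, label in succs:
            edges.append((node, target, label))
    return edges


if __name__ == "__main__":
    cfg = build_cfg(SUCCESSORS, first=1)
    for edge in cfg:
        print(edge)
    predicates = sorted({src for src, _, label in cfg if label is not None})
    print("predicate nodes:", predicates)
```

In this representation the edges 5 -> 3 and 6 -> 3 are the back edges of the loop.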
More CFG Issues
    - The prior algorithm is for humans to use; in practice tools build these, often during the process of parsing
    - Cost of building CFGs is linear in program size
    - Initially, CFGs were used extensively for code optimization; now they are used for many things
    - See handouts page for more examples
CFG-based Coverage Criteria
    - Node coverage
    - Branch coverage
    - What are they similar to?
    - What about condition coverage?

[Figure: control flow graph of the GCD program]
Path Coverage Adequacy
Path coverage: find test cases that ensure coverage of every entry-to-exit path in the program.
Problems?

[Figure: control flow graph of the GCD program]
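One problem can be made concrete by counting paths. The Python sketch below uses a hand-written successor table for the GCD control flow graph (statement numbers 1 to 7 plus E and X, join nodes omitted) and a hypothetical function `count_paths` that counts entry-to-exit paths taking the loop body at most a given number of times. It counts structural paths only; some of them may be infeasible for any input.

```python
# Successor lists for the GCD control flow graph (assumed simplification:
# endif/endwhile join nodes are folded into the edges back to node 3).
CFG = {
    "E": [1], 1: [2], 2: [3],
    3: [4, 7],        # while: 4 = enter body, 7 = leave loop
    4: [5, 6],        # if: then-branch or else-branch
    5: [3], 6: [3],
    7: ["X"], "X": [],
}


def count_paths(max_iterations):
    """Count E-to-X paths that enter the loop body at most max_iterations times."""
    def walk(node, iterations):
        if node == "X":
            return 1
        total = 0
        for nxt in CFG[node]:
            if node == 3 and nxt == 4:          # entering the loop body
                if iterations == max_iterations:
                    continue                    # bound reached, skip
                total += walk(nxt, iterations + 1)
            else:
                total += walk(nxt, iterations)
        return total
    return walk("E", 0)


if __name__ == "__main__":
    for k in (0, 1, 2, 3, 10):
        print(k, count_paths(k))
```

Each extra permitted iteration roughly doubles the count, and with no bound on iterations the number of paths is unbounded, which is why restricted forms of path coverage are used in practice.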
Restricted-Path Strategies
    - Cover all acyclic paths
    - Cover a set of basis paths for the program. Basis paths are entry-to-exit paths, and techniques for covering them typically restrict coverage to C = e - n + 2 paths, where C is McCabe’s cyclomatic complexity metric, e is the number of edges in the CFG, and n is the number of nodes.
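As a worked instance of C = e - n + 2, the Python sketch below computes the metric for a hand-written edge list of the GCD control flow graph. The edge list and the function name `cyclomatic_complexity` are assumptions for illustration; the endif and endwhile join nodes are omitted, which removes one node and one edge each and so leaves C unchanged.

```python
# Edges of the GCD control flow graph (statements 1-7 plus entry E and exit X).
EDGES = [
    ("E", 1), (1, 2), (2, 3),
    (3, 4), (3, 7),       # while (x <> y): true / false
    (4, 5), (4, 6),       # if (x > y): true / false
    (5, 3), (6, 3),       # back edges to the loop test
    (7, "X"),
]


def cyclomatic_complexity(edges):
    """McCabe's metric C = e - n + 2 for a single connected CFG."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


if __name__ == "__main__":
    print(cyclomatic_complexity(EDGES))
```

Here e = 10 and n = 9, so C = 10 - 9 + 2 = 3, which matches the count of two predicate nodes plus one.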
Comparing Criteria via Subsumption
Criterion A subsumes criterion B if, for any program P and test suite T for P, T being A-adequate for P implies that T is B-adequate for P.

Consider:
    - statement vs decision
    - decision vs condition
    - decision vs path
    - condition vs path

[Figure: subsumption diagram relating path, statement, decision, and condition coverage]
Comparing Criteria Empirically (Hutchins et al., ICSE 94)

    Strategy          Mean Cases    Faults Found
    Random Testing    100           79.5%
    Branch Testing    34            85.5%
    All Uses          84            90.0%