Basic Concepts of Software Testing (2.1 – 2.3) Course Software Testing & Verification 2016/17 Wishnu Prasetya
Software error, how costly can it be? Some errors in the software of the UK Inland Revenue are rumored to have caused a tax-credit overpayment of 3.45 billion USD. (Charette, Why Software Fails. IEEE Spectrum, 2005)
On the other hand, quality costs effort. Source: World Quality Report, 2014/15 (1543 CIOs, 25 countries).
Management perspective: tackling the testing challenge by organizing it in levels like this (also called the V-model)
By users/customers (or a 3rd party hired to represent them):
  Acceptance test: tests whether the resulting system is indeed what the customer wants. In principle the customer has to do this himself. In practice this is often done with the help of a 3rd party hired to help the customer set up the test.
By developers:
  System test: tests whether the system meets its specification.
  Integration test: might be defined as testing whether a subsystem meets its specifications; however, this is not how AO define it. Instead, it tests that “modules” connect properly. The modules are assumed to be correct individually; we just focus on their connections.
  Module test: tests that a module behaves correctly.
Ch. 7 gives more background on the “practical aspects”, e.g. a concrete work-out of the V-model and outlines of a “test plan”. Read the chapter yourself!
Scientific perspective of testing challenges: How to make a good test-set? When have we tested enough? Can we automate the execution of the tests? Can we automatically generate tests? If a test fails, how do we find the source of the failure (debugging)? Can we automate debugging?
Terms used
A test-set (Def 1.18) is just a set of test-cases.
(Deviates from Def. 1.17) A test-case specifies a series of interactions with the target program and the inputs needed by those interactions. Some interactions may trigger responses from the program. The test-case also specifies what kind of responses it expects from the program. In the simple case there is just one interaction foo(x) and one response, namely the return value.
Test inputs: the inputs meant above.
Test oracle: specifies the expectation on the responses as meant above.
Expressing adequacy
Testing is inherently incomplete. To at least do it systematically, we start by defining our “test requirements” (TR): come up with a reasonable definition, preferably one that is also quantifiable (so that you have some indication of how far you are from fulfilling the requirements).
“Fulfilling TR” does not imply absence of errors. Not “fulfilling TR” increases the probability of still having errors.
Terms used in AO (I reformulate them a bit)
Def. 1.20. A test requirement specifies a single goal (e.g. a position in your program) that your test-set must cover. AO use the term TR to refer to a set of test requirements.
Def. 1.21. A coverage criterion (CC) = a TR to fulfill/cover. Ex.: in line coverage the TR is the set of all P’s lines. TR is just AO's trick to make their concrete definitions of various CCs (later) look more concise. But it is a nice trick.
A test-set T meets a given set TR of test requirements if executing T meets all requirements in TR. Note that this does not require T to have the same cardinality as TR. That is, a single test-case may cause multiple requirements in TR to be met.
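A minimal sketch of how "meeting TR" can be quantified, assuming line coverage and made-up data: the program, its five lines, and the lines hit by test-cases t1 and t2 are all hypothetical.

```python
def coverage(tr, met_by_test):
    """Return (fraction of TR that is met, set of unmet requirements)."""
    met = set()
    for reqs in met_by_test.values():
        met |= reqs & tr          # only requirements that are in TR count
    return len(met) / len(tr), tr - met

# Hypothetical data: program P has five lines, so for line coverage TR = {1..5}.
tr = {1, 2, 3, 4, 5}
# Lines executed by each (made-up) test-case.
met_by_test = {"t1": {1, 2, 3}, "t2": {1, 4}}

ratio, unmet = coverage(tr, met_by_test)
print(ratio, unmet)
```

Here two test-cases meet four of the five requirements, which illustrates that the size of the test-set and the size of TR are unrelated.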
“Line coverage” is too imprecise?
foo(x,y) { while (x>y) x=x-y ; if (x<y) return -1 else return x }
[Figure: control flow graph of foo, with nodes 1..4]
Abstractly, a program is a set of branches and branching points. An execution flows through these branches, forming a path through the program. Such an abstraction is called a Control Flow Graph (CFG, Legard 1970), such as the one above. AO give you a set of more precise concepts of coverage, based on graphs.
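To make "an execution forms a path through the CFG" concrete, here is a hedged sketch that instruments foo by hand. The node numbering (1 = loop guard, 2 = loop body, 3 = if-test, 4 = return -1, 5 = return x) is an assumption made for this sketch and need not match the numbering in the slide's figure.

```python
def foo_traced(x, y):
    """Run foo and also return the list of CFG nodes visited (assumed numbering)."""
    path = [1]                    # node 1: evaluate the loop guard x > y
    while x > y:
        x = x - y
        path += [2, 1]            # node 2: loop body, then back to the guard
    path.append(3)                # node 3: evaluate the test x < y
    if x < y:
        path.append(4)            # node 4: return -1
        return -1, path
    path.append(5)                # node 5: return x
    return x, path

print(foo_traced(5, 2))
print(foo_traced(2, 2))
```

Two inputs that execute largely the same lines can still trace different paths, which is the information that graph-based criteria use.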
Graph terminologies... (Sec 2.1)
A graph is described by (N, N0, Nf, E): nodes, initial nodes, final nodes, and edges. E is a subset of N×N: directed, no labels, no multi-edges. N0 and Nf must be non-empty!
A path p : [n0, n1, ... , nk] is non-empty, and each consecutive pair is an edge in E. Length of p = #edges in p = (number of nodes in p) - 1.
Test-path (Def. 2.31): a path that starts somewhere in N0 and ends somewhere in Nf. Each test-case, when executed, will cause the program to trace a path through its control flow graph. AO call this path the test-path.
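A minimal sketch of these definitions, assuming a set-of-pairs representation for E. The type name `Graph`, the function names, and the four-node example graph are illustrative choices, not taken from AO.

```python
from typing import NamedTuple

class Graph(NamedTuple):
    nodes: frozenset
    initial: frozenset    # N0, must be non-empty
    final: frozenset      # Nf, must be non-empty
    edges: frozenset      # set of (source, target) pairs

def is_path(g, p):
    """A path is a non-empty node list whose consecutive pairs are edges."""
    return (len(p) > 0
            and all(n in g.nodes for n in p)
            and all((a, b) in g.edges for a, b in zip(p, p[1:])))

def length(p):
    """Length counts edges, so a single node is a path of length 0."""
    return len(p) - 1

def is_test_path(g, p):
    return is_path(g, p) and p[0] in g.initial and p[-1] in g.final

# Hypothetical diamond-shaped graph: 0 -> 1 -> 3 and 0 -> 2 -> 3.
g = Graph(frozenset({0, 1, 2, 3}), frozenset({0}), frozenset({3}),
          frozenset({(0, 1), (0, 2), (1, 3), (2, 3)}))
print(is_path(g, [1, 3]), is_test_path(g, [1, 3]), is_test_path(g, [0, 1, 3]))
```

In this graph [1, 3] is a path but not a test-path, because it does not start in N0.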
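Graph terminologies (AO forget to define this!)
A path q is a subpath of p if q is not longer than p and: $(\exists i : i \ge 0 : (\forall k : 0 \le k \le \mathrm{length}(q) : q_k = p_{i+k}))$, where length counts edges, so q consists of the nodes $q_0, \dots, q_{\mathrm{length}(q)}$.
[1,2] is a subpath of [0,1,2,3]
[1,2] is not a subpath of [1,0,2]
AO also use the term “p tours q”.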
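The subpath definition can be sketched as a check for a contiguous occurrence. The function name `is_subpath` is an illustrative choice, and paths are assumed to be Python lists of node numbers.

```python
def is_subpath(q, p):
    """True if q occurs in p as a contiguous run of nodes."""
    n = len(q)
    # If q has more nodes than p the range is empty, so the result is False.
    return n > 0 and any(p[i:i + n] == q for i in range(len(p) - n + 1))

print(is_subpath([1, 2], [0, 1, 2, 3]))   # the slide's positive example
print(is_subpath([1, 2], [1, 0, 2]))      # the slide's negative example
```

The second call fails because 1 and 2 both occur in [1,0,2] but not next to each other.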
CFG-based TR
[Figure: example CFG]
A path can be used to express a test requirement. If a path q is such a requirement, it is covered by a test-path p if p tours q. So, we can specify a TR in terms of a set of paths. Example: TR = { [0,0], [2], [3] }. Which test-set is good enough to satisfy this?
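A hedged sketch of checking a path-based TR against a set of test-paths. The TR is the one from the example; the two test-paths t1 and t2 are made up, since the underlying graph is not reproduced here.

```python
def tours(p, q):
    """p tours q if q occurs in p as a contiguous run of nodes."""
    n = len(q)
    return any(tuple(p[i:i + n]) == tuple(q) for i in range(len(p) - n + 1))

def uncovered(tr, test_paths):
    """Requirements in tr that no test-path tours."""
    return [q for q in tr if not any(tours(p, q) for p in test_paths)]

tr = [(0, 0), (2,), (3,)]
t1 = [0, 0, 1, 2]     # hypothetical test-path
t2 = [0, 1, 3]        # hypothetical test-path

print(uncovered(tr, [t1]))
print(uncovered(tr, [t1, t2]))
```

With only t1 the requirement [3] stays uncovered; adding t2 leaves nothing uncovered.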
AO’s graph coverage criteria
(C2.1) Node coverage: TR is the set of paths of length 0 in G.
(C2.2) Edge coverage: TR is the set of paths of length at most 1 in G. So, what does this criterion tell? Why “at most”?
(Def 1.24) A coverage criterion C1 subsumes (is stronger than) C2 iff every test-set that satisfies C1 also satisfies C2.
Does C2.2 subsume C2.1? Obviously. How about the other way around? Are we happy now!? How about C2.3 (Edge-pair coverage): TR contains all paths of length at most 2.
[Figure: example CFGs]
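Node, edge, and edge-pair coverage differ only in the maximal length of the required paths, which a small sketch can make explicit. The function name `paths_up_to` and the example graphs are assumptions for illustration.

```python
def paths_up_to(nodes, edges, k):
    """All paths (as tuples of nodes) of length at most k, length counted in edges."""
    level = {(n,) for n in nodes}          # paths of length 0
    result = set(level)
    for _ in range(k):
        # Extend every path of the previous level by one outgoing edge.
        level = {p + (b,) for p in level for (a, b) in edges if a == p[-1]}
        result |= level
    return result

nodes, edges = {0, 1, 2}, {(0, 1), (1, 2)}    # hypothetical graph 0 -> 1 -> 2
node_tr = paths_up_to(nodes, edges, 0)        # C2.1
edge_tr = paths_up_to(nodes, edges, 1)        # C2.2
pair_tr = paths_up_to(nodes, edges, 2)        # C2.3
print(len(node_tr), len(edge_tr), len(pair_tr))

# A graph with one node and no edges: "at most 1" keeps the TR non-empty.
print(paths_up_to({0}, set(), 1))
```

Because each TR contains the previous one, any test-set that tours the larger TR also tours the smaller one, which is the subsumption argument.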
Potential weakness: a row of diamonds
[Figure: CFG consisting of a row of diamonds, nodes 1..9]
Criterion           Min. size of test-set that will satisfy it
Node (C2.1)         2
Edge (C2.2)
Edge-pair (C2.3)    4
A row of k diamonds has 2^k different paths. However, suggesting that TR should include all paths in G is not realistic either (think of loops!).
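The 2^k growth can be checked with a short sketch. The node numbering (diamond i runs from node 3i to node 3i+3) is an assumption of this sketch, not the numbering of the slide's figure.

```python
def diamond_row(k):
    """Edges of a row of k diamonds; diamond i goes from node 3*i to node 3*i + 3."""
    edges = set()
    for i in range(k):
        top = 3 * i
        edges |= {(top, top + 1), (top, top + 2),
                  (top + 1, top + 3), (top + 2, top + 3)}
    return edges

def count_paths(edges, src, dst):
    """Number of paths from src to dst; terminates because the graph is acyclic."""
    if src == dst:
        return 1
    return sum(count_paths(edges, b, dst) for (a, b) in edges if a == src)

for k in (1, 2, 3, 4):
    print(k, count_paths(diamond_row(k), 0, 3 * k))
```

Each diamond doubles the number of complete paths, while node, edge, and edge-pair coverage need only a small, fixed number of tests.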
AO propose prime path coverage
[Figure: example CFG]
A simple path p is a path where every node in p appears only once, except the first and the last, which are allowed to be the same. Such a path is a prime path if it is not a proper subpath of another simple path.
Prime Path Coverage (PPC)
(C2.4) TR is the set of all prime paths in G. Strong in a diamond row (forces you to cover all 2^k paths). Still finite. But...
[Figure: CFG with nodes 0, 1, 2, where 0 and 1 form a loop and 1 exits to 2]
TR = { 012, 010, 101 }
Perhaps a bit surprisingly, a single test-path can tour the above TR, namely 01012. Note that PPC is not the same as, nor does it imply, covering each loop both with no iteration and with at least one iteration, as this single test-path shows. We can fix C2.4 e.g. by requiring minimum-length test-paths, but we will not go into this.
Calculating PPC’s TR
How long can a prime path be? The length of any simple path is bounded by N, where N is the number of nodes in G (by N-1 if the path is not a cycle). Since a prime path is also a simple path, it follows that the number of prime paths is finite, so they can be systematically calculated, e.g.:
“Invariant”: maintain S as a set of simple paths that are still extensible (at the end-side), and T as a set of simple paths that are non-extensible (at the end-side).
“Extend” S by extending the paths in it; move the non-extensible ones to T to keep the invariant maintained.
This will terminate! At the end T contains the non-extensible simple paths.
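The S/T procedure can be sketched as follows. This is one possible reading of the scheme, not AO's reference implementation; the function names are illustrative, and the small graph with edges 0→1, 1→0, 1→2 is assumed to be the three-node loop example whose TR is { 012, 010, 101 }.

```python
def is_subpath(q, p):
    n = len(q)
    return any(p[i:i + n] == q for i in range(len(p) - n + 1))

def prime_paths(nodes, edges):
    succ = {n: sorted(b for (a, b) in edges if a == n) for n in nodes}
    s = [(n,) for n in sorted(nodes)]    # S: simple paths that may still be extensible
    t = []                               # T: simple paths that cannot be extended
    while s:
        next_s = []
        for p in s:
            is_cycle = len(p) > 1 and p[0] == p[-1]
            # p ++ [v] stays simple if v is new, or if v closes a cycle at p[0].
            exts = [] if is_cycle else [p + (v,) for v in succ[p[-1]]
                                        if v not in p[1:]]
            if exts:
                next_s.extend(exts)
            else:
                t.append(p)
        s = next_s
    # Prime paths: members of T that are not a proper subpath of another member.
    return sorted(p for p in t
                  if not any(p != r and is_subpath(p, r) for r in t))

small = prime_paths({0, 1, 2}, {(0, 1), (1, 0), (1, 2)})
print(small)

# The single test-path 01012 tours every prime path of the small graph.
print(all(is_subpath(q, (0, 1, 0, 1, 2)) for q in small))
```

Every pass makes the paths in S one edge longer, and simple paths have bounded length, so the loop terminates.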
Example
[Figure: CFG with nodes 0..6; its edges are exactly the paths of length 1 listed below]
S, paths of length 0: [0] [1] [2] [3] [4] [5] [6] !
S, paths of length 1: [0, 1] [0, 2] [1, 2] [2, 3] [2, 4] [3, 6] ! [4, 6] ! [4, 5] [5, 4]
S, paths of length 2: [0, 1, 2] [0, 2, 3] [0, 2, 4] [1, 2, 3] [1, 2, 4] [2, 3, 6] ! [2, 4, 6] ! [2, 4, 5] ! [4, 5, 4] * [5, 4, 6] ! [5, 4, 5] *
A simple path s cannot be extended at the end-side to s++[v] if one of these happens:
  the last node of s is already a terminal node
  s++[v] is not a simple path because it cycles into the middle of s
  s is already a cycle
We mark simple paths that cannot be extended. We also mark cycles.
  ! cannot be extended
  * cycle
Example (continued)
[Figure: the same CFG with nodes 0..6]
Paths of length 2: [0, 1, 2] [0, 2, 3] [0, 2, 4] [1, 2, 3] [1, 2, 4] [2, 3, 6] ! [2, 4, 6] ! [2, 4, 5] ! [4, 5, 4] * [5, 4, 6] ! [5, 4, 5] *
Paths of length 3: [0, 1, 2, 3] [0, 1, 2, 4] [0, 2, 3, 6] ! [0, 2, 4, 6] ! [0, 2, 4, 5] ! [1, 2, 3, 6] ! [1, 2, 4, 5] ! [1, 2, 4, 6] !
Paths of length 4: [0, 1, 2, 3, 6] ! [0, 1, 2, 4, 6] ! [0, 1, 2, 4, 5] !
S is now empty; terminate. Prime paths = all the (*) paths, plus the non-cyclic simple paths (!) which are not a subpath of another (!).
Infeasible test requirement
A test requirement is infeasible if it turns out that there is no real execution that can cover it. For example, we cannot test a Kelvin thermometer below 0 K.
Typical infeasibility in loops
i = a.length-1 ; while (i >= 0) { if (a[i]==0) break ; i-- }
[Figure: CFG of the loop, with nodes 1..5]
Some loops always iterate at least once. E.g. above, if a is never empty, prime path 125 is infeasible.
(Def 2.37) A test-path p tours q with sidetrips if all edges in q appear in p, and they appear in the same order.
123425 does not tour 125, but with sidetrips it does.
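A minimal sketch of touring with sidetrips as worded in Def 2.37 above (the edges of q occur in p in the same order), next to strict touring for comparison. The function names are illustrative, and treating a single-node q as "the node occurs in p" is an assumption of this sketch.

```python
def edges_of(path):
    return list(zip(path, path[1:]))

def tours_directly(p, q):
    n = len(q)
    return any(p[i:i + n] == q for i in range(len(p) - n + 1))

def tours_with_sidetrips(p, q):
    if len(q) == 1:                  # assumption: no edges, so just require the node
        return q[0] in p
    remaining = iter(edges_of(p))
    # "e in remaining" consumes the iterator up to e, which enforces the order.
    return all(e in remaining for e in edges_of(q))

p = [1, 2, 3, 4, 2, 5]
print(tours_directly(p, [1, 2, 5]), tours_with_sidetrips(p, [1, 2, 5]))
```

The path 123425 leaves node 2, makes the sidetrip 3, 4, and returns to 2 before taking the edge to 5, so both edges of 125 are present in order.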
Generalizing Graph-based Coverage Criterion (CC)
Obviously, touring with sidetrips is weaker than strict touring.
Every CC we have so far (C2.1, 2.2, 2.3, 2.4, 2.7) can be thought of as parameterized by the concept of “tour” that is used.
We can opt to use a weaker “tour”, which suggests this strategy: first try to meet your CC with the strongest concept of tour. If some paths are still uncovered and you suspect them to be infeasible, try again with a weaker form of touring.
Oh, another scenario...
[Figure: the same loop CFG, with nodes 1..5]
Often, loops always break in the middle. E.g. above, if a always contains a 0, then 125 is infeasible. Notice: e.g. 1235 does not tour 125, not even with sidetrips!
(Def 2.38) A test-path p tours q with detours if all nodes in q appear in p, and they appear in the same order (q is a subsequence of p). Weaker than sidetrips.
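Touring with detours is a plain subsequence test on nodes, which a short sketch can show. The function name is an illustrative choice.

```python
def tours_with_detours(p, q):
    """True if the nodes of q occur in p in the same order (q is a subsequence of p)."""
    remaining = iter(p)
    # "n in remaining" advances the iterator past n, so order is respected.
    return all(n in remaining for n in q)

print(tours_with_detours([1, 2, 3, 5], [1, 2, 5]))
print(tours_with_detours([1, 3, 5], [1, 2, 5]))
```

The path 1235 never takes the edge from 2 to 5, so it cannot tour 125 with sidetrips, yet it still visits 1, 2, and 5 in order.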
Mapping your program to its CFG: many ways, depending on purpose
P1(x) { if (x==0) { x++ ; return 0 } else return x }
[Figure: three alternative CFGs of P1 (left, middle, right); they differ in how the test (x==0) and the statements x++, return 0, and return x are distributed over nodes and edge labels]
See Sec. 2.3.1. AO use left; I usually use middle.
(Sec 2.3.1) A block is a maximal sequence of “instructions” such that if one instruction is executed, all the others are always executed as well. This implies that:
  a block does not contain a jump or branching instruction, except possibly as its last instruction
  no instruction except the first one can be the target of a jump/branching instruction in the program.
Mapping your program to its CFG
P2(a) { for (int i=0; i<a.length ; i++) { if (a[i]==0) break ; if (odd(a[i])) continue ; b[i] = a[i] ; a[i] = 0 } }
[Figure: CFG of P2, with nodes 1..6]
Add “end-of-program” as a virtual node.
Discussion: exception
try { File f = openFile("input") ; x = f.readInt() ; y = (Double) 1/x } catch (Exception e) { y = -1 } return y
[Figure: CFG in which the whole try-block is the single node 1]
The 1st line may throw a file-not-exists exception.
The 2nd line may throw an exception if the file does not contain an int.
The 3rd line throws an exception if x is 0.
Every instruction can potentially throw an exception. Yet representing each instruction as its own node in our CFG will blow up the size of the graph, and thus also the size of the TR to tour. A minimalistic approach is to ignore the fact that different instructions in the same block may throw different exceptions, and thus may be responsible for their own kind of error. Ignoring this, we can pretend that the jump to an exception handler always happens at the end of the block, giving us the above CFG. But you need to be aware of what is being ignored.
Discussion: exception
try { File f = openFile("input") ; x = f.readInt() ; y = (Double) 1/x } catch (IOException e) { y = -1 } catch (Exception e) { y = -1 } return y
[Figure: CFG with the try-block as node 1 and the two handlers as nodes 1a and 1b]
Had we programmed it as above, we could at least distinguish between an execution that throws an IOException and one that throws another kind of exception (because the handlers are represented by different nodes in the CFG). But of course we are still blind to the distinction between e.g. an execution that throws a ParseException and one that throws a DivisionByZeroException.
More discussion
What to do with dynamic binding?
register(Course c, Student s) { if (! c.full()) c.add(s) ; }
You don’t know ahead which “add” will be called; it depends on the run-time type of c. E.g. it may refuse to add certain types of students. Graph-based coverage is not strong enough to express this aspect.
What to do with recursion?
f(x) { if (x <= 0) return 0 else return f(x-1) }
g(x) { if (x <= 0) return 0 else { x = g(x-1) ; return x+1 } }