Slide 1: CS5103 Software Engineering
Lecture 16: Test Coverage and Regression Testing
Slide 2: Today's Class
- Test coverage
  - Input combination coverage
  - Mutation coverage
- Regression testing
  - Test prioritization
  - Mocking
Slide 3: Input Combination Coverage
- Basic idea: the most straightforward notion of coverage
  - In theory, 100% coverage proves 100% correctness
  - In practice, achievable only on very trivial cases
- Main problems
  - The number of combinations is exponential
  - Possible values can be effectively infinite
Slide 4: Input Combination Coverage
- An example: a simple automatic sales machine
  - Accepts only a $1 bill at a time, and all beverages cost $1
  - Beverage: Coke, Sprite, Juice, or Water
  - Temperature: icy or normal
  - Receipt: wanted or not
- All combinations: 4 * 2 * 2 = 16
- Trying all 16 combinations makes sure the system works correctly
Slide 5: Input Combination Coverage: Sales Machine Example
- Input 1 (beverage): Coke, Sprite, Juice, Water
- Input 2 (temperature): Normal, Icy
- Input 3 (receipt): Receipt, No-Receipt
Slide 6: Combination Explosion
- The number of combinations is exponential in the number of inputs
- Consider an annual tax report system with 50 yes/no questions that generate a customized form
  - 2^50 combinations, about 10^15 test cases
  - Running 1000 test cases per second would take about 30,000 years
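The arithmetic above can be sanity-checked in a few lines (a Python sketch; the exact figure comes out a bit above the slide's rounded 30,000 years):

```python
# Check the combination-explosion arithmetic: 50 yes/no inputs,
# running 1000 test cases per second.
num_inputs = 50
combinations = 2 ** num_inputs            # about 1.1e15 test cases
seconds = combinations / 1000             # at 1000 tests per second
years = seconds / (365 * 24 * 3600)
print(f"{combinations:.1e} combinations, about {years:,.0f} years")
```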
Slide 7: Observation
- When there are many inputs, a faulty interaction among inputs usually involves only a small number of them
- In the previous example: maybe only the icy option interacts with Coke and Sprite, while the receipt option is independent
Slide 8: Example: Tax Report
- Input 1: family combined report or single report
- Input 2: home loans or not
- Input 3: received a gift or not
- Input 4: age over 60 or not
- ...
- Input 1 is related to all the other inputs
- The other inputs are independent of each other
Slide 9: Studies
- A long-term study from NIST (the National Institute of Standards and Technology)
- A combination width of 4 to 6 is enough to detect almost all errors
Slide 10: N-wise Coverage
- Coverage of every N-wise combination of the possible values of the inputs
- Example: 2-wise combinations
  - (coke, icy), (sprite, icy), (juice, icy), (water, icy)
  - (coke, normal), (sprite, normal), ...
  - (coke, receipt), (sprite, receipt), ...
  - (coke, no-receipt), (sprite, no-receipt), ...
  - (icy, receipt), (normal, receipt)
  - (icy, no-receipt), (normal, no-receipt)
  - 20 combinations in total
- We had 16 3-wise (full) combinations, and now we have 20. Did it get worse?
Slide 11: N-wise Coverage
- Note: one test case may cover multiple N-wise combinations
  - E.g., (coke, icy, receipt) covers three 2-wise combinations: (coke, icy), (coke, receipt), (icy, receipt)
- 100% N-wise coverage fully subsumes 100% (N-1)-wise coverage. Is this true?
- For k Boolean inputs:
  - Full combination coverage: 2^k combinations (exponential)
  - Full n-wise coverage: 2^n * k*(k-1)*...*(k-n+1)/n! combinations (polynomial); for 2-wise, that is 2*k*(k-1)
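The counts can be double-checked by brute force (a small sketch; `nwise_count` is an illustrative helper, not from the lecture):

```python
from itertools import combinations
from math import comb

# Count all n-wise value combinations for k Boolean inputs by enumeration,
# and compare with the closed form 2^n * C(k, n).
def nwise_count(k, n):
    total = 0
    for _positions in combinations(range(k), n):  # choose n of the k inputs
        total += 2 ** n                           # each chosen input has 2 values
    return total

k = 10
print(nwise_count(k, 2), 2 * k * (k - 1), 2 ** k)  # 180 180 1024
assert nwise_count(k, 2) == 2 ** 2 * comb(k, 2) == 2 * k * (k - 1)
```

So for 10 Boolean inputs, pairwise coverage needs to account for 180 pairs, versus 1024 full combinations, and the gap widens rapidly as k grows.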
Slide 12: N-wise Coverage: Example
- How many test cases do we need for 100% 2-wise coverage of the sales machine example?
  - (coke, icy, receipt): covers 3 new 2-wise combinations
  - (sprite, icy, no-receipt): covers 3 new
  - (juice, icy, receipt): covers 2 new
  - (water, icy, receipt): covers 2 new
  - (coke, normal, no-receipt): covers 3 new
  - (sprite, normal, receipt): covers 3 new
  - (juice, normal, no-receipt): covers 2 new
  - (water, normal, no-receipt): covers 2 new
- 8 test cases cover all 20 2-wise combinations
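The claim that these 8 test cases reach 100% 2-wise coverage can be verified mechanically (a quick sketch in Python):

```python
from itertools import combinations

# Verify that the slide's 8 test cases cover all 20 pairwise (2-wise)
# combinations of the sales machine inputs.
tests = [
    ("coke", "icy", "receipt"),
    ("sprite", "icy", "no-receipt"),
    ("juice", "icy", "receipt"),
    ("water", "icy", "receipt"),
    ("coke", "normal", "no-receipt"),
    ("sprite", "normal", "receipt"),
    ("juice", "normal", "no-receipt"),
    ("water", "normal", "no-receipt"),
]
domains = [
    ["coke", "sprite", "juice", "water"],  # input 1: beverage
    ["icy", "normal"],                     # input 2: temperature
    ["receipt", "no-receipt"],             # input 3: receipt
]
# Every pair that must be covered: one value from each of two distinct inputs.
required = {(a, b) for i, j in combinations(range(3), 2)
            for a in domains[i] for b in domains[j]}
covered = {(t[i], t[j]) for t in tests for i, j in combinations(range(3), 2)}
print(len(required), required <= covered)  # 20 True
```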
Slide 13: Combination Coverage in Practice
- 2-wise combination coverage is very widely used
  - Also called pair-wise testing or all-pairs testing
- Mostly used in configuration testing
- Example: the configuration of gcc
  - A lot of variables, with several options each
  - For command-line tools: add or remove an option
Slide 14: Input Model
- What happens if an input has infinite possible values?
  - Integer, float, character, string
- Note: all of these are actually finite, but the possible value sets are so large that they are treated as infinite
- Idea: map the infinite values to a finite set of value baskets (ranges)
Slide 15: Input Model
- Input partition
  - Partition the possible value set of an input into several value ranges
  - Transforms numeric variables (integer, float, double, character) into enumerated variables
- Examples:
  - int exam_score => {less than 0}, {0-59}, {60-69}, {70-79}, {80-89}, {90-100}, {over 100}
  - char c => {a-z}, {A-Z}, {0-9}, {other}
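As a minimal sketch of input partitioning (the bucket names are illustrative), a partition function maps each raw value to its range, so one representative test input per bucket covers the whole partition:

```python
# Map a raw integer exam score to its partition bucket.
def score_partition(score: int) -> str:
    if score < 0:
        return "invalid-negative"
    if score <= 59:
        return "0-59"
    if score <= 69:
        return "60-69"
    if score <= 79:
        return "70-79"
    if score <= 89:
        return "80-89"
    if score <= 100:
        return "90-100"
    return "invalid-over-100"

# One representative value per bucket is enough to cover the partition.
print([score_partition(s) for s in (-5, 30, 65, 75, 85, 95, 101)])
```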
Slide 16: Input Model
- Feature extraction
  - For string and structured inputs
  - Split the possible value set by a certain feature
  - Example: String passwd => {contains space}, {no space}
- It is possible to extract multiple features from one input
  - Example: String name => {capitalized first letter}, {not}; {contains space}, {not}; {length > 10}, {2-10}, {1}, {0}
- One test case may cover multiple features
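The string features above can be sketched as one extraction function (the function and bucket names are illustrative, not from the lecture):

```python
# Extract the slide's three features from a string input: each feature
# maps the unbounded string domain onto a small set of buckets.
def name_features(name: str) -> dict:
    if len(name) == 0:
        length = "0"
    elif len(name) == 1:
        length = "1"
    elif len(name) <= 10:
        length = "2-10"
    else:
        length = ">10"
    return {
        "capitalized": name[:1].isupper(),
        "contains_space": " " in name,
        "length": length,
    }

print(name_features("Ada Lovelace"))
# {'capitalized': True, 'contains_space': True, 'length': '>10'}
```

One test input covers one bucket of every feature at once, which is why a single test case may cover multiple features.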
Slide 17: Input Model
- Feature extraction for a structured input: a word binary tree (the data at every node is a string)
  - Depth: integer -> partition {0, 1, 1+}
  - Number of leaves: integer -> partition {0, 1, <10, 10+}
  - Root: null or not
  - A node with only a left child: present or not
  - A node with only a right child: present or not
  - A null data value on some node: present or not
  - Root value: string -> further feature extraction
  - Value of the leftmost leaf: string -> further feature extraction
  - ...
Slide 18: Input Model
- Some feature combinations are infeasible
- Example: String name => {capitalized first letter}, {not}; {contains space}, {not}; {length > 10}, {2-10}, {1}, {0}
  - Infeasible: length = 0 AND contains space
  - Infeasible: length = 0 AND capitalized first letter
  - Infeasible: length = 1 AND contains space AND capitalized first letter
Slide 19: Input Combination Coverage: Summary
- Try to cover the combinations of possible input values
- Exponential combinations -> N-wise coverage
  - 2-wise coverage is the most popular (all-pairs testing)
- Infinite possible values -> input partition, input feature extraction
- Coverage is usually 100% once adopted
  - It is easy to achieve, compared with code coverage
  - But the input models are not easy to write
Slide 20: Test Coverage
- So far we have covered inputs and code
- The final goal of testing: find all the bugs in the software
- So there should be a "bug coverage"
  - Such a coverage would represent the adequacy of a test suite
  - 50% bug coverage = half done; 100% bug coverage = done!
Slide 21: But It Is Impossible
- The bugs are unknown; otherwise we would not need testing
  - We have the number of bugs found, but we do not know what to divide it by
- One possible solution: estimation
  - 1-10 bugs per KLOC
  - Depends on the type of software and the stage of development; imprecise
  - When you find many bugs, have you found nearly all of them, or is the code simply of low quality?
Slide 22: Mutation Coverage
- How can we know how many bugs there are in the code? If only we had planted those bugs ourselves!
- Mutation coverage checks the adequacy of a test suite by how many human-planted bugs it can expose
Slide 23: Concepts
- Mutant: a software version with planted bugs
  - Usually each mutant contains only one planted bug. Why?
- Mutant kill: given a test suite S and a mutant m, if there is a test case t in S such that execute(original, t) != execute(m, t), we say that S kills m
  - Killing a mutant means the test suite is able to detect the planted bug the mutant represents
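A minimal sketch of the kill relation, with a hand-made mutant of a tiny function (the comparison operator is flipped, as a "replace comparison operators" mutation would do):

```python
# Original program and one mutant with a single planted bug
# (the >= comparison replaced by <=).
def max_of(x, y):
    return x if x >= y else y

def max_of_mutant(x, y):
    return x if x <= y else y

def kills(original, mutant, suite):
    # The suite kills the mutant if some test observes a different output.
    return any(original(*t) != mutant(*t) for t in suite)

print(kills(max_of, max_of_mutant, [(1, 2), (5, 3)]))  # True: outputs differ
print(kills(max_of, max_of_mutant, [(2, 2)]))          # False: mutant survives
```

Note that a weak suite like the single test (2, 2) lets the mutant survive even though the planted bug is real, which is exactly the inadequacy mutation coverage is meant to expose.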
Slide 24: Illustration
- Run the test cases against the original and against each mutant (mutant 1, mutant 2, ..., mutant n), comparing the results against the oracles
- Same results as the original: the mutant survived; different results: the mutant is killed
Slide 25: Concepts
- Mutation coverage (the mutation score): the fraction of generated mutants that the test suite kills, i.e., killed mutants / all mutants
Slide 26: Mutant Generation
- Traditional mutation operators
  - Statement deletion
  - Replace a Boolean expression with true/false
  - Replace arithmetic operators (+, -, *, /, ...)
  - Replace comparison operators (>=, ==, <=, !=)
  - Replace variables
  - ...
Slide 27: Mutation Example: Operators

Mutation operator                  | In original | In mutant
Statement deletion                 | z=x*y+1;    | (statement removed)
Boolean expression to true/false   | if (x<y)    | if (true) / if (false)
Replace arithmetic operators       | z=x*y+1;    | z=x*y-1; / z=x+y-1;
Replace comparison operators       | if (x<y)    | if (x<=y) / if (x==y)
Replace variables                  | z=x*y+1;    | z=z*y+1; / z=x*x+1;
Slide 28: Mutation Testing Tools
- MILU: http://www0.cs.ucl.ac.uk/staff/Y.Jia/#tools
- MuJava: http://cs.gmu.edu/~offutt/mujava/
- Javalanche: https://github.com/david-schuler/javalanche/
Slide 29: Summary of All Coverage Measures
- Code coverage
  - Target: code
  - Adequacy: no -> 100% code coverage != no bugs
  - Approximations: dataflow, branch, method/statement coverage
  - Preparation: none (instrumentation can be done automatically)
  - Overhead: low (instrumentation causes some overhead)
Slide 30: Summary of All Coverage Measures
- Input combination coverage
  - Target: inputs
  - Adequacy: yes -> 100% input coverage == no bugs
  - Approximations: n-wise coverage, input partition, input feature extraction
  - Preparation: hard (requires input modeling)
  - Overhead: none
Slide 31: Summary of All Coverage Measures
- Mutation coverage
  - Target: bugs
  - Adequacy: no -> 100% mutation coverage != no bugs
  - Approximation: mutation itself is already an approximation
  - Preparation: none (mutation and execution can be done automatically)
  - Overhead: very high (the test suite is executed on every mutated version)
Slide 32: Regression Testing
- So far: unit testing, system testing, test coverage
  - All of these concern the first round of testing
- Testing is performed again and again during the software life cycle
  - Test cases and oracles can be reused in all rounds
- Testing during the evolution phase is regression testing
Slide 33: Regression Testing
- When we try to enhance the software, we may also introduce bugs
  - "The software worked yesterday, but not today": this is called a regression
- Numbers from an empirical study on Eclipse (2005):
  - 11% of commits are bug-inducing
  - 24% of fixing commits are bug-inducing
Slide 34: Regression Testing
- Run the old test cases on the new version of the software
- Running the whole suite every time costs a lot
- Ways to save time and cost in new rounds of testing:
  - Test prioritization
  - Fake objects
Slide 35: Test Prioritization
- Rank all the test cases
- Run the test cases in the ranked order
- Stop when the resources are used up
- How to rank the test cases: so that bugs are discovered sooner
  - Or, as an approximation: so that higher coverage is achieved sooner
Slide 36: APFD: Measuring Test Prioritization
- Average Percentage of Faults Detected (APFD)
- Compares test case sequences by the cumulative number of faults detected after each test case
- Which of the following two sequences is better?
  - S1: t1 (2 faults so far), t2 (3), t3 (5)
  - S2: t2 (1 fault so far), t1 (3), t3 (5)
- APFD is the average of these numbers, normalized by the total number of faults, with 0 for the initial state
  - APFD(S1) = (0/5 + 2/5 + 3/5 + 5/5) / 4 = 0.5
  - APFD(S2) = (0/5 + 1/5 + 3/5 + 5/5) / 4 = 0.45
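The slide's computation can be written directly (a sketch; `apfd` implements the averaging formula used on this slide):

```python
# APFD as defined above: average the cumulative fault-detection fractions
# after each test case, including a 0 for the initial state.
def apfd(cumulative_faults, total_faults):
    points = [0.0] + [c / total_faults for c in cumulative_faults]
    return sum(points) / len(points)

s1 = apfd([2, 3, 5], total_faults=5)  # sequence t1, t2, t3
s2 = apfd([1, 3, 5], total_faults=5)  # sequence t2, t1, t3
print(round(s1, 2), round(s2, 2))     # 0.5 0.45: S1 detects faults sooner
```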
Slide 37: APFD: Illustration
- APFD can be seen as the area under the test-case vs. faults-detected curve
- Consider t1 (detects f1, f2), t2 (f3), t3 (f3), t4 (f1, f2, f3, f4)
Slide 38: Coverage-Based Test Case Prioritization
- Code coverage based: requires code-coverage information recorded in previous testing
- Combination coverage based: requires an input model
- Mutation coverage based: requires recorded mutant-killing statistics
Slide 39: Total Strategy
- The simplest strategy: always select the not-yet-selected test case with the highest total coverage
Slide 40: Example
- Consider the code coverage of five test cases:
  - T1: s1, s3
  - T2: s2, s3, s4, s5
  - T3: s3, s4, s5
  - T4: s6, s7
  - T5: s3, s5, s8, s9, s10
- Total-strategy ranking: T5, T2, T3, T1/T4 (tie)
Slide 41: Additional Strategy
- An adaptation of the total strategy
- Instead of always choosing the test case with the highest total coverage, choose the test case that yields the most extra (not-yet-covered) coverage
- Start from the test case with the highest coverage
Slide 42: Example
- The same five test cases:
  - T1: s1, s3
  - T2: s2, s3, s4, s5
  - T3: s3, s4, s5
  - T4: s6, s7
  - T5: s3, s5, s8, s9, s10
- Additional-strategy ranking: T5 (5 new), then T2 (2 new: s2, s4) / T4 (2 new: s6, s7), then T1 (1 new: s1), then T3 (0 new)
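The additional strategy is a greedy loop over the not-yet-covered statements; the example above can be replayed with a short sketch (ties are broken here by test name, which happens to pick T2 before T4):

```python
# Greedy "additional" prioritization: repeatedly pick the test case that
# covers the most statements not yet covered by earlier picks.
coverage = {
    "T1": {"s1", "s3"},
    "T2": {"s2", "s3", "s4", "s5"},
    "T3": {"s3", "s4", "s5"},
    "T4": {"s6", "s7"},
    "T5": {"s3", "s5", "s8", "s9", "s10"},
}

def additional_order(coverage):
    remaining, covered, order = dict(coverage), set(), []
    while remaining:
        # Iterate in sorted name order so ties resolve deterministically.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        covered |= remaining.pop(best)
        order.append(best)
    return order

print(additional_order(coverage))  # ['T5', 'T2', 'T4', 'T1', 'T3']
```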
Slide 43: Fake Objects
- A resource waste in regression testing: we change the code a little bit, yet we must run all the unchanged code in every test execution
- Using fake objects
  - For all or some of the unchanged modules, do not run the module
  - Use the results recorded in previous test runs instead
Slide 44: Fake Objects Example
- Testing an expert system for finance with two components: a UI and an interest calculator (driven by inputs from the UI)
- In the first round of testing, store the results of the interest calculator as a map: (a, b) -> 5%, (a, c) -> 10%, (d, e) -> 7.7%
- In regression testing, if the change is in the UI, you can rerun the software against the stored data map
- More fake objects means more time saved in regression testing, so should we mock every object?
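A minimal sketch of the record-then-replay idea behind this example (all class names are illustrative, not from a real mocking framework):

```python
# Round 1 records the real module's outputs; regression rounds replay them.
class InterestCalculator:
    def rate(self, customer, product):
        # Stand-in for an expensive real computation.
        table = {("a", "b"): 0.05, ("a", "c"): 0.10, ("d", "e"): 0.077}
        return table[(customer, product)]

class RecordingCalculator:
    def __init__(self, real):
        self.real, self.data_map = real, {}
    def rate(self, customer, product):
        result = self.real.rate(customer, product)
        self.data_map[(customer, product)] = result   # record for later rounds
        return result

class FakeInterestCalculator:
    def __init__(self, data_map):
        self.data_map = data_map
    def rate(self, customer, product):
        return self.data_map[(customer, product)]     # replay, never recompute

# First round: run for real while recording.
rec = RecordingCalculator(InterestCalculator())
rec.rate("a", "b"); rec.rate("a", "c"); rec.rate("d", "e")
# Regression round: the UI changed, the calculator did not, so replay.
fake = FakeInterestCalculator(rec.data_map)
print(fake.rate("a", "c"))  # 0.1
```

The fake only works for input pairs seen in the recorded round, which previews the cons on the next slide.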
Slide 45: Pros & Cons
- Pros
  - Saves time in regression testing
- Cons
  - Be careful when mocking non-deterministic components: e.g., a mocked getSystemTime() may conflict with another call
  - Recording the data maps takes a lot of time
  - The stored data map can be huge
  - When the mocked object itself is changed, the data map requires updates
Slide 46: Selecting Modules to Fake
- Rules
  - Use fake objects for time-consuming modules, so that you save more time
  - The faked module should be stable, e.g., libraries
  - Its interface should carry a small data flow, e.g., numeric inputs and return values
Slide 47: Fake Objects
- Fake objects are not just useful for regression testing; they also help with
  - UI components
  - Internet components
  - Components that affect the real world, e.g., sending an email or transferring money from credit cards
Slide 48: Next Class
- Debugging
- Test-coverage-based bug localization
- Delta debugging
Slide 49: Thanks!