Mutant Subsumption Graphs
Mutation 2014, March 31, 2014
Bob Kurtz, Paul Ammann, Marcio Delamaro, Jeff Offutt, Lin Deng
Introduction
In this talk, we will
– Define true subsumption, dynamic subsumption, and static subsumption to model the redundancy between mutants
– Develop a graph model to display the subsumption relationship
– Examine via an example how subsumption relationships behave and evolve
Motivation
What exactly is subsumption, anyway?
– Lots of prior work: fault hierarchies, subsuming HOMs, etc.
– Can we specify some rules and produce a useful representation?
What can we do with it once we have it?
– Can we select a minimal set of mutants to reduce testing complexity?
– Can we use subsumption as a fitness function for tasks like evaluating automated program repair candidates?
“True” Subsumption
Given a set of mutants M on artifact A, mutant m_i subsumes mutant m_j (m_i → m_j) iff:
– Some test kills m_i
– All tests that kill m_i also kill m_j
“True” subsumption represents the actual relationship between mutants
We’d like to get this relationship, but in general it is undecidable, so we must approximate it
Dynamic Subsumption
Dynamic subsumption approximates true subsumption using a finite test set T
Given a set of mutants M on artifact A and a test set T, mutant m_i dynamically subsumes mutant m_j iff:
– Some test in T kills m_i
– All tests in T that kill m_i also kill m_j
If T contains all possible tests, then dynamic subsumption = true subsumption
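The two conditions above can be checked directly from a kill matrix. The following is a minimal sketch, not the authors' tooling: the mutant names, test names, and kill data are hypothetical, and `dynamically_subsumes` is an illustrative helper.

```python
from typing import Dict, Set

# Hypothetical kill data: mutant name -> set of tests in T that kill it.
kills: Dict[str, Set[str]] = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},
    "m3": {"t1", "t2", "t3"},
    "m4": set(),  # live: no test in T kills it
}

def dynamically_subsumes(kills: Dict[str, Set[str]], mi: str, mj: str) -> bool:
    """True iff some test in T kills mi and every test in T that kills mi also kills mj."""
    return bool(kills[mi]) and kills[mi] <= kills[mj]

# All ordered pairs (mi, mj) such that mi dynamically subsumes mj.
pairs = sorted(
    (mi, mj)
    for mi in kills
    for mj in kills
    if mi != mj and dynamically_subsumes(kills, mi, mj)
)
print(pairs)
```

The live mutant m4 neither subsumes nor is subsumed here, because the first condition requires at least one killing test.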
Static Subsumption
Static subsumption approximates true subsumption using static analysis of mutants rather than test execution
Given a set of mutants M on artifact A, mutant m_i statically subsumes mutant m_j iff:
– Analysis shows that some test could kill m_i
– Analysis shows that all tests that could kill m_i also would kill m_j
If we had omniscient analysis, then static subsumption = true subsumption
An Informal View
[Figure: diagram of all tests, with regions for the tests that kill m_i, the tests that kill m_j, and the tests that kill m_k, illustrating m_i → m_j]
Graph Model
In the Mutant Subsumption Graph (MSG) model
– Nodes represent a maximal set of indistinguished mutants
– Edges represent the subsumption relationship
– Thus, m_1 → m_2 → m_3 is represented as:
[Figure: graph for m_1 → m_2 → m_3]
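One way to build such a graph from kill data is sketched below. It rests on assumptions that go beyond the slide: mutants count as indistinguished when exactly the same tests kill them, live mutants are left out, and edges implied by transitivity are dropped. The kill data and the `build_msg` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical kill data: mutant name -> set of tests that kill it.
kills = {
    "m1": {"t1"},
    "m2": {"t1"},
    "m3": {"t1", "t2"},
    "m4": {"t1", "t2", "t3"},
    "m5": {"t3"},
}

def build_msg(kills):
    """Return (nodes, edges): nodes are frozensets of indistinguished mutants."""
    groups = defaultdict(set)
    for mutant, tests in kills.items():
        if tests:  # skip live mutants in this sketch
            groups[frozenset(tests)].add(mutant)
    node_of = {tests: frozenset(mutants) for tests, mutants in groups.items()}
    # A node subsumes another when its kill set is a strict subset of the other's.
    edges = {(a, b) for a in node_of for b in node_of if a < b}
    # Drop edges that are implied by a two-step path.
    reduced = {
        (a, b)
        for (a, b) in edges
        if not any((a, c) in edges and (c, b) in edges for c in node_of)
    }
    nodes = set(node_of.values())
    return nodes, {(node_of[a], node_of[b]) for a, b in reduced}

nodes, edges = build_msg(kills)
for src, dst in sorted((sorted(a), sorted(b)) for a, b in edges):
    print(src, "->", dst)
```

In this data m1 and m2 share a node, and the direct edge from that node to m4 is omitted because it follows from the path through m3.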
Dynamic Subsumption Graph (DMSG)
T = { t_1, t_2, t_3, t_4 }
[Figure: example DMSG for T, with a node of “indistinguished” mutants labeled]
Minimal Mutants
Minimal mutants are not subsumed by any other mutants
If we execute a test set that kills all the minimal mutants, then we will kill all the mutants
– All other mutants are redundant!
[Figure: subsumption graph with the minimal nodes labeled “Minimal”]
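The redundancy claim can be illustrated on a small kill matrix. This is a sketch under assumptions: the kill data is hypothetical, live mutants are ignored, and a mutant is treated as minimal when no other killed mutant has a strictly smaller kill set.

```python
# Hypothetical kill data: mutant name -> set of tests that kill it.
kills = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},
    "m3": {"t2", "t3"},
    "m4": {"t3"},
    "m5": {"t1", "t3", "t4"},
}

def minimal_mutants(kills):
    """Killed mutants whose kill set has no strictly smaller kill set among the others."""
    killed = {m: t for m, t in kills.items() if t}
    return sorted(
        m for m, t in killed.items()
        if not any(other < t for other in killed.values())
    )

def killed_by(kills, tests):
    """Mutants killed by at least one test in the given test set."""
    return {m for m, t in kills.items() if t & tests}

minimal = minimal_mutants(kills)

# Pick one killing test per minimal mutant, then see what that small test set kills.
chosen = {sorted(kills[m])[0] for m in minimal}
print(minimal, sorted(chosen), sorted(killed_by(kills, chosen)))
```

Here two tests chosen to kill the two minimal mutants also kill the three subsumed mutants.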
DMSG Growth
We can observe the growth of the DMSG as we add tests
– Dashed nodes indicate live mutants
[Figure: four DMSGs, for T = { t_1 }, T = { t_1, t_2 }, T = { t_1, t_2, t_3 }, and T = { t_1, t_2, t_3, t_4 }]
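The growth can be tracked numerically by recomputing the graph for each prefix of the test list. The sketch below uses hypothetical kill data and a hypothetical `dmsg_summary` helper; it counts nodes of killed mutants, minimal nodes, and live mutants.

```python
# Hypothetical kill data over the full test set: mutant -> tests that kill it.
full_kills = {
    "m1": {"t1", "t3"},
    "m2": {"t1"},
    "m3": {"t2", "t3"},
    "m4": {"t4"},
    "m5": {"t2", "t3"},
}
tests_in_order = ["t1", "t2", "t3", "t4"]

def dmsg_summary(full_kills, tests):
    """Return (killed nodes, minimal nodes, live mutants) for the given test set."""
    t = set(tests)
    restricted = [frozenset(k & t) for k in full_kills.values()]
    live = sum(1 for k in restricted if not k)
    # Each distinct non-empty kill set is one node of indistinguished mutants.
    node_sets = {k for k in restricted if k}
    minimal = [s for s in node_sets if not any(o < s for o in node_sets)]
    return len(node_sets), len(minimal), live

growth = [
    dmsg_summary(full_kills, tests_in_order[: n + 1])
    for n in range(len(tests_in_order))
]
for n, row in enumerate(growth, start=1):
    print(n, "tests:", row)
```

With this data the node count grows with every added test, while the number of minimal nodes stays flat when t3 is added.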
Subsumption State Model
Mutants change state (with respect to subsumption relationships) as tests are added.
– Live or killed
– Distinguished or indistinguished
– Minimal or subsumed (only if killed)
These 3 attributes combine to create 8 possible states, but since subsumption is not defined for live mutants, we only care about 6 states: 2 for live mutants and 4 for killed mutants
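The count of 8 versus 6 states can be reproduced by enumeration. This sketch assumes that the minimal/subsumed attribute is simply ignored for live mutants, which is one reading of the slide.

```python
from itertools import product

# All combinations of the three binary attributes.
all_states = list(product(
    ["live", "killed"],
    ["distinguished", "indistinguished"],
    ["minimal", "subsumed"],
))

# Subsumption is not defined for live mutants, so drop the third attribute for them.
relevant_states = {
    (status, dist, sub if status == "killed" else None)
    for status, dist, sub in all_states
}

print(len(all_states), len(relevant_states))
```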
The cal() Example
To explore mutant subsumption graphs in more detail, we selected a small example program
cal() is a simple Java program of < 20 SLOC
– cal() calculates the number of days between two dates in the same year
Chosen for its well-defined finite input space
– See Ammann and Offutt, Introduction to Software Testing
We used muJava to generate 173 mutants
The cal() Example
Dynamic subsumption requires a test set
We used the Advanced Combinatorial Testing System (ACTS) to generate a test set
– Pairwise combinations of months and year types (divisible-by-400, divisible-by-100, divisible-by-4, other) generated 90 test cases
– The test set killed 145 mutants; the remaining 28 were analyzed by hand and determined to be equivalent
The cal() Example
31 nodes of indistinguished mutants
7 nodes of minimal mutants
– Even though muJava generated 145 non-equivalent mutants, we need to kill only 7 (one from each of these nodes) to ensure that we kill all of them
DMSG Growth for cal()
We can observe the growth of the DMSG as we individually add the 90 “pairwise” tests in random order
– Graph shows the number of minimal mutant nodes (red) and the total number of graph nodes (red + blue)
DMSG Growth for cal()
[Figure: growth plot]
cal() DMSG for Different Test Sets
– 90-test “pairwise” test set: 31 nodes, 7 minimal nodes
– 6-test “minimal” test set: 17 nodes, 6 minimal nodes
– 312-test “combinatorial” test set: 33 nodes, 9 minimal nodes
cal() in C
We implemented the cal() program in C, then used Proteum to generate mutants
– Proteum’s mutation operators are not based on the selective set of operators, so it generated many more mutants: 891
– The same 90 tests killed all but 71 mutants, and those 71 were determined to be equivalent
[Figure: DMSG for the C version; only 18 minimal nodes]
Dynamic Approximation
May group mutants together where a distinguishing test is missing
May add unsound edges where a contradicting test is missing
[Figure: TMSG compared with DMSG, with labels “Not Distinguished” and “Unsound Edge”]
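Both effects show up when the same two mutants are compared under test sets of different sizes. In this sketch the mutants "a" and "b", their kill data, and the helper names are hypothetical.

```python
# Hypothetical kill data over all available tests: mutant -> tests that kill it.
kills = {
    "a": {"t1", "t3"},
    "b": {"t1", "t2"},
}

def subsumes(kills, tests, mi, mj):
    """Dynamic subsumption of mj by mi, restricted to the given test set."""
    ki, kj = kills[mi] & tests, kills[mj] & tests
    return bool(ki) and ki <= kj

def indistinguished(kills, tests, mi, mj):
    """True iff the given tests cannot tell the two mutants apart."""
    return (kills[mi] & tests) == (kills[mj] & tests)

# With only t1, no test distinguishes a from b.
grouped = indistinguished(kills, {"t1"}, "a", "b")

# With t1 and t2, the edge a -> b appears because the contradicting test t3 is missing.
edge_small = subsumes(kills, {"t1", "t2"}, "a", "b")

# Adding t3 kills a but not b, so the edge disappears.
edge_full = subsumes(kills, {"t1", "t2", "t3"}, "a", "b")

print(grouped, edge_small, edge_full)
```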
Static Approximation
May group mutants together where unable to solve constraints
If analysis is sound, should never add unsound edges
[Figure: TMSG compared with SMSG, with label “Not Distinguished”]
Static Refinement of the DMSG
Can the dynamic results be refined by static analysis?
We performed a manual analysis of a small portion of the graph
Static Refinement of the DMSG
COI_1 is killed by all tests
AORB_4 is killed whenever (month2=month1)
AOIU_7 is killed whenever (month2≠month1) or whenever ((month2=month1)∧(day2≠day1))
[Figure: sets of tests that kill COI_1 (all tests), AORB_4, and AOIU_7]
Static Refinement of the DMSG
COI_1 is killed by all tests
AORB_4 is killed whenever (month2=month1)
AORB_2 is killed whenever ((month2=month1)∧((day2-day1)≠(day2/day1)))
[Figure: sets of tests that kill COI_1 (all tests), AORB_4, and AORB_2]
Static Refinement of the DMSG
AORB_2 is killed whenever (month2=month1)∧((day2-day1)≠(day2/day1))
AORB_3 is killed whenever (month2=month1)∧((day2-day1)≠(day2%day1))
What is the relationship between these mutants?
[Figure: all tests (the tests that kill COI_1), with the sets of tests that kill AORB_2 and AORB_3 and their relationship marked “?”]
Static Refinement of the DMSG
Combinations of day1 and day2 that kill:
– both AORB_2 and AORB_3 are GREEN
– neither are BLUE
– AORB_2 but not AORB_3 are RED
– AORB_3 but not AORB_2 are YELLOW
This one test case breaks AORB_3 → AORB_2
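The kill conditions for AORB_2 and AORB_3 can be enumerated over day pairs to look for a test that kills AORB_3 but not AORB_2. This is a sketch under assumptions: month2 = month1, both days range over 1..31, and Python's `//` and `%` stand in for Java's integer `/` and `%` (they agree for positive operands). The function names are illustrative.

```python
def kills_aorb_2(day1: int, day2: int) -> bool:
    # Kill condition from the slide: (day2 - day1) != (day2 / day1), integer division.
    return (day2 - day1) != (day2 // day1)

def kills_aorb_3(day1: int, day2: int) -> bool:
    # Kill condition from the slide: (day2 - day1) != (day2 % day1).
    return (day2 - day1) != (day2 % day1)

days = range(1, 32)  # assumed day range, with month2 = month1

# Day pairs that kill AORB_3 but not AORB_2 contradict AORB_3 -> AORB_2.
breaking = [
    (d1, d2)
    for d1 in days
    for d2 in days
    if kills_aorb_3(d1, d2) and not kills_aorb_2(d1, d2)
]

# Day pairs that kill AORB_2 but not AORB_3 contradict the reverse edge.
reverse_breaking = [
    (d1, d2)
    for d1 in days
    for d2 in days
    if kills_aorb_2(d1, d2) and not kills_aorb_3(d1, d2)
]

print(breaking)
print(len(reverse_breaking) > 0)
```

Under these assumptions the search finds a single pair, day1 = 2 and day2 = 4, that kills AORB_3 but not AORB_2.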
Static Refinement of the DMSG
Static analysis removes the unsound edge between AORB_3 and AORB_2
[Figure: DMSG fragment with the edge labeled “Unsound Edge” refines to a graph without that edge]
“Stubborn” Mutants
Yao, Harman, and Jia define “stubborn” mutants as those non-equivalent mutants which are not killed by a branch-adequate test set
– “A Study of Equivalent and Stubborn Mutation Operators Using Human Analysis of Equivalence”, ICSE 2014
What’s the relationship between “stubborn” mutants and minimal mutants?
[Figure labels: 82% kill, 63% kill]
Summary
We have developed a succinct definition of mutant subsumption, as well as two practical approximations, dynamic and static subsumption
We have developed a graphical notation for subsumption
We have investigated some properties of subsumption, including growth patterns of the DMSG and a state machine
Open Questions
Why are the Java/muJava and C/Proteum subsumption graphs so different?
Can we analyze static subsumption using Java Pathfinder and differential symbolic execution (or some other tools/techniques)?
How do we merge dynamic and static MSGs to get closer to the “true” MSG?
What is the relationship between minimal and “stubborn” mutants?
Related Information
Establishing Theoretical Minimal Sets of Mutants
– Paul Ammann, Marcio Delamaro, and Jeff Offutt
– Tuesday, 11:30-1:00 in the Burlington Room
Questions?
Minimal Mutant Operators
– AORB_13
– ROR_16, ROR_20
– ROR_17, AORB_12, AORB_11, AORB_10, AOIS_20, AOIS_22, AOIS_21, AORB_9, AOIS_33, AOIS_34, AOIS_19, LOI_6, LOI_9, ROR_21, ROR_24, ROR_28
– ROR_14, ROR_10
– AORB_19
– AORB_3
– AOIS_46, AOIS_8
Minimal Mutant Operators
Operator  #Minimal  #Total  %Minimal
AOIS                        %
AOIU      0         7       0%
AORB                        %
COI       0         7       0%
COR       0         4       0%
LOI                         %
ROR                         %