Kyle Mundt February 3, 2010
Richard Lipton, 1971
A way of testing your tests
Alter your code in various ways
Check to see if tests fail on altered code
Becoming popular again due to increased computing power
Strongest use seems to be with Fortran and C (procedural languages)

Mutant: code resulting from applying a mutation operator
Mutation (or Fault) Operator: rule used to create a mutant
Killing a Mutant: when at least one test fails on a mutant
Mutation-Adequate: tests kill all non-equivalent mutants
Equivalent mutant: produces same output as original (can’t be killed)
Mutation score: killed / (total – equivalent)

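The mutation score defined above can be computed directly. This is a minimal sketch: the function name and the sample counts (50 mutants, 5 equivalent, 42 killed) are made up for illustration.

```python
def mutation_score(killed: int, total: int, equivalent: int) -> float:
    """Mutation score = killed / (total - equivalent)."""
    killable = total - equivalent  # equivalent mutants can never be killed
    if killable <= 0:
        raise ValueError("no killable mutants")
    return killed / killable


# Hypothetical counts, chosen only to exercise the formula.
score = mutation_score(killed=42, total=50, equivalent=5)
print(f"{score:.3f}")  # 42 / 45
```

A test set is mutation-adequate when this value reaches 1.0.
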
Original
  if (a == 10) b = 3; else b = 5;
Mutant
  if (a != 10) b = 3; else b = 5;

Apply mutation operators to code (only one change in each mutant)
Run mutants through test cases
Inspect mutants that the tests do not kill
  ◦ Some may be equivalent mutants
Add or change test cases to kill the surviving mutants
Repeat

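The process above can be sketched as a small driver. Everything here is illustrative: the text-substitution operators, the `classify` function, and the two test sets are assumptions, and real tools mutate a parsed form of the program rather than raw text.

```python
SOURCE = "def classify(a):\n    if a == 10:\n        return 3\n    return 5\n"

# Hypothetical operators: each one makes a single textual change.
OPERATORS = [("==", "!="), ("10", "11"), ("return 3", "return 5")]


def make_mutants(source):
    """Yield (label, mutated source), one change per mutant."""
    for old, new in OPERATORS:
        if old in source:
            yield f"{old} -> {new}", source.replace(old, new, 1)


def load(source):
    """Compile source text and return its classify function."""
    namespace = {}
    exec(source, namespace)
    return namespace["classify"]


def run_tests(func, tests):
    return all(func(arg) == expected for arg, expected in tests)


def survivors(source, tests):
    """Labels of mutants that the tests fail to kill."""
    assert run_tests(load(source), tests), "original must pass first"
    return [label for label, mutated in make_mutants(source)
            if run_tests(load(mutated), tests)]


weak_tests = [(0, 5)]
strong_tests = [(0, 5), (10, 3)]
print(survivors(SOURCE, weak_tests))    # mutants to inspect
print(survivors(SOURCE, strong_tests))  # after improving the tests
```

Each surviving label is either an equivalent mutant or a hint that a test is missing. Here adding the `(10, 3)` case kills the two survivors.
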
Don’t begin too early
Code should already be written
Tests should be reasonably thorough
Begin with good set of test cases and code that passes tests
Iterative process
  ◦ Create mutants, run tests, fix, repeat

Traditional Mutation Operators
  ◦ Statement deletion
  ◦ Invert logical operators
  ◦ Replace arithmetic operators
  ◦ Replace variable with another in same scope
Class-Level Mutation Operators
  ◦ Object oriented
  ◦ Change containers
  ◦ Concurrency

If test results are the same on the original and the mutant
  ◦ There is “dead code”
  ◦ Mutant is equivalent to original
  ◦ Test cases not complete

Code coverage not good enough
Functional testing not enough
Need way to analyze test effectiveness
Useful for hardware verification
Based on coupling effect: “test data set that detects all simple faults in a program is so sensitive that it also detects more complex faults”
Ties into automatic test generation

Hitting every line doesn’t guarantee good tests
Doesn’t ensure defects are detected if they occur
Not a good measure of verification effectiveness

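A small illustration of the coverage point, with made-up names: the first test executes every line of the function yet cannot tell the original from a mutant, while the second checks the computed value and kills it.

```python
def discount(price, is_member):
    rate = 0.1 if is_member else 0.0
    return price * (1 - rate)


def discount_mutant(price, is_member):
    rate = 0.1 if is_member else 0.0
    return price * (1 + rate)  # arithmetic operator replaced: - became +


def smoke_test(f):
    # Runs every line of f, but only checks the type of the result.
    return isinstance(f(100.0, True), float)


def value_test(f):
    # Checks the computed value (with a float tolerance).
    return abs(f(100.0, True) - 90.0) < 1e-9


print(smoke_test(discount), smoke_test(discount_mutant))  # mutant survives
print(value_test(discount), value_test(discount_mutant))  # mutant is killed
```
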
Testing functionality of program
Deals with how program should work
Mostly ignores how it shouldn’t
Functional tests are subjective
Also bad measure of verification quality

Examines and improves the tests themselves
Convinces us tests will detect defects if they occur
Similar to techniques done manually on hardware
Need way to automate it to be practical

Verify testbed
Gives data to be used in improving chip testing
Hardware defects can cause unexpected behavior
Must ensure this behavior is caught
Attempt to show defects found by tests

Adequate automation
Huge (maybe infinite) possible number of mutants
Fault classification and prioritization
Which operators to use
How to apply to OOP, concurrency, etc.

Mothra – 22 operators
6 operators account for 40-60% of mutants
Redundant mutants (can be killed by same test)
Research shows 5 operators give good results
  ◦ Gives 99% mutation score and reduces number of mutants by 77%

ABS – make value of each arithmetic expression be 0, positive, and negative
AOR – replace each arithmetic operator with other valid operators
LCR – replace each logical operator
ROR – replace relational operators
UOI – insert unary operators in front of expressions
Note: these were used on Fortran, a procedural language

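Two of these operators, ROR and AOR, can be sketched over a syntax tree. This is an illustrative Python version using the standard `ast` module (Python 3.9+ for `ast.unparse`), not Mothra's Fortran implementation. The replacement tables are simplified assumptions: each operator gets a single substitute, whereas the real operators try every valid alternative.

```python
import ast

# Simplified, hypothetical replacement tables (one substitute per operator).
ROR = {ast.Eq: ast.NotEq, ast.Lt: ast.GtE, ast.Gt: ast.LtE}
AOR = {ast.Add: ast.Sub, ast.Mult: ast.Div}


class SingleSiteMutator(ast.NodeTransformer):
    """Replace the operator at exactly one eligible site."""

    def __init__(self, target):
        self.target = target  # index of the site to mutate
        self.seen = 0         # eligible sites visited so far

    def _hit(self):
        hit = self.seen == self.target
        self.seen += 1
        return hit

    def visit_Compare(self, node):
        self.generic_visit(node)
        if type(node.ops[0]) in ROR and self._hit():
            node.ops[0] = ROR[type(node.ops[0])]()
        return node

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if type(node.op) in AOR and self._hit():
            node.op = AOR[type(node.op)]()
        return node


def mutants(source):
    """Yield one mutated source string per eligible site."""
    index = 0
    while True:
        mutator = SingleSiteMutator(index)
        mutated = mutator.visit(ast.parse(source))
        if mutator.seen <= index:  # no site with this index exists
            return
        yield ast.unparse(mutated)
        index += 1


SOURCE = "def f(a, b):\n    if a < b:\n        return a + b\n    return a * b\n"
generated = list(mutants(SOURCE))
for text in generated:
    print(text, end="\n\n")
```

Re-parsing the source for every mutant keeps each mutant to a single change, matching the one-change-per-mutant rule.
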
Running whole program takes time
Don’t really care about what happens after mutated line is hit
So, don’t execute whole mutant
Compare states of original and mutant after mutated line executes
Regular (strong) mutation requires three things
  ◦ Mutated statement is reached
  ◦ Mutant creates incorrect state in program
  ◦ Incorrect state is reflected as difference in test output
Weak mutation only requires the first two
Related to automatic test generation

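The difference between the two criteria can be shown with a toy example. The functions below are assumptions made for illustration: each returns both the program state right after the mutated line and the final output, so the two can be compared separately.

```python
def run_original(x):
    y = x + 1          # the line that gets mutated
    return y, y > 0    # (state after the line, final output)


def run_mutant(x):
    y = x - 1          # mutant: + replaced by -
    return y, y > 0


def weakly_killed(x):
    """Weak mutation: the state differs right after the mutated line."""
    return run_original(x)[0] != run_mutant(x)[0]


def strongly_killed(x):
    """Regular mutation: the difference reaches the output."""
    return run_original(x)[1] != run_mutant(x)[1]


print(weakly_killed(5), strongly_killed(5))  # state differs, output does not
print(weakly_killed(0), strongly_killed(0))  # both differ
```

With input 5 the incorrect state (4 instead of 6) is masked by the later comparison, so only the weak criterion reports a kill.
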
Put all mutants into “metaprogram”
One large program to compile and link
Mothra turns program into intermediate form
Modifies intermediate form to create mutants
Interprets intermediate forms (interpreting is slow)
Schema-based is much faster than interpretive systems

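A rough sketch of the metaprogram idea, using hypothetical names: every mutant of one relational operator lives in the same program and is selected at run time by an id, so the code is built once and no per-mutant compile or interpretation step is needed.

```python
import operator

# Mutant 0 is the original; the other ids select mutated operators.
REL_VARIANTS = {0: operator.eq, 1: operator.ne, 2: operator.lt, 3: operator.gt}


def metaprogram(a, mutant_id=0):
    """One program containing the original and all of its mutants."""
    rel = REL_VARIANTS[mutant_id]
    return 3 if rel(a, 10) else 5


def killed_mutants(tests):
    """Ids of mutants whose output differs from an expected value."""
    killed = set()
    for mutant_id in REL_VARIANTS:
        if mutant_id == 0:
            continue
        if any(metaprogram(a, mutant_id) != expected for a, expected in tests):
            killed.add(mutant_id)
    return killed


print(killed_mutants([(11, 5)]))           # mutant 2 survives
print(killed_mutants([(10, 3), (0, 5)]))   # all three are killed
```
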
Can reduce number of test cases executed
  ◦ Once a mutant is killed, remove it from further testing
  ◦ No reason to kill it twice
Random selection of mutants appropriate in some cases

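The saving can be seen by counting test executions. In this sketch (the mutants and tests are made up), the inner loop stops as soon as a test fails on a mutant.

```python
def mutant_ne(a):
    return 3 if a != 10 else 5


def mutant_const(a):
    return 3 if a == 11 else 5


def mutant_ge(a):
    return 3 if a >= 10 else 5


def run_with_early_exit(mutants, tests):
    """Return (surviving mutants, number of test executions)."""
    executions = 0
    alive = []
    for mutant in mutants:
        for arg, expected in tests:
            executions += 1
            if mutant(arg) != expected:
                break  # killed: skip the remaining tests for this mutant
        else:
            alive.append(mutant)
    return alive, executions


tests = [(10, 3), (0, 5), (11, 5)]
alive, executions = run_with_early_exit([mutant_ne, mutant_const, mutant_ge], tests)
print(len(alive), executions)  # running every test on every mutant would take 9
```
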
Easy to come up with operators for simple types
User types much harder
Operators have been developed for things like
  ◦ Inheritance
  ◦ Polymorphism
  ◦ Method visibility
Harder to mutate based on meaning of object
Some research applied to Java OO mutation

Methods proposed to alter at code level
  ◦ Allows changes to things like attributes
Mutants made this way may not compile
Could cause integration errors
Still doesn’t look at meanings
Can write operators specifically for classes
Probably not practical in almost all situations
How to actually change object state?

Mutate commonly used Java libraries
Again operators picked to try and simulate common mistakes
One solution found was to mutate Containers, Iterators, and InputStream
Java reflection can also be used to examine object fields at runtime

Collection Interface (most apply to List or Vector as well)
  ◦ Make collection empty
  ◦ Remove some element (first, last, random)
  ◦ Reorder elements
  ◦ Mutate element type
Iterators
  ◦ Skip element
InputStream
  ◦ Skip bytes of data

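The operators listed above target Java collections. As a loose analogue only, the sketch below applies the same ideas to Python lists and iterators. All function names are assumptions, and it does not reproduce the Java tooling.

```python
import random


def make_empty(items):
    return []


def remove_first(items):
    return items[1:]


def remove_last(items):
    return items[:-1]


def reorder(items, seed=0):
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so runs are repeatable
    return shuffled


def skip_element(iterable, index):
    """Iterator mutation: silently skip the element at index."""
    for i, item in enumerate(iterable):
        if i != index:
            yield item


data = [1, 2, 3]
for mutate in (make_empty, remove_first, remove_last, reorder):
    mutated = mutate(data)
    # A test that checks sum() detects lost elements but not reordering.
    print(mutate.__name__, mutated, sum(mutated) != sum(data))
print(list(skip_element(data, 1)))
```

For code that only sums the elements, the reorder mutant behaves like an equivalent mutant, since the result does not depend on order.
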
First described for C programs
Designed for integration testing: mutates the interface between modules
Designed to scale to larger systems
Actions taken to control number of mutants
  ◦ Only look at integration errors
  ◦ Tests only connections between pairs of subsystems
  ◦ Mutates only module interfaces (e.g., function calls, return values)

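To make the interface idea concrete, here is a small sketch in which only the call between two modules is mutated. The two operators shown (swapping arguments, negating the return value) are hypothetical examples chosen for illustration, not the operator set from the original C work.

```python
def divide(numerator, denominator):
    """Callee: stands in for a function exported by another module."""
    return numerator / denominator


def average(total, count):
    """Caller: the unmutated call across the interface."""
    return divide(total, count)


def average_swapped_args(total, count):
    return divide(count, total)   # interface mutant: arguments swapped


def average_negated_return(total, count):
    return -divide(total, count)  # interface mutant: return value altered


for f in (average, average_swapped_args, average_negated_return):
    # Input (4, 4) cannot expose the swapped arguments; (6, 3) can.
    print(f.__name__, f(4, 4), f(6, 3))
```

The bodies of the caller and callee are left alone, which is what keeps the number of mutants manageable on larger systems.
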
MuClipse
  ◦ Open source mutation testing plug-in for Eclipse
µJava (muJava)
Heckle (Ruby)
Insure++ (C++)
Nester (C#)

C# programs for Visual Studio 2005
Only supports NUnit Framework
Highlights code
  ◦ Killed mutations in green
  ◦ Surviving mutations in red
  ◦ Code not covered by tests in blue
Allows use of XML-based grammar to define own transformation rules

Different approach
Used to find bugs in source code
Creates functionally equivalent mutants
  ◦ Lines changed without changing expected results
Tests should all still pass
If a test fails, something wrong with code
Insure++ reports faults and lines responsible

More complicated programs mean a greater need for quality tests
Need way to ensure/measure test effectiveness
Mutation-based testing can fill this role
Still needs further research
Not a lot of implementations or examples of use on large scale
Getting attention in hardware verification
More work needs to be done to apply to OO

[1] Offutt, J. A. & Untch, R. H. (October 2000). Mutation 2000: Uniting the Orthogonal. Retrieved January 18, 2011 from
[2] Bakewell, G. (2010). Mutation-Based Testing Technologies Close the “Quality Gap” in Functional Verification for Complex Chip Designs. Retrieved January 18, 2011, from Quality-Gap-in-Functional-Verification-for-Complex-Chip-Designs/4.aspx
[3] Offutt, J. A. (June 1995). A Practical System for Mutation Testing: Help for the Common Programmer. Retrieved January 18, 2011 from
[4] Usaola, M. & Mateo, P. (2010). Mutation Testing Cost Reduction Techniques: A Survey. IEEE Software, 27(3), Retrieved January 23, 2011, from ABI/INFORM Global. (Document ID: ).
[5] Kolawa, A. (1999). Mutation Testing: A New Approach to Automatic Error-Detection. Retrieved January 18, 2011, from
[6] Alexander, R. T.; Bieman, J. M.; Ghosh, S.; & Ji, B. (2002). Mutation of Java Objects. Retrieved January 18, 2011 from
[7] Nester - Free Software that Helps to do Effective Unit Testing in C#.