Improving Logic-Based Testing
Jeff Offutt, Professor, Software Engineering, George Mason University, Fairfax, VA, USA


Improving Logic-Based Testing
Jeff Offutt, Professor, Software Engineering, George Mason University, Fairfax, VA, USA
Joint research with Gary Kaminski and Paul Ammann
Improving Logic-Based Testing, invited to Journal of Systems and Software

Outline
1. Motivation
2. Logic-Based Testing
3. Making Mutation Cheaper
4. Strengthening MCDC
5. Making MCDC Obsolete
6. Summary
Better Logic-Testing © Kaminski, Ammann, Offutt

Software is a Skin that Surrounds Our Civilization
Quote due to Dr. Mark Harman
Linköping, January 2011 © Jeff Offutt

Costly Software Failures
“The Economic Impacts of Inadequate Infrastructure for Software Testing”
  – Inadequate software testing costs the US alone between $22 and $59 billion annually
  – Better testing could cut this amount in half
2006: Amazon’s BOGO offer became a double discount
2007: Symantec says that most security vulnerabilities are now due to faulty software
  – And more than half are in web applications
Huge losses due to web application failures
  – Financial services: $6.5 million per hour (just in USA!)
  – Credit card sales applications: $2.4 million per hour (in USA)
World-wide monetary loss is staggering

Cost of Late Testing
[Figure: fault origin (%), fault detection (%), and unit cost (X) for each phase: Requirements, Design, Prog / Unit Test, Integration Test, System Test, Production]
Source: Software Engineering Institute, Carnegie Mellon University, Handbook CMU/SEI-96-HB-002

2. Logic-Based Testing

Covering Logic Expressions
Logic expressions show up in many situations
They are essential to defining software behavior
Covering logic expressions is required by the US Federal Aviation Administration for safety critical software
Logical expressions can come from many sources
  – Decisions in programs
  – UML: FSMs and statecharts, activity diagrams
  – Requirements
  – SQL queries
Test designs are subsets of expressions’ truth assignments

Logic Predicates and Clauses
A predicate is an expression that evaluates to a boolean value
Predicates can contain
  – boolean variables
  – non-boolean variables compared with >, <, ==, >=, <=, !=
  – boolean function calls
Internal structure is created by logical operators
  – ¬ : the negation operator
  – ∧ : the and operator
  – ∨ : the or operator
  – → : the implication operator
  – ⊕ : the exclusive or operator
  – ↔ : the equivalence operator
A clause is a predicate with no logical operators
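A minimal Python sketch of these definitions. The predicate, the variable names, and the helper `truth_table` are illustrative assumptions, not taken from the talk. Each clause reduces to a boolean, and the truth table lists every assignment from which a test design picks a subset:

```python
from itertools import product

def predicate(a, b, c):
    """Hypothetical predicate p = a or (b and c) over three clauses."""
    return a or (b and c)

def truth_table(pred, n_clauses):
    """Return every truth assignment of the clauses with the predicate's value."""
    return [(values, pred(*values))
            for values in product([True, False], repeat=n_clauses)]

# Clauses may be relational expressions, boolean variables, or boolean calls.
x, y, done = 3, 5, True                      # illustrative program state
clauses = (x > y, done, str(x).isdigit())    # three clauses, no logical operators
print(predicate(*clauses))                   # value of p in this state

for values, result in truth_table(predicate, 3):
    print(values, "->", result)
```

With three clauses the table has 2^3 = 8 rows; a logic coverage criterion selects some of them as tests.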

Power of Logic Testing
Logic expressions encode the behavior of software
Logic expressions define the domain of values for which the software behaves in a certain way
Logic expressions are often
  – Complicated
  – Subtle
  – Easy to get wrong, both in design and implementation
Testing logic predicates is a cost-effective way to find many subtle software faults

Problems Addressed
This (mostly) theoretical talk presents results on three problems with logic predicate testing:
1. Redundant mutation operators for predicate testing
2. Weakness of major logic testing criterion: MCDC
3. A stronger logic test criterion, minimal-MUMCUT
Solutions based on theoretical analysis
Solutions can be immediately used to create better tools and stronger criteria, with very slight additional cost

3. Making Mutation Cheaper

Mutation Testing
Mutation helps testers design tests directly to find common mistakes
1. Modify the software in small, syntactic ways (mutants)
   Replace a variable, replace an operator, delete statements, …
2. Design or find a test to cause each mutant to result in incorrect behavior (killing mutants)
The resulting tests are very strong: they will detect most mistakes in the software
Fundamental Premise: If the software contains a fault, there will usually be mutants that can only be killed by a test that also detects that fault
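The two steps can be illustrated with a small sketch. The function `is_adult`, its mutant, and the helper `kills` are hypothetical names chosen for illustration; the mutant differs from the original by a single relational operator:

```python
def is_adult(age):
    """Original (hypothetical) program under test."""
    return age >= 18

def is_adult_mutant(age):
    """Mutant: the relational operator >= is replaced by >."""
    return age > 18

def kills(test_input):
    """A test kills the mutant when original and mutant outputs differ."""
    return is_adult(test_input) != is_adult_mutant(test_input)

print(kills(30))  # False: both versions return True
print(kills(18))  # True: only the boundary value exposes the change
```

A test set that never uses the boundary value leaves this mutant alive, which is the kind of gap mutation analysis is meant to reveal.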

Redundancy in Mutation
Mutation is widely considered to be “expensive”
This expense is largely based on the high number of test requirements (mutants)
But Li et al. found that mutation needed fewer tests than the other criteria they compared!
Li, Praphamontripong, Offutt, An experimental comparison of four unit test criteria, Mutation 2009

Eliminating Redundancy
This is strong evidence that mutation tools use many redundant operators
A more clever mutation system should have less redundancy
Fewer mutants means less work for the tester … cheaper!

Mutation Predicate Testing
Traditional ROR operator:
Each occurrence of a relational operator (<, <=, >, >=, ==, !=) is replaced by each other operator, and the expression is replaced by True and False.
Example:
  – Original predicate: a > b
  – Mutant 1: a < b
  – Mutant 2: a <= b
  – Mutant 3: a >= b
  – Mutant 4: a == b
  – Mutant 5: a != b
  – Mutant 6: true
  – Mutant 7: false
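A minimal sketch of the traditional ROR operator as a source-level rewrite. The function name `traditional_ror_mutants` and the string representation of clauses are assumptions made for illustration, not the interface of any particular mutation tool:

```python
RELATIONAL_OPERATORS = ["<", "<=", ">", ">=", "==", "!="]

def traditional_ror_mutants(left, op, right):
    """Return the seven traditional ROR mutants of the clause `left op right`."""
    if op not in RELATIONAL_OPERATORS:
        raise ValueError(f"not a relational operator: {op}")
    # Five mutants: every other relational operator in place of `op`.
    mutants = [f"{left} {other} {right}"
               for other in RELATIONAL_OPERATORS if other != op]
    # Two more mutants: the whole expression replaced by a constant.
    return mutants + ["true", "false"]

for number, mutant in enumerate(traditional_ror_mutants("a", ">", "b"), start=1):
    print(f"Mutant {number}: {mutant}")
```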

Mutation Predicate Testing
A fault hierarchy establishes theoretical dominance relations among faults:
[Figure: Lau and Yu’s logic fault hierarchy over the fault classes TNF, LNF, LRF, LIF, TOF, ORF+, ENF, ORF., LOF, linked by a “detects” relation]
If fault A dominates fault B, then any test that detects fault A will by definition detect fault B

ROR Mutant Hierarchy
If mutant A dominates mutant B, then any test that detects mutant A will by definition detect mutant B
[Figure: two mutant hierarchies]
Mutants for a < b: a <= b, false, a != b, true, a >= b, a == b, a > b
Mutants for a >= b: a > b, true, a == b, false, a < b, a != b, a <= b

A Cheaper ROR Operator
Each occurrence of a relational operator (<, <=, >, >=, ==, !=) is replaced by operators as follows:
  <  : <=, !=, False
  >  : >=, !=, False
  <= : <, ==, True
  >= : >, ==, True
  == : <=, >=, False
  != : <, >, True
Saves four mutants for each relational operator
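The claim that three mutants per operator are enough can be checked by brute force. The sketch below is an illustrative check, not the proof from the paper: it assumes totally ordered operands, so the three operand pairs in `POINTS` (one each for a < b, a == b, a > b) cover every case. The names `OPS`, `CHEAPER_ROR`, `kill_set`, and `retained_mutants_suffice` are hypothetical:

```python
import operator

OPS = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
    "True": lambda a, b: True, "False": lambda a, b: False,
}

# Retained mutants per original operator (the cheaper ROR table).
CHEAPER_ROR = {
    "<": ["<=", "!=", "False"], ">": [">=", "!=", "False"],
    "<=": ["<", "==", "True"], ">=": [">", "==", "True"],
    "==": ["<=", ">=", "False"], "!=": ["<", ">", "True"],
}

POINTS = [(1, 2), (1, 1), (2, 1)]  # a < b, a == b, a > b

def kill_set(original, mutant):
    """Operand pairs on which the mutant's value differs from the original's."""
    return frozenset(p for p in POINTS if OPS[original](*p) != OPS[mutant](*p))

def retained_mutants_suffice(original):
    """True if killing the three retained mutants kills all seven ROR mutants."""
    retained = [kill_set(original, m) for m in CHEAPER_ROR[original]]
    all_mutants = [m for m in OPS if m != original]
    return all(
        any(len(r) > 0 and r <= kill_set(original, m) for r in retained)
        for m in all_mutants
    )

for op in CHEAPER_ROR:
    print(op, retained_mutants_suffice(op))
```

For every mutant, some retained mutant has a kill set contained in that mutant's kill set, so any test that kills the retained mutant also kills the other one.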

4. Strengthening MCDC

MCDC
Modified Condition/Decision Coverage
  – Required by the USA Federal Aviation Administration to test safety critical software
Each clause (condition) in a predicate (decision) is required to be tested with true and false when the clause “matters”
  – Changing the value of the clause changes the predicate’s value
Example: p = a ∨ (b ∧ c)
  – Test 1 for a: a=true, (b ∧ c)=false
  – Test 2 for a: a=false, (b ∧ c)=false
MCDC is considered to thoroughly probe predicates
Most useful when predicate has more than 4 clauses
  – Otherwise, we can test all 2^N truth assignments
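A minimal sketch of how such test pairs can be found by enumeration. It assumes the example predicate is p = a ∨ (b ∧ c), the reading under which clause a matters while the rest of the predicate is false, and it holds the other clauses fixed within a pair, which is only one of several MCDC interpretations. The helper `mcdc_pairs` is a hypothetical name:

```python
from itertools import product

def p(a, b, c):
    """Assumed example predicate: a or (b and c)."""
    return a or (b and c)

def mcdc_pairs(predicate, n_clauses):
    """For each clause index, list the pairs of assignments that differ only
    in that clause and give different predicate values."""
    pairs = {i: [] for i in range(n_clauses)}
    for values in product([True, False], repeat=n_clauses):
        for i in range(n_clauses):
            if values[i]:  # visit each pair once, from its true side
                flipped = values[:i] + (False,) + values[i + 1:]
                if predicate(*values) != predicate(*flipped):
                    pairs[i].append((values, flipped))
    return pairs

for clause, found in mcdc_pairs(p, 3).items():
    print("clause", "abc"[clause], found)
```

Clause a has three usable pairs (any assignment where b ∧ c is false), while b and c each have exactly one, so a small MCDC test set has to be built around those two forced pairs.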

Weakness of MCDC
MCDC was invented in the early 1990s
The research community has invented additional logic criteria since
  – MCDC is weaker than ROR-mutation
MCDC works at the clause level
ROR works at the relational operator level
Solution: Extend MCDC to the relational operator level

Stronger MCDC
MCDC can be extended to include requirements to kill ROR mutants
Method:
  – MCDC requires clause c = x op y to have two values, True and False
  – Cheaper-ROR requires c to have three values: x < y, x == y, x > y
  – The two MCDC values will always satisfy at least two of the cheaper-ROR requirements
  – Add one additional test to cover the third
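A small sketch of the last two steps, assuming numeric operands. The helpers `relation` and `missing_relations` are hypothetical names; the sketch only identifies which of the three operand relations the MCDC tests left uncovered, and does not construct the full additional test:

```python
def relation(x, y):
    """Classify an operand pair as one of the three cheaper-ROR requirements."""
    if x < y:
        return "<"
    if x == y:
        return "=="
    return ">"

def missing_relations(mcdc_operand_pairs):
    """Relations between x and y that the existing MCDC tests do not exercise."""
    covered = {relation(x, y) for x, y in mcdc_operand_pairs}
    return {"<", "==", ">"} - covered

# Clause c = (x < y): MCDC already supplies one true and one false value.
mcdc_values = [(1, 2), (3, 2)]          # x < y is true, then false
print(missing_relations(mcdc_values))   # the relation the extra test must use
```

Because a true value and a false value of the same relational clause can never come from the same relation between x and y, at most one relation is ever missing.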

Cost is Minor
MCDC on a predicate with N clauses requires N+1 to 2N tests
MCDC + ROR requires N more (2N+1 to 3N tests)
Algorithm and proof in paper
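The bounds follow from simple arithmetic, sketched here under the assumption that every one of the N clauses is relational and needs exactly one additional test. The symbols T_MCDC and T_MCDC+ROR are notation introduced only for this illustration:

```latex
\begin{gather*}
N + 1 \;\le\; |T_{\mathrm{MCDC}}| \;\le\; 2N \\
|T_{\mathrm{MCDC+ROR}}| \;=\; |T_{\mathrm{MCDC}}| + N
\quad\Longrightarrow\quad
2N + 1 \;\le\; |T_{\mathrm{MCDC+ROR}}| \;\le\; 3N \\
N = 3:\qquad 4 \le |T_{\mathrm{MCDC}}| \le 6,
\qquad 7 \le |T_{\mathrm{MCDC+ROR}}| \le 9
\end{gather*}
```

Clauses that are plain boolean variables have no relational operator, so a predicate that mixes both kinds would need fewer than N additional tests.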

Example
p = a ∧ b ∨ c
a = (a1 < a2), b = (b1 <= b2), c = (c1 == c2)
The following test set satisfies MCDC:
T = { t1, t2, t3, t4 } = { ttf, tft, tff, ftf }
Which can be refined with the following value assignments:
[Table: one row per test (t1 = TTF, t2 = TFT, t3 = TFF, t4 = FTF) with columns a1, a2, b1, b2, c1, c2 and the relation each operand pair realizes for a, b, c, followed by three additional ROR tests, New(t1), New(t1), and New(t2); the cell values did not survive extraction]
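The MCDC claim for this test set can be checked mechanically. The sketch assumes the predicate is p = (a ∧ b) ∨ c, which is the only combination of ∧ and ∨ over a, b, c for which the listed tests satisfy MCDC; the checker `satisfies_mcdc` is a hypothetical helper that requires, for each clause, two tests differing only in that clause with different predicate values:

```python
def p(a, b, c):
    """Assumed predicate for the example: (a and b) or c."""
    return (a and b) or c

def satisfies_mcdc(predicate, tests):
    """True if every clause has a pair of tests that differ only in that
    clause and produce different predicate values."""
    n = len(tests[0])
    for i in range(n):
        found = any(
            t[i] != u[i]
            and all(t[j] == u[j] for j in range(n) if j != i)
            and predicate(*t) != predicate(*u)
            for t in tests for u in tests
        )
        if not found:
            return False
    return True

T, F = True, False
tests = [(T, T, F), (T, F, T), (T, F, F), (F, T, F)]   # t1..t4 = ttf, tft, tff, ftf
print(satisfies_mcdc(p, tests))
print(satisfies_mcdc(p, [t for t in tests if t != (T, F, F)]))
```

The first check succeeds and the second fails: t3 (tff) is the partner for both b (with t1) and c (with t2), so dropping it breaks coverage of two clauses at once.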

5. Making MCDC Obsolete

Faults in Logic Expressions
Types of common and possible faults in logic expressions have been categorized
  – LIF: Literal Insertion Fault. An extraneous literal, or clause, is in the predicate
  – LRF: Literal Replacement Fault. The wrong literal, or clause, is used in the predicate
  – LOF: Literal Omission Fault. A literal, or clause, that should have been in the predicate was omitted
  – TNF: Term Negation Fault. A term, or collection of clauses, that should have been negated was not (or vice versa)
Lau and Yu defined nine types of logic faults and explored their relationships
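The four fault types can be made concrete on a small expression. The predicate ab ∨ ¬a c and the particular faulty variants below are hypothetical examples chosen for illustration; `detecting_tests` lists the truth assignments on which a faulty version disagrees with the intended one:

```python
from itertools import product

def original(a, b, c):
    """Intended (hypothetical) predicate: (a and b) or (not a and c)."""
    return (a and b) or ((not a) and c)

FAULTY = {
    # LIF: extraneous literal c inserted into the first term
    "LIF": lambda a, b, c: (a and b and c) or ((not a) and c),
    # LRF: literal b in the first term replaced by c
    "LRF": lambda a, b, c: (a and c) or ((not a) and c),
    # LOF: literal b omitted from the first term
    "LOF": lambda a, b, c: a or ((not a) and c),
    # TNF: the first term is negated
    "TNF": lambda a, b, c: (not (a and b)) or ((not a) and c),
}

def detecting_tests(faulty):
    """Truth assignments on which the faulty predicate differs from the original."""
    return [values for values in product([True, False], repeat=3)
            if original(*values) != faulty(*values)]

for name, faulty in FAULTY.items():
    print(name, detecting_tests(faulty))
```

In this example the term negation fault is exposed by six of the eight assignments while the literal insertion fault is exposed by only one, which gives some intuition for why a criterion guaranteed to catch only the easiest fault classes can miss the others.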

Lau and Yu’s Fault Hierarchy
A fault hierarchy establishes theoretical dominance relations among faults:
[Figure: the fault hierarchy over TNF, LNF, LRF, LIF, TOF, ORF+, ENF, ORF., LOF, annotated with the fault classes that MCDC and minimal-MUMCUT detect]
If fault A dominates fault B, then any test that detects fault A will by definition detect fault B

Empirical Studies
The fault hierarchy result is theoretical
  – That is, MCDC is only guaranteed to detect TNF and ENF faults
  – But it could detect others by serendipity
Two studies on different sets of logic expressions
1. Software from 5 airplane “black boxes” (Line Replaceable Units)
2. Air traffic collision avoidance system (TCAS)

Empirical Results
Black boxes
  – 20,256 logic expressions from 5 different “black boxes”
  – Expressions originally studied by Chilenski
  – 132 expressions with at least 5 unique literals
  – 125 were simple, so that MCDC can find all faults, leaving 7
  – MCDC found 81% of the possible faults (140 of 173)
TCAS
  – 19 predicates (originally studied by Chen, Lau and Yu)
  – Larger predicates, several with 25 clauses
  – MCDC found only 35% of the possible faults (205 of 580)

Size Matters
[Table: faults found grouped by number of unique literals, with columns Unique Literals, Total Faults, Faults Found, Percent; the cell values did not survive extraction]

Minimal-MUMCUT vs. MCDC
Minimal-MUMCUT finds significantly more faults than MCDC
  – Especially in large, complicated logic expressions
  – Which is precisely when engineers are most likely to make mistakes!
  – Also very hard to debug when the software fails
Requires up to four times as many tests
Suggested approach
1. Use all combinations on predicates with fewer than 5 clauses
2. Use MCDC with 5 to 10 clauses
3. Use minimal-MUMCUT with more than 10 clauses
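The suggested approach amounts to a simple selection rule on predicate size. A minimal sketch, where the function name `suggested_criterion` and the returned labels are illustrative choices:

```python
def suggested_criterion(num_clauses):
    """Pick a logic test criterion from the number of clauses in a predicate."""
    if num_clauses < 1:
        raise ValueError("a predicate has at least one clause")
    if num_clauses < 5:
        return "all combinations"    # at most 2**4 = 16 truth assignments
    if num_clauses <= 10:
        return "MCDC"
    return "minimal-MUMCUT"

for n in (3, 5, 10, 25):
    print(n, suggested_criterion(n))
```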

6. Summary

Recommendations
1. Replace ROR with cheaper-ROR in mutation tools
  – No loss in strength
  – Saves 4 test requirements (mutants) for each relational operator
  – We are currently implementing in the muJava mutation tool
2. Extend logic criteria such as MCDC with ROR
  – Logic testing should apply to the relational operator level
  – Small increase in the number of tests
  – Large increase in the testing strength
3. Replace MCDC
  – Better: Replace MCDC with Minimal-MUMCUT + ROR
  – RTCA-DO-178B has been in effect for almost 20 years
  – MCDC was a brilliant idea … in 1992
  – MCDC was adopted and required without scientific validation
      Little theoretical analysis and no experimental studies
  – We now know that MCDC is poor at discovering crucial faults