CS 4723: Lecture 5 Test Coverage


Test Coverage
• After we have done some testing, how do we know the testing is enough?
• The most straightforward measure is input coverage: # of inputs tested / # of possible inputs
• Unfortunately, the # of possible inputs is typically infinite
• This is not feasible, so we need approximations…

Test Coverage
• Code Coverage
• Input Combination Coverage
• Specification Coverage
• Mutation Coverage

Code Coverage
• Basic idea:
– Bugs in code that has never been executed will not be exposed
– So a test suite that leaves code unexecuted is definitely not sufficient
• Definition:
– Divide the code into elements
– Calculate the proportion of elements that are executed by the test suite
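The definition above can be illustrated with a minimal sketch. It assumes hand-written instrumentation: the function `classify`, the helper `mark`, and the labels `s1` to `s5` are hypothetical names chosen for this example, and real tools instrument the code automatically.

```python
# Hypothetical hand-written instrumentation: each code element of interest
# reports its label when it is executed.
executed = set()

def mark(label):
    executed.add(label)

def classify(x):
    mark("s1")
    if x < 0:
        mark("s2")
        return "negative"
    mark("s3")
    if x == 0:
        mark("s4")
        return "zero"
    mark("s5")
    return "positive"

ALL_ELEMENTS = {"s1", "s2", "s3", "s4", "s5"}

def coverage(test_inputs):
    """Proportion of code elements executed by the given test inputs."""
    executed.clear()
    for x in test_inputs:
        classify(x)
    return len(executed & ALL_ELEMENTS) / len(ALL_ELEMENTS)

print(coverage([5]))         # only s1, s3, s5 are reached
print(coverage([-1, 0, 5]))  # every element is reached
```

A single positive input leaves the "negative" and "zero" elements unexecuted, so any bug in them cannot be exposed by that suite.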

Control Flow Graph
• How many test cases are needed to achieve full statement coverage?

Statement Coverage in Practice
• Microsoft reports 80-90% statement coverage
• Safety-critical software must achieve 100% statement coverage
• Usually about 85% coverage is achieved; 100% is usually very hard for large systems

Statement Coverage: Example

Branch Coverage
• Cover the branches in a program
• A branch is considered covered when both (all) of its outcomes are executed
• Also called decision coverage
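A small sketch of why branch coverage is stronger than statement coverage. The function `scaled` and the bookkeeping in `taken` are hypothetical; the `else` arm exists only to record the false outcome and is not part of the program under test.

```python
# Records which outcomes of the single decision "x > 0" have been taken.
taken = set()
ALL_OUTCOMES = {("x > 0", True), ("x > 0", False)}

def scaled(x):
    d = 0
    if x > 0:
        taken.add(("x > 0", True))
        d = x
    else:
        # Instrumentation only: the program under test has no else branch.
        taken.add(("x > 0", False))
    return 100 / d  # fails when the false outcome leaves d at 0

def branch_coverage(inputs):
    """Return (fraction of branch outcomes taken, inputs that failed)."""
    taken.clear()
    failures = []
    for x in inputs:
        try:
            scaled(x)
        except ZeroDivisionError:
            failures.append(x)
    return len(taken) / len(ALL_OUTCOMES), failures

print(branch_coverage([5]))     # every original statement runs, one outcome taken
print(branch_coverage([5, 0]))  # both outcomes taken, and the fault shows up
```

The input 5 alone executes every statement of the original program, yet only half of the branch outcomes; the fault appears once the false outcome is exercised.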

Control Flow Graph
• How many test cases are needed to achieve full branch coverage?

Branch Coverage: Example

Branch Coverage: Example
• An untested flow of data from an assignment to a use of the assigned value could hide an erroneous computation
• This holds even though we have 100% statement and branch coverage

Data Flow Coverage
• Cover all def-use pairs in the software
– Def: a write to a variable
– Use: a read of a variable
– A use u and a def d are paired when d is the direct precursor of u in some execution
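The pairing rule can be sketched over recorded execution traces. This is an illustration under assumptions: the trace format `(kind, variable, location)`, the locations `A`, `B`, `C`, and the idea that coverage is the fraction of known pairs exercised are all choices made for this example.

```python
def covered_du_pairs(trace):
    """Pair each use with the most recent def of the same variable."""
    last_def = {}
    pairs = set()
    for kind, var, loc in trace:
        if kind == "def":
            last_def[var] = loc
        elif kind == "use" and var in last_def:
            pairs.add((var, last_def[var], loc))
    return pairs

# Hypothetical program: z is written at A (then branch) or B (else branch)
# and read at C.
trace_then = [("def", "z", "A"), ("use", "z", "C")]
trace_else = [("def", "z", "B"), ("use", "z", "C")]
ALL_PAIRS = {("z", "A", "C"), ("z", "B", "C")}

def du_coverage(traces):
    covered = set()
    for trace in traces:
        covered |= covered_du_pairs(trace)
    return len(covered & ALL_PAIRS) / len(ALL_PAIRS)

print(du_coverage([trace_then]))
print(du_coverage([trace_then, trace_else]))
```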

Data Flow Coverage
• Formula: data flow coverage = # of def-use pairs covered / # of all def-use pairs
• It is not easy to locate all def-use pairs
– Easy intra-procedurally (inside a method)
– Very difficult inter-procedurally
– Consider a write to a field variable in one method and a read of it in another method

Path coverage
• The strongest code coverage criterion
• Try to cover all possible execution paths in a program
• Covers all previous coverage criteria?
• Usually not feasible
– Exponentially many paths in acyclic programs
– Infinitely many paths in some programs with loops

Path coverage
• N conditions: 2^N paths
• Many are not feasible, e.g., L1L2L3L4L6
• X = 0 => L1L2L3L4L5L6
• X = -1 => L1L3L4L6
• X = -2 => L1L3L4L5L6

Control Flow Graph
• How many paths?
• How many test cases to cover?

Path coverage, not enough
main() {
    int x, y, z, w;
    read(x);
    read(y);
    if (x != 0)
        z = x + 10;
    else
        z = 1;
    if (y > 0)
        w = y / z;
    else
        w = 0;
}
• Test requirements
– 4 paths
• Test cases
– (x = 1, y = 22)
– (x = 0, y = 10)
– (x = 1, y = -22)
– (x = 0, y = -10)
• We are still not exposing the fault!
• Faulty if x = -10: z becomes 0, so w = y / z divides by zero when y > 0
– Structural coverage cannot reveal this error
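A Python transcription of the program above makes the point executable. It is a sketch: the path labels and the helper `run_suite` are hypothetical, and the suite used here is one that takes each of the four paths once, namely (1, 22), (0, 10), (1, -22), (0, -10).

```python
def program(x, y):
    """Transcription of the slide's program; also reports the path taken."""
    if x != 0:
        z = x + 10
        first = "then"
    else:
        z = 1
        first = "else"
    if y > 0:
        w = y / z
        second = "then"
    else:
        w = 0
        second = "else"
    return w, (first, second)

def run_suite(tests):
    """Return (set of paths covered, inputs that raised ZeroDivisionError)."""
    paths, failures = set(), []
    for x, y in tests:
        try:
            _, path = program(x, y)
            paths.add(path)
        except ZeroDivisionError:
            failures.append((x, y))
    return paths, failures

suite = [(1, 22), (0, 10), (1, -22), (0, -10)]
paths, failures = run_suite(suite)
print(len(paths), failures)    # all four paths covered, no failure observed
print(run_suite([(-10, 5)]))   # the fault needs x = -10 together with y > 0
```

Full path coverage is reached without any failure, while the input x = -10, y = 5 follows an already covered path and still fails.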

Code Coverage
• Questions
– Are statement coverage and basic block coverage the same?
– Is branch coverage (covering all edges in a control flow graph) the same as basic block coverage?

Method coverage
• So far, all examples are intra-method
– Quite useful in unit testing
• It is very hard to achieve 100% statement coverage in system testing
– Need a higher-level code element
• Method coverage
– Similar to statements
– Node coverage: method coverage
– Edge coverage: method invocation coverage
– Path coverage: stack trace coverage

Method coverage

Code coverage: summary
• Coverage of code elements and their connections
– Node coverage: class/method/statement/predicate coverage
– Edge coverage: branch/data flow/method invocation coverage
– Path coverage: path/use-def chain/stack trace coverage

Code coverage: limitations
• Not enough
– Some bugs cannot be revealed even with full path coverage
– Cannot reveal bugs due to missing code

Code coverage: practice
• Though not perfect, code coverage is the most widely used technique for test evaluation
• Also used to measure the progress made in testing
• The criteria used in practice are mainly:
– Method coverage
– Statement coverage
– Branch coverage
– Loop coverage with a heuristic (0, 1, many iterations)

Code coverage: practice
• Far from perfect
– The commonly used criteria are the weakest; recall our examples
– A lot of corner cases can never be found (and they are not so rare; they are simply not found by statement coverage)
• 100% code coverage is rarely achieved
– Mature commercial software products are released with 85% to 90% statement coverage
– Some commercial software products are released with around 60% statement coverage
– Many open source projects are even lower than 50%

Input Combination Coverage
• Basic idea
– Originates from the most straightforward idea (input coverage)
– In theory, achieving 100% coverage proves 100% correctness
– In practice, it works only on very trivial cases
• Main problems
– Combinations are exponential
– Possible values are infinite

Input Combination Coverage
• An example on a simple automatic sales machine
– Accepts only a single $1 bill at a time, and all beverages cost $1
– Coke, Sprite, Juice, Water
– Icy or normal temperature
– Want receipt or not
• All combinations = 4*2*2 = 16 combinations
• Trying all 16 combinations will make sure the system works correctly

Input Combination Coverage: Sales Machine Example
Input 1    Input 2    Input 3
Coke       Normal     Receipt
Sprite     Icy        No-Receipt
Juice
Water

Combination Explosion
• Combinations are exponential in the number of inputs
• Consider an annual tax report system with 50 yes/no questions to generate a customized form for you
• 2^50 combinations = about 10^15 test cases
• Running 1000 test cases per second -> about 30,000 years
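The arithmetic behind this estimate can be checked directly. The sketch assumes a 365-day year and a constant rate of 1000 test cases per second; the variable names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

combinations = 2 ** 50          # 50 independent yes/no questions
tests_per_second = 1000

years_exact = combinations / tests_per_second / SECONDS_PER_YEAR
years_rounded = 10 ** 15 / tests_per_second / SECONDS_PER_YEAR

print(combinations)             # a bit more than 10**15
print(round(years_exact))       # using the exact count 2**50
print(round(years_rounded))     # using the rounded count 10**15
```

Either way the result is in the tens of thousands of years, which matches the rough figure quoted on the slide.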

Observation
• When there are many inputs, a relationship among inputs usually involves only a small number of them
• The previous example: maybe only Coke and Sprite can be icy, but the receipt is independent

Example of Tax Report
• Input 1: Family combined report or single report
• Input 2: Home loans or not
• Input 3: Receive gift or not
• Input 4: Age over 60 or not
• …
• Input 1 is related to all other inputs
• Other inputs are independent of each other

Studies
• A long-term study from NIST (National Institute of Standards and Technology)
• A combination width of 4 to 6 is enough for detecting almost all errors

N-wise coverage
• Coverage of the N-wise combinations of the possible values of all inputs
• Example: 2-wise combinations
– (coke, icy), (sprite, icy), (water, icy), (juice, icy)
– (coke, normal), (sprite, normal), …
– (coke, receipt), (sprite, receipt), …
– (coke, no-receipt), (sprite, no-receipt), …
– (icy, receipt), (normal, receipt)
– (icy, no-receipt), (normal, no-receipt)
• 20 combinations in total
• We had 16 3-wise combinations, now we have 20. Does it get worse?

N-wise coverage
• Note: one test case may cover multiple N-wise combinations
– E.g., (Coke, Icy, Receipt) covers 3 2-wise combinations: (Coke, Icy), (Coke, Receipt), (Icy, Receipt)
• 100% N-wise coverage will fully cover 100% (N-1)-wise coverage. Is this true?
• For k Boolean inputs
– Full combination coverage = 2^k combinations: exponential
– Full n-wise coverage = 2^n * k*(k-1)* … *(k-n+1)/n! combinations: polynomial
– For 2-wise combinations: 2*k*(k-1)
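The count for k Boolean inputs can be cross-checked by brute force. This is a sketch with hypothetical function names; it relies on k*(k-1)*…*(k-n+1)/n! being the binomial coefficient "k choose n".

```python
from itertools import combinations, product
from math import comb

def n_wise_count_formula(k, n):
    """2^n value assignments for each of the C(k, n) choices of n inputs."""
    return 2 ** n * comb(k, n)

def n_wise_count_enumerated(k, n):
    """Count the same combinations by listing them one by one."""
    count = 0
    for _chosen_inputs in combinations(range(k), n):
        for _values in product([False, True], repeat=n):
            count += 1
    return count

print(n_wise_count_formula(50, 2))    # 2 * 50 * 49 pairs for the tax example
print(n_wise_count_enumerated(6, 3))  # small case, enumerated explicitly
```

For the 50-question tax example this gives a few thousand 2-wise combinations to cover, compared with 2^50 full combinations.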

N-wise coverage: Example
• How many test cases for 100% 2-wise coverage of our sales machine example?
– (coke, icy, receipt) covers 3 new 2-wise combinations
– (sprite, icy, no-receipt) covers 3 new …
– (juice, icy, receipt) covers 2 new …
– (water, icy, receipt) covers 2 new …
– (coke, normal, no-receipt) covers 3 new …
– (sprite, normal, receipt) covers 3 new …
– (juice, normal, no-receipt) covers 2 new …
– (water, normal, no-receipt) covers 2 new …
• 8 test cases cover all 20 2-wise combinations
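The claim that these 8 test cases cover all 20 pairs can be verified mechanically. The sketch below tags each value with the index of its input so that pairs from different inputs cannot be confused; the helper names are hypothetical.

```python
from itertools import combinations, product

INPUTS = [
    ["coke", "sprite", "juice", "water"],
    ["icy", "normal"],
    ["receipt", "no-receipt"],
]

TESTS = [
    ("coke", "icy", "receipt"),
    ("sprite", "icy", "no-receipt"),
    ("juice", "icy", "receipt"),
    ("water", "icy", "receipt"),
    ("coke", "normal", "no-receipt"),
    ("sprite", "normal", "receipt"),
    ("juice", "normal", "no-receipt"),
    ("water", "normal", "no-receipt"),
]

def all_pairs(inputs):
    """Every 2-wise combination of values taken from two different inputs."""
    pairs = set()
    for (i, a), (j, b) in combinations(enumerate(inputs), 2):
        for va, vb in product(a, b):
            pairs.add(((i, va), (j, vb)))
    return pairs

def pairs_of(test):
    """The 2-wise combinations covered by one test case."""
    return set(combinations(enumerate(test), 2))

required = all_pairs(INPUTS)
covered = set()
for test in TESTS:
    new = pairs_of(test) - covered
    print(test, "covers", len(new), "new pairs")
    covered |= new

print(len(required), covered == required)
```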

Combination Coverage in Practice
• 2-wise combination coverage is very widely used
– Pair-wise testing
– All-pairs testing
• Mostly used in configuration testing
• Example: configuration of gcc
– A lot of variables
– Several options for each variable
• For command-line tools: add or remove an option

Input model
• What happens if an input has infinitely many possible values?
– Integer
– Float
– Character
– String
• Note: all of these are actually finite, but the set of possible values is so large that they are treated as infinite
• Idea: map infinite values to finite value baskets (ranges)

Input model
• Equivalence class partitioning
– Partition the possible value set of an input into several value ranges
– Transform numeric variables (integer, float, double, character) into enumerated variables
• Example:
– int exam_score => {less than 0}, {0, 59}, {60, 69}, {70, 79}, {80, 89}, {90, 100}, {more than 100}
– char c => {a, z}, {A, Z}, {0, 9}, {other}
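A minimal sketch of the exam_score partition. It assumes the intended classes are "below 0", five score bands, and "above 100"; the function name, the class labels, and the representative values are hypothetical.

```python
def score_class(score):
    """Map an integer exam score to its equivalence class."""
    if score < 0:
        return "below 0"
    if score <= 59:
        return "0-59"
    if score <= 69:
        return "60-69"
    if score <= 79:
        return "70-79"
    if score <= 89:
        return "80-89"
    if score <= 100:
        return "90-100"
    return "above 100"

# One representative test input per class turns an unbounded integer
# input into an enumerated input with seven values.
representatives = [-5, 30, 65, 75, 85, 95, 150]
for value in representatives:
    print(value, "->", score_class(value))
```

Boundary values such as -1, 0, 59, 60, 100, and 101 are natural extra picks, since neighbouring classes meet there.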

Input model
• Feature extraction
– For string and structure inputs
– Split the possible value set by a certain feature
• Example: String passwd => {contains space}, {no space}
• It is possible to extract multiple features from one input
– String name => {capitalized first letter}, {not}
– => {contains space}, {not}
– => {length >10}, {2-10}, {1}, {0}
• One test case may cover multiple features

Input model
• Feature extraction: structure input
• A word binary tree (data at all nodes are strings)
– Depth: integer -> partition {0, 1, 1+}
– Number of leaves: integer -> partition {0, 1, <10, 10+}
– Root: null / not
– A node with only left child / not
– A node with only right child / not
– Null value data on any node / not
– Root value: string -> further feature extraction
– Value on the leftmost leaf: string -> further feature extraction
– …

Input model
• Infeasible feature combinations?
• Example: String name
– => {capitalized first letter}, {not}
– => {contains space}, {not}
– => {length >10}, {2-10}, {1}, {0}
• Infeasible combinations:
– Length = 0 ^ contains space
– Length = 0 ^ capitalized first letter
– Length = 1 ^ contains space ^ capitalized first letter
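The infeasible combinations can be filtered out before test cases are chosen. This sketch encodes exactly the three rules listed above; the names `feasible`, `CAPITALIZED`, `HAS_SPACE`, and `LENGTH` are hypothetical.

```python
from itertools import product

CAPITALIZED = [True, False]
HAS_SPACE = [True, False]
LENGTH = [">10", "2-10", "1", "0"]

def feasible(capitalized, has_space, length):
    # An empty string has no first letter and no space.
    if length == "0" and (capitalized or has_space):
        return False
    # A one-character string cannot be both a capital letter and a space.
    if length == "1" and capitalized and has_space:
        return False
    return True

combos = list(product(CAPITALIZED, HAS_SPACE, LENGTH))
feasible_combos = [c for c in combos if feasible(*c)]

print(len(combos), len(feasible_combos))
```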

Input combination coverage
• Summary:
– Try to cover the combinations of possible values of inputs
– Exponential combinations: N-wise coverage; 2-wise coverage (all-pairs testing) is the most popular
– Infinite possible values: input partition, input feature extraction
• Coverage is usually 100% once adopted
– It is easy to achieve, compared with code coverage
– Models are not easy to write

Specification Coverage
• A type of input coverage
• Covers the written formal specification in the requirement document
• Example
– When a number smaller than 0 is fed in, the system should report an error => test case: -1
• Sometimes a test case can be a sequence of inputs
– When you input a correct user name, a passwd prompt is shown; after you input the correct passwd, the user profile will be shown, … => test case: xiaoyin, xxxxx, …

Specification Coverage
• Widely used in industry
• Advantages
– Targets the specification
– No need for writing oracles
– Usually can achieve 100% coverage
• Disadvantages
– Very hard to automate: can only be automated with formal specifications
– No guarantee of completeness
– Quality highly depends on the specification

Test coverage
• So far, we have covered inputs and code
• The final goal of testing
– Find all bugs in the software
• So there should be a bug coverage
– The coverage that best represents the adequacy of a test suite
– 50% bug coverage = half done!
– 100% bug coverage = done!

But it is impossible
• Bugs are unknown
– Otherwise we would not need testing
• So we have the number of bugs found, but we do not know what to divide it by
• One possible solution: estimation
– 1-10 bugs in 1 KLOC
– Depends on the type of software and the stage of development; imprecise
– When you find many bugs, do you think you have found all of them, or that the code is really of low quality?

Mutation coverage
• How can we know how many bugs there are in the code?
• If only we planted those bugs!
• Mutation coverage checks the adequacy of a test suite by how many human-planted bugs it can expose

Concepts
• Mutant
– A software version with planted bugs
– Usually each mutant contains only one planted bug. Why?
• Mutant kill
– Given a test suite S and a mutant m, if there is a test case t in S such that execute(original, t) != execute(m, t), we state that S can kill m
– Basically, a test suite killing a mutant means that the test suite is able to detect the planted bug represented by the mutant

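Illustration (figure): the test cases are run on the original program and on mutants 1 to n, and each mutant's results are compared with the original's results (the oracles). Same results: the mutant survived. Different results: the mutant is killed.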

Concepts
• Mutation coverage = # of killed mutants / # of all mutants
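The kill rule and the resulting coverage figure can be sketched as follows. The sketch assumes mutation coverage is the fraction of mutants killed by the suite, and the two mutants of z = x*y + 1 are written by hand for illustration; all function names are hypothetical.

```python
def original(x, y):
    return x * y + 1

def mutant_minus(x, y):
    return x * y - 1      # arithmetic operator replaced: + became -

def mutant_plus(x, y):
    return x + y + 1      # arithmetic operator replaced: * became +

def kills(suite, orig, mutant):
    """S kills m if some test case t gives execute(orig, t) != execute(m, t)."""
    return any(orig(*t) != mutant(*t) for t in suite)

def mutation_coverage(suite, orig, mutants):
    killed = sum(1 for m in mutants if kills(suite, orig, m))
    return killed / len(mutants)

mutants = [mutant_minus, mutant_plus]
print(mutation_coverage([(2, 2)], original, mutants))          # mutant_plus survives
print(mutation_coverage([(2, 2), (3, 4)], original, mutants))  # both are killed
```

The input (2, 2) cannot tell x*y from x+y, so one mutant survives until a second input is added.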

Mutant generation
• Traditional mutation operators
– Statement deletion
– Replace Boolean expression with true/false
– Replace arithmetic operators (+, -, *, /, …)
– Replace comparison relations (>=, ==, <=, !=)
– Replace variables
– …

Mutation Example: Operator
Mutant operator                       In original    In mutant
Statement deletion                    z=x*y+1;       (statement removed)
Boolean expression to true | false    if (x<y)       if(true), if(false)
Replace arithmetic operators          z=x*y+1;       z=x*y-1, z=x+y-1
Replace comparison operators          if(x<y)        if(x<=y), if(x==y)
Replace variables                     z=x*y+1;       z=z*y+1, z=x*x+1

Mutant generation
• Object-oriented mutation operators
– Insert/delete overriding method
– Add/delete “this”
– Instantiation as child class
– Cast to subtype
– …

Mutation Example: Object-Oriented
• Insert/delete overriding method

class Shape{
    public void setID(String id){ this.id = id; }
    public void draw(){ ...
class Circle extends Shape{

class Shape{
    public void setID(String id){ this.id = id; }
    public void draw(){ ...
class Circle extends Shape{

class Shape{
    public void setID(String id){ this.id = id; }
    protected void draw(){ ...
class Circle extends Shape{

Problems of mutation testing
• Large time overhead
– Need to run the test suite over a large number of mutants
– Causes an extra burden for collecting test coverage
• Equivalent mutants
– A mutant that does not change the behavior of the software

Time overhead
• For n mutants, the test suite must be run n times
• How to reduce the time overhead?
– Reuse execution information
– Rule out early the mutants that are not covered
– Rule out early the mutants that cannot be killed

Reduce Time Overhead

original:
int index = read;
while (…) {
    …;
    index++;
    if (index == 10) {
        break;
    }
}
return value > 0;

m1: same as original, except the last statement is: return value < 0;
m2: same as original, except the body of if (index == 10) is: return true;
m3: same as original, except the last statement is: return value + 1 > 0;

• m1 and m3 change only the return statement: reuse the program states before the return statement
• m3: if value is not 0, nothing is changed
• m2: if index reads 100, the mutant is not covered

Equivalent mutants
• Another main problem in mutation coverage is equivalent mutants
• A mutant is an equivalent mutant if its semantics is identical to that of the original software

int index = 0;
while (…) {
    …;
    index++;
    if (index == 10) {
        break;
    }
}
=>
int index = 0;
while (…) {
    …;
    index++;
    if (index >= 10) {
        break;
    }
}
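A sketch of why no test case can kill this mutant. The elided loop condition and body are replaced by a hypothetical step limit, so this illustrates the idea rather than transcribing the slide's code; comparing outputs on a finite range of inputs only illustrates equivalence and does not prove it.

```python
def loop_original(limit):
    index = 0
    steps = 0
    while steps < limit:      # stand-in for the elided loop condition
        steps += 1
        index += 1
        if index == 10:
            break
    return index

def loop_mutant(limit):
    index = 0
    steps = 0
    while steps < limit:
        steps += 1
        index += 1
        if index >= 10:       # mutated comparison: == became >=
            break
    return index

# index starts at 0 and grows by 1, so it reaches 10 before it can exceed 10:
# both versions leave the loop at the same moment for every input.
same = all(loop_original(n) == loop_mutant(n) for n in range(100))
print(same)
```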

Equivalent mutants
• Another main problem in mutation coverage is equivalent mutants
• Equivalent mutants cause mutation coverage to never reach 100%
• So you do not know whether there are too many equivalent mutants or the test suite is not adequate

Reduce equivalent mutants
• Using compiler optimization
– Check whether the compiled bytecode is the same as that of the original software
– Detects mutations of dead code
– Detects mutations of unused variables
• After the mutated code, write a conditional path, and check whether the path is feasible

//result = a + b;
result = a - b;
=>
//result = a + b;
result = a - b;
if (a + b != a - b) {
    not equivalent;
}

Mutation testing tools
• MILU: http://www0.cs.ucl.ac.uk/staff/Y.Jia/#tools
• MuJava: http://cs.gmu.edu/~offutt/mujava/
• Javalanche: https://github.com/david-schuler/javalanche/

Summary on all coverage measures
• Code coverage
– Target: code
– Adequacy: no -> 100% code coverage != no bugs
– Approximation: data flow, branch, method/statement
– Usability: medium (requires code for instrumentation)
– Preparation: none
– Overhead: low (instrumentation causes some overhead)

Summary on all coverage measures
• Input combination coverage
– Target: inputs
– Adequacy: yes -> 100% input coverage == no bugs
– Approximation: n-wise coverage, input partition, input feature extraction
– Usability: none
– Preparation: hard (requires input mapping)
– Overhead: none

Summary on all coverage measures
• Mutation coverage
– Target: bugs
– Adequacy: no -> 100% mutant coverage != no bugs
– Approximation: mutation is already an approximation
– Usability: medium (requires code changes for mutants)
– Preparation: none
– Overhead: very high (execution on instrumented mutated versions)