Software Testing Part III: Test Assessment and Improvement

Presentation transcript:

Software Testing Part III: Test Assessment and Improvement. Aditya P. Mathur, Purdue University. Last update: November 15, 2001

Test assessment and improvement: Learning objectives. To understand the relevance and importance of test assessment. To learn the fundamental principle underlying test assessment. To learn various methods and tools for test assessment.

Learning objectives (continued): To understand the relative strengths and weaknesses of test assessment methods. To learn how to improve tests based on a test assessment procedure.

What is test assessment? Once a test set T, a collection of test inputs, has been developed, we ask: How good is T? Measuring the goodness of T is known as test assessment. Test assessment is carried out against one or more criteria.

Test assessment (continued): These criteria are known as test adequacy criteria. Test assessment is also known as test adequacy assessment.

Test assessment (continued): Test assessment provides the following information: a metric, also known as the adequacy score or coverage, usually between 0 and 1; and a list of all the weaknesses found in T which, when removed, will raise the score to 1. The weaknesses depend on the criteria used for assessment.
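As a minimal sketch of how the adequacy score and the list of weaknesses relate, the C fragment below assumes that the coverage domain has n elements and that a flag array records which of them T has exercised. The names adequacy_score and list_weaknesses are illustrative only and do not come from any tool mentioned in these slides.

```c
#include <stddef.h>
#include <stdio.h>

/* covered[i] is nonzero if element i of the coverage domain was exercised by T.
   Returns the fraction of covered elements, a value between 0 and 1. */
double adequacy_score(const int *covered, size_t n)
{
    size_t hit = 0;
    if (n == 0)
        return 1.0; /* assumption of this sketch: an empty domain counts as covered */
    for (size_t i = 0; i < n; i++)
        if (covered[i])
            hit++;
    return (double)hit / (double)n;
}

/* Prints the uncovered elements (the weaknesses of T) and returns their count. */
size_t list_weaknesses(const int *covered, size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++)
        if (!covered[i]) {
            printf("weakness: element %zu not covered\n", i);
            missing++;
        }
    return missing;
}
```

A test set that covers three of four elements scores 0.75 and has one listed weakness; removing that weakness raises the score to 1.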

Test assessment (continued): Once the coverage has been computed and the weaknesses identified, one can improve T. T is improved by examining one or more weaknesses and constructing new test requirements designed to overcome them. The new test requirements lead to new test specifications and to further testing of the program.

Test assessment (continued): This continues until all weaknesses are overcome, i.e. the adequacy criterion is satisfied (coverage = 1). In some instances it may not be possible to satisfy the adequacy criterion, for one or more of the following reasons: lack of sufficient manpower; weaknesses that cannot be removed because they are infeasible; the cost of removing the weaknesses is not justified.

Test assessment (continued): While improving T by removing its weaknesses, one usually tests the program more thoroughly than it has been tested so far. This additional testing is likely to result in the discovery of remaining errors.

Test assessment (continued): Hence we say that test assessment and improvement helps improve software reliability. Test assessment and improvement is applicable throughout the testing process and during all stages of software development.

Test assessment (summary procedure): Develop T, then: 1. Select an adequacy criterion C. 2. Measure the adequacy of T w.r.t. C. 3. Is T adequate? If no, go to step 4; if yes, go to step 5. 4. Improve T and return to step 2. 5. Is more testing warranted? If yes, return to step 1; if no, go to step 6. 6. Done.

Principle underlying test assessment: A uniform principle underlies test assessment throughout the testing process. It is known as the coverage principle, and it has come about as a result of intensive research in software testing at Purdue and by other research groups.

The coverage principle: To formulate and understand the coverage principle, we need to understand coverage domains and coverage elements. A coverage domain is a finite domain, related to the program under test, that we want to cover. Coverage elements are the individual elements of this domain.

The coverage principle (continued): [Figure: coverage domains and their coverage elements. Labels shown: requirements, classes, functions, interface mutations, exceptions.]

The coverage principle (continued): Measuring test adequacy and improving a test set against a sequence of well-defined, increasingly strong coverage domains leads to improved confidence in the reliability of the system under test.

The coverage principle (continued): Note the following properties of a coverage domain: It is related to the program under test. It is finite. It may come from program requirements, related to the inputs and outputs.

The coverage principle (continued): It may come from program code. Can you think of a coverage domain that comes from the program code? It aids in measuring test adequacy as well as the progress made in testing. How?

The coverage principle (continued): Example: It is required to write a program that takes in the name of a person as a string and searches for the name in a file of names. The program must output the record ID that matches the given name. If there is no match, -1 is returned. What coverage domains can be identified from this requirement?
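To make the requirement concrete, here is a hedged sketch of the search in C. It assumes that the file of names has already been read into an in-memory array and that the record ID is simply the array index; both are assumptions of this sketch, as is the name find_record. Reading the requirement against the code suggests some candidate coverage domains, for example the outcomes (match, no match) and the position of the match (first, last, absent).

```c
#include <stddef.h>
#include <string.h>

/* Returns the record ID (here: the array index) of the first entry equal
   to name, or -1 if no entry matches. */
int find_record(const char *const names[], size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return (int)i;
    return -1;
}
```

A test set drawn from the outcome domain needs at least one name that is present and one that is absent; an empty list is a further element worth considering.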

The coverage principle (continued): As we learned earlier, improving coverage improves our confidence in the correct functioning of the program under test. Given a program P and a test set T, suppose that T is adequate w.r.t. a coverage criterion C. Does this mean that P is error free? Obviously...?

Test effort: There are several measures of test effort. One measure is the size of T. By this measure, a test set with more test cases corresponds to higher effort than one with fewer test cases.

Error detection effectiveness: Each coverage criterion has its own error detection ability. This is also known as the error detection effectiveness, or simply the effectiveness, of the criterion. One measure of the effectiveness of criterion C is the fraction of faults guaranteed to be revealed by a test set T that satisfies C.

Effectiveness (continued): Another measure is the probability that at least a fraction f of the faults in P will be revealed by a test set T that satisfies C. Unfortunately, there is no absolute measure of the effectiveness of any given coverage criterion for a general class of programs and for arbitrary test sets.

Effectiveness (continued): One coverage criterion results in an exception to this rule: What is it? Empirical studies conducted by researchers give us an idea of the relative goodness of various coverage criteria. Thus, for a variety of criteria we can make a statement like: Criterion C1 is definitely better than criterion C2.

Effectiveness (continued): In some cases we may be able to say: Criterion C1 is probably better than criterion C2. Such information allows us to construct a hierarchy of coverage criteria. This hierarchy is helpful in organizing and managing testing. How?

Strength of a coverage criterion: The effectiveness of a coverage criterion is also referred to as its strength. Strength is a measure of the criterion’s ability to reveal faults in a program. Criterion C1 is considered stronger than criterion C2 if C1 is capable of revealing more faults than C2.

The saturation effect: The rate at which new faults are discovered reduces as test adequacy with respect to a finite coverage domain increases; it reduces to zero when the coverage domain has been exhausted, i.e. when coverage reaches 1.

Saturation effect, fault view: [Figure: remaining faults (axis label M) plotted against testing effort, with effort markers tfs, tfe, tds, tdfe, tme; the first region is labeled functional.]

Saturation effect, reliability view: [Figure: reliability plotted against testing effort for functional, decision, dataflow, and mutation testing, with effort markers tfs, tfe, tds, tde, tdfs, tdfe, tms, tfe. True reliability (R) is marked Rf, Rd, Rdf, Rm and estimated reliability (R’) is marked R’f, R’d, R’df, R’m; a saturation region is indicated.] Functional, decision, dataflow, and mutation coverage provide various test evaluation criteria.

Coverage principle (discussion): How will you use your knowledge of the coverage principle and the saturation effect in organizing and managing testing? Can you think of any other uses of the coverage principle and the saturation effect?

Control flow graph: The control flow graph (CFG) of a program is a representation of the flow of execution within the program. It is useful in program analysis such as that required during test assessment and improvement. More formally, a CFG G is defined as follows.

Control flow graph (continued): G = (N, A), where N is a set of nodes and A is a set of arcs. There is a unique entry node en in N. There is a unique exit node ex in N. A node represents a single statement or a block. A block is a single-entry, single-exit sequence of instructions that are always executed in sequence, with no diversion of path except at the end of the block.

Control flow graph (continued): Every statement in a block, except possibly the first one, has exactly one predecessor. Similarly, every statement in the block, except possibly the last one, has exactly one successor. An arc a in A is a pair (n,m) of nodes from N which represents transfer of control from node n to node m. A path of length k in G is an ordered sequence of arcs a_1, a_2, ..., a_k from A such that:

Control flow graph (continued): The first node in a_1 is en. The last node in a_k is ex. For any two adjacent arcs a_i = (n,m) and a_(i+1) = (p,q), m = p. A path is considered executable or feasible if there exists a test case which causes this path to be traversed during program execution; otherwise the path is unexecutable or infeasible.

Control flow graph (example): Exercise: Draw a CFG for the following program and identify all paths.

    1. scanf (x,y); if (y<0)
    2.   pow=0-y;
    3. else pow=y;
    4. z=1.0;
    5. while (pow !=0)
    6.   {z=z*x; pow=pow-1;}
    7. if (y<0)
    8.   z=1.0/z;
    9. printf(z);

What does the above program compute?
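For readers who want to run the exercise program, the following is one possible runnable rendering in C. It is a sketch under two assumptions that are not in the slide: the input and output statements are replaced by parameters and a return value, and the function is given the hypothetical name power. The comments map each statement back to the node numbers used above.

```c
/* Runnable rendering of the numbered program; the scanf and printf calls
   are replaced by parameters and a return value so it can be tested. */
double power(double x, int y)
{
    int pow;
    double z;
    if (y < 0)            /* node 1 (condition) */
        pow = 0 - y;      /* node 2 */
    else
        pow = y;          /* node 3 */
    z = 1.0;              /* node 4 */
    while (pow != 0) {    /* node 5 */
        z = z * x;        /* node 6 */
        pow = pow - 1;
    }
    if (y < 0)            /* node 7 */
        z = 1.0 / z;      /* node 8 */
    return z;             /* node 9 */
}
```

Calling it with a positive, a negative, and a zero exponent exercises both outcomes of each if and both zero and nonzero loop iteration counts.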

Structure-based test adequacy: Based on the CFG of a program, several test adequacy criteria can be defined. Some are: the statement coverage criterion, the branch coverage criterion, the condition coverage criterion, and the path coverage criterion.

Statement coverage: The coverage domain consists of all statements in the program. Restated in terms of the control flow graph, it is the set of all nodes in G. A test set T satisfies the statement coverage criterion if, after P has been executed on every element of T, each statement of P has been executed at least once.

Statement coverage (continued): Restated in terms of G, T is adequate w.r.t. the statement coverage criterion if each node in N is on at least one of the paths traversed when P is executed on the elements of T.

Statement coverage (continued): Class exercise: For the program for which you have drawn the control flow graph, develop a test set that satisfies the statement coverage criterion. Follow the procedure for test assessment and improvement suggested earlier.

Statement coverage (weakness): Consider the following program:

    int abs(int x)
    { if (x>=0) x=0-x; return x; }

Statement coverage (weakness, continued): Suppose that T = {(x=0)}. Clearly, T satisfies the statement coverage criterion. But is the program correct, and is the error revealed by T, which is adequate w.r.t. the statement coverage criterion? What do you suggest we do to improve T?
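The weakness can be observed by instrumenting the program. The sketch below is illustrative: the function is renamed faulty_abs to avoid clashing with the standard library, and the stmt_hit array and statements_covered helper are hypothetical instrumentation added for this example, not part of the original slide.

```c
/* One flag per statement: 0 = the if test, 1 = x=0-x, 2 = return. */
int stmt_hit[3];

/* The abs program from the slide, error included (the test should be x<0),
   with a flag set whenever a statement is reached. */
int faulty_abs(int x)
{
    stmt_hit[0] = 1;
    if (x >= 0) {
        stmt_hit[1] = 1;
        x = 0 - x;
    }
    stmt_hit[2] = 1;
    return x;
}

/* Returns 1 if every statement has been executed at least once so far. */
int statements_covered(void)
{
    return stmt_hit[0] && stmt_hit[1] && stmt_hit[2];
}
```

Running faulty_abs(0) alone marks all three statements and returns the correct value 0, so T = {(x=0)} is adequate yet silent about the error. An input such as 5 or -3 produces a wrong result and therefore reveals it.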

Branch (or edge) coverage: In G there may be nodes which correspond to conditions in P. Such nodes, also called condition nodes, contain branches in P. Each such node is considered covered if, over the executions of P, its condition evaluates to both true and false; the two outcomes need not occur in the same execution.

Branch coverage (continued): The coverage domain consists of all branches in G. Restated in terms of the control flow graph, it is the set of all arcs exiting the condition nodes. A test set T satisfies the branch coverage criterion if, after P has been executed on every element of T, each branch of P has been executed at least once.

Branch coverage (continued): Class exercise: Identify all condition nodes in the flow graph you have drawn earlier. Does T = {(x=0)} satisfy the branch coverage criterion? If not, then improve it so that it does.

Branch coverage (weakness): Consider the following program, which is supposed to check whether the input data item is in the range 0 to 100, inclusive:

    int check(int x)
    { if ((x>=0) && (x<=200)) return 1; /* true */ else return 0; /* false */ }

Branch coverage (weakness, continued): Class exercise: Do you notice the error in this program? Find a test set T which is adequate w.r.t. statement coverage and does not reveal the error. Improve T so that it is adequate w.r.t. branch coverage and does not reveal the error. What do you conclude about the weakness of the branch coverage criterion?
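One way to explore the exercise is to compare the program with its specification. The sketch below is a hedged illustration: expected_check is a hypothetical oracle written from the stated requirement (0 to 100, inclusive) and is not part of the original slides.

```c
/* The range check from the slide, error included: the upper bound
   should be 100, not 200. */
int check(int x)
{
    if ((x >= 0) && (x <= 200))
        return 1;
    else
        return 0;
}

/* Hypothetical oracle derived from the requirement: in range iff 0 <= x <= 100. */
int expected_check(int x)
{
    return x >= 0 && x <= 100;
}
```

With T = {50, -1} both the then part and the else part execute, so T is adequate for statement coverage and, in this program, for branch coverage as well, yet both outputs agree with the oracle. Only an input between 101 and 200, such as 150, makes the two disagree.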

Condition coverage: Condition nodes in G might have compound conditions. For example, in the check program the condition node contains the condition ((x>=0) && (x<=200)). This is a compound condition which consists of the elementary conditions x>=0 and x<=200.

Condition coverage (continued): A compound condition is considered covered if each of its constituent elementary conditions evaluates to true and to false over the executions of P; the two outcomes need not occur in the same execution. A test set T is adequate w.r.t. condition coverage if all conditions in P are covered when P is executed on the elements of T.

Condition coverage (continued): Class exercise: Improve T from the previous exercise so that it is adequate w.r.t. the condition coverage criterion for the check function and does not reveal the error. Do you find the above possible?
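Condition coverage can be tracked by recording the value of each elementary condition as it is evaluated. The following sketch assumes C short-circuit evaluation of &&; the names record, cond_seen, check_traced, and conditions_covered are hypothetical helpers introduced for this illustration.

```c
/* cond_seen[c][v] is set when elementary condition c evaluated to v
   (c = 0 for x>=0, c = 1 for x<=200; v = 0 for false, 1 for true). */
int cond_seen[2][2];

/* Records the outcome of one elementary condition and passes it through. */
int record(int c, int value)
{
    cond_seen[c][value != 0] = 1;
    return value;
}

/* The check program from the slide, error included, with each elementary
   condition wrapped in record(). */
int check_traced(int x)
{
    if (record(0, x >= 0) && record(1, x <= 200))
        return 1;
    else
        return 0;
}

/* Returns 1 once both elementary conditions have been seen true and false. */
int conditions_covered(void)
{
    return cond_seen[0][0] && cond_seen[0][1] &&
           cond_seen[1][0] && cond_seen[1][1];
}
```

Under these assumptions the inputs 50 and -1 leave x<=200 never evaluated to false. Adding 201 completes condition coverage, and since 201 is out of range under both the faulty and the intended bound, the error still goes unrevealed.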

Branch coverage (weakness, continued): Consider the following program:

    0. int set_z(x,y); {
    1. int x,y;
    2. if (x!=0)
    3.   y=5;
    4. else z=z-x;
    5. if (z>1)
    6.   z=z/x;
    7. else
    8.   z=y; }

What might happen here?

Branch coverage (weakness, continued): Class exercise: Construct T for set_z such that (a) T is adequate w.r.t. the branch coverage criterion and (b) T does not reveal the error. What do you conclude about the effectiveness of the branch and condition coverage criteria?
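The following sketch makes set_z executable so that the exercise can be tried out. It rests on assumptions that are not in the slide: z, which the slide never declares, is passed in as a third parameter; the function returns the final z; and a guard sets the hypothetical flag div_by_zero instead of letting the division at line 6 crash.

```c
/* Set when the program would divide by zero at line 6. */
int div_by_zero;

/* Executable rendering of set_z; comments refer to the slide's line numbers. */
int set_z(int x, int y, int z)
{
    div_by_zero = 0;
    if (x != 0)          /* line 2 */
        y = 5;           /* line 3 */
    else
        z = z - x;       /* line 4 */
    if (z > 1) {         /* line 5 */
        if (x == 0) {    /* guard added for this sketch only */
            div_by_zero = 1;
            return z;
        }
        z = z / x;       /* line 6 */
    } else {
        z = y;           /* line 8 */
    }
    return z;
}
```

The two inputs (x=1, z=2) and (x=0, z=0) take each condition both ways without any division by zero, so a branch-adequate test set can miss the error. The combination x=0 with z>1, which pairs the false outcome of the first condition with the true outcome of the second, triggers it.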

Path coverage: As mentioned before, a path through a program is a sequence of statements such that the entry node of the program CFG is the first node on the path and the exit node is the last one on the path. Is this definition equivalent to the one given earlier?

Path coverage (continued): A test set T is considered adequate w.r.t. the path coverage criterion if every path in P is executed at least once when P is executed on the elements of T. Class exercise: Construct T for set_z such that T is adequate w.r.t. the path coverage criterion and does not reveal the error. Is the above possible?

Path coverage (weakness): The number of paths in a program is usually very large. How many paths are there in set_z? How many in check? How many in the exponentiation program?

Path coverage (weaknesses): It is the infinite or prohibitively large number of paths that prevents the use of this criterion in practice. Suppose that a test set T covers all paths. Will it guarantee that all errors in P are revealed? Is obtaining 100% path coverage equivalent to exhaustive testing?

Variants of path coverage: As path coverage is usually impossible to attain, other heuristics have been proposed. Loop coverage: make sure that each loop is executed 0, 1, and 2 times. Try several combinations of if and switch statements; the combinations must come from requirements.

Hierarchy in control flow criteria: [Figure: subsumption hierarchy listing, from top to bottom, path coverage, condition coverage, branch coverage, and statement coverage; an arrow from X to Y means that X subsumes Y.]

Exercise: Develop a test set T that is adequate w.r.t. the statement, condition, and loop coverage criteria for the exponentiation program.

Testing technique or strategy: One can develop a testing strategy based on any of the criteria discussed. Example: A testing strategy based on the statement coverage criterion will begin by evaluating a test set T against this criterion. Then new tests will be added to T until all the statements are covered, i.e. T satisfies the criterion.

Definitions: Error-sensitive path: a path whose execution might lead to eventual detection of an error. Error-revealing path: a path whose execution will always cause the program to fail and the error to be detected.

Definitions (continued): Reliable: A testing technique is reliable for an error if it guarantees that the error will always be detected. This implies that a reliable testing technique must lead to the exercising of at least one error-revealing path.

Definitions (continued): Weakly reliable: A testing technique is weakly reliable if it forces the execution of at least one error-sensitive path.

Example: error detection [1] ([1]-[3] not covered during Fall 2001 in CS 406): Let us go over the example in Korel and Laski’s paper. It is a sorting program which uses the bubble sort algorithm. It sorts an array a[0:N] in descending order. There are two nested loops in the program. The inner loop, from i6 to i10, finds the largest element of a[R1:N].

Example: error detection [2]: The largest element is saved in R0, and R3 points to the location of R0 in a. The outer loop swaps a(R1) with a(R3). The completion of one iteration of the outer loop ensures that the sub-array a[0:R1-1] has been sorted and that a[R1-1] is greater than or equal to any element of a[R1:N].

Example: error detection [3]: There is a missing re-initialization of R3 to R1 at the beginning of the inner loop. In some cases this will cause the program to fail. What are these cases? We will get back to this error later!

Data flow graph: It represents the flow of data in a program. The graph is constructed from the control flow graph (CFG) of the program. A statement that occurs within a node of the CFG might contain variable occurrences. Each variable occurrence is classified as a def or a use.

defs and uses: A def represents the definition of a variable. Here are some sample defs of variable x; in each statement, the def is the occurrence of x that is declared or receives a value: x=y*x; scanf(&x,&y); int x; x[i-1]=y*x; A use represents the use of a variable in a statement.

def-use (continued): Here are a few examples of uses of variable x; in each statement, the use is the occurrence of x whose value is read: x=x+1; printf("x is %d, y is %d", x,y); cout << x << endl << y; z=x[i+1]; if (x<y)... Uses of a variable in assignments and in input/output statements are classified as c-uses. Those in conditions are classified as p-uses.

def-use (continued): c-use stands for computational use and p-use for predicate use. Both c- and p-uses affect the flow of control: p-uses directly, as their values are used in evaluating conditions, and c-uses indirectly, as their values are used to compute other variables which in turn affect the outcome of condition evaluation.

def-use (continued): A path from node i to node j is said to be def-clear w.r.t. a variable x if there is no def of x in the nodes strictly between i and j on the path; nodes i and j themselves may have a def of x. A def-clear path from node i to edge (j,k) is one in which no node on the path has a def of x.

global-def: A def of a variable x is considered global to its block if it is the last def of x within that block. A c-use of x in a block is considered a global c-use if there is no def of x preceding this c-use within the block.

def-use graph (definitions): def(i): set of all variables for which there is a global definition at node i. c-use(i): set of all variables that have a global c-use at node i. p-use(i,j): set of all variables for which there is a p-use for the edge (i,j). dcu(x,i): for x in def(i), the set of all nodes that have x in their c-use set and can be reached from node i along a def-clear path w.r.t. x.

def-use graph (definitions, continued): dpu(x,i): for x in def(i), the set of all edges that have x in their p-use set and can be reached from node i along a def-clear path w.r.t. x. The def-use graph of program P is constructed by associating def, c-use, and p-use sets with the nodes and edges of a flow graph.

def-use graph (continued): Sample program:

    1. scanf (x,y); if (y<0)
    2.   pow=0-y;
    3. else pow=y;
    4. z=1.0;
    5. while (pow !=0)
    6.   {z=z*x; pow=pow-1;}
    7. if (y<0)
    8.   z=1.0/z;
    9. printf(z);

def-use graph (continued): [Figure: def-use graph of the sample program.] Unlabeled edges imply an empty p-use set.

    Node   def        c-use
    1      {x,y}      {}
    2      {pow}      {y}
    3      {pow}      {y}
    4      {z}        {}
    5      {}         {}
    6      {z,pow}    {z,x,pow}
    7      {}         {}
    8      {z}        {z}
    9      {}         {z}

Edge labels (p-use sets): (1,2) and (1,3): y. (5,6) and (5,7): pow. (7,8) and (7,9): y.

def-use graph (exercise): Draw a def-use graph for the following program.

    0. int set_z(x,y); {
    1. int x,y;
    2. if (x!=0)
    3.   y=5;
    4. else z=z-x;
    5. if (z>1)
    6.   z=z/x;
    7. else
    8.   z=y; }

def-use graph (continued): Traverse the graph to determine the dcu and dpu sets.

    (node, var)   dcu        dpu
    (1,x)         {6}        {}
    (1,y)         {2,3}      {(1,2),(1,3),(7,8),(7,9)}
    (2,pow)       {6}        {(5,6),(5,7)}
    (3,pow)       {6}        {(5,6),(5,7)}
    (4,z)         {6,8,9}    {}
    (6,z)         {6,8,9}    {}
    (6,pow)       {6}        {(5,6),(5,7)}
    (8,z)         {9}        {}
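The dcu column can be reproduced mechanically. The sketch below is one possible way to do it, not the method used by any particular tool: it encodes the sample program's CFG and its def and c-use sets as bitmasks and walks forward from the defining node for as long as the path stays def-clear. All identifiers (succ, def_set, cuse_set, dcu, and the variable bits X, Y, POW, Z) are assumptions of this sketch.

```c
enum { X = 1, Y = 2, POW = 4, Z = 8 };    /* one bit per program variable */
#define NODES 10                           /* nodes are 1..9; index 0 is unused */

/* succ[n]: bitmask of the successors of node n in the sample program's CFG. */
static const unsigned succ[NODES] = {
    0,
    (1u << 2) | (1u << 3),   /* 1 -> 2, 3 */
    1u << 4,                 /* 2 -> 4 */
    1u << 4,                 /* 3 -> 4 */
    1u << 5,                 /* 4 -> 5 */
    (1u << 6) | (1u << 7),   /* 5 -> 6, 7 */
    1u << 5,                 /* 6 -> 5 */
    (1u << 8) | (1u << 9),   /* 7 -> 8, 9 */
    1u << 9,                 /* 8 -> 9 */
    0                        /* 9 is the exit node */
};
static const unsigned def_set[NODES]  = {0, X | Y, POW, POW, Z, 0, Z | POW, 0, Z, 0};
static const unsigned cuse_set[NODES] = {0, 0, Y, Y, 0, 0, Z | X | POW, 0, Z, Z};

/* Returns dcu(var, i) as a bitmask of node numbers: every node with a global
   c-use of var that is reachable from node i along a def-clear path. */
unsigned dcu(unsigned var, int i)
{
    unsigned result = 0, visited = 0, work;
    if (!(def_set[i] & var))
        return 0;                          /* var is not defined at node i */
    work = succ[i];                        /* nodes still to be examined */
    while (work) {
        int n = 0;
        while (!(work & (1u << n)))
            n++;                           /* pick the lowest pending node */
        work &= ~(1u << n);
        if (visited & (1u << n))
            continue;
        visited |= 1u << n;
        if (cuse_set[n] & var)
            result |= 1u << n;             /* n has a c-use of var */
        if (!(def_set[n] & var))
            work |= succ[n];               /* no redefinition: keep walking */
    }
    return result;
}
```

For example, dcu(Z, 4) yields the bits for nodes 6, 8, and 9, matching the (4,z) row. The dpu sets could be computed the same way by collecting labeled edges instead of nodes.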

Test generation: Exercises: For the above graph, generate a test set that satisfies each of the following: the branch coverage criterion; the all-defs criterion, which requires that for each definition of each variable at least one use (c- or p-use) be exercised; the all-uses criterion, which requires that all p-uses and all c-uses of all variable definitions be covered. Develop the tests incrementally, i.e. by modifying the previous test set!

ATAC processing, phase I: [Figure: P, the program under test, is preprocessed, compiled, and instrumented. This generates the .atac files and an instrumented, executable version of P. The test set generates the input; upon execution, the instrumented version produces a .trace file and the program output.]

ATAC processing, phase II: [Figure: the .atac files and the .trace file are fed to the coverage analyzer, which reports control flow and data flow coverage values.]

Mutation testing: What is mutation testing? Mutation testing is a code-based test assessment and improvement technique. It relies on the competent programmer hypothesis, which is the following assumption: Given a specification, a programmer develops a program that is either correct or differs from the correct program by a combination of simple errors.

Mutation testing (continued): The process of program development is considered iterative: an initial version of the program is refined towards the final version by making simple changes or combinations of simple changes.

Mutation testing-definitions Given a program P, a mutant of P is obtained by making a simple change in P. Program Mutant 1. int x,y; 2. if (x!=0) 3. y=5; 4. else z=z-x; 5. if (z>1) 6. z=z/x; 7. else 8. z=y; 1. int x,y; 2. if (x!=0) 3. y=5; 4. else z=z-x; 5. if (z>1) 6. z=z/zpush(x); 7. else 8. z=y; What is zpush? Test assessment and improvement

Another mutant

Program:
    int x,y;
    if (x!=0)
      y=5;
    else z=z-x;
    if (z>1)
      z=z/x;
    else
      z=y;

Mutant:
    int x,y;
    if (x!=0)
      y=5;
    else z=z-x;
    if (z<1)          /* changed: > replaced by < */
      z=z/x;
    else
      z=y;

Mutant
A mutant M is considered distinguished by a test case $t \in T$ iff $P(t) \neq M(t)$, where P(t) and M(t) denote, respectively, the observed behavior of P and M when executed on test input t.
A mutant M is considered equivalent to P iff $P(t) = M(t)$ for all $t \in T$, where T here ranges over every possible test input; that is, no test case can distinguish M from P.
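
A minimal sketch of checking whether a single test distinguishes the relational mutant shown earlier (z>1 changed to z<1). It rests on assumptions that are not in the slides: x and z are treated as the test input, the final z is the observed behavior, y is given an initial value of 0, and inputs with x == 0 are excluded because the fragment could then divide by zero. The names run_p, run_m, and distinguishes are hypothetical.

```c
#include <assert.h>

/* Original fragment. x and z are treated as inputs and the final z as the
   observed behavior (an assumption; the slides do not say). */
int run_p(int x, int z)
{
    int y = 0;        /* assumed initial value; the slide leaves y unset */
    assert(x != 0);   /* x == 0 is excluded: it could divide by zero */
    if (x != 0)
        y = 5;
    else
        z = z - x;
    if (z > 1)
        z = z / x;
    else
        z = y;
    return z;
}

/* Mutant: identical except that z > 1 has become z < 1. */
int run_m(int x, int z)
{
    int y = 0;
    assert(x != 0);
    if (x != 0)
        y = 5;
    else
        z = z - x;
    if (z < 1)
        z = z / x;
    else
        z = y;
    return z;
}

/* Nonzero iff the test input (x, z) distinguishes the mutant from P. */
int distinguishes(int x, int z)
{
    return run_p(x, z) != run_m(x, z);
}
```

Under these assumptions the input (x=2, z=4) distinguishes the mutant, since P yields 2 and the mutant yields 5, while (x=2, z=1) does not, since both take the else branch and yield 5.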

Mutation score
During testing a mutant is considered live if it has not been distinguished or proven equivalent. Suppose that a total of #M mutants are generated for program P. The mutation score of a test set T, designed to test P, is computed as:

    number of distinguished mutants / (#M - number of equivalent mutants)

The numerator counts distinguished mutants, not live ones, so the score reaches 1 exactly when no non-equivalent mutant remains live.
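
A small sketch of the score computation, taking the score to be the fraction of non-equivalent mutants that have been distinguished, so that a test set leaving no live mutants scores 1. The function and parameter names are illustrative, not taken from any tool.

```c
#include <assert.h>

/* Mutation score: distinguished / (total - equivalent).
   total         = #M, the number of mutants generated
   equivalent    = mutants proven equivalent to P
   distinguished = mutants distinguished by at least one test */
double mutation_score(int total, int equivalent, int distinguished)
{
    int non_equivalent = total - equivalent;
    assert(non_equivalent > 0);
    assert(distinguished >= 0 && distinguished <= non_equivalent);
    return (double)distinguished / non_equivalent;
}

/* Live mutants: neither distinguished nor proven equivalent. */
int live_mutants(int total, int equivalent, int distinguished)
{
    return total - equivalent - distinguished;
}
```

For example, with 10 mutants of which 2 are equivalent and 6 are distinguished, the score is 6/8 = 0.75 and 2 mutants are still live.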

Test adequacy criterion
A test set T is considered adequate with respect to the mutation criterion if its mutation score is 1.
The number of mutants generated depends on P and on the mutant operators applied to P. A mutant operator is a rule that, when applied to the program under test, generates zero or more mutants.

Mutant operators
Consider the following program:

    int abs(int x)
    {
      if (x>=0) x=0-x;   /* negate x when it is non-negative */
      return x;
    }

Mutation operator
Consider the following rule: replace each relational operator in P by every other relational operator.
Assuming the set of relational operators to be {<, >, <=, >=, ==, !=}, this mutant operator generates a total of 5 mutants of P, because P contains a single relational operator (>=) and it can be replaced by each of the other five.
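
The rule can be sketched in C by making the relational operator of the abs program a parameter, so that one function stands for the original program and its 5 mutants. This is only an illustration of the idea, not how a mutation tool generates mutants; RelOp, rel, abs_variant, and count_distinguished are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LT, GT, LE, GE, EQ, NE, NUM_OPS } RelOp;

/* Evaluate "a op b" for the chosen relational operator. */
int rel(RelOp op, int a, int b)
{
    switch (op) {
    case LT: return a < b;
    case GT: return a > b;
    case LE: return a <= b;
    case GE: return a >= b;
    case EQ: return a == b;
    case NE: return a != b;
    default: assert(0); return 0;
    }
}

/* The abs program with its relational operator made a parameter:
   op == GE is the original program, any other value is a mutant. */
int abs_variant(RelOp op, int x)
{
    if (rel(op, x, 0))
        x = 0 - x;
    return x;
}

/* Number of the 5 mutants distinguished by at least one test input. */
int count_distinguished(const int *tests, size_t n)
{
    int count = 0;
    for (int op = 0; op < NUM_OPS; op++) {
        if (op == GE)
            continue;   /* skip the original program */
        for (size_t i = 0; i < n; i++) {
            if (abs_variant((RelOp)op, tests[i]) != abs_variant(GE, tests[i])) {
                count++;
                break;
            }
        }
    }
    return count;
}
```

With the single test input -1, three mutants (<, <=, !=) are distinguished. Adding 0 and 1 also distinguishes the == mutant. The > mutant is never distinguished: it differs from >= only at x = 0, where negating x leaves it unchanged, so it is an equivalent mutant.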

Mutation operators
Mutation operators are language dependent.
- For Fortran, a total of 22 operators were proposed.
- For C, a total of 77 operators were proposed.
- None have been proposed for C++, though most of the operators for C are applicable to C++ programs.

Equivalent mutant
Consider the following program P:

    int x,y,z;
    scanf("%d %d",&x,&y);
    if (x>0) {
      x=x+1; z=x*(y-1);
    } else {
      x=x-1; z=x*(y-1);
    }

Here z is considered the output of P.

Equivalent mutant-continued
Now suppose that a mutant of P is obtained by changing x=x+1 to x=abs(x)+1. The changed statement executes only when x>0, and for positive x, abs(x) equals x. This mutant is therefore equivalent to P: no test case can distinguish it from P.
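
A sketch that compares P with this mutant over a small range of inputs. Agreement on a finite sample cannot prove equivalence; it only fails to find a distinguishing test. The function names and the sampled range are assumptions made for illustration, and x and y are passed as parameters instead of being read with scanf.

```c
#include <assert.h>
#include <stdlib.h>

/* Program P, with x and y as inputs and z as the output. */
int p_output(int x, int y)
{
    int z;
    if (x > 0) {
        x = x + 1;
        z = x * (y - 1);
    } else {
        x = x - 1;
        z = x * (y - 1);
    }
    return z;
}

/* Mutant of P: x = x + 1 replaced by x = abs(x) + 1. */
int m_output(int x, int y)
{
    int z;
    if (x > 0) {
        x = abs(x) + 1;
        z = x * (y - 1);
    } else {
        x = x - 1;
        z = x * (y - 1);
    }
    return z;
}

/* Returns 1 if P and the mutant agree for all x, y in [lo, hi], else 0. */
int agree_on_range(int lo, int hi)
{
    assert(lo <= hi);
    for (int x = lo; x <= hi; x++)
        for (int y = lo; y <= hi; y++)
            if (p_output(x, y) != m_output(x, y))
                return 0;
    return 1;
}
```

A call such as agree_on_range(-50, 50) returns 1, which is consistent with the equivalence argument but does not replace it.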

Mutation testing procedure
Given P and a test set T:
1. Generate mutants.
2. Compile P and the mutants.
3. Execute P and the mutants on each test case.
4. Determine equivalent mutants.
5. Determine the mutation score.
6. If the mutation score is not 1, improve the test set and repeat from step 3.
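
Steps 3 and 5 can be sketched as a small harness that runs P and each mutant on every test and reports the fraction of non-equivalent mutants that were distinguished. Everything here is hypothetical: programs are modeled as functions from int to int, equivalence (step 4) is supplied by hand as a flag array, and twice, twice_add, square, and two_plus are toy examples, not programs from the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Program)(int);

/* Toy program under test and three hand-written mutants. */
int twice(int x)     { return 2 * x; }
int twice_add(int x) { return x + x; }   /* equivalent to twice */
int square(int x)    { return x * x; }
int two_plus(int x)  { return 2 + x; }

/* Run p and each mutant on every test; return the mutation score.
   equivalent[i] != 0 marks mutant i as judged equivalent. */
double mutation_analysis(Program p, const Program *mutants,
                         const int *equivalent, size_t n_mutants,
                         const int *tests, size_t n_tests)
{
    size_t non_equivalent = 0, distinguished = 0;
    for (size_t i = 0; i < n_mutants; i++) {
        if (equivalent[i])
            continue;
        non_equivalent++;
        for (size_t j = 0; j < n_tests; j++) {
            if (mutants[i](tests[j]) != p(tests[j])) {
                distinguished++;
                break;   /* one distinguishing test is enough */
            }
        }
    }
    assert(non_equivalent > 0);
    return (double)distinguished / (double)non_equivalent;
}
```

With the test set {2}, neither square nor two_plus is distinguished, because all three programs return 4, so the score is 0. Adding the test 3 (step 6) distinguishes both and raises the score to 1.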

Mutation testing procedure
In practice the above procedure is implemented incrementally. One applies a few selected mutant operators to P and computes the mutation score with respect to the mutants generated. Once these mutants have been distinguished or proven equivalent, another set of mutant operators is applied.

Mutation testing procedure
This procedure is repeated until either all the mutants have been exhausted or some external condition forces testing to stop. We will not discuss the details of practical application of mutation testing.

Tools for mutation testing
- Mothra: for Fortran, developed at Purdue, 1990.
- Proteum: for C, developed at the University of São Paulo at São Carlos, Brazil.

Uses of mutation testing
Mutation testing is useful during integration testing to check for integration errors. In that setting, only the variables in the interfaces of the components being integrated are mutated, which reduces the complexity of mutation testing.

Summary
- Test adequacy criterion
- Test improvement
- Coverage principle
- Saturation effect
- Control flow criteria
- Data flow criteria: def, use, p-use, c-use, all-uses

Summary continued
- xSUDS, data flow testing tool
- Mutation testing: mutant, distinguishing a mutant, live mutant, mutation score, competent programmer hypothesis