Automation of Test Case Generation


Testing is expensive
- We have talked about three different kinds of coverage.
- However, the three kinds of coverage are mainly used as a measure of progress, not as a target to be achieved.
- Why? "Money coverage" and "time coverage" are the most-used targets in practice.

Testing is expensive: numbers
- How many lines of test code? About 40% of the Microsoft code base.
- How much is spent each year on software testing?
  - It is not easy even to estimate the time and effort spent on testing inside an IT company (recall the words of Bill Gates).
  - The software testing market (outsourced testing) was about $17B in 2012.
- How many new testers?
  - More than 30K each year in the US recently.
  - Well over 150K each year globally.

Automation is the way to reduce cost
- Good or bad news: there is no way to fully automate testing in the foreseeable future.
- Three phases:
  - Execution automation: mostly done for unit testing (xUnit), somewhat done for system testing (scripts).
  - Test case generation automation: quite a hot topic in academia, with great progress made recently.
  - Test oracle check automation: some good trials.

Automating different steps
- Automating test execution
  - Unit testing frameworks
  - Record and replay for GUI testing
  - Scripts for command-line tools and configuration testing
- Automating test case generation (this lecture)

Automatic test case generation
- Random testing: pure random, rule-based, combinatorial, feedback-guided
- Exploration testing
- Systematic testing: symbolic execution, chaining
- Search-based testing

Random testing
- Randomly generate inputs to feed into the software.
- Advantages
  - The easiest way to generate test cases automatically.
  - Requires no preparation and is easy to implement.
- Problems
  - Semantically redundant inputs: for a simple program such as 10/x, every input except 0 exercises the same behavior.

Rule-based random testing
- Try to reduce redundancy by using rules.
- Try boundary cases: 0, 1, -1, and other known boundaries.
- Use distributions: generate random values that follow a certain distribution. This is very effective if the distribution of the input is known, e.g., the scores of a final exam.
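
The two rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `RuleBasedInputs`, the particular boundary list, and the choice of a normal distribution are hypothetical and not taken from the lecture. The program under test is the 10/x example from the random testing slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative rule-based random input generator for one int parameter. */
final class RuleBasedInputs {
    // Boundary values tried first; this particular list is an assumption of the sketch.
    static final int[] BOUNDARIES = {0, 1, -1, Integer.MAX_VALUE, Integer.MIN_VALUE};

    /** Returns the boundary values followed by randomCount values drawn from a
     *  normal distribution with the given mean and standard deviation. */
    static List<Integer> generate(int randomCount, double mean, double stdDev, long seed) {
        Random rng = new Random(seed);
        List<Integer> inputs = new ArrayList<>();
        for (int b : BOUNDARIES) {
            inputs.add(b);
        }
        for (int i = 0; i < randomCount; i++) {
            inputs.add((int) Math.round(mean + stdDev * rng.nextGaussian()));
        }
        return inputs;
    }

    /** The program under test from the slides: it fails only for x == 0. */
    static int tenDividedBy(int x) {
        return 10 / x;
    }

    /** Counts the inputs that make the program throw ArithmeticException. */
    static int countFailures(List<Integer> inputs) {
        int failures = 0;
        for (int x : inputs) {
            try {
                tenDividedBy(x);
            } catch (ArithmeticException e) {
                failures++;
            }
        }
        return failures;
    }
}
```

Calling `RuleBasedInputs.generate(20, 70.0, 10.0, 42L)` yields the five boundary values plus twenty exam-score-like values. The boundary rule guarantees that 0 is tried, which pure random generation over all ints would almost never do.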

Combinatorial testing
- Recall input combination coverage.
- Keep improving coverage until it reaches 100%:
  - Add a new test case that covers value combinations not yet covered.
  - Give priority to the test case that covers the most new values.
- For multiple features of a given input, use a constraint solver to generate a value that satisfies the feature combination (e.g., length = 2 && starts-with A).

Adaptive random testing: intuition
- Bug-revealing inputs tend to form patterns:
  - Strips: the bug is revealed when x + y = 4.
  - Blocks: the bug is revealed when x > 3 & y < 4.
- Trying two very similar test cases therefore does not make much sense.

Adaptive random testing: approach
- When choosing the next test case, choose the candidate that has the largest minimum distance from the existing test cases.
- If {t1, t2, …, tn} are the existing test cases, choose ch so that its smallest distance to any ti is the largest among all candidates.
- Example with one input, int16 x: T1: -20, T2: 32767, T3: -32768, T4: 16394, T5: -16374, T6: ...
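
The selection rule can be written down directly. The sketch below assumes a single int16 input and absolute difference as the distance. The class and method names are hypothetical, and the size of the candidate pool drawn in each round is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative candidate selection for adaptive random testing on one int input. */
final class AdaptiveRandom {
    /** Smallest absolute distance from the candidate to any executed test input. */
    static long minDistance(int candidate, List<Integer> executed) {
        long best = Long.MAX_VALUE;
        for (int t : executed) {
            best = Math.min(best, Math.abs((long) candidate - t));
        }
        return best;
    }

    /** Picks the candidate whose minimum distance to the executed tests is largest. */
    static int pickNext(List<Integer> candidates, List<Integer> executed) {
        int chosen = candidates.get(0);
        long chosenDistance = minDistance(chosen, executed);
        for (int c : candidates) {
            long d = minDistance(c, executed);
            if (d > chosenDistance) {
                chosen = c;
                chosenDistance = d;
            }
        }
        return chosen;
    }

    /** Draws poolSize random int16 candidates for one selection round. */
    static List<Integer> randomCandidates(int poolSize, Random rng) {
        List<Integer> pool = new ArrayList<>();
        for (int i = 0; i < poolSize; i++) {
            pool.add(rng.nextInt(65536) - 32768); // uniform over [-32768, 32767]
        }
        return pool;
    }
}
```

With executed tests -20 and 32767 and candidates -32768, 0 and 100, `pickNext` returns -32768, which matches T3 in the example above.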

Feedback-guided random testing: question
- It is easy to generate random values for primitive types, but how do we generate a random object?
- We could generate a random object with id = -1, content = "", child = null.
- That may not make sense: for a given class, id may always be positive, content may never be empty, and child may never be null.

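Feedback-guided random testing: consequence
- Randomly generated objects may trigger a lot of failures, but they are not real bugs.
- Example: an object created at random with id = -1, content = "", child = null fails in every statement of the method below, although the constructor never produces such a state.

      class BugClass {
          int id;
          String content;
          Object child;

          public BugClass(String str) {
              this.id = Global.id++;
              this.content = "bug:" + str;
              this.child = Bug.createDefaultChild();
          }

          // Named do() on the slide; "do" is a reserved word in Java, so it is renamed here.
          public void doWork() {
              String type = Global.type[this.id];          // id = -1: array index < 0
              String content = this.content.substring(4);  // content = "": index beyond string length
              this.child.doWork();                         // child = null: null pointer dereference
          }
      }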

Feedback-guided random testing: solution
- Start with the methods that take primitive types.
- When an object of a class has been generated, reuse that object in the following test cases.
- Do not use duplicate or null objects.
- Do not use objects whose generation raised an exception.
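
The feedback rules can be illustrated with a heavily simplified sketch. It is not Randoop's algorithm or API: real tools build sequences of method calls on arbitrary objects, whereas this hypothetical `FeedbackGuided` class only grows a pool of strings by applying unary operations. The operations in `demo` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.UnaryOperator;

/** Simplified illustration of feedback-guided generation over a pool of values. */
final class FeedbackGuided {
    /** Grows a pool of reusable values by applying random operations to pool members.
     *  The seeds list must not be empty. */
    static List<String> grow(List<String> seeds, List<UnaryOperator<String>> operations,
                             int steps, long seed) {
        Random rng = new Random(seed);
        List<String> pool = new ArrayList<>(seeds);
        for (int i = 0; i < steps; i++) {
            String input = pool.get(rng.nextInt(pool.size()));
            UnaryOperator<String> op = operations.get(rng.nextInt(operations.size()));
            try {
                String result = op.apply(input);
                // Feedback: keep only new, non-null results for use in later test cases.
                if (result != null && !pool.contains(result)) {
                    pool.add(result);
                }
            } catch (RuntimeException e) {
                // Feedback: a value whose creation threw an exception is never reused.
            }
        }
        return pool;
    }

    /** Hypothetical operations: one useful, one returning null, one identity, one throwing. */
    static List<String> demo() {
        List<UnaryOperator<String>> ops = new ArrayList<>();
        ops.add(s -> s + "a");
        ops.add(s -> null);
        ops.add(s -> s);
        ops.add(s -> { throw new IllegalStateException("invalid state"); });
        List<String> seeds = new ArrayList<>();
        seeds.add("x");
        return grow(seeds, ops, 50, 1L);
    }
}
```

Whatever the random choices are, the returned pool contains no null and no duplicates, so every value handed to later calls was produced by a call that completed normally.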

Exploration testing
- Very commonly used for testing GUI and web software.
- Explore the user interface: click everything that is clickable and feed random values into input boxes.
- Often achieves good coverage.
- One problem is the login step, or any other input that requires some semantics.

Exploration testing: example
[Figure: window-transition graph of a game GUI with the nodes Menu Window, About, Play, Records, Setting, Audio Setting, Game Setting, Select Level, Login, Play, Confirm]

Exploration testing: returning to the last window
- Use a return button, "go back", …; this is not always available.
- Record the state of the software at the previous window; this sometimes has high overhead.
- Exploration usually requires instrumentation of the code:
  - to check outputs (sent to the screen) with oracles;
  - to tell the explorer that the window is ready (the drawing of the window is done).

Exploration testing: signal-based exploration testing
[Figure: message sequence between the driver and the software under test: click event, event handler finishes, analyze new GUI, click event, event handler finishes]

Tool introduction: Randoop
- Feedback-based random testing
- Joint work by MIT and Microsoft Research
- The most robust and effective random testing tool for Java
- Available both as a stand-alone tool and as an Eclipse plug-in
- Tool website: https://code.google.com/p/randoop/

Tool introduction: Randoop process
- Install from the Eclipse update site: http://randoop.googlecode.com/hg/plugin.updateSite/
- Right-click the class to be tested and choose it as Randoop input.
- Configuration:
  - How long Randoop should run
  - Size of test methods
  - Timeout for a thread
  - Number of test methods per file

Tool introduction: Randoop results
- Add the support jar file.
- Use JUnit to re-execute the test suite.
- Remove invalid test cases.
- Use EclEmma to check the results.

Search-based testing
- Treat test case generation as an optimization problem.
- Sits somewhere between random testing and systematic testing.
- Builds on random testing and focuses on the input domain.
- Uses code coverage as guidance.

Optimization
- Try to find the maximal or minimal value of a certain function.
- Numerous practical problems can be viewed as optimization problems:
  - Least cost to travel to a number of cities
  - Fewest cameras needed to cover an area
  - Distribution of stores that attracts the most customers
  - Design of pipe systems with the least material
  - Putting items into a backpack (with limited volume) with the highest total value

Illustration of optimization
[Figure: a curve with a global peak and a local peak]

Solution of optimization
- Hill climbing
  - Start from a random point.
  - Try all neighboring points and move to the one with the highest value, until all neighboring points have a lower value than the current point.
  - Easily ends up at a local peak.
- Random-restart hill climbing
  - Restart hill climbing many times to avoid getting stuck at local peaks.
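
Both variants fit in a short sketch. It assumes a one-dimensional integer search space [lo, hi] with neighbors x - 1 and x + 1. The landscape `twoPeaks` is a made-up function with one local and one global peak, and all names are hypothetical.

```java
import java.util.Random;
import java.util.function.IntUnaryOperator;

/** Illustrative hill climbing and random-restart hill climbing on integers. */
final class HillClimbing {
    /** Climbs from start within [lo, hi] until no neighbor has a higher value. */
    static int climb(IntUnaryOperator f, int start, int lo, int hi) {
        int current = start;
        while (true) {
            int best = current;
            if (current > lo && f.applyAsInt(current - 1) > f.applyAsInt(best)) {
                best = current - 1;
            }
            if (current < hi && f.applyAsInt(current + 1) > f.applyAsInt(best)) {
                best = current + 1;
            }
            if (best == current) {
                return current; // no neighbor is higher: a (possibly local) peak
            }
            current = best;
        }
    }

    /** Repeats the climb from random starting points and keeps the highest peak found. */
    static int climbWithRestarts(IntUnaryOperator f, int lo, int hi, int restarts, long seed) {
        Random rng = new Random(seed);
        int best = climb(f, lo + rng.nextInt(hi - lo + 1), lo, hi);
        for (int i = 1; i < restarts; i++) {
            int peak = climb(f, lo + rng.nextInt(hi - lo + 1), lo, hi);
            if (f.applyAsInt(peak) > f.applyAsInt(best)) {
                best = peak;
            }
        }
        return best;
    }

    /** Example landscape: local peak at x = 2 (height 5), global peak at x = 8 (height 9). */
    static int twoPeaks(int x) {
        return Math.max(5 - 2 * Math.abs(x - 2), 9 - 2 * Math.abs(x - 8));
    }
}
```

On this landscape `climb(HillClimbing::twoPeaks, 0, 0, 10)` stops at the local peak 2, while a climb started at 10 reaches the global peak 8. With enough restarts it becomes very likely that at least one climb starts in the basin of the global peak.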

Solution of optimization
- Simulated annealing
  - An adaptation of hill climbing.
  - Has some probability of moving on after reaching a local peak.
  - The probability drops as time goes by.
- Genetic algorithm
  - Simulates the process of evolution.
  - Start with random points.
  - Select a number of the best points.
  - Combine and mutate these points.
  - Repeat until no more improvement can be made.

Transforming testing into optimization
- Use code coverage as the criterion: try to generate a test case that covers a certain code element (method, statement, branch, …).
- Measure how well we have solved the problem with a fitness function.
- A simple fitness function: how far the already covered elements are from the target code element. Try to make the distance 0.

Starting from random testing
- A list of random test cases is the starting point.
- Each test case is a point in the input domain.
- Use various optimization algorithms to find the best test case.

An example with hill climbing
- Program; the target is st2:

      read x, y;
      if (x + y >= 10) {
          st1;
          if (x >= 7) {
              st2; // target
          }
      }
      st3;

- Start from (0, 0): f(0, 0) = 2 steps.
- Try (0,1), (1,0), (0,-1), (-1,0): no improvement, so the search goes in all directions equally.
- Continue until reaching (5,5): f(5,5) = 1 step.
- Increase x and y until reaching (7,5). Done!
[Figure: control-flow graph of the program with the branches "if x+y >= 10" and "if x >= 7" and the nodes st1, st2, st3]

Problem
- The fitness function is too simple.
- It requires a lot of random walking before getting to the right place, (5,5).
- Better algorithms may help, but not much, because all algorithms need to compare fitness values.
- A better fitness function: consider the value gap at every branch.

Rerun of the example with hill climbing
- Program; the target is st2:

      read x, y;
      if (x + y >= 10) {
          st1;
          if (x >= 7) {
              st2; // target
          }
      }
      st3;

- Start from (0, 0): f(0, 0) = 10 (value gap).
- Try (0,1), (1,0), (0,-1), (-1,0): go to (0,1) or (1,0).
- Continue until reaching (0,10): f(0,10) = 7 (value gap).
- Increase x until reaching (7,10). Done!
[Figure: control-flow graph of the program with the branches "if x+y >= 10" and "if x >= 7" and the nodes st1, st2, st3]
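
A minimal sketch of this search follows. One assumption goes beyond the slide: the fitness adds the number of branch levels still to pass to a normalized value gap, gap / (gap + 1), which is a common way to combine the two measures. The slide itself uses the raw gap. All names are hypothetical.

```java
/** Illustrative hill climbing guided by steps plus normalized value gap. */
final class BranchDistanceSearch {
    /** Maps a non-negative gap into [0, 1) so that it never outweighs a whole branch level. */
    static double normalize(double gap) {
        return gap / (gap + 1.0);
    }

    /** Fitness for reaching st2: remaining branch levels plus normalized gap; 0 means covered. */
    static double fitness(int x, int y) {
        if (x + y < 10) {
            return 1.0 + normalize(10 - (x + y)); // still outside "if (x + y >= 10)"
        }
        if (x < 7) {
            return normalize(7 - x); // inside the first branch, outside "if (x >= 7)"
        }
        return 0.0; // st2 is reached
    }

    /** Hill climbing that minimizes the fitness; returns the final {x, y}. */
    static int[] search(int startX, int startY) {
        int[][] moves = {{0, 1}, {1, 0}, {0, -1}, {-1, 0}};
        int x = startX;
        int y = startY;
        while (fitness(x, y) > 0.0) {
            int bestX = x;
            int bestY = y;
            for (int[] m : moves) {
                if (fitness(x + m[0], y + m[1]) < fitness(bestX, bestY)) {
                    bestX = x + m[0];
                    bestY = y + m[1];
                }
            }
            if (bestX == x && bestY == y) {
                break; // local optimum: no neighbor improves the fitness
            }
            x = bestX;
            y = bestY;
        }
        return new int[] {x, y};
    }
}
```

Starting from (0, 0), every step now improves the fitness, so no random walk is needed. With this neighbor order the sketch ends at (7, 9) rather than the slide's (7, 10), because (1, 9) scores slightly better than (0, 10) under the combined fitness. Both points cover st2.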

EvoSuite: demo
- A search-based software testing tool.
- Developed around 2010 by Gordon Fraser and colleagues at Saarland University, Germany.
- Uses a genetic algorithm as the optimization algorithm.
- Uses a combination of steps and value gap as the fitness function.

Process
- The Eclipse plug-in is rather unstable.
- There is a relatively stable command-line tool for Mac OS X.
- Very easy to set up:
  - Download the jar.
  - Package your own code into a jar file.
  - Run the command with options: coverage criterion, timeout, random seed.

Results
- JUnit test cases, even with comments
- Very high coverage on relatively simple examples

Systematic testing
- Targets a certain code coverage.
- Tries to understand how the code works:
  - Analyze the code structure to find a path that reaches a certain statement.
  - Analyze the code structure to find the constraints on the inputs that make the program follow that path.

Symbolic execution
- Targets better statement coverage.
- Basic idea:
  - If a statement is not covered yet, try to provide an input that reaches it.
  - A statement is covered only when a path that reaches it is covered.
  - So what input will make the program follow a given path? Only an input that satisfies all the if-conditions along the path.

Symbolic execution: example

      void CoverMe(int[] a) {
          if (a == null) return;
          if (a.Length > 0)
              if (a[0] == 1234567890)
                  throw new Exception("bug");
      }

[Figure: decision tree with the nodes a==null, a.Length>0 and a[0]==123…, each with an F and a T edge]

How to know what input to feed in
- Static symbolic execution
  - Construct a constraint for each statement; the statement is executed when the constraint is satisfied.
  - The parameters (variables) in the constraint are the inputs of the software.
  - Solve the constraint.
- The variables used in conditions may not be inputs, so they must be represented as expressions over the inputs.

Static symbolic execution: basic example
Here T is the condition for the statement to be executed, and (y=s) is the relationship of all variables to the inputs after the statement is executed.

      Statement          Condition and state
      y = read();        T (y=s), s is a symbolic variable for the input
      y = 2 * y;         T (y=2*s)
      if (y <= 12)       T (y=2*s)
        fails();         T^y<=12 (y=2*s)
      else
        success();       T^!(y<=12) (y=2*s)
      print ("OK");      T^y<=12 (y=2*s) | T^!(y<=12) (y=2*s)

Static symbolic execution: generating test cases

      Statement          Condition and state                      Test input
      y = read();        T (y=s), s is the input
      y = 2 * y;         T (y=2*s)
      if (y <= 12)       T (y=2*s)
        fails();         T^y<=12 (y=2*s)                          s<=6 -> 3
      else
        success();       T^!(y<=12) (y=2*s)                       s>6 -> 8
      print ("OK");      T^y<=12 (y=2*s) | T^!(y<=12) (y=2*s)
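
The two path conditions can be written as predicates over the symbolic input s. In the sketch below, a bounded linear search stands in for the constraint solver. That substitution is an assumption made to keep the example self-contained, and the names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

/** Path conditions of the example program, expressed over the symbolic input s. */
final class PathConditions {
    /** After "y = 2 * y", y is expressed in terms of the input s. */
    static int yOf(int s) {
        return 2 * s;
    }

    /** Path condition of the fails() branch: y <= 12, i.e. 2 * s <= 12. */
    static final IntPredicate FAILS_PATH = s -> yOf(s) <= 12;

    /** Path condition of the success() branch: !(y <= 12). */
    static final IntPredicate SUCCESS_PATH = FAILS_PATH.negate();

    /** Stand-in for a constraint solver: the first s in [lo, hi] that satisfies the condition. */
    static OptionalInt solve(IntPredicate condition, int lo, int hi) {
        for (int s = lo; s <= hi; s++) {
            if (condition.test(s)) {
                return OptionalInt.of(s);
            }
        }
        return OptionalInt.empty();
    }
}
```

Searching [0, 100] gives s = 0 for the fails() path and s = 7 for the success() path. Any other witness of s <= 6 and s > 6 would serve equally well as a test input.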

Constraint solver
- Solves constraints, i.e., provides a set of values that satisfies the constraint.
- Float values: linear programming if the constraints are all linear.
- Boolean values: the SAT problem.
- Integer values: can be reduced to SAT.
- String values: an SMT problem.
- These are mostly NP-complete problems and can be reduced from/to the SAT problem.

Problems of static symbolic execution
- Path explosion
  - Remember that n branches can cause 2^n paths.
  - Unbounded loops give infinitely many paths.
  - Computing constraints for all paths is infeasible for real software.
- Overly complex constraints
  - The constraints get very complex for large programs.
  - On top of that, solving them is NP-complete.

Chaining and goal-oriented approach
- Not all branches are actually relevant: if the outcome of a branch does not affect whether a statement is executed, we should not waste time on that branch.
- A classification of branches:
  - Critical branches: branches that must not be taken if the target is to be executed.
  - Non-essential branches: all other branches.

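Example of branch classification

      Read a, b;
      if (a > 0) {
          a = a + b;
      }
      if (b >= 10) {
          st2; // target
      }

[Figure: control-flow graph of the program; the edges of the first branch are labeled N-E (non-essential), one edge of "if b >= 10" is labeled C (critical), and the graph ends in the nodes st2 and End]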

Chaining
- Find a path that reaches the target code.
- Start with the start node and the target node: <read a,b>, <target>.
- Avoid critical branches: b >= 10.
- Choose b = 10.

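A different example

      Read a;
      b = 0;
      if (a > 0) {
          b = a + b;
      }
      if (b >= 10) {
          st2; // target
      }

[Figure: control-flow graph of the program; the edges of "if a > 0" are labeled N-E (non-essential), one edge of "if b >= 10" is labeled C (critical), and the graph ends in the nodes st2 and End]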

Chaining
- Find a path that reaches the target code.
- Start with the start node and the target node: <read a>, <target>.
- Avoid critical branches: b >= 10; no solution, because b is not an input.
- Go to the definition statements of b:
  - b = 0: no solution.
  - b = a + b: -> a + b >= 10.

Chaining: next steps
- Sequence so far: <read a>, <b=a+b, a+b>=10>, <target>.
- Avoid critical branches: a > 0, a + b >= 10.
- Go to the definitions of a and b: b = 0; read a.
- Solve the constraint: a = 10.

Chaining compared to static symbolic execution
- Advantages
  - Reduces the branches to be considered.
  - Reduces the definitions to be considered.
  - More scalable than static symbolic execution.
- Problems
  - Still not scalable enough.
  - Spends much time generating only one test case to cover a certain statement.

Dynamic symbolic execution
- Koushik Sen et al., 2005
- One of the most important advances in software engineering in the 21st century
- Basic idea:
  - Actually run the software.
  - Generate constraints as the program runs.
  - Flip constraints to reach other statements.

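Dynamic symbolic execution: example
Loop: choose next path -> solve -> execute & monitor.
Code to generate inputs for:

      void CoverMe(int[] a) {
          if (a == null) return;
          if (a.Length > 0)
              if (a[0] == 1234567890)
                  throw new Exception("bug");
      }

Each round negates the last observed condition, solves the resulting constraints, and runs the new input:

      Constraints to solve                               Data      Observed constraints
      (none, first run)                                  null      a==null
      a!=null                                            {}        a!=null && !(a.Length>0)
      a!=null && a.Length>0                              {0}       a!=null && a.Length>0 && a[0]!=1234567890
      a!=null && a.Length>0 && a[0]==1234567890          {123…}    a!=null && a.Length>0 && a[0]==1234567890

Done: There is no statement left.
[Figure: decision tree with the nodes a==null, a.Length>0 and a[0]==123…, each with an F and a T edge; the negated condition is highlighted in each round]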
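
The loop can be sketched for this one program in Java, under loud assumptions. `run` is a Java port of CoverMe that records branch outcomes instead of throwing. `solve` is a hand-written stand-in for a constraint solver that only understands these three conditions. Flipping only the last observed condition is enough here because the branches form a single chain; a real engine keeps a worklist of unexplored prefixes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative dynamic symbolic execution loop for the CoverMe example. */
final class ConcolicSketch {
    static final int MAGIC = 1234567890;

    /** Runs CoverMe on a and records the outcome of each evaluated condition,
     *  in the order: a == null, a.length > 0, a[0] == MAGIC. */
    static List<Boolean> run(int[] a) {
        List<Boolean> observed = new ArrayList<>();
        observed.add(a == null);
        if (a == null) {
            return observed;
        }
        observed.add(a.length > 0);
        if (a.length > 0) {
            observed.add(a[0] == MAGIC); // true corresponds to reaching the "bug" statement
        }
        return observed;
    }

    /** Hand-written stand-in for a solver: builds an input with the wanted outcomes. */
    static int[] solve(List<Boolean> wanted) {
        if (wanted.get(0)) {
            return null; // a == null
        }
        if (wanted.size() < 2 || !wanted.get(1)) {
            return new int[0]; // a != null && !(a.length > 0)
        }
        boolean equal = wanted.size() >= 3 && wanted.get(2);
        return new int[] {equal ? MAGIC : 0};
    }

    /** Runs an input, negates the last observed condition, solves, and repeats. */
    static List<int[]> explore() {
        List<int[]> inputs = new ArrayList<>();
        Set<List<Boolean>> explored = new HashSet<>();
        int[] input = null; // arbitrary first input, as on the slide
        while (true) {
            inputs.add(input);
            List<Boolean> observed = run(input);
            explored.add(observed);
            List<Boolean> wanted = new ArrayList<>(observed);
            int last = wanted.size() - 1;
            wanted.set(last, !wanted.get(last)); // flip the last observed condition
            input = solve(wanted);
            if (explored.contains(run(input))) {
                return inputs; // flipping leads to a path that was already executed
            }
        }
    }
}
```

`explore()` produces the same four inputs as the table above: null, an empty array, {0}, and {1234567890}.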

Dynamic symbolic execution: advantages and disadvantages
- Advantages
  - No need to analyze the whole system before testing.
  - All constraints and expressions can be evaluated along one execution path.
  - Can handle library method calls.
  - Testing happens while the analysis takes place.
- Disadvantages
  - Overhead in testing.

Review of automatic test case generation
- Random testing: rule-based, adaptive, feedback-based
- Exploration testing
- Search-based testing: optimization algorithms, fitness function
- Systematic testing: symbolic execution, chaining and goal-oriented

Review of testing
- General guidelines
- Levels of testing: unit testing, higher-level testing, GUI testing
- Test coverage: code, input, mutation
- Regression testing
- Non-functional testing: security testing, performance testing
- Test automation

Review: general guidelines
- Testing is the practical choice: the best affordable approach.
- Concepts: test case, test oracle, test suite, test driver, test script, test coverage
- Granularity: unit, integration, system, acceptance
- Type by design principle: black-box, white-box
- Black-box testing: boundary, equivalence, decision table
- White-box testing: branch coverage, complexity

Review of unit testing
- Use a unit test framework for unit testing: it does all the common things and reduces test interference.
- Inside the test framework: setUp -> test -> assert -> tearDown
- Write informative assertions.
- Write complete tear-downs.
- Dependencies are evil!
  - Dependency injection
  - Remove dependencies
  - Test doubles

Review: higher-level testing
- Integration testing
  - Strategies: big bang, top-down, bottom-up, sandwich, path-based
- System testing
  - Environment issues: building, platforms, web services, environment failures, distribution issues
- Acceptance testing
- GUI testing
  - Event-based, screen-based, CV-based
  - Record and replay

Review of test coverage
- Code coverage
  - Target: code
  - Adequacy: no -> 100% code coverage != no bugs
  - Overhead: low (instrumentation causes some overhead)
- Input combination coverage
  - Target: inputs
  - Adequacy: yes -> 100% input coverage == no bugs
  - Overhead: none
- Mutation coverage
  - Target: bugs
  - Adequacy: no -> 100% mutant coverage != no bugs
  - Overhead: very high (execution on instrumented mutated versions)

Review of regression testing
- Test prioritization: try only the most important test cases.
- Test relevant code: try the most relevant test cases.
- Record and replay: reuse the execution results of previous test cases.

Review of non-functional testing
- Performance testing: test whether the efficiency (time and space) of the software meets the requirements.
- Security testing: test whether the software is vulnerable to attacks (special invalid inputs designed to take control of the software or to reveal information from it).
