Automation of Test Case Generation

Name: Automation of Test Case Generation
Uploaded: 2017-10-15T23:44:41+00:00
Duration: PTM33S34
Channel: Roland Dravis
Description: Automation of Test Case Generation

Testing is expensive
We have talked about three different kinds of coverage.
However, these three kinds of coverage are mainly used as measures of progress, not as targets to be achieved.
Why? The most used targets are "money coverage" and "time coverage"…

Testing is expensive: numbers
How many lines of test code? About 40% of the Microsoft code base.
How much is spent each year on software testing? It is not easy even to estimate the time and effort spent on testing inside an IT company (recall the words of Bill Gates). The software testing market (outsourced testing) was about $17B in 2012.
How many new testers? More than 30K each year in the US recently, and well over 150K each year globally.

Automation is the way to reduce cost
Good news or bad news: there is no way to fully automate testing in the foreseeable future.
Three phases:
  Execution automation. Mostly done for unit testing (xUnit); somewhat done for system testing (scripts).
  Test case generation automation. Quite a hot topic in academia; great progress has been made recently.
  Test oracle check automation. Some good trials.

Automating different steps
Automating test execution:
  Unit testing frameworks.
  Record and replay for GUI testing.
  Scripts for command-line tools and configuration testing.
Automating test case generation (this lecture).

Automatic test case generation
Random testing: pure random, rule-based, combinatorial, feedback-guided, exploration testing.
Systematic testing: symbolic execution, chaining.
Search-based testing.

Random Testing
Randomly generate inputs to feed into the software.
It is the easiest way to do automatic test case generation: it requires no preparation and is easy to implement.
Problem: semantically redundant inputs. E.g., for a simple program that computes 10/x, every input except 0 exercises the same behavior.
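
The 10/x example can be sketched in a few lines. This is a minimal illustration, not a tool from the lecture: the function names, the input range [-5, 5], and the choice of ZeroDivisionError as the failure signal are all assumptions made for the sketch.

```python
import random


def divide_ten_by(x):
    """Program under test (hypothetical): the only failing input is x == 0."""
    return 10 / x


def random_test(program, trials, rng, low=-5, high=5):
    """Feed random integers in [low, high] to program and collect failing inputs."""
    failures = set()
    for _ in range(trials):
        x = rng.randint(low, high)
        try:
            program(x)
        except ZeroDivisionError:
            failures.add(x)
    return failures
```

Calling random_test(divide_ten_by, 1000, random.Random(0)) spends almost all of its 1000 executions on inputs that behave identically, which is the redundancy the slide points out.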

Rule-based Random Testing
Try to reduce redundancy.
Use rules: try boundary cases such as 0, 1, -1, and other known boundaries.
Use distributions: generate random values following a certain distribution. This is very effective if the distribution of the input is known, e.g., scores of a final exam.
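
A possible sketch of both rules together is shown here. The boundary list (32-bit limits), the normal distribution with mean 70 and standard deviation 15, and the clamping to [0, 100] are assumptions chosen for the exam-score example; they do not come from the lecture.

```python
import random

# Assumed boundary cases: the slide's 0, 1, -1 plus hypothetical 32-bit limits.
BOUNDARY_VALUES = [0, 1, -1, 2**31 - 1, -2**31]


def rule_based_inputs(count, rng, mean=70.0, stddev=15.0):
    """Return boundary cases first, then values drawn from an assumed
    normal distribution of exam scores, clamped to [0, 100]."""
    inputs = list(BOUNDARY_VALUES[:count])
    while len(inputs) < count:
        score = round(rng.gauss(mean, stddev))
        inputs.append(max(0, min(100, score)))
    return inputs
```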

Combinatorial Testing
Recall input combination coverage.
Try to achieve better coverage until reaching 100%.
Add a new test case that covers uncovered value combinations.
Give priority to the test case that covers the most new values.
For multiple features of a given input, use constraint solvers to generate a value that satisfies feature combinations (e.g., length = 2 && starts-with A).
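
The "add the test case that covers the most uncovered combinations" step can be sketched as a greedy loop. This sketch assumes pairwise (2-way) combinations and enumerates every full combination as a candidate, which only works for small inputs. The function and parameter names are hypothetical.

```python
from itertools import combinations, product


def pairs_of(test, names):
    """All pairs of (parameter, value) assignments covered by one test case."""
    return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}


def greedy_pairwise(parameters):
    """Repeatedly add the candidate covering the most still-uncovered pairs."""
    names = sorted(parameters)
    candidates = [dict(zip(names, values))
                  for values in product(*(parameters[n] for n in names))]
    uncovered = set().union(*(pairs_of(c, names) for c in candidates))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c, names) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best, names)
    return suite
```

For three parameters with two values each there are 8 full combinations, but the greedy suite needs fewer test cases to cover all 12 value pairs.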

Adaptive random testing
Intuition: bug-revealing inputs form patterns.
  Strips: the bug is revealed when x + y = 4.
  Blocks: the bug is revealed when x > 3 && y < 4.
Trying two very similar test cases does not make much sense.

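Adaptive random testing
Approach: when choosing the next test case, choose the candidate that has the largest minimum distance from the existing test cases.
If {t1, t2, …, tn} are the existing test cases, choose ch so that $\min_{1 \le i \le n} \mathrm{dist}(c_h, t_i) \ge \min_{1 \le i \le n} \mathrm{dist}(c_j, t_i)$ for every other candidate cj.
Example: one input, int16 x. T1: -20, T2: 32767, T3: …, T4: 16394, T5: …, T6: …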
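
The selection rule can be sketched as follows for a single integer input. Drawing a fixed-size pool of random candidates for each step is an assumption made here (the slide does not say where candidates come from), as are the function names and the use of absolute difference as the distance.

```python
import random


def next_adaptive_test(existing, candidates):
    """Pick the candidate whose minimum distance to the existing tests is largest."""
    return max(candidates, key=lambda c: min(abs(c - t) for t in existing))


def adaptive_random_tests(count, rng, pool_size=10, low=-32768, high=32767):
    """Generate int16 test inputs, each chosen from a fresh random candidate pool."""
    tests = [rng.randint(low, high)]
    while len(tests) < count:
        pool = [rng.randint(low, high) for _ in range(pool_size)]
        tests.append(next_adaptive_test(tests, pool))
    return tests
```

For instance, with existing tests -20 and 32767 and candidates -32768, 0 and 16394, the minimum distances are 32748, 20 and 16373, so -32768 is chosen.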

Feedback-Guided Random Testing
Question: it is easy to generate random values for primitive types, but how do we generate a random object?
You may generate a random object with id = -1, content = "", child = null.
That may not make sense: for a given class, id may always be positive, content may never be empty, and child may never be null.

Consequence: randomly generated objects may trigger a lot of failures, but they are not real bugs.
    class BugClass {
        int id;
        String content;
        Object child;

        public BugClass(String str) {
            this.id = Global.id++;
            this.content = "bug:" + str;
            this.child = Bug.createDefaultChild();
        }

        // Renamed from do(), which is a reserved word in Java.
        public void doWork() {
            String type = Global.type[this.id];          // id = -1: array index < 0!
            String content = this.content.substring(4);  // content = "": index out of string length!
            this.child.doWork();                         // child = null: null pointer dereference!
        }
    }

Solution: start with the methods that take only primitive types.
When an object of a class is generated, reuse that object in the following test cases.
Do not use duplicate or null objects.
Do not use objects whose generation raised an exception.
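
These rules can be sketched with a small object pool. The Counter class, its merged_with method, and the 50/50 choice between the two ways of building an object are hypothetical stand-ins. A real feedback-guided tool works on method-call sequences of the class under test rather than on one hard-coded class.

```python
import random


class Counter:
    """Hypothetical class under test: start must be non-negative."""

    def __init__(self, start):
        if start < 0:
            raise ValueError("start must be non-negative")
        self.value = start

    def merged_with(self, other):
        return Counter(self.value + other.value)

    def __eq__(self, other):
        return isinstance(other, Counter) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def feedback_guided_pool(rounds, rng):
    """Grow a pool of valid objects, using earlier results as later arguments."""
    pool = []
    for _ in range(rounds):
        try:
            if not pool or rng.random() < 0.5:
                obj = Counter(rng.randint(-3, 3))  # primitive arguments first
            else:
                # Reuse previously generated objects as arguments.
                obj = rng.choice(pool).merged_with(rng.choice(pool))
        except ValueError:
            continue  # discard objects whose construction raised an exception
        if obj is not None and obj not in pool:  # discard null and duplicate objects
            pool.append(obj)
    return pool
```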

Exploration Testing
Very commonly used in testing GUI and web software.
Explore the user interface: click everything clickable and feed random values into input boxes.
It can often achieve good coverage.
One problem is the login part, or any input that requires some semantics.

Exploration Testing: example
[Figure: window graph of an example application, with the windows Menu Window, About, Play, Records, Setting, Audio Setting, Select Level, Login, Game Setting, Play, Confirm]

Exploration Testing
Return to the last window:
  Return button, "go back", …: not always available.
  Record the state of the software at the previous window: sometimes high overhead.
Usually requires instrumentation of the code:
  To check outputs (sent to the screen) with oracles.
  To tell the explorer that the window is ready (the drawing of the window is done).

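Exploration Testing: signal-based exploration testing
[Figure: sequence diagram between the driver and the software under test. The driver sends a click event; the event handler finishes; the driver analyzes the new GUI and sends the next click event; the event handler finishes.]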

Tool Introduction - Randoop
Feedback-based random testing.
Joint work by MIT and Microsoft Research.
Among the most robust and effective random testing tools for Java.
Available both as a stand-alone tool and as an Eclipse plug-in.
Tool website:

Tool Introduction - Randoop
Process:
  Install from the Eclipse update site:
  Right-click on the class to be tested and choose it as Randoop input.
Configuration:
  How long Randoop should run.
  Size of test methods.
  Timeout of certain thread.
  Number of test methods per file.

Tool Introduction - Randoop
Results:
  Add the supporting JAR file.
  Use JUnit to re-execute the test suite.
  Remove some invalid test cases.
  Use EclEmma to check the results.

Search-based Testing
Treat test case generation as an optimization problem.
Somewhere between random testing and systematic testing.
Based on random testing, with a focus on the input domain.
Uses code coverage as guidance.

Optimization
Try to find the maximal or minimal value of a certain function.
Numerous practical problems can be viewed as optimization problems:
  Least cost to travel to a number of cities.
  Fewest cameras to cover an area.
  Distribution of stores to attract the most customers.
  Design of pipe systems with the least material.
  Putting items into a backpack (with limited volume) to get the highest value.

Illustration of Optimization
[Figure: a function curve with a global peak and a local peak]

Solution of Optimization
Hill climbing:
  Start from a random point.
  Try all neighboring points and go to the point with the highest value, until all neighboring points have a value lower than the current point.
  It easily ends at a local peak.
Random-restart hill climbing:
  Restart hill climbing many times to avoid local peaks.
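
Both variants can be sketched on a one-dimensional integer domain. The two_peaks landscape (local peak at x = 2, global peak at x = 8) is a made-up example, and stopping when no neighbor is strictly higher is one possible reading of the stopping rule above.

```python
import random


def hill_climb(f, start, low, high):
    """Climb over the integers in [low, high] until no neighbor is strictly higher."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if low <= n <= high]
        best = max(neighbors, key=f)
        if f(best) <= f(x):
            return x
        x = best


def random_restart(f, restarts, rng, low, high):
    """Run hill climbing from several random starts and keep the highest peak."""
    peaks = [hill_climb(f, rng.randint(low, high), low, high)
             for _ in range(restarts)]
    return max(peaks, key=f)


def two_peaks(x):
    """Hypothetical landscape: local peak at x = 2 (height 3), global peak at x = 8 (height 6)."""
    return max(3 - abs(x - 2), 6 - abs(x - 8))
```

Starting at 0, hill_climb stops at the local peak x = 2; starting at 4, it reaches the global peak x = 8. Restarting from several random points makes it very likely that at least one run starts in the basin of the global peak.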

Solution of Optimization
Simulated annealing:
  An adaptation of hill climbing.
  Has a probability of moving on after reaching a local peak.
  The probability drops as time goes by.
Genetic algorithm:
  Simulates the process of evolution.
  Start with random points.
  Select a number of the best points.
  Combine and mutate these points.
  Repeat until no more improvement can be made.

Transform Testing to Optimization
Using code coverage as the criterion: try to generate a test case that covers a certain code element (method, statement, branch, …).
Measure how well we have solved the problem.
A simple fitness function: how far the already covered elements are from the target code element.
Try to make the distance 0.

Go from random testing
Use a list of random test cases as the starting point.
Each test case is a point in the input domain.
Use various optimization algorithms to find the best test case.

An example with hill climbing
Code under test:
    read x, y;
    if (x + y >= 10) {
        st1;
        if (x >= 7) {
            st2; // target
        }
    }
    st3;
[Figure: flow chart of the code above, with Y/N edges at the branches x + y >= 10 and x >= 7]
Target is st2.
Start from (0, 0): f(0, 0) = 2 steps.
Try (0, 1), (1, 0), (0, -1), (-1, 0): no improvement, so go in all directions equally.
Until reaching (5, 5): f(5, 5) = 1 step.
Increase x and y until reaching (7, 5). Done!

Problem
The fitness function is too simple.
It requires a lot of random walking before getting to the right place, (5, 5).
Better algorithms may help, but not much, because all algorithms need to compare the values of the fitness function.
A better fitness function: consider the value gap at all branches.

Rerun example with hill climbing
Code under test:
    read x, y;
    if (x + y >= 10) {
        st1;
        if (x >= 7) {
            st2; // target
        }
    }
    st3;
[Figure: flow chart of the code above, with Y/N edges at the branches x + y >= 10 and x >= 7]
Target is st2.
Start from (0, 0): f(0, 0) = 10 value gap.
Try (0, 1), (1, 0), (0, -1), (-1, 0): go to (0, 1) or (1, 0).
Until reaching (0, 10): f(0, 10) = 7 value gap.
Increase x until reaching (7, 10). Done!
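
The rerun can be sketched in code. One assumption is needed to make the climb monotone: the value gap alone rises from 1 at (0, 9) to 7 at (0, 10), so this sketch compares fitness as a pair (branches still to pass, value gap at the first failing branch), ordered lexicographically. The pair and the function names are choices made for this sketch, not something the slides spell out.

```python
def fitness(x, y):
    """(branches still to pass, value gap at the first failing branch); (0, 0) hits st2."""
    if x + y < 10:
        return (2, 10 - (x + y))   # gap at the branch x + y >= 10
    if x < 7:
        return (1, 7 - x)          # gap at the branch x >= 7
    return (0, 0)


def climb_to_target(x, y):
    """Move to the best of the four neighbors while it strictly improves fitness."""
    path = [(x, y)]
    while fitness(x, y) != (0, 0):
        neighbors = [(x, y + 1), (x + 1, y), (x, y - 1), (x - 1, y)]
        best = min(neighbors, key=lambda p: fitness(*p))
        if fitness(*best) >= fitness(x, y):
            break                  # stuck: no neighbor is better
        x, y = best
        path.append(best)
    return path
```

With this tie-breaking the climb from (0, 0) goes up to (0, 9), steps to (1, 9), and then increases x until it reaches (7, 9). The exact route depends on how ties between neighbors are broken, so it can differ from the route on the slide, but every step makes progress and no random walk is needed.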

EvoSuite: Demo
A search-based software testing tool.
Work from around 2010 by Gordon Fraser and colleagues at Saarland University, Germany.
Uses a genetic algorithm as the optimization algorithm.
Uses a combination of steps + value gap as the fitness function.

Process
Their Eclipse plug-in is rather unstable.
There is a relatively stable command-line tool for Mac OS X.
Very easy to set up:
  Download the JAR.
  Package your own code into a JAR file.
  Run the command with options: coverage criterion, timeout, random seeds.

Results
JUnit test cases, even with comments.
Very high coverage on relatively simple examples.

Systematic testing
Target a certain code coverage.
Try to understand how the code works.
Analyze the code structure to find a path that reaches a certain statement.
Analyze the code structure to find the constraints on the inputs that make the program follow the path.

Symbolic execution
Targets better statement coverage.
Basic idea:
  If a statement is not covered yet, try to provide an input that reaches that statement.
  A statement is covered only when a path that reaches it is covered.
  Then, what input will cause the program to go through a certain path?
  Only an input that satisfies all the if-conditions along the path!

Symbolic execution: example
    void CoverMe(int[] a) {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 123…)
                throw new Exception("bug");
    }
[Figure: decision tree with the branch nodes a==null, a.Length>0 and a[0]==123…, each with an F and a T outcome]

How to know what input to feed in
Static symbolic execution:
  Construct a constraint for each statement; the statement will be executed when the constraint is satisfied.
  The parameters (variables) in the constraint are the inputs of the software.
  Solve the constraint.
The variables used in conditions may not be inputs, so they must be represented by expressions over the inputs.

Static symbolic execution
Here T is the condition for the statement to be executed, and (y=s) is the relationship of all variables to the inputs after the statement is executed.
Basic example:
    Statement        Condition                  Variables
    y = read();      T                          (y=s), s is a symbolic variable for the input
    y = 2 * y;       T                          (y=2*s)
    if (y <= 12)     T                          (y=2*s)
      fails();       T^y<=12                    (y=2*s)
    else
      success();     T^!(y<=12)                 (y=2*s)
    print("OK");     T^y<=12 | T^!(y<=12)       (y=2*s)

Static symbolic execution
Generating test cases:
    Statement        Condition                  Variables                  Input constraint -> test input
    y = read();      T                          (y=s), s is the input
    y = 2 * y;       T                          (y=2*s)
    if (y <= 12)     T                          (y=2*s)
      fails();       T^y<=12                    (y=2*s)                    s<=6 -> 3
    else
      success();     T^!(y<=12)                 (y=2*s)                    s>6 -> 8
    print("OK");     T^y<=12 | T^!(y<=12)       (y=2*s)
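
The last step, turning each path condition into a concrete input, can be sketched with a toy solver. The brute-force search over a small integer range stands in for a real constraint solver and is purely an assumption of this sketch, as are the function names. The conditions are the ones from the example with y = 2*s already substituted.

```python
def path_conditions():
    """Path conditions over the symbolic input s, after substituting y = 2*s."""
    return {
        "fails": lambda s: 2 * s <= 12,
        "success": lambda s: not (2 * s <= 12),
    }


def solve(condition, search_range=range(-100, 101)):
    """Toy solver: return the first integer in search_range that satisfies condition."""
    for s in search_range:
        if condition(s):
            return s
    return None


def generate_tests():
    """One concrete input per target statement."""
    return {target: solve(cond) for target, cond in path_conditions().items()}
```

Any value satisfying s <= 6 is a valid test for fails() and any value satisfying s > 6 is a valid test for success(); which witness comes back depends only on the search order.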

Constraint Solver
Solve constraints -> provide a set of values satisfying the constraints.
  Float values: linear programming, if the constraints are all linear.
  Boolean values: SAT problem.
  Integer values: can be reduced to SAT.
  String values: SMT problem.
Mostly NP-complete problems, which can be reduced from/to the SAT problem.

Problems of static symbolic execution
Path explosion:
  Remember that n branches will cause 2^n paths.
  Infinite paths for unbounded loops.
  Calculating constraints on all paths is infeasible for real software.
Overly complex constraints:
  The constraint gets very complex for large programs.
  Not to mention that solving it is NP-complete.

Chaining and Goal-oriented Approach
Not all branches are actually relevant.
If the outcome of a branch does not affect whether a statement will be executed, we should not waste time on that branch.
A classification of branches:
  Critical branches: branches that must not be taken if the target is to be executed.
  Non-essential branches: all other branches.

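Example of branch classification
    Read a, b;
    if (a > 0) {
        a = a + b;
    }
    if (b >= 10) {
        st2; // target
    }
[Figure: flow graph Read a, b -> if a > 0 -> a = a+b -> if b >= 10 -> st2 -> End. The edges at "if a > 0" and "a = a+b" are labeled N-E (non-essential); the edge at "if b >= 10" is labeled C (critical).]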

Chaining
Find a path to reach the target code.
Start with the start node and the target node: <read a,b>, <target>.
Avoid critical branches: b >= 10.
Choose b = 10.

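A different example
    Read a;
    b = 0;
    if (a > 0) {
        b = a + b;
    }
    if (b >= 10) {
        st2; // target
    }
[Figure: flow graph Read a -> b = 0 -> if a > 0 -> b = a+b -> if b >= 10 -> st2 -> End. The edges at "if a > 0" and "b = a+b" are labeled N-E (non-essential); the edge at "if b >= 10" is labeled C (critical).]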

Chaining
Find a path to reach the target code.
Start with the start node and the target node: <read a>, <target>.
Avoid critical branches: b >= 10, no solution.
Go to the definition statements of b:
  b = 0; no solution.
  b = a + b; -> a + b >= 10.

Chaining: next steps
<read a>, <b=a+b, a+b>=10>, <target>.
Avoid critical branches: a > 0, a + b >= 10.
Go to the definitions of a and b: b = 0; read a; -> solve the constraint: a = 10.

Chaining
Compared to static symbolic execution:
  Reduces the branches to be considered.
  Reduces the definitions to be considered.
  More scalable than static symbolic execution.
Problems:
  Still not scalable enough.
  Spends much time generating only one test case to cover a certain statement.

Dynamic symbolic execution
Koushik Sen et al., 2005.
One of the most important advances in software engineering in the 21st century.
Basic idea:
  Actually run the software.
  Generate constraints as the program runs.
  Flip constraints to reach other statements.

Dynamic symbolic execution: example
Loop: choose next path -> solve -> execute & monitor.
Code to generate inputs for:
    void CoverMe(int[] a) {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 123…)
                throw new Exception("bug");
    }
    Constraints to solve            Data      Observed constraints
    (none)                          null      a==null
    a!=null                         {}        a!=null && !(a.Length>0)
    a!=null && a.Length>0           {0}       a.Length>0 && a[0]!=123…
    a.Length>0 && a[0]==123…        {123…}    a[0]==123…
Each new constraint to solve is built from a negated condition of the previous observation.
[Figure: decision tree with the branch nodes a==null, a.Length>0 and a[0]==123…, each with an F and a T outcome]
Done: there is no statement left.
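
The run, record, flip, solve loop can be sketched on a much smaller program. To stay self-contained, this sketch uses the y = 2*s example from the static symbolic execution slides instead of CoverMe, represents each branch condition as a predicate over the input, and uses brute-force search where a real tool would call a constraint solver. All names here are hypothetical.

```python
def run_and_record(s):
    """Run the program concretely on s and record each branch condition
    (as a predicate over the input) with the outcome it had in this run."""
    trace = []
    y = 2 * s
    trace.append((lambda v: 2 * v <= 12, y <= 12))
    outcome = "fails" if y <= 12 else "success"
    return outcome, trace


def solve(constraints, search_range=range(-100, 101)):
    """Toy solver: first integer for which every predicate has the wanted outcome."""
    for s in search_range:
        if all(pred(s) == wanted for pred, wanted in constraints):
            return s
    return None


def explore(first_input):
    """Execute, then flip each recorded branch to obtain inputs for new paths."""
    covered, worklist, seen = {}, [first_input], set()
    while worklist:
        s = worklist.pop()
        outcome, trace = run_and_record(s)
        covered.setdefault(outcome, s)
        seen.add(tuple(taken for _, taken in trace))
        for i, (pred, taken) in enumerate(trace):
            flipped = trace[:i] + [(pred, not taken)]  # keep the prefix, negate branch i
            key = tuple(t for _, t in flipped)
            if key in seen:
                continue
            seen.add(key)
            new_input = solve(flipped)
            if new_input is not None:
                worklist.append(new_input)
    return covered
```

Starting from the input 0, the first run observes y <= 12; negating that condition and solving gives 7, and the second run reaches success(). After that every path has been seen and the loop stops.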

Advantages:
  No need to analyze the whole system before performing testing.
  All constraints and expressions can be evaluated along one execution path.
  Can handle library method calls.
  Testing takes place as the analysis proceeds.
Disadvantages:
  Overhead in testing.

Review of automatic test case generation
Random testing: rule-based, adaptive, feedback-based, exploration.
Search-based testing: optimization algorithms, fitness function.
Systematic testing: symbolic execution, chaining and goal-oriented.

Review of testing
General guidelines.
Levels of testing: unit testing, higher-level testing, GUI testing.
Test coverage: code, input, mutation.
Regression testing.
Non-functional testing: security testing, performance testing.
Test automation.

Review: General guidelines
Testing is the practical choice: the best affordable approach.
Concepts: test case, test oracle, test suite, test driver, test script, test coverage.
Granularity: unit, integration, system, acceptance.
Types by design principle: black-box, white-box.
Black-box testing: boundary, equivalence, decision table.
White-box testing: branch coverage, complexity.

Review of Unit Testing
Use a unit test framework to do unit testing: it does all the common things and reduces test interference.
Inside the test framework: setUp -> test -> assert -> tearDown.
Write informative assertions.
Write complete tear-downs.
Dependencies are evil!
  Dependency injection.
  Remove dependencies.
  Test doubles.

Review: Higher level testing
Integration testing. Strategies: big bang, top-down, bottom-up, sandwich, path-based.
System testing. Environment issues: building, platforms, web services, environment failures, distribution issues.
Acceptance testing.
GUI testing: event-based, screen-based, CV-based; record and replay.

Review of test coverage
Code coverage. Target: code. Adequacy: no (100% code coverage != no bugs). Overhead: low (instrumentation causes some overhead).
Input combination coverage. Target: inputs. Adequacy: yes (100% input coverage == no bugs). Overhead: none.
Mutation coverage. Target: bugs. Adequacy: no (100% mutant coverage != no bugs). Overhead: very high (execution on instrumented mutated versions).

Review of Regression Testing
Test prioritization: try only the most important test cases.
Test relevant code: try the most relevant test cases.
Record and replay: reuse the execution results of previous test cases.

Review of Non-Functional Testing
Performance testing: test whether the efficiency (time and space) of the software meets the requirements.
Security testing: test whether the software is vulnerable to attacks (special invalid inputs designed to control the software or to reveal information from it).
