Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)

 Software testing is important  Software errors cost the U.S. economy about $59.5 billion each year (0.6% of the GDP) [NIST 02]  Improving testing infrastructure could save 1/3 cost [NIST 02]  Software testing is costly  Account for even half the total cost of software development [Beizer 90]  Automated testing reduces manual testing effort  Test execution: JUnit, NUnit, xUnit, etc.  Test generation: Pex, AgitarOne, Parasoft Jtest, etc.  Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.

var list = new List(); list.Add(item); var count = list.Count; var list = new List(); list.Add(item); var count = list.Count; Assert.AreEqual(1, count); } Assert.AreEqual(1, count); }  Three essential ingredients:  Data  Method Sequence  Assertions void Add() { int item = 3; void Add() { int item = 3;

void Add(List list, int item) { var count = list.Count; list.Add(item); Assert.AreEqual(count + 1, list.Count); } void Add(List list, int item) { var count = list.Count; list.Add(item); Assert.AreEqual(count + 1, list.Count); }  Parameterized Unit Test = Unit Test with Parameters  Separation of concerns  Data is generated by a tool  Developer can focus on functional specification

5 Goal: Given a program with a set of input parameters, automatically generate a set of input values that, upon execution, will exercise as many statements as possible Observations:  Reachability not decidable, but approximations often good enough  Encoding of functional correctness checks as assertions that reach an error statement on failure

 Manual  Brute Force  Bounded Exhaustive  Random …… Our context here: Dynamic Symbolic Execution  Symbolic bounded-exhaustive model checking  Always under-approximation

7  Combines concrete and symbolic execution  Algorithm computes test inputs iteratively, exercising a different execution path each time Set J := ∅ (intuitively, J is set of analyzed program inputs) Loop Choose program input i ∉ J (stop if no such i can be found) Output i Execute P(i); record path condition C (in particular, C(i) holds) Set J := J ∪ C (viewing C as the set { i | C(i ) } ) End loop Does not terminate if the number of execution paths is infinite (in the presence of loops/recursion) This choice decides search order Search order decides how quick we can achieve high code coverage! Incomplete constraint-solver leads to under-approximation This choice decides search order Search order decides how quick we can achieve high code coverage! Incomplete constraint-solver leads to under-approximation

Code to generate inputs for: Constraints to solve a!=null a!=null && a.Length>0 a!=null && a.Length>0 && a[0]==1234567890 void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } Observed constraints a==null a!=null && !(a.Length>0) a!=null && a.Length>0 && a[0]!=1234567890 a!=null && a.Length>0 && a[0]==1234567890 Data null {} {0} {123…} a==null a.Length>0 a[0]==123… T T F T F F Execute&Monitor Solve Choose next path Done: There is no path left. Negated condition “Flipping a node”

There are decision procedures for individual path conditions, but…  Number of potential paths grows exponentially with number of branches  Reachable code not known initially  Without guidance, same loop might be unfolded forever

public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0}) Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0}) Test input: TestLoop(0, {0}) Test input: TestLoop(0, {0})

Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15}) Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15}) Test input: TestLoop(90, {0}) Test input: TestLoop(90, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; }

public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15}) Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!? Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!?

public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)  Expand array size Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)  Expand array size Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15})

We can have infinite paths! (both length and number) Manual analysis  need at least 20 loop iterations to cover the target branch Exploring all paths up to 20 loop iterations is practically infeasible: 2 20 paths We can have infinite paths! (both length and number) Manual analysis  need at least 20 loop iterations to cover the target branch Exploring all paths up to 20 loop iterations is practically infeasible: 2 20 paths Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; }

public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Test input: TestLoop(90, {15, 15}) Test input: TestLoop(90, {15, 15}) Our solution:  Prefer to flip nodes on the most promising path  Prefer to flip the most promising nodes on path  Use fitness function as a proxy for promising Our solution:  Prefer to flip nodes on the most promising path  Prefer to flip the most promising nodes on path  Use fitness function as a proxy for promising Key observations: with respect to the coverage target,  not all paths are equally promising for flipping nodes  not all nodes are equally promising to flip Key observations: with respect to the coverage target,  not all paths are equally promising for flipping nodes  not all nodes are equally promising to flip

 FF computes fitness value (distance between the current state and the goal state)  Search tries to minimize fitness value [ Tracey et al. 98, Liu at al. 05, …]

public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x |

(90, {0}) 20 (90, {15}) 19 (90, {15, 0}) 19 (90, {15, 15}) 18 (90, {15, 15, 0}) 18 (90, {15, 15, 15}) 17 (90, {15, 15, 15, 0}) 17 (90, {15, 15, 15, 15}) 16 (90, {15, 15, 15, 15, 0}) 16 (90, {15, 15, 15, 15, 15}) 15 … Fitness Value (x, y) Give preference to flip a node in paths with better fitness values. We still need to address which node to flip on paths … Give preference to flip a node in paths with better fitness values. We still need to address which node to flip on paths … public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x |

Fitness Value public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } (90, {0}) 20 (90, {15})  flip b4 19 (90, {15, 0})  flip b2 19 (90, {15, 15})  flip b4 18 (90, {15, 15, 0})  flip b2 18 (90, {15, 15, 15})  flip b4 17 (90, {15, 15, 15, 0})  flip b2 17 (90, {15, 15, 15, 15})  flip b4 16 (90, {15, 15, 15, 15, 0})  flip b2 16 (90, {15, 15, 15, 15, 15})  flip b4 15 … (x, y) Fitness function: |110 – x | Branch b1: i < y.Length Branch b2: i >= y.Length Branch b3: y[i] == 15 Branch b4: y[i] != 15 Flipping branch node of b4 (b3) gives us average 1 (-1) fitness gain (loss) Flipping branch node of b2 (b1) gives us average 0 (0) fitness gain (loss)

 Let p be an already explored path, and n a node on that path, with explored outgoing branch b.  After (successfully) flipping n, we get path p’ that goes to node n, and then continues with a different branch b’.  Define fitness gains as follows, where F(.) is the fitness value of a path.  Set FGain(b) := F(p) – F(p’)  Set FGain(b’) := F(p’) – F(p)  Compute the average fitness gain for each program branch over time

 Pex: Automated White-Box Test Generation tool for.NET, based on Dynamic Symbolic Execution  Pex maintains global search frontier  All discovered branch nodes are added to frontier  Frontier may choose next branch node to flip  Fully explored branch nodes are removed from frontier  Pex has a default search frontier  Tries to create diversity across different coverage criteria, e.g. statement coverage, branch coverage, stack traces, etc.  Customizable: Other frontiers can be combined in a fair round-robin scheme

We implemented a new search frontier “Fitnex”:  Nodes to flip are prioritized by their composite fitness value: F(p n ) – FGain(b n ), where  p n is path of node n  b n is explored outgoing branch of n  Fitnex always picks node with lowest composite fitness value to flip.  To avoid local optimal or biases, the fitness-guided strategy is combined with Pex’s search strategies

A collection of micro-benchmark programs routinely used by the Pex developers to evaluate Pex’s performance, extracted from real, complex C# programs Ranging from string matching like if (value.StartsWith("Hello") && value.EndsWith("World!") && value.Contains(" ")) { … } to a small parser for a Pascal-like language where the target is to create a legal program.

 Pex with the Fitnex strategy  Pex without the Fitnex strategy  Pex’s previous default strategy  Random  a strategy where branch nodes to flip are chosen randomly in the already explored execution tree  Iterative Deepening  a strategy where breadth-first search is performed over the execution tree

#runs/iterations required to cover the target Pex w/o Fitnex: avg. improvement of factor 1.9 over Random Pex w/ Fitnex: avg. improvement of factor 5.2 over Random

Search strategies of other Dynamic-Symbolic-Execution approaches:  Depth-first search (DART, CUTE, EXE, JPF, …)  Breadth-first search (JPF)  Coverage heuristics (EXE)  Innermost- function first (SMART)  Generational search (SAGE)  Control-flow guided (Crest, JPF, …)  Random (Crest, JPF, …) None strongly guided towards covering state-dependent test targets in the form of conditional branches, which is what Pex does with Fitnex.

 Dynamic Symbolic Execution needs guidance  Fitness – a practical guide  Fitness values of explored paths  Fitness gains of branches’ past flipping  Evaluation results show the effectiveness of the new Fitnex strategy  Fitnex has been integrated in Pex’ default strategy  Pex is available for academic use and commercial evaluation

http://research.microsoft.com/pex http://www.codeplex.com/Pex https://sites.google.com/site/asergrp/

Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)

Similar presentations

Presentation on theme: "Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)

Similar presentations

Presentation on theme: "Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)"— Presentation transcript:

Similar presentations

About project

Feedback