Download presentation
Presentation is loading. Please wait.
Published byGeoffrey Hubbard Modified over 9 years ago
1
Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)
2
Software testing is important Software errors cost the U.S. economy about $59.5 billion each year (0.6% of the GDP) [NIST 02] Improving testing infrastructure could save 1/3 cost [NIST 02] Software testing is costly Account for even half the total cost of software development [Beizer 90] Automated testing reduces manual testing effort Test execution: JUnit, NUnit, xUnit, etc. Test generation: Pex, AgitarOne, Parasoft Jtest, etc. Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.
3
var list = new List(); list.Add(item); var count = list.Count; var list = new List(); list.Add(item); var count = list.Count; Assert.AreEqual(1, count); } Assert.AreEqual(1, count); } Three essential ingredients: Data Method Sequence Assertions void Add() { int item = 3; void Add() { int item = 3;
4
void Add(List list, int item) { var count = list.Count; list.Add(item); Assert.AreEqual(count + 1, list.Count); } void Add(List list, int item) { var count = list.Count; list.Add(item); Assert.AreEqual(count + 1, list.Count); } Parameterized Unit Test = Unit Test with Parameters Separation of concerns Data is generated by a tool Developer can focus on functional specification
5
5 Goal: Given a program with a set of input parameters, automatically generate a set of input values that, upon execution, will exercise as many statements as possible Observations: Reachability not decidable, but approximations often good enough Encoding of functional correctness checks as assertions that reach an error statement on failure
6
Manual Brute Force Bounded Exhaustive Random …… Our context here: Dynamic Symbolic Execution Symbolic bounded-exhaustive model checking Always under-approximation
7
7 Combines concrete and symbolic execution Algorithm computes test inputs iteratively, exercising a different execution path each time Set J := ∅ (intuitively, J is set of analyzed program inputs) Loop Choose program input i ∉ J (stop if no such i can be found) Output i Execute P(i); record path condition C (in particular, C(i) holds) Set J := J ∪ C (viewing C as the set { i | C(i ) } ) End loop Does not terminate if the number of execution paths is infinite (in the presence of loops/recursion) This choice decides search order Search order decides how quick we can achieve high code coverage! Incomplete constraint-solver leads to under-approximation This choice decides search order Search order decides how quick we can achieve high code coverage! Incomplete constraint-solver leads to under-approximation
8
Code to generate inputs for: Constraints to solve a!=null a!=null && a.Length>0 a!=null && a.Length>0 && a[0]==1234567890 void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } Observed constraints a==null a!=null && !(a.Length>0) a!=null && a.Length>0 && a[0]!=1234567890 a!=null && a.Length>0 && a[0]==1234567890 Data null {} {0} {123…} a==null a.Length>0 a[0]==123… T T F T F F Execute&Monitor Solve Choose next path Done: There is no path left. Negated condition “Flipping a node”
9
There are decision procedures for individual path conditions, but… Number of potential paths grows exponentially with number of branches Reachable code not known initially Without guidance, same loop might be unfolded forever
10
public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0}) Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0}) Test input: TestLoop(0, {0}) Test input: TestLoop(0, {0})
11
Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15}) Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15}) Test input: TestLoop(90, {0}) Test input: TestLoop(90, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; }
12
public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15}) Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!? Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!?
13
public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length) Expand array size Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length) Expand array size Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15})
14
We can have infinite paths! (both length and number) Manual analysis need at least 20 loop iterations to cover the target branch Exploring all paths up to 20 loop iterations is practically infeasible: 2 20 paths We can have infinite paths! (both length and number) Manual analysis need at least 20 loop iterations to cover the target branch Exploring all paths up to 20 loop iterations is practically infeasible: 2 20 paths Test input: TestLoop(90, {15}) Test input: TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; }
15
public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Test input: TestLoop(90, {15, 15}) Test input: TestLoop(90, {15, 15}) Our solution: Prefer to flip nodes on the most promising path Prefer to flip the most promising nodes on path Use fitness function as a proxy for promising Our solution: Prefer to flip nodes on the most promising path Prefer to flip the most promising nodes on path Use fitness function as a proxy for promising Key observations: with respect to the coverage target, not all paths are equally promising for flipping nodes not all nodes are equally promising to flip Key observations: with respect to the coverage target, not all paths are equally promising for flipping nodes not all nodes are equally promising to flip
16
FF computes fitness value (distance between the current state and the goal state) Search tries to minimize fitness value [ Tracey et al. 98, Liu at al. 05, …]
17
public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x |
18
(90, {0}) 20 (90, {15}) 19 (90, {15, 0}) 19 (90, {15, 15}) 18 (90, {15, 15, 0}) 18 (90, {15, 15, 15}) 17 (90, {15, 15, 15, 0}) 17 (90, {15, 15, 15, 15}) 16 (90, {15, 15, 15, 15, 0}) 16 (90, {15, 15, 15, 15, 15}) 15 … Fitness Value (x, y) Give preference to flip a node in paths with better fitness values. We still need to address which node to flip on paths … Give preference to flip a node in paths with better fitness values. We still need to address which node to flip on paths … public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x |
19
Fitness Value public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } (90, {0}) 20 (90, {15}) flip b4 19 (90, {15, 0}) flip b2 19 (90, {15, 15}) flip b4 18 (90, {15, 15, 0}) flip b2 18 (90, {15, 15, 15}) flip b4 17 (90, {15, 15, 15, 0}) flip b2 17 (90, {15, 15, 15, 15}) flip b4 16 (90, {15, 15, 15, 15, 0}) flip b2 16 (90, {15, 15, 15, 15, 15}) flip b4 15 … (x, y) Fitness function: |110 – x | Branch b1: i < y.Length Branch b2: i >= y.Length Branch b3: y[i] == 15 Branch b4: y[i] != 15 Flipping branch node of b4 (b3) gives us average 1 (-1) fitness gain (loss) Flipping branch node of b2 (b1) gives us average 0 (0) fitness gain (loss)
20
Let p be an already explored path, and n a node on that path, with explored outgoing branch b. After (successfully) flipping n, we get path p’ that goes to node n, and then continues with a different branch b’. Define fitness gains as follows, where F(.) is the fitness value of a path. Set FGain(b) := F(p) – F(p’) Set FGain(b’) := F(p’) – F(p) Compute the average fitness gain for each program branch over time
21
Pex: Automated White-Box Test Generation tool for.NET, based on Dynamic Symbolic Execution Pex maintains global search frontier All discovered branch nodes are added to frontier Frontier may choose next branch node to flip Fully explored branch nodes are removed from frontier Pex has a default search frontier Tries to create diversity across different coverage criteria, e.g. statement coverage, branch coverage, stack traces, etc. Customizable: Other frontiers can be combined in a fair round-robin scheme
22
We implemented a new search frontier “Fitnex”: Nodes to flip are prioritized by their composite fitness value: F(p n ) – FGain(b n ), where p n is path of node n b n is explored outgoing branch of n Fitnex always picks node with lowest composite fitness value to flip. To avoid local optimal or biases, the fitness-guided strategy is combined with Pex’s search strategies
23
A collection of micro-benchmark programs routinely used by the Pex developers to evaluate Pex’s performance, extracted from real, complex C# programs Ranging from string matching like if (value.StartsWith("Hello") && value.EndsWith("World!") && value.Contains(" ")) { … } to a small parser for a Pascal-like language where the target is to create a legal program.
24
Pex with the Fitnex strategy Pex without the Fitnex strategy Pex’s previous default strategy Random a strategy where branch nodes to flip are chosen randomly in the already explored execution tree Iterative Deepening a strategy where breadth-first search is performed over the execution tree
25
#runs/iterations required to cover the target Pex w/o Fitnex: avg. improvement of factor 1.9 over Random Pex w/ Fitnex: avg. improvement of factor 5.2 over Random
26
Search strategies of other Dynamic-Symbolic-Execution approaches: Depth-first search (DART, CUTE, EXE, JPF, …) Breadth-first search (JPF) Coverage heuristics (EXE) Innermost- function first (SMART) Generational search (SAGE) Control-flow guided (Crest, JPF, …) Random (Crest, JPF, …) None strongly guided towards covering state-dependent test targets in the form of conditional branches, which is what Pex does with Fitnex.
28
Dynamic Symbolic Execution needs guidance Fitness – a practical guide Fitness values of explored paths Fitness gains of branches’ past flipping Evaluation results show the effectiveness of the new Fitnex strategy Fitnex has been integrated in Pex’ default strategy Pex is available for academic use and commercial evaluation
29
http://research.microsoft.com/pex http://www.codeplex.com/Pex https://sites.google.com/site/asergrp/
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.