Korat: Automated Testing Based on Java Predicates Chandrasekhar Boyapati 1, Sarfraz Khurshid 2, and Darko Marinov 3 1 University of Michigan Ann Arbor 2 University of Texas at Austin 3 University of Illinois at Urbana-Champaign
Examples of Structurally Complex Data red-black tree T1 A1 T2 A2 T3 A3 XML document
Testing Setup examples of code under test –abstract data type input/output: data structure –XML processing program input/output: XML document code test generator testing oracle pass fail inputsoutputs
Manual Test Generation drawbacks of manual generation –labor-intensive and expensive –vulnerable to systematic bias that eliminates certain kinds of inputs code test generator manual testing oracle pass fail inputsoutputs
Automated Test Generation challenges of automated test generation –describing test inputs –(efficient) test generation –checking output code test generator automated testing oracle pass fail inputsoutputs
Running Example class BST { Node root; int size; static class Node { Node left, right; int value; } B 0 : 3 root N 0 : 2 left right N 1 : 1N 2 : 3 void remove(int i) { … } …
Example Valid Inputs trees with exactly 3 nodes right N 0 : 1 N 1 : 2 N 2 : 3 B 0 : 3 root right left N 0 : 1 N 1 : 3 N 2 : 2 B 0 : 3 root left right N 0 : 3 N 1 : 1 N 2 : 2 B 0 : 3 root left N 0 : 3 N 1 : 2 N 2 : 1 B 0 : 3 root left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root
Example Invalid Inputs object graphs violating some validity property left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 2 root left right N 0 : 3 N 1 : 1N 2 : 2 B 0 : 3 root
Validity Properties for Example underlying graph is a tree (no sharing between subtrees) correct number of nodes reachable from root node values ordered for binary search left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root
Describing Validity Properties Korat uses a predicate written in standard implementation language (Java, C#…) –takes an input that can be valid or invalid –returns a boolean indicating validity advantages –familiar language –existing development tools –predicates often already present challenge: generate tests from predicates
Example Predicate boolean repOk(BST t) { return isTree(t) && hasCorrectSize(t) && isOrdered(t); } boolean isTree(BST t) { if (t.root == null) return true;// empty tree Set visited = new HashSet(); visited.add(t.root); List workList = new LinkedList(); workList.add(t.root); while (!workList.isEmpty()) { Node current = (Node)workList.removeFirst(); if (current.left != null) { if (!visited.add(current.left)) return false;// sharing workList.add(current.left); } if (current.right != null) { if (!visited.add(current.right)) return false;// sharing workList.add(current.right); } } return true;// no sharing }
Input Space all possible object graphs with a BST root (obeying type declarations) right left N 0 : 2 N 1 : 1N 2 : 3 B0: 3 root N 0 : 1 B0: 1 root B0: 0 right N 0 : 1 N 1 : 2 N 2 : 3 B0: 3 root right left N 0 : 1 N 1 : 3 N 2 : 2 B0: 3 root left right N 0 : 3 N 1 : 1 N 2 : 2 B0: 3 root left N 0 : 3 N 1 : 2 N 2 : 1 B0: 3 root N 0 : 1 B0: 1 root left N 0 : 1 B0: 1 root right N 0 : 1 B0: 1 root left right leftright N 0 : 2 N 1 : 1N 2 : 3 B0: 3 root left right N 0 : 2 N 1 : 1N 2 : 3 B0: 2 root left right N 0 : 3 N 1 : 1N 2 : 2 B0: 3 root
Bounded-Exhaustive Generation finitization bounds input space –number of objects –values of fields generate all valid inputs up to given bound –eliminates systematic bias –finds all errors detectable within bound avoid isomorphic inputs –reduces the number of inputs –preserves capability to find all errors
Example Finitization specifies number of objects for each class –1 object for BST: { B 0 } –3 objects for Node: { N 0, N 1, N 2 } specifies set of values for each field –sets consist of objects and literals –for root, left, right: { null, N 0, N 1, N 2 } –for value: { 1, 2, 3 } –for size: { 3 }
Example Input Space 1 BST object, 3 Node objects: total 11 fields B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue null N0N0 N1N1 N2N2 N0N0 N1N1 N2N2 N0N0 N1N1 N2N N0N0 N1N1 N2N2 N0N0 N1N1 N2N N0N0 N1N1 N2N2 N0N0 N1N1 N2N * 1 * (4 * 4 * 3) 3 > 2 18 inputs, only 5 valid 3 B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N1N1 null 3231
Example Input each input is a valuation of fields B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N1N1 null 3231 left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root
Generation Problem given –predicate –finitization generate –all nonisomorphic valid inputs within finitization simple “solution” –enumerate entire input space –run predicate on each input –generate input if predicate returns true –infeasible for sparse input spaces (#valid << #total) –must reason about behavior of predicate
field accesses: [ ] boolean repOk(BST t) { return isTree(t) && …; } boolean isTree(BST t) { if (t.root == null) return true; Set visited = new HashSet(); visited.add(t.root); List workList = new LinkedList(); workList.add(t.root); while (!workList.isEmpty()) { Node current = (Node)workList.removeFirst(); if (current.left != null) { if (!visited.add(current.left)) return false; workList.add(current.left); } if (current.right != null) { if (!visited.add(current.right)) return false; workList.add(current.right); } } return true; } Example Execution left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root [ B 0.root ] [ B 0.root, N 0.left ] [ B 0.root, N 0.left, N 0.right ]
Failed Execution failed after few accesses for a concrete input would fail for all inputs with partial valuation B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N1N B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N1N1 null * 3 * 4 * 4 * 3 * 4 * 4 * 3 > 2 12
Key Idea observe execution of predicate record field accesses prune large chunks of input space for each failed execution use backtracking to efficiently enumerate valid inputs
Search Step backtracking on [ B 0.root, N 0.left, N 0.right ] produces next candidate input B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N1N1 null 3231 B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N2N2 null 3231
Next Input execution returns true, generate this input B0B0 N0N0 N1N1 N2N2 rootsizeleftrightvaluerightleftrightvalueleftvalue N0N0 N1N1 N2N2 null 3231 left right N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root
Korat solver for executable predicates Korat systematically searches input space –encodes inputs as vectors of integers –creates candidate object-graph inputs –monitors execution of predicate –prunes search by backtracking on field accesses –avoids isomorphic inputs Korat all nonisomorphic valid inputs repOk finitization
equivalent for all code and all properties example for trees: permutation of nodes n! isomorphic trees => generation infeasible removing isomorphic inputs from test suites –significantly reduces the number of tests –does not reduce quality left right N 1 : 2 N 2 : 1N 0 : 3 B 0 : 3 root N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 left right root Isomorphic Inputs
inputs: rooted and labeled (object) graphs –edges have fields –nodes have object identities isomorphism with respect to object identitites definition: inputs I and I’ consisting of objects from a set O are isomorphic iff : O->O. o,o’ reachable in I. f fields(o). o.f = o’ in I => (o).f = (o’) in I’ and o.f = p in I => (o).f = p in I’ Isomorphism Definition
Example Nonisomorphic Trees different edges or primitives, not only nodes rightleft N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 root N 0 : 2 N 1 : 1N 2 : 3 B 0 : 3 left right root left right N 0 : 3 N 1 : 1N 2 : 2 B 0 : 3 root
Nonisomorphic Generation simple “solution” –generate all inputs –filter out isomorphic inputs Korat does not require filtering –generate only one (candidate) input from each isomorphism class –only the lexicographically smallest input –search increments some field values for >1
Correctness Korat’s input generation –sound no invalid input –complete at least one valid input from each isomorphism class –optimal at most one (valid) input from each isomorphism class
Example Specification class BST { Node root; int size; static class Node { Node left, right; int value; } invariant repOk(); precondition true; postcondition !contains(i) && …; void remove(int i) { … } boolean contains(int i) { … } … }
Testing Scenario removeKorat an d pass fail right N0: 1N0: 1 N1: 2N1: 2 B0: 3B0: 3 root right N0: 1N0: 1 N1: 2N1: 2 N2: 3N2: 3 B0: 3B0: 3 root, 3 invariant repOk() precondition true postcondition !contains(i) && … and finitization
linked data structures and array-based data structures Korat Structure Generation benchmark BST HeapArray java.util.LinkedList java.util.TreeMap java.util.HashSet IntentionalName
Korat Structure Generation very large input spaces benchmarksizeinput space BST HeapArray java.util.LinkedList java.util.TreeMap java.util.HashSet IntentionalName52 50
Korat Structure Generation pruning based on filed accesses very effective benchmarksizeinput space candidate inputs BST HeapArray java.util.LinkedList java.util.TreeMap java.util.HashSet IntentionalName
Korat Structure Generation correct number of nonisomorphic structures (Sloane’s) benchmarksizeinput space candidate inputs valid inputs BST HeapArray java.util.LinkedList java.util.TreeMap java.util.HashSet IntentionalName
Korat Structure Generation 800Mhz Pentium III Sun’s Java 2 SDK JVM benchmarksizeinput space candidate inputs valid inputs time [sec] BST HeapArray java.util.LinkedList java.util.TreeMap java.util.HashSet IntentionalName
Korat Method Testing several operations for each data structure benchmarkboundinputs generated gen. [sec] test [sec] BST HeapArray LinkedList TreeMap HashSet IntentionalName SortedList DisjSet BinomialHeap FibonacciHeap
Conclusions Korat automates specification-based testing –uses method precondition to generate all nonisomorphic test inputs prunes search space using field accesses –invokes the method on each input and uses method postcondition as a test oracle Korat prototype uses JML specifications Korat efficiently generates complex data structures including some from JCF