Korat: Automated Testing Based on Java Predicates
Presented by: Praveen Kumar Vanga
Authors: Chandrasekhar Boyapati, Sarfraz Khurshid, and Darko Marinov
What is Korat?
- Korat is a framework for automated testing of Java programs
- Korat uses the method precondition to automatically generate all (nonisomorphic) test cases within given bounds
- Korat then executes the method on each test case
- Korat uses the method postcondition as a test oracle to check the correctness of each output
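The generate, execute, check loop described on this slide can be sketched as follows. This is a minimal illustration, not Korat itself: the precondition `pre`, the method under test `contains`, and the postcondition `post` are hypothetical examples, and the generator is a plain brute-force enumeration standing in for Korat's search.

```java
import java.util.ArrayList;
import java.util.List;

public class PrePostLoop {
    // Hypothetical precondition: the array is sorted in non-decreasing order.
    static boolean pre(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    // Hypothetical method under test: binary search for key.
    static boolean contains(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] == key) return true;
            if (a[mid] < key) lo = mid + 1; else hi = mid - 1;
        }
        return false;
    }

    // Hypothetical postcondition (the test oracle): the result agrees with a linear scan.
    static boolean post(int[] a, int key, boolean result) {
        boolean expected = false;
        for (int x : a) expected |= (x == key);
        return result == expected;
    }

    // Brute-force stand-in for the generator: enumerate every array of the
    // given length over values 0..maxVal and keep those satisfying pre().
    static void generate(int[] a, int pos, int maxVal, List<int[]> out) {
        if (pos == a.length) {
            if (pre(a)) out.add(a.clone());
            return;
        }
        for (int v = 0; v <= maxVal; v++) {
            a[pos] = v;
            generate(a, pos + 1, maxVal, out);
        }
    }

    // Returns { number of valid inputs, number of postcondition failures }.
    static int[] run(int maxLen, int maxVal) {
        List<int[]> inputs = new ArrayList<>();
        for (int len = 0; len <= maxLen; len++) {
            generate(new int[len], 0, maxVal, inputs);
        }
        int failures = 0;
        for (int[] a : inputs) {
            for (int key = 0; key <= maxVal; key++) {
                if (!post(a, key, contains(a, key))) failures++;
            }
        }
        return new int[] { inputs.size(), failures };
    }

    public static void main(String[] args) {
        int[] r = run(3, 2);
        System.out.println(r[0] + " valid inputs, " + r[1] + " failures");
    }
}
```

With arrays of length at most 3 over the values 0..2, the precondition admits 20 inputs, and every one of them is executed and checked.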
Motivation: How to Generate Such Test Inputs?
- Testing a program developed at Google
- Input: based on a directed acyclic graph (DAG); strongly connected components of the web graph
- Example of structurally complex input
  - Structural: data consists of linked lists
  - Complex: nodes need to satisfy properties
- Output: set of nodes on a certain path
Examples of Structurally Complex Data
- Red-black tree
  [Figure: tree with nodes 1, 3, 2]
- XML document
    <library>
      <book year="2001"> <title>T1</title> <author>A1</author> </book>
      <book year="2002"> <title>T2</title> <author>A2</author> </book>
      <book year="2003"> <title>T3</title> <author>A3</author> </book>
    </library>
- XPath query over the document: /library/book[@year<2003]/title
Applications with Structurally Complex Inputs
- Operations on abstract data types
  - Input: data structures satisfying invariants
- Code processing for Java (IDEs)
  - Input: correct Java programs (syntactically and semantically)
- XML processing systems
  - Input: XML documents
- Many more
Testing Setup
[Diagram: test generator, inputs, code, outputs, testing oracle, pass/fail]
- Assumptions:
  - User has knowledge about the input of the code under test
  - User knows the properties of desired inputs
- Large number of desired inputs
  - Ex.: corner cases for a linked list (LL): empty LL, disconnected LL, LL with cycles, etc.
Testing Setup: Examples of Code Under Test
[Diagram: test generator, inputs, code, outputs, testing oracle, pass/fail]
- Abstract data type: input/output is a data structure
- XML processing program: input/output is an XML document
Manual Test Generation
[Diagram: manual test generator, inputs, code, outputs, testing oracle, pass/fail]
- Tester manually writes test abstractions; each test abstraction describes a set of inputs
- Drawbacks of manual generation:
  - Labor-intensive and expensive
  - User might forget certain kinds of test cases, leaving the inputs they represent untested (vulnerability)
Automated Test Generation
[Diagram: automated test generator, inputs, code, outputs, testing oracle, pass/fail]
- Tool automatically generates the inputs
  - If a test abstraction changes, the tool regenerates the inputs
- Challenges of automated test generation:
  - Describing test inputs
  - (Efficient) test generation
  - Checking output
Test Abstractions
- Imperative: describe how to generate inputs
  - ASTGen (more recent work, discussed later)
- Declarative: describe what inputs look like
- Two kinds of languages are used to write declarative abstractions:
  - Declarative language for properties of desired inputs (NLP)
    - Properties written in the Alloy modeling language
    - Uses the Alloy Analyzer to generate test cases
  - Imperative language for properties of desired inputs
    - Properties written in high-level languages (Java, C#, ...)
    - Korat uses an imperative language
Use Case: Bounded-Exhaustive Generation
- Generate all valid structures up to a given bound
- User-specified bounds:
  - Number of nodes
  - Possible values
- Rationale: find all errors detectable within the bound
- Tools should avoid some equivalent inputs
Outline
- Korat
  - Examples
  - What it does
  - Demo
  - How it works
  - Results
- ASTGen
- Conclusion
Example: Directed Acyclic Graph
- User provides the input representation (Java):
    class DAG {
        List<DAGNode> nodes;
        int size;
        static class DAGNode {
            DAGNode[] children;
        }
    }
- Desired properties of objects of these classes:
  - No directed cycle among the nodes
  - Not a multigraph (all outgoing edges of a node are different)
Properties as Imperative Predicates (Desired Structures)
- A predicate is a method that identifies desired structures
  - Takes an input that may or may not be desired (e.g., a graph that is a DAG or not)
  - Returns a boolean indicating desirability
- Advantages:
  - Familiar language
  - Existing development tools
  - Predicates may already be present in the code
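As an illustration of such a predicate, here is one possible `repOk` for the DAG representation from the previous slide. The slides do not show this method, so the body is an assumption: it checks the two stated properties (no directed cycle, no duplicate outgoing edges) plus a few hypothetical well-formedness checks (consistent `size`, no null or unknown nodes).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DAG {
    static class DAGNode {
        DAGNode[] children = new DAGNode[0];
    }

    List<DAGNode> nodes = new ArrayList<>();
    int size;

    // Returns true iff this object graph is a desired structure.
    public boolean repOk() {
        if (nodes == null || size != nodes.size()) return false;
        Set<DAGNode> known = new HashSet<>(nodes);
        // assumed well-formedness: no repeated or null entries in the node list
        if (known.size() != nodes.size() || known.contains(null)) return false;
        for (DAGNode n : nodes) {
            if (n.children == null) return false;
            Set<DAGNode> targets = new HashSet<>();
            for (DAGNode c : n.children) {
                if (c == null || !known.contains(c)) return false;
                if (!targets.add(c)) return false; // duplicate edge: multigraph
            }
        }
        // depth-first search; state 1 = on the current path, 2 = fully explored
        Map<DAGNode, Integer> state = new HashMap<>();
        for (DAGNode n : nodes) {
            if (hasCycle(n, state)) return false;
        }
        return true;
    }

    private static boolean hasCycle(DAGNode n, Map<DAGNode, Integer> state) {
        Integer s = state.get(n);
        if (s != null) return s == 1; // reaching a node on the current path closes a cycle
        state.put(n, 1);
        for (DAGNode c : n.children) {
            if (hasCycle(c, state)) return true;
        }
        state.put(n, 2);
        return false;
    }
}
```

The predicate only answers yes or no for a given object graph; it says nothing about how to construct valid graphs, which is the part left to the generator.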
Example Finitization
- Specifies the number of objects for each class
  - 1 object for BST: { B0 }
  - 3 objects for Node: { N0, N1, N2 }
- Specifies the set of values for each field (sets consist of objects and literals)
  - for root, left, right: { null, N0, N1, N2 }
  - for value: { 1, 2, 3 }
  - for size: { 3 }
Bounded-Exhaustive Generation from Imperative Predicates (Desired Structures)
- Inputs: predicate and finitization
- Output: all structures for which the predicate returns true
- Korat searches the input space to find these structures
Korat
- Implemented in Java; works on Java predicates
- Command-line tool; main arguments:
  - --class: main class for which to generate structures
  - --predicate: method with test properties
  - --finitization: method for finitization
  - --args: bounds for finitization
- Actions for generated structures:
  - Visualizing structures
  - Storing structures in a file
  - Running the code under test for each structure
Korat
- Runs the binary tree example:
    java -cp korat.jar korat.Korat --visualize --class korat.examples.binarytree.BinaryTree --args 3,3,3
- Runs the binary search tree example:
    java -cp korat.jar korat.Korat --visualize --class korat.examples.searchtree.SearchTree --args 3,3,3,0,2
Running Example
    import java.util.*;

    class BinaryTree {
        private Node root; // root node
        private int size;  // number of nodes in the tree

        static class Node {
            private Node left;  // left child
            private Node right; // right child
        }

        // the finitization and repOk methods of the running example belong to this class
    }
Running Example
    public static Finitization finBinaryTree(int NUM_Node) {
        Finitization f = new Finitization(BinaryTree.class);
        ObjSet nodes = f.createObjects("Node", NUM_Node); // #Node = NUM_Node
        nodes.add(null);
        f.set("root", nodes);       // root in null + Node
        f.set("size", NUM_Node);    // size = NUM_Node
        f.set("Node.left", nodes);  // Node.left in null + Node
        f.set("Node.right", nodes); // Node.right in null + Node
        return f;
    }
Running Example
    public boolean repOk() {
        // checks that empty tree has size zero
        if (root == null) return size == 0;
        Set<Node> visited = new HashSet<>();
        visited.add(root);
        LinkedList<Node> workList = new LinkedList<>();
        workList.add(root);
        while (!workList.isEmpty()) {
            Node current = workList.removeFirst();
            if (current.left != null) {
                // checks that tree has no cycle
                if (!visited.add(current.left)) return false;
                workList.add(current.left);
            }
            if (current.right != null) {
                // checks that tree has no cycle
                if (!visited.add(current.right)) return false;
                workList.add(current.right);
            }
        }
        // checks that size is consistent
        if (visited.size() != size) return false;
        return true;
    }
Example Valid Inputs
- Trees with exactly 3 nodes
[Figure: the five valid binary search trees with root object B0 (size 3) and nodes N0, N1, N2 holding the values 1, 2, 3]
Validity Properties for Example
- Underlying graph is a tree (no sharing between subtrees)
- Correct number of nodes reachable from root
- Node values ordered for binary search
[Figure: BST B0 (size 3) with root N0: 2 and children N1: 1 and N2: 3]
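The three validity properties above can be written as one imperative predicate. The sketch below is an assumption about how such a predicate might look, not the `repOk` shipped with Korat's search tree example; in particular, it assumes values must be strictly ordered (no duplicates).

```java
import java.util.HashSet;
import java.util.Set;

public class SearchTree {
    static class Node {
        Node left, right;
        int value;
        Node(int value) { this.value = value; }
    }

    Node root;
    int size;

    public boolean repOk() {
        Set<Node> visited = new HashSet<>();
        return isTree(root, visited)                              // no sharing, no cycles
            && visited.size() == size                             // correct number of reachable nodes
            && isOrdered(root, Long.MIN_VALUE, Long.MAX_VALUE);   // search-tree ordering
    }

    // Fails as soon as a node is reached a second time.
    private static boolean isTree(Node n, Set<Node> visited) {
        if (n == null) return true;
        if (!visited.add(n)) return false;
        return isTree(n.left, visited) && isTree(n.right, visited);
    }

    // Every value must lie strictly between the bounds inherited from its ancestors.
    // Only called after isTree succeeded, so the recursion terminates.
    private static boolean isOrdered(Node n, long lo, long hi) {
        if (n == null) return true;
        if (n.value <= lo || n.value >= hi) return false;
        return isOrdered(n.left, lo, n.value) && isOrdered(n.right, n.value, hi);
    }
}
```

For the tree in the figure (root value 2, children 1 and 3, size 3) the predicate returns true; changing a value, the size, or making two fields point to the same node makes it return false.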
Example Testing Scenario
- Program for XML processing
- Create a model for test inputs (XML syntax tree)
- Write predicates for valid inputs
- Korat generates valid inputs
- Translate from model to actual inputs (pretty-print)
Challenge for Generation
- How to efficiently generate desired test inputs?
- Input spaces can naturally be enumerated (e.g., all graphs of a given size)
- Sparse: the number of desired test inputs is much smaller than the total number of inputs (e.g., #DAGs << #graphs)
- Brute-force search is infeasible; the generator must reason about the behavior of the predicate
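The sparsity claim (#DAGs << #graphs) can be checked by brute force for very small sizes. The sketch below is only an illustration under an assumed encoding: a directed graph on n labeled nodes is an n*n bit mask (self-loops allowed), and the class and method names are hypothetical.

```java
public class SparseInputSpace {
    // Edge u -> v is present iff bit (u * n + v) of edges is set.
    // Repeatedly removes a node with no incoming edge from the remaining nodes;
    // the graph is acyclic iff all nodes can be removed this way.
    static boolean isAcyclic(int edges, int n) {
        boolean[] removed = new boolean[n];
        for (int round = 0; round < n; round++) {
            int pick = -1;
            for (int v = 0; v < n && pick < 0; v++) {
                if (removed[v]) continue;
                boolean hasIncoming = false;
                for (int u = 0; u < n; u++) {
                    if (!removed[u] && (edges & (1 << (u * n + v))) != 0) hasIncoming = true;
                }
                if (!hasIncoming) pick = v;
            }
            if (pick < 0) return false; // every remaining node has an incoming edge: cycle
            removed[pick] = true;
        }
        return true;
    }

    // Returns { number of graphs, number of DAGs } for n labeled nodes (small n only).
    static int[] count(int n) {
        int total = 1 << (n * n);
        int dags = 0;
        for (int e = 0; e < total; e++) {
            if (isAcyclic(e, n)) dags++;
        }
        return new int[] { total, dags };
    }

    public static void main(String[] args) {
        int[] c = count(3);
        System.out.println(c[1] + " DAGs among " + c[0] + " graphs");
    }
}
```

For n = 3 this finds 25 DAGs among 512 graphs, and for n = 4 it finds 543 among 65536. The enumeration itself doubles with every added bit, which is why filtering a full enumeration does not scale.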
Korat
- Efficiently generates all (valid) structures within given bounds
  - Systematically searches the bounded input space
  - Avoids some equivalent inputs
  - Prunes the search based on data accessed during predicate execution
  - Dynamically monitors what the predicate accesses
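A much simplified sketch of this search for the binary tree example is given below. It is not Korat's implementation: the candidate is encoded as an array of domain indices instead of real objects, field monitoring is done through an explicit `read` method, and all names are hypothetical. It does follow the ideas listed on the slide: run the predicate, record which fields it read, backtrack on the last field read, and skip values that would only relabel nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class KoratSearchSketch {
    // Field indices: root, size, then node i has left = 2 + 2i and right = 3 + 2i.
    static final int ROOT = 0, SIZE = 1;
    final int n;        // number of Node objects in the finitization
    final int[] cv;     // candidate vector: one domain index per field
    final List<Integer> accessed = new ArrayList<>(); // fields in order of first access

    KoratSearchSketch(int n) {
        this.n = n;
        this.cv = new int[2 + 2 * n];
    }

    // Monitored field read. Reference values: 0 = null, k = node k-1.
    // The size field has the single-value domain { n }.
    int read(int field) {
        if (!accessed.contains(field)) accessed.add(field);
        return field == SIZE ? n : cv[field];
    }

    // The repOk of the running example, expressed over the candidate vector.
    boolean repOk() {
        int root = read(ROOT);
        if (root == 0) return read(SIZE) == 0;
        boolean[] visited = new boolean[n + 1];
        visited[root] = true;
        int count = 1;
        Deque<Integer> work = new ArrayDeque<>();
        work.add(root);
        while (!work.isEmpty()) {
            int cur = work.removeFirst();
            for (int side = 0; side < 2; side++) {   // 0 = left, 1 = right
                int child = read(2 * cur + side);
                if (child == 0) continue;
                if (visited[child]) return false;    // sharing or cycle
                visited[child] = true;
                count++;
                work.add(child);
            }
        }
        return count == read(SIZE);
    }

    // Returns { candidates explored, valid structures found }.
    long[] search() {
        long candidates = 0, valid = 0;
        while (true) {
            accessed.clear();
            candidates++;
            if (repOk()) valid++;
            boolean advanced = false;
            while (!advanced && !accessed.isEmpty()) {
                int last = accessed.size() - 1;
                int f = accessed.get(last);
                // largest node index used by fields read before f
                int maxUsed = 0;
                for (int i = 0; i < last; i++) {
                    int g = accessed.get(i);
                    if (g != SIZE) maxUsed = Math.max(maxUsed, cv[g]);
                }
                // allow at most one fresh node, so isomorphic relabelings are skipped
                if (f != SIZE && cv[f] <= maxUsed && cv[f] < n) {
                    cv[f]++;
                    advanced = true;
                } else {
                    cv[f] = 0;               // domain exhausted: reset and backtrack
                    accessed.remove(last);
                }
            }
            if (!advanced) return new long[] { candidates, valid };
        }
    }

    public static void main(String[] args) {
        long[] r = new KoratSearchSketch(3).search();
        System.out.println(r[1] + " valid structures, " + r[0] + " candidates explored");
    }
}
```

Fields that the predicate never read cannot have influenced its result, so the sketch never varies them for the current candidate; that is where the pruning comes from. For three nodes it reports 5 valid trees, one per tree shape, while exploring only a small part of the 16384 candidate vectors in the full space.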
Input Space
- All possible object graphs with a BST root (obeying type declarations)
[Figure: sample of candidate object graphs with root B0 and nodes N0, N1, N2 under different root, left, right, value, and size assignments]
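The size of this input space follows directly from the finitization: it is the product of the domain sizes of all fields of all objects. A small sketch of that arithmetic for the BST example (class and method names are hypothetical):

```java
import java.math.BigInteger;

public class InputSpaceSize {
    // One BST object with fields root and size; numNodes Node objects
    // with fields left, right, and value.
    static BigInteger bstInputSpace(int numNodes, int numValues, int numSizes) {
        BigInteger refs = BigInteger.valueOf(numNodes + 1L); // null plus every node
        BigInteger perNode = refs.multiply(refs).multiply(BigInteger.valueOf(numValues));
        return refs.multiply(BigInteger.valueOf(numSizes)).multiply(perNode.pow(numNodes));
    }

    public static void main(String[] args) {
        // finitization from the slides: 3 nodes, values { 1, 2, 3 }, size { 3 }
        System.out.println(bstInputSpace(3, 3, 1)); // 4 * 1 * (4 * 4 * 3)^3 = 442368
    }
}
```

Only five of these 442368 candidates are distinct valid search trees, which is why the search has to prune rather than enumerate.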
Isomorphic Inputs
- Equivalent for all code and all properties
- Example for trees: permutation of nodes
- Removing isomorphic inputs from test suites
  - significantly reduces the number of tests
  - does not reduce quality
[Figure: two isomorphic trees under B0 (size 3), one with N0: 2, N1: 1, N2: 3 and one with N1: 2, N2: 1, N0: 3]
Nonisomorphic Generation
- Simple “solution”:
  - Generate all inputs
  - Filter out isomorphic inputs
- Korat does not require filtering
  - Generates only one (candidate) input from each isomorphism class
  - Only the lexicographically smallest input
  - Search increments some field values by more than 1
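One way to picture "lexicographically smallest" is as a rule on node numbering: when reference fields are listed in the order the predicate reads them, each field may point to null, to a node that already appeared, or to the next unused node, but never further ahead. The check below is a sketch of that rule under an assumed encoding (0 = null, k = k-th node); the names are hypothetical.

```java
public class CanonicalLabels {
    // refs holds the reference-field values in predicate access order.
    static boolean isCanonical(int[] refs) {
        int maxSeen = 0;
        for (int r : refs) {
            if (r > maxSeen + 1) return false; // skipped over an unused node
            maxSeen = Math.max(maxSeen, r);
        }
        return true;
    }

    public static void main(String[] args) {
        // root = first node, its children = second and third node
        System.out.println(isCanonical(new int[] { 1, 2, 3, 0, 0, 0, 0 })); // true
        // the same tree shape with the nodes permuted
        System.out.println(isCanonical(new int[] { 2, 3, 1, 0, 0, 0, 0 })); // false
    }
}
```

A search that only ever produces candidates passing this rule never needs a separate filtering pass, because each isomorphism class has exactly one such numbering.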
Correctness
- Korat’s input generation is sound, complete, and optimal:
  - Sound: no invalid input
  - Complete: at least one valid input from each isomorphism class
  - Optimal: at most one (valid) input from each isomorphism class
Korat Structure Generation
- Very large input spaces
  benchmark              size     input space
  BST                    8 / 12   2^53 / 2^92
  HeapArray              6 / 8    2^20 / 2^29
  java.util.LinkedList   -        2^91 / 2^150
  java.util.TreeMap      7 / 9    2^92 / 2^130
  java.util.HashSet      7 / 11   2^119 / 2^215
Korat Structure Generation
- Pruning based on field accesses is very effective
  benchmark              size     input space     candidate inputs
  BST                    8 / 12   2^53 / 2^92     54418 / 12284830
  HeapArray              6 / 8    2^20 / 2^29     64533 / 5231385
  java.util.LinkedList   -        2^91 / 2^150    5455 / 5034894
  java.util.TreeMap      7 / 9    2^92 / 2^130    256763 / 50209400
  java.util.HashSet      7 / 11   2^119 / 2^215   193200 / 39075006
Korat Structure Generation
- Correct number of nonisomorphic structures (Sloane’s)
  benchmark              size     input space     candidate inputs    valid inputs
  BST                    8 / 12   2^53 / 2^92     54418 / 12284830    1430 / 208012
  HeapArray              6 / 8    2^20 / 2^29     64533 / 5231385     13139 / 1005075
  java.util.LinkedList   -        2^91 / 2^150    5455 / 5034894      4140 / 4213597
  java.util.TreeMap      7 / 9    2^92 / 2^130    256763 / 50209400   35 / 122
  java.util.HashSet      7 / 11   2^119 / 2^215   193200 / 39075006   2386 / 277387
  IntentionalName        5        2^50            1330628             598358
Korat Structure Generation
- Measured on an 800 MHz Pentium III with Sun’s Java 2 SDK 1.3.1 JVM
  benchmark              size     input space     candidate inputs    valid inputs      time [sec]
  BST                    8 / 12   2^53 / 2^92     54418 / 12284830    1430 / 208012     2 / 234
  HeapArray              6 / 8    2^20 / 2^29     64533 / 5231385     13139 / 1005075   2 / 43
  java.util.LinkedList   -        2^91 / 2^150    5455 / 5034894      4140 / 4213597    2 / 690
  java.util.TreeMap      7 / 9    2^92 / 2^130    256763 / 50209400   35 / 122          9 / 2149
  java.util.HashSet      7 / 11   2^119 / 2^215   193200 / 39075006   2386 / 277387     4 / 927
  IntentionalName        5        2^50            1330628             598358            63
Korat at Microsoft Research
- Korat implemented in the AsmLT test tool in the Foundations of Software Engineering group
- Predicates written in the Abstract State Machine Language (AsmL), not in Java or C#
- GUI for setting finitization and manipulating tests
- Korat can be used stand-alone or to generate inputs for method sequences
- Extensions:
  - (Controlled) non-exhaustive generation
  - Generation of complete tests from partial tests
  - Library for faster generation of common datatypes
AsmLT/Korat at Microsoft
- Used by testers in several product groups
- Enabled finding numerous errors:
  - XML tools
    - XPath compiler (10 code errors, test-suite augmentation)
    - Serialization (3 code errors, changing spec)
  - Web-service protocols
    - WS-Policy (13 code errors, 6 problems in informal spec)
    - WS-Routing (1 code error, 20 problems in informal spec)
  - Others
    - SSL stream
    - MSN Authentication, ...
- Errors found in important real-world applications
  - Code had already been tested using best testing practices
Korat at Google
- Testing web traversal code
- Test inputs based on DAGs (strongly connected components of websites with links)
- Problem: large number of inputs
- Goal: faster generation and execution of structurally complex test inputs
- Challenge: Korat search is mostly sequential
- Solution: parallelized Korat search
  - A family of (online) algorithms for load balancing
ASTGen: Imperative Generators
- Korat is inherently declarative
  - User specifies what inputs to generate, not how
  - Predicates are declarative specifications written in imperative code (Java)
- ASTGen is imperative
  - User specifies how to generate inputs
  - User writes code that directly generates test inputs instead of code that checks properties
  - Faster generation, but more involved to write
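To make the contrast concrete, the sketch below generates all binary tree shapes with n nodes directly, with no predicate and no search over invalid candidates. It only illustrates the imperative style; it does not use ASTGen's API, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeGenerator {
    static class Node {
        Node left, right;
    }

    // Builds every binary tree shape with exactly n nodes.
    // Subtrees are shared between the returned trees, so treat them as read-only.
    static List<Node> allTrees(int n) {
        List<Node> result = new ArrayList<>();
        if (n == 0) {
            result.add(null); // the empty tree
            return result;
        }
        for (int leftSize = 0; leftSize < n; leftSize++) {
            for (Node l : allTrees(leftSize)) {
                for (Node r : allTrees(n - 1 - leftSize)) {
                    Node root = new Node();
                    root.left = l;
                    root.right = r;
                    result.add(root);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allTrees(3).size()); // 5
    }
}
```

Every object the generator produces is valid by construction, which is where the speed comes from; the cost is that the knowledge of what a valid tree looks like is now buried in the generation code.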
ASTGen
- Framework rather than a tool
  - User needs to extend it for a specific purpose
- Based on NLP description of input
- First extension generates abstract syntax trees (ASTs) of Java programs
- Applied to testing parts of two popular IDEs (the refactoring engines of NetBeans and Eclipse)
- Results: 47 bugs (20 Eclipse and 17 NetBeans)
Related Testing Approaches
- Model-based testing
  - Predicates as specs (UML)
- Specification-based testing
  - Predicates as specs, bounded-exhaustive generation
- Constraint-based generation
  - Tools typically handle only primitive types, not structures
- Random generation
  - No guarantees; hard to generate inputs for sparse spaces
- Grammar-based generation
  - Hard to handle inputs with complex properties
- Combinatorial selection (pair-wise generation)
  - Easy-to-enumerate spaces, smart selection of inputs
Conclusions
- Korat automates specification-based testing
  - Uses method precondition to generate all nonisomorphic test inputs
  - Prunes search space using field accesses
  - Invokes the method on each input and uses method postcondition as a test oracle
- ASTGen is an imperative generator for ASTs; it found bugs in IDEs
Citations: Darko Marinov, "Korat: A Tool for Generating Structurally Complex Test Inputs"