Korat: Automated Testing Based on Java Predicates

Presentation transcript:

Korat: Automated Testing Based on Java Predicates
Presented by: Praveen Kumar Vanga
Authors: Chandrasekhar Boyapati, Sarfraz Khurshid, and Darko Marinov

What is Korat?
- Korat is a framework for automated testing of Java programs.
- Korat uses the method precondition to automatically generate all nonisomorphic test cases within given bounds.
- Korat then executes the method on each test case.
- It uses the method postcondition as a test oracle to check the correctness of each output.
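
The generate, execute, check loop described above can be sketched as follows. This is a minimal illustration, not Korat's API: the class `SpecHarness`, the method `countFailures`, and the square-root example are all hypothetical, and the sketch simply filters a given candidate list with the precondition, whereas Korat searches the input space itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical harness; the names are illustrative and not part of Korat. */
final class SpecHarness {
    /** Runs the method on every candidate satisfying pre; counts outputs that violate post. */
    static <I, O> int countFailures(List<I> candidates, Predicate<I> pre,
                                    Function<I, O> method, BiPredicate<I, O> post) {
        int failures = 0;
        for (I input : candidates) {
            if (!pre.test(input)) continue;             // precondition selects the valid test inputs
            O output = method.apply(input);             // execute the method under test
            if (!post.test(input, output)) failures++;  // postcondition acts as the test oracle
        }
        return failures;
    }

    /** Example: integer square root, specified only for non-negative inputs. */
    static int demo() {
        return SpecHarness.<Integer, Integer>countFailures(
            Arrays.asList(-2, -1, 0, 1, 2, 3, 4, 5),
            x -> x >= 0,
            x -> (int) Math.sqrt(x),
            (x, r) -> r * r <= x && x < (r + 1) * (r + 1));
    }
}
```

A correct method yields zero failures; a method that violates its postcondition is reported once per failing input.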

Motivations: How to Generate Such Test Inputs?
- Testing a program at Google
  - Input: based on a directed acyclic graph (DAG) of the strongly connected components of the web graph
  - Output: set of nodes on a certain path
- Example of structurally complex input
  - Structural: data consists of linked lists
  - Complex: nodes need to satisfy properties
- How to generate such test inputs?

Examples of Structurally Complex Data
- Red-black tree
  [Figure: a small tree with nodes 1, 2, 3]
- XML document
    <library>
      <book year=2001> <title>T1</title> <author>A1</author> </book>
      <book year=2002> <title>T2</title> <author>A2</author> </book>
      <book year=2003> <title>T3</title> <author>A3</author> </book>
    </library>
  with the XPath expression /library/book[@year<2003]/title

Applications with Structurally Complex Inputs
- Operations on abstract data types
  - Input: data structures satisfying invariants
- Code processing (Java IDEs)
  - Input: correct Java programs (syntactically and semantically)
- XML processing systems
  - Input: XML documents
- Many more

Testing Setup
[Figure: a test generator produces inputs for the code under test; a testing oracle checks the outputs and reports pass or fail.]
- Assumptions
  - The user knows what the inputs of the code under test look like
  - The user knows the properties of the desired inputs
- Large number of desired inputs
  - Example: corner cases for a linked list: empty list, disconnected list, list with cycles, etc.

Testing Setup: Examples of Code Under Test
[Figure: test generator, code, and testing oracle, as on the previous slide.]
- Abstract data type
  - Input/output: data structure
- XML processing program
  - Input/output: XML document

Manual Test Generation
[Figure: the testing setup with a manual test generator.]
- Drawbacks of manual generation
  - Labor-intensive and expensive
  - The tester manually writes test abstractions, and each test abstraction describes a set of inputs
  - The tester might forget certain kinds of test cases, so the inputs they represent go untested (vulnerability)

Automated Test Generation
[Figure: the testing setup with an automated test generator.]
- The tool automatically generates the inputs
  - If the test abstraction changes, the tool regenerates the inputs
- Challenges of automated test generation
  - Describing test inputs
  - (Efficient) test generation
  - Checking output

Test Abstractions
- Imperative: describe how to generate inputs
  - ASTGen (more recent work, discussed later)
- Declarative: describe what the inputs look like
  - Two kinds of languages are used
    - Declarative language for properties of desired inputs (NLP)
      - Properties written in the Alloy modeling language
      - Uses the Alloy Analyzer to generate test cases
    - Imperative language for properties of desired inputs
      - Properties written in high-level languages (Java, C#, ...)
      - Korat uses an imperative language

Use Case: Bounded-Exhaustive Generation
- Generate all valid structures up to a given bound
- User-specified bounds
  - Number of nodes
  - Possible values
- Rationale: find all errors detectable within the bound
- Tools should avoid some equivalent inputs

Outline
- Korat
  - Examples
  - What it does
  - Demo
  - How it works
  - Results
- ASTGen
- Conclusion

Example: Directed Acyclic Graph
- The user provides the input representation (Java)
    class DAG {
        List<DAGNode> nodes;
        int size;
        static class DAGNode {
            DAGNode[] children;
        }
    }
- Desired properties of objects of these classes
  - No directed cycle among the nodes
  - Not a multigraph (all outgoing edges of a node are different)

Properties as Imperative Predicates (Desired Structures)
- A predicate is a method that identifies desired structures
  - Takes an input that may or may not be desired (e.g., a graph that is a DAG or not)
  - Returns a boolean indicating desirability
- Advantages
  - Familiar language
  - Existing development tools
  - Predicates may already be present in the code
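
For the DAG representation, such a predicate might look like the sketch below. It is an illustration only: the method name `repOk` is borrowed from the running binary tree example, and the checks that `size` equals the number of nodes and that `null` children are invalid are assumptions not stated on the slides.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DAG {
    static final class DAGNode {
        DAGNode[] children = new DAGNode[0];
    }

    List<DAGNode> nodes = new ArrayList<>();
    int size;

    /** Returns true iff the graph has no directed cycle and no duplicate outgoing edges. */
    boolean repOk() {
        if (nodes.size() != size) return false;   // assumption: size is the node count
        Set<DAGNode> onPath = new HashSet<>();    // nodes on the current DFS path
        Set<DAGNode> done = new HashSet<>();      // nodes already proven cycle-free
        for (DAGNode n : nodes) {
            if (!acyclicFrom(n, onPath, done)) return false;
        }
        return true;
    }

    private static boolean acyclicFrom(DAGNode n, Set<DAGNode> onPath, Set<DAGNode> done) {
        if (done.contains(n)) return true;
        if (!onPath.add(n)) return false;         // reached a node on the current path: cycle
        Set<DAGNode> targets = new HashSet<>();
        for (DAGNode child : n.children) {
            if (child == null) return false;      // assumption: null children are invalid
            if (!targets.add(child)) return false; // duplicate edge: multigraph
            if (!acyclicFrom(child, onPath, done)) return false;
        }
        onPath.remove(n);
        done.add(n);
        return true;
    }
}
```

Because `DAGNode` does not override `equals`, the hash sets compare nodes by identity, which is what a structural check needs.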

Example Finitization
- Specifies the number of objects for each class
  - 1 object for BST: { B0 }
  - 3 objects for Node: { N0, N1, N2 }
- Specifies the set of values for each field (sets consist of objects and literals)
  - for root, left, right: { null, N0, N1, N2 }
  - for value: { 1, 2, 3 }
  - for size: { 3 }
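
As a rough worked count, this finitization already defines a sizable input space. The sketch below assumes that every field is chosen independently, that the single BST object has the fields root and size, and that each Node has the fields left, right, and value; the class and method names are hypothetical.

```java
/** Hypothetical helper that counts candidate object graphs for the BST finitization. */
final class InputSpaceSize {
    static long size(int numNodes, int numValues, int numSizes) {
        long refChoices = numNodes + 1L;                     // null or one of the Node objects
        long perNode = refChoices * refChoices * numValues;  // left, right, value
        long total = refChoices * numSizes;                  // root, size
        for (int i = 0; i < numNodes; i++) {
            total *= perNode;                                // one factor per Node object
        }
        return total;
    }
}
```

For the finitization above, `InputSpaceSize.size(3, 3, 1)` evaluates to 4 * 1 * (4 * 4 * 3)^3 = 442368 candidates, of which only a handful are valid search trees.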

Bounded-Exhaustive Generation from Imperative Predicates (Desired Structures)
- Inputs: predicate and finitization
- Output: all structures for which the predicate returns true
- Korat searches the input space to find these structures

Korat
- Implemented in Java
- Works on Java predicates
- Command-line tool; main arguments
  - --class: main class for which to generate structures
  - --predicate: method with the test properties
  - --finitization: method for finitization
  - --args: bounds for finitization
- Actions for generated structures
  - Visualizing structures
  - Storing structures in a file
  - Running the code under test for each structure

Korat
- Runs the binary tree example:
    java -cp korat.jar korat.Korat --visualize --class korat.examples.binarytree.BinaryTree --args 3,3,3
- Runs the binary search tree example:
    java -cp korat.jar korat.Korat --visualize --class korat.examples.searchtree.SearchTree --args 3,3,3,0,2

Running Example
    import java.util.*;

    class BinaryTree {
        private Node root; // root node
        private int size;  // number of nodes in the tree

        static class Node {
            private Node left;  // left child
            private Node right; // right child
        }

        // finBinaryTree and repOk from the following slides go here
    }

Running Example
    public static Finitization finBinaryTree(int NUM_Node) {
        Finitization f = new Finitization(BinaryTree.class);
        ObjSet nodes = f.createObjects("Node", NUM_Node); // #Node = NUM_Node
        nodes.add(null);
        f.set("root", nodes);       // root in null + Node
        f.set("size", NUM_Node);    // size = NUM_Node
        f.set("Node.left", nodes);  // Node.left in null + Node
        f.set("Node.right", nodes); // Node.right in null + Node
        return f;
    }

Running Example
    public boolean repOk() {
        // checks that empty tree has size zero
        if (root == null) return size == 0;
        Set<Node> visited = new HashSet<>();
        visited.add(root);
        LinkedList<Node> workList = new LinkedList<>();
        workList.add(root);
        while (!workList.isEmpty()) {
            Node current = workList.removeFirst();
            if (current.left != null) {
                // checks that tree has no cycle (and no sharing)
                if (!visited.add(current.left)) return false;
                workList.add(current.left);
            }
            if (current.right != null) {
                if (!visited.add(current.right)) return false;
                workList.add(current.right);
            }
        }
        // checks that size is consistent
        if (visited.size() != size) return false;
        return true;
    }

Example Valid Inputs
- Trees with exactly 3 nodes
[Figure: five valid search trees, each with root object B0 (size 3) and nodes N0, N1, N2 holding the values 1, 2, 3.]

Validity Properties for Example
- Underlying graph is a tree (no sharing between subtrees)
- Correct number of nodes reachable from the root
- Node values ordered for binary search
[Figure: example valid tree with B0: 3 and nodes N0: 2, N1: 1, N2: 3.]

Example Testing Scenario
- Program for XML processing
  - Create a model for test inputs (XML syntax tree)
  - Write predicates for valid inputs
  - Korat generates valid inputs
  - Translate from model to actual inputs (pretty-print)

Challenge for Generation
- How to efficiently generate desired test inputs?
- Naturally, input spaces can be enumerated (e.g., all graphs of a given size)
- Sparse: the number of desired test inputs is much smaller than the total number of inputs (e.g., #DAGs << #Graphs)
- Brute-force search is infeasible, so the generator must reason about the behavior of the predicate

Korat efficiently generates all (valid) structures within given bounds
- Systematically searches the bounded input space
- Avoids some equivalent inputs
- Prunes the search based on the data accessed during predicate execution
- Dynamically monitors which fields the predicate accesses

Input Space
- All possible object graphs with a BST root (obeying type declarations)
[Figure: sample object graphs built from B0 and N0, N1, N2, both valid and invalid, including the empty tree with B0: 0.]

Isomorphic Inputs
- Equivalent for all code and all properties
- Example for trees: permutation of nodes
- Removing isomorphic inputs from test suites
  - significantly reduces the number of tests
  - does not reduce quality
[Figure: two isomorphic trees with B0: 3 that differ only in node identities: (N0: 2, N1: 1, N2: 3) and (N1: 2, N2: 1, N0: 3).]

Nonisomorphic Generation
- Simple "solution"
  - Generate all inputs
  - Filter out isomorphic inputs
- Korat does not require filtering
  - Generates only one (candidate) input from each isomorphism class
  - Only the lexicographically smallest input
  - The search increments some field values by more than 1
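
The pruning and nonisomorphism ideas can be sketched together in a few dozen lines. The code below is a simplified illustration written for this transcript, not Korat's implementation: the class `MiniKorat` is hypothetical, it is hard-wired to the binary tree example with the size fixed to the number of nodes, and it assumes every reference field ranges over the same Node domain. Fields are read through a monitored `get`, the search backtracks on the last field the predicate read, and a field may point to at most one "fresh" node beyond those referenced by earlier fields.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical, simplified Korat-style search for binary trees with exactly n nodes. */
final class MiniKorat {
    final int n;        // number of Node objects; size is fixed to n
    final int[] vec;    // candidate vector [root, N0.left, N0.right, N1.left, ...]
                        // value 0 means null, value k means node N(k-1)
    final List<Integer> accessed = new ArrayList<>(); // fields in first-access order
    long explored = 0;  // number of candidates on which the predicate was run

    MiniKorat(int n) {
        this.n = n;
        this.vec = new int[1 + 2 * n];
    }

    /** Monitored field read: records a field the first time the predicate reads it. */
    int get(int field) {
        if (!accessed.contains(field)) accessed.add(field);
        return vec[field];
    }

    /** Same check as repOk in the running example, reading fields only through get(). */
    boolean repOk() {
        int root = get(0);
        if (root == 0) return n == 0;
        boolean[] visited = new boolean[n];
        Deque<Integer> work = new ArrayDeque<>();
        visited[root - 1] = true;
        work.add(root - 1);
        int count = 1;
        while (!work.isEmpty()) {
            int cur = work.removeFirst();
            for (int side = 1; side <= 2; side++) {   // 1 = left, 2 = right
                int child = get(2 * cur + side);
                if (child == 0) continue;
                if (visited[child - 1]) return false; // sharing or cycle
                visited[child - 1] = true;
                work.add(child - 1);
                count++;
            }
        }
        return count == n;
    }

    /** Returns the number of valid, pairwise nonisomorphic trees found. */
    long search() {
        long valid = 0;
        while (true) {
            accessed.clear();
            explored++;
            if (repOk()) valid++;
            // Backtrack on the last accessed field; fields never read are left alone,
            // which prunes every candidate that differs only in those fields.
            while (!accessed.isEmpty()) {
                int last = accessed.size() - 1;
                int f = accessed.get(last);
                int maxPrev = 0;                      // largest node referenced by earlier fields
                for (int i = 0; i < last; i++) {
                    maxPrev = Math.max(maxPrev, vec[accessed.get(i)]);
                }
                // Nonisomorphism: allow at most one fresh node beyond maxPrev.
                if (vec[f] + 1 <= Math.min(n, maxPrev + 1)) {
                    vec[f]++;
                    break;
                }
                vec[f] = 0;                           // domain exhausted: reset and pop
                accessed.remove(last);
            }
            if (accessed.isEmpty()) return valid;
        }
    }
}
```

Calling `new MiniKorat(3).search()` returns 5, matching the five trees with three nodes, and the `explored` counter stays far below the 4^7 = 16384 candidate vectors of the unpruned space.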

Correctness
- Korat's input generation is
  - sound: no invalid input
  - complete: at least one valid input from each isomorphism class
  - optimal: at most one (valid) input from each isomorphism class

Korat Structure Generation
- Very large input spaces

benchmark              size   input space
BST                       8   2^53
BST                      12   2^92
HeapArray                 6   2^20
HeapArray                 8   2^29
java.util.LinkedList          2^91
java.util.LinkedList          2^150
java.util.TreeMap         7   2^92
java.util.TreeMap         9   2^130
java.util.HashSet         7   2^119
java.util.HashSet        11   2^215

Korat Structure Generation
- Pruning based on field accesses is very effective

benchmark              size   input space   candidate inputs
BST                       8   2^53                     54418
BST                      12   2^92                  12284830
HeapArray                 6   2^20                     64533
HeapArray                 8   2^29                   5231385
java.util.LinkedList          2^91                      5455
java.util.LinkedList          2^150                  5034894
java.util.TreeMap         7   2^92                    256763
java.util.TreeMap         9   2^130                 50209400
java.util.HashSet         7   2^119                   193200
java.util.HashSet        11   2^215                 39075006

Korat Structure Generation
- Correct number of nonisomorphic structures (Sloane's)

benchmark              size   input space   candidate inputs   valid inputs
BST                       8   2^53                     54418           1430
BST                      12   2^92                  12284830         208012
HeapArray                 6   2^20                     64533          13139
HeapArray                 8   2^29                   5231385        1005075
java.util.LinkedList          2^91                      5455           4140
java.util.LinkedList          2^150                  5034894        4213597
java.util.TreeMap         7   2^92                    256763             35
java.util.TreeMap         9   2^130                 50209400            122
java.util.HashSet         7   2^119                   193200           2386
java.util.HashSet        11   2^215                 39075006         277387
IntentionalName           5   2^50                   1330628         598358

Korat Structure Generation
- 800 MHz Pentium III, Sun's Java 2 SDK 1.3.1 JVM

benchmark              size   input space   candidate inputs   valid inputs   time [sec]
BST                       8   2^53                     54418           1430            2
BST                      12   2^92                  12284830         208012          234
HeapArray                 6   2^20                     64533          13139            2
HeapArray                 8   2^29                   5231385        1005075           43
java.util.LinkedList          2^91                      5455           4140            2
java.util.LinkedList          2^150                  5034894        4213597          690
java.util.TreeMap         7   2^92                    256763             35            9
java.util.TreeMap         9   2^130                 50209400            122         2149
java.util.HashSet         7   2^119                   193200           2386            4
java.util.HashSet        11   2^215                 39075006         277387          927
IntentionalName           5   2^50                   1330628         598358           63

Korat at Microsoft Research
- Korat is implemented in the AsmLT test tool in the Foundations of Software Engineering group
  - Predicates in Abstract State Machine Language (AsmL), not in Java or C#
  - GUI for setting finitization and manipulating tests
  - Korat can be used stand-alone or to generate inputs for method sequences
- Extensions
  - (Controlled) non-exhaustive generation
  - Generation of complete tests from partial tests
  - Library for faster generation of common datatypes

AsmLT/Korat at Microsoft
- Used by testers in several product groups
- Enabled finding numerous errors
  - XML tools
    - XPath compiler (10 code errors, test-suite augmentation)
    - Serialization (3 code errors, changing spec)
  - Web-service protocols
    - WS-Policy (13 code errors, 6 problems in informal spec)
    - WS-Routing (1 code error, 20 problems in informal spec)
  - Others
    - SSL stream
    - MSN Authentication, ...
- Errors found in important real-world applications
  - Code already tested using best testing practices

Korat at Google
- Testing web traversal code
  - Test inputs based on DAGs (strongly connected components of websites with links)
- Problem: large number of inputs
- Goal: faster generation and execution of structurally complex test inputs
- Challenge: Korat's search is mostly sequential
- Solution: parallelized Korat search
  - A family of (online) algorithms for load balancing

ASTGen: Imperative Generators
- Korat is inherently declarative
  - The user specifies what inputs to generate, not how
  - Predicates are declarative specifications written in imperative code (Java)
- ASTGen is imperative
  - The user specifies how to generate inputs
  - Write code that directly generates test inputs instead of code that checks properties
  - Faster generation, but more involved for the user
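
To illustrate the contrast, an imperative generator for the binary tree example constructs the valid structures directly instead of checking candidates with a predicate. The sketch below only shows the style; it is not ASTGen's API, and the class `TreeGen` and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical imperative generator: builds every binary tree shape with n nodes. */
final class TreeGen {
    static final class Node {
        final Node left;
        final Node right;
        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Returns all tree shapes with exactly n nodes; null stands for the empty tree. */
    static List<Node> trees(int n) {
        List<Node> out = new ArrayList<>();
        if (n == 0) {
            out.add(null);
            return out;
        }
        // Split the remaining n - 1 nodes between the left and right subtrees.
        for (int leftSize = 0; leftSize < n; leftSize++) {
            for (Node l : trees(leftSize)) {
                for (Node r : trees(n - 1 - leftSize)) {
                    out.add(new Node(l, r)); // subtrees are shared across results (read-only use)
                }
            }
        }
        return out;
    }
}
```

No candidate is ever rejected here, which is why generation is fast, but the validity rules are baked into the generator code and must be rewritten whenever the desired properties change.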

ASTGen
- A framework rather than a tool
  - The user needs to extend it for a specific purpose
  - Based on an NLP description of the input
- First extension: generating abstract syntax trees (ASTs) of Java programs
  - Applied to testing parts of two popular IDEs (the refactoring engines of NetBeans and Eclipse)
- Results: 47 bugs (20 Eclipse and 17 NetBeans)

Related Testing Approaches
- Model-based testing
  - Predicates as specs (UML)
- Specification-based testing
  - Predicates as specs, bounded-exhaustive generation
- Constraint-based generation
  - Tools typically handle only primitive types, not structures
- Random generation
  - No guarantees; hard to generate inputs for sparse spaces
- Grammar-based generation
  - Hard to navigate inputs with complex properties
- Combinatorial selection (pairwise generation)
  - Easy-to-enumerate spaces, smart selection of inputs

Conclusions
- Korat automates specification-based testing
  - Uses the method precondition to generate all nonisomorphic test inputs
  - Prunes the search space using field accesses
  - Invokes the method on each input and uses the method postcondition as a test oracle
- ASTGen is an imperative generator for ASTs; it found bugs in IDEs

Citations: Darko Marinov, "Korat: A Tool for Generating Structurally Complex Test Inputs"