Dynamic Analysis of Algebraic Structure to Optimize Test Generation and Test Case Selection Anthony J H Simons and Wenwen Zhao.


Overview
- Lazy Systematic Unit Testing
  - JWalk testing concept and methodology
- The JWalk 1.0 toolset
  - JWalkTester, JWalkUtility, JWalkEditor, etc.
- Dynamic analysis and pruning
  - extending earlier work to full algebraic analysis
- Comparison and evaluation
  - measure path pruning, before and after
  - test result prediction, before and after

Motivation
- State of the art in agile testing
  - Test-driven development is good, but…
  - …no specification to inform the selection of tests
  - …manual test-sets are fallible (missing, redundant cases)
  - …reusing saved tests for conformance testing is fallible – state partitions hide paths, faults (Simons, 2005)
- Lazy systematic testing method: the insight
  - Complete testing requires a specification (even in XP!)
  - Infer an up-to-date specification from a code prototype
  - Let tools handle systematic test generation and coverage
  - Let the programmer focus on novel/unpredicted results

Lazy Systematic Unit Testing
- Lazy Specification
  - late inference of a specification from evolving code
  - semi-automatic, by static and dynamic analysis of code with limited user interaction
  - specification evolves in step with modified code
- Systematic Testing
  - bounded exhaustive testing, up to the specification
  - emphasis on completeness, conformance, correctness properties after testing, repeatable test quality

JWalk 1.0 Toolset
- JWalk Tester
- JWalk Utility
- JWalk Editor
- JWalk Marker
- JWalk Grapher
- JWalk SOAR

JWalk Tester
- Lazy systematic unit testing for Java
  - static analysis – extracts the public API of a compiled Java class
  - protocol walk (all paths) – explores, validates all interleaved methods to a given path depth
  - algebra walk (memory states) – explores, validates all observations on all mutator-method sequences
  - state walk (high-level states) – explores, validates n-switch transition cover for all high-level states

Baseline Approaches
- Breadth-first generation
  - all constructors and all interleaved methods (e.g. JCrasher, DSD-Crasher, Jov)
  - generate-and-filter (e.g. Rostra, Java Pathfinder) by state equivalence class
- Computational cost
  - exponential growth, memory issues, wasteful over-generation, even if filtering is later applied
  - #paths = $\sum_{k=0}^{n} c \cdot m^{k}$
  - Key: c = #constructors, m = #methods, k = path depth, n = maximum depth
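
The baseline cost formula on this slide is easy to check numerically. The following is a minimal sketch, not part of JWalk; the class and method names are hypothetical. With c = 1, m = 78 and depth 3, it reproduces the 480,715 baseline paths quoted for java.lang.Character in the conclusions.

```java
import java.math.BigInteger;

/** Hypothetical helper: counts the paths produced by unpruned breadth-first generation. */
final class BaselinePathCount {

    /** Returns the sum over k = 0..depth of constructors * methods^k. */
    static BigInteger count(int constructors, int methods, int depth) {
        BigInteger c = BigInteger.valueOf(constructors);
        BigInteger m = BigInteger.valueOf(methods);
        BigInteger total = BigInteger.ZERO;
        BigInteger power = BigInteger.ONE; // m^0
        for (int k = 0; k <= depth; k++) {
            total = total.add(c.multiply(power)); // add c * m^k
            power = power.multiply(m);            // advance to m^(k+1)
        }
        return total;
    }

    public static void main(String[] args) {
        // c = 1, m = 78, depth 3: the java.lang.Character figures from the slides
        System.out.println(count(1, 78, 3)); // prints 480715
    }
}
```

BigInteger is used only so that deeper walks over large APIs do not overflow; for the depths shown in the slides a long would also do.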

Dynamic Pruning
- Interleaved analysis
  - generate-and-evaluate, pruning active paths on the fly (e.g. JWalk, Randoop)
  - remove redundant prefix paths after each test cycle, don't bother to expand in next cycle
- Increasing sophistication
  - prune prefix paths ending in exceptions (fail again)
    - JWalk, Randoop (2007)
  - and prefixes ending in algebraic observers (unchanged)
    - JWalk 0.8 (2007)
  - and prefixes ending in algebraic transformers (reentrant)
    - JWalk 1.0 (2009)
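
The generate-and-evaluate cycle with the three pruning rules can be sketched on a toy stack. This is an illustrative sketch under stated assumptions, not JWalk's implementation: the names are hypothetical, `push` always uses the same fixed argument, and states are compared by list equality rather than by hash code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical sketch of generate-and-evaluate with on-the-fly prefix pruning. */
final class PruningWalk {

    enum Outcome { NOVEL, EXCEPTION, UNCHANGED, REENTRANT }

    static final String[] METHODS = { "push", "top", "pop" };

    /** Replays a method sequence on a fresh stack; returns null if any call throws. */
    static List<Integer> replay(List<String> path) {
        Deque<Integer> stack = new ArrayDeque<>();
        try {
            for (String method : path) {
                switch (method) {
                    case "push": stack.push(7); break;    // fixed argument (assumption)
                    case "top":  stack.element(); break;  // throws on an empty stack
                    case "pop":  stack.pop(); break;      // throws on an empty stack
                    default: throw new IllegalArgumentException(method);
                }
            }
        } catch (NoSuchElementException e) {
            return null;
        }
        return new ArrayList<>(stack);
    }

    /** Classifies the state after the last call against the prior and all earlier states. */
    static Outcome classify(List<Integer> before, List<Integer> after, Set<List<Integer>> seen) {
        if (after == null) return Outcome.EXCEPTION;        // would fail again if extended
        if (after.equals(before)) return Outcome.UNCHANGED; // last call was an observer
        if (seen.contains(after)) return Outcome.REENTRANT; // returned to an earlier state
        return Outcome.NOVEL;
    }

    /** Returns the cumulative number of paths evaluated up to the given depth. */
    static int explore(int depth) {
        Set<List<Integer>> seen = new HashSet<>();
        List<List<String>> active = new ArrayList<>();
        active.add(new ArrayList<>());       // the bare constructor, new()
        seen.add(replay(active.get(0)));
        int evaluated = 1;
        for (int cycle = 1; cycle <= depth; cycle++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : active) {
                List<Integer> before = replay(prefix);
                for (String method : METHODS) {
                    List<String> path = new ArrayList<>(prefix);
                    path.add(method);
                    List<Integer> after = replay(path);
                    evaluated++;
                    if (classify(before, after, seen) == Outcome.NOVEL) {
                        seen.add(after);
                        next.add(path);      // only novel prefixes are expanded next cycle
                    }
                }
            }
            active = next;
        }
        return evaluated;
    }

    public static void main(String[] args) {
        System.out.println(explore(3)); // prints 10; the unpruned count is 1 + 3 + 9 + 27 = 40
    }
}
```

In this toy setting only one prefix per cycle reaches a novel state, so the pruned walk grows linearly with depth while the unpruned one grows exponentially.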

Prune Exceptions…
[Figure: tree of push/top/pop call sequences grown from new. Key: novel state, exception.]
Prune error-prefixes (JWalk 0.8, Randoop)

…and Observers
[Figure: tree of push/top/pop call sequences grown from new. Key: novel state, exception, unchanged state.]
Prune error- and observer-prefixes (JWalk 0.8)

Algebraic Pruning
[Figure: tree of push/top/pop call sequences grown from new. Key: novel state, exception, unchanged state, reentrant state.]
Prune error-, observer- and transformer-prefixes (JWalk 1.0)

What is the Same State?
- Some earlier approaches
  - distinguish observers, mutators by signature (Rostra)
  - intrusive state equality predicate methods (ASTOOT)
  - external (partial) state equality predicates (Rostra)
  - subsumption of execution traces in JVM (Pathfinder)
- Some algebraic approaches
  - shallow, deep equality under all observers (TACCLE)
    - but assumes observations are also comparable
    - very costly to compute from first principles
  - serialise object states and hash (Henkel & Diwan)
    - but not all objects are serialisable
    - no control over depth of comparison

Smart State Inspection
- Reflection-and-hash
  - extract state vector from objects
  - compute hash code for each field
  - order-sensitive combination hash code
- Proper depth control
  - shallow or deep equality settings, to chosen depth
  - hash on pointer, or recursively invoke algorithm
- Fast state comparison
  - each test evaluation stores posterior state code
  - fast comparison with preceding, or all prior states
  - possible to detect unchanged, or reentrant states
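
The reflection-and-hash idea can be sketched as follows. This is a hedged illustration rather than JWalk's own code: the class `StateHash`, the toy `Node` type and the 17/31 combination scheme are all assumptions. It also assumes the JVM returns declared fields in a stable order within one run; arrays and JDK-internal objects are not inspected field by field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical sketch of reflection-and-hash state inspection with depth control. */
final class StateHash {

    /** Toy class under inspection (an assumption made for the demo). */
    static final class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    /**
     * Combines the hash codes of all instance fields in an order-sensitive way.
     * depth = 0 hashes referenced objects on the pointer (shallow);
     * depth > 0 recursively inspects referenced objects (deep, to that depth).
     */
    static int stateCode(Object target, int depth) {
        if (target == null) return 0;
        int code = 17;
        for (Class<?> type = target.getClass(); type != Object.class; type = type.getSuperclass()) {
            for (Field field : type.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) continue; // instance state only
                try {
                    field.setAccessible(true);
                    code = 31 * code + fieldCode(field.get(target), field.getType(), depth);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return code;
    }

    private static int fieldCode(Object value, Class<?> declaredType, int depth) {
        if (value == null) return 0;
        if (declaredType.isPrimitive()) return value.hashCode();   // boxed primitive value
        if (depth == 0) return System.identityHashCode(value);     // hash on pointer
        if (value.getClass().getName().startsWith("java.")) {
            return value.hashCode();                               // do not reflect into JDK types
        }
        return stateCode(value, depth - 1);                        // recursively invoke algorithm
    }

    public static void main(String[] args) {
        Node a = new Node(1, new Node(2, null));
        Node b = new Node(1, new Node(2, null));
        System.out.println(stateCode(a, 2) == stateCode(b, 2)); // prints true: same deep state
        b.next.value = 3;
        System.out.println(stateCode(a, 2) == stateCode(b, 2)); // prints false: state changed
    }
}
```

Storing the integer returned after each test evaluation is what allows a cheap comparison with the preceding state (unchanged) or with all prior states (reentrant).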

Pruning: Stack
[Table 1: Cumulative paths explored after each test cycle. Columns: Stack, baseline, except., observ., algebr.]
- Pruned: 9,300 redundant paths
- Retained: 31 significant paths (best 0.33%)

Pruning: Reservable Book
[Table 2: Cumulative paths explored after each test cycle. Columns: ResBook, baseline, except., observ., algebr. Surviving row text: memex16941]
- Pruned: 37,408 redundant paths
- Retained: 41 significant paths (best 0.12%)

Test Result Prediction
- Semi-automatic validation
  - the user confirms or rejects key results
  - these constitute a test oracle, used in prediction
  - eventually > 90% of test outcomes predicted
- JWalk test result prediction rules
  - e.g. predict repeat failure
    - new().pop().push(e) == new().pop()
  - e.g. predict same state
    - target.size().push(e) == target.push(e)
  - e.g. predict same result
    - target.push(e).pop().size() == target.size()

Kinds of Prediction
- Strong prediction
  - From known results, guarantee further outcomes in the same equivalence class
  - e.g. observer prefixes are empirically checked before making any inference, so unchanged state is guaranteed
    - target.push(e).size().top() == target.push(e).top()
- Weak prediction
  - From known facts, guess further outcomes; an incorrect guess will be revealed in the next cycle
  - e.g. methods with void type usually return no result, but may raise an exception
    - target.pop() predicted to have no result
    - target.pop().size() == -1 reveals an error

Test Confirmation – JWalk 0.8
[Figure: tree of push/top/pop call sequences grown from new. Key: confirm result, confirm error, predicted result.]
Confirm all observations, errors on all state-modifying paths

Test Confirmation – JWalk 1.0
[Figure: tree of push/top/pop call sequences grown from new. Key: confirm result, confirm error, predicted result.]
Confirm all observations, errors on all primitive algebraic constructions

Confirmations: Stack
[Table 3: Confirmations per test cycle (new oracle). Columns: Stack, v0.8 alg, v0.8 pro, v1.0 alg, v1.0 pro; last row: Total]
- JWalk 0.8: trained oracle after 57 confirmations
- JWalk 1.0: trained oracle after 34 confirmations

Confirmations: Reservable Book
[Table 4: Confirmations per test cycle (inherited oracle). Columns: ResBook, v0.8 alg, v0.8 pro, v1.0 alg, v1.0 pro. Surviving row text: memex-, Total]
- JWalk 0.8: trained oracle after 93 confirmations
- JWalk 1.0: trained oracle after 43 confirmations

Why Residual Confirmations?
- Prediction based on state equality
  - from state equivalence:
    - target.push(e).pop() == target
  - predict identical observations:
    - target.push(e).pop().size() == target.size()
- Novel states occur in longer protocols
  - JWalk has deterministic argument synthesis:
    - elements generated in order: e1, e2, …, en
  - algebraic reduction yields a novel state:
    - target.push(e1).pop().push(e2) == target.push(e2)
    - target.push(e2) != target.push(e1) from the oracle
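
The effect described on this slide can be illustrated with a toy oracle keyed on the reduced state. This is a sketch under assumptions, not JWalk's oracle: the class name, the string key and the use of integer lists as stack states are hypothetical, and a user confirmation is simulated by accepting the actual result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: an oracle that predicts observations on states it has seen before. */
final class OracleSketch {

    private final Map<String, Object> confirmed = new HashMap<>();
    int confirmations = 0; // results the user had to judge
    int predictions = 0;   // results predicted from the oracle

    /** Returns the expected result of an observer on a (reduced) state. */
    Object observe(List<Integer> state, String observer, Object actual) {
        String key = state + "#" + observer;
        if (confirmed.containsKey(key)) {
            predictions++;                 // same state, same observer: predict same result
            return confirmed.get(key);
        }
        confirmations++;                   // novel state: stands in for asking the user
        confirmed.put(key, actual);
        return actual;
    }

    static OracleSketch demo() {
        OracleSketch oracle = new OracleSketch();
        // new().push(e1).top(): first sight of state [e1], must be confirmed
        oracle.observe(Arrays.asList(1), "top", 1);
        // new().push(e1).push(e2).pop().top(): reduces to state [e1], so predicted
        oracle.observe(Arrays.asList(1), "top", 1);
        // new().push(e1).pop().push(e2).top(): reduces to [e2], novel, so confirmed again
        oracle.observe(Arrays.asList(2), "top", 2);
        return oracle;
    }

    public static void main(String[] args) {
        OracleSketch oracle = demo();
        System.out.println(oracle.confirmations + " confirmed, " + oracle.predictions + " predicted");
    }
}
```

The third observation is the residual case: the path reduces algebraically, but because arguments are synthesised in order, the reduced state holds e2 rather than e1 and the oracle has no entry for it.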

Conclusions …
- Test path pruning
  - algebraic analysis is effective at eliminating redundant paths
  - absolutely necessary when testing classes with large APIs
  - java.lang.Character: c = 1, m = 78; d3 base = 480,715 paths; alg = 79 paths, stable after 1 cycle
  - java.lang.String: c = 13, m = 64; d3 base = 54,093 paths; alg = 845 paths, stable after 1 cycle
- More test automation
  - presents user with the ideal minimal test-set for judgement
  - user only has to confirm all errors and observations on all primitive algebraic constructions

Conclusions
- Faster state exploration
  - algebra-walking finds the leaves of the algebra-tree faster
  - state-walking discovers high-level states faster, by growing only primitive state-modifying paths
  - can afford to search to greater test depths
- Test result prediction
  - algebraic analysis improves predictive power as expected
  - but oracle must also have the reduction (and may not)
  - future idea: axiom generalisation? (Henkel & Diwan)

Thank You!
- And thanks also to:
  - Wenwen Zhao – hashing on states for comparison
  - Mihai-Gabriel Glont – prototype UI for JWalkTester
  - Arne-Michael Toersel – case studies for JWalk
- Let's go JWalking!

Confirmations: Library Book
[Table 5: Confirmations per test cycle (new oracle). Columns: LibBook, v0.8 alg, v0.8 pro, v1.0 alg, v1.0 pro. Surviving row text: Total13 69]
- JWalk 0.8: depth-5 oracle after 13 confirmations
- JWalk 1.0: depth-5 oracle after 9 confirmations