Automated Combinatorial Testing for Software Rick Kuhn and Raghu Kacker National Institute of Standards and Technology Gaithersburg, MD.


What is NIST? A US Government agency The nation’s measurement and testing laboratory – 3,000 scientists, engineers, and support staff including 3 Nobel laureates Research in physics, chemistry, materials, manufacturing, computer science Analysis of engineering failures, including buildings, materials...

Software Failure Analysis NIST studied software failures in a variety of fields including 15 years of FDA medical device recall data What causes software failures? What testing and analysis would have prevented failures? Would all-values or all-pairs testing find all errors, and if not, then how many interactions would we need to test to find all errors? Surprisingly, no one had looked at this question before

Interaction testing. Parameters: Interest Rate | Amount | Months | Down Pmt | Pmt Frequency. All values: every value of every parameter. All pairs: every combination of values for every pair of parameters. t-way interactions: every combination of values for every set of t parameters.
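To make the three coverage levels concrete, here is a minimal sketch that enumerates what each one has to cover. The two values per parameter are hypothetical; the slide names the parameters but gives no values.

```python
from itertools import combinations, product

# Hypothetical value sets: the slide lists the parameter names only.
PARAMS = {
    "interest_rate": [3.0, 5.0],
    "amount": [1000, 50000],
    "months": [12, 60],
    "down_pmt": [0, 10],
    "pmt_frequency": ["monthly", "biweekly"],
}

def t_way_targets(params, t):
    """List every (parameter subset, value tuple) that t-way coverage must hit."""
    targets = []
    for names in combinations(sorted(params), t):
        for values in product(*(params[n] for n in names)):
            targets.append((names, values))
    return targets

all_values = t_way_targets(PARAMS, 1)  # every value of every parameter
all_pairs = t_way_targets(PARAMS, 2)   # every value pair for every parameter pair
three_way = t_way_targets(PARAMS, 3)   # every value triple for every parameter triple
print(len(all_values), len(all_pairs), len(three_way))
```

A single test fixes all five parameters at once, so it hits several of these targets simultaneously. That overlap is what lets a small test set reach full t-way coverage.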

How to find all failures? Interactions: e.g., failure occurs if pressure 300 (2-way interaction). Most complex failure required 4-way interaction.

How about other applications? Browser

How about other applications? Server

How about other applications? NASA distributed database

How about other applications? TCAS module (seeded errors)

Max interactions for fault triggering for these applications was 6. Wallace, Kuhn 2001 (medical devices): 98% of flaws were pairwise interactions; no fault required > 4-way interactions to trigger. Kuhn, Reilly 2002 (web server, browser): no fault required > 6-way interactions to trigger. Kuhn, Wallace, Gallo 2004 (large NASA distributed database): no fault required > 4-way interactions to trigger. Much more empirical work needed. Reasonable evidence that maximum interaction strength for fault triggering is relatively small. How can we apply what we have learned? What interactions would we need to test to find ALL faults?

Automated Combinatorial Testing Merge automated test generation with combinatorial methods Goals – reduce testing cost, improve cost-benefit ratio for software assurance New algorithms and faster processors make large-scale combinatorial testing practical Accomplishments – huge increase in performance, scalability + proof-of-concept demonstration Also non-testing application – modelling and simulation

Problem: the usual... Too much to test. Testing may exceed 50% of development cost. Even with formal methods, we still need to test. Need maximum amount of information per test. Example: 20 variables, 10 values each: 10^20 combinations. Which ones to test?

Solution: Combinatorial Testing. Pairwise testing is commonly applied to software. Suppose no failure in the previous example requires more than a pair of settings to trigger. Then test all pairs: 180 test cases are sufficient to detect any failure. Pairwise testing can find 50% to 90% of flaws. What if finding 50% to 90% of flaws is not good enough?
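The arithmetic behind the 20-variable example can be sketched as follows. The function name is illustrative. The 180-test figure comes from the slide, not from this code, which only computes the size of the problem and a simple lower bound.

```python
from math import comb

def pairwise_numbers(n_params, n_values):
    """Sizes involved in pairwise coverage of n_params parameters with n_values values each."""
    return {
        "exhaustive_tests": n_values ** n_params,
        # every pair of parameters times every pair of values
        "pair_targets": comb(n_params, 2) * n_values ** 2,
        # one test fixes all parameters, so it hits one target per parameter pair
        "targets_per_test": comb(n_params, 2),
        # any single pair of parameters already needs this many distinct tests
        "lower_bound_tests": n_values ** 2,
    }

numbers = pairwise_numbers(20, 10)
for name, value in numbers.items():
    print(f"{name}: {value:,}")
```

For 20 variables with 10 values each this gives 19,000 value pairs to cover, 190 of them hit by every test, and at least 100 tests needed. The slide's 180 tests sit between that lower bound and the 10^20 exhaustive combinations.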

A simple example: 0 = effect off, 1 = effect on. 13 tests for all 3-way combinations; 2^10 = 1,024 tests for all combinations.

A covering array: 10 parameters, 2 values each, 3-way combinations. Any 3 columns contain all possible combinations. 13 tests for all 3-way combinations; 2^10 = 1,024 tests for all combinations. So what happens for realistic examples?
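The covering property ("any 3 columns contain all possible combinations") can be checked mechanically. The sketch below pairs such a check with a naive randomized greedy generator. This is an illustrative stand-in, not the IPOG or Paintball algorithm from the talk, and its arrays may well be larger than the 13-row array on the slide.

```python
from itertools import combinations, product
import random

def uncovered(tests, n_params, t, values=(0, 1)):
    """Return the t-way (columns, settings) targets that no test row hits."""
    missing = set()
    for cols in combinations(range(n_params), t):
        seen = {tuple(row[c] for c in cols) for row in tests}
        for setting in product(values, repeat=t):
            if setting not in seen:
                missing.add((cols, setting))
    return missing

def hits(row, target):
    """True if the row has the target's settings in the target's columns."""
    cols, setting = target
    return all(row[c] == v for c, v in zip(cols, setting))

def greedy_covering_array(n_params, t, values=(0, 1), candidates=50, seed=0):
    """Naive greedy sketch: keep the candidate row that covers most missing targets."""
    rng = random.Random(seed)
    tests = []
    missing = uncovered(tests, n_params, t, values)
    while missing:
        cols, setting = next(iter(missing))  # force progress on one missing target
        best_row, best_gain = None, -1
        for _ in range(candidates):
            row = [rng.choice(values) for _ in range(n_params)]
            for c, v in zip(cols, setting):
                row[c] = v
            gain = sum(1 for target in missing if hits(row, target))
            if gain > best_gain:
                best_row, best_gain = row, gain
        tests.append(best_row)
        missing = {target for target in missing if not hits(best_row, target)}
    return tests

tests = greedy_covering_array(n_params=10, t=3)
print(len(tests), "tests; uncovered targets:", len(uncovered(tests, 10, 3)))
```

Every loop iteration covers at least one new target, so the generator always terminates. Dedicated algorithms mainly improve on how few rows are needed and how quickly they are found.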

A real-world example
No silver bullet because: many values per variable; requires more tests and practical limits; need to abstract values. But we can still increase information per test.
Input data to web application:
Plan: flt, flt+hotel, flt+hotel+car
From: CONUS, HI, AK, Europe, Asia...
To: CONUS, HI, AK, Europe, Asia...
Compare: yes, no
Date-type: exact, 1to3, flex
Depart: today, tomorrow, 1month, 1yr...
Return: today, tomorrow, 1month, 1yr...
Adults: 1,2,3,4,5,6
Minors: 0,1,2,3,4,5
Seniors: 0,1,2,3,4,5

Two ways of using combinatorial testing: use combinations in the test data inputs, or in the configuration of the system under test.

Generating covering arrays is a hard problem, one reason why anything beyond pairwise testing is rarely used. Number of tests: suppose we want all 4-way combinations of 30 parameters, 5 values each: 3,800 tests. May need 10^3 to 10^7 tests for realistic systems. With new algorithms we can produce large covering arrays quickly. Combinatorial testing requires a lot of tests, but now we can do this.

New algorithms: Paintball (Kuhn, 06) and IPOG (Lei, 06). Smaller test sets faster, with a more advanced user interface. First parallelized covering array algorithm. More information per test.
[Table: test-set size and generation time by T-Way strength for the Traffic Collision Avoidance System (TCAS), comparing IPOG with ITCH (IBM), Jenny (Open Source), TConfig (U. of Ottawa), and TVG (Open Source). Surviving entries include times of >1 hour, >12 hour, >21 hour, and >1 day with size NA.]
So what? You still have to check the results!

Result Checking
Creating test data is the easy part! How do we check that the code worked correctly on the test input?
Configuration coverage, using existing test set: easy, if test set exists.
Crash testing server or other code to ensure it does not crash for any test input: easy but limited correctness check.
Use basic consistency checks on system output: better but more costly.
White box testing, incorporating assertions in code to check critical states at different points in the code, or printing out important values during execution.
Full scale model-checking using mathematical model of system and model checker to generate expected results for each input: expensive but tractable.
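The two cheapest checks above, crash testing and basic output consistency, can be sketched as a small harness. The fare calculator, its seeded bug, and the consistency rule are all hypothetical and exist only to give the harness something to run against.

```python
from itertools import product

def quote_price(plan, adults, minors):
    """Hypothetical system under test: a toy fare calculator with a seeded 2-way bug."""
    base = {"flt": 200, "flt+hotel": 350, "flt+hotel+car": 420}[plan]
    if plan == "flt+hotel+car" and adults == 6:
        raise RuntimeError("no vehicle large enough")  # seeded crash
    return base * adults + (base // 2) * minors

def consistent(inputs, output):
    """Basic consistency check: the price is positive and at least the cheapest adult fare."""
    _plan, adults, _minors = inputs
    return output > 0 and output >= 200 * adults

def run_tests(sut, tests, check):
    """Run every test; record crashes and outputs that fail the consistency check."""
    failures = []
    for inputs in tests:
        try:
            output = sut(*inputs)
        except Exception as exc:  # crash testing: any exception counts as a failure
            failures.append((inputs, f"crash: {exc}"))
            continue
        if not check(inputs, output):
            failures.append((inputs, f"inconsistent output: {output}"))
    return failures

# For brevity the test set is a small full product; a covering array would go here.
tests = list(product(["flt", "flt+hotel", "flt+hotel+car"], [1, 6], [0, 5]))
failures = run_tests(quote_price, tests, consistent)
for inputs, reason in failures:
    print(inputs, reason)
```

The harness finds the seeded crash without knowing any correct price. That is why these checks are cheap, and also why they are limited: a wrong but plausible price would pass.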

Using model checking to produce tests. "The system can never get in this state!" "Yes it can, and here's how …" Model-checker test production: if an assertion is not true, then a counterexample is generated. This can be converted to a test case. (Black & Ammann, 1999)
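The counterexample idea can be illustrated with a toy explicit-state search. This is not a real model checker: the door-and-motor model and its flaw are invented for illustration, and breadth-first search stands in for the checker's exploration of the state space.

```python
from collections import deque

def find_counterexample(initial, transitions, claim):
    """Breadth-first search for a reachable state that violates the claim.

    Returns the list of actions leading to the violation, or None if the
    claim holds in every reachable state.
    """
    queue = deque([(initial, [])])
    visited = {initial}
    while queue:
        state, path = queue.popleft()
        if not claim(state):
            return path
        for action, nxt in transitions(state):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [action]))
    return None

def transitions(state):
    """Hypothetical model: state is (door_open, moving)."""
    door_open, moving = state
    if not moving:
        yield "open", (True, moving)
    yield "close", (False, moving)
    yield "start", (door_open, True)  # modeled flaw: no check that the door is closed
    yield "stop", (door_open, False)

def never_moves_with_door_open(state):
    door_open, moving = state
    return not (door_open and moving)

counterexample = find_counterexample((False, False), transitions, never_moves_with_door_open)
print(counterexample)
```

The returned action sequence is the "here's how": replaying it against the real system, with the violated claim as the expected-result check, turns the counterexample into a test case.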

Proof-of-concept experiments
FAA Traffic Collision Avoidance System module: Mathematical model of system and model checker for results. 41 versions seeded w/ errors, used in previous testing research. 12 variables: 7 boolean, two 3-value, one 4-value, two 10-value. Tests generated w/ Lei algorithm extended for >2 parameters. >17,000 complete test cases, covering 2-way to 6-way combinations, generated and executed in a few minutes. All flaws found with 5-way coverage.
Grid computer simulator: Preliminary results. Crashes in >6% of tests w/ valid values. "Interesting" combinations discovered.
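For scale, the size of the TCAS input space follows directly from the variable list above. The sketch below computes the exhaustive test count and the number of t-way value combinations to cover at each strength. The helper name is illustrative, and the code says nothing about how many tests a given algorithm needs.

```python
from itertools import combinations
from math import prod

# Values per variable, from the slide: 7 boolean, two 3-value, one 4-value, two 10-value.
LEVELS = [2] * 7 + [3] * 2 + [4] + [10] * 2

def target_count(levels, t):
    """Number of t-way value combinations across all sets of t variables."""
    return sum(prod(chosen) for chosen in combinations(levels, t))

exhaustive = prod(LEVELS)
print("exhaustive tests:", exhaustive)
for t in range(2, 7):
    print(f"{t}-way combinations to cover:", target_count(LEVELS, t))
```

Exhaustive testing of these 12 variables would take 460,800 tests. Pairwise coverage involves 837 value pairs and needs at least 100 tests, because the two 10-value variables alone have 100 joint settings.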

Where does this stuff make sense? More than (roughly) 8 parameters and less than Processing involves interaction between parameters (numeric or logical) Where does it not make sense? Small number of parameters (where exhaustive testing is possible) No interaction between parameters

Summary. Empirical research suggests that all software failures are caused by the interaction of a few parameters. Combinatorial testing can exercise all t-way combinations of parameter values in a very tiny fraction of the time needed for exhaustive testing. New algorithms and faster processors make large-scale combinatorial testing possible. Project could produce better quality testing at lower cost for US industry and government. Beta release of tools in December, to be open source. New public catalog of covering arrays.

Future directions No silver bullet - but does it improve cost-benefit ratio? What kinds of software does it work best on? What kinds of errors does it miss? Large real-world examples will help answer these questions Other applications: Modelling and simulation Testing the simulation Finding interesting combinations: performance problems, denial of service attacks Maybe biotech applications. Others? Rick Kuhn Raghu Kacker Please contact us if you are interested!