State coverage: an empirical analysis based on a user study
Dries Vanoverberghe, Emma Eyckmans, and Frank Piessens



Software Validation Metrics
Software defects after product release are expensive
– NIST 2002: $60 billion annually
– MS Security bulletins: around 40/year at $100k to $1M each
Validating software (Testing)
– Reduce # defects before release
– But not without a cost
Make tradeoff:
– Estimate remaining # defects
=> Software validation metrics

Example: Code coverage
Fraction of statements/basic blocks that are executed by the test suite
Principle:
– not executed => no defects discovered
Hypothesis:
– not executed => more likely contains defect
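A minimal sketch of the statement coverage fraction, assuming a hand-instrumented function. The function `absolute` and the statement labels `s1` to `s3` are hypothetical and serve only to illustrate the definition; real tools instrument code automatically.

```python
# Hypothetical instrumented function: each statement reports its own label
# when it runs. The labels s1..s3 are made up for this illustration.
ALL_STATEMENTS = {"s1", "s2", "s3"}
executed = set()


def absolute(x):
    executed.add("s1")      # s1: the comparison
    if x < 0:
        executed.add("s2")  # s2: the negation, only reached for x < 0
        x = -x
    executed.add("s3")      # s3: the return
    return x


def statement_coverage():
    """Fraction of the known statements that have been executed so far."""
    return len(executed & ALL_STATEMENTS) / len(ALL_STATEMENTS)


# A one-test suite: the negation statement s2 never runs.
assert absolute(5) == 5
coverage_one_test = statement_coverage()

# Adding a test with a negative input executes the remaining statement.
assert absolute(-5) == 5
coverage_two_tests = statement_coverage()
```

With only the first test, a defect in the negation could not be discovered because that statement is never executed, which is the principle stated on the slide.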

Example: Code coverage
High statement coverage
– No defects?
– Different paths
Structural coverage metrics:
– e.g. Path coverage, data flow coverage, …
– Measure degree of exploration
Automatic tool assistance
– Metrics evaluate tools rather than human effort

Problem statement
Exploration is not sufficient
– Tests need to check requirements
– Evaluate completeness of test oracle
Impossible to automate:
– Guess requirements
– Evaluation is critical!
No good metrics available

State coverage
Evaluate strength of assertions
Idea:
– State updates must be checked by assertions
Hypothesis:
– Unchecked state update => more likely defect

State coverage
Complements code coverage
– No replacement
Metrics also assist developers
– Code coverage => reachability of statements?
– State coverage => invariant established by reachable statements?

State coverage
Metric: State update
– Assignment to fields of objects
– Return values, local variables, … also possible
Computation:
– Runtime monitor
state coverage = (number of state updates read in assertions) / (total number of state updates)
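The runtime monitor idea can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' tool: the names `StateMonitor`, `Tracked`, `checking`, and `Appointment` are hypothetical, only field assignments count as state updates, and an update counts as checked when an assertion reads the field while that update is still the latest write.

```python
from contextlib import contextmanager


class StateMonitor:
    """Hypothetical monitor that counts state updates and checked updates."""

    def __init__(self):
        self.total_updates = 0
        self.last_update = {}   # (object id, field name) -> id of latest update
        self.checked = set()    # ids of updates that were read in an assertion
        self.in_assertion = False

    def record_write(self, obj, field):
        self.total_updates += 1
        self.last_update[(id(obj), field)] = self.total_updates

    def record_read(self, obj, field):
        if self.in_assertion:
            update = self.last_update.get((id(obj), field))
            if update is not None:
                self.checked.add(update)

    @contextmanager
    def checking(self):
        """Mark the enclosed code as assertion code."""
        self.in_assertion = True
        try:
            yield
        finally:
            self.in_assertion = False

    def state_coverage(self):
        if self.total_updates == 0:
            return 0.0  # convention chosen for this sketch only
        return len(self.checked) / self.total_updates


MONITOR = StateMonitor()


class Tracked:
    """Base class that reports field writes and reads to MONITOR."""

    def __setattr__(self, name, value):
        MONITOR.record_write(self, name)
        object.__setattr__(self, name, value)

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        MONITOR.record_read(self, name)
        return value


class Appointment(Tracked):
    def __init__(self, title, hour):
        self.title = title   # state update 1
        self.hour = hour     # state update 2

    def move_to(self, hour):
        self.hour = hour     # state update 3


appt = Appointment("review", 9)
appt.move_to(14)
with MONITOR.checking():
    assert appt.hour == 14   # reads state update 3 only

coverage = MONITOR.state_coverage()
```

In this run three state updates occur and the assertion reads only the last one, so the sketch reports a state coverage of 1/3 even though every statement of `Appointment` was executed.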

Design of experiment
Existing evaluation:
– Correlation with mutation adequacy (Koster et al.)
– Case study by expert user
Goal:
– Directly analyze correlation with ‘real’ defects
– Average users

Hypotheses
Hypothesis 1:
– When increasing state coverage (without increasing exploration), the number of discovered defects increases
– Similar to existing case study
Hypothesis 2:
– State coverage and the number of discovered defects are correlated
– Much stronger

Structure of experiment
Base program:
– Small calendar management system
– Result of software design course
– Existing test suite
– Presence of software defects unknown

Structure of experiment
Phase 1: case study
– Extend test suite to find defects
    – First increase code coverage
    – Then increase state coverage
– Dry run of experiment
    – Simplified application
    – Injected additional defects

Structure of experiment
Phase 2: Controlled user study
– Create new test suite
    – First increase code coverage
    – Then increase state coverage
– Commit after each detected defect

Threats to validity
Internal validity
– Two sessions: no differences observed
– Learning effect: subjects were familiar with environment before experiment
External validity
– Choice of application
– Choice of faults
– Subjects are students

Results
Phase 1: case study
– No additional defects discovered
– No confirmation for hypothesis 1
– Potential reasons
    – Mostly structural faults
    – Non-structural faults were obvious
Phase 2: Controlled user study
– No confirmation for hypothesis 1

Potential causes
Frequency of logical faults
– 3/20 incorrect state updates
– only 1/14 discovered!
– 5/14 are detected by assertions
– Focusing on these 5 faults
    – Higher state coverage (42% wrt 34%) for classes that detect at least one of these 5
– How common are logical faults?

Potential causes
Logical faults too obvious
– Subjects discovered them with code coverage
State coverage is not monotonic
– Adding new tests may decrease state coverage
– Always relative to exploration
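A small worked example of the non-monotonic behaviour, assuming that state updates from different tests are simply summed. The counts below are made up for illustration and are not results from the study.

```python
from fractions import Fraction


def state_coverage(tests):
    """State coverage of a suite given (checked, total) update counts per test."""
    checked = sum(c for c, _ in tests)
    total = sum(t for _, t in tests)
    return Fraction(checked, total)


# Hypothetical suite: one test performs 4 state updates and checks 3 of them.
suite = [(3, 4)]
before = state_coverage(suite)

# A new test explores more code (4 further updates) but checks only 1 of them.
suite.append((1, 4))
after = state_coverage(suite)
```

Here `before` is 3/4 and `after` is 1/2: the new test increases exploration, yet the ratio drops because most of the additional state updates are left unchecked.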

Conclusions
Experiment fails to confirm hypothesis
– How frequent are logical faults?
– Combine state coverage with code coverage?
    – Or compare test suites with similar code coverage
But also:
– Simple
– Efficient

Questions?