Automated Fitness Guided Fault Localization


Automated Fitness Guided Fault Localization Josh Wilkerson, Ph.D. candidate Natural Computation Laboratory

Technical Background

Software testing
- An essential phase of software development
- Subjects the software to test cases to expose errors
- The errors must then be located in the code

Fault localization
- The most expensive component of software debugging [1]
- Tools and techniques are available to assist
- Automating this process is a very active research area

Technical Background

Fitness Function (FF)
- Quantifies program performance
- Sensitive to all objectives of the program
- Graduated (not simply pass/fail)
- Can be generated from:
  - Formal/informal specifications
  - An oracle (i.e., the software developer)
- Correct execution: high/maximum fitness
- Incorrect execution: the FF quantifies how close to being correct the execution was
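As a concrete illustration (not taken from the presentation), a graduated FF for a sorting routine might score how close an output is to being sorted:

```python
def sortedness_fitness(output):
    """Graduated fitness for a sorting program: the fraction of adjacent
    pairs in non-decreasing order. 1.0 marks a fully correct execution;
    lower values quantify how close the run was to being correct."""
    if len(output) < 2:
        return 1.0
    ordered = sum(1 for a, b in zip(output, output[1:]) if a <= b)
    return ordered / (len(output) - 1)
```

A run that produces [1, 3, 2] scores 0.5 rather than simply "fail", which is the graduated signal the FGFL techniques below rely on.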

The FGFL System

Fitness Guided Fault Localization (FGFL) system
- A novel application of FFs to fault localization
- An ensemble of techniques; three techniques currently implemented
- Modular test case generation component
  - Currently random, with promotion for full fitness-range coverage

FGFL Technique: Trace Comparison

Trace Comparison Technique
- An enhanced version of the execution slice comparison fault localization technique [2]
- Execution trace: the list of lines executed in a given run of a program
- Positive test case: results in correct program execution (as indicated by the FF)
- Negative test case: results in incorrect program execution (as indicated by the FF)
- Basic concept:
  - Lines unique to negative test case traces → highly suspicious
  - Lines shared by positive and negative test case traces → moderately suspicious
  - Lines unique to positive test case traces → not suspicious
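The classification above is plain set arithmetic; a minimal sketch (with hypothetical inputs, where each trace is a set of executed line numbers):

```python
def trace_comparison(pos_traces, neg_traces):
    """Classify lines by which kinds of traces executed them:
    only negative runs -> highly suspicious, both -> moderately
    suspicious, only positive runs -> not suspicious."""
    pos_lines = set().union(*pos_traces)
    neg_lines = set().union(*neg_traces)
    return {
        "high": neg_lines - pos_lines,
        "moderate": neg_lines & pos_lines,
        "none": pos_lines - neg_lines,
    }
```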

FGFL Technique: Trace Comparison

Example: a bubble sort with an error seeded on line 6 (which should read data[j+1] = temp):

1: Loop(i = 1…Size)
2: Loop(j = 1…Size-1)
3: If(data[j] > data[j+1])
4: temp = data[j]
5: data[j] = data[j+1]
6: data[j+1] = data[j]

The slide showed this listing three times: once for a positive test case and once for a negative test case (bold lines indicated execution), and once for the result, with the highly suspicious and moderately suspicious lines highlighted. The highlighting does not survive in this transcript.

FGFL Technique: TBLS

Trend Based Line Suspicion (TBLS) Technique
- Based on the algorithm used by the Tarantula fault localization technique [3,4,5]
- Basic concept: lines containing an error will be executed more by negative test cases
- Each line has an associated suspicion level
- For each execution trace, a Suspicion Adjustment Amount (SAA) is calculated based on the fitness of the execution:
  - High fitness: negative SAA
  - Low fitness: positive SAA
- The SAA is added to the suspicion of every line in the trace
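The slide does not give the exact SAA formula; the sketch below assumes a simple linear scaling in which a perfect run contributes -1 to each covered line and a completely failing run contributes +1:

```python
def tbls(traces, fitnesses, max_fitness=1.0):
    """Trend Based Line Suspicion: every trace adjusts the suspicion of
    each line it covers. High-fitness runs yield a negative SAA,
    low-fitness runs a positive SAA."""
    suspicion = {}
    for trace, fitness in zip(traces, fitnesses):
        # Assumed linear scaling: fitness 1.0 -> SAA -1, fitness 0.0 -> SAA +1
        saa = 1.0 - 2.0 * (fitness / max_fitness)
        for line in trace:
            suspicion[line] = suspicion.get(line, 0.0) + saa
    return suspicion
```

Lines covered only by failing runs accumulate positive suspicion; lines covered by both kinds of runs have their adjustments partially cancel.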

FGFL Technique: TBLS

Example: five test cases are used, one positive and four negative (of varying performance). The result slide annotated each line of the bubble sort listing with its resulting suspicion level; only fragments of those values (-1, 5) survive in this transcript.

FGFL Technique: Fitness Monitor

Run-time Fitness Monitor Technique
- A novel technique
- Basic concept: lines that consistently cause a decrease in fitness are likely to contain an error
- The program is instrumented to enable calculation of fitness after every line execution
- Fitness-fluctuation lines are found in the trace
- Fitness regions are generated around the fluctuation lines:
  - Start with just the fluctuation line
  - Expand outward until fitness becomes stable
- If executing the region results in an overall drop in fitness, the fluctuation line becomes more suspicious
- Otherwise the suspicion is unchanged
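A simplified sketch of the core idea (omitting the region-expansion step, and assuming the instrumentation yields one fitness value per executed line):

```python
def fitness_drop_suspicion(trace, fitness_after):
    """Score each line by how consistently its execution is followed by
    a drop in fitness. trace[i] is the i-th executed line number and
    fitness_after[i] the fitness computed right after it ran."""
    drops, runs = {}, {}
    prev = None
    for line, fit in zip(trace, fitness_after):
        runs[line] = runs.get(line, 0) + 1
        if prev is not None and fit < prev:
            drops[line] = drops.get(line, 0) + 1
        prev = fit
    # Suspicion: fraction of a line's executions that lowered fitness
    return {line: drops.get(line, 0) / runs[line] for line in runs}
```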

FGFL Technique: Fitness Monitor

Example: the slide plotted fitness over time for five test cases alongside the six-line bubble sort listing; the plots do not survive in this transcript.

FGFL: Result Combination

- Technique results are combined using a voting system
- Each technique is given an equal number of votes, equal to the number of lines in the program
- Techniques do not have to use all of their votes
- Suspicion values are adjusted to reflect confidence in the result:
  - Suspicion values are scaled relative to the maximum suspicion possible for the technique
  - Votes are applied in proportion to the confidence of the result
- This method was chosen to help reduce misleading results
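One plausible reading of this scheme (the exact vote formula is not given on the slide; the confidence scaling below is an assumption):

```python
def combine_votes(technique_results, num_lines):
    """Combine per-technique suspicion maps with a voting system.
    Each technique holds num_lines votes; votes go to suspicious lines
    in proportion to suspicion, discounted by the technique's confidence
    (suspicion relative to its own maximum). Votes may go unused."""
    totals = {}
    for suspicion in technique_results:
        positive = {line: s for line, s in suspicion.items() if s > 0}
        if not positive:
            continue  # the technique abstains entirely
        max_susp = max(positive.values())
        weight_sum = sum(positive.values())
        for line, susp in positive.items():
            votes = num_lines * (susp / weight_sum) * (susp / max_susp)
            totals[line] = totals.get(line, 0.0) + votes
    # Most-voted lines first
    return sorted(totals, key=totals.get, reverse=True)
```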

Experimental Setup

- Statistical analysis program (46-50 lines)
- Seven versions, each with error(s) seeded; a variety of error types
- 10 sets of test cases generated, each consisting of 75 test cases
- Goals:
  - Proof of the FGFL concept
  - Expose strengths and weaknesses of the techniques
  - Determine synergies between techniques and the strength of the ensemble approach

Results

- Program lines are ordered from most suspicious to least
- The slide tabulated the average rank of the line(s) containing the error in this ordered list, per program version (1-7), for each technique (Trace Comp., TBLS, Fitness Mon.) and combination (Trace Comp. & TBLS, TBLS & Fitness Mon., Trace Comp. & Fitness Mon., All); the table's alignment does not survive in this transcript
- Best performance: the full ensemble, and trace comparison paired with the fitness monitor
- Program version 6 has an incorrect branch predicate, making a branch unreachable; handling it needs a technique based on statement reachability

Ongoing Work

- The techniques are still under active development
- Investigating the enhancement of other state-of-the-art fault localization techniques through a FF
- Developing new techniques that exploit the FF
- Using a multi-objective FF with FGFL
- Testing FGFL on well-known benchmark problems

References

[1] I. Vessey, "Expertise in Debugging Computer Programs: An Analysis of the Content of Verbal Protocols," IEEE Transactions on Systems, Man and Cybernetics, vol. 16, no. 5, pp. 621–637, September 1986.

[2] H. Agrawal, J. R. Horgan, S. London, and W. E. Wong, "Fault localization using execution slices and dataflow tests," in Proceedings of the 6th IEEE International Symposium on Software Reliability Engineering, 1995, pp. 143–151.

[3] J. A. Jones, M. J. Harrold, and J. Stasko, "Visualization of test information to assist fault localization," in Proceedings of the 24th International Conference on Software Engineering. New York, NY, USA: ACM, 2002, pp. 467–477.

[4] J. A. Jones and M. J. Harrold, "Empirical evaluation of the Tarantula automatic fault-localization technique," in Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering. New York, NY, USA: ACM, 2005, pp. 273–282.

[5] J. A. Jones, "Semi-automatic fault localization," Ph.D. dissertation, Georgia Institute of Technology, 2008.

Questions?