IMPROVED FITNESS FUNCTIONS FOR AUTOMATED PROGRAM REPAIR ZACHARY P. FRY.



IMPROVED FITNESS FUNCTIONS
Automatic program repair can fix bugs.
[Diagram: Bugs, GenProg, Fixes]

IMPROVED FITNESS FUNCTIONS
The current fitness model is imprecise.
Ideas:
- Not all test cases are created equal
- Test cases may not describe all relevant program behavior
- Different types of bugs might benefit from different kinds of fixes
We propose to address the naivety of the current fitness representation.

FITNESS DISTANCE CORRELATION
“Quantifying the extent to which a GA fitness function approaches an ideal of heuristic search” [1]
Informally: does a given fitness function produce values that correlate with some grounded notion of “closeness to a fix”?

[1] T. Jones and S. Forrest. Fitness distance correlation as a measure of problem difficulty for genetic algorithms. In International Conference on Genetic Algorithms, pages 184–192, 1995.
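As an illustration of the idea on this slide, fitness distance correlation can be computed as the Pearson correlation between each candidate's fitness value and its measured distance to a known solution. The sketch below is a minimal version under that assumption; the function name and the sample values are hypothetical and do not come from the talk.

```python
from math import sqrt


def fitness_distance_correlation(fitness, distance):
    """Pearson correlation between fitness values and distances to a known fix.

    Both arguments are equal-length sequences of numbers, one entry per
    candidate program. (Hypothetical helper, not GenProg code.)
    """
    n = len(fitness)
    if n != len(distance) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_f = sum(fitness) / n
    mean_d = sum(distance) / n

    # Population covariance and standard deviations.
    cov = sum((f - mean_f) * (d - mean_d) for f, d in zip(fitness, distance)) / n
    sd_f = sqrt(sum((f - mean_f) ** 2 for f in fitness) / n)
    sd_d = sqrt(sum((d - mean_d) ** 2 for d in distance) / n)

    if sd_f == 0 or sd_d == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sd_f * sd_d)


# Made-up values: fitness rises exactly as distance to the fix shrinks.
example_r = fitness_distance_correlation([1, 2, 3, 4], [4, 3, 2, 1])
```

With distance as the second variable, an informative fitness function gives a strongly negative value; if closeness to a fix is used instead, the sign flips and values near +1 are the ideal.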

IMPROVED FITNESS FUNCTIONS
Measuring proximity to a fix
Edits: inserting, deleting, and swapping lines in the program
[Diagram: path from NO FIX to FIX, labeled d(135) i(251,205) i(774,111) s(598,324)]
Candidate: i(251,205) i(774,111) s(598,324) d(63)   ✓✓✓✗   75%
Candidate: d(84) s(844,265) i(774,111) i(735,431)   ✓✗✗✗   25%
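The percentages on this slide are consistent with scoring a candidate by the fraction of its edits that also appear in the known fix. The sketch below assumes that reading, and also assumes the fix consists of the four edits shown along the NO FIX to FIX path; the tuple encoding and the function name are illustrative only.

```python
def proximity_to_fix(candidate, fix):
    """Fraction of the candidate's edits that also occur in the known fix.

    Edits are tuples such as ("d", 135) for a delete or ("i", 251, 205)
    for an insert. This encoding is an assumption made for illustration.
    """
    if not candidate:
        return 0.0
    fix_edits = set(fix)
    matches = sum(1 for edit in candidate if edit in fix_edits)
    return matches / len(candidate)


# Edit lists transcribed from the slide (interpretation, see lead-in).
known_fix = [("d", 135), ("i", 251, 205), ("i", 774, 111), ("s", 598, 324)]
candidate_a = [("i", 251, 205), ("i", 774, 111), ("s", 598, 324), ("d", 63)]
candidate_b = [("d", 84), ("s", 844, 265), ("i", 774, 111), ("i", 735, 431)]

score_a = proximity_to_fix(candidate_a, known_fix)  # 3 of 4 edits match
score_b = proximity_to_fix(candidate_b, known_fix)  # 1 of 4 edits match
```

Under these assumptions the two candidates score 0.75 and 0.25, matching the 75% and 25% shown on the slide.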

IMPROVED FITNESS FUNCTIONS
The current model of fitness does not correlate well with proximity to a fix (correlation of 0.145).
Hypothesis: by taking into account previously unused information about test cases, bugs, and fixes, we can better inform the evolutionary bug-fixing process to fix bugs faster and more often.

IMPROVED FITNESS FUNCTIONS
Approach: weight test cases based on known fixes
[Diagram: Test Case 1, Test Case 2; FIX, NO FIX]
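One way to picture test-case weighting is a fitness score in which each passing test contributes its weight rather than a flat count. The sketch below is an assumption about how such a score could look, not the scheme used in the talk; the test names and weights are hypothetical, and in practice the weights would be derived from historical fixes.

```python
def weighted_fitness(results, weights):
    """Weighted fraction of passing tests.

    results: dict mapping test name -> True if the candidate passes it.
    weights: dict mapping test name -> non-negative weight.
    Both the names and the weights are hypothetical placeholders.
    """
    total = sum(weights[name] for name in results)
    if total == 0:
        return 0.0
    passed = sum(weights[name] for name, ok in results.items() if ok)
    return passed / total


# Hypothetical weights: test_case_1 is treated as three times as informative.
weights = {"test_case_1": 3.0, "test_case_2": 1.0}
candidate_results = {"test_case_1": True, "test_case_2": False}

score = weighted_fitness(candidate_results, weights)
unweighted = weighted_fitness(candidate_results, {"test_case_1": 1.0, "test_case_2": 1.0})
```

With equal weights the same candidate scores 0.5, so the weighting changes how strongly the search is pulled toward candidates that pass the more informative test.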

IMPROVED FITNESS FUNCTIONS
Evaluation:
How much can we speed up fixes?
- Computational time and monetary cost
- Preliminary results
How many more bugs can we fix?
- Fraction of previously unfixed bugs
- Future work

PRELIMINARY RESULTS
For a sample of 15 bugs from one program, 31.3% of test cases show no correlation with actual fitness (closeness to a fix).

Bug                              Avg. Percent Time Savings
libtiff-bug-0fb6cf7-b4158fa      24.47%
libtiff-bug-01209c9-aaf9eb3      50.49%
libtiff-bug-10a                  %
libtiff-bug-5b dfb33b            22.08%
libtiff-bug-8f6338a-4c5a9ec      62.24%
Total:                           38.96%

PRELIMINARY RESULTS
- Some test cases are over 23x more correlated with actual fitness than others
- Suggests an adequate weighting scheme using machine learning could fix more bugs, faster
This work and additional efforts to investigate other strategies for improving fitness functions are ongoing.

APPLICABILITY
When might this work?
- Programs with expensive test suites, e.g. PHP (12,000+)
- When there is heavy overlap between test cases
- Test suites/cases that fail to specify the bug
Assumptions?
- Presence of historical bug fix data to mine
- Test suites do not evolve drastically from bug to bug
- Bugs for a given program are related on some level

GOALS
By providing GenProg a better signal for mutants’ fitness we hope to:
- Better direct the search: arrive at fixes faster, lowering cost (up to 38%)
- In the limit, find more fixes for previously unfixed bugs

QUESTIONS?