Automatic Program Repair With Evolutionary Computation Westley Weimer Computer Science Dept. University of Virginia Charlottesville, VA 22904

Presentation transcript:

Automatic Program Repair With Evolutionary Computation
Westley Weimer, Computer Science Dept., University of Virginia, Charlottesville, VA
Stephanie Forrest, Dept. of Computer Science, University of New Mexico, Albuquerque, NM
Claire Le Goues, Computer Science Dept., University of Virginia, Charlottesville, VA
ThanhVu Nguyen, Dept. of Computer Science, University of New Mexico, Albuquerque, NM
Presented by: Teodoro Rosati, CIS 601, Spring 2014, March 4, 2014
The material in this paper is taken from two original publications, titled “A Genetic Programming Approach to Automated Software Repair” (Genetic and Evolutionary Computation Conference, 2009) and “Automatically Finding Patches Using Genetic Programming” (Proceedings of the 2009 IEEE 31st International Conference on Software Engineering, IEEE Computer Society).

What’s the Problem?
Finding bugs is relatively easy…
Famous costly bugs:
–Intel Pentium FDIV bug (1994). Cost: $500 million. The floating-point unit had a flawed division table.
–Y2K (1999). Cost: $500 billion. Years were stored as 2 digits (e.g. 95), so “00” was read as 1900.
–Mars Climate Orbiter (1998). Cost: $125 million. Imperial units (lbs of force) were used where metric units (newtons) were expected.
Many techniques and software solutions exist for detecting and mitigating software errors:
–Syntactic Bug Pattern Detection
–Decompilation and Data Flow Analysis
–Automated Theorem Proving
–Model Checking

I Found The Bug
Here’s the catch… now I have to fix it

One solution… Manual Program Repair
–Up to 90% of the total cost of a software project goes to maintenance after delivery: modifying existing code, repairing defects, and maintaining code during its lifecycle
–Products are often shipped with known and unknown bugs because of a lack of development resources

A better solution… Automatic Program Repair:
–Traditional Program Analysis Methods
–Evolutionary Computation (Genetic Programming)

Experimental Design
Genetic Programming
–Evolves computer programs tailored to a task
–That is, a program is modified using operators analogous to those of biological evolution (mutation and crossover)
–GP techniques have been applied to unannotated, off-the-shelf legacy C programs
–Individual variants V with the highest fitness are selected for continued evolution

Evolutionary Computation: Genetic Programming
Inspired by biological natural selection
–Endler’s guppy experiment:
  Diverse source population: guppies variously colored
  Natural selection on population: habitat variation (coarse vs. fine gravel) and predator presence (sight-based)
–Evolution of drug-resistant bacteria:
  Source = normal + resistant bacteria
  Antibiotics promote resistant bacteria

Evolutionary Computation: Genetic Programming
Natural selection and programs
–Population of Program Variants (Variants/Individuals)

Genetic Programming Repair: Technical Approach
1. What is it doing wrong?
–Input a set of negative (-) test cases that characterizes a fault
2. What is it supposed to do?
–Input a set of positive (+) test cases that encode functionality requirements
3. Where should we change it?
–Program locations visited by the (-) test cases
4. How should we change it?
–Insert, delete and swap program statements and control flow. Insertions are preferred
5. When are we finished?
–First variant that passes all (+) and (-) cases
–Minimize differences between variant and original

Automatic Program Repair: Representation
Abstract Syntax Tree (AST)
–C programs
Genes: statements are the basic units
–Conditional: “if (x>y) {max = x;}”
–Expressions within a statement: “{max = x;}”
–Selection
Actions: Insert, Delete, Swap of genes
[Figure: the gene “if ( x > y ) { max = x; }” shown next to its AST]
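
To make the statement-as-gene idea concrete, here is a minimal sketch that uses Python's standard `ast` module as a stand-in for a C abstract syntax tree. The slides describe C programs, so the Python source string, the variable `max_value`, and the helper name `statement_genes` are all assumptions made for illustration only.

```python
import ast

# Python analogue of the slide's C conditional "if (x>y) {max = x;}".
SOURCE = "if x > y:\n    max_value = x\n"


def statement_genes(source):
    """Return every statement node in the AST; each one is treated as a gene."""
    tree = ast.parse(source)
    # ast.walk visits nodes breadth-first, so outer statements come first.
    return [node for node in ast.walk(tree) if isinstance(node, ast.stmt)]


genes = statement_genes(SOURCE)
kinds = [type(gene).__name__ for gene in genes]
print(kinds)
```

The conditional and the assignment nested inside it are both statements, so both are candidate genes; the comparison `x > y` is an expression and is not selected on its own.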

Genetic Programming: Mutation Operators
Mutation Operator
–Insert, Delete or Swap a gene

Genetic Programming: Crossover Operators
Crossover Operator
–Between 2 Parent Sub-Trees
Crossback Operator
–Between Variant V and Parent
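
The crossover idea can be sketched on a simplified representation. The code below assumes each program is a flat list of statement strings of equal length and performs a plain one-point crossover; it is not the authors' sub-tree operator, and the function name and sample statements are hypothetical.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length statement lists at a random cut."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("this sketch assumes equal-length parents of size >= 2")
    # The cut is strictly inside the list, so each child mixes both parents.
    cut = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


rng = random.Random(0)
variant = ["a1", "a2", "a3", "a4"]
original = ["b1", "b2", "b3", "b4"]
child_a, child_b = one_point_crossover(variant, original, rng)
print(child_a, child_b)
```

Under the same simplification, a crossback is the same call with a variant as one parent and the original program as the other, as in the usage lines above.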

Measuring Fitness
–Variants are compiled
–Test cases are evaluated in a virtual machine/sandbox
–Fitness is measured using the formula:
\[ \mathrm{fitness}(P) = W_{PosT} \times |\{ t \in PosT : P \text{ passes } t \}| + W_{NegT} \times |\{ t \in NegT : P \text{ passes } t \}| \]
Note: W_{PosT} = weight of each successful positive test; W_{NegT} = weight of each successful negative test
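
The weighted sum can be written directly as a small function. This is a sketch under stated assumptions: `passes` is a hypothetical callback that reports whether the variant passes a given test, and the default weights of 1 and 10 are illustrative values, not figures taken from the slides.

```python
def fitness(passes, pos_tests, neg_tests, w_pos=1.0, w_neg=10.0):
    """Weighted count of the positive and negative tests a variant passes."""
    pos_passed = sum(1 for test in pos_tests if passes(test))
    neg_passed = sum(1 for test in neg_tests if passes(test))
    return w_pos * pos_passed + w_neg * neg_passed


# Hypothetical variant that passes two of three positive tests and the
# single negative (bug-revealing) test.
passing = {"p1", "p2", "n1"}
score = fitness(lambda test: test in passing, ["p1", "p2", "p3"], ["n1"])
print(score)
```

Giving negative tests a larger weight rewards variants that make progress on the fault itself, while the positive term penalizes variants that break required behavior.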

For example…
December 31, 2008
–A bug was reported in Microsoft Zune media players
–The Zune would freeze when the value of the input days was the last day of a leap year (e.g. 10,593) → INFINITE LOOP!

1. What is it doing wrong? → Negative Test Cases
Negative Test Case
–input days set to 10,593 (last day of leap year)
–program executes lines 1–16
–then repeats lines 3, 4, 8 and 11 infinitely
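
The failing behavior can be reproduced in a few lines. What follows is a Python port of the widely circulated Zune date routine, not the C listing that the slide's line numbers refer to; the names `zune_year` and `is_leap_year` and the `max_iterations` guard are assumptions added here so that the infinite loop surfaces as an exception instead of hanging.

```python
def is_leap_year(year):
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def zune_year(days, max_iterations=10_000):
    """Convert days since January 1, 1980 into a year (buggy on purpose)."""
    year = 1980
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            # Guard added for this sketch only; the original has no exit here.
            raise RuntimeError("loop did not terminate")
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # Bug: when days == 366 in a leap year, nothing changes.
        else:
            days -= 365
            year += 1
    return year


print(zune_year(1000))  # a positive test case terminates normally
```

Calling `zune_year(10593)` reaches the year 2008 with exactly 366 days left, so neither branch changes `days` and the guard raises `RuntimeError`.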

2. What’s it supposed to do? → Positive Test Cases
Positive Test Case
–input days set to 1,000 (non-leap year)
–program executes lines 1–8 once, as expected

3. Where should we change it?
Program locations visited when executing the negative test cases
–lines 3, 4, 8 and 11
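
One way to turn visited locations into mutation targets is to weight each statement on the negative path. The sketch below assumes that statements also visited by positive tests receive a lower weight; the function name `weighted_path` and the weights 1.0 and 0.1 are hypothetical choices for illustration, not values from the slides.

```python
def weighted_path(neg_visited, pos_visited, w_neg_only=1.0, w_shared=0.1):
    """Map each statement on the negative path to a mutation weight."""
    pos = set(pos_visited)
    weights = {}
    for stmt in neg_visited:
        # Statements that positive tests also execute are less suspicious.
        weights[stmt] = w_shared if stmt in pos else w_neg_only
    return weights


# Line numbers from the slides: the negative test loops over 3, 4, 8 and 11,
# and the positive test executes lines 1-8.
weights = weighted_path([3, 4, 8, 11], range(1, 9))
print(weights)
```

With these inputs only line 11 is visited exclusively by the negative test, so it becomes the most likely place for a mutation.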

4. How should we change it? → INSERT
Insert an entire statement or gene
–stmt j is added to stmt i

4. How should we change it? → DELETE
Delete an entire statement or gene
–stmt i is transformed into an empty block statement

4. How should we change it? → SWAP
Swap an entire statement or gene
–A second statement, stmt j, is chosen uniformly at random from anywhere in the program to replace stmt i
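
The three mutation operators can be sketched on a flat list of statement strings. This is a simplification of the AST-based operators on the slides: the function name `mutate`, the `EMPTY_BLOCK` marker, and the choice to exchange stmt i and stmt j in a swap are assumptions made for illustration.

```python
import random

EMPTY_BLOCK = "{}"  # stands in for an empty block statement


def mutate(program, i, op, rng):
    """Return a mutated copy of program, leaving the input list unchanged."""
    variant = list(program)
    j = rng.randrange(len(program))  # second statement, chosen uniformly
    if op == "insert":
        variant.insert(i + 1, program[j])  # add a copy of stmt j after stmt i
    elif op == "delete":
        variant[i] = EMPTY_BLOCK  # stmt i becomes an empty block
    elif op == "swap":
        variant[i], variant[j] = variant[j], variant[i]
    else:
        raise ValueError(f"unknown operator: {op}")
    return variant


rng = random.Random(1)
program = ["a", "b", "c"]
deleted = mutate(program, 1, "delete", rng)
inserted = mutate(program, 1, "insert", rng)
swapped = mutate(program, 1, "swap", rng)
print(deleted, inserted, swapped)
```

All three operators reuse statements already present in the program, which matches the stated assumption that the existing program can provide the repair statements.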

5. When are we finished? → Minimize Differences
Variant V passes all the test cases
–Minimization step to discard unnecessary changes
–Average repair time = 42 seconds
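
The minimization step can be illustrated with a greedy single pass that drops one edit at a time and keeps the drop whenever the tests still pass. This is a stand-in technique chosen for brevity, not necessarily the procedure the authors used; `minimize`, `still_repairs`, and the edit strings are hypothetical.

```python
def minimize(edits, still_repairs):
    """Greedily discard edits that are not needed for the repair."""
    kept = list(edits)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if still_repairs(trial):
            kept = trial  # edit i was unnecessary; retry the same index
        else:
            i += 1  # edit i is required; move on
    return kept


# Hypothetical repair in which only the inserted statement matters.
edits = ["insert: days -= 366", "delete: printf", "swap: stmt 2, stmt 9"]
needed = minimize(edits, lambda es: "insert: days -= 366" in es)
print(needed)
```

In practice `still_repairs` would apply the candidate edits, recompile, and rerun every positive and negative test, so each call is expensive and the number of calls matters.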

Performance of the Zunebug Repair…
Evolution of the Zunebug repair during 1 GP trial
–The darker curve plots the average fitness of the population
–The lighter curve plots the fitness of the individual V (the primary repair)

Performance of the Zunebug Repair…
Evolution of the Zunebug repair with 20 positive and 4 negative test cases (equally weighted)
–The boxes represent the average over 70 distinct trials
–The error bars represent one standard deviation

Performance of the GP Algorithm…
Eleven defects repaired by Genetic Programming

Performance of the GP Algorithm…
GP search time scales with weighted path size.
–18 programs successfully repaired by GP (average of 100 runs)
–x-axis: log10 of the weighted path length
–y-axis: log10 of the total number of fitness evaluations

Caveats…
Limitations (assumptions)
–defect is reproducible
–program behaves deterministically on test cases
–positive test cases encode program requirements
–no overlap in the paths taken by negative and positive test cases
–existing program can provide repair statements
Evolution
–Not rigorously tested (parameter values, selection strategies, and operator design)
Fault Localization
–Critical to finding viable fixes, but poorly understood
Fitness Function
–Oversimplification
Repair Quality
–Dependent on a high-quality set of positive test cases

Future work…
–Generic set of repair templates for GP as a source of code for mutations
–Extend with data structure definitions and variable declarations
–Assembly- and bytecode-level repairs
–Testing on more sophisticated errors, e.g. race conditions
–Assessing size and distribution of bugs for targeting

Questions