Masatomo Hashimoto Akira Mori Tomonori Izumida

Presentation transcript:

Automated Patch Extraction via Syntax- and Semantics-Aware Delta Debugging on Source Code Changes
Masatomo Hashimoto (Chiba Institute of Technology)
Akira Mori (National Institute of Advanced Industrial Science and Technology)
Tomonori Izumida (IIJ Innovation Institute)
ESEC/FSE 2018, 11/8/2018

Software Regression
  Working version Vgood: test PASSED
  Faulty version Vbad: test FAILED
  Changes (delta) between Vgood and Vbad

Real Regression Bugs

Delta Debugging on Source Code Changes
"Yesterday, my program worked. Today, it does not. Why?" [Zeller, 1999]
  1. Compare source code (Vgood vs. Vbad)
  2. Decompose delta
  3. Bisect, select, patch
  4. Build and test
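The bisect-and-test loop can be sketched as a simplified ddmin-style minimization in the spirit of [Zeller-Hildebrandt, 2002]. This is only an illustration, not the DD.py used by DDJ: the names `split`, `ddmin`, and `run_test` are hypothetical, changes are plain integers, and `run_test` stands in for "patch, build, and test". In this toy setup the failure needs changes 3 and 6 together, and applying change 1 without change 2 breaks the build.

```python
PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"

def split(items, n):
    """Split items into at most n contiguous, non-empty chunks."""
    chunks, start = [], 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        chunks.append(items[start:end])
        start = end
    return [c for c in chunks if c]

def ddmin(changes, test):
    """Shrink a failing change set until no chunk or complement still fails."""
    assert test(changes) == FAIL
    n = 2
    while len(changes) >= 2:
        chunks = split(changes, n)
        reduced = False
        for chunk in chunks:                      # try each chunk alone
            if test(chunk) == FAIL:
                changes, n, reduced = chunk, 2, True
                break
        if not reduced:
            for chunk in chunks:                  # try each complement
                rest = [c for c in changes if c not in chunk]
                if test(rest) == FAIL:
                    changes, n, reduced = rest, max(n - 1, 2), True
                    break
        if not reduced:
            if n >= len(changes):                 # finest granularity reached
                break
            n = min(len(changes), 2 * n)          # increase granularity
    return changes

def run_test(applied):
    """Toy oracle (assumption): stands in for patch + build + test."""
    if 1 in applied and 2 not in applied:
        return UNRESOLVED                         # e.g. a build failure
    if 3 in applied and 6 in applied:
        return FAIL                               # failure needs both changes
    return PASS

result = ddmin(list(range(1, 9)), run_test)
print(result)
```

Every UNRESOLVED outcome is a test execution that yields no reduction, which is why the number of build failures matters for the cost of the search.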

Is DD Practical for Real Regressions?
Complexity of DD [Zeller-Hildebrandt, 2002]
  Test executions w.r.t. number of hunks (delta components): logarithmic (best case) to quadratic (worst case)
Factors that increase complexity [Zeller, 1999]
  Inconsistency: test may result in UNRESOLVED
  ...
Example
  Delta decomposed into 952 hunks
  826 test executions (8+ hours on AWS (16 cores, 8GB RAM/core))
  722 build failures (build success rate: 0.127)

Complexity Factors of DD [Zeller, 1999]
  Inconsistency: UNRESOLVED test results exist
  Interference: a combination of multiple components induces the failure
  Non-monotony: non-failure-inducing changes contain failure-inducing changes
  Ambiguity: disjoint sub-deltas induce the failure

Causes of Build Failures
Applying only one of two hunks that depend on each other breaks the build:
  Syntax errors
      @@ ... @@
      - try {
      + synchronized(a){
      @@ ... @@
        } finally { w.unlock(); }
  "cannot find symbol" errors
      @@ ... @@
      - void m() { ... }
      @@ ... @@
      - m();
  "unreported exception" errors
      @@ ... @@
        void m()
      +   throws Abort {
      @@ ... @@
      + e = new Abort();
      + throw e;
  ...

Syntax- and Semantics-Aware Decomposition to Avoid Build Failures
Fine-grained delta decomposition into atomic changes
  Abstract syntax tree (AST) instead of plain text
  Deletion, insertion, relabeling, and move of AST nodes
Rule-based coupling of atomic changes
  By syntactic constraints
  By refactoring patterns
  By other change patterns

Comparing Abstract Syntax Trees (ASTs)
Textual diff:
    @@ -1,3 +1,4 @@
    -if (x||f(a))
    -   return
    +if (!(x&&a))
        break
    +   return
Changes identified on the AST:
  Condition changed
  Call removed
  Negation inserted
  Then-else interchanged
Node-level edit operations shown in the figure: inserted, deleted, relabeled, moved
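To give a feel for why a tree view is finer-grained than a line diff, the sketch below uses Python's standard `ast` module on Python analogues of the two conditions (`x or f(a)` for `x||f(a)`, `not (x and a)` for `!(x&&a)`). It is an analogy only: it merely counts node types on each side, which is far cruder than the tree differencing described here (it cannot detect moves or relabelings), and `node_kinds` is a hypothetical helper name.

```python
import ast
from collections import Counter

def node_kinds(source):
    """Count AST node types in a Python expression, skipping Load/Store contexts."""
    tree = ast.parse(source, mode="eval")
    return Counter(
        type(node).__name__
        for node in ast.walk(tree)
        if not isinstance(node, ast.expr_context)
    )

original = node_kinds("x or f(a)")       # analogue of x || f(a)
modified = node_kinds("not (x and a)")   # analogue of !(x && a)

# Counter subtraction keeps only positive counts.
removed = original - modified            # node kinds that disappeared
added = modified - original              # node kinds that appeared

print("removed:", dict(removed))
print("added:  ", dict(added))
```

Even this crude comparison separates the removed call from the added negation, whereas the textual diff only reports that the whole `if` line was replaced.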

Tree Hunks
  Delete hunk
  Move hunk
  Insert hunk

Rules for Coupling Tree Hunks
Syntactic rules
  E.g. try changed to synchronized ⬌ finally deleted
Refactoring-based rules
  E.g. parameter removed (Remove Parameter) ⬌ argument removed
Change-based rules
  E.g. uncaught throw added ⬌ throws added
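One way to picture rule-based coupling is a lookup table over pairs of hunk kinds. The encoding below is purely illustrative: the slides do not show how DDJ represents its rules, and the record fields (`id`, `kind`, `target`), the kind names, and the `couple` function are all assumptions made for this sketch. Two hunks are linked when a rule matches their kinds and both concern the same program entity.

```python
from itertools import combinations

# Hypothetical rule table: unordered pair of hunk kinds -> link type.
RULES = {
    frozenset({"try_to_synchronized", "finally_deleted"}): "syntax",
    frozenset({"parameter_removed", "argument_removed"}): "refactoring",
    frozenset({"throw_added", "throws_added"}): "change",
}

def couple(hunks):
    """Return (id_a, id_b, link_type) for every pair of hunks matched by a rule."""
    links = []
    for a, b in combinations(hunks, 2):
        if a["target"] != b["target"]:
            continue                      # hunks concern different entities
        link_type = RULES.get(frozenset({a["kind"], b["kind"]}))
        if link_type:
            links.append((a["id"], b["id"], link_type))
    return links

# Hypothetical hunks; "target" names the entity a hunk concerns.
hunks = [
    {"id": "h1", "kind": "try_to_synchronized", "target": "m:try#1"},
    {"id": "h2", "kind": "finally_deleted", "target": "m:try#1"},
    {"id": "h3", "kind": "parameter_removed", "target": "meth"},
    {"id": "h4", "kind": "argument_removed", "target": "meth"},
    {"id": "h5", "kind": "throw_added", "target": "m"},
    {"id": "h6", "kind": "throws_added", "target": "m"},
    {"id": "h7", "kind": "argument_removed", "target": "other"},
]

links = couple(hunks)
for link in links:
    print(link)
```

Hunk `h7` stays unlinked because it concerns a different method than the removed parameter.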

Delta Decomposition
A delta component = connected hunks
Figure legend: hunk, link by syntax, link by refactoring, link by change
691 components
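Reading "a delta component = connected hunks" as connected components of a graph whose nodes are hunks and whose edges are coupling links, the grouping can be sketched with a small union-find. The hunk names, the link tuples, and `delta_components` are hypothetical and only illustrate the idea; this is not DDJ's code.

```python
from collections import defaultdict

def delta_components(hunks, links):
    """Group hunks into connected components induced by coupling links."""
    parent = {h: h for h in hunks}

    def find(h):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    for a, b, _link_type in links:
        parent[find(a)] = find(b)         # merge the two components

    groups = defaultdict(list)
    for h in hunks:                       # keeps the input order inside a group
        groups[find(h)].append(h)
    return list(groups.values())

# Hypothetical hunks and links of the three kinds named in the legend.
hunks = ["try_to_synchronized", "finally_deleted",
         "param_removed", "arg_removed",
         "throw_added", "throws_added",
         "comment_edit"]
links = [
    ("try_to_synchronized", "finally_deleted", "syntax"),
    ("param_removed", "arg_removed", "refactoring"),
    ("throw_added", "throws_added", "change"),
]

components = delta_components(hunks, links)
for component in components:
    print(component)
```

Delta debugging then selects whole components rather than individual hunks, so coupled changes are always applied or withheld together.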

DDJ: DD on Java Source Code Changes
  1. Compare source code (AST) (Diff/TS)
  2. Decompose delta (500+ rules)
  3. Bisect, select, patch (Patch/AST + DD.py)
  4. Build and test

AST Differencing
Figure: Prog1 and Prog2 are parsed into ASTs, the Tree Differencing Engine compares them, and the result is an AST delta (set of tree hunks) represented in XML.

AST Patching
Figure: Prog is parsed into an AST, the Tree Patching Engine applies an AST delta (set of tree hunks, in XML) to obtain AST', and AST' is unparsed into Prog'.

Evaluation
Two implementations
  DDP for plain text changes (original DD): baseline
  DDJ for Java program changes (proposed)
Three evaluation criteria
  Granularity: patch size (# tokens)
  Efficiency: build success rate
  Cost: tests executed
Two datasets
  [Small delta] 194 real bugs from Defects4J [Just et al., 2014]
  [Large delta] 8 real regressions from 6 open source Java projects

Real Regression Bugs

Deltas between Vgood and Vbad
  ANTLR     14f05bb   aacd2a2
  MATH      v2.0      v2.2
  HSQLDB1   r3227     r3262
  HSQLDB2   v2.3.2    v2.3.3
  JEDIT     r12114    r12710
  JSOUP1    b033535   v1.10.3
  JSOUP2    v1.10.2
  LOGBACK   v1.0.13   v1.1.0
Chart: size of delta between Vgood and Vbad (# tokens)

Delta Components and Patch Size
Charts: number of delta components; patch size (# tokens)

Build Success Rate and Tests Executed

Execution Time (Breakdown)
On AWS (16 cores, 8GB RAM/core)
Chart: execution time breakdown

Defects4J Dataset
Collection of real bugs from 6 Java projects [Just et al., 2014]
A bug is stored as (Vbug, Vfix, T)
  Vbug: faulty version; Vfix: fixed version
  T: test suite that contains at least one failing test that triggers the bug
Changes between Vbug and Vfix contain bug fixes only
  Vbug is human-extracted from ∆(Vprev, Vfix)
Figure: ∆(Vprev, Vfix) is split into other changes (Vprev to Vbug) and the isolated bug fix (Vbug to Vfix)

Deltas (in Total) [D4J]
  Project   Bugs
  Chart     11
  Closure   69
  Lang      22
  Math      53
  Mockito   30
  Time      9
Chart: size of delta (# tokens)

Delta Components and Patch Size [D4J]
Charts: number of delta components; patch size (# tokens)

Build Success Rate and Tests Executed [D4J]

Execution Time (Breakdown) [D4J]
On AWS (16 cores, 8GB RAM/core)
Chart: execution time breakdown

Summary
Goal
  Fixing real regressions in an automated and efficient way
Approach
  Improving delta debugging on source code changes by avoiding UNRESOLVED test results
Contribution
  Syntax- and semantics-aware decomposition of delta by AST differencing/patching and 500+ coupling rules
Result
  Dataset-L: 73% smaller patch, 40% more test exec, 0.2-5.6x time (DD-bound)
  Dataset-S: 50% smaller patch, 5% more test exec, 1-6.8x time (decomp-bound)

https://hub.docker.com/r/codecontinuum/ddj/