Ruru Yue¹, Na Meng², Qianxiang Wang¹ (¹Peking University, ²Virginia Tech)


A Characterization Study of Repeated Bug Fixes

Motivation
  Prior studies showed that developers apply repetitive code changes to multiple locations.
    Nguyen et al. found that 17-45% of bug fixes were repeated [1].
  Tools were built to recommend similar bug fixes.
    Clever detects and tracks code clones [2].
    LASE suggests similar edits based on a program transformation learned from two or more similar edit examples [3].

Problem Statement
  Some fundamental research questions are still unexplored.
    Q1: What is the frequency of repeated bug fixes?
    Q2: Where are repeated fixes usually applied?
    Q3: What are the common bugs and fix patterns of repeated fixes?

Study Findings
  48-70% of repeated fixes occurred twice.
    Implication: code change suggestion tools that learn from a single example are more helpful than tools that need several examples.
  73-100% of repeated fixes spanned at most 3 commits.
    Implication: coding assistance tools should provide edit suggestions as early as possible.
  39% of repeated fixes added or deleted whole if-structures.
    Implication: automatic program repair tools should focus more on if-statements.

Outline
  Motivation & Related Work
  Study Approach
  Experiments

Exemplar Patch
  diff --git a/dom/CompilationUnit.java b/dom/CompilationUnit.java
  @@ -484,8 +484,8 @@
     * @since 3.0
     */
    public int getStartPosition(ASTNode node) {
  -   if (this.commentMapper == null) {
  -     return -1;
  +   if (this.commentMapper == null || node !=null) {
  +     return node.getStartPosition();
      } else {
        return this.commentMapper.getStartPosition(node);
      }

Exemplar Hunk
  diff --git a/dom/CompilationUnit.java b/dom/CompilationUnit.java
  Hunk:
  @@ -484,8 +484,8 @@
  Context lines:
     * @since 3.0
     */
    public int getStartPosition(ASTNode node) {
  Code changes:
  -   if (this.commentMapper == null) {
  -     return -1;
  +   if (this.commentMapper == null || node!=null) {
  +     return node.getStartPosition();
  Context lines:
      } else {
        return this.commentMapper.getStartPosition(node);
      }

Exemplar Fix
  diff --git a/dom/CompilationUnit.java b/dom/CompilationUnit.java
  @@ -484,8 +484,8 @@
     * @since 3.0
     */
    public int getStartPosition(ASTNode node) {
  -   if (this.commentMapper == null) {
  -     return -1;
  +   if (this.commentMapper == null || node!=null) {
  +     return node.getStartPosition();
      } else {
        return this.commentMapper.getStartPosition(node);
      }
  Labels: Fix, Hunk
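The patch, hunk, context-line, and code-change terminology above can be made concrete with a small parser. This is a minimal sketch, not the tooling used in the study: the names `Hunk` and `parse_hunks` are hypothetical, and the indentation inside the example patch is assumed because the slide text lost its whitespace.

```python
from dataclasses import dataclass, field


@dataclass
class Hunk:
    header: str                                   # the '@@ -a,b +c,d @@' line
    context: list = field(default_factory=list)   # unchanged lines
    removed: list = field(default_factory=list)   # lines prefixed with '-'
    added: list = field(default_factory=list)     # lines prefixed with '+'


def parse_hunks(patch_text):
    """Split a unified diff into hunks and classify each hunk line."""
    hunks = []
    current = None
    for line in patch_text.splitlines():
        if line.startswith("diff --git"):
            current = None            # a new file section begins
        elif line.startswith("@@"):
            current = Hunk(header=line)
            hunks.append(current)
        elif current is None:
            continue                  # file headers such as '---' and '+++'
        elif line.startswith("-"):
            current.removed.append(line[1:].strip())
        elif line.startswith("+"):
            current.added.append(line[1:].strip())
        else:
            current.context.append(line.strip())
    return hunks


# The exemplar patch from the slides (whitespace is assumed).
EXAMPLE_PATCH = """\
diff --git a/dom/CompilationUnit.java b/dom/CompilationUnit.java
@@ -484,8 +484,8 @@
    * @since 3.0
    */
   public int getStartPosition(ASTNode node) {
-    if (this.commentMapper == null) {
-      return -1;
+    if (this.commentMapper == null || node !=null) {
+      return node.getStartPosition();
     } else {
       return this.commentMapper.getStartPosition(node);
     }
"""

hunks = parse_hunks(EXAMPLE_PATCH)
```

For the exemplar patch this yields one hunk whose `removed` and `added` lists hold the code changes, while `context` holds the surrounding unchanged lines.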

Approach Overview
  1. Bug Fix Collection
  2. Repeated Bug Fix Detection

Bug Fix Collection
  Identify Fixing Patches
    Retrieve relevant commits using bug IDs in Bugzilla.
  Extract Bug Fixes
    Exclude less important hunks, e.g., hunks that only change documentation.
    Extract fixes applied to methods.
      Use AST parsers to identify methods' code ranges.
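The first step, linking commits to resolved bugs through bug IDs, might look like the sketch below. The slides do not say how bug IDs appear in commit messages, so the regular expression (messages such as "Bug 12345" or "bug #12345") and the function name `find_fixing_commits` are assumptions for illustration only.

```python
import re

# Assumed convention: commit messages mention bugs as "Bug 12345" or "bug #12345".
BUG_ID_PATTERN = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def find_fixing_commits(commits, resolved_bug_ids):
    """Map each resolved bug ID to the commits whose messages mention it.

    commits: iterable of (commit_hash, message) pairs.
    resolved_bug_ids: set of integer bug IDs taken from the bug tracker.
    """
    fixing = {}
    for commit_hash, message in commits:
        for match in BUG_ID_PATTERN.finditer(message):
            bug_id = int(match.group(1))
            if bug_id in resolved_bug_ids:
                fixing.setdefault(bug_id, []).append(commit_hash)
    return fixing


# Made-up commits and bug IDs, purely for demonstration.
sample_commits = [
    ("a1", "Bug 1001 - guard against null node"),
    ("b2", "Refactor build scripts"),
    ("c3", "bug #1001: follow-up; also addresses Bug 2002"),
    ("d4", "Bug 3003 - unrelated change"),
]
fixing_commits = find_fixing_commits(sample_commits, {1001, 2002})
```

A bug that maps to several commits is a candidate for a fix that spans multiple patches.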

Repeated Bug Fix Detection
  1. Format Bug Fixes
  2. Identify Clone Regions with CCFinder [4]
  3. Match Edit Operation Sequences

Two Exemplar Fixes
  Fix1
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || node !=null) {
  +   return node.getStartPosition();
  Fix2
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || astNode!= null){
  +   return astNode.getLength();
  + else{
  +   return this.commentMapper.getLength(astNode);

Format Bug Fixes
  FormattedFix1
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || node !=null) {
  +   return node.getStartPosition();
  FormattedFix2
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || astNode!= null){
  +   return astNode.getLength();
  + else{
  +   return this.commentMapper.getLength(astNode);

Identify Clone Regions by CCFinder
  CloneRegion1
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || node !=null) {
  +   return node.getStartPosition();
  CloneRegion2
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || astNode!= null){
  +   return astNode.getLength();
  + else{
  +   return this.commentMapper.getLength(astNode);

Match Edit Operation Sequences
  EditOperSeq1
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || node !=null) {
  +   return node.getStartPosition();
  EditOperSeq2
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || astNode!= null){
  +   return astNode.getLength();
  + else{
  +   return this.commentMapper.getLength(astNode);
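The slides do not spell out how edit operation sequences are matched, so the following is only one plausible sketch, not the authors' algorithm. It assumes that identifiers are abstracted away (so `node` and `astNode` compare equal) and that two sequences are compared with Python's `difflib.SequenceMatcher`; the keyword list and the helper names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumed, incomplete keyword list; everything else that looks like a name is an identifier.
KEYWORDS = {"if", "else", "return", "null", "this", "for", "while", "new"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def normalize(line):
    """Replace identifiers with ID and numbers with NUM; keep keywords and punctuation."""
    out = []
    for tok in TOKEN.findall(line):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok[0].isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)


def edit_sequence(fix_lines):
    """Turn diff lines ('-...' or '+...') into (operation, normalized text) pairs."""
    return [(line[0], normalize(line[1:]))
            for line in fix_lines if line.startswith(("+", "-"))]


def similarity(fix_a, fix_b):
    """Ratio in [0, 1] of matching edit operations between two fixes."""
    return SequenceMatcher(None, edit_sequence(fix_a), edit_sequence(fix_b)).ratio()


FIX1 = [
    "- if (this.commentMapper == null) {",
    "- return -1;",
    "+ if (this.commentMapper == null || node !=null) {",
    "+ return node.getStartPosition();",
]
FIX2 = [
    "- if (this.commentMapper == null) {",
    "- return -1;",
    "+ if (this.commentMapper == null || astNode!= null){",
    "+ return astNode.getLength();",
    "+ else{",
    "+ return this.commentMapper.getLength(astNode);",
]
score = similarity(FIX1, FIX2)
```

With this normalization the first four operations of Fix1 and Fix2 become identical, and only the two extra additions in Fix2 lower the score. Any threshold for calling two fixes "similar" would be a further assumption.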

An Exemplar Repeated-fix Group
  Similar fixes (e.g., Fix1 and Fix2) are gathered into the same repeated-fix group.
  Fix1
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || node !=null) {
  +   return node.getStartPosition();
  Fix2
  - if (this.commentMapper == null) {
  -   return -1;
  + if (this.commentMapper == null || astNode!= null){
  +   return astNode.getLength();
  + else{
  +   return this.commentMapper.getLength(astNode);
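Gathering similar fixes into groups can be sketched with a union-find structure. This assumes that similarity is treated as transitive (if A resembles B and B resembles C, all three share a group), which the slides do not state; `group_repeated_fixes` and the `is_similar` callback are hypothetical names.

```python
def group_repeated_fixes(fixes, is_similar):
    """Cluster fixes so that any two similar fixes land in the same group.

    Groups with a single fix are dropped: a fix counts as repeated only
    when at least one other fix resembles it.
    """
    parent = list(range(len(fixes)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once and merge the groups of similar fixes.
    for i in range(len(fixes)):
        for j in range(i + 1, len(fixes)):
            if is_similar(fixes[i], fixes[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, fix in enumerate(fixes):
        groups.setdefault(find(i), []).append(fix)
    return [group for group in groups.values() if len(group) >= 2]


# Toy data: fixes are labels, and two fixes are "similar" if they share a first letter.
toy_fixes = ["a1", "a2", "b1", "a3", "c1"]
toy_groups = group_repeated_fixes(toy_fixes, lambda x, y: x[0] == y[0])
```

The pairwise loop is quadratic in the number of fixes; for data sets with tens of thousands of fixes, a real implementation would likely restrict comparisons to candidate pairs first, for example those reported by a clone detector.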

Outline
  Motivation & Related Work
  Study Approach
  Experiments

Data Sets
  Property                  Eclipse JDT   Mozilla Firefox   LibreOffice
  Resolved period of bugs   2005          2014-2015         2014
  # of bugs                 870           380               1,563
  # of fixing patches       1,378         10,051            7,846
  # of fixes                16,289        3,451             33,057

Q1.D2. What's the distribution of repeated-fix groups based on fix instance counts?
  For most bugs, repeated fixes did not occur many times.
  SYDIT [5] and LibSync [6] may be more helpful than LASE [3].
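A distribution of this kind can be computed directly from the repeated-fix groups. The sketch below uses made-up groups rather than the study's data, and `size_distribution` is a hypothetical helper name.

```python
from collections import Counter


def size_distribution(groups):
    """Return {fix instance count: percentage of groups with that count}."""
    counts = Counter(len(group) for group in groups)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in sorted(counts.items())}


# Illustrative groups only; each inner list stands for the fixes in one group.
example_groups = [["f1", "f2"], ["f3", "f4"], ["f5", "f6", "f7"], ["f8", "f9", "f10", "f11"]]
distribution = size_distribution(example_groups)
```

The same helper works for the patch-count view if each group is represented by the set of patches it touches instead of its fixes.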

Q3. What are the common bugs and fix patterns of repeated fixes?
  Sample Repeated-fix Groups
    Randomly select 150 repeated-fix groups, with 50 groups from each project.
  Manually Analyze Repeated-fix Groups
    Bug component: the main syntax component of a fix.
    Fix pattern: the way to resolve the bug.

Exemplar Repeated Fix
  1. - if (aStart)
  2. -   currFrame = aStart->GetNextSibling();
  3. - else
  4. + if (aStart) {
  5. +   if (aStart->GetNextSibling())
  6. +     currFrame = aStart->GetNextSibling();
  7. +   else if (aStart->GetParent()->GetContent()->IsXUL(nsGkAtoms::menugroup))
  8. +     currFrame = aStart->GetParent()->GetNextSibling();
  9. + }
  10.+ else

Similarity Relationship
  1. - if (aStart)
  2. -   currFrame = aStart->GetNextSibling();
  3. - else
  4. + if (aStart) {
  5. +   if (aStart->GetNextSibling())
  6. +     currFrame = aStart->GetNextSibling();
  7. +   else if (aStart->GetParent()->GetContent()->IsXUL(nsGkAtoms::menugroup))
  8. +     currFrame = aStart->GetParent()->GetNextSibling();
  9. + }
  10.+ else

Dependency Relationship
  1. - if (aStart)
  2. -   currFrame = aStart->GetNextSibling();
  3. - else
  4. + if (aStart) {
  5. +   if (aStart->GetNextSibling())
  6. +     currFrame = aStart->GetNextSibling();
  7. +   else if (aStart->GetParent()->GetContent()->IsXUL(nsGkAtoms::menugroup))
  8. +     currFrame = aStart->GetParent()->GetNextSibling();
  9. + }
  10.+ else
  Annotated relationship: "control depends on"

Manual Analysis
  1. - if (aStart)
  2. -   currFrame = aStart->GetNextSibling();
  3. - else
  4. + if (aStart) {
  5. +   if (aStart->GetNextSibling())
  6. +     currFrame = aStart->GetNextSibling();
  7. +   else if (aStart->GetParent()->GetContent()->IsXUL(nsGkAtoms::menugroup))
  8. +     currFrame = aStart->GetParent()->GetNextSibling();
  9. + }
  10.+ else
  Bug Component: currFrame's assignment in line 2
  Fix Pattern: modify the value assigned to currFrame

Q3. What are the common bugs and fix patterns of repeated fixes?
  if-statement and if-condition were the most prevalent bug components.
    Program repair tools like Prophet [7] and Angelix [8] can be useful for if-condition correction.
    We still need new tools to automate if-statement additions or deletions.

Conclusions
  We studied the frequency, edit locations, and semantic meanings of repeated fixes.
    48-70% of repeated fixes occurred twice.
    73-100% of repeated fixes spanned at most 3 commits.
    39% of repeated fixes added or deleted whole if-structures.

Thank you!

Q1.D3: What's the distribution of repeated-fix groups based on patch counts?
  Tools should generate edit suggestions as early as possible.
  Integrating tools into IDEs seems more promising than integrating them into version control systems (VCS).

References
  [1] T. T. Nguyen, H. A. Nguyen, N. H. Pham, J. Al-Kofahi, and T. N. Nguyen. Recurring bug fixes in object-oriented programs. In ACM/IEEE International Conference on Software Engineering, pages 315–324, 2010.
  [2] T. T. Nguyen, H. A. Nguyen, N. H. Pham, J. M. Al-Kofahi, and T. N. Nguyen. Clone-aware configuration management. In ASE, pages 123–134, 2009.
  [3] N. Meng, M. Kim, and K. McKinley. LASE: Locating and applying systematic edits. In ICSE, page 10, 2013.
  [4] T. Kamiya, S. Kusumoto, and K. Inoue. CCFinder: A multilinguistic token-based code clone detection system for large scale source code. TSE, pages 654–670, 2002.
  [5] N. Meng, M. Kim, and K. S. McKinley. Systematic editing: Generating program transformations from an example. In PLDI, pages 329–342, 2011.
  [6] H. A. Nguyen, T. T. Nguyen, G. Wilson, Jr., A. T. Nguyen, M. Kim, and T. N. Nguyen. A graph-based approach to API usage adaptation. pages 302–321, 2010.
  [7] F. Long and M. Rinard. Automatic patch generation by learning correct code. SIGPLAN Not., 2016.
  [8] S. Mechtaev, J. Yi, and A. Roychoudhury. Angelix: Scalable multiline program patch synthesis via symbolic analysis. In ACM/IEEE International Conference on Software Engineering, pages 691–701, 2016.