Refactoring Support Tool: Cancer

Presentation transcript:

Refactoring Support Tool: Cancer
Yoshiki Higo, Osaka University
Next, I will explain the refactoring support tool Cancer.

Gemini & Cancer
Gemini:
  shows all code clones detected by CCFinder.
  gives the user panoramic views of the code clones in source code.
  provides quantitative information about the code clones of each file and clone class.
Cancer:
  extracts refactoring-oriented code clones from those detected by CCFinder.
  appends several metric values to the extracted code clones.
  lets the user know how to remove code clones when their metric values satisfy certain conditions.
First, I will explain the difference between Gemini and Cancer. Gemini shows all code clones detected by CCFinder and gives the user panoramic views of the code clones in source code. Furthermore, Gemini provides quantitative information about the code clones of each file and clone class. Cancer, on the other hand, extracts refactoring-oriented code clones from those detected by CCFinder and appends several metric values to the extracted code clones. Cancer also lets the user know how to remove a code clone when its metric values satisfy certain conditions.

Code clones in Cancer
All code clones (which are extracted from those detected by CCFinder) correspond to structural blocks of the programming language. Currently, Cancer can be applied only to Java.
  Declaration unit code clone: class declaration, interface declaration
  Method unit code clone: method body, constructor, static initializer
  Statement unit code clone: do, for, if, switch, synchronized, try, while
Code clones in Cancer have some characteristics. They correspond to structural blocks of the programming language. Currently, Cancer can be applied only to Java. These are the units of code clone for Java. As declaration units, class declarations and interface declarations are extracted. As method units, method bodies, constructors, and static initializers are extracted. As statement units, do statements, for statements, and so on are extracted.

Metrics in Cancer (1/3)
Cancer characterizes extracted code clones using 6 metrics. Using these metrics, we can determine how to remove them. The 6 metrics are RVK, RVN, DCH, LEN, POP, and DFL.
Cancer characterizes extracted code clones using 6 metrics. Using these metrics, we can determine how to remove them. The 6 metrics are RVK, RVN, DCH, LEN, POP, and DFL. Next, I will explain each metric.

Metrics in Cancer (2/3)
Variables that are used in a code clone but defined outside it:
  RVK: the number of such variables.
  RVN: the total number of times such variables are used.
Example (the for statement is the code clone; a and myClass are the RVK variables):
  int a;
  MyClass myClass = new MyClass();
  ・・・
  for( int i = a ; i < 10 ; i++ ){
    int c = a + 1;
    myClass.set(c);
  }
  The value of RVK: 2
  The value of RVN: 2 + 1 = 3
Relation of code clones on the class hierarchy:
  DCH: the degree of dispersion on the class hierarchy.
  DCH takes into account only the class hierarchy of the target software (it does not include the class hierarchy of libraries).
Example with DCH = 0 (both code clones are in class A):
  class A{
    ・・・
    void foo( ・・・ ){ code clone }
    void bar( ・・・ ){ code clone }
  }
Example with DCH = 1 (the code clones are in two child classes of A):
  class A{ ・・・ }
  class B extends A{
    void foo( ・・・ ){ code clone }
  }
  class C extends A{
    void bar( ・・・ ){ code clone }
  }
The first and second metrics concern variables that are used in a code clone but defined outside it. The first metric is named RVK, which is the number of such variables. The second metric is named RVN, which is the total number of times such variables are used. This is an example. In this figure, the gray part is the code clone. In this code clone, the variables "a" and "myClass" are defined outside, so the value of RVK is 2. The variable "a" is used twice and the variable "myClass" is used once, so the value of RVN is 3. The third metric concerns the relation of code clones on the class hierarchy. This metric is named DCH, which is the degree of dispersion on the class hierarchy. DCH takes into account only the class hierarchy of the target software, because the user cannot change library classes such as those in the JDK. These gray parts are code clones. In this case, the code clones are in the same class, so the value of DCH is 0. This is another example. In this case, there are three classes. Class A is the common parent of class B and class C, and the code clones exist in class B and class C. So, in this case, the value of DCH is 1. If the classes that contain the code fragments of a clone class do not have a common parent class, the value of DCH is -1.
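
The RVK and RVN values of the loop example above can be reproduced with a small Java sketch. It assumes that the references to externally defined variables inside the code clone have already been collected as a list of names; the class and method names are hypothetical and are not part of Cancer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Cancer: computes RVK and RVN from the
// list of references to externally defined variables inside one code clone.
class ReferenceMetrics {
    // RVK: number of distinct externally defined variables used in the clone.
    static int rvk(List<String> externalReferences) {
        Set<String> distinct = new HashSet<>(externalReferences);
        return distinct.size();
    }

    // RVN: total number of references to those variables.
    static int rvn(List<String> externalReferences) {
        return externalReferences.size();
    }

    public static void main(String[] args) {
        // References inside the for statement on the slide:
        // "a" in the initializer, "a" in "a + 1", "myClass" in "myClass.set(c)".
        List<String> refs = Arrays.asList("a", "a", "myClass");
        System.out.println("RVK = " + rvk(refs));
        System.out.println("RVN = " + rvn(refs));
    }
}
```

For the slide's example this yields RVK = 2 and RVN = 3, matching the values stated above.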

Metrics in Cancer (3/3)
  LEN: the average length of the code clones in a clone class
  POP: the number of code clones in a clone class
  DFL: an estimate of how many tokens would be removed from the source files if all code clones of a clone class were replaced with caller statements of a new identical routine
  (Figure labels: new subroutine, caller statements)
The fourth metric is LEN, which is the average length of the code fragments in a clone class. The fifth metric is POP, which is the number of code fragments in a clone class. The last metric is DFL, which is an estimate of how many tokens would be removed from the source files if all code fragments of a clone class were replaced with caller statements of a new identical routine, as in this figure.
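
The idea behind DFL can be illustrated with a rough Java sketch. The slides do not give the exact formula used by Cancer, so the arithmetic below is an assumption: the tokens before refactoring are LEN times POP, and the tokens after refactoring are one caller statement per fragment plus one copy of the new routine. The parameters callTokens and declarationTokens are hypothetical.

```java
// Hypothetical sketch of a DFL-style estimate; the exact formula used by
// Cancer is not given in the slides, so this arithmetic is an assumption.
class DflEstimate {
    // len: tokens per code fragment (LEN, rounded to an integer here)
    // pop: number of code fragments in the clone class (POP)
    // callTokens: assumed token count of one caller statement
    // declarationTokens: assumed token count of the new routine's signature and braces
    static int estimate(int len, int pop, int callTokens, int declarationTokens) {
        int before = len * pop;                                  // every fragment is spelled out
        int after = callTokens * pop + len + declarationTokens;  // calls plus one shared routine
        return before - after;
    }

    public static void main(String[] args) {
        // Three fragments of 50 tokens, 5-token calls, 10 tokens of declaration overhead.
        System.out.println(estimate(50, 3, 5, 10));
    }
}
```

With these illustrative numbers the estimate is 150 - (15 + 50 + 10) = 75 tokens. Under this assumption the estimate grows with both LEN and POP, so long and frequently repeated clones promise the largest reduction.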

Removal of code clones using refactoring patterns
We use refactoring patterns [1] together with the 6 metrics to remove code clones. The following patterns are applicable:
  Extract Method
  Pull Up Method
We use refactoring patterns together with the 6 metrics to remove code clones. Currently, we think that "Extract Method" and "Pull Up Method" are applicable.
[1] M. Fowler: Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.

The metric values for "Extract Method"
Before refactoring:
  int i;
  ・・・
  for( int j = 0 ; j < i ; j++ ){
  }
After refactoring:
  i = newMethod(i);

  int newMethod(int a){
    for( int j = 0 ; j < a ; j++ ){
    }
    return a;
  }
Conditions:
  Target unit: method unit, statement unit
  DCH == 0
  RVK <= 1
I will briefly explain "Extract Method". The source code on the left is before refactoring, and the source code on the right is after refactoring. The parts highlighted in blue are code clones. By extracting them as a new method, the source code is modified like this. These are the metric conditions for performing "Extract Method". First, this pattern is applicable to method unit and statement unit code clones. Second, the value of DCH has to be 0, because "Extract Method" is performed within a class; 0 means that all code clones are within one class. Finally, the value of RVK has to be 1 or less. An RVK variable has to be passed as an argument and returned, so the value of RVK has to be 1 or less to perform "Extract Method" simply.
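
A runnable Java sketch of this pattern follows. The loop body (a = a - 1) and the statement between the two clones are hypothetical, since the slide elides them; only the shape, with one externally defined variable passed in as an argument and returned, follows the slide.

```java
// Illustrative sketch of "Extract Method" on a statement unit code clone.
// The loop body and the "i * 4" step are hypothetical; the slide elides them.
class ExtractMethodExample {
    // Before: the same for statement appears twice in one class (DCH == 0).
    // It reads and updates one externally defined variable, i (RVK == 1).
    static int before(int i) {
        for (int j = 0; j < i; j++) {   // code clone 1
            i = i - 1;
        }
        i = i * 4;
        for (int j = 0; j < i; j++) {   // code clone 2
            i = i - 1;
        }
        return i;
    }

    // After: each clone is replaced with a caller statement.
    static int after(int i) {
        i = newMethod(i);
        i = i * 4;
        i = newMethod(i);
        return i;
    }

    // The extracted routine: the RVK variable is passed in and returned.
    static int newMethod(int a) {
        for (int j = 0; j < a; j++) {
            a = a - 1;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(before(10) == after(10));
    }
}
```

With two or more RVK variables that are modified in the clone, a single return value would no longer be enough in Java, which is one way to read the RVK <= 1 condition.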

The metric values for "Pull Up Method"
Conditions:
  Target unit: method unit
  DCH >= 1
Next, I will briefly explain "Pull Up Method". This means that identical methods defined in several child classes are pulled up to their common parent class. In this example, there are three classes: the Example class, the Child A class, and the Child B class. The Example class is the parent of Child A and Child B, and Child A and Child B share a method unit code clone. In this case, we can remove the code clones by pulling them up to the common parent class. These are the metric conditions for performing "Pull Up Method". First, this pattern is applicable to method unit code clones. Second, the value of DCH has to be 1 or more, because the classes that contain the code fragments have to have a common parent class.
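
The result of this pattern can be sketched in Java as follows. The class names follow the slide's Example, Child A, and Child B; the describe() method and the name field are hypothetical stand-ins for the duplicated method, which the transcript does not show.

```java
// Sketch of the state after "Pull Up Method".
// Before refactoring, ChildA and ChildB each declared an identical
// describe() method (a method unit code clone with DCH == 1).
abstract class Example {
    protected final String name;

    Example(String name) {
        this.name = name;
    }

    // After refactoring: the duplicated method lives once in the common parent.
    String describe() {
        return "item: " + name;
    }
}

class ChildA extends Example {
    ChildA(String name) {
        super(name);
    }
    // describe() was removed here and is now inherited from Example.
}

class ChildB extends Example {
    ChildB(String name) {
        super(name);
    }
    // describe() was removed here and is now inherited from Example.
}

class PullUpMethodDemo {
    public static void main(String[] args) {
        System.out.println(new ChildA("a").describe());
        System.out.println(new ChildB("b").describe());
    }
}
```

Both child classes keep their behavior, and the clone class disappears because only one copy of the method remains.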

Snapshot of Cancer
(Figure labels: metric graph, clone class list, clone category selection panel)
Currently, I am implementing the refactoring support tool using the 6 metrics. This is a snapshot of the tool. This line graph is called the metric graph. The bottom part is called the clone category selection panel. You can see that this panel is divided into twelve parts. Each part represents a unit of clone class, such as method, if, for, and so on. The right part is called the clone class list. Each row of this list shows a clone class with its 6 metrics. In this tool, the user selects clone classes using the metric graph and the clone category selection panel, and the clone class list shows the specified clone classes.

Specifying code clones
The user can specify code clones using the metric graph and the clone category selection panel.
  Metric graph: specifying based on metric values
  Clone category selection panel: specifying based on the unit of code clone
The clone class list shows only the specified clone classes.
The user can specify code clones using the metric graph and the clone category selection panel. On the metric graph, the user can specify clone classes based on metric values. On the clone category selection panel, the user can specify clone classes based on the unit of code clone.

Specifying based on metric values
(Metric graph axes: LEN, POP, DFL, RVK, RVN, DCH)
The metric graph is also used in Gemini, but some of the metrics are different. Now I will briefly explain how to specify code clones using the metric graph. In this figure, two clone classes are drawn in red. Red means that the clone class is specified. This blue area shows the range between the lower and upper limits of each metric. The user can specify any clone class by changing the lower and upper limits of each metric. For example, changing the upper limit of DCH like this makes this clone class unspecified. This is how code clones are specified based on metric values.
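
The selection rule described above can be sketched in Java. This is an illustration and not Cancer's implementation: it assumes the limits are inclusive, stores all six metrics as integers (LEN is an average in the slides), and uses made-up metric values for two clone classes.

```java
// Hypothetical sketch of metric-graph selection: a clone class is
// "specified" when every metric lies between its lower and upper limit.
class MetricGraphFilter {
    // Metric order used on the graph axes in the slides.
    static final String[] METRICS = {"LEN", "POP", "DFL", "RVK", "RVN", "DCH"};

    // Assumes inclusive limits; values, lower, and upper all follow METRICS order.
    static boolean isSpecified(int[] values, int[] lower, int[] upper) {
        for (int m = 0; m < METRICS.length; m++) {
            if (values[m] < lower[m] || values[m] > upper[m]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up metric values for two clone classes.
        int[] first = {30, 2, 25, 1, 3, 0};
        int[] second = {45, 3, 80, 0, 0, 1};

        int[] lower = {0, 0, 0, 0, 0, -1};
        int[] upper = {100, 10, 200, 5, 20, 2};
        System.out.println(isSpecified(first, lower, upper));   // both inside the limits
        System.out.println(isSpecified(second, lower, upper));

        // Lowering the upper limit of DCH to 0 unspecifies the second clone class.
        upper[5] = 0;
        System.out.println(isSpecified(first, lower, upper));
        System.out.println(isSpecified(second, lower, upper));
    }
}
```

Setting the limits to DCH in [0, 0] and RVK in [0, 1] would select the clone classes that meet the "Extract Method" conditions given earlier.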

Specifying based on the unit of code clone
(Metric graph axes: LEN, POP, DFL, RVK, RVN, DCH; polygonal lines: Method Unit, While Unit; check boxes: Method, While, ・・・)
Next, I will explain how to specify code clones based on the unit of code clone. This uses the clone category selection panel. In this metric graph, two clone classes are drawn in red. One is a method unit code clone, and the other is a while statement unit code clone. This polygonal line is drawn because the method check box is checked, and this polygonal line is drawn because the while check box is checked. If the while check box is unchecked, its clone class is also removed from the metric graph, like this. This is how code clones are specified based on the unit of clone class.

Case study
Target software: Ant 1.5.4, an open source Java software product.
LOC: about 15k
Result:
  Using "Pull Up Method", 2 clone classes were removed.
  Using "Extract Method", 1 clone class was removed.
I performed a simple case study. The target software is Ant, an open source Java software product, and its size is about 15,000 LOC. As a result, I removed 2 clone classes using "Pull Up Method" and 1 clone class using "Extract Method".

Code Clone
A code clone is a code fragment in source files that is identical or similar to another.
Code clones are one of the factors that make software maintenance more difficult. If a fault is found in a code clone, it is necessary to consider the pros and cons of modifying all of its code clones.
(Figure labels: clone pair, clone class)
First, I will explain the background of our research. A code clone is a code fragment in source files that is identical or similar to another. For example, these two figures indicate source files, and these three gray parts are code clones. Here, we call each pair a clone pair, and all of them together are called a clone class. It is generally said that code clones are one of the factors that make software maintenance more difficult. For example, if a fault is found in a code clone, it is necessary to consider the pros and cons of modifying all of its code clones. As shown in this figure, when there are only three code clones, it is easy to correct them. But if a great many code clones exist in a huge software system, detecting and correcting them becomes a very serious problem.

Example values of RVK and RVN
The variables "a" and "b", which are used in the code clone but defined outside it, are used 2 and 3 times respectively.
  RVK: 1 + 1 = 2
  RVN: 2 + 3 = 5

Example values of DCH
  If all code fragments in a clone class are in the same class, DCH: 0
  If all code fragments in a clone class are in a certain class and its child classes, DCH: 1
  If the classes that contain the code fragments of a clone class do not have a common parent class, DCH: -1
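
The three cases above can be reproduced with a small Java sketch. The slides only give the values 0, 1, and -1; the generalization used here for deeper hierarchies (the largest distance from any containing class up to the lowest common ancestor) is an assumption, and the class names and the parentOf map are hypothetical. The map is expected to contain only classes of the target software and to be free of cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a DCH computation; not Cancer's implementation.
class DchSketch {
    // Returns [cls, parent, grandparent, ...] within the target software.
    static List<String> chain(String cls, Map<String, String> parentOf) {
        List<String> result = new ArrayList<>();
        for (String c = cls; c != null; c = parentOf.get(c)) {
            result.add(c);
        }
        return result;
    }

    // classes: the classes that contain the code fragments of one clone class.
    static int dch(List<String> classes, Map<String, String> parentOf) {
        // Walk up from the first class; the first candidate that is an
        // ancestor (or the class itself) of every class is the lowest common one.
        for (String candidate : chain(classes.get(0), parentOf)) {
            int maxDistance = 0;
            boolean common = true;
            for (String cls : classes) {
                int distance = chain(cls, parentOf).indexOf(candidate);
                if (distance < 0) {
                    common = false;
                    break;
                }
                maxDistance = Math.max(maxDistance, distance);
            }
            if (common) {
                return maxDistance;
            }
        }
        return -1; // no common parent class in the target software
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("B", "A");
        parentOf.put("C", "A");
        System.out.println(dch(Arrays.asList("A", "A"), parentOf)); // same class
        System.out.println(dch(Arrays.asList("B", "C"), parentOf)); // siblings under A
        System.out.println(dch(Arrays.asList("B", "D"), parentOf)); // D is unrelated
    }
}
```

Library classes are simply left out of parentOf, which mirrors the statement that DCH considers only the class hierarchy of the target software.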