1
Refactoring Support Tool: Cancer
Yoshiki Higo, Osaka University
Next, I will explain the refactoring support tool Cancer.
2
Gemini & Cancer
Gemini:
shows all code clones detected by CCFinder.
gives the user panoramic views of the code clones in the source code.
provides quantitative information about the code clones of each file and clone class.
Cancer:
extracts refactoring-oriented code clones from those detected by CCFinder.
appends several metric values to the extracted code clones.
tells the user how to remove a code clone when its metric values satisfy certain conditions.
First, I will explain the difference between Gemini and Cancer. Gemini shows all code clones detected by CCFinder and gives the user a panoramic view of the code clones in the source code. Furthermore, Gemini provides quantitative information about the code clones of each file and clone class. Cancer, on the other hand, extracts refactoring-oriented code clones from those detected by CCFinder and appends several metric values to the extracted code clones. Cancer also tells the user how to remove a code clone when its metric values satisfy certain conditions.
3
Code clones in Cancer
All code clones (which are extracted from those detected by CCFinder) correspond to structural blocks of the programming language.
Currently, Cancer supports only Java.
Declaration unit code clone: class declaration, interface declaration
Method unit code clone: method body, constructor, static initializer
Statement unit code clone: do, for, if, switch, synchronized, try, while
Code clones in Cancer have a common characteristic: they correspond to structural blocks of the programming language. Currently, Cancer supports only Java, and these are the units of code clones for Java. As declaration units, class declarations and interface declarations are extracted. As method units, method bodies, constructors, and static initializers are extracted. As statement units, do statements, for statements, and so on are extracted.
4
Metrics in Cancer (1/3)
Cancer characterizes the extracted code clones using 6 metrics.
Using these metrics, we can determine how to remove them.
The 6 metrics are RVK, RVN, DCH, LEN, POP, and DFL.
Cancer characterizes the extracted code clones using 6 metrics: RVK, RVN, DCH, LEN, POP, and DFL. Using these metrics, we can determine how to remove the clones. Next, I will explain each metric.
5
Metrics in Cancer (2/3)
Variables that are used in a code clone but defined outside it
RVK: the number of such variables.
RVN: the total number of uses of such variables.

    int a;
    MyClass myClass = new MyClass();
    ・・・
    for( int i = a ; i < 10 ; i++ ){    // this for statement is the code clone
        int c = a + 1;
        myClass.set(c);
    }

RVK variables: a, myClass
The value of RVK: 2
The value of RVN: 2 + 1 = 3

Relation of code clones on the class hierarchy
DCH: the degree of dispersion on the class hierarchy.
DCH takes into account only the class hierarchy of the target software (library class hierarchies are not included).

    class A{
        ・・・
        void foo( ・・・ ){
            code clone
        }
        void bar( ・・・ ){
            code clone
        }
    }

The value of DCH: 0

    class A{
        ・・・
    }
    class B extends A{
        void foo( ・・・ ){
            code clone
        }
    }
    class C extends A{
        void bar( ・・・ ){
            code clone
        }
    }

The value of DCH: 1

The first and second metrics concern variables that are used in a code clone but defined outside it. The first metric, RVK, is the number of such variables. The second metric, RVN, is the total number of uses of such variables. In the first example, the gray part is a code clone. In this code clone, the variables "a" and "myClass" are defined outside, so the value of RVK is 2. The variable "a" is used twice and the variable "myClass" is used once, so the value of RVN is 3.
The third metric concerns the relation of code clones on the class hierarchy. This metric, DCH, is the degree of dispersion on the class hierarchy. DCH takes into account only the class hierarchy of the target software, because the user cannot change library classes such as those in the JDK. In the first DCH example, the gray parts are code clones and they are in the same class, so the value of DCH is 0. In the second example there are three classes. Class A is the common parent of class B and class C, and the code clones exist in class B and class C, so the value of DCH is 1. If the classes that include the code fragments of a clone class do not have a common parent class, the value of DCH is -1.
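The RVK and RVN values of the example above can be reproduced with a short sketch. It assumes the clone has already been reduced to a list of variable references and a set of variables declared inside the clone; the class `ExternalVariableMetrics` and its methods are hypothetical names for illustration, not Cancer's implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Cancer): counts externally defined
// variables in a clone from a pre-extracted list of variable references.
class ExternalVariableMetrics {

    // RVK: number of distinct variables used in the clone but defined outside it.
    static int rvk(List<String> references, Set<String> declaredInside) {
        Set<String> external = new HashSet<>(references);
        external.removeAll(declaredInside);
        return external.size();
    }

    // RVN: total number of references to such variables.
    static int rvn(List<String> references, Set<String> declaredInside) {
        int count = 0;
        for (String name : references) {
            if (!declaredInside.contains(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // References in: for (int i = a; i < 10; i++) { int c = a + 1; myClass.set(c); }
        List<String> refs = Arrays.asList("a", "i", "i", "a", "myClass", "c");
        // Variables declared inside the clone itself.
        Set<String> inside = new HashSet<>(Arrays.asList("i", "c"));
        System.out.println("RVK = " + rvk(refs, inside)); // prints "RVK = 2"
        System.out.println("RVN = " + rvn(refs, inside)); // prints "RVN = 3"
    }
}
```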
6
Metrics in Cancer (3/3)
LEN: the average length of the code clones in a clone class.
POP: the number of code clones in a clone class.
DFL: an estimate of how many tokens would be removed from the source files if all code clones of a clone class were replaced with caller statements of a new identical routine.
[Figure: code clones replaced by caller statements of a new subroutine]
The fourth metric is LEN, the average length of the code fragments in a clone class. The fifth metric is POP, the number of code fragments in a clone class. The last metric is DFL, an estimate of how many tokens would be removed from the source files if all code fragments of a clone class were replaced with caller statements of a new identical routine, as in this figure.
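As a rough illustration of these three metrics, the sketch below computes them from the token lengths of the fragments in one clone class. The slides do not give a formula for DFL, so the estimate here is an assumption: every fragment is removed, one caller statement of `callerTokens` tokens is added per fragment, and the new routine costs LEN tokens plus `routineOverheadTokens` for its header. All names and parameters are hypothetical.

```java
// Hypothetical sketch (not Cancer's implementation) of LEN, POP and a
// simple DFL estimate for one clone class.
class CloneClassMetrics {

    // POP: number of code fragments in the clone class.
    static int pop(int[] fragmentTokenLengths) {
        return fragmentTokenLengths.length;
    }

    // LEN: average token length of the fragments.
    static double len(int[] fragmentTokenLengths) {
        if (fragmentTokenLengths.length == 0) {
            throw new IllegalArgumentException("a clone class needs at least one fragment");
        }
        int sum = 0;
        for (int length : fragmentTokenLengths) {
            sum += length;
        }
        return (double) sum / fragmentTokenLengths.length;
    }

    // DFL (assumed formula): tokens removed minus tokens added by the
    // caller statements and by the new routine.
    static double dfl(int[] fragmentTokenLengths, int callerTokens, int routineOverheadTokens) {
        double average = len(fragmentTokenLengths);
        int count = pop(fragmentTokenLengths);
        double removed = average * count;
        double added = (double) count * callerTokens + average + routineOverheadTokens;
        return removed - added;
    }

    public static void main(String[] args) {
        int[] lengths = {40, 40, 40};
        System.out.println("POP = " + pop(lengths));
        System.out.println("LEN = " + len(lengths));
        // 120 tokens removed, 3 * 5 + 40 + 10 = 65 tokens added
        System.out.println("DFL = " + dfl(lengths, 5, 10));
    }
}
```

Under this assumed formula, a clone class pays off only when the fragments are long enough or numerous enough to outweigh the cost of the new routine and its callers.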
7
Removal of Code clone using Refactoring Pattern
We use refactoring patterns [1] to remove code clones, guided by the 6 metrics.
The following patterns are applicable:
Extract Method
Pull Up Method
We use refactoring patterns to remove code clones, guided by the 6 metrics. Currently, we consider "Extract Method" and "Pull Up Method" to be applicable.
[1] M. Fowler: Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.
8
The value of metrics for “Extract Method”
Before refactoring:

    int i;
    ・・・
    for( int j = 0 ; j < i ; j++ ){
    }

After refactoring:

    i = newMethod(i);

    int newMethod(int a){
        for( int j = 0 ; j < a ; j++ ){
        }
        return a;
    }

Target Unit: Method Unit, Statement Unit
DCH == 0
RVK <= 1
I will briefly explain "Extract Method". The left source code is before refactoring and the right is after refactoring. The blue highlighted parts are code clones. By extracting them as a new method, the source code is modified as shown. These are the metric conditions for performing "Extract Method". First, this pattern is applicable to method unit and statement unit code clones. Second, the value of DCH has to be 0, because "Extract Method" is performed within a single class; 0 means that all code clones are within one class. Finally, the value of RVK has to be 1 or less. An RVK variable has to be passed as an argument and returned, so RVK must be 1 or less for "Extract Method" to be performed simply.
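The three conditions on this slide can be written as a single predicate. This is a minimal sketch with hypothetical names (`ExtractMethodCheck`, `Unit`), not code taken from Cancer.

```java
// Hypothetical sketch of the "Extract Method" conditions from the slide.
class ExtractMethodCheck {

    // The three units of code clone that Cancer extracts.
    enum Unit { DECLARATION, METHOD, STATEMENT }

    // Applicable when the clone is a method or statement unit, all
    // fragments are in one class (DCH == 0), and at most one externally
    // defined variable is used (RVK <= 1).
    static boolean isApplicable(Unit unit, int dch, int rvk) {
        boolean unitOk = unit == Unit.METHOD || unit == Unit.STATEMENT;
        return unitOk && dch == 0 && rvk <= 1;
    }

    public static void main(String[] args) {
        System.out.println(isApplicable(Unit.STATEMENT, 0, 1)); // prints "true"
        System.out.println(isApplicable(Unit.STATEMENT, 0, 2)); // prints "false"
        System.out.println(isApplicable(Unit.METHOD, 1, 0));    // prints "false"
    }
}
```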
9
The value of metrics for “Pull up Method”
Target Unit: Method Unit
DCH >= 1
Next, I will briefly explain "Pull Up Method". In this pattern, identical methods that are defined in several child classes are pulled up to their common parent class. In this example there are three classes: Example, Child A, and Child B. Example is the parent of Child A and Child B, and Child A and Child B share a method unit code clone. In this case, we can remove the code clones by pulling them up to the common parent class. These are the metric conditions for performing "Pull Up Method". First, this pattern is applicable to method unit code clones. Second, the value of DCH has to be 1 or more, because the classes that include the code fragments must have a common parent class.
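The result of "Pull Up Method" can be sketched with three small classes that mirror the Example, Child A, and Child B classes on the slide. The class names `ExampleParent`, `ChildA`, `ChildB`, the method `sharedStep`, and its body are hypothetical; the slide does not show the actual code.

```java
// Hypothetical sketch of the state after "Pull Up Method".
class PullUpDemo {
    public static void main(String[] args) {
        // Both children now inherit the single definition from the parent.
        System.out.println(new ChildA().sharedStep(3)); // prints "7"
        System.out.println(new ChildB().sharedStep(3)); // prints "7"
    }
}

// Common parent class: the method that used to be duplicated in both
// children (a method unit code clone with DCH == 1) is defined once here.
class ExampleParent {
    int sharedStep(int x) {
        return x * 2 + 1;
    }
}

// Before the refactoring, each child contained its own identical copy of
// sharedStep; after pulling it up, the copies are deleted.
class ChildA extends ExampleParent {
}

class ChildB extends ExampleParent {
}
```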
10
Snapshot of Cancer
[Figure: screenshot of Cancer with three labeled parts: Metric Graph, Clone class list, Clone category selection panel]
Currently I am implementing the refactoring support tool using the 6 metrics. This is a snapshot of the tool. The line graph is called the metric graph. The bottom part is called the clone category selection panel. You can see that this panel is divided into twelve parts; each part represents a unit of clone class such as method, if, for, and so on. The right part is called the clone class list. Each row of this list shows a clone class with its 6 metrics. In this tool, the user selects clone classes using the metric graph and the clone category selection panel, and the clone class list shows the specified clone classes.
11
Specifying code clones
The user can specify code clones using the Metric Graph and the Clone Category Selection Panel.
Metric Graph: specifying based on metric values
Clone Category Selection Panel: specifying based on the unit of code clone
The clone class list shows only the specified clone classes.
The user can specify code clones using the Metric Graph and the Clone Category Selection Panel. On the Metric Graph, the user specifies clones based on metric values. On the Clone Category Selection Panel, the user specifies clones based on the unit of code clone.
12
Specifying based on metrics values
[Figure: metric graph with axes LEN, POP, DFL, RVK, RVN, DCH]
The metric graph is also used in Gemini, but some of the metrics are different. Now I will briefly explain how to specify code clones using the metric graph. In this figure, two clone classes are drawn in red. Red means that the clone class is specified. The blue area lies between the lower and upper limits of each metric. The user can specify any clone class by changing the lower and upper limits of each metric. For example, changing the upper limit of DCH like this makes this clone class unspecified. This is how code clones are specified based on metric values.
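The selection rule described here amounts to a range check on each axis. The following sketch assumes a clone class is represented as an array of six metric values in the axis order LEN, POP, DFL, RVK, RVN, DCH; the class name and the sample values are hypothetical.

```java
// Hypothetical sketch of selection by lower and upper limits on the
// metric graph. Arrays use the axis order LEN, POP, DFL, RVK, RVN, DCH.
class MetricRangeFilter {

    // A clone class stays specified only if every metric value lies
    // between the lower and upper limit of its axis (inclusive).
    static boolean isSpecified(double[] values, double[] lower, double[] upper) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < lower[i] || values[i] > upper[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] cloneClass1 = {30, 2, 20, 1, 2, 0};
        double[] cloneClass2 = {45, 3, 60, 0, 0, 1};
        double[] lower = {0, 0, 0, 0, 0, -1};
        double[] upper = {100, 10, 100, 5, 10, 1};

        System.out.println(isSpecified(cloneClass1, lower, upper)); // prints "true"
        System.out.println(isSpecified(cloneClass2, lower, upper)); // prints "true"

        // Lowering the upper limit of DCH (last axis) from 1 to 0
        // makes the second clone class unspecified.
        upper[5] = 0;
        System.out.println(isSpecified(cloneClass1, lower, upper)); // prints "true"
        System.out.println(isSpecified(cloneClass2, lower, upper)); // prints "false"
    }
}
```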
13
Specifying based on unit of code clone
[Figure: metric graph with axes LEN, POP, DFL, RVK, RVN, DCH showing a While Unit line and a Method Unit line, above a clone category selection panel with Method and While check boxes]
Next, I will explain specifying based on the unit of code clone. This uses the clone category selection panel. In this metric graph, two clone classes are drawn in red. One is a method unit code clone, and the other is a while statement unit code clone. The first polygonal line is drawn because the method check box is checked, and the second is drawn because the while check box is checked. If the while check box is unchecked, its clone class is also removed from the metric graph, like this. This is how code clones are specified based on the unit of clone class.
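The check box behavior can be sketched as a filter over clone classes keyed by their unit. The class `UnitFilter`, the identifiers "C1" and "C2", and the unit strings are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of selection by unit of code clone.
class UnitFilter {

    // Returns the clone classes whose unit has its check box checked,
    // in the order they were registered.
    static List<String> visible(Map<String, String> unitByCloneClass, Set<String> checkedUnits) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : unitByCloneClass.entrySet()) {
            if (checkedUnits.contains(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> units = new LinkedHashMap<>();
        units.put("C1", "method");
        units.put("C2", "while");

        Set<String> checked = new HashSet<>(Arrays.asList("method", "while"));
        System.out.println(visible(units, checked)); // prints "[C1, C2]"

        // Unchecking the while check box removes its clone class.
        checked.remove("while");
        System.out.println(visible(units, checked)); // prints "[C1]"
    }
}
```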
14
Case study
Target software: Ant 1.5.4, an open source Java application.
LOC: about 15k
The result:
Using "Pull Up Method", 2 clone classes were removed.
Using "Extract Method", 1 clone class was removed.
I performed a simple case study. The target software is Ant, an open source Java application of about 15,000 lines of code. As a result, I removed 2 clone classes using "Pull Up Method" and 1 clone class using "Extract Method".
16
Code Clone
A code clone is a code fragment in source files that is identical or similar to another.
Code clones are one of the factors that make software maintenance more difficult.
If a fault is found in a code clone, it is necessary to consider the pros and cons of modifying all of its code clones.
[Figure: two source files containing three code clones, labeled Clone Pair and Clone Class]
First, I will explain the background of our research. A code clone is a code fragment in source files that is identical or similar to another. For example, these two figures represent source files, and the three gray parts are code clones. We call each pair a clone pair, and all of them together are called a clone class. It is generally said that code clones are one of the factors that make software maintenance more difficult. For example, if a fault is found in a code clone, it is necessary to consider the pros and cons of modifying all of its code clones. As shown in this figure, when there are only three code clones, it is easy to correct them. But if a very large number of code clones exist in a huge software system, detecting and correcting them becomes a very serious problem.
17
Example values of RVK and RVN
The variables "a" and "b", which are used in the code clone but defined outside it, are used 2 and 3 times respectively.
RVK: 2
RVN: 2 + 3 = 5
18
Example values of DCH
If all code fragments in a clone class are in the same class, DCH: 0
If all code fragments in a clone class are in a certain class and its child classes, DCH: 1
If the classes that include the code fragments of a clone class do not have a common parent class, DCH: -1
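The three cases can be reproduced with a short sketch. The slides only list these cases, so the general rule used here is an assumption: DCH is taken as the largest distance from any class containing a fragment to their nearest common ancestor within the target software, and -1 when no common ancestor exists. The class `DchSketch` and the `parentOf` map are hypothetical, and the map is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of DCH (assumed rule, see the lead-in).
class DchSketch {

    // Chain from a class up to its root in the target software:
    // [cls, parent, grandparent, ...]. Library classes are simply absent
    // from parentOf, so the chain stops before them.
    static List<String> chain(String cls, Map<String, String> parentOf) {
        List<String> result = new ArrayList<>();
        for (String c = cls; c != null; c = parentOf.get(c)) {
            result.add(c);
        }
        return result;
    }

    static int dch(List<String> classes, Map<String, String> parentOf) {
        List<List<String>> chains = new ArrayList<>();
        for (String cls : classes) {
            chains.add(chain(cls, parentOf));
        }
        // Walk up from the first class; with single inheritance, the first
        // ancestor found in every chain is the nearest common ancestor.
        for (String candidate : chains.get(0)) {
            int maxDistance = 0;
            boolean shared = true;
            for (List<String> ancestors : chains) {
                int distance = ancestors.indexOf(candidate);
                if (distance < 0) {
                    shared = false;
                    break;
                }
                maxDistance = Math.max(maxDistance, distance);
            }
            if (shared) {
                return maxDistance;
            }
        }
        return -1; // no common parent class in the target software
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("B", "A");
        parentOf.put("C", "A");
        System.out.println(dch(Arrays.asList("A", "A"), parentOf)); // prints "0"
        System.out.println(dch(Arrays.asList("B", "C"), parentOf)); // prints "1"
        System.out.println(dch(Arrays.asList("A", "B"), parentOf)); // prints "1"
        System.out.println(dch(Arrays.asList("B", "X"), parentOf)); // prints "-1"
    }
}
```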