Extracting Code Clones for Refactoring Using Combinations of Clone Metrics

Presentation transcript:

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University
Extracting Code Clones for Refactoring Using Combinations of Clone Metrics
Eunjong Choi†, Norihiro Yoshida‡, Takashi Ishio†, Katsuro Inoue†, and Tateki Sano*
†Osaka University, Japan ‡Nara Institute of Science and Technology, Japan *NEC Corporation, Japan

Background: Clone Set
A clone set is a set of code clones that are similar or identical to each other.
Clone sets: S1 = {Code Clone 1, Code Clone 3}, S2 = {Code Clone 2, Code Clone 4, Code Clone 5}
[Figure: Code Clones 1 to 5, linked by "similar" and "identical" relations]

Background: Refactoring Code Clones
Refactoring merges code clones into a single program unit.
[Figure: code clones merged by refactoring]

Background: Language-dependent Code Clone
A language-dependent code clone exists unavoidably in source code because of features of the programming language used.
Example of language-dependent code clones (consecutive setter invocations):
/* Code Clone A */
replacement.setTaskType(taskType);
replacement.setTaskName(taskName);
replacement.setLocation(location);
replacement.setOwningTarget(target);
replacement.setRuntime(wrapper);
wrapper.setProxy(replacement);
/* … */
/* Code Clone B */
def.setName(name);
def.setClassName(classname);
def.setClass(cl);
def.setAdapterClass(adapterClass);
def.setAdaptToClass(adaptToClass);
def.setClassLoader(al);
/* … */

Background: Clone Metrics [Higo2007]
Clone metrics give quantitative information on clone sets
- E.g., LEN(S), RNR(S), POP(S)
Purposes
- To check features of code clones in software
- To extract code clones for several purposes, e.g., refactoring or defect-prone code clones
[Higo2007] Yoshiki Higo, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue, "Method and Implementation for Investigating Code Clones in a Software System", Information and Software Technology, pp (2007-9)

Clone Metrics: LEN(S)
LEN(S) is the average length of the token sequences of the code clones in a clone set S.
Example: a token sequence [c c*] is detected as a code clone from a token sequence. The superscript * indicates that the token is in a repeated token sequence. For this clone set S, LEN(S) = 2.

Clone Metrics: RNR(S)
RNR(S) is the ratio of non-repeated token sequences of the code clones in a clone set S.
RNR(S) = (length of non-repeated token sequence / length of whole token sequence) × 100
Example: a token sequence [c c*] is detected as a code clone from a token sequence. For this clone set S, RNR(S) = (1 / 2) × 100 = 50.

Clone Metrics: POP(S)
POP(S) is the number of code clones in a clone set S.
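The three metric definitions above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the implementation used by the authors or by CCFinder: the class name CloneMetrics is hypothetical, each code clone is represented as a list of token strings, and a token is treated as "repeated" when its string ends with "*", mirroring the superscript notation on the slides. RNR(S) is assumed to be taken over all tokens of all clones in the set.

```java
import java.util.List;

/** Hypothetical helper that computes LEN(S), RNR(S) and POP(S) for one clone set. */
final class CloneMetrics {

    /** POP(S): the number of code clones in the clone set. */
    static int pop(List<List<String>> cloneSet) {
        return cloneSet.size();
    }

    /** LEN(S): the average token-sequence length of the clones in the set. */
    static double len(List<List<String>> cloneSet) {
        return cloneSet.stream().mapToInt(List::size).average().orElse(0.0);
    }

    /**
     * RNR(S): percentage of tokens that are NOT part of a repeated token sequence.
     * Assumption: a token ending with "*" marks a repeated token, as on the slides.
     */
    static double rnr(List<List<String>> cloneSet) {
        long total = 0;
        long nonRepeated = 0;
        for (List<String> clone : cloneSet) {
            total += clone.size();
            nonRepeated += clone.stream().filter(t -> !t.endsWith("*")).count();
        }
        return total == 0 ? 0.0 : 100.0 * nonRepeated / total;
    }
}
```

For a clone set made of two occurrences of [c c*], this sketch gives LEN(S) = 2, RNR(S) = 50 and POP(S) = 2, which agrees with the LEN(S) and RNR(S) values shown on the slides.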

Single Clone Metric (1/2)
Clone sets whose RNR(S) is higher
- They do not form a single semantic unit
- Semantic unit: many instructions forming a single functionality
/* Code Clone in a clone set whose RNR(S) is the second highest in Ant */
else {
    // is the zip file in the cache
    ZipFile zipFile = (ZipFile) zipFiles.get(file);
    if (zipFile == null) {
        zipFile = new ZipFile(file);
        zipFiles.put(file, zipFile);
    }
    ZipEntry entry = zipFile.getEntry(resourceName);
    if (entry != null) {
This code clone is only a part of a semantic unit: not appropriate for refactoring!

Single Clone Metric (2/2)
Clone sets whose POP(S) is higher
- They include many language-dependent code clones
/* Code Clone in a clone set whose POP(S) is the highest in Ant */
out.println("\">");
out.println("");
out.print("<!ELEMENT project (target | ");
out.print(TASKS);
out.print(" | ");
out.print(TYPES);
Not appropriate for refactoring!

Key Idea
Extracting refactorable code clones using just a single clone metric is not appropriate
- According to our experience
We propose a method based on combined clone metrics
- To overcome the weakness of single-metric-based extraction

Combined Clone Metrics
Clone sets whose RNR(S) and POP(S) are both higher
- Each code clone forms a single semantic unit
/* Code Clone in a clone set whose RNR(S) and POP(S) are higher than others */
if (ifProperty != null && p.getProperty(ifProperty) == null) {
    return false;
} else if (unlessProperty != null && p.getProperty(unlessProperty) != null) {
    return false;
}
return true;
}
Appropriate for refactoring!

Case Study (1/2)
Goal: validating our key idea
- Using combined clone metrics is a feasible method to extract code clones for refactoring
Target system
- Industrial Java software developed by NEC
- 110 KLOC, 736 clone sets

Case Study (2/2)
Experimental steps
1. Selected 62 clone sets from CCFinder's output using clone metrics.
2. Conducted a survey about these clone sets and got feedback from a developer.
[Figure: source files, CCFinder, clone sets selected using clone metrics, survey, feedback]

Subject Code Clones (1/2)
Clone sets for which a single clone metric value is high
- Clone sets whose LEN(S) value is in the top 10
- Clone sets whose RNR(S) value is in the top 10
- Clone sets whose POP(S) value is in the top 10

Subject Code Clones (2/2)
Clone sets whose combined clone metric values are high
- 15 clone sets whose LEN(S) and RNR(S) values both rank high, in the top 15
- 7 clone sets whose LEN(S) and POP(S) values both rank high, in the top 15
- 18 clone sets whose RNR(S) and POP(S) values both rank high, in the top 15
- 1 clone set whose LEN(S), RNR(S) and POP(S) values all rank high, in the top 15
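The selection of clone sets that rank high on two metrics at once can be sketched as an intersection of per-metric rankings. This is a minimal sketch under assumptions, not the authors' tooling: the names CombinedMetricFilter, CloneSet, topK and topKOnBoth are hypothetical, and the cutoff k is treated as a rank position. The slides do not say how "top 15" is defined or how ties at the cutoff are handled, so both points are assumptions here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleFunction;

/** Hypothetical filter: keeps clone sets that rank in the top k on two metrics at once. */
final class CombinedMetricFilter {

    /** Hypothetical holder for the metric values of one clone set. */
    static final class CloneSet {
        final String id;
        final double len;
        final double rnr;
        final int pop;

        CloneSet(String id, double len, double rnr, int pop) {
            this.id = id;
            this.len = len;
            this.rnr = rnr;
            this.pop = pop;
        }
    }

    /** Ids of the k clone sets with the highest value of the given metric. */
    static Set<String> topK(List<CloneSet> all, ToDoubleFunction<CloneSet> metric, int k) {
        List<CloneSet> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble(metric).reversed());
        Set<String> ids = new LinkedHashSet<>();
        for (CloneSet s : sorted.subList(0, Math.min(k, sorted.size()))) {
            ids.add(s.id);
        }
        return ids;
    }

    /** Ids of the clone sets that are in the top k for both metrics (set intersection). */
    static Set<String> topKOnBoth(List<CloneSet> all,
                                  ToDoubleFunction<CloneSet> first,
                                  ToDoubleFunction<CloneSet> second,
                                  int k) {
        Set<String> result = topK(all, first, k);
        result.retainAll(topK(all, second, k));
        return result;
    }
}
```

A call such as CombinedMetricFilter.topKOnBoth(cloneSets, s -> s.rnr, s -> s.pop, 15) would then correspond to the "RNR(S) and POP(S)" group, given a list cloneSets built from a clone detector's output. A single-metric filter is simply topK with one metric.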

Results of Case Study (1/2)
#Selected Clone Sets: the number of selected clone sets
#Refactoring: the number of clone sets marked as "Perform refactoring" in the survey
Table columns: Filtering | #Selected Clone Sets | #Refactoring | Precision
Table rows: Each single clone metric; Combined clone metrics

Results of Case Study (2/2)
Precision: "How many refactoring candidates were accepted by a developer?"
Precision = #Refactoring / #Selected Clone Sets
Clone sets selected with combined clone metrics were accepted as refactoring candidates by the developer more often.
Table columns: Filtering | #Selected Clone Sets | #Refactoring | Precision
Table rows: Each single clone metric; Combined clone metrics

Summary and Future Work
Summary
- Our industrial case study shows that our key idea is appropriate.
Future work
- Investigate recall
- Conduct case studies on open source software
- Suggest a new metric

Thank You

Clone sets whose RNR(S) is higher than others
Each code clone in a clone set S consists of more non-repeated token sequences
/* Code Clone in a clone set whose RNR(S) is the second highest in Ant */
else {
    // is the zip file in the cache
    ZipFile zipFile = (ZipFile) zipFiles.get(file);
    if (zipFile == null) {
        zipFile = new ZipFile(file);
        zipFiles.put(file, zipFile);
    }
    ZipEntry entry = zipFile.getEntry(resourceName);
    if (entry != null) {
        /* … */

Clone sets whose RNR(S) is lower than others
Each code clone consists of more repeated token sequences
- They tend to be language-dependent code clones
/* Code Clone in a clone set whose RNR(S) is the lowest in Ant */
String sosCmdDir = null;
/* … skipped code … */
private String filename = null;
private boolean noCompress = false;
private boolean noCache = false;
private boolean recursive = false;
private boolean verbose = false;
/* … */
Consecutive variable declarations

Survey Format: About Clone Set XXX
(1) Do you think that this clone set needs a practice?
[] Yes
[] No (→ Jump to the next clone set)
(2) If you marked "Yes" in your answer to (1), what practice is appropriate for this clone set?
[] Refactoring
[] Write comments about the code clones, but don't perform refactoring.
[] Change nothing.
[] Others: ( )
(3) Write the reason for your answer to (2).
Reason:

Results and Precision of Each Clone Set Group in the Survey
Table columns: Filtering | #Selected Clone Sets | #Refactoring | Precision
Table rows (Filtering):
- Clone sets whose LEN(S) value is in the top 10
- Clone sets whose RNR(S) value is in the top 10
- Clone sets whose POP(S) value is in the top 10
- Clone sets whose LEN(S) and RNR(S) values both rank high, in the top 15
- Clone sets whose LEN(S) and POP(S) values both rank high, in the top 15
- Clone sets whose RNR(S) and POP(S) values both rank high, in the top 15
- 1 clone set whose LEN(S), RNR(S), and POP(S) values all rank high, in the top 15

Clone Metric: RNR(S) (1/2)
Files:
- F1: a b c a b
- F2: c c* c* a b
- F3: d a b, e f
- F4: c c* d e f
The superscript * indicates that the token is in a repeated token sequence.
Clone set S1: four code clones, each the token sequence [a b]
RNR(S1) = ((2 + 2 + 2 + 2) / (2 + 2 + 2 + 2)) × 100 = 100

Clone Metric: RNR(S) (2/2)
Files:
- F1: a b c a b
- F2: c c* c* a b
- F3: d a b, e f
- F4: c c* d e f
The superscript * indicates that the token is in a repeated token sequence.
Clone set S2: {[c c*], [c* c*], [c c*]}
RNR(S2) = ((1 + 0 + 1) / (2 + 2 + 2)) × 100 ≈ 33.3

Subject Code Clones
62 clone sets
Clone sets whose individual clone metric value is high
- S_LEN: clone sets whose LEN(S) value is in the top 10
- S_RNR: clone sets whose RNR(S) value is in the top 10
- S_POP: clone sets whose POP(S) value is in the top 10
Clone sets whose combined clone metric values are high
- S_LEN∙RNR: 15 clone sets whose LEN(S) and RNR(S) values both rank high, in the top 15
- S_LEN∙POP: 7 clone sets whose LEN(S) and POP(S) values both rank high, in the top 15
- S_RNR∙POP: 18 clone sets whose RNR(S) and POP(S) values both rank high, in the top 15
- S_LEN∙RNR∙POP: 1 clone set whose LEN(S), RNR(S) and POP(S) values all rank high, in the top 15

The Number of Duplicate Clone Sets
|S_RNR ∩ S_POP ∩ S_RNR∙POP| = 1
|S_RNR ∩ S_RNR∙POP| = 2
|S_POP ∩ S_RNR∙POP| = 2
|S_LEN∙RNR ∩ S_LEN∙POP ∩ S_RNR∙POP ∩ S_LEN∙RNR∙POP| = 1
CS Seminar, 2010/12/01

Example of a clone set that was not selected…
It is too short to form a semantic unit.
The RNR metric sometimes extracts unintended code clones
- E.g., language-dependent code clones
boolean isEqual(final DeweyDecimal other) {
    final int max = Math.max(other.components.length, components.length);
    for (int i = 0; i < max; i++) {
        final int component1 = (i < components.length) ? components[i] : 0;
        final int component2 = (i < other.components.length) ? other.components[i] : 0;
        if (