Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Finding Code Clones.

Presentation transcript:

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University
Finding Code Clones for Refactoring with Clone Metrics: A Case Study of Open Source Software
Eunjong Choi†, Norihiro Yoshida‡, Takashi Ishio†, Katsuro Inoue†, and Tateki Sano*
†Osaka University, Japan; ‡Nara Institute of Science and Technology, Japan; *NEC Corporation, Japan

Contents
1. Background
2. Clone Metrics
3. Industrial Case Study
4. Case Study of Open Source Software
5. Summary and Future Work

Background: Clone
- Clone: identical or similar code fragments in source code.
- The presence of code clones is an indication of low maintainability of software: if a bug is found in a code clone, the other code clones have to be checked for the same defect.

Background: Refactoring [Fowler1999] (1/2)
- Refactoring is the process of restructuring existing code.
  - It alters the software's internal structure without changing its external behavior.
  - It improves the maintainability of the software.
[Fowler1999] M. Fowler, et al., Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.

Background: Refactoring [Fowler1999] (2/2)
- Refactoring code clones: merge the code clones into a single program unit and replace each clone with a call statement.
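
As an illustration of merging clones into a single program unit, the sketch below applies an Extract Method style refactoring to the two nearly identical JBoss loops quoted later in this deck. The class name `EscapeRefactoring` and the method names `escapeEach` and `encode` are hypothetical, introduced only for this example, and the helper assumes a one-character target, as in the original loops.

```java
final class EscapeRefactoring {
    // Hypothetical helper extracted from the two cloned loops: replaces every
    // occurrence of a one-character target with the given escape sequence.
    // Assumes the escape sequence does not itself contain the target.
    static String escapeEach(String text, String target, String escape) {
        int index = text.indexOf(target);
        while (index >= 0) {
            text = (index > 0 ? text.substring(0, index) : "")
                    + escape
                    + ((index + 1) < text.length() ? text.substring(index + 1) : "");
            index = text.indexOf(target);
        }
        return text;
    }

    // Each former clone is now a single call statement.
    static String encode(String value) {
        String result = escapeEach(value, "*", "%2a");
        return escapeEach(result, ":", "%3a");
    }

    public static void main(String[] args) {
        System.out.println(encode("a*b:c")); // prints a%2ab%3ac
    }
}
```

The loop body is kept as close as possible to the quoted fragment so that the correspondence with the original clones stays visible; a real refactoring would probably simplify it further.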

Background: Language-dependent Code Clone
- A language-dependent code clone unavoidably exists in source code because of the specification of the programming language used.
- Example of a language-dependent code clone (consecutive setter invocations):
    replacement.setTaskType(taskType);
    replacement.setTaskName(taskName);
    replacement.setLocation(location);
    replacement.setOwningTarget(target);
    replacement.setRuntime(wrapper);
    wrapper.setProxy(replacement);

Background: Clone Set
- A clone set is a set of code clones, e.g. Clone Set = {Code Clone 1, Code Clone 2, Code Clone 3}.

Background: Clone Metrics [Higo2007]
- Quantitative information on clone sets, e.g. LEN(S), RNR(S), POP(S).
- Purposes:
  - To check features of code clones in software.
  - To extract code clones for several purposes, e.g. the highest length of code clones…
[Higo2007] Y. Higo, T. Kamiya, S. Kusumoto, K. Inoue, "Method and Implementation for Investigating Code Clones in a Software System", Information and Software Technology (2007-9).

Clone Metrics: LEN(S)
- The average length of the token sequences of the code clones in a clone set S.
- Example: if the token sequence [a b b] is detected as a code clone in clone set S, then LEN(S) = 3.

Clone Metrics: RNR(S)
- The ratio of non-repeated token sequences of the code clones in a clone set S.
- RNR(S) = (length of non-repeated token sequence / length of whole token sequence) × 100
- Example: for the clone set S with the token sequence [a b b], RNR(S) = (1 / 3) × 100 = 33.3.
- Selecting clone sets with a high RNR value eliminates language-dependent code clones.

Clone Metrics: POP(S)
- The number of code clones in a clone set S.
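
The three metric definitions above can be sketched in a few lines of Java. This is only an illustration of the definitions as stated on the slides, not the implementation used by the authors' tools. The class name `CloneMetrics` and the modeling of a code clone as an array of per-token "repeated" flags are assumptions of this sketch; which of the three tokens is flagged as repeated is also an assumption, since the slide only gives the ratio 1 / 3.

```java
import java.util.List;

final class CloneMetrics {
    // Assumption of this sketch: a code clone is modeled only by its per-token flags.
    // flags.length is the token count; flags[i] == true marks a token that belongs
    // to a repeated token sequence.

    // LEN(S): average token-sequence length of the code clones in S.
    static double len(List<boolean[]> cloneSet) {
        int total = 0;
        for (boolean[] fragment : cloneSet) {
            total += fragment.length;
        }
        return (double) total / cloneSet.size();
    }

    // RNR(S): percentage of tokens that are not part of a repeated sequence.
    static double rnr(List<boolean[]> cloneSet) {
        int nonRepeated = 0;
        int whole = 0;
        for (boolean[] fragment : cloneSet) {
            whole += fragment.length;
            for (boolean repeated : fragment) {
                if (!repeated) {
                    nonRepeated++;
                }
            }
        }
        return 100.0 * nonRepeated / whole;
    }

    // POP(S): number of code clones in S.
    static int pop(List<boolean[]> cloneSet) {
        return cloneSet.size();
    }

    public static void main(String[] args) {
        // Two occurrences of a three-token clone with one non-repeated token each.
        List<boolean[]> s = List.of(
                new boolean[] {false, true, true},
                new boolean[] {false, true, true});
        System.out.println("LEN = " + len(s));
        System.out.println("RNR = " + rnr(s));
        System.out.println("POP = " + pop(s));
    }
}
```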

Single Clone Metric (1/3)
- Clone sets whose LEN(S) is higher:
  - They include many consecutive if (or if-else) blocks.
  - The blocks involve similar but different conditional expressions.
- Code clone in a clone set whose POP(S) is the highest in Ant 1.7.0:
    if ((p = getProject().getProperty("ant.netrexxc.binary")) != null) {
        this.binary = Project.toBoolean(p);
    }
    // classpath makes no sense
    if ((p = getProject().getProperty("ant.netrexxc.comments")) != null) {
        this.comments = Project.toBoolean(p);
    }
    … (the last part is omitted)

Single Clone Metric (2/3)
- Clone sets whose RNR(S) is higher:
  - They do not organize a single semantic unit.
  - Semantic unit: many instructions forming a single functionality.
- Code clone in a clone set whose RNR(S) is the second highest in Ant (only a part of a semantic unit):
    else {
        // is the zip file in the cache
        ZipFile zipFile = (ZipFile) zipFiles.get(file);
        if (zipFile == null) {
            zipFile = new ZipFile(file);
            zipFiles.put(file, zipFile);
        }
        ZipEntry entry = zipFile.getEntry(resourceName);
        if (entry != null) {

Single Clone Metric (3/3)
- Clone sets whose POP(S) is higher:
  - They include many language-dependent code clones.
- Code clone in a clone set whose POP(S) is higher than others:
    out.println("\">");
    out.println("");
    out.print("<!ELEMENT project (target | ");
    out.print(TASKS);
    out.print(" | ");
    out.print(TYPES);

Key Idea
- According to our experience, it is not appropriate to extract code clones for refactoring using just a single clone metric.
- We propose a method based on combined clone metrics to overcome the weakness of single-metric-based extraction.

Combined Clone Metrics
- Clone sets whose RNR(S) and POP(S) are higher:
  - Each code clone organizes a single semantic unit.
  - Appropriate for refactoring!
- Code clone in a clone set whose RNR(S) and POP(S) are higher than others:
        if (ifProperty != null && p.getProperty(ifProperty) == null) {
            return false;
        } else if (unlessProperty != null && p.getProperty(unlessProperty) != null) {
            return false;
        }
        return true;
    }

Industrial Case Study (1/2)
- Goal: validating our key idea.
  - Using combined clone metrics is a feasible method to extract code clones for refactoring.
- Target system:
  - Industrial Java software developed by NEC.
  - 110 KLOC, 736 clone sets.

Industrial Case Study (2/2)
Experimental steps:
1. Selected 62 clone sets from CCFinder's output using clone metrics.
2. Conducted a survey about these clone sets and got feedback from a developer.
Workflow: source files -> CCFinder -> clone sets -> selection using clone metrics -> survey -> feedback.

Subject Code Clones (1/2)
- Clone sets for which a single clone metric value is high:
  - S_LEN: clone sets whose LEN(S) value is in the top 10.
  - S_RNR: clone sets whose RNR(S) value is in the top 10.
  - S_POP: clone sets whose POP(S) value is in the top 10.

Subject Code Clones (2/2)
- Clone sets for which combined clone metric values are high:
  - S_LENRNR: 15 clone sets whose LEN(S) and RNR(S) values both rank in the top 15.
  - S_LENPOP: 7 clone sets whose LEN(S) and POP(S) values both rank in the top 15.
  - S_RNRPOP: 18 clone sets whose RNR(S) and POP(S) values both rank in the top 15.
  - S_LENRNRPOP: 1 clone set whose LEN(S), RNR(S), and POP(S) values all rank in the top 15.
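
The combined selection above can be read as an intersection of per-metric top rankings. The following sketch shows that reading under stated assumptions: the transcript does not say whether "top 15" is a count or a percentage, so the cutoff is left as a plain parameter `k` (a count); the class `CombinedMetricFilter`, the nested `CloneSet` type, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleFunction;

final class CombinedMetricFilter {
    // Hypothetical record of one clone set with its three metric values.
    static final class CloneSet {
        final String id;
        final double len;
        final double rnr;
        final double pop;

        CloneSet(String id, double len, double rnr, double pop) {
            this.id = id;
            this.len = len;
            this.rnr = rnr;
            this.pop = pop;
        }
    }

    static final ToDoubleFunction<CloneSet> LEN = c -> c.len;
    static final ToDoubleFunction<CloneSet> RNR = c -> c.rnr;
    static final ToDoubleFunction<CloneSet> POP = c -> c.pop;

    // Ids of the k clone sets with the highest value of one metric.
    static Set<String> topK(List<CloneSet> all, ToDoubleFunction<CloneSet> metric, int k) {
        List<CloneSet> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble(metric).reversed());
        Set<String> ids = new LinkedHashSet<>();
        for (CloneSet c : sorted.subList(0, Math.min(k, sorted.size()))) {
            ids.add(c.id);
        }
        return ids;
    }

    // Clone sets that rank in the top k for every given metric.
    static Set<String> combined(List<CloneSet> all, int k, List<ToDoubleFunction<CloneSet>> metrics) {
        Set<String> result = null;
        for (ToDoubleFunction<CloneSet> metric : metrics) {
            Set<String> top = topK(all, metric, k);
            if (result == null) {
                result = top;
            } else {
                result.retainAll(top);
            }
        }
        return result == null ? new LinkedHashSet<>() : result;
    }

    public static void main(String[] args) {
        // Made-up metric values, for illustration only.
        List<CloneSet> all = List.of(
                new CloneSet("A", 50, 90, 2),
                new CloneSet("B", 30, 95, 8),
                new CloneSet("C", 80, 40, 9),
                new CloneSet("D", 20, 85, 7));
        // An S_RNRPOP-style selection with k = 2.
        System.out.println(combined(all, 2, List.of(RNR, POP))); // prints [B]
    }
}
```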

In Survey: About Clone Set XXX
Q. Which practice is appropriate for this clone set?
[ ] Perform refactoring.
[ ] Write comments about code clones, but don't perform refactoring.
[ ] Change nothing.
[ ] Others. ( )

In Survey: About Clone Set XXX (continued)
- A clone set is counted as appropriate for refactoring when the developer checks "Perform refactoring".

In Survey: About Clone Set XXX (continued)
- A clone set is counted as inappropriate for refactoring when the developer checks any of the other three options.

Results of Case Study (1/2)
- #Selected Clone Sets: the number of selected clone sets.
- #Refactoring: the number of clone sets marked as "Perform refactoring" in the survey.
[Table: Filtering | #Selected Clone Sets | #Refactoring | Precision; rows: each single clone metric, combined clone metrics]

Results of Case Study (2/2)
- Precision: "How many refactoring candidates were accepted by a developer?"
- Precision = #Refactoring / #Selected Clone Sets
- Clone sets selected with combined clone metrics are accepted as refactoring candidates by the developer more often.
[Table: Filtering | #Selected Clone Sets | #Refactoring | Precision; rows: each single clone metric, combined clone metrics]
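
The precision measure is a simple ratio over the survey answers. A minimal sketch follows, assuming one survey answer per selected clone set; the enum `Answer`, the class `SurveyPrecision`, and the sample answers are hypothetical and are not the study's data.

```java
import java.util.List;

final class SurveyPrecision {
    // The four survey options, one answer per selected clone set.
    enum Answer { PERFORM_REFACTORING, WRITE_COMMENTS_ONLY, CHANGE_NOTHING, OTHERS }

    // Precision = #Refactoring / #Selected Clone Sets.
    static double precision(List<Answer> answers) {
        if (answers.isEmpty()) {
            throw new IllegalArgumentException("no clone sets were selected");
        }
        long refactoring = answers.stream()
                .filter(a -> a == Answer.PERFORM_REFACTORING)
                .count();
        return (double) refactoring / answers.size();
    }

    public static void main(String[] args) {
        // Made-up answers for four selected clone sets.
        List<Answer> answers = List.of(
                Answer.PERFORM_REFACTORING,
                Answer.CHANGE_NOTHING,
                Answer.PERFORM_REFACTORING,
                Answer.WRITE_COMMENTS_ONLY);
        System.out.println(precision(answers)); // prints 0.5
    }
}
```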

Case Study of Open Source Software
- Goal: validating our key idea using open source software.
  - Using combined clone metrics is a feasible method to extract code clones for refactoring.
- Experimental steps:
  1. Selected clone sets from CCFinder's output using clone metrics.
  2. Checked whether the clone sets are appropriate for performing refactoring.

Target systems (implemented in Java)
- Apache Ant: 198 KLOC, 998 clone sets.
- JBoss: 633 KLOC, 4284 clone sets.

Subject clone sets
- Apache Ant: 87 clone sets.
- JBoss: 299 clone sets.
- Clone sets for which a single clone metric value is in the top 10.
- Clone sets for which combined clone metric values rank in the top 15.

Subject Code Clones (Apache Ant)
[Table: Filtering | #Selected Clone Sets | #Refactoring | Precision; rows: each single clone metric, combined clone metrics]

Subject Code Clones (JBoss)
[Table: Filtering | #Selected Clone Sets | #Refactoring | Precision; rows: each single clone metric, combined clone metrics]
Q. Why do the results differ between the two software systems? Is it because the open source software does not follow coding rules?

Analysis of Results: Defects of RNR Metric (1/2)
- The RNR metric sometimes extracts unintentional code clones, e.g. language-dependent code clones.

Analysis of Results: Defects of RNR Metric (2/2)
- Code clone in a clone set whose LEN(S) and RNR(S) (= 96) values rank in the top 15 in JBoss:
    lIndex = lReturn.indexOf( "*" );
    while( lIndex >= 0 ) {
        lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%2a" +
            ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" );
        lIndex = lReturn.indexOf( "*" );
    }
    lIndex = lReturn.indexOf( ":" );
    while( lIndex >= 0 ) {
        lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%3a" +
            ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" );
        lIndex = lReturn.indexOf( ":" );
    }

Analysis of Results: Defects of RNR Metric (2/2, continued)
- Question raised about the same JBoss code clone: is the value of RNR really 96?

Analysis of Results: Defects of RNR Metric (2/2, continued)
- Code clone in a clone set whose LEN(S) and RNR(S) (= 96) values rank in the top 15 in JBoss.
- RNR value of this clone set: code clone in a clone set whose LEN(S) and RNR(S) (= 50).

Summary and Future Work
- Summary:
  - We conducted a case study to validate our key idea and discussed its results.
- Future work:
  - Update the metrics used.
  - Investigate recall.
  - Use more metrics.
  - Conduct case studies of open source software.

Thank You for Your Attention! (Korean: 감사합니다. Japanese: ありがとうございます)

Example of a clone set that is not selected
- It is too short to organize a semantic unit.
- The RNR metric sometimes extracts unintentional code clones, e.g. language-dependent code clones.
    boolean isEqual(final DeweyDecimal other) {
        final int max = Math.max(other.components.length, components.length);
        for (int i = 0; i < max; i++) {
            final int component1 = (i < components.length) ? components[ i ] : 0;
            final int component2 = (i < other.components.length) ? other.components[ i ] : 0;
            if (

Clone sets whose RNR(S) is higher than others
- Each code clone in the clone set S consists of more non-repeated token sequences.
    /* Code clone in a clone set whose RNR(S) is the second highest in Ant */
    else {
        // is the zip file in the cache
        ZipFile zipFile = (ZipFile) zipFiles.get(file);
        if (zipFile == null) {
            zipFile = new ZipFile(file);
            zipFiles.put(file, zipFile);
        }
        ZipEntry entry = zipFile.getEntry(resourceName);
        if (entry != null) {
            /* … */

Clone sets whose RNR(S) is lower than others
- Each code clone consists of more repeated token sequences.
  - Such clone sets tend to be language-dependent code clones, e.g. consecutive variable declarations.
    /* Code clone in a clone set whose RNR(S) is the lowest in Ant */
    String sosCmdDir = null;
    /* … skipped code … */
    private String filename = null;
    private boolean noCompress = false;
    private boolean noCache = false;
    private boolean recursive = false;
    private boolean verbose = false;
    /* … */

Clone Metric: RNR(S) (1/2)
- Files:
  - F1: a b c a b
  - F2: c c* c* a b
  - F3: d a b e f
  - F4: c c* d e f
- The superscript * indicates that the token is in a repeated token sequence.
- Clone set S1: the four code clones with the token sequence [a b].
- RNR(S1) is obtained by applying the RNR(S) formula to the code clones in S1.

Clone Metric: RNR(S) (2/2)
- Files:
  - F1: a b c a b
  - F2: c c* c* a b
  - F3: d a b e f
  - F4: c c* d e f
- The superscript * indicates that the token is in a repeated token sequence.
- Clone set S2: the three code clones [c c*], [c* c*], and [c c*].
- RNR(S2) is obtained by applying the RNR(S) formula to the code clones in S2.
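
The numeric results on these two slides did not survive in the transcript. As a hedged worked example, the definition of RNR(S) given earlier in the deck can be applied directly, assuming S1 consists of four occurrences of [a b] with no starred tokens and S2 consists of [c c*], [c* c*], and [c c*]. The symbols LOS_nonrep and LOS_whole are notation introduced here for the non-repeated and whole token-sequence lengths of one code clone.

```latex
\[
\mathrm{RNR}(S) \;=\;
\frac{\sum_{f \in S} \mathrm{LOS}_{\mathrm{nonrep}}(f)}
     {\sum_{f \in S} \mathrm{LOS}_{\mathrm{whole}}(f)} \times 100
\]
\[
\mathrm{RNR}(S_1) = \frac{2+2+2+2}{2+2+2+2} \times 100 = 100,
\qquad
\mathrm{RNR}(S_2) = \frac{1+0+1}{2+2+2} \times 100 \approx 33.3
\]
```

Under these assumptions S1 contains no repeated tokens at all, while two thirds of the tokens in S2 are repeated, which is why a high-RNR filter would keep S1 and discard S2.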

The Number of Duplicate Clone Sets (Industrial)
- |S_RNR ∩ S_POP ∩ S_RNR·POP| = 1
- |S_RNR ∩ S_RNR·POP| = 2
- |S_POP ∩ S_RNR·POP| = 2
- |S_LEN·RNR ∩ S_LEN·POP ∩ S_RNR·POP ∩ S_LEN·RNR·POP| = 1
CS Seminar, 2010/12/01

The Number of Duplicate Clone Sets (Apache Ant)
- |S_RNR ∩ S_RNR·POP| = 1
- |S_POP ∩ S_RNR·POP| = 1
- |S_POP ∩ S_LEN·POP| = 1

The Number of Duplicate Clone Sets (JBoss)
- |S_RNR ∩ S_LEN·RNR| = 3
- |S_RNR ∩ S_RNR·POP| = 1
- |S_LEN·RNR ∩ S_LEN·POP ∩ S_RNR·POP ∩ S_LEN·RNR·POP| = 2

Clone Set Metrics
- LEN(C): length of the token sequence of each element in clone set C.
- POP(C): number of elements in clone set C.
- RAD(C): distribution in the file system of the elements in clone set C.
- DFL(C): estimation of how many tokens would be removed from the source files when all code fragments of clone set C are replaced with caller statements of a new identical routine (the fragments become one new subroutine plus caller statements).
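
The slide describes DFL(C) only in words, so the sketch below is one possible reading rather than the formula used by the authors' tools. It assumes every code fragment has LEN(C) tokens, that each caller statement costs a fixed number of tokens, and that the new routine adds some declaration overhead; the parameters `callerTokens` and `declarationOverhead` and the class `DflEstimate` are assumptions introduced for illustration.

```java
final class DflEstimate {
    // Estimated number of tokens removed when POP fragments of LEN tokens each
    // are replaced by caller statements plus one shared routine.
    // callerTokens and declarationOverhead are assumed values, not from the slides.
    static int dfl(int len, int pop, int callerTokens, int declarationOverhead) {
        int before = len * pop;                                   // tokens in all clones
        int after = callerTokens * pop + len + declarationOverhead; // callers + new routine
        return before - after;
    }

    public static void main(String[] args) {
        // Three clones of 40 tokens, 5-token callers, no declaration overhead.
        System.out.println(dfl(40, 3, 5, 0)); // prints 65
    }
}
```

A negative result under this reading would indicate that merging the clone set does not reduce the token count.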

Results and Precision of Each Clone Set in the Survey
[Table: Filtering | #Selected Clone Sets | #Refactoring | Precision]
Rows:
- Clone sets whose LEN(S) value is in the top 10
- Clone sets whose RNR(S) value is in the top 10
- Clone sets whose POP(S) value is in the top 10
- Clone sets whose LEN(S) and RNR(S) values are high rank in the top
- Clone sets whose LEN(S) and POP(S) values are high rank in the top
- Clone sets whose RNR(S) and POP(S) values are high rank in the top
- 1 clone set whose LEN(S), RNR(S), and POP(S) values are high rank in the top

Subject Code Clones (Apache Ant)
[Table: Clone Sets | #Selected Clone Sets | #Refactoring | Precision; rows: S_LEN, S_RNR, S_POP, S_LENRNR, S_LENPOP, S_RNRPOP, S_LENRNRPOP (---)]

Subject Code Clones (JBoss)
[Table: Clone Sets | #Selected Clone Sets | #Refactoring | Precision; rows: S_LEN, S_RNR, S_POP, S_LENRNR, S_LENPOP, S_RNRPOP, S_LENRNRPOP]