Aiding Comprehension of Cloning Through Categorization Cory Kapser and Michael W. Godfrey Software Architecture Group School of Computer Science, University.


1 Aiding Comprehension of Cloning Through Categorization
Cory Kapser and Michael W. Godfrey
Software Architecture Group, School of Computer Science, University of Waterloo

2 Overview
• Motivation
• Background
• Methods
• Case Studies
• Results
• Discussion
• Summary

3 Motivation
• Code duplication (“cloning”) is common in large, long-lived industrial software systems.
  – Negatively affects successful system evolution!
• Thus, clone management or removal is desirable.

4 Problems with clone detection technologies
• Comprehension
  – Result sets often provide little information beyond “it’s a clone”
• Scalability
  – VERY large result sets typical
• Accuracy
  – Esp. false positives

5 Proposed solution
• Classification of clones
  – Improve comprehension through informative grouping and statistical analysis
  – Improve scalability through easier navigation
  – Improve accuracy through region-specific filtering

6 Overview
• Motivation
• Background
• Methods
• Case Studies
• Results
• Discussion
• Summary

7 Code cloning
• A serious problem in industrial software.
  – Typically, 15% of a system is duplicated code.
  – As high as 50% in some cases [Ducasse]

8 Reasons for code cloning
• Perceived cost
• Time constraints
• Insufficient understanding of the underlying problem
• Architectural clarity

9 Problems with clones
• Maintenance
• Size
• Comprehension
• Bugs (copied and new)
• Indication of poor design

10 Managing clones
• Removal
• Documentation

11 Overview
• Motivation
• Background
• Methods
• Case Studies
• Results
• Discussion
• Summary

12 Our approach
1. Perform clone detection
2. Extract/define “regions” from source code
3. Map clone pairs to regions
4. Classify clones
5. Filter clones
6. Display results
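Steps 2 and 3 above can be sketched roughly as follows. The slides do not say how a clone fragment is matched to a region, so this sketch assumes the smallest region in the same file that fully contains the fragment. The `Region` and `Fragment` types, the function names, and the file paths are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """A syntactic region extracted from source code (hypothetical type)."""
    path: str
    kind: str   # e.g. "function", "loop", "conditional"
    start: int  # first line, inclusive
    end: int    # last line, inclusive


@dataclass(frozen=True)
class Fragment:
    """One side of a clone pair as reported by a clone detector."""
    path: str
    start: int
    end: int


def enclosing_region(frag: Fragment, regions: List[Region]) -> Optional[Region]:
    """Return the smallest region in the same file that fully contains frag.

    Assumption: "smallest enclosing region" is the mapping rule; the
    original slides do not specify one.
    """
    candidates = [
        r for r in regions
        if r.path == frag.path and r.start <= frag.start and frag.end <= r.end
    ]
    return min(candidates, key=lambda r: r.end - r.start, default=None)


def map_pair(pair: Tuple[Fragment, Fragment], regions: List[Region]):
    """Map both fragments of a clone pair to their enclosing regions."""
    a, b = pair
    return enclosing_region(a, regions), enclosing_region(b, regions)


# Made-up example data.
regions = [
    Region("fs/a.c", "function", 10, 60),
    Region("fs/a.c", "loop", 20, 30),
    Region("fs/b.c", "function", 5, 40),
]
pair = (Fragment("fs/a.c", 22, 28), Fragment("fs/b.c", 10, 16))
region_a, region_b = map_pair(pair, regions)
print(region_a.kind, region_b.kind)
```

A fragment that falls outside every known region maps to `None`, which is one plausible way a clone could end up unclassified.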

13 The taxonomy
• Classifies clones according to attributes such as location and region type of a clone
• Hierarchical
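A minimal sketch of a two-level classification by location and then by region type is given here. Only "Same Directory Clones" and the function, loop, and conditional clone types are named in these slides; the "same file" and "different directory" categories, the "mixed" label, and the function names are assumptions made for illustration, not the authors' taxonomy.

```python
import posixpath


def location_category(path_a: str, path_b: str) -> str:
    """Top level of the hierarchy: where the two fragments live.

    Only "same directory" is named in the slides; the other two
    categories are assumptions.
    """
    if path_a == path_b:
        return "same file"
    if posixpath.dirname(path_a) == posixpath.dirname(path_b):
        return "same directory"
    return "different directory"


def classify(path_a: str, kind_a: str, path_b: str, kind_b: str):
    """Return a (location, region type) path through the hierarchy.

    Assumption: pairs whose two sides have different region kinds are
    labelled "mixed".
    """
    location = location_category(path_a, path_b)
    region = kind_a if kind_a == kind_b else "mixed"
    return (location, region)


# Hypothetical paths, for illustration only.
print(classify("fs/ext2/inode.c", "function", "fs/ext2/super.c", "function"))
```

Returning a tuple keeps the hierarchy explicit: grouping on the first element gives the location level, and grouping on both gives the finer region-type level.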

14

15 ADD A SLIDE HERE
• To discuss what you hoped your taxonomy would help you with
  – Why did you pick that design?
• Give an example of how using this taxonomy could be helpful in a (simple, made up) example case

16 Overview
• Motivation
• Background
• Methods
• Case Studies
• Results
• Discussion
• Summary

17 Case studies
• PostgreSQL
  – 543,387 LOC
  – 1,097 source files
• Linux kernel file-system subsystem
  – 280,177 LOC
  – 537 source files

18 Filtering and classification results
• 85–87% of clones could be classified using the taxonomy
• Fewer unclassified clones in Same Directory Clones category
• Large percentage of false positives were removed via filtering structural and prototype regions.
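The filtering step mentioned above might look something like the following sketch. It assumes each clone pair has already been labelled with the region kind of each side, and that a pair is dropped when either side lies in a filtered region kind. Both the "either side" rule and the kind labels ("structure", "prototype") are assumptions; the slides only say that structural and prototype regions were filtered.

```python
# Region kinds to filter out (labels are hypothetical).
FILTERED_KINDS = frozenset({"structure", "prototype"})


def keep_pair(kind_a: str, kind_b: str, filtered=FILTERED_KINDS) -> bool:
    """Keep a clone pair only if neither side lies in a filtered region kind.

    Assumption: one filtered side is enough to drop the pair.
    """
    return kind_a not in filtered and kind_b not in filtered


# Made-up clone pairs, given as (kind of side A, kind of side B).
pairs = [
    ("function", "function"),
    ("prototype", "prototype"),
    ("structure", "function"),
    ("loop", "loop"),
]
kept = [p for p in pairs if keep_pair(*p)]
print(kept)
```

Keeping the filtered kinds in a single set makes it easy to experiment with stricter or looser filters per region type.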

19 Overall cloning in the systems
• Function Clones dominate the Same Directory Clones.
• Most cloning occurs within the same directory.

20 Frequency of clone types
• Very few loop clones
• Relatively many conditional clones
• Function clones made up 38% of the clone pairs in the Linux fs and 53% of the clone pairs in PostgreSQL
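Percentages like those above can be computed with a simple tally once every clone pair carries a type label. The sketch below uses made-up labels and counts, not the study's data, and the `type_shares` function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def type_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return each clone type's share of all clone pairs, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up labels for ten clone pairs (not the study's data).
labels = ["function"] * 5 + ["conditional"] * 3 + ["loop"] * 2
shares = type_shares(labels)
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {share:.0f}%")
```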

21
• Is it possible to insert a table here with the results, even if it is partial (to show that the work is there and that there are numbers)?
• Or maybe a graph? Nice to have this to imply: here’s all the hard work we did, boy did we sweat, and there are so many results that the observations are probably meaningful

22 Overview
• Motivation
• Background
• Methods
• Case Studies
• Discussion
• Summary

23 Cloning comprehension
• Classification of clones can improve comprehension
  – User will have a working understanding of what a clone of a certain type means
  – We believe navigation of the “clone space” will be greatly improved
  – We now know more about cloning as it occurs in a software system
  – Simple metrics are now available

24 Tool support
• Clone Interpretation and Classification System (CICS)
  – Provides GUI to navigate classified clones
  – Will provide benchmarking support for clone detection tools
  – Many features can be added to complement the sorting of clones in the taxonomy

25 CICS

26 Overview
• Motivation
• Background
• Methods
• Case Studies
• Discussion
• Summary

27 Summary
• Management of clones is important for the healthy evolution of a software system
• We can make the process of managing clones more comprehensible, scalable, and accurate

28 Future work
• Deeper classification
• Benchmark suite
• IDE plugins
• Evolution of clones

