Published by Justin Sherman. Modified over 9 years ago.
1
Aiding Comprehension of Cloning Through Categorization
Cory Kapser and Michael W. Godfrey
Software Architecture Group, School of Computer Science, University of Waterloo
2
Overview
  Motivation
  Background
  Methods
  Case Studies
  Results
  Discussion
  Summary
3
Motivation
  Code duplication (“cloning”) is common in large, long-lived industrial software systems.
    – Negatively affects successful system evolution!
  Thus, clone management or removal is desirable.
4
Problems with clone detection technologies
  Comprehension
    – Result sets often provide little information beyond “it’s a clone”
  Scalability
    – VERY large result sets are typical
  Accuracy
    – Especially false positives
5
Proposed solution
  Classification of clones
    – Improve comprehension through informative grouping and statistical analysis
    – Improve scalability through easier navigation
    – Improve accuracy through region-specific filtering
6
Overview
  Motivation
  Background
  Methods
  Case Studies
  Results
  Discussion
  Summary
7
Code cloning
  A serious problem in industrial software.
    – Typically, 15% of a system is duplicated code.
    – As high as 50% in some cases [Ducasse]
8
Reasons for code cloning
  Perceived cost
  Time constraints
  Insufficient understanding of the underlying problem
  Architectural clarity
9
Problems with clones
  Maintenance
  Size
  Comprehension
  Bugs (copied and new)
  Indication of poor design
10
Managing clones
  Removal
  Documentation
11
Overview
  Motivation
  Background
  Methods
  Case Studies
  Results
  Discussion
  Summary
12
Our approach
  1. Perform clone detection
  2. Extract/define “regions” from source code
  3. Map clone pairs to regions
  4. Classify clones
  5. Filter clones
  6. Display results
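
Steps 2 and 3 of the approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Region` and `Segment` types, the line-range representation, and the "smallest enclosing region" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """A syntactic region extracted from source code (hypothetical type)."""
    path: str
    kind: str   # e.g. "function", "loop", "conditional"
    start: int  # first line, inclusive
    end: int    # last line, inclusive


@dataclass(frozen=True)
class Segment:
    """One side of a clone pair as reported by a detector (hypothetical type)."""
    path: str
    start: int
    end: int


def enclosing_region(seg: Segment, regions: List[Region]) -> Optional[Region]:
    """Return the smallest region in the same file that fully contains seg."""
    candidates = [
        r for r in regions
        if r.path == seg.path and r.start <= seg.start and seg.end <= r.end
    ]
    return min(candidates, key=lambda r: r.end - r.start, default=None)


def map_pair(pair: Tuple[Segment, Segment], regions: List[Region]):
    """Map both sides of a clone pair to their enclosing regions."""
    return tuple(enclosing_region(s, regions) for s in pair)


# Toy data, invented for illustration only.
regions = [
    Region("fs/a.c", "function", 10, 60),
    Region("fs/a.c", "loop", 20, 30),
    Region("fs/b.c", "function", 5, 40),
]
pair = (Segment("fs/a.c", 22, 28), Segment("fs/b.c", 10, 16))
left, right = map_pair(pair, regions)
print(left.kind, right.kind)  # loop function
```

A segment that falls outside every known region maps to `None`, which a later step could treat as unclassified.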
13
The taxonomy
  Classifies clones according to attributes such as the location and region type of a clone
  Hierarchical
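
A two-level classification of this kind might look like the sketch below: first by location, then by region type. Only "Same Directory Clones" and "Function Clones" are named in the slides; the other category labels, the function names, and the rule that mismatched region types are left unclassified are assumptions for illustration.

```python
import posixpath
from typing import Tuple


def location_category(path_a: str, path_b: str) -> str:
    """First level of the hierarchy: where the two clone segments live."""
    if path_a == path_b:
        return "same file"            # assumed category name
    if posixpath.dirname(path_a) == posixpath.dirname(path_b):
        return "same directory"
    return "different directory"      # assumed category name


def classify(path_a: str, kind_a: str, path_b: str, kind_b: str) -> Tuple[str, str]:
    """Return a (location, region type) label for one clone pair."""
    location = location_category(path_a, path_b)
    # Assumption: a pair is typed only when both sides share a region kind.
    region = f"{kind_a} clone" if kind_a == kind_b else "unclassified"
    return (location, region)


label = classify("fs/ext2/inode.c", "function", "fs/ext2/namei.c", "function")
print(label)  # ('same directory', 'function clone')
```

Because the label is a tuple ordered from coarse to fine, grouping pairs by its first element gives the top level of the hierarchy and grouping by the full tuple gives the leaves.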
15
ADD A SLIDE HERE
  To discuss what you hoped your taxonomy would help you with
    – Why did you pick that design?
  Give an example of how using this taxonomy could be helpful in a (simple, made-up) example case
16
Overview
  Motivation
  Background
  Methods
  Case Studies
  Results
  Discussion
  Summary
17
Case studies
  PostgreSQL
    – 543,387 LOC
    – 1,097 source files
  Linux kernel file-system subsystem
    – 280,177 LOC
    – 537 source files
18
Filtering and classification results
  85–87% of clones could be classified using the taxonomy
  Fewer unclassified clones in the Same Directory Clones category
  A large percentage of false positives was removed by filtering structural and prototype regions.
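
The slides do not say how the region-specific filter works, so the following is only a sketch of the simplest possible rule: discard a clone pair when both of its sides fall in a region kind that is being filtered. The tuple layout, the `filter_pairs` name, and the "both sides" rule are assumptions.

```python
from typing import List, Set, Tuple

# Region kinds to filter; the names come from the slide, the semantics are assumed.
FILTERED_KINDS: Set[str] = {"structural", "prototype"}

# (pair id, region kind of side A, region kind of side B)
ClonePair = Tuple[str, str, str]


def filter_pairs(pairs: List[ClonePair],
                 filtered: Set[str] = FILTERED_KINDS) -> List[ClonePair]:
    """Keep a pair unless both of its sides lie in a filtered region kind."""
    return [p for p in pairs if not (p[1] in filtered and p[2] in filtered)]


# Toy data, invented for illustration only.
pairs = [
    ("p1", "function", "function"),
    ("p2", "prototype", "prototype"),
    ("p3", "structural", "prototype"),
    ("p4", "prototype", "function"),
]
kept = filter_pairs(pairs)
print([p[0] for p in kept])  # ['p1', 'p4']
```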
19
Overall cloning in the systems
  Function Clones dominate the Same Directory Clones.
  Most cloning occurs within the same directory.
20
Frequency of clone types
  Very few loop clones
  Relatively many conditional clones
  Function clones made up 38% of the clone pairs in the Linux fs and 53% of the clone pairs in PostgreSQL
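
Percentages like these follow directly from counting classified pairs per type. A minimal sketch, using invented labels rather than the study's data (the `type_shares` helper is hypothetical):

```python
from collections import Counter
from typing import Dict, List


def type_shares(labels: List[str]) -> Dict[str, float]:
    """Return each clone type's share of all clone pairs, in percent."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Toy labels, invented for illustration; not the results from the case studies.
labels = ["function"] * 5 + ["conditional"] * 3 + ["loop"] + ["unclassified"]
shares = type_shares(labels)
print(shares["function"])  # 50.0
```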
21
  Is it possible to insert a table here with the results, even if it is partial (to show that the work is there and that there are numbers)?
  Or maybe a graph? Nice to have this to imply: here’s all the hard work we did, boy did we sweat, and there are so many results that the observations are probably meaningful
22
Overview
  Motivation
  Background
  Methods
  Case Studies
  Discussion
  Summary
23
Cloning comprehension
  Classification of clones can improve comprehension
    – The user will have a working understanding of what a clone of a certain type means
    – We believe navigation of the “clone space” will be greatly improved
    – We now know more about cloning as it occurs in a software system
    – Simple metrics are now available
24
Tool support
  Clone Interpretation and Classification System (CICS)
    – Provides a GUI to navigate classified clones
    – Will provide benchmarking support for clone detection tools
    – Many features can be added to complement the sorting of clones in the taxonomy
25
CICS
26
Overview
  Motivation
  Background
  Methods
  Case Studies
  Discussion
  Summary
27
Summary
  Management of clones is important for the healthy evolution of a software system
  We can make the process of managing clones more comprehensible, scalable, and accurate
28
Future work
  Deeper classification
  Benchmark suite
  IDE plugins
  Evolution of clones