1
Finding File Clones in FreeBSD Ports Collection
Yusuke Sasaki, Tetsuo Yamamoto, Yasuhiro Hayase, Katsuro Inoue
2
File Clones
File clones: two or more files with the same content
* Comments and code indentation are ignored
* Found inside a project or between different projects
Research about file clones is scarce
* Goal: get new knowledge about file clones
Example: Project A and Project B both contain a file with the content
int main() { printf("Hello msr!"); return 0; }
3
FCFinder
Input
* .c and .h files
Output
* File-clone sets
Detection
* Tokenization
* MD5 hash calculation
* Exact matching
Faster than other tools (tool: speed, speedup, machines)
* CCFinder: 1.4M files / 960 hours, x1, 1 PC
* D-CCFinder: 1.4M files / 51 hours, x19, 80 PCs
* FCFinder: 1.4M files / ... hours, x55
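The three detection steps (tokenization, MD5 hash calculation, exact matching) can be illustrated with a minimal Python sketch. This is not FCFinder's implementation: the regex-based lexer, the function names `tokenize`, `fingerprint`, and `find_clone_sets`, and the NUL token separator are all assumptions made for illustration, and the lexer ignores preprocessor details that a real C tokenizer would handle.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical, simplified C lexer (an assumption, not FCFinder's tokenizer).
# Comments are matched first so they can be skipped; string and character
# literals are matched as single tokens so "//" inside a string is preserved.
_LEX = re.compile(
    r"""(?P<comment>/\*.*?\*/|//[^\n]*)"""
    r"""|(?P<token>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|\w+|[^\w\s])""",
    re.DOTALL,
)


def tokenize(source):
    """Return the token list; comments, whitespace, and indentation are dropped."""
    return [m.group("token") for m in _LEX.finditer(source)
            if m.lastgroup == "token"]


def fingerprint(source):
    """MD5 digest of the token stream (tokens joined by a NUL separator)."""
    joined = "\0".join(tokenize(source))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def find_clone_sets(files):
    """Group paths whose digests match exactly; keep only groups of 2 or more.

    `files` maps a path to the file's source text.
    """
    groups = defaultdict(list)
    for path, source in files.items():
        groups[fingerprint(source)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Because each file is reduced to one digest and digests are compared only for equality, the grouping step is a single pass over the files. Calling `find_clone_sets` with a dict of path-to-source entries returns one sorted list of paths per file-clone set.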
4
Experiment
Target: only .c and .h files in the FreeBSD Ports Collection
* ~1.4M files
* ~12 GB
* 17.16 hours
We measured:
* File size
* Number of files in each project
* Size of each file-clone set
* Number of file-clones in a project
These values follow the power law
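The slides do not say how the power-law claim was checked. One common, rough diagnostic is a straight-line least-squares fit on log-log axes, with R^2 as the goodness-of-fit measure. The sketch below assumes that approach; `fit_power_law` is a hypothetical helper, not part of the authors' tooling.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**k by least squares on log(y) = log(c) + k*log(x).

    Returns (c, k, r_squared). All xs and ys must be positive.
    This is a rough diagnostic, not a rigorous power-law test.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    k = sxy / sxx                       # slope = power-law exponent
    c = math.exp(mean_y - k * mean_x)   # intercept back on the linear scale
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return c, k, r_squared
```

Here `xs` could be, for example, file-clone set sizes and `ys` the number of sets of each size. An R^2 close to 1 on the log-log scale is consistent with a power law but does not prove one.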
5
File-clone Set Size
[Figure: distribution of file clone set sizes. x-axis: file clone set size (log scale, ticks at 5, 10, 50, 100). Fit quality reported as R^2.]
* D, left: 650 sets of 61 file clones, used in PHP5
* D, right: 500 sets of 59 file clones, used in PHP4
* E: 419 sets of 120 file clones, used in both PHP4 and PHP5
6
File-clones per Project
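[Figure: distribution of the number of file clone sets per project. Axis: number of file clone sets (log scale, ticks from 5 to 10K). Fit quality reported as R^2.]
* G, left: PHP5 modules
* G, center: projects related to bin-utils
* G, right: PHP4 modules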
7
File-clones Between Projects (1/3)
* Nodes show the projects
* Edges between projects show the number of file clones between two projects
Ex) gcc41 and gfortran share 7691 file clones
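A graph like this can be derived from file-clone sets by counting, for each pair of projects, how many clone sets span both. The sketch below is an illustration under two assumptions that the slides do not confirm: the first path component names the project, and an edge weight is the number of clone sets with files in both projects. `project_of` and `project_edges` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def project_of(path):
    """Hypothetical convention: the first path component is the project name."""
    return path.split("/", 1)[0]


def project_edges(clone_sets):
    """Return a Counter mapping (project_a, project_b) to an edge weight.

    Assumed weight: the number of clone sets containing files from both
    projects. Pairs are stored with names in sorted order, so each
    undirected edge appears once.
    """
    edges = Counter()
    for paths in clone_sets:
        projects = sorted({project_of(p) for p in paths})
        for a, b in combinations(projects, 2):
            edges[(a, b)] += 1
    return edges
```

Clone sets that lie entirely inside one project contribute no edge, which matches the idea that edges describe sharing between projects.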
8
File-clones Between Projects (2/3)
* Nodes show the projects
* Edges between projects show the number of file clones between two projects
9
File-clones Between Projects (3/3)
* Nodes show the projects
* Edges between projects show the number of file clones between two projects
10
Conclusions & Future Work
Conclusions
* Measured several features of the FreeBSD Ports Collection
* Found that the measured features follow the power law
Future Work
* Investigate logical coupling between projects