Where Does This Code Come from and Where Does It Go? - Integrated Code History Tracker for Open Source Systems
Katsuro Inoue, Yusuke Sasaki, Pei Xia, and Yuki Manabe
Osaka University, Osaka, Japan
{inoue, peixia, y-manabe}@ist.osaka-u.ac.jp
2012 34th International Conference on Software Engineering (ICSE)
Outline
Introduction
Approach
Processes
Experiment
Summary
Future work
Introduction
To reuse an open source code file:
We do not know much about the original project
Can we reuse it safely and effectively?
How do we decide whether to reuse it?
Approach
Processes
Experiment - Texture.java
A 1,600 LOC Java file
Defines a graphic texture object
Developed by jMonkeyEngine
Popularly used by many 3D games
Input Query Q
qc: overall source code
qa: the file name "texture.java"
Experiment - Texture.java
The Texture.java code evolves as the project progresses
Each version of Texture.java is copied to many other projects, which are identified as similar files in Clusters A, B, and C
In Cluster C, there are 6 files exactly the same as the query code
One outlier: project #25
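The "exactly the same as the query code" case can be illustrated by comparing content digests. The following is only a minimal sketch and not the tracker's own method; the function names and the line-ending normalization are assumptions made for illustration.

```python
import hashlib


def content_digest(text):
    """Return a SHA-256 digest of the text after normalizing line endings.

    Normalizing CRLF to LF is an assumption of this sketch, so that a file
    checked out on another platform still counts as an exact copy.
    """
    normalized = text.replace("\r\n", "\n")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def exact_copies(query_text, candidates):
    """Return the sorted names of candidate files identical to the query.

    candidates is a hypothetical mapping from file name to file content.
    """
    target = content_digest(query_text)
    return sorted(
        name for name, text in candidates.items()
        if content_digest(text) == target
    )
```

A digest comparison only finds verbatim copies; near-copies such as those in the other clusters need a similarity measure instead.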
Experiment – kern_malloc.c
A C function
Allocates a memory block of a specified size in the kernel
Fairly old
Taken over and maintained by many other projects
Input Query Q
qc: overall source code
qa: the file name "kern_malloc.c"
No SPARS/R
Experiment – kern_malloc.c
The cover ratio of the output results diverges along the time scale
No clear cluster of similar results
Many variations of different code fragments
Many small changes among the projects
All of these results are under the BSD License
The results give an overview of the evolution of a core part of the Unix OS kernel code
Experiment - SSHTools
SSHTools
A suite of Java SSH applications
Provides a Java SSH API, terminal
Input Query Q
qc: 339 files of the latest version 0.2.9
qa: file names
Some tiny files are ignored
Threshold on the cover ratio: 0.4
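Filtering results by a cover ratio threshold can be sketched as follows. The slides do not define the cover ratio, so this sketch assumes a simple line-based definition (the fraction of non-blank query lines that also occur in a candidate file); the tool's actual measure may differ, and all names here are hypothetical.

```python
def cover_ratio(query_lines, candidate_lines):
    """Fraction of non-blank query lines that also occur in the candidate.

    This line-based definition is an assumption of the sketch, not the
    definition used by the tool.
    """
    query = [line.strip() for line in query_lines if line.strip()]
    if not query:
        return 0.0
    candidate = {line.strip() for line in candidate_lines if line.strip()}
    covered = sum(1 for line in query if line in candidate)
    return covered / len(query)


def filter_by_cover_ratio(query_lines, candidates, threshold=0.4):
    """Keep candidates whose cover ratio reaches the threshold.

    candidates is a hypothetical mapping from file name to its lines.
    Returns (name, ratio) pairs, highest ratio first.
    """
    scored = [
        (name, cover_ratio(query_lines, lines))
        for name, lines in candidates.items()
    ]
    kept = [(name, ratio) for name, ratio in scored if ratio >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

The default of 0.4 mirrors the threshold stated on the slide; a lower value would admit more distant relatives at the cost of more noise.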
Experiment - SSHTools
Many different ancestor projects
SSHTools is a collection of various tools
Donated by different projects
Their licenses and copyrights had been modified
Future work
Improve the performance and usability
Explore a unified approach to local repositories and Internet repositories
Use the search results as new search queries
Track the resulting code chain
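The idea of feeding search results back in as new queries can be pictured as a breadth-first expansion. This is a minimal sketch under stated assumptions, not a design from the authors: `search` is a hypothetical callable that maps a file identifier to the identifiers of similar files, and `max_depth` is an assumed cutoff to keep the chain finite.

```python
from collections import deque


def track_code_chain(seed, search, max_depth=3):
    """Expand a code chain by reusing each search result as a new query.

    seed      -- identifier of the starting file
    search    -- hypothetical callable: identifier -> iterable of similar identifiers
    max_depth -- assumed cutoff on how many hops away from the seed to query

    Returns (depth_of, edges): the hop count at which each file was first
    reached, and the (query, result) pairs observed along the way.
    """
    depth_of = {seed: 0}
    edges = []
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if depth_of[current] >= max_depth:
            continue  # do not query files at the depth limit
        for hit in search(current):
            edges.append((current, hit))
            if hit not in depth_of:  # each file is queried at most once
                depth_of[hit] = depth_of[current] + 1
                queue.append(hit)
    return depth_of, edges
```

Recording each file only once keeps cyclic copy relations from looping forever.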
Thanks! Q&A