Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University
Measuring Copying of Java Archives
Tetsuya Kanda (1), Daniel M. German (2, 1), Takashi Ishio (1), Katsuro Inoue (1)
(1) Osaka University, Japan
(2) University of Victoria, Canada

Reusing a library
- Reuse existing libraries by copying them into the software development project
- Black-box reuse
[Figure: THE USEFUL LIBRARY is copied into a project]

Library in Java
- JAR files (Java archive files) are built on the ZIP file format
- A jar file can contain another jar file inside
[Figure: THE USEFUL LIBRARY and the jar files in the library]
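Because a jar file is a ZIP archive, nesting can be inspected with any ZIP reader. The following is a minimal Python sketch of that idea, not the authors' tool. The function names `list_inner_jars` and `make_jar` and the in-memory sample archives are assumptions made for illustration, and the slides do not say whether deeper nesting levels were counted, so the recursion is also an assumption.

```python
import io
import zipfile


def list_inner_jars(jar_bytes, prefix=""):
    """Return the paths of all jar entries nested inside a jar, recursively."""
    found = []
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".jar"):
                path = prefix + name
                found.append(path)
                # An inner jar is itself a ZIP archive, so descend into it.
                found.extend(list_inner_jars(archive.read(name), path + "!/"))
    return found


def make_jar(entries):
    """Build a small in-memory jar (ZIP) from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Hypothetical sample: A.jar contains lib/B.jar, which contains C.jar.
c_jar = make_jar({"C.class": b"c"})
b_jar = make_jar({"C.jar": c_jar})
a_jar = make_jar({"A.class": b"a", "lib/B.jar": b_jar})
inner = list_inner_jars(a_jar)
print(inner)
```

An entry that is named `.jar` but is not a valid ZIP archive would raise `zipfile.BadZipFile` here; a scan over a real repository would probably need to catch that case.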

Duplication of jar files
- Since a jar file can contain another jar file, the same jar file can end up duplicated
- Jar files inside another jar file might cause further duplication
[Figure: Copy]

Question
- How many jar files in a large software repository contain jar files inside?
- Is there any duplication among those inner jar files?
[Figure: jar files in the library]

Definition: Top-level jar file
- A jar file found in the repository
  – A component ready to be reused
[Figure: Top-level jar]

Definition: Inner jar file
- A jar file that is included in another jar file
[Figure: A.jar (top-level jar) and the inner jar files of A.jar]

The experiment
- Objective:
  – Find how many top-level jar files contain duplicate inner jar files
- Target:
  – Maven Central repository
    - Default repository of Apache Maven
    - Contains many popular libraries and projects

Counting inner jar files
- 599,498 top-level jar files in the Maven Central repository (without duplications)
- 4,747 of them contain jar files inside

Number of inner jar files per top-level jar file:
Max      282
Average  13.1
Median   2
Min      1 (in 1,833 of the top-level jar files)
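The summary figures above (max, average, median, min, and the number of files at the minimum) can be computed from a list of per-file counts. This is a small Python sketch under assumptions: the helper name `summarize` is hypothetical, and the `counts` list is made-up sample data, not the study's data.

```python
from statistics import mean, median


def summarize(counts):
    """Summarize the number of inner jar files per top-level jar file."""
    smallest = min(counts)
    return {
        "max": max(counts),
        "average": round(mean(counts), 1),
        "median": median(counts),
        "min": smallest,
        # How many top-level jar files have the minimum number of inner jars.
        "files_at_min": counts.count(smallest),
    }


# Made-up sample: most files hold one or two inner jars, one holds many.
counts = [1, 1, 2, 3, 40]
summary = summarize(counts)
print(summary)
```

In this sample a single large file pulls the average (9.4) well above the median (2). The reported average of 13.1 against a median of 2 suggests a similarly skewed distribution.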

Reused jar files
- 118,361 different inner jar files are contained in other jar files
- 89,054 of them are also found as top-level jar files in the Maven Central repository
  – There is a possibility of causing further duplication in software projects
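One way to check whether an inner jar file also exists as a top-level jar file is to compare content hashes. The Python sketch below assumes that matching is done by hash, which this slide does not state. The choice of SHA-1, the helper names, and the sample data are all hypothetical.

```python
import hashlib


def content_hash(data):
    """Hash the raw bytes of a jar file (SHA-1 is an assumed choice)."""
    return hashlib.sha1(data).hexdigest()


def inner_also_top_level(inner_jars, top_level_jars):
    """Return names of inner jars whose content matches some top-level jar.

    Both arguments map a file name to the file's bytes.
    """
    top_hashes = {content_hash(data) for data in top_level_jars.values()}
    return sorted(
        name for name, data in inner_jars.items()
        if content_hash(data) in top_hashes
    )


# Made-up sample data: one inner jar is also published as a top-level jar.
top_level = {"libA-1.0.jar": b"A1", "app.jar": b"APP"}
inner = {"libA-1.0.jar": b"A1", "private.jar": b"P"}
shared = inner_also_top_level(inner, top_level)
print(shared)
```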

Duplication of inner jar files
[Table: #files for the columns Top-level, Contains inner jar, and Having duplication (Total, Same, Different, Both); the values are not preserved in this transcript]
- The same version: having the same file name and the same file hash of the contents
  – Example: libA-1.0.jar (hash:3bf7) and libA-1.0.jar (hash:3bf7)
- The different versions: having the same file name with the exception of version names
  – Example: libB-1.0.jar and libB-1.2.jar

Duplication of inner jar files (continued)
- Same: contain the same version of the same library (figure: Ver.1)
- Different: contain the different versions of the same library (figure: Ver.1, Ver.2)
- Both: contain both the same version and the different versions of the same library (figure: Ver.1, Ver.2, Ver.1)
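The Same / Different / Both categories can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation. It assumes file names follow a `<library>-<version>.jar` pattern, and the regular expression, the function name `classify_duplication`, and the sample hashes other than `3bf7` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <library>-<version>.jar, e.g. libA-1.0.jar.
NAME_PATTERN = re.compile(r"^(?P<library>.+?)-(?P<version>\d[\w.]*)\.jar$")


def classify_duplication(inner_jars):
    """Classify one top-level jar from its (path, content_hash) inner jars.

    Returns "same", "different", "both", or None when nothing is duplicated.
    """
    copies = defaultdict(list)  # library name -> list of (version, hash)
    for path, digest in inner_jars:
        file_name = path.rsplit("/", 1)[-1]
        match = NAME_PATTERN.match(file_name)
        if match:
            copies[match["library"]].append((match["version"], digest))

    has_same = has_different = False
    for entries in copies.values():
        # Same version: same file name and same hash occurring more than once.
        if len(entries) != len(set(entries)):
            has_same = True
        # Different versions: same library name with more than one version.
        if len({version for version, _ in entries}) > 1:
            has_different = True

    if has_same and has_different:
        return "both"
    if has_same:
        return "same"
    if has_different:
        return "different"
    return None


same_case = [("lib/libA-1.0.jar", "3bf7"), ("ext/libA-1.0.jar", "3bf7")]
different_case = [("libB-1.0.jar", "aa01"), ("libB-1.2.jar", "bb02")]
print(classify_duplication(same_case))
print(classify_duplication(different_case))
print(classify_duplication(same_case + different_case))
```

Files that share a name and version but differ in hash fall into neither category in this sketch. The slides do not say how such cases were treated.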

Concluding remarks
- About 5,000 jar files in the Maven Central repository contain other jar files
- About 470 of them contain duplicate libraries
- Most inner jar files are also found as Maven components
  – There is still a possibility of further duplication

Future work
- Find duplications of jar files and class files in distributed software applications
  – Eclipse, JBoss, …
- Analyze the behavior of software that contains duplicated libraries
  – Understand the impact of duplication
