Yuhao Wu1, Yuki Manabe2, Daniel M. German3, Katsuro Inoue1

Presentation transcript:

How Are Developers Treating License Inconsistency Issues? A Case Study on License Inconsistency Evolution in FOSS Projects
Yuhao Wu1, Yuki Manabe2, Daniel M. German3, Katsuro Inoue1
1Osaka University, Japan 2Kumamoto University, Japan 3University of Victoria, Canada

Oracle v. Google
Never remove the copyright notice!!
- Oracle sued Google for copyright and patent infringement in Aug. 2010 ($9 billion claimed)
- Google won the case in May 2016 (fair use)
- Google re-implemented 37 Java APIs and "copied" 9 lines of code:

    private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                ") > toIndex(" + toIndex + ")");
        if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
    }

- Google removed the code in a new version of Android

Open Source Software License

GPLv2 (taken from OpenJDK):
 * Copyright (c) 1997, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this […]

MIT License boilerplate:
Copyright <YEAR> <COPYRIGHT HOLDER>
Permission is hereby granted, free of charge, to any person obtaining a copy […]
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

Prior Study [1]: License Inconsistency
File 1 (GPL) and File 2 (Apache) contain the same code.
License inconsistency indicates potential copyright infringement.
[1] Yuhao Wu, Yuki Manabe, Tetsuya Kanda, Daniel M. German, Katsuro Inoue: "A Method to Detect License Inconsistencies in Large-Scale Open Source Projects", Proceedings of the 12th Working Conference on Mining Software Repositories (MSR 2015), pp. 324-333, Florence, Italy, May 2015.

How to detect license inconsistency
Source code available at: https://github.com/wyhfrank/LIFinder
1. Start from a collection of projects (Project_1, Project_2, Project_3, …)
2. Group files by their normalized tokens using CCFinderX [2] (Group_1, Group_2, …)
3. Identify the license of each file in each group using Ninka [3]
4. Calculate metrics for the groups that contain license inconsistencies (columns: GroupID, #Licenses, #None, #Unknown; example row: 1, 2, 5, …)
[2] T. Kamiya, S. Kusumoto, and K. Inoue, “CCFinder: A multilinguistic token-based code clone detection system for large scale source code,” IEEE Transactions on Software Engineering, vol. 28, no. 7, pp. 654–670, 2002.
[3] D. M. German, Y. Manabe, and K. Inoue, “A sentence-matching method for automatic license identification of source code files,” in Proceedings of the 25th International Conference on Automated Software Engineering (ASE2010), 2010, pp. 437–446.
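The detection steps can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the LIFinder implementation: `normalize` is a crude regex stand-in for CCFinderX token normalization, `identify_license` is a keyword-matching stand-in for Ninka (here "None" simply means the file has no comments and "Unknown" means it has comments with no recognized license), and every function name and sample file is hypothetical.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical stand-in for Ninka: keyword patterns for three licenses.
LICENSE_PATTERNS = [
    ("GPL-2.0", re.compile(r"GNU General Public License version 2", re.I)),
    ("Apache-2.0", re.compile(r"Apache License, Version 2\.0", re.I)),
    ("MIT", re.compile(r"Permission is hereby granted, free of charge", re.I)),
]

# C-style block and line comments (string literals are not handled).
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.S)


def identify_license(source):
    """Return a license name, "None" (no comments) or "Unknown"."""
    comments = " ".join(COMMENT_RE.findall(source))
    comments = re.sub(r"[\s*/]+", " ", comments).strip()
    if not comments:
        return "None"
    for name, pattern in LICENSE_PATTERNS:
        if pattern.search(comments):
            return name
    return "Unknown"


def normalize(source):
    """Crude stand-in for CCFinderX: drop comments and whitespace."""
    code = COMMENT_RE.sub(" ", source)
    return " ".join(re.findall(r"[A-Za-z_]\w*|\d+|\S", code))


def group_files(files):
    """Group file paths whose normalized token streams are identical."""
    groups = defaultdict(list)
    for path, source in files.items():
        key = hashlib.sha1(normalize(source).encode("utf-8")).hexdigest()
        groups[key].append(path)
    return list(groups.values())


def inconsistency_metrics(files):
    """Report metrics for groups whose files disagree on the license."""
    rows = []
    for group_id, paths in enumerate(group_files(files), start=1):
        licenses = [identify_license(files[p]) for p in paths]
        if len(set(licenses)) < 2:
            continue  # all files agree: no inconsistency
        named = {l for l in licenses if l not in ("None", "Unknown")}
        rows.append({
            "group": group_id,
            "files": sorted(paths),
            "licenses": len(named),
            "none": licenses.count("None"),
            "unknown": licenses.count("Unknown"),
        })
    return rows


# Hypothetical sample: three copies of the same function, three headers.
SAMPLE = {
    "p1/a.c": "/* GNU General Public License version 2 */\n"
              "int add(int a, int b) { return a + b; }\n",
    "p2/a.c": "/* Licensed under the Apache License, Version 2.0 */\n"
              "int add(int a,int b){return a+b;}\n",
    "p3/a.c": "int add(int a, int b) { return a + b; }\n",
    "p3/b.c": "/* Permission is hereby granted, free of charge */\n"
              "int sub(int a, int b) { return a - b; }\n",
}

rows = inconsistency_metrics(SAMPLE)
```

In this sample the three `add` files fall into one group with two distinct licenses and one unlicensed copy, while `p3/b.c` sits alone in its group and is not reported.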

Empirical Study Setup
Goal: To understand how license inconsistencies evolve in large OSS systems
Dataset: Debian 7.5 and 8.2

Number of          Debian 7.5    Debian 8.2
Source packages        17,160        20,577
Total files         6,136,637    13,124,700
.c files              472,861       767,006
.cpp files            224,267       335,269
.java files           365,213       447,154

Results - Overview
Number of license inconsistency groups detected in the two versions.

Number of groups         Debian 7.5    Debian 8.2
Total                          6763          7009
Intersection*                  4062          4062
Relative complement**          2701          2947

* Groups reported in both versions.
** Groups reported in one version but not the other.

Why did these groups disappear in the new version? Why did new groups appear?
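The three rows are related by simple set arithmetic: each relative complement is that version's total minus the intersection. A minimal sketch, assuming every group can be identified by a stable key across versions (the slides do not say how groups are matched, and the integer IDs below are synthetic placeholders chosen only to reproduce the reported counts):

```python
def compare_versions(groups_old, groups_new):
    """Count groups shared by both versions and unique to each one."""
    return {
        "intersection": len(groups_old & groups_new),
        "only_old": len(groups_old - groups_new),  # disappeared
        "only_new": len(groups_new - groups_old),  # newly appeared
    }


# Synthetic IDs: 6763 groups in Debian 7.5, 7009 in Debian 8.2.
debian_75 = set(range(0, 6763))
debian_82 = set(range(2701, 2701 + 7009))

result = compare_versions(debian_75, debian_82)
```

With these placeholder sets, `result` reproduces the table: 4062 shared groups, 2701 found only in Debian 7.5, and 2947 found only in Debian 8.2.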

Distribution Latency - disappear
[Timeline figure: package releases, with the license upgrade from GPL-2.0 to GPL-3.0 and the reuse relationship marked]
kbuild: … r2543 (2011.8.17), r2695 (2013.7.26) …
make: … 3.81 (2006.4.1), 3.82 (2010.7.28), 4.0 (2013.10.9) …
remake: … 3.81 (2009.1.10), 3.82 (2015.4.6) …
Debian: 7.5 (2014.04.26), 8.2 (2015.09.05)
Debian 7.5 uses a very old version of make.

Copy and Own - persist
[Timeline figure: package releases, with the license change (MIT, Apache-2.0) and the reuse relationship marked]
EasyMock: … 2.4, 2.5.1, 3.2 … (dates shown: 2009.12.17, 2013.7.11)
Mockito: … 1.9.0, 1.9.5 … (dates shown: 2007.11.17, 2011.12.17, 2012.6.4)
Debian: 7.5 (2014.04.26), 8.2 (2015.09.05)
Developers decided to cut the dependency on the upstream project.

Results - Summary
How do license inconsistencies evolve, and what are the underlying reasons?
- Distribution latency → Temporary
- Internal → Permanent
- Copy and own
  - With code modification → Permanent
  - Without code modification → Temporary
How long a license inconsistency persists depends on the reason that caused it.

Conclusion
Contribution
- Conducted an analysis of license inconsistency evolution
- License inconsistencies evolve differently depending on their cause
- License inconsistencies are properly handled in Debian
Future work
- Analyze more projects for other evolution patterns and reasons