Empirical Studies on License Compliance and Copyright Inconsistency Risks in Open Source Software Shi QIU.


Introduction
Open source license: An open source license describes the terms and conditions under which OSS is used, modified, and shared.
Copyright: Software copyright is a special case of copyright that prevents the unauthorized copying of software.

Definition
The situation in which the license of an OSS package is not compatible with the license of its dependency [1].
Copyleft licenses: e.g. GPL-2.0, GPL-3.0, LGPL-2.1, etc.
Example (figure): Package A (MIT License) and Package B (GPL-2.0 License). Package B's license forces Package A to be under GPL-2.0 as well.
[1] Daniel German and Massimiliano Di Penta. A method for open source license compliance of Java applications. IEEE Software, Vol. 29, No. 3, pp. 58–63, 2012.

Problems
OSS ecosystems consist of software projects that are developed and evolve together in a shared environment.
1. Direct risk (figure): Package6 (version 1.0.1, MIT) and Package2 (version 1.0.4, GPL-2.0)
2. Indirect risk (figure): Package3 (version 1.0.1, MIT), Package4 (version 2.0.1, MIT), and Package5 (version 1.2.1, GPL-3.0)
3. Self risk (figure): Package6 (version 1.0.2, MIT) containing File1 (GPL-2.0) and File2 (MIT)

Research Questions
RQ1: What is the proportion of packages with license compliance risk?
RQ2: Is the reuse of packages licensed under copyleft licenses more likely to cause license compliance risk?
RQ3: Does transitive dependency have an impact on the occurrence of license compliance risk?
RQ4: What are the characteristics of license compliance risk at the file level?
Data collection

Method
1. Build the license dictionary
Variant names (GPL-2, GPLv2, GPL 2, GNU GPL-2.0, GPL version 2, …) are mapped to a single identifier (GPL-2.0).
2. Build the software evolutionary dataset
Name: package7
Version    License    Dependency (version)
1.0.1      MIT        package8 (1.0.1), package9 (2.3.1)
1.0.2                 package8 (1.0.2)
1.1.0      GPL-2.0    package9 (2.4.0), package10 (1.0.1)
…
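The two steps on this slide can be sketched in Python. This is a minimal illustration, not the author's implementation: the names `LICENSE_DICTIONARY`, `normalize_license`, and `EVOLUTION` are hypothetical, the lookup by lower-cased string is an assumption, and the license of version 1.0.2 is left as `None` because the slide does not show it.

```python
import re

# Hypothetical license dictionary: each variant spelling listed on the slide
# maps to one canonical identifier. Keys are lower-cased for lookup.
LICENSE_DICTIONARY = {
    "gpl-2": "GPL-2.0",
    "gplv2": "GPL-2.0",
    "gpl 2": "GPL-2.0",
    "gnu gpl-2.0": "GPL-2.0",
    "gpl version 2": "GPL-2.0",
    "gpl-2.0": "GPL-2.0",
    "mit": "MIT",
}


def normalize_license(raw):
    """Return the canonical identifier for a raw license string, or None if unknown."""
    # Trim, lower-case, and collapse runs of whitespace before the lookup.
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return LICENSE_DICTIONARY.get(key)


# Hypothetical layout of the evolutionary dataset: package -> version -> record.
# The values mirror the package7 table on the slide.
EVOLUTION = {
    "package7": {
        "1.0.1": {
            "license": normalize_license("MIT"),
            "dependencies": {"package8": "1.0.1", "package9": "2.3.1"},
        },
        "1.0.2": {
            "license": None,  # not shown on the slide
            "dependencies": {"package8": "1.0.2"},
        },
        "1.1.0": {
            "license": normalize_license("GPLv2"),
            "dependencies": {"package9": "2.4.0", "package10": "1.0.1"},
        },
    }
}
```

Keeping one record per version makes it possible to see when a package changes its license or its dependencies between releases, as package7 does between 1.0.1 and 1.1.0.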

Method
3. Build the license compatibility dataset
19 popular licenses: MIT, GPL-2.0, Apache-2.0, … [2]
Example pair (figure): Package1 (version 1.0.1, MIT) and Package2 (version 1.0.4, GPL-2.0)
[2] https://www.dwheeler.com/essays/floss-license-slide.html
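One possible shape for such a dataset is a set of allowed (package license, dependency license) pairs. The sketch below is an assumption about the representation, and it lists only a handful of illustrative pairs rather than the 19 licenses the slide refers to; `COMPATIBLE` and `is_compatible` are hypothetical names.

```python
# Illustrative subset only. A pair (a, b) means: a package released under
# license `a` may depend on a package released under license `b`.
COMPATIBLE = {
    ("MIT", "MIT"),
    ("GPL-2.0", "MIT"),
    ("GPL-2.0", "GPL-2.0"),
    ("GPL-3.0", "MIT"),
    ("GPL-3.0", "GPL-3.0"),
}


def is_compatible(package_license, dependency_license):
    """Look up whether the dependency's license is allowed for the package's license."""
    # Pairs that are not listed are treated as incompatible (a conservative choice).
    return (package_license, dependency_license) in COMPATIBLE
```

The relation is directional: `is_compatible("GPL-2.0", "MIT")` holds in this subset, while `is_compatible("MIT", "GPL-2.0")` does not, which matches the Package1 and Package2 pair on the slide.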

Method
4. Detect direct and indirect risk
Inputs: software evolutionary dataset, license compatibility dataset
Example (figure): Package1 (version 1.0.1, MIT), Package2 (version 1.2.1, MIT), Package3 (version 2.0.1, GPL-2.0), Package4 (version 1.2.3, GPL-3.0)
Report
Name: Package1
License: MIT
-------------------------------------------
Direct risks: Package4 (GPL-3.0)
Indirect risks: Package3 (GPL-2.0)
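The detection step can be sketched as a breadth-first walk over the dependency graph. This is an illustration under stated assumptions, not the author's tool: the dependency edges (Package1 depends on Package2 and Package4, Package2 depends on Package3) are inferred from the report on the slide, versions are ignored for brevity, every dependency is compared against the license of the root package, and all names are hypothetical.

```python
from collections import deque

# Licenses as shown on the slide.
LICENSES = {
    "Package1": "MIT",
    "Package2": "MIT",
    "Package3": "GPL-2.0",
    "Package4": "GPL-3.0",
}

# Assumed dependency edges (not spelled out in the transcript).
DEPENDENCIES = {
    "Package1": ["Package2", "Package4"],
    "Package2": ["Package3"],
    "Package3": [],
    "Package4": [],
}

# Illustrative compatibility pairs: (package license, dependency license).
COMPATIBLE = {("MIT", "MIT")}


def detect_risks(root):
    """Classify incompatible dependencies of `root` as direct (depth 1) or indirect (depth > 1)."""
    direct, indirect = [], []
    seen = {root}
    queue = deque((dep, 1) for dep in DEPENDENCIES.get(root, []))
    while queue:
        name, depth = queue.popleft()
        if name in seen:
            continue  # already visited via a shorter path
        seen.add(name)
        if (LICENSES[root], LICENSES[name]) not in COMPATIBLE:
            (direct if depth == 1 else indirect).append(name)
        queue.extend((dep, depth + 1) for dep in DEPENDENCIES.get(name, []))
    return {"direct": direct, "indirect": indirect}


report = detect_risks("Package1")
print("Direct risks:", report["direct"])
print("Indirect risks:", report["indirect"])
```

Breadth-first order visits each dependency at its shallowest depth first, so a package reachable both directly and transitively is reported as a direct risk.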

Method
5. Detect self risk
Input: license compatibility dataset
Example (figure): Package1 (version 1.0.2, MIT) containing File1 (GPL-2.0) and File2 (MIT)
Report
Name: Package1
License: MIT
-------------------------------------------
Self risks: File1 (GPL-2.0)
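Self-risk detection applies the same compatibility lookup to the files inside one package. The sketch below assumes the per-file licenses have already been extracted (the transcript does not say how) and uses hypothetical names throughout.

```python
# Illustrative compatibility pairs: (package license, file license).
COMPATIBLE = {("MIT", "MIT")}


def detect_self_risks(package_license, file_licenses):
    """Return the files whose license is not compatible with the package's declared license."""
    return sorted(
        path
        for path, file_license in file_licenses.items()
        if (package_license, file_license) not in COMPATIBLE
    )


# Package1 (MIT) from the slide, with File1 under GPL-2.0 and File2 under MIT.
risky_files = detect_self_risks("MIT", {"File1": "GPL-2.0", "File2": "MIT"})
print("Self risks:", risky_files)
```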

Proportion of Risky Packages
RQ1: What is the proportion of packages with license compliance risk?
Result: Of 419,708 packages, 2,704 are detected as having direct or indirect dependency risk. The proportion is only 0.644%. We define these packages as risky packages.
Answer: Very few packages in npm have license compliance risk.

An Example
A real example of a risky package
Dependency figure: cstar (MIT), commander (GPL-2.0), graceful-readlink (MIT), mucbuc-filebase (ISC), walk-json (MIT), travejs (GPL-2.0), inject-json (MIT)
The commander and travejs packages are not compatible with the cstar package.

Risk of Copyleft License
RQ2: Is the reuse of packages licensed under copyleft licenses more likely to cause license compliance risk?
Result: In npm, 4,067 packages include at least one package licensed under the selected copyleft licenses in their dependency chain. Among them, 2,704 packages are detected as risky packages. The proportion is 66.49%.
Answer: Yes, reuse of packages licensed under copyleft licenses is more likely to cause license compliance risk.

Impact of Transitive Dependency
RQ3: Does transitive dependency have an impact on the occurrence of license compliance risk?
Result (figure): Direct Dependency, Indirect Dependency
Answer: Yes, it does. Direct or indirect dependency risk tends to occur at shallow levels of transitive dependency.

Self Risk
RQ4: What are the characteristics of license compliance risk at the file level?
Result: Of the 2,704 risky packages, 964 are detected as having self risk as well. The proportion is 35.65% (964 / 2,704). Of the 9,679,468 source code files in the 2,704 risky packages, only 291,340 files are detected as causing compliance risk. The proportion is 3.01%.
Answer: Packages with direct or indirect dependency risk have a high possibility of having self risk as well. The source code files causing compliance risk make up only a small part of all source code files of a package.

Conclusion
- A method to detect license compliance risk and an empirical study on npm.
- A method to detect copyright inconsistency risk and an evolutionary study on the Linux kernel.
Future Work
- More OSS ecosystems
- A web service for developers