Open Software for (Open) Science doi. org/ /m9

Slides:



Advertisements
Similar presentations
Overview of Free/Open Source Software for Librarians Eric Goldhagen
Advertisements

The Web Wizards Guide to Freeware/Shareware Chapter Six Open Source Software.
Free Beer and Free Speech Thomas Krichel
Open Source Software Development & Commercialisation Developing Lifelong Learner Record Systems and ePortfolios in FE and HE: Planning for, and Coping.
Platinum Sponsors Gold Sponsors Navigating the Open Source Legal Waters Presenter: Jeff Strauss August 14, 2013.
Chandler ISR June Chandler Open Source Personal Information Manager , calendar, contacts, tasks, free-form items Easy sharing and collaboration.
Open Source WGISS 39. Definition of Open Source Software (OSS)  Open source or open source software (OSS) is any computer software distributed under.
How Is Open Source Affecting Software Development? Je-Loon Yang.
Open-Source Software ISYS 475.
+ Andrés Guadamuz SCRIPT Centre for Research in IP and Technology Law, Edinburgh, UK Proprietary, Free and Open Source Software (FOSS), and Mixed Platforms.
Open Source Software Licensing: Software And it’s Components SEAN KENEFICK.
Free Yourself from © and Get Creative with Presented for PNLA Annual Conference by Connie Strittmatter and René Tanner Reference Librarians, Montana State.
This slide is licensed under a Creative Commons Attribution-NoDerivs 2.5 License. Some rights reserved.Creative Commons Attribution-NoDerivs 2.5 License.
CHAPTER 6 OPEN SOURCE SOFTWARE AND FREE SOFTWARE
Open Code and Open Content for Education: The LAMS Experience
26 April Licensing of Intellectual Property Phoenix Ambulatory Blood Pressure Monitoring System © 2009 Christopher J. Adams This work is licensed.
 Open-source software ( OSS ) is computer software that is available in source code form: the source code and certain other rights normally reserved.
1 EPICS EPICS Licensing BESSY, May 2002 Andrew Johnson.
Is Open Source Software a viable option for private and public organizations? Anthony W. Hamann Tuesday, March 21, 2006.
Licenses A Legal Necessity Copyright © 2015 – Curt Hill.
Software Sustainability Institute Linking software: Citations, roles, references,and more
IBM Governmental Programs Open Computing, Open Standards and Open Source Recommendation for Governments.
1 Patent Rights & Open Source Software Roger G. Brooks April 29,
Overview of Linux Dr. Michael L. Collard 1.
Introduction to Open Source Imed Hammouda, adjunct professor Tampere University of Technology
Copyright © 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 12 1.
Providing Access to Your Data: Rights Robert R. Downs, PhD NASA Socioeconomic Data and Applications Center (SEDAC) Center for International Earth Science.
Software Sustainability Institute How to choose a license for your software 14 September 2015, Cambridge Shoaib Sufi, Software Sustainability.
After completing this lesson, participants will be able to:  Identify ethical, legal, and policy issues for managing research data  Define copyrights,
A Basic Introduction to Free and Open Source Software Presented by John Bocan.
OPEN SOURCE AND FREE SOFTWARE. What is open source software? What is free software? What is the difference between the two? How the two differs from shareware?
IS1825 Multimedia Development for Internet Applications Lecture 09: Free and Open Source Software Rob Gleasure
Software Sustainability Institute Dealing with software: the research data issues 26 August.
April 30, 2007 openSUSE.org Build Service a short introduction Moiz Kohari VP Engineering.
CPS 82, Fall Open Source, Copyright, Copyleft.
Strategizing for the Future MySQL Conference April 27, 2006.
Distribution in Open Source Martin von Haller Groenbaek partner, Bender von Haller Dragsted ITECHLAW ASIA 2010 Bangalore, 5 February 2010.
Software Sustainability Institute Software Attribution can we improve the reusability and sustainability of scientific software?
How to Publish Your Code on COIN-OR Bob Fourer Industrial Engineering & Management Sciences Northwestern University COIN Strategic Leadership Board.
ESRIN Earth Observation Program Ground Segment Department 26/09/2015 CEOS-WGISS-40 - Olivier BaroisSlide 1 Open Source Practices.
Software Sustainability Institute Building a Scientific Software Accreditation Framework
Software Licences HSF Recommendations John Harvey / CERN 24 June 2015
FOSS Licenses & Business Models INF5750 Oct Knut Staring.
Providing Access to Your Data: Rights Robert R. Downs, PhD NASA Socioeconomic Data and Applications Center (SEDAC) Center for International Earth Science.
FP 501 OPEN SOURCE OPERATING SYSTEM CHAPTER 1: INTRODUCTION TO OPEN SOURCE SOFTWARE (OSS) TECHNOLOGY.
Software Sustainability Institute Tracking Software Contributions doi: /m9.figshare Joint ORCID – DRYAD Symposium on Research.
Software Sustainability Institute Working with research software 2 nd - 4 th November.
Silberschatz, Galvin and Gagne ©2011 Operating System Concepts Essentials – 8 th Edition Chapter 2: The Linux System Part 1.
Workshop on OSS with TT perspectives Meeting of the TT Network Board and Steering Committee Friday 10 December 2010 Bernard DENIS.
How to Use The Creative Commons Licenses. [formats]
Chapter 3: Understanding Software Licensing
Software Sustainability Institute Open science is impossible without software 5 th April 2016,
Open Source Software & Open Source Hardware licensing A comparison.
The New NAP Members’ Area Development. Elgg What is elgg? –Elgg is an award-winning open source social networking platform.
Open Source Software Practices
Industrial Research and Open Source – Reasons and
Open Source software Licensing
CREATIVE COMMONS FOR CULTURAL HERITAGE
A Canonical Production January 2013
Copyleft and dual licensing for publicly funded software
Open Source Software Licenses
FOSS 101 Sarah Glassmeyer Project Specialist Manager,
MOZILLA LICENSE HISTORICAL EVOLUTION
Chapter 2: The Linux System Part 1
Open Source Friend or Enemy?.
Reno WordPress Meetup February 12, 2015.
GNU General Public License (GPL)
COPYLEFT THE TERM The Term copyleft was forged upon the traditional copyright term by opposing the word right (which in English means both right meant.
APACHE LICENSE HISTORICAL EVOLUTION
Chapter I. Freedom and Open Source
Presentation transcript:

Open Software for (Open) Science http://dx. doi. org/10. 6084/m9 Open Software for (Open) Science http://dx.doi.org/10.6084/m9.figshare.1434044 e-IRG Workshop, 3 June 2015, Riga Neil Chue Hong (@npch), Software Sustainability Institute ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk The Software Sustainability Institute is supported by EPSRC grants EP/H043160/1 and EP/N006410/1 (which includes funding from BBSRC and ESRC). Additional projects have been funded by Jisc (Software Preservation Framework and Software Hub) and NERC (Advanced Short Courses). Supported by Project funding from Slides licensed under CC-BY where indicated:

Overview Open source: philosophy and practice Open source licenses Other types of licenses Pros/Cons of licenses Case studies DOI: 10.6084/m9.figshare.1434044

I Am Not A Lawyer For legal insight, I recommend: http://ifosslawbook I Am Not A Lawyer For legal insight, I recommend: http://ifosslawbook.org/ and for everything else:http://oss-watch.ac.uk/ DOI: 10.6084/m9.figshare.1434044

Open Source Software is Free... Photo “free beer tap” by jakob fenger (CC-BY) Photo “Speech”by Quinn Dombrowski (CC-BY) DOI: 10.6084/m9.figshare.1434044

Free as in Puppy... Long term costs Needs love and attention May lose charm after growing up Occasional clean-ups required Many left abandoned by their owners Scott McNealy coined the phrase Inspired by Scott McNealy Photos of Great Pyrenees from Jen Schopf DOI: 10.6084/m9.figshare.1434044

Free as in kittens... Can become snarling, clawing, aloof… … but they will provide joy if you treat them well Open source is very much in the spirit of open science and vice-versa Photo “Kittens!”by James Wragg (CC-BY) DOI: 10.6084/m9.figshare.1434044

Four essential freedoms Freedom for anyone to run the program as they wish, for any purpose (non-discriminatory) Freedom to study how the program works, and change it so it does your computing as you wish (source code available, modifications allowed) Freedom to redistribute copies so you can help others (redistribution allowed) Freedom to distribute copies of your modified versions to others to give them a chance to benefit from your changes (derivatives allowed) From http://www.gnu.org/philosophy/free-sw.html Note that there are some differences between the open source movement (as led by the OSI) and the free software movement (led by GNU/FSF), the former is a development methodology, the latter a social movement. DOI: 10.6084/m9.figshare.1434044

Open Source Definition (1) Free redistribution Author and others can sell or give away as part of an aggregation of software Author cannot impose a royalty or fee on those redistributing Source code available Must include a way of accessing un-obfuscated source code at a reasonable reproduction code Derived works allowed Must allow modifications and derived works Must allow distribution under same terms as license of original Integrity of The Author's Source Code License may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time, but must explicitly permit distribution of software built from modified source code. DOI: 10.6084/m9.figshare.1434044

Open Source Definition (2) Non-discriminatory Can’t restrict based on geography, person, group, field or type of organisation Product and technology neutral Not dependent on code being part of a particular product Not predicated on any individual technology or interface License must not restrict other software License must not place restrictions on other software that is distributed along with the licensed software. Distribution of license Rights must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties http://opensource.org/docs/osd DOI: 10.6084/m9.figshare.1434044

Open source, open science Non-discriminatory Research should not be restricted or siloed Access to source code Research should be transparent, robust, and accessible Redistribution of software Providing access to the widest possible community Removing barriers to reuse Research should encourage building on the work of others, and giving them credit DOI: 10.6084/m9.figshare.1434044

Types of open source license Permissive (“Use with attribution”) Simple (e.g. Modified “3-clause” BSD, MIT) Grant Patent Rights (e.g. Apache, Eclipse) Copyleft (“Share modifications under same license”) Strongly copyleft (e.g. GPL) Weakly copyleft (e.g. LGPL, Mozilla, EUPL) All OS licenses allow private and commercial use; modification; distribution; limit liability; retain copyright More information http://choosealicense.com/licenses/ https://tldrlegal.com/ Permissive similar to CC-BY Copyleft similar to CC-BY-SA No concept of ND or NC restrictive clauses Strongly copyleft means cannot be “bundled” – GPL license dominates DOI: 10.6084/m9.figshare.1434044

Other common licenses Closed (“Proprietary”) Restricted (“Academic” / “Non-commercial”) Public domain / CC0 Informal license No license Also Larger works comprising code with different licenses (“license compatibility”) Software made available under more than one license (“dual licensing”) DOI: 10.6084/m9.figshare.1434044

Licenses vs Governance Open source software is more than the license Some open licensed software is not produced openly Some closed licensed software benefits from open source project principles and processes Most open source contributors are paid to do so Best practice identified from OSS projects useful for governance and project/product management of all types of software http://producingoss.com/ DOI: 10.6084/m9.figshare.1434044

Pros and Cons of Open Source Easier to evaluate Harder to corner cut when shipping to deadline No lock-in Harder to gain income from selling the software Easier to attract contributors and staff But most sell services Frictionless code sharing Cannot restrict to non-commercial use only Free advertising Able to take with you But what is non-commercial use? Generally better modularisation Once licensed, cannot revoke license Reduced duplication Can work together on common platforms Though can relicense Can increase competition Copyrighting better than patenting for software (costs far less!) Don’t open source anything that represents core business value DOI: 10.6084/m9.figshare.1434044

Building out new functionality Dropbox moved to Go from Python to benefit from better concurrency support and execution speed Go did not have the depth of libraries that Dropbox needed to build larger systems Dropbox team started to build its own libraries connection management and memcache client Open-sourced work to kickstart community and help create better production systems https://blogs.dropbox.com/tech/2014/07/open-sourcing-our-go-libraries/ DOI: 10.6084/m9.figshare.1434044

Building community Spark started as academic project at Berkeley in 2009 Code released under BSD in 2010 Started Apache Incubator process in June 2013 Gained contributions from 120 developers in 25 organisations, including Intel, Yahoo!, Cloudera, Alibaba Relicensed under Apache 2.0 license Became full Apache project in February 2014 Most active Apache project in 2014 Source code on GitHub Used globally by Alibaba, Amazon, IBM Company formed to commercialise in Sep 2013 Based around hosting and consulting DOI: 10.6084/m9.figshare.1434044

Creating widely used platforms Git originally written and released by Linus Torvalds under GPL license Git trademarks enforced for consistency Easy for third parties to create services on top GitHub contribute to Git and open source some of their own tools Business model exploits paid hosting / support for enterprise installations http://tom.preston-werner.com/2011/11/22/open-source-everything.html http://todogroup.org/blog/why-we-run-an-open-source-program-github/ Gitlab go further and release their commercial EE edition under MIT license, but code must be requested And GItLab ask nicely that you do not redistribute the EE edition if you want to help them DOI: 10.6084/m9.figshare.1434044

Underpinning research Basics: Website, mailing list, code repository, issue resolution Remove barriers to participation, increase efficiency 1993: First public release; 2 devs 1995: Code open sourced; 3 devs 1996: r-testers list set up 1997: lists split: r-announce, r-help, r-devel; public CVS; 11 devs 2000: CRAN split and mirror 2001: BioConductor 2003: Namespaces 2005: I8n, L8n 2007: R-Forge Today: BioConductor (33 core devs), R-Forge (532 projects, 1562 devs), CRAN (1400+ packages) http://cran.r-project.org/doc/html/interface98-paper/paper_2.html DOI: 10.6084/m9.figshare.1434044

Case studies in academia OSS Watch have produced case studies of several open source software projects in academia Koha: a case study in project ownership Wookie: a case study in sustainability ATutor LMS: a case study TexGen: a case study MediaWiki: a case study in sustainability WebPA: the road to sustainability Apache Cocoon: a case study in sustainability Moodle: a case study in sustainability Exim: a case study in sustainability http://oss-watch.ac.uk/resources/casestudies DOI: 10.6084/m9.figshare.1434044

“[T]here's been a stunning and irreversible trend in enterprise infrastructure. If you're operating a data center, you're almost certainly using an open source operating system, database, middleware and other plumbing. No dominant platform-level software infrastructure has emerged in the last ten years in closed-source, proprietary form.” - Mike Olsen, Cloudera founder

GPL vs Apache Software freedom vs developer freedom Copyleft licenses have increased license management costs Industry is generally more supportive of permissive licenses Open source software no longer written as an alternative to closed, but as the platform to drive service provision Software no longer the competitive differentiator, but the ability to operate it at scale http://timreview.ca/article/650 DOI: 10.6084/m9.figshare.1434044

Open licensing should be used for data and code Software Infrastructure and Environments for Reproducible and Extensible Research Open licensing should be used for data and code Workflow tracking should be carried out during the research process Data must be available and accessible Code and methods must be available and accessible All 3rd party data and software should be cited Stodden V and Miguez S, (2014) Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research, DOI: 10.5334/jors.ay

Open Source lets others benefit Slide courtesy of Nancy Wilkins Diehr “Arman Bilge, a 10th grader at Lexington High School in Massachusetts, was a newbie to phylogenetics when a science teacher there organized an after-school phylogenetic tree club. In the club, Bilge learned how to use a variety of software applications, including one well known to systematic biologists called BEAST. That led Bilge to create a map and timeline that identified when the Human Immunodeficiency Virus (HIV) arrived in the Americas, and where and when it spread across North and South America. “The BEAST is a beast,” Bilge said in a telephone interview from Lexington. He managed to tame it enough to create a detailed phylogenetic tree based on similarities and differences in the 3,000 nucleotide subunits of a gene for an envelope protein among 700 known HIV-1 strains. “You ask the BEAST program to guess at a most likely family tree for all of them and the software scores each possible tree with a likelihood function,” Bilge said. “The number of possible phylogenetic trees for HIV-1 exceeds the number of protons in the universe, and only one of them is correct, so this is a big calculation.” Bilge first tried to run the analysis on his home computer. “I ran it for three weeks, but I didn’t reach the accepted way of knowing that you came to the end,” he said. Bilge said his parameters and settings were impossible for any computer to analyze, but he learned from the experience. “I started multiple simultaneous runs on CIPRES and the geographic component of my project is the result of the concatenation of these analyses,” he said. The phylogenetic tree Bilge published for his science fair project was the one that BEAST said was the most optimal, and Bilge said his conclusions supported the previously published results of HIV experts: “A single introduction of the virus in Haiti in the mid-1900s resulted in its dispersion across the American continent.” While Bilge may not have been satisfied with his residual statistical uncertainty, the judges at the 2012 Massachusetts Science and Engineering Fair were – they awarded him first place in the biology category.” Slide courtesy of Nancy Wilkins-Diehr BEAST software licensed under LGPL

Summary Open source now de facto for infrastructure software Open source encourages Exploitation Reproducibility and Robustness Reuse Open source helps support open science “Publicly funded research […] is a public good, produced in the public interest, which should be made openly available” – RCUK http://www.software.ac.uk/resources/guides/epsrc-research-data-policy-and-software DOI: 10.6084/m9.figshare.1434044