Does it make sense to apply the FAIR Data Principles to Software?


Does it make sense to apply the FAIR Data Principles to Software?
Peter Doorn, Director DANS (@pkdoorn, @dansknaw)
Sustainable Software Sustainability Workshop, The Hague, March 7-9, 2017

Sustainable Software Sustainability Workshop, The Hague, 7-9 March 2017
Four Topics:
1. FAIR Software?
2. Software as Heritage
3. Towards an International Software Sustainability Infrastructure
4. A Software Seal of Approval?

Sustaining software
Costs & Complexity, from low (-) to high (+):
- Preserving the code: Software Archive
- Keep the software running: Keep or emulate a platform for the software to run on
- Sustain the service: Keep the knowledge and support for users

DANS is about keeping data FAIR
We are not funded to care about software… yet!
- Mission: promote and provide permanent access to digital research resources
- First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive 1989
- Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005

Why do we care about software?
- Our customers ask us
- Research Infrastructures demand it (e.g. CLARIN)
- Replication of results requires both data and the tools that process them
- Information about data is often encapsulated in software
- Linking data and other research resources (publications, project information - NARCIS)
- Software as a special form of data (that is executable)
- SoSu as required element in Research Data Management
CLARIN: “Centres take care for the sustainability of data and tools and the permanent access to them”

Work on SoSu to which DANS contributed
- TDS Curator project (2010-2012): use case to “curate” the software and data of the “Typological Database System”: https://goo.gl/uTzRhH
- Workshops on Software Sustainability (eHumanities/KNAW, Amsterdam, 2013; Knowledge Exchange, Berlin, 2015; NCDD, Amsterdam, 2016)
- Reports on SoSu: https://goo.gl/BG69Au
  - Software sustainability at the Heart of Discovery (with Netherlands eScience Center)
  - A Conceptual Approach to Data Stewardship and Software Sustainability
  - Software Sustainability - final report (commissioned by NCDD and NDE)
- Guidelines for Software Quality (with Radboud University Nijmegen, Huygens Institute, CLARIAH): https://goo.gl/HLi4nD
- Research Software Sustainability: Report on Knowledge Exchange workshop by Simon Hettrick, SSI (Berlin, October 2015): https://goo.gl/0Dm4kQ
- Webinar on SoSu: https://goo.gl/ShmyZH
- Memorandum of Understanding, INRIA, Software Heritage Archive

Can we use DSA and FAIR for software?
Neil Chue Hong spoke about a “Software Seal of Approval” in another session. I will focus on FAIR principles for Software.
- Why use the FAIR data principles and not use existing criteria dedicated to software quality?
- Do the FAIR principles cover the right aspects?
- Do they cover enough ground?
- How to operationalize and implement them?
- How to proceed?

Why use the FAIR data principles and not use existing criteria dedicated to software quality?
- Simplicity: the FAIR principles are simple
- Popularity: everybody seems to like them
- Software as a special kind of data
- Extend Data Management Planning with requirements about software
- Political: research funders are embracing the FAIR principles
- Operationalization and Implementation are not fixed
“Specify what methods or software tools are needed to access the data? Is documentation about the software needed to access the data included? Is it possible to include the relevant software (e.g. in open source code)?” https://goo.gl/7Jqqs5

In the FAIR Data approach, data should be:
- Findable – Easy to find by both humans and computer systems and based on mandatory description of the metadata that allow the discovery of interesting datasets;
- Accessible – Stored for long term such that they can be easily accessed and/or downloaded with well-defined license and access conditions (Open Access when possible), whether at the level of metadata, or at the level of the actual data content;
- Interoperable – Ready to be combined with other datasets by humans as well as computer systems;
- Reusable – Ready to be used for future research and to be processed further using computational methods.
Do these principles make sense when we are dealing with software?
https://www.dtls.nl/fair-data/

Approaches to software quality in the context of sustainability
- Software Quality Management: the structure, classification and terminology of attributes and metrics applicable to software quality management have been derived or extracted from the ISO 9126-3 and the subsequent ISO 25000:2005 quality model, also known as SQuaRE.
- Based on these models, the Consortium for IT Software Quality (CISQ) has defined five major desirable structural characteristics needed for a piece of software to provide business value: Reliability, Efficiency, Security, Maintainability and (adequate) Size.
- Maintainability of software is only one of the quality dimensions. It includes concepts of modularity, understandability, changeability, testability, reusability, and transferability from one development team to another.
https://en.wikipedia.org/w/index.php?curid=32370588

SSI “maintainability checklist”
- Can I find the code that is related to a specific problem or change?
- Can I understand the code?
- Can I explain the rationale behind it to someone else?
- Is it easy to change the code?
- Is it easy for me to determine what I need to change as a consequence?
- Are the number and magnitude of such knock-on changes small?
- Can I quickly verify a change (preferably in isolation)?
- Can I make a change with only a low risk of breaking existing features?
- If I do break something, is it quick and easy to detect and diagnose the problem?
https://www.software.ac.uk/developing-maintainable-software

Criteria for evaluating Software Sustainability https://goo.gl/9YRM01 https://goo.gl/HLi4nD

From SSI and CLARIAH Criteria to FAIR: Usability
Columns: CLARIAH Number | CLARIAH Criterion | SSI Criterion | Explanation | FAIR letter | SSI Criteria | CLARIAH Criteria | MoSCoW | Repository / Software
5 Usability 73 42
5.1 Understandability Is the software easily understood? F1 11 6 M S
5.2 Documentation Comprehensive well-structured documentation? F2 25 12
5.4 Buildability Straightforward to build from source on a supported system? ? 4 W
5.5 Installability Straightforward to install and deploy on a supported system? 19 10
5.3 Learnability Easy/intuitive to learn how to use its functions? R1 7 C
5.6 Performance - Does the software perform well? R2

From SSI and CLARIAH Criteria to FAIR: Sustainability 1
Columns: CLARIAH Number | CLARIAH Criterion | SSI Criterion | Explanation | FAIR letter | SSI Criteria | CLARIAH Criteria | MoSCoW | Repository / Software
6 Sustainability & Manageability 130 45
6.1 Identity Project/software identity is clear and unique? F3 8 3 M R / S
6.2 Copyright & Licensing Copyright Easy to see who owns the project/software? A1 7 - Licencing Adoption of appropriate licence? A2 5 (M)
6.14 Governance Easy to understand how the project is run and the development of the software managed? R? 2 ? W
6.4 Community Evidence of current/future community? R3 11
6.3 Accessibility Evidence of current/future ability to download? A3 12
6.5 Testability Easy to test correctness of source code? R4 19 4 S
6.6 Portability Usable on multiple platforms? I1 17* C

From SSI and CLARIAH Criteria to FAIR: Sustainability 2
Columns: CLARIAH Number | CLARIAH Criterion | SSI Criterion | Explanation | FAIR letter | SSI Criteria | CLARIAH Criteria | MoSCoW | Repository / Software
6.7 Supportability Evidence of current/future developer support? R5 21 2 W R / S
6.8 Analysability** Easy to understand at the source level? F4 20 8 M** S
6.9 Changeability Easy to modify and contribute changes to developers? I2 14 6
6.12 Interoperability Evolvability Evidence of current/future development? R6 5 1 - Interoperable with other required/related software? I3
6.13 Interoperability for community (CLARIAH) Does the software comply with requirements for integration into the community (CLARIAH) infrastructure? I4 ? C
6.10 Reusability To what extent is the software reusable? R7 3 W***
6.11 Security & Privacy Are security and privacy dealt with adequately? R?
* Several PC/Mac platforms are mentioned, no platforms for mobile devices
** Combine with understandability/documentation
*** Is defined by all the other criteria

CLARIAH’s Minimal Requirements for 4 Configurations
Configurations: 1: Actively Supported End User Software; 2: Unsupported End User Software; 3: Actively Supported Experimental Software; 4: Unsupported Experimental Software
No | Minimal Community Requirement (CLARIAH)
1 version control system
2 one command that installs all dependencies
3 built with one command
4 one command that will perform the installation -
5 run in the foreground
6 runtime dependencies
7 one uninstall script
8 installation exceptions
9 README file
10 documentation must be stored alongside the code
11 documentation must be also stored in an archivable format
12 website
13 user interface of the application should point to the website
14 user interface should point to the contact information
15 API must be discoverable
16 configuration containing all available options
17 distributed under an OSI approved licence
18 CONTRIBUTING file
19 not store usernames and passwords
20 experimental nature must be clear
21 requirements must be distributed alongside the code
50 47 30 28
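A few of these minimal requirements (a version control system, a README file, a CONTRIBUTING file) lend themselves to a mechanical check. The sketch below is only an illustration under stated assumptions: the function name `check_minimal_requirements` is hypothetical, Git is assumed as the version control system, and README/CONTRIBUTING are accepted with any file extension. None of these choices come from the CLARIAH guidelines themselves.

```python
from pathlib import Path


def check_minimal_requirements(repo_dir):
    """Return a dict mapping a requirement label to True or False.

    Assumptions (not from the slides): Git is the version control
    system, and README/CONTRIBUTING may carry any file extension.
    """
    root = Path(repo_dir)

    def has_file(stem):
        # Matches e.g. README, README.md, readme.txt (case-insensitive stem).
        return any(
            p.is_file() and p.name.split(".")[0].lower() == stem.lower()
            for p in root.iterdir()
        )

    return {
        "version control system": (root / ".git").exists(),
        "README file": has_file("README"),
        "CONTRIBUTING file": has_file("CONTRIBUTING"),
    }
```

Calling `check_minimal_requirements(".")` in a project directory returns one boolean per requirement. Most of the other requirements in the list (for example "experimental nature must be clear") need human judgement and are deliberately left out.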

Scoring the criteria: binary?
5.2 Documentation (Y=1; N=0) No Yes
D1 Is the software documented? 1
D2 Is the documentation accessible?
D3 Is the documentation clear?
D4 Is the documentation complete?
D5 Is the documentation accurate?
D6 Does the documentation provide a high level overview of the software?
D7 Are all the necessary audiences addressed, at their appropriate levels?
D8 Does the documentation make use of adequate examples?
D9 Is there troubleshooting information?
D10 Is the documentation available from the project website?
D11 Is the documentation under version control?
D12 Does the documentation describe the latest version?
CLARIAH SSI
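Binary scoring amounts to counting the "Yes" answers. A minimal sketch follows, assuming the answers are held in a dictionary keyed by question id; the function name `binary_score` and the example answers are hypothetical and are not results from the presentation.

```python
def binary_score(answers):
    """Sum yes/no answers (Y=1; N=0) and return (score, maximum)."""
    score = sum(1 for answered_yes in answers.values() if answered_yes)
    return score, len(answers)


# Hypothetical answers for D1..D12: "Yes" for D1-D9, "No" for D10-D12.
example = {f"D{i}": (i <= 9) for i in range(1, 13)}
score, maximum = binary_score(example)
print(f"5.2 Documentation: {score}/{maximum}")
```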

Or score on a 5-point scale?
5.2 Documentation (Low/Bad/No = 1; High/Very/Yes = 5) Low/Bad/No Moderate High/Very/Yes
D1 How well is the software documented? 1 2 3 4
D2 How accessible is the documentation?
D3 How clear is the documentation?
D4 How complete is the documentation?
D5 How accurate is the documentation?
D6 Does the documentation provide a high level overview of the software?
D7 How well are all the necessary audiences addressed, at their appropriate levels?
D8 How adequately does the documentation make use of examples?
D9 Is there troubleshooting information? -
D10 Is the documentation available from the project website?
D11 Is the documentation under version control?
D12 Does the documentation describe the latest version?
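On a 5-point scale the ratings are usually averaged rather than summed, and can be rescaled to 0..1 for comparison with a binary score. The sketch below assumes that approach; the function name `scale_score` and the example ratings are hypothetical, and yes/no questions are left out of the example.

```python
from statistics import mean


def scale_score(ratings, low=1, high=5):
    """Return (mean rating, mean rescaled to the range 0..1).

    Raises ValueError if any rating falls outside low..high.
    """
    for question, rating in ratings.items():
        if not low <= rating <= high:
            raise ValueError(f"{question}: rating {rating} outside {low}..{high}")
    average = mean(list(ratings.values()))
    return average, (average - low) / (high - low)


# Hypothetical ratings for four of the graded questions.
example_ratings = {"D1": 4, "D2": 5, "D3": 3, "D4": 2}
average, normalised = scale_score(example_ratings)
print(f"mean={average}, normalised={normalised}")
```

The mean keeps the result on the same 1 to 5 scale whatever the number of questions, which a plain sum would not.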

FAIR criteria x MoSCoW
M S C W ? Total
F 4 A 2 I 1 R 9 6 3 7 21
Remember KISS! Required? Additional?
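A cross-tabulation of this kind can be produced by counting criteria per FAIR letter and MoSCoW priority. The sketch below assumes each criterion is stored as a (FAIR code, priority) pair; the function names are hypothetical, and the sample pairs are illustrative only, so they do not reproduce the totals on the slide.

```python
from collections import Counter

FAIR_LETTERS = ("F", "A", "I", "R")
MOSCOW = ("M", "S", "C", "W", "?")


def cross_tabulate(criteria):
    """Count criteria per (FAIR letter, MoSCoW priority) pair.

    `criteria` is an iterable of (fair_code, priority) tuples,
    for example ("F1", "M").
    """
    return Counter((code[0], priority) for code, priority in criteria)


def format_table(counts):
    """Render the counts as plain-text rows with a Total column."""
    lines = ["  " + " ".join(f"{p:>3}" for p in MOSCOW) + " Total"]
    for letter in FAIR_LETTERS:
        row = [counts[(letter, p)] for p in MOSCOW]
        cells = " ".join(f"{n:>3}" for n in row)
        lines.append(f"{letter} {cells} {sum(row):>5}")
    return "\n".join(lines)


# Illustrative sample only, not the full set of criteria.
sample = [("F1", "M"), ("F3", "M"), ("R1", "C"), ("I1", "C"), ("R5", "W")]
print(format_table(cross_tabulate(sample)))
```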

Thank you for listening!
peter.doorn@dans.knaw.nl
www.dans.knaw.nl
http://www.dtls.nl/go-fair/
Webinar Doorn & Dillo on FAIR & DSA: https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar