One Step Forward, Two Steps Back: A Design Framework and Exemplar Metrics for Assessing FAIRness in Trustworthy Data Repositories
Peter Doorn, Director DANS
WG RDA/WDS Assessment of Data Fitness for Use
RDA 11th Plenary meeting, Berlin, 22-03-2018
@pkdoorn @dansknaw
DANS is about keeping data FAIR
https://dans.knaw.nl
- EASY: Certified Long-term Archive
- NARCIS: Portal aggregating research information and institutional repositories
- DataverseNL: to support data storage during research until 10 years after
Previously on RDA: Barcelona 2017
DSA Principles (for data repositories) -> FAIR Principles (for data sets)
- data can be found on the internet -> Findable (F)
- data are accessible -> Accessible (A)
- data are in a usable format -> Interoperable (I)
- data are reliable -> Reusable (R)
- data can be referred to (citable)
FAIR Badging scheme
https://www.surveymonkey.com/r/fairdat
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar
Some measuring problems encountered in tests of FAIR data assessment tool
- Assessing multi-file data sets:
  - Which are in different formats, some open, some proprietary
  - Some files well-documented, others less so
  - Some are openly accessible, others are protected
- Quality of metadata: when is metadata minimal / insufficient / sufficient / extensive / rich?
- Use of standard vocabularies: how to define? Often these apply only to a subset of the data, e.g. specific variables
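The multi-file problem can be made concrete with a small sketch. It is only an illustration, not part of the FAIRdat tool: the `FileAssessment` fields, the file names, and the all/some/none summary are assumptions introduced here. It shows why a single yes/no answer per data set loses information when the files differ.

```python
from dataclasses import dataclass


@dataclass
class FileAssessment:
    # Hypothetical per-file properties, mirroring the three bullets above.
    name: str
    open_format: bool
    documented: bool
    openly_accessible: bool


def summarize(files, attribute):
    """Summarize one boolean property across all files of a data set."""
    if not files:
        raise ValueError("a data set needs at least one file")
    values = [getattr(f, attribute) for f in files]
    if all(values):
        return "all"
    if any(values):
        return "some"
    return "none"


# Illustrative data set with mixed formats, documentation, and access.
files = [
    FileAssessment("survey.csv", open_format=True, documented=True, openly_accessible=True),
    FileAssessment("survey.sav", open_format=False, documented=True, openly_accessible=False),
    FileAssessment("notes.docx", open_format=False, documented=False, openly_accessible=False),
]

summary = {
    attribute: summarize(files, attribute)
    for attribute in ("open_format", "documented", "openly_accessible")
}
print(summary)
```

Every property comes out as "some" for this data set, which is exactly the case a one-score-per-data-set questionnaire struggles to express.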
Video: https://www.youtube.com/watch?v=w-3oLA7kjFY&t=0m18s
What have we done since?
- Test prototype FAIRdat within DANS, within 4 other repositories, and at Open Science FAIR in Athens
- Participate in FAIR metrics group: see http://fairmetrics.org/
  - 14 metrics on GitHub: https://github.com/FAIRMetrics/Metrics
  - Preprint of paper ‘A design framework and exemplar metrics for FAIRness’: https://www.biorxiv.org/content/early/2017/12/01/225490
- Evaluate DANS archive against FAIR metrics
FAIRdat Prototype Testing (4 repositories)

| Name of Repository | Number of Datasets | Number of Reviewers | Number of reviews |
| VirginiaTech | 5 | 1 | (not given) |
| MendeleyData | 10 | 3 (for 8 datasets); 2 (for 2 datasets) | 28 |
| Dryad | 9 | 3 (for 2 datasets); 2 (for 3 datasets) | 16 |
| CCDC | 11 | ? (no names); 2 (for 1 dataset) | 12 |

Results:
- Variances in FAIR scores across multiple reviewers because of:
  - subjectivity of some questions (e.g. sufficiency of metadata)
  - misunderstanding of what was asked
- Worry that sensitive data will never get a high FAIR rating even if all its metadata is available and machine-readable

Speaker notes: A month ago we had the opportunity to run a pilot test of the prototype with 4 data repositories (VirginiaTech, MendeleyData, Dryad and CCDC) to see whether the questionnaire design is easy to use and effective. We asked reviewers to assess multiple datasets from different domains, and we also had different reviewers assess the same datasets. The results showed some variance in the FAIR scores, caused by the subjectivity of some questions (difficulties with assessing the extent of metadata: sufficient vs. rich), by misinterpretation of what was asked, and by difficulties with assessing the sustainability of multi-file datasets (preferred vs. accepted file formats). There was also concern that sensitive or restricted datasets will never be able to score highly, even if all their metadata is available and machine-readable, or even if the data can be made available once the requested permission is granted by the data holder. So we probably need to find a path for those datasets too! Despite these challenges, all the repositories are willing to participate in a second round of testing once adjustments and improvements are made.

Slide credits: Eleftheria Tsoupra
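The review totals in the table can be cross-checked from the reviewer breakdown. The sketch below is a plausibility check, not something shown on the slide: it assumes that every dataset not covered by a listed "N reviewers (for M datasets)" pair was reviewed by exactly one reviewer. VirginiaTech is left out because its total is not given.

```python
# (reviewers, datasets) pairs as listed in the table.
# ASSUMPTION: datasets not covered by any pair had exactly one reviewer.
breakdown = {
    "MendeleyData": {"datasets": 10, "groups": [(3, 8), (2, 2)]},
    "Dryad": {"datasets": 9, "groups": [(3, 2), (2, 3)]},
    "CCDC": {"datasets": 11, "groups": [(2, 1)]},
}


def total_reviews(datasets, groups):
    """Count reviews: listed groups plus one review per remaining dataset."""
    covered = sum(n_datasets for _, n_datasets in groups)
    listed = sum(reviewers * n_datasets for reviewers, n_datasets in groups)
    return listed + (datasets - covered)


review_counts = {name: total_reviews(**row) for name, row in breakdown.items()}
print(review_counts)
```

Under that assumption the computed totals agree with the reported 28, 16 and 12 reviews.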
Prototype Testing (Open Science FAIR)
Feedback from 17 participants
Pros
- Simple/easy to use questionnaire
- Well-documented
- Useful
Cons
- Oversimplified questionnaire structure
- Some subjective indicators
- Some requirements based on Reusability may be missing from the current operationalization

Speaker notes: Furthermore, 2 weeks ago the pilot version of the assessment tool was tested by the participants of a workshop that was part of the Open Science FAIR conference in Athens. This time we gathered feedback from a diverse group of people, using a set of questions (such as ‘What was best?’, ‘What was the main obstacle?’, etc.) rather than just asking for input… Pros: according to most of the participants the tool is simple and easy to use, well-documented and useful. Cons: for some others the questionnaire structure appeared to be oversimplified, and they suggested adding more questions… A couple of participants think that treating R as the average might mean that some requirements are missing, while other...

Slide credits: Eleftheria Tsoupra
Can we align (match) the DANS FAIRdat questions with the new FAIR metrics?

| DANS FAIRdat metrics | FAIR metrics |
| K.I.S.S. | Aspirational |
| Questionnaire based | Fully automatic assessment |
| F + A + I = R | R also operationalized |
| Works for men & women | Aimed at men & machines |
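The "F + A + I = R" entry, read together with the workshop feedback about "treating R as the average", suggests that FAIRdat derives R from the other three scores instead of measuring it separately. A minimal sketch of that idea follows. The 1 to 5 scale, the function name and the example values are assumptions made for illustration, not the FAIRdat implementation.

```python
from statistics import mean


def fairdat_scores(f, a, i):
    """Return F, A and I together with R derived as their mean.

    ASSUMPTION: each score lies on a 1-5 scale; the source does not state the scale.
    """
    for name, value in (("F", f), ("A", a), ("I", i)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return {"F": f, "A": a, "I": i, "R": mean((f, a, i))}


# Illustrative values only.
scores = fairdat_scores(f=5, a=3, i=4)
print(scores)
```

Because R is computed rather than asked, any Reusability requirement that is not already captured by F, A or I cannot influence the score, which is the concern raised by the workshop participants.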
(Self) assessment of DANS archive on the basis of the FAIR principles (& metrics)
Delft University: DANS EASY complies with 11 out of 15 principles; for 2 DANS does not comply (I2 & R1.2); for 2 more it is unclear (A2 & R1.3)
Self assessment:
- Some metrics: FAIRness of DANS archive could be improved
  - E.g.: Machine accessibility; Interoperability requirements; Use of standard vocabularies; Provenance
- Some metrics: we are not sure how to apply them
  - E.g.: PID resolves to landing page (metadata), not to dataset; Dataset may consist of multiple files without standard PID
- Sometimes the FAIR principle itself is not clear
  - E.g.: Principle applies to both data and metadata; What does interoperability mean for images or PDFs? Are some data types intrinsically UNFAIR?
  - Some terms are inherently subjective (plurality, richly)
General conclusion
Before we started thinking about implementing the FAIR principles we were confused about the subject. Having tried to implement them we are still confused -- but on a higher level.
This confusion is partly due to the very good ideas underlying FAIR, but we may need a clearer FAIR 2.0 specification of the principles.
Steps forward again
- DANS paper to be out soon
- Deal with as many FAIR principles/metrics as possible at the repository level (i.e. by Core Trust Seal certification)
- Have separate metrics/questions at level of dataset (= collection), metadata and single data file
- Questionnaire approach remains useful as a FAIR data review tool, and we doubt whether automatic testing will work in practice
- Some new FAIR metrics are too ambitious to be applicable to legacy data
- Relax on tying questions directly to F, A, I and R and separate scores for each letter
- Keep some form of badging for the data user to get an impression of fitness for use!