Research data requirements in Horizon 2020

Presentation transcript:

Research data requirements in Horizon 2020 - what you need to know to assess DMPs
Sarah Jones, DCC, University of Glasgow
sarah.jones@glasgow.ac.uk
Twitter: @sjDCC #fosteropenscience
OA and DMP training, Brussels, 28-29 November 2017

FAIR vs Open
- FAIR data does not have to be open
- Making data FAIR ensures it can be found, understood and reused – by the creator as well as others
- Data can be shared under restrictions & still be FAIR
- Open data is a subset of all the data shared
- As open as possible, as closed as necessary
Image CC-BY-SA by SangyaPundir

FAIR data checklist
Findable
- Persistent ID
- Metadata online
Accessible
- Restrictions where needed
Interoperable
- Use standards, controlled vocabs
- Common (open) formats
Reusable
- Rich documentation
- Clear usage licence
https://zenodo.org/record/1065991

Deposit in a data repository
The EC guidelines point to Re3data as one of the registries that can be searched to find a home for data
www.fosteropenscience.eu/content/re3data-demo
www.re3data.org

How to select a repository?
- Better to use a subject-specific repository if available
- Check that it matches your particular data needs, e.g. formats accepted, mixture of Open and Restricted Access
- Does it assign a persistent and globally unique identifier, for sustainable citations and to link back to particular researchers and grants?
- Look for certification as a ‘Trustworthy Digital Repository’ with an explicit ambition to keep the data available in the long term
- Icons to note open access, licenses, PIDs, certificates…

Zenodo
Zenodo is a multi-disciplinary repository that can be used for the long-tail of research data
- An OpenAIRE-CERN joint effort
- Multidisciplinary repository accepting:
  - Multiple data types
  - Publications
  - Software
- Assigns a Digital Object Identifier (DOI)
- Links funding, publications, data & software
www.zenodo.org

What is a Persistent Identifier?
- A long-lasting reference to a document, file or other object
- PIDs come in various forms, e.g. ARK, DOI, URN, PURL, Handles...
- Typically they’re actionable, i.e. you can type one into a web browser to access the object
- Many repositories will assign them on deposit
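To make "actionable" concrete, here is a minimal Python sketch that turns a bare DOI into a URL a browser can follow via the public resolver at https://doi.org/. The function name `doi_to_url` and the example DOI are hypothetical, chosen only for illustration; other PID types (ARK, URN, Handles) use their own resolvers and are not covered.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into an actionable https URL (illustrative helper)."""
    doi = doi.strip()
    # Accept a few common ways people paste DOIs.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Very light sanity check: DOIs start with "10." and contain a slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the slashes.
    return DOI_RESOLVER + quote(doi, safe="/")


example_doi = "10.1234/example.5678"  # hypothetical DOI, not a real record
print(doi_to_url(example_doi))
```

The check is deliberately loose; it only catches obvious mistakes and does not confirm that the DOI is registered.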

Create metadata
- At a basic level, metadata supports findability, disambiguation and citation
- Rich, specific metadata will support interoperability & reuse
- Standards should be used. These can be general, such as Dublin Core, or discipline specific:
  - Data Documentation Initiative (DDI) – social science
  - Ecological Metadata Language (EML) – ecology
  - Flexible Image Transport System (FITS) – astronomy

Dublin Core metadata example
Creator: Donald Cooper (Role=Photographer)
Subject: Shakespeare, William, 1564-1616, Antony and Cleopatra [LC]
Description: Vanessa Redgrave as Cleopatra
Date: 1973-08-09
Type: Image
Format: JPEG
Identifier: 4150 [catalogue no]
Source: negative no 235
Relation: Antony and Cleopatra: Thompson/73-8 (IsPartOf)
Coverage: Bankside Globe (Role=Spatial)
Rights: Donald Cooper
www.ahds.ac.uk/performingarts
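As a rough illustration of how the record above could be made machine-readable, the Python sketch below serialises it as XML using the Dublin Core element namespace (http://purl.org/dc/elements/1.1/). The wrapper element name `metadata` and the helper `to_dc_xml` are assumptions made for this example, and the qualifiers shown on the slide (Role=Photographer, IsPartOf, Role=Spatial) are left out for simplicity.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Values copied from the slide; qualifiers and bracketed notes are omitted.
record = [
    ("creator", "Donald Cooper"),
    ("subject", "Shakespeare, William, 1564-1616, Antony and Cleopatra"),
    ("description", "Vanessa Redgrave as Cleopatra"),
    ("date", "1973-08-09"),
    ("type", "Image"),
    ("format", "JPEG"),
    ("identifier", "4150"),
    ("source", "negative no 235"),
    ("relation", "Antony and Cleopatra: Thompson/73-8"),
    ("coverage", "Bankside Globe"),
    ("rights", "Donald Cooper"),
]


def to_dc_xml(fields):
    """Serialise (element, value) pairs as simple Dublin Core XML."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dc_xml(record)
print(xml_text)
```

A repository would normally generate this kind of record from a deposit form; the point here is only that each labelled field maps to one named element.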

Where to find relevant standards?
Metadata Standards Directory
- Broad, disciplinary listing of standards and tools
- Maintained by RDA group
- http://rd-alliance.github.io/metadata-directory
FAIRsharing
- A portal of data standards, databases, and policies
- Focused on life, environmental and biomedical sciences, but expanding to other disciplines
- https://fairsharing.org

Value of controlled vocabularies
“MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……”
- From a Cambridge University workshop with 33 participants
- Participants were asked what species the study was on, and gave a range of answers denoting humans
- Humans can see these all mean the same thing, but computers can’t
Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI)

Controlled vocabularies
- E.g. SNOMED CT (clinical terms) or MeSH
- Include ontologies as well
- Defined terms + taxonomy
- Useful for selecting keywords to tag datasets
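The idea of mapping varied free-text answers to one defined term can be sketched in a few lines of Python. This is an illustrative toy, not how SNOMED CT or MeSH are accessed: the synonym list is invented for the example (the workshop's actual answers are not given here), and the preferred term uses the NCBI Taxonomy identifier 9606 for Homo sapiens.

```python
# Preferred term and identifier that every variant should resolve to.
PREFERRED = ("Homo sapiens", "NCBITaxon:9606")

# Invented free-text variants, standing in for the kinds of answers
# people give when no controlled vocabulary is enforced.
SYNONYMS = {"homo sapiens", "h. sapiens", "human", "humans", "man"}


def normalise_species(answer):
    """Return the preferred (label, id) pair, or None if unrecognised."""
    key = " ".join(answer.lower().split())  # lowercase, collapse whitespace
    return PREFERRED if key in SYNONYMS else None


answers = ["Human", "homo  sapiens", "Humans", "H. sapiens", "mouse"]
for answer in answers:
    print(repr(answer), "->", normalise_species(answer))
```

Tagging the dataset with the identifier up front avoids this clean-up step, which is the practical argument for controlled vocabularies.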

Create documentation
We recommend that a ReadMe be a plain text file containing:
- for each filename, a short description of what data it includes, optionally describing the relationship to the tables, figures, or sections within the accompanying publication
- for tabular data: definitions of column headings and row labels; data codes (including missing data); and measurement units
- any data processing steps, especially if not described in the publication, that may affect interpretation of results
- a description of what associated datasets are stored elsewhere, if applicable
- whom to contact with questions
http://datadryad.org/pages/readme
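The ReadMe elements listed above can be sketched as a small Python helper that assembles a plain text file. The section headings, the function name `build_readme`, and every example value (file names, columns, codes, contact address) are hypothetical and only show the shape of such a file; they are not a format prescribed by Dryad.

```python
def build_readme(files, columns, data_codes, processing_steps,
                 related_datasets, contact):
    """Assemble a plain-text ReadMe from the recommended elements."""
    lines = ["FILES"]
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    lines += ["", "COLUMNS (tabular data)"]
    for col in columns:
        lines.append(f"  {col['name']}: {col['definition']} [unit: {col['unit']}]")
    lines += ["", "DATA CODES"]
    for code, meaning in data_codes.items():
        lines.append(f"  {code}: {meaning}")
    lines += ["", "PROCESSING STEPS"]
    lines += [f"  {i}. {step}" for i, step in enumerate(processing_steps, 1)]
    lines += ["", "RELATED DATASETS STORED ELSEWHERE"]
    lines += [f"  {item}" for item in related_datasets] or ["  none"]
    lines += ["", "CONTACT", f"  {contact}"]
    return "\n".join(lines) + "\n"


# All values below are made up for illustration.
readme_text = build_readme(
    files={"survey.csv": "Raw survey responses (Table 2 in the paper)"},
    columns=[{"name": "temp", "definition": "Air temperature", "unit": "degrees C"}],
    data_codes={"-999": "missing value"},
    processing_steps=["Removed duplicate rows", "Converted dates to ISO 8601"],
    related_datasets=[],
    contact="data-owner@example.org",
)
print(readme_text)
```

Writing `readme_text` to a file named README.txt next to the data keeps the documentation in the same plain text form the guidance recommends.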

Choose appropriate file formats
- Different formats are good for different things
- Open, lossless formats are more sustainable, e.g. rtf, xml, tif, wav
- Proprietary and/or compressed formats are less preservable but are often in widespread use, e.g. doc, jpg, mp3
- Use one format for analysis, then convert to a standard format
- Data centres may suggest preferred formats for deposit: www.data-archive.ac.uk/create-manage/format/formats-table
- BioformatsConverter batch converts a variety of proprietary microscopy image formats to the Open Microscopy Environment format, OME-TIFF
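As a rough aid when reviewing the file list in a DMP, the Python sketch below sorts file names into the two groups named on this slide, using only the example extensions given there. The function name `classify_format` and the sample file names are hypothetical, and a real check would rely on a data centre's preferred-formats table, not on this short list.

```python
from pathlib import Path

# Example extensions taken from the slide; not an exhaustive list.
MORE_SUSTAINABLE = {"rtf", "xml", "tif", "wav"}
LESS_PRESERVABLE = {"doc", "jpg", "mp3"}


def classify_format(filename):
    """Classify a file by extension into one of the slide's two groups."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in MORE_SUSTAINABLE:
        return "open/lossless"
    if ext in LESS_PRESERVABLE:
        return "proprietary or compressed"
    return "unknown"


# Hypothetical file names, for illustration only.
for name in ["interview_01.WAV", "notes.doc", "scan.tif", "README"]:
    print(name, "->", classify_format(name))
```

Files flagged as proprietary or compressed are candidates for conversion before deposit; files marked unknown need a manual look.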

License research data openly
- This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence: www.dcc.ac.uk/resources/how-guides/license-research-data
Creative Commons limitations
- NC (Non-Commercial): what counts as commercial?
- ND (No Derivatives): severely restricts use
- Licences with these clauses are not open licences
Horizon 2020 Open Access guidelines point to: CC-0 or CC-BY
Notes: Guidance from the DCC can also help researchers to understand data licensing, e.g. the limitations of some CC options. The OA guidelines under Horizon 2020 point to CC-0 or CC-BY as a straightforward and effective way to make it possible for others to mine, exploit and reproduce the data. See p11 at: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf

EUDAT licensing tool
Answer questions to determine which licence(s) are appropriate to use
http://ufal.github.io/lindat-license-selector

Thanks – any questions?
Follow us on Twitter: @fosterscience and #fosteropenscience
FOSTER training events and materials: www.fosteropenscience.eu/events

Reviewing DMPs
Using the EC assessment grid, review one or more DMPs against the main criteria covered here:
- 1.c Are data types and formats accurately listed?
- 2.1.a Will the data be assigned a unique and persistent identifier?
- 2.1.d Will the data be described with rich metadata?
- 2.2.c Is it specified where the data and associated metadata, documentation and code are deposited?
- 2.3.a Is it described how data interoperability will be facilitated, e.g. through use of data and metadata vocabularies, standards or methodologies?
- 2.4.a Is data licensing and its role in facilitating re-use described?