Developing the European Open Science Cloud: how DANS supports FAIR and Open Science. Peter Doorn, director DANS; Elly Dijk, senior policy advisor DANS/NARCIS. CHIST-ERA Projects Seminar 2019, Open Science Session, Bucharest, Romania, 3 April 2019.
Topics: Open Science and FAIR Data. Services of DANS: DataverseNL, EASY, NARCIS. DANS in EOSC-related projects: EOSCpilot, OpenAIRE Advance, EOSC-hub, FREYA, FAIRsFAIR.
EC and international organisations promote Open Science: ICSU (International Council for Science); Organisation for Economic Co-operation and Development (OECD).
What is EOSC? The European Open Science Cloud (EOSC) project facilitates the sharing and re-use of scientific data and services, across disciplinary and state boundaries. Over the next five years, the European Union is investing two billion euro to connect fragmented European research infrastructures and realise the EOSC. The EOSC is part of the overall European Cloud Initiative, which ultimately aims to connect business, industry and public facilities through the cloud. In EOSC, it will be made mandatory for researchers to make data from EU-funded research available under the FAIR principles, i.e. data must be Findable, Accessible, Interoperable and Reusable.
Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 doi: 10.1038/sdata.2016.18 (2016). The European Open Science Cloud for Research pilot project is funded by the European Commission, DG Research & Innovation under contract no. 739563 www.eoscpilot.eu
FAIR Data Principles. According to the FAIR Data Principles, data should be: Findable – easy to find by both humans and computer systems, based on mandatory metadata descriptions that allow interesting datasets to be discovered; Accessible – stored for the long term so that they can be easily accessed and/or downloaded under well-defined licence and access conditions (open when possible, closed when necessary), whether at the level of metadata or at the level of the actual data content; Interoperable – ready to be combined with other datasets by humans as well as computer systems; Reusable – ready to be used for future research and to be processed further using computational methods.
Implementing the FAIR Principles? 15 criteria. See: http://datafairport.org/fair-principles-living-document-menu and https://www.force11.org/group/fairgroup/fairprinciples
What is DANS? https://dans.knaw.nl Mission: promote and provide permanent access to digital research resources. First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive 1989. Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005.
DANS Core Data Services. EASY: certified Electronic Archiving System for self-deposit (https://easy.dans.knaw.nl). DataverseNL: data repository at universities and other institutions (https://dataverse.nl). NARCIS: research in context, gateway to scholarly information in the Netherlands (https://www.narcis.nl).
Additional services. Data Vault for Repositories (https://data.mendeley.com/, https://datadryad.org). Training & Consultancy (http://datasupport.researchdata.nl/). Research Data Journal for the Humanities and Social Sciences (http://www.brill.com/rdj). Work in progress: Software Archive (https://www.softwareheritage.org/).
Sharing and storing data during the research. Storing and backing up files while research is active; likely to be on a networked file store or in a repository. Sharing with collaborators (also at other institutions) while research is active. Data are mutable: easy to change or delete.
https://dataverse.nl/
Dataverse repositories worldwide
Storing data and sharing data after the research. (Open) data sharing: data are stable, searchable, citable, clearly licensed. Archiving or preserving data in the long term: likely to be deposited in a digital repository; safeguarded and preserved.
EASY – User interface. 84,220 datasets published.
EASY Vault – Machine2Machine interface. Partners publish the research data in their own repository; DANS takes care of long-term preservation. Arrangements are laid down in contracts, e.g. description of the data, costs. Costs for preservation are paid by the partners.
Frequency of 0-30 dataset downloads, 2007-2017
Access to Datasets in DANS archive 2012-2018 (without Mendeley & Dryad)
NARCIS harvests 31 (+25) institutional repositories. 1.8 million publications, including 664,000 Open Access publications. 25 Universities of Applied Sciences.
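Repository harvesting of this kind is commonly done over OAI-PMH with Dublin Core metadata. The slide does not name the protocol, so that is an assumption here, and the sample response, the `oai:example.org` identifiers and the function names below are made up for illustration. The sketch parses a ListRecords-style response and counts records marked open access with the `info:eu-repo/semantics/openAccess` rights value.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
OPEN_ACCESS = "info:eu-repo/semantics/openAccess"

# Hypothetical response from a repository; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example article</dc:title>
          <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example thesis</dc:title>
          <dc:rights>info:eu-repo/semantics/closedAccess</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record: identifier, title, open-access flag."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.findall(".//oai:record", NS):
        rights = [r.text for r in rec.findall(".//dc:rights", NS)]
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "open_access": OPEN_ACCESS in rights,
        })
    return records


def count_open_access(records):
    """Count the records flagged as open access."""
    return sum(1 for r in records if r["open_access"])


if __name__ == "__main__":
    harvested = parse_records(SAMPLE_RESPONSE)
    print(len(harvested), "records,", count_open_access(harvested), "open access")
```

A portal could aggregate such per-repository counts into totals like those on the slide; paging through large result sets with resumption tokens is left out of this sketch.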
Providing metadata to international portals
Figure: publications by type (article, book part, conference paper, doctoral thesis); year label 1978.
Research data by source (14). Deposited by many researchers/institutions. 215,000 datasets, including 170,000 open data.
NARCIS collects CRIS information. 69,000 research projects (13,000 current): research by financier/by institution. 59,000 researchers: 18,000 experts + expertise; 9,200 professors and associate professors; gender division (not public); working addresses; classified (NARCIS classification). 2,900 research institutions, for example universities with underlying departments. In a CERIF-based database (euroCRIS).
RDM Training. Training: Essentials 4 Data Support, by Research Data Netherlands. Introductory course for those who (wish to) support researchers with storing, managing, archiving and sharing their research data. Target group: data supporters. http://datasupport.researchdata.nl/en/about-the-course/ Data Management Expert Guide, produced by 17 CESSDA social science data archives. This guide is written for social science researchers who are in an early stage of practising research data management. With this guide, CESSDA wants to contribute to professionalism in data management and increase the value of research data. https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide
Rijksdienst Cultureel Erfgoed
https://eosc-portal.eu/
EOSCpilot: http://eoscpilot.eu/
Targets of the EOSCpilot project. Develop an EOSC governance framework and EU open science policy. Develop pilots to integrate existing services and infrastructures between disciplines/domains (interoperability). Engage with stakeholders from all disciplines and communities, building trust and skills for Open Science.
Science Demonstrators. Fifteen Science Demonstrators show the relevance and usefulness of EOSC Services and how they enable data reuse, and will drive EOSC development.
EOSCpilot WP7: Skills. Skills landscape and gap analysis; skills workshops (how to fill the competence gaps); skills framework linking EOSC services to competences; recommendations to service providers and governance; scoping training-as-a-service in EOSC; layered model for delivering training/information. Partners: Jisc, KIT, DCC, EGI.eu, LIBER, DANS, SURF, GÉANT. “An important aspect of the EOSC is… professional data management and long term data stewardship.” (A Cloud on the 2020 Horizon, EOSC HLEG)
OpenAIRE Advance. Open Access Infrastructure for Research in Europe: promotes Open Science. Goal: to develop and maintain the infrastructure to support the OA policy of the EU. Supports H2020 OA mandates: 100% OA on scientific publications; Open Research Data Pilot: Open Data by FAIR Principles. A National Open Access Desk (NOAD) in each country. DANS leads Task Group Research Data Management. Zenodo, a catch-all repository hosted by CERN.
OpenAIRE Network: www.openaire.eu
OpenAIRE Support
RDM & Open Science training. OpenAIRE supports the National Open Access Desks (NOADs) by contributing materials and helping them to run events. DANS coordinates the Taskforce Research Data Management, which develops a train-the-trainer package for NOADs and contributes to national workshops. Webinars: https://www.openaire.eu/webinars/ Guides: https://www.openaire.eu/guides/ DANS initiated a Community of Practice for training coordinators from research infrastructures and major projects. Topics include: necessary skills for using “The” EOSC; badges and certificates for lifelong learning; a shared EOSC training catalogue for FAIR (?) training materials. The CoP organised a session at the Digital Infrastructures DI4R18 conference.
https://zenodo.org/
EOSC-hub. Goal: integrating and managing services for the European Open Science Cloud. EOSC-hub mobilises providers from 20 major digital infrastructures, EGI, EUDAT CDI and INDIGO-DataCloud, jointly offering services, software and data for advanced data-driven research and innovation. 6/29/2019
EOSC-hub Service Catalogue: https://www.eosc-hub.eu/catalogue EASY datasets in B2FIND. EASY will harvest B2SHARE.
Training: https://www.eosc-hub.eu/ Training about Data Management Planning. Training about EOSC-hub and EUDAT services, and about so-called Thematic Services. Training for and by the Core Communities.
Hot topic: Ethical and legal aspects, General Data Protection Regulation - GDPR
Expertise on protection and reuse of personal data and GDPR. Developing a decision tree. The researcher uploading data is primarily responsible; DANS can only check marginally. DataTag: tool to support decisions on required data protection, compliant with GDPR and national legislation. Using Harvard’s DataTags as starting point: http://datatags.org/
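A decision tree of this kind can be represented as nested questions with a tag at each leaf. The sketch below only shows the mechanism: the questions, the tag names (`open`, `restricted`, `closed`) and the function `assign_tag` are hypothetical and do not reproduce the DANS tool, Harvard's DataTags levels, or any legal guidance.

```python
# Hypothetical decision tree: each inner node asks a yes/no question,
# each leaf is the name of an (illustrative) data tag.
TREE = {
    "question": "contains_personal_data",
    "no": "open",
    "yes": {
        "question": "is_anonymised",
        "yes": "open",
        "no": {
            "question": "has_consent_for_sharing",
            "yes": "restricted",
            "no": "closed",
        },
    },
}


def assign_tag(answers, tree=TREE):
    """Walk the tree using the depositor's yes/no answers and return the leaf tag.

    `answers` maps question names to booleans. Only the questions on the
    path actually taken need to be answered.
    """
    node = tree
    while isinstance(node, dict):
        answer = answers[node["question"]]
        node = node["yes" if answer else "no"]
    return node


if __name__ == "__main__":
    print(assign_tag({"contains_personal_data": False}))
    print(assign_tag({
        "contains_personal_data": True,
        "is_anonymised": False,
        "has_consent_for_sharing": True,
    }))
```

Keeping the tree as data rather than as nested `if` statements would let the questions be revised without touching the traversal code.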
FREYA: Connected Open Identifiers for Discovery, Access and Use of Research Resources. The FREYA project will create and extend a robust environment for Persistent Identifiers (PIDs). It will cover a wide range of entities in the research and innovation landscape, expose and enhance the links between them, and form an essential building block of the European Open Science Cloud (EOSC) vision. FREYA has identified three products within this environment which need to be delivered: the technical framework (PID Graph), a community forum (PID Forum), and a governance model (PID Commons).
PID Graph (figure): linked persistent identifiers, with nodes labelled DOI 1, DOI 2, DOI 3, DOI 5, ORCID 4 and DAI 5.
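The idea of a PID graph can be sketched as an undirected graph whose nodes are identifiers and whose edges are links between them (for example, author to publication, or publication to dataset). The node labels below are taken from the figure, but the edges are not recoverable from the slide, so the `LINKS` list is hypothetical, as are the function names.

```python
from collections import defaultdict, deque

# Hypothetical links between the identifiers shown in the figure.
LINKS = [
    ("ORCID 4", "DOI 1"),
    ("ORCID 4", "DOI 2"),
    ("DOI 1", "DOI 3"),
    ("DOI 3", "DOI 5"),
]


def build_graph(links):
    """Build an undirected adjacency map from (pid, pid) pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def reachable(graph, start, max_hops):
    """Breadth-first search: map each PID within `max_hops` of `start` to its distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance


if __name__ == "__main__":
    graph = build_graph(LINKS)
    # Works directly linked to the researcher, then works one link further away.
    print(sorted(reachable(graph, "ORCID 4", 1)))
    print(sorted(reachable(graph, "ORCID 4", 2)))
```

Multi-hop queries like the second one are what distinguish a connected PID graph from a flat list of identifiers: they surface resources that are related only indirectly.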
Data <> Publication linking
Open Access publications in Unpaywall
Fostering FAIR Data Practices in Europe: a Horizon 2020 project. FAIRsFAIR “Fostering FAIR Data Practices in Europe” has received funding from the European Union’s Horizon 2020 project call H2020-INFRAEOSC-2018-5-2018-2019 (c), grant agreement 831558.
FAIRsFAIR in a nutshell. DANS is project coordinator. Starting date: 1 March 2019. Duration: 36 months. 22 partners from 8 countries; 6 core partners. Budget: 10 million. Strategic cooperation with RDA, CODATA, WDS, GO-FAIR, the ESFRI clusters …and many other partners.
Overall aim. FAIRsFAIR assists the European Open Science Cloud (EOSC) governance bodies to deliver FAIR-aligned Rules of Participation in the EOSC. FAIRsFAIR will open up and share all knowledge, expertise, guidelines, implementations, new trajectories, courses and education needed to turn the FAIR Principles into reality. FAIRsFAIR works on FAIR certification, competence centres, education and training.
Surveys about open data. How and why researchers share data (and why they don't) (2014): https://hub.wiley.com/community/exchanges/discover/blog/2014/11/03/how-and-why-researchers-share-data-and-why-they-dont?referrer=exchanges Towards open Research – Practices, experiences, barriers and opportunities (2016): https://figshare.com/articles/Survey_of_Wellcome_researchers_and_their_attitudes_to_open_research/4055448 Open Data Research – a researcher perspective (2017): http://www.elsevier.com/__data/assets/pdf_file/0004/281920/Open-data-report.pdf Providing researchers with the skills and competencies they need to practise Open Science (2017): https://cdn1.euraxess.org/sites/default/files/policy_library/ec-rtd_os_skills_report_final_complete_2207_1.pdf The State of Open Data (2017): https://figshare.com/articles/_/5481187
Providing researchers with the skills and competencies they need to practice Open Science. Report by the EC Working Group on Education and Skills under Open Science. Survey: answered by 1,277 researchers in Europe (nearly 50% were PhD candidates). Open Science changes the research landscape: research is conducted with a high degree of transparency, collegiality and research integrity. Training in the necessary skills and professional development of researchers are needed. Three quarters of the researchers indicate that they did not participate in open access or open data training, but would like to. Data management is considered relevant for research, but there is insufficient data support and infrastructure at institutional level.
Conclusion. Open Science and FAIR data are inevitable: see the development of the EOSC. There is resistance from publishers and researchers (not a topic in this presentation). But there is already a lot of support and training, by DANS, by other institutes and in European projects!
Any questions? Acknowledgements: thanks to some DANS colleagues for slides. elly.dijk@dans.knaw.nl http://dans.knaw.nl/