Being FAIR – what next? - FORCE11 Berlin Conference -


Being FAIR – what next?
FORCE11 Berlin Conference
Peter Wittenburg, Max Planck Society, Max Planck Computing & Data Facility
RDA Europe Director, former RDA TAB member
www.rd-alliance.org - @resdatall CC BY-SA 4.0

Joining Forces
It is a great opportunity to link up and synchronize with FORCE11! After all, we share the same intentions.
The RDA Council would like to reach a collaboration agreement.
In essence, RDA shares the FAIR principles. We see them as a global language based on broad agreement.
RDA focuses on specifications of components, interfaces, procedures and guidelines that improve data sharing and re-use, i.e. the building blocks of an interoperable eco-system of data infrastructures.
"Eco-system" is a nice description of our big dilemma in research and industry when integrating data from different sources.
17/09/2018 www.rd-alliance.org - @resdatall CC BY-SA 4.0

What is RDA? Why did we build RDA? What are typical results of RDA? How further?

RDA worldwide
Total RDA community members: >6000, ~700 active, from >123 countries.
Practitioners from many disciplines & sectors.
Simple governance (WGs/IGs, Council, TAB, OAB).
Funds from NSF, EC, AU; more countries are ready to support us (SA, CA, CN, etc.).
>80 organisational members.
Guiding Principles: Openness, Consensus, Balance, Harmonization, Community-driven, Non-profit and technology-neutral.
rd-alliance.org/about-rda

Organisational & Affiliate Members
43 Organisational Members, 8 Affiliate Members.
rd-alliance.org/get-involved/organisational-membership

Started at ICRI 2012 Copenhagen
75 data experts from 15 countries met at the DAITF workshop and discussed the need for global and cross-disciplinary agreements in the data domain.
Key was Larry Lannom's layer presentation.
rd-alliance.org/about-rda

Official RDA start in March 2013
March 2013: P1 in Gothenburg (2 plenaries per year), followed by Washington, Dublin, Amsterdam, San Diego, Paris, Tokyo, Denver, Barcelona, Montreal, Berlin.
Agreed on 6 basic principles for work and participation: openness, consensus, balance, harmonisation, community-driven, non-profit.
Agreed on a grass-roots approach with WGs and IGs instead of a reference model/architecture.
Some people doubt whether this can lead to success. It does indeed look somewhat chaotic, but ...
We need to accept that some outputs will not be taken up. Don't we know this from IETF RFCs?
rd-alliance.org/about-rda

RDA Interest (IG) & Working Groups (WG) by Focus (1)
Total 81 groups: 30 Working Groups & 51 Interest Groups
Domain Science - focused: Global Water Information IG; Agrisemantics WG; Linguistics Data Interest Group; BioSharing Registry WG; Health Data IG; Fisheries Data Interoperability WG; Mapping the Landscape IG; On-Farm Data Sharing (OFDS) WG; Marine Data Harmonization IG; Rice Data Interoperability WG; Quality of Urban Life IG; Wheat Data Interoperability WG; RDA/CODATA Materials Data, Infrastructure & Interoperability IG; Research data needs of the Photon and Neutron Science community IG; Agricultural Data IG (IGAD); Biodiversity Data Integration IG; Small Unmanned Aircraft Systems' Data IG; Chemistry Research Data IG; Structural Biology IG; Digital Practices in History and Ethnography IG; Weather, Climate and air quality IG; Geospatial IG
Community Needs - focused: Data for Development IG; Development of Cloud Computing Capacity and Education in Developing World Research IG; Certification and Accreditation for Data Science Training and Education WG; Education and Training on handling of research data IG; RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World WG; Ethics and Social Aspects of Data IG; Teaching TDM on Education and Skill Development WG; International Indigenous Data Sovereignty IG; Archives & Records Professionals for Research Data IG
rd-alliance.org/groups

RDA Interest (IG) & Working Groups (WG) by Focus (2)
Data Stewardship and Services - focused: Long tail of research data IG; Preservation e-Infrastructure IG; Brokering Framework WG; Preservation Tools, Techniques, and Policies IG; WDS/RDA Assessment of Data Fitness for Use WG; RDA/WDS Certification of Digital Repositories IG; RDA/WDS Publishing Data Workflows WG; RDA/WDS Publishing Data Cost Recovery for Data Centres IG; Active Data Management Plans IG; Data in Context IG; Repository Platforms for Research Data IG; Research Data Provenance IG; Data Rescue IG; Virtual Research Environments IG; Data Versioning IG; Domain Repositories IG; Libraries for Research Data IG
Base Infrastructure - focused: Brokering IG; Federated Identity Management IG; Array Database Assessment WG; Data Type Registries WG; Metadata IG; PID IG; Metadata Standards Catalog WG; Vocabulary Services IG; Metadata Standards Directory WG; PID Kernel Information WG; Data Fabric IG; Data Foundations and Terminology IG; Big Data IG
rd-alliance.org/groups

RDA Interest (IG) & Working Groups (WG) by Focus (3)
Reference and Sharing - focused: QoS-DataLC Definitions WG; International Materials Resource Registries WG; Data Citation WG; National Data Services IG; Data Description Registry Interoperability WG; Data Security and Trust WG; RDA/CODATA Legal Interoperability IG; Reproducibility IG; Empirical Humanities Metadata WG; Data Discovery Paradigms IG; Provenance Patterns WG; Repository Core Description WG; RDA/WDS Publishing Data Bibliometrics WG; Research Data Repository Interoperability WG; Research Data Collections WG
Partnership Groups: RDA/TDWG Metadata Standards for attribution of physical and digital collections stewardship WG; RDA/NISO Privacy Implications of Research Data Sets IG; RDA/WDS Scholarly Link Exchange Working Group; RDA/WDS Publishing Data IG; ELIXIR Bridging Force IG
rd-alliance.org/groups

RDA Europe Project - RDA DE/UK/ES/FI/etc.
Community building: lots of interactions, bringing data professionals together; discussing RDA results and stimulating new groups; doing training, hackathons, etc.
Examples:
GEDE (Group of EU Data Experts) brings data professionals from 47 ESFRI projects together to synchronise on PIDs, repositories, versioning, etc.
eIRG: interactions with e-Infrastructures.
Many national meetings in almost all EU countries.
RDA DE: a community of about 200 data experts.
Intensifying discussions with industry: we share the same challenges, focusing on solutions and building software.
rd-alliance.org/about-rda

3 full days: RDA Plenary (21-23 March)
2 days: Co-located and Associated Events (19-20 March)
1-2 days: Industrial Side Event (19-20 March)

What is RDA? Why did we build RDA? What are typical results of RDA? How further?

DOBES – Humanities/Languages (2000-12)
http://dobes.mpi.nl/
~70 global teams; ~80 TB in online archive; 4 dynamic external copies; remote archives.
How can one use data to validate theories about the evolution of languages (and cultures) over thousands of years?
How can one understand which languages are more "economic" than others?
Revolution in the humanities: the scientific paper is no longer the only goal; it is about repurposing data.

DOBES was almost FAIR (specs in 2002)
To be Findable:
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.
To be Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
A1.1. the protocol is open, free, and universally implementable.
A1.2. the protocol allows for an authentication and authorization procedure, where necessary.
A2. metadata are accessible, even when the data are no longer available.
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.
To be Re-usable:
R1. (meta)data have a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.
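A rough illustration of how a few of the principles listed above could be checked by machine. This is only a sketch under stated assumptions: the `Record` class, its field names and the identifier `hdl:example/0001` are made up for the example and are not taken from DOBES or from any RDA specification. "Rich metadata" (F2) is reduced here to "some metadata exists", which is far weaker than the principle intends.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    """Hypothetical description of one archived resource (field names are assumptions)."""
    pid: Optional[str] = None                               # F1
    metadata: Dict[str, str] = field(default_factory=dict)  # F2, F4
    indexed: bool = False                                   # F3
    license: Optional[str] = None                           # R1.1
    provenance: Optional[str] = None                        # R1.2


def fair_gaps(record: Record) -> List[str]:
    """Return the labels of the checked principles that the record does not meet."""
    gaps = []
    if not record.pid:
        gaps.append("F1")
    if not record.metadata:
        gaps.append("F2")
    if not record.indexed:
        gaps.append("F3")
    # F4: the metadata must name the identifier of the data it describes.
    if not record.pid or record.metadata.get("identifier") != record.pid:
        gaps.append("F4")
    if not record.license:
        gaps.append("R1.1")
    if not record.provenance:
        gaps.append("R1.2")
    return gaps


complete = Record(
    pid="hdl:example/0001",  # made-up identifier
    metadata={"identifier": "hdl:example/0001", "title": "Example recording"},
    indexed=True,
    license="CC BY-SA 4.0",
    provenance="example provenance note",
)
print(fair_gaps(complete))  # []
print(fair_gaps(Record()))  # every checked principle is reported as a gap
```

Only six of the fifteen principles are covered; the accessibility and interoperability principles depend on protocols and vocabularies and cannot be judged from a single record in this way.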

CLARIN Research Infrastructure (2006-...)
Bad state & organisation of data; quality differences; heterogeneity at all levels; biggest problem: semantic mapping; metadata for workflows is special; etc.
Many different community services: component-based metadata setup; concept registry; Virtual Language Observatory; distributed Web-Workflow tool; etc.
DSA/WDS compliant centres (repositories, etc.).
A fairly FAIR-compliant setup.

EUDAT Data Infrastructure
Some data centres; 5 initial communities in the driving seat.
Definition of services and task forces to build them.
B2SAFE was and is difficult: no single API would do to easily replicate data; data organisations are different; metadata semantics are not ready for machine processing; etc.
Not much fits together yet.
Did not change data practices in the labs.

Do we have a problem?
Data-intensive projects and large data aggregations are a fact, but ...
Data is hardly visible and not accessible (only 18% of data in registered repositories is accessible).
The data domain is fragmented; data integration is a costly job (identification, organisation and description of data, etc.).
80% of created data is no longer accessible after short periods.
75-80% (RDA EU, MIT) of data scientists' time is lost to data integration/management work.
60% of the costs of data-intensive projects are spent on pure integration tasks.
Sorting out rights is a never-ending story.

Will IoT change the game?
50 billion smart devices (Intel) will create true data monsters.
Continuous streams with high granularity.
Optimisations and real-time decisions required.
Much more re-purposing of data in various contexts.
Are we fit for these challenges? No, our methods are not scalable. We need a change!

What is RDA? Why did we build RDA? What are typical results of RDA? How further?

Fundamental Observation
Scientific Analytics: leave flexibility, even more opportunities.
Management, Curation, Access, Referencing (PID, AAI, MD, WF, registries, repositories, meta-semantics, etc.): reduce heterogeneity & costs, make solutions stronger, achieve sustainability.
Scientific Creation: leave flexibility, even more opportunities.
From 1000s of silo solutions towards a tree with a stable trunk.

RDA DFT – Simple, powerful data model
Data Foundation and Terminology WG
The core model is very simple. If all software developers implemented this model, we would get an enormous increase in efficiency.

PID resolution to state information
PID Information Type WG & PID Kernel Information WG
Specify principles of interoperability.
Specify core types such as "checksum" to allow machine interpretation.
Be compliant with ITU X.1255.
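The idea of resolving a PID to machine-interpretable state information can be sketched as follows. This is a minimal illustration, not the working groups' specification: the in-memory dictionary stands in for a real resolution service, and the attribute names (`location`, `checksum`, `checksum_algorithm`) as well as the identifier and URL used below are assumptions chosen for the example.

```python
import hashlib
from typing import Dict

# Hypothetical in-memory stand-in for a PID resolution service.
_PID_RECORDS: Dict[str, Dict[str, str]] = {}


def register(pid: str, location: str, content: bytes) -> None:
    """Store state information for a PID, including a checksum of the content."""
    _PID_RECORDS[pid] = {
        "location": location,
        "checksum": hashlib.sha256(content).hexdigest(),
        "checksum_algorithm": "sha256",
    }


def resolve(pid: str) -> Dict[str, str]:
    """Resolve a PID to a copy of its state information (KeyError if unknown)."""
    return dict(_PID_RECORDS[pid])


def is_intact(pid: str, content: bytes) -> bool:
    """Check retrieved content against the checksum held in the PID record."""
    state = resolve(pid)
    digest = hashlib.new(state["checksum_algorithm"], content).hexdigest()
    return digest == state["checksum"]


register("hdl:example/0002", "https://example.org/data/0002", b"field recording")
print(is_intact("hdl:example/0002", b"field recording"))  # True
print(is_intact("hdl:example/0002", b"altered content"))  # False
```

Because the checksum is a typed attribute of the PID record, software can verify a copy without knowing anything about the repository that holds it.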

Data Type Registry
Result: a registry for data types, linking structure/semantics with functions.
You get an unknown file, pull it onto the DTR and the content is visualized.
You find a tag and know how to interpret it.
No free lunch: someone needs to register and define the type.
The PIT demo is already working with the DTR.
Various sciences make use of it.
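The linking of a type with a function can be sketched with a small lookup table. This is a toy model under stated assumptions, not the interface of the actual Data Type Registry: the class `DataTypeRegistry`, the type identifier `example/csv-table` and the handler `describe_csv` are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict


class DataTypeRegistry:
    """Toy registry mapping a type identifier to a function that interprets the data."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[bytes], str]] = {}

    def register(self, type_id: str, handler: Callable[[bytes], str]) -> None:
        # "No free lunch": someone has to define and register the type first.
        self._handlers[type_id] = handler

    def interpret(self, type_id: str, payload: bytes) -> str:
        if type_id not in self._handlers:
            raise KeyError(f"type {type_id!r} has not been registered")
        return self._handlers[type_id](payload)


def describe_csv(payload: bytes) -> str:
    """Summarise a comma-separated table: row count and column names."""
    lines = payload.decode("utf-8").splitlines()
    columns = lines[0].split(",")
    return f"{len(lines) - 1} rows; columns: {', '.join(columns)}"


registry = DataTypeRegistry()
registry.register("example/csv-table", describe_csv)
print(registry.interpret("example/csv-table", b"a,b\n1,2\n3,4\n"))
# prints: 2 rows; columns: a, b
```

An unknown type identifier raises a `KeyError`, which mirrors the point on the slide that nothing can be interpreted until someone has registered the type.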

Global Digital Object Cloud
Taken from Larry Lannom, CNRI.
Essentially global virtualisation towards a domain of DOs.
The concept is realised in cloud systems, which use local hashes to implement virtualisation.

What is RDA? Why did we build RDA? What are typical results of RDA? How further?

How further?
Should we go ahead with the grass-roots approach? What else is needed?
Reference models/architectures as discussed in industry: a holistic approach helps in understanding what we do.
Flexible testbeds are crucial, small and large.
We need to link up with the data industry: the huge solution space is hampering interoperability and investments.
Do we also need to rethink interoperability as the Internet folks did?
We need to join efforts, as the task is huge: CODATA, FORCE11, RDA, W3C, etc. No competition, but collaboration at WG level.
We need to use EOSC to make steps; not all stakeholders agree.

RDA is ready to extend collaborations, since the task is huge.
RDA Global: Email - enquiries@rd-alliance.org; Web - www.rd-alliance.org; Twitter - @resdatall; LinkedIn - www.linkedin.com/in/ResearchDataAlliance; Slideshare - http://www.slideshare.net/ResearchDataAlliance
RDA Europe: Email - info@europe.rd-alliance.org; Twitter - @RDA_Europe
Email - peter.wittenburg@mpcdf.mpg.de
CC BY-SA 4.0