Agenda welcome and goals (Peter)

Presentation transcript:

Joint DFIG – PWWG Meeting Amy Nurnberger, Larry Lannom, Peter Wittenburg

Agenda
Breakout 1: Discussion about Guidelines/Recommendations
Breakout 2: configuration building and Minimal PID Types
Breakout 4: DFIG Core Session
Breakout 5: Joint session with Brokering Group
Breakout 7: Joint meeting with Publishing Data Workflows
14.00 welcome and goals (Peter)
14.05 DFIG view on scientific data workflows (Peter)
14.20 PWWG view on scientific data and publishing workflows (Amy)
14.35 comparison, overlap and differences in views (Larry)
14.45 discussion (Larry and Amy)
15.30 end

Intentions and Goals
Comparing core documents from DFIG and Publishing Workflow IG shows that there is much overlap despite different starting points.
There are barriers in culture and terminology.
There is some tradition of not talking to each other.
RDA is about bridge building.
This session is about building a bridge and getting together.
We need to understand how we can integrate the approaches, since we address overlapping issues.
How to do this -> discussion

DFIG view on scientific data workflows Peter

Lab Reality – slowly changing
EU survey: 75% of researchers’ time spent on DM/A
M. Brodie (MIT): 80%
Something is fundamentally wrong!!
Far away from data publication considerations
Are curiosity-driven research and chaos twins?
Is DIS different?
Clear trends for all: data orientation; more, and more complex, data
Automatic workflows would change this, but there are many exceptions, parameter choices and human interventions
Lack of experts to create flexible software solutions
How can we help and change? Short-term and long-term solutions

An illustration
[Figure labels: Collection X, Collection Y, Collection Z, Pattern Extractor, Feature Sets, Smart Machine, Results, Iterations]

An illustration
Still a lot of handwork and ad hoc scripting involved.
Also many iterations to find optimal features.
It is not obvious whether evidence will be found.
Such research takes years.
When to register what?
When to refer to what?
When to create metadata for what?
When to publish and cite what?
Which components would improve?
[Figure labels: Collection X, Collection Y, Collection Z, Pattern Extractor, Feature Sets, Smart Machine]

Data Fabric Cycle
Observations, experiments, simulations, etc.
This slide indicates the continuous cycle of creating raw data or derived data based on collections of existing data.
Identify components that could improve (stepwise).

From abstract fabrics to concrete compositions
Common Components & Services
Specific Components & Services
Closing urgent gaps: t-repositories, PID system, MD schemas, MD editors, vocabularies, etc.
Global Digital Object Cloud
Of course it would be useful to consider publication requirements while building these compositions.

Conclusions
Collecting use cases and facts from many labs.
Understand from heterogeneous practices how to come to agreed components.
Addressing the data cycle in the labs, where publication is often not an issue for quite some time.
However, the requirements for data management, accessibility and publication are getting tighter.
So we need to consider these requirements and map them to publication requirements.
Need to provide easy transitions.
Thus bridge conceptualisation & terminology.
Need to overcome social barriers.

RDA/WDS Data Publishing Workflows WG Amy Nurnberger

DPWWG – Where we’ve been
What is the current data publishing workflow landscape across disciplines and institutions?

Data publishing entities
25 data publishing entities assessed in terms of discipline, function, data formats, and roles
The assignment of persistent identifiers (PIDs) to datasets, and the PID type used, e.g. DOI, ARK, etc.
Peer review of data (e.g. by researcher and by editorial review)
Curatorial review of metadata (e.g. by institutional or subject repository)
Technical review and checks (e.g. for data integrity at repository/data centre on ingest)
Discoverability: was there indexing of the data, and if so, where?
Links to additional data products (data paper; review; other journal articles) or “stand-alone” product
Links to grant information, where relevant, and usage of author PIDs
Facilitation of data citation
Reference to a data life cycle model
Standards compliance

Key components of data publishing Austin, C. C., Bloom, T., Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., … Whyte, A. (2016). Key components of data publishing: using current best practices to develop a reference model for data publishing. http://doi.org/10.1007/s00799-016-0178-2

Workflows Ibid

Workflows, cont. Ibid

What’s missing?

What’s missing? This stuff
“…early interactions between researchers and a suitable data repository (or repositories), while data is processed and prepared for sharing.”
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis

What’s missing?
Deliberate integration of sundry products from the research process, e.g., software, code, models, etc.
Integration/interoperability between data processing tools and platforms
Disciplinary differences in data conception, collection, & processing
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis

What’s needed
Small, modular, shareable components that help ensure platforms offer sufficient flexibility to support variety
Research workflow solutions that enable straightforward data and metadata generation in accordance with community defined and accepted standards
Commit to the use of PIDs and include versioning capabilities
Clear documentation that can offer direct benefits to repository depositors and users
Curators
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis