PID centric fabric constructed piece by piece


PID centric fabric constructed piece by piece
Beth Plale, Tobias Weigel
Drawn in part from RDA PID Training, Garching, 2016/08/31

Objective
- RDA Data Fabric examines fabric composition
- Composing from RDA Recommendations (largely but not exclusively)
- PID related recommendations are particularly powerful:
  - RDA Persistent Identifier Types API Recommendation
  - RDA Data Type Registry Recommendation

Objectives of this Session
- The PIT API WG stopped short of recommending minimal metadata associated with PIDs
- There is no universal set of PID minimal metadata that will be agreed upon by all
- Organize 2 to 3 small groups of RDA members, each bound by some shared interest, who will meet between now and Barcelona (spring 2017) and define a minimal set that works for them

What is killer app for this? Bringing universality to provenance
Provenance relates Data Objects to one another through:
- Revision: change to an object through time
- Derivation: attribution of one data object to those data objects that influenced its creation
- Replication: relating identical objects to one another
- Metadata: relating a metadata DO to the DO itself
- Part of: data objects that are part of the same collection
- ...
The W3C PROV standard for provenance is well defined

What is killer app for this? Universality of provenance
- How: a provenance record is pointed to as part of the minimal metadata record associated with a PID; the provenance definition is available in the DTR
- Gives: a well-defined map of the relationships of one data object to others
- Why important?: data provenance is siloed. This approach breaks down the silos. It has potential for universality of data provenance, a goal that has been elusive since data provenance inception in 2005
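To make the "well defined map" idea concrete, here is a minimal Python sketch of following provenance pointers from one PID record to the next. Everything in it is an assumption for illustration: the `pid:...` identifiers, the `provenance` key, and the in-memory `PID_RECORDS` dictionary standing in for a real PID resolver are hypothetical and are not part of any RDA recommendation.

```python
# Hypothetical in-memory stand-in for a PID resolver: each PID maps to a
# minimal metadata record that carries a small provenance record.
PID_RECORDS = {
    "pid:raw": {"provenance": {}},
    "pid:cleaned": {"provenance": {"prov:wasDerivedFrom": "pid:raw"}},
    "pid:figure": {"provenance": {"prov:wasDerivedFrom": "pid:cleaned"}},
}


def lineage(pid, records=PID_RECORDS):
    """Follow prov:wasDerivedFrom pointers and return the chain of source PIDs."""
    chain = []
    seen = {pid}  # guards against cycles in malformed records
    current = pid
    while True:
        provenance = records.get(current, {}).get("provenance", {})
        source = provenance.get("prov:wasDerivedFrom")
        if source is None or source in seen:
            break
        chain.append(source)
        seen.add(source)
        current = source
    return chain


print(lineage("pid:figure"))
```

Because every record is reached through its PID, the same traversal would work across repositories, which is the silo-breaking property the slide argues for.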

Minimal provenance (as JSON-LD)
{
  "@context": {
    "prov": "http://www.w3.org/ns/prov#"
  },
  "prov:wasDerivedFrom": "IDENTIFIER",
  "prov:revisionOf": "IDENTIFIER",
  "prov:primarySourceOf": "IDENTIFIER",
  "prov:quotationOf": "IDENTIFIER",
  "prov:specializationOf": "IDENTIFIER",
  "prov:alternateOf": "IDENTIFIER",
  "prov:hadMember": "IDENTIFIER",
  "prov:memberOf": "IDENTIFIER"
}
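A record of this shape could be assembled programmatically along the following lines. This is a sketch under stated assumptions: the helper name `minimal_provenance` and the example identifiers are hypothetical, and the relation names follow the slide (with `quotationOf` spelled out correctly). Note that several of the slide's names differ from the property names in the W3C PROV-O vocabulary, which uses `prov:wasRevisionOf`, `prov:hadPrimarySource`, and `prov:wasQuotedFrom` and defines no `prov:memberOf`.

```python
import json

PROV_NS = "http://www.w3.org/ns/prov#"

# Relation names as listed on the slide, not necessarily PROV-O terms.
SLIDE_RELATIONS = (
    "wasDerivedFrom", "revisionOf", "primarySourceOf", "quotationOf",
    "specializationOf", "alternateOf", "hadMember", "memberOf",
)


def minimal_provenance(**relations):
    """Build a minimal provenance record from relation-name/identifier pairs."""
    unknown = sorted(set(relations) - set(SLIDE_RELATIONS))
    if unknown:
        raise ValueError(f"unknown relation(s): {', '.join(unknown)}")
    record = {"@context": {"prov": PROV_NS}}
    for name in SLIDE_RELATIONS:  # keep the slide's ordering
        if name in relations:
            record[f"prov:{name}"] = relations[name]
    return record


# Hypothetical identifiers, used only to show the shape of the output.
record = minimal_provenance(
    wasDerivedFrom="EXAMPLE-PID-1",
    revisionOf="EXAMPLE-PID-2",
)
print(json.dumps(record, indent=2))
```

Only the relations that actually apply to a data object need to be present, which keeps the record small enough to sit alongside the other minimal metadata of a PID.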

Provenance definition in Data Type Registry
http://pragma8.cs.indiana.edu:8080/pragmapit-ext-0.2/pitapi/generic/20.5000.347/18536afecc5e6ca6ab41

Approach: core profile and community extensions
[Figure: a core profile extended by community profiles. Labels: Core profile, Climate sciences, Material sciences, Linguistics. Attribute lists shown:
- Size, Fixity key, Data Provenance, Policy, ...
- Size, Fixity key, Timestamp – Creation, Timestamp – Last Mod, Data Provenance, Policy, Owner, ...
- Size, Granularity, Data Provenance, Policy, ...]
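The three attribute lists on this slide share several elements, and the split between recurring and community-specific elements can be computed directly. The sketch below assumes nothing about which list belongs to which community: the labels `profile_a`, `profile_b`, and `profile_c` are hypothetical placeholders, and the trailing "..." entries are left out.

```python
# Attribute lists as they appear on the slide; the labels are placeholders.
profiles = {
    "profile_a": ["Size", "Fixity key", "Data Provenance", "Policy"],
    "profile_b": ["Size", "Fixity key", "Timestamp – Creation",
                  "Timestamp – Last Mod", "Data Provenance", "Policy", "Owner"],
    "profile_c": ["Size", "Granularity", "Data Provenance", "Policy"],
}


def recurring_elements(profiles):
    """Return the elements present in every profile, in first-profile order."""
    lists = list(profiles.values())
    first, others = lists[0], lists[1:]
    return [name for name in first if all(name in other for other in others)]


def specific_elements(profiles):
    """Return, per profile, the elements that are not shared by all profiles."""
    shared = set(recurring_elements(profiles))
    return {
        label: [name for name in names if name not in shared]
        for label, names in profiles.items()
    }


print(recurring_elements(profiles))
print(specific_elements(profiles))
```

The recurring elements are natural candidates for a shared core, while the remainder would stay in community extensions.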

What is a profile and how does it relate to PID records?
- Base assumption: there is a minimal core set of information associated with each PID
- The minimal set should be useful not only to the maintainer of the Data Object, but should also facilitate the DO's discovery and use
- Each user community may design its own profiles
- No single size fits all: PID profiles are registered and referenced; they may differ between communities, but recurring elements should be reused
- Recurring elements: Size, Fixity Key, Timestamps, Data Provenance, ...

Example profile registered in Data Type Registry

Property name                 Target type             Mandatory?
DO Location                   URL                     Yes
Policy                        Policy specification*
Time stamp (last modified)    Date/Time
Data provenance               PID
Deletion flag                 Boolean                 No
Deletion reason               String

Would Policy be an attribute you think important to your community's profile? Why?
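A consumer of such a profile might check a PID record against it roughly as follows. This is a sketch under loud assumptions: the Python types are stand-ins for the registered target types (for example, `str` for URL, PID, and Policy specification), the mandatory flag is `None` wherever the slide text states no value, and `validate` is a hypothetical helper, not a Data Type Registry API.

```python
from datetime import datetime

# (property name, Python stand-in for the target type, mandatory flag)
# None means the slide text does not state whether the property is mandatory.
EXAMPLE_PROFILE = [
    ("DO Location", str, True),
    ("Policy", str, None),
    ("Time stamp (last modified)", datetime, None),
    ("Data provenance", str, None),
    ("Deletion flag", bool, False),
    ("Deletion reason", str, None),
]


def validate(record, profile=EXAMPLE_PROFILE):
    """Return a list of problems found when checking a record against a profile."""
    problems = []
    for name, py_type, mandatory in profile:
        if name not in record:
            # Only properties explicitly marked mandatory are required.
            if mandatory is True:
                problems.append(f"missing mandatory property: {name}")
            continue
        if not isinstance(record[name], py_type):
            problems.append(f"{name}: expected {py_type.__name__}")
    return problems


# Hypothetical record used only to exercise the check.
example_record = {
    "DO Location": "http://example.org/objects/1",
    "Time stamp (last modified)": datetime(2016, 8, 31),
    "Deletion flag": False,
}
print(validate(example_record))
```

Returning a list of problems instead of raising on the first one lets a data provider see every mismatch in a single pass.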

Take pulse of room
- How many of you are data providers?
- How many are interested in tool building to consume data under the minimal PID-DTR model?
Questions
End of public portion of meeting. Remainder of meeting targeted to those who are interested in working on this topic

Parcel interested parties into groups; begin discussion within group
Next steps
- Define criteria by which we self-organize into small groups to work Sep 2016 – Mar 2017
- My interest is as:
  - data provider: framework for analysis of HathiTrust 14M digitized books from university libraries (Plale, director)
  - data consumer: PRAGMA, Pacific Rim partners in facilitating shared computing and data sharing (Plale, steering board)