Power of PID kernel information


Beth Plale
Indiana University, Bloomington, Indiana, USA
plale@indiana.edu
SEAIP 2016

Conceptual model: PID Kernel Metadata
- A Handle (identifier) points to a data object, which is treated as a black box.
- The Handle also carries PID Kernel Metadata, a set of properties: Size, Checksum, Timestamps, Version, Pointer to data object, ...
- A set of attributes is one Profile of PID Kernel Metadata.
- Optimally there is a minimal number of profiles, e.g., one per community/discipline.
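To make the conceptual model concrete, here is a minimal sketch of one kernel record as a small data structure. The class name, field names, units, and all example values are assumptions chosen for illustration; the slide names only the properties (size, checksum, timestamps, version, pointer to data object), not a schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class KernelRecord:
    """Hypothetical kernel metadata kept alongside a Handle."""
    handle: str      # the identifier itself
    size: int        # size of the data object in bytes (unit assumed)
    checksum: str    # fixity value, e.g. a hex digest
    created: str     # timestamp, ISO 8601 string assumed
    modified: str    # timestamp, ISO 8601 string assumed
    version: str
    location: str    # pointer to the black-box data object


# Made-up example values; the handle and URL are placeholders.
record = KernelRecord(
    handle="prefix/example-object",
    size=1024,
    checksum="0" * 64,
    created="2016-01-01T00:00:00Z",
    modified="2016-02-01T00:00:00Z",
    version="2",
    location="https://repository.example/objects/example-object",
)

# A client can inspect these properties without retrieving the object itself.
properties = asdict(record)
```

The point of the sketch is that everything a client needs for a first decision sits in the record, so the data object can stay a black box.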

What a profile is
- Base assumption: there is a minimal set of information associated with each PID.
- Names for this metadata (in flux): kernel, casing, gateway metadata.
- Kernel metadata should be useful to the maintainer of the data object, discovery clients, and data reuse research.
- Each user community may design its own profiles. No single size fits all, but the full benefits are realized when the set of profiles is minimal.
- Example elements: Size, Fixity, Key Timestamps, Is-derivation-of, ...
- PID profiles are registered and referenced; they may differ between communities, but some elements recur.
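The idea of registered profiles with recurring elements can be sketched as follows. The profile names, the attribute spellings, and the in-memory registry are all hypothetical; the slide does not define how profiles are stored or referenced.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical registry: profile name -> attribute names the profile requires.
PROFILES: Dict[str, FrozenSet[str]] = {
    "example-science-profile": frozenset(
        {"size", "fixity", "timestamps", "is-derivation-of"}
    ),
    "example-humanities-profile": frozenset({"size", "fixity", "timestamps"}),
}


def missing_attributes(record: Dict[str, object], profile_name: str) -> Set[str]:
    """Return the attributes the named profile requires but the record lacks."""
    return set(PROFILES[profile_name]) - set(record)


def recurring_elements() -> Set[str]:
    """Return the attributes shared by every registered profile."""
    return set(frozenset.intersection(*PROFILES.values()))
```

For example, `missing_attributes({"size": 10, "fixity": "abc"}, "example-humanities-profile")` reports that `timestamps` is absent, and `recurring_elements()` returns the attributes common to both example profiles.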

Imagine an Internet-scale data client that is handed a list of a billion IDs.
Case 1: How does the client quickly sift through the list to find the research data?
Case 2: Once the client has winnowed the list down to research data, how does it then quickly discard the fakes?
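The slide poses these two cases as open questions. One possible reading, sketched below, is that the client decides from kernel metadata alone and never fetches a data object. The `resolve` callable, the `type` attribute, its `research-data` value, and the rule that a record without a checksum or location is suspect are all assumptions made for illustration, not criteria given in the talk.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional

Record = Dict[str, str]


def sift(
    ids: Iterable[str],
    resolve: Callable[[str], Optional[Record]],
) -> Iterator[str]:
    """Yield the IDs whose kernel record passes both checks."""
    for pid in ids:
        record = resolve(pid)  # hypothetical lookup of the kernel record
        if record is None:
            continue  # the ID does not resolve to any record
        if record.get("type") != "research-data":
            continue  # Case 1: not research data
        if not record.get("checksum") or not record.get("location"):
            continue  # Case 2: incomplete record, treated here as suspect
        yield pid
```

Because `sift` is a generator, it streams through the list one ID at a time instead of holding a billion records in memory.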

What is the killer app for this? Universal Provenance
Provenance relates a Data Object to others through relationships:
- Is-derivation-of
- Is-revision-of
- Is-new-version-of
- Is-part-of
- Is-supplement-to | Is-supplemented-by
- Has-metadata | Is-metadata-for
Making provenance relationships required
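If each kernel record carried a relationship such as Is-derivation-of, a client could walk provenance across objects using identifiers alone. The sketch below assumes a hypothetical in-memory mapping from PID to kernel record and the attribute name `is-derivation-of`; both are illustrative choices, not part of the talk.

```python
from typing import Dict, List, Optional, Set


def derivation_chain(pid: str, records: Dict[str, Dict[str, str]]) -> List[str]:
    """Follow is-derivation-of links from pid back to its earliest known source."""
    chain: List[str] = []
    seen: Set[str] = set()
    current: Optional[str] = pid
    while current is not None and current not in seen:
        seen.add(current)  # guard against cyclic links
        chain.append(current)
        current = records.get(current, {}).get("is-derivation-of")
    return chain
```

With records where `a` derives from `b` and `b` derives from `c`, `derivation_chain("a", records)` walks all three identifiers in order, stopping when a record has no further link or a cycle is detected.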

Two-pronged activity
1. Consensus around a small number of profiles. Carried out in RDA.
2. PID Testbed for adoption evaluation. Drawn from Research Data Alliance (RDA) Recommendations: Persistent Identifier Types API and RDA Data Type Registry.

Prong I: Consensus around small number of profiles: activity in RDA
Sub-groups: Science, Digital Humanities, Data Provider, Data Consumer
Names as listed on the slide: Stuart Chalk, UNF; Alex Thompson, iDigBio; Tobias Weigel; Gabriel Zhou, IU; Cindy Chandler, WHO; Bridget Almas, Tufts; Daan Broeder, MPI; Beth Plale, HathiTrust; Stuart Chalk, UNF; Alex Thompson, iDigBio; Sharief Youssef, NIST; James Duncan, UVM

Prong I: Consensus around profiles
- Discuss in separate sub-groups over Winter '16: Sciences, Data provider, Data consumer, Digital Humanities
- Apply for WG status
- Each group comes to consensus on a set of attributes
- At RDA P9 (Barcelona), begin integration of the views
- Summer '17: integration of views
- Fall '17: write report

Prong II: PID Testbed for Evaluation
- PID testbed for broader RDA audience. Status: proposal pending at NSF. Beth Plale, Indiana Univ; Larry Lannom, CNRI; Bridget Almas, Tufts
- PID testbed for Pacific Rim Applications and Grid Middleware Assembly (PRAGMA), a 13-year-old assembly of Pacific Rim organizations and groups
- PRAGMA Testbed Datasets:
  - HathiTrust Digital Library: Beth Plale, Indiana Univ
  - iDigBio: Jose Fortes, UFlorida
  - Sensor weather data: Jay Combinido, ASTI, Philippines
  - Ecological data: Krisanadej Jaroensuttasinee, Walailak Univ, Thailand

I encourage you to reach out to me: plale@indiana.edu