Download presentation
Presentation is loading. Please wait.
1
Power of PID kernel information
Beth Plale Indiana University Bloomington, Indiana USA SEAIP 2016
2
Conceptual model PID Kernel Metadata Handle (identifier)
black box data object Handle (identifier) Properties Size Checksum Timestamps Version Pointer to data object ... PID Kernel Metadata Set of attributes is one Profile of PID Kernel Metadata. Optimally there are a minimal number of profiles: e.g., one per community/discipline
3
What a profile is Base assumption: There is minimal set of information associated with each PID Names for this metadata (in flux): kernel, casing, gateway metadata Kernel metadata should be useful to Maintainer of Data Object, Discovery clients, and Data reuse research Each user community may design their own profiles. No single size fits all – but full benefits realized when set of profiles is minimal Size Fixity Key Timestamps Is-derivation-of ... No single size fits all – PID profiles are registered and referenced, they may differ between communities, but some elements recur SEAIP 2016
4
Go on to imagine an Internet-scale data client that is handed a list of a billion IDs. Case 1: How does the client quickly sift through the list to find the research data? Case 2: When client winnows list down to research data, how does it then quicky discard the fakes?
5
What is killer app for this? Universal Provenance
Provenance relates a Data Object to others through data provenance expressed as relationships: Is-derivation-of Is-revision-of Is-new-version-of Is-part-of Is-supplement-to | Is-supplemented-by Has-metadata | Is-metadata-for Making provenance relationships required
6
Two pronged activity Consensus around small number of profiles
Carried out in RDA PID Testbed for adoption evaluation. Drawn from Research Data Alliance (RDA) Recommendations: Persistent Identifier Types API and RDA Data Type Registry
7
Prong I: Consensus around small number of profiles: activity in RDA
Science Digital Humanities Stuart Chalk, UNF Alex Thompson, iDigBio Tobias Weigel Gabriel Zhou, IU Cindy Chandler, WHO Data Provider Bridget Almas, Tufts Daan Broeder, MPI Beth Plale, HathiTrust Stuart Chalk, UNF Alex Thompson, iDigBio Sharief Youssef, NIST James Duncan, UVM Data Consumer
8
Prong I: Consensus around profiles
Discuss in separate sub-groups over Winter ‘16: Sciences Data provider Data consumer Digital Humanities Apply for WG status Each group comes to consensus on set of attributes At RDA P9 (Barcelona), begin integration of the views Summer ‘17 : integration of views Fall ’17 : write report
9
Prong II: PID Testbed for Evaluation
PID testbed for broader RDA audience: Status: Proposal pending at NSF Beth Plale, Indiana Univ; Larry Lannom, CNRI; Bridget Almas, Tufts PID testbed for Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) 13 year old assembly of Pacific Rim organizations and groups PRAGMA Testbed Datasets HathiTrust Digital Library – Beth Plale, Indiana Univ iDigBio – Jose Fortes, UFlorida Sensor Weather data – Jay Combinido, ASTI, Philippines Ecological data – Krisanadej Jaroensuttasinee, Walailak Univ, Thailand
10
I encourage you to reach out to me plale@indiana.edu
SEAIP 2016
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.