WG Research Data Collections An overview of the recommendation

WG Research Data Collections: An overview of the recommendation
Tobias Weigel, Bridget Almas, Frederik Baumgardt, Thomas Zastrow, Ulrich Schwardmann, Maggie Hellstrom, Javier Quinteros, Dirk Fleischer
www.rd-alliance.org - @resdatall
CC-BY 4.0

Motivation for Research Data Collections
- (Research) data management beyond single objects
- Not just describe collections, but enable actions on them: Create, Read, Update, Delete, List, plus some others
- Machine agents as primary users
- Contribute an essential component to the Data Fabric
- API specification against which tools and services can be built across community boundaries
https://rd-alliance.org/ - https://twitter.com/resdatall

What is the output?
1. Collection recommendation document
2. API specification (Swagger / OpenAPI)
3. Reference implementations
https://github.com/RDACollectionsWG/specification
https://github.com/RDACollectionsWG/apidocs

Key API requirements
- Support for PIDs, but not mandatory
- Support for non-recursive subcollections
- Objects may belong to more than one collection
- Object properties specific within a collection's context
- Cross-repository collections supported, separation of object location from collection membership
- No limitation to particular back-ends
- Advertised service/object operations for machine consumption
- Advertised collection capabilities (usage, behavioural restrictions)
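
Two of the requirements above (objects may belong to more than one collection, and object properties are specific to a collection's context) can be illustrated with a small in-memory sketch. This is not code from the recommendation or its reference implementations; the class name, method names, identifiers, and property names are all hypothetical.

```python
from collections import defaultdict


class MembershipIndex:
    """Toy in-memory index; every name here is hypothetical."""

    def __init__(self):
        # collection id -> {object id -> properties valid only in that collection}
        self._members = defaultdict(dict)

    def add(self, collection_id, object_id, **context_props):
        # The same object id may be added under any number of collections.
        self._members[collection_id][object_id] = dict(context_props)

    def collections_of(self, object_id):
        # All collections that currently list the object as a member.
        return sorted(c for c, m in self._members.items() if object_id in m)

    def props(self, collection_id, object_id):
        # Properties of the object as seen from one particular collection.
        return self._members[collection_id][object_id]


index = MembershipIndex()
index.add("review-board", "hdl:example/obj-1", role="under-review", position=0)
index.add("user-alice", "hdl:example/obj-1", role="draft", position=7)
```

Here the object carries a different role and position in each collection, while the object itself is stored only once, wherever it lives.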

High-level conceptual model
[Diagram; labels: Member, ID, Capabilities, Properties, Membership, location, description, ontology, datatype, mappings, role, index, date added, date updated]
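
One possible reading of the conceptual model as data structures is sketched below. The grouping of labels is an assumption, since the transcript flattens the diagram: ID, capabilities, properties, and membership are attached to the collection; location, description, ontology, and datatype to the member; role, index, date added, and date updated to the member's mappings. Field types and Python names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Mappings:
    # Assumed to describe a member in the context of one collection.
    role: Optional[str] = None
    index: Optional[int] = None
    date_added: Optional[str] = None
    date_updated: Optional[str] = None


@dataclass
class Member:
    id: str
    location: str  # where the object lives, kept separate from membership
    description: Optional[str] = None
    ontology: Optional[str] = None
    datatype: Optional[str] = None
    mappings: Mappings = field(default_factory=Mappings)


@dataclass
class Collection:
    id: str
    capabilities: dict = field(default_factory=dict)
    properties: dict = field(default_factory=dict)
    members: list = field(default_factory=list)  # the "Membership" part


collection = Collection(id="c1", capabilities={"isOrdered": True})
collection.members.append(
    Member(
        id="m1",
        location="https://example.org/data/m1",
        mappings=Mappings(role="primary", index=0),
    )
)
```

The capability key `isOrdered` is only a placeholder for whatever capabilities a service advertises.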

API: Structure

API: Details
- Service: understand service features
- Collection: CRUD/L, understand capabilities
- Member: location, datatype, mappings

API: Details

API: Service Features
GET /features
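
A minimal sketch of how a client might prepare the features request with Python's standard library. The base URL is a placeholder, the JSON `Accept` header is an assumption, and the request is only constructed, not sent.

```python
from urllib.request import Request


def features_request(base_url):
    """Build (but do not send) a GET request for the service features."""
    url = base_url.rstrip("/") + "/features"
    return Request(url, method="GET", headers={"Accept": "application/json"})


# Placeholder endpoint, not a real deployment.
req = features_request("https://collections.example.org/api/")
```

Passing `req` to `urllib.request.urlopen` would perform the call against a live service.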

API: Collections Create/Read/Update/Delete/List
LIST    GET    /collections
CREATE  POST   /collections
READ    GET    /collections/{id}
UPDATE  PUT    /collections/{id}
DELETE  DELETE /collections/{id}
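
The five collection operations can be captured in a small lookup table. The sketch below only resolves an operation name to an HTTP method and path; the function and table names are hypothetical, and request bodies, headers, and error handling are left out.

```python
# Operation name -> (HTTP method, path template), as listed on the slide.
COLLECTION_ROUTES = {
    "list": ("GET", "/collections"),
    "create": ("POST", "/collections"),
    "read": ("GET", "/collections/{id}"),
    "update": ("PUT", "/collections/{id}"),
    "delete": ("DELETE", "/collections/{id}"),
}


def collection_call(operation, collection_id=None):
    """Return (method, path) for one of the five collection operations."""
    method, template = COLLECTION_ROUTES[operation]
    if "{id}" in template:
        if collection_id is None:
            raise ValueError(f"{operation!r} needs a collection id")
        return method, template.replace("{id}", collection_id)
    return method, template
```

For example, `collection_call("update", "c1")` yields `("PUT", "/collections/c1")`.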

API: Collection Member CRUD/L
LIST    GET    /collections/{id}/members
CREATE  POST   /collections/{id}/members
READ    GET    /collections/{id}/members/{mid}
UPDATE  PUT    /collections/{id}/members/{mid}
DELETE  DELETE /collections/{id}/members/{mid}
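
Collection and member identifiers may be PIDs, which often contain a slash. A hedged sketch of building member paths follows; percent-encoding the identifiers is an assumption made here so that they fit into a single path segment, and the slides do not say how a given server expects them to be encoded. The identifiers are made up.

```python
from urllib.parse import quote


def member_path(collection_id, member_id=None):
    """Build the members path, percent-encoding both identifiers."""
    # safe="" also encodes "/", so a PID stays within one path segment.
    path = "/collections/" + quote(collection_id, safe="") + "/members"
    if member_id is not None:
        path += "/" + quote(member_id, safe="")
    return path


listing = member_path("c1")
single = member_path("c1", "12345/obj-1")  # hypothetical Handle-style PID
```

The first path serves LIST and CREATE, the second READ, UPDATE, and DELETE.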

Impact
Before:
- Incompatible silo approaches to collection management
- Focus on describing collections, not CRUD operations
- Missing conceptual framework, unclear scope
After:
- Clients: interaction with endpoints independent of domain/infrastructure etc.; common toolset ecosystems
- Servers: balance between conformance with basic functionality and extensions; multi-domain service building, reusability

Reference Implementations
Source code available at GitHub.com/RDACollectionsWG
- API mockup (Swagger-based, Python)
- Ruby Collections client (Swagger Codegen)
- Reptor: PHP-based modern data repository demonstrator
  - PID handling, independent of particular systems
  - File system storage of object bitstreams
  - Support for DTR, OAI-PMH, ResourceSync
- Perseids Manifold: Python/Flask-based + LDP
  - Multiple data backends: file system, RDF/LDP, MongoDB
  - Deployed at http://collections.perseids.org
- GEOFON implementation: Python/cherrypy + MySQL
  - Members identified by PID or URL
  - Extended with download methods for collections and members

Open and future efforts
- Comprehensive approach for Data Type Registry integration included in the recommendation
  - Needs further practical evaluation as part of test beds
  - Some first steps done as part of adoption
- Fedora Commons: possible support for API via Fedora API-X service
- Specification offers extension points to explore
  - More operations
  - Maturing the conceptual framework
- Feedback through RDA processes desired!

Adoption: GEOFON
- Create big pre-assembled datasets (collections) without the need to instantiate them, due to storage limitations
- PIDs for data files were already available and should be used to identify members of the collections
- Extension to the specification with "download" methods for collections and members
- Interoperability with data storage system (iRODS), Handle Server and AAI system
- 6000+ collections and 1.5+ million members
Contact: Javier Quinteros

Adoption: Perseids
- Collections of Textual and Linguistic Annotations
- Annotation documents belong to multiple collections: User, Subject, Community, Review Board
- Collection membership changes throughout the research/publication lifecycle
- Collections add and set the context for the data
- Currently: CRUD operations used. Goals require addition of Query and Set based operations.
- Want to combine with PID Kernel and DTR outputs to have persistently identified annotations asserted with machine-actionable, formalized datatypes
Contact: Bridget Almas

Adoption: ePIC
- ePIC Collection Registry: Flask-based, uses registered types (DTR)
- Allows multiple prefix-based registries
- Backend: Handle System and file system
- Ready-to-use service, to be available soon under: https://coll-reg.pidconsortium.eu
Contact: Ulrich Schwardmann