Data Type Registries Breakout


Presentation transcript:

Data Type Registries Breakout
Co-chairs: Larry Lannom, Tobias Weigel
P10, Montreal, September 2017

Agenda
11:30 - 11:35 Welcome & Intros, Agenda Bashing
11:35 - 11:40 Larry Lannom, State of the WG & Brief DTR Overview
11:40 - 11:50 Tobias Weigel, Climate Data Processing
11:50 - 12:00 Ulrich Schwardmann, ePIC DTR
12:00 - 12:10 Wo Chang, Common Access Protocol, IEEE BDGMM
12:10 - 12:20 Rob Quick, RPID Test Bed
12:20 - 12:25 Steve Richard, EarthCube (remote)
12:25 - 12:30 Andres Ferreyra, AgGateway (remote)
12:30 - 12:40 Giridhar Manepalli, ISO WG plus Data Models (remote)
12:40 - 13:00 Tobias Weigel, Discussion: Next Steps, Goals for P11

What is the Issue?
Data sharing requires that data can be parsed, understood, and reused by people and applications other than those that created the data.
How do we do this now?
  For documents, formats are enough (e.g., PDF), and then the document explains itself to humans.
  This doesn’t work well with data: numbers are not self-explanatory.
  What does the number 7 mean in cell B27?
Data producers may not have explicitly specified certain details in the data: measurement units, coordinate systems, variable names, etc.
We need a way to precisely characterize those assumptions so that they can be identified by humans and machines that were not closely involved in the data's creation.

DTR Usage Example
[Figure: diagram showing Users, a Federated Set of Type Registries, Typed Data (ID, Type, Payload), and services (Visualization, Data Processing, Data Set Dissemination, Rights with "I Agree" terms), with arrows numbered 1-4 matching the steps below.]
1. A client (a process or a person) encounters data of an unknown type.
2. The client resolves the type against a type registry.
3. The response includes type definitions, relationships, properties, and possibly service pointers. The response can be used locally for processing, or, optionally:
4. Typed data, or a reference to typed data, can be sent to a service provider.
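The four steps above can be sketched in Python. This is a minimal in-memory illustration under stated assumptions, not the WG prototype's API: the `TypeRecord` fields, the `TypeRegistry` and `FederatedResolver` classes, the `describe` helper, and every identifier and URL are hypothetical. Federation is modelled simply as asking each registry in turn.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TypeRecord:
    """Hypothetical type record: definition, properties, optional service pointers."""
    identifier: str
    description: str
    properties: dict = field(default_factory=dict)
    services: list = field(default_factory=list)


class TypeRegistry:
    """One registry instance holding type records keyed by identifier."""

    def __init__(self, name: str):
        self.name = name
        self._records = {}

    def register(self, record: TypeRecord) -> None:
        self._records[record.identifier] = record

    def lookup(self, identifier: str) -> Optional[TypeRecord]:
        return self._records.get(identifier)


class FederatedResolver:
    """Asks each registry in turn; a stand-in for a federated set of registries."""

    def __init__(self, registries):
        self.registries = list(registries)

    def resolve(self, identifier: str) -> Optional[TypeRecord]:
        for registry in self.registries:
            record = registry.lookup(identifier)
            if record is not None:
                return record
        return None


def describe(typed_data: dict, resolver: FederatedResolver) -> str:
    """Steps 1-3: read the type ID, resolve it, and use the record locally."""
    record = resolver.resolve(typed_data["type"])
    if record is None:
        return "unknown type: " + typed_data["type"]
    unit = record.properties.get("unit", "unspecified unit")
    return f"{typed_data['payload']} {unit} ({record.description})"


# Example setup with made-up identifiers and a made-up service pointer.
climate = TypeRegistry("climate")
climate.register(TypeRecord(
    identifier="example/air-temperature",
    description="near-surface air temperature",
    properties={"unit": "K"},
    services=["https://example.org/visualize"],
))
resolver = FederatedResolver([TypeRegistry("empty"), climate])

typed_data = {
    "id": "example/dataset-1",
    "type": "example/air-temperature",
    "payload": 287.4,
}
print(describe(typed_data, resolver))
```

Step 4 is not implemented here; in this sketch a client would read `record.services` and forward the typed data, or a reference to it, to one of the listed service providers.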

Goal of the WG
Evaluate and identify a few assumptions in data that can be codified and shared in order to…
Produce a functioning Registry system that:
  Can easily be evaluated by organizations before adoption
  Is highly configurable for changing scope of captured and shared assumptions depending on the domain or organization
  Supports several Type record dissemination variations
  Is designed to allow federation between multiple Registry instances
The emphasis is not on:
  Identifying every possible assumption and data characteristic applicable for all domains
  Technology

Status of the WG
A prototype is at:
Multiple other implementations/projects, including multiple schemas
Implementation supports notions of primitives and derived types
  Primitives are fundamental types that we expect humans and software to parse and understand
  Derived types depend on primitives to describe something complex
Registered types are assigned unique identifiers
Initial WG output published as ICT Technical Standard
ISO Study Group in process
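The relationship between primitives, derived types, and unique identifiers can be illustrated with a short Python sketch. It assumes a toy in-memory registry and is not the prototype's data model: the `PrimitiveType`, `DerivedType`, and `Registry` names are hypothetical, and identifiers are minted from random UUIDs under a made-up `example/` prefix rather than by a real PID service.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimitiveType:
    """A fundamental type that humans and software are expected to understand."""
    identifier: str
    name: str


@dataclass(frozen=True)
class DerivedType:
    """A complex type described in terms of already registered types."""
    identifier: str
    name: str
    components: tuple  # (component name, type identifier) pairs


class Registry:
    def __init__(self):
        self._types = {}

    def _mint_identifier(self):
        # Assumption: a random UUID stands in for a persistent identifier.
        return "example/" + uuid.uuid4().hex

    def register_primitive(self, name):
        record = PrimitiveType(self._mint_identifier(), name)
        self._types[record.identifier] = record
        return record.identifier

    def register_derived(self, name, components):
        # Every component must refer to a type that is already registered.
        for component, type_id in components.items():
            if type_id not in self._types:
                raise KeyError(
                    f"component {component!r} uses unregistered type {type_id!r}"
                )
        record = DerivedType(
            self._mint_identifier(), name, tuple(components.items())
        )
        self._types[record.identifier] = record
        return record.identifier

    def expand(self, identifier):
        """Recursively expand a type down to the names of its primitives."""
        record = self._types[identifier]
        if isinstance(record, PrimitiveType):
            return record.name
        return {
            component: self.expand(type_id)
            for component, type_id in record.components
        }


registry = Registry()
float_id = registry.register_primitive("float")
string_id = registry.register_primitive("string")
measurement_id = registry.register_derived(
    "measurement", {"value": float_id, "unit": string_id}
)
reading_id = registry.register_derived(
    "station-reading", {"station": string_id, "reading": measurement_id}
)
print(registry.expand(reading_id))
```

In this sketch a derived type may also build on other derived types, as long as the chain ends in primitives; `expand` shows the fully resolved structure a client would need in order to parse a value of that type.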

Initial Adopters
EarthCube – Steve Richard
Vermont Monitoring Cooperative – Mike Finnegan
DKRZ – Tobias Weigel
ePIC – Ulrich Schwardmann
NIST, Common Access Platform – Wo Chang
CNRI – multiple projects
Ongoing ISO Study Group

Expected Impact of the Deliverable
Best case scenario: agreed-upon set of standard schemas; ISO standard
  Wide use of types for data sharing and workflow automation
  Significant use of federation of distributed set of type registries
  Extended use of typed attribute/value pairs in PID resolution
Worst case scenario: no agreed-upon set of schemas, no further standardization
  General concept influences multiple communities in the direction of clearer data syntax and semantics
  ICT Tech Standard remains
  Existing use of typed attribute/value pairs in PID resolution

Expected Impact of the Deliverable
Before:
  Data sets difficult to impossible to parse, understand, and re-use unless you created them, know who did, or there exists detailed public documentation.
  Search criteria for data sets restricted to keywords and sources.
  Standardization across data sets fairly arbitrary, concentrated in small groups and narrow communities.
After:
  Data sets can be typed at a fine level of granularity, those types can be registered in a public registry, and those type records can contain sufficient information to make detailed and accurate use of the data sets so typed.
  Search criteria for data sets can include type information, yielding easier comparisons and mash-ups.
  Greater chance of standards developing across data set construction.