AgGateway’s Progress on Data Type Registries

Presentation transcript:

AgGateway’s Progress on Data Type Registries Working to preserve meaning in production agriculture data exchange R. Andres Ferreyra (Ag Connections, LLC) Research Data Alliance 9th Plenary Barcelona, April 6, 2017

Interoperability Problems
The agricultural industry’s field operations segment has developed a serious interoperability problem: the ability of equipment to generate data has far outpaced users’ ability to derive useful information from it.
- There is a multitude of proprietary data formats in hardware and software solutions.
- There is no consensus on controlled vocabularies (or on the need for them) for crops, operations, products, etc.
- “Meaning drift” occurs when exchanging O&M data.

AgGateway
AgGateway is a consortium of ~240 companies dedicated to the implementation of data exchange standards in agriculture. It has chartered several projects to enable interoperability in field operations (following prior work on supply chain): SPADE, PAIL and ADAPT. A major part of this work has been the development of a system of data type registries to support the unambiguous communication of meaning.

AgGateway’s Semantic Assets
- Representations: “universal” variables in field operations (e.g., crop yield in units of mass per area).
- ContextItems: geopolitical-context-dependent data.
- Observation Codes: express aspects of an ISO 19156 Observation as orthogonal components. This translates into controlled vocabularies that can be combined to create new vocabularies.

The Geopolitical Context Problem
Growers need to collect increasing amounts of field operations data. This usually includes lots of critically important but frequently changing geopolitical-context-dependent information (e.g., EPA numbers, Bundessortenamt, tax data, etc.). Capturing all of this data in the object model of farm management information system (FMIS) software is infeasible given corporate IT realities (i.e., software cannot be upgraded often), unless it were somehow possible to decouple the infrequently changing and frequently changing aspects of the FMIS data model.
In terms of requirements thus placed on a data model, an FMIS object model should simultaneously be:
- Simple/generic vs comprehensive/specific: generic, simple and compact enough to be easily understood and used, and to be accepted from an international perspective (which suggests staying free of region-specific clutter), yet still able to support the capture and communication of the region-specific (i.e., geopolitical-context-dependent) data that growers and their partners need as part of their business processes.
- Static vs dynamic (controlled vocabulary vs extensibility): able to express data with a controlled vocabulary (so everyone can understand what it means), while allowing that controlled vocabulary to be continually updated to match the nature of data requirements.

The ADAPT Solution: The ContextItem
ADAPT reconciled the contradictions by defining an object class, the ContextItem, that can be attached to various other objects in ADAPT’s common object model. A ContextItem is a key/value structure in which the “key” code references a ContextItemDefinition that defines what the ContextItem means. The “value” is composed of either a string value along with the data needed to interpret it (such as a unit of measure) or a nested list of other ContextItems (e.g., PLSS cadastral information).
The ContextItem definitions are sourced via an API. See it here: https://api.contextitem.org/swagger
We’re putting in place an ISO 19135-based governance process, and will allow anyone to request additions.

The ContextItem Object
- Code identifies what a given ContextItem contains: think of it as a machine-readable string that identifies what Value means. Is it a PLSS Township number? An FSA Tract ID? An EPA Number? A PLSS Prime Meridian string?
- ValueUoM specifies, where appropriate, a unit of measure for Value. We draw from a controlled vocabulary of unit of measure codes (UN Rec 20).
- TimeScopes provides the ContextItem with a temporal context.
- NestedItems enables a hierarchical organization of nested ContextItems, suitable for multi-attribute data (e.g., US PLSS cadastral data).
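The four fields described on this slide can be sketched as a small Python data class. This is only an illustration, not ADAPT's actual class: the Python names, the `TimeScope` shape, the `flatten` helper and the PLSS codes and values below are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TimeScope:
    # Assumed shape: the slide only says TimeScopes give a temporal context.
    start: Optional[str] = None  # ISO 8601 date string (assumption)
    end: Optional[str] = None


@dataclass
class ContextItem:
    code: str                        # key that references a ContextItemDefinition
    value: Optional[str] = None      # string value, interpreted via the definition
    value_uom: Optional[str] = None  # unit-of-measure code, where appropriate
    time_scopes: List[TimeScope] = field(default_factory=list)
    nested_items: List["ContextItem"] = field(default_factory=list)

    def flatten(self, prefix: str = "") -> Dict[str, str]:
        """Return {path: value} for this item and every nested item."""
        path = f"{prefix}/{self.code}" if prefix else self.code
        out: Dict[str, str] = {}
        if self.value is not None:
            out[path] = self.value
        for child in self.nested_items:
            out.update(child.flatten(path))
        return out


# Made-up codes and values for a PLSS-style nested item;
# real codes would come from the ContextItem registry.
plss = ContextItem(
    code="PLSS",
    nested_items=[
        ContextItem(code="Township", value="12N"),
        ContextItem(code="Range", value="4W"),
        ContextItem(code="Section", value="7"),
    ],
)
print(plss.flatten())
```

Keeping the nested items as ordinary ContextItems means one recursive structure covers both single values and multi-attribute data.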

The ContextItemDefinition Object
Provides a rich definition of how a specific (as per Code) ContextItem’s value should be entered / displayed.
- ValueType specifies the data type of ContextItem.Value.
- Lexicalizations allow multi-language support.
- Properties encapsulate values along with (enumerated) ContextItems.

The ContextItemDefinition Object
- NestedIDefIds specifies a hierarchical ContextItem.
- Presentations specify, via a regular expression, how to enter & display the ContextItem.Value.
- ModelScopeIds specify what classes in the ADAPT & ISO object models a given ContextItem can be attached to.
- GeoPoliticalContextIds specify what geopolitical context (e.g., EU, Lithuania, Wisconsin) a given ContextItem is defined for.
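A minimal sketch of how a definition might be used to check a ContextItem before accepting it, assuming a simplified shape: one regular expression stands in for Presentations, and plain strings stand in for the ModelScopeIds and GeoPoliticalContextIds. The `accepts` method, the example pattern and the example identifiers are hypothetical and are not taken from the registry.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContextItemDefinition:
    code: str
    value_type: str                                                # e.g. "string" (assumed label)
    lexicalizations: Dict[str, str] = field(default_factory=dict)  # language tag -> label
    presentation_regex: str = ".*"                                 # stand-in for Presentations
    model_scopes: List[str] = field(default_factory=list)          # stand-in for ModelScopeIds
    geopolitical_contexts: List[str] = field(default_factory=list)

    def accepts(self, value: str, attached_to: str, context: str) -> bool:
        """True if the value fits the pattern and the item is used in an allowed place."""
        return (
            attached_to in self.model_scopes
            and context in self.geopolitical_contexts
            and re.fullmatch(self.presentation_regex, value) is not None
        )


# Hypothetical definition: the pattern and scope names are invented for illustration.
epa_number = ContextItemDefinition(
    code="EPA_NUMBER",
    value_type="string",
    lexicalizations={"en": "EPA Number"},
    presentation_regex=r"\d+-\d+",
    model_scopes=["Product"],
    geopolitical_contexts=["US"],
)

print(epa_number.accepts("524-475", attached_to="Product", context="US"))
print(epa_number.accepts("524-475", attached_to="Product", context="EU"))
```

Because the rules live in the definition rather than in the FMIS code, a new region-specific item would only need a new registry entry, which is the decoupling the talk describes.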

More recent work: Observation Codes
Several aspects of AgGateway’s field operations work involve observations & measurements. We found value in implementing ISO 19156. This work, centered on the PAIL, SPADE and ADAPT projects, emphasizes the explicit capture of the semantics of the various aspects of an observation. The work, performed by a group of industry and academic AgGateway participants spanning four continents, includes three major parts.

3 parts
- First, defining a componentized model of the properties of an observation, based on an extensible set of orthogonal vocabularies, which includes representing valid combinations of components.
- Second, deploying infrastructure, in the form of a RESTful API, to make the componentized variable definitions freely available to industry and the research community; this includes putting into place an ISO 19135-based process for stakeholders to request the addition of vocabularies or entries therein.
- Third, incorporating observations and measurements into AgGateway’s ADAPT common object model and format conversion plug-in architecture, thus enabling widespread interoperability.

ISO 19156 OM Model

What’s in an Observation?
Attributes of the Observation itself:
- Parameters (e.g., the depth of a soil water measurement)
- Phenomenon time: when did this happen?
- Result time: when did we get the result?
- Valid time: is there a range of validity?
- Data quality: ISO 19157 data quality metadata
Things the Observation is connected to:
- Feature of interest / sampling feature & sampling strategy: field / core, etc.
- Observed property: e.g., air temperature
- Observation context
- Procedure: sensor, process used, etc.
- Result: the value returned, its type, etc.
- Metadata
Our intent is to represent observations as key-value pairs:
- The value corresponds to the ISO 19156 Result, and
- The key represents as much of everything else as is practical.

Our encoding model
Aggregation + Observable property + Sampling strategy + Additional information
Diagram labels: N >= 0 (Time window + Method) components; Target + Quantity / phenomenon + Method; Sample type, Test type, Ingredient.
Example 1: daily average greenhouse air temperature height 1.5m
Example 2: soil hot-water extractable nitrogen mass-fraction
We’re not finished yet, but there seems to be an emerging pattern of repeatable components.
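The idea of building an observation key from orthogonal vocabularies can be sketched as follows. This is a toy illustration under stated assumptions: the vocabulary names and entries, the `|` and `=` key syntax, the `compose_key` function and the reading 21.5 are all invented for the example, and the way the slide's two examples are split into components is a guess rather than AgGateway's actual encoding.

```python
from typing import Dict, List, Set

# Hypothetical vocabularies, loosely based on the two examples on the slide.
VOCABULARIES: Dict[str, Set[str]] = {
    "aggregation": {"daily-average"},
    "target": {"greenhouse-air", "soil"},
    "quantity": {"temperature", "nitrogen-mass-fraction"},
    "method": {"hot-water-extraction"},
}

# Fixed component order, so the same components always yield the same key.
ORDER: List[str] = ["aggregation", "target", "quantity", "method"]


def compose_key(**components: str) -> str:
    """Validate each component against its vocabulary and join them into one key."""
    for name, term in components.items():
        if name not in VOCABULARIES:
            raise KeyError(f"unknown component: {name}")
        if term not in VOCABULARIES[name]:
            raise ValueError(f"{term!r} is not in the {name} vocabulary")
    return "|".join(f"{name}={components[name]}" for name in ORDER if name in components)


# An observation as a key-value pair: the key carries the semantics,
# the value stands for the ISO 19156 Result (21.5 is a made-up reading).
key = compose_key(aggregation="daily-average", target="greenhouse-air", quantity="temperature")
observation = {key: 21.5}
print(observation)
```

Adding a term to one vocabulary makes it usable in combination with every existing term in the others, which is one way to read "controlled vocabularies that can be combined to create new vocabularies".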

Comments
- This work opens up the possibility of leveraging existing research-derived controlled vocabularies in industrial settings.
- We initially need to “keep it simple” to promote adoption.
- Anyone can request codes to be added.
- We want to provide straightforward interfaces linking these resources to ontologies.

DTR WG Influence
DTR WG Recommendation, followed by Progress where the transcript gives it:
- Every type in a data type registry must be identified with a resolvable persistent identifier. Progress: Working on it!
- Types should reference related standards and recommendations in order to leverage existing efforts. Progress: Yes
- Primitive types should be established and used, when possible, in the construction of more complex types.
- A common API should be available across all type registries.
- Type registries should be federated such that a single service can search across all known registries or some defined subset. Progress: Need to learn more
- Type registries should include or enable referencing related services based on types.
- The establishment of a data type registry for any community should be subject only to the needs and requirements of that community, i.e., there should be no higher level governance beyond the maintenance of whatever standards and processes are needed for effective federation across type registries.

Thank you! For more information, contact: andres.ferreyra@agconnections.com