M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society Using a Resource.

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
GridVine: Building Internet-Scale Semantic Overlay Networks By Lan Tian.
Integrating NOAA’s Unified Access Framework in GEOSS: Making Earth Observation data easier to access and use Matt Austin NOAA Technology Planning and Integration.
Using the Semantic Web to Construct an Ontology- Based Repository for Software Patterns Scott Henninger Computer Science and Engineering University of.
COMP 6703 eScience Project Semantic Web for Museums Student : Lei Junran Client/Technical Supervisor : Tom Worthington Academic Supervisor : Peter Strazdins.
Module 2b: Modeling Information Objects and Relationships IMT530: Organization of Information Resources Winter, 2007 Michael Crandall.
1 Chapter 20 — Creating Web Projects Microsoft Visual Basic.NET, Introduction to Programming.
ToolMatch: Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Patrick West 1 Nancy Hoebelheinrich.
Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator Session: Managing Ecological Data for Effective Use and Reuse Patrice Seyed.
PREMIS Tools and Services Rebecca Guenther Network Development & MARC Standards Office, Library of Congress NDIIPP Partners Meeting July 21,
EARTH SCIENCE MARKUP LANGUAGE “Define Once Use Anywhere” INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE.
Unidata’s TDS Workshop TDS Overview – Part II October 2012.
RDF and OWL Developing Semantic Web Services by H. Peter Alesso and Craig F. Smith CMPT 455/826 - Week 6, Day Sept-Dec 2009 – w6d21.
Integrating Live Plant Images with Other Types of Biodiversity Records Steve Baskauf Vanderbilt Dept. of Biological Sciences
The IRI Climate Data Library: translating between data cultures Benno Blumenthal International Research Institute for Climate Prediction Columbia University.
Discovering accessibility, display, and manipulation of data in a data portal Nancy Hoebelheinrich Patrick West 2
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Definition of a taxonomy “System for naming and organizing things into groups that share similar characteristics” Taxonomy Architectures Applications.
RDF and triplestores CMSC 461 Michael Wilson. Reasoning  Relational databases allow us to reason about data that is organized in a specific way  Data.
DAP4 James Gallagher & Ethan Davis OPeNDAP and Unidata.
Modeling and Representing National Climate Assessment Information using Linked Data Jin Guang Zheng 1 Curt Tilmes 2
Evolving MARC 21 for the future Rebecca Guenther CCS Forum, ALA Annual July 10, 2009.
M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society OpenDAP 2007
ToolMatch Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Products Patrick West 1 Nancy Hoebelheinrich.
Semantic Technologies and Application to Climate Data M. Benno Blumenthal IRI/Columbia University CDW /04-01.
Server-side Analysis and a Semantic Framework for Metadata M. Benno Blumenthal International Research Institute for Climate and Society Columbia University.
Metadata Registries Registry: authoritative, centrally controlled store of information – W3C Web Services Glossary, 2004
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Information Modeling and Semantic Web Application For National Climate Assessment Jin Guang Zheng 1 Curt Tilmes 2
Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar
M.Benno Blumenthal, Michael Bell, John del Corral, and Emily Grover-Kopec International Research Institute for Climate and Society Columbia University.
User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.
The HDF Group Data Interoperability The HDF Group Staff Sep , 2010HDF/HDF-EOS Workshop XIV1.
Problems with XML & XML Schemas XML falls apart on the Scalability design goal. 1.The order in which elements appear in an XML document is significant.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Distributed Data Analysis & Dissemination System (D-DADS ) Special Interest Group on Data Integration June 2000.
Implementing Marine XML for NOAA Observing Data Nazila Merati and Eugene Burger NOAA/Pacific Marine Environmental Laboratory Seattle, WA.
Data Interoperability at the IRI: translating between data cultures Benno Blumenthal International Research Institute for Climate Prediction Columbia University.
1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:
M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society IRI Data Library.
IRI/LDEO Climate Data Library M.Benno Blumenthal, Michael Bell, John del Corral, Remi Cousin, and Haibo Liu International Research Institute for Climate.
Semantic Web underpinnings of the IRI Data Library Semantic Web as a Framework for Multiple Metadata IRI Data Library: presenting Data in multiple frameworks.
Setting the stage: linked data concepts Moving-Away-From-MARC-a-thon.
IRI Data Library Faceted Search: an example of RDF-based faceted search for climate data Drawing on multiple ontologies to build an application Using inference.
M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society Use of RDF/OWL.
M. Benno Blumenthal International Research Institute for Climate and Society Connecting netcdf/CF to a semantic.
Using the Semantic Web M. Benno Blumenthal International Research Institute for Climate and Society Columbia University 31 July 2012 CU Metadata Group.
An Introduction to the Semantic Web M. Benno Blumenthal International Research Institute for Climate and Society Columbia University 2 November 2011.
IRI/LDEO Climate Data Library M.Benno Blumenthal, Michael Bell, and John del Corral International Research Institute for Climate and Society Columbia University.
Data Browsing/Mining/Metadata
IRI Data Library Overview
The IRI was founded approximately 12 years ago with a mission To enhance society’s capability to understand, anticipate and manage the impacts of climate.
Transport and Access of Data, Metadata, and Semantics using RDF
IRI/LDEO Climate Data Library
IRI/LDEO Climate Data Library
IRI/LDEO Climate Data Library
IRI/LDEO Climate Data Library
IRI/LDEO Climate Data Library
Ontologies and Model-Based Systems Engineering
IRI Data Library Overview
Connecting netcdf/CF to a semantic framework
Data Standards at the IRI Data Library
RDF Standard Data Model Exchange
IRI Data Library Faceted Search: an example of
M.Benno Blumenthal, Michael Bell,
RDA in a non-MARC environment
RDA Community and linked data
IN32A-05 The IRI/LDEO Climate Data Library: Helping People use Climate Data M.Benno Blumenthal, Emily Grover-Kopec, Michael Bell, and John del Corral.
IRI/LDEO Climate Data Library
Presentation transcript:

M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society Using a Resource Description Framework (RDF) to carry metadata for climate datasets

Why RDF? Make implicit semantics explicit Web-based system for interoperating semantics RDF/OWL is an emerging technology, so tools are being built that help solve the semantic problems in handling data

Standard Metadata Users Datasets Tools Standard Metadata Schema/Data Services

Many Data Communities Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema

Super Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Standard metadata schema

Super Schema: direct Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Standard metadata schema/data service

Flaws A lot of work Super Schema/Service is the Lowest- Common-Denominator Science keeps evolving, so that standards either fall behind or constantly change

RDF Standard Data Model Exchange Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Tools Users Datasets Standard Metadata Schema Standard metadata schema RDF

Standard metadata schema Tools Users Datasets Standard Metadata Schema RDF Tools Users Datasets Standard Metadata Schema RDF Tools Users Datasets Standard Metadata Schem RDF RDF Data Model Exchange RDF Tools Users Datasets Standard Metadata Schema RDF Tools Users Datasets Standard Metadata Schema RDF

Why is this better? Maps the original dataset metadata into a standard format that can be transported and manipulated Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping Can use enhanced mappings between models that have common concepts beyond the least-common- denominator EASIER – tools to enhance the mapping process, mappings build on other mappings

RDF Architecture RDF Virtual (derived) RDF queries

Example: Search Interface Search Interface Users Datasets Search Ontology Dataset Ontology Additional Semantics

Sample Tool: Faceted Search

Distinctive Features of the search Search terms are interrelated terms that describe the set of returns are displayed (spanning and not) Returned items also have structure (sub- items and superseded items are not shown)

Architectural Features of the search Multiple search structures possible Multiple languages possible Search structure is kept in the database, not in the code

Triplets of Subject Property (or Predicate) Object URI’s identify things, i.e. most of the above Namespaces are used as a convenient shorthand for the URI’s RDF: framework for writing connections

Datatype Properties {WOA} dc:title “NOAA NODC WOA01” {WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m,5500 m]; Time: [Jan,Dec]; monthly”

Object Properties {WOA} iridl:isContainerOf {Grid-1x1}, {Grid-1x1} iridl:isContainerOf {Monthly}

WOA01 diagram

Standard Properties {WOA} dcterm:hasPart {Grid-1x1}, {Grid-1x1} dcterm:hasPart {MONTHLY} Alternatively {WOA} iridl:isContainerOf {Grid-1x1}, {iridl:isContainerOf} rdfs:subPropertyOf {dcterm:hasPart}

{SST} rdf:type {cfatt:non_coordinate_variable}, {SST} cfobj:standard_name {cf:sea_surface_temperature}, {SST} netcdf:hasDimension {longitude} Data Structures in RDF Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit Properties also can be related, since they are objects too

Search Interface Term pdfhttp://iri.columbia.edu/~benno/sampleterm. pdf

Virtual Triples Use Conventions to connect concepts to established sets of concepts Generate additional “virtual” triples from the original set and semantics RDFS – some property/class semantics OWL – additional property/class semantics: more sophisticated (ontological) relationships SWRL – rules for constructing virtual triples

OWL Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema However, there are many world views in how to express concepts: concepts as classes vs concepts as individuals vs concept as predicate

Define terms Attribute Ontology Object Ontology Term Ontology

Attribute Ontology Subjects are the only type-object Predicates are “attributes” Objects are datatype Isomorphic to simple data tables Isomorphic to netcdf attributes of datasets Some faceted browsers: predicate = facet

Object Ontology Objects are object-type Isomorphic to “belongs to” Isomorphic to multiple data tables connected by keys Express the concept behind netcdf attributes which name variables Concepts as objects can be cross-walked Concepts as object can be interrelated

Example: controlled vocabulary {variable} cfatt:standard_name {“string”} Where string has to belong to a list of possibilities. {variable} cfobj:standard_name {stdnam} Where stdnam is an individual of the class cfobj:StandardName

Example: controlled vocabulary Bi-direction crosswalk between the two is somewhat trivial, which means all my objects will have both cfatt:standard_name and cfobj:standard_name

Example: controlled vocabulary If I am writing software to read/write netcdf files, I use the cfatt ontology and in particular cfatt:standard_name If I am making connections/cross-walks to other variable naming standards, I use cfobj:standard_name

Term Ontology Concepts as individuals Simple Knowledge Organization System (SKOS) is a prime example The ontology used here is slightly different: facets are classes of terms rather than being top_concepts

Nuanced tagging Concepts as objects can be interrelated: specific terms imply broader terms Object ends up being tagging with terms ranging from general to specific. Search can then be nuanced tagging can proceed in absence of perfect information

Faceted Search Explicated

Search Interface Items (datasets/maps) Terms Facets Taxa

Search Interface Semantic API {item} dc:title dc:description rss:link iridl:icon dcterm:isPartOf {item2} dcterm:isReplacedBy {item2} {item} trm:isDescribedBy {term} {term} a {facet} of {taxa} of {trm:Term}, {facet} a {trm:Facet}, {taxa} a {trm:Taxa}, {term} trm:directlyImplies {term2}

Faceted Search w/Queries

RDF Architecture RDF Virtual (derived) RDF queries

Data Servers Ontologies MMI JPL Standards Organizations Start Point RDF Crawler RDFS Semantics Owl Semantics SWRL Rules SeRQL CONSTRUCT Search Queries Location Canonicalizer Time Canonicalizer Sesame Search Interface bibliography IRI RDF Architecture

Cast of Characters NC – netcdf data file format CF – Climate and Forecast metadata convention for netcdf SWEET - Semantic Web for Earth and Environmental Terminology (OWL Ontology) IRIDL – IRI Data Library

CF attributes SWEET Ontologies (OWL) Search Terms CF Standard Names (RDF object) IRIDL Terms NC basic attributes IRIDL attributes/objects SWEET as Terms CF Standard Names As Terms Gazetteer Terms CF data objects Location

Thoughts Pure RDF framework seems currently viable for a moderate collection of data Potential for making a lot of implicit data conventions explicit Explicit conventions can improve interoperability Simple RDF concepts can greatly impact searches

Future Work Possibilities More Usable Search Interface Tagging Interface that uses tag interrelationships to simplify choices Data Format translation using semantics “Related Object Browsing” given a dataset, find related data, papers, images Document/execute/create analysis trees Stovepipe conventions/bash-to-fit Less Monolithic IRI Data Library

Question: Canonical Objects Given a set of individuals related by owl:sameAs, semantics says predicates that point to one individual should point to all the individuals. But as a programmer I usually want to work with one (the canonical object).

Question: Concept Class Membership Seemingly the only way to use a concept which is expressed as an OWL class is rdf:type, i.e. membership. Should I really be putting datasets into conceptual classes? Shouldn’t there be more choice of relationship between a concept and a dataset?

Question: OWL/SKOS Crosswalk Given a concept in both OWL and Term (SKOS) frameworks, is it possible to crosswalk between them, in particular preserving the ability of Term to have different predicates with the item being described? I’ve tried dual-defined objects, not sure it works yet …

Stovepipe Conventions Fixed Schema Agreed upon metadata domain Agreed upon data domain Designed to be a partial solution General server software needs to decide whether data legitimately fits the standard User contemplates bash-to-fit

Overview IRI Data Collection Generalized Data Tools Specialized Data Tools Dataset Variable ivar multidimensional Data Viewer Data Language Maproom URL/URI for data, calculations, figs, etc

IRI Data Collection Dataset Variable ivar multidimensional Economics Public Health “geolocated by entity” GIS “geolocation by vector object or projection metadata” Ocean/Atm “geolocated by lat/lon” multidimensional spectral harmonics equal-area grids GRIB grid codes climate divisions IRI Data Collection

IRI Data Collection Dataset Variable ivar Servers OpenDAP THREDDS GRIB netCDF images binary Database Tables queries spreadsheets shapefiles images w/proj IRI Data Collection

IRI Data Collection Dataset Variable ivar Calculations “virtual variables” images graphics descriptive and navigational pages OpenGIS WMS v1.3 WCS Data Files netcdf binary Images GeoTiff Clients OpenDAP THREDDS Tables Servers OpenDAP THREDDS GRIB netCDF images binary Database Tables queries spreadsheets shapefiles images w/proj IRI Data Collection

IRI General Data Tools Data Page

IRI General Data Tools Data Viewer

IRI General Data Tools Cut and Paste

IRI Map Room

Malaria Early Warning System Front page illustrates most recent dekadal rainfall estimates (FEWS RFE) Change dates to view different time periods Administrative and epidemiological overlays available Click and drag box across map to zoom IRI Map Room