Transport and Access of Data, Metadata, and Semantics using RDF
M. Benno Blumenthal
International Research Institute for Climate and Society, Columbia University
http://iri.columbia.edu/~benno/Notes/DataLibrary/RDFDAP/
RDF: framework for writing connections
  Subject, Predicate (a.k.a. property), Object
  { SST a cfatt:non_coordinate_variable,
    SST cfatt:standard_name cf:sea_surface_temperature,
    SST netcdf:hasDimension longitude }
  URIs identify everything
  RDF/OWL has mostly been touted for controlled vocabularies like cf:sea_surface_temperature. That use is important: rightly or wrongly, we put a lot of semantics in the variable names.
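As a minimal sketch of the subject/predicate/object idea, the three statements on this slide can be held as plain Python tuples and queried by pattern. The tuple representation and the `objects` helper are illustrative assumptions, not part of any RDF library; `a` in the slide is the usual shorthand for `rdf:type`.

```python
# Each triple is (subject, predicate, object), using the prefixed names from the slide.
# Storing them as tuples in a set is an assumption made for illustration only.
triples = {
    ("SST", "rdf:type", "cfatt:non_coordinate_variable"),
    ("SST", "cfatt:standard_name", "cf:sea_surface_temperature"),
    ("SST", "netcdf:hasDimension", "longitude"),
}


def objects(graph, subject, predicate):
    """Return the sorted objects of every triple matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


standard_names = objects(triples, "SST", "cfatt:standard_name")
dimensions = objects(triples, "SST", "netcdf:hasDimension")
```

Because every statement has the same three-part shape, the same lookup works for vocabulary terms and for structural facts such as dimensions.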
OWL: writing ontologies
  Standard properties for relating (in RDF)
    Classes
    data-type properties
    object-type properties
  Can describe both data and metadata conventions in the same framework (RDF)
  Machine-readable convention statement
  Can interrelate metadata conventions
  Lets us explicate the implicit in our codes
RDF/OWL lets us address a number of current issues
  Imprecise semantics
  Implicit semantics/relationships
  Convention-conforming or bust
  Code Isolation
  Local Semantics
Imprecise Semantics
  Attributes not necessarily attached to a convention are machine-interpretable if there is one all-encompassing convention that gives them meaning.
  A Conventions attribute with multiple values works as long as no two conventions use the same name, but it requires the parser to know the conventions so that attributes can be properly assigned.
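The assignment problem described above can be sketched as follows. The registry `KNOWN_CONVENTIONS`, the convention name `EXAMPLE-1.0`, and the attribute names `station_id` and `my_local_note` are hypothetical; the CF entry is a small illustrative subset, not the full convention.

```python
# Hypothetical registry: which attribute names each convention gives meaning to.
KNOWN_CONVENTIONS = {
    "CF-1.0": {"standard_name", "units"},      # illustrative subset only
    "EXAMPLE-1.0": {"units", "station_id"},    # made-up convention
}


def assign_attributes(conventions_attr, attribute_names):
    """Map each attribute name to the declared conventions that define it."""
    declared = conventions_attr.replace(",", " ").split()
    assignment = {}
    for name in attribute_names:
        assignment[name] = [
            conv for conv in declared
            if name in KNOWN_CONVENTIONS.get(conv, set())
        ]
    return assignment


result = assign_attributes(
    "CF-1.0, EXAMPLE-1.0",
    ["standard_name", "units", "my_local_note"],
)
```

In this sketch `units` is claimed by two conventions (a name collision) and `my_local_note` by none, which are the two failure cases the slide points at. A parser that does not carry the registry cannot make the assignment at all.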
Implicit Semantics
  NetCDF: same name implies same meaning. This becomes a translation problem when converting to a scheme that explicitly represents such connections.
  Implicit means you don’t know for sure that the author intended the implication, e.g. string dimensions.
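A minimal sketch of that translation step, assuming the common NetCDF reading that a variable sharing a dimension's name is that dimension's coordinate variable. The `var:` and `dim:` prefixes and the predicate `ex:coordinateVariableFor` are hypothetical names introduced for illustration; `netcdf:hasDimension` is the predicate used earlier in these slides.

```python
def explicit_triples(dimensions, variables):
    """Turn name-matching conventions into explicit (subject, predicate, object) triples.

    dimensions: list of dimension names
    variables:  dict mapping variable name -> list of dimension names it uses
    """
    triples = []
    for var, dims in variables.items():
        for dim in dims:
            triples.append((f"var:{var}", "netcdf:hasDimension", f"dim:{dim}"))
        if var in dimensions:
            # Implicit rule made explicit: same name is taken to mean
            # "coordinate variable for that dimension". The author may
            # not have intended this, which is the risk the slide names.
            triples.append((f"var:{var}", "ex:coordinateVariableFor", f"dim:{var}"))
    return triples


dataset_variables = {
    "longitude": ["longitude"],
    "SST": ["longitude"],
}
result = explicit_triples(["longitude"], dataset_variables)
```

Once the variable and the dimension have separate identifiers, the relationship has to be stated as its own triple, so the translator is forced to decide whether the name match was meaningful.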
Convention or Bust
  Schemes with fully-determined metadata cannot transport anything less, e.g. WCS, GRIB, …
  Clients that insist on a particular convention have the same problem even with a general transport mechanism.
  This problem can be hidden inside a library, e.g. Java OpenDAP.
Code Isolation
  Changes in metadata force data transport schema changes
Local Semantics
  Even without mapping to standard metadata, we need to express the relationships between variables in a dataset.
  Example: additional non-standard information (like statistical moments) about variables that have standard metadata
  Particularly important with evolving standards, merging fields of study
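One way to picture the statistical-moments example is with triples whose predicates live in a dataset-local namespace. The variable names `SST_variance` and `SST_skewness`, the `local:` prefix, and the `related_to` helper are all hypothetical; only the `cfatt:standard_name` statement comes from the earlier slide.

```python
# Standard metadata and local, non-standard relationships side by side.
graph = [
    ("SST", "cfatt:standard_name", "cf:sea_surface_temperature"),
    ("SST_variance", "local:varianceOf", "SST"),    # hypothetical local predicate
    ("SST_skewness", "local:skewnessOf", "SST"),    # hypothetical local predicate
]


def related_to(graph, target, prefix="local:"):
    """Return {subject: predicate} for local-namespace triples pointing at target."""
    return {s: p for s, p, o in graph if o == target and p.startswith(prefix)}


moments = related_to(graph, "SST")
```

The local predicates need no agreement from any standards body, yet a client that knows nothing about them can still discover that the extra variables are connected to `SST`.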
OpenDAP, DAP4, and RDF
http://iri.columbia.edu/~benno/Notes/DataLibrary/RDFDAP/