Published by Neal Hawkins. Modified over 9 years ago.
1
Dictionaries, Vocabularies, Namespaces, Thesauri, Ontologies, and all that
Rob Raskin
NASA/Jet Propulsion Laboratory
Raskin@jpl.nasa.gov
June 21, 2011
2
Why care about data semantics?
Current data may need to be archived for decades or centuries
Global change analysis requires consistent comparisons across decades or centuries
Synonyms: multiple words, same meaning
Homonyms: same word, multiple meanings
Measurement ambiguities
  Sea “surface” temperature: at what “height”?
3
Semantic Understanding is Difficult!
Let’s eat, Grandma. vs. Let’s eat Grandma.
Time flies like an arrow. vs. Fruit flies like a pie.
LA Times headline
“Mission accomplished. Major combat operations in Iraq have ended” -Pres. Bush, 2003
Variable t: temperature vs. Variable t: time
Data quality= 5 vs. Data quality= 3
Surface wind: measured 3 m above surface vs. Surface wind: measured at surface
4
Semantic Spectrum
Axes (from diagram): Semantics; Vocabulary to Ontology; Human-Readable to Machine-Readable
Catalog: List of controlled words
Informal Hierarchy: Terms classified by categories (e.g. GCMD)
Formal Hierarchy: Terms inherit properties/meaning of parent
Formal Hierarchy w/ Relations: Relations between children defined
5
Scope of Representation
Parameter names
Scientific units
Spatial/temporal extent/resolution
Data quality
Data provenance
Data type
Data services
CF
6
What is an Ontology?
An approach to store knowledge
Machine-readable and human-readable
Provides definitions of words or phrases, expressed relative to other terms
Offers shared understanding of concepts and knowledge reuse
Provides semantics for machine-to-machine (or human-to-human) communications
7
Practically, an ontology is a…
Framework for classifying knowledge
Ensures there is a “place” to store components of knowledge
8
Ontology Languages: RDF and OWL
W3C has adopted languages that specialize XML
  Resource Description Framework (RDF)
  Web Ontology Language (OWL)
Languages predefine specific tags
  RDF: Class, subclass, property, subproperty
Class-property model is similar to the Entity-Relationship model of DBMS theory
9
RDF Class and Subclass
Class
  The basic element or “thing” or “noun”
Subclass
  Inherits all attributes of parent class
  Typically, adds Properties to distinguish subclass from its parent
  Can have multiple parent classes
Example (from diagram): Cat [is a] Animal; Cat [has Legs] 4
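The subclass rules on this slide (inherit everything from the parent, add distinguishing properties, allow several parents) can be sketched in plain Python. This is only an illustration, not RDF itself: `Cat`, `Animal`, and `Legs` come from the slide, while `Pet`, `isAlive`, `hasOwner`, and both helper functions are hypothetical names introduced here.

```python
# Hypothetical class table: each class lists its parents and its own properties.
# "Cat", "Animal", and "Legs" come from the slide; everything else is made up.
CLASSES = {
    "Animal": {"parents": [], "properties": {"isAlive": True}},
    "Pet": {"parents": [], "properties": {"hasOwner": True}},
    "Cat": {"parents": ["Animal", "Pet"], "properties": {"Legs": 4}},
}


def inherited_properties(name, classes=CLASSES):
    """Collect properties from all ancestors, then apply the class's own."""
    merged = {}
    for parent in classes[name]["parents"]:
        merged.update(inherited_properties(parent, classes))
    merged.update(classes[name]["properties"])
    return merged


def is_a(name, ancestor, classes=CLASSES):
    """True if `name` is `ancestor` or descends from it through any parent."""
    if name == ancestor:
        return True
    return any(is_a(p, ancestor, classes) for p in classes[name]["parents"])


print(inherited_properties("Cat"))
print(is_a("Cat", "Animal"))
```

The subclass ends up with its parents' properties plus the one property (`Legs`) that distinguishes it.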
10
RDF Property & Subproperty
Property
  A “verb”
  Examples: measures, hasLocation, hasArea, northOf
  Properties can have attributes: domain, range, transitive, …
Subproperty
  Inherits parent attributes
11
OWL Language
Extends RDF to predefine further tags
  cardinality
  transitive relations
  inverse relations
  same as, different from
  union, intersection
  domain, range
  Import (from one ontology to another, to enable sharing and reuse of the work of others)
  …
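To show what declaring a relation transitive or inverse buys a reasoner, here is a small plain-Python sketch. It does not use OWL or any OWL library. The relation `northOf` appears among the deck's property examples; the inverse name `south_of`, the place names, and both functions are assumptions made for illustration.

```python
def transitive_closure(pairs):
    """Return every (a, c) pair implied by chaining a transitive relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def inverse(pairs):
    """Return the inverse relation: (a, b) becomes (b, a)."""
    return {(b, a) for a, b in pairs}


# Asserted facts for the transitive relation northOf (illustrative places).
north_of = {("Seattle", "Portland"), ("Portland", "Los Angeles")}

# Inferred facts: the chain Seattle -> Portland -> Los Angeles adds one pair.
all_north_of = transitive_closure(north_of)

# A hypothetical inverse relation, derived rather than asserted.
south_of = inverse(all_north_of)

print(sorted(all_north_of))
print(sorted(south_of))
```

Only two facts were stated, yet three `northOf` pairs and three inverse pairs are available to queries.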
12
OWL Ontology Example
13
Statements about Statements
OWL allows us to make statements about statements
  Degree of belief
  Timestamps
  Provenance / Lineage
  Probability / Uncertainty
  Security issues
  Author / Source / Community
  Community dialect
  …
Example (from diagram): the statement “Observed Feature [is a] Corn Crop” [has Probability] 0.75 and [has Source] Landsat
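A rough plain-Python sketch of a statement about a statement follows. It assumes the slide's diagram reads as "Observed Feature is a Corn Crop", annotated with probability 0.75 and source Landsat; that reading is an interpretation of the flattened diagram. The class names and the `believable` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """A plain subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


@dataclass
class AnnotatedStatement:
    """A statement plus statements made about it (probability, source, ...)."""
    statement: Statement
    annotations: dict = field(default_factory=dict)


# Assumed reading of the slide's diagram.
claim = AnnotatedStatement(
    Statement("ObservedFeature", "is_a", "CornCrop"),
    {"probability": 0.75, "source": "Landsat"},
)


def believable(annotated, threshold):
    """Accept a statement only if its annotated probability meets the threshold."""
    return annotated.annotations.get("probability", 0.0) >= threshold


print(believable(claim, 0.5))
print(claim.annotations["source"])
```

Keeping the annotations separate from the statement lets the same claim carry a timestamp, lineage, or author without changing the claim itself.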
14
Why are Ontologies Useful? (1)
Ontologies provide a common namespace
  Documents, web pages, data, people, and other resources can be mapped/categorized to this namespace
  Anybody can create or extend the namespace
15
Why are Ontologies Useful? (2)
Dictionary
  Concepts in the namespace not just “listed” (a taxonomy), but “defined” (in terms of others)
  Concepts defined via specializations of broader concepts, with properties that distinguish each child from the broader parent concept
    Reductionist approach of science
  Arbitrary levels of specialization are possible
    As with Library of Congress and Dewey Decimal numbering systems
16
Why are Ontologies Useful? (3)
Disambiguation
  Reduces semantic mismatch
  Synonym support (multiple terms with same meaning)
    Label available to indicate preferred term for each community
  Homonym support (multiple meanings of same term)
    Separate namespaces (President:Bush vs Plant:Bush)
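The synonym and homonym support described here can be sketched with two lookup tables. The `President:Bush` and `Plant:Bush` namespaces come from the slide. The one-line definitions, the sea surface temperature synonyms, the community names, and all function names are hypothetical, chosen only to make the example concrete.

```python
# Homonyms: the same term is kept apart by its namespace.
CONCEPTS = {
    ("President", "Bush"): "a person who held the office of president",
    ("Plant", "Bush"): "a woody plant smaller than a tree",
}

# Synonyms: several terms resolve to one concept identifier (made-up example).
SYNONYMS = {
    "sea surface temperature": "SeaSurfaceTemperature",
    "SST": "SeaSurfaceTemperature",
}

# Preferred label for a concept, per community (made-up communities).
PREFERRED_LABEL = {
    ("SeaSurfaceTemperature", "oceanography"): "SST",
    ("SeaSurfaceTemperature", "general"): "sea surface temperature",
}


def namespaces_for(term):
    """List every namespace in which a (possibly ambiguous) term is defined."""
    return sorted(ns for ns, name in CONCEPTS if name == term)


def resolve(term):
    """Map any synonym to its shared concept identifier, or None if unknown."""
    return SYNONYMS.get(term)


def preferred(concept, community):
    """Return the label a given community prefers for a concept."""
    return PREFERRED_LABEL[(concept, community)]


print(namespaces_for("Bush"))
print(resolve("SST") == resolve("sea surface temperature"))
print(preferred("SeaSurfaceTemperature", "oceanography"))
```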
17
Why are Ontologies Useful? (4)
Machine readable
  Ontologies are generally stored in a format (XML) that is readable by both humans and computers
  Computer accessibility enables automated reasoning
18
Why are Ontologies Useful? (5)
Knowledge retention
  Corporations use knowledge management to ensure institutional memory over time, as personnel come and go
  Climate disciplines can do the same!
  Facts/data can be represented and related in a consistent manner
  Common sense knowledge is captured
  Instrument characteristics
19
Ontology Representation (1): Knowledge Base of Triples
Noun-Verb-Noun representation
Parent-child relations:
  Flood is a Weather Phenomena
  GeoTIFF is a File Format
  Soil Type is a Physical Property
  Pacific Ocean is a Ocean
Or create your own relations:
  Ocean has substance Water
  Sensor measures Temperature
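A knowledge base of triples can be held as a list of tuples and queried by pattern. The sketch below encodes the slide's own triples; the identifier spellings (`is_a`, `has_substance`, `PacificOcean`, and so on) and the `match` and `lookup` functions are assumptions for illustration, not part of any ontology toolkit.

```python
# The slide's triples, written as (subject, predicate, object) tuples.
TRIPLES = [
    ("Flood", "is_a", "WeatherPhenomena"),
    ("GeoTIFF", "is_a", "FileFormat"),
    ("SoilType", "is_a", "PhysicalProperty"),
    ("PacificOcean", "is_a", "Ocean"),
    ("Ocean", "has_substance", "Water"),
    ("Sensor", "measures", "Temperature"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def lookup(triples, subject, predicate):
    """Objects for (subject, predicate), also searching is_a parents."""
    found = [o for _, _, o in match(triples, subject, predicate)]
    for _, _, parent in match(triples, subject, "is_a"):
        found.extend(lookup(triples, parent, predicate))
    return found


print(match(TRIPLES, predicate="measures"))
print(lookup(TRIPLES, "PacificOcean", "has_substance"))
```

The second query returns `Water` even though no triple says so directly: the answer is reached by following `PacificOcean is_a Ocean` and then `Ocean has_substance Water`.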
20
Ontology Representation (2): Visual
21
Ontology Representation (3): XML, RDF, and OWL
W3C has adopted XML-based standard ontology languages
  Resource Description Framework (RDF)
  Web Ontology Language (OWL)
Languages predefine specific tags
  RDF: Class, subclass, property, subproperty, …
  OWL: Extends RDF to predefine further tags such as cardinality
Three flavors of OWL (Lite, DL, and Full)
Use of standard languages makes it easy to extend (specialize) work of others
22
Global Warming Query in the Semantic Web
Find data that demonstrates global warming at high latitudes during summertime and plot warming rate.
Extract information from the use-case: encode knowledge
Translate this into a complete query for data: inference and integration of data from instruments, indices and models
  “Global warming” = Trend of increasing temperature over large spatial scales
  “High latitude” = |Latitude| > 60 degrees
  “Summertime” = June-Aug (NH) and Jan-Mar (SH)
  “Find data” = Locate datasets using catalogs, then access and read them
  “Plot warming rate” = Display temperature vs time
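The term definitions on this slide translate fairly directly into filters and a trend estimate. The sketch below uses the slide's thresholds (|latitude| > 60 degrees; June-Aug in the NH, Jan-Mar in the SH) and an ordinary least-squares slope as the "warming rate". The `Record` type, the function names, and the sample records are all hypothetical; the temperatures are synthetic values made up to exercise the code, not observations. Catalog search and plotting are left out.

```python
from dataclasses import dataclass


@dataclass
class Record:
    year: int
    month: int          # 1..12
    latitude: float     # degrees, negative in the Southern Hemisphere
    temperature: float  # arbitrary units for this sketch


NH_SUMMER = {6, 7, 8}  # June-Aug, as defined on the slide
SH_SUMMER = {1, 2, 3}  # Jan-Mar, as defined on the slide


def is_high_latitude(latitude):
    return abs(latitude) > 60.0


def is_summertime(month, latitude):
    return month in (NH_SUMMER if latitude >= 0 else SH_SUMMER)


def select(records):
    """Keep only high-latitude, summertime records."""
    return [
        r for r in records
        if is_high_latitude(r.latitude) and is_summertime(r.month, r.latitude)
    ]


def warming_rate(records):
    """Least-squares slope of temperature against year (units per year)."""
    xs = [r.year for r in records]
    ys = [r.temperature for r in records]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic records: kept series rise by 0.1 per year by construction.
records = []
for i, year in enumerate(range(2000, 2005)):
    records.append(Record(year, 7, 70.0, 5.0 + 0.1 * i))   # NH summer, kept
    records.append(Record(year, 1, 70.0, -20.0))            # NH winter, dropped
    records.append(Record(year, 7, 30.0, 25.0))             # low latitude, dropped
    records.append(Record(year, 2, -65.0, 5.0 + 0.1 * i))  # SH summer, kept

selected = select(records)
rate = warming_rate(selected)
print(len(selected), round(rate, 3))
```

Pooling both hemispheres into one regression is a simplification; a fuller treatment would likely compute trends per region before combining them.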
23
Semantic Web for Earth and Environmental Terminology (SWEET)
Concept space written in OWL
Initial focus to assist search for data resources
  Funded by NASA
Later focus to serve as community standard (upper-level Earth system science ontology)
Enables scalable classification of Earth system science and associated data concepts
Specialists can further refine SWEET concepts
SWEET 2.2 has 6600 concepts in 200 modular ontologies
http://sweet.jpl.nasa.gov
24
SWEET Top-Level View
25
CF vs SWEET Representation
CF (traditional single-attribute parameter name):
  tendency_of_mole_concentration_of_dissolved_inorganic_phosphorus_in_sea_water_due_to_biological_processes
SWEET (multi-attribute parameter name):
  Quantity = mole_concentration
  Transformation = tendency
  State = dissolved, inorganic
  Substance = phosphorus
  Medium = sea_water
  Process = biological_processes
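The contrast between a single-attribute name and a multi-attribute description can be sketched as a record with one field per facet. The facet names and values follow the slide (with the substance spelled as in the CF name). The `Parameter` class and `to_cf_style_name` are hypothetical, and the joining rule is only the pattern visible in this one example, not the CF convention's general naming grammar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    """One field per facet, following the slide's SWEET-style breakdown."""
    quantity: str
    transformation: str
    state: tuple
    substance: str
    medium: str
    process: str


def to_cf_style_name(p):
    """Join the facets in the order seen in the slide's CF example (assumed rule)."""
    state = "_".join(p.state)
    return (
        f"{p.transformation}_of_{p.quantity}_of_{state}_{p.substance}"
        f"_in_{p.medium}_due_to_{p.process}"
    )


param = Parameter(
    quantity="mole_concentration",
    transformation="tendency",
    state=("dissolved", "inorganic"),
    substance="phosphorus",  # spelled as in the CF name
    medium="sea_water",
    process="biological_processes",
)

print(to_cf_style_name(param))
print(param.medium == "sea_water")  # facets can be queried individually
```

With separate facets, a search such as "everything measured in sea water" becomes a field comparison instead of a substring match on a long name.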
26
SWEET Data Ontology
Dataset characteristics
  Format, data model, dimensions, …
Provenance
  Source, processing history, …
Parameters
  Scale factors, offsets, …
Data services
  Subsetting, reprojection, …
Quality measures
Special values
  Missing, land, sea, ice, ...
27
Best Practices
Keep ontologies small, modular
Use higher level ontologies where possible
Identify hierarchy of concept spaces
  Try to keep dependencies unidirectional
Gain community buy-in
  Involve respected leaders
28
Ontology Development Tools: CMAP
Free, downloadable tool for knowledge representation and ontology development
Visual language with import/export to OWL
Supports subset of OWL language
http://cmap.ihmc.us/coe
29
Resources
ESIP Semantic Web Cluster
  Monthly telecons
  Tutorials
  Ontology development
    Datatypes
    Data services
SWEET
  http://sweet.jpl.nasa.gov
Rob Raskin
  raskin@jpl.nasa.gov