Presentation is loading. Please wait.

Presentation is loading. Please wait.

Information Fusion: Moving from domain independent to domain literate approaches Professor Deborah L. McGuinness Tetherless World Constellation, Rensselaer.

Similar presentations


Presentation on theme: "Information Fusion: Moving from domain independent to domain literate approaches Professor Deborah L. McGuinness Tetherless World Constellation, Rensselaer."— Presentation transcript:

1 Information Fusion: Moving from domain independent to domain literate approaches Professor Deborah L. McGuinness Tetherless World Constellation, Rensselaer Polytechnic Institute Troy, NY USA AGU 2008 Fall Meeting 15–19 December 2008, San Francisco, California

2 Previous (CS) Work General toolkit work exists to support many aspects of analysis and evolution of knowledge encodings Issues diagnosis and support for: –Collaboration (with distributed teams) –Diverse training levels –Interconnectivity with many systems/standards –Scale –Ontology mapping and merging Ontology (schema) diagnostics Instance level registration and analysis

3 Approaches Review structure and encoding of schema for logical and possible problems Review structure and encoding of ground data for logical and possible problems Review (and automatically or semi- automatically gather) existing ontologies and data Incorporate domain knowledge into analysis and mapping/merging process Expose selected resources to help domain experts find, encode, and analyze

4 Chimaera: An Ontology Evolution Environment An interactive web-based tool aimed at supporting: Merging (later mapping) of ontological terms from varied sources Diagnosis of coverage and correctness of ontologies Maintaining ontologies over time Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language Built by computer scientists uses domain independent approach. Has been extended to leverage selected portions of domain dependent info. (www.ksl.stanford.edu/software/chimaera)www.ksl.stanford.edu/software/chimaera

5 The Analysis Task Review KBs that: – Were developed using differing standards – May be syntactically but not semantically validated – May use differing modeling representations Produce KB logs (in interactive environments) –Identify provable problems –Suggest possible problems in style and/or modeling –Are extensible by being user programmable

6 Loads in logic encoding: Integrity/ logical checks (numbers outside ranges, missing values, values of wrong type, “Bad” form checks cycles in structure referenced but not defined, redundant super classes …

7 The KB Merging -> Mapping Task Work with Knowledge bases that: – Were developed independently by multiple authors – Express overlapping knowledge in a common domain – Use differing representations and vocabularies Produce merged KB with –Non-redundant –Coherent –Unified vocabulary, content, and representation Later emphasis (by this and other work) on creating mapping relationships rather than merging commands

8

9 Next Generation Update to current languages (OWL, SPARQL, …) Leverage general (domain independent) foundations Focus more on instance level data (studies show more than 90% of RDF data is instance data) Focus more on mapping rather than merging

10 Semantic e-Science Data Evaluation

11

12 Ontology-Enabled Data Registration: SEDRE (Sinha Rezgui): a system that enables scientists to semantically register data sets for optimal querying and semantic integration SEDRE enables mapping of heterogeneous data to concepts in domain ontologies Uses an ontology for the registration procedure

13 13 How to find the data? Include background knowledge of the form data providers typically do. One example from using terms and relationships from volcano chemistry, atmospheric chemistry, thermal profiles, solar irradiance data, etc.

14 14 Registration of Volcanic Data SO 2 Emission from Kilauea east rift zone - vehicle-based (Source: HVO) Abreviations: t/d=metric tonne (1000 kg)/day, SD=standard deviation, WS=wind speed, WD=wind direction east of true north, N=number of traverses Location Codes: U - Above the 180° turn at Holei Pali (upper Chain of Craters Road) L - Below Holei Pali (lower Chain of Craters Road) UL - Individual traverses were made both above and below the 180° turn at Holei Pali H - Highway 11

15 15 Registering Volcanic Data (2) Uses background knowledge to provide connections – for example by linking volcanoes to their lat long location. Exposes chemistry information in a typical structured form

16 Directions Include background ontologies in domain areas Focus on instance data and schema data (e.g. TW-OIE) Add domain-specific checks that should be performed Update interfaces to aim at broader (not just CS) audience Integrate more with existing (often domain-specific) environments (e.g., SEDRE) Focus on known issues such as unit conversion support Leverage extensive acronym expansion options (e.g., Chimaera’s extension to include other vocabularies) Smart search (e.g. Noesis) Use results of learners /crawlers Scale

17

18

19

20


Download ppt "Information Fusion: Moving from domain independent to domain literate approaches Professor Deborah L. McGuinness Tetherless World Constellation, Rensselaer."

Similar presentations


Ads by Google