Page 1
Drexel University, College of Engineering
ACHIEVING SEMANTIC INTEROPERABILITY WITH HYDROLOGIC ONTOLOGIES FOR THE WEB
6th International Conference on HydroScience and Engineering
Michael Piasecki, Luis Bermudez
Page 2
Content
- Overview of metadata
- Metadata interoperability problems
- A possible solution: Hydrologic Ontologies for the Web
Page 3
Metadata
- Answers the what, when, where, how, who, and why of the described data.
- Helps to discover, access, evaluate, and use data.
- Example: Creator: USGS; Keyword: Gage Height
Page 4
Hydrologic Information Communities (HIC) need a metadata agreement
- Which descriptors can be used? Keyword or topic? Author or creator?
- Which values are possible? Gage height or water elevation?
- Is there any metadata agreement available to describe hydrologic data?
Page 5
Metadata specifications related to hydrology
- ISO-19115:2003
- FGDC-STD
- Ecological Metadata Language
- Geography Markup Language
- USGS Hydrologic Markup Language
- Earth Science Markup Language
- Dublin Core Metadata Initiative
Page 6
What is the problem with these?
Problem 1: Metadata specifications lack domain-specific elements. For example:
- They do not say whether area and outlet location should be defined when a watershed is described.
- They do not include a list of the possible stations and variables related to surface water collected by a particular HIC.
Page 7
HIC A creates an HTML form to collect metadata
- Specification elements behind the form: EX_GeographicIdentifier, geographicIdentifier, MD_Identifier, code, …; Descriptive Keywords, MD_Keywords, keyword; Citation, …
- [Diagram: free-text entries such as "#24", "#23", "#34", "#56", "W", "Water elev.", and "Stage height" are not consistent.]
Page 8
Need to incorporate domain vocabulary to get consistent metadata
- Specification elements behind the form: EX_GeographicIdentifier, geographicIdentifier, MD_Identifier, code, …; Descriptive Keywords, MD_Keywords, keyword; Citation, …
- [Diagram: entries drawn from a vocabulary ("#23", "#34", "#56"; "discharge", "stage height") are consistent.]
Page 9
Problem 2: Metadata standards do not solve semantic heterogeneities
- Metadata (ISO) about data set X: keyword = Stage Height, thesaurusName = GCMD
- Metadata (FGDC) about data set Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
- A metadata repository search for "Stage Height" finds only data set X and not data set Y.
Page 10
Possible solutions to our problems
How to incorporate domain vocabulary in metadata specifications?
- Create a new metadata specification.
- Rewrite a previous one and extend it.
- Hardcode semantics into the application.
- Dynamic extension with ontologies.
Page 11
Extending metadata specifications to meet the specific needs of a HIC
- Express metadata specifications and vocabularies in ontologies.
- Use the knowledge inference capabilities of ontologies to link the metadata elements with selected vocabulary terms.
Page 12
Ontologies: specification of conceptualizations
Example:
1. Properties of real-world objects are identified.
2. Similarities are identified.
3. Concepts are created
4. and are expressed as a class.
5. Classes are related.
[Diagram: class Body of Water with subclasses River and Lake; properties shown: has water, is inland body, has a defined channel.]
Page 13
Web Ontology Language: OWL
- W3C Recommendation since 02/2004
- [Diagram: Body of Water with River and Lake, written in OWL as the classes Body_of_Water, River, and Lake.]
Page 14
Metadata specs expressed in ontologies
- Classes: MD_Metadata, MD_Identification
- Datatype properties: fileIdentifier[0..1] : CharacterString, language[0..1] : CharacterString, abstract : CharacterString, …
- Object properties: identificationInfo (1..*), linking MD_Metadata to MD_Identification
Page 15
- Class: Hydrologic Unit
- Subclasses: Region, Subregion, Accounting Unit, Cataloging Unit
- Instances: Mid Atlantic, Delaware, Lower Delaware, Schuylkill, linked by "is part of"
- "Is part of" is transitive, so further isPartOf relations can be inferred.
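The transitive isPartOf inference on this slide can be sketched in a few lines of Python. This is only an illustration, not the reasoner used in the talk: the direct links assume the reading Schuylkill → Lower Delaware → Delaware → Mid Atlantic, and the names `IS_PART_OF` and `infer_is_part_of` are hypothetical.

```python
# Direct isPartOf assertions, assuming the chain shown on the slide
# (cataloging unit -> accounting unit -> subregion -> region).
IS_PART_OF = {
    "Schuylkill": "Lower Delaware",
    "Lower Delaware": "Delaware",
    "Delaware": "Mid Atlantic",
}


def infer_is_part_of(unit, direct=IS_PART_OF):
    """Return every unit that `unit` is part of, following isPartOf transitively."""
    ancestors = []
    seen = {unit}  # guards against accidental cycles in the asserted links
    current = direct.get(unit)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = direct.get(current)
    return ancestors


if __name__ == "__main__":
    print(infer_is_part_of("Schuylkill"))
```

Only three links are asserted, yet the function also reports that Schuylkill is part of Delaware and of Mid Atlantic, which is the kind of fact an OWL reasoner derives from a transitive property.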
Page 16
More about knowledge inference
- <owl:Class rdf:ID="W-Station">: type of station that has property isPartOf = W
- [Diagram: stations A, B, C and units W, Y]
- How to infer the stations that are only in W?
- A program can infer: W-Stations = A, B
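As a rough sketch of the inference described here, the class W-Station can be treated as "every station whose isPartOf value is W". The station-to-unit facts below are assumed from the slide's result (A and B in W, C in Y), and `STATION_IS_PART_OF` and `members_of` are hypothetical names.

```python
# Asserted facts: which unit each station is part of (assumed from the slide).
STATION_IS_PART_OF = {"A": "W", "B": "W", "C": "Y"}


def members_of(unit, facts=STATION_IS_PART_OF):
    """Infer the members of the defined class '<unit>-Station'."""
    return sorted(station for station, part_of in facts.items() if part_of == unit)


if __name__ == "__main__":
    print(members_of("W"))
```

The class membership is never listed by hand; it is derived from the property values, so adding a new station to W changes the inferred set without touching the class definition.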
Page 17
Dynamic extension with ontologies
- Metadata specification: MD_Identifier (+ code : CharacterString, …)
- Extension: MD_Identifier_Extension (+ code : CharacterString, …) with Restriction onProperty: code, allValuesFrom: W-station
- Domain vocabulary: W-station (isPartOf = W); stations A, B, C; units W, Y
- E.g., restrict the descriptor code to only have W-station values. A program could infer code = A, B and build a dynamic HTML form using the extension.
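One way to picture the dynamic HTML form is a drop-down whose options are exactly the values the restriction allows for `code`. The sketch below assumes the allowed values (A, B) have already been inferred; `code_select` and `is_valid_code` are hypothetical helpers, not part of any metadata toolkit.

```python
from html import escape


def code_select(allowed_codes, field_name="code"):
    """Render an HTML <select> that offers only the allowed code values."""
    options = "\n".join(
        f'  <option value="{escape(code, quote=True)}">{escape(code)}</option>'
        for code in allowed_codes
    )
    return f'<select name="{escape(field_name, quote=True)}">\n{options}\n</select>'


def is_valid_code(value, allowed_codes):
    """Check a submitted value against the same restriction on the server side."""
    return value in allowed_codes


if __name__ == "__main__":
    w_stations = ["A", "B"]  # assumed result of the W-station inference
    print(code_select(w_stations))
```

Because the option list comes from the vocabulary rather than from the form's source code, the form changes when the ontology changes.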
Page 18
Ontologies provide means to resolve semantic heterogeneities
Page 19
Use of ontologies to map metadata specifications
- <owl:equivalentClass rdf:resource="&fgdc;Keywords"/>
- <owl:equivalentProperty rdf:resource="&fgdc;title"/>
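The effect of such equivalence statements can be imitated with a simple lookup table. The sketch below is an assumption-laden illustration: it pairs the FGDC and ISO element names used in the deck's data set X and Y example (Theme_Keyword with keyword, Theme_Keyword_Thesaurus with thesaurusName), and `FGDC_TO_ISO` and `fgdc_to_iso` are hypothetical names.

```python
# Element equivalences between the two specifications (assumed pairing).
FGDC_TO_ISO = {
    "Theme_Keyword": "keyword",
    "Theme_Keyword_Thesaurus": "thesaurusName",
}


def fgdc_to_iso(record):
    """Rename FGDC elements to their ISO equivalents; unknown elements pass through."""
    return {FGDC_TO_ISO.get(name, name): value for name, value in record.items()}


if __name__ == "__main__":
    fgdc_record = {"Theme_Keyword": "Gage Height", "Theme_Keyword_Thesaurus": "USGS"}
    print(fgdc_to_iso(fgdc_record))
```

In an ontology the same pairing is declared once with owl:equivalentProperty, and any reasoner-aware application can use it instead of carrying its own table.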
Page 20
Use of ontologies to solve semantic heterogeneities among different domain vocabularies
- <owl:differentFrom rdf:resource="&events;Stage_Height"/>
Page 21
Semantic interoperability
- Metadata (ISO) about data set X: keyword = Stage Height, thesaurusName = GCMD
- Metadata (FGDC) about data set Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
- Mapper between metadata specifications: FGDC, ISO
- Mapper between hydrologic vocabularies: USGS, GCMD
- E.g., a metadata repository search for "Stage Height" finds data sets X and Y.
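A minimal sketch of the two mappers working together, contrasted with a literal keyword match. It assumes, as the slide's result implies, that USGS "Gage Height" and GCMD "Stage Height" denote the same concept; the concept label `stage_height` and all function names are hypothetical.

```python
# Vocabulary mapper: (thesaurus, term) -> shared concept (assumed equivalence).
SAME_CONCEPT = {
    ("USGS", "Gage Height"): "stage_height",
    ("GCMD", "Stage Height"): "stage_height",
}

# Specification mapper: which elements hold the keyword and thesaurus in each spec.
FIELDS = {
    "ISO": ("keyword", "thesaurusName"),
    "FGDC": ("Theme_Keyword", "Theme_Keyword_Thesaurus"),
}

RECORDS = [
    {"dataset": "X", "spec": "ISO", "keyword": "Stage Height", "thesaurusName": "GCMD"},
    {"dataset": "Y", "spec": "FGDC", "Theme_Keyword": "Gage Height",
     "Theme_Keyword_Thesaurus": "USGS"},
]


def literal_search(term, records):
    """Match the search term against the keyword text only."""
    return [r["dataset"] for r in records
            if term in (r.get("keyword"), r.get("Theme_Keyword"))]


def concept_of(record):
    """Resolve a record's keyword to a shared concept using both mappers."""
    keyword_field, thesaurus_field = FIELDS[record["spec"]]
    return SAME_CONCEPT.get((record[thesaurus_field], record[keyword_field]))


def semantic_search(term, thesaurus, records):
    """Find every record whose keyword denotes the same concept as the query."""
    target = SAME_CONCEPT.get((thesaurus, term))
    if target is None:
        return []
    return [r["dataset"] for r in records if concept_of(r) == target]


if __name__ == "__main__":
    print(literal_search("Stage Height", RECORDS))
    print(semantic_search("Stage Height", "GCMD", RECORDS))
```

The literal search reproduces Problem 2 (only data set X is found), while the mapped search returns both data sets.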
Page 22
Why is XML Schema not good enough?
Page 23
XML Schema cannot express semantics
E.g., defining that a watershed has only one outlet location and only one unique identifier:
<xsd:element name="outletLoc" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
<xsd:element name="id" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
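The cardinality rule on this slide (exactly one outletLoc and exactly one id) is a purely structural check, which the following sketch reproduces with Python's standard XML parser. The wrapper element name `watershed` and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def has_one_outlet_and_id(watershed_xml):
    """Structural check: exactly one <outletLoc> and one <id> child element."""
    root = ET.fromstring(watershed_xml)
    return len(root.findall("outletLoc")) == 1 and len(root.findall("id")) == 1


if __name__ == "__main__":
    ok = "<watershed><id>567</id><outletLoc>101</outletLoc></watershed>"
    bad = ("<watershed><id>567</id><outletLoc>101</outletLoc>"
           "<outletLoc>102</outletLoc></watershed>")
    print(has_one_outlet_and_id(ok), has_one_outlet_and_id(bad))
```

A check like this says nothing about what the values mean, for example whether the outlet location actually belongs to the watershed that carries the identifier.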
Page 24
XML Schema cannot express semantics
- [Example document residue: … 567 X 101 … 838 X 101 …]
- The XML document is valid, but semantically the entries are not correct (567 <> 838).
- XML Schema is good for validating the structure of a document, but not its semantics.
Page 25
Hydrologic ontologies will help to:
- Extend standards
- Solve semantic heterogeneities
- Interoperate between systems, e.g., find a numerical model and data to compute runoff for a specific location with a specific resolution.
System engineering benefits:
- Efforts are not duplicated because the conceptual models can be reused and shared.
- Semantics do not need to be hard-coded in computer programs.
Page 26
Acknowledgements
- Drexel Team (Luis Bermudez, Saiful Islam, Bora Beran)
- Stephane Fellah (member, ISO TC 211 Canada team): will submit in OWL to ISO as a draft document
- NOPP NAG (Web-based dissemination portal)
- NSF GEO Directorate grant from the EAR division to create Hydrologic Metadata for CUAHSI, prototype Hydrologic Information System (HIS), in the Neuse River Basin
- Discussion lists: Protégé, Jena, W3C