Published by Kenneth Shepherd. Modified over 6 years ago.
1
Department of Civil, Architectural & Environmental Engineering
HYDROSEEK and HYDROTAGGER
A Search Engine for Hydrologists
GIS in Water Resources Lecture
M. Piasecki
November, 2007
9/15/2018
2
Lecture
Demo of HydroSeek
What are the search criteria?
Functionality of the Engine Interface
Data Sources
Common Sources
Common Problems (Completeness, Syntax, Semantics)
Ontologies
Ontology details
Concept-to-data variable tagging
Architecture
Flow Chart
Technologies used
Demo of HydroTagger
Why the Tagging?
Technologies
4
HIS Goals
Hydrologic Data Access System: better access to a large volume of high-quality hydrologic data
Support for Observatories: synthesizing hydrologic data for a region
Advancement of Hydrologic Science: data modeling and advanced analysis
Hydrologic Education: better data in the classroom, basin-focused teaching
5
Objective
Search multiple heterogeneous data sources simultaneously, regardless of semantic or structural differences between them.
What we are doing now …
[Diagram: separate request/return exchanges with NWIS, NAWQA, NAM-12, and NARR]
6
What we would like to do …
[Diagram: one generic request goes to a Semantic Mediator, which issues GetValues calls to NWIS, NAWQA, NARR, and HODM]
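The mediator idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HydroSeek implementation: the `SemanticMediator` class, the adapter functions, and the concept-to-source mappings are all hypothetical, and the stub adapters stand in for real web service clients.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Observation:
    source: str
    variable: str  # the source's own name for the variable
    value: float


# Hypothetical adapter type: takes a source's native variable name and
# returns that source's observations.
Adapter = Callable[[str], List[Observation]]


class SemanticMediator:
    """Turn one generic request into one GetValues-style call per source."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Adapter] = {}
        # concept -> {source: native variable name}; contents are invented
        self._mappings: Dict[str, Dict[str, str]] = {}

    def register(self, source: str, adapter: Adapter) -> None:
        self._adapters[source] = adapter

    def map_concept(self, concept: str, source: str, native_name: str) -> None:
        self._mappings.setdefault(concept, {})[source] = native_name

    def get_values(self, concept: str) -> List[Observation]:
        results: List[Observation] = []
        for source, native_name in self._mappings.get(concept, {}).items():
            adapter = self._adapters.get(source)
            if adapter is not None:
                results.extend(adapter(native_name))
        return results


def make_stub_adapter(source: str, value: float) -> Adapter:
    """Stand-in for a real service client; returns one fixed observation."""
    def adapter(native_name: str) -> List[Observation]:
        return [Observation(source, native_name, value)]
    return adapter


mediator = SemanticMediator()
mediator.register("NWIS", make_stub_adapter("NWIS", 1.0))
mediator.register("NAWQA", make_stub_adapter("NAWQA", 2.0))
# Which source uses which name is invented here for illustration.
mediator.map_concept("water level", "NWIS", "Elevation, water surface")
mediator.map_concept("water level", "NAWQA", "stage height")
observations = mediator.get_values("water level")
```

The caller only names the concept; the per-source vocabulary stays inside the mapping table, which is the role the slide assigns to the mediator.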
7
Data sources …
USGS, EPA, CIMS, TCEQ, NADP
8
Spatial Coverage
STORET has 758 sites in Texas; TCEQ has 8,407.
STORET has 47,602 sites in Florida; NWIS has 27,906.
NWIS has 121,545 sites in Minnesota; STORET has 22,260.
9
Data Availability (figure)
10
Temporal Coverage
Nitrogen (figure)
11
Interface Problem
NWIS: ~175 form elements on a single page
STORET + NWIS + TCEQ + CIMS = ???
A drop-down menu: ∞
String search across the parameter list? How about synonyms?
‘Elevation, water surface’ vs. ‘stage height’
12
Completeness Problem: Metadata Catalog
Better query performance
Freedom
Fewer errors

Availability of geographic identifiers for stations in EPA STORET
Total number of sites                  274,918
Sites with geographic coordinates      274,435
Sites with State/County information    273,113
Sites with Hydrologic Unit Codes       128,646
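To make the completeness gap concrete, the counts above can be turned into coverage percentages. The short sketch below uses only the numbers from the slide; the function and variable names are made up for illustration.

```python
TOTAL_SITES = 274_918

# Site counts per geographic identifier, as given on the slide.
sites_with = {
    "geographic coordinates": 274_435,
    "State/County information": 273_113,
    "Hydrologic Unit Codes": 128_646,
}


def coverage_percent(count: int, total: int = TOTAL_SITES) -> float:
    """Share of sites carrying an identifier, in percent, to one decimal."""
    return round(100.0 * count / total, 1)


coverage = {name: coverage_percent(n) for name, n in sites_with.items()}
```

Coordinates and State/County information are nearly complete (about 99.8% and 99.3%), while fewer than half of the sites (about 46.8%) carry a Hydrologic Unit Code.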
13
Heterogeneity Problem
Syntax
E.g. date & time formats, Gregorian versus Julian
Data format/structure
E.g. XML, HTML, tab/tilde/comma separated text, gzipped tarballs…
Semantics
more …
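A minimal sketch of the syntactic side of the problem: normalizing dates that arrive in different textual formats. It assumes that "Julian" here means a year plus day-of-year (for example `2006225`), which is a common usage in hydrologic data but is not stated on the slide; the list of formats is likewise an assumption.

```python
from datetime import date, datetime

# Candidate formats, tried in order. "%Y%j" is year + day-of-year.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%j")


def parse_date(text: str) -> date:
    """Return a calendar date for any of the supported textual formats."""
    cleaned = text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")
```

With these formats, `2006-08-13`, `08/13/2006`, and `2006225` all resolve to the same calendar day.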
14
Issues with Semantics
Hyponymy
Parameter “Groundwater level”, “Stream stage”, “Reservoir level” versus “Water level”
Pseudo-hyponymy due to lack of metadata
Parameter “Manganese, 6N hydrochloric acid extracted, recoverable, dry weight, milligrams per kilogram” versus “Manganese, milligrams per kilogram”
Synonymy
‘Total Kjeldahl Nitrogen’ vs. ‘Ammonia+Organic Nitrogen’
15
Search Strategy
Search, Fine tune, Retrieve rather than Search, Retrieve
Avoid ‘high precision, low recall’ and ‘low precision, high recall’ problems.
16
Layered Ontology Model
17
Core, Navigation, Compound (figure labels)
18
Knowledge Base
Supports classification of search results
Entities in the ontology are associated with measured variables in a relational database
Helps solve semantic heterogeneity issues between data repositories
OWL Ontologies
‘Escherichia coli’ = ‘E. coli’
‘E. coli’ is-a ‘Indicator Organism’
‘Copper’ is-a ‘Micronutrient’
‘Copper’ isMeasuredIn ‘Medium’
‘Medium’ = {Water, Soil…}
‘Micronutrient’ is-a ‘Nutrient’
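The relations listed on this slide can be mimicked with plain dictionaries to show how synonym resolution and is-a reasoning help classify search results. This is a toy sketch, not OWL and not the HydroSeek knowledge base; the function names are hypothetical, and only the facts shown on the slide are encoded.

```python
from typing import Dict, List

# Equivalences and is-a links taken from the slide.
SYNONYMS: Dict[str, str] = {"Escherichia coli": "E. coli"}
IS_A: Dict[str, str] = {
    "E. coli": "Indicator Organism",
    "Copper": "Micronutrient",
    "Micronutrient": "Nutrient",
}


def canonical(term: str) -> str:
    """Map a synonym to its preferred label; other terms pass through."""
    return SYNONYMS.get(term, term)


def ancestors(term: str) -> List[str]:
    """Follow is-a links upward, nearest concept first."""
    chain: List[str] = []
    current = IS_A.get(canonical(term))
    while current is not None and current not in chain:  # guard against cycles
        chain.append(current)
        current = IS_A.get(current)
    return chain


def is_a(term: str, concept: str) -> bool:
    return concept in ancestors(term)
```

A search for ‘Nutrient’ can then pick up variables tagged ‘Copper’, because `is_a("Copper", "Nutrient")` holds through the intermediate ‘Micronutrient’ concept.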
20
Point Observations Information Model
[Diagram: Data Source (USGS), Network (Streamflow gages), Sites (Neuse River near Clayton, NC), Variables (Discharge, stage; daily or instantaneous), Values (206 cfs, 13 August 2006; {Value, Time, Qualifier, Offset})]
Web service methods shown: GetSites, GetSiteInfo, GetVariables, GetVariableInfo, GetValues
A data source operates an observation network
A network is a set of observation sites
A site is a point location where one or more variables are measured
A variable is a property describing the flow or quality of water
A value is an observation of a variable at a particular time
A qualifier is a symbol that provides additional information about the value
An offset allows specification of measurements at various depths in water
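The definitions above map naturally onto nested record types. The sketch below is one possible reading, not the CUAHSI schema: the class and field names are assumptions (including the `units` field, added so that the "206 cfs" example can be expressed), and the sample instance reuses the USGS example from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol with extra information about the value
    offset: Optional[float] = None   # e.g. depth in water at which it was measured


@dataclass
class Variable:
    name: str
    units: str  # assumed field, not part of the slide's definitions
    values: List[Value] = field(default_factory=list)


@dataclass
class Site:
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# The example from the slide: 206 cfs on 13 August 2006.
discharge = Variable("Discharge", "cfs", [Value(206.0, datetime(2006, 8, 13))])
site = Site("Neuse River near Clayton, NC", [discharge])
usgs = DataSource("USGS", [Network("Streamflow gages", [site])])
```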
21
HydroSeek Web Services
Most HydroSeek functions are available as web services (SOAP)
Support for queries using Global Change Master Directory (GCMD) keywords
Supports output in Geography Markup Language (GML) as well as WaterML
[Diagram labels: Microsoft server (Virtual Earth map); San Diego Supercomputer Center server; Drexel server; native HydroSeek services; WaterOneFlow services for EPA STORET, USGS Daily, CIMS, USGS Realtime, and TCEQ]
22
GetStations
Request: BoundingBox
Response (figure)
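What a bounding-box station query has to do can be sketched locally. This is not the actual GetStations service: the `Station` and `BoundingBox` types, their field names, and the sample coordinates are hypothetical, and the box is assumed not to cross the antimeridian.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Station:
    code: str
    lat: float
    lon: float


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, station: Station) -> bool:
        # Inclusive on all four edges.
        return (self.south <= station.lat <= self.north
                and self.west <= station.lon <= self.east)


def get_stations(stations: Iterable[Station], box: BoundingBox) -> List[Station]:
    """Return the stations that fall inside the bounding box."""
    return [s for s in stations if box.contains(s)]


# Invented sample data for illustration only.
catalogue = [
    Station("A", 35.6, -78.4),
    Station("B", 30.3, -97.7),
    Station("C", 36.0, -79.0),
]
inside = get_stations(catalogue, BoundingBox(35.0, -80.0, 37.0, -78.0))
```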
23
GetStationsByHU
Request: HUC_Code
Response (figure)
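Hydrologic Unit Codes are hierarchical: a 2-digit region code is extended to 4, 6, and 8 digits for successively smaller units. Assuming the service selects stations whose 8-digit code starts with the requested code (an assumption, since the slide shows only the request parameter), the lookup reduces to a prefix match. The function name and the station-to-HUC table below are invented for illustration.

```python
from typing import Dict, List


def stations_in_hu(station_hucs: Dict[str, str], huc_code: str) -> List[str]:
    """Return station codes whose HUC falls inside the requested unit."""
    if not huc_code.isdigit() or len(huc_code) not in (2, 4, 6, 8):
        raise ValueError(f"expected a 2-, 4-, 6-, or 8-digit HUC, got {huc_code!r}")
    return sorted(
        station
        for station, huc in station_hucs.items()
        if huc.startswith(huc_code)
    )


# Invented station-to-HUC assignments.
station_hucs = {
    "S1": "03020201",
    "S2": "03020202",
    "S3": "03030002",
    "S4": "12090205",
}
```

Here `stations_in_hu(station_hucs, "0302")` selects S1 and S2, while the 2-digit code `"03"` also picks up S3.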
24
GetStationCatalogueFiltered
Request (figure)
Response (figure)
25
GetStationCatalogue
Request (figure)
Response (figure)
26
Allows searching multiple heterogeneous data sources simultaneously, regardless of semantic or structural differences between them
Modular & extensible
Architecture Outline
Inside the CUAHSI HOD Module
27
The Database-Ontology Link
28
HydroSeek ODM needed an upgrade, i.e. additional tables:
1) MappingsApproved_Table
2) FrequentUpDates_Table
29
How does the Tagging work?
Step 1
Users need to register on the website before they can use the HydroTagger. When registering, select the testbed site you are affiliated with.
Each testbed site needs ONE administrator, who can then admit additional users for that specific testbed site. Please send an e-mail to identify the designated tagger site administrator so we can promote that person to the role.
30
How does the Tagging work?
Step 2
The “Sniffer” jumps into action and trawls through the testbed sites to find and identify new variable names (once a week, currently every Sunday night).
It does so by using the regular web services published through the WSDL (no “hacking”!).
It returns i) data update information and ii) the variable names used, and compares these to those used by HydroSeek.
[Figure label: WATERS Network Information System]
31
How does the Tagging work?
Step 3
The Tagger now updates the HydroSeek catalogue (an amalgamation of all 10 testbed catalogues) with the newly found data entries.
If it finds a new variable name (introduced during the data loading process using the Data-Loader), it puts it into a table and offers it up to the HydroTagger GUI for semantic tagging.
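The comparison in Steps 2 and 3 amounts to a set difference between the variable names a testbed reports and those HydroSeek already knows. The sketch below illustrates that step only; the function name, the case-insensitive normalization rule, and the sample names are assumptions, not the Sniffer's actual logic.

```python
from typing import Iterable, List, Set


def normalize(name: str) -> str:
    """Compare names case-insensitively and ignore surrounding whitespace."""
    return name.strip().casefold()


def find_new_variable_names(reported: Iterable[str], known: Set[str]) -> List[str]:
    """Return reported names not yet known, in first-seen order, no repeats."""
    seen = {normalize(k) for k in known}
    pending: List[str] = []
    for name in reported:
        key = normalize(name)
        if key not in seen:
            seen.add(key)
            pending.append(name.strip())
    return pending


# Invented sample data: names already tagged vs. names found in a weekly run.
known_names = {"Discharge", "Water temperature"}
weekly_harvest = ["discharge", "Turbidity", "Turbidity ", "Nitrate"]
to_tag = find_new_variable_names(weekly_harvest, known_names)
```

Only the names left in `to_tag` would need to be offered for semantic tagging; everything else is already mapped.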
32
Thank you… Questions?