Hydrologic Information System for the Nation Ilya Zaslavsky Spatial Information Systems Lab San Diego Supercomputer Center UCSD Alexandria Library talk, December 4,
San Diego Supercomputer Center. Founded in 1985, as one of the five original supercomputer centers, funded by the National Science Foundation. 400 employees. Advanced research in high-performance computing and networking. R&D and cyberinfrastructure projects in neuroscience, geology, astronomy, environmental sciences, molecular biology, hydrology. SDSC building on UCSD campus
SDSC Spatial Information Systems Lab. Research and system development. Services-based spatial information integration infrastructure, CI projects. Mediation services for spatial data, query processing, map assembly services. Long-term spatial data preservation. Spatial data standards and technologies for online GIS (SVG, WMS/WFS). Support of spatial data projects at SDSC and beyond: in Geosciences (GEON, CUAHSI, CBEO,…), in regional development (NIEHS SBRP, CRN…), in Neurosciences (BIRN, CCDB). Contact:
Consortium of Universities for the Advancement of Hydrologic Science, Inc. An organization representing more than one hundred United States universities that receives support from the National Science Foundation to develop infrastructure and services for the advancement of hydrologic science and education in the U.S. US Universities as of July 2008
What is the CUAHSI HIS? An online distributed system to support the sharing of hydrologic data from multiple repositories and databases via standard water data service protocols; software for data publication, discovery, access and integration. NSF support through 2012 (GEO). Partners: Academic: 11 NSF hydrologic observatories, CEO:P projects, LTER…; Government: USGS, EPA, NCDC, NWS, state and local; Commercial: Microsoft, ESRI, Kisters; International: Australia, UK; Standardization: OGC, WMO (Hydrology Domain WG, CHy); adopted by USGS, NCDC
Rainfall & Snow Water quantity and quality Remote sensing Water Data Modeling Meteorology Soil water
Sources of Observations Data
Observation Stations: Ameriflux Towers (NASA & DOE); NOAA Automated Surface Observing System; USGS National Water Information System; NOAA Climate Reference Network. Map for the US. Build a common window on water data using web services
Getting Water Data (the old way) Different Query Pages Different Query Responses
Web Pages versus Web Services. Web pages: use Hypertext Markup Language (HTML). Web services: use WaterML (a Markup Language for water data)
CUAHSI Observations Data Model
Information communication. Water web pages: HyperText Markup Language (HTML). Water web services: Water Markup Language (WaterML)
Standard Water Data Services. Set of query functions. Returns data in WaterML. NWIS Daily Values (discharge), NWIS Ground Water, NWIS Unit Values (real time), NWIS Instantaneous Irregular Data, EPA STORET, NCDC ASOS, DAYMET, MODIS, NAM12K, USGS SNOTEL, ODM (multiple sites). Next Step: WaterML 2.0; OGC Hydrology Domain Working Group
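To make "a query function returns data in WaterML" concrete, here is a minimal sketch of a client reading a time series out of a response. The XML is a simplified, WaterML-like sample written for illustration: element names are only loosely modeled on WaterML, namespaces are omitted, and the site name and values are invented. It is not a schema-valid document from any real service.

```python
import xml.etree.ElementTree as ET

# Simplified WaterML-like response (illustrative only; all values are invented).
SAMPLE_RESPONSE = """\
<timeSeriesResponse>
  <timeSeries>
    <sourceInfo><siteName>Example Creek</siteName></sourceInfo>
    <variable>
      <variableName>Discharge</variableName>
      <unitAbbreviation>cfs</unitAbbreviation>
    </variable>
    <values>
      <value dateTime="2008-07-01T00:00:00">12.5</value>
      <value dateTime="2008-07-02T00:00:00">11.0</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def parse_time_series(xml_text):
    """Extract site, variable, unit and (timestamp, value) pairs."""
    root = ET.fromstring(xml_text)
    series = root.find("timeSeries")
    return {
        "site": series.findtext("sourceInfo/siteName"),
        "variable": series.findtext("variable/variableName"),
        "unit": series.findtext("variable/unitAbbreviation"),
        "values": [
            (v.get("dateTime"), float(v.text)) for v in series.iter("value")
        ],
    }


result = parse_time_series(SAMPLE_RESPONSE)
print(result["site"], result["variable"], result["unit"])
for timestamp, value in result["values"]:
    print(timestamp, value)
```

Because every service returns the same structure, one parser like this can serve all the networks listed above, which is the contrast with per-agency query pages.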
International Standardization of WaterML. OGC/WMO Hydrology Domain Working Group
Hydrologic Information System Service Oriented Architecture. Diagram labels: Test bed HIS Servers; Central HIS servers; ArcGIS; Matlab; IDL, R; MapWindow; Excel; Programming (C#, VB..); Desktop clients; Customizable web interface (DASH); HTML - XML; WSDL - SOAP; Modeling (OpenMI); Global search (Hydroseek); Water Data Web Services, WaterML; Controlled vocabularies; Metadata catalogs; Ontology; ETL services; HIS Lite Servers; External data providers; Deployment to test beds; Other popular online clients; ODM DataLoader; Streaming Data Loading; Ontology tagging (Hydrotagger); WSDL and ODM registration; Data publishing; ODMTools; Server config tools; HIS Central Registry & Harvester; HIS Desktop
Built for data: – Storage – Loading – Analysis – Publication. HIS Software free of charge. HIS Server. Diagram labels: Real-time Sensors; WaterOneFlow Web Service; Data Archives; Outside Users, HIS Central, HydroDesktop; Local Users; ODM Tools; HIS Server; SQL Server; Observations Data Model Database
HIS Central – Catalog and Search
Managing Varying Semantics. In parameter names… Nitrogen: e.g. NWIS parameter # 625 is labeled ‘ammonia + organic nitrogen’; the Kjeldahl method is used for determination but is not mentioned in the parameter description. In STORET this parameter is referred to as Kjeldahl Nitrogen. And: ‘Dissloved oxygen’ (sic). In measurement units… acre feet vs. acre-feet; micrograms per kilogram vs. micrograms per kilgram (sic); FTU vs. NTU; mho vs. Siemens; ppm vs. mg/kg
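One way to picture how a controlled vocabulary absorbs such variation is a synonym table that maps each label found in source data to a single agreed term. The sketch below is an illustration only: the table contents, the canonical spellings and the `normalize_unit` helper are assumptions made for this example, not the actual HIS vocabulary or tooling.

```python
# Hypothetical synonym table: variant label (lowercased) -> controlled term.
# Only spelling variants and exact equivalents are listed here.
UNIT_SYNONYMS = {
    "acre feet": "acre-feet",
    "acre-feet": "acre-feet",
    "micrograms per kilgram": "micrograms per kilogram",  # misspelling seen in data
    "micrograms per kilogram": "micrograms per kilogram",
    "mho": "siemens",
    "siemens": "siemens",
}


def normalize_unit(label):
    """Return the controlled term for a raw unit label, or None if unknown."""
    key = " ".join(label.lower().split())  # lowercase and collapse whitespace
    return UNIT_SYNONYMS.get(key)


for raw in ["Acre  Feet", "micrograms per kilgram", "mho", "furlongs"]:
    print(repr(raw), "->", normalize_unit(raw))
```

Pairs such as FTU and NTU, or ppm and mg/kg, are left out of the table on purpose: whether they can be treated as equivalent depends on the measurement method and medium, so that mapping is a decision for a domain expert rather than a string lookup.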
Semantic Tagging of Harvested Variables
HIS Central Services
Service registry and metadata catalog
– Networks
– Sites
– Variables
– Search Keywords
Does not store actual observation data
Example: GetSitesInBox query function
HISCentral Web Service
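The idea behind a query such as GetSitesInBox can be sketched as a bounding-box filter over catalog metadata. The example below runs against a small in-memory list rather than the real web service; the `Site` record, the `sites_in_box` function, its parameter names and the catalog entries are all hypothetical and do not reproduce the service's actual signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    code: str
    network: str
    lat: float
    lon: float


# Invented catalog entries, for illustration only.
CATALOG = [
    Site("A1", "NWISDV", 32.88, -117.24),
    Site("B2", "EPA", 34.05, -118.25),
    Site("C3", "NWISDV", 40.01, -105.27),
]


def sites_in_box(catalog, west, south, east, north, network=None):
    """Return sites inside the box, optionally restricted to one network.

    Assumes the box does not cross the antimeridian (west <= east).
    """
    return [
        s
        for s in catalog
        if west <= s.lon <= east
        and south <= s.lat <= north
        and (network is None or s.network == network)
    ]


hits = sites_in_box(CATALOG, west=-119.0, south=32.0, east=-117.0, north=35.0)
print([s.code for s in hits])
```

Note that the query touches only site metadata. The observations themselves stay with the data provider and are fetched separately, which matches the point that the registry does not store actual observation data.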
GetValues Requests Per Day from HIS Central
HydroDesktop Capabilities. GIS: Add shapefiles to map; Change symbology and labels; Print and export map; GIS toolbox. Hydrology: Search for data; Download data; Display time series; Export data
Hydroseek: supports search by location and type of data across multiple observation networks, including NWIS, STORET, and academic data
Visualization and Analysis of Large Datasets ► Tiled wall ► OLAP cubes for repositories ► OLAP cubes for catalogs EPA STORET water quality repository USGS NWIS catalog: measurement totals for selected nutrients over decades
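The kind of roll-up an OLAP cube provides, such as measurement totals per nutrient per decade, can be sketched in a few lines. The records below are invented for illustration and are not NWIS or STORET figures; `roll_up_by_decade` is a hypothetical helper, and a real cube would be built in a database rather than in memory.

```python
from collections import defaultdict

# Invented catalog rows: (nutrient, year, number of measurements).
RECORDS = [
    ("nitrate", 1975, 120),
    ("nitrate", 1978, 80),
    ("nitrate", 1984, 200),
    ("phosphorus", 1979, 50),
    ("phosphorus", 1981, 70),
]


def roll_up_by_decade(records):
    """Aggregate measurement counts along two dimensions: nutrient and decade."""
    cube = defaultdict(int)
    for nutrient, year, count in records:
        decade = (year // 10) * 10
        cube[(nutrient, decade)] += count
    return dict(cube)


cube = roll_up_by_decade(RECORDS)
for (nutrient, decade), total in sorted(cube.items()):
    print(f"{nutrient:12s} {decade}s {total}")
```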
How we work with agencies on web service access to observational data
1. Establish an agreement with the agency on joint development of water data services, identify agency partners (ideally, with time to support joint work)
2. Identify the scope of the service, databases to be exposed, and access control, assign network and vocabulary codes
3. Map semantics of the service to WaterML semantics, and verify with the agency
4. Include discussion of agency data and interoperability issues in the context of OGC Hydrology Domain WG (if needed)
5. Develop a first draft of the web service
6. Unit testing, over a series of validation cases developed jointly with the agency
7. Harvest an observations metadata catalog for agency data, to be housed either at SDSC or at the agency
8. Develop a procedure for catalog updates
9. Register the water data service at HISCentral (including mapping of variables to ontology terms), and test it using HydroSeek and HydroExcel. Document the service
10. Review and test the service together with the agency, for possible approval as “operational”
Federal Agency Water Data Services at HISCentral (10/09)
Columns: Network Name, Site Count, Value Count, Earliest Observation, Notes
NWISDV /1/1900 WaterML-compliant GetValues service from NWIS, catalog ingested
EPA /1/1900 SOAP wrapper over WQX services, catalog harvested
NWISUV DAYS WaterML-compliant GetValues service, catalog ingested
NCDC ISH *1/1/2005 WaterML-compliant GetValues service from NCDC, catalog harvested
NCDC ISD /1/1892 WaterML-compliant GetValues service from NCDC, catalog harvested
NWISIID /9/1867 SOAP wrapper over NWIS web site, catalog harvested
NWISGW /1/1900 SOAP wrapper over NWIS web site, catalog harvested
RIVERGAGES /1/2000 WaterML-compliant REST services from Army Corps of Engineers
Integration with other infrastructures Technical: with real time data management middleware: – OpenSource DataTurbine: Communities: – Superfund Basic Research Program (NIEHS) – CZOs (Boulder Creek, Stroud Water Research Center) – South East Asia/Malaysia (
The International Workshop on Hydrologic Data Management and Modeling in South East Asia July University of Malaya Learning how the system works Publishing hydrologic data Setting up a server for SEA Already published: sample data from JPS (Malaysia) and from Indonesia
Looking for COD measurements in HydroSeek
CUAHSI Water Data Services services 15,000 variables 1.8 million sites 9 million series 4.3 billion data
Summary
CUAHSI HIS = Cyberinfrastructure for managing and publishing observational data
– Supports many types of point observational data
– Overcomes syntactic and semantic heterogeneity using a standard data model and controlled vocabularies
– Supports a national network of observatory test beds
– Maintains national registry of services (1.75 million stations – the largest in the world)
WaterML is a standard language for consistently communicating water observations data from academic and government sources using web services; already adopted by several federal agencies. Joint WMO and OGC activity to enhance it.
The system is already deployed at multiple locations.
It is free and open source.