Hydrologic Information System for the Nation
I. Zaslavsky (SDSC) & The CUAHSI HIS Project
his.cuahsi.org, hiscentral.cuahsi.org
Consortium of Universities for the Advancement of Hydrologic Science, Inc.
An organization representing more than one hundred United States universities. It receives support from the National Science Foundation to develop infrastructure and services for the advancement of hydrologic science and education in the U.S.
[Map: US Universities as of July 2008]
CUAHSI Hydrologic Information System
Goal: Enhance hydrologic science by facilitating user access to more and better data for testing hypotheses and analyzing processes
Advancement of water science is critically dependent on integration of water information
–Querying the nation’s repository of water data
–Linking small integrated research sites (<100 km²) with global and continental models
–Integrating data from multiple disciplines to understand controls on the hydrologic cycle
It is as important to represent hydrologic environments precisely with data as it is to represent hydrologic processes with equations
[Diagram labels: Databases, Analysis, Models; Rainfall & Snow, Water quantity and quality, Remote sensing, Meteorology, Soil water]
What is the CUAHSI HIS?
An internet-based system to support the sharing of hydrologic data. It comprises databases connected over the internet through web services, together with software for data discovery, access and publication.
CUAHSI HIS Partner Institutions
[Map legend: Project co-PI in Phase 2; Collaborator in Phase I]
HIS WATERS Testbed
CUAHSI Hydrologic Information System (HIS)
NSF has funded work at 11 testbed sites, each with its own science agenda. HIS supplies the common information system.
HIS Team
[Diagram labels: WATERS Testbed; WATERS Network Information System; CUAHSI HIS]
Supercomputer Centers: NCSA, TACC
Domain Sciences: Unidata, NCAR, LTER, GEON
Government: USGS, EPA, NCDC, USDA
Industry: ESRI, Kisters, OpenMI
International Partners
–CSIRO Land and Water Resources: Water Resources Observations Network (WRON)
–European Commission: Water database design and model integration (HarmonIT and OpenMI)
Observation Stations: Map for the US
Ameriflux Towers (NASA & DOE)
NOAA Automated Surface Observing System
USGS National Water Information System
NOAA Climate Reference Network
Build a common window on water data using web services
Water Data Web Sites
NWISWeb site output
# agency_cd Agency Code
# site_no USGS station number
# dv_dt date of daily mean streamflow
# dv_va daily mean streamflow value, in cubic-feet per-second
# dv_cd daily mean streamflow value qualification code
#
# Sites in this file include:
# USGS NEUSE RIVER NEAR CLAYTON, NC
#
agency_cd site_no dv_dt dv_va dv_cd
[Data rows: only the agency code column (USGS) survives in this transcript]
Time series of streamflow at a gaging station
USGS has committed to supporting CUAHSI’s GetValues function
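As an illustration of how a client might read the tab-delimited output described on this slide, here is a minimal Python sketch. It assumes comment lines begin with `#` and that the header row uses the five column names listed above. The site number `00000000`, the qualification code `A`, and the 12 August row are made-up placeholders; only the 206 cfs value for 13 August 2006 comes from this deck. Real NWIS files may carry extra lines (for example a column-format line after the header) that this sketch does not handle.

```python
import csv
import io
from datetime import date

# Illustrative sample only: the site number, qualifier code, and the first
# data row are placeholders, not real USGS output.
SAMPLE = """\
# agency_cd\tAgency Code
# site_no\tUSGS station number
# dv_dt\tdate of daily mean streamflow
# dv_va\tdaily mean streamflow value, in cubic-feet per-second
# dv_cd\tdaily mean streamflow value qualification code
agency_cd\tsite_no\tdv_dt\tdv_va\tdv_cd
USGS\t00000000\t2006-08-12\t210\tA
USGS\t00000000\t2006-08-13\t206\tA
"""


def parse_daily_values(text):
    """Return (date, flow_cfs, qualifier) tuples from tab-delimited text."""
    # Skip blank lines and '#' comment lines; the first remaining line is the header.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    return [
        (date.fromisoformat(row["dv_dt"]), float(row["dv_va"]), row["dv_cd"])
        for row in reader
    ]


records = parse_daily_values(SAMPLE)
for day, flow, code in records:
    print(day.isoformat(), flow, code)
```

Hand-written parsing of this kind has to be repeated for every agency format, which is the burden a common GetValues function is meant to remove.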
Point Observations Information Model
A data source operates an observation network
A network is a set of observation sites
A site is a point location where one or more variables are measured
A variable is a property describing the flow or quality of water
An observation series is an array of observations at a given site, for a given variable, with start time and end time
A value is an observation of a variable at a particular time
A qualifier is a symbol that provides additional information about the value
[Diagram: model levels with an example]
Data Source: USGS
Network: Streamflow gages
Sites: Neuse River near Clayton, NC
Observation Series: Discharge, stage, start, end (Daily or instantaneous)
Values {Value, Time, Qualifier}: 206 cfs, 13 August 2006
[Diagram: service functions]
Return network information, and variable information within the network
Return site information, including a series catalog of variables measured at a site with their periods of record
Return time series of values
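The hierarchy defined on this slide can be sketched as a handful of Python dataclasses. This is an illustration of the information model only, not the ODM schema or any CUAHSI code; all class, field, and method names are hypothetical. The example instance uses the values shown on the slide (USGS, streamflow gages, Neuse River near Clayton, NC, 206 cfs on 13 August 2006).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol giving extra information about the value


@dataclass
class Variable:
    name: str
    units: str


@dataclass
class ObservationSeries:
    variable: Variable
    values: List[Value] = field(default_factory=list)

    @property
    def start(self) -> Optional[datetime]:
        return min((v.time for v in self.values), default=None)

    @property
    def end(self) -> Optional[datetime]:
        return max((v.time for v in self.values), default=None)


@dataclass
class Site:
    name: str
    series: List[ObservationSeries] = field(default_factory=list)

    def series_catalog(self):
        """Variables measured at this site with their periods of record."""
        return [(s.variable.name, s.start, s.end) for s in self.series]


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# Example instance built from the values on the slide.
discharge = Variable(name="Discharge", units="cfs")
series = ObservationSeries(discharge, [Value(206.0, datetime(2006, 8, 13))])
site = Site("Neuse River near Clayton, NC", [series])
usgs = DataSource("USGS", [Network("Streamflow gages", [site])])

catalog = site.series_catalog()
print(catalog)
```

In this sketch the series catalog is derived from the stored values, so the period of record cannot drift out of step with the data it summarizes.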
CUAHSI Observations Data Model
WaterML design principles
Driven largely by hydrologists; the goal is to capture the semantics of discovery and retrieval of hydrologic observations
Relies to a large extent on the information model of ODM (Observations Data Model); terms are aligned as much as possible
–Several community reviews since 2005
Driven by data served by USGS NWIS, EPA STORET, and multiple individual PI-collected observations
Is no more than an exchange schema for CUAHSI web services
A fairly simple and rigid schema tuned to the current implementation; the lowest barrier to adoption by hydrologists
Conformance with OGC specs was not in the initial scope, but we are working with OGC on this (OGC Discussion Paper )
Water Data Services
Set of query functions
Returns data in WaterML
Services: NWIS Daily Values (discharge), NWIS Ground Water, NWIS Unit Values (real time), NWIS Instantaneous Irregular Data, EPA STORET, NCDC ASOS, DAYMET, MODIS, NAM12K, USDA SNOTEL, ODM (multiple sites)
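To give a feel for what consuming such a response might look like, here is a minimal Python sketch that reads a time series from an XML document. The XML below is a hypothetical, heavily simplified stand-in: its element and attribute names are assumptions made for illustration and should not be read as the WaterML schema, which uses XML namespaces and a richer structure. The site, variable, and value are the example from this deck.

```python
import xml.etree.ElementTree as ET

# Hypothetical simplified response; element names are illustrative only
# and are NOT taken from the WaterML schema.
RESPONSE = """
<timeSeriesResponse>
  <timeSeries>
    <sourceInfo><siteName>Neuse River near Clayton, NC</siteName></sourceInfo>
    <variable><variableName>Discharge</variableName><units>cfs</units></variable>
    <values>
      <value dateTime="2006-08-13T00:00:00">206</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def read_values(xml_text):
    """Return (site, variable, units, [(time, value), ...]) from the sample XML."""
    root = ET.fromstring(xml_text.strip())
    series = root.find("timeSeries")
    site = series.findtext("sourceInfo/siteName")
    variable = series.findtext("variable/variableName")
    units = series.findtext("variable/units")
    points = [
        (v.get("dateTime"), float(v.text))
        for v in series.iterfind("values/value")
    ]
    return site, variable, units, points


result = read_values(RESPONSE)
print(result)
```

Because every service returns the same document shape, one reader of this kind could in principle serve all the networks listed on the slide.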
Hydrologic Information System Service Oriented Architecture
[Architecture diagram labels]
Servers: Test bed HIS Servers; Central HIS servers; HIS Lite Servers; External data providers
Desktop clients: ArcGIS; Matlab; IDL, R; MapWindow; Excel; Programming (Fortran, C, VB)
Customizable web interface (DASH); Modeling (OpenMI); Global search (Hydroseek); Other popular online clients
HTML - XML; WSDL - SOAP
WaterOneFlow Web Services, WaterML; Controlled vocabularies; Metadata catalogs; Ontology; ETL services
Data publishing: ODM DataLoader; Streaming Data Loading; Ontology tagging (Hydrotagger); WSDL and ODM registration
ODMTools; Server config tools
HIS Central Registry & Harvester
Deployment to test beds
WORKGROUP HIS SERVER ORGANIZATION
STEPS FOR REGISTERING OBSERVATION DATA
[Diagram labels]
External sources: USGS, NCDC, EPA, TCEQ
SQL Server: ODMs and catalogs. All instances exposed as ODM (i.e. have standard ODM tables or views: Sites, Variables, SeriesCatalog, etc.)
–NWIS-IID, NWIS-DV, ASOS, STORET, TCEQ, BearRiver...
–My new ODM; More databases
Spatial store: Geodatabase or collection of shapefiles or both
–NWIS-IID points, NWIS-DV points, ASOS points, STORET points, TCEQ points, BearRiver points...
–My new points; More synced layers
–Background layers (can be in the same or separate spatial store)
WOF services: Web services from a common template
–NWIS-IID WS, NWIS-DV WS, ASOS WS, STORET WS, TCEQ WS, BearRiver WS...
–My new WS; More WS from ODM-WS template
DASH Web Application
–Web Configuration file: Stores information about registered networks (WSDLs, web service URLs, connection strings)
–MXD: Stores information about layers (layer info, symbology, etc.)
ODM DataLoader
HISCentral Services Catalog
Against the NIH (Not Invented Here) Syndrome
2006:
► CUAHSI HIS web services are discussed on the BASINS mailing list as a new way to access hydrologic data. The list is mostly used by hydrologists and developers outside academia;
► NCDC develops ASOS web services following WaterML
2007:
► MOU with USGS; USGS is developing a WaterML-compliant GetValues service;
► GLEON uses an early version of ODM to develop their own database schema (VEGA);
► Phoenix LTER is developing ODM (in MySQL) and WaterML web services (in Java);
► A Google Earth-based client for CUAHSI web services is developed at CSIRO, Australia;
► Deployment to 11 hydrologic observatory test beds, + CBEO (CEO:P project)
2008:
► KISTERS develops WaterML-compliant web services over their database, for a client;
► MapWindow open source GIS develops WaterOneFlow parsers;
► Florida, Texas and Idaho use ODM and WaterOneFlow web services to provide access to state data repositories; New Jersey is considering the same;
► Another CEO:P project, at UC-Davis, is implementing ODM (in Postgres) and web services (in Java);
► More, which we don’t know about…
US Map of USGS Observations
[Map insets: Antarctica, Puerto Rico, Hawaii, Alaska]
Different types of nutrients by decade: Available Data Total
Some physical properties by decade: Available Data Total
USGS Observations: California by Mean Available Data
USGS Observations: California by Available Data Total
Hydroseek
Supports search by location and type of data across multiple observation networks, including NWIS, STORET, and academic data
Semantic Tagging of Harvested Variables
HIS Deployment
11 Hydrologic Observatory test bed projects (NSF-funded)
Growing number of water data services (LTER, CEO:P, elsewhere)
WaterML Adoption (USGS, NCDC)
National Hydrologic Information Server, San Diego Supercomputer Center
Water Quality in Moreton Bay, Brisbane, Australia (Jane Hunter)
National Water Metadata Catalog
Synthesis and communication of the nation’s water data
[Diagram labels: Hydroseek; WaterML; Government Water Data; Academic Water Data]
Accomplishments
Generic method for managing and publishing observational data
–Supports many types of point observational data
–Overcomes syntactic and semantic heterogeneity using a standard data model and controlled vocabularies
–Supports a national network of observatory test beds but can grow!
WaterML is a common language for water observations data from academic and government sources. A Hydrology Domain Working Group is established at OGC.
Point observations data from agencies and academic investigators can be consistently communicated using web services
The National Water Metadata Catalog is the most comprehensive existing index of the nation’s water observations
HIS Overview Report
Summarizes the conceptual framework, methodology, and application tools for HIS version 1.1
Shows how to develop and publish a CUAHSI Water Data Service
Available at: