NQuery: A Network-enabled Data-based Query Tool for Multi-disciplinary Earth-science Datasets
John R. Osborne
NQuery Overview
What is NQuery? NQuery assists the scientist who has selected in-situ datasets within a desired time-space domain and wishes to refine that selection based on characteristics of the data itself.
What problem does NQuery solve? Many data sources are available on the network through OPeNDAP (in-situ data collections and gridded datasets), yet to determine whether a data file suits their needs, scientists must sort through each file individually, which wastes time and energy.
NQuery Overview
Examples of the types of queries possible with NQuery and profile data:
Select profiles where the average salinity is less than 34.0 PSU.
Select profiles where the depth of the minimum dissolved oxygen value is greater than 1000 db.
Select profiles where the mixed-layer depth is between 50 and 100 meters.
Select profiles where the depth of the 10 degree isotherm is greater than 100 meters.
Select profiles where the average salinity between the 27 and 27.5 sigma0 density surfaces is between 34.1 and 34.2 PSU.
Select profiles where the surface salinity is greater than 35.0 PSU and the surface temperature is less than 15 degrees C.
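As an illustration of how queries like these reduce to per-profile summary values, here is a minimal Python sketch. It is not NQuery's code (NQuery is a Java tool). The `Profile` class, the sample values, and the function names are all hypothetical, and the unweighted mean and linear interpolation are assumptions about how such statistics might be computed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Profile:
    """One vertical profile; the lists are parallel and ordered surface-down."""
    station_id: str
    pressure: List[float]     # db
    temperature: List[float]  # degrees C
    salinity: List[float]     # PSU


def average(values: List[float]) -> float:
    """Unweighted mean of the observations (an assumption; no depth weighting)."""
    return sum(values) / len(values)


def depth_of_value(pressure: List[float], values: List[float],
                   target: float) -> Optional[float]:
    """Pressure at which `values` first reaches `target`, linearly interpolated.

    Returns None if the profile never reaches the target value.
    """
    for i in range(len(values) - 1):
        v0, v1 = values[i], values[i + 1]
        if v0 == target:
            return pressure[i]
        if (v0 - target) * (v1 - target) < 0:  # target lies between two levels
            frac = (target - v0) / (v1 - v0)
            return pressure[i] + frac * (pressure[i + 1] - pressure[i])
    if values and values[-1] == target:
        return pressure[-1]
    return None


def select(profiles: List[Profile],
           predicate: Callable[[Profile], bool]) -> List[str]:
    """Station ids of the profiles that satisfy the predicate."""
    return [p.station_id for p in profiles if predicate(p)]


def isotherm_deeper_than(p: Profile, isotherm: float, limit: float) -> bool:
    """True if the given isotherm exists in the profile and lies below `limit`."""
    depth = depth_of_value(p.pressure, p.temperature, isotherm)
    return depth is not None and depth > limit


# Made-up sample data, for illustration only.
PROFILES = [
    Profile("A", [0, 50, 100, 200], [18.0, 14.0, 11.0, 8.0], [33.8, 33.9, 34.0, 34.1]),
    Profile("B", [0, 50, 100, 200], [14.0, 9.0, 7.0, 5.0], [35.2, 35.1, 35.0, 34.9]),
]

low_salinity = select(PROFILES, lambda p: average(p.salinity) < 34.0)
deep_isotherm = select(PROFILES, lambda p: isotherm_deeper_than(p, 10.0, 100.0))
salty_cool_surface = select(
    PROFILES, lambda p: p.salinity[0] > 35.0 and p.temperature[0] < 15.0)
```

Each example query becomes a predicate over one or two summary values, which is what makes it practical to store those values in a database and query them with SQL.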
NQuery: Overview
Platform-independent Java tool utilizing JDBC and MySQL.
Access to online data archives (Climate Data Portal) via “Dapper” OPeNDAP client.
Computes summary statistics (e.g., average value, depth of maximum value, and depth of minimum value) for profiles from observed parameters and from user-specified calculations (e.g., theta, sigma, apparent oxygen utilization).
Custom station-level calculations including mixed-layer depth, interpolation of a measured parameter to a standard level, and integration of a parameter between levels.
Simple tree view user interface for variable selection.
Creates temporary “on the fly” relational database populated with data and metadata.
Allows powerful SQL queries to locate needed data by summary statistics and/or station-level calculated parameters.
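The "on the fly" database idea can be sketched as follows. This is a hedged illustration only: NQuery itself uses Java, JDBC, and MySQL, whereas this sketch substitutes Python's built-in `sqlite3` module with an in-memory database so that it runs without a server. The table name `station_stats`, its columns, and the sample rows are hypothetical, not NQuery's actual schema.

```python
import sqlite3

# Made-up summary rows: (station_id, lat, lon, avg_salinity,
# depth_min_oxygen, mixed_layer_depth). Illustration only.
ROWS = [
    ("S1", 45.0, -150.0, 33.9, 1200.0, 60.0),
    ("S2", 30.0, -140.0, 34.5, 800.0, 75.0),
    ("S3", 10.0, -130.0, 33.7, 1500.0, 120.0),
]


def build_database(rows):
    """Create a temporary in-memory database holding one row per station."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE station_stats (
            station_id        TEXT PRIMARY KEY,
            lat               REAL,
            lon               REAL,
            avg_salinity      REAL,  -- PSU
            depth_min_oxygen  REAL,  -- db
            mixed_layer_depth REAL   -- meters
        )
        """
    )
    conn.executemany("INSERT INTO station_stats VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def select_stations(conn, where_clause, params=()):
    """Return station ids matching a WHERE clause; values are bound as parameters."""
    sql = ("SELECT station_id FROM station_stats WHERE "
           + where_clause + " ORDER BY station_id")
    return [row[0] for row in conn.execute(sql, params)]


conn = build_database(ROWS)

# Average salinity below 34.0 PSU and mixed-layer depth between 50 and 100 m.
fresh_shallow = select_stations(
    conn, "avg_salinity < ? AND mixed_layer_depth BETWEEN ? AND ?", (34.0, 50, 100))

# Depth of the dissolved-oxygen minimum greater than 1000 db.
deep_o2_min = select_stations(conn, "depth_min_oxygen > ?", (1000,))
```

Precomputing the statistics once and storing them per station means each refinement is a cheap SQL filter, not another pass over the raw profile data.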
NQuery: System Requirements
Compatible with all major desktop operating systems including Windows, Mac OS X, and Linux.
Version or greater of the Java Virtual Machine.
Version 4.1 of MySQL.
An Internet connection is required for access to network data through Dapper and to create databases on networked MySQL database servers.
NQuery: Architecture Overview
NQuery: Dapper Overview
Dapper is a Java-servlet-based OPeNDAP web server for in-situ profile data.
OPeNDAP (formerly known as DODS) is a community standard “protocol for requesting and transporting data across the web.” (opendap.org/faq/what_is_OPeNDAP_software.html)
Dapper is fully compatible with the OPeNDAP standards, and with the only other OPeNDAP server for in-situ data (the GrADS DODS Server, aka the GDS server).
OPeNDAP has been used primarily to access a relatively small number of earth science gridded or model output products. A list of OPeNDAP clients can be found at opendap.org/faq/whatClients.html.
Dapper allows access to large collections of in-situ data by converting in-situ data from the Climate Data Portal to OPeNDAP sequences.
The Climate Data Portal is a relational database that acts as an OPeNDAP aggregation server.
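To make the idea of requesting a subset of an OPeNDAP sequence concrete, the sketch below assembles a request URL in the general OPeNDAP (DAP2) form: a dataset URL, a response suffix such as `.dods`, and a constraint expression made of a comma-separated projection followed by `&`-separated selection clauses. The server address, dataset path, and variable names (`location.lat` and so on) are hypothetical; the slides do not describe how Dapper names its datasets or sequence fields.

```python
def build_constraint_url(dataset_url, variables, selections, suffix=".dods"):
    """Assemble an OPeNDAP-style request URL.

    dataset_url -- base URL of the dataset (hypothetical in this sketch)
    variables   -- variable names to return (the projection)
    selections  -- relational clauses such as "location.lat>=30"
    suffix      -- response type; ".dods" requests the binary data response
    """
    constraint = ",".join(variables)
    for clause in selections:
        constraint += "&" + clause
    return dataset_url + suffix + "?" + constraint


# Hypothetical dataset and field names, for illustration only.
url = build_constraint_url(
    "http://example.org/dapper/some_dataset.cdp",
    ["location.lat", "location.lon", "location.time"],
    ["location.lat>=30", "location.lat<=45", "location.lon>=-160"],
)
```

A time-latitude-longitude box of this kind corresponds to the first selection step described in these slides; the data-based refinement is then done by NQuery once the profiles have been retrieved.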
NQuery Demonstration
Demonstration NQuery with Dapper Wizard
Future Development
Accommodate time series data:
  Define summary statistics for time series.
  Define user-defined calculations.
  Ingest time series netCDF files that adhere to EPIC conventions.
  Ingest OPeNDAP time series via Dapper.
Enhance query builder to create queries with grouped criteria.
Allow merging data files from online sources with local, desktop datasets.
Acknowledgements
NQuery development was funded by NOAA HPCC grant #COL/NW/06.
The Dapper server was developed by Joe Sirott (Sirott and Assoc. and NOAA/PMEL), Donald Denbo (NOAA/PMEL and JISAO), and Willa Zhu (NOAA/PMEL and JISAO).
The Dapper Wizard client was developed by Donald Denbo (NOAA/PMEL and JISAO).
Additional Dapper client development and integration with Java OceanAtlas was performed by John Osborne (OceanAtlas Software and NOAA/PMEL).
The Climate Data Portal was developed by Don Denbo and Willa Zhu (NOAA/PMEL and JISAO).
Summary
With NQuery, a scientist can productively use data from an OPeNDAP in-situ data server containing millions of in-situ datasets.
Select station data within a specified time-latitude-longitude box.
Further subset this data based on characteristics of the data itself. Examples:
  Select profiles where the average salinity is less than 34.0.
  Select profiles where the depth of the minimum dissolved oxygen is greater than 1000 db.
  Select profiles where the mixed-layer depth is between 50 and 100 meters.
  Select profiles where the depth of the 10 degree isotherm is greater than 100 meters.
  Select profiles where the average salinity between density surfaces is between 34 and 35.
Scientists can create databases from local desktop data files and locate data based on its characteristics in addition to its space-time domain.