Published by Randell Floyd. Modified over 9 years ago.
1
BioData: a new bioassessment database for the USGS. Briefing for the CDI, 2011.06.08. http://aquatic.biodata.usgs.gov
2
Today: What is BioData? Why did we build it? Current capabilities. Future possibilities. Data integration/interoperability challenges.
3
What is BioData? In a nutshell: a data management, storage, and distribution system for aquatic bioassessment data (data capture, data curation, data publication).
4
Why We Built It: A Brief History. 1992: National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat).
5
NAWQA Study Units
6
Why We Built It: A Brief History. 1992: National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat). 1992–1999: local data management and national data aggregations. 1999: NAWQA national bioassessment database (BioTDB).
7
WRD Needs Assessment (2006): Surveyed WRD Science Centers to find out: How much aquatic ecology data is being collected outside the NAWQA Program? What kinds? What methods? Where and how are data being stored?
8
What We Discovered: Water collaborative projects with other agencies, states, localities, and partners are producing as much data as the NAWQA Program. 80% of WSCs reported projects collecting aquatic ecology data. 120 projects had a macroinvertebrate, fish, algae, or habitat component (2000–2005). Approximately 15,000 samples. The majority of samples are being collected using NAWQA and USEPA national stream bioassessment protocols. Samples are being sent to a variety of taxonomic labs.
9
What We Discovered: The data are stored electronically, but are very difficult to discover, access, and integrate. 47% in Excel; 13% in EPA databases; 19% in home-grown relational databases (79% combined).
10
U.S. Department of the Interior U.S. Geological Survey BioData a new bioassessment database for the USGS briefing for the USGS GCMRC 5/9/2011 http://aquatic.biodata.usgs.gov
11
What Should We Do? 1. Do nothing? 2. Implement a federated system? 3. Incrementally refurbish the existing NAWQA database? 4. Redesign and “re-build” using a modern, web-enabled, extensible architecture? (BioData)
12
BioData Version 1 Objective: a data storage, retrieval, and distribution system for the aquatic bioassessment data most commonly produced by USGS WRD projects.
13
“Most Commonly Produced”
Project objectives: bioassessment and monitoring
Setting: streams and rivers
Types of data: macroinvertebrates, fish, algae, study reach habitat
Sampling protocols: NAWQA, USEPA
14
Additional Characteristics: An internet application, available to any USGS ecologist. Designed to be adapted and extended. Support scientific workflow. Serve as an online data archive. Curate taxonomic nomenclature: map it forward and harmonize it across all the data. Support data exchange between biologists and labs. Readily add web data services.
15
[Diagram: BioData system overview. Labels: BioData Input; BioData Retrieval (DWH); project data management; data distribution; field data; lab data; field data input; data exchange with labs; data review; external data; NAWQA legacy data; public web site; web data services; application-specific output]
16
Data Retrieval Features (https://aquatic.biodata.usgs.gov): Real-time feedback on how many samples your query will return. Save the query to your desktop, then email it to friends for them to run. Variety of file formats. Multiple data sets downloaded in one step.
17
Data Retrieval Demo https://aquatic.biodata.usgs.gov
18
[Diagram repeated from slide 15: BioData system overview, showing BioData Input and BioData Retrieval (DWH)]
19
Data Input/Management Features: Retrieve restricted (unreleased) data. Manage and organize data by project. Project control over rights to enter and edit data. Built-in help and data validation checks. Auto-saving. Data entry screens tailored to field sheets. Send electronic orders to labs.
20
Data Input/Management Demo
21
Data integration touchpoints. First challenge: find the data. Second challenge: compatible methods?
22
Data integration touchpoints. First challenge: find the data. Second challenge: compatible methods? Third challenge: get the data. We need to pick a data exchange standard.
23
Data integration touchpoints. First challenge: find the data. Second challenge: compatible methods? Third challenge: get the data. Fourth challenge: harmonize taxonomy. Does “Thienemannimyia group” = “Thienemannimyia gr.”? Does ITIS solve this?
24
ITIS
26
ITIS only handles published names. We have to handle unpublished names: Provisional = new taxon claimed but not “officially” published; Conditional = uncertain or indeterminate identification, e.g. “Thienemannimyia group”. ITIS is not complete for all groups. Fish: good, we can integrate tightly with it. Macroinvertebrates: doable. Algae: ITIS not ready yet.
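To illustrate the harmonization question raised above (does “Thienemannimyia group” equal “Thienemannimyia gr.”?), here is a minimal Python sketch. It is not how BioData or ITIS resolves names. The alias table, the function names, and the rule that a trailing qualifier marks a conditional name are all assumptions made for the example.

```python
# Hypothetical alias table: which trailing tokens labs use for "group"
# is an assumption made for this sketch, not a published list.
QUALIFIER_ALIASES = {
    "gr.": "group",
    "gr": "group",
    "grp.": "group",
    "grp": "group",
    "group": "group",
}


def normalize_taxon_name(raw_name):
    """Split a reported name into (base_name, qualifier).

    Whitespace is collapsed, and a trailing qualifier such as "gr."
    is mapped to its canonical spelling. The qualifier is None when
    the name carries no recognized trailing qualifier.
    """
    tokens = raw_name.strip().split()
    if not tokens:
        return ("", None)
    last = tokens[-1].lower()
    if len(tokens) > 1 and last in QUALIFIER_ALIASES:
        return (" ".join(tokens[:-1]), QUALIFIER_ALIASES[last])
    return (" ".join(tokens), None)


def same_taxon(name_a, name_b):
    """True when two reported names normalize to the same base and qualifier."""
    base_a, qual_a = normalize_taxon_name(name_a)
    base_b, qual_b = normalize_taxon_name(name_b)
    return base_a.lower() == base_b.lower() and qual_a == qual_b


def is_conditional(name):
    """Treat any name with a recognized qualifier as a conditional name."""
    return normalize_taxon_name(name)[1] is not None
```

Under these assumptions, `same_taxon("Thienemannimyia group", "Thienemannimyia gr.")` is true, while the bare genus “Thienemannimyia” is kept distinct from the conditional “group” name.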
27
Data integration touchpoints. First challenge: find the data. Second challenge: compatible methods? Third challenge: get the data. Fourth challenge: harmonize taxonomy. Does “Thienemannimyia group” = “Thienemannimyia gr.”? Fifth challenge: integrate with physio-chemical and ancillary data. A common geospatial framework would help.
28
NHD: Which NHD? An NHD “snap to” service with APIs that developers could use in their applications? A service to translate an NHD address to other (and future) versions of NHD?
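As a rough idea of what a “snap to” operation involves, the Python sketch below finds the nearest point on a set of stream reaches for a sampling location. It is not an NHD API. The reach identifiers and function names are hypothetical, and it assumes planar (projected) coordinates rather than latitude/longitude.

```python
import math


def snap_to_segment(point, seg_start, seg_end):
    """Return (nearest_point_on_segment, distance) for a planar point."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both vertices coincide.
        t = 0.0
    else:
        # Project the point onto the line, then clamp to the segment ends.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    sx, sy = ax + t * dx, ay + t * dy
    return (sx, sy), math.hypot(px - sx, py - sy)


def snap_to_reaches(point, reaches):
    """Find the closest reach for a point.

    `reaches` maps a (hypothetical) reach id to a list of (x, y) vertices.
    Returns (reach_id, snapped_point, distance), or None if no reach
    has at least two vertices.
    """
    best = None
    for reach_id, vertices in reaches.items():
        for start, end in zip(vertices, vertices[1:]):
            snapped, dist = snap_to_segment(point, start, end)
            if best is None or dist < best[2]:
                best = (reach_id, snapped, dist)
    return best
```

A real service would also need to handle geographic coordinates, spatial indexing for large networks, and the NHD version question the slide raises.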
29
BioData (http://aquatic.biodata.usgs.gov). For more information, contact Pete Ruhl, pmruhl@usgs.gov, 703-648-6841
31
NAWQA BioTDB Database: NAWQA data from 1992 to present. 2,294 sites; 21,689 samples; 6,715 macroinvertebrate community samples; 2,819 fish community samples; 8,749 algae community samples; 2,819 reach habitat assessments; more than 1,200,000 specimen records.