
1 BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08 http://aquatic.biodata.usgs.gov

2 Today  What is BioData?  Why Did We Build It?  Current Capabilities  Future Possibilities  Data Integration/Interoperability Challenges

3 What is BioData? – in a nutshell  A data management, storage, and distribution system for aquatic bioassessment data: data capture, data curation, data publication.

4 Why We Built It - A Brief History  1992 – National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat)

5 NAWQA Study Units

6 Why We Built It - A Brief History  1992 – National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat)  1992 – 1999: Local data management and national data aggregations  1999 – NAWQA national bioassessment database (BioTDB)

7 WRD Needs Assessment (2006)  Surveyed WRD Science Centers to find out:  How much aquatic ecology data is being collected outside the NAWQA Program?  What kinds?  What methods?  Where and how are data being stored?

8 What We Discovered  Water collaborative projects with other agencies, states, localities, and partners are producing as much data as the NAWQA Program  80% of WSCs reported projects collecting aquatic ecology data  120 projects had a macroinvertebrate, fish, algae, or habitat component (2000 – 2005)  Approximately 15,000 samples  The majority of samples are being collected using NAWQA and USEPA national stream bioassessment protocols  Samples are being sent to a variety of taxonomic labs

9 What We Discovered  The data are stored electronically, but are very difficult to discover, access, and integrate  47% are in Excel  13% are in EPA databases  19% are in home-grown relational databases  Together these three account for 79%

10 U.S. Department of the Interior U.S. Geological Survey BioData a new bioassessment database for the USGS briefing for the USGS GCMRC 5/9/2011 http://aquatic.biodata.usgs.gov

11 What Should We Do?  1. Do nothing?  2. Implement a federated system?  3. Incrementally refurbish existing NAWQA database?  4. Redesign and “re-build” using modern, web-enabled, extensible architecture? (BioData)

12 BioData - Version 1 Objective  A data storage, retrieval, and distribution system for aquatic bioassessment data most commonly produced by USGS WRD projects.

13 “Most Commonly Produced”  Project Objectives: bioassessment and monitoring  Setting: streams and rivers  Types of Data: macroinvertebrates, fish, algae, study reach habitat  Sampling Protocols: NAWQA, USEPA

14 Additional Characteristics  An internet application  Available to any USGS ecologist.  Designed to be adapted and extended  Support scientific workflow  Serve as an online data archive  Curate taxonomic nomenclature - map it forward and harmonize it across all the data  Support biologist lab data exchange  Readily add web data services

15 [Diagram: BioData components. Labels: BioData Input; BioData Retrieval (DWH); project data management; data distribution; field data; lab data; field data input; data exchange with labs; data review; external data; NAWQA legacy data; public web site; web data services; application-specific output]

16 Data Retrieval Features https://aquatic.biodata.usgs.gov  Real-time feedback on how many samples your query will return  Save the query to your desktop – then email to friends for them to run  Variety of file formats  Multiple data sets downloaded in one step

17 Data Retrieval Demo  https://aquatic.biodata.usgs.gov

18 [Diagram: BioData components. Labels: BioData Input; BioData Retrieval (DWH); project data management; data distribution; field data; lab data; field data input; data exchange with labs; data review; external data; NAWQA legacy data; public web site; web data services; application-specific output]

19 Data Input/Management Features  Retrieve restricted (unreleased) data  Manage and organize data by project  Project control over rights to enter and edit data  Built in help and data validation checks  Auto-saving  Data entry screens tailored to field sheets  Send electronic orders to labs

20 Data Input/Mgt Demo

21 Data integration – touchpoints  First challenge – find the data  Second challenge - compatible methods?

22 Data integration – touchpoints  First challenge – find the data  Second challenge - compatible methods?  Third challenge – get the data  We need to pick a data exchange standard

23 Data integration – touchpoints  First challenge – find the data  Second challenge - compatible methods?  Third challenge – get the data  Fourth challenge – harmonize taxonomy  Does “Thienemannimyia group” = “Thienemannimyia gr.” ??  Does ITIS solve this?
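The question on this slide, whether “Thienemannimyia group” and “Thienemannimyia gr.” are the same name, can be illustrated with a small normalization step applied before names are compared. This is only a sketch under stated assumptions: the qualifier table and the function names `normalize_taxon_name` and `same_taxon` are hypothetical and are not taken from BioData or ITIS.

```python
# Hypothetical table mapping qualifier spellings to one canonical form.
# The entries are illustrative only, not a BioData or ITIS vocabulary.
QUALIFIER_FORMS = {
    "gr": "group",
    "grp": "group",
    "group": "group",
    "sp": "sp.",
    "spp": "spp.",
}


def normalize_taxon_name(raw: str) -> str:
    """Collapse whitespace and canonicalize a trailing qualifier, if any."""
    tokens = raw.strip().split()
    if not tokens:
        return ""
    # Compare the last token without its trailing period, case-insensitively.
    last = tokens[-1].rstrip(".").lower()
    if len(tokens) > 1 and last in QUALIFIER_FORMS:
        tokens[-1] = QUALIFIER_FORMS[last]
    return " ".join(tokens)


def same_taxon(a: str, b: str) -> bool:
    """Treat two names as equal when their normalized forms match."""
    return normalize_taxon_name(a).lower() == normalize_taxon_name(b).lower()


if __name__ == "__main__":
    print(normalize_taxon_name("Thienemannimyia gr."))
    print(same_taxon("Thienemannimyia group", "Thienemannimyia  gr."))
```

String normalization of this kind only handles spelling variants of the same label; deciding whether two differently named taxa are equivalent still needs a curated mapping.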

24 ITIS

25

26  Only handles published names; we have to handle unpublished names:  Provisional = new taxon claimed but not “officially” published  Conditional = uncertain or indeterminate identification, e.g. “Thienemannimyia group”  ITIS is not complete for all groups:  Fish – good, we can integrate tightly with it  Macroinvertebrates – doable  Algae – ITIS not ready yet
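One way to picture the distinction between published, provisional, and conditional names is to store the status with each name and let unpublished names point to a published name they roll up to. The sketch below assumes that design; `NameStatus`, `TaxonName`, `rolls_up_to`, and `name_for_authority_lookup` are hypothetical names for illustration and do not describe the actual BioData schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NameStatus(Enum):
    PUBLISHED = "published"      # formally published name
    PROVISIONAL = "provisional"  # new taxon claimed but not officially published
    CONDITIONAL = "conditional"  # uncertain or indeterminate identification


@dataclass(frozen=True)
class TaxonName:
    name: str
    status: NameStatus
    # Assumption: an unpublished name may record a published name it rolls up to.
    rolls_up_to: Optional[str] = None


def name_for_authority_lookup(taxon: TaxonName) -> Optional[str]:
    """Return the name to look up in a published-name authority, or None.

    Published names are looked up directly; unpublished names fall back to
    their recorded published parent, if one was recorded.
    """
    if taxon.status is NameStatus.PUBLISHED:
        return taxon.name
    return taxon.rolls_up_to


if __name__ == "__main__":
    conditional = TaxonName(
        "Thienemannimyia group", NameStatus.CONDITIONAL, rolls_up_to="Chironomidae"
    )
    print(name_for_authority_lookup(conditional))
```

Keeping the status explicit lets the unpublished names stay in the database as reported while still linking them to a name that a published-names authority can resolve.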

27 Data integration – touchpoints  First challenge – find the data  Second challenge - compatible methods?  Third challenge – get the data  Fourth challenge – harmonize taxonomy  Does “Thienemannimyia group” = “Thienemannimyia gr.” ??  Fifth challenge – integrate with physico-chemical and ancillary data  Common geospatial framework would help

28 NHD  Which NHD?  NHD “snap to” service with APIs that developers could use in their application(s)?  Service to translate NHD address to other versions of NHD (and future)
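To make the “snap to” idea concrete, the sketch below finds the closest point on a set of stream lines to a sampling location. It is a minimal illustration, not an NHD API: it assumes planar (projected) coordinates, and the reach identifiers, the `flowlines` structure, and the function names are hypothetical.

```python
from math import hypot


def project_onto_segment(p, a, b):
    """Return (closest point on segment a-b to p, distance from p to it)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0  # degenerate segment: both endpoints coincide
    else:
        # Fraction along the segment, clamped so the result stays on it.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    sx, sy = ax + t * dx, ay + t * dy
    return (sx, sy), hypot(px - sx, py - sy)


def snap_to_network(point, flowlines):
    """Snap a point to the nearest flowline.

    flowlines maps a (hypothetical) reach id to a list of (x, y) vertices.
    Returns (reach_id, snapped_point, distance), or None if there are no segments.
    """
    best = None
    for reach_id, vertices in flowlines.items():
        for a, b in zip(vertices, vertices[1:]):
            snapped, dist = project_onto_segment(point, a, b)
            if best is None or dist < best[2]:
                best = (reach_id, snapped, dist)
    return best


if __name__ == "__main__":
    flowlines = {
        "reach-A": [(0.0, 0.0), (10.0, 0.0)],
        "reach-B": [(0.0, 5.0), (10.0, 5.0)],
    }
    print(snap_to_network((5.0, 1.0), flowlines))
```

A shared service would do this once, against one agreed version of the network, so that every application assigns the same reach address to the same site.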

29 http://aquatic.biodata.usgs.gov BioData For more information contact: Pete Ruhl pmruhl@usgs.gov 703-648-6841


31 NAWQA BioTDB Database  NAWQA data from 1992 - present  2,294 sites  21,689 samples  6,715 macroinvertebrate community samples  2,819 fish community samples  8,749 algae community samples  2,819 reach habitat assessments  > 1,200,000 specimen records

