Published by Clement Fowler. Modified over 9 years ago.
1
Data Science for EPA's Chief Data Scientist: Big Data for Nutrients and Air Quality
Dr. Brand Niemann, Director and Senior Data Scientist/Data Journalist, Semantic Community
Data Science
Data Science for EPA EnviroAtlas
Data Science for EPA Big Data Analytics
October 19, 2015
2
Agenda
Get another preview of the National Data Science Organizers Workshop on November 5-6, 2015, and the focus on National Data Science Challenges and Hackathons.
6:30 p.m. Welcome and Introduction (Slides): Data Science for EPA EnviroAtlas Part II and Data Science for EPA Nutrient Data, with Dr. Joan Aron. Also see Earth Insights from Big Data.
7:00 p.m. Brief Member Introductions
7:15 p.m. Invited Presentation: EPA Engineer Dr. Gayle Hagler (Slides) and Ron Williams, Project Lead for ORD’s Emerging Technologies Program Area (Slides). See: Federal Crowdsourcing and Citizen Science Toolkit: The Air Sensor Toolbox: Citizen Scientists Measure Air Quality.
8:15 p.m. Anne Bowser, Researcher in Data Science and Visualization, and Co-Director, Commons Lab, Science and Technology Innovations Program, Woodrow Wilson International Center for Scholars (see Slides)
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart
http://www.meetup.com/Federal-Big-Data-Working-Group/events/223605766/
3
Purpose
This Meetup was organized for:
Robin Thottungal, Chief Data Scientist @ EPA, and Division Director, EAD, OIAA, OEI;
Greg Godbout, Chief Technology Officer, Environmental Protection Agency, and former Executive Director and Co-Founder of 18F; and
Jay Benforado, Director, National Center for Environmental Innovation at EPA, and Co-Chair, Federal Community of Practice for Crowdsourcing and Citizen Science,
for the National Data Science Organizers Workshop on November 5-6, 2015, as an example of:
data science for curated data sets,
user-centric digital services focused on the interaction between government and the people and businesses it serves, and
a Federal Community of Practice on Crowdsourcing and Citizen Science of Big Data that meets bi-monthly to share lessons learned and develop best practices for designing, implementing, and evaluating crowdsourcing and citizen science initiatives.
4
https://crowdsourcing-toolkit.sites.usa.gov/air-sensor-toolbox/
Contact Information: Ron Williams
Email: williams.ronald@epa.gov
5
Outline
Earth Insights: Data Science for Conservation International's Big Ecosystem Data (Earth Insights from Big Data)
EnviroAtlas 2014 (Shape files with no attributes) to 2015 (Geodatabase files with few ecosystem attributes) (Data Science for EPA EnviroAtlas)
National Nutrient Dataset – State (Data Science for EPA Nutrient Data)
Ohio River and Watersheds: Ohio Watershed TMDLs (Dr. Joan Aron), accessed on August 4, 2015
Maumee River Basin, Ohio (Dr. Joan Aron): Taken by Storm: How Heavy Rain is Worsening Algal Blooms in Lake Erie, With a Focus on the Maumee River in Ohio
6
Semantic Community: Data Science: Earth Insights from Big Data
Data Science for Conservation International's Big Ecosystem Data
Conservation International and HP Vertica teamed for an excellent online presentation (see Slides below) on the deluge of data from the Tropical Ecology Assessment and Monitoring (TEAM) Network. The TEAM Network is described in a Vimeo video (see 23:52 - 26:18). The presentation also includes a reference (at 16:41) to an article on The Popularity of Data Analysis Software.
7
EPA Data Science Meetup History
Previous:
EPA Data Science Products (while at EPA circa 2008-2010)
EPA/NASA Climate-Environmental Data Analytics & A Redesigned, Open Data.gov Meetup (May 6, 2014)
Data Science for EPA EnviroAtlas (June 6, 2014)
Data Science for EPA Big Data Analytics (April 17, 2015)
President's Chief Data Scientist and EPA Big Data Analytics Meetup (April 20, 2015)
Uncovering EPA Hydraulic Fracturing Trends and Data with TIBCO Spotfire (September 1, 2015)
Present:
Translating Big Data into Big Climate Ideas (April 2015)
Joan Aron Comments and Questions
8
Translating Big Data into Big Climate Ideas
By Brian R. Pickard (U.S. EPA, Landscape Ecology Division), Jeremy Baynes (Contributor), Megan Mehaffey (Contributor), Anne C. Neale (Contributor), Solutions Vol. 6, Issue 1, April 2015.
In Brief: Climate change has emerged as the significant environmental challenge of the 21st century. Therefore, understanding our changing world has forced researchers from many different fields of science to join together to tackle complicated research questions. The climate change research community now faces the daunting task of disseminating massive amounts of information about possible future climates under differing scenarios to a broad audience. They also need to make the data readily accessible so that it can be used by scientists in other research fields. One potential solution for distribution and communication of the climate scenario information may be through the EnviroAtlas, a new geospatial application developed by the United States Environmental Protection Agency and its partners. This interactive mapping tool allows users to access and explore climate change modeling information in easily understandable formats while providing a range of information on different ecosystem goods and services, or the benefits people receive from nature. By incorporating future scenarios such as land use and climate change within EnviroAtlas, we can evaluate specific components of complex ecosystems within the context of forecasted futures. Linking climate change impacts to ecosystem services, such as clean air and water, allows for opportunities to demonstrate how climate change will impact ecosystems, societies, and human health.
http://www.thesolutionsjournal.org/node/237304
9
Joan Aron Comments and Suggestions 1
I just sent EnviroAtlas a question about finding the climate data discussed in that Solutions article. I could not find climate data. According to the Solutions article, NASA provided climate data to use EnviroAtlas as a platform for disseminating climate data for ecological applications. I'm trying to build on that platform to leverage the NASA and EPA investment for water quality decision-making related to ecosystem function.
It would also be nice to connect to other applications like your Precision Farming work. If we can link your agricultural info to EnviroAtlas, that could be interesting since agricultural runoff contributes a lot of pollution and is a focus for pollution control. The characterization of agriculture is also important since USDA covers crops, forests, ranching and CAFOs.
10
Response from Anne Neale, EnviroAtlas Project Lead, US EPA, RTP, NC
Many apologies that you were not able to find the data we described. We are sorry and embarrassed that the data are not available as promised in the article. When we published the article, we felt sure that we would have the data available by the time the article was published. We ran into some unforeseen complications and although we have all the data and software available on our development server, we have not been able to push that to our public facing site yet. We anticipate that the issue will be resolved soon and we can let you know as soon as the data are accessible.
On July 22, 2015, the EnviroAtlas team will be hosting an informational meeting in downtown Portland, OR to showcase newly available data. Register here to join us!
http://enviroatlas.epa.gov/enviroatlas/Status/index.html
11
http://enviroatlas.epa.gov/enviroatlas/Datadownload/index.html
12
Disclaimer 1
The data for this download Esri File GeoDataBase (FGDB) contains many data tables and feature classes. Varying numbers of attributes within each data table were used to map the layers in the EnviroAtlas Interactive Map. We encourage users to evaluate the associated metadata for the tables and feature classes through the EnviroAtlas Interactive Map, the XML files included in this zip file, or the EPA’s Environmental Dataset Gateway (EDG).
EnviroAtlas is an interactive web-based decision support tool designed to provide information about ecosystems and the benefits people receive from those ecosystems. This application represents the first peer-reviewed public release. EnviroAtlas will continue to evolve over the coming years with periodic public releases. We welcome and request your feedback on all aspects of the website and tools.
13
Disclaimer 2
EnviroAtlas is designed for use by government, professional, academic, and community users, as well as members of the public. EnviroAtlas does not require special software, technical expertise, or a scientific background. However, it is the responsibility of the user to read and evaluate dataset limitations, restrictions, and intended use.
To the best of our knowledge, the data and information on this website are accurate, but no warranty expressed or implied is made regarding the accuracy or utility of the data for general or scientific purposes, nor shall the act of distribution constitute any such warranty. All modeled geographic data are, by their nature, imperfect and the data provided in this Atlas should not be taken as absolute truth but as the best approximation of that truth based on best available data. For site-specific data, EnviroAtlas data will not replace "boots-on-the-ground measurements" or local knowledge.
Neither EPA, EPA contractors, nor any other organizations cooperating with EPA assume any responsibility for damages or other liabilities related to the accuracy, availability, use, or misuse of the information provided on this website. EPA reserves the right to change information at any time without public notice. Any errors or omissions should be reported to the EnviroAtlas Team.
14
Semantic Community: Data Science: Data Science for EPA EnviroAtlas
15
National: 34 files, 29.8 MB
16
Portland: 22 files, 656 MB
17
Web Player
18
Joan Aron Comments and Suggestions 2
I need to have more metadata to understand the net flux of pollutants leaving agricultural fields and the locations where the flux applies. There may be multiple sources of data in the same watershed. I am especially interested in three geographical areas for the examples:
Around Coos (Oregon), HUC 17100304: Tenmile Lakes should be nearby
Around Lower Maumee (Ohio), HUC 04100009: Maumee watershed produces flux of nutrients to Lake Erie at Toledo
Around Tensas (Louisiana), HUC 08050003: near Mississippi River
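The three HUC codes above are 8-digit hydrologic unit codes, which nest by two-digit steps (region, subregion, basin, subbasin). As a minimal sketch, not part of the original slides, the codes can be split into those levels so that data sources keyed at different levels can be matched to the same watershed. The names `HucLevels`, `split_huc8`, and `AREAS` are hypothetical helpers introduced for illustration; the codes are kept as strings because leading zeros are significant.

```python
from typing import NamedTuple


class HucLevels(NamedTuple):
    """Nested hydrologic unit codes derived from one 8-digit HUC."""
    region: str      # first 2 digits
    subregion: str   # first 4 digits
    basin: str       # first 6 digits
    subbasin: str    # all 8 digits (the HUC8 itself)


def split_huc8(code: str) -> HucLevels:
    """Split an 8-digit HUC string into its nested levels.

    The code must stay a string: converting "04100009" to an integer
    would drop the leading zero and change the region.
    """
    code = code.strip()
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected an 8-digit HUC string, got {code!r}")
    return HucLevels(code[:2], code[:4], code[:6], code)


# The three areas of interest named on the slide.
AREAS = {
    "Coos (Oregon)": "17100304",
    "Lower Maumee (Ohio)": "04100009",
    "Tensas (Louisiana)": "08050003",
}

for name, code in AREAS.items():
    levels = split_huc8(code)
    print(f"{name}: region {levels.region}, subregion {levels.subregion}, "
          f"basin {levels.basin}, subbasin {levels.subbasin}")
```

Keeping every level available makes it easier to line up a HUC8-level summary with records that are only tagged at a coarser level.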
19
Nitrogen and Phosphorus Pollution Data Access Tool
http://gispub2.epa.gov/NPDAT/DataDownloads.html
http://www2.epa.gov/nutrient-policy-data/nitrogen-and-phosphorus-pollution-data-access-tool
20
http://gispub2.epa.gov/npdat/
21
http://gispub2.epa.gov/npdat/
22
Data Download
Nitrogen and phosphorus datasets, except for water quality monitoring data, are downloaded from the Viewer Download Page. Water quality monitoring data for N/P are downloaded by HUC8 using the map as described below. These monitoring data were sampled for lakes, streams, estuaries, springs, reservoirs, and impoundments from 1995 to present.
First, select one or more hydrologic units (HUC8s) by picking an action, then click on a HUC8 on the map to Add or Remove it from the list of selected HUC8s.
Now, click on one of the buttons below to download the water quality monitoring attribute or geospatial data for the selected HUC8(s). Note that clicking on the Excel or CSV button offers two separate files containing Station and Result attributes.
My Note: This Data Download goes to: http://gispub2.epa.gov/NPDAT/DataDownloads.html
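Because the CSV download arrives as two files, one with Station attributes and one with Result attributes, the two usually need to be joined before any per-watershed summary. The sketch below shows one way to do that with the Python standard library. It is an illustration under stated assumptions, not the tool's documented schema: the column names (`station_id`, `huc8`, `parameter`, `value`) and the sample rows are made-up placeholders, and the real files would need to be checked for their actual headers.

```python
import csv
import io
from collections import defaultdict

# Placeholder data standing in for the two downloaded files.
# Column names and values are hypothetical, for illustration only.
STATION_CSV = """station_id,huc8,station_name
S1,04100009,Example site A
S2,04100009,Example site B
S3,17100304,Example site C
"""

RESULT_CSV = """station_id,parameter,value
S1,Total Nitrogen,2.0
S1,Total Nitrogen,4.0
S2,Total Phosphorus,0.5
S3,Total Nitrogen,1.0
"""


def join_results_to_stations(station_file, result_file):
    """Attach station attributes (including HUC8) to each result row."""
    stations = {row["station_id"]: row for row in csv.DictReader(station_file)}
    joined = []
    for row in csv.DictReader(result_file):
        station = stations.get(row["station_id"])
        if station is None:
            continue  # skip results whose station is missing from the Station file
        joined.append({**station, **row, "value": float(row["value"])})
    return joined


def mean_by_huc8(joined, parameter):
    """Average the values of one parameter within each HUC8."""
    groups = defaultdict(list)
    for row in joined:
        if row["parameter"] == parameter:
            groups[row["huc8"]].append(row["value"])
    return {huc: sum(vals) / len(vals) for huc, vals in groups.items()}


rows = join_results_to_stations(io.StringIO(STATION_CSV), io.StringIO(RESULT_CSV))
print(mean_by_huc8(rows, "Total Nitrogen"))
```

With real downloads, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`, and the HUC8 column should be read as text so that leading zeros survive.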
23
EPANutrientIndicatorsDataSet.xlsx
Hydrologic Units (HUC8) - Summaries and Boundaries
SPARROW Total N/P Yields 2002
SPARROW Total N/P Yields 1992
Water Quality Monitoring Sites with N/P (STORET) (IN VIEWER)
Water Quality Monitoring Sites with N/P (NWIS) (IN VIEWER)
NARS N/P Values for Streams
NARS N/P Values for Lakes
Facilities Likely to Discharge N/P to Water
NPDES - Concentrated Animal Feeding Operations (CAFO) Summary
National Land Cover Dataset (NOT USED)
Waters Listed for N/P Impairments
Waters with N/P TMDLs
Sources of Drinking Water (By HUC12)
State Boundaries
Tribal Lands Boundaries (NOT AVAILABLE)
Mississippi-Atchafalaya River Basin Boundary
24
http://www.epa.state.oh.us/dsw/tmdl/index.aspx
A TMDL is a written, quantitative assessment of water quality problems in a waterbody and contributing sources of pollution. It specifies the amount a pollutant needs to be reduced to meet water quality standards (WQS), allocates pollutant load reductions, and provides the basis for taking actions needed to restore a waterbody.
25
http://www.epa.state.oh.us/dsw/tmdl/MaumeeRiver.aspx#123086615-monitoring
The Maumee River watershed is located in northwestern Ohio. It drains a total of 5,024 square miles in Ohio and flows through all or part of 18 counties. Major municipalities partially or fully in the watershed include Toledo, Defiance, Findlay, Lima, Van Wert, Napoleon and Perrysburg. The watershed is predominantly comprised of cultivated crops with some urban development, hay and pasture lands, and forest. The Maumee River is a major tributary to the western Lake Erie basin. Please see the Lake Erie program page for more information.
26
EPANutrientIndicatorsDataSet.xlsx
Auglaize River (lower) Watershed
Auglaize River (upper) Watershed
Blanchard River Watershed
Maumee River (lower) Tributaries and Lake Erie Tributaries
Maumee River Main Stem
Ottawa River Watershed (Lima area)
Powell Creek Watershed
St. Joseph River Watershed
St. Marys River Watershed
Swan Creek Watershed
Tiffin River Watershed
Maumee River Basin Select Tributaries
27
Web Player: EPA N & P Data Ecosystem
Demo of Individual Spotfire Tabs