Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist


1 Data Science for EPA's Chief Data Scientist: Big Data for Nutrients and Air Quality
Dr. Brand Niemann
Director and Senior Data Scientist/Data Journalist
Semantic Community
Data Science
Data Science for EPA EnviroAtlas
Data Science for EPA Big Data Analytics
October 19, 2015

2 Agenda
Get Another Preview of National Data Science Organizers Workshop on November 5-6, 2015, and the Focus on National Data Science Challenges and Hackathons
6:30 p.m. Welcome and Introduction. Slides: Data Science for EPA EnviroAtlas Part II and Data Science for EPA Nutrient Data with Dr. Joan Aron. Also see Earth Insights from Big Data.
7:00 p.m. Brief Member Introductions
7:15 p.m. Invited Presentation: Ron Williams, Project Lead-ACE EM-3 (Emerging Technologies), ORD’s Emerging Technologies Program Area, EPA National Exposure Research Laboratory. Slides. See: Federal Crowdsourcing and Citizen Science Toolkit: The Air Sensor Toolbox: Citizen Scientists Measure Air Quality, EPA Engineer Dr. Gayle Hagler and Dr. Denice Shaw, Associate Chief Innovation Officer, US Environmental Protection Agency, Office of Research and Development.
8:15 p.m. Anne Bowser, Researcher in Data Science and Visualization, and Co-Director, Commons Lab, Science and Technology Innovations Program, Woodrow Wilson International Center for Scholars. See Slides.
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart

3 Purpose
This Meetup was organized for:
Robin Thottungal, Chief Data Scientist, EPA, and Division Director, EAD, OIAA, OEI,
Greg Godbout, Chief Technology Officer, Environmental Protection Agency, and former Executive Director and Co-Founder of 18F, and
Jay Benforado, Director, National Center for Environmental Innovation at EPA, and Co-Chair, Federal Community of Practice for Crowdsourcing and Citizen Science,
for the National Data Science Organizers Workshop on November 5-6, 2015, as an example of:
data science for curated data sets,
user-centric digital services focused on the interaction between government and the people and businesses it serves, and
a Federal Community of Practice on Crowdsourcing and Citizen Science of Big Data that meets bi-monthly to share lessons learned and develop best practices for designing, implementing, and evaluating crowdsourcing and citizen science initiatives.

4 https://crowdsourcing-toolkit.sites.usa.gov/air-sensor-toolbox/
Contact Information: Ron Williams

5 Outline
Earth Insights: Data Science for Conservation International's Big Ecosystem Data (Earth Insights from Big Data)
EnviroAtlas 2014 (Shape files with no attributes) to 2015 (Geodatabase files with few ecosystem attributes) (Data Science for EPA EnviroAtlas)
National Nutrient Dataset – State (Data Science for EPA Nutrient Data)
Ohio River and Watersheds: Ohio Watershed TMDLs (Dr. Joan Aron), accessed on August 4, 2015
Maumee River Basin, Ohio (Dr. Joan Aron)
Taken by Storm: How Heavy Rain is Worsening Algal Blooms in Lake Erie, With a Focus on the Maumee River in Ohio

6 Semantic Community Data Science Earth Insights from Big Data
Data Science for Conservation International's Big Ecosystem Data
Conservation International and HP Vertica teamed up for an excellent online presentation (see Slides below) on the deluge of data from the Tropical Ecology Assessment and Monitoring (TEAM) Network. The TEAM Network is described in a Vimeo video (see 23: :18). The presentation also includes a reference (at 16:41) to an article on The Popularity of Data Analysis Software.

7 EPA Data Science Meetup History
Previous:
EPA Data Science Products (while at EPA circa )
EPA/NASA Climate-Environmental Data Analytics & A Redesigned, Open Data.gov Meetup (May 6, 2014)
Data Science for EPA EnviroAtlas (June 6, 2014)
Data Science for EPA Big Data Analytics (April 17, 2015)
President's Chief Data Scientist and EPA Big Data Analytics Meetup (April 20, 2015)
Uncovering EPA Hydraulic Fracturing Trends and Data with TIBCO Spotfire (September 1, 2015)
Present:
Translating Big Data into Big Climate Ideas (April 2015)
Joan Aron Comments and Questions

8 Translating Big Data into Big Climate Ideas
By Brian R. Pickard (U.S. EPA, Landscape Ecology Division), Jeremy Baynes (Contributor), Megan Mehaffey (Contributor), Anne C. Neale (Contributor), Solutions Vol. 6, Issue 1, April 2015. In Brief: Climate change has emerged as the significant environmental challenge of the 21st century. Therefore, understanding our changing world has forced researchers from many different fields of science to join together to tackle complicated research questions. The climate change research community now faces the daunting task of disseminating massive amounts of information about possible future climates under differing scenarios to a broad audience. They also need to make the data readily accessible so that it can be used by scientists in other research fields. One potential solution for distribution and communication of the climate scenario information may be through the EnviroAtlas, a new geospatial application developed by the United States Environmental Protection Agency and its partners. This interactive mapping tool allows users to access and explore climate change modeling information in easily understandable formats while providing a range of information on different ecosystem goods and services, or the benefits people receive from nature. By incorporating future scenarios such as land use and climate change within EnviroAtlas, we can evaluate specific components of complex ecosystems within the context of forecasted futures. Linking climate change impacts to ecosystem services, such as clean air and water, allows for opportunities to demonstrate how climate change will impact ecosystems, societies, and human health.

9 Joan Aron Comments and Suggestions 1
I just sent EnviroAtlas a question about finding the climate data discussed in that Solutions article. I could not find climate data. According to the Solutions article, NASA provided climate data to use EnviroAtlas as a platform for disseminating climate data for ecological applications. I'm trying to build on that platform to leverage the NASA and EPA investment for water quality decision-making related to ecosystem function. It would also be nice to connect to other applications like your Precision Farming work. If we can link your agricultural info to EnviroAtlas, that could be interesting since agricultural runoff contributes a lot of pollution and is a focus for pollution control. The characterization of agriculture is also important since USDA covers crops, forests, ranching and CAFOs.

10 Response from Anne Neale, EnviroAtlas Project Lead, US EPA, RTP, NC
Many apologies that you were not able to find the data we described. We are sorry and embarrassed that the data are not available as promised in the article. When we published the article, we felt sure that we would have the data available by the time the article was published. We ran into some unforeseen complications and although we have all the data and software available on our development server, we have not been able to push that to our public facing site yet. We anticipate that the issue will be resolved soon and we can let you know as soon as the data are accessible. On July 22, 2015, the EnviroAtlas team will be hosting an informational meeting in downtown Portland, OR to showcase newly available data. Register here to join us!

12 Disclaimer 1
The Esri File Geodatabase (FGDB) in this download contains many data tables and feature classes. Varying numbers of attributes within each data table were used to map the layers in the EnviroAtlas Interactive Map. We encourage users to evaluate the associated metadata for the tables and feature classes through the EnviroAtlas Interactive Map, the XML files included in this zip file, or the EPA’s Environmental Dataset Gateway (EDG). EnviroAtlas is an interactive web-based decision support tool designed to provide information about ecosystems and the benefits people receive from those ecosystems. This application represents the first peer-reviewed public release. EnviroAtlas will continue to evolve over the coming years with periodic public releases. We welcome and request your feedback on all aspects of the website and tools.

13 Disclaimer 2
EnviroAtlas is designed for use by government, professional, academic, and community users, as well as members of the public. EnviroAtlas does not require special software, technical expertise, or a scientific background. However, it is the responsibility of the user to read and evaluate dataset limitations, restrictions, and intended use. To the best of our knowledge, the data and information on this website are accurate, but no warranty expressed or implied is made regarding the accuracy or utility of the data for general or scientific purposes, nor shall the act of distribution constitute any such warranty. All modeled geographic data are, by their nature, imperfect and the data provided in this Atlas should not be taken as absolute truth but as the best approximation of that truth based on best available data. For site-specific data, EnviroAtlas data will not replace "boots-on-the-ground measurements" or local knowledge. Neither EPA, EPA contractors, nor any other organizations cooperating with EPA assume any responsibility for damages or other liabilities related to the accuracy, availability, use, or misuse of the information provided on this website. EPA reserves the right to change information at any time without public notice. Any errors or omissions should be reported to the EnviroAtlas Team.

14 Semantic Community Data Science Data Science for EPA EnviroAtlas

15 National: 34 Files 29.8 MB

16 Portland: 22 Files 656 MB

17 Web Player

18 Joan Aron Comments and Suggestions 2
I need to have more metadata to understand the net flux of pollutants leaving agricultural fields and the locations where the flux applies. There may be multiple sources of data in the same watershed. I am especially interested in three geographical areas for the examples:
Around Coos (Oregon) HUC; Tenmile Lakes should be nearby
Around Lower Maumee (Ohio) HUC; Maumee watershed produces flux of nutrients to Lake Erie at Toledo
Around Tensas (Louisiana) HUC; near Mississippi River

19 Nitrogen and Phosphorus Pollution Data Access Tool

22 Data Download
Nitrogen and phosphorus datasets, except for water quality monitoring data, are downloaded from the Viewer Download Page. Water quality monitoring data for N/P are downloaded by HUC8 using the map as described below. These monitoring data were sampled for lakes, streams, estuaries, springs, reservoirs, and impoundments from 1995 to present.
First, select one or more hydrologic units (HUC8s) by picking an action, then click on a HUC8 on the map to Add or Remove it from the list of selected HUC8s. Now, click on one of the buttons below to download the water quality monitoring attribute or geospatial data for the selected HUC8(s). Note that clicking on the Excel or CSV button offers two separate files containing Station and Result attributes.
My Note: This Data Download goes to:
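Because the CSV download arrives as two separate files (Station attributes and Result attributes), a typical first step is to join them so that each measurement carries its station's location and waterbody type. The sketch below shows one way this might be done with the Python standard library. The column names (StationID, HUC8, WaterbodyType, Parameter, Value) and the sample rows are hypothetical placeholders, not the actual headers or data in the downloaded files.

```python
import csv
import io

# Hypothetical join column; the real Station and Result files may use another name.
STATION_KEY = "StationID"

def join_station_results(station_file, result_file, key=STATION_KEY):
    """Attach station attributes to every result row that shares the key."""
    stations = {row[key]: row for row in csv.DictReader(station_file)}
    joined = []
    for result in csv.DictReader(result_file):
        station = stations.get(result[key])
        if station is None:
            continue  # skip results that have no matching station record
        joined.append({**station, **result})
    return joined

# Made-up sample rows standing in for the two downloaded files.
SAMPLE_STATIONS = (
    "StationID,HUC8,WaterbodyType\n"
    "S1,HUC8-A,Stream\n"
    "S2,HUC8-A,Lake\n"
)
SAMPLE_RESULTS = (
    "StationID,Parameter,Value\n"
    "S1,Total Nitrogen,1.2\n"
    "S1,Total Phosphorus,0.08\n"
    "S3,Total Nitrogen,0.9\n"
)

rows = join_station_results(io.StringIO(SAMPLE_STATIONS), io.StringIO(SAMPLE_RESULTS))
for row in rows:
    print(row["StationID"], row["HUC8"], row["Parameter"], row["Value"])
```

With real downloads, the two io.StringIO objects would be replaced by files opened with open(path, newline=""), and the key adjusted to whatever identifier the Station and Result files actually share.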

23 EPANutrientIndicatorsDataSet.xlsx
Hydrologic Units (HUC8) - Summaries and Boundaries
SPARROW Total N/P Yields 2002
SPARROW Total N/P Yields 1992
Water Quality Monitoring Sites with N/P (STORET) (IN VIEWER)
Water Quality Monitoring Sites with N/P (NWIS) (IN VIEWER)
NARS N/P Values for Streams
NARS N/P Values for Lakes
Facilities Likely to Discharge N/P to Water
NPDES - Concentrated Animal Feeding Operations (CAFO) Summary
National Land Cover Dataset (NOT USED)
Waters Listed for N/P Impairments
Waters with N/P TMDLs
Sources of Drinking Water (By HUC12)
State Boundaries
Tribal Lands Boundaries (NOT AVAILABLE)
Mississippi-Atchafalaya River Basin Boundary

24 A TMDL is a written, quantitative assessment of water quality problems in a waterbody and contributing sources of pollution. It specifies the amount a pollutant needs to be reduced to meet water quality standards (WQS), allocates pollutant load reductions, and provides the basis for taking actions needed to restore a waterbody.
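The allocation described above is commonly written as TMDL = sum of wasteload allocations (point sources) + sum of load allocations (nonpoint sources and background) + margin of safety. The sketch below works through that arithmetic and the resulting percent reduction. The function name and all numbers are hypothetical, chosen only to illustrate the calculation; they do not come from any actual TMDL report.

```python
def required_reduction(current_load, wasteload_allocations, load_allocations, margin_of_safety):
    """Return (tmdl, fraction of the current load that must be removed)."""
    if current_load <= 0:
        raise ValueError("current_load must be positive")
    # TMDL = sum(WLA) + sum(LA) + MOS
    tmdl = sum(wasteload_allocations) + sum(load_allocations) + margin_of_safety
    # No reduction is needed when the current load is already at or below the TMDL.
    reduction = max(0.0, (current_load - tmdl) / current_load)
    return tmdl, reduction

# Hypothetical loads in kg/day, for illustration only.
tmdl, reduction = required_reduction(
    current_load=200.0,
    wasteload_allocations=[30.0, 20.0],  # point sources
    load_allocations=[80.0],             # nonpoint sources and background
    margin_of_safety=10.0,
)
print(f"TMDL = {tmdl:.1f} kg/day, required reduction = {reduction:.0%}")
```

In this made-up case the allowable load is 140 kg/day against a current load of 200 kg/day, so the load would need to fall by 30 percent.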

25 http://www.epa.state.oh.us/dsw/tmdl/MaumeeRiver
The Maumee River watershed is located in northwestern Ohio. It drains a total of 5,024 square miles in Ohio and flows through all or part of 18 counties. Major municipalities partially or fully in the watershed include Toledo, Defiance, Findlay, Lima, Van Wert, Napoleon and Perrysburg. The watershed is predominantly comprised of cultivated crops with some urban development, hay and pasture lands, and forest. The Maumee River is a major tributary to the western Lake Erie basin. Please see the Lake Erie program page for more information.

26 EPANutrientIndicatorsDataSet.xlsx
Auglaize River (lower) Watershed
Auglaize River (upper) Watershed
Blanchard River Watershed
Maumee River (lower) Tributaries and Lake Erie Tributaries
Maumee River Main Stem
Ottawa River Watershed (Lima area)
Powell Creek Watershed
St. Joseph River Watershed
St. Marys River Watershed
Swan Creek Watershed
Tiffin River Watershed
Maumee River Basin Select Tributaries

27 Web Player EPA N & P Data Ecosystem Demo of Individual Spotfire Tabs

28 Agenda
Get Another Preview of National Data Science Organizers Workshop on November 5-6, 2015, and the Focus on National Data Science Challenges and Hackathons
6:30 p.m. Welcome and Introduction. Slides: Data Science for EPA EnviroAtlas Part II and Data Science for EPA Nutrient Data with Dr. Joan Aron. Also see Earth Insights from Big Data.
7:00 p.m. Brief Member Introductions
7:15 p.m. Invited Presentation: Ron Williams, Project Lead-ACE EM-3 (Emerging Technologies), ORD’s Emerging Technologies Program Area, EPA National Exposure Research Laboratory. Slides. See: Federal Crowdsourcing and Citizen Science Toolkit: The Air Sensor Toolbox: Citizen Scientists Measure Air Quality, EPA Engineer Dr. Gayle Hagler and Dr. Denice Shaw, Associate Chief Innovation Officer, US Environmental Protection Agency, Office of Research and Development.
8:15 p.m. Anne Bowser, Researcher in Data Science and Visualization, and Co-Director, Commons Lab, Science and Technology Innovations Program, Woodrow Wilson International Center for Scholars. See Slides.
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart
