Data Science for EPA's Chief Data Scientist: Big Data for Nutrients and Air Quality Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist.

Presentation transcript:

Data Science for EPA's Chief Data Scientist: Big Data for Nutrients and Air Quality
Dr. Brand Niemann, Director and Senior Data Scientist/Data Journalist, Semantic Community
Data Science
Data Science for EPA EnviroAtlas
Data Science for EPA Big Data Analytics
October 19,

Agenda

Get another preview of the National Data Science Organizers Workshop on November 5-6, 2015, and the focus on National Data Science Challenges and Hackathons.

6:30 p.m. Welcome and Introduction (Slides): Data Science for EPA EnviroAtlas Part II and Data Science for EPA Nutrient Data with Dr. Joan Aron. Also see Earth Insights from Big Data.
7:00 p.m. Brief Member Introductions
7:15 p.m. Invited Presentation: EPA Engineer Dr. Gayle Hagler (Slides) and Ron Williams, Project Lead for ORD’s Emerging Technologies Program Area (Slides). See: Federal Crowdsourcing and Citizen Science Toolkit: The Air Sensor Toolbox: Citizen Scientists Measure Air Quality.
8:15 p.m. Anne Bowser, Researcher in Data Science and Visualization, and Co-Director, Commons Lab, Science and Technology Innovations Program, Woodrow Wilson International Center for Scholars (see Slides)
8:30 p.m. Open Discussion
8:45 p.m. Networking
9:00 p.m. Depart

Purpose

This Meetup was organized for:
Robin Thottungal, Chief Data Scientist, EPA, and Division Director, EAD, OIAA, OEI;
Greg Godbout, Chief Technology Officer, Environmental Protection Agency, and former Executive Director and Co-Founder of 18F; and
Jay Benforado, Director, National Center for Environmental Innovation at EPA, and Co-Chair, Federal Community of Practice for Crowdsourcing and Citizen Science.

It was also organized for the National Data Science Organizers Workshop on November 5-6, 2015, as an example of:
data science for curated data sets;
user-centric digital services focused on the interaction between government and the people and businesses it serves; and
a Federal Community of Practice on Crowdsourcing and Citizen Science of Big Data that meets bi-monthly to share lessons learned and develop best practices for designing, implementing, and evaluating crowdsourcing and citizen science initiatives.

Contact Information: Ron Williams

Outline

Earth Insights: Data Science for Conservation International's Big Ecosystem Data
    Earth Insights from Big Data
EnviroAtlas: 2014 (shapefiles with no attributes) to 2015 (geodatabase files with few ecosystem attributes)
    Data Science for EPA EnviroAtlas
National Nutrient Dataset – State
    Data Science for EPA Nutrient Data
Ohio River and Watersheds: Ohio Watershed TMDLs (Dr. Joan Aron), accessed on August 4, 2015
Maumee River Basin, Ohio (Dr. Joan Aron)
Taken by Storm: How Heavy Rain is Worsening Algal Blooms in Lake Erie, With a Focus on the Maumee River in Ohio

Semantic Community Data Science
Earth Insights from Big Data
Data Science for Conservation International's Big Ecosystem Data

Conservation International and HP Vertica teamed for an excellent online presentation (see Slides below) on the deluge of data from the Tropical Ecology Assessment and Monitoring (TEAM) Network. The TEAM Network is described in a Vimeo video (see 23: :18). The presentation also includes a reference (at 16:41) to an article on The Popularity of Data Analysis Software.

EPA Data Science Meetup History

Previous:
EPA Data Science Products (while at EPA circa )
EPA/NASA Climate-Environmental Data Analytics & A Redesigned, Open Data.gov Meetup (May 6, 2014)
Data Science for EPA EnviroAtlas (June 6, 2014)
Data Science for EPA Big Data Analytics (April 17, 2015)
President's Chief Data Scientist and EPA Big Data Analytics Meetup (April 20, 2015)
Uncovering EPA Hydraulic Fracturing Trends and Data with TIBCO Spotfire (September 1, 2015)

Present:
Translating Big Data into Big Climate Ideas (April 2015)
Joan Aron Comments and Questions

Translating Big Data into Big Climate Ideas

By Brian R. Pickard (U.S. EPA, Landscape Ecology Division), Jeremy Baynes (Contributor), Megan Mehaffey (Contributor), Anne C. Neale (Contributor), Solutions Vol. 6, Issue 1, April

In Brief: Climate change has emerged as the significant environmental challenge of the 21st century. Therefore, understanding our changing world has forced researchers from many different fields of science to join together to tackle complicated research questions. The climate change research community now faces the daunting task of disseminating massive amounts of information about possible future climates under differing scenarios to a broad audience. They also need to make the data readily accessible so that it can be used by scientists in other research fields.

One potential solution for distribution and communication of the climate scenario information may be through the EnviroAtlas, a new geospatial application developed by the United States Environmental Protection Agency and its partners. This interactive mapping tool allows users to access and explore climate change modeling information in easily understandable formats while providing a range of information on different ecosystem goods and services, or the benefits people receive from nature. By incorporating future scenarios such as land use and climate change within EnviroAtlas, we can evaluate specific components of complex ecosystems within the context of forecasted futures. Linking climate change impacts to ecosystem services, such as clean air and water, allows for opportunities to demonstrate how climate change will impact ecosystems, societies, and human health.

Joan Aron Comments and Suggestions 1

I just sent EnviroAtlas a question about finding the climate data discussed in that Solutions article. I could not find climate data. According to the Solutions article, NASA provided climate data to use EnviroAtlas as a platform for disseminating climate data for ecological applications. I'm trying to build on that platform to leverage the NASA and EPA investment for water quality decision-making related to ecosystem function.

It would also be nice to connect to other applications like your Precision Farming work. If we can link your agricultural info to EnviroAtlas, that could be interesting since agricultural runoff contributes a lot of pollution and is a focus for pollution control. The characterization of agriculture is also important since USDA covers crops, forests, ranching and CAFOs.

Response from Anne Neale, EnviroAtlas Project Lead, US EPA, RTP, NC

Many apologies that you were not able to find the data we described. We are sorry and embarrassed that the data are not available as promised in the article. When we published the article, we felt sure that we would have the data available by the time the article was published. We ran into some unforeseen complications and although we have all the data and software available on our development server, we have not been able to push that to our public facing site yet. We anticipate that the issue will be resolved soon and we can let you know as soon as the data are accessible.

On July 22, 2015, the EnviroAtlas team will be hosting an informational meeting in downtown Portland, OR to showcase newly available data. Register here to join us!


Disclaimer 1

The data for this download Esri File GeoDataBase (FGDB) contains many data tables and feature classes. Varying numbers of attributes within each data table were used to map the layers in the EnviroAtlas Interactive Map. We encourage users to evaluate the associated metadata for the tables and feature classes through the EnviroAtlas Interactive Map, the XML files included in this zip file, or the EPA’s Environmental Dataset Gateway (EDG).

EnviroAtlas is an interactive web-based decision support tool designed to provide information about ecosystems and the benefits people receive from those ecosystems. This application represents the first peer-reviewed public release. EnviroAtlas will continue to evolve over the coming years with periodic public releases. We welcome and request your feedback on all aspects of the website and tools.

Disclaimer 2

EnviroAtlas is designed for use by government, professional, academic, and community users, as well as members of the public. EnviroAtlas does not require special software, technical expertise, or a scientific background. However, it is the responsibility of the user to read and evaluate dataset limitations, restrictions, and intended use.

To the best of our knowledge, the data and information on this website are accurate, but no warranty expressed or implied is made regarding the accuracy or utility of the data for general or scientific purposes, nor shall the act of distribution constitute any such warranty. All modeled geographic data are, by their nature, imperfect and the data provided in this Atlas should not be taken as absolute truth but as the best approximation of that truth based on best available data. For site-specific data, EnviroAtlas data will not replace "boots-on-the-ground measurements" or local knowledge.

Neither EPA, EPA contractors, nor any other organizations cooperating with EPA assume any responsibility for damages or other liabilities related to the accuracy, availability, use, or misuse of the information provided on this website. EPA reserves the right to change information at any time without public notice. Any errors or omissions should be reported to the EnviroAtlas Team.

Semantic Community Data Science
Data Science for EPA EnviroAtlas

National: 34 files, 29.8 MB

Portland: 22 files, 656 MB

Web Player

Joan Aron Comments and Suggestions 2

I need to have more metadata to understand the net flux of pollutants leaving agricultural fields and the locations where the flux applies. There may be multiple sources of data in the same watershed. I am especially interested in three geographical areas for the examples:
Around Coos (Oregon) HUC (Tenmile Lakes should be nearby)
Around Lower Maumee (Ohio) HUC (Maumee watershed produces flux of nutrients to Lake Erie at Toledo)
Around Tensas (Louisiana) HUC (near Mississippi River)

Nitrogen and Phosphorus Pollution Data Access Tool


Data Download

Nitrogen and phosphorus datasets, except for water quality monitoring data, are downloaded from the Viewer Download Page. Water quality monitoring data for N/P are downloaded by HUC8 using the map as described below. These monitoring data were sampled for lakes, streams, estuaries, springs, reservoirs and impoundments from 1995 to present.

First, select one or more hydrologic units (HUC8s) by picking an action, then click on a HUC8 on the map to Add or Remove it from the list of selected HUC8s. Now, click on one of the buttons below to download the water quality monitoring attribute or geospatial data for the selected HUC8(s). Note that clicking on the Excel or CSV button offers two separate files containing Station and Result attributes.

My Note: This Data Download goes to:
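Because the Excel or CSV download arrives as two files (Station and Result attributes), a first analysis step is usually to join them. The following is a minimal sketch of that join, assuming the two files share a station identifier. The column names (StationId, HUC8, Characteristic, Value) and the sample rows are hypothetical placeholders, not the actual download schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the two downloaded files.
# Real column names and values in the EPA download may differ.
STATION_CSV = """StationId,StationName,HUC8
S1,Example Creek at Bridge,HUC8-A
S2,Example Lake Outlet,HUC8-B
"""

RESULT_CSV = """StationId,SampleDate,Characteristic,Value,Unit
S1,2014-06-01,Total Nitrogen,2.5,mg/L
S1,2014-07-01,Total Phosphorus,0.25,mg/L
S2,2014-06-15,Total Nitrogen,1.5,mg/L
S3,2014-06-20,Total Nitrogen,9.0,mg/L
"""


def join_results_to_stations(station_text, result_text):
    """Attach station attributes to each result row, keyed on StationId.

    Returns (joined_rows, orphan_rows). Orphans are results whose station
    does not appear in the Station file.
    """
    stations = {
        row["StationId"]: row
        for row in csv.DictReader(io.StringIO(station_text))
    }
    joined, orphans = [], []
    for result in csv.DictReader(io.StringIO(result_text)):
        station = stations.get(result["StationId"])
        if station is None:
            orphans.append(result)
            continue
        # Merge both rows and convert the measurement to a number.
        joined.append({**station, **result, "Value": float(result["Value"])})
    return joined, orphans


def mean_by_huc8(joined_rows, characteristic):
    """Average one characteristic (e.g. Total Nitrogen) per HUC8."""
    values = defaultdict(list)
    for row in joined_rows:
        if row["Characteristic"] == characteristic:
            values[row["HUC8"]].append(row["Value"])
    return {huc: sum(v) / len(v) for huc, v in values.items()}


joined, orphans = join_results_to_stations(STATION_CSV, RESULT_CSV)
tn_means = mean_by_huc8(joined, "Total Nitrogen")
```

Keeping the orphan rows separate, rather than silently dropping them, makes it easier to spot results whose station metadata is missing from the download.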

EPANutrientIndicatorsDataSet.xlsx

Hydrologic Units (HUC8) - Summaries and Boundaries
SPARROW Total N/P Yields 2002
SPARROW Total N/P Yields 1992
Water Quality Monitoring Sites with N/P (STORET) (IN VIEWER)
Water Quality Monitoring Sites with N/P (NWIS) (IN VIEWER)
NARS N/P Values for Streams
NARS N/P Values for Lakes
Facilities Likely to Discharge N/P to Water
NPDES - Concentrated Animal Feeding Operations (CAFO) Summary
National Land Cover Dataset (NOT USED)
Waters Listed for N/P Impairments
Waters with N/P TMDLs
Sources of Drinking Water (By HUC12)
State Boundaries
Tribal Lands Boundaries (NOT AVAILABLE)
Mississippi-Atchafalaya River Basin Boundary
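Most tables in this workbook are keyed by HUC8, while the drinking water sources are listed by HUC12. Hydrologic unit codes are hierarchical, so the first eight digits of a HUC12 identify its parent HUC8, and a HUC12 table can be rolled up to match the others. The sketch below illustrates that roll-up; the function names are made up for this example and the codes are placeholders, not real hydrologic units.

```python
from collections import defaultdict


def huc8_of(huc12: str) -> str:
    """Return the parent HUC8 of a 12-digit hydrologic unit code.

    Codes are handled as strings so that leading zeros are preserved.
    """
    if len(huc12) != 12 or not huc12.isdigit():
        raise ValueError(f"not a 12-digit HUC: {huc12!r}")
    return huc12[:8]


def count_by_huc8(huc12_codes):
    """Count how many HUC12 records fall inside each HUC8."""
    counts = defaultdict(int)
    for code in huc12_codes:
        counts[huc8_of(code)] += 1
    return dict(counts)


# Placeholder codes for illustration only; these are not real hydrologic units.
example_huc12 = ["000000010101", "000000010102", "000000020101"]
rollup = count_by_huc8(example_huc12)
```

Reading the HUC columns as text when loading the spreadsheet matters here: if a tool parses them as integers, any leading zero is lost and the prefix match no longer works.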

A TMDL is a written, quantitative assessment of water quality problems in a waterbody and contributing sources of pollution. It specifies the amount a pollutant needs to be reduced to meet water quality standards (WQS), allocates pollutant load reductions, and provides the basis for taking actions needed to restore a waterbody.
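The slide does not give the arithmetic, but a TMDL is commonly expressed as the sum of wasteload allocations (point sources), load allocations (nonpoint sources and background), and a margin of safety. The sketch below uses that accounting to estimate the reduction a waterbody would need. The function names are illustrative and all loads are hypothetical numbers, not values from any Ohio TMDL.

```python
def tmdl_total(wasteload_allocations, load_allocations, margin_of_safety):
    """Allowable load: TMDL = sum(WLA) + sum(LA) + MOS."""
    return sum(wasteload_allocations) + sum(load_allocations) + margin_of_safety


def required_reduction_percent(current_load, allowable_load):
    """Percent by which the current load must fall to meet the allowable load."""
    if current_load <= 0:
        raise ValueError("current_load must be positive")
    if allowable_load >= current_load:
        return 0.0  # already meeting the target
    return 100.0 * (current_load - allowable_load) / current_load


# Hypothetical phosphorus loads in kg/day, for illustration only.
wla = [20.0, 10.0]   # two permitted point sources
la = [60.0]          # nonpoint sources, e.g. agricultural runoff
mos = 10.0           # margin of safety

allowable = tmdl_total(wla, la, mos)
reduction = required_reduction_percent(250.0, allowable)
```

With these made-up inputs the allowable load is 100 kg/day, so a current load of 250 kg/day would need a 60 percent reduction.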

The Maumee River watershed is located in northwestern Ohio. It drains a total of 5,024 square miles in Ohio and flows through all or part of 18 counties. Major municipalities partially or fully in the watershed include Toledo, Defiance, Findlay, Lima, Van Wert, Napoleon and Perrysburg. The watershed is predominantly comprised of cultivated crops with some urban development, hay and pasture lands, and forest. The Maumee River is a major tributary to the western Lake Erie basin. Please see the Lake Erie program page for more information.

EPANutrientIndicatorsDataSet.xlsx

Auglaize River (lower) Watershed
Auglaize River (upper) Watershed
Blanchard River Watershed
Maumee River (lower) Tributaries and Lake Erie Tributaries
Maumee River Main Stem
Ottawa River Watershed (Lima area)
Powell Creek Watershed
St. Joseph River Watershed
St. Marys River Watershed
Swan Creek Watershed
Tiffin River Watershed
Maumee River Basin Select Tributaries

Web Player
EPA N & P Data Ecosystem
Demo of Individual Spotfire Tabs
