NSIDC Jupyter Hub prototyping
Part of the Cross-DAAC Cloud Analysis Toolkit to Enable Earth Science (CATEES)
Brian Johnson and Doug Young
CATEES Overview – cloud prototyping activity
Objectives
• Provide easy-to-use tools for working with EOSDIS data
• Provide tools in a convenient package to users
• Show users how to access the EOSDIS API via scripts
• Show users how to use analytics-optimized cloud storage for analysis
• Demonstrate cross-EOSDIS development
Approach
• Python software and other materials: library routines (open-source and DAAC-contributed), interdisciplinary use case examples, data (links + local), sample IPython notebooks
• Deploy to cloud for usage
• Develop a process for joint development
CATEES Roadmap
Status – NSIDC example in GHRC Jupyter Hub
Sample code: visualize the digital SAR mosaic and elevation map of the Greenland ice sheet (Bruce Raup, Paul Madden, Doug Young; NSIDC)
Cross-DAAC example
Goal: Provide a “science analytics” example that integrates data sets from more than one DAAC and is extensible to other data sets, analyses, and larger domains (i.e., analytics at scale)
Approach: Build a Jupyter notebook that contains five fundamental “building blocks” (or functional elements) that address a common science workflow, e.g.:
• Get data from three different sources (“connectors”)
• Access the data (“extract”)
• Prepare the data (subset, regrid, bin, …)
• Run a simple correlation analysis
• Display the results
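The five building blocks can be sketched as small composable functions. This is only an illustration under stated assumptions, not the notebook's actual code: every function name here (`connect`, `extract`, `prepare`, `analyze`, `display`, `run_workflow`) is hypothetical, and the "connector" returns synthetic random arrays rather than contacting any DAAC service.

```python
import numpy as np

def connect(source_name, shape=(4, 6, 8), seed=0):
    """Hypothetical 'connector': returns a synthetic (time, lat, lon) array.
    A real connector would open a remote OPeNDAP/HTTP resource instead."""
    rng = np.random.default_rng(seed)
    return {"name": source_name, "values": rng.random(shape)}

def extract(dataset):
    """'Extract': pull the data array out of the dataset container."""
    return dataset["values"]

def prepare(values, rows, cols):
    """'Prepare': subset every time step to a common spatial window."""
    return values[:, rows, cols]

def analyze(x, y):
    """'Analyze': Pearson correlation pooled over all space-time samples."""
    return float(np.corrcoef(x.ravel(), y.ravel())[0, 1])

def display(label, r):
    """'Display': text summary; a notebook would draw a map or plot here."""
    return f"{label}: r = {r:+.3f}"

def run_workflow():
    """Chain the five blocks for three synthetic sources."""
    window = (slice(1, 5), slice(2, 7))  # assumed common window
    names = ["soil_moisture", "ndvi", "precipitation"]
    fields = {
        name: prepare(extract(connect(name, seed=i)), *window)
        for i, name in enumerate(names)
    }
    lines = []
    for driver in ("soil_moisture", "precipitation"):
        r = analyze(fields[driver], fields["ndvi"])
        lines.append(display(f"{driver} vs ndvi", r))
    return fields, lines

fields, lines = run_workflow()
for line in lines:
    print(line)
```

Keeping each block behind its own function means a different data source or analysis can be swapped in without touching the rest of the chain, which is the extensibility the goal statement asks for.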
Science use case
How do different vegetation communities respond to long-term changes in water availability in Colorado?
• Expect vegetation productivity (growth) to respond to long-term trends in soil moisture
• Want to evaluate spatial and temporal relationships between soil moisture, vegetation greenness, and ecoregions
Data products
• SMAP soil moisture (SPL2SMP), NSIDC DAAC: HDF5, OPeNDAP access
• MODIS vegetation indices (MOD13A2), LP DAAC: HDF-EOS, OPeNDAP
• Precipitation (Daymet), ORNL DAAC: netCDF-4, HTTP or THREDDS
Figure panels: ecoregion map for Colorado; SMAP 36-km soil moisture; MODIS NDVI composite; Daymet daily precipitation map
Data prep
• Regridded the four data sets to a user-specified projection
• Extracted the same extent (state of Colorado)
• Integrated data with different resolutions
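Two of these preparation steps can be illustrated with plain NumPy. The sketch below assumes the grids already share a regular latitude/longitude layout, so it does not perform any reprojection, and it coarsens by simple block averaging, which is one common way to bring a fine grid to a coarser one but not necessarily the method used in the notebook. The helper names `block_mean` and `subset_bbox` are hypothetical, and the Colorado bounding box values are approximate.

```python
import numpy as np

def block_mean(fine, factor):
    """Coarsen a 2-D grid by averaging factor x factor blocks of cells."""
    ny, nx = fine.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    blocks = fine.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

def subset_bbox(grid, lats, lons, south, north, west, east):
    """Keep only the cells whose centers fall inside the bounding box."""
    keep_lat = (lats >= south) & (lats <= north)
    keep_lon = (lons >= west) & (lons <= east)
    return grid[np.ix_(keep_lat, keep_lon)], lats[keep_lat], lons[keep_lon]

# Coarsen a 4x4 grid to 2x2.
fine = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_mean(fine, 2)

# Approximate Colorado extent in degrees (assumed values for illustration).
COLORADO = dict(south=37.0, north=41.0, west=-109.05, east=-102.05)
lats = np.array([36.5, 37.5, 38.5, 39.5, 40.5, 41.5])
lons = np.array([-110.0, -108.0, -106.0, -104.0, -102.0])
grid = np.zeros((lats.size, lons.size))
clipped, clip_lats, clip_lons = subset_bbox(grid, lats, lons, **COLORADO)
print(coarse)
print(clipped.shape)
```

Once every product has been coarsened to the same cell size and clipped to the same extent, the arrays line up cell for cell and can be compared directly.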
Analysis
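A simple correlation analysis grouped by ecoregion might look like the sketch below. It is an assumption about how the comparison could be set up, not the notebook's actual analysis: the function `correlation_by_region` is hypothetical, the inputs are synthetic, and the correlation is a Pearson coefficient pooled over all time steps and cells in each region.

```python
import numpy as np

def correlation_by_region(x, y, regions):
    """Pearson r between x and y for each region id.

    x, y: arrays of shape (time, ny, nx) on the same grid.
    regions: integer array of shape (ny, nx), one id per cell.
    """
    results = {}
    for region_id in np.unique(regions):
        mask = regions == region_id
        xs = x[:, mask].ravel()  # all time steps, cells in this region
        ys = y[:, mask].ravel()
        results[int(region_id)] = float(np.corrcoef(xs, ys)[0, 1])
    return results

# Synthetic stand-ins: "soil moisture" x and "greenness" y on a 2x2 grid.
rng = np.random.default_rng(42)
regions = np.array([[1, 1], [2, 2]])
soil = rng.random((5, 2, 2))
# Region 1 responds positively to soil moisture, region 2 negatively.
green = np.where(regions == 1, 2.0 * soil + 1.0, -soil)

r_by_region = correlation_by_region(soil, green, regions)
print(r_by_region)
```

Because the synthetic greenness is an exact linear function of soil moisture within each region, the coefficients come out at the extremes; real data would give values in between.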
Benefits for users