ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation Ryan O’Kuinghttons, Ben Koziol, Robert.


ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation
Ryan O’Kuinghttons, Ben Koziol, Robert Oehmke, Cecelia DeLuca, Gerhard Theurich, Peggy Li, Joseph Jacob
Cooperative Institute for Research in Environmental Sciences
NOAA Environmental Software Infrastructure and Interoperability Project
European Geosciences Union General Assembly, Vienna, Austria, April 22, 2016

ESMF and ESMPy
- The Earth System Modeling Framework (ESMF) is open source software for building modeling components and coupling them together to form weather prediction, climate, coastal, and other applications
  - Provides infrastructure for time management, data communications, metadata and I/O, running models as web services, and grid remapping
  - Supports a full Fortran interface and limited C and Python interfaces
- ESMF provides a mature, high performance regridding package
  - Transforms data from one grid to another by generating and applying interpolation weights
  - Supports structured and unstructured, global and regional, 2D and 3D grids, with many options
  - Fully parallel and highly scalable
- The Python interface to ESMF (ESMPy) offers access to the regridding functionality and other related features of ESMF
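The "generating and applying interpolation weights" step can be pictured as a sparse matrix-vector product. The sketch below is not ESMF code: the function name apply_weights, the toy grids, and the weight values are hypothetical, and the weights are assumed to be stored as (destination index, source index, value) triplets.

```python
import numpy as np

def apply_weights(src, row, col, S, n_dst):
    """Apply sparse weights: dst[row[k]] += S[k] * src[col[k]] for every k."""
    dst = np.zeros(n_dst, dtype=float)
    # np.add.at accumulates correctly even when a destination index repeats
    np.add.at(dst, row, S * src[col])
    return dst

# Hypothetical 3-cell source field regridded onto a 2-cell destination grid
src = np.array([1.0, 2.0, 3.0])
row = np.array([0, 0, 1, 1])          # destination cell of each weight
col = np.array([0, 1, 1, 2])          # source cell of each weight
S = np.array([0.5, 0.5, 0.5, 0.5])    # weight values (made up for illustration)

dst = apply_weights(src, row, col, S, n_dst=2)
print(dst)
```

Because the weights depend only on the two grids, they can be generated once and reapplied to any number of fields or time steps.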

OCGIS
- OpenClimateGIS (OCGIS) is a standalone Python package enabling dynamic access to and manipulation of high resolution climate data
  - Subsetting, coordinate transformations, temporal averaging, and other computations
  - Data conversions between CSV, Shapefile, GRIDSPEC, and UGRID
- Data conversions between ESMPy and OCGIS bring together GIS capabilities and high performance regridding functionality to create a more unified set of Python tools for Earth system modeling
- One area of interest is connecting high resolution hydrological models with high performance climate models

ESMPy Overview
[Figures: 2D unstructured mesh; FIM unstructured grid; regional grid]
- High performance regridding is applied as a callable Python object
- NumPy array access to distributed data (parallelism for FREE)
- Many regridding methods, including first-order conservative
- Data objects can be created from NetCDF files in standard metadata formats
Supported grids and methods for regridding with ESMPy include:
- Bilinear, higher order patch [1,2], first-order conservative [3], or nearest neighbor regridding
- Global or regional 2D or 3D logically rectangular Grids
- 2D or 3D unstructured Meshes composed of triangles, quadrilaterals, or hexahedrons
- 1D streams of observational data or unconnected sets of points (LocStream)

OpenClimateGIS Overview
- Developed by the NESII Group in association with the NCPP Project, under funding provided by the NOAA Climate Program Office
- Python package designed to ease the “localization” and accessibility of high-dimensional scientific datasets
- Primary features: geospatial subsetting, standardized calculation, bundling, format conversion, access to OpenDAP datasets
- Additional dependencies: GDAL, Shapely, Fiona, netCDF4, osgeo

ESMPy – OCGIS Integration
ESMPy and OCGIS have complementary capabilities:
- OCGIS allows access to and manipulation of high resolution data sets
- ESMPy provides high performance regridding and access to distributed NumPy data
There are several ways to create an integrated workflow:
- OCGIS can preprocess data files and convert between data formats
- The ESMPy Field object is an output format of OCGIS
- ESMPy can read OCGIS outputs (NetCDF) in parallel, for high performance regridding
- OCGIS offers serial regridding using ESMPy
Parallel processing requires clever use of the integrated capabilities:
- OCGIS is implemented and used in single processor mode
- ESMPy is fully parallel IF objects are created in parallel
- Conversion between serial and distributed objects is next

Integrated Workflow Example
1: Preprocess files using OCGIS (subsetting)
2: Read distributed ESMPy objects
3: Compute and apply regridding weights
4: Write parallel object to files for use by downstream applications
The ESMF command line application allows parallel regrid weight generation, with output to file, in a single step.
[Diagram labels: data file; object; processor ID. Green text on the original slide marks steps that can be done in serial or parallel.]

Supported Data Conventions
ESMPy grid files use the following standard data file formats:
- Climate and Forecast (CF) grid conventions
  - UGRID: candidate CF convention for unstructured grids [4], used to represent grids with arbitrary polygons with no gaps
  - GRIDSPEC: accepted CF convention for logically rectangular grids [5]
- SCRIP: Spherical Coordinate Remapping and Interpolation Package [6]; legacy format for 2D logically rectangular or 2D unstructured grids
- ESMF: custom format for unstructured grids, with more efficient storage than SCRIP or CF when used with ESMF codes
OCGIS has a rich set of conversion routines between the following:
- CF grid conventions (above)
- Shapefile: geospatial vector data format used by GIS software [7]
- CSV: comma separated values

Interfaces
ESMPy has objects for data (Field) and the underlying discretization (Grid/Mesh):
- Grid: logically rectangular discretization object
    grid = ESMF.Grid(filename="gridspec.nc", filetype=ESMF.FileFormat.GRIDSPEC)
    grid = ESMF.Grid(max_index=numpy.array([7,8,9]), coord_sys=ESMF.CoordSys.CART)
- Mesh: unstructured mesh discretization object
    mesh = ESMF.Mesh(filename="ugrid.nc", filetype=ESMF.FileFormat.UGRID)
- Field: data object built on a Grid or Mesh, with an optional mask; a derived type of numpy.ndarray
    field = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT, ndbounds=[1, 365, 1])
OCGIS has a very compact interface for a wide range of capabilities:
    ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                              agg_selection=True, prefix='subset_nc', output_format='nc',
                              add_auxiliary_files=False)

Regridding
    r1to2 = Regrid(field1, field2, regrid_method=RegridMethod.CONSERVE)
where: f(phi,theta) = 2 + cos(theta)**2 * cos(2*phi)
- Source grid: fv1.9x2.5_ nc - 1.9x2.5 CAM finite volume grid
- Destination grid: wr50a_ nc - Regional 205x275 grid
- Mean relative error = 3.19E-03
- Maximum relative error = 1.93E-02
- Conservation error = 7.11E-15
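A minimal sketch of how the three error measures could be computed with NumPy, given a regridded field and the analytic solution on the destination grid. Several things here are assumptions rather than content from the slides: phi is taken to be longitude and theta latitude, the conservation error is taken to be the relative difference between area-weighted source and destination totals, and the test data are synthetic (a uniform 0.1% perturbation of the exact field, not an actual regridding result).

```python
import numpy as np

def analytic_field(lon_deg, lat_deg):
    """f(phi, theta) = 2 + cos(theta)**2 * cos(2*phi); phi = longitude, theta = latitude (assumed)."""
    phi = np.radians(lon_deg)
    theta = np.radians(lat_deg)
    return 2.0 + np.cos(theta) ** 2 * np.cos(2.0 * phi)

def regrid_errors(regridded, exact, src_field, src_area, dst_area):
    """Return (mean relative error, maximum relative error, conservation error)."""
    rel = np.abs(regridded - exact) / np.abs(exact)
    src_mass = np.sum(src_field * src_area)   # area-weighted total on the source grid
    dst_mass = np.sum(regridded * dst_area)   # area-weighted total on the destination grid
    conservation = abs(dst_mass - src_mass) / abs(dst_mass)
    return rel.mean(), rel.max(), conservation

# Synthetic check: identical source and destination grids, regridded field off by 0.1%
lon, lat = np.meshgrid(np.linspace(0, 350, 36), np.linspace(-80, 80, 17))
exact = analytic_field(lon, lat)
area = np.cos(np.radians(lat))                # relative cell areas on a lat-lon grid
regridded = exact * (1.0 + 1.0e-3)
mean_rel, max_rel, cons = regrid_errors(regridded, exact, exact, area, area)
print(mean_rel, max_rel, cons)
```

The analytic field never drops below 1, so dividing by the exact value is safe for this particular test function.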

Conservative Regridding
- Conservative regridding is important in Earth system modeling to preserve the total integral of a field throughout the operation (e.g. water content)
- The algorithm used by ESMF computes the interpolation weight between cell i on the source grid and cell j on the destination grid using w_ij = f_ij * A_i / A_j, where f_ij is the fraction of the source cell contributing to the destination cell and A_i and A_j are the relative areas of the source and destination cells
- Options exist for:
  - Using internally computed (default) or user supplied areas
  - Computing areas and distances using great-circle (default) or straight line distances on the surface of the sphere
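A one-dimensional illustration of first-order conservative weights, assuming the weight takes the form w_ij = f_ij * A_i / A_j, with f_ij the fraction of source cell i that overlaps destination cell j. This is a toy sketch on line segments, not the ESMF algorithm for spherical grids; the function name and the example grids are hypothetical.

```python
import numpy as np

def conservative_weights_1d(src_edges, dst_edges):
    """Return (weights[j, i], src_area, dst_area) for 1D cells defined by their edges."""
    src_area = np.diff(src_edges)
    dst_area = np.diff(dst_edges)
    # overlap[j, i]: length of the intersection of destination cell j and source cell i
    lo = np.maximum(dst_edges[:-1, None], src_edges[None, :-1])
    hi = np.minimum(dst_edges[1:, None], src_edges[None, 1:])
    overlap = np.clip(hi - lo, 0.0, None)
    frac = overlap / src_area[None, :]                       # f_ij
    weights = frac * src_area[None, :] / dst_area[:, None]   # w_ij = f_ij * A_i / A_j
    return weights, src_area, dst_area

src_edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # four unit-width source cells
dst_edges = np.array([0.0, 1.5, 4.0])             # two uneven destination cells
src = np.array([1.0, 2.0, 3.0, 4.0])

W, src_area, dst_area = conservative_weights_1d(src_edges, dst_edges)
dst = W @ src

# The area-weighted totals agree, which is the conservation property
print(np.sum(src * src_area), np.sum(dst * dst_area))
```

The totals match because each source cell's fractions sum to one over the destination cells, so every unit of the source integral is assigned to exactly one destination cell.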

Enabling Hydrological Studies
- Hydrological impact studies can be improved when forced with data from climate models; hydrological feedbacks can affect climate
- A technology and scale gap exists:
  - Many hydrological models have limited scalability, run on desktop computers, and have watershed-sized domains
  - Many climate models are highly parallel, run on high performance supercomputers, and have global domains
- However, scales are slowly converging (e.g. high resolution climate models, hydrological systems of greater extent)
  - This gives scientists opportunities to explore new coupled model configurations and modes of coupling
  - This gives programmers opportunities to develop tools to handle this coupling interface

High Resolution Data
- Task: Subset high resolution climate precipitation data to local scale and then regrid to catchment basins
- Source data: CF formatted precipitation data file for the continental United States on a logically rectangular grid (nldas_met_update.obs.daily.pr.1990.nc)
- Output: Multi-dimensional precipitation values (including time) on a subset of catchment basins in the region of interest after conservative regridding

High Performance Results
Conservative regridding result with CONUS NHDPlus catchments using exact solution:
- Test done on IBM iDataPlex (Yellowstone) with 128 and 256 cores
- Source grid has 2,647,454 elements with up to nodes
- Weight file generation takes minutes, application takes seconds

Status and Future Work
Both ESMPy and OCGIS are in production and fully supported.
Upcoming development:
- Read and write ESMF formatted weight files
- Write ESMF Fields in parallel
- Seamless conversions between serial and distributed objects in ESMPy
- Python 3 support

Requirements, Supported Platforms, Limitations, etc.
Supported Platforms:
- Linux, Darwin, and Cray
- Gfortran
- OpenMP
- Linux, Darwin, Windows
Requirements:
ESMPy:
- Python 2.6, 2.7
- NumPy 1.6.1/2 (ctypes)
- ESMF installation (with NetCDF)
OCGIS (additional dependencies):
- netCDF4
- Shapely
- Fiona
- osgeo
Testing:
- Nightly regression testing
- Travis CI integration
Installation:
- ESMPy: python setup.py build --ESMFMKFILE= install
- OCGIS: python setup.py install
- With conda: conda install -c conda-forge esmpy ocgis

Selected Users
- UV-CDAT (PCMDI): Ultrascale Visualization Climate Data Analysis Tools
- cf-python (University of Reading): Implementation of the CF data model for reading, writing, and processing of data and metadata
- Iris (Met Office): Python library for visualizing meteorological and oceanographic data sets
- PyFerret (NOAA): Python based interactive visualization and analysis environment
- Community Surface Dynamics Modeling System (CU-Boulder): Tools for hydrological and other surface modeling processes
- OCGIS, climate4impact portal (IS-ENES): Tools for climate modelers to tailor high resolution climate data
- OCGIS, ClimatePipes (Kitware): User-friendly data access, manipulation, analysis, and visualization of community climate models

Contact Us!
Website: https://earthsystemcog.org/projects/esmpy/
References:
1. Khoei S.A., Gharehbaghi A.R., The superconvergent patch recovery technique and data transfer operators in 3D plasticity problems. Finite Elements in Analysis and Design, 43(8).
2. Hung K.C., Gu H., Zong Z., A modified superconvergent patch recovery method and its application to large deformation problems. Finite Elements in Analysis and Design, 40(5-6).
3. D. Ramshaw, Conservative rezoning algorithm for generalized two-dimensional meshes. Journal of Computational Physics, 59.
4. UGRID documentation: accessed Dec. 19
5. GridSpec whitepaper: https://ice.txcorp.com/trac/modave/wiki/CFProposalGridspec, accessed Dec. 19, 2014
6. Jones, P.W., SCRIP: A Spherical Coordinate Remapping and Interpolation Package. Los Alamos National Laboratory Software Release LACC
7. Shapefile whitepaper: accessed Dec. 19, 2014

Jupyter Notebooks

ESMPy Regridding

Plotting the solution with matplotlib shows error on the order of 10^-7

OCGIS Utilities Tech-Stack ipynb


Implementation Details

ctypes bindings to ESMF
ESMPy is connected to ESMF using ctypes bindings to the C interface.
Allocating NumPy array buffers for memory allocated in ESMF:
    buffer = numpy.core.multiarray.int_asbuffer(
        ctypes.addressof(pointer.contents),
        numpy.dtype(ESMF2PythonType[self.type]).itemsize*size)
    array = numpy.frombuffer(buffer, ESMF2PythonType[self.type])
Interfacing with ctypes:
    _ESMF.ESMC_GridGetCoord.restype = ctypes.POINTER(ctypes.c_void_p)
    _ESMF.ESMC_GridGetCoord.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_uint,
                                        numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                        numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                        ctypes.POINTER(ctypes.c_int)]
    gridCoordPtr = _ESMF.ESMC_GridGetCoord(grid.struct.ptr, coordDim, staggerloc,
                                           exclusiveLBound, exclusiveUBound,
                                           ctypes.byref(lrc))
    # adjust bounds to be 0 based
    exclusiveLBound = exclusiveLBound - 1
Switching between Fortran and C array striding:
    array = numpy.reshape(array, self.size_local[stagger], order='F')
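The same idea, a zero-copy NumPy view of memory owned by a C library, can be sketched without ESMF. The example below swaps in numpy.ctypeslib.as_array in place of int_asbuffer, which is only available under Python 2; the ctypes array here is a stand-in for memory that ESMF would allocate, and the sizes and values are made up.

```python
import ctypes
import numpy as np

# Stand-in for memory owned by a C library: six doubles holding 0.0 .. 5.0
size = 6
c_buffer = (ctypes.c_double * size)(*range(size))
pointer = ctypes.cast(c_buffer, ctypes.POINTER(ctypes.c_double))

# Wrap the memory as a NumPy array without copying it;
# a shape is required when converting from a ctypes POINTER
array = np.ctypeslib.as_array(pointer, shape=(size,))

# Reinterpret the flat buffer as a 2x3 array in Fortran (column-major) order,
# matching how a Fortran library lays out a 2D array
array_f = np.reshape(array, (2, 3), order='F')

# A write through the NumPy view is visible in the underlying C buffer:
# element [1, 0] is the second value in memory under column-major ordering
array_f[1, 0] = 42.0
print(c_buffer[1], array_f[0, 1])
```

The view does not own the memory, so the C-side allocation has to outlive every NumPy array that wraps it.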