ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation Ryan O’Kuinghttons, Ben Koziol, Robert.

1 ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation
Ryan O’Kuinghttons, Ben Koziol, Robert Oehmke, Cecelia DeLuca, Gerhard Theurich, Peggy Li, Joseph Jacob
Cooperative Institute for Research in Environmental Sciences
NOAA Environmental Software Infrastructure and Interoperability Project
American Meteorological Society Annual Meeting, Phoenix, Arizona, January 5, 2015

2 Introduction
- ESMPy offers access to the remapping functionality and other related features of the Earth System Modeling Framework (ESMF)
  - Transforms data from one grid to another by generating and applying remapping weights (a.k.a. regridding or interpolation)
  - Supports structured and unstructured, global and regional, 2D and 3D grids, created from file or in memory, with many options
  - Fully parallel and highly scalable
- OpenClimateGIS (OCGIS) is a standalone Python package enabling dynamic access to and manipulation of high resolution climate data
  - Subsetting, coordinate transformations, temporal averaging, computations
  - Data format conversions between CSV, Shapefile, Gridspec, and UGRID
- Data type conversions between ESMPy and OCGIS bring together GIS capabilities with high performance regridding functionality to create a more unified set of Python tools for Earth system modeling
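
The idea of "generating and applying remapping weights" can be illustrated with a minimal pure-Python sketch. It does not use ESMPy; the triplet layout, the function name `apply_weights`, and the weight values are assumptions chosen only for illustration. The point is that the weights depend on the two grids alone, so they can be generated once and reused for every time step.

```python
def apply_weights(weights, src_values, n_dst):
    """Apply sparse remapping weights to one source field.

    weights is a list of (dst_index, src_index, weight) triplets
    (a hypothetical layout, not ESMF's internal storage).
    """
    dst_values = [0.0] * n_dst
    for dst_index, src_index, weight in weights:
        dst_values[dst_index] += weight * src_values[src_index]
    return dst_values


# Hypothetical weights: each of 2 destination cells averages 2 source cells.
weights = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 0.5), (1, 3, 0.5)]

# The same weights are reused for every time step of the source data.
src_time_steps = [
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 4.0, 8.0],
]
dst_time_steps = [apply_weights(weights, step, 2) for step in src_time_steps]
print(dst_time_steps)
```

Separating weight generation from weight application is what makes repeated regridding of long time series cheap: only the sparse multiply is repeated.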

3 ESMPy Overview
[Figures: 2D Unstructured Mesh (from www.ngdc.noaa.gov); FIM Unstructured Grid; Regional Grid]
- High performance regridding is applied as a callable Python object
- Numpy array access to distributed data (parallelism for FREE)
- Many regridding methods, including first-order conservative
- Data objects can be created from NetCDF files in standard formats
Supported grids and methods for regridding with ESMPy include:
- Bilinear, higher order patch [1,2], first order conservative, or nearest neighbor regridding
- Global or regional 2D or 3D logically rectangular grids
- 2D unstructured meshes composed of triangles or quadrilaterals
  - Polygons with more than 4 sides are coming soon, supported from file now
- 3D unstructured meshes composed of hexahedrons
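
To make "first-order conservative" concrete, here is a one-dimensional sketch in plain Python. It is not ESMF's algorithm (which works on 2D and 3D cells on the sphere); it assumes cells are simple intervals and that each weight is the fraction of a destination cell covered by a source cell. The function names are hypothetical.

```python
def conservative_weights_1d(src_edges, dst_edges):
    """Return (dst_index, src_index, weight) triplets for 1D interval cells.

    Each weight is the fraction of the destination cell that is covered
    by the source cell (area-overlap weighting).
    """
    weights = []
    for j in range(len(dst_edges) - 1):
        dst_lo, dst_hi = dst_edges[j], dst_edges[j + 1]
        for i in range(len(src_edges) - 1):
            overlap = min(dst_hi, src_edges[i + 1]) - max(dst_lo, src_edges[i])
            if overlap > 0.0:
                weights.append((j, i, overlap / (dst_hi - dst_lo)))
    return weights


def integral(edges, values):
    """Sum of cell value times cell width over all cells."""
    return sum(v * (edges[k + 1] - edges[k]) for k, v in enumerate(values))


src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]   # four source cells of width 1
dst_edges = [0.0, 1.5, 4.0]             # two destination cells
src_values = [1.0, 2.0, 3.0, 4.0]

weights = conservative_weights_1d(src_edges, dst_edges)
dst_values = [0.0] * (len(dst_edges) - 1)
for j, i, w in weights:
    dst_values[j] += w * src_values[i]

src_total = integral(src_edges, src_values)
dst_total = integral(dst_edges, dst_values)
print(dst_values, src_total, dst_total)
```

The integral of the field is the same before and after remapping (up to rounding), which is the property that distinguishes conservative regridding from bilinear or nearest-neighbor interpolation.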

4 OpenClimateGIS Overview
- Developed by the NESII Group in association with the NCPP Project under funding provided by the NOAA Climate Program Office
- Python package designed to ease the “localization” and accessibility of high-dimensional scientific datasets
- Primary Features: geospatial subsetting, standardized calculation, bundling, format conversion, access to OpenDAP datasets
- Additional dependencies: GDAL, Shapely, Fiona, netCDF4, osgeo
https://www.earthsystemcog.org/projects/openclimategis/
https://github.com/NCPP/ocgis

5 ESMPy – OCGIS Integration
Data object converters allow near seamless integration of capabilities from both packages
- OCGIS allows access to and manipulation of high resolution data sets
- ESMPy provides high performance regridding and access to distributed numpy data
Shared capabilities are useful for an integrated workflow
- OCGIS can preprocess data files and convert between data formats
- Allow ESMPy to create parallel objects from files processed by OCGIS
  ** Grid and Mesh only, reading distributed Fields from file is expected in summer 2015
- Allow ESMPy outputs to be used in GIS (and other) software
- ESMPy can create conservative regridding weights for OCGIS computations
Parallel processing requires clever use of integrated capabilities…
- OCGIS is implemented and used in single processor mode
  - Parallel IO is coming soon (summer 2015)
- ESMPy is fully parallel IF objects are created in parallel
- Conversion between serial and distributed objects is next

6 Integrated Workflow Example
Conversion from distributed to serial data objects is scheduled for the next ESMPy release (summer 2015)
1: Preprocess files using OCGIS (subsetting)
2: Read distributed ESMPy objects
3: Compute and apply regridding weights
4: Convert output to serial object and write to file using OCGIS
[Figure: data file and objects distributed across processors 0-3, labeled by object processor ID; final serial object on processor 0]
** Green text indicates steps that can be done in serial or parallel
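
The distributed-to-serial conversion in this workflow can be pictured with a small pure-Python sketch. It involves no MPI and is not how ESMF decomposes data; it simply assumes a contiguous, nearly equal block decomposition across four hypothetical processors, followed by a gather onto processor 0. The name `block_bounds` is made up for the example.

```python
def block_bounds(n_items, n_procs):
    """Split range(n_items) into n_procs contiguous, nearly equal blocks."""
    base, extra = divmod(n_items, n_procs)
    bounds = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks each take one additional item.
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


data = list(range(10))                       # stand-in for a serial data file
bounds = block_bounds(len(data), 4)          # one (lower, upper) pair per rank
local_pieces = [data[lo:hi] for lo, hi in bounds]   # "distributed" objects

# "Gather": concatenate the local pieces, in rank order, on processor 0.
serial = [value for piece in local_pieces for value in piece]
print(bounds, serial)
```

Each rank only ever touches its own slice, which is why steps 2 and 3 can run in parallel while step 4 needs the pieces brought back together.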

7 Supported Data Conventions
ESMPy can read grid files in the following standard data file formats (in parallel!):
- Climate and Forecast (CF) grid conventions
  - UGRID - candidate CF convention for unstructured grids [3], used to represent grids with arbitrary polygons with no gaps
  - GRIDSPEC - accepted CF convention for logically rectangular grids [4]
- SCRIP - Spherical Coordinate Remapping and Interpolation Package [5]
  - Legacy format for 2D logically rectangular or 2D unstructured grids
- ESMF
  - Custom format for unstructured grids, more efficient storage than SCRIP or CF when used with ESMF codes
OCGIS has a rich set of conversion routines between the following:
- CF grid conventions (above)
- Shapefile - geospatial vector data format used by GIS software [6]
- CSV - comma separated value

8 Related Interfaces
ESMPy has objects for data (Field) and underlying distribution (Grid/Mesh):
- Grid - logically rectangular discretization object
    grid = ESMF.Grid(filename="gridspec.nc", filetype=ESMF.FileFormat.GRIDSPEC)
    grid = ESMF.Grid(max_index=numpy.array([7,8,9]), coord_sys=ESMF.CoordSys.CART)
- Mesh - unstructured mesh discretization object
    mesh = ESMF.Mesh(filename="ugrid.nc", filetype=ESMF.FileFormat.UGRID)
- Field - data object built on a grid or mesh, with optional mask (derived type of MaskedArray)
    field = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT,
                       ndbounds=[1, 365, 1])
OCGIS has a very compact interface for a wide range of capabilities:
    ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                              agg_selection=True, prefix='subset_nc', output_format='nc',
                              add_auxiliary_files=False)
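
Since the Field is described as a derived type of MaskedArray, a short NumPy-only sketch may help show what masked semantics provide. This does not create an ESMPy Field; the values, the fill value 1.0e20, and the variable names are assumptions for illustration.

```python
import numpy as np

# Hypothetical data with one missing cell marked by a large fill value.
values = np.array([280.0, 285.0, 1.0e20, 290.0])
mask = np.array([False, False, True, False])   # True means "ignore this cell"

field_data = np.ma.MaskedArray(values, mask=mask)

# Reductions skip masked cells, so the fill value does not pollute the result.
valid_mean = field_data.mean()
n_valid = field_data.count()
print(valid_mean, n_valid)
```

Masked cells stay in the array, so its shape still matches the grid, but they are excluded from statistics.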

9 Regridding
    r1to2 = Regrid(field1, field2, regrid_method=RegridMethod.CONSERVE)
where: f(phi,theta) = 2 + cos(theta)**2 * cos(2*phi)
Source grid: fv1.9x2.5_050503.nc - 1.9x2.5 CAM finite volume grid
Destination grid: wr50a_090614.nc - Regional 205x275 grid
Mean relative error = 3.19E-03
Maximum relative error = 1.93E-02
Conservation error = 7.11E-15
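
The slide does not spell out how the three error measures are defined. The sketch below shows one common set of definitions, stated here as assumptions: relative error is |regridded - exact| / |exact| per destination cell, and conservation error is the relative difference between area-weighted totals on the two grids. All sample numbers are made up.

```python
import math


def analytic(phi, theta):
    """Analytic test field from the slide; always between 1 and 3."""
    return 2.0 + math.cos(theta) ** 2 * math.cos(2.0 * phi)


def relative_errors(regridded, exact):
    """Return (mean, max) of |regridded - exact| / |exact| over all cells."""
    rel = [abs(r - e) / abs(e) for r, e in zip(regridded, exact)]
    return sum(rel) / len(rel), max(rel)


def conservation_error(src_values, src_areas, dst_values, dst_areas):
    """Relative difference between area-weighted totals (assumed definition)."""
    src_mass = sum(v * a for v, a in zip(src_values, src_areas))
    dst_mass = sum(v * a for v, a in zip(dst_values, dst_areas))
    return abs(dst_mass - src_mass) / abs(src_mass)


# Exact values at two hypothetical destination points.
exact = [analytic(0.0, 0.0), analytic(math.pi / 2.0, 0.0)]
# Made-up regridded values, slightly off from the exact ones.
regridded = [3.03, 1.02]

mean_rel, max_rel = relative_errors(regridded, exact)
cons_err = conservation_error([1.0, 2.0], [1.0, 1.0], [1.5], [2.0])
print(mean_rel, max_rel, cons_err)
```

Under definitions like these, a conservation error near machine precision, as reported on the slide, indicates that the area-weighted total is preserved even when pointwise errors are on the order of a percent.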

10 File Manipulation with OCGIS
Subset a high resolution precipitation dataset in CF format:
    import ocgis
    PATH_PR = 'nldas_met_update.obs.daily.pr.1990.nc'
    rd = ocgis.RequestDataset(uri=PATH_PR)
    ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                              agg_selection=True, prefix='subset_nc', output_format='nc',
                              add_auxiliary_files=False)
    subset_nc = ops.execute()
Create an ESMPy Field from a subsetted OCGIS dataset:
    ops = ocgis.OcgOperations(dataset={'uri': subset_nc}, output_format='esmpy')
    efield = ops.execute()

11 NFIE Demo
National Flood Interoperability Experiment (NFIE) – under the Office of Hydrologic Development at the National Weather Service
- Operational by 2015, total water prediction by 2020
- Asked ESMPy and OCGIS to subset high resolution climate precipitation data to local scale and then regrid it to local maps of water catchment basins
Source data: CF formatted precipitation data file for the continental United States (nldas_met_update.obs.daily.pr.1990.nc)
Output: Multi-dimensional precip values (including time) on a subset of 3 catchment basins in the region of interest, after generation and application of conservative regrid weights

12 NFIE Demo Code
    import ESMF
    import ocgis
Convert a subsetted NetCDF file to an ESMPy Field
    ops = ocgis.OcgOperations(dataset={'uri': subset_nc}, output_format='esmpy')
    srcfield = ops.execute()
Create an ESMPy Mesh and destination Field from UGRID file
    dstgrid = ESMF.Mesh(filename=ugridnc, filetype=ESMF.FileFormat.UGRID)
    dstfield = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT,
                          ndbounds=[1, 365, 1])
Create an object to regrid data from the source to the destination field
    regrid = ESMF.Regrid(srcfield, dstfield, regrid_method=ESMF.RegridMethod.CONSERVE)
Regrid from source to destination field
    dstfield = regrid(srcfield, dstfield)

13 Requirements, Supported Platforms, Limitations, etc...
Supported Platforms:
- Linux, Darwin, and Cray
- Gfortran
- OpenMPI
Requirements:
ESMPy:
- Python 2.6, 2.7
- Numpy 1.6.1/2 (ctypes)
- ESMF installation (with NetCDF)
Additional Dependencies for OCGIS:
- netCDF4
- Shapely
- Fiona
- osgeo
Testing:
- Regression tested nightly on 5 platforms
Installation:
- ESMPy: python setup.py build --ESMFMKFILE= install
- OCGIS: python setup.py install

14 Status and Future Work
- ESMPy is still in beta, production release expected February 2015
- OCGIS is in production and fully supported!
Upcoming development:
- OpenClimateGIS support for distributed ESMPy data
- Data type for observational data (point clouds, etc.) and regridding to/from
- Seamless conversions between serial and distributed objects in ESMPy
- Python 3 support
- Update to UV-CDAT (currently using older ESMP interface)
- Components for rapid prototyping of Earth System Models?!

15 Current Users
- UV-CDAT (PCMDI) – Ultrascale Visualization Climate Data Analysis Tools
- cfpython (University of Reading) – Implementation of the CF data model for reading, writing and processing of data and metadata
- Iris (Met Office) – Python library for visualizing meteorological and oceanographic data sets
- PyFerret (NOAA) – Python based interactive visualization and analysis environment
- Community Surface Dynamics Modeling System (CU-Boulder) – Tools for hydrological and other surface modeling processes
- OCGIS – climate4impact portal (IS-ENES): Tools for climate modelers to tailor high resolution climate data
- OCGIS – ClimatePipes (Kitware): User-friendly data access, manipulation, analysis and visualization of community climate models

16 Questions?
References:
1. Khoei S.A., Gharehbaghi A.R., The superconvergent patch recovery technique and data transfer operators in 3d plasticity problems. Finite Elements in Analysis and Design, 43(8), 2007.
2. Hung K.C., Gu H., Zong Z., A modified superconvergent patch recovery method and its application to large deformation problems. Finite Elements in Analysis and Design, 40(5-6), 2004.
3. UGRID documentation: https://github.com/ugrid-conventions/ugrid-conventions, accessed Dec. 19, 2014
4. GridSpec whitepaper: https://ice.txcorp.com/trac/modave/wiki/CFProposalGridspec, accessed Dec. 19, 2014
5. Jones, P.W. SCRIP: A Spherical Coordinate Remapping and Interpolation Package. http://www.acl.lanl.gov/climate/software/SCRIP. Los Alamos National Laboratory Software Release LACC 98-45
6. Shapefile whitepaper: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf, accessed Dec. 19, 2014
Email: esmf_support@list.woc.noaa.gov or ocgis_support@list.woc.noaa.gov
Website: https://earthsystemcog.org/projects/esmpy/ or https://earthsystemcog.org/projects/openclimategis/

17 Additional material

18 OCGIS Computation
● Framework designed to accommodate a variety of climate indices and metrics:
  ○ Temporally grouped functions → monthly means, annual maximums, durations
  ○ String-based functions → ‘diff=tasmax-tasmin’
  ○ Simple transforms → natural logarithm
  ○ Multivariate functions → heat indices
● Goal is to provide a simplified method for introducing new indices and a straightforward, timely method for documentation (currently works with the Sphinx Python documentation system)
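
Two of the computation types listed above can be sketched in plain Python: an element-wise difference such as tasmax minus tasmin, followed by a temporally grouped monthly mean. This is not the OCGIS calculation API; the function name `monthly_means` and the sample temperatures are invented for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_means(dates, values):
    """Group values by (year, month) and return the mean of each group."""
    groups = defaultdict(list)
    for day, value in zip(dates, values):
        groups[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


# Made-up daily maximum and minimum temperatures at one location.
dates = [date(1990, 1, 1), date(1990, 1, 2), date(1990, 2, 1)]
tasmax = [10.0, 12.0, 15.0]
tasmin = [2.0, 4.0, 5.0]

# Equivalent in spirit to the string-based function 'diff=tasmax-tasmin'.
diff = [hi - lo for hi, lo in zip(tasmax, tasmin)]

# Temporally grouped function: one mean per calendar month.
means = monthly_means(dates, diff)
print(means)
```

Keeping the grouping key separate from the reduction makes it easy to swap in other groupings (annual, seasonal) or other reductions (maximum, duration counts).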

19 ctypes bindings to ESMF
Allocating Numpy array buffers for memory allocated in ESMF:
    # int_asbuffer is only available in NumPy builds for Python 2
    buffer = numpy.core.multiarray.int_asbuffer(
        ctypes.addressof(pointer.contents),
        numpy.dtype(ESMF2PythonType[self.type]).itemsize*size)
    array = numpy.frombuffer(buffer, ESMF2PythonType[self.type])
Interfacing with ctypes:
    _ESMF.ESMC_GridGetCoord.restype = ctypes.POINTER(ctypes.c_void_p)
    # one ndpointer per bounds array (exclusiveLBound, exclusiveUBound)
    _ESMF.ESMC_GridGetCoord.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_uint,
                                        numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                        numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                        ctypes.POINTER(ctypes.c_int)]
    gridCoordPtr = _ESMF.ESMC_GridGetCoord(grid.struct.ptr, coordDim, staggerloc,
                                           exclusiveLBound, exclusiveUBound,
                                           ctypes.byref(lrc))
    # adjust bounds to be 0 based
    exclusiveLBound = exclusiveLBound - 1
Switching between Fortran and C array striding:
    array = numpy.reshape(array, self.size_local[stagger], order='F')
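
The buffer-wrapping and striding steps above can be tried without ESMF. The sketch below substitutes a ctypes array for library-owned memory and uses `numpy.frombuffer` directly on it, which is an alternative to the `int_asbuffer` call on the slide rather than ESMPy's actual code. The array size and values are arbitrary.

```python
import ctypes

import numpy as np

# Stand-in for memory owned by a C/Fortran library: six contiguous doubles,
# laid out as a Fortran routine would store a 2x3 array (column-major).
n_rows, n_cols = 2, 3
c_buffer = (ctypes.c_double * (n_rows * n_cols))(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

# Wrap the existing memory as a NumPy array without copying it.
flat = np.frombuffer(c_buffer, dtype=np.float64)

# Interpret the flat buffer with Fortran (column-major) striding.
as_fortran = flat.reshape((n_rows, n_cols), order="F")
# For comparison: the same buffer read with C (row-major) striding.
as_c = flat.reshape((n_rows, n_cols), order="C")

# Writes through the NumPy view land in the original buffer.
as_fortran[0, 1] = 30.0
print(as_fortran)
print(as_c)
print(c_buffer[2])
```

Because both reshapes are views of the same memory, choosing `order='F'` changes only how indices map to memory locations, not the data itself.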

