
ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation

ESMPy offers access to the remapping functionality of ESMF, while OpenClimateGIS enables dynamic access to and manipulation of high-resolution climate data. Together, they provide a unified set of Python tools for Earth system modeling.



Presentation Transcript


  1. ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation
     Ryan O’Kuinghttons, Ben Koziol, Robert Oehmke, Cecelia DeLuca, Gerhard Theurich, Peggy Li, Joseph Jacob
     Cooperative Institute for Research in Environmental Sciences
     NOAA Environmental Software Infrastructure and Interoperability Project
     American Meteorological Society Annual Meeting, Phoenix, Arizona, January 5, 2015

  2. Introduction
     • ESMPy offers access to the remapping functionality and other related features of the Earth System Modeling Framework (ESMF)
       • Transforms data from one grid to another by generating and applying remapping weights (a.k.a. regridding or interpolation)
       • Supports structured and unstructured, global and regional, 2D and 3D grids, created from file or in memory, with many options
       • Fully parallel and highly scalable
     • OpenClimateGIS (OCGIS) is a standalone Python package enabling dynamic access to and manipulation of high-resolution climate data
       • Subsetting, coordinate transformations, temporal averaging, computations
       • Data format conversions between CSV, Shapefile, Gridspec, and UGRID
     • Data type conversions between ESMPy and OCGIS bring together GIS capabilities with high performance regridding functionality to create a more unified set of Python tools for Earth system modeling
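Applying remapping weights, as mentioned on this slide, amounts to a sparse matrix-vector product: each destination value is a weighted sum of source values. The sketch below illustrates that idea in plain Python. It assumes the weights are stored as (destination index, source index, weight) triplets; the function name `apply_weights` and the sample numbers are hypothetical and are not part of ESMPy.

```python
def apply_weights(weights, src, n_dst):
    """Apply sparse remapping weights: dst[i] = sum over j of w[i, j] * src[j]."""
    dst = [0.0] * n_dst
    for i, j, w in weights:
        dst[i] += w * src[j]
    return dst


# Hypothetical example: 4 source cells averaged pairwise onto 2 destination cells.
weights = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 0.5), (1, 3, 0.5)]
src = [1.0, 3.0, 5.0, 7.0]
dst = apply_weights(weights, src, n_dst=2)
print(dst)
```

Because the weights depend only on the two grids, they can be generated once and reused for every time step or variable defined on the same grid pair.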

  3. ESMPy Overview
     • High performance regridding is applied as a callable Python object
     • Numpy array access to distributed data (parallelism for FREE)
     • Many regridding methods, including first-order conservative
     • Data objects can be created from NetCDF files in standard formats
     • Supported grids and methods for regridding with ESMPy include:
       • Bilinear, higher order patch [1,2], first order conservative, or nearest neighbor regridding
       • Global or regional 2D or 3D logically rectangular grids
       • 2D unstructured meshes composed of triangles or quadrilaterals
         • Polygons with more than 4 sides are coming soon, supported from file now
       • 3D unstructured meshes composed of hexahedrons
     [Figure labels: 2D Unstructured Mesh (from www.ngdc.noaa.gov), Regional Grid, FIM Unstructured Grid]

  4. OpenClimateGIS Overview
     • Python package designed to ease the “localization” and accessibility of high-dimensional scientific datasets
     • Primary features: geospatial subsetting, standardized calculation, bundling, format conversion, access to OPeNDAP datasets
     • Additional dependencies:
       • GDAL, Shapely, Fiona, netCDF4, osgeo
     Developed by the NESII Group in association with the NCPP Project under funding provided by the NOAA Climate Program Office.
     https://www.earthsystemcog.org/projects/openclimategis/
     https://github.com/NCPP/ocgis

  5. ESMPy – OCGIS Integration
     • Data object converters allow near-seamless integration of capabilities from both packages
       • OCGIS allows access to and manipulation of high-resolution data sets
       • ESMPy provides high performance regridding and access to distributed numpy data
     • Shared capabilities are useful for an integrated workflow
       • OCGIS can preprocess data files and convert between data formats
       • Allow ESMPy to create parallel objects from files processed by OCGIS
         • ** Grid and Mesh only, reading distributed Fields from file is expected in summer 2015
       • Allow ESMPy outputs to be used in GIS (and other) software
       • ESMPy can create conservative regridding weights for OCGIS computations
     • Parallel processing requires clever use of integrated capabilities…
       • OCGIS is implemented and used in single processor mode
         • Parallel IO is coming soon (summer 2015)
       • ESMPy is fully parallel IF objects are created in parallel
         • Conversion between serial and distributed objects is next

  6. Integrated Workflow Example
     ** Green text indicates steps that can be done in serial or parallel
     1: Preprocess files using OCGIS (subsetting)
     2: Read distributed ESMPy objects
     3: Compute and apply regridding weights
     4: Convert output to serial object and write to file using OCGIS
     Conversion from distributed to serial data objects is scheduled for the next ESMPy release (summer 2015)
     [Figure: data files and objects labeled by processor ID (0, 1, 2, 3), with the serial output object on processor 0]
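The processor IDs in this workflow refer to pieces of one object held by different processes. The sketch below shows one common way such a split can be made, a contiguous block decomposition, followed by a gather back to a single serial list. It is only an illustration under that assumption: the helper `block_bounds` is hypothetical, and ESMF's actual decomposition may differ.

```python
def block_bounds(n_items, n_procs, rank):
    """Return the half-open [start, stop) index range owned by `rank`."""
    base, extra = divmod(n_items, n_procs)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Hypothetical case: 10 grid rows spread over 4 processors.
n_rows, n_procs = 10, 4
pieces = [block_bounds(n_rows, n_procs, rank) for rank in range(n_procs)]
print(pieces)

# "Gathering" to a serial object is conceptually a concatenation in rank order.
local_rows = [list(range(start, stop)) for start, stop in pieces]
gathered = [row for part in local_rows for row in part]
```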

  7. Supported Data Conventions
     ESMPy grid files use the following standard data file formats (in parallel!):
     • Climate and Forecast (CF) grid conventions
       • UGRID - candidate CF convention for unstructured grids [3], used to represent grids with arbitrary polygons with no gaps
       • GRIDSPEC – accepted CF convention for logically rectangular grids [4]
     • SCRIP – Spherical Coordinate Remapping and Interpolation Package [5]
       • Legacy format for 2D logically rectangular or 2D unstructured grids
     • ESMF
       • Custom format for unstructured grids, more efficient storage than SCRIP or CF when used with ESMF codes
     OCGIS has a rich set of conversion routines between the following:
     • CF grid conventions (above)
     • Shapefile – geospatial vector data format used by GIS software [6]
     • CSV – comma separated value

  8. Related Interfaces
     ESMPy has objects for data (Field) and underlying distribution (Grid/Mesh):
     • Grid - logically rectangular discretization object
         grid = ESMF.Grid(filename="gridspec.nc", filetype=ESMF.FileFormat.GRIDSPEC)
         grid = ESMF.Grid(max_index=numpy.array([7,8,9]), coord_sys=ESMF.CoordSys.CART)
     • Mesh - unstructured mesh discretization object
         mesh = ESMF.Mesh(filename="ugrid.nc", filetype=ESMF.FileFormat.UGRID)
     • Field – data object built on a grid or mesh with optional mask
       • derived type of MaskedArray
         field = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT, ndbounds=[1, 365, 1])
     OCGIS has a very compact interface for a wide range of capabilities:
         ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                                   agg_selection=True, prefix='subset_nc', output_format='nc',
                                   add_auxiliary_files=False)

  9. Regridding
         r1to2 = Regrid(field1, field2, regrid_method=RegridMethod.CONSERVE)
     where: f(phi,theta) = 2 + cos(theta)**2 * cos(2*phi)
     Source grid: fv1.9x2.5_050503.nc - 1.9x2.5 CAM finite volume grid
     Destination grid: wr50a_090614.nc - Regional 205x275 grid
     Mean relative error = 3.19E-03
     Maximum relative error = 1.93E-02
     Conservation error = 7.11E-15
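The three error figures on this slide can be read as comparisons between a regridded field and the analytic function evaluated directly on the destination grid. The sketch below shows one plausible way to compute such metrics. The definitions used here (cell-wise relative error, and relative mismatch of area-weighted integrals for conservation) are assumptions rather than the formulas behind the slide's numbers, the interpretation of phi and theta as longitude and latitude in radians is also assumed, and all sample points are hypothetical.

```python
import math


def analytic_field(phi, theta):
    """Test function from the slide: f = 2 + cos(theta)**2 * cos(2*phi)."""
    return 2.0 + math.cos(theta) ** 2 * math.cos(2.0 * phi)


def relative_errors(regridded, exact):
    """Return (mean, max) of |regridded - exact| / |exact| over all cells."""
    errs = [abs(r - e) / abs(e) for r, e in zip(regridded, exact)]
    return sum(errs) / len(errs), max(errs)


def conservation_error(src, src_area, dst, dst_area):
    """Relative mismatch between area-integrated source and destination fields."""
    src_mass = sum(v * a for v, a in zip(src, src_area))
    dst_mass = sum(v * a for v, a in zip(dst, dst_area))
    return abs(dst_mass - src_mass) / abs(src_mass)


# Hypothetical destination points (phi, theta) in radians.
points = [(0.0, 0.0), (math.pi / 4, 0.3), (math.pi / 2, -0.6)]
exact = [analytic_field(phi, theta) for phi, theta in points]
# Stand-in for a regridded field: the exact values perturbed by 1%.
regridded = [1.01 * value for value in exact]
mean_err, max_err = relative_errors(regridded, exact)
print(mean_err, max_err)
```

Since the analytic field never drops below 1, the relative error is well defined at every point.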

  10. File Manipulation with OCGIS
      • Subset a high-resolution precipitation dataset in CF format:
          import ocgis
          # path_ugid_shp and select_ugid are defined elsewhere (not shown on the slide)
          PATH_PR = 'nldas_met_update.obs.daily.pr.1990.nc'
          rd = ocgis.RequestDataset(uri=PATH_PR)
          ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                                    agg_selection=True, prefix='subset_nc', output_format='nc',
                                    add_auxiliary_files=False)
          subset_nc = ops.execute()
      • Create an ESMPy Field from a subsetted OCGIS dataset:
          ops = ocgis.OcgOperations(dataset={'uri': subset_nc}, output_format='esmpy')
          efield = ops.execute()

  11. NFIE Demo
      • National Flood Interoperability Experiment (NFIE) – under the Office of Hydrologic Development at the National Weather Service
        • Operational by 2015, total water prediction by 2020
      • Asked ESMPy and OCGIS to subset high-resolution climate precipitation data to local scale and then regrid to water catchment basins
      [Figure: local maps]
      Source data: CF formatted precipitation data file for the continental United States (nldas_met_update.obs.daily.pr.1990.nc)
      Output: Multi-dimensional precip values (including time) on a subset of 3 catchment basins in region of interest after generation and application of conservative regrid weights

  12. NFIE Demo Code
      • Convert a subsetted NetCDF file to an ESMPy Field
          import ESMF
          import ocgis
          ops = ocgis.OcgOperations(dataset={'uri': subset_nc}, output_format='esmpy')
          srcfield = ops.execute()
      • Create an ESMPy Mesh and destination Field from UGRID file
          dstgrid = ESMF.Mesh(filename=ugridnc, filetype=ESMF.FileFormat.UGRID)
          dstfield = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT, ndbounds=[1, 365, 1])
      • Create an object to regrid data from the source to the destination field
          regrid = ESMF.Regrid(srcfield, dstfield, regrid_method=ESMF.RegridMethod.CONSERVE)
      • Regrid from source to destination field
          dstfield = regrid(srcfield, dstfield)
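To give a feel for what first-order conservative weights represent in the demo above, here is a one-dimensional analogue in plain Python: each weight is the fraction of a destination cell covered by a source cell, so the integral of the field is preserved. This is a conceptual sketch only. It is not how ESMF computes weights on 2D meshes, and the function names and cell edges are hypothetical.

```python
def conservative_weights_1d(src_edges, dst_edges):
    """Return {(dst_cell, src_cell): overlap length / destination cell length}."""
    weights = {}
    for i in range(len(dst_edges) - 1):
        d_lo, d_hi = dst_edges[i], dst_edges[i + 1]
        for j in range(len(src_edges) - 1):
            s_lo, s_hi = src_edges[j], src_edges[j + 1]
            overlap = min(d_hi, s_hi) - max(d_lo, s_lo)
            if overlap > 0.0:
                weights[(i, j)] = overlap / (d_hi - d_lo)
    return weights


def integral(values, edges):
    """Integrate a piecewise-constant field over its cells."""
    return sum(v * (edges[k + 1] - edges[k]) for k, v in enumerate(values))


# Hypothetical grids: 4 uniform source cells, 2 uneven destination cells.
src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
dst_edges = [0.0, 1.5, 4.0]
src_values = [1.0, 2.0, 3.0, 4.0]

weights = conservative_weights_1d(src_edges, dst_edges)
dst_values = [0.0] * (len(dst_edges) - 1)
for (i, j), w in weights.items():
    dst_values[i] += w * src_values[j]

src_total = integral(src_values, src_edges)
dst_total = integral(dst_values, dst_edges)
print(dst_values, src_total, dst_total)
```

The two totals agree up to floating-point rounding, which is the property a conservation error measures.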

  13. Requirements, Supported Platforms, Limitations, etc...
      Requirements:
      • ESMPy:
        • Python 2.6, 2.7
        • Numpy 1.6.1/2 (ctypes)
        • ESMF installation (with NetCDF)
      • Additional dependencies for OCGIS:
        • netCDF4
        • Shapely
        • Fiona
        • osgeo
      Testing:
      • Regression tested nightly on 5 platforms
      Supported Platforms:
      • Linux, Darwin, and Cray
      • Gfortran
      • OpenMPI
      Installation:
      • ESMPy: python setup.py build --ESMFMKFILE=<path_to_esmf.mk> install
      • OCGIS: python setup.py install

  14. Status and Future Work
      • ESMPy is still in beta, production release expected February 2015
      • OCGIS is in production and fully supported!
      • Upcoming development:
        • OpenClimateGIS support for distributed ESMPy data
        • Data type for observational data (point clouds, etc.) and regridding to/from
        • Seamless conversions between serial and distributed objects in ESMPy
        • Python 3 support
        • Update to UV-CDAT (currently using older ESMP interface)
        • Components for rapid prototyping of Earth System Models?!

  15. Current Users
      • UV-CDAT (PCMDI) – Ultrascale Visualization Climate Data Analysis Tools
      • cfpython (University of Reading) – Implementation of the CF data model for reading, writing and processing of data and metadata
      • Iris (Met Office) – Python library for visualizing meteorological and oceanographic data sets
      • PyFerret (NOAA) – Python based interactive visualization and analysis environment
      • Community Surface Dynamics Modeling System (CU-Boulder) – Tools for hydrological and other surface modeling processes
      • OCGIS – climate4impact portal (IS-ENES): Tools for climate modelers to tailor high-resolution climate data
      • OCGIS – ClimatePipes (Kitware): User-friendly data access, manipulation, analysis and visualization of community climate models

  16. Questions?
      Email: esmf_support@list.woc.noaa.gov or ocgis_support@list.woc.noaa.gov
      Website: https://earthsystemcog.org/projects/esmpy/ or https://earthsystemcog.org/projects/openclimategis/
      References:
      [1] Khoei S.A., Gharehbaghi A.R., The superconvergent patch recovery technique and data transfer operators in 3D plasticity problems. Finite Elements in Analysis and Design, 43(8), 2007.
      [2] Hung K.C., Gu H., Zong Z., A modified superconvergent patch recovery method and its application to large deformation problems. Finite Elements in Analysis and Design, 40(5-6), 2004.
      [3] UGRID documentation: https://github.com/ugrid-conventions/ugrid-conventions, accessed Dec. 19, 2014
      [4] GridSpec whitepaper: https://ice.txcorp.com/trac/modave/wiki/CFProposalGridspec, accessed Dec. 19, 2014
      [5] Jones, P.W., SCRIP: A Spherical Coordinate Remapping and Interpolation Package. http://www.acl.lanl.gov/climate/software/SCRIP. Los Alamos National Laboratory Software Release LACC 98-45
      [6] Shapefile whitepaper: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf, accessed Dec. 19, 2014

  17. Additional material

  18. OCGIS Computation
      • Framework designed to accommodate a variety of climate indices and metrics:
        • Temporally grouped functions → monthly means, annual maximums, durations
        • String-based functions → 'diff=tasmax-tasmin'
        • Simple transforms → natural logarithm
        • Multivariate functions → heat indices
      • Goal is to provide a simplified method for introducing new indices and a straightforward, timely method for documentation (currently works with the Sphinx Python documentation system)
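Two of the function families listed on this slide can be illustrated without OCGIS itself. The sketch below computes an element-wise difference, analogous to the string-based function 'diff=tasmax-tasmin', and then a temporally grouped monthly mean. It uses only the standard library, and the helper `monthly_means` and the sample temperatures are hypothetical; it does not reproduce the OCGIS calculation interface.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(dates, values):
    """Group values by (year, month) and return the mean of each group."""
    groups = defaultdict(list)
    for day, value in zip(dates, values):
        groups[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


# Hypothetical daily series covering January and February 1990 (59 days).
start = date(1990, 1, 1)
dates = [start + timedelta(days=k) for k in range(59)]
tasmax = [10.0 if day.month == 1 else 14.0 for day in dates]
tasmin = [2.0 if day.month == 1 else 4.0 for day in dates]

# Element-wise analogue of the string-based function 'diff=tasmax-tasmin'.
diff = [hi - lo for hi, lo in zip(tasmax, tasmin)]

# Temporally grouped function: monthly mean of the daily difference.
means = monthly_means(dates, diff)
print(means)
```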

  19. ctypes bindings to ESMF
      Interfacing with ctypes:
          _ESMF.ESMC_GridGetCoord.restype = ctypes.POINTER(ctypes.c_void_p)
          _ESMF.ESMC_GridGetCoord.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_uint,
                                              numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                              numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                              ctypes.POINTER(ctypes.c_int)]
          gridCoordPtr = _ESMF.ESMC_GridGetCoord(grid.struct.ptr, coordDim, staggerloc,
                                                 exclusiveLBound, exclusiveUBound, ctypes.byref(lrc))
          # adjust bounds to be 0 based
          exclusiveLBound = exclusiveLBound - 1
      Allocating Numpy array buffers for memory allocated in ESMF:
          buffer = numpy.core.multiarray.int_asbuffer(
              ctypes.addressof(pointer.contents),
              numpy.dtype(ESMF2PythonType[self.type]).itemsize*size)
          array = numpy.frombuffer(buffer, ESMF2PythonType[self.type])
      Switching between Fortran and C array striding:
          array = numpy.reshape(array, self.size_local[stagger], order='F')
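The two ideas on this slide, wrapping memory owned by a compiled library without copying it and reading it with Fortran (column-major) striding, can be tried with the standard library alone. In the sketch below a ctypes array stands in for memory allocated by ESMF, and `fortran_get` is a hypothetical helper; the ESMPy code itself uses Numpy for both steps.

```python
import ctypes

# Stand-in for library-owned memory: 6 doubles holding the 2 x 3 array
# [[1, 2, 3], [4, 5, 6]] in Fortran (column-major) order.
n_rows, n_cols = 2, 3
owner = (ctypes.c_double * 6)(1.0, 4.0, 2.0, 5.0, 3.0, 6.0)
address = ctypes.addressof(owner)

# Wrap the existing memory without copying, given only its address and length.
view = (ctypes.c_double * (n_rows * n_cols)).from_address(address)


def fortran_get(buf, i, j, n_rows):
    """Read element (i, j), 0-based, from a column-major buffer."""
    return buf[i + j * n_rows]


rows = [[fortran_get(view, i, j, n_rows) for j in range(n_cols)]
        for i in range(n_rows)]
print(rows)

# A write through the view changes the owner's data: both share one buffer.
view[0] = 10.0
print(owner[0])
```

Sharing the buffer avoids duplicating large coordinate and field arrays, at the cost that the view is only valid while the owning memory stays allocated.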
