1 / 46

Michael P. Finn

pRasterBlaster: High-Performance Small-Scale Raster Map Projection Transformation Using the Extreme Science and Engineering Discovery Environment. Michael P. Finn. Collaborators/ Co-Authors. Yan Liu

zlata
Download Presentation

Michael P. Finn

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. pRasterBlaster: High-Performance Small-Scale Raster Map Projection Transformation Using the Extreme Science and Engineering Discovery Environment Michael P. Finn

  2. Collaborators/ Co-Authors • Yan Liu • University of Illinois at Urbana-Champaign (UIUC), CyberInfrastructure and Geospatial Information (CIGI) Laboratory • David M. Mattli • USGS, Center of Excellence for Geospatial Information Science (CEGIS) • Qingfeng (Gene) Guan • University of Nebraska -- Lincoln, School of Natural Resources • Anand Padmanabhan • UIUC, CIGI Laboratory • Kristina H. Yamamoto • USGS, CEGIS  • Babak Behzad • UIUC, Department of Computer Science   • Eric Shook • UIUC, CIGI Laboratory

  3. Overview • Parallel computing for raster (map projection, other) • Raster was born to be parallelized • Use a large amount of computing power for a relatively short period of time • CEGIS parallel computing as part of the US National Science Foundation CyberGISGrant • Data Input/ Output (I/O) Challenges

  4. Raster Processing & HPC-Research(map projection case) Raster processing too slow or even impossible on desktop machines for large datasets Map projection/ reprojection for raster datasets mapIMG Example: Re-projecting a 1GB raster dataset can take 45-60 minutes pRasterBlaster: mapIMG in HPC environment Rigorous geometry handling and novel resampling Resampling options for categorical data and population counts (also standard continuous data resampling methods)

  5. Motivation for HPC-Research Solve problems using multiple processors In this case, XSEDE, a virtual system that scientists can use to interactively share computing resources, data and expertise Create user-centric Web 2.0 interface that hides non-important-to-user, complex HPC implementation details

  6. Objectives • Develop a rapid raster projection transformation and Web service using HPC technology -- pRasterBlaster • Provide a software package that better handles known problems for wide use in the modeling community

  7. pRasterBlaster (the Early Days) • Very similar to mapIMG (V. 03) • Row-wise decomposition • I/O occurred directly in program inner loop • Disadvantages • Inefficient I/O caused by too many I/O requests • Changing I/O implementation difficult • Difficulty with multiple processors writing to a shared file system

  8. Fast, Accurate Raster Reprojection in Three (primary) Steps • Step 1: Calculate and Partition Output Space • Step 2: Read Input and Reproject • Step 3: Combine Temporary Files

  9. Step 1: Calculate and Partition Output Space(Each map projection represents a distinct coordinate system) • The area of the output raster dataset must be calculated by finding a minbox. • The edge of the input raster dataset is iterated over translating input coordinates to output • The smallest box that contains all of the calculated output coordinates is the minbox • The calculated output minbox is then partitioned into areas to be assigned to processors • Each output partition is matched with a partition in the input raster dataset • This partition pair, input and output, is a single task for a processor

  10. Step 2: Read Input and Reproject(Each processor is assigned a quantity of input/output partition pairs) • Memory is allocated to hold the output and input partitions • The input partition is read from the file system • For each pixel in the output, the equivalent pixels in the input are used to find the resampled value • Once the resampling is complete, write the output raster to a per-processor temporary file

  11. Steps 1& 2 Step 2: Read Input and Reproject (right) Step 1: Calculate and Partition Output Space (below) Output Space Partitioning Details (partitions can be arbitrary rectangular areas) - An example output coordinate space overlayed with a grid representing the partitions (left) - The magenta rectangle from the output raster coordinate space and the rhombus from the in put raster dataset represent the same area. (Each area would be loaded into memory.)

  12. Step 3: Combine Temporary Files(Each processor writes its output to an exclusive temporary file) • After all of the processors finish their partitions, they each take turns copying their temporary file contents to the final output file Output Raster Dataset (Mollweide projection) Note the red areas - they are outside of the projected space -- these areas have important performance consequences, especially for “Load Balancing.”

  13. Why Output to a Set of Temp Files? (Instead of a Single Output) • Because, file locking and I/O traffic congestion on shared file system • This is caused by too many processors making I/O requests • The issue is not obvious on small-size clusters; But on supercomputers, it's a big issue • The temp file strategy is particularly efficient on supercomputers • Because of fast local I/O but limited network-based I/O • For example, Trestles@SDSC.; each trestles node has a 100G solid-state drive as temp storage at runtime; But the shared file system (Lustre) has limited bandwidth

  14. Program Evaluation & Integration Process • Wall time vs. speedup • Amdahl’s Law: calculate maximum speedup • Scalability profiling • Resolving computational bottlenecks • Build, test, and deployment • pRasterBlaster in CyberGIS Toolkit • http://cybergis.cigi.uiuc.edu/toolkit • pRasterBlaster as service

  15. Computational Bottleneck I: Workload Distribution

  16. Profiling

  17. Profiling cont’d

  18. Workload Distribution Issue N rows on P processor cores When P is small When P is big

  19. Load Balancing N rows on P processor cores When P is small When P is big 19

  20. Computational Bottleneck II: Spatial Data-dependent Performance Anomaly

  21. Profiling

  22. Analysis • The anomaly is data dependent • Four corners of the raster were processed by processors whose indexes are close to the two ends • Discovery: Exception handling in C++ is costly • Coordinate transformation on nodata area was handled as an exception • Solution • Use a more efficient method to handle the nodataareas instead of using C++ exception handling

  23. Performance after Improvement

  24. Performance Observations • Dynamic Partitioning • Load Balancing • Spatial Data-dependent Performance • File I/O on Shared Clusters/ Parallel File I/O

  25. Current Implementation (1.0) • I/O and reprojection separated into stages • I/O can be batched and reordered • Partitioning more flexible • Partition can be less than a row

  26. Version 1.0 Components • Prologue • Open input files, calculate size of output, calculate partitions, create output and temp files • Reprojection – for each partition in parallel • Read input to memory, iterate through output pixels and reproject, write partition to temporary files • Epilogue • Write final output by aggregating temporary files from all computing nodes

  27. Why Aggregate Temporary Files from All Computing Nodes?(to Write Final Output) • Because the temp file strategy in current implementation (V 1.0) cannot leverage full potential of underlying parallel file system on supercomputers • Even though the strategy employed in V 1.0 allows for a certain degree of performance improvement • Parallel I/O is such a solution

  28. I/O Scalability • I/O subsystems are slow compared to other parts of cyberinfrastructure resources • I/O bandwidth is easily saturated • Once saturated I/O performance stops scaling • Adding more compute nodes increases aggregated memory bandwidth and flops/s, but not I/O • I/O is hence a very expensive operation

  29. Motivations • Using low-level parallel I/O libraries requires in-depth understanding of MPI and parallel I/O • Could be challenging for GIS developers • Generic parallel I/O libraries ignore spatial characteristics of data • Understanding the spatial nature of data leads to better optimized I/O operations • e.g. Row-based I/O, Column-based I/O, Block-based I/O

  30. Solution to the I/O Problem • Provide a usable spatially-aware parallel I/O library • Included as part of CyberGIS Toolkit • Approach that focuses on developing advanced computational capabilities • Establishes an unique service to resolve I/O issues significant to scientific problem solving

  31. Serial I/O: Spokesperson Approach • One process performs I/O • Data aggregation or duplication • Limited by a single I/O process • Simple, but does not scale! … P0 P1 P2 Pn One File Storage Device

  32. Parallel I/O: One-File-per-Process • All processes perform I/O from/to individual files • Limited by the file-system • Does not scale to large process count • A large number of files may lead to a bottleneck due to metadata operations … P0 P1 P2 Pn File 0 File 1 File n File 2 Storage Device

  33. Parallel I/O: Shared File • Shared File • Each process performs I/O to a single shared file • Performance • Data organization within the shared file is very important • For the cases of large process counts, contention may build for file system resources … P0 P1 P2 Pn Shared File Storage Device

  34. Parallel I/O Methods and Tools • Different categories of I/O software and libraries have been developed to address I/O challenges • Parallel file systems: maintains logical space, provides efficient access to data (e.g., Lustre, GPFS) • MPI-IO: deals with organizing access by many processes • High-level libraries: maps application abstractions to structuredand portable file formats (e.g. HDF5, NetCDF-4) Application HDF5, NetCDF-4 MPI I/O Parallel File System Storage Hardware

  35. Parallel Spatial Data I/O Library • Provides a simple and easy to use API to read/write in parallel from/to a NetCDF-4 file • Uses NetCDF API • Written in C • Uses MPI • High level library • Exploits spatial characteristics of data • Data could be read/written in row, column, or chunks … P0 P1 P2 Pn Storage Device

  36. Parallel Spatial Data I/O Library • Provides a simple and easy to use API to read/write in parallel from/to a NetCDF-4 file • Uses NetCDF API • Written in C • Uses MPI • High level library • Exploits spatial characteristics of data • Data could be read/written in row, column, or chunks … P0 P1 P2 Pn Storage Device

  37. Parallel Spatial Data I/O Library • Provides a simple and easy to use API to read/write in parallel from/to a NetCDF-4 file • Uses NetCDF API • Written in C • Uses MPI • High level library • Exploits spatial characteristics of data • Data could be read/written in row, column, or chunks … P0 P1 P2 Pn Storage Device

  38. A NetCDF Example netcdf example_file { dimensions: x = 512 ; y = 256 ; variables: float temp(x, y) ; temp:units = "celsius" ; data: temp = 10.3, 15.7, …, 20.1, 14.2, 12.5, …, 18.9, … 11.3, 13.8, …,10.7; } example_file x temp y units

  39. A Sample Application in pIOLibrary Main() { • Initialize • Read from the input file • read_dbl_variable_by_name(“test”, read_ncid, read_vinfo, read_dinfo); • Do the computation and store your results in memory(i.e. arrays) • Write the results • write_dbl_variable_by_name(“test”, write_ncid, write_vinfo, write_dinfo, &data[0]); • Finalize }

  40. pIOLibrary API • Documentation: • http://sandbox.cigi.illinois.edu/pio

  41. Preliminary Results • Performance of our library executing parallel read and write, to and from memory tested using the 5.4 GB NetCDF-4 dataset from NCAR

  42. Current Status • pRasterBlaster development continuing • Deployment: Trestles@SDSC, Ranger@TACC • pRasterBlaster Version 1.0 released 01 April 2012 • pRasterBlaster in CyberGIS Toolkit • Co-development between USGS and UIUC using the CyberGIS SVN repository • Identified two substantive bottlenecks • pRasterBlaster as service • Open service API interface available in the prototype • Demand for high-performance reprojection in CyberGIS analytics • pRasterBlaster Version 1.5 soon (fix identified bottlenecks; parallel I/O Library) • Iterative testing • mapIMG 4.0 (dRasterBlaster -- continuing to be released this fall) • Library of core functions shared by both dRasterBlaster & pRasterBlaster (libRasterBlaster -- almost complete) • pIOLibrary API (& Documentation): http://sandbox.cigi.illinois.edu/pio

  43. Plans for Near-Term • Release pRasterBlaster 1.5 • Finish libRasterBlaster • Release dRasterBlaster • Final Decisions on User Processes/ Interface(s)

  44. References • Atkins, D. E., K. K. Droegemeier, et al. (2003). Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. Arlington, VA, National Science Foundation. • Canters, F. (2002). Small-Scale Map Projection Design. London: Taylor & Francis. • Finn, Michael P., and David M. Mattli (2012). User’s Guide for the mapIMG 3: Map Image Reprojection Software Package. U. S. Geological Survey Open-File Report 2011-1306, 12 p.. • Finn, Michael P., Daniel R. Steinwand, Jason R. Trent, E. Lynn Usery, Robert A. Buehler and David Mattli (2011). An Implementation of MapImage, a Program for Creating Map Projections of Small Scale Geospatial Raster Data. Paper submitted to the journal Cartographic Perspectives. • Jenny, B., T. Patterson, and L. Hurni (2010) Graphical Design of World Map Projections, International Journal of Geographical Information Science, Vol. 24, No. 11, 1687-1702. • Slocum, T. A., R. B. McMaster, F. C. Kessler, and H. H. Howard (2009). Thematic Cartography and Geovisualization. 3rd Edition. Upper Saddle River, NJ: Pearson Prentice Hall. • Wang, Shaowen and Yan Liu (2009) TeraGrid GIScience Gateway: Bridging cyberinfrastructure and GIScience. International Journal of Geographical Information Science, Volume 23, Number 5, May, pages 631-656. •  Wang, Shaowen, Yan Liu, Nancy Wilkins-Diehr, and Stuart Martin (2009) SimpleGrid toolkit: Enabling geosciences gateways to cyberinfrastructure. Computers and Geosciences, Volume 35, Number 12, December, pages 2283-2294. • Zaslavsky, Ilya, Richard Marciano, Amarnath Gupta, and Chaitanya Baru (2000) XML-Based Spatial Data Mediation Infrastructure for Global Interoperability. Proceedings 4th Global Spatial Data Infrastructure Conference, Cape Town.

  45. DISCLAIMER & ACKNOWLEDGEMENT • DISCLAIMER: Any use of trade, product, or firm names in this paper is for descriptive purposes only and does not imply endorsement by the U.S. Government. • ACKNOWLEDGEMENT: This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation (NSF) grant number OCI-1053575. Additional support: NSF grants, BCS-0846655, OCI-1047916, and XSEDE SES070004N

  46. pRasterBlaster: High-Performance Small-Scale Raster Map Projection Transformation Using the Extreme Science and Engineering Discovery Environment QUESTIONS?

More Related