
High-Performance Parallel I/O Libraries


Presentation Transcript


  1. High-Performance Parallel I/O Libraries (PI) Alok Choudhary, (Co-I) Wei-Keng Liao Northwestern University In Collaboration with the SEA Group (Group Leader, Rob Ross, ANL)

  2. Parallel NetCDF
  [Architecture diagram: applications on the compute nodes call Parallel netCDF, which is layered on MPI-IO and the client-side file system; requests cross the network to the I/O servers.]
  • NetCDF defines:
    • A portable file format
    • A set of APIs for file access
  • Parallel netCDF:
    • New APIs for parallel access
    • Maintains the same file format
  • Tasks:
    • Built on top of MPI for portability and high performance
    • Supports C and Fortran interfaces (a minimal C sketch follows below)
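  As a concrete illustration of the layering described above, here is a minimal sketch of a collective PnetCDF write in C. The file name, variable name, and sizes are illustrative, and error checking is omitted for brevity.

      #include <mpi.h>
      #include <pnetcdf.h>

      int main(int argc, char **argv) {
          int rank, nprocs, ncid, dimid, varid;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

          /* All processes create the shared file collectively through MPI-IO. */
          ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);

          /* Define a 1-D variable of global length nprocs * 10 (illustrative size). */
          ncmpi_def_dim(ncid, "x", (MPI_Offset)nprocs * 10, &dimid);
          ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 1, &dimid, &varid);
          ncmpi_enddef(ncid);

          /* Each process writes its own 10-element slice with a collective call. */
          double buf[10];
          for (int i = 0; i < 10; i++) buf[i] = rank + 0.1 * i;
          MPI_Offset start = (MPI_Offset)rank * 10, count = 10;
          ncmpi_put_vara_double_all(ncid, varid, &start, &count, buf);

          ncmpi_close(ncid);
          MPI_Finalize();
          return 0;
      }

  A file written this way keeps the classic netCDF format, so it remains readable by serial netCDF tools.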

  3. Parallel NetCDF - Status
  • Version 1.0.1 was released on Dec. 7, 2005
  • Web page receives 200 page views a day
  • Supported platforms: Linux Cluster, IBM SP, BG/L, SGI Origin, Cray X, NEC SX
  • Two sets of parallel APIs
    • High-level APIs (mimicking the serial netCDF APIs)
    • Flexible APIs (to utilize MPI derived datatypes); see the sketch below
  • Support for large files (> 2 GB)
  • Test suites
    • Self-test codes ported from the Unidata netCDF package to validate against single-process results
  • New data analysis APIs
    • Basic statistical functions: min, max, mean, median, variance, deviation
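  The flexible API differs from the high-level one in that the caller describes the in-memory buffer with an MPI derived datatype. A hedged sketch, assuming a 12x12 local buffer whose 10x10 interior (one ghost cell on each side) is written; the layout and names are illustrative.

      #include <mpi.h>
      #include <pnetcdf.h>

      /* Write the 10x10 interior of a 12x12 local buffer using the flexible
       * API, which accepts an MPI derived datatype for the memory layout. */
      void write_interior(int ncid, int varid, int rank, double local[12][12]) {
          int sizes[2]    = {12, 12};   /* full local buffer, including ghosts */
          int subsizes[2] = {10, 10};   /* interior region to be written       */
          int starts[2]   = {1, 1};     /* skip the ghost cells                */
          MPI_Datatype interior;
          MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                   MPI_ORDER_C, MPI_DOUBLE, &interior);
          MPI_Type_commit(&interior);

          /* File region written by this process: a 10x10 block at row rank*10. */
          MPI_Offset start[2] = {(MPI_Offset)rank * 10, 0};
          MPI_Offset count[2] = {10, 10};
          ncmpi_put_vara_all(ncid, varid, start, count,
                             local, 1, interior);   /* one "interior" element */
          MPI_Type_free(&interior);
      }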

  4. Illustrative PnetCDF Users
  • FLASH – astrophysical thermonuclear application from the ASCI/Alliances center at the University of Chicago
  • ACTM – atmospheric chemical transport model, LLNL
  • WRF – Weather Research and Forecast modeling system, NCAR
  • WRF-ROMS – regional ocean model system I/O module from the Scientific Data Technologies group, NCSA
  • ASPECT – data understanding infrastructure, ORNL
  • pVTK – parallel visualization toolkit, ORNL
  • PETSc – portable, extensible toolkit for scientific computation, ANL
  • PRISM – PRogram for Integrated Earth System Modeling, users from C&C Research Laboratories, NEC Europe Ltd.
  • ESMF – Earth System Modeling Framework, National Center for Atmospheric Research
  • CMAQ – Community Multiscale Air Quality code I/O module, SNL
  • More …

  5. PnetCDF Future Work
  • Non-blocking I/O
    • Built on top of non-blocking MPI-IO (see the sketch below)
  • Improve data type conversion
    • Type conversion while packing non-contiguous buffers
  • Data analysis APIs
    • Statistical functions
    • Histogram functions
    • Range query: regional sum, min, max, mean, …
    • Data transformation: DFT, FFT
  • Collaboration with application users
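  The non-blocking PnetCDF interface is future work here, so no PnetCDF call is shown; the sketch below illustrates only the underlying non-blocking MPI-IO pattern it would build on, with an illustrative offset and count.

      #include <mpi.h>

      /* Start a write, overlap it with computation, then wait for completion.
       * This is plain non-blocking MPI-IO, the layer a non-blocking PnetCDF
       * API would be built on. */
      void overlap_write(MPI_File fh, MPI_Offset offset,
                         const double *buf, int count) {
          MPI_Request req;
          MPI_File_iwrite_at(fh, offset, buf, count, MPI_DOUBLE, &req);

          /* ... computation that does not touch buf can proceed here ... */

          MPI_Wait(&req, MPI_STATUS_IGNORE);
      }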

  6. MPI-IO Caching
  • Client-side file caching
    • Reduces client-server communication costs
    • Enables write-behind to better utilize network bandwidth
    • Avoids file system locking overhead by aligning I/O with the file block size (or stripe size)
  • Prototype in ROMIO
    • Collaborative caching by the group of MPI processes
    • A complete caching subsystem in the MPI library
    • Data consistency and cache coherence control
    • Distributed file locking
    • Memory management for data caching, eviction, and migration
    • Applicable to both MPI collective and independent I/O
  • Two implementations
    • Creating an I/O thread in each MPI process
    • Using the MPI RMA utility
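  The caching subsystem itself lives inside the MPI library, so applications do not call it directly. What user code can control today are MPI-IO hints related to alignment and collective buffering; the sketch below uses the standard ROMIO hint names striping_unit and romio_cb_write with example values, and whether a hint is honored depends on the underlying file system.

      #include <mpi.h>

      /* Open a file with hints that align I/O to the stripe size and enable
       * collective buffering on writes.  Hint names are standard ROMIO hints;
       * the values are examples only. */
      MPI_File open_with_hints(const char *path) {
          MPI_Info info;
          MPI_File fh;
          MPI_Info_create(&info);
          MPI_Info_set(info, "striping_unit", "1048576");   /* 1 MB stripe size */
          MPI_Info_set(info, "romio_cb_write", "enable");   /* collective buffering */
          MPI_File_open(MPI_COMM_WORLD, path,
                        MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
          MPI_Info_free(&info);
          return fh;
      }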

  7. FLASH I/O Benchmark
  • The I/O kernel of the FLASH application, a block-structured adaptive mesh hydrodynamics code
  • Each process writes 80 cubes
  • I/O through HDF5 (see the parallel HDF5 setup sketch below)
  • Write-only operations
  • The improvement is due to write-behind
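  Since FLASH performs its I/O through HDF5, the parallel path is enabled by attaching HDF5's MPI-IO file driver to the file access property list. A minimal sketch, assuming an HDF5 build with parallel support; the file name is illustrative and error checking is omitted.

      #include <mpi.h>
      #include <hdf5.h>

      /* Create an HDF5 file that all MPI processes access through the
       * MPI-IO virtual file driver, the path the FLASH I/O kernel exercises. */
      hid_t create_parallel_hdf5(const char *path) {
          hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
          H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
          hid_t file = H5Fcreate(path, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
          H5Pclose(fapl);
          return file;
      }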

  8. BTIO Benchmark
  [Figure: each process holds a 4-D local array; the file view interleaves the blocks of processes P0,0 through P2,2 according to the block tri-diagonal partitioning.]
  • Block tri-diagonal array partitioning
  • 40 MPI collective writes followed by 40 collective reads (a file-view sketch follows below)
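  The "file view" in the diagram corresponds to MPI-IO's file view mechanism: each process describes its block of the global array with a derived datatype, installs it as the view, and then issues collective writes. The sketch below is a generic 2-D illustration of that pattern, not the BTIO code itself, which partitions 4-D arrays and repeats the collective write 40 times.

      #include <mpi.h>

      /* Set a file view selecting this process's block of a global 2-D array,
       * then perform one collective write.  Sizes are illustrative. */
      void write_block(MPI_File fh, int gsizes[2], int lsizes[2],
                       int starts[2], const double *local) {
          MPI_Datatype filetype;
          MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                                   MPI_ORDER_C, MPI_DOUBLE, &filetype);
          MPI_Type_commit(&filetype);

          MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
          MPI_File_write_all(fh, local, lsizes[0] * lsizes[1], MPI_DOUBLE,
                             MPI_STATUS_IGNORE);
          MPI_Type_free(&filetype);
      }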
