ZioLib, Parallel I/O Library. Woo-Sun Yang and Chris Ding, Computational Research Division, Lawrence Berkeley National Laboratory. Parallel netCDF write (256 × 256 × 256). Parallel netCDF read (256 × 256 × 256). Height (Z). Latitude (Y). Longitude (X).


ZioLib, Parallel I/O Library

Woo-Sun Yang and Chris Ding

Computational Research Division

Lawrence Berkeley National Laboratory




ZioLib uses I/O staging processors for Z-decomposition

[Figure; axes: Longitude (X), Latitude (Y), Height (Z); captions follow]

Distributed array in (X,Z,Y) index order

Remapped at I/O staging PEs in (X,Y,Z) index order

I/O staging PEs write the global field in parallel

  • Relieves memory limitations of a PE

  • Relieves congestion on I/O nodes

  • Writes/reads in large blocks (no seeks) in parallel

  • Eliminates gather/scatter from user codes
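The index remapping and the Z-decomposition described on this slide can be sketched in a few lines. This is only an illustration, not ZioLib code: the function names `remap_xzy_to_xyz` and `z_slabs` are hypothetical, the array is held as a flat list with the first index varying fastest (Fortran storage order), and the near-equal split of Z levels among staging PEs is an assumption about how the work might be divided.

```python
def remap_xzy_to_xyz(src, nx, ny, nz):
    """Reorder a flat array (first index fastest) from (X,Z,Y) to (X,Y,Z) order."""
    dst = [None] * (nx * ny * nz)
    for j in range(ny):
        for k in range(nz):
            for i in range(nx):
                src_pos = i + nx * (k + nz * j)  # (X,Z,Y): X fastest, then Z, then Y
                dst_pos = i + nx * (j + ny * k)  # (X,Y,Z): X fastest, then Y, then Z
                dst[dst_pos] = src[src_pos]
    return dst


def z_slabs(nz, n_staging):
    """Split Z levels 0..nz-1 into contiguous, near-equal half-open ranges,
    one per staging PE (hypothetical partitioning rule)."""
    base, extra = divmod(nz, n_staging)
    slabs, start = [], 0
    for pe in range(n_staging):
        count = base + (1 if pe < extra else 0)
        slabs.append((start, start + count))
        start += count
    return slabs


if __name__ == "__main__":
    # 26 levels over 4 staging PEs: the first two PEs get one extra level.
    print(z_slabs(26, 4))
```

In (X,Y,Z) order the Z index varies slowest, so each staging PE's range of levels is one contiguous piece of the global field. That is what lets each staging PE write a single large block.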


Current status of ZioLib

  • A set of Fortran 90 modules supporting

    • netCDF I/O (serial and parallel)

    • direct-access unformatted I/O (serial and parallel)

    • sequential-access unformatted I/O (serial)

  • Works for integer*4, real*4 and real*8 arrays of any number of dimensions

  • Reads or writes in any array index order

  • Works with any parallel decomposition

  • Can handle ghost nodes

  • Uses MPI-1 routines only – can still work for serial I/O on machines without a parallel file system, a parallel netCDF library or MPI-2
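For the direct-access unformatted case, the following sketch shows how a staging PE holding a range of Z levels could place its data in the file once the field is in (X,Y,Z) order. It does not use or reproduce the ZioLib interface: `slab_byte_range`, `write_slab` and `demo_write` are hypothetical names, the values are assumed to be real*8 stored little-endian with no record markers, and plain Python file calls stand in for Fortran direct-access records.

```python
import os
import struct
import tempfile

BYTES_PER_VALUE = 8  # assumption: real*8 values


def slab_byte_range(k_start, k_end, nx, ny):
    """Byte offset and length of Z levels [k_start, k_end) in an (X,Y,Z)-ordered file."""
    level_bytes = nx * ny * BYTES_PER_VALUE
    return k_start * level_bytes, (k_end - k_start) * level_bytes


def write_slab(path, values, k_start, nx, ny):
    """Position once at the slab's offset, then write the whole slab in one call."""
    offset, _ = slab_byte_range(k_start, k_start, nx, ny)
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(struct.pack("<%dd" % len(values), *values))


def demo_write():
    """Two pretend staging PEs each write one Z level of a 4 x 3 x 2 field."""
    nx, ny, nz = 4, 3, 2
    slabs = [(0, 1), (1, 2)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "field.dat")
        with open(path, "wb") as f:
            f.truncate(nx * ny * nz * BYTES_PER_VALUE)  # preallocate the file
        for k_start, k_end in reversed(slabs):  # write order does not matter
            n = nx * ny * (k_end - k_start)
            values = [float(100 * k_start + m) for m in range(n)]
            write_slab(path, values, k_start, nx, ny)
        with open(path, "rb") as f:
            return struct.unpack("<%dd" % (nx * ny * nz), f.read())


if __name__ == "__main__":
    print(demo_write())
```

Because the slabs do not overlap, the staging PEs can write in any order or at the same time. Each one needs a single positioning step followed by one contiguous write.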


Direct-access write (256256256; XZY to XYZ)

transpose

global array

total

remap


Direct-access write (256256256; XZY to XYZ)Speed-up w.r.t. existing MPI + single-PE I/O


More on testing

  • Direct-access I/O with T42L26 resolution (128 × 64 × 26: 1.625 MB)

    • Write: speed up by 3-4

    • Read: speed up by 6-7

  • CAM2.0 history I/O with 8, 16 and 32 processors

    • with EUL (T42L26, Y-decomposition) and FV (B26, 2D-decomposition), load balancing chunking turned off

    • used the serial netCDF with one staging processor

    • speed-up by 1.5-2.5 (with serial netCDF only)
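The 1.625 MB figure quoted for the T42L26 field can be checked with a short calculation. The slide does not state the element size or the meaning of "MB". The sketch below assumes real*8 values and 1 MB = 2^20 bytes, which together reproduce the quoted number. The helper name `field_size_mib` is hypothetical.

```python
def field_size_mib(nx, ny, nz, bytes_per_value=8):
    """Size of an nx x ny x nz field in units of 2**20 bytes.

    bytes_per_value=8 is an assumption (real*8); it is not stated on the slide.
    """
    return nx * ny * nz * bytes_per_value / 2**20


if __name__ == "__main__":
    print(128 * 64 * 26)               # number of values in a T42L26 field
    print(field_size_mib(128, 64, 26))  # 1.625
```

Under the same assumptions, the 256 × 256 × 256 test array used in the write benchmarks would occupy 128 MB.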

