
Connecting HPIO Capabilities with Domain Specific Needs



Presentation Transcript


  1. Connecting HPIO Capabilities with Domain Specific Needs
  Rob Ross, MCS Division, Argonne National Laboratory, rross@mcs.anl.gov

  2. I/O in an HPC system
  • Many cooperating tasks sharing I/O resources
  • Relying on parallelism of hardware and software for performance
  [Figure: path from the Application through layers of I/O System Software to Storage Hardware; clients running applications (100s-1000s) connect over a storage or system network to I/O devices or servers (10s-100s)]

  3. Motivation
  • HPC applications increasingly rely on I/O subsystems
    • Large input datasets, checkpointing, visualization
  • Applications continue to be scaled, putting more pressure on I/O subsystems
  • Application programmers desire interfaces that match the domain
    • Multidimensional arrays, typed data, portable formats
  • Two issues to be resolved by the I/O system
    • Very high performance requirements
    • Gap between application abstractions and hardware abstractions

  4. I/O history in a nutshell
  • I/O hardware has lagged behind, and continues to lag behind, all other system components
  • I/O software has matured more slowly than other components (e.g. message passing libraries)
  • Parallel file systems (PFSs) are not enough
  • This combination has led to poor I/O performance on most HPC platforms
  • Only in a few instances have I/O libraries presented abstractions matching application needs

  5. Evolution of I/O software
  • Goal is convenience and performance for HPC
  • Slowly, capabilities have emerged
  • Parallel high-level libraries bring together good abstractions and performance, maybe
  [Figure: rough timeline of I/O software capabilities: local disk and POSIX, remote access (NFS, FC), parallel file systems, MPI-IO, serial high-level libraries, parallel high-level libraries (not to scale or necessarily in the right order…)]

  6. I/O software stacks
  • Myriad I/O components are converging into layered solutions
    • Insulate applications from eccentric MPI-IO and PFS details
    • Maintain (most of) I/O performance
    • Some HLL features do cost performance
  [Figure: layered I/O software stack: Application, High-level I/O Library, MPI-IO Library, Parallel File System, I/O Hardware]

  7. Role of parallel file systems
  • Manage storage hardware
    • Lots of independent components
    • Must present a single view
    • Provide fault tolerance
  • Focus on concurrent, independent access
    • Difficult to pass knowledge of collectives to the PFS
  • Scale to many clients
    • Probably means removing all shared state
    • Lock-free approaches
  • Publish an interface that MPI-IO can use effectively
    • Not POSIX

  8. Role of MPI-IO implementations
  • Facilitate concurrent access by groups of processes
    • Understanding of the programming model
  • Provide hooks for tuning the PFS
    • MPI_Info as the interface to PFS tuning parameters (see the sketch below)
  • Expose a fairly generic interface
    • Good for building other libraries
  • Leverage MPI-IO semantics
    • Aggregation of I/O operations
  • Hide unimportant details of the parallel file system
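  To make the MPI_Info hook concrete, here is a minimal sketch of how code built on MPI-IO might pass tuning hints through to the file system and then write collectively so the implementation can aggregate requests. The hint names (ROMIO's romio_cb_write and cb_buffer_size), the file name, and the one-contiguous-region-per-rank layout are illustrative assumptions, not details from the talk.

```c
#include <mpi.h>

/* Sketch only: hint names ("romio_cb_write", "cb_buffer_size"), the file
 * name, and the one-contiguous-region-per-rank layout are assumptions. */
void write_checkpoint(MPI_Comm comm, const double *buf, int count)
{
    MPI_Info info;
    MPI_File fh;
    int rank;

    MPI_Comm_rank(comm, &rank);

    /* MPI_Info is the hook for passing PFS tuning parameters downward. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "enable");   /* collective buffering */
    MPI_Info_set(info, "cb_buffer_size", "16777216"); /* 16 MB aggregation buffers */

    MPI_File_open(comm, "checkpoint.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Collective call: the MPI-IO layer may aggregate all ranks' requests
     * before touching the parallel file system. */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * count * sizeof(double),
                          buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
}
```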

  9. Role of high-level libraries
  • Provide an appropriate abstraction for the domain
    • Multidimensional, typed datasets
    • Attributes
    • Consistency semantics that match usage
    • Portable format
  • Maintain the scalability of MPI-IO
    • Map data abstractions to datatypes (see the sketch below)
    • Encourage collective I/O
  • Implement optimizations that MPI-IO cannot (e.g. header caching)
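  As an illustration of "map data abstractions to datatypes," here is a hedged sketch of what a high-level library might do underneath: describe one process's block of a global 2-D array with MPI_Type_create_subarray, install it as the file view, and write collectively. The decomposition, element type, and function name are assumptions for the example, not the internals of any particular library.

```c
#include <mpi.h>

/* Sketch only: how a library might turn "my block of a global 2-D array"
 * into an MPI datatype and a file view before a collective write. */
void write_block(MPI_Comm comm, const char *path,
                 const int gsizes[2],   /* global array extents             */
                 const int lsizes[2],   /* this process's block extents     */
                 const int starts[2],   /* block offset in the global array */
                 const double *block)
{
    MPI_Datatype filetype;
    MPI_File fh;

    /* The domain abstraction (a subarray of a global array) becomes an
     * MPI datatype describing where this block lives in the file. */
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    /* Collective write: MPI-IO can merge every process's subarray into
     * large, well-formed requests to the parallel file system. */
    MPI_File_write_all(fh, block, lsizes[0] * lsizes[1], MPI_DOUBLE,
                       MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
}
```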

  10. Example: ASCI/Alliance FLASH
  • FLASH is an astrophysics simulation code from the ASCI/Alliance Center for Astrophysical Thermonuclear Flashes
  • Fluid dynamics code using adaptive mesh refinement (AMR)
  • Runs on systems with thousands of nodes
  • Three layers of I/O software between the application and the I/O hardware
  • Example system: ASCI White Frost
  [Figure: FLASH I/O stack: ASCI FLASH, Parallel netCDF, IBM MPI-IO, GPFS, Storage]

  11. FLASH data and I/O
  • 3D AMR blocks (see the sketch below)
    • 16³ elements per block
    • 24 variables per element
    • Perimeter of ghost cells
  • Checkpoint writes all variables
    • No ghost cells
    • One variable at a time (noncontiguous)
  • Visualization output is a subset of variables
  • Portability of data desirable
    • Postprocessing on separate platform
  [Figure: an AMR block: interior elements (24 vars each) surrounded by ghost cells]
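  For readers unfamiliar with the layout, a hedged C sketch of a FLASH-like block follows; the guard-cell width, dimension ordering, and names are assumptions made only to show why writing one variable's interior cells is a noncontiguous access.

```c
/* Sketch only: guard-cell width, dimension order, and names are assumed. */
#define NXB    16                 /* interior elements per side          */
#define NGUARD 4                  /* assumed ghost (guard) cell width    */
#define NVAR   24                 /* variables stored per element        */
#define NTOT   (NXB + 2 * NGUARD) /* interior plus ghost-cell perimeter  */

typedef struct {
    /* All variables for all cells, ghost cells included. */
    double data[NTOT][NTOT][NTOT][NVAR];
} amr_block;

/* A checkpoint writes one variable v at a time and skips the ghost cells,
 * so the bytes of interest are data[k][j][i][v] for interior i, j, k only:
 * small pieces strided through memory, i.e. a noncontiguous pattern that
 * benefits from the collective I/O aggregation described above. */
```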

  12. Tying it all together
  • FLASH tells PnetCDF that all its processes want to write out regions of variables and store them in a portable format
  • PnetCDF performs data conversion and calls the appropriate MPI-IO collectives (see the sketch below)
  • MPI-IO optimizes writing of data to GPFS using data shipping and I/O agents
  • GPFS handles moving data from agents to storage resources, storing the data, and maintaining file metadata
  • In this case, PnetCDF is a better match to the application
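  A hedged sketch of the kind of Parallel netCDF calls a FLASH-style checkpoint could make: define a variable over global dimensions, then have every process write its own region collectively, one variable at a time. The dimension names, sizes, and file name are assumptions for illustration; PnetCDF handles the portable format and issues the MPI-IO collectives underneath.

```c
#include <mpi.h>
#include <pnetcdf.h>

/* Sketch only: variable/dimension names, sizes, and the file name are
 * illustrative assumptions, not taken from the talk. */
void checkpoint_variable(MPI_Comm comm, const double *interior,
                         MPI_Offset global_nblocks, MPI_Offset my_first_block,
                         MPI_Offset my_nblocks)
{
    int ncid, dimids[4], varid;
    MPI_Offset start[4], count[4];

    /* Every process participates: PnetCDF turns these calls into
     * MPI-IO collectives against the parallel file system. */
    ncmpi_create(comm, "flash_checkpoint.nc", NC_CLOBBER,
                 MPI_INFO_NULL, &ncid);

    ncmpi_def_dim(ncid, "blocks", global_nblocks, &dimids[0]);
    ncmpi_def_dim(ncid, "z", 16, &dimids[1]);
    ncmpi_def_dim(ncid, "y", 16, &dimids[2]);
    ncmpi_def_dim(ncid, "x", 16, &dimids[3]);
    ncmpi_def_var(ncid, "density", NC_DOUBLE, 4, dimids, &varid);
    ncmpi_enddef(ncid);

    /* One variable at a time, ghost cells already stripped by the caller:
     * each process writes only the blocks it owns. */
    start[0] = my_first_block; start[1] = start[2] = start[3] = 0;
    count[0] = my_nblocks;     count[1] = count[2] = count[3] = 16;
    ncmpi_put_vara_double_all(ncid, varid, start, count, interior);

    ncmpi_close(ncid);
}
```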

  13. Future of I/O system software
  • More layers in the I/O stack
    • Better match the application's view of data
    • Mapping this view to PnetCDF or similar
    • Maintaining collectives, rich descriptions
  • More high-level libraries using MPI-IO
    • PnetCDF, HDF5 are great starts
    • These should be considered mandatory I/O system software on our machines
  • Focusing component implementations on their roles
    • Less general-purpose file systems: scalability and APIs of existing PFSs aren’t up to the workloads and scales
    • More aggressive MPI-IO implementations: lots can be done if we’re not busy working around broken PFSs
    • More aggressive high-level library optimization: they know the most about what is going on
  [Figure: future I/O software stack: Application, Domain Specific I/O Library, High-level I/O Library, MPI-IO Library, Parallel File System, I/O Hardware]

  14. Future
  • Creation and adoption of parallel high-level I/O libraries should make things easier for everyone
    • New domains may need new libraries or new middleware
    • HLLs that target database backends seem obvious; probably someone else is already doing this?
  • Further evolution of components is necessary to get the best performance
    • Tuning/extending file systems for HPC (e.g. user metadata storage, better APIs)
    • Aggregation, collective I/O, and leveraging semantics are even more important at larger scale
    • Reliability too, especially for kernel FS components
  • Potential HW changes (MEMS, active disks) are complementary
