
Coupling Parallel IO to SRMs for Remote Data Access



Presentation Transcript


  1. Coupling Parallel IO to SRMs for Remote Data Access
  Ekow Otoo, Arie Shoshani and Alex Sim
  Lawrence Berkeley National Laboratory

  2. Objectives and Goals
  • Allow near-online access to files/data on a mass storage system (e.g., HPSS) from MPI applications on a Linux cluster
  • Access files/data on local and remote MSS from MPI applications
  • Environment: applications on a Linux cluster having
    • a local parallel file system (e.g., PVFS2), and
    • data stored on a remote mass storage system, e.g., HPSS
  • Development of the MPI-IO-SRM library: psrm.h, libpsrm.a, libpsrm.so
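Building an application against the library files named above might look like the following sketch. The names psrm.h and libpsrm come from the slide; the source file name app.c and the install prefix /opt/psrm are assumptions for illustration only:

```shell
# Compile and link an MPI application against the MPI-IO-SRM library (sketch).
# psrm.h and libpsrm are named on the slide; the paths are assumptions.
mpicc -I/opt/psrm/include -c app.c -o app.o
mpicc app.o -L/opt/psrm/lib -lpsrm -o app
```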

  3. MPI-IO-SRM Library Usage Configuration
  • Instrument the program to make SRM calls in place of MPI-IO calls
  • Start the nameserver, DRM and TRM
  • Run the program as a normal MPI program
  [Diagram: MPI applications use the MPI-IO-SRM library over a local parallel file system (e.g., PVFS/GPFS); an SRM server with DRM and TRM components stages files from mass storage systems such as HPSS, dCache and Castor]
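The three configuration steps above might be carried out as in the following sketch. Only mpirun is standard MPI tooling; the daemon executable names below are hypothetical, since the slide does not give the actual commands for starting the nameserver, DRM and TRM:

```shell
# Start the SRM-side services, then launch the instrumented application
# as an ordinary MPI job. The daemon names are hypothetical.
./nameserver &      # hypothetical name-server daemon
./drm_server &      # hypothetical Disk Resource Manager
./trm_server &      # hypothetical Tape Resource Manager
mpirun -np 8 ./app  # run the instrumented program as a normal MPI program
```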

  4. What is SRM?
  Storage Resource Managers (SRMs) are middleware components whose functions are to provide:
  1) Controlled file transfers,
  2) Dynamic space allocation, and
  3) Dynamic file management
  [Diagram: user applications reach, via Grid middleware and SRM interfaces, diverse back-end storage systems such as Enstore, dCache, JASMine, Unix-based disks, Castor, and the CCLRC RAL SE]

  5. Program Structure of an Instrumented MPI-IO Application
  The application provides information on the set of files to be accessed, and the order in which to access them.
  …
  MPI_Init();
  …
  MPI_Info_create();
  MPI_Info_set();
  …
  MPI_File_srm_proxy_init();    // initiates the SRM client
  …
  MPI_File_srm_open();          // issues an srm_open to invoke the library; it checks
                                // if the file is local, otherwise calls the SRM server
  …
  <Process file reads and writes with standard MPI_File_* operations>
  …
  MPI_File_srm_close();         // issues an srm_close to "release" the file,
                                // making space for new files
  MPI_File_srm_proxy_close();
  …
  MPI_Finalize();
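A fuller C sketch of such an instrumented program follows. The MPI_File_srm_* function names are taken from the slide's skeleton, but their argument lists, and the "srm_file_list" info key, are assumptions (the library's actual signatures and info keys are not shown), so treat this as illustrative only:

```c
/* Sketch of an MPI-IO application instrumented for SRM access.
 * The MPI_File_srm_* calls come from the slide's skeleton; their exact
 * signatures and the "srm_file_list" info key are assumptions. */
#include <mpi.h>
#include "psrm.h"   /* library header named on the Objectives slide */

int main(int argc, char **argv)
{
    MPI_Info   info;
    MPI_File   fh;
    MPI_Status status;
    char       buf[1024];

    MPI_Init(&argc, &argv);

    /* Tell the library which files will be accessed, and in what order
     * (hypothetical key/value; the real keys are not shown). */
    MPI_Info_create(&info);
    MPI_Info_set(info, "srm_file_list", "srm://host/path/filelist.txt");

    /* Start the SRM client side of the library. */
    MPI_File_srm_proxy_init(MPI_COMM_WORLD, info);

    /* srm_open: use the local copy if present, otherwise ask the
     * SRM server to stage the file from mass storage (e.g., HPSS). */
    MPI_File_srm_open(MPI_COMM_WORLD, "srm://host/path/data.bin",
                      MPI_MODE_RDONLY, info, &fh);

    /* Ordinary MPI-IO from here on. */
    MPI_File_read_at(fh, 0, buf, sizeof buf, MPI_BYTE, &status);

    /* srm_close "releases" the file so its space can be reused. */
    MPI_File_srm_close(&fh);
    MPI_File_srm_proxy_close();

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```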

  6. Future Work
  • Ready for use in real applications processing massively large datasets stored as multiple files on HPSS
  • Control of pre-fetching of file bundles
    • Now: files are processed in sequence order only
    • Future: any order, based on availability or on specified bundles
  • Multi-site file access: access files from multiple sites in the same session
  • File usage progress monitor: view progress over the Internet
  • Scaling from clusters to support applications on MPPs
    • Extend to other parallel file systems: GPFS and Lustre
    • The challenge is operating an SRM accessible from the MPP
