1 / 27

Milestone 5.1: Initial POSIX Function Shipping Demonstration

Milestone 5.1: Initial POSIX Function Shipping Demonstration. Jerome Soumagne, Quincey Koziol 09/24/2013. Overview – Mercury. Mercury “Function Shipper”: RPC layer that supports Non-blocking transfers Large data arguments (w/RMA) Native transport protocols of HPC systems

vevay
Download Presentation

Milestone 5.1: Initial POSIX Function Shipping Demonstration

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Milestone 5.1: Initial POSIX Function Shipping Demonstration Jerome Soumagne, QuinceyKoziol 09/24/2013

  2. Overview – Mercury • Mercury “Function Shipper”: RPC layer that supports • Non-blocking transfers • Large data arguments (w/RMA) • Native transport protocols of HPC systems • Mercury serves as a basis for higher-level frameworks that need to operate on/store/access data remotely • HDF5 IOD virtual object plugin • IOFSL I/O forwarding scalability layer • Storage systems • Analysis frameworks

  3. Overview – Mercury • Already largely presented in previous milestones • No major modification of Mercury for this deliverable in order to support POSIX calls • But Mercury is still being improved: • Performance tuning on Infinibandcluster • Support for additional network transports is being added (TCP / ibverbs / SSM) • Paper submitted at end of Q4 now accepted and being presented at IEEE Cluster 2013: • J. Soumagne, D. Kimpe, J. Zounmevo, M. Chaarawi, Q.Koziol, A. Afsahi, and R. Ross, “Mercury: Enabling Remote Procedure Call for High-Performance Computing”, IEEE International Conference on Cluster Computing, Sep 2013

  4. Fast Forward Stack – Function Shipping HDF5 API VOL Native (H5) IOD VOL Network VFL Mercury (Client) Mercury (Server) IOD VOL …

  5. POSIX Function Shipping (Example) HDF5 API VOL Native (H5) IOD VOL VFL POSIX I/O Network Mercury (Client) Mercury (Server) sec2 POSIX I/O POSIX I/O File System

  6. Mercury POSIX • Support POSIX I/O routines through Mercury • Completely separate package built on top of Mercury called: Mercury POSIX (lightweight library + server) • Design keys: • Support 32/64 bit platforms and large files • No modification of original source code that uses POSIX I/O (e.g., HDF5 sec2 driver) • Redirects I/O to Mercury server with dynamic linking • Can make use of all the transports available through Mercury (although MPI dynamic connection is not really flexible and always available) • Code for supporting POSIX routine is automatically generated inside Mercury POSIX by using BOOST preprocessor macros

  7. Mercury POSIX – Stub Generation • Most routines are generated with one line macro • Built on top of existing Mercury/Boost macros • However supporting variable arguments routines requires some extra lines to create encoding / decoding routines that check argument flags etc

  8. Mercury POSIX – Stub Generation • Two main macros: /* Non-bulk routines */ MERCURY_POSIX_GEN_STUB(func_name, ret_type, in_types, out_types) /* Bulk routines */ MERCURY_POSIX_GEN_BULK_STUB(func_name, ret_type, in_types, out_types, bulk_read)/* 1/0 if reading/writing bulk data */

  9. Mercury POSIX – Stub Generation • Example, showing results of the following macro: /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), )

  10. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate input structure typedefstruct { hg_int32_t in_param_0; hg_off_t in_param_1; hg_int32_t in_param_2; } lseek_in_t;

  11. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate proc routine for input structure static __inline__ int hg_proc_lseek_in_t(hg_proc_tproc, void *data) { lseek_in_t *struct_data = (lseek_in_t *) data; hg_proc_hg_int32_t(proc, &struct_data->in_param_0); hg_proc_hg_off_t(proc, &struct_data->in_param_1); hg_proc_hg_int32_t(proc, &struct_data->in_param_2); return ret; }

  12. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate output structure typedefstruct { hg_off_t ret; } lseek_out_t;

  13. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate proc routine for output structure static __inline__ int hg_proc_lseek_out_t(hg_proc_tproc, void *data) { lseek_out_t *struct_data = (lseek_out_t *) data; hg_proc_hg_int64_t(proc, &struct_data->ret); return ret; }

  14. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate client stub (simplified version) hg_off_t lseek(hg_int32_t in_param_0, hg_off_t in_param_1, hg_int32_t in_param_2) { lseek_in_tin_struct; lseek_out_tout_struct; hg_off_t ret; /* Initialization */ ...

  15. Mercury POSIX – Stub Generation /* Register function if not registered */ MERCURY_REGISTER("lseek", lseek_in_t, lseek_out_t); /* Fill input structure */ in_struct.in_param_0 = in_param_0; in_struct.in_param_1 = in_param_1; in_struct.in_param_2 = in_param_2; /* Forward call to remote addr and get a new request */ HG_Forward(addr, id, &in_struct, &out_struct, &request); /* Wait for call to be executed */ HG_Wait(request, HG_MAX_IDLE_TIME, &status); /* Get output parameters */ ret = out_struct.ret; return ret; }

  16. Mercury POSIX – Stub Generation /* off_tlseek(intfildes, off_t offset, int whence) */ MERCURY_POSIX_GEN_STUB(lseek, hg_off_t, (hg_int32_t)(hg_off_t)(hg_int32_t), ) • Generate server stub (simplified version) static int lseek_cb(hg_handle_t handle) { lseek_in_tin_struct; lseek_out_tout_struct; hg_int32_t in_param_0; hg_off_t in_param_1; hg_int32_t in_param_2; hg_off_t ret;

  17. Mercury POSIX – Stub Generation /* Get input buffer */ HG_Handler_get_input(handle, &in_struct); /* Get parameters */ in_param_0 = in_struct.in_param_0; in_param_1 = in_struct.in_param_1; in_param_2 = in_struct.in_param_2; /* Call function */ ret = lseek (in_param_0, in_param_1, in_param_2); /* Fill output structure */ out_struct.ret = ret; /* Free handle and send response back */ HG_Handler_start_output(handle, &out_struct); }

  18. Mercury POSIX • Routines currently supported:

  19. Mercury POSIX • Routines not yet supported:

  20. Mercury POSIX - Configuration • Environment variables required: • MERCURY_NA_PLUGIN: Underlying network transport method used to forward calls to remote servers. • e.g., "bmi” • MERCURY_PORT_NAME: Port name information (IP/port) specific to the network transport chosen – used to establish a connection with a remote server. • e.g., "tcp://72.36.68.242:22222” • LD_PRELOAD: Path to Mercury POSIX shared library. • e.g., “/usr/local/lib/libmercury_posix.so” • Setting LD_PRELOAD redirects all POSIX calls to the Mercury server (can be an issue with local scripts, etc. that make use of POSIX I/O)

  21. Mercury POSIX - Testing • Integrated regression tests (limited POSIX test suite) • HDF5 sec2 driver (demo) • Lustre POSIX test suite • However: framework issues, needs to be modified, possibly need to support fdopen and FILE*routines?

  22. Demo – Mercury POSIX and HDF5 tools $ pwd ~jsoumagne/demo $ ls *.h5 ls: *.h5: No such file or directory $ export MERCURY_NA_PLUGIN=“bmi” $ export MERCURY_PORT_NAME=“tcp://127.0.0.1:22222” $ export LD_PRELOAD=/path/to/libmercury_posix.so $ pwd ~jsoumagne/demo_server $ ls coord.h5 $ mercury_posix_serverbmi Waiting for client...

  23. Demo – Mercury POSIX and HDF5 tools $ h5dump -H coord.h5 HDF5 "coord.h5" { GROUP "/" { DATASET "multiple_ends_dset" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 4, 5, 3, 4, 2, 3, 6, 2 ) / ( 4, 5, 3, 4, 2, 3, 6, 2 ) } } DATASET "multiple_ends_dset_chunked" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 4, 5, 3, 4, 2, 3, 6, 2 ) / ( 4, 5, 3, 4, 2, 3, 6, 2 ) } } DATASET "single_end_dset" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 2, 3, 6, 2 ) / ( 2, 3, 6, 2 ) } ... (skip) $ mercury_posix_serverbmi Waiting for client... Thu, 19 Sep 13 17:31:00 CDT: Executing open64 Thu, 19 Sep 13 17:31:00 CDT: Executing __fxstat64 Thu, 19 Sep 13 17:31:00 CDT: Executing lseek64 Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:31:00 CDT: Executing lseek64 Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:31:00 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:31:00 CDT: Executing getcwd ... (skip)

  24. Demo – Mercury POSIX and HDF5 tools $ h5copy -i coord.h5 -s single_end_dset -o coord_simple.h5 -d simple Thu, 19 Sep 13 17:33:51 CDT: Executing open64 Thu, 19 Sep 13 17:33:51 CDT: Executing open64 ... (skip) Thu, 19 Sep 13 17:33:51 CDT: Executing __fxstat64 Thu, 19 Sep 13 17:33:51 CDT: Executing lseek64 Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:33:51 CDT: Executing lseek64 Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_read ... (skip) Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_write Thu, 19 Sep 13 17:33:51 CDT: Executing hg_posix_write Thu, 19 Sep 13 17:33:51 CDT: Executing close

  25. Demo – Mercury POSIX and HDF5 tools $ h5dump coord_simple.h5 HDF5 "coord_simple.h5" { GROUP "/" { DATASET "simple" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 2, 3, 6, 2 ) / ( 2, 3, 6, 2 ) } DATA { (0,0,0,0): 0, 1, (0,0,1,0): 1, 2, ... (skip) (1,2,2,0): 122, 123, (1,2,3,0): 123, 124, (1,2,4,0): 124, 125, (1,2,5,0): 125, 126 } } } } Thu, 19 Sep 13 17:36:57 CDT: Executing open64 Thu, 19 Sep 13 17:36:57 CDT: Executing __fxstat64 Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64 Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64 Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read ... (skip) Thu, 19 Sep 13 17:36:57 CDT: Executing lseek64 Thu, 19 Sep 13 17:36:57 CDT: Executing hg_posix_read Thu, 19 Sep 13 17:36:57 CDT: Executing close

  26. Conclusion – Future Work • Very easy to forward POSIX I/O calls and does not require modification of existing tools / code • Mercury POSIX can be easily extended to support additional system / library calls • Can directly take advantage of updates to Mercury (network transports, etc.) • Next Quarter: • Support remaining POSIX routines • Test with MPI I/O (ROMIO driver) • Test with Lustre POSIX test suite • If framework issues are solved

  27. Questions

More Related