
Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen. Oct. 15, 2013 @ U-REaSON Seminar

Data-Intensive Scalable Computing Laboratory (DISCL). Locality-driven High-level I/O Aggregation for Processing Scientific Datasets.


Presentation Transcript


  1. Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen • Oct. 15, 2013 @ U-REaSON Seminar • Data-Intensive Scalable Computing Laboratory (DISCL) • Locality-driven High-level I/O Aggregation for Processing Scientific Datasets

  2. Introduction • Scientific simulations now generate a few terabytes (TB) of data in a single run, and data sizes are expected to reach petabytes (PB) in the near future. • Example: VPIC (Vector Particle in Cell), plasma physics, 26 bytes per particle, 30 TB • Accessing and analyzing the data reveals poor I/O performance caused by the mismatch between logical access and physical layout.

  3. Introduction • Scientific Datasets and Scientific I/O Libraries • PnetCDF, HDF5, ADIOS • [Figure: software stack, PnetCDF on top of MPI-IO on top of parallel file systems] • Scientific I/O libraries allow users to specify array-based logical input • Logical-physical mismatch
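To make the logical-physical mismatch on this slide concrete, here is a minimal Python sketch. It assumes the array is stored contiguously in row-major order, which is an illustrative assumption and not something the slides state about any particular file format. The function name `subarray_extents` and the default element size are hypothetical. It shows how one array-based request (a start and a length per dimension) turns into many separate byte ranges in the file.

```python
from itertools import product

def subarray_extents(dims, start, length, elem_size=4):
    """Return (byte_offset, byte_length) runs covering a sub-array.

    Assumes a contiguous row-major layout (hypothetical, for illustration).
    Adjacent runs are not merged.
    """
    # Row-major strides in elements: the last dimension varies fastest.
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]

    extents = []
    # One contiguous run per index combination in all but the last dimension.
    outer = [range(s, s + n) for s, n in zip(start[:-1], length[:-1])]
    for idx in product(*outer):
        first = sum(i * st for i, st in zip(idx, strides[:-1])) + start[-1]
        extents.append((first * elem_size, length[-1] * elem_size))
    return extents

# A 2x3 block inside a 4x6 array of 4-byte elements needs two separate runs.
print(subarray_extents((4, 6), (1, 2), (2, 3)))
```

A request that looks like a single box in logical space is therefore a scattered set of file ranges, and the number of ranges grows with the product of the outer lengths.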

  4. Motivation • I/O methods in scientific I/O libraries (PnetCDF, ADIOS, HDF5): • Independent I/O: collaboration among processes: no; collaboration among calls: no • Collective I/O: collaboration among processes: yes; collaboration among calls: no • Nonblocking I/O: collaboration among processes: yes; collaboration among calls: yes

  5. Motivation • [Figure: two-phase collective I/O for calls Call0, Call1, …, Calli, each call with its own aggregators (ag00 to ag03, ag10 to ag13, …, agi0 to agi3)] • Contention on the storage servers when aggregation is not aware of locality

  6. Performance with Overlapping Calls • [Figure: performance with overlapping calls] • Conclusion: overlap among calls should be removed

  7. Idea: High-level I/O Aggregation • [Figure: logical input, decomposition, physical layout] • Call0: start{0,0,0} length{100,200,200}, decomposed into sub0: start{0,0,0} length{100,200,100} and sub1: start{0,0,100} length{100,200,100} • Call1: start{10,20,100} length{10,300,400}, decomposed into sub2: start{10,20,100} length{10,150,400} and sub3: start{10,170,100} length{10,150,400}
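The decomposition on this slide can be reproduced with a short Python sketch. The helper `split_request` is hypothetical, and the split dimension and boundary are picked by hand here so that the output matches the slide's numbers. In HiLa the boundaries would come from the physical layout, which the transcript does not spell out.

```python
def split_request(start, length, dim, boundary):
    """Split a (start, length) request along `dim` at absolute index `boundary`.

    Returns one piece if the boundary does not fall strictly inside the request.
    """
    lo, hi = start[dim], start[dim] + length[dim]
    if not lo < boundary < hi:
        return [(tuple(start), tuple(length))]

    first_len = list(length)
    first_len[dim] = boundary - lo      # part before the boundary
    second_start = list(start)
    second_start[dim] = boundary        # second part begins at the boundary
    second_len = list(length)
    second_len[dim] = hi - boundary     # part after the boundary
    return [(tuple(start), tuple(first_len)),
            (tuple(second_start), tuple(second_len))]

# Call0 split along the last dimension at index 100.
print(split_request((0, 0, 0), (100, 200, 200), dim=2, boundary=100))
# Call1 split along the middle dimension at index 170.
print(split_request((10, 20, 100), (10, 300, 400), dim=1, boundary=170))
```

Each call yields two sub-arrays whose lengths along the split dimension add up to the original length, which is the invariant any decomposition has to keep.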

  8. Idea: High-level I/O Aggregation • Basic Idea • Identify the overlap among requests • Eliminate the overlap before doing I/O • Challenges • How to decompose the requests • How to aggregate the sub-arrays at a high level
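As an illustration of the first bullet, the following Python sketch computes the overlap of two array requests in logical index space. It is only a simplified stand-in: the function `overlap` is hypothetical, and HiLa as described in this talk reasons about the physical layout, which this sketch ignores. The two requests are Call0 and Call1 from the previous slide.

```python
def overlap(a, b):
    """Intersection of two (start, length) requests, or None if they are disjoint."""
    start, length = [], []
    # Walk the dimensions of both requests in lockstep.
    for (sa, la), (sb, lb) in zip(zip(*a), zip(*b)):
        lo, hi = max(sa, sb), min(sa + la, sb + lb)
        if lo >= hi:
            return None  # no common index in this dimension
        start.append(lo)
        length.append(hi - lo)
    return tuple(start), tuple(length)

call0 = ((0, 0, 0), (100, 200, 200))
call1 = ((10, 20, 100), (10, 300, 400))
print(overlap(call0, call1))
```

For these two calls the shared region is start{10,20,100} length{10,180,100}, that is 180,000 elements that would be requested twice if the calls were served separately.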

  9. HiLa: High-Level I/O Aggregation • Determining the physical layout • Sub-correlation function • Sub-correlation set • Lustre striping: stripe size t; stripe count l • Dataset: dimension d; subset size m
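The striping parameters on this slide can be tied to storage servers with a small Python sketch. It assumes plain round-robin striping, where stripe k of a file lands on server k mod l, and it numbers servers relative to the file's own layout. The function names are hypothetical, and this is not the sub-correlation function from the talk, whose definition does not appear in the transcript.

```python
def stripe_server(offset, stripe_size, stripe_count):
    """Relative server index holding the byte at `offset` (round-robin assumption)."""
    return (offset // stripe_size) % stripe_count

def servers_for_extent(offset, nbytes, stripe_size, stripe_count):
    """Sorted relative server indices touched by the byte range [offset, offset + nbytes)."""
    first = offset // stripe_size
    last = (offset + nbytes - 1) // stripe_size
    # After stripe_count consecutive stripes every server has been visited once.
    last = min(last, first + stripe_count - 1)
    return sorted({s % stripe_count for s in range(first, last + 1)})

t = 1 << 20   # stripe size t: 1 MiB (example value)
l = 4         # stripe count l (example value)
print(stripe_server(5 * t, t, l))
print(servers_for_extent(t - 1, 2, t, l))
```

Two sub-arrays whose byte ranges map to the same server index are candidates for being served together, which is the kind of locality information the aggregation step needs.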

  10. HiLa Algorithm: Prior Step • Calculate the sub-correlation set (one-time analysis)

  11. HiLa Algorithm: Decomposition • Main steps: request decomposition and aggregation

  12. Improvement with HiLa • Performance improved with HiLa

  13. Improvement with HiLa • FASM improved with HiLa

  14. Conclusion and Future Work • Conclusion • The mismatch between logical access and physical layout can lead to poor performance. • We propose a locality-driven high-level aggregation approach (HiLa) that improves the existing I/O methods by eliminating the overlap among sub-array requests. • Future Work • Apply to write operations • Integrate with file systems

  15. Locality-driven High-level I/O Aggregation for Processing Scientific Datasets • Thanks • Q&A • http://discl.cs.ttu.edu
