Hasan Abbasi Matthew Wolf Jay Lofstead Fang Zheng Greg Eisenhauer Karsten Schwan



Presentation Transcript


  1. Analyzing large data sets quickly
     Hasan Abbasi, Matthew Wolf, Jay Lofstead, Fang Zheng, Greg Eisenhauer, Karsten Schwan, Scott Klasky, Ron Oldfield, Norbert Podhorszki

  2. HPC project thrusts
     • Staging
     • ADIOS
     • Adaptive I/O
     • PreDatA
     • EnStage

  3. Staging
     • Use additional resources in the compute node
     • ADIOS staging method, included in the version 1.2 release
     • High performance asynchronous output
     • State-aware schedulers for limiting interference
     Abbasi, H., Wolf, M., Eisenhauer, G., Klasky, S., Schwan, K., and Zheng, F. 2009. "DataStager: Scalable Data Staging Services for Petascale Applications." In Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing (Garching, Germany, June 11-13, 2009). HPDC '09.
     Hasan Abbasi, Jay Lofstead, Fang Zheng, Scott Klasky, Karsten Schwan, Matthew Wolf. "Extending I/O through High Performance Data Services." Cluster Computing 2009, New Orleans, LA. August 2009.
     Julian Cummings, Alexander Sim, Arie Shoshani, Jay Lofstead, Karsten Schwan, Ciprian Docan, Manish Parashar, Scott Klasky, Norbert Podhorszki and Roselyne Barreto. "EFFIS: An End-to-end Framework for Fusion Integrated Simulation." PDP 2010, the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, February 2010, Pisa, Italy.

  4. DataStager Architecture

  5. EnStage
     • Extends the staging concept to allow computation within the application
     • Used C-o-D with binary code generation to flexibly move computation
     • Global operations are performed in the staging area
     • Feature extraction and function specialization enable pseudo-collective operations in independent SmartTaps

  6. Marshalling intelligently

  7. Marshalling intelligently

  8. Customizations
     • Copy-based sampling (ST1): The ADIOS buffer is created and then subsampled
     • Inline sampling (ST2): The data is subsampled as the buffer is created
     • Staging area sampling (C-Stager): The data is subsampled in the staging area
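
The three sampling placements above can be contrasted with a minimal sketch. Everything here is an illustrative assumption rather than the ADIOS or DataStager API: the function names are hypothetical, plain Python lists stand in for the ADIOS buffer, and a fixed stride stands in for whatever sampling rule the application actually uses.

```python
def copy_based_sample(particles, stride):
    """ST1-style (assumed): build the full output buffer, then subsample it."""
    buffer = list(particles)      # the complete buffer exists on the compute node
    return buffer[::stride]       # a second pass keeps every stride-th element


def inline_sample(particles, stride):
    """ST2-style (assumed): subsample while the buffer is being built."""
    buffer = []
    for index, particle in enumerate(particles):
        if index % stride == 0:   # only selected elements are ever copied
            buffer.append(particle)
    return buffer


def staging_area_sample(particles, stride):
    """C-Stager-style (assumed): ship everything, subsample after the transfer."""
    shipped = list(particles)     # stands in for moving data to the staging area
    return shipped[::stride]      # the staging side does the subsampling
```

All three return the same samples; what differs is where the full copy lives and who pays for the extra pass. The copy-based variant holds the whole buffer on the compute node, the inline variant never materializes it, and the staging variant moves the full data set before reducing it.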

  9. Customizations
     • Only Data Output: Output data without any subsampling
     • Tagged Data Output: Calculate statistical characteristics for output data
     • Bounding Box (ST1): Output particles within a bounding box using copy-based marshalling
     • Bounding Box (ST2): Same as ST1, but use inline subsampling
     • Bounding Box (C-Stager): Same as ST1, but subsample in the staging area
     • Statistical subsample: Use the Tagged Data Output on the compute node and reduce the tags in the staging area to specialize a data reduction function using global characteristics
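
The bounding-box and statistical-subsample customizations can be sketched as follows. This is a rough illustration under stated assumptions, not the EnStage implementation: the tag contents (count, min, max, sum), the helper names, and the "keep values above the global mean" rule are all hypothetical choices made to show the tag, reduce, and specialize steps.

```python
from functools import reduce


def in_bounding_box(point, lower, upper):
    """Return True if every coordinate of point lies within [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


def tag_chunk(values):
    """Compute-node side: summarize one chunk of output data as a tag."""
    return {"count": len(values), "min": min(values),
            "max": max(values), "sum": sum(values)}


def merge_tags(a, b):
    """Staging-area side: combine two tags into one."""
    return {"count": a["count"] + b["count"],
            "min": min(a["min"], b["min"]),
            "max": max(a["max"], b["max"]),
            "sum": a["sum"] + b["sum"]}


def specialize_filter(global_tag):
    """Build a reduction function from the global characteristics.

    Assumed rule for illustration: keep values above the global mean.
    """
    mean = global_tag["sum"] / global_tag["count"]

    def keep(value):
        return value > mean

    return keep


# Usage: two compute nodes tag their chunks, the staging area reduces the tags.
chunks = [[1.0, 2.0, 3.0], [10.0, 20.0]]
global_tag = reduce(merge_tags, [tag_chunk(c) for c in chunks])
keep = specialize_filter(global_tag)
selected = [v for chunk in chunks for v in chunk if keep(v)]
```

Only the small tags cross between nodes before the filter is built, which is what lets independent nodes behave as if they had taken part in a collective operation.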
