
Improving the Scalability of Comparative Debugging with MRNet



Presentation Transcript


  1. Improving the Scalability of Comparative Debugging with MRNet • Jin Chao, MeSsAGE Lab (Monash Uni.) • Cray Inc.: Luiz DeRose, Robert Moench, Andrew Gontarek • MeSsAGE Lab (Monash Uni.): David Abramson, Minh Ngoc Dinh, Jin Chao

  2. Outline • Assertion-based comparative debugging • The architecture of Guard • Improving the scalability of Guard with MRNet • Performance evaluation • Conclusion and future work

  3. Assertion-based Comparative Debugging

  4. Challenge faced by parallel debugging (1) • Cognitive challenge • Large datasets on popular supercomputers: • Cray Blue Waters: • > 1.5 PB aggregate memory • > 380,000 CPU cores • A large set of scientific data that is not human-readable: • Multi-dimensional • Floating-point • What is the correct state?

  5. Challenge faced by parallel debugging (2) • Control-flow based parallel debuggers • Limitations of visualization tools • Errors in visualized presentations are hard to detect

  6. Comparative debugging • Data-centric debugging: focusing on the data sets in parallel programs • Comparing state between programs • Porting codes across different platforms • Re-writing codes in different languages • Software evolution: modifying/improving existing codes • Comparing state with users’ expectations • Invariants based on the properties of scientific modelling or mathematical theories • “Verifying” the correctness of the computation’s state

  7. Assertion for Comparative debugging “An assertion is a statement about an intended state of a system’s component.” • Assertion in Guard: • An ad hoc debug-time assertion • Simple assertion: • assert $a::p@“source.c”:33 = 5000

  8. Assertion for Comparative debugging “An assertion is a statement about an intended state of a system’s component.” • Assertion in Guard: • An ad hoc debug-time assertion • Simple assertion: assert $a::var@“source.c”:34 = 1024 • Comparative assertions • Observing the divergence in the key data structures as the programs execute assert $a::p_array@“prog1.c”:34 = $b::p_array@“prog2.c”:37 • Statistical assertions • Verifying the statistical properties of scientific modelling or mathematical theories: • standard deviation • histogram

  9. Example of statistical assertion: histogram • Histogram: user-defined abstract data model • Creating the model with a two-phase operation • create $model=randset (Gaussian, 100000, 0.05) • set reduce histogram(1000, 0.0, 1.0) • assert $a::my_array@code.c:10~$model < 0.02 • estimate operator: ~ • χ² goodness-of-fit test
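
A note on the test behind the estimate operator (the standard chi-squared goodness-of-fit statistic, not a Guard-specific formula): with B histogram bins, observed counts O_i computed from the debugged array and expected counts E_i taken from the reference model $model,

    \chi^2 = \sum_{i=1}^{B} \frac{(O_i - E_i)^2}{E_i}

and the assertion is judged against the supplied threshold (0.02 in the example above); how Guard normalizes the statistic before comparing it with the threshold is not spelled out on the slide.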

  10. Implementation of assertions • An assertion is compiled into a dataflow graph and executed on a dataflow machine • Example: set reduce “sum”; assert $a::p_array@“prog.c”:34 > 1,000,000 • [Dataflow graph: PROCSET($a) → Set BKPT → Go → BKPT Hit → Get VAR → Reduce (sum of p_array) → Compare with 1,000,000 → Exit]
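
To make the dataflow-machine idea concrete, here is a small self-contained C++ sketch (illustrative only, not Guard's implementation): a node fires once all of its input tokens have arrived and forwards its result downstream, which is how the Reduce and Compare stages above would be chained.

    #include <cstdio>
    #include <functional>
    #include <vector>

    // A dataflow node fires when all of its expected input tokens have arrived.
    struct Node {
        int pending;                                           // tokens still missing
        std::function<double(const std::vector<double>&)> op;  // action on firing
        std::vector<Node*> consumers;                          // downstream nodes
        std::vector<double> inputs;

        void deliver(double token) {
            inputs.push_back(token);
            if (--pending == 0) {                              // all inputs present: fire
                double out = op(inputs);
                for (Node *c : consumers) c->deliver(out);
            }
        }
    };

    int main() {
        // Compare node: receives the reduced sum and checks the asserted bound.
        Node compare{1, [](const std::vector<double>& in) {
            std::printf(in[0] > 1000000.0 ? "assertion holds\n" : "assertion fails\n");
            return in[0];
        }, {}, {}};

        // Reduce node: sums the values gathered from two (mock) debug servers.
        Node reduce{2, [](const std::vector<double>& in) {
            return in[0] + in[1];
        }, {&compare}, {}};

        reduce.deliver(600000.0);   // "Get VAR" result from server s0
        reduce.deliver(500000.0);   // result from server s1: Reduce fires, then Compare
        return 0;
    }

In Guard the tokens carry whole distributed data sets and the Get VAR and Reduce stages run across all debug servers in $a, but the firing rule is the same.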

  11. Dataflow graphs

  12. The Architecture of Guard

  13. Features of Guard (CCDB) • A general parallel debugger • Supporting C, FORTRAN and MPI • A relative debugger • Comparative assertions and statistical assertions • Client/Server structure: • Machine-independent command line client • Visual Studio Client for Windows • SUN One Studio and IBM Eclipse • Supporting servers on different architectures: • Unix: SUN (Solaris), x86 (Linux), IBM RS6000 (AIX) • Windows: Visual Studio .NET • Architecture Independent Format (AIF)

  14. The architecture of Guard • [Architecture diagram: the front-end (CLI, relative debugger, debug client) connects over a socket network to debug servers s0, s1, s2, …, sn; each back-end server drives a GDB instance attached to an application process p0, p1, p2, …, pn]

  15. Features of MRNet • Tree-Based Overlay Networks (TBONs) • Scalable broadcast and gather • Custom data aggregation

  16. General purpose API of MRNet • User-defined tree topology • Topology file: k-ary, k-nomial and tailored layouts • Communicator • A set of back-ends • Stream • A logical data channel over a communicator • Multicast, gather and custom reduction • Packet • Collection of data • Filter • Modifies data transferred across it • Synchronization modes: WaitForAll, TimeOut, NoWait • Startup: • launching back-end processes
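
A minimal front-end sketch of that API, patterned on MRNet's public C++ interface and its integer-addition example; exact signatures can differ between MRNet versions, and the topology file name and back-end executable below are placeholders.

    #include "mrnet/MRNet.h"
    using namespace MRN;

    int main() {
        const char *topology_file = "topology.top";    // user-defined tree layout
        const char *backend_exe   = "./debug_server";  // launched on every leaf
        const char *be_argv       = NULL;

        // Instantiate the tree: MRNet starts the internal and back-end processes.
        Network *net = Network::CreateNetworkFE(topology_file, backend_exe, &be_argv);

        // Communicator naming all back-ends, and a stream over it that sums
        // integer replies upward and waits for every back-end to answer.
        Communicator *comm = net->get_BroadcastCommunicator();
        Stream *stream = net->new_Stream(comm, TFILTER_SUM, SFILTER_WAITFORALL);

        int tag = 100;                   // real code would use a tag above MRNet's reserved range
        stream->send(tag, "%d", 1);      // multicast a request down the tree
        stream->flush();

        PacketPtr pkt;
        if (stream->recv(&tag, pkt) == 1) {   // one aggregated packet comes back
            int sum = 0;
            pkt->unpack("%d", &sum);          // value reduced across all back-ends
        }

        delete net;                      // tears the tree down
        return 0;
    }

Guard drives this C++ interface through a C wrapper, as the architecture slide below shows.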

  17. Improving the scalability of Guard with MRNet

  18. The architecture of Guard with MRNet • [Architecture diagram: the front-end hosts the command line interface, comparative debugger, debug client and C wrapper on top of the MRNet front-end; MRNet communication processes (CP) running the Guard filter form the interior of the tree; each leaf runs an MRNet back-end together with a debug server (s0 … sn) that drives GDB against an application process (p0 … pn)]

  19. Communication with MRNet • Communication patterns in Guard • General parallel debugger • Topology • Balanced, k-nomial • Placement of MRNet internal nodes • Requiring no additional resources • procset: • Communicator • Channels for commands, events and I/O • Command stream: synchronous channel • Event and I/O streams: asynchronous channels • Aggregating redundant messages: • WaitForAll filter: synchronous channel • TimeOut filter: asynchronous channel
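
A hypothetical sketch of that channel setup using MRNet's stream and synchronization filters; the filter library name guard_filter.so and the function guard_aggregate are illustrative placeholders, not Guard's actual filter.

    #include "mrnet/MRNet.h"
    using namespace MRN;

    void setup_channels(Network *net) {
        // The procset maps onto a communicator covering all debug servers.
        Communicator *procset = net->get_BroadcastCommunicator();

        // Command stream: synchronous. The WaitForAll filter holds replies until
        // every server has answered; a custom upstream filter (the "Guard filter")
        // can then merge identical messages into one.
        int guard_filter = net->load_FilterFunc("guard_filter.so", "guard_aggregate");
        Stream *cmd = net->new_Stream(procset, guard_filter, SFILTER_WAITFORALL);

        // Event and I/O streams: asynchronous. The TimeOut filter forwards whatever
        // has arrived when the window expires, so slow or silent servers do not
        // block the rest of the tree.
        Stream *events = net->new_Stream(procset, TFILTER_NULL, SFILTER_TIMEOUT);
        Stream *io     = net->new_Stream(procset, TFILTER_NULL, SFILTER_TIMEOUT);

        (void)cmd; (void)events; (void)io;   // handed off to the debugger elsewhere
    }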

  20. Creating a communication tree of MRNet • Invoking a debug session on Cray: • [Launch diagram: the Guard client and the MRNet front-end run on the front-end node; the back-end debug servers s0 … sn, each attaching GDB to an application process p0 … pn, are started on the compute nodes via aprun, apinit and a launch helper agent]
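
For reference, an MRNet topology file lists each parent followed by its children and a terminating semicolon; a small balanced layout (fan-out 2 here for brevity, where the evaluation used 64) might look like the sketch below, with placeholder hostnames.

    fe-host:0 =>
        cp-host1:0
        cp-host2:0 ;

    cp-host1:0 =>
        be-host1:0
        be-host2:0 ;

    cp-host2:0 =>
        be-host3:0
        be-host4:0 ;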

  21. Comparative assertion with MRNet (1) • assert $a::p_array@“prog1.c”:34 = $b::p_array@“prog2.c”:37 • [Diagram: the Guard client drives two MRNet trees, FE0 and FE1, one per program being compared; each tree has its own communication processes and its own set of back-end debug servers (s0 … sm and s0 … sn)]

  22. Comparative assertion with MRNet (2) • Centralized comparison: • [Diagram: the data fragments d0 … d3 held by the processes p0 … p3 of both programs are gathered through the two MRNet trees to the front-end, which reconstructs the full arrays and compares them]

  23. Data Reconstruction • [Diagram: p_array, distributed across processes P1 … P4, is reconstructed into a single sequential array s_array] • Currently supports regular decompositions • Assertions require the debugger to understand the data decomposition • Blockmap example: blockmap test(P::V) define distribute(block, *) define data(1024, 1024) end
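
As a concrete reading of the (block, *) example (standard block-distribution arithmetic, assuming the row count N = 1024 divides evenly by the process count P): process p owns the contiguous row range

    [\, p \cdot N/P, \; (p+1) \cdot N/P \,)

so a global element (i, j) lives on process \lfloor i \cdot P / N \rfloor at local row i \bmod (N/P); this is exactly the mapping the debugger applies when it reassembles the distributed p_array into the sequential s_array.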

  24. Comparative assertion with MRNet (3) • Point-to-point comparison: • [Diagram: matching data fragments d0 … d3 are compared between the corresponding back-end processes of the two programs, and only the comparison results r0 … r3 travel up the MRNet trees to the front-end]

  25. Statistical assertion with MRNet • Standard deviation: a two-phase operation • Parallel phase: each back-end calculates a set of primary statistics • Aggregation phase: the front-end forms the full statistical model • [Diagram: back-end servers s0 … sn send their partial statistics ps0 … psn up through the communication processes to the Guard client for aggregation]
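
One standard way to realize the two phases for the standard deviation (general statistics, not necessarily Guard's exact formulation): each server k reports only the primary statistics n_k, S_k = \sum_i x_i and Q_k = \sum_i x_i^2 over its local elements, and the front-end aggregates

    N = \sum_k n_k, \qquad \mu = \frac{1}{N} \sum_k S_k, \qquad \sigma = \sqrt{\frac{1}{N} \sum_k Q_k - \mu^2}

so only a handful of scalars per server travel up the tree instead of the full data set.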

  26. Performance Evaluation

  27. Performance Evaluation • The configuration of MRNet (3.1.0): • A balanced topology • Fan-out: 64 • Test bed: ‘Hera’ • Cray XE6 Gemini 1.2 system • 752 compute nodes, each with 32 GB of memory • 21,760 CPU cores in total • General parallel debugger: • SP (Scalar Penta-diagonal) from the NAS Parallel Benchmarks (NPB) 3.3.1 • Relative debugger: • Comparative assertion: data examples from WRF • Statistical assertion: molecular dynamics simulation

  28. Scalability of Invoke Command

  29. Latency of Debugging Commands

  30. Message Aggregation • A memory buffer of 256 bytes was added to the SP program. Its value was inspected under different degrees of aggregation (DoA).

  31. Comparative assertion • 320 KBytes × 10,000 ≈ 3 GB. The data sizes are taken from WRF, a production climate model.

  32. Statistical assertion • Molecular dynamics simulation: 209 × 10⁶ atoms • The atomistic mechanism of fracture accompanying structural phase transformation in AlN ceramic under hypervelocity impact.

  33. Conclusion and Future Work • MRNet improves the scalability of Guard • > 20,000 debug servers • The overhead of creating an MRNet tree • Attachment: • Instantiation: one BE process per core • How to take advantage of the computing capability of MRNet • Comparison across multiple trees? • Building a tree across multiple aprun invocations? • Programming filters to handle the aggregation of statistical assertions • The best way of maintaining the C wrapper for the C++ interface?

  34. Related publications • Dinh M.N., Abramson D., Jin C., Gontarek A., Moench R., and DeRose L., Debugging Scientific Applications With Statistical Assertions, International Conference on Computational Science (ICCS), Omaha, USA, 2012. • Dinh M.N., Abramson D., Jin C., Gontarek A., Moench R., and DeRose L., Scalable Parallel Debugging with Statistical Assertions, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) 2012, New Orleans, USA (poster session). • Jin C., Abramson D., Dinh M.N., Gontarek A., Moench R., and DeRose L., A Scalable Parallel Debugging Library with Pluggable Communication Protocols, 12th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2012), Ottawa, Canada, May 2012. • Dinh M.N., Abramson D., Kurniawan D., Jin C., Moench R., and DeRose L., Assertion Based Parallel Debugging, 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011), Newport Beach, USA, May 2011. • Abramson D., Dinh M.N., Kurniawan D., Moench B., and DeRose L., Data Centric Highly Parallel Debugging, 2010 International Symposium on High Performance Distributed Computing (HPDC 2010), Chicago, USA, June 2010. • Abramson D., Foster I., Michalakes J., and Sosic R., Relative Debugging and its Application to the Development of Large Numerical Models, Proceedings of IEEE Supercomputing 1995, San Diego, December 1995 (best paper award).

  35. Thanks and Questions?
