
OS, MESSAGE PASSING & RUNTIME TOOLS

Topics: parallel software promotion philosophy; OpenSource - how rosy is the promise?; MPI2 - what features?; RTS - how far parallel do we need to go?; how might OpenSource accelerate new tools? Panel comments by Mary Zosel, ASCI PSE / ASDE, LLNL.

Presentation Transcript


  1. OS, MESSAGE PASSING & RUNTIME TOOLS
     Parallel software promotion philosophy
     OpenSource - How rosy is the promise?
     MPI2 - What features?
     RTS - How far parallel do we need to go? How might OpenSource accelerate new tools?
     Panel Comments by Mary Zosel, ASCI PSE / ASDE, LLNL
     For Fourth Workshop on Distributed Supercomputers

  2. Philosophy - for promoting parallel simulation development environment
     • Standards - promote and encourage use
     • Set high platform software expectations
     • Software in procurements
     • ISV support gives portability and 2nd source
     • Keep academia involved
     • Need their ideas & need their students
     • Local prototypes where needed
     • Preferably partnership with commercial partner
     • Full local support only as last resort
     • It’s fun when new - but costly burden later
     So where in this picture does OpenSource fit? It facilitates academia and prototyping, but the support issue is a concern.
     UCRL-VG-137868

  3. OpenSource - Does it measure up to the promise?
     Disclaimer: I haven’t been actively involved in this area, but on second look it isn’t as promising as it first seems.
     There is a lot of good and successful OpenSource software, but there are also red flags …
     • One promising OpenSource tool we picked up relied so heavily on platform-specific “.h” files that we couldn’t make it build anywhere else.
     • Another OpenSource promise, for a key library we were counting on, evaporated.
     • The lawyers are still there - and source release isn’t easy.
     • All the usual GNU-software restriction issues …
     • Software-police issues will be interesting …

  4. MPI2 - What do the users need?
     • MPI-I/O
     • Thread safety … actually need more support than MPI2 gives us
     • Dynamic process control - starting to get queries about this
     • Language bindings
     • They say they want one-sided
     • Various “abstraction” features (info, error…)

  5. Runtime tools for 1000s of CPUs.
     • Yes - the users are asking for debugger support.
     • My code seems to be hung - what’s it doing?
     • My code is growing after a couple of hours - why?
     • Where is all my memory going and why?
     • (Similar set of questions for performance issues.)
     • Easy to provide? No …
     • Tool infrastructure needs to be designed for scalability
     • Obvious GUI and data presentation issues
     • User debug time ties up resources - another challenge
     • Access to resources for development - even “on-site”
     • But there are solutions in the works … e.g.
     • Variety of collapsing and filtering of data
     • Macros together with good CLI look promising

  6. [Screenshot captions]
     Original struct array
     Just the values of the “val1” struct member

  7. [Screenshot captions]
     Sorted array values
     Checksum of same array
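
The array views captioned on slides 6 and 7 (one struct member, sorted values, a checksum) can be illustrated with a minimal sketch. The slides do not say how the tool computes these, so everything here is an assumption: the `Cell` record and its fields are hypothetical, and CRC-32 merely stands in for whatever checksum the tool uses.

```python
import struct
import zlib
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical struct element; only the name val1 comes from the slide."""
    val1: float
    val2: int


def member_values(cells, name):
    """Pull a single member out of every element of a struct array."""
    return [getattr(cell, name) for cell in cells]


def checksum(values):
    """CRC-32 over the values packed as little-endian doubles (an assumed choice)."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return zlib.crc32(packed)


if __name__ == "__main__":
    cells = [Cell(3.0, 1), Cell(1.0, 2), Cell(2.0, 3)]
    vals = member_values(cells, "val1")
    print("val1 values:", vals)
    print("sorted:     ", sorted(vals))
    print("checksum:    %08x" % checksum(vals))
```

One number per task is far easier to compare across thousands of tasks than the arrays themselves, which is presumably why such reductions are useful at scale.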

  8. LCB View of task and thread state can be dumped any time the application is stopped. The color code tells how many processes are where.
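
A minimal sketch of the kind of collapsing described here: group tasks by where they are stopped, report a count per location, and compress the task ids into ranges. This is not the tool's implementation; the snapshot format and the routine names `compute_flux` and `read_input` are hypothetical.

```python
from collections import defaultdict


def collapse_ranks(ranks):
    """Compress task ids into a range string, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ranks = sorted(set(ranks))
    parts = []
    i = 0
    while i < len(ranks):
        j = i
        # Extend the run while the next id is consecutive.
        while j + 1 < len(ranks) and ranks[j + 1] == ranks[j] + 1:
            j += 1
        parts.append(str(ranks[i]) if i == j else f"{ranks[i]}-{ranks[j]}")
        i = j + 1
    return ",".join(parts)


def group_by_location(task_locations):
    """Map each location to (number of tasks there, collapsed task-id string)."""
    groups = defaultdict(list)
    for rank, where in task_locations.items():
        groups[where].append(rank)
    return {where: (len(r), collapse_ranks(r)) for where, r in groups.items()}


if __name__ == "__main__":
    # Hypothetical snapshot of a stopped 1000-task job.
    snapshot = {rank: "MPI_Barrier" for rank in range(1000)}
    snapshot[17] = snapshot[18] = "compute_flux"
    snapshot[512] = "read_input"
    summary = group_by_location(snapshot)
    for where, (count, ranks) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
        print(f"{count:5d}  {where:14s} {ranks}")
```

A thousand tasks reduce to three lines here, and the few stragglers stand out immediately, which is the question behind "my code seems to be hung".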

  9. [Screenshot captions]
     Root window collapsed
     Same Root window opened to show all tasks

  10. Can set any of the counters to any of its settings. Values by thread.
     Window menu: Set Counter 1, Set Counter 2, Set Counter 3, Set Counter 4, Activate Counters, Stop Counters, Update Counters, Zero Counters; Close Window, Close All Similar Windows, Save Window to File..., Reexecute Last Save Window, Help
     Counter settings: Nothing, MFLOPS, % branch mispredictions, L2 Data cache miss rate; CPU Cycles, Instructions Completed, Instruction Cache Misses, Integer Instructions Completed, Floating Instructions Completed, dtlb misses (not speculative), Branch Mispredictions; Time Base bit transition, Reservations requested
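
Some of the settings listed on this slide (MFLOPS, the percentage and rate entries) are derived from raw event counts. The slide does not give the tool's formulas, so the following is only a sketch under assumptions: it treats each completed floating-point instruction as one operation, and the function names are hypothetical.

```python
def mflops(fp_instructions, seconds):
    """Millions of floating-point instructions per second.

    Assumes one operation per completed floating-point instruction, which
    undercounts on hardware with fused multiply-add.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return fp_instructions / seconds / 1e6


def percent(part, whole):
    """Generic ratio in percent, e.g. mispredicted branches over all branches."""
    return 100.0 * part / whole if whole else 0.0


if __name__ == "__main__":
    # Hypothetical per-thread readings: (floating instructions completed, seconds).
    readings = {0: (250_000_000, 2.0), 1: (90_000_000, 2.0)}
    for thread, (fp, secs) in readings.items():
        print(f"thread {thread}: {mflops(fp, secs):.1f} MFLOPS")
```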

  11. [Screenshot captions]
     Info about which task is using max and min memory
     Memory info about all the tasks
     Can watch how (and which) tasks grow
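
The memory view captioned on this slide can be sketched as two small reductions over per-task samples: which task holds the minimum and maximum, and which tasks have grown. The sample format, threshold, and function names are assumptions for illustration, not the tool's interface.

```python
def memory_extremes(usage):
    """usage maps task id -> bytes in use; return (min_task, max_task)."""
    min_task = min(usage, key=usage.get)
    max_task = max(usage, key=usage.get)
    return min_task, max_task


def growing_tasks(samples, threshold):
    """samples maps task id -> sizes over time.

    Return the tasks whose growth from first to last sample exceeds
    threshold, largest growth first.
    """
    growth = {t: s[-1] - s[0] for t, s in samples.items() if len(s) >= 2}
    growers = [t for t, g in growth.items() if g > threshold]
    return sorted(growers, key=lambda t: -growth[t])


if __name__ == "__main__":
    # Hypothetical memory samples (MB) for three tasks over three snapshots.
    samples = {0: [100, 100, 101], 1: [100, 150, 220], 2: [90, 90, 90]}
    latest = {task: sizes[-1] for task, sizes in samples.items()}
    print("min/max tasks:", memory_extremes(latest))
    print("growing tasks:", growing_tasks(samples, threshold=10))
```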

  12. Will OpenSource help RTS tools? And how?
     The biggest barrier to more tools - especially from academia - is the lack of a standard interface to the parallel runtime environment: there is no easy way to attach to and communicate with a parallel job.
     If the parallel OpenSource community could come up with a (simple) scalable parallel control-daemon interface, that would be a big help in opening this area to development.
     There are several places interested in a “kit” of parallel-tools infrastructure components, but this item is the big drawback to portability.
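
To make the request concrete, here is a minimal sketch of what a tool-facing control interface might look like. It is purely hypothetical: no such standard is described in the slides, the class and method names are invented for illustration, and the in-memory `FakeJob` says nothing about how a real daemon would scale.

```python
from abc import ABC, abstractmethod


class ParallelControl(ABC):
    """Hypothetical interface a tool would use to reach a parallel job."""

    @abstractmethod
    def tasks(self):
        """Return the ids of all tasks in the job."""

    @abstractmethod
    def attach(self, task_ids):
        """Attach the tool to the given tasks."""

    @abstractmethod
    def send(self, task_ids, command):
        """Send a command to attached tasks; return {task_id: reply}."""


class FakeJob(ParallelControl):
    """In-memory stand-in so the interface can be exercised without a cluster."""

    def __init__(self, ntasks):
        self._states = {t: "running" for t in range(ntasks)}
        self._attached = set()

    def tasks(self):
        return sorted(self._states)

    def attach(self, task_ids):
        unknown = set(task_ids) - set(self._states)
        if unknown:
            raise KeyError(f"no such tasks: {sorted(unknown)}")
        self._attached.update(task_ids)

    def send(self, task_ids, command):
        replies = {}
        for t in task_ids:
            if t not in self._attached:
                raise RuntimeError(f"task {t} is not attached")
            if command == "stop":
                self._states[t] = "stopped"
            elif command == "continue":
                self._states[t] = "running"
            elif command != "status":
                raise ValueError(f"unknown command: {command}")
            replies[t] = self._states[t]
        return replies


if __name__ == "__main__":
    job = FakeJob(4)
    job.attach([0, 1])
    print(job.send([0, 1], "stop"))
    print(job.send([0], "status"))
```

A debugger or memory tool written against an interface like this would not need to know how each platform launches and addresses its tasks, which is the portability gap the slide describes.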
