
Middleware Activities from the Paradyn Project

Barton Miller, University of Wisconsin-Madison. Condor Week, May 2003.


Presentation Transcript


  1. Middleware Activities from the Paradyn Project
     Barton Miller, University of Wisconsin-Madison
     Condor Week, May 2003
     Multicast/Reduction Network

  2. Two Complementary Activities
     MRNet: a multicast/reduction infrastructure for distributed tools
     • Scalable: sizes to many thousands of nodes
     • High throughput, low latency
     TDP: a standard protocol for deploying run-time tools in a distributed environment
     • Too many job/process control environments
     • Too many run-time tools
     • The never-ending porting task is a fundamental barrier to tool availability.

  3. MRNet Overview
     • Problem: front-end centralization leads to poor scalability
       • Large fan-out
       • Front-end processing
       • Large data volumes
     • Goal: improve the scalability and efficiency of group communication
     [Figure: Tool Front End above nodes d0 … dn-1 and a0 … an-1]

  4. MRNet Overview
     • The Multicast/Reduction Network (MRNet) is developed as part of Paradyn’s scalability initiative.
     • MRNet provides scalable group communication and data aggregation.
     [Figure: Tool Front End, intermediate levels (…), and nodes d0 … dn-1 and a0 … an-1]
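
To make the aggregation idea concrete, here is a minimal sketch (not MRNet code) of reducing values through a tree instead of at a single front end. The function name `tree_reduce` and its parameters are hypothetical; the sketch assumes the reduction operation gives the same answer whether it is applied all at once or level by level (true for sum, max, and min).

```python
from typing import Callable, List, Sequence, Tuple


def tree_reduce(values: Sequence[int], fanout: int,
                op: Callable[[Sequence[int]], int]) -> Tuple[int, int]:
    """Reduce `values` level by level; no node ever combines more than `fanout` inputs.

    Returns (result, number_of_tree_levels_used).
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not values:
        raise ValueError("need at least one value")
    level: List[int] = list(values)
    depth = 0
    while len(level) > 1:
        # Each chunk models one internal node aggregating its children.
        level = [op(level[i:i + fanout]) for i in range(0, len(level), fanout)]
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    # 16 back-end values with fan-out 4: the front end sees 4 inputs, not 16.
    print(tree_reduce(list(range(16)), fanout=4, op=sum))
```

With a flat layout the front end would combine all n inputs itself; in the sketch no node handles more than `fanout` inputs, at the cost of extra tree levels.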

  5. Topologies
     There are many choices for multicast/reduction topologies:
     • Balanced vs. skewed trees
     • Fan-out
     • Co-locating communication and computation nodes vs. using separate nodes
     • Geographic placement
     MRNet accepts a separately generated topology file that specifies layout and interconnect.
     • We are agnostic to the above choices.
     • We provide a standard set of topology generators.
     • It is trivial for you to provide your own.
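
As an illustration of what a user-supplied topology generator might look like, the sketch below builds a balanced tree with a given fan-out and depth. It is an assumption-laden example: the function names and the `parent => children` text layout are hypothetical and are not MRNet's actual topology file format, which the slides do not show.

```python
from typing import Dict, List


def balanced_topology(fanout: int, depth: int) -> Dict[int, List[int]]:
    """Map each internal node id to its children in a complete `fanout`-ary tree.

    Node 0 is the front end; nodes are numbered breadth-first.
    """
    if fanout < 2 or depth < 1:
        raise ValueError("need fanout >= 2 and depth >= 1")
    internal = sum(fanout ** level for level in range(depth))
    return {node: [fanout * node + j for j in range(1, fanout + 1)]
            for node in range(internal)}


def leaves(topology: Dict[int, List[int]]) -> List[int]:
    """Nodes that appear as children but have no children of their own (the back-ends)."""
    children = {c for kids in topology.values() for c in kids}
    return sorted(children - set(topology))


def render(topology: Dict[int, List[int]]) -> str:
    """Hypothetical text layout: one 'parent => child child ...' line per internal node."""
    return "\n".join(f"{parent} => {' '.join(map(str, kids))}"
                     for parent, kids in sorted(topology.items()))


if __name__ == "__main__":
    print(render(balanced_topology(fanout=2, depth=2)))
```

A skewed tree or a placement-aware layout would only need a different function producing the same kind of mapping.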

  6. MRNet Internal Processes
     [Figure: Front-End and 16 back-ends (BE)]

  7. MRNet Communicators
     Communicators: group back-ends for communication
     [Figure: Front-End and 16 back-ends (BE)]

  8. MRNet Interface
     Tools link with libmrnet, a library that exposes the MRNet API. Abstractions include:
     • Network
       • Initialize/shut down the network
       • Access network end-points
     • End-points
     • Communicators
     • Streams
     [Figure: Front-End and 16 back-ends (BE)]

  9. MRNet Internal Processes
     Functional layers of MRNet internal processes.
     [Figure: layer diagram with labels Packet Batching/Unbatching, Transformation Filter, Data Encoding, Data Transformation Operation, Data Decoding, Synchronization Filter, Packet Batching/Unbatching]
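
The interplay of the synchronization and transformation filters named on this slide can be sketched as follows. This is an illustrative model only: the class name is hypothetical, and the "wait for every child" policy is an assumption about one possible synchronization mode, not a statement of how MRNet implements it.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Optional


class WaitForAllNode:
    """Toy internal process: hold packets until every child has reported, then aggregate."""

    def __init__(self, children: Iterable[Hashable],
                 transform: Callable[[List[int]], int]) -> None:
        self.children = frozenset(children)
        self.transform = transform
        self.pending: Dict[Hashable, int] = {}

    def receive(self, child: Hashable, value: int) -> Optional[int]:
        """Synchronization step: return None until one value per child has arrived."""
        if child not in self.children:
            raise KeyError(child)
        self.pending[child] = value
        if set(self.pending) != self.children:
            return None
        # Transformation step: combine the complete wave and forward it upstream.
        wave = [self.pending[c] for c in sorted(self.children)]
        self.pending = {}
        return self.transform(wave)


if __name__ == "__main__":
    node = WaitForAllNode(["be0", "be1", "be2"], max)
    for name, value in [("be0", 3), ("be1", 9), ("be2", 5)]:
        print(name, node.receive(name, value))
```

Separating "when to forward" from "what to forward" is the point of the sketch: a different synchronization policy could be swapped in without touching the aggregation function.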

  10. MRNet in Paradyn Start-up
      [Figure: Smg2000 on ASCI Blue Pacific]

  11. TDP: The Challenge
      Consider remote process management environments:
      • Condor, LSF, etc.
      • MPI
        • Portable MPI (such as MPICH)
        • Vendor-provided MPI (such as IBM, Compaq, Sun)
      • Globus
      Each of these environments needs to monitor and control the state of its application processes.

  12. Typical Process Manager
      Process manager:
      • Starts the remote job
      • Monitors its status
      • Controls the job
      • Sets up file I/O
      • Sets up standard I/O
      [Figure: Remote Host with a Remote Process Manager that monitors/controls two Application Processes]

  13. Typical Process Manager
      The run-time tool?
      • Also may want to start the process (or attach to it)
      • Also needs to monitor its status
      • Also may want to control the job
      • Needs to communicate with its front-end
      [Figure: Remote Host with Remote Process Manager and Tool Dæmon Process; monitor/control links to two Application Processes, marked "?"]

  14. Typical Process Manager
      So, who wins?
      [Figure: Remote Host with Remote Process Manager and Tool Dæmon Process; monitor/control links to two Application Processes, marked "?"]

  15. Typical Process Manager
      [Figure: Local Host with Tool Front-End Process; Remote Host with Remote Process Manager and Tool Dæmon Process; monitor/control links to two Application Processes, marked "?"]

  16. Current State of Affairs
      • Each process manager starts and controls processes in its own way.
        • E.g., even within MPI: IBM POE MPI, SGI Origin MPI, and MPICH all work differently. MPI has no standard process control!
      • Specialized cases of a specific tool working with a specific environment
        • E.g., the TotalView debugger working with MPICH.
      • The result is an m × n combination of m process managers and n tools.
      Bottom line: need a standard interface for process managers and tools to coexist: the Tool Dæmon Protocol (TDP).
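
The m × n argument can be made concrete with a small count. The sketch assumes that, with a shared interface, each process manager and each tool implements that interface once (so the effort grows as m + n); the example lists simply reuse names mentioned in the slides, and the function names are hypothetical.

```python
from typing import List

# Example environments and tools taken from the slides; the lists are illustrative only.
PROCESS_MANAGERS: List[str] = ["Condor", "LSF", "MPICH", "Globus"]
TOOLS: List[str] = ["Paradyn", "TotalView"]


def pairwise_ports(m: int, n: int) -> int:
    """Every tool is ported separately to every process manager."""
    return m * n


def shared_protocol_ports(m: int, n: int) -> int:
    """Assumption: each side implements one common interface exactly once."""
    return m + n


if __name__ == "__main__":
    m, n = len(PROCESS_MANAGERS), len(TOOLS)
    print("pairwise:", pairwise_ports(m, n))
    print("shared protocol:", shared_protocol_ports(m, n))
```

The gap widens quickly: adding one more tool costs m new ports in the pairwise case but a single implementation under the shared-interface assumption.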

  17. The Basic TDP Steps
      • Create, but don’t start, the new application process.
      • If necessary, create the tool daemon process.
      • Pass basic information to the tool daemon, e.g.:
        • Application PID
        • Front-end host/port number
        • Standard I/O host/port number
      • Tool daemon processes the application:
        • For a debugger, read symbols.
        • For Paradyn/dyninst, parse the executable.
      • Start the application process.
      • Respond to changes in the application state.
      • Respond to changes in the tool daemon’s state.
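
The "create, but don't start" sequence above can be sketched on a POSIX system with plain process primitives. This is not the TDP library: the function names are hypothetical, the "tool daemon" is reduced to a callback that receives the application PID, and stopping the child with SIGSTOP before `exec` is just one assumed way to hold a process at its start.

```python
import os
import signal
import sys
from typing import Callable, List


def launch_paused(argv: List[str]) -> int:
    """Create the application process but leave it stopped before exec (POSIX only)."""
    pid = os.fork()
    if pid == 0:
        # Child: stop ourselves so the parent can set up the tool first.
        try:
            os.kill(os.getpid(), signal.SIGSTOP)
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # reached only if exec failed
    # Parent: wait until the child is reported as stopped.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("application exited before it could be paused")
    return pid


def run_under_tool(argv: List[str], tool_setup: Callable[[int], None]) -> int:
    """Pause the application, let the tool inspect it, then start it and wait."""
    pid = launch_paused(argv)
    tool_setup(pid)               # stand-in for passing the PID to a tool daemon
    os.kill(pid, signal.SIGCONT)  # start the application process
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = run_under_tool([sys.executable, "-c", "print('application running')"],
                          lambda pid: print("tool attached to", pid))
    print("exit code", code)
```

The last two steps on the slide (responding to state changes) would correspond to further `waitpid` handling, which the sketch leaves out.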

  18. Challenge: Firewalls and Private Nets
      [Figure: Local Host (Tool Front-End Process) and Remote Host (Remote Process Manager, Tool Dæmon Process, Application Process) separated by a Firewall; the connection is marked X]

  19. Challenge: Firewalls and Private Nets
      [Figure: Local Host (Tool Front-End Process) and Remote Host (Remote Process Manager, Tool Dæmon Process, Application Process) separated by a Firewall, with a Comm Proxy between them]

  20. Challenge: Firewalls and Private Nets
      • When the tool daemon is started, pass in the host/port number of its front-end process.
      • If there is a communication proxy, then:
        • The tool daemon will receive the host/port of the proxy, so the daemon connects to the proxy.
        • The proxy will connect to the tool front-end, mapping the host/port (similar to NAT).
      • An application connecting to the console for standard I/O works the same way.
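
The host/port substitution described above can be modeled without any real networking. The sketch below only tracks the address mapping; the class name, the host names, and the port numbers are all made up for illustration, and a real proxy would also relay the traffic itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (host, port)


@dataclass
class CommProxy:
    """Toy proxy: hands out proxy-side addresses and remembers the real target."""
    listen_host: str
    next_port: int
    routes: Dict[Address, Address] = field(default_factory=dict)

    def expose(self, front_end: Address) -> Address:
        """Allocate a proxy-side host/port that stands in for the front-end."""
        public = (self.listen_host, self.next_port)
        self.next_port += 1
        self.routes[public] = front_end
        return public

    def resolve(self, public: Address) -> Address:
        """NAT-like lookup: which front-end does this proxy address map to?"""
        return self.routes[public]


def address_for_daemon(front_end: Address,
                       proxy: Optional[CommProxy] = None) -> Address:
    """Host/port to pass to the tool daemon at start-up."""
    return proxy.expose(front_end) if proxy is not None else front_end


if __name__ == "__main__":
    proxy = CommProxy("gateway.example", 5000)
    given = address_for_daemon(("desk.example", 2090), proxy)
    print("daemon connects to", given, "which maps to", proxy.resolve(given))
```

The daemon's start-up code is identical in both cases; only the address it is handed changes, which is what lets the same mechanism serve standard I/O connections as well.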

  21. The Condor/Paradyn Scenario
      [Figure: Local Host with Paradyn Front-End; Remote Host with Condor Starter and Paradyn Dæmon; monitor/control links to two Application Processes]

  22. The Path Forward
      • Have produced a prototype implementation to expose technical challenges:
        • Parador: Paradyn running under Condor
        • Ana Cortés and Miquel Senar (UAB)
      • Goal is to produce a standard set of libraries for process managers and tool daemons.
      • Involve a wider community in this standards effort
        • Initially: ANL (Gropp and Lusk), Etnus (Cownie and Delsignore), Compaq, Paradyn, Condor.

  23. Tech Reports
      • “MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools”, Philip C. Roth, Dorian C. Arnold, and Barton P. Miller.
      • “The Tool Dæmon Protocol (TDP)”, Barton Miller, Ana Cortés, Miquel A. Senar, and Miron Livny.
      http://www.paradyn.org/papers/
