
Software infrastructure for the I-WAY high-performance distributed computing experiment

Software infrastructure for the I-WAY high-performance distributed computing experiment. Ian Foster, Jonathan Geisler, Bill Nickless, Warren Smith, and Steven Tuecke. Grid Computing - Making the Global Infrastructure a Reality, chapter 4, pp. 101-106. Wiley and Sons.


Presentation Transcript


  1. Software infrastructure for the I-WAY high-performance distributed computing experiment. Ian Foster, Jonathan Geisler, Bill Nickless, Warren Smith, and Steven Tuecke. Grid Computing - Making the Global Infrastructure a Reality, chapter 4, pp. 101-106. Wiley and Sons

  2. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  3. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  4. I-WAY • In brief, the I-WAY was an ATM network connecting supercomputers, mass storage systems, and advanced visualization devices at 17 different sites within North America. • I-Soft, I-POP, I-WAY

  5. Novel concepts and techniques • Point of presence machines • Scheduler proxies • Authorization proxies • Network-aware parallel programming tools

  6. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  7. The I-WAY network • The I-WAY network connected display devices (CAVE, ImmersaDesk), mass storage systems, specialized instruments, and supercomputers of different architectures. • Why ATM? ATM was chosen rather than traditional Internet connectivity because it provides higher bandwidth and handles audio, video, and data more efficiently.

  8. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  9. Point of presence machines • I-POP: I-POP machines provided uniform authentication, resource reservation, process creation, and communication functions across I-WAY resources. • I-Soft: a software environment deployed on the I-POP machines, providing a variety of services: 1. scheduling 2. security 3. parallel programming support 4. a distributed file system

  10. I-POP design

  11. I-POP discussion • The fact that all I-POPs shared a single AFS cell proved extremely useful, both as a means of maintaining a single, shared copy of the I-Soft code and as a mechanism for distributing I-WAY scheduling information. • The I-POPs' capability to monitor or control the ATM network was never exploited.

  12. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  13. Scheduler design • Computational Resource Broker (CRB): requests are handled by an independent entity (the CRB), which then negotiates with the site schedulers that manage individual resources. In the I-WAY, one CRB was sufficient. • Virtual machines: predefined disjoint subsets of I-WAY computers. • User-to-CRB and CRB-to-resource protocols
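The requirement that virtual machines be disjoint subsets of the I-WAY computers can be checked mechanically. The following is a minimal sketch, not taken from I-Soft; the virtual machine names and host names are hypothetical.

```python
from itertools import combinations


def are_disjoint(virtual_machines):
    """Return True if no computer belongs to more than one virtual machine."""
    return all(
        a.isdisjoint(b)
        for a, b in combinations(virtual_machines.values(), 2)
    )


# Hypothetical virtual machines, each a set of hypothetical host names.
valid = {
    "vm1": {"sp2.site-a", "cave.site-b"},
    "vm2": {"paragon.site-c"},
}
invalid = {
    "vm1": {"sp2.site-a", "cave.site-b"},
    "vm2": {"cave.site-b", "paragon.site-c"},  # cave.site-b appears twice
}
```

Because the subsets never overlap, a scheduler can allocate each virtual machine independently without tracking individual computers.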

  14. Scheduler design (cont.) • Functions of the scheduler: 1. management functions 2. user functions • Central scheduler and local scheduler • Two-part strategy: 1. A central scheduler daemon managed and allocated time on the different virtual machines on a first-come, first-served basis. 2. A local scheduler daemon communicated directly with the local site scheduler. Local schedulers performed site-dependent actions in response to requests from the central scheduler to allocate resources, create processes, and deallocate resources.
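The two-part strategy above can be sketched roughly as follows. This is an illustrative model only, assuming a single virtual machine; the class names, method names, and site names are hypothetical and do not reflect the actual I-Soft interfaces.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LocalScheduler:
    """Stand-in for one site's local scheduler daemon (hypothetical interface)."""
    site: str
    log: list = field(default_factory=list)

    def allocate(self, user):
        self.log.append(("allocate", user))

    def create_process(self, user, program):
        self.log.append(("create", user, program))

    def deallocate(self, user):
        self.log.append(("deallocate", user))


class CentralScheduler:
    """Grants time on one virtual machine in first-come, first-served order."""

    def __init__(self, local_schedulers):
        self.local_schedulers = local_schedulers
        self.queue = deque()  # waiting (user, program) requests, oldest first
        self.current = None   # request currently holding the virtual machine

    def request(self, user, program):
        self.queue.append((user, program))
        self._dispatch()

    def release(self):
        """Free the virtual machine and hand it to the next waiting request."""
        if self.current is None:
            return
        user, _ = self.current
        for local in self.local_schedulers:
            local.deallocate(user)
        self.current = None
        self._dispatch()

    def _dispatch(self):
        if self.current is not None or not self.queue:
            return
        self.current = self.queue.popleft()
        user, program = self.current
        for local in self.local_schedulers:
            local.allocate(user)
            local.create_process(user, program)


sites = [LocalScheduler("site-a"), LocalScheduler("site-b")]
central = CentralScheduler(sites)
central.request("alice", "sim")
central.request("bob", "viz")   # queued behind alice
holder_before = central.current
central.release()               # alice finishes, bob is dispatched
holder_after = central.current
```

The central scheduler only decides ordering; every site-dependent action is delegated to the local schedulers, which mirrors the division of labor described on the slide.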

  15. Scheduler discussion • Limitations: 1. Too-restrictive interfaces between user and scheduler, and between scheduler and local resources. 2. The concept of using fixed virtual machines as schedulable units was only moderately successful. 3. The long-term solution probably is to develop more sophisticated schedulers for resources that are to be incorporated into I-WAY-like systems.

  16. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  17. Security design • Two parts: authentication to the I-POP environment, and authentication to the local sites. • Authentication to I-POPs was handled by using a telnet client modified to use Kerberos authentication and encryption. • The scheduler software served as an ‘authentication proxy.’
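One way to picture the 'authentication proxy' role is a component that, once a user has authenticated to the I-POP environment, acts at each local site under that user's local account. The sketch below is a simplified model under that assumption; it does not implement Kerberos, and all names and the account table are hypothetical.

```python
class AuthenticationProxy:
    """Maps an already-authenticated I-WAY user to a local account per site."""

    def __init__(self, site_accounts):
        # site_accounts: {site: {iway_user: local_account}} (hypothetical table)
        self.site_accounts = site_accounts
        self.authenticated = set()

    def login(self, user):
        """Record that the user has authenticated to the I-POP environment."""
        self.authenticated.add(user)

    def local_account(self, user, site):
        """Return the local account to act under, or refuse the request."""
        if user not in self.authenticated:
            raise PermissionError(f"{user} has not authenticated to the I-POP environment")
        try:
            return self.site_accounts[site][user]
        except KeyError:
            raise PermissionError(f"{user} has no account at {site}") from None


proxy = AuthenticationProxy({
    "site-a": {"alice": "u1234"},
    "site-b": {"alice": "alice_iway"},
})
proxy.login("alice")
account_a = proxy.local_account("alice", "site-a")
```

In this model a user logs in once, but a request still fails at any site where no local account exists for that user.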

  18. Security discussion • Authenticate once • A more fundamental limitation of the I-WAY authentication scheme as implemented was that each user had to have an account at each site to which access was required.

  19. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  20. Parallel tools design • I-WAY must support the creation of processes on different processors and the communication of data between these processes. • These tools should ideally relieve the programmer of the need to consider low-level details relating to network structure.

  21. Parallel tools design (cont.) • Irsh and ixterm • Nexus multithreaded communication library Nexus supports automatic configuration mechanisms that allow it to use information contained in resource databases to determine which startup mechanisms, network interfaces, and protocols to use in different situations. • CAVEcomm and MPICH
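The idea of choosing a communication method from a resource database can be sketched as below. This is not the Nexus API: the database layout, host names, protocol labels, and preference order are all assumptions made for illustration.

```python
# Hypothetical resource database: startup mechanism and supported protocols per host.
RESOURCE_DB = {
    "sp2.site-a": {"startup": "scheduler", "protocols": {"local", "atm", "tcp"}},
    "cave.site-b": {"startup": "rsh", "protocols": {"atm", "tcp"}},
    "ws.site-c": {"startup": "rsh", "protocols": {"tcp"}},
}

# Assumed preference order, fastest first.
PREFERENCE = ["local", "atm", "tcp"]


def choose_protocol(src, dst, db=RESOURCE_DB):
    """Pick the most preferred protocol that both endpoints support."""
    common = db[src]["protocols"] & db[dst]["protocols"]
    if src != dst:
        # "local" is only meaningful between processes on the same host.
        common.discard("local")
    for protocol in PREFERENCE:
        if protocol in common:
            return protocol
    raise LookupError(f"no common protocol between {src} and {dst}")


def startup_mechanism(host, db=RESOURCE_DB):
    """Look up how processes should be started on a host."""
    return db[host]["startup"]
```

For example, `choose_protocol("sp2.site-a", "ws.site-c")` falls back to `"tcp"` because the workstation entry lists no ATM interface. Since every decision comes from the database, the application code does not change when a host's entry does, but the selection is only as good as the entries are accurate.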

  22. Parallel tools discussion • A significant difficulty revealed by the I-WAY experiment related to the mechanisms used to generate and maintain the configuration information used by Nexus. • Automatic discovery techniques.

  23. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  24. File systems • I-WAY–like systems introduce three related requirements with a file-system flavor. 1. Many users require access to various status data and utility programs at many different sites. 2. Users running programs on remote computers must be able to access executables and configuration data at many different sites. 3. Application programs must be able to read and write potentially large data sets. • The I-Soft system supported only the first of these requirements.

  25. File Systems (cont.) • An AFS cell (with three servers for reliability) was deployed and used as a shared repository for I-WAY software, and also to maintain scheduler status information.

  26. Outline • Introduction • The I-WAY network • I-Way infrastructure Point of presence machines Scheduler Security Parallel programming tools File systems • Conclusions

  27. Conclusions • SC’95 • Further I-WAY–like systems.
