A Strategy for the Future of High Performance Computing?


  1. Extreme Linux: A Strategy for the Future of High Performance Computing?
  Pete Beckman, Advanced Computing Laboratory, Los Alamos National Laboratory

  2. Observations: The US Supercomputing Industry
  • All US high-performance vendors are building clusters of SMPs (with the exception of Tera)
  • Each company (IBM, SGI, Compaq, HP, and Sun) has a different version of Unix
  • Each company attempts to scale system software designed for database, internet, and technical servers
  • This fractured market forces five different parallel file systems, fast messaging implementations, etc.
  • Supercomputer companies tend to go out of business

  3. New Limitations
  People used to say: “The number of Tflops available is limited only by the amount of money you wish to spend.”
  The reality: we are at a point where our ability to build machines from components exceeds our ability to administer, program, and run them. But we do it anyway; many large clusters are being installed...

  4. Scalable System Software is currently the weak link
  Software for Tflop clusters of SMPs is hard:
  • System administration, configuration, booting, management, and monitoring
  • Scalable smart-NIC messaging (zero copy)
  • Cluster/global/parallel file system
  • Job queuing and running
  • I/O (scratch, prefetch, NASD)
  • Fault tolerance and on-the-fly reconfiguration

  5. Why use Linux for clusters of SMPs, and as a basis for system software research?
  Linux is a lot of fun (Shagadelic, Baby!)
  • The OS for scalable clusters needs more research
  • Open Source! (it’s more than just geek chic)
    • No lawyers, no NDAs, no worries, mate!
    • Visible code improves faster
    • The whole environment, or just the mods, can be distributed
    • Scientific collaboration is just a URL away...
  • Small, well-designed, stable, mature kernel
    • ~240K lines of code without device drivers
    • /proc filesystem and dynamically loadable modules (see the module sketch after this slide)
    • The OS is extendable, optimizable, tunable
  Did I mention no lawyers?
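Since the deck leans on loadable modules as the kernel's extension point, here is a minimal sketch of a loadable module in the Linux 2.2-era style; the module name and messages are illustrative, not anything from the talk:

```c
/* cluster_mod.c: minimal loadable-module sketch (Linux 2.2-era style).
 * Build against the kernel headers, then load/unload with:
 *   insmod cluster_mod.o / rmmod cluster_mod */
#include <linux/module.h>
#include <linux/kernel.h>

/* Called by insmod; returning 0 keeps the module resident. */
int init_module(void)
{
    printk("<1>cluster_mod: loaded\n");
    return 0;
}

/* Called by rmmod just before the module is unloaded. */
void cleanup_module(void)
{
    printk("<1>cluster_mod: unloaded\n");
}
```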

  6. Isn’t Open Source hype? Do you really need it?
  A very quick example: Supermon and Superview, high-performance cluster monitoring tools
  Ron Minnich, Karen Reid, Matt Sottile

  7. The problem: get really fast stats from a very large cluster
  • Monitor hundreds of nodes at rates up to 100 Hz
  • Monitor at 10 Hz without significant impact on the application
  • Monitor hardware performance counters
  • Collect a wide range of kernel information (disk blocks, memory, interrupts, etc.)

  8. Solution
  • Modify the kernel so all the parameters can be grabbed without going through /proc (a sketch of the /proc path follows this slide)
  • Tightly coupled clusters can get real-time monitoring stats
  • This is not of general use to the desktop and web-server markets
  • Stats for 100 nodes take about 20 ms
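For contrast, a minimal sketch of the conventional route Supermon avoids: reopening and reparsing ASCII out of /proc at every sample. Only the standard /proc/stat format is assumed here; this is illustrative, not Supermon code.

```c
/* Read CPU counters the slow, conventional way: open /proc/stat,
 * parse ASCII, and repeat at every sample. A kernel-side binary
 * interface avoids exactly this per-sample overhead. */
#include <stdio.h>

int main(void)
{
    unsigned long user, nice, system, idle;
    FILE *fp = fopen("/proc/stat", "r");

    if (fp == NULL) {
        perror("/proc/stat");
        return 1;
    }
    /* First line: aggregate jiffies spent in each CPU state */
    if (fscanf(fp, "cpu %lu %lu %lu %lu", &user, &nice, &system, &idle) == 4)
        printf("user=%lu nice=%lu system=%lu idle=%lu\n",
               user, nice, system, idle);
    fclose(fp);
    return 0;
}
```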

  9. Superview: the Java tool for Supermon

  10. Scalable Linux System Software
  Where should we concentrate our efforts? Some areas for improvement...

  11. Software: The hard part. Linux environments (page 1)
  • Compilers
    • F90 (PGI, Absoft, Compaq)
    • F77 (GNU, PGI, Absoft, Compaq, Fujitsu)
    • HPF (PGI, Compaq?)
    • C/C++ (PGI, KAI, GNU, Compaq, Fujitsu)
    • OpenMP (PGI)
    • Metrowerks CodeWarrior for C, C++ (Fortran?)
  • Debuggers
    • TotalView... maybe, real soon now, almost?
    • gdb, DDD, etc.

  12. Software: The hard part. Linux environments (page 2)
  • Message Passing
    • MPICH, PVM, MSTI MPI, Nexus
    • OS bypass: ST, FM, AM, PM, GM, VIA, Portals, etc.
    • Fast interconnects: Myrinet, GigE, HiPPI, SCI
  • Shared-Memory Programming
    • Pthreads, Tulip-Threads, etc.
  • Parallel Performance Tools
    • TAU, Vampir, PGI PGProf, Jumpshot, etc.
  (A minimal message-passing sketch follows this slide.)
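To ground the message-passing layer, a minimal MPI example in C: rank 0 sends an integer to rank 1. It uses only standard MPI-1 calls available in any of the implementations listed above (e.g. MPICH); the payload is an arbitrary choice.

```c
/* Minimal MPI point-to-point sketch: rank 0 sends a value, rank 1
 * receives it. Run with at least two ranks, e.g. mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 42;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}
```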

  13. Software: The hard part. Linux environments (page 3)
  • File Systems & I/O
    • ext2 (native), NFS
    • PVFS, Coda, GFS
    • MPI-IO, ROMIO (an MPI-IO sketch follows this slide)
  • Archival Storage
    • HPSS & ADSM clients
  • Job Control
    • LSF, PBS, Maui
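As an illustration of the MPI-IO interface that ROMIO implements, a sketch in which every rank writes a disjoint block of one shared file; the file name and block size are arbitrary choices, not from the slides.

```c
/* MPI-IO sketch: each rank writes its own disjoint 1024-int block
 * of a single shared file via a collective write. */
#include <mpi.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, i, buf[N];
    MPI_File fh;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < N; i++)
        buf[i] = rank;                      /* fill block with our rank id */

    MPI_File_open(MPI_COMM_WORLD, "scratch.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* Collective write: each rank lands at its own byte offset */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(buf),
                          buf, N, MPI_INT, &status);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
```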

  14. Software: The hard part. Linux environments (page 4)
  • Libraries and Frameworks
    • BLAS, OVERTURE, POOMA, ATLAS (a BLAS call sketch follows this slide)
    • Alpha math libraries (Compaq)
  • System Administration
    • Building and booting tools
    • Cfengine
    • Monitoring and management tools
    • Configuration database
    • SGI Project Accounting
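For a concrete taste of the library layer, a small matrix multiply through the CBLAS interface that ATLAS ships. It assumes a cblas.h header and linking against the ATLAS/BLAS libraries; the matrices are tiny for clarity.

```c
/* BLAS sketch: C = alpha*A*B + beta*C via cblas_dgemm. With the
 * values below the result is C = [19 22; 43 50]. */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    double A[4] = {1, 2, 3, 4};        /* 2x2, row-major */
    double B[4] = {5, 6, 7, 8};        /* 2x2, row-major */
    double C[4] = {0, 0, 0, 0};        /* 2x2 result */

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2,               /* M, N, K */
                1.0, A, 2, B, 2,       /* alpha, A, lda, B, ldb */
                0.0, C, 2);            /* beta, C, ldc */

    printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
    return 0;
}
```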

  15. Quick Summary: Software for Linux clusters, a report card (current status)
  Compilers .................... A
  Parallel debuggers ........... I
  Message passing .............. A-
  Shared memory prog. .......... A
  Parallel performance tools ... C+
  File Systems ................. D
  Archival Storage ............. C
  Job Control .................. B-
  Math Libraries ............... B

  16. Summary of the most important areas
  • First Priority
    • Cluster management, administration, images, monitoring, etc.
    • Cluster/parallel/global file systems
    • Continued work on scalable messaging
    • Faster, more scalable SMP
    • Virtual memory optimized for HPC
    • TCP/IP improvements
  • Wish List
    • NIC boot, BIOS NVRAM, serial console
    • OS-bypass standards in the kernel
    • Tightly coupled scheduling, accounting
    • Newest drivers

  17. Honest cluster costs: publish the numbers
  • How many sysadmins and programmers are required for support?
  • What are the service and replacement costs?
  • How much was hardware integration?
  • How many users can you support, and at what levels?
  • How much was the hardware?

  18. Tera-Scale SMP Cluster Architecture
  [diagram: compute nodes and control nodes on a gigabit multistage interconnection fabric, with network-attached secure disks and a Gigabit Ethernet control network]

  19. Let someone else put it together
  • Compaq
  • Dell
  • Penguin Computing
  • Alta Tech
  • VA Linux
  • DCG
  • Paralogic
  • Microway
  Ask about support.

  20. Cluster Benchmarking: Lies, Damn Lies, and the Top500
  Vendor-published Linpack, latency, and bandwidth numbers are worthless:
  • Make MPI zero-byte messaging a special case (improves latency numbers; see the microbenchmark sketch after this slide)
  • Convert multiply flops to additions, recount the flops
  • Hire a Linpack consultant to help you achieve “the number” the vendor promised
  • “We unloaded the trucks, and 24 hours later we calculated the size of the galaxy in acres.”
  • For $15K and 3 rolls of duct tape I built a supercomputer in my cubicle...
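To see why the zero-byte special case matters, here is the standard ping-pong latency microbenchmark sketched with plain MPI-1 calls; comparing the 0-byte and 8-byte timings exposes an implementation that short-circuits empty messages. The repetition count and sizes are arbitrary choices.

```c
/* Ping-pong latency sketch: ranks 0 and 1 bounce a message of size
 * `bytes` back and forth; half the round-trip time approximates the
 * one-way latency. Run with exactly two active ranks. */
#include <mpi.h>
#include <stdio.h>

#define REPS 1000

int main(int argc, char **argv)
{
    char buf[8];
    int rank, i, bytes;
    double t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (bytes = 0; bytes <= 8; bytes += 8) {   /* 0-byte, then 8-byte */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();
        if (rank == 0)
            printf("%d bytes: %.2f us one-way\n",
                   bytes, (t1 - t0) / REPS / 2 * 1e6);
    }
    MPI_Finalize();
    return 0;
}
```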

  21. Plug-in Framework for Cluster Benchmarks
  [figure]

  22. MPI Message Matching
  [figure]

  23. [figure-only slide]

  24. [figure-only slide]

  25. Conclusions
  • Lots of Linux clusters will be at SC99
  • The Big 5 vendors do not have the critical mass to develop the system software for multi-teraflop clusters
  • The HPC community (labs, vendors, universities, etc.) needs to work together
  • The hardware consolidation is nearly over; the software consolidation is on its way
  • A Linux-based “commodity” Open Source strategy could provide a mechanism for:
    • open vendor collaboration
    • academic and laboratory participation
    • one Open Source software environment

  26. News and Announcements
  • The next Extreme Linux conference will be in Williamsburg in October. The call for papers will be out soon; start preparing those technical papers...
  • There will be several cluster tutorials at SC99. Remy Evard, Bill Saphir, and Pete Beckman will run one focused on system administration and the user environment for large clusters.