
Introduction to Parallel Computing

Learn the basics of parallel computing, including machine architecture, parallel algorithms, programming environments, and evaluation of performance. Explore the potential of parallel computing in solving complex problems.


Presentation Transcript


  1. CS160 – Spring 2000 • http://www-cse.ucsd.edu/classes/sp00/cse160 • Prof. Fran Berman - CSE • Dr. Philip Papadopoulos - SDSC

  2. Two Instructors/One Class • We are team-teaching the class • Lectures will be split about 50-50 along topic lines. (We’ll keep you guessing as to who will show up next lecture.) • TA is Derrick Kondo. He is responsible for grading homework and programs • Exams will be graded by Papadopoulos/Berman

  3. Prerequisites • Know how to program in C • CSE 100 (Data Structures) • CSE 141 (Computer Architecture) would be helpful but not required.

  4. Grading • 25% Homework • 25% Programming assignments • 25% Midterm • 25% Final • Homework and programming assignments are due at the beginning of section

  5. Policies • Exams are closed book, closed notes • No Late Homework • No Late Programs • No Makeup exams • All assignments are to be your own original work. • Cheating/copying from anyone/anyplace will be dealt with severely

  6. Office Hours (Papadopoulos) • My office is SSB 251 (Next to SDSC) • Hours will be TuTh 2:30 – 3:30 or by appointment. • My email is phil@sdsc.edu • My campus phone is 822-3628

  7. Course Materials • Book: Parallel Programming: Techniques and Applications using Networked Workstations and Parallel Computers, by B. Wilkinson and Michael Allen. • Web site: Will try to make lecture notes available before class • Handouts: As needed.

  8. Computers/Programming • Please see the TA about getting an account for the undergrad APE lab. • We will use PVM for programming on workstation clusters. • A word of advice: With the web, you can probably find almost completed source code somewhere. Don’t do this. Write the code yourself. You’ll learn more. See policy on copying.

  9. Any Other Administrative Questions?

  10. Introduction to Parallel Computing • Topics to be covered. See syllabus (online) for full details • Machine architecture and history • Parallel machine organization • Parallel algorithm paradigms • Parallel programming environments and tools • Heterogeneous computing • Evaluating performance • Grid computing • Parallel programming and project assignments

  11. What IS Parallel Computing? • Applying multiple processors to solve a single problem • Why? • Increased performance for rapid turnaround time (wall-clock time) • More available memory on multiple machines • Natural progression of the standard von Neumann architecture

  12. World’s 10th Fastest Machine (as of November 1999) @ SDSC 1152 Processors

  13. Are There Really Problems that Need O(1000) processors? • Grand Challenge Codes • First Principles Materials Science • Climate modeling (ocean, atmosphere) • Soil Contamination Remediation • Protein Folding (gene sequencing) • Hydrocodes • Simulated nuclear device detonation • Code breaking (No Such Agency)

  14. There must be problems with the approach • Scaling with efficiency (speedup) • Unparallelizable portions of code (Amdahl’s law) • Reliability • Programmability • Algorithms • Monitoring • Debugging • I/O • … • These and more keep the field interesting

  15. A Brief History of Parallel Super Computers • There have been many (dead) supercomputers • The Dead Supercomputer Society • http://ei.cs.vt.edu/~history/Parallel.html • Parallel Computing Works • Will touch on about a dozen of the important ones

  16. Basic Measurement Yardsticks • Peak performance (AKA “guaranteed never to exceed”) = nprocs X FLOPS/proc • NAS Parallel Benchmarks • Linpack benchmark for the TOP500 • Later in the course, we will explore how to “fool the masses” and valid ways to measure performance

  17. Illiac IV (1966 – 1970) • $100 Million of 1990 Dollars • Single instruction multiple data (SIMD) • 32 - 64 Processing elements • 15 Megaflops • Ahead of its time

  18. ICL DAP (1979) • Distributed Array Processor (also SIMD) • 1K–4K bit-serial processors • Connected in a mesh • Required an ICL mainframe to front-end the main processor array • Never caught on in the US

  19. Goodyear MPP (late 1970s) • 16K bit-serial processors (SIMD) • NASA Goddard Space Flight Center • Only a few sold. Similar to the ICL DAP • About 100 Mflops (comparable to a 100 MHz Pentium)

  20. Cray-1 (1976) • Seymour Cray, Designer • NOT a parallel machine • Single processor machine with vector registers • Largely regarded as starting the modern supercomputer revolution • 80 MHz Processor (80 MFlops)

  21. Denelcor HEP (Heterogeneous Element Processor, early 80’s) • Burton Smith, Designer • Multiple Instruction, Multiple Data (MIMD) • Fine-grain (instruction-level) and large-grain parallelism (16 processors) • Instructions from different programs ran in per-processor hardware queues (128 threads/proc) • Precursor to the Tera MTA (Multithreaded Architecture) • Full/empty bit for every memory location allowed fast synchronization • Important research machine

  22. Caltech Cosmic Cube - 1983 • Chuck Seitz (Founded Myricom) and Geoffrey Fox (Lattice gauge theory) • First Hypercube interconnection network • 8086/8087 based machine with Eugene Brooks’ Crystalline Operating System (CrOS) • 64 Processors by 1983 • About 15x cheaper than a VAX 11/780 • Begat nCUBE, Floating Point Systems, Ametek, Intel Supercomputers (all dead companies) • 1987 – Vector coprocessor system achieved 500MFlops

  23. Cray X-MP (1983) and Cray-2 (1985) • Up to 4-way shared memory machines • The X-MP was the first supercomputer at SDSC • Best performance (600 Mflop peak) • Best price/performance of the time

  24. Late 1980’s • Proliferation of (now dead) parallel computers • CM-2 (SIMD) (Danny Hillis) • 64K bit-serial, 2048 Vector Coprocessors • Achieved 5.2 Gflops on Linpack (LU Factorization) • Intel iPSC/860 (MIMD - MPP) • 128 Processors • 1.92 Gigaflops (Linpack) • Cray Y/MP (Vector Super) • 8 processors (333 Mflops/proc peak) • Achieved 2.1 Gigaflops (Linpack) • BBN Butterfly (Shared memory) • Many others (long since forgotten)

  25. Early 90’s • Intel Touchstone Delta and Paragon (MPP) • Follow-ons to the iPSC/860 • 13.2 Gflops on 512 processors • 1024 nodes delivered to ORNL in 1993 (150 GFLOPS peak) • Cray C-90 (Vector Super) • 16-processor update of the Y/MP • Extremely popular, efficient, and expensive • Thinking Machines CM-5 (MPP) • Up to 16K processors • 1024-node system at Los Alamos National Lab

  26. More 90’s • Distributed Shared Memory • KSR-1 (Kendall Square Research) • COMA (Cache Only Memory Architecture) • University projects • Stanford DASH Processor (Hennessy) • MIT Alewife (Agarwal) • Cray T3D/T3E. Fast processor mesh with up to 512 Alpha CPUs

  27. What Can you Buy Today? (not an exhaustive list) • IBM SP • Large MPP or Cluster • SGI Origin 2000 • Large Distributed Shared Memory Machine • Sun HPC 10000 – 64 Processor True Shared Memory • Compaq Alpha Cluster • Tera MTA • Multithreaded architecture (one in existence) • Cray SV-1 Vector Processor • Fujitsu and Hitachi Vector Supers

  28. Clusters • Poor man’s supercomputer? • A pile-of-PCs • Ethernet or high-speed (e.g., Myrinet) network • Likely to be the dominant high-end architecture • Essentially a build-it-yourself MPP

  29. Next Time … • Flynn’s Taxonomy • Bit-Serial, Vector, Pipelined Processors • Interconnection Networks • Routing Techniques • Embedding • Cluster interconnects • Network Bisection
