
CS 584


Presentation Transcript


  1. CS 584 • Remember to read and take the quiz!

  2. Parallelism • What is parallelism? • Multiple tasks working at the same time on the same problem. • Why parallelism? • "I feel the need for speed!" (Top Gun, 1986)

  3. Parallel Computing • What is a parallel computer? • A set of processors that are able to work cooperatively to solve a computational problem • Examples • Parallel supercomputers • IBM SP-2, Cray T3E, Intel Paragon, etc. • Clusters of workstations • Symmetric multiprocessors

  4. Won't serial computers be fast enough? • Moore's Law • Processor speed doubles roughly every 18 months • Predictions of need • The British government in the 1940s predicted it would only need about 2-3 computers • The market for Cray supercomputers was predicted to be about 10 machines • Problem • These predictions don't take new applications into account.

  5. Applications Drive Supercomputing • Traditional • Weather simulation and prediction • Climate modeling • Chemical and physical computing • New apps. • Collaborative environments • Virtual reality • Parallel databases.

  6. Application Needs • Graphics • 10^9 volume elements • 200 operations per element • Real-time display • Weather & Climate • A 10-year simulation involves 10^16 operations • Accuracy can be improved with higher-resolution grids, which involve more operations.
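To put the graphics requirement in perspective, here is a quick back-of-the-envelope sketch. The element count and per-element cost come from the slide; the 30 frames-per-second display rate is an assumed value, since the slide only says "real-time":

```c
#include <stdio.h>

int main(void) {
    double elements = 1e9;   /* volume elements (from the slide) */
    double ops_per  = 200.0; /* operations per element (from the slide) */
    double fps      = 30.0;  /* assumed real-time display rate */

    /* required sustained throughput in operations per second */
    double ops_per_sec = elements * ops_per * fps;
    printf("required: %.1e ops/s (~%.0f TFLOPS)\n",
           ops_per_sec, ops_per_sec / 1e12);
    return 0;
}
```

Under these assumptions, real-time volume graphics alone demands on the order of 6 TFLOPS of sustained throughput, far beyond any single processor of the era.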

  7. Cost-Performance Trend • [Figure: performance vs. cost, one curve per decade from the 1960s through the 1990s]

  8. What does this suggest? • More performance is easy up to a point. • Significant performance increases of current serial computers beyond the saturation point are extremely expensive. • Connecting large numbers of microprocessors into a parallel computer overcomes the saturation point. • Cost stays low and performance increases.

  9. Computer Design • Single processor performance has been increased lately by increasing the level of internal parallelism. • Multiple functional units • Pipelining • Higher performance gains by incorporating multiple "computers on a chip."

  10. Computer Performance • [Figure: peak performance on a log scale (1e2 to 1e12 FLOPS, i.e., up to TFLOPS) vs. year, 1950-2000, charting the ENIAC, IBM 704, IBM 7090, CDC 7600, Cray 1, Cray X-MP, Cray C90, and IBM SP-2]

  11. Communication Performance • Early 1990s: Ethernet, 10 Mbit/s • Mid 1990s: FDDI, 100 Mbit/s • Mid 1990s: ATM, 100s of Mbit/s • Late 1990s: Fast Ethernet, 100 Mbit/s • Late 1990s: Gigabit Ethernet, 100s of Mbit/s • Soon, 1000 Mbit/s will be commonplace

  12. Performance Summary • Applications are demanding more speed. • Performance trends • Processors are increasing in speed. • Communication performance is increasing. • Future • Performance trends suggest a future where parallelism pervades all computing. • Concurrency is key to performance increases.

  13. Parallel Processing Architectures • Architectures • Single computer with lots of processors • Multiple interconnected computers • Architecture governs programming • Shared memory and locks • Message passing

  14. Shared Memory Computers • [Diagram: processors connected through an interconnection network to shared memory modules] • Examples: SMP machines, SGI Origin series
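A minimal sketch of the "shared memory and locks" style from slide 13, using POSIX threads: several threads increment one shared counter, serialized by a mutex. The thread and iteration counts are arbitrary illustrative values:

```c
/* compile with: cc -pthread counter.c */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NITERS   100000

static long counter = 0;                 /* state shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&lock);       /* protect the shared counter */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    printf("counter = %ld\n", counter);  /* NTHREADS * NITERS */
    return 0;
}
```

All threads see the same address space; the lock is what keeps concurrent updates from being lost.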

  15. Message Passing Computers • [Diagram: nodes, each pairing a processor with a local memory module, linked by an interconnection network] • Examples: IBM SP-2, nCube, Cray T3E, Intel Paragon, workstation clusters

  16. Distributed Shared Memory • [Diagram: processor-memory nodes on an interconnection network, presenting a single shared address space] • Examples: SGI Origin series, workstation clusters, Kendall Square Research KSR1 and KSR2

  17. Parallel Computers • Flynn's Taxonomy classifies machines by instruction stream and data stream: • SISD: Single Instruction, Single Data • SIMD: Single Instruction, Multiple Data • MISD: Multiple Instruction, Single Data • MIMD: Multiple Instruction, Multiple Data • Nice, but doesn't fully account for all machines.

  18. Message Passing Architectures • Requires some form of interconnection • The network is the bottleneck • Latency and bandwidth • Diameter • Bisection bandwidth

  19. Message Passing Architectures • Line/Ring • Mesh/Torus

  20. Message Passing Architectures • Tree/Fat Tree

  21. Message Passing Architectures • Hypercube
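To make the diameter and bisection-bandwidth metrics from slide 18 concrete for the topologies in slides 19-21, here is a small sketch using the standard formulas for p nodes (assuming p is a power of two and, for the 2-D mesh/torus, a perfect square; bisection is given as a link count):

```c
/* compile with: cc topo.c -lm */
#include <math.h>
#include <stdio.h>

int main(void) {
    int p = 64;                             /* number of nodes (example) */
    int side = (int)sqrt((double)p);        /* side of a square 2-D mesh */
    int dim  = (int)round(log2((double)p)); /* hypercube dimension */

    /* diameter: longest shortest path between any two nodes */
    printf("ring:      diameter %d, bisection width %d\n", p / 2, 2);
    printf("2-D mesh:  diameter %d, bisection width %d\n",
           2 * (side - 1), side);
    printf("2-D torus: diameter %d, bisection width %d\n",
           2 * (side / 2), 2 * side);
    printf("hypercube: diameter %d, bisection width %d\n", dim, p / 2);
    return 0;
}
```

The hypercube's low diameter (log2 p) and high bisection width (p/2) are what make it attractive as a target for embeddings, as the next slide discusses.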

  22. Embedding • The mapping of nodes from one static network onto another • Ring onto hypercube • Tree onto mesh • Tree onto hypercube • Transporting algorithms
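One classic embedding from the list above is the ring onto the hypercube: the binary reflected Gray code maps ring position i to hypercube node i ^ (i >> 1), so consecutive ring positions land on nodes that differ in exactly one bit, i.e., on hypercube neighbors. A minimal sketch:

```c
#include <stdio.h>

/* binary reflected Gray code: consecutive values differ in one bit */
static unsigned gray(unsigned i) { return i ^ (i >> 1); }

int main(void) {
    unsigned p = 8; /* ring of 8 nodes onto a 3-dimensional hypercube */
    for (unsigned i = 0; i < p; i++)
        printf("ring node %u -> hypercube node %u\n", i, gray(i));
    /* gray(7) = 100b and gray(0) = 000b also differ in one bit,
       so the ring closes correctly */
    return 0;
}
```

This is the sense in which algorithms are "transported": a program written for a ring runs on the hypercube with every ring link realized by a single hypercube link.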

  23. Communication Methods • Circuit switching • Path establishment • Dedicated links • Packet switching • The message is split into packets • Store-and-forward or virtual cut-through • Wormhole routing (less storage)
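The standard textbook cost model makes the trade-off between store-and-forward and cut-through/wormhole routing concrete. The parameter values below are assumed for illustration:

```c
#include <stdio.h>

int main(void) {
    double ts = 50.0;  /* startup latency, usec (assumed value) */
    double th = 1.0;   /* per-hop time, usec (assumed value) */
    double tw = 0.1;   /* per-word transfer time, usec (assumed value) */
    double m  = 1000;  /* message length in words */
    double l  = 8;     /* number of hops */

    /* store-and-forward: the whole message is buffered at every hop */
    double t_sf = ts + l * (m * tw + th);
    /* cut-through / wormhole: only the header pays the per-hop cost */
    double t_ct = ts + l * th + m * tw;

    printf("store-and-forward: %.1f usec\n", t_sf); /* 858.0 */
    printf("cut-through:       %.1f usec\n", t_ct); /* 158.0 */
    return 0;
}
```

Because the per-hop term no longer multiplies the message length, cut-through latency is nearly distance-insensitive for long messages, which is why wormhole routing also needs far less buffer storage per switch.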

  24. Trends • Supercomputers had the best processing and communication • However: • Commodity processors are doing pretty well • Pentium III and Alpha • And: • Switched 100 Mbps networks are cheap • High-speed networks aren't priced too high

  25. Cost-Performance • Supercomputer • 128-node SP-2 → 100s of gigaflops • Millions of dollars • Cluster of Workstations • 128 nodes → 10s to 100s of gigaflops • Hundreds of thousands of dollars
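Plugging in illustrative figures that fall inside the slide's ranges (the specific dollar and gigaflop numbers below are assumptions, not quoted prices):

```c
#include <stdio.h>

int main(void) {
    /* illustrative values within the slide's stated ranges */
    double sp2_cost = 5e6, sp2_gflops = 200.0; /* 128-node SP-2 */
    double cow_cost = 3e5, cow_gflops = 50.0;  /* 128-node cluster */

    printf("SP-2:    $%.0f per gigaflop\n", sp2_cost / sp2_gflops);
    printf("cluster: $%.0f per gigaflop\n", cow_cost / cow_gflops);
    return 0;
}
```

Even granting the supercomputer several times the cluster's performance, the cluster's cost per gigaflop comes out several times lower under these assumptions.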

  26. Advantages of Clusters • Low Cost • Easily upgradeable • Can use existing software • Low Cost

  27. What about scalability? • Current switches support about 256 connections • What about 1000s of connections? • Interconnecting switches • Fat tree • Hypercube • Etc.

  28. Serial vs. Parallel Programming • Serial programming has been aided by • von Neumann computer design • Procedural and object languages • Program one ---> Program all • Parallel programming needs the same type of standardization. • Machine model (Multicomputer) • Language and communication (MPI)
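A minimal sketch of the MPI style the slide points to: rank 0 sends one integer to rank 1 (the value and tag are arbitrary; compile with mpicc and launch with mpirun, details varying by MPI installation):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* send one int to rank 1, message tag 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Each rank runs the same program against its own local memory and communicates only through explicit messages, which is exactly the multicomputer model described on the next slide.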

  29. The Multicomputer • Multiple von Neumann computers (nodes) • Interconnection network • Each node • executes its own program • accesses its local memory • faster than remote memory accesses (locality) • sends and receives messages

  30. The Multicomputer • [Diagram: nodes, each containing a CPU and local memory, connected by an interconnect]

  31. Parallel Programming Properties • Concurrency • Performance should increase by employing multiple processors. • Scalability • Performance should continue to increase as we add more processors. • Locality of Reference • Performance will be greater if we only access local memory.
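One standard way to quantify the concurrency and scalability properties above is Amdahl's law, which the slides don't state explicitly, so treat this as a supplementary sketch: if a fraction s of the work is inherently serial, speedup on p processors is bounded by 1 / (s + (1 - s) / p). The serial fraction below is an assumed value:

```c
#include <stdio.h>

int main(void) {
    double s = 0.05; /* assumed serial fraction of the program */
    int procs[] = {1, 4, 16, 64, 256};

    for (int i = 0; i < 5; i++) {
        int p = procs[i];
        double speedup = 1.0 / (s + (1.0 - s) / p);
        printf("p = %3d -> speedup %.1f\n", p, speedup);
    }
    /* speedup saturates near 1/s = 20 no matter how large p grows,
       which is why scalability also demands locality and low overhead */
    return 0;
}
```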

  32. Summary • Applications drive supercomputing. • Processor and network performance is increasing. • Trend is toward ubiquitous parallelism • Serial programming was aided by standardized machine and programming model. • Standardized machine and programming models for parallel computing are emerging.
