Introduction to Parallel Processing

Debbie Hui, CS 147 – Prof. Sin-Min Lee, 7 / 11 / 2001. Topics: Parallelism in Uniprocessor Systems; Organization of Multiprocessor Systems.

Presentation Transcript


  1. Debbie Hui CS 147 – Prof. Sin-Min Lee 7 / 11 / 2001 Introduction to Parallel Processing

  2. Parallel Processing • Parallelism in Uniprocessor Systems • Organization of Multiprocessor Systems

  3. Parallelism in Uniprocessor Systems • A computer achieves parallelism when it performs two or more unrelated tasks simultaneously

  4. Uniprocessor Systems A uniprocessor system may incorporate parallelism using: • an instruction pipeline • a fixed or reconfigurable arithmetic pipeline • I/O processors • vector arithmetic units • multiport memory

  5. Uniprocessor Systems Instruction pipeline: • Overlaps the fetching, decoding, and execution of successive instructions • Allows the CPU to complete up to one instruction per clock cycle once the pipeline is full
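The effect of overlapping stages can be sketched with a simple cycle count. This is a minimal illustration, assuming an ideal pipeline in which every stage takes one clock cycle and there are no stalls; the function names are made up for this example.

```python
def cycles_unpipelined(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions: int, num_stages: int) -> int:
    """Ideal pipeline (assumption: one cycle per stage, no stalls).

    The first instruction needs num_stages cycles to pass through the
    pipeline; after that, one instruction completes per cycle.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


# 100 instructions through a 3-stage (fetch, decode, execute) pipeline
print(cycles_unpipelined(100, 3))  # 300
print(cycles_pipelined(100, 3))    # 102
```

For long instruction sequences the pipelined count approaches one cycle per instruction, which is the sense in which the slide's claim holds.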

  6. Uniprocessor Systems Reconfigurable Arithmetic Pipeline: • Better suited for general-purpose computing • Each stage has a multiplexer at its input • The control unit of the CPU selects the data at each multiplexer to configure the pipeline • Problem: although arithmetic pipelines can perform many iterations of the same operation in parallel, they cannot perform different operations simultaneously

  7. Uniprocessor Systems Vectored Arithmetic Unit: • Provides a solution to the reconfigurable arithmetic pipeline problem • Purpose: to perform different arithmetic operations in parallel

  8. Uniprocessor Systems Vectored Arithmetic Unit (cont.): • Contains multiple functional units - Some perform addition, others subtraction, etc. • Input and output switches are needed to route the proper data to their proper destinations - Switches are set by the control unit

  9. Uniprocessor Systems Vectored Arithmetic Unit (cont.): How do we get all that data to the vector arithmetic unit? By transferring several data values simultaneously using: - Multiple buses - Very wide data buses

  10. Uniprocessor Systems Improving performance: • Allow multiple simultaneous memory accesses - Requires multiple address, data, and control buses (one set for each simultaneous memory access) - The memory chip has to be able to handle multiple transfers simultaneously

  11. Uniprocessor Systems Multiport Memory: • Has two sets of address, data, and control pins to allow simultaneous data transfers to occur • CPU and DMA controller can transfer data concurrently • A system with more than one CPU could handle simultaneous requests from two different processors

  12. Uniprocessor Systems Multiport Memory (cont.): • Can: - Handle two requests to read data from the same location at the same time • Cannot: - Process two simultaneous requests to write data to the same memory location - Process simultaneous requests to read from and write to the same memory location
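The can/cannot rules above amount to a small conflict check. The sketch below is only an illustration of those rules; the `Request` type and the function name are hypothetical, and the assumption that requests to different addresses never conflict follows from the two-port description on the previous slide.

```python
from typing import NamedTuple


class Request(NamedTuple):
    op: str        # "read" or "write"
    address: int


def can_serve_together(a: Request, b: Request) -> bool:
    """Return True if a two-port memory can serve both requests at once."""
    if a.address != b.address:
        return True  # assumption: different locations never conflict
    # Same location: only read/read is allowed.
    return a.op == "read" and b.op == "read"


print(can_serve_together(Request("read", 0x40), Request("read", 0x40)))    # True
print(can_serve_together(Request("write", 0x40), Request("write", 0x40)))  # False
print(can_serve_together(Request("read", 0x40), Request("write", 0x40)))   # False
```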

  13. Organization of Multiprocessor Systems Three different ways to organize/classify systems: • Flynn’s Classification • System Topologies • MIMD System Architectures

  14. Multiprocessor Systems: Flynn’s Classification • Based on the flow of instructions and data processing • A computer is classified by: - whether it processes a single instruction at a time or multiple instructions simultaneously - whether it operates on one or multiple data sets

  15. Multiprocessor Systems: Flynn’s Classification. Four Categories of Flynn’s Classification: • SISD Single instruction single data • SIMD Single instruction multiple data • MISD Multiple instruction single data (**) • MIMD Multiple instruction multiple data. (**) The MISD classification is not practical to implement. In fact, no significant MISD computers have ever been built. It is included only for completeness.

  16. Multiprocessor Systems: Flynn’s Classification. Single instruction single data (SISD): • Consists of a single CPU executing individual instructions on individual data values

  17. Multiprocessor Systems: Flynn’s Classification. Single instruction multiple data (SIMD): • Executes a single instruction on multiple data values simultaneously using many processors • Since only one instruction is processed at any given time, it is not necessary for each processor to fetch and decode the instruction • This task is handled by a single control unit that sends the control signals to each processor • Example: Array processor [Figure labels: Main Memory, Control Unit, Processor/Memory pairs, Communications Network]
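The division of labor in a SIMD machine can be imitated in software: the instruction is decoded once, and the same operation is then applied to every processor's local data. This is a sequential sketch for illustration only (real SIMD hardware performs the element operations in lockstep); the opcode names and `simd_step` are invented for this example.

```python
import operator

# Hypothetical two-instruction "instruction set" for the sketch
OPS = {"add": operator.add, "mul": operator.mul}


def simd_step(opcode, local_a, local_b):
    """Apply one instruction to the local operands of every processor."""
    op = OPS[opcode]  # decoded once, as the single control unit would do
    # Each list position stands for one processor's local memory.
    return [op(a, b) for a, b in zip(local_a, local_b)]


print(simd_step("add", [1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
print(simd_step("mul", [1, 2, 3], [10, 20, 30]))  # [10, 40, 90]
```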

  18. Multiprocessor Systems: Flynn’s Classification. Multiple instruction multiple data (MIMD): • Executes different instructions simultaneously • Each processor must include its own control unit • The processors can be assigned to parts of the same task or to completely separate tasks • Example: Multiprocessors, multicomputers
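Flynn's two criteria map directly onto the four category names. A minimal sketch, assuming a machine is described simply by how many instruction streams and data streams it handles at once (the function name is hypothetical):

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Name the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))  # SISD
print(flynn_class(1, 8))  # SIMD
print(flynn_class(4, 1))  # MISD
print(flynn_class(4, 4))  # MIMD
```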

  19. Multiprocessor Systems: System Topologies • The topology of a multiprocessor system refers to the pattern of connections between its processors • Quantified by standard metrics: - Diameter: the maximum distance (number of links on the shortest path) between any two processors in the computer system - Bandwidth: the capacity of a communications link multiplied by the number of such links in the system (best case) - Bisection bandwidth: the total bandwidth of the links connecting the two halves of the system when the processors are split so that the number of links between the halves is minimized (worst case)
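The first two metrics can be computed for any topology given as an adjacency list. The sketch below is illustrative only: it assumes a connected network with bidirectional links of equal capacity, and the function names are made up. Diameter is found by breadth-first search from every processor.

```python
from collections import deque


def diameter(adjacency):
    """Longest shortest-path distance, in links, between any two processors."""
    def farthest(start):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return max(dist.values())

    return max(farthest(p) for p in adjacency)


def total_bandwidth(adjacency, link_capacity):
    """Link capacity times the number of links (each link is listed twice)."""
    num_links = sum(len(nbs) for nbs in adjacency.values()) // 2
    return num_links * link_capacity


# Example: a ring of six processors
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(diameter(ring))              # 3
print(total_bandwidth(ring, 2.0))  # 12.0
```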

  20. Multiprocessor Systems: System Topologies. Six Categories of System Topologies: • Shared bus • Ring • Tree • Mesh • Hypercube • Completely Connected

  21. Multiprocessor Systems: System Topologies. Shared bus: • The simplest topology • Processors communicate with each other exclusively via this bus • Can handle only one data transmission at a time • Can be easily expanded by connecting additional processors to the shared bus, along with the necessary bus arbitration circuitry [Figure labels: processors (P), memories (M), Shared Bus, Global Memory]

  22. Multiprocessor Systems: System Topologies. Ring: • Uses direct dedicated connections between processors • Allows all communication links to be active simultaneously • A piece of data may have to travel through several processors to reach its final destination • All processors must have two communication links

  23. Multiprocessor Systems: System Topologies. Tree topology: • Uses direct connections between processors • Each processor has up to three connections (one to its parent and two to its children in a binary tree) • Its primary advantage is its relatively low diameter • Example: DADO Computer

  24. Multiprocessor Systems: System Topologies. Mesh topology: • Every processor connects to the processors above, below, left, and right • Left to right and top to bottom wraparound connections may or may not be present
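The difference that wraparound makes is easiest to see by listing a processor's neighbors. A minimal sketch, assuming processors are addressed by (row, column) in a rectangular grid; the function name and addressing scheme are assumptions for this example.

```python
def mesh_neighbors(row, col, rows, cols, wraparound=False):
    """Neighbors above, below, left, and right of (row, col)."""
    neighbors = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if wraparound:
            # Edges connect to the opposite side of the grid.
            neighbors.append((r % rows, c % cols))
        elif 0 <= r < rows and 0 <= c < cols:
            neighbors.append((r, c))
    return neighbors


# Corner processor of a 3 x 3 mesh
print(mesh_neighbors(0, 0, 3, 3))                   # [(1, 0), (0, 1)]
print(mesh_neighbors(0, 0, 3, 3, wraparound=True))  # [(2, 0), (1, 0), (0, 2), (0, 1)]
```

Without wraparound, edge and corner processors have fewer than four links; with wraparound every processor has four.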

  25. Multiprocessor Systems: System Topologies. Hypercube: • Multidimensional mesh • Has n = 2^d processors, each with log2 n = d connections
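One common way to picture a hypercube is to give each processor a binary label and connect processors whose labels differ in exactly one bit. The sketch below assumes that labeling (the slide does not state it), with hypothetical function names.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list:
    """Flip each of the `dimensions` label bits to get the neighbors."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hypercube_distance(a: int, b: int) -> int:
    """Number of links on a shortest path: the count of differing bits."""
    return bin(a ^ b).count("1")


# A 3-dimensional hypercube has 2**3 = 8 processors with 3 links each
print(hypercube_neighbors(0, 3))        # [1, 2, 4]
print(hypercube_neighbors(5, 3))        # [4, 7, 1]
print(hypercube_distance(0b000, 0b111)) # 3
```

The largest possible distance is between complementary labels, so under this labeling the diameter equals the number of dimensions.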

  26. Multiprocessor Systems: System Topologies. Completely Connected: • Every processor has n-1 connections, one to each of the other processors • The complexity of the processors increases as the system grows • Offers maximum communication capabilities
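The growth in complexity can be made concrete by counting links. With n processors each connected to the other n-1, and each link shared by two processors, the total is n(n-1)/2. A short sketch with a hypothetical function name:

```python
def total_links(n: int) -> int:
    """Links in a completely connected system of n processors."""
    return n * (n - 1) // 2


for n in (4, 16, 64):
    print(n, "processors:", n - 1, "links each,", total_links(n), "links in total")
# 4 processors: 3 links each, 6 links in total
# 16 processors: 15 links each, 120 links in total
# 64 processors: 63 links each, 2016 links in total
```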

  27. Multiprocessor Systems: System Topologies [Table comparing the topologies; its rows are not preserved in this transcript] Notes: * without wraparound; ** with wraparound; l = bandwidth of the bus; n = number of processors

  28. Multiprocessor Systems: MIMD System Architecture • The architecture of an MIMD system refers to its connections with respect to system memory • Multiprocessors • Multicomputers

  29. Multiprocessor Systems: MIMD System Architecture. Symmetric multiprocessor (SMP): • A computer system that has two or more processors with comparable capabilities • Four different types: - Uniform memory access (UMA) - Nonuniform memory access (NUMA) - Cache coherent NUMA (CC-NUMA) - Cache only memory access (COMA)

  30. Multiprocessor Systems: MIMD System Architecture. Uniform memory access (UMA): • Gives all CPUs equal (uniform) access to all shared memory locations • Each processor may have its own cache memory, not directly accessible by the other processors [Figure labels: Processor 1, Processor 2, ..., Processor n; Communications Mechanism; Shared Memory]

  31. Multiprocessor Systems: MIMD System Architecture. Nonuniform memory access (NUMA): • Does not allow uniform access to all shared memory locations • It still allows all processors to access all shared memory locations; however, each processor can access the memory module closest to it faster than the other modules [Figure labels: Processor 1, Processor 2, ..., Processor n; Memory 1, Memory 2, ..., Memory n; Communications Mechanism]
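A toy cost model shows why data placement matters under NUMA. The latencies below (100 ns local, 300 ns remote) are invented purely for illustration, as are the function names; the only property taken from the slide is that a processor's own module is faster than the others.

```python
def access_time_ns(processor: int, module: int,
                   local_ns: float = 100.0, remote_ns: float = 300.0) -> float:
    """Every module is reachable, but the processor's own module is faster."""
    return local_ns if processor == module else remote_ns


def average_access_time_ns(local_fraction: float,
                           local_ns: float = 100.0, remote_ns: float = 300.0) -> float:
    """Expected access time when local_fraction of accesses hit the local module."""
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


print(access_time_ns(processor=2, module=2))  # 100.0
print(access_time_ns(processor=2, module=5))  # 300.0
print(average_access_time_ns(0.75))           # 150.0
```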

  32. Multiprocessor Systems: MIMD System Architecture. Cache Coherent NUMA (CC-NUMA): • Similar to NUMA except each processor includes cache memory • The cache can buffer data from memory modules that are not local to the processor, which can reduce the access time of the memory transfers • Creates a problem when two or more caches hold the same piece of data • A solution to this problem is cache only memory access (COMA)

  33. Multiprocessor Systems: MIMD System Architecture. Cache Only Memory Access (COMA): • Each processor’s local memory is treated as a cache • When the processor requests data that is not in its cache (local memory), the system loads that data into local memory as part of the memory operation
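The load-on-miss behavior described above can be sketched as follows. This is a deliberately simplified model: `ComaNode` is a hypothetical class, the `rest_of_system` dictionary stands in for data held by other nodes, and eviction and coherence between nodes are not modeled.

```python
class ComaNode:
    """One processor whose local memory is treated as a cache."""

    def __init__(self, rest_of_system: dict):
        self.local = {}                       # local memory acting as a cache
        self.rest_of_system = rest_of_system  # stand-in for data held elsewhere
        self.misses = 0

    def read(self, address: int):
        if address not in self.local:
            # Miss: load the data into local memory as part of the read.
            self.misses += 1
            self.local[address] = self.rest_of_system[address]
        return self.local[address]


node = ComaNode({0x10: 42})
print(node.read(0x10), node.misses)  # 42 1  (first access misses)
print(node.read(0x10), node.misses)  # 42 1  (second access is local)
```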

  34. Multiprocessor Systems: MIMD System Architecture. Multicomputer: • An MIMD machine in which not all processors are under the control of one operating system • Each processor or group of processors is under the control of a different operating system, or a different instantiation of the same operating system • Two different types: - Network or cluster of workstations (NOW or COW) - Massively parallel processor (MPP)

  35. Multiprocessor Systems: MIMD System Architecture. Network of workstations (NOW) or cluster of workstations (COW): • More than a group of workstations on a local area network (LAN) • Have a master scheduler, which matches tasks and processors together
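The slides do not say how the master scheduler matches tasks to processors. As one possible policy, the sketch below hands the largest remaining task to the currently least-loaded workstation; the task costs, workstation names, and the `schedule` function are all assumptions made for illustration.

```python
import heapq


def schedule(task_costs: dict, workstations: list) -> dict:
    """Greedy matching: largest task first, to the least-loaded workstation."""
    heap = [(0, name) for name in workstations]  # (current load, workstation)
    heapq.heapify(heap)
    assignment = {name: [] for name in workstations}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)
        assignment[name].append(task)
        heapq.heappush(heap, (load + cost, name))
    return assignment


print(schedule({"a": 4, "b": 3, "c": 2, "d": 1}, ["ws1", "ws2"]))
# {'ws1': ['a', 'd'], 'ws2': ['b', 'c']}
```

In this example both workstations end up with a total load of 5.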

  36. Multiprocessor Systems: MIMD System Architecture. Massively Parallel Processor (MPP): • Consists of many self-contained nodes, each having a processor, memory, and hardware for implementing internal communications • The processors communicate with each other using shared memory • Example: IBM’s Blue Gene computer

  37. Thank you! Any Questions???