1 / 26

Models of Parallel Processing

Models of Parallel Processing. Parallel processors come in many different varieties. Thus, we often deal with abstract models of real machines. Development of Early Models (1). Associative processing (AP) was perhaps the earliest form of parallel processing.

mmcgowan
Download Presentation

Models of Parallel Processing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Models of Parallel Processing Part I

  2. Parallel processors come in many different varieties. • Thus, we often deal with abstract models of real machines. Part I

  3. Development of Early Models (1) • Associative processing (AP) was perhaps the earliest form of parallel processing. • Associative or content-addressable memories (AMs, CAMs), which allow memory cells to be accessed based on contents rather than their physical locations within the memory array. • AMI AP architectures are essentially based on incorporating simple processing logic into the memory array so as to remove the need for transferring large volumes of data through the limited-bandwidth interface between the memory and the processor (the von Neumann bottleneck) Part I

  4. Development of Early Models (2) • the AM/AP model has evolved through the incorporation of additional capabilities, so that it is in essence converging with SIMD-type array processors. Part I

  5. Development of Early Models (3) • neural networks • Cellular automata Part I

  6. Part I

  7. Part I

  8. SIMD Vs. MIMD (1) • Most early parallel machines had SIMD designs. • Within the SIMD category, two fundamental design choices exist: • Synchronous versus asynchronous SIMD • A possible cure is to use the asynchronous version of SIMD, known as SPMD • Custom- versus commodity-chip SIMD Part I

  9. SIMD Vs. MIMD (2) • In the 1990s, the MIMD paradigm has become more popular recently. • MIMD machines are most effective for medium- to coarse-grain parallel applications, where the computation is divided into relatively large subcomputations or tasks whose executions are assigned to the various processors. Part I

  10. SIMD Vs. MIMD (3) • Within the MIMD class, three fundamental issues or design choices are subjects of ongoing debates in the research community. • MPP-massively or moderately parallel processor • Is it more cost-effective to build a parallel processor out of a relatively small number of powerful processors or a massive number of very simple processors • Tightly versus loosely coupled MIMD • network of workstations (NOW), cluster computing, Grid Computing • Explicit message passing versus virtual shared memory Part I

  11. Global Vs. Distributed Memory (1) • Within the MIMD class of paranel processors, memory can be global or distributed. • Global memory may be visualized as being in a central location where all processors can access it with equal ease. • memory­latency-hiding techniques must be employed. An example of such methods is the use of multithreading. Part I

  12. Part I

  13. Global Vs. Distributed Memory (2) • Examples for both the processor-to-memory and processor-to-processor networks include: • an abstract model of global-memory computers, known as PRAM. • One approach to reducing the amount of data that must pass through the processor-to­memory interconnection network is to use a private cache memory. (locality of data access, cache coherence problem) Part I

  14. Part I

  15. Global Vs. Distributed Memory (3) • Distributed-memory architectures can be conceptually viewed as in Fig. 4.5. • In addition to the types of interconnection networks enumerated for shared-memory parallel processors, distributed-memory MIMD architectures can also be interconnected by a variety of direct networks. (as nonuniform memory access (NUMA) architectures) Part I

  16. Part I

  17. PRAM Shared-Memory Model (1) • The theoretical model used for conventional or sequential computers (SISD class) is known as the random-access machine (RAM) • The parallel version of RAM (PRAM), constitutes an abstract model of the class of global-memory parallel processors. The abstraction consists of ignoring the details of the processor-to-memory interconnection network and taking the view that each processor can access any memory location in each machine cycle, independent of what other processors are doing. Part I

  18. Part I

  19. PRAM Shared-Memory Model (2) • In the formal PRAM model, a single processor is assumed to be active initially. In each computation step, each active processor can read from and write into the shared memory and can also activate another processor. • Even though the global-memory architecture was introduced as a subclass of the MIMD class, the abstract PRAM model depicted in Fig. 4.6 can be SIMD or MIMD. Part I

  20. Part I

  21. PRAM Shared-Memory Model (3) • This implies that each instruction cycle would have to consume Ω(log p) real time. • The above point is important when we try to compare PRAM algorithms with those for distributed-memory models. An O(log p)-step PRAM algorithm may not be faster than an O(1og2p)-step algorithm for a hypercube architecture. Part I

  22. Distributed-Memory or Graph Models (1) • Given the internal processor and memory structures in each node, a distributed-memory architecture is characterized primarily by the network used to interconnect the nodes. • This network is usually represented as a graph. • Important parameters of an interconnec­tion network include • Network diameter: the longest of the shortest paths between various pairs of nodes • Bisection (band)width: the smallest number (total capacity) of links that need to be cut in order to divide the network into two subnetworks of half the size. • Vertex or node degree: the number of communication ports required of each node Part I

  23. Part I

  24. Part I

  25. Distributed-Memory or Graph Models (2) • Even though the distributed-memory architecture was introduced as a subclass of the MIMD class, machines based on networks of the type shown in Fig. 4.8 can be SIMD- or MIMD-type. • Fig. 4.9 are available for reducing bus traffic by taking advantage of the locality of communication within small clusters of processors. Part I

  26. Part I

More Related