

  1. Motivation for Parallel Computing and a Comparison of 3 Parallel Computing Systems Çankaya University Ceng 505 Ferdi Tekin

  2. Outline • Introduction to Parallel Computing • Motivation for Parallel Computing • Memory Architectures • PVM • MPI • OpenMP

  3. There are three ways to do anything faster: • Work harder • Work smarter • Get help

  4. In computers • Work harder => increase the processor speed • Work smarter => use a better algorithm • Get help => use parallel processing

  5. Parallel Computing - How does it work? • Divide the problem among many processors • Each processor does a portion of the work assigned to it while interacting with its peers to exchange data Examples: • Sticking stamps on envelopes • Car production line

  6. Parallel Computing Applications • Weather prediction • Modelling of nuclear explosions, the biosphere, and financial markets • Engineering Problems • Vehicle design and dynamics • Oil Exploration • Pattern Discovery and Recognition • Analysis of Protein structures and many more...

  7. Parallel Computers • Massively Parallel Processors (MPPs) • Large number of processors connected to gigabytes of memory via proprietary hardware • Enormous computing power ‘in a box’ • Made by the likes of IBM, Cray, SGI, et al • Very costly! • Distributed / Cluster Computing • A number of computers connected by a network are used to solve a single large problem • Could consist of ordinary workstations or even MPPs • Considerably cheaper to set up

  8. How do the processors interact? • Shared Memory • Distributed Memory (Message-Passing) • Hybrid Distributed-Shared Memory

  9. Shared Memory • Memory space is shared between multiple processors for read & write operations • Processors interact by modifying data objects in the shared address space

  10. Distributed Memory • Many processors, each with local memory accessible only to it, are connected via an interconnection network • They communicate by passing messages to each other

  11. Hybrid Distributed-Shared Memory • The largest and fastest computers in the world today employ both shared and distributed memory architectures

  12. Programming in Parallel • Have to design and code the application as a set of cooperating processes • Message passing libraries are available for C, C++ and Fortran which enable communication between processes • Parallel Virtual Machine (PVM) • Message Passing Interface (MPI)

  13. Distributed Memory Systems • PVM : Parallel Virtual Machine • MPI : Message Passing Interface

  14. PVM: Parallel Virtual Machine • software package that permits a heterogeneous collection of Unix and/or Windows computers hooked together by a network to be used as a single large parallel computer • PVM enables users to exploit their existing computer hardware to solve much larger problems at minimal additional cost • Hundreds of sites around the world are using PVM to solve important scientific, industrial, and medical problems in addition to PVM's use as an educational tool to teach parallel programming

  15. PVM: Parallel Virtual Machine (2) • PVM handles all message routing, data conversion, and task scheduling across a network of incompatible computer architectures • has been compiled on everything from laptops to CRAYs http://www.csm.ornl.gov/pvm/pvm_home.html

  16. PVM Features • User-configured host pool • The application's computational tasks execute on a set of machines that are selected by the user for a given run of the PVM program • The host pool may be altered by adding and deleting machines during operation (an important feature for fault tolerance) • Translucent access to hardware • Application programs either may view the hardware environment as an attributeless collection of virtual processing elements or may choose to exploit the capabilities of specific machines in the host pool by positioning certain computational tasks on the most appropriate computers.
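
Both features above are exposed through ordinary library calls. A minimal sketch, assuming a running PVM daemon; the host name and the "worker" executable are hypothetical, chosen only for illustration:

    #include <stdio.h>
    #include <pvm3.h>

    int main(void)
    {
        int mytid = pvm_mytid();       /* enroll this process in the virtual machine */

        /* User-configured host pool: add a machine at run time (hypothetical name). */
        char *hosts[] = { "node17.example.edu" };
        int   infos[1];
        pvm_addhosts(hosts, 1, infos);

        /* Translucent access: let PVM choose a machine for the task ...             */
        int tid_any;
        pvm_spawn("worker", (char **)0, PvmTaskDefault, "", 1, &tid_any);

        /* ... or position the task on a specific host explicitly.                   */
        int tid_pinned;
        pvm_spawn("worker", (char **)0, PvmTaskHost, "node17.example.edu", 1, &tid_pinned);

        printf("task %x spawned %x and %x\n", mytid, tid_any, tid_pinned);
        pvm_exit();
        return 0;
    }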

  17. PVM Features (2) • Process-based computation • The unit of parallelism in PVM is a task (often but not always a Unix process) • No process-to-processor mapping is implied or enforced by PVM • Explicit message-passing model • Collections of computational tasks, each performing a part of an application's workload using data-, functional-, or hybrid decomposition, cooperate by explicitly sending and receiving messages to one another

  18. PVM Features (3) • Heterogeneity support • The PVM system supports heterogeneity in terms of machines, networks, and applications. • With regard to message passing, PVM permits messages containing more than one datatype to be exchanged between machines having different data representations. • Multiprocessor support • PVM uses the native message-passing facilities on multiprocessors to take advantage of the underlying hardware. • Vendors often supply their own optimised PVM for their systems, which can still communicate with the public PVM version.
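
To make the "more than one datatype" point concrete, here is a small sketch of a single PVM message carrying mixed data; the values and the message tag are illustrative. PvmDataDefault selects XDR encoding, so the receiver may have a different data representation:

    #include <pvm3.h>

    void send_mixed(int dest_tid)
    {
        int    rows    = 64;
        double scale   = 0.5;
        char   label[] = "block-A";          /* illustrative payload */

        pvm_initsend(PvmDataDefault);        /* XDR encoding for heterogeneous hosts   */
        pvm_pkint(&rows, 1, 1);              /* pack an int ...                        */
        pvm_pkdouble(&scale, 1, 1);          /* ... a double ...                       */
        pvm_pkstr(label);                    /* ... and a string, all in one message   */
        pvm_send(dest_tid, 1);               /* message tag 1 */
    }

The receiver unpacks the fields in the same order with pvm_upkint, pvm_upkdouble and pvm_upkstr.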

  19. PVM System • PVM System is composed of two parts: • daemon • resides on all the computers making up the virtual machine • Library of PVM interface routines • contains user-callable routines for message passing, spawning processes, coordinating tasks, and modifying the virtual machine

  20. PVM code samples • Processes are ‘spawned’ – • num_tasks = pvm_spawn(pvmmm, (char**)0, 0, "", r-1, &tids[1]); • Data is ‘packed’ and sent – • pvm_initsend( PvmDataDefault ); • pvm_pkint( &b.rows, 1, 1 ); • pvm_send( tids[m], 1 ); • Data is received and ‘unpacked’ – • pvm_recv( tids[0], 0 ); • pvm_upkint( &r, 1, 1 ); • pvm_upkint( tids, r, 1 ); • Inform the local PVM daemon this process is leaving – • pvm_exit();
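
The fragments above are taken out of a larger program; for orientation, a self-contained master/worker sketch along the same lines is shown below. The "pvm_worker" executable name, the work items, and the message tags are illustrative, not those of the original program:

    /* master.c - spawn workers, send each an integer, collect the squares */
    #include <stdio.h>
    #include <pvm3.h>

    #define NWORK 4

    int main(void)
    {
        int tids[NWORK], i, value, result;

        pvm_mytid();                                   /* enroll in the virtual machine */
        pvm_spawn("pvm_worker", (char **)0, PvmTaskDefault, "", NWORK, tids);

        for (i = 0; i < NWORK; i++) {                  /* pack and send the work */
            value = i + 1;
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&value, 1, 1);
            pvm_send(tids[i], 1);
        }
        for (i = 0; i < NWORK; i++) {                  /* receive and unpack the results */
            pvm_recv(tids[i], 2);
            pvm_upkint(&result, 1, 1);
            printf("worker %d returned %d\n", i, result);
        }
        pvm_exit();                                    /* leave the virtual machine */
        return 0;
    }

    /* pvm_worker.c - receive an int from the parent, send back its square */
    #include <pvm3.h>

    int main(void)
    {
        int parent = pvm_parent(), value;

        pvm_recv(parent, 1);
        pvm_upkint(&value, 1, 1);
        value *= value;
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&value, 1, 1);
        pvm_send(parent, 2);
        pvm_exit();
        return 0;
    }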

  21. MPI: Message Passing Interface • A library specification for message passing • proposed as a standard by a committee of vendors, implementers, and users • The impetus for developing MPI was that each Massively Parallel Processor (MPP) vendor was creating its own proprietary message-passing API. In this scenario it was not possible to write a portable parallel application

  22. MPI: Message Passing Interface (2) • designed for high performance on both massively parallel machines and workstation clusters • Widely available, with both freely available and vendor-supplied implementations • Examples: • LAM/MPI, MPICH/MPICH2, MPICH-G2 • Vendor implementations by Sun, IBM, SGI etc. • http://www-unix.mcs.anl.gov/mpi/

  23. MPI Features • A large set of point-to-point communication routines • A large set of collective communication routines for communication among groups of processes • A communication context that provides support for the design of safe parallel software libraries • The ability to specify communication topologies • The ability to create derived datatypes that describe messages of noncontiguous data
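
As a small illustration of the collective routines listed above, the sketch below broadcasts a problem size from rank 0 and reduces the partial sums back onto it; the computation itself is illustrative:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int  rank, size, n = 0;
        long local_sum = 0, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) n = 1000;                       /* root chooses the problem size     */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* collective: every rank receives n */

        for (int i = rank + 1; i <= n; i += size)      /* each rank sums its stride of 1..n */
            local_sum += i;

        /* collective: combine the partial sums onto rank 0 */
        MPI_Reduce(&local_sum, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) printf("sum 1..%d = %ld\n", n, total);
        MPI_Finalize();
        return 0;
    }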

  24. MPI code samples • Initialise – • MPI_Init( &argc, &argv ); • MPI_Comm_rank( MPI_COMM_WORLD, &me ); • MPI_Comm_size( MPI_COMM_WORLD, &size ); • Sending data – • MPI_Send( &b.rows, 1, MPI_INT, m, 2, MPI_COMM_WORLD ); • Receiving Data – • MPI_Recv( &b.rows, 1, MPI_INT, 0, 2, MPI_COMM_WORLD, &status ); • Cleanup prior to exiting – • MPI_Finalize();
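
For context, the calls above fit together roughly as in this self-contained sketch; the payload, tag and ranks are illustrative rather than those of the original program:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int me, size, rows = 0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (me == 0) {
            rows = 128;                                /* illustrative value                */
            for (int m = 1; m < size; m++)             /* point-to-point sends, tag 2       */
                MPI_Send(&rows, 1, MPI_INT, m, 2, MPI_COMM_WORLD);
        } else {
            MPI_Recv(&rows, 1, MPI_INT, 0, 2, MPI_COMM_WORLD, &status);
            printf("rank %d received rows = %d\n", me, rows);
        }

        MPI_Finalize();                                /* cleanup prior to exiting          */
        return 0;
    }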

  25. PVM vs. MPI • MPI is faster within a large multiprocessor • MPI has more point-to-point and collective communication options than PVM • PVM is better when applications will be run over heterogeneous networks • PVM allows development of fault-tolerant applications (that can survive host or task failures) • PVM provides a powerful set of dynamic resource management and process control functions

  26. Shared Memory Systems • OpenMP : • Open Standard for multi-processing

  27. OpenMP • supports multi-platform shared-memory parallel programming • Jointly defined by a group of major computer hardware and software vendors • Portable and scalable model - gives shared-memory parallel programmers a simple and flexible interface for developing parallel applications

  28. OpenMP: Goals • Standardization: • Provide a standard among a variety of shared memory architectures/platforms • Lean and Mean: • Establish a simple and limited set of directives for programming shared memory machines. • Ease of Use: • Provide the capability to incrementally parallelize a serial program (see the loop sketch below), unlike message-passing libraries, which typically require an all-or-nothing approach • Provide the capability to implement both coarse-grain and fine-grain parallelism • Portability: • Supports Fortran (77, 90, and 95), C, and C++ • Public forum for API and membership
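
A minimal sketch of that incremental approach: the serial dot-product loop below becomes parallel by adding a single directive, and a compiler without OpenMP support simply ignores it (the loop body is illustrative):

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000

    int main(void)
    {
        static double a[N], b[N];
        double dot = 0.0;

        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* The only change from the serial version is this one directive. */
        #pragma omp parallel for reduction(+:dot)
        for (int i = 0; i < N; i++)
            dot += a[i] * b[i];

        printf("dot = %g (using up to %d threads)\n", dot, omp_get_max_threads());
        return 0;
    }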

  29. OpenMP Features • Thread-Based Parallelism • A shared memory process can consist of multiple threads • OpenMP is based upon the existence of multiple threads in the shared memory programming paradigm. • Explicit Parallelism • OpenMP is an explicit (not automatic) programming model, offering the programmer full control over parallelisation. • Fork-Join Model of parallel execution: • All OpenMP programs begin as a single process: the master thread. The master thread executes sequentially until the first parallel region construct is encountered. • FORK: the master thread then creates a team of parallel threads. The statements in the program that are enclosed by the parallel region construct are then executed in parallel among the various team threads • JOIN: When the team threads complete the statements in the parallel region construct, they synchronise and terminate, leaving only the master thread
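
The fork-join description maps directly onto the basic parallel construct, as in this minimal sketch:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        printf("master thread executing sequentially\n");

        /* FORK: a team of threads executes this region */
        #pragma omp parallel
        {
            printf("hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }   /* JOIN: the team synchronises here; only the master continues */

        printf("master thread again after the join\n");
        return 0;
    }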

  30. OpenMP Features (2) • Compiler Directive Based • Virtually all of OpenMP parallelism is specified through the use of compiler directives which are embedded in C/C++ or Fortran source code • Nested Parallelism Support • The API provides for the placement of parallel constructs inside other parallel constructs. • Implementations may or may not support this feature. • Dynamic Threads • The API provides for dynamically altering the number of threads which may be used to execute different parallel regions. • Implementations may or may not support this feature.
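
A sketch of the two optional features above; as the slide notes, whether nesting and dynamic thread adjustment actually take effect is implementation-dependent:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        omp_set_dynamic(1);          /* allow the runtime to adjust team sizes    */
        omp_set_nested(1);           /* request support for nested regions        */

        #pragma omp parallel num_threads(2)          /* outer region              */
        {
            int outer = omp_get_thread_num();
            #pragma omp parallel num_threads(2)      /* nested region, if supported */
            {
                printf("outer thread %d, inner thread %d\n",
                       outer, omp_get_thread_num());
            }
        }
        return 0;
    }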

  31. OpenMP: Disadvantages • Not meant for distributed memory parallel systems (by itself) • Not necessarily implemented identically by all vendors • Not guaranteed to make the most efficient use of shared memory

  32. Summary • Introduced Parallel Computing • Memory Architectures • Distributed Memory Systems • PVM • MPI • Shared Memory Systems • OpenMP

  33. THANKS
