
project18’s Communication Drawing Design

Presentation Transcript


  1. project18’s Communication Drawing Design By: Camilo A. Silva Bioinformatics Summer 2008

  2. Objective • Find out what type of MPI communication design could be used for project18 • Determine which MPI functions could be used to accomplish the above objective

  3. Communication Design • What is needed? • All nodes must have the basic data before the program starts executing • We need a “master/slave” model • At the end, all data must be collected and sent back to the master node • The communication flow and data computation should be dynamic, using all available resources. E.g., if a processor completes a search, it continues with the next data computation independently, without waiting for other processors to finish • There must be a communication flow in which the master node tracks the completion status of the computation by gathering information from the slave nodes • A deadlock-avoidance mechanism must be implemented
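These requirements suggest the usual MPI master/slave skeleton. A minimal sketch in C, assuming rank 0 is the master; the commented-out helpers (master_loop, slave_loop) are hypothetical placeholders, not part of project18.

    /* Minimal master/slave skeleton (sketch only). */
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* Master: hand out work, record completions, reassign dynamically. */
            /* master_loop(size);  -- hypothetical helper, sketched on later slides */
        } else {
            /* Slave: receive an index range, compute, report back, repeat. */
            /* slave_loop(rank);   -- hypothetical helper, sketched on later slides */
        }

        MPI_Finalize();
        return 0;
    }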

  4. Blueprint The master node coordinates the processes and keeps track of the status of each process on each node. The slave nodes follow the master node’s coordination. Their processes should be independent, and they should report their progress to the master node efficiently. • At the beginning of the program, every node needs the essential data required for the program to run. • At the end of the program, the output of each node is collected as one and sent to the master node for storage and access.

  5. At the beginning… • Let’s assume that all the nodes already have all the data needed for the project18 program to run successfully. • When the program is run from the GCB cluster or any other, the user needs to indicate which genomes will be compared: genome1 vs. genome2 • That info will be sent to all nodes with the collective function MPI_Bcast(): MPI_Bcast(&nameOfGenome1, 20, MPI_CHAR, 0, MPI_COMM_WORLD);
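A sketch of that broadcast step, assuming rank 0 holds the user's selection and the 20-character buffers implied by the call above; everything except the MPI_Bcast call itself is illustrative.

    #include <mpi.h>
    #include <string.h>

    /* Sketch: broadcast the user's genome selection from the master (rank 0). */
    char nameOfGenome1[20];
    char nameOfGenome2[20];

    void broadcast_genome_names(int rank, const char *g1, const char *g2) {
        if (rank == 0) {                     /* only the master fills the buffers */
            strncpy(nameOfGenome1, g1, sizeof(nameOfGenome1) - 1);
            strncpy(nameOfGenome2, g2, sizeof(nameOfGenome2) - 1);
        }
        /* Every rank calls MPI_Bcast; rank 0 sends, all other ranks receive. */
        MPI_Bcast(nameOfGenome1, 20, MPI_CHAR, 0, MPI_COMM_WORLD);
        MPI_Bcast(nameOfGenome2, 20, MPI_CHAR, 0, MPI_COMM_WORLD);
    }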

  6. Initialization • The master node will then orchestrate and administer the computation among the nodes: • Since all nodes have the same data and info, each node will be given a specific range of indexes to process • Such indexes are base locations of genome1 to be contrasted with genome2 • Here the communication is point-to-point, since the master node communicates with each node independently • Each slave node will compute according to its assigned range of indexes. The results are stored in a text file on each node.
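One way the master could hand out the index ranges point-to-point is sketched below; the message tag, the {start, count} layout, and the even split are assumptions, not project18's actual scheme.

    #include <mpi.h>

    #define TAG_ASSIGN 1   /* hypothetical tag for work assignments */

    /* Sketch: master sends each slave a {starting index, range} pair. */
    void distribute_initial_indexes(int num_slaves, int range) {
        int assignment[2];
        for (int node = 1; node <= num_slaves; node++) {
            assignment[0] = (node - 1) * range;   /* starting base index        */
            assignment[1] = range;                /* number of bases, e.g. seven */
            MPI_Send(assignment, 2, MPI_INT, node, TAG_ASSIGN, MPI_COMM_WORLD);
        }
    }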

  7. Initialization Example Genome1=“aaaaaaacccccccgggggggtttttttcccccccaaaaaaagggggggtttttttcccccc…” • The index boundaries 6, 13, 20, 27, 34, 41, 48, 55, … visualize the array of indexes distributed to the individual nodes (one per node, 1 through 7 in the original diagram). In this case, we are using a range of seven (7) bases per process. • In this example, let’s assume that the search range of the probe is 14. Thus, if node 1 computes the results of the first 6 bases, the iterations are as follows: • Find pattern “aaaaaaaccccccc” in genome2 • 2nd pattern “aaaaaacccccccg” • Etc… until “acccccccgggggg” • The results of each node are stored on disk as a text file.
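A sketch of the per-node iteration just described, assuming a probe length of 14 and a hypothetical find_pattern() helper (here a plain strstr); it is not the project's real search routine.

    #include <stdio.h>
    #include <string.h>

    #define PROBE_LEN 14   /* assumed search range of the probe, per the slide */

    /* Hypothetical helper: returns 1 if pattern occurs in genome2, else 0. */
    int find_pattern(const char *genome2, const char *pattern) {
        return strstr(genome2, pattern) != NULL;
    }

    /* Sketch: slide a 14-base probe across this node's assigned index range
       and append each result to a per-node text file on local disk.
       Assumes genome1 extends at least PROBE_LEN bases past the last index. */
    void search_range(const char *genome1, const char *genome2,
                      int start, int count, const char *out_path) {
        FILE *out = fopen(out_path, "a");
        if (!out) return;
        for (int i = start; i < start + count; i++) {
            char probe[PROBE_LEN + 1];
            strncpy(probe, genome1 + i, PROBE_LEN);
            probe[PROBE_LEN] = '\0';
            fprintf(out, "%d %s %d\n", i, probe, find_pattern(genome2, probe));
        }
        fclose(out);
    }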

  8. Master node as receiver and manager As some of you may have predicted, the master node will receive a lot of communication from all the different nodes. This communication is point-to-point, and the function used to accomplish it is MPI_Recv(). The master node acts as a manager: it receives completion codes from each node and records those completions appropriately. After recording a node’s completion status, the master node administers and orchestrates that node’s next process. This is done with a simple algorithm involving int arrays, like the index array shown previously.
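A sketch of such a receive loop, using MPI_ANY_SOURCE so the master reacts to whichever slave reports first; the tags, the next_index bookkeeping, and the commented-out record_completion() helper are assumptions.

    #include <mpi.h>

    #define TAG_DONE   2    /* hypothetical tag for completion reports */
    #define TAG_ASSIGN 1

    /* Sketch: master receives completion counts and reassigns work dynamically. */
    void master_loop(int jobs_remaining, int range) {
        int next_index = 55;                 /* e.g. the "next" column on slide 10 */
        while (jobs_remaining > 0) {
            int count;
            MPI_Status st;
            /* Block until any slave reports; st.MPI_SOURCE tells us which one. */
            MPI_Recv(&count, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                     MPI_COMM_WORLD, &st);
            /* record_completion(st.MPI_SOURCE, count);  -- bookkeeping, slide 10 */
            int assignment[2] = { next_index, range };
            MPI_Send(assignment, 2, MPI_INT, st.MPI_SOURCE, TAG_ASSIGN,
                     MPI_COMM_WORLD);
            next_index += range;
            jobs_remaining--;
        }
    }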

  9. Keeping trustworthy accountability • The master node needs to know the completion status of each process in order to hold each node accountable for finishing its work • Based on the communication sent by a node, the master node determines whether all of its processes were completed. • This can be done by implementing a simple completion counter in each node, updated after each search of the probe. This int counter is returned to the master node, which verifies that the count matches the assigned index range. • The result could be stored in various formats, as explained on the following slide. • With an accountable system, the master node can resubmit any job that was not completed or did not finish
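A slave-side sketch of that completion counter: it is updated after each probe search and returned to the master, which can compare it against the assigned range. The tag and the commented-out search step are illustrative assumptions.

    #include <mpi.h>

    #define TAG_DONE 2   /* hypothetical completion tag, as in the master sketch */

    /* Sketch: count probe searches and report the total back to rank 0. */
    void report_completion(int start, int range,
                           const char *genome1, const char *genome2) {
        int completed = 0;
        for (int i = start; i < start + range; i++) {
            /* search_probe(genome1, genome2, i);  -- hypothetical search step */
            completed++;                    /* updated after each probe search */
        }
        /* The master compares this value against the assigned index range. */
        MPI_Send(&completed, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
    }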

  10. Tracking down completion status • The master keeps an int status[][] table with one column per node, plus the range and the next index to assign:

          range  N1  N2  N3  N4  N5  N6  N7  next
              7   6  13  20  27  34  41  48    55    (status[0]: current starting index per node)
              7   6  13  20  27  34  41  48     0    (status[1]: completion code per node)

  • The completion code is the same integer as the respective current process (status[0][X]) while that process is not yet completed; if an error is found, it receives the value of zero (0). • Let’s assume that N3 is the first to complete its process. Suppose it completed the searches of its indexes successfully; then an int count = 7 is returned via MPI_Recv() to the master node.

  11. Tracking down completion status • When a process is successfully completed (e.g. the int count = 7 returned by N3), the data in status[][] is modified accordingly and the next process is dynamically assigned to the node that is ready to compute:

          range  N1  N2  N3  N4  N5  N6  N7  next
              7   6  13  55  27  34  41  48    62    (status[0]: N3 now works on index 55)
              7   6  13  55  27  34  41  48     0    (status[1]: completion codes)

  • In pseudocode: if (Check_errors()) { …check the error and determine what to do } else if (no errors in completion) { report completion and assign a new job }
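A sketch of that bookkeeping, assuming the two-row status[][] layout from slide 10 (row 0 = current starting index, row 1 = completion code); the error handling is reduced to the zero-code case and the helper name is hypothetical.

    /* Sketch: status[0][n] holds node n's current starting index,
       status[1][n] its completion code (0 signals an error).       */
    int status[2][9];   /* columns: range, N1..N7, next (slide 10)  */

    void record_completion(int node, int count, int range, int *next_index) {
        if (count != range) {
            status[1][node] = 0;             /* error: resubmit the same job     */
        } else {
            status[0][node] = *next_index;   /* assign the next starting index   */
            status[1][node] = *next_index;   /* code tracks the current process  */
            *next_index += range;            /* e.g. 55 -> 62 in the example     */
            status[0][8] = *next_index;      /* keep the "next" column in sync   */
        }
    }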

  12. Collecting the data • The output of the nodes is collected using MPI-IO and brought together at the master node.
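The slide only names MPI-IO, so here is a hedged illustration of one option: each rank writes its results at its own offset into a single shared file. The file name, fixed block size, and independent (rather than collective) write are assumptions.

    #include <mpi.h>
    #include <string.h>

    /* Sketch: every rank writes its result block into one shared output file. */
    void write_results(int rank, const char *results) {
        MPI_File fh;
        MPI_Offset block = 1024;            /* assumed fixed block size per rank */
        MPI_File_open(MPI_COMM_WORLD, "project18_results.txt",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, rank * block, results,
                          (int)strlen(results), MPI_CHAR, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }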

  13. Issues to consider… • Bottlenecking and deadlocking • What’s the solution? • Asynchronous communication strategies • Non-blocking strategies
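A non-blocking sketch of the kind of strategy listed above: the slave posts its completion report with MPI_Isend and the receive for its next assignment with MPI_Irecv before waiting on either, so neither side sits blocked on the other. Tags and message layout follow the earlier sketches and are assumptions.

    #include <mpi.h>

    #define TAG_DONE   2
    #define TAG_ASSIGN 1

    /* Sketch: report completion without blocking, then wait for the next job. */
    void report_and_wait(int completed, int *next_assignment /* int[2] */) {
        MPI_Request send_req, recv_req;

        /* Post the completion report; the slave can keep doing local work
           (e.g. flushing its result file) while the message is in flight.  */
        MPI_Isend(&completed, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD, &send_req);

        /* Post the receive for the next assignment before waiting, avoiding
           the send/receive ordering that can cause deadlock.                */
        MPI_Irecv(next_assignment, 2, MPI_INT, 0, TAG_ASSIGN,
                  MPI_COMM_WORLD, &recv_req);

        MPI_Wait(&send_req, MPI_STATUS_IGNORE);
        MPI_Wait(&recv_req, MPI_STATUS_IGNORE);
    }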

  14. What’s next? • Learn about MPI-IO • Study asynchronous and non-blocking communication in order to prevent bottlenecking and deadlocking • Start programming just for fun!
