Published by Henry Hill. Modified over 9 years ago.
1
project18’s Communication Drawing Design
By: Camilo A. Silva
BIOinformatics, Summer 2008
2
Objective
- Find out what type of MPI communication design could be used for project18.
- Determine which MPI functions could be used to accomplish the above objective.
3
Communication Design
What is needed?
- All nodes must have the basic data needed to run the program prior to execution.
- We need a “master/slave” model.
- All data must be collected at the end and sent back to the master node.
- Our communication flow and data computation should be dynamic and use all the resources. E.g., if a processor completes a search, it needs to continue with the next data computation independently, without waiting for other processors to finish.
- There needs to be a communication flow with the master node, which keeps track of the completion status of the computation by gathering information from the slave nodes.
- An anti-“deadlock” mechanism must be implemented.
4
Blueprint
- The master node is in charge of coordinating the processes and keeping track of the status of each process on each node.
- The slave nodes follow the coordination of the master node. Their processes should be independent, and they should report their progress to the master node effectively.
- At the beginning of the program, all the nodes need to have the essential data the program requires to run.
- At the end of the program, the output of each node needs to be collected as one and sent to the master node for storage and access.
5
At the beginning…
Let’s assume that all the nodes have all the data needed for the project18 program to run successfully.
1. When the program is run from the GCB cluster or any other, the user needs to indicate which genomes will be compared: genome1 vs. genome2.
2. That information is sent to all nodes with a collective function, MPI_Bcast():
   MPI_Bcast(&nameOfGenome1, 20, MPI_CHAR, 0, MPI_COMM_WORLD);
6
Initialization
The master node then orchestrates and administers the computation among the nodes:
1. Since all nodes have the same data and information, each node is given a specific range of indexes to process.
2. These indexes are base locations in genome1 to be compared against genome2.
   1. Here the communication is point-to-point, because the master node communicates with each node individually.
3. Each slave node computes according to its assigned range of indexes. The results are stored in a text file on each node.
7
Initialization Example
Genome1=“aaaaaaacccccccgggggggtttttttcccccccaaaaaaagggggggtttttttcccccc…”
Index array: 6 13 20 27 34 41 48 55 …
This is a visualization of the array of indexes to be distributed to the nodes. In this case, we are using a range of seven (7) bases per process. In this example, let’s assume that the search range of the discriminating probe is 14. Thus, if node 1 computes the results for the first seven bases (indexes 0 through 6), the iterations should be as follows:
1. Find pattern “aaaaaaaccccccc” in genome2
2. 2nd pattern: “aaaaaacccccccg”
3. Etc., until “acccccccgggggg”
The results of each node shall be stored on disk as a text file.
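The iteration in this example can be sketched in plain C. The function names (probe_at, job_last_index, run_job) and the search callback are hypothetical stand-ins, since the deck does not show the actual genome2 search; the constants 7 and 14 come from the slide.

```c
#include <stddef.h>
#include <string.h>

#define PROBE_LEN 14   /* search range of the probe, from the slide */
#define RANGE      7   /* start indexes handled per job, from the slide */

/* Copy the probe that starts at `start` into `out`.
 * Returns 1 on success, 0 if the probe would run past the end of the genome. */
int probe_at(const char *genome, size_t start, char out[PROBE_LEN + 1])
{
    size_t len = strlen(genome);
    if (len < PROBE_LEN || start > len - PROBE_LEN)
        return 0;
    memcpy(out, genome + start, PROBE_LEN);
    out[PROBE_LEN] = '\0';
    return 1;
}

/* Last start index of job `job` (0-based): 6, 13, 20, ... as in the index array. */
size_t job_last_index(size_t job)
{
    return (job + 1) * RANGE - 1;
}

/* Walk one job's start indexes and hand each probe to `search`, a
 * hypothetical callback standing in for the search in genome2 (may be NULL).
 * Returns the number of probes processed, usable as a completion count. */
int run_job(const char *genome, size_t job, void (*search)(const char *probe))
{
    char probe[PROBE_LEN + 1];
    size_t last = job_last_index(job);
    int count = 0;

    for (size_t start = last + 1 - RANGE; start <= last; ++start) {
        if (!probe_at(genome, start, probe))
            break;                /* ran off the end of genome1 */
        if (search)
            search(probe);
        ++count;
    }
    return count;
}
```

For job 0 on the Genome1 string above, the first probe is “aaaaaaaccccccc”, the last is “acccccccgggggg”, and run_job returns 7.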
8
Master node as receiver and manager
- As some of you may have predicted, the master node will be receiving a lot of communication from all the different nodes. This type of communication is point-to-point, and the function used to receive it is MPI_Recv().
- The master node acts as a manager. It receives completion codes from each node and records those completions.
- After recording the completion status of a node, the master node is in charge of assigning that node its next process. This is done with a simple algorithm involving int arrays, as shown previously.
9
Keeping trustworthy accountability
- The master node needs to know the completion status of each process in order to keep account of what each node has completed.
- The master node determines, based on the communication sent by the node, whether all processes were completed.
  – This can be done by implementing a simple completion counter in each node that is updated after each search of the discriminating probe. This int counter is returned to the master node, which verifies that the count matches the assigned index range.
  – The result could be stored in various formats, as explained in the following slide.
- With an accountable system, the master node is able to resubmit a job that failed or did not finish.
10
Tracking down completion status
int status[][]

            range  N1  N2  N3  N4  N5  N6  N7  next
status[0]       7   6  13  20  27  34  41  48    55
status[1]       7   6  13  20  27  34  41  48     0

The second row is the completion code. It holds the same integer as the respective current process (status[0][X]) while that process is not yet completed. If an error is found, it receives the value zero (0).
Let’s assume that N3 is the first one to complete its process, and that it completed the searches of its indexes successfully. Thus, an int count = 7 is sent back to the master node, which receives it with MPI_Recv().
11
Tracking down completion status
N3 returns: int 7

            range  N1  N2  N3  N4  N5  N6  N7  next
status[0]       7   6  13  55  27  34  41  48    62
status[1]       7   6  13  55  27  34  41  48     0

if (Check_errors()) { …check on error and determine what to do }
else if (no errors in completion) { report completion and assign new job }

When a process is successfully completed, the data of status[][] is modified accordingly and the next process is dynamically assigned to the node that is ready to compute.
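The table update can be sketched in plain C. The column layout (range, N1 to N7, next) follows the slide; the function name record_completion, the column macros, and the choice to leave a failed job in place for resubmission are assumptions made for illustration.

```c
#define N_NODES   7
#define COL_RANGE 0                 /* column holding the range per job */
#define COL_NEXT  (N_NODES + 1)     /* column holding the next unassigned index */
#define N_COLS    (N_NODES + 2)

/* Record the count returned by `node` (1-based, so N3 is node 3).
 * Row 0 is the current process per node, row 1 the completion code.
 * Returns 1 if the job was accepted and a new one assigned,
 * 0 if the count did not match the range and the node was flagged. */
int record_completion(int status[2][N_COLS], int node, int count)
{
    int range = status[0][COL_RANGE];

    if (count != range) {
        status[1][node] = 0;        /* completion code 0 marks an error */
        return 0;                   /* row 0 is kept so the job can be resubmitted */
    }

    status[0][node] = status[0][COL_NEXT];   /* assign the next process */
    status[1][node] = status[0][node];       /* code equals current process */
    status[0][COL_NEXT] += range;            /* advance the next pointer */
    return 1;
}
```

Starting from the first table and calling record_completion(status, 3, 7) gives N3 the value 55 in both rows and moves next from 55 to 62, matching the second table.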
12
Collecting the data
Master Node
Using MPI-IO
13
Issues to consider…
- Bottlenecking and “dead-locking”
- What’s the solution?
  – Asynchronous communication strategies
  – Non-blocking strategies
14
What’s next?
- Learn about MPI-IO.
- Study asynchronous and non-blocking communication in order to prevent bottlenecking and deadlocking.
- Start programming just for fun!