
Project18 Communication Design + Parallelization. Camilo A. Silva, Bioinformatics, Summer 2008

Goals
- Design a communication structure for project18
- Provide a clear map and detailed instructions for parallelizing the code
- Oversee a self-managing fault system for project18
- Share ideas on how to self-optimize project18

Main Structure
A master node communicating with all slave nodes.
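As a point of reference, here is a minimal MPI sketch of this master/slave layout; the printf lines are placeholders for the real master and slave logic developed in the following slides.

#include <stdio.h>
#include "mpi.h"

#define MASTER_NODE 0

int main ( int argc, char *argv [ ] )
{
  int my_rank, numOfNodes ;

  MPI_Init ( &argc, &argv ) ;
  MPI_Comm_rank ( MPI_COMM_WORLD, &my_rank ) ;     /* which node am I?         */
  MPI_Comm_size ( MPI_COMM_WORLD, &numOfNodes ) ;  /* how many nodes in total? */

  if ( my_rank == MASTER_NODE )
  {
    /* master: distribute the genome-pair tasks and collect completion reports */
    printf ( "Master node managing %d slave node(s)\n", numOfNodes - 1 ) ;
  }
  else
  {
    /* slave: receive a task, run project18, report back to the master */
    printf ( "Slave node %d ready for work\n", my_rank ) ;
  }

  MPI_Finalize ( ) ;
  return 0 ;
}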

Objective
The plan is to run project18 on different nodes at the same time. Each node will create an output file that lists the discriminating probes found between the two genomes compared.

How?
- The master node acquires information from the user about the different genome pairs to be compared by project18.
- The master node administers the data and creates a job for each slave node.
- Each slave node receives its data from the master node and starts executing project18.
- After a node has completed its task, it reports its completion to the master node, which determines whether there are more tasks to be completed. If there are, the next task is given to that node.
- When the program has finished, all results are stored in a predefined directory where they are available for review.

Communication Drawing Design
User input: { (g1,g2), (g1,g3), ..., (g2,g3), (g3,g4), ... }
Each pair is assigned to a node, e.g. (g1,g2) -> node0, (g1,g3) -> node1, ..., (g2,g3) -> node7, (g3,g4) -> ? (waits until a node becomes free).
Each pair produces its own output file, e.g. (g2,g3) -> g2_g3.txt.
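To make the mapping above concrete, here is a small hypothetical helper (pairToTask is an illustration only, not part of project18): it splits a pair string such as "g2*g3" into an output filename like g2_g3.txt and picks a slave node round-robin, assuming node 0 is reserved for the master as in the later slides.

#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a pair string such as "g2*g3" into an output
   filename "g2_g3.txt" and choose a slave node for it (round-robin). */
void pairToTask ( const char *pair, int pairIndex, int numOfNodes,
                  char *outFile, size_t outLen, int *node )
{
  char left [20], right [20] ;

  /* split the pair string at the '*' separator used in the slides */
  sscanf ( pair, "%19[^*]*%19s", left, right ) ;
  snprintf ( outFile, outLen, "%s_%s.txt", left, right ) ;

  /* node 0 is the master; slaves are 1 .. numOfNodes-1 */
  *node = 1 + ( pairIndex % ( numOfNodes - 1 ) ) ;
}

/* Example: pairToTask ( "g2*g3", 6, 8, file, sizeof file, &n )
   produces file = "g2_g3.txt" and n = 7. */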

Parallel Program Design
[Diagram: the master node dispatches tasks to the slave nodes between start and end; each slave returns a completion signal to the master, which issues the finish.]
Legend: M.N. = Master Node; 1-7 = Slave Nodes; F = Finish; C = Completion

Parallelization Roadmap
In the following slides, each section of the parallel program design code is explained in order to parallelize project18. Each slide represents a single diagram element or section of the parallel design.
NOTE: if you need detailed information on the MPI functions, go to this link:

Start
/* In order to start a program using MPI the following headers must be present…
   One may add as many other headers as necessary. */
#include <stdio.h>      /* printf, sprintf (used later)  */
#include <string.h>     /* strlen (used later)           */
#include "mpi.h"
#include "project18.h"

start
/* Sometimes one may want to define some constants. The other functions to be
   used also need to be declared. */
#define MASTER_NODE 0
#define BUFFER 100
#define TAG 0

void createFolder ( const char *filename, const char *newFileName ) ;
int checkQueue ( char *queue ) ;
void assignSingleTasks ( int node, char *queue ) ;
void taskControl ( int node, char *queue ) ;
//etc…

start
/* To start an MPI program one needs to initialize it in main(). Program
   variables should be defined here as well. */
int main ( int argc, char *argv [ ] )
{
  MPI_Status status ;
  char filename [20] ;
  char fileToCreate [20] ;
  int my_rank, numOfNodes, queueItemsLeft, start = 1 ;
  ... //as many as needed

  //initializes the MPI program
  MPI_Init ( &argc, &argv ) ;
  //defines the rank of the node, or simply determines which node this is
  MPI_Comm_rank ( MPI_COMM_WORLD, &my_rank ) ;
  //finds out how many processors are active
  MPI_Comm_size ( MPI_COMM_WORLD, &numOfNodes ) ;

Master node start
/* The master node is selected in order for the user to input some values for
   the project18 parameters. */
if ( my_rank == MASTER_NODE )
{
  //ask user for input…
  …
  //create a queue or a data structure of the like…
  …

Master node start, continued…
if ( my_rank == MASTER_NODE )
{
  …
  /* Now that a queue (or a data structure of the like) contains the genomes to
     compare, they need to be sent to each node in order to start execution of
     the program. Here is just an example of how this task could be done. */
  int i ;
  for ( i = 1 ; i < numOfNodes ; i++ )
  {
    //get the next item in the queue… let's suppose it is a string value like this:
    char genomes [40] = "genome1*genome2" ;
    MPI_Send ( genomes, strlen ( genomes ) + 1, MPI_CHAR, i, TAG, MPI_COMM_WORLD ) ;
    //find out the number of items left in the queue and send that number as well
    MPI_Send ( &queueItemsLeft, 1, MPI_INT, i, TAG + 1, MPI_COMM_WORLD ) ;
  }
  start = 0 ;
  while ( !start && checkQueue ( queue ) ) { … }
}//end of if
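The "queue or a data structure of the like" is never spelled out in these slides. Purely as an illustration, a minimal version could be a fixed array of genome-pair strings plus a head index; the names TaskQueue, enqueueTask, itemsLeft, and nextTask below are hypothetical and would stand in for the bare char *queue used in the prototypes.

#include <string.h>

#define MAX_TASKS 128
#define BUFFER    100   /* same value as in the earlier slide */

typedef struct
{
  char items [MAX_TASKS][BUFFER] ;
  int  head ;   /* index of the next item to hand out */
  int  count ;  /* total number of items enqueued     */
} TaskQueue ;

void enqueueTask ( TaskQueue *q, const char *genomes )
{
  if ( q->count < MAX_TASKS )
    strcpy ( q->items [ q->count++ ], genomes ) ;
}

int itemsLeft ( const TaskQueue *q )   /* items not yet handed out */
{
  return q->count - q->head ;
}

const char *nextTask ( TaskQueue *q )  /* returns NULL when the queue is empty */
{
  return ( q->head < q->count ) ? q->items [ q->head++ ] : NULL ;
}

With this structure, the master's checkQueue ( queue ) test corresponds to itemsLeft ( &queue ) being non-zero.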

Receiving the message from the Master node
/* After the if statement has sent the parameters to all nodes, each of them
   needs to receive the messages independently. */
…
if ( my_rank == MASTER_NODE ) { … }
else
{
  MPI_Recv ( &queueItemsLeft, 1, MPI_INT, 0, TAG+1, MPI_COMM_WORLD, &status ) ;
  …
}

IF
/* Since each node now has the required parameters to start project18, we need
   to know when a node is finished or the whole program is finished. */
…
else
{
  //right after the prior receive…
  while ( queueItemsLeft )
  {
    MPI_Recv ( genomes, BUFFER, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &status ) ;
    …
  }
}//end of else

Project18 execution
/* Project18 will be executed independently on each single machine. An output
   file and a completion code are created at the end of the execution. */
else
{
  …
  while ( queueItemsLeft )
  {
    MPI_Recv ( … ) ;   //receive the genome pair, as shown in the previous slide
    …
    //Project18 execution … all necessary code goes here

    //Since each output text file is independent and does not need to be written
    //collectively (no two processors write to the same file), the I/O is carried
    //out in plain C without the use of MPI-IO.

    //At the end of the code, add the following in order to send a completion
    //code to node 0 (the master node). Please be reminded that this is just an
    //example; in practice the completion code could be changed:
    char completion [ ] = "Process Completed" ;
    char completionCode [BUFFER] ;
    sprintf ( completionCode, "%s node%d_%s", completion, my_rank, genomes ) ;
    MPI_Send ( completionCode, strlen ( completionCode ) + 1, MPI_CHAR, 0, TAG + 2, MPI_COMM_WORLD ) ;
    MPI_Recv ( &queueItemsLeft, 1, MPI_INT, 0, TAG+1, MPI_COMM_WORLD, &status ) ;
  } // end of while
}//end of else

Master Node, 2nd Part
/* The master node is the administrator of each process being sent to a node.
   In the last slide, we saw that each node sends a message specifying a
   completion code. This part of the code shows how the master node is able to
   manage all tasks. */
if ( my_rank == MASTER_NODE )
{
  …
  while ( checkQueue ( queue ) )
  {
    MPI_Recv ( completionCode, BUFFER, MPI_CHAR, MPI_ANY_SOURCE, TAG+2, MPI_COMM_WORLD, &status ) ;
    //this is a special function that will be implemented as a self-healing application
    taskControl ( status.MPI_SOURCE, queue ) ;
    //assigns a new task to the node that is available to receive one
    assignSingleTasks ( status.MPI_SOURCE, queue ) ;
  }

assignSingleTasks (…) ;
This function checks the data structure holding the tasks and selects the next "genome" parameter to be processed. Once the genome parameter is selected, it is sent to the node that is available, which in this case is identified by status.MPI_SOURCE. A sketch of one possible implementation follows.
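This is only a hedged sketch: it assumes the hypothetical TaskQueue from the earlier sketch (so its signature differs from the char *queue prototype) and the same tags as the other slides, TAG for the genome pair and TAG + 1 for the items-left count.

/* Hypothetical implementation: tell the node whether more work is coming, and
   if so pop the next genome pair off the queue and send it. */
void assignSingleTasks ( int node, TaskQueue *queue )
{
  int left = itemsLeft ( queue ) ;   /* still counts the task about to be handed out */

  /* the slave's while ( queueItemsLeft ) loop exits once this reaches zero */
  MPI_Send ( &left, 1, MPI_INT, node, TAG + 1, MPI_COMM_WORLD ) ;

  if ( left > 0 )
  {
    const char *genomes = nextTask ( queue ) ;
    MPI_Send ( (void *) genomes, strlen ( genomes ) + 1, MPI_CHAR,
               node, TAG, MPI_COMM_WORLD ) ;
  }
}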

A brief pseudo code of Project18

//libraries + definitions
#include …
…

//main
int main ( … )
{
  //variable definitions…
  if ( rank == MASTER_NODE )
  {
    //ask user for input, create the queue, and initialize all tasks
    while ( //there are more items left in the queue )
    {
      //receive completion signals, keep fault control and task control active,
      //and assign new available tasks to available nodes
    }//end while
  }//end if
  else
  {
    //receive the number of items left in the queue
    while ( //there are more items left )
    {
      //receive the genome parameter from the master node
      //EXECUTE PROJECT18
      //create output files
      //submit the completion code to node0
      //finally, wait to receive an updated "queueItemsLeft"
    }//end while
  }//end of else

  MPI_Finalize ( ) ;
}//end of main

void checkQueue ( … )
{
  //checks for items in the queue or any other data structure
}

void taskControl ( … )
{
  //makes sure that each task is completed accordingly and is successful
}

void assignSingleTasks ( … )
{
  //finds the next available queue item and sends it to the available node for processing
}

Self healing + self optimization
void taskControl ( ) is the function that will be in charge of verifying that each node completes its assigned task. This function will keep track of all completion codes as well. In case of a malfunction or unsuccessful completion, this function will make sure that the queue item that was not completed gets carried over and sent to another node. A sketch of this idea follows.
There could also be a function that helps oversee the functionality and processing of each node and its communication with the master node. If bottlenecking occurs, this function could provide support by changing the communication from synchronous to asynchronous.
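A hedged sketch of the self-healing idea, reusing the hypothetical TaskQueue, enqueueTask, and BUFFER from the earlier sketches plus a hypothetical pendingTask table that remembers which genome pair each node is working on; the extra completionCode argument and the MAX_NODES constant are additions for the sketch, and the "Process Completed" prefix follows the completion-code example from the execution slide.

#include <string.h>

#define MAX_NODES 8   /* hypothetical upper bound on cluster size */

/* pendingTask[i] holds the genome pair node i is working on ("" = idle) */
char pendingTask [MAX_NODES][BUFFER] ;

void taskControl ( int node, const char *completionCode, TaskQueue *queue )
{
  /* a successful report starts with the agreed-upon completion string */
  if ( strncmp ( completionCode, "Process Completed", 17 ) == 0 )
  {
    pendingTask [node][0] = '\0' ;   /* the node finished its task */
  }
  else
  {
    /* malfunction or unsuccessful completion: put the unfinished genome pair
       back in the queue so another node can pick it up */
    enqueueTask ( queue, pendingTask [node] ) ;
    pendingTask [node][0] = '\0' ;
  }
}

For the bottleneck concern, the blocking MPI_Send / MPI_Recv pairs could later be swapped for the non-blocking MPI_Isend / MPI_Irecv together with MPI_Test or MPI_Wait, so the master can keep polling for completion messages while other transfers are still in flight.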

Important thoughts
- This roadmap outlines, in a "simple" manner, how the program could be parallelized. It does not take into account runtime challenges or other types of issues.
- Please keep in mind that this design could always be modified for a better one.
- Your input is surely appreciated.