Message-Passing Computing More MPI routines: Collective routines Synchronous routines Non-blocking routines ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, 2010. July 4, 2010.

Collective message-passing routines Routines that send a message to a group of processes or receive messages from a group of processes. They offer higher efficiency than the equivalent sequence of point-to-point routines, although they are not strictly necessary.

Collective Communication Involves a set of processes, defined by an intra-communicator. Message tags are not present. Principal collective operations:
- MPI_Bcast() - Broadcast from root to all other processes
- MPI_Gather() - Gather values for group of processes
- MPI_Scatter() - Scatters buffer in parts to group of processes
- MPI_Alltoall() - Sends data from all processes to all processes
- MPI_Reduce() - Combine values on all processes to single value
- MPI_Reduce_scatter() - Combine values and scatter results
- MPI_Scan() - Compute prefix reductions of data on processes
- MPI_Barrier() - Synchronizes processes by stopping each one until all have reached a specific "barrier" call

MPI broadcast operation Sending the same message to all processes in the communicator. (Multicast: sending the same message to a defined group of processes.)

MPI_Bcast parameters
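As defined by the MPI standard, the C prototype is:

   int MPI_Bcast(void *buf, int count, MPI_Datatype datatype,
                 int root, MPI_Comm comm);

buf is the source on the root and the destination on every other process; every process in the communicator must make the call.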

Basic MPI scatter operation Sending each element of an array in the root process to a separate process. The contents of the ith location of the array are sent to the ith process.

MPI scatter parameters
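As defined by the MPI standard, the C prototype is:

   int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                   void *recvbuf, int recvcount, MPI_Datatype recvtype,
                   int root, MPI_Comm comm);

Here sendcount is the number of elements sent to each process, not the total; sendbuf, sendcount, and sendtype are significant only at the root.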

The simplest scatter would be as illustrated, in which one element of an array is sent to each different process. The extension provided in the MPI_Scatter() routine is to send a fixed number of contiguous elements to each process.

Scattering contiguous groups of elements to each process

Example In the following code, the size of the send buffer is given by 100 * <number of processes> and 100 contiguous elements are sent to each process:

   #include <stdlib.h>
   #include "mpi.h"

   int main(int argc, char *argv[]) {
      int size, *sendbuf, recvbuf[100];          /* for each process */
      MPI_Init(&argc, &argv);                    /* initialize MPI */
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      sendbuf = (int *)malloc(size*100*sizeof(int));
      /* ... initialize sendbuf (only the root's sendbuf is actually used) ... */
      MPI_Scatter(sendbuf, 100, MPI_INT, recvbuf, 100, MPI_INT, 0,
                  MPI_COMM_WORLD);
      MPI_Finalize();                            /* terminate MPI */
      return 0;
   }

Gather Having one process collect individual values from a set of processes.

Gather parameters
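As defined by the MPI standard, the C prototype is:

   int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                  void *recvbuf, int recvcount, MPI_Datatype recvtype,
                  int root, MPI_Comm comm);

Note that recvcount is the number of elements received from each process, not the total; recvbuf, recvcount, and recvtype are significant only at the root.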

Gather Example To gather items from a group of processes into process 0, using dynamically allocated memory in the root process:

   int data[10];                            /* data to be gathered from processes */
   int *buf, grp_size, myrank;

   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* find rank */
   if (myrank == 0) {
      MPI_Comm_size(MPI_COMM_WORLD, &grp_size);       /* find group size */
      buf = (int *)malloc(grp_size*10*sizeof(int));   /* allocate memory */
   }
   MPI_Gather(data, 10, MPI_INT, buf, 10, MPI_INT, 0, MPI_COMM_WORLD);

The recvcount argument is 10 (the number of items from each process), not grp_size*10. MPI_Gather() gathers from all processes, including the root.

Reduce A gather operation combined with a specified arithmetic/logical operation. Example: values could be gathered and then added together by the root, using MPI_Reduce().

Reduce parameters

Reduce - operations MPI_Reduce(*sendbuf, *recvbuf, count, datatype, op, root, comm)

Parameters:
  *sendbuf   send buffer address
  *recvbuf   receive buffer address
  count      number of send buffer elements
  datatype   data type of send elements
  op         reduce operation; several are predefined, including:
               MPI_MAX   maximum
               MPI_MIN   minimum
               MPI_SUM   sum
               MPI_PROD  product
  root       rank of root process for the result
  comm       communicator
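For reference, the full C prototype as defined by the MPI standard:

   int MPI_Reduce(const void *sendbuf, void *recvbuf, int count,
                  MPI_Datatype datatype, MPI_Op op,
                  int root, MPI_Comm comm);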

Sample MPI program with collective routines

   #include "mpi.h"
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #define MAXSIZE 1000

   int main(int argc, char *argv[]) {
      int myid, numprocs;
      int data[MAXSIZE], i, x, low, high, myresult = 0, result;
      char fn[255];
      FILE *fp;

      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &myid);

      if (myid == 0) {              /* open input file and initialize data */
         strcpy(fn, getenv("HOME"));
         strcat(fn, "/MPI/rand_data.txt");
         if ((fp = fopen(fn, "r")) == NULL) {
            printf("Can't open the input file: %s\n\n", fn);
            exit(1);
         }
         for (i = 0; i < MAXSIZE; i++)
            fscanf(fp, "%d", &data[i]);
         fclose(fp);
      }

      MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);   /* broadcast data */

      x = MAXSIZE / numprocs;       /* add my portion of data */
      low = myid * x;
      high = low + x;
      for (i = low; i < high; i++)
         myresult += data[i];
      printf("I got %d from %d\n", myresult, myid);

      /* compute global sum */
      MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
      if (myid == 0)
         printf("The sum is %d.\n", result);

      MPI_Finalize();
      return 0;
   }

Collective routines General features:
- Performed on a group of processes, identified by a communicator
- Substitute for a sequence of point-to-point calls
- Communications are locally blocking
- Synchronization is not guaranteed (implementation dependent)
- Some routines use a root process to originate or receive all data
- Data amounts must exactly match
- Many variations to basic categories
- No message tags are needed
From http://www.pdc.kth.se/training/Talks/MPI/Collective.I/less.html#characteristics

Barrier Blocks a process until all processes in the communicator have called it. A synchronous operation. MPI_Barrier(comm), where comm is the communicator.
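A typical use is to align processes before and after a timed region. A minimal sketch, assuming myrank has been obtained with MPI_Comm_rank() and do_work() is a hypothetical computation routine:

   double t1, t2;
   MPI_Barrier(MPI_COMM_WORLD);      /* all processes start timing together */
   t1 = MPI_Wtime();
   do_work();                        /* hypothetical local computation */
   MPI_Barrier(MPI_COMM_WORLD);      /* wait until every process has finished */
   t2 = MPI_Wtime();
   if (myrank == 0) printf("Elapsed time: %f s\n", t2 - t1);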

Synchronous Message Passing Routines that return when the message transfer has completed. Synchronous send routine: waits until the complete message can be accepted by the receiving process before sending the message. In MPI, the MPI_Ssend() routine. Synchronous receive routine: waits until the message it is expecting arrives. In MPI, this is actually the regular MPI_Recv() routine.

Synchronous Message Passing Synchronous message-passing routines intrinsically perform two actions: they transfer data, and they synchronize processes.

Synchronous Ssend() and recv() using a three-way protocol:
(a) When Ssend() occurs before recv(): process 1 issues a request to send and suspends; when process 2 reaches recv(), it returns an acknowledgment, the message is transferred, and both processes continue.
(b) When recv() occurs before Ssend(): process 2 suspends in recv(); when process 1 reaches Ssend(), it issues a request to send, process 2 returns an acknowledgment, the message is transferred, and both processes continue.

Parameters of synchronous send (same as blocking send):
MPI_Ssend(buf, count, datatype, dest, tag, comm)
  buf       address of send buffer
  count     number of items to send
  datatype  datatype of each item
  dest      rank of destination process
  tag       message tag
  comm      communicator
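A minimal sketch of a synchronous send paired with an ordinary receive, assuming myrank has been obtained with MPI_Comm_rank() as in the earlier examples:

   int x = 42;
   MPI_Status status;
   if (myrank == 0)
      MPI_Ssend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  /* returns only once process 1 has begun to receive */
   else if (myrank == 1)
      MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);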

Asynchronous Message Passing Routines that do not wait for actions to complete before returning. Usually require local storage for messages. More than one version depending upon the actual semantics for returning. In general, they do not synchronize processes but allow processes to move forward sooner. Must be used with care.

MPI Definitions of Blocking and Non-Blocking Blocking - return after their local actions complete, though the message transfer may not have completed; sometimes called locally blocking. Non-blocking - return immediately (asynchronous). Non-blocking routines assume that the data storage used for the transfer is not modified by subsequent statements before the transfer completes, and it is left to the programmer to ensure this. The terms blocking/non-blocking may have different interpretations in other systems.

MPI Nonblocking Routines Non-blocking send - MPI_Isend() - will return "immediately", even before the source location is safe to be altered. Non-blocking receive - MPI_Irecv() - will return even if there is no message to accept.

Nonblocking Routine Formats
MPI_Isend(buf, count, datatype, dest, tag, comm, request)
MPI_Irecv(buf, count, datatype, source, tag, comm, request)
Completion is detected by MPI_Wait() and MPI_Test(), using the request parameter to identify the particular operation. MPI_Wait() waits until the operation has completed and then returns. MPI_Test() returns immediately, with a flag set indicating whether the operation had completed at that time.
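For reference, the completion routines have the following C prototypes as defined by the MPI standard:

   int MPI_Wait(MPI_Request *request, MPI_Status *status);
   int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status);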

Example To send an integer x from process 0 to process 1 and allow process 0 to continue:

   int x;
   MPI_Request req1;
   MPI_Status status;

   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);    /* find rank */
   if (myrank == 0) {
      MPI_Isend(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD, &req1);
      compute();                              /* overlap computation with the send */
      MPI_Wait(&req1, &status);               /* x is safe to modify after this */
   } else if (myrank == 1) {
      MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
   }

How message-passing routines return before the message transfer has completed A message buffer is needed between the source and destination to hold the message: send() copies the message into the buffer and the process continues; recv() later reads the message from the buffer.

Asynchronous (blocking) routines changing to synchronous routines Message buffers are only of finite length. A point could be reached where a send routine is held up because all available buffer space is exhausted. The send routine will then wait until storage becomes available again - i.e., the routine will behave as a synchronous routine.
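MPI also provides an explicitly buffered send, MPI_Bsend(), where the programmer supplies the buffer space, making the storage limit explicit. A minimal sketch, assuming x and a destination rank 1 as in the earlier examples:

   int bufsize;
   char *buffer;
   MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &bufsize);   /* space for one int */
   bufsize += MPI_BSEND_OVERHEAD;                         /* per-message overhead */
   buffer = (char *)malloc(bufsize);
   MPI_Buffer_attach(buffer, bufsize);
   MPI_Bsend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);       /* copies x into the attached buffer, returns */
   MPI_Buffer_detach(&buffer, &bufsize);                  /* blocks until buffered messages are delivered */
   free(buffer);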

Next topic Parallel Algorithms Other MPI features will be introduced as we need them.