COMP7330/7336 Advanced Parallel and Distributed Computing
MPI Programming: 1. Collective Operations 2. Overlapping Communication with Computation
Dr. Xiao Qin, Auburn University
http://www.eng.auburn.edu/~xqin
xqin@auburn.edu
MPI_Bcast: Bad Example
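The code for this slide is not reproduced in the transcript. A common mistake that such "bad example" slides illustrate, shown here only as an assumed reconstruction, is calling MPI_Bcast on the root and trying to match it with MPI_Recv on the other ranks:

    /* WRONG: MPI_Bcast is a collective call; every rank in the communicator
       must make the same MPI_Bcast call. A broadcast is not matched by
       MPI_Recv, so the non-root ranks block forever (or the program fails). */
    int value;
    if (myrank == 0) {
        value = 42;
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    } else {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }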
MPI_Bcast: Good Example
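This slide's code is also missing; a minimal sketch of the corresponding correct pattern, assuming myrank was obtained from MPI_Comm_rank, has every rank make the same MPI_Bcast call:

    int value = 0;
    if (myrank == 0)
        value = 42;                          /* only the root initializes the data */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    /* after the call, value == 42 on every rank, including the root */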
MPI_Bcast: Another Example
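This slide's code is likewise not included. One typical further use, offered only as an illustrative guess at what the slide showed, is broadcasting an array that only the root has filled in (read_coefficients is a hypothetical helper):

    double coeff[100];
    if (myrank == 0)
        read_coefficients(coeff, 100);       /* hypothetical: only the root reads the input */
    MPI_Bcast(coeff, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    /* every rank now holds the same 100 coefficients */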
Gather and Scatter
The gather operation: a single node collects a unique message from each node.
The scatter operation: a single node sends a unique message of size m to every other node (also called one-to-all personalized communication).
Although scatter is semantically different from broadcast, the algorithmic structure of the two operations is similar; they differ only in message sizes (messages get smaller at each step of scatter but stay constant in broadcast).
Gather
The gather operation is performed in MPI using:
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, int target, MPI_Comm comm)
Compare it with:
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int target, MPI_Comm comm)
Q1: What are the differences between the Gather and Reduce operations?
Each process sends the data stored in sendbuf to the target process. The data are received in the recvbuf of the target process in rank order: data from process i are stored in recvbuf starting at location i*sendcount.
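A minimal usage sketch (the values are illustrative; myrank and nprocs come from the usual MPI_Comm_rank/MPI_Comm_size calls, and malloc requires <stdlib.h>): each rank contributes one integer and rank 0 receives them in rank order. Note that recvcount is the count received from each process, not the total.

    int myval = myrank * myrank;             /* each rank's contribution */
    int *all = NULL;
    if (myrank == 0)
        all = malloc(nprocs * sizeof(int));  /* only the target needs the receive buffer */
    MPI_Gather(&myval, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);
    /* on rank 0: all[i] holds the value sent by rank i */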
Allgather
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, int target, MPI_Comm comm)
MPI also provides the MPI_Allgather function, in which the data are gathered at all the processes:
int MPI_Allgather(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, MPI_Comm comm)
Q2: What are the differences between the Gather and Allgather operations?
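A matching sketch for Allgather: every rank supplies a full-size receive buffer and every rank ends up with the gathered data, so no target argument is needed.

    int myval = myrank;
    int *all = malloc(nprocs * sizeof(int)); /* every rank allocates the full buffer */
    MPI_Allgather(&myval, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);
    /* on every rank: all[i] == i */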
Scatter
The scatter operation is performed in MPI using:
int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype senddatatype, void *recvbuf, int recvcount, MPI_Datatype recvdatatype, int source, MPI_Comm comm)
Compare it with:
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)
Q3: What are the differences between the Scatter and Broadcast operations?
Process i receives sendcount contiguous elements of type senddatatype starting from the i*sendcount location of the sendbuf of the source process.
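A minimal usage sketch of Scatter (again with illustrative values): the source rank prepares one element per process and each rank receives only its own element.

    int *chunks = NULL, mine;
    if (myrank == 0) {
        chunks = malloc(nprocs * sizeof(int));
        for (int i = 0; i < nprocs; i++)
            chunks[i] = 10 * i;              /* element destined for rank i */
    }
    MPI_Scatter(chunks, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
    /* rank i now holds mine == 10 * i */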
Overlapping Communication with Computation
MPI provides non-blocking send and receive operations:
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
These operations return before the communication has actually completed.
Compare them with the blocking versions:
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Q4: What are the differences between these blocking and non-blocking operations?
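A minimal sketch, assuming myrank and an illustrative neighbor rank partner have already been set up: the non-blocking calls return immediately, and the buffers must not be reused or read until completion is confirmed (see MPI_Wait/MPI_Test on the next slide), whereas the blocking calls return only once the buffer is safe to use.

    int outgoing = myrank, incoming;
    MPI_Request sreq, rreq;
    MPI_Isend(&outgoing, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &sreq);
    MPI_Irecv(&incoming, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &rreq);
    /* ... computation that touches neither outgoing nor incoming ... */
    MPI_Wait(&sreq, MPI_STATUS_IGNORE);      /* outgoing may now be reused */
    MPI_Wait(&rreq, MPI_STATUS_IGNORE);      /* incoming is now valid */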
What is the “MPI_Request” parameter?
MPI_Test tests whether or not the non-blocking send or receive operation identified by its request has finished:
int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status)
MPI_Wait waits for the operation to complete:
int MPI_Wait(MPI_Request *request, MPI_Status *status)
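A short sketch of how MPI_Test enables overlap (do_some_work is a hypothetical placeholder for useful computation): the receive is posted, computation proceeds, and the request is polled until the flag indicates completion.

    double buf[1024];
    MPI_Request req;
    MPI_Status status;
    int flag = 0;
    MPI_Irecv(buf, 1024, MPI_DOUBLE, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req);
    while (!flag) {
        do_some_work();                      /* hypothetical computation to overlap */
        MPI_Test(&req, &flag, &status);      /* flag becomes 1 once the receive is done */
    }
    /* buf is now safe to read */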
Exercise: Ring (Non-blocking Communication)
Goal: each processor communicates its rank around a ring; the sum of all ranks is then accumulated and printed out by each processor.
Each processor stores its rank in MPI_COMM_WORLD as an integer and sends this value to the processor on its right. It then receives an integer from its left neighbor. Keep track of the sum of all the integers received. The processors continue passing on the values they receive until they get their own rank back. Each process should finish by printing out the sum of the values.
Ring (Non-blocking Communication)
Use the synchronous non-blocking send MPI_Issend(), and make sure your implementation does not overwrite data that are still in transit. We use synchronous message passing here because the standard send (MPI_Send) may be either buffered or synchronous, depending on the implementation.
Ring - Non-blocking Communication
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int myrank, nprocs, leftid, rightid;     /* rank, number of processes, ring neighbors */
    int val, sum, tmp;                       /* value being passed and running sum */
    MPI_Status recv_status, send_status;
    MPI_Request send_request;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    ......
How to find neighbors?
    /* wrap around at both ends of the ring */
    if ((leftid = myrank - 1) < 0) leftid = nprocs - 1;
    if ((rightid = myrank + 1) == nprocs) rightid = 0;
How to implement the while loop?
    val = myrank;
    sum = 0;
    /* A pre-test while loop is a problem here: val == myrank before the first
       iteration, so the condition fails immediately and the body never runs.
       The next slide fixes this with a do-while loop. */
    while (val != myrank) {
        val = tmp;
        sum += val;
    }
How to use MPI_Issend() in the loop body?
    val = myrank;
    sum = 0;
    do {
        /* post the synchronous non-blocking send of the current value to the right */
        MPI_Issend(&val, 1, MPI_INT, rightid, 99, MPI_COMM_WORLD, &send_request);
        /* receive the next value from the left into a separate buffer */
        MPI_Recv(&tmp, 1, MPI_INT, leftid, 99, MPI_COMM_WORLD, &recv_status);
        /* wait until the send of the old value has completed before overwriting val */
        MPI_Wait(&send_request, &send_status);
        val = tmp;
        sum += val;
    } while (val != myrank);
Q: Why do we need two variables (i.e., val and tmp) here?
Ring (Blocking Communication)
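The code for the blocking version is not reproduced in this transcript. One common way to write it, sketched here as an assumption rather than the slide's actual code, replaces the MPI_Issend/MPI_Recv/MPI_Wait sequence with a single MPI_Sendrecv, which sends right and receives from the left in one call and so avoids the deadlock that a naive MPI_Send followed by MPI_Recv could cause when no buffering is available:

    val = myrank;
    sum = 0;
    do {
        MPI_Sendrecv(&val, 1, MPI_INT, rightid, 99,
                     &tmp, 1, MPI_INT, leftid, 99,
                     MPI_COMM_WORLD, &recv_status);
        val = tmp;                           /* pass on the value just received */
        sum += val;
    } while (val != myrank);
    printf("Rank %d: sum of ranks = %d\n", myrank, sum);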