1 Lecture 3: Point-to-Point Communications
Dr. Muhammad Hanif Durad
Department of Computer and Information Sciences
Pakistan Institute of Engineering and Applied Sciences
Some slides have been adapted, with thanks, from other lectures available on the Internet.

2 Lecture Outline
- Models for communication
- Brief introduction to MPI
- Basic concepts
- Learn the 6 most commonly used functions
- Introduce “collective” operations
(Source: IntroMPI.ppt)

3 MPI Basic Send/Receive
We need to fill in the details in:
    Process 0: Send(data)
    Process 1: Receive(data)
Things that need specifying:
- How will “data” be described?
- How will processes be identified?
- How will the receiver recognize/screen messages?
- What will it mean for these operations to complete?
(Source: IntroMPI.ppt)

4 Some Basic Concepts
- Processes can be collected into groups.
- Each message is sent in a context and must be received in the same context.
  - This provides the necessary support for libraries.
- A group and a context together form a communicator.
- A process is identified by its rank in the group associated with a communicator.
- There is a default communicator, MPI_COMM_WORLD, whose group contains all initial processes.
(Source: IntroMPI.ppt)

5 MPI Datatypes
- The data in a message to send or receive is described by a triple (address, count, datatype).
- An MPI datatype is recursively defined as:
  - predefined, corresponding to a data type from the language (e.g., MPI_INT, MPI_DOUBLE)
  - a contiguous array of MPI datatypes
  - a strided block of datatypes
  - an indexed array of blocks of datatypes
  - an arbitrary structure of datatypes
- There are MPI functions to construct custom datatypes, in particular ones for subarrays.
- Complex datatypes may hurt performance.
(Source: IntroMPI.ppt)
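To make the "strided block" case concrete, here is a minimal sketch in plain C, with no MPI dependency, of the gathering work implied when a message is described as `count` blocks of `blocklen` elements whose starts lie `stride` elements apart. The function `toy_pack_strided` is a hypothetical helper written for this illustration. Its parameters mirror those of `MPI_Type_vector`, but it is not meant to show how any MPI implementation works internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of MPI: gather `count` blocks of `blocklen`
   doubles, whose starts are `stride` elements apart, into the contiguous
   buffer `out`. Returns the number of elements written. */
size_t toy_pack_strided(const double *start, size_t count, size_t blocklen,
                        size_t stride, double *out)
{
    assert(start != NULL && out != NULL);
    assert(stride >= blocklen);   /* blocks must not overlap in this sketch */

    size_t n = 0;
    for (size_t b = 0; b < count; b++) {
        for (size_t k = 0; k < blocklen; k++) {
            out[n++] = start[b * stride + k];
        }
    }
    return n;
}
```

For example, column 1 of a 3 x 4 row-major matrix `m` is the strided block with `count = 3`, `blocklen = 1`, `stride = 4`, starting at `&m[1]`. The extra copy in the inner loop is one plausible reason complex datatypes can cost performance.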

6 MPI Tags
- Messages are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message.
- Messages can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive.
- Some non-MPI message-passing systems have called tags “message types”. MPI calls them tags to avoid confusion with datatypes.
(Source: IntroMPI.ppt)

7 Blocking Point-to-Point Communication (1/2)
- MPI_Send(): basic blocking send operation. The routine returns only after the application buffer in the sending task is free for reuse.
- MPI_Recv(): receive a message and block until the requested data is available in the application buffer in the receiving task.
- MPI_Ssend(): synchronous blocking send.
(Source: Comm.ppt)

8 Blocking Point-to-Point Communication (2/2)
- MPI_Bsend(): buffered blocking send.
- MPI_Rsend(): blocking ready send; use with great care.
- MPI_Sendrecv(): send a message and post a receive before blocking. It blocks until the sending application buffer is free for reuse and the receiving application buffer contains the received message.
(Source: Comm.ppt)

9 MPI Basic (Blocking) Send
    MPI_SEND(start, count, datatype, dest, tag, comm)
- The message buffer is described by (start, count, datatype).
- The target process is specified by dest, which is the rank of the target process in the communicator specified by comm.
- When this function returns, the data has been delivered to the system and the buffer can be reused.
- Important: the message may not yet have been received by the target process.
Example, with a sender array A(10) and a receiver array B(20):
    MPI_Send( A, 10, MPI_DOUBLE, 1, … )    on process 0
    MPI_Recv( B, 20, MPI_DOUBLE, 0, … )    on process 1
(Source: IntroMPI.ppt)

10 MPI Basic (Blocking) Receive
    MPI_RECV(start, count, datatype, source, tag, comm, status)
- Waits until a message matching both source and tag is received from the system; after that the buffer can be used.
- source is a rank in the communicator specified by comm, or MPI_ANY_SOURCE.
- tag is a tag to be matched, or MPI_ANY_TAG.
- Receiving fewer than count occurrences of datatype is OK, but receiving more is an error.
- status contains further information (e.g., the size of the message).
Example, with a sender array A(10) and a receiver array B(20):
    MPI_Send( A, 10, MPI_DOUBLE, 1, … )    on process 0
    MPI_Recv( B, 20, MPI_DOUBLE, 0, … )    on process 1
(Source: IntroMPI.ppt)
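The matching and count rules for a receive can be written down as a small toy model in plain C (no MPI needed). All names here (`toy_envelope`, `toy_matches`, `toy_deliver_count`, `TOY_ANY_SOURCE`, `TOY_ANY_TAG`) are hypothetical stand-ins invented for this sketch. The `context` field represents the communicator's context, which must match exactly and has no wildcard.

```c
#include <assert.h>

/* Hypothetical stand-ins for MPI_ANY_SOURCE / MPI_ANY_TAG in this toy model. */
enum { TOY_ANY_SOURCE = -1, TOY_ANY_TAG = -2 };

/* Envelope carried by a message, or requested by a posted receive. */
typedef struct {
    int context;  /* communicator context: exact match only, no wildcard */
    int source;   /* sender's rank, or TOY_ANY_SOURCE in a receive */
    int tag;      /* user-defined tag, or TOY_ANY_TAG in a receive */
} toy_envelope;

/* Returns 1 if the posted receive `want` accepts the incoming message `msg`. */
int toy_matches(toy_envelope want, toy_envelope msg)
{
    if (want.context != msg.context) return 0;
    if (want.source != TOY_ANY_SOURCE && want.source != msg.source) return 0;
    if (want.tag != TOY_ANY_TAG && want.tag != msg.tag) return 0;
    return 1;
}

/* Count rule: a message shorter than the receive buffer is fine, a longer
   one is an error. Returns the number of elements delivered, or -1 when the
   message would overflow the receive buffer. */
int toy_deliver_count(int recv_count, int msg_count)
{
    assert(recv_count >= 0 && msg_count >= 0);
    return (msg_count <= recv_count) ? msg_count : -1;
}
```

With the slide's example, `toy_deliver_count(20, 10)` delivers 10 elements, because sending A(10) into B(20) is legal. Swapping the sizes gives -1, the case that real MPI reports as a truncation error.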

11 Blocking Operations
[Figure: blocking operations]
(Source: pp2003\lecture4.ppt)

12 A Simple MPI Program (C)
Program name: blocking.c

    #include "mpi.h"
    #include <stdio.h>

    int main( int argc, char *argv[] )
    {
        int rank, buf;
        MPI_Status status;

        MPI_Init( &argc, &argv );
        MPI_Comm_rank( MPI_COMM_WORLD, &rank );

        /* Process 0 sends and Process 1 receives */
        if (rank == 0) {
            buf = 123456;
            MPI_Send( &buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD );
        }
        else if (rank == 1) {
            MPI_Recv( &buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status );
            printf( "Received %d\n", buf );
        }

        MPI_Finalize();
        return 0;
    }

(Source: IntroMPI.ppt)

13 A Simple MPI Program (Fortran)
Program name: blocking.f90

    program main
      include 'mpif.h'
      integer rank, buf, ierr, status(MPI_STATUS_SIZE)

      call MPI_Init( ierr )
      call MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )

      ! Process 0 sends and Process 1 receives
      if (rank .eq. 0) then
        buf = 123456
        call MPI_Send( buf, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr )
      else if (rank .eq. 1) then
        call MPI_Recv( buf, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr )
        print *, "Received ", buf
      endif

      call MPI_Finalize( ierr )
    end

(Source: IntroMPI.ppt)

14 A Simple MPI Program (C++)
Program name: blocking.cpp
Note: the MPI C++ bindings used here were deprecated in MPI-2.2 and removed in MPI-3.0, so this version needs an MPI library that still provides them.

    #include "mpi.h"
    #include <iostream>

    int main( int argc, char *argv[] )
    {
        int rank, buf;

        MPI::Init( argc, argv );
        rank = MPI::COMM_WORLD.Get_rank();

        // Process 0 sends and Process 1 receives
        if (rank == 0) {
            buf = 123456;
            MPI::COMM_WORLD.Send( &buf, 1, MPI::INT, 1, 0 );
        }
        else if (rank == 1) {
            MPI::COMM_WORLD.Recv( &buf, 1, MPI::INT, 0, 0 );
            std::cout << "Received " << buf << "\n";
        }

        MPI::Finalize();
        return 0;
    }

(Source: IntroMPI.ppt)

15 Retrieving Further Information (C)
Status is a data structure allocated in the user’s program.
In C:

    int recvd_tag, recvd_from, recvd_count;
    MPI_Status status;

    MPI_Recv( ..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., &status );
    recvd_tag  = status.MPI_TAG;
    recvd_from = status.MPI_SOURCE;
    MPI_Get_count( &status, datatype, &recvd_count );

16 Retrieving Further Information (Fortran)
In Fortran:

    integer recvd_tag, recvd_from, recvd_count
    integer status(MPI_STATUS_SIZE)

    call MPI_RECV( ..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., status, ierr )
    recvd_tag  = status(MPI_TAG)
    recvd_from = status(MPI_SOURCE)
    call MPI_GET_COUNT( status, datatype, recvd_count, ierr )

17 Retrieving Further Information (C++)
Status is a data structure allocated in the user’s program.
In C++:

    int recvd_tag, recvd_from, recvd_count;
    MPI::Status status;

    Comm.Recv( ..., MPI::ANY_SOURCE, MPI::ANY_TAG, ..., status );
    recvd_tag   = status.Get_tag();
    recvd_from  = status.Get_source();
    recvd_count = status.Get_count( datatype );

18 Tags and Contexts
- Separation of messages used to be accomplished by use of tags, but:
  - this requires libraries to be aware of the tags used by other libraries;
  - this can be defeated by use of “wild card” tags.
- Contexts are different from tags:
  - no wild cards are allowed;
  - they are allocated dynamically by the system when a library sets up a communicator for its own use.
- User-defined tags are still provided in MPI for user convenience in organizing the application.
(Source: IntroMPI.ppt)

19 Home Work 1
- We have just used MPI_Send() and MPI_Recv().
- Try to use the other blocking functions listed.

20 Flavors of message passing
- Synchronous: used for routines that return when the message transfer has been completed.
- A synchronous send waits until the complete message can be accepted by the receiving process before sending the message (send suspends until receive).
- A synchronous receive waits until the message it is expecting arrives (receive suspends until message sent).
- Also called blocking.
[Figure: A sends "request to send" to B, B returns "acknowledgement", then A sends "message"]
(Source: lecture2.ppt)

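21 Synchronous send() and recv() using 3-way protocol (1/2)
(a) When send() occurs before recv():
- Process 1 calls send(), issues a "request to send" and suspends.
- When Process 2 reaches recv(), it returns an acknowledgment.
- The message is then transferred and both processes continue.
[Figure: time runs downward; Process 1 on the left, Process 2 on the right]
(Source: slides2.ppt)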

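22 Synchronous send() and recv() using 3-way protocol (2/2)
(b) When recv() occurs before send():
- Process 2 calls recv() and suspends.
- When Process 1 reaches send(), it issues a "request to send" and Process 2 returns an acknowledgment.
- The message is then transferred and both processes continue.
[Figure: time runs downward; Process 1 on the left, Process 2 on the right]
(Source: slides2.ppt)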

23 Nonblocking message passing
- Nonblocking sends return whether or not the message has been received.
- If the receiving processor is not ready, the message may be stored in a message buffer.
- The message buffer holds messages sent by A until they are accepted by a receive in B.
- In MPI:
  - routines that use a message buffer and return after their local actions complete are blocking (even though the message transfer may not be complete);
  - routines that return immediately are non-blocking.
[Figure: A sends into a message buffer, from which B receives]
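The role of the message buffer between A and B can be illustrated with a small FIFO in plain C. This is only a toy model under stated assumptions: the names `toy_msgbuf`, `toy_send` and `toy_recv`, the capacity `TOY_SLOTS` and the message size limit `TOY_MSG_INTS` are all hypothetical, and a real MPI library manages its buffering quite differently. The point it shows is that once the data has been copied into the buffer, the send can return and the sender's own array is free for reuse, even though B has not received anything yet.

```c
#include <assert.h>
#include <string.h>

#define TOY_SLOTS    4   /* capacity of the toy buffer, in messages */
#define TOY_MSG_INTS 8   /* maximum number of ints per message */

typedef struct {
    int data[TOY_SLOTS][TOY_MSG_INTS];
    int len[TOY_SLOTS];
    int head;    /* index of the oldest queued message */
    int count;   /* number of queued messages */
} toy_msgbuf;

/* Copy the message into the buffer and return at once.
   Returns 1 if queued, 0 if the buffer is full. After a successful
   return the caller may overwrite `app_buf`. */
int toy_send(toy_msgbuf *mb, const int *app_buf, int n)
{
    assert(n >= 0 && n <= TOY_MSG_INTS);
    if (mb->count == TOY_SLOTS) return 0;
    int slot = (mb->head + mb->count) % TOY_SLOTS;
    memcpy(mb->data[slot], app_buf, (size_t)n * sizeof(int));
    mb->len[slot] = n;
    mb->count++;
    return 1;
}

/* Take the oldest queued message. Returns its length in ints,
   or -1 if no message is waiting. */
int toy_recv(toy_msgbuf *mb, int *app_buf, int max_n)
{
    if (mb->count == 0) return -1;
    int n = mb->len[mb->head];
    assert(n <= max_n);   /* receiving more than the buffer holds is an error */
    memcpy(app_buf, mb->data[mb->head], (size_t)n * sizeof(int));
    mb->head = (mb->head + 1) % TOY_SLOTS;
    mb->count--;
    return n;
}
```

If the sender changes its array right after `toy_send` returns, the receiver still sees the original values, because the buffer holds its own copy.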

24 Four Communication Modes in MPI (1/3)
- Standard mode
  - It is not assumed that the corresponding receive routine has started.
  - The amount of buffering is not defined by MPI.
  - If buffering is provided, the send could complete before the receive is reached.
- Buffered (asynchronous) mode
  - The send may start and return before a matching receive.
  - Buffer space must be specified via the routine MPI_Buffer_attach().
(Source: lecture4.ppt/slides2.ppt)

25 Communication Modes in MPI (2/3)
- Synchronous mode
  - Send and receive can start before each other but can only complete together.
- Ready mode
  - The send can only start if the matching receive has already been reached; otherwise it is an error.
  - Use with care.
(Source: lecture4.ppt/slides2.ppt)

26 Communication Modes in MPI (3/3)
- Each of the four modes can be applied to both blocking and nonblocking send routines.
- Only the standard mode is available for the blocking and nonblocking receive routines.
- Any type of send routine can be used with any type of receive routine.
(Source: slides2.ppt)
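The four send modes can be compared side by side with a toy decision function in plain C. This is a simplified sketch, not MPI code. The names `toy_mode`, `toy_outcome` and `toy_send_outcome` are hypothetical, and the model reduces the state of the system to two flags: whether the matching receive has already been posted, and whether buffer space is available (system buffering for standard mode, space attached with `MPI_Buffer_attach()` for buffered mode). Treating a buffered send without buffer space as an error is an assumption of this sketch.

```c
#include <assert.h>

typedef enum { TOY_STANDARD, TOY_BUFFERED, TOY_SYNCHRONOUS, TOY_READY } toy_mode;
typedef enum { TOY_COMPLETES, TOY_WAITS, TOY_ERROR } toy_outcome;

/* What happens to a send at the moment it is issued, in this toy model.
   recv_posted:  1 if the matching receive has already been posted, else 0.
   buffer_space: 1 if buffer space is available for the message, else 0. */
toy_outcome toy_send_outcome(toy_mode mode, int recv_posted, int buffer_space)
{
    assert(recv_posted == 0 || recv_posted == 1);
    assert(buffer_space == 0 || buffer_space == 1);

    switch (mode) {
    case TOY_STANDARD:
        /* May complete early only if the system happens to buffer. */
        return (recv_posted || buffer_space) ? TOY_COMPLETES : TOY_WAITS;
    case TOY_BUFFERED:
        /* Assumption of this sketch: no buffer space is an error. */
        return buffer_space ? TOY_COMPLETES : TOY_ERROR;
    case TOY_SYNCHRONOUS:
        /* Can start any time, completes only together with the receive. */
        return recv_posted ? TOY_COMPLETES : TOY_WAITS;
    case TOY_READY:
        /* Only legal if the matching receive is already posted. */
        return recv_posted ? TOY_COMPLETES : TOY_ERROR;
    }
    return TOY_ERROR;   /* unreachable for valid modes */
}
```

The second flag is the only thing that separates standard from synchronous mode here, which is why a program that works with a standard send on one system may block on another that provides less buffering.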

27 A Real Blocking Program (1/3)
Program name: deadlock.cpp

    #include "mpi.h"
    #include <iostream>

    #define MSGLEN 2048

    int main(int argc, char *argv[])
    {
        int ITAG_A = 100, ITAG_B = 200;
        int irank, i, idest, isrc, istag, iretag;
        float rmsg1[MSGLEN];
        float rmsg2[MSGLEN];
        MPI::Status recv_status;

        MPI::Init(argc, argv);
        irank = MPI::COMM_WORLD.Get_rank();

28 A Real Blocking Program (2/3)

        // Fill the buffers; valid indices run from 0 to MSGLEN-1
        for (i = 0; i < MSGLEN; i++) {
            rmsg1[i] = 100;
            rmsg2[i] = -100;
        }

        // Each task sends to, and receives from, the other one
        if ( irank == 0 ) {
            idest = 1;
            isrc = 1;
            istag = ITAG_A;
            iretag = ITAG_B;
        }
        else if ( irank == 1 ) {
            idest = 0;
            isrc = 0;
            istag = ITAG_B;
            iretag = ITAG_A;
        }

29 A Real Blocking Program (3/3)

        std::cout << "Task " << irank << " is sending the message" << std::endl;

        // Both tasks block here: each synchronous send waits for a receive
        // that the other task never reaches, so the program deadlocks.
        MPI::COMM_WORLD.Ssend(rmsg1, MSGLEN, MPI::FLOAT, idest, istag);
        MPI::COMM_WORLD.Recv(rmsg2, MSGLEN, MPI::FLOAT, isrc, iretag, recv_status);

        std::cout << "Task " << irank << " has received the message" << std::endl;

        MPI::Finalize();
        return 0;
    }

30 Nonblocking Point-to-Point Communication (1/2)
- MPI_Isend(), MPI_Irecv(): identify the send/receive buffer. Computation proceeds immediately. A communication request handle is returned for handling the pending message status. The program must use calls to MPI_Wait() or MPI_Test() to determine when the operation completes.
- MPI_Issend(), MPI_Ibsend(), MPI_Irsend(): non-blocking versions of the synchronous, buffered and ready sends.
(Source: Comm.ppt)

31 Nonblocking Point-to-Point Communication (2/2)
- MPI_Test(), MPI_Testany(), MPI_Testall(), MPI_Testsome(): check the status of specified non-blocking send or receive operations without blocking.
- MPI_Wait(), MPI_Waitany(), MPI_Waitall(), MPI_Waitsome(): block until specified non-blocking send or receive operations have completed.
- MPI_Iprobe(): performs a non-blocking test for an incoming message. MPI_Probe() is the blocking version and returns only once a matching message is available.
(Source: Comm.ppt)

32 Non-Blocking Operations
[Figure: non-blocking operations]
(Source: lecture4.ppt)

33 Fixing Deadlock (1/3)
Program name: deadlock-fix.cpp

    #include "mpi.h"
    #include <iostream>

    #define MSGLEN 2048

    int main(int argc, char *argv[])
    {
        int ITAG_A = 100, ITAG_B = 200;
        int irank, i, idest, isrc, istag, iretag;
        float rmsg1[MSGLEN];
        float rmsg2[MSGLEN];
        MPI::Status irstatus, isstatus;
        MPI::Request request;

        MPI::Init(argc, argv);
        irank = MPI::COMM_WORLD.Get_rank();

34 Fixing Deadlock (2/3)
Program name: deadlock-fix.cpp

        // Fill the buffers; valid indices run from 0 to MSGLEN-1
        for (i = 0; i < MSGLEN; i++) {
            rmsg1[i] = 100;
            rmsg2[i] = -100;
        }

        // Each task sends to, and receives from, the other one
        if ( irank == 0 ) {
            idest = 1;
            isrc = 1;
            istag = ITAG_A;
            iretag = ITAG_B;
        }
        else if ( irank == 1 ) {
            idest = 0;
            isrc = 0;
            istag = ITAG_B;
            iretag = ITAG_A;
        }

35 Fixing Deadlock (3/3)
Program name: deadlock-fix.cpp

        std::cout << "Task " << irank << " is sending the message" << std::endl;

        // The non-blocking send returns at once, so both tasks reach Recv
        request = MPI::COMM_WORLD.Isend(rmsg1, MSGLEN, MPI::FLOAT, idest, istag);
        MPI::COMM_WORLD.Recv(rmsg2, MSGLEN, MPI::FLOAT, isrc, iretag, irstatus);

        // rmsg1 must not be modified until the send has completed
        request.Wait(isstatus);

        std::cout << "Task " << irank << " has received the message" << std::endl;

        MPI::Finalize();
        return 0;
    }
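Why does a blocking synchronous send followed by a receive hang on both tasks, while a non-blocking send followed by a receive and a wait does not? A toy two-process simulator in plain C can make this visible. This is a sketch under strong assumptions: the names (`toy_op`, `toy_proc`, `toy_step`, `toy_completes`) are hypothetical, every message is assumed to match, and a wait on a non-blocking send is modelled conservatively as completing only once the peer has posted its receive. Real MPI may complete a standard-mode send earlier through buffering.

```c
#include <assert.h>

typedef enum { OP_SSEND, OP_RECV, OP_ISEND, OP_WAIT } toy_op;

typedef struct {
    const toy_op *ops;   /* the process's program */
    int n;               /* number of operations */
    int pc;              /* index of the current operation */
    int entered;         /* has the current operation been started? */
    int sends_started;   /* sends this process has started so far */
    int recvs_posted;    /* receives this process has posted so far */
} toy_proc;

/* Advance `me` by at most one operation. Returns 1 if anything changed. */
static int toy_step(toy_proc *me, const toy_proc *peer)
{
    if (me->pc >= me->n) return 0;
    toy_op op = me->ops[me->pc];
    int progressed = 0;
    int done = 0;

    if (!me->entered) {          /* starting an operation has side effects */
        if (op == OP_SSEND || op == OP_ISEND) me->sends_started++;
        if (op == OP_RECV) me->recvs_posted++;
        me->entered = 1;
        progressed = 1;
    }
    switch (op) {
    case OP_ISEND: done = 1; break;                      /* returns at once */
    case OP_RECV:  done = peer->sends_started >= me->recvs_posted; break;
    case OP_SSEND:                                       /* needs the receive */
    case OP_WAIT:  done = peer->recvs_posted >= me->sends_started; break;
    }
    if (done) { me->pc++; me->entered = 0; progressed = 1; }
    return progressed;
}

/* Returns 1 if both programs run to completion, 0 if they deadlock. */
int toy_completes(const toy_op *a, int na, const toy_op *b, int nb)
{
    toy_proc p0 = { a, na, 0, 0, 0, 0 };
    toy_proc p1 = { b, nb, 0, 0, 0, 0 };
    for (;;) {
        int moved = toy_step(&p0, &p1);
        moved |= toy_step(&p1, &p0);
        if (p0.pc >= na && p1.pc >= nb) return 1;
        if (!moved) return 0;    /* nobody can make progress: deadlock */
    }
}
```

Running both tasks with the sequence `{OP_SSEND, OP_RECV}` reports a deadlock, and `{OP_ISEND, OP_RECV, OP_WAIT}` completes. The model also shows a second cure that needs no non-blocking calls: let one task receive first, `{OP_RECV, OP_SSEND}`, while the other keeps `{OP_SSEND, OP_RECV}`.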

36 Home Work 2
- We have just used MPI_Isend().
- Try to use the other non-blocking functions listed.

