1 MPI Primer Lesson 10
2 What is MPI
MPI is the standard for multicomputer and cluster message passing, introduced by the Message-Passing Interface Forum in April 1994. The goal of MPI is to develop a widely used standard for writing message-passing programs.
3 Historical Perspective
4 Major MPI Issues
1. Process Creation and Management: discusses the extension of MPI to remove the static process model of MPI. It defines routines that allow for the creation of processes.
2. One-Sided Communications: defines communication routines that can be completed by a single process. These include shared-memory operations (put/get) and remote accumulate operations.
3. Extended Collective Operations: extends the semantics of MPI-1 collective operations to include intercommunicators. It also adds more convenient methods of constructing intercommunicators and two new collective operations.
4. External Interfaces: defines routines designed to allow developers to layer on top of MPI. This includes generalized requests, routines that decode MPI opaque objects, and threads.
5. I/O: defines MPI-2 support for parallel I/O.
6. Language Bindings: describes the C++ binding and discusses Fortran-90 issues.
5 Message Passing
Message passing is the most popular approach for distributed-memory systems.
Three steps when a message is passed:
(1) Data is copied out of the sender's buffer and the message is assembled.
(2) The message is passed to the receiver.
(3) The message is disassembled and the data is copied into the receiver's buffer.
Communicator: specifies a domain within which communication takes place.
Two types of message passing:
1. Intra-communicator message passing
2. Inter-communicator message passing
Remarks:
1. A process may belong to several communicators at the same time.
2. A communicator is usually the entire collection of processors (or processes) you get for your application.
6 MPI_COMM_WORLD
Specifies all processes available at initialization.
– Rank
– Every message must have two attributes: 1. the Envelope, 2. the Data
– Message Tag
– MPI datatype
7 Rank and the two attributes
Rank:
1. An integer that uniquely identifies each process in your communicator.
2. Rank runs from 0 through n-1 (n = number of processes).
3. Rank can be queried with MPI_Comm_rank(), as in the sketch below.
Every message must have two attributes:
1. The Envelope
   a. Rank of destination
   b. Message tag
   c. Communicator
2. The Data
   a. Initial address of send buffer
   b. Number of entries to send
   c. Datatype of each entry
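A minimal C sketch of how a process learns its rank and the size of MPI_COMM_WORLD (standard MPI calls; the printed text is only an illustration):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my rank, 0 .. size-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("I am process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}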
8 Message Tag & MPI Datatype
Message tag:
(1) An ID for this particular message, matched by both sender and receiver.
(2) It is like sending multiple gifts to a friend: you need to identify each one.
(3) MPI_TAG_UB >= 32767
(4) Similar in functionality to the communicator ("comm") for grouping messages.
(5) "comm" is safer than "tag", but "tag" is more convenient.
MPI datatype: used to achieve portability among different architectures.
– Fortran: MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_COMPLEX, MPI_LOGICAL
– C: MPI_INT, MPI_CHAR, MPI_FLOAT, MPI_DOUBLE
– Either: MPI_BYTE
9 Main Message-Passing Functions
Blocking send: MPI_Send(
  data: a. initial address of send buffer, b. number of entries to send, c. datatype of each entry;
  envelope: a. rank of destination, b. message tag, c. communicator );
Blocking receive: MPI_Recv(
  data: a. initial address of receive buffer, b. max number of entries to receive, c. datatype of each entry;
  envelope: a. rank of source, b. message tag, c. communicator, d. return status );
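A minimal sketch of a matching send/receive pair, assuming MPI has been initialized and rank obtained as in the earlier sketch:

/* rank 0 sends ten doubles to rank 1 with tag 99 */
double a[10] = {0};
MPI_Status status;
if (rank == 0) {
    /* data: buffer, count, datatype | envelope: destination, tag, communicator */
    MPI_Send(a, 10, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);
} else if (rank == 1) {
    /* data: buffer, max count, datatype | envelope: source, tag, communicator, status */
    MPI_Recv(a, 10, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &status);
}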
10 Message Selection (Pulling a Message)
A receiver selects a message by its envelope information:
(1) source rank,
(2) message tag.
It can also receive all messages by using wildcards:
(1) MPI_ANY_TAG
(2) MPI_ANY_SOURCE
You must always specify a "comm".
MPI_Get_count(
  a. return status of the receive operation,
  b. datatype of each receive-buffer entry,
  c. number of received entries );
This function decodes the "status" returned by MPI_Recv(), as in the sketch below.
Remarks:
1. Message transfer is initiated by the sender (pushing), not by the receiver (pulling).
2. Sending a message to yourself is allowed, but may produce deadlock if the blocking send is posted before the matching receive.
3. Passing messages of multiple datatypes (a struct) is difficult:
   a. use packing/unpacking, or
   b. use a two-phase protocol (first the message description, then the message itself).
4. Avoid wildcards as much as possible.
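A sketch of receiving with wildcards and then decoding the status (assumes an initialized MPI program as before):

int buf[100], count;
MPI_Status status;
MPI_Recv(buf, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_INT, &count);   /* how many ints actually arrived */
printf("got %d ints from rank %d with tag %d\n",
       count, status.MPI_SOURCE, status.MPI_TAG);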
11 MPI_Sendrecv()
A round trip of a message: send a message out and then receive another message.
Useful for performing remote procedure calls (RPC): send the input parameters to the destination and then get the output back.
Use it when you need to send AND receive a message.
MPI_Sendrecv(
  sendbuf, sendcount, sendtype, dst-rank, send-tag,
  recvbuf, recvcount, recvtype, src-rank, recv-tag,
  comm, status );
Remarks:
1. Matches ordinary send and receive operations:
   a. a message sent by MPI_Sendrecv can be received by a regular recv;
   b. MPI_Sendrecv can receive a message sent by a regular send.
2. Both halves use the same "comm".
3. The send tag and receive tag may differ.
4. The send and receive buffers must be different (disjoint).
5. Send-receive is a concurrent double call (send and receive at the same time).
6. It avoids deadlock (makes a round trip of a message possible), as in the ring sketch below.
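A sketch of a ring exchange with MPI_Sendrecv, assuming rank and size have been obtained as before; each process sends to its right neighbor and receives from its left neighbor in one call, so no deadlock is possible:

int right = (rank + 1) % size;
int left  = (rank - 1 + size) % size;
double out = (double) rank, in;
MPI_Status status;
MPI_Sendrecv(&out, 1, MPI_DOUBLE, right, 10,   /* send half */
             &in,  1, MPI_DOUBLE, left,  10,   /* recv half */
             MPI_COMM_WORLD, &status);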
12 MPI_Sendrecv_replace()
Same as MPI_Sendrecv() except that a single buffer is used: after the call, the sent data in the buffer has been replaced by the received message.
MPI_Sendrecv_replace(
  buf, count, datatype,
  dst-rank, send-tag,
  src-rank, recv-tag,
  comm, status );
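A sketch of the in-place variant, reusing the left/right ring neighbors from the previous sketch; after the call the value of buf has been overwritten by the received one:

double buf = (double) rank;
MPI_Status status;
MPI_Sendrecv_replace(&buf, 1, MPI_DOUBLE, right, 0, left, 0,
                     MPI_COMM_WORLD, &status);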
13 Dummy Source or Destination: the Null Process
MPI_PROC_NULL: a send or receive with src=MPI_PROC_NULL or dst=MPI_PROC_NULL completes immediately and has no effect.
Remarks:
1. Convenient for keeping code balanced and symmetric (see the sketch below).
2. Use it with care.
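A sketch of a non-periodic right shift that uses MPI_PROC_NULL at the ends of the process line so the same call works for every rank (assumes rank and size as before):

int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
double out = (double) rank, in = -1.0;
MPI_Sendrecv(&out, 1, MPI_DOUBLE, right, 0,
             &in,  1, MPI_DOUBLE, left,  0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* a send to MPI_PROC_NULL returns immediately; a receive from it
   completes immediately and leaves 'in' unchanged */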
14 Two Message Protocols (Short and Long)
Short: the sender sends the message immediately; the receiver returns an acknowledgement.
Long: the sender sends a request-to-send; the receiver signals ready; the sender then sends the data; the receiver acknowledges.
15 Blocking & Non-Blocking Communication
Blocking
Suppose we reverse the order of arrival at the communication point: Proc 1 executes the receive, but Proc 0 has not yet executed the send. We say that MPI_Recv is blocking: Proc 1 calls the receive function and, because no matching message is available, remains idle until one arrives. This is different from synchronous communication. In blocking communication, Proc 0 may have already buffered the message by the time Proc 1 is ready to receive, but the communication line joining the processes might still be busy.
Nonblocking
Most systems provide an alternative receive operation, MPI_Irecv ("immediate" receive). It has one more parameter than MPI_Recv, the request. With it, the process gets a return "immediately" from the call. For example, if Proc 1 calls MPI_Irecv, the call notifies the system that Proc 1 intends to receive a message from Proc 0 with the properties indicated by the arguments, and the system initializes the request argument. Proc 1 can then perform other useful work and check back with the system later (independently of what Proc 0 does), using the request argument, to see whether the message has arrived.
The use of nonblocking communication can dramatically improve the performance of message-passing programs. If each node has a communication coprocessor, we can start a nonblocking communication and perform the computations that do not depend on the result of that communication.
16 Message-Passing Functions
Message-passing functions can be either blocking or nonblocking.
In blocking message passing, a call to a communication function does not return until the operation is complete.
Nonblocking communication consists of two phases:
– first phase: a function is called that starts the communication;
– second phase: another function is called that completes the communication.
If the system can compute and communicate simultaneously, we can do useful computation between the two phases, as in the sketch below.
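A sketch of the two phases with MPI_Irecv and MPI_Wait; do_useful_work() is only a placeholder for computation that does not touch buf:

double buf[1000];
MPI_Request request;
MPI_Status status;
/* phase 1: start the communication and return immediately */
MPI_Irecv(buf, 1000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &request);
do_useful_work();                 /* hypothetical work independent of buf */
/* phase 2: complete the communication; buf is safe to read afterwards */
MPI_Wait(&request, &status);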
17 Non-Blocking Communication
Three orderings of nonblocking message passing between two processes:
            Sender       Receiver
Method-1    T1: send     T1: recv
            T2: recv     T2: send
Method-2    T1: send     T1: send
            T2: recv     T2: recv
Method-3    T1: recv     T1: send
            T2: send     T2: recv
18 Completion Operations
MPI_Wait( request, status )
This call returns only when the "request" is complete.
MPI_Test( request, flag, status )
(1) flag = TRUE if the request is complete; otherwise it is FALSE.
(2) MPI_Wait() would return exactly when the flag in MPI_Test() would be TRUE.
Remark: this allows easy conversion of blocking code to nonblocking code, as in the sketch below.
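A sketch of polling with MPI_Test, assuming 'request' came from an earlier MPI_Irecv or MPI_Isend and do_more_work() is a placeholder for independent computation:

int flag = 0;
MPI_Status status;
while (!flag) {
    MPI_Test(&request, &flag, &status);   /* flag becomes true once the request completes */
    if (!flag)
        do_more_work();                   /* hypothetical independent work */
}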
19 More Completion Operations
MPI_Waitany(
  list length,
  array of request handles,
  index of the completed request handle,
  status object )
MPI_Testany() is its nonblocking counterpart.
Remarks:
1. These additional Test and Wait tools allow easy migration of blocking code to nonblocking code (see the sketch below).
2. MPI_Request_free() removes the request handle but will not cancel the message.
3. MPI_Cancel() will cancel the message.
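A sketch of MPI_Waitany over several outstanding requests (the four MPI_Irecv calls that fill the array are omitted here):

MPI_Request requests[4];
MPI_Status  status;
int which;
/* ... four MPI_Irecv calls fill requests[0..3] ... */
MPI_Waitany(4, requests, &which, &status);
/* 'which' is the index of the request that completed; MPI sets that
   slot to MPI_REQUEST_NULL */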
20 MPI Message
The actual message passing in a program is carried out by the MPI functions MPI_Send (sends a message to a designated process) and MPI_Recv (receives a message from a process).
Issues involved in message passing:
1. A message must be composed and put in a buffer.
2. The message must be "dropped in a mailbox"; for the system to know where to deliver it, the message must be "enclosed in an envelope" that carries the destination address.
3. The address alone is not enough: since the physical message is a sequence of electrical signals, the system needs to know where the message ends, i.e., the size of the message.
4. To take appropriate action on the message, the receiver needs the return address: the address of the source process.
5. A message type, or tag, also helps the receiver take the proper action on the message.
6. The receiver needs to know which communicator the message comes from.
Therefore, the message envelope contains:
1. the rank of the receiver
2. the rank of the sender
3. a tag (message type)
4. a communicator
The actual message is stored in a block of memory. The system needs the count and datatype to determine how much storage the message requires:
1. the count value
2. the MPI datatype
The message also needs a message pointer so the system knows where to get the data:
1. the message pointer
21 Sending Messages
The parameters for MPI_Send and MPI_Recv are:
int MPI_Send(
    void         *message   /* in  */,
    int           count     /* in  */,
    MPI_Datatype  datatype  /* in  */,
    int           dest      /* in  */,
    int           tag       /* in  */,
    MPI_Comm      comm      /* in  */);
int MPI_Recv(
    void         *message   /* out */,
    int           count     /* in  */,
    MPI_Datatype  datatype  /* in  */,
    int           source    /* in  */,
    int           tag       /* in  */,
    MPI_Comm      comm      /* in  */,
    MPI_Status   *status    /* out */);
22 Send and Receive Pair
The status returns information on the data that was actually received. It references a struct with at least three members:
status -> MPI_SOURCE  /* contains the rank of the process that sent the message */
status -> MPI_TAG     /* contains the tag of the message */
status -> MPI_ERROR   /* contains the error code */
A send with one tag and a receive with another tag pair up only if their envelopes match; the receive may accept any sender or tag by using the wildcards MPI_ANY_TAG and MPI_ANY_SOURCE, and the delivered message is identical to the one sent.
23 In Summary
The count and datatype determine the size of the message.
The tag and comm are used to make sure that messages don't get mixed up.
Each message consists of two parts: the data being transmitted and the envelope of information.
Data: 1. pointer, 2. count, 3. datatype
Envelope: 1. the rank of the receiver, 2. the rank of the sender, 3. a tag, 4. a communicator, 5. status (for the receive)