1 PDS Lectures 6 & 7 An Introduction to MPI Originally by William Gropp and Ewing Lusk Argonne National Laboratory Adapted by Anda Iamnitchi.

Slides:



Advertisements
Similar presentations
MPI Message Passing Interface
Advertisements

CS 140: Models of parallel programming: Distributed memory and MPI.
MPI Fundamentals—A Quick Overview Shantanu Dutt ECE Dept., UIC.
Introduction MPI Mengxia Zhu Fall An Introduction to MPI Parallel Programming with the Message Passing Interface.
CS 240A: Models of parallel programming: Distributed memory and MPI.
Distributed Memory Programming with MPI. What is MPI? Message Passing Interface (MPI) is an industry standard message passing system designed to be both.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd Edition, by B. Wilkinson & M. Allen, ©
1 Message Passing Programming (MPI). 2 What is MPI? A message-passing library specification extended message-passing model not a language or compiler.
EECC756 - Shaaban #1 lec # 7 Spring Message Passing Interface (MPI) MPI, the Message Passing Interface, is a library, and a software standard.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd Edition, by B. Wilkinson & M. Allen, ©
MPI (Message Passing Interface) Basics
1 Tuesday, October 10, 2006 To err is human, and to blame it on a computer is even more so. -Robert Orben.
Basics of Message-passing Mechanics of message-passing –A means of creating separate processes on different computers –A way to send and receive messages.
1 An Introduction to MPI Parallel Programming with the Message Passing Interface Originally by William Gropp and Ewing Lusk Adapted by Anda Iamnitchi.
1 An Introduction to MPI Parallel Programming with the Message Passing Interface William Gropp Ewing Lusk Argonne National Laboratory Presenter: Mike Slavik.
2a.1 Message-Passing Computing More MPI routines: Collective routines Synchronous routines Non-blocking routines ITCS 4/5145 Parallel Computing, UNC-Charlotte,
1 MPI: Message-Passing Interface Chapter 2. 2 MPI - (Message Passing Interface) Message passing library standard (MPI) is developed by group of academics.
Part I MPI from scratch. Part I By: Camilo A. SilvaBIOinformatics Summer 2008 PIRE :: REU :: Cyberbridges.
CS 240A Models of parallel programming: Distributed memory and MPI.
Parallel Computing A task is broken down into tasks, performed by separate workers or processes Processes interact by exchanging information What do we.
CS 484. Message Passing Based on multi-processor Set of independent processors Connected via some communication net All communication between processes.
Parallel Programming with MPI Prof. Sivarama Dandamudi School of Computer Science Carleton University.
Message Passing Programming with MPI Introduction to MPI Basic MPI functions Most of the MPI materials are obtained from William Gropp and Rusty Lusk’s.
CS 838: Pervasive Parallelism Introduction to MPI Copyright 2005 Mark D. Hill University of Wisconsin-Madison Slides are derived from an online tutorial.
Message Passing Programming Model AMANO, Hideharu Textbook pp. 140-147.
Summary of MPI commands Luis Basurto. Large scale systems Shared Memory systems – Memory is shared among processors Distributed memory systems – Each.
Message Passing Interface (MPI) 1 Amit Majumdar Scientific Computing Applications Group San Diego Supercomputer Center Tim Kaiser (now at Colorado School.
MPI Introduction to MPI Commands. Basics – Send and Receive MPI is a message passing environment. The processors’ method of sharing information is NOT.
1 The Message-Passing Model l A process is (traditionally) a program counter and address space. l Processes may have multiple threads (program counters.
Distributed-Memory (Message-Passing) Paradigm FDI 2004 Track M Day 2 – Morning Session #1 C. J. Ribbens.
An Introduction to MPI Parallel Programming with the Message Passing Interface Prof S. Ramachandram.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd Edition, by B. Wilkinson & M. Allen, ©
CSCI-455/522 Introduction to High Performance Computing Lecture 4.
1 Message Passing Models CEG 4131 Computer Architecture III Miodrag Bolic.
CS4230 CS4230 Parallel Programming Lecture 13: Introduction to Message Passing Mary Hall October 23, /23/2012.
MPI Point to Point Communication CDP 1. Message Passing Definitions Application buffer Holds the data for send or receive Handled by the user System buffer.
Programming distributed memory systems: Message Passing Interface (MPI) Distributed memory systems: multiple processing units working on one task (e.g.
12.1 Parallel Programming Types of Parallel Computers Two principal types: 1.Single computer containing multiple processors - main memory is shared,
CSE 160 – Lecture 16 MPI Concepts, Topology and Synchronization.
Introduction to Parallel Programming at MCSR Message Passing Computing –Processes coordinate and communicate results via calls to message passing library.
Message Passing Interface (MPI) 2 Amit Majumdar Scientific Computing Applications Group San Diego Supercomputer Center Tim Kaiser (now at Colorado School.
2.1 Collective Communication Involves set of processes, defined by an intra-communicator. Message tags not present. Principal collective operations: MPI_BCAST()
1 Parallel and Distributed Processing Lecture 5: Message-Passing Computing Chapter 2, Wilkinson & Allen, “Parallel Programming”, 2 nd Ed.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd Edition, by B. Wilkinson & M. Allen, ©
Message Passing Interface Using resources from
MPI-Message Passing Interface. What is MPI?  MPI is a specification for the developers and users of message passing libraries. By itself, it is NOT a.
Lecture 3 Point-to-Point Communications Dr. Muhammad Hanif Durad Department of Computer and Information Sciences Pakistan Institute Engineering and Applied.
COMP7330/7336 Advanced Parallel and Distributed Computing MPI Programming - Exercises Dr. Xiao Qin Auburn University
ITCS 4/5145 Parallel Computing, UNC-Charlotte, B
1 MPI: Message Passing Interface Prabhaker Mateti Wright State University.
Outline Background Basics of MPI message passing
Introduction to parallel computing concepts and technics
CS4402 – Parallel Computing
Introduction to MPI.
MPI Message Passing Interface
CS 584.
CS4961 Parallel Programming Lecture 16: Introduction to Message Passing Mary Hall November 3, /03/2011 CS4961.
MPI-Message Passing Interface
ITCS 4/5145 Parallel Computing, UNC-Charlotte, B
Lecture 14: Inter-process Communication
MPI: Message Passing Interface
Message-Passing Computing More MPI routines: Collective routines Synchronous routines Non-blocking routines ITCS 4/5145 Parallel Computing, UNC-Charlotte,
Introduction to parallelism and the Message Passing Interface
Message-Passing Computing Message Passing Interface (MPI)
Hello, world in MPI #include <stdio.h> #include "mpi.h"
Spring’19 Recitation: Introduction to MPI
Hello, world in MPI #include <stdio.h> #include "mpi.h"
MPI Message Passing Interface
CS 584 Lecture 8 Assignment?.
Presentation transcript:

1 PDS Lectures 6 & 7 An Introduction to MPI Originally by William Gropp and Ewing Lusk Argonne National Laboratory Adapted by Anda Iamnitchi

2 Outline Background  The message-passing model  Origins of MPI and current status  Sources of further MPI information Basics of MPI message passing  Hello, World!  Fundamental concepts  Simple examples in Fortran and C Extended point-to-point operations  non-blocking communication  Modes Collective operations

3 The Message-Passing Model A process is (traditionally) a program counter and address space. Processes may have multiple threads (program counters and associated stacks) sharing a single address space. MPI is for communication among processes, which have separate address spaces. Interprocess communication consists of  Synchronization  Movement of data from one process’s address space to another’s.

4 Types of Parallel Computing Models (review) Data Parallel - the same instructions are carried out simultaneously on multiple data items (SIMD) Task Parallel - different instructions on different data (MIMD)  SPMD (single program, multiple data) not synchronized at individual operation level  SPMD is equivalent to MIMD since each MIMD program can be made SPMD (similarly for SIMD, but not in practical sense.) Message passing (and MPI) is for MIMD/SPMD parallelism.

5 Cooperative Operations for Communication The message-passing approach makes the exchange of data cooperative. Data is explicitly sent by one process and received by another. An advantage is that any change in the receiving process’s memory is made with the receiver’s explicit participation. Communication and synchronization are combined. Process 0 Process 1 Send(data) Receive(data)

6 One-Sided Operations for Communication One-sided operations between processes include remote memory reads and writes Only one process needs to explicitly participate. An advantage is that communication and synchronization are decoupled One-sided operations are part of MPI-2. Process 0 Process 1 Put(data) (memory) Get(data)

7 What is MPI? A message-passing library specification  extended message-passing model  not a language or compiler specification  not a specific implementation or product For parallel computers, clusters, and heterogeneous networks Designed to provide access to advanced parallel hardware for  end users  library writers  tool developers

8 A Minimal MPI Program (C) #include "mpi.h" #include int main( int argc, char *argv[] ) { MPI_Init( &argc, &argv ); printf( "Hello, world!\n" ); MPI_Finalize(); return 0; }

9 Finding Out About the Environment Two important questions that arise early in a parallel program are:  How many processes are participating in this computation?  Which one am I? MPI provides functions to answer these questions:  MPI_Comm_size reports the number of processes.  MPI_Comm_rank reports the rank, a number between 0 and size-1, identifying the calling process

10 Better Hello (C) #include "mpi.h" #include int main( int argc, char *argv[] ) { int rank, size; MPI_Init( &argc, &argv ); MPI_Comm_rank( MPI_COMM_WORLD, &rank ); MPI_Comm_size( MPI_COMM_WORLD, &size ); printf( "I am %d of %d\n", rank, size ); MPI_Finalize(); return 0; }

11 MPI Basic Send/Receive We need to fill in the details in Things that need specifying:  How will “data” be described?  How will processes be identified?  How will the receiver recognize/screen messages?  What will it mean for these operations to complete? Process 0 Process 1 Send(data ) Receive(data )

12 What is message passing? Data transfer plus synchronization Requires cooperation of sender and receiver Cooperation not always apparent in code Data Process 0 Process 1 May I Send? Yes Data Time

13 Some Basic Concepts Processes can be collected into groups. Each message is sent in a context, and must be received in the same context. A group and context together form a communicator. A process is identified by its rank in the group associated with a communicator. There is a default communicator whose group contains all initial processes, called MPI_COMM_WORLD.

14 MPI Datatypes The data in a message to sent or received is described by a triple (address, count, datatype), where An MPI datatype is recursively defined as:  predefined, corresponding to a data type from the language (e.g., MPI_INT, MPI_DOUBLE_PRECISION)  a contiguous array of MPI datatypes  a strided block of datatypes  an indexed array of blocks of datatypes  an arbitrary structure of datatypes There are MPI functions to construct custom datatypes, such an array of (int, float) pairs, or a row of a matrix stored columnwise.

15 MPI Tags Messages are sent with an accompanying user- defined integer tag, to assist the receiving process in identifying the message. Messages can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive. Some non-MPI message-passing systems have called tags “message types”. MPI calls them tags to avoid confusion with datatypes.

16 MPI Basic (Blocking) Send MPI_SEND (start, count, datatype, dest, tag, comm) The message buffer is described by (start, count, datatype). The target process is specified by dest, which is the rank of the target process in the communicator specified by comm. When this function returns, the data has been delivered to the system and the buffer can be reused. The message may not have been received by the target process. MPI_Send(buf, count, datatype, dest, tag, comm) Address of Number of items Datatype of Rank of destination Message tag Communicator send buffer to send each item process

17 MPI Basic (Blocking) Receive MPI_RECV(start, count, datatype, source, tag, comm, status) Waits until a matching (on source and tag ) message is received from the system, and the buffer can be used. source is rank in communicator specified by comm, or MPI_ANY_SOURCE. status contains further information Receiving fewer than count occurrences of datatype is OK, but receiving more is an error. MPI_Recv(buf, count, datatype, src, tag, comm, status) Address of Maximum number Datatype of Rank of source Message tag Communicator receive buffer of items to receive each item process Status after operation

18 Example To send an integer x from process 0 to process 1, MPI_Comm_rank(MPI_COMM_WORLD,&myrank); /* find rank */ if (myrank == 0) { int x; MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD); } else if (myrank == 1) { int x; MPI_Recv(&x, 1, MPI_INT, 0, msgtag,MPI_COMM_WORLD,status); }

19 Retrieving Further Information Status is a data structure allocated in the user’s program. In C: int recvd_tag, recvd_from, recvd_count; MPI_Status status; MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG,..., &status ) recvd_tag = status.MPI_TAG; recvd_from = status.MPI_SOURCE; MPI_Get_count( &status, datatype, &recvd_count );

20 Why Datatypes? Since all data is labeled by type, an MPI implementation can support communication between processes on machines with very different memory representations and lengths of elementary datatypes (heterogeneous communication). Specifying application-oriented layout of data in memory  reduces memory-to-memory copies in the implementation  allows the use of special hardware (scatter/gather) when available

21 Tags and Contexts Separation of messages used to be accomplished by use of tags, but  this requires libraries to be aware of tags used by other libraries.  this can be defeated by use of “wild card” tags. Contexts are different from tags  no wild cards allowed  allocated dynamically by the system when a library sets up a communicator for its own use. User-defined tags still provided in MPI for user convenience in organizing application Use MPI_Comm_split to create new communicators

22 Communicators Defines scope of a communication operation. Processes have ranks associated with communicator. Initially, all processes enrolled in a “universe” called MPI_COMM_WORLD, and each process is given a unique rank, a number from 0 to p - 1, with p processes. Other communicators can be established for groups of processes.

23 MPI is Simple Many parallel programs can be written using just these six functions, only two of which are non-trivial:  MPI_INIT  MPI_FINALIZE  MPI_COMM_SIZE  MPI_COMM_RANK  MPI_SEND  MPI_RECV Point-to-point (send/recv) isn’t the only way...

24 Introduction to Collective Operations in MPI Collective operations are called by all processes in a communicator. MPI_BCAST distributes data from one process (the root) to all others in a communicator. MPI_REDUCE combines data from all processes in communicator and returns it to one process. In many numerical algorithms, SEND/RECEIVE can be replaced by BCAST/REDUCE, improving both simplicity and efficiency.

25 Example: PI in C -1 #include "mpi.h" #include int main(int argc, char *argv[]) { int done = 0, n, myid, numprocs, i, rc; double PI25DT = ; double mypi, pi, h, sum, x, a; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&numprocs); MPI_Comm_rank(MPI_COMM_WORLD,&myid); while (!done) { if (myid == 0) { printf("Enter the number of intervals: (0 quits) "); scanf("%d",&n); } MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD); if (n == 0) break;

26 Example: PI in C - 2 h = 1.0 / (double) n; sum = 0.0; for (i = myid + 1; i <= n; i += numprocs) { x = h * ((double)i - 0.5); sum += 4.0 / (1.0 + x*x); } mypi = h * sum; MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD); if (myid == 0) printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT)); } MPI_Finalize(); return 0; }

27 Alternative set of 6 Functions for Simplified MPI  MPI_INIT  MPI_FINALIZE  MPI_COMM_SIZE  MPI_COMM_RANK  MPI_BCAST  MPI_REDUCE What else is needed (and why)?

28 Collective Communication Involves set of processes, defined by an intra-communicator. Message tags not present. Principal collective operations: MPI_Bcast() - Broadcast from root to all other processes MPI_Gather() - Gather values for group of processes MPI_Scatter() - Scatters buffer in parts to group of processes MPI_Alltoall() - Sends data from all processes to all processes MPI_Reduce() - Combine values on all processes to single value MPI_Reduce_scatter() - Combine values and scatter results MPI_Scan() - Compute prefix reductions of data on processes

29 Example To gather items from group of processes into process 0, using dynamically allocated memory in root process: int data[10];/*data to be gathered from processes*/ MPI_Comm_rank(MPI_COMM_WORLD, &myrank);/* find rank */ if (myrank == 0) { MPI_Comm_size(MPI_COMM_WORLD, &grp_size); /*find group size*/ buf = (int *)malloc(grp_size*10*sizeof (int)); /*allocate memory*/ } MPI_Gather(data,10,MPI_INT,buf,grp_size*10,MPI_INT,0,MPI_COMM_WORLD) ; MPI_Gather() gathers from all processes, including root.

30 Send a large message from process 0 to process 1  If there is insufficient storage at the destination, the send must wait for the user to provide the memory space (through a receive) What happens with Sources of Deadlocks Process 0 Send(1) Recv(1) Process 1 Send(0) Recv(0) This is called “unsafe” because it depends on the availability of system buffers

31 Some Solutions to the “unsafe” Problem Order the operations more carefully: Process 0 Send(1) Recv(1) Process 1 Recv(0) Send(0) Use non-blocking operations: Process 0 Isend(1) Irecv(1) Waitall Process 1 Isend(0) Irecv(0) Waitall

32 MPI Nonblocking Routines Nonblocking send - MPI_Isend() - will return “immediately” even before source location is safe to be altered. Nonblocking receive - MPI_Irecv() - will return even if no message to accept.

33 Nonblocking Routine Formats MPI_Isend(buf,count,datatype,dest,tag,comm,request) MPI_Irecv(buf,count,datatype,source,tag,comm, request) Completion detected by MPI_Wait() and MPI_Test(). MPI_Wait() waits until operation completed and returns then. MPI_Test() returns with flag set indicating whether operation completed at that time. Need to know whether particular operation completed. Determined by accessing request parameter.

34 Example To send an integer x from process 0 to process 1 and allow process 0 to continue, MPI_Comm_rank(MPI_COMM_WORLD, &myrank);/* find rank */ if (myrank == 0) { int x; MPI_Isend(&x,1,MPI_INT, 1, msgtag, MPI_COMM_WORLD, req1); compute(); MPI_Wait(req1, status); } else if (myrank == 1) { int x; MPI_Recv(&x,1,MPI_INT,0,msgtag, MPI_COMM_WORLD, status); }

35 MPI-2 (available on C4 lab machines) Extends MPI-1 with: Dynamic Process Management  Dynamic process startup  Dynamic establishment of connections One-sided communication (Remote Memory Access)  Put/get  Other operations Parallel I/O Other features  Bindings for C++/ Fortran-90;  interlanguage issues  Etc.

36 MPI-2 Specifics: Parallel I/O

37 /* example of sequential Unix write into a common file */ #include "mpi.h" #include #define BUFSIZE 100 int main(int argc, char *argv[]) { int i, myrank, numprocs, buf[BUFSIZE]; MPI_Status status; FILE *myfile; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); MPI_Comm_size(MPI_COMM_WORLD, &numprocs); for (i=0; i<BUFSIZE; i++) buf[i] = myrank * BUFSIZE + i; if (myrank != 0) MPI_Send(buf, BUFSIZE, MPI_INT, 0, 99, MPI_COMM_WORLD); else { myfile = fopen("testfile", "w"); fwrite(buf, sizeof(int), BUFSIZE, myfile); for (i=1; i<numprocs; i++) { MPI_Recv(buf, BUFSIZE, MPI_INT, i, 99, MPI_COMM_WORLD, &status); fwrite(buf, sizeof(int), BUFSIZE, myfile); } fclose(myfile); } MPI_Finalize(); return 0; }

38 #include "mpi.h" #include #define BUFSIZE 100 int main(int argc, char *argv[]) { int i, myrank, buf[BUFSIZE]; char filename[128]; FILE *myfile; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); for (i=0; i<BUFSIZE; i++) buf[i] = myrank * BUFSIZE + i; sprintf(filename, "testfile.%d", myrank); myfile = fopen(filename, "w"); fwrite(buf, sizeof(int), BUFSIZE, myfile); fclose(myfile); MPI_Finalize(); return 0; }

39 /* example of parallel MPI write into a single file */ #include "mpi.h" #include #define BUFSIZE 100 int main(int argc, char *argv[]) { int i, myrank, buf[BUFSIZE]; MPI_File thefile; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); for (i=0; i<BUFSIZE; i++) buf[i] = myrank * BUFSIZE + i; MPI_File_open(MPI_COMM_WORLD, "testfile", MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &thefile); MPI_File_set_view(thefile, myrank * BUFSIZE * sizeof(int), MPI_INT, MPI_INT, "native", MPI_INFO_NULL); MPI_File_write(thefile, buf, BUFSIZE, MPI_INT, MPI_STATUS_IGNORE); MPI_File_close(&thefile); MPI_Finalize(); return 0; }

40 C bindings for the I/O functions int MPI_File_open(MPI_Comm comm, char *filename, int amode, MPI_Info info, MPI_File *fh) int MPI_File_set_view(MPI_File fh, MPI_Offset disp, MPI_Datatype etype, MPI_Datatype filetype, char *datarep, MPI_Info info) int MPI_File_write(MPI_File fh, void *buf, int count, MPI_Datatype datatype, MPI_Status *status) int MPI_File_close(MPI_File *fh)

41 Remote Memory Access “ window”: portion of each process’ address space that is explicitly exposes to remove memory operations by other processes in the MPI communication.

42 RMA (cont) Objectives and performance issues:  balance efficiency and portability across several classes of architectures, including SMPs, NUMAs, distributed-memory massively parallel processors (MPPs), heterogeneous networks;  retain the "look and feel" of MPI-1;  deal with subtle memory behavior issues, such as cache coherence and sequential consistency; and  separate synchronization from data movement to enhance performance.

43 /* Compute pi by numerical integration, RMA version */ #include "mpi.h" #include int main(int argc, char *argv[]) { int n, myid, numprocs, i; double PI25DT = ; double mypi, pi, h, sum, x; MPI_Win nwin, piwin; MPI_Init(&argc,&argv); MPI_Comm_size(MPI_COMM_WORLD,&numprocs); MPI_Comm_rank(MPI_COMM_WORLD,&myid); if (myid == 0) { MPI_Win_create(&n, sizeof(int), 1, MPI_INFO_NULL, MPI_COMM_WORLD, &nwin); MPI_Win_create(&pi, sizeof(double), 1, MPI_INFO_NULL, MPI_COMM_WORLD, &piwin); } else { MPI_Win_create(MPI_BOTTOM, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &nwin); MPI_Win_create(MPI_BOTTOM, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &piwin); }

44 while (1) { if (myid == 0) { printf("Enter the number of intervals: (0 quits) "); fflush(stdout); scanf("%d",&n); pi = 0.0; } MPI_Win_fence(0, nwin); if (myid != 0) MPI_Get(&n, 1, MPI_INT, 0, 0, 1, MPI_INT, nwin); MPI_Win_fence(0, nwin); if (n == 0) break; else { h = 1.0 / (double) n; sum = 0.0; for (i = myid + 1; i <= n; i += numprocs) { x = h * ((double)i - 0.5); sum += (4.0 / (1.0 + x*x)); } mypi = h * sum; MPI_Win_fence( 0, piwin); MPI_Accumulate(&mypi, 1, MPI_DOUBLE, 0, 0, 1, MPI_DOUBLE, MPI_SUM, piwin); MPI_Win_fence(0, piwin); if (myid == 0) MPI_COMM_WORLD, &piwin); } else { MPI_Win_create(MPI_BOTTOM, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &nwin); MPI_Win_create(MPI_BOTTOM, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &piwin); } printf("pi is approximately %. 16f, Error is %. 16f/n", pi, fabs(pi - PI25DT)); } } MPI_Win_free(&nwin); MPI_Win_free(&piwin); MPI_Finalize(); return 0; }

45 C bindings for the RMA functions used in the cpi example int MPI_Win_create(void *base, MPI_Aint size, int disp_unit, MPI_Info info, MPI_Comm comm, MPI_Win *win) int MPI_Win_fence(int assert, MPI_Win win) int MPI_Get(void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Win win) int MPI_Accumulate(void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Op op, MPI_Win win) int MPI_Win_free(MPI_Win *win)

46 Dynamic Processes

47 When to use MPI Portability and Performance Irregular Data Structures Building Tools for Others  Libraries Need to Manage memory on a per processor basis

48 When not to use MPI Regular computation matches HPF/SIMD  But see PETSc/HPF comparison (ICASE 97-72)ICASE Solution (e.g., library) already exists  Require Fault Tolerance  Sockets Distributed Computing  CORBA, DCOM, etc.

49 Error Handling By default, an error causes all processes to abort. The user can cause routines to return (with an error code) instead.  In C++, exceptions are thrown (MPI-2) A user can also write and install custom error handlers. Libraries might want to handle errors differently from applications.

50 Resources On-line information (including tutorials):   Using MPI-2 advanced features of the message-passing interface, William Gropp... [et al.].  E-book available from the USF library The Standard itself:  at  All MPI official releases, in both postscript and HTML Online examples available at ftp://ftp.mcs.anl.gov/mpi/mpiexmpl.tar.gz contains source code and run scripts that allows you to evaluate an MPI implementation ftp://ftp.mcs.anl.gov/mpi/mpiexmpl.tar.gz