Chapter 4 Message-Passing Programming
The Message-Passing Model
What is MPI? MPI (Message Passing Interface) is a library specification for message passing. It was designed for high performance on both massively parallel machines and workstation clusters.
MPI History In 1992, participants at the Supercomputing conference agreed to develop and then implement a common standard for message passing. The first MPI standard, called MPI-1, was completed in May. The second MPI standard, MPI-2, was completed in 1998.
MPI History The most popular implementation is Argonne's MPICH (based on the P4 package and Chameleon, hence the "CH" suffix). Many other implementations were also released, including vendor-specific ones (e.g., LAM MPI and Intel MPI).
What’s in MPI-2
Dynamic process management
◦ Adding processes to a running MPI computation
Parallel I/O
C++ and Fortran 90 bindings
Misc
◦ Interaction with threads
◦ Interoperability between languages
◦ Extensions/enhancements to MPI-1
Why MPI? Portability: MPI has been implemented for almost every distributed memory architecture. Speed: Each implementation is in principle optimized for the hardware on which it runs. Language support: Most MPI implementations are directly callable from Fortran, C and C++, and from any language capable of interfacing with such libraries (such as C#, Java or Python).
Message Passing A paradigm of communication in which messages are sent from a sender to one or more recipients. Depending on the system, messages may be: 1. Transferred reliably. 2. Guaranteed to be delivered in order. 3. Sent synchronously or asynchronously. 4. Passed one-to-one, one-to-many or many-to-one.
Programming Model (1/2) MPI lends itself to virtually any distributed memory parallel programming model. As shared memory systems became more popular, particularly SMP/NUMA architectures, MPI implementations for these platforms appeared. MPI is now used on just about any common parallel architecture, including massively parallel machines, SMP clusters, workstation clusters and heterogeneous networks.
Programming Model (2/2) In MPI-1, the number of tasks dedicated to running a parallel program is static: new tasks cannot be dynamically spawned during run time (MPI-2 adds dynamic process management). All parallelism is explicit: ◦ The programmer is responsible for correctly identifying parallelism and implementing parallel algorithms using MPI constructs.
Error Handling By default, an error causes all processes to abort. The user can cause routines to return (with an error code) instead. ◦ In C++, exceptions are thrown (MPI-2) A user can also write and install custom error handlers.
The Process A process is a program in execution. A program is not a process; a program is a passive entity, whereas a process is an active entity.
SPMD Coding Model Single Program Multiple Data
Communicators and Groups Communicator objects connect groups of processes in the MPI session. Each communicator gives each contained process an independent identifier and arranges its contained processes in an ordered topology. MPI understands single group intracommunicator operations, and bilateral intercommunicator communication.
Rank Within a communicator, every process has its own unique integer identifier, called its rank. A rank is also called a “task ID”. Ranks are contiguous and begin at zero. The programmer uses ranks to specify the source and destination of messages.
MPI Program Architecture
1. MPI include file
2. Declarations, prototypes, etc.
3. Program begins (serial code)
4. Initialize MPI environment (parallel code begins)
5. Do work and make message-passing calls
6. Terminate MPI environment (parallel code ends)
7. (Other serial code)
Booting MPI on the Cluster as Administrator For the administrator (root): the Intel MPI Library uses a Multi-Purpose Daemon (MPD) job startup mechanism. To run programs compiled with mpicc (or related) commands, you must first set up the MPD daemons. As the administrator (root), start the MPD daemon with mpd and shut it down with mpdallexit:
    pn1: ~ # mpd &
    [1] 5380
    pn1: ~ # mpdallexit
Booting MPI on the Cluster as a User Before running an MPI program, you need to boot the machines with the mpdboot command:
    mpdboot -n 4
Use the mpdtrace command to list the booted nodes (computers):
    mpdtrace
    pn1
    pn4
    pn3
    pn2
Compiling MPI Programs In general, compiling and starting an MPI program depends on the implementation of MPI you are using, and might require various scripts, program arguments, and/or environment variables. In C:
◦ Use the gcc compiler
    mpicc -o a.out sample.c
◦ Use the Intel compiler
    mpiicc -o a.out sample.c
Running a Program on the Cluster To run an MPI program, use the mpiexec command as follows: ◦ -n is the number of processes that run the program.
    mpiexec -n 8 ./a.out
    Process 5 of 8 is on pn4
    Process 6 of 8 is on pn4
    Process 0 of 8 is on pn1
    Process 2 of 8 is on pn1
    Process 1 of 8 is on pn1
    Process 3 of 8 is on pn1
    Process 7 of 8 is on pn4
    Process 4 of 8 is on pn4
    pi is approximately , Error is
    wall clock time =
A Minimal MPI Program in C
    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int id, t_process;

        MPI_Init(&argc, &argv);                     /* start the MPI environment */
        MPI_Comm_rank(MPI_COMM_WORLD, &id);         /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &t_process);  /* total number of processes */
        printf("Hello, world! This is process %d of %d\n", id, t_process);
        MPI_Finalize();                             /* shut down MPI */
        return 0;
    }
Sample Result The sample result is as follows (the order of the output lines can vary from run to run):
    mpiicc -o a.out sample.c
    mpiexec -n 8 ./a.out
    Hello, world! This is process 0 of 8
    Hello, world! This is process 6 of 8
    Hello, world! This is process 4 of 8
    Hello, world! This is process 5 of 8
    Hello, world! This is process 7 of 8
    Hello, world! This is process 2 of 8
    Hello, world! This is process 1 of 8
    Hello, world! This is process 3 of 8
Sample Result (execution flow with 3 processes)
START
Process 0: MPI_Init; MPI_Comm_rank (id = 0); MPI_Comm_size (t_process = 3); print "Hello, world! This is process 0 of 3"; MPI_Finalize
Process 1: MPI_Init; MPI_Comm_rank (id = 1); MPI_Comm_size (t_process = 3); print "Hello, world! This is process 1 of 3"; MPI_Finalize
Process 2: MPI_Init; MPI_Comm_rank (id = 2); MPI_Comm_size (t_process = 3); print "Hello, world! This is process 2 of 3"; MPI_Finalize
STOP
MPI Initialization Function MPI_Init must be called ◦ before any other MPI function, ◦ in every MPI program, ◦ only once in an MPI program. Note: All MPI identifiers, including function identifiers, begin with the prefix MPI_, followed by a capital letter and a series of lowercase letters and underscores.
MPI_Comm_rank and MPI_Comm_size MPI_COMM_WORLD is the default communicator that you get “for free”; it contains all processes of the program. MPI_Comm_rank determines the rank of the calling process within a communicator. MPI_Comm_size determines the total number of processes in a communicator.
MPI Finalize Function MPI_Finalize terminates the MPI execution environment. It should be the last MPI routine called in every MPI program. It allows the system to free resources (such as memory) that have been allocated to MPI.
Circuit Satisfiability
    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char *argv[])
    {
        int i;
        int id;   /* rank of this process */
        int p;    /* number of processes */
        void check_circuit (int, int);

        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &id);
        MPI_Comm_size (MPI_COMM_WORLD, &p);
        /* Cyclic allocation: process id checks inputs id, id+p, id+2p, ... */
        for (i = id; i < 65536; i += p)
            check_circuit (id, i);
        printf ("process %d is done\n", id);
        fflush (stdout);
        MPI_Finalize ();
        return 0;
    }
    /* Evaluates to 1 if bit i of n is set, otherwise 0 */
    #define EXTRACT_BIT(n,i) ((n & (1 << i)) ? 1 : 0)

    void check_circuit (int id, int z)
    {
        int v[16];   /* each element is one bit of z */
        int i;

        for (i = 0; i < 16; i++)
            v[i] = EXTRACT_BIT (z,i);
        if ((v[0] || v[1]) && (!v[1] || !v[3]) && (v[2] || v[3]) && ……) {
            printf ("%d) %d%d ….%d", id, v[0], v[1], …, v[15]);
            fflush (stdout);
        }
    }
Output (1/3)
Output (2/3)
Output (3/3)