MPI Message Passing Interface Yvon Kermarrec

More readings
"Parallel Programming with MPI", Peter Pacheco, Morgan Kaufmann Publishers
LAM/MPI User Guide: http://www.lam-mpi.org/tutorials/lam/
The MPI standard is available from http://www.mpi-forum.org/

Agenda
Part 0 – the context (slides extracted from a lecture by Hanjun Kim, Princeton University)
Part 1 – Introduction: basics of parallel computing, six-function MPI, point-to-point communications
Part 2 – Advanced features of MPI: collective communication
Part 3 – Examples and how to program an MPI application

Serial Computing 1k pieces puzzle Takes 10 hours

Parallelism on Shared Memory Orange and brown share the puzzle on the same table Takes 6 hours (not 5 due to communication & contention)

The more, the better?? Lack of seats (Resource limit) More contention among people

Parallelism on Distributed Systems Scalable seats (Scalable Resource) Less contention from private memory spaces

How to share the puzzle? DSM (Distributed Shared Memory) Message Passing

DSM (Distributed Shared Memory) Provides shared memory physically or virtually Pros - Easy to use Cons - Limited Scalability, High coherence overhead

Message Passing Pros – Scalable, Flexible Cons – Generally considered harder to program than DSM

Agenda
Part 0 – the context (slides extracted from a lecture by Hanjun Kim, Princeton University)
Part 1 – Introduction: basics of parallel computing, six-function MPI, point-to-point communications
Part 2 – Advanced features of MPI: collective communication
Part 3 – Examples and how to program an MPI application

We need more computational power The weather forecast example by P. Pacheco: suppose we wish to predict the weather over the United States and Canada for the next 48 hours. Also suppose that we want to model the atmosphere from sea level to an altitude of 20 km, using a cubical grid with each cube measuring 0.1 km on a side, i.e. 10^3 cells per km^3. That gives 2.0 x 10^7 km^2 x 20 km x 10^3 cells per km^3 = 4 x 10^11 grid points. Suppose we need to compute 100 instructions per grid point for each hour of the forecast: for 48 hours we need 4 x 10^13 x 48 ≈ 2 x 10^15 operations. If our computer executes 10^9 operations per second, that is about 2 x 10^6 seconds, roughly 23 days.

The need for parallel programming We face numerous challenges in science (biology, simulation, earthquakes, …) and we cannot build single computers fast enough. Data can be big (big data…) and memory is rather limited. Processors can do a lot, but to reach figures like those above, programming smarter is not enough on its own.

The need for parallel machines We can build a parallel machine, but there is still a huge amount of work to be done:
decide on and implement an interconnection network for the processors and memory modules
design and implement system software for the hardware
design algorithms and data structures to solve our problem
divide the algorithms and data structures into subproblems
identify the communications and data exchanges
assign subproblems to processors

The need for parallel machines Flynn's taxonomy (or how to work more!):
SISD: Single Instruction – Single Data: the common, classical machine
SIMD: Single Instruction – Multiple Data: the same instruction is carried out simultaneously on multiple data items
MIMD: Multiple Instructions – Multiple Data: independent instruction streams operate on independent data
SPMD: Single Program – Multiple Data: the same program is replicated and run on different data

The need for parallel machines We could build one big parallel computer… but that would be very expensive, time and energy consuming, and hard to maintain. We may instead want to integrate what is already available in the labs: aggregate the available computing resources and reuse ordinary machines. This is what the US Department of Energy did with the PVM project (Parallel Virtual Machine), started in 1989.

MPI: Message Passing Interface? MPI is an interface:
a message-passing library specification
an extended message-passing model
not a language or compiler specification
not a specific implementation or product
for parallel computers, clusters, and heterogeneous networks
a rich set of features
designed to provide access to advanced parallel hardware for end users, library writers, and tool developers

MPI? An international effort. Early vendor systems (Intel's NX, IBM's EUI, TMC's CMMD) were not portable. Early portable systems (PVM, p4, TCGMSG, Chameleon) were mainly research efforts: they were rather limited, lacked vendor support, and were not implemented at the most efficient level. The MPI Forum was organized in 1992 with broad participation by vendors (IBM, Intel, TMC, SGI, Convex, …) and users (application scientists and library writers).

How big is the MPI library? Huge: the MPI-1 standard defines about 125 functions. Basic: just 6 functions. Only a small subset is needed to program a distributed application.

Environments for parallel programming
Upshot, Jumpshot, and MPE tools: http://www.mcs.anl.gov/research/projects/perfvis/software/viewers/
Pallas VAMPIR: http://www.vampir.eu/
ParaGraph: http://www.ncsa.uiuc.edu/Apps/MCS/ParaGraph/ParaGraph.html

A Minimal MPI Program in C

#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[] )
{
    MPI_Init( &argc, &argv );
    printf( "Hello, world!\n" );
    MPI_Finalize();
    return 0;
}

Finding Out About the Environment Two important questions that arise early in a parallel program are: How many processes are participating in this computation? Which one am I? MPI provides functions to answer these questions: MPI_Comm_size reports the number of processes. MPI_Comm_rank reports the rank, a number between 0 and size-1, identifying the calling process

Better Hello (C)

#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    printf( "I am %d of %d\n", rank, size );
    MPI_Finalize();
    return 0;
}

Some Basic Concepts Processes can be collected into groups. Each message is sent in a context, and must be received in the same context. A group and context together form a communicator. A process is identified by its rank in the group associated with a communicator. There is a default communicator whose group contains all initial processes, called MPI_COMM_WORLD.
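As an illustration of groups and communicators (not from the slides), here is a minimal sketch using MPI_Comm_split to carve MPI_COMM_WORLD into two smaller communicators, one per parity of the world rank. The variable names and the parity-based split are chosen just for the example.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int world_rank, sub_rank, sub_size;
    MPI_Comm sub_comm;                      /* a communicator = group + context */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Processes with the same "color" (rank parity) join the same new communicator;
       the "key" (here the world rank) orders the ranks inside it. */
    MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);

    MPI_Comm_rank(sub_comm, &sub_rank);
    MPI_Comm_size(sub_comm, &sub_size);
    printf("World rank %d is rank %d of %d in its sub-communicator\n",
           world_rank, sub_rank, sub_size);

    MPI_Comm_free(&sub_comm);
    MPI_Finalize();
    return 0;
}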

MPI Datatypes
The data in a message to be sent or received is described by a triple (address, count, datatype), where an MPI datatype is recursively defined as:
predefined, corresponding to a data type of the host language (e.g., MPI_INT, MPI_DOUBLE_PRECISION)
a contiguous array of MPI datatypes
an indexed array of blocks of datatypes
an arbitrary structure of datatypes
There are MPI functions to construct custom datatypes, such as an array of (int, float) pairs, or a row of a matrix stored columnwise (see the sketch below).
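A sketch (not from the slides) of how the two custom datatypes just mentioned could be built, using MPI_Type_vector for a row of a column-major matrix and MPI_Type_create_struct for an (int, float) pair. The matrix dimension N and the pair layout are assumptions made for the example.

#include "mpi.h"
#include <stddef.h>   /* offsetof */

#define N 8           /* assumed matrix dimension, for illustration only */

typedef struct { int i; float f; } pair_t;

static void build_types(MPI_Datatype *rowtype, MPI_Datatype *pairtype)
{
    /* One row of an N x N matrix stored columnwise: N elements, stride N apart. */
    MPI_Type_vector(N, 1, N, MPI_DOUBLE, rowtype);
    MPI_Type_commit(rowtype);

    /* An (int, float) pair described field by field. */
    int          blocklens[2] = { 1, 1 };
    MPI_Aint     displs[2]    = { offsetof(pair_t, i), offsetof(pair_t, f) };
    MPI_Datatype types[2]     = { MPI_INT, MPI_FLOAT };
    MPI_Type_create_struct(2, blocklens, displs, types, pairtype);
    MPI_Type_commit(pairtype);
}

int main(int argc, char *argv[])
{
    MPI_Datatype rowtype, pairtype;
    MPI_Init(&argc, &argv);
    build_types(&rowtype, &pairtype);
    /* rowtype and pairtype can now be used as the datatype argument of MPI_Send/MPI_Recv. */
    MPI_Type_free(&rowtype);
    MPI_Type_free(&pairtype);
    MPI_Finalize();
    return 0;
}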

Basic MPI types
MPI datatype         C datatype
MPI_CHAR             char
MPI_SIGNED_CHAR      signed char
MPI_UNSIGNED_CHAR    unsigned char
MPI_SHORT            signed short
MPI_UNSIGNED_SHORT   unsigned short
MPI_INT              signed int
MPI_UNSIGNED         unsigned int
MPI_LONG             signed long
MPI_UNSIGNED_LONG    unsigned long
MPI_FLOAT            float
MPI_DOUBLE           double
MPI_LONG_DOUBLE      long double

MPI Tags Messages are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message. Messages can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive. Some non-MPI message-passing systems have called tags “message types”. MPI calls them tags to avoid confusion with datatypes.

MPI blocking send
MPI_Send(void *start, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
The message buffer is described by (start, count, datatype).
dest is the rank of the target process in the given communicator.
tag is the message identification number.

MPI Basic (Blocking) Receive
MPI_Recv(void *start, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Waits until a matching (on source and tag) message is received from the system, and the buffer can be used.
source is a rank in the communicator specified by comm, or MPI_ANY_SOURCE.
status contains further information.
Receiving fewer than count occurrences of datatype is OK, but receiving more is an error.
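A minimal point-to-point sketch (not from the slides), assuming the program is run with at least two processes: rank 0 sends an integer to rank 1, which sends it back incremented.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);           /* to rank 1, tag 0 */
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);  /* wait for the reply */
        printf("Rank 0 got back %d\n", value);                        /* prints 43 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        value++;
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}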

Retrieving Further Information
Status is a data structure allocated in the user's program. In C:

int recvd_tag, recvd_from, recvd_count;
MPI_Status status;
MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., &status);
recvd_tag  = status.MPI_TAG;
recvd_from = status.MPI_SOURCE;
MPI_Get_count( &status, datatype, &recvd_count );

More info A receive operation may accept messages from an arbitrary sender, but a send operation must specify a unique receiver. Source equals destination is allowed, that is, a process can send a message to itself.

Why is MPI simple? Many parallel programs can be written using just these six functions, only two of which are non-trivial:
MPI_Init
MPI_Finalize
MPI_Comm_size
MPI_Comm_rank
MPI_Send
MPI_Recv

Simple full example

#include <stdio.h>
#include <stdlib.h>   /* for exit() */
#include <mpi.h>

int main(int argc, char *argv[])
{
    const int tag = 42;              /* Message tag */
    int id, ntasks, source_id, dest_id, err, i;
    MPI_Status status;
    int msg[2];                      /* Message array */

    err = MPI_Init(&argc, &argv);    /* Initialize MPI */
    if (err != MPI_SUCCESS) {
        printf("MPI initialization failed!\n");
        exit(1);
    }
    err = MPI_Comm_size(MPI_COMM_WORLD, &ntasks);  /* Get nr of tasks */
    err = MPI_Comm_rank(MPI_COMM_WORLD, &id);      /* Get id of this process */
    if (ntasks < 2) {
        printf("You have to use at least 2 processors to run this program\n");
        MPI_Finalize();              /* Quit if there is only one processor */
        exit(0);
    }

Simple full example (Cont.)

    if (id == 0) {                   /* Process 0 (the receiver) does this */
        for (i = 1; i < ntasks; i++) {
            err = MPI_Recv(msg, 2, MPI_INT, MPI_ANY_SOURCE, tag,
                           MPI_COMM_WORLD, &status);     /* Receive a message */
            source_id = status.MPI_SOURCE;               /* Get id of sender */
            printf("Received message %d %d from process %d\n",
                   msg[0], msg[1], source_id);
        }
    } else {                         /* Processes 1 to N-1 (the senders) do this */
        msg[0] = id;                 /* Put own identifier in the message */
        msg[1] = ntasks;             /* and total number of processes */
        dest_id = 0;                 /* Destination address */
        err = MPI_Send(msg, 2, MPI_INT, dest_id, tag, MPI_COMM_WORLD);
    }

    err = MPI_Finalize();            /* Terminate MPI */
    if (id == 0) printf("Ready\n");
    return 0;
}

Agenda
Part 0 – the context (slides extracted from a lecture by Hanjun Kim, Princeton University)
Part 1 – Introduction: basics of parallel computing, six-function MPI, point-to-point communications
Part 2 – Advanced features of MPI: collective communication
Part 3 – Examples and how to program an MPI application

Collective communications
A single call handles the communication between all the processes in a communicator.
There are 3 types of collective communications:
Data movement (e.g. MPI_Bcast)
Reduction (e.g. MPI_Reduce)
Synchronization (e.g. MPI_Barrier)

Broadcast
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm);
One process (root) sends data to all the other processes in the same communicator.
Must be called by all the processes with the same arguments.
(Diagram: before the call only the root, P1, holds A B C D; after MPI_Bcast, P1–P4 all hold A B C D.)
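A small usage sketch (not from the slides): the root sets a parameter and broadcasts it so every rank holds the same value. The variable name and the value are illustrative.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, nsteps = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        nsteps = 1000;                  /* only the root knows the value initially */

    /* Every process, including the root, makes the same call. */
    MPI_Bcast(&nsteps, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d will run %d steps\n", rank, nsteps);

    MPI_Finalize();
    return 0;
}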

Gather
int MPI_Gather(void *sendbuf, int sendcnt, MPI_Datatype sendtype, void *recvbuf, int recvcnt, MPI_Datatype recvtype, int root, MPI_Comm comm)
One process (root) collects data from all the other processes in the same communicator.
Must be called by all the processes with the same arguments.
(Diagram: P1–P4 hold A, B, C, D respectively; after MPI_Gather the root, P1, holds A B C D.)
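A sketch (not from the slides): each rank contributes the square of its rank and the root gathers the values into an array. MAX_PROCS is an assumed upper bound on the process count, for illustration only.

#include "mpi.h"
#include <stdio.h>

#define MAX_PROCS 64   /* assumed upper bound on the number of processes */

int main(int argc, char *argv[])
{
    int rank, size, i;
    int mine, all[MAX_PROCS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    mine = rank * rank;   /* each process contributes one value */

    /* Root (rank 0) receives size values, one from each process, in rank order. */
    MPI_Gather(&mine, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (i = 0; i < size; i++)
            printf("Value from rank %d: %d\n", i, all[i]);

    MPI_Finalize();
    return 0;
}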

Gather to All
int MPI_Allgather(void *sendbuf, int sendcnt, MPI_Datatype sendtype, void *recvbuf, int recvcnt, MPI_Datatype recvtype, MPI_Comm comm)
Every process gathers the data of all the other processes in the same communicator; there is no root.
Must be called by all the processes with the same arguments.
(Diagram: P1–P4 hold A, B, C, D respectively; after MPI_Allgather every process holds A B C D.)
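The same idea sketched with MPI_Allgather (not from the slides): since every rank ends up with the whole array, there is no root argument and no rank 0 test. MAX_PROCS is again an assumed bound.

#include "mpi.h"
#include <stdio.h>

#define MAX_PROCS 64   /* assumed upper bound on the number of processes */

int main(int argc, char *argv[])
{
    int rank, size, mine, everyone[MAX_PROCS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    mine = rank;
    /* No root argument: every process receives the full array. */
    MPI_Allgather(&mine, 1, MPI_INT, everyone, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Rank %d sees the last contribution: %d\n", rank, everyone[size - 1]);

    MPI_Finalize();
    return 0;
}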

Reduction
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
One process (root) collects data from all the processes in the same communicator and combines it with an operation.
Predefined operations: MPI_SUM, MPI_MIN, MPI_MAX, MPI_PROD, logical AND, OR, XOR, and a few more.
MPI_Op_create(): user-defined operator.
(Diagram: P1–P4 hold A, B, C, D; after MPI_Reduce the root, P1, holds A+B+C+D.)
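A sketch (not from the slides): every rank provides a partial value (here simply its rank) and the root obtains the global sum with MPI_SUM.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    long partial, total = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    partial = rank;   /* stand-in for a locally computed partial result */

    /* Root (rank 0) receives the sum of all partial values. */
    MPI_Reduce(&partial, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks 0..%d = %ld (expected %d)\n",
               size - 1, total, size * (size - 1) / 2);

    MPI_Finalize();
    return 0;
}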

Synchronization
int MPI_Barrier(MPI_Comm comm)

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Barrier(MPI_COMM_WORLD);
    printf("Hello, world. I am %d of %d\n", rank, nprocs);
    MPI_Finalize();
    return 0;
}

Examples: the master/slaves (master/worker) pattern.
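A sketch of the master/slaves pattern (not spelled out on the slide): the master (rank 0) hands one work item to each worker and collects one result from each. The "work" here, squaring a made-up task number, is purely illustrative.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, w, task, result;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                          /* master */
        for (w = 1; w < size; w++) {
            task = w * 10;                    /* made-up work item */
            MPI_Send(&task, 1, MPI_INT, w, 0, MPI_COMM_WORLD);
        }
        for (w = 1; w < size; w++) {
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &status);
            printf("Master got %d from worker %d\n", result, status.MPI_SOURCE);
        }
    } else {                                  /* workers */
        MPI_Recv(&task, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        result = task * task;                 /* the "work" */
        MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}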

For more functions…
http://www.mpi-forum.org
http://www.llnl.gov/computing/tutorials/mpi/
http://www.nersc.gov/nusers/help/tutorials/mpi/intro/
http://www-unix.mcs.anl.gov/mpi/tutorial/
MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/)
Open MPI (http://www.open-mpi.org/)
http://w3.pppl.gov/~ethier/MPI_OpenMP_2011.pdf