MPI Collective Communication
CS 524 – High-Performance Computing

Collective Communication
- Communication involving a group of processes
- Called by all processes in the communicator
- Communication takes place within a communicator
- Built up from point-to-point communications

Characteristics of Collective Communication
- Collective communication will not interfere with point-to-point communication
- All processes must call the collective function
- Substitutes for a sequence of point-to-point function calls
- Synchronization is not guaranteed (except for barrier)
- No non-blocking collective functions; function return implies completion
- No tags are needed
- Receive buffer must be exactly the right size

Types of Collective Communication
- Synchronization: barrier
- Data exchange: broadcast; gather, scatter, all-gather, and all-to-all exchange; variable-size-and-location versions of the above
- Global reduction (collective operations): sum, minimum, maximum, etc.

Barrier Synchronization
- Red light for each processor: turns green when all processors have arrived
- A process calling it is blocked until all processes in the group (communicator) have called it

int MPI_Barrier(MPI_Comm comm)
- comm: communicator whose processes need to be synchronized
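A minimal sketch of MPI_Barrier in use:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d: before the barrier\n", rank);
    MPI_Barrier(MPI_COMM_WORLD);   /* no process passes this point until all have reached it */
    printf("Process %d: after the barrier\n", rank);

    MPI_Finalize();
    return 0;
}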

Broadcast
- One-to-all communication: the same data is sent from the root process to all others in the communicator
- All processes must call the function specifying the same root and communicator

int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- buf: starting address of buffer (used for both sending and receiving)
- count: number of elements to be sent/received
- datatype: MPI datatype of elements
- root: rank of sending process
- comm: MPI communicator of processes involved
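A minimal sketch: the root chooses a value (100 here is illustrative) and broadcasts it to all processes:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, n = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        n = 100;                       /* root determines the value */
    /* every process calls MPI_Bcast with the same root and communicator */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d now has n = %d\n", rank, n);

    MPI_Finalize();
    return 0;
}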

Scatter
- One-to-all communication: different data sent to each process in the communicator (in rank order)
- Example: partition an array equally among the processes

int MPI_Scatter(void *sbuf, int scount, MPI_Datatype stype, void *rbuf, int rcount, MPI_Datatype rtype, int root, MPI_Comm comm)
- sbuf and rbuf: starting addresses of send and receive buffers
- scount and rcount: number of elements sent to / received by each process
- stype and rtype: MPI datatypes of sent/received data
- root: rank of sending process
- comm: MPI communicator of processes involved
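A minimal sketch of the array-partitioning example, assuming an illustrative chunk size of 4 elements per process:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 4   /* elements per process; illustrative value */

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf = NULL;
    int recvbuf[CHUNK];

    if (rank == 0) {                     /* only the root needs the full array */
        sendbuf = malloc(size * CHUNK * sizeof(int));
        for (int i = 0; i < size * CHUNK; i++)
            sendbuf[i] = i;
    }
    /* block i of sendbuf goes to process i, in rank order */
    MPI_Scatter(sendbuf, CHUNK, MPI_INT, recvbuf, CHUNK, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received elements %d..%d\n", rank, recvbuf[0], recvbuf[CHUNK - 1]);

    if (rank == 0) free(sendbuf);
    MPI_Finalize();
    return 0;
}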

Gather
- All-to-one communication: different data collected by the root process from the other processes in the communicator
- Collection is done in rank order
- MPI_Gather has the same arguments as MPI_Scatter
- Receive arguments are only meaningful at the root
- Example: collect an array from data held by different processes (see the matrix-vector example below)

int MPI_Gather(void *sbuf, int scount, MPI_Datatype stype, void *rbuf, int rcount, MPI_Datatype rtype, int root, MPI_Comm comm)

Scatter and Gather
[Diagram: the root process holds blocks A, B, C, D; Scatter delivers one block to each of processes 0-3 in rank order, and Gather collects one block from each of processes 0-3 back into a single buffer at the root.]

Example
Matrix-vector multiply with matrix A partitioned row-wise among 4 processes. Partial results are gathered from all processes at the end of the computation.

double A[25][100], x[100], ypart[25], ytotal[100];
int i, j, root = 0;

for (i = 0; i < 25; i++) {
    ypart[i] = 0.0;                    /* accumulate the local dot product */
    for (j = 0; j < 100; j++)
        ypart[i] = ypart[i] + A[i][j] * x[j];
}
MPI_Gather(ypart, 25, MPI_DOUBLE, ytotal, 25, MPI_DOUBLE, root, MPI_COMM_WORLD);

All-Gather and All-to-All (1)
- All-gather: all processes, rather than just the root, gather data from the group
- All-to-all: all processes, rather than just the root, scatter data to the group; all processes receive data from all processes in rank order
- No root process is specified
- Send and receive arguments are significant for all processes

All-Gather and All-to-All (2)

int MPI_Allgather(void *sbuf, int scount, MPI_Datatype stype, void *rbuf, int rcount, MPI_Datatype rtype, MPI_Comm comm)
int MPI_Alltoall(void *sbuf, int scount, MPI_Datatype stype, void *rbuf, int rcount, MPI_Datatype rtype, MPI_Comm comm)

- scount: number of elements sent to each process; for all-to-all communication, the size of sbuf should be scount*p (p = number of processes)
- rcount: number of elements received from any process; the size of rbuf should be rcount*p (p = number of processes)
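A minimal sketch of MPI_Allgather, assuming each process contributes a single integer (its rank) and every process receives the full list:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int mine = rank;                             /* one element contributed per process */
    int *all = malloc(size * sizeof(int));       /* rcount * p elements */

    /* no root: every process both sends its element and receives the full result */
    MPI_Allgather(&mine, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);

    if (rank == 0)
        for (int i = 0; i < size; i++)
            printf("all[%d] = %d\n", i, all[i]);

    free(all);
    MPI_Finalize();
    return 0;
}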

Global Reduction Operations (1)
- Used to compute a result involving data distributed over a group of processes
- Result is placed in a specified process or in all processes
- Examples: global sum or product, global maximum or minimum, global user-defined operation

Dj = D(0,j) * D(1,j) * D(2,j) * ... * D(n-1,j)
- D(i,j) is the jth data item held by the ith process
- n is the total number of processes in the group
- * is the reduction operation
- Dj is the result of the reduction operation applied to the jth elements held by all processes in the group

Global Reduction Operations (2)

int MPI_Reduce(void *sbuf, void *rbuf, int count, MPI_Datatype stype, MPI_Op op, int root, MPI_Comm comm)
int MPI_Allreduce(void *sbuf, void *rbuf, int count, MPI_Datatype stype, MPI_Op op, MPI_Comm comm)
int MPI_Reduce_scatter(void *sbuf, void *rbuf, int *rcounts, MPI_Datatype stype, MPI_Op op, MPI_Comm comm)

Global Reduction Operations (3)
- MPI_Reduce returns the result to a single process (root)
- MPI_Allreduce returns the result to all processes in the group
- MPI_Reduce_scatter scatters the vector that results from the reduce operation across all processes
- sbuf: address of send buffer
- rbuf: address of receive buffer (for MPI_Reduce, significant only at the root process)
- rcounts: integer array containing the counts of elements received by each process
- op: reduce operation, which may be MPI predefined or user-defined (created with MPI_Op_create)
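A minimal sketch of a global sum with MPI_Allreduce, where every process contributes one value and every process receives the total:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    double local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = rank + 1.0;                  /* each process contributes one value */
    /* every process receives the sum of all local values */
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    printf("Process %d: global sum = %f\n", rank, total);

    MPI_Finalize();
    return 0;
}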

Predefined Reduction Operations

MPI name      Function
MPI_MAX       Maximum
MPI_MIN       Minimum
MPI_SUM       Sum
MPI_PROD      Product
MPI_LAND      Logical AND
MPI_BAND      Bitwise AND
MPI_LOR       Logical OR
MPI_BOR       Bitwise OR
MPI_LXOR      Logical exclusive OR
MPI_BXOR      Bitwise exclusive OR
MPI_MAXLOC    Maximum and location
MPI_MINLOC    Minimum and location
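Operations not in this table can be registered with MPI_Op_create, mentioned earlier. A minimal sketch, assuming an illustrative element-wise maximum-of-absolute-values operation:

#include <mpi.h>
#include <stdio.h>
#include <math.h>

/* user-defined reduction: inout[i] = max(|in[i]|, |inout[i]|) */
void absmax(void *in, void *inout, int *len, MPI_Datatype *dtype) {
    double *a = (double *)in, *b = (double *)inout;
    for (int i = 0; i < *len; i++) {
        double x = fabs(a[i]), y = fabs(b[i]);
        b[i] = (x > y) ? x : y;
    }
}

int main(int argc, char *argv[]) {
    int rank;
    double local, result;
    MPI_Op op;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = (rank % 2) ? -(double)rank : (double)rank;   /* some signed values */

    MPI_Op_create(absmax, 1 /* commutative */, &op);
    MPI_Reduce(&local, &result, 1, MPI_DOUBLE, op, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("largest |value| = %f\n", result);

    MPI_Op_free(&op);
    MPI_Finalize();
    return 0;
}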

Minloc and Maxloc
- Designed to compute a global minimum/maximum and an index associated with the extreme value
- Common application: the index is the rank of the process that held the extreme value
- If more than one extreme exists, the index returned is the first
- Designed to work on operands consisting of a value-and-index pair; MPI defines special datatypes for such pairs: MPI_FLOAT_INT, MPI_DOUBLE_INT, MPI_LONG_INT, MPI_2INT, MPI_SHORT_INT, MPI_LONG_DOUBLE_INT

Example

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, root;
    struct {
        double value;
        int rank;
    } in, out;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    in.value = rank + 1;
    in.rank = rank;
    root = 0;
    MPI_Reduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC, root, MPI_COMM_WORLD);
    if (rank == root)
        printf("PE %i: max = %f at rank %i\n", rank, out.value, out.rank);
    MPI_Finalize();
}

Output (with 5 processes): PE 0: max = 5.000000 at rank 4

Variable-Size-Location Collective Functions
- Allow varying sizes and relative locations of messages in the buffer
- Examples: MPI_Scatterv, MPI_Gatherv, MPI_Allgatherv, MPI_Alltoallv
- Advantages: more flexibility in writing code, less need to copy data into temporary buffers, more compact code, and the vendor's implementation may be optimal
- Disadvantage: may be less efficient than the fixed size/location functions

Scatterv and Gatherv

int MPI_Scatterv(void *sbuf, int *scount, int *displs, MPI_Datatype stype, void *rbuf, int rcount, MPI_Datatype rtype, int root, MPI_Comm comm)
int MPI_Gatherv(void *sbuf, int scount, MPI_Datatype stype, void *rbuf, int *rcount, int *displs, MPI_Datatype rtype, int root, MPI_Comm comm)

- *scount and *rcount: integer arrays containing the number of elements sent to (Scatterv) or received from (Gatherv) each process
- *displs: integer array specifying the displacements, relative to the start of the buffer, at which to send data to (Scatterv) or place data from (Gatherv) the corresponding process
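A minimal sketch of MPI_Scatterv, assuming the illustrative rule that process i receives i+1 integers:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *scounts = NULL, *displs = NULL, *sendbuf = NULL;
    int rcount = rank + 1;                       /* process i receives i+1 elements */
    int *recvbuf = malloc(rcount * sizeof(int));

    if (rank == 0) {                             /* only the root sets up counts, displacements, data */
        scounts = malloc(size * sizeof(int));
        displs  = malloc(size * sizeof(int));
        int total = 0;
        for (int i = 0; i < size; i++) {
            scounts[i] = i + 1;                  /* variable count per process */
            displs[i]  = total;                  /* offset of process i's block in sendbuf */
            total += scounts[i];
        }
        sendbuf = malloc(total * sizeof(int));
        for (int i = 0; i < total; i++)
            sendbuf[i] = i;
    }

    MPI_Scatterv(sendbuf, scounts, displs, MPI_INT,
                 recvbuf, rcount, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received %d elements starting with %d\n", rank, rcount, recvbuf[0]);

    free(recvbuf);
    if (rank == 0) { free(sendbuf); free(scounts); free(displs); }
    MPI_Finalize();
    return 0;
}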