Introduction to MPI programming Morris Law, SCID May 18/25, 2013

What is Message Passing Interface (MPI)?  Portable standard for communication between processes  Processes communicate by sending and receiving messages  Each process is a separate program  All data is private to its process

Multi-core programming  Most current CPUs have multiple cores, which can be utilized easily by compiling with OpenMP support  Programmers no longer need to rewrite a sequential code; they only add directives that instruct the compiler to parallelize it with OpenMP.

OpenMP example
/*
 * Sample program to test runtime of simple matrix multiply
 * with and without OpenMP on gcc tdm1 (mingw)
 * compile with gcc -fopenmp
 * (c) 2009, Rajorshi Biswas
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int i, j, k;
    int n;
    double temp;
    double start, end;

    printf("Enter dimension ('N' for 'NxN' matrix) (100-2000): ");
    scanf("%d", &n);
    assert(n >= 100 && n <= 2000);

    int **arr1 = malloc(sizeof(int*) * n);
    int **arr2 = malloc(sizeof(int*) * n);
    int **arr3 = malloc(sizeof(int*) * n);
    for (i = 0; i < n; ++i) {
        arr1[i] = malloc(sizeof(int) * n);
        arr2[i] = malloc(sizeof(int) * n);
        arr3[i] = malloc(sizeof(int) * n);
    }

    printf("Populating array with random values...\n");
    srand(time(NULL));
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            arr1[i][j] = (rand() % n);
            arr2[i][j] = (rand() % n);
        }
    }
    printf("Completed array init.\n");

    printf("Crunching without OMP...");
    fflush(stdout);
    start = omp_get_wtime();
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            temp = 0;
            for (k = 0; k < n; ++k) {
                temp += arr1[i][k] * arr2[k][j];
            }
            arr3[i][j] = temp;
        }
    }
    end = omp_get_wtime();
    printf(" took %f seconds.\n", end - start);

    printf("Crunching with OMP...");
    fflush(stdout);
    start = omp_get_wtime();
    #pragma omp parallel for private(i, j, k, temp)
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            temp = 0;
            for (k = 0; k < n; ++k) {
                temp += arr1[i][k] * arr2[k][j];
            }
            arr3[i][j] = temp;
        }
    }
    end = omp_get_wtime();
    printf(" took %f seconds.\n", end - start);

    return 0;
}

Compiling for OpenMP support
 GCC
    gcc -fopenmp -o foo foo.c
    gfortran -fopenmp -o foo foo.f
 Intel Compiler
    icc -openmp -o foo foo.c
    ifort -openmp -o foo foo.f
 PGI Compiler
    pgcc -mp -o foo foo.c
    pgf90 -mp -o foo foo.f

What is Message Passing Interface (MPI)?  It is a library specification, not a language!  Different compilers can be used, but all processes must use the same MPI library, e.g. MPICH, LAM/MPI, Open MPI  Two versions of the standard are in common use, MPI-1 and MPI-2  Programs are written in a standard sequential language: Fortran, C, C++, etc.

Basic Idea of Message Passing Interface (MPI)  MPI environment Initialize, manage, and terminate communication among processes  Communication between processes Point-to-point communication, e.g. send, receive Collective communication, e.g. broadcast, gather  Complicated data structures Communicate data such as matrices and blocks of memory effectively
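As an illustration of point-to-point communication, here is a minimal sketch (not taken from the original slides) in which process 0 sends one integer to process 1; it assumes the job is started with at least two processes.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                        /* data to send */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Process 1 received %d from process 0\n", value);
    }

    MPI_Finalize();
    return 0;
}

Run it with, for example, mpirun -np 2 -machinefile machines ./send-recv; MPI_Recv blocks until a matching message arrives.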

Is MPI Large or Small?  MPI is large More than one hundred functions But not necessarily a measure of complexity  MPI is small Many parallel programs can be written with just 6 basic functions  MPI is just right One can access flexibility when it is required One need not master all MPI functions

When to Use MPI?  You need a portable parallel program  You are writing a parallel library  You care about performance  You have a problem that can be solved in parallel

F77/F90 and C/C++ MPI library calls  Fortran 77/90 uses subroutines CALL is used to invoke the library call Nothing is returned; the error code is placed in the last argument All variables are passed by reference  C/C++ uses functions The name alone is used to invoke the library call The function returns an integer value (an error code) Variables are passed by value, unless otherwise specified
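A small C sketch (assumed, not from the slides) of the C convention: the return value of each call is the error code and can be compared against MPI_SUCCESS; the Fortran convention is shown only as a comment. Note that with MPI's default error handler, errors normally abort the job before the call returns, so the check is mainly illustrative.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, err;

    MPI_Init(&argc, &argv);

    /* C: the call itself returns the error code */
    err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (err != MPI_SUCCESS)
        MPI_Abort(MPI_COMM_WORLD, err);

    /* Fortran equivalent (for comparison, as a comment only):
     *   CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
     * the error code comes back in the final argument ierr */

    printf("rank %d: call returned MPI_SUCCESS\n", rank);
    MPI_Finalize();
    return 0;
}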

Types of Communication  Point-to-Point Communication: communication involving only two processes  Collective Communication: communication that involves a group of processes

Implementation of MPI

Getting started with MPI  Create a file called “machines”  The content of “machines” (8 nodes):
compute-0-0
compute-0-1
compute-0-2
…
compute-0-7

MPI Commands  mpicc - compiles an MPI program
mpicc -o foo foo.c
mpif77 -o foo foo.f
mpif90 -o foo foo.f90
 mpirun - starts the execution of MPI programs
mpirun -v -np 2 -machinefile machines foo

Basic MPI Functions

MPI Environment  Initialize initialize environment  Finalize terminate environment  Communicator create default communication group for all processes  Version establish version of MPI

MPI Environment  Total processes determine the total number of processes spawned  Rank/Process ID assign an identifier to each process  Timing Functions MPI_Wtime, MPI_Wtick

MPI_INIT  Initializes the MPI environment  Assigns all spawned processes to MPI_COMM_WORLD, the default communicator  C int MPI_Init(int *argc, char ***argv) Input Parameters  argc - pointer to the number of arguments  argv - pointer to the argument vector  Fortran CALL MPI_INIT(error_code) INTEGER error_code - variable that gets set to an error code

MPI_FINALIZE  Terminates the MPI environment  C int MPI_Finalize()  Fortran CALL MPI_FINALIZE(error_code) INTEGER error_code - variable that gets set to an error code

MPI_ABORT  This routine makes a “best attempt” to abort all tasks in the group of comm  Usually used in error handling  C int MPI_Abort(MPI_Comm comm, int errorcode) Input Parameters  comm - communicator of tasks to abort  errorcode - error code to return to the invoking environment  Fortran CALL MPI_ABORT(COMM, ERRORCODE, IERROR) INTEGER COMM, ERRORCODE, IERROR
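A short usage sketch (assumed, not from the slides): abort the whole job when a required input file, here the hypothetical input.dat, cannot be opened on rank 0.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    FILE *fp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        fp = fopen("input.dat", "r");            /* hypothetical input file */
        if (fp == NULL) {
            fprintf(stderr, "Cannot open input.dat, aborting\n");
            MPI_Abort(MPI_COMM_WORLD, 1);        /* kills all tasks in MPI_COMM_WORLD */
        }
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}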

MPI_GET_VERSION  Gets the version of the MPI standard in use  C int MPI_Get_version(int *version, int *subversion) Output Parameters  version - major version of MPI  subversion - minor version (subversion) of MPI  Fortran CALL MPI_GET_VERSION(version, subversion, error_code) INTEGER error_code - variable that gets set to an error code

MPI_COMM_SIZE  This finds the number of processes in a communication group  C int MPI_Comm_size(MPI_Comm comm, int *size) Input Parameter  comm - communicator (handle) Output Parameter  size - number of processes in the group of comm (integer)  Fortran CALL MPI_COMM_SIZE(comm, size, error_code) INTEGER error_code - variable that gets set to an error code  Using MPI_COMM_WORLD as comm returns the total number of processes started

MPI_COMM_RANK  This gives the rank/identification number of a process in a communication group  C int MPI_Comm_rank(MPI_Comm comm, int *rank) Input Parameter  comm - communicator (handle) Output Parameter  rank - rank/id number of the calling process (integer)  Fortran CALL MPI_COMM_RANK(comm, rank, error_code) INTEGER error_code - variable that gets set to an error code  Using MPI_COMM_WORLD as comm returns the rank of the process among all processes that were started

Timing Functions – MPI_WTIME  MPI_Wtime() - returns a floating point number of seconds, representing elapsed wall-clock time.  C double MPI_Wtime(void)  Fortran DOUBLE PRECISION MPI_WTIME()  The times returned are local to the node/process that made the call.

Timing Functions – MPI_WTICK  MPI_Wtick() - returns a double precision number of seconds between successive clock ticks.  C double MPI_Wtick(void)  Fortran DOUBLE PRECISION MPI_WTICK()  The times returned are local to the node/process that made the call.
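A sketch of the usual timing pattern (assumed, not from the slides): bracket the region of interest with two MPI_Wtime calls and report the difference, with MPI_Wtick giving the timer resolution.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    double t0, t1, tick;
    long i, s = 0;

    MPI_Init(&argc, &argv);

    tick = MPI_Wtick();                 /* resolution of the timer, in seconds */
    t0 = MPI_Wtime();
    for (i = 0; i < 100000000L; i++)    /* some work to time */
        s += i;
    t1 = MPI_Wtime();

    printf("work took %f s (timer resolution %g s, sum %ld)\n", t1 - t0, tick, s);

    MPI_Finalize();
    return 0;
}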

Hello World 1  Echo the MPI version  MPI Functions Used MPI_Init MPI_Get_version MPI_Finalize

Hello World 1 (C)
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int version, subversion;

    MPI_Init(&argc, &argv);
    MPI_Get_version(&version, &subversion);
    printf("Hello world!\n");
    printf("Your MPI Version is: %d.%d\n", version, subversion);
    MPI_Finalize();
    return(0);
}

Hello World 1 (Fortran)
      program main
      include 'mpif.h'
      integer ierr, version, subversion
      call MPI_INIT(ierr)
      call MPI_GET_VERSION(version, subversion, ierr)
      print *, 'Hello world!'
      print *, 'Your MPI Version is: ', version, '.', subversion
      call MPI_FINALIZE(ierr)
      end

Hello World 2  Echo the process rank and the total number of process in the group  MPI Functions Used MPI_Init MPI_Comm_rank MPI_Comm_size MPI_Finalize

Hello World 2 (C)
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello world! I am %d of %d\n", rank, size);
    MPI_Finalize();
    return(0);
}

Hello World 2 (Fortran)
      program main
      include 'mpif.h'
      integer rank, size, ierr
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
      print *, 'Hello world! I am ', rank, ' of ', size
      call MPI_FINALIZE(ierr)
      end

MPI C Datatypes
MPI Datatype            C Datatype
MPI_CHAR                signed char
MPI_SHORT               signed short int
MPI_INT                 signed int
MPI_LONG                signed long int
MPI_UNSIGNED_CHAR       unsigned char
MPI_UNSIGNED_SHORT      unsigned short int

MPI C Datatypes
MPI Datatype            C Datatype
MPI_UNSIGNED            unsigned int
MPI_UNSIGNED_LONG       unsigned long int
MPI_FLOAT               float
MPI_DOUBLE              double
MPI_LONG_DOUBLE         long double
MPI_BYTE
MPI_PACKED

MPI Fortran Datatypes
MPI Datatype            Fortran Datatype
MPI_INTEGER             INTEGER
MPI_REAL                REAL
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_COMPLEX             COMPLEX
MPI_LOGICAL             LOGICAL
MPI_CHARACTER           CHARACTER
MPI_BYTE
MPI_PACKED
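The MPI datatype passed to a communication call should match the language type of the buffer, and the count gives the number of elements, not bytes. A small sketch (assumed, not from the slides): sending an array of 100 doubles uses count = 100 and MPI_DOUBLE.

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    double a[100];
    int rank, i;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (i = 0; i < 100; i++) a[i] = i * 0.5;
        MPI_Send(a, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);   /* count 100, type matches double a[] */
    } else if (rank == 1) {
        MPI_Recv(a, 100, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        printf("received a[99] = %f\n", a[99]);
    }

    MPI_Finalize();
    return 0;
}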

Parallelization example 1: serial-pi.c
#include <stdio.h>

static long num_steps = 100000;   /* number of integration steps; 100000 chosen here as an example value */
double step;

int main()
{
    int i;
    double x, pi, sum = 0.0;

    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f\n", pi);
    return 0;
}
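Background on why this computes pi: the loop is a midpoint-rule approximation of an integral whose exact value is pi,

    \int_0^1 \frac{4}{1+x^2}\, dx = 4\arctan(1) = \pi \approx \sum_{i=0}^{N-1} \frac{4}{1+x_i^2}\, \Delta x,  with  x_i = (i+0.5)\,\Delta x,  \Delta x = 1/N,

which is exactly the sum of 4/(1+x*x) at the midpoints x = (i+0.5)*step multiplied by step.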

Parallelizing serial-pi.c into mpi-pi.c - Step 1: Adding the MPI environment
#include "mpi.h"
#include <stdio.h>

static long num_steps = 100000;   /* example value */
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, pi, sum = 0.0;

    MPI_Init(&argc, &argv);
    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f\n", pi);
    MPI_Finalize();
    return 0;
}

Parallelizing serial-pi.c into mpi-pi.c - Step 2: Adding variables to print ranks
#include "mpi.h"
#include <stdio.h>

static long num_steps = 100000;   /* example value */
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, pi, sum = 0.0;
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f, Processor %d of %d \n", pi, rank, size);
    MPI_Finalize();
    return 0;
}

Parallelizing serial-pi.c into mpi-pi.c - Step 3: Divide the workload
#include "mpi.h"
#include <stdio.h>

static long num_steps = 100000;   /* example value */
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, mypi, sum = 0.0;
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = rank; i < num_steps; i += size) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    mypi = step * sum;
    printf("Est Pi= %f, Processor %d of %d \n", mypi, rank, size);
    MPI_Finalize();
    return 0;
}
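As a concrete illustration (not spelled out on the slide), the loop for (i = rank; i < num_steps; i += size) hands out iterations round-robin. With size = 4:

    rank 0: i = 0, 4, 8, 12, ...
    rank 1: i = 1, 5, 9, 13, ...
    rank 2: i = 2, 6, 10, 14, ...
    rank 3: i = 3, 7, 11, 15, ...

Each rank therefore accumulates a partial sum over roughly num_steps/size terms; combining the partial results is Step 4.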

Parallelizing serial-pi.c into mpi-pi.c - Step 4: Collect partial results
#include "mpi.h"
#include <stdio.h>

static long num_steps = 100000;   /* example value */
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, mypi, pi, sum = 0.0;
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = rank; i < num_steps; i += size) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    mypi = step * sum;
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("Est Pi= %f \n", pi);
    MPI_Finalize();
    return 0;
}
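MPI_Reduce is used above without further comment; its arguments, as defined by the MPI standard, are (send buffer, receive buffer, count, datatype, operation, root, communicator). A commented copy of the call:

MPI_Reduce(&mypi,           /* send buffer: this rank's partial sum             */
           &pi,             /* receive buffer: result, valid only on the root   */
           1,               /* count: one element                               */
           MPI_DOUBLE,      /* datatype of the element                          */
           MPI_SUM,         /* reduction operation: add the contributions       */
           0,               /* root: rank 0 receives the combined result        */
           MPI_COMM_WORLD); /* communicator                                     */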

Compile and run the MPI program
$ mpicc -o mpi-pi mpi-pi.c
$ mpirun -np 4 -machinefile machines mpi-pi

Parallelization example 2: serial-mc-pi.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main(int argc, char *argv[])
{
    long in, i, n;
    double x, y, q;
    time_t now;

    in = 0;
    srand(time(&now));
    printf("Input no of samples : ");
    scanf("%ld", &n);
    for (i = 0; i < n; i++) {
        x = rand() / (RAND_MAX + 1.0);
        y = rand() / (RAND_MAX + 1.0);
        if ((x*x + y*y) < 1) {
            in++;
        }
    }
    q = ((double)4.0) * in / n;
    printf("pi = %.20lf\n", q);
    printf("rmse = %.20lf\n", sqrt(((double) q*(4-q))/n));
    return 0;
}
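Background on the Monte Carlo estimate: the points (x, y) are uniform in the unit square, and the probability that a point lands inside the quarter circle x^2 + y^2 < 1 equals its area,

    P(x^2 + y^2 < 1) = \frac{\pi}{4},  so  \pi \approx q = 4 \cdot \frac{in}{n}.

Since q is 4 times a binomial proportion, its standard error is \sqrt{q(4-q)/n}, which is the "rmse" the program prints.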

Parallelization example 2: mpi-mc-pi.c
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main(int argc, char *argv[])
{
    long in, i, n;
    double x, y, q, Q;
    time_t now;
    int rank, size;

    MPI_Init(&argc, &argv);
    in = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    srand(time(&now) + rank);
    if (rank == 0) {
        printf("Input no of samples : ");
        scanf("%ld", &n);
    }
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);
    for (i = 0; i < n; i++) {
        x = rand() / (RAND_MAX + 1.0);
        y = rand() / (RAND_MAX + 1.0);
        if ((x*x + y*y) < 1) {
            in++;
        }
    }
    q = ((double)4.0) * in / n;
    MPI_Reduce(&q, &Q, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    Q = Q / size;
    if (rank == 0) {
        printf("pi = %.20lf\n", Q);
        printf("rmse = %.20lf\n", sqrt(((double) Q*(4-Q))/n/size));
    }
    MPI_Finalize();
    return 0;
}
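Two points the program leaves implicit: MPI_Bcast distributes the sample count n from rank 0 (which read it from the user) to every other rank, and because each of the size ranks produces its own estimate q from n samples, dividing the MPI_SUM result by size gives the average, equivalent to one estimate from n*size samples. A commented copy of the broadcast:

MPI_Bcast(&n,              /* buffer: holds the value on the root, receives it on the others */
          1,               /* count: one element                                             */
          MPI_LONG,        /* datatype matching 'long n'                                     */
          0,               /* root rank that supplies the value                              */
          MPI_COMM_WORLD); /* communicator                                                   */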

Compile and run mpi-mc-pi
$ mpicc -o mpi-mc-pi mpi-mc-pi.c
$ mpirun -np 4 -machinefile machines mpi-mc-pi

The End