The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)
Special Topics: MPI jobs
Maha Dessokey, Electronic Research Institute
Joint EPIKH/EUMEDGRID-Support Event in Cairo

Table of Contents
– MPI and its implementations
– Wrapper script for mpi-start
– Hooks for mpi-start
– Defining the job and executable
– Running the MPI job
– References

Some Basic Concepts
The Message Passing Interface (MPI) is a standard for writing parallel applications.
An MPI process consists of a C/C++ or Fortran 77 program that communicates with other MPI processes by calling MPI routines.
All names of MPI routines and constants in both C and Fortran begin with the prefix MPI_ to avoid name collisions.
– Fortran routine names are all upper case, while C routine names are mixed case.
– In general, C MPI routines return an int error code, while Fortran MPI routines return the error code through an IERROR argument.

Basic Structures of MPI Programs

Header files
All sub-programs that contain calls to MPI subroutines MUST include the MPI header file.
The header file contains the definitions of MPI constants, MPI types and functions.
C: #include <mpi.h>
Fortran: include 'mpif.h'

Initializing MPI
The first MPI routine called in any MPI program must be the initialisation routine MPI_INIT.
– Every MPI program must call this routine once, before any other MPI routine.
The C version of the routine takes pointers to argc and argv as arguments:
int MPI_Init(int *argc, char ***argv);
The Fortran version takes no arguments other than the error code:
MPI_INIT(IERROR)

MPI Communicator
A communicator is a variable identifying a group of processes that are allowed to communicate with each other.
– There is a default communicator, MPI_COMM_WORLD, which identifies the group of all processes.
The processes are ordered and numbered consecutively from 0 (in both Fortran and C); the number of each process is known as its rank.
The rank identifies each process within the communicator.

MPI Functions format
C: Error = MPI_Xxx (parameter, ...);
Fortran: CALL MPI_XXX (parameter, IERROR)

Communicator Size
How many processes are associated with a communicator? The answer is returned in SIZE.
C: MPI_Comm_size (MPI_Comm comm, int *size);
Fortran: CALL MPI_COMM_SIZE (COMM, SIZE, IERR)

Process Rank
What is the ID of a process within a group? The answer is returned in RANK.
C: MPI_Comm_rank (MPI_Comm comm, int *rank);
Fortran: CALL MPI_COMM_RANK (COMM, RANK, IERR)

Finalizing MPI
An MPI program should call the MPI routine MPI_FINALIZE when all communications have completed. This routine cleans up all MPI data structures. Once it has been called, no other MPI routines may be called.
C: int MPI_Finalize ();
Fortran: CALL MPI_FINALIZE (IERR)

Point-to-point & collective communication
A point-to-point communication always involves exactly two processes: one process sends a message to the other. This distinguishes it from collective communication, which involves a whole group of processes at one time.

Blocking and Non-Blocking Communication
– Blocking communication: slow and simple.
– Non-blocking communication: fast, but more complex and error-prone. Between the initiation and the completion of the communication the program can do some useful computation (latency hiding), but the programmer has to insert code to check for completion.

The standard blocking send/receive
MPI_Send (buffer, count, type, dest, tag, comm)
The format of the standard blocking receive is:
MPI_Recv (buf, count, datatype, source, tag, comm, status)
Where:
– buf is the address where the data should be placed once received (the receive buffer)
– count is the number of elements that buf can contain
– datatype is the MPI datatype of the message
– source is the rank of the source of the message in the group associated with the communicator comm
– tag is used by the receiving process to specify which message the receiver is waiting for
– comm is the communicator
– status contains the status of the receive operation
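For illustration, a minimal sketch in C of a blocking exchange between rank 0 and rank 1 (the payload value and the message tag are arbitrary choices, not taken from the original slides):

/* sketch: rank 0 sends one integer to rank 1 using blocking calls */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value = 42;          /* example payload */
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}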

The standard non-blocking send/receive
MPI_Isend (buffer, count, type, dest, tag, comm, request)
MPI_Irecv (buffer, count, type, source, tag, comm, request)
The non-blocking routines have the same arguments as their blocking counterparts, plus one extra argument.
– This argument, request, is important: it provides a handle that is used to test whether the communication has completed.

Waiting and Testing for Completion /1
A call to the MPI_WAIT subroutine causes the code to wait until the communication identified by req has completed.
C: MPI_Wait (MPI_Request *req, MPI_Status *status);
Fortran: MPI_WAIT (req, status, ierr)

Waiting and Testing for Completion /2
A call to the MPI_TEST subroutine sets flag to true if the communication identified by req has completed, and to false otherwise.
C: MPI_Test (MPI_Request *req, int *flag, MPI_Status *status);
Fortran: MPI_TEST (req, flag, status, ierr)
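For illustration, a minimal sketch combining MPI_Isend/MPI_Irecv with MPI_Wait (assumes exactly two processes; the payload and tag are arbitrary and not from the original slides):

/* sketch: non-blocking exchange between rank 0 and rank 1, completed with MPI_Wait */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, send_val, recv_val;
    MPI_Request sreq, rreq;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    send_val = rank;                       /* example payload: the local rank */
    int partner = (rank == 0) ? 1 : 0;     /* assumes exactly 2 processes */
    MPI_Irecv(&recv_val, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &rreq);
    MPI_Isend(&send_val, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &sreq);
    /* ... useful computation could overlap the communication here ... */
    MPI_Wait(&sreq, &status);
    MPI_Wait(&rreq, &status);
    printf("Rank %d received %d\n", rank, recv_val);
    MPI_Finalize();
    return 0;
}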

Fortran – MPI Data types

C – MPI Data types

Collective Communication
What distinguishes collective communication from point-to-point communication is that it always involves every process in the specified communicator. To perform a collective communication on a subset of the processes in a communicator, a new communicator has to be created.

Collective Communication
The following table (not reproduced here) lists 16 MPI collective communication subroutines, divided into four categories. The subroutines printed in boldface are used most frequently. All MPI collective communication subroutines are blocking.

MPI_BCAST
The subroutine MPI_BCAST broadcasts a message from a specific process, called the root, to all the other processes in the communicator given as input.
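For illustration, a minimal sketch of a broadcast in C (the value broadcast by the root is an arbitrary example, not from the original slides):

/* sketch: the root (rank 0) broadcasts one integer to every process */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) value = 100;            /* example data, chosen arbitrarily */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d now has value %d\n", rank, value);
    MPI_Finalize();
    return 0;
}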

MPI_GATHER
The subroutine MPI_GATHER transmits data from all the processes in the communicator to a single receiving process.
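For illustration, a minimal sketch where each process contributes its rank and rank 0 gathers them into an array (an arbitrary example, not from the original slides):

/* sketch: every process contributes its rank; rank 0 gathers them into an array */
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, size, i, *all = NULL;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) all = malloc(size * sizeof(int));   /* receive buffer needed on the root only */
    MPI_Gather(&rank, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        for (i = 0; i < size; i++) printf("got %d from rank %d\n", all[i], i);
        free(all);
    }
    MPI_Finalize();
    return 0;
}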

MPI_ALLGATHER
The subroutine MPI_ALLGATHER concatenates the data from all processes and delivers the result to every process in the communicator. Each process in the group, in effect, performs a one-to-all broadcast.

MPI_REDUCE
The subroutine MPI_REDUCE performs reduction operations, such as the summation of data distributed over the processes, and brings the result to the root process.
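For illustration, a minimal sketch that sums one value per process into the root (the per-process contribution is an arbitrary example, not from the original slides). Replacing MPI_Reduce with MPI_Allreduce, which takes no root argument, would deliver the result to every rank, as described on the next slide:

/* sketch: sum one value per process and deliver the total to rank 0 */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, local, total = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    local = rank + 1;                      /* example contribution from each process */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("Sum over all ranks: %d\n", total);
    MPI_Finalize();
    return 0;
}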

MPI_ALLREDUCE
The subroutine MPI_ALLREDUCE applies a reduction operation and places the result in all tasks in the group. This is equivalent to an MPI_REDUCE followed by an MPI_BCAST.

MPI Reduction Operations

Barrier Synchronization
A call to the MPI_BARRIER subroutine blocks the caller until all group members have called it. The call returns at any process only after all group members have entered the call.
C: MPI_Barrier (comm)
Fortran: MPI_BARRIER (comm, ierr)
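For illustration, a minimal sketch of a common barrier use: synchronizing all ranks before and after a work phase so that a timing taken with MPI_Wtime is consistent (the timed phase is left as a placeholder; this example is not from the original slides):

/* sketch: synchronize all ranks before and after a work phase to time it consistently */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank;
    double t0, t1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Barrier(MPI_COMM_WORLD);     /* make sure every rank starts together */
    t0 = MPI_Wtime();
    /* ... the work to be timed would go here ... */
    MPI_Barrier(MPI_COMM_WORLD);     /* wait until every rank has finished */
    t1 = MPI_Wtime();
    if (rank == 0) printf("Elapsed time: %f seconds\n", t1 - t0);
    MPI_Finalize();
    return 0;
}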

Wrapper script for mpi-start /1
mpi-start is the recommended solution for hiding the implementation details of job submission.
– The design of mpi-start focused on making MPI job submission as independent as possible of the cluster details.
– It was developed inside the Int.EU.Grid project.
Using the mpi-start system requires the user to define a wrapper script that sets the environment variables and a set of hooks.

Wrapper script for mpi-start /2

#!/bin/bash
# Pull in the arguments.
MY_EXECUTABLE=`pwd`/$1
MPI_FLAVOR=$2
# Convert flavor to lowercase for passing to mpi-start.
MPI_FLAVOR_LOWER=`echo $MPI_FLAVOR | tr '[:upper:]' '[:lower:]'`
# Pull out the correct paths for the requested flavor.
eval MPI_PATH=`printenv MPI_${MPI_FLAVOR}_PATH`
# Ensure the prefix is correctly set. Don't rely on the defaults.
eval I2G_${MPI_FLAVOR}_PREFIX=$MPI_PATH
export I2G_${MPI_FLAVOR}_PREFIX
# Touch the executable. It must exist for the shared file system check.
# If it does not, then mpi-start may try to distribute the executable
# when it shouldn't.
touch $MY_EXECUTABLE
# Setup for mpi-start.
export I2G_MPI_APPLICATION=$MY_EXECUTABLE
export I2G_MPI_APPLICATION_ARGS=
export I2G_MPI_TYPE=$MPI_FLAVOR_LOWER
export I2G_MPI_PRE_RUN_HOOK=mpi-hooks.sh
export I2G_MPI_POST_RUN_HOOK=mpi-hooks.sh
# If these are set then you will get more debugging information.
export I2G_MPI_START_VERBOSE=1
#export I2G_MPI_START_DEBUG=1
# Invoke mpi-start.
$I2G_MPI_START

Hooks for mpi-start /1
The user may write a script which is called before and after the MPI executable is run.
– The pre-hook can be used, for example, to compile the executable itself or to download data.
– The post-hook can be used to analyze the results or to save them on the grid.
The pre- and post-hooks may be defined in separate files, but the functions must be named exactly "pre_run_hook" and "post_run_hook".

Hooks for mpi-start /2

#!/bin/sh

# This function will be called before the MPI executable is started.
pre_run_hook () {
  # Compile the program.
  echo "Compiling ${I2G_MPI_APPLICATION}"
  cmd="mpicc ${MPI_MPICC_OPTS} -o ${I2G_MPI_APPLICATION} ${I2G_MPI_APPLICATION}.c"
  echo $cmd
  $cmd
  if [ ! $? -eq 0 ]; then
    echo "Error compiling program. Exiting..."
    exit 1
  fi
  # Everything's OK.
  echo "Successfully compiled ${I2G_MPI_APPLICATION}"
  return 0
}

# This function will be called after the MPI executable has finished.
# A typical use case is to upload the results to a Storage Element.
post_run_hook () {
  echo "Executing post hook."
  echo "Finished the post hook."
  return 0
}

Defining the job and executable /1
Running the MPI job itself is not significantly different from running a standard grid job. The JobType must be "Normal" and the attribute CpuNumber must be defined.

JobType = "Normal";
CpuNumber = 2;
Executable = "mpi-start-wrapper.sh";
Arguments = "mpi-test MPICH";
StdOutput = "mpi-test.out";
StdError = "mpi-test.err";
InputSandbox = {"mpi-start-wrapper.sh", "mpi-hooks.sh", "mpi-test.c"};
OutputSandbox = {"mpi-test.err", "mpi-test.out"};
Requirements =
  Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment) &&
  Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);

Defining the job and executable /2

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
  int numprocs;   /* Number of processors */
  int procnum;    /* Processor number */

  /* Initialize MPI */
  MPI_Init(&argc, &argv);

  /* Find this processor number */
  MPI_Comm_rank(MPI_COMM_WORLD, &procnum);

  /* Find the number of processors */
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

  printf("Hello world! from processor %d out of %d\n", procnum, numprocs);

  /* Shut down MPI */
  MPI_Finalize();
  return 0;
}

Running the MPI job
Running the MPI job is no different from running any other grid job. If the job ran correctly, then the standard output should contain something like the following:

[…]
=[START]=========================================================
Hello world! from processor 1 out of 2
Hello world! from processor 0 out of 2
=[FINISHED]======================================================
[…]
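For concreteness, a sketch of how such a job would typically be submitted and retrieved with the gLite WMS command-line tools, assuming the JDL above is saved as mpi-test.jdl and a valid VOMS proxy exists (file names are hypothetical; this is not part of the original slides):

# submit the job and store the job identifier in a file
glite-wms-job-submit -a -o jobid.txt mpi-test.jdl
# poll the job status until it reaches Done
glite-wms-job-status -i jobid.txt
# once finished, retrieve the output sandbox (mpi-test.out, mpi-test.err)
glite-wms-job-output -i jobid.txt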

References
– MPI-START documentation: h/mpi-start/mpi-start-documentation/
– MPI Guide
– The practical exercise can be found on the agenda site.

Questions?