COMP7330/7336 Advanced Parallel and Distributed Computing MPI Programming - Exercises Dr. Xiao Qin Auburn University


COMP7330/7336 Advanced Parallel and Distributed Computing MPI Programming - Exercises Dr. Xiao Qin, Auburn University. Reference: MPI Hands-On Exercises, The National Institute for Computational Sciences.

Collective Communication Operations

The barrier synchronization operation is performed in MPI using:
int MPI_Barrier(MPI_Comm comm)

The one-to-all broadcast operation is:
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)

The all-to-one reduction operation is:
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int target, MPI_Comm comm)
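To make the signatures concrete, here is a minimal sketch (not from the slides; values and variable names are illustrative) that broadcasts a value from rank 0, has every rank form a local contribution, and reduces the sum back to rank 0:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size, value = 0, total = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) value = 100;              /* only the root has the data initially */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* now every rank has value == 100 */

    int contribution = value + rank;         /* each rank computes a local partial result */
    MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    MPI_Barrier(MPI_COMM_WORLD);             /* not required here; shown only to illustrate the call */
    if (rank == 0) printf("total = %d\n", total);

    MPI_Finalize();
    return 0;
}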

MPI_Bcast

MPI_Bcast: An Example

Q1: Please complete the following code so that it broadcasts 100 ints from process 0 to every process in the group.

MPI_Comm comm;
int array[100];
int root = 0;
...
MPI_Bcast(array, 100, MPI_INT, root, comm);

Signature reminder: int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)

MPI_Reduce

MPI_Reduce: Predefined operations
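The table on this slide is an image; for reference, the MPI standard's predefined reduction operations are MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_BAND, MPI_LOR, MPI_BOR, MPI_LXOR, MPI_BXOR, MPI_MAXLOC, and MPI_MINLOC. A one-line usage sketch (variable names are illustrative):

MPI_Reduce(&local_max, &global_max, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);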

MPI_Reduce: An illustration

MPI_Reduce: An illustration (cont.)

Q2: What happens when each process contains multiple elements?
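A minimal sketch of the multi-element case (not from the slides; values are illustrative): with count > 1, MPI_Reduce applies the operation element-wise, so recvbuf[i] on the root is the reduction of sendbuf[i] across all processes.

int sendbuf[3] = { rank, 2 * rank, 3 * rank };   /* each process contributes 3 elements */
int recvbuf[3];
MPI_Reduce(sendbuf, recvbuf, 3, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
/* on rank 0: recvbuf[i] = sum over all ranks of that rank's sendbuf[i] */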

Exercise 1: Pi Calculation

π cannot be represented as a simple fraction, but it can be represented by an infinite series of nested fractions, called a continued fraction.

Pi – Continued Fraction

π can be represented by an infinite series of nested fractions, called a continued fraction.

Pi – Continued Fraction

Calculate π by integrating f(x) = 4 / (1 + x^2) over the interval [0, 1].
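Why this integral gives π (a one-line derivation, added for completeness):

integral from 0 to 1 of 4/(1 + x^2) dx = 4 * arctan(1) - 4 * arctan(0) = 4 * (π/4) = π

The programs below approximate this integral with the midpoint rule: the interval [0, 1] is split into n steps of width h = 1/n, and f is evaluated at the midpoint of each step.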

Pi – Sequential Program

static double SerialPi()
{
    double sum = 0.0;
    double step = 1.0 / (double)NUM_STEPS;   /* width of each rectangle; NUM_STEPS defined elsewhere */
    for (int i = 0; i < NUM_STEPS; i++) {
        double x = (i + 0.5) * step;         /* midpoint of the i-th rectangle */
        double partial = 4.0 / (1.0 + x * x);
        sum += partial;
    }
    return step * sum;
}

Q3: Can you explain this sequential program?

Pi – MPI Implementation

This program calculates the number π by integrating f(x) = 4 / (1 + x^2). The area under the curve is divided into rectangles, and the rectangles are distributed to the processors. Each node:
– receives the number of rectangles used in the approximation,
– calculates the areas of its rectangles,
– synchronizes for a global summation.
Node 0 prints the result.

Pi – MPI Implementation 1

int main(int argc, char *argv[])
{
    int done = 0, n, myid, numprocs, i;
    double PI25DT = 3.141592653589793238462643;   /* reference value of pi */
    double mypi, pi, h, sum, x;
    double startwtime = 0.0, endwtime;
    int namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Get_processor_name(processor_name, &namelen);
    fprintf(stdout, "Process %d of %d on %s\n", myid, numprocs, processor_name);

    n = 0;

Pi – MPI Implementation 2

    while (!done) {
        if (myid == 0) {
            if (n == 0) n = 10000; else n = 0;   /* run once with 10000 intervals, then quit */
            startwtime = MPI_Wtime();
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (n == 0)
            done = 1;
        else {
            h = 1.0 / (double) n;
            sum = 0.0;
            for (i = myid + 1; i <= n; i += numprocs) {
                x = h * ((double)i - 0.5);
                sum += f(x);                     /* f(x) = 4/(1 + x*x), defined elsewhere */
            }
            mypi = h * sum;
            MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

Pi – MPI Implementation 3

            if (myid == 0) {
                printf("pi is approximately %.16f, Error is %.16f\n",
                       pi, fabs(pi - PI25DT));
                endwtime = MPI_Wtime();
                printf("wall clock time = %f\n", endwtime - startwtime);
                fflush(stdout);
            }   /* end if */
        }       /* end else */
    }           /* end while */

    MPI_Finalize();
    return 0;
}
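To try the exercise, the program can be compiled and launched with the usual MPI tool chain. The commands below are illustrative only (the file name cpi.c and process count are assumptions; your cluster may use a different launcher or a batch system):

mpicc -o cpi cpi.c
mpirun -np 4 ./cpi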

Exercise 2: MPI Communication Timing Test

Goal: investigate the amount of time required for message passing among processes.
– How many nodes do we need?
– Which factors will affect communication time?
– How many times should you test a round trip?
– How do you convert usec to sec?

start = MPI_Wtime();    /* MPI_Wtime() returns wall-clock time in seconds */
...                     /* communication being timed goes here */
finish = MPI_Wtime();

Implementation: MPI Communication Timing Test

See timing.c.

Performance Evaluation
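For reference, a minimal round-trip ("ping-pong") timing sketch between ranks 0 and 1. This is an illustration only, not the course's timing.c; the message size, repetition count, and variable names are assumptions. Run it with at least 2 processes.

#include <stdio.h>
#include <mpi.h>

#define REPS 1000          /* number of round trips to average over (assumed) */
#define MSG_SIZE 1024      /* message size in bytes (assumed) */

int main(int argc, char *argv[]) {
    int rank;
    char buf[MSG_SIZE] = {0};
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double start = MPI_Wtime();              /* seconds */
    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double finish = MPI_Wtime();

    if (rank == 0)
        printf("average round-trip time = %g sec\n", (finish - start) / REPS);

    MPI_Finalize();
    return 0;
}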