Published by Eric Chase. Modified over 8 years ago.
COMP7330/7336 Advanced Parallel and Distributed Computing
MPI Programming - Exercises
Dr. Xiao Qin, Auburn University
http://www.eng.auburn.edu/~xqin
xqin@auburn.edu
Reference: MPI Hands-On Exercises, The National Institute for Computational Sciences
Collective Communication Operations
The barrier synchronization operation is performed in MPI using:
    int MPI_Barrier(MPI_Comm comm)
The one-to-all broadcast operation is:
    int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)
The all-to-one reduction operation is:
    int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int target, MPI_Comm comm)
MPI_Bcast
MPI_Bcast: An Example
Q1: Please complete the following code that broadcasts 100 ints from process 0 to every process in the group.
    MPI_Comm comm;
    int array[100];
    int root = 0;
    ...
    MPI_Bcast(array, 100, MPI_INT, root, comm);
For reference, the signature is:
    int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int source, MPI_Comm comm)
MPI_Reduce
MPI_Reduce: Predefined operations
MPI_Reduce: An illustration
MPI_Reduce: An illustration (cont.)
Q2: What happens when each process contains multiple elements?
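One way to reason about Q2: when count is greater than 1, MPI_Reduce applies the operation element by element, so position j of the result combines position j of every process's send buffer. The sketch below emulates that behaviour for a sum in plain C, without calling MPI, so it compiles with any C compiler. The function name `simulate_reduce_sum` and the flat array layout are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Plain-C emulation (no MPI calls) of an element-wise sum reduction.
 * send[r * count + j] stands for element j in the send buffer of rank r;
 * recv[j] stands for element j of the receive buffer on the target.
 */
void simulate_reduce_sum(const int *send, int nprocs, int count, int *recv)
{
    assert(nprocs > 0 && count > 0);
    for (int j = 0; j < count; j++) {
        recv[j] = 0;
        for (int r = 0; r < nprocs; r++) {
            recv[j] += send[r * count + j]; /* combine position j across ranks */
        }
    }
}
```

For two simulated ranks holding {1, 2, 3} and {10, 20, 30}, the result buffer is {11, 22, 33}: three independent sums, not one grand total.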
Exercise 1: Pi Calculation
π cannot be represented as a simple fraction, but it can be represented by an infinite series of nested fractions, called a continued fraction.
Pi – Continued Fraction
Pi can be represented by an infinite series of nested fractions, called a continued fraction.
Pi – Continued Fraction
Calculate Pi by integrating f(x) = 4/(1+x^2) from x = 0 to x = 1.
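Why this integral yields π, and how it turns into a sum of rectangle areas, can be written out as follows. The symbols n (number of rectangles), h (rectangle width), and x_i (midpoint of rectangle i) are chosen here to match the variable names in the MPI program.

```latex
\[
\int_0^1 \frac{4}{1+x^2}\,dx \;=\; 4\arctan(x)\Big|_0^1 \;=\; 4\cdot\frac{\pi}{4} \;=\; \pi
\]
\[
\pi \;\approx\; h \sum_{i=1}^{n} f(x_i),
\qquad h = \frac{1}{n},
\qquad x_i = \left(i - \tfrac{1}{2}\right) h,
\qquad f(x) = \frac{4}{1+x^2}
\]
```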
Pi – Sequential Program
    static double SerialPi() {
        double sum = 0.0;
        double step = 1.0 / (double)NUM_STEPS;
        for (int i = 0; i < NUM_STEPS; i++) {
            /* midpoint of rectangle i; because i starts at 0 the offset is +0.5 */
            double x = (i + 0.5) * step;
            double partial = 4.0 / (1.0 + x * x);
            sum += partial;
        }
        return step * sum;
    }
Q3: Can you explain this sequential program?
Pi – MPI Implementation
This program calculates π by integrating f(x) = 4/(1+x^2). The area under the curve is divided into rectangles, and the rectangles are distributed to the processors.
Each node:
– receives the number of rectangles used in the approximation.
– calculates the areas of its rectangles.
– synchronizes for a global summation.
Node 0 prints the result.
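The distribution of rectangles can be checked without an MPI installation. The sketch below is a plain-C emulation, not the MPI program itself: it loops over simulated ranks, gives rank myid the rectangles myid+1, myid+1+numprocs, and so on (the cyclic pattern used in the MPI code), and adds the partial results where the real program would call MPI_Reduce. The function name `simulated_parallel_pi` is hypothetical.

```c
#include <assert.h>

/* Same integrand as in the slides: its integral over [0, 1] is pi. */
static double f(double x) { return 4.0 / (1.0 + x * x); }

/*
 * Emulates numprocs ranks sharing n rectangles cyclically.
 * The outer loop plays the role of the ranks running in parallel.
 */
double simulated_parallel_pi(int n, int numprocs)
{
    assert(n > 0 && numprocs > 0);
    double h = 1.0 / (double)n; /* width of one rectangle */
    double pi = 0.0;            /* stands in for the root's receive buffer */

    for (int myid = 0; myid < numprocs; myid++) {
        double sum = 0.0;
        for (int i = myid + 1; i <= n; i += numprocs) {
            double x = h * ((double)i - 0.5); /* midpoint of rectangle i */
            sum += f(x);
        }
        double mypi = h * sum; /* this rank's share of the area */
        pi += mypi;            /* stands in for MPI_Reduce with MPI_SUM */
    }
    return pi;
}
```

Changing numprocs changes only which rank handles which rectangle, so the result should agree with the single-process value up to floating-point rounding.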
Pi – MPI Implementation 1
    #include <math.h>
    #include <stdio.h>
    #include <mpi.h>

    /* integrand: the integral of f over [0, 1] equals pi */
    double f(double a) { return 4.0 / (1.0 + a * a); }

    int main(int argc, char *argv[])
    {
        int done = 0, n, myid, numprocs, i;
        double PI25DT = 3.141592653589793238462643;
        double mypi, pi, h, sum, x;
        double startwtime = 0.0, endwtime;
        int namelen;
        char processor_name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Get_processor_name(processor_name, &namelen);
        fprintf(stdout, "Process %d of %d on %s\n", myid, numprocs, processor_name);
        n = 0;
Pi – MPI Implementation 2
        while (!done) {
            if (myid == 0) {
                /* first pass uses n = 10000; the second pass sets n = 0 to end the loop */
                if (n == 0) n = 10000; else n = 0;
                startwtime = MPI_Wtime();
            }
            /* every process receives n from process 0 */
            MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
            if (n == 0)
                done = 1;
            else {
                h = 1.0 / (double) n;
                sum = 0.0;
                /* rectangles are dealt out cyclically: myid+1, myid+1+numprocs, ... */
                for (i = myid + 1; i <= n; i += numprocs) {
                    x = h * ((double)i - 0.5);
                    sum += f(x);
                }
                mypi = h * sum;
                MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
Pi – MPI Implementation 3
                if (myid == 0) {
                    printf("pi is approximately %.16f, Error is %.16f \n", pi, fabs(pi - PI25DT));
                    endwtime = MPI_Wtime();
                    printf("wall clock time = %f\n", endwtime - startwtime);
                    fflush(stdout);
                } /* end if */
            } /* end else */
        } /* end while */
        MPI_Finalize();
        return 0;
    }
Exercise 2: MPI Communication Timing Test
Goal: investigate the amount of time required for message passing among processes.
How many nodes do we need?
Which factors will affect communication time?
How many times should you test a round trip?
Timing calls:
    start = MPI_Wtime();
    /* ... message passing being timed ... */
    finish = MPI_Wtime();
MPI_Wtime() returns wall-clock time in seconds as a double. How do you convert between seconds and microseconds (usec)?
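The unit conversion and the per-message figures can be sketched as small helpers. This is an illustrative sketch, not taken from timing.c, which the slides do not show. All function names are hypothetical, and two assumptions are built in: the elapsed time is finish - start in seconds for the whole loop, and one-way latency is estimated as half of the average round-trip time.

```c
#include <assert.h>

/* Seconds (as returned by a difference of MPI_Wtime values) to microseconds. */
double seconds_to_usec(double seconds)
{
    return seconds * 1.0e6;
}

/* Estimated one-way latency: half of the average round-trip time. */
double one_way_latency_usec(double total_seconds, int round_trips)
{
    assert(round_trips > 0);
    double per_round_trip = total_seconds / (double)round_trips;
    return seconds_to_usec(per_round_trip / 2.0);
}

/* Each round trip moves the message twice (there and back). */
double bandwidth_mbytes_per_sec(double total_seconds, int round_trips,
                                long message_bytes)
{
    assert(total_seconds > 0.0 && round_trips > 0);
    double bytes_moved = 2.0 * (double)round_trips * (double)message_bytes;
    return bytes_moved / total_seconds / 1.0e6;
}
```

For example, 1000 round trips that take 0.002 seconds in total give an estimated one-way latency of about 1 usec. Averaging over many round trips matters because a single round trip can be shorter than the timer's resolution.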
Implementation: MPI Communication Timing Test
See timing.c
Performance Evaluation