Pattern Programming Tools


Pattern programming approaches we have developed

High-level abstraction (Seeds framework) – Patterns fully pre-implemented for a distributed or local system. Self-deploys. The programmer simply writes what the master and slaves do, without any code for the pattern itself (message passing). Java-based.

Medium-level abstraction using compiler directives (Paraguin) – Programmers are provided with compiler directives that implement patterns and common data transfer operations. Requires a special compiler that recognizes the directives.

Low-level abstraction (Suzaku) – The programmer is provided with macros and pre-written MPI routines that together can implement patterns and common data transfer operations. Compiled as a regular MPI program.

No abstraction – The programmer implements everything, but can be given guidance to follow a pattern approach.

Suzaku Pattern Programming Framework
Project at UNC-Charlotte: http://coitweb.uncc.edu/~abw/Suzaku/
© 2014 B. Wilkinson/Clayton Ferner   Suzaku.ppt   Modification date: Sept. 23, 2014

Suzaku Framework Version 0

An ongoing project, so far tested once in the classroom. Enables programmers to implement pattern-based MPI programs using macros and routines, without writing MPI message-passing code.

Provides:
Macros – in-line substitution of short code sequences
Routines – for patterns and for common operations needed to implement patterns

Suzaku Hello World

   #include "suzaku.h"

   void compute(double a[N][N], double b[N][N], double c[N][N], int index, int blksize) {
      // Slave compute function does nothing here
   }

   int main(int argc, char **argv) {
      int i, p, rank;

      MPI_START(&p, &rank, &argc, &argv);   // Suzaku macro that bundles the MPI routines commonly
                                            // needed at the beginning of MPI programs
      printf("Hello world from process: %i \n", rank);

      MPI_Finalize();   // Currently an MPI routine; will be changed to a Suzaku routine
                        // in subsequent development of Suzaku
      return 0;
   }

The program outputs "Hello world from process: " followed by the process number, from each process.

MPI_START – actually a macro

   #define MPI_START(p, rank, argc, argv) \
           MPI_Init(argc, argv); \
           MPI_Comm_size(MPI_COMM_WORLD, p); \
           MPI_Comm_rank(MPI_COMM_WORLD, rank)

Note: no semicolon at the end of the definition.
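For example, in the Hello World program the preprocessor expands the call as shown below; the final semicolon comes from the call site, which is why the macro definition itself ends without one:

   /* MPI_START(&p, &rank, &argc, &argv); expands to: */
   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &p);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);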

Suzaku routines for timing execution

   void startTimer(int rank);
   void stopTimer(int rank);

Implementation:

   void startTimer(int rank) {
      if (rank == 0) {
         gettimeofday(&tv1, NULL);
      }
   }

   void stopTimer(int rank) {
      gettimeofday(&tv2, NULL);
      printf("elapsed_time=\t%lf (seconds)\n",
             (tv2.tv_sec - tv1.tv_sec) + ((tv2.tv_usec - tv1.tv_usec) / 1000000.0));
      MPI_Finalize();
   }

MPI_Wtime() could have been used instead. However, time() or gettimeofday() is useful for comparison with a sequential C version of the program using the same library calls.
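Typical use, as in the workpool matrix multiplication program later in these slides, is simply to bracket the section to be timed with the two calls:

   startTimer(rank);                   // process 0 records the start time
   /* ... parallel computation ... */
   stopTimer(rank);                    // prints the elapsed time and calls MPI_Finalize()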

Suzaku routine to read input data

   void readInputFile(int argc, char *argv[], int *error, double array1[N][N], double array2[N][N]);

Reads values from a file into two floating-point arrays. The file name is given by the 1st command line argument. File format as used in other assignments.

Implementation:

   void readInputFile(int argc, char *argv[], int *error, double array1[N][N], double array2[N][N]) {
      int i, j;
      FILE *fd;
      char *usage = "Usage: %s file\n";
      if (argc < 2) {
         fprintf(stderr, usage, argv[0]);
         *error = -1;
      }
      if ((fd = fopen(argv[1], "r")) == NULL) {
         fprintf(stderr, "%s: Cannot open file %s for reading.\n", argv[0], argv[1]);
         fprintf(stderr, usage, argv[1]);
      }
      MPI_Bcast(error, 1, MPI_INT, 0, MPI_COMM_WORLD);
      if (*error != 0) MPI_Finalize();
      for (i = 0; i < N; i++) {
         for (j = 0; j < N; j++)
            fscanf(fd, "%lf", &array1[i][j]);
         for (j = 0; j < N; j++)
            fscanf(fd, "%lf", &array2[i][j]);
      }
      fclose(fd);
      MPI_Barrier(MPI_COMM_WORLD);
   }

Suzaku routine to broadcast data to all processes

   mpiBroadcastArrayOfDoubles(*b);   // send out the b array to all processes

Implementation:

   void mpiBroadcastArrayOfDoubles(double *array) {
      int n = N * N;
      MPI_Bcast(array, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
   }
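The call passes *b for a statically declared 2-D array double b[N][N]: *b is b[0], which decays to a double * pointing at the first of the N*N contiguous elements, matching the routine's double *array parameter. An equivalent call would be:

   mpiBroadcastArrayOfDoubles(&b[0][0]);   // same address as *b for a contiguous 2-D array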

Workpool pattern

   void masterProcess(double array1[N][N], double array2[N][N], int p, int rank, int blksize);

Manages the flow of work to the workers. Uses a task queue to issue work; workers come back with completed work.

   void workerProcess(double array1[N][N], double array2[N][N], double array3[N][N], int rank, int blksize);

Workers receive work, call the compute function, and return the results to the master.

   void compute(double array1[N][N], double array2[N][N], double array3[N][N], int index, int blksize);

The compute routine is implemented by the programmer; a minimal illustration follows.
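To illustrate the contract, a purely hypothetical compute function (not from these slides) might simply copy its assigned block of rows; index is the first row of the block and blksize the number of rows, matching how the master issues work:

   void compute(double array1[N][N], double array2[N][N], double array3[N][N], int index, int blksize) {
      int i, j;
      for (i = index; i < index + blksize; i++)     // the blksize rows assigned to this call
         for (j = 0; j < N; j++)
            array3[i][j] = array1[i][j];            // placeholder work; real code computes results here
   }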

Implementation of master process

   void masterProcess(double array1[N][N], double array2[N][N], int p, int rank, int blksize) {
      int process, m;
      int n = N;
      int work = 0;
      if (rank == 0) {
         for (process = 1; process < p; process++) {   // give all of the workers work
            MPI_Send(&work, 1, MPI_INT, process, work, MPI_COMM_WORLD);                       // send index
            MPI_Send(&array1[work], n * blksize, MPI_DOUBLE, process, work, MPI_COMM_WORLD);  // send block of elements
            work += blksize;
         }
         while (work < N) {
            // receive index and results
            MPI_Recv(&process, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            MPI_Recv(&array2[process], n * blksize, MPI_DOUBLE, status.MPI_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            if (work < n - p * blksize) {
               // send another block
               MPI_Send(&work, 1, MPI_INT, status.MPI_SOURCE, work, MPI_COMM_WORLD);
               MPI_Send(&array1[work], n * blksize, MPI_DOUBLE, status.MPI_SOURCE, work, MPI_COMM_WORLD);
            } else if (work >= n - p * blksize && status.MPI_SOURCE == 1) {
               MPI_Send(&work, 1, MPI_INT, status.MPI_SOURCE, work, MPI_COMM_WORLD);
            }
         }
         if (work == N) {
            // all work done; terminate: send final sends to the waiting recvs and pick up the final results
            for (m = 1; m < p; m++) {
               MPI_Isend(&n, 1, MPI_INT, m, 0, MPI_COMM_WORLD, &request[0]);
               MPI_Recv(&process, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
               MPI_Recv(&array2[process], n * blksize, MPI_DOUBLE, status.MPI_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            }
         }
      }
   }

Note: status.MPI_SOURCE is used to identify which worker a result comes from.

Implementation of worker process

   void workerProcess(double array1[N][N], double array2[N][N], double array3[N][N], int rank, int blksize) {
      int work = 0;
      int n = N;
      if (rank != 0) {
         while (work < n) {
            MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);   // recv index
            if (work < n) {
               MPI_Recv(&array1[work], n * blksize, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
               compute(array1, array2, array3, work, blksize);   // user programmer's routine
               MPI_Send(&work, 1, MPI_INT, 0, rank, MPI_COMM_WORLD);
               MPI_Send(&array3[work], N * blksize, MPI_DOUBLE, 0, rank, MPI_COMM_WORLD);
            }
         }
      }
   }

Suzaku workpool handshaking (diagram)

The master sends an index to any waiting slave, then sends the corresponding block of data to the same slave; the slave's reply returns a status carrying its rank, which the master uses as the destination for further work. After all slaves have received index/data (p blocks), the master continues sending index/data for the remaining blocks as slaves return results. Finally, the master sends the termination sends.

Other Suzaku routines

   void workerGetRowOfDoubles(double array[N][N], int *index, int rank, int blksize);

Puts a worker into waiting status to receive the next piece of work.

Implementation:

   void workerGetRowOfDoubles(double array[N][N], int *index, int rank, int blksize) {
      int n = N;
      MPI_Recv(index, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
      if (*index < N) {
         MPI_Recv(&array[*index], n * blksize, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
      }
   }
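A sketch of how a worker might use it, assuming the same convention as workerProcess where an index of N or more signals termination:

   int index = 0;
   while (index < N) {
      workerGetRowOfDoubles(a, &index, rank, blksize);   // wait for the next index and block of rows
      if (index < N) {
         compute(a, b, c, index, blksize);               // process the block
         // ... send the index and the computed block back to the master ...
      }
   }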

Workpool Matrix Multiplication

   int main(int argc, char *argv[]) {
      int i, j, k, error = 0;
      int p, rank = 0;
      double a[N][N], b[N][N], c[N][N];
      double sum;

      MPI_START(&p, &rank, &argc, &argv);
      readInputFile(argc, argv, &error, a, b);     // read input data file
      startTimer(rank);

      if (p == 1) {          // workpool fails with 1 process (no slaves), so do it in the master
         for (k = 0; k < N; k++) {
            for (i = 0; i < N; i++) {
               sum = 0;
               for (j = 0; j < N; j++)
                  sum += a[k][j] * b[j][i];
               c[k][i] = sum;
            }
         }
      } else {
         mpiBroadcastArrayOfDoubles(*b);           // send out the b array to the workers
         masterProcess(a, c, p, rank, BLKSIZE);    // task queue issues rows of a; workers return results
         workerProcess(a, b, c, rank, BLKSIZE);    // fetches work, returns results from compute function
      }

      printResults("C =", c, rank);
      stopTimer(rank);
      return 0;
   }

Compute function

   void compute(double a[N][N], double b[N][N], double c[N][N], int index, int blksize) {
      int i, j;
      double sum;
      for (int indexx = index; indexx < index + blksize; indexx++) {
         for (i = 0; i < N; i++) {
            sum = 0;
            for (j = 0; j < N; j++) {
               sum += a[indexx][j] * b[j][i];
            }
            c[indexx][i] = sum;
         }
      }
   }

Suzaku Software

Has two files:
suzaku.h – header file containing the macro definitions and routine signatures
suzaku.o – compiled object file of the suzaku.c source file that contains the Suzaku routines

In class, students are given suzaku.o rather than suzaku.c, because they subsequently write the routines themselves.
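Collecting the routines shown in these slides, suzaku.h contains declarations along the following lines (a reconstruction for illustration, not the actual header; N and BLKSIZE are assumed to be defined elsewhere, and the printResults signature is inferred from its use):

   /* suzaku.h (sketch reconstructed from the slides) */
   #include <mpi.h>
   #include <sys/time.h>

   #define MPI_START(p, rank, argc, argv) \
           MPI_Init(argc, argv); \
           MPI_Comm_size(MPI_COMM_WORLD, p); \
           MPI_Comm_rank(MPI_COMM_WORLD, rank)

   void startTimer(int rank);
   void stopTimer(int rank);
   void readInputFile(int argc, char *argv[], int *error, double array1[N][N], double array2[N][N]);
   void mpiBroadcastArrayOfDoubles(double *array);
   void masterProcess(double array1[N][N], double array2[N][N], int p, int rank, int blksize);
   void workerProcess(double array1[N][N], double array2[N][N], double array3[N][N], int rank, int blksize);
   void workerGetRowOfDoubles(double array[N][N], int *index, int rank, int blksize);
   void compute(double array1[N][N], double array2[N][N], double array3[N][N], int index, int blksize);  /* supplied by the user program */
   void printResults(char *label, double array[N][N], int rank);  /* signature assumed from its use */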

Compilation/execution

As a regular MPI program.

Command line:
Compile:   mpicc -o hello hello.c suzaku.o
mpicc uses gcc to link the libraries and create the executable hello, and all the usual features of gcc can be used.
Execute:   mpiexec -n # ./hello
where "#" is the number of copies of the process to start.

Using Eclipse: same approach as for a regular MPI program. See "Using Suzaku" at http://coitweb.uncc.edu/~abw/Suzaku/
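For example, for the Hello World program with four processes (the order of the output lines varies from run to run, since the processes print independently):

   mpicc -o hello hello.c suzaku.o
   mpiexec -n 4 ./hello
   Hello world from process: 1
   Hello world from process: 0
   Hello world from process: 3
   Hello world from process: 2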

Suzaku Re-design

A project has just started to re-design Suzaku:
Avoid MPI-related parameters as much as possible.
Provide routines that specifically implement generic (type-less) low-level patterns:
Suzaku_Scatter(…);
Suzaku_Gather(…);
Suzaku_Compute(…);
The implementation needs to get around MPI message type constraints using more advanced MPI features.
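One way a generic, type-less routine could avoid MPI-typed parameters (a sketch of the idea only, not the project's actual design, which may instead use MPI derived datatypes) is to describe data by its size in bytes and transfer it as MPI_BYTE; the hypothetical Suzaku_Scatter below distributes an equal-sized chunk of raw bytes from the master to every process:

   /* Hypothetical sketch: scatter 'chunk_bytes' bytes of untyped data to each process */
   void Suzaku_Scatter(void *sendbuf, void *recvbuf, int chunk_bytes) {
      MPI_Scatter(sendbuf, chunk_bytes, MPI_BYTE, recvbuf, chunk_bytes, MPI_BYTE, 0, MPI_COMM_WORLD);
   }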

Simple master-slave pattern (scatter-compute-gather)

   int main(int argc, char **argv) {
      Suzaku_Scatter(…);
      Suzaku_Compute(…);
      Suzaku_Gather(…);
      return 0;
   }

Questions