1 " Teaching Parallel Design Patterns to Undergraduates in Computer Science” Panel member SIGCSE 2014 - The 45 th ACM Technical Symposium on Computer Science.

Slides:



Advertisements
Similar presentations
MPI Message Passing Interface
Advertisements

Practical techniques & Examples
Grid Computing, B. Wilkinson, C Program Command Line Arguments A normal C program specifies command line arguments to be passed to main with:
Toward using higher-level abstractions to teach Parallel Computing 5/20/2013 (c) Copyright 2013 Clayton S. Ferner, UNC Wilmington1 Clayton Ferner, University.
Other Means of Executing Parallel Programs OpenMP And Paraguin 1(c) 2011 Clayton S. Ferner.
12d.1 Two Example Parallel Programs using MPI UNC-Wilmington, C. Ferner, 2007 Mar 209, 2007.
12b.1 Introduction to Message-passing with MPI UNC-Wilmington, C. Ferner, 2008 Nov 4, 2008.
1 UNC-Charlotte’s Grid Computing “Seeds” framework 1 © 2011 Jeremy Villalobos /B. Wilkinson Fall 2011 Grid computing course. Slides10-1.ppt Modification.
Csinparallel.org Patterns and Exemplars: Compelling Strategies for Teaching Parallel and Distributed Computing to CS Undergraduates Libby Shoop Joel Adams.
Monte Carlo Simulation Used when it is infeasible or impossible to compute an exact result with a deterministic algorithm Especially useful in –Studying.
Exercise problems for students taking the Programming Parallel Computers course. Janusz Kowalik Piotr Arlukowicz Tadeusz Puzniakowski Informatics Institute.
CS470/570 Lecture 5 Introduction to OpenMP Compute Pi example OpenMP directives and options.
Lecture 5: Shared-memory Computing with Open MP. Shared Memory Computing.
1 " Teaching Parallel Design Patterns to Undergraduates in Computer Science” Panel member SIGCSE The 45 th ACM Technical Symposium on Computer Science.
Computer Science 320 Broadcasting. Floyd’s Algorithm on SMP for i = 0 to n – 1 parallel for r = 0 to n – 1 for c = 0 to n – 1 d rc = min(d rc, d ri +
Message Passing Programming with MPI Introduction to MPI Basic MPI functions Most of the MPI materials are obtained from William Gropp and Rusty Lusk’s.
Hybrid MPI and OpenMP Parallel Programming
Message Passing Programming Model AMANO, Hideharu Textbook pp. 140-147.
Parallel Programming with MPI By, Santosh K Jena..
Message Passing and MPI Laxmikant Kale CS Message Passing Program consists of independent processes, –Each running in its own address space –Processors.
1 "Workshop 31: Developing a Hands-on Undergraduate Parallel Programming Course with Pattern Programming SIGCSE The 44 th ACM Technical Symposium.
Thinking in Parallel – Implementing In Code New Mexico Supercomputing Challenge in partnership with Intel Corp. and NM EPSCoR.
MPI and OpenMP.
Programming distributed memory systems: Message Passing Interface (MPI) Distributed memory systems: multiple processing units working on one task (e.g.
3/12/2013Computer Engg, IIT(BHU)1 MPI-1. MESSAGE PASSING INTERFACE A message passing library specification Extended message-passing model Not a language.
Using Compiler Directives Paraguin Compiler 1 © 2013 B. Wilkinson/Clayton Ferner SIGCSE 2013 Workshop 310 session2a.ppt Modification date: Jan 9, 2013.
Computer Science 320 Reduction. Estimating π Throw N darts, and let C be the number of darts that land within the circle quadrant of a unit circle Then,
Message Passing Interface Using resources from
Suzaku Pattern Programming Framework (a) Structure and low level patterns © 2015 B. Wilkinson Suzaku.pptx Modification date February 22,
ITCS 4/5145 Parallel Computing, UNC-Charlotte, B
1 ITCS4145 Parallel Programming B. Wilkinson March 23, hybrid-abw.ppt Hybrid Parallel Programming Introduction.
The Suzaku Pattern Programming Framework 6th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-16) 2016 IEEE International Parallel.
Introduction to OpenMP
The Suzaku Pattern Programming Framework
Using Paraguin to Create Parallel Programs
Hybrid Parallel Programming with the Paraguin compiler
Pattern Parallel Programming
MPI Message Passing Interface
Introduction to OpenMP
September 4, 1997 Parallel Processing (CS 667) Lecture 5: Shared Memory Parallel Programming with OpenMP* Jeremy R. Johnson Parallel Processing.
Paraguin Compiler Examples.
Sieve of Eratosthenes.
Parallel Graph Algorithms
Parallel Programming with MPI and OpenMP
Using compiler-directed approach to create MPI code automatically
Collective Communication Operations
Pattern Parallel Programming
Paraguin Compiler Examples.
Using compiler-directed approach to create MPI code automatically
Hybrid Parallel Programming
Paraguin Compiler Communication.
Paraguin Compiler Version 2.1.
Paraguin Compiler Examples.
Paraguin Compiler Version 2.1.
CSCE569 Parallel Computing
Pattern Programming Tools
Introduction to parallelism and the Message Passing Interface
Hybrid Parallel Programming
Using compiler-directed approach to create MPI code automatically
Hybrid Parallel Programming
Introduction to OpenMP
Patterns Paraguin Compiler Version 2.1.
Hybrid MPI and OpenMP Parallel Programming
Parallel Graph Algorithms
Matrix Addition, C = A + B Add corresponding elements of each matrix to form elements of result matrix. Given elements of A as ai,j and elements of B as.
Hybrid Parallel Programming
This material is based upon work supported by the National Science Foundation under Grant #XXXXXX. Any opinions, findings, and conclusions or recommendations.
Quiz Questions How does one execute code in parallel in Paraguin?
Some codes for analysis and preparation for programming
CS 584 Lecture 8 Assignment?.
Presentation transcript:

1 " Teaching Parallel Design Patterns to Undergraduates in Computer Science” Panel member SIGCSE The 45 th ACM Technical Symposium on Computer Science Education Saturday March 8, 2014, 9:00 am - 10:15 am Dr. Clayton Ferner University of North Carolina Wilmington

Paraguin Compiler
- Creates an abstraction similar to OpenMP for generating MPI code
- Uses pragma statements
- Allows easy hybrid compilation
- Source-to-source compiler
- The user can inspect and modify the resulting MPI code

Example 1 (Monte Carlo Estimation of PI)

    int main(int argc, char *argv[]) {
        char *usage = "Usage: %s N\n";
        int i, error = 0, count, count_tmp, total;
        double x, y, result;
        ...
        total = atoi(argv[1]);

        #pragma paraguin begin_parallel                // Parallel region
        #pragma paraguin bcast total                   // Broadcast input

        count = 0;
        srandom(...);
        for (i = 0; i < total; i++) {
            x = ((double) random()) / RAND_MAX;
            y = ((double) random()) / RAND_MAX;
            if (x*x + y*y <= 1.0) {
                count++;
            }
        }
        ;
        #pragma paraguin reduce sum count count_tmp    // Reduce partial results
        #pragma paraguin end_parallel                  // End parallel region

        result = 4.0 * (((double) count_tmp) / (__guin_NP * total));

Example 2 (Matrix Addition)

    int main(int argc, char *argv[]) {
        int i, j, error = 0;
        double A[N][N], B[N][N], C[N][N];
        char *usage = "Usage: %s file\n";
        ...
        // Read input matrices A and B

        #pragma paraguin begin_parallel                // Parallel region

        // Scatter the input to all processors.
        #pragma paraguin scatter A B

        // Parallelize the following loop nest, assigning
        // iterations of the outermost loop (i) to different
        // partitions.
        #pragma paraguin forall                        // Forall
        for (i = 0; i < N; i++) {
            for (j = 0; j < N; j++) {
                C[i][j] = A[i][j] + B[i][j];
            }
        }
        ;
        #pragma paraguin gather C                      // Gather partial results
        #pragma paraguin end_parallel                  // End parallel region
        ...
        // Process results

Compilation and Running

    $ scc -D__x86_64__ -cc mpicc montecarlo.c -o montecarlo.out
    $ mpirun -np 12 montecarlo.out
    Estimation of PI =

Creating a Hybrid Program: Paraguin source code w/ pragmas -> mpicc / gcc w/ OpenMP -> hybrid executable.
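The hybrid path works because the MPI code produced by scc can still contain OpenMP pragmas, so the final compile involves both mpicc and OpenMP. As a rough illustration only (this is not Paraguin's actual generated output, and the loop body is just a placeholder), a hybrid MPI + OpenMP program has this shape:

    // Minimal hybrid MPI + OpenMP sketch (illustrative only; not Paraguin output).
    // Each MPI process takes a share of the iterations; OpenMP threads split
    // that share further and combine results with a reduction.
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, np;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &np);

        long local_count = 0;
        // Cyclic distribution of iterations across MPI processes,
        // then thread-level parallelism within each process.
        #pragma omp parallel for reduction(+:local_count)
        for (long i = rank; i < 1000000; i += np) {
            double x = (double) i / 1000000.0;
            if (x * x <= 0.5) local_count++;   // placeholder computation
        }

        long total_count = 0;
        MPI_Reduce(&local_count, &total_count, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("total = %ld\n", total_count);

        MPI_Finalize();
        return 0;
    }

A file like this would typically be built with something like mpicc -fopenmp and launched with mpirun, which matches the pipeline shown above.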

Compile Web Page (mpoptions.html)
- Upload your source code
- Compile
- Download the resulting MPI source code and compiler log messages

Implemented Patterns
- Scatter/Gather
- Stencil

Scatter/Gather
Monte Carlo and Matrix Addition are examples of Scatter/Gather. The scatter/gather pattern can also use broadcast, reduction, or both. It is done as a template (see the MPI sketch below):
- Master prepares input
- Scatter/Broadcast input
- Compute partial results
- Gather/Reduce partial results into the final result
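For reference, a bare-MPI version of the same template, a minimal sketch rather than Paraguin's generated code (the array size N, the names input, local, and the sum computation are all invented for illustration), looks roughly like this:

    // Scatter/gather template in plain MPI (illustrative sketch only).
    #include <mpi.h>
    #include <stdio.h>

    #define N 1024     // assume N is divisible by the number of processes

    int main(int argc, char *argv[]) {
        int rank, np;
        double input[N], local[N], partial_sum = 0.0, total = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &np);

        if (rank == 0)                       // 1. Master prepares input
            for (int i = 0; i < N; i++) input[i] = (double) i;

        int chunk = N / np;
        MPI_Scatter(input, chunk, MPI_DOUBLE,             // 2. Scatter input
                    local, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        for (int i = 0; i < chunk; i++)      // 3. Compute partial results
            partial_sum += local[i];

        MPI_Reduce(&partial_sum, &total, 1, MPI_DOUBLE,   // 4. Reduce partial results
                   MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) printf("total = %f\n", total);
        MPI_Finalize();
        return 0;
    }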

Stencil

    #define TOTAL_TIME 3000
    #define N 200
    #define M 200

    double computeValue(double A[][M], int i, int j) {
        return (A[i-1][j] + A[i+1][j] + A[i][j-1] + A[i][j+1]) * 0.25;
    }

    int main(int argc, char *argv[]) {
        int i, j, n, m, max_iterations, done;
        double A[2][N][M];
        ...
        // Initialize input A

        #pragma paraguin begin_parallel

        n = N;
        m = M;
        max_iterations = TOTAL_TIME;
        ;
        #pragma paraguin stencil A n m \
                max_iterations computeValue    // Stencil pattern

        #pragma paraguin end_parallel

The stencil pragma is replaced with code that does the following (see the sketch after this list):
1. The array given as an argument to the stencil pragma is broadcast to all available processors.
2. A loop is created to iterate max_iterations times. Within that loop, code is inserted to perform the following steps:
   a. Each processor (except the last one) sends its last row to the processor with rank one greater than its own.
   b. Each processor (except the first one) receives that last row from the processor with rank one less than its own.
   c. Each processor (except the first one) sends its first row to the processor with rank one less than its own.
   d. Each processor (except the last one) receives that first row from the processor with rank one greater than its own.
   e. Each processor iterates through the values of the rows for which it is responsible and uses the provided function to compute the next value.
3. The data is gathered back to the root processor (rank 0).
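A minimal sketch of one iteration of steps 2a-2e in plain MPI follows. This is illustrative only, not the code Paraguin actually emits; the variables rank, np, cur, next, first_row, and last_row are invented here, assuming each process owns rows [first_row, last_row] of the current plane A[cur] with M columns per row and next = 1 - cur.

    // Halo exchange for one stencil iteration (illustrative sketch only).
    // 2a/2b: pass the last owned row "down" to rank + 1.
    if (rank < np - 1)
        MPI_Send(&A[cur][last_row][0], M, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD);
    if (rank > 0)
        MPI_Recv(&A[cur][first_row - 1][0], M, MPI_DOUBLE, rank - 1, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    // 2c/2d: pass the first owned row "up" to rank - 1.
    if (rank > 0)
        MPI_Send(&A[cur][first_row][0], M, MPI_DOUBLE, rank - 1, 1, MPI_COMM_WORLD);
    if (rank < np - 1)
        MPI_Recv(&A[cur][last_row + 1][0], M, MPI_DOUBLE, rank + 1, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    // 2e: update the rows this process owns using the provided function.
    for (int i = first_row; i <= last_row; i++)
        for (int j = 1; j < M - 1; j++)
            A[next][i][j] = computeValue(A[cur], i, j);

In practice the blocking sends above would usually be replaced by MPI_Sendrecv or nonblocking calls to avoid deadlock on large rows; the sketch simply mirrors the order of steps 2a-2d.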

Future Work

Acknowledgements
Extending this work to the teaching environment is supported by the National Science Foundation under grant "Collaborative Research: Teaching Multicore and Many-Core Programming at a Higher Level of Abstraction" # / ( ). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
This work was initiated by Jeremy Villalobos in his PhD thesis, "Running Parallel Applications on a Heterogeneous Environment with Accessible Development Practices and Automatic Scalability," UNC-Charlotte. Jeremy developed the "Seeds" pattern programming software.