MPI (continued): an example of designing an explicit message passing program, emphasizing the difference between shared memory code and distributed memory code.

A design example: SOR (successive over-relaxation).
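For reference, here is a minimal serial sketch (our own illustration, not from the original slides) of the simplified SOR/Jacobi-style sweep that the following slides parallelize; names and boundary handling are illustrative assumptions:

    /* One serial sweep over an n x n mesh: each interior point becomes the
       average of its four neighbors; a full solver would copy temp back into
       grid and repeat until convergence. */
    void sweep_serial(int n, double grid[n][n], double temp[n][n])
    {
        for (int i = 1; i < n - 1; i++)
            for (int j = 1; j < n - 1; j++)
                temp[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                                     grid[i][j-1] + grid[i][j+1]);
    }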

Parallelizing SOR: how to write a shared memory parallel program? Decide how to decompose the computation into parallel parts. Create (and destroy) processes to support that decomposition. Add synchronization to make sure dependences are covered.
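As a concrete illustration of these steps (a sketch of our own, assuming OpenMP rather than whatever threading package the course actually uses): the decomposition is the parallel loop over rows, the threads are created and destroyed by the parallel region, and the implicit barrier at the end of the loop covers the dependence between sweeps.

    #include <omp.h>

    /* Shared memory sweep: all threads read and write the same grid/temp. */
    void sweep_shared(int n, double grid[n][n], double temp[n][n])
    {
        #pragma omp parallel for      /* rows are split among the threads */
        for (int i = 1; i < n - 1; i++)
            for (int j = 1; j < n - 1; j++)
                temp[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                                     grid[i][j-1] + grid[i][j+1]);
        /* implicit barrier here: all rows are updated before the next sweep */
    }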

SOR shared memory program. [Figure: the rows of grid and temp are divided into contiguous blocks assigned to processes p0 through p3, all of which access the same shared arrays.] Does parallelizing SOR with MPI work the same way?

MPI program complication: memory is distributed. [Figure: the logical view of grid and temp versus the physical data structures, where each process holds only its own block of rows.] Physical data structure: each process does not have local access to the boundary data items it needs from its neighbors!

The exact same code does not work: each process needs additional boundary elements. [Figure: each process's local blocks of grid and temp are extended with extra rows that hold copies of the neighboring processes' boundary rows.]
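A sketch of one possible local layout, matching the xlocal array used in the MPI code a few slides below (maxn and the fixed process count here are illustrative assumptions, not values from the slides):

    #define maxn 12                      /* illustrative global mesh size */

    /* With `size` processes each one owns maxn/size rows; rows 0 and
       maxn/size + 1 are extra ghost rows that hold copies of the neighbors'
       boundary rows, so they are read locally but never computed locally. */
    double xlocal[(maxn/4) + 2][maxn];   /* written for size == 4 processes */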

Boundary elements result in communication. [Figure: each process sends its boundary rows of grid to its neighbors' extra ghost rows.]

Communicating boundary elements (with four processes p0 .. p3): processes 0, 1, 2 send their lower row to processes 1, 2, 3, and processes 1, 2, 3 receive their upper boundary row from processes 0, 1, 2. Then processes 1, 2, 3 send their upper row to processes 0, 1, 2, and processes 0, 1, 2 receive their lower boundary row from processes 1, 2, 3.

MPI code for communicating boundary elements (xlocal holds maxn/size owned rows plus the two ghost rows 0 and maxn/size + 1):

    /* Pass boundary rows toward rank+1: send my last owned row to rank+1,
       receive rank-1's last owned row into ghost row 0 */
    if (rank < size - 1)
        MPI_Send( xlocal[maxn/size], maxn, MPI_DOUBLE, rank + 1, 0,
                  MPI_COMM_WORLD );
    if (rank > 0)
        MPI_Recv( xlocal[0], maxn, MPI_DOUBLE, rank - 1, 0,
                  MPI_COMM_WORLD, &status );

    /* Pass boundary rows toward rank-1: send my first owned row to rank-1,
       receive rank+1's first owned row into ghost row maxn/size + 1 */
    if (rank > 0)
        MPI_Send( xlocal[1], maxn, MPI_DOUBLE, rank - 1, 1,
                  MPI_COMM_WORLD );
    if (rank < size - 1)
        MPI_Recv( xlocal[maxn/size+1], maxn, MPI_DOUBLE, rank + 1, 1,
                  MPI_COMM_WORLD, &status );
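Each send must be matched by the neighbor's receive. The same exchange can be written more compactly (an alternative not shown in the original slides) with MPI_Sendrecv, using MPI_PROC_NULL for the missing neighbor at either end, which also avoids relying on MPI buffering the sends:

    int next = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
    int prev = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;

    /* Send my last owned row to rank+1; receive rank-1's last owned row
       into my upper ghost row xlocal[0]. */
    MPI_Sendrecv( xlocal[maxn/size],   maxn, MPI_DOUBLE, next, 0,
                  xlocal[0],           maxn, MPI_DOUBLE, prev, 0,
                  MPI_COMM_WORLD, &status );

    /* Send my first owned row to rank-1; receive rank+1's first owned row
       into my lower ghost row xlocal[maxn/size + 1]. */
    MPI_Sendrecv( xlocal[1],           maxn, MPI_DOUBLE, prev, 1,
                  xlocal[maxn/size+1], maxn, MPI_DOUBLE, next, 1,
                  MPI_COMM_WORLD, &status );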

Now that we have the boundary elements, can we use the same code as in shared memory?

    for( i=from; i<to; i++ )
        for( j=0; j<n; j++ )
            temp[i][j] = 0.25*( grid[i-1][j] + grid[i+1][j] +
                                grid[i][j-1] + grid[i][j+1] );

where, for example, from = myid*25 and to = myid*25 + 25 (25 rows per process). Only if we declare a giant array (holding the whole mesh on each process). If not, we will need to translate the indices.

Index translation: each process now indexes its own local arrays, whose owned rows are 1 .. n/p with ghost rows 0 and n/p + 1.

    for( i=1; i<=n/p; i++ )
        for( j=0; j<n; j++ )
            temp[i][j] = 0.25*( grid[i-1][j] + grid[i+1][j] +
                                grid[i][j-1] + grid[i][j+1] );

All variables are local to each process; we still need the logical mapping from local to global indices!
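For example, a small helper for that logical mapping might look like this (our own sketch, assuming a block-by-rows distribution; the function name is not from the slides):

    /* Global row index corresponding to local row i (1 .. n/p) on process
       `rank`, when the n rows are block-distributed over p processes. */
    int global_row(int rank, int n, int p, int i)
    {
        return rank * (n / p) + (i - 1);
    }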

Tasks for a message passing programmer: divide the program into parallel parts; create and destroy processes to do the above; partition and distribute the data; communicate data at the right time; perform index translation. Is synchronization still needed? Sometimes, but it often goes hand in hand with the data communication.
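Putting the pieces together, a minimal end-to-end sketch of a single iteration (our own illustration, not code from the slides; it assumes maxn is divisible by the number of processes and ignores convergence testing and the true physical boundary values):

    #include <mpi.h>
    #include <string.h>

    #define maxn 12                               /* illustrative mesh size */

    int main(int argc, char *argv[])
    {
        int rank, size, i, j;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);     /* assume size divides maxn */

        int rows = maxn / size;                   /* rows owned by this rank */
        double xlocal[rows + 2][maxn], xnew[rows + 2][maxn];
        memset(xlocal, 0, sizeof xlocal);         /* initial guess: all zeros */

        /* 1. Communicate boundary rows into the ghost rows (as above). */
        if (rank < size - 1)
            MPI_Send(xlocal[rows], maxn, MPI_DOUBLE, rank + 1, 0,
                     MPI_COMM_WORLD);
        if (rank > 0)
            MPI_Recv(xlocal[0], maxn, MPI_DOUBLE, rank - 1, 0,
                     MPI_COMM_WORLD, &status);
        if (rank > 0)
            MPI_Send(xlocal[1], maxn, MPI_DOUBLE, rank - 1, 1,
                     MPI_COMM_WORLD);
        if (rank < size - 1)
            MPI_Recv(xlocal[rows + 1], maxn, MPI_DOUBLE, rank + 1, 1,
                     MPI_COMM_WORLD, &status);

        /* 2. Update the locally owned rows using local (translated) indices. */
        for (i = 1; i <= rows; i++)
            for (j = 1; j < maxn - 1; j++)
                xnew[i][j] = 0.25 * (xlocal[i-1][j] + xlocal[i+1][j] +
                                     xlocal[i][j-1] + xlocal[i][j+1]);

        MPI_Finalize();
        return 0;
    }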