Parallel & Cluster Computing Distributed Cartesian Meshes Paul Gray, University of Northern Iowa David Joiner, Shodor Education Foundation Tom Murphy,

Slides:



Advertisements
Similar presentations
MPI version of the Serial Code With One-Dimensional Decomposition Presented by Timothy H. Kaiser, Ph.D. San Diego Supercomputer Center Presented by Timothy.
Advertisements

Introduction to MPI Programming (Part III)‏ Michael Griffiths, Deniz Savas & Alan Real January 2006.
1 Buffers l When you send data, where does it go? One possibility is: Process 0Process 1 User data Local buffer the network User data Local buffer.
Introduction to Parallel Programming & Cluster Computing Applications and Types of Parallelism Josh Alexander, University of Oklahoma Ivan Babic, Earlham.
MPI – An introduction by Jeroen van Hunen What is MPI and why should we use it? Simple example + some basic MPI functions Other frequently used MPI functions.
Reference: Message Passing Fundamentals.
1 Friday, October 20, 2006 “Work expands to fill the time available for its completion.” -Parkinson’s 1st Law.
Point-to-Point Communication Self Test with solution.
Message Passing Fundamentals Self Test. 1.A shared memory computer has access to: a)the memory of other nodes via a proprietary high- speed communications.
Comp 422: Parallel Programming Lecture 8: Message Passing (MPI)
Efficient Parallelization for AMR MHD Multiphysics Calculations Implementation in AstroBEAR.
Chapter 5, CLR Textbook Algorithms on Grids of Processors.
Topic Overview One-to-All Broadcast and All-to-One Reduction
Supercomputing in Plain English Applications and Types of Parallelism Henry Neeman, Director OU Supercomputing Center for Education & Research University.
A Brief Look At MPI’s Point To Point Communication Brian T. Smith Professor, Department of Computer Science Director, Albuquerque High Performance Computing.
Parallel Programming & Cluster Computing Applications and Types of Parallelism Henry Neeman, University of Oklahoma Charlie Peck, Earlham College Tuesday.
1 What is message passing? l Data transfer plus synchronization l Requires cooperation of sender and receiver l Cooperation not always apparent in code.
CS 179: GPU Programming Lecture 20: Cross-system communication.
1 TRAPEZOIDAL RULE IN MPI Copyright © 2010, Elsevier Inc. All rights Reserved.
Parallel Adaptive Mesh Refinement Combined With Multigrid for a Poisson Equation CRTI RD Project Review Meeting Canadian Meteorological Centre August.
Lecture 29 Fall 2006 Lecture 29: Parallel Programming Overview.
Parallel & Cluster Computing Clusters & Distributed Parallelism Paul Gray, University of Northern Iowa David Joiner, Shodor Education Foundation Tom Murphy,
Parallel & Cluster Computing Overview of Parallelism Henry Neeman, Director OU Supercomputing Center for Education & Research University of Oklahoma SC08.
Parallel & Cluster Computing MPI Basics Paul Gray, University of Northern Iowa David Joiner, Shodor Education Foundation Tom Murphy, Contra Costa College.
Supercomputing in Plain English Supercomputing in Plain English Applications and Types of Parallelism Henry Neeman, Director Director, OU Supercomputing.
Molecular Dynamics Sathish Vadhiyar Courtesy: Dr. David Walker, Cardiff University.
Specialized Sending and Receiving David Monismith CS599 Based upon notes from Chapter 3 of the MPI 3.0 Standard
Parallel Computing A task is broken down into tasks, performed by separate workers or processes Processes interact by exchanging information What do we.
Parallel Programming & Cluster Computing Applications and Types of Parallelism Henry Neeman, Director OU Supercomputing Center for Education & Research.
Parallel & Cluster Computing Monte Carlo Henry Neeman, Director OU Supercomputing Center for Education & Research University of Oklahoma SC08 Education.
Domain Decomposed Parallel Heat Distribution Problem in Two Dimensions Yana Kortsarts Jeff Rufinus Widener University Computer Science Department.
Message Passing Programming Model AMANO, Hideharu Textbook pp. 140-147.
Performance Oriented MPI Jeffrey M. Squyres Andrew Lumsdaine NERSC/LBNL and U. Notre Dame.
Supercomputing in Plain English Applications and Types of Parallelism PRESENTERNAME PRESENTERTITLE PRESENTERDEPARTMENT PRESENTERINSTITUTION DAY MONTH DATE.
Supercomputing in Plain English Supercomputing in Plain English Applications and Types of Parallelism Henry Neeman, Director OU Supercomputing Center for.
Supercomputing in Plain English Applications and Types of Parallelism Henry Neeman, Director OU Supercomputing Center for Education & Research Blue Waters.
Introduction to Parallel Programming & Cluster Computing Applications and Types of Parallelism Joshua Alexander, U Oklahoma Ivan Babic, Earlham College.
Parallel Programming & Cluster Computing Overview of Parallelism Henry Neeman, University of Oklahoma Paul Gray, University of Northern Iowa SC08 Education.
MPI (continue) An example for designing explicit message passing programs Advanced MPI concepts.
Parallel & Cluster Computing N-Body Simulation and Collective Communications Henry Neeman, Director OU Supercomputing Center for Education & Research University.
Parallel & Cluster Computing Instruction Level Parallelism Paul Gray, University of Northern Iowa David Joiner, Shodor Education Foundation Tom Murphy,
Parallel Programming & Cluster Computing MPI Collective Communications Dan Ernst Andrew Fitz Gibbon Tom Murphy Henry Neeman Charlie Peck Stephen Providence.
Supercomputing in Plain English Part VII: Applications and Types of Parallelism Henry Neeman, Director OU Supercomputing Center for Education & Research.
Message Passing and MPI Laxmikant Kale CS Message Passing Program consists of independent processes, –Each running in its own address space –Processors.
MPI Point to Point Communication CDP 1. Message Passing Definitions Application buffer Holds the data for send or receive Handled by the user System buffer.
Parallel Programming & Cluster Computing N-Body Simulation and Collective Communications Henry Neeman, University of Oklahoma Paul Gray, University of.
CS 471 Final Project 2d Advection/Wave Equation Using Fourier Methods December 10, 2003 Jose L. Rodriguez
Project18’s Communication Drawing Design By: Camilo A. Silva BIOinformatics Summer 2008.
MPI Workshop - III Research Staff Cartesian Topologies in MPI and Passing Structures in MPI Week 3 of 3.
Parallel & Cluster Computing Transport Codes and Shifting Henry Neeman, Director OU Supercomputing Center for Education & Research University of Oklahoma.
MPI Send/Receive Blocked/Unblocked Josh Alexander, University of Oklahoma Ivan Babic, Earlham College Andrew Fitz Gibbon, Shodor Education Foundation Inc.
Introduction to Parallel Programming & Cluster Computing MPI Collective Communications Joshua Alexander, U Oklahoma Ivan Babic, Earlham College Michial.
April 24, 2002 Parallel Port Example. April 24, 2002 Introduction The objective of this lecture is to go over a simple problem that illustrates the use.
Unit-8 Sorting Algorithms Prepared By:-H.M.PATEL.
Parallel Computing Presented by Justin Reschke
Parallel Algorithms & Implementations: Data-Parallelism, Asynchronous Communication and Master/Worker Paradigm FDI 2007 Track Q Day 2 – Morning Session.
Parallel Programming & Cluster Computing Monte Carlo Henry Neeman, University of Oklahoma Paul Gray, University of Northern Iowa SC08 Education Program’s.
Pitfalls: Time Dependent Behaviors CS433 Spring 2001 Laxmikant Kale.
Research Staff Passing Structures in MPI Week 3 of 3
Parallel Programming for Wave Equation
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Message Passing Interface (cont.) Topologies.
Saturday November 12 – Tuesday November
Supercomputing in Plain English Applications and Types of Parallelism
Introduction to parallelism and the Message Passing Interface
MPI (continue) An example for designing explicit message passing programs Emphasize on the difference between shared memory code and distributed memory.
Send and Receive.
Introduction to High Performance Computing Lecture 16
MPI (continue) An example for designing explicit message passing programs Emphasize on the difference between shared memory code and distributed memory.
Parallel Programming & Cluster Computing Transport Codes and Shifting
Presentation transcript:

Parallel & Cluster Computing Distributed Cartesian Meshes Paul Gray, University of Northern Iowa David Joiner, Shodor Education Foundation Tom Murphy, Contra Costa College Henry Neeman, University of Oklahoma Charlie Peck, Earlham College National Computational Science Institute August

NCSI Parallel & Cluster Computing OU August Cartesian Coordinates x y

NCSI Parallel & Cluster Computing OU August Structured Mesh A structured mesh is like the mesh on the previous slide. It’s nice and regular and rectangular, and can be stored in a standard Fortran or C array of the appropriate dimension.

NCSI Parallel & Cluster Computing OU August Flow in Structured Meshes When calculating flow in a structured mesh, you typically use a finite difference equation, like so: unew i,j = F(  t, uold i,j, uold i-1,j, uold i+1,j, uold i,j-1, uold i,j+1 ) for some function F, where uold i,j is at time t and unew i,j is at time t +  t. In other words, you calculate the new value of u i,j, based on its old value as well as the old values of its immediate neighbors. Actually, it may use neighbors a few farther away.

NCSI Parallel & Cluster Computing OU August Ghost Zones

NCSI Parallel & Cluster Computing OU August Ghost Zones We want to calculate values in the part of the mesh that we care about, but to do that, we need values on the boundaries. Ghost zones are mesh zones that aren’t really part of the problem domain that we care about, but that hold boundary data for calculating the parts that we do care about.

NCSI Parallel & Cluster Computing OU August Using Ghost Zones A good basic algorithm for flow that uses ghost zones is: DO timestep = 1, number_of_timesteps CALL fill_old_boundary(…) CALL advance_to_new_from_old(…) END DO This approach generally works great on a serial code.

NCSI Parallel & Cluster Computing OU August Ghost Zones in MPI What if you want to parallelize a Cartesian flow code in MPI? You’ll need to: decompose the mesh into submeshes; figure out how each submesh talks to its neighbors.

NCSI Parallel & Cluster Computing OU August Data Decomposition

NCSI Parallel & Cluster Computing OU August Data Decomposition We want to split the data into chunks of equal size, and give each chunk to a processor to work on. Then, each processor can work independently of all of the others, except when it’s exchanging boundary data with its neighbors.

NCSI Parallel & Cluster Computing OU August MPI_Cart_* MPI supports exactly this kind of calculation, with a set of functions MPI_Cart_* : MPI_Cart_create, MPI_Cart_coords, MPI_Cart_shift These routines create and describe a new communicator, one that replaces MPI_COMM_WORLD in your code.

NCSI Parallel & Cluster Computing OU August MPI_Sendrecv MPI_Sendrecv is just like an MPI_Send followed by an MPI_Recv, except that it’s much better than that. With MPI_Send and MPI_Recv, these are your choices: Everyone calls MPI_Recv, and then everyone calls MPI_Send. Everyone calls MPI_Send, and then everyone calls MPI_Recv. Some call MPI_Send while others call MPI_Recv, and then they swap roles.

NCSI Parallel & Cluster Computing OU August Why MPI_Sendrecv ? Suppose that everyone calls MPI_Recv, and then everyone calls MPI_Send. Well, these routines are synchronous (also called blocking), meaning that the communication has to complete before the process can continue on farther into the program. That means that, when everyone calls MPI_Recv, they’re waiting for someone else to call MPI_Send. We call this deadlock. Officially, the MPI standard forbids this approach.

NCSI Parallel & Cluster Computing OU August Why MPI_Sendrecv ? Suppose that everyone calls MPI_Send, and then everyone calls MPI_Recv. Well, this will only work if there’s enough buffer space available to hold everyone’s messages until after everyone is done sending. Sometimes, there isn’t enough buffer space. Officially, the MPI standard allows MPI implementers to support this, but it’s not part of the official MPI standard; that is, a particular MPI implementation doesn’t have to allow it.

NCSI Parallel & Cluster Computing OU August Why MPI_Sendrecv ? Suppose that some processors call MPI_Send while others call MPI_Recv, and then they swap roles. This will work, and is sometimes used, but it can be a pain in the rear end to manage.

NCSI Parallel & Cluster Computing OU August Why MPI_Sendrecv ? MPI_Sendrecv allows each processor to simultaneously send to one processor and receive from another. For example, P 1 could send to P 0 while simultaneously receiving from P 2. This is exactly what we need in Cartesian flow: we want the boundary information to come in from the east while we send boundary information out to the west, and then vice versa. These are called shifts.

NCSI Parallel & Cluster Computing OU August MPI_Sendrecv MPI_Sendrecv( westward_send_buffer, westward_send_size, MPI_REAL, west_neighbor_process, westward_tag, westward_recv_buffer, westward_recv_size, MPI_REAL, east_neighbor_process, westward_tag, cartesian_communicator, ok_mpi_status); This call sends to west_neighbor_process the data in westward_send_buffer, and at the same time receives from east_neighbor_process a bunch of data that end up in westward_recv_buffer.

NCSI Parallel & Cluster Computing OU August Why MPI_Sendrecv ? The advantage of MPI_Sendrecv is that it allows us the luxury of no longer having to worry about who should send when and who should receive when. This is exactly what we need in Cartesian flow: we want the boundary information to come in from the east while we send boundary information out to the west – without us having to worry about deciding who should do what to who when.

NCSI Parallel & Cluster Computing OU August MPI_Sendrecv Concept in Principle Concept in practice

NCSI Parallel & Cluster Computing OU August MPI_Sendrecv Concept in practice westward_send_buffer westward_recv_buffer Actual Implementation