CS4961 Parallel Programming Lecture 17: Message Passing, cont

Slides:

Advertisements

Similar presentations

MPI Message Passing Interface

Advertisements

The Building Blocks: Send and Receive Operations

Sahalu Junaidu ICS 573: High Performance Computing 8.1 Topic Overview Matrix-Matrix Multiplication Block Matrix Operations A Simple Parallel Matrix-Matrix.

CS 240A: Models of parallel programming: Distributed memory and MPI.

Message-Passing Programming and MPI CS 524 – High-Performance Computing.

Distributed Memory Programming with MPI. What is MPI? Message Passing Interface (MPI) is an industry standard message passing system designed to be both.

MPI Point-to-Point Communication CS 524 – High-Performance Computing.

1 Tuesday, October 10, 2006 To err is human, and to blame it on a computer is even more so. -Robert Orben.

Sahalu JunaiduICS 573: High Performance Computing6.1 Programming Using the Message Passing Paradigm Principles of Message-Passing Programming The Building.

Parallel Programming with Java

CS 179: GPU Programming Lecture 20: Cross-system communication.

Lecture 4: Parallel Programming Models. Parallel Programming Models Parallel Programming Models: Data parallelism / Task parallelism Explicit parallelism.

A Message Passing Standard for MPP and Workstations Communications of the ACM, July 1996 J.J. Dongarra, S.W. Otto, M. Snir, and D.W. Walker.

1 MPI: Message-Passing Interface Chapter 2. 2 MPI - (Message Passing Interface) Message passing library standard (MPI) is developed by group of academics.

Parallel Computing A task is broken down into tasks, performed by separate workers or processes Processes interact by exchanging information What do we.

Parallel Programming with MPI Prof. Sivarama Dandamudi School of Computer Science Carleton University.

Message Passing Programming with MPI Introduction to MPI Basic MPI functions Most of the MPI materials are obtained from William Gropp and Rusty Lusk’s.

Message Passing Programming Model AMANO, Hideharu Textbook pp. １４０－１４７.

Performance Oriented MPI Jeffrey M. Squyres Andrew Lumsdaine NERSC/LBNL and U. Notre Dame.

Summary of MPI commands Luis Basurto. Large scale systems Shared Memory systems – Memory is shared among processors Distributed memory systems – Each.

MPI Introduction to MPI Commands. Basics – Send and Receive MPI is a message passing environment. The processors’ method of sharing information is NOT.

L20: Introduction to “Irregular” Algorithms November 9, 2010.

MPI Send/Receive Blocked/Unblocked Tom Murphy Director of Contra Costa College High Performance Computing Center Message Passing Interface BWUPEP2011,

1 Overview on Send And Receive routines in MPI Kamyar Miremadi November 2004.

11/04/2010CS4961 CS4961 Parallel Programming Lecture 19: Message Passing, cont. Mary Hall November 4,

Parallel Programming with MPI By, Santosh K Jena..

CSCI-455/522 Introduction to High Performance Computing Lecture 4.

1 Message Passing Models CEG 4131 Computer Architecture III Miodrag Bolic.

Message-Passing Computing Chapter 2. Programming Multicomputer Design special parallel programming language –Occam Extend existing language to handle.

MPI Point to Point Communication CDP 1. Message Passing Definitions Application buffer Holds the data for send or receive Handled by the user System buffer.

L17: MPI, cont. October 25, Final Project Purpose: -A chance to dig in deeper into a parallel programming model and explore concepts. -Research.

Introduction to Parallel Programming at MCSR Message Passing Computing –Processes coordinate and communicate results via calls to message passing library.

MPI Send/Receive Blocked/Unblocked Josh Alexander, University of Oklahoma Ivan Babic, Earlham College Andrew Fitz Gibbon, Shodor Education Foundation Inc.

Parallel Algorithms & Implementations: Data-Parallelism, Asynchronous Communication and Master/Worker Paradigm FDI 2007 Track Q Day 2 – Morning Session.

Message Passing Interface Using resources from

COMP7330/7336 Advanced Parallel and Distributed Computing MPI Programming: 1. Collective Operations 2. Overlapping Communication with Computation Dr. Xiao.

L18: MPI, cont. November 10, Administrative Class cancelled, Tuesday, November 15 Guest Lecture, Thursday, November 17, Ganesh Gopalakrishnan CUDA.

1 MPI: Message Passing Interface Prabhaker Mateti Wright State University.

3/12/2013Computer Engg, IIT(BHU)1 MPI-2. POINT-TO-POINT COMMUNICATION Communication between 2 and only 2 processes. One sending and one receiving. Types:

Introduction to parallel computing concepts and technics

Definition of Distributed System

Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Message Passing Interface (cont.) Topologies.

Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Principles of Message-Passing Programming.

MPI Point to Point Communication

Introduction to MPI.

MPI Message Passing Interface

CS4961 Parallel Programming Lecture 18: Introduction to Message Passing Mary Hall November 2, /02/2010 CS4961.

Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing Principles of Message-Passing Programming.

Parallel Programming with MPI and OpenMP

Advanced Operating Systems

CMSC 611: Advanced Computer Architecture

CS4961 Parallel Programming Lecture 16: Introduction to Message Passing Mary Hall November 3, /03/2011 CS4961.

Programming Using the Message Passing Model

MPI-Message Passing Interface

Message Passing Models

Distributed Systems CS

CS 5334/4390 Spring 2017 Rogelio Long

Lecture 14: Inter-process Communication

A Message Passing Standard for MPP and Workstations

MPI: Message Passing Interface

May 19 Lecture Outline Introduce MPI functionality

CSCE569 Parallel Computing

Introduction to parallelism and the Message Passing Interface

Send and Receive.

NOTE: FOR CLASS USE MPICH Version of MPI!!

5- Message-Passing Programming

September 4, 1997 Parallel Processing (CS 730) Lecture 9: Advanced Point to Point Communication Jeremy R. Johnson *Parts of this lecture was derived.

Parallel Processing - MPI

MPI Message Passing Interface

Programming Parallel Computers

Presentation transcript:

CS4961 Parallel Programming Lecture 17: Message Passing, cont CS4961 Parallel Programming Lecture 17: Message Passing, cont. Introduction to CUDA Mary Hall November 3, 2009 11/03/2009 CS4961

Administrative Homework assignment 3 Due, Thursday, November 5 before class Use the “handin” program on the CADE machines Use the following command: “handin cs4961 hw3 <gzipped tar file>” OMIT VTUNE PORTION but do Problem 3 as homework. CADE Lab working on VTUNE installation. You can turn it in later for EXTRA CREDIT! Mailing list set up: cs4961@list.eng.utah.edu 11/03/2009 CS4961

A Few Words About Final Project Purpose: A chance to dig in deeper into a parallel programming model and explore concepts. Present results to work on communication of technical ideas Write a non-trivial parallel program that combines two parallel programming languages/models. In some cases, just do two separate implementations. OpenMP + SSE-3 TBB + SSE-3 MPI + OpenMP MPI + SSE-3 MPI + CUDA Open CL??? (keep it simple! need backup plan) Present results in a poster session on the last day of class 11/03/2009 CS4961

Example Projects Look in the textbook or on-line Recall Red/Blue from Ch. 4 Implement in MPI (+ SSE-3) Implement main computation in CUDA Algorithms from Ch. 5 SOR from Ch. 7 CUDA implementation? FFT from Ch. 10 Jacobi from Ch. 10 Graph algorithms Image and signal processing algorithms Other domains… 11/03/2009 CS4961

Next Thursday, November 12 Turn in <1 page project proposal Algorithm to be implemented Programming model(s) Validation and measurement plan 11/03/2009 CS4961

Today’s Lecture More Message Passing, largely for distributed memory Message Passing Interface (MPI): a Local View language Sources for this lecture Larry Snyder, http://www.cs.washington.edu/education/courses/ 524/08wi/ Online MPI tutorial http://www- unix.mcs.anl.gov/mpi/tutorial/gropp/talk.html Vivek Sarkar, Rice University, COMP 422, F08 http://www.cs.rice.edu/~vs3/comp422/lecture -notes/index.html 11/03/2009 CS4961

MPI Critique from Last Time (Snyder) Message passing is a very simple model Extremely low level; heavy weight Expense comes from λ and lots of local code Communication code is often more than half Tough to make adaptable and flexible Tough to get right and know it Tough to make perform in some (Snyder says most) cases Programming model of choice for scalability Widespread adoption due to portability, although not completely true in practice 11/03/2009 CS4961

Today’s MPI Focus Blocking communication Non-blocking Overhead Deadlock? Non-blocking One-sided communication 11/03/2009 CS4961

MPI-1 MPI is a message-passing library interface standard. Specification, not implementation Library, not a language Classical message-passing programming model MPI was defined (1994) by a broadly-based group of parallel computer vendors, computer scientists, and applications developers. 2-year intensive process Implementations appeared quickly and now MPI is taken for granted as vendor-supported software on any parallel machine. Free, portable implementations exist for clusters and other environments (MPICH2, Open MPI) 11/03/2009 CS4961

MPI-2 Same process of definition by MPI Forum MPI-2 is an extension of MPI – Extends the message- passing model. Parallel I/O Remote memory operations (one-sided) Dynamic process management Adds other functionality C++ and Fortran 90 bindings similar to original C and Fortran-77 bindings External interfaces Language interoperability MPI interaction with threads 11/03/2009 CS4961

Non-Buffered vs. Buffered Sends A simple method for forcing send/receive semantics is for the send operation to return only when it is safe to do so. In the non-buffered blocking send, the operation does not return until the matching receive has been encountered at the receiving process. Idling and deadlocks are major issues with non- buffered blocking sends. In buffered blocking sends, the sender simply copies the data into the designated buffer and returns after the copy operation has been completed. The data is copied at a buffer at the receiving end as well. Buffering alleviates idling at the expense of copying overheads. 11/03/2009 CS4961

Non-Blocking Communication The programmer must ensure semantics of the send and receive. This class of non-blocking protocols returns from the send or receive operation before it is semantically safe to do so. Non-blocking operations are generally accompanied by a check-status operation. When used correctly, these primitives are capable of overlapping communication overheads with useful computations. Message passing libraries typically provide both blocking and non-blocking primitives. 11/03/2009 CS4961

Deadlock? int a[10], b[10], myrank; MPI_Status status; ... MPI_Comm_rank(MPI_COMM_WORLD, &myrank); if (myrank == 0) { MPI_Send(a, 10, MPI_INT, 1, 1, MPI_COMM_WORLD); MPI_Send(b, 10, MPI_INT, 1, 2, MPI_COMM_WORLD); } else if (myrank == 1) { MPI_Recv(b, 10, MPI_INT, 0, 2, MPI_COMM_WORLD); MPI_Recv(a, 10, MPI_INT, 0, 1, MPI_COMM_WORLD); } ... 11/03/2009 CS4961

Deadlock? Consider the following piece of code, in which process i sends a message to process i + 1 (modulo the number of processes) and receives a message from process i - 1 (module the number of processes). int a[10], b[10], npes, myrank; MPI_Status status; ... MPI_Comm_size(MPI_COMM_WORLD, &npes); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); MPI_Send(a, 10, MPI_INT, (myrank+1)%npes, 1, MPI_COMM_WORLD); MPI_Recv(b, 10, MPI_INT, (myrank-1+npes)%npes, 1, MPI_COMM_WORLD); ... 11/03/2009 CS4961

Non-Blocking Communication To overlap communication with computation, MPI provides a pair of functions for performing non- blocking send and receive operations (“I” stands for “Immediate”): int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request) int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request) These operations return before the operations have been completed. Function MPI_Test tests whether or not the non- blocking send or receive operation identified by its request has finished. int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status) MPI_Wait waits for the operation to complete. int MPI_Wait(MPI_Request *request, MPI_Status *status) 11/03/2009 CS4961

One-Sided Communication 11/03/2009 CS4961

MPI One-Sided Communication or Remote Memory Access (RMA) Goals of MPI-2 RMA Design Balancing efficiency and portability across a wide class of architectures shared-memory multiprocessors NUMA architectures distributed-memory MPP’s, clusters Workstation networks Retaining “look and feel” of MPI-1 Dealing with subtle memory behavior issues: cache coherence, sequential consistency 11/03/2009 CS4961

MPI Constructs supporting One-Sided Communication (RMA) MPI_Win_create exposes local memory to RMA operation by other processes in a communicator Collective operation Creates window object MPI_Win_free deallocates window object MPI_Put moves data from local memory to remote memory MPI_Get retrieves data from remote memory into local memory MPI_Accumulate updates remote memory using local values 03/2009 CS4961