Four Variations of Matrix Multiplication
About 30 minutes – by Mike

Matrix Multiplication Challenge
See:
– Look under Concurrency Education
– Look at MPI teaching resources
Many of these resources are due to Simone Atzeni; many are due to Geof Sawaya
– All the examples from Pacheco's MPI book!
– These will soon be available as projects within our ISP Eclipse GUI!
The Matrix Challenge is due to Steve Siegel
– Steve's challenge is based on an example from the book Using MPI: Portable Parallel Programming with the Message-Passing Interface by William Gropp, Ewing Lusk, and Anthony Skjellum

Matrix Multiplication Illustration
For this tutorial, we include four variants; they try various versions of mat-mult
One of the variants is buggy
They also reveal one definite future work item:
– Detect symmetries in MPI programs
– Avoid redundant searches
– This is very apparent when you run our fourth version

Example of MPI Code (Mat-Mat Mult)
[Figure: matrix multiplication (A x B = C), with the data movement between master and slaves labeled MPI_Send and MPI_Recv]

Salient Code Features

if (myid == master) {
    ...
    MPI_Bcast(b, brows*bcols, MPI_FLOAT, master, ...);
    ...
} else {   /* all slaves do this */
    ...
    MPI_Bcast(b, brows*bcols, MPI_FLOAT, master, ...);
    ...
}
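Since the slide elides the surrounding program, here is a minimal, self-contained sketch of the same point (the sizes brows = bcols = 4 and the data values are assumptions for illustration, not the tutorial's code): every rank, master and slaves alike, issues the identical MPI_Bcast with the same root.

/* Minimal sketch: every rank calls MPI_Bcast with the same root rank. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int myid, numprocs;
    const int master = 0;
    const int brows = 4, bcols = 4;              /* assumed sizes, for illustration */
    float b[16];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (myid == master) {
        for (int i = 0; i < brows * bcols; i++) b[i] = (float)i;   /* master fills b */
    }
    /* Both the master branch and the slave branch execute the same collective. */
    MPI_Bcast(b, brows * bcols, MPI_FLOAT, master, MPI_COMM_WORLD);

    printf("rank %d sees b[0] = %f\n", myid, b[0]);
    MPI_Finalize();
    return 0;
}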

Salient Code Features (continued)

if (myid == master) {
    ...
    for (i = 0; i < numprocs-1; i++) {
        for (j = 0; j < acols; j++) { buffer[j] = a[i*acols+j]; }
        MPI_Send(buffer, acols, MPI_FLOAT, i+1, ...);   /* blocks till buffer is copied into a system buffer */
        numsent++;
    }
    ...
} else {   /* slaves */
    ...
    while (1) {
        ...
        MPI_Recv(buffer, acols, MPI_FLOAT, master, ...);
        ...
    }
}
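As a hedged illustration of this send/receive pattern (not the tutorial's source; the size ACOLS, the number of rows, and the tag are assumptions), the following small program runs as written with a handful of processes (e.g. mpirun -np 4): the master copies each row into buffer and MPI_Sends it to slave i+1, while each slave blocks in MPI_Recv until its row arrives.

/* Sketch: master distributes one row of a to each slave; MPI_Send may block
   until the message is buffered or matched.  Assumes at most 16 slaves. */
#include <mpi.h>
#include <stdio.h>

#define ACOLS 4

int main(int argc, char **argv) {
    int myid, numprocs;
    const int master = 0;
    float buffer[ACOLS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (myid == master) {
        float a[ACOLS * 16];                          /* 16 rows, enough for the slaves */
        for (int i = 0; i < ACOLS * 16; i++) a[i] = (float)i;
        for (int i = 0; i < numprocs - 1; i++) {      /* row i goes to slave i+1 */
            for (int j = 0; j < ACOLS; j++) buffer[j] = a[i * ACOLS + j];
            MPI_Send(buffer, ACOLS, MPI_FLOAT, i + 1, 0, MPI_COMM_WORLD);
        }
    } else {
        MPI_Recv(buffer, ACOLS, MPI_FLOAT, master, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("slave %d got a row starting with %f\n", myid, buffer[0]);
    }
    MPI_Finalize();
    return 0;
}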

Handling Rows >> Processors ...
[Figure: rows streamed out with MPI_Send, results collected with MPI_Recv]
Send the next row to the first slave, which by now must be free

Handling Rows >> Processors ...
[Figure: same row-streaming picture as the previous slide]
OR: send the next row to the first slave that returns an answer!

Optimization

if (myid == master) {
    ...
    for (i = 0; i < crows; i++) {
        MPI_Recv(ans, ccols, MPI_FLOAT, FROM ANYBODY, ...);
        ...
        if (numsent < arows) {
            for (j = 0; j < acols; j++) { buffer[j] = a[numsent*acols+j]; }
            MPI_Send(buffer, acols, MPI_FLOAT, BACK TO THAT BODY, ...);
            numsent++;
        }
        ...
    }
}

Optimization (continued)
Same code as the previous slide, with the annotation: this shows that wildcard receives can arise quite naturally ...
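A minimal sketch of that wildcard pattern (names, tags, and payloads are assumed for illustration, not the tutorial's code): the master receives with MPI_ANY_SOURCE, learns who answered from status.MPI_SOURCE, and sends the next piece of work back to that body.

/* Sketch: receive "FROM ANYBODY", reply "BACK TO THAT BODY". */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int myid, numprocs;
    const int master = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (myid == master) {
        for (int i = 0; i < numprocs - 1; i++) {
            float ans;
            MPI_Status status;
            MPI_Recv(&ans, 1, MPI_FLOAT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &status);               /* "FROM ANYBODY" */
            float next = ans + 1.0f;
            MPI_Send(&next, 1, MPI_FLOAT, status.MPI_SOURCE, /* "BACK TO THAT BODY" */
                     0, MPI_COMM_WORLD);
        }
    } else {
        float mine = (float)myid, reply;
        MPI_Send(&mine, 1, MPI_FLOAT, master, 0, MPI_COMM_WORLD);
        MPI_Recv(&reply, 1, MPI_FLOAT, master, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank %d got reply %f\n", myid, reply);
    }
    MPI_Finalize();
    return 0;
}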

Further Optimization

if (myid == master) {
    ...
    for (i = 0; i < crows; i++) {
        MPI_Recv(ans, ccols, MPI_FLOAT, FROM ANYBODY, ...);
        ...
        if (numsent < arows) {
            for (j = 0; j < acols; j++) { buffer[j] = a[numsent*acols+j]; }
            /* ... here, WAIT for the previous Isend to finish (software pipelining) ... */
            MPI_Isend(buffer, acols, MPI_FLOAT, BACK TO THAT BODY, ...);
            numsent++;
        }
        ...
    }
}
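A small sketch of the software-pipelining idea under stated assumptions (message size, peer rank, tag, and round count are illustrative; run with at least 2 processes, e.g. mpirun -np 2): wait on the previous request before overwriting the send buffer, then post the next MPI_Isend so communication can overlap with the work of the next round.

/* Sketch: MPI_Wait on the PREVIOUS Isend before reusing the buffer. */
#include <mpi.h>
#include <stdio.h>

#define N 4

int main(int argc, char **argv) {
    int myid;
    float buffer[N];
    MPI_Request req = MPI_REQUEST_NULL;   /* MPI_Wait on a null request returns at once */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    if (myid == 0) {                      /* sender: pipeline several sends */
        for (int round = 0; round < 3; round++) {
            MPI_Wait(&req, MPI_STATUS_IGNORE);      /* wait for the previous Isend */
            for (int j = 0; j < N; j++) buffer[j] = (float)(round * N + j);
            MPI_Isend(buffer, N, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, &req);
        }
        MPI_Wait(&req, MPI_STATUS_IGNORE);          /* drain the last one */
    } else if (myid == 1) {                         /* receiver */
        for (int round = 0; round < 3; round++) {
            MPI_Recv(buffer, N, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("round %d: first element %f\n", round, buffer[0]);
        }
    }
    MPI_Finalize();
    return 0;
}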

Summary of Some MPI Commands
MPI_Irecv(source, msg_buf, req_struct, ...) is a non-blocking receive call
MPI_Wait(req_struct) awaits its completion
The source could be a "wildcard" (* or ANY_SOURCE): receive from any eligible (matching) sender
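For completeness, a tiny sketch showing these calls with their full MPI signatures (the names msg_buf and req_struct mirror the slide; the tag and data values are assumptions): MPI_Irecv posts a non-blocking receive from MPI_ANY_SOURCE, other work could overlap with it, and MPI_Wait awaits its completion.

/* Sketch: non-blocking wildcard receive completed by MPI_Wait. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int myid, numprocs;
    float msg_buf = 0.0f;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    if (myid == 0 && numprocs > 1) {
        MPI_Request req_struct;
        MPI_Status status;
        MPI_Irecv(&msg_buf, 1, MPI_FLOAT, MPI_ANY_SOURCE, 0,
                  MPI_COMM_WORLD, &req_struct);     /* non-blocking receive */
        /* ... other work could overlap with the receive here ... */
        MPI_Wait(&req_struct, &status);             /* awaits completion */
        printf("got %f from rank %d\n", msg_buf, status.MPI_SOURCE);
    } else if (myid == 1) {
        float val = 42.0f;
        MPI_Send(&val, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}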

End of E