
CS6235 L17: Design Review and 6-Function MPI

L17: DRs and MPI 2 CS6235 Administrative
Organick Lecture: TONIGHT
- David Shaw, “Watching Proteins Dance: Molecular Simulation and the Future of Drug Design”, 220 Skaggs Biology; reception at 6:15, talk at 7:00 PM
- Round-table with Shaw in the Large Conference Room (MEB 3147) beginning TODAY at 3:30 PM (refreshments!)
- Technical talk TOMORROW: “Anton: A Special-Purpose Machine That Achieves a Hundred-Fold Speedup in Biomolecular Simulations”, 104 WEB; reception at 11:50, talk at 12:15 PM

L17: DRs and MPI 3 CS6235 Design Reviews
Goal is to see a solid plan for each project and make sure projects are on track
- Plan to evolve the project so that results are guaranteed
- Show that at least one thing is working
- Explain how work is being divided among team members
Major suggestions from proposals
- Project complexity: break it down into smaller chunks with an evolutionary strategy
- Add references: what has been done before? Known algorithm? GPU implementation?

L17: DRs and MPI 4 CS6235 Design Reviews
Oral, 10-minute Q&A session (April 4 in class, plus office hours if needed)
- Each team member presents one part
- Team should identify a “lead” to present the plan
Three major parts:
I. Overview
- Define the computation and its high-level mapping to the GPU
II. Project Plan
- The pieces and who is doing what
- What is done so far? (Make sure something is working by the design review)
III. Related Work
- Prior sequential or parallel algorithms/implementations
- Prior GPU implementations (or similar computations)
Submit slides and a written document revising the proposal that covers these parts and cleans up anything missing from the proposal.

L17: DRs and MPI 5 CS6235 Final Project Presentation
Dry run on April 18
- Easels, tape and poster board provided
- Tape a set of PowerPoint slides to a standard 2’x3’ poster, or bring your own poster
Poster session during class on April 23
- Invite your friends, profs who helped you, etc.
Final report on projects due May 4
- Submit code
- And a written document, roughly 10 pages, based on the earlier submission
- In addition to the original proposal, include:
  - Project plan and how the work was decomposed (from DR)
  - Description of the CUDA implementation
  - Performance measurement
  - Related work (from DR)

L17: DRs and MPI 6 CS6235 Let’s Talk about Demos
For some of you, with very visual projects, I encourage you to think about demos for the poster session
- This is not a requirement, just something that would enhance the poster session
Realistic?
- I know everyone’s laptops are slow …
- … and don’t have enough memory to solve very large problems
Creative suggestions?
- Movies captured from a run on a larger system

L17: DRs and MPI 7 CS6235 Message Passing and MPI
Message passing is the principal alternative to shared-memory parallel programming, and the predominant programming model for supercomputers and clusters
- Portable
- Low-level, but universal and matches earlier hardware execution models
What it is
- A library used within conventional sequential languages (Fortran, C, C++)
- Based on Single Program, Multiple Data (SPMD)
- Isolation of separate address spaces
  + no data races, but communication errors possible
  + exposes the execution model and forces the programmer to think about locality, both good for performance
  - complexity and code growth!
Like OpenMP, MPI arose as a standard to replace a large number of proprietary message-passing libraries.

L17: DRs and MPI 8 CS6235 Message Passing Library Features
All communication and synchronization require subroutine calls
- No shared variables
- The program runs on a single processor just like any uniprocessor program, except for calls to the message passing library
Subroutines for
- Communication
  - Pairwise or point-to-point: a message is sent from a specific sending process (point a) to a specific receiving process (point b)
  - Collectives involving multiple processors
    – Move data: Broadcast, Scatter/Gather
    – Compute and move: Reduce, AllReduce
- Synchronization
  - Barrier
  - No locks, because there are no shared variables to protect
- Queries
  - How many processes? Which one am I? Any messages waiting?

L17: DRs and MPI 9 CS6235 MPI References
The Standard itself:
- All MPI official releases, in both PostScript and HTML (hosted by the MPI Forum)
Other information on the Web:
- Pointers to lots of stuff, including other talks and tutorials, a FAQ, and other MPI pages
Slide source: Bill Gropp

L17: DRs and MPI 10 CS6235 Compilation
mpicc -g -Wall -o mpi_hello mpi_hello.c
- mpicc: wrapper script to compile
- -g: produce debugging information
- -Wall: turn on all warnings
- -o mpi_hello: create this executable file name (as opposed to the default a.out)
- mpi_hello.c: source file
Copyright © 2010, Elsevier Inc. All rights Reserved

L17: DRs and MPI 11 CS6235 Execution
mpiexec -n <number of processes> <executable>
- mpiexec -n 1 ./mpi_hello    (run with 1 process)
- mpiexec -n 4 ./mpi_hello    (run with 4 processes)
Copyright © 2010, Elsevier Inc. All rights Reserved

L17: DRs and MPI 12 CS6235 Hello (C)
#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    printf( "Greetings from process %d of %d\n", rank, size );
    MPI_Finalize();
    return 0;
}
Slide source: Bill Gropp

L17: DRs and MPI 13 CS6235 Hello (C++)
#include "mpi.h"
#include <iostream>

int main( int argc, char *argv[] )
{
    int rank, size;
    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();
    size = MPI::COMM_WORLD.Get_size();
    std::cout << "Greetings from process " << rank << " of " << size << "\n";
    MPI::Finalize();
    return 0;
}
Slide source: Bill Gropp

L17: DRs and MPI 14 CS6235 Execution
mpiexec -n 1 ./mpi_hello
Greetings from process 0 of 1 !

mpiexec -n 4 ./mpi_hello
Greetings from process 0 of 4 !
Greetings from process 1 of 4 !
Greetings from process 2 of 4 !
Greetings from process 3 of 4 !
Copyright © 2010, Elsevier Inc. All rights Reserved

L17: DRs and MPI 15 CS6235 MPI Components
MPI_Init
- Tells MPI to do all the necessary setup
MPI_Finalize
- Tells MPI we’re done, so clean up anything allocated for this program
Copyright © 2010, Elsevier Inc. All rights Reserved
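For reference, the C prototypes of these two calls are the standard ones from the MPI specification (they are not spelled out on the slide itself):

int MPI_Init( int *argc, char ***argv );   /* pass pointers to main's argc/argv */
int MPI_Finalize( void );                  /* no MPI calls are allowed after this returns */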

L17: DRs and MPI 16 CS6235 Basic Outline
[Figure not captured in this transcript: the skeleton of an MPI program]
Copyright © 2010, Elsevier Inc. All rights Reserved
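Since the original figure is missing, here is a minimal sketch of the skeleton it describes, assuming the usual Init/Finalize structure used throughout these slides:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    /* No MPI calls before this point */
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes are there? */

    /* ... per-process work and communication go here ... */

    MPI_Finalize();
    /* No MPI calls after this point */
    return 0;
}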

L17: DRs and MPI 17 CS6235 MPI Basic (Blocking) Send
MPI_SEND(start, count, datatype, dest, tag, comm)
- The message buffer is described by (start, count, datatype).
- The target process is specified by dest, which is the rank of the target process in the communicator specified by comm.
- When this function returns, the data has been delivered to the system and the buffer can be reused. The message may not yet have been received by the target process.
Example (two processes):
  Process 0: A(10)    MPI_Send( A, 10, MPI_DOUBLE, 1, … )
  Process 1: B(20)    MPI_Recv( B, 20, MPI_DOUBLE, 0, … )
Slide source: Bill Gropp

L17: DRs and MPI 18 CS6235 MPI Basic (Blocking) Receive
MPI_RECV(start, count, datatype, source, tag, comm, status)
- Waits until a matching (both source and tag) message is received from the system, and then the buffer can be used
- source is a rank in the communicator specified by comm, or MPI_ANY_SOURCE
- tag is a tag to be matched, or MPI_ANY_TAG
- Receiving fewer than count occurrences of datatype is OK, but receiving more is an error
- status contains further information (e.g., size of message)
Example (two processes):
  Process 0: A(10)    MPI_Send( A, 10, MPI_DOUBLE, 1, … )
  Process 1: B(20)    MPI_Recv( B, 20, MPI_DOUBLE, 0, … )
Slide source: Bill Gropp
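Putting the send and receive slides together, here is a small sketch that is not from the original slides: the calls and constants (MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_Get_count, status.MPI_SOURCE, status.MPI_TAG) are standard MPI, while the buffer sizes and the tag value 99 are illustrative assumptions.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;
    double A[10], B[20];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (int i = 0; i < 10; i++) A[i] = i;          /* fill the send buffer */
        MPI_Send(A, 10, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        /* Receive up to 20 doubles; receiving fewer is fine, more would be an error */
        MPI_Recv(B, 20, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        int received;
        MPI_Get_count(&status, MPI_DOUBLE, &received);  /* how many actually arrived */
        printf("Got %d doubles from rank %d with tag %d\n",
               received, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}

Run it with at least two processes, e.g. mpiexec -n 2 ./a.out.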

L17: DRs and MPI 19 CS6235 MPI Datatypes
The data in a message to send or receive is described by a triple (address, count, datatype), where an MPI datatype is recursively defined as:
- predefined, corresponding to a data type from the language (e.g., MPI_INT, MPI_DOUBLE)
- a contiguous array of MPI datatypes
- a strided block of datatypes
- an indexed array of blocks of datatypes
- an arbitrary structure of datatypes
There are MPI functions to construct custom datatypes, in particular ones for subarrays.
Slide source: Bill Gropp
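As an illustration that is not on the original slide, a strided block can be described with MPI_Type_vector; the matrix dimensions below are arbitrary assumptions, but the calls themselves are standard MPI:

#include <mpi.h>

/* Describe one column of a 4x5 row-major matrix of doubles:
   4 blocks, 1 element per block, stride of 5 elements between blocks. */
void send_column(double matrix[4][5], int col, int dest)
{
    MPI_Datatype column_t;
    MPI_Type_vector(4, 1, 5, MPI_DOUBLE, &column_t);
    MPI_Type_commit(&column_t);

    /* Send the chosen column as a single message of one column_t element */
    MPI_Send(&matrix[0][col], 1, column_t, dest, 0, MPI_COMM_WORLD);

    MPI_Type_free(&column_t);
}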

L17: DRs and MPI 20 CS6235 A Simple MPI Program
#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[])
{
    int rank, buf;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    /* Process 0 sends and Process 1 receives */
    if (rank == 0) {
        buf = 123456;   /* value elided in the transcript; any integer payload works */
        MPI_Send( &buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1) {
        MPI_Recv( &buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status );
        printf( "Received %d\n", buf );
    }

    MPI_Finalize();
    return 0;
}
Slide source: Bill Gropp

L17: DRs and MPI 21 CS6235 Six-Function MPI
Most commonly used constructs
- A decade or more ago, almost all supercomputer programs used only these:
  - MPI_Init
  - MPI_Finalize
  - MPI_Comm_size
  - MPI_Comm_rank
  - MPI_Send
  - MPI_Recv
Also very useful
- MPI_Reduce and other collectives
Other features of MPI
- Task parallel constructs
- Optimized communication: non-blocking, one-sided

L17: DRs and MPI 22 CS6235 MPI_Reduce
[Figure not captured in this transcript]
Copyright © 2010, Elsevier Inc. All rights Reserved
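The slide’s figure is missing from the transcript; the standard MPI_Reduce signature and a small sketch of its use (the local/global variable names are just illustrative) look like this:

#include <stdio.h>
#include <mpi.h>

/* int MPI_Reduce(const void *sendbuf, void *recvbuf, int count,
                  MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm); */

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local = rank + 1;   /* each process contributes one value */
    int global = 0;

    /* Sum every process's 'local' into 'global' on rank 0 */
    MPI_Reduce(&local, &global, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of 1..%d = %d\n", size, global);

    MPI_Finalize();
    return 0;
}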

L17: DRs and MPI 23 CS6235 Count 6s in MPI?
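The slide poses this as a question; one way to answer it using only the six functions plus MPI_Reduce is sketched below. The array size, its random contents, and the even division of work across processes are assumptions made for the sketch.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 1024   /* assumed total array size, divisible by the number of processes */

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process generates (or would receive) its own chunk of the data */
    int chunk = N / size;
    int *local_data = malloc(chunk * sizeof(int));
    srand(rank + 1);
    for (int i = 0; i < chunk; i++)
        local_data[i] = rand() % 10;

    /* Local count of 6s, then a global sum gathered on rank 0 */
    int local_count = 0, global_count = 0;
    for (int i = 0; i < chunk; i++)
        if (local_data[i] == 6)
            local_count++;

    MPI_Reduce(&local_count, &global_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Total number of 6s: %d\n", global_count);

    free(local_data);
    MPI_Finalize();
    return 0;
}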