Lecture 2: Part II Message Passing Programming: MPI

1 Lecture 2: Part II Message Passing Programming: MPI
Introduction to MPI
MPI programming
Running MPI programs
Architecture of MPICH

2 Message Passing Interface (MPI)

3 What is MPI?
A message-passing library specification:
    message-passing model
    not a compiler specification
    not a specific product
For parallel computers, clusters, and heterogeneous networks.
Full-featured.

4 Why use MPI? (1)
Message passing is now mature as a programming paradigm:
    well understood
    efficient match to hardware
    many applications

5 Why use MPI? (2)
Full range of desired features:
    modularity
    access to peak performance
    portability
    heterogeneity
    subgroups
    topologies
    performance measurement tools

6 Who Designed MPI?
Vendors: IBM, Intel, TMC, SGI, Meiko, Cray, Convex, Ncube, ...
Library writers: PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda, DP (HKU), PM (Japan), AM (Berkeley), FM (HPVM at Illinois)
Application specialists and consultants

7 Vendor-Supported MPI
Implementation    Vendor / platform
HP-MPI            Hewlett Packard; Convex SPP
MPI-F             IBM SP1/SP2
Hitachi/MPI       Hitachi
SGI/MPI           SGI PowerChallenge series
MPI/DE            NEC
INTEL/MPI         Intel Paragon (iCC lib)
T.MPI             Telmat Multinode
Fujitsu/MPI       Fujitsu AP1000
EPCC/MPI          Cray & EPCC, T3D/T3E
Cho-Li Wang

8 Public-Domain MPI
Implementation    Origin
MPICH             Argonne National Lab. & Mississippi State Univ.
LAM               Ohio Supercomputer Center
MPICH/NT          Mississippi State University
MPI-FM            Illinois (Myrinet)
MPI-AM            UC Berkeley (Myrinet)
MPI-PM            RWCP, Japan (Myrinet)
MPI-CCL           California Institute of Technology

9 Public-Domain MPI
Implementation    Origin
CRI/EPCC MPI      Cray Research and Edinburgh Parallel Computing Centre (Cray T3D/E)
MPI-AP            Australian National University, CAP Research Program (AP1000)
W32MPI            Illinois, Concurrent Systems
RACE-MPI          Hughes Aircraft Co.
MPI-BIP           INRIA, France (Myrinet)

10 Communicator Concept in MPI
A communicator identifies the process group and the context with respect to which an operation is to be performed.

11 Communicator (2)
[Figure: four communicators, one of them nested within another, each containing several processes]
The same process can exist in different communicators.
Processes in different communicators cannot communicate with each other through those communicators.

12 Features of MPI (1) General
Communicators combine context and group for message security

13 Features of MPI (2) Point-to-point communication
Structured buffers and derived data types, heterogeneity
Modes: normal (blocking and non-blocking), synchronous, ready (to allow access to fast protocols), buffered

14 Features of MPI (3) Collective Communication
Both built-in and user-defined collective operations
Large number of data movement routines
Subgroups defined directly or by topology
E.g., broadcast, barrier, reduce, scatter, gather, all-to-all, ...

15 MPI Programming

16 Writing MPI programs
MPI comprises 125 functions.
Many parallel programs can be written with just 6 basic functions.

17 Six basic functions (1)
MPI_INIT        Initiate an MPI computation
MPI_FINALIZE    Terminate a computation

18 Six basic functions (2)
MPI_COMM_SIZE   Determine the number of processes in a communicator
MPI_COMM_RANK   Determine the identifier of a process in a specific communicator

19 Six basic functions (3)
MPI_SEND        Send a message from one process to another process
MPI_RECV        Receive a message sent by another process

20 A simple program
Program main
begin
    MPI_INIT()                              Initiate computation
    MPI_COMM_SIZE(MPI_COMM_WORLD, count)    Find the number of processes
    MPI_COMM_RANK(MPI_COMM_WORLD, myid)     Find the process ID of the current process
    print("I am ", myid, " of ", count)     Each process prints out its output
    MPI_FINALIZE()                          Shut down
end

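21 Result
Process 3: I am 3 of 4
Process 1: I am 1 of 4
Process 0: I am 0 of 4
Process 2: I am 2 of 4
The lines can appear in any order, because the processes run concurrently.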

22 Point-to-Point Communication
The basic point-to-point communication operations are send and receive.
[Figure: the sender's buffer is sent, transmitted, and received into the receiver's buffer]

23 Another simple program (2 nodes)
.....
MPI_COMM_RANK(MPI_COMM_WORLD, myid)
if myid=0                               Branch taken by process 0
    MPI_SEND("Zero",…,…,1,…,…)
    MPI_RECV(words,…,…,1,…,…,…)
else                                    Branch taken by process 1
    MPI_RECV(words,…,…,0,…,…,…)
    MPI_SEND("One",…,…,0,…,…)
END IF
print("Received from ",words)
......

24 Process 0 and Process 1
[Figure: timeline of the exchange]
Process 0: MPI_SEND("Zero",…,…,1,…,…) sends "Zero" to process 1.
Process 1: MPI_RECV(words,…,…,0,…,…,…) sets up a buffer and waits for the message from process 0; once it is received, words (buffer) holds "Zero".
Process 0: MPI_RECV(words,…,…,1,…,…,…) sets up a buffer and waits for the message from process 1.
Process 1: MPI_SEND("One",…,…,0,…,…) sends "One" to process 0; once it is received, words (buffer) on process 0 holds "One".
Both processes: print("Received from ",words)

25 Result
Process 0: Received from One
Process 1: Received from Zero

26 Collective Communication (1)
Communication that involves a group of processes
[Figure: the sender's buffer is sent and transmitted to the buffers of several receivers]

27 Collective Communication (2)
Three types:
Barrier                 MPI_BARRIER
Data movement           MPI_BCAST, MPI_GATHER, MPI_SCATTER
Reduction operations    MPI_REDUCE

28 Barrier
MPI_BARRIER: used to synchronize execution of a group of processes
[Figure: processes that reach the barrier early cannot go on ("Wait for us!", "We can't go on!"); once all processes have arrived ("We're together!"), the barrier disappears and they all proceed ("Let's go!")]

29 Data movement (1) MPI_BCAST
One process sends the same data to all processes in the group, itself included.
[Figure: "FACE" is broadcast; afterwards Process 0, Process 1, Process 2, and Process 3 each hold "FACE"]

30 Data movement (2) MPI_GATHER
All processes (including the root process) send their data to one process, which stores the pieces in rank order.
[Figure: Process 0 to Process 3 hold "F", "A", "C", and "E"; after the gather, the root holds "FACE"]

31 Data movement (3) MPI_SCATTER
A process sends out a message, which is split into several equal parts, and the ith portion is sent to the ith process.
[Figure: "FACE" is scattered; Process 0 to Process 3 receive "F", "A", "C", and "E"]

32 Data movement (4) MPI_REDUCE (e.g., find maximum value)
Combine the values held by each process, using a specified operation, and return the combined value to one process.
[Figure: the four processes hold 8, 9, 3, and 7; reducing with max returns 9]

33 Example program (1)
Calculating the value of π by:
[Equation: not preserved in the transcript]

34 Example program (2)
......
MPI_BCAST(numprocs, …, …, 0, …)                 Broadcast the number of processes
for (i = myid + 1; i <= n; i += numprocs)       Each process calculates its specified areas
    compute the area for each interval
    accumulate the result in the process's program data (sum)
MPI_REDUCE(&sum, …, …, …, MPI_SUM, 0, …)        Sum up all the areas
if (myid == 0)                                  Print the result
    Output result

35 [Figure: process 0 starts the calculation ("Start calculation!"), the other processes answer "OK!", and the result π = 3.141... is calculated by process 0]

36 MPICH - A Portable Implementation of MPI
Argonne National Laboratory

37 What is MPICH?
The first complete and portable implementation of the full MPI standard.
"CH" stands for "Chameleon", a symbol of adaptability and portability.
It contains a programming environment for working with MPI programs.
It includes a portable startup mechanism and libraries.

38 How can I install it?
Unpack the package mpich.tar.gz into a directory.
Run './configure' to choose the appropriate architecture and device, then 'make >& make.log' to compile.
Syntax: ./configure -device=DEVICE -arch=ARCH_TYPE
ARCH_TYPE: the type of machine to be configured
DEVICE: the kind of communication device the system will use, e.g. ch_p4 (TCP/IP)

39 How to run an MPI Program
Edit mpich/util/machines/machines.XXXX to contain the names of the machines of architecture XXXX.
For example, for the computers mercury, venus, earth, and mars, the file should be in the format:
mercury
venus
earth
mars

40 How to run an MPI Program
Include "mpi.h" in the source program.
Compile the program with the command 'mpicc':
    mpicc -c foo.c
Use 'mpirun' to run an MPI program. mpirun determines the environment in which the program runs.

41 How to run an MPI Program
mpirun -np 4 a.out
    Runs a.out on four processors (for massively parallel processors).
mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program
    Runs program on 2 sun4s and 3 rs6000s, with the local machine being a sun4 (multiple architectures).

42 MPIRUN (1)
How do you start an MPI program? Use mpirun.
Example:
# mpirun -np 4 cpi
This starts four processes of cpi.

43 MPIRUN (2)
What does mpirun do?
1. Read the arguments to determine the environment of the MPI program:
    i) how many processes should be started
    ii) on which machines the MPI program will be started
    iii) which device will be used (e.g. ch_p4)
2. Split the processes among the machines on which they will run.
3. Record the split in the PI???? file.

44 MPIRUN (3)
Example: suppose the ch_p4 device is used.
# mpirun -np 4 cpi
1. mpirun knows that 4 processes need to be started.
2. mpirun reads the machines file to find the machines on which they can run.
3. The ch_p4 device is used if no device argument is given in the command.

45 MPIRUN (4)
4. Split the tasks and save them in the PI???? file.
File format:
<hostname> <no. of proc.> <program>
genius.cs.hku.hk cpi
eagle.cs.hku.hk cpi
dragon.cs.hku.hk cpi
virtue.cs.hku.hk cpi
5. Start the processes on the remote machines by using "rsh".

46 Architecture of MPICH

47 Structure of MPICH
[Figure: layers of MPICH, from top to bottom]
MPI portable API library
MPICH Abstract Device Interface
MPICH Channel Interface
Low-level layer: sockets (TCP/IP), shared memory, vendor design

48 MPICH - Abstract Device Interface
Interface between the high-level MPI layer and the low-level device.
Manages message packaging and buffering policies, and handles heterogeneous communication.
Four sets of functions:
1. Specify the send or receive of a message.
2. Move data between the API and the hardware.
3. Manage lists of pending messages.
4. Provide information about the execution environment.

49 MPICH - The Channel Interface (1)
The interface transfers data from one process's address space to another's.
Information is divided into two parts: the message envelope and the data.
It includes five functions:
MPID_SendControl, MPID_RecvAnyControl, MPID_ControlMsgAvail: envelope information
MPID_SendChannel, MPID_RecvFromChannel: data information

50 MPICH - The Channel Interface (2)
The channel interface chooses a data exchange mechanism according to the size of the message.
Data exchange mechanisms implemented: Short, Eager, Rendezvous, Get

51 Protocol - Short
Used for the shortest messages.
The data is delivered within the message envelope.

52 Short Protocol Data Transfer
[Figure: control messages carrying the data reach the receiver and are stored in a buffer until the matching MPI_Recv calls are made]

53 Protocol - Eager
Data is sent to the destination immediately.
The receiver must allocate some space to store the data locally.
It is the default choice in MPICH.
It is not suitable for large data transfers.

54 Eager Protocol Data Transfer
[Figure: the sender pushes control messages and data (Data1 to Data4) immediately; the receiver saves them in a buffer until the matching MPI_Recv calls are made, and the buffer can fill up ("Buffer Full!!!")]

55 Protocol - Rendezvous
Data is sent to the destination only when requested.
To use it, add -use_rndv to the './configure' command.
No buffering required.

56 Rendezvous Protocol Data Transfer
[Figure: the sender sends a control message and waits; when the receiver's MPI_Recv matches it ("Match!!!"), a request goes back to the sender, and only then is the data transferred ("Received!")]

57 Protocol - Get
In this protocol, data is read directly by the receiver.
Data is transferred directly from one process's memory to another's.
Highest performance.
Requires shared memory or a remote memory operation.

58 Get Protocol Data Transfer
[Figure: the receiver wants data from the sender ("I want to get data from sender"), accesses the sender's shared memory directly, and copies the data from it into its own memory]

59 Conclusion

60 MPI–1.1 (June 95)
MPI 1.1 doesn't provide:
    process management
    remote memory transfers
    active messages
    threads
    virtual shared memory

61 MPI–2 (July 97)
Extensions to MPI:
    process creation and management
    one-sided communications
    extended collective operations
    external interfaces
    I/O
    additional language bindings

