Lecture 5: MPI Introduction. Topics ... Readings. January 19, 2012. CSCE 713 Advanced Computer Architecture
A distributed memory system. Copyright © 2010, Elsevier Inc. All rights reserved.
A shared memory system
Identifying MPI processes: it is common practice to identify processes by nonnegative integer ranks; p processes are numbered 0, 1, 2, ..., p-1.
Our first MPI program
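The program itself was an image on the slide and did not survive; the sketch below reconstructs the hello-world style program this slide introduces. Its structure is inferred from the output shown on the later "Execution" slide, so details (variable names, buffer size) are assumptions rather than the original listing.

```c
#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MAX_STRING 100

int main(void) {
    char greeting[MAX_STRING];
    int  comm_sz;   /* number of processes */
    int  my_rank;   /* my process rank     */

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank != 0) {
        /* Every nonzero rank builds a greeting and sends it to rank 0. */
        sprintf(greeting, "Greetings from process %d of %d!", my_rank, comm_sz);
        MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    } else {
        /* Rank 0 prints its own greeting, then everyone else's in rank order. */
        printf("Greetings from process %d of %d!\n", my_rank, comm_sz);
        for (int q = 1; q < comm_sz; q++) {
            MPI_Recv(greeting, MAX_STRING, MPI_CHAR, q, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("%s\n", greeting);
        }
    }

    MPI_Finalize();
    return 0;
}
```

Compiled and run as the next two slides show (mpicc, then mpiexec -n), this produces the greetings listed on the "Execution" slide.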
Compilation: mpicc -g -Wall -o mpi_hello mpi_hello.c
Execution: mpiexec -n <number of processes> ./mpi_hello
mpiexec -n 1 ./mpi_hello (run with 1 process)
mpiexec -n 4 ./mpi_hello (run with 4 processes)
Execution: mpiexec -n 1 ./mpi_hello prints
Greetings from process 0 of 1!
mpiexec -n 4 ./mpi_hello prints
Greetings from process 0 of 4!
Greetings from process 1 of 4!
Greetings from process 2 of 4!
Greetings from process 3 of 4!
MPI programs are written in C: they have a main, use stdio.h, string.h, etc., and need the mpi.h header file added. Identifiers defined by MPI start with "MPI_", and the first letter following the underscore is uppercase for function names and MPI-defined types; this helps avoid confusion.
MPI components: MPI_Init tells MPI to do all the necessary setup. MPI_Finalize tells MPI we're done, so it cleans up anything allocated for this program.
Basic outline
Communicators: a communicator is a collection of processes that can send messages to each other. MPI_Init defines a communicator consisting of all the processes created when the program is started, called MPI_COMM_WORLD.
Communicators: MPI_Comm_size reports the number of processes in the communicator; MPI_Comm_rank reports my rank (the rank of the process making this call).
SPMD: Single Program, Multiple Data. We compile one program, but process 0 does something different: it receives messages and prints them while the other processes do the work. The if-else construct makes our program SPMD.
Communication
Data types
Communication
Message matching: an MPI_Send issued by process q matches an MPI_Recv posted by process r when the communicators match, the tags match, the send's dest is r, and the receive's src is q.
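The two "Communication" slides above showed the send and receive signatures, which were lost in export. The annotated signatures below restate them together with the matching rules; the parameter names are conventional, not necessarily the slide's (and MPI-3 declares the send buffer as const void*).

```c
/* A send from process q is delivered to a receive on process r when
   the communicators and tags match, dest == r, and source == q
   (or source == MPI_ANY_SOURCE on the receiving side). */

int MPI_Send(
    void        *msg_buf_p,    /* in:  data to send                    */
    int          msg_size,     /* in:  number of elements              */
    MPI_Datatype msg_type,     /* in:  e.g. MPI_INT, MPI_DOUBLE        */
    int          dest,         /* in:  rank of the receiving process   */
    int          tag,          /* in:  nonnegative message label       */
    MPI_Comm     communicator  /* in:  e.g. MPI_COMM_WORLD             */);

int MPI_Recv(
    void        *msg_buf_p,    /* out: where to store the message      */
    int          buf_size,     /* in:  capacity of the buffer          */
    MPI_Datatype buf_type,     /* in:  must be compatible with send    */
    int          source,       /* in:  sender rank, or MPI_ANY_SOURCE  */
    int          tag,          /* in:  tag to match, or MPI_ANY_TAG    */
    MPI_Comm     communicator, /* in:  must match the send             */
    MPI_Status  *status_p      /* out: or MPI_STATUS_IGNORE            */);
```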
Receiving messages: a receiver can get a message without knowing the amount of data in the message, the sender of the message, or the tag of the message.
The status_p argument: the last parameter of MPI_Recv has type MPI_Status*. An MPI_Status object has at least the members MPI_SOURCE, MPI_TAG, and MPI_ERROR. After declaring MPI_Status status; and passing &status to MPI_Recv, the sender and tag can be examined as status.MPI_SOURCE and status.MPI_TAG.
How much data am I receiving? The element count is not stored directly in the status; it is retrieved from the status with MPI_Get_count.
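The slide's code was lost; this fragment sketches the usual pattern for receiving from an unknown sender and then interrogating the status (recv_buf is assumed to be a double array declared elsewhere):

```c
MPI_Status status;
int count;

/* Receive up to 100 doubles from any sender with any tag. */
MPI_Recv(recv_buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);

/* After the receive, the status tells us what actually arrived. */
MPI_Get_count(&status, MPI_DOUBLE, &count); /* elements received */
int sender = status.MPI_SOURCE;             /* who sent it       */
int tag    = status.MPI_TAG;                /* with which tag    */
```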
Issues with send and receive: exact behavior is determined by the MPI implementation. MPI_Send may behave differently with regard to buffer size, cutoffs, and blocking. MPI_Recv always blocks until a matching message is received. Know your implementation; don't make assumptions!
TRAPEZOIDAL RULE IN MPI
The Trapezoidal Rule
One trapezoid: with base h running from x_i to x_{i+1}, the area of a single trapezoid is (h/2)[f(x_i) + f(x_{i+1})].
Pseudo-code for a serial program
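The serial pseudo-code did not survive the export; here is a runnable C sketch of the standard serial trapezoidal rule. The integrand x^2 is a stand-in, since the slide's actual function is unknown.

```c
/* Integrand: x^2 is a stand-in; substitute the function to integrate. */
double f(double x) {
    return x * x;
}

/* Estimate the integral of f over [a, b] using n trapezoids of width h,
   where h = (b - a) / n. */
double Trap(double a, double b, int n, double h) {
    double estimate = (f(a) + f(b)) / 2.0;  /* endpoints count half */
    for (int i = 1; i <= n - 1; i++)
        estimate += f(a + i * h);           /* interior points      */
    return estimate * h;
}
```

With a = 0, b = 3, n = 1024, Trap returns approximately 9, the exact value of the integral of x^2 on [0, 3].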
Parallelizing the Trapezoidal Rule:
1. Partition the problem solution into tasks.
2. Identify communication channels between tasks.
3. Aggregate tasks into composite tasks.
4. Map composite tasks to cores.
Parallel pseudo-code
Tasks and communications for the Trapezoidal Rule
First version (1)-(3)
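The three "First version" code slides were images and are gone. The sketch below reconstructs the kind of first version these slides present: each process integrates its own subinterval, then every nonzero rank sends its partial result to process 0 with point-to-point messages. Variable names follow the book's conventions but are assumptions here, and n is hard-coded rather than read from input.

```c
#include <stdio.h>
#include <mpi.h>

double f(double x) { return x * x; }   /* stand-in integrand */

double Trap(double left, double right, int count, double base_len) {
    double estimate = (f(left) + f(right)) / 2.0;
    for (int i = 1; i <= count - 1; i++)
        estimate += f(left + i * base_len);
    return estimate * base_len;
}

int main(void) {
    int    my_rank, comm_sz, n = 1024, local_n;
    double a = 0.0, b = 3.0, h, local_a, local_b;
    double local_int, total_int;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    h = (b - a) / n;          /* h is the same for all processes       */
    local_n = n / comm_sz;    /* so is the trapezoid count per process */

    /* Each process integrates its own subinterval. */
    local_a = a + my_rank * local_n * h;
    local_b = local_a + local_n * h;
    local_int = Trap(local_a, local_b, local_n, h);

    if (my_rank != 0) {
        MPI_Send(&local_int, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    } else {
        /* Process 0 collects and adds up the partial integrals. */
        total_int = local_int;
        for (int source = 1; source < comm_sz; source++) {
            MPI_Recv(&local_int, 1, MPI_DOUBLE, source, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            total_int += local_int;
        }
        printf("With n = %d trapezoids, our estimate of the integral "
               "from %f to %f = %.15e\n", n, a, b, total_int);
    }

    MPI_Finalize();
    return 0;
}
```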
Dealing with I/O: each process just prints a message.
Running with 6 processes: the output is unpredictable; the order in which the processes' lines appear can change from run to run.
Input: most MPI implementations only allow process 0 in MPI_COMM_WORLD to access stdin. Process 0 must read the data (scanf) and send it to the other processes.
Function for reading user input
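The function on this slide was lost; the sketch below follows the pattern the previous slide describes: process 0 reads a, b, and n with scanf and forwards them to every other process with point-to-point sends (the book's later versions replace the sends with MPI_Bcast). The name Get_input and the parameter layout are assumptions.

```c
#include <stdio.h>
#include <mpi.h>

/* Process 0 reads the problem parameters from stdin and distributes
   them; every other process receives its copy from process 0. */
void Get_input(int my_rank, int comm_sz, double *a_p, double *b_p, int *n_p) {
    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%lf %lf %d", a_p, b_p, n_p);
        for (int dest = 1; dest < comm_sz; dest++) {
            MPI_Send(a_p, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
            MPI_Send(b_p, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
            MPI_Send(n_p, 1, MPI_INT,    dest, 0, MPI_COMM_WORLD);
        }
    } else {
        MPI_Recv(a_p, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(b_p, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(n_p, 1, MPI_INT,    0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
}
```

In main, every process would call Get_input(my_rank, comm_sz, &a, &b, &n) after MPI_Init, replacing the hard-coded values of the first version.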
COLLECTIVE COMMUNICATION
Tree-structured communication:
1. In the first phase:
(a) Process 1 sends to 0, 3 sends to 2, 5 sends to 4, and 7 sends to 6.
(b) Processes 0, 2, 4, and 6 add in the received values.
(c) Processes 2 and 6 send their new values to processes 0 and 4, respectively.
(d) Processes 0 and 4 add the received values into their new values.
2. In the second phase:
(a) Process 4 sends its newest value to process 0.
(b) Process 0 adds the received value to its newest value.
A tree-structured global sum
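The figure is gone; the loop below simulates the tree-structured pattern on an array standing in for per-process values. Each "process" is an array slot, and in round k the processes whose partner is 2^k slots away add in the partner's value, exactly the pairing described above for p = 8. In real MPI the additions would be interleaved with sends and receives.

```c
/* Simulated tree-structured global sum over p values, one per "process".
   After ceil(log2(p)) rounds, slot 0 (process 0) holds the total. */
int tree_sum(int vals[], int p) {
    for (int gap = 1; gap < p; gap *= 2)          /* one pass per round   */
        for (int r = 0; r + gap < p; r += 2 * gap)
            vals[r] += vals[r + gap];             /* partner "sends" to r */
    return vals[0];
}
```

For p = 8 this performs 3 rounds instead of the 7 sequential additions process 0 would otherwise do itself.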
Overview
Last time: POSIX Pthreads (create, join, exit, mutexes); code and data in /class/csce713-006.
Readings for today: http://csapp.cs.cmu.edu/public/1e/public/ch9-preview.pdf
New website alive and kicking; dropbox too!
From last lecture's slides: Gauss-Seidel method, barriers, threads assignment.
Next time: performance evaluation, barriers, and MPI intro.
Threads programming assignment:
1. Matrix addition (embarrassingly parallel).
2. Versions: sequential; sequential with blocking factor; sequential read without conversions; multithreaded, passing the number of threads as a command-line argument (the args.c code should be distributed as an example).
3. Plot of several runs.
4. Next time.
The next few slides are from Parallel Programming by Pacheco. /class/csce713-006/Code/Pacheco contains the code (again, only for use with this course; do not distribute).
Examples:
1. Simple use of a mutex in calculating a sum.
2. Semaphores.
Global sum function that uses a mutex (2)
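The code on this slide was lost; this is a sketch of the usual mutex-protected global sum in that style: each thread accumulates a private sum and touches the shared variable exactly once, under the mutex. The thread count, the workload, and the constant term being summed are all illustrative stand-ins.

```c
#include <pthread.h>

#define THREAD_COUNT 4
#define N 1000000                        /* terms per thread; illustrative */

static double sum = 0.0;                 /* shared accumulator             */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *Thread_sum(void *rank) {
    long my_rank = (long) rank;
    double my_sum = 0.0;

    /* Each thread sums its own block of terms privately... */
    for (long i = my_rank * N; i < (my_rank + 1) * N; i++)
        my_sum += 1.0;                   /* stand-in for the real term */

    /* ...and updates the shared variable exactly once. */
    pthread_mutex_lock(&mutex);
    sum += my_sum;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

double run_sum(void) {
    pthread_t tids[THREAD_COUNT];
    sum = 0.0;
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_create(&tids[t], NULL, Thread_sum, (void *) t);
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_join(tids[t], NULL);
    return sum;                          /* THREAD_COUNT * N */
}
```

Locking once per thread, rather than once per term, is what makes this version fast: the critical section is entered THREAD_COUNT times in total.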
Barriers: synchronizing threads after a period of computation. No thread proceeds until all others have reached the barrier; e.g., last time's Gauss-Seidel iteration, with a synchronized check for convergence.
Examples of barrier uses and implementations:
1. Using barriers for testing convergence, i.e., satisfying a completion criterion.
2. Using barriers to time the slowest thread.
3. Using barriers for debugging.
Using barriers for testing convergence, i.e., satisfying a completion criterion (slide 43 of Lecture 3, slightly modified): general computation; stop when all x_i from this iteration have been calculated.
Using barriers to time the slowest thread
Using barriers for debugging
Barrier implementations:
1. A mutex, a thread counter, and busy-waiting.
2. A mutex plus a barrier semaphore.
Busy-waiting and a mutex: implementing a barrier using busy-waiting and a mutex is straightforward. We use a shared counter protected by the mutex; when the counter indicates that every thread has entered the critical section, threads can leave the critical section.
Busy-waiting and a mutex: we need one counter variable for each instance of the barrier; otherwise problems are likely to occur (a thread racing ahead into a second barrier would find the shared counter already nonzero).
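The code on these two slides was lost; the sketch below follows the pattern just described: a mutex-protected counter, then a busy-wait loop until every thread has arrived. The run_barrier_demo driver and the post-barrier counter are additions for demonstration, not part of the original. Note this is the textbook teaching pattern: the busy-wait read of counter is deliberately outside the mutex, which is why the counter is declared volatile here.

```c
#include <pthread.h>

#define THREAD_COUNT 4

static pthread_mutex_t barrier_mutex = PTHREAD_MUTEX_INITIALIZER;
static volatile int counter = 0;    /* how many threads have arrived */
static volatile int after = 0;      /* threads past the barrier      */

static void *Thread_work(void *rank) {
    (void) rank;
    /* ... computation before the barrier would go here ... */

    pthread_mutex_lock(&barrier_mutex);
    counter++;                       /* increment is protected        */
    pthread_mutex_unlock(&barrier_mutex);

    while (counter < THREAD_COUNT)   /* busy-wait until all arrive    */
        ;

    /* ... computation after the barrier ... */
    pthread_mutex_lock(&barrier_mutex);
    after++;
    pthread_mutex_unlock(&barrier_mutex);
    return NULL;
}

int run_barrier_demo(void) {
    pthread_t tids[THREAD_COUNT];
    counter = 0;
    after = 0;
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_create(&tids[t], NULL, Thread_work, (void *) t);
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_join(tids[t], NULL);
    return after;   /* THREAD_COUNT if every thread passed the barrier */
}
```

Because counter is never reset while threads may still be spinning on it, this barrier cannot be reused; that is exactly the one-counter-per-barrier caveat stated above.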
Implementing a barrier with semaphores
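The slide's code was lost; this sketch shows the standard two-semaphore barrier the title refers to: count_sem (initialized to 1) protects the counter, and barrier_sem (initialized to 0) is what the early arrivals sleep on until the last thread posts it. The run_sem_barrier driver is an addition for demonstration.

```c
#include <pthread.h>
#include <semaphore.h>

#define THREAD_COUNT 4

static int counter = 0;
static sem_t count_sem;     /* init 1: protects counter               */
static sem_t barrier_sem;   /* init 0: early arrivals block on this   */

static void *Thread_work(void *rank) {
    (void) rank;
    sem_wait(&count_sem);
    if (counter == THREAD_COUNT - 1) {
        /* Last thread to arrive: reset and release everyone else. */
        counter = 0;
        sem_post(&count_sem);
        for (int j = 0; j < THREAD_COUNT - 1; j++)
            sem_post(&barrier_sem);
    } else {
        counter++;
        sem_post(&count_sem);
        sem_wait(&barrier_sem);   /* sleep until the last thread posts */
    }
    return NULL;
}

int run_sem_barrier(void) {
    pthread_t tids[THREAD_COUNT];
    sem_init(&count_sem, 0, 1);
    sem_init(&barrier_sem, 0, 0);
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_create(&tids[t], NULL, Thread_work, (void *) t);
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_join(tids[t], NULL);    /* returns only if no deadlock */
    sem_destroy(&count_sem);
    sem_destroy(&barrier_sem);
    return 0;
}
```

Unlike the busy-wait version, blocked threads here consume no CPU while waiting.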
Condition variables: a condition variable is a data object that allows a thread to suspend execution until a certain event or condition occurs. When the event or condition occurs, another thread can signal the suspended thread to "wake up." A condition variable is always associated with a mutex.
Condition variables
Implementing a barrier with condition variables
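The slide's listing was lost; the sketch below implements the barrier this title names. It may differ from the original in one respect: it adds a generation counter as the wait predicate, since pthread_cond_wait can return spuriously and the POSIX-recommended idiom is to re-check a condition in a loop. The run_cond_barrier driver and the passed counter are additions for demonstration.

```c
#include <pthread.h>

#define THREAD_COUNT 4

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_var = PTHREAD_COND_INITIALIZER;
static int counter = 0;      /* threads waiting at the barrier        */
static int generation = 0;   /* wait predicate: bumps once per release */
static int passed = 0;       /* threads that have left the barrier    */

static void *Thread_work(void *rank) {
    (void) rank;
    pthread_mutex_lock(&mutex);
    int my_generation = generation;
    counter++;
    if (counter == THREAD_COUNT) {
        counter = 0;                        /* barrier is reusable     */
        generation++;                       /* release this generation */
        pthread_cond_broadcast(&cond_var);  /* wake every waiter       */
    } else {
        /* Re-check the predicate: guards against spurious wakeups. */
        while (generation == my_generation)
            pthread_cond_wait(&cond_var, &mutex);
    }
    passed++;                               /* still holding the mutex */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

int run_cond_barrier(void) {
    pthread_t tids[THREAD_COUNT];
    passed = 0;
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_create(&tids[t], NULL, Thread_work, (void *) t);
    for (long t = 0; t < THREAD_COUNT; t++)
        pthread_join(tids[t], NULL);
    return passed;   /* THREAD_COUNT when the barrier worked */
}
```

Because the counter is reset under the mutex before anyone is woken, this barrier, unlike the busy-wait version, can be used repeatedly.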
POSIX Semaphores
$ man -k semaphore
sem_close (3) - close a named semaphore
sem_destroy (3) - destroy an unnamed semaphore
sem_getvalue (3) - get the value of a semaphore
sem_init (3) - initialize an unnamed semaphore
sem_open (3) - initialize and open a named semaphore
sem_overview (7) - overview of POSIX semaphores
sem_post (3) - unlock a semaphore
sem_timedwait (3) - lock a semaphore
sem_trywait (3) - lock a semaphore
sem_unlink (3) - remove a named semaphore
sem_wait (3) - lock a semaphore
$ man -s 7 semaphores
No manual entry for semaphore in section 7
Implementing a barrier with condition variables (continued)
System V Semaphores (Sets)
In System V IPC there are semaphore sets.
Commands:
ipcs - provide information on IPC facilities
ipcrm - remove a message queue, semaphore set, or shared memory segment
System calls:
semctl (2) - semaphore control operations
semget (2) - get a semaphore set identifier
semop (2) - semaphore operations
semtimedop (2) - semaphore operations
59
– 59 – CSCE 713 Spring 2012