Published by Anis Ross. Modified over 9 years ago.
1
Lecture 3: Designing Parallel Programs
2
Methodological Design. Source: Designing and Building Parallel Programs, by Ian Foster (www-unix.mcs.anl.gov/dbpp)
3
Methodological Design: Partitioning, Communication, Agglomeration, Mapping
4
Methodological Design PROBLEM
5
Methodological Design: Partitioning. The computation that is to be performed and the data operated on by this computation are decomposed into small tasks. Practical issues such as the number of processors in the target computer are ignored, and attention is focused on recognizing opportunities for parallel execution. (Diagram: PROBLEM → Partitioning)
6
Functional Decomposition Data Decomposition
7
Partitioning Functional Decomposition
8
Partitioning Data Decomposition
9
Methodological Design Single Program Multiple Data (SPMD) programming model A fixed number of identical tasks are created at program startup. Each task executes the same program but operates on different data.
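The SPMD model described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads stand in for tasks, and the names `spmd_task`, `run_spmd`, `rank`, and `num_tasks` are hypothetical, not taken from the lecture.

```python
from concurrent.futures import ThreadPoolExecutor


def spmd_task(rank, num_tasks, data):
    """Same program for every task; only the rank (and so the data slice) differs."""
    # Each task takes every num_tasks-th element, starting at its own rank.
    my_slice = data[rank::num_tasks]
    return sum(x * x for x in my_slice)


def run_spmd(data, num_tasks=4):
    """Create a fixed number of identical tasks at startup and collect their results."""
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        futures = [pool.submit(spmd_task, r, num_tasks, data) for r in range(num_tasks)]
        return [f.result() for f in futures]


partial_sums = run_spmd(list(range(10)), num_tasks=4)
total = sum(partial_sums)  # sum of squares of 0..9
```

Every task runs the identical `spmd_task` function; the only thing that distinguishes one task from another is the `rank` argument, which selects the data it works on.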
10
Methodological Design: Communication. The communication required to coordinate task execution is determined, and appropriate communication structures and algorithms are defined. (Diagram: PROBLEM → Partitioning → Communication)
11
Latency (L): How long does it take to start sending a "message"? (in microseconds) Bandwidth (B): What data rate can be sustained once the message is started? (in Mbytes/sec) Transfer time = L + (1/B) * data size
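The transfer-time model above can be evaluated directly. The sketch below assumes decimal megabytes (1 Mbyte = 10^6 bytes), so that 1 Mbyte/sec equals 1 byte per microsecond; the function name and the sample numbers are illustrative, not measurements from the lecture.

```python
def transfer_time_us(latency_us, bandwidth_mbytes_per_s, size_bytes):
    """Transfer time = L + (1/B) * data size, returned in microseconds."""
    # Assumption: 1 Mbyte/s = 10**6 bytes per 10**6 microseconds = 1 byte per microsecond.
    bytes_per_us = bandwidth_mbytes_per_s
    return latency_us + size_bytes / bytes_per_us


# Hypothetical link: 50 microseconds latency, 100 Mbytes/sec bandwidth.
small = transfer_time_us(50, 100, 8)        # one 8-byte value: latency dominates
large = transfer_time_us(50, 100, 1000000)  # one million bytes: bandwidth dominates
```

With these made-up parameters a small message costs almost exactly the latency, which is why combining many small messages into fewer large ones tends to pay off.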
12
Communication: Local Communication. 5-point stencil: $X_{i,j} = (4X_{i,j} + X_{i-1,j} + X_{i+1,j} + X_{i,j-1} + X_{i,j+1})/8$. (Figure: grid point $X_{i,j}$ and its four neighbors $X_{i-1,j}$, $X_{i+1,j}$, $X_{i,j-1}$, $X_{i,j+1}$.)
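A sequential sketch of one sweep of the 5-point stencil may make the local communication pattern concrete: every interior point reads only its four immediate neighbors. The function name is hypothetical, and two details are assumptions not stated on the slide: boundary points are left unchanged, and all updates read the old grid (a Jacobi-style sweep).

```python
def stencil_sweep(grid):
    """One Jacobi-style sweep of the 5-point stencil over the interior points."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # assumption: boundary values are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each update needs only the point itself and its four neighbours.
            new[i][j] = (4 * grid[i][j]
                         + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 8
    return new


example = [[0, 1, 0],
           [2, 8, 3],
           [0, 2, 0]]
updated = stencil_sweep(example)
```

In a parallel version with one task per grid point, the four neighbor reads become four messages per task per sweep.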
13
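Communication: Global Communication. $\sum_{i=0}^{n} X_i$. (Figure: tree summation of $X_0, \dots, X_7$: partial sums $\sum_{0}^{1}$, $\sum_{2}^{3}$, $\sum_{4}^{5}$, $\sum_{6}^{7}$ are combined into $\sum_{0}^{3}$ and $\sum_{4}^{7}$, and finally into $\sum_{0}^{7}$.)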
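The tree-shaped summation in the figure can be sketched as a pairwise reduction. The code below is sequential and only mimics the structure; in a parallel setting each pair at a given level could be combined by a different task at the same time. The function name `tree_sum` is hypothetical.

```python
def tree_sum(values):
    """Sum values by pairwise combination; return (total, number of levels)."""
    level = list(values)
    if not level:
        return 0, 0
    levels = 0
    while len(level) > 1:
        # All pairs on one level are independent of each other.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])  # an unpaired element is carried to the next level
        level = paired
        levels += 1
    return level[0], levels


total, depth = tree_sum([0, 1, 2, 3, 4, 5, 6, 7])
```

For eight inputs the reduction finishes in three levels, matching the three rows of partial sums in the figure, instead of the seven sequential additions a single accumulating task would need.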
14
Communication. Static Communication: communicating tasks do not change over time. Dynamic Communication: communicating tasks are determined by data computed at run time.
15
Communication: Dynamic Communication. B[i] = A[ k[i] ]
k: 1 2 1 3 4
A: 10 15 20 25 30
B: 10 15 10 20 25
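The indexed read on this slide can be reproduced with a short gather function. The slide's values only match if the indices in k are 1-based, so the sketch subtracts 1 before indexing a Python list; that reading is inferred from the numbers, and the function name `gather` is hypothetical.

```python
def gather(a, k):
    """Return B with B[i] = A[k[i]], treating the indices in k as 1-based."""
    return [a[index - 1] for index in k]


A = [10, 15, 20, 25, 30]
k = [1, 2, 1, 3, 4]
B = gather(A, k)
```

If A were distributed across tasks, the owner of each B[i] could not know which task to ask for data until k[i] had been computed, which is what makes this communication dynamic.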
16
Methodological Design: Agglomeration. The task and communication structures defined in the first two stages of a design are evaluated with respect to performance requirements and implementation costs. If necessary, tasks are combined into larger tasks to improve performance or to reduce development costs. (Diagram: PROBLEM → Partitioning → Communication → Agglomeration)
19
Goals are: increasing granularity (fine grain / coarse grain); decreasing the communication/computation ratio
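A small calculation shows how increasing granularity lowers the communication/computation ratio. The sketch assumes a 5-point stencil on a 2-D grid, square blocks of grid points per task, and an interior task that receives one edge of values from each of its four neighbors per sweep; none of these specifics come from this slide, and the function name is hypothetical.

```python
def comm_comp_ratio(block_side):
    """Values received per grid-point update for a square block under a 5-point stencil."""
    updates = block_side * block_side  # one update per owned grid point
    halo_values = 4 * block_side       # one edge of values from each of four neighbours
    return halo_values / updates


ratios = {side: comm_comp_ratio(side) for side in (1, 4, 16)}
```

Under these assumptions the ratio is 4/n for an n-by-n block: computation grows with the block's area while communication grows only with its perimeter.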
20
Agglomeration: $X_{i,j} = (4X_{i,j} + X_{i-1,j} + X_{i+1,j} + X_{i,j-1} + X_{i,j+1})/8$
21
Agglomeration: $X_{i,j} = (4X_{i,j} + X_{i-1,j} + X_{i+1,j} + X_{i,j-1} + X_{i,j+1})/8$
22
Agglomeration
23
Methodological Design: Mapping. Each task is assigned to a processor in a manner that minimizes the total execution time. Mapping can be specified statically or determined at runtime by load-balancing algorithms. (Diagram: PROBLEM → Partitioning → Communication → Agglomeration → Mapping)
24
If each task performs the same amount of computation and communicates: Cyclic Mapping
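Cyclic mapping can be sketched as assigning task t to processor t mod P. For contrast, the sketch also includes a block mapping, which is not named on this slide and is added here only as an assumption for comparison; both function names are hypothetical.

```python
def cyclic_mapping(num_tasks, num_procs):
    """Assign task t to processor t mod P (round-robin)."""
    return [t % num_procs for t in range(num_tasks)]


def block_mapping(num_tasks, num_procs):
    """Assign contiguous runs of tasks to each processor (shown for contrast only)."""
    block = -(-num_tasks // num_procs)  # ceiling division
    return [t // block for t in range(num_tasks)]


cyclic = cyclic_mapping(8, 3)
block = block_mapping(8, 3)
```

Cyclic mapping spreads neighboring tasks across processors, which tends to even out load when work varies along the task index but makes neighbor-to-neighbor communication cross processor boundaries; block mapping keeps neighbors together.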
25
Mapping. The goal of mapping algorithms is to minimize total execution time. Mapping algorithms attempt to satisfy two competing goals: maximizing processor utilization and minimizing communication costs. The general mapping problem is NP-complete.
26
Mapping. Two strategies to achieve this goal: (1) place tasks that are able to execute concurrently on different processors; (2) place tasks that communicate frequently on the same processor.
27
Mapping Consider the topology
28
Methodological Design Case Study: Atmosphere Model
29
Methodological Design Case Study: Atmosphere Model
30
Methodological Design Case Study: Atmosphere Model
31
Methodological Design Case Study: Atmosphere Model. Partitioning: $N_x \times N_y \times N_z$ tasks
32
Methodological Design Case Study: Atmosphere Model. Communication: 9-point stencil in the horizontal dimension; 3-point stencil in the vertical dimension
33
Methodological Design Case Study: Atmosphere Model. Communication. Computations: 1. Global operations; 2. Physics calculations. [equation not preserved in this transcript], where [symbol not preserved] denotes the mass at grid point (i,j,k). The total clear sky (TCS) at level k ≥ 1 is defined as [equation not preserved in this transcript], where level 0 is the top of the atmosphere and cld_i is the cloud fraction at level i. In total, the physics component of the model requires on the order of 30 communications per grid point per time step.
34
Methodological Design Case Study: Atmosphere Model. Communication: each task obtains data from eight other tasks
35
Methodological Design Case Study: Atmosphere Model. Communication: each task obtains data from eight other tasks; only 4 communications are required per task
36
Methodological Design Case Study: Atmosphere Model. Agglomeration: the vertical dimension requires communication (2 messages), and a further 30 messages are needed for various "physics" computations. These communications can be avoided by agglomerating tasks within each vertical column. Therefore [number not preserved in this transcript] tasks will be created.
37
Methodological Design Case Study: Atmosphere Model Mapping
38
Methodological Design
39
Agglomeration
40
Methodological Design Increase Granularity Decrease communication/computation ratio