CS 584.

Presentation on theme: "CS 584."— Presentation transcript:

1 CS 584

2 Review
Partition: domain decomposition, functional decomposition. Communication: avoid centralized patterns, avoid sequential patterns, use divide and conquer.

3 Agglomeration
The partition and communication steps were abstract; agglomeration moves to the concrete. Combine tasks to execute efficiently on some parallel computer. Consider replication.

4 Agglomeration Goals
Reduce communication costs by increasing computation and by decreasing or increasing granularity. Retain flexibility for mapping and scaling. Reduce software engineering costs.

5 Changing Granularity
A large number of tasks does not necessarily produce an efficient algorithm; we must consider the communication costs. Reduce communication by having fewer tasks send fewer messages (batching).

6 Surface to Volume Effects
Communication is proportional to the surface of the subdomain. Computation is proportional to the volume of the subdomain. Increasing the computation per task will often decrease the communication per unit of computation.

7 How many messages total?
How much data is sent?

8 How many messages total?
How much data is sent?
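
The two questions above can be worked through for a concrete case. The sketch below is an illustration only: it assumes an n x n grid with periodic boundaries in which every task exchanges one edge with each of its four neighbours, and the function name `halo_traffic` and the grid sizes are hypothetical, not taken from the slides' figures.

```python
def halo_traffic(n, b):
    """Count messages and values sent per step for an n x n periodic grid
    split into square b x b blocks, each exchanging edges with 4 neighbours."""
    if n % b != 0:
        raise ValueError("block side must divide grid side")
    tasks = (n // b) ** 2
    messages = tasks * 4          # one message per neighbour (N, S, E, W)
    values = messages * b         # each message carries one edge of b values
    work_per_task = b * b         # volume: cells updated by each task
    return tasks, messages, values, work_per_task


# Coarser blocks: fewer messages, less data, more computation per task.
for b in (1, 2, 4):
    tasks, messages, values, work = halo_traffic(16, b)
    print(f"b={b}: {tasks} tasks, {messages} messages, "
          f"{values} values, {work} cells per task")
```

Under these assumptions the data sent per task grows with the block edge (surface) while the work per task grows with the block area (volume), which is the effect slide 6 describes.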

9 Replicating Computation
Trade replicated computation for reduced communication. Replication will often reduce execution time as well.

10 Summation of N Integers
Legend: s = sum, b = broadcast. How many steps?

11 Using Replication (Butterfly)

12 Using Replication Butterfly to Hypercube
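
A butterfly summation can be simulated in a few lines. This is a minimal sketch, assuming the number of tasks is a power of two and that each list entry stands for one task's local value; the function name `butterfly_sum` is hypothetical. At each step a task's partner differs from it in exactly one bit of its index, which is why the pattern maps directly onto hypercube neighbours.

```python
def butterfly_sum(values):
    """Simulate a butterfly (recursive-doubling) sum over P = 2**d tasks.

    At each step, task i exchanges its partial sum with partner
    i XOR stride and both add, so the addition is replicated rather
    than followed by a separate broadcast.
    Returns the final per-task values and the number of steps taken.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of tasks must be a power of two")
    partial = list(values)
    steps = 0
    stride = 1
    while stride < p:
        # All exchanges in a step are concurrent: build the new list
        # from the old one so no task sees a half-updated partner.
        partial = [partial[i] + partial[i ^ stride] for i in range(p)]
        stride *= 2
        steps += 1
    return partial, steps


totals, steps = butterfly_sum([3, 1, 4, 1, 5, 9, 2, 6])
print(totals, steps)
```

Every task ends with the full sum after log2(P) steps, whereas a tree summation followed by a tree broadcast needs about 2 log2(P) steps.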

13 Avoid Communication
Look for tasks that cannot execute concurrently because of communication requirements. Replication can help accomplish two tasks at the same time, such as summation and broadcast.

14 Preserve Flexibility Create more tasks than processors.
Overlap communication and computation. Don't incorporate unnecessary limits on the number of tasks.

15 Agglomeration Checklist
Has agglomeration reduced communication costs by increasing locality? Do the benefits of replication outweigh the costs? Does replication compromise scalability? Does the number of tasks still scale with problem size? Is there still sufficient concurrency?

16 Mapping
Specify where each task is to operate. Mapping may need to change depending on the target architecture. The general mapping problem is NP-complete.

17 Mapping Goal: Reduce Execution Time
Concurrent tasks ---> different processors. High communication ---> same processor. Mapping is a game of trade-offs.

18 Mapping
Many domain-decomposition problems make mapping easy: grids, arrays, etc.
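
For a regular grid the mapping can be a one-line index calculation. The sketch below is an assumption-laden illustration: it supposes an n x n grid distributed in contiguous blocks over a px x py arrangement of processors, and the function name `block_owner` is hypothetical.

```python
def block_owner(i, j, n, px, py):
    """Return the (row, col) of the processor owning cell (i, j) of an
    n x n grid laid out in contiguous blocks on a px x py processor grid."""
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError("cell outside the grid")
    # Integer arithmetic spreads any leftover rows/columns across processors
    # so that block sizes differ by at most one.
    return (i * px) // n, (j * py) // n


print(block_owner(3, 5, 8, 2, 4))
```

Neighbouring cells usually land on the same processor, so only cells on block edges need to communicate.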

19 Mapping
Algorithms based on unstructured or complex domain decompositions are difficult to map.

20 Other Mapping Problems
Variable amounts of work per task. Unstructured communication. Heterogeneous processors (different speeds, different architectures). Solution: LOAD BALANCING.

21 Load Balancing
Static: determined a priori, based on work, processor speed, etc. Probabilistic: random. Dynamic: restructure load during execution. Task scheduling (functional decomp.).

22 Static Load Balancing
Based on a priori knowledge. Goal: equal WORK on all processors. Algorithms: basic, recursive bisection.

23 Basic
Divide up the work based on work required and processor speed:
$$ r_i = R \left( \frac{p_i}{\sum_j p_j} \right) $$
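
The proportional rule on this slide can be sketched as follows. The slide does not define its symbols, so this example assumes that R is the total number of indivisible work units, p_i is the relative speed of processor i, and r_i is the share given to processor i. The function name `proportional_split` and the largest-remainder rounding are illustrative choices, not part of the slides.

```python
def proportional_split(total_work, speeds):
    """Split total_work integer units so that share i is close to
    total_work * speeds[i] / sum(speeds) and the shares sum to total_work."""
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")
    total_speed = sum(speeds)
    exact = [total_work * s / total_speed for s in speeds]
    shares = [int(x) for x in exact]            # round down first
    leftover = total_work - sum(shares)
    # Hand the remaining units to the largest fractional parts.
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: exact[i] - shares[i], reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares


print(proportional_split(100, [1, 1, 2]))
```

A processor that is twice as fast receives about twice the work, so all processors should finish at roughly the same time if the speed estimates are accurate.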

24 Recursive Bisection Divide work in half recursively.
Based on physical coordinates.
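
A minimal sketch of recursive coordinate bisection is shown below. It assumes 2-D points that each carry equal work and a number of parts that is a power of two; the function name `recursive_bisection` and the choice to cut across the wider axis are assumptions made for illustration.

```python
def recursive_bisection(points, parts):
    """Split 2-D points into `parts` groups (a power of two) of near-equal
    size by repeatedly cutting at the median of the wider coordinate."""
    if parts < 1 or parts & (parts - 1):
        raise ValueError("parts must be a power of two")
    if len(points) < parts:
        raise ValueError("need at least one point per part")
    if parts == 1:
        return [list(points)]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Cut across the axis with the larger extent to keep subdomains compact.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (recursive_bisection(ordered[:mid], parts // 2)
            + recursive_bisection(ordered[mid:], parts // 2))


grid = [(x, y) for x in range(4) for y in range(4)]
for group in recursive_bisection(grid, 4):
    print(group)
```

Each group can then be assigned to one processor; points that are close in space tend to end up together, which keeps communication local.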

25 Dynamic Algorithms Adjust load when an imbalance is detected.
Local or Global

26 Task Scheduling Many tasks with weak locality requirements.
Manager-Worker model.

27 Task Scheduling
Manager-Worker. Hierarchical Manager-Worker: uses submanagers. Decentralized: no central manager, a task pool on each processor, less of a bottleneck.
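
The basic manager-worker pattern can be sketched with a shared task queue. This is an illustration of the scheduling idea only, using Python threads from the standard library rather than separate processors; the function name `run_manager_worker` and its parameters are hypothetical.

```python
import queue
import threading


def run_manager_worker(tasks, work_fn, num_workers=4):
    """Manager puts tasks in a shared queue; idle workers pull the next one."""
    todo = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = todo.get()
            if item is None:          # sentinel: no more work
                return
            index, task = item
            value = work_fn(task)
            with lock:                # protect the shared results dict
                results[index] = value

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in enumerate(tasks):     # manager hands out the work
        todo.put(item)
    for _ in workers:                 # one sentinel per worker
        todo.put(None)
    for w in workers:
        w.join()
    return [results[i] for i in range(len(results))]


print(run_manager_worker(range(10), lambda x: x * x, num_workers=3))
```

Because workers fetch a new task whenever they become idle, tasks of uneven cost balance themselves automatically; the single queue is also the potential bottleneck that the hierarchical and decentralized variants try to relieve.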

28 Mapping Checklist Is the load balanced?
Are there communication bottlenecks? Is it necessary to adjust the load dynamically? Can you adjust the load if necessary? Have you evaluated the costs?

29 PCAM Algorithm Design
Partition: domain or functional decomposition. Communication: link producers and consumers. Agglomeration: combine tasks for efficiency. Mapping: divide up the tasks for balanced execution.

30 Example: Atmosphere Model
Simulate atmospheric processes (wind, clouds, etc.). Solves a set of partial differential equations describing the fluid behavior.

31 Representation of Atmosphere

32 Data Dependencies

33 Partition & Communication

34 Agglomeration

35 Mapping

36 Mapping

