1
Mapping Techniques for Load Balancing
2
Overheads
Two sources of overhead:
- time spent in inter-process interaction
- time spent idle
A good mapping must ensure that the computations and interactions among processes are well balanced at each stage of the execution of the parallel algorithm.
3
Mapping Techniques
Static mapping: distribute the tasks among processes prior to the execution of the algorithm.
Dynamic mapping: distribute the work among processes during the execution of the algorithm.
Dynamic mapping applies:
- if tasks are generated dynamically
- if task sizes are unknown
but it may entail moving data among processes, which is costly if the amount of data associated with tasks is large.
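The difference between the two mappings can be sketched in plain C. This is only an illustration under stated assumptions: the names block_start, block_count, block_owner, task_pool and pool_take are hypothetical and do not come from the slides, the static mapping shown is a simple block distribution (one of several possible static schemes), and the dynamic pool is a sequential stand-in for a shared work queue.

```c
#include <assert.h>
#include <stddef.h>

/* Static mapping (assumed scheme: block distribution).
 * First task index owned by process `rank` when `ntasks` tasks are split
 * into `nprocs` contiguous blocks. The first (ntasks % nprocs) processes
 * each get one extra task. rank == nprocs yields ntasks (one past the end). */
size_t block_start(size_t ntasks, size_t nprocs, size_t rank)
{
    assert(nprocs > 0 && rank <= nprocs);
    size_t base = ntasks / nprocs;
    size_t extra = ntasks % nprocs;
    return rank * base + (rank < extra ? rank : extra);
}

/* Number of tasks assigned to process `rank`. */
size_t block_count(size_t ntasks, size_t nprocs, size_t rank)
{
    assert(rank < nprocs);
    return block_start(ntasks, nprocs, rank + 1)
         - block_start(ntasks, nprocs, rank);
}

/* Process that owns task index `task`; known before execution starts. */
size_t block_owner(size_t ntasks, size_t nprocs, size_t task)
{
    assert(task < ntasks);
    size_t rank = 0;
    while (task >= block_start(ntasks, nprocs, rank + 1))
        rank++;
    return rank;
}

/* Dynamic mapping (sequential stand-in): tasks are handed out one at a
 * time, at run time, to whichever process asks next. A real parallel
 * version would need an atomic or message-based update of `next`. */
typedef struct {
    size_t next;    /* index of the next unassigned task */
    size_t ntasks;  /* total number of tasks */
} task_pool;

/* Stores the next task index in *task and returns 1, or returns 0 when
 * the pool is empty. */
int pool_take(task_pool *pool, size_t *task)
{
    if (pool->next >= pool->ntasks)
        return 0;
    *task = pool->next++;
    return 1;
}
```

With 10 tasks and 3 processes, the block mapping assigns 4, 3 and 3 tasks to processes 0, 1 and 2, and every owner is fixed before the algorithm runs. With the pool, the owner of a task depends on which process calls pool_take first, which is what makes the mapping dynamic.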
4
Parallel Algorithm Models
- Data-Parallel Model
- Work Pool Model
- Master-Slave Model
- Pipeline or Producer-Consumer Model
- Hybrid Model
5
Communication Model of Parallel Platforms
Two primary forms of data exchange between parallel tasks:
- accessing a shared data space
- exchanging messages
Shared-address-space platforms:
- this view of a parallel platform supports a common data space that is accessible to all processors
- processors interact by modifying data objects stored in it
6
Message-Passing Platform (MPP)
Logical machine view of an MPP:
- consists of p processing nodes, each either a single processor or a shared-address-space multiprocessor
- each node has its own exclusive address space
- e.g. clustered workstations
Messages are used to:
- transfer data and work
- synchronize
7
MP Paradigm
The MP paradigm supports execution of a different program on each node.
Four basic operations:
- send and receive, for interactions
- whoami, which returns the ID of the calling process
- numprocs, which specifies the number of processes
MP APIs:
- MPI (Message Passing Interface)
- PVM (Parallel Virtual Machine)
8
Explicit Parallel Programming
Numerous programming languages and libraries have been developed.
They differ in:
- their view of the address space
- the degree of synchronization
- the multiplicity of programs
MPI has been widely adopted because it imposes minimal requirements on hardware.
9
Principles of Message-Passing Programming
Two key attributes of message-passing programming:
1. It assumes a partitioned address space.
   - each data element must belong to one of the partitions of the space, i.e. data must be explicitly partitioned and placed
   - all interactions require the cooperation of two processes
   - for dynamic and/or unstructured interactions, the complexity of the code is very high
2. It supports only explicit parallelization.
   - the programmer must decompose the computations and extract the concurrency
10
Structure of MP Programs (1)
Asynchronous:
- all concurrent tasks execute asynchronously
- harder to reason about; can have non-deterministic behavior due to race conditions
Loosely synchronous:
- tasks or subsets of tasks synchronize to perform interactions
- between these interactions, tasks execute asynchronously
- easier to reason about
11
Structure of MP Programs (2)
The MP paradigm supports execution of a different program on each of the p processes, but this makes the job of writing parallel programs effectively unscalable.
Most programs use the single program, multiple data (SPMD) approach:
- the code executed by different processes is identical except for a small number of processes (e.g. the 'root' process)
- SPMD programs can be loosely synchronous or asynchronous
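The SPMD idea, one program whose behavior differs only by process ID, can be sketched without a message-passing library. The code below is a sequential stand-in, not an MPI program: spmd_process, run_spmd, ROOT and MAX_PROCS are hypothetical names, the rank argument plays the role of whoami and nprocs the role of numprocs, and a shared array replaces the send and receive operations a real program would use to deliver partial sums to the root.

```c
#include <assert.h>

#define ROOT 0        /* assumed ID of the root process */
#define MAX_PROCS 64  /* arbitrary limit for this sketch */

/* The single program that every process runs.
 * All processes: sum the integers in 1..n assigned to them cyclically
 * (rank+1, rank+1+nprocs, ...) and store the result in partial[rank].
 * Root only: add up all partial sums and return the total.
 * Non-root processes return -1. */
long spmd_process(int rank, int nprocs, long n, long partial[])
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    long sum = 0;
    for (long i = rank + 1; i <= n; i += nprocs)
        sum += i;
    partial[rank] = sum;

    if (rank != ROOT)
        return -1;

    /* Root-only branch: the one place where the code path differs. */
    long total = 0;
    for (int r = 0; r < nprocs; r++)
        total += partial[r];
    return total;
}

/* Sequential stand-in for launching nprocs copies of the program.
 * Ranks run from high to low so that the root (rank 0) runs last and
 * sees every partial sum; a real program would receive them instead. */
long run_spmd(int nprocs, long n)
{
    long partial[MAX_PROCS];
    long result = -1;

    assert(nprocs > 0 && nprocs <= MAX_PROCS);
    for (int rank = nprocs - 1; rank >= 0; rank--) {
        long r = spmd_process(rank, nprocs, n, partial);
        if (rank == ROOT)
            result = r;
    }
    return result;
}
```

Calling run_spmd(4, 100) returns 5050 regardless of the number of processes chosen, since only the division of work changes, not the program text.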
12
Building Blocks: Send and Receive Operations (1)
send(void *sendbuf, int nelems, int dest)
receive(void *recvbuf, int nelems, int source)
- sendbuf points to a buffer that stores the data to be sent
- recvbuf points to a buffer that stores the data to be received
- nelems is the number of data units to be sent or received
- dest is the identifier of the process that receives the data
- source is the identifier of the process that sends the data
13
Send and Receive Operations (2)
P0                      P1
a = 100;                receive(&a, 1, 0);
send(&a, 1, 1);         printf("%d\n", a);
a = 0;

What does P1 print?
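The answer depends on when the data is actually read from the send buffer. If send copies the data (or blocks) before returning, P1 receives the value the buffer held at the time of the call and prints 100. If send returns before the transfer has taken place, P0's later assignment a = 0 can reach P1 instead. The single-process simulation below is a sketch of those two cases only: channel, send_buffered, send_deferred, receive_int and p1_prints are hypothetical names, and nothing here is a real message-passing API.

```c
#include <assert.h>
#include <stddef.h>

/* A one-slot channel between P0 and P1 (illustrative only). */
typedef struct {
    int staged;          /* copy taken at the time of a buffered send */
    const int *pending;  /* send buffer address kept by a deferred send */
} channel;

/* Buffered send: the data is copied out of sendbuf before returning,
 * so the sender may safely overwrite sendbuf afterwards. */
void send_buffered(channel *ch, const int *sendbuf)
{
    ch->staged = *sendbuf;
    ch->pending = NULL;
}

/* Deferred send: only the address is recorded and the data is read
 * later, modelling a send that returns before the transfer is done. */
void send_deferred(channel *ch, const int *sendbuf)
{
    ch->pending = sendbuf;
}

/* Receive: performs the transfer into recvbuf. For a deferred send this
 * is the moment the send buffer is finally read. */
void receive_int(channel *ch, int *recvbuf)
{
    *recvbuf = (ch->pending != NULL) ? *ch->pending : ch->staged;
}

/* Replays the slide's example and returns the value P1 would print.
 * a0 is P0's copy of a, a1 is P1's copy. */
int p1_prints(int deferred)
{
    channel ch = {0, NULL};
    int a0;
    int a1 = -1;

    a0 = 100;                     /* P0: a = 100;        */
    if (deferred)
        send_deferred(&ch, &a0);  /* P0: send(&a, 1, 1); */
    else
        send_buffered(&ch, &a0);
    a0 = 0;                       /* P0: a = 0;          */

    receive_int(&ch, &a1);        /* P1: receive(&a, 1, 0); */
    return a1;                    /* P1: printf("%d\n", a); */
}
```

In this model p1_prints(0) yields 100 and p1_prints(1) yields 0, which is why a send operation must either block or buffer the data if the program is to print 100 reliably.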