1
The hybrid approach to programming clusters of multi-core architectures
2
High-performance computing. Cluster computing: to gain high performance, many computers are networked together to work on computation-heavy problems as a whole. Multi-core computers: since maximum single-core speeds began to level off, multi-core computers became the popular solution.
3
High-performance computing. Multi-core clusters: with multi-core computers being popular, cluster nodes began to adopt this design. Multi-core nodes promise more powerful nodes while saving power and space, but programming them may require a different approach.
4
Programming. Cluster computers: the standard model for programming clusters is MPI, the Message Passing Interface, which is used mostly with distributed memory. Multi-core computers: multi-core computers have shared memory, and an efficient way to program shared memory is OpenMP.
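For illustration only (this sketch is not from the slides), the same global sum can be expressed in the two models: an MPI collective combining values held by separate processes, and an OpenMP reduction over data shared by the threads of one process. The array contents and sizes are arbitrary.

/* Hypothetical sketch: one sum via MPI across ranks, one via OpenMP across threads.
 * Build with an MPI compiler wrapper, e.g. mpicc -fopenmp sum.c */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* MPI: each process owns its value; a message-passing collective combines them. */
    double local = rank + 1.0, global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    /* OpenMP: threads of one process sum a shared array directly in shared memory. */
    double data[1000], shared_sum = 0.0;
    for (int i = 0; i < 1000; i++) data[i] = 1.0;
    #pragma omp parallel for reduction(+:shared_sum)
    for (int i = 0; i < 1000; i++) shared_sum += data[i];

    if (rank == 0)
        printf("MPI sum = %f, OpenMP sum = %f\n", global, shared_sum);

    MPI_Finalize();
    return 0;
}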
5
Multi-core clusters. When programming for multi-core clusters, MPI treats cores on different nodes and cores on the same node in the same manner. MPI completely ignores the fact that cores on a single node share memory. This causes problems because data is duplicated across the processes on a single node with multiple cores.
6
MPI. MPI uses communicator objects, which connect groups of processes in the MPI session. MPI supports point-to-point communication between two specific processes. Collective functions involve communication among all processes in a process group.
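A hedged sketch (not from the slides) of those three ideas: the MPI_COMM_WORLD communicator over all ranks, point-to-point MPI_Send/MPI_Recv between two specific ranks, and a collective MPI_Bcast involving every process in the group. The values exchanged are arbitrary.

/* Illustrative sketch of communicators, point-to-point, and collectives. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* MPI_COMM_WORLD: communicator over all ranks */
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Point-to-point: rank 0 sends one integer to rank 1. */
    if (size >= 2) {
        int value = 42;
        if (rank == 0)
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Collective: rank 0 broadcasts to every process in the communicator. */
    int shared = (rank == 0) ? 7 : 0;
    MPI_Bcast(&shared, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d sees broadcast value %d\n", rank, shared);
    MPI_Finalize();
    return 0;
}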
7
Problems with MPI collectives. Some collectives take array arguments whose size equals the number of processes in the communicator; examples are MPI_Gatherv, MPI_Scatterv, and MPI_Alltoallv. In the case of MPI_Alltoallv, the argument arrays for both sends and receives add up to 4 MB on each process if there are a million processes.
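To make the scaling concern concrete, here is a sketch (assumed, not from the slides) of the four per-process argument arrays that MPI_Alltoallv requires. Each array has one entry per rank in the communicator, so its footprint on every process grows linearly with the process count.

/* Illustrative MPI_Alltoallv call: every rank sends one int to every other rank.
 * Note the four argument arrays whose length equals the communicator size. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf    = malloc(size * sizeof(int));
    int *recvbuf    = malloc(size * sizeof(int));
    int *sendcounts = malloc(size * sizeof(int));   /* one count per destination rank */
    int *sdispls    = malloc(size * sizeof(int));   /* one displacement per destination rank */
    int *recvcounts = malloc(size * sizeof(int));   /* one count per source rank */
    int *rdispls    = malloc(size * sizeof(int));   /* one displacement per source rank */

    for (int i = 0; i < size; i++) {
        sendbuf[i]    = rank;
        sendcounts[i] = 1;
        sdispls[i]    = i;
        recvcounts[i] = 1;
        rdispls[i]    = i;
    }

    /* With P processes these four arrays alone hold 4*P integers on every rank. */
    MPI_Alltoallv(sendbuf, sendcounts, sdispls, MPI_INT,
                  recvbuf, recvcounts, rdispls, MPI_INT, MPI_COMM_WORLD);

    free(sendbuf); free(recvbuf);
    free(sendcounts); free(sdispls); free(recvcounts); free(rdispls);
    MPI_Finalize();
    return 0;
}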
8
Solution. The combination of OpenMP and MPI is a worthy solution: MPI handles the communication between nodes across distributed memory, while OpenMP handles the work within a single node through shared memory.
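A minimal hybrid sketch, assuming MPI_THREAD_FUNNELED support and an arbitrary per-node workload (not the slides' own code): OpenMP threads share the node's memory for the local computation, and MPI then combines the per-node results across nodes.

/* Hybrid MPI+OpenMP sketch: one MPI rank per node, OpenMP threads inside it. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv)
{
    int provided;
    /* Ask for a threading level where only the master thread calls MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local[N], partial = 0.0, total = 0.0;
    for (int i = 0; i < N; i++) local[i] = rank + 1.0;

    /* Intra-node work: OpenMP threads operate on shared memory. */
    #pragma omp parallel for reduction(+:partial)
    for (int i = 0; i < N; i++)
        partial += local[i];

    /* Inter-node communication: MPI combines the per-node results. */
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f\n", total);

    MPI_Finalize();
    return 0;
}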
9
Implementation. Since OpenMP is not an all-or-nothing model, it can be introduced into selected parts of the program. One can identify the most time-consuming loops and place OpenMP directives on them. One can also place directives on loops over undistributed dimensions.
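For example (a sketch under assumed names; relax, u_old, u_new, and f are hypothetical), a single directive added to a time-consuming loop parallelizes just that loop and leaves the rest of the program, including its MPI calls, unchanged.

/* Incremental OpenMP: only the hot loop gets a directive; everything else stays serial. */
void relax(const double *u_old, double *u_new, const double *f, int n)
{
    /* The time-consuming loop: one directive distributes iterations over threads. */
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++)
        u_new[i] = 0.5 * (u_old[i - 1] + u_old[i + 1]) - f[i];
}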
10
Hybrid masteronly. In this model only one MPI process is used per node, and all parallelism within the node is handled by OpenMP. Problems: the other threads in the node idle while MPI communication is taking place, and the MPI bandwidth is not always fully used with a single communicating thread.
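A hedged sketch of the masteronly pattern for one time step of a 1D halo exchange; the function timestep and the arrays u_old, u_new, and N are hypothetical, and the MPI setup is assumed to look like the earlier hybrid sketch. All MPI calls sit outside the OpenMP parallel region, so only the master thread communicates while the other threads wait.

/* Masteronly sketch: MPI calls only between OpenMP parallel regions. */
#include <mpi.h>

#define N 1000

/* local array with one halo cell on each side (illustrative layout) */
static double u_old[N + 2], u_new[N + 2];

void timestep(int rank, int size)
{
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Communication phase: only the master thread is active; the others idle here. */
    MPI_Sendrecv(&u_old[1], 1, MPI_DOUBLE, left, 0,
                 &u_old[N + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u_old[N], 1, MPI_DOUBLE, right, 1,
                 &u_old[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Computation phase: all OpenMP threads work on the node's shared memory. */
    #pragma omp parallel for
    for (int i = 1; i <= N; i++)
        u_new[i] = 0.5 * (u_old[i - 1] + u_old[i + 1]);
}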
11
Hybrid with overlap. This method avoids idling compute threads during communication: the communication is assigned to one or more OpenMP threads, which handle it in parallel with useful computation on the remaining threads.
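An illustrative sketch of the overlap scheme under the same assumed names as the masteronly sketch (timestep, u_old, u_new, N are hypothetical): inside one parallel region, thread 0 performs the MPI halo exchange while the remaining threads update interior points that do not depend on the halo; the boundary points are finished after the region ends.

/* Overlap sketch: one thread communicates while the others compute interior points.
 * Requires MPI_Init_thread with at least MPI_THREAD_FUNNELED. */
#include <mpi.h>
#include <omp.h>

#define N 1000

static double u_old[N + 2], u_new[N + 2];

void timestep(int rank, int size)
{
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0) {
            /* Thread 0: halo exchange overlapped with the other threads' computation. */
            MPI_Sendrecv(&u_old[1], 1, MPI_DOUBLE, left, 0,
                         &u_old[N + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u_old[N], 1, MPI_DOUBLE, right, 1,
                         &u_old[0], 1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        /* All threads: update interior points that need no halo data. */
        #pragma omp for schedule(static) nowait
        for (int i = 2; i <= N - 1; i++)
            u_new[i] = 0.5 * (u_old[i - 1] + u_old[i + 1]);
    }   /* implicit barrier: communication and interior work are both finished here */

    /* Boundary points that depend on the freshly received halo values. */
    u_new[1] = 0.5 * (u_old[0] + u_old[2]);
    u_new[N] = 0.5 * (u_old[N - 1] + u_old[N + 1]);
}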
12
Conclusion. The hybrid approach has advantages over pure MPI in some cases, but not in all; in some cases the effect is reversed. If programmed well, applications can be very scalable and show a large performance increase.