Implementation and Optimization of MPI point-to-point communications on SMP-CMP clusters with RDMA capability.


1 Implementation and Optimization of MPI point-to-point communications on SMP-CMP clusters with RDMA capability

2 MPI point-to-point communication
Communication pairs MPI_Send with MPI_Recv, or MPI_Isend/MPI_Irecv with MPI_Wait. There is an implicit synchronization: the receiver can complete only after the sender performs the send, and the communication operation cannot complete until both sender and receiver are ready.
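The implicit synchronization can be illustrated without an MPI installation. The sketch below is a toy model, not MPI code: a Python thread plays the receiver, a one-slot queue stands in for the network, and the name `run_pair` is hypothetical. It only shows that a blocking receive posted early still completes after the send is performed.

```python
import queue
import threading


def run_pair(message):
    """Return (events, received) for one blocking send/receive pair."""
    channel = queue.Queue(maxsize=1)   # stands in for the network
    events = []                        # ordered log of what happened
    received = []
    recv_posted = threading.Event()

    def receiver():
        events.append("recv posted")
        recv_posted.set()
        received.append(channel.get())  # blocks until the sender puts data
        events.append("recv complete")

    worker = threading.Thread(target=receiver)
    worker.start()
    recv_posted.wait()                 # force the receive to be posted first
    events.append("send performed")
    channel.put(message)
    worker.join()
    return events, received[0]


if __name__ == "__main__":
    log, data = run_pair(b"hello")
    print(log, data)
```

Even though the receive is posted before the send in this model, "recv complete" is always logged after "send performed".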

3 MPI point-to-point communication
Use different protocols for small and large messages. Eager protocol for small messages: low-latency communication; the sender does not depend on the receiver. Rendezvous protocol for large messages: no message copy.
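A minimal sketch of size-based protocol selection follows. The slides do not give a threshold or the exact message sequence, so `EAGER_LIMIT`, `choose_protocol`, and `wire_steps` are assumptions made for illustration; the three-step handshake (request to send, clear to send, data) is one common form of rendezvous, not necessarily the one used in this work.

```python
EAGER_LIMIT = 8 * 1024  # hypothetical threshold in bytes, not from the slides


def choose_protocol(nbytes, eager_limit=EAGER_LIMIT):
    """Pick a protocol from the message size alone."""
    if nbytes < 0:
        raise ValueError("message size must be non-negative")
    return "eager" if nbytes <= eager_limit else "rendezvous"


def wire_steps(nbytes, eager_limit=EAGER_LIMIT):
    """List the steps each protocol takes in this simplified model."""
    if choose_protocol(nbytes, eager_limit) == "eager":
        # The sender proceeds without waiting for the receiver,
        # at the price of an extra copy on the receiving side.
        return [
            "sender: push payload into a pre-allocated receive-side buffer",
            "receiver: copy payload from that buffer into the user buffer",
        ]
    # The handshake lets data land directly in the user buffer (no copy),
    # but the sender now depends on the receiver being ready.
    return [
        "sender: request to send (RTS)",
        "receiver: clear to send (CTS) with user buffer address",
        "sender: transfer payload directly into the user buffer",
    ]


if __name__ == "__main__":
    for size in (64, 1 << 20):
        print(size, choose_protocol(size), len(wire_steps(size)))
```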

4 Eager protocol

5 Rendezvous protocol

6 Existing RDMA based small message channel – the MVAPICH design [Liu03]

7 Our improved design – eliminating persistent buffer association

8 Further improvement – node-shared Small message channels

9

10

11

12

13

14

15

16

17 Optimizing Rendezvous protocol – ideal rendezvous protocol
SS: send start, SW: send wait, RS: receive start, RW: receive wait. Once both sender and receiver have initiated the communication, the data transfer should start.
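The ideal behavior can be written as a small timing model. This is a sketch under assumptions: times are abstract units, network latency is ignored, and `ideal_rendezvous` and its `transfer_time` parameter are hypothetical names.

```python
def ideal_rendezvous(ss, sw, rs, rw, transfer_time):
    """Timing of an ideal rendezvous given the four event times."""
    if ss > sw or rs > rw:
        raise ValueError("a wait cannot precede its start")
    # Data starts moving as soon as both sides have initiated.
    start = max(ss, rs)
    done = start + transfer_time
    return {
        "transfer_start": start,
        # A wait returns immediately if the transfer has already finished.
        "send_wait_returns": max(sw, done),
        "recv_wait_returns": max(rw, done),
    }


if __name__ == "__main__":
    print(ideal_rendezvous(ss=0, sw=10, rs=2, rw=12, transfer_time=5))
```

With SS=0, SW=10, RS=2, RW=12 and a transfer time of 5, the transfer runs from 2 to 7, so neither wait call blocks.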

18 Optimizing Rendezvous protocol – the problem
Poor progress

19 Optimizing Rendezvous protocol – the problem
The performance is heavily affected by the timing of the events. Is it possible to achieve near-optimal performance in all timing situations?
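One way to see the timing sensitivity is a toy model of a handshake-based rendezvous in which a process notices incoming control messages only while it is inside a communication call. The model is an assumption for illustration, not the slides' own analysis: the sender announces at SS, the receiver replies when it next enters the library, and the sender starts the transfer when it next enters the library. The names `notice` and `handshake_rendezvous` are hypothetical, and latency is ignored.

```python
def notice(call_start, call_wait, t):
    """Earliest time a process can react to a message that arrives at t.

    The process is inside the library at call_start, and continuously
    from call_wait onward (the wait call blocks until completion).
    """
    return call_start if call_start >= t else max(call_wait, t)


def handshake_rendezvous(ss, sw, rs, rw, transfer_time):
    """Timing of a sender-initiated handshake without independent progress."""
    if ss > sw or rs > rw:
        raise ValueError("a wait cannot precede its start")
    reply_sent = notice(rs, rw, ss)          # receiver reacts to the request
    start = notice(ss, sw, reply_sent)       # sender reacts to the reply
    done = start + transfer_time
    return {
        "transfer_start": start,
        "ideal_start": max(ss, rs),
        "send_wait_returns": max(sw, done),
        "recv_wait_returns": max(rw, done),
    }


if __name__ == "__main__":
    print(handshake_rendezvous(ss=0, sw=10, rs=2, rw=12, transfer_time=5))
    print(handshake_rendezvous(ss=5, sw=6, rs=0, rw=1, transfer_time=5))
```

In this model, the first case starts the transfer at 10 instead of the ideal 2, because the sender is computing when the reply arrives. The second case matches the ideal start of 5, because the receiver is already blocked in its wait.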

20

21

22

23

24

25

26 How to use these protocols
Dynamic protocol selection: design a mega-protocol that combines several of these protocols. Profile-guided optimization: use profiling to determine the timing information, then use it to select the protocol. Compiler-assisted optimization: use compiler analysis to determine the timing information, then use it to select the best-performing protocol.
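Profile-guided selection can be sketched as picking the candidate with the earliest average transfer start over recorded event times. The two candidates below are hypothetical stand-ins, since the transcript does not describe the actual protocols: in one, the sender announces and the receiver pulls the data when it next enters the library; in the other, the receiver announces and the sender pushes the data when it next enters the library. All function and protocol names are assumptions.

```python
def notice(call_start, call_wait, t):
    """Earliest time a process inside the library can react to an event at t."""
    return call_start if call_start >= t else max(call_wait, t)


def sender_announces(ss, sw, rs, rw):
    """Transfer start when the receiver pulls after the sender's announcement."""
    return notice(rs, rw, ss)


def receiver_announces(ss, sw, rs, rw):
    """Transfer start when the sender pushes after the receiver's announcement."""
    return notice(ss, sw, rs)


PROTOCOLS = {
    "sender_announces": sender_announces,
    "receiver_announces": receiver_announces,
}


def select_protocol(profile):
    """Choose the protocol with the lowest mean transfer start.

    profile is a non-empty list of (ss, sw, rs, rw) tuples from profiling.
    """
    if not profile:
        raise ValueError("profile must contain at least one sample")

    def mean_start(fn):
        return sum(fn(*sample) for sample in profile) / len(profile)

    return min(PROTOCOLS, key=lambda name: mean_start(PROTOCOLS[name]))


if __name__ == "__main__":
    sender_early = [(0, 10, 2, 12), (0, 8, 3, 9)]
    receiver_early = [(5, 20, 0, 30)]
    print(select_protocol(sender_early))
    print(select_protocol(receiver_early))
```

A compiler-assisted variant would feed the same selection step with event times estimated by static analysis rather than measured by profiling.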

