Implementing SRC in MPI Ishai Rabinovitz 19/7/07.


1 Implementing SRC in MPI Ishai Rabinovitz 19/7/07

2 SRC with two (and three) machines
[Figure: two machines, each with two cores (Core 1, Core 2). Diagram labels per machine: SRC domain, rcv QP, send QP, SRQ, SHM.]

3 [Figure: three machines, each with two cores (Core 1, Core 2). Diagram labels per machine: SRC domain, rcv QP, send QP, SRQ, SHM.]

4 Notations:
(m,c) indicates core c in machine m.
SendQP((m1, c1), m2) is the QP that sends from core (m1, c1) to any core in machine m2.
RecvQP(m1, (m2, c2)) is the RecvQP in m1 that should get messages from core (m2, c2).
SRQ(m1, c1) is the SRQ in core (m1, c1).

5 Data structures
Each core (m,c) has:
  –Its own SRQ (SRQ(m, c))
  –A SendQPs table that maps each machine to the SendQP for that machine
    Each entry is of the kind: m’->SendQP((m, c), m’)
  –A Ranks table that maps ranks to their (m’, c’) and to the SRQ id (and SendQP) that should be used when sending messages to this rank.
    Each entry is of the kind: r’->((m’, c’), SRQ(m’, c’), SendQP((m, c), m’))
The shared memory has:
  –An RC (reference count) that counts the number of cores working with the shared entities. (We may use the SRC domain RC for this purpose.)
  –A RecvQP for each remote rank.
  –A RecvQPs table that maps remote cores/ranks to the RecvQP number that these cores should use to send messages to the current machine.
    Each entry is of the kind: r’->RecvQP(m, (m’, c’))
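The tables on this slide can be sketched as plain in-memory records. This is only an illustrative model: the names Core, RankEntry, CoreState and SharedState are hypothetical, QP and SRQ numbers are plain integers, and no InfiniBand or MPI API is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, NamedTuple, Optional


class Core(NamedTuple):
    machine: int  # m
    core: int     # c


class RankEntry(NamedTuple):
    core: Core              # (m', c') hosting the rank
    srq_id: Optional[int]   # SRQ(m', c'); None until a connection exists
    send_qp: Optional[int]  # SendQP((m, c), m'); None until a connection exists


@dataclass
class CoreState:
    """Per-core (private) state of core (m, c)."""
    me: Core
    srq_id: int                                                # SRQ(m, c)
    send_qps: Dict[int, int] = field(default_factory=dict)     # m' -> SendQP((m, c), m')
    ranks: Dict[int, RankEntry] = field(default_factory=dict)  # r' -> RankEntry


@dataclass
class SharedState:
    """State kept in shared memory, one instance per machine."""
    ref_count: int = 0                                      # cores using the shared entities
    recv_qps: Dict[int, int] = field(default_factory=dict)  # r' -> RecvQP(m, (m', c'))


# Hypothetical example: core (0, 1) learns how to reach rank 5 on core (1, 0).
shared = SharedState()
state = CoreState(me=Core(0, 1), srq_id=7)
shared.ref_count += 1
state.ranks[5] = RankEntry(Core(1, 0), None, None)  # filled at initialization
state.send_qps[1] = 42                               # SendQP to machine 1
state.ranks[5] = state.ranks[5]._replace(srq_id=9, send_qp=state.send_qps[1])
shared.recv_qps[5] = 17                              # RecvQP used by rank 5
```

Keeping the SendQPs table keyed by machine while the Ranks table is keyed by rank mirrors the slide: several ranks on one remote machine share a single SendQP but each has its own SRQ id.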

6 On initialization
Each core does the following protocol:
  –Lock(file)
  –Try to create the SRC domain
    If this succeeds, then you are the owner of the domain
    If this fails, then the SRC domain already exists; connect to it
    In any case, increase the RC
  –Unlock(file)
  –Create the SRQ, connect it to the SRC domain and save its number
  –Fill the Ranks table with the (m, c) for all ranks
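The create-or-attach step above can be sketched with threads standing in for the cores of one machine. This is a simulation under stated assumptions: threading.Lock replaces Lock(file), a dictionary replaces the SRC domain, and SharedDomain, init_core and the SRQ numbering are hypothetical names chosen for illustration.

```python
import threading


class SharedDomain:
    """Stand-in for the per-machine shared state guarded by Lock(file)."""

    def __init__(self):
        self.lock = threading.Lock()  # assumption: replaces the file lock
        self.domain = None            # the SRC domain, once created
        self.ref_count = 0            # the RC from the slides


def init_core(shared, core_id, owners):
    with shared.lock:                         # Lock(file)
        if shared.domain is None:             # creation succeeds: we own the domain
            shared.domain = {"owner": core_id}
            owners.append(core_id)
        # otherwise the domain already exists and we simply attach to it
        shared.ref_count += 1                 # in any case, increase the RC
    # Unlock(file) happened on leaving the with-block.
    # Create the SRQ and remember its number (numbers here are made up).
    return {"core": core_id, "srq_id": 100 + core_id, "domain": shared.domain}


shared = SharedDomain()
owners, results = [], []


def worker(core_id):
    results.append(init_core(shared, core_id, owners))


threads = [threading.Thread(target=worker, args=(c,)) for c in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever core takes the lock first becomes the owner; the rest attach to the same domain object, and the RC ends up equal to the number of cores.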

7 Connection (main idea)
When core1 wants to connect to core2 on another machine, it sends a connection request.
This connection request has only the information on how core2 can create a connection for sending messages to core1.
After core2 establishes a connection to core1, it sends a reconnection request to core1 with data that will allow core1 to establish a connection on which it can send messages to core2.
This reconnection message can be sent on the first connection that was established.

8 Creating a connection
Core (m1, c1) that wants to connect to another rank r2 (in core (m2, c2)) does the following:
  –Checks if there is an entry for r2 in the shared RecvQPs table (looking for RecvQP(m1, (m2, c2)))
  –If not:
    Sets a lock on the shared RecvQPs table (we can use lock(file))
    Rechecks that there is no entry for r2 in the RecvQPs table
    Creates this RecvQP and saves its info in the RecvQPs table
    Unlocks
Sends (using Eth. or UD) a connection request with the following information to the other rank:
  –Its rank
  –Its details (m1, c1)
  –Its SRQ id
  –RecvQP(m1, (m2, c2)) number
Our SRQ(m1, c1) id. (We will create the SRQ on initialization.)
The SendQP number
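The check, lock, recheck, create sequence on this slide is a double-checked lookup on the shared RecvQPs table. A minimal sketch, assuming a threading.Lock in place of lock(file) and a counter in place of real QP creation; SharedRecvQPs, get_or_create_recv_qp and build_connection_request are hypothetical names.

```python
import itertools
import threading

_qp_numbers = itertools.count(1)  # stand-in for QP numbers handed out by the HCA


class SharedRecvQPs:
    """Shared RecvQPs table of one machine: remote rank -> RecvQP number."""

    def __init__(self):
        self.lock = threading.Lock()  # assumption: replaces lock(file)
        self.table = {}


def get_or_create_recv_qp(shared, remote_rank):
    qpn = shared.table.get(remote_rank)          # first, unlocked check
    if qpn is None:
        with shared.lock:
            qpn = shared.table.get(remote_rank)  # recheck under the lock
            if qpn is None:
                qpn = next(_qp_numbers)          # "create" the RecvQP
                shared.table[remote_rank] = qpn
    return qpn


def build_connection_request(shared, my_rank, my_core, my_srq_id, remote_rank):
    """Collect the four fields the slide lists for a connection request."""
    return {
        "rank": my_rank,
        "core": my_core,  # (m1, c1)
        "srq_id": my_srq_id,
        "recv_qp": get_or_create_recv_qp(shared, remote_rank),
    }


shared = SharedRecvQPs()
# Two cores on the same machine both want to reach remote rank 9.
req_a = build_connection_request(shared, my_rank=0, my_core=(0, 0), my_srq_id=100, remote_rank=9)
req_b = build_connection_request(shared, my_rank=1, my_core=(0, 1), my_srq_id=101, remote_rank=9)
```

Both requests carry the same RecvQP number because the table is shared per machine, while each request keeps its own SRQ id.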

9 Handling connection and reconnection requests
When a rank (m1, c1) gets a connection request from (m2, c2), it does the following protocol:
  –Checks in the SendQPs table if there is already a SendQP to m2.
  –If there is such a SendQP, increases its reference count
  –If there is no such SendQP:
    Creates this SendQP
    Connects it to the RecvQP it got in the connection message
    Updates the SendQPs table with this SendQP
    Updates the SRQs table with the SRQ it got in the connection request and with the SendQP it got in the last action.
  –If it is a connection request (and not a reconnection request), does the same protocol as in the last slide and sends a reconnection request back to (m2, c2). This reconnection request can be sent using the IB connection that was established.
  –If there is a waiting message to this rank in the waiting queue, sends it.
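The handler described above can be modelled as follows. This is a sketch, not the presenter's implementation: CoreState, handle_request and the request dictionary layout are hypothetical, QP creation is a counter, and the sketch records the SRQ for the requesting rank whether or not the SendQP already existed, which is an assumption about the slide's intent.

```python
from collections import deque


class CoreState:
    def __init__(self):
        self.send_qps = {}  # machine -> {"qpn", "refs", "remote_recv_qp"}
        self.srqs = {}      # rank -> (srq_id, send_qp number)
        self.waiting = {}   # rank -> deque of payloads waiting for a connection
        self.sent = []      # (send_qp, srq_id, payload) records, for inspection
        self._next_qpn = 1

    def new_qpn(self):
        qpn = self._next_qpn
        self._next_qpn += 1
        return qpn


def handle_request(state, req, is_reconnect, send_reconnect):
    machine = req["core"][0]
    entry = state.send_qps.get(machine)
    if entry is not None:
        entry["refs"] += 1  # a SendQP to this machine exists: count one more user
    else:
        # Create the SendQP and "connect" it to the RecvQP named in the request.
        entry = {"qpn": state.new_qpn(), "refs": 1, "remote_recv_qp": req["recv_qp"]}
        state.send_qps[machine] = entry
    # Assumption: remember the rank's SRQ in both branches.
    state.srqs[req["rank"]] = (req["srq_id"], entry["qpn"])
    if not is_reconnect:
        send_reconnect(req["rank"])  # answer a connection request
    queue = state.waiting.pop(req["rank"], deque())
    while queue:                     # flush messages that were waiting
        state.sent.append((entry["qpn"], req["srq_id"], queue.popleft()))


state = CoreState()
state.waiting[3] = deque(["hello"])
replies = []
handle_request(state, {"rank": 3, "core": (2, 0), "srq_id": 11, "recv_qp": 50}, False, replies.append)
handle_request(state, {"rank": 4, "core": (2, 1), "srq_id": 12, "recv_qp": 50}, True, replies.append)
```

In the example both requests come from machine 2, so only one SendQP is created and its reference count reaches 2; only the first request, being a plain connection request, triggers a reconnection reply.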

10 Message from (m1,c1) to (m2,c2)
To send a message from (m1, c1) to (m2, c2) do the following:
  –Look in the SRQs table for the rank.
  –If it does not exist in the table:
    Translate the rank to the (m2, c2) tuple using the Ranks table you got in the beginning
    Create a connection to (m2, c2)
    Move the message to the wait queue.
  –If it exists in the table:
    Take from the table SendQP((m1, c1), m2) and SRQ(m2, c2)
    Send the message from SendQP((m1, c1), m2) using the SRQ(m2, c2) id in the message.
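The send path above reduces to one table lookup with a fallback. A minimal sketch, assuming the state is a plain dictionary and that posting to the wire is modelled by appending to a list; send_message and the state keys are hypothetical.

```python
def send_message(state, rank, payload, connect):
    """Send now if the rank is in the SRQs table, otherwise connect and queue."""
    entry = state["srqs"].get(rank)
    if entry is None:
        target = state["ranks"][rank]  # translate the rank to (m2, c2)
        connect(target)                # start creating the connection
        state["waiting"].setdefault(rank, []).append(payload)
        return False
    srq_id, send_qp = entry
    # The SRQ id travels with the message so the receiver can pick the right SRQ.
    state["wire"].append({"send_qp": send_qp, "srq_id": srq_id, "payload": payload})
    return True


state = {"ranks": {5: (1, 0)}, "srqs": {}, "waiting": {}, "wire": []}
requested = []
first = send_message(state, 5, "a", requested.append)   # not connected yet
state["srqs"][5] = (9, 42)                              # connection established later
second = send_message(state, 5, "b", requested.append)  # goes out directly
```

The first call only queues the payload and requests a connection; delivering the queued payload is the job of the request handler once the connection exists.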

11 Cleaning
Each rank that finishes its execution should:
  –Close all SendQPs of this rank.
  –Free the SRQ of this rank
  –Put back the SRC domain
  –Decrease the RC
  –If the RC reaches 0 (or the SRC domain is free), close all shared RecvQPs and free the shared tables
  –Maybe we should release a RecvQP after we get a disconnect in the CM and not after all cores have finished.
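The cleanup steps can be sketched as a reference-counted release. The slide does not say how the RC decrement is synchronized; the lock here is an assumption, and Shared and finalize_core are hypothetical names.

```python
import threading


class Shared:
    """Per-machine shared state: the RC and the shared RecvQPs table."""

    def __init__(self, ref_count, recv_qps):
        self.lock = threading.Lock()  # assumption: the RC update is guarded
        self.ref_count = ref_count
        self.recv_qps = recv_qps
        self.freed = False


def finalize_core(core, shared):
    core["send_qps"].clear()  # close all SendQPs of this rank
    core["srq_id"] = None     # free the SRQ of this rank
    core["domain"] = None     # put back the SRC domain
    with shared.lock:
        shared.ref_count -= 1       # decrease the RC
        if shared.ref_count == 0:   # last core out frees the shared parts
            shared.recv_qps.clear()
            shared.freed = True


shared = Shared(ref_count=2, recv_qps={9: 1})
core_a = {"send_qps": {1: 42}, "srq_id": 100, "domain": "src"}
core_b = {"send_qps": {1: 43}, "srq_id": 101, "domain": "src"}

finalize_core(core_a, shared)
still_shared = dict(shared.recv_qps)  # RecvQPs survive while core_b is running
finalize_core(core_b, shared)
```

Only the second call, which brings the RC to 0, releases the shared RecvQPs.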

