Application Mapping Over OFIWG SFI Sean Hefty. MPI Over SFI Example MPI Implementation over SFI Demonstrates possible usage model –Initialization –Send.


1 Application Mapping Over OFIWG SFI Sean Hefty

2 MPI Over SFI
Example MPI implementation over SFI
Demonstrates possible usage model
– Initialization
– Send injection
– Send Completions
– Polling
– RMA Counters Completions

3 Query Interfaces: Tagged

    /* Tagged provider */
    hints.type = FID_RDM;               /* reliable unconnected endpoint */
    /* address vector optimized for minimal memory footprint and no internal lookups */
    #ifdef MPIDI_USE_AV_MAP
    hints.addr_format = FI_ADDR;
    #else
    hints.addr_format = FI_ADDR_INDEX;
    #endif
    hints.protocol = FI_PROTO_UNSPEC;   /* transport agnostic */
    /* behavior required by endpoint */
    hints.ep_cap = FI_TAGGED | FI_BUFFERED_RECV | FI_REMOTE_COMPLETE | FI_CANCEL;
    /* default flags to apply to data transfer operations */
    hints.op_flags = FI_REMOTE_COMPLETE;

4 Query Interfaces: RMA/Atomics
Separate endpoint for RMA operations

    /* RMA provider */
    hints.type = FID_RDM;
    #ifdef MPIDI_USE_AV_MAP
    hints.addr_format = FI_ADDR;
    #else
    hints.addr_format = FI_ADDR_INDEX;
    #endif
    hints.protocol = FI_PROTO_UNSPEC;
    /* FI_RMA | FI_ATOMICS: support for RMA and atomic operations */
    /* FI_REMOTE_READ | FI_REMOTE_WRITE: remote RMA read and write support */
    hints.ep_cap = FI_RMA | FI_ATOMICS | FI_REMOTE_COMPLETE |
                   FI_REMOTE_READ | FI_REMOTE_WRITE;
    hints.op_flags = FI_REMOTE_COMPLETE;

5 Query Interfaces: Message Queue

    /* event queue optimized to report tagged completions */
    eq_attr.mask = FI_EQ_ATTR_MASK_V1;
    eq_attr.domain = FI_EQ_DOMAIN_COMP;
    eq_attr.format = FI_EQ_FORMAT_TAGGED;
    fi_eq_open(domainfd, &eq_attr, &p2p_eqfd, NULL);

    /* event queue optimized to report RMA completions */
    eq_attr.mask = FI_EQ_ATTR_MASK_V1;
    eq_attr.domain = FI_EQ_DOMAIN_COMP;
    eq_attr.format = FI_EQ_FORMAT_DATA;
    fi_eq_open(domainfd, &eq_attr, &rma_eqfd, NULL);

    /* associate endpoints with event queues */
    fi_bind(tagged_epfd, p2p_eqfd, FI_SEND | FI_RECV);
    fi_bind(rma_epfd, rma_eqfd, FI_READ | FI_WRITE);

6 Query Limits
Query endpoint limits

    /* maximum 'inject' data size – buffer is reusable immediately
       after the function call returns */
    optlen = sizeof(max_buffered_send);
    fi_getopt(tagged_epfd, FI_OPT_ENDPOINT, FI_OPT_MAX_INJECTED_SEND,
              &max_buffered_send, &optlen);

    /* maximum application-level message size */
    optlen = sizeof(max_send);
    fi_getopt(tagged_epfd, FI_OPT_ENDPOINT, FI_OPT_MAX_MSG_SIZE,
              &max_send, &optlen);

7 Short Send
Small sends map directly to the tagged-injectto call

    int MPIDI_Send(buf, count, datatype, rank, tag, comm,
                   context_offset, **request)
    {
        data_sz = get_size(count, datatype);
        if (data_sz <= max_buffered_send) {
            match_bits = init_sendtag(comm->context_id + context_offset,
                                      comm->rank, tag, 0);
            /* COMM_TO_PHYS: fabric address provided directly to provider */
            fi_tinjectto(tagged_epfd, buf, data_sz,
                         COMM_TO_PHYS(comm, rank), match_bits);
        } else {
            ...
        }
    }
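
The slide does not show how init_sendtag builds match_bits. As a minimal sketch, assuming a 64-bit value split into a 16-bit context id, a 16-bit source rank and a 32-bit tag, the packing could look like the code below. The field widths, the sketch_ names and the omission of the fourth argument (passed as 0 on the slide) are all assumptions made for illustration, not the layout the presentation uses.

```c
#include <stdint.h>

/* Hypothetical layout, chosen only for illustration:
 *   bits 63..48  context id  (16 bits)
 *   bits 47..32  source rank (16 bits)
 *   bits 31..0   MPI tag     (32 bits)
 */
#define SKETCH_CTX_SHIFT  48
#define SKETCH_RANK_SHIFT 32

/* Pack the three fields into one 64-bit match value. */
static uint64_t sketch_init_sendtag(uint16_t context_id, uint16_t rank,
                                    uint32_t tag)
{
    return ((uint64_t)context_id << SKETCH_CTX_SHIFT) |
           ((uint64_t)rank << SKETCH_RANK_SHIFT) |
           (uint64_t)tag;
}

/* Accessors a receiver could use to decode a matched message. */
static uint16_t sketch_tag_context(uint64_t bits)
{
    return (uint16_t)(bits >> SKETCH_CTX_SHIFT);
}

static uint16_t sketch_tag_rank(uint64_t bits)
{
    return (uint16_t)(bits >> SKETCH_RANK_SHIFT);
}

static uint32_t sketch_tag_value(uint64_t bits)
{
    return (uint32_t)bits;
}
```

Packing everything the receiver matches on into a single integer is what lets a small send be handed to the provider in one call, with no request object.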

8 Large Message Send
Large sends require request allocation

    int MPIDI_Send(buf, count, datatype, rank, tag, comm,
                   context_offset, **request)
    {
        /* code for type calculations, tag creation, etc */
        REQUEST_CREATE(sreq);
        /* SFI completion context embedded in request object */
        fi_tsendto(MPIDI_Global.tagged_epfd, send_buf, data_sz, NULL,
                   COMM_TO_PHYS(comm, rank), match_bits,
                   &(REQ_OF2(sreq)->of2_context));
        *request = sreq;
    }
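
Embedding the completion context inside the request means the request can be recovered from the context pointer by plain pointer arithmetic. The sketch below illustrates that pattern with offsetof. The struct layouts and the sketch_ names are hypothetical stand-ins; only the field name of2_context and the helper name context_to_request are taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the provider's per-operation context (layout assumed). */
struct sketch_context {
    void *internal[4];
};

/* Hypothetical request object with the context embedded in it. */
struct sketch_request {
    int kind;
    void (*callback)(struct sketch_request *req);
    struct sketch_context of2_context;
    int completed;
};

/* Recover the enclosing request from a pointer to its embedded context. */
static struct sketch_request *sketch_context_to_request(void *op_context)
{
    assert(op_context != NULL);
    return (struct sketch_request *)
        ((char *)op_context - offsetof(struct sketch_request, of2_context));
}
```

With this layout, no lookup table from contexts to requests is needed when a completion arrives.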

9 Progress/Polling for Completions

    int MPIDI_Progress()
    {
        /* fields of the tagged entry align with those of the data entry,
           so one wc buffer is used to read both queues */
        eq_tagged_entry_t wc;
        fid_eq_t fd[2] = {p2p_eqfd, rma_eqfd};
        for (i = 0; i < 2; i++) {
            MPID_Request *req;
            rc = fi_eq_read(fd[i], (void *)&wc, sizeof(wc));
            handle_errs(rc);
            req = context_to_request(wc.op_context);
            req->callback(req);
        }
    }
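
The loop on this slide can be exercised without a fabric by substituting a small in-memory queue for the event queue. The sketch below does that and also skips a queue when the read returns nothing, a case the slide leaves to handle_errs. Everything prefixed sketch_ is a hypothetical stand-in, and for brevity the completion context is the request itself rather than an embedded field.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_QUEUE_DEPTH 8

struct sketch_entry {
    void *op_context;
};

/* Toy stand-in for an event queue: a fixed-size ring of entries. */
struct sketch_queue {
    struct sketch_entry entries[SKETCH_QUEUE_DEPTH];
    size_t head, tail;
};

struct sketch_req {
    void (*callback)(struct sketch_req *req);
    int done;
};

static void sketch_mark_done(struct sketch_req *req)
{
    req->done = 1;
}

/* Post a completion; returns 0 on success, -1 if the ring is full. */
static int sketch_eq_push(struct sketch_queue *q, void *ctx)
{
    if (q->tail - q->head == SKETCH_QUEUE_DEPTH)
        return -1;
    q->entries[q->tail % SKETCH_QUEUE_DEPTH].op_context = ctx;
    q->tail++;
    return 0;
}

/* Returns bytes copied, 0 if the queue is empty, -1 if buf is too small. */
static long sketch_eq_read(struct sketch_queue *q, void *buf, size_t len)
{
    if (len < sizeof(struct sketch_entry))
        return -1;
    if (q->head == q->tail)
        return 0;
    memcpy(buf, &q->entries[q->head % SKETCH_QUEUE_DEPTH],
           sizeof(struct sketch_entry));
    q->head++;
    return (long)sizeof(struct sketch_entry);
}

/* Poll each queue once; returns completions handled, or -1 on error. */
static int sketch_progress(struct sketch_queue *queues[], size_t nqueues)
{
    int handled = 0;
    for (size_t i = 0; i < nqueues; i++) {
        struct sketch_entry wc;
        long rc = sketch_eq_read(queues[i], &wc, sizeof(wc));
        if (rc < 0)
            return -1;
        if (rc == 0)
            continue;               /* nothing completed on this queue */
        struct sketch_req *req = wc.op_context;
        req->callback(req);
        handled++;
    }
    return handled;
}
```

Reading at most one entry per queue per call keeps each progress pass short; a caller that needs to drain the queues can call it until it returns 0.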

10 RMA Completions (Counters and Completions)

    int MPIDI_Win_fence(MPID_Win *win)
    {
        /* Synchronize software counters via completions */
        PROGRESS_WHILE(win->started != win->completed);
        /* Synchronize hardware counters */
        fi_sync(WIN_OF2(win)->rma_epfd, FI_WRITE | FI_READ | FI_BLOCK, NULL);
        /* Notify any request based objects that use counter completion */
        RequestQ->notify();
    }
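
The software-counter half of the fence is a wait until every started operation has been counted as completed. A minimal sketch of that loop follows, assuming a window that carries only the two counters and a progress callback that retires completions. The sketch_ names and the stall check are illustrative additions; the slide's PROGRESS_WHILE macro is not shown in the deck.

```c
#include <stddef.h>

/* Hypothetical window state: only the two software counters. */
struct sketch_win {
    unsigned long started;    /* RMA operations issued */
    unsigned long completed;  /* RMA operations whose completion was seen */
};

/* Record that one RMA operation has been issued. */
static void sketch_rma_issue(struct sketch_win *win)
{
    win->started++;
}

/* Toy progress step: retire one outstanding operation, if any.
 * Returns the number of completions processed (0 or 1). */
static int sketch_complete_one(struct sketch_win *win)
{
    if (win->completed == win->started)
        return 0;
    win->completed++;
    return 1;
}

/* Drive progress until the counters match.
 * Returns 0 once they match, -1 if a progress call retires nothing. */
static int sketch_win_fence(struct sketch_win *win,
                            int (*progress)(struct sketch_win *))
{
    while (win->started != win->completed) {
        if (progress(win) == 0)
            return -1;        /* stalled; a real loop would keep polling */
    }
    return 0;
}
```

In a real progress engine an empty poll is normal and the loop would simply poll again; the early return here only keeps the sketch from spinning forever.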

