Published by Eleanore Hodge. Modified over 8 years ago.
1
A SPMD Model for OCR
Sanjay Chatterjee
2/9/2015
Intel Confidential
2
OCR SPMD model

A SPMD context in OCR is a collection of individual logical execution units called ranks.
A rank has a unique id within a SPMD context and can be viewed as a sequential chain of SPMD EDTs.
A SPMD context includes two kinds of SPMD EDT templates: compute and sync.
A SPMD rank starts by launching a compute EDT instance.
SPMD ranks collectively synchronize by individually calling a SYNC operation.
SPMD ranks collectively transition from a synchronization phase to a computation phase by individually calling COMPUTE.
A SPMD EDT restarts itself by calling NEXT.

[Figure: a SPMD context with four ranks passing through a compute phase, a collective sync phase, and a second compute phase. Each rank is a chain of compute (C) and sync (S) EDTs:
    RANK 0: C00 S00 S01 S02 C01
    RANK 1: C10 C11 C12 S10 C13
    RANK 2: C20 C21 C22 S20 S21 C23
    RANK 3: C30 C31 C32 S30 C33
Legend: RANK MESSAGE, NEXT, SYNC, COMPUTE.]
3
SPMD EDTs vs Regular EDTs

SPMD EDTs are of two kinds: COMPUTE and SYNC.
SPMD EDTs are anonymized, i.e., they do not have a guid.
A SPMD EDT only lives within a SPMD context and is associated with a rank.
Returning from a SPMD EDT exits the rank from the SPMD context.
A SPMD EDT can restart itself by calling NEXT.
A compute SPMD EDT can call SYNC to exit itself and start a new sync EDT on the same rank.
A sync SPMD EDT can call COMPUTE to exit itself and start a new compute EDT on the same rank.
A compute EDT calling COMPUTE, or a sync EDT calling SYNC, is an error.
A SPMD EDT in one rank can communicate with another rank using rank messages.
A SPMD EDT can add a self dependence with a new API, ocrAddSelfDependence, i.e., the EDT can make its next instance dependent on another Event/DB.
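The transition rules on this slide can be sketched as a small validity check. This is a standalone C model, not OCR code: every name in it (`edt_kind_t`, `spmd_op_t`, `spmd_transition`) is hypothetical, and it encodes only the rules stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; none of these names belong to the OCR API. */
typedef enum { EDT_COMPUTE, EDT_SYNC } edt_kind_t;
typedef enum { OP_NEXT, OP_SYNC, OP_COMPUTE, OP_RETURN } spmd_op_t;
typedef enum { RES_ERROR, RES_CONTINUE, RES_EXIT_RANK } spmd_result_t;

/* Applies `op` to the EDT kind stored in *kind.
 * On a legal SYNC or COMPUTE call, *kind becomes the kind of the EDT
 * that runs next on the same rank. */
spmd_result_t spmd_transition(edt_kind_t *kind, spmd_op_t op)
{
    assert(kind != NULL);
    switch (op) {
    case OP_NEXT:
        /* NEXT restarts the current EDT, so the kind is unchanged. */
        return RES_CONTINUE;
    case OP_SYNC:
        /* Only a compute EDT may call SYNC. */
        if (*kind != EDT_COMPUTE) return RES_ERROR;
        *kind = EDT_SYNC;
        return RES_CONTINUE;
    case OP_COMPUTE:
        /* Only a sync EDT may call COMPUTE. */
        if (*kind != EDT_SYNC) return RES_ERROR;
        *kind = EDT_COMPUTE;
        return RES_CONTINUE;
    case OP_RETURN:
        /* Returning from a SPMD EDT exits the rank from the context. */
        return RES_EXIT_RANK;
    }
    return RES_ERROR;
}
```

A caller would keep one `edt_kind_t` per rank and feed it the sequence of calls that rank makes; any `RES_ERROR` marks a call the slide describes as illegal.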
4
Creating and launching a SPMD Context

u8 ocrSpmdLaunch( u64 numRanks, ocrGuid_t computeTemplate, u32 paramc, u64* paramv,
                  u32 spmdDepc, ocrSpmdDep_t *spmdDepv, ocrGuid_t collector,
                  ocrGuid_t affinity, ocrGuid_t outputEvent );
    [in] numRanks : Number of ranks in the SPMD context
    [in] computeTemplate : Guid of the compute EDT template
    [in] paramc : Number of SPMD params
    [in] paramv : Params for the compute EDTs. They are copied to every rank.
    [in] spmdDepc : Number of ocrSpmdDep_t inputs for the compute EDT
    [in] spmdDepv : The ocrSpmdDep_t inputs for the compute EDT
    [in] collector : Guid of a library of synchronization algorithms
    [in] affinity : Affinity guid of the SPMD EDT
    [in] outputEvent : SPMD output event

u8 ocrSpmdDepCreate( ocrSpmdDep_t *dep, ocrGuid_t db, ocrDbAccessMode_t mode,
                     SpmdDepType_t type, u64 index, size_t elSize );
    Creates a SPMD dependence to provide as input to the SPMD context.
    [out] dep: The SPMD EDT dep variable
    [in] db: Guid of the DB used as input to the SPMD context
    [in] mode: Access mode on the DB
    [in] type: Type of SPMD dependence. Can be either “REGULAR” or “INDEXED”.
        REGULAR: The DB used in a “regular” dep is copied to the compute EDTs on every rank. Each rank gets a new GUID for the copied DB.
        INDEXED: The DB used in an “indexed” dep is read in slices on every rank. Each rank gets a new DB for the slice it uses.
            The DB should be an array containing elements of size “elSize”.
            The array length should be at least “index” + SPMD numRanks.
            Each rank “i” gets a DB of size “elSize” starting at offset ((index + i) * elSize) of the source DB.
    [in] index: Used for “INDEXED” deps only. It is the starting index at which rank 0 begins its slice.
    [in] elSize: Used for “INDEXED” deps only. Size of an element in the input DB.
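The INDEXED slicing rule is plain offset arithmetic, so it can be illustrated without OCR. The sketch below is a standalone C model under that assumption: the helper names are hypothetical, and a flat byte array stands in for the source DB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of rank i's slice: (index + i) * elSize, as stated on the slide. */
size_t indexed_slice_offset(uint64_t index, uint64_t rank, size_t el_size)
{
    return (size_t)(index + rank) * el_size;
}

/* True if an array of array_len elements is long enough for num_ranks ranks
 * starting at `index`, i.e. array_len >= index + num_ranks. */
bool indexed_dep_fits(uint64_t array_len, uint64_t index, uint64_t num_ranks)
{
    return array_len >= index + num_ranks;
}

/* Copies the single element that belongs to `rank` out of src into dst.
 * dst must have room for el_size bytes. */
void indexed_slice_copy(void *dst, const void *src,
                        uint64_t index, uint64_t rank, size_t el_size)
{
    assert(dst != NULL && src != NULL);
    const unsigned char *base = (const unsigned char *)src;
    memcpy(dst, base + indexed_slice_offset(index, rank, el_size), el_size);
}
```

With `index` 2 and four ranks, for instance, rank 3 reads element 5 of the source array, so the array needs at least six elements.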
5
SPMD Rank Messages

SPMD rank messages support point-to-point communication between ranks.
Messages can be communicated only between the same kind of SPMD EDT templates:
    Compute SPMD EDTs on one rank can only send/receive messages to/from compute SPMD EDTs on other ranks.
    Sync SPMD EDTs on one rank can only send/receive messages to/from sync SPMD EDTs on other ranks.
Message ordering at the source rank is guaranteed to be maintained at the destination rank's depv slot.

u8 ocrSend(u64 dstRank, u64 dstSlot, ocrGuid_t db);
    [in] dstRank: rank id of the message destination rank
    [in] dstSlot: slot id at the destination rank
    [in] db: Guid of the datablock communicated
    Called by the message source.
    The send is guaranteed to be complete after NEXT is called.
    Another send to the same destination rank and slot is permitted only after calling NEXT.

u8 ocrRecv(u64 srcRank, u64 dstSlot);
    [in] srcRank: rank id of the message source rank
    [in] dstSlot: slot id in the current rank where the message will be received
    Called by the message destination.
    The DB at the destination can be accessed in the slot after calling NEXT.
6
API for adding a dependence in a SPMD EDT

ocrAddSelfDependence(ocrGuid_t source, u32 slot, ocrDbAccessMode_t mode);
    [in] source: Source of the dependence edge. May be an event or a DB.
    [in] slot: Slot in the current SPMD EDT that will be satisfied by the dependence
    [in] mode: The access mode on the DB attached to the slot
    Adds a dependence on an event or DB source.
    Allows a SPMD EDT to wait for an event.
    NEXT has to be called to complete the wait on the satisfaction of the dependence.
    The data from the source is visible only after calling NEXT.
7
API for NEXT

void ocrNext();
    Exits and restarts the current SPMD EDT.
    All sends and receives called before ocrNext are guaranteed to be complete before the EDT restarts.
    After restart, the depv slots that receive messages are updated with the new DB.
    All other depv slots and params maintain their state from the previous ocrNext.
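One way to picture the visibility rule for NEXT is a staging buffer per slot: a message that arrives is held aside and only replaces the visible slot contents when the EDT restarts. The sketch below is a standalone C model of that reading, not the OCR runtime. `rank_slots_t`, `model_deliver`, `model_next` and the slot count are all hypothetical, and plain integers stand in for DB guids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SLOTS 4  /* hypothetical number of depv slots in the model */

typedef struct {
    uint64_t visible[MODEL_SLOTS];  /* what the running EDT sees in depv */
    uint64_t staged[MODEL_SLOTS];   /* messages received since the last NEXT */
    bool     pending[MODEL_SLOTS];  /* true if staged[i] holds a new message */
} rank_slots_t;

/* Models a message landing in `slot`; the running EDT does not see it yet. */
void model_deliver(rank_slots_t *r, unsigned slot, uint64_t db)
{
    assert(r != NULL && slot < MODEL_SLOTS);
    r->staged[slot] = db;
    r->pending[slot] = true;
}

/* Models NEXT: slots with a pending message take the new DB,
 * all other slots keep their previous contents. */
void model_next(rank_slots_t *r)
{
    assert(r != NULL);
    for (unsigned i = 0; i < MODEL_SLOTS; i++) {
        if (r->pending[i]) {
            r->visible[i] = r->staged[i];
            r->pending[i] = false;
        }
    }
}
```

The model keeps a single staged value per slot, which matches the rule that a second send to the same rank and slot is permitted only after NEXT.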
8
API for COMPUTE

void ocrCompute();
    Creates and launches a new SPMD EDT in the current rank from the compute template of the SPMD context.
    Can be called from either the initialization function or a sync SPMD EDT.
    All compute EDTs in a rank share the same paramv and depv state set up during the initialization function; this state can be updated during the lifetime of the rank.
9
OCR Collectors

OCR collectors are libraries that consist of various synchronization algorithms.
These algorithms are to be written as SYNC EDTs.

u8 ocrCollectorCreate( ocrGuid_t *collectorGuid );
    Creates a collector object.
    [out] collectorGuid: Guid of the collector object

u8 ocrCollectorRegister( ocrGuid_t collector, ocrCollective_t colType, ocrGuid_t syncTemplate );
    Registers a SYNC EDT template with this collector object.
    [in] collector: Guid of the collector object
    [in] colType: Type of the collective synchronization operation
    [in] syncTemplate: Guid of the sync EDT template that implements the collective operation
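Conceptually, a collector maps each collective type to the sync template that implements it. The sketch below models that mapping as a lookup table in standalone C. It is an assumption about the idea, not the OCR implementation: `collective_t`, `template_id_t`, `collector_t` and the three functions are hypothetical, and integers stand in for guids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical collective types; COL_TYPE_COUNT is the table size. */
typedef enum { COL_BARRIER, COL_SUM_REDUCTION, COL_TYPE_COUNT } collective_t;

typedef uint64_t template_id_t;          /* stands in for a template guid */
#define NO_TEMPLATE ((template_id_t)0)   /* marks an unregistered entry */

typedef struct {
    template_id_t sync_template[COL_TYPE_COUNT];
} collector_t;

/* Starts with no collective registered. */
void collector_init(collector_t *c)
{
    assert(c != NULL);
    for (int i = 0; i < COL_TYPE_COUNT; i++) c->sync_template[i] = NO_TEMPLATE;
}

/* Registers tmpl for `type`. Returns 0 on success, 1 on a bad argument. */
int collector_register(collector_t *c, collective_t type, template_id_t tmpl)
{
    assert(c != NULL);
    if ((int)type < 0 || (int)type >= COL_TYPE_COUNT || tmpl == NO_TEMPLATE) return 1;
    c->sync_template[type] = tmpl;
    return 0;
}

/* Returns the template registered for `type`, or NO_TEMPLATE if none. */
template_id_t collector_lookup(const collector_t *c, collective_t type)
{
    assert(c != NULL);
    if ((int)type < 0 || (int)type >= COL_TYPE_COUNT) return NO_TEMPLATE;
    return c->sync_template[type];
}
```

In this picture, a SYNC call with a given collective type would amount to a lookup followed by launching a sync EDT from the returned template.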
10
API for SYNC

void ocrSync(ocrCollective_t colType, ocrGuid_t db, bool reqResult, u32 paramc, u64 *paramv);
    Creates and launches a new sync SPMD EDT in the current rank.
    Called from a compute SPMD EDT (this call will exit the compute EDT).
    [in] colType: Type of collective synchronization to be performed, e.g. sum-reduction, barrier, etc.
    [in] db: Datablock passed to the synchronization operation as the rank’s input element. The DB can be accessed in slot 0 of the sync EDT.
    [in] reqResult: Boolean to indicate if the current rank needs the result of the collective
    [in] paramc: Number of params passed to the sync EDT
    [in] paramv: Params passed to the sync EDT

u8 ocrSyncResultPut(ocrGuid_t db);
    Called by the sync EDT that holds the final result of the synchronization op.
    [in] db: Result DB from the collective synchronization op

u8 ocrSyncResultGet(ocrEdtDep_t *result);
    [out] result: The DB guid and pointer of the collective operation result
    Can be called only from a compute EDT.
    The call results in an error if the previous ocrSync was called with reqResult as FALSE.
11
Other API supported inside a SPMD context

u64 ocrGetRank() – returns the current rank
u64 ocrGetRanks() – returns the total number of ranks in the current SPMD context
12
Example: Sum-Reduction

{
    …
    // Register the sum-reduction sync template with a collector.
    ocrEdtTemplateCreate( &sumRedTempl, sumRedFunc, 2, 2 );
    ocrCollectorCreate( &collectorGuid );
    ocrCollectorRegister( collectorGuid, SUM_REDUCTION, sumRedTempl );

    // Compute template: one param (the phase) and one dep (the rank's element).
    ocrEdtTemplateCreate( &computeRedTempl, computeFunc, 1, 1 );
    ocrEventCreate( &outputRed, OCR_EVENT_STICKY_T, TRUE );
    u64 phase = 0;
    ocrSpmdDepCreate( &spmdDep, elArrayDb, DB_MODE_RO, INDEXED, 0, sizeof(u64) );
    ocrSpmdLaunch( NUM_RANKS, computeRedTempl, 1, &phase, 1, &spmdDep,
                   collectorGuid, NULL_GUID, outputRed );
}

ocrGuid_t computeFunc ( u32 paramc, u64* paramv, u32 depc, ocrEdtDep_t depv[] ) {
    if (*paramv == 0) {
        // Phase 0: hand this rank's element to the collective.
        u64 sync_paramv[2];
        sync_paramv[0] = ocrGetRank();  // rank id, halved every round
        sync_paramv[1] = 1;             // stride to the partner rank
        paramv[0] = 1;  // advance the phase so the next compute EDT takes the result branch
        ocrSync(SUM_REDUCTION, depv[0].guid, (ocrGetRank() == 0 ? TRUE : FALSE), 2, sync_paramv);
    } else if (ocrGetRank() == 0) {
        // Phase 1: only rank 0 requested the result.
        ocrEdtDep_t result;
        ocrSyncResultGet( &result );
        return result.guid;
    }
    return NULL_GUID;
}

ocrGuid_t sumRedFunc ( u32 paramc, u64* paramv, u32 depc, ocrEdtDep_t depv[] ) {
    u64 myRank = ocrGetRank(), numRanks = ocrGetRanks();
    // Fold in the partial sum received in the previous round, if any,
    // before either receiving again or sending it on.
    if (paramv[1] != 1) {
        // reduce: depv[0] = depv[0] + depv[1];
    }
    if ((paramv[0] % 2) == 0) {
        u64 srcRank = myRank + paramv[1];
        if (srcRank < numRanks) {
            ocrRecv(srcRank, 1);
            paramv[0] /= 2;
            paramv[1] *= 2;
            ocrNext();  // exits and restarts this sync EDT
        }
        // No partner is left at this stride: this rank holds the final sum.
    } else {
        u64 dstRank = myRank - paramv[1];
        //ASSERT(dstRank >= 0 && dstRank < numRanks);
        ocrSend(dstRank, 1, depv[0].guid);
        ocrCompute();  // exits this sync EDT
    }
    ocrSyncResultPut( depv[0].guid );
    ocrCompute();  // return to the compute phase so rank 0 can call ocrSyncResultGet
    return NULL_GUID;
}
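The pairing schedule that sumRedFunc appears to follow is a binary tree: at stride 1 every odd rank sends to its left neighbor, at stride 2 every second survivor sends, and so on until rank 0 holds the total. The sketch below replays that schedule sequentially in standalone C so the pattern can be checked by hand. It is a model under stated assumptions, not OCR code: `tree_sum_reduce` is a hypothetical name, an array index stands in for a rank, and the number of ranks is assumed to be a power of two (with other counts a rank can run out of partners before it has sent its partial sum).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sequential replay of a binary-tree sum reduction over n "ranks".
 * val[r] is rank r's input element; on return val[0] holds the sum.
 * Assumes n is a power of two. */
uint64_t tree_sum_reduce(uint64_t *val, size_t n)
{
    assert(val != NULL && n > 0 && (n & (n - 1)) == 0);
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* Ranks stride, 3*stride, 5*stride, ... are the senders this round;
         * each sends its partial sum to the rank `stride` below it. */
        for (size_t rank = stride; rank < n; rank += 2 * stride) {
            val[rank - stride] += val[rank];
        }
    }
    return val[0];
}
```

For eight ranks this takes three rounds: ranks 1, 3, 5, 7 send first, then ranks 2 and 6, then rank 4, which is the same order of sends the stride doubling in the example produces.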