EDMA3 KeyStone SoC Devices
Agenda
- What is DMA?
- EDMA Architecture
- Definition of EDMA3 Terminology
- Synchronization
- Indexing
- Example to Summarize
- Trigger Mechanisms
- Action Mechanisms
- Linking
- Chaining
- QDMA
- EDMA3 LLD Review
What is DMA?
Why Use DMA?
The primary function of DMA is to move data without direct CPU involvement.
What information does a DMA controller need to perform a transfer?
- Source address
- Destination address
- Length (or size)
What options might be useful to perform the transfer?
- Do you want to interrupt the CPU when the transfer is complete?
- Is this transfer synchronized to an event (such as the McBSP receive buffer being full)?
- How do the source and destination addresses update (same, +1, -1, +4)?
DMA in KeyStone Devices
There are many forms of DMA (Direct Memory Access) in the KeyStone architecture, so "DMA" can mean different things. In KeyStone devices it can be EDMA, IDMA, or a peripheral DMA.
- EDMA3 (Enhanced DMA): handles M DMA channels and X QDMA channels.
  - DMA: M channels that can be triggered manually, by events, or by chaining.
  - QDMA: X Quick DMA channels, triggered by writing to a trigger word.
- IDMA: 2 channels of Internal DMA (peripheral configuration, transfers L1 ↔ L2).
- Peripheral DMAs: each master device hooked to the TeraNet has its own DMA (PktDMA), e.g. SRIO, EMAC.
[Figure: EDMA3 triggers (EVTx, chain, manual, QDMA trigger word) feed queues Q0..Qn and transfer controllers TC0..TCn, which reach the resources connected to the TeraNet.]
EDMA Architecture
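EDMA3 Architecture
[Block diagram: events E0..En set the Event Register (ER) and are gated by the Event Enable Register (EER); the Event Set Register (ESR) and Chain Event Register (CER) are the other trigger sources. The Channel Controller (CC) holds the parameter sets (PSET 0..PSET X) and the event queues (Q0..Qm) and submits transfer requests (TR) to the Transfer Controllers (TC0..TCm), which move data over the TeraNet. Completion detection (early or normal TCC) sets the Interrupt Pending Register (IPR); the Interrupt Enable Register (IER) gates the Global Interrupt and the Region Interrupts (0-n). Memory Protection is part of the CC.]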
Shadow Regions and Memory Protection
Multi-level protection:
- Regions restrict access to the channels from the peripheral masters.
- Memory Protection provides restricted access to different memory spaces within the device.
Each region has a copy of the channel configuration registers to configure the channels allocated to that region (DRAEn and DRAEHn, QRAEn). In addition to the shadow regions, there is a global region for access to the Channel Controller.
Memory protection is provided by setting the privilege level, requestor, and types of access allowed for each region (MPPAn and MPPAG).
Each shadow region is also associated with a completion interrupt that can be tied to different interrupt events.
Shadow Region
Definition of EDMA3 Terminology
Direct Memory Access (DMA)
Goal:
- Copy from memory to memory in hardware: memcpy(dst, src, len);
- Faster than CPU LD/ST. One INT per block vs. one INT per sample.
Examples:
- Import raw data from off-chip to on-chip before processing.
- Export results from on-chip to off-chip afterward.
Controlled by:
- The Transfer Configuration (i.e., Parameter Set, aka PaRAM or PSET).
- The transfer configuration primarily includes 8 control registers.
[Figure: an original data block is copied by the DMA. The Transfer Configuration supplies Source, Destination, and Length (ACNT, BCNT).]
Note: For ARM+DSP devices, resources are shared between the ARM and the DSP. The ARM has some channels, the DSP has other channels. If you are using CE, this is taken care of for you. If you are NOT using CE, it is up to you to manage resources.
How Much to Move?
A transfer is described by three counts:
- A Count (ACNT): element size, the number of contiguous bytes in one element.
- B Count (BCNT): number of elements in a frame (Elem 1, Elem 2, ... Elem N).
- C Count (CCNT): number of frames in a block (Frame 1, Frame 2, ... Frame M).
In the Transfer Configuration, the Transfer Count word holds B Count in bits 31-16 and A Count in bits 15-0. C Count shares a word with a reserved field.
Example: How to VIEW the Transfer
Let's start with a simple example. We need to transfer 12 bytes from "here" to "there."
NOTE: These are contiguous 8-bit memory locations.
What are ACNT, BCNT, and CCNT? You can view the transfer several ways:
- ACNT = 1, BCNT = 4, CCNT = 3 (1 x 4 x 3 = 12)
- ACNT = 2, BCNT = 2, CCNT = 3 (2 x 2 x 3 = 12)
- ACNT = 12, BCNT = 1, CCNT = 1 (12 x 1 x 1 = 12)
Which "view" is the best? That depends on what your system needs and on the type of sync and indexing (covered later).
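The three views describe the same 12 bytes because the total is always ACNT x BCNT x CCNT. A minimal sketch of that arithmetic follows; the function name `edma_total_bytes` is hypothetical and not part of any TI library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total bytes described by one set of counts.
 * A 64-bit result avoids overflow when all three 16-bit counts are large. */
uint64_t edma_total_bytes(uint16_t acnt, uint16_t bcnt, uint16_t ccnt)
{
    /* A zero count would not describe a real data transfer. */
    assert(acnt > 0 && bcnt > 0 && ccnt > 0);
    return (uint64_t)acnt * bcnt * ccnt;
}
```

All three views from the slide give the same total, so the choice between them is driven by sync mode and indexing rather than by size.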
Synchronization
A-Synchronization
An event (e.g., McBSP receive register full) triggers the transfer of exactly one array of ACNT bytes (2 bytes in this example).
Example: McBSP tied to a codec. You want to sync each transfer of a 16-bit word to the receive buffer being full or the transmit buffer being empty.
[Figure: Frame 1 .. Frame CCNT, each holding Array 1 .. Array BCNT. Every array is moved by its own EVTx.]
AB-Synchronization
An event triggers a two-dimensional transfer of BCNT arrays of ACNT bytes (A*B).
Example: a line of video pixels. Each line has BCNT pixels consisting of 3 bytes each (Y, Cb, Cr).
[Figure: Frame 1 .. Frame CCNT, each holding Array 1 .. Array BCNT. Every frame is moved by one EVTx.]
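The practical difference between the two modes is how many events a full block needs: one per array in A-sync, one per frame in AB-sync. A small sketch of that count, using hypothetical names (`sync_dim_t`, `edma_events_required`) that do not come from the EDMA3 LLD:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum mirroring the two sync modes described above. */
typedef enum { SYNC_A = 0, SYNC_AB = 1 } sync_dim_t;

/* Number of sync events needed to move the whole block. */
uint32_t edma_events_required(sync_dim_t sync, uint16_t bcnt, uint16_t ccnt)
{
    assert(bcnt > 0 && ccnt > 0);
    if (sync == SYNC_A) {
        /* A-sync: each event moves one array of ACNT bytes. */
        return (uint32_t)bcnt * ccnt;
    }
    /* AB-sync: each event moves one frame of BCNT arrays. */
    return ccnt;
}
```

For BCNT = 4 and CCNT = 3, A-sync needs 12 events and AB-sync needs 3.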
Indexing
Indexing: BIDX, CIDX
EDMA3 has two types of indexing, BIDX and CIDX. Each index can be set separately for SRC and DST (next slide).
- BIDX = index in bytes between ACNT arrays (same for A-sync and AB-sync).
- CIDX = index in bytes between BCNT frames (different for A-sync vs. AB-sync).
- BIDX and CIDX are signed 16-bit values, -32768 to +32767.
The CIDX distance is calculated from the starting address of the previously transferred block (the last array for A-sync, the frame for AB-sync) to the start of the next frame to be transferred.
[Figure: A-sync and AB-sync layouts showing BIDX between arrays, CIDX(A) measured from the last array of a frame, and CIDX(AB) measured from the start of a frame.]
Indexed Transfers
EDMA3 has 4 indexes, allowing higher flexibility for complex transfers:
- SRCBIDX = # bytes between arrays (Ex: SRCBIDX = 2)
- SRCCIDX = # bytes between frames (Ex: SRCCIDX for A-sync = 2, SRCCIDX for AB-sync = 4)
- DSTBIDX = # bytes between arrays (Ex: DSTBIDX = 3)
- DSTCIDX = # bytes between frames (Ex: DSTCIDX for A-sync = 5, DSTCIDX for AB-sync = 8)
Note: CIDX depends on the synchronization used, "A" or "AB".
[Figure: SRC and DST buffers of contiguous 8-bit locations with the selected elements marked, CCNT = 4.]
Example: Using Indexing
Remember this example? For each "view", fill in the proper SOURCE index values.
NOTE: These are contiguous 8-bit memory locations.
View  ACNT  BCNT  CCNT  BIDX  CIDX (A-sync)  CIDX (AB-sync)
1     1     4     3     1     1              4
2     2     2     3     2     2              4
3     12    1     1     N/A   N/A            N/A
Which "view" is the best? That depends on what you are transferring from/to and which sync mode is used.
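For contiguous data the index values in this example follow directly from the counts: BIDX equals ACNT, the A-sync CIDX equals ACNT (it is measured from the start of the last array in the frame), and the AB-sync CIDX equals ACNT x BCNT (measured from the start of the frame). The sketch below encodes only that contiguous case; `edma_idx_t` and `edma_contiguous_idx` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of index values for one side (SRC or DST). */
typedef struct {
    int16_t bidx;  /* bytes between arrays */
    int16_t cidx;  /* bytes between frames; meaning depends on sync mode */
} edma_idx_t;

/* Index values for a buffer with no gaps between elements.
 * ab_sync: 0 selects A-sync, nonzero selects AB-sync. */
edma_idx_t edma_contiguous_idx(int ab_sync, uint16_t acnt, uint16_t bcnt)
{
    uint32_t frame_bytes = (uint32_t)acnt * bcnt;
    /* The indexes are signed 16-bit fields, so the frame must fit. */
    assert(acnt > 0 && bcnt > 0 && frame_bytes <= INT16_MAX);

    edma_idx_t idx;
    idx.bidx = (int16_t)acnt;
    /* A-sync: offset from the last array of the frame to the next frame.
     * AB-sync: offset from the start of the frame to the next frame. */
    idx.cidx = ab_sync ? (int16_t)frame_bytes : (int16_t)acnt;
    return idx;
}
```

When BCNT or CCNT is 1 the corresponding index is never applied, which is why the third view lists N/A.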
Example to Summarize
Parameters for a Single Block Transfer
Goals:
- Transfer a block of 8-bit pixels from &pixel_7 to &myDest.
- Transfer all pixels as quickly as possible (a single EVTx transfers all data, AB-sync).
Source: an image of 8-bit pixels, 6 pixels per row (the second row holds pixels 6-11). Data values are in contiguous memory. The block to move is 4 pixels wide and 3 rows tall, starting at pixel 7:
   7   8   9  10
  13  14  15  16
  19  20  21  22
Destination: 12 contiguous bytes at &myDest, which receive 7 8 9 10 13 14 15 16 19 20 21 22.
Solution (Param Set):
  Options     = AB-sync
  Source      = &pixel_7
  ACNT        = 4
  BCNT        = 3
  Destination = &myDest
  SRCBIDX     = 6
  DSTBIDX     = 4
  LINK        = 0xFFFF
  BCNTRLD     = 3 (BCNT or any value)
  SRCCIDX, DSTCIDX: not applied, because CCNT = 1
  CCNT        = 1
Why can't we use ACNT = 1? The goals say that a single event transfers ALL data. If ACNT = 1, BCNT would have to be 4 and CCNT would have to be 3. With AB-sync that gives 3 AB transfers, each one waiting for a new EVT that never occurs. So ACNT has to be 4 and BCNT has to be 3: one event moves A*B bytes, which is the whole transfer.
What about A-sync with ACNT = 12? A single event would then copy 12 contiguous bytes (pixels 7-18), which is not the requested block.
Questions answered in the following sections: How does this transfer work inside the EDMA? What happens when the transfer completes? How do you program this transfer?
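The solution can be checked with a small software model of an AB-synchronized transfer: one "event" copies BCNT arrays of ACNT bytes, stepping the source by SRCBIDX and the destination by DSTBIDX. This is only an illustration of the addressing, not how the hardware is programmed; `ab_xfer_t` and `ab_sync_copy` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of a parameter set, enough for one AB-sync event. */
typedef struct {
    const uint8_t *src;
    uint8_t       *dst;
    uint16_t       acnt;     /* bytes per array */
    uint16_t       bcnt;     /* arrays per frame */
    int16_t        srcbidx;  /* bytes between source arrays */
    int16_t        dstbidx;  /* bytes between destination arrays */
} ab_xfer_t;

/* Model of what one AB-sync event moves: BCNT arrays of ACNT bytes. */
void ab_sync_copy(const ab_xfer_t *p)
{
    assert(p != 0 && p->acnt > 0 && p->bcnt > 0);
    for (uint16_t b = 0; b < p->bcnt; b++) {
        const uint8_t *s = p->src + (ptrdiff_t)b * p->srcbidx;
        uint8_t       *d = p->dst + (ptrdiff_t)b * p->dstbidx;
        memcpy(d, s, p->acnt);
    }
}
```

With ACNT = 4, BCNT = 3, SRCBIDX = 6, and DSTBIDX = 4, a source that starts at pixel 7 of a 6-wide image fills the destination with 7 8 9 10 13 14 15 16 19 20 21 22.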
Channel OPTions Register
The Options register contains bit fields that configure how the channel operates. Each field has a corresponding description in the Param Setup code comments.
- PRIV = Privilege level of the host that programmed the PSET
- PRIVID = Privilege ID of the host that programmed the PSET
- ITCCHEN = Intermediate Transfer Completion Chaining Enable
- TCCHEN = Transfer Completion Chaining Enable
- ITCINTEN = Intermediate Transfer Completion Interrupt Enable
- TCINTEN = Transfer Completion Interrupt Enable
- TCC = Transfer Completion Code, used to signal completion
- TCCMODE = Point at which the transfer is considered to be complete
- FWID = FIFO Width
- STATIC = Whether the PSET is static (not updated or linked after the transfer request is submitted)
- SYNCDIM = A-sync or AB-sync
- DAM = Destination Address Mode
- SAM = Source Address Mode
SAM/DAM are typically INCR for normal EDMA transfers. These bits are set to "1" only for an internal peripheral that supports FIFO mode. This is NOT for internal FIFOs.
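As an illustration of how these fields combine into one 32-bit word, here is a hedged sketch of an OPT builder. The macro and function names are hypothetical. The bit positions are an assumption based on the commonly documented EDMA3 PaRAM layout (TCINTEN at bit 20 and TCC in bits 17-12 agree with the interrupt slide in this deck); check them against the user's guide for your device before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed OPT bit positions; verify against the device user's guide. */
#define OPT_SAM        (1u << 0)   /* source address mode: 1 = constant */
#define OPT_DAM        (1u << 1)   /* destination address mode */
#define OPT_SYNCDIM    (1u << 2)   /* 0 = A-sync, 1 = AB-sync */
#define OPT_STATIC     (1u << 3)   /* 1 = PSET is static */
#define OPT_TCC_SHIFT  12u         /* TCC occupies bits 17-12 */
#define OPT_TCC_MASK   0x3Fu
#define OPT_TCINTEN    (1u << 20)  /* completion interrupt enable */
#define OPT_TCCHEN     (1u << 22)  /* completion chaining enable */

/* Hypothetical helper: build an OPT value for an incrementing transfer. */
uint32_t edma_make_opt(unsigned tcc, int ab_sync, int tc_int, int tc_chain)
{
    assert(tcc <= OPT_TCC_MASK);
    uint32_t opt = (uint32_t)tcc << OPT_TCC_SHIFT;
    if (ab_sync)  { opt |= OPT_SYNCDIM; }
    if (tc_int)   { opt |= OPT_TCINTEN; }
    if (tc_chain) { opt |= OPT_TCCHEN; }
    return opt;  /* SAM = DAM = 0, i.e. INCR addressing */
}
```

In practice the EDMA3 LLD sets these fields for you; a hand-built value like this is mainly useful for reading register dumps.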
Trigger Mechanisms
EDMA3 Basics Revisited
T (transfer config): describes the transfers to be executed when triggered.
- Count: how many items to move (A, B, and C counts).
- Addresses: the source and destination addresses.
- Index: how far to increment the src/dst after each transfer.
E (event): triggers the transfer to begin.
A (action): the resulting action. What do you want to happen after the transfer is complete?
[Figure: E (event) starts T (xfer config: Options, Source, Transfer Count B/A, Destination, Index, Cnt Reload/Link Addr, Index, Rsvd/C); when the transfer is Done, A (action) follows.]
How to TRIGGER a Transfer
There are 3 ways to trigger an EDMA transfer:
1. Event sync from a peripheral. Example: SPI events SPIREVT and SPIXEVT set bits in ER; if the matching EER bit is set, the channel transfer starts.
   ER = Event Register (flag)
   EER = Event Enable Register (user)
2. Manually trigger the channel to run. The application sets the bit for channel y in ESR, which starts the channel transfer.
   ESR = Event Set Register (user)
3. Chain event from another channel (more details later). Channel x has chaining enabled (TCCHEN_EN) and TCC = Ch y; on completion it sets the bit for channel y in CER, which starts the channel transfer.
   TCCHEN = TC Chain Enable (OPT)
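The three trigger sources can be summarized as a bitwise expression: a channel starts if its event is both latched and enabled, or if it was set manually, or if it was chained to. The sketch below models only the lower 32 channels with plain variables instead of memory-mapped registers; `edma_trig_t` and `edma_channels_to_start` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-ins for the trigger registers (channels 0-31 only). */
typedef struct {
    uint32_t er;   /* Event Register: events latched from peripherals */
    uint32_t eer;  /* Event Enable Register: which events may start a channel */
    uint32_t esr;  /* Event Set Register: manual triggers from the CPU */
    uint32_t cer;  /* Chain Event Register: triggers from completed channels */
} edma_trig_t;

/* Bit mask of channels that would start a transfer. */
uint32_t edma_channels_to_start(const edma_trig_t *r)
{
    assert(r != 0);
    /* A peripheral event only counts when its enable bit is set. */
    return (r->er & r->eer) | r->esr | r->cer;
}
```

Note that a latched event with its EER bit clear stays pending in ER but does not start the channel.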
Action Mechanisms
Generate EDMA Interrupt (Setting IER bit)
EDMA interrupt generation per channel:
Channel  OPT.TCINTEN  OPT.TCC  IER bit
0        0            0        IER0 = 0
1        0            1        IER1 = 0
...      1            14       IER14 = 1
N        0            N        IERN = 0
When a channel completes, its TCC sets the matching IPR bit. If the matching IER bit is also set, EDMA3CC_INT is generated.
- Options fields: TCINTEN is bit 20, TCC is bits 17-12.
- IER = EDMA Interrupt Enable Register (NOT the CPU IER)
- IPR = EDMA Interrupt Pending Register (set by TCC)
Use the EDMA3 Low-Level Driver (LLD) to program the EDMA IER bits.
N channels and ONE interrupt? How do you determine WHICH channel completed?
EDMA Interrupt Dispatcher
Here is the interrupt chain from beginning to end:
1. An interrupt occurs.
2. Interrupt Selector
3. HWI_INT5 properties: HWI_INT5 is mapped to EDMA3CC_GINT.
4. EDMA Dispatcher Function:
   - Read the IPR bits.
   - Determine which one is set.
   - Call the corresponding handler (ISR) in the Fxn Table.
5. ISR (interrupt handler):
   void edma_rcv_isr (void) { SEM_post (&semaphore); }
How does the ISR Fxn Table (in #4 above) get loaded with the proper handler Fxn names? Use the EDMA3 LLD to program the proper callback fxn for this HWI.
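Step 4 of the chain, the dispatcher, can be sketched as a loop over the pending bits with a table of function pointers. This is a simplified software model, not the LLD implementation: the IPR is an ordinary variable, clearing a bit stands in for the hardware clear mechanism, only 32 completion codes are covered, and the handler counts calls instead of posting a semaphore. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EDMA_NUM_TCC 32u

typedef void (*edma_isr_t)(void);

/* Example handler: counts completions instead of posting a semaphore. */
int edma_rcv_count = 0;
void edma_rcv_isr(void)
{
    edma_rcv_count++;
}

/* Scan the pending bits, clear each one, and call its handler if present.
 * Returns the number of handlers that were called. */
unsigned edma_dispatch(uint32_t *ipr, const edma_isr_t table[EDMA_NUM_TCC])
{
    assert(ipr != 0 && table != 0);
    unsigned handled = 0;
    uint32_t pending = *ipr;
    for (unsigned tcc = 0; tcc < EDMA_NUM_TCC; tcc++) {
        uint32_t bit = (uint32_t)1 << tcc;
        if (pending & bit) {
            *ipr &= ~bit;              /* acknowledge this completion code */
            if (table[tcc] != 0) {
                table[tcc]();
                handled++;
            }
        }
    }
    return handled;
}
```

The table index is the TCC, not the channel number, which is why the completion code chosen in OPT determines which handler runs.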
Linking
Linking: "Action" Overview
Alias: "Re-load", "Auto-init"
Need: auto-reload a channel with a new config.
- Ex1: do the same transfer again
- Ex2: ping/pong system
Solution: use linking to reload the channel config.
Concept:
- Linking two or more parameter sets (PSETs) together allows the EDMA to auto-reload a new configuration when the current transfer is complete.
- Linking still requires a "trigger" to start the transfer (manual, chain, event).
- You can link as many PSETs as you like; the only limit is the number of PSETs on the device.
How does linking work?
- The user must specify the LINK field in the config to link to another PSET.
- When the current xfr (0) is complete, the EDMA auto-reloads the new config (1) from the linked PSET.
[Figure: Config 0 (LINK = 1) is reloaded from Config 1 (LINK = NULL).]
NOTE: Linking does NOT start the transfer!
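A ping/pong system is the classic use of linking: the active set links to a "pong" reload set, which links back to a "ping" reload set. The sketch below models the reload step in software. It is a simplification: `pset_t` carries only a few fields, and LINK is treated as an array index rather than the byte offset used by the hardware; 0xFFFF is the null link as on the slides. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_NULL 0xFFFFu

/* Simplified parameter set; real PSETs have eight 32-bit words. */
typedef struct {
    uint32_t src;
    uint32_t dst;
    uint16_t acnt, bcnt, ccnt;
    uint16_t link;  /* index of the PSET to reload from, or LINK_NULL */
} pset_t;

/* Model of the reload that happens when the active PSET is exhausted.
 * It only replaces the configuration; it does not start a transfer. */
void edma_on_exhausted(pset_t *param, uint16_t active)
{
    assert(param != 0);
    uint16_t next = param[active].link;
    if (next == LINK_NULL) {
        /* Null link: the active set becomes an empty (null) set. */
        pset_t null_set = {0};
        null_set.link = LINK_NULL;
        param[active] = null_set;
    } else {
        param[active] = param[next];
    }
}
```

After each completion the active set alternates between the two destinations, and a new trigger is still required before the next transfer runs.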
Chaining
Triggering Transfers Revisited
There are 3 ways to trigger an EDMA transfer:
1. Event sync from a peripheral. Example: McASP0 events RRDY and XRDY set bits in ER; if the matching EER bit is set, the channel transfer starts.
   ER = Event Register (flag)
   EER = Event Enable Register (user)
2. Manually trigger the channel to run. The application sets the bit for channel y in ESR, which starts the channel transfer.
   ESR = Event Set Register (user)
3. Chain event from another channel. Channel x has chaining enabled (TCCHEN_EN) and TCC = Ch y; on completion it sets the bit for channel y in CER, which starts the channel transfer.
   TCCHEN = TC Chain Enable (OPT)
Chaining: "Action" and "Event" Overview
Need: when one transfer completes, trigger another transfer to run.
- Ex: Ch X completes, kicks off Ch Y
Solution: use chaining to kick off the next xfr.
Concept:
- Chaining refers to both an action and an event: the completed "action" from the first channel is the "event" for the next channel.
- You can chain as many channels as you like; the only limit is the number of channels on the device.
- Chaining does NOT reload the current channel config. That can only be accomplished by linking. Chaining simply triggers another channel to run.
How does chaining work?
- Set the TCC field to match the next (i.e. chained) channel number.
- Turn ON chaining.
- When the current xfr (X) is complete, it triggers the next channel (Y) to run.
[Figure: Ch X has TCC = Y and Chain EN enabled; when Ch X is Done, Ch Y runs. Ch Y has chaining disabled.]
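The chaining step can be modeled as: on completion, if TCCHEN is set in the completing channel's OPT, set the chain-event bit whose number equals its TCC. The sketch uses a single 64-bit value to stand in for the chain event register pair. The names are hypothetical, and the TCCHEN bit position (22) is an assumption to verify against the device user's guide; TCC in bits 17-12 matches the interrupt slide.

```c
#include <assert.h>
#include <stdint.h>

#define OPT_TCC_SHIFT 12u         /* TCC in bits 17-12 */
#define OPT_TCC_MASK  0x3Fu
#define OPT_TCCHEN    (1u << 22)  /* assumed position of the chain enable bit */

/* Returns the updated chain-event bits after a channel with this OPT
 * value completes. Without TCCHEN the bits are returned unchanged. */
uint64_t edma_chain_on_complete(uint32_t opt, uint64_t cer)
{
    if (opt & OPT_TCCHEN) {
        unsigned tcc = (opt >> OPT_TCC_SHIFT) & OPT_TCC_MASK;
        assert(tcc < 64u);            /* a 6-bit field always fits */
        cer |= (uint64_t)1 << tcc;    /* this is the "event" for channel tcc */
    }
    return cer;
}
```

The completed channel's own parameter set is untouched by this step, which is the difference from linking.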
QDMA
Quick DMA (QDMA)
QDMA is used for simple transfers where syncing to an event is not required.
- CCNT = 1 (single event transfer).
- Assumes OPT.STATIC = 1, so count and address updates and linking are NOT performed.
- A transfer can be triggered by two methods: (1) writing to a trigger word, (2) using the EDMA3 LLD.
- It is "quick" because the CPU can initiate a transfer with as few as ONE write to a channel register.
- Only ONE QDMA transfer is allowed in one queue at a time.
How does it work?
- A QDMA channel is "auto-triggered" when the CPU writes to the "trigger" word.
- This eliminates the need to write the PSET and then kick off the transfer with a separate write to ESR.
- Selection of the trigger word allows the CPU to modify only the words of interest in a PSET.
Example: If ACNT/BCNT/CCNT are typically static for a given algorithm but SRC is different for each transfer, SRC could be defined as the trigger word. The CPU can then initiate a transfer with a single write to the SRC address of the specified PSET.
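The trigger-word mechanism can be sketched as a write helper that submits a transfer only when the written word is the configured trigger word. This is a software model with hypothetical names; the word order in the enum follows the PSET layout shown in the earlier example (Options, Source, counts, Destination, and so on), and the `submits` counter stands in for the hardware submitting a transfer request.

```c
#include <assert.h>
#include <stdint.h>

/* Word positions within one parameter set. */
enum {
    PSET_OPT = 0, PSET_SRC, PSET_A_B_CNT, PSET_DST,
    PSET_BIDX, PSET_LINK_BCNTRLD, PSET_CIDX, PSET_CCNT,
    PSET_WORDS
};

/* Model of one QDMA channel and the PSET it is mapped to. */
typedef struct {
    uint32_t word[PSET_WORDS];
    unsigned trword;   /* which word acts as the trigger word */
    unsigned submits;  /* how many transfers have been submitted */
} qdma_ch_t;

/* Write one PSET word; writing the trigger word starts a transfer. */
void qdma_write(qdma_ch_t *ch, unsigned idx, uint32_t value)
{
    assert(ch != 0 && idx < PSET_WORDS);
    ch->word[idx] = value;
    if (idx == ch->trword) {
        ch->submits++;
    }
}
```

With SRC chosen as the trigger word, the counts and destination are written once and each later write of a new source address starts another transfer.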
QDMA Mapping
EDMA3 LLD Review
Programming EDMA3
The EDMA3 Low-Level Driver (LLD) is the optimal way to program EDMA3.
- Implements synchronized DMA transfers.
- Consists of libraries to manage the EDMA3 peripheral:
  - Resource Manager (EDMA3 RM) manages all EDMA3 hardware resources and interrupts.
  - Driver (EDMA3 DRV) handles all EDMA3 configuration and allocates resources (via RM).
[Figure: software stack, top to bottom: Application Code (Drivers), LLD (DRV), Resource Mgr (RM), EDMA3 Hardware.]
Programming EDMA3
EDMA3_DRV_create(edma3InstanceId, globalConfig, &miscParam);
hEdma = EDMA3_DRV_open(edma3InstanceId, (void *) &initCfg, &edma3Result);
EDMA3_DRV_requestChannel(hEdma, nChannel, nTransferControl, ..);
EDMA3_DRV_setSrcParams(hEdma, nChannel, srcAddr, addrMode, width);
EDMA3_DRV_setDestParams(hEdma, nChannel, dstAddr, addrMode, width);
EDMA3_DRV_setTransferParams(hEdma, nChannel, acnt, bcnt, ccnt, bcntrld, syncType);
EDMA3_DRV_enableTransfer(hEdma, nChannel, trgMode);
Program Flow
1. Identify all the channels that are going to be used by the application.
2. Develop corresponding service routines for these events.
3. Register all these ISRs with the underlying OS.
4. Initialize the Resource Manager to get all the available resources.
5. Create and open the EDMA3 instance.
6. Set the params for the transfers.
7. Enable the transfer.
For More Information
- Refer to the Enhanced Direct Memory Access 3 (EDMA3) for KeyStone Devices User's Guide.
- Device-specific Data Manuals for the KeyStone SoCs can be found at TI.com/multicore.
- Multicore articles, tools, and software are available at the Embedded Processors Wiki for the KeyStone Device Architecture.
- View the complete C66x Multicore SOC Online Training for KeyStone Devices, including details on the individual modules.
- For questions regarding topics covered in this training, visit the support forums at the TI E2E Community website.