Presentation is loading. Please wait.

Presentation is loading. Please wait.

TMS320C6000 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Architectural Overview.

Similar presentations


Presentation on theme: "TMS320C6000 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Architectural Overview."— Presentation transcript:

1 TMS320C6000 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Architectural Overview

2 2 C6000 CPU architecture Describe C6000 CPU architecture basic instructions Introduce some basic instructions C6000 memory map Describe the C6000 memory map overview of the peripherals Provide an overview of the peripherals Learning Objectives

3 3 General DSP System Block Diagram

4 4 It has been shown in Introduction that SOP is the key element for most DSP algorithms. So let’s write the code for this algorithm and at the same time discover the C6000 architecture. Two basic operations are required for this algorithm. (1) Multiplication (2) Addition Therefore two basic instructions are required Y = N  a n x n n = 1 * = a 1 * x 1 + a 2 * x 2 +... + a N * x N Implementation of Sum of Products (SOP) Implementation of Sum of Products (SOP) The implementation in this module will be done in assembly

5 5 Multiply (MPY) The multiplication of a 1 by x 1 is done in assembly by the following instruction: MPY a1, x1, Y MPY a1, x1, Y This instruction is performed by a multiplier unit that is called “.M” Y = N  a n x n n = 1 * = a 1 * x 1 + a 2 * x 2 +... + a N * x N

6 6 Multiply (.M unit).M.M Y = 40  a n x n n = 1 * The.M unit performs multiplications in hardware MPY.Ma1, x1, Y MPY.Ma1, x1, Y Note: 16-bit by 16-bit multiplier provides a 32-bit result. 32-bit by 32-bit multiplier provides a 64-bit result. 32-bit by 32-bit multiplier provides a 64-bit result.

7 7 Addition (.L unit).M.M.L.L Y = 40  a n x n n = 1 * MPY.Ma1, x1, prod ADD.LY, prod, Y RISC processors such as the C6000 use registers to hold the operands, so lets change this code.

8 8 Register File A Y = 40  a n x n n = 1 * MPY.Ma1, x1, prod ADD.LY, prod, Y MPY.MA0, A1, A3 ADD.LA4, A3, A4 MPY.MA0, A1, A3 ADD.LA4, A3, A4.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y............ Let us correct this by replacing a, x, prod and Y by the registers as shown above.

9 9 Specifying Register Names Register File A contains 16 registers (A0 -A15) which are 32-bits wide..M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y............

10 10 Data loading (load unit.D) Q: How do we load the operands into the registers?.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y............ A: The operands are loaded into the registers by loading them from the memory using the.D unit..D.D Data Memory It is worth noting at this stage that the only way to access memory is through the.D unit.

11 11 Load Instruction Q: Which instruction(s) can be used for loading operands from the memory to the registers?.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y.............D.D Data Memory

12 12 Load Instructions A: The load instructions. Q: Which instruction(s) can be used for loading operands from the memory to the registers?.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y.............D.D Data Memory LDB, LDH LDW,LDDW

13 13 Using the Load Instructions Before using the load unit you have to be aware that this processor is, which means that each byte is represented by a unique address. Before using the load unit you have to be aware that this processor is byte addressable, which means that each byte is represented by a unique address. Also the addresses are 32-bit wide. 00000000 00000002 00000004 00000006 00000008Data16-bitsaddressFFFFFFFF Byte

14 14 The syntax for the load instruction is: Where: Rn is a register that contains the address of the operand to be loaded and Rm is the destination register. Using the Load Instructions LD *Rn,Rm 00000000 00000002 00000004 00000006 00000008Data16-bitsaddressFFFFFFFF a1 x1 prod Y

15 15 The syntax for the load instruction is: The question now is how many bytes are going to be loaded into the destination register? Using the Load Instructions LD *Rn,Rm 00000000 00000002 00000004 00000006 00000008Data16-bitsaddressFFFFFFFF a1 x1 prod Y

16 16 The syntax for the load instruction is: LD *Rn,Rm Using the Load Instructions The answer, is that it depends on the instruction you choose: LD B : loads one byte (8-bit) LD B : loads one byte (8-bit) LD H : loads half word (16-bit) LD H : loads half word (16-bit) LD W : loads a word (32-bit) LD W : loads a word (32-bit) LD DW : loads a double word (64-bit) LD DW : loads a double word (64-bit) Note: LD on its own does not exist. 00000000 00000002 00000004 00000006 00000008Data16-bitsaddressFFFFFFFF a1 x1 prod Y

17 17 Using the Load Instructions Example: If we assume that A5 = 0x4 then: (1)LDB *A5, A7 ; gives A7 = 0x00000001 gives A7 = 0x00000001 (2)LDH *A5,A7; gives A7 = 0x00000201 gives A7 = 0x00000201 (3)LDW *A5,A7; gives A7 = 0x04030201 gives A7 = 0x04030201 (3)LDDW *A5,A7:A6; gives A7:A6 = 0x0807060504030201 gives A7:A6 = 0x0807060504030201 00000000 00000002 00000004 00000006 00000008Data16-bits address FFFFFFFF 0xB0xA 0xD0xC 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 01 The syntax for the load instruction is: LD *Rn,Rm Little Endian la parte meno significativa va memorizzata per prima, all'indirizzo più basso di memoria

18 18 Using the Load Instructions Q: If data can only be accessed by the load instruction and the.D unit, how can we load the register pointer Rn in the first place? 00000000 00000002 00000004 00000006 00000008Data16-bits address FFFFFFFF 0xB0xA 0xD0xC 0x10x2 0x30x4 0x50x6 0x70x8 01 The syntax for the load instruction is: LD *Rn,Rm

19 19 The instruction MVKL will allow a move of a 16-bit constant into a register as shown below: The instruction MVKL will allow a move of a 16-bit constant into a register as shown below: MVKL.? a, A5 (‘a’ is a constant or label) How many bits represent a full address? How many bits represent a full address? 32 bits So why does the instruction not allow a 32-bit move? So why does the instruction not allow a 32-bit move? All instructions are 32-bit wide (see instruction opcode). Loading the Pointer *Rn

20 20 To solve this problem another instruction is available: MVKH To solve this problem another instruction is available: MVKH Loading the Pointer *Rn eg. MVKH.? a, A5 (‘a’ is a constant or label) ahahahah ahahahahx alalalal a A5 Finally, to move the 32-bit address to a register we can use: Finally, to move the 32-bit address to a register we can use: MVKL a, A5 MVKH a, A5

21 21 Always use MVKL then MVKH, look at the following examples: Always use MVKL then MVKH, look at the following examples: Loading the Pointer *Rn MVKL0x1234FABC, A5 A5 = 0xFFFFFABC Wrong 87654321 Example 1 (A5 = 0x87654321) MVKL0x1234FABC, A5 A5 = 0xFFFFFABC (sign extension) MVKH0x1234FABC, A5 A5 = 0x1234FABC OK 87654321 Example 2 (A5 = 0x87654321) MVKH0x1234FABC, A5 A5 = 0x12344321

22 22 LDH, MVKL and MVKH MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 LDH.D*A5, A0 LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 pt1 and pt2 point to some locations in the data memory..M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y.............D.D Data Memory Le componenti aixi ai e xi sono numeri 16 bits interi da 16 bits

23 23 Creating a loop MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 LDH.D*A5, A0 LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 So far we have only implemented the SOP for one tap only, i.e. Y= a 1 * x 1 So let’s create a loop so that we can implement the SOP for N Taps.

24 24 Creating a loop – B instruction With the C6000 processors there are no dedicated instructions such as block repeat. The loop is created using the B instruction. So far we have only implemented the SOP for one tap only, i.e. Y= a 1 * x 1 So let’s create a loop so that we can implement the SOP for N Taps.

25 25 What are the steps for creating a loop ?  Create a label to branch to.  Add a branch instruction, B.  Create a loop counter (LC).  Add an instruction to decrement the LC.  Make the branch conditional based on LC.

26 26 1. Create a label to branch to MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4

27 27 MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.?loop 2. Add a branch instruction, B.

28 28 Which unit is used by the B instruction? Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.Sloop

29 29 3. Create a loop counter. MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 MVKL.Scount, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.Sloop B registers will be introduced later Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S

30 30 4. Decrement the loop counter MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 MVKL.Scount, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 B.Sloop Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S

31 31 What is the syntax for making instruction conditional? What is the syntax for making instruction conditional? [condition] InstructionLabel e.g.[B1]Bloop (1) The condition can be one of the following registers: A1, A2, B0, B1, B2. (2) Any instruction can be conditional. 5. Make the branch conditional based on the value in the loop counter

32 32 The condition can be inverted by adding the exclamation symbol “!” as follows: The condition can be inverted by adding the exclamation symbol “!” as follows: [!condition] InstructionLabel e.g. [!B0]Bloop; branch if B0 = 0 [B0]Bloop; branch if B0 != 0 5. Make the branch conditional based on the value in the loop counter

33 33 MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S 5. Make the branch conditional

34 34  With this processor all the instructions are encoded in a 32-bit.  Therefore the label must have a dynamic range of less than 32-bit as the instruction B has to be coded.  Case 1: B.S1 label  Relative branch.  Label limited to +/- 2 20 offset. More on the Branch Instruction (1) 21-bit relative address B32-bit

35 35 More on the Branch Instruction (2)  By specifying a register as an operand instead of a label, it is possible to have an absolute branch.  This will allow a dynamic range of 2 32.  Case 2: B.S2register  Absolute branch.  Operates on.S2 ONLY! 5-bit register code B32-bit

36 36 Testing the code This code performs the following operations: a 0 *x 0 + a 0 *x 0 + a 0 *x 0 + … + a 0 *x 0 However, we would like to perform: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 + a 0 *x 0 + a 1 *x 1 + a 2 *x 2 + … + a N *x N … + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

37 37 Modifying the pointers The solution is to modify the pointers A5 and A6 MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

38 38 Indexing Pointers Description Pointer + Pre-offset - Pre-offset Pre-incrementPre-decrementPost-incrementPost-decrement Syntax Pointer Modified *R *+R[ disp ] *-R[ disp ] *++R[ disp ] *--R[ disp ] *R++[ disp ] *R--[ disp ] NoNoNoYesYesYesYes  [disp] specifies # elements - size in DW, W, H, or B.  disp = R or 5-bit constant.  R can be any register.

39 39 Modify and testing the code This code now performs the following operations: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 +... + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5 ++, A0 LDH.D*A6 ++, A1 LDH.D*A6 ++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

40 40 Store the final result This code now performs the following operations: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 +... + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

41 41 Store the final result The Pointer A7 is now initialised. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2pt3, A7 MVKH.S2pt3, A7 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

42 42 What is the initial value of A4? A4 is used as an accumulator, so it needs to be so it needs to be reset to zero. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2pt3, A7 MVKH.S2pt3, A7 MVKL.S2count, B0 ZERO.LA4 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7


Download ppt "TMS320C6000 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Architectural Overview."

Similar presentations


Ads by Google