Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 2 TMS320C6000 Architectural Overview. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 2  Describe C6000 CPU.

Similar presentations


Presentation on theme: "Chapter 2 TMS320C6000 Architectural Overview. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 2  Describe C6000 CPU."— Presentation transcript:

1 Chapter 2 TMS320C6000 Architectural Overview

2 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 2  Describe C6000 CPU architecture.  Introduce some basic instructions.  Describe the C6000 memory map.  Provide an overview of the peripherals. Learning Objectives

3 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 3 General DSP System Block Diagram PERIPHERALSPERIPHERALSPERIPHERALSPERIPHERALS CentralProcessing Unit Unit Internal Memory Internal Buses ExternalMemory

4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 4 Implementation of Sum of Products (SOP) Implementation of Sum of Products (SOP) It has been shown in Chapter 1 that SOP is the key element for most DSP algorithms. So let’s write the code for this algorithm and at the same time discover the C6000 architecture. Two basic operations are required for this algorithm. (1) Multiplication (2) Addition Therefore two basic instructions are required Y = N  a n x n n = 1 * = a 1 * x 1 + a 2 * x 2 +... + a N * x N

5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 5 Two basic operations are required for this algorithm. (1) Multiplication (2) Addition Therefore two basic instructions are required Implementation of Sum of Products (SOP) Implementation of Sum of Products (SOP) Y = N  a n x n n = 1 * So let’s implement the SOP algorithm! The implementation in this module will be done in assembly. = a 1 * x 1 + a 2 * x 2 +... + a N * x N

6 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 6 Multiply (MPY) The multiplication of a 1 by x 1 is done in assembly by the following instruction: MPYa1, x1, Y This instruction is performed by a multiplier unit that is called “.M” Y = N  a n x n n = 1 * = a 1 * x 1 + a 2 * x 2 +... + a N * x N

7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 7 Multiply (.M unit).M.M Y = 40  a n x n n = 1 * The. M unit performs multiplications in hardware MPY.Ma1, x1, Y Note: 16-bit by 16-bit multiplier provides a 32-bit result. 32-bit by 32-bit multiplier provides a 64-bit result.

8 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 8 Addition (.?).M.M.?.? Y = 40  a n x n n = 1 * MPY.Ma1, x1, prod ADD.?Y, prod, Y

9 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 9 Add (.L unit).M.M.L.L Y = 40  a n x n n = 1 * MPY.Ma1, x1, prod ADD.LY, prod, Y RISC processors such as the C6000 use registers to hold the operands, so lets change this code.

10 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 10 Register File - A Y = 40  a n x n n = 1 * MPY.Ma1, x1, prod ADD.LY, prod, Y.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y Let us correct this by replacing a, x, prod and Y by the registers as shown above.

11 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 11 Specifying Register Names Y = 40  a n x n n = 1 * MPY.MA0, A1, A3 ADD.LA4, A3, A4 The registers A0, A1, A3 and A4 contain the values to be used by the instructions..M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y

12 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 12 Specifying Register Names Y = 40  a n x n n = 1 * MPY.MA0, A1, A3 ADD.LA4, A3, A4 Register File A contains 16 registers (A0 -A15) which are 32-bits wide..M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y

13 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 13 Data loading Q: How do we load the operands into the registers?.M.M.L.L A0A1A2A3A4A15 Register File A............ a1 x1 prod 32-bits Y

14 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 14 Load Unit “.D” A: The operands are loaded into the registers by loading them from the memory using the.D unit..M.M.L.L A0A1A2A3A15 Register File A............ a1 x1 prod 32-bits Y.D.D Data Memory Q: How do we load the operands into the registers?

15 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 15 Load Unit “.D” It is worth noting at this stage that the only way to access memory is through the.D unit..M.M.L.L A0A1A2A3A15 Register File A............ a1 x1 prod 32-bits Y.D.D Data Memory

16 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 16 Load Instruction Q: Which instruction(s) can be used for loading operands from the memory to the registers?.M.M.L.L A0A1A2A3A15 Register File A............ a1 x1 prod 32-bits Y.D.D Data Memory

17 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 17 Load Instructions (LDB, LDH,LDW,LDDW) A: The load instructions..M.M.L.L A0A1A2A3A15 Register File A............ a1 x1 prod 32-bits Y.D.D Data Memory Q: Which instruction(s) can be used for loading operands from the memory to the registers?

18 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 18 Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data 16-bits Before using the load unit you have to be aware that this processor is byte addressable, which means that each byte is represented by a unique address. Also the addresses are 32-bit wide. address FFFFFFFF

19 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 19 The syntax for the load instruction is: Where: Rn is a register that contains the address of the operand to be loaded and Rm is the destination register. Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data a1 x1 prod 16-bits Y address FFFFFFFF LD *Rn,Rm

20 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 20 The syntax for the load instruction is: The question now is how many bytes are going to be loaded into the destination register? Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data a1 x1 prod 16-bits Y address FFFFFFFF LD *Rn,Rm

21 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 21 The syntax for the load instruction is: LD *Rn,Rm Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data a1 x1 prod 16-bits Y address FFFFFFFF The answer, is that it depends on the instruction you choose: LDB: loads one byte (8-bit) LDB: loads one byte (8-bit) LDH: loads half word (16-bit) LDH: loads half word (16-bit) LDW: loads a word (32-bit) LDW: loads a word (32-bit) LDDW: loads a double word (64-bit) LDDW: loads a double word (64-bit) Note: LD on its own does not exist.

22 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 22 Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data 16-bits address FFFFFFFF 0xB0xA 0xD0xC Example: If we assume that A5 = 0x4 then: (1) LDB *A5, A7 ; gives A7 = 0x00000001 (2) LDH *A5,A7; gives A7 = 0x00000201 (3) LDW *A5,A7; gives A7 = 0x04030201 (4) LDDW *A5,A7:A6; gives A7:A6 = 0x0807060504030201 0x10x2 0x30x4 0x50x6 0x70x8 The syntax for the load instruction is: LD *Rn,Rm 01

23 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 23 Using the Load Instructions 00000000 00000002 00000004 00000006 00000008 Data 16-bits address FFFFFFFF 0xB0xA 0xD0xC Question: If data can only be accessed by the load instruction and the.D unit, how can we load the register pointer Rn in the first place? 0x10x2 0x30x4 0x50x6 0x70x8 The syntax for the load instruction is: LD *Rn,Rm

24 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 24  The instruction MVKL will allow a move of a 16-bit constant into a register as shown below: MVKL.?a, A5 (‘a’ is a constant or label)  How many bits represent a full address? 32 bits  So why does the instruction not allow a 32-bit move? All instructions are 32-bit wide (see instruction opcode). Loading the Pointer Rn

25 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 25  To solve this problem another instruction is available: MVKH Loading the Pointer Rn eg. MVKH.?a, A5 (‘a’ is a constant or label) ah ahx al a A5 MVKL a, A5 MVKH a, A5  Finally, to move the 32-bit address to a register we can use:

26 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 26 Loading the Pointer Rn MVKL0x1234FABC, A5 A5 = 0xFFFFFABC ; Wrong Example 1 A5 = 0x87654321 MVKL0x1234FABC, A5 A5 = 0xFFFFFABC (sign extension) MVKH0x1234FABC, A5 A5 = 0x1234FABC ; OK Example 2 MVKH0x1234FABC, A5 A5 = 0x12344321  Always use MVKL then MVKH, look at the following examples:

27 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 27 LDH, MVKL and MVKH.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D Data Memory MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 LDH.D*A5, A0 LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4

28 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 28 Creating a loop MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 LDH.D*A5, A0 LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 So far we have only implemented the SOP for one tap only, i.e. Y= a 1 * x 1 So let’s create a loop so that we can implement the SOP for N Taps.

29 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 29 Creating a loop With the C6000 processors there are no dedicated instructions such as block repeat. The loop is created using the B instruction. So far we have only implemented the SOP for one tap only, i.e. Y= a 1 * x 1 So let’s create a loop so that we can implement the SOP for N Taps.

30 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 30 What are the steps for creating a loop 1. Create a label to branch to. 2. Add a branch instruction, B. 3.Create a loop counter. 4. Add an instruction to decrement the loop counter. 5. Make the branch conditional based on the value in the loop counter.

31 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 31 1. Create a label to branch to MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4

32 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 32 MVKL pt1, A5 MVKL pt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.?loop 2. Add a branch instruction, B.

33 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 33 Which unit is used by the B instruction?.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D Data Memory.S.S MVKLpt1, A5 MVKLpt1, A5 MVKH pt1, A5 MVKH pt1, A5 MVKL pt2, A6 MVKL pt2, A6 MVKH pt2, A6 MVKH pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.?loop

34 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 34 Data Memory Which unit is used by the B instruction?.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.Sloop

35 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 35 Data Memory 3. Create a loop counter..M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 MVKL.Scount, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 B.Sloop B registers will be introduced later

36 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 36 4. Decrement the loop counter.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.S.S MVKL.S pt1, A5 MVKL.S pt1, A5 MVKH.S pt1, A5 MVKH.S pt1, A5 MVKL.S pt2, A6 MVKL.S pt2, A6 MVKH.S pt2, A6 MVKH.S pt2, A6 MVKL.Scount, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 B.Sloop

37 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 37  What is the syntax for making instruction conditional? [condition] InstructionLabel e.g. [B1]Bloop (1) The condition can be one of the following registers: A1, A2, B0, B1, B2. (2) Any instruction can be conditional. 5. Make the branch conditional based on the value in the loop counter

38 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 38  The condition can be inverted by adding the exclamation symbol “!” as follows: [!condition] InstructionLabel e.g. [!B0]Bloop ;branch if B0 = 0 [B0]Bloop;branch if B0 != 0 5. Make the branch conditional based on the value in the loop counter

39 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 39 Data Memory.M.M.L.L A0A1A2A3A15 Register File A............ a x prod 32-bits Y.D.D.M.M.L.L A0A1A2A3A15............ a x prod 32-bits Y.D.D.S.S MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop 5. Make the branch conditional

40 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 40  Case 1: B.S1 label  Relative branch.  Label limited to +/- 2 20 offset. More on the Branch Instruction (1)  With this processor all the instructions are encoded in a 32-bit.  Therefore the label must have a dynamic range of less than 32-bit as the instruction B has to be coded. 21-bit relative address B32-bit

41 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 41 More on the Branch Instruction (2)  By specifying a register as an operand instead of a label, it is possible to have an absolute branch.  This will allow a dynamic range of 2 32.  Case 2: B.S2register  Absolute branch.  Operates on.S2 ONLY! 5-bit register code B32-bit

42 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 42 Testing the code This code performs the following operations: operations: a 0 *x 0 + a 0 *x 0 + a 0 *x 0 + … + a 0 *x 0 However, we would like to perform: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 + … + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

43 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 43 Modifying the pointers The solution is to modify the pointers A5 and A6. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5, A0 LDH.D*A6, A1 LDH.D*A6, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

44 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 44 Indexing Pointers Description Pointer Syntax Pointer Modified *R*R*R*RNo R can be any register In this case the pointers are used but not modified.

45 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 45 Indexing Pointers Description Pointer + Pre-offset - Pre-offset Syntax Pointer Modified *R *+R[ disp ] *-R[ disp ] NoNoNo  [disp] specifies the number of elements size in DW (64-bit), W (32-bit), H (16-bit), or B (8-bit).  disp = R or 5-bit constant.  R can be any register. In this case the pointers are modified BEFORE being used and RESTORED to their previous values.

46 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 46 Indexing Pointers Description Pointer + Pre-offset - Pre-offset Pre-incrementPre-decrement Syntax Pointer Modified *R *+R[ disp ] *-R[ disp ] *++R[ disp ] *--R[ disp ] NoNoNoYesYes In this case the pointers are modified BEFORE being used and NOT RESTORED to their Previous Values.

47 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 47 Indexing Pointers Description Pointer + Pre-offset - Pre-offset Pre-incrementPre-decrementPost-incrementPost-decrement Syntax Pointer Modified *R *+R[ disp ] *-R[ disp ] *++R[ disp ] *--R[ disp ] *R++[ disp ] *R--[ disp ] NoNoNoYesYesYesYes In this case the pointers are modified AFTER being used and NOT RESTORED to their Previous Values.

48 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 48 Indexing Pointers Description Pointer + Pre-offset - Pre-offset Pre-incrementPre-decrementPost-incrementPost-decrement Syntax Pointer Modified *R *+R[ disp ] *-R[ disp ] *++R[ disp ] *--R[ disp ] *R++[ disp ] *R--[ disp ] NoNoNoYesYesYesYes  [disp] specifies # elements - size in DW, W, H, or B.  disp = R or 5-bit constant.  R can be any register.

49 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 49 Modify and testing the code This code now performs the following operations: operations: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 +... + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop

50 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 50 Store the final result This code now performs the following operations: operations: a 0 *x 0 + a 1 *x 1 + a 2 *x 2 +... + a N *x N MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

51 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 51 Store the final result The Pointer A7 has not been initialised. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

52 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 52 Store the final result The Pointer A7 is now initialised. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2pt3, A7 MVKH.S2pt3, A7 MVKL.S2count, B0 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

53 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 53 What is the initial value of A4? A4 is used as an accumulator, so it needs to be reset to zero. so it needs to be reset to zero. MVKL.S2 pt1, A5 MVKL.S2 pt1, A5 MVKH.S2 pt1, A5 MVKH.S2 pt1, A5 MVKL.S2 pt2, A6 MVKL.S2 pt2, A6 MVKH.S2 pt2, A6 MVKH.S2 pt2, A6 MVKL.S2pt3, A7 MVKH.S2pt3, A7 MVKL.S2count, B0 ZERO.LA4 loop LDH.D*A5++, A0 LDH.D*A6++, A1 LDH.D*A6++, A1 MPY.MA0, A1, A3 MPY.MA0, A1, A3 ADD.LA4, A3, A4 ADD.LA4, A3, A4 SUB.SB0, 1, B0 [B0]B.Sloop [B0]B.Sloop STH.DA4, *A7

54 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 54 How can we add more processing power to this processor?.S1.S1.M1.M1.L1.L1.D1.D1 A0 A1 A2 A3 A4 Register File A............ Data Memory A15 32-bits Increasing the processing power!

55 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 55 (1)Increase the clock frequency..S1.S1.M1.M1.L1.L1.D1.D1 A0 A1 A2 A3 A4 Register File A............ Data Memory A15 32-bits Increasing the processing power! (2)Increase the number of Processing units.

56 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 56 To increase the Processing Power, this processor has two sides (A and B or 1 and 2) Data Memory.S1.S1.M1.M1.L1.L1.D1.D1 A0 A1 A2 A3 A4 Register File A............ A15 32-bits.S2.S2.M2.M2.L2.L2.D2.D2 B0 B1 B2 B3 B4 Register File B............ B15 32-bits

57 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 57 Can the two sides exchange operands in order to increase performance? Data Memory.S1.S1.M1.M1.L1.L1.D1.D1 A0 A1 A2 A3 A4 Register File A............ A15 32-bits B15.S2.S2.M2.M2.L2.L2.D2.D2 B0 B1 B2 B3 B4 Register File B............ 32-bits

58 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 58 The answer is YES but there are limitations.  To exchange operands between the two sides, some cross paths or links are required. What is a cross path?  A cross path links one side of the CPU to the other.  There are two types of cross paths:  Data cross paths.  Address cross paths.

59 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 59 Data Cross Paths  Data cross paths can also be referred to as register file cross paths.  These cross paths allow operands from one side to be used by the other side.  There are only two cross paths:  one path which conveys data from side B to side A, 1X.  one path which conveys data from side A to side B, 2X.

60 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 60 TMS320C67x Data-Path

61 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 61 Data Cross Paths  Data cross paths only apply to the.L,.S and.M units.  The data cross paths are very useful, however there are some limitations in their use.

62 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 62 Data Cross Path Limitations A 2x.L1.M1.S1 B 1x <src> <src> <dst> (1) The destination register must be on same side as unit. (2) Source registers - up to one cross path per execute packet per side. Execute packet: group of instructions that execute simultaneously.

63 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 63 Cross Path Limitations Data Cross Path LimitationsA 2x.L1.M1.S1 B 1x <src> <src> <dst> eg: ADD.L1x A0,A1,B2 MPY.M1x A0,B6,A9 SUB.S1x A8,B2,A8 || ADD.L1x A0,B0,A2 || Means that the SUB and ADD belong to the same fetch packet, therefore execute simultaneously.

64 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 64 Data Cross Path Limitations A 2x.L1.M1.S1 B 1x <src> <src> <dst> eg: ADD.L1x A0,A1,B2 MPY.M1x A0,B6,A9 SUB.S1x A8,B2,A8 || ADD.L1x A0,B0,A2 NOT VALID!

65 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 65 Data Cross Paths for both sides A 2x.L1.M1.S1 B 1x <src> <src> <dst>.L2.M2.S2 <dst> <src> <src>

66 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 66 Address cross paths.D1 A Addr Data LDW.D1T1 *A0,A5 STW.D1T1 A5,*A0 LDW.D1T1 *A0,A5 STW.D1T1 A5,*A0 (1) The pointer must be on the same side of the unit.

67 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 67 Load or store to either side.D1 A *A0 B Data1A5 Data2B5 DA1 = T1 DA2 = T2 LDW.D1T1 *A0,A5 LDW.D1T2 *A0,B5 LDW.D1T1 *A0,A5 LDW.D1T2 *A0,B5

68 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 68 Standard Parallel Loads.D1 A A5 *A0 B B5.D2 Data1 *B0 LDW.D1T1 *A0,A5 LDW.D1T1 *A0,A5 || LDW.D2T2 *B0,B5 LDW.D1T1 *A0,A5 LDW.D1T1 *A0,A5 || LDW.D2T2 *B0,B5 DA1 = T1 DA2 = T2

69 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 69 Parallel Load/Store using address cross paths.D1 A A5 *A0 B B5.D2 Data1 *B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T1 A5,*B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T1 A5,*B0 DA1 = T1 DA2 = T2

70 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 70 Fill the blanks... Does this work?.D1 A *A0 B.D2 Data1 *B0 LDW.D1__ *A0,B5 LDW.D1__ *A0,B5 || STW.D2__ B6,*B0 LDW.D1__ *A0,B5 LDW.D1__ *A0,B5 || STW.D2__ B6,*B0 DA1 = T1 DA2 = T2

71 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 71 Not Allowed!.D1 A *A0 B B5B6.D2 Data1 *B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T2 B6,*B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T2 B6,*B0 DA2 = T2

72 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 72 Not Allowed! Parallel accesses: both cross or neither cross.D1 A *A0 B B5B6.D2 Data1 *B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T2 B6,*B0 LDW.D1T2 *A0,B5 LDW.D1T2 *A0,B5 || STW.D2T2 B6,*B0 DA2 = T2

73 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 73 Conditions Don’t Use Cross Paths  If a conditional register comes from the opposite side, it does NOT use a data or address cross-path.  Examples: [B2] ADD.L1 A2,A0,A4 [A1] LDW.D2 *B0,B5

74 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 74 CPU Ref Guide Full CPU Datapath (Pg 2-2) ‘C62x Data-Path Summary

75 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 75 ‘C67x Data-Path Summary ‘C67x

76 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 76 Cross Paths - Summary  Data  Destination register on same side as unit.  Source registers - up to one cross path per execute packet per side.  Use “x” to indicate cross-path.  Address  Pointer must be on same side as unit.  Data can be transferred to/from either side.  Parallel accesses: both cross or neither cross. Conditionals Don’t Use Cross Paths. Conditionals Don’t Use Cross Paths.

77 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 77 Code Review (using side A only) MVK.S140, A2; A2 = 40, loop count loop:LDH.D1*A5++, A0; A0 = a(n) LDH.D1*A6++, A1; A1 = x(n) MPY.M1A0, A1, A3; A3 = a(n) * x(n) ADD.L1A3, A4, A4; Y = Y + A3 SUB.L1A2, 1, A2; decrement loop count [A2]B.S1loop; if A2  0, branch [A2]B.S1loop; if A2  0, branch STH.D1A4, *A7; *A7 = Y Y = 40  a n x n n = 1 * Note: Assume that A4 was previously cleared and the pointers are initialised.

78 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 78 Let us have a look at the final details concerning the functional units. Consider first the case of the.L and.S units.

79 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 79 So where do the 40-bit registers come from? Operands - 32/40-bit Register, 5-bit Constant  Operands can be:  5-bit constants (or 16-bit for MVKL and MVKH).  32-bit registers.  40-bit Registers.  However, we have seen that registers are only 32-bit.

80 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 80  A 40-bit register can be obtained by concatenating two registers.  However, there are 3 conditions that need to be respected:  The registers must be from the same side.  The first register must be even and the second odd.  The registers must be consecutive. Operands - 32/40-bit Register, 5-bit Constant

81 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 81 A1:A0A3:A2A5:A4A7:A6A9:A8A11:A10A13:A12A15:A14 odd even : 32 8 40-bit Reg 40-bit RegB1:B0B3:B2B5:B4B7:B6B9:B8B11:B10B13:B12B15:B14 odd even : 32 8 Operands - 32/40-bit Register, 5-bit Constant  All combinations of 40-bit registers are shown below:

82 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 82 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S Operands - 32/40-bit Register, 5-bit Constant instr.unit,, instr.unit,,

83 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 83 Operands - 32/40-bit Register, 5-bit Constant instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

84 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 84 Operands - 32/40-bit Register, 5-bit Constant OR.L1 A0, A1, A2 instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

85 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 85 Operands - 32/40-bit Register, 5-bit Constant OR.L1 A0, A1, A2 ADD.L2 -5, B3, B4 instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

86 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 86 Operands - 32/40-bit Register, 5-bit Constant OR.L1 A0, A1, A2 ADD.L2 -5, B3, B4 ADD.L1 A2, A3, A5:A4 instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

87 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 87 Operands - 32/40-bit Register, 5-bit Constant OR.L1 A0, A1, A2 ADD.L2 -5, B3, B4 ADD.L1 A2, A3, A5:A4 SUB.L1 A2, A5:A4, A5:A4 instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

88 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 88 Operands - 32/40-bit Register, 5-bit Constant OR.L1 A0, A1, A2 ADD.L2 -5, B3, B4 ADD.L1 A2, A3, A5:A4 SUB.L1 A2, A5:A4, A5:A4 ADD.L2 3, B9:B8, B9:B8 instr.unit,, instr.unit,, 32-bit Reg 40-bit Reg 32-bit Reg 5-bit Const 32-bit Reg 40-bit Reg.L or.S

89 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 89  To move the content of a register (A or B) to another register (B or A) use the move “MV” Instruction, e.g.: MV A0, B0 MV B6, B7  To move the content of a control register to another register (A or B) or vice-versa use the MVC instruction, e.g.: MVC IFR, A0 MVC A0, IRP Register to register data transfer

90 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 90 TMS320C6211/6711 Instruction Set

91 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 91 'C6211 Instruction Set (by category) Arithmetic ABS ADD ADDA ADDK ADD2 MPY MPYH NEG SMPY SMPYH SADD SAT SSUB SUB SUBA SUBC SUB2 ZERO Program Ctrl B IDLE NOP Logical AND CMPEQ CMPGT CMPLT NOT OR SHL SHR SSHL XOR Data Mgmt LDB/H/W MV MVC MVK MVKL MVKH MVKLH STB/H/W Bit Mgmt CLR EXT LMBD NORM SET Note: Refer to the 'C6000 CPU Reference Guide for more details.

92 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 92 'C6211 Instruction Set (by unit).S Unit MVKLH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKL MVKH.M Unit SMPY SMPYH MPY MPYH.L Unit NOT OR SADD SAT SSUB SUB SUBC XOR ZERO ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM.D Unit STB/H/W SUB SUBA ZERO ADD ADDA LDB/H/W MV NEG Other IDLENOP Note: Refer to the 'C6000 CPU Reference Guide for more details.

93 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 93 ‘C6711 Additional Instructions (by unit).S Unit CMPLTDP RCPSP RCPDP RSQRSP RSQRDP SPDP ABSSP ABSDP CMPGTSP CMPEQSP CMPLTSP CMPGTDP CMPEQDP.M Unit MPYI MPYID MPYSP MPYDP.L Unit INTSP INTSPU SPINT SPTRUNC SUBSP SUBDP ADDDP ADDSP DPINT DPSP INTDP INTDPU.D Unit ADDADLDDW Note: Refer to the 'C6000 CPU Reference Guide for more details. ‘C67x

94 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 94 TMS320C6211/6711 Memory Map

95 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 95 ‘C6211 Memory Map Byte Address FFFF_FFFF 0000_0000 64K x 8 Internal (L2 cache) Internal Memory  Unified (data or prog)  4 blocks - each can be RAM or cache On-chip Peripherals 0180_0000 External Memory   Async (SRAM, ROM, etc.)   Sync (SBSRAM, SDRAM) 256M x 8 External 2 3 8000_0000 9000_0000 A000_0000 B000_0000 0 1 Level 1 Cache   4KB Program   4KB Data   Not in mapCPU L264K 4KP 4KD

96 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 96 TMS320C6211/6711 Peripherals

97 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 97 'C6x System Block Diagram PERIPHERALSPERIPHERALSPERIPHERALSPERIPHERALS Memory Internal Buses ExternalMemory CPU.D1.M1.L1.S1.D2.M2.L2.S2 Regs (B0-B15) Regs (A0-A15) Control Regs

98 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 98 ‘C6x Internal Buses A D Internal Memory x32 A D External Interface A D x32 Peripherals can perform 64-bit data loads. ‘C67x Data Addr- T1 x32 Data Data- T1 x32/64 Data Addr- T2x32 Data Data- T2 x32/64 Aregs Bregs Program Addrx32 Program Datax256 PC DMA Addr- Readx32 DMA Data- Readx32 DMA Addr- Writex32 DMA Data- Writex32 DMA

99 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 99 PERIPHERALSPERIPHERALSPERIPHERALSPERIPHERALS Memory 'C6x System Block Diagram Internal Buses CPU.D1.M1.L1.S1.D2.M2.L2.S2 Regs (B0-B15) Regs (A0-A15) Control Regs EMIF Ext’lMemory

100 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 100 ‘C6202 16M x 8 External 0 Int’l Prog (64K instr) Int’l Data ( 128K bytes ) On-chip Peripherals 4M x 8 External 1 16M x 8 External 2 3 64K x 8 Internal (L2 cache) On-chip Peripherals 256M x 8 External 2 3 0 1 ‘C6211 ‘C6201/11 Memory Maps

101 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 101 PERIPHERALSPERIPHERALSPERIPHERALSPERIPHERALS 'C6x System Block Diagram Internal Buses CPU.D1.M1.L1.S1.D2.M2.L2.S2 Regs (B0-B15) Regs (A0-A15) Control Regs EMIF Ext’lMemory - Sync - Async Program RAM Data Ram Addr D (32)

102 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 102 'C6x Peripherals ‘C6xCPU‘C6xCPU EMIFEMIF DMADMA BootBootExternalMemory EMIF (External Memory Interface) - Glueless access to async/sync memory EPROM, SRAM, SDRAM, SBSRAM DMA/EDMA (Enhance Direct Memory Acces) - 4/16 Channels BOOT - Boot from 4M external block - Boot from HPI/XB McBSPMcBSPHPI/XBHPI/XB TimerTimer PLLPLL McBSP (Multi-Channel Buffered Serial Port) - High speed sync serial comm - T1/E1/MVIP interface HPI (Host Port Interface) /Expansion Bus (XB) - 16/32-bit host  P access Timer/Counters - Two 32-bit Timer/Counters

103 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 103 Clocking - Basic Definitions What is a “clock cycle”? ‘C6x ‘C6x CLKIN CLKOUT2 (1/2 CLKOUT1) CLKOUT1 (‘C6x clock cycle) PLL x1 x4 CLKIN - MHz PLL CLKOUT2 - MHz MIPs (max) 250x11252000 200x11001600 50x41001600 25x450800 CLKOUT1 - MHz 250 (4ns) 200 (5ns) 200 100 (10ns) When we talk about cycles...

104 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 104 'C6x System Block Diagram (Final) Internal Buses CPU.D1.M1.L1.S1.D2.M2.L2.S2 Regs (B0-B15) Regs (A0-A15) Control Regs EMIF Ext’lMemory - Sync - Async Program RAM Data Ram D (32) Serial Port Host Port Boot Load Timers Pwr Down DMA Addr

105 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 105 Program Memory Data Memory L2 Memory ’C6201B 64KB 1blkPgm/Cache 64KB 2blks 4 banksea External ’C6701 64KB 1blkPgm/Cache 64KB 2blks 8 banksea External ’C6202 256KB 1blkPgm/Cache 1blk MappedPgm 128 KB 2blks 4 banksea External ’C6211/C6711 4 KB 1blk Cache 4 KB 1blk Cache 64 KB 4blk Mapped Cache L1 Memory Internal Memory Summary TMS320C67x DSP Generation Parametric Table TMS320C64x DSP Generation Parametric Table TMS320C62x DSP Generation Parametric Table

106 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 106 ‘C6000 Device Summary DeviceMIPSMHz Kbytes pinsmmW$Periphs 6201B1600200128352271.980-110D2H 62022000250384352271.9120-150D3X 6211120015072256271.520-40E2H TMS320MFLOPS MHz Kbytes pinsmmW$Periphs 67011000167128352351.9170-200D2H 671160010072256270.920-40E2H Peripherals Legend: D,E:DMA,EDMA 2,3:# of McBSPs H,X:HPI, XBUS TMS320C62x DSP Generation Parametric Table TMS320C67x DSP Generation Parametric Table TMS320C64x DSP Generation Parametric Table

107 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 107 ’C6000 History 6201 r11Q97 coincident ‘C6x architectural announcement. Sample CPU core, minimal peripherals. 6201 r2 4Q97. Full production, with peripherals. 6201B 4Q98. Power reduced,.18 micron silicon, double ports into internal data memory. 67013Q98. Pin-for-pin compatible floating-point version of ‘C6201. 1GFLOP (@ 167MHz) performance. 6202 2Q99. 2000 MIPS @ 250MHz. 2-3x 6201 on-chip memory. Replaced HPI with Expansion Bus (32-bit HPI + more). 62113Q99. 2 cents per MIPS! 1200MIPS @ 150MHz as low as $25. Double-level cache, enhanced DMA. 6711Announced 3/1/99. 6701 floating-point CPU with 6211-like memory/peripherals. Volume pricing under $20.

108 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 108 ‘C6x Family Part Numbering  Example = TMS320LC6201PKGA200  TMS320= TI DSP  L = Place holder for voltage levels  C6 = C6x family  2 = Fixed-point core  01 = Memory/peripheral configuration  PKG= Pkg designator (actual letters TBD)  A = -40 to 85C (blank for 0 to 70C)  200 = Core CPU speed in Mhz

109 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 109 Device Summary Table Device Int Mem Ext MemPeripherals 6201/670164K Data3 x 16MDMA 16K Instr1 x 4M2 McBSP HPI (16-bit) 2 Timer/Counters (32-bit) 6202128K Data3 x 16MDMA 48K Instr1 x 4M2 McBSP 4 x 256M ----XBus (32-bit) 2 Timer/Counters (32-bit) 62114K Data Cache4 x 256MEDMA 4K Prog Cache2 McBSP 64K RAM/CacheHPI (16-bit) 2 Timer/Counters (32-bit)

110 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 110 Module 1 Exam 1. Functional Units a. How many can perform an ADD? Name them. a. How many can perform an ADD? Name them. b. Which support memory loads/stores? b. Which support memory loads/stores?.M.S.D.L six;.L1,.L2,.D1,.D2,.S1,.S2 2. Memory Map a. How many external ranges exist on ‘C6201? Four

111 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 111 3. Conditional Code a.Which registers can be used as cond’l registers? b.Which instructions can be conditional? 4. Performance a.What is the 'C6711 instruction cycle time? b.How can the 'C6711 execute 1200 MIPs? A1, A2, B0, B1, B2 All of them CLKOUT1 1200 MIPs = 8 instructions (units) x 150 MHz

112 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 112 5. Coding Problems a. Move contents of A0-->A1 a. Move contents of A0-->A1 MV.L1A0, A1 orADD.S1A0, 0, A1 orMPY.M1A0, 1, A1 (what’s the problem with this?)

113 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 113 5. Coding Problems a. Move contents of A0-->A1 a. Move contents of A0-->A1 b. Move contents of CSR-->A1 b. Move contents of CSR-->A1 c. Clear register A5 MV.L1A0, A1 orADD.S1A0, 0, A1 orMPY.M1A0, 1, A1 ZERO.S1A5 orSUB.L1A5, A5, A5 orMPY.M1A5, 0, A5 orCLR.S1A5, 0, 31, A5 orMVK.S10, A5 orXOR.L1A5,A5,A5 (A0 can only be a (A0 can only be a 16-bit value) MVCCSR, A1

114 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 114 5. Coding Problems (cont’d) d. A2 = A0 2 + A1 d. A2 = A0 2 + A1 e. If (B1  0) then B2 = B5 * B6 e. If (B1  0) then B2 = B5 * B6 f. A2 = A0 * A1 + 10 g. Load an unsigned constant (19ABCh) into register A6. MPY.M1A0, A0, A2 ADD.L1A2, A1, A2 [B1] MPY.M2B5, B6, B2 mvkl.s10x00019abc,a6 mvkh.s10x00019abc,a6 value.equ0x00019abc mvkl.s1value,a6 mvkh.s1value,a6 MPYA0, A1, A2 ADD10, A2, A2

115 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 115 5. Coding Problems (cont’d) h.Load A7 with contents of mem1 and post- increment the selected pointer. h.Load A7 with contents of mem1 and post- increment the selected pointer. x16 mem mem1 10h A7 load_mem1:MVKL.S1mem1,A6 load_mem1:MVKL.S1mem1,A6 MVKH.S1mem1,A6 LDH.D1*A6++,A7

116 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 116Architecture  Links:  C6711 data sheet: tms320c6711.pdf tms320c6711.pdf  User guide: spru189f.pdf spru189f.pdf  Errata: sprz173c.pdf sprz173c.pdf

117 Chapter 2 TMS320C6000 Architectural Overview - End -


Download ppt "Chapter 2 TMS320C6000 Architectural Overview. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2002 Chapter 2, Slide 2  Describe C6000 CPU."

Similar presentations


Ads by Google