Texas Instruments Course DSP. Module 1 Outline: Introduction, TI DSP Families, C6000 Architecture, C6000 Roadmap.

1 Texas Instruments Course DSP

2 Module 1 Outline
- Introduction
- TI DSP Families
- C6000 Architecture
- C6000 Roadmap

3 Learning Objectives
- Why process signals digitally?
- Definition of a real-time application.
- Why use Digital Signal Processing processors?
- What are the typical DSP algorithms?
- Parameters to consider when choosing a DSP processor.
- Programmable vs ASIC DSP.

4 Present Day Applications (DSP: Technology Enabler)
- Consumer Audio: Stereo A/D, D/A; PLL; Mixers
- Multimedia: Stereo audio; Imaging; Graphics palette; Voltage regulation
- Wireless / Cellular: Voice-band audio; RF codecs; Voltage regulation
- HDD: PRML read channel; MR pre-amp; Servo control; SCSI transceivers
- Automotive: Digital radio A/D/A; Active suspension; Voltage regulation
- DTAD: Speech synthesizer; Mixed-signal processor

5 System Considerations
- Performance
- Interfacing
- Power
- Size
- Ease-of-Use: Programming; Interfacing; Debugging
- Integration: Memory; Peripherals
- Cost: Device cost; System cost; Development cost; Time to market

6 Different Needs? Multiple Families!
- C2000 (C20x/24x/28x); also shown: 'C1x, 'C2x. Lowest Cost Control Systems:
  - Motor Control
  - Storage
  - Digital Ctrl Systems
- C5000 (C54x/55x); also shown: 'C5x. Efficiency, Best MIPS per Watt / Dollar / Size:
  - Wireless phones
  - Internet audio players
  - Digital still cameras
  - Modems
  - Telephony
  - VoIP
- C6000 (C62x/64x/67x); also shown: 'C3x, 'C4x, 'C8x. Max Performance with Best Ease-of-Use:
  - Multi Channel and Multi Function App's
  - Comm Infrastructure
  - Wireless Base-stations
  - DSL
  - Imaging
  - Multi-media Servers
  - Video

7 Why go digital?
- Digital signal processing techniques are now so powerful that it is sometimes extremely difficult, if not impossible, for analogue signal processing to achieve similar performance.
- Examples: FIR filter with linear phase; adaptive filters.

8 Why go digital?
- Analogue signal processing is achieved by using analogue components such as resistors, capacitors, and inductors.
- The inherent tolerances of these components, together with temperature, voltage changes, and mechanical vibrations, can dramatically affect the effectiveness of the analogue circuitry.

9 Why go digital?
- With DSP it is easy to change, correct, and update applications.
- Additionally, DSP reduces noise susceptibility, chip count, development time, cost, and power consumption.

10 Why NOT go digital?
- High-frequency signals cannot be processed digitally, for two reasons:
  - Analog-to-digital converters (ADCs) cannot work fast enough.
  - The application can be too complex to be performed in real time.

11 Real-time processing
- DSP processors have to perform tasks in real time, so how do we define real time?
- The definition of real time depends on the application.
- Example: a 100-tap FIR filter is performed in real time if the DSP can perform and complete the following operation between two samples: $y(n) = \sum_{k=0}^{99} a(k)\, x(n-k)$

12 Real-time processing
- Each Sample Time (from sample n to sample n+1) is split into Processing Time followed by Waiting Time.
- We can say that we have a real-time application if: Waiting Time $\geq 0$
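The real-time condition on this slide can be sketched in C as a cycle budget. This is a minimal sketch under stated assumptions: the helper names, the cost model of one MAC per tap with no overhead, and the clock and sample rates in the usage note are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* CPU cycles available between two consecutive samples (rounded down). */
long cycles_per_sample(long cpu_hz, long sample_rate_hz)
{
    assert(cpu_hz > 0 && sample_rate_hz > 0);
    return cpu_hz / sample_rate_hz;
}

/* Assumed cost model: one MAC per tap, no loop or I/O overhead. */
long fir_cycles(long taps, long macs_per_cycle)
{
    assert(taps > 0 && macs_per_cycle > 0);
    return (taps + macs_per_cycle - 1) / macs_per_cycle;  /* round up */
}

/* Waiting time = sample time - processing time, both in cycles. */
long waiting_cycles(long sample_cycles, long processing_cycles)
{
    return sample_cycles - processing_cycles;
}

/* Real-time condition from the slide: waiting time >= 0. */
bool is_real_time(long sample_cycles, long processing_cycles)
{
    return waiting_cycles(sample_cycles, processing_cycles) >= 0;
}
```

With a hypothetical 200 MHz clock and a 48 kHz sample rate there are 4166 cycles per sample. A 100-tap filter costing 100 cycles under the model above leaves 4066 waiting cycles, so the condition holds.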

13 Why do we need DSP processors?
- Why not use a General Purpose Processor (GPP) such as a Pentium instead of a DSP processor?
- What is the power consumption of a Pentium and a DSP processor?
- What is the cost of a Pentium and a DSP processor?

14 Why do we need DSP processors?
- Use a DSP processor when the following are required: cost saving; smaller size; low power consumption; processing of many "high" frequency signals in real time.
- Use a GPP processor when the following are required: large memory; advanced operating systems.

15 What are the typical DSP algorithms?
- The Sum of Products (SOP) is the key element in most DSP algorithms.
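As one illustration of the SOP claim, the sketch below computes an FIR filter output with a single generic sum-of-products routine. The function names and the delay-line layout (hist[k] holds x(n-k)) are assumptions made for this example, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of products: y = a[0]*x[0] + ... + a[n-1]*x[n-1]. */
long sop(const int *a, const int *x, size_t n)
{
    long y = 0;
    for (size_t i = 0; i < n; i++) {
        y += (long)a[i] * x[i];
    }
    return y;
}

/* Shift a new sample into the delay line so that hist[k] holds x(n-k). */
void fir_push(int *hist, size_t taps, int sample)
{
    assert(taps > 0);
    for (size_t k = taps - 1; k > 0; k--) {
        hist[k] = hist[k - 1];
    }
    hist[0] = sample;
}

/* One FIR output sample: y(n) = sum over k of coeff[k] * x(n-k). */
long fir_output(const int *coeff, const int *hist, size_t taps)
{
    return sop(coeff, hist, taps);
}
```

The filter adds only data movement around the SOP; the arithmetic itself is the same multiply-accumulate loop.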

16 Outline
- Introduction
- TI DSP Families
- C6000 Architecture: CPU; Buses and Peripherals
- C6000 Roadmap

17 'C6000 Block Diagram
- CPU, Internal Memory, Internal Buses, External Memory, Peripherals

18 'C6000 System Block Diagram
- CPU: functional units .D1, .M1, .L1, .S1 with Regs (A0-A15) and .D2, .M2, .L2, .S2 with Regs (B0-B15)
- Internal Memory, Internal Buses, External Memory, Peripherals

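19 What Problem Are We Trying To Solve?
- Digital sampling of an analog signal (amplitude A against time t), with the signal path x → ADC → DSP → DAC → Y.
- Most DSP algorithms can be expressed with MAC: $Y = \sum_{i=1}^{\mathrm{count}} a_i \cdot x_i$

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

  (C arrays are indexed from 0, so the loop covers the same count terms as the sum.)
- What does it take to do this fast … and easy?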

20 Fast MAC using only C
- Fastest Execution of MACs: the 'C6x roadmap... from 200 to 2400 MMACs
- Ease of C Programming:
  - Even using natural C, the 'C6000 Architecture can perform 2 to 4 MACs per cycle
  - Compiler generates 80-100% efficient code
- Multiply-Accumulate (MAC) in Natural C Code:

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

- How does the 'C6000 achieve such performance from C?

21 Sample Compiler Benchmarks
- Great out-of-box experience
- Completely natural C code (non 'C6x specific)
- Code available at: www.ti.com/sc/c6000compiler
- Versus hand-coded assembly based on cycle count
- How does the 'C6000 achieve such performance from C?

22 'C6000 Architecture: Built for Speed
- Diagram: register files A0-A31 and B0-B31; functional units .M1, .L1, .D1, .S1 and .M2, .L2, .D2, .S2; Controller/Decoder; Memory
- 'C6000 Compiler excels at Natural C
- While dual-MAC speeds math-intensive algorithms, the flexibility of 8 independent functional units allows the compiler to quickly perform other types of processing
- All 'C6000 instructions are conditional, allowing efficient hardware pipelining
- Instruction set and CPU hardware orthogonality allow the compiler to achieve 80-100% efficiency

23 Fastest MAC using Natural C

    float mac(float *m, float *n, int count)
    {
        int i;
        float sum = 0;

        for (i = 0; i < count; i++) {
            sum += m[i] * n[i];
        }
        …

    ;** --------------------------------------------------*
    LOOP:   ; PIPED LOOP KERNEL
                 LDDW    .D1     A4++,A7:A6
    ||           LDDW    .D2     B4++,B7:B6
    ||           MPYSP   .M1X    A6,B6,A5
    ||           MPYSP   .M2X    A7,B7,B5
    ||           ADDSP   .L1     A5,A8,A8
    ||           ADDSP   .L2     B5,B8,B8
    ||   [A1]    B       .S2     LOOP
    ||   [A1]    SUB     .S1     A1,1,A1
    ;** --------------------------------------------------*

- Diagram: register files A0-A31 and B0-B31; functional units .M1, .L1, .D1, .S1 and .M2, .L2, .D2, .S2; Controller/Decoder; Memory
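One way to read the kernel on this slide is that each side of the CPU keeps its own partial sum (A8 on the A side, B8 on the B side) while double-word loads bring in two floats per array per iteration. The C below is a hand-written sketch of that idea, not compiler output; the function name and the tail handling for odd counts are assumptions. Splitting the sum changes the order of floating-point additions, so results can differ slightly in rounding from the single-accumulator loop.

```c
#include <assert.h>

/* Dot product with two independent partial sums, mirroring the A/B sides. */
float mac_two_sums(const float *m, const float *n, int count)
{
    float sum_a = 0.0f;  /* plays the role of A8 */
    float sum_b = 0.0f;  /* plays the role of B8 */
    int i;

    assert(count >= 0);
    for (i = 0; i + 1 < count; i += 2) {
        sum_a += m[i] * n[i];          /* MPYSP .M1X + ADDSP .L1 */
        sum_b += m[i + 1] * n[i + 1];  /* MPYSP .M2X + ADDSP .L2 */
    }
    if (i < count) {                   /* leftover element for odd counts */
        sum_a += m[i] * n[i];
    }
    return sum_a + sum_b;
}
```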

24 Outline
- Introduction
- TI DSP Families
- 'C6000 Architecture: CPU; Buses and Peripherals
- 'C6000 Roadmap

25 'C6000 System Block Diagram
- CPU: functional units .D1, .M1, .L1, .S1 (Register Set A) and .D2, .M2, .L2, .S2 (Register Set B)
- Internal Memory, Internal Buses, External Memory, Peripherals
- Looking at the internal buses...

26 'C6000 Internal Buses
- Program: PC, Program Addr (x32), Program Data (x256)
- Data (A regs, B regs): Data Addr-T1 (x32), Data Data-T1 (x32/64), Data Addr-T2 (x32), Data Data-T2 (x32/64)
- DMA: DMA Addr-Read, DMA Data-Read, DMA Addr-Write, DMA Data-Write
- The buses connect Internal Memory, External Memory, and Peripherals.

27 'C6000 System Block Diagram
- CPU (Register Set A, Register Set B, functional units .D1 to .S2), Internal Memory, Internal Buses, External Memory
- Next, the internal memory...

28 'C6711 Memory
- CPU with 4K Program Cache and 4K Data Cache, backed by 64K Prog / Data (Level 2)
- Memory map (0000_0000 to FFFF_FFFF):
    0000_0000   64KB Internal
    0180_0000   On-chip Peripherals
    8000_0000   External space 0
    9000_0000   External space 1
    A000_0000   External space 2
    B000_0000   External space 3
- 128MB External

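29 'C6711 Cache Logic
- CPU requests data.
- Is data in L1? Yes: Send Data to CPU. No: check L2.
- Is data in L2? Yes: Copy Data from L2 to L1, then Send Data to CPU. No: Copy Data from External Mem to L2, then Copy Data from L2 to L1, then Send Data to CPU.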
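The flowchart on this slide can be sketched as a toy software model. Everything here is an assumption made for illustration: the sizes are tiny, each "line" holds a single word, and both levels are direct mapped, which is not how the 'C6711 caches are organised. Only the decision order (L1, then L2, then external memory) follows the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { L1_LINES = 4, L2_LINES = 16, EXT_WORDS = 64 };  /* toy sizes */

typedef enum { FROM_L1, FROM_L2, FROM_EXTERNAL } source_t;

typedef struct {
    bool     valid;
    uint32_t addr;   /* full address used as the tag in this toy model */
    int      data;
} line_t;

typedef struct {
    line_t l1[L1_LINES];
    line_t l2[L2_LINES];
    int    ext[EXT_WORDS];  /* stands in for external memory */
} memsys_t;

/* Read one word, reporting which level finally supplied it. */
int mem_read(memsys_t *m, uint32_t addr, source_t *src)
{
    assert(addr < EXT_WORDS);
    line_t *l1 = &m->l1[addr % L1_LINES];
    line_t *l2 = &m->l2[addr % L2_LINES];

    if (l1->valid && l1->addr == addr) {   /* Is data in L1? Yes */
        *src = FROM_L1;
        return l1->data;
    }
    if (l2->valid && l2->addr == addr) {   /* Is data in L2? Yes */
        *src = FROM_L2;
    } else {                               /* copy external mem -> L2 */
        l2->valid = true;
        l2->addr  = addr;
        l2->data  = m->ext[addr];
        *src = FROM_EXTERNAL;
    }
    *l1 = *l2;                             /* copy L2 -> L1 */
    return l1->data;                       /* send data to CPU */
}
```

Reading the same address twice hits L1 the second time. Reading a conflicting address evicts the L1 entry, after which the first address is served from L2.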

30 'C6711 Cache Details
- Level 1 Program (L1 Prog, 4KB):
  - Always cache
  - 1 way cache (direct mapped)
  - Zero wait-state
  - Line size: 512 bits (or 16 instr)
- Level 1 Data (L1 Data, 4KB):
  - Always cache
  - 2 way cache
  - Zero wait-state
  - Line size: 256 bits
- Level 2 (L2 Unified, 64KB):
  - Unified (prog or data)
  - RAM or cache
  - 1-4 way cache
  - 32 data bytes in 4 cycles
  - 16 instr. in 5 cycles
  - Line size: 1024 bits (or 128 bytes)
- Diagram bus widths between CPU, L1 Prog, L1 Data, and L2: 256, 8/16/32/64, 128, 256
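The sizes, associativities, and line sizes on this slide fix how many sets each cache has. The sketch below works that out with the textbook formula sets = size / (ways × line size). The struct and function names are assumptions, and the set-index mapping shown is the generic textbook one rather than a statement about which address bits the 'C6711 hardware uses.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t size_bytes;
    uint32_t ways;
    uint32_t line_bytes;
} cache_geom_t;

/* Number of sets: total size divided by the bytes held in one set. */
uint32_t cache_sets(cache_geom_t g)
{
    assert(g.ways > 0 && g.line_bytes > 0);
    return g.size_bytes / (g.ways * g.line_bytes);
}

/* Textbook mapping of a byte address to a set index. */
uint32_t cache_set_index(cache_geom_t g, uint32_t addr)
{
    return (addr / g.line_bytes) % cache_sets(g);
}
```

With the slide's figures, L1 Program (4 KB, 1 way, 64-byte lines) has 64 sets, L1 Data (4 KB, 2 way, 32-byte lines) has 64 sets, and L2 configured as a 4-way cache (64 KB, 128-byte lines) has 128 sets.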

31 'C6000 System Block Diagram
- CPU (Register Set A, Register Set B, functional units .D1 to .S2), Internal Memory, Internal Buses, External Memory, Peripherals
- Looking at each peripheral...

32 'C6000 Peripherals
- EMIF (to External Memory)
- XB, PCI, Host Port
- GPIO
- McBSP's
- Utopia
- DMA, EDMA (Boot)
- Timers
- VCP, TCP
- PLL

33 External Memory Interface (EMIF)
- Connects to external Async, SDRAM, and SBSRAM memory
- Glueless access to async/sync memory
- Works with PC100 SDRAM (cheap, fast, and easy!)
- Byte-wide data access
- 16, 32, or 64-bit bus widths

34 Internal and External Memory

    Devices                      Internal                               EMIF (A) size of range       EMIFB
    C6201, C6204, C6205, C6701   P = 64 KB, D = 64 KB                   52M Bytes (32-bits wide)     N/A
    C6202                        P = 256 KB, D = 128 KB                 52M Bytes (32-bits wide)     N/A
    C6203                        P = 384 KB, D = 512 KB                 52M Bytes (32-bits wide)     N/A
    C6211, C6711                 L1P = 4 KB, L1D = 4 KB, L2 = 64 KB     128M Bytes (32-bits wide)    N/A
    C6712                        L1P = 4 KB, L1D = 4 KB, L2 = 64 KB     64M Bytes (16-bits wide)     N/A
    C6414, C6415, C6416          L1P = 16 KB, L1D = 16 KB, L2 = 1 MB    256M Bytes (64-bits wide)    64M Bytes (16-bits wide)

35 HPI / XBUS / PCI
Parallel Peripheral Interface
- HPI: Dedicated, slave-only, async 16/32-bit bus allows host-µP access to C6000 memory
- XBUS: Similar to HPI but provides …
  - Master/slave and sync modes
  - Glueless i/f to FIFOs (up to single-cycle xfer rate)
- PCI: Standard 32-bit, 33MHz PCI interface
- These interfaces provide a means to bootstrap the C6000

36 GPIO
General Purpose Input/Output (GPIO)
- 'C64x provides 8 or 16 bits of general purpose bitwise I/O
- Use to observe or control the signal of a single pin

37 McBSP and Utopia
Multi-Channel Buffered Serial Port (McBSP)
- 2 (or 3) full-duplex, synchronous serial ports
- Up to 100 Mb/sec performance
- Supports multi-channel operation (T1, E1, MVIP, …)
Utopia (C64x)
- ATM connection
- 50 MHz wide area network connectivity

38 DMA / EDMA
Direct Memory Access (DMA / EDMA)
- Transfers any set of memory locations to another
- 4 / 16 / 64 channels (transfer parameter sets)
- Transfers can be triggered by any interrupt (sync)
- Operates independent of CPU
- On reset, provides bootstrap from memory
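To make the idea of a "transfer parameter set" concrete, the sketch below models one in software. The struct layout and names are hypothetical and are not the DMA or EDMA register format; on the real device the controller performs the copy in hardware without the CPU, whereas this loop only illustrates what such a parameter set describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transfer parameter set; not the EDMA register layout. */
typedef struct {
    const uint8_t *src;
    uint8_t       *dst;
    size_t         count;       /* number of byte elements to move */
    size_t         src_stride;  /* bytes between successive source elements */
    size_t         dst_stride;  /* bytes between successive destination elements */
} xfer_params_t;

/* Software stand-in for what the controller would do with one parameter set. */
void xfer_run(const xfer_params_t *p)
{
    assert(p != NULL && p->src != NULL && p->dst != NULL);
    for (size_t i = 0; i < p->count; i++) {
        p->dst[i * p->dst_stride] = p->src[i * p->src_stride];
    }
}
```

A source stride of 2 with a destination stride of 1, for example, gathers every other byte into a contiguous buffer.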

39 Timer / Counter
- Two (or three) 32-bit timer/counters
- Can generate interrupts
- Both input and output pins

40 VCP / TCP -- 3G Wireless
Turbo Coprocessor (TCP)
- Supports 35 data channels at 384 kbps
- 3GPP / IS2000 Turbo coder
- Programmable parameters include mode, rate and frame length
Viterbi Coprocessor (VCP)
- Supports >500 voice channels at 8 kbps
- Programmable decoder parameters include constraint length, code rate, and frame length
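The channel counts on this slide translate into aggregate data rates by simple multiplication. The helper below is only that arithmetic; its name is an assumption, and because the slide says ">500" voice channels, the VCP figure is a lower bound.

```c
#include <assert.h>

/* Aggregate rate in kbps for a number of equal-rate channels. */
long aggregate_kbps(long channels, long kbps_per_channel)
{
    assert(channels >= 0 && kbps_per_channel >= 0);
    return channels * kbps_per_channel;
}
```

For the TCP, 35 channels at 384 kbps come to 13440 kbps (about 13.4 Mbps). For the VCP, 500 channels at 8 kbps come to 4000 kbps.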

41 PLL
- External clock multiplier
- Reduces EMI and cost
- Pin selectable
- Input: CLKIN
- Output:
  - CLKOUT1: output rate of PLL; instruction (MIP) rate
  - CLKOUT2: 1/2 rate of CLKOUT1
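The clock relationships on this slide can be written down directly. In this sketch the function and struct names are assumptions, and the multiplier used in the usage note is a hypothetical value, since the slide only says the multiplier is pin selectable.

```c
#include <assert.h>

typedef struct {
    long clkout1_hz;  /* PLL output rate, also the instruction rate */
    long clkout2_hz;  /* half of CLKOUT1 */
} pll_clocks_t;

/* CLKOUT1 = CLKIN * multiplier; CLKOUT2 = CLKOUT1 / 2. */
pll_clocks_t pll_outputs(long clkin_hz, long multiplier)
{
    pll_clocks_t c;

    assert(clkin_hz > 0 && multiplier > 0);
    c.clkout1_hz = clkin_hz * multiplier;
    c.clkout2_hz = c.clkout1_hz / 2;
    return c;
}
```

With a hypothetical 25 MHz CLKIN and a x4 setting, CLKOUT1 is 100 MHz and CLKOUT2 is 50 MHz.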

42 'C6000 Peripherals
- EMIF; XB, PCI, Host Port; GPIO; McBSP's; Utopia; DMA, EDMA (Boot); Timers; VCP, TCP; PLL

43 Outline
- Introduction
- TI DSP Families
- 'C6000 Architecture
- 'C6000 Roadmap

44 C6000 Roadmap (Performance against Time; Software Compatible)
- 1st Generation:
  - C62x™: C6201, C6202, C6203, C6204, C6205, C6211
  - C67x™ (Floating Point): C6701, C6711, C6712
- 2nd Generation, C64x™ DSP: C6414, C6415, C6416 (General Purpose, Media Gateway, 3G Wireless Infrastructure)
- Highest Performance: C64x™ DSP 1.1 GHz; Multi-core

45 'C6000 Floating-Point (Performance against Time)
- 150 MFLOPS: C30, C31, C32, C33
- C6701: 1 GFLOPS
- C6711: 900 MFLOPS
- C6712: 600 MFLOPS
- C67x: 3 GFLOPS and beyond

46 TI Floating-Point Innovation
TI Floating Point - A History of Firsts:
- First commercially-successful floating-point DSP: 'C30 (1987)
- First floating-point DSP with multiprocessing support: 'C40 (1991)
- First $10 floating-point DSP: 'C32 (1995)
- First 1-GFLOPS DSP: 'C6701 (1998)
- First $5 floating-point DSP: 'C33 (1999)
- First 2-level cache floating-point DSP: 'C6711 (1999)
- First to offer 600 MFLOPS for under $10: 'C6712 (2000)

47 Outline
- Introduction
- TI DSP Families
- 'C6000 Architecture
- 'C6000 Roadmap
- Go to Module 2
- Optional Topics:
  - 'C6000 CPU Details
  - 'C6000 Instruction Summaries

48 What Problem Are We Trying To Solve?
- Digital sampling of an analog signal (amplitude A against time t), with the signal path x → ADC → DSP → DAC → Y.
- Most DSP algorithms can be expressed as: $Y = \sum_{i=1}^{40} a_i \cdot x_i$
- What are the two primary instructions?

49 The Core of DSP: Sum of Products
$y = \sum_{n=1}^{40} a_n \cdot x_n$
- Mult: MPY a, x, prod
- ALU: ADD y, prod, y
- The 'C6000 is designed to handle DSP's math-intensive calculations:

            MPY   .M  a, x, prod
            ADD   .L  y, prod, y

- Note: You don't have to specify functional units (.M or .L).
- Where are the variables?

50 Working Variables: The Register File
$y = \sum_{n=1}^{40} a_n \cdot x_n$
- Register File A (16 registers, 32 bits each) holds a, x, prod, and y.

            MPY   .M  a, x, prod
            ADD   .L  y, prod, y

- How is the number of iterations specified?

51 Loops: Coding on a RISC Processor
1. Program flow: the branch instruction (B loop)
2. Initialization: setting the loop count (MVK 40, cnt)
3. Decrement: subtract 1 from the loop counter (SUB cnt, 1, cnt)

52 The "S" Unit: For Standard Operations
$y = \sum_{n=1}^{40} a_n \cdot x_n$

            MVK   .S  40, cnt
    loop:   MPY   .M  a, x, prod
            ADD   .L  y, prod, y
            SUB   .L  cnt, 1, cnt
            B     .S  loop

- Register File A (16 registers, 32 bits each) holds a, x, prod, y, and cnt.
- How is the loop terminated?

53 Conditional Instruction Execution
- To minimize branching, all instructions are conditional: [condition] B loop
- Execution is based on the [zero/non-zero] value of the specified variable:

    Code syntax    Execute if
    [ cnt ]        cnt ≠ 0
    [ !cnt ]       cnt = 0

- Note: if the condition is false, execution is replaced with a nop.

54 Loop Control via Conditional Branch
$y = \sum_{n=1}^{40} a_n \cdot x_n$

            MVK   .S  40, cnt
    loop:   MPY   .M  a, x, prod
            ADD   .L  y, prod, y
            SUB   .L  cnt, 1, cnt
     [cnt]  B     .S  loop

- Register File A (32 bits) holds a, x, prod, y, and cnt.
- How are the a and x array values brought in from memory?
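The count-down loop on this slide maps almost line for line onto C with a goto. This is a reading aid, not generated code: the function name is an assumption, y is initialised to zero here although the slide does not show that step, and a and x stay fixed because this slide has not yet introduced the memory loads.

```c
#include <assert.h>

/* Mirror of the slide's loop: 40 iterations of y += a * x. */
int sop_countdown(int a, int x)
{
    int y = 0;     /* assumed starting value; not shown on the slide */
    int prod;
    int cnt = 40;  /* MVK 40, cnt */

loop:
    prod = a * x;        /* MPY a, x, prod */
    y = y + prod;        /* ADD y, prod, y */
    cnt = cnt - 1;       /* SUB cnt, 1, cnt */
    if (cnt) goto loop;  /* [cnt] B loop: branch only while cnt is non-zero */

    return y;
}
```

When cnt reaches zero the conditional branch is not taken and execution falls through, which is how the loop terminates.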

55 Memory Access via ".D" Unit
$y = \sum_{n=1}^{40} a_n \cdot x_n$

            MVK   .S  40, cnt
    loop:   LDH   .D  *ap, a
            LDH   .D  *xp, x
            MPY   .M  a, x, prod
            ADD   .L  y, prod, y
            SUB   .L  cnt, 1, cnt
     [cnt]  B     .S  loop

- Register File A (16 registers) holds a, x, prod, y, cnt, *ap, *xp, and *yp.
- Data Memory: x(40), a(40), y
- How do we increment through the arrays?

56 Auto-Increment of Pointers
$y = \sum_{n=1}^{40} a_n \cdot x_n$

            MVK   .S  40, cnt
    loop:   LDH   .D  *ap++, a
            LDH   .D  *xp++, x
            MPY   .M  a, x, prod
            ADD   .L  y, prod, y
            SUB   .L  cnt, 1, cnt
     [cnt]  B     .S  loop

- Register File A (16 registers) holds a, x, prod, y, cnt, *ap, *xp, and *yp.
- Data Memory: x(40), a(40), y
- How do we store results back to memory?

57 Storing Results Back to Memory
$y = \sum_{n=1}^{40} a_n \cdot x_n$

            MVK   .S  40, cnt
    loop:   LDH   .D  *ap++, a
            LDH   .D  *xp++, x
            MPY   .M  a, x, prod
            ADD   .L  y, prod, y
            SUB   .L  cnt, 1, cnt
     [cnt]  B     .S  loop
            STW   .D  y, *yp

- Register File A holds a, x, prod, y, cnt, *ap, *xp, and *yp.
- Data Memory: x(40), a(40), y
- But wait - that's only half the story...
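The complete listing on this slide corresponds to the C below. It is a sketch under stated assumptions: the function name is invented, the arrays are taken to hold 16-bit values because the listing loads them with LDH, the result is stored as a 32-bit word because the listing uses STW, and y is cleared before the loop even though the slide does not show that step.

```c
#include <assert.h>
#include <stdint.h>

/* 40-term sum of products over 16-bit data, result stored through yp. */
void sop40(const int16_t *ap, const int16_t *xp, int32_t *yp)
{
    int32_t y = 0;  /* assumed initial value; not shown on the slide */
    int cnt = 40;   /* MVK 40, cnt */

    assert(ap != 0 && xp != 0 && yp != 0);
    do {
        int16_t a = *ap++;              /* LDH *ap++, a */
        int16_t x = *xp++;              /* LDH *xp++, x */
        int32_t prod = (int32_t)a * x;  /* MPY a, x, prod */
        y += prod;                      /* ADD y, prod, y */
        cnt -= 1;                       /* SUB cnt, 1, cnt */
    } while (cnt);                      /* [cnt] B loop */

    *yp = y;                            /* STW y, *yp */
}
```

The post-increment on each pointer plays the role of the `*ap++` and `*xp++` addressing modes, so no separate address arithmetic is needed inside the loop.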

58 Dual Resources: Twice as Nice
- Register File A: A0-A15, 32 bits each, holding a_n, x_n, prd, sum, cnt, *a, *x, *y; functional units .M1, .L1, .S1, .D1
- Register File B: B0-B15, 32 bits each; functional units .M2, .L2, .S2, .D2
- Our final view of the sum of products example...

59 'C6000 System Block Diagram
- CPU (Register Set A, Register Set B, functional units .D1 to .S2), Internal Memory, Internal Buses, External Memory, Peripherals
- To summarize each unit's instructions...

60 'C62x RISC-like instruction set
- .S Unit: ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO
- .L Unit: ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM NOT OR SADD SAT SSUB SUB SUBC XOR ZERO
- .M Unit: MPY MPYH MPYLH MPYHL SMPY SMPYH
- .D Unit: ADD ADDAB (B/H/W) LDB (B/H/W) MV NEG STB (B/H/W) SUB SUBAB (B/H/W) ZERO
- No Unit Used: IDLE NOP

61 'C67x: Superset of Fixed-Point
All 'C62x instructions from slide 60, plus:
- .S Unit: ABSSP ABSDP CMPGTSP CMPEQSP CMPLTSP CMPGTDP CMPEQDP CMPLTDP RCPSP RCPDP RSQRSP RSQRDP SPDP
- .L Unit: ADDSP ADDDP SUBSP SUBDP INTSP INTDP SPINT DPINT SPTRUNC DPTRUNC DPSP
- .M Unit: MPYSP MPYDP MPYI MPYID
- .D Unit: LDDW

62 'C64x: Superset of 'C62x
- .S Unit: PACK2 PACKH2 PACKLH2 PACKHL2 UNPKHU4 UNPKLU4 SWAP2 SPACK2 SPACKU4 SADD2 SADDUS2 SADD4 ANDN SHR2 SHRU2 SHLMB SHRMB CMPEQ2 CMPEQ4 CMPGT2 CMPGT4 BDEC BPOS BNOP ADDKPC
- .L Unit: SHLMB SHRMB MVK (5-bit) ABS2 ADD2 ADD4 MAX MIN SUB2 SUB4 SUBABS4 ANDN PACK2 PACKH2 PACKLH2 PACKHL2 PACKH4 PACKL4 UNPKHU4 UNPKLU4 SWAP2/4
- .D Unit: LDDW LDNW LDNDW STDW STNW STNDW MVK (5-bit) ADD2 SUB2 AND ANDN OR XOR ADDAD
- .M Unit: MVD BITC4 BITR DEAL SHFL MPYHI MPYLI MPYHIR MPYLIR AVG2 AVG4 ROTL SSHVL SSHVR MPY2/SMPY2 DOTP2 DOTPN2 DOTPRSU2 DOTPNRSU2 DOTPU4 DOTPSU4 GMPY4 XPND2/4
- Double-size Register sets (A16-A31) (B16-B31)
- Advanced Instruction Packing (minimizes code-size)
- Advanced Emulation Features

