
1 UNIT - VIII

2 DSP Introduction
Digital Signal Processing:
  ◦ Application of mathematical operations to digitally represented signals
  ◦ Signals are represented digitally as sequences of samples
Digital Signal Processor (DSP):
  ◦ Electronic system that processes digital signals

3 DSP tasks
Most DSP tasks require:
  ◦ Repetitive numeric computations
  ◦ Real-time processing
  ◦ High memory bandwidth
  ◦ System flexibility
DSPs must perform these tasks efficiently while minimizing:
  ◦ Cost
  ◦ Power
  ◦ Memory use
  ◦ Development time

4 Programmable Digital Signal Processors
  ◦ Low power requirement
  ◦ Low cost
  ◦ Real-time I/O capability
  ◦ Availability of high-speed on-chip memories

5 Advantages of DSPs over Analog Circuits
  ◦ Can implement complex linear or nonlinear algorithms
  ◦ Can be modified easily by changing software
  ◦ Reduced parts count makes fabrication easier
  ◦ High reliability

6 Difference between DSPs and Other Microprocessors
General-purpose computers perform two major kinds of task:
  (1) Data manipulation
  (2) Mathematical calculation

7 Difference between DSPs and Other Microprocessors
Data manipulation involves:
  ◦ Storing
  ◦ Organizing
  ◦ Retrieving
  ◦ Sorting of information
In such programs, mathematics:
  ◦ Is used only occasionally
  ◦ Does not significantly affect the overall execution speed
In comparison:
  ◦ The execution speed of most DSP algorithms is limited almost completely by the number of multiplications and additions required

8 DSP Applications
  ◦ Digital cellular phones
  ◦ Digital cameras
  ◦ Satellite communication
  ◦ Voice mail
  ◦ Music synthesis
  ◦ Modems
  ◦ RADAR

9 TMS DSP IC
Naming convention, using TI TMS 320 C5X as the example:
  ◦ TI: Made by Texas Instruments
  ◦ Prefix: TMX = experimental device, TMP = prototype, TMS = qualified device
  ◦ C: CMOS technology with on-chip non-volatile memory as ROM
  ◦ E: CMOS technology with on-chip non-volatile memory as EPROM
  ◦ (no letter): NMOS technology with on-chip non-volatile memory as ROM
  ◦ 5: Generation
  ◦ X: Version number (0, 1, 2, 3, 4x, 5, 6, 7)

10 TMS DSP Types
Fixed Point DSPs
  ◦ TMS320C5x & 54x
  ◦ 16-bit DSPs
Floating Point DSPs
  ◦ TMS320C3x, 4x & 6x
  ◦ 16 & 32-bit DSPs
Multiprocessor DSPs
  ◦ TMS320C8x

11 EVOLUTION OF TMS320 FAMILY

12 MULTIPLIER and MULTIPLIER ACCUMULATOR (MAC)
Most common operation in DSP applications: array multiplication
  ◦ E.g., convolution and correlation
Motorola DSP processors:
  ◦ A single dedicated MAC unit
Texas Instruments TMS320C5x:
  ◦ Separate multiplier and accumulator

13 IMPLEMENTATION OF CONVOLVER WITH SINGLE MULTIPLIER/ADDER
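
The convolution that a single multiplier/adder computes can be sketched in software as a loop of multiply-accumulate steps. This is a minimal Python illustration rather than TMS320 code; the function name convolve_mac and the sample values are assumptions made for the example.

```python
def convolve_mac(x, h):
    """Direct convolution y[n] = sum over k of h[k] * x[n - k]."""
    y = []
    for n in range(len(x) + len(h) - 1):
        acc = 0  # accumulator is cleared before each output sample
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                acc += h[k] * x[n - k]  # one multiply-accumulate (MAC) step
        y.append(acc)
    return y


print(convolve_mac([1, 2, 3], [1, 1]))
```

Each output sample costs one MAC step per overlapping coefficient, which is why the MAC operation dominates execution time in this kind of algorithm.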

14 MAC in P-DSPs
TMS320C5x special instruction: MACD (multiply-accumulate with data move)
MACD pgm, dma
  ◦ pgm: program memory address
  ◦ dma: data memory address

15 MACD
A MAC operation with data move requires four memory accesses per instruction cycle:
  ◦ Fetch the MACD instruction from program memory
  ◦ Fetch one of the operands from program memory
  ◦ Fetch the second operand from data memory
  ◦ Write the contents of the data memory location at address dma into the location at address dma + 1
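
The effect of repeating a multiply-accumulate with data move over a filter's delay line can be sketched as follows. This is a simplified behavioural model in Python, not the exact MACD semantics (the real instruction pipelines the product through a product register); the function name fir_step_macd and the layout of the list named data are assumptions for illustration.

```python
def fir_step_macd(coeffs, data, new_sample):
    """Compute one FIR output sample.

    coeffs models the coefficient table in program memory.
    data models the delay line in data memory and must hold
    len(coeffs) + 1 entries so the oldest sample can be shifted out.
    """
    n = len(coeffs)
    data[0] = new_sample
    acc = 0
    for i in range(n - 1, -1, -1):   # walk from the oldest sample to the newest
        acc += coeffs[i] * data[i]   # multiply-accumulate
        data[i + 1] = data[i]        # data move: location i is copied to i + 1
    return acc


delay_line = [0, 0, 0, 0]
outputs = [fir_step_macd([1, 2, 3], delay_line, s) for s in (5, 7, 1)]
print(outputs)
```

Because the data move is folded into the same loop step as the multiply, the delay line is already shifted when the next input sample arrives.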

16 Von Neumann architecture
  ◦ Single address bus and single data bus
  ◦ Executing the MACD instruction therefore requires four clock cycles, one per memory access

17 Von Neumann architecture

18 Harvard Architecture
  ◦ Reduces the number of clock cycles required for memory access
  ◦ Uses more than one bus for both address and data

19 Harvard Architecture

20 Von Neumann Vs Harvard

21 First development
Von Neumann architecture
  ◦ Developed from a research paper written by John von Neumann and others in 1946
Harvard architecture
  ◦ Built by IBM in 1944 at Harvard University

22 Multiple access memory
The number of memory accesses per clock period can also be increased by using high-speed memory
  ◦ DARAM: Dual Access RAM
  ◦ Two memory accesses per clock period
Multiple-access RAM may be connected to the processing unit of the P-DSP using the Harvard architecture
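
How bus structure and multiple-access memory affect the MACD instruction can be estimated with a simple counting model. The sketch below is an idealized assumption, not a datasheet figure: it takes MACD's four memory accesses (two to program memory, two to data memory) and divides them by the accesses each memory can serve per clock period. Only the four-cycle Von Neumann case is stated in the slides; the other results follow from the model. The function name macd_memory_cycles is hypothetical.

```python
def macd_memory_cycles(shared_bus, accesses_per_cycle=1):
    """Estimate clock cycles spent on memory accesses for one MACD."""
    program_accesses = 2  # instruction fetch + operand from program memory
    data_accesses = 2     # operand read + data-move write in data memory

    def ceil_div(a, b):
        return -(-a // b)

    if shared_bus:
        # Von Neumann: every access competes for the same bus
        return ceil_div(program_accesses + data_accesses, accesses_per_cycle)
    # Harvard: program and data accesses proceed in parallel
    return max(ceil_div(program_accesses, accesses_per_cycle),
               ceil_div(data_accesses, accesses_per_cycle))


print(macd_memory_cycles(shared_bus=True))                          # single bus
print(macd_memory_cycles(shared_bus=False))                         # separate buses
print(macd_memory_cycles(shared_bus=False, accesses_per_cycle=2))   # dual-access memory
```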

23 Multiported Memory
Another technique to increase the number of memory accesses per clock period
Dual port memory
  ◦ Two memory accesses per clock period
  ◦ Removes the need to store the program and data in two different memory chips to permit simultaneous access to both program and data memory

24 Multiported Memory
[Figure: dual port memory with Address Bus 1, Address Bus 2, Data Bus 1 and Data Bus 2]
Limitation: higher cost than two single-port memories of the same capacity, because of the increased number of pins and larger chip area

25 VLIW architecture
  ◦ Very Long Instruction Word
  ◦ Example: TMS 320 C6x
  ◦ A larger number of ALUs, MAC units, shifters, etc.
  ◦ VLIW architectures execute multiple instructions per cycle and use simple, regular instruction sets
  ◦ More parallelism, higher performance

26 VLIW architecture
[Figure: block diagram with Program Control Unit, Instruction Cache, Multiported Register File, Read / Write Cross Bar, and Functional Units 1 to n]

27 PIPELINING
An instruction cycle consists of micro-instructions:
  ◦ Fetch phase
  ◦ Decode phase
  ◦ Memory read phase
  ◦ Execution phase
Each phase may be carried out separately by a different functional unit

28 PIPELINING
INSTRUCTION CYCLES OF PROCESSOR WITH NO PIPELINING

Value of T   Fetch   Decode   Read   Execute
1            I1      -        -      -
2            -       I1       -      -
3            -       -        I1     -
4            -       -        -      I1
5            I2      -        -      -
6            -       I2       -      -
7            -       -        I2     -
8            -       -        -      I2
9            I3      -        -      -
10           -       I3       -      -
11           -       -        I3     -
12           -       -        -      I3

29 PIPELINING
INSTRUCTION CYCLES OF PROCESSOR WITH PIPELINING

Value of T   Fetch   Decode   Read   Execute
1            I1      -        -      -
2            I2      I1       -      -
3            I3      I2       I1     -
4            I4      I3       I2     I1
5            I5      I4       I3     I2
6            I6      I5       I4     I3
7            I7      I6       I5     I4
8            I8      I7       I6     I5
9            I9      I8       I7     I6
10           -       I9       I8     I7
11           -       -        I9     I8
12           -       -        -      I9
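
The two timing tables follow a simple counting rule that can be checked in a few lines. This is a sketch assuming an ideal four-phase pipeline with no stalls; the function names are illustrative.

```python
def cycles_without_pipelining(instructions, stages=4):
    """Each instruction occupies the processor for all of its phases."""
    return instructions * stages


def cycles_with_pipelining(instructions, stages=4):
    """The first instruction fills the pipeline; each later one completes one cycle after the previous."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


print(cycles_without_pipelining(3))  # three instructions, as in the first table
print(cycles_with_pipelining(9))     # nine instructions, as in the second table
```

In the same 12 cycles the non-pipelined processor completes three instructions and the pipelined processor completes nine.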

30 PIPELINING
The number of instructions processed simultaneously in the CPU is referred to as the depth of the instruction pipeline
Instruction pipeline depth of some P-DSPs:

P-DSP Name / Family   Pipeline Depth
Analog Devices        2
Motorola DSP560x      3
TI TMS320C5x          4
TI TMS320C54x         5

31 SPECIAL ADDRESSING MODES
Short Immediate Addressing
  ◦ In short immediate instructions, the operand is contained within the instruction machine code
  ◦ In this example, the lower 8 bits are the operand and will be added to the ACC by the Central ALU
  ◦ The length of the short constant depends on the instruction type and the P-DSP

32 SPECIAL ADDRESSING MODES
Short Direct Addressing
  ◦ In TI TMS 320 DSPs, the upper 9 bits of the data memory address are stored in the data page pointer
  ◦ Only the lower 7 bits are specified as part of the instruction
Instruction word layout:
  ◦ Bits 15 through 8 contain the opcode
  ◦ Bit 7, with a value of 0, defines the addressing mode as direct
  ◦ Bits 6 through 0 contain the data memory address (DMA) offset
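
The way the 16-bit data address is formed from the 9-bit data page pointer and the 7-bit field in the instruction can be sketched as follows. The function name direct_address and the opcode byte 0x20 in the example are arbitrary assumptions; only the bit layout comes from the slide.

```python
def direct_address(data_page_pointer, instruction_word):
    """Concatenate the 9-bit page pointer with the 7-bit offset from the instruction."""
    if instruction_word & 0x80:
        raise ValueError("bit 7 must be 0 for direct addressing")
    offset = instruction_word & 0x7F            # bits 6..0 of the instruction
    page = data_page_pointer & 0x1FF            # 9-bit data page pointer
    return (page << 7) | offset                 # 16-bit data memory address


# Page 4, offset 5, with a made-up opcode byte of 0x20 in bits 15..8
print(hex(direct_address(4, 0x2005)))
```

Each page therefore spans 128 words, and 512 pages cover the full 64K-word data space.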

33 SPECIAL ADDRESSING MODES
INDIRECT Addressing
  ◦ Permits an array of data to be fetched and stored efficiently by the P-DSP
  ◦ The addresses of the operands are stored in registers called Indirect Address Registers
  ◦ In TI DSPs, the Indirect Address Registers are called Auxiliary Registers (ARs)

34 SPECIAL ADDRESSING MODES
INDIRECT Addressing (contd.)
The ’C5x provides four indirect addressing options:
  ◦ No increment or decrement
  ◦ Increment by one
  ◦ Decrement by one
  ◦ Increment or decrement by a value held in the Offset Register
In TI DSPs, the Offset Register is called the INDX Register
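
The four update options can be modelled as a small function that returns the next value of a 16-bit auxiliary register after an access. This is an illustrative sketch; the mode names ("none", "inc", "dec", "add_indx", "sub_indx") are hypothetical labels, not assembler syntax.

```python
def update_ar(ar, mode, indx=0):
    """Return the auxiliary register value after one indirect access."""
    if mode == "none":
        new_value = ar
    elif mode == "inc":
        new_value = ar + 1
    elif mode == "dec":
        new_value = ar - 1
    elif mode == "add_indx":
        new_value = ar + indx
    elif mode == "sub_indx":
        new_value = ar - indx
    else:
        raise ValueError("unknown mode: " + mode)
    return new_value & 0xFFFF  # the register is 16 bits wide, so it wraps


# Stepping through every fourth word of an array that starts at 0x0100
ar = 0x0100
visited = []
for _ in range(3):
    visited.append(ar)
    ar = update_ar(ar, "add_indx", indx=4)
print([hex(a) for a in visited])
```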

35 SPECIAL ADDRESSING MODES
Memory-mapped Addressing
  ◦ The CPU registers and I/O registers of P-DSPs are also accessible as memory locations
  ◦ They are mapped into either the first or the last page of the memory space
  ◦ In TI DSPs, page 0 corresponds to the CPU registers and I/O registers
  ◦ When these registers are accessed using memory-mapped addressing, the 9 MSBs of the address are forced to 0
  ◦ This allows the memory-mapped registers of data page 0 to be addressed directly without affecting the current data page pointer value

36 SPECIAL ADDRESSING MODES
Bit Reversed Addressing
  ◦ An auxiliary register points to the physical location of a data value
  ◦ When INDX is added to the current AR using bit-reversed addressing, addresses are generated in bit-reversed order
  ◦ E.g., FFT
Circular Addressing
  ◦ Used in real-time processing of signals
  ◦ Memory is organized as a circular buffer
  ◦ The beginning and ending addresses are defined by the programmer
  ◦ When the address pointer is incremented, it is checked against the ending address of the circular buffer
  ◦ If it exceeds the ending address, it is set to the beginning address of the circular buffer
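
Both address generators can be sketched in software. The bit-reversed sequence below is produced by adding an index with the carry propagating from the most significant bit toward the least significant bit; the sketch assumes the index register holds half the FFT size, and that addresses are offsets from the start of the buffer. The circular increment wraps to the start address after the end address. All function names are illustrative.

```python
def bit_reverse(value, bits):
    """Reverse the order of the lowest `bits` bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(addr, step, bits):
    """Add step to addr with carries propagating from MSB to LSB."""
    mask = (1 << bits) - 1
    total = (bit_reverse(addr, bits) + bit_reverse(step, bits)) & mask
    return bit_reverse(total, bits)


def bit_reversed_sequence(n_points):
    """Offsets visited for an n_points FFT (n_points must be a power of two)."""
    bits = n_points.bit_length() - 1
    indx = n_points // 2          # assumed index register value
    addr, sequence = 0, []
    for _ in range(n_points):
        sequence.append(addr)
        addr = reverse_carry_add(addr, indx, bits)
    return sequence


def circular_increment(addr, start, end):
    """Advance a circular-buffer pointer, wrapping from end back to start."""
    addr += 1
    if addr > end:
        addr = start
    return addr


print(bit_reversed_sequence(8))
print(hex(circular_increment(0x103, 0x100, 0x103)))
```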

37 ON-CHIP Peripherals P-DSPs have a number of ON-CHIP peripherals that relieve the CPU from routine functions

38 ON-CHIP TIMER
  ◦ Generation of periodic interrupts to P-DSPs
  ◦ Generation of sampling clocks for A/D converters
  ◦ Can be programmed by P-DSPs
  ◦ Can generate a single pulse or a pulse train
  ◦ Can generate a square wave or a periodic square wave
  ◦ The timer period is programmable

39 SERIAL PORT
  ◦ Data communication between the P-DSP and an external peripheral such as an A/D converter, a D/A converter, or an RS-232C device
  ◦ Input and output buffers (write / read)
  ◦ Sends data to and receives data from peripherals
  ◦ Asynchronous mode (Tx/Rx data lines only)
  ◦ Synchronous mode (Tx/Rx data lines plus bit clock and frame sync)
  ◦ Generates interrupts: output buffer empty, input buffer full

40 TDM Serial Port
Permits the P-DSP to communicate with other devices or P-DSPs using Time Division Multiplexing
[Figure: TDM frame with 8 time slots]

41 TDM Serial Port
The TDM serial port normally uses 4 lines for serial communication:
  ◦ TFRM: the frame sync signal
  ◦ TCLK: the bit clock
  ◦ TADD: the address of the serial device that outputs data in a particular TDM slot
  ◦ TDAT: the data transmitted into the TDM channel by the authorized device

42 TDM Serial Port
Interconnecting 8 devices through their TDM serial ports using a 4-wire bus

43 Parallel Port
  ◦ Communication between the P-DSP and other devices is faster than with serial communication
  ◦ One approach: use the data bus
  ◦ Alternative: separate lines for the parallel ports, including the handshaking signals

44 BIT I/O PORTS
  ◦ Single bit wide
  ◦ Individually set, reset or read
  ◦ Normally used for control purposes
  ◦ Can also be used for data transfer
  ◦ No handshaking signals
  ◦ Used for conditional branching

45 HOST PORT
  ◦ Special parallel port
  ◦ 8 or 16 bits wide
  ◦ Communicates with a microprocessor or PC, called the HOST
  ◦ Generates interrupts

46 COMM PORTS
  ◦ Parallel ports used for interprocessor communication between a number of identical P-DSPs in a multiprocessor system
  ◦ 8 bits wide
  ◦ Data to be processed may be 32 bits wide or more
  ◦ The sender splits each word into a stream of 8-bit values
  ◦ The receiver assembles the 8-bit values back into 32-bit words
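
Splitting a 32-bit word into 8-bit values and reassembling it on the receiving side can be sketched as below. The byte order (least significant byte first) is an assumption made for the example; the slide does not specify it.

```python
def split_word(word):
    """Split a 32-bit word into four 8-bit values, least significant byte first."""
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


def assemble_word(byte_values):
    """Rebuild a 32-bit word from four 8-bit values in the same order."""
    word = 0
    for position, value in enumerate(byte_values):
        word |= (value & 0xFF) << (8 * position)
    return word


stream = split_word(0x12345678)
print([hex(b) for b in stream])
print(hex(assemble_word(stream)))
```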

47 ON-CHIP A/D & D/A CONVERTERS
Used for voice applications such as:
  ◦ Cellular telephones
  ◦ Tapeless answering machines

48 P-DSPs with RISC and CISC
Arguments advanced for RISC:
  ◦ Small, heavily optimized instruction set executable in a single short cycle
  ◦ All instructions the same size
  ◦ No microcode = faster execution
  ◦ High Level Language (HLL) support
  ◦ Better compiler target
  ◦ Simple enough for academic designs and class projects

49 P-DSPs with RISC and CISC
Arguments advanced for CISC:
  ◦ Fewer instructions per task
  ◦ Shorter programs
  ◦ Hardware implementation of complex instructions is faster than software
  ◦ HLL support
  ◦ Extra addressing modes help the compiler

50 Characteristics of some of the TMS320 family DSP chips

51 INTERNAL ARCHITECTURE OF TMS320C5X

52 Architecture of TMS320 C5x DSP
  ◦ Advanced Harvard architecture
  ◦ Separate memory bus structures for program and data
  ◦ Has instructions that enable data transfer between the program and data memory areas

53 BUS STRUCTURE
  ◦ Separate program and data buses
  ◦ Simultaneous access to program instructions and data
  ◦ High degree of parallelism
Control mechanisms:
  ◦ Manage interrupts
  ◦ Repeated operations
  ◦ Function calling

54 BUS STRUCTURE
Program Bus (PB)
  ◦ Carries instruction code and immediate operands from program memory to the CPU
Program Address Bus (PAB)
  ◦ Provides addresses to program memory for both reads and writes
Data Read Bus (DB)
  ◦ Interconnects various elements of the CPU to data memory
Data Read Address Bus (DAB)
  ◦ Provides the address to access the data memory space

55 Central Arithmetic Logic Unit (CALU)
The CALU components:
16-bit x 16-bit parallel multiplier
  ◦ The C5x processor performs 16 x 16 multiplication of numbers in 2’s complement form
32-bit Arithmetic Logic Unit (ALU)
32-bit Accumulator (ACC)
  ◦ One of the operands for an ALU operation comes from the ACC
  ◦ The results of operations performed in the CALU are stored in the ACC
  ◦ The higher-order and lower-order words of the ACC can be accessed separately

56 CALU contd.
32-bit Accumulator Buffer (ACCB)
  ◦ Used for temporary storage of the ACC
32-bit Product Register (PREG)
  ◦ Holds the result of multiplication
  ◦ The 16-bit Temporary Register 0 (TREG0) holds the multiplicand
0- to 16-bit left barrel shifter and right barrel shifter
  ◦ Permit the contents of memory to be shifted left or right by 0 to 16 bits before they are fed to the ALU or stored from the ALU to memory
  ◦ For example, take a 4-bit barrel shifter with inputs A, B, C and D: the shifter can cycle the order of the bits ABCD to DABC or CDAB
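
The 4-bit example can be reproduced with a rotation, shown here once on labelled bits and once on an integer. This sketch models only the bit cycling described in the example, not the shifter hardware of any particular device; the function names are illustrative.

```python
def rotate_labels_right(labels, count):
    """Rotate a string of bit labels right by count positions."""
    count %= len(labels)
    if count == 0:
        return labels
    return labels[-count:] + labels[:-count]


def rotate_right(value, count, width=4):
    """Rotate a width-bit integer right by count positions."""
    count %= width
    mask = (1 << width) - 1
    return ((value >> count) | (value << (width - count))) & mask


print(rotate_labels_right("ABCD", 1))
print(rotate_labels_right("ABCD", 2))
print(bin(rotate_right(0b0001, 1)))
```

A barrel shifter produces any of these results in a single step, rather than shifting one position per clock.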

57 CALU contd.
  ◦ The CPU registers ACC and PREG can also be shifted using these shifters
  ◦ A 5-bit register, TREG1, specifies the number of bits by which the scaling shifter should shift

58 AUXILIARY REGISTER ARITHMETIC UNIT (ARAU)
Consists of:
  ◦ Eight 16-bit auxiliary registers (ARs), AR0-AR7
  ◦ A 3-bit Auxiliary Register Pointer (ARP)
  ◦ An unsigned 16-bit ALU
Basically used for indirect addressing mode operations

59 ARAU contd.
The auxiliary registers AR0-AR7 may also be used as general-purpose registers for holding the operands of arithmetic and logic operations in the CALU
Some of the registers of the ARAU are:
16-bit Index Register (INDX)
  ◦ Used by the ARAU as a step value (addition or subtraction by more than 1) to modify the address in the ARs during indirect addressing
  ◦ It can also map the dimension of the address block used for bit-reversed addressing

60 ARAU contd.
Auxiliary Register Compare Register (ARCR)
  ◦ The 16-bit ARCR is used for address boundary comparison
Block Move Address Register (BMAR)
  ◦ The 16-bit BMAR holds an address value to be used with block moves and MAC operations
  ◦ This register provides the 16-bit indirect address for an indirectly addressed second operand

61 ARAU contd.
Block Repeat Registers
Repeat Counter Register (RPTC)
  ◦ Holds the repeat count in a repeat single-instruction operation and is loaded by the RPT and RPTZ instructions
Block Repeat Counter Register (BRCR)
  ◦ Holds the count value for the block repeat feature
  ◦ This value is loaded before the block repeat operation is initiated
Block Repeat Program Address Start Register (PASR)
  ◦ Indicates the 16-bit address where the repeated block of code starts
Block Repeat Program Address End Register (PAER)
  ◦ Indicates the 16-bit address where the repeated block of code ends

62 Parallel Logic Unit (PLU)
  ◦ Performs Boolean operations
  ◦ Allows logical operations to be performed on data memory values directly without affecting the contents of the ACC and PREG
  ◦ Results of a PLU function are written back to the original data memory location

63 Memory Mapped Registers
  ◦ The ’C5x has 96 registers mapped into page 0 of the data memory space
  ◦ All ’C5x DSPs have 28 CPU registers and 16 I/O port registers, but differ in the number of peripheral and reserved registers
  ◦ Memory-mapped registers can be written to and read from in the same way as any other data memory location

64 Program Controller
Contains logic circuitry that:
  ◦ Decodes the instructions
  ◦ Manages the CPU pipeline
  ◦ Stores the status of CPU operations
  ◦ Decodes the conditional operations

65 Elements of Program Controller
  ◦ 16-bit Program Counter (PC)
  ◦ 16-bit Status Registers (ST0, ST1)
  ◦ Instruction Register
  ◦ Interrupt Flag Register
  ◦ Interrupt Mask Register

66 Flags in ST0

67 FLAGS IN ST0
ARP – Auxiliary Register Pointer (see ARB in ST1)
OV – Overflow flag bit (ALU)
  ◦ Arithmetic operation
OVM – Overflow Mode bit
  ◦ ACC overflow saturation mode
INTM – Interrupt Mode bit
  ◦ Globally masks or enables all interrupts
DP – Data Memory Page Pointer
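
The effect of the overflow mode bit on a 32-bit accumulator can be sketched as follows. This is a behavioural model under stated assumptions: with the mode bit set, an overflowing result is clamped to the most positive or most negative 32-bit value, and with it cleared the result wraps in two's complement. The function name acc_add is illustrative, and the overflow flag is returned per operation rather than modelled as a status register bit.

```python
MAX32 = 0x7FFFFFFF
MIN32 = -0x80000000


def acc_add(acc, operand, ovm):
    """Add operand to a signed 32-bit accumulator.

    Returns (new_acc, overflow_flag).
    """
    total = acc + operand
    if MIN32 <= total <= MAX32:
        return total, False
    if ovm:
        # Saturation mode: clamp to the nearest representable value
        return (MAX32 if total > MAX32 else MIN32), True
    # Normal mode: wrap around in 32-bit two's complement
    wrapped = (total + 2**31) % 2**32 - 2**31
    return wrapped, True


print(acc_add(MAX32, 1, ovm=True))
print(acc_add(MAX32, 1, ovm=False))
```

Saturation is usually preferred in signal processing because a clamped sample distorts the output far less than a sign flip caused by wraparound.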

68 Flags in ST1

69 FLAGS in ST1
ARB – Auxiliary Register Buffer
CNF – On-chip RAM configuration control bit (DARAM B0)
  ◦ CNF = 0: B0 mapped to data memory (DM)
  ◦ CNF = 1: B0 mapped to program memory (PM)
TC – Test / Control flag bit
  ◦ Used by conditional branch and call instructions

70 FLAGS in ST1
SXM – Sign Extension Mode bit
  ◦ Enables / disables sign extension of an arithmetic operation
HM – Hold Mode bit
  ◦ CPU stops or continues execution
XF – External Flag output pin
  ◦ Reset value = 1
PM – Product shift mode bits
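
What sign extension means for a 16-bit operand entering a 32-bit datapath can be sketched as below. This is a minimal model with an illustrative function name, load_16bit: with sign extension enabled the word is interpreted as a two's complement value, and with it disabled the word is treated as unsigned.

```python
def load_16bit(word, sxm):
    """Interpret a 16-bit memory word as it would enter a 32-bit ALU."""
    word &= 0xFFFF
    if sxm and (word & 0x8000):
        return word - 0x10000   # sign bit set: extend as a negative value
    return word                 # zero-extended


print(load_16bit(0xFFFF, sxm=True))
print(load_16bit(0xFFFF, sxm=False))
```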


72 ON-CHIP Memory
The ’C5x architecture contains a considerable amount of on-chip memory to aid system performance and integration:
  ◦ Program Read-Only Memory (ROM)
  ◦ Data/Program Dual-Access RAM (DARAM)
  ◦ Data/Program Single-Access RAM (SARAM)

73 ON-CHIP Memory contd.
Program ROM
  ◦ 16-bit on-chip programmable ROM
  ◦ Used for booting program code from slower external ROM or EPROM to fast on-chip or external RAM
  ◦ Once the custom program has been booted into RAM, the boot ROM space can be removed from the program memory space

74 ON-CHIP Memory contd.
Data / Program Dual Access RAM (DARAM)
  ◦ 1056 words x 16-bit on-chip DARAM
  ◦ The DARAM is divided into 3 individually selectable memory blocks:
      512-word data or program DARAM block B0
      512-word data DARAM block B1
      32-word data DARAM block B2
  ◦ Primarily intended to store data values; when needed, it can be used to store programs as well
  ◦ The dual data buses (DB & DAB) allow the CPU to read from and write to DARAM in the same instruction cycle

75 ON-CHIP Memory contd.
Data / Program Single Access RAM (SARAM)
  ◦ All ’C5x DSPs except the ’C52 carry a 16-bit on-chip SARAM of various sizes
  ◦ Code can be booted from an off-chip ROM and then executed at full speed once it is loaded into the on-chip SARAM
  ◦ The SARAM can be configured by software in one of three ways:
      All SARAM configured as data memory
      All SARAM configured as program memory
      SARAM configured as both data memory and program memory

76 ON-CHIP Memory contd.
Data / Program Single Access RAM (SARAM)
  ◦ All ’C5x CPUs support parallel accesses to these SARAM blocks
  ◦ However, one SARAM block can be accessed only once per machine cycle
  ◦ In other words, the CPU can read from or write to one SARAM block while accessing another SARAM block


78 http://focus.ti.com.cn/cn/lit/ug/spru056d/spru056d.pdf

