Presentation is loading. Please wait.

Presentation is loading. Please wait.

Digital Signal Processors (DSPs). DSP Advanced signal processor circuits MAC (Multiply and Accumulate) unit (s) - provides fast multiplication of two.

Similar presentations


Presentation on theme: "Digital Signal Processors (DSPs). DSP Advanced signal processor circuits MAC (Multiply and Accumulate) unit (s) - provides fast multiplication of two."— Presentation transcript:

1 Digital Signal Processors (DSPs)

2 DSP Advanced signal processor circuits MAC (Multiply and Accumulate) unit (s) - provides fast multiplication of two operands and accumulating results at a single address. MAC unit computes fast an expression such as the following summation. yn = Σ (ai × xn−i) where the sum is made for i = 0, 1, 2, …, N-1. Here i, n and N are the integers, ai is a coefficient, xj is independent variable or an input element and yk is the dependent variable or an output element.

3 DSP DSP processors invariably have Harward architecture. Caches are organized in Harward architecture separate I-cache and D-Cache Basic Units: Memory, Internal Bus, Data bus, Address bus, control bus, Bus Interface Unit, Instruction fetch register, Instruction decoder, Control unit, Instruction Cache, Data Cache, multistage pipeline processing, multi-line superscalar processing for obtaining processing speed higher than one instruction per clock cycle, Program counter

4 DSP special structural units Packed Data processing Parallel Execution MAC units Special Instructions Instruction Packing unit

5 ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION SHARC and Tiger SHARC

6 SHARC Processor from Analog Devices. SHARC stands for Super Harvard Single - Chip Computer Super Harvard architecture means more than one set of address and data buses for data For example, set J of address and data buses for data-memory space, set K of address and data buses for data-memory space, and set I of address and data buses for program memory space

7 SHARC functions Program memory configurable as program and data memory parts (Princeton architecture) SHARC functions as VLIW (very large instruction word) processor. used in large number of DSP applications. Controlled power dissipation in floating point ALU. Different SHARCs can link by serial communication between them

8 SHARC features Instruction word 48-bits 32-bit data word for integer and floating point operations 40-bit extended floating point. Smaller 16 or 8 bit must also store as full 32-bit data. Therefore, the big endian or little endian data alignment is not considered during processing with carry also extended for rotating (RRX). ON chip memory 1 MB Program memory and data memory (Harvard architecture) External OFF chip memory OFF chip as well as ON-chip Memory can be configured for 32-bit or 48 bit words.

9 Parallel Processing features Supports processing instruction level parallelism as well as memory access parallelism. Multiple data accesses in a single Instruction Addressing Features SHARC 32-bit address space for accesses Accesses 16 GB or 20 GB or 24 GB as per the word size ( 32-bit, 40-bit or 48 bit) configured in the memory for each address. When word size = 32-bit then external memory configuration addressable space is 16 GB (232 × 4) bytes

10 Word size features Provides for two word sizes 32-bit and 48-bit. SHARC two full sets of 16 general-purpose registers, therefore fast context switching Allows multitasking OS and multithreading. Registers are called R0 to R15 or F0 to F15 depending upon integer operation configuration or floating point configuration. Registers are of 32-bit. A few registers are special, of 48 bits that may also be accesses as pair of 16- bit and 32-bit registers.

11 TigerSHARC Highest performance density family of processors from Analog Devices Precision high-performance integrated circuits used in analog and digital signal processing applications Designed for multiprocessing applications and for peak performance greater than BFLOPS (billion floating-point operations per second)

12 Example ADSP-TS203SABP-050 processor processes using 250 MHz clock and on chip memory of 6 M bits and operates at 1.2V/3.3 V Low voltage design helps in processing with little power dissipation. Analog Devices claims the highest performance per Watt.

13 TigerSHARC Multiple TigerSHARCs can connect by serial communication at 1 GBps. A TigerSHARC version has 24M bits ON-chip memory. Two ALUs and twos set of address and data buses for data memory On set of address and data buses for program memory TigerSHARC is available as IP core also so that new applications and enhancements can be developed.

14

15 Arithmatic Fn = Fx + Fy Fn = Fx-Fy Fn = ABS(Fx + Fy) Fn = ABS(Fx-Fy) Fn=(Fx + Fy)/2 COMP(Fx,Fy) Fn = -Fx Rn = Rx + Ry + CI Add Subtract Absolute value of sum Absolute value of difference Average Compare Negate Add with Carry

16 Arithmatic Fn = ABSFx Fn = PASS Fx Fn = RND Fx Fn = SCALE Fx by Ry Rn = MANX Fx Rn = LOGB Fx Rn = FIX Fx, Fn = FLOAT Rx by Ry Absolute value CopyFx to Fn Round Scale exponent of Fx by Ry Extract mantissa of Fx Convert exponent of Fx to integer Convert floating-point to integer Convert integer to floating-point

17 Arithmatic Fn = RECIPS Fx Fn = RSQRTS Fx Fn = Fx COPYSIGN Fy Fn = MIN(Fx.Fy) Fn = MAX(Fx,Fy) Fn = CLIP Fx by Fy Create seed for reciprocal Create seed for reciprocal square root Copy sign of Fy to Fx Minimum of Fx, Fy Maximum of Fx, Fy Clip Fx within range [-Fy,Fy]

18 Logical Rn = LSHIFT Rx by Ry Rn = Rn OR LSHIFT Rx by Ry Rn=ASHIFT Rx by Ry Rn = Rn OR ASHIFT Rx by Ry Rn = ROT Rx by Ry Rn = BCLR Rx by Ry Rn = BSET Rx by Ry Rn = BTGL Rx by Ry Logical shift distance Ry Logical shift and logical OR Arithmetic shift Arithmetic shift and logical OR Rotate distance Ry Clear one bit in Rx Set one bit in Rx Toggle one bit in Rx

19 Logical BTST Rx by Ry Rn = FDEP Rx by Ry Rn = Rn OR FDEP Rx by Ry Rn = FDEP Rx by Ry Rn = Rn OR FDEP Rx by Ry Rn = FEXT Rx by Ry Rn = EXP Rx Test one bit in Rx Deposit field from Rx into Rn Deposit field from Rx using OR Deposit and sign extend field from Rx Deposit and sign extend using OR Extract field from Rx Extract and sign extend field from Rx Extract exponent field

20 Operations Rn = EXP Rx (EX) Rn = LEFTZ Rx Rn = LEFTO Rx Rn = FPACK Fx Fx = FUNPACK Rn Extract exponent field from ALU Extract number of leading Os Extract number of leading Is Convert 32-bit floating-point to 16- bit floating-point Convert 16-bit floating-point to 32- bit floating-point

21 Load Value R0 = DM (0x2000000); R0 = DM(_a); loads R0 the contents of the variable a DM(_a) = R0; stores R0 into memory location

22 Little Endian and Big Endian in a Memory Organization Little Endian: The LSB Byte (8-bit) of a word of 32 bit is written into the lowest address byte, and the other bytes are written in increasing order of significance. Word (32 bit): 0x90ABCDEF Word Address: 0x1000 Address0x10000x10010x10020x1003 Little EF CD AB 90 Big Endian: The byte order is revered. The MSB Byte (8- bit) of a word of 32 bit is written into the lowest address byte, and the other bytes are written in decreasing order of significance. Word (32 bit): 0x90ABCDEF Word Address: 0x1000 Address0x10000x10010x10020x1003 Big 90 AB CD EF

23 Princeton Architecture 80x86 processors and ARM7 have Princeton architecture for main memory. 8051-family microcontrollers have Harvard architecture.). Vectors and pointers, variables, program segments and memory blocks for data and stacks have different addresses in the program in Princeton memory architecture.

24 Harvard architecture When the address spaces for the data and for program are distinct Handling streams of data that are required to be accessed in cases of single instruction multiple data type instructions and DSP instructions. Separate data buses ensure simultaneous accesses for instructions and data. Program segments and memory blocks for data and stacks have separate set of addresses in Harvard architecture. Control signals and read-write instructions are also read-write instructions are also separate for accessing the program memory and data memory.

25


Download ppt "Digital Signal Processors (DSPs). DSP Advanced signal processor circuits MAC (Multiply and Accumulate) unit (s) - provides fast multiplication of two."

Similar presentations


Ads by Google