1 Architectural Analysis of a DSP Device, the Instruction Set and the Addressing Modes
SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications
Miodrag Bolic
2 Outline
FIR filter on ADSP-21xx
DSP Requirements
–Fast Multiply-Accumulates (Data-path)
–Extended Precision Accumulator Register (Data-path)
–Dual Operand Fetch (Memory)
–Circular Buffering (Addressing)
–Zero-Overhead Looping (Instruction set)
Analog Devices Architectures and Programming
–SHARC
–Blackfin
Performance Optimization
3 ADSP-21xx Copied from [Kester03]
4 CALCULATING OUTPUTS OF 4-TAP FIR FILTER USING A CIRCULAR BUFFER
y(3) = h(0) x(3) + h(1) x(2) + h(2) x(1) + h(3) x(0)
y(4) = h(0) x(4) + h(1) x(3) + h(2) x(2) + h(3) x(1)
y(5) = h(0) x(5) + h(1) x(4) + h(2) x(3) + h(3) x(2)

Memory Location   0      1      2      3
Read              x(0)   x(1)   x(2)   x(3)
Write             x(4)
Read              x(4)   x(1)   x(2)   x(3)
Write                    x(5)
Read              x(4)   x(5)   x(2)   x(3)
Copied from [Kester03]
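The memory table above can be reproduced with a short simulation. The following Python sketch is illustrative only: the function name `fir_circular`, the coefficient values and the input samples are assumptions, not taken from [Kester03]. It keeps the four most recent samples in a fixed four-location buffer, overwrites the oldest sample with each new one, and reads backwards from the newest location using modulo index arithmetic.

```python
def fir_circular(h, samples):
    """Filter `samples` with taps `h` using a circular buffer of len(h) slots.

    Returns (outputs, snapshots): outputs[k] is y(n) once the buffer is full,
    snapshots[k] is a copy of the buffer contents used to compute it.
    """
    n_taps = len(h)
    buf = [0.0] * n_taps              # memory locations 0 .. n_taps-1
    write = 0                         # location that receives the next sample
    outputs, snapshots = [], []
    for n, x in enumerate(samples):
        buf[write] = x                # overwrite the oldest sample
        newest = write
        write = (write + 1) % n_taps  # wrap the pointer around the buffer
        if n < n_taps - 1:
            continue                  # buffer not yet full: no output yet
        acc = 0.0
        for k in range(n_taps):
            # h(k) multiplies x(n-k), stored k locations behind the newest
            acc += h[k] * buf[(newest - k) % n_taps]
        outputs.append(acc)
        snapshots.append(list(buf))
    return outputs, snapshots


h = [0.5, 0.25, 0.125, 0.125]         # hypothetical coefficients h(0)..h(3)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # hypothetical samples x(0)..x(5)
y, mem = fir_circular(h, x)
for out, m in zip(y, mem):
    print(m, out)
```

With these inputs the three printed buffer states correspond to the three Read rows of the table: x(4) replaces x(0) in location 0, then x(5) replaces x(1) in location 1, and no sample is ever moved.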
5 FIR filter steps
1. Obtain a sample with the ADC; generate an interrupt
2. Detect and manage the interrupt
3. Move the sample into the input signal's circular buffer
4. Update the pointer for the input signal's circular buffer
5. Zero the accumulator
6. Control the loop through each of the coefficients
7. Fetch the coefficient from the coefficient's circular buffer
8. Update the pointer for the coefficient's circular buffer
9. Fetch the sample from the input signal's circular buffer
10. Update the pointer for the input signal's circular buffer
11. Multiply the coefficient by the sample
12. Add the product to the accumulator
13. Move the output sample (accumulator) to a holding buffer
14. Move the output sample from the holding buffer to the DAC
Copied from [Kester03]
6 FIR filter steps (cont.)
ADSP-21xx example code:
        CNTR = N-1;
        DO convolution UNTIL CE;
convolution:
        MR = MR + MX0 * MY0 (SS), MX0 = DM(I0,M1), MY0 = PM(I4,M5);
Single Cycle Instruction
Copied from [Kester03]
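To make the data flow of this loop explicit, here is a Python model of it. It is a sketch under assumptions: the slide shows only the loop body, so the operand pre-fetch before the loop and the final multiply-accumulate after it are assumed here (a loop count of N-1 suggests them), pointer modifiers of +1 are assumed for M1 and M5, and the function name `mac_loop` is hypothetical. Python integers stand in for the registers, so the (SS) format option, rounding and saturation are not modeled.

```python
def mac_loop(dm, pm, i0, i4, n):
    """Model n taps of MR = MR + MX0 * MY0 with parallel operand fetches.

    dm, pm : lists standing in for data memory (samples) and program
             memory (coefficients), each treated as a circular buffer
    i0, i4 : starting indices into dm and pm (the I0 and I4 pointers)
    n      : number of taps (N)
    """
    def fetch_dm():
        nonlocal i0
        value = dm[i0]
        i0 = (i0 + 1) % len(dm)   # post-modify with wrap-around (M1 = 1 assumed)
        return value

    def fetch_pm():
        nonlocal i4
        value = pm[i4]
        i4 = (i4 + 1) % len(pm)   # post-modify with wrap-around (M5 = 1 assumed)
        return value

    # Assumed prologue (not on the slide): clear MR and pre-fetch operands.
    mr = 0
    mx0, my0 = fetch_dm(), fetch_pm()

    # CNTR = N-1; DO convolution UNTIL CE;
    for _ in range(n - 1):
        # One instruction: multiply the operands fetched earlier and
        # fetch the next pair for the following iteration.
        mr = mr + mx0 * my0
        mx0, my0 = fetch_dm(), fetch_pm()

    # Assumed epilogue (not on the slide): the last pair is still unused.
    mr = mr + mx0 * my0
    return mr, i0, i4


samples = [4, 1, 2, 3]        # hypothetical circular buffer contents
coeffs = [1, 2, 3, 4]         # hypothetical coefficients
result, i0_after, i4_after = mac_loop(samples, coeffs, i0=1, i4=0, n=4)
print(result, i0_after, i4_after)
```

In this model exactly N fetches are made from each length-N circular buffer, so both pointers end where they started and no separate pointer reset is needed before the next output sample.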
8 Copied from [Takala05]
10 Motorola DSP5600X Copied from [Takala05]
11 Copied from [Takala05]
12 Copied from [Takala05]
13 ADSP-21xx MAC www.analog.com/dsp
14 Copied from [Takala05]
15 SHARC Architecture ADSP-2106X Copied from [Takala05]
17 Copied from [Takala05]
18 Copied from [Takala05]
19 Copied from [Takala05]
21 Copied from [Takala05]
22 Copied from [Takala05]
23 Hardware loops
Software loop:
        MOVE #16,B           ; Initialize loop counter B
LOOP:   MAC (R0)+,(R4)+,A    ; Register-indirect addressing with post-increment
        DEC B
        JNE LOOP
Hardware loops: no time is spent on
–Decrementing counters
–Checking to see if the loop is finished
–Branching back to the top of the loop
        RPT #16
        MAC (R0)+,(R4)+,A
[Lapsley97]
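The saving can be put in numbers with a simple instruction count. The sketch below is an idealized model, not a measurement: it assumes every instruction, including the taken branch, costs one cycle, and the function names are hypothetical. Real processors may charge extra cycles for branches, which would widen the gap.

```python
def software_loop_cycles(n, setup=1, body=1, overhead=2):
    """MOVE once, then MAC plus DEC and JNE on each of n iterations."""
    return setup + n * (body + overhead)


def hardware_loop_cycles(n, setup=1, body=1):
    """RPT once, then only the MAC on each of n iterations."""
    return setup + n * body


n = 16
sw = software_loop_cycles(n)
hw = hardware_loop_cycles(n)
print(sw, hw)
```

Under these assumptions the 16-iteration software loop takes 49 cycles and the hardware loop 17, so roughly two thirds of the software loop's time goes to loop bookkeeping rather than arithmetic.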
24 Copied from [Kester03]
25 ADI General Purpose DSP Product Families
Chart axis: Performance
ADSP-218x/9x, Power Efficient, $5 - $10
–Up to 160 MMACS
–Wired Voice, Wireless Voice, VOIP/VON, Industrial Control
Blackfin, Media Enabled, $5 - $30
–Up to 3000 MMACS
–Image compression, Digital Still/Video Camera, MMOIP, Telematics, Biometrics
SHARC, Low-Cost Floating Point, $10 - $100
–Up to 600 MMACS (32-bit)
–Audio, Infotainment, Industrial
TigerSHARC, High-Performance, $35 - $200
–Up to 4800 MMACS (16-bit) or 1200 MMACS (32-bit)
–2.5G/3G Infrastructure, Medical Imaging, Industrial Imaging, Multiprocessing
www.analog.com/dsp
27 SHARC Architecture Copied from [Smith97]
28 SHARC Architecture - Features
SHARC: The Super Harvard ARChitecture
100MHz Core / 300 MFLOPS Peak
Parallel Operation of: Multiplier, ALU, 2 Address Generators & Sequencer
–No Arithmetic Pipeline; All Computations Are Single-Cycle
High Precision and Extended Dynamic Range
–32/40-Bit IEEE Floating-Point Math
–32-Bit Fixed-Point MACs with 64-Bit Product & 80-Bit Accumulation
Single-Cycle Transfers with Dual-Ported Memory Structures
–Supported by Cache Memory and Enhanced Harvard Architecture
Glueless Multiprocessing Features
JTAG Test and Emulation Port
DMA Controller, Serial Ports, Link Ports, External Bus, SDRAM Controller, Timers
www.analog.com/dsp
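The fixed-point MAC figures above imply 16 guard bits on top of the 64-bit product. As a rough illustration of what this buys, the Python sketch below counts how many worst-case products fit in a signed accumulator before it overflows. The function name is hypothetical, and the worst case assumed is the product of two most-negative 32-bit operands; this is arithmetic on the quoted bit widths, not a model of the SHARC data path.

```python
def max_worst_case_products(operand_bits, accumulator_bits):
    """How many largest-magnitude signed products fit in the accumulator."""
    most_negative = -(1 << (operand_bits - 1))
    worst_product = most_negative * most_negative        # 2**(2*operand_bits - 2)
    accumulator_max = (1 << (accumulator_bits - 1)) - 1  # largest positive value
    return accumulator_max // worst_product


print(max_worst_case_products(32, 64))   # accumulator as wide as the product
print(max_worst_case_products(32, 80))   # 16 guard bits
```

Under this worst-case assumption a 64-bit accumulator holds a single such product, while the 80-bit accumulator holds 131071 of them, which is why long sums of products can run without intermediate scaling.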
29 ADSP-2106x Core Architecture www.analog.com/dsp
30 Example: Dot product, C code Copied from [Smith97]
31 Example: Dot product, Assembly Copied from [Smith97]
32 Example: Dot product, Assembly Copied from [Smith97]
33 C or Assembly
How complicated is the program?
Are you pushing the maximum speed of the DSP?
How many programmers will be working together?
Which is more important, product cost or development cost?
What is your background?
What does the DSP's manufacturer suggest you use?
Copied from [Smith97]
35 BLACKfin Processor Core
Two 16-bit Multipliers
Two 40-bit ALUs, Four 8-bit Video ALUs
Barrel Shifter
Sixteen 16-bit / Eight 32-bit Math Registers
Two DAGs, byte addressing
Eight 32-bit pointer registers
Four Sets of 32-bit Index, Modify, Length, Base
16-bit Instructions, 32-bit Instructions
Multi-Issue, 64-bit Instructions
Interlocked Pipeline
Micro Signal Architecture, developed with Intel
www.analog.com/dsp
36 ADSP-BF535 BLACKfin Processor Architecture
Great Performance Value
–Highest Frequency (350 MHz)
–1.0V to 1.6V
–260 PBGA
High System Integration
–Address range 768Mbytes
–SPORTs support 8 Channels of I2S Audio
–(532Mbps) I/O Bandwidth, DMA Bandwidth & Memory Bandwidth
–Microcontroller features include WDT, PCI, USB1.1, SDRAM controller
Block diagram labels:
–To 350 MHz BLACKfin Processor Core
–Memory: 308 Kbytes On-Chip SRAM (264Kbytes On-Chip SRAM, 48 Kbytes On-Chip Cache), DMA
–System Peripherals: SDRAM, FLASH/SRAM Interfaces, Real Time Clock, Watchdog, JTAG, PLL, Dynamic Power Management
–User Peripherals: SPI (2), UART (2), Timers (3, 32-bit), GPIO (16), SPORTs (2), PCI, USB 1.1
www.analog.com/dsp
37-50 Seminars about Blackfin