DSP for FPGA SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications Miodrag Bolic.



Objectives
Comparison between PDSP and FPGA
Virtex II Pro
Altera Stratix FPGA
Stratix DSP Block and its configuration
Altera design flow

What Is an FPGA?
Field Programmable Gate Array
Device that Has a Regular Architecture (Set of Blocks) that Can Be Programmed for Various Functions
“Glue” Logic
Customizable Hardware Solution
Configurable Processors

Why Use FPGAs in DSP Applications?
10x More DSP Throughput Than DSP Processors
–Parallel vs. Serial Architecture
Cost-Effective for Multi-Channel Applications
Flexible Hardware Implementation
Single-Chip Solution
–System (Hardware/Software) Integration Benefits
[Figure: block diagram with labels FPGA, Embedded Processor, DSP, Software]

DSP Processors vs. FPGAs
High Speed DSP Processor
1-8 Multipliers
Needs looping for more than 8 multiplications
Needs multiple clock cycles because of serial computation
–200 Tap FIR Filter would need 25+ clock cycles per sample with an 8 MAC unit processor
High Level of Parallel Processing in FPGA
Can implement hundreds of MAC functions in an FPGA
Parallel implementation allows for faster throughput
–200 Tap FIR Filter would need 1 clock cycle per sample
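The clock-cycle figures on this slide follow from dividing the number of filter taps by the number of MAC units that can work in the same clock cycle. The short Python sketch below reproduces that arithmetic. It assumes each MAC unit completes one multiply-accumulate per clock and ignores loop, fetch, and pipeline overhead; the function name `cycles_per_sample` is illustrative and does not come from the slides.

```python
import math


def cycles_per_sample(taps: int, mac_units: int) -> int:
    """Clock cycles needed to compute one FIR output sample.

    Assumption: every MAC unit finishes one multiply-accumulate per clock,
    and there is no loop, memory, or pipeline-fill overhead.
    """
    if taps <= 0 or mac_units <= 0:
        raise ValueError("taps and mac_units must be positive")
    return math.ceil(taps / mac_units)


# DSP processor with 8 MAC units: the 200 taps are processed 8 at a time.
dsp_cycles = cycles_per_sample(200, 8)

# FPGA with one multiplier per tap: all 200 products are formed in parallel.
fpga_cycles = cycles_per_sample(200, 200)
```

Under these assumptions the 8-MAC processor needs 25 cycles per sample, which is the lower bound behind the slide's "25+", while the fully parallel FPGA implementation needs a single cycle.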

Extending Range of Altera Reconfigurable DSP Solutions
[Figure: performance axis in MMACs/sec (tick at 100) with regions labeled Embedded Processors, Embedded Processors Hardware Acceleration (New!), and Complete Hardware Implementation]

Comparison of DSP Devices
Programmable DSP Processors
Benefits
–Easy to Use
–Programmed via C-Code or Assembly
–Fast Development Time
Weaknesses
–Fixed Architecture
–Inefficient for Highly Recursive Algorithms Unless Hardware Accelerated
–Potential Bus Bottlenecks
–Other Devices (FPGAs) Often Used on Board for Other Functions
Reconfigurable DSP
Benefits
–Easy to Use
–Programmed via C-Code, Assembly, or HDL
–Efficient for Recursive Algorithms Using DSP IP Cores
–Higher Levels of Integration
Weaknesses
–Longer Development Time (But Getting Shorter!)


Stratix EP1S10 [2]

TriMatrix™ Memory [1]
Block types
–M512 Blocks: 512 bits per block + parity
–M4K Blocks: 4 Kbits per block + parity
–M-RAM: 512 Kbits per block + parity
Dedicated External Memory Interface
Look-Up Schemes
Packet & Cell Buffering
Cache
More Bits For Larger Memory Buffering
More Data Ports for Greater Memory Bandwidth
Applications listed
–Small FIFOs
–Shift Register
–Rake Receiver Correlator
–FIR Filter Delay Line
–Header / Cell Storage
–Channelized Functions
–ATM cell–packet processing
–Nios Program Memory
–Packet / Data Storage
–Nios Program Memory
–System Cache
–Video Frame Buffers
–Echo Canceller Data Storage

Memory Bandwidth Summary
Stratix Device Family [1]
Columns: Device; Total RAM Bits; M-RAM Blocks; M4K Blocks; M512 Blocks; Maximum Bandwidth (Mbps)
EP1S10 920, ,245,024
EP1S20 1,669, ,096,928
EP1S25 1,944, ,894,400
EP1S30 3,317, ,750,192
EP1S40 3,423, ,384,800
EP1S60 5,215, ,762,528
EP1S80 7,427, ,784,720

Logic Element (LE) [2]
[Figure: LE functional diagram. Labels: data1, data2, data3, data4, cin, addnsub (2); 4-Input LUT; Sync Load & Clear Logic; Register (D, DATA); Register Control Signals; LUT Chain Input, LUT Chain Output; Register Chain Input, Register Chain Output; Register Feedback; Row, Column & DirectLink Routing; Local Routing]
Notes:
1) Functional Diagram Only. Please See Datasheet for more Details.
2) addnsub & data1 connected via XOR logic

Dynamic Arithmetic Mode
[Figure: LE functional diagram in dynamic arithmetic mode. Labels: data1, data2, data3, addnsub; LAB Carry-In, Carry-In0, Carry-In1; Carry-In Logic; Sum Calculator; Carry Calculator; Carry-Out Logic; Carry-Out0, Carry-Out1; Sync Load & Clear Logic; Register (D, DATA); Register Control Signals; Register Chain Input, Register Chain Output; Row, Column & DirectLink Routing; Local Routing]
Note: Functional Diagram Only. Please See Datasheet for more Details.

Logic Array Blocks (LAB) [2]
10 LEs (LE1 to LE10)
Local Interconnect
LAB-Wide Control Signals
LE Control Signals
30 LAB Input Lines
10 LE Feedback Lines

Avalon Switch Fabric Contents
Avalon Switch Fabric provides the following to the peripherals it connects
–Data-Path Multiplexing
–Address Decoding
–Wait-State Generation
–Dynamic Bus Sizing
–Interrupt-Priority Assignment
–Latent Transfer Capabilities
–Streaming Read and Write Capabilities
Avalon Switch Fabric tailors transactions to the characteristics of the peripherals that are attached

SOPC Design Example
[Figure: system block diagram. Blocks: CPU (32 Bit) with Inst Master and Data Master; DMA Controller With Streaming, with Control Port (Slave), Read Port (Master – Streaming), and Write Port (Master – Streaming); Avalon Switch Fabric; UART; Instruction Memory (32-bit Data path); Data Memory (32-bit Data path); VGA Controller; Avalon Tri-State Bridge; External FLASH (1 MB, 16-bit Datapath); External SRAM (256 KB, 32-bit Datapath)]
Allows Masters and Slaves to communicate without knowledge of each other's interface details

Data Path Multiplexing & Slave Arbitration
[Figure: the SOPC design example system with an Arbiter and a MUX shown inside the Avalon Switch Fabric. Blocks: CPU (32 Bit) with Inst Master and Data Master; DMA Controller With Streaming, with Control Port (Slave), Read Port (Master – Streaming), and Write Port (Master – Streaming); UART; Instruction Memory (32-bit Data path); Data Memory (32-bit Data path); VGA Controller; Avalon Tri-State Bridge; External FLASH (1 MB, 16-bit Datapath); External SRAM (256 KB, 32-bit Datapath)]
1) Data-Path Multiplexing
2) Slave Arbitration
3) Address Decoding
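Address decoding, the third function listed on this slide, means turning a master's address into a slave select plus an offset inside that slave. The Python sketch below illustrates the idea only. The 1 MB FLASH and 256 KB SRAM sizes come from the slide; the base addresses, the UART span, and the names `Slave`, `MEMORY_MAP`, and `decode` are assumptions made for the example and do not describe the actual Avalon implementation.

```python
from typing import NamedTuple, Optional, Tuple


class Slave(NamedTuple):
    name: str
    base: int  # hypothetical base address (not given in the slides)
    span: int  # size of the address window in bytes


# Sizes of FLASH and SRAM are taken from the slide; everything else is assumed.
MEMORY_MAP = [
    Slave("ext_flash", 0x0000_0000, 1 * 1024 * 1024),  # 1 MB
    Slave("ext_sram", 0x0010_0000, 256 * 1024),        # 256 KB
    Slave("uart", 0x0020_0000, 32),                    # assumed register window
]


def decode(address: int) -> Optional[Tuple[str, int]]:
    """Return (slave name, offset within slave), or None if nothing is mapped."""
    for slave in MEMORY_MAP:
        if slave.base <= address < slave.base + slave.span:
            return slave.name, address - slave.base
    return None
```

For example, `decode(0x0010_0004)` selects the SRAM at offset 4, while an address in the unmapped gap between the SRAM and the UART returns `None`. In hardware the same comparison is done by combinational logic on the upper address bits, so each master sees every slave through one address space.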


DSP Blocks
Eight 9 × 9 bit multipliers
Four 18 × 18 bit multipliers
One 36 × 36 bit multiplier

DSP Blocks (cont.)
The DSP block consists of
–A multiplier block
–An adder/subtractor/accumulator block
–A summation block
–An output interface
–Output registers
–Routing and control signals

Stratix DSP Blocks
High Performance Dedicated Multiplier Circuitry
–18x18 Functions at 280 MHz
Variable Operand Widths with Full Precision Outputs
–9x9 (8 Max.)
–18x18 (4 Max.)
–36x36 (1 Max.)
Add, Accumulate or Subtract
–Signed & Unsigned Operations
–Dynamically Change between Add & Subtract
–Supports DSP Requirements Including Complex Numbers
[Figure: DSP block diagram with Input Register Unit, Optional Pipelining, adder and add/subtract stages, Output Multiplexer, and Output Register Unit]
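One way to see why a block that offers four 18 × 18 multipliers offers only one 36 × 36 multiplier is that a 36-bit product can be assembled from four 18-bit partial products. The Python sketch below shows that arithmetic for unsigned operands only. It illustrates the decomposition and is not a description of the Stratix circuit; the function name `mul36_from_18x18` is hypothetical.

```python
MASK18 = (1 << 18) - 1


def mul36_from_18x18(a: int, b: int) -> int:
    """Unsigned 36 x 36 multiply built from four 18 x 18 partial products."""
    if not (0 <= a < (1 << 36) and 0 <= b < (1 << 36)):
        raise ValueError("operands must be unsigned 36-bit values")

    # Split each operand into an upper and a lower 18-bit half.
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18

    # Four 18 x 18 products, each at most 36 bits wide (full precision).
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # Weight the partial products by their bit position and sum them.
    return (p_hh << 36) + ((p_lh + p_hl) << 18) + p_ll
```

The result is up to 72 bits wide, which is what "full precision output" means for a 36 × 36 product: no bits of the product are truncated or rounded.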

DSP Block for 18 x 18-bit Mode

Shift Register Chain

Adder/Output Block

Time-Domain Multiplexed FIR Filters

Operation of TDM Filter
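The figures for these two slides did not survive in the transcript, so the following is only a general illustration of time-domain multiplexing, not a reconstruction of the slide. The idea is that one multiplier is reused for several taps by clocking it faster than the sample rate. The Python sketch compares a fully parallel FIR filter with a version that uses a single shared multiply-accumulate unit; the function names and the one-MAC-per-fast-clock-tick model are assumptions.

```python
def fir_parallel(coeffs, samples):
    """Reference FIR: one multiplier per tap, one output per sample clock."""
    delay = [0] * len(coeffs)
    out = []
    for x in samples:
        delay = [x] + delay[:-1]  # shift the new sample into the delay line
        out.append(sum(c * d for c, d in zip(coeffs, delay)))
    return out


def fir_tdm(coeffs, samples):
    """Same filter with ONE shared multiplier-accumulator.

    Each multiply-accumulate is modeled as one tick of a fast clock, so the
    fast clock must run len(coeffs) times faster than the sample clock.
    Returns (outputs, fast_clock_ticks).
    """
    delay = [0] * len(coeffs)
    out = []
    ticks = 0
    for x in samples:
        delay = [x] + delay[:-1]
        acc = 0
        for k in range(len(coeffs)):  # taps are processed one after another
            acc += coeffs[k] * delay[k]
            ticks += 1
        out.append(acc)
    return out, ticks
```

Both functions produce the same output sequence; the TDM version trades multipliers for clock rate, using one multiplier and `len(coeffs)` fast-clock ticks per input sample.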

Resource Savings with DSP Blocks
DSP Block
–Reduces LE Usage
–Reduces Routing Congestion
–Reduces Power
–Maintains Performance
90% of your problems are hidden under the surface!
18 X X SAVES 652 ROUTING NETS!

Design Flow

Design Flow Overview
1) Create Design in Simulink Using Altera Libraries
2) Simulate in Simulink
3) Add SignalCompiler to Model
4) Create HDL Code & Generate Testbench
5) Perform RTL Simulation
6) Synthesize HDL Code & Place & Route
7) Program Device
8) SignalTap II Logic Analyzer

Step 1 - Create Design in Simulink Using Altera Libraries
Drag & Drop Library Blocks into Simulink Design & Parameterize Each Block

Parameterization of IP Megacores

Step 2 - Simulate in Simulink

Step 3 - Add “Signal Compiler” to Model to Generate HDL code
Device families
–APEX20K/E/C
–APEX II
–Stratix & Stratix GX
–Cyclone & ACEX 1K
–Mercury
–FLEX10K & FLEX 6000
–DSP Boards
Synthesis tools
–Leonardo Spectrum
–Synplify
–Quartus II
Speed vs. Area
Testbench Generation
Message Window

Step 4 - Create HDL Code & Generate Testbench
AltrFir32.vhd
AltrFir32.mdl
Enable "Generate Stimuli for VHDL Testbench" Button

HDL Code Generation

DSP Builder Report File
Lists All Converted Blocks
–Port Widths
–Sampling Frequencies
–Warnings & Messages

Step 5 – Perform RTL Simulation (ModelSim)
1) Set working directory (File => Change Directory)
2) Run TCL file (Tools => Execute Macro)

Perform Verification ModelSim vs Simulink

Step 6 - Synthesize HDL & Place & Route
–Synthesis: Leonardo Spectrum, Synplify, Quartus II
–Quartus II Fitter

Step 7 – Program Device
Download Design to DSP Development Kits

Stratix DSP Development Board
[Figure: board photo with the following callouts]
40-Pin Connectors for Analog Devices
Texas Instruments Connectors on Underside of Board
Mictor-Type Connectors for HP Logic Analyzers
MAX 7000 Device
Analog SMA Connectors
D/A Converters
A/D Converters
Prototyping Area
Nios Expansion Prototype Connector

Stratix DSP Board – Key Features
Stratix EP1S25F780C5 Device (Starter Version)
Stratix EP1S80B956C7 Device (Professional Version)
Analog I/O
–Two 12-bit, 125 MHz A/D Converters
–Two 14-bit, 165 MHz D/A Converters
Digital I/O
–Two 40-pin Connectors for Analog Devices A/D Converter Evaluation Boards
–Connector for TI TMS320 Cross-Platform Daughter Card
–3.3V Expansion/Prototype Headers
–RS-232 Serial Port
Memory
–2 Mbytes of 7.5-ns Synchronous SRAM
–32 Mbytes of FLASH

Step 8 - SignalTap II Logic Analyzer
Embedded Logic Analyzer
–Downloads into Device with Design
–Captures State of Internal Nodes
–Uses JTAG for Communication

SignalTap II Logic Analyzer
Analysis of Imported Data
[Figure panels: Imported Data, Imported Plot]