An Introduction to FPGA Design

Presentation transcript:

An Introduction to FPGA Design
Avnet SpeedWay Workshops
FPGA Architecture

Xilinx FPGA Architecture
- Logic fabric: gates and flip-flops
- Embedded blocks: memory, DSP/multipliers, clock management, high-speed serial I/O, soft/hard processors
- Programmable I/Os
- In-system programmable
Notes: This presentation is aimed mainly at the Spartan-3E (the device on the demo board used in the lab), but it also discusses other families and architectures. Some slides are more generic than others, depending on the topic. Make sure the audience recognizes this so that nobody concludes, for example, that PPC405 processors are part of the Spartan-3E architecture.
Ver 1.1a

Logic Fabric
- Logic cell: lookup table (LUT), flip-flop, carry logic, muxes (not shown)
- Slice: two logic cells
- Spartan-3E FPGAs: 2K to 33K logic cells
[Figure: logic cell with a LUT (inputs I0 to I3, output O) feeding a flip-flop (D, Q, CE, SET, RST) through a mux]
Notes: Explain the basic LUT/slice architecture. Since this slide is generic, you may decide to mention the CLB, but don't muddy the water. The main idea is to explain the composition of the logic fabric and the typical FPGA sizes in terms of logic cells. The benefits and operation of the F5MUX and FiMUX are covered in XAPP466: http://www.xilinx.com/bvdocs/appnotes/xapp466.pdf. In short, the additional muxes can implement any function of 5 inputs (F5), 6 inputs (F6), 7 inputs (F7), or 8 inputs (F8) without leaving the CLB (the F8 case combines the outputs of two CLBs). This does not add a level of logic delay because the routes are inside the CLB. As muxes, the F5, F6, F7, and F8 resources can be used to build 4:1, 8:1, 16:1, and 32:1 muxes respectively.
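The LUT-plus-mux idea in the notes above can be sketched in a few lines of Python. This is an illustrative behavioral model, not vendor code: the names `lut4`, `lut4_init`, `f5mux`, `parity5`, and `mux4`, and the select polarity of the mux, are assumptions made for this example. It treats a 4-input LUT as a 16-bit truth table and combines two such LUTs with one extra mux to build a 5-input function and a 4:1 mux.

```python
def lut4(init, i3, i2, i1, i0):
    """Look up one bit of a 16-bit truth table, addressed by the four inputs."""
    addr = (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (init >> addr) & 1


def lut4_init(func):
    """Build the 16-bit truth table for any function of four 1-bit inputs."""
    init = 0
    for addr in range(16):
        i3, i2, i1, i0 = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        if func(i3, i2, i1, i0):
            init |= 1 << addr
    return init


def f5mux(f_out, g_out, sel):
    """Model of the extra mux: pick one of two LUT outputs (polarity assumed)."""
    return g_out if sel else f_out


# Any 5-input function: one LUT holds the i4 = 0 half, the other the i4 = 1 half.
PARITY_F = lut4_init(lambda a, b, c, d: a ^ b ^ c ^ d)      # used when i4 = 0
PARITY_G = lut4_init(lambda a, b, c, d: 1 ^ a ^ b ^ c ^ d)  # used when i4 = 1


def parity5(i4, i3, i2, i1, i0):
    """5-input odd-parity function built from two LUT4s and one mux."""
    return f5mux(lut4(PARITY_F, i3, i2, i1, i0),
                 lut4(PARITY_G, i3, i2, i1, i0), i4)


# 4:1 mux: each LUT is a 2:1 mux (i2 selects between i1 and i0; i3 is unused).
MUX2_INIT = lut4_init(lambda i3, i2, i1, i0: i1 if i2 else i0)


def mux4(d, s1, s0):
    """4:1 mux over the sequence d[0..3], selected by the two bits s1 s0."""
    low = lut4(MUX2_INIT, 0, s0, d[1], d[0])
    high = lut4(MUX2_INIT, 0, s0, d[3], d[2])
    return f5mux(low, high, s1)
```

For example, `parity5(1, 0, 1, 1, 0)` returns 1 because three inputs are high, and `mux4((0, 1, 0, 1), 1, 1)` returns `d[3]`, which is 1.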

Memory
Block RAM
- RAM or ROM
- True dual port
- Separate read and write ports
- Independent port size
- Data width translation
- Excellent for FIFOs
[Figure: dual-port block RAM symbol. Port A: DIA, DIPA, ADDRA, CLKA, DOA, DOPA. Port B: DIB, DIPB, ADDRB, CLKB, DOB, DOPB.]

Multipliers
18 x 18 multipliers
- Signed or unsigned
- Optional pipeline stage
- Cascadable
[Figure: multiplier with two 18-bit inputs and a 36-bit output]
Notes: Pipelining in Spartan-3E means using the registers at the input and output of the multiplier. Unlike in Spartan-3, these registers are part of the multiplier block and are not taken from the fabric.

Clock Management
Digital Clock Managers (DCMs)
- Clock de-skew
- Phase shifting
- Clock multiplication
- Clock division
- Frequency synthesis
[Figure: DCM block with input CLKIN and outputs CLK0, CLK90, CLKFX]
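Frequency synthesis on the CLKFX output amounts to multiplying the input clock by a ratio of two integers, f_CLKFX = f_CLKIN x M / D. The sketch below searches for the best M and D for a target frequency. The function name `clkfx_settings` is hypothetical, and the default ranges (M from 2 to 32, D from 1 to 32) are an assumption to check against the data sheet of the device in use; the DCM's input and output frequency limits are not modeled.

```python
from fractions import Fraction


def clkfx_settings(f_in_hz, f_target_hz, m_range=range(2, 33), d_range=range(1, 33)):
    """Return (M, D, f_out_hz) with f_out = f_in * M / D closest to the target.

    Ties are broken in favor of the smallest M, then the smallest D.
    The default M and D ranges are assumptions, not data-sheet values.
    """
    best = None
    for m in m_range:
        for d in d_range:
            f_out = Fraction(f_in_hz) * m / d  # exact arithmetic, no rounding
            key = (abs(f_out - f_target_hz), m, d)
            if best is None or key < best[0]:
                best = (key, m, d, f_out)
    _, m, d, f_out = best
    return m, d, f_out
```

With a 50 MHz input (a hypothetical board clock) and a 75 MHz target, `clkfx_settings(50_000_000, 75_000_000)` selects M = 3 and D = 2.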

Programmable I/Os
- Single-ended
- Differential / LVDS
- Programmable I/O standards
- Multiple I/O banks
- DDR I/O registers
- On-chip termination
[Figure: I/O block with input, output, and 3-state paths, each with registers and a DDR mux, connected to the PAD; device outline showing the I/O banks]
Notes: The list of standards on the left is taken from the Spartan-3 data sheet and is only an example. Point this out during the presentation so that the audience does not take it for the complete list of electrical standards supported by Xilinx FPGAs. DDR I/O registers allow data to be transmitted and/or received on both edges of the clock. This type of I/O is used in DDR memory interfaces and other high-speed I/O schemes. Example: with DDR data transfers, a 311 MHz clock can support a 622 Mb/s interface.

Xilinx Spartan-3E Family

Device    Gates   Logic Cells   Distributed RAM bits   Block RAM bits   18x18 Multipliers   DCMs   Maximum I/O
3S100E    100K    2,160         15K                    72K              4                   2      108
3S250E    250K    5,508         38K                    216K             12                  4      172
3S500E    500K    10,476        73K                    360K             20                  4      232
3S1200E   1.2M    19,512        136K                   504K             28                  8      304
3S1600E   1.6M    33,192        231K                   648K             36                  8      376

Notes: This Spartan-3E family chart is provided because the presentation and labs target the Spartan-3E. There is no Virtex-4 slide because the presentation needs to fit in 50 minutes and is meant to be a fundamentals-of-design class rather than an FPGA family presentation.

An Introduction to FPGA Design
Why Do DSP in an FPGA?

High-Speed DSP Challenges
High-performance digital communication and video imaging designs challenge existing DSP solutions:
- Need higher performance
- Need lower costs
- Need lower power
Compromises are often made:
- Performance is sacrificed
- Time is spent designing substitute implementations

FPGAs Enable Massively Parallel DSP
Example: 256-tap filter implementation
- Programmable DSP, sequential: a single MAC unit processes the coefficients one at a time, so 256 clock cycles are needed per output sample. At 1 GHz, 1 GHz / 256 clock cycles is approximately 4 MSPS.
- FPGA, fully parallel implementation: 256 multipliers (C0 to C255) operate on the registered data at once, so 256 operations complete in 1 clock cycle. At 500 MHz, 500 MHz / 1 clock cycle = 500 MSPS.
[Figure: sequential MAC unit with a coefficient memory beside a fully parallel register chain with one multiplier per coefficient and a summing stage]
"... the unprecedented signal processing requirements of next-generation wireless devices threaten to outpace the capabilities of DSP processors, creating opportunities for massively parallel and highly customized devices." BDTI, 2004
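The throughput figures on this slide follow from dividing the clock rate by the number of clock cycles spent per output sample. The sketch below is a behavioral illustration, not a hardware description; the function names are hypothetical. It computes one filter output both ways and reports how many cycles each approach is modeled as taking.

```python
def sample_rate_hz(clock_hz, cycles_per_sample):
    """Output sample rate when each sample costs a fixed number of clock cycles."""
    return clock_hz / cycles_per_sample


def fir_sequential(coeffs, samples):
    """One MAC unit: one multiply-accumulate per cycle, so len(coeffs) cycles."""
    acc = 0
    cycles = 0
    for c, x in zip(coeffs, samples):
        acc += c * x
        cycles += 1
    return acc, cycles


def fir_parallel(coeffs, samples):
    """One multiplier per tap: all products formed at once, modeled as 1 cycle."""
    products = [c * x for c, x in zip(coeffs, samples)]
    return sum(products), 1
```

Both functions return the same filter output; only the cycle count differs. `sample_rate_hz(1e9, 256)` gives 3,906,250 samples per second (the slide's "4 MSPS"), and `sample_rate_hz(500e6, 1)` gives 500 MSPS.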

Usual Parallel Adder Tree Implementation
- Consumes logic to implement adders
- Variable latency
- Fabric and routing may reduce performance
- A 32-tap filter implementation will consume 1,461 logic cells to implement the adders in fabric
[Figure: registered data chain feeding multipliers C0 to C31, whose products are summed by a tree of adders to produce Data Out]

Virtex-4 Parallel Implementation
Parallel adder cascade implementation
- Filters implemented entirely within the XtremeDSP slice
- Guaranteed 500 MHz performance regardless of filter size
- 32-tap filter implementation using 32 XtremeDSP slices
[Figure: registered data chain feeding multipliers C0 to C31, with each product added to the registered partial sum from the previous stage to produce Data Out]
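The difference between the adder tree and the adder cascade can be sketched by counting adders and stages. This is a structural illustration under simplifying assumptions, not a timing model; the function names are hypothetical, and the cascade is modeled with one adder per tap.

```python
def adder_tree(values):
    """Sum a non-empty sequence pairwise, level by level.

    Returns (total, levels, adders). The number of levels grows with
    log2 of the tap count, which is why tree latency varies with filter size.
    """
    level = list(values)
    levels = 0
    adders = 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        adders += len(nxt)
        if len(level) % 2:  # an odd element passes through to the next level
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return level[0], levels, adders


def adder_cascade(values):
    """Sum by chaining: each stage adds one product to the running partial sum.

    Returns (total, stages), with one stage (one adder) per input.
    """
    total = 0
    stages = 0
    for v in values:
        total += v
        stages += 1
    return total, stages
```

For 32 products the tree needs 31 adders in 5 levels, while the cascade uses 32 stages. When every cascade stage is registered, the longest combinational path is a single multiply-add however many taps there are, which is presumably why the slide can quote a clock rate that does not depend on filter size.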

Xilinx 4th Generation XtremeDSP
Virtex-4 XtremeDSP: highest DSP bandwidth available
[Figure: bar chart of DSP bandwidth in GMACs/s for Virtex-E, Virtex-II, Virtex-II Pro, and Virtex-4. 1st generation: 11 GMACs/s. 2nd generation: 32 GMACs/s. 3rd generation: 111 GMACs/s. 4th generation: 256 GMACs/s.]

An Introduction to FPGA Design
Development Flow

Xilinx Design Process
[Figure: design flow diagram. Labels: Design Entry, Synthesis, Implementation (Translate, Map, Place & Route), Silicon, Constraints, Floor-Planning, Behavioral Simulation, Timing Simulation, Timing Analysis.]

Xilinx ISE Software

ISE Foundation ($2,495 USD)
- Windows & Linux support
- ISE Simulator Lite
- All Virtex series FPGAs
- All Spartan-II/3 series FPGAs
- All CPLDs

ISE WebPACK (FREE, Web download or CD)
- Windows & Linux support
- ISE Simulator Lite
- Limited Virtex series FPGAs
- All Spartan-II/3 series FPGAs
- All CPLDs

Optional Software Accessories (evaluation versions available)
Product              Description                         Price
ChipScope Pro        FPGA real-time debug                $695
Full ISE Simulator   HDL simulator for ISE Foundation    $995
MXE-III              ModelTech HDL simulator             $945
PlanAhead            Hierarchical floorplanner           $4,995

Project Navigator
[Figure: Project Navigator window with four labeled areas: Sources in project, Processes for source, Viewing area, Message console]

ISE Tools and Processes
- Design entry
- Synthesis
- Implementation
- Configuration
Notes: Simulation processes only appear when a simulation testbench is the selected source.

HDL Basics
- Coding style affects how logic is inferred
  - Asynchronous vs. synchronous reset
  - Flip-flop initial value
  - See Language Templates for coding style examples
- Do not gate clocks!
  - Introduces skew
  - Negative effects on performance
  - Use the CE function on the flip-flop
  - Use BUFGMUX
[Figure: flip-flop symbol with D, CE, Q, R, S]
Notes: Optimizing HDL for design performance is covered in a separate training class.
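The advice to use the flip-flop's clock enable instead of gating the clock can be illustrated with a small behavioral model. This is a Python sketch for intuition only, not synthesizable HDL; the class name, the `run` helper, and the choice that reset takes priority over CE are assumptions made for the example. The point is that the register sees every clock edge and simply holds its value when CE is low.

```python
class ClockEnableFlop:
    """Behavioral model of a D flip-flop with synchronous reset and clock enable."""

    def __init__(self, init=0):
        self.q = init  # power-up (initial) value of the register

    def rising_edge(self, d, ce=1, rst=0):
        """Apply one rising clock edge and return the new output."""
        if rst:          # synchronous reset, assumed to win over CE
            self.q = 0
        elif ce:         # capture D only when the clock enable is high
            self.q = d
        return self.q    # otherwise hold the previous value


def run(flop, stimulus):
    """Clock the flop once per (d, ce) pair and collect the outputs."""
    return [flop.rising_edge(d, ce) for d, ce in stimulus]
```

For instance, starting from an initial value of 1, the stimulus `[(0, 0), (0, 1), (1, 0), (1, 1)]` produces `[1, 0, 0, 1]`: the output changes only on edges where CE is high.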

Don't Re-Invent the Wheel!!
[Figure: 64-tap FIR filter]

Core Generator & Architecture Wizard
- Generate customized IP
- Extensive library of macros (parameterized blocks)
- Core Generator output files
  - HDL black box declaration
  - HDL instantiation template
  - Black box netlist
- Access Core Generator via Project > New Source > IP . . .
- Example: DSP core, Finite Impulse Response (FIR) filter
Notes: The Fundamentals of FPGA Design course includes more information (including a lab) on the Architecture Wizard.

An Introduction to FPGA Design
Timing Constraints

Timing Constraints
- Timing constraints give the tools a performance goal
- Place & Route (PAR) uses timing constraints
  - PAR runs the timing analysis tool in the background
  - Real-time analysis of current results against performance goals
- Without constraints, PAR tries to reduce run-time
  - Finishes quickly
  - Modest effort to optimize performance
- With constraints, PAR tries to meet performance goals
  - Run-time may be longer
  - Aggressive timing constraints and higher effort levels can significantly increase run-time

Basic Timing Constraints
- PERIOD: target clock period for internal sequential paths
- OFFSET IN BEFORE: target "input setup" time (Tsu), referenced between the external INPUT pin and the CLK pin
- OFFSET OUT AFTER: target "clock to out" time (Tco), referenced between the external CLK pin and the OUTPUT pin

PERIOD Constraint
    NET "MYCLK" TNM_NET = "MYCLK";
    TIMESPEC "TS_MYCLK" = PERIOD "MYCLK" 10 ns HIGH 50 %;
- Covers data paths between synchronous elements only
- Does not cover paths that cross between unrelated clock domains
[Figure: waveform marking the clock period]
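The numbers in a PERIOD constraint translate directly into a target frequency and a duty cycle. The sketch below works through that arithmetic; the function names are hypothetical, and the slack calculation is a deliberately simplified assumption (it treats the path delay as already including setup time and ignores clock skew).

```python
def period_summary(period_ns, high_percent=50.0):
    """Derive the clock frequency and high/low times from a PERIOD constraint."""
    high_ns = period_ns * high_percent / 100.0
    return {
        "frequency_mhz": 1000.0 / period_ns,  # 1000 / ns gives MHz
        "high_ns": high_ns,
        "low_ns": period_ns - high_ns,
    }


def slack_ns(period_ns, path_delay_ns):
    """Simplified slack: positive means the path meets the period target."""
    return period_ns - path_delay_ns
```

For the constraint on this slide, `period_summary(10)` reports a 100 MHz target with 5 ns high and 5 ns low. A register-to-register path of 8.5 ns would have 1.5 ns of slack, while an 11 ns path would miss the target by 1 ns.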

OFFSET IN BEFORE Constraint
    OFFSET = IN 5 ns BEFORE "MYCLK";
"The signal will be valid at the pad X nanoseconds before the clock appears at the clock pad..."
- Covers the first path to a synchronous element
- Recommendation for high performance: use the IOB registers

OFFSET OUT AFTER Constraint
    OFFSET = OUT 7 ns AFTER "MYCLK";
"The signal will be valid at the pad X nanoseconds after the clock appears at the clock pad..."
- Covers the last path from a synchronous element
- Recommendation for high performance: use the IOB registers

UCF – User Constraints File
    # Global clock constraint
    # Constrain net on external pin
    NET "CLK" TNM_NET = "CLK";
    TIMESPEC "TS_CLK" = PERIOD "CLK" 20 ns HIGH 50%;
    # Input timing
    OFFSET = IN 8 ns BEFORE "CLK";
    # Output timing
    OFFSET = OUT 5 ns AFTER "CLK";
    # Pad-to-pad combinatorial timing
    TIMESPEC "TS_P2P" = FROM "PADS" TO "PADS" 15 ns;
    # Input timing exception from the global input constraint
    NET "STRTSTOP" OFFSET = IN 3 ns BEFORE "CLK";

Timing Constraints Editor
[Figure: Timing Constraints Editor, with callouts for PERIOD, OFFSET IN, and OFFSET OUT]
Notes: All of the clocks in the design appear in the "Clock Net Name" list. If unexpected signals appear there, the synthesis tool has found clocks that were unintentional. The designer then needs to go back to the HDL, analyze the coding style, and make any necessary changes.

Timing Analysis
- Synthesis: Estimated Timing Report
- Map: "Post Map Static Timing" (not used for most designs)
- PAR: Timing Analyzer, "Post Route Timing"
[Figure: flow from Synthesis through Translate, Map, and PAR, with User Timing Constraints as an input]

An Introduction to FPGA Design
Thank You!