Published by Dominic Kearsley. Modified over 9 years ago.
1
® Xilinx FPGA Architecture Overview
2
® www.xilinx.com
Virtex/Spartan-II Top-level Architecture
Gate-array-like architecture
Configurable logic blocks
—Implement logic here!
I/O blocks
—16 signal standards
Block RAM
—On-chip memory for higher performance
Clocks & Delay-Locked Loop
Interconnect resources
—Three-state internal buses
3
Logic Cell Capacity
A better first-order alternative to gate counting
Better comparisons among different FPGAs
Logic cell definition:
—4-input look-up table + dedicated flip-flop
Logic cells per CLB:
—XC4000/Spartan: 2.375 (2 4-LUTs, 1 3-LUT, 2 FFs)
—Virtex/Spartan-II: 4.5 (4 4-LUTs, 1 F5MUX, 4 FFs)
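The per-CLB figures above turn a CLB count into a logic-cell count by simple multiplication. The sketch below only illustrates that arithmetic; the CLB count of 400 is a hypothetical value, not a figure for any real device.

```python
# Logic cells per CLB, as quoted on the slide.
LOGIC_CELLS_PER_CLB = {
    "XC4000/Spartan": 2.375,
    "Virtex/Spartan-II": 4.5,
}

def logic_cells(family: str, clb_count: int) -> float:
    """Return the logic-cell count for a device with clb_count CLBs."""
    return LOGIC_CELLS_PER_CLB[family] * clb_count

# Hypothetical CLB count, chosen only to show the arithmetic.
print(logic_cells("XC4000/Spartan", 400))     # 950.0
print(logic_cells("Virtex/Spartan-II", 400))  # 1800.0
```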
4
Configurable Logic Block (CLB)
[Figure labels: Inputs, Combinational Logic Function (LUT), Flip-Flop, Outputs]
Combinational logic generated in a lookup table (LUT)
—Any function of available inputs
LUT output feeds CLB output or D input of flip-flop
5
Virtex/Spartan-II Function Generators
[Figure labels: CLB, MUXF6, Slice, LUT, MUXF5]
Four 4-input function generators
—Independent inputs (4 functions of 4 inputs)
MUXF5 combines 2 LUTs to form
—4x1 multiplexer
—Or any 5-input function
MUXF6 combines 2 slices to form
—8x1 multiplexer
—Or any 6-input function
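Why two 4-input LUTs plus a 2:1 multiplexer can realize any 5-input function follows from splitting the function on its fifth input: one LUT holds the function with that input at 0, the other with it at 1, and the multiplexer picks between them. The sketch below is a behavioral illustration of that idea only; the helper names (`make_lut`, `lut_read`, `five_input_function`) are made up for this example and do not correspond to Xilinx primitives.

```python
from itertools import product

def make_lut(func, n_inputs=4):
    """Tabulate func into a list of 2**n_inputs bits (the LUT contents)."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def lut_read(table, *bits):
    """Look up the stored bit; the first input is the most significant address bit."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return table[addr]

def five_input_function(func5):
    """Split a 5-input function into two 4-LUTs selected by a 2:1 mux (MUXF5-style)."""
    lut_when_0 = make_lut(lambda a, b, c, d: func5(a, b, c, d, 0))
    lut_when_1 = make_lut(lambda a, b, c, d: func5(a, b, c, d, 1))

    def evaluate(a, b, c, d, sel):
        # The fifth input drives the mux select.
        return lut_read(lut_when_1 if sel else lut_when_0, a, b, c, d)

    return evaluate

# Example function: 5-input majority vote.
def majority5(a, b, c, d, e):
    return int(a + b + c + d + e >= 3)

mapped = five_input_function(majority5)
matches = all(mapped(*v) == majority5(*v) for v in product((0, 1), repeat=5))
print(matches)  # True
```

The same split applied once more, with a second mux choosing between two such halves, gives the 6-input case described for MUXF6.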
6
Lookup Table
[Figure label: LUT]
Generates any function of its inputs
—Typically 4 inputs
Logically equivalent to a 16 x 1 ROM
Truth table rows shown (4 inputs, then output):
0000  0
0001  1
0010  1
0011  0
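The "16 x 1 ROM" view can be sketched in a few lines: the four inputs form an address, and the output is the bit stored at that address. In the sketch below the ROM contents are held in a 16-bit integer; the value 0x6666 is an assumption that extends the four rows shown on the slide (output is 1 when the two low inputs differ) across all 16 addresses.

```python
def lut4(init: int, i3: int, i2: int, i1: int, i0: int) -> int:
    """Read one bit from a 16 x 1 ROM whose contents are the 16-bit value init."""
    address = (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (init >> address) & 1

# Assumed contents: bit n of XOR_LOW_PAIR is the output for address n.
XOR_LOW_PAIR = 0x6666

for inputs in ((0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1)):
    print(inputs, lut4(XOR_LOW_PAIR, *inputs))
# (0, 0, 0, 0) 0
# (0, 0, 0, 1) 1
# (0, 0, 1, 0) 1
# (0, 0, 1, 1) 0
```

Changing the 16-bit contents changes the function without changing the lookup, which is why the limit is on the number of inputs rather than on logic complexity.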
7
Targeting LUT-based Logic
[Figure labels: CLB, Lookup Table]
LUT limit is on inputs, not complexity
—Reducing inputs/function (fan-in) to fit CLBs improves density and speed
—Automatically done by Xilinx synthesis and implementation tools
Inverters are free
8
Duplicating Logic Can Improve Results
[Figure labels: I1, N1, O1, N1A, N1B]
[Figure caption: N1 must go to two places, so O1 may require a second level of logic]
[Figure caption: Duplicating first gate allows N1A to always be collapsed inside a single lookup table]
Collapsing of logic into CLBs affects number of levels required and therefore speed
The gates you use will determine mapping
—Nets with a fanout >1 may be outside a CLB
9
Defining Lookup Tables With Gate Primitives
[Figure label: AND2]
Example of gate primitive
Up to five inputs with all combinations of inversion
—AND2B1 indicates 1 “bubbled” or inverted input
Up to nine inputs non-inverted
—Add external INV primitives if desired
10
Flip-Flops
[Figure labels: D, K, Q, C, CE]
Stores data (D) on rising edge of clock (K)
—Clock enable (CE)
—Asynchronous clear (C)
Truth table (x = don't care, ↑ = rising edge, q = previous value):
K  CE  C  D  Q
x  x   1  x  0
↑  1   0  d  d
0  x   0  x  q
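The behavior described above (capture D on a rising edge of K when CE is high, with an asynchronous clear that overrides the clock) can be sketched as a small behavioral model. This is an illustration only, not a timing-accurate simulation; the class and method names are made up for the example.

```python
class FlipFlop:
    """Behavioral D flip-flop with clock enable (CE) and asynchronous clear (C)."""

    def __init__(self):
        self.q = 0
        self._prev_k = 0  # last clock level seen, used to detect rising edges

    def update(self, k: int, ce: int, c: int, d: int) -> int:
        """Apply one set of input levels and return the resulting Q."""
        if c:
            # Asynchronous clear wins regardless of clock or enable.
            self.q = 0
        elif k and not self._prev_k and ce:
            # Rising edge of K with CE high: capture D.
            self.q = d
        # Otherwise Q holds its previous value.
        self._prev_k = k
        return self.q

ff = FlipFlop()
trace = [
    ff.update(k=0, ce=1, c=0, d=1),  # no edge yet, Q stays 0
    ff.update(k=1, ce=1, c=0, d=1),  # rising edge, Q becomes 1
    ff.update(k=0, ce=0, c=0, d=0),  # clock low, Q holds
    ff.update(k=1, ce=0, c=0, d=0),  # rising edge but CE low, Q holds
    ff.update(k=1, ce=0, c=1, d=0),  # clear asserted, Q becomes 0
]
print(trace)  # [0, 1, 1, 1, 0]
```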
11
Additional Flip-Flop Controls
Reset (Clear) and/or Set
Global initialization (GSR)
—Use to initialize all flip-flops
Programmable clock polarity
Clock enable can be left unconnected
12
Virtex/Spartan-II CLB Slice
1 CLB holds 2 slices
Each slice has two sets of
—Four-input LUT
–Any 4-input logic function
–Or 16-bit x 1 RAM
–Or 16-bit shift register
—Carry & Control
–Fast arithmetic logic
–Multiplier logic
–Multiplexer logic
—Storage element
–Latch or flip-flop
–Set and reset
–True or inverted inputs
–Sync. or Async. Control
13
Dedicated Multiplier Logic
Highly efficient ‘Shift & Add’ implementation
—For a 16x16 multiplier
–30% reduction in area
–1 fewer logic level
14
On-chip RAM
All Xilinx FPGAs use RAM-based programming
Adding Write Enable to LUT creates on-chip SelectRAM memory
15
SelectRAM Benefits
Single-Port
—Synchronous
—Simple timing
[Figure labels, single-port: Data, Write Enable, Write Clock, Address, Output]
Dual-Port
[Figure labels, dual-port: Data, Write Enable, Write Clock, Write Address/Single-Port Read Address, Single-Port Output, Dual-Port Read Address, Dual-Port Output]
16
Virtex/Spartan-II On-Chip SelectRAM+ Memory
[Figure: 200 MHz Memory Continuum; axis: Memory Bandwidth and Flexibility]
Distributed RAM (bytes): 16x1; Shallow/Wide; DSP Coefficients, Small FIFOs
Block RAM (kilobytes): 4Kx1, 2Kx2, 1Kx4, 512x8, 256x16; Deep/Wide; Large FIFOs, Packet Buffers, Video Line Buffers, Cache Tag Memory
External RAM (megabytes): SDRAM, ZBTRAM, SSRAM, SGRAM
17
Spartan-II Memory
[Figure: Spartan-II Dual-R/W Port Block RAM; Port A and Port B, read (R) / write (W) combinations]
CLB LUTs provide small distributed RAM (16 bits/LUT)
Block RAM provides 4K bits each
—Dual read/write port. Each port has…
–Independent Clock, R/W, and Enable
–Independently configurable data width from 4K x 1 to 256 x 16
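The port configurations follow from dividing the fixed 4K bits by the chosen data width. The sketch below enumerates them and estimates how many blocks a memory of a given shape would need. It assumes power-of-two widths from 1 to 16 and a simple tiling of identical blocks; `blocks_needed` is a hypothetical helper for illustration, not a tool function.

```python
import math

BLOCK_RAM_BITS = 4096        # "4K bits each", per the slide
WIDTHS = (1, 2, 4, 8, 16)    # assumed power-of-two data widths

def aspect_ratios():
    """Return (depth, width) pairs that use all 4096 bits of one block."""
    return [(BLOCK_RAM_BITS // w, w) for w in WIDTHS]

def blocks_needed(depth: int, width: int) -> int:
    """Fewest blocks needed to tile a depth x width memory with one block shape."""
    return min(
        math.ceil(depth / block_depth) * math.ceil(width / block_width)
        for block_depth, block_width in aspect_ratios()
    )

for block_depth, block_width in aspect_ratios():
    print(f"{block_depth} x {block_width}")
# 4096 x 1
# 2048 x 2
# 1024 x 4
# 512 x 8
# 256 x 16

# Hypothetical buffer shape, chosen only to show the calculation.
print(blocks_needed(1024, 16))  # 4
```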
18
I/O Block (IOB)
[Figure labels: IOB, Pad, Bonded to Package Pin, Clocks, TS, O, I]
Periphery of identical I/O blocks
—Input, output, or bi-directional
—Direct or registered (or latched input)
—Pullup/Pulldown
—Programmable slew rate
—Three-state output
—Programmable thresholds
19
Use Special IOB Primitives
[Figure labels: IPAD, IBUF]
User explicitly defines what resources in the IOB are to be used
I/Os are defined with
—1 pad primitive
—At least 1 function primitive
–1 input element, 1 output element or both
–Inverters may also be pulled into IOBs
20
Locking Down I/O Locations
LOC=Pxx attribute defines I/O pad location(s)
Avoid locking IOBs early
—Makes routing more difficult
Use IOB LOC= to lock pins late in design cycle once PCB is built
—Can lock IOBs if floorplanning the connected CLBs
21
Use Pullups/Pulldowns
[Figure labels: IPAD, IBUF]
Pullup automatically connected on unused IOBs
User can specify PULLUP or PULLDOWN primitive on used IOBs
Inputs should not be left floating
—Add Pullup to design inputs that may be left floating to reduce power and noise
22
Faster Setup With NODELAY
[Figure: Example IOB; labels: Pad, Input Buffer, Delay, Routing Delay, D, Q, External Data, External Clock, Routed Clock, Delay Data]
Delay included by default
—Compensates for clock routing delay to prevent a hold-time requirement
NODELAY attribute removes delay element
—Creates a hold-time requirement
23
Slew Rate Control
[Figure labels: OPAD, OBUF, FAST]
Slew rate controls output speed
Default slow slew rate reduces noise & ground bounce
Use fast slew rate wherever speed is important
—FAST parameter on output logic primitive
24
Output Three-State Control
[Figure labels: OBUFE, OBUFT, OE, T]
Free inverter on output buffer control
—Use OBUFE macro for active-high enable
—Use OBUFT primitive for active-low enable
25
Global Three-State
[Figure labels: STARTUP, GTS, GSR]
Three-state control is local and/or via a dedicated global net
—Global three-state controlled by STARTUP... primitive
26
Virtex/Spartan-II I/O Block (Simplified)
27
Multiple I/O Interface Standards
16 to 20 I/O interface standards supported
CMOS, HSTL, SSTL, GTL, CTT, PCI
As many as eight banks on a device
—Package dependent
Different banks can support different standards at the same time
—Logic level translation
—Boards with mixed standards
28
High Performance Routing
[Figure labels: CLB Array, 2 ns]
Hierarchical Routing
—Singles, Hexes, Longs
Sparse connections on longer interconnects for high speed
Routing delay depends primarily on distance
—Direction independent
—Device-size independent
Predictable for early design analysis
29
Flexible General-Purpose Interconnect
Flexible, but slow if a net crosses many channels
—Programmable switch matrix at each channel crossing
—Connects across, changes direction or fans out
30
Switch Matrix
Bidirectional pass transistors
High routing flexibility
31
Reduce Fanout
Higher fanout nets (>16 loads) are harder to route & slower
Consider duplicating source in schematic to improve routing or speed
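As a rough planning aid, the number of source copies needed to keep every copy at or below a fanout limit is a ceiling division. The sketch below uses the 16-load figure from the slide as the default limit; the function names and the load names are hypothetical, and real tools also weigh placement and timing.

```python
import math

def copies_needed(loads: int, max_fanout: int = 16) -> int:
    """Number of duplicated sources so that no copy drives more than max_fanout loads."""
    if loads <= 0:
        raise ValueError("loads must be positive")
    return math.ceil(loads / max_fanout)

def split_loads(loads: list, max_fanout: int = 16) -> list:
    """Partition a list of load names into groups, one group per duplicated source."""
    return [loads[i:i + max_fanout] for i in range(0, len(loads), max_fanout)]

# Hypothetical net with 40 loads.
load_names = [f"load{i}" for i in range(40)]
groups = split_loads(load_names)
print(copies_needed(len(load_names)))  # 3
print([len(g) for g in groups])        # [16, 16, 8]
```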
32
Long Lines for High Fanout Nets
[Figure label: CLB]
Metal lines that traverse length & width of chip
Lowest skew
Ideal for high fan-out signals
Ideal for clocking
Requires vertical or horizontal alignment of loads
33
Internal Three-State Buses
Two 3-state drivers per CLB
OR-AND logic implementation in place of 3-state drivers
—With no drivers enabled, bus is a logic 1
Low power
—No danger of contention when multiple BUFTs enabled
—No physical pullups or large capacitance to drive
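The OR-AND scheme can be sketched behaviorally: each driver contributes its data when enabled and a 1 when disabled (the OR stage), and the bus is the AND of all contributions. The model below is an illustration under that reading of the slide; the `enabled` flag is an abstract "driver is driving" indicator and says nothing about the polarity of the actual control pin.

```python
def or_and_bus(drivers) -> int:
    """Resolve a bus from (enabled, data) pairs using OR-AND logic."""
    value = 1  # with no driver enabled, the bus reads logic 1
    for enabled, data in drivers:
        # OR stage: a disabled driver is forced to 1, an enabled one passes its data.
        contribution = data | (0 if enabled else 1)
        # AND stage: combine all contributions.
        value &= contribution
    return value

print(or_and_bus([(0, 0), (0, 1)]))  # 1  (no drivers enabled)
print(or_and_bus([(1, 0), (0, 1)]))  # 0  (one driver enabled, driving 0)
print(or_and_bus([(1, 1), (0, 0)]))  # 1  (one driver enabled, driving 1)
print(or_and_bus([(1, 1), (1, 0)]))  # 0  (two enabled: result is their AND, no contention)
```

Because the result is always a defined logic value, two enabled drivers with different data produce a 0 rather than an electrical conflict.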
34
General Clock Support
Use clock buffers for highest fanout clocks
—Drive high-speed long line resources
–Lowest skew across a device
–No internal hold times
—Use generic BUFG primitive
–Allows software to choose best type of buffer
–Allows easy migration across families
Four dedicated global low skew buffers
—Dedicated input pin (clock distribution only)
Additional shared resources (i.e., long lines)
—Distribute low-skew/high-fanout signals (10 ns max.)
Four delay-locked loops on each device
—All-digital implementation
—Two global buffers associated with each DLL pair
35
Configuration
Schematic or HDL description is converted to a configuration file by the Xilinx development system
Configuration file is loaded into FPGA on power-up
—Stored in configuration latches
—Controls CLBs, IOBs, interconnect, etc.
36
Configuration Bitstream
Binary programming file
Length depends only on device, not utilization
—Typically 1 µs per bit (total from a few ms to <1 s)
FPGA can load its configuration automatically on power-up, or under microprocessor control
Can be loaded directly into device/configuration PROM
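Since bitstream length is fixed per device, load time is just bit count times time per bit. The sketch below assumes roughly 1 µs per bit, the rate implied by the quoted totals of a few ms to under 1 s; the bitstream sizes are hypothetical round numbers, not figures for any particular device.

```python
def config_time_us(bitstream_bits: int, us_per_bit: int = 1) -> int:
    """Estimated serial configuration time in microseconds (assumed 1 us per bit)."""
    return bitstream_bits * us_per_bit

# Hypothetical bitstream sizes, chosen only to bracket the quoted range.
for bits in (10_000, 500_000):
    print(f"{bits} bits -> {config_time_us(bits) / 1000} ms")
# 10000 bits -> 10.0 ms
# 500000 bits -> 500.0 ms
```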
37
Configuration Modes
Bit-serial configuration
—Simple, uses few device pins
—Controlled by FPGA (Master) or externally (Slave)
—Xilinx serial PROMs available
Byte-parallel configuration
—Can drive PROM addresses (Master)
—Can be microprocessor-controlled
38
Configuration Pins
Configuration starts on power-up
Mode pin(s) checked to determine method
—Usable as extra I/O after configuration
All I/O not used for configuration are disabled
Reconfiguration possible by pulling PROGRAM pin low
39
Readback
[Figure labels: READBACK, RIP, DATA, TRIG, CLK]
Configuration data can be read back serially
—Allows verification of programming
Readback data can include user-register values
—Allows in-circuit functional verification
—Requires READBACK... symbol
40
Boundary Scan
IEEE 1149.1-compatible boundary scan (JTAG)
Available before configuration
Configuration & readback possible via boundary scan logic
41
Power Consumption
CMOS SRAM technology provides low standby power
Operating power is mostly dynamic
—Proportional to transition frequency of internal nodes
—Xilinx segmented interconnect minimizes amount of metal capacitance to switch, minimizing power
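The proportionality to transition frequency and switched capacitance matches the usual first-order CMOS estimate, P = C · V² · f, where f counts full charge/discharge cycles per second. The sketch below applies that textbook formula; it is not taken from the slides, and the capacitance, voltage, and frequency values are hypothetical.

```python
import math

def dynamic_power_watts(capacitance_farads: float, voltage_volts: float,
                        cycles_per_second: float) -> float:
    """First-order CMOS dynamic power: C * V^2 * f for one switched node."""
    return capacitance_farads * voltage_volts ** 2 * cycles_per_second

# Hypothetical node: 1 pF switched at 2.5 V.
base = dynamic_power_watts(1e-12, 2.5, 50e6)
doubled_rate = dynamic_power_watts(1e-12, 2.5, 100e6)
half_capacitance = dynamic_power_watts(0.5e-12, 2.5, 50e6)

# Power scales linearly with switching rate and with switched capacitance.
print(math.isclose(doubled_rate, 2 * base))        # True
print(math.isclose(half_capacitance, base / 2))    # True
```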