Published by Percival Jackson. Modified over 9 years ago.
1
Basic FPGA Architecture 2 - 2 © 2005 Xilinx, Inc. All Rights Reserved For Academic Use Only
Virtex-II Architecture
The Virtex™-II architecture's core operates at 1.5 V
The device is built from the following resources
– I/O Blocks (IOBs)
– Configurable Logic Blocks (CLBs)
– Clock Management (DCMs, BUFGMUXes)
– Block SelectRAM™ resources
– Dedicated multipliers
– Programmable interconnect
2
Slices and CLBs
Each Virtex-II CLB contains four slices
– Local routing provides feedback between slices in the same CLB and routing to neighboring CLBs
– A switch matrix provides access to general routing resources
[Figure: a CLB showing the switch matrix, slices S0 to S3, local routing, BUFT, the CIN and COUT carry connections, and SHIFT]
3
Simplified Slice Structure
Each slice has four outputs
– Two registered outputs, two non-registered outputs
– Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
Carry logic runs vertically, up only
– Two independent carry chains per CLB
[Figure: Slice 0 with two LUTs, two carry blocks, and two registers with D, CE, PRE, CLR, and Q pins]
4
Detailed Slice Structure
The next few slides discuss the slice features
– LUTs
– MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUX are shown in this diagram)
– Carry Logic
– MULT_ANDs
– Sequential Elements
5
Look-Up Tables
Combinatorial logic is stored in Look-Up Tables (LUTs)
– Also called Function Generators (FGs)
– Capacity is limited by the number of inputs, not by the complexity of the function
Delay through the LUT is constant
[Figure: combinatorial logic with inputs A, B, C, D and output Z, and the equivalent truth table]
A B C D | Z
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
. . .
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1
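The point that capacity depends on input count rather than complexity can be illustrated with a small software model. This is a sketch, not Xilinx code: `make_lut4` is a hypothetical helper, and the bit ordering (A as the most significant address bit, matching the truth table above) is an assumption of the model.

```python
def make_lut4(init):
    """Return a 4-input function backed by a 16-bit truth table.

    Hypothetical model: bit `index` of `init` holds Z for the input row
    whose binary value is ABCD (A is the most significant bit).
    """
    def lut(a, b, c, d):
        index = (a << 3) | (b << 2) | (c << 1) | d
        return (init >> index) & 1
    return lut

# Any function of four inputs fits in the same 16 bits of storage.
and4 = make_lut4(1 << 15)   # only row 1111 produces 1
xor4 = make_lut4(0x6996)    # 4-input parity, a "complex" function in gates
```

Both functions occupy exactly one LUT in this model, and each lookup is a single table read, which mirrors the constant delay noted on the slide.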
6
Connecting Look-Up Tables
MUXF5 combines the two LUTs in each slice
MUXF6 combines slices S0 and S1; a second MUXF6 combines slices S2 and S3
MUXF7 combines the two MUXF6 outputs
MUXF8 combines the two MUXF7 outputs (from the CLB above or below)
[Figure: a CLB with slices S0 to S3, each containing an F5 MUX plus an F6, F7, or F8 MUX]
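One way to picture the mux hierarchy is as a recursive split of a wide truth table: each extra input selects between two half-size functions. The sketch below is a simplified software model under that assumption; `lut4` and `wide_function` are hypothetical names, and the physical placement of each mux is not modeled.

```python
def lut4(init, bits):
    """Evaluate a 4-input LUT; bits[0] is the least significant address bit."""
    index = bits[0] | (bits[1] << 1) | (bits[2] << 2) | (bits[3] << 3)
    return (init >> index) & 1

def wide_function(table, bits):
    """Evaluate an n-input function (4 <= n <= 8) as LUTs plus a mux tree.

    `table` holds 2**n truth-table bits. The highest input drives the
    select of the outermost mux (MUXF5 for n=5, up to MUXF8 for n=8).
    """
    n = len(bits)
    if n == 4:
        return lut4(table, bits)
    half = 1 << (n - 1)
    low = wide_function(table & ((1 << half) - 1), bits[:-1])
    high = wide_function(table >> half, bits[:-1])
    return high if bits[-1] else low
```

In this model a 5-input function uses two LUTs and one mux, and an 8-input function uses sixteen LUTs, which matches the doubling at each mux level described above.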
7
Fast Carry Logic
Simple, fast, and complete arithmetic logic
– Dedicated XOR gate for single-level sum completion
– Uses dedicated routing resources
– All synthesis tools can infer carry logic
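A bit-level sketch may help show why the dedicated XOR completes the sum in a single level. It assumes the usual arrangement in which the LUT computes the propagate signal, a carry mux chooses between the incoming carry and one operand bit, and the XOR gate forms the sum bit; `carry_chain_add` is a hypothetical name for this model.

```python
def carry_chain_add(a, b, width, cin=0):
    """Ripple addition modeled as one LUT, one carry mux, and one XOR per bit."""
    carry = cin
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        propagate = ai ^ bi                   # LUT output
        total |= (propagate ^ carry) << i     # dedicated XOR gate forms the sum bit
        carry = carry if propagate else ai    # carry mux: pass carry-in, else take ai
    return total, carry
```

When `propagate` is 0 the two operand bits are equal, so taking `ai` as the carry-out is correct for both the 0+0 and 1+1 cases.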
8
MULT_AND Gate
Highly efficient multiply and add implementation
– Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition
– The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit
[Figure: a LUT computing A x B feeding the carry chain, with MULT_AND driving the DI input of CY_MUX; signals CO, DI, CI, S; elements LUT, CY_MUX, CY_XOR, MULT_AND]
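The saving can be sketched in software. In this simplified model, which is an assumption rather than the exact Virtex-II netlist, each bit position uses one LUT to XOR a partial-product bit with the running sum, while the MULT_AND gate supplies the same partial-product bit to the carry mux. The names `add_partial_product` and `multiply` are hypothetical.

```python
def add_partial_product(acc, a, bj, width):
    """Return acc + (a if bj else 0) using one LUT and one MULT_AND per bit."""
    carry = 0
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        pi = (acc >> i) & 1
        pp = ai & bj                          # MULT_AND gate: partial-product bit
        s = pp ^ pi                           # LUT: multiply (AND) and add (XOR) together
        total |= (s ^ carry) << i             # CY_XOR
        carry = carry if s else pp            # CY_MUX with DI driven by MULT_AND
    return total, carry

def multiply(a, b, width):
    """Unsigned shift-and-add multiply built from the per-bit cell above."""
    acc = 0
    for j in range(width):
        acc, _ = add_partial_product(acc, a << j, (b >> j) & 1, 2 * width)
    return acc
```

Without the MULT_AND gate, the AND result needed by the carry mux would have to come from a second LUT, which is the two-LUT-per-bit cost the slide attributes to earlier architectures.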
9
Flexible Sequential Elements
Either flip-flops or latches
Two in each slice; eight in each CLB
Inputs come from LUTs or from an independent CLB input
Separate set and reset controls
– Can be synchronous or asynchronous
All controls are shared within a slice
– Control signals can be inverted locally within a slice
[Figure: the FDCPE (D, CE, PRE, CLR, Q), FDRSE (D, CE, S, R, Q), and LDCPE (D, G, CE, PRE, CLR, Q) primitives]
10
Shift Register LUT (SRL16CE)
Dynamically addressable serial shift registers
– Maximum delay of 16 clock cycles per LUT (128 per CLB)
– Cascadable to other LUTs or CLBs for longer shift registers
– Dedicated connection from Q15 to the D input of the next SRL16CE
– Shift register length can be changed asynchronously by toggling address A
[Figure: a LUT configured as a chain of registers with inputs D, CE, CLK, and A[3:0], and outputs Q and Q15 (cascade out)]
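The addressing and cascade behavior can be sketched with a small behavioral model. The class below is an illustration only: the pin names follow the slide, but the Python class `SRL16CE` and the helper `clock_chain` are assumptions made for this example.

```python
class SRL16CE:
    """Behavioral model of a 16-stage shift register with an addressable tap."""

    def __init__(self):
        self.bits = [0] * 16                  # bits[0] is the most recent sample

    def clock(self, d, ce=1):
        """Rising clock edge: shift in d when the clock enable is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:15]

    def q(self, addr):
        """Addressable output: reading is combinational, no clock needed."""
        return self.bits[addr & 0xF]

    @property
    def q15(self):
        """Fixed last-stage output used for cascading."""
        return self.bits[15]

def clock_chain(chain, d):
    """Clock cascaded SRL16CEs, sampling every Q15 before the edge."""
    inputs = [d] + [stage.q15 for stage in chain[:-1]]
    for stage, value in zip(chain, inputs):
        stage.clock(value)
```

Changing `addr` between reads selects a different delay (address A gives A + 1 cycles) without reloading the register, and two cascaded instances give up to 32 cycles of delay.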
11
IOB Element
Input path
– Two DDR registers
Output path
– Two DDR registers
– Two 3-state enable DDR registers
Separate clocks and clock enables for I and O
Set and reset signals are shared
[Figure: an IOB with 3-state and output register pairs feeding DDR MUXes (clocks OCK1, OCK2), input registers (clocks ICK1, ICK2), and the PAD]
12
Distributed SelectRAM Resources
Uses a LUT in a slice as memory
Synchronous write
Asynchronous read
– Accompanying flip-flops can be used to create synchronous read
RAM and ROM are initialized during configuration
– Data can be written to RAM after configuration
Emulated dual-port RAM
– One read/write port
– One read-only port
[Figure: the RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), and RAM16X1D (D, WE, WCLK, A0 to A3, DPRA0 to DPRA3, SPO, DPO) primitives built from slice LUTs]
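The synchronous-write, asynchronous-read behavior of the emulated dual-port RAM can be sketched as follows. Port names follow the RAM16X1D pins on the slide; the Python class and its method names are assumptions made for illustration.

```python
class RAM16X1D:
    """Behavioral model: 16 x 1 RAM with one read/write port and one read-only port."""

    def __init__(self, init=0):
        # Contents are set at configuration time; bit i of init is address i.
        self.mem = [(init >> i) & 1 for i in range(16)]

    def write_clock(self, a, d, we):
        """Rising WCLK edge: write d at address A only when WE is high."""
        if we:
            self.mem[a & 0xF] = d & 1

    def spo(self, a):
        """Read/write port output: follows address A without a clock."""
        return self.mem[a & 0xF]

    def dpo(self, dpra):
        """Read-only port output: follows address DPRA without a clock."""
        return self.mem[dpra & 0xF]
```

Reads return immediately in this model; registering `spo` or `dpo` in a flip-flop is what turns the read into a synchronous one, as the slide notes.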
13
Block SelectRAM Resources
Up to 3.5 Mb of RAM in 18-kb blocks
– Synchronous read and write
True dual-port memory
– Each port has synchronous read and write capability
– Different clocks for each port
Supports initial values
Synchronous reset on output latches
Supports parity bits
– One parity bit per eight data bits
[Figure: an 18-kb block SelectRAM memory with port A pins DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA and port B pins DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB]
14
Dual-Port Block RAM Configurations
Configurations available on each port
Configuration | Depth | Data Bits | Parity Bits
16k x 1       | 16k   | 1         | 0
8k x 2        | 8k    | 2         | 0
4k x 4        | 4k    | 4         | 0
2k x 9        | 2k    | 8         | 1
1k x 18       | 1k    | 16        | 2
512 x 36      | 512   | 32        | 4
Independent configurations on ports A and B
– Supports data-width conversion, including parity bits
[Figure: port A configured 8 bits wide (8-bit IN) and port B configured 32 bits wide (32-bit OUT)]
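The table follows from a fixed 16 kb of data storage per block plus one parity bit per eight data bits. The sketch below derives each row and models an 8-bit write port feeding a 32-bit read port. The names `port_config` and `WidthConvertingRam` are hypothetical, and the byte ordering (lowest byte address in the least significant byte of the word) is an assumption of the model.

```python
DATA_BITS = 16 * 1024  # data capacity of one block; parity bits are stored in addition

def port_config(data_width):
    """Return (depth, parity_bits) for a port with the given data width."""
    depth = DATA_BITS // data_width
    parity_bits = data_width // 8         # one parity bit per eight data bits
    return depth, parity_bits

class WidthConvertingRam:
    """Port A writes bytes, port B reads 32-bit words from the same storage."""

    def __init__(self):
        self.storage = bytearray(DATA_BITS // 8)

    def write_a(self, addr, byte):
        self.storage[addr] = byte & 0xFF

    def read_b(self, addr):
        # Assumed ordering: four consecutive port A bytes form one port B word.
        return int.from_bytes(self.storage[4 * addr:4 * addr + 4], "little")
```

For the widths with parity, depth times total width gives the full 18 kb: for example, 2k x 9 is 2048 x 9 = 18432 bits.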
15
Dedicated Multiplier Blocks
18-bit twos complement signed operation
Optimized to implement Multiply and Accumulate functions
Multipliers are physically located next to block SelectRAM™ memory
[Figure: an 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and a 36-bit output; operand sizes shown are 4 x 4, 8 x 8, 12 x 12, and 18 x 18 signed]
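A short worked model shows why two 18-bit signed operands need a 36-bit output. The helper names `to_signed` and `mult18x18` are hypothetical; only the bit widths come from the slide.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mult18x18(data_a, data_b):
    """Multiply two 18-bit two's complement words; return the 36-bit result word."""
    product = to_signed(data_a, 18) * to_signed(data_b, 18)
    return product & ((1 << 36) - 1)
```

The largest magnitude result is (-2^17) x (-2^17) = 2^34, which fits comfortably in a 36-bit signed word.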
16
Xilinx Design Flow
[Figure: design flow diagram with the steps Plan & Budget; Create Code/Schematic; HDL RTL Simulation; Synthesize to create netlist; Functional Simulation; Implement (Translate, Map, Place & Route); Attain Timing Closure; Timing Simulation; Create BIT File]
17
Xilinx Implementation
Once you generate a netlist, you can implement the design
There are several outputs of implementation
– Reports
– Timing simulation netlists
– Floorplan files
– FPGA Editor files
– and more!
[Figure: the Implement step expanded into Translate, Map, and Place & Route]
18
What is Implementation?
More than just Place & Route
Implementation includes many phases
– Translate: Merge multiple design files into a single netlist
– Map: Group logical symbols from the netlist (gates) into physical components (slices and IOBs)
– Place & Route: Place components onto the chip, connect the components, and extract timing data into reports
Each phase generates files that allow you to use other Xilinx tools
– Floorplanner, FPGA Editor, XPower
19
Project Summary
[Figure: Project Summary view with the sections Design Overview, Device Utilization, Performance and Constraints, and Reports]
20
Map Reports
Map Report contents
– Command line options for the map program
– Design summary: list of how many device resources are used
– Errors and warnings
– Removed logic summary: list of logic that was removed due to sourceless or loadless nets
– IOB properties: indicates whether an I/O flip-flop is used; lists the attributes on each I/O pin
The Post-Map Static Timing Report is not covered here
21
Map Report Example
    Release 4.1i - Map E.30
    Xilinx Mapping Report File for Design 'top'

    Design Information
    ------------------
    Command Line   : map -p xc2v40-fg256-4 -cm area -k 4 -c 100 -tx off top.ngd
    Target Device  : x2v40
    Target Package : fg256
    Target Speed   : -4
    Mapper Version : virtex2 -- $Revision: 1.58 $
    Mapped Date    : Tue Aug 21 09:42:20 2001

    Design Summary
    --------------
22
Map Report Example
    Number of errors:      0
    Number of warnings:    0
    Number of Slices:                        182 out of 256   71%
    Number of Slices containing
      unrelated logic:                         0 out of 182    0%
    Number of Slice Flip Flops:              170 out of 512   33%
    Total Number 4 input LUTs:               248 out of 512   48%
      Number used as LUTs:                   167
      Number used as a route-thru:            81
    Number of bonded IOBs:                    26 out of 88    29%
    Number of GCLKs:                           1 out of 16     6%

    Total equivalent gate count for design:  3,475
    Additional JTAG gate count for IOBs:     1,248
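The percentages in the summary can be reproduced with integer arithmetic. The sketch below assumes the report truncates rather than rounds, which is inferred only from the figures shown (26 of 88 is 29.5% and is reported as 29%); `utilization` is a hypothetical helper, not part of any Xilinx tool.

```python
def utilization(used, available):
    """Percentage of a resource in use, truncated to a whole number (assumed)."""
    return used * 100 // available

# Figures taken from the Design Summary above.
summary = {
    "Slices": (182, 256),
    "Slice Flip Flops": (170, 512),
    "4 input LUTs": (248, 512),
    "bonded IOBs": (26, 88),
    "GCLKs": (1, 16),
}
for name, (used, available) in summary.items():
    print(f"{name}: {used} out of {available} = {utilization(used, available)}%")
```

The LUT total also decomposes as the report shows: 167 used as logic plus 81 used as route-throughs gives 248.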
23
Place & Route Reports
Place & Route Report contents
– Command line options for the par program
– Errors and warnings
– Device utilization summary: similar to the Design Summary from the Map Report
– Unrouted nets
– Timing summary: statistics on average routing delays; performance versus constraints if the design contains timing constraints
24
Timing Reports
Timing Report contents (for designs with constraints)
– Command line options for the trce program
– Timing Constraints section: summary of each timing constraint; details on paths that fail to meet constraints
– Data Sheet section: setup/hold, clock-to-pad, timing between clock domains, and pad-to-pad delay information, organized in an easy-to-read table format
– Timing Summary section: number of errors and Timing Score; constraint coverage
25
Timing Report Example
    Release 4.1i - Trace E.30
    Copyright (c) 1995-2001 Xilinx, Inc. All rights reserved.

    trce -e 3 -l 3 -xml top top.ncd -o top.twr top.pcf

    Design file:              top.ncd
    Physical constraint file: top.pcf
    Device,speed:             xc2v40,-4 (ADVANCED 1.85 2001-07-24)
    Report level:             error report
    --------------------------------------------------------------------------------
    WARNING:Timing - No timing constraints found, doing default enumeration.
    ================================================================================
    Timing constraint: Default period analysis
     8292 items analyzed, 0 timing errors detected.
     Minimum period is 8.852ns.
     Maximum delay is 11.830ns.
    --------------------------------------------------------------------------------
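The minimum period in the report converts directly to a maximum clock frequency, since frequency is the reciprocal of period. A minimal sketch, with `max_frequency_mhz` as a hypothetical helper name:

```python
def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / min_period_ns

# Minimum period reported by the default period analysis above.
print(f"{max_frequency_mhz(8.852):.2f} MHz")
```

For the 8.852 ns period shown, this works out to roughly 113 MHz for the register-to-register paths covered by the default analysis.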
26
Timing Report Example
    All constraints were met.

    Data Sheet report:
    -----------------
    All values displayed in nanoseconds (ns)

    Clock FiftyM_clk to Pad
    ---------------+------------+
                   | clk (edge) |
    Destination Pad|   to PAD   |
    ---------------+------------+
    EN             |   10.035(R)|
    half1          |    9.465(R)|
    half2          |    9.166(F)|
    half3          |    9.740(R)|
    half4          |    9.174(F)|
    ---------------+------------+
27
Without Timing Constraints
This design had no timing constraints or pin assignments entered when it was implemented
Note the logical structure of the placement and pins
Xilinx recommends that you compile your design at least once without timing constraints or pin assignments
This design has a maximum system clock frequency of 50 MHz
28
With Timing Constraints
This is the same design with three global timing constraints entered with the Constraints Editor
It has a maximum system clock frequency of 60 MHz
Note how most of the logic is placed closer to the edge of the device where the pins have been placed
29
Period Constraint
In this example, the Period constraint optimizes all delay paths between flip-flops
The Period constraint does NOT optimize delay paths from input pads to output pads (purely combinatorial), paths from input pads to flip-flops, or paths from flip-flops to output pads
[Figure: example circuit with inputs ADATA, CDATA, and BUS [7..0], clock CLK through a BUFG, flip-flops FLOP1 to FLOP5, blocks of combinatorial logic, and outputs OUT1 and OUT2]
30
The Period Constraint
A synchronous element is a flip-flop, a latch, or a synchronous RAM
The Period constraint covers paths…
– Between synchronous elements that are clocked by the reference net
Synchronous elements are grouped by the clock signal driving them. This is called forward propagation, and it lets a single constraint cover large pieces of logic
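Grouping by clock can be pictured with a small sketch. It is an illustration of the idea only, not how the Xilinx tools are implemented: `group_by_clock` and `period_covers` are hypothetical names, and the element `OTHER_FF` and clock `CLK2` are invented to show a second group.

```python
from collections import defaultdict

def group_by_clock(elements):
    """elements maps each synchronous element name to the clock net driving it."""
    groups = defaultdict(set)
    for name, clock in elements.items():
        groups[clock].add(name)
    return groups

def period_covers(path, groups, clock):
    """A path is covered when both endpoints belong to the group of `clock`."""
    source, destination = path
    members = groups.get(clock, set())
    return source in members and destination in members

# FLOP1..FLOP5 on CLK mirror the example circuit; OTHER_FF/CLK2 are invented.
elements = {f"FLOP{i}": "CLK" for i in range(1, 6)}
elements["OTHER_FF"] = "CLK2"
groups = group_by_clock(elements)
```

Naming the clock net once is enough to constrain every flip-flop-to-flip-flop path in that group, which is the economy the slide describes.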
31
Offset Constraint
In this example, the Offset constraint optimizes delay paths from input pads to flip-flops and paths from flip-flops to output pads
[Figure: the same example circuit (ADATA, CDATA, BUS [7..0], CLK through a BUFG, flip-flops, combinatorial logic, OUT1, OUT2) with the Offset In and Offset Out paths marked]
32
The Offset Constraint
The Offset constraint covers paths…
– From input pads to synchronous elements clocked by the reference net (Offset In)
– From synchronous elements clocked by the reference net to output pads (Offset Out)
Note that this constraint does not cover paths…
– Between synchronous elements
– From pads to pads (purely combinatorial paths)
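Taken together with the Period slides, the coverage rules amount to classifying a path by its two endpoints. The sketch below encodes that reading of the slides; `covering_constraint` and its string labels are hypothetical and do not correspond to constraint-file syntax.

```python
def covering_constraint(source, destination):
    """Classify a path by endpoint kind: 'pad' or 'sync'.

    'sync' stands for a flip-flop, latch, or synchronous RAM.
    Returns None for pad-to-pad paths, which neither constraint covers.
    """
    kinds = (source, destination)
    if kinds == ("sync", "sync"):
        return "PERIOD"
    if kinds == ("pad", "sync"):
        return "OFFSET IN"
    if kinds == ("sync", "pad"):
        return "OFFSET OUT"
    if kinds == ("pad", "pad"):
        return None
    raise ValueError(f"unknown endpoint kinds: {kinds}")
```

The `None` case is the gap left by both constraints: purely combinatorial pad-to-pad paths need separate treatment.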