FPGA Design Flow Workshop

Synthesis Techniques
FPGA Design Flow Workshop

Objectives
After completing this module, you will be able to:
- Select a proper coding style to create efficient FPGA designs
- Specify Xilinx resources that need to be instantiated for various FPGA synthesis tools
- Describe an approach to using your synthesis tool to obtain higher performance

FPGA synthesis tools:
- Synopsys: FPGA Compiler II
- Synplicity: Synplify, Synplify Pro, Amplify (physical optimization tool)
- Xilinx: XST (Xilinx Synthesis Technology)
XST and Synplicity can be integrated into ISE.

Outline
- Coding Tips
- Instantiating Resources
- Synthesis Options
- Summary
- Appendix:
  - Inferring Logic and Flip-Flop Resources
  - Inferring Memory
  - Inferring I/Os and Global Resources
  - Inference versus Instantiation by Synthesis Vendor

Instantiation versus Inference
- Instantiate a component when you must dictate exactly which resource is needed
  - The synthesis tool is unable to infer the resource
  - The synthesis tool fails to infer the resource
- Xilinx recommends inference whenever possible
  - Inference makes your code more portable
- Xilinx recommends using the CORE Generator™ system to create ALUs, fast multipliers, FIR filters, etc. for instantiation

Instantiation: Directly referencing a library primitive or macro in your HDL.
Inference: Writing an RTL description of circuit behavior that the synthesis tool converts into library primitives.

Why instantiate? Instantiation is useful when you cannot infer the component. For example, it is not possible to infer the Virtex-II DCM component, so the only way to use it is to instantiate the DCM block. Components that must be instantiated are listed later in this module. Examples of inferring various Xilinx resources can be found in the appendix at the end of this module.

Coding Flip-Flops
Control signal precedence: Clear, Preset, Clock Enable

VHDL:
    FF_AR_CE: process (CLK)
    begin
      if (CLK'event and CLK = '1') then
        if (RST = '1') then
          Q <= '0';
        elsif (SET = '1') then
          Q <= '1';
        elsif (CE = '1') then
          Q <= D_IN;
        end if;
      end if;
    end process;

Verilog:
    always @(posedge CLK)
      if (RST)
        Q <= 1'b0;
      else if (SET)
        Q <= 1'b1;
      else if (CE)
        Q <= D_IN;

The natural precedence of control inputs for Xilinx FPGA flip-flops is: Reset, Set, Enable. To properly infer flip-flops, your code should be written with the same precedence. If an asynchronous clear or preset is used, it must have precedence over the clock enable. If synchronous reset or set is used, it should have precedence over the clock enable. If you write your code so that the enable has precedence, the set or reset will be implemented by using logic gates instead of the dedicated flip-flop input. The examples shown here implement synchronous reset and set.

State Machine Design
- Put the next-state logic in one CASE statement
  - The state register may also be included here or in a separate process block or always block
- Put the state machine outputs in a separate process or always block
  - Easier for synthesis tools to optimize logic this way

[Figure: state machine module with inputs to the FSM and states S1 to S5, mapped to HDL code blocks for next-state logic, state register, and state machine outputs]

FSMs are faster when they are in separate processes because the combinatorial logic does not share resources. Hence, logic can be combined into a single LUT.
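
The structure described on this slide can be sketched in Verilog as follows. This is a minimal illustration, not code from the workshop: the module name fsm_sketch, the states IDLE, RUN, and DONE, and the ports start, finish, and busy are all hypothetical.

```verilog
// Hypothetical three-state FSM; every name here is an illustrative assumption.
module fsm_sketch (
  input  wire clk,
  input  wire rst,      // synchronous reset
  input  wire start,
  input  wire finish,
  output reg  busy
);
  localparam [1:0] IDLE = 2'd0,
                   RUN  = 2'd1,
                   DONE = 2'd2;

  reg [1:0] state, next_state;

  // State register
  always @(posedge clk)
    if (rst) state <= IDLE;
    else     state <= next_state;

  // Next-state logic: one case statement, state jumps only
  always @(*) begin
    next_state = state;              // default: hold the current state
    case (state)
      IDLE:    if (start)  next_state = RUN;
      RUN:     if (finish) next_state = DONE;
      DONE:    next_state = IDLE;
      default: next_state = IDLE;
    endcase
  end

  // State machine outputs in a separate always block
  always @(*)
    busy = (state == RUN);
endmodule
```

Keeping the output equations out of the next-state case statement is the point of the sketch; whether the state register shares a block with the next-state logic is a style choice the slide leaves open.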

The Perfect State Machine
The perfect state machine has:
- Inputs: input signals, state jumps
- Outputs: output states, and control/enable signals to the rest of the design
- NO arithmetic logic, datapaths, or combinatorial functions inside the state machine

[Figure: input signals and current-state feedback drive the next-state logic (state jumps only); the state register produces the output state and enables]

State Machine Encoding
- Use enumerated types to define state vectors (VHDL)
  - Most synthesis tools have commands to extract and re-encode state machines described in this way
- Use one-hot encoding for high-performance state machines
  - Uses more registers, but simplifies next-state logic
  - Experiment to discover how your synthesis tool chooses the default encoding scheme
- Register state machine outputs for higher performance

One-hot: The advantage of one-hot encoding in Xilinx FPGAs is that the next-state decoding logic may be simplified to logic equations with four inputs or fewer, which can fit into a single Look-Up Table (LUT). This will maximize the performance of the state machine. Many synthesis tools automatically choose one-hot encoding for state machines when you target a Xilinx FPGA.
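
As a rough illustration of one-hot encoding combined with a registered output, consider the sketch below. The module name onehot_sketch, the state names, and the ports go and active are hypothetical, and a synthesis tool with FSM extraction enabled may still re-encode the states on its own.

```verilog
// Hypothetical one-hot FSM with a registered output; names are assumptions.
module onehot_sketch (
  input  wire clk,
  input  wire rst,      // synchronous reset
  input  wire go,
  output reg  active    // registered state machine output
);
  // One flip-flop per state: exactly one bit is set at any time.
  localparam [2:0] S_IDLE = 3'b001,
                   S_WORK = 3'b010,
                   S_DONE = 3'b100;

  reg [2:0] state, next_state;

  // Next-state logic
  always @(*) begin
    next_state = S_IDLE;
    case (state)
      S_IDLE:  next_state = go ? S_WORK : S_IDLE;
      S_WORK:  next_state = S_DONE;
      S_DONE:  next_state = S_IDLE;
      default: next_state = S_IDLE;
    endcase
  end

  // State register and output register
  always @(posedge clk)
    if (rst) begin
      state  <= S_IDLE;
      active <= 1'b0;
    end else begin
      state  <= next_state;
      // Decode from next_state so the registered output changes in the
      // same clock cycle as the state register.
      active <= (next_state == S_WORK);
    end
endmodule
```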

Instantiation Tips
- Use instantiation only when it is necessary to access device features, increase performance, or decrease area
  - Exceptions are noted at the end of this section
- Limit instantiated components to a few source files, so that they are easier to locate when porting the code

FPGA Resources
Can be inferred by all synthesis tools:
- Shift register LUT (SRL16 / SRLC16)
- F5, F6, F7, and F8 MUX
- Carry logic
- MULT_AND
- MULT18x18 / MULT18x18S
- Memories (ROM)
- Global clock buffers (BUFG)
- SelectIO (single-ended)
- I/O registers (single data rate)
- Input DDR registers

Can be inferred by some synthesis tools:
- Memories (RAM)
- Global clock buffers (BUFGCE, BUFGMUX, BUFGDLL**)

Cannot be inferred by any synthesis tools:
- SelectIO (differential)
- Output DDR registers
- DCM

Information for specific synthesis tools can be found in the Appendix to this module.

Suggested Instantiation
Xilinx recommends that you instantiate the following elements:
- Memory resources
  - Block RAMs specifically (use the CORE Generator™ system to build large memories)
- SelectIO resources
- Clock management resources
  - DCM (use the Architecture Wizard)
  - IBUFG, BUFG, BUFGMUX, BUFGCE

Suggested Instantiation
Why do we suggest this?
- Easier to change (port) to other and newer technologies
- Fewer synthesis constraints and attributes to pass on
  - Keeping most of the attributes and constraints in the Xilinx UCF file keeps it simple: one file contains critical information
- Create a separate hierarchical block for instantiating these resources
  - Above the top-level block, create a Xilinx "wrapper" with Xilinx-specific instantiations

[Figure: Xilinx "wrapper" top_xlnx containing the Top-Level Block together with instantiated STARTUP, IBUFG, DCM, BUFG, IBUF_SSTL2_I, and OBUF_GTL primitives]
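
A wrapper of this kind might look like the following sketch. The wrapper name top_xlnx is taken from the slide figure; the core module top, its ports, and the pin names are hypothetical. IBUFG and BUFG are Xilinx library primitives with ports I and O, so this code only elaborates with the Xilinx primitive library available.

```verilog
// Technology-independent core (hypothetical module and port names).
module top (
  input  wire clk,
  input  wire din,
  output reg  dout
);
  always @(posedge clk)
    dout <= din;
endmodule

// Xilinx-specific wrapper: primitives are instantiated only at this level.
module top_xlnx (
  input  wire clk_pad,
  input  wire din,
  output wire dout
);
  wire clk_ibufg;
  wire clk_bufg;

  // Dedicated clock input buffer followed by a global clock buffer.
  IBUFG ibufg_clk_inst (.I(clk_pad),   .O(clk_ibufg));
  BUFG  bufg_clk_inst  (.I(clk_ibufg), .O(clk_bufg));

  // The portable design sits below the wrapper, free of Xilinx primitives.
  top top_inst (.clk(clk_bufg), .din(din), .dout(dout));
endmodule
```

Porting to another technology then means replacing only the wrapper, while the core module stays unchanged.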

Synthesis Options
There are many synthesis options that can help you meet your performance and area objectives:
- Timing Driven Synthesis
- Timing Constraint Editor
- FSM Extraction
- Retiming
- Register Duplication
- Hierarchy Management
- Schematic Viewer
- Error Navigation
- Cross Probing
- Physical Optimization

Where to find the options in each tool:
- FPGA Compiler II: Synthesis > Options or Constraints Editor, or Create Implementation Options (accessed when you click Create Implementation)
- Synplify: Implementation Options or Constraints Editor (SCOPE)
- XST: Listed under the Properties menu. In the Project Navigator's Process window, right-click Synthesize and select Properties

Timing Driven Synthesis
- Timing-driven synthesis uses performance objectives to drive the optimization of the design
  - Based on your performance objectives, the tools will try several algorithms to attempt to meet performance while keeping the amount of resources in mind
- Performance objectives are provided to the synthesis tool via timing constraints

FSM Extraction
- Finite State Machine (FSM) extraction optimizes your state machine by re-encoding and optimizing your design based on the number of states and inputs
  - By default, the tools will use FSM extraction

For specifics on how your synthesis tool re-encodes your FSM, see the user guide provided by each vendor.

Retiming
- Retiming: The synthesis tool automatically tries to move register stages to balance combinatorial delay on each side of the registers

[Figure: D/Q register stages before retiming and after retiming]

To access retiming in XST: Enable under the Properties dialog box for Synthesize > Xilinx Specific Options > Register balancing.

Retiming results are design dependent. In some situations (for example, highly pipelined designs), retiming may not provide any benefit. For other designs it may improve performance.
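
To make the idea concrete, here is a hand-written before/after pair that mimics what a retiming pass might do. It is only a sketch: the module names sum4_before and sum4_after and the four-input adder are hypothetical, and an actual tool decides for itself where registers move.

```verilog
// Before: all three additions sit between the two register stages.
module sum4_before (
  input  wire       clk,
  input  wire [7:0] a, b, c, d,
  output reg  [9:0] sum
);
  reg [7:0] a_r, b_r, c_r, d_r;

  always @(posedge clk) begin
    a_r <= a;
    b_r <= b;
    c_r <= c;
    d_r <= d;
    sum <= a_r + b_r + c_r + d_r;   // long combinatorial path
  end
endmodule

// After: the first register stage is moved past one level of adders,
// so each stage contains a single level of addition.
module sum4_after (
  input  wire       clk,
  input  wire [7:0] a, b, c, d,
  output reg  [9:0] sum
);
  reg [8:0] ab_r, cd_r;

  always @(posedge clk) begin
    ab_r <= a + b;
    cd_r <= c + d;
    sum  <= ab_r + cd_r;            // shorter combinatorial path
  end
endmodule
```

Both versions produce the same sum two clock cycles after the inputs are applied; only the placement of the registers relative to the logic differs.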

Register Duplication
- Register duplication is used to reduce fanout on registers (to improve delays)
- Register duplication of the output 3-state register is used so that the IOB 3-state register can be moved inside the IOB to reduce clk-to-output delays
- Xilinx recommends manual register duplication
  - Most synthesis vendors create signals <signal_name>_rep0, _rep1, etc.
    - Implementation tools pack these signals into the same slice
    - Not necessarily wrong, but it may prohibit a register from being moved closer to its destination
  - When manually duplicating registers, do NOT use a number at the end
    - Example: <signal_name>_0dup, <signal_name>_1dup

Note that for the 3-state register to be placed in the IOB, its fanout must be one. This is a separate attribute from the maximum fanout control.
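
Manual duplication following the naming advice above could be sketched like this. The module name dup_sketch and all signal names are hypothetical. Synthesis tools may merge identical registers back together, so a tool-specific setting may be needed to keep both copies; that setting is not shown here.

```verilog
// Hypothetical example: one enable register duplicated by hand so that
// each copy drives half of the load.
module dup_sketch (
  input  wire        clk,
  input  wire        en_in,
  input  wire [15:0] a, b,
  output reg  [15:0] qa, qb
);
  // Names end in "dup", not in a number, as the slide recommends.
  reg en_0dup;
  reg en_1dup;

  always @(posedge clk) begin
    en_0dup <= en_in;
    en_1dup <= en_in;
  end

  always @(posedge clk) begin
    if (en_0dup) qa <= a;   // first copy drives the qa registers
    if (en_1dup) qb <= b;   // second copy drives the qb registers
  end
endmodule
```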

Hierarchy Management
The basic settings are:
- Flatten the design: Allows total combinatorial optimization across all boundaries
- Maintain hierarchy: Preserves hierarchy without allowing optimization of combinatorial logic across boundaries

If you have followed the synchronous design guidelines, use the Maintain hierarchy setting. If you have not followed the synchronous design guidelines, use the Flatten the design setting.

To access hierarchy control in XST: Turn on the Advanced Property Display level in the Edit > Preferences dialog. Then look under Properties for the Synthesize process > Synthesis Options tab > Keep Hierarchy.

Hierarchy Preservation Benefits
- Easily locate problems in the code based on the hierarchical instance names contained within static timing analysis reports
- Enables floorplanning and incremental design flow
- The primary issue is optimization of combinatorial logic across hierarchical boundaries
  - The easiest way to eliminate this problem is to register the outputs of leaf-level blocks
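
Registering the outputs of a leaf-level block can be as simple as the sketch below. The module name leaf_sketch, its ports, and the logic function are hypothetical and chosen only for illustration.

```verilog
// Hypothetical leaf-level block whose only output is registered.
module leaf_sketch (
  input  wire       clk,
  input  wire [7:0] x, y,
  output reg  [7:0] result
);
  // All combinatorial logic stays inside the block boundary.
  wire [7:0] comb = (x & y) ^ (x | y);

  // The output leaves the block through a flip-flop, so no combinatorial
  // path crosses the hierarchical boundary on the output side.
  always @(posedge clk)
    result <= comb;
endmodule
```

With every leaf block written this way, preserving hierarchy costs little, because there is no cross-boundary combinatorial logic left for the tool to optimize.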

Schematic Viewer
- Allows you to view synthesis results graphically
  - Check the number of logic levels between flip-flops
  - Quickly locate net and instance names
- Works best when hierarchy has been preserved during synthesis

Error Navigation and Cross Probing
- Allows you to click on errors in Xilinx reports and cross-navigate to the problem area by using the synthesis tool
- You must set some environment variables for this to work
  - For more information, see application note XAPP406

For a list of application notes, go to http://support.xilinx.com > Documentation > App Notes.

Increasing Productivity
- Use synchronous design techniques
- Preserve hierarchy during synthesis
  - Aids in debugging and cross-referencing to report files
- Use timing-driven synthesis if your tool supports it
  - Check the synthesis report for performance estimates
- After implementation, look at timing reports and identify critical paths
  - Double-check the HDL coding style for these paths
  - Try some of the synthesis options discussed earlier
  - For paths that continually fail to meet timing, add path-specific constraints during synthesis
  - Add corresponding path-specific constraints for implementation

Synchronous design techniques are listed on the next two pages. Static timing analysis performed by the synthesis tool may not reflect the true critical paths. The performance reported by the synthesis tool will usually be accurate to within 30 percent. Therefore, it is better to implement the design once through the Xilinx tools to locate critical paths, then specify path-specific constraints in your synthesis tool.

Skills Check

Review Questions
- Which encoding scheme is preferred for high-performance state machines?
- Which Xilinx resources, generally, must be instantiated?
- List a few of the options that the synthesis tools provide to help you to increase performance
- What is the synthesis approach presented here for obtaining higher performance?

Answers
- Which encoding scheme is preferred for high-performance state machines?
  - One-hot
- Which Xilinx resources generally must be instantiated?
  - Double-data-rate output registers
  - Differential I/O
  - BUFGMUX
  - BUFGCE
  - DCM
  - Complex block RAMs

Answers
- List a few of the options that the synthesis tools provide to help you to increase performance
  - Timing-driven synthesis
  - FSM extraction
  - Retiming
  - Register duplication
  - Physical optimization
- What is the synthesis approach presented here for obtaining higher performance?
  - Follow synchronous design techniques
  - Preserve hierarchy during synthesis
  - Use timing-driven synthesis

Summary
- Your HDL coding style can affect synthesis results
- Infer functions whenever possible
- Use one-hot encoding to improve design performance
- When coding a state machine, separate the next-state logic from the state machine output equations
- Most resources are inferable, either directly or with an attribute
- Take advantage of the synthesis options provided to help you meet your timing objectives
- Use synchronous design techniques and timing-driven synthesis to achieve higher performance (more on this later)

Where Can I Learn More?
- Synthesis & Simulation Design Guide: http://support.xilinx.com > Software Manuals
- User Guides: http://support.xilinx.com > Documentation
- Technical Tips: http://support.xilinx.com > Tech Tips
  - Click Xilinx Synthesis Technology, Synopsys FPGA and Design Compiler, or Synplicity
- Synthesis documentation or online help
- Answer records:
  - Answer 7140: How to Infer ROM for Virtex
  - Answer 5800: Inferring SRL16 and SRL16E
  - Answer 3992: How to Implement a Synchronous Reset
  - Answer 10900: Specifying IOSTANDARDS for Virtex-II
  - Answer 4392: Attribute Passing

Outline
- Introduction
- Coding Tips
- Instantiating Resources
- Summary
- Appendix:
  - Inferring Logic and Flip-Flop Resources
  - Inferring Memory
  - Inferring I/Os and Global Resources
  - Inference vs. Instantiation by Synthesis Vendor

Shift Register LUT (SRL16)
- To infer the SRL, these are the primary characteristics that the code must have:
  - No set or reset signal
  - Serial-in, serial-out
- SRLs can be initialized on power-up via an INIT attribute in the Xilinx User Constraint File (UCF)

SRL16E Example

VHDL:
    -- sr is assumed to be declared as std_logic_vector(0 to 15)
    process (clk)
    begin
      if rising_edge(clk) then
        if ce = '1' then
          sr <= input & sr(0 to 14);
        end if;
      end if;
    end process;
    output <= sr(15);

Verilog:
    // sr is assumed to be declared as reg [0:15] sr;
    always @(posedge clk)
    begin
      if (ce)
        sr <= {in, sr[0:14]};
    end
    assign out = sr[15];

To infer an SRL16, simply omit the check for a clock enable.

Dynamically Addressable SRL
- SRL16/SRL16E, and SRLC16/SRLC16E
- SRLC16 has two outputs in Virtex™-II:
  - q15: final output
  - q: dynamically addressable output

SRLC16E Example

VHDL:
    -- sr is assumed to be declared as std_logic_vector(0 to 15);
    -- addr is assumed to be an integer
    process (clk)
    begin
      if rising_edge(clk) then
        if ce = '1' then
          sr <= input & sr(0 to 14);
        end if;
      end if;
    end process;
    output <= sr(15);
    dynamic_out <= sr(addr);

Verilog:
    // sr is assumed to be declared as reg [0:15] sr;
    always @(posedge clk)
    begin
      if (ce)
        sr <= {in, sr[0:14]};
    end
    assign out = sr[15];
    assign dynamic_out = sr[addr];

Virtex-II Multiplexers
- F5MUX, F6MUX, F7MUX, F8MUX primitives
- Dedicated multiplexers in the Virtex™-II CLB
  - Only F5/F6 available in the Virtex family
- 4:1 multiplexer will use one slice
- 16:1 multiplexer will use 4 slices (1 Virtex-II CLB)
- 32:1 multiplexer will use 8 slices
- No attribute needed; inferred automatically

[Figure: inputs data(0) through data(7) feed LUTs, which are combined by F5MUX and F6MUX elements to produce muxout]

F5MUX and F6MUX Example

VHDL:
    -- "out" is a reserved word in VHDL, so the output is named muxout here
    process (sel, data)
    begin
      case (sel) is
        when "000"  => muxout <= data(0);
        when "001"  => muxout <= data(1);
        when "010"  => muxout <= data(2);
        when "011"  => muxout <= data(3);
        when "100"  => muxout <= data(4);
        when "101"  => muxout <= data(5);
        when "110"  => muxout <= data(6);
        when "111"  => muxout <= data(7);
        when others => muxout <= '0';
      end case;
    end process;

Verilog:
    always @(sel or data)
      case (sel)
        3'b000:  muxout = data[0];
        3'b001:  muxout = data[1];
        3'b010:  muxout = data[2];
        3'b011:  muxout = data[3];
        3'b100:  muxout = data[4];
        3'b101:  muxout = data[5];
        3'b110:  muxout = data[6];
        3'b111:  muxout = data[7];
        default: muxout = 0;
      endcase

Flip-flop Set/Reset Conditions
- When using asynchronous set and asynchronous reset, reset has priority
- When using synchronous set and synchronous reset, reset has priority
- When using any combination of asynchronous set or reset with synchronous set or reset:
  - Asynchronous set or reset has priority (furthermore, reset has highest priority)
  - In this mode, the synchronous set, reset, or both is implemented in the LUT
  - The priority of the synchronous set versus synchronous reset is defined by how the HDL is written

If synchronous set or reset is implemented in logic, then any clock enable will also be implemented in logic. Otherwise, the enable would have precedence over the synchronous set or reset.

Flip-Flop Example

VHDL:
    process (clk, reset, set)
    begin
      if (reset = '1') then
        q <= '0';
      elsif (set = '1') then
        q <= '1';
      elsif rising_edge(clk) then
        if (sync_set = '1') then
          q <= '1';
        elsif (sync_reset = '1') then
          q <= '0';
        elsif (ce = '1') then
          q <= d;
        end if;
      end if;
    end process;

Verilog:
    always @(posedge clk or posedge reset or posedge set)
      if (reset)
        q <= 0;
      else if (set)
        q <= 1;
      else if (sync_set)
        q <= 1;
      else if (sync_reset)
        q <= 0;
      else if (ce)
        q <= d;

This flip-flop will have the following control signals, listed in order of priority:
- Asynchronous reset
- Asynchronous set
- Synchronous set, implemented in logic
- Synchronous reset, implemented in logic
- Clock enable, implemented in logic (because it must have lower precedence than synchronous set or reset)

Carry Logic
- Synthesis maps directly to the dedicated carry logic
- Access carry logic through adders, subtractors, counters, comparators (>15 bits), and other arithmetic operations
  - Adders and subtractors (SUM <= A + B)
  - Comparators (if A < B then)
  - Counters (COUNT <= COUNT + 1)

Note: Carry logic will not be inferred if arithmetic components are built with gates. For example, XOR gates for addition and AND gates for the carry will not infer carry logic.

Carry Logic Examples

VHDL:
    count <= count + 1 when (addsub = '1') else count - 1;

    if (a >= b) then
      a_greater_b <= '1';
    end if;

    -- "constant" is a reserved word in VHDL, so the operand is named const_val
    product <= const_val * multiplicand;

Verilog:
    assign count = addsub ? count + 1 : count - 1;

    if (a >= b)
      a_greater_b = 1;

    assign product = const_val * multiplicand;

MULT18x18
- To use MULT_18x18 in XST, set: syn_multstyle = block_mult (default)
  - Possible values are "block_mult" and "logic"

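CLB MULT_AND
- Synplicity: set syn_multstyle = logic
- Synopsys will use MULT_AND by default

[Figure: CLB multiplier cell in which a MULT_AND gate and a LUT combine operand bits A(x), A(x-1), B(y), and B(y+1) and feed the MUXCY_L and XORCY_L carry elements (pins DI, CI, S, LI, LO)]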

Multiplier Example

VHDL:
    library ieee;
    use ieee.std_logic_signed.all;       -- for signed multiplication
    -- use ieee.std_logic_unsigned.all;  -- use this instead for unsigned multiplication
    ...
    process (clk, reset)
    begin
      if (reset = '1') then
        product <= (others => '0');
      elsif rising_edge(clk) then
        product <= a * b;
      end if;
    end process;

Verilog:
    always @(posedge clk or posedge reset)
    begin
      if (reset)
        product <= 0;
      else
        product <= a * b;
    end

VHDL: Use the IEEE.std_logic_signed package for signed multiplication. Use the IEEE.std_logic_unsigned package for unsigned multiplication. Only one of the two packages should be in use at a time.

Depending on the settings mentioned on the previous pages, this code will infer either a MULT18x18 or logic gates that include the MULT_AND resource.

Block SelectRAM
- Based on the size and characteristics of the code, XST can automatically select the best style
- Available settings: Auto, Block, Distributed

Block RAM Inference Notes
Synthesis tools cannot infer:
- Dual-port block RAMs with configurable aspect ratios
  - Ports with different widths
- Block RAMs with enable or reset functionality
  - Always enabled
  - Output register cannot be reset
- Dual-port block RAMs with read and write capability on both ports
  - Block RAMs with read capability on one port and write on the other port can be inferred
- Dual-port functionality with different clocks on each port

These limitations on inferring block RAMs can be overcome by creating the RAM with the CORE Generator™ system or by instantiating primitives.

Block RAM Example

VHDL:
    signal mem: mem_array;
    attribute syn_ramstyle of mem: signal is "block_ram";
    ...
    process (clk)
    begin
      if rising_edge(clk) then
        addr_reg <= addr;
        if (we = '1') then
          mem(addr) <= din;
        end if;
      end if;
    end process;
    dout <= mem(addr_reg);   -- read through the registered address

Verilog:
    reg [31:0] mem[511:0] /* synthesis syn_ramstyle = "block_ram" */;
    always @(posedge clk)
    begin
      addr_reg <= addr;
      if (we)
        mem[addr] <= din;
    end
    assign dout = mem[addr_reg];

The syn_ramstyle attribute is shown here as an example of how to use attributes in Synplicity. Note that in the Verilog example, the attribute is placed before the semicolon at the end of the declaration.

VHDL: The following definitions are assumed:
    type mem_array is array(511 downto 0) of std_logic_vector(31 downto 0);
    attribute syn_ramstyle: string;
addr and addr_reg are integers. If they are defined as std_logic_vectors, then a conversion function must be used when indexing mem.

Distributed SelectRAM
- Each LUT can implement a 16x1-bit synchronous RAM
- Automatic inference when code is written with two requirements:
  - Write must be synchronous
  - Read must be asynchronous
  - However, if the read address is registered, the SelectRAM memory can be inferred and will be driven by a register
- XST: Specify block or distributed RAM, or let XST automatically select the best implementation style

Note: Dual-port SelectRAM memory can be inferred with different read and write address signals.

Distributed RAM Example

VHDL:
    signal mem: mem_array;
    ...
    process (clk)
    begin
      if rising_edge(clk) then
        if (we = '1') then
          mem(addr) <= din;
        end if;
      end if;
    end process;
    dout <= mem(addr);

Verilog:
    reg [7:0] mem[31:0];
    always @(posedge clk)
    begin
      if (we)
        mem[addr] <= din;
    end
    assign dout = mem[addr];

VHDL: The following definitions are assumed:
    type mem_array is array(15 downto 0) of std_logic_vector(7 downto 0);
addr is an integer. If it is defined as a std_logic_vector, then a conversion function must be used when referencing mem(addr).

ROM
- XST will automatically map to ROM primitives

Distributed ROM Example

VHDL:
    type rom_type is array(7 downto 0) of std_logic_vector(1 downto 0);
    constant rom_table: rom_type :=
      ("10", "00", "11", "01", "11", "10", "01", "00");
    attribute syn_romstyle: string;
    -- rom_table is a constant, so the attribute is applied with entity class "constant"
    attribute syn_romstyle of rom_table: constant is "select_rom";
    ...
    rom_dout <= rom_table(addr);

Verilog:
    reg [1:0] rom_dout /* synthesis syn_romstyle = "select_rom" */;
    always @(addr)
      case (addr)
        3'b000: rom_dout = 2'b00;
        3'b001: rom_dout = 2'b01;
        3'b010: rom_dout = 2'b10;
        3'b011: rom_dout = 2'b11;
        3'b100: rom_dout = 2'b01;
        3'b101: rom_dout = 2'b11;
        3'b110: rom_dout = 2'b00;
        3'b111: rom_dout = 2'b10;
      endcase

The syn_romstyle attribute is shown here as an example of how to use attributes in Synplicity. Note that in the Verilog example, the attribute is placed before the semicolon at the end of the declaration.

VHDL: addr is an integer. If it is defined as a std_logic_vector, then a conversion function must be used when referencing rom_table(addr).

SelectIO Standard
- Instantiate in HDL code (required for differential I/O)
  - For a complete list of buffers, see the following elements in the Libraries Guide:
    - IBUF_selectIO, IBUFDS
    - IBUFG_selectIO, IBUFGDS
    - IOBUF_selectIO
    - OBUF_selectIO, OBUFT_selectIO, OBUFDS, OBUFTDS
- Specify in the UCF file or XST constraints file (XCF)
  - Use the iostandard attribute in the XCF file
- Use the Xilinx Constraints Editor
  - In the Ports tab, check the I/O Configuration Options box

SelectIO Standard Example

VHDL:
    component IBUF_HSTL_III
      port (I: in std_logic; O: out std_logic);
    end component;
    ...
    ibuf_data_in_inst: IBUF_HSTL_III
      port map (I => data_in, O => data_in_i);

Verilog:
    /* For primitive instantiations in Verilog you must use UPPERCASE
       for the primitive name and port names */
    IBUF_HSTL_III ibuf_data_in_inst (.I(data_in), .O(data_in_i));

This example shows how to instantiate an input buffer. The internal signal has the same name as the port signal with "_i" added.

VHDL is not case-sensitive. However, using ALL UPPERCASE for instantiated Xilinx primitives can make them easier to locate when scanning through the code. Verilog is case-sensitive, so you must use ALL UPPERCASE for component names and port names when instantiating Xilinx primitives.

IOB Registers
For single-data-rate I/O registers:
- Set Map Process Properties > Pack I/O Registers/Latches into IOBs
- Use the IOB = TRUE attribute in the UCF file
  - Use on instantiated FFs or inferred FFs with a known instance name
  - Example: INST <FF_instance_name> IOB = TRUE;
- XST: Automatically packs registers in the IOB, based on timing
  - To override the default behavior, go to Synthesize > Properties > Xilinx Specific Options tab > Pack I/O Registers into IOBs and select Auto, Yes, or No

Examples of the Synplify attribute that overrides the default behavior on global or individual ports:
    define_global_attribute syn_useioff 1
    define_attribute syn_useioff{d[3:0]} 1

IOB Register Limitations
- All registers that are packed into the same IOB must share the same set or reset signal
- Output and 3-state enable registers that are packed into the same IOB must share the same clock signal
- Output and 3-state enable registers must have a fanout of one
  - Synopsys and Synplicity will automatically replicate 3-state enable registers to enable packing into IOBs
- Output 3-state enables must be active-Low

I/O Register Example
VHDL:
    process(clk, reset)
    begin
      if (reset = '1') then
        data_in_i <= '0';
        data_out  <= '0';
        out_en    <= '1';
      elsif rising_edge(clk) then
        data_in_i <= data_in;
        out_en    <= out_en_i;
        if (out_en = '0') then
          data_out <= data_out_i;
        end if;
      end if;
    end process;
Verilog:
    always @ (posedge clk or posedge reset)
      if (reset) begin
        data_in_i <= 0;
        data_out  <= 0;
        out_en    <= 1;
      end else begin
        data_in_i <= data_in;
        out_en    <= out_en_i;
        if (~out_en)
          data_out <= data_out_i;
      end
This example shows a bidirectional I/O pin using all three registers (single data rate). The internal signal has the same name as the port signal, with “_i” appended.
Assuming that, after synthesis, the instance names of the flip-flops are data_in_i, data_out, and out_en, make sure that the following constraints are in the NCF or UCF file:
    INST data_in_i IOB = TRUE;
    INST data_out IOB = TRUE;
    INST out_en IOB = TRUE;
The instance names are case-sensitive.

DDR Registers
- Input double-data-rate registers can be inferred
  - Data input must come from a top-level port
  - DDR registers must use CLK and (NOT CLK) with 50-percent duty cycle, or DLL outputs CLK0 and CLK180
  - Standard flip-flops are inferred during synthesis, but MAP will move them into the IOB
- Output double-data-rate registers must be instantiated
  - See the following elements in the Libraries Guide:
    - OFDDRCPE, OFDDRRSE (for output or 3-state enable flip-flops)
    - OFDDRTCPE, OFDDRTRSE (for output flip-flops)

Input DDR Register Example
VHDL:
    process(clk)
    begin
      if (rising_edge(clk)) then
        data_in_p <= data_in;
      end if;
    end process;

    process(clk)
    begin
      if (falling_edge(clk)) then
        data_in_n <= data_in;
      end if;
    end process;
Verilog:
    always @ (posedge clk)
      data_in_p <= data_in;
    always @ (negedge clk)
      data_in_n <= data_in;
This example shows an input DDR register on the input data_in.
Instead of using both edges of a single clock signal, you can also use two DCM outputs that are 180 degrees out of phase (such as CLK0 and CLK180).
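Output DDR registers, by contrast, must be instantiated. The following is a minimal sketch of one way to do that, assuming the OFDDRRSE primitive with the port names listed in the Libraries Guide (Q, C0, C1, CE, D0, D1, R, S). The module name and signal names are hypothetical, and tying CE high and R and S low is an illustrative choice, not something the slides prescribe.

```verilog
// Sketch only: module and signal names are illustrative, not from the slides.
// Requires the Xilinx primitive library for synthesis or simulation.
module ddr_out_sketch (
    input  wire clk0,      // assumed: clock, for example DCM CLK0 via a BUFG
    input  wire clk180,    // assumed: same clock shifted 180 degrees
    input  wire d_rise,    // data registered on the rising edge of clk0
    input  wire d_fall,    // data registered on the rising edge of clk180
    output wire ddr_q      // must drive a top-level output port
);
    OFDDRRSE ddr_out_inst (
        .Q  (ddr_q),
        .C0 (clk0),
        .C1 (clk180),
        .CE (1'b1),        // always enabled in this sketch
        .D0 (d_rise),
        .D1 (d_fall),
        .R  (1'b0),        // synchronous reset unused
        .S  (1'b0)         // synchronous set unused
    );
endmodule
```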

Global Buffers
- BUFG
  - All synthesis tools will infer a BUFG on input signals that drive the clock pin of any synchronous element
- BUFGDLL
  - Synopsys: Specify in the Ports tab of the FPGA Compiler II constraints editor
  - Synplicity: Can be inferred through synthesis by setting the attribute xc_clockbuftype = BUFGDLL
  - XST: Must instantiate
When instantiating, the port names for each primitive are as follows:
    BUFG: I, O
    BUFGDLL: I, O
    BUFGCE: I, CE, O
    BUFGMUX: I0, I1, S, O
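Using the port names listed above, a BUFGMUX instantiation might look like the following minimal sketch. The module name and signal names are hypothetical; only the primitive name and its ports (I0, I1, S, O) come from the slide.

```verilog
// Sketch only: module and signal names are illustrative, not from the slides.
// Requires the Xilinx primitive library for synthesis or simulation.
module clk_select_sketch (
    input  wire clk_a,     // first candidate clock
    input  wire clk_b,     // second candidate clock
    input  wire sel,       // clock select input
    output wire clk_out    // selected clock on the global network
);
    // Primitive name and port names must be UPPERCASE in Verilog
    BUFGMUX clk_mux_inst (
        .I0 (clk_a),
        .I1 (clk_b),
        .S  (sel),
        .O  (clk_out)
    );
endmodule
```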

DCM
- Digital Clock Manager
  - Clock de-skew
  - Frequency synthesis
  - Phase shifting
- Must be instantiated
  - Port names are as shown in diagram
DCM ports (Verilog port names are case-sensitive):
    CLKIN: Main clock input coming from an IBUFG primitive.
    CLKFB: Feedback clock for DLL operation. This input must be driven either by a BUFG that is connected to the CLK0 or CLK2X output (on-chip synchronization) or by an IBUFG (off-chip synchronization).
    RST: Asynchronous reset input.
    PSINCDEC: Used to alter the phase shift amount in VARIABLE mode.
    PSEN: Used to enable the PSINCDEC input in VARIABLE mode.
    PSCLK: Used to clock in the phase shift amount in VARIABLE mode.
    CLK0, CLK90, CLK180, CLK270: De-skewed versions of CLKIN with different amounts of phase shift.
    CLK2X, CLK2X180: Multiplied versions of CLKIN.
    CLKDV: Divided version of CLKIN (controlled with the CLKDV_DIVIDE attribute).
    CLKFX, CLKFX180: M/D multiple of CLKIN (controlled with the CLKFX_MULTIPLY and CLKFX_DIVIDE attributes).
    LOCKED: Indicates that all DCM outputs are locked.
    STATUS: Miscellaneous error flags (see the Libraries Guide for details).
    PSDONE: Indicates that the new phase shift amount is locked in when using VARIABLE mode.
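The port descriptions above suggest the usual on-chip de-skew arrangement: IBUFG into CLKIN, CLK0 through a BUFG, and that BUFG output fed back to CLKFB. The following is a minimal sketch of that wiring. The module name and signal names are hypothetical, all DCM attributes are left at their defaults, and the phase-shift inputs are tied low on the assumption that VARIABLE mode is not used.

```verilog
// Sketch only: module and signal names are illustrative, not from the slides.
// Requires the Xilinx primitive library for synthesis or simulation.
module dcm_deskew_sketch (
    input  wire clk_pad,   // clock from a top-level input pin
    input  wire rst,       // asynchronous DCM reset
    output wire clk,       // de-skewed global clock
    output wire locked     // high once the DCM outputs are locked
);
    wire clkin_buf;        // IBUFG output driving CLKIN
    wire clk0_unbuf;       // raw CLK0 before the global buffer

    IBUFG clkin_ibufg_inst (.I(clk_pad), .O(clkin_buf));

    DCM dcm_inst (
        .CLKIN    (clkin_buf),
        .CLKFB    (clk),         // feedback from the BUFG driven by CLK0
        .RST      (rst),
        .PSEN     (1'b0),        // variable phase shift unused
        .PSINCDEC (1'b0),
        .PSCLK    (1'b0),
        .CLK0     (clk0_unbuf),
        .LOCKED   (locked)
        // remaining outputs are left unconnected in this sketch
    );

    BUFG clk0_bufg_inst (.I(clk0_unbuf), .O(clk));
endmodule
```

Other outputs such as CLK90, CLK2X, or CLKFX would each need their own BUFG if they drive clock pins.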

STARTUP_VIRTEX2
- Provides three functions
  - Global Set/Reset (GSR)
  - Global Three-State for output pins (GTS)
  - User-defined configuration clock to synchronize the configuration startup sequence
- Must be instantiated
  - Port names are GSR, GTS, and CLK
Note: Using GSR is not recommended for Virtex™-II designs
- Normal routing resources are faster and plentiful
If you use the CLK pin, make sure you select the option to use a “User Clock” when generating the configuration bitstream.

Outline
- Introduction
- Coding Tips
- Instantiating Resources
- Summary
- Appendix:
  - Inferring Logic and Flip-Flop Resources
  - Inferring Memory
  - Inferring I/Os and Global Resources
  - Inference versus Instantiation by Synthesis Vendor

Synopsys FPGA Compiler II
- Can be inferred:
  - Shift register LUT (SRL16 / SRLC16)
  - F5, F6, F7, and F8 MUX
  - Carry logic
  - MULT_AND
  - Memories (ROM)
  - MULT18x18 / MULT18x18S
  - Global clock buffers (BUFG)
- Can be inferred by using constraints table:
  - SelectIO (single-ended)
  - I/O registers (single data rate)
  - Input DDR registers
  - BUFGDLL*
- Cannot be inferred:
  - Memories (RAM)
  - SelectIO (differential)
  - Output DDR registers
  - Global clock buffers (BUFGCE, BUFGMUX)
  - DCM
* When a BUFGDLL is inferred via the constraints table, only the clk0 output is available. To use phase-shifted, multiplied, or divided clocks, you must instantiate three components: IBUFG, CLKDLL or DCM, and BUFG.
Synopsys supports instantiation and does not require any library declarations. Once the target technology is specified, Synopsys tools support Xilinx primitive instantiations.

Synplicity Synplify Pro 7.3
- Can be inferred:
  - Shift register LUT (SRL16 / SRLC16)
  - F5, F6, F7, and F8 MUX
  - Carry logic
  - MULT_AND
  - Memories (distributed RAM)
  - MULT18x18 / MULT18x18S
  - Global clock buffers (BUFG)
- Can be inferred by using constraints editor or attributes:
  - Memories (distributed ROM, some block RAM*)
  - SelectIO (single-ended)
  - I/O registers (single data rate)
  - Input DDR registers
  - BUFGDLL**
Synplicity’s constraints editor is also known as SCOPE.
* Only simple block RAMs can be inferred. See the memory section in the appendix for details.
** When a BUFGDLL is inferred via attributes, only the clk0 output is available. To use phase-shifted, multiplied, or divided clocks, you must instantiate three components: IBUFG, CLKDLL or DCM, and BUFG.

Synplicity Synplify Pro 7.3
- Cannot be inferred:
  - Memories (complex block RAM)
  - SelectIO (differential)
  - Output DDR registers
  - Global clock buffers (BUFGCE, BUFGMUX)
  - DCM
When instantiating Xilinx primitives in Synplicity, you must include an HDL file with primitive information in your input HDL list. This file can be found under $SYNPLICITY/synplify/lib/xilinx. The file you use depends on the device you are targeting. Choices include:
    virtex2.v/.vhd
    virtex2p.v/.vhd
    spartan3.v/.vhd
    virtex.v/.vhd
    virtexe.v/.vhd

XST 6.2i
- Can be inferred:
  - Shift register LUT (SRL16 / SRLC16)
  - F5, F6, F7, and F8 MUX
  - Carry logic
  - MULT_AND
  - Memories (distributed ROM and RAM, block RAM*)
  - MULT18x18 / MULT18x18S
  - Global clock buffers (BUFG)
- Can be inferred by using constraints editor or attributes:
  - SelectIO (single-ended)
  - I/O registers (single data rate)
  - Input DDR registers
  - Global clock buffers (BUFGCE, BUFGMUX, BUFGDLL**)
- Cannot be inferred:
  - SelectIO (differential)
  - Output DDR registers
  - DCM
* Not all types of block RAM can be inferred. See the XST User Guide in the Online Documentation. Go to HDL Coding Techniques → RAMs.
** The BUFGCE and BUFGMUX attributes control the inference of those components. To infer a BUFGDLL, use the CLOCK_BUFFER attribute. See the Constraints Guide for more information.
For more information on instantiating Xilinx primitives when using XST, see the XST User Guide → FPGA Optimization → Virtex Primitive Support.