DSP Design Flows in FPGA


2 DSP Design Flows in FPGA

3 Objectives After completing this module, you will be able to:
- Describe the advantages and disadvantages of three different design flows
- Use HDL, CORE Generator, or System Generator for DSP depending on design requirements and familiarity with the tools
- Explain why there is a need for an integrated flow from system design to implementation
- Describe the System Generator and the tools it interfaces with
- Build a model, simulate it, generate VHDL, and go through the design flow
- Describe how Hardware in the Loop verification is beneficial in complex system design

4 Outline
- Using HDL
- Using the Xilinx CORE Generator
- Using the Xilinx System Generator for DSP
- HDL Co-Simulation
- Hardware Verification
- In System Debug
- Resource Estimator
- Summary
- Simulink Tips and Tricks

5 HDL Design Verification
[Flow diagram: design steps HDL, Synthesis, Implementation, Download; verification steps Behavioral Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification]
Implement your design using VHDL or Verilog.
In the HDL flow, two sets of code must be written: the HDL design itself and the verification code (testbench) for behavioral and functional simulation. The designer is responsible for testing and must create the verification environment.

6 Synthesis Design Verification
[Flow diagram: design steps HDL, Synthesis, Implementation, Download; verification steps Behavioral Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification]
Synthesize the design to create an FPGA netlist.

7 Implementation Design Verification
[Flow diagram: design steps HDL, Synthesis, Implementation, Download; verification steps Behavioral Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification]
Translate, place and route, and generate a bitstream to download to the FPGA.
If the design does not meet its performance target, the HDL code may have to be modified.

8 Outline
- Using HDL
- Using the Xilinx CORE Generator
- Using the Xilinx System Generator for DSP
- HDL Co-Simulation
- Hardware Verification
- Resource Estimator
- Summary
- Simulink Tips and Tricks

9 CORE Generator Design Verification
Instantiate optimized IP within the HDL code.
[Flow diagram: design steps HDL with COREGen, Synthesis, Implementation, Download; verification steps Behavioral Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification]

10 Synthesize, Implement, Download Design Verification
[Flow diagram: design steps HDL with COREGen, Synthesis, Implementation, Download; verification steps Behavioral Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification]
Synthesize, implement, and download the bitstream, as in the original HDL design flow.

11 Xilinx IP Solutions DSP Functions Math Functions Memory Functions
Key: $ = License Fee, P = Parameterized, S = Project License Available, BOLD = Available in the Xilinx Blockset for the System Generator for DSP

$P Additive White Gaussian Noise (AWGN)
$P Reed Solomon
$ 3GPP Turbo Code
$P Viterbi Decoder
P Convolution Encoder
$P Interleaver/De-interleaver
P LFSR
P 1D DCT
P 2D DCT
P DA FIR
P MAC
P MAC-based FIR filter
Fixed FFTs 16, 64, 256, 1024 points
P FFT 16- to points
P FFT - 32 Point
P Sine Cosine Look-Up Tables
$P Turbo Product Code (TPC)
P Direct Digital Synthesizer
P Cascaded Integrator Comb
P Bit Correlator
P Digital Down Converter
P Multiplier Generator
  - Parallel Multiplier
  - Dyn Constant Coefficient Mult
  - Serial Sequential Multiplier
  - Multiplier Enhancements
P Pipelined Divider
P CORDIC
P Asynchronous FIFO
P Block Memory modules
P Distributed Memory
P Distributed Mem Enhance
P Sync FIFO (SRL16)
P Sync FIFO (Block RAM)
P CAM (SRL16)
P CAM (Block RAM)

Base Functions
P Binary Decoder
P Twos Complement
P Shift Register RAM/FF
P Gate modules
P Multiplexer functions
P Registers, FF & latch based
P Adder/Subtractor
P Accumulator
P Comparator
P Binary Counter

Although the list of IP keeps growing, it is limited, and you will not always find the function you need. Several options are available in that case: build the required function from lower-level block IP, or choose a mixed mode in which part of the function is specified in HDL or schematic and the rest in IP.

12 Xilinx CORE Generator
List of available IP from or
Fully Parameterizable
The Xilinx CORE Generator is the delivery vehicle for IP. IP from Xilinx and from Alliance partners is listed in the CORE Generator, although only Xilinx LogiCORE IP can be generated from this tool. IP is fully parameterizable through a customization GUI and can be generated for any HDL or schematic flow that is officially supported by Xilinx.

13 Xilinx Smart-IP Technology
Pre-defined placement and routing enhances performance and predictability.
Performance is independent of:
- Core Placement
- Number of Cores
- Device Size
Slide labels: Relative Placement; Other logic has no effect on the core; Fixed Placement & Pre-defined Routing; Guarantees Performance; Guarantees I/O and Logic Predictability.
[Figure labels: Fixed Placement, I/Os, 200 MHz]

With relative placement of the logic within a core, you get logic predictability. Because the logic has consistent internal placement, the performance of a core remains constant regardless of its position in the device. This is the intelligent software part of Smart-IP technology. In addition to the modular routing capability, we can keep track of the relative location of a core’s logic. Hence, we can floorplan the core or fix its placement with respect to the I/O. For guaranteed performance, we can even fix the placement and predefine the routing. For example, the Xilinx DSP cores use only relative placement, but for the more performance-sensitive PCI design we use the fixed placement and predefined routing strategy.

Because the IP modules use regular local logic and interconnect as well as segmented routing, they can be placed anywhere on the device without impacting performance. For the same reason, you can place multiple copies of the same module on a device and they will all continue to function at the published performance speeds, and you can migrate an IP module from a smaller device to a larger device without degrading its performance.

14 Outputs
- .EDN (EDIF implementation netlist)
- .XCO (core implementation data file / log file)
Optional:
- .ASY Foundation or Innoveda symbols
- .VEO Verilog instantiation template
- .V Verilog behavioral simulation model
- .VHO VHDL instantiation template
- .VHD VHDL behavioral simulation model
The EDIF file is the actual netlist that is used by the place and route implementation tools. The XCO file is a CORE Generator “log” file that contains all of the parameterization options that were used to create the IP module. This XCO file can be re-run to recreate the IP module. Optional files, whose creation depends on the current CORE Generator project settings, include the VHDL and Verilog instantiation templates. These templates point to the behavioral models for this IP module.

15 Labs 1-2: Generating a MAC
You will be generating the MAC using three different methods Using VHDL and the Xilinx CORE Generator Using the Xilinx System Generator for DSP Compare the implementation results Contrast the design methodologies

16 Lab 1 Creating a MAC using a combination of VHDL and Core Generator
Become familiar with the HDL and CORE Generator design flows, which include:
- Coding a piece of HDL
- Generating CORE Generator macros
- Instantiating the macros in VHDL
- Synthesizing a design using Xilinx XST
- Implementing the design using the Xilinx ISE 6 tools
- Performing an on-chip verification with ChipScope Pro
Create a 12 x 8 MAC by generating a multiply accumulator using the CORE Generator

17 Wrap up Implementation results: 71 slices, 175 MHz
Important to notice: the global clock buffer should be instantiated explicitly, because the synthesis tool sees the core as a black box and may not know which signal is the clock.

18 Outline
- Using HDL
- Using the Xilinx CORE Generator
- Using the Xilinx System Generator for DSP
- HDL Co-Simulation
- Hardware Verification
- In System Debug
- Resource Estimator
- Summary
- Simulink Tips and Tricks

19 The Challenges for a DSP Software Platform
Industry Trends
- Trend towards platform chips (FPGAs, DSP) resulting in greater complexity
- Highly flexible systems required to meet changing standards
- Multiple design methodologies: control plane and datapath
- Challenges in modeling and implementing an entire platform
- Hardware in the loop verification is useful in complex system design, and System Generator supports it
System Design Challenges
- Leveraging legacy HDL code
- Modeling and implementing control logic and datapath
- No expert exists for all facets of system design
Notes on industry trends:
- Platform FPGAs: designers are putting more of the system onto a chip to reduce costs
- Flexibility: self-explanatory
- Multiple design methodologies: for example, embedded system design for the microprocessor and VHDL-based design for the hardware
- Challenges in modeling the entire platform: with the integration of processors, DSP, logic, and I/O, there are new challenges in effectively modeling the entire system
Notes on system design challenges: Designers want to leverage existing HDL code as they merge parts of a system together. They want the system to have both control and datapath, which means there may be a separate expert for each facet of the design, each used to a different environment.

20 MATLAB MATLAB™, the most popular system design tool, is a programming language, interpreter, and modeling environment Extensive libraries for math functions, signal processing, DSP, communications, and much more Visualization: large array of functions to plot and visualize your data and system/design Open architecture: software model based on base system and domain-specific plug-ins The MathWorks has been developing system design tools since Its latest product is MATLAB 6.5 (from MATLAB tools release 13, July, 2002). Visit The MathWorks website at for further details. Other vendors of system-level modeling packages are: Visual data flow SPW (Cadence), COSSAP (Synopsys), Ptolemy (UC Berkeley), SystemView (Elanix) Programming language based C++ SystemC, OCAPI (IMEC) C Streams-C (Gokhale et al.), Handel-C (Celoxica) Java JHDL (BYU) Relative success of each approach, at least to date, is indicated by the predominance of commercial offerings in VDF (Visual Data Flow) as compared to research activity.

21 MATLAB Frequency response of input sound file
This is an excellent example of how designers can visualize their signals at any point in their algorithm and analyze the effects their systems are having on their design. This is far more difficult, if not impossible, to do with current FPGA design tools. Another strong reason for these system-level design tools is the speed and ease of implementing algorithms, concepts and ideas. This example was executed in three lines of code in no time at all. To implement such an algorithm in an FPGA design would take large amounts of design work, code, and time. Notice the Workspace Window pane on the left. This window pane is used to view the different variables that have been created by the designer, and are accessible to algorithms. The two variables that can currently be seen are Fw (the frequency response vector from zero to the Nyquist frequency) and voice (the variable which stores the sound information). Other useful windows include the Current Directory window (a navigation window).

22 Simulink Simulink™ - Visual data flow environment for modeling and simulation of dynamical systems Fully integrated with the MATLAB engine Graphical block editor Event-driven simulator Models parallelism Extensive library of parameterizable functions Simulink Blockset - math, sinks, sources DSP Blockset - filters, transforms, etc. Communications Blockset - modulation, DPCM, etc. Simulink, The MathWorks’ visual data flow tool, presents an alternative to using programming languages for system design. This enables designers to visualize the dynamic nature of your system while illustrating their complete system in a realistic fashion with respect to the hardware design. Most hardware design starts out with a block diagram description and specification of the system, very similar to the Simulink design. The main part of Simulink is the Library browser that contains all the available building blocks to the user. This library is expandable and each block is parameterizable. Users can even create their own libraries of functions they have created. An important point of note about Simulink is that it can model concurrency in a system. Unlike the sequential manner of software code, the Simulink model can be seen to be executing sections of a design at the same time (in parallel). This notion is fundamental to implementing a high-performance hardware implementation.

23 MATLAB/Simulink Real time frequency response from a microphone: emphasizes the dynamic nature of Simulink It is also possible to interface Simulink and MATLAB together, which enables users to bring M files into their Simulink model, save signals to the MATLAB workspace, and vice versa. Use the FIND feature at the top of the library browser to search for blocks. There are many available and it is not easy to remember where they all reside.

24 Traditional Simulink FPGA Flow
[Flow diagram: the System Architect works in Simulink (System Verification) and the FPGA Designer works in HDL (Synthesis, Functional Simulation, Implementation, Timing Simulation, Download, In-Circuit Verification). A GAP separates the two sides, and equivalence between them must be verified.]
In the past, if a DSP designer wanted to target an FPGA, he would have no option but a “dual path” of development. The DSP designer writes an algorithm in pseudo-C, using filters, certain C code, and a certain precision. He may know everything about DSP and Simulink models, but may not know anything about FPGAs. Not only does he not know how to target an FPGA, he doesn’t know how to take advantage of the FPGA architecture, or how to write a design to avoid a bad FPGA implementation. When he’s done with his DSP design, he may have a working model in Simulink, but he must design the same thing in VHDL, or he gives his design to an FPGA implementer (who may know nothing about DSP) who writes the VHDL for him. The implementer might end up using a core that doesn’t do exactly what the designer wants. Not being a DSP expert, the FPGA implementer is just trying to translate the pseudo-code that came to him into VHDL for an FPGA. There is also no way to co-simulate: one is simulating in C in MATLAB, the other is simulating in VHDL in a behavioral simulation. It’s only when they get into the lab and test the board, late in the process, that they find out something’s wrong.

25 System Generator for DSP v7.1 – An Overview
Industry’s system-level design environment (IDE) for FPGAs Integrated design flow from Simulink to bit file Leverages existing technologies Matlab/Simulink R13.1 or R14 from The MathWorks HDL synthesis IP Core libraries FPGA implementation tools Simulink library of arithmetic, logic operators and DSP functions (Xilinx Blockset) Bit and cycle true to FPGA implementation Arithmetic abstraction Arbitrary precision fixed-point, including quantization and overflow Simulation of double precision as well as fixed point System Generator is a plug-in to the Simulink environment, adding a Blockset to the Simulink library browser: Blockset v6.1 has over 61 blocks and targets 28 LogiCORES. They are listed in nine categories: Basic Elements Communication Control Logic Data Types DSP Index Math Memory Tools Each block is bit and cycle true. This means that for any injection of data into a SysGen model (via a Gateway In) and any extraction of data from a SysGen model (via a Gateway Out), the bits at the gateways match the corresponding bits in hardware at the sample times defined in Simulink. To provide system-level designers with a portal into the FPGA, System Generator taps into existing technologies, leveraging the MathWorks tool suite to provide the foundations for system design and the Xilinx FPGA design flow to implement the design.

26 System Generator for DSP v7.1 – An Overview
VHDL code generation for Virtex-4™, Virtex-II Pro™, Virtex™-II, Virtex™-E, Virtex™, Spartan™-3, Spartan™-IIE and Spartan™-II devices Hardware expansion and mapping Synthesizable VHDL with model hierarchy preserved Mixed language support for Verilog Automatic invocation of CORE Generator to utilize IP cores ISE project generation to simplify the design flow HDL testbench and test vector generation Constraint file (.xcf), simulation ‘.do’ files generation HDL Co-Simulation via HDL C-Simulation Verification acceleration using Hardware in the Loop Along with the VHDL code that gets generated for the Simulink model is: - .npl file: An ISE project file, providing a seamless entry into the FPGA design flow - .xcf: User Constraints File to assist the implementation in obtaining maximum performance - A testbench: The testbench uses the stimulus provided in Simulink and also compares the HDL simulation results with the results from Simulink to verify equivalence. - .do files: For MTI users, scripts are created to make compilation and simulation simple - CORE Generator is automatically invoked for the elements in the design that use COREGen components. See “help” under the parameters GUI to see which Core will be used for a particular block.

27 Mathworks R14 Compliant!
- System level modeling tool: Release 13.1 or R14
- Xilinx implementation tools: ISE 7.1i
- Synthesis:
  - XST and Project Navigator within ISE 7.1i
  - Leonardo Spectrum LS 2003b.35 or later
  - Synplify v7.2 or later
- HDL Simulation: ModelSim 5.7e or later
Here is a list of tools that System Generator v3.1 is compatible with.

28 System Generator for DSP Platform Designs
[Figure label: PCI/JTAG]
System Generator 6.1 provides a convenient way to perform HDL co-simulation and Hardware in the Loop simulation using the Black Box block. The Black Box can be used to incorporate hardware description language (HDL) models into System Generator.

29 System Generator Based Design Flow
[Flow diagram: MATLAB/Simulink, System Generator, HDL, Synthesis, Implementation, Download; verification steps System Verification, Functional Simulation, Timing Simulation, In-Circuit Verification]
This diagram shows the first of three ways to design a system. Here the design uses only Xilinx System Generator blocks, and System Generator produces a synthesizable design that can be implemented using Project Navigator in Xilinx ISE. No user-defined blocks are included. It is a quick way to design a system.

30 System Generator Based Design Flow
[Flow diagram: MATLAB/Simulink, System Generator, HDL, Synthesis, Implementation, Download; verification steps System Verification, HDL Co-Simulation, Functional Simulation, Timing Simulation, In-Circuit Verification. Files used: configuration file, VHDL, IP, constraints file.]
This diagram shows the second of three ways to design a system. This is a typical design flow for HDL Co-Simulation. The design uses a Black Box to include the user’s VHDL code or an IP core alongside Xilinx System Generator blocks, and System Generator produces a synthesizable design that can be implemented using Project Navigator in Xilinx ISE. The flow also uses the ModelSim block, a helper block that invokes the ModelSim simulator to simulate the HDL. The simulator’s output is fed back to Simulink, and the results can be displayed using Simulink sinks.

31 System Generator Based Design Flow
[Flow diagram: MATLAB/Simulink, System Generator, HDL, Synthesis, Implementation, Download; verification steps System Verification, Functional Simulation, Timing Simulation, In-Circuit Verification. Files used: configuration file, VHDL, IP, constraints file.]
This diagram shows the third of three ways to design a system. This is a typical design flow for Hardware in the Loop accelerated verification. The design can combine a Black Box, the user’s VHDL code or an IP core, a hardware interface block, and Xilinx System Generator blocks. In this flow, the user generates a synthesizable design that is implemented using Xilinx XFLOW, which produces a configuration bit file and a Hardware in the Loop library block. Once that block is added to the design, running the simulation downloads the bit file to the hardware and obtains the responses from the hardware in real time. The hardware’s output is fed back to Simulink, and the results can be displayed using Simulink sinks.

32 Creating a System Generator Design
Invoke Simulink library browser To open the Simulink library browser, click the Simulink library browser button or type “Simulink” in MATLAB console The library browser contains all the blocks available to designers Start a new design by clicking the new sheet button The .mdl (MATLAB Description Language) file is the main design file for Simulink designs. You can also open .mdl files directly through the MATLAB console GUI or from the MATLAB command line. Make sure your MATLAB current directory path is always pointing to your working directory. You can “cd” to the correct directory.

33 Creating a System Generator Design
Build the design by dragging and dropping blocks from the Xilinx blockset onto your new sheet. Design Entry is similar to a schematic editor Connect up blocks by pulling the arrows on the sides of each block The next few slides show the process that a system designer may use to get his design into an FPGA, via the System Generator and Simulink. Right-click on a block to format blocks (rotate, drop shadow, font) and to change foreground color and background color.

34 Finding Blocks Use the Find feature to search ALL Simulink libraries
Xilinx blockset has nine major sections:
- Basic Elements: counters, delays
- Communication: error correction blocks
- Control Logic: MCode, Black Box
- Data Types: Convert, Slice
- DSP: FDATool, FFT, FIR
- Index: all Xilinx blocks (a quick way to view all blocks)
- Math: multiply, accumulate, inverter
- Memory: Dual Port RAM, Single Port RAM
- Tools: ModelSim, Resource Estimator
Because a vast array of blocks is already accessible from the Simulink libraries, locating a specific block can be daunting or frustrating. Here are two ways to find a specific block:
- Use the FIND feature at the top of the Simulink browser.
- Go to the MATLAB help and search for a keyword. This searches all the help documents and may find the block you are looking for.

35 Configure Your Blocks
Double-click a block or go to Block Parameters to view its configurable parameters:
- Arithmetic Type: Unsigned or twos complement
- Implement with Xilinx Smart-IP Core (if possible) / Generate Core
- Latency: Specify the delay through the block
- Overflow and Quantization: Users can saturate or wrap on overflow, and truncate or round on quantization
- Override with Doubles: Simulation only
- Precision: Full, or the user can define the number of bits and the position of the binary point for the block
- Sample Period: Can be inherited with a “-1” or must be an integer value
Note: While all parameters can be simulated, not all are realizable.

Further notes:
Sampling Period: The data streams are processed at a specific sample rate, or clock period, as they flow through a dataflow system such as Simulink. Typically, each block detects the input sample rate and produces the correct sample rate on its output. The Xilinx Blockset elements Up Sample and Down Sample provide a means to increase or decrease sample rates. If you select Use Explicit Sample Period rather than the default, you can set the sample period required for all the block outputs. This is useful when implementing features such as feedback loops. In a feedback loop, System Generator cannot determine a default sample rate, because the loop makes an input sample rate dependent on a yet-to-be-determined output sample rate. System Generator therefore requires you to supply a hint to establish sample periods throughout a loop.
Latency: Defines the number of input sample periods required for an input to affect a block output. This sample period may correspond to multiple clock cycles in the corresponding FPGA implementation, for example when the hardware is overclocked with respect to the Simulink model. System Generator does not perform extensive pipelining; additional latency is usually implemented as a shift register on the output of the block.
Override with Doubles: This bypasses the fixed-point simulation and executes the complete simulation in “doubles”. Although not realizable, this can be useful in examining the effects of quantization on your design.
Experiment with these parameters on a simple sine wave to gain further insight into how they affect the results of a system. For details of all the parameters, see the help documentation provided with System Generator.
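The Latency note above says that additional latency is usually implemented as a shift register on the block output. A minimal Python sketch of that behavior follows; it is only an illustration, not System Generator code. The class name `LatencyModel`, its parameters, and the initial register value of 0 are assumptions made for this example.

```python
from collections import deque


class LatencyModel:
    """Wrap a combinational function with an output shift register.

    Hypothetical helper for illustration only: `latency` is the number of
    sample periods before an input affects the output.
    """

    def __init__(self, func, latency, initial=0):
        self.func = func
        # The shift register starts out filled with the initial value.
        self.pipeline = deque([initial] * latency)

    def step(self, *inputs):
        """Advance one sample period and return the delayed output."""
        result = self.func(*inputs)
        if not self.pipeline:
            # Latency 0: the result appears in the same sample period.
            return result
        self.pipeline.append(result)
        return self.pipeline.popleft()


# An adder with a latency of 2 sample periods.
adder = LatencyModel(lambda a, b: a + b, latency=2)
outputs = [adder.step(a, b) for a, b in [(1, 2), (3, 4), (5, 6), (7, 8)]]
print(outputs)
```

The first two outputs are the register's initial contents; the sum of the first input pair appears on the third sample.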

36 Values Can Be Equations
You can also enter equations in the block parameters, which can aid calculation and your own understanding of the model parameters.
The equations are calculated at the beginning of a simulation.
Useful MATLAB operators:
+        add
-        subtract
*        multiply
/        divide
^        power
pi       the constant π
exp(x)   exponential (e^x)
This can be very useful for entering the frequency of a sine wave. Instead of calculating the value in radians per second, which is 2*pi*f, you can enter the complete equation with your value of frequency. MATLAB will calculate the result at the beginning of the simulation. You can even enter f as a variable and specify “f” from the MATLAB workspace. More on that later with the filter lab.
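The idea of evaluating a parameter expression once, using variables from a workspace, can be sketched in Python. This is only an analogy to what MATLAB does; the function `evaluate_parameter`, the `workspace` dictionary, and the frequency value are assumptions made for this example.

```python
import math


def evaluate_parameter(expression, workspace):
    """Evaluate a parameter expression once, before the simulation starts.

    Hypothetical helper: `workspace` stands in for the MATLAB workspace.
    """
    names = {"pi": math.pi, "exp": math.exp}
    names.update(workspace)
    # MATLAB writes power as ^, Python writes it as **.
    python_expression = expression.replace("^", "**")
    # Builtins are disabled so only the listed names are visible.
    return eval(python_expression, {"__builtins__": {}}, names)


workspace = {"f": 1000.0}  # assumed sine frequency in Hz
omega = evaluate_parameter("2*pi*f", workspace)
print(omega)
```

Changing `f` in the workspace and re-evaluating gives the new angular frequency without editing the expression itself.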

37 Important Concept 1: The Numbers Game
Simulink uses a “double” to represent numbers in a simulation. A double is a 64-bit floating-point number (IEEE 754 double precision). Because the binary point can move, a double covers a very wide range with fine resolution: a desirable property, but not efficient or realistic for FPGAs.
Xilinx Blockset uses n-bit fixed-point numbers (twos complement optional).
[Figure: bit weights of a Fix_16_13 number. Integer part: -2^2, 2^1, 2^0. Fraction part: 2^-1, 2^-2, 2^-3, 2^-4, 2^-5, 2^-6, 2^-7, 2^-8, 2^-9, 2^-10, 2^-11, 2^-12, 2^-13. Value = …]
Format = Fix_16_13 (Sign: Fix = signed value, UFix = unsigned value)
Format = Sign_Width_Binary point position counted from the LSB
Understanding binary numbers is very important when using SysGen and fixed-point systems. Resources are valuable and cost money in an FPGA, so the fewer bits we use to do the same job, the cheaper our solution.
NOTE: Go to the “Format” menu and check “Port Data Types” to view the type of numbers being used on each bus. This is very useful in analyzing bit growth in a system. The format will be FIX or UFIX (for a signed or unsigned fixed-point number), then the width of the fixed-point number, and then the position of the binary point counted from the LSB, for example FIX_16_13.
Design Hint: Always try to maximize the dynamic range of the design by using only the required number of bits.
A conversion is therefore required whenever Xilinx blocks communicate with Simulink blocks (Xilinx Blockset -> MATLAB I/O -> Gateway In/Out).
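To make the Fix_16_13 notation concrete, the following Python sketch computes the range and resolution of a fixed-point format and decodes a raw bit pattern. It illustrates twos complement arithmetic only and is not System Generator code; the function names are assumptions made for this example.

```python
def fix_format_info(width, frac_bits, signed=True):
    """Return (min_value, max_value, resolution) for a Fix/UFix format."""
    resolution = 2.0 ** -frac_bits
    if signed:
        min_code = -(1 << (width - 1))
        max_code = (1 << (width - 1)) - 1
    else:
        min_code = 0
        max_code = (1 << width) - 1
    return min_code * resolution, max_code * resolution, resolution


def from_raw(raw, width, frac_bits, signed=True):
    """Interpret an unsigned bit pattern as a fixed-point value."""
    if signed and raw >= (1 << (width - 1)):
        raw -= 1 << width  # the MSB carries a negative weight
    return raw / (1 << frac_bits)


lo, hi, step = fix_format_info(16, 13)
print(lo, hi, step)

# 0x6000 sets the 2^1 and 2^0 bits; 0xE000 sets -2^2, 2^1 and 2^0.
print(from_raw(0x6000, 16, 13), from_raw(0xE000, 16, 13))
```

With 3 integer bits (including the sign) and 13 fraction bits, the format spans -4 up to just under +4 in steps of 2^-13.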

38 What About All Those Other Bits?
The Gateway In and Out blocks support parameters to control the conversion from double precision to N-bit fixed-point precision.
[Figure: a DOUBLE value shown with bit weights -2^6 through 2^-13 is converted to FIX_12_9, which keeps bit weights -2^2 through 2^-9. The MSBs that do not fit are handled by the OVERFLOW setting, and the LSBs that do not fit are handled by the QUANTIZATION setting.]
OVERFLOW:
- Wrap
- Saturate
- Flag Error
QUANTIZATION:
- Truncate
- Round
These are common parameters found on all blocks, but they are especially relevant to the gateway blocks.
Quantization in this context refers to how the tools handle the LSBs of numbers. When a floating-point number is converted to a fixed-point number, precision beyond the fixed-point LSB is lost. Users must decide whether to cut the extra precision off (truncate) or to round the result to the nearest representable value.
Overflow in this context refers to how the tools handle the MSBs of numbers. When a floating-point number is converted to a fixed-point number and the number is too large to be represented in the fixed-point format, users must decide whether to Wrap (the MSBs are dropped) or to Saturate the result (as when a counter reaches its maximum value), so that the maximum representable number is used for any value greater than it.
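The overflow and quantization options described above can be sketched in Python for a signed format such as FIX_12_9. This is a simplified model, not the Gateway In implementation: the function name `quantize`, the option strings, and the choice of "round half up" for the Round mode are assumptions made for this example.

```python
import math


def quantize(value, width, frac_bits, quantization="truncate", overflow="wrap"):
    """Convert a double to a signed fixed-point value (returned as a float)."""
    scaled = value * (1 << frac_bits)

    # Quantization: decide what happens to the bits below the LSB.
    if quantization == "truncate":
        code = math.floor(scaled)        # drop the extra LSBs
    elif quantization == "round":
        code = math.floor(scaled + 0.5)  # assumed: round half up
    else:
        raise ValueError("unknown quantization mode")

    # Overflow: decide what happens when the MSBs do not fit.
    min_code = -(1 << (width - 1))
    max_code = (1 << (width - 1)) - 1
    if overflow == "saturate":
        code = max(min_code, min(max_code, code))
    elif overflow == "wrap":
        code = (code - min_code) % (1 << width) + min_code  # drop the MSBs
    elif overflow == "flag":
        if not min_code <= code <= max_code:
            raise OverflowError("value does not fit in the format")
    else:
        raise ValueError("unknown overflow mode")

    return code / (1 << frac_bits)


print(quantize(0.3, 12, 9, quantization="truncate"))
print(quantize(0.3, 12, 9, quantization="round"))
print(quantize(5.0, 12, 9, overflow="saturate"))
print(quantize(5.0, 12, 9, overflow="wrap"))
```

FIX_12_9 covers -4 up to just under +4, so 5.0 overflows: Saturate clamps it to the largest representable value, while Wrap drops the upper bits and returns a value of the opposite sign.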

39 Other Type: Boolean
The Xilinx Blockset also uses the type Boolean for control ports like CE and RESET. The Boolean type is a variant of the 1-bit unsigned number in that it will always be defined (High or Low). A 1-bit unsigned number can become invalid; a Boolean type cannot. This type should be used for all of the control ports and logic in your designs. It is also the output type of a number of blocks, such as the relational operator and the logical gates, because the results of these blocks can only be high or low.

40 Fractional Number Formats
Using the technique shown, convert the following fractional values…
Define the format of the following twos complement binary fraction and calculate the value it represents:
1 1 1 1 1 1 1
Format = < _ _ >
Value =
What format should be used to represent a signal that has the following properties? Fill in the table:
a) Max value: +1, Min value: -1, quantized to 12-bit data. Format = < _ _ >
b) Max value: 0.8, Min value: 0.2, quantized to 10-bit data. Format = < _ _ >
c) Max value: 278, Min value: -138, quantized to 11-bit data. Format = < _ _ >
Hint 1: Use the calculator on the PC in scientific mode to convert between decimal, hex, and binary.
Hint 2: Ask yourself what the maximum number is that you can obtain from the operation fix_12_9 + fix_8_3.
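Hint 2 is about bit growth. One way to reason about it is sketched below in Python: align the binary points, then allow one extra integer bit for the carry. This is a general fixed-point rule applied as an assumption here, not a statement of how any particular Xilinx block sizes its output; the function names are hypothetical.

```python
def value_range(fmt):
    """Return (min, max) for a signed format given as (width, frac_bits)."""
    width, frac_bits = fmt
    scale = 1 << frac_bits
    return -(1 << (width - 1)) / scale, ((1 << (width - 1)) - 1) / scale


def add_format(fmt_a, fmt_b):
    """Format that holds a + b without overflow or loss of precision."""
    width_a, frac_a = fmt_a
    width_b, frac_b = fmt_b
    frac_bits = max(frac_a, frac_b)  # keep the finest resolution
    # Integer bits (sign included) of the wider operand, plus one carry bit.
    int_bits = max(width_a - frac_a, width_b - frac_b) + 1
    return int_bits + frac_bits, frac_bits


a, b = (12, 9), (8, 3)
largest_sum = value_range(a)[1] + value_range(b)[1]
smallest_sum = value_range(a)[0] + value_range(b)[0]
result = add_format(a, b)
print(largest_sum, smallest_sum, result, value_range(result))
```

The largest and smallest possible sums both fall inside the range of the computed result format, which is the check the hint asks for.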

41 Creating a System Generator Design
I/O blocks are used as the interface between the Xilinx Blockset and other Simulink blocks.
[Figure labels: Simulink sources; SysGen blocks realizable in hardware; Simulink sinks and library functions]
A System Generator design can be dissected into distinct sections. The “hardware realizable” section is shown in BLUE; this is the section that will go into the FPGA. The YELLOW blocks represent the gateways into and out of the Xilinx Blockset, which are needed because the Xilinx blocks must be implemented in fixed-point arithmetic. The gateways also correspond to the inputs and outputs of your VHDL top-level entity (the pins of the device). Standard Simulink blocks have no color, and most designs will use the Simulink/DSP sources and sinks to drive and test the design. Any Simulink block (although not hardware-realizable unless manually created in VHDL) can be interfaced to the Xilinx Blockset with the aid of the gateway blocks, which convert between “doubles” and “fixed-point numbers.” The System Generator token is required at the top level of any design that uses Xilinx blocks in order for it to be simulated.

42 Important Concept 2: Sample Period
Every SysGen signal must be “sampled”; transitions occur at equidistant discrete points in time called sample times.
Each block in a Simulink design has a “Sample Period”, which corresponds to how often that block’s function is calculated and its results are output.
This sample period must be set explicitly for:
- Gateway In
- Blocks without inputs (note: constants are idiosyncratic)
The sample period can be “derived” from input sample times for other blocks.
- A sample period of “0” equates to an analog signal. This is not supported by Xilinx blocks.
- A sample period of “-1” means that the block inherits the sampling frequency of the data input.
- An example of blocks without inputs: counters

43 Important Concept 2: Sample Period
The units of the sample period can be thought of as arbitrary, BUT many Simulink source blocks do have a notion of time. For example, a sample period of 1/44100 means the block’s function will be executed every 1/44100 of a second.
Remember the Nyquist theorem (Fs ≥ 2 fmax) when setting sample periods.
The sample period of a block DIRECTLY relates to how that block will be clocked in the actual hardware. More on this later.
Plenty can go wrong when you are simulating a Simulink design if you do not have a firm grasp of the sample periods used in the design and how they can affect your simulation results. For example, if you create a sine wave with a sample period of 1 (which is 1 s) and the frequency of the sine wave is 10 Hz (period = 0.1 s), you will see a meaningless number on the output. This is because your signal has gone through 10 oscillations before one sample of the signal has even occurred.
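The sine wave example above can be reproduced numerically. The Python sketch below samples a 10 Hz sine at a period of 1 s and at a period of 1/100 s, and checks the Nyquist condition in the form the slide states it. The function names and the choice of 1/100 s as the "good" period are assumptions made for this example.

```python
import math


def sample_sine(freq_hz, sample_period_s, num_samples):
    """Return samples of sin(2*pi*f*t) taken every sample_period_s seconds."""
    return [
        math.sin(2 * math.pi * freq_hz * n * sample_period_s)
        for n in range(num_samples)
    ]


def satisfies_nyquist(freq_hz, sample_period_s):
    """Check Fs >= 2 * fmax, with Fs = 1 / sample period."""
    return (1.0 / sample_period_s) >= 2.0 * freq_hz


coarse = sample_sine(10.0, 1.0, 5)        # sample period of 1 s
fine = sample_sine(10.0, 1.0 / 100.0, 5)  # sample period of 1/100 s

print(satisfies_nyquist(10.0, 1.0), coarse)
print(satisfies_nyquist(10.0, 1.0 / 100.0), fine)
```

With a 1 s period every sample lands on a whole number of cycles, so the samples are all (numerically almost) zero and carry no information about the 10 Hz signal. With a 1/100 s period the samples trace out the waveform.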

44 Setting the Global Sample Period
The Simulink System Period MUST be set in the System Generator token. For single rate systems it will be the same as the Sample Periods set in the design. More on Multi Rate designs later Sample Period = 1

45 SysGen Token Master Controls Slave Controls
“Simulink System Period” MUST be set correctly for simulation to work.
ALL designs must have a System Generator token present. System Generator also allows a design to contain several System Generator tokens, which makes it possible to generate and test lower levels of the design.
The Simulink System Period must be set correctly for the simulation to execute. This value is the smallest period at which the system should run, so that all other sample periods in the design can be determined from it. In hardware, the Simulink System Period corresponds to the system clock (CLK) that drives the design. Hence, the FPGA System CLK Period requires a value in ns to pass on to the timing constraints. These in turn direct the place-and-route tools to implement the design so as to achieve the timing performance the user wants.
A System Generator block that lies in the scope of another System Generator block is called a slave; otherwise, it is called a master. Most system parameters can be set only in a master block; System Generator will automatically synchronize slave blocks to the parameters specified in their master block. It is of course possible to have multiple master blocks in a Simulink model.
Because the system parameters specified in the System Generator block affect the Simulink behavior, the hardware realization, or the relationship between the two for every block in the Xilinx Blockset, every element in the blockset must lie in the scope of a System Generator block. Consequently, every Simulink model containing any element from the blockset must contain at least one System Generator block.
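One way to picture a period from which "all other sample periods can be determined" is as the greatest common divisor of the sample periods in the design. The Python sketch below takes that reading as an assumption (the slides do not spell out the rule), and the function name `system_period` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def system_period(sample_periods):
    """Largest period that divides every sample period exactly (assumed rule)."""
    fracs = [Fraction(p) for p in sample_periods]
    num = reduce(gcd, (f.numerator for f in fracs))
    den = reduce(lcm, (f.denominator for f in fracs))
    return Fraction(num, den)

periods = [Fraction(1), Fraction(2), Fraction(3, 2)]
base = system_period(periods)
ratios = [p / base for p in periods]   # system clock ticks per sample, per block
print(base, ratios)
```

For a single-rate design every period is the same, so the result equals the sample period used throughout the design, which matches the slide on setting the global sample period.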

46 Using the Scope
- Click Properties to change the number of axes displayed and the time range value (X-axis)
- Use the Data History tab to control how many values are stored and displayed on the scope; output can also be directed to the workspace
- Click Autoscale to quickly let the tools configure the display to the correct axis values
- Right-click on the Y-axis to set its value
This will be the most frequently used Simulink block. To display multiple signals on one scope, use the “MUX” block to combine signals (Simulink → Signals & Systems). For more information on using Simulink blocks, see the help documentation under the parameterization GUI of each block.

47 Design and Simulate in Simulink
Push “play” to simulate the design. Go to “Simulation Parameters” under the “Simulation” menu to control the length of simulations This example shows a Costas Loop, which is used in communications to account for Doppler shifts in transmitted signals. The eye from the eye diagram is clearly open and, in the absence of any channel impairments, the receiver can easily make correct symbol decisions using this waveform. This diagram is produced by plotting segments of successive matched filter outputs on top of each other. You can enter “inf” into the end time for the length of simulation so that a simulation will run forever.

48 Generate the VHDL Code
Once complete, double-click the System Generator token:
- Select the target device
- Select to generate the testbench
- Set the System clock period desired
- Generate the VHDL
Once a design is completed and successfully simulated, double-click the System Generator token, which should reside on the highest hierarchy level that includes Xilinx blocks. Double-clicking opens the Options window, where you can specify the target device, the synthesis tool you will be using, and whether testbench generation is desired. Warning: if “create testbench” is selected, the simulation will be run again to capture the DAT files and create the testbench.
The Simulink System Period is the period at which the system must run so that every block in the design achieves its desired sample period. More on this later (multi-rate systems module).
You must set the System Clock Period in this GUI. This value translates to the period constraint in the UCF file; PAR will aim for this value when laying out the design. You can also specify whether you want cores to be generated (you may not want to if this has been done before, to save time) and globally specify simulation in doubles, if desired. Click the Generate button and System Generator will create all the files that were outlined earlier.

49 System Generator Output Files
Design files
- .VHD : VHDL design files
- .EDN : Core implementation file
- .XCF : Xilinx constraints file for timing constraints
Project files
- .NPL : Project Navigator project file
- .TCL : Scripts for Synplify and Leonardo project creation
Simulation files
- .DO : Simulation scripts for MTI
- .DAT : Data files containing the test vectors from System Generator
- .VHD : Simulation testbench
System Generator generates many different files, the majority of which are VHDL files:
- .VHD : Apart from design_name_testbench.vhd, these are all the VHDL design files that were generated for your design. Note how the hierarchy of your design is maintained in the names of the modules.
- .EDN : Core implementation files (see the module on CORE Generator for further details). All other core files will be located in the generated “corework” directory.
- .XCF : This file (the user constraints file) contains the timing constraints for your design. It is very important if you wish to achieve the performance specified. The xcf file is generated for XST. The ncf file is generated for the Synplify and Leonardo Spectrum synthesis tools.
- .NPL : Project Navigator project file. Crucial to making the flow more “push button”.
- .TCL : These Tcl scripts are for Synplify and Leonardo project creation.
- .DO : There are four DO files for ModelSim. The ISE project points to each DO file automatically; they are used to run functional, post-translate, post-map, and post-PAR simulation respectively.
- .DAT : These are the data files containing the test vectors from System Generator.
- .VHD : design_name_testbench is the verification testbench. It uses the .DAT files to verify the simulation results.

50 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP HDL Co-Simulation Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

51 HDL Co-simulation Allows Import of HDL Code
- Being able to include new or legacy modules is essential for many DSP system designers
- HDL modules can be imported into Simulink
- “Black box” function allows designers to import HDL
- Single HDL simulator for multiple black boxes
- HDL modules can be simulated in Simulink to significantly reduce development time
- HDL is co-simulated transparently
- HDL simulated using the industry-standard ModelSim tool from Mentor Graphics directly from the Simulink framework
Another new feature of SG 6.1 is the ability for designers to import their legacy HDL code into Simulink via a black box and simulate it as part of the system-level simulation. When the Simulink simulation reaches the black box, the tool automatically invokes Mentor’s ModelSim and simulates the HDL. Results are stored in memory and then fed back into the Simulink simulation when needed. This saves development time, resources, and cost, as designers no longer need to write S-functions for Simulink.

52 Import HDL code Drag a Black Box into the model Configuration Wizard
detects VHDL files & customizes block

53 Co-Simulate with ModelSim
Select ModelSim Simulation Mode Drag a ModelSim block into the model Simulink opens ModelSim and co-simulates

54 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP HDL Co-Simulation Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

55 Hardware-in-the-Loop Reduces Design Time & Cost
- Configure any development board for hardware-in-the-loop using the JTAG header in <20 minutes
- Automatically create the FPGA bit-stream from Simulink
- Transparent use of FPGA implementation tools
- Accelerate and verify the Simulink design using FPGA hardware
- Mirrors traditional DSP processor design flows
- Combine with black box to simulate HDL & EDIF
There are 2 main benefits of hardware in the loop:
1. Designers can now verify their designs in hardware without leaving Simulink. The traditional flow was Simulink > System Generator > Synthesis > ISE > bitstream. Now designers can do all this without leaving Simulink and even feed the results from hardware back into Simulink. This mirrors traditional DSP design flows such as TI’s (although theirs is software based).
2. Because of hardware in the loop, designers can accelerate the simulation when needed without the need for expensive emulation hardware.

56 Create Bit-stream
Step 1: Select Target H/W Platform. Step 2: Generate Bit-stream.

57 Co-Simulate in Hardware
Step 3 contd.: Post-generation script creates a new library containing a parameterized run-time co-simulation block.
Step 4: Copy the co-simulation run-time block into the original model.
Step 5: Simulate for verification.

58 Hardware in the Loop Performance Results
Single Step Clock Mode (bit and cycle accurate)

Application                              Software Simulation Time (s)   Hardware Simulation Time (s)   Speed-up
Image Filtering                          676                            6                              112X
QAM Demodulator + Extension              1203                           18                             67X
5 x 5 Image Filter                       170                            4                              43X
Cordic Arc Tangent                       187                            27                             7X
Additive White Gaussian Noise Channel    600                            80                             7.5X

There are many possible synchronization techniques for the interface between FPGA hardware and Simulink. One technique uses a single step clock to keep the hardware in lockstep with the software simulation. This can be achieved by providing a single clock pulse to the hardware for each simulation cycle. Using this technique enables the user to perform incremental design and verification. On the other hand, when a single step clock is the only method used for clocking the FPGA design, the communication overhead between hardware and Simulink (further exacerbated by bus latency) can, for some designs, prohibitively limit the effective processing rate.
Simulation speed can be greatly increased by allowing the hardware to process more than one set of input samples at a time. One way to accomplish this is to provide a free running clock to the design under test, using an explicit synchronization mechanism (e.g. a flag implemented as a memory mapped register) to coordinate data transfers between Simulink and the hardware. The inputs and outputs of the design can then be written to and sampled asynchronously.
Free Running Clock Mode: A free running clock is provided to the design, so the hardware no longer runs in lockstep with the software. The test is started, and after some time a 'done' flag is set so that the results can be read from the FPGA and displayed in Simulink. Using this hardware co-simulation method, designers can achieve up to 6 orders of magnitude performance enhancement over the original software simulation.
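The speed-up figures for single step clock mode are the ratio of software to hardware simulation time. A short Python sketch using the times listed above shows the arithmetic; since the listed times are whole seconds, the computed ratios land close to, but not always exactly on, the reported figures.

```python
# (software seconds, hardware seconds) as listed for single step clock mode
results = {
    "Image Filtering": (676, 6),
    "QAM Demodulator + Extension": (1203, 18),
    "5 x 5 Image Filter": (170, 4),
    "Cordic Arc Tangent": (187, 27),
    "Additive White Gaussian Noise Channel": (600, 80),
}

def speed_up(software_s, hardware_s):
    """Ratio of software simulation time to hardware co-simulation time."""
    return software_s / hardware_s

speedups = {name: speed_up(sw, hw) for name, (sw, hw) in results.items()}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}X")
```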

59 Choice of Target Hardware
Hardware-in-the-loop development platforms:
- Xilinx XtremeDSP Development kit
- Multimedia Board
- Distributors: Avnet, Insight, Nu Horizons
- Key board vendors: Alphadata, Annapolis, Nallatech, Lyrtech…
- You? Configure your JTAG-based board in 20 minutes
The new API for System Generator for DSP means that designers can now put their own target hardware into the loop. Customers who have no specific hardware of their own can purchase one of a number of boards from Xilinx DSP partners: AlphaData, Annapolis, Nallatech and Lyr. Boards from Spectrum and Insight will follow shortly.

60 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP HDL Co-Simulation Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

61 In-System Debug at Near System Speeds
Insert Chipscope block into Simulink design Configure FPGA using JTAG interface Perform in-system debug at near system speeds

62 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP HDL Co-Simulation Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

63 Resource Estimator
The block provides fast estimates of the FPGA resources required to implement the subsystem. Most of the blocks in the System Generator Blockset carry resource information for:
- LUTs
- FFs
- BRAM
- Embedded multipliers
- 3-state buffers
- I/Os
The Xilinx Resource Estimator block provides fast estimates of the FPGA resources required to implement a System Generator subsystem or model. These estimates are computed by invoking block-specific estimators for Xilinx blocks and summing the results to obtain aggregate estimates of lookup tables (LUTs), flip-flops (FFs), block memories (BRAM), 18x18 multipliers, tri-state buffers, and I/Os.
Every Xilinx block that requires FPGA resources has a mask parameter that stores a vector containing its resource requirements. The Resource Estimator block can invoke underlying functions to populate these vectors (e.g. after parameters or data types have been changed), or aggregate previously computed values that have been stored in the vectors. Each block has a checkbox control, "Use Area Above for Estimation", that short-circuits invocation of the estimator function and uses the estimates stored in the vector instead. An estimator block can be placed in any subsystem of a model.
Blocks that have resource estimation functions: Accumulator, Addressable Shift Register, AddSub, CMult (Sequential version not supported), Convert, Counter, Delay, Down Sample, Dual Port RAM, FIFO, FFT, Gateway In, Gateway Out, Inverter, LFSR, Logical, Mult (Sequential version not supported), Mux (3-state version not supported), Negate, Parallel to Serial, PicoBlaze Processor, Register, Relational, ROM, Serial To Parallel, Shift, Single Port RAM, Threshold, Up Sample
Blocks that consume zero hardware resources: System Generator, Clear Quantization Error, Clock Enable Probe, Clock Probe, Concat, Constant, Discard Subsystem, FDATool, Indeterminate Probe, ModelSim, Quantization Error, Reinterpret, Sample Time, Scale, Simulation Multiplexer, Slice
Blocks with special handling: Discard Subsystem (Resource Estimator will ignore any resources in a subsystem containing this block)

64 Resource Estimator
Three types of estimation:
- Estimate Area: computes resources for the current level and all sub-levels
- Quick Sum: uses the resources stored in each block directly and sums them up (no sub-level functions are invoked)
- Post-Map Area: opens a file browser and lets the user select a map report file. The design should have been generated and gone through the synthesis, translate, and mapping phases.
Estimate Area: Clicking the Estimate Area button invokes block estimation functions top-down for each block and subsystem recursively. If any block has the Use Area Above for Estimation option selected, its estimator function is short-circuited and its current estimate is used. If this option is selected for a Resource Estimator block contained in the hierarchy, this block's estimate will be used for the entire subsystem containing it, and no other block estimation functions will be invoked for that portion of the model hierarchy. Warning messages will be generated for each block encountered that has no underlying estimation function.
Quick Sum: Clicking the Quick Sum button causes the Resource Estimator block to sum all of the FPGA Area fields on the blocks and subsystems at or below the current subsystem. No underlying estimation functions are invoked.
Post-Map Area: Clicking the Post-Map Area button causes the Resource Estimator to open a file browser.
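The difference between Quick Sum and Estimate Area can be sketched with a simplified Python model. Everything here is illustrative: the `Block` class, its field names, and the resource numbers are hypothetical, only leaf blocks carry stored areas, and the real tool's handling of subsystem-level estimates is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

FIELDS = ("luts", "ffs", "brams", "mults", "tbufs", "ios")

@dataclass
class Block:
    name: str
    area: List[int] = field(default_factory=lambda: [0] * len(FIELDS))  # stored estimate
    estimator: Optional[Callable[[], List[int]]] = None  # block-specific estimator
    use_area_above: bool = False                         # short-circuit flag
    children: List["Block"] = field(default_factory=list)

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def quick_sum(block):
    """Sum stored areas at or below this block; no estimator is invoked."""
    if not block.children:
        return list(block.area)
    total = [0] * len(FIELDS)
    for child in block.children:
        total = add(total, quick_sum(child))
    return total

def estimate_area(block, warnings):
    """Invoke estimators top-down, honoring the short-circuit flag."""
    if block.use_area_above:
        return list(block.area)
    if block.children:
        total = [0] * len(FIELDS)
        for child in block.children:
            total = add(total, estimate_area(child, warnings))
        return total
    if block.estimator is None:
        warnings.append(f"no estimator for {block.name}")
        return [0] * len(FIELDS)
    block.area = block.estimator()   # refresh the stored vector
    return list(block.area)

addsub = Block("AddSub", area=[8, 8, 0, 0, 0, 0], estimator=lambda: [9, 9, 0, 0, 0, 0])
mult = Block("Mult", area=[0, 0, 0, 1, 0, 0], estimator=lambda: [5, 5, 0, 1, 0, 0],
             use_area_above=True)
custom = Block("Custom")             # hypothetical block without an estimator
top = Block("mac", children=[addsub, mult, custom])

before = quick_sum(top)              # stale stored values only
warnings = []
after = estimate_area(top, warnings) # re-estimates AddSub, keeps Mult's stored value
print(before, after, warnings)
```

In this model Quick Sum returns whatever was last stored, so it is fast but can be stale after parameters change, while Estimate Area refreshes every block that is not short-circuited.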

65 Lab 2: Build a MAC with System Generator
Goals:
- Gain familiarity with SysGen v6.3 and its design flow, including ProjNav, synthesis tools (XST), ModelSim simulators, and the ISE implementation tools.
- Use the Resource Estimator to estimate the resources used by the design.
- Become familiar with the hardware-in-the-loop flow.
Background: The multiply-accumulate (MAC) operation is fundamental in digital signal processing and numerous other applications:
$c = \sum_{i=0}^{N-1} a_i b_i$
For example, the output of a digital filter with impulse response $h_i$ and input sequence $x_i$ is given by:
$y_n = \sum_{i=0}^{N-1} x_{n-i} h_i$
Note: Once again, be careful with the sampling period. Try a sampling period of 1 s (a 1 Hz sample rate) for a sine wave with a frequency of 1/100 Hz. You should see a fairly smooth sine wave. Don’t forget that the frequency in radians per second is 2*pi*f. Try to predict the results before simulating. Do you see what you expected? A sine wave with an amplitude of 1 is best represented with the binary point two places from the MSB, e.g. FIX_8_6.
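The MAC sum and the filter sum from the lab background can be written out as a short Python reference model, useful for predicting results before simulating. This is a sketch in ordinary arithmetic, not the lab solution: function names are illustrative, and input samples before time zero are assumed to be zero.

```python
def mac(a, b):
    """c = sum of a[i] * b[i], accumulated one product at a time."""
    acc = 0
    for ai, bi in zip(a, b):
        acc += ai * bi          # one multiply-accumulate step
    return acc

def fir(x, h):
    """y[n] = sum over i of x[n-i] * h[i], with x[k] taken as 0 for k < 0."""
    y = []
    for n in range(len(x)):
        window = [x[n - i] if n - i >= 0 else 0 for i in range(len(h))]
        y.append(mac(window, h))
    return y

c = mac([1, 2, 3], [4, 5, 6])
y = fir([1, 0, 0, 4], [1, 2, 3])
print(c, y)
```

Each output sample of the filter is one complete MAC over the last N inputs, which is why a single multiplier and accumulator can implement the filter when run N times per sample.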

66 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

67 Summary: Full VHDL/Verilog (RTL code)
Advantages:
- Portability
- Complete control of the design implementation and tradeoffs
- Easier to debug and understand code that you own
Disadvantages:
- Can be time-consuming
- Don’t always have control over the synthesis tool
- Need to be familiar with the algorithm and how to write it
- Must be conversant with the synthesis tools to obtain an optimized design

68 Summary: Full VHDL/Verilog (Instantiating Primitives)
Advantages:
- Full access to all architecture features
- Carry on further with optimization
- Best optimization
Disadvantages:
- Not as portable as RTL VHDL/Verilog
- Must be an FPGA expert and know the architecture
- Time-consuming

69 Summary: CORE Generator
Advantages:
- Can quickly access and generate existing functions
- No need to reinvent the wheel and re-design a block if it meets specifications
- IP is optimized for the specified architecture
Disadvantages:
- IP doesn’t always do exactly what you are looking for
- Need to understand signals and parameters and match them to your specification
- Dealing with a black box, with little information on how the function is implemented

70 Summary: System Generator for DSP
Advantages:
- Huge productivity gains through high-level modeling
- Ability to simulate complete designs at a system level
- Very attractive for FPGA novices
- Excellent capabilities for designing complex testbenches
- HDL testbench, test vectors, and golden data written automatically
- Hardware-in-the-loop simulation improves productivity and quickly verifies whether the system functions correctly
Disadvantages:
- Minor cost of abstraction: doesn’t always give the best result from an area usage point of view
- Customer may not be familiar with Simulink
- Not well suited to multiple clock designs
- No bi-directional bus supported
Cost of abstraction is covered in later modules.

71 Outline Using HDL Using the Xilinx CORE Generator
Using the Xilinx System Generator for DSP Hardware Verification In System Debug Resource Estimator Summary Simulink Tips and Tricks

72 Simulink Tips and Tricks
Throughout this course, we will disperse various tips and tricks that we find useful when using Simulink to create System Generator designs

73 Complete Systems
Throughout this course, we will build and study small sections of complete systems. To get a flavor of the capability of System Generator, check out the demos.
Type “demos” at the MATLAB command line to open the Simulink demos section. Click on the Xilinx demos and the following should be available for you to analyze:
- A/D and delta-sigma D/A conversion
- Digital Comm: 16-QAM demodulator (sim)
- Digital Comm: A QAM system with packet framing and FEC for telemetry channels (sim)
- Digital Comm: Concatenated FEC codec for DVB standard (sim)
- Digital Comm: Costas loop carrier recovery (sim)
- Digital Comm: Digital down converter for GSM (sim)
- FFT/IFFT in streaming mode (sim)
- FIR filtering: LMS-based adaptive equalization (sim)
- FIR filtering: Custom reference library (sim)
- FIR filtering: Polyphase 1:8 filter using SRL16Es (sim)
- IIR filtering: Multi-channel, folded implementation (sim)
- IIR filtering: 2nd order Direct Form I implementation (sim)
- Image processing: Color space converter (sim)
- Math: CORDIC-based rectangular-to-polar coordinate converter (sim)
- Math: CORDIC-based divider circuit (sim)
- Math: CORDIC-based sine and cosine function (sim)
- PicoBlaze(TM) Microcontroller (sim)

74 Combining Signals
- To be viewed on a scope, multiple signals must first be combined
- Use the MUX block (Simulink library → Signals & Systems) to combine signals, thus making a vector out of them
- Check Format → Signal Dimensions and Format → Wide NonScalar Lines to view how many signals are combined
- Similarly, the DEMUX can be used to separate signals
- Type ‘vector’ to view the example
NOTE: No Xilinx block supports VHDL generation for vectors. Use them purely for Simulink signal manipulation.
Typically you will see the MUX block used in a System Generator design for analysis of results. It is exceptionally useful for viewing multiple signals on a sink. For example, you may want to see the results of a filtering block and compare them with the unfiltered original signal. Once again, it is very important to stress that it is not possible to use vector signals inside a System Generator design. Type vector to view the example from the c:\training\dsp_flow\labs\lab3 folder.

75 Creating Subsystems All large designs will utilize hierarchy
Select the blocks to go into the subsystem. Click and drag to highlight a design region Select “Create Subsystem” in the Edit Menu Ctrl+G has the same effect Use the modelbrowser under the “View” menu to navigate the hierarchy Hierarchy in the VHDL code generated is determined by subsystems As readability is essential in large designs, subsystems are a very useful feature maintaining the readability of a design. Another way to create a new subsystem is to use the “Subsystem” block from the Simulink => Signals and Systems library. This will provide an empty subsystem window for users to add blocks to. Double-clicking on subsystems is another way to view their contents. It displays the contents of the subsystem in a NEW window. We recommend the Modelbrowser for the majority of navigation, otherwise you will have more windows than you know what to do with. A subsystem will also affect the VHDL code generated by System Generator. The subsystems of a design will directly control the hierarchy in the VHDL generated. A further point to make when analyzing a design in the Xilinx implementation tools is that the name of a subsystem will be added to the component and signal names in that subsystem. Thus, a component will have a name like, e.g., subsystem/AddSub in FPGA Editor.

76 Documenting a Design Double-click the background to create a textbox
Type in the text. Right-click the text to change format. Left-click to move the textbox around. A masked subsystem can be given “Help” documentation. More on this later.
Annotating a block diagram is another way to create a better documented design. The simple “double-click on the background” feature is an excellent way of adding major and minor notes to the design: anything from a description of the complete design to the maximum and minimum values that can come out of a block. The latter is very useful in analyzing bit growth.

77 Inports and Outports
- Allow the transfer of signal values between a subsystem and a parent
- Inport and Outport block names are reflected on the subsystem
- Can be found in Simulink → Sinks (for the Outport) and Simulink → Sources (for the Inport)
When Simulink creates a subsystem, additional blocks are added to the subsystem. These are the inport and outport blocks, which can be thought of as hierarchy connectors. Inport blocks cause input ports to appear on the subsystem; similarly, outport blocks cause output ports to appear. The names of the blocks are reflected in the names on the subsystem. Typically, Simulink numbers these blocks sequentially, but their names can be changed if desired. If you are building a subsystem from scratch using the subsystem block, you must insert Inport and Outport blocks manually. These blocks can be found in the Simulink → Sources or Sinks library. They are named In1 and Out1, respectively.
For further reference: Using Simulink: Creating a Model: Labeling a Subsystem Ports

78 Inputting Data from the Workspace
“From Workspace” block can be used to input MATLAB data to a Simulink model.
Format: t = 0:time_step:final_time; x = func(t); make these into a matrix for Simulink.
Example: In the MATLAB console, type:
t = 0:0.01:1; x = sin(2*pi*t); simin = [t', x'];
Type ‘FromWorkspace’ to view the example.
One way to use a variable in the MATLAB workspace as an input signal in Simulink is to use the “From Workspace” block in the Simulink → Sources library. The variable to be used should have a specific format. The first column should hold the time sequence, and the following columns should hold the corresponding signal data. If required, Simulink will linearly interpolate the data for undefined time steps.
Another example: Type at the MATLAB prompt
>> t = [ ]; >> x = [ /3]' >> simin = [t', x'];
The [t', x'] command turns the horizontal vector data into a two-column matrix. This will create a triangular waveform. You can view it by typing >> plot(t, x);
Type ‘FromWorkspace’ to view the example from the c:\training\dsp_flow\labs\lab3 folder.
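The two-column layout and the linear interpolation between defined time steps can be mimicked in a few lines of Python. This is only a sketch of the idea: the helper names are hypothetical, the triangular data is made up for illustration, and the block's behavior outside the tabulated time range is not modeled.

```python
def make_simin(t, x):
    """Pair each time stamp with its sample: column 1 is time, column 2 is data."""
    return [[ti, xi] for ti, xi in zip(t, x)]

def interpolate(simin, time):
    """Linearly interpolate the signal value at a time between two rows."""
    for (t0, x0), (t1, x1) in zip(simin, simin[1:]):
        if t0 <= time <= t1:
            return x0 + (x1 - x0) * (time - t0) / (t1 - t0)
    raise ValueError("time outside the tabulated range")

# Illustrative triangular waveform defined at three time steps
simin = make_simin([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])
rising = interpolate(simin, 0.5)
falling = interpolate(simin, 1.5)
print(simin, rising, falling)
```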

79 Outputting Data to the Workspace
“To Workspace” block can be used to output a signal to the MATLAB workspace.
- The output is written to the workspace when the simulation has finished or is paused
- Data can be saved as a structure (including time) or as an array
- Type ‘ToWorkspace’ to view the example
There are a number of ways that Simulink can write data out to the workspace. The simplest is to use the “To Workspace” block from the Simulink → Sinks library. The variable name is specified in the parameters window, as well as the number of data points to save. The save format is also worth noting. It can be an array that contains the signal value(s) at each time step, or a structure, which has the signal values as one of its fields and contains more information, such as the label of the signal. You may also choose to save time information as part of the structure. To access a structure, use the following syntax at the command prompt: >> simout.signals.values
Help reference: Using Simulink: Analyzing Simulation Results: Using the To Workspace Block
Type ‘ToWorkspace’ to view the example from the c:\training\dsp_flow\labs\lab3 folder.

