High-Level Synthesis.

Required Reading
The ZYNQ Book: Chapter 14, Spotlight on High-Level Synthesis; Chapter 15, Vivado HLS: A Closer Look
Vivado Design Suite Tutorial, High-Level Synthesis, UG871, Nov. 2014
Vivado Design Suite User Guide, High-Level Synthesis, UG902, Oct. 2014
Introduction to FPGA Design with Vivado High-Level Synthesis, UG998, Jul. 2013
A Zynq Accelerator for Floating Point Matrix Multiplication Designed with Vivado HLS, XAPP1170, Jan. 2016
This presentation is based on ECE 669.

Behavioral Synthesis (diagram): the algorithm, the I/O behavior, and the target library are inputs to behavioral synthesis, which produces an RTL design. Logic synthesis then turns the RTL design into a gate-level netlist; this second step is the classic RTL design flow.

Need for High-Level Design: higher level of abstraction; modeling of complex designs; reduced design effort; fast turnaround time; technology independence; ease of HW/SW partitioning.

Platform Mapping, SW/HW Partitioning (diagram): the program is split into hardware (executed in the reconfigurable processor system) and software (executed in the microprocessor system).

SW/HW Partitioning & Coding, Traditional Approach (diagram): Specification; SW/HW Partitioning; SW Coding and HW Coding; SW Compilation and HW Compilation; SW Profiling and HW Profiling.

SW/HW Partitioning & Coding, New Approach (diagram): Specification; SW/HW Coding; SW/HW Partitioning; SW Compilation and HW Compilation; SW Profiling and HW Profiling.

Advantages of Behavioral Synthesis: higher levels of complexity are easier to model; the source is smaller than the equivalent RTL code; RTL is generated much faster than by hand; multi-cycle functionality, loops, and memory accesses are covered.

Short History of High-Level Synthesis
Generation 1 (1980s to early 1990s): research period.
Generation 2 (mid 1990s to early 2000s): commercial tools from Synopsys, Cadence, Mentor Graphics, etc. Input languages: behavioral HDLs. Target: ASIC. Outcome: commercial failure.
Generation 3 (from early 2000s): domain-oriented commercial tools, in particular for DSP. Input languages: C, C++, C-like languages (Impulse C, Handel C, etc.), Matlab + Simulink, Bluespec. Target: FPGA, ASIC, or both. Outcome: first success stories.

Hardware-Oriented High-Level Languages: C-Based System-Level Languages
Commercial:
Handel C (Celoxica Ltd.)
Impulse C (Impulse Accelerated Technologies)
Carte C (SRC Computers)
SystemC (The Open SystemC Initiative)
Research:
Streams-C (Los Alamos National Laboratory)
SA-C (Colorado State University; University of California, Riverside; Khoral Research, Inc.)
SpecC (University of California, Irvine and SpecC Technology Open Consortium)

Other High-Level Design Flows
Matlab-based: AccelChip DSP Synthesis (AccelChip); System Generator for DSP (Xilinx)
GUI data-flow based: Corefire (Annapolis Microsystems)
Java-based: commercial, Forge (Xilinx); research, JHDL (Brigham Young University)

Handel-C Overview
High-level language based on ISO/ANSI-C for the implementation of algorithms in hardware.
Allows software engineers to design hardware without retraining.
Clean extensions for hardware design, including flexible data widths, parallelism, and communications.
Well-defined timing model: each statement takes a single clock cycle.
Includes extended operators for bit manipulation and high-level mathematical macros (including floating point).

Handel-C/ANSI-C Comparisons (diagram of the ANSI-C and Handel-C feature sets). Features shown: Handel-C Standard Library; ANSI-C Standard Library; preprocessors (e.g. #define); parallelism; pointers; structures; arbitrary-width variables; ANSI-C constructs (for, while, if, switch); recursion; arrays; bitwise logical operators; enhanced bit manipulation; floating point; logical operators; arithmetic operators; RAM, ROM; signals; functions; interfaces.

Handel-C Design Flow (diagram): Executable Specification; Handel-C; VHDL; Synthesis; EDIF; Place & Route.

Different Levels of C/C++ Synthesis Abstraction. Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Pure Untimed C/C++ Design Flow. Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Mentor Graphics – Catapult C: Catapult C automatically converts untimed C/C++ descriptions into synthesizable RTL.

SystemC-Based Design-Flow Alternatives (diagram): SystemC is either translated automatically to Verilog/VHDL RTL and then passed to RTL synthesis, or passed directly to SystemC synthesis; both paths end in a gate-level netlist. The RTL is annotated as implementation specific, relatively slow to simulate, and relatively difficult to modify.

SystemC Evolution. Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Reconfigurable Supercomputers (ECE 448 – FPGA and ASIC Design with VHDL)

What is a Reconfigurable Computer? (diagram): a microprocessor system (processors μP ... μP, each with its own memory) next to a reconfigurable system (FPGA ... FPGA, each with its own memory); each system has an interface and I/O.

Reconfigurable Supercomputers (table of machines and release years). Machines: SRC 6 from SRC Computers; Cray XD1 from Cray; SGI Altix from SGI; SRC 7 from SRC Computers, Inc. Release years listed: 2002, 2005, 2006.

Pros and cons of reconfigurable computers
+ can be programmed using high-level programming languages, such as C, by mathematicians and scientists themselves
+ facilitates hardware/software co-design
+ shortens development time, encourages experimentation and complex optimizations
+ allows sharing costs among users of various applications
- high entry cost (~$100,000)
- hardware-aware programming
- limited portability
- limited availability of libraries
- limited maturity of tools

SRC Programming Model (diagram): main.c, written in ANSI C, runs on the microprocessor and calls function_1() and function_2(). The two functions are written in MAP C (a subset of ANSI C) and run on the FPGA. function_1 calls macro_1(a, b, c), macro_2(b, d), and macro_2(c, e); function_2 calls macro_3(s, t), macro_1(n, b), and macro_4(t, k). The macros (macro_1 to macro_4) come from libraries of macros written in VHDL. The data-flow view of function_1 shows input a entering Macro_1, whose outputs b and c feed two instances of Macro_2 that produce d and e.

SRC Compilation Process (diagram). Stages shown: μP compiler and MAP compiler; logic synthesis; place & route; linker. Files shown: application sources (.c or .f files, .mc or .mf files); macro sources (HDL sources, .vhd or .v files); .v files; netlists (.ngo files); object files (.o files); .bin files. Outputs: configuration bitstreams and the application executable.

Library Development - SRC (table): rows for the μP system and the FPGA system; columns for the library developer and the application programmer. Entries for the μP system: LLL (ASM) and HLL (C, Fortran). Entries for the FPGA system: HDL (VHDL, Verilog) and HLL (C, Fortran).

SRC Programming Environment
+ very easy to learn and use
+ standard ANSI C
+ hides implementation details
+ very well integrated environment
+ mature: in production use for over 4 years with constant improvements
- subset of C
- legacy C code requires rewriting
- C limitations in describing HW (parallelism, data types)
- closed environment, limited portability of code to HW platforms other than SRC

Application Development for Reconfigurable Computers: Program Entry; Platform Mapping; Debugging & Verification; Compilation; Execution.

Ideal Program Entry (diagram): Function; Program Entry.

Actual Program Entry (diagram): in addition to the function, program entry involves preferred architectures; SW/HW partitioning; use of FPGA resources (multipliers, μP cores); FPGA mapping; data transfers & synchronization; the sequence of run-time reconfigurations; use of internal and external memories; and the SW/HW interface.

Cinderella Story
AutoESL Design Technologies, Inc. (25 employees)
Flagship product: AutoPilot, translating C/C++/System C to VHDL or Verilog
Acquired by the biggest FPGA company, Xilinx Inc., in 2011
AutoPilot integrated into the primary Xilinx toolset, Vivado, as Vivado HLS, released in 2012
“High-Level Synthesis for the Masses”

Vivado HLS (diagram): a high-level language (C, C++, System C) goes into Vivado HLS, which produces a hardware description language (VHDL or Verilog).

HLS-Based Development and Benchmarking Flow (diagram): Reference Implementation in C; Manual Modifications (pragmas, tweaks); HLS-ready C code; High-Level Synthesis; HDL Code; Test Vectors; Functional Verification; Physical Implementation (FPGA Tools); Netlist; Timing Verification; Post Place & Route Results.

Levels of Abstraction in FPGA Design. Source: The Zynq Book

High-Level Synthesis vs. Logic Synthesis. Source: The Zynq Book

Algorithm and Interface Synthesis. Source: The Zynq Book

Vivado HLS Design Flow. Source: The Zynq Book

Design Trade-offs Explored Using HLS. Source: The Zynq Book

C Functional Verification and C/RTL Cosimulation in Vivado HLS. Source: The Zynq Book

Vivado HLS

Vivado HLS: Scheduling and Binding. Source: The Zynq Book

Vivado HLS: Scheduling and Binding
Scheduling: translation of the RTL statements interpreted from the C code into a set of operations, each with an associated duration in terms of clock cycles. Scheduling is affected by the clock frequency, clock uncertainty, target technology, and user directives.
Binding: associating the scheduled operations with the physical resources of the target device.
Source: The Zynq Book
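To make the scheduling definition concrete, here is a minimal sketch of as-soon-as-possible (ASAP) scheduling, a textbook technique. It is not a description of how Vivado HLS schedules internally, which the slides do not cover. The `Op` record, the latencies, and the function names are hypothetical, and binding to physical resources is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operation record: a latency in clock cycles and the indices
// of the earlier operations whose results this one consumes.
struct Op {
    int latency;
    std::vector<int> deps;
};

// ASAP scheduling: every operation starts in the first cycle in which all
// of its inputs are available. Operations must be listed in dependency order.
std::vector<int> asap_start_cycles(const std::vector<Op>& ops) {
    std::vector<int> start(ops.size(), 0);
    for (std::size_t i = 0; i < ops.size(); ++i) {
        for (int d : ops[i].deps) {
            assert(d >= 0 && static_cast<std::size_t>(d) < i);
            start[i] = std::max(start[i], start[d] + ops[d].latency);
        }
    }
    return start;
}

// Overall latency: the cycle in which the last operation finishes.
int total_latency(const std::vector<Op>& ops) {
    const std::vector<int> start = asap_start_cycles(ops);
    int finish = 0;
    for (std::size_t i = 0; i < ops.size(); ++i) {
        finish = std::max(finish, start[i] + ops[i].latency);
    }
    return finish;
}
```

For y = a*b + c*d with assumed 2-cycle multipliers and a 1-cycle adder, both multiplications start in cycle 0, the addition starts in cycle 2, and the total latency is 3 cycles.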

Three Possible Outcomes from HLS: Average of 10 Numbers. Source: The Zynq Book
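The function behind this slide is not reproduced in the transcript. A minimal sketch of an average-of-10 function follows; the name `average10`, the 32-bit data type, and the unroll factor are assumptions. `#pragma HLS UNROLL` is a Vivado HLS directive; an ordinary C++ compiler ignores it (possibly with a warning), so the function also runs as plain software. Changing or removing such directives is one way to steer the tool toward a different trade-off between area and latency.

```cpp
#include <cassert>
#include <cstdint>

const int kNumSamples = 10;

// Integer average of ten samples (the division truncates toward zero).
int32_t average10(const int32_t samples[kNumSamples]) {
    assert(samples != nullptr);
    int32_t sum = 0;
    for (int i = 0; i < kNumSamples; ++i) {
        // Assumed directive: ask for two additions per loop iteration.
#pragma HLS UNROLL factor=2
        sum += samples[i];
    }
    return sum / kNumSamples;
}
```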

Vivado HLS Synthesis Process. Source: The Zynq Book

Native Integer Data Types of C. Source: The Zynq Book

Arbitrary Precision Integer Data Types of C and C++ Accepted by Vivado HLS. Source: The Zynq Book

Arbitrary Precision Integer Types of C and C++. Source: The Zynq Book

Native Floating-Point Data Types of C. Source: The Zynq Book

Fixed-point Word Format. Source: The Zynq Book

Arbitrary Precision Fixed-Point Data Types used in Vivado HLS. W: total width; I: number of integer bits; Q: quantization mode; O: overflow mode; N: number of saturation bits in overflow wrap modes. Source: The Zynq Book
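The W and I parameters fix the numeric range and resolution of a signed fixed-point type. The sketch below computes them in plain C++ so that it runs without the Vivado HLS headers. `FixedFormat` and `describe_signed` are hypothetical names, and the formulas assume a two's-complement word in which I counts the sign bit, as in `ap_fixed`.

```cpp
#include <cassert>
#include <cmath>

// Range and resolution of a signed fixed-point word.
struct FixedFormat {
    double lsb;  // weight of the least significant bit
    double min;  // most negative representable value
    double max;  // most positive representable value
};

// W is the total width in bits, I the number of integer bits (sign included),
// so W - I bits remain for the fraction.
FixedFormat describe_signed(int W, int I) {
    assert(W > 0 && I <= W);
    FixedFormat f;
    f.lsb = std::ldexp(1.0, -(W - I));       // 2^-(W-I)
    f.min = -std::ldexp(1.0, I - 1);         // -2^(I-1)
    f.max = std::ldexp(1.0, I - 1) - f.lsb;  // 2^(I-1) - 2^-(W-I)
    return f;
}
```

With W = 8 and I = 3, the resolution is 2^-5 = 0.03125 and the range runs from -4 to 3.96875.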

Quantization modes for the C++ ap_fixed and ap_ufixed types. Source: The Zynq Book

Truncation to zero. Source: UG902 Vivado Design Suite User Guide, High-Level Synthesis
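Quantization decides what happens to the bits below the least significant bit. As a rough software model, not the `ap_fixed` implementation, the sketch below compares truncation toward minus infinity (the behavior of the `AP_TRN` mode) with truncation toward zero (`AP_TRN_ZERO`). The function names and the use of `double` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Quantize x to frac_bits fractional bits, truncating toward minus infinity.
double truncate_to_minus_inf(double x, int frac_bits) {
    assert(frac_bits >= 0);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    return std::floor(x * scale) / scale;
}

// Quantize x to frac_bits fractional bits, truncating toward zero.
double truncate_to_zero(double x, int frac_bits) {
    assert(frac_bits >= 0);
    const double scale = std::ldexp(1.0, frac_bits);  // 2^frac_bits
    return std::trunc(x * scale) / scale;
}
```

With 2 fractional bits, -1.3 becomes -1.5 under truncation toward minus infinity and -1.25 under truncation toward zero; a positive input such as 1.3 becomes 1.25 in both modes.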

Overflow modes for the C++ ap_fixed and ap_ufixed types. Source: The Zynq Book

Wraparound. Source: UG902 Vivado Design Suite User Guide, High-Level Synthesis
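Overflow handling decides what happens when a value does not fit in the available bits. The sketch below models two of the behaviors on signed integers in plain C++: wraparound (as in the `AP_WRAP` mode) and saturation (as in `AP_SAT`). The function names are hypothetical, and the model covers integer words only.

```cpp
#include <cassert>
#include <cstdint>

// Wraparound: keep only the low `bits` bits and reinterpret them as a
// two's-complement number.
int64_t wrap_signed(int64_t value, int bits) {
    assert(bits >= 1 && bits <= 62);
    const int64_t modulus = int64_t(1) << bits;
    const int64_t half = modulus / 2;
    int64_t r = value % modulus;  // in (-modulus, modulus)
    if (r < 0) r += modulus;      // now in [0, modulus)
    return (r >= half) ? r - modulus : r;
}

// Saturation: clamp to the most positive or most negative representable value.
int64_t saturate_signed(int64_t value, int bits) {
    assert(bits >= 1 && bits <= 62);
    const int64_t max = (int64_t(1) << (bits - 1)) - 1;
    const int64_t min = -max - 1;
    if (value > max) return max;
    if (value < min) return min;
    return value;
}
```

For a 4-bit signed word (range -8 to 7), the value 9 wraps to -7 but saturates to 7.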

C++ code with the declaration of fixed point variables. Source: The Zynq Book

System C Data Types. Source: The Zynq Book

An Example Top-Level Function for HLS. Source: The Zynq Book

Simplified Interface Diagram for the Example Top-Level Function. Source: The Zynq Book

Synthesis of Port Directions. Source: The Zynq Book

Default Port Level Types and Protocols. Source: The Zynq Book

Port Interface Protocol Types
ap_none: The simplest protocol type, with no explicit interface protocol, no additional control signals, and no associated hardware overhead. It implies, however, that the timing of input and output operations is handled independently and correctly.
ap_stable: Similar to ap_none, in that it does not involve additional control signals or related hardware. The difference is that ap_stable is intended only for inputs that change infrequently, i.e. that are generally stable apart from at reset, such as configuration data. The inputs are not constants, but they do not need to be registered either.
Source: The Zynq Book

Port Interface Protocol Types (continued)
ap_ack: This protocol behaves differently for input and output ports. For inputs, an output acknowledge port is added, and held high on the same clock cycle as the input is read. For output ports, an input acknowledge port is added. After every write to the output port, the design must wait for the input acknowledge to be asserted before it may resume operation.
ap_vld: An additional port is provided to validate data. For input ports, a valid input control port is added, which qualifies input data as valid. For output ports, a valid output port is added, and asserted on clock cycles when output data is valid.
Source: The Zynq Book

Port Interface Protocol Types (continued)
ap_ovld: The same as ap_vld, but can only be implemented on output ports, or on the output portion of an inout (bidirectional) port.
ap_hs: The _hs suffix stands for ‘handshaking’, and the protocol is a superset of ap_ack, ap_vld, and ap_ovld. The ap_hs protocol can be used for both input and output ports, and facilitates a two-way handshaking process between the producer and consumer of data, including both validation and acknowledgement transactions. As such, it requires two control ports and associated overhead. It is, however, a robust method of passing data, with no need to ensure timing externally.
Source: The Zynq Book
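In Vivado HLS source code, a port-level protocol is selected with the INTERFACE directive, written as `#pragma HLS INTERFACE <mode> port=<name>`. The sketch below applies four of the protocols described above to a hypothetical top-level function; the name `scale_and_offset`, its arguments, and the choice of protocol per port are assumptions. An ordinary C++ compiler ignores the pragmas, so the function still runs as software.

```cpp
#include <cassert>

// Hypothetical top-level function: *result = gain * sample + offset.
void scale_and_offset(int sample, int gain, int offset, int *result) {
    // sample: input qualified by an added valid signal.
#pragma HLS INTERFACE ap_vld port=sample
    // gain: configuration-style input that rarely changes.
#pragma HLS INTERFACE ap_stable port=gain
    // offset: plain wire, no control signals.
#pragma HLS INTERFACE ap_none port=offset
    // result: output with a two-way valid/acknowledge handshake.
#pragma HLS INTERFACE ap_hs port=result
    assert(result != nullptr);
    *result = gain * sample + offset;
}
```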

Port Interface Protocol Types (continued)
ap_memory: This memory-based protocol supports random access transactions with a memory, and can be used for input, output, and bidirectional ports. The only argument type compatible with this protocol is the array type, which corresponds with the structure of a memory. The ap_memory protocol requires control signals for clock and write enables, as well as an address port.
Source: The Zynq Book

Port Interface Protocol Types (continued)
ap_fifo: The FIFO protocol is also compatible with array arguments, provided that they are accessed sequentially rather than in random order. It does not require any address information to be generated, and is therefore simpler to implement than the ap_memory interface. The ap_fifo protocol can be used for input and output ports, but not bidirectional ports. The associated control ports indicate the fullness or emptiness of the FIFO, depending on the port direction, and ensure that processing is stalled to prevent overrun or underrun.
Source: The Zynq Book
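The difference between the two array protocols comes down to the access pattern. In the sketch below, a hypothetical function `lookup_stream` reads `in` and writes `out` exactly once per element in index order, which suits ap_fifo, while `table` is indexed by the data and therefore needs the address port that ap_memory provides. The names and sizes are assumptions; the pragmas are ignored by an ordinary C++ compiler.

```cpp
#include <cassert>

const int kLen = 8;
const int kTableSize = 4;

// Hypothetical top-level function: replace each sample by a table entry.
void lookup_stream(const int in[kLen], const int table[kTableSize], int out[kLen]) {
#pragma HLS INTERFACE ap_fifo port=in
#pragma HLS INTERFACE ap_fifo port=out
#pragma HLS INTERFACE ap_memory port=table
    assert(in != nullptr && table != nullptr && out != nullptr);
    for (int i = 0; i < kLen; ++i) {
        // in[i] and out[i] are touched once each, in order (FIFO-compatible);
        // table[] is addressed by the data, i.e. in random order.
        out[i] = table[in[i] & (kTableSize - 1)];
    }
}
```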

Port Interface Protocol Types (continued)
ap_bus: A generic bus interface that is not tied to a specific bus standard, and may be used to communicate with a bus bridge, which can then arbitrate with a system bus. The ap_bus protocol supports single read operations, single write operations, and burst transfers, and these are coordinated using a set of control signals. In addition to this generic bus interface, specific support for AXI bus interfaces can be integrated at a later stage, using an interface synthesis directive.
Source: The Zynq Book

Port Interface Protocol Types (continued)
bram: The same as ap_memory, except that when bundled using IP Integrator, the ports are not shown as individual ports, but grouped together into a single port.
axis: Specifies the interface as AXI Stream.
s_axilite: Specifies the interface as AXI Slave Lite.
m_axi: Specifies the interface as the AXI Master protocol.
Source: The Zynq Book
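As a sketch of how the AXI modes might be combined on a Zynq accelerator, the hypothetical function below streams data in and out over `axis` ports and exposes a scalar parameter and the block's control through `s_axilite`. The function name `add_bias`, the frame size, and the port assignments are assumptions; the pragmas are ignored by an ordinary C++ compiler.

```cpp
#include <cassert>

const int kFrame = 4;

// Hypothetical accelerator: add a bias to every sample of a frame.
void add_bias(const int in[kFrame], int out[kFrame], int bias) {
    // Data moves over AXI Stream ports.
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
    // The scalar and the block-level control are mapped to AXI Slave Lite.
#pragma HLS INTERFACE s_axilite port=bias
#pragma HLS INTERFACE s_axilite port=return
    assert(in != nullptr && out != nullptr);
    for (int i = 0; i < kFrame; ++i) {
        out[i] = in[i] + bias;
    }
}
```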

Port Interface Protocol Types (table; S: supported, D: default). Source: The Zynq Book

Block-level Protocol
There are three types of block-level protocol, listed below according to the terms used in Vivado HLS.
ap_ctrl_none: Choosing this option simply means that a block-level protocol is not added. Instead, control is exerted entirely at the port interface level, using port-level protocols.
ap_ctrl_hs: A block-level control protocol with handshaking. An ap_start control input is asserted to prompt the block to begin operation, and the block produces three output control signals (ap_ready, ap_idle, and ap_done) to indicate its stage of operation. Specifically, ap_ready indicates that the block is ready for new inputs, ap_idle indicates that the block is idle (not processing data), and ap_done is asserted when output data is available. As a usage example, the ap_ctrl_hs protocol is appropriate when a single HLS block is to be interfaced with the controlling processor.
ap_ctrl_chain: Similar to ap_ctrl_hs, but with an additional input control signal, ap_continue, and designed for chaining multiple Vivado HLS blocks together. The ap_continue input indicates the ability of the downstream block to accept new data, and can therefore exert backpressure on upstream blocks if necessary. If ap_continue is de-asserted, the block will complete its current computation to the stage of presenting the results at the output, but will then stall until ap_continue is set high again.
Source: The Zynq Book
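The backpressure behavior of ap_ctrl_chain can be pictured with a toy state machine. The model below is a simplification written for illustration, not cycle-accurate and not taken from the Vivado HLS documentation: the class name, the fixed latency, and the three states are assumptions, and ap_ready is left out.

```cpp
#include <cassert>

// Toy model of a block with ap_ctrl_chain-style control signals.
class ChainedBlockModel {
public:
    explicit ChainedBlockModel(int latency_cycles) : latency_(latency_cycles) {
        assert(latency_cycles > 0);
    }

    bool ap_idle() const { return state_ == State::Idle; }
    bool ap_done() const { return state_ == State::Done; }

    // Advance one clock cycle with the given control inputs.
    void tick(bool ap_start, bool ap_continue) {
        switch (state_) {
        case State::Idle:
            if (ap_start) { state_ = State::Busy; remaining_ = latency_; }
            break;
        case State::Busy:
            if (--remaining_ == 0) state_ = State::Done;
            break;
        case State::Done:
            // Results are presented; stall until downstream accepts them.
            if (ap_continue) state_ = State::Idle;
            break;
        }
    }

private:
    enum class State { Idle, Busy, Done };
    State state_ = State::Idle;
    int latency_;
    int remaining_ = 0;
};
```

With a latency of 2, the model reports done two cycles after the start cycle and then stays in that state for as long as ap_continue is held low.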

RTL Interface Diagram Showing Default Block Level Ports and Protocols. Source: The Zynq Book

Directive Types
There are six types of directive which can be applied to both port-level and block-level interfaces, as reviewed below.
Array Map: Combines several arrays to form one larger array, with the goal of using fewer FIFO or RAM resources and control ports to implement the interface.
Array Partition: Separates array interfaces into several smaller sections, resulting in an expanded set of ports, control signals, and implementation resources, but with increased bandwidth.
Array Reshape: An original array is partitioned into smaller arrays, which are then recombined to form an array with fewer, wider data elements. This implies fewer memory locations and shorter addresses.
Interface: Explicitly specifies a port-level interface protocol as one of the available options (as listed in Table 15.7 on page 300), or the block-level protocol as ap_ctrl_none, ap_ctrl_hs, or ap_ctrl_chain, as covered in Section 15.4.5.
Source: The Zynq Book
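As an illustration of the Array Partition directive, the sketch below fully partitions two small input arrays so that all elements can be read in the same cycle, which is what lets the unrolled loop run in parallel. The function `dot4` and its sizes are hypothetical; `ARRAY_PARTITION` and `UNROLL` are Vivado HLS directives that an ordinary C++ compiler ignores.

```cpp
#include <cassert>

const int kTaps = 4;

// Hypothetical top-level function: dot product of two 4-element arrays.
int dot4(const int a[kTaps], const int b[kTaps]) {
    // Split each array interface into one port per element.
#pragma HLS ARRAY_PARTITION variable=a complete dim=1
#pragma HLS ARRAY_PARTITION variable=b complete dim=1
    assert(a != nullptr && b != nullptr);
    int acc = 0;
    for (int i = 0; i < kTaps; ++i) {
        // With every element on its own port, the iterations can run in parallel.
#pragma HLS UNROLL
        acc += a[i] * b[i];
    }
    return acc;
}
```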

Directive Types (continued)
Resource: A particular resource can be chosen to implement the interface. For instance, a one- or two-port RAM can be specified for an ap_memory interface, or an ap_fifo interface can target a FIFO constructed from a Block RAM or LUTs.
Stream: Specifies the interface as a streaming port, utilising FIFOs and permitting the depth of the FIFO to be explicitly chosen.
Source: The Zynq Book
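A sketch of how the Resource and Stream directives might appear in source follows. The function `reverse_block` is hypothetical, and the core name `RAM_2P_BRAM` and the FIFO depth are assumptions that depend on the tool version and target device; check the directive reference before relying on them. An ordinary C++ compiler ignores the pragmas.

```cpp
#include <cassert>

const int kDepth = 8;

// Hypothetical top-level function: write the input block in reverse order.
void reverse_block(const int in[kDepth], int out[kDepth]) {
    // Input is read out of order, so it stays a memory interface;
    // the assumed core name requests a two-port Block RAM.
#pragma HLS INTERFACE ap_memory port=in
#pragma HLS RESOURCE variable=in core=RAM_2P_BRAM
    // Output is written strictly in order, so it can be a streaming port
    // backed by a FIFO of the chosen depth.
#pragma HLS STREAM variable=out depth=8
    assert(in != nullptr && out != nullptr);
    for (int i = 0; i < kDepth; ++i) {
        out[i] = in[kDepth - 1 - i];
    }
}
```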

Data flow between Vivado HLS blocks. Source: The Zynq Book