Programmable Logic- How do they do that?

Programmable Logic- How do they do that? Class 3: Specialized Functions 1/14/2015 Warren Miller

This Week’s Agenda
• 1/12/15 An Introduction to Programmable Logic
• 1/13/15 Switches and Logic
• 1/14/15 Specialized Functions
• 1/15/15 Adding Processors
• 1/16/15 Software Tools

Course Description
Often we don't think about the details of how a particular device or technology is implemented; we just use it in our designs. However, sometimes you can’t help but wonder, “How did they do that?” This course will dig into the details of how programmable logic devices and the associated tools are implemented so you can better understand some of the ‘How’ behind common trade-offs you face in your designs. Programmable logic starts with the technology used to implement the configurable logic that makes up a programmable logic device. This class will review the primary technology used to implement the configurable elements common to all programmable logic devices.

Today’s Topics
• Goals and Objectives
• What Functions are Inefficient for the Base Fabric? Memory, Counters, Decoders, Adders, Multipliers
• How Are These Specialized Functions Implemented? Memory, Counters, Adders, Multipliers, DSP Blocks
• How Does Software Identify and Use These Functions? Synthesis, Place and Route
The general-purpose switches and logic elements of programmable logic are very flexible, but they are inefficient for implementing the high-level building blocks common to most digital sub-systems. Most programmable logic devices add some fixed-function elements to avoid these inefficiencies, and this class will describe the most common ones.

Goals and Objectives
Understand How and Why FPGAs have Fixed Function Blocks
• Architecture
• Logic
• Interconnect
• Efficiency Compared to programmable fabric
• Software

FPGA Fabric- Review
• IO Blocks
• Programmable Interconnect: Switches and Signal Lines
• Logic Blocks: LUTs plus ‘stuff’ (Carry Look Ahead, RAM, ROM, Shift Registers)
• Interconnect Limited

FPGA Fabric- Efficient Use?
• State Machines
• One Hot Encoding
• Delay Counters
• Feedback shift registers
• Limited Fanout
• Duplicate Logic if needed
• Add retiming registers
• Register Rich
• Limited Inputs
• Limited Outputs
• Rent’s Rule!
In the 1960s, E. F. Rent, an IBM employee, found a remarkable trend between the number of pins (terminals, T) at the boundaries of integrated circuit designs at IBM and the number of internal components (g), such as logic gates or standard cells. On a log-log plot, these data points fell on a straight line, implying a power-law relation T = t g^p, where t and p are constants (p < 1.0, and generally 0.5 < p < 0.8). Rent disclosed his findings in IBM-internal memoranda that were published in the IBM Journal of Research and Development in 2005 (IBM J. Res. & Dev. Vol. 49, No. 4/5 July/September 2005, pp. 777–803), but the relation had already been described in 1971 by Landman and Russo. They partitioned a circuit hierarchically so that at each level (top-down) the fewest interconnections had to be cut to split the circuit into more or less equal parts. At each partitioning step, they noted the number of terminals and the number of components in each partition and then partitioned the sub-partitions further. They found that the power law applied to the resulting T versus g plot and named it "Rent's rule". Rent's rule is an empirical result based on observations of existing designs, and therefore it is less applicable to the analysis of non-traditional circuit architectures. However, it provides a useful framework for comparing similar architectures.
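The power law above is easy to try out numerically. The following is a minimal sketch, not part of the original slides: the function names and the sample values (t = 4, p = 0.6) are illustrative assumptions, and the data points are generated from the formula rather than measured. It evaluates T = t g^p and recovers t and p with a straight-line fit on the log-log data, which is the same reading of the plot that Landman and Russo describe.

```python
import math


def rent_terminals(t: float, p: float, g: int) -> float:
    """Rent's rule: expected terminal count T = t * g**p for g components."""
    return t * g ** p


def fit_rent(samples):
    """Least-squares fit of log T = log t + p * log g.

    samples is a list of (g, T) pairs; returns the estimated (t, p).
    """
    xs = [math.log(g) for g, _ in samples]
    ys = [math.log(T) for _, T in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                      # slope of the log-log line
    t = math.exp(mean_y - p * mean_x)  # intercept, converted back from log
    return t, p


# Hypothetical data generated from t = 4, p = 0.6 (not real measurements).
samples = [(g, rent_terminals(4.0, 0.6, g)) for g in (16, 64, 256, 1024, 4096)]
t_fit, p_fit = fit_rent(samples)
print(f"fitted t = {t_fit:.3f}, p = {p_fit:.3f}")
```

With p below 1, the terminal count grows more slowly than the component count, which is why a block with few inputs and outputs relative to its logic fits the fabric well.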

Example
• Register Rich
• Logic Lean
• Adjust Fan-in to reduce logic levels
• Adjust Fan-out to reduce routing delay

FPGA Fabric- What’s Inefficient
• Large Memory Blocks
• Multiplication
• Division
• Floating Point Operations
• Standard Interfaces: PCIe, Ethernet, etc.
• Priority Encoders
• Register Lean
• Many Inputs
• Many Outputs

Architecture for Fixed Blocks
• Xilinx Series 7 Example
• Previous Approach
• New ASMBL approach
• Column Based
• Can create families: Artix (Cost Sensitive), Kintex (Efficient), Virtex (Performance and Capacity)
Xilinx created the Advanced Silicon Modular Block (ASMBL) architecture to enable FPGA platforms with varying feature mixes optimized for different application domains. Through this innovation Xilinx offers a greater selection of devices, enabling customers to select the FPGA with the right mix of features and capabilities for their specific design. Figure 2-1 provides a high-level description of the different types of column-based resources.

Xilinx Series 7 Block RAM
• Same as Virtex-6
• SRAM Block
• 36K/18K Block
• 32K x 1 to 512 x 72
• Simple Dual-Port and True Dual-Port
• Built-in FIFO
• 64-bit ECC per Block
• Adjacent Blocks Combine to 64K x 1 without using fabric
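To see why the aspect ratio matters, the sketch below estimates how many 36K blocks a memory of a given depth and width would need. Only the end points (32K x 1 and 512 x 72) come from the slide. The intermediate shapes in SHAPES_36K are an assumption based on the usual power-of-two progression and should be checked against the device documentation, and the tiling model ignores port-mode restrictions and cascade details. The function name is hypothetical.

```python
from math import ceil

# Assumed depth x width shapes of one 36K block. The slide only states the
# range "32K x 1 to 512 x 72"; the intermediate entries are an assumption.
SHAPES_36K = [
    (32768, 1), (16384, 2), (8192, 4), (4096, 9),
    (2048, 18), (1024, 36), (512, 72),
]


def blocks_needed(depth: int, width: int):
    """Return (block_count, shape) for the shape that tiles the memory best.

    Simple tiling estimate: blocks are stacked in depth and placed side by
    side in width, with no sharing between partially used blocks.
    """
    best = None
    for d, w in SHAPES_36K:
        count = ceil(depth / d) * ceil(width / w)
        if best is None or count < best[0]:
            best = (count, (d, w))
    return best


print(blocks_needed(1024, 36))   # one block in its 1K x 36 shape
print(blocks_needed(65536, 1))   # two adjacent blocks form 64K x 1
```

The second call matches the last bullet above: a 64K x 1 memory takes two blocks, which the slide says can be combined without using the fabric.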

Xilinx Series 7 DSP Block
• 25x18 Multiplier
• 25-bit pre-adder
• Pipeline
• Cascade and Carry
• 96-bit MAC
• SIMD Support
• 48-bit ALU
• Pattern Detect
• 17-bit Shifter
• Dynamic Operation (Cycle by cycle)
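The widths listed above can be tied together with a small behavioural model. This is a simplified sketch under stated assumptions, not a bit-accurate description of the Xilinx block: it models one pre-add, one 25x18 signed multiply and one 48-bit accumulate per call, ignores pipelining, cascade, SIMD and pattern detect, and uses hypothetical function names.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Wrap an integer into a two's-complement field of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def dsp_mac(a: int, d: int, b: int, p: int) -> int:
    """One multiply-accumulate step: P_next = P + (A + D) * B.

    Assumed widths, taken from the slide: 25-bit pre-adder result,
    18-bit second multiplier operand, 48-bit accumulator.
    """
    pre = wrap_signed(a + d, 25)           # 25-bit pre-adder
    product = pre * wrap_signed(b, 18)     # 25x18 signed multiply
    return wrap_signed(p + product, 48)    # 48-bit ALU / accumulator


# Dot product of two short vectors, one MAC per element (pre-adder unused).
xs = [3, -2, 7, 1]
cs = [5, 4, -1, 2]
acc = 0
for x, c in zip(xs, cs):
    acc = dsp_mac(x, 0, c, acc)
print(acc)
```

In the fabric, the same dot product would need a LUT-based multiplier and adder chain per step, which is the inefficiency the hard block is there to avoid.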

Altera Arria 10 FPGAs
Arria 10: Column Based
• Core Logic Fabric
• DSP Blocks
• Memory Blocks
• Memory Controllers
• PCIe Core
• Transceiver PCS
• Clocking

Altera Arria 10 Logic Module
• Adaptive LUT
• 8-inputs
• 8-outputs
• Full Adder
• Registers
• Carry In/Out
• Adaptive LUT is ‘fracturable’: made up of smaller LUTs connected with muxes
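The idea of a LUT "made up of smaller LUTs connected with muxes" can be sketched in a few lines. The model below is a generic illustration, not the actual Arria 10 circuit: it stores a 6-input function as a truth table, splits the table into two 5-input LUTs, and uses the sixth input to drive a 2:1 mux between them. All names, including the example function, are hypothetical.

```python
def make_lut(func, n_inputs: int):
    """Tabulate func over all input combinations; index bit i is input i."""
    return [
        func(*[(idx >> i) & 1 for i in range(n_inputs)]) & 1
        for idx in range(1 << n_inputs)
    ]


def lut_eval(table, inputs) -> int:
    """Look up the output bit for a list of input bits (input 0 is the LSB)."""
    idx = sum(bit << i for i, bit in enumerate(inputs))
    return table[idx]


def mux2(sel: int, lo: int, hi: int) -> int:
    """2:1 multiplexer."""
    return hi if sel else lo


def split_lut(table):
    """Split an n-input table into two (n-1)-input tables on the top input."""
    half = len(table) // 2
    return table[:half], table[half:]


def fractured_eval(table, inputs) -> int:
    """Evaluate an n-input function with two smaller LUTs and one mux."""
    lo, hi = split_lut(table)
    *low_inputs, top = inputs
    return mux2(top, lut_eval(lo, low_inputs), lut_eval(hi, low_inputs))


def example_fn(a, b, c, d, e, f):
    """Arbitrary 6-input function used only for the demonstration."""
    return (a & b) ^ (c | d) ^ (e & f)


table6 = make_lut(example_fn, 6)
print(fractured_eval(table6, [1, 1, 0, 0, 1, 1]))
```

Because the two halves are ordinary truth tables, the same storage can serve as two independent smaller LUTs when the mux is bypassed, which is what makes the combinations on the following slides possible.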

Altera Arria 10 Logic Module
• Two 4-input LUTs
• Four 3-input LUTs
• Muxes to combine in multiple ways
• Shared inputs (Dabcd)
• Separate inputs (Def)
• Control Signals

Altera Arria 10 Logic Module
Combinations:
• Dual 4-input LUTs
• 5-input and 3-input
• 5-input and 4-input
• 5-input and 5-input
• 6-input
• 6-input and 6-input
• Cascaded 4-input and 3-input
• Software Impact
• Performance impact

Altera Arria 10 Interconnect
• Row
• Column
• Local
• Variable Speed and Length
• Block and IO Connects
• ALM, LAB, MLAB
• Carry Chains
• Control Signals and Clocks

Arria 10 Block RAM

Altera DSP Block
Floating-point arithmetic:
• Multiplication, addition, subtraction, multiply-add, and multiply-subtract
• Multiplication with accumulation capability and a dynamic accumulator reset control
• Multiplication with cascade summation capability
• Multiplication with cascade subtraction capability
• Complex multiplication
• Direct vector dot product
• Systolic FIR filter

Altera DSP Block
Fixed-point arithmetic:
• High-performance, power-optimized, and fully registered multiplication operations
• 18-bit and 27-bit word lengths
• Two 18 x 19 multipliers or one 27 x 27 multiplier per DSP block
• Built-in addition, subtraction, and 64-bit double accumulation register to combine multiplication results
• Cascading 19-bit or 27-bit when pre-adder is disabled and cascading 18-bit when pre-adder is used to form the tap-delay line for filtering applications
• Cascading 64-bit output bus to propagate output results from one block to the next block without external logic support
• Hard pre-adder supported in 19-bit and 27-bit mode for symmetric filters
• Internal coefficient register bank in both 18-bit and 27-bit modes for filter implementation
• 18-bit and 27-bit systolic finite impulse response (FIR) filters with distributed output adder
• Biased rounding support
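The note that the hard pre-adder is there "for symmetric filters" is worth a short illustration. In a symmetric FIR filter the coefficients mirror around the centre, so two samples that share a coefficient can be added first and multiplied once. The sketch below is a plain software model of that idea, not Altera's implementation, and all function names are hypothetical.

```python
def fir_direct(samples, coeffs):
    """Reference FIR: y[n] = sum over k of coeffs[k] * x[n-k], x = 0 before start."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def fir_symmetric(samples, coeffs):
    """Same filter using a pre-add per mirrored tap pair (needs symmetric coeffs)."""
    taps = len(coeffs)
    if any(coeffs[k] != coeffs[taps - 1 - k] for k in range(taps)):
        raise ValueError("coefficients must be symmetric")

    def x(i):
        return samples[i] if i >= 0 else 0

    out = []
    for n in range(len(samples)):
        acc = 0
        for k in range(taps // 2):
            # Pre-adder: sum the two samples that share coefficient k,
            # then spend a single multiply on the pair.
            acc += coeffs[k] * (x(n - k) + x(n - (taps - 1 - k)))
        if taps % 2:
            mid = taps // 2
            acc += coeffs[mid] * x(n - mid)  # unpaired centre tap
        out.append(acc)
    return out


def multiplies_per_output(taps: int) -> int:
    """Multiplier count per output sample when mirrored taps are pre-added."""
    return taps // 2 + taps % 2


signal = [2, -1, 4, 0, 7, 3]
coeffs = [1, 3, 5, 3, 1]
print(fir_symmetric(signal, coeffs) == fir_direct(signal, coeffs))
print(multiplies_per_output(len(coeffs)), "multiplies instead of", len(coeffs))
```

Halving the multiplier count is the reason the pre-adder is hardened next to the multiplier rather than left to the fabric.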

Conclusion
• Fixed Functions
• Architecture
• Interconnect
• Software

Additional Resources
• Max Maxfield: Bebop to the Boolean Boogie http://www.amazon.com/Bebop-Boolean-Boogie-Third-Unconventional/dp/1856175073
• What is Programmable Logic? http://www.xilinx.com/company/about/programmable.html
• Programmable Logic Wikibook (Work in progress- want to help?) http://en.wikibooks.org/wiki/Programmable_Logic
• Altera, Lattice, Microsemi, Xilinx web sites for data sheets, user manuals and software downloads
