Published by Roberta Lyons. Modified over 9 years ago.
1
Reducing the Pressure on Routing Resources of FPGAs with Generic Logic Chains
Hadi P. Afshar
Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne
2
FPGAs and ASICs Gaps*
Performance – Ratio: 3-4
Area – Ratio: 20-35
Power – Ratio: 7-15
Routing resources consume ≈60-80% of the chip area and are significant contributors to circuit delay.
How to narrow the gap?
– Specialized (DSP) blocks
– Coarser-grained logic blocks
– Hard-wired connections
Concerns with these approaches:
✘ Lack of generality and flexibility
✘ Underutilization
✘ Change in routing structure
*I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, February 2007, pp. 203-215.
3
Fracturable LUTs
[Figure: a 3-LUT with inputs i0, i1, i2 and configuration bits S0 to S7, built from two 2-LUTs]
4
Motivation
Fracturable LUT structure and extra CLB outputs reduce the problem of large-LUT under-utilization.
[Figure: CLBs with 8 inputs; 4-LUT-based CLB and a 6-LUT shown with 5-LUT, 4-LUT and 3-LUT configurations]
5
What is the solution?
– More input bandwidth
– Improved logic density
– Dedicated and faster connections
[Figure: CLB with 8 inputs containing 4-LUTs and adders; the connections between cells are marked "?"]
6
Vertical Look-Up Tables
A 5-LUT can be built from two 4-LUTs with shared inputs and a multiplexer that selects between the two sub-LUTs and is controlled by the 5th input.
– Two 5-LUTs in the logic cell with disjoint inputs
– No routing wire is needed for the interconnection
– No change in the routing network interface
[Figure: logic cell with 4-LUTs, adders and 5-LUTs; fanout labeled]
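The 5-LUT construction described on this slide is a Shannon expansion on the fifth input, and it can be checked with a short sketch. The function names (`lut`, `split_5lut`, `vertical_5lut`), the bit ordering of the truth table and `example_function` are assumptions made for illustration; none of them come from the slides.

```python
from itertools import product

def lut(table, inputs):
    """Read a LUT; inputs[0] is the least significant bit of the table index."""
    index = sum(bit << pos for pos, bit in enumerate(inputs))
    return table[index]

def split_5lut(table32):
    """Split a 32-entry truth table into two 4-LUT tables (i4 = 0 and i4 = 1)."""
    return table32[:16], table32[16:]

def vertical_5lut(low, high, inputs):
    """Evaluate a 5-LUT as two 4-LUTs sharing i0..i3, with a 2:1 mux on i4."""
    *shared, select = inputs
    return lut(high, shared) if select else lut(low, shared)

def example_function(i0, i1, i2, i3, i4):
    """Arbitrary 5-input function, used only to fill the truth table."""
    return (i0 & i1) ^ (i2 | i3) ^ i4

# Truth table indexed by i0 + 2*i1 + 4*i2 + 8*i3 + 16*i4.
TABLE32 = [example_function(*[(n >> k) & 1 for k in range(5)]) for n in range(32)]
LOW, HIGH = split_5lut(TABLE32)

# The mux-based structure must agree with the original function on all 32 inputs.
MATCHES = all(
    vertical_5lut(LOW, HIGH, bits) == example_function(*bits)
    for bits in product((0, 1), repeat=5)
)
```

Because the two sub-LUTs share i0 to i3, only the select signal is extra, which is why the 5-LUT fits the existing cell interface.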
7
Example
[Figure: functions F(i0, i1, ..., i12) and F(i0, i1, ..., i15); legend: routing wire, hard-wired logic chain]
8
Chaining Heuristic
We need to find chains of functions with five or fewer inputs each, to be mapped onto the logic chains (vertical 5-LUTs).
[Figure: example netlist graphs between Input and Output, with numbered nodes]
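The slides do not spell out the heuristic itself, so the following is only a minimal sketch of one plausible approach: repeatedly extract the longest path of eligible functions from an acyclic netlist. The netlist format (a dict from function name to its fan-in names), the name `find_chains` and the greedy longest-path strategy are all assumptions, not the authors' method. The default minimum chain length of 4 mirrors the threshold quoted with the results table.

```python
def find_chains(netlist, max_inputs=5, min_length=4):
    """Greedily extract chains of functions that each have at most max_inputs inputs.

    netlist maps a function name to the list of its fan-in signal names and is
    assumed to be acyclic. Names that are not keys are primary inputs.
    """
    eligible = {name for name, fanin in netlist.items() if len(fanin) <= max_inputs}
    chains = []
    while True:
        best = {}  # longest chain ending at each eligible node, memoized

        def longest(node):
            if node not in best:
                preds = [p for p in netlist[node] if p in eligible]
                prev = max((longest(p) for p in preds), key=len, default=[])
                best[node] = prev + [node]
            return best[node]

        chain = max((longest(n) for n in sorted(eligible)), key=len, default=[])
        if len(chain) < min_length:
            return chains
        chains.append(chain)
        eligible -= set(chain)  # a function can sit in only one chain

# Hypothetical netlist: f1..f5 form a chain, "wide" has six inputs.
EXAMPLE_NETLIST = {
    "f1": ["a", "b", "c"],
    "f2": ["f1", "d", "e"],
    "f3": ["f2", "a", "g"],
    "f4": ["f3", "h"],
    "f5": ["f4", "f1", "b"],
    "wide": ["a", "b", "c", "d", "e", "f5"],
    "g1": ["a", "wide"],
}
CHAINS = find_chains(EXAMPLE_NETLIST)
```

In this example `wide` is left out because it has six inputs, and `g1` stays unchained because no chain of at least four functions passes through it.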
9
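Synthesis and Chaining Results

| Benchmark | Chainable | Chained | Max Chain Length | Average Chain Length |
|-----------|-----------|---------|------------------|----------------------|
| alu4      | 94%       | 39%     | 12               | 5.2                  |
| pdc       | 89%       | 53%     | 9                | 5.8                  |
| misex3    | 93%       | 42%     | 9                | 5.1                  |
| ex1010    | 60%       | 47%     | 8                | 5.3                  |
| ex5p      | 88%       | 46%     | 7                | 5.2                  |
| des*      | 77%       | 20%     | 4                | 3.1                  |
| apex2     | 84%       | 39%     | 8                | 4.9                  |
| apex4     | 82%       | 59%     | 8                | 4.3                  |
| spla      | 91%       | 46%     | 11               | 5.3                  |
| seq       | 88%       | 43%     | 6                | 4.9                  |
| Average   | 85%       | 44%     | 8.2              | 4.9                  |

* The minimum threshold for the chain length is 4, except for "des", for which it is 3.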
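As a sanity check on the Average row, the column means can be recomputed from the per-benchmark values in the table. The helper name `column_means` is illustrative. Because the table entries are already rounded, a recomputed mean may differ slightly from the reported one: the Chained column gives 43.4% from the rounded entries against a reported 44%, possibly because the slide averaged unrounded values.

```python
# (chainable %, chained %, max chain length, average chain length) per benchmark,
# copied from the results table.
RESULTS = {
    "alu4":   (94, 39, 12, 5.2),
    "pdc":    (89, 53, 9, 5.8),
    "misex3": (93, 42, 9, 5.1),
    "ex1010": (60, 47, 8, 5.3),
    "ex5p":   (88, 46, 7, 5.2),
    "des":    (77, 20, 4, 3.1),
    "apex2":  (84, 39, 8, 4.9),
    "apex4":  (82, 59, 8, 4.3),
    "spla":   (91, 46, 11, 5.3),
    "seq":    (88, 43, 6, 4.9),
}

def column_means(results):
    """Return the arithmetic mean of each of the four columns."""
    count = len(results)
    return tuple(sum(row[i] for row in results.values()) / count for i in range(4))

MEANS = column_means(RESULTS)
```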
10
Experimental Methodology
Goal: Extract chains of eligible functions from the synthesized netlist in order to place them on the logic chains; the non-chained functions remain unchanged.
Flow: Quartus-II → VQM → Parser → Quartus-II
ABC? VPR?
[Figure: 4-LUT-based cell with adders and 5-LUT-based cell, with a similar interface]
11
Logic Cell Utilization
4% saving in ALM count
12
Local Routing Wires
37% saving in the number of local wires
13
Total Wire Length
12% saving in total wire length
14
Delay
No average delay penalty for the placement restriction
15
Did I say something new?!
Local connections in Altera Stratix and Cyclone
– Use available logic cell bandwidth
– No fracturable LUT structure
Local connections in Xilinx FPGAs go through multiplexers
– Carry look-ahead
– Wide AND functions
Cascading LUTs to build bigger LUTs in Xilinx Virtex-5
– Routing wire
– Few large functions
16
Conclusion
Narrow the FPGA and ASIC gaps
Lighten the stress on routing resources
Hardwired connections + dedicated logic
– More logic density
– Less power
– More LC bandwidth
– Fewer routing wires
– Less circuit delay
Improved routability with a lighter network
17
Future Work
– Logic-chain-aware synthesis
– Guided chaining heuristic
– Multiple logic chains
– 2-D logic chains
18
Thanks for your attention.
hadi.parandehafshar@epfl.ch