Published by Darren Carson. Modified over 9 years ago.
1
Routing Wire Optimization through Generic Synthesis on FPGA Carry
Hadi P. Afshar
Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne
2
FPGA and ASIC Gaps*
- Performance ratio: 3-4
- Area ratio: 20-35
- Power ratio: 7-15

Routing resources consume ≈60-80% of the chip area and are significant contributors to circuit delay.

How to narrow the gap?
- Specialized (DSP) blocks
- Coarser grained logic blocks
- Hard-wired connections

Concerns:
✘ Lack of generality and flexibility
✘ Underutilization
✘ Change in routing structure

*I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, February 2007, pp. 203-215.
3
Carry Chains
[Figure: CLB with 8 inputs; labels: 4-LUT, +, +]
4
Motivation Example
5
Problem Definition
Input: LUT Mapped Flow Graph
Step1: Logic Matching
Step2: Chaining
6
Logic Matching
Step1: Enumeration of Programmable Part
Step2: Identifying regular and independent segments
Step3: Developing alphabet library of the macro cell
Step4: Mask division and library matching
[Figure: macro cell with a LUT and an adder (+); labels: A, B, C in, C out]
7
Logic Matching (Example)
Step1: Enumeration

i3 i2 i1 i0   LUT 1   LUT 2
0  0  0  0    A0      B0
0  0  0  1    A0      B1
0  0  1  0    A1      B0
0  0  1  1    A1      B1
0  1  0  0    A2      B2
0  1  0  1    A2      B3
0  1  1  0    A3      B2
0  1  1  1    A3      B3
1  0  0  0    A4      B4
1  0  0  1    A4      B5
1  0  1  0    A5      B4
1  0  1  1    A5      B5
1  1  0  0    A6      B6
1  1  0  1    A6      B7
1  1  1  0    A7      B6
1  1  1  1    A7      B7
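The enumeration table can be regenerated with a few lines of Python. This is only a sketch: the index pattern in the table is consistent with LUT 1 reading (i3, i2, i1) and LUT 2 reading (i3, i2, i0), but that wiring is inferred from the table rather than stated on the slide, and the function name `enumerate_lut_entries` is hypothetical.

```python
from itertools import product


def enumerate_lut_entries():
    """Return (input bits, LUT 1 entry, LUT 2 entry) for all 16 inputs.

    Assumption inferred from the slide's table: LUT 1 is addressed by
    (i3, i2, i1) and LUT 2 by (i3, i2, i0).
    """
    rows = []
    for i3, i2, i1, i0 in product((0, 1), repeat=4):
        a_index = (i3 << 2) | (i2 << 1) | i1  # entry selected in LUT 1
        b_index = (i3 << 2) | (i2 << 1) | i0  # entry selected in LUT 2
        rows.append((f"{i3}{i2}{i1}{i0}", f"A{a_index}", f"B{b_index}"))
    return rows


if __name__ == "__main__":
    for bits, a_entry, b_entry in enumerate_lut_entries():
        print(bits, a_entry, b_entry)
```

Each LUT sees only three of the four inputs, which is why every A entry and every B entry appears in exactly two rows.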
8
Logic Matching (Example)
Step2: Regular and Independent Segments

i3 i2 i1 i0   LUT 1   LUT 2
0  0  0  0    A0      B0
0  0  0  1    A0      B1
0  0  1  0    A1      B0
0  0  1  1    A1      B1
0  1  0  0    A2      B2
0  1  0  1    A2      B3
0  1  1  0    A3      B2
0  1  1  1    A3      B3
1  0  0  0    A4      B4
1  0  0  1    A4      B5
1  0  1  0    A5      B4
1  0  1  1    A5      B5
1  1  0  0    A6      B6
1  1  0  1    A6      B7
1  1  1  0    A7      B6
1  1  1  1    A7      B7
9
Logic Matching (Example)
Step3: Alphabet library of the cell

8-bit alphabets of the configuration mask dictionary (slide column labels: LUT 1, LUT 2, C in):

LUT 1   LUT 2   Mask bits
A0      B0      000000 …
A0      B1      000000 …
A1      B0      000000 …
A1      B1      000000 …
A0      B0      101011 …
A0      B1      101010 …
A1      B0      100111 …
A1      B1      100110 …

Configurations listed on the slide:
A0 = 0, A1 = 0, B0 = 0, B1 = 0
A0 = 1, A1 = 0, B0 = 0, B1 = 0
A0 = 0, A1 = 1, B0 = 0, B1 = 0
A0 = 1, A1 = 1, B0 = 0, B1 = 0
A0 = 0, A1 = 0, B0 = 1, B1 = 0
10
Logic Matching (Example)
Step4: Mask segmented matching
[Figure: mask segments matched against the 8-bit library]
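A minimal sketch of what segmented matching could look like, assuming a 32-bit configuration mask split into four independent 8-bit segments and one set of allowed 8-bit alphabets per segment. The names `split_mask`, `matches_library` and `TOY_LIBRARY` are hypothetical, and the library contents are toy values chosen for illustration, not the alphabets from the slides.

```python
def split_mask(mask, total_bits=32, segment_bits=8):
    """Split an integer mask into segments, most significant segment first."""
    count = total_bits // segment_bits
    low = (1 << segment_bits) - 1
    return [(mask >> (segment_bits * (count - 1 - k))) & low
            for k in range(count)]


def matches_library(mask, library, total_bits=32, segment_bits=8):
    """True if every segment of the mask appears in that segment's alphabet set.

    `library` holds one set of allowed alphabets per segment (an assumption
    about how the library is organised).
    """
    segments = split_mask(mask, total_bits, segment_bits)
    return all(seg in allowed for seg, allowed in zip(segments, library))


# Toy library: made-up alphabets, one set per 8-bit segment.
TOY_LIBRARY = [{0x00, 0xAB}, {0x00, 0xAA}, {0x00, 0x9C}, {0x00, 0x98}]

if __name__ == "__main__":
    print(matches_library(0xAB009C98, TOY_LIBRARY))  # every segment is listed
    print(matches_library(0xAB00FF98, TOY_LIBRARY))  # 0xFF is not listed
```

Because each segment is checked on its own, a candidate function is compared against a handful of short alphabets instead of against every complete mask the cell can implement.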
11
How much do we gain?
Assume that the mask is 32-bit
- N segments
- M patterns in each segment
- Our library size = [expression missing in source] bits
- Number of all configurations = [expression missing in source]
Orders of magnitude less memory
Orders of magnitude fewer comparisons
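The size of the gain can be illustrated with a rough calculation. The slide's own expressions are not preserved, so the formulas below are assumptions: the segments are taken to be equally wide and independent, so the segmented library stores N x M alphabets of 32/N bits each, while full enumeration would store one 32-bit mask for each of the M^N combinations. The function names and the example values N = 4, M = 16 are hypothetical.

```python
def segmented_library_bits(mask_bits, n_segments, m_patterns):
    """Bits needed to store M alphabets for each of N equal-width segments."""
    segment_bits = mask_bits // n_segments
    return n_segments * m_patterns * segment_bits


def full_enumeration_count(n_segments, m_patterns):
    """Number of complete masks if every segment combination is enumerated."""
    return m_patterns ** n_segments


if __name__ == "__main__":
    mask_bits, n, m = 32, 4, 16  # hypothetical example values
    library = segmented_library_bits(mask_bits, n, m)
    masks = full_enumeration_count(n, m)
    print("segmented library:", library, "bits")
    print("full enumeration: ", masks, "masks =", masks * mask_bits, "bits")
```

Under these assumptions the segmented library needs 512 bits, against 65,536 complete masks (2,097,152 bits) for full enumeration.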
12
Chaining Heuristic
We need to find chains of functions, which are mappable to the macrocell, to be placed on the carry chains.
[Figure: three flow graphs between Input and Output, with numbered nodes]
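The slides do not spell out the heuristic itself. As an illustration only, one simple way to find such chains is to repeatedly peel the longest path of still-unused chainable nodes off the flow graph until no path reaches a minimum length. The greedy strategy, the function names, and the example graph below are all assumptions, not the authors' algorithm; the graph is assumed to be acyclic.

```python
def longest_path(successors, eligible):
    """Longest path that visits only `eligible` nodes of a DAG."""
    memo = {}

    def best_from(node):
        # Longest eligible path starting at `node`, memoised per call.
        if node not in memo:
            best = []
            for nxt in successors.get(node, ()):
                if nxt in eligible:
                    cand = best_from(nxt)
                    if len(cand) > len(best):
                        best = cand
            memo[node] = [node] + best
        return memo[node]

    result = []
    for node in sorted(eligible):  # sorted only to make ties deterministic
        cand = best_from(node)
        if len(cand) > len(result):
            result = cand
    return result


def extract_chains(successors, chainable, min_length=4):
    """Greedily extract disjoint chains of at least `min_length` nodes."""
    remaining = set(chainable)
    chains = []
    while True:
        path = longest_path(successors, remaining)
        if len(path) < min_length:
            return chains
        chains.append(path)
        remaining.difference_update(path)


if __name__ == "__main__":
    # Hypothetical flow graph: node -> list of successor nodes.
    graph = {1: [2], 2: [3], 3: [4, 6], 4: [5], 6: [7]}
    print(extract_chains(graph, chainable=set(range(1, 8))))
```

Nodes left over after extraction (here 6 and 7, whose path is shorter than the threshold) would simply stay as ordinary LUTs.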
13
Synthesis and Chaining Results

Benchmark   Chainable   Chained   Max Chain Length   Average Chain Length
alu4        74%         39%       4                  3.5
pdc         69%         35%       6                  3.9
misex3      68%         42%       4                  3.1
ex1010      71%         41%       5                  3.4
ex5p        72%         40%       4                  3.5
des*        65%         31%       3                  3.0
apex2       73%         42%       4                  3.6
apex4       75%         39%       4                  3.7
spla        72%         43%       6                  4.2
seq         69%         38%       4                  3.4
Average     70%         39%       4.4                3.5

* The minimum threshold for the chain length is 4, except for "des", for which it is 3.
14
Experimental Methodology
Goal: Extract chains of eligible functions from the synthesized netlist in order to place them on the logic chains; the non-chained ones remain unchanged.
[Flow diagram labels: Quartus-II LUT Mapping & Syn; Our Synthesis Engine (VQM Parser, DAG Generation, Logic Matching, Chaining Heuristic, Netlist Generation); Quartus-II Place & Route]
15
Local Routing Wires
26% saving in the number of local wires
16
Total Wire Lengths
9% saving in total wire length
17
Delay
3% delay penalty due to the large input-to-output delay of the adder
18
Conclusion
- Narrow the FPGA and ASIC Gaps
- Lighten the stress on routing resources
- Hardwired connections + Dedicated logic
- Improved Routability with a Lighter Network
19
Thanks for your attention.
hadi.parandehafshar@epfl.ch