June 10, Functionally Linear Decomposition and Synthesis of Logic Circuits for FPGAs Tomasz S. Czajkowski and Stephen D. Brown University of Toronto
2 FPGA CAD Background
   Start with HDL
   Convert HDL to gates
   Map gates to logic components on the FPGA
   Place and route
   Get results
   Program the FPGA
3 Motivation
   Synthesis of XOR-based logic circuits is
      Difficult
      Time consuming
      Very useful for circuits that deal with
         Arithmetic
         Error correction
         Communication
   Focus on area optimization in this work
4 Why Use XOR Gates?
   [Figure: circuit with inputs a, b, c, d and output f]
5 Basic Idea
   Express a k-input logic function in a truth table (2^n rows, 2^m columns, n+m=k)
   Find a set of linearly independent columns, also known as a basis
   Express each column as a weighted sum of basis functions
      Column selector functions are the weighting factors
   Synthesize
   [Figure: truth table and circuit for f = bc + ad (with + read as XOR), where G1 = c and G2 = d are combined through an XOR gate]
6 Finding Basis Functions
   Use Gaussian Elimination to determine the basis columns
      Perform elementary row operations (add rows, swap rows)
      Reduce the matrix until, for each row, the column with the leftmost 1 has only 0s below it
   Result
      The leftmost 1 of each non-zero row points to a basis vector in the original truth table
   Note:
      Linear independence is guaranteed
      The number of basis vectors is minimum
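The elimination described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `basis_columns`, the list-of-lists table layout, and the choice of (c, d) as row variables and (a, b) as column variables are assumptions made for the example, and f is taken to be bc XOR ad.

```python
def basis_columns(table):
    """Return the indices of the pivot columns of a 0/1 matrix over GF(2)."""
    rows = [list(row) for row in table]          # work on a copy
    n_rows, n_cols = len(rows), len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below r that has a 1 in column c.
        hit = next((i for i in range(r, n_rows) if rows[i][c]), None)
        if hit is None:
            continue
        rows[r], rows[hit] = rows[hit], rows[r]  # swap rows
        for i in range(r + 1, n_rows):
            if rows[i][c]:
                # Adding two rows over GF(2) is an element-wise XOR.
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return pivots


# Assumed example: f = (b AND c) XOR (a AND d).
# Rows are indexed by (c, d), columns by (a, b), both in the order 00, 01, 10, 11.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
table = [[(b & c) ^ (a & d) for a, b in pairs] for c, d in pairs]
basis = basis_columns(table)
```

Under these assumptions the pivots land on the columns for ab = 01 and ab = 10, whose contents are the functions c and d.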
7 Express Each Column in terms of G1 and G2
   Trivial for columns of all zeros or those that are either G1 or G2
   Other columns Ci are expressed as a weighted sum of the basis functions, Ci = h1i*G1 XOR h2i*G2
   h1i and h2i are the solution to this linear equation over GF(2)
   (the solution is easy to see in this example)
8 Create Column Selector Functions
   For each basis function, G1 and G2, record for which columns h1i and h2i are 1
   Create truth tables H1 and H2 to identify the columns in which h1i and h2i are 1
   H1 and H2 are the selector functions
   In the example: H1 = b, H2 = a
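As a rough sketch of how the selector truth tables could be derived, the code below tries every 0/1 weighting of the basis columns for each column of the table. The brute-force search, the name `selector_tables`, and the table layout (rows indexed by (c, d), columns by (a, b), f taken as bc XOR ad) are assumptions for illustration; a real tool would more likely read the weights off the reduced matrix.

```python
from itertools import product


def selector_tables(table, basis_idx):
    """For each basis column, return a 0/1 list marking the columns that use it."""
    n_rows, n_cols = len(table), len(table[0])
    columns = [[table[r][c] for r in range(n_rows)] for c in range(n_cols)]
    basis = [columns[i] for i in basis_idx]
    selectors = [[0] * n_cols for _ in basis]
    for i, col in enumerate(columns):
        # Try every weighting (h1, h2, ...) until the XOR of the chosen
        # basis columns reproduces this column.
        for weights in product((0, 1), repeat=len(basis)):
            combo = [0] * n_rows
            for h, g in zip(weights, basis):
                if h:
                    combo = [x ^ y for x, y in zip(combo, g)]
            if combo == col:
                for j, h in enumerate(weights):
                    selectors[j][i] = h
                break
        else:
            raise ValueError("column is not in the span of the basis")
    return selectors


pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
table = [[(b & c) ^ (a & d) for a, b in pairs] for c, d in pairs]
H1, H2 = selector_tables(table, [1, 2])   # basis columns hold c and d
```

Read over the column order ab = 00, 01, 10, 11, the two resulting tables are the functions b and a.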
9 Synthesize
   Put G1, G2, H1 and H2 together to synthesize function f
   [Figure: circuit for f with G1 = c, H1 = b, G2 = d, H2 = a]
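A quick way to check a decomposition of this kind is to rebuild the truth table from the basis and selector tables and compare it with the original. The sketch below assumes the example values G1 = c, G2 = d, H1 = b, H2 = a, with rows indexed by (c, d) and columns by (a, b); the helper name `recompose` is hypothetical.

```python
def recompose(basis, selectors):
    """Rebuild table[row][col] as the XOR over all pairs of H[col] AND G[row]."""
    n_rows, n_cols = len(basis[0]), len(selectors[0])
    return [
        [sum(h[c] & g[r] for g, h in zip(basis, selectors)) % 2
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
G1 = [c for c, d in pairs]          # basis function c over rows (c, d)
G2 = [d for c, d in pairs]          # basis function d over rows (c, d)
H1 = [b for a, b in pairs]          # selector b over columns (a, b)
H2 = [a for a, b in pairs]          # selector a over columns (a, b)

rebuilt = recompose([G1, G2], [H1, H2])
expected = [[(b & c) ^ (a & d) for a, b in pairs] for c, d in pairs]
```

The sum modulo 2 plays the role of the XOR gate that joins the two AND terms.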
10 How to order variables?
   Partition variables between rows (bound set) and columns (free set)
   Which one is the better choice?
   For a function with k variables the largest number of possible variable partitions is [formula not preserved]
   [Figure: circuit with inputs a, b, c, d and output f]
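To get a feel for how quickly the search space grows, the sketch below counts partitions under one plausible reading: a partition is identified by which n of the k variables go into the bound set, so there are C(k, n) of them. This counting convention and both function names are assumptions, since the slide's own formula is not preserved.

```python
from math import comb


def partitions_for_bound_size(k, n):
    """Number of ways to choose n bound-set variables out of k."""
    return comb(k, n)


def largest_partition_count(k):
    """Largest C(k, n) over all bound-set sizes n."""
    return max(comb(k, n) for n in range(k + 1))


small = largest_partition_count(4)    # 4-variable example
large = largest_partition_count(16)   # 16-variable function
```

Under this convention the count peaks when the bound set holds half of the variables.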
11 Heuristic Variable Ordering: Procedure
   Step 1: Starting with n=2, determine all possible partitions with a bound set size of 2. Pick the k/2 best such that each variable is in exactly one grouping.
   Step 2: For (n=4; n <= m; n=n*2), repeat the procedure in Step 1, except now group the groupings generated in the step for n/2.
   Step 3: If m is not a power of 2, use the generated groupings to form valid bound sets and pick the best one (longest step).
   Step 4: Reorder the variables in f to match the best grouping of size m found.
   [Figure: variables a b c d e f g h are grouped into ab, cd, ef, gh, then into abcd and efgh, and the best grouping is selected]
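Steps 1 and 2 can be sketched as repeated pairing of groups. Several things here are assumptions rather than the authors' method: the `cost` callable (the slides do not say how "best" is scored), the greedy cheapest-first selection of non-overlapping pairs, and the restriction to k and m that are powers of 2 (Step 3 is not covered). The toy cost used below simply prefers variables that are close together in the alphabet.

```python
from itertools import combinations


def pair_up(groups, cost):
    """Merge groups two at a time, taking the cheapest merged bound sets first."""
    candidates = sorted(combinations(groups, 2), key=lambda p: cost(p[0] + p[1]))
    used, merged = set(), []
    for g1, g2 in candidates:
        if g1 in used or g2 in used:
            continue                      # each group joins exactly one pair
        merged.append(g1 + g2)
        used.update((g1, g2))
    return merged


def best_bound_set(variables, m, cost):
    """Grow groupings of size 2, 4, ... up to m and return the cheapest one."""
    groups = [(v,) for v in variables]
    n = 2
    while n <= m:
        groups = pair_up(groups, cost)
        n *= 2
    return min(groups, key=cost)


def toy_cost(bound):
    # Hypothetical stand-in for a real area estimate.
    return ord(max(bound)) - ord(min(bound))


pairs_found = pair_up([(v,) for v in "abcdefgh"], toy_cost)
best = best_bound_set("abcdefgh", 4, toy_cost)
```

With the toy cost, the first pass groups the eight variables into ab, cd, ef, gh, and the second pass yields abcd and efgh, which mirrors the grouping shown on the slide.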
12 Heuristic Variable Ordering: Runtime
   For k=16, m=8 the number of partitions tested is 154, versus all possible partitions
      120 tested for n=2, picked 8 best
      28 tested for n=4, picked 4 best
      6 tested for n=8, picked 2 best
   If m were 7, then in addition we would test combinations of valid partitions formed from the initial inputs, as well as the n=2 and n=4 groups
      4*6*10 = 240
   Thus for a 16-variable function we are testing at most 388 partitions (instead of all possible partitions)
13 Basis and Selector Optimization
   Variable ordering can change the area of the final implementation of the logic function
   A set of basis/selector functions for a given variable partition is a minimum set, but
      Not unique
      Other sets can be better (less costly to implement) than the one we found
   We need to explore alternate solutions
14 Example
   Same function as before
      bound set {b,c}
      free set {a,d}
   Basis-selector pairs are:
      G1H1 = !(ad)*bc
      G2H2 = ad*!(bc)
   Let G' = [definition not preserved]
   We can replace G2 with G' and then we have basis-selector pairs:
      G1H1 = bc
      G'H' = ad
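The two sets of basis-selector pairs on this slide can be checked against each other exhaustively. The sketch assumes the four expressions are read as two alternatives, !(ad)*bc with ad*!(bc) on one side and bc with ad on the other, and that the pairs in each set are combined with XOR; that reading is an interpretation of the slide, not something it states outright.

```python
from itertools import product


def f_first_pairs(a, b, c, d):
    """XOR of the pairs !(ad)*bc and ad*!(bc)."""
    bc, ad = b & c, a & d
    return ((1 - ad) & bc) ^ (ad & (1 - bc))


def f_second_pairs(a, b, c, d):
    """XOR of the cheaper pairs bc and ad."""
    return (b & c) ^ (a & d)


same = all(
    f_first_pairs(*bits) == f_second_pairs(*bits)
    for bits in product((0, 1), repeat=4)
)
```

Both forms describe the same function, but the second needs no inverters, which is the kind of saving the alternate basis is meant to expose.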
15 Multi-Output Synthesis
   Put truth tables side by side
   Apply Gaussian Elimination to all functions simultaneously
   Create a common set of basis functions
   Selector functions are different for each output
16 Example: 2-bit Adder
   Synthesize S1 and Cout as [expressions not preserved]
   [Figure: 2-bit adder with outputs Cout, S1, S0]
17 Circuit for Example 2
   Let x0 y0 be Cin
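The benefit of eliminating over several outputs at once can be seen by comparing ranks over GF(2). The sketch below assumes a 2-bit adder without an external carry-in, so the carry into bit 1 is x0 AND y0, and it assumes rows are indexed by (x0, y0) and columns by (x1, y1); the helper `gf2_rank` is a generic rank routine, not code from FLDS.

```python
from itertools import product


def gf2_rank(matrix):
    """Rank over GF(2) of a matrix given as a list of 0/1 lists."""
    rows = [int("".join(map(str, row)), 2) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot          # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def s1(x1, y1, x0, y0):
    return x1 ^ y1 ^ (x0 & y0)


def cout(x1, y1, x0, y0):
    return (x1 & y1) | ((x1 ^ y1) & (x0 & y0))


pairs = list(product((0, 1), repeat=2))
s1_table = [[s1(x1, y1, x0, y0) for x1, y1 in pairs] for x0, y0 in pairs]
cout_table = [[cout(x1, y1, x0, y0) for x1, y1 in pairs] for x0, y0 in pairs]
# Side-by-side truth tables: same rows, columns of both outputs concatenated.
combined = [rs + rc for rs, rc in zip(s1_table, cout_table)]

ranks = (gf2_rank(s1_table), gf2_rank(cout_table), gf2_rank(combined))
```

Each output needs two basis functions on its own, and the combined table still needs only two, so both outputs can share one basis and differ only in their selector functions.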
18 Duplication Reduction
   Replace a duplicate function (related by equality or complementation) with a wire/inverter
   Store a list of functions with k inputs or fewer created in the process of synthesis
   If the same function is repeated, then connect to it via a wire/inverter
   Both methods are utilized frequently
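One simple way to realize this kind of reuse is a lookup table keyed by truth table that also checks the complement. The class below is a hypothetical sketch: the name `FunctionCache`, the node labels, and the assumption that all truth tables are expressed over the same ordered inputs are illustrative choices, not details from the presentation.

```python
class FunctionCache:
    """Remember synthesized functions so equal or complemented ones are reused."""

    def __init__(self):
        self._nodes = {}

    def lookup_or_add(self, truth_table, name):
        """Return (node_name, inverted) for an existing match, else register name."""
        tt = tuple(truth_table)
        complement = tuple(1 - bit for bit in tt)
        if tt in self._nodes:
            return self._nodes[tt], False         # reuse through a wire
        if complement in self._nodes:
            return self._nodes[complement], True  # reuse through an inverter
        self._nodes[tt] = name
        return name, False


cache = FunctionCache()
first = cache.lookup_or_add([0, 0, 0, 1], "and_node")
again = cache.lookup_or_add([0, 0, 0, 1], "duplicate_and")
inverted = cache.lookup_or_add([1, 1, 1, 0], "nand_node")
fresh = cache.lookup_or_add([0, 1, 1, 0], "xor_node")
```

The second and third lookups return the already stored node, with the flag indicating whether an inverter is needed on the connection.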
19 Results
   99 MCNC circuits tested
      25 XOR-based, as determined by prior research
         Circuits known to have a lot of XOR gates inside
         Set used in many XOR-based logic synthesis papers
      74 non-XOR
   Compiled BDS-PGA 2.0, ABC, and our tool (FLDS) under Windows XP
      Dual Xeon 2.8GHz with 2GB of RAM
   Synthesized each circuit with BDS-PGA 2.0, ABC and FLDS
   Used ABC to map logic into 4-LUTs
20 XOR circuits (1 of 2)
   Cordic
      Two 23-input functions, small area, fast synthesis
      Neither ABC nor BDS-PGA can synthesize it well
21 XOR circuits (2 of 2)
   Good results
      Win on both area and depth
      Synthesis is fast
22 Non-XOR circuits vs. BDS-PGA
23 Non-XOR circuits vs. ABC
24 Circuits not included in comparison
   Two circuits failed to synthesize with BDS-PGA 2.0
   Ex1010
      ABC results: 4094 LUTs, Depth 8, Time 1.52 seconds
      FLDS results: 1063 LUTs, Depth 7, Time seconds, Cone size set to 12
      Comparison: Area: %  Depth: %
   Misex3
      ABC results: 1093 LUTs, Depth 6, Time 0.44 seconds
      FLDS results: 493 LUTs, Depth 10, Time 3.8 seconds, Cone size set to 16
      Comparison: Area: %  Depth: %
25 Interesting Experiment
   Does FLDS work in tandem with other synthesis tools?
      Optimize circuit with FLDS and then apply ABC’s optimizations
      Compared to ABC alone
   Results:
      XOR circuits: Area: %  Depth: -16.2%
      Non-XOR circuits: Area: %  Depth: +1%
      Overall: Area: -9.3%  Depth: -3.3%
26 FLDS with ABC vs. ABC (all circuits included)
27 Observations
   FLDS is good for XOR-based logic
      Performs reasonably well for non-XOR logic
      Most gains are due to synthesis of multi-output logic functions
   FLDS is fast
      Runtime is in seconds for functions with more than 16 inputs
28 Future Work
   Look at non-disjoint decomposition
   Combine with tools such as ABC to synthesize all types of logic well
29 Acknowledgements
   We thank Valavan Manohararajah and Deshanand Singh of Altera Corporation, and Professors Zvonko G. Vranesic and Jianwen Zhu from the University of Toronto, for their input during the course of this research.
   We would also like to take this opportunity to thank Altera Corporation for funding this research.
30 Questions?