1 Improving Synthesis of Compressor Trees on FPGAs via Integer Linear Programming. Philip Brisk (2), Paolo Ienne (2), Hadi Parandeh-Afshar (1,2). Affiliations: (1) University of Tehran, ECE Department; (2) EPFL, School of Computer and Communication Sciences.

2 March 14, 2008. Outline: Motivation; Generalized Parallel Counters; ILP Formulation; Experimental Results; Conclusion.

4 Motivation: Why is multi-input addition important? Partial product reduction in parallel multiplication (Wallace and Dadda in the 1960s). Multi-input addition occurs in many multimedia and signal processing applications: H.264/AVC variable block size motion estimation, FIR filters, and 3G wireless base station channel cards. Flow graph transformations expose opportunities to use compressor trees in high-level synthesis [Verma and Ienne, ICCAD 2004].

5 Multi-Input Addition Implementation. ASIC: compressor trees plus a final adder; counters are the basic blocks (Wallace, Dadda, 3-Greedy). FPGA: adder trees; the full adder is implemented in the CLB structure, and the fast carry chain (Xilinx and Altera) reduces routing delay. Compressor trees have poor performance on FPGAs: fast carry chains cannot be used, and counters are inflexible. GOAL: a better implementation of compressor trees on FPGAs.
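
The "compressor tree plus final adder" structure named on this slide can be sketched in a few lines. The code below is an illustrative bit-level model, not the authors' implementation: it uses only (3; 2) counters (full adders), applies them greedily level by level in the Wallace style, and the function names (`to_columns`, `compress`, `multi_add`) are hypothetical.

```python
def full_adder(a, b, c):
    """(3; 2) counter: three bits of one rank in, a sum bit of the
    same rank and a carry bit of the next rank out."""
    total = a + b + c
    return total & 1, total >> 1


def to_columns(operands, width):
    """Arrange operand bits into columns; columns[r] holds every bit of rank r."""
    return [[(x >> r) & 1 for x in operands] for r in range(width)]


def compress(columns):
    """Apply (3; 2) counters level by level until every column holds at
    most two bits. Returns the reduced columns and the number of levels."""
    levels = 0
    while any(len(col) > 2 for col in columns):
        nxt = [[] for _ in range(len(columns) + 1)]
        for r, col in enumerate(columns):
            i = 0
            while len(col) - i >= 3:
                s, c = full_adder(*col[i:i + 3])
                nxt[r].append(s)       # sum bit keeps its rank
                nxt[r + 1].append(c)   # carry bit moves up one rank
                i += 3
            nxt[r].extend(col[i:])     # leftover bits pass through unchanged
        columns = nxt
        levels += 1
    return columns, levels


def final_add(columns):
    """Stand-in for the final carry-propagate adder: sum the remaining bits by weight."""
    return sum(bit << r for r, col in enumerate(columns) for bit in col)


def multi_add(operands, width):
    """Multi-input addition: compressor tree followed by a final adder."""
    columns, levels = compress(to_columns(operands, width))
    return final_add(columns), levels
```

For example, `multi_add([13, 7, 9, 5, 3], 4)` returns the sum 37 together with the number of counter levels used. Every counter preserves the weighted sum of its bits, which is why the final addition still yields the correct total.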

7 Generalized Parallel Counters (GPCs). Parallel counter: sums bits with the same rank. Generalized parallel counter: sums bits having different ranks. Example (figure): a (3; 2) counter and a (3, 3; 4) GPC. GPCs are more flexible and reduce the number of logic levels. GPCs are more complex, but the additional complexity is absorbed in LUTs! GPCs are perfect building blocks to create better compressors out of FPGA LUTs.
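
As a rough illustration of the difference between the two block types, the sketch below models a GPC as a function that adds input bits weighted by rank. The function name `gpc` and the list-of-lists input layout are assumptions made for this example; the slides do not prescribe an interface.

```python
def gpc(inputs_by_rank):
    """Behavioral model of a generalized parallel counter.

    inputs_by_rank[r] is the list of input bits of rank r (weight 2**r).
    Returns the output bits, least significant first.
    """
    # Weighted sum of the actual input bits.
    total = sum(sum(bits) << r for r, bits in enumerate(inputs_by_rank))
    # Largest value the inputs can reach; it fixes the number of output bits.
    max_total = sum(len(bits) << r for r, bits in enumerate(inputs_by_rank))
    n_out = max(1, max_total.bit_length())
    return [(total >> k) & 1 for k in range(n_out)]


# (3; 2) counter: three bits of rank 0, two output bits.
counter_out = gpc([[1, 1, 1]])

# (3, 3; 4) GPC: three bits of rank 0 and three of rank 1, four output bits.
gpc_out = gpc([[1, 0, 1], [1, 1, 0]])
```

Here `counter_out` is `[1, 1]` (the value 3) and `gpc_out` is `[0, 1, 1, 0]` (the value 2 + 2·2 = 6). The (3, 3; 4) GPC has six inputs, so each of its four output bits is a Boolean function of six variables.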

8 GPC Implementation. [Figure: a GPC implemented with K-input LUTs (K-LUT); labels N and K.]

9 Goal: How to best select GPC types and connect them to build a compressor tree. [Figure: bit columns labeled by rank, 0 to 3.]

15 ILP Formulation. Objective function: minimize the number of levels of GPCs. GPC representation in ILP: [Figure: a GPC with ports labeled k_i = 0, k_i = 1 and k_j = 0, k_j = 1, k_j = 2.]

16 ILP Formulation: Variables. p_{m,i,k_i} ∈ {0, 1}: true if there is a connection between the m-th input bit and an input of rank k_i of GPC i. [Figure, repeated on slides 16 to 19: example network of GPC 0, GPC 1, and GPC 2 with nodes m0 to m3 and n0 to n3, and connections labeled p 0,0,0, p 1,0,1, p 2,1,0, q 0,0,0, q 2,1,1, q 1,2,2, e 1,2,0,1, e 0,2,1,0, and D 3,3.]

17 ILP Formulation: Variables. q_{i,k_i,m} ∈ {0, 1}: true if there is a connection between the k_i-th output of GPC i and an output bit of rank m.

18 ILP Formulation: Variables. e_{i,j,k_i,k_j} ∈ {0, 1}: true if there is a connection from the k_i-th output of GPC i to an input of rank k_j of GPC j.

19 ILP Formulation: Variables. D_{i,j} ∈ {0, 1}: true if there is a direct connection from the i-th input bit to an output bit of rank j.

20 ILP Formulation: Connection rules. Circuit I/Os: each circuit input should be connected to either a GPC or the final adder; each output rank should be derived K times (K = 3; the final adder is a ternary adder). GPC I/Os: the number of allowable I/Os must be satisfied, taking input ranks into account. Wires: the rank constraints of the source and destination of each wire must be satisfied.
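
The slide states these rules only in words. One plausible way to write the first three as linear constraints over the variables p, q, e, and D is sketched below. This is a reading of the slide, not the authors' exact formulation: the capacity symbol c_{j,k_j} (the number of rank-k_j inputs that GPC j offers) is hypothetical, and writing the output-rank rule as an upper bound is an assumption.

```latex
% Each circuit input bit m feeds exactly one destination:
% an input of some GPC, or the final adder directly.
\sum_{i} \sum_{k_i} p_{m,i,k_i} + \sum_{j} D_{m,j} = 1
  \qquad \forall m

% Each output rank j receives at most K bits (K = 3 for a ternary final adder).
\sum_{i} \sum_{k_i} q_{i,k_i,j} + \sum_{m} D_{m,j} \le K
  \qquad \forall j

% The rank-k_j inputs of GPC j cannot be oversubscribed
% (c_{j,k_j} is a hypothetical capacity constant).
\sum_{m} p_{m,j,k_j} + \sum_{i} \sum_{k_i} e_{i,j,k_i,k_j} \le c_{j,k_j}
  \qquad \forall j, k_j
```

The wire rule (matching ranks at the source and destination of each connection) depends on how GPC positions are encoded, which the slides do not show, so it is left out of this sketch.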

21 ILP Formulation: ILP improvement. The heuristic of [Parandeh-Afshar et al., ASP-DAC 2008] is used to estimate the maximum number of GPCs at each level. A GPC on level L can only connect to inputs of GPCs on levels L+1 and L+2.
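
The level restriction shrinks the model because a GPC-to-GPC connection variable is only needed for pairs that the rule permits. A minimal sketch of that pruning follows, assuming each candidate GPC has already been assigned to a level; the function name `allowed_edges` and the dictionary layout are hypothetical.

```python
def allowed_edges(level_of):
    """Return the (source, destination) GPC pairs that may be connected.

    level_of maps a GPC identifier to its level. Following the slide, a GPC
    on level L may only drive GPCs on levels L+1 and L+2.
    """
    return {
        (i, j)
        for i, li in level_of.items()
        for j, lj in level_of.items()
        if lj - li in (1, 2)
    }


# Five candidate GPCs spread over four levels (made-up example data).
levels = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}
edges = allowed_edges(levels)
```

With this example data, `edges` contains 7 pairs out of the 20 possible ordered pairs of distinct GPCs; a pair such as `(0, 4)`, which skips two levels, is excluded.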

23 Experimental Methodology. CPLEX ILP solver. Altera Stratix II, 90 nm CMOS technology. Implementations of multi-input addition compared: Adder Tree: ternary adder tree, the state of the art for FPGAs; Heuristic: mapping heuristic described in [13]; ILP: the ILP formulation described here.

24 Experimental Results (Delay). On average, ILP is 32% faster than Adder Tree and 5% faster than the Heuristic.

25 Experimental Results (Area). On average, ILP consumes 3% fewer resources than Adder Tree and 13% fewer resources than the Heuristic.

27 Conclusion. Conventional wisdom has held that adder trees outperform compressor trees on FPGAs; ternary adder trees were a major selling point of the Altera Stratix II architecture. Conventional wisdom is wrong! GPCs map nicely onto LUTs. Compressor trees on FPGAs are faster than adder trees when built from GPCs.

