1 Synthesizing Datapath Circuits for FPGAs With Emphasis on Area Minimization Andy Ye, David Lewis, Jonathan Rose Department of Electrical and Computer.


1 Synthesizing Datapath Circuits for FPGAs With Emphasis on Area Minimization
Andy Ye, David Lewis, Jonathan Rose
Department of Electrical and Computer Engineering, University of Toronto
{yeandy, lewis, jayar}@eecg.utoronto.ca

2 Motivation: Datapath Regularity
Larger FPGAs
  – Larger applications on FPGAs
  – More datapath logic in larger applications
  – Datapath logic is highly regular
Utilize regularity to improve logic density

3 Utilizing Datapath Regularity
A new datapath-oriented FPGA
New CAD tools supporting the new FPGA
  – Synthesis
  – Packing
  – Placement
  – Routing
This talk focuses on synthesis

4 Background: Datapath-oriented FPGA
Architected to utilize datapath regularity
Architectural features
  – Capture regularity using special logic blocks
  – Increase logic density by coarse grain routing

5 Background: FPGA Overview
[Figure: logic clusters (L) and switch boxes (S) connected by routing channels that carry coarse grain and fine grain routing tracks]

6 Background: Logic Cluster
[Figure: a logic cluster with Subcluster 1 to Subcluster 4; a subcluster with BLEs and a local routing network; a basic logic element (BLE) with LUT, DFF and MUX]

7 Background: FPGA Overview
[Figure: logic clusters (L) and switch boxes (S) connected by routing channels that carry coarse grain and fine grain routing tracks]

8 Background: Coarse Grain Routing Tracks
[Figure: a logic cluster of four subclusters next to a switch box, showing coarse grain routing and fine grain routing]

9 Datapath Synthesis
Synthesis
  – The first step in a fully automated CAD flow
  – Transforms high level descriptions into logic
Conventional synthesis (flat synthesis)
  – Minimizes area and delay metrics
  – Destroys datapath regularity
Datapath synthesis
  – Preserves datapath regularity
  – Supports downstream CAD tools

10 Datapath Representation
Datapath circuits are represented by netlists of datapath components (VHDL or Verilog)
Datapath component library
  – Multiplexers
  – Adders/subtracters
  – Shifters
  – Comparators
  – Registers
Each component consists of identical bit-slices

11 Hard Boundary Hierarchical Synthesis
Optimize within the boundaries of bit-slices
Keep identical bit-slices identical
Optimized 15 datapath circuits from the Pico-java processor using Synopsys [sun]
  – Good regularity
  – Bad area: 38% area inflation
FPGA architecture: increase logic density
  – Need a better synthesis tool

12 Causes of Area Inflation
Examined circuits to determine the causes
Constraint of preserving bit-slice boundaries
  – Common sub-expressions exist across bit-slices
  – Harder to discover in datapath synthesis
Constraint of preserving datapath regularity
  – Identical bit-slices have different external connections
  – Some bit-slices have more optimization opportunities
  – Missing optimization opportunities if one has to keep all bit-slices identical

13 Enhanced Module Compaction
[Flow diagram, labels in order: Netlist of Datapath Components; Word-level Optimization; Module Compaction; Bit-slice Netlist I/O Optimization; Flat Synthesis & Optimization Within Bit-slice Boundaries; Netlist of Synthesized Bit-slices. Legend: Manual Operation]

14 Word-level Optimization
Done manually for now and will be automated
Optimizes across bit-slice boundaries
Uses the functionality of each datapath component to create optimization opportunities
Two optimizations are performed
  – Multiplexer tree collapsing
  – Operation reordering
More in the future

15 Multiplexer Tree Collapsing
Datapath circuits contain multiplexers in a tree topology
Collapses several multiplexers in a multiplexer tree into a single multiplexer
Collapsing operation creates common sub-expressions
Extracts common expressions out of multiple bit-slices to save area

16 An Example
[Figure: multiplexer tree collapsing example with mux1 and mux2, select signals S1 and S2, flip-flops (FF), signals A and R, and random logic (rl)]
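
The collapsing step on slide 15 can be illustrated with a small sketch. It assumes a two-level tree of 2:1 multiplexers (the names mux2, mux_tree, decode, collapsed_mux and collapsed_word are hypothetical, as is the choice of a one-hot select decode as the shared sub-expression). It only checks that the collapsed form computes the same function and that the select decode can be computed once per word instead of once per bit-slice; it is not the authors' implementation.

```python
from itertools import product

def mux2(sel, a, b):
    """2:1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a

def mux_tree(s1, s2, a, b, c):
    """Two-level tree: mux1 picks a or b, mux2 picks that result or c."""
    return mux2(s2, mux2(s1, a, b), c)

def decode(s1, s2):
    """Select decoding (assumed form of the common sub-expression).
    It depends only on the select signals, so every bit-slice can share it."""
    return (not s2 and not s1, not s2 and bool(s1), bool(s2))

def collapsed_mux(onehot, a, b, c):
    """Single 3-input multiplexer driven by the shared one-hot select."""
    sel_a, sel_b, sel_c = onehot
    return int((sel_a and a) or (sel_b and b) or (sel_c and c))

def collapsed_word(s1, s2, word_a, word_b, word_c):
    """Apply the collapsed multiplexer to every bit-slice of a word."""
    onehot = decode(s1, s2)  # computed once, outside the bit-slices
    return [collapsed_mux(onehot, a, b, c)
            for a, b, c in zip(word_a, word_b, word_c)]

# Exhaustive equivalence check for a single bit-slice.
for s1, s2, a, b, c in product((0, 1), repeat=5):
    assert mux_tree(s1, s2, a, b, c) == collapsed_mux(decode(s1, s2), a, b, c)
```

With s1 = 1 and s2 = 0 the collapsed word-level multiplexer passes the second operand through unchanged, as the original tree would.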

17 Operation Reordering
Transforms result selection into operand selection
Accepts the transformation if it results in a smaller area

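18 An Example
[Figure: operation reordering example. Before: two adders compute a + b and c + d, and a multiplexer with select s picks the result e. After: multiplexers with select s pick the operands (a or c, b or d) and a single adder produces e. The bit-slice view shows signals a0, b0, c0, d0, s0, e0 with carry signals cin0a, cin0b, cout0a, cout0b before and cin0, cout0 after]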
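
The equivalence behind operation reordering can be checked with a short sketch. The function names are hypothetical, and the polarity of s (which sum it selects) is an assumption, since the transcript does not state it.

```python
from itertools import product

def result_selection(s, a, b, c, d):
    """Before reordering: two adders, then a multiplexer on the two sums."""
    return (c + d) if s else (a + b)

def operand_selection(s, a, b, c, d):
    """After reordering: two multiplexers on the operands, then one adder."""
    x = c if s else a
    y = d if s else b
    return x + y

# Exhaustive check over small operand values: both forms compute the same e.
for s in (0, 1):
    for a, b, c, d in product(range(4), repeat=4):
        assert result_selection(s, a, b, c, d) == operand_selection(s, a, b, c, d)
```

The reordered form trades one adder for one extra multiplexer, which is why the slide accepts the transformation only when the synthesized area actually shrinks.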

19 Module Compaction
Merges bit-slices into larger bit-slices
Based on connectivity between datapath components
Larger bit-slices have more optimization opportunities for flat synthesis
Avoids merging based on carry chains
Similar to the algorithm proposed by Koch

20 An Example
[Figure: module compaction example with multiplexer bit-slices mux0 to mux3 and full-adder bit-slices FA0 to FA4]
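
A simplified illustration of the merging rule on slide 19 is sketched below. It is not the algorithm proposed by Koch nor the authors' tool: it simply unions bit-slices that are connected by a non-carry net and never merges along a carry net. The function name compact, the net format (source, destination, is_carry) and the assumed connectivity (mux_i drives FA_i, FA_i carries into FA_i+1) are all hypothetical.

```python
def compact(slices, nets):
    """Group bit-slices connected by non-carry nets (union-find sketch)."""
    parent = {s: s for s in slices}

    def find(s):
        # Follow parent links to the representative, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for src, dst, is_carry in nets:
        if not is_carry:  # carry-chain connections never trigger a merge
            parent[find(dst)] = find(src)

    groups = {}
    for s in slices:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())

# Assumed connectivity for the mux0..mux3 / FA0..FA4 example.
slices = [f"mux{i}" for i in range(4)] + [f"FA{i}" for i in range(5)]
nets = [(f"mux{i}", f"FA{i}", False) for i in range(4)]    # data nets
nets += [(f"FA{i}", f"FA{i + 1}", True) for i in range(4)]  # carry chain
merged = compact(slices, nets)
```

Under these assumptions each mux_i is merged with FA_i into one larger bit-slice, while FA4 stays on its own because only a carry net reaches it.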

21 Bit-slice I/O Optimization
Granularity of bit-slice I/O optimization, m
Breaks datapath components into m-bit wide chunks
The m bit-slices of each chunk are kept identical to each other
Allows some bit-slices in a datapath component to be optimized more than others

22 Bit-slice I/O Optimization
Converts bit-slice I/O signals into internal signals if all m bit-slices meet an optimization criterion
More optimization opportunities for flat synthesis
Four types of I/O optimizations
  – Constant absorption
  – Feedback absorption
  – Duplicated input absorption
  – Unused output absorption
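
The role of the granularity m can be sketched for one of the four optimizations, constant absorption. The criterion used here (an input is absorbed only when it is tied to the same constant in all m bit-slices of a chunk) is an assumption about what "meet an optimization criterion" means; the function name absorbable_chunks and the input encoding are hypothetical.

```python
def absorbable_chunks(tied_to, m):
    """Return the indices of m-bit chunks whose input can be absorbed.

    tied_to[i] is the constant (0 or 1) driving the input of bit-slice i,
    or None when that bit-slice is driven by a real signal.
    """
    chunks = [tied_to[i:i + m] for i in range(0, len(tied_to), m)]
    return [
        idx for idx, chunk in enumerate(chunks)
        # Assumed criterion: same constant in every bit-slice of the chunk.
        if chunk[0] is not None and all(v == chunk[0] for v in chunk)
    ]

# An 8-bit component whose input is constant 0 everywhere except bit 5.
tied = [0, 0, 0, 0, 0, None, 0, 0]
coarse = absorbable_chunks(tied, 4)  # m = 4
fine = absorbable_chunks(tied, 1)    # m = 1
```

With m = 4 only the first chunk qualifies, because one non-constant bit-slice blocks its whole chunk; with m = 1 every constant bit-slice qualifies individually. This is the trade-off between regularity and area that the granularity controls.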

23 Experimental Results
Fifteen benchmark circuits
  – From the Pico-java processor
  – Synthesized into 4-LUTs and DFFs
Experiments
  – Area
  – Regularity
  – Area against m (the granularity of bit-slice I/O optimization)

24 Area
m (granularity of bit-slice I/O optimization) = 4
Compare datapath synthesis with flat synthesis

25 Post-synthesis Area (LUT Count)
Circuit        Flat Synthesis Area   Datapath Synthesis Area   Inflation
icu_dpath      3120                  3235                      3.7%
ex_dpath       2530                  2553                      0.91%
multmod_dp     1558                  1634                      4.9%
ucode_dat      1243                  1304                      4.9%
imdr_dpath     1182                  1219                      3.1%
dcu_dpath      960                   966                       0.63%
mantissa_dp    846                   878                       3.8%
incmod_dp      779                   865                       11%
smu_dpath      490                   493                       0.61%
exponent_dp    477                   501                       5.0%
pipe_dpath     443                   471                       6.3%
prils_dp       377                   388                       2.9%
rsadd_dp       346                   305                       -12%
code_seq_dp    218                   223                       2.3%
ucode_reg      78                    82                        5.1%
Total Area     14647                 15117                     3.2%
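
The inflation column appears to be the relative increase in LUT count of datapath synthesis over flat synthesis. The sketch below assumes that definition (the function name inflation is hypothetical) and recomputes a few rows from the LUT counts on the slide.

```python
def inflation(flat_luts, datapath_luts):
    """Area inflation in percent, relative to the flat synthesis LUT count."""
    return 100.0 * (datapath_luts - flat_luts) / flat_luts

# LUT counts taken from the table: (flat synthesis, datapath synthesis).
rows = {
    "icu_dpath": (3120, 3235),
    "rsadd_dp": (346, 305),
    "Total Area": (14647, 15117),
}
for name, (flat, datapath) in rows.items():
    print(f"{name}: {inflation(flat, datapath):.1f}%")
```

A negative value, as for rsadd_dp, means that datapath synthesis produced fewer LUTs than flat synthesis for that circuit.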

26 Regularity
m (granularity of bit-slice I/O optimization) = 4
Two terminal connections captured by
  – 4-bit wide buses
  – 4-bit wide control groups

27 Regularity
[Figure: a 4-bit wide bus and a 4-bit wide control group, each drawn across bit-slices S1 to S4]

28 Regularity Results
Circuit        Two Terminal Connections   4-bit Wide Buses   4-bit Wide Control Groups
dcu_dpath      2232                       49%                43%
ex_dpath       6547                       52%                39%
icu_dpath      8047                       47%                36%
imdr_dpath     3100                       50%                36%
pipe_dpath     1049                       48%                42%
smu_dpath      1167                       48%                25%
ucode_data     3143                       52%                41%
ucode_reg      194                        72%                21%
code_seq_dp    799                        58%                18%
exponent_dp    1362                       32%                23%
incmod_dp      2013                       42%                33%
mantissa_dp    2533                       47%                36%
multmod_dp     3380                       39%                25%
prils_dp       864                        41%                32%
rsadd_dp       722                        52%                27%
Total          37152                      48%                35%
94% of LUTs remain in regular datapath components

29 Granularity (m) Vs. Area
Higher m (the granularity of bit-slice I/O optimization)
  – Keeps more bit-slices identical
  – Preserves more regularity
  – Higher area cost

30 Granularity Vs. Area Inflation

31 Conclusion
Presented a datapath-oriented FPGA architecture
Presented an enhanced module compaction algorithm
Empirically demonstrated the area efficiency of the algorithm
  – 3%-8% area inflation
Good regularity
  – 48% of two terminal connections are in 4-bit wide buses
  – 35% of two terminal connections are in 4-bit wide control groups

