Published by Baldwin Barker. Modified over 8 years ago.
1
1 Architecture of Datapath-oriented Coarse-grain Logic and Routing for FPGAs Andy Ye, Jonathan Rose, David Lewis Department of Electrical and Computer Engineering, University of Toronto {yeandy, jayar, lewis}@eecg.utoronto.ca
2
2 Outline Motivation –Datapath regularity A datapath-oriented FPGA –Architecture –CAD flow Experimental results –Area efficiency Conclusion
3
3 Modern FPGAs Very large logic capacities –Over 10 million equivalent logic gates Increasingly used to implement large and complex applications –Central processing units –Graphics accelerators –Digital signal processors –Packet switching networks
4
4 Datapath Circuits Large applications –Contain a greater amount of datapath circuits Datapath circuits –Consist of multiple identical logic structures called bit-slices Regularity Predictability
5
5 An Example [Figure: four full adders chained from Carry In to Carry Out, one per bit, with signals A0-A3, B0-B3 and C0-C3]
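The regularity in this example can be sketched in a few lines of Python. This is only an illustration, assuming the figure shows a 4-bit ripple-carry adder in which A and B are the operand bits and C is the sum output; the function names are hypothetical and not from the presentation.

```python
def full_adder(a, b, carry_in):
    """One bit-slice: return (sum_bit, carry_out) for single-bit inputs."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(a_bits, b_bits, carry_in=0):
    """Chain identical full-adder slices, least-significant bit first."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width=4):
    """Split an integer into a list of bits, least-significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Reassemble an integer from a least-significant-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Every iteration of the loop runs the same `full_adder` logic, which is the bit-slice regularity the architecture aims to exploit.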
6
6 An Example
7
7 Research Goal Design a new FPGA architecture –Utilize datapath regularity Reduce the implementation area of datapath circuits on FPGAs Implement a full set of CAD tools for the new architecture –Synthesis –Packing –Placement –Routing
8
8 Key Architectural Features A bus-oriented logic block architecture A mixture of coarse-grain and fine-grain routing tracks
9
9 Datapath FPGA Overview [Figure: array of logic blocks (L) and switch blocks (S) connected by routing channels that contain coarse-grain and fine-grain routing tracks]
10
10 Logic Block: Super-cluster [Figure: a super-cluster made of Cluster 1 to Cluster 4; each cluster contains BLEs and a local routing network; each basic logic element (BLE) contains a LUT, a DFF and a MUX]
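The hierarchy on this slide (super-cluster, cluster, BLE) can be modeled as nested data structures. The sketch below is a rough illustration only: the class and field names are hypothetical, the 4-input LUT size is taken from the later synthesis slides, and the local routing network and the registered (DFF) path are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BLE:
    """Basic logic element: a 4-input LUT followed by an optional DFF."""
    lut_mask: int = 0       # 16-bit truth table of the 4-input LUT
    use_dff: bool = False   # output MUX setting: registered or combinational

    def evaluate(self, inputs):
        """Return the LUT output for four input bits (combinational path only)."""
        index = sum(bit << i for i, bit in enumerate(inputs))
        return (self.lut_mask >> index) & 1


@dataclass
class Cluster:
    bles: List[BLE]


@dataclass
class SuperCluster:
    clusters: List[Cluster]


def make_super_cluster(num_clusters=4, bles_per_cluster=4):
    """Build an empty super-cluster with the given shape."""
    return SuperCluster(
        [Cluster([BLE() for _ in range(bles_per_cluster)])
         for _ in range(num_clusters)]
    )
```

For example, `BLE(lut_mask=0x8000)` behaves as a 4-input AND gate, because only the truth-table entry for inputs `[1, 1, 1, 1]` is set.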
11
11 Datapath FPGA Overview [Figure: array of super-clusters (L) and switch blocks (S) connected by routing channels that contain coarse-grain and fine-grain routing tracks]
12
12 Coarse-grain Routing Tracks [Figure: a super-cluster and its clusters connected to a switch block; elements labeled M are shown once for coarse-grain routing and four times for fine-grain routing]
13
13 CAD Flow CAD flow for the datapath-oriented FPGA consists of –Synthesis –Packing –Placement –Routing Conventional CAD flow –Minimizes area and delay metrics –Destroys datapath regularity
14
14 Datapath-oriented CAD Flow Preserve datapath regularity (bit-sliced structures) Map the preserved regularity onto the datapath-oriented FPGA architecture Maximize the utilization of coarse-grain routing tracks –Minimize the implementation area of datapath structures
15
15 Datapath Representation Datapath circuits are represented by netlists of datapath components (VHDL or Verilog) Datapath component library –Multiplexers –Adders/subtractors –Shifters –Comparators –Registers Each component consists of identical bit-slices
16
16 Synthesis Enhanced module compaction algorithm Based on the Synopsys FPGA compiler Augmented with several datapath-oriented features –Preserve datapath regularity by preserving bit-slice boundaries –Achieve area results as good as those of conventional synthesis tools
17
17 An Example Datapath Circuit [Figure: four bit-slices (0 to 3), each containing a multiplexer (mux) and an adder (+) with signals a, b, c, d and s for that bit; a sel signal; a carry chain from c_in to c_out]
18
18 Synthesis [Figure: bit-slice 0 (mux and adder, signals a0, b0, c0, d0, s0, sel, c_in) mapped onto two 4-LUTs]
19
19 Synthesis [Figure: all four bit-slices (0 to 3) mapped onto 4-LUTs, with signals a, b, c, d and s for each bit, sel, c_in and c_out]
20
20 Packing Based on the T-VPack packing algorithm Pack adjacent bit-slices into super-clusters Utilize carry connections in super-clusters to minimize the delay of carry chains
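The idea of keeping adjacent bit-slices together can be sketched as a simple grouping step. This is a deliberately simplified illustration, not the T-VPack-based algorithm itself: it assumes each bit-slice has already been packed into one cluster and that slices are listed in bit order. Both function names are hypothetical.

```python
def pack_into_super_clusters(slice_clusters, clusters_per_super_cluster=4):
    """Group consecutive bit-slice clusters into super-clusters."""
    m = clusters_per_super_cluster
    return [slice_clusters[i:i + m] for i in range(0, len(slice_clusters), m)]


def internal_carry_links(super_clusters):
    """Count carry hops between neighboring slices inside one super-cluster."""
    return sum(len(sc) - 1 for sc in super_clusters)
```

With eight slices and four clusters per super-cluster, six of the seven carry hops stay inside a super-cluster, where the dedicated carry connections can be used.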
21
21 An Example Four clusters per super-cluster Two BLEs per cluster Six inputs per cluster
22
22 Packing Into Clusters [Figure: the two 4-LUTs of bit-slice 0 (signals a0, b0, c0, d0, s0, sel, c_in) packed into the two BLEs of one cluster]
23
23 Packing Into Super-clusters [Figure: the clusters for bit-slices 0 to 3 packed into one super-cluster, with signals a0-a3, b0-b3, c0-c3, d0-d3, s0-s3, sel, c_in and c_out]
24
24 Placement Based on the VPR placer Uses the simulated annealing algorithm For super-clusters containing datapath circuits –Move super-clusters only For super-clusters containing non-datapath circuits –Move individual clusters
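A generic simulated-annealing placement loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the VPR placer: the cost is a plain half-perimeter wirelength, the cooling schedule is a fixed geometric decay, and all names and parameters are hypothetical. The move rule from the slide is reflected only in the `movable` list, which the caller fills with whole super-clusters for datapath logic and individual clusters otherwise.

```python
import math
import random


def wirelength(placement, nets):
    """Sum of half-perimeter bounding boxes over all nets."""
    total = 0
    for net in nets:
        xs = [placement[block][0] for block in net]
        ys = [placement[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal(placement, nets, movable, steps=2000, t_start=5.0, alpha=0.995, seed=0):
    """Swap pairs of movable units, accepting uphill moves with decaying odds."""
    rng = random.Random(seed)
    current = dict(placement)
    cost = wirelength(current, nets)
    best, best_cost = dict(current), cost
    temperature = t_start
    for _ in range(steps):
        a, b = rng.sample(movable, 2)
        current[a], current[b] = current[b], current[a]
        new_cost = wirelength(current, nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            # Rejected move: swap the two units back.
            current[a], current[b] = current[b], current[a]
        temperature *= alpha
    return best, best_cost
```

Because the function returns the best placement seen so far, the reported cost is never worse than the cost of the starting placement.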
25
25 Routing Based on the VPR router Uses the PathFinder algorithm As much as possible –Route buses through coarse-grain routing tracks –Route individual signals through fine-grain routing tracks When necessary –Use coarse-grain routing tracks for individual signals –Use fine-grain routing tracks for buses
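The track preference described on this slide can be sketched as a small selection function. This is only an illustration of the preference order, not the PathFinder negotiated-congestion algorithm. It assumes that one group of coarse-grain tracks carries one bus, that a bus on fine-grain tracks needs one track per bit, and that an individual signal occupies a whole coarse-grain group when it falls back. The function name and arguments are hypothetical.

```python
def choose_tracks(is_bus, width, free_coarse_groups, free_fine_tracks):
    """Return (track_type, count) for one connection, or None if nothing fits."""
    if is_bus:
        # Preferred: one coarse-grain group carries the whole bus.
        if free_coarse_groups >= 1:
            return ("coarse", 1)
        # Fallback: one fine-grain track per bit of the bus.
        if free_fine_tracks >= width:
            return ("fine", width)
    else:
        # Preferred: a single fine-grain track.
        if free_fine_tracks >= 1:
            return ("fine", 1)
        # Fallback: occupy a coarse-grain group with a single signal.
        if free_coarse_groups >= 1:
            return ("coarse", 1)
    return None
```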
26
26 Area Efficiency Benchmarks –15 datapath circuits from the picoJava processor Architectural assumptions –Four BLEs per cluster –Four clusters per super-cluster –Four coarse-grain tracks sharing configuration memory –Logic track length of two –Disjoint switch block topology Architectural variables –Number of coarse-grain tracks
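One way to see why sharing configuration memory can save area is to count configuration bits. The sketch below is a back-of-the-envelope model, not the area model used in the experiments: it assumes a fixed number of configuration bits per fine-grain track and that each group of four coarse-grain tracks is controlled by one shared set of those bits. The function name and parameters are hypothetical.

```python
def config_bits(num_fine_tracks, num_coarse_tracks, share_group=4, bits_per_track=1):
    """Count configuration bits when coarse-grain tracks share memory in groups."""
    if num_coarse_tracks % share_group != 0:
        raise ValueError("coarse-grain tracks must form whole sharing groups")
    coarse_groups = num_coarse_tracks // share_group
    return bits_per_track * (num_fine_tracks + coarse_groups)
```

Under these assumptions a channel of 16 fine-grain tracks needs 16 units of configuration memory, while 8 fine-grain plus 8 coarse-grain tracks need 10. The model ignores the routing flexibility lost when tracks are grouped, which is one reason the measured optimum is a mixture rather than all coarse-grain tracks.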
27
27 Area Efficiency [Figure: circuit area in minimum transistor area (x10^6, axis 1.40 to 1.60) and normalized circuit area (axis 90.0% to 100.0%) versus % of coarse-grain tracks (0%, 0%-10%, 10%-20%, 20%-30%, 30%-40%, 40%-50%, 50%-60%, 60%-70%)]
28
28 Logic Track Length Vs. Area Architectural assumptions –Four clusters per super-cluster –Four coarse-grain tracks share configuration memory –50% of tracks are coarse-grain tracks –Disjoint switch block topology Architectural variables –Number of BLEs per cluster –Logic track length
29
29 Logic Track Length Vs. Area [Figure: circuit area in minimum transistor area (x10^6, axis 1.60 to 2.20) versus track length (1, 2, 4, 8, 16), with one curve each for N = 2, N = 4, N = 8 and N = 10]
30
30 Conclusion Proposed a datapath-oriented FPGA architecture and its CAD tools Best area is achieved when –40% to 50% of tracks are coarse-grain routing tracks –Four BLEs per cluster –Logic track length of two Best area is 9.6% smaller than conventional FPGAs