Yu Hu, Satyaki Das, Steve Trimberger, and Lei He


1 Design, Synthesis and Evaluation of Heterogeneous FPGA with Mixed LUTs and Macro-Gates
Yu Hu¹, Satyaki Das², Steve Trimberger², and Lei He¹. ¹Electrical Engineering Dept., UCLA. ²Research Lab, Xilinx Inc. Presented by Yu Hu. Address comments to

2 Heterogeneous FPGA with Macro-Gates
There is a trade-off between programmability and cost (performance, area, power, etc.). Xilinx V4 benefits from small gates (MUX2, XOR2) built into SLICEs. Goal: seek a small set of wider logic functions (macro-gates) to replace a large portion of LUTs, reducing logic area and delay. What is missing? Design: what should be inside these macro-gates? CAD: flexible synthesis tools are needed to evaluate the architecture.

3 Selection of Logic Functions for Macro-Gates
Flow: map with LUT-N → extract logic functions → generate utilization NPN diagram → calculate score for logic functions → rank logic functions.
[Figure: a sample network (nodes a to h) mapped into LUTs, and a utilization NPN diagram whose nodes carry three-field labels such as a / 1 / 25%, ab' + a'b / 1 / 50%, ab'c' + a'bc' / 1 / 75%, ab / 0 / 25%, abc / 1 / 50%, and -0- / 0 / xx%. Score computations shown on the slide: 1 + 1*1/2 = 1.5; 1; 1*1/2 = 0.5; 1 + 1*1/3 = 1.33; 1 + 1*2/3 + 1*1/3 = 2. Best function: ab'c' + a'bc'.]
This slide shows how we perform logic function exploration and ranking so that we can select a small subset of logic functions to build macro-gates. Suppose we want to study logic functions with up to K inputs. Given a set of sample circuits, we first map them into K-LUTs and then extract the logic functions that appear in the LUTs. A particular data structure, called the utilization NPN diagram, is generated from those extracted logic functions. After that, each node in this diagram is labeled based on the frequency of the logic functions represented by that node. Finally, we traverse this diagram to obtain a ranking of the logic functions.
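The extraction and frequency-counting steps can be illustrated with a small sketch. It groups 3-input LUT functions into NPN classes (equivalence under input permutation, input negation, and output negation) by brute force and counts how often each class occurs. This is only an assumption-laden illustration: the helper names (`truth_table`, `npn_canonical`, `rank_functions`) and the sample functions are hypothetical, and the plain frequency count stands in for the utilization NPN diagram scoring, whose exact rule is not spelled out in this transcript.

```python
from collections import Counter
from itertools import permutations

def truth_table(f, k):
    """Pack the outputs of f over all 2**k input assignments into an int bitmask."""
    tt = 0
    for m in range(1 << k):
        bits = [(m >> i) & 1 for i in range(k)]
        if f(*bits):
            tt |= 1 << m
    return tt

def npn_canonical(tt, k):
    """Smallest truth table reachable by permuting/negating inputs and negating the output."""
    rows = 1 << k
    full = (1 << rows) - 1
    best = full
    for perm in permutations(range(k)):
        for neg in range(1 << k):          # bitmask of negated inputs
            new = 0
            for m in range(rows):
                src = 0
                for i in range(k):
                    bit = ((m >> i) & 1) ^ ((neg >> i) & 1)
                    src |= bit << perm[i]
                if (tt >> src) & 1:
                    new |= 1 << m
            best = min(best, new, new ^ full)  # new ^ full is the output-negated variant
    return best

def rank_functions(lut_functions, k):
    """Count NPN classes among the extracted LUT functions, most frequent first."""
    counts = Counter(npn_canonical(truth_table(f, k), k) for f in lut_functions)
    return counts.most_common()

# Hypothetical extracted LUT contents (not taken from the benchmarks).
sample = [
    lambda a, b, c: (a ^ b) & (1 - c),        # ab'c' + a'bc'
    lambda a, b, c: (a ^ b) & c,              # same class, c negated
    lambda a, b, c: 1 - ((a ^ b) & (1 - c)),  # same class, output negated
    lambda a, b, c: a & b & c,                # abc
]
ranking = rank_functions(sample, 3)
```

Brute-force canonicalization is practical only for small K (it visits K! * 2^K input transformations per function), which is enough to show the idea for 3-input examples.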

4 Proposed Macro-Gates and FPGA Architecture
For the IWLS'05 benchmarks, the following four 6-input functions have the highest ranks:
GI1 = a b c d e f (AND-6)
GI2 = a' b' c' + b c f' + b c' d' + b' c e (MUX-4)
GI3 = a b' c d' e + b c e f + d e f
GI4 = a b' + a' c d' + b' c' + e' + f'
[Figure: architecture of the proposed macro-gate and FPGA slice]
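As a quick sanity check on the expressions above, the four functions can be written out and evaluated over all 64 input combinations. The sketch below also confirms the MUX-4 label on GI2: with b and c as select lines, GI2 picks among a', e, d', and f'. The helper names (`n`, `mux4`, `INPUTS`) are introduced here for illustration only.

```python
from itertools import product

def n(x):
    """Logical NOT on a 0/1 integer."""
    return 1 - x

def gi1(a, b, c, d, e, f):
    return a & b & c & d & e & f

def gi2(a, b, c, d, e, f):
    return (n(a) & n(b) & n(c)) | (b & c & n(f)) | (b & n(c) & n(d)) | (n(b) & c & e)

def gi3(a, b, c, d, e, f):
    return (a & n(b) & c & n(d) & e) | (b & c & e & f) | (d & e & f)

def gi4(a, b, c, d, e, f):
    return (a & n(b)) | (n(a) & c & n(d)) | (n(b) & n(c)) | n(e) | n(f)

def mux4(s1, s0, d0, d1, d2, d3):
    """4:1 multiplexer: select lines (s1, s0) choose one of four data inputs."""
    return (d0, d1, d2, d3)[2 * s1 + s0]

INPUTS = list(product((0, 1), repeat=6))

# GI2 as a MUX-4 with select (b, c) and data inputs a', e, d', f'.
gi2_is_mux4 = all(
    gi2(a, b, c, d, e, f) == mux4(b, c, n(a), e, n(d), n(f))
    for a, b, c, d, e, f in INPUTS
)
```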

5 Mapping: Resource Utilization Balancer
The available resources of each logic type in an FPGA are fixed. The technology mapper should therefore optimize the logic resource utilization rate to minimize the packing area. A binary integer linear programming (ILP) formulation is used to balance the logic resource utilization while preserving timing.
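To make the balancing idea concrete, here is a toy sketch with one binary decision per node (implement it as a LUT or as a macro-gate). It is not the formulation from the paper: exhaustive enumeration stands in for an ILP solver, the delay values in `DELAY` are hypothetical, timing is modeled as simple per-path delay bounds, and the cost assumes one LUT and one macro-gate per BLE, so the packing area is the larger of the two counts.

```python
from itertools import product

# Hypothetical delays (arbitrary units); not taken from the slides.
DELAY = {"LUT": 2, "MG": 1}

def balance(options, paths, bounds):
    """Pick LUT/MG per node to minimize max(#LUT, #MG) under path-delay bounds.

    options: node -> list of allowed implementations ("LUT" and/or "MG")
    paths:   list of node sequences; bounds: matching list of delay limits
    Returns (cost, assignment) or None if no assignment meets timing.
    """
    nodes = sorted(options)
    best = None
    for choice in product(*(options[v] for v in nodes)):
        assign = dict(zip(nodes, choice))
        # Timing: every path must stay within its delay bound.
        if any(sum(DELAY[assign[v]] for v in p) > bound
               for p, bound in zip(paths, bounds)):
            continue
        # Area: with one LUT and one MG per BLE, BLE count = larger of the two.
        cost = max(choice.count("LUT"), choice.count("MG"))
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best

# Toy instance: node 3 can only be a LUT; path [0, 3] is timing-critical.
options = {0: ["LUT", "MG"], 1: ["LUT", "MG"], 2: ["LUT", "MG"], 3: ["LUT"]}
result = balance(options, paths=[[0, 3], [1, 2]], bounds=[3, 4])
```

In this instance the critical path forces node 0 onto a macro-gate, and the remaining nodes are split so that LUT and macro-gate usage are even, giving two BLEs instead of three.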

6 Mapping: SAT-Based Slice Packing
Formulate the slice packing problem as a localized place-and-route validation problem, which is solved by SAT. Constraint types: exclusivity, presence, input/output, and routing. More constraints in the paper …
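A minimal sketch of how a packing check can be phrased as SAT follows. The encoding is an assumption made for illustration, since the slide's formulas are not available here: a variable x[node, site] means the node is placed on that site, presence clauses require each node to land on a compatible site, and exclusivity clauses forbid double placement and shared sites. Input/output and routing constraints are omitted, and a brute-force search stands in for a real SAT solver. All names are hypothetical.

```python
from itertools import combinations, product

def pack_clauses(nodes, sites):
    """Build CNF clauses for placing typed nodes on typed sites.

    nodes, sites: dicts mapping a name to its kind ("LUT" or "MG").
    A literal is (variable_index, polarity).
    """
    var = {(v, s): i for i, (v, s) in enumerate(product(nodes, sites))}
    clauses = []
    for v in nodes:
        # Presence: the node sits on at least one site of its own kind.
        clauses.append([(var[v, s], True) for s in sites if sites[s] == nodes[v]])
        # Incompatible sites are forbidden outright.
        clauses += [[(var[v, s], False)] for s in sites if sites[s] != nodes[v]]
        # Exclusivity: a node occupies at most one site.
        for s1, s2 in combinations(sites, 2):
            clauses.append([(var[v, s1], False), (var[v, s2], False)])
    for s in sites:
        # Exclusivity: a site holds at most one node.
        for v1, v2 in combinations(nodes, 2):
            clauses.append([(var[v1, s], False), (var[v2, s], False)])
    return var, clauses

def solve(num_vars, clauses):
    """Brute-force SAT: return a satisfying assignment tuple or None."""
    for bits in product((False, True), repeat=num_vars):
        if all(any(bits[i] == pol for i, pol in c) for c in clauses):
            return bits
    return None

def can_pack(nodes, sites):
    var, clauses = pack_clauses(nodes, sites)
    return solve(len(var), clauses) is not None

slice_sites = {"lut_site": "LUT", "mg_site": "MG"}
fits = can_pack({"n1": "LUT", "n2": "MG"}, slice_sites)
too_many_luts = can_pack({"n1": "LUT", "n2": "LUT"}, slice_sites)
```

Brute force is only workable for a handful of variables; the point is the shape of the clauses, not the solving method.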

7 Overall Flow for Technology Mapping
Flow: area weight setting → cut-based mapping (LUT6, MG6) → area-balance trade-off? (Y/N) → LUT-MG ratio balancer → packing (LUT6, MG6).
This slide shows the overall flow of technology mapping for FPGAs based on mixed macro-gates and LUTs. We first define the delay and area cost for a LUT and a macro-gate, then perform cut-based technology mapping. If the utilization of resources (LUTs and macro-gates) is not uniform (balanced), a resource balancing algorithm is performed to adjust the mapping result. Finally, a packing procedure is conducted.
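The cut-based mapping step builds on K-feasible cut enumeration. The sketch below shows that standard bottom-up enumeration on a tiny hypothetical network; it is not the mapper used in this work, and the cost-driven cut selection and the check of whether a cut's function fits a macro-gate are left out. The function name and the example network are assumptions.

```python
def enumerate_cuts(fanins, order, k):
    """Enumerate all cuts with at most k leaves for each node.

    fanins: node -> list of fanin nodes (primary inputs have no entry)
    order:  nodes in topological order (fanins before fanouts)
    Returns node -> set of frozensets of leaf nodes.
    """
    cuts = {}
    for node in order:
        if not fanins.get(node):
            cuts[node] = {frozenset([node])}  # primary input: trivial cut only
            continue
        merged = {frozenset()}
        for fi in fanins[node]:
            # Combine one cut from each fanin, dropping any that exceed k leaves.
            merged = {c1 | c2 for c1 in merged for c2 in cuts[fi]
                      if len(c1 | c2) <= k}
        cuts[node] = merged | {frozenset([node])}  # add the trivial cut
    return cuts

# Hypothetical network: g = f1(a, b), h = f2(g, c).
fanins = {"g": ["a", "b"], "h": ["g", "c"]}
order = ["a", "b", "c", "g", "h"]
cuts3 = enumerate_cuts(fanins, order, k=3)
cuts2 = enumerate_cuts(fanins, order, k=2)
```

With k=3, node h can be covered by a single element spanning a, b, and c; with k=2 only the cut {g, c} remains besides the trivial one, so g needs its own logic element.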

8 Architecture Evaluation
Four architectures are compared: LUT4, LUT4 + macro-gate, LUT6, and LUT6 + macro-gate. Power and delay model: based on transistor count. For the IWLS'05 benchmarks, mixing LUTs and macro-gates reduces delay and device area. Using the International Workshop on Logic Synthesis 2005 (IWLS'05) benchmarks (22 circuits), we first performed our functionality exploration approach, found the four 6-input logic functions with the highest scores, and built them into a macro-gate that allows input/output inversion and selection among the four logic functions. We assume there is one LUT and one macro-gate in a BLE.

