Published by Leslie Buck Rose. Modified over 9 years ago.
1 Rapid Estimation of Power Consumption for Hybrid FPGAs
Chun Hok Ho (1), Philip Leong (2), Wayne Luk (1), Steve Wilton (3)
(1) Department of Computing, Imperial College London
(2) Department of Computer Science and Engineering, Chinese University of Hong Kong
(3) Department of Electrical and Computer Engineering, University of British Columbia
9 September 2008
2 Overview
1. Motivation
2. Contributions
3. Related work
4. Rapid power estimation flow
5. Technology mapper
6. Evaluation
7. Future work and conclusion
3 Motivation
For a new hybrid FPGA architecture:
  How can power dissipation be assessed rapidly?
  How can applications be mapped onto such an architecture effectively?
4 Contributions
High-level power estimation flow
  Estimates power using several vendor toolchains and techniques
Hybrid FPGA technology mapper
  Produces a netlist/bitstream from a dataflow graph (DFG)
5 Related work: hybrid FPGA architecture [1]
D=9, M=4, R=3, F=3, 2 add, 2 mul: best density over benchmarks
[1] C. Ho et al., “Domain-Specific Hybrid FPGA: Architecture and Floating Point Applications”, FPL 2007
6 Related work: Virtual Embedded Blocks [1]
Dummy blocks are used to model a coarse-grained block’s area and delay
A timing analyzer can then be used to determine the hybrid’s performance (including fine-to-coarse routing and delays)
[1] C. Ho et al., “Virtual Embedded Blocks: A Methodology for Evaluating Embedded Elements in FPGAs”, FCCM 2006
7 Power estimation flow
Different tool chains involved
  VEB modelling flow
  FPGA power spreadsheet model
  ASIC power compiler flow
Limitations
  Dynamic power consumption only (power lost to switching activity)
  A constant activity rate is assumed
  Core only: no I/O power is assessed
  First-order estimation
  A simulation-based model is required for accurate results
8 Power estimation flow
P_all: total power dissipation
P_fgu: power dissipated in the fine-grained units (FGU)
P_cgu: power dissipated in the coarse-grained units (CGU)
P_r: power dissipated in the routing between FGU and CGU
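The transcript lists the four quantities but not the equation that relates them (it was presumably shown as an image on the slide). Assuming the total is simply the sum of the three components, which is how the following slides treat them, the relation would read:

```latex
P_{\mathrm{all}} = P_{\mathrm{fgu}} + P_{\mathrm{cgu}} + P_{r}
```

Each term is then estimated by a different tool flow, as described on the slides that follow.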
9 Power estimation flow (P_fgu)
1. Synthesize the circuit with the VEB flow
2. Measure the power of the circuit with the spreadsheet approach (P')
   A constant activity rate of 12.5% is applied
3. Measure the power of the VEB with the spreadsheet approach (P_veb)
4. P_fgu = P' - P_veb
10 Power estimation flow (P_cgu)
1. Synthesize the coarse-grained unit with the ASIC flow
2. Configure the ASIC netlist with the bitstream
3. Apply a constant activity rate to all the nets
4. Estimate the dynamic power with the power compiler tool
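Steps 3 and 4 can be illustrated with a minimal sketch. The slides do not state the formula used inside the power compiler tool; the code below assumes the textbook first-order relation P = alpha * C * Vdd^2 * f summed over all nets (some texts include an extra factor of 1/2). The names `Net` and `dynamic_power` and every numeric value are hypothetical, and the 12.5% activity rate is borrowed from the P_fgu slide as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Net:
    """One net of the configured netlist (hypothetical representation)."""
    name: str
    capacitance_f: float  # switched capacitance in farads (made-up value)


def dynamic_power(nets, activity, vdd, freq_hz):
    """First-order dynamic power: sum of activity * C * Vdd^2 * f over all nets.

    The same constant activity rate is applied to every net, mirroring
    step 3 of the flow. This is a textbook model, not the vendor tool's.
    """
    return sum(activity * n.capacitance_f * vdd ** 2 * freq_hz for n in nets)


if __name__ == "__main__":
    # Two illustrative nets of 1 pF each, 1.5 V supply, 100 MHz clock.
    nets = [Net("n0", 1e-12), Net("n1", 1e-12)]
    watts = dynamic_power(nets, activity=0.125, vdd=1.5, freq_hz=100e6)
    print(f"estimated dynamic power: {watts * 1e6:.3f} uW")
```

Because the activity rate is constant, the estimate scales linearly with total switched capacitance, which is why the flow is described as first order.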
11 Power estimation flow (P_r)
P_r can be modelled by applying a suitable output load when estimating P_cgu
The output load can be calibrated against an existing embedded block
The embedded multiplier blocks in Virtex II are used for the calibration
12 Power estimation flow (P_r)
1. Measure the power of a multiplier in the FPGA using the spreadsheet (P_em)
2. Implement a multiplier in the ASIC flow
3. Measure the power of the ASIC multiplier (P_am)
4. Adjust the loading capacitance (C_L) such that P_am ≈ P_em
5. Apply C_L when estimating P_cgu
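Step 4 amounts to a one-dimensional search for C_L. The slides do not say how the adjustment is performed; the sketch below assumes a simple bisection, which works whenever the estimated ASIC power grows monotonically with the output load. `calibrate_load` and `toy_asic_multiplier_power` are hypothetical names, and the toy model stands in for a real run of the ASIC power flow.

```python
def calibrate_load(estimate_power, target_power, lo=0.0, hi=1e-9,
                   rel_tol=1e-3, max_iter=100):
    """Find a load C_L (farads) such that estimate_power(C_L) ~= target_power.

    estimate_power must be monotonically increasing in the load.
    Bisection is an assumption here, not the method stated in the slides.
    """
    if not estimate_power(lo) <= target_power <= estimate_power(hi):
        raise ValueError("target power is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        p = estimate_power(mid)
        if abs(p - target_power) <= rel_tol * target_power:
            return mid
        if p < target_power:
            lo = mid  # load too small, search the upper half
        else:
            hi = mid  # load too large, search the lower half
    return 0.5 * (lo + hi)


def toy_asic_multiplier_power(c_load):
    """Stand-in for P_am: fixed internal power plus 36 loaded outputs.

    All constants are made up for illustration only.
    """
    internal_w = 2e-3
    outputs, activity, vdd, freq_hz = 36, 0.125, 1.5, 100e6
    return internal_w + outputs * activity * c_load * vdd ** 2 * freq_hz


if __name__ == "__main__":
    p_em = 5e-3  # hypothetical spreadsheet figure for the FPGA multiplier
    c_l = calibrate_load(toy_asic_multiplier_power, p_em)
    print(f"calibrated C_L: {c_l * 1e12:.2f} pF")
```

The calibrated value would then be reused as the output load in the P_cgu estimate, so that the routing contribution P_r is absorbed into it.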
13 Technology mapper
A tool for producing a netlist/bitstream from a high-level description
Reuses existing C-to-gate compilers
  CHiMPS [1]
  Trident [2]
  fly [3]
Only the backend (the technology mapper) is different
[1] A. Putnam et al., “CHiMPS: A C-Level Compilation Flow for Hybrid CPU-FPGA Architectures”, FPL 2008
[2] J. Tripp et al., “Trident: An FPGA Compiler Framework for Floating-Point Algorithms”, FPL 2005
[3] C. Ho et al., “Fly - A Modifiable Hardware Compiler”, FPL 2002
14 Technology mapper
15 Technology mapper
Greedy algorithm
  Not optimal, but effective in most cases
  Packs as many operations as possible into a single coarse-grained unit
  No suitable block: use a soft core
  Coarse-grained units used up: use a soft core
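The greedy rules above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' mapper: the slot counts (2 adders, 2 multipliers per coarse-grained unit) are taken from the related-work slide, while `greedy_map`, `Cgu`, and the locality rule (an operation joins the open unit only if every operand is a primary input or was produced inside that unit) are assumptions made for the sketch.

```python
from dataclasses import dataclass, field

# Operators available in one coarse-grained unit (CGU): 2 add, 2 mul.
CGU_SLOTS = {"fadd": 2, "fmul": 2}


@dataclass
class Cgu:
    index: int
    free: dict = field(default_factory=lambda: dict(CGU_SLOTS))
    produced: set = field(default_factory=set)


def greedy_map(ops, primary_inputs, max_cgus):
    """Map DFG operations, given in topological order, to CGUs or soft cores.

    ops: list of (dest, opcode, sources). Returns {dest: "cguN" or "soft"}.
    """
    inputs = set(primary_inputs)
    cgus, placement = [], {}
    for dest, opcode, sources in ops:
        target = None
        if opcode in CGU_SLOTS:
            if cgus:
                cur = cgus[-1]
                # Assumed locality rule: operands must be primary inputs
                # or values already produced inside the open CGU.
                local = all(s in inputs or s in cur.produced for s in sources)
                if local and cur.free[opcode] > 0:
                    target = cur  # pack into the currently open CGU
            if target is None and len(cgus) < max_cgus:
                target = Cgu(index=len(cgus))  # instantiate another CGU
                cgus.append(target)
        if target is None:
            # No suitable block, or CGUs used up: fall back to a soft core.
            placement[dest] = "soft"
        else:
            target.free[opcode] -= 1
            target.produced.add(dest)
            placement[dest] = f"cgu{target.index}"
    return placement


if __name__ == "__main__":
    dfg = [
        ("tmp1", "fadd", ("a", "b")),
        ("tmp2", "fadd", ("c", "d")),
        ("tmp3", "fmul", ("tmp1", "tmp2")),
        ("tmp4", "fsqrt", ("tmp3",)),
        ("z", "fmul", ("tmp4", "g")),
    ]
    print(greedy_map(dfg, ["a", "b", "c", "d", "g"], max_cgus=2))
```

With two units available, the two additions and the first multiplication share one unit, the square root falls back to the fine-grained fabric, and the final multiplication opens a second unit. With `max_cgus=1` the final multiplication would instead be placed in a soft core.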
16 Mapping example
17 Mapping example
fadd tmp1, a, b
fadd tmp2, c, d
18 Mapping example
fmul tmp3, tmp1, tmp2
19 Mapping example
fsqrt tmp4, tmp3
No dedicated square-root block, so a fine-grained unit is used
20 Mapping example
fmul z, tmp4, g
Instantiate another coarse-grained unit and connect everything together
21 Evaluation
How effective is the technology mapper?
  Compare with the optimal mapping
How much power/energy can be saved by introducing coarse-grained units?
  Compare with existing FPGA devices
22 Evaluation
8 benchmark circuits
  DSP computation kernels, e.g. bfly
  Linear algebra, e.g. mm3
  Complete application, e.g. bgm
  Synthetic benchmarks, e.g. syn2
Circuits are mapped to the hybrid FPGA using the technology mapper
The same circuits are synthesized to Xilinx Virtex II devices for comparison
23 Evaluation: technology mapper
24 Evaluation: power reduction
* syn7 is implemented on XC2V8000-5
25 Evaluation: energy reduction
Energy is reduced by 14 times on average
26 Future work
Integration of the technology mapper into existing compilers
  Trident, fly
Simulation-based power estimation flow for more accurate results
Comparison of power estimates with the HHVPR [1] flow
Static power consumption?
[1] N. Choy et al., “Activity-Based Power Estimation and Characterization of DSP and Multiplier Blocks in FPGAs”, FPT 2006
27 Conclusion
Rapid power estimation flow for hybrid FPGAs
  VEB flow, FPGA power spreadsheet, ASIC power compiler
Technology mapper for hybrid FPGAs
  Targets different coarse-grained units
  DFG input, for compatibility with existing compilers
  Produces netlist and bitstream
Assessment of hybrid FPGA power consumption
  Power reduced by 4 times
  Energy reduced by 14 times
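The two headline figures are linked through energy = power x execution time. As a rough reading only, and assuming the two averages can be treated as if they described one representative design (averages over benchmarks do not compose exactly), the ratio between them suggests how much of the energy saving comes from shorter execution time:

```latex
E = P \, t
\quad\Longrightarrow\quad
\frac{E_{\mathrm{FPGA}}}{E_{\mathrm{hybrid}}}
  = \frac{P_{\mathrm{FPGA}}}{P_{\mathrm{hybrid}}}
    \cdot \frac{t_{\mathrm{FPGA}}}{t_{\mathrm{hybrid}}},
\qquad
14 \approx 4 \times 3.5
```

Under that assumption the hybrid device would finish the same work roughly 3.5 times faster; the slides themselves do not state this figure.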