Closed-Loop Modeling of Power and Temperature Profiles of FPGAs. Kanupriya Gulati, Sunil P. Khatri, Peng Li. Department of ECE, Texas A&M University, College Station.

Presentation transcript:

1 Closed-Loop Modeling of Power and Temperature Profiles of FPGAs
Kanupriya Gulati, Sunil P. Khatri, Peng Li
Department of ECE, Texas A&M University, College Station

2 Introduction
Due to increasing density of FPGAs
  – Power is now a zeroth order design constraint
During operation, two components of power consumption are
  – Dynamic Power
      Temperature independent
  – Static Power
      Gate leakage
        – Largely temperature independent
      Sub-threshold leakage
        – Exponential dependence on junction temperature
This positive feedback loop could cause
  – Non-convergence (thermal runaway)
  – Convergence above a safe junction temperature (thermal breakdown)
[Figure: feedback loop with the labels "Increase in dynamic power", "Increase in temperature" and "Increase in leakage power"]

3 Our Approach
Our approach is design and FPGA device specific
Partition placed and routed FPGA design into n² grid regions
For each grid region, at the given temperature
  – Compute total power (dynamic and leakage power)
      Dynamic power computed based on logic in the region
      Leakage power computed using fast and accurate macromodels
From the power of the n² grid regions, compute new thermal profile
  – Compute increase in temperature for each grid region
  – If change in temperature in all grid regions is less than ε, stop and declare convergence
  – If no convergence and new temperature in any grid region is more than a threshold value, declare thermal breakdown
  – Else recompute leakage power of each grid region using new temperature value and iterate
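The iteration above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `closed_loop`, the callbacks `leakage_at` and `temperature_from_power`, and the `max_iter` guard are hypothetical. The default values for the initial temperature, ε and the breakdown threshold are the ones quoted on the experimental setup slide.

```python
def closed_loop(p_dyn, leakage_at, temperature_from_power,
                t_init=27.0, eps=1e-3, t_max=110.0, max_iter=1000):
    """Iterate power and temperature until convergence or thermal breakdown.

    p_dyn                  -- per-region dynamic power in watts (flattened list)
    leakage_at(r, T)       -- hypothetical callback: leakage of region r at T
    temperature_from_power -- hypothetical callback: list of total powers in,
                              list of new region temperatures out
    """
    temp = [t_init] * len(p_dyn)
    for iteration in range(1, max_iter + 1):
        # Total power = temperature-independent dynamic part + leakage at current T
        p_tot = [p_dyn[r] + leakage_at(r, temp[r]) for r in range(len(p_dyn))]
        temp_new = temperature_from_power(p_tot)
        delta = max(abs(a - b) for a, b in zip(temp_new, temp))
        temp = temp_new
        if delta < eps:
            return "converged", temp, iteration
        if max(temp) > t_max:
            return "thermal breakdown", temp, iteration
    # Safety net added for this sketch only; the slides do not mention it.
    return "no convergence", temp, max_iter
```

With a made-up single-region model such as `leakage_at = lambda r, T: 0.1 * math.exp(0.02 * (T - 27))` and `temperature_from_power = lambda p: [27 + 20 * p[0]]`, the loop settles after a handful of iterations; raising the thermal resistance far enough makes the same model exceed the threshold instead.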

4 Our Approach – Flowchart

5 Our Approach – Dynamic Power
Compute using the XPower tool from Xilinx
  – XPower reads the design data file and computes activity estimate 'α'
  – After synthesis, place and route of the design, we compute the maximum operating frequency 'f_ckt'
  – XPower has the node and wire capacitance values. So, P_dyn = C * Vdd² * f_ckt * α
  – Find the contribution of grid region (i, j) to P_dyn
      For each LUT in grid region (i, j), we compute
        – Probability of output being logic '1', P_1 = (Σ V_k)/16
            » Where V_k is the logic value stored in the k-th SRAM of the LUT
        – Probability of output switching, P_sw = 2 * P_1 * (1 - P_1)
      Average probability of switching in the grid region, P(i, j) = (Σ P_sw)/q
        – Where q is the number of LUTs per grid region
      P_dyn(i, j) = P_dyn * P(i, j) * 1/(Σ P(i, j)), with the sum in the denominator taken over all grid regions
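The per-region split of P_dyn can be illustrated with a short Python sketch. The function names and the dictionary layout are assumptions made for this example; the total P_dyn would come from XPower and is simply passed in as a number here.

```python
def lut_switch_probability(sram_bits):
    """P_sw = 2 * P_1 * (1 - P_1) for a 4-input LUT with 16 stored bits."""
    if len(sram_bits) != 16:
        raise ValueError("a 4-input LUT stores exactly 16 configuration bits")
    p_one = sum(sram_bits) / 16          # probability that the output is logic 1
    return 2 * p_one * (1 - p_one)


def region_dynamic_power(p_dyn_total, luts_by_region, q):
    """Distribute the total dynamic power over grid regions.

    luts_by_region -- dict mapping (i, j) to a list of 16-bit LUT contents
    q              -- number of LUTs per grid region (as on the slide)
    """
    # Average switching probability P(i, j) of each region
    p_avg = {
        region: sum(lut_switch_probability(bits) for bits in luts) / q
        for region, luts in luts_by_region.items()
    }
    total = sum(p_avg.values())
    if total == 0:
        return {region: 0.0 for region in p_avg}
    # P_dyn(i, j) = P_dyn * P(i, j) / sum of P over all regions
    return {region: p_dyn_total * p / total for region, p in p_avg.items()}
```

For instance, a LUT holding a 4-input AND (a single stored 1) has P_1 = 1/16, while a LUT with eight stored 1s has P_1 = 1/2 and therefore receives a larger share of the dynamic power.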

6 Our Approach – Static Power
[Figures: NMOS passgate sub-threshold leakage states (including the L2' leakage state); NMOS passgate gate leakage states; LUT implementation using a 16:1 MUX]

7 Our Approach – Static Power
Pre-compute leakage using SPICE for
  – LUT
      SRAM configuration data is known
      Each of the 31 pass gates in the LUT is in one of
        – 4 states (L1, L2, L3 or L2') contributing to subthreshold leakage
        – 4 states (K1, K2, K3 or K4) contributing to gate leakage or
        – Remaining states, which have negligible leakage contribution
      But we do not know the f1, f2, f3 and f4 inputs to the LUT
        – Take average over 16 possible input combinations
  – SRAM cell in LUT (stored 1 and 0)
  – D-flipflop (output 1 and 0)
  – MUX
[Figure: Logic block in the FPGA]

8 Our Approach – Total Power
Generate temperature dependent leakage macromodel for
  – LUT (L states), D-flipflop, SRAM and MUX
      Pre-compute the leakage values at 3 different temperatures and fit exponential curve
      Gate leakage (for K states) is largely temperature independent
  – Leakage is quickly and accurately estimated for the logic block at any temperature
      Maximum 3% error when compared to explicit SPICE runs
      4 orders of magnitude faster
Compute leakage for grid region (i, j) at any temperature, P_lkg(i, j, T)
  – Taking the sum of the leakages of all LUTs, D-flipflops, SRAMs and MUXes in region (i, j) at any temperature T = temp(i, j)
Total power P_tot(i, j, T) = P_dyn(i, j) + P_lkg(i, j, T)
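A macromodel of this kind could look like the following Python sketch. The slides only say that an exponential curve is fitted through three pre-computed points, so the specific form a * exp(b * T), the least-squares fit in log space and the function names are assumptions of this example, not the authors' stated method.

```python
import math


def fit_exponential(temps, leaks):
    """Fit leak = a * exp(b * T) by least squares on ln(leak); returns (a, b)."""
    n = len(temps)
    logs = [math.log(value) for value in leaks]
    mean_t = sum(temps) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(temps, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_t)
    return a, b


def component_leakage(a, b, gate_leak, temperature):
    """Sub-threshold part from the fitted model plus a constant gate-leakage part."""
    return a * math.exp(b * temperature) + gate_leak
```

In use, the three (temperature, leakage) pairs for each component would come from the SPICE pre-characterization; afterwards `component_leakage` can be evaluated at any region temperature, and P_lkg(i, j, T) is the sum over all components in the region.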

9 Our Approach – Temperature Computation
We use the following approach
  – "Critical path analysis considering temperature, power supply variations and temperature induced leakage", P. Li, ISQED 2006
  – Assume a 1 W power consumption in grid region (i, j)
      Table Z_ij(k, l) indicates resulting temperature at grid region (k, l)
  – We precompute n² such Z_ij tables, each with n² entries
  – We know the total power consumption of each grid region
      Thus, we find the new temperature, temp_new(i, j), at the (i, j)-th grid region, by superposition
Details of the thermal model
  – Circuit discretized into n² grid regions
  – 15 layers of metal/dielectric are modeled
      Assuming a metallization percentage for each layer, the thermal conductivity of each layer is computed
  – Model includes heat dissipation due to heat sinks
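The superposition step can be sketched as below. This example assumes that each Z_ij(k, l) entry is stored as a temperature rise above ambient per watt, so that the contributions add linearly; the slides do not spell out this convention, and the function name, the nested-list layout and the 27°C ambient default are likewise assumptions.

```python
def temperature_by_superposition(power, z, t_ambient=27.0):
    """Compute temp_new for every grid region by linear superposition.

    power -- n x n nested list, total power (W) of each grid region
    z     -- z[i][j][k][l]: assumed temperature rise at region (k, l)
             caused by 1 W dissipated in region (i, j)
    """
    n = len(power)
    temp_new = [[t_ambient] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            p = power[i][j]
            if p == 0.0:
                continue  # a region that dissipates nothing heats nothing
            for k in range(n):
                for l in range(n):
                    temp_new[k][l] += p * z[i][j][k][l]
    return temp_new
```

With n = 16 this is 256 tables of 256 entries each, so one temperature update costs about 65,536 multiply-adds.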

10 Endgame and Experimental Setup
Endgame
  – Find the absolute difference between temp(i, j) and temp_new(i, j)
  – Declare convergence when the maximum difference for all grid points is < 0.001°C
  – If temp_new(i, j) > 110°C, and no convergence, we declare thermal breakdown
Setup
  – Applied our methodology to 10 designs, implemented on a Virtex-4 XCVLX200 Xilinx FPGA device
  – Synthesized, placed and routed using Xilinx ISE 8.1i
  – Initial temperature set at 27°C
  – n = 16
  – To the best of our knowledge, no other existing work reports final converged temperature and power numbers for FPGA designs, after closing the dependence loop between leakage and temperature
  – We therefore compared our final temperatures against a full-chip 3D thermal modeling and simulation tool
      Maximum (average) error in temperature was 2.52% (1.05%) for the DMA benchmark
      Our approach is faster by ~40X per iteration

11 Results
[Figure: Temperature profile for circuit DMA]
Circuits operating at 450 MHz

12 Conclusions
Developed a technique to simultaneously model (in an FPGA)
  – Power consumption
  – Temperature
Used fast and accurate macromodels for leakage estimation
  – Over all circuit components of a logic block, at all temperatures
      Less than 3% error compared to SPICE
      Up to 4 orders of magnitude speedup
Approach
  – Partition FPGA design (placed and routed) into 16x16 grid regions
  – Compute total power consumption (dynamic and leakage) for each region
  – Find thermal profile of IC under this power consumption
      Using pre-computed power-to-temperature tables
  – New thermal information is used to update the leakage power consumption
  – Steps iterated until the temperature converges (for all grid regions), or exceeds a safe value (for any grid region)
Final temperature obtained from our method
  – Compared to full-chip 3D temperature estimation tool
  – Shows max. (avg.) error of 2.52% (1.05%) for the DMA benchmark