An evaluation of HotSpot-3.0 block-based temperature model
Damien Fetis, Pierre Michaud
June 2006

Temperature: an important constraint
- Technology scale-down
- Power must be decreased to prevent temperature from increasing

HotSpot: a thermal model for temperature-aware microarchitecture
- http://lava.cs.virginia.edu/HotSpot/
- Based on thermal resistances and capacitances
- It is becoming a standard tool in the computer architecture community
- Several dozen works based on HotSpot have been published so far

Outline
- Short tutorial on temperature modeling
- Short description of the HotSpot block model
- Some limitations of HotSpot
- Conclusion: be careful when using HotSpot

Processor temperature model
- Inputs to the temperature model:
  - Power-density map q(x,y,t)
  - Material characteristics, heat-sink thermal resistance, etc.
  - Ambient temperature
- Output: processor temperature T(x,y,t)

Qualitative accuracy
- Accurate temperature numbers? Forget it!
  - If the conclusions of your research depend on precise parameter values, what you are proposing probably has little value
- What we need for research: qualitative accuracy
  - The model can tell whether an idea is worthwhile or not
  - We would like to be consistent with physics
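
Heat conduction theory
- Fourier’s law: the heat flux (W/m²) is proportional to the temperature gradient; the proportionality factor is the thermal conductivity
- Heat equation: relates the temperature to the 3D power density, the thermal conductivity, and the heat capacity per unit volume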
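
For reference, the two relations named on this slide can be written in their standard textbook form. The notation here is an assumption, since the slide's own symbols are not recoverable: $\vec{\phi}$ is the heat flux (W/m²), $k$ the thermal conductivity (taken as uniform), $T$ the temperature, $p$ the 3D power density (W/m³), and $c_v$ the heat capacity per unit volume.

```latex
\begin{align}
  \vec{\phi} &= -k \, \nabla T
    && \text{(Fourier's law)} \\
  c_v \, \frac{\partial T}{\partial t} &= k \, \nabla^2 T + p
    && \text{(heat equation, uniform } k\text{)}
\end{align}
```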

Solving the heat equation
- Analytical method
  - Exact solution
  - Possible only for simple geometries
- Finite methods
  - Search for the (x_n) that make T′ “close” to the actual solution: solve a system of equations
  - Finite differences
  - Finite elements
  - Spectral methods
  - …

1D thermal resistance
- Right cylinder
  - Length = L
  - Cross-section area = A
  - Thermal conductivity = k
- Thermally insulated side
- Uniform power P over area A
  - Uniform power over the cross section gives uniform temperature over the cross section
- Define the thermal “resistance” between the two ends (temperatures T1 and T2, separated by L)
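
As an illustration of the 1D case, the sketch below uses the standard textbook expression R = L / (k·A), so that the temperature difference between the two ends is P·R. The function names and the numeric values (0.5 mm of material, 1 cm² cross section, k = 100 W/(m·K)) are assumptions chosen for the example, not values taken from the slides.

```python
def thermal_resistance_1d(length_m, area_m2, conductivity_w_per_mk):
    """1D thermal resistance R = L / (k * A), in K/W.

    Valid for a right cylinder with a thermally insulated side and
    uniform power over the cross section.
    """
    return length_m / (conductivity_w_per_mk * area_m2)


def temperature_drop(power_w, resistance_k_per_w):
    """Temperature difference between the two ends, in K."""
    return power_w * resistance_k_per_w


# Assumed example values (not from the slides).
r = thermal_resistance_1d(length_m=0.5e-3, area_m2=1e-4,
                          conductivity_w_per_mk=100.0)
dt = temperature_drop(power_w=10.0, resistance_k_per_w=r)
print(f"R = {r:.3f} K/W, temperature drop = {dt:.2f} K")
```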

What HotSpot models
- Ambient air
- Copper heat sink base
- Copper heat spreader
- Interface material
- Silicon die
- Power sources

How HotSpot “solves” the heat equation
- Instead of using formal methods, solve an “electrical” network
  - Thermal resistances
  - Model power generation as current sources
  - Model ambient as ground
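
The electrical analogy can be made concrete with a small steady-state nodal analysis: temperatures play the role of voltages, power the role of current, and ambient the role of ground. This is a minimal sketch of the general technique, not HotSpot's actual code; the helper names and the two-block example network are hypothetical.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def steady_state(num_nodes, resistors, to_ambient, power, t_ambient):
    """Steady-state node temperatures of a thermal resistance network.

    resistors:  list of (i, j, R) thermal resistances between nodes (K/W)
    to_ambient: list of (i, R) resistances from a node to ambient (K/W)
    power:      power injected at each node (W), like a current source
    """
    g = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, r in resistors:
        c = 1.0 / r
        g[i][i] += c
        g[j][j] += c
        g[i][j] -= c
        g[j][i] -= c
    for i, r in to_ambient:
        g[i][i] += 1.0 / r
    # Unknowns are temperature rises above ambient ("ground").
    rise = solve_linear(g, list(power))
    return [t_ambient + x for x in rise]


# Hypothetical example: two blocks joined by a 2 K/W lateral resistance,
# each tied to ambient through 1 K/W; only block 0 dissipates power.
temps = steady_state(2, [(0, 1, 2.0)], [(0, 1.0), (1, 1.0)],
                     [10.0, 0.0], t_ambient=45.0)
print(temps)
```

Adding thermal capacitances at the nodes would turn the same system into a set of ordinary differential equations for the transient case.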

HotSpot block model
- Thermal “resistances” simulate Fourier’s law
- Thermal “capacitances” simulate transients
- Network consists of a few layers
  - “Horizontal” resistances within layers
  - “Vertical” resistances between layers
- Single layer for the silicon die

Compute resistance between block center and block edge
- Z = silicon die thickness
- [Figure labels: W, H, L, R]

Each block is connected to adjacent blocks through a resistance
- Thermal conductance proportional to shared edge length
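
The shared edge length between two rectangular floorplan blocks can be computed as sketched below. The block representation `(x, y, width, height)` and the function name are assumptions made for this example; HotSpot's own data structures may differ. The lateral conductance between two blocks would then be this length multiplied by a proportionality factor that is not specified here.

```python
def shared_edge_length(a, b, eps=1e-12):
    """Length of the edge shared by two axis-aligned rectangles.

    a, b: (x, y, width, height). Returns 0.0 if the blocks are not
    adjacent or only touch at a corner.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Blocks side by side: right edge of one touches left edge of the other.
    if abs(ax + aw - bx) < eps or abs(bx + bw - ax) < eps:
        return max(0.0, min(ay + ah, by + bh) - max(ay, by))
    # Blocks stacked: top edge of one touches bottom edge of the other.
    if abs(ay + ah - by) < eps or abs(by + bh - ay) < eps:
        return max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    return 0.0


# Hypothetical blocks, dimensions in mm.
block_a = (0.0, 0.0, 2.0, 1.0)
block_b = (2.0, 0.5, 1.0, 1.0)   # touches the right edge of block_a
block_c = (5.0, 5.0, 1.0, 1.0)   # far away from both

print(shared_edge_length(block_a, block_b))
print(shared_edge_length(block_a, block_c))
```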

HotSpot is empirical
- Not based on mathematical foundations
  - Resistance formula applied without justification
    - Was derived for definite boundary conditions that do not apply here
  - Coarse “vertical” space discretization
- Problem with empirical models: more difficult to validate
  - Require extensive validation
  - Not sufficient to validate a few points in the parameter space
  - Error may vary significantly with parameter values

Evaluation
- We are not validating HotSpot
  - We are just highlighting some of its limitations
  - Deliberate focus on problematic cases
- Compare the HotSpot block model with the finite-element solver FF3D
  - Model the same physical system as HotSpot
- Two versions of HotSpot
  - The original one
  - Our modified version with the simple 1D resistance formula

Steady-state temperature
- EV6 floorplan, default HotSpot configuration

Let’s take a better interface material
- Interface material with ~6x higher thermal conductivity
  - Emphasizes “horizontal” heat conduction through copper
- Even the modified HotSpot is inaccurate

Single square source
- Model the same square source with two different floorplans, A and B (default HotSpot parameters)
- Power = 10 W

What do we learn?
- In some cases, HotSpot may be significantly inaccurate
- The usefulness of the complicated thermal resistance formula is not obvious
- The HotSpot documentation indicates that mixing small and large blocks may be a source of inaccuracy
  - We confirm this

Point source: transient temperature
- The thermal diffusivity determines when the opposite side of the die starts heating
- Example: silicon die, d = 0.5 mm
- HotSpot misses this behavior
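
A rough order-of-magnitude estimate of this delay uses the characteristic diffusion time d²/α, where the thermal diffusivity α is the thermal conductivity divided by the heat capacity per unit volume. The sketch below is a scaling estimate only, not the slide's exact expression (which is not recoverable). The silicon properties (k = 100 W/(m·K), 1.75e6 J/(m³·K)) are assumed round values.

```python
def thermal_diffusivity(conductivity, volumetric_heat_capacity):
    """alpha = k / c_v, in m^2/s."""
    return conductivity / volumetric_heat_capacity


def diffusion_time(distance_m, diffusivity):
    """Characteristic time d^2 / alpha for heat to diffuse over distance d."""
    return distance_m ** 2 / diffusivity


# Assumed silicon properties (round values, not taken from the slides).
K_SI = 100.0        # thermal conductivity, W/(m.K)
CV_SI = 1.75e6      # heat capacity per unit volume, J/(m^3.K)

alpha_si = thermal_diffusivity(K_SI, CV_SI)
t_opposite = diffusion_time(0.5e-3, alpha_si)   # d = 0.5 mm, as on the slide
print(f"alpha = {alpha_si:.2e} m^2/s, d^2/alpha = {t_opposite * 1e3:.2f} ms")
```

With these assumed values the estimate falls in the millisecond range; on much shorter time scales the opposite side of the die has not yet started to heat.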

Volume vs. surface power sources
- Sources spread in bulk silicon: HotSpot behavior
- Sources concentrated in a thin layer: close to actual behavior
- [Figures: temperature versus time t for each case]

What this implies for HotSpot
- The HotSpot block model considers a single network layer for the silicon die
  - Cannot produce correct behavior for small times
  - Underestimates the slope of the temperature transient
- E.g., how long does it take to get a 1°C increase?
  - HotSpot may be wrong by orders of magnitude
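
The gap can be illustrated by comparing two idealized limits. Neither is HotSpot's actual model. In the first, the power is spread uniformly through the die thickness and the layer heats as a single lumped capacitance with no heat loss. In the second, the power is applied at the surface of a semi-infinite solid, for which the textbook surface temperature rise is (2·q″/k)·√(α·t/π). The silicon properties are assumed round values. The 10 W, 1 mm square source and the 0.5 mm die thickness are the ones quoted on the slides.

```python
import math


def time_to_rise_lumped(delta_t, flux, thickness, c_v):
    """Time for a uniformly heated layer (no losses) to rise by delta_t.

    delta_t = flux * t / (c_v * thickness)
    """
    return delta_t * c_v * thickness / flux


def time_to_rise_surface(delta_t, flux, k, c_v):
    """Time for the surface of a semi-infinite solid to rise by delta_t.

    delta_t = (2 * flux / k) * sqrt(alpha * t / pi), with alpha = k / c_v
    """
    alpha = k / c_v
    return (math.pi / alpha) * (delta_t * k / (2.0 * flux)) ** 2


K_SI = 100.0               # assumed thermal conductivity, W/(m.K)
CV_SI = 1.75e6             # assumed heat capacity per unit volume, J/(m^3.K)
FLUX = 10.0 / (1e-3) ** 2  # 10 W over a 1 mm square source, W/m^2
THICKNESS = 0.5e-3         # die thickness, m

t_lumped = time_to_rise_lumped(1.0, FLUX, THICKNESS, CV_SI)
t_surface = time_to_rise_surface(1.0, FLUX, K_SI, CV_SI)
print(f"lumped layer:   {t_lumped:.2e} s")
print(f"surface source: {t_surface:.2e} s")
print(f"ratio:          {t_lumped / t_surface:.0f}x")
```

Under these assumptions the lumped layer takes well over ten times longer to show a 1°C rise than the surface-source limit. A floorplan block larger than the source would have more capacitance and widen the gap further.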

Problem: insufficient “vertical” discretization in silicon
- 1 mm square source dissipating 10 W

Conclusion
- Be careful when using HotSpot
  - Good to read a little heat conduction theory before …
  - Heat conduction ≠ electric conduction
- OK to use HotSpot for confirming a priori intuitions
  - Draw qualitative conclusions, not quantitative ones
- In case of doubt, check with formal methods that HotSpot is correctly calibrated for a particular use

HotSpot still evolving
- This study was only for the HotSpot block model
- Version 3.0 features a new grid mode
  - Discretization is automatic (except “vertically”)
  - Permits defining multiple silicon layers
  - Must be validated
- HotSpot will probably continue to evolve
  - Will it end up resembling finite differences?