Thermal-ADI: A Linear-Time Chip-Level Dynamic Thermal Simulation Algorithm Based on the Alternating-Direction-Implicit (ADI) Method
Ting-Yuan Wang, Charlie Chung-Ping Chen
Electrical and Computer Engineering, University of Wisconsin-Madison
April 4 2001

Good afternoon! The topic I am going to talk about today is "Thermal-ADI: A Linear-Time Chip-Level Dynamic Thermal Simulation Algorithm Based on the Alternating-Direction-Implicit (ADI) Method." My name is Ting-Yuan Wang. This work was supervised by my advisor, Professor Charlie Chen. We are from Electrical and Computer Engineering at the University of Wisconsin-Madison.
Motivation
1999 International Technology Roadmap for Semiconductors (ITRS)
Maximum power
Number of metal layers
Wire current density

Due to the relentless push for high speed, high performance, and high component density, the power density and the on-chip temperature of high-end VLSI circuits are rising significantly. The 1999 International Technology Roadmap for Semiconductors (ITRS) shows that the maximum power, the number of metal layers, and the wire current density will increase significantly in future high-performance microprocessor units (MPUs). This trend shows the importance of thermal issues in VLSI design.
1999 ITRS
Year: 1999 2000 2001 2002 2003 2004 2005 2008 2011 2014
Technology node (nm): 180 - 130 100 70 50 35
Maximum power (W): 90 115 140 150 160 170 174 183
On-chip, across-chip clock (MHz): 1200 1321 1454 1600 1724 1857 2500 3004 3600
Maximum number of metal levels: 7 8 9 10
Wire current density: 5.8e5 7.1e5 8.0e5 9.5e5 1.1e6 1.3e6 1.4e6 2.1e6 3.7e6 4.6e6

This table shows the trend. Why is high temperature a problem? High temperature not only causes timing failures in both transistors and interconnects but also degrades chip reliability. For example, the electromigration (EM) effect in interconnects depends exponentially on temperature. So effectively analyzing the thermal distribution and locating the hot spots is very important.
Existing Thermal Simulation Methods
Finite Difference Method: easy, good for regular geometry, fast
Finite Element Method: more complicated, good for irregular geometry
Equivalent RC Model (S.M. Kang): compatible with the SPICE model, needs to solve a large-scale matrix
Finite-Difference Formulation of the Heat Conduction on a Chip
Space Domain
Time Domain

For a given chip, there are two steps in setting up the finite-difference method. The first step is to discretize the continuous space domain. The second step is to discretize the time domain. In the following discussion of discretization, we will be concerned with accuracy and stability.
Heat Conduction Equation
Quantities in the equation: temperature, material density, specific heat, heat generation rate, time, and thermal conductivity.

This is the heat conduction equation. It is a second-order parabolic partial differential equation.
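For reference, the heat conduction equation is commonly written in the form below. The symbols are assumptions chosen to match the quantities listed on the slide, not necessarily the notation used in the talk: T is temperature, \rho is material density, c_p is specific heat, g is the heat generation rate, t is time, and k is thermal conductivity.

```latex
\rho \, c_p \, \frac{\partial T}{\partial t}
  = \nabla \cdot \left( k \, \nabla T \right) + g
```

The left-hand side is the rate at which stored energy increases, the first term on the right is the net conduction into a point, and g is the local heat generation.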
Energy Conservation
Increasing rate of stored energy, which causes the temperature to increase
Net rate of energy transferred into the volume
Heat generation rate in the volume

Before discussing the time-domain discretization, let's look at the energy conservation expressed by the heat conduction equation. Physically, the rate of increase of the energy stored in a control volume equals the net rate of energy transferred into the volume plus the heat generation rate within the volume.
Space Domain Discretization
Heat Conduction Equation
Central-Finite-Difference Approximation

Let's look at the heat conduction equation again. It is a second-order parabolic partial differential equation. For the space-domain discretization, we use the central-finite-difference approximation for the second-order partial derivative terms in order to obtain second-order accuracy. Next, we replace the partial derivative terms with the difference terms.
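The second-order accuracy of the central-difference approximation can be checked numerically. The sketch below is only an illustration: it assumes a test function (sin) whose second derivative is known exactly, and the names second_derivative and error are hypothetical helpers, not part of the work presented.

```python
import math


def second_derivative(f, x, h):
    """Central-difference approximation of f''(x) with grid spacing h."""
    return (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h)


def error(h, x=1.0):
    """Absolute error of the approximation for f = sin, where f'' = -sin."""
    return abs(second_derivative(math.sin, x, h) - (-math.sin(x)))


# For a second-order scheme, halving h should reduce the error
# by roughly a factor of 4.
ratio = error(0.1) / error(0.05)
print(f"error ratio when h is halved: {ratio:.3f}")
```

A ratio close to 4 is what "second-order accuracy in space" means in practice: the truncation error scales with the square of the grid spacing.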
Time Domain Discretization
Heat Conduction Equation
Simple Explicit Method
Simple Implicit Method
Crank-Nicolson Method

The question is which time step should be used to evaluate the spatial difference term: n or n+1? There are three choices for the time update: the simple explicit method, the simple implicit method, and the Crank-Nicolson method.
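The three choices can be written side by side. This is a sketch under simplifying assumptions that are not stated on the slide: constant material properties, no heat source, and the shorthand \alpha = k / (\rho c_p), \delta_x^2 T_{i,j} = T_{i-1,j} - 2 T_{i,j} + T_{i+1,j} (and similarly for \delta_y^2).

```latex
% Spatial operator evaluated at time level s
L\,T^{s}_{i,j} = \alpha \left(
    \frac{\delta_x^2 T^{s}_{i,j}}{\Delta x^2}
  + \frac{\delta_y^2 T^{s}_{i,j}}{\Delta y^2} \right)

% Simple explicit: spatial term at level n
\frac{T^{n+1}_{i,j} - T^{n}_{i,j}}{\Delta t} = L\,T^{n}_{i,j}

% Simple implicit: spatial term at level n+1
\frac{T^{n+1}_{i,j} - T^{n}_{i,j}}{\Delta t} = L\,T^{n+1}_{i,j}

% Crank-Nicolson: average of the two
\frac{T^{n+1}_{i,j} - T^{n}_{i,j}}{\Delta t}
  = \tfrac{1}{2} \left( L\,T^{n}_{i,j} + L\,T^{n+1}_{i,j} \right)
```

Only the explicit form gives T^{n+1} directly; the other two couple all unknowns at level n+1 and therefore require a linear solve.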
Simple Explicit Method
Accuracy: second order in space, first order in time
Stability constraint: no matrix inversion, but the time step is limited by the space discretization

For the simple explicit method, we apply the explicit update on the right-hand side of the equation. This method has second-order accuracy in space and first-order accuracy in time. However, it must satisfy the stability constraint in order to avoid spurious oscillations.
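The effect of the stability constraint can be seen in a small experiment. The sketch below is a one-dimensional illustration, not the chip-level simulator from the talk: it assumes constant properties, both ends held at zero temperature, and a dimensionless step r = alpha * dt / dx**2, for which the classic explicit limit in one dimension is r <= 1/2. The names explicit_step and peak_after are hypothetical.

```python
def explicit_step(T, r):
    """One simple-explicit (forward-time, central-space) update in 1-D.

    T holds the interior node temperatures; both ends are held at 0.
    r = alpha * dt / dx**2 is the assumed dimensionless time step.
    """
    m = len(T)
    new = [0.0] * m
    for i in range(m):
        left = T[i - 1] if i > 0 else 0.0
        right = T[i + 1] if i < m - 1 else 0.0
        new[i] = T[i] + r * (left - 2.0 * T[i] + right)
    return new


def peak_after(r, steps=100, m=21):
    """Largest |T| after `steps` updates, starting from a unit hot spot."""
    T = [0.0] * m
    T[m // 2] = 1.0
    for _ in range(steps):
        T = explicit_step(T, r)
    return max(abs(v) for v in T)


stable_peak = peak_after(0.4)    # satisfies r <= 1/2
unstable_peak = peak_after(0.6)  # violates r <= 1/2
print(stable_peak, unstable_peak)
```

With r = 0.4 the hot spot simply diffuses away, while with r = 0.6 the solution oscillates with growing amplitude, which is why a finer mesh forces a much smaller time step in the explicit method.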
Simple Implicit Method
Accuracy: second order in space, first order in time
Unconditionally stable
No limit on the time step, but involves a large-scale matrix inversion

For the simple implicit method, we apply the implicit update on the right-hand side of the equation. This method has second-order accuracy in space and first-order accuracy in time, but it is unconditionally stable.
Crank-Nicolson Method
Accuracy: second order in space and in time
Unconditionally stable
No limit on the time step, but involves a large-scale matrix inversion

For the Crank-Nicolson method, we take the average of the simple explicit and simple implicit updates on the right-hand side of the equation. This method has second-order accuracy not only in space but also in time. Fortunately, it is also unconditionally stable.
Analysis of Crank-Nicolson Method
Example: m = 4, n = 4
Total number of nodes N = mn
Matrix size = N x N

For a two-dimensional mesh with a total of N = mn nodes, the method requires a matrix of size N x N. When the equations Ax = b are solved by LU or Cholesky decomposition, the runtime and memory requirements are superlinear in N even with sparse matrix techniques. The computation therefore becomes very expensive for large meshes.
Alternating Direction Implicit Method
Solves a higher-dimensional problem by successive lower-dimensional methods
Accuracy: second order in space and in time
Unconditionally stable
No limit on the time step and no large-scale matrix inversion

The ADI (Alternating Direction Implicit) method reduces a two-dimensional or three-dimensional problem to a succession of two or three one-dimensional problems. The algorithm splits the time step from n to n+1 into two sub-steps: from n to n+1/2 and from n+1/2 to n+1. During Step I, it applies the implicit update in the x-direction and the explicit update in the y-direction. During Step II, it applies the explicit update in the x-direction and the implicit update in the y-direction. There are two different approaches to the ADI method: the Peaceman-Rachford algorithm and the Douglas-Gunn algorithm.
Alternating Direction Implicit Method
Step I: x-direction implicit, y-direction explicit
Step II: x-direction explicit, y-direction implicit
Peaceman-Rachford Algorithm
Douglas-Gunn Algorithm
Peaceman-Rachford Algorithm
Step I
Step II

This is the Peaceman-Rachford algorithm. After rearranging the heat conduction equation, we obtain the form shown here. For Step I, we take the implicit term in x and the explicit term in y, shown in green in the figure. For Step II, we take the implicit term in y and the explicit term in x, shown in yellow in the figure. This algorithm has second-order accuracy in both the space and time domains.
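In its textbook form, the Peaceman-Rachford splitting can be written as below. This is a sketch under assumptions that are not taken from the slide: constant material properties, no heat source, and the shorthand r_x = \alpha \Delta t / \Delta x^2, r_y = \alpha \Delta t / \Delta y^2, \delta_x^2 T_{i,j} = T_{i-1,j} - 2 T_{i,j} + T_{i+1,j} (and similarly for \delta_y^2).

```latex
% Step I: implicit in x, explicit in y (n -> n+1/2)
\left( 1 - \tfrac{r_x}{2} \delta_x^2 \right) T^{\,n+1/2}_{i,j}
  = \left( 1 + \tfrac{r_y}{2} \delta_y^2 \right) T^{\,n}_{i,j}

% Step II: implicit in y, explicit in x (n+1/2 -> n+1)
\left( 1 - \tfrac{r_y}{2} \delta_y^2 \right) T^{\,n+1}_{i,j}
  = \left( 1 + \tfrac{r_x}{2} \delta_x^2 \right) T^{\,n+1/2}_{i,j}
```

Each step has unknowns coupled along one direction only, so the left-hand side is a set of independent one-dimensional systems.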
Douglas-Gunn Algorithm
Step I
Step II

The Douglas-Gunn algorithm uses a different scheme. The heat conduction equation can be rearranged as shown here. For Step I, we apply the implicit update in the x-direction and keep the explicit update in the y-direction. During Step II, we apply the explicit update in the x-direction, but this term is taken from Step I, and we apply the implicit update in the y-direction. This algorithm has second-order accuracy in both the space and time domains.
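A common textbook statement of the two-dimensional Douglas-Gunn scheme is sketched below. The assumptions are not taken from the slide: constant material properties, no heat source, an intermediate value written as T^{*}, and the shorthand r_x = \alpha \Delta t / \Delta x^2, r_y = \alpha \Delta t / \Delta y^2, \delta_x^2 T_{i,j} = T_{i-1,j} - 2 T_{i,j} + T_{i+1,j} (and similarly for \delta_y^2).

```latex
% Step I: Crank-Nicolson-like in x, fully explicit in y
T^{*}_{i,j} - T^{n}_{i,j}
  = \tfrac{r_x}{2} \, \delta_x^2 \left( T^{*}_{i,j} + T^{n}_{i,j} \right)
  + r_y \, \delta_y^2 \, T^{n}_{i,j}

% Step II: x term reused from Step I, y term now implicit
T^{n+1}_{i,j} - T^{n}_{i,j}
  = \tfrac{r_x}{2} \, \delta_x^2 \left( T^{*}_{i,j} + T^{n}_{i,j} \right)
  + \tfrac{r_y}{2} \, \delta_y^2 \left( T^{n+1}_{i,j} + T^{n}_{i,j} \right)
```

Subtracting Step I from Step II leaves (1 - \tfrac{r_y}{2} \delta_y^2)\, T^{n+1}_{i,j} = T^{*}_{i,j} - \tfrac{r_y}{2} \delta_y^2 T^{n}_{i,j}, so Step II only needs tridiagonal solves along y.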
Illustration for ADI
Step I: x-direction implicit
Step II: y-direction implicit
Figure: mesh with columns i = 1, 2, …, m and rows j = 1, 2, …, n, shown once for each step.

This is an illustration of the ADI method. In Step I, for every j, there are m equations for the corresponding (i,j) points. Since each point (i,j) is coupled only to the two points (i-1,j) and (i+1,j), the coefficient matrix for each row is tridiagonal and can be solved with time complexity O(m). Step II follows a similar procedure.
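For one row j in Step I, the m equations can be collected into a single tridiagonal system. The form below is a sketch assuming constant properties, zero temperature outside the mesh, and the shorthand r_x = \alpha \Delta t / \Delta x^2; the right-hand side d_{i,j} stands for the known explicit terms and is a hypothetical name.

```latex
\begin{bmatrix}
 1 + r_x          & -\tfrac{r_x}{2} &                 &                 \\
 -\tfrac{r_x}{2}  & 1 + r_x         & -\tfrac{r_x}{2} &                 \\
                  & \ddots          & \ddots          & \ddots          \\
                  &                 & -\tfrac{r_x}{2} & 1 + r_x
\end{bmatrix}
\begin{bmatrix}
 T^{\,n+1/2}_{1,j} \\ T^{\,n+1/2}_{2,j} \\ \vdots \\ T^{\,n+1/2}_{m,j}
\end{bmatrix}
=
\begin{bmatrix}
 d_{1,j} \\ d_{2,j} \\ \vdots \\ d_{m,j}
\end{bmatrix}
```

Because the matrix has only three nonzero diagonals and is diagonally dominant, forward elimination followed by back substitution solves it in O(m) operations without pivoting.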
Analysis of ADI Method
X-direction implicit: tridiagonal matrix for each row j = 1, 2, …, n, with unknowns i = 1, 2, …, m
2 steps x n matrices x m per tridiagonal matrix = 2nm = 2N
Time complexity: O(N)

This is the analysis of the ADI method. For every j, we have a tridiagonal matrix like this. In total we need two steps, each step needs to solve n matrices, and each matrix takes time proportional to m. So the total time complexity is O(N).
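The operation count above can be made concrete with a small implementation. The sketch below is a simplified illustration, not the authors' code: it assumes constant properties, no heat source, zero temperature outside the mesh, and the Peaceman-Rachford form of the two steps. The names thomas_solve and adi_step, and the parameters rx = alpha*dt/dx**2 and ry = alpha*dt/dy**2, are hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(m) time (Thomas algorithm).

    lower[i] multiplies x[i-1] (lower[0] is unused) and
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    m = len(diag)
    c = [0.0] * m
    d = [0.0] * m
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, m):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def adi_step(T, rx, ry):
    """One full time step on grid T[j][i] with n rows and m columns."""
    n, m = len(T), len(T[0])

    def at(G, j, i):
        # Nodes outside the mesh are held at zero temperature.
        return G[j][i] if 0 <= j < n and 0 <= i < m else 0.0

    # Step I: implicit in x, explicit in y -> n tridiagonal solves of size m
    half = []
    for j in range(n):
        rhs = [T[j][i] + 0.5 * ry * (at(T, j - 1, i) - 2.0 * T[j][i]
                                     + at(T, j + 1, i)) for i in range(m)]
        half.append(thomas_solve([-0.5 * rx] * m, [1.0 + rx] * m,
                                 [-0.5 * rx] * m, rhs))

    # Step II: implicit in y, explicit in x -> m tridiagonal solves of size n
    new = [[0.0] * m for _ in range(n)]
    for i in range(m):
        rhs = [half[j][i] + 0.5 * rx * (at(half, j, i - 1) - 2.0 * half[j][i]
                                        + at(half, j, i + 1)) for j in range(n)]
        col = thomas_solve([-0.5 * ry] * n, [1.0 + ry] * n,
                           [-0.5 * ry] * n, rhs)
        for j in range(n):
            new[j][i] = col[j]
    return new


# Usage: a unit hot spot, stepped with rx = ry = 10, far above the
# explicit-method limit.
n_rows, m_cols = 8, 10
grid = [[0.0] * m_cols for _ in range(n_rows)]
grid[n_rows // 2][m_cols // 2] = 1.0
for _ in range(50):
    grid = adi_step(grid, rx=10.0, ry=10.0)
peak = max(abs(v) for row in grid for v in row)
print(peak)
```

Each call performs n solves of length m and m solves of length n, so the work per time step is proportional to 2nm = 2N, and the peak temperature stays bounded even though the time step is far larger than the explicit method would allow.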
Three Different Locations of Node
Figure: cases I, II, and III, each showing node (i,j) with its neighbors (i-1,j), (i+1,j), (i,j+1), and (i,j-1), placed relative to the Si and the heat source.

For the implementation, we need to consider three different node locations that arise from layout extraction. In case I, the node is on a corner. In case II, the node is on the boundary between two materials. In case III, the node is inside the heat source. For cases I and II, we need to modify the difference equations discussed earlier, but we will not go into the details here.
Results Comparison

The temperature from the transient thermal simulation at a randomly chosen point is shown here. The error of the Douglas-Gunn algorithm compared to the Crank-Nicolson method is less than 0.1. However, the error of the Peaceman-Rachford algorithm compared to the Crank-Nicolson method is 2.25 %. Of course, the error also depends on the time step delta t that we choose.
Result – Run Time Comparison I
Plot annotation: 5000X

The runtime comparison of the Crank-Nicolson method, the Peaceman-Rachford algorithm, and the Douglas-Gunn algorithm is shown here. The runtime of the Douglas-Gunn and Peaceman-Rachford algorithms is linearly proportional to the number of nodes, as shown for sizes up to $10^{8}$. However, the runtime of the Crank-Nicolson method increases dramatically.
Result – Run Time Comparison II
Results – Memory Usages I

The memory usage of the Douglas-Gunn and Peaceman-Rachford algorithms is linearly proportional to the number of nodes up to $10^{8}$. However, the memory usage of the Crank-Nicolson method increases dramatically.
Results – Memory Usages II
Results – Stability Constraint
Gamma is the stability parameter of the simple explicit method, whose limit is 1/2.

The stability parameter gamma was varied from 0.4 to 20 as shown here, where 1/2 is the stability limit of the simple explicit method. We can see that the simulation remains stable across this range.
Thank you for your attention!