Chapter 5a On-Chip Power Integrity


Prof. Lei He, Electrical Engineering Department, University of California, Los Angeles. URL: eda.ee.ucla.edu. Email: lhe@ee.ucla.edu

Outline: introduction to power integrity; on-chip current modeling; on-chip de-cap insertion.

Introduction
Power noise takes the form of ground bounce, Vdd droop, and Vdd drop. What matters is the integral of the noise.

Introduction
IR drop: the voltage difference between the power supply pads and individual cell instances. Electromigration: metal ion mass transport along the grain boundaries when a metallic interconnect is stressed at high current density. The resulting Mean Time to Failure (MTF) is described by Black's equation.
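The equation on the slide did not survive in this transcript. Black's equation is commonly written as MTF = A * J^(-n) * exp(Ea / (k*T)), where J is the current density, T the absolute temperature, k the Boltzmann constant, Ea an activation energy, and A and n empirical constants. The Python sketch below evaluates that standard form; the default values n = 2 and Ea = 0.7 eV are illustrative assumptions, not values from the slides.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mtf(current_density, temperature_k, a=1.0, n=2.0, ea_ev=0.7):
    """Black's equation: MTF = A * J**(-n) * exp(Ea / (k * T)).

    a, n and ea_ev are assumed, technology-dependent fitting constants.
    """
    thermal = math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temperature_k))
    return a * current_density ** (-n) * thermal


def mtf_ratio(j_new, j_old, temperature_k=378.0, n=2.0):
    """Relative lifetime change when only the current density changes."""
    return black_mtf(j_new, temperature_k, n=n) / black_mtf(j_old, temperature_k, n=n)
```

With n = 2, doubling the current density in a wire cuts the predicted lifetime to a quarter, which is why current density is treated as a constraint alongside IR drop.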

Power Noise
Power noise (ground bounce, Vdd droop) = IR + L*di/dt. IR drop is the primary on-chip power noise, especially when flip chip is used. L*di/dt noise is usually not an on-chip concern, except in high-performance designs. SSN (simultaneous switching noise) is primarily L*di/dt noise for IO cells, package, and PCB. SSO stands for simultaneous switching output noise.

Simultaneous Switching Noise
SSN is most significantly observed around the output pads of the chip. To drive large off-chip loads, the I/O buffers are usually very large, so they draw a significant amount of instantaneous current when they switch. In clock-synchronized chips, multiple I/O buffers tend to switch simultaneously, creating a large surge current with a sharp slope. The parasitic inductance of the package power distribution network, including the interconnections to both the chip and the board, is usually in the range of a few hundred picohenries.
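A rough feel for the magnitude comes from V = L * di/dt. The sketch below assumes every buffer ramps its current linearly from zero to a peak over the same rise time; the function name and the example numbers used with it (16 buffers, 10 mA each, 0.5 ns, 200 pH) are hypothetical, chosen only to match the "few hundred picohenries" scale mentioned above.

```python
def ssn_voltage(num_buffers, peak_current_a, rise_time_s, inductance_h):
    """Estimate L*di/dt noise for buffers that switch at the same time.

    Assumes a linear current ramp, so di/dt = total peak current / rise time.
    """
    di_dt = num_buffers * peak_current_a / rise_time_s  # A/s
    return inductance_h * di_dt  # volts
```

For example, `ssn_voltage(16, 0.01, 0.5e-9, 200e-12)` gives about 64 mV, and the estimate scales linearly with the number of buffers switching together.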

Circuit to Generate SSN (figure)

Outline: introduction to power integrity; on-chip current modeling; on-chip de-cap insertion.

On-Chip Current Models
Static current model for IR drop: the worst case based on test vectors; the average case (either based on test vectors, or assuming that a certain percentage of gates are switching); or the so-called vector-less model, which partitions the circuit into blocks, finds the worst-case current for each block via exhaustive testing, and sums up the worst-case currents.
Transient current model for L di/dt noise: find the peak current and assume a switching time, or use a stochastic model that considers logic-induced and temporal correlation.
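The difference between the static models can be seen on a toy data set. The sketch below assumes per-block current samples taken over the same clock cycles (the block names and values in the usage note are made up); it compares the vector-less bound, which sums each block's own worst case, with the peak and average of the total current.

```python
def total_per_cycle(block_currents):
    """Total current in each cycle, summing across blocks."""
    return [sum(cycle) for cycle in zip(*block_currents.values())]


def blockwise_worst_case(block_currents):
    """Vector-less style bound: sum of each block's own worst-case current."""
    return sum(max(samples) for samples in block_currents.values())


def simultaneous_peak(block_currents):
    """Worst case actually observed when blocks are considered together."""
    return max(total_per_cycle(block_currents))


def average_current(block_currents):
    """Average-case total current over the sampled cycles."""
    totals = total_per_cycle(block_currents)
    return sum(totals) / len(totals)
```

With `{"alu": [0.2, 0.8, 0.3, 0.1], "cache": [0.7, 0.1, 0.2, 0.6]}` the block-wise bound is 1.5 while the observed total never exceeds about 0.9, because the two blocks do not peak in the same cycle. That gap is the pessimism the stochastic model tries to remove.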

Design for On-Chip Power Integrity
The static current model for IR drop drives power/ground sizing. The transient current model for L di/dt noise drives on-chip de-cap allocation. De-cap can be implemented with CMOS FETs or trench capacitors, and the de-cap value is continuous.

Power Noise of a Chip with Peripheral IO
A real industrial chip with 0.5M cell instances and 0.6M P/G resistors.

P/G Sizing
Unrestricted IR drops and current densities in the power/ground (P/G) network cause malfunction and reliability problems in deep sub-micron IC chips: increased cell delays (a timing problem), and increased resistance or even opens in P/G wires.
P/G sizing is done in two steps. First, P/G network construction (P/G routing) decides the P/G pitches and the tapping points that connect to the package power supply. Second, the wire segment widths are determined.
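For a single wire segment, the width-determination step reduces to two inequalities. The sketch below assumes a sheet-resistance model (R = Rs * length / width) and a per-unit-width current density limit; the function and parameter names are hypothetical, and a real P/G sizer solves this jointly over the whole network rather than one segment at a time.

```python
def min_segment_width(current_a, length_um, sheet_res_ohm_sq,
                      max_drop_v, max_density_a_per_um):
    """Smallest width (um) meeting both the IR-drop and current-density limits."""
    # IR drop: I * Rs * L / W <= max_drop  =>  W >= I * Rs * L / max_drop
    width_for_drop = current_a * sheet_res_ohm_sq * length_um / max_drop_v
    # Current density: I / W <= Jmax  =>  W >= I / Jmax
    width_for_density = current_a / max_density_a_per_um
    return max(width_for_drop, width_for_density)
```

With 10 mA, 0.05 ohm/sq, a 10 mV drop budget, and a 1 mA/um density limit, a 100 um segment is limited by current density (10 um), while a 400 um segment is limited by IR drop (20 um).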

Effect of De-cap
Adding decaps is the most effective way to reduce L di/dt noise in P/G grids.

Costs of Decap
Decaps are mainly made of MOS gate capacitors, which consume premium white space. That white space could otherwise be used for buffers and other logic gates added during physical optimization. MOS gates are leaky and become leakier with scaling, so decaps add leakage power. Excessive decap may lead to low yield and a low circuit resonant frequency. Economical use of decap is important.

Decap Budgeting Overview
Nodes far from the Vdd pin may suffer from supply noise due to sudden bursts of activity. Decap supplies the surplus current demand from locally stored charge. Location matters: the closer the decap is to the turbulent point, the more noise reduction can be achieved. Given the amount of decap to be inserted, find the optimal locations so that the noise is suppressed to the maximum extent. (Figure: load current, power supply, intrinsic cap, and decap.) We define the noise as the integral over time, from t0 to t1, of the area below Vn.
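One way to read the noise definition is as the integral of max(Vn - v(t), 0) from t0 to t1, where v(t) is the node voltage and Vn a threshold. That reading is an assumption, since the slide's figure is not available. The sketch below evaluates it with the trapezoidal rule on sampled data, which is only approximate near the points where the waveform crosses Vn.

```python
def noise_integral(times, voltages, threshold):
    """Approximate the area (V*s) between the threshold and the waveform
    wherever the waveform dips below the threshold."""
    violation = [max(threshold - v, 0.0) for v in voltages]
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (violation[i] + violation[i - 1]) * dt
    return total
```

The result has units of volt-seconds, the same units used for the noise columns in the result tables later in the chapter.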

Decap Budgeting Problem Formulation
Objective: find the distribution and location of the white space used for decap so that the noise on the power network is minimized.
Constraints: local decap constraints (the amount of decap allowed at each location is limited by placement) and global decap constraints (the total amount of decap allowed is limited by leakage).
Limitations of most existing work: it in essence uses the worst-case load current to guarantee that there is no noise violation, which is too pessimistic. It is also unclear how to provide a decap budgeting solution that is robust to the load currents under all kinds of operations of a circuit.

Correlated Load Currents
There is strong correlation between load currents, from two sources.
Operation variation: currents at different ports have logic-induced correlation. A design has a large number of ports but limited control bits, so currents at certain ports cannot reach their maximum at the same time because of the inherent logic dependency of the design. Currents at the same port have temporal correlation. The system takes several clock cycles to execute one instruction, so the currents cannot reach their maximum in every clock cycle.
Process variation: the P/G network itself is robust to process variation, but the load currents have intra-die variation because the circuit suffers from process variation. Leff variation is one of the primary variation sources, and it is spatially correlated.

Current Sampling
Model the current in each clock cycle as a triangular waveform and assume constant rise and fall times. Other current waveforms can be used without affecting the algorithm. In our verification, we use the detailed, non-simplified current waveform. Partition the circuit into blocks and assume no correlation between different blocks. Run extensive simulation for each block to get the peak current value in each clock cycle and at each port. Assume that temporal correlation exists only within a certain number of clock cycles, L. L can be the number of clock cycles needed to execute a certain function.
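The triangular model can be written down directly. The sketch below assumes the triangle starts at the beginning of each clock cycle and that times are given in integer picoseconds; both choices, and the function name, are assumptions made for illustration.

```python
def triangle_current(t_ps, peaks, period_ps, rise_ps, fall_ps):
    """Current at time t_ps for one triangular pulse per clock cycle.

    peaks[c] is the peak current of cycle c. The pulse rises linearly for
    rise_ps, falls linearly for fall_ps, and is zero for the rest of the cycle.
    """
    if rise_ps + fall_ps > period_ps:
        raise ValueError("pulse does not fit inside one clock period")
    cycle, local = divmod(t_ps, period_ps)
    if cycle < 0 or cycle >= len(peaks):
        return 0.0
    peak = peaks[cycle]
    if local <= rise_ps:
        return peak * local / rise_ps
    if local <= rise_ps + fall_ps:
        return peak * (rise_ps + fall_ps - local) / fall_ps
    return 0.0
```

Because the rise and fall times are fixed, the whole waveform is determined by the list of per-cycle peak values, which is what the sampling step collects.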

Stochastic Current Modeling
Divide the peak current values into sets according to clock cycle and port number: the set for (j, k) contains the peak current values at port k in clock cycles j, j+L, j+2L, and so on. Example: take L = 2 and consider two ports over 8 consecutive clock cycles. For each set, define a stochastic variable that has the set as its samples. For example, a variable with the samples 0.1, 0.3, 0.5, 0.7 has mean value 0.4. The correlation between the variables for clock cycles j1 and j2 at the same port reflects the temporal correlation between those clock cycles. The correlation between the variables for ports k1 and k2 in the same clock cycle reflects the logic-induced correlation between those ports.
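The grouping step can be sketched in a few lines of Python. Clock cycles are indexed from 0 here, and the function names are hypothetical.

```python
from collections import defaultdict


def group_peak_currents(peaks_by_port, cycle_span):
    """Group peak currents into sample sets keyed by (j, k).

    peaks_by_port[k][c] is the peak current at port k in clock cycle c.
    The set for (j, k) collects cycles j, j + L, j + 2L, ... with L = cycle_span.
    """
    sets = defaultdict(list)
    for port, peaks in peaks_by_port.items():
        for cycle, value in enumerate(peaks):
            sets[(cycle % cycle_span, port)].append(value)
    return dict(sets)


def sample_mean(samples):
    """Mean of one sample set."""
    return sum(samples) / len(samples)
```

If port 1 sees peaks 0.1, 0.2, ..., 0.8 over 8 cycles (made-up values) and L = 2, the set for (0, 1) holds 0.1, 0.3, 0.5, 0.7, and its mean is 0.4, as in the example above.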

Extraction of Correlations
The logic-induced correlation coefficient between ports k1 and k2 at clock cycle j, and the temporal correlation coefficient between clock cycles j1 and j2 at port k, are both computed with the general definition of the correlation coefficient (covariance divided by the product of the standard deviations), applied to the corresponding pair of stochastic variables. In our case there are two kinds of coefficient, logic-induced and temporal. To take process variation into consideration, sample each variable multiple times over different regions; the same two formulas still apply.
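The general definition of the correlation coefficient, applied to two equally long sample sets, looks like this in Python. Which two sets are paired (two ports in the same cycle, or two cycles at the same port) decides whether the result is a logic-induced or a temporal coefficient.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sample lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample set")
    return cov / math.sqrt(var_x * var_y)
```

A value near +1 means the two currents tend to peak together, and a value near -1 means one is high when the other is low, which is exactly the case where summing worst-case currents is most pessimistic.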

Extraction of Correlations (continued)
Because the stochastic current variables are not Gaussian, apply Independent Component Analysis (ICA) to remove the correlation between them and obtain a new set of independent variables r1, r2, ... Each current variable can then be represented as a linear combination of r1, r2, ..., and accordingly the waveform in each clock cycle can be reconstructed from r1, r2, ... The new variables ri capture both the operation and the process variations.

Example of Extracted Temporal Correlation
The figure shows the correlation map for peak currents between different clock cycles of one port, from an industry application. The P/G network is modeled as an RC mesh, and the load currents are obtained by detailed simulation of the circuit. The correlation matrix can be clearly divided into four blocks, and L can be set to 10.

Parameterized MNA Formulation
Start from the original MNA formulation. With the decap areas wi as design variables, the G and C matrices can be expressed as functions of wi. Together with the stochastic current model, the MNA formulation becomes one with parameters wi and ri. The objective now is to find the optimal solution for those parameters. More specifically, find the wi values that minimize the noise when the ri take the values corresponding to the load currents that introduce the maximum noise.
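The slide's equations did not survive in this transcript. As a hedged reconstruction, one common way to write such a parameterized MNA system is shown below. The symbols x (node voltages), B (input incidence matrix), u (load currents), and the nominal and per-decap matrices G_0, C_0, G_i, C_i are assumed notation, and the linear dependence on w_i is an assumption rather than something the transcript confirms.

```latex
\begin{aligned}
&G\,x(t) + C\,\dot{x}(t) = B\,u(t) \\
&G(w) = G_0 + \sum_i w_i\,G_i, \qquad C(w) = C_0 + \sum_i w_i\,C_i \\
&G(w)\,x(t) + C(w)\,\dot{x}(t) = B\,u(t;\,r)
\end{aligned}
```

Here u(t; r) denotes the load currents reconstructed from the independent variables r_i, so the node voltages, and therefore the noise, depend on both w and r.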

Stochastic Decap Formulation
Objective: minimize the maximum noise, summed over all ports, where the maximum is taken over the stochastic current variables within their upper and lower bounds.
Subject to: the local decap area constraint (due to placement) and the global decap area constraint (due to leakage).
This is a non-convex min/max optimization problem, and the globally optimal solution is difficult to find.

Iterative Programming Algorithm
In each iteration we increase the white space allowed, until either all the white space has been used up or the solution converges. Each iteration alternates two steps: find the optimal decap budgeting for the given maximum droop/bounce (update the decap budgeting), then find the input corresponding to the maximum droop/bounce for the given decap budgeting (update the maximum droop/bounce). The algorithm cannot guarantee optimality, but it can guarantee convergence and efficiency. Experimental results show that it achieves good optimization results.
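The alternation between the two steps can be illustrated on a toy problem. Everything in the sketch below is an assumption made for illustration: the noise model (each port's noise is its load divided by one plus its decap units), the finite list of candidate load scenarios standing in for the stochastic inputs, and the greedy one-unit update standing in for the real decap optimization.

```python
def noise(loads, decaps):
    """Toy noise model: a port's noise falls as its decap grows."""
    return sum(i / (1.0 + w) for i, w in zip(loads, decaps))


def worst_input(scenarios, decaps):
    """Step 2: the load scenario giving maximum noise for fixed decaps."""
    return max(scenarios, key=lambda loads: noise(loads, decaps))


def add_one_unit(loads, decaps):
    """Step 1 (simplified): place one more decap unit where it helps most."""
    gains = [i / (1.0 + w) - i / (2.0 + w) for i, w in zip(loads, decaps)]
    updated = list(decaps)
    updated[gains.index(max(gains))] += 1
    return updated


def iterative_budgeting(scenarios, total_units):
    """Release the white space one unit per iteration, alternating the steps."""
    decaps = [0] * len(scenarios[0])
    for _ in range(total_units):
        loads = worst_input(scenarios, decaps)
        decaps = add_one_unit(loads, decaps)
    return decaps, noise(worst_input(scenarios, decaps), decaps)
```

For two ports with scenarios `[(1.0, 0.2), (0.2, 1.0)]` and four decap units, the loop ends with two units at each port, because the worst-case scenario flips as decap is added and each new unit goes to whichever port is currently the bottleneck.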

Illustration of Iterative Programming
A0 (initial): the initial noise curve at one randomly selected port.
A1 (P3): the noise curve under the optimal decap budgeting for a given droop/bounce.
A2 (P2): the noise curve with the input corresponding to the maximum droop/bounce for the decap budgeting in A1.
A3 (P3): the noise curve under the optimal decap budgeting for the given maximum droop/bounce in A2.

Sequential Programming
We apply sequential linear programming (sLP) to solve each of the two sub-problems. For each sub-problem, we iterate the following two steps until the solution converges. First, compute the first-order sensitivities of all the variables by moment matching. Second, linearize the objective function with the sensitivities, so that the optimization problem becomes an LP.
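The linearize-and-solve loop can be sketched on a toy objective. The assumptions here are substantial: the objective is a made-up analytic function whose sensitivities are computed in closed form (the slides obtain them by moment matching on the P/G network), the only constraints are non-negative decap and one total budget (so each LP can be solved exactly by a greedy fill), and a shrinking move limit is added to keep each linearization valid.

```python
def objective(w, loads):
    """Toy noise: sum of load_i / (1 + w_i)."""
    return sum(a / (1.0 + x) for a, x in zip(loads, w))


def sensitivities(w, loads):
    """First-order sensitivities d(noise)/d(w_i) of the toy objective."""
    return [-a / (1.0 + x) ** 2 for a, x in zip(loads, w)]


def solve_linearized(grad, w, budget, move):
    """Minimize grad . d with |d_i| <= move, w + d >= 0, sum(w + d) <= budget.

    Start every d_i at its lower bound, then spend the available budget on
    the variables with the most negative sensitivity first.
    """
    lower = [max(-move, -x) for x in w]
    step = list(lower)
    capacity = budget - sum(w) - sum(lower)
    for i in sorted(range(len(w)), key=lambda idx: grad[idx]):
        if grad[i] >= 0.0 or capacity <= 0.0:
            break
        extra = min(move - lower[i], capacity)
        step[i] += extra
        capacity -= extra
    return step


def sequential_lp(loads, budget, move=1.0, tol=1e-4, max_iters=200):
    """Repeat linearize-and-solve, halving the move limit after a bad step."""
    w = [0.0] * len(loads)
    best = objective(w, loads)
    for _ in range(max_iters):
        if move < tol:
            break
        step = solve_linearized(sensitivities(w, loads), w, budget, move)
        trial = [x + d for x, d in zip(w, step)]
        value = objective(trial, loads)
        if value < best:
            w, best = trial, value
        else:
            move *= 0.5
    return w, best
```

For loads (4, 1) and a budget of 3, the analytic optimum is w = (7/3, 2/3) with noise 1.8, and the loop approaches it as the move limit shrinks. A general LP solver would replace the greedy fill once the constraints are richer than a single budget.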

Impact of Current Correlations
Model 1: maximum current at all ports. Model 2: stochastic model with logic-induced correlation. Model 3: Model 2 plus temporal correlation.
Node #  | Noise M1 (V*s) | Noise M2 (V*s) | Noise M3 (V*s) | Runtime M1 (s) | Runtime M2 (s) | Runtime M3 (s)
1284    | 6.33e-7        | 1.28e-7        | 4.10e-8        | 104.2          | 161.2          | 282.3
10490   | 5.21e-5        | 1.09e-5        | 4.80e-6        | 973.2          | 1430           | 2199
42280   | 7.92e-4        | 5.38e-4        | 9.13e-5        | 2732           | 3823           | 5238
166380  | 1.34e-2        | 5.37e-3        | 2.28e-3        | 3625           | 5798           | 7821
avg     | 1              | 1/2.68X        | 1/9.10X        | 1              | 1.50X          | 2.26X
Compared with the model that assumes maximum currents at all ports, and under the same decap area, the stochastic model with only spatial (logic-induced) correlation reduces the noise by up to 3X, and the stochastic model with both spatial and temporal correlation reduces the noise by up to 9X.

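Impact of Leff Variation
Node #  | V.R. | sLP mean (V*s) | sLP std (V*s) | sLP runtime (s) | sLP+Leff mean (V*s) | sLP+Leff std (V*s) | sLP+Leff runtime (s)
1284    | 10%  | 9.28e-7        | 3.97e-7       | 184.2           | 6.14e-7             | 1.38e-7            | 332.8 (1.81X)
1284    | 20%  | 9.43e-7        | 4.55e-7       |                 | 6.38e-7             | 1.86e-7            |
10490   | 10%  | 1.03e-4        | 4.79e-5       | 1121            | 7.22e-5             | 1.23e-5            | 3429 (3.06X)
10490   | 20%  | 1.22e-4        | 4.38e-5       |                 | 7.94e-5             | 2.06e-5            |
42280   | 10%  | 2.29e-3        | 9.72e-4       | 2236            | 8.23e-4             | 1.01e-4            | 6924 (3.10X)
42280   | 20%  | 4.43e-3        | 1.01e-3       |                 | 8.28e-4             | 1.92e-4            |
166380  | 10%  | 2.06e-2        | 9.91e-3       | 3824            | 5.31e-3             | 8.92e-4            | 11224 (2.93X)
166380  | 20%  | 2.31e-2        | 1.03e-2       |                 | 5.92e-3             | 9.33e-4            |
Averages, normalized to sLP = 1: at 10% V.R., sLP + Leff gives a mean of 1/2.02X and a std of 1/5.05X; at 20% V.R., a mean of 1/1.95X and a std of 1/4.05X. The average runtime is 2.73X.
Compared with the stochastic model that does not consider Leff variation, the stochastic model that does consider it reduces the average noise by up to 4X and the 3-sigma noise by up to 13X.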

Recap of Key Points
IR drop is primarily an on-chip issue and can be fixed by P/G sizing. Beyond the chip, L di/dt noise (or SSN) is the primary concern. On-chip decap is continuous and is allocated to the white space on the chip. On-chip current (not yet IO current) can be modeled by worst-case, average-case, and stochastic models.

Reading Assignment
Static noise (IR drop): S. Tan and R. Shi, "Optimization of VLSI Power/Ground (P/G) Networks Via Sequence of Linear Programmings", DAC'09.
Dynamic noise (L di/dt noise): Yiyu Shi, Jinjun Xiong, Chunchen Liu and Lei He, "Efficient Decoupling Capacitance Budgeting Considering Current Correlation Including Process Variation", ICCAD, San Jose, CA, Nov. 2007.
Supplementary reading: H. Qian, S. R. Nassif, and S. S. Sapatnekar, "Power Grid Analysis Using Random Walks", IEEE Trans. on CAD, 2005. Yiyu Shi, Wei Yao, Jinjun Xiong, and Lei He, "Incremental and On-demand Random Walk for Iterative Power Distribution Network Analysis", ASPDAC 2009.