EE 201C Modeling of VLSI Circuits and Systems

Presentation transcript:

EE 201C Modeling of VLSI Circuits and Systems, Prof. Lei He, UCLA

Chapter 5: Signal and Power Integrity
  On-chip signal integrity
    RC and RLC coupling noise
  Power integrity
    Static noise: IR drop
    Dynamic noise: L di/dt noise
  Beyond die noise (Chapter 6)
    In-package decap insertion
    Low frequency P/G resonance
    Noise for high-speed signaling

Reading on Power Integrity
  Static noise: IR drop
    S. Tan and R. Shi, "Optimization of VLSI Power/Ground (P/G) Networks Via Sequence of Linear Programmings", DAC'09.
  Dynamic noise: L di/dt noise
    Yiyu Shi, Jinjun Xiong, Chunchen Liu and Lei He, "Efficient Decoupling Capacitance Budgeting Considering Current Correlation Including Process Variation", ICCAD, San Jose, CA, Nov. 2007.
  Supplementary reading:
    H. Qian, S. R. Nassif, and S. S. Sapatnekar, "Power Grid Analysis Using Random Walks," IEEE Trans. on CAD, 2005.
    Yiyu Shi, Wei Yao, Jinjun Xiong, and Lei He, "Incremental and On-demand Random Walk for Iterative Power Distribution Network Analysis", ASPDAC 2009.

Efficient Decoupling Capacitance Budgeting Considering Operation and Process Variations
  Yiyu Shi*, Jinjun Xiong+, Chunchen Liu* and Lei He*
  *Electrical Engineering Department, UCLA
  +IBM T. J. Watson Research Center, Yorktown Heights, NY
  This work is partially supported by NSF CAREER award and a UC MICRO grant sponsored by Altera, RIO and Intel.

What is Decap? Decap: decoupling capacitor

Why Is Adding Decap Important?
  Power supply fluctuations are increasing significantly.
  Static IR drop: V = I × R = (P / Vdd) × R
  Dynamic L di/dt noise: V = L × di/dt
  [Figure: illustration of voltage drop variation across a modern VLSI chip]
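The two voltage-drop expressions on this slide can be evaluated directly. The following is a minimal sketch; the function names and all numeric values (power, supply voltage, grid resistance, inductance, current step) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def static_ir_drop(power_w, vdd_v, resistance_ohm):
    """Static IR drop: V = I * R, with I = P / Vdd."""
    current_a = power_w / vdd_v
    return current_a * resistance_ohm


def dynamic_ldidt_drop(inductance_h, delta_i_a, delta_t_s):
    """Dynamic noise: V = L * di/dt, using a finite current step."""
    return inductance_h * delta_i_a / delta_t_s


# Hypothetical numbers: a 100 W chip at 1.0 V through 1 milliohm of grid resistance.
ir_drop = static_ir_drop(power_w=100.0, vdd_v=1.0, resistance_ohm=1e-3)

# Hypothetical numbers: 10 pH of loop inductance, 1 A current step in 100 ps.
ldidt_drop = dynamic_ldidt_drop(inductance_h=10e-12, delta_i_a=1.0, delta_t_s=100e-12)

print(f"static IR drop   = {ir_drop:.3f} V")
print(f"dynamic L di/dt  = {ldidt_drop:.3f} V")
```

With these made-up values both terms come out near 0.1 V, i.e. about 10% of a 1.0 V supply.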

Introduction - Voltage Drop Impacts on Timing
  A 10% voltage drop can cause more than a 10% increase in delay.

Introduction - Effect of Adding Decaps
  Adding decaps is the most effective way to reduce voltage noise in P/G grids.

Introduction - The Costs of Adding Decaps
  Decaps are mainly made of MOS gate capacitors.
  They consume premium white space, which could otherwise be used for buffers and other logic gates added during physical optimization.
  MOS gates are leaky and become leakier with scaling, so decaps add leakage power.
  Excessive decap leads to low yield, low circuit resonant frequency, etc.
  Economical use of decaps is important!

Decap Budgeting Overview
  Nodes far from a Vdd pin may suffer from supply noise due to a sudden burst of activity.
  Decap supplies the surplus current demand from locally stored charge.
  Side effects of adding too much decap:
    Increased leakage
    Increased die area
    Risk of yield loss
  Location matters: the closer the decap is to the turbulent point, the more noise reduction can be achieved.
  Given the amount of decap to be inserted, find the optimal locations so that the noise is suppressed to the maximum extent.
  [Figure: circuit with power supply, load current, intrinsic cap and decap; voltage waveform dipping below the level Vn between t0 and t1]
  We define the noise as the area of the waveform below Vn, integrated over time from t0 to t1.
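The noise metric defined on this slide (area of the waveform below Vn) can be approximated numerically from a sampled voltage waveform. This is a minimal sketch under stated assumptions: the function name `noise_area`, the sample values, and the use of the trapezoidal rule are illustrative choices, not taken from the slides. Clipping the deficit at zero means that samples at or above Vn contribute nothing, so the result is only as accurate as the sampling near the crossing points.

```python
def noise_area(times, voltages, v_threshold):
    """Approximate the integral over time of max(Vn - v(t), 0).

    times, voltages: equal-length sequences of samples, times increasing.
    v_threshold: the level Vn below which the voltage counts as noise.
    """
    total = 0.0
    for i in range(len(times) - 1):
        # Voltage deficit below the threshold at both ends of the interval.
        d0 = max(v_threshold - voltages[i], 0.0)
        d1 = max(v_threshold - voltages[i + 1], 0.0)
        # Trapezoidal rule on the clipped deficit.
        total += 0.5 * (d0 + d1) * (times[i + 1] - times[i])
    return total


# Hypothetical waveform: a 1.0 V rail dips to 0.8 V and recovers; Vn = 0.9 V.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v = [1.0, 0.9, 0.8, 0.9, 1.0]
area = noise_area(t, v, v_threshold=0.9)
print(f"noise = {area:.3f} V*s")
```

Here the dip below Vn is a triangle of width 2 and height 0.1, so the metric evaluates to about 0.1 V*s.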

Decap Budgeting Problem Formulation
  Objective:
    Find the distribution and location of decap within the white space so that the noise on the power network is minimized.
  Constraints:
    Local decap constraints: the amount of decap allowed at each location is limited by the placement constraint.
    Global decap constraints: the total amount of decap allowed is limited by the leakage constraint.
  Limitations of existing work:
    Most existing work in essence uses the worst-case load current to guarantee that there is no noise violation, which is too pessimistic.
    It is not clear how to provide a decap budgeting solution that is robust to the current loads under all kinds of operations of a circuit.

Major Contributions of Our Work
  We develop a novel stochastic model for current loads, taking into account operation variation such as temporal and logic-induced correlations, and process variations such as systematic and random Leff variation.
  We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
  We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs.
  Experiments show that considering both operation and process variations can reduce over-design significantly, which demonstrates the importance of considering operation variation.
  This opens a new research direction: optimizing signal, power and thermal integrity with consideration of operation variation.

Outline Stochastic Modeling and Problem Formulation Algorithm Experimental Results Conclusions

Correlated Load Currents
  There is strong correlation between load currents, due to:
  Operation variation
    Currents at different ports have logic-induced correlation.
      There is a large number of ports but a limited number of control bits.
      Currents at certain ports cannot reach their maximum at the same time because of the inherent logic dependency of a given design.
    Currents at the same port have temporal correlation.
      The system takes several clock cycles to execute one instruction.
      The currents cannot reach their maximum in all clock cycles.
  Process variation
    Currents have intra-die variation due to process variation.
      The P/G network itself is robust to process variation, but the load currents have intra-die variation because the circuit suffers from process variation.
      Leff variation is one of the primary variation sources, and it is spatially correlated [Cao:DAC'05].

Current Sampling
  Model the current in each clock cycle as a triangular waveform and assume constant rise and fall times.
    Other current waveforms can be used; this does not affect the algorithm.
    In our verification, we use the detailed, non-simplified current waveform.
  Partition the circuit into blocks and assume no correlation between different blocks [Najm:ICCAD'05].
  Run extensive simulation for each block to get the peak current value in each clock cycle and at each port.
  Assume there is temporal correlation only within a certain number of clock cycles, L.
    L can be the number of clock cycles needed to execute a certain function.
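The triangular current model can be sketched as a function that maps a time and a list of per-cycle peak values to a current. This is only an illustration: the function name, the convention that each triangle starts at the beginning of its clock cycle, and the numeric values are assumptions, not details given on the slide.

```python
def triangular_current(t, peaks, period, rise, fall):
    """Current at time t for a train of triangular pulses.

    peaks[c] is the peak current in clock cycle c. Each pulse starts at the
    beginning of its cycle (an assumption), rises linearly for `rise`
    seconds, falls linearly for `fall` seconds, and is zero afterwards.
    """
    cycle = int(t // period)
    if cycle < 0 or cycle >= len(peaks):
        return 0.0
    local = t - cycle * period  # time since the start of this cycle
    peak = peaks[cycle]
    if local <= rise:
        return peak * local / rise
    if local <= rise + fall:
        return peak * (rise + fall - local) / fall
    return 0.0


# Hypothetical peak currents for two clock cycles, with constant rise/fall times.
peaks = [0.2, 0.6]
i_peak0 = triangular_current(0.25, peaks, period=1.0, rise=0.25, fall=0.25)
i_mid1 = triangular_current(1.125, peaks, period=1.0, rise=0.25, fall=0.25)
i_idle = triangular_current(0.75, peaks, period=1.0, rise=0.25, fall=0.25)
print(i_peak0, i_mid1, i_idle)
```

Because the rise and fall times are constant, each cycle's waveform is fully described by its peak value, which is why the sampling step only needs to record peaks.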

Stochastic Current Modeling
  Divide the peak current values into sets according to clock cycle and port number.
    The set for port k and cycle index j contains the peak current values at port k in clock cycles j, j+L, j+2L, ...
    Example: take L = 2 and consider two ports over 8 consecutive clock cycles.
  For each port k and cycle index j, define a stochastic variable, written here as $I_{k,j}$, whose samples are the values in that set.
    For example, a variable with samples 0.1, 0.3, 0.5, 0.7 has mean value 0.4.
  The correlation between $I_{k,j_1}$ and $I_{k,j_2}$ reflects the temporal correlation between clock cycles j1 and j2.
  The correlation between $I_{k_1,j}$ and $I_{k_2,j}$ reflects the logic-induced correlation between ports k1 and k2.
  [Figure: peak currents arranged by clock cycle j (temporal correlation) and port k (logic-induced correlation)]
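The grouping step can be sketched as follows: peak values at each port are binned by their cycle index modulo L, so that one bin holds cycles j, j+L, j+2L, and so on. The function name and the peak values for the second port are hypothetical; the first port is chosen so that one bin reproduces the slide's example samples 0.1, 0.3, 0.5, 0.7.

```python
def group_peak_currents(peaks_per_port, L):
    """Bin peak currents by (port k, cycle index j), where j = cycle mod L.

    peaks_per_port[k][c] is the peak current at port k in clock cycle c.
    Returns a dict mapping (k, j) to the list of samples for that pair.
    """
    sample_sets = {}
    for k, peaks in enumerate(peaks_per_port):
        for cycle, value in enumerate(peaks):
            sample_sets.setdefault((k, cycle % L), []).append(value)
    return sample_sets


# L = 2, two ports, 8 consecutive clock cycles (values are illustrative).
peaks_per_port = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],  # port 0
    [0.8, 0.1, 0.6, 0.3, 0.4, 0.5, 0.2, 0.7],  # port 1 (hypothetical)
]
sample_sets = group_peak_currents(peaks_per_port, L=2)

samples = sample_sets[(0, 0)]
mean = sum(samples) / len(samples)
print(samples, round(mean, 3))
```

Each bin then serves as the sample set of one stochastic variable, so the two ports with L = 2 yield four variables.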

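Extraction of Correlations
  Write $I_{k,j}$ for the stochastic peak current at port k in clock cycle j, with mean $\mu_{k,j}$ and standard deviation $\sigma_{k,j}$.
  The logic-induced correlation coefficient between ports k1 and k2 at clock cycle j can be computed as
    $$\rho(k_1, k_2; j) = \frac{E\left[(I_{k_1,j} - \mu_{k_1,j})(I_{k_2,j} - \mu_{k_2,j})\right]}{\sigma_{k_1,j}\,\sigma_{k_2,j}}$$
  The temporal correlation coefficient between clock cycles j1 and j2 at port k can be computed as
    $$\rho(j_1, j_2; k) = \frac{E\left[(I_{k,j_1} - \mu_{k,j_1})(I_{k,j_2} - \mu_{k,j_2})\right]}{\sigma_{k,j_1}\,\sigma_{k,j_2}}$$
  To take process variation into consideration, sample each $I_{k,j}$ multiple times over different regions; the two formulas above still apply.
  Note: We use the general definition of correlation coefficient. In our case, we have two.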
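Since the slide states that the general definition of the correlation coefficient is used, the extraction can be sketched with a plain Pearson correlation between two sample sets. The function name and the sample values are illustrative assumptions; the same function serves for both the logic-induced case (two ports, same cycle index) and the temporal case (same port, two cycle indices).

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient: cov(x, y) / (std(x) * std(y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical sample sets for two ports at the same cycle index.
port_a = [0.1, 0.3, 0.5, 0.7]
port_b = [0.2, 0.6, 1.0, 1.4]  # rises together with port_a
port_c = [0.7, 0.5, 0.3, 0.1]  # peaks when port_a is low

rho_ab = correlation(port_a, port_b)
rho_ac = correlation(port_a, port_c)
print(round(rho_ab, 6), round(rho_ac, 6))
```

A coefficient near -1, as for `port_a` and `port_c`, is the kind of case where assuming both ports draw maximum current simultaneously would be pessimistic.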

Extraction of Correlations (continued)
  Because the stochastic peak currents (one variable per port and clock cycle) are not Gaussian, apply Independent Component Analysis [Hyvarinen'01] to remove the correlation between them and obtain a new set of independent variables r1, r2, ...
  Each stochastic peak current can be represented as a linear combination of r1, r2, ...
  Accordingly, the waveform at each clock cycle can be reconstructed from r1, r2, ...
  The new variables ri capture both the operation and process variations.

Example of Extracted Temporal Correlation
  [Figure: correlation map for peak currents between different clock cycles of one port, from an industrial application]
  The P/G network is modeled as an RC mesh.
  The load currents are obtained by detailed simulation of the circuit.
  The correlation matrix can be clearly divided into four blocks, so L can be set to 10.

Parameterized MNA Formulation
  Start from the original MNA formulation.
  With the decap areas wi as design variables, the G and C matrices can be expressed as functions of wi.
  Together with the stochastic current model, the MNA formulation becomes parameterized by both wi and ri.
  The objective now is to find the optimal solution for those parameters. More specifically, find the wi values that minimize the noise when the ri take the values corresponding to the load currents that introduce the maximum noise.
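The equations for this slide did not survive in the transcript. One plausible form, assuming standard time-domain MNA notation and an affine dependence of G and C on the decap areas, is sketched below. The symbols $G_0$, $C_0$, $G_i$, $C_i$, $B$, $x$ and $u$ are assumptions introduced for illustration and may differ from the authors' notation.

```latex
% Original MNA formulation (assumed standard form):
G\,x(t) + C\,\frac{dx(t)}{dt} = B\,u(t)

% Assumed affine dependence on the decap areas w_i:
G(w) = G_0 + \sum_i w_i\,G_i, \qquad
C(w) = C_0 + \sum_i w_i\,C_i

% With the load currents expressed through the independent variables r_i:
G(w)\,x(t) + C(w)\,\frac{dx(t)}{dt} = B\,u(t;\,r_1, r_2, \ldots)
```

Under this reading, the node voltages $x$ and hence the noise at each port become functions of both parameter sets, which is what the min/max problem on the next slide optimizes over.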

Stochastic Decap Formulation
  Objective: minimize the maximum of the noise summed over all ports.
    The maximum is taken subject to the upper/lower bounds on the stochastic current variables.
  Subject to:
    Local decap area constraint, due to the placement constraint
    Global decap area constraint, due to the leakage constraint
  This is a non-convex min/max optimization problem, for which the globally optimal solution is difficult to find.
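Written out, the formulation described in words above might look like the following. This is a sketch under assumed notation: $z_k(w, r)$ for the noise at port $k$, $\underline{r}_i$ and $\overline{r}_i$ for the bounds on the current variables, $\overline{w}_i$ for the local decap limit, and $W$ for the global decap budget are all symbols introduced here, not taken from the slides.

```latex
\min_{w}\; \max_{r}\; \sum_{k} z_k(w, r)

\text{s.t.}\quad
\underline{r}_i \le r_i \le \overline{r}_i \quad \forall i
  \qquad \text{(bounds on the stochastic current variables)}

\qquad\;\; 0 \le w_i \le \overline{w}_i \quad \forall i
  \qquad \text{(local decap area constraint)}

\qquad\;\; \sum_i w_i \le W
  \qquad \text{(global decap area constraint)}
```

The inner maximization picks the worst admissible current pattern for a given decap allocation, and the outer minimization picks the allocation that performs best against that worst case.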

Outline Stochastic Modeling and Problem Formulation Algorithm Experimental Results Conclusions

Iterative Programming Algorithm
  In each iteration, we increase the white space allowed, until all the white space has been used up or the solution converges.
  Each iteration alternates between two sub-problems:
    Find the optimal decap budgeting for the given max droop/bounce, then update the decap budgeting.
    Find the input corresponding to the max droop/bounce for the given decap budgeting, then update the max droop/bounce.
  The algorithm cannot guarantee optimality, but it can guarantee convergence and efficiency.
  Experimental results show that our algorithm achieves good optimization results.
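The alternating structure can be illustrated on a toy problem. This is a sketch under several assumptions, not the authors' algorithm: the noise function, the finite list of candidate load-current scenarios, the integer decap grid, and the brute-force solvers for both sub-problems are all made up for illustration. In addition, to keep the toy from oscillating between two inputs, the decap step minimizes against every worst-case input found so far, a choice of this sketch that the slide does not state.

```python
import itertools


def noise(w, r):
    # Toy noise metric (assumption): noise at each port grows with its load
    # current r[k] and shrinks with the decap w[k] placed at that port.
    return sum(rk / (1.0 + wk) for wk, rk in zip(w, r))


def worst_input(w, scenarios):
    # Sub-problem: the input giving the max noise for the given decap budgeting.
    return max(scenarios, key=lambda r: noise(w, r))


def best_decap(active, budget, n_ports):
    # Sub-problem: the decap budgeting (integer grid, total <= budget) that
    # minimizes the worst noise over the inputs collected so far.
    grid = range(budget + 1)
    candidates = (w for w in itertools.product(grid, repeat=n_ports)
                  if sum(w) <= budget)
    return min(candidates, key=lambda w: max(noise(w, r) for r in active))


def iterative_programming(scenarios, total_budget, max_iters=50):
    n_ports = len(scenarios[0])
    w = (0,) * n_ports
    active = [worst_input(w, scenarios)]
    for it in range(1, max_iters + 1):
        budget = min(it, total_budget)  # release white space gradually
        w = best_decap(active, budget, n_ports)
        r = worst_input(w, scenarios)
        if r not in active:
            active.append(r)  # new worst case found, keep iterating
        elif budget == total_budget:
            break  # all white space released and no new worst case
    return w, max(noise(w, r) for r in scenarios)


# Hypothetical load-current scenarios for two ports, and a decap budget of 4.
scenarios = [(3.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
w_opt, worst_noise = iterative_programming(scenarios, total_budget=4)
print(w_opt, round(worst_noise, 4))
```

On this toy instance the loop settles on an even split of the budget, because favoring either port leaves the other exposed to the opposite scenario.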

Illustration of Iterative Programming
  [Figure: noise curves labeled A0: Initial, A1: (P3), A2: (P2), A3: (P3)]
  A0: Initial noise curve at one randomly selected port
  A1: The noise curve under the optimal decap budgeting for a given droop/bounce
  A2: The noise curve with the input corresponding to the max droop/bounce for the decap budgeting in A1
  A3: The noise curve under the optimal decap budgeting for the given max droop/bounce in A2

Sequential Programming
  We apply sequential linear programming (sLP) to solve each of the two sub-problems.
  For each sub-problem, we iterate the following two steps until the solution converges:
    1. Compute the first-order sensitivities of all the variables by moment matching.
    2. Linearize the objective function with the sensitivities, so that the optimization problem becomes an LP.
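The two-step loop can be sketched on a toy objective. Several things here are assumptions made for illustration: the sensitivities are estimated by finite differences rather than by moment matching, the only constraints are per-variable bounds plus a step limit, and the step limit is halved whenever the linearized step fails to improve the true objective. With only box constraints, the linearized LP separates per variable and can be solved by inspection, which keeps the sketch free of external solvers.

```python
def sensitivities(f, w, h=1e-6):
    # First-order sensitivities by forward differences (assumption: the
    # slides obtain them by moment matching instead).
    base = f(w)
    grads = []
    for i in range(len(w)):
        bumped = list(w)
        bumped[i] += h
        grads.append((f(bumped) - base) / h)
    return grads


def solve_box_lp(grad, w, lower, upper, delta):
    # Minimize grad . dw subject to |dw_i| <= delta and bounds on w_i + dw_i.
    # The LP is separable, so each variable moves against its sensitivity.
    new_w = []
    for g, wi, lo, hi in zip(grad, w, lower, upper):
        if g > 0:
            new_w.append(max(lo, wi - delta))
        elif g < 0:
            new_w.append(min(hi, wi + delta))
        else:
            new_w.append(wi)
    return new_w


def sequential_lp(f, w0, lower, upper, delta=1.0, tol=1e-6, max_iters=200):
    w = list(w0)
    for _ in range(max_iters):
        grad = sensitivities(f, w)                            # step 1
        candidate = solve_box_lp(grad, w, lower, upper, delta)  # step 2
        if f(candidate) < f(w):
            w = candidate      # the linearized step improved the objective
        else:
            delta *= 0.5       # linear model not trusted this far; shrink
            if delta < tol:
                break
    return w


# Hypothetical smooth objective with its minimum at (1, 2) inside the bounds.
def toy_objective(w):
    return (w[0] - 1.0) ** 2 + (w[1] - 2.0) ** 2


w_star = sequential_lp(toy_objective, [0.0, 0.0], lower=[0.0, 0.0], upper=[3.0, 3.0])
print([round(v, 4) for v in w_star])
```

For a real decap problem the LP would also carry the local and global area constraints, so a general LP solver would replace `solve_box_lp`.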

Outline Stochastic Modeling and Problem Formulation Algorithm Experimental Results Conclusions

Impact of Current Correlations
  Model 1: maximum current at all ports
  Model 2: stochastic model with logic-induced correlation
  Model 3: Model 2 + temporal correlation

  | Node # | Noise (V*s), Model 1 | Noise (V*s), Model 2 | Noise (V*s), Model 3 | Runtime (s), Model 1 | Runtime (s), Model 2 | Runtime (s), Model 3 |
  |--------|----------------------|----------------------|----------------------|----------------------|----------------------|----------------------|
  | 1284   | 6.33e-7              | 1.28e-7              | 4.10e-8              | 104.2                | 161.2                | 282.3                |
  | 10490  | 5.21e-5              | 1.09e-5              | 4.80e-6              | 973.2                | 1430                 | 2199                 |
  | 42280  | 7.92e-4              | 5.38e-4              | 9.13e-5              | 2732                 | 3823                 | 5238                 |
  | 166380 | 1.34e-2              | 5.37e-3              | 2.28e-3              | 3625                 | 5798                 | 7821                 |
  | avg    | 1                    | 1/2.68X              | 1/9.10X              | 1                    | 1.50X                | 2.26X                |

  Compared with the model assuming maximum currents at all ports, under the same decap area:
    The stochastic model with only spatial correlation reduces the noise by up to 3X.
    The stochastic model with both spatial and temporal correlation reduces the noise by up to 9X.

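Impact of Leff Variation

  | Node # | V.R. | sLP mean (V*s) | sLP std (V*s) | sLP runtime (s) | sLP + Leff mean (V*s) | sLP + Leff std (V*s) | sLP + Leff runtime (s) |
  |--------|------|----------------|---------------|-----------------|-----------------------|----------------------|------------------------|
  | 1284   | 10%  | 9.28e-7        | 3.97e-7       | 184.2           | 6.14e-7               | 1.38e-7              | 332.8 (1.81X)          |
  |        | 20%  | 9.43e-7        | 4.55e-7       |                 | 6.38e-7               | 1.86e-7              |                        |
  | 10490  |      | 1.03e-4        | 4.79e-5       | 1121            | 7.22e-5               | 1.23e-5              | 3429 (3.06X)           |
  |        |      | 1.22e-4        | 4.38e-5       |                 | 7.94e-5               | 2.06e-5              |                        |
  | 42280  |      | 2.29e-3        | 9.72e-4       | 2236            | 8.23e-4               | 1.01e-4              | 6924 (3.10X)           |
  |        |      | 4.43e-3        | 1.01e-3       |                 | 8.28e-4               | 1.92e-4              |                        |
  | 166380 |      | 2.06e-2        | 9.91e-3       | 3824            | 5.31e-3               | 8.92e-4              | 11224 (2.93X)          |
  |        |      | 2.31e-2        | 1.03e-2       |                 | 5.92e-3               | 9.33e-4              |                        |

  avg: 1, 1/2.02X, 1/5.05X, 2.73X, 1/1.95X, 1/4.05X

  Compared with the stochastic model that does not consider Leff variation, the stochastic model that does reduces the average noise by up to 4X and the 3-sigma noise by up to 13X.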

Conclusions
  In this paper, we develop a novel stochastic model for current loads, taking into account operation variation such as temporal and logic-induced correlations, and process variations such as systematic and random Leff variation.
  We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
  We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs. Experimental results show that the noise can be reduced by up to 9X.
  We also apply a similar idea to temperature-aware clock routing [Hao:ispd'07] and microprocessor floorplanning (ICCAD'07).