Yiyu Shi*, Jinjun Xiong+, Chunchen Liu* and Lei He*

Presentation transcript:

Efficient Decoupling Capacitance Budgeting Considering Operation and Process Variations
Yiyu Shi*, Jinjun Xiong+, Chunchen Liu* and Lei He*
*Electrical Engineering Department, UCLA
+IBM T. J. Watson Research Center, Yorktown Heights, NY
This work is partially supported by an NSF CAREER award and a UC MICRO grant sponsored by Altera, RIO and Intel.

Motivation
- Continued semiconductor technology scaling leads to growing process variations, and statistical optimization has been actively researched to cope with them:
  - Stochastic gate sizing for power reduction [Bhardwaj:DAC’05, Mani:DAC’05]
  - Stochastic gate sizing for yield optimization [Davoodi:DAC’06, Sinha:ICCAD’05]
  - Stochastic buffer insertion to minimize delay [He:TCAD’07]
  - Adaptive body biasing with post-silicon tuning [Mani:ICCAD’06]
- However, all of these works ignore operation variation, such as:
  - crosstalk difference over input vectors
  - power supply noise fluctuation over time
  - processor temperature variation over workload
- A better design can be achieved by considering both operation and process variations.
- As a vehicle to demonstrate this point, we study the on-chip decoupling capacitance insertion and sizing (decap budgeting) problem, taking into account both operation and process variations.

Decap Budgeting Overview
- Function
  - Load current causes voltage droop/bounce
  - Decap suppresses dynamic noise by supplying sudden current demands from local charge storage
- Side effects of adding too much decap
  - Increased leakage
  - Increased die area
  - Risk of yield loss
- Location matters
  - The closer the decap is to the turbulent point, the more noise reduction can be achieved
- Need to add the minimum amount of decap at the proper locations that is still sufficient for reducing noise
- Figure labels: load current, power supply, intrinsic cap, decap
- We define the noise as the integral over time of the area below Vn, between t0 and t1
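The noise definition above can be sketched numerically. The snippet below is a minimal illustration, assuming that "the area below Vn" means the area between the threshold Vn and the supply voltage wherever the voltage droops below Vn, and that the waveform is available as time samples. The function name `noise_integral`, the threshold and the sample waveform are hypothetical.

```python
def noise_integral(times, volts, v_n):
    """Trapezoidal integral of max(v_n - v(t), 0) over the sampled interval.

    Assumed reading of the slide: only the part of the waveform that
    falls below the threshold v_n contributes to the noise (in V*s).
    """
    total = 0.0
    for i in range(len(times) - 1):
        d0 = max(v_n - volts[i], 0.0)      # droop below v_n at sample i
        d1 = max(v_n - volts[i + 1], 0.0)  # droop below v_n at sample i+1
        total += 0.5 * (d0 + d1) * (times[i + 1] - times[i])
    return total


# Hypothetical waveform: the supply dips from 1.0 V to 0.8 V and recovers.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
volts = [1.0, 0.9, 0.8, 0.9, 1.0]
noise = noise_integral(times, volts, v_n=0.9)
```

Here the waveform is below the threshold only between t = 1 and t = 3, so those two samples play the role of t0 and t1.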

Decap Budgeting Problem Formulation
- Objective
  - Find the distribution and location of the white space so that the noise on the power network is minimized
- Constraints
  - Circuit system constraints: KCL, KVL and circuit element equations
  - Decap constraints: the amount of decap allowed at a location is limited
- Limitation of existing work
  - Most existing work in essence uses the worst-case load current to guarantee that there is no noise violation, which is too pessimistic
  - It is not clear how to provide a decap budgeting solution that is robust to the current loads under all kinds of operations of a circuit

Major Contribution of Our Work
- We develop a novel stochastic model for current loads, taking into account operation variation, such as temporal and logic-induced correlations, and process variations, such as systematic and random Leff variation.
- We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
- We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs.
- Experiments show that considering both operation and process variations can reduce over-design significantly, which demonstrates the importance of considering operation variation.
- This opens a new research direction for optimizing signal, power and thermal integrity with consideration of operation variation.

Outline
- Stochastic Modeling and Problem Formulation
- Algorithm
- Experimental Results
- Conclusions

Correlated Load Currents
- Strong correlation between load currents is due to:
- Operation variation
  - Currents at different ports have logic-induced correlation
    - Large number of ports with limited control bits
    - Currents at certain ports cannot reach their maximum at the same time because of the inherent logic dependency of a given design
  - Currents at the same port have temporal correlation
    - The system takes several clock cycles to execute one instruction
    - The currents cannot reach their maximum in all clock cycles
- Process variation
  - The P/G network is robust to process variation, but the load currents have intra-die variation because the circuit suffers from process variation
  - Leff variation is one of the primary variation sources, and it is spatially correlated [Cao:DAC’05]

Current Sampling
- Model the current in each clock cycle as a triangular waveform and assume constant rise/fall times
  - Other current waveforms can be used without affecting the algorithm
  - In our verification, we use the detailed, non-simplified current waveform
- Partition the circuit into blocks and assume no correlation between different blocks [Najm:ICCAD’05]
- Run extensive simulation for each block to get the peak current value in each clock cycle and at each port
- Assume there is temporal correlation only within a certain number of clock cycles, L
  - L can be the number of clock cycles needed to execute a certain function

Stochastic Current Modeling
- Divide the peak current values into sets according to clock cycle and port number
  - Each set contains the peak current values at port k in clock cycles j, j+L, j+2L, …
  - Example: take L=2 and consider two ports in 8 consecutive clock cycles
- For each set, define a stochastic variable whose samples are the values in that set
  - For example, a variable with the samples 0.1, 0.3, 0.5, 0.7 has mean value 0.4
- The correlation between the variables for clock cycles j1 and j2 at the same port reflects the temporal correlation between those clock cycles
- The correlation between the variables for ports k1 and k2 in the same clock cycle reflects the logic-induced correlation between those ports
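The grouping of peak currents into sample sets can be sketched as follows. This is a minimal illustration with made-up peak values, using 0-based port and cycle indices; the function name `build_sample_sets` and the data are hypothetical. Port 0 is chosen so that its first set reproduces the samples 0.1, 0.3, 0.5, 0.7 from the example above.

```python
from statistics import mean


def build_sample_sets(peaks, period):
    """Group peak currents by (port k, cycle offset j).

    peaks[k][c] is the peak current at port k in clock cycle c.
    The set for (k, j) holds cycles j, j+period, j+2*period, ...
    """
    sets = {}
    for k, series in enumerate(peaks):
        for j in range(period):
            sets[(k, j)] = series[j::period]
    return sets


# Hypothetical peak currents: two ports, 8 consecutive clock cycles, L = 2.
peaks = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    [0.4, 0.1, 0.3, 0.2, 0.2, 0.3, 0.1, 0.4],
]
sets = build_sample_sets(peaks, period=2)
mean_00 = mean(sets[(0, 0)])  # samples 0.1, 0.3, 0.5, 0.7
```

With two ports and L=2 this yields four sets, one stochastic variable per (port, cycle offset) pair.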

Extraction of Correlations
- The logic-induced correlation coefficient between ports k1 and k2 at clock cycle j is computed from the two corresponding stochastic current variables
- The temporal correlation coefficient between clock cycles j1 and j2 at port k is computed in the same way from the variables for those two clock cycles
- To take process variation into consideration, sample each variable multiple times over different regions; the same two formulas still apply
- We use the general definition of the correlation coefficient. In our case, we have two.
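Both coefficients can be illustrated with the standard sample (Pearson) correlation coefficient, which is assumed here to be the "general definition" the slide refers to. The sketch below pairs the i-th sample of one variable with the i-th sample of the other, that is, samples taken in the same period of L clock cycles. The sample values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Sample correlation coefficient of two equally long sample lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Logic-induced correlation: two ports at the same clock cycle offset.
# Hypothetical data where one port peaks when the other is quiet.
port_a = [0.1, 0.3, 0.5, 0.7]
port_b = [0.7, 0.5, 0.3, 0.1]
rho_logic = pearson(port_a, port_b)

# Temporal correlation: one port at two different clock cycle offsets.
cycle_j1 = [0.1, 0.3, 0.5, 0.7]
cycle_j2 = [0.2, 0.4, 0.6, 0.8]
rho_temporal = pearson(cycle_j1, cycle_j2)
```

A coefficient near -1 between two ports captures the case where their currents cannot reach the maximum at the same time.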

Extraction of Correlations
- Because the stochastic current variables are not Gaussian, apply Independent Component Analysis [Hyvarinen’01] to remove the correlation between them and obtain a new set of independent variables r1, r2, …
- Each current variable can be represented as a linear combination of r1, r2, …
- Accordingly, the waveform at each clock cycle can be reconstructed from r1, r2, …
- The new variables ri capture both the operation and process variations

Example of Extracted Temporal Correlation
- The correlation map for peak currents between different clock cycles of one port, from an industrial application
- The P/G network is modeled as an RC mesh
- The load currents are obtained by detailed simulation of the circuit
- The correlation matrix can be clearly divided into four blocks, and L can be set to 10

Parameterized MNA Formulation
- Start from the original MNA formulation
- With the design variables, the decap areas wi, the G and C matrices can be expressed in terms of wi
- Together with the stochastic current model, the MNA formulation becomes a system parameterized by wi and ri
- The objective now is to find the optimal solution for those parameters
  - More specifically, find the wi values that minimize the noise when the ri take the values corresponding to the load currents that introduce the maximum noise
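As a rough guide to what such a parameterized system looks like, a common textbook form is sketched below. It is an assumption for illustration, not the authors' exact notation: the symbols G_0, C_0, G_i, C_i, B, x and the affine dependence on the decap areas w_i are all hypothetical choices.

```latex
\begin{align*}
(G + sC)\,x(s) &= B\,i(s), \\
G(w) &= G_0 + \sum_i w_i G_i, \qquad C(w) = C_0 + \sum_i w_i C_i, \\
\bigl(G(w) + sC(w)\bigr)\,x(s) &= B\,i(s;\, r_1, r_2, \dots).
\end{align*}
```

The first line is the unparameterized MNA system, the second makes the matrices depend on the decap areas, and the third adds the load currents expressed through the independent variables r1, r2, ….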

Stochastic Decap Formulation
- Minimize the maximum noise, summed over all ports
- The maximum is taken subject to the upper/lower bounds of the stochastic current variables
- The minimization is subject to
  - individual decap area constraints due to placement constraints
  - a total decap area constraint
- Non-convex min/max optimization problem
  - Difficult to find the exact optimal solution
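Written out, a min/max problem of this shape could look like the sketch below. The notation is assumed for illustration only: N_k(w, r) stands for the noise at port k, the bounds on r_i and w_i are written with min/max superscripts, and W is the total decap area available.

```latex
\begin{align*}
\min_{w}\ \max_{r}\quad & \sum_{k} N_k(w, r) \\
\text{s.t.}\quad & r_i^{\min} \le r_i \le r_i^{\max} \quad \forall i, \\
& 0 \le w_i \le w_i^{\max} \quad \forall i, \\
& \sum_i w_i \le W .
\end{align*}
```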

Outline
- Stochastic Modeling and Problem Formulation
- Algorithm
- Experimental Results
- Conclusions

Iterative Programming Algorithm
- In each iteration we increase the white space allowed, until all the white space has been used up or the solution converges
- Each iteration alternates between two steps:
  - Find the optimal decap budgeting for the given max droop/bounce, then update the decap budgeting
  - Find the input corresponding to the max droop/bounce for the given decap budgeting, then update the max droop/bounce
- Cannot guarantee optimality, but can guarantee convergence and efficiency
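The alternation between the two steps can be illustrated on a toy problem. Everything in the sketch below is an assumption made for illustration: the two-port load model, the noise function, and the greedy placement of each newly released slice of area, which stands in for the "find the optimal decap budgeting" step. It is not the authors' implementation.

```python
def port_currents(r):
    # Hypothetical two-port load driven by one variable r in [-1, 1]:
    # when port 0 peaks, port 1 is idle, and vice versa.
    return [1.0 + r, 1.0 - r]


def noise(w, r):
    # Toy noise model (an assumption): decap w[k] divides the noise at port k.
    return sum(i / (1.0 + wk) for i, wk in zip(port_currents(r), w))


def worst_input(w, candidates=(-1.0, 1.0)):
    # The toy noise is linear in r, so its maximum sits at a bound of r.
    return max(candidates, key=lambda r: noise(w, r))


def with_extra(w, k, step):
    # Copy of w with `step` more decap at port k.
    return [wk + step if j == k else wk for j, wk in enumerate(w)]


def iterative_budgeting(total_area, step, n_ports=2):
    w = [0.0] * n_ports
    used = 0.0
    while used + step <= total_area:
        r = worst_input(w)  # input giving the max noise for the current budget
        best = min(range(n_ports), key=lambda k: noise(with_extra(w, k, step), r))
        w = with_extra(w, best, step)  # best use of the newly released area
        used += step
    return w, noise(w, worst_input(w))


w, worst = iterative_budgeting(total_area=2.0, step=0.25)
```

In this toy case the worst-case input flips between the two ports from one iteration to the next, so the released area ends up split evenly between them.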

Illustration of Iterative Programming
Figure: noise curves A0 (initial), A1 (P3), A2 (P2) and A3 (P3) at one randomly selected port
- A0: The initial noise curve
- A1: The noise curve under the optimal decap budgeting for a given droop/bounce
- A2: The noise curve with the input corresponding to the max droop/bounce for the decap budgeting in A1
- A3: The noise curve under the optimal decap budgeting for the given max droop/bounce in A2

Sequential Programming
- We apply sequential linear programming (sLP) to solve each of the two sub-problems
- For each sub-problem, we iterate the following two steps until the solution converges:
  - Compute the first-order sensitivities of all the variables by moment matching
  - Linearize the objective function with the sensitivities, so that the optimization problem becomes an LP
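The linearize-and-solve loop can be sketched on a small box-constrained problem. This is an illustration under stated assumptions, not the method used in the work: sensitivities are approximated by finite differences instead of moment matching, a shrinking step bound (trust region) is added to keep the linearization valid, and only box constraints are handled, so each LP has a closed-form corner solution. A total-area constraint would require a general LP solver. All function names and the test objective are hypothetical.

```python
def sensitivities(f, x, h=1e-6):
    """First-order sensitivities of f at x by forward finite differences."""
    base = f(x)
    grads = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grads.append((f(xp) - base) / h)
    return grads


def slp_minimize(f, x0, lower, upper, radius=0.5, tol=1e-6, max_iter=200):
    x = list(x0)
    for _ in range(max_iter):
        if radius < tol:
            break
        g = sensitivities(f, x)
        # LP: minimize g . d subject to |d_i| <= radius and the box bounds.
        # With only box constraints, each d_i moves fully against the sign of g_i.
        trial = []
        for xi, gi, lo, hi in zip(x, g, lower, upper):
            step = -radius if gi > 0 else radius if gi < 0 else 0.0
            trial.append(min(max(xi + step, lo), hi))
        if f(trial) < f(x):
            x = trial          # accept the LP solution
        else:
            radius *= 0.5      # linear model too coarse: shrink the step bound
    return x


def objective(x):
    # Hypothetical smooth objective with its minimum at (1, 2).
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2


x_opt = slp_minimize(objective, [0.0, 0.0], lower=[0.0, 0.0], upper=[3.0, 3.0])
```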

Outline
- Stochastic Modeling and Problem Formulation
- Algorithm
- Experimental Results
- Conclusions

Impact of Current Correlations
- Model 1: Maximum current at all ports
- Model 2: Stochastic model with logic-induced correlation
- Model 3: Model 2 + temporal correlation

Node #    Noise (V*s)                        Runtime (s)
          Model 1    Model 2    Model 3      Model 1    Model 2    Model 3
1284      6.33e-7    1.28e-7    4.10e-8      104.2      161.2      282.3
10490     5.21e-5    1.09e-5    4.80e-6      973.2      1430       2199
42280     7.92e-4    5.38e-4    9.13e-5      2732       3823       5238
166380    1.34e-2    5.37e-3    2.28e-3      3625       5798       7821
avg       1          1/2.68X    1/9.10X      1          1.50X      2.26X

Compared with the model assuming maximum currents at all ports, under the same decap area:
- The stochastic model with spatial (logic-induced) correlation only reduces the noise by up to 3X
- The stochastic model with both spatial and temporal correlation reduces the noise by up to 9X
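The "avg" row appears to be the mean of the per-circuit values normalized to Model 1. The short check below, a sketch using the numbers from the table and hypothetical variable names, reproduces the reported 1/2.68X noise figure for Model 2 and the 1.50X and 2.26X runtime figures. The same calculation for the Model 3 noise gives roughly 9X.

```python
# Per-circuit values from the table, as (Model 1, Model 2, Model 3).
noise = {
    1284:   (6.33e-7, 1.28e-7, 4.10e-8),
    10490:  (5.21e-5, 1.09e-5, 4.80e-6),
    42280:  (7.92e-4, 5.38e-4, 9.13e-5),
    166380: (1.34e-2, 5.37e-3, 2.28e-3),
}
runtime = {
    1284:   (104.2, 161.2, 282.3),
    10490:  (973.2, 1430.0, 2199.0),
    42280:  (2732.0, 3823.0, 5238.0),
    166380: (3625.0, 5798.0, 7821.0),
}


def avg_ratio(table, model):
    """Mean over circuits of (value for `model`) / (value for Model 1)."""
    ratios = [row[model] / row[0] for row in table.values()]
    return sum(ratios) / len(ratios)


noise_reduction_m2 = 1.0 / avg_ratio(noise, 1)  # reported as 1/2.68X
runtime_m2 = avg_ratio(runtime, 1)              # reported as 1.50X
runtime_m3 = avg_ratio(runtime, 2)              # reported as 2.26X
```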

Impact of Leff Variation

Node #    V.R.   sLP                                     sLP + Leff
                 mean (V*s)   std (V*s)   runtime (s)    mean (V*s)   std (V*s)   runtime (s)
1284      10%    9.28e-7      3.97e-7     184.2          6.14e-7      1.38e-7     332.8 (1.81X)
          20%    9.43e-7      4.55e-7                    6.38e-7      1.86e-7
10490     10%    1.03e-4      4.79e-5     1121           7.22e-5      1.23e-5     3429 (3.06X)
          20%    1.22e-4      4.38e-5                    7.94e-5      2.06e-5
42280     10%    2.29e-3      9.72e-4     2236           8.23e-4      1.01e-4     6924 (3.10X)
          20%    4.43e-3      1.01e-3                    8.28e-4      1.92e-4
166380    10%    2.06e-2      9.91e-3     3824           5.31e-3      8.92e-4     11224 (2.93X)
          20%    2.31e-2      1.03e-2                    5.92e-3      9.33e-4
avg       10%    1                                       1/2.02X      1/5.05X     2.73X
          20%                                            1/1.95X      1/4.05X

Compared with the stochastic model without Leff variation, the stochastic model that considers it reduces the average noise by up to 4X and the 3-sigma noise by up to 13X.

Conclusions
- We develop a novel stochastic model for current loads, taking into account operation variation, such as temporal and logic-induced correlations, and process variations, such as systematic and random Leff variation.
- We propose a formal method to extract operation variation and formulate a new decap budgeting problem using the stochastic current model.
- We develop an effective yet efficient iterative alternative programming algorithm and conduct experiments using industrial designs.
- Experimental results show that the noise can be reduced by up to 9X.
- We also apply a similar idea to temperature-aware clock routing [Hao:ispd’07] and microprocessor floorplanning (Section 8C.2).

Thank you!