Off-chip Decoupling Capacitor Allocation for Chip Package Co-Design Hao Yu Berkeley Design Chunta Chu and Lei He EE Department.

Presentation transcript:

Off-chip Decoupling Capacitor Allocation for Chip Package Co-Design Hao Yu Berkeley Design Chunta Chu and Lei He EE Department UCLA The work was performed at UCLA and was partially supported by NSF and UC-MICRO

2 Decap Allocation for Clean Power Delivery
- Chip-package co-design requires a noise-free off-chip power delivery system (PDS)
  - Modeling inductance is a must
- Decoupling capacitors (decaps) are allocated on the chip-package interface to satisfy power integrity
- It is a challenging task to find a fast yet accurate decap allocation for a large-scale design
How to consider the large and complex physical-level layout during the system-level design?

3 Physical Level Challenge
- Finite parasitic impedance affects the circuit functionality at the chip-package interface
  - Supply voltage drop and electromagnetic (EM) coupling
- A distributed post-layout model burdens the system-level power integrity analysis and design
  - Millions of nodes and terminals with dense inductances
[Figure: Module 1 and Module 2]

4 System Level Challenge
- System-level synthesis needs to explore the design space composed of the tunable layout parameters
- Decap allocation: How to select the size? Where to insert?
- It requires sensitivity information obtained by perturbing the nominal design parameters
[Figure: optimization trajectory driven by sensitivity through the perturbed design space, x0, x1, x2, …, xn]
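The idea of obtaining sensitivity by perturbing nominal parameters can be illustrated with a plain finite-difference sketch. This is not the macromodel-based sensitivity used in the talk; the `toy_noise` metric and the weights `a` are hypothetical stand-ins for a real power-integrity evaluation.

```python
import numpy as np

def finite_difference_sensitivity(metric, x0, step=1e-6):
    """Approximate d(metric)/dx_i at x0 by perturbing one parameter at a time."""
    x0 = np.asarray(x0, dtype=float)
    base = metric(x0)
    grad = np.zeros_like(x0)
    for i in range(x0.size):
        x = x0.copy()
        x[i] += step                      # perturb only parameter i
        grad[i] = (metric(x) - base) / step
    return grad

# Hypothetical toy metric: decap size x_i reduces the noise contribution a_i.
a = np.array([3.0, 1.0, 2.0])

def toy_noise(x):
    return float(np.sum(a / (1.0 + x)))

sens = finite_difference_sensitivity(toy_noise, np.zeros(3))
most_effective = int(np.argmin(sens))     # parameter whose increase lowers noise most
```

Each sensitivity evaluation here costs one extra run of the metric per parameter, which is why a full simulation per perturbation becomes expensive on a large PDS.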

5 The Need for Macromodeling
- Representing a large and complex power delivery system blindly leads to expensive design cycles
- A compact representation by macromodeling is needed
- Existing decap allocation methods with macromodeling [Zheng:CICC’04, Chen:ISPD’06]
  - Generate a PDS macromodel
  - Apply simulated annealing to add/remove one decap at a legal position
  - Cannot efficiently handle a large-scale design

6 Limitations of Existing Macromodeling
- Macromodeling algorithms [PVL, PACT, PRIMA] have limited ability to handle a large-scale PDS
  1. Become ineffective when the number of terminals is large
  2. Do not provide sensitivity information
  3. Destroy the structure of the state matrix
[Figure: projection yields a small but dense model. How to use it?]

7 Our Decap Problem Formulation
- A multiple-ring-based problem formulation
  - Represent the decap solution by a combination of multi-level templates
  - Constrain by noise integral at I/O instead of noise amplitude as in [Chen:ISPD’06]
- Optimization method
  - Each step inserts a template with a given decap type based on sensitivity instead of simulated annealing
The key is to efficiently calculate sensitivity from the macromodel
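The difference between an amplitude constraint and an integral constraint can be seen in a small sketch. The slides do not define the noise integral, so the definition below (time integral of the supply deviation beyond a tolerance `tol`) is an assumption, and the rectangular voltage dips are made-up test waveforms.

```python
import numpy as np

def noise_amplitude(v, v_ref):
    """Peak deviation of the supply waveform from its reference value."""
    return float(np.max(np.abs(v - v_ref)))

def noise_integral(t, v, v_ref, tol):
    """Assumed definition: area of the deviation that exceeds the tolerance."""
    excess = np.maximum(np.abs(v - v_ref) - tol, 0.0)
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (excess[1:] + excess[:-1]) * np.diff(t)))

t = np.linspace(0.0, 1.0, 1001)
v_ref, tol = 1.0, 0.05

def dip(width, depth=0.2):
    """Hypothetical supply waveform with one rectangular dip centered at t = 0.5."""
    v = np.full_like(t, v_ref)
    v[np.abs(t - 0.5) < width / 2] -= depth
    return v

narrow, wide = dip(0.1), dip(0.4)
amp_narrow, amp_wide = noise_amplitude(narrow, v_ref), noise_amplitude(wide, v_ref)
int_narrow = noise_integral(t, narrow, v_ref, tol)
int_wide = noise_integral(t, wide, v_ref, tol)
```

Both dips have the same amplitude, so an amplitude constraint treats them alike, while the integral separates the short glitch from the sustained droop.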

8 TBS2: Macromodeling for PDS
- Principal Terminal Selection
  - Capture the essential input/output behavior
- Parameterization
  - Compute performance sensitivities from the layout modifications
- Structured Simulation
  - Sparsely arrange couplings (sparsity), leverage diverse physical domains (latency), and analyze at block level (hierarchy)
A structured and parameterized macromodel connects layout with system

9 TBS2 (1) Principal Terminal Selection
- The input signals (J = B × I) are temporally correlated
- Described by a correlation matrix C (N × N)
- Correlated terminals [b0 b1 b2] can be simplified with the use of principal component analysis (PCA)
- Select K principal terminals by the K-means method
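One plausible reading of this step is sketched below: PCA on the correlation matrix gives each terminal a low-dimensional coordinate, K-means groups terminals with similar coordinates, and one representative per cluster is kept. The synthetic waveforms, the farthest-point initialization, and the function name `principal_terminals` are assumptions for illustration, not details from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 500
base = rng.standard_normal((3, n_samples))      # three independent current patterns
# Six hypothetical terminals: pairs (0,1), (2,3), (4,5) share a pattern.
waveforms = np.repeat(base, 2, axis=0) + 0.1 * rng.standard_normal((6, n_samples))
C = np.corrcoef(waveforms)                      # N x N correlation matrix

def principal_terminals(C, k, iters=50):
    # PCA: top-k eigenpairs of the symmetric correlation matrix.
    eigval, eigvec = np.linalg.eigh(C)
    top = np.argsort(eigval)[::-1][:k]
    points = eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))

    # K-means with farthest-point initialization (deterministic).
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min(np.array([np.sum((points - c) ** 2, axis=1) for c in centers]), axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.sum((points[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = np.sum((points[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    labels = np.argmin(dist, axis=1)

    # Keep the terminal closest to each cluster center as its representative.
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            selected.append(int(members[np.argmin(dist[members, j])]))
    return labels, sorted(selected)

labels, selected = principal_terminals(C, k=3)
```

In this toy case the six terminals collapse to three representatives, one per correlated pair.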

10 TBS2 (2) Parameterization
- Decaps can be parametrically described by
  - The sizing vector (D) for M2 types of decaps
  - The topological matrix (X) for M1 levels of rings
- A total of M1 × M2 types of parameterized templates, described by a parameterized state matrix in the s-domain
[Figure: example template X(2,6)]

11 TBS2 (2) Structured Stamping
1. Partition the nominal state matrices according to clustered terminals
2. Triangularize the partitioned state matrices
3. Triangularize the nominal and sensitivity states in each local block
Details can be found in TBS1 [Yu:DAC’06] and [Yu:ISLPED’06]

12 TBS2 (3) Structured Macromodeling
- Structured projection
- Block-wise nominal and sensitivity
- Sparse and block-triangular
Details can be found in TBS1 [Yu:DAC’06] and [Yu:ISLPED’06]

13 Improved Accuracy By TBS2 Reduction
- A non-uniform RLC mesh is reduced by an 80th-order reduction using TBS2 and PRIMA
- TBS2 matches more poles than PRIMA w.r.t. the principal terminals
- TBS2 improves the waveform accuracy in both the frequency and time domains

14 Our Decap Algorithm Overview
1. Apply TBS2 just once to generate a structured and parameterized macromodel
2. Calculate block-level nominal noise at each terminal and its sensitivity w.r.t. the partitioned template
3. Check if the noise integral satisfies the constraints
4. Allocate decaps for each block according to the sensitivity in a greedy fashion
[Flowchart: TBS2, calculate nominal + sensitivity, check constraints, update template]
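The loop formed by steps 2 to 4 can be sketched as a sensitivity-driven greedy search. This is a minimal illustration under stated assumptions: `toy_noise` and `toy_sensitivity` are hypothetical closed-form stand-ins for the noise and sensitivity that the macromodel would supply, and each step adds one unit of a single template.

```python
import numpy as np

def greedy_decap_allocation(noise_fn, sens_fn, n_templates, limit, unit=1.0, max_steps=100):
    """Add one template unit per step, always choosing the most negative sensitivity."""
    x = np.zeros(n_templates)               # units allocated per template
    for _ in range(max_steps):
        if noise_fn(x) <= limit:            # step 3: constraint check
            return x, True
        s = sens_fn(x)                      # step 2: sensitivity per template
        best = int(np.argmin(s))            # largest predicted noise reduction
        if s[best] >= 0.0:                  # no template helps any more
            break
        x[best] += unit                     # step 4: greedy update
    return x, bool(noise_fn(x) <= limit)

# Hypothetical toy model: template i reduces a noise contribution a_i.
a = np.array([3.0, 1.0, 2.0])

def toy_noise(x):
    return float(np.sum(a / (1.0 + x)))

def toy_sensitivity(x):
    return -a / (1.0 + x) ** 2              # analytic derivative of toy_noise

allocation, feasible = greedy_decap_allocation(toy_noise, toy_sensitivity, 3, limit=3.2)
```

With these toy numbers the search places one unit on each template and stops as soon as the noise falls below the limit, without any random add/remove moves.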

15 Reduced Runtime and Cost of Decap Allocation
Comparing three methods:
1) Simulated annealing with noise amplitude (SA-NA) [Chen:ISPD’06]
2) Multiple-ring with noise amplitude [this paper]
3) Multiple-ring with noise integral (MRA-NI) [this paper]
- MRA-NI is up to 97X faster than SA-NA due to the structured and parameterized macromodel from TBS2
- MRA-NI reduces decap cost by up to 16% due to a more accurate integrity metric using noise integral

16 Conclusions
1. The macromodel connects the system-level design with the physical-level layout
2. TBS2: Structured and parameterized macromodel
  - Provides fast yet accurate computational prototyping for large/complex systems
  - Solves an integrity-driven decap allocation for chip-package co-design
Such a block-wise macromodel and optimization can be applied to other layout optimization problems