A Global Minimum Clock Distribution Network Augmentation Algorithm for Guaranteed Clock Skew Yield A. B. Kahng, B. Liu, X. Xu, J. Hu* and G. Venkataraman*

Slides:



Advertisements
Similar presentations
Gregory Shklover, Ben Emanuel Intel Corporation MATAM, Haifa 31015, Israel Simultaneous Clock and Data Gate Sizing Algorithm with Common Global Objective.
Advertisements

EE 201A Modeling and Optimization for VLSI LayoutJeff Wong and Dan Vasquez EE 201A Noise Modeling Jeff Wong and Dan Vasquez Electrical Engineering Department.
Design Rule Generation for Interconnect Matching Andrew B. Kahng and Rasit Onur Topaloglu {abk | rtopalog University of California, San Diego.
OCV-Aware Top-Level Clock Tree Optimization
4/22/ Clock Network Synthesis Prof. Shiyan Hu Office: EREC 731.
ELEN 468 Lecture 261 ELEN 468 Advanced Logic Design Lecture 26 Interconnect Timing Optimization.
Transmission Line Network For Multi-GHz Clock Distribution Hongyu Chen and Chung-Kuan Cheng Department of Computer Science and Engineering, University.
1 Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and Buffer Insertion Presented By Cesare Ferri Takumi Okamoto, Jason Kong.
Leakage and Dynamic Glitch Power Minimization Using MIP for V th Assignment and Path Balancing Yuanlin Lu and Vishwani D. Agrawal Auburn University ECE.
Coupling-Aware Length-Ratio- Matching Routing for Capacitor Arrays in Analog Integrated Circuits Kuan-Hsien Ho, Hung-Chih Ou, Yao-Wen Chang and Hui-Fang.
SLIP 2008, Newcastle Revisiting Fidelity: A Case of Elmore-based Y-routing Trees Tuhina Samanta*, Prasun Ghosal*, Hafizur Rahaman* and Parthasarathi Dasgupta†
Chop-SPICE: An Efficient SPICE Simulation Technique For Buffered RC Trees Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of.
1 EL736 Communications Networks II: Design and Algorithms Class8: Networks with Shortest-Path Routing Yong Liu 10/31/2007.
Variability-Driven Formulation for Simultaneous Gate Sizing and Post-Silicon Tunability Allocation Vishal Khandelwal and Ankur Srivastava Department of.
Minimal Skew Clock Synthesis Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.
Improved Algorithms for Link- Based Non-tree Clock Network for Skew Variability Reduction Anand Rajaram †‡ David Z. Pan † Jiang Hu * † Dept. of ECE, UT-Austin.
Minimal Skew Clock Embedding Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu.
Yuanlin Lu Intel Corporation, Folsom, CA Vishwani D. Agrawal
Low-power Clock Trees for CPUs Dong-Jin Lee, Myung-Chul Kim and Igor L. Markov Dept. of EECS, University of Michigan 1 ICCAD 2010, Dong-Jin Lee, University.
Minimum-Buffered Routing of Non- Critical Nets for Slew Rate and Reliability Control Supported by Cadence Design Systems, Inc. and the MARCO Gigascale.
Multiobjective VLSI Cell Placement Using Distributed Simulated Evolution Algorithm Sadiq M. Sait, Mustafa I. Ali, Ali Zaidi.
Power-Aware Placement
Statistical Crosstalk Aggressor Alignment Aware Interconnect Delay Calculation Supported by NSF & MARCO GSRC Andrew B. Kahng, Bao Liu, Xu Xu UC San Diego.
ABSTRACT We consider the problem of buffering a given tree with the minimum number of buffers under load cap and buffer skew constraints. Our contributions.
Supply Voltage Degradation Aware Analytical Placement Andrew B. Kahng, Bao Liu and Qinke Wang UCSD CSE Department {abk, bliu,
Local Unidirectional Bias for Smooth Cutsize-delay Tradeoff in Performance-driven Partitioning Andrew B. Kahng and Xu Xu UCSD CSE and ECE Depts. Work supported.
NTHU-CS VLSI/CAD LAB TH EDA Student : Da-Cheng Juan Advisor : Shih-Chieh Chang Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization.
1 UCSD VLSI CAD Laboratory ISQED-2009 Revisiting the Linear Programming Framework for Leakage Power vs. Performance Optimization Kwangok Jeong, Andrew.
Jan. 2007VLSI Design '071 Statistical Leakage and Timing Optimization for Submicron Process Variation Yuanlin Lu and Vishwani D. Agrawal ECE Dept. Auburn.
Processing Rate Optimization by Sequential System Floorplanning Jia Wang 1, Ping-Chih Wu 2, and Hai Zhou 1 1 Electrical Engineering & Computer Science.
Statistical Gate Delay Calculation with Crosstalk Alignment Consideration Andrew B. Kahng, Bao Liu, Xu Xu UC San Diego
More Realistic Power Grid Verification Based on Hierarchical Current and Power constraints 2 Chung-Kuan Cheng, 2 Peng Du, 2 Andrew B. Kahng, 1 Grantham.
Enhanced Metamodeling Techniques for High-Dimensional IC Design Estimation Problems Andrew B. Kahng, Bill Lin and Siddhartha Nath VLSI CAD LABORATORY,
VLSI Physical Design Automation
A Methodology for Interconnect Dimension Determination By: Jeff Cobb Rajesh Garg Sunil P Khatri Department of Electrical and Computer Engineering, Texas.
Dose Map and Placement Co-Optimization for Timing Yield Enhancement and Leakage Power Reduction Kwangok Jeong, Andrew B. Kahng, Chul-Hong Park, Hailong.
DELAY INSERTION METHOD IN CLOCK SKEW SCHEDULING BARIS TASKIN and IVAN S. KOURTEV ISPD 2005 High Performance Integrated Circuit Design Lab. Department of.
-1- UC San Diego / VLSI CAD Laboratory A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction Kwangsoo.
Research on Analysis and Physical Synthesis Chung-Kuan Cheng CSE Department UC San Diego
1 Coupling Aware Timing Optimization and Antenna Avoidance in Layer Assignment Di Wu, Jiang Hu and Rabi Mahapatra Texas A&M University.
A Polynomial Time Approximation Scheme For Timing Constrained Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, Charles J. Alpert** *Dept of Electrical.
PiCAP: A Parallel and Incremental Capacitance Extraction Considering Stochastic Process Variation Fang Gong 1, Hao Yu 2, and Lei He 1 1 Electrical Engineering.
Wen-Hao Liu 1, Yih-Lang Li 1, and Kai-Yuan Chao 2 1 Department of Computer Science, National Chiao-Tung University, Hsin-Chu, Taiwan 2 Intel Architecture.
March 20, 2007 ISPD An Effective Clustering Algorithm for Mixed-size Placement Jianhua Li, Laleh Behjat, and Jie Huang Jianhua Li, Laleh Behjat,
Low-Power Gated Bus Synthesis for 3D IC via Rectilinear Shortest-Path Steiner Graph Chung-Kuan Cheng, Peng Du, Andrew B. Kahng, and Shih-Hung Weng UC San.
An Efficient Clustering Algorithm For Low Power Clock Tree Synthesis Rupesh S. Shelar Enterprise Microprocessor Group Intel Corporation, Hillsboro, OR.
1 Wire Length Prediction-based Technology Mapping and Fanout Optimization Qinghua Liu Malgorzata Marek-Sadowska VLSI Design Automation Lab UC-Santa Barbara.
Statistical Sampling-Based Parametric Analysis of Power Grids Dr. Peng Li Presented by Xueqian Zhao EE5970 Seminar.
1 Interconnect and Packaging Lecture 8: Clock Meshes and Shunts Chung-Kuan Cheng UC San Diego.
A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, and Charles J. Alpert** *Dept of ECE, Michigan Technological.
1 Efficient Obstacle-Avoiding Rectilinear Steiner Tree Construction Chung-Wei Lin, Szu-Yu Chen, Chi-Feng Li, Yao-Wen Chang, Chia-Lin Yang National Taiwan.
1 ε -Optimal Minimum-Delay/Area Zero-Skew Clock Tree Wire-Sizing in Pseudo-Polynomial Time Jeng-Liang Tsai Tsung-Hao Chen Charlie Chung-Ping Chen (National.
Fast Algorithms for Slew Constrained Minimum Cost Buffering S. Hu*, C. Alpert**, J. Hu*, S. Karandikar**, Z. Li*, W. Shi* and C. Sze** *Dept of ECE, Texas.
1 Interconnect/Via. 2 Delay of Devices and Interconnect.
Xuanxing Xiong and Jia Wang Electrical and Computer Engineering Illinois Institute of Technology Chicago, Illinois, United States November, 2011 Vectorless.
1ISPD'03 Process Variation Aware Clock Tree Routing Bing Lu Cadence Jiang Hu Texas A&M Univ Gary Ellis IBM Corp Haihua Su IBM Corp.
A Fully Polynomial Time Approximation Scheme for Timing Driven Minimum Cost Buffer Insertion Shiyan Hu*, Zhuo Li**, Charles Alpert** *Dept of Electrical.
-1- UC San Diego / VLSI CAD Laboratory Optimization of Overdrive Signoff Tuck-Boon Chan, Andrew B. Kahng, Jiajia Li and Siddhartha Nath Tuck-Boon Chan,
-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products Samyoung Bang #, Kwangsoo Han ‡, Andrew B.
A Novel Timing-Driven Global Routing Algorithm Considering Coupling Effects for High Performance Circuit Design Jingyu Xu, Xianlong Hong, Tong Jing, Yici.
An O(bn 2 ) Time Algorithm for Optimal Buffer Insertion with b Buffer Types Authors: Zhuo Li and Weiping Shi Presenter: Sunil Khatri Department of Electrical.
Proximity Optimization for Adaptive Circuit Design Ang Lu, Hao He, and Jiang Hu.
Unified Adaptivity Optimization of Clock and Logic Signals Shiyan Hu and Jiang Hu Dept of Electrical and Computer Engineering Texas A&M University.
Andrew B. Kahng and Xu Xu UCSD CSE and ECE Depts.
EE201C Modeling of VLSI Circuits and Systems Final Project
EE201C Modeling of VLSI Circuits and Systems Final Project
Yiyu Shi*, Wei Yao*, Jinjun Xiong+ and Lei He*
Reducing Clock Skew Variability via Cross Links
Zero Skew Clock tree Implementation
Clock Tree Routing With Obstacles
Presentation transcript:

A Global Minimum Clock Distribution Network Augmentation Algorithm for Guaranteed Clock Skew Yield A. B. Kahng, B. Liu, X. Xu, J. Hu* and G. Venkataraman* CSE Dept., UC San Diego *ECE Dept., TAMU

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Clock Distribution Network Goal: Circuit synchronization Design objectives  Zero skew  Minimum wirelength  minimum power consumption  Minimized insertion delay Objective under process variation  Bounded skew Localization Symmetry  H-Tree

Clock Distribution Network Clock tree (ASICs)  Matching / clustering  DME  Minimum wirelength  Larger skew (variation) Clock grid (CPUs)  Larger wirelength  Minimum skew (variation) Hybrid (recent)  Symmetric tree on top  Grid in the middle  Steiner minimum trees at bottom

Clock Tree Link Insertion Link insertion reduces clock skew  Signal propagation delay is a weighted sum of the delays of the two paths Pros:  Reduces clock skew (variation)  Relatively low cost overhead Cons:  Wirelength increase  Polarity / short circuit

Link Insertion vs. Wire Sizing Synchronize two subtrees Speedup two subtrees

Previous Works on Link Based Clock Network Rule based link insertion [Rajaram, Hu and Mahapatra, DAC 04; Rajaram, Pan and Hu, ISPD 05]  All links are inserted at one time  Links are inserted between equal delay nodes  Role of the rules: find short links, distribute links evenly in clock network Incremental link insertion [Lam, Jain, Koh, Balakrishnan and Chen, ICCAD 05]  Guided by statistical skew analysis, long runtime Link insertion in buffered network [Venkataraman, et al., ICCAD05; Rajaram and Pan, ISPD 06]  Short circuit risk and slew rate are discussed

Our Contributions We formulate the problem as clock network augmentation for required clock skew yield Observations on the effect of resistive link insertion on local skew reduction for general structure clock networks New heuristic: insert many links and then selectively remove links  More accurate – guided by skew analysis  More efficient – removal sees a more global view Our method achieves dominant results on skew yield, wirelength, max skew and skew variability

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Problem Formulation Given  (buffered) clock distribution network N (in forms of electrical schematic and geometric embedding)  electrical parameter variations  clock skew yield Y s for bound U Find an augmented clock network N’ of  minimum wirelength and  skew s which satisfies clock skew yield requirement Pr(s Y s

Modeling Parasitic extraction  interconnect RLC networks Device models  buffer circuits SPICE simulation  best accuracy Variations: buffer gate length, supply voltage, wire width and sink capacitance p = p 0 +  1 +  2  p 0 : nominal   1 : intra-chip (local), spatially correlated   2 : purely random

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Analysis: Skew Reduction Between Two Nodes Inserting a purely resistive link in an unbuffered RC clock network scales down the skew and the skew variation between the two nodes by a ratio of r/(r+  ), where r is the resistance of the inserted link,  = G ii -1 +G jj -1 -2G ij -1  Conductance matrix G nxn includes G ij = -1/r ij and G ii =  j!=i 1/r ij + 1/r is for n nodes excluding the clock source s  Apply rank one update for matrix inverses  Consistent with previous observations in a clock tree G ii -1 is the resistance of path (s, i) G jj -1 is the resistance of path (s, j) G ij -1 is the common path resistance of paths (s, i) and (s, j)

Analysis: Global Skew Reduction Local skew  Clock signal arrival time at two specific sinks Global skew  Maximum of local skews Effect of resistive augmentation  Reduces local skew  No guarantee of global skew reduction Effect of capacitive augmentation  Capacitive balancing impact  Iterative link insertion does not reduce global skew monotonically

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Overview of Our Method Link insertion between all pairs of sinks which are within a short distance  Link removal based on statistical yield analysis and two heuristic rules, for minimum wirelength and skew yield compromise Link consolidation by replacing minimum spanning trees with Steiner minimum trees for further wirelength reduction

Advantages of Our Method Smooth and monotonic optimization process  starting from the “batch” augmented clock network  remove links iteratively Accurate analysis  based on the augmented clock network  (not on initial tree) Ease of physical routing of the inserted links  minimum wirelength increase  (unconfirmed with P&R flow)

Clock Network Augmentation for Clock Skew Yield Input: Clock network N in electrical schematic and geometric embedding electrical parameter variations clock skew yield Y s for bound U Output: augmented clock network N’ 1.Link insertion between nearest sink pairs within a distance of  2.Statistical clock skew analysis 3.Rule based link removal 4.Iterative link removal 5.Link consolidation by forming Steiner trees 6.Final statistical clock skew yield evaluation

Initial Link Insertion Shortest links between clock tree sinks  Minimum routing resource consumption  Minimum routing complexity  Minimum capacitance balance effect  Minimum optimization complexity Increase distance threshold  until the required clock skew yield is achieved  (Intuition: more connections helps reduce skew)

Example of Link Insertion

Rule-Based Link Removal Rule 1: remove links between two maximum delay sinks which have maximum delay in at least one of the Monte Carlo SPICE simulation runs  Reduce capacitance in longest paths so that source- sink delay is reduced Rule 2: remove links between two comparable delay sinks which have maximum delay difference no larger than a threshold  in all Monte Carlo SPICE simulation runs  Remove redundant links of virtually no effect

Example of Link Removal Link between max delay sinks Between small skew sinks

Link Consolidation If the bounding boxes of two links overlap, the two links can be merged into a Steiner tree  Reduce wirelength  Implemented in a line-scanning algorithm

Example of Link Consolidation

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Experimental Setup ISCAS’89 circuits synthesized by SIS and placed by mPL 180nm technology electrical parameters extracted by SPACE 3D Process variations 3  = 15% Spatial correlation degrades linearly Monte Carlo simulation by 1000 HSPICE runs Initial trees are obtained as in [11] G. Venkataraman et al., “Practical Techniques to Reduce Skew and Its Variations in Buffered Clock Networks,” ICCAD, pp , #sinksDelay (ps)Skew (ps)WL (um)Cap (pF) s s s

Main Experimental Results Skew Std. Deviation Max Skew Wire Length Skew Yield Our Method Previous Method [11] Initial Tree Dominant results compared with previous method [11] [11] G. Venkataraman et al., “Practical Techniques to Reduce Skew and Its Variations in Buffered Clock Networks,” ICCAD, pp , 2005.

Effects of Each Step Tree+ link insertion + insertion + link removal + insertion + removal + consolidation Wirelength Max skew Skew std deviation Skew bound is equal to 50ps for s9234 and s5378, and is 100ps for s13207 Skew yield is retained after link removal and consolidation

Skew-Wirelength Tradeoff for s13207 Skew bound (ps) Skew yield = 1

Outline Background Problem formulation Theoretical results New method Experiment Future works and summary

Discussion and Future Work Impact to signal routing  Clock wirelength is usually one order of magnitude less than signal wirelength  impact of link wirelength is limited  Integrating method with P&R flow to confirm feasibility Skew yield analysis  Currently use SPICE-based Monte Carlo simulation  Investigating statistical skew analysis to improve estimation efficiency Comparison and integration with wire sizing More comprehensive experiments with different variation models

Summary We present minimum clock distribution network augmentation for required clock skew yield New analysis results on resistive link insertion on local skew reduction for general clock networks New clock network augmentation method  Achieves dominant results  Average of 16% clock skew yield increase, 9% maximum skew reduction, and 25% clock skew standard deviation reduction with identical wirelength increase compared with [11]

Thank you !

More Discussions (1) Q: Is there any risk of ringing because of the existence of loops in the linked clock network? A: such risk can be easily avoided  Initial tree: should be balanced, buffers are inserted level by level  Links are inserted between subtrees of the same level  Then, no feedback loop is induced and no risk of ringing

More Discussions (2) Q: How to choose parameters in the proposed method? For example, how to choose the skew threshold for the second rule of link removal? A:  The results of our method are not sensitive to the change of the parameters. For example, if the skew threshold is small, less links are removed based on rule 2. However, the rule based removal is followed by skew analysis guided link removal, which can still remove those inefficient links. So, there is no need to meticulously tune the parameters.  On a coarse level, the parameters can be selected by wrapping our method with a binary search of the parameters.