-1- UC San Diego / VLSI CAD Laboratory A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction Kwangsoo.

Presentation transcript:

-1- A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction
  Kwangsoo Han, Andrew B. Kahng, Jongpil Lee, Jiajia Li and Siddhartha Nath
  VLSI CAD Laboratory, UC San Diego

-2- Outline
  Motivation
  Related Work
  Our Optimization Framework
  Experimental Setup and Results
  Conclusions

-3- Motivation
  Many signoff PVT corners in modern SoCs
  Clock skew variation across corners
  → "ping-pong" effect: fixing timing issues at one corner leads to timing violations at others
  Low voltage: gate delay dominates; high voltage: wire delay dominates
  → Skew reversal → Power/area overheads
  Our goal: minimize clock skew variation
  [Figure: launch path, datapath and capture path, with a table of launch/capture clock latency and skew at corners (SS, 0.7V, -25°C) and (FF, 1.1V, -25°C); surviving label: "Skew = -0.1 /+0.2 /0.7"; the table values were not captured in the transcript]

-4- Outline
  Motivation
  Related Work
  Our Optimization Framework
  Experimental Setup and Results
  Conclusions

-5- Related Work
  Skew minimization at multiple corners
    [Cho05] performs temperature-aware skew reduction based on an improved DME
    [Lung10] minimizes the worst clock skew across corners with delay correlation factors
  Skew variation minimization across corners
    [Restle01] proposes a two-level non-tree structure in which a mesh is applied at the bottom level
    [Su01] uses a mesh for the top level of the clock network
    [Rajaram04] inserts crosslinks in a clock tree to minimize skew variation
  Our work: a systematic optimization framework for minimizing clock skew variation in a clock tree

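-6- Skew Variation Reduction Problem
  At corner C: Skew_{i,j}^{C}; at corner C': Skew_{i,j}^{C'} (r: root; i, j: sinks)
  [Figure: a clock tree with root r and sinks i, j shown at corners C, C' and C''; the objective expression, which involves a max and a sum (∑), was not captured in the transcript]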
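The objective on this slide did not survive extraction, so the following is only a minimal sketch of one plausible reading: for each sink pair, take the largest difference in skew between any two corners, then sum over all sink pairs. The function name `skew_variation_sum` and the input layout are hypothetical, and any per-corner normalization the authors may apply is not modeled here.

```python
from itertools import combinations


def skew_variation_sum(latency):
    """Sum over sink pairs of the worst-case skew difference across corners.

    latency[corner][sink] is the clock latency from the root to that sink.
    This unnormalized metric is an assumption, not the authors' exact objective.
    """
    corners = sorted(latency)
    sinks = sorted(latency[corners[0]])
    total = 0.0
    for i, j in combinations(sinks, 2):
        # Skew of the pair (i, j) at every corner.
        skews = [latency[c][i] - latency[c][j] for c in corners]
        # The max over corner pairs of |skew_c - skew_c'| equals max - min.
        total += max(skews) - min(skews)
    return total


example = {
    "C0": {"a": 1.0, "b": 1.2, "c": 1.1},
    "C1": {"a": 0.5, "b": 0.4, "c": 0.6},
}
total_variation = skew_variation_sum(example)
```

In this toy input the pair (a, b) has skew -0.2 at C0 and +0.1 at C1, which is the kind of skew reversal the Motivation slide describes. The pair (a, c) has the same skew at both corners and contributes nothing to the sum.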

-7- Outline
  Motivation
  Related Work
  Our Optimization Framework
  Experimental Setup and Results
  Conclusions

-8- Our Optimization Framework
  Incremental optimization of a CTS solution
  Perform both global and local optimization
    Global optimization uses LP to determine delta delays on arcs
    Local optimization performs iterative local moves
  Flow: routed clock tree database → global optimization (buffer insertion/removal, routing detour) → local optimization (local moves, e.g., sizing/displacement) → optimized database
  [Figure: the original routed clock tree (root, last-stage buffers, sinks), the tree after global optimization, and the tree after local optimization around a target buffer]

-9- Global Optimization: LP
  Formulate a linear program to minimize skew variation
    Determine the delta delay on each arc at each corner
    Insert/remove buffers and detour wires based on LUTs
  Discreteness of buffer delays → ECO feasibility is important
  Annotations on the LP (the numbered equations themselves were not captured in the transcript):
    (1) Minimize number of ECO changes
    (2) Sweep U for solution with minimum skew variation
    (3) Ensure no skew degradation
    (4) Maximum clock latency constraint
    (1, 5, 6) Improve ECO feasibility
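Since the LP itself is missing from the transcript, the block below is only a hedged sketch of what a formulation matching annotations (1) to (4) could look like. It is not the authors' formulation. All symbols are assumptions introduced for illustration: \Delta_e^{c} is the delta delay on arc e at corner c, D_i^{c} is the original latency of sink i, \mathrm{path}(i) is the set of arcs from the root to sink i, V_{i,j} bounds the skew variation of sink pair (i, j), S^{c} is the original maximum skew at corner c, and L_{\max}^{c} is the latency limit. Constraints (5) and (6) are not recoverable and are left out.

```latex
\begin{aligned}
\min_{\Delta,\,V}\quad
  & \sum_{e}\sum_{c} \bigl\lvert \Delta_e^{c} \bigr\rvert
  && \text{(1) proxy for the number of ECO changes} \\
\text{s.t.}\quad
  & L_i^{c} = D_i^{c} + \sum_{e \in \mathrm{path}(i)} \Delta_e^{c}
  && \forall i,\ \forall c \\
  & V_{i,j} \ge \bigl(L_i^{c} - L_j^{c}\bigr) - \bigl(L_i^{c'} - L_j^{c'}\bigr)
  && \forall (i,j),\ \forall c \ne c' \\
  & \sum_{(i,j)} V_{i,j} \le U
  && \text{(2) bound swept over } U \\
  & -S^{c} \le L_i^{c} - L_j^{c} \le S^{c}
  && \text{(3) no skew degradation},\ \forall (i,j),\ \forall c \\
  & L_i^{c} \le L_{\max}^{c}
  && \text{(4) maximum latency},\ \forall i,\ \forall c
\end{aligned}
```

The absolute values in the objective can be linearized in the usual way with one auxiliary variable per term. Ranging the V constraint over ordered corner pairs covers both signs of the skew difference.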

-10- Our Optimization Framework
  Incremental optimization of a CTS solution
  Perform both global and local optimization
    Global optimization uses LP to determine delta delays on arcs
    Local optimization performs iterative local moves
  Flow: routed clock tree database → global optimization (buffer insertion/removal, routing detour) → local optimization (local moves, e.g., sizing/displacement) → optimized database

-11- Local Optimization: Moves
  Iterative local moves to minimize skew variation
  Three types of local moves
    1. Displacement {N, S, E, W, NE, NW, SE, SW} by 10μm × one-step sizing
    2. Displacement by 10μm × one-step sizing on child buffer
    3. Reassign to a new driver (i) at the same level, (ii) within a bounding box of 50μm × 50μm
  Each move is expensive (legalization, ECO routing, RC extraction, STA)
  Each buffer has ~100 candidate moves
  Which move is the best? Our solution: learning-based model
  [Figure: illustrations of move types (1), (2) and (3)]
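To make the first move type concrete, here is a minimal sketch that enumerates displacement-and-sizing candidates for one buffer. It rests on assumptions the slide does not confirm: "one-step sizing" is read as moving one step down, staying, or moving one step up in an ordered list of buffer sizes, and a diagonal move shifts the buffer by 10μm along each axis. The name `type1_moves` and the tuple layout are hypothetical.

```python
from itertools import product

# Eight compass directions as unit (dx, dy) offsets.
DIRECTIONS = {
    "N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0),
    "NE": (1, 1), "NW": (-1, 1), "SE": (1, -1), "SW": (-1, -1),
}
STEP_UM = 10.0            # displacement step from the slide
SIZE_STEPS = (-1, 0, 1)   # assumed meaning of "one-step sizing"


def type1_moves(x, y, size_index, num_sizes):
    """Enumerate (direction, new_x, new_y, new_size_index) candidates."""
    moves = []
    for (name, (dx, dy)), ds in product(DIRECTIONS.items(), SIZE_STEPS):
        new_size = size_index + ds
        # Skip sizing steps that fall outside the available size list.
        if 0 <= new_size < num_sizes:
            moves.append((name, x + dx * STEP_UM, y + dy * STEP_UM, new_size))
    return moves


candidates = type1_moves(0.0, 0.0, size_index=1, num_sizes=3)
```

Under these assumptions a mid-sized buffer gets 24 type-1 candidates. Types 2 and 3 would add more, so evaluating every candidate with full legalization, ECO routing, extraction and STA quickly becomes expensive.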

-12- Machine Learning-Based Model
  Predict driver-to-fanout latency change due to local moves
  Flow: local move → analytical models → delta delays → learning-based model → delta delays
  Analytical models
    Routing: FLUTE, STST
    Cell delay: Liberty LUTs
    Wire delay: Elmore, D2M
  [Figure annotations: each attempt is a local move; 114 buffers; 45 candidate moves for each buffer]
  Learning-based model identifies best moves for more buffers with fewer attempts
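The slide lists Elmore among the analytical wire delay models. As a reminder of what that estimate computes, the sketch below evaluates the Elmore delay of a single unbranched wire. It assumes each segment's capacitance is lumped at its far end, which is a simplification chosen for illustration. The function name `elmore_delay` is hypothetical and is not taken from the authors' flow.

```python
def elmore_delay(segments, load_cap=0.0):
    """Elmore delay (seconds) from driver to sink along an unbranched wire.

    segments: list of (resistance_ohm, capacitance_farad) ordered from the
              driver to the sink, with each capacitance lumped at the far
              end of its segment (an assumed simplification).
    load_cap: sink pin capacitance in farads.
    """
    downstream_cap = load_cap
    delay = 0.0
    # Walk from the sink back to the driver, accumulating the capacitance
    # that each segment resistance has to charge.
    for resistance, capacitance in reversed(segments):
        downstream_cap += capacitance
        delay += resistance * downstream_cap
    return delay


wire = [(100.0, 1e-15), (200.0, 2e-15)]
delay_s = elmore_delay(wire, load_cap=3e-15)
```

For this two-segment wire the second resistor sees 5 fF and the first sees 6 fF, giving 200 × 5 fF + 100 × 6 fF = 1.6 ps.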

-13- Outline
  Motivation
  Related Work
  Our Optimization Framework
  Experimental Setup and Results
  Conclusions

-14- Experimental Setup
  Technology: foundry 28nm LP
  Initial clock tree from Synopsys IC Compiler
  Testcases: (a) high-speed application processor, (b) memory controller
  Corners:
    Corner  Process  Voltage  Temperature  BEOL  Testcases
    C0      SS       0.90V    -25°C        Cmax  (a), (b)
    C1      SS       0.75V    -25°C        Cmax  (a), (b)
    C2      FF       1.10V    125°C        Cmin  (b)
    C3      FF       1.32V    125°C        Cmin  (a)
  [Figure: testcase layouts with clock ports marked; clock nets/cells and sinks shown in yellow]

-15- Experimental Results (1)
  Up to 22% reduction in the sum of skew variation over all sink pairs
  No skew degradation at any corner
  Negligible area and power overhead
  [Table: testcases (a) and (b), Original vs. Global-local flows; columns Variation (ns), Skew (ps) at C0, C1 and C2/C3, #Cells, Power (mW), Area (μm²); the values were not captured in the transcript]

-16- Experimental Results (2)
  Figure shows comparison of skew variation on testcase (a)
  Our optimization significantly reduces the large skew variation between corner pairs
  [Figure: two plots, for corner pairs (C0, C3) and (C0, C1), with axes "Optimized skew variation (ns)" and "Original skew variation (ns)"]

-17- Outline
  Motivation
  Related Work
  Our Optimization Framework
  Experimental Setup and Results
  Conclusions

-18- Conclusion and Future Work
  First framework to minimize the sum of skew variation over all sink pairs in a clock tree
  Up to 22% reduction of the sum of skew variation
  Future work
    Study resultant power and area benefits
    Model to predict a buffer location for minimum skew over a continuous range of possible locations
  Thank You!

-19- Backup Slides

-20- Experimental Results (3)
  Figure shows distribution of skew ratios between C0 and C1
  Our optimization significantly reduces the variation of skew ratios between corner pairs
  [Figure: distribution of #sink pairs over ratio (= skew at C1 / skew at C0) for the Original and Global-local flows; annotations as extracted: "μ = = 3.21" and "μ = = 2.26", with part of each label lost]