-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products Samyoung Bang #, Kwangsoo Han ‡, Andrew B.

Presentation transcript:

-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products
Samyoung Bang #, Kwangsoo Han ‡, Andrew B. Kahng ‡† and Mulong Luo †
Presented By: Siddhartha Nath

-2- Outline
Introduction and Related Works
Crosstalk-Aware Layout Optimization
Testcase Generation
Experimental Setup and Results
Conclusions

-3- Introduction
DRAM interconnect channels
–Narrow and long interconnect channel → large crosstalk
–Manual design is still the dominant methodology → might be far from optimal
[Figure labels: Aggressor, Victim]
An automated DRAM channel layout optimizer is essential to minimize the crosstalk effects

-4- Related Works
Crosstalk-aware analysis
–Analytical modeling of crosstalk-induced delay and noise [Xiao00]
–Arrival time alignment of aggressor and victim for worst-case victim delay and noise [Gross98, Sato00]
Crosstalk-aware design
–Swizzling-based interconnect design to reduce crosstalk-induced delay and noise [Yu09]
–Crosstalk-aware MILP-based detailed routing [Gao93]
No existing works integrate accurate crosstalk-aware analysis and automated design!

-5- Our Contributions
Develop an accurate closed-form analytical delay calculator
Propose several methods to achieve high-quality, scalable channel layout optimization
–MILP-based segment optimization
–Pair-swapping segment optimization
Achieve up to 29% reduction of maximum weighted delay uncertainty compared to the conventional signal permutation

-6- Outline
Introduction and Related Works
Crosstalk-Aware Layout Optimization
Testcase Generation
Experimental Setup and Results
Conclusions

-7- Problem Statement
[Figure: channel with tracks 1 to |T| crossing segments 1 to |G|]
Inputs:
–Long and narrow rectangular channel → set of tracks T
–Set of segments G; set of signals S
–Criticality classes (e.g., CLK signal → highest criticality)
–Design rules (e.g., pitch, width, spacing) for each class
–Inter-buffer length for each class
Objective: minimize max weighted delay uncertainty among all signals in different classes
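The objective on this slide can be written down as a small function. This is a minimal sketch, not the authors' implementation: the function name, the signal names and the example uncertainty values are hypothetical, and mapping the weights {10, 6.7, 4, 2, 1} from the Experiment 1 slide onto classes 0 to 4 in that order is an assumption.

```python
from typing import Dict

# Assumption: weights from the Experiment 1 slide, applied to classes 0..4 in order.
CLASS_WEIGHTS: Dict[int, float] = {0: 10.0, 1: 6.7, 2: 4.0, 3: 2.0, 4: 1.0}


def max_weighted_delay_uncertainty(
    uncertainty_ps: Dict[str, float],
    signal_class: Dict[str, int],
    class_weight: Dict[int, float] = CLASS_WEIGHTS,
) -> float:
    """Return the largest (class weight x delay uncertainty) over all signals."""
    if not uncertainty_ps:
        raise ValueError("at least one signal is required")
    return max(
        class_weight[signal_class[name]] * du
        for name, du in uncertainty_ps.items()
    )


# Hypothetical numbers, for illustration only.
example_uncertainty = {"clk": 2.0, "addr0": 4.0, "data0": 9.0}
example_class = {"clk": 0, "addr0": 2, "data0": 4}
objective = max_weighted_delay_uncertainty(example_uncertainty, example_class)
print(objective)
```

In this toy input the clock has the smallest raw uncertainty but still sets the objective, because its class weight is the largest.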

-8- Problem Complexity
[Figure: tracks t_0 to t_{|T|-1} across segments g_0 to g_3]

-9- Segment-by-segment Optimization
Optimal: max delay uncertainty 7.16ps
Segment-by-segment: max delay uncertainty 7.19ps
Signal permutation: max delay uncertainty 9.57ps
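The three numbers on this slide can be compared directly. The short calculation below is only arithmetic on the reported values (the helper name is hypothetical); it shows segment-by-segment landing roughly 0.4% above the optimal result, while signal permutation is roughly 34% above it.

```python
# Values reported on the slide, in picoseconds.
optimal_ps = 7.16
segment_ps = 7.19
permutation_ps = 9.57


def percent_above(value: float, reference: float) -> float:
    """Relative gap of `value` over `reference`, in percent."""
    return 100.0 * (value - reference) / reference


seg_gap = percent_above(segment_ps, optimal_ps)
perm_gap = percent_above(permutation_ps, optimal_ps)
print(f"segment-by-segment: {seg_gap:.2f}% above optimal")
print(f"signal permutation: {perm_gap:.1f}% above optimal")
```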

-10- Overview of Crosstalk-Aware Layout Optimization
Testcase specifications: channel length and width, #signals, #tracks, etc.
Segment-by-segment optimization
–MILP-based segment optimization
–Pair-swapping segment optimization
Accurate and fast delay uncertainty calculator
[Flow diagram labels: Testcase specifications; Segment-by-segment optimization (MILP-based segment optimization, Pair-swapping segment optimization); Delay Uncertainty Calculator; Optimized layout with min delay uncertainties of signals]

-11- MILP-based Segment Optimization: Notations

-12- Basic Constraints for Our MILP

-13- MILP-based Segment Optimization
[Figure: signals 1 to 4 on tracks t_0 to t_{|T|-1} between segment g_j and segment g_{j+1}]

-14- Decomposition for Scalability
Limitation of MILP-based method → scalability
–Decompose tracks into a set of subsets
–Solve an MILP instance for each subset
–Offset by half of the subset size to mix signals
[Figure: tracks t_1 to t_{|T|-1} across segments g_0 to g_3; one partition with |V_0| = |V_1| = |V_2| = |V_3| = V, and an offset partition with |V_0| = [V/2] and |V_1| = |V_2| = |V_3| = V]
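The track decomposition can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, and applying the half-size offset on every odd-numbered segment is an assumption drawn from the figure labels (a first subset of size [V/2] followed by subsets of size V).

```python
from typing import List


def decompose_tracks(num_tracks: int, subset_size: int, segment_index: int) -> List[List[int]]:
    """Split track indices 0..num_tracks-1 into contiguous subsets.

    Even segments use subsets of size V starting at track 0. Odd segments
    (assumed here) start with a subset of size V // 2, so subset boundaries
    shift and signals can migrate between neighboring subsets.
    """
    if num_tracks < 1 or subset_size < 2:
        raise ValueError("need num_tracks >= 1 and subset_size >= 2")
    first = subset_size if segment_index % 2 == 0 else subset_size // 2
    bounds = [0]
    nxt = first
    while nxt < num_tracks:
        bounds.append(nxt)
        nxt += subset_size
    bounds.append(num_tracks)
    return [list(range(lo, hi)) for lo, hi in zip(bounds, bounds[1:])]


even_subsets = decompose_tracks(num_tracks=10, subset_size=4, segment_index=0)
odd_subsets = decompose_tracks(num_tracks=10, subset_size=4, segment_index=1)
print(even_subsets)
print(odd_subsets)
```

Each subset would then be handed to its own MILP instance; the solver call itself is omitted because the MILP formulation is not recoverable from this transcript.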

-15- Overview of Crosstalk-Aware Layout Optimization
Testcase specifications: channel length and width, #signals, #tracks, etc.
Segment-by-segment optimization
–MILP-based segment optimization
–Pair-swapping segment optimization
Accurate and fast delay uncertainty calculator
[Flow diagram labels: Testcase specifications; Segment-by-segment optimization (MILP-based segment optimization, Pair-swapping segment optimization); Delay Uncertainty Calculator; Optimized layout with min delay uncertainties of signals]

-16- Pair-swapping Segment Optimization
Main idea: swap the signal with maximum weighted delay uncertainty with other signals
Procedure:
Step 1: Sort all the signals in increasing order of delay uncertainties
Step 2: Swap the signal w/ max weighted delay uncertainty and the signal w/ min weighted delay uncertainty
Step 3: Revert the swap if no improvement
Step 4: Repeat Steps 1, 2 until no weighted delay uncertainty improvement
[Figure: tracks t_1 to t_3 across segments g_1 and g_2, annotated with delay uncertainties 70ps, 30ps, 40ps, 45ps, 60ps]
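The four steps can be sketched for a single segment as below. This is a minimal sketch, assuming the delay uncertainty calculator is available as a callback; `pair_swap_segment`, `weighted_uncertainty` and `toy_model` are hypothetical names, and `toy_model` is a made-up neighbor-coupling model used only so the example runs, not the DCC-based calculator from the talk.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[str, float]


def pair_swap_segment(
    order: Sequence[str],
    weighted_uncertainty: Callable[[List[str]], Scores],
) -> List[str]:
    """Greedy pair swapping of signals on the tracks of one segment."""
    order = list(order)
    if not order:
        return order
    while True:
        scores = weighted_uncertainty(order)
        best = max(scores.values())
        # Step 1: sort signals by increasing weighted delay uncertainty.
        ranked = sorted(order, key=lambda s: scores[s])
        # Step 2: swap the worst signal with the least-affected one.
        i, j = order.index(ranked[-1]), order.index(ranked[0])
        order[i], order[j] = order[j], order[i]
        if max(weighted_uncertainty(order).values()) < best:
            continue  # Step 4: improvement, so repeat from Step 1.
        # Step 3: no improvement, so revert the swap and stop.
        order[i], order[j] = order[j], order[i]
        return order


def toy_model(order: List[str]) -> Scores:
    """Hypothetical model: weight x summed strength of adjacent neighbors."""
    weight = {"clk": 10, "d0": 1, "d1": 1, "d2": 1}
    strength = {"clk": 1, "d0": 5, "d1": 1, "d2": 1}
    scores = {}
    for i, sig in enumerate(order):
        neighbors = order[max(i - 1, 0):i] + order[i + 1:i + 2]
        scores[sig] = weight[sig] * sum(strength[n] for n in neighbors)
    return scores


initial = ["clk", "d0", "d1", "d2"]
optimized = pair_swap_segment(initial, toy_model)
print(optimized, max(toy_model(optimized).values()))
```

The loop terminates because a swap is kept only when the maximum strictly decreases, and there are finitely many orderings. In the toy input the clock moves away from the strong aggressor `d0`.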

-17- Overview of Crosstalk-Aware Layout Optimization
Testcase specifications: channel length and width, #signals, #tracks, etc.
Segment-by-segment optimization
–MILP-based segment optimization
–Pair-swapping segment optimization
Accurate and fast delay uncertainty calculator
[Flow diagram labels: Testcase specifications; Segment-by-segment optimization (MILP-based segment optimization, Pair-swapping segment optimization); Delay Uncertainty Calculator; Optimized layout with min delay uncertainties of signals]

-18- Delay Uncertainty Calculator
[Flow diagram labels, in slide order:]
Layout and electrical information of any aggressors and victims
Noise waveform to Delay Change Curve (DCC) [Sato03]
Delay Calculator
Delay uncertainty induced by crosstalk

-19- Accuracy of Delay Uncertainty Model
Comparison of our model and the model in [Gupta04]
–Testcase: five signals, five tracks and 8000um channel divided into 8 segments
–Randomly generate 300 swizzling patterns → 1500 data points (300 patterns × 5 signals)
–Rank correlation between our model and SPICE
–Rank correlation between [Gupta04] model and SPICE
[Plot: rank of delay uncertainty by our model vs. rank of delay uncertainty by SPICE; max rank difference: 148]
[Plot: rank of delay uncertainty by [Gupta04] vs. rank of delay uncertainty by SPICE; max rank difference: 487]
[Gupta04] P. Gupta and A. B. Kahng, “Wire Swizzling to Reduce Delay Uncertainty Due to Capacitive Coupling”, Proc. VLSI Design, 2004, pp
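The two accuracy metrics on this slide can be computed as sketched below. The slide does not say which rank correlation was used; Spearman's coefficient is assumed here, ties are assumed absent, and the function names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def ranks(values: Sequence[float]) -> List[int]:
    """1-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_agreement(model: Sequence[float], reference: Sequence[float]) -> Tuple[float, int]:
    """Return (Spearman rank correlation, max absolute rank difference)."""
    if len(model) != len(reference) or len(model) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    diffs = [a - b for a, b in zip(ranks(model), ranks(reference))]
    n = len(diffs)
    rho = 1.0 - 6.0 * sum(d * d for d in diffs) / (n * (n * n - 1))
    return rho, max(abs(d) for d in diffs)


# Hypothetical delay uncertainties (ps) from a model and from SPICE.
model_ps = [10.0, 30.0, 20.0, 40.0]
spice_ps = [11.0, 19.0, 31.0, 39.0]
rho, max_diff = rank_agreement(model_ps, spice_ps)
print(rho, max_diff)
```

A ranking metric fits this use because the optimizer only needs the calculator to order candidate layouts correctly, not to match SPICE delays exactly.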

-20- Outline
Introduction and Related Works
Crosstalk-Aware Layout Optimization
Testcase Generation
Experimental Setup and Results
Conclusions

-21- Testcase Generation: General Inputs
No public benchmark for DRAM channel routing optimization → develop testcase generator
General inputs
–Channel length
–Channel width
–Number of signals
–Number of tracks
–Number of segments
–Probability that a signal in class 0 is correlated with a signal in class 1
–Supply voltage
–Clock period
[Table: correlation probabilities between class 0 to class 4; only one entry (0.5) survives in the transcript]
[Figure labels: Channel length, Channel width, Signals, Tracks, Segment]

-22- Testcase Generation: More Inputs
Class-specific inputs
–Number of signals
–Ground capacitance (Resistance)
–Coupling capacitance between two signals in different classes
–Distances between any two consecutive buffers
–Input capacitance (Output resistance) of buffer
Signal-specific inputs
–Load capacitance
–Input resistance
–Input slew
–Activity correlation with other signals
[Figure labels: Signal 1; Load cap.; Input resistance; Input slew; Input cap.; Output res.; Distance]
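A testcase generator along these lines might start from a structure like the one below. This is a hypothetical sketch, not the authors' generator: the class and function names are invented, only the signal-specific inputs are modeled (activity correlation is left out), the default values are the ones quoted on the Experimental Setup slide, and the random initial ordering is an assumption.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SignalSpec:
    """Signal-specific inputs listed on the slide (correlation omitted)."""
    name: str
    criticality_class: int
    load_cap_fF: float
    input_res_ohm: float
    input_slew_ps: float


def generate_signals(
    signals_per_class: Sequence[int],
    seed: int = 0,
    load_cap_fF: float = 4.0,      # C_load from the Experimental Setup slide
    input_res_ohm: float = 500.0,  # R_d from the Experimental Setup slide
    input_slew_ps: float = 130.0,  # t_slew from the Experimental Setup slide
) -> List[SignalSpec]:
    """Create signals for each class and shuffle them into a starting order."""
    signals = [
        SignalSpec(f"c{cls}_s{i}", cls, load_cap_fF, input_res_ohm, input_slew_ps)
        for cls, count in enumerate(signals_per_class)
        for i in range(count)
    ]
    random.Random(seed).shuffle(signals)  # assumed random initial placement
    return signals


# Class mix of testcase T2 from the scalability slide.
t2_signals = generate_signals([15, 40, 30, 10, 5])
print(len(t2_signals), t2_signals[0])
```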

-23- Outline
Introduction and Related Works
Crosstalk-Aware Layout Optimization
Testcase Generation
Experimental Setup and Results
Conclusions

-24- Experimental Setup
Channel length = 8000um
Pitch, width, space and buffer location of each class
Signal-specific inputs: R_d = 500Ω, t_slew = 130ps, C_load = 4fF
Experiment 1: Impact of number of signals and tracks
Experiment 2: Impact of percentage of signals in each class
Experiment 3: Impact of correlation of signals
Experiment 4: MILP vs. pair-swapping vs. signal permutation
[Table columns: Class, Pitch (um), Width (um), Space (um), Buffer location; surviving buffer-location entries in row order: Per 1000um, Per 1000um, Per 2000um, Per 4000um, No buffer]

-25- Experiment 1: Impact of Number of Signals and Tracks
Vary number of tracks and signals
Weights of classes = {10, 6.7, 4, 2, 1}
Same number of signals in each class
Testcase E1T1 result
–Signals in higher-criticality class → smaller delay uncertainty
–Most critical two signals → mostly on the boundary of channel
[Table columns: Testcase, #signals, #tracks; six testcase rows starting at E1T1; surviving entries: 10 (E1T1 row), 20 (E1T4 row)]
[Figure legend: Class 0, Class 1, Class 2, Class 3, Class 4]

-26- Experiment 2: Impact of Percentage of Signals in Each Class
Number of tracks = 20 and number of signals = 20
Change the percentage of signals in each criticality class
Same weights used in Experiment 1
Observation
–% of lower-criticality signals ↑ → objective ↓
–Replacing higher-criticality signals with lower-criticality signals → maximum weighted delay uncertainty ↓
[Table columns: Test case, Priority of class A, Priority of class B, #signals in class A, #signals in class B, D_max A, D_max B, Objective; six E2T rows, values not recoverable from the transcript]

-27- Experiment 3: Impact of Correlation of Signals
[Figures: layout of channel for E3T1; layout of channel for E3T2]

-28- MILP vs. Pair-Swapping vs. Signal Permutation (2)
Scalability evaluation with larger testcases
Size of decomposition subset of MILP: 20
Testcases T2 – T5
–T2: #tracks = 100, #signals for each class = {15, 40, 30, 10, 5}
–T3: #tracks = 110, #signals for each class = {15, 40, 30, 10, 5}
–T4: #tracks = 200, #signals for each class = {30, 80, 60, 20, 10}
–T5: #tracks = 220, #signals for each class = {30, 80, 60, 20, 10}
Observation
–Pair-swapping achieves better results than signal permutation → up to 29% max weighted delay uncertainty reduction
–Empty tracks (10% of #tracks) → up to 19.9% max weighted delay uncertainty reduction
–Runtime: pair-swapping < MILP
[Plot: max weighted delay uncertainty for testcases T2 – T5, annotated 19.9% and 29%]
[Plot: runtime of MILP and pair-swapping for testcases T2 – T5]

-29- Outline
Introduction and Related Works
Crosstalk-Aware Layout Optimization
Testcase Generation
Experimental Setup and Results
Conclusions

-30- Conclusions
Propose a DRAM routing channel optimization to specifically target the layout design of long, resource-constrained channels in modern DRAM products
Optimizer is signal criticality-aware, and minimizes a maximum weighted delay uncertainty
Achieve up to 29% reduction of maximum weighted delay uncertainty compared to a traditional track permutation methodology
Ongoing work
–Flexible buffer location
–Use of inverters

-31- Acknowledgments Work supported by Samsung Electronics

-32- Thank You!