Research on Interconnect


Research on Interconnect Chung-Kuan Cheng University of California, San Diego ckcheng@ucsd.edu

Research on Interconnect
Physical Layout
  3D Floorplanning, Placement, Bus Topology
Whole Chip Analysis
  Interconnect Dominance
  Parallel SPICE (1): circuit theory, numerical methods, implementation
Signal Interconnects
  Power, Throughput, Latency
  EM Waves, e.g. T Line (2): eye diagram analysis
Power Ground Distribution Networks
  Stimuli, Networks, Noise (3)
  3D ICs (4): worst case analysis

Trends of Scaling (Moore’s Law)
  Expansion of applications: AI, bioinformatics, graphics, vision
  Explosion of communication: internet
  Distributed system: exascale computation
  Power constrained designs: low power
  Interconnect dominance: VLSI
  Nano-devices with variations: fault tolerant design, design for manufacturability
Levels: Application, System, Technology

Circuit Simulation: Motivation
Technology Scaling Challenges for Interconnect Dominance
  Complexity
  Signal and Power Integrity: crosstalk, voltage drop, coupling noise, etc.
  High clock frequency: Inductance Effect
  Smaller transistors: Complicated Models

Parallel SPICE
Transient Analysis for Whole System
Acknowledgement: NSF
Researchers: F.J. Liu, X.D. Yang, Z. Qin, Z. Zhu, R. Shi, H. Peng, S.H. Weng, Q. Chen, Y. Shen

SPICE Research Outline
  Cluster Machines
  Netlist Partitioning
  Whole Chip Analysis
  SPICE Accuracy

Simulation: Goal
  Analyze the whole chip with 100x memory capacity, 100x speedup, and 100x efficiency for designers
  Set a standard of input/output for parallel processing
  Allow cluster machines or cloud computing for the acceleration
  Demonstrate the results via power ground analysis and terahertz circuitry

Parallel Device Loading (continued)
PU: Processing Unit
[Figure: the original circuit is split at an interface into nonlinear and linear sub-circuits. Nonlinear sub-circuits are assigned to PU1, PU2, ..., PUk-1; each performs device loading, forms an equivalent circuit, and uses a direct solver (KLU). Linear sub-circuits are assigned to PUK, PUK+j1, ..., PUK+jm; each forms an equivalent circuit and uses parallel AMG. The two sides are coupled by a linear-nonlinear iteration.]

Action Items
  Input/Output
  Parsing and Screening
  Adaptive Time Step Control
  Parallel Input
  Parallelization
  Netlist Transformation
  Overall Framework
  Parallel Output
  Implement in C/C++
  Device Evaluation
  Which math library?
  Transistor Model
  Sparse matrix library?
  Integration
  Parallel Implementation
  Stability
  Stiffness Handling
  Sensitivity Calculation

Remarks
Scalable Parallel Processing
  Integration
  Matrix Operations
Applications
  Power Ground Network Analysis
  Substrate Noises
  Memory Analysis
  Terahertz Circuit Simulation

Signal Interconnect
  Introduction
  Contributions
  On-Chip T Lines
  Conclusion
11/12/2018

Signal Interconnect
Goals: Power, Throughput, Latency
Technology: Scaling Effect, On-Chip Interconnect, Chip to Chip, 3D ICs

Signal Interconnect: Scaling and Design Styles
On-chip global wires have become a barrier to achieving:
  High performance: 542 ps (1 mm wire) vs. 161 ps (10 FO4 inverters) [ITRS 2008]
  Low power: global wires contribute 50% of dynamic power [Magen 2004]
Various interconnect schemes have been proposed:
  RC wires
  On-chip T-lines
  Transceiver design, equalization, etc.
Design criteria: minimum latency [Zhang 2009]
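The wire-versus-gate comparison above can be restated as a ratio. The following is a minimal sketch that uses only the two delays quoted from [ITRS 2008]; it assumes the 161 ps figure is ten equal FO4 inverter delays, and the variable names are illustrative.

```python
# Delays quoted on the slide [ITRS 2008].
WIRE_DELAY_PS = 542.0      # 1 mm global wire
TEN_FO4_DELAY_PS = 161.0   # ten FO4 inverter delays

# Assumption: the ten FO4 stages contribute equally.
fo4_ps = TEN_FO4_DELAY_PS / 10.0

# How much slower 1 mm of wire is than the 10-FO4 reference.
wire_vs_ten_fo4 = WIRE_DELAY_PS / TEN_FO4_DELAY_PS

# The same wire delay expressed in units of one FO4 delay.
wire_mm_in_fo4 = WIRE_DELAY_PS / fo4_ps

print(f"FO4 delay: {fo4_ps:.1f} ps")
print(f"1 mm wire / 10 FO4: {wire_vs_ten_fo4:.2f}x")
print(f"1 mm wire: {wire_mm_in_fo4:.1f} FO4 delays")
```

Under that assumption, 1 mm of global wire costs a little over three times the 10-FO4 reference delay, which is the sense in which the wire is the performance barrier.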

Contributions
  Distortion: Time Domain vs Frequency Domain Analysis
  Eye Diagram Analysis
  Passive and Active Equalizer
  Distortionless T Lines
  Rotary Clocks

On-Chip T Line: Active Equalizer
Contributions
  On-chip T-line interconnect utilizing concepts borrowed from off-chip interconnect
  Performance analysis for the whole system
  A framework to improve energy efficiency
Results of our design
  20 Gbps signaling over a 10 mm, 2.6 µm-pitch on-chip T-line
  15.5 ps/mm latency and 0.2 pJ/b energy per bit in 45 nm CMOS
We propose a novel equalized global interconnect scheme, analyze it to provide design guidelines, and optimize it with a transmitter-receiver co-design framework.
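The per-unit results above imply a few end-to-end figures. This is a minimal sketch, assuming latency scales linearly with wire length and that average link power equals energy per bit times data rate; the function name and dictionary keys are illustrative and do not come from the slides.

```python
def link_metrics(rate_gbps, length_mm, latency_ps_per_mm, energy_pj_per_bit):
    """Derive end-to-end link figures from per-unit numbers."""
    # Assumption: latency grows linearly with wire length.
    total_latency_ps = latency_ps_per_mm * length_mm
    # One unit interval (bit time) in picoseconds.
    bit_time_ps = 1000.0 / rate_gbps
    # pJ/bit * Gbit/s = mW (1e-12 J * 1e9 1/s = 1e-3 W).
    power_mw = energy_pj_per_bit * rate_gbps
    # Number of bits travelling on the line at any instant.
    bits_in_flight = total_latency_ps / bit_time_ps
    return {
        "total_latency_ps": total_latency_ps,
        "bit_time_ps": bit_time_ps,
        "power_mw": power_mw,
        "bits_in_flight": bits_in_flight,
    }


# Numbers quoted on the slide: 20 Gbps, 10 mm, 15.5 ps/mm, 0.2 pJ/b.
metrics = link_metrics(20.0, 10.0, 15.5, 0.2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With the quoted numbers this gives roughly 155 ps of end-to-end latency, a 50 ps bit time, and about 4 mW of link power, so around three bits are in flight on the 10 mm line at once.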

Equalized On-Chip Global Interconnect
Architecture of the proposed equalized on-chip global interconnect
Overall structure
  Tapered current-mode logic (CML) transmitter
  Terminated differential on-chip T-line
  Continuous-time linear equalizer (CTLE) receiver
  Sense-amplifier based latch (SA-latch)
The proposed equalized interconnect is composed of a CML transmitter, a differential on-chip T-line, a CTLE receiver, and a synchronous SA-latch.

Transmitter/Receiver Co-Optimization Flow
Initial solution
  Pre-designed CML transmitter
  Pre-designed CTLE receiver
Co-design variables: [ISS, RT, RL, RD, CD, Vod]
Co-design cost function: Veye/Power, or lowest power with Veye const.
Cost function estimation
  SPICE-generated T-line step response
  Receiver step response using CTLE modeling
  Step-response based eye estimation
Internal SQP (Sequential Quadratic Programming) routine to generate the best solution
Output: best set of design variables in terms of the defined cost function
The co-optimization flow takes the pre-designed transmitter/receiver as the initial solution, estimates the cost function for each variable set, updates the variables in the SQP engine, and finally outputs the best design.
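The step-response based eye estimation step in the flow can be illustrated with a standard worst-case (peak distortion) calculation. This is a sketch under stated assumptions, not the authors' implementation: the channel is treated as linear and time-invariant, bits are 0/1 levels, and a hypothetical single-pole step response stands in for the SPICE-generated T-line response and the CTLE model. All names below are illustrative.

```python
import math


def pulse_response(step, samples_per_ui):
    """Single-bit pulse response: p[n] = s[n] - s[n - one unit interval]."""
    return [
        step[n] - (step[n - samples_per_ui] if n >= samples_per_ui else 0.0)
        for n in range(len(step))
    ]


def worst_case_eye(step, samples_per_ui):
    """Return (eye_height, sampling_phase) using peak distortion analysis.

    For each sampling phase, take one pulse-response sample per unit
    interval, treat the largest as the main cursor and every other sample
    as worst-case inter-symbol interference.
    """
    pulse = pulse_response(step, samples_per_ui)
    best = (-math.inf, 0)
    for phase in range(samples_per_ui):
        cursors = pulse[phase::samples_per_ui]
        main = max(cursors)
        isi = sum(abs(c) for c in cursors) - abs(main)
        best = max(best, (main - isi, phase))
    return best


# Hypothetical stand-in channel: unit step through a single pole with a
# time constant of half a unit interval, sampled 16 times per UI for 10 UI.
SAMPLES_PER_UI = 16
TAU_UI = 0.5
step = [
    1.0 - math.exp(-(n / SAMPLES_PER_UI) / TAU_UI)
    for n in range(10 * SAMPLES_PER_UI)
]

eye_height, best_phase = worst_case_eye(step, SAMPLES_PER_UI)
print(f"worst-case eye height: {eye_height:.3f} of full swing")
print(f"best sampling phase: {best_phase} / {SAMPLES_PER_UI} UI")
```

Because the estimate needs only one step response per candidate design, an optimizer can evaluate a cost such as Veye/Power many times without running a full transient simulation for every variable set.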

Simulated Eye Diagrams
Methodology A: transmitter/receiver separate design
Methodology B: transmitter/receiver co-design with power-efficiency optimization
The energy-efficiency co-design produces similar eye quality at the receiver but does not guarantee an open eye at the internal nodes of the interconnect, such as the transmitter or T-line output.

Summary of Performance Comparison

| Parameter | Methodology A (TX/RX separate design) | Methodology B (TX/RX co-design) |
| RS (ohm) | 47 | 148 |
| RT (ohm) | 94 | 1100 |
| RL (ohm) | 440 | 890 |
| RD (ohm) | 110 | 1430 |
| CD (fF) | 680 | 150 |
| Vod (mV) | 60 | 58 |
| Eye opening at CTLE (mV) | 91 | 113 |
| Power consumption without SA-latch (mW) | 8.1 | 3.8 |

Co-design cuts total power roughly in half by reducing TX/RX current, since the internal eyes no longer need to be open. The CTLE capability is fully utilized.
Notes:
1) Transmitter/receiver co-design increases the driver/termination resistance.
2) Internal eyes are closed, which fully utilizes the CTLE.
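The power saving and the Veye/Power cost function can be checked directly from the comparison figures. This is a minimal sketch using only the eye-opening and power values reported on the slide; the dictionary layout and names are illustrative.

```python
# Eye opening at the CTLE output and power (without SA-latch), as reported.
designs = {
    "A (separate design)": {"eye_mv": 91.0, "power_mw": 8.1},
    "B (co-design)": {"eye_mv": 113.0, "power_mw": 3.8},
}


def figure_of_merit(design):
    """Veye/Power cost function in mV per mW (higher is better)."""
    return design["eye_mv"] / design["power_mw"]


a = designs["A (separate design)"]
b = designs["B (co-design)"]

power_saving = 1.0 - b["power_mw"] / a["power_mw"]
eye_gain = b["eye_mv"] / a["eye_mv"] - 1.0
fom_ratio = figure_of_merit(b) / figure_of_merit(a)

print(f"power saving: {power_saving:.1%}")
print(f"eye-opening gain: {eye_gain:.1%}")
print(f"Veye/Power improvement: {fom_ratio:.2f}x")
```

The saving comes out at about 53%, consistent with the statement that roughly half the power is removed, and the eye at the CTLE output is larger for the co-designed link even though its internal eyes are closed.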

Summary of Performance
The split-supply design shows over 30% power reduction compared with the single-supply design, but it also introduces a 4% yield loss due to the additional supply.
Notes:
1) 30% less power consumed by the split-supply design
2) 4% drop in yield for the split-supply design

Remarks
Interconnect for 3D ICs
  Analysis
  TSVs
  Interposer
Eye Diagram vs Power Ground Noises

Interconnect Publications
On-chip T-lines
  Y. Zhang, X. Hu, A. Deutsch, A. E. Engin, J. F. Buckwalter, C. K. Cheng, "Prediction and Comparison of High-Performance On-Chip Global Interconnection", IEEE Trans. on VLSI Systems, accepted.
  Y. Zhang, X. Hu, A. Deutsch, A. E. Engin, J. F. Buckwalter, C. K. Cheng, "Prediction of High-Performance On-Chip Global Interconnection", SLIP 2009.
  Y. Zhang, J. F. Buckwalter, C. K. Cheng, "Energy Efficiency Optimization through Co-Design of the Transmitter and Receiver in High-Speed On-Chip Interconnects", IEEE Trans. on VLSI Systems, submitted.
  Y. Zhang, J. F. Buckwalter, C. K. Cheng, "High-Speed Low-Power On-Chip Global Link Design using Continuous-Time Linear Equalizer", EPEPS 2010.
  Y. Zhang, L. Zhang, A. Deutsch, G. A. Katopis, D. M. Dreps, E. S. Kuh, C. K. Cheng, "Design Methodology of High Performance On-Chip Global Interconnect Using Terminated Transmission-Line", ISQED 2009.
  Y. Zhang, L. Zhang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "On-Chip Bus Signaling Using Passive Compensation", EPEPS 2008.
  Y. Zhang, L. Zhang, A. Tsuchiya, M. Hashimoto, C. K. Cheng, "On-chip High Performance Signaling Using Passive Compensation", ICCD 2008.
  L. Zhang, Y. Zhang, A. Tsuchiya, M. Hashimoto, C. K. Cheng, "High Performance On-Chip Differential Signaling Using Passive Compensation for Global Communication", ASP-DAC 2009.
The PhD study focuses mainly on the design and optimization of global interconnect using on-chip T-lines, and has produced 2 journal papers and 6 conference papers.

Publications (cont’d)
Repeated RC wires
  Y. Zhang, J. F. Buckwalter, C. K. Cheng, "Performance Prediction of Throughput-Centric Pipelined Global Interconnects with Voltage Scaling", SLIP 2010.
  L. Zhang, Y. Zhang, H. Cheng, B. Yao, K. Hamilton, C. K. Cheng, "On-Chip Interconnect Analysis of Performance and Energy Metrics under Different Design Goals", IEEE Trans. on VLSI Systems, March 2011.
Other interconnects
  Y. Zhang, J. F. Buckwalter, C. K. Cheng, "On-Chip Global Clock Distribution using Directional Rotary Traveling-Wave Oscillator", EPEPS 2009.
  R. Wang, Y. Zhang, N. C. Chou, E. F. Y. Young, C. K. Cheng, R. Graham, "Bus Matrix Synthesis based on Steiner Graphs for Power Efficient System-on-Chip Communications", IEEE Trans. on CAD, Feb. 2011.
  L. Zhang, W. Yu, Y. Zhang, R. Wang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "Low Power Passive Equalization Design for Computer Memory Links", HOTI 2008.
  L. Zhang, W. Yu, Y. Zhang, R. Wang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "Analysis and Optimization of Low Power Passive Equalizers for CPU-Memory Links", IEEE Trans. on Advanced Packaging, accepted.
  X. Hu, W. Zhao, P. Du, Y. Zhang, A. Shayan, C. Pan, A. E. Engin, C. K. Cheng, "On the Bound of Time-Domain Power Supply Noise Based on Frequency-Domain Target Impedance", SLIP 2009.
Other interconnect-related projects include repeated RC wires, on-chip clock distribution, power and ground networks, and off-chip interconnect.