
Research on Interconnect



Presentation on theme: "Research on Interconnect"— Presentation transcript:

1 Research on Interconnect
Chung-Kuan Cheng University of California, San Diego

2 Research on Interconnect
Physical Layout
3D Floorplanning, Placement, Bus Topology
Whole Chip Analysis
Interconnect Dominance
Parallel SPICE (1): circuit theory, numerical methods, implementation
Signal Interconnects
Power, Throughput, Latency
EM Waves, e.g. T Line (2): eye diagram analysis
Power Ground Distribution Networks
Stimuli, Networks, Noise (3)
3D ICs (4): worst case analysis

3 Trends of Scaling (Moore’s Law)
Expansion of applications: AI, bioinformatics, graphics, vision
Explosion of communication: internet
Distributed systems: exascale computation
Power-constrained designs: low power
Interconnect dominance: VLSI
Nano-devices with variations: fault-tolerant design, design for manufacturability
Application, System, Technology

4 Circuit Simulation: Motivation
Technology Scaling
Challenges for Interconnect Dominance
Complexity
Signal and Power Integrity: crosstalk, voltage drop, coupling noise, etc.
High clock frequency: inductance effects
Smaller transistors: complicated models

5 Parallel SPICE Transient Analysis for Whole System
Acknowledgement: NSF
Researchers: F.J. Liu, X.D. Yang, Z. Qin, Z. Zhu, R. Shi, H. Peng, S.H. Weng, Q. Chen, Y. Shen

6 SPICE Research Outline
Cluster Machines Netlist Partitioning Whole Chip Analysis SPICE Accuracy

7 Simulation: Goal
Analyze the whole chip with 100x memory capacity, 100x speedup, and 100x efficiency for designers
Set a standard of input/output for parallel processing
Allow cluster machines or cloud computing for acceleration
Demonstrate the results via power ground analysis and terahertz circuitry

8 Parallel Device Loading (continued)
PU: Processing Unit
[Figure: the original circuit is split at an interface into nonlinear and linear sub-circuits. Nonlinear sub-circuits are assigned to PU1, PU2, ..., PUk-1, each performing device loading and using a direct solver (KLU). Linear sub-circuits are assigned to PUK, PUK+j1, ..., PUK+jm, each using parallel AMG. Each sub-circuit is represented by an equivalent circuit, and the two sides are coupled through a linear-nonlinear iteration.]

9 Action Items
Input/Output Parsing and Screening
Adaptive Time Step Control
Parallel Input
Parallelization
Netlist Transformation
Overall Framework
Parallel Output
Implement in C/C++
Device Evaluation
Which math library?
Transistor Model
Sparse matrix library?
Integration
Parallel Implementation
Stability
Stiffness Handling
Sensitivity Calculation

10 Remarks
Scalable Parallel Processing: Integration, Matrix Operations
Applications: Power Ground Network Analysis, Substrate Noises, Memory Analysis, Terahertz Circuit Simulation

11 Signal Interconnect
Introduction
Contributions
On-Chip T Lines
Conclusion
11/12/2018

12 Signal Interconnect
Goals: Power, Throughput, Latency
Technology: Scaling Effect, On-Chip Interconnect, Chip to Chip, 3D ICs

13 Signal Interconnect: Scaling and Design Styles
On-chip global wires have become a barrier to achieving:
High performance: 542ps (1mm wire) vs. 161ps (10 FO4 inverter) [ITRS 2008]
Low power: global wires contribute 50% of dynamic power [Magen 2004]
Various interconnect schemes have been proposed:
RC wires
On-chip T-lines
Transceiver design, equalization, etc.
Design criteria: minimum latency [Zhang 2009]

14 Contributions Distortion: Time Domain vs Frequency Domain Analysis
Eye Diagram Analysis
Passive and Active Equalizer
Distortionless T Lines
Rotary Clocks

15 On-Chip T Line: Active Equalizer
Contributions
On-chip T-line interconnect utilizing concepts borrowed from off-chip links
Performance analysis for the whole system
A framework to improve energy efficiency
Results of our design
20Gbps signaling over a 10mm, 2.6um-pitch on-chip T-line
15.5ps/mm latency and 0.2pJ/b energy per bit in 45nm CMOS
We propose a novel equalized global interconnect scheme, analyze it to provide design guidelines, and optimize it with a transmitter-receiver co-design framework.
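The headline figures on this slide can be combined into link-level numbers with simple arithmetic. The sketch below is illustrative only: the function name `link_metrics` and its dictionary keys are hypothetical, and it assumes latency scales linearly with wire length and that the quoted energy per bit applies at the full 20Gbps rate.

```python
def link_metrics(length_mm, latency_ps_per_mm, data_rate_gbps, energy_pj_per_bit):
    """Derive link-level figures from per-mm latency and per-bit energy.

    Assumes latency is linear in length (hypothetical simplification).
    """
    latency_ps = length_mm * latency_ps_per_mm       # end-to-end wire latency
    bit_time_ps = 1000.0 / data_rate_gbps            # one unit interval in ps
    bits_in_flight = latency_ps / bit_time_ps        # bits on the wire at once
    power_mw = data_rate_gbps * energy_pj_per_bit    # (Gb/s) * (pJ/b) = mW
    return {
        "latency_ps": latency_ps,
        "bit_time_ps": bit_time_ps,
        "bits_in_flight": bits_in_flight,
        "power_mw": power_mw,
    }


# Values quoted on the slide: 10mm, 15.5ps/mm, 20Gbps, 0.2pJ/b.
metrics = link_metrics(10, 15.5, 20, 0.2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With the slide's values this gives about 155ps of end-to-end latency, a 50ps bit time, roughly three bits in flight on the line, and about 4mW of link power.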

16 Equalized On-Chip Global Interconnect
Architecture of the proposed equalized on-chip global interconnect
Overall structure:
Tapered current-mode logic (CML) transmitter
Terminated differential on-chip T-line
Continuous-time linear equalizer (CTLE) receiver
Sense-amplifier based latch (SA-latch)
The proposed equalized interconnect is composed of a CML transmitter, a differential on-chip T-line, a CTLE receiver, and a synchronous SA-latch.

17 Transmitter/Receiver Co-Optimization Flow
Initial solution: pre-designed CML transmitter and pre-designed CTLE receiver
Co-design variables: [ISS, RT, RL, RD, CD, Vod]
Cost function: Veye/Power, or lowest power with Veye held constant
Co-design cost function estimation:
SPICE-generated T-line step response
Receiver step response using CTLE modeling
Step-response based eye estimation
Internal SQP (Sequential Quadratic Programming) routine to generate the best solution
Output: best set of design variables in terms of the defined cost function
The co-optimization flow takes the pre-designed transmitter/receiver as the initial solution, estimates the cost function for each variable set, permutes the variables in the SQP engine, and finally outputs the best design.
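One common way to estimate an eye opening from a step response is peak-distortion analysis: form the single-bit pulse response from two shifted step responses, sample it at bit-time intervals, and subtract the sum of the absolute inter-symbol-interference samples from the main cursor. The sketch below illustrates that general technique and is not confirmed to be the estimator used in this flow. The function names are hypothetical, and a first-order RC step response with an assumed 20ps time constant stands in for the SPICE-generated T-line and CTLE responses.

```python
import math


def step_response_rc(t, tau):
    """Unit step response of a first-order RC low-pass (stand-in channel)."""
    return 1.0 - math.exp(-t / tau) if t > 0 else 0.0


def worst_case_eye(step, bit_time, sample_time, n_cursors=50):
    """Peak-distortion estimate of vertical eye opening for unit-swing NRZ.

    step        -- callable giving the step response at time t
    bit_time    -- unit interval
    sample_time -- sampling instant of the main cursor
    n_cursors   -- pre/post cursors included in the ISI sum
    """
    def pulse(t):
        # Response to a single isolated bit: step minus delayed step.
        return step(t) - step(t - bit_time)

    main = pulse(sample_time)
    isi = sum(
        abs(pulse(sample_time + k * bit_time))
        for k in range(-n_cursors, n_cursors + 1)
        if k != 0
    )
    return main - isi


BIT_TIME_PS = 50.0   # 20Gbps signaling
TAU_PS = 20.0        # assumed channel time constant, for illustration only

eye = worst_case_eye(
    lambda t: step_response_rc(t, TAU_PS),
    bit_time=BIT_TIME_PS,
    sample_time=BIT_TIME_PS,  # sample at the end of the bit
)
print(f"worst-case eye opening: {eye:.3f} of full swing")
```

For this stand-in channel the result can be checked by hand: with a = exp(-T/tau), the main cursor is 1 - a and the post-cursors sum to a, so the eye is 1 - 2a. An optimizer such as SQP could then use an estimate of this kind as the Veye term of the cost function.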

18 Simulated Eye Diagrams
Methodology A: transmitter/receiver separate design
Methodology B: transmitter/receiver co-design with power-efficiency optimization
The energy-efficiency co-design yields similar receiver eye quality but does not guarantee an open eye at the internal nodes of the interconnect, such as the transmitter or T-line output.

19 Summary of Performance Comparison
Parameter                                Methodology A             Methodology B
                                         (TX/RX separate design)   (TX/RX co-design)
RS/ohm                                   47                        148
RT/ohm                                   94                        1100
RL/ohm                                   440                       890
RD/ohm                                   110                       1430
CD/fF                                    680                       150
Vod/mV                                   60                        58
(unlabeled)                              91                        113
Power Consumption (w/o SA-latch)/mW      8.1                       3.8
Co-design cuts total power roughly in half by reducing TX/RX current, since the internal eyes no longer need to be open. CTLE capability is maximally utilized.
Note: 1) Transmitter/receiver co-design increases driver/termination resistance. 2) Internal eyes are closed, fully utilizing the CTLE.
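The "half total power" claim can be checked directly from the two power figures in the comparison. The short sketch below does that arithmetic and also converts each power to energy per bit. The helper names are hypothetical, and the 20Gbps data rate is taken from the earlier results slide on the assumption that both methodologies were evaluated at that rate.

```python
def power_reduction(before_mw, after_mw):
    """Fractional power saving going from `before_mw` to `after_mw`."""
    return (before_mw - after_mw) / before_mw


def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ: mW / (Gb/s) = pJ/b."""
    return power_mw / data_rate_gbps


POWER_A_MW = 8.1      # Methodology A, without SA-latch
POWER_B_MW = 3.8      # Methodology B, without SA-latch
DATA_RATE_GBPS = 20   # assumed to apply to both designs

saving = power_reduction(POWER_A_MW, POWER_B_MW)
energy_a = energy_per_bit_pj(POWER_A_MW, DATA_RATE_GBPS)
energy_b = energy_per_bit_pj(POWER_B_MW, DATA_RATE_GBPS)

print(f"power saving: {saving:.1%}")
print(f"energy per bit, A: {energy_a:.3f} pJ/b")
print(f"energy per bit, B: {energy_b:.3f} pJ/b")
```

Under these assumptions the saving is about 53%, and Methodology B works out to roughly 0.19pJ/b, in line with the 0.2pJ/b figure quoted for the design.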

20 Summary of Performance
The split-supply design shows over 30% power reduction compared with the single-supply design but also introduces a 4% yield loss due to the additional supply.
Note: 1) 30% less power is consumed by the split-supply design. 2) Yield drops by 4% for the split-supply design.

21 Remarks
Interconnect for 3D ICs
Analysis
TSVs
Interposer
Eye Diagram vs Power Ground Noises

22 Interconnect Publications
On-chip T-lines
Y. Zhang, X. Hu, A. Deutsch, A. E. Engin, J. F. Buckwalter, C. K. Cheng, "Prediction and Comparison of High-Performance On-Chip Global Interconnection", IEEE Trans. on VLSI Systems, accepted.
Y. Zhang, X. Hu, A. Deutsch, A. E. Engin, J. F. Buckwalter, C. K. Cheng, "Prediction of High-Performance On-Chip Global Interconnection", SLIP 2009.
Y. Zhang, J. F. Buckwalter, C. K. Cheng, "Energy Efficiency Optimization through Co-Design of the Transmitter and Receiver in High-Speed On-Chip Interconnects", IEEE Trans. on VLSI Systems, submitted.
Y. Zhang, J. F. Buckwalter, C. K. Cheng, "High-Speed Low-Power On-Chip Global Link Design using Continuous-Time Linear Equalizer", EPEPS 2010.
Y. Zhang, L. Zhang, A. Deutsch, G. A. Katopis, D. M. Dreps, E. S. Kuh, C. K. Cheng, "Design Methodology of High Performance On-Chip Global Interconnect Using Terminated Transmission-Line", ISQED 2009.
Y. Zhang, L. Zhang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "On-Chip Bus Signaling Using Passive Compensation", EPEPS 2008.
Y. Zhang, L. Zhang, A. Tsuchiya, M. Hashimoto, C. K. Cheng, "On-chip High Performance Signaling Using Passive Compensation", ICCD 2008.
L. Zhang, Y. Zhang, A. Tsuchiya, M. Hashimoto, C. K. Cheng, "High Performance On-Chip Differential Signaling Using Passive Compensation for Global Communication", ASP-DAC 2009.
The PhD study focuses mainly on the design and optimization of global interconnection using on-chip T-lines, which produced 2 journal papers and 6 conference papers.

23 Publications (cont’d)
Repeated RC wires
Y. Zhang, J. F. Buckwalter, C. K. Cheng, "Performance Prediction of Throughput-Centric Pipelined Global Interconnects with Voltage Scaling", SLIP 2010.
L. Zhang, Y. Zhang, H. Cheng, B. Yao, K. Hamilton, C. K. Cheng, "On-Chip Interconnect Analysis of Performance and Energy Metrics under Different Design Goals", IEEE Trans. on VLSI Systems, March 2011.
Other interconnects
Y. Zhang, J. F. Buckwalter, C. K. Cheng, "On-Chip Global Clock Distribution using Directional Rotary Traveling-Wave Oscillator", EPEPS 2009.
R. Wang, Y. Zhang, N. C. Chou, E. F. Y. Young, C. K. Cheng, R. Graham, "Bus Matrix Synthesis based on Steiner Graphs for Power Efficient System-on-Chip Communications", IEEE Trans. on CAD, Feb. 2011.
L. Zhang, W. Yu, Y. Zhang, R. Wang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "Low Power Passive Equalization Design for Computer Memory Links", HOTI 2008.
L. Zhang, W. Yu, Y. Zhang, R. Wang, A. Deutsch, G. A. Katopis, D. M. Dreps, J. F. Buckwalter, E. S. Kuh, C. K. Cheng, "Analysis and Optimization of Low Power Passive Equalizers for CPU-Memory Links", IEEE Trans. on Advanced Packaging, accepted.
X. Hu, W. Zhao, P. Du, Y. Zhang, A. Shayan, C. Pan, A. E. Engin, C. K. Cheng, "On the Bound of Time-Domain Power Supply Noise Based on Frequency-Domain Target Impedance", SLIP 2009.
The study also covered other interconnect-related projects, such as repeated RC wires, on-chip clock distribution, power and ground networks, and off-chip interconnect.

