On-Chip Communication Architectures

On-Chip Communication Architectures
Physical Design Trends for Interconnects
ICS 295
Sudeep Pasricha and Nikil Dutt
Slides based on book chapter 11
© 2008 Sudeep Pasricha & Nikil Dutt

Outline
- Introduction
- DSM Interconnect Design
- Low Power High Speed Circuit Design Techniques
- Power Distribution Techniques
- Clock Distribution Techniques
- 3D Interconnects

Introduction
- Smaller feature sizes in deep-submicron (DSM) technology nodes reduce the delay of active devices
- However, the delay contributed by passive interconnects has increased rapidly
- Local wire delay decreases with feature size because the distance between active devices shrinks
- However, the overall speed of current ICs is most often limited by long-distance global interconnects

Introduction
- With shrinking feature size and larger die dimensions, the sheer number of interconnects has increased exponentially
- Interconnect capacitance often dominates the total gate load
  - therefore, a large portion of the total transient power is dissipated by these on-chip lines
  - this is particularly true for the long interconnects that distribute clock signals, which can dissipate 40–50% of total IC power
- Performance gains from technology scaling are often accompanied by an increase in power dissipation
  - e.g., additional interconnect layers enhance circuit speed at the expense of higher power consumption due to the larger interconnect capacitance

Introduction
- Traditional RC interconnect models are not accurate enough to capture signal propagation in lines with fast transition times
- Adding inductance (L) to the RC model has become a necessary modification
  - RLC models are therefore increasingly common, at the expense of greater computational cost
- Whether to include inductance in the interconnect model at midrange frequencies (~1–3 GHz) depends on several factors
  - e.g., the length and cross-sectional area of the line
- Under certain conditions, ignoring line inductance may lead to circuits that are highly inefficient in area and power

Introduction
- In addition to the interconnects among the various on-chip devices, the clock and power distribution networks require significant metal resources
  - both must span the entire chip
- Accurately modeling the clock, power, and signal nets is a difficult task
  - these are highly complex structures
- Optimally allocating metal to properly design these networks presents an even greater challenge
  - because of the complexity of developing interconnect models


DSM Interconnect Design
- Scaling of the lateral dimensions in planar devices, such as MOS transistors, has produced improvements in device
  - area
  - power
  - speed
- Power consumption and signal propagation delays of long (global) resistive lines have increased
- Accurate on-chip interconnect models are required to determine the signal characteristics and design requirements of high-speed DSM interconnect

DSM Interconnect Design
- A local line can be modeled as a single lumped capacitor
  - signal propagation delay is negligible compared to gate delay
- Models of longer lines must include the resistive effect
- Long interconnect models often divide the line into sections (distributed impedance model)
  - T or Π sections
  - accuracy depends on the number of sections
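
The dependence of accuracy on the number of sections can be illustrated with a small sketch. It assumes a ladder of equal series-R, shunt-C sections and uses the Elmore delay as the delay metric; neither choice comes from the slides, and the R and C values are hypothetical.

```python
def elmore_delay_ladder(r_total, c_total, n_sections):
    """Elmore delay at the far end of an RC ladder of n equal sections.

    Assumption: each section is a series resistor followed by a shunt
    capacitor; the driver and load are ignored.
    """
    r_seg = r_total / n_sections
    c_seg = c_total / n_sections
    # Capacitor i is charged through the i resistor segments upstream of it.
    return sum((i * r_seg) * c_seg for i in range(1, n_sections + 1))


R_TOTAL = 100.0   # ohms, hypothetical total line resistance
C_TOTAL = 1e-12   # farads, hypothetical total line capacitance

for n in (1, 2, 5, 20, 100):
    delay = elmore_delay_ladder(R_TOTAL, C_TOTAL, n)
    print(f"{n:4d} sections: {delay * 1e12:6.2f} ps")
```

With one section the result is the lumped value R·C; as the number of sections grows it approaches R·C/2, the value usually quoted for a distributed RC line.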

DSM Interconnect Design
- Long, wide interconnects operating at frequencies above 1 GHz are not accurately characterized by an RC model
- An RLC model is necessary for accuracy
  - more complex to evaluate
  - distributed RLC model (T or Π)

DSM Interconnect Design
- Length criteria for including inductance in the interconnect model
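
The criterion itself is not preserved in this transcript. As a hedged illustration only, the sketch below uses a two-sided length window that is commonly cited in the on-chip inductance literature, t_r / (2·sqrt(LC)) < l < (2/R)·sqrt(L/C), with R, L, and C per unit length and t_r the signal transition time. Treating this as the slide's criterion is an assumption, and all numeric values are hypothetical.

```python
import math


def inductance_length_window(r, l, c, t_rise):
    """Return (l_min, l_max) in meters, or None if the window is empty.

    r, l, c are per-unit-length resistance (ohm/m), inductance (H/m),
    and capacitance (F/m); t_rise is the transition time in seconds.
    Assumed criterion: t_rise / (2*sqrt(l*c)) < length < (2/r)*sqrt(l/c).
    """
    l_min = t_rise / (2.0 * math.sqrt(l * c))   # shorter lines: transition too slow to matter
    l_max = (2.0 / r) * math.sqrt(l / c)        # longer lines: resistance damps inductive effects
    return (l_min, l_max) if l_min < l_max else None


# Hypothetical wire: 10 ohm/mm, 0.5 nH/mm, 0.2 pF/mm
R_PER_M, L_PER_M, C_PER_M = 1e4, 5e-7, 2e-10

for t_rise in (50e-12, 250e-12):
    window = inductance_length_window(R_PER_M, L_PER_M, C_PER_M, t_rise)
    print(f"t_rise = {t_rise * 1e12:.0f} ps -> window = {window}")
```

Under this form, a faster transition lowers the lower bound, so shorter lines enter the range where inductance matters; a slow enough transition closes the window entirely.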

DSM Interconnect Design
- Frequency criteria for including inductance in interconnect models
  - increasing signal frequencies require faster signal transition times
  - a shorter transition time lowers the lower limit on line length, so shorter on-chip interconnects behave inductively
  - medium-length lines can also behave inductively at high signal frequencies
- Dielectric material criteria for including inductance in interconnect models
  - line capacitance can be reduced by half relative to SiO2 by using low-k dielectrics
  - copper interconnect has reduced line resistance by a factor of two to three compared to aluminum
  - these and other new materials further the need to include line inductance in interconnect models


Low Power High Speed Circuit Design Techniques
- High-speed circuit design goal: improve performance
- Power and noise are important when considering design techniques to optimize circuit performance
- Noise affects delay, degrades waveform shape, and, most importantly, creates the possibility of an erroneous interpretation of the digital signals

CMOS Power Dissipation Overview
- Dynamic power
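
The slide's expression for dynamic power is not preserved in this transcript. A minimal sketch using the standard first-order CMOS switching-power relation, P = α · C · V_DD² · f, is given here; the function name and the numeric values are illustrative assumptions.

```python
def dynamic_power(alpha, c_switched, v_dd, f_clk):
    """First-order CMOS dynamic (switching) power in watts.

    alpha      : activity factor, fraction of capacitance switched per cycle
    c_switched : total switched capacitance in farads
    v_dd       : supply voltage in volts
    f_clk      : clock frequency in hertz
    """
    return alpha * c_switched * v_dd ** 2 * f_clk


# Hypothetical block: 1 nF of switched capacitance, 10% activity, 1 V, 1 GHz
p_nominal = dynamic_power(0.1, 1e-9, 1.0, 1e9)
p_low_vdd = dynamic_power(0.1, 1e-9, 0.5, 1e9)
print(f"nominal: {p_nominal:.3f} W, half supply: {p_low_vdd:.3f} W")
```

Because the supply voltage enters quadratically, halving it cuts this term to one quarter, while interconnect capacitance contributes linearly through the switched-capacitance term.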

CMOS Power Dissipation Overview
- Short-circuit power
  - due to current flow when a conductive path exists from the power rail to the ground rail
- Leakage power
  - due to current that flows between the power terminals in the absence of any switching
  - increasingly dominates overall power

Wire Sizing
- The width of an interconnect affects its power characteristics and propagation delay
- Consider a CMOS inverter driving an RC interconnect line
- Simple first-order model of delay

Wire Sizing
- Increasing the driver transistor width reduces Rtr, decreasing the circuit delay
  - trades circuit power and area for higher speed
- Increasing the width of the interconnect to reduce Rint does not significantly reduce the delay caused by the RC interconnect impedance
  - the decrease in wire resistance is offset by the increase in wire capacitance
- Many algorithms have been proposed to determine the optimum wire size that minimizes a target cost function
  - minimizing delay
  - addressing reliability issues by reducing clock skew
- Few approaches address simultaneous driver and wire sizing
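
The offsetting effect of wire width can be sketched with a first-order Elmore-style model. The model form (driver term Rtr·(Cint + CL) plus wire term Rint·(Cint/2 + CL)), the area-plus-fringe capacitance split, and every parameter value below are assumptions for illustration; the slide's own delay expression is not preserved in the transcript.

```python
def wire_delay_terms(width_um, length_um=5000.0, r_tr=1000.0, c_load=10e-15,
                     r_sheet=0.05, c_area=0.04e-15, c_fringe=0.08e-15):
    """Return (driver_term, wire_term) in seconds for a given wire width.

    Hypothetical units: r_sheet in ohm/square, c_area in F/um^2,
    c_fringe in F/um, r_tr in ohms, c_load in farads.
    """
    r_int = r_sheet * length_um / width_um               # falls as 1/width
    c_int = (c_area * width_um + c_fringe) * length_um   # rises with width
    driver_term = r_tr * (c_int + c_load)
    wire_term = r_int * (0.5 * c_int + c_load)
    return driver_term, wire_term


for w in (1.0, 2.0, 4.0, 8.0):
    drv, wire = wire_delay_terms(w)
    print(f"w = {w:3.0f} um: driver term {drv * 1e12:7.1f} ps, "
          f"wire term {wire * 1e12:5.1f} ps")
```

In this sketch the wire term flattens toward a floor set by the area capacitance, while the driver term keeps growing with the added wire capacitance, which is why widening alone gives diminishing returns.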

Wire Sizing
- Tradeoffs exist between dynamic and short-circuit power
- As the line inductance-to-resistance ratio increases with wider lines, short-circuit power decreases due to the reduction in signal transition time
- For an RC line, short-circuit power remains approximately constant with increasing width (unlike the RLC line shown in the figure)
  - the decrease in interconnect resistance is offset by an increase in capacitance
- If the width of the interconnect exceeds a specific limit, short-circuit power increases
  - due to a change in the matching characteristics between driver and interconnect
- Dynamic power increases with width since line capacitance is greater

Driver Sizing
- Transistor sizing techniques aim to lower delay
- Wider transistors have a few advantages but also several disadvantages
  + produce more current
  + reduce charge time of load capacitance
  – have greater physical area
  – have larger gate capacitance
  – increased circuit area and power
- A careful balance of the current drive and output load is necessary to enhance circuit performance
- The power-optimal transistor size is smaller than the power-delay-optimal transistor size

Tapered Buffers
- An important example of transistor sizing is the tapered buffer
- Used to drive large capacitive loads
- Consists of a chain of inverters of gradually increasing size
- The ratio of the size of an inverter to the size of the preceding inverter is the tapering factor β
- The delay of a tapered buffer system is less than the delay of a single large inverter

Tapered Buffers
- Dependence of the tapered buffer propagation delay on the tapering factor
- For minimum delay, the optimal number of stages is Nopt = ln(M)
  - M = CL/Co is the ratio of the load capacitance to the input capacitance of the initial inverter in the chain
- Area, power, and delay are strongly dependent on β
- Trade-offs are possible by varying the tapering factor
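
The Nopt = ln(M) result can be checked numerically with a simple sketch. It assumes each stage has a delay proportional to its fan-out β and ignores the inverter's own output (self-loading) capacitance; with self-loading included, the best tapering factor is generally larger than e. The normalization tau0 is a hypothetical unit delay.

```python
import math


def tapered_buffer_delay(m_ratio, n_stages, tau0=1.0):
    """Delay of an n-stage tapered buffer driving a load M times the input cap.

    Assumption: uniform tapering factor beta = M**(1/n), and each stage
    contributes a delay of beta * tau0 (self-loading ignored).
    """
    beta = m_ratio ** (1.0 / n_stages)
    return n_stages * beta * tau0


M = 1000.0                              # hypothetical CL/Co ratio
n_continuous = math.log(M)              # Nopt = ln(M)
beta_at_optimum = M ** (1.0 / n_continuous)
best_integer_n = min(range(1, 21), key=lambda n: tapered_buffer_delay(M, n))

print(f"ln(M) = {n_continuous:.2f}, tapering factor there = {beta_at_optimum:.3f}")
print(f"best integer stage count = {best_integer_n}")
print(f"delay with 1 stage = {tapered_buffer_delay(M, 1):.1f}, "
      f"with {best_integer_n} stages = {tapered_buffer_delay(M, best_integer_n):.1f}")
```

Choosing fewer stages than the delay optimum (a larger β) is the usual way to trade a small delay penalty for less area and power.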

Repeater Insertion
- An effective strategy for reducing the delay of a long interconnect is to insert buffers (repeaters) along the line
- Repeaters circumvent the quadratic increase in interconnect delay with line length by partitioning the line into smaller, approximately equal sections
- The sum of the section delays is smaller than the delay of the original path, since the delay of each section is reduced
- The decreased interconnect delay is partially offset by the additional delay of the inserted repeaters

Repeater Insertion
- Optimal number of repeaters
- Optimal size of repeaters
  - Rt and Ct are the total interconnect resistance and capacitance
  - R0 and C0 are the input and output repeater resistance and capacitance
  - Cg0 is the input capacitance of the repeater
  - fitting parameters a1 and a2 account for the rise and fall time of the propagating signal
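
The slide's closed-form expressions are not preserved in this transcript. The sketch below is a simplified stand-in rather than the slide's formula: it uses a plain Elmore delay model with no fitting parameters and no repeater output capacitance. Here r0 and c0 denote the output resistance and input capacitance of a minimum-size repeater, which is a naming assumption made for this example, and all numeric values are hypothetical.

```python
import math


def repeated_line_delay(k, h, r_t, c_t, r0, c0):
    """Elmore delay of a line split into k sections, each driven by a
    repeater h times minimum size (output resistance r0/h, input cap h*c0)."""
    r_seg, c_seg = r_t / k, c_t / k
    per_section = (r0 / h) * (c_seg + h * c0) + r_seg * (0.5 * c_seg + h * c0)
    return k * per_section


def optimal_repeaters(r_t, c_t, r0, c0):
    """Minimizers of the model above, found by setting both partial
    derivatives of the total delay to zero."""
    k_opt = math.sqrt(r_t * c_t / (2.0 * r0 * c0))
    h_opt = math.sqrt(r0 * c_t / (r_t * c0))
    return k_opt, h_opt


# Hypothetical global wire and minimum-size repeater
R_T, C_T = 1000.0, 2e-12      # total wire resistance (ohm) and capacitance (F)
R0, C0 = 10e3, 1e-15          # repeater output resistance (ohm), input cap (F)

k_opt, h_opt = optimal_repeaters(R_T, C_T, R0, C0)
t_plain = repeated_line_delay(1, 1, R_T, C_T, R0, C0)
t_best = repeated_line_delay(k_opt, h_opt, R_T, C_T, R0, C0)
print(f"k_opt = {k_opt:.1f}, h_opt = {h_opt:.1f}")
print(f"single minimum driver: {t_plain * 1e9:.2f} ns, "
      f"optimally repeated: {t_best * 1e9:.3f} ns")
```

In practice the section count is rounded to an integer, and designs often use fewer or smaller repeaters than the delay optimum to save power and area.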

Repeater Insertion
- For an RC line, repeater insertion techniques outperform wire sizing
- In RLC lines, wire sizing outperforms repeater insertion
  - the minimum signal propagation delay with no repeaters is smaller


Power Distribution Techniques
- A power grid consists of a supply, a load, and the interconnect lines connecting the supply to the load
- The interconnect lines connecting the power supply to the load are non-ideal, with finite resistance and inductance: Rp, Lp for the power line and Rg, Lg for the ground line
- The change in supply voltage at the load terminals is referred to as power supply noise
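
A minimal sketch of how the line parasitics translate into power supply noise follows. It assumes the same load current flows through the power and ground lines, so the rail-to-rail drop is the resistive (IR) term plus the inductive (L·dI/dt) term; the numeric values are hypothetical.

```python
def supply_noise(i_load, di_dt, r_p, l_p, r_g, l_g):
    """Reduction in rail-to-rail voltage at the load, in volts.

    i_load : load current in amperes
    di_dt  : rate of change of load current in amperes per second
    r_p, l_p : resistance and inductance of the power line
    r_g, l_g : resistance and inductance of the ground line
    """
    ir_drop = i_load * (r_p + r_g)          # resistive component
    inductive_drop = (l_p + l_g) * di_dt    # inductive component
    return ir_drop + inductive_drop


# Hypothetical load: 1 A with a 1 A/ns current ramp, 10 mohm and 10 pH per rail
noise = supply_noise(i_load=1.0, di_dt=1e9,
                     r_p=0.01, l_p=10e-12, r_g=0.01, l_g=10e-12)
print(f"supply noise: {noise * 1e3:.1f} mV")
```

Faster current transients raise the inductive term even when the average current is unchanged, which is one reason decoupling capacitors are placed close to the load.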

Power Distribution Techniques
- Power supply noise can adversely affect circuit operation
- Power supply variations reduce the rail-to-rail supply voltage, causing the gate-to-source voltage across both NMOS and PMOS transistors to decrease as well
  - lowers the output drive current of these devices
  - signal delay increases compared to the delay under nominal power supply voltage
- Power noise affects propagating clock and data signals by increasing both delay and delay uncertainty within the data paths
  - can severely limit the maximum operating frequency of an IC

Power Distribution Techniques
- The power distribution network should exhibit minimal impedance at the terminals of the load to ensure a small variation in the power supply voltage
- Decoupling capacitors ensure correct and reliable operation
  - distributed across a system, placed at the board, package, and on-chip levels
  - provide charge when transient current demands on the power grid are high
  - each decoupling capacitor provides transient current to the load, effectively reducing the local transient noise

Power Distribution Techniques
- On-chip current densities can reach several hundred thousand amperes per square centimeter, leading to electromigration
- Electromigration is the transport of metal atoms under the force of an electron flux
  - depletion and accumulation of metal resulting from atomic flow can lead to the formation of extrusions and voids in metal structures
  - extrusions can lead to short circuits, and voids to open-circuit faults


Clock Distribution Techniques
- In a synchronous digital system, the clock signal provides a time reference for the movement of data within that system
- Clock signals typically
  - are loaded with the greatest fanout
  - travel over the longest distances
  - operate at the highest speeds of any signal, either control or data
- Differences in the delay of the clock signals can
  - severely limit the maximum performance of the entire system
  - create catastrophic race conditions in which an incorrect data signal may latch within a register

Clock Distribution Techniques
- Synchronous systems are composed of the following three delay components:
  - memory storage elements
  - logic elements
  - clocking circuitry and distribution networks
- The minimum allowable clock period TCP(min) between any two registers in a sequential data path is set by these delay components [equation not preserved in this transcript]


Clock Distribution Techniques
- Ideally, clocking events occur simultaneously at all registers
- The difference in clock signal arrival time between two sequentially adjacent registers is the clock skew
  - can be positive or negative
- System-wide or chip-wide clock skew between two non-sequentially-adjacent registers is, from an analysis viewpoint, meaningless
  - it has no effect on the performance and reliability of a synchronous system

Clock Distribution Techniques
- Under positive clock skew, the maximum attainable operating frequency is decreased
  - performance degradation
- Under negative clock skew, the maximum performance of a synchronous system can be improved by decreasing the delay of a critical path
- But there is also potential for a race condition
  - the clock skew must be less than the time required for the data signal to leave the initial register, propagate through the interconnect and combinatorial logic, and successfully set up in the final register
  - if this condition is not met, the data stored in register Rf is overwritten by data that had been stored in register Ri and has propagated through the combinatorial logic
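
The two skew effects can be sketched with textbook setup and hold style constraints. This is an assumption-laden illustration, not the slides' equation, which is not preserved in the transcript and may group the terms differently. Skew is taken as the clock arrival time at the initial register Ri minus that at the final register Rf, so positive skew lengthens the minimum period. All delay values are hypothetical, in picoseconds.

```python
def clock_skew(t_ci, t_cf):
    """Skew between sequentially adjacent registers (assumed sign convention)."""
    return t_ci - t_cf


def min_clock_period(t_cq, t_logic_max, t_int, t_setup, t_ci, t_cf):
    """Smallest period that lets the slowest path meet setup at Rf."""
    return t_cq + t_logic_max + t_int + t_setup + clock_skew(t_ci, t_cf)


def has_race(t_cq, t_logic_min, t_int, t_hold, t_ci, t_cf):
    """True if the fastest path can overwrite Rf before it captures old data."""
    fastest_path = t_cq + t_logic_min + t_int - t_hold
    return -clock_skew(t_ci, t_cf) > fastest_path


# Hypothetical register-to-register path (ps)
T_CQ, T_LOGIC_MAX, T_LOGIC_MIN, T_INT, T_SETUP, T_HOLD = 50, 700, 100, 50, 40, 30

for t_ci, t_cf in ((0, 0), (100, 0), (0, 100), (0, 200)):
    period = min_clock_period(T_CQ, T_LOGIC_MAX, T_INT, T_SETUP, t_ci, t_cf)
    race = has_race(T_CQ, T_LOGIC_MIN, T_INT, T_HOLD, t_ci, t_cf)
    print(f"skew = {clock_skew(t_ci, t_cf):5d} ps: "
          f"min period = {period} ps, race = {race}")
```

The negative-skew cases show the trade described above: the minimum period shrinks, but once the magnitude of the skew exceeds the fastest path delay a race appears.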

Clock Distribution Techniques
- The most common strategy for distributing on-chip clock signals is to insert buffers at the clock source and along the clock path, forming a tree structure
- Buffers, acting as repeaters, are usually distributed throughout the clock network
- Distributed buffers serve the double function of
  - amplifying clock signals degraded by distributed interconnect impedances
  - isolating the local clock nets from the upstream load impedances

Clock Distribution Techniques
- Another approach for distributing clock signals uses a hierarchy of planar symmetric H-tree or X-tree structures
  - ensures zero clock skew by maintaining identical distributed interconnect and buffer paths from the clock signal source to the clocked registers
  - conductor widths in H-tree structures are designed to progressively decrease as the signal propagates to lower levels of the hierarchy, ensuring that reflections are minimized at branch points
- Drawbacks compared to standard clock trees:
  - interconnect capacitance and power dissipation are much greater since the total wire length is much longer
  - difficult to implement in high-complexity integrated systems, which are typically irregular in nature
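
The reason for tapering H-tree conductors can be sketched with the standard transmission-line reflection coefficient. The sketch assumes ideal lossless lines and a symmetric two-way split; real on-chip lines are lossy, so this is only a first-order picture, and the impedance values are hypothetical.

```python
def branch_reflection(z_parent, z_child, n_branches=2):
    """Reflection coefficient seen by a wave arriving at a branch point.

    The parent line sees the child lines in parallel, so the effective
    load is z_child / n_branches.
    """
    z_load = z_child / n_branches
    return (z_load - z_parent) / (z_load + z_parent)


def matched_impedances(z_root, levels):
    """Per-level characteristic impedances that give zero reflection at
    every two-way split: each level doubles the impedance of its parent."""
    return [z_root * 2 ** level for level in range(levels)]


print("equal-width branches:", branch_reflection(50.0, 50.0))
print("matched branches:    ", branch_reflection(50.0, 100.0))
print("impedance per level: ", matched_impedances(25.0, 4))
```

Since a narrower conductor has a higher characteristic impedance, doubling the impedance at each level corresponds to progressively narrower wires toward the leaves of the tree.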


3D Interconnects
- 3-D interconnects have been proposed as a way to address increasing line delay and capacitive crosstalk
- Introducing a third dimension significantly alters the distribution of interconnect length in ICs
- As the number of planes increases, the length and number of global interconnects decrease, while those of local interconnects increase

3D Interconnects
- It is possible to partition a 2-D IC into multiple subsections and stack these sections in the vertical dimension
  - reduces corner-to-corner interconnect length significantly
  - several global interconnects in the upper metallization levels can be transferred to local, smaller-aspect-ratio metal layers
  - reduces the total number of metal levels within a 3-D circuit
  - an increase in clock frequency is possible because the worst-case data path delay can be reduced, assuming a constant number of registers along a sequential data path and a constant number of metal layers
  - consumes less power than a 2-D IC as a consequence of the reduced capacitive load of the global interconnect lines
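
The reduction in corner-to-corner length can be estimated with a simple geometric sketch. It assumes a square die split into equal square planes, Manhattan routing within a plane, and one vertical hop per plane crossing; the die area and the inter-plane via height are hypothetical values.

```python
import math


def corner_to_corner_mm(area_mm2, n_planes, via_height_mm=0.05):
    """Longest Manhattan path across a stack of n equal square planes.

    Assumption: total silicon area is conserved, so each plane has
    area_mm2 / n_planes, and the path crosses every plane boundary once.
    """
    side = math.sqrt(area_mm2 / n_planes)
    return 2.0 * side + (n_planes - 1) * via_height_mm


DIE_AREA_MM2 = 400.0   # hypothetical 20 mm x 20 mm 2-D die

for planes in (1, 2, 4, 8):
    length = corner_to_corner_mm(DIE_AREA_MM2, planes)
    print(f"{planes} plane(s): {length:.2f} mm")
```

Under these assumptions the in-plane length shrinks with the square root of the number of planes, and the added vertical distance is small by comparison.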

3D Interconnects
- A 3-D interconnect system requires inter-plane interconnects that connect signals between vertically stacked devices
- Inter-plane interconnects implemented as through-silicon vias (TSVs), or 3-D vias, can produce the shortest path within a 3-D system
  - compared to wire bonding, peripheral vertical interconnects, and solder ball arrays
- Much work is needed to properly characterize and model the inter-plane TSV
  - the primary technological innovation required to exploit the benefits of 3-D integration

Summary
- The complexity of properly designing interconnects in the DSM regime increases with each successive technology generation
  - the choice between RC and RLC models depends on several factors
- Low-power, high-speed circuit techniques are essential to extend battery lifetime and maintain ambient thermal levels
  - wire and driver sizing, as well as repeater insertion, are critical
- Clock and power distribution are important applications of the general interconnect design problem
  - noise and skew must be considered during design
- Novel techniques such as 3D interconnects can help alleviate challenges in emerging ICs
  - e.g., longer line lengths, greater line impedances, increased signal delays
