1
Interconnect Optimizations
2
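A scaling primer
Ideal process scaling:
– Device geometries shrink by $s$ ($s = 0.7\times$)
  Device delay shrinks by $s$
– Wire geometries shrink by $s$
  $R/\mu\text{m}$: $\rho/(ws \cdot hs) = r/s^2$
  $C_c/\mu\text{m}$: $(hs)\,\epsilon/(Ss) = C_c$
  $C/\mu\text{m}$: similar
  $R/\mu\text{m}$ doubles ($1/s^2 \approx 2$); $C/\mu\text{m}$ and $C_c/\mu\text{m}$ unchanged
[Figure: device (S, G, D) and wire with height h, width w, length l and spacing S, shown before and after scaling]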
3
Interconnect role
Short (local) interconnect
– Used to connect nearby cells
– Minimize wire C, i.e., use short min-width wires
Medium to long-distance (global) interconnect
– Size wires to trade off area vs. delay
– Increasing width: capacitance increases, resistance decreases
  Need to find an acceptable tradeoff: the wire sizing problem
"Fat" wires
– Thicker cross-sections in higher metal layers
– Useful for reducing delays for global wires
– Inductance issues, sharing of limited resource
4
Cross-Section of A Chip
5
Block scaling
Block area often stays the same
– Number of cells and number of nets double
– Wiring histogram shape invariant
Global interconnect lengths don't shrink
Local interconnect lengths shrink by $s$
6
Interconnect delay scaling
Delay of a wire of length $l$: $\tau_{int} = (rl)(cl) = rcl^2$ (first order)
Local interconnects: $\tau_{int} = (r/s^2)(c)(ls)^2 = rcl^2$
– Local interconnect delay unchanged (compare to faster devices)
Global interconnects: $\tau_{int} = (r/s^2)(c)(l)^2 = rcl^2/s^2$
– Global interconnect delay doubles: unsustainable!
Interconnect delay increasingly dominant
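The scaling argument above can be checked numerically. The following is a minimal sketch, assuming normalized units (r = c = l = 1 before scaling) and s = 0.7; the function names are illustrative and not from the slides.

```python
def wire_delay(r, c, length):
    """First-order wire delay: (r*l) * (c*l) = r*c*l^2."""
    return r * c * length ** 2


def scaled_delays(r, c, length, s=0.7):
    """Delay of a local and a global wire after one ideal scaling step.

    Resistance per unit length grows by 1/s^2; capacitance per unit
    length is unchanged. Local wires shrink by s, global wires do not.
    """
    r_scaled = r / s ** 2
    local = wire_delay(r_scaled, c, length * s)
    global_ = wire_delay(r_scaled, c, length)
    return local, global_


base = wire_delay(1.0, 1.0, 1.0)
local, global_ = scaled_delays(1.0, 1.0, 1.0)
print("local ratio:", local / base)
print("global ratio:", global_ / base)
```

The local ratio stays at 1 while the global ratio is 1/s^2, roughly 2 for s = 0.7.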
7
Buffer Insertion For Delay Reduction
8
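Analysis of Simple RC Circuit
[Figure: RC circuit with source $v_T(t)$ (input waveform), series resistance $R$, current $i(t)$, and capacitance $C$ whose voltage $v(t)$ is the state variable]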
9
Analysis of Simple RC Circuit
Step-input response:
– Match initial state
– Output response for step input $v_0 u(t)$: $v(t) = v_0 (1 - e^{-t/RC}) u(t)$
[Figure: step input of height $v_0$ and the resulting output waveform]
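The step response quoted above follows from the circuit equation for a series RC circuit. A brief derivation sketch, assuming the capacitor starts discharged ($v(0) = 0$), which the slide text does not state explicitly:

```latex
\begin{align*}
i(t) &= C \frac{dv(t)}{dt}, \qquad v_T(t) = R\, i(t) + v(t) \\
\Rightarrow\quad RC \frac{dv(t)}{dt} + v(t) &= v_T(t) \\
\text{For } v_T(t) = v_0 u(t),\ v(0) = 0:\quad
v(t) &= v_0 \left(1 - e^{-t/RC}\right) u(t)
\end{align*}
```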
10
Delays of Simple RC Circuit
$v(t) = v_0 (1 - e^{-t/RC})$: waveform under step input $v_0 u(t)$
$v(t) = 0.5 v_0 \Rightarrow t = 0.69RC$
– i.e., delay = $0.69RC$ (50% delay)
$v(t) = 0.1 v_0 \Rightarrow t = 0.1RC$
$v(t) = 0.9 v_0 \Rightarrow t = 2.3RC$
– i.e., rise time = $2.2RC$ (if defined as time from 10% to 90% of Vdd)
Commonly used metric: $T_D = RC$ (= Elmore delay)
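The threshold-crossing times above come from inverting the step response: $t = -RC \ln(1 - f)$ for a fraction $f$ of $v_0$. A small sketch, with RC normalized to 1 and an illustrative function name:

```python
import math


def time_to_fraction(fraction, rc=1.0):
    """Time for v(t) = v0 * (1 - exp(-t/RC)) to reach fraction * v0."""
    return -rc * math.log(1.0 - fraction)


t50 = time_to_fraction(0.5)   # 50% delay, about 0.69 RC
t10 = time_to_fraction(0.1)   # about 0.1 RC
t90 = time_to_fraction(0.9)   # about 2.3 RC
rise = t90 - t10              # 10%-90% rise time, about 2.2 RC
print(t50, t10, t90, rise)
```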
11
Elmore Delay
[Figure: Delay]
12
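Elmore Delay
Driver is modeled as R
Driver intrinsic gate delay t(B)
Delay = $\sum_{\text{all } R_i} \; \sum_{\text{all } C_j \text{ downstream from } R_i} R_i C_j$
Elmore delay at n2: R(B)*(C1+C2) + R(w)*C2
Elmore delay at n1: R(B)*(C1+C2)
[Figure: driver B with resistance R(B) driving node n1 (capacitance C1), followed by wire resistance R(w) to node n2 (capacitance C2)]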
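The double sum above can be sketched for the special case of an RC chain (no branching), which covers the two-node example on this slide. The function name and the numeric values are illustrative assumptions, and the intrinsic delay t(B) is left out.

```python
def elmore_chain(resistances, capacitances):
    """Elmore delay at every node of an RC chain.

    resistances[i] connects node i-1 (or the source, for i = 0) to node i;
    capacitances[i] is the grounded capacitance at node i.
    Each resistance contributes R_i times all capacitance downstream of it.
    """
    delays = []
    delay = 0.0
    for i, r in enumerate(resistances):
        downstream = sum(capacitances[i:])
        delay += r * downstream
        delays.append(delay)
    return delays


# Hypothetical values: R(B) = 1, C1 = 2, R(w) = 3, C2 = 4
d_n1, d_n2 = elmore_chain([1.0, 3.0], [2.0, 4.0])
print(d_n1, d_n2)  # 6.0 18.0
```

Here d_n1 = R(B)*(C1+C2) and d_n2 = R(B)*(C1+C2) + R(w)*C2, matching the two expressions on the slide.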
13
Elmore Delay
For uniform wire
– No matter how the wire is lumped, the Elmore delay is the same
[Figure: wire of length x driving load C, with unit wire capacitance c and unit wire resistance r]
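The lumping-independence claim can be checked with a short sketch. It assumes each wire segment is modeled as a π section (series resistance with half the segment capacitance at each end); the slide does not say which lumped model it uses, so that choice is an assumption. Under it, the far-end Elmore delay equals $rx(cx/2 + C)$ for any number of segments, the same wire term that appears in the unbuffered delay formula on the "Buffers Reduce Wire Delay" slide.

```python
def pi_lumped_elmore(r, c, x, load, n):
    """Far-end Elmore delay of a uniform wire split into n pi sections.

    r, c: resistance and capacitance per unit length; x: wire length;
    load: capacitance at the far end. Driver resistance is not included.
    """
    seg_r = r * x / n
    seg_c = c * x / n
    delay = 0.0
    for k in range(1, n + 1):
        # Downstream of segment k: its own far-end half capacitance,
        # all later segments, and the load.
        downstream = seg_c / 2 + (n - k) * seg_c + load
        delay += seg_r * downstream
    return delay


def closed_form(r, c, x, load):
    """Distributed-wire Elmore delay: r*x*(c*x/2 + C)."""
    return r * x * (c * x / 2 + load)


# Hypothetical values in arbitrary consistent units
for n in (1, 2, 10, 100):
    print(n, pi_lumped_elmore(0.1, 0.2, 100.0, 5.0, n))
```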
14
Delay for Buffer
– Intrinsic buffer delay
– Driver resistance
– Input capacitance
[Figure: buffer between nodes u and v, with input capacitance C(b) and load C]
15
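Buffers Reduce Wire Delay
$t_{unbuf} = R(cx + C) + rx(cx/2 + C)$
$t_{buf} = 2R(cx/2 + C) + rx(cx/4 + C) + t_b$
$t_{buf} - t_{unbuf} = RC + t_b - rcx^2/4$
[Figure: driver with resistance R driving a wire of length x into load C, unbuffered and with a buffer at x/2 (each half modeled with resistance rx/2 and capacitance cx/4 at each end); plot of $\Delta t$ against x]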
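Setting $t_{buf} - t_{unbuf} = 0$ in the expression above gives the wire length beyond which a mid-point buffer pays off: $x = 2\sqrt{(RC + t_b)/(rc)}$. A minimal sketch of the three formulas, using hypothetical parameter values in arbitrary consistent units:

```python
import math


def t_unbuf(R, C, r, c, x):
    """Unbuffered delay: driver R, wire of length x, load C."""
    return R * (c * x + C) + r * x * (c * x / 2 + C)


def t_buf(R, C, r, c, x, tb):
    """Delay with one buffer at x/2 (buffer resistance R, input cap C,
    intrinsic delay tb), as on the slide."""
    return 2 * R * (c * x / 2 + C) + r * x * (c * x / 4 + C) + tb


def break_even_length(R, C, r, c, tb):
    """Length at which t_buf equals t_unbuf."""
    return 2 * math.sqrt((R * C + tb) / (r * c))


# Hypothetical values, chosen only for illustration
R, C, r, c, tb = 100.0, 0.01, 0.1, 0.0002, 3.0
x0 = break_even_length(R, C, r, c, tb)
print("break-even length:", x0)
print("gain at 2*x0:", t_unbuf(R, C, r, c, 2 * x0) - t_buf(R, C, r, c, 2 * x0, tb))
```

For wires shorter than the break-even length the buffer adds delay; for longer wires the saving grows quadratically with x.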
16
Combinational Logic Delay
Combinational logic delay <= clock period
[Figure: primary input, register, combinational logic, register, primary output; registers driven by clock]
17
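Buffered global interconnects: Intuition
Interconnect delay = $r \cdot c \cdot l^2$
Now, interconnect delay = $\sum_j r \cdot c \cdot l_j^2 < r \cdot c \cdot l^2$ (where $l = \sum_j l_j$), since $\sum_j l_j^2 < \left(\sum_j l_j\right)^2$
(Of course, account for buffer delay also)
[Figure: wire of length l split by buffers into segments $l_1, l_2, l_3, \ldots, l_n$]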
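The inequality above is easy to see with numbers. A small sketch, ignoring buffer delay as the intuition does; the function names and segment lengths are illustrative:

```python
def unbuffered_delay(r, c, lengths):
    """r*c*l^2 for the whole wire, with l the sum of the segment lengths."""
    total = sum(lengths)
    return r * c * total ** 2


def buffered_wire_delay(r, c, lengths):
    """Sum of r*c*l_j^2 over the segments between buffers."""
    return sum(r * c * l ** 2 for l in lengths)


segments = [1, 2, 3, 4]
print(unbuffered_delay(1.0, 1.0, segments))     # 100.0
print(buffered_wire_delay(1.0, 1.0, segments))  # 30.0
```

For a fixed total length and segment count, equal segments minimize the sum of squares: four segments of 2.5 give 25 instead of 30.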
18
Optimal inter-buffer length
First order (lumped parasitic, Elmore delay) analysis
Assume N identical buffers with equal inter-buffer length $l$ ($L = Nl$)
For minimum delay: [equation: optimal inter-buffer length $l_{opt}$]
$R_d$: on resistance of inverter
$C_g$: gate input capacitance
$r, c$: resistance, capacitance per micron
[Figure: wire of total length L with buffers spaced l apart]
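One common way to carry out this first-order analysis is sketched below. It is an assumption about the model, not a recovery of the slide's own equation: each stage is a buffer of on-resistance $R_d$ driving a wire of length $l$ and the next gate's input capacitance $C_g$, with intrinsic and output parasitic delay of the buffer ignored.

```latex
\begin{align*}
T &= N \left[ R_d (c l + C_g) + r l \left( \tfrac{c l}{2} + C_g \right) \right],
  \qquad N = L / l \\
  &= L \left[ \frac{R_d C_g}{l} + (R_d c + r C_g) + \frac{r c\, l}{2} \right] \\
\frac{dT}{dl} &= L \left[ -\frac{R_d C_g}{l^2} + \frac{r c}{2} \right] = 0
  \quad\Rightarrow\quad l_{opt} = \sqrt{\frac{2 R_d C_g}{r c}}
\end{align*}
```

Under this model $l_{opt}$ depends only on the buffer and wire parameters, not on the total length $L$.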
19
Optimal interconnect delay
Substituting $l_{opt}$ back into the interconnect delay expression: [equation]
Delay grows linearly with $L$ (instead of quadratically)
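The linear growth can be illustrated numerically. This sketch assumes a simple stage model (buffer on-resistance Rd driving a wire segment of length l plus the next gate capacitance Cg, buffer intrinsic delay ignored) and treats the number of stages as continuous; the slide's own expression was not recovered, and all names and values here are hypothetical.

```python
import math


def buffered_delay(L, l, Rd, Cg, r, c):
    """Elmore delay of a wire of length L buffered every l units."""
    n_stages = L / l  # treated as continuous for this first-order sketch
    stage = Rd * (c * l + Cg) + r * l * (c * l / 2 + Cg)
    return n_stages * stage


def optimal_length(Rd, Cg, r, c):
    """Inter-buffer length minimizing buffered_delay under this model."""
    return math.sqrt(2 * Rd * Cg / (r * c))


def optimal_delay(L, Rd, Cg, r, c):
    return buffered_delay(L, optimal_length(Rd, Cg, r, c), Rd, Cg, r, c)


# Hypothetical parameters in arbitrary consistent units
Rd, Cg, r, c = 1000.0, 1e-3, 0.1, 2e-4
for L in (1000.0, 2000.0, 4000.0):
    unbuffered = Rd * (c * L + Cg) + r * L * (c * L / 2 + Cg)
    print(L, optimal_delay(L, Rd, Cg, r, c), unbuffered)
```

Doubling L doubles the optimally buffered delay, while the unbuffered delay picks up a term that quadruples.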
20
Optimized interconnect delay scaling
Rewriting the optimal interconnect delay expression: [equation]
With optimally sized buffers (using $dT/dh = 0$): [equation]
21
Total buffer count
Ever-increasing fractions of total cell count will be buffers
– 70% in 32nm
[Figure: % cells used to buffer nets (axis 0 to 80) at 90nm, 65nm, 45nm and 32nm; series: clk-buf, buf, tot-buf]
22
ITRS projections