1 Modeling and Optimization of VLSI Interconnect 049031 Lecture 1: Introduction Avinoam Kolodny Konstantin Moiseev
2 Why bother about interconnects? Chip performance and power depend on interconnect. VLSI technology has changed: “device-dominant” → “interconnect-dominant”. Interconnect has changed: “parasitic effect” → “primary effect”. What are the reasons for the changes? Growth in system complexity; poor scaling of interconnects.
3 Process Technology Scaling: Minimum Feature Size. [Plot: minimum feature size in microns (log scale, 0.01–10) vs. year (’68–’14), Intel data and SIA roadmap; process nodes 180nm, 130nm, 90nm, 65nm, 45nm, 32nm, 22nm, 14nm.] Source: Intel, SIA Technology Roadmap (SIA: Semiconductor Industry Association)
4 Scaling: Moore’s Law Source: Gordon E. Moore, Cramming more components onto integrated circuits, Electronics, April 19, 1965, p.114 1965 version 1979 version
5 Processor Chips
6 Organization of Chips Blocks (cells) Ports (pins) Nets (nodes) Local wires Global wires
7 Keeping up with Moore’s Law: Rent’s Rule. Abstraction, Hierarchy, Regularity, Design Methodology. [Plot: T, terminals per module (10–1000), vs. N, transistors in module (10^0–10^4), on log-log axes; the slope is Rent’s exponent < 1.]
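Rent’s rule states that the number of external terminals T of a module grows as a power of the number of devices N inside it, T = k · N^p, with Rent exponent p < 1. A minimal sketch of the relation, using illustrative values k = 2.5 and p = 0.6 that are assumptions rather than numbers from the slide:

```python
# Rent's rule: T = k * N^p, with Rent exponent p < 1.
# k and p below are illustrative assumptions (typical of random logic),
# not values taken from the lecture slide.

def rent_terminals(n_transistors: int, k: float = 2.5, p: float = 0.6) -> float:
    """Estimate the number of terminals (pins) of a module with n_transistors inside."""
    return k * n_transistors ** p

for n in (10, 100, 1_000, 10_000):
    print(f"N = {n:6d}  ->  T ~ {rent_terminals(n):7.1f}")
```

Because p < 1, terminal count grows more slowly than gate count, which is what allows hierarchy and abstraction to keep wiring manageable as complexity grows.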
8 Interconnect Length Distribution Source: Shekhar Y. Borkar, Nir Magen - Intel
9 Connectivity and Complexity
10 3-dimensional view of interconnect
11 History 1: Single metal layer. Intel 8088 processor. When the number of interconnect layers was limited, designers conceived systolic arrays. Source: http://www.microscopy.fsu.edu/chipshots/intel
12 History 2: Two layers of metal
13 Interconnect Stack
14 Growing demand for metal layers Source: ITRS 2007
15 45nm (Intel): Cu wiring, low-k ILD, narrow plugs, stacked vias. Note aspect ratio and wire spacing! [Cross-section with metal layers labeled M1–M3, M4, M5, M6, M7, M8.]
16 Metal Processing. Single Damascene: ILD deposition → oxide trench etch → metal fill → metal CMP. Dual Damascene: ILD deposition → oxide trench / via etch → metal fill → metal CMP.
17 The Interconnect Scaling Problem: Transistors get better, Wires get worse
18 Transistor Scaling (ideal)
19 Wire Scaling (ideal)
20 Interconnect time constant
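The slide introduces the interconnect time constant; as a rough sketch, a standard first-order (Elmore-style) model of a driver charging a distributed RC wire into a load gives τ ≈ R_drv·(C_wire + C_load) + R_wire·(0.5·C_wire + C_load). The model choice and all numbers below are illustrative assumptions, not values from the lecture:

```python
# First-order (Elmore-style) estimate of the time constant of a driver
# charging a distributed RC wire that ends in a receiver load.
# All parameter values are illustrative assumptions, not lecture data.

r_per_um = 0.5        # wire resistance per um  [ohm/um]   (assumed)
c_per_um = 0.2e-15    # wire capacitance per um [F/um]     (assumed)
R_drv    = 1.0e3      # driver output resistance [ohm]     (assumed)
C_load   = 2.0e-15    # receiver input capacitance [F]     (assumed)

def rc_time_constant(length_um: float) -> float:
    R_wire = r_per_um * length_um
    C_wire = c_per_um * length_um
    # The driver sees the whole wire plus the load; the distributed wire
    # itself contributes roughly half of its own RC product.
    return R_drv * (C_wire + C_load) + R_wire * (0.5 * C_wire + C_load)

for L in (10, 100, 1000):   # wire length in um
    print(f"L = {L:5d} um  ->  tau ~ {rc_time_constant(L)*1e12:8.2f} ps")
```

The wire term grows quadratically with length, which is the root of the scaling problem discussed in the following slides.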
21 Nonuniform Scaling of Wires. The idea: shrink lateral dimensions to save area; keep vertical dimensions to avoid high resistance.
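The resistance argument is simply R = ρ·L/(W·T): shrinking the width W alone raises R less than shrinking both width and thickness. A small sketch, with dimensions that are illustrative assumptions rather than slide data:

```python
# Wire resistance R = rho * L / (W * T): shrinking only the width W while
# keeping the thickness T limits the resistance increase.
# All dimensions are illustrative assumptions.

rho_cu = 1.7e-8        # resistivity of bulk copper [ohm*m]
L = 100e-6             # wire length [m] (held fixed for the comparison)

def resistance(width_m: float, thickness_m: float) -> float:
    return rho_cu * L / (width_m * thickness_m)

original   = resistance(100e-9, 200e-9)   # original cross-section
uniform    = resistance(50e-9, 100e-9)    # both dimensions halved
nonuniform = resistance(50e-9, 200e-9)    # width halved, thickness kept

print(f"original:            {original:8.1f} ohm")
print(f"uniform scaling:     {uniform:8.1f} ohm  (4x higher)")
print(f"non-uniform scaling: {nonuniform:8.1f} ohm  (2x higher)")
```

The price of keeping the thickness is a tall, narrow cross-section (high aspect ratio), which increases the sidewall capacitance to neighboring wires.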
22 Non-uniform scaling of wires
23 Interconnect scaling: Wire Capacitance. ε0 = 8.85 × 10^-14 F/cm; εr = 3.9 (SiO2).
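With the permittivity values on the slide, a simple parallel-plate estimate C = ε0·εr·A/d gives the order of magnitude of the coupling capacitance between adjacent wires. A minimal sketch; the geometry below is an assumption, and real wire capacitance also includes fringing fields and several neighbors:

```python
# Parallel-plate estimate of the coupling capacitance between two adjacent
# wires: C = eps0 * eps_r * A / d.  Geometry values are assumptions; a real
# extractor also accounts for fringing and other neighbors.

eps0  = 8.85e-14      # F/cm (from the slide)
eps_r = 3.9           # SiO2 (from the slide)

thickness_cm = 200e-7     # wire thickness: 200 nm        (assumed)
spacing_cm   = 100e-7     # wire-to-wire spacing: 100 nm  (assumed)
length_cm    = 0.1        # 1 mm of parallel run          (assumed)

area_cm2 = thickness_cm * length_cm        # facing sidewall area
C = eps0 * eps_r * area_cm2 / spacing_cm
print(f"coupling capacitance over 1 mm: {C*1e15:.1f} fF")
```

Note how the high aspect ratio from non-uniform scaling (thick wires, small spacing) directly increases this sidewall component.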
24 “Interconnect scaling – the real limiter” (Bohr, IEDM 95).
25 “Interconnect scaling – the real limiter” (Bohr, IEDM 95; ITRS 1997): unloaded single transistors, fixed-length wire.
26 Criticism of Bohr’s model: assumed a fixed wire length of 1mm; assumed a fixed (small) transistor size; ignored the effect of transistor size on circuit speed.
27 Review: Why transistors become better with scaling?
28 Review: Why wires become worse with scaling? Local wires: Global wires:
29 Local wires and Global wires Local wire: Shrinks in length just like everything else While transistors become faster, local wire RC remains unchanged (by simple scaling theory) Global wire: Goes across the whole chip – does not scale! Reflects new complexity added to the system! Global wire RC grows as (wirelength) 2
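The quadratic growth follows directly from R_wire = r·L and C_wire = c·L, so the wire’s own RC product is r·c·L². A quick sketch contrasting a short local wire with a cross-chip global wire; the per-unit-length values and lengths are illustrative assumptions:

```python
# Wire RC grows quadratically with length: R = r*L, C = c*L  =>  RC = r*c*L^2.
# Per-unit-length values and wire lengths are illustrative assumptions.

r = 0.5e6      # wire resistance per meter  [ohm/m]  (assumed, 0.5 ohm/um)
c = 0.2e-9     # wire capacitance per meter [F/m]    (assumed, 0.2 fF/um)

def wire_rc(length_m: float) -> float:
    return (r * length_m) * (c * length_m)

local_wire  = 50e-6     # 50 um local wire          (assumed)
global_wire = 10e-3     # 10 mm cross-chip wire     (assumed)

print(f"local  wire RC: {wire_rc(local_wire)*1e12:10.4f} ps")
print(f"global wire RC: {wire_rc(global_wire)*1e12:10.1f} ps")
print(f"length ratio {global_wire/local_wire:.0f}x  ->  "
      f"RC ratio {wire_rc(global_wire)/wire_rc(local_wire):.0f}x")
```

A 200x longer wire has a 40,000x larger RC product, which is why global wires, unlike local wires, dominate delay as chips grow.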
30 “The bottom of deep submicron” (Sylvester & Keutzer, ICCAD 98) How to separate gate delay from wire delay? How strong should the driving gate be? Conclusion: 50K-100K gate blocks are OK for traditional design flow.
31 “The future of wires” (Ho, Mai, Horowitz, Proc. IEEE 2001). Local wires scale in performance; global wires do not. Implications: CAD tools must improve handling of long wires and “exceptions” (otherwise design productivity will be destroyed); VLSI architecture must explicitly account for global latencies.
32 Delay of a global wire is longer than a clock cycle. Time for signal propagation across the die: today, 2-3 cycles; in 10 years, ~10 cycles. Fraction of chip reachable in a single cycle: today, ~25%; in 10 years, ~1%. [Plot: fraction of chip reachable in one clock cycle.] Source: Keckler et al., ISSCC 2003
33 Uniprocessor architecture inefficiency. “Fred Pollack’s rule”: new microarchitectures take a lot more area for just a little more performance. A new microarchitecture uses ~2-3X the die area of the last uArch but provides only 1.5-1.7X the performance of the last uArch. [Chart: die area vs. performance.] Global interconnect delay is part of this problem!
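Pollack’s rule is often paraphrased as single-core performance growing only with roughly the square root of the die area spent. The square-root form is an assumption (a common paraphrase, not stated on the slide); the sketch below only checks its consistency with the slide’s 2-3X area / 1.5-1.7X performance figures:

```python
# Pollack's rule is often paraphrased as: performance ~ sqrt(die area).
# The sqrt form is an assumed paraphrase, used here only to check
# consistency with the 2-3x area / 1.5-1.7x performance figures.

import math

for area_ratio in (2.0, 3.0):
    perf_ratio = math.sqrt(area_ratio)
    print(f"{area_ratio:.0f}x die area  ->  ~{perf_ratio:.2f}x performance")
# 2x -> ~1.41x, 3x -> ~1.73x, in line with the 1.5-1.7x range on the slide.
```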
34 Future of VLSI architecture (Dally 1999, Horowitz 2001). System requirement: power-efficient performance growth. Implications: Chip Multi-Processors (CMP), Thread-Level Parallelism (TLP), local memories, explicit communication. [Diagram: tiled array of processor (P) and local memory (M) blocks.]
35 The infamous growth in processor power Source: Intel (Sery, Borkar and De, DAC 2002)
36 Interconnect power. Dynamic power: P = Σ_i AF_i · C_i · V^2 · f. Definition: interconnect power is the dynamic power consumption due to interconnect capacitance switching.
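A minimal sketch of the dynamic-power sum P = Σ_i AF_i·C_i·V²·f and of the interconnect share of it; the net list, activity factors, and the wire/gate capacitance split below are all assumptions used only to illustrate the bookkeeping:

```python
# Dynamic power: P = sum_i AF_i * C_i * V^2 * f.
# "Interconnect power" is the part of this sum due to wire capacitance.
# The nets below (activity factor, wire cap, gate cap) are assumed values.

V = 1.0        # supply voltage [V]      (assumed)
f = 2.0e9      # clock frequency [Hz]    (assumed)

#        (activity factor AF_i, wire cap [F], gate cap [F])
nets = [
    (0.10, 20e-15,  5e-15),
    (0.25,  5e-15,  3e-15),
    (0.02, 80e-15, 10e-15),
]

total_power = sum(af * (cw + cg) * V**2 * f for af, cw, cg in nets)
wire_power  = sum(af * cw        * V**2 * f for af, cw, cg in nets)

print(f"total dynamic power: {total_power*1e6:.1f} uW")
print(f"interconnect power:  {wire_power*1e6:.1f} uW "
      f"({100*wire_power/total_power:.0f}% of total)")
```

Separating the wire capacitance from the gate capacitance in the sum is exactly the kind of accounting behind the Banias measurement on the next slide.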
37 Interconnect Power in the Banias chip. [Charts: total dynamic power and interconnect power breakdown.] Source: Magen et al., SLIP 2004
38 Additional issues with wires Delay Power Noise Reliability Cost
39 Noise: on-chip communication will become unreliable. Crosstalk (capacitive and inductive); power supply noise; reduced threshold voltages and margins; timing errors; synchronization failures between different clock domains; soft errors (logic upsets). 100% reliable transmission over long wires will not be possible; error correction mechanisms will be needed.
40 Capacitive Crosstalk
41 Capacitive Crosstalk Noise
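A common first-order model of capacitive crosstalk onto a quiet, weakly held victim is a capacitive divider: the victim sees roughly ΔV ≈ V_swing · Cc/(Cc + Cgnd), where Cc is the coupling capacitance to the aggressor and Cgnd the victim’s capacitance to ground. A minimal sketch under that model; the capacitance values are assumptions:

```python
# First-order capacitive-divider model of crosstalk noise on a quiet victim:
#   dV_victim ~ V_swing * Cc / (Cc + Cgnd)
# Valid when the victim is floating or only weakly held; a strong victim
# driver reduces the noise.  Capacitance values below are assumptions.

V_swing = 1.0        # aggressor voltage swing [V]     (assumed)
Cc      = 30e-15     # coupling cap to aggressor [F]   (assumed)
Cgnd    = 70e-15     # victim cap to ground [F]        (assumed)

noise = V_swing * Cc / (Cc + Cgnd)
print(f"peak crosstalk noise on victim: {noise:.2f} V "
      f"({100*noise/V_swing:.0f}% of the swing)")
```

Since non-uniform scaling raises the coupling fraction Cc/(Cc + Cgnd), crosstalk noise worsens with each generation, motivating the reliability concerns on the previous slide.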
42 The future of design flow? J. Cong: “Interconnect-centric design”, Proc. IEEE 2001
43 Summary. Growing system complexity requires more wires. Global wires scale poorly: delay, power, noise, and more. Interconnect-centric design is desirable.