Performance Analysis and Technology of 3D ICs


Performance Analysis and Technology of 3D ICs Krishna Saraswat Shukri Souri Kaustav Banerjee Pawan Kapur Department of Electrical Engineering Stanford University Stanford, CA 94305 saraswat@stanford.edu Funding sources: DARPA, MARCO

Outline Why 3-D ICs? Limits of Cu/low K technology 3D IC performance simulation 3-D technologies Seeding crystallization of amorphous Si Processed wafer bonding Thermal simulations

Introduction: Interconnect Delay Is Increasing Chip size is continually increasing due to increasing complexity Device performance is improving but interconnect delay is increasing Chip sizes today are wire-pitch limited: Size is determined by amount of wiring required Mark Bohr, IEDM Proceedings, 1995

Cu Resistivity: Effect of Line Width Scaling Effect of Cu diffusion Barrier Barriers have higher resistivity Barriers can’t be scaled below a minimum thickness Effect of Electron Scattering Reduced mobility as dimensions decrease Effect of Higher Frequencies Carriers confined to outer skin increasing resistivity Problem is worse than anticipated in the ITRS 1999 roadmap

Cu Resistivity: Barriers Deposition technologies compared: Atomic Layer Deposition (ALD), ionized PVD, collimated PVD. ITRS 1999 line width (nm), global/local: 525/250, 280/133, 95/48. 5 nm barrier assumed at the thinnest spot. No scattering assumed, i.e., bulk resistivity. Interconnect dimensions scaled according to ITRS 1999

Cu Resistivity: Effect of Electron Scattering [Figure labels: elastic scattering; diffuse scattering (lower mobility); Diffuse, Global; Diffuse, Local; 273 K; 373 K] No barrier assumed. Diffuse electron scattering increases resistivity. Lowering temperature has a big effect

Fraction of chip area used by repeaters Rent’s exponents As much as 27% of the chip area at 50 nm node is likely to be occupied by repeaters.

3D ICs with Multiple Active Si Layers Motivation: Performance of ICs is limited by the R, L, C of interconnects. Interconnect length, and therefore R, L, C, can be minimized by stacking active Si layers. The number of horizontal interconnects can be minimized by using vertical interconnects. Integration of disparate technologies is possible, e.g., memory & logic, optical I/O, etc. [Figure labels: Logic, n+/p+, Gate, T1, T2, M1 to M4, M’1, M’2, VILIC, Via, Repeaters, optical I/O devices, Memory, Analog]

Chip Size Device-size limited: memory (SRAM, DRAM). Wire-pitch limited: logic, e.g., microprocessors. [Figure labels: PMOS, NMOS] Chips are either device-size limited or wire-pitch limited, depending on the complexity of wiring. Memory chips are typically device-size limited, where the driving force is the density of devices. In wire-pitch-limited chips such as modern microprocessors, wiring complexity results in a total wiring area that exceeds the real estate occupied by the devices.

Rent’s Rule $T = k N^{P}$, where T = # of I/O terminals, N = # of gates, k = avg. I/Os per gate, P = Rent’s exponent
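Rent's rule as stated above is a one-line power law, and a minimal Python sketch makes the scaling concrete. The function name and the values of k and P used below are illustrative assumptions; the slides do not give numeric Rent parameters at this point.

```python
def rent_terminals(num_gates, k, p):
    """Rent's rule: T = k * N**p I/O terminals for a block of N gates."""
    if num_gates < 0:
        raise ValueError("num_gates must be non-negative")
    return k * num_gates ** p


# Assumed, illustrative Rent parameters (not taken from the slides).
K_ASSUMED = 4.0
P_ASSUMED = 0.6

for n in (1_000, 1_000_000, 180_000_000):
    t = rent_terminals(n, K_ASSUMED, P_ASSUMED)
    print(f"N = {n:>11,d} gates -> T ~ {t:,.0f} terminals")
```

Because P is below 1, the terminal count grows more slowly than the gate count, which is what lets the rule be applied recursively to blocks of different sizes.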

Determination of Wire-length Distribution Conservation of I/Os: $T_A + T_B + T_C = T_{A\text{-to-}B} + T_{A\text{-to-}C} + T_{B\text{-to-}C} + T_{ABC}$, with $T_{A\text{-to-}B} = T_A + T_B - T_{AB}$ and $T_{B\text{-to-}C} = T_B + T_C - T_{BC}$. Values of T within a block or collection of blocks are calculated using Rent’s rule, e.g., $T_A = k (N_A)^{P}$ and $T_{ABC} = k (N_A + N_B + N_C)^{P}$. Recursive use of Rent’s rule gives the wire-length distribution for the whole chip. [Figure labels: Block A with N_A gates, Block B, Block C] Ref: Davis & Meindl, IEEE TED, March 1998
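The I/O conservation step can be sketched in Python. The slide gives the expressions for the A-to-B and B-to-C terminal counts directly; in the sketch below the A-to-C count is obtained by solving the conservation equation for the one remaining unknown. Function names and the numeric parameters are assumptions for illustration only.

```python
def rent(n, k, p):
    """Rent's rule: terminals of a block (or group of blocks) holding n gates."""
    return k * n ** p


def block_to_block_terminals(n_a, n_b, n_c, k, p):
    """Return (T_A-to-B, T_B-to-C, T_A-to-C) for three blocks A, B, C."""
    t_a, t_b, t_c = rent(n_a, k, p), rent(n_b, k, p), rent(n_c, k, p)
    t_ab = rent(n_a + n_b, k, p)
    t_bc = rent(n_b + n_c, k, p)
    t_abc = rent(n_a + n_b + n_c, k, p)

    # Given on the slide.
    t_a_to_b = t_a + t_b - t_ab
    t_b_to_c = t_b + t_c - t_bc
    # Solved from: T_A + T_B + T_C = T_A-to-B + T_A-to-C + T_B-to-C + T_ABC.
    t_a_to_c = (t_a + t_b + t_c) - t_a_to_b - t_b_to_c - t_abc
    return t_a_to_b, t_b_to_c, t_a_to_c


# Assumed example: a single gate in A, 10 gates in B, 20 gates in C.
a_to_b, b_to_c, a_to_c = block_to_block_terminals(1, 10, 20, k=4.0, p=0.6)
print(f"A-to-B: {a_to_b:.3f}  B-to-C: {b_to_c:.3f}  A-to-C: {a_to_c:.3f}")
```

Substituting the two given expressions shows that the A-to-C count reduces algebraically to T_AB + T_BC - T_B - T_ABC, so it depends only on quantities that Rent's rule already provides.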

Inter-Layer Connections for 3-D, 2 Layers A fraction of the I/O ports T1 and T2 is used for inter-layer connections, Tint. Assume I/O port conservation: $T = T_1 + T_2 - T_{int}$. Use Rent’s rule, $T = k N^{p}$, to solve for $T_{int}$ (p assumed constant), where k = avg. I/Os per gate, N = no. of gates, p = Rent’s exponent
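Solving the conservation relation for Tint can be sketched as follows. The sketch assumes the N gates are split evenly between the two layers, which the slide does not state, and the values of k and p are illustrative assumptions; the gate count is the 180 million figure from the NTRS 50 nm example.

```python
import math


def interlayer_terminals(num_gates, k, p):
    """Tint = T1 + T2 - T for a 2-layer stack, assuming an even gate split."""
    gates_per_layer = num_gates / 2          # assumption: equal partition
    t1 = t2 = k * gates_per_layer ** p       # Rent's rule per layer
    t_total = k * num_gates ** p             # Rent's rule for the whole chip
    return t1 + t2 - t_total


# Assumed Rent parameters; 180 million gates as in the NTRS 50 nm example.
t_int = interlayer_terminals(180e6, k=4.0, p=0.6)
print(f"Estimated inter-layer connections: {t_int:,.0f}")

# Under the even-split assumption this equals k * N**p * (2**(1 - p) - 1).
closed_form = 4.0 * 180e6 ** 0.6 * (2 ** 0.4 - 1)
print(math.isclose(t_int, closed_form, rel_tol=1e-9))
```

Under the even split, Tint is a fixed fraction of the chip-level terminal count, and that fraction grows as p decreases.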

Wire-length Distribution of 3-D IC Microprocessor example from NTRS 50 nm node: number of gates, 180 million; minimum feature size, 50 nm; number of wiring levels, 9; metal resistivity (copper), 1.673e-6 Ω-cm; dielectric constant (polymer), εr = 2.5. [Figure labels: blocks 1 to 5; Single Layer; 2 Layers] Replace horizontal by vertical interconnect. Vertical inter-layer connections reduce the metal wiring requirement

Chip Area Estimation A 3-tier wiring network: global, semi-global, local. Placement of a wire in a tier is determined by some constraint, e.g., maximum allowed RC delay. Wiring area = wire pitch x total length: $A_{req} = p_{loc} L_{tot\_loc} + p_{semi} L_{tot\_semi} + p_{glob} L_{tot\_glob} = A_{loc} + A_{semi} + A_{glob}$. $L_{tot}$ is calculated from the wire-length distribution
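The area sum above is a pitch-weighted total of wire length per tier. A minimal sketch, with hypothetical pitches and total lengths chosen only to show the arithmetic (none of these numbers come from the slides):

```python
def required_wiring_area(tiers):
    """Sum pitch * total wire length over the tiers.

    `tiers` maps a tier name to (pitch, total_length) in consistent units,
    e.g. micrometres, so the result is in square micrometres.
    """
    return sum(pitch * total_length for pitch, total_length in tiers.values())


# Hypothetical pitches (um) and total wire lengths (um) per tier.
example_tiers = {
    "local": (0.1, 1000.0),
    "semi-global": (0.2, 500.0),
    "global": (0.5, 100.0),
}

for name, (pitch, length) in example_tiers.items():
    print(f"A_{name} = {pitch * length:.1f} um^2")
print(f"A_req = {required_wiring_area(example_tiers):.1f} um^2")
```

In this form, shortening the total length in a tier reduces the required area in proportion to that tier's pitch.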

2 Active Layer Results Upper-tier pitches are reduced for constant chip frequency, fc. Less wiring is needed. Almost 50% reduction in chip area. [Figure: chip area vs. semi-global pitch, comparison between 2D and 3D IC; operating frequency is kept constant.]

3-D Wire-Length Distribution Symmetric Interconnects: Comparable inter- and intra-device layer connectivity Asymmetric Interconnects: Negligible inter-device layer connectivity Ref: Rahman & Reif (MIT) N: Number of logic gates, f.o.: fan-out, k and p: Rent’s parameters, Nz: Number of device layers More vertical interconnects required

More than 2 active layers [Figure: normalized interconnect delay vs. number of active layers, 1 to 5]

Delay of Scaled 2D and 3D ICs Moving repeaters to upper active tiers reduces interconnect delay by 9%. 3D (2 Si layers) shows significant delay reduction (64%). Increasing the number of metal levels in 3D improves interconnect delay by another 40%. Increasing the number of Si layers to 5 further improves interconnect delay. [Figure: typical gate delay and interconnect delay vs. technology generation (nm); interconnect-delay curves for 2D IC with repeaters, 3D IC with constant metal layers, 3D IC with 2X metal layers, and 3D IC with 2X metal layers and 5 Si layers] Interconnect delay: simulations assumed a state-of-the-art chip at a technology node, with data from NTRS

3D Approaches Wafer Bonding (MIT). Seeding crystallization of a-Si (Stanford). Epitaxial Lateral Overgrowth (Purdue). [Figure labels: Logic, n+/p+, Gate, T1, T2, M1 to M4, M’1, M’2, VILIC, Via, Repeaters or optical I/O devices, Memory or Analog]

Statistical Variations in Poly-TFT Properties [Figure labels: conventional poly-TFT; mobility; grain size 0.3-0.5 µm; effect of grain boundaries] In 3D ICs, can we get by with TFTs in poly-Si, or do we really need single-crystal TFTs? To answer this question, a new model connecting grain size variation to performance variation in poly-TFTs has been developed. The number of grains within a TFT is modeled using a Poisson area scatter, which leads to variation in the number of grains between devices. The device area is divided by the number of grains to obtain the average grain size within a TFT. The grain size is then connected to device performance by substitution into physically based models of device behavior; in this case, simple models for threshold voltage and mobility. The model clearly suggests that device variation increases considerably as device and grain sizes converge. The increase is due to the larger spread in average grain size in such devices, which in turn is due to the random growth of the grains. But what happens if grain growth is not random, so that the grain size is known to have a fixed value for all transistors on a substrate? The model would then predict no performance variation from grain size, owing to the absence of grain boundaries, which are the primary cause of transistor performance degradation. In other words, such TFTs would not be subject to this model; there would still be some performance variation, but it would not be due to grain size. Since seeding methods involve non-random grain nucleation, TFTs made this way should in theory show no variation due to differing average grain size. Such methods may be needed to make TFTs practical for 3-D integration. As channel length approaches grain size, statistical variation increases. Elimination of grain boundaries should reduce this variation
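The grain-count model described above can be illustrated with a small Monte Carlo sketch: draw the number of grains per device from a Poisson distribution, divide the device area by that count, and look at the spread of the resulting average grain size. This is not the authors' model code. The areas, the trial count, and the rule of discarding zero-grain draws are all assumptions made for illustration, and the step that maps grain size to threshold voltage and mobility is omitted.

```python
import math
import random
import statistics


def sample_poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def grain_size_spread(device_area, mean_grain_area, trials=5000, seed=0):
    """Coefficient of variation of the average grain size per device."""
    rng = random.Random(seed)
    mean_grains = device_area / mean_grain_area
    sizes = []
    while len(sizes) < trials:
        grains = sample_poisson(rng, mean_grains)
        if grains == 0:
            continue  # assumption: every device contains at least one grain
        sizes.append(device_area / grains)  # average grain size in the device
    return statistics.pstdev(sizes) / statistics.fmean(sizes)


# Assumed mean grain area of 0.16 um^2 (a 0.4 um grain, within the 0.3-0.5 um range).
for area in (16.0, 1.6, 0.16):
    print(f"device area {area:5.2f} um^2 -> spread {grain_size_spread(area, 0.16):.3f}")
```

With these assumed inputs the spread rises as the device area shrinks toward the grain area, which is the trend the model is said to predict.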

Ge Seeded Lateral Crystallization Concept: Locally induce nucleation. Grow laterally, inhibiting additional nucleation. Build MOSFET in a single grain. [Figure: three steps, seeding (Ge seeds on a-Si over SiO2 on a substrate), grain growth by lateral crystallization, and MOSFET fabrication (source, channel, drain, gate oxide, gate) in a single grain; 0.1 µm NMOS]

Single Grain Transistors in Ge Induced Crystallized Si [Figures: ID-VG of 0.1 µm NMOS; mobility of single grain transistors (SGT)]

Ni Seeded Lateral Crystallization Transistor is initially fabricated in a-Si. Ni seeding gives simultaneous crystallization and dopant activation. Low thermal budget (≤ 450°C). Devices could be fabricated on top of a metal line. [Figure labels: NMOS, a-Si, crystallized Si, Ni seed, SiGe gate, SiO2, substrate, Tmax = 450°C]

Thermal Behavior in 3D ICs Power Dissipation for 2D Energy is dissipated during transistor operation Heat is conducted through the low thermal conductivity dielectric, Silicon substrate and packaging to heat sink 1-D model assumed to calculate die temperature
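A 1-D model of this kind treats the heat path as thermal resistances in series, each layer contributing thickness / (conductivity x area). The sketch below shows that arithmetic only. The power, die area, layer thicknesses, and package resistance are hypothetical, and the conductivities are typical textbook values for SiO2 and Si rather than numbers from the slides.

```python
def die_temperature(power_w, area_m2, layers, r_package_k_per_w, t_ambient_c):
    """1-D series model: T_die = T_ambient + P * (sum of layer R + package R).

    `layers` is a list of (thickness_m, conductivity_w_per_m_k) pairs.
    """
    r_layers = sum(t / (kappa * area_m2) for t, kappa in layers)
    return t_ambient_c + power_w * (r_layers + r_package_k_per_w)


# Hypothetical 2D chip: 100 W over 1 cm^2 (1e-4 m^2).
layers_2d = [
    (5e-6, 1.4),      # dielectric stack, ~1.4 W/(m K) assumed for SiO2
    (500e-6, 150.0),  # silicon substrate, ~150 W/(m K) assumed
]
t_die = die_temperature(100.0, 1e-4, layers_2d,
                        r_package_k_per_w=0.5,  # assumed package + heat sink
                        t_ambient_c=25.0)
print(f"Estimated die temperature: {t_die:.1f} C")
```

With these assumed numbers the thin dielectric contributes about as much resistance as the far thicker silicon substrate, which is why the low thermal conductivity of the dielectric matters.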

3D Examples for Thermal Study Case A: Heat dissipation is confined to one surface. Case B: Heat dissipation possible from 2 surfaces. [Figure labels: Bulk Si, n+, p+, Gate, T1, T2, M1 to M6, M’1, M’2]

Die Temperature Simulation Attainable die temperatures for 2-D and 3-D ICs at the NTRS based 50 nm node using advanced heat-sinking technologies that would reduce the normalized thermal resistance, R

3D ICs: Implications for Circuit Design Critical Path Layout: By vertical stacking, the distance between logic blocks on the critical path can be reduced to improve circuit performance. Integration of disparate technologies is easier Microprocessor Design: on-chip caches on the second active layer will reduce distance from the logic and computational blocks. RF and Mixed Signal ICs: Substrate isolation between the digital and RF/analog components can be improved by dividing them among separate active layers - ideal for system on a chip design. Optical I/O can be integrated in the top layer Repeaters: Chip area can be saved by placing repeaters (~ 10,000 for high performance circuits) on the higher active layers. Physical Design and Synthesis: Due to a non-planar target graph (upon which the circuit graph is embedded), placement and routing algorithms, and hence synthesis algorithms and architectural choices, need to be suitably modified.

Summary Cu/low k will not solve the problems of interconnects. Modeling of interconnect delay shows significant improvement by transitioning from 2-D to 3-D ICs. Seeding and lateral crystallization of amorphous Si is a promising technique to implement 3-D ICs. Thermal dissipation in 3-D ICs may require innovative packaging solutions.