Puneet Gupta1 , Andrew B. Kahng1 , Youngmin Kim2, Dennis Sylvester2

Slides:



Advertisements
Similar presentations
An International Technology Roadmap for Semiconductors
Advertisements

A Novel 3D Layer-Multiplexed On-Chip Network
-1- VLSI CAD Laboratory, UC San Diego Post-Routing BEOL Layout Optimization for Improved Time- Dependent Dielectric Breakdown (TDDB) Reliability Tuck-Boon.
Cadence Design Systems, Inc. Why Interconnect Prediction Doesn’t Work.
Kwangok Jeong and Andrew B. Kahng UCSD VLSI CAD Laboratory
Wires.
© imec Interconnect Width Selection for Deep Submicron Designs using the Table Lookup Method Mandeep Bamal*, Evelyn Grossar*, Michele Stucchi and.
1 Worst-case Delay Analysis Considering the Variability of Transistors and Interconnects Takayuki Fukuoka, Tsuchiya Akira and Hidetoshi Onodera Kyoto University.
Noise Model for Multiple Segmented Coupled RC Interconnects Andrew B. Kahng, Sudhakar Muddu †, Niranjan A. Pol ‡ and Devendra Vidhani* UCSD CSE and ECE.
The Impact of Variability on the Reliability of Long on-chip Interconnect in the Presence of Crosstalk Basel Halak, Santosh Shedabale, Hiran Ramakrishnan,
DARPA Assessing Parameter and Model Sensitivities of Cycle-Time Predictions Using GTX u Abstract The GTX (GSRC Technology Extrapolation) system serves.
Boosting: Min-Cut Placement with Improved Signal Delay Andrew B. KahngSherief Reda CSE & ECE Departments University of CA, San Diego La Jolla, CA
Design Sensitivities to Variability: Extrapolations and Assessments in Nanometer VLSI Y. Kevin Cao *, Puneet Gupta +, Andrew Kahng +, Dennis Sylvester.
Architectural-Level Prediction of Interconnect Wirelength and Fanout Kwangok Jeong, Andrew B. Kahng and Kambiz Samadi UCSD VLSI CAD Laboratory
Study of Floating Fill Impact on Interconnect Capacitance Andrew B. Kahng Kambiz Samadi Puneet Sharma CSE and ECE Departments University of California,
From Compaq, ASP- DAC00. Power Consumption Power consumption is on the rise due to: - Higher integration levels (more devices & wires) - Rising clock.
Lecture #25a OUTLINE Interconnect modeling
Effects of Global Interconnect Optimizations on Performance Estimation of Deep Sub-Micron Design Yu (Kevin) Cao 1, Chenming Hu 1, Xuejue Huang 1, Andrew.
Toward a Methodology for Manufacturability-Driven Design Rule Exploration Luigi Capodieci, Puneet Gupta, Andrew B. Kahng, Dennis Sylvester, and Jie Yang.
Circuit Performance Variability Decomposition Michael Orshansky, Costas Spanos, and Chenming Hu Department of Electrical Engineering and Computer Sciences,
ISPD 2000, San DiegoApr 10, Requirements for Models of Achievable Routing Andrew B. Kahng, UCLA Stefanus Mantik, UCLA Dirk Stroobandt, Ghent.
1 Modeling and Optimization of VLSI Interconnect Lecture 1: Introduction Avinoam Kolodny Konstantin Moiseev.
SLIP 2000April 9, Wiring Layer Assignments with Consistent Stage Delays Andrew B. Kahng (UCLA) Dirk Stroobandt (Ghent University) Supported.
1 A Novel Metric for Interconnect Architecture Performance Parthasarathi Dasgupta, Andrew B. Kahng, Swamy V. Muddu Dept. of CSE and ECE University of California,
Lecture 21, Slide 1EECS40, Fall 2004Prof. White Lecture #21 OUTLINE –Sequential logic circuits –Fan-out –Propagation delay –CMOS power consumption Reading:
Noise and Delay Uncertainty Studies for Coupled RC Interconnects Andrew B. Kahng, Sudhakar Muddu † and Devendra Vidhani ‡ UCLA Computer Science Department,
A Methodology for Interconnect Dimension Determination By: Jeff Cobb Rajesh Garg Sunil P Khatri Department of Electrical and Computer Engineering, Texas.
TLC: Transmission Line Caches Brad Beckmann David Wood Multifacet Project University of Wisconsin-Madison 12/3/03.
EZ-COURSEWARE State-of-the-Art Teaching Tools From AMS Teaching Tomorrow’s Technology Today.
Power Reduction for FPGA using Multiple Vdd/Vth
Research on Analysis and Physical Synthesis Chung-Kuan Cheng CSE Department UC San Diego
UC San Diego / VLSI CAD Laboratory Toward Quantifying the IC Design Value of Interconnect Technology Improvement Tuck-Boon Chan, Andrew B. Kahng, Jiajia.
An Efficient Clustering Algorithm For Low Power Clock Tree Synthesis Rupesh S. Shelar Enterprise Microprocessor Group Intel Corporation, Hillsboro, OR.
Ho-Lin Chang, Hsiang-Cheng Lai, Tsu-Yun Hsueh, Wei-Kai Cheng, Mely Chen Chi Department of Information and Computer Engineering, CYCU A 3D IC Designs Partitioning.
Kwangsoo Han, Andrew B. Kahng, Hyein Lee and Lutong Wang
Detailed Routing: New Challenges
Kwangsoo Han‡, Andrew B. Kahng‡† and Hyein Lee‡
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 6: Detailed Routing © KLMH Lienig 1 What Makes a Design Difficult to Route Charles.
Impact of Interconnect Architecture on VPSAs (Via-Programmed Structured ASICs) Usman Ahmed Guy Lemieux Steve Wilton System-on-Chip Lab University of British.
Chapter 4: Secs ; Chapter 5: pp
EE 587 SoC Design & Test Partha Pande School of EECS Washington State University
Department of Electrical and Computer Engineering University of Wisconsin - Madison Optimizing Total Power of Many-core Processors Considering Voltage.
-1- UC San Diego / VLSI CAD Laboratory Optimization of Overdrive Signoff Tuck-Boon Chan, Andrew B. Kahng, Jiajia Li and Siddhartha Nath Tuck-Boon Chan,
-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products Samyoung Bang #, Kwangsoo Han ‡, Andrew B.
Dept. of Electronics Engineering & Institute of Electronics National Chiao Tung University Hsinchu, Taiwan ISPD’16 Generating Routing-Driven Power Distribution.
14/2/20041 Interconnect-Power Dissipation in a Microprocessor N. Magen, A. Kolodny, U. Weiser, N. Shamir Intel corporation® Technion - Israel Institute.
14 February, 2004SLIP, 2004 Self-Consistent Power/Performance/Reliability Analysis for Copper Interconnects Bipin Rajendran, Pawan Kapur, Krishna C. Saraswat.
Slide 1 SLIP 2004 Payman Zarkesh-Ha, Ken Doniger, William Loh, and Peter Bendix LSI Logic Corporation Interconnect Modeling Group February 14, 2004 Prediction.
Oleg Petelin and Vaughn Betz FPL 2016
Power-Optimal Pipelining in Deep Submicron Technology
CS203 – Advanced Computer Architecture
The Interconnect Delay Bottleneck.
Crosstalk If both a wire and its neighbor are switching at the same time, the direction of the switching affects the amount of charge to be delivered and.
20-NM CMOS DESIGN.
Chapter 4 Interconnect.
Architecture & Organization 1
Reading: Hambley Ch. 7; Rabaey et al. Sec. 5.2
Capacitance variation 3/ (%)
Revisiting and Bounding the Benefit From 3D Integration
Jason Cong, David Zhigang Pan & Prasanna V. Srinivas
Architecture & Organization 1
Overview of VLSI 魏凱城 彰化師範大學資工系.
Interconnect Architecture
Buffered tree construction for timing optimization, slew rate, and reliability control Abstract: With the rapid scaling of IC technology, buffer insertion.
High Throughput LDPC Decoders Using a Multiple Split-Row Method
Summary Current density in a signal line was estimated, based on the simple circuit shown in Fig.1. This circuit is scaled down according to ITRS 2003.
Circuit Design Techniques for Low Power DSPs
Reading (Rabaey et al.): Sections 3.5, 5.6
Applications of GTX Y. Cao, X. Huang, A.B. Kahng, F. Koushanfar, H. Lu, S. Muddu, D. Stroobandt and D. Sylvester Abstract The GTX (GSRC Technology Extrapolation)
Jason Cong, David Zhigang Pan & Prasanna V. Srinivas
Presentation transcript:

Investigation of Performance Metrics for Interconnect Stack Architectures Puneet Gupta1 , Andrew B. Kahng1 , Youngmin Kim2, Dennis Sylvester2 1ECE Department, University of California at San Diego 2EECS Department, University of Michigan at Ann Arbor

Outline Motivation Delay and Bandwidth Energy-Driven Metrics Via Blockage WLD and Wire Assignment (Avg. wire length for each layer) Bandwidth Metrics Energy-Driven Metrics SLIP2004

Motivation Front-end dimensions set by lithography restrictions Back-end dimensions are often area/performance driven Especially intermediate and global metal levels Front-end performance quantified with known metrics: FO4 or ring oscillator delays Ioff, Ion No comparable metrics for back-end RC per µm ignores many issues Via blockage factor is important to consider SLIP2004

Delay and Bandwidth Bandwidth or throughput-driven approach for the interconnect recently proposed Ho 01, Young 01, Lin 02 All approaches are applied to one layer only (i.e., a top level) Ignore factors due to multilayer interconnect via blockage repeater insertion Bandwidth and energy metrics considering the entire multilayer interconnect stack are required SLIP2004

Interconnect Stack Repeater via Source: Muddu We seek to develop metrics that can be used to compare back-end dimensions, and eventually to drive the selection of such dimensions SLIP2004

Standard Interconnect Delay Models Worst-case 50% delay of a minimum sized inverter driving an interconnect is (Pamunuwa 03) The number of repeaters is k and the size of each repeater is h, then SLIP2004

Via Blockage; Previous Work Via blockage factor (vi) : Fraction of total available space that is not available due to via blockage effects on layer i A layer blocks 12% ~ 15% of the wiring capacity of every layer underneath it at constant pitch (Sai-Halasz) Via blockage is only severe on the lower metal layers (Chen et al.) Where Nv_wire is the number of vias by wires, I(l) is cumulative interconnect density SLIP2004

Via Blockage, continued Repeater insertion for long wires in the semi-global and global interconnect layers is necessary for delay and slew constraints Total via blockage factor in nth layer SLIP2004

Interconnect Technology Parameters 130nm 90nm Ac (cm^2) 0.98 #of gates 6.4M 12.87M Vdd (V) 1.4 1.2 Id (mA/um) 1.0 # of nets 29.08M 58.15M k (ILD) 3.6 2.9 p (Rents exponent) 0.6 Layer 130nm technology (nm) 90nm technology (nm) Intel IBM TSMC ITRS 1 350 320 340 220 245 240 210 2 448 400 410 280 3 450 275 4 756 5 1120 1340 480 6 1204 800 900 720 560 840 820 7 - 1080 SLIP2004

Via Blockage Projections 130nm 90nm 2~5x times larger blockage at the bottom layer IBM and TSMC show more significant via blockage  less tapered nature (many iso-pitch layers) SLIP2004

WLD and Wire Assignment Typical wire lengths on a given layer can be considerably different  delay as well Davis’s wirelength model and a top town wire assignment technique (Venkatesan 01) are applied Average wire length on each layer is used as typical wire length on the layer SLIP2004

Average Wire Length 130nm 90nm Intel and ITRS show much larger average wire length for all layer top down wire assignment Wider pitch at top two layers for Intel/ITRS SLIP2004

Bandwidth Metrics Represents the rate at which information can be transferred through channel # of wires or bits per second Delayn is the wire delay calculated at the average wire length in a given layer n Driver is sized to match wire cap. using optimal repeater as baseline Nwire is the number of parallel wires If Lavg > maximum allowable distance between repeaters  insert optimal repeaters Via blockage reduces the wiring resources directly SLIP2004

Bandwidth IBM and TSMC show better bandwidth 130nm 90nm IBM and TSMC show better bandwidth Shorter average wire length and greater wiring density overcome larger RC per unit length Bandwidth in lower layers is much higher than in upper layers Shorter wires and greater wiring density SLIP2004

Normalized Bandwidth Simply summing BW of individual layers is not a good way to assess the performance  normalization required The number of segments: “routing demand” for a given layer avg c seg L A N = SLIP2004

Normalized Bandwidth 130nm 90nm Intel and ITRS stacks are superior when considering normalization They have fewer segments due to longer average wire length on all layers Their pre-normalized BW was already penalized for longer wirelengths Normalized bandwidth is more consistent across the stack (ignoring metal l where density is main criteria) SLIP2004

Statistics of Normalized BW ~15% variation in total normalized BW across technologies Somewhat smaller than spread of front-end metrics like FO4 and Ion 130nm 90nm SLIP2004

Energy-Driven Metrics The increasing use of repeaters on global and even intermediate layers Reduces delay and maintains good signal integrity But: Increases power consumption dramatically Repeater capacitance : Crep = khCdrv Drivers other than repeaters also considered Wire capacitance : Cwire = 2(Cg + Cc) Energy is then = Nwire(Crep + Cwire) Ignoring operation frequency and supply voltage which are not varying SLIP2004

Energy in 130nm, 90nm 130nm 90nm Total energy on a layer is fairly constant across metal levels Large differences on layer 1 are due to top-down layer assignment, different utilization factors Larger pitches at top levels allows for smaller energy consumption (fewer repeaters) in Intel and ITRS With growing # of repeaters in future technologies (Saxena, ISPD03), it becomes critical to choose wiring pitches (reverse scale) with energy/repeaters in mind SLIP2004

Bandwidth per unit Energy Bandwidth and energy can be combined to provide a complete interconnect performance metric  Bandwidth (normalized) per unit Energy 130nm 90nm SLIP2004

Sum and Avg. of BW/Energy 130nm 90nm Intel/ITRS remain appealing in terms of BW/Energy Spread is now larger than in case of just BW Gap increases from 130 to 90nm SLIP2004

Conclusions Bandwidth and energy metrics for complete interconnect stacks identified Growing impact of repeaters on via blockage Normalized bandwidth metrics for comparison of bandwidth across layers Intel and ITRS tend to show better results in terms of normalized bandwidth Wider pitches, question of routability (?) Energy-based metrics indicate that top-level pitch choice has a large impact on BW/Energy As repeaters become common on intermediate metallization layers, more layers must consider reverse scaling A gradually tapered interconnect stack provides best performance but somewhat more manufacturing complexity SLIP2004

Thank you SLIP2004