2013 DAC Designer/User Track Presentation Inductor Design for Global Resonant Clock Distribution in a 28-nm CMOS Processor Visvesh Sathe 3, Padelis Papadopoulos.

Presentation transcript:

2013 DAC Designer/User Track Presentation
Inductor Design for Global Resonant Clock Distribution in a 28-nm CMOS Processor
Visvesh Sathe (3), Padelis Papadopoulos (2), Alvin Loke (3), Tarek Khan (1), Anand Raman (2), Gerry Vandevalk (3), Nikolas Provatas (2), Vincent Ross (1)
(1) Advanced Micro Devices, Inc. (2) Helic, Inc. (3) Formerly at Advanced Micro Devices, Inc.

AMD/Helic, Inductor Design for Resonant-clocked Processor

Slide 1: Outline
- Resonant Clock Distribution
- Inductor Design and Analysis Challenges
- Helic VeloceRaptor/X
- Inductor Extraction using VeloceRaptor/X
- Silicon Correlation
- Conclusion

Slide 2: Processor Global Clock Distribution
- Significant global clock loading
  – 7-ps clock skew target across > 20-mm² core area
  – Constrained clock latency from grid to timing elements
- Typical core-power consumption breakdown (AMD "Piledriver"): clocking 24%, standard cells 19%, gaters 16%, macros 18%, flops 18%, bus 5%

Slide 3: Basic Resonant Clocking Operation
- Rely on efficient resonance between L_tank and C_clk near ω0
- Efficient operation around ω0
- Driving clock at much lower frequencies
  – Reduced efficiency, warped clock waveform
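
The resonance condition on this slide can be made concrete with the textbook relation for an ideal, lossless LC tank, f0 = 1 / (2π·sqrt(L·C)). The sketch below is only an illustration of that relation: the function names, the 1 nF clock load, and the 3.5 GHz target are assumptions chosen for the example, not figures from the presentation.

```python
import math


def resonant_frequency_hz(l_tank_h: float, c_clk_f: float) -> float:
    """Natural frequency of an ideal LC tank: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_tank_h * c_clk_f))


def tank_inductance_h(f0_hz: float, c_clk_f: float) -> float:
    """Inductance that resonates with c_clk_f at f0_hz (same ideal model)."""
    omega0 = 2.0 * math.pi * f0_hz
    return 1.0 / (omega0 * omega0 * c_clk_f)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only (not from the slides).
    c_clk = 1.0e-9      # assumed total clock capacitance, farads
    f_target = 3.5e9    # assumed target resonant frequency, hertz

    l_total = tank_inductance_h(f_target, c_clk)
    print(f"total tank inductance: {l_total * 1e12:.3f} pH")
    print(f"check f0: {resonant_frequency_hz(l_total, c_clk) / 1e9:.3f} GHz")
```

Because the model ignores wire resistance and driver loading, it only locates ω0; it says nothing about how quickly efficiency falls off away from resonance.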

Slide 4: AMD Resonant Clocking
- 90 inductors distributed over custom power grid, signal wires, and core circuitry

Slide 5: Inductor Design
- Clock macro, bump pitch constrain inductor size
- Metal sharing with existing power → cut-aways
- Centered power straps, HCK tree for  mutual inductance

Slide 6: Inductor and Grid Problem Summary
- 87 x 65 μm spiral over 113 x 126 μm custom grid
- 12 metal layers (2 thick)
  – Width: 0.13 to 5.7 μm
  – Thickness: 0.1 to 1.2 μm
- >5 μm/μm² interconnect length to be extracted!
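
To get a feel for the size of the extraction problem, the quoted density can be multiplied by the grid footprint. This is a rough sketch that assumes the ">5 μm/μm²" figure is stated per unit of the 113 x 126 μm grid footprint, which the slide does not say explicitly; the function and constant names are made up for the example.

```python
def min_interconnect_length_um(width_um: float, height_um: float,
                               density_um_per_um2: float) -> float:
    """Lower bound on wire length: footprint area times length density."""
    return width_um * height_um * density_um_per_um2


# Figures from the slide; treating the density as per-footprint is an assumption.
GRID_WIDTH_UM = 113.0
GRID_HEIGHT_UM = 126.0
MIN_DENSITY_UM_PER_UM2 = 5.0

if __name__ == "__main__":
    length_um = min_interconnect_length_um(
        GRID_WIDTH_UM, GRID_HEIGHT_UM, MIN_DENSITY_UM_PER_UM2)
    print(f"at least {length_um:.0f} um ({length_um / 1000.0:.2f} mm) of wire")
```

Under that assumption, a single inductor site involves more than 71,190 μm (about 71 mm) of interconnect.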

Slide 7: Inductor Design Methodology
- Goal: Achieve desired L with maximum Q on a highly customized inductor
- Available design variables
  – Winding width, outer spacing, inner spacing (NESW)
  – Winding height, winding width
- Running multiple extractions within a reasonable time is vital
- Extraction customization per metal is crucial
  – Top metal layers dominate magnetic interaction; lower-level metals have minimal interaction
  – Per-metal extraction/merging mode selection (R/C/RC/RLC/RLCk)
- Process-aware, temperature-sensitive extraction
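
The stated goal (hit a desired L, then maximize Q) amounts to a constrained selection over extracted candidates. The sketch below shows one way to express it, assuming a simple series R-L model where Q = ωL/R. The `Candidate` type, the tolerance parameter, and all numeric values are hypothetical; the presentation does not describe how candidates were actually ranked.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One extracted inductor geometry (hypothetical container)."""
    name: str
    inductance_h: float
    series_resistance_ohm: float


def quality_factor(c: Candidate, freq_hz: float) -> float:
    """Q of a series R-L model: Q = omega * L / R."""
    return 2.0 * math.pi * freq_hz * c.inductance_h / c.series_resistance_ohm


def best_candidate(cands: Sequence[Candidate], target_l_h: float,
                   rel_tol: float, freq_hz: float) -> Optional[Candidate]:
    """Highest-Q candidate whose L lies within rel_tol of the target, else None."""
    feasible = [c for c in cands
                if abs(c.inductance_h - target_l_h) <= rel_tol * target_l_h]
    if not feasible:
        return None
    return max(feasible, key=lambda c: quality_factor(c, freq_hz))


if __name__ == "__main__":
    # Made-up extraction results, for illustration only.
    sweep = [
        Candidate("narrow", 1.00e-9, 2.0),
        Candidate("wide", 1.02e-9, 1.5),
        Candidate("extra-turn", 1.50e-9, 1.0),  # misses the L target
    ]
    pick = best_candidate(sweep, target_l_h=1.0e-9, rel_tol=0.05, freq_hz=3.5e9)
    print(pick.name if pick else "no feasible candidate")
```

Each candidate in such a sweep needs its own extraction run, which is why the slide stresses turnaround time.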

Slide 8: What is VeloceRaptor/X?
- Rapid, high-capacity multi-GHz EM extraction
- Maxwell equations-based RLCk model per metal segment
- Inductance calculations based on magnetic vector potential
  – Skin and proximity effects, substrate losses, capacitive and magnetic coupling
- Silicon-proven accuracy
- Use model:
  – In situ selection of nets and pin definition
  – Netlist and symbol creation for the marked nets
  – Model annotation and simulation

Slide 9: VeloceRaptor/X Offers…
- High capacity and speed
- Multithreading support
- S-parameters and RLCk netlist output
  – Temperature-aware model
  – Mixed-mode R/C/RC/RLC/RLCk per any net layer
  – Layout-dependent effects captured
- Direct GDS extraction
- Batch-mode support
- Numerical network reduction

Slide 10: Inductor-over-Grid Model Validation
- Mixed-mode extraction per net layer:
  – M11 to Mx: RLCk
  – Mx-1 to M3: RC
- RLCk extraction below M07 has negligible impact
  – Increasing interconnect density, runtime, memory requirement
  – No improvement in model accuracy when adding more RLCk layers
- Table columns: Metals, Density (µm/µm²), Extraction Time (sec), RAM (MB), Netlist Size (KB)
- Table rows (cell values truncated in the transcript):
  – M11-M10: RLCk, 3.12E
  – M11-M9: RLCk, 5.78E
  – M11-M8: RLCk, 1.34E
  – M11-M7: RLCk, 2.27E
  – M11-M6: RLCk, 2.93E
  – M11-M5: RLCk, 3.85E
- Callout: Best tradeoff between model accuracy and runtime/memory requirements
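
The layer-selection argument on this slide (stop adding RLCk layers once the model no longer changes) can be sketched as a convergence check against the deepest extraction. Everything below is illustrative: the function name, the 2% tolerance, the generic "top N layers" labels, and the inductance values are placeholders invented for the example and are not the slide's table data.

```python
from typing import Optional, Sequence, Tuple


def shallowest_converged(results: Sequence[Tuple[str, float]],
                         rel_tol: float) -> Optional[str]:
    """Pick the cheapest extraction that already matches the deepest one.

    results is ordered from fewest to most RLCk layers; the last entry is
    treated as the reference. Returns the first label from which every
    value stays within rel_tol of the reference, or None if results is empty.
    """
    if not results:
        return None
    reference = results[-1][1]
    for i, (label, _) in enumerate(results):
        if all(abs(v - reference) <= rel_tol * abs(reference)
               for _, v in results[i:]):
            return label
    return None


if __name__ == "__main__":
    # Placeholder inductance values in nH, NOT measured or extracted data.
    sweep = [
        ("top 2 layers", 1.30),
        ("top 3 layers", 1.18),
        ("top 4 layers", 1.08),
        ("top 5 layers", 1.01),
        ("top 6 layers", 1.005),
        ("top 7 layers", 1.00),
    ]
    print(shallowest_converged(sweep, rel_tol=0.02))
```

In practice the same check would be applied to Q as well as L, and the chosen stack would be weighed against the runtime and memory columns of the table.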

Slide 11: Turnaround Time vs. Metal Density

Slide 12: Test Chip Silicon Validation
- Very good agreement between measured and extracted L and Q

Slide 13: Conclusions
- Resonant clocking reduces global clock distribution power
- Use of multiple distributed on-chip inductors poses a significant challenge to inductor extraction
  – Metal-rich extraction environment
  – Significant mutual inductance with underlying and adjacent circuits and power grids
- Exploiting design structure and VeloceRaptor/X capabilities enabled efficient inductor optimization
- Batch mode and per-metal, per-net extraction produced a model with sufficient detail to accurately capture silicon behavior