On-chip power distribution in deep submicron technologies

Presentation transcript:

On-chip power distribution in deep submicron technologies
Aida Todri, Electrical and Computer Engineering Department, University of California, Santa Barbara

Outline
Introduction
Problem Statement and Formulation
Electromigration (EM) Phenomena in Power Gated Networks: EM Analysis and Grid Optimization
Decoupling Capacitor Efficiency in Power Networks: Metrics and Placement
Power Supply Noise Reduction in Multi-core Systems: Power vs. Performance Trade-offs
Conclusions

Technology Scaling
Advantages: increasing device count (higher transistor density); increasing logic switching speed (increasing clock frequencies).
Disadvantages: increasing internal capacitance; increasing leakage current (higher standby power); increasing dynamic power (larger transient currents).
Smaller devices are intrinsically faster because the carriers have to travel a shorter distance.
[Figure: leakage components in a transistor: (1) subthreshold current, (2) junction leakage, (3) gate tunnelling current, (4) gate-induced drain leakage.]

On-Chip Power Delivery Network
Hierarchical mesh structure on several metal layers: the global grid occupies the top two layers of the chip, and local (block) grids occupy lower metal layers.
The network must satisfy reliability constraints.
In DC (steady-state) conditions: the voltage drop (IR) must be within margins, and the current density in the power tracks must not exceed the allowed current density.
In AC (transient) conditions: the power supply noise must be within margins.
Decaps may be inserted to suppress power supply noise and to lower the impedance of the power tracks.

Power Gating Technique
Low-power strategy: idle blocks can be disconnected from the grid, which eliminates their static power. A sleep transistor controls the sleep and wake-up modes of the gated block.
The leakage components are subthreshold, junction, gate, and drain leakage. These currents significantly increase the leakage power. In deep submicron technologies, idle blocks can draw a considerable amount of leakage current, which contributes to the total power consumption. An effective technique to eliminate standby power is power gating. It is implemented with an additional transistor: a PMOS for power gating (header) or an NMOS for ground gating (footer). The sleep transistor is a device with a high threshold voltage and a thick gate oxide to reduce its leakage current. Power gating is beneficial if there are considerable idle periods during which the circuit is not needed. However, sleep transistors come at a cost: they introduce a voltage drop, consume power, occupy area, and cause an energy overhead during switching. Thus, the idle times must be long enough to save at least the energy consumed by power switching.
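The break-even condition stated in the notes above can be made concrete with a small sketch. It assumes a lumped model in which the leakage power saved while gated is constant and one sleep/wake cycle costs a fixed energy overhead; the function names, parameters, and numbers are illustrative and do not come from the presentation.

```python
def break_even_idle_time(switch_energy_j, leakage_power_w,
                         gated_leakage_power_w=0.0):
    """Shortest idle time (s) for which power gating saves net energy.

    Assumption (hypothetical lumped model): leakage power is constant while
    idle, and one sleep/wake cycle costs a fixed energy overhead.
    """
    saved_power_w = leakage_power_w - gated_leakage_power_w
    if saved_power_w <= 0:
        raise ValueError("gating must reduce leakage power")
    return switch_energy_j / saved_power_w


def gating_pays_off(idle_time_s, switch_energy_j, leakage_power_w,
                    gated_leakage_power_w=0.0):
    """True when the idle period is longer than the break-even time."""
    t_min = break_even_idle_time(switch_energy_j, leakage_power_w,
                                 gated_leakage_power_w)
    return idle_time_s > t_min


# Made-up numbers: 2 nJ switching overhead, 1 mW of leakage saved.
t_min = break_even_idle_time(2e-9, 1e-3)  # roughly 2 microseconds
```

With these illustrative values, an idle period of 5 microseconds would justify gating the block, while an idle period of 1 microsecond would not.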

Power Gating Technique
The top layer is the global grid, designed to satisfy the reliability constraints (EM and IR) when all circuits are switching. Each block has its local power mesh. Many power gating configurations exist.
The top grid has to satisfy the reliability constraints in terms of current density and voltage drop. Previous works consider power grid reliability for a single configuration of blocks (no power gating) and do not consider the problems that power gating imposes on the grid. The contribution of our work is the analysis of the grid when power gating is applied to some of the blocks, considering the integrity of the grid across the various power gating configurations. This kind of work has not been done before, and the result is somewhat counterintuitive: power gating reduces the overall current flow on the grid, yet at some locations the grid experiences current crowding, which can cause EM violations.

Research Topics of Interest - 1
Designing the power grid for power-gated chips.
Power grids are typically designed at the early stages of the design process and are mostly over-designed, causing a large overhead in chip power consumption. Power gating is not considered during the design of power grids.

On-chip Power Delivery for Power Gated Chips
Objective: deliver power to the circuit blocks while satisfying the reliability constraints in the power grid when power gating is applied.
Power tracks are not ideal and have finite resistance. Many configurations of operating blocks are possible.

Electromigration Mechanisms
Transport of metal atoms under the force of an electron flux, driven by high current density stress. Depletion or accumulation of metal from the atomic flow can form voids and hillocks in metal lines, which lead to open-circuit and short-circuit faults.
[Figure: voids, grain boundaries, and hillocks in a metal line. Photo courtesy of University of Notre Dame.]

Electromigration on Power Gated Grids
[Figure: grid branch currents before power gating and after power gating.]
EM violations may occur only on those branches where base currents flow in opposite directions.
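The statement about opposite base currents can be illustrated with a short sketch. It assumes a linear resistive grid in which each block acts as a current source, so that by superposition a branch current is the sum of per-block base currents and gating a block removes its term. The function names and numbers are hypothetical.

```python
from itertools import product


def worst_case_branch_current(base_currents):
    """Largest |branch current| over all power gating configurations.

    Assumes superposition: the branch current is the sum of the base
    currents of the blocks that are switched on.
    """
    positive = sum(i for i in base_currents if i > 0)
    negative = -sum(i for i in base_currents if i < 0)
    return max(positive, negative)


def brute_force_worst_case(base_currents):
    """Same quantity, found by enumerating every on/off configuration."""
    worst = 0.0
    for config in product((0, 1), repeat=len(base_currents)):
        current = sum(on * i for on, i in zip(config, base_currents))
        worst = max(worst, abs(current))
    return worst


# Base currents (mA) contributed by three blocks to one branch.
opposing = [3.0, -2.0, 1.0]   # block 2 pushes current the other way
aligned = [3.0, 2.0, 1.0]     # all blocks push current the same way

all_on_opposing = abs(sum(opposing))                    # 2.0 mA
worst_opposing = worst_case_branch_current(opposing)    # 4.0 mA
worst_aligned = worst_case_branch_current(aligned)      # 6.0 mA
```

When every base current has the same sign, the all-on configuration is already the worst case, so gating cannot create a new violation on that branch. When signs are mixed, gating the opposing block raises the branch current above its all-on value.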

IR Drop Analysis for Power Gating
Theorem 1: The grid node voltages can only increase when a current source is turned off.
Corollary: The IR drop can only decrease when power gating is applied.
Theorem 2: Uniform track resizing of a resistive grid does not change the current flow.
Corollary: Uniform upsizing does not change the currents on a grid, so we can always upsize tracks to meet the EM and IR constraints.
Uniform upsizing by ψ* guarantees that all EM and IR constraints are satisfied for all power gating configurations. This is a standard way to fix the grid for all constraints, and the experiments are reported relative to ψ*. Mention the Boyd paper on fixing the IR drop.
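Both theorems can be checked numerically on a toy grid. The sketch below is a minimal nodal analysis in plain Python, assuming a 3x3 resistive mesh with uniform branch conductance, one ideal Vdd pad at node 0, and blocks modeled as constant current sinks; all names and values are illustrative rather than taken from the presentation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def grid_edges(rows, cols):
    """Branches of a rows x cols mesh, nodes numbered row by row."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            n = r * cols + c
            if c + 1 < cols:
                edges.append((n, n + 1))
            if r + 1 < rows:
                edges.append((n, n + cols))
    return edges


def ir_drops(n_nodes, edges, g, loads, pad=0):
    """IR drop (Vdd - V) at each node; the pad node is held at Vdd."""
    idx = [k for k in range(n_nodes) if k != pad]
    pos = {k: i for i, k in enumerate(idx)}
    a = [[0.0] * len(idx) for _ in idx]
    for u, v in edges:
        for x, y in ((u, v), (v, u)):
            if x != pad:
                a[pos[x]][pos[x]] += g
                if y != pad:
                    a[pos[x]][pos[y]] -= g
    d = solve(a, [loads.get(k, 0.0) for k in idx])
    drops = [0.0] * n_nodes
    for k, i in pos.items():
        drops[k] = d[i]
    return drops


def branch_currents(edges, g, drops):
    """Current from u to v on each branch (positive when V_u > V_v)."""
    return [g * (drops[v] - drops[u]) for u, v in edges]


edges = grid_edges(3, 3)
g = 10.0                          # branch conductance in siemens
loads = {4: 0.02, 8: 0.05}        # block currents in amperes
all_on = ir_drops(9, edges, g, loads)
gated = ir_drops(9, edges, g, {8: 0.05})       # block at node 4 gated off
psi = 1.5                                      # uniform upsizing factor
upsized = ir_drops(9, edges, g * psi, loads)   # wider tracks, same loads
```

In this toy model, `gated` is never larger than `all_on` at any node (Theorem 1), and the branch currents computed from `upsized` match those from `all_on` while every drop shrinks by the factor `psi` (Theorem 2).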

Power-Gating Aware Optimization
We reduce the complexity of the optimization problem by coarsening the grid granularity with the multi-grid technique. Our optimization scheme has three main steps:
1. Reduce the grid size by folding tracks.
2. Optimize the reduced grid.
3. Unfold the grid to its original granularity.

1. Grid Folding
Identify a few neighbor tracks around each violation that remain unfolded, to maintain the conditions in which the violation exists.

2. Reduced Grid Optimization
A three-step iterative process (3-step LP):
1. Derive current and voltage sensitivities to grid sizing.
2. Uniformly upsize the grid by fine-scale upsizing steps {ψ1, ψ2, …, ψr}.
3. Shrink the selected tracks.
The process is repeated until no violations exist.
[Figure: original grid; upsizing by ψi from {ψ1, ψ2, …, ψr}; shrinking of selected tracks.]
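The uniform upsizing step alone can be sketched as a search over the candidate coefficients. The sketch relies on the scaling implied by Theorem 2 of the IR drop analysis: with fixed load currents, uniform upsizing by ψ leaves branch currents unchanged, so both the worst current density and the worst IR drop scale by 1/ψ. The sensitivity derivation and the shrinking LP are not shown, and the function name, arguments, and numbers are hypothetical.

```python
def smallest_feasible_psi(psis, j_worst, j_max, drop_worst, drop_max):
    """Smallest candidate psi that clears the worst EM and IR violations.

    Assumes uniform upsizing by psi divides both the worst current density
    and the worst IR drop by psi (load currents held fixed).
    Returns None if no candidate is large enough.
    """
    for psi in sorted(psis):
        em_ok = j_worst / psi <= j_max
        ir_ok = drop_worst / psi <= drop_max
        if em_ok and ir_ok:
            return psi
    return None


# Made-up worst-case values over all gating configurations:
# current density normalized to J_max, IR drop in volts (10% of a 1 V supply).
psi = smallest_feasible_psi(
    psis=[1.1, 1.2, 1.3, 1.5],
    j_worst=1.25, j_max=1.0,
    drop_worst=0.12, drop_max=0.10,
)
```

Here the EM constraint needs ψ of at least 1.25 and the IR constraint needs ψ of at least 1.2, so the first candidate that satisfies both is 1.3. A finer set of candidates wastes less area before the shrinking step begins.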

LP Problem
Minimize the total resizing of the grid, subject to three constraints: current density, voltage drop, and resizing coefficients.

3-step Iterative LP Algorithm
[Flowchart: start from the initial optimized grid for all sources on; run the computations for the power gating configurations; check for an EM violation (J_VB > J_max) or an IR violation (V_node < 0.9 V_DD); if there is none (N), the grid is feasible; otherwise (Y), upsize the grid by the upsizing coefficient ψ, using the finer-scale coefficients ψi, and then shrink the grid.]

3. Grid Unfolding
Because we considered only the worst-case violations on the grid, minor violations are possible after optimization and unfolding. These violations are minuscule and can be fixed by greedily upsizing the track with the violation.

Experiments - Floorplans
[Figure: two floorplans, each shown with its power gating configurations. In the first, high current density blocks are located in the center of the grid; in the second, low/medium current density blocks are located in the center of the grid. Legend: high current density blocks, low/medium current density blocks, gated blocks.]

Results
Experiments to observe: various current density blocks (high, medium, low); various power grid granularities (20x20, 30x30, 50x50, 100x100); all vs. some power gating configurations.
Area savings compared to uniform upsizing: up to 48%, for the 100x100 granularity grid with high density blocks placed at the center of the grid.

Decoupling Capacitor vs. PSN
Inserted decoupling capacitors (decaps) can provide charge to the switching circuit to reduce power supply noise (PSN). Decaps consume power due to switching. PSN suppression depends on decap efficiency.

Research Topics of Interest - 2
How to use decoupling capacitors most efficiently?
A decoupling capacitor is a reservoir of charge, used to reduce the voltage drop at the switching current load. The amount of charge supplied depends on: the parasitic conductance between the decap and the current load; the parasitic conductance between the decap and the power supply; and the switching frequency of the current load.
[Figure: capacitor supplying charge through the interconnect to the current load.]
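As a baseline for the efficiency question, the idealized charge-reservoir relation C = Q / ΔV can be sketched as follows. The sketch assumes a triangular current pulse and a decap that supplies all of the charge with no parasitics between it and the load, which is exactly the ideal case the items above say real grids depart from. The function names and numbers are illustrative.

```python
def pulse_charge(i_peak_a, t_rise_s, t_fall_s):
    """Charge (C) drawn by a triangular current pulse (area of the triangle)."""
    return 0.5 * i_peak_a * (t_rise_s + t_fall_s)


def ideal_decap(charge_c, allowed_droop_v):
    """Capacitance (F) that supplies charge_c with at most allowed_droop_v.

    Idealization: C = Q / dV, ignoring every parasitic between the decap,
    the current load, and the power supply.
    """
    if allowed_droop_v <= 0:
        raise ValueError("allowed droop must be positive")
    return charge_c / allowed_droop_v


# Made-up load: 100 mA peak, 50 ps rise and fall, 100 mV allowed droop.
q = pulse_charge(0.1, 50e-12, 50e-12)   # about 5 pC
c = ideal_decap(q, 0.1)                 # about 50 pF
```

A decap placed behind significant interconnect impedance delivers less than this ideal charge during the transient, which is what the efficiency metrics are meant to capture.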

Decoupling Capacitance Effectiveness
Decoupling capacitors suppress power supply noise. Decaps reduce the impedance of a power delivery system operating at high frequencies.
The efficacy of decoupling capacitors depends upon: the impedance of the conductors connecting the capacitor to the current loads and power sources; and the charge-back ability after a transition is completed.

Decap Effectiveness in Mesh Grids
[Figure: four 16-node mesh circuits: (a) original mesh, (b) mesh A circuit, (c) mesh B circuit, (d) mesh C circuit.]

Decap Effectiveness on Mesh Grids Detrimental decoupling capacitance.

Decap Effectiveness in Mesh Grids
[Figure: (c) mesh B circuit.]
Ineffective decoupling capacitance.

Decap Effectiveness in Mesh Grids
[Figure: (d) mesh C circuit.]
Effective decoupling capacitance.

Mesh Analysis
Decap effectiveness depends upon:
Zd: this impedance determines how fast Cdecap is recharged.
Zs: this impedance determines how much voltage drop appears at the switching circuit.
Zsd: this impedance determines how much current (charge) Cdecap can provide to the switching circuit.
tr, tf, Ipeak: switching frequency and current magnitude.
Cdecap: decap size.

Decap's effectiveness metrics
a: effective distance between the decap and the Vdd pin.
b: effective distance between the current source and the decap.
u: minimum distance between the decap and the Vdd pin to avoid spurious switching.

Decap Effectiveness Model

Decap Budget: Optimization Function
LP optimization problem, subject to: voltage drop margin; charge transfer balance; allowed cap constraint; efficiency metrics constraints.

Sequence of Linear Programs
Cdecapi depends on the node voltage Vi; both Cdecapi and Vi are variables.
Sequence of linear programs:
1. Perform an initial transient analysis with the existing decaps and solve for the Vi's.
2. Determine the decap budgets Cdecapi from the LP formulation, using the node voltages determined in step 1.
3. Re-perform the transient analysis with Cdecapi to check the node voltages, and update the node voltages Vi.
4. Check if Vi > Vthresh. If Vi > Vthresh + σ, run the decap budget (step 2) to reduce decaps. If Vi < Vthresh − σ, run the decap budget (step 2) to allocate more decaps.
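The control flow of this analyze, budget, and re-check loop can be sketched for a single node. This is not the LP from the slides: the budgeting step is replaced by a plain step adjustment that halves whenever the direction reverses, and the transient analysis is replaced by a hypothetical one-line droop model V = Vdd − Q / (C_intrinsic + C_decap). All names and numbers are assumptions for illustration.

```python
def node_voltage(vdd, charge, c_intrinsic, c_decap):
    """Hypothetical stand-in for transient analysis: worst node voltage."""
    return vdd - charge / (c_intrinsic + c_decap)


def budget_decap(vdd, charge, c_intrinsic, v_thresh, sigma,
                 step=16e-12, max_iter=50):
    """Adjust decap until the node voltage lies within v_thresh +/- sigma.

    The step adjustment below stands in for the LP budgeting step.
    """
    c_decap = 0.0
    last_dir = 0
    for _ in range(max_iter):
        v = node_voltage(vdd, charge, c_intrinsic, c_decap)   # re-analyze
        if v < v_thresh - sigma:
            direction = 1            # too much droop: allocate more decap
        elif v > v_thresh + sigma and c_decap > 0.0:
            direction = -1           # excess margin: reclaim decap
        else:
            break                    # inside the band, or nothing to reclaim
        if last_dir and direction != last_dir:
            step /= 2                # overshoot: refine the step
        c_decap = max(0.0, c_decap + direction * step)
        last_dir = direction
    return c_decap, node_voltage(vdd, charge, c_intrinsic, c_decap)


# Made-up node: 1 V supply, 5 pC per switching event, 20 pF intrinsic cap,
# target 0.9 V with a 5 mV tolerance band.
c_decap, v_final = budget_decap(1.0, 5e-12, 20e-12, 0.9, 0.005)
```

The tolerance band σ is what lets the loop terminate: without it, the budget would keep oscillating between adding and removing small amounts of decap around Vthresh.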

Case Study
Courtesy of STMicroelectronics

Experiments

Experiments ???

Experiments
Total Decap Reduction
Total amount of decap reduced on chip: 297 pF (5.56%)
Number of filler cells reduction (placed decaps): 297 pF out of 623 pF => 52%
Correlations
Case study             Max IR drop (mV)    Power (W)
Apache's Redhawk       51.8                0.645
Our method (before)    43.1                0.660
Our method (after)     43.7

Multi-Core System
Several cores are integrated on a chip. Chips with several cores have been produced, and tens to hundreds of cores per chip are envisioned.
Physical design problems: thermal management, power management, power delivery, noise control, …

Research Topics of Interest - 3
How to suppress power supply noise?
Sources: fast transient currents of switching blocks; turning power gated blocks on and off; parasitic impedance of the power tracks (package).
Detrimental effects: increased circuit delay; logical faults due to the increased delay.

Multi-Core Systems
Objective: assign tasks to cores such that minimum power supply noise is generated.
In this work we study the impact of various workloads on the global grid and on the generated power supply noise.
Assumptions: shared global grid; uniform distribution of controlled collapse chip connections (C4s).

PSN vs. Workload Assignments
[Figure: 3x3 array of cores numbered 1 to 9.]
The same workload produces different PSN depending on the core assignment. Assignment 1-2-4 has less available decap than 1-5-9.
Relationships studied: PSN vs. proximity between working cores; PSN vs. available decap; PSN vs. operating frequencies.

Grid Models
[Figure: base grid, global grid, and core grid.]

Circuit Reduction
Reducing the base grid (a) to a simplified model (b). We are interested in the worst node voltage of the base grid (figure a), which is also represented by the simplified model (figure b). The circuit voltage response is maintained for the worst-case voltage drop.
Assumption: the worst-case voltage drop is on node 5.

Power Supply Noise Aware Assignment
We apply a simulated annealing (SA) based algorithm to minimize PSN. A workload can be assigned to any core. Task assignments on cores vary by:
Location: the same task at different locations.
Frequency: the same location but varying workloads.
Location and frequency.
We use the frequency response of each frequency group to explore the trade-off between power supply noise, performance, and workload assignment.
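A simulated annealing loop for the location part of this problem can be sketched as below. The cost function is a hypothetical stand-in for PSN (large currents on nearby cores cost more); the presentation's actual noise model, frequency groups, and annealing schedule are not given, so every name and parameter here is an assumption.

```python
import math
import random


def manhattan(a, b, cols):
    """Manhattan distance between two cores on a row-major grid."""
    ra, ca = divmod(a, cols)
    rb, cb = divmod(b, cols)
    return abs(ra - rb) + abs(ca - cb)


def noise_proxy(assign, currents, cols):
    """Hypothetical PSN stand-in: pairwise current product over distance."""
    cost = 0.0
    for a in range(len(assign)):
        for b in range(a + 1, len(assign)):
            d = manhattan(assign[a], assign[b], cols)
            cost += currents[a] * currents[b] / (1.0 + d)
    return cost


def anneal_assignment(currents, rows, cols, iters=2000,
                      t0=1.0, alpha=0.995, seed=0):
    """Return (best assignment, best cost); assign[t] is the core of task t."""
    rng = random.Random(seed)
    n_cores = rows * cols
    assign = list(range(len(currents)))       # task t starts on core t
    cost = noise_proxy(assign, currents, cols)
    best, best_cost = assign[:], cost
    temp = t0
    for _ in range(iters):
        cand = assign[:]
        t = rng.randrange(len(cand))
        core = rng.randrange(n_cores)
        if core in cand:                      # occupied: swap the two tasks
            j = cand.index(core)
            cand[t], cand[j] = cand[j], cand[t]
        else:                                 # free: move the task there
            cand[t] = core
        c = noise_proxy(cand, currents, cols)
        # Metropolis rule: accept improvements, sometimes accept worse moves.
        if c <= cost or rng.random() < math.exp((cost - c) / temp):
            assign, cost = cand, c
            if cost < best_cost:
                best, best_cost = assign[:], cost
        temp *= alpha                         # geometric cooling
    return best, best_cost


currents = [0.5, 0.4, 0.1]                    # made-up workload currents (A)
start_cost = noise_proxy([0, 1, 2], currents, 3)
best, best_cost = anneal_assignment(currents, 3, 3)
```

Because the best assignment seen so far is tracked separately from the current state, the returned cost is never worse than the cost of the initial packed assignment.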

Assignment Heuristics
Current Demand-Based Assignment (CDA): workloads are assigned to cores that are farther away from large-current workloads, to minimize noise propagation.
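One possible greedy reading of CDA is sketched below: workloads are placed in order of decreasing current demand, and each goes to the free core with the largest current-weighted distance from the workloads already placed. The presentation does not specify the exact placement rule, so the scoring function, the tie-breaking, and the names are assumptions.

```python
def manhattan(a, b, cols):
    """Manhattan distance between two cores on a row-major grid."""
    ra, ca = divmod(a, cols)
    rb, cb = divmod(b, cols)
    return abs(ra - rb) + abs(ca - cb)


def cda_assign(currents, rows, cols):
    """Greedy placement: returns {task index: core index}.

    Assumed rule: maximize the sum of (current of placed task) x (distance),
    so a new workload lands far from the largest-current workloads.
    Ties go to the lowest-numbered free core.
    """
    if len(currents) > rows * cols:
        raise ValueError("more workloads than cores")
    order = sorted(range(len(currents)), key=lambda t: -currents[t])
    placed = {}
    free = list(range(rows * cols))
    for task in order:
        def score(core):
            return sum(currents[other] * manhattan(core, where, cols)
                       for other, where in placed.items())
        chosen = max(free, key=score)
        placed[task] = chosen
        free.remove(chosen)
    return placed


# Two made-up workloads on a 3x3 array: the larger one takes core 0 and the
# smaller one is pushed to the opposite corner.
placement = cda_assign([0.5, 0.2], 3, 3)
```

Unlike the annealing approach, this heuristic makes one pass and never revisits a placement, which makes it fast but dependent on the placement order.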

Experiments
Experiments to observe: various core granularities (3x3, 5x5, 7x7, 10x10); various operating frequencies; various core sizes; impact of the initial task assignment on the multicore system.
Results: with no initial assignment, up to 30% less PSN compared to the CDA method; with an initial assignment, up to 37% less PSN compared to the CDA method.

Conclusions
On-chip power distribution for low-power applications.
Power gating induced electromigration issues in the power networks: analysis and optimization of the power network.
Analysis of decoupling capacitance efficiency in power grids: decoupling capacitance placement in power networks.
Low power supply noise task assignment for multicore systems: analysis of the multicore system power network; task assignment optimization for low power supply noise.