An Algorithm to Minimize Leakage through Simultaneous Input Vector Control and Circuit Modification Nikhil Jayakumar Sunil P. Khatri Presented by Ayodeji.

Presentation transcript:

An Algorithm to Minimize Leakage through Simultaneous Input Vector Control and Circuit Modification
Nikhil Jayakumar, Sunil P. Khatri
Presented by Ayodeji Coker
Texas A&M University, College Station, TX, USA

Contribution of Leakage Power
- Leakage is a major contributor to total power consumption.
- “Standby / Sleep” leakage reduction is crucial for portable electronics.
- Some popular techniques are:
  - MTCMOS / sleep transistor
  - Body biasing
  - Input Vector Control (IVC)

Intuition Behind Input Vector Control
- Stack effect: the more series transistors that are cut off, the lower the leakage.
  - Leakage can be about 2 orders of magnitude lower than the maximum.
- Not every gate can be set to its minimum leakage state, because of logical interdependencies.
  - NAND3: min leakage state = 000
  - NOR3: min leakage state = 111
[Table: Leakage of a NAND3 gate. Columns: Input, Leakage (A).]
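The interdependency can be seen on a two-gate toy case. This is a minimal sketch, assuming ideal Boolean gates and the minimum-leakage states quoted on the slide; the function names and the chained circuit are illustrative and do not come from the presentation.

```python
def nand3(a, b, c):
    """Ideal 3-input NAND."""
    return 0 if (a and b and c) else 1


def nor3(a, b, c):
    """Ideal 3-input NOR."""
    return 0 if (a or b or c) else 1


NAND3_MIN_STATE = (0, 0, 0)  # minimum-leakage input state, from the slide
NOR3_MIN_STATE = (1, 1, 1)   # minimum-leakage input state, from the slide

# Hypothetical chain: a NAND3 whose output drives one input of another gate.
driver_out = nand3(*NAND3_MIN_STATE)

# The driver sits in its minimum-leakage state, but its output is 1,
# so a NAND3 driven by it can never see 000 on its inputs.
driven_nand_can_reach_min = (driver_out == NAND3_MIN_STATE[0])

# A NOR3 in the same position wants 1s, so this particular input suits it.
driven_nor_input_ok = (driver_out == NOR3_MIN_STATE[0])
```

Putting one gate in its best state fixes its output, and that output may be the wrong value for the gates it drives.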

Traditional Input Vector Control
- Find the Minimum Leakage Vector (MLV) at the primary inputs.
  - NP-hard problem.
  - Several heuristics exist to find a near-optimal MLV.
- Apply the inputs through the scan chain or through MUXes at the primary inputs (flip-flop outputs) during standby / sleep.
- Can we do more?
  - Why restrict ourselves to only the primary inputs?
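To make the MLV search concrete, here is a hedged sketch of an exhaustive search on a toy circuit of three NAND2 gates (G and H driving J). The circuit, the function names and the leakage numbers are all assumptions made for illustration; the numbers are invented placeholders in arbitrary units, not values from the presentation.

```python
from itertools import product

# Placeholder NAND2 leakage per input state, in arbitrary units.
# Invented for illustration; NOT taken from the slides.
NAND2_LEAK = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 3.0, (1, 1): 9.0}


def nand2(a, b):
    """Ideal 2-input NAND."""
    return 0 if (a and b) else 1


def total_leakage(vector):
    """Leakage of the toy circuit G = NAND2(a, b), H = NAND2(c, d), J = NAND2(G, H)."""
    a, b, c, d = vector
    g, h = nand2(a, b), nand2(c, d)
    return NAND2_LEAK[(a, b)] + NAND2_LEAK[(c, d)] + NAND2_LEAK[(g, h)]


def exhaustive_mlv():
    """Try all 2**4 primary-input vectors and return (vector, leakage) with the least leakage."""
    candidates = ((v, total_leakage(v)) for v in product((0, 1), repeat=4))
    return min(candidates, key=lambda pair: pair[1])


best_vector, best_leak = exhaustive_mlv()
```

The number of candidate vectors doubles with every primary input, which is why heuristics are used on real circuits. With these placeholder numbers the best vector puts G and H in state 00, which forces J into state 11; controlling only the primary inputs cannot avoid that.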

Previous Approaches
- “Leakage Current Reduction in CMOS VLSI Circuits by Input Vector Control” (TVLSI ’04), Abdollahi et al.
  - Similar to our approach: uses control points and IVC.
  - Our choice of gate variants allows greater flexibility at control points.
- “Enhanced Leakage Reduction by Gate Replacement” (DAC ’05), Yuan et al.
- “A Fast Simultaneous Input Vector Generation and Gate Replacement Algorithm for Leakage Power Reduction” (DAC ’06), Cheng et al.
  - Use gate replacement like we do, but a gate G is replaced by a gate G’ to reduce the leakage of G itself, not to control internal nodes.
- Previous approaches incur a delay penalty to get a reasonable leakage reduction.
  - We get a significant leakage reduction with no expected delay penalty.

Our Approach - Overview
- Modify the circuit so that internal nodes of the circuit can be controlled.
- Create variants of each gate that can replace the original.
- Traverse the circuit from inputs to outputs and replace gates along the way.
  - Reduce leakage through the stack effect in the gates in the fanout of the replaced gate.
  - The leakage of the replaced gate itself does not necessarily go down.
- Perform gate replacement such that leakage is reduced but circuit delay is not increased.

Variants of a Gate
- Regular NAND2
- sngl1out 0: used when the output of the gate is 1 in standby, but all the fanout gates require an output of 0.
- sngl1out 1: used when the output of the gate is 0 in standby, but all the fanout gates require an output of 1.
- snglmx 0: used when the output of the gate is 1 in standby, but some fanout gates require an output of 0.
- snglmx 1: used when the output of the gate is 0 in standby, but some fanout gates require an output of 1.
- dbl variants: larger counterparts of the sngl variants (devices sized < 2X).
  - Add more flexibility to the choices for replacement.
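The four sngl rules above amount to a small decision table. The sketch below encodes only those rules; the function name and argument shapes are hypothetical, and the timing check and the dbl variants are deliberately left out.

```python
def choose_variant(standby_output, fanout_requirements):
    """Return the sngl variant family named on the slides, or None to keep the regular gate.

    standby_output: 0 or 1, the gate's output under the standby input vector.
    fanout_requirements: list with the value (0 or 1) each fanout gate requires.
    """
    wanted = 1 - standby_output  # the opposite of what the gate outputs today
    needing_change = [r for r in fanout_requirements if r == wanted]
    if not needing_change:
        return None  # every fanout gate is already satisfied
    if len(needing_change) == len(fanout_requirements):
        return f"sngl1out {wanted}"  # all fanout gates require the other value
    return f"snglmx {wanted}"  # only some fanout gates require the other value
```

For instance, `choose_variant(1, [0, 0])` returns `"sngl1out 0"` and `choose_variant(1, [0, 1])` returns `"snglmx 0"`.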

The Gate Replacement Algorithm
- Assume the inputs of gates at the first level can be set independently.
  - Gates at the first level can all be set to their minimum leakage state.
- Pick a gate G from the first level. Let g be its output signal.
- Find what value all gates in the fanout of G require.
- Try to replace the gate if there is a net savings in leakage and there is no timing violation.

Example
[Figure: gates G and H, both driving gate J]
- First set gate G to its lowest leakage state, 00.
- Next look at the fanout of gate G; gate J is in its fanout.
  - If the output of G = 1 (the current value), the best state possible at J is 10 (choose from 10, 11).
  - If the output of G could be set freely, the best state possible for J is 00 (choose from 00, 01, 10, 11).
  - Leakage improvement possible = (Leakage of J at state 10 - Leakage of J at state 00 - Leakage cost of replacing gate G with a sngl1out 0 variant).

Example (continued)
- Next set gate H to its lowest leakage state, 00.
- Then look at the fanout of gate H; gate J is in its fanout.
  - If the output of H = 1 (the current value), the best state possible at J is 01; 01 is the only choice.
  - If the output of H could be set freely, the best state possible for J is 00 (choose from 00, 01).
  - Leakage improvement possible = (Leakage of J at state 01 - Leakage of J at state 00 - Leakage cost of replacing gate H with a sngl1out 0 variant).
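The improvement computed in the example can be written as a one-line function. This sketch takes the sign so that a positive result means leakage goes down, which matches the rule that a replacement is made only for a net savings. The function name and all numbers are assumptions; the leakage values are invented placeholders in arbitrary units.

```python
def leakage_improvement(leak_now, leak_best, replacement_cost):
    """Net leakage saved in a fanout gate if its driver is replaced.

    leak_now:  leakage of the fanout gate in the state it is currently limited to
    leak_best: leakage of the fanout gate in the better state the replacement allows
    replacement_cost: extra leakage of the variant over the gate it replaces
    """
    return leak_now - leak_best - replacement_cost


# Placeholder NAND2 leakage per state (invented, NOT from the slides).
leak_j = {"00": 1.0, "01": 4.0, "10": 3.0, "11": 9.0}
cost_sngl1out_0 = 0.5  # invented cost of swapping in a sngl1out 0 variant

# Replacing G moves J from state 10 to state 00.
gain_from_g = leakage_improvement(leak_j["10"], leak_j["00"], cost_sngl1out_0)
replace_g = gain_from_g > 0
```

If the cost of the variant exceeded the difference between the two states of J, the result would be negative and the gate would be left alone.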

The Gate Replacement Algorithm (continued)
- If both logic 0 and logic 1 are required at some node, try snglmx variants.
- If sngl variants cause timing violations, try dbl variants.
  - Use dbl variants only if the leakage improvement is positive.
- Traverse the circuit from inputs to outputs in levelized order.
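The control flow of the traversal can be sketched as a greedy pass. This is a simplified illustration under stated assumptions, not the authors' implementation: `gain_of` and `meets_timing` are hypothetical callbacks standing in for a characterized leakage table and a timing engine, the choice between 1out and mx variants is omitted, and the sketch does not propagate logic values after a replacement.

```python
def replace_gates(gates_in_level_order, gain_of, meets_timing):
    """Greedy pass over a levelized netlist, following the steps on the slides.

    gates_in_level_order: gate names ordered from primary inputs to outputs
    gain_of(gate, size): net leakage saved by swapping `gate` for a variant of `size`
    meets_timing(gate, size): True if that swap causes no timing violation
    Returns a dict mapping each replaced gate to "sngl" or "dbl".
    """
    chosen = {}
    for gate in gates_in_level_order:
        if gain_of(gate, "sngl") <= 0:
            continue  # no net savings, keep the regular gate
        if meets_timing(gate, "sngl"):
            chosen[gate] = "sngl"
        elif gain_of(gate, "dbl") > 0 and meets_timing(gate, "dbl"):
            # sngl violates timing: fall back to dbl, but only for a positive gain
            chosen[gate] = "dbl"
    return chosen


# Toy inputs (all invented): gains per (gate, size) and one timing violation.
gains = {("G", "sngl"): 1.5, ("G", "dbl"): 1.0,
         ("H", "sngl"): 2.0, ("H", "dbl"): 0.8,
         ("J", "sngl"): -0.3, ("J", "dbl"): -0.9}
violations = {("H", "sngl")}

result = replace_gates(["G", "H", "J"],
                       gain_of=lambda g, s: gains[(g, s)],
                       meets_timing=lambda g, s: (g, s) not in violations)
```

In this toy run G takes a sngl variant, H falls back to a dbl variant because its sngl variant violates timing, and J is left unchanged.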

Experimental Results
- Cell library characterization done in SPICE.
  - bsim100 Berkeley Predictive Technology Model (BPTM) cards, 1.2 V VDD.
- Algorithm implemented in Perl.
  - Run on a 3 GHz Pentium 4 with 2 GB RAM, running Fedora Core 3.

Experimental Results
- On average, about 30% (29.18%) improvement in leakage over applying the MLV at primary inputs alone.
- Existing approaches that use IVC and control points to get a similar leakage improvement have a delay penalty of 10 to 15%.
[Table: per-circuit leakage. Columns: Ckt., Min. Lkg Original (nA), New Min. Lkg (nA), % Lkg Decr. Rows: alu, apex, C, dalu, des, i, t and too_large benchmark circuits, plus Avg. (% Lkg Decr. = 29.18).]

Experimental Results
- There is never a delay increase.
- Delay decreases in some instances:
  - due to the use of dbl variants;
  - sngl1out variants improve delay in one transition.
- Runtime is low.
  - The current implementation is in Perl and is expected to speed up when implemented in C/C++.
[Table: per-circuit delay and runtime. Columns: Ckt., Original Delay (ps), New Delay (ps), % Delay Improvement, Runtime (s). Rows: alu, apex, C, dalu, des, i, t and too_large benchmark circuits, plus Avg.]

Experimental Results
- Total active area overhead on average = 24%.
  - Real area overhead would be lower after layout, place and route.
- A lot of the area is used by sleep cut-off transistors.
  - These can be shared, which would reduce area, delay and leakage.
[Table: per-circuit active area. Columns: Ckt., Original Active Area (μ²), New Active Area (μ²), Active Area Ovh (%), Sleep Cut-off Transistor Active Area (μ²), Active Area excluding sleep cut-off transistors (μ²), Active Area Ovh excluding sleep cut-off transistors (%). Rows: alu, apex, C, dalu, des, i, t and too_large benchmark circuits, plus Avg.]

Experimental Results
- dblmx variants did not get used.
- sngl1out variants were used the most.
[Table: per-circuit replacement counts. Columns: Ckt., # sngl1out, # dbl1out, # snglmx, Total # replacements, Total # gates. Rows: alu, apex, C, dalu, des, i, t and too_large benchmark circuits, plus Avg.]

Conclusion
- We extended input vector control to control internal nodes, not just primary inputs.
- 30% leakage decrease with no delay penalty.
  - The leakage decrease is measured against the MLV at primary inputs alone.
  - Delay improves in many cases.
- Active area increase = 24%, but this is mostly sleep cut-off transistor area.
  - Placed and routed area is expected to be much lower.
- Dynamic power is estimated to increase by 1.5% on average.

Thank you
Contact info of authors:
nikhil_AT_ece_DOT_tamu_DOT_edu
sunilkhatri_AT_tamu_DOT_edu