Dynamic and Leakage Power Reduction in MTCMOS Circuits Using an Automated Efficient Gate Clustering Technique Mohab Anis, Shawki Areibi *, Mohamed Mahmoud.

Dynamic and Leakage Power Reduction in MTCMOS Circuits Using an Automated Efficient Gate Clustering Technique Mohab Anis, Shawki Areibi *, Mohamed Mahmoud and Mohamed Elmasry VLSI Research Group, University of Waterloo, Canada * School of Engineering, University of Guelph, Canada

Presentation Outline: Low power design in DSM; Concept of sleep transistors; Previous work; Sizing the sleep transistor; Bin-Packing technique; Set-Partitioning technique; Conclusion and extended work done.

Why Low Power Design? Growing market of mobile and handheld electronic systems. Difficulty in providing adequate cooling: fans create noise and add to cost. Heat dissipation impacts packaging technology and cost. Increasing standby time of portable devices. In DSM regimes, leakage power has become as big a problem as dynamic power.

Concept of sleep transistors: MTCMOS technology is an increasingly popular technique to reduce leakage power. Proper ST sizing is a key issue: increasing the ST size increases area, Pdynamic and Pleakage, while decreasing the ST size increases delay. [Figure: modeling of a sleep transistor as a resistor R carrying current I; an LVT logic block sits on the virtual ground node VX above an HVT sleep transistor gated by SLEEP.]

First Approach [1]: A single ST supports the whole LVT circuit. Interconnect resistance increases for distant blocks, so the ST size is increased to compensate for the added resistance, which increases area, Pdynamic and Pleakage. This is more significant in the DSM regime. [Figure: LVT logic circuit above a single HVT sleep transistor gated by SLEEP.] [1] S. Mutoh et al., "1-V Power Supply High-Speed Digital Circuit Technology with Multi-Threshold Voltage CMOS," IEEE J. of Solid-State Circuits, pp. 847-853, 1995.

Second Approach [2]: A single ST is sized according to a mutually exclusive discharge pattern algorithm. ST assignments are wasteful. Interconnect resistance increases for distant blocks, so the ST size is increased to compensate for the added resistance, which increases Pdynamic and Pleakage. This is more significant in the DSM regime. [Figure: example circuit of gates G1 to G10.] [2] J. Kao et al., "MTCMOS Hierarchical Sizing Based on Mutual Exclusive Discharge Patterns," in Proc. of 35th DAC, pp. 495-500, Las Vegas, 1998.

Sizing the sleep transistor. Objective: constant ST size, causing 5% degradation in circuit speed. $(W/L)_{sleep} = \dfrac{I_{sleep}}{0.05\,\mu_n C_{ox}\,(V_{dd}-V_{tL})(V_{dd}-V_{tH})}$. $I_{sleep}$ is chosen to be 250 µA. $(W/L)_{sleep} \approx 6$ for 0.18 µm CMOS technology, with $V_{tL}$ = 350 mV and $V_{tH}$ = 500 mV.
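As a rough numerical check of the sizing expression above, the sketch below evaluates it in Python. The supply voltage (1.8 V) and the process transconductance mu_n*Cox (300 µA/V²) do not appear on the slide; both are illustrative assumptions, so the result is not expected to reproduce the slide's value of about 6.

```python
def sleep_transistor_ratio(i_sleep, mu_n_cox, vdd, vt_low, vt_high,
                           degradation=0.05):
    """Return (W/L) of the sleep transistor for a given speed degradation.

    i_sleep     : current through the sleep transistor, in amperes
    mu_n_cox    : process transconductance mu_n * Cox, in A/V^2 (assumed value)
    vdd         : supply voltage, in volts (assumed value)
    vt_low      : threshold voltage of the LVT logic, in volts
    vt_high     : threshold voltage of the HVT sleep transistor, in volts
    degradation : tolerated fractional speed loss (0.05 on the slide)
    """
    denom = degradation * mu_n_cox * (vdd - vt_low) * (vdd - vt_high)
    return i_sleep / denom


# Slide values: I_sleep = 250 uA, VtL = 350 mV, VtH = 500 mV.
# Assumed values (not from the slide): Vdd = 1.8 V, mu_n*Cox = 300 uA/V^2.
ratio = sleep_transistor_ratio(250e-6, 300e-6, 1.8, 0.35, 0.50)
```

With these assumed parameters the ratio comes out near 8.8. The expression is linear in the current, so doubling `i_sleep` doubles the required W/L.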

4-bit CLA Adder

Preprocessing of Gate Currents: Random inputs are applied to the CLA adder; the highest current discharge of each gate is monitored and multiplied by the corresponding switching activity. The peak current value, its time of occurrence and its duration are recorded. Currents are combined into a single equivalent current: $I_{eq} = \max\{I_i\}$ when $\sum_i I_i \le \max\{I_i\}$ at every point in time.

Timing Diagram [Figure: timing diagram for gates G1 (F0=2) and G2 (F0=4). I1 (G1) peaks at 65 at T1 = 80 psec with a pulse width of 120 psec; I2 (G2) peaks at 79 at T1+T2 = 210 psec with a pulse width of 260 psec.]
Sampled current vectors (zero elsewhere):
I1 (G1): 11 22 33 43 54 65 54 43 33 22 11
I2 (G2): 6 12 18 24 30 37 43 49 55 61 67 73 79 73 67 61 55 49 43 37 30 24 18 12 6

Preprocessing Heuristic
1. Initialize current vectors.
2. Set all gates free to move to a sub-cluster.
3. For all gates in circuit
     If gate i is not clustered yet
       assign gate i to new cluster k
       update cluster current vector
       calculate max current, start, end time
       For all other gates in circuit
         If (gate j is not clustered yet)
           add current of gate j to cluster k
           If (combination ≤ max current)
             append gate to cluster
             update cluster info
             set gate j locked in cluster k
       End For
4. Return all clusters formed.
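One way to read the heuristic above is sketched below in Python. It assumes each gate's current is a list of samples on a common time grid, and it interprets "max current" as the largest individual peak among the gates considered so far. Both points are assumptions, and the gate names and sample values are made up for illustration.

```python
from typing import Dict, List


def precluster(currents: Dict[str, List[float]]) -> List[dict]:
    """Group gates whose summed current never exceeds the largest single peak."""
    clustered = set()
    clusters = []
    for gate_i, vec_i in currents.items():
        if gate_i in clustered:
            continue
        # Start a new cluster with gate i.
        members = [gate_i]
        combined = list(vec_i)
        limit = max(vec_i)          # largest individual peak seen so far
        clustered.add(gate_i)
        for gate_j, vec_j in currents.items():
            if gate_j in clustered:
                continue
            # Tentatively add gate j's current to the cluster vector.
            trial = [a + b for a, b in zip(combined, vec_j)]
            bound = max(limit, max(vec_j))
            if max(trial) <= bound:
                # Discharges do not pile up in time: lock gate j in.
                members.append(gate_j)
                combined = trial
                limit = bound
                clustered.add(gate_j)
        clusters.append({"gates": members, "ieq": max(combined)})
    return clusters


# Hypothetical current samples: G1 and G2 discharge at different times,
# while G3 overlaps G1.
example = {
    "G1": [0, 30, 60, 30, 0, 0, 0, 0],
    "G2": [0, 0, 0, 10, 40, 80, 40, 0],
    "G3": [0, 50, 70, 50, 0, 0, 0, 0],
}
result = precluster(example)
```

Here G1 and G2 merge into one cluster with an equivalent current of 80, because their sum never exceeds G2's own peak, while G3 is left in a cluster of its own with an equivalent current of 70.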

Bin-Packing Technique. Objective: minimize the number of used STs. Subject to: 1. $\sum I_{eq} \le I_{max}$ for any ST. 2. Each $I_{eq}$ is assigned only once.
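The slides do not say which solver is used for this formulation. As a stand-in, the sketch below applies the classic first-fit-decreasing heuristic, which respects both constraints but is not guaranteed to find the minimum number of sleep transistors. The equivalent-current values are hypothetical; the 250 µA capacity is the $I_{sleep}$ value given earlier.

```python
from typing import Dict, List


def first_fit_decreasing(ieq: Dict[str, float], imax: float = 250.0) -> List[dict]:
    """Assign each equivalent current to exactly one sleep transistor (bin)."""
    bins: List[dict] = []
    # Place the largest currents first; each current is placed exactly once.
    for name, current in sorted(ieq.items(), key=lambda kv: kv[1], reverse=True):
        if current > imax:
            raise ValueError(f"{name} exceeds the sleep transistor capacity")
        for st in bins:
            if st["load"] + current <= imax:
                st["items"].append(name)
                st["load"] += current
                break
        else:
            # No existing sleep transistor has room: open a new one.
            bins.append({"items": [name], "load": current})
    return bins


# Hypothetical equivalent currents in microamperes.
assignment = first_fit_decreasing(
    {"IEQ1": 120.0, "IEQ2": 100.0, "IEQ3": 90.0,
     "IEQ4": 80.0, "IEQ5": 60.0, "IEQ6": 30.0}
)
```

For this input the heuristic uses two sleep transistors, loaded to 250 µA and 230 µA. An exact integer-programming formulation could replace the greedy loop when optimality matters.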

Currents Assignment
Sleep transistor 1: equivalent currents IEQ3, IEQ4, IEQ7; assigned gates G5, G6, G7, G8, G14, G16, G18, G23; Σ currents = 250 µA.
Sleep transistor 2: equivalent currents IEQ1, IEQ2, IEQ5, IEQ6; assigned gates G1, G2, G3, G4, G9, G10, G11, G12, G13, G15, G17, G19, G20, G21, G22, G24, G25, G26, G27, G28; Σ currents = 240 µA.

Clustering of CLA adder

Set-Partitioning Technique [Figure: standard-cell layout with gates G1 to G28 placed in rows between the Vdd and gnd rails, marking the cell height, Lmin, the ground rail and the sleep device cavity.]

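Cost Function: $C_j = (w_1 \cdot C_{j1}) + (w_2 \cdot C_{j2})$, where $C_{j1} = \text{Sleep\_Transistor max\_current} - \sum_i \text{current}_i$ and $C_{j2} = \sum d_{uv}$ over the gates in a group $S_j$. [Figure: group $S_j$ containing gates $G_u$, $G_v$ and $G_w$ with pairwise distances $d_{uv}$, $d_{vw}$ and $d_{wu}$.]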
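The cost of a single group can be sketched as follows. The slide does not state how the distances $d_{uv}$ are measured, so Manhattan distance between gate coordinates is assumed here; the coordinates, currents and weights are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple


def cluster_cost(group: Iterable[str],
                 currents: Dict[str, float],
                 positions: Dict[str, Tuple[float, float]],
                 imax: float = 250.0,
                 w1: float = 1.0,
                 w2: float = 1.0) -> float:
    """Return C_j = w1 * C_j1 + w2 * C_j2 for one group of gates."""
    group = list(group)
    # C_j1: unused sleep transistor capacity.
    slack = imax - sum(currents[g] for g in group)
    # C_j2: sum of pairwise distances (Manhattan distance is an assumption).
    spread = sum(
        abs(positions[u][0] - positions[v][0]) + abs(positions[u][1] - positions[v][1])
        for u, v in combinations(group, 2)
    )
    return w1 * slack + w2 * spread


# Hypothetical gates: currents in microamperes, positions in arbitrary units.
currents = {"Gu": 100.0, "Gv": 80.0, "Gw": 50.0}
positions = {"Gu": (0.0, 0.0), "Gv": (3.0, 0.0), "Gw": (0.0, 4.0)}
cost = cluster_cost(["Gu", "Gv", "Gw"], currents, positions)
```

With equal weights the example gives a slack of 20 and a total pairwise distance of 14, so the cost is 34. Raising `w2` relative to `w1` penalizes spread-out groups more heavily, which is how routing complexity enters the trade-off.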

Clustering Heuristic
1. Create_Clusters ( )
     Calculate distances between all gates;
     Initialize maxgates_per_cluster = n;
     Create clusters with single gates;
     For cl = 2; cl ≤ maxgates_per_cluster
       Create_n_Gate_Clusters (cl)
     For all clusters created
       calculate_cost ( )
   Create_n_Gate_Clusters (cl)
     For cluster of type cl
       create_new_cluster ( )
       While not done
         Choose gate with minimum distances
         If sum of currents ≤ capacity
           append gate to newly created cluster
         End If
         If total gates within cluster ≥ limit
           break;
       End While
     End For
2. Return newly created cluster
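A possible reading of the heuristic above is sketched below. It assumes that "gate with minimum distances" means the gate nearest to the cluster's seed gate, that distance is Manhattan distance, and that one cluster of each size is attempted per seed gate. These are interpretations, not details given on the slide, and the gate data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def candidate_clusters(currents: Dict[str, float],
                       positions: Dict[str, Tuple[float, float]],
                       max_gates: int,
                       imax: float = 250.0) -> List[FrozenSet[str]]:
    """Build candidate groups of 1..max_gates gates around every seed gate."""
    def distance(u: str, v: str) -> float:
        return (abs(positions[u][0] - positions[v][0])
                + abs(positions[u][1] - positions[v][1]))

    gates = list(currents)
    found = [frozenset([g]) for g in gates]      # single-gate clusters
    for size in range(2, max_gates + 1):
        for seed in gates:
            cluster = [seed]
            load = currents[seed]
            # Visit the remaining gates from nearest to farthest.
            for g in sorted((x for x in gates if x != seed),
                            key=lambda x: distance(seed, x)):
                if len(cluster) >= size:
                    break
                if load + currents[g] <= imax:   # capacity check
                    cluster.append(g)
                    load += currents[g]
            group = frozenset(cluster)
            if len(cluster) == size and group not in found:
                found.append(group)
    return found


# Hypothetical gates on one row: A and B are neighbours, C is farther away.
groups = candidate_clusters(
    {"A": 100.0, "B": 100.0, "C": 100.0},
    {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0)},
    max_gates=3,
)
```

In this example no three-gate cluster is produced, because three gates would draw 300 µA against a 250 µA capacity. The candidates are the three single gates plus the pairs {A, B} and {B, C}.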

Set-Partitioning Technique. Objective: minimize $\sum_j C_j S_j$. Subject to: 1. The sum of currents for $S_j$ is $\le I_{max}$. 2. Groups must cover all gates with no repetition.
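For small instances the formulation above can be solved by exhaustive search, as in the sketch below. It assumes the candidate groups already satisfy the current limit and carry non-negative costs; the group costs shown are hypothetical. The slides do not say which solver is used, so this is only an illustration of the objective and the cover constraint.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

Candidate = Tuple[FrozenSet[str], float]


def best_partition(gates: Iterable[str],
                   candidates: List[Candidate]
                   ) -> Tuple[float, Optional[List[FrozenSet[str]]]]:
    """Pick disjoint candidate groups that cover every gate at minimum cost."""
    best_cost = float("inf")
    best_groups: Optional[List[FrozenSet[str]]] = None

    def search(remaining: FrozenSet[str], chosen: list, cost: float) -> None:
        nonlocal best_cost, best_groups
        if cost >= best_cost:          # prune (valid for non-negative costs)
            return
        if not remaining:
            best_cost, best_groups = cost, list(chosen)
            return
        pivot = min(remaining)         # this gate must be covered next
        for group, group_cost in candidates:
            # Using only groups inside `remaining` enforces "no repetition".
            if pivot in group and group <= remaining:
                chosen.append(group)
                search(remaining - group, chosen, cost + group_cost)
                chosen.pop()

    search(frozenset(gates), [], 0.0)
    return best_cost, best_groups


# Hypothetical candidate groups with their costs C_j.
cands = [
    (frozenset({"G1"}), 10.0), (frozenset({"G2"}), 10.0),
    (frozenset({"G3"}), 10.0), (frozenset({"G4"}), 10.0),
    (frozenset({"G1", "G2"}), 12.0), (frozenset({"G3", "G4"}), 15.0),
    (frozenset({"G1", "G2", "G3"}), 30.0), (frozenset({"G2", "G3"}), 5.0),
]
total, chosen = best_partition(["G1", "G2", "G3", "G4"], cands)
```

For these costs the cheapest cover is {G1}, {G2, G3}, {G4} with a total of 25. The search is exponential in the worst case, so larger circuits would call for an integer-programming solver or a heuristic.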

Grouping of gates [Figure: cell rows between the Vdd and gnd rails showing gates G1 to G28 arranged in groups, with the cell height, Lmin, the ground rail and the sleep device cavity marked.]

Computational Time [Figure: BP and SP CPU time in seconds (axis up to 2000) versus number of gates (28, 30, 31, 61, 160, 204).]

Results (% Savings)
Columns: Pdynamic to [1], Pdynamic to [2], Pleakage to [1], Pleakage to [2], ST_Area [1],[2].
Benchmarks (No. of gates): 27-channel interrupt controller C432 (160), 32-bit Single Error Correcting C499 (202), 4-bit 74181 ALU (61), 6-bit Multiplier (30), 32-bit Parity Checker (31), 4-bit CLA adder (28).
SP (values in slide order): 2 %, 0 %, 98 %, 77 %, 98, 76 % | 9 %, 8 %, 87 %, 71 %, 86, 70 % | 11 %, 86 %, 66 %, 86, 67 % | 19 %, 85 %, 35 %, 85, 34 % | 6 %, 70 %, 84, 69 % | 7 %, 5 %, 78 %, 87, 77 %
BP (values in slide order): 99 %, 89 %, 99, 88 % | 20 %, 95 %, 95, 89 % | 17 %, 14 %, 93 %, 83 %, 93, 83 % | 31 %, 23 %, 95, 78 % | 18 %, 16 %, 92 %, 92, 85 % | 12 %, 96 %, 95, 92 %

% Power Savings (Bin-Packing)

% Power Savings (Set-Partitioning)

% ST Area Saving (Bin-Packing)

% ST Area Saving (Set-Partitioning)

Conclusion: The BP technique clusters gates in MTCMOS circuits; Pdynamic and Pleakage are reduced by 15% and 90%, respectively, compared to [1] and [2]. The SP technique takes routing complexity into consideration; Pdynamic and Pleakage are reduced by 11% and 77%, respectively, compared to [1] and [2].

Extended Work Done: A hybrid clustering technique that combines the BP and SP techniques is devised to produce a more efficient and faster solution. Noise associated with ground bounce is taken as a design criterion (< 50 mV). Investigating the effect of different ST sizes on circuit parameters. Investigating the effect of the cost function weights w1 and w2 on circuit parameters.