NTHU-CS VLSI/CAD LAB TH EDA De-Shiuan Chiou Da-Cheng Juan Yu-Ting Chen Shih-Chieh Chang Department of CS, National Tsing Hua University, Taiwan Fine-Grained.

Presentation transcript:

NTHU-CS VLSI/CAD LAB TH EDA De-Shiuan Chiou Da-Cheng Juan Yu-Ting Chen Shih-Chieh Chang Department of CS, National Tsing Hua University, Taiwan Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization

2 Outline
Sleep Transistor Sizing Problem
MIC Estimation Mechanism
Partitioned Time-Frame for MIC Estimation
Experimental Results and Conclusions

3 Power Gating
Leakage increases exponentially
– reaches 50% of total power in 65nm technology
Power gating
– one of the most effective ways to reduce leakage
– uses a high-Vth sleep transistor to reduce the leakage current
[Figure: low-Vth logic device between VDD and GND; with power gating, a high-Vth sleep transistor controlled by SL sits between VGND and GND]

4 Implementation of Power Gating
Distributed Sleep Transistor Network (DSTN)
[Figure: DSTN with low-Vth logic clusters C_1, C_2, C_3; labels VDD, VGND, SL]

5 Leakage Saving
In standby mode:
– leakage is proportional to the ST's size
– use a small ST to reduce leakage
[Figure: leakage current I_leakage; labels VDD, VGND]

6 Voltage Drop across the ST
In active mode:
– the voltage drop across an ST degrades the speed
– the voltage drop is inversely proportional to the ST's size
– use a large ST to bound the voltage drop
[Figure: voltage drop V_ST across the ST; labels VDD, VGND]

7 Sleep Transistor (ST) Sizing
Dilemma:
– a large ST bounds the voltage drop (active mode)
– a small ST reduces leakage (standby mode)
Objective: minimize ST size (leakage) under a specified voltage drop constraint, V_ST*
[Figure: voltage drop V_ST and its constraint V_ST*; labels VDD, VGND]

8 Estimate Voltage Drop by MIC
Maximum Instantaneous Current (MIC) through the ST
– determines the worst-case voltage drop
Estimating the upper bound of MIC(ST)
– for sizing the ST appropriately to meet the voltage drop constraint
MIC(ST): the MIC through an ST.
[Figure: clusters C_1, C_2, C_3 with MIC(ST_1), MIC(ST_2), MIC(ST_3); labels VDD, VGND]

9 Estimate Voltage Drop by MIC
MIC(C) (the MIC of a cluster) is easy to measure
Due to the current balancing effect
– MIC(ST) (the MIC through the ST) is hard to predict
Finding the MIC of a cluster is fast; finding the MIC through an ST is time-consuming
[Figure: clusters C_1, C_2, C_3 with MIC(C_1) and MIC(ST_1), MIC(ST_2), MIC(ST_3); labels VDD, VGND]

10 Temporal Perspective of Clusters' MIC
Traditional ways
– use the entire clock period's MIC to determine the ST size
[Figure: MIC(C_i) waveform over one clock cycle (current vs. time unit) for Cluster 1 and Cluster 2; MIC(C_1) occurs at T_6, MIC(C_2) occurs at T_9]

11 Temporal Perspective of Clusters' MIC
Smaller time frames lead to:
– a more accurate MIC estimation
– high computation complexity
[Figure: MIC(C_i) waveform over one clock cycle (current in mA vs. time unit) for Cluster 1 and Cluster 2]

12 Difficulties
The current balancing effect complicates the sizing problem
Time-frame partitioning leads to high computation complexity
[Figure: MIC over one clock cycle]

13 Contributions
A more accurate MIC prediction from a temporal perspective
A variable-length partitioning to reduce computation complexity
Heuristics to minimize the size of sleep transistors
A 21% reduction in sleep transistor area

14 Outline
Sleep Transistor Sizing Problem
MIC Estimation Mechanism
Partitioned Time-Frame for MIC Estimation
Experimental Results and Conclusions

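15 Resistance Network
[Figure: resistance network of clusters C_1, C_2, C_3 with cluster currents I(C_1), I(C_2), I(C_3), sleep-transistor currents I(ST_1), I(ST_2), I(ST_3), sleep-transistor resistances R(ST_1), R(ST_2), R(ST_3), and resistances R_V]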

16 Discharging Ratio
The discharging ratio can be calculated by
– Kirchhoff's Current Law
– Ohm's Law
[Figure: discharging of cluster current I(C_1) through the network of C_1, C_2, C_3; surviving labels: I(C_1), 0.34 I(C_2), 0.23 I(C_3); the remaining labels are lost in extraction]

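17 Discharging Matrix Ψ
$[\,I(ST_1)\;\; I(ST_2)\;\; I(ST_3)\,]^{T} = \Psi \cdot [\,I(C_1)\;\; I(C_2)\;\; I(C_3)\,]^{T}$
[Figure: clusters C_1, C_2, C_3 with cluster currents I(C_i) and sleep-transistor currents I(ST_i)]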
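
A minimal sketch of how the discharging ratios could be derived with Kirchhoff's Current Law and Ohm's Law. It assumes (this is not stated on the slides) that the clusters form a chain, that each virtual-ground node reaches GND through R(ST_i), and that neighbouring nodes are joined by one resistance R_V. The names `solve_linear` and `discharging_matrix` and all numeric values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def discharging_matrix(r_st, r_v):
    """Return psi, where psi[i][j] is the share of cluster j's current through ST i.

    r_st: sleep-transistor resistances, one per cluster (assumed chain topology).
    r_v:  resistance between neighbouring virtual-ground nodes (assumed uniform).
    """
    n = len(r_st)
    # Nodal conductance matrix: KCL written at every virtual-ground node.
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]
        if i + 1 < n:
            g[i][i] += 1.0 / r_v
            g[i + 1][i + 1] += 1.0 / r_v
            g[i][i + 1] -= 1.0 / r_v
            g[i + 1][i] -= 1.0 / r_v
    psi = [[0.0] * n for _ in range(n)]
    for j in range(n):
        unit = [1.0 if k == j else 0.0 for k in range(n)]
        v = solve_linear(g, unit)        # node voltages for unit current from cluster j
        for i in range(n):
            psi[i][j] = v[i] / r_st[i]   # Ohm's law: current through ST i
    return psi


# Hypothetical values: three equal sleep transistors of 10 ohm, R_V = 2 ohm.
psi = discharging_matrix([10.0, 10.0, 10.0], 2.0)
for row in psi:
    print([round(x, 3) for x in row])
```

Under these assumptions each column of Ψ sums to 1, because all of a cluster's current must leave through some sleep transistor, and the transistor nearest the discharging cluster carries the largest share.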

18 MIC(ST) Estimation Mechanism
$[\,MIC(ST_1)\;\; MIC(ST_2)\;\; MIC(ST_3)\,]^{T} = \Psi \cdot [\,MIC(C_1)\;\; MIC(C_2)\;\; MIC(C_3)\,]^{T}$
[Figure: clusters C_1, C_2, C_3 with MIC(C_i) and MIC(ST_i)]

19 Outline
Sleep Transistor Sizing Problem
MIC Estimation Mechanism
Partitioned Time-Frame for MIC Estimation
Experimental Results and Conclusions

20 Temporal Perspective of Clusters' MIC
Different MIC(C_i) occur at different time points
[Figure: MIC(C_i) waveform over one clock cycle (current vs. time unit) for Cluster 1 and Cluster 2; MIC(C_1) occurs at T_6, MIC(C_2) occurs at T_9]

21 Temporal Perspective of Clusters' MIC
Different MIC(C_i) occur at different time points within a clock period
The traditional way to estimate MIC(ST_i) is overly pessimistic

22 Time-Frame Partitioning for MIC(ST) Estimation
Expand MIC(C_i) into MIC(C_i, T_j)
[Figure: MIC(C_i, T_j) waveform over one clock cycle (current vs. time frame) for Cluster 1 and Cluster 2, marking MIC(C_1, T_1), MIC(C_2, T_1), MIC(C_1, T_3), MIC(C_2, T_3), MIC(C_1, T_6), MIC(C_2, T_6)]

23 Time-Frame Partitioning for MIC(ST) Estimation
For each time frame T_j, use MIC(C_i, T_j) to obtain MIC(ST_i, T_j)

24 Time-Frame Partitioning for MIC(ST) Estimation
For ST_1, the maximum MIC(ST_1, T_j) among all T_j is the upper bound of MIC(ST_1) after partitioning
[Figure: MIC(ST_i, T_j) waveform over one clock cycle (current vs. time frame) for ST_1 (Cluster 1) and ST_2 (Cluster 2), marking MIC(ST_1) and MIC(ST_2)]

25 Time-Frame Partitioning for MIC(ST) Estimation
ORIGINAL_MIC(ST_1) is 37% larger than the partitioned MIC(ST_1); ORIGINAL_MIC(ST_2) is 27% larger than the partitioned MIC(ST_2)
Time-frame partitioning leads to a better MIC(ST) estimation!
[Figure: MIC(ST_i, T_j) waveform over one clock cycle (current vs. time frame) for ST_1 (Cluster 1) and ST_2 (Cluster 2)]
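
A small sketch of why the partitioned estimate is never looser than the whole-period one. It assumes MIC(ST) is estimated as Ψ times the cluster MIC vector, with non-negative entries in Ψ. The matrix `psi`, the table `mic_c`, and both function names are hypothetical illustration values, not data from the presentation.

```python
def mat_vec(psi, vec):
    """Multiply the discharging matrix by a vector of cluster currents."""
    return [sum(p * x for p, x in zip(row, vec)) for row in psi]


def whole_period_bound(psi, mic_c):
    """Traditional estimate: combine each cluster's peak over the whole clock period."""
    peaks = [max(frames) for frames in mic_c]
    return mat_vec(psi, peaks)


def partitioned_bound(psi, mic_c):
    """Per-frame estimate: MIC(ST_i) = max over T_j of MIC(ST_i, T_j)."""
    n_frames = len(mic_c[0])
    per_frame = [
        mat_vec(psi, [frames[j] for frames in mic_c]) for j in range(n_frames)
    ]
    return [max(frame[i] for frame in per_frame) for i in range(len(psi))]


# Hypothetical two-cluster example; mic_c[i][j] stands for MIC(C_i, T_j) in mA.
psi = [[0.6, 0.4],
       [0.4, 0.6]]
mic_c = [[1.0, 5.0, 2.0],
         [4.0, 1.0, 2.0]]

original = whole_period_bound(psi, mic_c)
partitioned = partitioned_bound(psi, mic_c)
print("whole period:", original)
print("partitioned: ", partitioned)
```

Because the two clusters peak in different frames here, the per-frame maximum is smaller than the sum built from both peaks, which is the pessimism the slides describe.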

26 Reduce the Computation Complexity
Increasing the number of time frames leads to
– more accurate voltage drop estimation
– high computation complexity
Reduce the computation complexity:
– dominated time-frame removal
– variable-length time-frame partitioning

27 Dominated Time-Frame Removal
T_3 is dominated by T_6
– MIC(C_1, T_6) > MIC(C_1, T_3)
– MIC(C_2, T_6) > MIC(C_2, T_3)
Neglect T_3 and all dominated time frames
[Figure: Cluster 1 and Cluster 2 waveforms marking MIC(C_1, T_6), MIC(C_1, T_3), MIC(C_2, T_6), MIC(C_2, T_3)]
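
The dominance test above can be sketched as a simple filter. The function name `remove_dominated` and the MIC values are hypothetical; the strict ">" comparison follows the slide.

```python
def remove_dominated(frames):
    """Return the names of time frames that no other frame dominates.

    frames maps a frame name to its list of MIC(C_i, T) values, one per cluster.
    Frame b dominates frame a when every cluster's MIC in b is strictly larger.
    """
    kept = []
    for a, mic_a in frames.items():
        dominated = any(
            b != a and all(y > x for x, y in zip(mic_a, mic_b))
            for b, mic_b in frames.items()
        )
        if not dominated:
            kept.append(a)
    return kept


# Hypothetical MIC(C_1, T), MIC(C_2, T) values in mA for three frames.
frames = {
    "T1": [1.0, 4.0],
    "T3": [2.0, 1.5],
    "T6": [5.0, 2.0],
}
kept = remove_dominated(frames)
print(kept)  # T3 is dropped because T6 exceeds it for both clusters
```

Only the surviving frames need to be carried into the MIC(ST_i, T_j) computation, since a dominated frame cannot produce the maximum when the discharging ratios are non-negative.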

28 Variable Length Time-Frame Partitioning
(T_b dominates T_c) and (T_b dominates T_d) => the estimated upper bound will be smaller
If all the MIC(C_i) are separated, the MIC(ST_i) can be better estimated!
[Figure: time frame T_a under a uniform two-way partition and a variable-length two-way partition, labelled (1) and (2), with frames T_b, T_c, T_d and values MIC(C_1, T_b), MIC(C_2, T_b), MIC(C_1, T_c), MIC(C_2, T_c), MIC(C_1, T_d), MIC(C_2, T_d)]

29 Problem Formulation of ST Sizing
Inputs:
1. Voltage-drop constraint
2. MIC(C_i, T_j): clusters' MIC information
Objective: minimize the total ST width
Voltage drops must meet the constraint

30 ST Sizing Algorithm
1. Initialize ST size with a large value.
2. Update the discharging matrix Ψ.
3. Update MIC(ST_i, T_j) and voltage drops:
   MIC(ST_i, T_j) = Ψ · MIC(C_i, T_j)
   V(ST_i, T_j) = MIC(ST_i, T_j) · R(ST_i)
Voltage drops OK?
– Yes: return ST size.
– No: 4. Resize the ST with the worst drop, then repeat from step 2.
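
A rough sketch of one way the loop above could be realized; it is not the authors' implementation. Several things are assumptions: the chain network with a uniform R_V, the model R(ST) = `k_rw` / width, the direction of the search (the slide's "large value" is read here as a large resistance, i.e. a small starting width that is widened until every drop meets the constraint), and the multiplicative `step`. Solving the nodal equations per frame is equivalent to forming Ψ and computing V = MIC(ST) · R(ST). All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def node_voltages(r_st, r_v, currents):
    """Voltage drop across each ST for one frame's cluster currents (chain network)."""
    n = len(r_st)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]
        if i + 1 < n:
            g[i][i] += 1.0 / r_v
            g[i + 1][i + 1] += 1.0 / r_v
            g[i][i + 1] -= 1.0 / r_v
            g[i + 1][i] -= 1.0 / r_v
    return solve_linear(g, currents)


def size_sleep_transistors(mic_c, r_v, v_limit, k_rw, w_init=1.0, step=1.05,
                           max_iter=10000):
    """Widen the ST with the worst voltage drop until all drops are within v_limit."""
    widths = [w_init] * len(mic_c)
    for _ in range(max_iter):
        r_st = [k_rw / w for w in widths]          # assumed R(ST) = k_rw / width
        worst_drop, worst_i = 0.0, 0
        for j in range(len(mic_c[0])):             # every time frame T_j
            drops = node_voltages(r_st, r_v, [frames[j] for frames in mic_c])
            for i, v in enumerate(drops):
                if v > worst_drop:
                    worst_drop, worst_i = v, i
        if worst_drop <= v_limit:
            return widths
        widths[worst_i] *= step                    # resize the ST with the worst drop
    raise RuntimeError("sizing did not converge")


# Hypothetical inputs: currents in A, resistances in ohm, widths in um.
mic_c = [[1e-3, 5e-3, 2e-3],
         [4e-3, 1e-3, 2e-3]]
v_limit = 0.05 * 1.3                               # 5% of a 1.3 V supply
widths = size_sleep_transistors(mic_c, r_v=1.0, v_limit=v_limit, k_rw=100.0)
print([round(w, 2) for w in widths])
```

The greedy widening is only a heuristic: it stops at the first sizing that satisfies the constraint for the chosen step and does not guarantee the minimum total width.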

31 Outline
Sleep Transistor Sizing Problem
MIC Estimation Mechanism
Partitioned Time-Frame for MIC Estimation
Experimental Results and Conclusions

32 Environment Setup
TSMC 130nm CMOS technology
Vdd = 1.3 V
Specified tolerable IR drop: 5% of the ideal supply voltage
MIC(C_i, T_j) is obtained via 10,000-random-pattern PrimePower simulations

33 Implementation Flow
[Figure: implementation flow combining our tools and commercial tools. Recoverable labels: RTL netlist, Synthesis, Gate-level netlist, Placement, DEF file, SDF file, Gate Positioning, Gate location, Simulation, VCD file, VCD Partitioning, Partitioned VCD file, MIC Estimation, V-length Partitioning (Optional), ST Sizing, ST size]

34 Experimental Results
[Table: Total Area (width in μm) for [8], [2], TP, and V-TP, and Runtime (sec.) for TP and V-TP, per circuit: C432, C499, C880, C1355, C3540, C5315, C7552, dalu, frg2, i8, t481, des, AES, and the average; the numeric values are not preserved]
Previous works: [2] Chiou et al. DAC'06, [8] Long et al. DAC'03

35 Conclusions
Propose an efficient sleep transistor sizing method for DSTN power gating designs
Present theorems based on a temporal perspective for estimating a tight upper bound of the voltage drop
Achieve a 21% size (leakage) reduction

36 Thank You!

37 Sleep Transistor (ST) Sizing
Relations between W_ST and V_ST.
Sleep transistors operate in the linear region in active mode.
I(ST): the current through the sleep transistor
V_ST: the voltage drop across the sleep transistor
[Figure: sleep transistor between VGND and GND carrying I(ST); labels VDD, VGND, GND]

38 Sleep Transistor (ST) Sizing
Determine the minimum required size (W_ST*) based on:
1. MIC(ST)
2. V_ST*: IR-drop constraint
MIC(ST): Maximum Instantaneous Current (MIC) through the ST
Smaller MIC(ST) leads to a better ST size!
[Figure: sleep transistor carrying MIC(ST); labels VDD, VGND, GND]
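
The formula relating W_ST and V_ST did not survive in the transcript. As a hedged illustration only, the standard first-order MOSFET model in the linear region, with the gate at V_DD and the small quadratic term in V_ST neglected, gives the relation below. The symbols μ_n (electron mobility), C_ox (gate oxide capacitance per unit area), L (channel length), and V_tH (threshold voltage of the high-Vth sleep transistor) are assumptions that do not appear on the slides.

```latex
I(ST) \approx \mu_n C_{ox} \frac{W_{ST}}{L} \left( V_{DD} - V_{tH} \right) V_{ST}
\quad\Longrightarrow\quad
W_{ST}^{*} = \frac{MIC(ST) \cdot L}{\mu_n C_{ox} \left( V_{DD} - V_{tH} \right) V_{ST}^{*}}
```

Under this model the required width grows linearly with MIC(ST), which is consistent with the slide's remark that a smaller MIC(ST) leads to a better ST size.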