Skewed Flip-Flop Transformation for Minimizing Leakage in Sequential Circuits
Jun Seomun, Jaehyun Kim, Youngsoo Shin
Dept. of Electrical Engineering, KAIST, Korea


Leakage Power in Technology Scaling
[Figure: dynamic power and leakage power (W) versus technology (0.25µ, 0.18µ, 0.13µ, 0.10µ, 0.07µ). Source: Intel Corporation, 2002]

Overview of Mixed Vt Technique
Mixed Vt CMOS
– Low Vt: fast but high leakage
– High Vt: low leakage but slow
Value of mixed Vt is limited
– It considers only the combinational portion of circuits
[Figure: initially all gates are low Vt; high Vt gates can be assigned to some non-critical paths, while the critical path stays low Vt]
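The idea on this slide (start with every gate at low Vt, then move gates to high Vt wherever timing allows) can be illustrated with a small greedy sketch. This is not the assignment algorithm used by the authors, which the slides do not describe. The function names, the toy circuit, and all delay and leakage numbers below are assumptions made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def worst_arrival(fanin, delay):
    """Longest-path delay through a combinational DAG (fanin: gate -> driving gates)."""
    arrival = {}
    for gate in TopologicalSorter(fanin).static_order():
        arrival[gate] = delay[gate] + max((arrival[p] for p in fanin[gate]), default=0)
    return max(arrival.values())


def assign_mixed_vt(fanin, cells, period):
    """cells[gate] = {"low": (delay, leakage), "high": (delay, leakage)}, hypothetical units."""
    vt = {gate: "low" for gate in fanin}                      # initially all low Vt
    delay = {gate: cells[gate]["low"][0] for gate in fanin}
    # Visit the gates with the largest leakage saving first.
    order = sorted(fanin, key=lambda g: cells[g]["low"][1] - cells[g]["high"][1], reverse=True)
    for gate in order:
        delay[gate] = cells[gate]["high"][0]                  # tentatively swap to high Vt
        if worst_arrival(fanin, delay) <= period:
            vt[gate] = "high"                                 # timing still met: keep the swap
        else:
            delay[gate] = cells[gate]["low"][0]               # timing violated: revert
    return vt


# Toy circuit (invented): a -> b -> out is critical, c -> out has slack.
fanin = {"a": [], "b": ["a"], "c": [], "out": ["b", "c"]}
low_delay = {"a": 2, "b": 2, "c": 1, "out": 1}
cells = {g: {"low": (d, 10), "high": (d + 1, 2)} for g, d in low_delay.items()}
vt = assign_mixed_vt(fanin, cells, period=5)
print(vt)
```

In this toy case only gate c, which sits off the critical path, ends up at high Vt. Flip-flops do not appear in the model at all, which is the limitation the slide points out.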

Motivation
Leakage of sequential elements
– Sequential elements take a large proportion in many controllers
[Figure: flip-flop versus combinational share (0% to 100%) for s298, s344, s349, s382, s400, s444, s526, s641, s713, s838 and s9234, with a second panel labeled mixed Vt]

Why Not High Vt Flip-Flop?
Large effects on the slack
– The delay overhead of a high Vt flip-flop is larger than that of high Vt combinational gates
– A flip-flop typically affects more than one timing path in a circuit
[Figure: delay of high Vt gate - delay of low Vt gate, for F/F, INV, NAND2, NOR2, NAND3, NAND]
[Figure: (average # fanout timing paths on F/Fs) / (average # fanout timing paths on comb. gates) for s298, s344, s349, s400, s444, s526, s641, s713, s838 and s9234]

Skewed Flip-Flops
Mixed Lgate flip-flop
– Larger Lgate transistor
  • Smaller delay overhead than a high Vt transistor
  • Footprint of the gate remains almost the same
– Selective assignment of larger Lgate in the flip-flop
  • Smaller delay overhead than assigning it to the entire flip-flop
  • The leakage reduction can reach the same amount as when all gates use the larger Lgate
Leakage differs with the values of D and Q
– Four kinds of SFFs
  • Characterized to minimize leakage for the four states (D & Q)
  • SF00, SF01, SF10 and SF11

Design of an SFF (in case of SF00)
– Assume CK = 0 in idle state (clock gating)
[Figure: flip-flop schematic with D, CK, Q and internal clk signals; legend: larger Lgate]

Skewed Flip-Flops
[Figure: schematics of SF00, SF01, SF10 and SF11, each with D, CK, Q and internal clk signals]

Leakage Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: leakage current [nA] of the original flip-flop (Orig.) versus each SFF for D/Q = 0/0, 0/1, 1/0 and 1/1; panels (a) SF00, (b) SF01, (c) SF10, (d) SF11]
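Because each SFF is characterized for one idle state (D, Q), a natural way to pick a cell for a given flip-flop is to weight the per-state leakage by that flip-flop's idle-state probabilities. The sketch below assumes that selection rule; the slides do not spell out how the initial choice is made. The leakage values are invented placeholders, since the measured numbers from the figure are not preserved in this transcript, and the names `leakage_na`, `expected_leakage` and `pick_sff` are hypothetical.

```python
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]          # idle (D, Q) values
SFF_TARGET = {"SF00": (0, 0), "SF01": (0, 1), "SF10": (1, 0), "SF11": (1, 1)}


def leakage_na(cell, state):
    """Hypothetical leakage in nA; these numbers are invented, not read from the slide."""
    if cell == "Orig":
        return 100.0
    return 60.0 if state == SFF_TARGET[cell] else 95.0


def expected_leakage(cell, state_prob):
    """Average leakage of a cell weighted by the idle-state probabilities."""
    return sum(state_prob[s] * leakage_na(cell, s) for s in STATES)


def pick_sff(state_prob):
    """Choose the SFF with the lowest expected leakage for this flip-flop."""
    return min(SFF_TARGET, key=lambda cell: expected_leakage(cell, state_prob))


# A flip-flop that idles at D = 0, Q = 0 most of the time (made-up probabilities).
prob = {(0, 0): 0.7, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.1}
best = pick_sff(prob)
print(best, expected_leakage(best, prob), expected_leakage("Orig", prob))
```

With these placeholder numbers the flip-flop is mapped to SF00, the cell skewed for its most likely idle state.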

Timing Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: delay [ps] of Orig. versus each SFF for rising and falling Tsu and rising and falling Tc-q; panels (a) SF00, (b) SF01, (c) SF10, (d) SF11]
[Figure: voltage [V] versus time for D and clk around the rising edge of CK, Orig. versus SF00, comparing Tsu with Tsu' and T1 with T1'; panels (a) rising Tsu, (b) falling Tsu]

SFF Transformation
Utilize SFFs while maintaining timing constraints
– Input: netlist & idle state probabilities of flip-flops
– Output: new netlist with skewed flip-flops
[Flowchart: netlist & idle state probabilities; skewed flip-flop transformation under timing constraints, consisting of initial SFF assignment and flip-flop transformation (find critical path, find candidate, substitute); mixed Vt assignment on combinational subcircuits]

Half Skewed Flip-Flops (HSFs)
For a smoother transition
– HSF0: unchanged setup time
– HSF1: unchanged clock-to-q delay
[Figure: schematics of HSF0 and HSF1]

SFF Transformation Algorithm
Select a flip-flop to be transformed
– Find critical path
– Find candidate
  • Both ends of the most critical path
  • Larger timing improvement
Substitute
– (1) Most effective SFFs in terms of delay given position and phase of transition
– (2) If (1) fails, try HSFs
– (3) If (2) fails, use the original flip-flops
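The steps above can be sketched as a loop that repeatedly repairs the most critical path by stepping one of its two end flip-flops down a fallback ladder (SFF, then HSF, then original). This is a heavily simplified illustration, not the authors' implementation: it ignores the rising and falling phases, assumes a single fully skewed cell type, assumes HSF1 is tried at the launching end and HSF0 at the capturing end, and uses invented delay numbers. All names (`TIMING_PS`, `slack`, `fallback`, `transform`) are hypothetical.

```python
# Hypothetical (clock-to-q, setup) delays in ps; invented for illustration.
TIMING_PS = {
    "Orig": (50, 30),
    "SFF": (60, 38),    # fully skewed cell: both delays are assumed to grow
    "HSF0": (60, 30),   # setup time unchanged
    "HSF1": (50, 38),   # clock-to-q delay unchanged
}


def slack(path, cell, period):
    """Setup slack of a path given as (launch FF, capture FF, combinational delay)."""
    launch, capture, comb_delay = path
    return period - (TIMING_PS[cell[launch]][0] + comb_delay + TIMING_PS[cell[capture]][1])


def fallback(current, role):
    """Next cell to try: SFF -> HSF -> original (None when nothing is left)."""
    if current == "SFF":
        return "HSF1" if role == "launch" else "HSF0"
    return "Orig" if current in ("HSF0", "HSF1") else None


def transform(paths, flip_flops, period):
    cell = {ff: "SFF" for ff in flip_flops}           # assumed initial assignment
    while True:
        worst = min(paths, key=lambda p: slack(p, cell, period))
        current = slack(worst, cell, period)
        if current >= 0:
            return cell                                # every path meets timing
        moves = []
        # Candidates are the flip-flops at both ends of the most critical path.
        for ff, role in ((worst[0], "launch"), (worst[1], "capture")):
            nxt = fallback(cell[ff], role)
            if nxt is not None:
                gain = slack(worst, {**cell, ff: nxt}, period) - current
                if gain > 0:
                    moves.append((gain, ff, nxt))
        if not moves:
            raise ValueError("timing cannot be met on the most critical path")
        _, ff, nxt = max(moves)                        # larger timing improvement wins
        cell[ff] = nxt


# Two register-to-register paths with made-up combinational delays in ps.
paths = [("f1", "f2", 910), ("f2", "f3", 800)]
result = transform(paths, ["f1", "f2", "f3"], period=1000)
print(result)
```

In this example only f1 is relaxed, to HSF1, because shortening its clock-to-q delay recovers more slack on the critical path than shortening the setup time of f2. The other two flip-flops keep their fully skewed cells.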

Experimental Results
For ISCAS benchmark circuits (45-nm PTM library)
[Table: columns are benchmark name, # gates, # FFs; mixed Vt only: Comb. (uA), SE (uA), Total (uA); SFX + mixed Vt: Comb. (x), SE (x), Total (x); one row per benchmark plus an average row; the numeric values are not preserved in this transcript]

Comparison of Mixed Vt Flip-Flop
[Figure: mixed Vt FFs + mixed Vt comb. versus SFX + mixed Vt comb. for s298, s344, s349, s382, s400, s444, s526, s641, s713, s838 and s9234]
[Figure: (average # fanout timing paths of F/Fs) / (average # fanout timing paths of comb. gates) for s298, s344, s349, s400, s444, s526, s641, s713, s838 and s9234]

Conclusion
Proposed Skewed Flip-Flops
– The set of mixed Lgate flip-flops
– Skewed characteristics in terms of leakage and delay
A heuristic algorithm that substitutes SFFs
– An average leakage saving of 16% is achieved, compared to the use of mixed Vt alone