Post-Layout Leakage Power Minimization Based on Distributed Sleep Transistor Insertion
Pietro Babighian, Luca Benini, Alberto Macii, Enrico Macii
ISLPED’04
Outline
– Introduction
– Previous Work
– Algorithm
– Experimental Results
– Conclusion
Introduction
Below the 0.13 µm process node, leakage dominates power consumption
– Leakage power ∝ exp(-q·Vt / (k·T))
Leakage reduction methods
– Dual Vt partition
– MTCMOS
– State assignment
[Figure: MTCMOS structure; labels: low-Vt logic module, virtual ground, high-Vt sleep transistor, sleep signal]
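To make the exponential dependence on Vt concrete, here is a minimal sketch that evaluates the slide's proportionality term for two thresholds. The function names, the example threshold values and the 300 K temperature are illustrative assumptions, and the slide's simplified expression is used as written.

```python
import math

BOLTZMANN = 1.380649e-23            # k, in J/K
ELECTRON_CHARGE = 1.602176634e-19   # q, in C


def relative_leakage(vt_volts, temp_kelvin=300.0):
    """Unitless term exp(-q*Vt / (k*T)) from the slide (hypothetical helper)."""
    return math.exp(-ELECTRON_CHARGE * vt_volts / (BOLTZMANN * temp_kelvin))


def leakage_ratio(vt_low, vt_high, temp_kelvin=300.0):
    """How many times more a low-Vt device leaks than a high-Vt one, per this model."""
    return relative_leakage(vt_low, temp_kelvin) / relative_leakage(vt_high, temp_kelvin)


if __name__ == "__main__":
    # Assumed example thresholds: 0.2 V (low Vt) versus 0.3 V (high Vt).
    print(leakage_ratio(0.2, 0.3))
```

Under this model even a 0.1 V increase in threshold cuts leakage by more than an order of magnitude, which is why a high-Vt sleep transistor is effective at blocking leakage from low-Vt logic.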
Previous Works
MTCMOS
– Waking up takes a non-negligible amount of time after the sleep transistor is re-activated (long re-activation time)
[Figure: virtual-ground voltage of a low-Vt logic module as the sleep transistor switches from OFF (stand-by mode) to ON (active mode); labels: Vdd, VDD-Vth, 0, discharge, re-activation time]
Previous Works
Distributed sleep transistor
– Multiple sleep transistors are instantiated
– Faster re-activation time
Most techniques are presented at the logic and circuit level and do not take placement information into account
– This can cause severe wiring congestion
Sleep Transistor Insertion in Row-Based Layout
[Figure: labels: Vdd, low-Vt logic module, virtual ground, high-Vt sleep transistor, sleep signal, local wiring]
Row Compaction & Area Penalty
[Figure: row compaction and the area penalty of adding a sleep transistor]
Gate Clustering
A gate-by-gate exploration of each row
[Flowchart boxes: Get timing & floorplan information from layout; Row compaction; Select a sleep transistor (according to available space); Select a cell; Timing violation? (Yes/No); Add cell to cluster; Update maximum current available at sleep transistor; Check all rows? (Yes/No)]
[Figure: Gate 1 ... Gate n sharing one virtual ground and one sleep transistor; label: available current at sleep transistor]
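The gate-by-gate exploration can be sketched as a greedy loop over each row. This is only an illustration under stated assumptions, not the authors' implementation: the `Cell` and `Cluster` types, the `slack_ok` flag standing in for the timing check, and the per-row current budget are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    peak_current: float  # current the cell draws through the sleep transistor (assumed units)
    slack_ok: bool       # hypothetical flag: True if gating causes no timing violation


@dataclass
class Cluster:
    current_budget: float              # current still available at the sleep transistor
    cells: list = field(default_factory=list)


def cluster_row(row, current_budget):
    """Greedily add cells of one row to the cluster of that row's sleep transistor."""
    cluster = Cluster(current_budget)
    for cell in row:
        if not cell.slack_ok:
            continue  # timing violation: leave the cell ungated
        if cell.peak_current > cluster.current_budget:
            continue  # the sleep transistor cannot supply this cell
        cluster.cells.append(cell.name)
        cluster.current_budget -= cell.peak_current  # update available current
    return cluster


def cluster_layout(rows, budgets):
    """Process every row; budgets[i] is the assumed current budget for row i."""
    return [cluster_row(row, budget) for row, budget in zip(rows, budgets)]
```

For example, a row with cells drawing 1.0, 2.0 (timing-critical), 1.5 and 1.0 against a budget of 2.5 would gate only the first and third cells in this sketch, leaving no current budget for the fourth.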
How to Select Cell?
Traversal: from primary output to primary input
Check whether the cell can be power-gated:
1. Leakage power?
2. Current?
3. Timing? RT > RT_OH?
If arrival time > re-activation time, zero re-activation delay overhead is paid
[Figure: Gate 1 ... Gate n on a shared virtual ground; the sleep transistor switches from OFF to ON and the virtual ground discharges from Vdd over the re-activation time]
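The three checks can be illustrated with a small predicate. This is a sketch under assumptions: the parameter names, the minimum-leakage threshold and the use of timing slack to absorb any re-activation overhead are hypothetical choices, not details taken from the slides.

```python
def reactivation_overhead(arrival_time, reactivation_time):
    """Delay overhead paid at wake-up; zero when the signal arrives after re-activation."""
    return max(0.0, reactivation_time - arrival_time)


def can_power_gate(leakage_power, min_leakage,
                   cell_current, available_current,
                   arrival_time, reactivation_time, slack):
    """Apply the three checks in order: leakage, current, timing (all thresholds assumed)."""
    if leakage_power < min_leakage:
        return False  # 1. too little leakage to be worth gating
    if cell_current > available_current:
        return False  # 2. the sleep transistor cannot supply the cell
    # 3. any re-activation overhead must fit within the cell's timing slack
    return reactivation_overhead(arrival_time, reactivation_time) <= slack
```

Cells close to the primary inputs see early arrival times, so they are the ones most likely to pay a non-zero overhead in this model.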
Sleep Transistor Sizing
[Figure: Gate 1 ... Gate N on a shared virtual ground with one sleep transistor; labels: sleep signal, CL]
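The slide does not preserve a sizing formula, so the following is only a generic first-order sketch, not the authors' method. It assumes the sleep transistor operates in its linear region with a small drain-source voltage, so that I ≈ µCox·(W/L)·(Vdd - Vth)·Vx, where Vx is the tolerated voltage rise on the virtual ground. All parameter names and the example values are hypothetical.

```python
def sleep_transistor_w_over_l(peak_current, vdd, vth_high, max_vground_drop, mu_cox):
    """First-order W/L for a sleep transistor (assumed linear-region model).

    peak_current      peak current of the gated cluster, in A
    vdd               supply voltage, in V
    vth_high          threshold voltage of the high-Vt sleep transistor, in V
    max_vground_drop  tolerated voltage on the virtual ground, in V
    mu_cox            process transconductance parameter, in A/V^2
    """
    overdrive = vdd - vth_high
    if overdrive <= 0 or max_vground_drop <= 0 or mu_cox <= 0:
        raise ValueError("overdrive, voltage drop and mu_cox must be positive")
    return peak_current / (mu_cox * overdrive * max_vground_drop)


if __name__ == "__main__":
    # Assumed example: 1 mA peak, 1.2 V supply, 0.4 V threshold, 50 mV drop, 300 uA/V^2.
    print(sleep_transistor_w_over_l(1e-3, 1.2, 0.4, 0.05, 300e-6))
```

In this model the required width grows linearly with the cluster's peak current, which is why the clustering step tracks the current still available at each sleep transistor.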
Experimental Results (1/3)
Delay overhead constraint is set to 5%
Area overhead constraint is set to 5%
[Table: columns: Benchmark; Orig: PL [mW], Pdyn [mW], Ptot [mW]; Opt: PL [mW], Pdyn [mW], Ptot [mW]; ∆: PL [%], Pdyn [%], Ptot [%]. Rows: six Block benchmarks and Avg.]
Experimental Results (2/3)
Area Penalty
[Table: columns: Benchmark; Gates; Sleep; Area_Orig [µm2]; Area_Opt [µm2]; ∆ [%]. Rows: six Block benchmarks.]
Experimental Results (3/3)
Delay Penalty
[Table: columns: Cell; No Leak Control: PLk [mW], Delay [ps]; Leak Control: PLk [mW], Delay [ps]; ∆Power [%]; ∆Delay [%]. Rows: twelve G cells.]
Conclusion
Sleep transistor insertion:
– Driven by a layout-aware cost function
– Done with tunable performance and area penalty