1
Polynomial-Time Algorithms for Designing Dual-Voltage Energy Efficient Circuits
Master's Thesis Defense
Mridula Allani
Advisor: Dr. Vishwani D. Agrawal
Committee Members: Dr. Victor P. Nelson, Dr. Adit D. Singh
Department of Electrical and Computer Engineering, Auburn University
October 19, 2011
2
Outline
Motivation
Problem statement
Background
Contributions
Algorithm to find V_DDL
Algorithm to assign V_DDL
Results
Future work
References
10/19/2011 2 Mridula Allani - MS Thesis Defense
3
Motivation
Ref. http://www.anandtech.com/show/3794/the-iphone-4-review/13
4
Motivation
Current dual-voltage designs use 0.7 V_DD as the lower supply voltage.
Existing algorithms for assigning the low voltage have exponential or polynomial complexity.
Faster algorithms that also increase energy savings are needed.
5
Problem Statement
Develop a linear-time algorithm to find the optimal lower supply voltage.
Develop new algorithms for voltage assignment in dual-V_DD design.
6
Background
Gate slack: the amount of time by which a signal is early or late.
Critical path: the longest path in the circuit. All gates on this path have zero slack.
Timing constraints: no other path can be longer than the critical path, and no gate should have negative slack.
7
Background
Timing violations: a path is longer than the critical path; the gates on this path have negative slack.
Topological constraint: no V_DDL gate is at the input of any V_DD gate.
Estimate of energy savings (neglecting leakage):
$E_{sav} = \frac{N}{n} \cdot \frac{V_{DD}^2 - V_{DDL}^2}{V_{DD}^2} \times 100\%$
where N is the number of gates at the low voltage and n is the total number of gates.
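A minimal sketch of this estimate, assuming that switching energy scales with the square of the supply voltage and that every gate contributes equally, so that the saving is (N/n)(1 − (V_DDL/V_DD)²). The function name and argument names are illustrative and do not come from the thesis. Under these assumptions the form reproduces the c880 entry in the results tables (213 of 360 gates at 0.49 V with V_DD = 1.2 V).

```python
def energy_savings_percent(n_low: int, n_total: int, vddl: float, vdd: float) -> float:
    """Estimated dynamic-energy saving (%) of a dual-voltage design.

    Assumptions (not stated in the slides): leakage is neglected, all gates
    contribute equally, and energy per gate is proportional to supply squared.
    """
    if n_total <= 0 or vdd <= 0:
        raise ValueError("n_total and vdd must be positive")
    return 100.0 * (n_low / n_total) * (1.0 - (vddl / vdd) ** 2)


# c880 without level converters: 213 of 360 gates at 0.49 V, V_DD = 1.2 V
print(round(energy_savings_percent(213, 360, 0.49, 1.2), 1))  # 49.3
```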
8
Background
Basic idea: decrease energy consumption without any delay penalty.
This is done by assigning the lower supply voltage to gates on non-critical paths.
Different algorithms propose different ways of finding these non-critical gates.
9
Background
Kuroda and Hamada report that the power reduction ratio is minimum when 0.6 V_DD ≤ V_DDL ≤ 0.7 V_DD.
Chen et al., Kulkarni et al., and Srivastava et al. claim that the optimal value of V_DDL for minimizing total power is 50% of V_DD.
Hamada et al. propose a rule of thumb for V_DDL [equation missing].
10
Background
[Figures: CVS structure [Usami and Horowitz] and ECVS structure [Usami et al.], showing V_DD gates, V_DDL gates, and level converters.]
Ref. K. Usami and M. Horowitz, "Clustered Voltage Scaling Technique for Low-Power Design," in Proceedings of the International Symposium on Low Power Design, pp. 23-26, 1995.
Ref. K. Usami et al., "Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media Processor," IEEE Journal of Solid-State Circuits, vol. 33, no. 3, pp. 463-472, Mar. 1998.
11
Background
Kulkarni et al.:
  Greedy heuristic based on gate slacks.
  Uses 0.7 V_DD and 0.5 V_DD as V_DDL.
  Includes power and delay overhead of level converters.
Sundararajan and Parhi:
  Linear-programming-based model.
  Minimizes the power consumption.
  Includes level converter delay overheads.
12
Background
TPI(i): longest time for an event to arrive at gate i from a primary input (PI).
TPO(i): longest time for an event from gate i to reach a primary output (PO).
Delay of the longest path through gate i: $D_{p,i} = TPI(i) + TPO(i)$
Slack time for gate i: $S_i = T_c - D_{p,i}$, where $T_c = \max_i \{ D_{p,i} \}$ [Kim and Agrawal]
[Figure: a path from PI to PO through gate i, with TPI(i), TPO(i), and T_c marked.]
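The slack definition above can be sketched as two longest-path passes over the gate-level netlist. This is an illustration only: the function name, the dictionary-based netlist format, and the choice to count a gate's own delay inside TPI(i) rather than TPO(i) are assumptions, made so that TPI(i) + TPO(i) counts each gate delay exactly once.

```python
from collections import defaultdict


def gate_slacks(delay, fanin):
    """Return (T_c, {gate: slack}) for a combinational netlist.

    delay: {gate: gate delay}
    fanin: {gate: [driving gates]}; gates missing from fanin are fed by PIs.
    """
    fanout = defaultdict(list)
    for g in delay:
        for d in fanin.get(g, []):
            fanout[d].append(g)

    # Topological order (Kahn's algorithm); the list grows while it is scanned.
    indeg = {g: len(fanin.get(g, [])) for g in delay}
    order = [g for g in delay if indeg[g] == 0]
    for g in order:
        for s in fanout[g]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    # TPI(i): longest delay from any PI up to and including gate i (assumed convention).
    tpi = {}
    for g in order:
        tpi[g] = delay[g] + max((tpi[d] for d in fanin.get(g, [])), default=0.0)

    # TPO(i): longest delay from the output of gate i to any PO.
    tpo = {}
    for g in reversed(order):
        tpo[g] = max((delay[s] + tpo[s] for s in fanout[g]), default=0.0)

    dp = {g: tpi[g] + tpo[g] for g in delay}   # D_p,i
    tc = max(dp.values())                      # T_c
    return tc, {g: tc - dp[g] for g in delay}  # S_i = T_c - D_p,i


# Hypothetical five-gate example: a -> c -> d is the critical path.
delay = {"a": 2, "b": 1, "c": 3, "d": 1, "e": 1}
fanin = {"c": ["a", "b"], "d": ["c"], "e": ["b"]}
tc, slack = gate_slacks(delay, fanin)
print(tc, slack)
```

In this example gates a, c, and d lie on the critical path and get zero slack, while b and e get slacks of 1 and 4 time units.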
13
Background
S_u, the upper slack time, is the lower bound on the slacks of gates that can be unconditionally assigned the low voltage without affecting the critical timing of the circuit:
$S_u = \frac{\beta - 1}{\beta} T_c$
where $\beta = D'_{p,i} / D_{p,i}$, and $D'_{p,i}$ and $D_{p,i}$ are the longest path delays through gate i when it is supplied with V_DDL and V_DD, respectively. [Kim and Agrawal]
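A short derivation of this bound from the slack definition S_i = T_c − D_{p,i}, assuming the factor β applies to the whole longest path through gate i. This is a reading of the definitions above, not a quotation from [Kim and Agrawal].

```latex
\begin{aligned}
D'_{p,i} = \beta\, D_{p,i} &\le T_c \\
\beta\,(T_c - S_i) &\le T_c \\
S_i &\ge \frac{\beta - 1}{\beta}\, T_c = S_u
\end{aligned}
```

As a sanity check, β = 1 (V_DDL = V_DD) gives S_u = 0, and β → ∞ gives S_u → T_c, which matches the plot annotations for V_DDL = 1.2 V (S_u = 0 ps) and V_DDL = 0.1 V (S_u = 510 ps = T_c).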
14
Background
Recent work [Kim and Agrawal]:
Assign V_DDL to gates with S_i ≥ S_u.
Assign V_DDL to gates with S_l ≤ S_i ≤ S_u, one by one, without violating timing or topological constraints.
Repeat the last two steps across all voltages to find the best V_DDL and the corresponding dual-voltage design with the least energy.
Ref. K. Kim and V. D. Agrawal, "Dual Voltage Design for Minimum Energy Using Gate Slack," in Proceedings of the IEEE International Conference on Industrial Technology, pp. 419-424, Mar. 2011.
15
Example: without level converter
[Figure: circuit from IN to OUT with gates supplied by V_1 followed by gates supplied by V_2.]
16
Example: energy per cycle and delay, without level converter
90 nm PTM model. Clock period: 1500 ps.
Each cell gives energy per cycle, delay. Rows are V_2 (V), columns are V_1 (V).
V_2 \ V_1 | 0.4 | 0.6 | 0.8 | 1.0 | 1.2
1.2 | 9.69 fJ, ∞ | 44.84 fJ, 280.6 ps | 15.75 fJ, 123.7 ps | 7.315 fJ, 95.61 ps | 7.863 fJ, 84.15 ps
1.0 | 6.465 fJ, ∞ | 10.13 fJ, 204.5 ps | 4.573 fJ, 123.2 ps | 5.203 fJ, 99.28 ps | 6.65 fJ, 91.19 ps
0.8 | 6.6 fJ, 1183 ps | 2.651 fJ, 203.3 ps | 3.233 fJ, 132.3 ps | 4.289 fJ, 115 ps | 5.678 fJ, 107.7 ps
0.6 | 1.291 fJ, 801.5 ps | 1.761 fJ, 235.4 ps | 2.543 fJ, 179.4 ps | 3.567 fJ, 164.3 ps | 4.977 fJ, 156.1 ps
0.4 | 0.755 fJ, 1062 ps | 1.285 fJ, 614 ps | 2.052 fJ, 565.3 ps | 3.082 fJ, 560.5 ps | 4.423 fJ, 557.7 ps
17
Example: with level converter
[Figure: circuit from IN to OUT with gates supplied by V_1 followed by gates supplied by V_2, with a level converter.]
18
Example: energy per cycle and delay, with and without level converter
Each cell gives energy per cycle, delay. Rows are V_2 (V), columns are V_1 (V).
With level converter:
V_2 \ V_1 | 0.4 | 0.6 | 0.8 | 1.0 | 1.2
1.2 | 10.44 fJ, ∞ | 7.18 fJ, 249.1 ps | 7.18 fJ, 184.0 ps | 7.98 fJ, 161.7 ps | 9.316 fJ, 153.4 ps
1.0 | 7.13 fJ, 1198 ps | 4.39 fJ, 268.5 ps | 4.96 fJ, 203.3 ps | 5.94 fJ, 182.8 ps | 8.05 fJ, 174.8 ps
0.8 | 2.74 fJ, 952.5 ps | 2.83 fJ, 309.4 ps | 3.56 fJ, 251.4 ps | 4.93 fJ, 231.8 ps | 16.14 fJ, 225.8 ps
0.6 | 1.408 fJ, 948.8 ps | 1.91 fJ, 470.7 ps | 2.82 fJ, 418.9 ps | 10.34 fJ, 405.7 ps | 45.31 fJ, 387.8 ps
0.4 | 0.81 fJ, 2188 ps | 1.4 fJ, 1757 ps | 7.08 fJ, 1733 ps | 6.46 fJ, ∞ | 9.75 fJ, ∞
Without level converter:
V_2 \ V_1 | 0.4 | 0.6 | 0.8 | 1.0 | 1.2
1.2 | 9.69 fJ, ∞ | 44.84 fJ, 280.6 ps | 15.75 fJ, 123.7 ps | 7.315 fJ, 95.61 ps | 7.863 fJ, 84.15 ps
1.0 | 6.465 fJ, ∞ | 10.13 fJ, 204.5 ps | 4.573 fJ, 123.2 ps | 5.203 fJ, 99.28 ps | 6.65 fJ, 91.19 ps
0.8 | 6.6 fJ, 1183 ps | 2.651 fJ, 203.3 ps | 3.233 fJ, 132.3 ps | 4.289 fJ, 115 ps | 5.678 fJ, 107.7 ps
0.6 | 1.291 fJ, 801.5 ps | 1.761 fJ, 235.4 ps | 2.543 fJ, 179.4 ps | 3.567 fJ, 164.3 ps | 4.977 fJ, 156.1 ps
0.4 | 0.755 fJ, 1062 ps | 1.285 fJ, 614 ps | 2.052 fJ, 565.3 ps | 3.082 fJ, 560.5 ps | 4.423 fJ, 557.7 ps
20
Grouping of gates
[Plot: delay increment versus slack, showing the 45° line, S_u = 336.9 ps, and gate groups P and G.]
Plot annotation: $\sum (dl_i - dh_i) \le \min\{S_i\}$
21
Groups when V_DDL = 1.2 V
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 1.2 V, T_c = 510 ps, S_u = 0 ps
22
Groups when V_DDL = 1.19 V
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 1.19 V, T_c = 510 ps, S_u = 14.6 ps
23
Groups when V_DDL = 0.49 V
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
T_c = 510 ps, S_u = 336.9 ps
24
Groups when V_DDL = 0.39 V
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.39 V, T_c = 510 ps, S_u = 469 ps
25
Groups when V_DDL = 0.1 V
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.1 V, T_c = 510 ps, S_u = 510 ps = T_c
26
Theorems
1. Gates above the 45° line in the "delay increment versus slack" plot cannot be assigned the lower supply voltage without violating the timing constraint.
2. [equation missing] where β_i = dl_i / dh_i, dl_i is the low-voltage delay, and dh_i is the high-voltage delay of gate i. The maximum value of β_i, β_max, gives the lower bound on the gate slacks.
27
Theorems
3. Groups within P that satisfy $\sum y_i \le \min\{S_i\}$ can be assigned the lower supply voltage without violating the timing constraint (where y_i = dl_i − dh_i, dl_i is the low-voltage delay of gate i, dh_i is the high-voltage delay of gate i, and S_i is the slack of gate i at V_DD).
4. The group G of gates with slacks greater than S_u can always be assigned the lower supply voltage without causing any topological violations.
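The grouping implied by the theorems can be sketched as follows. This is a hedged illustration: the function names and the label "blocked" are hypothetical, and treating P as the gates with slack below S_u that lie on or below the 45° line is an assumption, since the slides define P only through the plots.

```python
def group_gates(slack, dh, dl, s_u):
    """Split gates into G, P, and 'blocked' from slack and delay increment.

    slack: {gate: slack at V_DD}; dh/dl: {gate: delay at V_DD / V_DDL}.
    """
    groups = {"G": [], "P": [], "blocked": []}
    for g in slack:
        increment = dl[g] - dh[g]            # y_i = dl_i - dh_i
        if slack[g] >= s_u:
            groups["G"].append(g)            # safe for unconditional assignment
        elif increment <= slack[g]:
            groups["P"].append(g)            # on or below the 45-degree line (assumed definition of P)
        else:
            groups["blocked"].append(g)      # above the 45-degree line (Theorem 1)
    return groups


def group_fits(members, slack, dh, dl):
    """Theorem 3 check: sum of delay increments <= smallest slack in the group."""
    return sum(dl[g] - dh[g] for g in members) <= min(slack[g] for g in members)


# Hypothetical data in picoseconds, using S_u = 336.9 ps from the c880 plots.
slack = {"a": 0.0, "b": 50.0, "c": 400.0}
dh = {"a": 10.0, "b": 10.0, "c": 10.0}
dl = {"a": 30.0, "b": 30.0, "c": 30.0}
print(group_gates(slack, dh, dl, s_u=336.9))
```

Each gate is visited once, so the grouping itself is linear in the number of gates once the slacks are known.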
28
Algorithm to find V_DDL
Assume all gates are assigned V_DD initially.
Calculate the gate slacks.
Group the gates according to their slacks and delays.
29
Algorithm to find V_DDL
Without level converters: $V_{DDL} = V_{DDL1}$
With level converters: $V_{DDL} = (V_{DDL1} V_{DDL2})^{1/2}$
30
Algorithm to find V_DDL
[Plot for C880, 360 gates in total; axis marker: = V_DD.]
31
Algorithm to find V_DDL
[Plot for C880, 360 gates in total; axis marker: = V_DD.]
V_DDL1 = 0.49 V, V_DDL2 = 0.71 V
32
Results: V_DDL selection algorithm, without level converters
Each voltage column gives V_DDL (V), gates in V_DDL, E_sav (%).
ISCAS'85 | Total gates | V_DDL = V_DDL1 | V_DDL = V_DDL2 | V_DDL = (V_DDL1 + V_DDL2)/2 | V_DDL = (V_DDL1 V_DDL2)^(1/2)
C432 | 154 | 0.80, 8, 2.9 | 0.89, 8, 2.3 | 0.84, 8, 2.7 | 0.84, 8, 2.7
C499 | 493 | 0.76, 113, 13.7 | 1.11, 141, 4.1 | 0.93, 123, 10.0 | 0.91, 129, 11.1
C880 | 360 | 0.49, 213, 49.3 | 0.71, 229, 41.3 | 0.6, 229, 47.7 | 0.58, 229, 48.8
C1355 | 469 | 0.77, 76, 9.5 | 1.11, 108, 3.4 | 0.94, 76, 6.3 | 0.92, 76, 6.7
C1908 | 584 | 0.60, 221, 28.4 | 1.00, 221, 11.6 | 0.80, 221, 21.9 | 0.77, 221, 22.3
C2670 | 901 | 0.48, 570, 53.1 | 0.82, 570, 33.7 | 0.65, 570, 44.7 | 0.62, 570, 46.4
C3540 | 1270 | 0.52, 149, 9.5 | 0.73, 149, 7.4 | 0.62, 149, 8.6 | 0.61, 149, 8.7
C5315 | 2077 | 0.49, 1220, 49.0 | 0.75, 1226, 36.0 | 0.62, 1220, 43.1 | 0.60, 1220, 44.1
C6288 | 2407 | 0.55, 75, 2.5 | 1.00, 77, 0.98 | 0.77, 77, 1.9 | 0.73, 77, 2.0
C7288 | 2823 | 0.54, 1582, 44.7 | 0.71, 212, 38.9 | 0.62, 1672, 43.4 | 0.61, 1672, 43.4
33
Results: V_DDL selection algorithm, with level converters
Each voltage column gives V_DDL (V), gates in V_DDL, E_sav (%).
ISCAS'85 | Total gates | V_DDL = V_DDL1 | V_DDL = V_DDL2 | V_DDL = (V_DDL1 + V_DDL2)/2 | V_DDL = (V_DDL1 V_DDL2)^(1/2)
C432 | 154 | 0.80, 73, 17.1 | 0.89, 85, 24.8 | 0.84, 81, 26.8 | 0.84, 81, 26.8
C499 | 493 | 0.76, 173, 21.1 | 1.11, 359, 10.5 | 0.93, 249, 20.2 | 0.91, 247, 21.3
C880 | 360 | 0.49, 223, 51.6 | 0.71, 309, 55.8 | 0.6, 290, 60.4 | 0.58, 286, 60.9
C1355 | 469 | 0.77, 122, 15.3 | 1.11, 260, 8.0 | 0.94, 197, 16.3 | 0.92, 193, 17.0
C1908 | 584 | 0.60, 263, 33.8 | 1.00, 267, 24.4 | 0.80, 395, 37.6 | 0.77, 385, 38.8
C2670 | 901 | 0.48, 376, 35.1 | 0.82, 784, 46.4 | 0.65, 677, 53.1 | 0.62, 633, 51.5
C3540 | 1270 | 0.52, 647, 41.4 | 0.73, 1073, 53.2 | 0.62, 906, 52.3 | 0.61, 881, 51.5
C5315 | 2077 | 0.49, 1140, 45.0 | 0.75, 1777, 52.1 | 0.62, 1633, 57.6 | 0.60, 1602, 57.8
C6288 | 2407 | 0.55, 659, 21.6 | 1.00, 1877, 23.8 | 0.77, 1302, 31.8 | 0.73, 1189, 47.3
C7288 | 2823 | 0.54, 1560, 44.1 | 0.71, 2235, 51.5 | 0.62, 1998, 51.9 | 0.61, 1197, 51.8
34
Results: comparison with reported data, without level converters
ISCAS'85 | Total gates | V_DDL = V_DDL1: V_DDL (V), gates in V_DDL, E_sav (%) | V_DDL = 0.7 V_DD = 0.84 V: gates in V_DDL, E_sav (%) | V_DDL = 0.5 V_DD = 0.6 V: gates in V_DDL, E_sav (%)
C432 | 154 | 0.80, 8, 2.9 | 8, 2.7 | 8, 3.9
C499 | 493 | 0.76, 113, 13.7 | 121, 12.5 | 56, 8.5
C880 | 360 | 0.49, 213, 49.3 | 229, 32.4 | 229, 47.7
C1355 | 469 | 0.77, 76, 9.5 | 76, 8.3 | 64, 10.2
C1908 | 584 | 0.60, 221, 28.4 | 221, 19.3 | 221, 28.4
C2670 | 901 | 0.48, 570, 53.1 | 570, 32.3 | 570, 47.5
C3540 | 1270 | 0.52, 149, 9.5 | 149, 6.0 | 149, 8.8
C5315 | 2077 | 0.49, 1220, 49.0 | 1240, 30.5 | 1220, 44.1
C6288 | 2407 | 0.55, 75, 2.5 | 77, 1.6 | 75, 2.3
C7288 | 2823 | 0.54, 1582, 44.7 | 2359, 42.6 | 1672, 43.9
35
Results: comparison with reported data, with level converters
ISCAS'85 | Total gates | V_DDL = (V_DDL1 V_DDL2)^(1/2): V_DDL (V), gates in V_DDL, E_sav (%) | V_DDL = 0.7 V_DD = 0.84 V: gates in V_DDL, E_sav (%) | V_DDL = 0.5 V_DD = 0.6 V: gates in V_DDL, E_sav (%)
C432 | 154 | 0.84, 81, 26.8 | 81, 26.8 | 43, 20.9
C499 | 493 | 0.91, 247, 21.3 | 211, 21.2 | 99, 15.1
C880 | 360 | 0.58, 286, 60.9 | 323, 45.8 | 290, 60.4
C1355 | 469 | 0.92, 193, 17.0 | 154, 16.8 | 44, 7.0
C1908 | 584 | 0.77, 385, 38.8 | 415, 36.2 | 263, 33.8
C2670 | 901 | 0.62, 633, 51.5 | 813, 46.0 | 606, 50.5
C3540 | 1270 | 0.61, 881, 51.5 | 1093, 43.9 | 864, 51.0
C5315 | 2077 | 0.60, 1602, 57.8 | 1812, 44.5 | 1602, 56.9
C6288 | 2407 | 0.73, 1189, 47.3 | 1470, 31.2 | 780, 24.3
C7288 | 2823 | 0.61, 1197, 51.8 | 2347, 42.4 | 1943, 51.6
37
Algorithm to assign V_DDL
Assume all gates are at V_DD initially.
Calculate the slacks of all gates.
Assign V_DDL to gates whose slacks satisfy S_i ≥ S_u.
Recalculate slacks.
38
Algorithm to assign V_DDL
Assign V_DDL to a group of gates in P satisfying the condition $\sum (dl_i - dh_i) \le \min\{S_i\}$.
Recalculate slacks.
Check whether any V_DDL gate is at the input of a V_DD gate and whether any gate has negative slack.
39
Algorithm to assign V_DDL
If any violations occur, put the corresponding gate back at V_DD.
Recalculate slacks.
Repeat the previous five steps until no V_DD gates remain in groups P and G.
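The assignment steps can be sketched as an assign, check, and revert loop. This is a simplified illustration under stated assumptions, not the thesis implementation: it tries candidate gates one at a time instead of in groups, visits them from outputs toward inputs so that fanouts are lowered first, re-runs full timing analysis after every trial (so it does not reflect the reported CPU times), and takes S_u as a given input. All function and variable names are hypothetical.

```python
def analyze(delay, fanin, order):
    """Return ({gate: D_p,i}, fanout map) for gates listed in topological order."""
    fanout = {g: [] for g in order}
    for g in order:
        for d in fanin.get(g, []):
            fanout[d].append(g)
    tpi, tpo = {}, {}
    for g in order:                                   # longest arrival, gate delay included
        tpi[g] = delay[g] + max((tpi[d] for d in fanin.get(g, [])), default=0.0)
    for g in reversed(order):                         # longest remaining delay to a PO
        tpo[g] = max((delay[s] + tpo[s] for s in fanout[g]), default=0.0)
    return {g: tpi[g] + tpo[g] for g in order}, fanout


def assign_vddl(order, fanin, dh, dl, s_u):
    """Return the set of gates placed at V_DDL (dh/dl: gate delay at V_DD/V_DDL)."""
    dp, fanout = analyze(dh, fanin, order)
    tc = max(dp.values())                             # critical delay with all gates at V_DD
    low = {g for g in order if tc - dp[g] >= s_u}     # group G, assigned without checks

    def delays():
        return {g: dl[g] if g in low else dh[g] for g in order}

    changed = True
    while changed:                                    # repeat until a full pass assigns nothing
        changed = False
        dp, _ = analyze(delays(), fanin, order)
        slack = {g: tc - dp[g] for g in order}
        for g in reversed(order):                     # outputs first
            if g in low or dl[g] - dh[g] > slack[g]:
                continue                              # already low, or above the 45-degree line
            low.add(g)                                # tentative assignment
            dp_try, _ = analyze(delays(), fanin, order)
            timing_ok = max(dp_try.values()) <= tc + 1e-9
            topo_ok = all(s in low for s in fanout[g])  # no V_DDL gate drives a V_DD gate
            if timing_ok and topo_ok:
                changed = True
                slack = {x: tc - dp_try[x] for x in order}
            else:
                low.discard(g)                        # violation: put the gate back at V_DD
    return low


# Hypothetical netlist: a -> c -> d is critical, x -> y and e are off-critical.
order = ["a", "c", "d", "x", "y", "e"]
fanin = {"c": ["a"], "d": ["c"], "y": ["x"]}
dh = {"a": 2.0, "c": 3.0, "d": 1.0, "x": 4.0, "y": 1.0, "e": 1.0}
dl = {g: 2.0 * dh[g] for g in dh}                     # assumed beta = 2, so S_u = T_c / 2 = 3
print(sorted(assign_vddl(order, fanin, dh, dl, s_u=3.0)))
```

In this example gate e falls in G and gate y is accepted from P, while x stays at V_DD because lowering it would exceed the critical path delay.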
40
c880 slack distribution
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.49 V, S_u = 336.9 ps
41
Slack data after V_DDL assignment
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.49 V, S_u = 336.9 ps
42
Dual voltage design without level converter
Columns 3 to 6: V_DDL = V_DDL1 determination and assignment. Columns 7 to 9: SPICE results**. Columns 10 and 11: [Kim and Agrawal].
ISCAS'85 | Total gates | V_DDL (V) | Gates in V_DDL | E_sav (%) | CPU* (s) | E single V_DD (fJ) | E dual V_DD (fJ) | E_sav (%) | E_sav (%) | CPU (s)
C432 | 154 | 0.80 | 8 | 2.9 | 1.78 | 161.3 | 155.4 | 3.7 | 3.9 | 15.8
C499 | 493 | 0.76 | 113 | 13.7 | 9.41 | 463 | 427 | 7.8 | 5.9 | 194.4
C880 | 360 | 0.49 | 213 | 49.3 | 5.39 | 277.6 | 115.8 | 58.3 | 50.8 | 62.1
C1355 | 469 | 0.77 | 76 | 9.5 | 8.75 | 455.2 | 433.1 | 4.9 | 4.3 | 132
C1908 | 584 | 0.60 | 221 | 28.4 | 11.43 | 496.5 | 378.3 | 23.8 | 19.0 | 247.8
C2670 | 901 | 0.48 | 570 | 53.1 | 23.49 | 660.3 | 251.5 | 61.9 | 47.8 | 480.7
C3540 | 1270 | 0.52 | 149 | 9.5 | 45.44 | 1843 | 1620 | 12.2 | 9.6 | 1244
C5315 | 2077 | 0.49 | 1220 | 49.0 | 109.47 | 2320 | 1272 | 45.2 | N/R | N/R
C6288 | 2407 | 0.55 | 75 | 2.5 | 154.94 | 1932 | 1869 | 3.3 | 2.6 | 6128
C7288 | 2823 | 0.54 | 1582 | 44.7 | 191.04 | 2465 | 1562 | 36.6 | N/R | N/R
* Intel Core i5 2.30 GHz, 4 GB RAM
** 90 nm PTM model
43
CPU Time vs. Number of Gates
[Plot: CPU time versus number of gates.]
44
c880 slacks with 5% increase in T_c
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.67 V, S_u = 293 ps
45
c880 final slacks with 5% increase in T_c
[Plot: delay increment versus slack, showing the 45° line and gate groups P and G.]
V_DD = 1.2 V, V_DDL = 0.67 V, S_u = 293 ps
46
Dual voltage design without level converter, with 5% increase in T_c
Columns 3 to 6: V_DDL = V_DDL1 determination and assignment. Columns 7 to 9: SPICE results**.
ISCAS'85 | Total gates | V_DDL (V) | Gates in V_DDL | E_sav (%) | CPU* (s) | E single V_DD (fJ) | E dual V_DD (fJ) | E_sav (%)
C432 | 154 | 1.08 | 154 | 19.0 | 1.70 | 161.3 | 123.9 | 23.2
C499 | 493 | 1.03 | 493 | 26.3 | 9.18 | 463 | 321.9 | 30.5
C880 | 360 | 0.67 | 334 | 65.8 | 4.32 | 277.6 | 83.86 | 69.8
C1355 | 469 | 1.06 | 469 | 22.0 | 8.52 | 455.2 | 339.9 | 12.2
C1908 | 584 | 1.00 | 584 | 30.6 | 8.56 | 496.5 | 445 | 10.4
C2670 | 901 | 0.81 | 899 | 54.3 | 15.81 | 660.3 | 257.3 | 61.0
C3540 | 1270 | 0.90 | 1270 | 43.8 | 28.22 | 1843 | 949.5 | 48.5
C5315 | 2077 | 0.72 | 2077 | 64.0 | 61.77 | 2320 | 716.8 | 69.1
C6288 | 2407 | 1.07 | 2407 | 20.5 | 108.39 | 1932 | 1464 | 24.2
C7288 | 2823 | 0.68 | 2816 | 67.7 | 175.07 | 2465 | 677.2 | 72.3
* Intel Core i5 2.30 GHz, 4 GB RAM
** 90 nm PTM model
47
Future work
Accommodate level converter energy overheads.
Consider leakage energy reduction.
Dual threshold designs.
Simultaneous dual supply voltage and dual threshold voltage designs.
Include the effects of process variations.
48
References
1. T. Kuroda and M. Hamada, "Low-Power CMOS Digital Design with Dual Embedded Adaptive Power Supplies," IEEE Journal of Solid-State Circuits, vol. 35, no. 4, pp. 652-655, Apr. 2000.
2. M. Hamada, Y. Ootaguro, and T. Kuroda, "Utilizing Surplus Timing for Power Reduction," in Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 89-92, 2001.
3. C. Chen, A. Srivastava, and M. Sarrafzadeh, "On Gate Level Power Optimization Using Dual-Supply Voltages," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 9, no. 5, pp. 616-629, Oct. 2001.
4. S. H. Kulkarni, A. N. Srivastava, and D. Sylvester, "A New Algorithm for Improved VDD Assignment in Low Power Dual VDD Systems," in Proceedings of the International Symposium on Low Power Design, pp. 200-205, 2004.
5. A. Srivastava, D. Sylvester, and D. Blaauw, "Concurrent Sizing, Vdd and Vth Assignment for Low-Power Design," in Proceedings of the Design, Automation and Test in Europe Conference, pp. 107-118, 2004.
6. K. Kim, Ultra Low Power CMOS Design. PhD thesis, Auburn University, ECE Dept., Auburn, AL, May 2011.
49
References
7. K. Kim and V. D. Agrawal, "Dual Voltage Design for Minimum Energy Using Gate Slack," in Proceedings of the IEEE International Conference on Industrial Technology, pp. 419-424, Mar. 2011.
8. K. Usami and M. Horowitz, "Clustered Voltage Scaling Technique for Low-Power Design," in Proceedings of the International Symposium on Low Power Design, pp. 23-26, 1995.
9. K. Usami, M. Igarashi, F. Minami, T. Ishikawa, M. Kanzawa, M. Ichida, and K. Nogami, "Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media Processor," IEEE Journal of Solid-State Circuits, vol. 33, no. 3, pp. 463-472, Mar. 1998.
10. V. Sundararajan and K. K. Parhi, "Synthesis of Low Power CMOS VLSI Circuits Using Dual Supply Voltages," in Proceedings of the 36th Annual Design Automation Conference, pp. 72-75, 1999.
11. M. Allani and V. D. Agrawal, "Level-Converter Free Dual-Voltage Design of Energy Efficient Circuits Using Gate Slack," submitted to the Design Automation and Test in Europe Conference, March 12-16, 2012.
50
Thank you.