An Active Glitch Elimination Technique for FPGAs, by Julien Lamoureux, Guy Lemieux, and Steven J.E. Wilton (University of British Columbia). Funding provided by Altera and NSERC.
Context of this Research: FPGAs use a lot of power, and dynamic power still dominates. Source: Altera Stratix II Study (99 Circuits).
Overview: Reducing power in FPGAs by minimizing glitching, using programmable delay circuits to align arrival times. Results: 18% power savings, 5% area overhead, 1% critical-path delay overhead. No changes to the existing CAD flow.
What is glitching? Unnecessary transitions, generated by uneven arrival times and propagated by certain gates.
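To make the definition concrete, here is a minimal sketch, assuming an ideal zero-delay 2-input XOR gate and integer time steps (both are simplifications introduced for illustration, not taken from the talk). Both inputs rise from 0 to 1, so the functional output stays at 0; any output transitions are glitches caused by the inputs arriving at different times.

```python
def xor_output_transitions(arrival_a, arrival_b, t_end=10):
    """Count output transitions of an ideal XOR whose two inputs both
    rise from 0 to 1 at the given (integer) arrival times."""
    transitions = 0
    prev = 0 ^ 0  # output before either input has changed
    for t in range(t_end + 1):
        a = 1 if t >= arrival_a else 0
        b = 1 if t >= arrival_b else 0
        out = a ^ b
        if out != prev:
            transitions += 1
        prev = out
    return transitions

# Uneven arrival times: the output pulses high between t=1 and t=3.
print(xor_output_transitions(arrival_a=1, arrival_b=3))  # 2 transitions (a glitch)

# Aligned arrival times: the output never moves.
print(xor_output_transitions(arrival_a=3, arrival_b=3))  # 0 transitions
```

The two extra transitions in the uneven case do no useful work but still charge and discharge the output node, which is why they cost dynamic power.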
How much glitching is there?
Circuit   Switching Activity   Functional   Glitching   % Glitching
C1355     0.319                0.231        0.088       27.5
C1908     0.255                0.167        0.088       34.6
C2670     0.267                0.208        0.059       22.2
C3540     0.419                0.230        0.189       45.2
C432      0.260                0.184        0.076       29.3
C499      0.341                0.232        0.109       31.9
C5315     0.400                0.253        0.147       36.7
C6288     1.562                0.295        1.267       81.1
C7552     0.392                0.228        0.165       42.0
C880      0.232                0.186        0.046       19.8
alu4      0.081                0.070        0.011       13.1
apex2     0.049                0.042        0.007       13.7
apex4     0.044                0.030        0.014       32.3
des       0.267                0.169        0.098       36.8
ex1010    0.032                0.015        0.017       52.9
ex5p      0.168                0.082        0.086       51.0
misex3    0.064                0.050        0.013       20.9
pdc       0.035                0.024        0.011       31.8
seq       0.048                0.040        0.008       16.0
spla      0.049                0.028        0.021       42.7
Average   -                    -            -           34.1
Context of this Research: on average, glitching accounts for 34.1% of switching activity, roughly 1/3 of dynamic power.
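The percentage column can be checked against the other columns. The sketch below assumes that the first activity figure is the total (functional plus glitching) and that % Glitching is glitching divided by that total. The talk does not state this definition; it is inferred from rows of the benchmark table whose four values all survive.

```python
# (switching activity, functional, glitching, reported % glitching)
rows = {
    "C1355": (0.319, 0.231, 0.088, 27.5),
    "C3540": (0.419, 0.230, 0.189, 45.2),
    "C6288": (1.562, 0.295, 1.267, 81.1),
}

def glitch_percent(total_activity, glitching_activity):
    """Share of switching activity that is glitching, in percent."""
    return 100.0 * glitching_activity / total_activity

for name, (total, functional, glitching, reported) in rows.items():
    # Assumed identity: total activity = functional + glitching.
    assert abs(total - (functional + glitching)) < 0.002
    computed = glitch_percent(total, glitching)
    print(f"{name}: computed {computed:.1f}%, reported {reported:.1f}%")
```

Small differences between computed and reported values are expected, since the activities are rounded to three decimals.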
Idea: Use programmable delay circuits to line up arrival times.
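One way to picture the idea is as a small calculation per logic element: delay every early input until it matches the latest one. The sketch below is an illustration under assumed conditions, not the authors' implementation; the function name and the arrival times are hypothetical.

```python
def alignment_delays(arrival_times_ns):
    """Return the extra delay each input needs so that all inputs
    arrive at the same time as the latest one."""
    latest = max(arrival_times_ns)
    return [latest - t for t in arrival_times_ns]

# Hypothetical arrival times (ns) at the four inputs of one LUT.
arrivals = [1.2, 3.5, 0.4, 2.0]
delays = alignment_delays(arrivals)

for t, d in zip(arrivals, delays):
    print(f"arrives at {t:.1f} ns, add {d:.1f} ns, aligned at {t + d:.1f} ns")
```

The latest-arriving input always gets zero added delay, so the critical path through that input is not lengthened by the alignment itself, and at most one fewer delay than the number of inputs is needed.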
ASIC vs. FPGA. ASIC: circuit and delays are known before fabrication, so fixed delay circuits can be used. FPGA: circuit and delays are unknown before fabrication, so delay circuits need to be programmable, and the location of the delays must be carefully considered.
Where should the delays go? Option 1: Global Routing. Option 2: Logic Blocks (LABs).
Where in the LAB? Too Expensive?
4 Schemes
Programmable Delay Circuit (figure labels: biased PMOS, biasing circuit, biased NMOS)
Calibrating Scheme 1: three parameters: the number of delay circuits per LUT, the maximum delay of the delay circuit, and the minimum delay (increment) of the delay circuit.
Number of delay circuits per LUT: K-1, for a K-input LUT.
Maximum Delay: 4 ns, 6 ns, and 8 ns compared; 8 ns chosen.
Minimum Delay Increment: 100 ps, 200 ps, and 300 ps compared; 250 ps chosen.
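The two delay parameters bound what a programmable delay circuit can actually deliver: the requested delay is realized in whole increments and cannot exceed the maximum. The sketch below illustrates this with the 250 ps increment and 8 ns maximum shown on the slides. Rounding down (so a delayed input never arrives later than requested) is an assumption made here, and the function name is hypothetical.

```python
def programmed_delay_ps(required_ps, step_ps=250, max_ps=8000):
    """Largest realizable delay not exceeding the requested one,
    using whole steps of step_ps and capped at max_ps."""
    steps = required_ps // step_ps              # round down to a whole step (assumption)
    steps = max(0, min(steps, max_ps // step_ps))
    return steps * step_ps

# Hypothetical requested delays, in picoseconds.
for required in (100, 2300, 9000):
    got = programmed_delay_ps(required)
    print(f"requested {required} ps, programmed {got} ps, residual skew {required - got} ps")
```

A coarser increment leaves more residual skew on every input, while a smaller maximum leaves large skews uncorrected; both limit how many glitches can be removed.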
Glitch Elimination Results (% glitch elimination)
Circuit   Scheme 1
C1355     92.5
C1908     91.2
C2670     96.3
C3540     84.7
C432      62.8
C499      94.5
C5315     83.9
C6288     62.7
C7552     89.6
C880      84.6
alu4      96.6
apex2     97.0
apex4     95.0
des       86.4
ex1010
ex5p      82.7
misex3    97.2
pdc       89.3
seq       96.9
spla      95.9
Average   88.5
Glitch Elimination Results (% glitch elimination; columns: Scheme 1, Scheme 2, Scheme 3, Scheme 4)
C1355: 92.5, 92.2, 89.6
C1908: 91.2, 88.4, 74.2, 93.2
C2670: 96.3, 90.5, 77.3, 96.1
C3540: 84.7, 73.3, 74.4, 83.9
C432: 62.8, 65.8, 63.4, 62.0
C499: 94.5, 94.1
C5315: 83.9, 71.5, 69.6, 82.3
C6288: 62.7, 59.2, 56.6, 48.8
C7552: 89.6, 76.3, 77.8, 90.0
C880: 84.6, 80.5, 81.1, 75.7
alu4: 96.6, 85.6, 96.2
apex2: 97.0, 86.5, 85.9
apex4: 95.0, 91.6, 88.2
des: 86.4, 70.3, 75.6, 87.2
ex1010: 83.0, 79.9, 89.4
ex5p: 82.7, 74.5, 71.4, 82.2
misex3: 97.2, 88.8, 87.9
pdc: 89.3, 87.1
seq: 96.9, 88.9, 91.5, 96.7
spla: 95.9, 90.6, 91.7
Average: 88.5, 81.3, 79.8
Overhead
Area: 5.3% (counting minimum-width transistor areas)
Tcrit: 0.21% (VPR delay + HSPICE delay circuit)
Power: 0.45% (VPR power + HSPICE delay circuits)
Overall Results
Final Power Savings (%; columns: ideal glitch elimination, Scheme 1)
C1355: 28.8, 26.7
C1908: 21.1, 17.6
C2670: 13.4, 12.1
C3540: 31.7, 26.5
C432: 17.1, 11.2
C499: 34.6, 33.0
C5315: 22.8, 19.2
C6288: 73.1, 46.3
C7552: 25.5, 22.6
C880: 9.6, 7.8
alu4: 3.6, 3.3
apex2: 4.3, 4.1
apex4: 10.1, 9.5
des: 17.9, 15.4
ex1010: 18.4
ex5p: 28.1, 25.4
misex3: 8.1
pdc: 13.3, 11.8
seq: 6.1, 5.9
spla: 21.4, 20.8
Average: 18.0
Summary: Proposed an active glitch elimination technique for FPGAs. Examined how to implement the technique. Reduced power by 18% with only 5% area and 1% speed overhead. The proposed technique requires little or no modification to the CAD flow or routing architecture.