1
Adaptive Supply and Threshold Circuits and Applications Elad Alon, Kevin Nowka (IBM Research), Vladimir Stojanović, Mark Horowitz
2
Why Adaptive Vdd/Vth? No one transistor meets all needs: this transistor is too leaky, this transistor is too slow. Modern processes usually have lots of device options, but each still has a set of fixed characteristics. Optimum characteristics are often environment dependent and hence vary with time, workload, etc. May want to tune both Vdd and Vth on a block-by-block basis to minimize total energy. Supply can be set/controlled with regulators. How about Vth?
3
Body Biasing. Usual approach for adjusting Vth: body bias. Unfortunately, body bias is not very effective in modern technologies: less than 100mV shift in Vth across the full range of bias, and hardly any effect on power vs. frequency (traced by sweeping Vdd). Not very promising…
4
Adjusting Vth with Skewed Supplies. Assume that delay (and power) is dominated by edges in a particular direction (we'll come back to the other edges shortly). "Vth" can be adjusted by skewing the supplies of pos. edge gates (PMOS) vs. neg. edge gates (NMOS). Notation: ΔVth > 0 means the device's Vth is reduced by ΔVth.
5
Power vs. Frequency Preview. [Plots: ring oscillator w/body bias; skewed supply ring oscillator]
6
Outline: Skewed Supply Logic Circuits; Adaptive Implementation; Application to Minimum Energy Systems; Conclusions
7
What About the Non-Critical Edge? Skewed supply directly shifts Vth of non-critical devices in the opposite direction from the critical devices. Performance benefit is negated if we need to wait for the other (slow) edge. Need to return to default state so that leakage isn't set by the (reduced threshold) non-critical devices. [Figures: ΔVth > 0 case; ΔVth < 0 case]
8
Self-Resetting Skewed Supply Gates. Keep gates in default state most of the time: self-reset. Can be extended to use delayed self-reset (i.e. an interlock mechanism to guarantee input pulses overlap). Another option: use a self-resetting critical path replica to generate en/en_b signals for every level of logic gates.
9
Self-Resetting Gates: Challenges. Pulse width needs to (at least somewhat) track delay across Vdd and ΔVth: don't want pulses to disappear, and don't want the reset and re-enable delay to become the critical path. Must maintain proper operation at high |ΔVth|. ΔVth > 0: P-stack in subthreshold, N-stack Vth ≈ 0, so the gate could fire even when inputs are unasserted. ΔVth < 0: subthreshold N-stack vs. Vth ≈ 0 P-stack leakage.
10
Self-Resetting N-Gate: Keeper. Keeper structure helps boost voltage margin for ΔVth > 0 (as opposed to a weakened P-stack connected to inputs). Unfortunately, can't really make keeper strength track because the gate needs to reset to nVdd (unless we use yet another supply, nnVss…). For ΔVth < 0, need to make sure the N-stack can always overpower the keeper; may want to go back to a P-stack connected to inputs as "keepers".
11
Self-Resetting N-Gate: Reset Path. Pulse width tracks delay by alternating n/p supplies on the reset path. The falling edge of out_n traverses the "critical" edge through the reset gates; however, the re-enabling edge will then have the "non-critical" delay. For ΔVth > 0, the re-enable path will be slow, so a NOR gate cuts the path in half (for high ΔVth even this might not be enough; another option next). For ΔVth < 0, the re-enable edge will be fast, so need to make sure the gate fully resets (or at least turns on the keeper).
12
Self-Resetting N-Gate: Reset Path (#2). Even with the NOR gate, at high ΔVth the re-enable delay can be VERY slow: devices in that direction can easily be in subthreshold. Break the "rules" and have gates on the re-enable path swing from nVss to pVdd. This often costs less power than reducing the fanout, and also allows the evaluate device at the bottom of the N-stack to see ΔVth.
13
Self-Resetting P-Gate. Could be a mirrored version of the N-gate, but because of higher NMOS drive current (and sometimes lower Vth), an input "keeper" can still be relatively effective (even when ΔVth > 0).
14
Skewed Supply Oscillator. Use a ring oscillator as a test structure to characterize the gates; helps find issues that arise at various operating points. The pulse "chases its own tail".
15
Operating Range and Power vs. Frequency. [Plot curves: Vdd = 1V, Vdd = 600mV, Vdd = 500mV] Simulation results from a 90nm triple well technology (since the supplies are skewed for threshold adjustment, triple well isn't a requirement). Gates designed for ΔVth > 0 operation. Without heavy optimization, covers a ~300mV range in ΔVth.
16
Outline: Skewed Supply Logic Circuits; Adaptive Implementation; Application to Minimum Energy Systems; Conclusions
17
Adaptive System Block Diagram. "High" bandwidth FLL enforces the frequency constraint by setting Vdd. "Low" bandwidth threshold loop attempts to minimize power through Vth. Roles/bandwidths of the Vdd/Vth loops can be flipped; just want separated bandwidths to minimize stability issues. More complicated algorithms can do both at the same speed, but are probably not needed (environment changes are usually slow).
18
Generating the Power Supplies: Switching DC-DC Converters. Switching DC-DC converters are desirable for efficiency, but hardest to integrate: want external inductors or efficiency may suffer. Power measurement (for the Vth loop) can be tricky: could use an extra series resistor, but again that costs efficiency (may get that resistance from an on-chip inductor anyway).
19
Generating the Power Supplies: On-Chip Linear Regulators. On-chip linear regulators are most desirable for integration: high bandwidths are easy to achieve, and power is easy to measure (external supply fixed, just mirror the output device current). Efficiency could greatly suffer, however, especially if only one Vsup is available to generate both the n and p supplies.
20
Generating the Power Supplies: Hybrid Architecture. "Best of both worlds": high bandwidth, easy to integrate on-chip linear regulators, with external switching regulators adjusted to just meet the linear regulators' dropout (and minimize loss). To minimize external component count, could share external supplies across multiple blocks, at some cost in efficiency.
21
FLL Implementation (1). Charge-pump based design: pulse generators + charge pump = analog counter. nVss serves as the global reference (i.e. chip Vss or "0"); the control loops generate the other three rails.
22
FLL Implementation (2): Regulators. For simplicity, used on-chip linear regulators. Vsup_dd, Vsup_ss: external supplies w/headroom for the regulators, with Vsup_dd ≈ Vdd_max + |ΔVth|max + 150mV and Vsup_ss ≈ -150mV. Vc_sup sets pVdd; pVss is set by the power loop. Shifted ground on the nVdd regulator feedback makes nVdd = pVdd – pVss.
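The rail relations on this slide are simple arithmetic, and a small sketch can make them concrete. The function names and the example operating point (Vdd_max = 1 V, |ΔVth|max = 300 mV, pVdd = 1.3 V, pVss = 0.3 V) are illustrative assumptions, not values stated for the actual design; only the two formulas and the 150 mV headroom come from the slide.

```python
# Sketch of the supply relations from the regulator slide.
# All names and example numbers are hypothetical; voltages are in volts.

def external_supplies(vdd_max, dvth_max, headroom=0.150):
    """Return (Vsup_dd, Vsup_ss) with regulator headroom included.

    Vsup_dd ~= Vdd_max + |dVth|_max + headroom
    Vsup_ss ~= -headroom
    """
    vsup_dd = vdd_max + abs(dvth_max) + headroom
    vsup_ss = -headroom
    return vsup_dd, vsup_ss


def n_vdd(p_vdd, p_vss):
    """nVdd as produced by the shifted-ground feedback: pVdd - pVss."""
    return p_vdd - p_vss


if __name__ == "__main__":
    vsup_dd, vsup_ss = external_supplies(vdd_max=1.0, dvth_max=0.3)
    print(f"Vsup_dd = {vsup_dd:.3f} V, Vsup_ss = {vsup_ss:.3f} V")
    print(f"nVdd    = {n_vdd(p_vdd=1.3, p_vss=0.3):.3f} V")
```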
23
Power Minimization Algorithm. Optimization problem: minimize Pavg(Vdd, Vth) over {Vdd, Vth}, subject to f = ftarg. The FLL enforces the constraint and eliminates Vdd as a variable (Vdd is set by ΔVth and the operating frequency). Simplified minimization algorithm: Step 1: Increase ΔVth by 1 step; measure average power. Step 2: Decrease ΔVth by 1 step; measure average power. Step 3: Move in the direction of lower average power; repeat from Step 1. Works as long as the P vs. Vth curve has no locally flat regions (except at the global minimum). Hard to show analytically, but intuitively (and numerically) true.
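The three steps above can be sketched in software as a perturb-and-observe search. This is a minimal illustration, not the charge-pump implementation the talk describes: `measure_power` is a hypothetical callback standing in for "let the FLL settle at this ΔVth, then read average power", and `toy_power` is a made-up curve with a single minimum.

```python
# Perturb-and-observe sketch of the simplified minimization algorithm.
# measure_power(dvth) is an assumed callback: in the real system the FLL
# would first re-settle Vdd so that f = f_targ, then power would be read.

def minimize_power(measure_power, dvth_start, step, n_iters):
    dvth = dvth_start
    for _ in range(n_iters):
        p_up = measure_power(dvth + step)  # Step 1: probe one step up
        p_dn = measure_power(dvth - step)  # Step 2: probe one step down
        # Step 3: move toward the lower average power
        if p_up < p_dn:
            dvth += step
        elif p_dn < p_up:
            dvth -= step
    return dvth


def toy_power(dvth):
    """Hypothetical power curve (arbitrary units), minimum at 0.15 V."""
    return 1.0 + (dvth - 0.15) ** 2


if __name__ == "__main__":
    result = minimize_power(toy_power, dvth_start=0.0, step=0.01, n_iters=50)
    print(f"settled near dVth = {result:.2f} V")
```

Near the minimum the search dithers by about one step, which is why the step size bounds the achievable accuracy.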
24
Power Loop Implementation: Measuring Power. Mirror the regulator current to measure the block's current. The voltage is fixed, so just add the currents from pVdd and nVdd to find total power (current). More processing is needed if the external supply is not fixed: multiply Itot by Vsup_ext. If Vsup_ext is digitally controlled, the multiplication could be done in the current domain by programming the output mirroring ratio M.
25
Power Loop Implementation: Minimization Algorithm (I). Step 1: Pulse up Δ (+ΔVc), enable dn int (integrate –Imirr). Step 2: Pulse dn Δ (–ΔVc), enable up int (integrate +Imirr). Step 3 happens automatically, since Vc_th[k+1] > Vc_th[k] if Imirr(+ΔVc) < Imirr(–ΔVc), and Vc_th[k+1] < Vc_th[k] if Imirr(+ΔVc) > Imirr(–ΔVc).
26
Power Loop Implementation: Minimization Algorithm (II). To keep polarities correct, need IΔ·tΔ > Imirr·tint. May need small pump currents and/or large capacitors, especially if shooting for a small ΔVc.
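One possible first-order reading of the two power-loop slides, written out as equations. It assumes that the up and down pulses cancel, that both integration windows last the same time t_int, and that everything is accumulated on a single loop capacitor C; the symbol C and the equal-window assumption are not stated on the slides.

```latex
% Perturbation produced by one pulse of the \Delta charge pump
\Delta V_c = \frac{I_\Delta\, t_\Delta}{C}

% Net update after one up/down cycle (pulses cancel, integrations remain)
V_{c\_th}[k+1] = V_{c\_th}[k]
  + \frac{t_{int}}{C}\Bigl[\, I_{mirr}(-\Delta V_c) - I_{mirr}(+\Delta V_c) \Bigr]

% Polarity condition: the perturbation must exceed the drift caused by
% integrating the mirrored current during one window
I_\Delta\, t_\Delta > I_{mirr}\, t_{int}
  \;\Longleftrightarrow\;
  \Delta V_c > \frac{I_{mirr}\, t_{int}}{C}
```

Under this reading the control voltage rises when the block draws less current at +ΔVc than at –ΔVc and falls otherwise, which matches Step 3 of the algorithm. If the condition fails, the integration during Step 1 could pull the control voltage back past its starting value, so the "up" measurement would no longer be taken on the up side.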
27
Outline: Skewed Supply Logic Circuits; Adaptive Implementation; Application to Minimum Energy Systems; Conclusions
28
Minimum Energy Systems with Global Supply. Supply is set by the global activity vs. leakage energy ratio, but blocks may exhibit wide variances in their activities. Even a single block's activity may vary with time (e.g. static vs. dynamic MPEG frame).
29
Minimum Energy Systems With Adaptive Supplies. In subthreshold, minimum energy is independent of Vth: as Vth increases, both frequency and leakage decrease, and net energy stays the same. Can get minimum energy by adjusting each block's Vdd, but each block would then have to operate at its own frequency…
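The Vth independence can be seen from a standard first-order subthreshold model. The derivation below is a sketch under stated assumptions: DIBL is ignored, and the symbols I_0, n, φ_t (thermal voltage), C_g (gate load), K (delay fitting constant), L_d (logic depth), and C_eff (switched capacitance per cycle) are introduced here for illustration and do not appear in the slides.

```latex
% On-current and leakage current in subthreshold
I_{on}   = I_0\, e^{(V_{dd} - V_{th})/(n\phi_t)}, \qquad
I_{leak} = I_0\, e^{-V_{th}/(n\phi_t)}

% Cycle time for a path of logic depth L_d
T_{cyc} = L_d\, \frac{K\, C_g\, V_{dd}}{I_{on}}

% Leakage energy per cycle: the V_th terms cancel
E_{leak} = I_{leak}\, V_{dd}\, T_{cyc}
         = L_d\, K\, C_g\, V_{dd}^2\, e^{-V_{dd}/(n\phi_t)}

% Total energy per cycle depends on V_dd only
E(V_{dd}) = C_{eff}\, V_{dd}^2 + L_d\, K\, C_g\, V_{dd}^2\, e^{-V_{dd}/(n\phi_t)}
```

In this model Vth drops out of the energy but not of the cycle time, since 1/T_cyc scales with exp(-Vth/(nφ_t)). That is the sense in which raising Vth lowers frequency and leakage together while the energy per operation stays the same.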
30
Minimum Energy Systems with Adaptive Supplies and Thresholds. Controlling both Vdd and Vth allows blocks to achieve minimum energy at an arbitrary operating frequency. All blocks can then operate at the same (system determined) frequency. Much simpler system to design and interface with than one with only adaptive supplies…
31
Outline: Skewed Supply Logic Circuits; Adaptive Implementation; Application to Minimum Energy Systems; Conclusions
32
Conclusions. Skewed supplies are a promising approach to allow direct control/optimization of effective device thresholds. There are still lots of issues to work out, of course, and more research to be done. For low-power applications, combined adaptation of Vdd and Vth can achieve per-block minimum energy while maintaining global synchronicity. No need for software directives; the chip constantly adapts itself to keep energy dissipation as low as possible. This technique is attractive in high-performance applications as well: improvements in power efficiency translate into increased performance in a heat-dissipation limited environment.
33
Bonus Slides
34
Digital Control Implementation. Particularly in advanced technologies, it can be difficult to get charge pumps to behave as desired. Both the FLL and the power loop are well suited to digital control implementations. FLL: frequency detect is really easy (just count); the DAC just needs enough resolution to keep dither small. Power loop: power ADC uses the mirrored block current as the supply for a current-starved ring and counts its cycles; this loop really needs an (effectively) monotonic DAC, however.
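A counter-based digital FLL of the kind described on this slide might look like the sketch below. Everything numerical is a made-up toy model: the DAC transfer function, the linear frequency-vs-Vdd oscillator, the 1 µs counting window, and the bang-bang (one code per update) control law are assumptions chosen only to show the "just count" frequency detector and the resulting one-code dither.

```python
# Toy model of a digital FLL: count cycles, compare, step the DAC code.
# All constants and transfer functions here are hypothetical.

def dac_to_vdd(code):
    """Assumed DAC: 0.5 V offset, 2 mV per code."""
    return 0.5 + 0.002 * code


def toy_osc_freq(vdd):
    """Assumed replica oscillator: frequency linear in Vdd (Hz)."""
    return 1e9 * (vdd - 0.3)


def count_cycles(freq_hz, window_s):
    """Frequency detector: whole oscillator cycles in the reference window."""
    return int(freq_hz * window_s)


def fll_update(code, count, target_count, code_max=255):
    """Bang-bang update: raise Vdd if too slow, lower it if too fast."""
    if count < target_count:
        code += 1
    elif count > target_count:
        code -= 1
    return max(0, min(code_max, code))


def run_fll(target_hz, window_s=1e-6, n_updates=400, code=0):
    target_count = int(round(target_hz * window_s))
    for _ in range(n_updates):
        count = count_cycles(toy_osc_freq(dac_to_vdd(code)), window_s)
        code = fll_update(code, count, target_count)
    return code


if __name__ == "__main__":
    final_code = run_fll(target_hz=500e6)
    print(f"final DAC code {final_code}, Vdd = {dac_to_vdd(final_code):.3f} V")
```

With a finer DAC step the steady-state dither in Vdd shrinks, which is the resolution requirement the slide mentions.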