1
CS 7810 Lecture 13
Pipeline Gating: Speculation Control For Energy Reduction
S. Manne, A. Klauser, D. Grunwald
Proceedings of ISCA-25, June 1998
2
Cost of Speculation
Mispredict rates (figure labels): 9.9, 12.2, 23.9, 10.4, 6.9, 4.6, 11.3, 1.7
3
Pipeline Gating
Low-confidence branches throttle instruction fetch until they are resolved.
Pipeline gating usually lasts for fewer than five cycles.
4
Metrics
SPEC (specificity): fraction of all mispredicted branches that the confidence estimator flags as low-confidence (coverage).
PVN (predictive value of a negative test): probability that a low-confidence branch is in fact mispredicted (accuracy).
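As a rough illustration of the two metrics, the sketch below computes SPEC and PVN from four branch counts. The function and argument names are hypothetical, and the counts in the usage line are made up for illustration; they are not results from the paper.

```python
def spec_and_pvn(low_conf_mispred, low_conf_correct,
                 high_conf_mispred, high_conf_correct):
    """Return (SPEC, PVN) following the slide's definitions.

    high_conf_correct is accepted only to complete the 2x2 table;
    neither metric uses it.
    """
    all_mispredicted = low_conf_mispred + high_conf_mispred
    all_low_confidence = low_conf_mispred + low_conf_correct
    # SPEC: share of mispredicted branches that were flagged low-confidence.
    spec = low_conf_mispred / all_mispredicted
    # PVN: share of low-confidence branches that were actually mispredicted.
    pvn = low_conf_mispred / all_low_confidence
    return spec, pvn


# Illustrative counts only (assumed, not measured).
spec, pvn = spec_and_pvn(low_conf_mispred=60, low_conf_correct=140,
                         high_conf_mispred=40, high_conf_correct=760)
print(f"SPEC = {spec:.2f}, PVN = {pvn:.2f}")
```

With these made-up counts the estimator catches most mispredictions while most of its low-confidence flags are still false alarms, which is the tension the two metrics are meant to expose.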
5
Confidence Estimators
Perfect: used to gauge potential benefits.
Static: branches that have low prediction rates are marked low-confidence.
JRS: if a branch has yielded N successive correct predictions, it has high confidence.
Saturating counters: an unbiased counter value, or disagreement between two predictors, indicates low confidence.
Distance: mispredictions are clustered, hence the first 4 branches after a mispredict have low confidence.
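The JRS rule above can be sketched as a resetting counter per branch. This is a simplified model under stated assumptions: it keys a Python dict by branch PC rather than indexing a fixed-size hardware table, and the class name, `threshold`, and `max_count` values are hypothetical.

```python
class ResettingCounterEstimator:
    """Toy JRS-style estimator: N successive correct predictions => high confidence."""

    def __init__(self, threshold=4, max_count=15):
        self.threshold = threshold    # N from the slide (value assumed)
        self.max_count = max_count    # saturation limit (assumed)
        self.counters = {}            # branch PC -> run of correct predictions

    def is_high_confidence(self, pc):
        return self.counters.get(pc, 0) >= self.threshold

    def update(self, pc, prediction_correct):
        if prediction_correct:
            # Count one more correct prediction, saturating at max_count.
            self.counters[pc] = min(self.counters.get(pc, 0) + 1, self.max_count)
        else:
            # A misprediction resets the run, so the branch is low-confidence again.
            self.counters[pc] = 0


estimator = ResettingCounterEstimator(threshold=3)
for _ in range(3):
    estimator.update(pc=0x400, prediction_correct=True)
print(estimator.is_high_confidence(0x400))
```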
6
SPEC and PVN
It is easier to achieve a high SPEC value than a high PVN value.
A high PVN value can be achieved by requiring N low-confidence branches to invoke gating: if PVN is 30%, redefining low confidence as two low-confidence branches increases PVN to 51% (1 - 0.7^2 = 0.51, if the two branches mispredict independently).
SPEC (coverage): mispredicted branches detected by the low-confidence estimator.
PVN (accuracy): % of low-confidence branches that are branch mispredictions.
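The 30% to 51% figure is consistent with treating each low-confidence branch as an independent event: the chance that at least one of N such branches is mispredicted is 1 - (1 - PVN)^N. The sketch below assumes that independence; the function name is hypothetical.

```python
def pvn_with_n_branches(pvn_single, n):
    """Probability that at least one of n low-confidence branches is mispredicted.

    Assumes each branch is mispredicted independently with probability pvn_single.
    """
    return 1.0 - (1.0 - pvn_single) ** n


for n in (1, 2, 3):
    print(n, round(pvn_with_n_branches(0.30, n), 3))
```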
7
Perfect
8
Gating Results
9
Results
Can gating improve performance? Only if cache pollution is significant.
Less than 1% performance loss and up to 38% reduction in extra work.
Energy consumption could go up: some work is independent of the number of executed instructions (clock distribution), so increased execution time can increase energy.
Pipeline gating should reduce power consumption.
10
Results
11
CS 7810 Lecture 13
Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power
S. Kaxiras, Z. Hu, M. Martonosi
Proceedings of ISCA-28, July 2001
12
Leakage Power Trends
Circuit delay ∝ 1/(V - Vth)
Leakage depends on: number of transistors (increasing), supply voltage (decreasing), and, exponentially, low threshold voltage (increasing).
L1 and L2 caches are the biggest contributors (high transistor budgets).
13
Vdd-Gating
Leakage can be reduced by gating off the supply voltage to the circuit.
When applied to a cache, the contents of the SRAM cell are lost.
Cache decay: apply Vdd-gating when you do not care about the cache contents.
14
Lifetime of a Cache Line
15
Overheads
Hardware to determine when to decay.
Introduces additional cache misses.
Normalized cache leakage power = Activeratio (fraction of cache that is powered on) + (Counter overhead : Leak) x activity + (L2 access energy : Leak) x num-misses
Increased execution time (< 0.7%).
L2 access/leakage ratio is ~9.
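The normalized leakage expression above can be evaluated directly. The sketch below is only a transcription of the three terms; the parameter names are hypothetical, the inputs are assumed to be normalized consistently with one another, and all example values except the L2 ratio of about 9 are invented for illustration.

```python
def normalized_leakage_power(active_ratio, counter_to_leak_ratio, activity,
                             l2_to_leak_ratio, extra_misses):
    """Sum the three terms of the slide's normalized cache leakage expression."""
    powered_on_term = active_ratio                   # fraction of cache powered on
    counter_term = counter_to_leak_ratio * activity  # cost of the decay counters
    miss_term = l2_to_leak_ratio * extra_misses      # cost of decay-induced misses
    return powered_on_term + counter_term + miss_term


# Illustrative inputs only; just the ratio of ~9 comes from the slide.
value = normalized_leakage_power(active_ratio=0.5,
                                 counter_to_leak_ratio=0.1, activity=0.2,
                                 l2_to_leak_ratio=9.0, extra_misses=0.01)
print(round(value, 3))
```

A result below 1.0 means decay saves leakage energy overall even after paying for the counters and the extra L2 accesses.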
16
Skier's Dilemma
New skis: $400. Ski rentals: $20 per trip.
Heuristic: buy skis once the rental cost so far equals the purchase price.
Ski trips:   5     10    15    20    25    50
Optimal:     $100  $200  $300  $400  $400  $400
Heuristic:   $100  $200  $300  $800  $800  $800
Likewise, decay a cache line when the cost of an additional miss equals the leakage dissipated so far.
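The table can be reproduced with a few lines of code. This sketch assumes the skier buys immediately after the rental that brings total rental spending up to the purchase price, which is what the $800 entry at 20 trips implies; the function and constant names are hypothetical.

```python
SKI_PRICE = 400     # purchase price from the slide
RENTAL_PRICE = 20   # cost of one rental from the slide


def optimal_cost(trips):
    """Cost with perfect knowledge of the number of trips: rent or buy, whichever is cheaper."""
    return min(trips * RENTAL_PRICE, SKI_PRICE)


def heuristic_cost(trips):
    """Rent until rental spending equals the purchase price, then buy."""
    break_even_trips = SKI_PRICE // RENTAL_PRICE
    if trips < break_even_trips:
        return trips * RENTAL_PRICE
    return break_even_trips * RENTAL_PRICE + SKI_PRICE


for trips in (5, 10, 15, 20, 25, 50):
    print(trips, optimal_cost(trips), heuristic_cost(trips))
```

In this model the heuristic never pays more than twice the optimal cost, which is the property that makes it attractive when the number of future trips (or future accesses to a cache line) is unknown.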
17
Tracking Dead Time
Each line has a 2-bit counter that is reset on every access and incremented every 2500 cycles by a global signal (negligible overhead).
After 10,000 clock cycles without an access, the counter reaches its max value and triggers a decay.
Adaptive decay: start with a short decay period; if a miss follows quickly after a decay, double the period; if there is no miss, halve the period.
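A minimal software model of the per-line counter and the adaptive rule might look like the following. It is a sketch, not the paper's circuit: the class and function names are hypothetical, and it assumes the line decays on the fourth global tick after its last access (4 x 2500 = 10,000 cycles), which is one possible reading of the slide.

```python
TICK_INTERVAL_CYCLES = 2500   # global signal period from the slide
COUNTER_MAX = 3               # largest value a 2-bit counter can hold


class DecayLine:
    """One cache line with a 2-bit decay counter (simplified model)."""

    def __init__(self):
        self.counter = 0
        self.powered = True

    def access(self):
        # Any access resets the counter; a decayed line is powered back on
        # (in hardware this access would be a miss that refetches the data).
        self.counter = 0
        self.powered = True

    def global_tick(self):
        # Called once every TICK_INTERVAL_CYCLES for every line.
        if not self.powered:
            return
        if self.counter == COUNTER_MAX:
            self.powered = False   # decay: gate off the line's supply
        else:
            self.counter += 1


def adapt_decay_period(period_cycles, quick_miss_after_decay):
    """Adaptive rule from the slide: double after a quick miss, otherwise halve."""
    if quick_miss_after_decay:
        return period_cycles * 2
    return max(1, period_cycles // 2)


line = DecayLine()
for _ in range(4):
    line.global_tick()
print(line.powered)
```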
18
Results
19
Overheads
20
Other Results
The L2 cache is equally suitable for decay techniques: lifetimes are scaled by a factor of 10, but an extra miss also costs a lot more.
For their experiments, there is little interference from multiprogramming.
Some instructions can easily be identified as last touches to a cache block, giving potential for early cache decay.
Can this apply to the branch predictor or the register file?