Published by Primrose Glenn. Modified over 9 years ago.
1
Power and Frequency Analysis for Data and Control Independence in Embedded Processors
Farzad Samie, Sharif University of Technology
Amirali Baniasadi, University of Victoria
2
This Work
Goal: Power and frequency analysis for control-independent and data-independent instructions in embedded processors
Motivation:
  Embedded processors are becoming complex
  Modern embedded processors use speculation
  Mis-speculation incurs performance and power penalties
  Power is a major concern in embedded processors
  Save power and gain performance
3
This Work (cont.)
Our Approach: Reduce the energy and time wasted on mispredictions.
How? Identify and bypass Control Independent (CI) and Data Independent (DI) instructions.
  CIs: instructions that execute regardless of the branch outcome.
  CI-DIs: CI instructions that execute with the same operands.
Key Result: 12% processor energy reduction.
4
Background: Branch Prediction
[Diagram: Branch Predictor. Inputs: Branch History, Program Counter. Outputs: Predicted direction, Predicted target address.]
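The slide names the predictor's inputs (branch history, program counter) and outputs (predicted direction, predicted target address) but not the predictor type. As a minimal sketch, assuming a gshare-style table of 2-bit saturating counters plus a simple target table (both are illustrative choices, not taken from the slides, and the class and parameter names are hypothetical):

```python
class BranchPredictor:
    """Hypothetical gshare-style predictor; not the one used in the study."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # 2-bit saturating counters, initialised to "weakly not taken" (1).
        self.counters = [1] * (1 << index_bits)
        self.history = 0      # global branch history register
        self.targets = {}     # last seen taken-target per branch PC

    def _index(self, pc):
        # Combine program counter and branch history into a table index.
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        """Return (predicted direction, predicted target address or None)."""
        taken = self.counters[self._index(pc)] >= 2
        return taken, (self.targets.get(pc) if taken else None)

    def update(self, pc, taken, target):
        """Train on the resolved branch outcome."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
            self.targets[pc] = target
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


bp = BranchPredictor()
first_guess = bp.predict(0x40)
for _ in range(20):           # train on an always-taken branch
    bp.update(0x40, True, 0x80)
trained_guess = bp.predict(0x40)
```

A prediction that disagrees with the resolved outcome is a misprediction; the instructions fetched in the meantime form the wrong path discussed on the following slides.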
5
Background (cont.)
[Diagram: instructions I1 to I4 with a branch instruction. Not-taken path: I5, I6, followed by I7, I8, I9; this is the wrong path, squashed at misprediction detection. Taken path (right path): I10, I11, I12, followed by I7, I8, I9. I7, I8, I9 are the Control Independent Instructions (CIs).]
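The idea can be sketched in a few lines, assuming the diagram is read as a mispredicted not-taken path (I5, I6, then I7 to I9) and a correct taken path (I10 to I12, then I7 to I9). The function name and the list-of-labels representation are hypothetical, and the sketch assumes no further branches after the paths rejoin:

```python
def control_independent(wrong_path_fetched, right_path):
    """Return the wrong-path instructions that also lie on the right path.

    Both arguments are instruction labels in fetch order. The first
    wrong-path instruction that also appears on the right path is treated
    as the reconvergence point; everything fetched from there on is
    control independent (CI).
    """
    right_set = set(right_path)
    for idx, inst in enumerate(wrong_path_fetched):
        if inst in right_set:
            return wrong_path_fetched[idx:]
    return []  # paths never rejoin within the fetched window


wrong = ["I5", "I6", "I7", "I8", "I9"]
right = ["I10", "I11", "I12", "I7", "I8", "I9"]
ci = control_independent(wrong, right)
```

Here `ci` holds I7, I8 and I9: they were already fetched on the wrong path, so squashing them discards work that the right path will repeat.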
6
Background (cont.)
[Diagram: example code around the branch "If (R4 = 0)" with its not-taken and taken paths; the control-independent instructions are labelled Data Independent (CI-DI) or Data Dependent (CI-DD).]
Instructions shown: R1 ← R1 + R2; R4 ← R1; R2 ← R4 - R1; R5 ← R2 - R3; R3 ← 0; R5 ← R4 + 1; R1 ← R1 - 1; R3 ← 0; R4 ← R6 + R4; R1 ← R4 + R1; R5 ← R5 - 2; R3 ← R3 - R4
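One way to picture the CI-DI / CI-DD split is as a register-level check: a CI instruction is data independent if none of its source registers may hold a different value depending on which path was taken. The sketch below is a conservative approximation under that assumption (any register written on either path counts as path-dependent, even if both paths write the same value). The `Inst` type, the function name, and the example paths are hypothetical and are not a transcription of the slide's diagram:

```python
from typing import NamedTuple, Tuple


class Inst(NamedTuple):
    dest: str               # destination register
    srcs: Tuple[str, ...]   # source registers


def classify_ci(wrong_path, right_path, ci_insts):
    """Label each CI instruction as 'CI-DI' or 'CI-DD'."""
    # Registers written on either path may differ between the two outcomes.
    tainted = {i.dest for i in wrong_path} | {i.dest for i in right_path}
    labels = []
    for inst in ci_insts:
        if tainted.intersection(inst.srcs):
            labels.append("CI-DD")
            tainted.add(inst.dest)       # result now depends on the path
        else:
            labels.append("CI-DI")
            tainted.discard(inst.dest)   # same result on both paths
    return labels


# Hypothetical paths: the wrong path writes R2 and R5, the right path R5 and R1.
wrong = [Inst("R2", ("R4", "R1")), Inst("R5", ("R2", "R3"))]
right = [Inst("R5", ("R4",)), Inst("R1", ("R1",))]
ci = [
    Inst("R4", ("R6", "R4")),
    Inst("R1", ("R4", "R1")),
    Inst("R5", ("R5",)),
    Inst("R3", ("R3", "R4")),
]
labels = classify_ci(wrong, right, ci)
```

In this made-up example the first and last CI instructions read only registers untouched by either path, so they are CI-DI; the middle two read R1 and R5 and are CI-DD.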
7
CI-DI vs. CI-DD
Bypassing CI-DIs saves more energy
  No need to read operands or execute again
Bypassing CI-DIs provides higher performance
  No time wasted on reading operands or executing
[Diagram: pipeline stages Fetch, Issue, Dispatch, Execute, Write Back, marked for CI-DD and CI-DI.]
8
Methodology
Modified SimpleScalar
Wattch for power measurement
MiBench: embedded benchmark suite
9
Distribution
Wrong Path: 12%, CI: 5%, CI-DI: 2%
10
CI Power Reduction in Different Units
Max: branch predictor unit, Min: instruction cache
11
CI Power Reduction in Stages
Rijndael: low misprediction rate, few wrong-path instructions, few CIs
12
Power Sensitivity to RUU Size
[Charts: CI, CI-DI]
Higher power dissipation for bigger RUU sizes
13
Power Sensitivity to Execution Bandwidth
[Charts: CI, CI-DI]
Higher power dissipation for wider execution bandwidth
14
Power Sensitivity to Branch Predictor Size
Little sensitivity to branch predictor size
15
Related Work
Rotenberg et al.: studied control independence in superscalar processors, HPCA99.
Collins et al.: suggested a mechanism to predict the re-convergent point, Micro04.
Lam and Wilson: studied the impact of CIs on instruction-level parallelism, ISCA92.
Gandhi et al.: recovery from selected branch mispredictions, HPCA04.
16
Conclusion
CIs categorized into CI-DI and CI-DD
Potential power saving of up to 12% from bypassing CI and CI-DI instructions
High sensitivity to RUU size
High sensitivity to execution bandwidth
Little sensitivity to branch predictor size
17
Questions?
Thank you