
1 An Efficient Low-Power Instruction Scheduling Algorithm for Embedded Systems

2 Contents
Introduction
Bus Power Model
Related Works
Motivation
Figure-of-Merit
Algorithm Overview
Random Scheduling
Schedule Selection
Experimental Result
Conclusion

3 Introduction (1/3) The nomadic lifestyle is spreading rapidly these days thanks to the rapid progress of microelectronics technologies. Electronic equipment has become not only smaller but also smarter. Low-power electronics will play a key role in this nomadic age. The figure-of-merit for the nomadic age = (intelligence) / (size × cost × power).
Figure: electronic equipment evolving toward smaller size, the nomadic tool

4 Introduction (2/3) ASIPs combine high programmability with an application-specific hardware structure. Because ASIPs offer high configurability and productivity, they have a time-to-market advantage. A retargetable compiler is an essential tool for application analysis and code generation in ASIP design. By equipping a retargetable compiler with an efficient scheduling algorithm, low-power code can be generated.
Figure: Compiler-in-Loop Architecture Exploration

5 Introduction (3/3) The power consumption of the ASIP instruction memory was found to be 30% or more of the entire processor power consumption. Minimizing power consumption on the instruction bus is therefore critical in low-power ASIP design.
Figure: Power Distribution for ICORE

6 Bus Power Model (1/2) Bit transitions on bus lines are one of the major contributors to power consumption. The traditional power model uses only self-capacitance. With the advance of nanometer technologies, coupling capacitance has become significant; as a result, handling crosstalk on buses has become an important issue.
Figures: self-capacitance model; self- and coupling-capacitance model

7 Bus Power Model (2/2)
Figures: crosstalk types; bus power model equation
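The equation itself appears only as a figure on the slide; a commonly used coupling-aware bus energy model of the kind referenced here (the notation below is an assumption, not taken from the slide) is

    E_{bus} = \frac{1}{2} V_{dd}^{2} \Bigl( C_s \sum_i N_s(i) + C_c \sum_i N_c(i, i+1) \Bigr)

where N_s(i) counts self (line-to-ground) transitions on bus line i, N_c(i, i+1) counts coupling transitions between adjacent lines weighted by crosstalk type, C_s is the self capacitance, and C_c is the coupling capacitance.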

8 Instruction Recoding Instruction recoding analyzes the execution pattern of the application program and reassigns the instruction binary codes. Histogram graphs are used to analyze the application's execution pattern. Chattopadhyay et al. obtained an initial solution using MWP and then applied simulated annealing starting from that solution.
Figure: Histogram Graph
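A histogram graph of this kind can be built by counting how often each ordered pair of opcodes occurs consecutively in an execution trace. A minimal sketch, assuming the trace is a list of opcode names (the function name and trace format are illustrative, not from the slides):

    from collections import Counter

    def build_histogram_graph(trace):
        """Count consecutive opcode pairs in an instruction trace.

        trace: opcodes in execution order.
        Returns a Counter mapping (prev_opcode, next_opcode) -> frequency;
        these counts are the edge weights of the histogram graph.
        """
        edges = Counter()
        for prev_op, next_op in zip(trace, trace[1:]):
            edges[(prev_op, next_op)] += 1
        return edges

    # Example trace over three opcodes
    print(build_histogram_graph(["add", "mul", "add", "add", "sub", "add"]))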

9 Cold Schedule C. Su et al. first proposed cold scheduling. Open questions about that approach: how should control dependency be reflected in the SCG? Why use MST and simulated annealing as a post-process, when TSP is a better choice?

10 Cold Schedule K. Choi et al. formulated cold scheduling as a TSP problem, which is a reasonable approach. C. Lee et al. extended cold scheduling to VLIW architectures.
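In the TSP formulation each instruction is a node and the edge cost between two instructions is the Hamming distance between their binary encodings, so a low-cost tour yields few bit transitions on the instruction bus. A minimal sketch of the cost function (the encodings and instruction ids below are made up for illustration):

    def hamming_distance(a: int, b: int) -> int:
        """Number of differing bits between two instruction encodings."""
        return bin(a ^ b).count("1")

    def switching_cost(order, encoding):
        """Total bit transitions when instructions execute in the given order.

        order: list of instruction ids; encoding: dict id -> binary encoding.
        This is the tour cost that the TSP formulation of cold scheduling minimizes.
        """
        return sum(hamming_distance(encoding[u], encoding[v])
                   for u, v in zip(order, order[1:]))

    encoding = {0: 0b10110011, 1: 0b10010011, 2: 0b01101100}
    print(switching_cost([0, 1, 2], encoding))  # 1 + 8 = 9 bit transitions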

11 Motivation (1/2) Comparison between recoding and cold scheduling:

                             Recoding                      Cold-Scheduling
    Input                    Instruction sequence          Instruction binary format
    Output                   Recoded instruction binary    Instruction order
    Optimization scope       Global                        Local
    Considered inst. field   Partial field                 All fields

12 Motivation (2/2)
Figure: (a) different scheduling results; (b) constructed histogram graphs; (c) optimal recoding results

13 Figure-of-Merit Maximizing the variance of the transition edge weights increases the efficiency of recoding. The larger the sum of the self-loop edge weights, the greater the power-saving effect of a code sequence.
Figure: figure-of-merit formula
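The figure-of-merit formula itself appears only as an image on the slide; one plausible form combining the two quantities just described (the weights \alpha, \beta and the notation are assumptions, not taken from the slide) is

    \mathrm{FM} = \alpha \, \mathrm{Var}(\{ w_{ij} : i \neq j \}) + \beta \sum_i w_{ii}

where w_{ij} is the weight of the histogram-graph edge from instruction i to instruction j and w_{ii} is a self-loop weight.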

14 Algorithm Overview The presented FM is a global function, and global instruction scheduling is difficult to implement. We therefore solve the optimization problem using random schedule gathering followed by schedule selection.
Figure: Schedule Selection

15 Random Scheduling
Considerations:
- Runtime performance
- BB size and iteration count
- Differences (similarity) between random schedules

Pseudocode (list scheduling gives a latency upper bound; a random schedule is kept only if it meets the bound and differs enough from the schedules already collected):

    Make_Schedules_for_BBs (BB_SET[ ])
    begin
      for each BB in BB_SET[ ] do
        list_schedule_solution = LIST_SCHEDULE (BB);
        latency_UB = LATENCY (list_schedule_solution);
        Insert list_schedule_solution into Schedules_for_BBs[BB];
        for i = 0 until ITERATION_COUNT (BB) do
          new_schedule = RANDOM_SCHEDULE (BB);
          acceptable = False;
          if (LATENCY (new_schedule) <= latency_UB) then
            acceptable = True;
            for each schedule solution s in Schedules_for_BBs[BB] do
              if (LATENCY (s) == LATENCY (new_schedule)) then
                similarity_measure = COMPARE (s, new_schedule);
                if (similarity_measure > Threshold * LATENCY (new_schedule)) then
                  acceptable = False;
                  break;
                end
              end
            end
          end
          if (acceptable) then
            Insert new_schedule into Schedules_for_BBs[BB];
          end
        end
      end
      return Schedules_for_BBs[ ];
    end
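A compact Python rendering of the same procedure, with the scheduling and metric routines passed in as parameters since the slides do not define them (a sketch, not the authors' implementation):

    def make_schedules_for_bbs(bbs, iteration_count, threshold,
                               list_schedule, random_schedule, latency, compare):
        """Gather candidate schedules per basic block.

        A random schedule is kept only if it meets the list-scheduling latency
        upper bound and is sufficiently different from every kept schedule of
        the same latency (compare returns a similarity measure).
        """
        schedules = {}
        for bb in bbs:
            baseline = list_schedule(bb)
            latency_ub = latency(baseline)
            schedules[bb] = [baseline]
            for _ in range(iteration_count(bb)):
                candidate = random_schedule(bb)
                if latency(candidate) > latency_ub:
                    continue
                acceptable = True
                for kept in schedules[bb]:
                    if (latency(kept) == latency(candidate)
                            and compare(kept, candidate) > threshold * latency(candidate)):
                        acceptable = False   # too similar to an existing schedule
                        break
                if acceptable:
                    schedules[bb].append(candidate)
        return schedules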

16 Schedule Selection (1/3) Problem formulation: the global histogram graph can be decomposed into local histograms, so we can consider a divide-and-conquer algorithm.
Figure: merge of histogram graphs

17 Schedule Selection (2/3) NP-hardness: to maximize the global variance, we must consider not only the sum of the local variances but also the covariances of all pairs of local histograms.
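This follows from the standard decomposition of the variance of a sum (generic notation, not necessarily the authors'): for local contributions X_1, ..., X_n,

    \mathrm{Var}\Bigl(\sum_{k=1}^{n} X_k\Bigr) = \sum_{k=1}^{n} \mathrm{Var}(X_k) + 2 \sum_{j<k} \mathrm{Cov}(X_j, X_k)

so maximizing each local variance in isolation does not maximize the global variance; the pairwise covariance terms must be taken into account as well.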

18 Schedule Selection (3/3) We used a dynamic-programming method to achieve local cost maximization via a bottom-up approach. For further optimization, we used simulated annealing.
Figure: greedy selection algorithm
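The greedy selection step is shown only as a figure; a minimal sketch of one bottom-up pass of this kind (the histogram and figure-of-merit interfaces are assumptions) picks, for each basic block in turn, the candidate whose local histogram most improves the merged global histogram, after which simulated annealing can perturb the choices:

    from collections import Counter

    def greedy_select(schedules, histogram_of, figure_of_merit):
        """Greedy bottom-up selection of one schedule per basic block.

        schedules: dict mapping BB -> list of candidate schedules.
        histogram_of: function mapping a schedule to its local histogram (Counter).
        figure_of_merit: function scoring a merged global histogram.
        Returns the chosen schedule per BB; a simulated-annealing pass can then
        refine these choices to escape local maxima.
        """
        merged = Counter()
        chosen = {}
        for bb, candidates in schedules.items():
            best, best_score = None, float("-inf")
            for cand in candidates:
                trial = merged + histogram_of(cand)   # merge the local histogram
                score = figure_of_merit(trial)
                if score > best_score:
                    best, best_score = cand, score
            chosen[bb] = best
            merged += histogram_of(best)
        return chosen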

19 Experimental Result We used PCC as our measure of performance.
Figure: Comparison of PCC Values

20 Conclusion We presented a new instruction scheduling algorithm for low-power code synthesis. It is an exhaustive method for generating low-power code in an application-specific domain, but advances in computing power make our method reasonable in practice.

