Baoxian Zhao Hakan Aydin Dakai Zhu Computer Science Department Computer Science Department George Mason University University of Texas at San Antonio DAC.


1 Generalized Reliability-Oriented Energy Management for Real-time Embedded Applications
Baoxian Zhao and Hakan Aydin, Computer Science Department, George Mason University
Dakai Zhu, Computer Science Department, University of Texas at San Antonio
DAC 2011
Sponsored by NSF CNS-1016855, CNS-1016974 and CAREER Awards CNS-0546244, CNS-0953005

2 Introduction and Motivation
Dynamic Voltage and Frequency Scaling (DVFS)
  - Adjusts CPU voltage and frequency on the fly to save energy
  - Increases task response times
Transient faults / soft errors
  - Increasingly common with technology scaling and reduced design margins
  - Reliability: probability of completing the task successfully
DVFS and transient faults
  - Execution at low frequency/voltage levels has a significant negative effect on system reliability
  - Due to the exponentially increased transient fault rates at low supply voltage and frequency levels [Zhu et al., ICCAD'04]
  - Due to the increased execution time of the task
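The two effects listed on this slide can be illustrated with a small sketch. It assumes the exponential fault-rate form commonly used in this line of work, lambda(f) = lambda_0 * 10^(d(1 - f)/(1 - f_min)), and a task reliability of exp(-lambda(f) * c / f). The values lambda_0 = 10^-6 and f_min = 0.1 come from the simulation slide; the exponent `d = 2`, the function names, and the sample task length are assumptions made for illustration only.

```python
import math

LAMBDA_0 = 1e-6   # fault rate at f_max (value from the simulation slide)
F_MIN = 0.1       # lowest normalized frequency (from the simulation slide)
D_EXP = 2.0       # assumed sensitivity exponent, not given in the slides


def fault_rate(f, lambda0=LAMBDA_0, d=D_EXP, f_min=F_MIN):
    """Assumed model: the fault rate grows exponentially as f is scaled down."""
    return lambda0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))


def task_reliability(wcet, f):
    """Probability that a task with worst-case time `wcet` (at f_max = 1)
    finishes without a transient fault when run at normalized frequency f."""
    exec_time = wcet / f                      # scaling stretches execution
    return math.exp(-fault_rate(f) * exec_time)


if __name__ == "__main__":
    for f in (1.0, 0.5, 0.1):
        print(f, fault_rate(f), task_reliability(10.0, f))
```

Both factors push in the same direction: at a lower frequency the rate per time unit is higher and the task is exposed to it for longer.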

3 Existing Solutions
Reliability-Aware Power Management (RA-PM) [Zhu and Aydin, ICCAD'06, RTAS'07, IEEE TC'09]
  - Use DVFS only for a subset of tasks; no DVFS for others
  - For every scaled task, schedule a separate recovery task
  - Preserve the original reliability of the task set
Shared Recovery Technique [Zhao et al., ICCAD'09]
  - Single recovery task shared by all tasks
This Work: Generalized Shared Recovery (GSHR) Technique
  - Targets any reliability level set by the designer
  - May be lower or much higher than the original reliability
  - Use multiple shared recovery tasks as appropriate

4 A Motivational Example

5 Generalized Shared Recovery (GSHR)
Energy-Optimal Reliability Configuration Problem
Determine optimal frequency assignments f_1, f_2, ..., f_n and the optimal number (k) of recoveries to:
  Minimize: Energy
  Subject to: Reliability Constraint
              Deadline Constraint
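One way to write this problem down is sketched below. The slide names only the objective and the two constraints, so every symbol here is an assumption: E_i(f_i) for the energy of task i at frequency f_i, R_sys for the system reliability achieved with k shared recoveries, R_target for the designer's reliability goal, c_i for the worst-case execution time at f_max, D for the deadline, and F for the set of discrete frequency levels. The term k * max_i c_i assumes each recovery runs at f_max and is sized for the longest task; the actual formulation may differ, for example by accounting for recovery energy.

```latex
\begin{aligned}
\min_{f_1,\dots,f_n,\;k} \quad & \sum_{i=1}^{n} E_i(f_i) \\
\text{subject to} \quad & R_{\mathrm{sys}}(f_1,\dots,f_n,k) \;\ge\; R_{\mathrm{target}}, \\
& \sum_{i=1}^{n} \frac{c_i}{f_i} \;+\; k \cdot \max_{i} c_i \;\le\; D, \\
& f_i \in F, \qquad k \in \{0, 1, 2, \dots\}.
\end{aligned}
```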

6 Our Solutions
Uniform Frequency (UF)
  - Assign a single common frequency to all the tasks to meet the deadline and reliability constraints
Incremental Reliability Configuration Search (IRCS)
  - Iteratively scale down tasks by one level at a time by comparing their "energy/reliability ratios" (ERRs)
  - ERR is a utility measure giving energy savings per unit reliability degradation
Compared against Exhaustive Search (OPT) and traditional RA-PM schemes
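The two heuristics can be sketched roughly as follows. This is an illustration in the spirit of the slide, not the authors' implementation. Everything not stated on the slides is an assumption: the fault-rate exponent `d`, the energy model c * f^2, recoveries that run at f_max and reserve a slot as long as the longest task, the evenly spaced `LEVELS`, the dynamic program in `system_rel`, and the `max_k` bound.

```python
import math

LEVELS = [0.1, 0.28, 0.46, 0.64, 0.82, 1.0]  # hypothetical six levels


def task_rel(c, f, lambda0=1e-6, d=2.0, f_min=0.1):
    """Assumed reliability model: exponential fault rate, longer exposure."""
    rate = lambda0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))
    return math.exp(-rate * c / f)


def system_rel(wcets, freqs, k):
    """P(all tasks complete correctly) with at most k shared recoveries."""
    # dist[j] = P(tasks so far succeeded, having used exactly j recoveries)
    dist = [1.0] + [0.0] * k
    for c, f in zip(wcets, freqs):
        r, r0 = task_rel(c, f), task_rel(c, 1.0)   # recoveries run at f_max
        new = [0.0] * (k + 1)
        for j, p in enumerate(dist):
            new[j] += p * r                        # primary copy succeeds
            for m in range(1, k - j + 1):          # m-th recovery succeeds
                new[j + m] += p * (1 - r) * (1 - r0) ** (m - 1) * r0
        dist = new
    return sum(dist)


def energy(wcets, freqs):
    """Assumed model: power ~ f^3 for c/f time units, so energy ~ c * f^2."""
    return sum(c * f ** 2 for c, f in zip(wcets, freqs))


def feasible(wcets, freqs, k, deadline, r_target):
    time = sum(c / f for c, f in zip(wcets, freqs)) + k * max(wcets)
    return time <= deadline and system_rel(wcets, freqs, k) >= r_target


def uniform_frequency(wcets, deadline, r_target, max_k=3):
    """UF: one common level for all tasks; try each recovery count k."""
    best = None
    for k in range(max_k + 1):
        for f in LEVELS:                           # ascending: lowest first
            freqs = [f] * len(wcets)
            if feasible(wcets, freqs, k, deadline, r_target):
                cand = (energy(wcets, freqs), freqs, k)
                if best is None or cand[0] < best[0]:
                    best = cand
                break
    return best


def ircs(wcets, deadline, r_target, max_k=3):
    """IRCS-style greedy: lower one task by one level per step, picking the
    feasible move with the largest energy saving per unit reliability loss."""
    best = None
    for k in range(max_k + 1):
        idx = [len(LEVELS) - 1] * len(wcets)       # start everything at f_max
        freqs = [LEVELS[i] for i in idx]
        if not feasible(wcets, freqs, k, deadline, r_target):
            continue
        while True:
            e0, r0 = energy(wcets, freqs), system_rel(wcets, freqs, k)
            choice, best_err = None, -1.0
            for t in range(len(wcets)):
                if idx[t] == 0:
                    continue
                trial = idx[:t] + [idx[t] - 1] + idx[t + 1:]
                tf = [LEVELS[i] for i in trial]
                if not feasible(wcets, tf, k, deadline, r_target):
                    continue
                loss = max(r0 - system_rel(wcets, tf, k), 1e-300)
                err = (e0 - energy(wcets, tf)) / loss
                if err > best_err:
                    choice, best_err = trial, err
            if choice is None:
                break
            idx, freqs = choice, [LEVELS[i] for i in choice]
        cand = (energy(wcets, freqs), freqs, k)
        if best is None or cand[0] < best[0]:
            best = cand
    return best
```

Both functions return a tuple `(energy, frequencies, k)`, or `None` if no configuration is feasible. For example, `ircs([10.0, 20.0, 30.0], 150.0, r_target)` with `r_target = system_rel([10.0, 20.0, 30.0], [1.0] * 3, 0)` searches for a configuration that is at least as reliable as running everything at f_max with no recovery.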

7 Simulation Results
The six discrete frequency levels are modeled after the Intel XScale processor.
Transient faults follow a Poisson distribution: λ_0 = 10^-6, f_max = 1 and f_min = 0.1

8 Conclusions
GSHR: A general framework for real-time embedded systems
  - Achieves arbitrary reliability levels with minimum energy consumption
  - Recovery tasks shared by all tasks as needed
Ultimate aim: optimal co-management of energy and reliability
Please see our poster for additional details!

