Optimality and Improvement of Dynamic Voltage Scaling Algorithms for Multimedia Applications Authors: Zhen Cao, Brian Foo, Lei He and Mihaela van der Schaar.


Optimality and Improvement of Dynamic Voltage Scaling Algorithms for Multimedia Applications
Authors: Zhen Cao, Brian Foo, Lei He and Mihaela van der Schaar
Presented by: Amarnath Kasibhatla & ViswaKiran Popuri, Electrical Engineering Dept., UCLA

Outline
- Background
- Problem formulation
- Optimal Offline Solution
- Effective Online Algorithm
- Simulations and Results
- Conclusions

Overview of the Background Review
- Why is DVS required, and how does it help?
- Brief review of DVS algorithms

Changes in Communication & Computation
- From wall-plugged towards portable systems
  - Battery capacity did not scale as much: LIMITED ENERGY
  - And that limited energy is expected to last long.
- From general-purpose towards intelligent systems
  - Computationally more expensive (operating frequency is ever increasing): requires MORE ENERGY
- From continuous-mode towards burst-mode
  - Multimedia (PEAK requirement >> AVERAGE throughput)
  - E.g., video transmission over mobile phones
- Need an energy-efficient processor that supports a high peak operating frequency while conserving energy in other modes.

CMOS & DV(F)S
- Energy: $E = C_{sw} V_{dd}^2 f_{clk} T_{exec}$ (quadratic in $V_{dd}$)
- Delay: $\text{Delay} \propto V_{dd} / (V_{dd} - V_{th})^{\alpha}$ (frequency scales roughly linearly with $V_{dd}$)
- Dynamic voltage scaling: reduce $V_{dd}$ to the point where the delay requirements are just met.
- Example, with $f_{clk} = n / \Delta T$:
  - Case 1: $T_{exec} = \Delta T / 2$ gives $E_1 = \frac{1}{2} C V_{dd}^2 n$
  - Case 2: $T_{exec} = \Delta T$ at $V_{dd}/2$ and $f_{clk}/2$ gives $E_2 = \frac{1}{8} C V_{dd}^2 n = \frac{1}{4} E_1$
  - 75% reduction in energy.
Taken from [1]
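The arithmetic of the two cases on this slide can be checked with a short script. This is only a sketch: the capacitance, voltage and timing values below are made-up placeholders, and only the ratio E2/E1 is meaningful.

```python
def switching_energy(c_sw, v_dd, f_clk, t_exec):
    """Dynamic switching energy E = C_sw * V_dd^2 * f_clk * T_exec."""
    return c_sw * v_dd ** 2 * f_clk * t_exec

# Illustrative (assumed) values; they cancel out in the ratio.
C_SW = 1e-9        # switched capacitance, farads
V_DD = 1.0         # nominal supply voltage, volts
N = 1e6            # n in f_clk = n / delta_t
DELTA_T = 1e-3     # available time, seconds
F_CLK = N / DELTA_T

# Case 1: full voltage and frequency, busy for half the available time.
e1 = switching_energy(C_SW, V_DD, F_CLK, DELTA_T / 2)
# Case 2: half voltage and half frequency, busy for the whole available time.
e2 = switching_energy(C_SW, V_DD / 2, F_CLK / 2, DELTA_T)

saving = 1 - e2 / e1
print(f"E2/E1 = {e2 / e1:.2f}, saving = {saving:.0%}")
```

Both cases execute the same number of cycles (f_clk * T_exec), so the saving comes entirely from the quadratic dependence on the supply voltage.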

Need and Challenge of Power Management
- Multimedia applications are energy sensitive
  - Computationally demanding
  - Stringent deadlines for multiple tasks
  - Processed on energy-limited devices
- DVS (dynamic voltage scaling) algorithms reduce energy while meeting deadlines
- DVS algorithms need to deal with uncertainty in
  - Job complexity (time-varying workloads)
  - Communication delay (e.g., wireless channel)

DVS Hardware/Algorithm requirements
- Hardware: programmable DC-DC switching voltage regulator, programmable clock generator, and a processor with multiple operating points.
- Algorithm: accurate prediction of the workload requirement for a given computation task.
  - An error in prediction causes either a deadline miss or low energy efficiency.
Taken from [1]

Overview of the background review
- Why is DVS required, and how does it help?
- Types of DVS algorithms

DVS algorithms
- Single-task deadline based
- Multiple-task deadline based
- Feedback control based
- Stochastic model based

Single-task deadline based
- Uses Worst-Case Execution Time (WCET) or Average-Case Execution Time (ACET).
- Frame-based DVS [1]
  - Each frame is handled individually for accurate prediction of decoding time.
- Cross-layer adaptation [2]
  - Adapts the hardware layer (Vdd scaling) for small changes in processing overload (fine granularity).
- (+) Low computational complexity
- (-) Tasks with imminent deadlines need a large amount of energy to finish in time.
Taken from [2]

Multiple-task deadline based
- Real-Time DVS (RT-DVS) [3]
  - Normal DVS
    - Based on average throughput (e.g., ACET).
    - Simple feedback mechanism: detect the idle time and adjust the frequency.
    - This might cause deadline misses.
  - DVS must be tightly coupled with the real-time scheduler of the OS.
    - Rate-Monotonic (RM) scheduler: static priorities; task with the shortest period first.
    - Earliest-Deadline-First (EDF) scheduler: dynamic priorities; task with the earliest deadline first.

Multiple-task deadline based
- Look-Ahead RT-DVS: defer as much work as possible.
- Set the operating frequency to cover only the minimum work that must be done now to meet the deadlines.
Taken from [3]
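As a simpler baseline than the look-ahead variant, coupling DVS to an EDF scheduler can be sketched with the static utilization test usually associated with RT-DVS: run at the lowest frequency f for which the task set stays schedulable, i.e. sum(C_i / P_i) <= f / f_max. The task set and the normalized frequency list below are hypothetical, and this sketch is not the look-ahead algorithm itself.

```python
def lowest_feasible_frequency(tasks, frequencies):
    """Pick the lowest frequency that keeps an EDF task set schedulable.

    tasks: list of (wcet_at_max_frequency, period) pairs in the same time unit.
    frequencies: available clock frequencies (any consistent unit).
    Returns the chosen frequency, or None if even the maximum is insufficient.
    """
    f_max = max(frequencies)
    utilization = sum(wcet / period for wcet, period in tasks)
    for f in sorted(frequencies):
        # Slowing the clock by f / f_max stretches every WCET by f_max / f.
        if utilization <= f / f_max:
            return f
    return None

# Hypothetical task set: (WCET in ms at full speed, period in ms).
tasks = [(3.0, 8.0), (3.0, 10.0), (1.0, 14.0)]
frequencies = [0.5, 0.75, 1.0]  # normalized to the maximum frequency

chosen = lowest_feasible_frequency(tasks, frequencies)
print(chosen)
```

Because this test uses worst-case execution times, it never reclaims the slack left by jobs that finish early, which is the gap the look-ahead and feedback schemes try to close.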

Feedback Control Based
- Earlier approaches to hard real-time scheduling rely on a priori knowledge of WCET.
- The actual execution time varies widely from the estimated WCET.
- If the actual execution time is shorter, the processor finishes earlier than necessary and consumes more energy than required.
  - E.g., about 60% degradation in RT-DVS for a fluctuating workload.
- Use feedback control techniques in real-time scheduling for hard real-time systems so that the DVS scheme adjusts to the ever-changing workload as fast as possible.
Taken from [4]

Feedback Control Based: Feedback-DVS [4]
- CA: execution time of the first portion of tasks.
- Maximal Schedule Profile: holds the execution times generated offline.
- Based on the error (the difference between CA and the actual execution time) and the WCETs (Maximal Schedule Profile), the voltage/frequency selector chooses a (V, F) pair.
- Using the (V, F) input, the scheduler schedules the next ready task from the task queue under the EDF policy.
- The estimated execution time for the next job is fed back for later decision making.
Taken from [4]

Complexity of Multimedia Applications
- Huge workload variations between different classes of jobs.
- The workload distribution within each class of decoding jobs is observed through offline training to estimate its mean and variance.

Model for Stochastic Complexity of Jobs
- Class-based stochastic model [5]
  - A job class is a particular GOP frame type.
  - Near-Gaussian distributed complexity.
  - Parameters are derived offline and transmitted online at low cost.
B. Foo and M. van der Schaar, "A Queuing Theoretic Approach to Processor Power Adaptation for Video Decoding Systems," IEEE Trans. Signal Process., vol. 56, no. 1, pp , Jan. 2008
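The offline training step for such a class-based model can be sketched as follows. The class labels "I", "P" and "B" and the cycle counts are assumptions for illustration only; the slides say only that a job class is a GOP frame type and that a mean and variance are estimated per class.

```python
from statistics import mean, stdev

def train_class_model(samples_by_class):
    """Estimate (mean, standard deviation) of job complexity for each job class.

    samples_by_class: {class_label: [cycle counts observed offline]}.
    """
    return {label: (mean(samples), stdev(samples))
            for label, samples in samples_by_class.items()}

# Hypothetical offline measurements, in clock cycles per decoded frame.
training = {
    "I": [9.0e8, 1.0e9, 1.1e9],
    "P": [5.0e8, 6.0e8, 7.0e8],
    "B": [2.0e8, 3.0e8, 4.0e8],
}

model = train_class_model(training)
for label, (mu, sigma) in model.items():
    print(f"{label}: mean={mu:.3g} cycles, std={sigma:.3g} cycles")
```

Only two numbers per class need to reach the decoder, which fits the slide's point that the parameters can be transmitted online at low cost.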

References
[1] Kihwan Choi et al., "Frame-Based Dynamic Voltage and Frequency Scaling for a MPEG Decoder," ICCAD 2002.
[2] W. Yuan et al., "GRACE: Cross-layer adaptation for multimedia quality and battery energy," IEEE Trans. on Mobile Computing, 2006.
[3] P. Pillai et al., "Real time dynamic voltage scaling for low power embedded operating systems," Proc. of ACM Symposium on Operating Systems, 2001.
[4] Y. Zhu et al., "Feedback EDF scheduling exploiting dynamic voltage scaling," Proc. of International Conf. on Comp. Arch., 2004.
[5] B. Foo and M. van der Schaar, "A Queuing Theoretic Approach to Processor Power Adaptation for Video Decoding Systems," IEEE Trans. Signal Process., vol. 56, no. 1, pp , Jan. 2008.

Contributions
- Efficient LP-based (instead of ILP-based) optimal offline DVS algorithm.
  - It is not clear how far the existing online algorithms are from the optimal solution.
  - Existing ILP-based offline solutions are not scalable.
  - The proposed offline algorithm enables an optimality study of online DVS algorithms.
- Effective online approach by sequential robust linear programming, namely SLP/r.
  - Consumes 0.3% more energy than the optimal solution, versus 4% more for the best existing work.
- Applicable to other delay-sensitive applications with time-varying workloads.
  - E.g., real-time stream queries for financial or medical data, and manufacturing process control.

Power Model
- The power model covers dynamic power and sub-threshold leakage power.
- In the leakage term, Lg is the number of devices in the circuit, Ij is the reverse-bias junction current, and Vbs is the body bias voltage; K3, K4 and K5 are constant fitting parameters.
- A sleep mode of operation is also considered, which makes the power model non-convex: Power = 0, Frequency = 0.


Formulation of DVS Problem
- Given
  - A sequence of decoding jobs (stochastic complexity, stochastic arrival time, deterministic deadline).
  - A set of voltages including power gating, each with an associated clock frequency and power.
- Find
  - The time and voltage level for each voltage switch.
- Minimize
  - Energy.
- Subject to
  - Start a job after it arrives.
  - Finish a job before its deadline.

Formulation of DVS Problem - Contd.
- Let C = {C1, C2, ..., CM}, T = {T1, T2, ..., TM} and D = {D1, D2, ..., DM} be the complexities, arrival times and display deadlines of the M incoming jobs.
- Let F = {F0, F1, ..., FK} and P = {P0, P1, ..., PK} be the available frequency and power levels.
- The scheduling solution is S = {Ts, Vs, N}, where N is the number of voltage switches, and Ts = {t0, t1, ..., tN, tN+1} and Vs = {v0, v1, ..., vN} are the times and voltage levels of the switches.
- The DVS problem is then to choose S so as to minimize the total energy, subject to every job starting after its arrival and finishing before its deadline.


DVS Problem in Time – Complexity Space
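One way to picture the time-complexity space is as two staircase curves over time, with the accumulated computation of any valid schedule running between them. The sketch below assumes that U(t) is the total complexity of all jobs that have arrived by time t (work cannot start before arrival) and L(t) is the total complexity of all jobs whose deadline is at or before t (work that must already be done). The slide's figure is not available in this transcript, so this reading, the function names and the job values are all assumptions.

```python
def upper_bound(jobs, t):
    """U(t): total cycles of the jobs that have arrived by time t."""
    return sum(cycles for cycles, arrival, deadline in jobs if arrival <= t)

def lower_bound(jobs, t):
    """L(t): total cycles of the jobs whose deadline is at or before t."""
    return sum(cycles for cycles, arrival, deadline in jobs if deadline <= t)

def adaptation_intervals(jobs):
    """Spans between consecutive arrival/deadline events, where U and L stay constant."""
    events = sorted({x for _, arrival, deadline in jobs for x in (arrival, deadline)})
    return list(zip(events, events[1:]))

# Hypothetical jobs: (complexity in cycles, arrival time, deadline).
jobs = [(4.0, 0.0, 2.0), (2.0, 1.0, 3.0), (6.0, 2.0, 5.0)]

for start, end in adaptation_intervals(jobs):
    mid = (start + end) / 2
    print(f"[{start}, {end}): L = {lower_bound(jobs, mid)}, U = {upper_bound(jobs, mid)}")
```

A schedule is feasible under this reading when its accumulated computation never exceeds U(t) and never falls below L(t).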

Property of DVS Solution
- Compared to multimedia jobs (around 10^9 clock cycles), the voltage switching overhead (around 10 clock cycles) is negligible.
- We call a time interval with constant U(t) and L(t) an adaptation interval.
- Primary Theorem: within an adaptation interval, any ordering of the voltage levels along an accumulative computation curve consumes the same energy.
  - What really matters is the percentage of time spent at each voltage level.
[Figure: two accumulative computation curves between L(t) and U(t), using voltage-level sequences 2,0,1,3,4 and 0,1,2,3,4]
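The ordering property can be checked numerically: with switching overhead ignored, energy is the sum of power times duration over the segments of an interval, so it depends only on how long each level is used. The five power and frequency values below are hypothetical; level 0 stands for the sleep mode.

```python
from itertools import permutations

def segment_energy(segments, power):
    """Energy of a schedule given as (level, duration) segments, overhead ignored."""
    return sum(power[level] * duration for level, duration in segments)

def segment_cycles(segments, freq):
    """Cycles completed by the same schedule."""
    return sum(freq[level] * duration for level, duration in segments)

# Hypothetical levels; index 0 is the sleep mode (zero power, zero frequency).
power = [0.0, 0.2, 0.4, 0.7, 1.0]
freq = [0.0, 0.4, 0.6, 0.8, 1.0]

# One adaptation interval split into five segments, one per level.
schedule = [(0, 0.1), (1, 0.2), (2, 0.3), (3, 0.2), (4, 0.2)]

energies = {round(segment_energy(order, power), 12) for order in permutations(schedule)}
cycles = {round(segment_cycles(order, freq), 12) for order in permutations(schedule)}
print(energies, cycles)
```

All 120 orderings give one energy value and one cycle count. Because U(t) and L(t) are constant inside an adaptation interval and the accumulated computation only increases, every ordering also stays between the bounds whenever the interval's start and end points do.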

LP Formulation for DVS
[Equation: LP objective and constraints (s.t.), annotated with: length of adaptation interval i; power; allocation of voltage level j in adaptation interval i; frequency]
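To make the role of the allocation variables concrete, here is a deliberately reduced sketch: a single adaptation interval of given length that must complete a fixed number of cycles. It is not the slide's full LP, which couples all intervals through the L(t) and U(t) bounds. With x_j the share of the interval spent at level j, the sketch minimizes sum_j P_j * x_j * length subject to sum_j F_j * x_j * length = required cycles, sum_j x_j = 1 and x_j >= 0. An LP with two equality constraints has an optimal solution with at most two nonzero variables, so enumerating single levels and pairs is enough. The level table is hypothetical.

```python
from itertools import combinations

def min_energy_mix(levels, required_cycles, length):
    """Cheapest time-share mix of voltage levels for ONE adaptation interval.

    levels: list of (frequency, power) pairs, including a sleep level (0, 0).
    Returns (energy, {level_index: share}) or None if the work cannot be done.
    """
    rate = required_cycles / length  # average frequency the interval needs
    best = None
    # Case 1: a single level runs at exactly the required rate.
    for j, (f, p) in enumerate(levels):
        if abs(f - rate) < 1e-12:
            cand = (p * length, {j: 1.0})
            if best is None or cand[0] < best[0]:
                best = cand
    # Case 2: mix two levels whose frequencies bracket the required rate.
    for (a, (fa, pa)), (b, (fb, pb)) in combinations(enumerate(levels), 2):
        if not (min(fa, fb) < rate < max(fa, fb)):
            continue
        xa = (fb - rate) / (fb - fa)  # share of level a; level b gets the rest
        cand = ((pa * xa + pb * (1.0 - xa)) * length, {a: xa, b: 1.0 - xa})
        if best is None or cand[0] < best[0]:
            best = cand
    return best

# Hypothetical (frequency, power) levels, normalized; level 0 is the sleep mode.
LEVELS = [(0.0, 0.0), (0.6, 0.3), (0.8, 0.55), (1.0, 1.0)]

energy, shares = min_energy_mix(LEVELS, required_cycles=0.7, length=1.0)
print(energy, shares)
```

In this example the cheapest mix splits the interval between the two levels adjacent to the required rate, which is cheaper than running at full speed and then sleeping.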


rLP Formulation
- L(t) and U(t) depend on the stochastic complexity of jobs.
- A sequence of rLPs is solved over a sequence of time windows; each window is larger than an adaptation interval.
- Difference from the offline formulation: L(t) and U(t) become stochastic.
[Equation: rLP objective and constraints (s.t.)]

Illustration of SLP/r
[Figure sequence: media-time axis with deadlines D1, D2, D3 and predicted arrival times T1', T2', T3'; the L(t) and U(t) curves for robust linear programming are built from the predicted mean and standard deviation and compared with the real complexity and arrival time of jobs]
- Prediction: build L(t) and U(t) for robust linear programming over the current time window.
- rLP: solve the robust linear program for the window.
- Commitment: commit the solution against the real complexity and arrival time of jobs.
- Process a new time window.


Experimental Setup
- V_dd between 0.6 V and 1.0 V in steps of 0.1 V, plus power gating
  - Our algorithms are applicable to any power model.
- Video sequence consisting of 10 different scenes.
- Compare SLP/r with:
  - Queuing-Based Stochastic Algorithm [Foo, 2008]
  - Deterministic laEDF [Pillai, 2001]
- Monte Carlo simulation of stochastic complexity and arrival time to verify all results

Recap of SLP/r
- Confidence level
  - Linear online prediction function for the workload of each job class: mean + k * standard deviation
  - Decides the trade-off between miss rate and energy
- Granularity of SLP/r
  - Number of jobs to commit before shifting the window
  - Decides the trade-off between runtime and quality of solution
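The prediction function mean + k * standard deviation can be sketched directly. The second function is an added illustration, not something stated on the slides: if job complexity were exactly Gaussian, it gives the probability that a single job needs more cycles than predicted. That is a per-job exceedance probability, not the deadline miss rate reported in the results. The numeric inputs are hypothetical.

```python
import math

def predicted_workload(mean, std, k):
    """Workload bound for a job class at confidence level k: mean + k * std."""
    return mean + k * std

def gaussian_exceedance(k):
    """P(X > mean + k * std) for a Gaussian X, i.e. 1 - Phi(k)."""
    return 0.5 * math.erfc(k / math.sqrt(2))

# Hypothetical class parameters, in clock cycles.
bound = predicted_workload(mean=1.0e9, std=1.0e8, k=1.5)
tail = gaussian_exceedance(1.5)
print(f"bound = {bound:.3g} cycles, exceedance = {tail:.3f}")
```

A larger k reserves more cycles per job, which lowers the chance of underestimating the work but forces higher voltage levels; this is the miss rate versus energy trade-off named on the slide.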

Comparison of Energy/Miss Rate
- Granularity = 1 job
- Optimality study (energy relative to the optimal offline solution):
  - laEDF: 15% more energy
  - Queuing-based: 4% more energy
  - Online algorithm SLP/r: 1% more energy

Granularity VS Quality
- Granularity = 4 jobs: 0.03% miss rate with 0.3% more energy than optimal.
- Changing the granularity from 1 to 4 jobs reduces runtime, energy and miss rate simultaneously.

Energy VS Granularity/Confidence Level
- Minimum energy setting: granularity = 4 jobs, confidence level = 1.5

Miss Rate VS Granularity/Confidence Level
- Granularity = 4 jobs, confidence level = 1.5: miss rate close to zero


Conclusions
- An efficient optimal offline DVS algorithm based on a tractable LP formulation.
  - Enables an optimality study for DVS algorithms.
- An effective online approach, SLP/r, by sequential robust linear programming.
  - Consumes 0.3% more energy than the optimal solution, versus 4% more for the best existing work.

Thanks! Q & A

Back-up slides

Existing Online Algorithms
- Use worst-case or average-case execution time [Pillai ACM Symposium on OS’01] [Choi, ICCAD’02] [Zhu LCTES’07] [Nahrstedt et al., ITMC’06]
  - Ensure hard deadlines are not missed
  - Soft deadlines (and slack reclamation) to reduce energy consumption
- Online workload prediction
  - Feedback-based [Zhu LCTES’07]
  - Adaptive linear prediction [Akyol 2007]
  - Buffer-constrained DVS [Maxiaguine et al, ICHSC’05]
  - Stochastic Queuing-based DVS [Foo 2008]
- It is not clear how far online algorithms are from the optimal solution.

Existing Offline Algorithms
- Offline algorithms can be used for an optimality study
  - Optimal solution, assuming that complexity and arrival time are known from a trace
  - Lower bound on energy for online algorithms
- Existing ILP-based offline algorithms are not scalable
  - [Akyol, 2007] [Zhang, ICCAD’07]