Performance and Energy Bounds for Multimedia Applications on Dual-processor Power-aware SoC Platforms Weng-Fai WONG 黄荣辉 Dept. of Computer Science National University of Singapore Joint work in collaboration with Zhu Yongxin, Samarjit Chakraborty
Background SoC platforms have become more complicated than classic embedded systems because they carry out multiple tasks at once: –recording music received by software radio –playing games while downloading another one –talking over a GPRS/3G mobile phone which stays online checking s –…. Need to quickly explore the design space of SoCs for multimedia processing Emergence of multi-core technology
Background Analytical approaches are necessary because simulation incurs unacceptable overhead when studying multiple design tradeoffs Many efforts target performance enhancement to ensure quality of service, such as a guaranteed playback rate A few power-awareness efforts –dynamic voltage and frequency scaling (DVFS) –dynamic power management (DPM)
Overview Background Related work Methodology Experiment setup Results and discussion Next steps
A Motivating Problem Under a performance constraint, how to minimize energy dissipation by trading off among: –dynamic frequency and voltage scaling policies, –multiple frequencies of processors, –processor customization catering for applications
Related Work –Yanhong Liu, Alexander Maxiaguine, Samarjit Chakraborty, and Wei Tsang Ooi. Processor frequency selection in energy-aware SoC platform design for multimedia applications. RTSS –Alexander Maxiaguine, Yongxin Zhu, Samarjit Chakraborty, and Weng-Fai Wong. Tuning SoC platforms for multimedia processing: Identifying limits and tradeoffs. CODES+ISSS 2004 –L. Cai and Y.H. Lu. Energy Management Using Buffer Memory for Streaming Data. IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems, 24(2): , 2005 –Validation of the models against simulation results and metrics on physical processors –Co-optimization of performance and power
Methodology Network calculus models to identify the upper and lower bounds using variability characterization curves (VCCs)
Variability Characterization Curves Workload curves Consumption curves
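As an illustration of what a workload curve captures, the following minimal sketch derives upper and lower workload curves from a trace of per-activation cycle demands: for each k, it takes the maximum and minimum total cycles over every window of k consecutive activations. The function name `workload_curves` and the trace values are hypothetical and not taken from the slides.

```python
from typing import List, Tuple

def workload_curves(cycles: List[int], max_k: int) -> Tuple[List[int], List[int]]:
    """Return (upper, lower); upper[k] and lower[k] bound the cycles needed
    by any k consecutive activations in the trace. Index 0 is always 0."""
    n = len(cycles)
    if not 0 <= max_k <= n:
        raise ValueError("max_k must be between 0 and the trace length")
    # Prefix sums make every window sum a single subtraction.
    prefix = [0]
    for c in cycles:
        prefix.append(prefix[-1] + c)
    upper, lower = [0], [0]
    for k in range(1, max_k + 1):
        sums = [prefix[i + k] - prefix[i] for i in range(n - k + 1)]
        upper.append(max(sums))
        lower.append(min(sums))
    return upper, lower

# Hypothetical trace: cycles demanded by five consecutive activations.
trace = [4, 9, 2, 7, 3]
upper, lower = workload_curves(trace, 3)
print(upper)  # [0, 9, 13, 18]
print(lower)  # [0, 2, 9, 12]
```

The curves are only as tight as the trace is representative; a longer trace can only widen the gap between the two bounds.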
Variability Characterization Curves (cont’d) Production curves Service curves: number of available cycles, subject to scheduler settings such as duty cycles Number of activations
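To make the link between a duty-cycle scheduler and a service curve concrete, here is a minimal sketch that assumes a strictly periodic scheduler: in every scheduling period the PE is available for a fixed on-time and unavailable for the rest. Under that assumption, the fewest cycles are offered when the window starts at the beginning of an off phase, and the most when it starts at the beginning of an on phase. The function name, integer microsecond units, and the numbers are illustrative assumptions, not values from the slides.

```python
def service_bounds(delta_us: int, period_us: int, on_us: int, freq_mhz: int):
    """Lower and upper bounds on the processor cycles available in ANY
    time window of length delta_us, for a periodic duty-cycle scheduler.

    Assumption: the PE runs at freq_mhz for on_us out of every period_us
    (duty cycle = on_us / period_us). MHz * microseconds = cycles.
    """
    if not 0 <= on_us <= period_us or period_us <= 0 or delta_us < 0:
        raise ValueError("need 0 <= on_us <= period_us, period_us > 0, delta_us >= 0")
    off_us = period_us - on_us
    full_periods, rest = divmod(delta_us, period_us)
    # Worst case: the window opens just as the off phase begins.
    lower_time = full_periods * on_us + max(0, rest - off_us)
    # Best case: the window opens just as the on phase begins.
    upper_time = full_periods * on_us + min(rest, on_us)
    return freq_mhz * lower_time, freq_mhz * upper_time

# Hypothetical setting: 1000 us period, duty cycle 0.9, 733 MHz.
print(service_bounds(1500, 1000, 900, 733))  # (952900, 1026200)
print(service_bounds(50, 1000, 900, 733))    # (0, 36650)
```

The second call shows why short windows matter for underflow: over a window shorter than the off phase, no service at all can be guaranteed.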
Power Model Active time –where $L_i$ is the length of activation on the i-th PE and $\Omega_i$ is the frequency of PE $i$ Leakage power –where $I_{subn}$ is the sub-threshold current, $V_{bs}$ is the body bias voltage, and $I_j$ is the reverse bias junction current Switching overhead –where $\rho_i$ is the scheduling period of PE $i$, $D_{wakeup}$ is the wake-up delay, and $p^{idle}_{p,i}$ is the dynamic power of PE $i$ in the idle mode
Power Model (cont’d) PE’s energy Buffer’s energy –where $Q^{max}_i$ is the maximum fill level of the i-th buffer and $p^b_i$ is the i-th buffer’s dynamic power Total energy
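The equations for this slide are not present in the text. As a rough guide only, the two expressions below are one plausible reading of the symbol definitions, not the authors' exact formulas. The active-time form assumes the activation length is measured in cycles and the frequency in cycles per second. The leakage form is the commonly used static-power model for combined voltage scaling and body biasing, and the supply voltage $V_{dd}$ is an assumed symbol that the slide does not define. The switching-overhead term is left out because its form cannot be inferred from the definitions alone.

```latex
t^{\mathrm{active}}_i = \frac{L_i}{\Omega_i},
\qquad
P^{\mathrm{leak}} = V_{dd}\, I_{subn} + |V_{bs}|\, I_j
```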
Experiment Setup Map an MPEG-2 decoder onto PE 1 and PE 2 Setting 1: parameters of the Intel XScale processor Setting 2: parameters based on the Transmeta Crusoe processor scaled up to 70 nm technology Buffer specifications use Micro SDRAM parameters
How do scheduling policies affect the constraint?
Results on Underflow Possibilities Underflow possibilities associated with scheduling periods (733MHz)
Results on Underflow Possibilities (cont’d) Underflow possibilities associated with varying duty cycles (633MHz)
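One way to read "underflow possibility" is as a comparison between what the decoder is guaranteed to have produced and what playback needs. The sketch below assumes a discretized time axis, a lower bound on the number of items written to the playout buffer by each time step, and a constant-rate playback that starts after an initial delay. The function name and all values are hypothetical.

```python
from typing import Sequence

def underflow_possible(produced_lower: Sequence[int],
                       playback_rate: int,
                       startup_delay: int) -> bool:
    """Return True if the playout buffer may run empty.

    produced_lower[t] is the guaranteed minimum number of items written to
    the buffer by time step t. Playback starts after startup_delay steps
    and then reads playback_rate items per step (both are assumptions).
    """
    for t, produced in enumerate(produced_lower):
        needed = max(0, t - startup_delay) * playback_rate
        if produced < needed:
            return True  # the guarantee falls short of the demand at step t
    return False

# Hypothetical lower production bound over six time steps.
lower_bound = [0, 0, 1, 2, 3, 4]
print(underflow_possible(lower_bound, playback_rate=1, startup_delay=2))  # False
print(underflow_possible(lower_bound, playback_rate=1, startup_delay=0))  # True
```

Because the check uses a lower bound, a `False` result is a guarantee, while `True` only says that underflow cannot be ruled out.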
Which is more sensitive to schedulers, the buffer’s energy or PE’s energy?
Bounds of Buffer’s Energy Bounds of buffer’s maximum energy associated with the same frequencies of PEs with SDRAM buffers under varying duty cycles
Bounds of Total Energy Bounds of maximum total energy associated with the same frequencies of PEs with SDRAM buffers under varying duty cycles
How to reduce energy by choosing frequencies without undermining the quality of service?
Choosing Frequencies along the Boundary Bounds of maximum total energy associated with the combinations of frequencies of PEs with SDRAM buffers at a duty cycle of 0.9
Choosing Frequencies along the Boundary (cont’d) Note that the surface increases almost monotonically with the frequencies, except at the starting point Choosing frequency combinations along the boundary of the feasible area can minimize energy without violating the performance constraint
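The boundary argument can be sketched as a small search: enumerate the frequency pairs that satisfy the performance constraint, keep only those with no other feasible pair at or below them in both frequencies, and pick the cheapest of those. The feasibility rule and energy function below are toy stand-ins chosen for illustration, assuming energy grows with frequency. They are not the models from the slides, and the frequency list is hypothetical.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[int, int]

def boundary_points(freqs1: Sequence[int], freqs2: Sequence[int],
                    feasible: Callable[[int, int], bool]) -> List[Pair]:
    """Feasible (f1, f2) pairs with no other feasible pair <= in both axes."""
    ok = [(f1, f2) for f1, f2 in product(freqs1, freqs2) if feasible(f1, f2)]
    return [p for p in ok
            if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in ok)]

def cheapest(points: Sequence[Pair], energy: Callable[[int, int], float]) -> Pair:
    """Pick the boundary point with the lowest energy estimate."""
    return min(points, key=lambda p: energy(*p))

# Toy assumptions: frequencies in MHz, a made-up performance constraint,
# and an energy estimate that increases with both frequencies.
freqs = [400, 533, 633, 733]
meets_qos = lambda f1, f2: f1 >= 533 and f1 + f2 >= 1200
energy = lambda f1, f2: f1 ** 2 + f2 ** 2

edge = boundary_points(freqs, freqs, meets_qos)
print(edge)                    # [(533, 733), (633, 633), (733, 533)]
print(cheapest(edge, energy))  # (633, 633)
```

If the energy surface is not monotonic everywhere, as the slide notes for the starting point, restricting the search to the boundary is a heuristic and the full feasible set should be checked.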
What to trade off if the frequencies are fixed?
Shifting of the Best Duty Cycle Bounds of maximum total energy associated with the combinations of frequencies of PEs with data cache buffers under varying duty cycles
Summary An analytical framework based on VCC to identify both performance and energy bounds Studied the impacts of scheduler policies Explored the tradeoffs of frequencies Explored processor customizations
Next Steps Include more hardware details –Hierarchical cache systems –Communication mechanisms such as buses Co-optimization algorithms Detailed validations of the model
EASEL: Engineering Architectures and Software for the Embedded Landscape
Thank You!