Performance and Energy Bounds for Multimedia Applications on Dual-processor Power-aware SoC Platforms Weng-Fai WONG 黄荣辉 Dept. of Computer Science National.


Performance and Energy Bounds for Multimedia Applications on Dual-processor Power-aware SoC Platforms
Weng-Fai WONG 黄荣辉
Dept. of Computer Science, National University of Singapore
Joint work with Zhu Yongxin and Samarjit Chakraborty

Background
SoC platforms have become more complex than classic embedded systems because they carry out multiple tasks at once:
  – recording music received by software radio
  – playing a game while downloading another one
  – talking over a GPRS/3G mobile phone that stays online checking e-mails
  – ...
The SoC design space for multimedia processing needs to be explored quickly
Emergence of multi-core technology

Background
Analytical approaches are necessary because the overhead of simulation is unacceptable when studying multiple design tradeoffs
Many efforts target performance enhancement to ensure quality of service, such as a guaranteed playback rate
A few efforts target power awareness:
  – dynamic voltage and frequency scaling (DVFS)
  – dynamic power management (DPM)

Overview
Background
Related work
Methodology
Experiment setup
Results and discussion
Next steps

A Motivating Problem
Under a performance constraint, how can energy dissipation be minimized by trading off among:
  – dynamic voltage and frequency scaling policies,
  – multiple processor frequencies,
  – processor customization tailored to the applications?

Related Work
Yanhong Liu, Alexander Maxiaguine, Samarjit Chakraborty, and Wei Tsang Ooi. Processor frequency selection in energy-aware SoC platform design for multimedia applications. RTSS
Alexander Maxiaguine, Yongxin Zhu, Samarjit Chakraborty, and Weng-Fai Wong. Tuning SoC platforms for multimedia processing: Identifying limits and tradeoffs. CODES+ISSS 2004
L. Cai and Y.H. Lu. Energy Management Using Buffer Memory for Streaming Data. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 24(2): , 2005
Validation of the models against simulation results and metrics on physical processors
Co-optimization of performance and power

Methodology
Network calculus models identify upper and lower bounds using variability characterization curves (VCCs)

Variability Characterization Curves
Workload curves
Consumption curves
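As a rough illustration of what a workload curve captures, the sketch below derives upper and lower workload bounds from a recorded trace of per-activation cycle demands. It assumes the usual definition from the VCC literature (the largest and smallest total demand over any k consecutive activations). The function name and the trace values are hypothetical, and bounds obtained this way hold only for the trace they were measured on.

```python
def workload_curves(cycles_per_activation, k_max):
    """Return (upper, lower); index k bounds the cycles of any k consecutive activations.

    Assumption: the trace is a plain list of cycle counts, one per activation.
    """
    n = len(cycles_per_activation)
    if k_max > n:
        raise ValueError("k_max cannot exceed the trace length")

    # Prefix sums let each window sum be computed in O(1).
    prefix = [0]
    for c in cycles_per_activation:
        prefix.append(prefix[-1] + c)

    upper, lower = [0], [0]  # zero activations need zero cycles
    for k in range(1, k_max + 1):
        window_sums = [prefix[i + k] - prefix[i] for i in range(n - k + 1)]
        upper.append(max(window_sums))
        lower.append(min(window_sums))
    return upper, lower


# Hypothetical trace of cycle demands (arbitrary units).
trace = [3, 1, 4, 1, 5]
upper, lower = workload_curves(trace, 3)
```

The gap between the two curves is what makes the analysis tighter than a single worst-case execution time multiplied by k.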

Variability Characterization Curves
Production curves
Service curves
  – number of available cycles, subject to schedulers such as duty cycles
Number of activations
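To make the link between a duty-cycle scheduler and a service curve concrete, here is a minimal sketch. It assumes the scheduler behaves like a periodic on/off (TDMA-like) resource: in every scheduling period the PE is available for a fraction `duty` of the time. The closed forms are the standard TDMA bounds from real-time calculus, not formulas taken from the slides, and all parameter names are hypothetical.

```python
import math


def duty_cycle_service_bounds(delta, period, duty, freq):
    """Return (lower, upper) bounds on cycles available in any window of length delta.

    Assumptions: the PE is on for duty*period out of every period,
    and delivers freq cycles per time unit while it is on.
    """
    slot = duty * period   # time the PE is available in each period
    gap = period - slot    # time it is unavailable in each period
    full = math.floor(delta / period)
    part = math.ceil(delta / period)

    # Worst case: the window opens just as the PE is switched off.
    lower = max(full * slot, delta - part * gap, 0.0)
    # Best case: the window opens just as the PE is switched on.
    upper = min(part * slot, delta - full * gap)
    return freq * lower, freq * upper


# Hypothetical numbers: period 10, duty cycle 0.5, 1 cycle per time unit.
lo7, hi7 = duty_cycle_service_bounds(7, period=10, duty=0.5, freq=1)
```

A longer scheduling period widens the gap between the two bounds, which is one way to see why the period affects buffer underflow.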

Power Model
Active time
  – where $L_i$ is the length of the activation on the $i$-th PE and $\Omega_i$ is the frequency of PE $i$
Leakage power
  – where $I_{subn}$ is the sub-threshold current, $V_{bs}$ is the body bias voltage, and $I_j$ is the reverse-bias junction current
Switching overhead
  – where $\rho_i$ is the scheduling period of PE $i$, $D_{wakeup}$ is the wake-up delay, and $p^{idle}_{p,i}$ is the dynamic power of PE $i$ in idle mode
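The slide's own equations are not reproduced in this transcript. As a hedged illustration of how the listed symbols are commonly combined, and not necessarily the authors' exact formulas, the active time can be read as cycles divided by frequency, and the leakage term matches a widely used static-power model. The supply voltage $V_{dd}$ is an assumption; it is not named on the slide.

```latex
% Assumption: L_i is measured in processor cycles and \Omega_i in cycles per second.
t_{\mathrm{active},i} = \frac{L_i}{\Omega_i}

% Assumption: a commonly used static-power form; V_{dd} (supply voltage) is not named on the slide.
P_{\mathrm{leak}} = V_{dd}\, I_{subn} + \lvert V_{bs} \rvert\, I_j
```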

Power Model (cont’d)
PE’s energy
Buffer’s energy
  – where $Q^{max}_i$ is the maximum fill level of the $i$-th buffer and $p^b_i$ is the $i$-th buffer’s dynamic power
Total energy
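The sketch below shows one way the pieces of such a power model could be added up. It is an illustration under stated assumptions, not the authors' model: PE energy is taken as active energy plus leakage over the whole horizon plus one wake-up per scheduling period, and buffer energy is taken as the maximum fill level times a per-entry power times the horizon. Every field name and number is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PE:
    active_power_w: float   # dynamic power while executing (assumed given)
    leakage_power_w: float  # static power, drawn over the whole horizon
    idle_power_w: float     # dynamic power in idle mode
    cycles: float           # activation length L_i, in cycles
    freq_hz: float          # frequency Omega_i
    period_s: float         # scheduling period rho_i
    wakeup_delay_s: float   # wake-up delay D_wakeup


def pe_energy(pe, horizon_s):
    """Energy of one PE in joules (assumed decomposition, see the lead-in)."""
    active = pe.active_power_w * (pe.cycles / pe.freq_hz)
    leakage = pe.leakage_power_w * horizon_s
    # Assumption: exactly one wake-up per scheduling period.
    switching = (horizon_s / pe.period_s) * pe.wakeup_delay_s * pe.idle_power_w
    return active + leakage + switching


def buffer_energy(q_max, power_per_entry_w, horizon_s):
    """Assumption: buffer power scales linearly with the maximum fill level."""
    return q_max * power_per_entry_w * horizon_s


def total_energy(pes, buffers, horizon_s):
    """Sum over all PEs and all (q_max, power_per_entry_w) buffer pairs."""
    return (sum(pe_energy(pe, horizon_s) for pe in pes)
            + sum(buffer_energy(q, p, horizon_s) for q, p in buffers))


# Hypothetical single-PE, single-buffer example over a 1 s horizon.
pe = PE(1.0, 0.1, 0.2, cycles=2e8, freq_hz=4e8, period_s=0.01, wakeup_delay_s=0.001)
energy = total_energy([pe], [(1000, 1e-5)], horizon_s=1.0)
```

Because the buffer term grows with the maximum fill level, a scheduler that lets data pile up can shift energy from the PEs to the buffers.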

Experiment Setup
Map an MPEG-2 decoder onto PE 1 and PE 2
Setting 1: parameters of the Intel XScale processor
Setting 2: parameters based on the Transmeta Crusoe processor, scaled to 70 nm technology
Buffer specifications use Micro SDRAM parameters

Experiment Setup (cont’d)

How do scheduling policies affect the constraint?

Results on Underflow Possibilities
Underflow possibilities associated with scheduling periods (733 MHz)

Results on Underflow Possibilities (cont’d)
Underflow possibilities associated with varying duty cycles (633 MHz)

Which is more sensitive to schedulers, the buffer’s energy or PE’s energy?

Bounds of Buffer’s Energy
Bounds on the buffers’ maximum energy when both PEs run at the same frequency, with SDRAM buffers, under varying duty cycles

Bounds of Total Energy
Bounds on the maximum total energy when both PEs run at the same frequency, with SDRAM buffers, under varying duty cycles

How can energy be reduced by choosing frequencies without undermining the quality of service?

Choosing Frequencies along the Boundary
Bounds on the maximum total energy for combinations of PE frequencies, with SDRAM buffers, at a duty cycle of 0.9

Choosing Frequencies along the Boundary (cont’d)
The energy surface increases almost monotonically with the frequencies, except at the starting point
Choosing frequency combinations along the boundary of the area that satisfies the performance constraint minimizes energy without violating that constraint
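The boundary argument can be sketched as a small search over frequency pairs. The feasibility test and the energy function below are hypothetical stand-ins for the performance bound and the energy bound of the framework; only the search pattern is the point. With an energy function that rises with both frequencies, the cheapest feasible pair is one where lowering either frequency by one step would break the constraint.

```python
from itertools import product


def min_energy_pair(freqs1, freqs2, feasible, energy):
    """Return the feasible (f1, f2) pair with the lowest energy, or None."""
    candidates = [(f1, f2) for f1, f2 in product(freqs1, freqs2) if feasible(f1, f2)]
    if not candidates:
        return None
    return min(candidates, key=lambda pair: energy(*pair))


def on_boundary(pair, freqs1, freqs2, feasible):
    """True if lowering either frequency by one step makes the pair infeasible.

    Assumption: freqs1 and freqs2 are sorted in ascending order.
    """
    f1, f2 = pair
    i, j = freqs1.index(f1), freqs2.index(f2)
    can_lower_f1 = i > 0 and feasible(freqs1[i - 1], f2)
    can_lower_f2 = j > 0 and feasible(f1, freqs2[j - 1])
    return not can_lower_f1 and not can_lower_f2


# Hypothetical frequency steps in MHz and hypothetical stand-in models.
steps = [333, 400, 533, 633, 733]
meets_constraint = lambda f1, f2: f1 + f2 >= 1000      # stand-in for the performance bound
energy_model = lambda f1, f2: 2 * f1**2 + 3 * f2**2    # stand-in, monotone in both frequencies

best = min_energy_pair(steps, steps, meets_constraint, energy_model)
```

In practice this means only the boundary pairs need to be evaluated, which shrinks the search from the full grid to a single staircase of candidates.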

What to trade off if the frequencies are fixed?

Shifting of the Best Duty Cycle
Bounds on the maximum total energy for combinations of PE frequencies, with data cache buffers, under varying duty cycles

Summary
An analytical framework based on variability characterization curves (VCCs) identifies both performance and energy bounds
Studied the impact of scheduling policies
Explored the tradeoffs among frequencies
Explored processor customizations

Next Steps
Include more hardware details
  – Hierarchical cache systems
  – Communication mechanisms such as buses
Co-optimization algorithms
Detailed validation of the model

EASEL: Engineering Architectures and Software for the Embedded Landscape

Thank You!