Low-Power and Temperature-Aware Compilation for Embedded Processors José L. Ayala Politecnica University of Madrid

Slides:



Advertisements
Similar presentations
Subthreshold SRAM Designs for Cryptography Security Computations Adnan Gutub The Second International Conference on Software Engineering and Computer Systems.
Advertisements

Philips Research ICS 252 class, February 3, The Trimedia CPU64 VLIW Media Processor Kees Vissers Philips Research Visiting Industrial Fellow
ECE-777 System Level Design and Automation Hardware/Software Co-design
ECOE 560 Design Methodologies and Tools for Software/Hardware Systems Spring 2004 Serdar Taşıran.
Lecture 8 Dynamic Branch Prediction, Superscalar and VLIW Advanced Computer Architecture COE 501.
Reducing Leakage Power in Peripheral Circuits of L2 Caches Houman Homayoun and Alex Veidenbaum Dept. of Computer Science, UC Irvine {hhomayou,
A Framework for Dynamic Energy Efficiency and Temperature Management (DEETM) Michael Huang, Jose Renau, Seung-Moon Yoo, Josep Torrellas University of Illinois.
University of Michigan Electrical Engineering and Computer Science 1 A Distributed Control Path Architecture for VLIW Processors Hongtao Zhong, Kevin Fan,
Dynamic Branch PredictionCS510 Computer ArchitecturesLecture Lecture 10 Dynamic Branch Prediction, Superscalar, VLIW, and Software Pipelining.
Pipelining 5. Two Approaches for Multiple Issue Superscalar –Issue a variable number of instructions per clock –Instructions are scheduled either statically.
Thermal-Scheduling For Ultra Low Power Mobile Microprocessor May, Thermal-Scheduling For Ultra Low Power Mobile Microprocessor George Cai 1 Chee.
A reconfigurable system featuring dynamically extensible embedded microprocessor, FPGA, and customizable I/O Borgatti, M. Lertora, F. Foret, B. Cali, L.
Copyright © 2002 UCI ACES Laboratory A Design Space Exploration framework for rISA Design Ashok Halambi, Aviral Shrivastava,
Extensible Processors. 2 ASIP Gain performance by:  Specialized hardware for the whole application (ASIC). −  Almost no flexibility. −High cost.  Use.
Timing Analysis Timing Analysis Instructor: Dr. Vishwani D. Agrawal ELEC 7770 Advanced VLSI Design Team Project.
Instruction Level Parallelism (ILP) Colin Stevens.
Chia-Yen Hsieh Laboratory for Reliable Computing Microarchitecture-Level Power Management Iyer, A. Marculescu, D., Member, IEEE IEEE Transaction on VLSI.
8/19/04ELEC / ELEC / Advanced Topics in Electrical Engineering Designing VLSI for Low-Power and Self-Test Fall 2004 Vishwani.
September 28 th 2004University of Utah1 A preliminary look Karthik Ramani Power and Temperature-Aware Microarchitecture.
Lecture 16: Power Reduction Techniques November 5, 2013 ECE 636 Reconfigurable Computing Lecture 16 Power Reductions Techniques for FPGAs.
Instruction Set Architecture (ISA) for Low Power Hillary Grimes III Department of Electrical and Computer Engineering Auburn University.
Enhancing Embedded Processors with Specific Instruction Set Extensions for Network Applications A. Chormoviti, N. Vassiliadis, G. Theodoridis, S. Nikolaidis.
1 Energy-efficiency potential of a phase-based cache resizing scheme for embedded systems G. Pokam and F. Bodin.
Temperature-Aware Design Presented by Mehul Shah 4/29/04.
Power-Aware Computing 101 CS 771 – Optimizing Compilers Fall 2005 – Lecture 22.
Architectural and Compiler Techniques for Energy Reduction in High-Performance Microprocessors Nikolaos Bellas, Ibrahim N. Hajj, Fellow, IEEE, Constantine.
Single-Chip Multi-Processors (CMP) PRADEEP DANDAMUDI 1 ELEC , Fall 08.
EE466: VLSI Design Power Dissipation. Outline Motivation to estimate power dissipation Sources of power dissipation Dynamic power dissipation Static power.
Power Reduction for FPGA using Multiple Vdd/Vth
ParaScale : Exploiting Parametric Timing Analysis for Real-Time Schedulers and Dynamic Voltage Scaling Sibin Mohan 1 Frank Mueller 1,William Hawkins 2,
A Reconfigurable Processor Architecture and Software Development Environment for Embedded Systems Andrea Cappelli F. Campi, R.Guerrieri, A.Lodi, M.Toma,
Instruction-Level Parallelism for Low-Power Embedded Processors January 23, 2001 Presented By Anup Gangwar.
Low-Power Wireless Sensor Networks
Logic Synthesis for Low Power(CHAPTER 6) 6.1 Introduction 6.2 Power Estimation Techniques 6.3 Power Minimization Techniques 6.4 Summary.
Speculative Software Management of Datapath-width for Energy Optimization G. Pokam, O. Rochecouste, A. Seznec, and F. Bodin IRISA, Campus de Beaulieu
October 6, 2004.Software Technology Forum 1 The Renaissance of Compiler Development Com piler optimizations motivated by embedded systems Tibor Gyimóthy.
1 EE 587 SoC Design & Test Partha Pande School of EECS Washington State University
Exploiting Program Hotspots and Code Sequentiality for Instruction Cache Leakage Management J. S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir.
Section 10: Advanced Topics 1 M. Balakrishnan Dept. of Comp. Sci. & Engg. I.I.T. Delhi.
Energy-Effective Issue Logic Hasan Hüseyin Yılmaz.
1 Tuning Garbage Collection in an Embedded Java Environment G. Chen, R. Shetty, M. Kandemir, N. Vijaykrishnan, M. J. Irwin Microsystems Design Lab The.
CISC 879 : Advanced Parallel Programming Vaibhav Naidu Dept. of Computer & Information Sciences University of Delaware Importance of Single-core in Multicore.
Power Estimation and Optimization for SoC Design
Dual-Pipeline Heterogeneous ASIP Design Swarnalatha Radhakrishnan, Hui Guo, Sri Parameswaran School of Computer Science & Engineering University of New.
Milestones, Feedback, Action Items Power Aware Distributed Systems Kickoff August 23, 2000.
Performance and Power Analysis of Globally Asynchronous Locally Synchronous Multiprocessor Systems Zhiyi Yu, Bevan M. Baas VLSI Computation Lab, ECE department,
Reconfigurable Computing Ender YILMAZ, Hasan Tahsin OĞUZ.
1 Power estimation in the algorithmic and register-transfer level September 25, 2006 Chong-Min Kyung.
Kyushu University Las Vegas, June 2007 The Effect of Nanometer-Scale Technologies on the Cache Size Selection for Low Energy Embedded Systems.
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA Project Guide: Smt. Latha Dept of E & C JSSATE, Bangalore. From: N GURURAJ M-Tech,
Improving Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection Hamid Noori †, Maziar Goudarzi ‡, Koji Inoue ‡, and Kazuaki.
© Digital Integrated Circuits 2nd Inverter Digital Integrated Circuits A Design Perspective The Inverter Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
1 Energy-Efficient Register Access Jessica H. Tseng and Krste Asanović MIT Laboratory for Computer Science, Cambridge, MA 02139, USA SBCCI2000.
Computer Science and Engineering Power-Performance Considerations of Parallel Computing on Chip Multiprocessors Jian Li and Jose F. Martinez ACM Transactions.
Addressing Instruction Fetch Bottlenecks by Using an Instruction Register File Stephen Hines, Gary Tyson, and David Whalley Computer Science Dept. Florida.
NISC set computer no-instruction
Workload Clustering for Increasing Energy Savings on Embedded MPSoCs S. H. K. Narayanan, O. Ozturk, M. Kandemir, M. Karakoy.
Sunpyo Hong, Hyesoon Kim
Compacting ARM binaries with the Diablo framework – Dominique Chanet & Ludo Van Put Compacting ARM binaries with the Diablo framework Dominique Chanet.
The Effect of Data-Reuse Transformations on Multimedia Applications for Application Specific Processors N. Vassiliadis, A. Chormoviti, N. Kavvadias, S.
CS203 – Advanced Computer Architecture
Thermal-Aware Data Flow Analysis José L. Ayala – Complutense University (Spain) David Atienza – EPFL (Switzerland) Philip Brisk – EPFL (Switzerland)
LOW POWER DESIGN METHODS
“Temperature-Aware Task Scheduling for Multicore Processors” Masters Thesis Proposal by Myname 1 This slides presents title of the proposed project State.
CS203 – Advanced Computer Architecture
Evaluating Register File Size
5.2 Eleven Advanced Optimizations of Cache Performance
Improving Program Efficiency by Packing Instructions Into Registers
A High Performance SoC: PkunityTM
Department of Electrical Engineering Joint work with Jiong Luo
Presentation transcript:

Low-Power and Temperature-Aware Compilation for Embedded Processors José L. Ayala Politecnica University of Madrid

Outline Motivation Motivation Goals Goals Low-Power Compilation Low-Power Compilation Temperature-Aware Compilation Temperature-Aware Compilation Ongoing and Future Work Ongoing and Future Work

Motivation Continuing advances in semiconductor technology  performance gains (clock rate and ILP) Continuing advances in semiconductor technology  performance gains (clock rate and ILP) Significant increase in power consumption (dynamic and static) Significant increase in power consumption (dynamic and static) Technology scales  future power density 200 W/cm 2 : early consideration of thermal effects are needed Technology scales  future power density 200 W/cm 2 : early consideration of thermal effects are needed

Goals Extensive efforts on dynamic techniques: static approaches are needed Extensive efforts on dynamic techniques: static approaches are needed Early thermal characterization with a selectable granularity Early thermal characterization with a selectable granularity Thermal and power management incorporated in the compiler code Thermal and power management incorporated in the compiler code

Low-Power Compilation Power-aware compilation for embedded single-core processors Power-aware compilation for embedded single-core processors –Modified register assignment: power reduction in the register file –Register file organized in banks: simple decoding logic –Register assignment improves register locality: registers assigned from same bank

Low-Power Compilation Power-aware compilation for embedded single-core processors Power-aware compilation for embedded single-core processors –Voltage scaling technique applied to unused banks –Compiler modifications in gcc compiler –Energy savings up to 75% without performance penalty

Low-Power Compilation Power-aware compilation for VLIW- based systems Power-aware compilation for VLIW- based systems –Power-aware register assignment targeting VLIW-based systems –Similar approach than the applied to embedded processors –Compiler modifications in Trimaran compiler

Low-Power Compilation Power-aware compilation for VLIW- based systems Power-aware compilation for VLIW- based systems –Simulation framework: CRISP (by IMEC) –Energy savings up to 60% without performance penalty –Modified compiler helps on simplifying over-sized register file in compilation time

Low-Power Compilation

Temperature-Aware Compilation Two different goals: Two different goals: –Decrease total temperature in the chip: effect on propagation delays, signal integrity and power consumption –Avoid temperature gradients: effect on electromigration and chip damage

Temperature-Aware Compilation First step: thermal modeling First step: thermal modeling –Thermal modeling based on power consumption measures: bidimensional RC thermal model through layout area estimations (black box model) –Thermal characterization of hardware modules in the chip area

Temperature-Aware Compilation

Second step: thermal analysis of source-level transformations Second step: thermal analysis of source-level transformations –Analysis of temperature gradients

Temperature-Aware Compilation

Second step: thermal analysis of source-level transformations Second step: thermal analysis of source-level transformations –Analysis of temperature gradients –Analysis of global temperature

Temperature-Aware Compilation

Third step: temperature/power-aware compiler Third step: temperature/power-aware compiler

Ongoing and Future Work Compiler-driven reconfigurable logic for low-power consumption: improves register assignment Compiler-driven reconfigurable logic for low-power consumption: improves register assignment Implementation over a real platform (CoolFlux, Lisatek, ACE/COSY…) Implementation over a real platform (CoolFlux, Lisatek, ACE/COSY…) Extension of the number of analyzed code transformations with impact in the thermal behavior Extension of the number of analyzed code transformations with impact in the thermal behavior

Ongoing and Future Work Integration of low-power optimizations and temperature-aware transformations in the compiler code Integration of low-power optimizations and temperature-aware transformations in the compiler code

Thanks! Questions?