Thermal-Aware Data Flow Analysis
José L. Ayala – Complutense University (Spain)
David Atienza – EPFL (Switzerland)
Philip Brisk – EPFL (Switzerland)

Problem formulation
• Thermal dissipation in semiconductor devices correlates strongly with power consumption (through the temperature dependence of leakage) and with reliability (through both absolute temperatures and thermal gradients).
• The register file (RF) is one of the hot spots in processor architectures.
• Thermal-aware compilation has been proposed, but it requires temperature estimation after compilation.
• The compiler can predict the thermal state of the processor AT EVERY POINT IN THE CODE.

Motivating example
• Register assignment algorithms assign each variable a register chosen from those not already assigned to interfering variables.
• Traditional algorithms maintain an ordered list (FIFO) of registers → a small set of registers is selected repeatedly.
• The thermal profile of the RF depends on the assignment policy: random, ordered, or "chessboard".
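The effect of the selection policy can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the slides: the 16-register file, the 4x4 floorplan used for the "chessboard" colouring, the toy workload, and the names `ordered_policy`, `random_policy`, `chessboard_policy` and `simulate`. It only counts how often each register is selected, as a rough proxy for where heat would concentrate.

```python
import random
from collections import Counter

NUM_REGS = 16  # assumed register file size, for illustration only


def ordered_policy(free_regs, rng):
    """Always pick the lowest-numbered free register (ordered-list behaviour)."""
    return min(free_regs)


def random_policy(free_regs, rng):
    """Pick any free register uniformly at random."""
    return rng.choice(sorted(free_regs))


def chessboard_policy(free_regs, rng):
    """Prefer free registers on one colour of an assumed 4x4 chessboard floorplan."""
    def is_black(r):
        row, col = divmod(r, 4)
        return (row + col) % 2 == 0
    black = [r for r in free_regs if is_black(r)]
    return min(black) if black else min(free_regs)


def simulate(policy, num_requests=1000, live_at_once=4, seed=0):
    """Count how often each register is selected under a policy.

    Toy workload: at most `live_at_once` values are live; once that limit is
    reached, the oldest value dies before each new request.
    """
    rng = random.Random(seed)
    free = set(range(NUM_REGS))
    live = []        # registers holding live values, oldest first
    uses = Counter()
    for _ in range(num_requests):
        if len(live) == live_at_once:
            free.add(live.pop(0))
        reg = policy(free, rng)
        free.remove(reg)
        live.append(reg)
        uses[reg] += 1
    return uses
```

Under this toy workload the ordered policy keeps reusing the same few low-numbered registers, the chessboard policy reuses registers that are never adjacent in the assumed floorplan, and the random policy spreads selections over many more registers.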

Data flow analysis (basics)
• Useful for determining correctness and proposing optimizations.
• Liveness analysis propagates a single bit of information.
• Bitwidth analysis propagates an interval.
• The thermal state is much more complex information to propagate: it is a continuous function, it depends on the floorplan, and its accuracy depends on the granularity.
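For contrast with the thermal case, the following is a minimal sketch of classic iterative liveness analysis, where the propagated fact is one bit per variable (live or not live), represented here as set membership. The function name `liveness` and the `blocks`/`succ` input format are hypothetical choices for this example.

```python
def liveness(blocks, succ):
    """Iterative backward liveness analysis over a control-flow graph.

    blocks: {name: (use_set, def_set)}, where use_set holds variables read
            before being written in the block.
    succ:   {name: [successor block names]}.
    Returns (live_in, live_out), each a {name: set of variables}.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:                      # repeat until a fixed point is reached
        changed = False
        for b, (use, defs) in blocks.items():
            out = set().union(*(live_in[s] for s in succ.get(b, [])))
            inn = use | (out - defs)
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out
```

Because the sets can only grow and the number of variables is finite, this iteration always terminates, which is the property the thermal analysis cannot take for granted.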

Data flow analysis (thermal)
• Proposed forward analysis, in a single procedure, for the RF:

Do
    Boolean stop ← True
    For each basic block B
        For each instruction I ∈ B, in forward order
            Estimate thermal state after I
            If change in I's thermal state exceeds δ
                stop ← False
            EndIf
        EndFor
    EndFor
While (stop = False)
Output thermal state of each instruction
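The loop above can be sketched in runnable form under a deliberately simple thermal model. The model is an assumption of this example and not the authors' estimator: each register cools toward an ambient value by a fixed factor per instruction, an accessed register gains a fixed amount of heat, and states are joined at control-flow merges by element-wise maximum. The constants, the iteration cap `max_iters`, and the names `step`, `merge` and `thermal_analysis` are likewise hypothetical. Every block is assumed to be non-empty and reachable.

```python
AMBIENT = 45.0   # assumed idle temperature (arbitrary units)
HEAT = 0.5       # assumed temperature rise per register access
DECAY = 0.9      # assumed fraction of excess heat retained per instruction


def step(state, accessed):
    """Toy model: every register cools toward AMBIENT, accessed ones heat up."""
    return tuple(
        AMBIENT + DECAY * (t - AMBIENT) + (HEAT if r in accessed else 0.0)
        for r, t in enumerate(state)
    )


def merge(states):
    """Join at control-flow merges: element-wise maximum (an assumption)."""
    return tuple(max(ts) for ts in zip(*states))


def thermal_analysis(blocks, preds, entry, num_regs, delta=0.01, max_iters=1000):
    """Forward thermal analysis of one procedure.

    blocks: {name: [set of registers accessed by each instruction]}.
    preds:  {name: [predecessor block names]}.
    Returns {(block, index): thermal state after that instruction}.
    Raises RuntimeError if no pass stays within `delta` before `max_iters`.
    """
    cold = (AMBIENT,) * num_regs
    after = {(b, i): cold for b, instrs in blocks.items() for i in range(len(instrs))}
    for _ in range(max_iters):
        stop = True
        for b, instrs in blocks.items():
            # State on entry to b: join of the predecessors' final states.
            incoming = [after[(p, len(blocks[p]) - 1)] for p in preds.get(b, [])]
            if b == entry:
                incoming.append(cold)
            state = merge(incoming)
            for i, accessed in enumerate(instrs):
                state = step(state, accessed)
                change = max(abs(x - y) for x, y in zip(state, after[(b, i)]))
                if change > delta:
                    stop = False
                after[(b, i)] = state
        if stop:
            return after
    raise RuntimeError("thermal analysis did not converge")
```

The iteration cap is a design choice of the sketch: since convergence is not guaranteed in general, the caller gets an explicit failure instead of an endless loop.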

Data flow analysis (thermal)
• There is no explicit way to guarantee convergence, because data usage can be very irregular.
• If the analysis does not converge, the program code can be re-optimized, or the subsequent compiler phases can be guided.

Data flow analysis (thermal)
• Traditional approach
  • Thermal analysis is applied after register assignment.
• Proposed approach
  • Thermal analysis is applied at earlier stages, based on predictive analyses.
  • A set of rules is developed that relates compiler decisions to the thermal state.
  • The compiler performs thermal-aware optimizations based on the preceding analysis.

Data flow analysis (thermal)
• Thermal-aware optimizations
  • Spreading in space
    • Spilling critical variables to memory
    • Splitting via copy insertion
  • Spreading in time
    • Via instruction scheduling
    • Using register promotion
    • Inserting NOP instructions
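As one illustration of spreading in time, the sketch below inserts NOPs into a straight-line instruction sequence so that no register exceeds a temperature limit. It is not the authors' algorithm. It assumes a toy thermal model (per-instruction cooling toward an ambient value plus a fixed heat gain on access), and the constants and the names `step` and `insert_nops` are hypothetical.

```python
AMBIENT, HEAT, DECAY = 45.0, 0.5, 0.9   # assumed toy-model constants
NOP = frozenset()                        # an instruction that touches no register


def step(state, accessed):
    """Toy model: every register cools toward AMBIENT, accessed ones heat up."""
    return [AMBIENT + DECAY * (t - AMBIENT) + (HEAT if r in accessed else 0.0)
            for r, t in enumerate(state)]


def insert_nops(instrs, num_regs, limit):
    """Return a schedule in which no register exceeds `limit` after any instruction.

    instrs is a list of sets of accessed registers. Before each instruction,
    NOPs are inserted until executing it keeps every register at or below
    `limit`. The limit must exceed AMBIENT + HEAT, otherwise cooling alone
    could never make an access admissible in this model.
    """
    if limit <= AMBIENT + HEAT:
        raise ValueError("limit is unreachable in this model")
    state = [AMBIENT] * num_regs
    schedule = []
    for accessed in instrs:
        while max(step(state, accessed)) > limit:
            state = step(state, NOP)     # let the register file cool for one slot
            schedule.append(NOP)
        state = step(state, accessed)
        schedule.append(accessed)
    return schedule
```

The original instruction order is preserved, so the trade-off is purely between peak temperature and the number of added cycles.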

Preliminary Results
• Leon3 Register File – MPEG-2 Decoder
[Figure: two panels labeled ORIGINAL and MODIFIED]

Conclusions
• Compilers can estimate the thermal state in the early stages of compilation.
• Experimental results have demonstrated thermal-aware optimization methods for the register file.
• Research goal: to develop comprehensive thermal data flow analyses and rules for the processor modules.

Thank you! Questions?