CISC673 – Optimizing Compilers1/34 Presented by: Sameer Kulkarni Dept of Computer & Information Sciences University of Delaware Phase Ordering.

Presentation transcript:

CISC673 – Optimizing Compilers1/34 Presented by: Sameer Kulkarni Dept of Computer & Information Sciences University of Delaware Phase Ordering

CISC673 – Optimizing Compilers2/34 Optimization?? does it really work??

CISC673 – Optimizing Compilers3/34 No. of optimizations: O64 = 264 (at last count); JikesRVM = 67

CISC673 – Optimizing Compilers4/34 Search space: consider a hypothetical case where we apply 40 optimizations. O64: 3.98 x; Jikes: 4.1 x 10^18
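The slide does not say how these figures were obtained. The Jikes figure is consistent with counting unordered selections of 40 optimizations out of the 67 available, so the sketch below assumes that reading. The function names are illustrative, and the ordered count is included only for comparison.

```python
import math

# Optimization counts quoted on the previous slide.
AVAILABLE = {"O64": 264, "JikesRVM": 67}
CHOSEN = 40  # hypothetical number of optimizations applied


def selections(n: int, k: int = CHOSEN) -> int:
    """Number of ways to pick k optimizations out of n, ignoring order (assumed reading)."""
    return math.comb(n, k)


def ordered_sequences(n: int, k: int = CHOSEN) -> int:
    """Number of ordered sequences of k distinct optimizations out of n."""
    return math.perm(n, k)


for name, n in AVAILABLE.items():
    print(f"{name}: {selections(n):.3e} selections, "
          f"{ordered_sequences(n):.3e} ordered sequences")
```

If order is taken into account, which is the point of phase ordering, the space grows by a further factor of 40!.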

CISC673 – Optimizing Compilers5/34 Could take a while: considering the smaller problem, assume that running all the benchmarks takes just 1 sec. Jikes would take: billion years. Age of the universe: 13 billion years
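The arithmetic behind this slide can be reproduced with a short sketch. It assumes one second per candidate, candidates evaluated one after another, and a 365.25-day year; the candidate count is the Jikes figure from the search-space slide.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumption for the conversion


def years_to_try_all(candidates: float, seconds_per_run: float = 1.0) -> float:
    """Wall-clock years needed to evaluate every candidate once, sequentially."""
    return candidates * seconds_per_run / SECONDS_PER_YEAR


jikes_candidates = 4.1e18      # figure from the search-space slide
age_of_universe_years = 13e9   # figure quoted on this slide

years = years_to_try_all(jikes_candidates)
print(f"{years:.2e} years, "
      f"about {years / age_of_universe_years:.0f}x the age of the universe")
```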

CISC673 – Optimizing Compilers6/34 Some basic optimizations: Common Subexpression Elimination (CSE), Loop Unrolling, Local Copy Propagation, Branch Optimizations, ...

CISC673 – Optimizing Compilers7/34 Example for(int i=0; i< 3;i++){ a = a + i + 1; } Loop Unrolling CSE
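A hedged illustration of what the transformations do to this loop, written as three equivalent Python functions. The slide pairs loop unrolling with CSE; the last step here is plain constant folding, chosen for illustration because it is what this particular loop exposes once unrolled.

```python
def rolled(a: int) -> int:
    # Original loop from the slide: for (int i = 0; i < 3; i++) a = a + i + 1;
    for i in range(3):
        a = a + i + 1
    return a


def unrolled(a: int) -> int:
    # After full loop unrolling: the induction variable becomes a constant.
    a = a + 0 + 1
    a = a + 1 + 1
    a = a + 2 + 1
    return a


def simplified(a: int) -> int:
    # After folding the constants exposed by unrolling: 1 + 2 + 3 = 6.
    return a + 6


for start in (0, 5, -3):
    assert rolled(start) == unrolled(start) == simplified(start)
```

The simplification only becomes possible after unrolling, which is a small instance of one phase enabling another.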

CISC673 – Optimizing Compilers8/34 Instruction Scheduling vs Register Allocation: Maximizing Parallelism → IS; Minimizing Register Spilling → RA

CISC673 – Optimizing Compilers9/34 Phase Ord. vs Opt Levels Opt Levels ~ Timing Constraints Phase ordering ~ code interactions

CISC673 – Optimizing Compilers10/34 Whimsical?? Opt X would like to go before Opt Y, but not always.
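A toy demonstration that the order of two passes can change the result. The three-field instruction format and both pass implementations are simplified assumptions for straight-line code, not taken from any real compiler.

```python
from typing import Dict, List, Set, Tuple, Union

Operand = Union[int, str]              # integer constant or variable name
Instr = Tuple[str, Operand, Operand]   # (dest, left, right): dest = left + right


def constant_propagation(code: List[Instr]) -> List[Instr]:
    """Substitute known constants into operands and fold constant additions."""
    known: Dict[str, int] = {}
    out: List[Instr] = []
    for dest, left, right in code:
        left = known.get(left, left) if isinstance(left, str) else left
        right = known.get(right, right) if isinstance(right, str) else right
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = left + right
            out.append((dest, left + right, 0))
        else:
            known.pop(dest, None)      # dest no longer holds a known constant
            out.append((dest, left, right))
    return out


def dead_code_elimination(code: List[Instr], live_out: Set[str]) -> List[Instr]:
    """Drop assignments whose result is never read (single backward sweep)."""
    live = set(live_out)
    kept: List[Instr] = []
    for dest, left, right in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(op for op in (left, right) if isinstance(op, str))
            kept.append((dest, left, right))
    return list(reversed(kept))


program: List[Instr] = [("x", 4, 0), ("y", "x", 1)]   # x = 4; y = x + 1
live_out = {"y"}                                       # only y is used afterwards

cp_then_dce = dead_code_elimination(constant_propagation(program), live_out)
dce_then_cp = constant_propagation(dead_code_elimination(program, live_out))
print(len(cp_then_dce), "instruction(s) vs", len(dce_then_cp))
```

Running propagation first makes `x` dead so that elimination can remove it; in the opposite order, elimination runs before there is anything to remove.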

CISC673 – Optimizing Compilers11/34 Ideal Solution? Oracle → Perfect sequence at the very start. Wise Man Solution → Given the present code, predict the best optimization solution

CISC673 – Optimizing Compilers12/34 Wise Man ? Understand Compilers Optimizations Source Code

CISC673 – Optimizing Compilers13/34 Possible Solutions Pruning the search space Genetic Algorithms Estimating running times Precompiled choices

CISC673 – Optimizing Compilers14/34 Pruning Search space Fast and Efficient Searches for Effective Optimization Phase Sequences, Kulkarni et al. TACO 2005

CISC673 – Optimizing Compilers15/34 Optimization Profiling Fast and Efficient Searches for Effective Optimization Phase Sequences, Kulkarni et al. TACO 2005

CISC673 – Optimizing Compilers16/34 Genetic Algorithms Fast Searches for Effective Optimization Phase Sequences, Kulkarni et al. PLDI ‘04
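A minimal sketch of a genetic algorithm searching over phase sequences. It is not the algorithm from the cited paper: the pass names, the cost table, and all parameters are made up for illustration, and `toy_cost` stands in for actually compiling and timing a benchmark.

```python
import random
from typing import Callable, List, Sequence

PHASES = ["cse", "unroll", "copy_prop", "branch_opt", "dce"]  # hypothetical pass names
SEQ_LEN = 6


def toy_cost(seq: Sequence[str]) -> float:
    """Made-up stand-in for measured running time; lower is better.

    A pass only pays off when it directly follows a pass that enables it.
    """
    enables = {("unroll", "cse"): 8, ("cse", "copy_prop"): 5,
               ("copy_prop", "dce"): 6, ("branch_opt", "unroll"): 4}
    cost = 100.0
    for prev, cur in zip(seq, seq[1:]):
        cost -= enables.get((prev, cur), 0)
    return cost


def evolve(cost: Callable[[Sequence[str]], float], generations: int = 40,
           pop_size: int = 30, mutation_rate: float = 0.1,
           seed: int = 0) -> List[str]:
    rng = random.Random(seed)
    pop = [[rng.choice(PHASES) for _ in range(SEQ_LEN)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        next_pop = [pop[0][:], pop[1][:]]          # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            mum = min(rng.sample(pop, 3), key=cost)
            dad = min(rng.sample(pop, 3), key=cost)
            cut = rng.randrange(1, SEQ_LEN)        # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally replace a phase with a random one.
            child = [rng.choice(PHASES) if rng.random() < mutation_rate else p
                     for p in child]
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=cost)


best = evolve(toy_cost)
print(best, toy_cost(best))
```

Because the two best sequences are always carried over, the best cost never gets worse from one generation to the next; in a real setting each call to the cost function would be a compile-and-run, which is what makes the search expensive.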

CISC673 – Optimizing Compilers17/34 Exhaustive vs Heuristic [2]

CISC673 – Optimizing Compilers18/34 Disadvantages Benchmark Specific Architecture dependent Code disregarded

CISC673 – Optimizing Compilers19/34 Improvements Profiling the application Understand the code Understanding optimizations Continuous evaluation of transformations

CISC673 – Optimizing Compilers20/34 Proposed solution: Input = Code Features; Output = Running time; Evolve → Neural Networks

CISC673 – Optimizing Compilers21/34 Proposed solution

CISC673 – Optimizing Compilers22/34 Experimental Setup Neural Network Evolver (ANJI) Training Set { javaGrande } Testing Set { SpecJVM, Da Capo }

CISC673 – Optimizing Compilers23/34 ANJI: Mutating & generating networks; Network → phase ordering; Timing Information; Scoring the networks
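The loop on this slide (generate networks, turn each into a phase ordering, time it, score it) can be sketched as follows. This is a simplified assumption, not ANJI: ANJI is a Java NEAT implementation that also evolves network topology, whereas this sketch only mutates the weights of a fixed single-layer network. The feature vector, pass names, and timing function are hypothetical.

```python
import random
from typing import List, Sequence

PHASES = ["cse", "unroll", "copy_prop", "dce"]   # hypothetical pass names
N_FEATURES = 3                                   # assumed size of the code-feature vector

Weights = List[List[float]]                      # one row of feature weights per phase


def phase_order(weights: Weights, features: Sequence[float]) -> List[str]:
    """Network -> phase ordering: rank phases by their output score."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    ranked = sorted(range(len(PHASES)), key=lambda i: -scores[i])
    return [PHASES[i] for i in ranked]


def toy_running_time(order: Sequence[str]) -> float:
    """Made-up stand-in for timing a compiled benchmark; lower is better."""
    target = ["unroll", "cse", "copy_prop", "dce"]
    return 10.0 + sum(abs(order.index(p) - target.index(p)) for p in target)


def mutate(weights: Weights, rng: random.Random, sigma: float = 0.5) -> Weights:
    """Return a copy of the network with Gaussian noise added to every weight."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


def evolve(features: Sequence[float], generations: int = 30,
           pop_size: int = 20, seed: int = 0) -> Weights:
    rng = random.Random(seed)
    pop = [[[rng.uniform(-1, 1) for _ in range(N_FEATURES)] for _ in PHASES]
           for _ in range(pop_size)]
    score = lambda w: toy_running_time(phase_order(w, features))
    for _ in range(generations):
        pop.sort(key=score)                      # scoring the networks
        survivors = pop[: pop_size // 4]         # keep the fastest quarter
        pop = survivors + [mutate(rng.choice(survivors), rng)
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=score)


features = [4.0, 12.0, 2.0]                      # hypothetical features of one method
champion = evolve(features)
order = phase_order(champion, features)
print(order, toy_running_time(order))
```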

CISC673 – Optimizing Compilers24/34 Training Phase Generations and Chromosomes Random chromosomes Back Propagation Add/Remove/Update hidden nodes

CISC673 – Optimizing Compilers25/34 Experimental Setup

CISC673 – Optimizing Compilers26/34 Network Evolution

CISC673 – Optimizing Compilers27/34 Network Evolution

CISC673 – Optimizing Compilers28/34 javaGrande Set of very small benchmarks Low running times Memory management Machine Architecture

CISC673 – Optimizing Compilers29/34 Testing SpecJVM’98 & Da Capo Champion n/w Running times

CISC673 – Optimizing Compilers30/34 Present Solution

CISC673 – Optimizing Compilers31/34 Implementation in GCC: Milepost GCC. Created for intelligent compilation; collects source features; submits features to a common location; hooks into the compilation process.

CISC673 – Optimizing Compilers32/34 Possible Use Case

CISC673 – Optimizing Compilers33/34 Structure for Phase Ordering ANJI network from Source features

CISC673 – Optimizing Compilers34/34 LLVM Open Source Compiler Modular Design Easy to work with All Optimizations are interchangeable

CISC673 – Optimizing Compilers35/34 Questions Most of the files and this presentation have been uploaded to