Automatic Feature Generation for Machine Learning Based Optimizing Compilation Hugh Leather, Edwin Bonilla, Michael O'Boyle Institute for Computing Systems.

Slides:



Advertisements
Similar presentations
Inside an XSLT Processor Michael Kay, ICL 19 May 2000.
Advertisements

Models of Computation Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Analysis of Algorithms Week 1, Lecture 2.
Efficient Program Compilation through Machine Learning Techniques Gennady Pekhimenko IBM Canada Angela Demke Brown University of Toronto.
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
CPSC 502, Lecture 15Slide 1 Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 15 Nov, 1, 2011 Slide credit: C. Conati, S.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Mitigating the Compiler Optimization Phase- Ordering Problem using Machine Learning.
Online Performance Auditing Using Hot Optimizations Without Getting Burned Jeremy Lau (UCSD, IBM) Matthew Arnold (IBM) Michael Hind (IBM) Brad Calder (UCSD)
Compiler Optimization-Space Exploration Adrian Pop IDA/PELAB Authors Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, David.
Introduction & Overview CS4533 from Cooper & Torczon.
CISC673 – Optimizing Compilers1/34 Presented by: Sameer Kulkarni Dept of Computer & Information Sciences University of Delaware Phase Ordering.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods Hidenao Abe & Takahira Yamaguchi Shizuoka University, JAPAN.
Adapting Convergent Scheduling Using Machine Learning Diego Puppin*, Mark Stephenson †, Una-May O’Reilly †, Martin Martin †, and Saman Amarasinghe † *
Overview of the Course Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Compiler course 1. Introduction. Outline Scope of the course Disciplines involved in it Abstract view for a compiler Front-end and back-end tasks Modules.
Meta Optimization Improving Compiler Heuristics with Machine Learning Mark Stephenson, Una-May O’Reilly, Martin Martin, and Saman Amarasinghe MIT Computer.
Dept. of Computer and Information Sciences : University of Delaware John Cavazos Department of Computer and Information Sciences University of Delaware.
System Software for Parallel Computing. Two System Software Components Hard to do the innovation Replacement for Tradition Optimizing Compilers Replacement.
1 Fly – A Modifiable Hardware Compiler C. H. Ho 1, P.H.W. Leong 1, K.H. Tsoi 1, R. Ludewig 2, P. Zipf 2, A.G. Oritz 2 and M. Glesner 2 1 Department of.
Applying Genetic Algorithm to the Knapsack Problem Qi Su ECE 539 Spring 2001 Course Project.
Mining Binary Constraints in Feature Models: A Classification-based Approach Yi Li.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
Automated Patch Generation Adapted from Tevfik Bultan’s Lecture.
CISC Machine Learning for Solving Systems Problems John Cavazos Dept of Computer & Information Sciences University of Delaware
Introduction to Loops For Loops. Motivation for Using Loops So far, everything we’ve done in MATLAB, you could probably do by hand: Mathematical operations.
MACHINE LEARNING 10 Decision Trees. Motivation  Parametric Estimation  Assume model for class probability or regression  Estimate parameters from all.
Automated discovery in math Machine learning techniques (GP, ILP, etc.) have been successfully applied in science Machine learning techniques (GP, ILP,
CISC Machine Learning for Solving Systems Problems Presented by: Eunjung Park Dept of Computer & Information Sciences University of Delaware Solutions.
Machine Learning in Compiler Optimization By Namita Dave.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Intelligent Compilation John Cavazos Computer & Information Sciences Department.
Learning A Better Compiler Predicting Unroll Factors using Supervised Classification And Integrating CPU and L2 Cache Voltage Scaling using Machine Learning.
Michael J. Voss and Rudolf Eigenmann PPoPP, ‘01 (Presented by Kanad Sinha)
Estimation of Distribution Algorithm and Genetic Programming Structure Complexity Lab,Seoul National University KIM KANGIL.
Machine Learning, Compilers and Mobile Systems Institute for Computing Systems Architecture University of Edinburgh, UK Hugh Leather.
Automatic Feature Generation for Setting Compiler Heuristics Hugh Leather*, Elad Yom-Tov**, Mircea Namolaru**, Ari Freund** *Institute for Computing Systems.
Raced Profiles:Efficient Selection of Competing Compiler Optimizations Hugh Leather, Bruce Worton, Michael O'Boyle Institute for Computing Systems Architecture.
MILEPOST Machine learning in compilers: The Future of Optimisation Hugh Leather University of Edinburgh.
Combining Models Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya.
Experience Report: System Log Analysis for Anomaly Detection
Compiler Design (40-414) Main Text Book:
Introduction to Compiler Construction
Computational Thinking, Problem-solving and Programming: General Principals IB Computer Science.
Deep Learning Amin Sobhani.
Control Flow Testing Handouts
Handouts Software Testing and Quality Assurance Theory and Practice Chapter 4 Control Flow Testing
A Simple Syntax-Directed Translator
Programming Languages Translator
Presented by: Dr Beatriz de la Iglesia
Accelerate define.xml using defineReady - Saravanan June 17, 2015.
Outline of the Chapter Basic Idea Outline of Control Flow Testing
Introduction to Data Science Lecture 7 Machine Learning Overview
Overview of the Course Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Designing and Debugging Batch and Interactive COBOL Programs
Genetic Improvement CHORDS GROUP Stirling University
Chapter 11 Introduction to Programming in C
Objective of This Course
Register Pressure Guided Unroll-and-Jam
Genetic Programming Applied to Compiler Optimization
Lecture 7: Introduction to Parsing (Syntax Analysis)
Presented by: Divya Muppaneni
Loops CIS 40 – Introduction to Programming in Python
Teori Bahasa dan Automata Lecture 9: Contex-Free Grammars
Genetic Programming Applied to Compiler Optimization
Predicting Unroll Factors Using Supervised Classification
Information Retrieval
Chapter 10: Compilers and Language Translation
COMP755 Advanced Operating Systems
Rohan Yadav and Charles Yuan (rohany) (chenhuiy)
Running & Testing Programs :: Translators
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Automatic Feature Generation for Machine Learning Based Optimizing Compilation Hugh Leather, Edwin Bonilla, Michael O'Boyle Institute for Computing Systems Architecture University of Edinburgh, UK

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Introduction to machine learning in compilers Problem: Tuning heuristics is hard Architectures and compilers keep changing Goal: Replace an heuristic with a Machine Learned one ML performs very well

Introduction to machine learning in compilers How it works Summarise data before heuristic (features) Collect examples Learn a model Model predicts heuristic for new program

Introduction to machine learning in compilers Start with compiler data structures AST, RTL, SSA, CFG, DDG, etc. Human expert determines a mapping to a feature vector number of instructions mean dependency depth branch count loop nest level trip count...

Introduction to machine learning in compilers Now collect many examples of programs, determining their feature values... Execute the programs with different compilation strategies and find the best for each Best Heuristic Value Feature s

Introduction to machine learning in compilers Now give these examples to a machine learner It learns a model... Supervised Machine Learner Example s Best Heuristic Value Feature s Model

Introduction to machine learning in compilers This model can then be used to predict the best compiler strategy from the features of a new program Our heuristic is replaced Predicted Heuristic Value Feature s Model New Program

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Difficulties choosing features The expert must do a good job of projecting down to features amount of white space average identifier length age of programmer name of program...

Difficulties choosing features Feature 1 Feature Machine learning works well when all examples associated with one feature value have the same type

Difficulties choosing features Feature 1 Feature Machine learning doesn't work if the features don't distinguish the examples

Difficulties choosing features Better features might allow classification Feature 1 Feature Feature 1 Feature Feature 3 = 0Feature 3 = 1

Difficulties choosing features There are much more subtle interactions between features and ML algorithm Sometimes adding a feature makes things worse A feature might be copies of existing features There is an infinite number of possible features

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

A feature space for a motivating example Simple language the compiler accepts: Variables, integers, '+', '*', parentheses Examples: a = 10 b = 20 c = a * b + 12 d = a * (( b + c * c ) * ( ))

A feature space for a motivating example What type of features might we want? count−nodes−matching( is−times && left−child−matches( is−plus )&& right−child−matches( is−constant ) = var+ ** +const+ varconst*var +const var a = ((b+c)*2 + d) * 9 + (b+2)*4 Value = 3

A feature space for a motivating example Define a simple feature language: ::= ”count−nodes−matching(” ”)” ::= ”is−constant” | ”is−variable” | ”is−any−type” | ( ”is−plus” | ”is−times” ) ( ”&& left−child−matches(” ”)” ) ? ( ”&& right−child−matches(” ”)” ) ?

A feature space for a motivating example Now generate sentences from the grammar to give features Grammar ::= | “b” Sentence Start with the root non-terminal A

Choose randomly among productions and replace AAA A feature space for a motivating example Now generate sentences from the grammar to give features Grammar ::= | “b” Sentence

Repeat for each non-terminal still in the sentence A feature space for a motivating example Now generate sentences from the grammar to give features Grammar ::= | “b” Sentence bAAAb

A feature space for a motivating example Now generate sentences from the grammar to give features Grammar ::= | “b” Sentence bbbbb Continue until there are no more non-terminals

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Searching the Feature Space Search space is parse trees of features Genetic programming searches over feature parse trees Features which help machine learning are better

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Overview of searching

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Data structures extracted from benchmarks

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Population of parse trees evolved with genetic programming

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Parse trees are interpreted over program data giving feature values

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality ML tool learns model using feature values and target heuristic values

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Model predicts heuristic with cross validation Quality found (accuracy or speedup)

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality After some generations first feature fixed Used when learning model for next feature

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Build up features, one at a time Stop when no improvement

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Features for GCC Start by dumping data structures to XML GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Features for GCC Start by dumping data structures to XML GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Features for GCC GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced Find out the structure found in the benchmarks Allows system to know data format without hard coding

Features for GCC Grammar is constructed from structure GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Features for GCC Grammar is constructed from structure Huge grammar > 160kb Ensures minimal useless features Update easy if GCC changes Features are in interpreted language

Features for GCC Grammar compiled down to Java GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Features for GCC Feature search Features are evaluated over benchmark data GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Features for GCC Final features outputted GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Results Set up Modified GCC benchmarks from MiBench, MediaBench and UTDSP Pentium 6; 2.8GHz; 512Mb RAM Benchmarks run in RamDisk to reduce IO variability Found best unroll factor for each loop in [0-16]

Results Search set up 100 parse trees per generation Stop at 200 generations or 15 without change Double Cross Validation Machine learning Decision Trees (C4.5)

Results GCC default heuristic vs. oracle GCC gets 3% of maximum (1.05 speedup) On average mostly not worth unrolling

Results State of the art technique – Stephenson Hand designed features Uses support vector machine

Results GCC vs. state of the art vs. ours GCC 3% Stephenson 59% Ours 75% Automated features outperform human ones

Results Compare all with same machine learning All with C4.5 decision trees Level playing field GCC 'features' are data used to compute heuristic

Results Decision Trees GCC 48% Stephenson 53% Ours 75% Automated features outperform human ones

Results Top Features Found count(filter(//*, !(is-type(wide-int) || (is-type(float extend) &&[(is- type(reg)]/count(filter(//*,is-type(int))))) || is-type(union type)))) count(filter(/*, (is-type(basic-block) && ( || (0.0 > ( (count(filter(//*, is-type(var decl))) - (count(filter(//*, (is-type(xor) + sum( filter(/*, (is-type(call insn) && count(filter(//*, is-type(real type)))))) / count(filter(/*, is-type(code label)))))))))

Overview Introduction to machine learning in compilers Difficulties choosing features A feature space for a motivating example Searching the feature space Features for GCC Results Further and on going work

Make all GCC's internals available Integrate to Milepost GCC plug-in system and open source it Features for multi-core optimisation

Conclusion Shown a system which automatically finds good features Searches a huge set of features Allows greater experimentation in 'feature ideas' More flexible A few feature grammars should service many heuristics Retune features as well as heuristic Out performs expert features