Machine Learning, Compilers and Mobile Systems Institute for Computing Systems Architecture University of Edinburgh, UK Hugh Leather.

Slides:



Advertisements
Similar presentations
Efficient Program Compilation through Machine Learning Techniques Gennady Pekhimenko IBM Canada Angela Demke Brown University of Toronto.
Advertisements

ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
INTRODUCTION Chapter 1 1. Java CPSC 1100 University of Tennessee at Chattanooga 2  Difference between Visual Logic & Java  Lots  Visual Logic Flowcharts.
Overview Motivations Basic static and dynamic optimization methods ADAPT Dynamo.
Fundamentals of Python: From First Programs Through Data Structures
CS 3500 SE - 1 Software Engineering: It’s Much More Than Programming! Sources: “Software Engineering: A Practitioner’s Approach - Fourth Edition” Pressman,
Java.  Java is an object-oriented programming language.  Java is important to us because Android programming uses Java.  However, Java is much more.
Compiler Optimization-Space Exploration Adrian Pop IDA/PELAB Authors Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, David.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science John Cavazos Architecture and Language Implementation Lab Thesis Seminar University.
SM3121 Software Technology Mark Green School of Creative Media.
Accelerating Machine Learning Applications on Graphics Processors Narayanan Sundaram and Bryan Catanzaro Presented by Narayanan Sundaram.
Estimation Wrap-up CSE 403, Spring 2008, Alverson Spolsky.
M1G Introduction to Programming 2 4. Enhancing a class:Room.
D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,
Uncovering the Multicore Processor Bottlenecks Server Design Summit Shay Gal-On Director of Technology, EEMBC.
Sadegh Aliakbary Sharif University of Technology Fall 2012.
Dr. Rado Kotorov Technical Director Strategic Product Mgt. Jeff Shein Technical Manager Creating Web 2.0 Rich Internet Applications (RIA) and Dashboards.
Time Series Data Analysis - I Yaji Sripada. Dept. of Computing Science, University of Aberdeen2 In this lecture you learn What are Time Series? How to.
Dept. of Computer and Information Sciences : University of Delaware John Cavazos Department of Computer and Information Sciences University of Delaware.
CISC Machine Learning for Solving Systems Problems Presented by: Alparslan SARI Dept of Computer & Information Sciences University of Delaware
Machine Learning Tutorial Amit Gruber The Hebrew University of Jerusalem.
Investigating Adaptive Compilation using the MIPSpro Compiler Keith D. Cooper Todd Waterman Department of Computer Science Rice University Houston, TX.
CS 3500 L Performance l Code Complete 2 – Chapters 25/26 and Chapter 7 of K&P l Compare today to 44 years ago – The Burroughs B1700 – circa 1974.
CISC Machine Learning for Solving Systems Problems John Cavazos Dept of Computer & Information Sciences University of Delaware
Software Metric Tools Joel Keyser, Jacob Napp, Carey Norslien, Stephen Owings, Tristan Paynter.
Machine Learning in Compiler Optimization By Namita Dave.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Intelligent Compilation John Cavazos Computer & Information Sciences Department.
Lecture1 Instructor: Amal Hussain ALshardy. Introduce students to the basics of writing software programs including variables, types, arrays, control.
Tommy Messelis * Stefaan Haspeslagh Burak Bilgin Patrick De Causmaecker Greet Vanden Berghe *
Learning A Better Compiler Predicting Unroll Factors using Supervised Classification And Integrating CPU and L2 Cache Voltage Scaling using Machine Learning.
Michael J. Voss and Rudolf Eigenmann PPoPP, ‘01 (Presented by Kanad Sinha)
Data Summit 2016 H104: Building Hadoop Applications Abhik Roy Database Technologies - Experian LinkedIn Profile:
Automatic Feature Generation for Machine Learning Based Optimizing Compilation Hugh Leather, Edwin Bonilla, Michael O'Boyle Institute for Computing Systems.
Automatic Feature Generation for Setting Compiler Heuristics Hugh Leather*, Elad Yom-Tov**, Mircea Namolaru**, Ari Freund** *Institute for Computing Systems.
Raced Profiles:Efficient Selection of Competing Compiler Optimizations Hugh Leather, Bruce Worton, Michael O'Boyle Institute for Computing Systems Architecture.
Parallelisation of Desktop Environments Nasser Giacaman Supervised by Dr Oliver Sinnen Department of Electrical and Computer Engineering, The University.
MILEPOST Machine learning in compilers: The Future of Optimisation Hugh Leather University of Edinburgh.
What is it all about? .NET MeetUp in Prague, CZ (2017/7/19)
Overview of E-Learning Authoring Software
Android Mobile Application Development
Why don’t programmers have to program in machine code?
Lecture 1b- Introduction
Code Optimization Overview and Examples
Pavlos Petoumenos, Hugh Leather, Björn Franke
Customer Support Strategic Pillars
Computational Thinking, Problem-solving and Programming: General Principals IB Computer Science.
Chapter 1 Introduction to Computers, Programs, and Java
Mobile Application Development
Topics Introduction to Repetition Structures
Twitter Augmented Android Malware Detection
Application Development Theory
CMPE419 Mobile Application Development
Introduction to Cloud Computing
The Simplest Heuristics May Be The Best in Java JIT Compilers
Register Pressure Guided Unroll-and-Jam
Adaptive Code Unloading for Resource-Constrained JVMs
Christophe Dubach, Timothy M. Jones and Michael F.P. O’Boyle
Mincer: a distributed automated problem determination tool
Chap. 7 Regularization for Deep Learning (7.8~7.12 )
Realizing Closed-loop, Online Tuning and Control for Configurable-Cache Embedded Systems: Progress and Challenges Islam S. Badreldin*, Ann Gordon-Ross*,
Search.
Query Optimization Techniques
Search.
CMPE419 Mobile Application Development
Introduction to Optimization
Rohan Yadav and Charles Yuan (rohany) (chenhuiy)
Software Re-engineering and Reverse Engineering
IS 135 Business Programming
When Machine Learning Meets Security – Secure ML or Use ML to Secure sth.? ECE 693.
Running & Testing Programs :: Translators
Presentation transcript:

Machine Learning, Compilers and Mobile Systems Institute for Computing Systems Architecture University of Edinburgh, UK Hugh Leather

Introduction Mobile is the next big thing Machine learning key to power and performance Prior work making machine learning in compilers practical at scale

Machine Learning in Compilers

Machine learning in compilers Problem: Tuning heuristics is hard Architectures and compilers keep changing Goal: Replace an heuristic with a Machine Learned one ML performs very well

Machine learning in compilers Start with compiler data structures AST, RTL, SSA, CFG, DDG, etc. Predicted Optimisation Parameter Unroll factor? Scheduling priority? Inline? Compiler flags?

Machine learning in compilers Human expert determines a mapping to a feature vector number of instructions mean dependency depth branch count loop nest level trip count...

Machine learning in compilers Now collect many examples of programs, determining their feature values... Execute the programs with different compilation strategies and find the best for each Best Optimisation Parameters Feature s

Machine learning in compilers Now give these examples to a machine learner It learns a model... Supervised Machine Learner Example s Best Heuristic Value Feature s Model

Machine learning in compilers This model can then be used to predict the best compiler strategy from the features of a new program Our heuristic is replaced Predicted Optimisation Parameter Feature s Model New Program

Machine learning in compilers Fit a curve (model) to data Look new point on curve for prediction Feature s Best Optimisation Parameter

The pillars of machine learning in compilers Compiler Internals Access libPlugin Sourceforge – 400+ downloads Cost of Iterative Compilation Profile Races LCTES 09 Choosing the Features Automatic Feature Generation CGO 09 Enough Benchmarks Automatic Benchmark Generation In preparation.c Need to make practical at scale

Automatic Feature Generation

Choosing Features Problem ML relies on good features Subtle interaction between features and ML Infinite number of features to choose from Solution Automatically search for good features!

An example – Loop unrolling Set up 57 benchmarks from MiBench, MediaBench and UTDSP Found best unroll factor for each loop in [0-16] Exhaustive evaluation to find oracle

An example – Loop unrolling Unrolled 5 times for( i = 0; i < n; i = i + k ) { c[i+0] = a[i+0] * b[i+0]; c[i+1] = a[i+1] * b[i+1]; c[i+2] = a[i+2] * b[i+2]; c[i+3] = a[i+3] * b[i+3]; c[i+4] = a[i+4] * b[i+4]; c[i+5] = a[i+5] * b[i+5]; } Original Loop for( i = 0; i < n; i = i ++ ) { c[i] = a[i] * b[i]; }

GCC vs Oracle GCC gets 3% of maximum On average mostly not worth unrolling

State of the art features Lots of good work with hand-built features Dubach, Cavazos, etc Stephenson was state of the art Tackled loop unrolling heuristic Spent some months designing features Multiple iterations to get right

GCC vs Stephenson Gets 59% of maximum! Machine learning does well

GCC vs Stephenson To scale up, must reduce feature development time

A feature space for a motivating example Simple language the compiler accepts: Variables, integers, '+', '*', parentheses Examples: a = 10 b = 20 c = a * b + 12 d = a * (( b + c * c ) * ( ))

A feature space for a motivating example What type of features might we want? a = ((b+c)*2 + d) * 9 + (b+2)*4

A feature space for a motivating example What type of features might we want? = var+ ** +const+ varconst*var +const var a = ((b+c)*2 + d) * 9 + (b+2)*4

A feature space for a motivating example What type of features might we want? = var+ ** +const+ varconst*var +const var a = ((b+c)*2 + d) * 9 + (b+2)*4

A feature space for a motivating example What type of features might we want? count−nodes−matching( is−times && left−child−matches( is−plus )&& right−child−matches( is−constant ) = var+ ** +const+ varconst*var +const var Value = 3 a = ((b+c)*2 + d) * 9 + (b+2)*4

A feature space for a motivating example Define a simple feature language: GCC grammar is huge >160kb Genetic search for features that improve machine learning prediction ::= ”count−nodes−matching(” ”)” ::= ”is−constant” | ”is−variable” | ”is−any−type” | ( ”is−plus” | ”is−times” ) ( ”&& left−child−matches(” ”)” ) ? ( ”&& right−child−matches(” ”)” ) ?

Results GCC 3% Stephenson 59% Ours 75% Automated features outperform human ones

Results Top Features Found 39%

Results Top Features Found count(filter(//*, !(is-type(wide-int) || (is-type(float extend) &&[(is- type(reg)]/count(filter(//*,is-type(int))))) || is-type(union type)))) 39% 14%

Results Top Features Found count(filter(//*, !(is-type(wide-int) || (is-type(float extend) &&[(is- type(reg)]/count(filter(//*,is-type(int))))) || is-type(union type)))) count(filter(/*, (is-type(basic-block) && ( || (0.0 > ( (count(filter(//*, is-type(var decl))) - (count(filter(//*, (is-type(xor) + sum( filter(/*, (is-type(call insn) && count(filter(//*, is-type(real type)))))) / count(filter(/*, is-type(code label))))))))) 39% 14% 8%

GCC vs Stephenson vs Ours

Machine Learning for Mobile Systems

Mobiles Mobile devices will become THE consumer computing platform Need to make mobile devices faster Quad cores here already Increased power demand Need to make mobile devices lower power Battery life measured in hours Battery capacity not improving

Desktop vs Mobile Mobile is a different beast Application characteristics Customers Information available Needs different techniques

Desktop vs Mobile Desktop Applications No app store Many languages Opaque binaries Mobile (Android) Applications Central app store Mostly Java Recompilable classes

Desktop vs Mobile Desktop Customers = Devs Training in lab Few benchmarks Bad points OK Mobile Customers = Users Training in wild All applications Bad points, umm, bad

Desktop vs Mobile Desktop No user knowledge Static code features Mobile User knowledge Static code features Application history Geographical Temporal OS states Usage patterns

Optimise applications All Android programs use Dalvik JIT - very slow Create a market replacement Light-weight profiling identifies hot methods Updates get experimental code System learns how to optimise similar apps for similar users

Optimise applications Needs zero user impact ML directed profiling ML guided iterative compilation ML guided version selection Huge scope for research

Optimise power Recharge prediction allows power choices

Other topics Optimise communications Power modelling Scheduling heterogeneous multi-cores JIT optimisation

Conclusion Machine learning in compilers Choosing Features Enough Benchmarks Cost of Iterative Compilation Compiler Internals Machine learning the key to mobile systems Mobile is the next big thing Huge scope for research

Backup Slides

Compiler internals Problem Compilers not built for ML Must access all internals Prior approach was to hack the source Solution lib Plugin Opens up GCC internals Modern software engineering Cooperative, extensible plug-ins, with AOP Plug-ins now adopted in GCC

Cost of iterative compilation Problem Gathering training data can take months Statistical soundness often overlooked Solution Profile Races (H.Leather, B.Worton, M.O'Boyle) Program version race each other, losers quit early Reduces training time by order of magnitude Ensures statistically sound data

Enough benchmarks Problem ML would like 10^5 examples Only got a few dozen benchmarks Solution Automatic Benchmark Generation (H.Leather,Z.Wang,A.Magni,C.Thompson - In preparation) Genetic programming + constraint satisfaction to make 'human like' programs Active learning to cover the training space

Difficulties choosing features The expert must do a good job of projecting down to features amount of white space average identifier length age of programmer name of program number of comments...

Difficulties choosing features Feature 1 Feature Machine learning doesn't work if the features don't distinguish the examples

Difficulties choosing features Better features might allow classification Feature 1 Feature Feature 1 Feature Feature 3 = 0Feature 3 = 1

Searching the Feature Space Data Extractor Benchmarks Program Data Genetic Programming Search Data Generation Feature Evaluator Feature Values ML Tool Best Heuristic Values Prediction Quality Overview of searching

Features for GCC GCC Exports Data Structures to XML Benchmarks Program Data Grammar Stylesheet Structure Analysed Grammar Created Sentence Generator Compiled Structure GrammarSentence Generator.xml.xsl GCC.c XSL Engine. bnf.class.xml XSL + JavaC GP Search Feature Interpreter ML Final Features Target Heuristic Values Feature Search Final Features Produced Overview of grammar production

ML for Mobile Server side native compilation of hot methods APK classes.so.jpg.c APK classes.so.jpg Develope r Use r Develope r.so To C Translate Analysis GCC

ML for Mobile - Downsizing Down Sides Experiments on real users’ phones What about the bad search points? Multiple versions - known good and experimental P(experimental) ∝ confidence(experimental) f(...) Original or GoodExperimental P(good ) P(experimental )

Optimise communications Comms updates are expensive Updates need to be fresh and not wasted Build cost models Predictor 'use' times Schedule updates for predicted lowest cost

Power models Need power models Energy sensors are low fidelity Batteries non-linear Allow relaxation Lowest power solution may not give longest battery life Power aware workloads needed

Heterogeneous multi-cores Simple heterogeneous multi-cores here now Scheduling is NP-hard Even when application characteristics known Use ML to tune scheduling heuristics for power and performance

Performance - Dijkstra

Performance

Energy

Energy vs Performance Are energy and performance correlated? Not really! Why? If could predict recharge time, change version