LLVM Simone Campanoni

Slides:



Advertisements
Similar presentations
Overview of programming in C C is a fast, efficient, flexible programming language Paradigm: C is procedural (like Fortran, Pascal), not object oriented.
Advertisements

Introduction to Assembly language
Wannabe Lecturer Alexandre Joly inst.eecs.berkeley.edu/~cs61c-te
Program Representations. Representing programs Goals.
The Web Warrior Guide to Web Design Technologies
 2005 Pearson Education, Inc. All rights reserved Introduction.
Lab#1 (14/3/1431h) Introduction To java programming cs425
CIS 101: Computer Programming and Problem Solving Lecture 8 Usman Roshan Department of Computer Science NJIT.
1 Java Basics. 2 Compiling A “compiler” is a program that translates from one language to another Typically from easy-to-read to fast-to-run e.g. from.
Software Language Levels Machine Language (Binary) Assembly Language –Assembler converts Assembly into machine High Level Languages (C, Perl, Shell)
Combining Static and Dynamic Data in Code Visualization David Eng Sable Research Group, McGill University PASTE 2002 Charleston, South Carolina November.
Operating Systems Concepts Professor Rick Han Department of Computer Science University of Colorado at Boulder.
JVM-1 Introduction to Java Virtual Machine. JVM-2 Outline Java Language, Java Virtual Machine and Java Platform Organization of Java Virtual Machine Garbage.
1. 2 FUNCTION INLINE FUNCTION DIFFERENCE BETWEEN FUNCTION AND INLINE FUNCTION CONCLUSION 3.
Guide To UNIX Using Linux Third Edition
1 Programming Languages Examples: C, Java, HTML, Haskell, Prolog, SAS Also known as formal languages Completely described and rigidly governed by formal.
LLVM Developed by University of Illinois at Urbana-Champaign CIS dept Cisc 471 Matthew Warner.
P51UST: Unix and Software Tools Unix and Software Tools (P51UST) Compilers, Interpreters and Debuggers Ruibin Bai (Room AB326) Division of Computer Science.
M. Taimoor Khan * Java Server Pages (JSP) is a server-side programming technology that enables the creation of dynamic,
Floating point variables of different lengths. Trade-off: accuracy vs. memory space Recall that the computer can combine adjacent bytes in the RAM memory.
Language Systems Chapter FourModern Programming Languages 1.
Imperative Programming
Enabling the ARM Learning in INDIA ARM DEVELOPMENT TOOL SETUP.
RM2D Let’s write our FIRST basic SPIN program!. The Labs that follow in this Module are designed to teach the following; Turn an LED on – assigning I/O.
JIT in webkit. What’s JIT See time_compilation for more info. time_compilation.
IT253: Computer Organization Lecture 4: Instruction Set Architecture Tonga Institute of Higher Education.
LLVM Compiler (2 of 3) Jason Dangel. Lectures High-level overview of LLVM (Katie) Walkthrough of LLVM in context of our project (Jason) –Input requirements.
1 A Simple but Realistic Assembly Language for a Course in Computer Organization Eric Larson Moon Ok Kim Seattle University October 25, 2008.
Compiler Construction
CS266 Software Reverse Engineering (SRE) Reversing and Patching Java Bytecode Teodoro (Ted) Cipresso,
Architecture for a Next-Generation GCC Chris Lattner Vikram Adve The First Annual GCC Developers'
CS 320 Assignment 1 Rewriting the MISC Osystem class to support loading machine language programs at addresses other than 0 1.
Replay Compilation: Improving Debuggability of a Just-in Time Complier Presenter: Jun Tao.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Lecture 8 February 29, Topics Questions about Exercise 4, due Thursday? Object Based Programming (Chapter 8) –Basic Principles –Methods –Fields.
 Pearson Education, Inc. All rights reserved Introduction to Java Applications.
1.  10% Assignments/ class participation  10% Pop Quizzes  05% Attendance  25% Mid Term  50% Final Term 2.
LANGUAGE SYSTEMS Chapter Four Modern Programming Languages 1.
1 Homework HW5 due today Review a lot of things about allocation of storage that may not have been clear when we covered them in our initial pass Introduction.
LLVM Compiler Katie Dew. Lectures High-level overview of LLVM (Katie) Walkthrough of LLVM in context of our project (Jason) –Input requirements –Configuration.
Chapter 1 Introduction. Chapter 1 -- Introduction2  Def: Compiler --  a program that translates a program written in a language like Pascal, C, PL/I,
A (VERY) SHORT INTRODUCTION TO MATLAB J.A. MARR George Mason University School of Physics, Astronomy and Computational Sciences.
User Defined Methods Methods are used to divide complicated programs into manageable pieces. There are predefined methods (methods that are already provided.
By Mr. Muhammad Pervez Akhtar
How to execute Program structure Variables name, keywords, binding, scope, lifetime Data types – type system – primitives, strings, arrays, hashes – pointers/references.
Introduction to OOP CPS235: Introduction.
Announcements Assignment 1 due Wednesday at 11:59PM Quiz 1 on Thursday 1.
Chapter – 8 Software Tools.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Welcome! Simone Campanoni
LLVM IR, File - Praakrit Pradhan. Overview The LLVM bitcode has essentially two things A bitstream container format Encoding of LLVM IR.
Unit Testing CLUE PLAYERS.  How much design do we do before we begin to code?  Waterfall: Design it all! (slight exaggeration… but not much)  Agile:
Simone Campanoni SSA Simone Campanoni
Lecture 3 Translation.
Simone Campanoni CFA Simone Campanoni
Simone Campanoni LLVM Simone Campanoni
Visit for more Learning Resources
User-Written Functions
Simone Campanoni Welcome! Simone Campanoni
Simone Campanoni LLVM Simone Campanoni
LLVM IR, code emission, assignment 4
LLVM Pass and Code Instrumentation
Computer Organization and Design Assembly & Compilation
Introduction to Computer Programming
Java Programming Introduction
Chapter 1 Introduction.
Understanding the TigerSHARC ALU pipeline
Programming language translators
SPL – PS1 Introduction to C++.
Week 1 - Friday COMP 1600.
Presentation transcript:

LLVM Simone Campanoni

Problems with Canvas? Problems with slides? Problems with code? Problems with LLVM?

Outline Introduction to LLVM CAT steps Hacking LLVM

LLVM LLVM is a great, hackable compiler for C/C++-like languages But it’s also A JIT A compiler for other languages (Java, CIL bytecode, etc…) LLVM is modular and well documented Started from UIUC It’s now the research tool of choiceresearch tool of choice It’s an industrial-strength compiler Apple, AMD, Intel, NVIDIA

LLVM internals Each component is composed of pipelines Each stage: reads something as input and generates something as output To develop a stage: specify how to transform the input to generate the output Complexity lies in linking stages In this class: we’ll look at concepts/internals of middle-end Some of them are still valid for front-end/back-end

LLVM and other compilers LLVM is designed around it’s IR Multiple forms (human readable, bitcode on-disk, in memory) Front-end (Clang) IR Middle-end IR Back-end Machine code IR Pass IR … Pass manager

The pass manager orchestrates passes It builds the pipeline of passes in the middle-end The pipeline is created by respecting the dependences declared by each pass Pass X depends on Y Y will be invoked before X

Pass types Use the “smallest” one for your CAT CallGraphSCCPass ModulePass ImmutablePass FunctionPass LoopPass BasicBlockPass

Learning LLVM Download LLVM 3.7 and install it using CMake Download CMake (Unless you’ve already installed in your system) Read the documentationdocumentation Read the documentationdocumentation Read the documentationdocumentation Get familiar with LLVM documentation Doxygen pages (API docs) Doxygen pages Language reference manual (IR) Language reference manual Programmer’s manual (LLVM-specific data structures, tools) Programmer’s manual Writing an LLVM pass

Code analysis and transformation Code normalization Analysis Transformation

CAT example: loop hoisting Do { Work(varX); varY = varZ + 1; varX++; } while (varX < 100); varY = varZ + 1; Do { Work(varX); varX++; } while (varX < 100); Loop hoisting

CAT example: loop hoisting (2) while (varX < 100) { Work(varX); varY = varZ + 1; varX++; } Do { Work(varX); varY = varZ + 1; varX++; } while (varX < 100); And now?

Loop normalization What: loop normalization pass When: before running loop hoisting Declare a dependence to your pass manager Advantages? Disadvantages?

CAT design Understand the problem Create representative code examples you expect to optimize Optimize them by hand to test the best benefits of your optimization Identify the common case Define the normalize input code Define the information you need to make your transformation safe Design the analyses to automatically generate this information Design the transformation Test, test, test

Improving CAT Improve your CAT by better handling your common cases Improve your CAT by improving the normalization passes Handle corner cases Before we just simply ignored them (i.e., no transformation)

LLVM IR RISC-based Instructions operate on variables Load and store to access memory Include high level instructions Function calls ( call ) Pointer arithmetics ( getelementptr )

LLVM IR (2) Strongly typed No assignments of variables with different types You need to explicitly cast variables Load and store to access memory Variables Global ) Local ( %myVar ) Function parameter ( define (i32 %myPar) )

LLVM IR (3) 3 different (but 100% equivalent) formats Assembly: human-readable format (FILENAME.ll) Bitcode: machine binary on-disk (FILENAME.bc) In memory: in memory binary Generating IR Clang for C-like languages (similar options w.r.t. GCC) Different front-ends available

LLVM IR (4) It’s a Static Single Assignment (SSA) representation A variable is set only by one instruction in the function body %myVar = … A static assignment can be executed more than once We’ll study SSA later

SSA and not SSA example float myF (float par1, float par2, float par3){ return (par1 * par2) + par3; } define %par1, float %par2, float %par3) { %1 = fmul float %par1, %par2 %2 = fadd float %1, %par3 ret float %2 } define %par1, float %par2, float %par3) { %1 = fmul float %par1, %par2 %1 = fadd float %1, %par3 ret float %1 } NOT SSA SSA

SSA and not SSA CATs applied to SSA-based code are faster! Old compilers aren’t SSA-based Transforming IR in its SSA-form takes time When designing your CAT, think carefully about SSA Take advantage of its properties

LLVM tools opt to analyze/transform LLVM IR code Read LLVM IR file Load external passes Run specified passes Respect pass order you specify as input opt -pass1 -pass2 FILE.ll Optionally generate transformed IR Useful passes opt -view-cfg FILE.ll opt -view-dom FILE.ll opt -help

LLVM tools clang to compile/generate LLVM IR code To generate binaries from source code/IR code Check Makefile you have in LLVM.tar.bz2 lli to execute (interpret/JIT) LLVM IR code lli FILE.bc llc to generate assembly from LLVM IR code llc FILE.bc

Conclusion LLVM is an industrial-strength compiler also used in academia Very hard to know in detail every component Focus on what’s important to your goal Become a ninja at jumping around the documentation It’s well organized, documented with a large community behind it Basic C++ skills are required

Final tips LLVM includes A LOT of passes Analyses Transformations Normalization Take advantage of existing code Try llc -march=cpp.bc -o APIs.cpp I have a point to something. What is it? getName() works on most things errs() << TheThingYouDon’tKnow ;

Let’s start hacking LLVM As Linus Torvalds says … Talk is cheap. Show me the code.