Compiler Principles, Fall 2015-2016. Lecture 6: Intermediate Representation. Roman Manevich, Ben-Gurion University of the Negev.



Tentative syllabus
– Front End: Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR)
– Intermediate Representation: Lowering, Operational Semantics
– Optimizations: Dataflow Analysis, Loop Optimizations
– Code Generation: Register Allocation, Instruction Selection

Previously
– Becoming parsing ninjas
  – Going from text to an Abstract Syntax Tree
[Image by Admiral Ham, GFDL or CC-BY-SA-3.0, via Wikimedia Commons]

From scanning to parsing
Program text: 59 + (1257 * xPosition)
Lexical Analyzer: program text → token stream: num + ( num * id )
Parser: token stream → Abstract Syntax Tree (valid input) or syntax error
Grammar:
    E → id
    E → num
    E → E + E
    E → E * E
    E → ( E )

Agenda
– The role of intermediate representations
– Two example languages
  – A high-level language
  – An intermediate language
– Lowering
– Correctness
  – Formal meaning of programs

Role of intermediate representation
– Bridge between front-end and back-end
– Allows implementing optimizations independently of the source language and the executable (target) language
[Figure: High-level language (Scheme) → Lexical Analysis → Syntax Analysis (Parsing) → AST, symbol table, etc. → Intermediate Representation (IR) → Code Generation → Executable code]

Motivation for intermediate representation

Intermediate representation
– A language that is between the source language and the target language
  – Not specific to any source language or machine language
– Goal 1: retargeting compiler components for different source languages/target machines
– Goal 2: machine-independent optimizer
  – Narrow interface: small number of node types (instructions)
[Figure: C++, Python, Java → Lowering → IR (optimize) → Code Gen. → Pentium, Java bytecode, Sparc]

Multiple IRs
– Some optimizations require high-level structure
– Others are more appropriate on low-level code
– Solution: use multiple IR stages
[Figure: AST → HIR (optimize) → LIR (optimize) → Pentium, Java bytecode, Sparc]

Multiple IRs example
Elixir: a language for parallel graph algorithms
[Figure: Elixir compilation pipeline. Components shown: Elixir Program; Delta Inferencer, exchanging Query/Answer with Automated Reasoning (Boogie+Z3); Elixir Program + delta; IL Synthesizer, exchanging Planning Problem/Plan with an Automated Planner; HIR; Lowering; LIR; C++ backend; C++ code; Galois Library]
Mini-project on parallel graph algorithms

AST vs. LIR for imperative languages
AST:
– Rich set of language constructs
– Rich type system
– Declarations: types (classes, interfaces), functions, variables
– Control flow statements: if-then-else, while-do, break-continue, switch, exceptions
– Data statements: assignments, array access, field access
– Expressions: variables, constants, arithmetic operators, logical operators, function calls
LIR:
– An abstract machine language
– Very limited type system
– Only computation-related code
– Labels and conditional/unconditional jumps, no looping
– Data movements, generic memory access statements
– No sub-expressions, logical as numeric, temporaries, constants, function calls with explicit argument passing

Three-address code

Three-Address Code IR
– A popular form of IR
– High-level assembly where instructions have at most three operands
– There exist other types of IR
  – For example, IR based on acyclic graphs, which is more amenable to analysis and optimizations
Chapter 8

Base language: While

Syntax
    A → n | x | A ArithOp A | ( A )
    ArithOp → - | + | * | /
    B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
    S → x := A | skip | S ; S | { S } | if B then S else S | while B S
    n ∈ Num (numerals)
    x ∈ Var (program variables)

Example program
    while x < y {
      x := x + 1
    }
    y := x;

Intermediate language: IL

Syntax
    V → n | x
    R → V Op V
    Op → - | + | * | / | = | ≤ | > | …
    C → l: skip | l: x := R | l: Goto l’ | l: IfZ x Goto l’ | l: IfNZ x Goto l’
    IR → C+
    n ∈ Num (numerals)
    l ∈ Num (labels)
    x ∈ Temp ∪ Var (temporaries and variables)

Intermediate language programs
An intermediate program P has the form
    1: c1
    …
    n: cn
We can view it as a map from labels to individual commands and write P(j) = cj.
Example:
    1: t0 := 137
    2: y := t0 + 3
    3: IfZ x Goto 7
    4: t1 := y
    5: z := t1
    6: Goto 9
    7: t2 := y
    8: x := t2
    9: skip
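The "map from labels to commands" view can be sketched directly as a Python dictionary, together with a small interpreter loop. This is only an illustrative sketch: the tuple encoding of commands, the names `OPS`, `value`, and `run`, the use of integer division for `/`, and the convention that execution stops when control moves past the last label are all assumptions, not part of the lecture. Line 2 of the example is taken to be `y := t0 + 3`.

```python
from typing import Dict, Tuple, Union

# Hypothetical tuple encoding of IL commands (an assumption of this sketch):
#   ("skip",)                  l: skip
#   ("assign", x, v)           l: x := v
#   ("binop", x, v1, op, v2)   l: x := v1 op v2
#   ("goto", target)           l: Goto l'
#   ("ifz", x, target)         l: IfZ x Goto l'
#   ("ifnz", x, target)        l: IfNZ x Goto l'
Operand = Union[int, str]
Program = Dict[int, Tuple]

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,        # assumption: integer division
    "=": lambda a, b: int(a == b),   # comparisons yield 1 or 0
    "<=": lambda a, b: int(a <= b),
    ">": lambda a, b: int(a > b),
}


def value(v: Operand, state: Dict[str, int]) -> int:
    """A numeral denotes itself; a name is looked up in the state."""
    return v if isinstance(v, int) else state[v]


def run(program: Program, state: Dict[str, int], max_steps: int = 10_000) -> Dict[str, int]:
    """Execute P starting at its smallest label; stop when the label is not in P."""
    state = dict(state)  # do not mutate the caller's state
    pc = min(program)
    for _ in range(max_steps):
        if pc not in program:
            return state
        cmd = program[pc]
        nxt = pc + 1
        if cmd[0] == "assign":
            state[cmd[1]] = value(cmd[2], state)
        elif cmd[0] == "binop":
            _, x, v1, op, v2 = cmd
            state[x] = OPS[op](value(v1, state), value(v2, state))
        elif cmd[0] == "goto":
            nxt = cmd[1]
        elif cmd[0] == "ifz":
            if state[cmd[1]] == 0:
                nxt = cmd[2]
        elif cmd[0] == "ifnz":
            if state[cmd[1]] != 0:
                nxt = cmd[2]
        elif cmd[0] != "skip":
            raise ValueError(f"unknown command: {cmd!r}")
        pc = nxt
    raise RuntimeError("step limit exceeded")


# The example program from the slide, written as P(j) = c_j.
P: Program = {
    1: ("assign", "t0", 137),
    2: ("binop", "y", "t0", "+", 3),
    3: ("ifz", "x", 7),
    4: ("assign", "t1", "y"),
    5: ("assign", "z", "t1"),
    6: ("goto", 9),
    7: ("assign", "t2", "y"),
    8: ("assign", "x", "t2"),
    9: ("skip",),
}
```

Calling `run(P, {"x": 0})` takes the jump at label 3 and assigns to `x`, while `run(P, {"x": 1})` falls through and assigns to `z`.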

Lowering

TAC generation
– At this stage in compilation, we have an AST
  – annotated with scope information
  – and annotated with type information
– To generate TAC for the program, we do a recursive tree traversal
  – Generate TAC for any subexpressions and substatements
  – Using the result, generate TAC for the overall expression (bottom-up manner)

TAC generation for expressions
– Define a function cgen(expr) that generates TAC that computes an expression, stores its value in a temporary variable, then hands back the name of that temporary
– Define cgen directly for atomic expressions (constants, this, identifiers, etc.)
– Define cgen recursively for compound expressions (binary operators, function calls, etc.)

Translation rules for expressions
– cgen(n) = (l: t := n, t), where l and t are fresh
– cgen(x) = (l: t := x, t), where l and t are fresh
– If cgen(e1) = (P1, t1) and cgen(e2) = (P2, t2), then cgen(e1 op e2) = (P1 · P2 · l: t := t1 op t2, t), where l and t are fresh

cgen for basic expressions
Maintain a counter for temporaries in c, and a counter for labels in l. Initially: c = 0, l = 0.

    cgen(k) = {   // k is a constant
      c = c + 1, l = l + 1
      Emit(l: tc := k)
      Return tc
    }

    cgen(id) = {   // id is an identifier
      c = c + 1, l = l + 1
      Emit(l: tc := id)
      Return tc
    }

Naive cgen for binary expressions

    cgen(e1 op e2) = {
      Let A = cgen(e1)
      Let B = cgen(e2)
      c = c + 1, l = l + 1
      Emit(l: tc := A op B)
      Return tc
    }

The translation emits code to evaluate e1 before e2. Why is that?
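The recursive scheme above can be sketched in Python as follows. The AST classes `Num`, `Var`, and `BinOp` and the `CodeGen` class are hypothetical names introduced for illustration; the counters `c` and `l` play the same roles as in the pseudocode, and instructions are emitted as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]


@dataclass
class CodeGen:
    c: int = 0                                      # counter for temporaries
    l: int = 0                                      # counter for labels
    code: List[str] = field(default_factory=list)   # emitted instructions

    def _fresh(self) -> str:
        """Advance both counters and return the new temporary's name."""
        self.c += 1
        self.l += 1
        return f"t{self.c}"

    def _emit(self, instr: str) -> None:
        self.code.append(f"{self.l}: {instr}")

    def cgen(self, e: Expr) -> str:
        """Emit TAC computing e and return the temporary holding its value."""
        if isinstance(e, Num):
            t = self._fresh()
            self._emit(f"{t} := {e.value}")
            return t
        if isinstance(e, Var):
            t = self._fresh()
            self._emit(f"{t} := {e.name}")
            return t
        if isinstance(e, BinOp):
            a = self.cgen(e.left)    # code for e1 is emitted first
            b = self.cgen(e.right)   # then code for e2
            t = self._fresh()
            self._emit(f"{t} := {a} {e.op} {b}")
            return t
        raise TypeError(f"unsupported expression: {e!r}")


# Lower (a*b)-d.
gen = CodeGen()
result = gen.cgen(BinOp("-", BinOp("*", Var("a"), Var("b")), Var("d")))
```

After this runs, `gen.code` holds five instructions and `result` names the temporary that holds the value of the whole expression.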

Example: cgen for binary expressions
Initially c = 0, l = 0. Unfolding the definition of cgen on (a*b)-d gives:

    cgen((a*b)-d) = {
      Let A = {   // cgen(a*b)
        Let A = { c = c + 1, l = l + 1; Emit(l: tc := a); Return tc }   // cgen(a)
        Let B = { c = c + 1, l = l + 1; Emit(l: tc := b); Return tc }   // cgen(b)
        c = c + 1, l = l + 1
        Emit(l: tc := A * B)
        Return tc
      }
      Let B = { c = c + 1, l = l + 1; Emit(l: tc := d); Return tc }     // cgen(d)
      c = c + 1, l = l + 1
      Emit(l: tc := A - B)
      Return tc
    }

Code emitted, step by step:

    1: t1 := a         (inner A = t1)
    2: t2 := b         (inner B = t2)
    3: t3 := t1 * t2   (outer A = t3)
    4: t4 := d         (outer B = t4)
    5: t5 := t3 - t4

cgen for statements
– We can extend the cgen function to operate over statements as well
– Unlike cgen for expressions, cgen for statements does not return the name of a temporary holding a value
  – (Why?)

Syntax
    A → n | x | A ArithOp A | ( A )
    ArithOp → - | + | * | /
    B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
    S → x := A | skip | S ; S | { S } | if B then S else S | while B S
    n ∈ Num (numerals)
    x ∈ Var (program variables)

Translation rules for statements
– cgen(skip) = l: skip, where l is fresh
– If cgen(e) = (P, t), then cgen(x := e) = P · l: x := t, where l is fresh
– If cgen(b) = (Pb, t), cgen(S1) = P1, and cgen(S2) = P2, then cgen(if b then S1 else S2) =
    Pb
    IfZ t Goto l_false
    P1
    l_finish: Goto l_after
    l_false: skip
    P2
    l_after: skip
  where l_finish, l_false, l_after are fresh

Translation rules for loops
– If cgen(b) = (Pb, t) and cgen(S) = P, then cgen(while b S) =
    l_before: skip
    Pb
    IfZ t Goto l_after
    P
    l_loop: Goto l_before
    l_after: skip
  where l_after, l_before, l_loop are fresh
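The statement and loop rules can be sketched in Python along the same lines. Everything here is an assumption made for illustration: the nested-tuple AST encoding, the `Lowering` class, and the use of symbolic labels `L1, L2, …` instead of numeric ones. Because the jump-target labels are allocated before the code they refer to, the symbolic labels do not appear in increasing order in the output.

```python
import itertools

# Hypothetical AST encoding (not from the lecture):
#   expressions: int | str | (op, e1, e2)
#   statements:  ("skip",) | ("assign", x, e) | ("seq", s1, s2)
#                | ("if", b, s1, s2) | ("while", b, s)


class Lowering:
    def __init__(self):
        self._temps = itertools.count(1)
        self._labels = itertools.count(1)
        self.code = []

    def fresh_temp(self):
        return f"t{next(self._temps)}"

    def fresh_label(self):
        return f"L{next(self._labels)}"

    def emit(self, instr, label=None):
        """Append one labeled instruction; allocate a fresh label if none is given."""
        self.code.append(f"{label or self.fresh_label()}: {instr}")

    def cgen_expr(self, e):
        """Emit code for e and return the temporary holding its value."""
        if isinstance(e, (int, str)):       # constant or variable
            t = self.fresh_temp()
            self.emit(f"{t} := {e}")
            return t
        op, e1, e2 = e
        t1 = self.cgen_expr(e1)
        t2 = self.cgen_expr(e2)
        t = self.fresh_temp()
        self.emit(f"{t} := {t1} {op} {t2}")
        return t

    def cgen_stmt(self, s):
        """Emit code for s; statements yield no value, so nothing is returned."""
        kind = s[0]
        if kind == "skip":
            self.emit("skip")
        elif kind == "assign":
            _, x, e = s
            t = self.cgen_expr(e)
            self.emit(f"{x} := {t}")
        elif kind == "seq":
            self.cgen_stmt(s[1])
            self.cgen_stmt(s[2])
        elif kind == "if":
            _, b, s1, s2 = s
            l_false, l_finish, l_after = (self.fresh_label() for _ in range(3))
            t = self.cgen_expr(b)
            self.emit(f"IfZ {t} Goto {l_false}")
            self.cgen_stmt(s1)
            self.emit(f"Goto {l_after}", l_finish)
            self.emit("skip", l_false)
            self.cgen_stmt(s2)
            self.emit("skip", l_after)
        elif kind == "while":
            _, b, body = s
            l_before, l_loop, l_after = (self.fresh_label() for _ in range(3))
            self.emit("skip", l_before)
            t = self.cgen_expr(b)
            self.emit(f"IfZ {t} Goto {l_after}")
            self.cgen_stmt(body)
            self.emit(f"Goto {l_before}", l_loop)
            self.emit("skip", l_after)
        else:
            raise ValueError(f"unknown statement: {s!r}")


# Lower: while x < y { x := x + 1 }
lowering = Lowering()
lowering.cgen_stmt(("while", ("<", "x", "y"), ("assign", "x", ("+", "x", 1))))
```

The resulting `lowering.code` starts with the `l_before` skip, ends with the `l_after` skip, and jumps back to `l_before` just before the end, matching the shape of the loop rule.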

Translation example
Source program:
    y := 137+3;
    if x=0
      z := y;
    else
      x := y;
Generated code:
    1: t1 := 137
    2: t2 := 3
    3: t3 := t1 + t2
    4: y := t3
    5: t4 := x
    6: t5 := 0
    7: t6 := t4=t5
    8: IfZ t6 Goto 12
    9: t7 := y
    10: z := t7
    11: Goto 14
    12: t8 := y
    13: x := t8
    14: skip

Correctness

Compiler correctness
– Intuitively, a compiler translates programs in one language (usually high-level) to another language (usually lower-level) such that the two programs are equivalent
– Our goal is to formally define the meaning of this equivalence
– But first, we must define the meaning of a programming language

Formal semantics

What is formal semantics?
“Formal semantics is concerned with rigorously specifying the meaning, or behavior, of programs, pieces of hardware, etc.”

What is formal semantics?
“This theory allows a program to be manipulated like a formula – that is to say, its properties can be calculated.”
(Gérard Huet & Philippe Flajolet, homage to Gilles Kahn)

Why formal semantics?
– Implementation-independent definition of a programming language
– Automatically generating interpreters (and some day maybe full-fledged compilers)
– Optimization, verification, and debugging
  – If you don’t know what it does, how do you know it’s correct/incorrect?
  – How do you know whether a given optimization is correct?

Operational semantics
Elements of the semantics:
– States/configurations: the (aggregate) values that a program computes during execution
– Transition rules: how the program advances from one configuration to another

Operational semantics of While

While syntax reminder
    A → n | x | A ArithOp A | ( A )
    ArithOp → - | + | * | /
    B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
    S → x := A | skip | S ; S | { S } | if B then S else S | while B S
    n ∈ Num (numerals)
    x ∈ Var (program variables)

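Semantic categories
– $Z$: integers $\{0, 1, -1, 2, -2, \ldots\}$
– $T$: truth values $\{\mathit{ff}, \mathit{tt}\}$
– $\mathrm{State}$: $\mathrm{Var} \to Z$
Example state: $\sigma = [x \mapsto 5,\ y \mapsto 7,\ z \mapsto 0]$
Lookup: $\sigma(x) = 5$
Update: $\sigma[x \mapsto 6] = [x \mapsto 6,\ y \mapsto 7,\ z \mapsto 0]$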
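As a small illustration, a state can be modeled as a Python dictionary from variable names to integers. The helper name `update` is an assumption of this sketch; it returns a new state rather than mutating the old one, mirroring the way the update notation denotes a new function.

```python
from typing import Dict

State = Dict[str, int]


def update(sigma: State, x: str, value: int) -> State:
    """Return a copy of sigma in which x maps to value; sigma is left unchanged."""
    new_sigma = dict(sigma)
    new_sigma[x] = value
    return new_sigma


sigma = {"x": 5, "y": 7, "z": 0}   # the example state
looked_up = sigma["x"]             # lookup
sigma2 = update(sigma, "x", 6)     # update
```

Here `sigma` still maps `x` to 5 after the call, while `sigma2` maps it to 6.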

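Semantics of expressions

Semantics of arithmetic expressions
Semantic function $[\![ A ]\!] : \mathrm{State} \to Z$, defined by induction on the syntax tree:
\[
\begin{aligned}
[\![ n ]\!]\sigma &= n \\
[\![ x ]\!]\sigma &= \sigma(x) \\
[\![ a_1 + a_2 ]\!]\sigma &= [\![ a_1 ]\!]\sigma + [\![ a_2 ]\!]\sigma \\
[\![ a_1 - a_2 ]\!]\sigma &= [\![ a_1 ]\!]\sigma - [\![ a_2 ]\!]\sigma \\
[\![ a_1 * a_2 ]\!]\sigma &= [\![ a_1 ]\!]\sigma \times [\![ a_2 ]\!]\sigma \\
[\![ (a_1) ]\!]\sigma &= [\![ a_1 ]\!]\sigma \quad \text{(not needed)} \\
[\![ -a ]\!]\sigma &= 0 - [\![ a ]\!]\sigma
\end{aligned}
\]
– Compositional
– Expressions in While are side-effect free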

Arithmetic expression exercise
Suppose $\sigma(x) = 3$. Evaluate $[\![ x + 1 ]\!]\sigma$.
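One way to work the exercise is to unfold the rule for addition, then the rules for variables and numerals:

\[
[\![ x + 1 ]\!]\sigma
= [\![ x ]\!]\sigma + [\![ 1 ]\!]\sigma
= \sigma(x) + 1
= 3 + 1
= 4
\]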

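Semantics of boolean expressions
Semantic function $[\![ B ]\!] : \mathrm{State} \to T$, defined by induction on the syntax tree:
\[
\begin{aligned}
[\![ \mathtt{true} ]\!]\sigma &= \mathit{tt} \\
[\![ \mathtt{false} ]\!]\sigma &= \mathit{ff} \\
[\![ a_1 = a_2 ]\!]\sigma &= \begin{cases} \mathit{tt} & \text{if } [\![ a_1 ]\!]\sigma = [\![ a_2 ]\!]\sigma \\ \mathit{ff} & \text{otherwise} \end{cases} \\
[\![ a_1 \le a_2 ]\!]\sigma &= \begin{cases} \mathit{tt} & \text{if } [\![ a_1 ]\!]\sigma \le [\![ a_2 ]\!]\sigma \\ \mathit{ff} & \text{otherwise} \end{cases} \\
[\![ b_1 \land b_2 ]\!]\sigma &= \begin{cases} \mathit{tt} & \text{if } [\![ b_1 ]\!]\sigma = \mathit{tt} \text{ and } [\![ b_2 ]\!]\sigma = \mathit{tt} \\ \mathit{ff} & \text{otherwise} \end{cases} \\
[\![ \lnot b ]\!]\sigma &= \begin{cases} \mathit{tt} & \text{if } [\![ b ]\!]\sigma = \mathit{ff} \\ \mathit{ff} & \text{otherwise} \end{cases}
\end{aligned}
\]
– Compositional
– Expressions in While are side-effect free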
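The two semantic functions translate almost line by line into recursive evaluators. The sketch below assumes a nested-tuple encoding of expressions and the function names `eval_a` and `eval_b`, neither of which comes from the lecture; the boolean cases follow the standard textbook definitions, and division is left out because no rule for it is given above.

```python
from typing import Dict, Tuple, Union

State = Dict[str, int]
# Hypothetical encodings (assumptions of this sketch):
#   arithmetic: int | str | ("+" | "-" | "*", a1, a2)
#   boolean:    bool | ("=", a1, a2) | ("<=", a1, a2) | ("not", b) | ("and", b1, b2)
AExp = Union[int, str, Tuple]
BExp = Union[bool, Tuple]


def eval_a(a: AExp, sigma: State) -> int:
    """Meaning of an arithmetic expression in state sigma, by structural recursion."""
    if isinstance(a, int):
        return a                 # numeral
    if isinstance(a, str):
        return sigma[a]          # variable lookup
    op, a1, a2 = a
    v1, v2 = eval_a(a1, sigma), eval_a(a2, sigma)
    if op == "+":
        return v1 + v2
    if op == "-":
        return v1 - v2
    if op == "*":
        return v1 * v2
    raise ValueError(f"no semantic rule for operator {op!r}")


def eval_b(b: BExp, sigma: State) -> bool:
    """Meaning of a boolean expression in state sigma (True plays tt, False plays ff)."""
    if isinstance(b, bool):
        return b
    op = b[0]
    if op == "=":
        return eval_a(b[1], sigma) == eval_a(b[2], sigma)
    if op == "<=":
        return eval_a(b[1], sigma) <= eval_a(b[2], sigma)
    if op == "not":
        return not eval_b(b[1], sigma)
    if op == "and":
        # Evaluate both conjuncts; expressions are side-effect free, so order is irrelevant.
        left, right = eval_b(b[1], sigma), eval_b(b[2], sigma)
        return left and right
    raise ValueError(f"no semantic rule for operator {op!r}")


state = {"x": 3}
arith_result = eval_a(("+", "x", 1), state)
bool_result = eval_b(("and", ("=", "x", 3), ("not", ("<=", "x", 2))), state)
```

Each evaluator only calls itself on immediate sub-expressions, which is what "compositional" means here.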

Natural operational semantics
– Developed by Gilles Kahn [STACS 1987]
– Configurations:
  – $\langle S, \sigma \rangle$: statement $S$ is about to execute on state $\sigma$
  – $\sigma$: terminal (final) state
– Transitions: $\langle S, \sigma \rangle \to \sigma'$
  – Execution of $S$ from $\sigma$ will terminate with the result state $\sigma'$
  – Ignores non-terminating computations

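Natural operational semantics
The relation $\to$ is defined by rules of the form
\[
\frac{\langle S_1, \sigma_1 \rangle \to \sigma_1', \;\ldots,\; \langle S_n, \sigma_n \rangle \to \sigma_n'}{\langle S, \sigma \rangle \to \sigma'} \quad \text{if } \ldots
\]
The transitions above the line are the premises, the transition below the line is the conclusion, and "if …" is the side condition.
The meaning of a compound statement is defined using the meaning of its immediate constituent statements.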

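Natural semantics for While
Axioms:
\[
[\mathrm{ass}_{ns}] \quad \langle x := a,\ \sigma \rangle \to \sigma[x \mapsto [\![ a ]\!]\sigma]
\]
\[
[\mathrm{skip}_{ns}] \quad \langle \mathtt{skip},\ \sigma \rangle \to \sigma
\]
Rules:
\[
[\mathrm{comp}_{ns}] \quad \frac{\langle S_1, \sigma \rangle \to \sigma', \quad \langle S_2, \sigma' \rangle \to \sigma''}{\langle S_1 ; S_2,\ \sigma \rangle \to \sigma''}
\]
\[
[\mathrm{if}^{\,tt}_{ns}] \quad \frac{\langle S_1, \sigma \rangle \to \sigma'}{\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2,\ \sigma \rangle \to \sigma'} \quad \text{if } [\![ b ]\!]\sigma = \mathit{tt}
\]
\[
[\mathrm{if}^{\,ff}_{ns}] \quad \frac{\langle S_2, \sigma \rangle \to \sigma'}{\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2,\ \sigma \rangle \to \sigma'} \quad \text{if } [\![ b ]\!]\sigma = \mathit{ff}
\]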

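Natural semantics for While
\[
[\mathrm{while}^{\,tt}_{ns}] \quad \frac{\langle S, \sigma \rangle \to \sigma', \quad \langle \mathtt{while}\ b\ S,\ \sigma' \rangle \to \sigma''}{\langle \mathtt{while}\ b\ S,\ \sigma \rangle \to \sigma''} \quad \text{if } [\![ b ]\!]\sigma = \mathit{tt}
\]
\[
[\mathrm{while}^{\,ff}_{ns}] \quad \langle \mathtt{while}\ b\ S,\ \sigma \rangle \to \sigma \quad \text{if } [\![ b ]\!]\sigma = \mathit{ff}
\]
Non-compositional: the premise of the first rule mentions the while statement itself.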
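The natural-semantics rules read naturally as a recursive interpreter, with one branch per rule. The following is a minimal sketch under assumed conventions: nested tuples for syntax, Python dictionaries for states, the standard definitions for boolean expressions, and the hypothetical function names `eval_a`, `eval_b`, and `execute`. The while case recurses on the loop itself, as the rule does, so very long-running loops would hit Python's recursion limit, and a non-terminating program simply never returns a state.

```python
from typing import Dict

State = Dict[str, int]

# Hypothetical encodings (assumptions of this sketch):
#   arithmetic: int | str | ("+" | "-" | "*", a1, a2)
#   boolean:    bool | ("=", a1, a2) | ("<=", a1, a2) | ("not", b) | ("and", b1, b2)
#   statements: ("assign", x, a) | ("skip",) | ("seq", s1, s2)
#               | ("if", b, s1, s2) | ("while", b, s)


def eval_a(a, sigma: State) -> int:
    if isinstance(a, int):
        return a
    if isinstance(a, str):
        return sigma[a]
    op, a1, a2 = a
    v1, v2 = eval_a(a1, sigma), eval_a(a2, sigma)
    return {"+": v1 + v2, "-": v1 - v2, "*": v1 * v2}[op]


def eval_b(b, sigma: State) -> bool:
    if isinstance(b, bool):
        return b
    if b[0] == "=":
        return eval_a(b[1], sigma) == eval_a(b[2], sigma)
    if b[0] == "<=":
        return eval_a(b[1], sigma) <= eval_a(b[2], sigma)
    if b[0] == "not":
        return not eval_b(b[1], sigma)
    if b[0] == "and":
        return eval_b(b[1], sigma) and eval_b(b[2], sigma)
    raise ValueError(f"unknown boolean expression: {b!r}")


def execute(s, sigma: State) -> State:
    """Return the final state sigma' such that <s, sigma> -> sigma'."""
    kind = s[0]
    if kind == "assign":                       # [ass]
        _, x, a = s
        return {**sigma, x: eval_a(a, sigma)}
    if kind == "skip":                         # [skip]
        return sigma
    if kind == "seq":                          # [comp]
        return execute(s[2], execute(s[1], sigma))
    if kind == "if":                           # [if tt] / [if ff]
        _, b, s1, s2 = s
        return execute(s1 if eval_b(b, sigma) else s2, sigma)
    if kind == "while":                        # [while tt] / [while ff]
        _, b, body = s
        if eval_b(b, sigma):
            return execute(s, execute(body, sigma))
        return sigma
    raise ValueError(f"unknown statement: {s!r}")


# while not (y <= x) { x := x + 1 }; y := x
program = (
    "seq",
    ("while", ("not", ("<=", "y", "x")), ("assign", "x", ("+", "x", 1))),
    ("assign", "y", "x"),
)
final_state = execute(program, {"x": 0, "y": 3})
```

Starting from a state where `x` is 0 and `y` is 3, the loop body runs three times before the guard becomes false, and the final assignment then copies `x` into `y`.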

Next lecture: Correctness of lowering