Compiler Principles Fall 2014-2015 Compiler Principles Lecture 7: Intermediate Representation Roman Manevich Ben-Gurion University.

Slides:



Advertisements
Similar presentations
Chapter 6 Intermediate Code Generation
Advertisements

ANALYSIS OF PROG. LANG. PROGRAM ANALYSIS Instructors: Crista Lopes Copyright © Instructors. 1.
Intermediate Code Generation
Chapter 8 ICS 412. Code Generation Final phase of a compiler construction. It generates executable code for a target machine. A compiler may instead generate.
Lecture 08a – Backpatching & Recap Eran Yahav 1 Reference: Dragon 6.2,6.3,6.4,6.6.
8 Intermediate code generation
1 Compiler Construction Intermediate Code Generation.
PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,
Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.
Chapter 14: Building a Runnable Program Chapter 14: Building a runnable program 14.1 Back-End Compiler Structure 14.2 Intermediate Forms 14.3 Code.
CS412/413 Introduction to Compilers Radu Rugina Lecture 16: Efficient Translation to Low IR 25 Feb 02.
Intermediate code generation. Code Generation Create linear representation of program Result can be machine code, assembly code, code for an abstract.
9/27/2006Prof. Hilfinger, Lecture 141 Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik)
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
1 Semantic Processing. 2 Contents Introduction Introduction A Simple Compiler A Simple Compiler Scanning – Theory and Practice Scanning – Theory and Practice.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
CS 536 Spring Code generation I Lecture 20.
Intermediate Code CS 471 October 29, CS 471 – Fall Intermediate Code Generation Source code Lexical Analysis Syntactic Analysis Semantic.
Compiler Summary Mooly Sagiv html://
Intermediate Code. Local Optimizations
Compiler Construction Intermediate Representation I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction A Compulsory Module for Students in Computer Science Department Faculty of IT / Al – Al Bayt University Second Semester 2008/2009.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
CS412/413 Introduction to Compilers Radu Rugina Lecture 15: Translating High IR to Low IR 22 Feb 02.
Semantic Analysis III + Intermediate Representation I.
Compilation /15a Lecture 7 Getting into the back-end Noam Rinetzky 1.
COP4020 Programming Languages
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
10/1/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Autumn 2009.
1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.
1 Semantic Analysis Aaron Bloomfield CS 415 Fall 2005.
Concordia University Department of Computer Science and Software Engineering Click to edit Master title style COMPILER DESIGN Introduction to code generation.
Compiler course 1. Introduction. Outline Scope of the course Disciplines involved in it Abstract view for a compiler Front-end and back-end tasks Modules.
Chapter 8 Intermediate Code Zhang Jing, Wang HaiLing College of Computer Science & Technology Harbin Engineering University.
Compiler Chapter# 5 Intermediate code generation.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
Joey Paquet, 2000, Lecture 10 Introduction to Code Generation and Intermediate Representations.
Introduction to Code Generation and Intermediate Representations
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
Chapter 1 Introduction Study Goals: Master: the phases of a compiler Understand: what is a compiler Know: interpreter,compiler structure.
Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.
1 Control Flow Analysis Topic today Representation and Analysis Paper (Sections 1, 2) For next class: Read Representation and Analysis Paper (Section 3)
1 Compiler & its Phases Krishan Kumar Asstt. Prof. (CSE) BPRCE, Gohana.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Compilation /15a Lecture 6 Getting into the back-end Noam Rinetzky 1.
Compiler Principles Fall Compiler Principles Lecture 7: Lowering Correctness Roman Manevich Ben-Gurion University of the Negev.
Intermediate code generation. Code Generation Create linear representation of program Result can be machine code, assembly code, code for an abstract.
LECTURE 3 Compiler Phases. COMPILER PHASES Compilation of a program proceeds through a fixed series of phases.  Each phase uses an (intermediate) form.
Compiler Principles Fall Compiler Principles Lecture 8: Intermediate Representation Roman Manevich Ben-Gurion University.
Compiler Principles Fall Compiler Principles Exercise Set: Lowering and Formal Semantics Roman Manevich Ben-Gurion University of the Negev.
Compiler Principles Fall Compiler Principles Lecture 6: Intermediate Representation Roman Manevich Ben-Gurion University of the Negev.
Intermediate Code Generation CS 671 February 14, 2008.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 10 Ahmed Ezzat.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS 404 Introduction to Compiler Design
Lecture 6: Attribute Grammars IR Noam Rinetzky
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Fall Compiler Principles Lecture 5: Intermediate Representation
Chapter 6 Intermediate-Code Generation
CSE401 Introduction to Compiler Construction
Compiler Design 21. Intermediate Code Generation
Getting into the back-end Noam Rinetzky
Fall Compiler Principles Lecture 5: Intermediate Representation
Intermediate Code Generation
Compiler Design 21. Intermediate Code Generation
Review: What is an activation record?
Intermediate Code Generating machine-independent intermediate form.
Presentation transcript:

Compiler Principles Fall Compiler Principles Lecture 7: Intermediate Representation Roman Manevich Ben-Gurion University

Tentative syllabus Front End Scanning Top-down Parsing (LL) Bottom-up Parsing (LR) Attribute Grammars Intermediate Representation Lowering Optimizations Local Optimizations Dataflow Analysis Loop Optimizations Code Generation Register Allocation Instruction Selection 2 mid-termexam

Previously 3 Becoming parsing ninjas – Going from text to an Abstract Syntax Tree By Admiral Ham [GFDL ( or CC-BY-SA-3.0 ( via Wikimedia Commons

From scanning to parsing 59 + (1257 * xPosition) )id*num(+ Lexical Analyzer program text token stream Parser Grammar: E  id E  num E  E + E E  E * E E  ( E ) + num x * Abstract Syntax Tree valid syntax error 4

Agenda Why compilers need Intermediate Representations (IR) Translating from abstract syntax (AST) to IR – Three-Address Code 5

Role of intermediate representation Bridge between front-end and back-end Allow implementing optimizations independent of source language and executable (target) language High-level Language (scheme) Executable Code Lexical Analysis Syntax Analysis Parsing ASTSymbol Table etc. Inter. Rep. (IR) Code Generation 6

Motivation for intermediate representation 7

Intermediate representation A language that is between the source language and the target language – Not specific to any source language of machine language Goal 1: retargeting compiler components for different source languages/target machines 8 C++ IR Pentium Java bytecode Sparc Pyhton Java

Intermediate representation A language that is between the source language and the target language – Not specific to any source language of machine language Goal 1: retargeting compiler components for different source languages/target machines Goal 2: machine-independent optimizer – Narrow interface: small number of node types (instructions) 9 C++ IR Pentium Java bytecode Sparc Pyhton Java optimize LoweringCode Gen.

Multiple IRs Some optimizations require high-level structure Others more appropriate on low-level code Solution: use multiple IR stages ASTLIR Pentium Java bytecode Sparc optimize HIR optimize 10

Multiple IRs example 11 Elixir Program Automated Reasoning (Boogie+Z3) Delta Inferencer QueryAnswer Elixir Program + delta Automated Planner IL Synthesizer Planning Problem Plan LIR C++ backend C++ code Galois Library HIR Lowering HIR Elixir – a language for parallel graph algorithms Mini-project on parallel graph algorithms

AST vs. LIR for imperative languages AST Rich set of language constructs Rich type system Declarations: types (classes, interfaces), functions, variables Control flow statements: if- then-else, while-do, break- continue, switch, exceptions Data statements: assignments, array access, field access Expressions: variables, constants, arithmetic operators, logical operators, function calls LIR An abstract machine language Very limited type system Only computation-related code Labels and conditional/ unconditional jumps, no looping Data movements, generic memory access statements No sub-expressions, logical as numeric, temporaries, constants, function calls – explicit argument passing 12

Lowering to three address code 13

Three-Address Code IR A popular form of IR High-level assembly where instructions have at most three operands There exist other types of IR – For example, IR based on acyclic graphs – more amenable for analysis and optimizations 14 Chapter 8

TAC sub-expressions example int a; int b; int c; int d; a := b + c + d; b := a * a + b * b; 15 SourceLIR (unoptimized) _t0 := b + c; a := _t0 + d; _t1 := a * a; _t2 := b * b; b := _t1 + _t2; Where have the declarations gone?

TAC sub-expressions example int a; int b; int c; int d; a := b + c + d; b := a * a + b * b; _t0 := b + c; a := _t0 + d; _t1 := a * a; _t2 := b * b; b := _t1 + _t2; 16 SourceLIR (unoptimized) Temporaries explicitly store intermediate values resulting from sub-expressions

Elements of TAC-based IR 17

Variable assignments var := constant ; var 1 := var 2 ; var 1 := var 2 op var 3 ; var 1 := constant op var 2 ; var 1 := var 2 op constant ; var := constant 1 op constant 2 ; Permitted operators are +, -, *, /, % 18

Booleans Boolean variables are represented as integers that have zero or nonzero values In addition to the arithmetic operator, TAC supports <, ==, ||, and && How might you compile the following? 19 b := (x <= y);_t0 := x < y; _t1 := x == y; b := _t0 || _t1;

Booleans Boolean variables are represented as integers that have zero or nonzero values In addition to the arithmetic operator, TAC supports <, ==, ||, and && How might you compile the following? 20 b := (x <= y);_t0 := x < y; _t1 := x == y; b := _t0 + _t1;

Unary operators How would you compile the following assignments from unary statements? 21 y := -x; z := !w; y := 0 - x; z := w == 0; y := -1 * x;

Control flow instructions Label introduction _label_name: Indicates a point in the code that can be jumped to Unconditional jump: go to instruction following label L Goto L; Conditional jump: test condition variable t; if 0, jump to label L IfZ t Goto L; Similarly : test condition variable t; if 1, jump to label L IfNZ t Goto L; 22

Control-flow example – conditions int x; int y; int z; if (x < y) z := x; else z := y; z := z * z; ? 23

Control-flow example – conditions int x; int y; int z; if (x < y) z := x; else z := y; z := z * z; _t0 := x < y; IfZ _t0 Goto _L0; z := x; Goto _L1; _L0: z := y; _L1: z := z * z; 24

Control-flow example – loops int x; int y; while (x < y) { x := x * 2; { y := x; ? 25

Control-flow example – loops int x; int y; while (x < y) { x := x * 2; { y := x; _L0: _t0 := x < y; IfZ _t0 Goto _L1; x := x * 2; Goto _L0; _L1: y := x; 26

Data as control flow 27 x := y || z;x := y IfZ z Goto _L1; x := 1; _L1:

Functions Store local variables/temporaries in a stack A function call instruction pushes arguments to stack and jumps to the function label A statement x:=f(a1,…,an); looks like Push a1; … Push an; Call f; Pop x; // copy returned value Returning a value is done by pushing it to the stack ( return x; ) Push x; Return control to caller (and roll up stack) Return; 28

A logical stack frame 29 Param N Param N-1 … Param 1 _t0 … _tk x … y Parameters (actual arguments) Locals and temporaries Stack frame for function f(a1,…,aN)

Functions example int SimpleFn(int z) { int x, y; x := x * y * z; return x; { void main() { int w; w := SimpleFn(137); { _SimpleFn: Pop z; _t0 := x * y; _t1 := _t0 * z; x := _t1; Push x; Return; main: _t0 := 137; Push _t0; Call _SimpleFn; Pop w; 30

Memory access instructions Copy instruction: a = b Load/store instructions: a := *b*a := b Address of instruction a := &b Array accesses: a := b[i]a[i] := b Field accesses: a := b[f]a[f] := b Memory allocation: a := alloc(size) – Sometimes left out (e.g., malloc is a procedure in C) 31 constant

32 Lowering AST to TAC via syntax directed translation

TAC generation At this stage in compilation, we have – an AST – annotated with scope information – and annotated with type information To generate TAC for the program, we do recursive tree traversal – Generate TAC for any subexpressions and substatements – Using the result, generate TAC for the overall expression (bottom-up manner) 33

TAC generation for expressions Define a function cgen(expr) that generates TAC that computes an expression, stores it in a temporary variable, then hands back the name of that temporary Define cgen directly for atomic expressions (constants, this, identifiers, etc.) Define cgen recursively for compound expressions (binary operators, function calls, etc.) 34

cgen for basic expressions 35 cgen(k) = { // k is a constant Choose a new temporary t Emit( t := k ) Return t { cgen(id) = { // id is an identifier Choose a new temporary t Emit( t := id ) Return t {

Naive cgen for binary expressions Maintain a counter for temporaries in c Initially: c = 0 cgen(e 1 op e 2 ) = { Let A = cgen(e 1 ) c = c + 1 Let B = cgen(e 2 ) c = c + 1 Emit( _tc := A op B; ) Return _tc } 36 The translation emits code to evaluate e 1 before e 2. Why is that?

Example: cgen for binary expressions 37 cgen( (a*b)-d)

Example: cgen for binary expressions 38 c = 0 cgen( (a*b)-d)

Example: cgen for binary expressions 39 c = 0 cgen( (a*b)-d) = { Let A = cgen(a*b) c = c + 1 Let B = cgen(d) c = c + 1 Emit( _tc := A - B; ) Return _tc }

Example: cgen for binary expressions 40 c = 0 cgen( (a*b)-d) = { Let A = { Let A = cgen(a) c = c + 1 Let B = cgen(b) c = c + 1 Emit( _tc := A * B; ) Return tc } c = c + 1 Let B = cgen(d) c = c + 1 Emit( _tc := A - B; ) Return _tc }

Example: cgen for binary expressions 41 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code here A=_t0

Example: cgen for binary expressions 42 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; here A=_t0

Example: cgen for binary expressions 43 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; _t1:=b; here A=_t0

Example: cgen for binary expressions 44 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; _t1:=b; _t2:=_t0*_t1 here A=_t0

Example: cgen for binary expressions 45 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; _t1:=b; _t2:=_t0*_t1 here A=_t0 here A=_t2

Example: cgen for binary expressions 46 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; _t1:=b; _t2:=_t0*_t1 _t3:=d; here A=_t0 here A=_t2

Example: cgen for binary expressions 47 c = 0 cgen( (a*b)-d) = { Let A = { Let A = { Emit(_tc := a;), return _tc } c = c + 1 Let B = { Emit(_tc := b;), return _tc } c = c + 1 Emit( _tc := A * B; ) Return _tc } c = c + 1 Let B = { Emit(_tc := d;), return _tc } c = c + 1 Emit( _tc := A - B; ) Return _tc } Code _t0:=a; _t1:=b; _t2:=_t0*_t1 _t3:=d; _t4:=_t2-_t3 here A=_t0 here A=_t2

Num val = 5 AddExpr leftright Ident name = x visit visit (left) visit (right) cgen(5 + x) t1 := 5; t2 := x; t := t1 + t2; t1:=5t2:=x t := t1 + t2 cgen as recursive AST traversal 48

cgen for short-circuit disjunction cgen(e1 || e2) Emit(_t1 := 0;) Emit(_t2 := 0;) Let L after be a new label Let _t1 = cgen(e1) Emit( IfNZ _t1 Goto L after ) Let _t2 = cgen(e2) Emit( L after : ) Emit( _t := _t1 || _t2; ) Return _t 49

cgen for statements We can extend the cgen function to operate over statements as well Unlike cgen for expressions, cgen for statements does not return the name of a temporary holding a value – (Why?) 50

cgen for simple statements 51 cgen(expr;) = { cgen(expr) {

cgen for if-then-else 52 cgen(if (e) s 1 else s 2 ) Let _t = cgen(e) Let L true be a new label Let L false be a new label Let L after be a new label Emit( IfZ _t Goto L false ; ) cgen(s 1 ) Emit( Goto L after ; ) Emit( L false : ) cgen(s 2 ) Emit( L after : )

cgen for while loops 53 cgen(while (expr) stmt) Let L before be a new label Let L after be a new label Emit( L before : ) Let t = cgen(expr) Emit( IfZ t Goto Lafter; ) cgen(stmt) Emit( Goto L before ; ) Emit( L after : )

Exercise: cgen for try-catch 54 cgen(try { try-stmt } catch (v) { catch-stmt } ) ?

Next lecture: IR part 2