Intermediate Representations CS 671 February 12, 2008.

Slides:



Advertisements
Similar presentations
Semantic Analysis and Symbol Tables
Advertisements

Procedures. 2 Procedure Definition A procedure is a mechanism for abstracting a group of related operations into a single operation that can be used repeatedly.
6. Intermediate Representation Prof. O. Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and.
Compiler Construction
Compiler Construction Sohail Aslam Lecture Code Generation  The code generation problem is the task of mapping intermediate code to machine code.
Code Generation.
1 Lecture 10 Intermediate Representations. 2 front end »produces an intermediate representation (IR) for the program. optimizer »transforms the code in.
Intermediate Code Generation
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
Register Allocation Mooly Sagiv Schrierber Wed 10:00-12:00 html://
Chapter 8 ICS 412. Code Generation Final phase of a compiler construction. It generates executable code for a target machine. A compiler may instead generate.
CSI 3120, Implementing subprograms, page 1 Implementing subprograms The environment in block-structured languages The structure of the activation stack.
Intermediate Representations Saumya Debray Dept. of Computer Science The University of Arizona Tucson, AZ
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
1 Compiler Construction Intermediate Code Generation.
Program Representations. Representing programs Goals.
Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.
Chapter 14: Building a Runnable Program Chapter 14: Building a runnable program 14.1 Back-End Compiler Structure 14.2 Intermediate Forms 14.3 Code.
Intermediate Representations Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
1 Storage Registers vs. memory Access to registers is much faster than access to memory Goal: store as much data as possible in registers Limitations/considerations:
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
1 Handling nested procedures Method 1 : static (access) links –Reference to the frame of the lexically enclosing procedure –Static chains of such links.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
CS 536 Spring Code generation I Lecture 20.
Intermediate Code CS 471 October 29, CS 471 – Fall Intermediate Code Generation Source code Lexical Analysis Syntactic Analysis Semantic.
1 Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing.
The Procedure Abstraction Part I: Basics Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Run time vs. Compile time
Improving Code Generation Honors Compilers April 16 th 2002.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
Compiler Construction A Compulsory Module for Students in Computer Science Department Faculty of IT / Al – Al Bayt University Second Semester 2008/2009.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Precision Going back to constant prop, in what cases would we lose precision?
The Procedure Abstraction Part I: Basics Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
10/1/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Autumn 2009.
Compiler Construction
Activation Records CS 671 February 7, CS 671 – Spring The Compiler So Far Lexical analysis Detects inputs with illegal tokens Syntactic analysis.
Activation Records (in Tiger) CS 471 October 24, 2007.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Intermediate Code Representations
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 11: Functions and stack frames.
7. Runtime Environments Zhang Zhizheng
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 10 Ahmed Ezzat.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
1 Chapter10: Code generator. 2 Code Generator Source Program Target Program Semantic Analyzer Intermediate Code Generator Code Optimizer Code Generator.
Run-Time Environments Presented By: Seema Gupta 09MCA102.
CS 404 Introduction to Compiler Design
Run-Time Environments Chapter 7
Compiler Design (40-414) Main Text Book:
Mooly Sagiv html://
Compiler Construction (CS-636)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Introduction to Compilers Tim Teitelbaum
Intermediate Representations Hal Perkins Autumn 2011
Intermediate Representations
Unit IV Code Generation
Chapter 6 Intermediate-Code Generation
CSE401 Introduction to Compiler Construction
The Procedure Abstraction Part I: Basics
Intermediate Representations
UNIT V Run Time Environments.
Optimization 薛智文 (textbook ch# 9) 薛智文 96 Spring.
Intermediate Code Generation
Review: What is an activation record?
Presentation transcript:

Intermediate Representations CS 671 February 12, 2008

CS 671 – Spring Last Time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage Storage organization

CS 671 – Spring Activation Records A procedure is a control abstraction it associates a name with a chunk of code that piece of code is regarded in terms of its purpose and not of its implementation A procedure creates its own name space It can declare local variables Local declarations may hide non-local ones Local names cannot be seen from outside

CS 671 – Spring Handling Control Abstractions Generated code must be able to preserve current state –save variables that cannot be saved in registers –save specific register values establish procedure environment on entry –map actual to formal parameters –create storage for locals restore previous state on exit This can be modeled with a stack Allocate a memory block for each activation Maintain a stack of such blocks This mechanism can handle recursion

CS 671 – Spring Activation Records Keeps track of the bottom of the current activation record g(…) { f(a1,…,an); } g is the caller f is the callee … Arg n … Arg 2 Arg 1 Ret Address Local vars Temporaries Saved regs Arg m … Arg 1 Ret Address f’s frame g’s frame next frame sp fp

CS 671 – Spring Activation Records Handling nested procedures –We must keep a static link (vs. dynamic link) Registers vs. memory –Assume infinite registers (for now) g(…) { f(a1,…,an); } If g saves before call, restores after call caller-save register If f saves before using a reg, restores after callee-save register

CS 671 – Spring Intermediate Code Generation Source code Lexical Analysis Syntactic Analysis Semantic Analysis Intermediate Code Gen lexical errors syntax errors semantic errors tokens AST AST’ IR

CS 671 – Spring And Then? The Back End IR Trace Formation Instruction Selection Register Allocation Optimizations * * phase ordering problem

CS 671 – Spring IR Motivation What we have so far... An abstract syntax tree –With all the program information –Known to be correct Well-typed Nothing missing No ambiguities What we need... Something “Executable” Closer to actual machine level of abstraction

CS 671 – Spring Why an Intermediate Representation? What is the IR used for? –Portability –Optimization –Component interface –Program understanding Compiler –Front end does lexical analysis, parsing, semantic analysis, translation to IR –Back end does optimization of IR, translation to machine instructions Try to keep machine dependences out of IR for as long as possible

CS 671 – Spring Intermediate Code Abstract machine code – (Intermediate Representation) Allows machine-independent code generation, optimization ASTIR PowerPC Alpha x86 optimize

CS 671 – Spring Intermediate Representations Any representation between the AST and ASM –3 address code: triples, quads (low-level) –Expression trees (high-level) a[i]; MEM + a BINOP MUL iCONST W

CS 671 – Spring What Makes a Good IR? Easy to translate from AST Easy to translate to assembly Narrow interface: small number of node types (instructions) –Easy to optimize –Easy to retarget AST (>40 node types) IR (~15 node types) x86 (>200 opcodes)

CS 671 – Spring IR selection Using an existing IR cost savings due to reuse it must be expressive and appropriate for the compiler operations Designing an IR decide how close to machine code it should be decide how expressive it should be decide its structure consider combining different kinds of IRs

CS 671 – Spring Optimizing Compilers Goal: get program closer to machine code without losing information needed to do useful optimizations Need multiple IR stages MIR PowerPC (LIR) Alpha (LIR) x86 (LIR) optimize ASTHIR optimize opt

CS 671 – Spring High-Level IR (HIR) used early in the process usually converted to lower form later on Preserves high-level language constructs – structured flow, variables, methods Allows high-level optimizations based on properties of source language (e.g. inlining, reuse of constant variables) Example: AST

CS 671 – Spring Medium-Level IR (MIR) Try to reflect the range of features in the source language in a language-independent way Intermediate between AST and assembly Unstructured jumps, registers, memory locations Convenient for translation to high-quality machine code OtherMIRs: –tree IR: easy to generate, easy to do reasonable instruction selection –quadruples: a = b OP c (easy to optimize) –stack machine based (like Java bytecode)

CS 671 – Spring Low-Level IR (LIR) Assembly code + extra pseudo instructions Machine dependent Translation to assembly code is trivial Allows optimization of code for low-level considerations: scheduling, memory layout

CS 671 – Spring IR Classification: Level for i := op1 to op2 step op3 instructions endfor i := op1 if step < 0 goto L2 L1: if i > op2 goto L3 instructions i := i + step goto L1 L2: if i < op2 goto L3 instructions i := i + step goto L2 L3: High-level Medium-level

CS 671 – Spring IR Classification: Structure Graphical Trees, graphs Not easy to rearrange Large structures Linear Looks like pseudocode Easy to rearrange Hybrid Combine graphical and linear IRs Example: –low-level linear IR for basic blocks, and –graph to represent flow of control

CS 671 – Spring Graphical IRs Abstract syntax tree High-level Useful for source-level information Retains syntactic structure Common uses –source-to-source translation –semantic analysis –syntax-directed editors

CS 671 – Spring Graphical IRs Tree, for basic block* root: operator up to two children: operands can be combined Uses: algebraic simplifications may generate locally optimal code. L1: i := 2 t1:= i+1 t2 := t1>0 if t2 goto L1 assgn, i add, t1 gt, t2 2 i 1 t1 0 2 assgn, i 1 add, t1 0 gt, t2 *straight-line code with no branches or branch targets.

CS 671 – Spring Graphical IRs Directed Acyclic Graphs (DAGs) Like compressed trees –leaves: variables, constants available on entry –internal nodes: operators annotated with variable names –distinct left/right children Used for basic blocks (DAGs don't show control flow) Can generate efficient code –Note: DAGs encode common expressions But difficult to transform Good for analysis

CS 671 – Spring Graphical IRs Control-Flow Graphs (CFGs) Each node corresponds to a –basic block, or –part of a basic block, or may need to determine facts at specific points within BB –a single statement more space and time Each edge represents flow of control

CS 671 – Spring Control-Flow Graphs Graphical representation of runtime control-flow paths Nodes of graph: basic blocks (straight-line computations) Edges of graph: flows of control Useful for collecting information about computation Detect loops, remove redundant computations, register allocation, instruction scheduling… Alternative CFG: Each node contains a single stmt …… i = 0 while (i < 50) { t1 = b * 2; a = a + t1; i = i + 1; } …. if I < 50 …… t1 := b * 2; a := a + t1; i = i + 1; i =0;

CS 671 – Spring Graphical IRs: Often Use Basic Blocks Basic block = a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end Partitioning a sequence of statements into BBs 1. Determine leaders (first statements of BBs) –The first statement is a leader –The target of a conditional is a leader –A statement following a branch is a leader 2. For each leader, its basic block consists of the leader and all the statements up to but not including the next leader.

CS 671 – Spring Basic Blocks unsigned int fibonacci (unsigned int n) { unsigned int f0, f1, f2; f0 = 0; f1 = 1; if (n <= 1) return n; for (int i=2; i<=n; i++) { f2 = f0+f1; f0 = f1; f1 = f2; } return f2; } read(n) f0 := 0 f1 := 1 if n<=1 goto L0 i := 2 L2:if i<=n goto L1 return f2 L1:f2 := f0+f1 f0 := f1 f1 := f2 i := i+1 go to L2 L0:return n

CS 671 – Spring Basic Blocks read(n) f0 := 0 f1 := 1 if n<=1 goto L0 i := 2 L2:if i<=n goto L1 return f2 L1:f2 := f0+f1 f0 := f1 f1 := f2 i := i+1 go to L2 L0:return n Leaders: read(n) f0 := 0 f1 := 1 n <= 1 i := 2 i<=n return f2f2 := f0+f1 f0 := f1 f1 := f2 i := i+1 return n exit entry

CS 671 – Spring Graphical IRs Dependence graphs : they represents constraints on the sequencing of operations Dependence: a relation between two statements that puts a constraint on their execution order –Control dependence: Based on the program's control flow –Data dependence: Based on the flow of data between statements Nodes represent statements Edges represent dependences Built for specific optimizations, then discarded

CS 671 – Spring Graphical IRs Dependence graphs Example: s1a := b + c s2 if a>10 goto L1 s3d := b * e s4e := d + 1 s5 L1: d := e / 2 control dependence: s3 and s4 are executed only when a<=10 true or flow dependence: s2 uses a value defined in s1 This is read-after-write dependence antidependence: s4 defines a value used in s3 This is write-after-read dependence output dependence: s5 defines a value also defined in s3 This is write-after-write dependence input dependence: s5 uses a value also uses in s3 This is read-after-read situation. It places no constraints in the execution order, but is used in some optimizations. s1s2s3 s4 s5

CS 671 – Spring IR Classification: Structure Graphical Trees, graphs Not easy to rearrange Large structures Linear Seq of instructions that execute in order of appearance Control flow is represented by cond branches and jumps Hybrid Combine graphical and linear IRs Example: –low-level linear IR for basic blocks, and –graph to represent flow of control

CS 671 – Spring Linear IR Lower level IR before final code generation A linear sequence of instructions Resemble assembly code for an abstract machine –Explicit conditional branches and goto jumps Reflect instruction sets of the target machine –Stack-machine code and three-address code Implemented as a collection (table or list) of tuples Push 2 Push y Multiply Push x subtract Linear IR for x – 2 * y MOV 2 => t1 MOV y => t2 MULT t2 => t1 MOV x => t4 SUB t1 => t4 Stack-machine codetwo-address codethree-address code t1 := 2 t2 := y t3 := t1*t2 t4 := x t5 := t4-t3

CS 671 – Spring Linear IRs Stack machine code Assumes presence of operand stack Useful for stack architectures, JVM Operations typically pop operands and push results Advantages –Easy code generation –Compact form Disadvantages –Difficult to rearrange –Difficult to reuse expressions

CS 671 – Spring Stack-Machine Code Also called one-address code Assumes an operand stack Operations take operands from the stack and push results back onto the stack Compact, simple to generate and execute –Most operands do not need names –Results are transitory unless explicitly moved to memory Used as IR for Smalltalk and Java Push 2 Push y Multiply Push x subtract Stack-machine code for x – 2 * y

CS 671 – Spring Linear IR: Three-Address Code Each instruction is of the form x := y op z y and z can be only registers or constants Just like assembly Common form of intermediate code The AST expression x + y * z is translated as t 1 := y * z t 2 := x + t 1 Each subexpression has a “home”

CS 671 – Spring Three-Address Code Result, operand, operand, operator –x := y op z, where op is a binary operator and x, y, z can be variables, constants or compiler generated temporaries (intermediate results) Can write this as a shorthand – -- quadruples Statements –Assignment x := y op z –Copy stmts x := y –Goto L

CS 671 – Spring Three Address Code Statements (cont) –if x relop y goto L –Indexed assignments x := y[j] or s[j] := z –Address and pointer assignments (for C) x := &y x := *p *x := y –Parm x; –call p, n; –return y;

CS 671 – Spring Three-Address Code Every instruction manipulates at most two operands and one result. Arithmetic operations: x := y op z | x := op y Data movement: x := y [z] | x[z] := y | x := y Control flow: if y op z goto x | goto x Function call: param x | return y | call foo Reasonably compact; reuse of names and values t1 := 2 t2 := y t3 := t1*t2 t4 := x t5 := t4-t3 Three-address code for x – 2 * y

CS 671 – Spring Storing Three-Address Code oparg1arg2result (0)Uminusct1 (1)Multbt1t2 (2)Uminusct3 (3)Multbt3t4 (4)Plust2t4t5 (5)Assignt5a t1 := - c t2 := b * t1 t3 := -c t4 := b * t3 t5 := t2 + t4 a := t5 3AC Store all instructions in a quadruple table Every instr has four fields: op, arg1, arg2, result Label of instructions  index of instruction in table Quadruple entries

CS 671 – Spring Bigger Example Consider the AST t 1 := - c t 2 := b * t 1 t 3 := - c t 4 := b * t 3 t 5 := t 2 + t 4 a := t 5 t 1 := - c t 2 := b * t 1 t 3 := - c t 4 := b * t 3 t 5 := t 2 + t 4 a := t 5

CS 671 – Spring The IR Machine A machine with Infinite number of temporaries (think registers) Simple instructions –3-operands –Branching –Calls with simple calling convention Simple code structure –Array of instructions Labels to define targets of branches

CS 671 – Spring Temporaries The machine has an infinite number of temporaries Call them t 0, t 1, t 2,.... Temporaries can hold values of any type The type of the temporary is derived from the generation Temporaries go out of scope with each function

CS 671 – Spring Mapping Names to Variables Variables are names for values Names given by programmers in the input program Names given by compilers for storing intermediate results of computation Reusing temp variables can save space, but mask context and prevent analysis and optimizations Result of t1*t2 is no longer available after t1 is reused t1 := 2 t2 := y t3 := t1*t2 t4 := x t5 := t4-t3 Three-address code for x – 2 * y t1 := 2 t2 := y t1 := t1*t2 t2 := x t1 := t2-t1 Different values use distinct names Different values reuse the same name

CS 671 – Spring Mapping Storage to Variables Variables are placeholders for values Every variable must have location to store its value –Register, stack, heap, static storage Values must be loaded into registers before use t1 := 2 t2 := y t3 := t1*t2 t4 := x t5 := t4-t3 Three-address code for x – 2 * y: t1 := 2 t2 := t1*y t3 := x-t2 x and y are in registersx and y are in memory Which vars can be kept in registers? Which vars must be stored in memory? void A(int b, int *p) { int a, d; a = 3; d = foo(a); *p =b+d; }

CS 671 – Spring Summary IRs provide the interface between the front and back ends of the compiler –Graphical, linear, hybrid Should be machine and language independent Should be amenable to optimization