Intermediate Code CS 471 October 29, 2007

Intermediate Code Generation

The compiler pipeline so far:

Source code → Lexical Analysis → tokens → Syntactic Analysis → AST → Semantic Analysis → AST’ → Intermediate Code Gen → IR

Each phase reports its own errors: lexical errors, syntax errors, and semantic errors.

Motivation

What we have so far: an abstract syntax tree
– with all the program information
– known to be correct: well-typed, nothing missing, no ambiguities

What we need: something “executable”, closer to the actual machine level of abstraction.

Why an Intermediate Representation?

What is the IR used for?
– Portability
– Optimization
– Component interface
– Program understanding

In the compiler:
– The front end does lexical analysis, parsing, semantic analysis, and translation to IR.
– The back end does optimization of the IR and translation to machine instructions.

Try to keep machine dependences out of the IR for as long as possible.

Intermediate Code

Abstract machine code (the intermediate representation) allows machine-independent code generation and optimization:

AST → IR → optimize → PowerPC / Alpha / x86

What Makes a Good IR?

– Easy to translate from the AST
– Easy to translate to assembly
– Narrow interface: a small number of node types (instructions) makes it easy to optimize and easy to retarget

Compare: the AST has >40 node types, the IR 13 node types, and x86 >200 opcodes.

Intermediate Representations

Any representation between the AST and assembly:
– 3-address code: triples, quads (low-level)
– Expression trees (high-level)

Tiger intermediate code for a[i]:

MEM(BINOP(+, a, BINOP(MUL, i, CONST W)))

Intermediate Representation: Components and Issues

Components:
– code representation
– symbol table
– analysis information
– string table

Issues:
– Use an existing IR or design a new one? Many are available: RTLs, SSA, LLVM, etc.
– How close should it be to the source/target?

IR Selection

Using an existing IR:
– cost savings due to reuse
– it must be expressive and appropriate for the compiler's operations

Designing a new IR:
– decide how close to machine code it should be
– decide how expressive it should be
– decide its structure
– consider combining different kinds of IRs

The IR Machine

A machine with:
– an infinite number of temporaries (think registers)
– simple instructions: 3 operands, branching, calls with a simple calling convention
– a simple code structure: an array of instructions, with labels to define the targets of branches

Temporaries

The machine has an infinite number of temporaries; call them t0, t1, t2, ....
– Temporaries can hold values of any type; a temporary's type is derived from the code that generates it.
– Temporaries go out of scope with each function.
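The unbounded temporary supply can be sketched as a per-function counter. A minimal sketch; the `TempAllocator` name is illustrative, not from the slides:

```python
# A minimal allocator for the IR machine's unbounded temporaries.
# A fresh allocator is created per function, so temporaries naturally
# go out of scope when that function's code has been generated.
class TempAllocator:
    def __init__(self):
        self.count = 0

    def new_temp(self):
        """Return a fresh temporary name: t0, t1, t2, ..."""
        name = f"t{self.count}"
        self.count += 1
        return name

alloc = TempAllocator()
print(alloc.new_temp(), alloc.new_temp())  # t0 t1
```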

Optimizing Compilers

Goal: get the program closer to machine code without losing the information needed to do useful optimizations. This requires multiple IR stages:

AST → HIR → optimize → MIR → optimize → LIR (PowerPC, Alpha, x86), with further target-specific optimization

High-Level IR (HIR)

– Used early in the process; usually converted to a lower form later on.
– Preserves high-level language constructs: structured flow, variables, methods.
– Allows high-level optimizations based on properties of the source language (e.g. inlining, reuse of constant variables).
– Example: the AST.

Medium-Level IR (MIR)

– Tries to reflect the range of features in the source language in a language-independent way.
– Intermediate between the AST and assembly: unstructured jumps, registers, memory locations.
– Convenient for translation to high-quality machine code.

Other MIRs:
– quadruples: a = b OP c (“a” is explicit, not an arc)
– UCODE: stack-machine based (like Java bytecode)
– advantage of a tree IR: easy to generate, easier to do reasonable instruction selection
– advantage of quadruples: easier optimization

Low-Level IR (LIR)

– Assembly code plus extra pseudo-instructions.
– Machine dependent; translation to assembly code is trivial.
– Allows optimization for low-level considerations: scheduling, memory layout.

IR Classification: Level

High-level:

for i := op1 to op2 step op3
    instructions
endfor

Medium-level:

    i := op1
    if step < 0 goto L2
L1: if i > op2 goto L3
    instructions
    i := i + step
    goto L1
L2: if i < op2 goto L3
    instructions
    i := i + step
    goto L2
L3:
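The high-to-medium lowering above can be mechanized. A sketch under stated assumptions: the textual instruction format and the `lower_for` name are illustrative, and the label names are fixed rather than freshly generated:

```python
# Lower a high-level counted loop
#     for i := op1 to op2 step op3  <body>  endfor
# into the medium-level label/goto form shown above. Because the sign
# of the step is not known at compile time, both loop tests (step up
# and step down) are emitted, each with its own copy of the body.
def lower_for(i, op1, op2, step, body):
    code = [f"    {i} := {op1}",
            f"    if {step} < 0 goto L2",
            f"L1: if {i} > {op2} goto L3"]
    code += [f"    {s}" for s in body]
    code += [f"    {i} := {i} + {step}",
             "    goto L1",
             f"L2: if {i} < {op2} goto L3"]
    code += [f"    {s}" for s in body]
    code += [f"    {i} := {i} + {step}",
             "    goto L2",
             "L3:"]
    return code

for line in lower_for("i", "1", "n", "s", ["sum := sum + i"]):
    print(line)
```

A real code generator would allocate fresh labels per loop; the fixed L1/L2/L3 here mirror the slide.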

IR Classification: Structure

Graphical (trees, graphs):
– not easy to rearrange
– large structures

Linear:
– looks like pseudocode
– easy to rearrange

Hybrid: combines graphical and linear IRs. Example: a low-level linear IR for basic blocks, plus a graph to represent the flow of control.

Graphical IRs: Parse Tree, Abstract Syntax Tree

– High-level; useful for source-level information.
– Retains syntactic structure.
– Common uses: source-to-source translation, semantic analysis, syntax-directed editors.

Graphical IRs Often Use Basic Blocks

A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end, without halting or the possibility of branching except at the end.

Partitioning a sequence of statements into basic blocks:
1. Determine the leaders (the first statements of basic blocks):
   – the first statement is a leader
   – the target of a branch (conditional or unconditional) is a leader
   – a statement following a branch is a leader
2. For each leader, its basic block consists of the leader and all the statements up to, but not including, the next leader.
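The leader-based partitioning above can be sketched over a list of three-address instructions. The textual instruction format and function names here are illustrative assumptions, not part of the slides:

```python
import re

# Partition straight-line three-address code into basic blocks using
# the leader rules above: the first statement, every branch target,
# and every statement following a branch is a leader.
def find_leaders(code):
    labels = {}
    for i, ins in enumerate(code):
        m = re.match(r"(\w+):", ins)        # "L2: ..." defines label L2
        if m:
            labels[m.group(1)] = i
    leaders = {0}                            # rule 1: first statement
    for i, ins in enumerate(code):
        if "goto" in ins:
            target = ins.split("goto")[1].strip()
            leaders.add(labels[target])      # rule 2: branch target
            if i + 1 < len(code):
                leaders.add(i + 1)           # rule 3: after a branch
    return sorted(leaders)

def basic_blocks(code):
    ls = find_leaders(code)
    return [code[a:b] for a, b in zip(ls, ls[1:] + [len(code)])]
```

Running it on the fibonacci three-address code of the next slide yields six blocks, matching the flow graph sketched there.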

Basic Blocks: Example

unsigned int fibonacci(unsigned int n) {
    unsigned int f0, f1, f2;
    f0 = 0;
    f1 = 1;
    if (n <= 1) return n;
    for (int i = 2; i <= n; i++) {
        f2 = f0 + f1;
        f0 = f1;
        f1 = f2;
    }
    return f2;
}

Three-address code:

    read(n)
    f0 := 0
    f1 := 1
    if n <= 1 goto L0
    i := 2
L2: if i <= n goto L1
    return f2
L1: f2 := f0 + f1
    f0 := f1
    f1 := f2
    i := i + 1
    goto L2
L0: return n

Basic Blocks: Leaders and Flow Graph

Leaders in the code above: read(n), i := 2, if i <= n goto L1, return f2, f2 := f0 + f1, and return n.

The resulting basic blocks:

B1: read(n); f0 := 0; f1 := 1; if n <= 1 goto L0
B2: i := 2
B3 (L2): if i <= n goto L1
B4: return f2
B5 (L1): f2 := f0 + f1; f0 := f1; f1 := f2; i := i + 1; goto L2
B6 (L0): return n

Control-flow edges: entry → B1; B1 → B2 (fall through) and B1 → B6 (n <= 1); B2 → B3; B3 → B5 (i <= n) and B3 → B4; B5 → B3; B4 → exit; B6 → exit.

Graphical IRs: Trees for Basic Blocks

For straight-line code (no branches or branch targets), each statement can be represented as a tree:
– root: an operator
– up to two children: the operands
– trees can be combined

Uses: algebraic simplifications; may generate locally optimal code.

Example:

L1: i := 2
    t1 := i + 1
    t2 := t1 > 0
    if t2 goto L1

As separate trees: (assgn i 2), (add t1 i 1), (gt t2 t1 0). Combined, each assignment nests as an operand of its use: (gt t2 (add t1 (assgn i 2) 1) 0).

Graphical IRs: Directed Acyclic Graphs (DAGs)

Like compressed trees:
– leaves: variables and constants available on entry
– internal nodes: operators, annotated with the names of the variables that hold their results
– distinct left and right children

Used for basic blocks (DAGs don't show control flow). Because DAGs encode common subexpressions, they can be used to generate efficient code. They are difficult to transform, but good for analysis.

Graphical IRs: Generating DAGs

For each statement in the block:
1. Check whether each operand is already present; if not, create a leaf for it.
2. Check whether there is already a parent of the operands that represents the same operation; if not, create one.
3. Label the node representing the result with the name of the destination variable, and remove that label from all other nodes in the DAG.

Graphical IRs: DAG Example

m := 2 * y * z
n := 3 * y * z
p := 2 * y - z

With left-to-right grouping, 2 * y appears in both m and p, so the DAG shares a single node for it (and single leaves for 2, 3, y, and z).
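DAG construction for the block above can be sketched by hashing nodes on (operator, children), so a second request for 2 * y returns the existing node instead of building a new one. The `Dag` class and its method names are illustrative assumptions:

```python
# Build a DAG for a basic block by numbering nodes: a node is
# identified by its (op, left, right) key, so common subexpressions
# map to a single shared node instead of duplicate subtrees.
class Dag:
    def __init__(self):
        self.ids = {}   # (op, left, right) or ("leaf", name) -> node id
        self.n = 0

    def _node(self, key):
        if key not in self.ids:
            self.ids[key] = self.n
            self.n += 1
        return self.ids[key]

    def leaf(self, name):
        return self._node(("leaf", name))

    def op(self, op, left, right):
        return self._node((op, left, right))

d = Dag()
two, three = d.leaf("2"), d.leaf("3")
y, z = d.leaf("y"), d.leaf("z")
m = d.op("*", d.op("*", two, y), z)    # m := 2 * y * z
n = d.op("*", d.op("*", three, y), z)  # n := 3 * y * z
p = d.op("-", d.op("*", two, y), z)    # p := 2 * y - z; reuses the 2*y node
```

Three separate expression trees for these statements would need 15 nodes; the DAG has 9, because the leaves and the 2 * y node are shared.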

Graphical IRs: Control Flow Graphs (CFGs)

Each node corresponds to:
– a basic block, or
– part of a basic block, or
– a single statement (needed when facts must be determined at specific points within a block; costs more space and time)

Each edge represents flow of control.

Graphical IRs: Dependence Graphs

A dependence graph represents constraints on the sequencing of operations. A dependence is a relation between two statements that puts a constraint on their execution order:
– Control dependence: based on the program's control flow.
– Data dependence: based on the flow of data between statements.

Nodes represent statements; edges represent dependences, with edge labels giving the dependence types. Dependence graphs are built for specific optimizations, then discarded.

Dependence Graph Example

s1: a := b + c
s2: if a > 10 goto L1
s3: d := b * e
s4: e := d + 1
s5: L1: d := e / 2

– Control dependence: s3 and s4 are executed only when a <= 10.
– True (flow) dependence: s2 uses a value defined in s1. This is a read-after-write dependence.
– Antidependence: s4 defines a value used in s3. This is a write-after-read dependence.
– Output dependence: s5 defines a value also defined in s3. This is a write-after-write dependence.
– Input dependence: s5 uses a value also used in s3. This is a read-after-read situation; it places no constraint on the execution order, but it is used in some optimizations.
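The data-dependence kinds above can be computed mechanically from each statement's def and use sets; control dependence needs the CFG and is not modeled in this sketch. The (defs, uses) encoding and the `dependences` name are illustrative assumptions:

```python
# Classify data dependences between statements from their def/use
# sets: flow = read-after-write, anti = write-after-read,
# output = write-after-write, input = read-after-read.
def dependences(stmts):
    """stmts: list of (defs, uses) sets, in program order."""
    deps = []
    for j, (dj, uj) in enumerate(stmts):
        for i in range(j):
            di, ui = stmts[i]
            if di & uj: deps.append((i, j, "flow"))
            if ui & dj: deps.append((i, j, "anti"))
            if di & dj: deps.append((i, j, "output"))
            if ui & uj: deps.append((i, j, "input"))
    return deps

# s1: a := b + c   s2: if a > 10 goto L1   s3: d := b * e
# s4: e := d + 1   s5: d := e / 2
stmts = [({"a"}, {"b", "c"}),
         (set(), {"a"}),
         ({"d"}, {"b", "e"}),
         ({"e"}, {"d"}),
         ({"d"}, {"e"})]
deps = dependences(stmts)
```

The result reproduces the example's classifications: a flow dependence s1 → s2, flow and anti dependences between s3 and s4, and output and input dependences between s3 and s5.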


Linear IRs

A sequence of instructions that execute in order of appearance. Control flow is represented by conditional branches and jumps.

Common representations:
– stack machine code
– three-address code

Linear IRs: Stack Machine Code

– Assumes the presence of an operand stack.
– Useful for stack architectures and the JVM.
– Operations typically pop their operands and push their results.

Advantages: easy code generation; compact form.
Disadvantages: difficult to rearrange; difficult to reuse expressions.
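A toy evaluator makes the pop/push discipline concrete. The instruction names here are illustrative, not JVM opcodes:

```python
# A tiny stack machine: every operation pops its operands off the
# stack and pushes its result, as described above.
def run(code, env):
    stack = []
    for ins in code:
        op, *args = ins
        if op == "push":                  # push a constant
            stack.append(args[0])
        elif op == "load":                # push a variable's value
            stack.append(env[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# x + y * z in stack code: operands first, operators in postfix order
code = [("load", "x"), ("load", "y"), ("load", "z"), ("mul",), ("add",)]
print(run(code, {"x": 2, "y": 3, "z": 4}))  # 14
```

Note how compact the code is (no temporary names at all), and how hard it would be to reorder: every instruction's operands are implicit stack positions.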

Example: Three-Address Code

Each instruction is of the form x := y op z, where y and z can only be registers or constants, just like assembly. This is a common form of intermediate code.

The AST expression x + y * z is translated as

t1 := y * z
t2 := x + t1

Each subexpression has a temporary “home”.
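Generating this code from an expression tree is a short post-order walk: translate the children, then emit one instruction whose result is a fresh temporary. The tuple AST encoding and the `gen` name are illustrative assumptions:

```python
# Translate an expression AST (nested tuples; bare strings are
# variables) to three-address code. Each subexpression's value gets
# a temporary "home", exactly as in the x + y * z example above.
def gen(node, code, counter):
    if isinstance(node, str):            # leaf: a variable name
        return node
    op, left, right = node
    l = gen(left, code, counter)         # post-order: children first
    r = gen(right, code, counter)
    t = f"t{counter[0]}"
    counter[0] += 1
    code.append(f"{t} := {l} {op} {r}")
    return t

code = []
gen(("+", "x", ("*", "y", "z")), code, [1])   # x + y * z
for line in code:
    print(line)
```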

Three Address Code

Result, operand, operand, operator: x := y op z, where op is a binary operator and x, y, z can be variables, constants, or compiler-generated temporaries (intermediate results). This can be written in the shorthand <x, y, z, op>: quadruples.

Statements:
– Assignment: x := y op z
– Copy: x := y
– Goto L

Three Address Code: Statements (cont.)

– Conditional jump: if x relop y goto L
– Indexed assignments: x := y[j] or s[j] := z
– Address and pointer assignments (for C): x := &y, x := *p, *x := y
– Procedure calls: param x; call p, n; return y
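The statement forms above all fit one four-field record, the quadruple. A minimal sketch; the field order and names (`op` first here, where the slides list the result first) and the `show` helper are representation choices of this example, not from the slides:

```python
from collections import namedtuple

# One record per three-address statement. Unused fields are None
# (e.g. a copy has no second operand; a goto has no operands).
Quad = namedtuple("Quad", "op arg1 arg2 result")

def show(q):
    """Render a quadruple back into three-address syntax."""
    if q.op == "goto":
        return f"goto {q.result}"
    if q.op.startswith("if"):            # e.g. "if<" carries the relop
        return f"if {q.arg1} {q.op[2:]} {q.arg2} goto {q.result}"
    if q.op == "copy":
        return f"{q.result} := {q.arg1}"
    return f"{q.result} := {q.arg1} {q.op} {q.arg2}"

quads = [Quad("*", "b", "c", "t1"),
         Quad("copy", "t1", None, "x"),
         Quad("if<", "x", "y", "L2"),
         Quad("goto", None, None, "L1")]
for q in quads:
    print(show(q))
```

Because every statement is the same flat record, a list of quadruples is easy to scan and rearrange, which is the "easier optimization" advantage mentioned for quadruples earlier.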

Bigger Example

Consider the AST for a := b * (-c) + b * (-c). It translates to:

t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5

Summary

– IRs provide the interface between the front and back ends of the compiler.
– An IR should be machine- and language-independent.
– An IR should be amenable to optimization.

Next time: Tiger IR.