Intermediate Representations Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.

Slides:



Advertisements
Similar presentations
Target Code Generation
Advertisements

1 Lecture 10 Intermediate Representations. 2 front end »produces an intermediate representation (IR) for the program. optimizer »transforms the code in.
Intermediate Code Generation
1 CS 201 Compiler Construction Machine Code Generation.
Intermediate Representations Saumya Debray Dept. of Computer Science The University of Arizona Tucson, AZ
1 Compiler Construction Intermediate Code Generation.
Program Representations. Representing programs Goals.
Compiler Construction Sohail Aslam Lecture IR Taxonomy IRs fall into three organizational categories 1.Graphical IRs encode the compiler’s knowledge.
Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
Intermediate code generation. Code Generation Create linear representation of program Result can be machine code, assembly code, code for an abstract.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
10/22/2002© 2002 Hal Perkins & UW CSEG-1 CSE 582 – Compilers Intermediate Representations Hal Perkins Autumn 2002.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
Intermediate Code CS 471 October 29, CS 471 – Fall Intermediate Code Generation Source code Lexical Analysis Syntactic Analysis Semantic.
1 Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing.
Instruction Selection, II Tree-pattern matching Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in.
From Cooper & Torczon1 Implications Must recognize legal (and illegal) programs Must generate correct code Must manage storage of all variables (and code)
Improving Code Generation Honors Compilers April 16 th 2002.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Wrapping Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Introduction to Code Generation Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Compiler Construction A Compulsory Module for Students in Computer Science Department Faculty of IT / Al – Al Bayt University Second Semester 2008/2009.
Intermediate Representations Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Building An Interpreter After having done all of the analysis, it’s possible to run the program directly rather than compile it … and it may be worth it.
Instruction Selection Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Precision Going back to constant prop, in what cases would we lose precision?
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
CSE P501 – Compiler Construction Parser Semantic Actions Intermediate Representations AST Linear Next Spring 2014 Jim Hogg - UW - CSE - P501G-1.
10/1/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Autumn 2009.
1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.
Intermediate Representations. Front end - produces an intermediate representation (IR) Middle end - transforms the IR into an equivalent IR that runs.
Compiler Chapter# 5 Intermediate code generation.
Local Register Allocation & Lab Exercise 1 Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp.
Introduction to Compilers. Related Area Programming languages Machine architecture Language theory Algorithms Data structures Operating systems Software.
Introduction to Code Generation and Intermediate Representations
Introduction to Code Generation Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
Code Generation Ⅰ CS308 Compiler Theory1. 2 Background The final phase in our compiler model Requirements imposed on a code generator –Preserving the.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Chapter# 6 Code generation.  The final phase in our compiler model is the code generator.  It takes as input the intermediate representation(IR) produced.
Intermediate Code Representations
12/18/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Winter 2008.
Instruction Scheduling Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Intermediate Code Generation CS 671 February 14, 2008.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 10 Ahmed Ezzat.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
©SoftMoore ConsultingSlide 1 Code Optimization. ©SoftMoore ConsultingSlide 2 Code Optimization Code generation techniques and transformations that result.
CS 404 Introduction to Compiler Design
Introduction to Optimization
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Optimization Code Optimization ©SoftMoore Consulting.
Local Instruction Scheduling
Intermediate Representations Hal Perkins Autumn 2011
Introduction to Optimization
Intermediate Representations
Introduction to Code Generation
Wrapping Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit.
Chapter 6 Intermediate-Code Generation
Intermediate Representations
The Last Lecture COMP 512 Rice University Houston, Texas Fall 2003
Intermediate Representations Hal Perkins Autumn 2005
Introduction to Optimization
8 Code Generation Topics A simple code generator algorithm
Optimization 薛智文 (textbook ch# 9) 薛智文 96 Spring.
Intermediate Code Generation
Target Code Generation
Presentation transcript:

Intermediate Representations Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use.

Intermediate Representations Front end - produces an intermediate representation (IR) Middle end - transforms the IR into an equivalent IR that runs more efficiently Back end - transforms the IR into native code IR encodes the compiler’s knowledge of the program Middle end usually consists of several passes Front End Middle End Back End IR Source Code Target Code

Intermediate Representations Decisions in IR design affect the speed and efficiency of the compiler Some important IR properties  Ease of generation  Ease of manipulation  Procedure size  Freedom of expression  Level of abstraction The importance of different properties varies between compilers  Selecting an appropriate IR for a compiler is critical

Types of Intermediate Representations Three major categories Structural  Graphically oriented  Heavily used in source-to-source translators  Tend to be large Linear  Pseudo-code for an abstract machine  Level of abstraction varies  Simple, compact data structures  Easier to rearrange Hybrid  Combination of graphs and linear code  Example: control-flow graph Examples: Trees, DAGs Examples: 3 address code Stack machine code Example: Control-flow graph

Level of Abstraction The level of detail exposed in an IR influences the profitability and feasibility of different optimizations. Two different representations of an array reference: subscript Ai j loadI 1 => r 1 sub r j, r 1 => r 2 loadI 10 => r 3 mult r 2, r 3 => r 4 sub r i, r 1 => r 5 add r 4, r 5 => r 6 => r 7 Add r 7, r 6 => r 8 load r 8 => r Aij High level AST: Good for memory disambiguation Low level linear code: Good for address calculation

Level of Abstraction Structural IRs are usually considered high-level Linear IRs are usually considered low-level Not necessarily true: + * 10 j 1 - j 1 - load Low level AST loadArray A,i,j High level linear code

Abstract Syntax Tree An abstract syntax tree is the procedure’s parse tree with the nodes for most non-terminal nodes removed x - 2 * y Can use linearized form of the tree  Easier to manipulate than pointers x 2 y * - in postfix form - * 2 y x in prefix form S-expressions are (essentially) ASTs - x 2 y *

Directed Acyclic Graph A directed acyclic graph (DAG) is an AST with a unique node for each value Makes sharing explicit Encodes redundancy x 2 y * -  z /  w z  x - 2 * y w  x / 2 Same expression twice means that the compiler might arrange to evaluate it just once!

Stack Machine Code Originally used for stack-based computers, now Java Example: x - 2 * y becomes Advantages Compact form Introduced names are implicit, not explicit Simple to generate and execute code Useful where code is transmitted over slow communication links (the net ) push x push 2 push y multiply subtract Implicit names take up no space, where explicit ones do!

Three Address Code Several different representations of three address code In general, three address code has statements of the form: x  y op z With 1 operator (op ) and, at most, 3 names (x, y, & z) Example: z  x - 2 * ybecomes Advantages: Resembles many machines Introduces a new set of names Compact form t  2 * y z  x - t *

Three Address Code: Quadruples Naïve representation of three address code Table of k * 4 small integers Simple record structure Easy to reorder Explicit names load1Y loadi22 mult321 load4X sub542 load r1, y loadI r2, 2 mult r3, r2, r1 load r4, x sub r5, r4, r3 RISC assembly codeQuadruples The original F ORTRAN compiler used “quads”

Three Address Code: Triples Index used as implicit name 25% less space consumed than quads Much harder to reorder loady loadI2 mult(1)(2) loadx sub(4)(3) (1) (2) (3) (4) (5) Implicit names take no space!

Three Address Code: Indirect Triples List first triple in each statement Implicit name space Uses more space than triples, but easier to reorder Major tradeoff between quads and triples is compactness versus ease of manipulation  In the past compile-time space was critical  Today, speed may be more important loady loadI2 mult(100)(101) loadx sub(103)(102) (100) (101) (102) (103) (104) (100) (105)

Static Single Assignment Form The main idea: each name defined exactly once Introduce  -functions to make it work Strengths of SSA-form Sharper analysis  -functions give hints about placement (sometimes) faster algorithms Original x  … y  … while (x < k) x  x + 1 y  y + x SSA-form x 0  … y 0  … if (x 0 > k) goto next loop: x 1   (x 0,x 2 ) y 1   (y 0,y 2 ) x 2  x y 2  y 1 + x 2 if (x 2 < k) goto loop next: …

Two Address Code Allows statements of the form x  x op y Has 1 operator (op ) and, at most, 2 names (x and y) Example: z  x - 2 * y becomes Can be very compact Problems Machines no longer rely on destructive operations Difficult name space  Destructive operations make reuse hard  Good model for machines with destructive ops (PDP-11) t 1  2 t 2  load y t 2  t 2 * t 1 z  load x z  z - t 2

Control-flow Graph Models the transfer of control in the procedure Nodes in the graph are basic blocks  Can be represented with quads or any other linear representation Edges in the graph represent control flow Example if (x = y) a  2 b  5 a  3 b  4 c  a * b Basic blocks — Maximal length sequences of straight-line code

Using Multiple Representations Repeatedly lower the level of the intermediate representation  Each intermediate representation is suited towards certain optimizations Example: the Open64 compiler  WHIRL intermediate format  Consists of 5 different IRs that are progressively more detailed Front End Middle End Back End IR 1IR 3 Source Code Target Code Middle End IR 2

Memory Models Two major models Register-to-register model  Keep all values that can legally be stored in a register in registers  Ignore machine limitations on number of registers  Compiler back-end must insert loads and stores Memory-to-memory model  Keep all values in memory  Only promote values to registers directly before they are used  Compiler back-end can remove loads and stores Compilers for RISC machines usually use register-to-register  Reflects programming model  Easier to determine when registers are used

The Rest of the Story… Representing the code is only part of an IR There are other necessary components Symbol table (already discussed) Constant table  Representation, type  Storage class, offset Storage map  Overall storage layout  Overlap information  Virtual register assignments