Published by Jessie Patrick. Modified over 8 years ago.
1
Compiler Construction (CS-636)
Muhammad Bilal Bashir, UIIT, Rawalpindi
2
Outline
1. Intermediate Code Generation
2. Variants of Syntax Trees
   1. Directed Acyclic Graphs for Expressions
   2. The Value-Number Method for Constructing DAGs
3. Three-Address Code
   1. Addresses and Instructions
   2. Quadruples
   3. Triples
4. Static Single-Assignment Form
5. Summary
3
Intermediate-Code Generation
Lecture: 19-20
4
Where Are We Now?
Source code → Scanner → Tokens → Parser → Syntax Tree → Semantic Analyzer → Annotated Tree → Intermediate Code Generator → Intermediate code
5
Intermediate-Code Generation
In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code. Ideally, details of the source language are confined to the front end, and details of the target machine to the back end. With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j.
6
Intermediate-Code Generation (Continued)
The following figure shows the front-end model of a compiler. Static checking includes type checking, which ensures that operators are applied to compatible operands. Static checking also includes any syntactic checks that remain after parsing. For example, it ensures that a break statement in C is enclosed within a while, for, or switch statement, and reports an error if no such enclosing statement exists.
7
Intermediate-Code Generation (Continued)
While translating a program, a compiler may construct a sequence of intermediate representations. High-level representations are close to the source language, and low-level representations are close to the target machine. Abstract syntax trees are a high-level intermediate representation: they depict the natural hierarchical structure of the source program.
Source Program → High-Level Intermediate Representation → Low-Level Intermediate Representation → Target Code
8
Intermediate-Code Generation (Continued)
A low-level representation is suitable for machine-dependent tasks like register allocation and instruction selection. Three-address code can range from high-level to low-level, depending upon the choice of operators. For expressions, the differences between syntax trees and three-address code are superficial. For statements such as loops, however, a syntax tree represents the components of the statement, whereas three-address code contains labels and jump instructions to represent the flow of control, as in machine language.
9
Intermediate-Code Generation (Continued)
The choice or design of an intermediate representation varies from compiler to compiler. An intermediate representation may either be an actual language or consist of internal data structures that are shared by the phases of the compiler. C is a programming language, yet it is often used as an intermediate form: C is flexible, it compiles into efficient machine code, and its compilers are widely available. The original C++ compiler consisted of a front end that generated C, treating a C compiler as a back end.
10
Quiz #3
Time Allowed: 10 Minutes
11
Variants of Syntax Trees
Nodes in a syntax tree represent constructs in the source program. The children of a node represent the meaningful components of that construct. A directed acyclic graph (DAG) for an expression identifies the common subexpressions of the expression, that is, subexpressions that occur more than once.
12
Directed Acyclic Graphs for Expressions
A directed acyclic graph (DAG) is a directed graph with no directed cycles. Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators. A node N in a DAG has more than one parent if N represents a common subexpression. A DAG not only represents expressions more succinctly, it also gives the compiler important clues regarding the generation of efficient code to evaluate the expression.
13
Directed Acyclic Graphs for Expressions (Continued)
Create syntax trees and DAGs for the following expressions:
- a = a + 10
- a + b + (a + b)
- a + b + a + b
- a + a * (b - c) + (b - c) * d
14
The Value-Number Method for Constructing DAGs
Often, the nodes of a syntax tree or DAG are stored in an array of records. Each row of the array represents one record, and therefore one node. Consider the figure on the next slide, which shows a DAG along with its array for the expression i = i + 10.
15
The Value-Number Method for Constructing DAGs (Continued)
In the following figure, leaves have one additional field, which holds the lexical value, and interior nodes have two additional fields indicating the left and right children.
16
The Value-Number Method for Constructing DAGs (Continued)
In the array, we refer to a node by giving the integer index of its record within the array. This integer is called the value number for the node, or for the expression represented by the node.
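The array-of-records idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the class name `Dag`, the use of a dictionary to find an existing record, and 0-based value numbers are all choices made for the sketch (a textbook figure may number records from 1).

```python
class Dag:
    """Array of node records plus a lookup table used to reuse existing nodes."""

    def __init__(self):
        self.nodes = []   # record i is (op, left, right); i is the value number
        self.index = {}   # signature (op, left, right) -> value number

    def node(self, op, left=None, right=None):
        """Return the value number for (op, left, right), creating it if new."""
        signature = (op, left, right)
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def leaf(self, kind, value):
        """A leaf stores its lexical value in the first extra field."""
        return self.node(kind, value)


# Build the DAG for i = i + 10.
dag = Dag()
i = dag.leaf("id", "i")
ten = dag.leaf("num", 10)
total = dag.node("+", i, ten)                        # children are value numbers
assign = dag.node("=", dag.leaf("id", "i"), total)   # second "i" reuses record 0

for number, record in enumerate(dag.nodes):
    print(number, record)
```

Because a node is looked up by its signature before it is created, asking for `a + b` twice yields the same value number, which is exactly how a common subexpression ends up with more than one parent.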
17
Three-Address Code
In three-address code, there is at most one operator on the right side of an instruction. An expression like x+y*z might be translated into the following sequence of three-address instructions:
t1 = y*z
t2 = x+t1
Here t1 and t2 are compiler-generated temporary names. The use of names for the intermediate values computed by a program allows three-address code to be rearranged easily.
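The translation of x+y*z can be sketched as a post-order walk over an expression tree. The nested-tuple encoding `(op, left, right)` and the function name `gen_tac` are assumptions made for this illustration, not part of the lecture.

```python
import itertools


def gen_tac(expr):
    """Translate a nested-tuple expression into three-address instructions.

    expr is either a name (str) or a tuple (op, left, right).
    Returns (address holding the result, list of instruction strings).
    """
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):
            return node                      # a name is already an address
        op, left, right = node
        left_addr = walk(left)
        right_addr = walk(right)
        temp = f"t{next(counter)}"           # fresh temporary per operation
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    return walk(expr), code


result, code = gen_tac(("+", "x", ("*", "y", "z")))
print("\n".join(code))
```

Each instruction carries a single operator, and the operands of an outer operation are the temporaries produced for its inner operations.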
18
Three-Address Code (Continued)
Exercise: Represent the following DAG as a sequence of three-address instructions.
19
Addresses and Instructions
Three-address code is built from two concepts: addresses and instructions. In object-oriented terms, these concepts correspond to classes, and the various kinds of addresses and instructions correspond to appropriate subclasses. Alternatively, three-address code can be implemented using records with fields for the addresses. These records are called quadruples and triples.
20
Addresses and Instructions (Continued)
In three-address code, an address can be one of the following:
- A name: a name that appears in the source program. In an implementation, a source name is replaced by a pointer to its symbol-table entry, where all the information about the name is kept.
- A constant: in practice, a compiler must deal with many different types of constants and variables.
- A compiler-generated temporary: it is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed.
21
Addresses and Instructions (Continued)
A few examples of three-address instructions:
- Assignment instructions of the form x = y op z
- Assignments of the form x = op y
- Copy instructions of the form x = y
- An unconditional jump goto L
- Conditional jumps of the form if x goto L
- Indexed copy instructions of the form x = y[z] or y[z] = x
22
Addresses and Instructions (Continued)
Consider the following statement and its three-address code in the figures:
do i = i+1; while( a[i]<v );
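One way to see how labels and jumps express the do-while statement above is to write its three-address code as data and run it with a tiny interpreter. Everything in this sketch is an assumption made for illustration: the tuple encoding, the opcode names (`copy`, `load`, `if<`), and the helper names `value` and `run`. It uses a relational conditional jump of the form if x < y goto L, and it indexes the Python list directly rather than scaling the index by an element width as a real compiler would.

```python
def value(env, operand):
    """Constants are ints; names are looked up in the environment."""
    return operand if isinstance(operand, int) else env[operand]


def run(program, env):
    """Execute a list of three-address instructions and return the environment."""
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "+":                        # x = y + z
            _, x, y, z = ins
            env[x] = value(env, y) + value(env, z)
        elif op == "copy":                   # x = y
            _, x, y = ins
            env[x] = value(env, y)
        elif op == "load":                   # x = y[z]
            _, x, y, z = ins
            env[x] = env[y][value(env, z)]
        elif op == "if<":                    # if x < y goto L
            _, x, y, target = ins
            if value(env, x) < value(env, y):
                pc = labels[target]
                continue
        pc += 1                              # labels fall through to here
    return env


# do i = i+1; while( a[i] < v );
program = [
    ("label", "L"),
    ("+", "t1", "i", 1),        # t1 = i + 1
    ("copy", "i", "t1"),        # i = t1
    ("load", "t2", "a", "i"),   # t2 = a[i]
    ("if<", "t2", "v", "L"),    # if t2 < v goto L
]

env = run(program, {"i": -1, "a": [1, 3, 5, 7], "v": 5})
print(env["i"])
```

The loop body appears once, and the backward conditional jump to L is the only thing that makes it a loop.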
23
Quadruples & Triples
The description of three-address instructions specifies the components of each type of instruction, but it does not specify how these instructions are represented in a data structure. In a compiler, these instructions can be implemented as objects or as records with fields for the operator and the operands. Three such representations are called "quadruples", "triples", and "indirect triples".
24
Quadruples
A quadruple, or just "quad", has four fields, which we call op, arg1, arg2, and result. In x = y + z, '+' is op, y and z are arg1 and arg2, and x is result. There are some exceptions to this rule:
- Instructions with unary operators, like x = minus y or x = y, do not use arg2.
- Operators like param use neither arg2 nor result.
- Conditional and unconditional jumps put the target label in result.
25
Quadruples (Continued)
Example: Three-address code for the assignment a = b * -c + b * -c is shown below.
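A minimal sketch of how quadruples for a = b * -c + b * -c could be produced from an expression tree follows. The `Quad` record, the function name `to_quads`, the tuple encoding of the tree, and the use of `None` for an unused field are assumptions made for this illustration.

```python
from collections import namedtuple

Quad = namedtuple("Quad", ["op", "arg1", "arg2", "result"])


def to_quads(expr, target):
    """Emit quadruples for `target = expr`.

    expr is a name (str), a unary tuple (op, operand),
    or a binary tuple (op, left, right).
    """
    quads = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, *operands = node
        addrs = [walk(operand) for operand in operands]
        arg2 = addrs[1] if len(addrs) > 1 else None   # unary ops leave arg2 unused
        temp = f"t{len(quads) + 1}"                   # one new temporary per quad
        quads.append(Quad(op, addrs[0], arg2, temp))
        return temp

    quads.append(Quad("=", walk(expr), None, target))
    return quads


term = ("*", "b", ("minus", "c"))
quads = to_quads(("+", term, term), "a")
for position, quad in enumerate(quads):
    print(position, quad.op, quad.arg1, quad.arg2, quad.result)
```

Because the input here is a tree rather than a DAG, the common subexpression b * -c is computed twice, into t2 and t4.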
26
Triples
A triple has only three fields, which we call op, arg1, and arg2. In the earlier example, we saw that the result field is used primarily for temporary names. Using triples, we refer to the result of an operation x op y by its position rather than by an explicit temporary name. Consider the figure on the next slide for details.
27
Triples (Continued)
Example: Three-address code using triples
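The same assignment, a = b * -c + b * -c, can be sketched with triples. In this illustration an integer argument means "the result of the triple at that position", while a string is a source name. The `Triple` record, the function name `to_triples`, 0-based positions, and the placement of the target in arg1 of the final `=` triple are assumptions of the sketch.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["op", "arg1", "arg2"])


def to_triples(expr, target):
    """Emit triples for `target = expr`; results are referenced by position."""
    triples = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, *operands = node
        addrs = [walk(operand) for operand in operands]
        arg2 = addrs[1] if len(addrs) > 1 else None
        triples.append(Triple(op, addrs[0], arg2))
        return len(triples) - 1          # position stands in for a temporary

    computed = walk(expr)
    triples.append(Triple("=", target, computed))
    return triples


term = ("*", "b", ("minus", "c"))
triples = to_triples(("+", term, term), "a")
for position, triple in enumerate(triples):
    print(position, triple.op, triple.arg1, triple.arg2)
```

No temporary names appear at all, which saves a field per instruction but ties each reference to the instruction's position in the list.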
28
Static Single-Assignment Form
Static single-assignment form (SSA) is an intermediate representation that facilitates certain code optimizations. Two aspects distinguish SSA from three-address code:
- All assignments in SSA are to variables with distinct names.
- SSA uses a notational convention, the Φ-function, to combine two definitions of the same variable.
Source program:
if( flag ) x = -1; else x = 1;
y = x + a
SSA form:
if( flag ) x1 = -1; else x2 = 1;
x3 = Φ( x1, x2 )
y = x3 + a
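The renaming in the if/else example can be sketched as follows. This is a deliberately small illustration, not a general SSA construction algorithm: it handles a single two-way branch followed by a join, only merges variables assigned on both paths, and leaves variables with no definition in the fragment (such as a) unrenamed. The function name `to_ssa` and the token-list encoding of right-hand sides are assumptions. Unlike the slide, the sketch also gives y a version number, since it numbers every assigned variable.

```python
def to_ssa(then_branch, else_branch, join):
    """Rename one if/else plus join block into SSA form.

    Each statement is (target, list of right-hand-side tokens).
    Returns (then statements, else statements, join statements with phis first).
    """
    definitions = {}                     # variable -> definitions seen so far

    def fresh(var):
        definitions[var] = definitions.get(var, 0) + 1
        return f"{var}{definitions[var]}"

    def rename(stmts, current):
        renamed = []
        for target, rhs in stmts:
            rhs = [current.get(token, token) for token in rhs]  # uses first
            current[target] = fresh(target)                     # then the def
            renamed.append((current[target], rhs))
        return renamed

    then_names, else_names = {}, {}
    then_ssa = rename(then_branch, then_names)
    else_ssa = rename(else_branch, else_names)

    # A phi-function merges the two definitions that reach the join point.
    merged, phis = {}, []
    for var in sorted(set(then_names) & set(else_names)):
        merged[var] = fresh(var)
        phis.append((merged[var], ["phi", then_names[var], else_names[var]]))

    return then_ssa, else_ssa, phis + rename(join, merged)


then_ssa, else_ssa, join_ssa = to_ssa(
    [("x", ["-1"])],
    [("x", ["1"])],
    [("y", ["x", "+", "a"])],
)
for target, rhs in then_ssa + else_ssa + join_ssa:
    print(target, "=", " ".join(rhs))
```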
29
Summary
Any Questions?