Intermediate Code Generation Professor Yihjia Tsai Tamkang University.

Sanath Jayasena/Apr Introduction
Intermediate representation (IR)
–Generally a program for an abstract machine (can be assembly language or slightly above)
–Easy to produce and translate into target code
Why?
–When a re-targetable compiler is needed, i.e., if we are planning a portable compiler with different back ends
–Better/easier for some optimizations, since machine code can be more complex

[Figure: source languages (Java, ML, Pascal, C) and target machines (Sparc, MIPS, Pentium, Alpha), shown first connected directly and then connected through a shared Intermediate Representation]

Introduction … contd
The front end can do scanning, parsing, semantic analysis and translation to IR
The back end will then optimize and generate target code
IR can modularize the task
–Front end not bothered about machine details
–Back end not bothered about source language

Introduction … contd
Qualities of a good IR
–Convenient for the semantic analysis phase to produce
–Convenient to translate into the machine language of all desired target hardware
–Each construct has a clear and simple meaning, which makes optimizing transformations easy

Intermediate Representations
Abstract syntax trees
Postfix notation
Directed acyclic graphs (DAGs)
Three-address code (3AC)

Abstract Syntax Trees
Also called Intermediate Rep. (IR) trees
–Have individual components that describe only very simple things
–E.g., load, store, add, move, jump
–E.g., pp , Tiger book (see handout)

Postfix Notation
For an expression E, inductively:
1. If E is a var or const, the postfix notation is E
2. If E is of the form E1 op E2, the postfix notation is E1’ E2’ op, where E1’, E2’ are the postfix notations for E1, E2
3. If E is of the form (E1), then the postfix notation for E1 is also that for E
–Parentheses are unnecessary in postfix notation

Example
What are the postfix notations for (9-5)+2 and 9-(5+2)?
(9-5)+2 in postfix notation is 95-2+
9-(5+2) in postfix notation is 952+-
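
The inductive definition of postfix notation can be sketched in Python. This is only an illustration: the nested-tuple representation (op, left, right) and the name to_postfix are assumptions, not part of the slides.

```python
def to_postfix(expr):
    """Return the postfix string for an expression tree.

    A leaf is a string (variable or constant); an interior node is a
    tuple (op, left, right). Parentheses never appear because grouping
    is already encoded in the shape of the tree.
    """
    if isinstance(expr, str):          # rule 1: var or const
        return expr
    op, left, right = expr             # rule 2: E1 op E2 -> E1' E2' op
    return to_postfix(left) + to_postfix(right) + op


print(to_postfix(("+", ("-", "9", "5"), "2")))   # (9-5)+2 -> 95-2+
print(to_postfix(("-", "9", ("+", "5", "2"))))   # 9-(5+2) -> 952+-
```

Rule 3 needs no code here because a tree has no parentheses to drop.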

Syntax-Directed Translation
Translation guided by CFGs
–Based on “attributes” of language constructs
  E.g., type, string, number, memory location
–Attach attributes to grammar symbols
–Values for attributes computed by semantic rules associated with productions
The translation of a language construct is expressed in terms of attributes associated with its syntactic components

Syntax-Directed Translation … contd
Two notations for associating semantic rules with productions in a CFG
1. Syntax-directed definitions
  High-level specs, details hidden, order of translation unspecified
2. Translation schemes
  Order of translation specified, more details shown
[Dragon book: Section 2.3 and Chapter 5]

Syntax-Directed Definitions
For each grammar symbol: associate a set of attributes (synthesized and inherited)
For each production: semantic rules define the values of the attributes at the parse-tree nodes where that production is used
Syntax-directed definition = grammar + set of semantic rules

Annotated Parse Tree
A parse tree showing the attribute values at each node
Used for translation (which is an input → output mapping)
–For input x, construct the parse tree for x
–If a node n in the tree is labeled by symbol Y
  The value of attribute p of Y at node n is denoted Y.p
  The value of Y.p is computed using the semantic rule for attribute p associated with the Y-production at n

Synthesized Attributes
An attribute is synthesized if its value at a parse-tree node is determined from the attribute values at the child nodes
Can be evaluated with a single bottom-up tree traversal (e.g., a post-order depth-first traversal)
A syntax-directed definition that uses synthesized attributes exclusively is said to be an S-attributed definition

Example 1
Translating expressions into postfix
“.t” is a string-valued attribute, || is concatenation
Production              Semantic Rule
expr → expr1 + term     expr.t := expr1.t || term.t || ‘+’
expr → expr1 - term     expr.t := expr1.t || term.t || ‘-’
expr → term             expr.t := term.t
term → 0                term.t := ‘0’
…                       …
term → 9                term.t := ‘9’

Example 1 … contd
Annotated parse tree corresponding to “9-5+2” (children indented under their parent):
expr.t = 95-2+
  expr.t = 95-
    expr.t = 9
      term.t = 9
        9
    -
    term.t = 5
      5
  +
  term.t = 2
    2

Example 2
Syntax-directed definition for a desk calculator program
Draw the annotated parse tree for “3*5+4 $”
Production      Semantic Rule
L → E $         print(E.val)
E → E1 + T      E.val := E1.val + T.val
E → T           E.val := T.val
T → T1 * F      T.val := T1.val × F.val
T → F           T.val := F.val
F → digit       F.val := digit.lexval

Example 2 … contd
Annotated parse tree corresponding to “3*5+4 $” (children indented under their parent):
L
  E.val = 19
    E.val = 15
      T.val = 15
        T.val = 3
          F.val = 3
            digit.lexval = 3
        *
        F.val = 5
          digit.lexval = 5
    +
    T.val = 4
      F.val = 4
        digit.lexval = 4
  $
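
The desk calculator's synthesized attribute val can be computed by a small hand-written parser. The sketch below is one possible implementation, not the slides' method: it assumes single-digit operands, replaces the left-recursive productions with loops, and returns E.val instead of printing it. The name evaluate is hypothetical.

```python
def evaluate(text):
    """Compute E.val for input such as '3*5+4 $' (single digits, + and *)."""
    text = text.replace(" ", "")
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def factor():                      # F -> digit    F.val := digit.lexval
        nonlocal pos
        ch = peek()
        if not ch.isdigit():
            raise SyntaxError(f"digit expected at position {pos}")
        pos += 1
        return int(ch)

    def term():                        # T -> T1 * F   T.val := T1.val * F.val
        nonlocal pos
        val = factor()
        while peek() == "*":
            pos += 1
            val *= factor()
        return val

    def expr():                        # E -> E1 + T   E.val := E1.val + T.val
        nonlocal pos
        val = term()
        while peek() == "+":
            pos += 1
            val += term()
        return val

    val = expr()
    if peek() != "$":                  # L -> E $      (the slide prints E.val here)
        raise SyntaxError("'$' expected at end of input")
    return val


print(evaluate("3*5+4 $"))
```

Each function returns the attribute of the nonterminal it parses, which mirrors how a synthesized attribute flows from children to parent.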

Inherited Attributes
The value at a node is defined using attributes at the siblings and/or parent of the node
Useful for tracking the context of a construct
–E.g., decide whether the address or the value of a var is needed by keeping track of whether it appears on the RHS or LHS of an assignment

Example
Syntax-directed definition with inherited attribute L.in for declaration of variables of type int or real
Draw the annotated parse tree for “real id1, id2, id3”
Production      Semantic Rule
D → T L         L.in := T.type
T → int         T.type := integer
T → real        T.type := real
L → L1, id      L1.in := L.in
                addtype(id.entry, L.in)
L → id          addtype(id.entry, L.in)

Example … contd
Annotated parse tree for “real id1, id2, id3” with inherited attribute in at each L node (children indented under their parent):
D
  T.type = real
    real
  L.in = real
    L.in = real
      L.in = real
        id1
      ,
      id2
    ,
    id3
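
The way L.in carries the declared type down to every identifier can be sketched as follows. This is a simplified illustration under stated assumptions: the declaration is a plain string, the symbol table is a dict, and the function name declare is hypothetical; the assignment into the dict stands in for addtype.

```python
def declare(decl, symtab):
    """Process 'type id, id, ...' and record each id's type in symtab."""
    type_name, names = decl.split(None, 1)
    if type_name not in ("int", "real"):
        raise SyntaxError(f"unknown type {type_name!r}")
    t_type = "integer" if type_name == "int" else "real"   # T.type
    l_in = t_type                                          # L.in := T.type
    for name in names.split(","):
        # Every L node receives the same inherited value L.in.
        symtab[name.strip()] = l_in    # addtype(id.entry, L.in)
    return symtab


print(declare("real id1, id2, id3", {}))
```

The type is known only at the T node, so it has to be passed sideways and downward to the identifiers, which is exactly what an inherited attribute does.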

Translation Schemes
Semantic actions embedded within the RHS of productions
–Unlike syntax-directed definitions, the order of evaluation of semantic rules is explicitly shown
–The action to be taken is shown by enclosing it in { }
  E.g., rterm → term { print(‘+’) } rterm1
–In a parse tree in this context, an action is shown by an extra child node and a dashed edge

Depth-First Order
L-attributed definitions
–Attributes can always be evaluated in depth-first order (left-to-right)
Translation schemes with restrictions motivated by L-attributed definitions ensure that an attribute value is available when an action refers to it
–E.g., when only synthesized attributes exist, each action can be placed at the end of the right side of its production

Example
Translation scheme that maps infix expressions with addition/subtraction into corresponding postfix expressions (Λ denotes the empty string)
E → T R
R → addop T { print(addop.lexeme) } R1 | Λ
R → subop T { print(subop.lexeme) } R2 | Λ
T → num { print(num.val) }
Show the parse tree for “9-5+2”

Example … contd
[Figure: parse tree for “9-5+2” showing actions; when performed in depth-first order, the actions run as print(‘9’), print(‘5’), print(‘-’), print(‘2’), print(‘+’) and so print “95-2+”]
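
A translation scheme of this shape maps naturally onto a recursive-descent parser, with each embedded action executed at the point where it appears in the production. The sketch below assumes a pre-split token list and collects output in a list rather than printing; the function name infix_to_postfix is hypothetical.

```python
def infix_to_postfix(tokens):
    """Translate e.g. ['9', '-', '5', '+', '2'] following the scheme
    E -> T R;  R -> op T {print(op)} R | empty;  T -> num {print(num)}."""
    out = []
    pos = 0

    def T():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        out.append(tokens[pos])        # { print(num.val) }
        pos += 1

    def R():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            T()
            out.append(op)             # action runs after T and before R1
            R()
        # otherwise R -> empty: emit nothing

    T()
    R()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return "".join(out)


print(infix_to_postfix(list("9-5+2")))   # 95-2+
```

Placing the operator's action between T and the recursive call to R is what makes the operator appear after both of its operands.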

Emitting a Translation
For simple syntax-directed definitions, implementation is possible with translation schemes whose actions print the additional strings in their order of appearance
–[Simple: the string representing the translation of the non-terminal on the LHS of each production is the concatenation of the translations of the non-terminals on the RHS, in the same order as in the production, possibly with additional strings in between]

Example
A translation scheme derived from Example in slide 7-15
expr → expr + term   { print(‘+’) }
expr → expr - term   { print(‘-’) }
expr → term
term → 0             { print(‘0’) }
term → 1             { print(‘1’) }
…
term → 9             { print(‘9’) }

Example … contd
[Figure: parse tree with actions translating “9-5+2” into “95-2+”; the actions run in the order print(‘9’), print(‘5’), print(‘-’), print(‘2’), print(‘+’)]

Constructing Syntax Trees
Syntax-directed definitions can be used
Recall: a syntax tree is a condensed form of a parse tree
–Operators and keywords appear as interior nodes
Construction: similar to postfix notation
–For a subexpression, create a node for each operator and operand
–Children of an operator node represent the operands (as subexpressions) of that operator

Nodes in a Syntax Tree
A node is like a record with many fields:
–label, pointers to operand nodes, value, etc.
3 basic functions to create nodes
–mknode(op, left, right): operator node with label op and two pointer fields left and right
–mkleaf(id, entry): ID node with label id and field entry pointing to the symbol-table entry
–mkleaf(num, val): NUM node with label num and a value field containing the value of the number

Example
From Example 5.7, p. 288
–What is the sequence of calls to create the syntax tree for the expression “a - 4 + c”?
p1 = mkleaf(id, entry_a);
p2 = mkleaf(num, 4);
p3 = mknode(‘-’, p1, p2);
p4 = mkleaf(id, entry_c);
p5 = mknode(‘+’, p3, p4);
What is the syntax tree?
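
The node-creating functions can be sketched with a small record type. The Node class, its field names, the show helper, and the use of plain strings as stand-ins for symbol-table entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Any = None    # symbol-table entry for id, numeric value for num


def mknode(op, left, right):
    """Operator node with label op and two child pointers."""
    return Node(op, left, right)


def mkleaf(label, value):
    """Leaf node: label is 'id' or 'num', value is the entry or number."""
    return Node(label, value=value)


def show(n):
    """Render the tree as a fully parenthesized string."""
    if n.left is None and n.right is None:
        return str(n.value)
    return f"({show(n.left)} {n.label} {show(n.right)})"


entry_a, entry_c = "a", "c"      # stand-ins for symbol-table entries
p1 = mkleaf("id", entry_a)
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", entry_c)
p5 = mknode("+", p3, p4)
print(show(p5))                  # ((a - 4) + c)
```

The printed form shows the shape of the tree: the root is +, its left child is the - node over a and 4, and its right child is the leaf c.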

Constructing Syntax Trees … contd
A syntax-directed definition may be used for constructing a syntax tree
–Semantic rules: calls to functions mknode( ) and mkleaf( )
–E.g., for the production E → E1 + T, we may have the semantic rule E.nptr = mknode(‘+’, E1.nptr, T.nptr)
–Example 5.8, p. 289

DAGs for Expressions
A dag for an expression identifies common subexpressions
–Unlike a syntax tree, a node for a common subexpression may have more than one parent node
–E.g., “a + a * (b-c) + (b-c) * d”, Fig. 5.11, p. 291
How to create a dag, given an expression?
–Before creating a node, check whether an identical node already exists
–Example 5.9, p. 291
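
The "check if an identical node already exists" step can be sketched with a dictionary keyed on the node's label and children. This is one common way to do it, not necessarily the book's; the class DagBuilder and its methods are hypothetical names.

```python
class DagBuilder:
    """Creates DAG nodes, reusing an existing node when one is identical."""

    def __init__(self):
        self.nodes = {}                 # key -> node id

    def _intern(self, key):
        if key not in self.nodes:       # no identical node yet: create one
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]

    def leaf(self, name):
        return self._intern(("leaf", name))

    def node(self, op, left, right):
        return self._intern((op, left, right))


# a + a * (b-c) + (b-c) * d
b = DagBuilder()
a, bv, c, d = (b.leaf(x) for x in "abcd")
first = b.node("-", bv, c)
second = b.node("-", bv, c)             # reuses the node created for `first`
root = b.node("+", b.node("+", a, b.node("*", a, first)), b.node("*", second, d))
print(first == second, len(b.nodes))    # True 9
```

The DAG has 9 nodes, whereas the corresponding syntax tree would need 13 because a and (b-c) would each be duplicated.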

Review Example
For the assignment statement a = b * -c + b * -c, give a syntax tree, dag and postfix notation
–Fig. 8.2, p. 464

Three-Address Code (3AC)
3AC is a sequence of statements of the general form x := y op z
–x, y, z are names, constants, or generated temporaries
–op is any operator (arithmetic, logical)
3AC means each statement usually has 3 addresses (2 for operands, 1 for the result)

Examples
Given the expression x+y*z, the 3AC is
t1 := y * z
t2 := x + t1
Show 3AC for (a) syntax tree, (b) dag discussed earlier in slide 7-34 (Fig. 8.2)
–Fig. 8.5, p. 466
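
Generating 3AC from an expression tree amounts to a post-order walk that invents a new temporary for every interior node. The sketch below assumes the same nested-tuple trees (op, left, right) with string leaves; the name gen_3ac is hypothetical.

```python
import itertools


def gen_3ac(expr):
    """Return (place, code): the name holding the result and the 3AC lines."""
    counter = itertools.count(1)
    code = []

    def walk(e):
        if isinstance(e, str):          # a name or constant is its own place
            return e
        op, left, right = e
        l = walk(left)
        r = walk(right)
        t = f"t{next(counter)}"         # new temporary for this interior node
        code.append(f"{t} := {l} {op} {r}")
        return t

    place = walk(expr)
    return place, code


place, code = gen_3ac(("+", "x", ("*", "y", "z")))
print("\n".join(code))
```

For x+y*z this emits t1 := y * z followed by t2 := x + t1, matching the example above.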

3AC … contd
A name in a program is replaced by a pointer to the symbol-table entry for that name
3AC statements are like assembly code
–There are flow-control statements
–They can have symbolic labels
–A label represents the index of a 3AC statement in an array containing the intermediate code

Types of 3AC Statements
1. Assignment statements with binary operators (arithmetic or logical)
–Of the form x := y op z
2. Assignment statements with unary operators (minus, logical not, shift, etc.)
–Of the form x := op y
3. Copy statements
–Of the form x := y

Types of 3AC Statements … contd
4. Unconditional jump: goto L
–The statement with label L is executed next
5. Conditional jump: if x relop y goto L
–A relational operator (e.g., <, =, >=, …) is applied to x and y
–If the relation holds, the statement with label L is executed next
–If not, the statement following the conditional jump is executed

Types of 3AC Statements … contd
6. Function calls: param x, call p, n and return y
–“return y” is optional
–E.g., for the call p(x1, x2, …, xn) the 3AC will be
param x1
param x2
…
param xn
call p, n

Types of 3AC Statements … contd
7. Indexed assignments: x := y[i], x[i] := y
–In x := y[i]: x is set to the value in the location i units beyond memory location y
–In x[i] := y: the value in the location i units beyond memory location x is set to the value of y
–x, y and i are data objects

Types of 3AC Statements … contd
8. Address & pointer assignments: x := &y, x := *y, *x := y
–In x := &y: x is set to the location of y
  y denotes an l-value, x is a pointer name
–In x := *y: (the r-value of) x is set to the value in the location pointed to by y
  y is a pointer; the r-value of y is a location
–In *x := y: (the r-value of) the object pointed to by x is set to (the r-value of) y

Syntax-Dir. Translation into 3AC
When 3AC code is generated, temp names are made up for the interior nodes of the syntax tree
–E.g., for E → E1 + E2, the value of E on the LHS will be computed into a new temp t
Example
–Fig. 8.6, Fig. 8.7 on p. 469

Implementation of 3AC
3AC is an abstract form
–Can be implemented in a compiler as records (with fields for the operator and operands)
Three representations
–Quadruples
–Triples
–Indirect triples

(a) Quadruples
A record structure with 4 fields
–op, arg1, arg2 and result
Examples
–For x := y op z we have: y in arg1, z in arg2 and x in result
–For unary operators, arg2 is not used
–For the param operator, arg2 and result are unused
–Fig. 8.8(a), p. 471 for a := b * -c + b * -c
The contents of the fields are pointers to symbol-table (ST) entries

(b) Triples
Temps generated in quadruples must be entered in the symbol table
To avoid this, we can refer to a temp value by the location of the statement that computes it
–We can have records with only 3 fields: op, arg1 and arg2
–Fields arg1 and arg2 can be pointers to ST entries or to the triple structure for temp values
–Example: Fig. 8.8(b), Fig. 8.9 on p. 471

(c) Indirect Triples
A listing of pointers to triples, rather than the triples themselves
Example
–We can use an array to list pointers to triples in the desired order
–Example: Fig. 8.10 on p. 472
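
The three representations can be compared on a := b * -c + b * -c. The encoding below is a sketch, not a reproduction of the book's figures: the tuple layouts, the operator spellings, the use of list indices as triple references, and the two small interpreters (run_quads, run_triples) are assumptions added so that the structures can be executed and checked.

```python
# Quadruples: (op, arg1, arg2, result). Temporaries t1..t5 are explicit names.
quads = [
    ("uminus", "c",  None, "t1"),
    ("*",      "b",  "t1", "t2"),
    ("uminus", "c",  None, "t3"),
    ("*",      "b",  "t3", "t4"),
    ("+",      "t2", "t4", "t5"),
    (":=",     "t5", None, "a"),
]

# Triples: (op, arg1, arg2). An int argument refers to the result of the
# triple at that index (assumption: program constants are not stored as ints).
triples = [
    ("uminus", "c", None),
    ("*",      "b", 0),
    ("uminus", "c", None),
    ("*",      "b", 2),
    ("+",      1,   3),
    ("assign", "a", 4),
]

# Indirect triples: a separate list of triple indices gives the execution
# order, so statements can be reordered without renumbering references.
order = [0, 1, 2, 3, 4, 5]


def run_quads(quads, env):
    env = dict(env)
    for op, a1, a2, res in quads:
        if op == "uminus":
            env[res] = -env[a1]
        elif op == "*":
            env[res] = env[a1] * env[a2]
        elif op == "+":
            env[res] = env[a1] + env[a2]
        elif op == ":=":
            env[res] = env[a1]
    return env


def run_triples(triples, order, env):
    env, results = dict(env), {}

    def val(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for i in order:
        op, a1, a2 = triples[i]
        if op == "uminus":
            results[i] = -val(a1)
        elif op == "*":
            results[i] = val(a1) * val(a2)
        elif op == "+":
            results[i] = val(a1) + val(a2)
        elif op == "assign":
            env[a1] = val(a2)
    return env


print(run_quads(quads, {"b": 3, "c": 2})["a"])
print(run_triples(triples, order, {"b": 3, "c": 2})["a"])
```

With b = 3 and c = 2 both interpreters assign -12 to a. The quadruple version fills its environment with t1..t5, while the triple version keeps temporaries out of the name table, which is the saving the slides describe.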

Translating Language Constructs
The rest of Chapter 8 in the Dragon book covers details on implementing:
–Declarations, scope
–Assignments, array elements, fields in records
–Boolean expressions
–Case statements
–Backpatching (filling in jump labels that are not yet known when a jump is first generated)
–Function calls