1
Intermediate Code Generation Professor Yihjia Tsai Tamkang University
2
Sanath Jayasena/Apr 2006 7-2 Introduction Intermediate representation (IR) –Generally a program for an abstract machine (can be assembly language or slightly above) –Easy to produce and translate into target code Why? –When a re-targetable compiler is needed i.e., if we are planning a portable compiler, with different back ends –Better/easier for some optimizations Machine code can be more complex
3
Sanath Jayasena/Apr 2006 7-3 [Figure: source languages (Java, ML, Pascal, C) and target machines (Sparc, MIPS, Pentium, Alpha), drawn twice; the second diagram places an Intermediate Representation between them]
4
Sanath Jayasena/Apr 2006 7-4 Introduction … contd Front end can do scanning, parsing, semantic analysis and translation to IR Back end will then optimize and generate target code IR can modularize the task –Front end not bothered about machine details –Back end not bothered about source language
5
Sanath Jayasena/Apr 2006 7-5 Introduction … contd Qualities of a good IR –Convenient for semantic analysis phase to produce –Convenient to translate into machine language of all desired target hardware –Each construct has a clear and simple meaning Easy for optimizing transformations
6
Sanath Jayasena/Apr 2006 7-6 Intermediate Representations Abstract syntax trees Postfix notation Directed acyclic graphs (DAGs) Three-address code (3AC)
7
Sanath Jayasena/Apr 2006 7-7 Abstract Syntax Trees Also called Intermediate Rep. (IR) trees –Has individual components that describe only very simple things –E.g., load, store, add, move, jump –E.g., pp. 136-139, Tiger book (see handout)
8
Sanath Jayasena/Apr 2006 7-8 Postfix Notation
For an expression E, inductively:
1. If E is a var or const, the postfix notation is E
2. If E is of the form E1 op E2, the postfix notation is E1' E2' op, where E1', E2' are the postfix notations for E1, E2
3. If E is of the form (E1), then the postfix notation for E1 is also that for E
–Parentheses are unnecessary in postfix notation
9
Sanath Jayasena/Apr 2006 7-9 Example
What are the postfix notations for (9-5)+2 and 9-(5+2)?
–(9-5)+2 in postfix notation is 95-2+
–9-(5+2) in postfix notation is 952+-
10
Sanath Jayasena/Apr 2006 7-10 Syntax-Directed Translation Translation guided by CFG ’ s –Based on “ attributes ” of language constructs E.g., type, string, number, memory location –Attach attributes to grammar symbols –Values for attributes computed by semantic rules associated with productions Translation of a language construct in terms of attributes associated with its syntactic components
11
Sanath Jayasena/Apr 2006 7-11 Syntax-Directed Translation … contd Two notations for associating semantic rules with productions in a CFG 1.Syntax-directed definitions High-level specs, details hidden, order of translation unspecified 2.Translation schemes Order of translations specified, more details shown [Dragon book: Section 2.3 and Chapter 5]
12
Sanath Jayasena/Apr 2006 7-12 Syntax-Directed Definitions For each grammar symbol: associate a set of attributes (synthesized and inherited) For each production: a set of semantic rules that define the values of attributes at a parse-tree node where that production is used Syntax-directed definition = grammar + set of semantic rules
13
Sanath Jayasena/Apr 2006 7-13 Annotated Parse Tree
A parse tree showing the attribute values at each node
Used for translation (which is an input-output mapping)
–For input x, construct the parse tree for x
–If a node n in the tree is labeled by symbol Y:
  Value of attribute p of Y at node n is denoted Y.p
  Value of Y.p is computed using the semantic rule for attribute p associated with the Y-production at n
14
Sanath Jayasena/Apr 2006 7-14 Synthesized Attributes An attribute is synthesized if its value at a parse tree node is determined from the attribute values at the child nodes Can be evaluated with a single bottom-up tree traversal (e.g., a depth-first traversal that visits children before their parent) A syntax-directed definition that uses synthesized attributes exclusively is said to be an S-attributed definition
15
Sanath Jayasena/Apr 2006 7-15 Example 1
Translating expressions into postfix
".t" is a string-valued attribute, || is concatenation
Production               Semantic Rule
expr → expr1 + term      expr.t := expr1.t || term.t || '+'
expr → expr1 - term      expr.t := expr1.t || term.t || '-'
expr → term              expr.t := term.t
term → 0                 term.t := '0'
…                        …
term → 9                 term.t := '9'
16
Sanath Jayasena/Apr 2006 7-16 Example 1 … contd
Annotated parse tree corresponding to "9-5+2" (children indented under their parent):
expr.t = 95-2+
  expr.t = 95-
    expr.t = 9
      term.t = 9
        9
    -
    term.t = 5
      5
  +
  term.t = 2
    2
17
Sanath Jayasena/Apr 2006 7-17 Example 2
Syntax-directed definition for a desk calculator program
Draw the annotated parse tree for "3*5+4 $"
Production        Semantic Rule
L → E $           print(E.val)
E → E1 + T        E.val := E1.val + T.val
E → T             E.val := T.val
T → T1 * F        T.val := T1.val × F.val
T → F             T.val := F.val
F → digit         F.val := digit.lexval
18
Sanath Jayasena/Apr 2006 7-18 Example 2 … contd
Annotated parse tree corresponding to "3*5+4 $" (children indented under their parent):
L
  E.val = 19
    E.val = 15
      T.val = 15
        T.val = 3
          F.val = 3
            digit.lexval = 3
        *
        F.val = 5
          digit.lexval = 5
    +
    T.val = 4
      F.val = 4
        digit.lexval = 4
  $
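The desk calculator definition uses only synthesized attributes, so each val can be computed as soon as the values below it are known. A minimal sketch follows, assuming single-digit operands and replacing the left-recursive productions with loops (a standard rewrite for hand-written parsers, not something the slides prescribe). The function names are hypothetical.

```python
def evaluate(text):
    """Compute E.val for input such as '3*5+4 $' (digits, '+', '*', then '$')."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("digit expected")
        pos += 1
        return int(tokens[pos - 1])          # F.val := digit.lexval

    def term():
        nonlocal pos
        val = factor()                       # T.val := F.val
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            val = val * factor()             # T.val := T1.val * F.val
        return val

    def expr():
        nonlocal pos
        val = term()                         # E.val := T.val
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            val = val + term()               # E.val := E1.val + T.val
        return val

    result = expr()
    if pos >= len(tokens) or tokens[pos] != "$":
        raise SyntaxError("'$' expected")
    return result                            # L -> E $ would print this value

print(evaluate("3*5+4 $"))   # 19
```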
19
Sanath Jayasena/Apr 2006 7-19 Inherited Attributes Value at a node is defined using attributes at siblings and/or parent of the node Useful for tracking the context of a construct –E.g., decide whether address or value of a var is needed by keeping track of whether it appears on RHS or LHS of an assignment
20
Sanath Jayasena/Apr 2006 7-20 Example
Syntax-directed definition with inherited attribute L.in for declaration of variables of type int or real
Draw the annotated parse tree for "real id1, id2, id3"
Production        Semantic Rule
D → T L           L.in := T.type
T → int           T.type := integer
T → real          T.type := real
L → L1 , id       L1.in := L.in
                  addtype(id.entry, L.in)
L → id            addtype(id.entry, L.in)
21
Sanath Jayasena/Apr 2006 7-21 Example … contd
Annotated parse tree for "real id1, id2, id3" with inherited attribute in at each L node (children indented under their parent):
D
  T.type = real
    real
  L.in = real
    L.in = real
      L.in = real
        id1
      ,
      id2
    ,
    id3
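The flow of the inherited attribute L.in can be sketched as a value passed down through recursive calls. The code below is an assumption-laden illustration: the symbol table is a plain dictionary, `declare` and `visit_L` are hypothetical names, and the input is split on commas and spaces rather than properly tokenized.

```python
def declare(declaration):
    """Record a type for every identifier in e.g. 'real id1, id2, id3'."""
    words = declaration.replace(",", " ").split()
    keyword, ids = words[0], words[1:]
    t_type = {"int": "integer", "real": "real"}[keyword]   # T.type
    symtab = {}

    def visit_L(names, inherited):
        # 'inherited' plays the role of L.in at this node.
        if len(names) > 1:
            visit_L(names[:-1], inherited)    # L -> L1 , id : L1.in := L.in
        symtab[names[-1]] = inherited         # addtype(id.entry, L.in)

    visit_L(ids, t_type)                      # D -> T L : L.in := T.type
    return symtab

print(declare("real id1, id2, id3"))
# {'id1': 'real', 'id2': 'real', 'id3': 'real'}
```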
22
Sanath Jayasena/Apr 2006 7-22 Translation Schemes
Semantic actions embedded within RHS of productions
–Unlike syntax-directed definitions, the order of evaluation of semantic rules is explicitly shown
–The action to be taken is shown by enclosing it in { }
  E.g., rterm → + term { print('+') } rterm1
–In a parse tree in this context, an action is shown by an extra child node and a dashed edge
23
Sanath Jayasena/Apr 2006 7-23 Depth-First Order L-attributed definitions –Attributes can be always evaluated in depth-first order (left-to-right) Translation schemes with restrictions motivated by L-attributed definitions ensure that an attribute value is available when an action refers to it –E.g., when only synthesized attributes exist
24
Sanath Jayasena/Apr 2006 7-24 Example
Translation scheme that maps infix expressions with addition/subtraction into corresponding postfix expressions
E → T R
R → addop T { print(addop.lexeme) } R1 | Λ
R → subop T { print(subop.lexeme) } R2 | Λ
T → num { print(num.val) }
Show the parse tree for "9-5+2"
25
Sanath Jayasena/Apr 2006 7-25 Example … contd
Parse tree for "9-5+2" showing actions; when performed in depth-first order, prints "95-2+" (children indented under their parent):
E
  T
    9
    { print('9') }
  R
    -
    T
      5
      { print('5') }
    { print('-') }
    R
      +
      T
        2
        { print('2') }
      { print('+') }
      R
        Λ
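Because the actions sit at fixed positions in the right-hand sides, a recursive-descent parser can run them as it goes. The sketch below assumes single-digit numbers and collects the printed characters in a list instead of writing to standard output; `translate`, `parse_T` and `parse_R` are hypothetical names.

```python
def translate(text):
    """Translate infix such as '9-5+2' to postfix by running embedded actions."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0
    out = []

    def parse_T():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        out.append(tokens[pos])        # T -> num { print(num.val) }
        pos += 1

    def parse_R():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            parse_T()
            out.append(op)             # action placed after T and before R1
            parse_R()
        # otherwise R -> Λ: consume nothing, emit nothing

    parse_T()                          # E -> T R
    parse_R()
    if pos != len(tokens):
        raise SyntaxError("unexpected input")
    return "".join(out)

print(translate("9-5+2"))   # 95-2+
```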
26
Sanath Jayasena/Apr 2006 7-26 Emitting a Translation For simple syntax-directed definitions, implementation possible with translation schemes where actions print additional strings in the order of appearance –[Simple: string representing the translation of the non-terminal on LHS of each production is the concatenation of translations of non-terminals on the RHS, in the same order as in the production]
27
Sanath Jayasena/Apr 2006 7-27 Example
A translation scheme derived from the Example in slide 7-15
expr → expr + term { print('+') }
expr → expr - term { print('-') }
expr → term
term → 0 { print('0') }
term → 1 { print('1') }
…
term → 9 { print('9') }
28
Sanath Jayasena/Apr 2006 7-28 Example … contd
Actions translating "9-5+2" into "95-2+" (children indented under their parent):
expr
  expr
    expr
      term
        9
        { print('9') }
    -
    term
      5
      { print('5') }
    { print('-') }
  +
  term
    2
    { print('2') }
  { print('+') }
29
Sanath Jayasena/Apr 2006 7-29 Constructing Syntax Trees Syntax-directed definitions can be used Recall: syntax tree is a condensed form of parse tree –Operators, keywords appear as interior nodes Construction: similar to postfix notation –For a subexpression, create a node for each operator and operand –Children of operator node represent operands (as subexpressions) of that operator
30
Sanath Jayasena/Apr 2006 7-30 Nodes in a Syntax Tree
A node is like a record with many fields:
–label, pointers to operand nodes, value, etc.
3 basic functions to create nodes
–mknode(op, left, right): operator node with label op and two pointer fields, left and right
–mkleaf(id, entry): an ID node with label id and a field entry pointing to the symbol-table entry
–mkleaf(num, val): a NUM node with label num and a value field containing the value of the number
31
Sanath Jayasena/Apr 2006 7-31 Example
From Example 5.7, p. 288
–What is the sequence of calls to create the syntax tree for the expression "a – 4 + c"?
p1 = mkleaf(id, entry_a);
p2 = mkleaf(num, 4);
p3 = mknode('-', p1, p2);
p4 = mkleaf(id, entry_c);
p5 = mknode('+', p3, p4);
What is the syntax tree?
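The call sequence can be tried out with a small stand-in for the node record. This is a sketch only: the `Node` class, its field names, the use of plain strings in place of symbol-table entries, and the `show` helper are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str                        # operator symbol, "id" or "num"
    left: Optional["Node"] = None     # pointer to left operand node
    right: Optional["Node"] = None    # pointer to right operand node
    value: object = None              # symbol-table entry or numeric value

def mknode(op, left, right):
    return Node(op, left, right)

def mkleaf(label, value):
    return Node(label, value=value)

# Strings stand in for the symbol-table entries entry_a and entry_c.
p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)

def show(node):
    """Render the tree fully parenthesized so its shape is visible."""
    if node.left is None:
        return str(node.value)
    return f"({show(node.left)} {node.label} {show(node.right)})"

print(show(p5))   # ((a - 4) + c)
```

The root is the + node, its left child is the - node over a and 4, and its right child is the leaf for c.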
32
Sanath Jayasena/Apr 2006 7-32 Constructing Syntax Trees … contd
A syntax-directed definition may be used for constructing a syntax tree
–Semantic rules: calls to functions mknode( ) and mkleaf( )
–E.g., for the production E → E1 + T, we may have the semantic rule
  E.nptr = mknode('+', E1.nptr, T.nptr)
–Example 5.8, p. 289
33
Sanath Jayasena/Apr 2006 7-33 DAGs for Expressions A dag for an expression identifies common subexpressions –Unlike a syntax tree, a node for a common subexpression may have > 1 parent node –E.g., “ a + a * (b-c) + (b-c) * d ” Fig. 5.11, p.291 How to create a dag, given an expression? –Check if an identical node already exists –Example 5.9, p. 291
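The "check if an identical node already exists" step can be sketched with a dictionary keyed by the node's label and children. The `DagBuilder` class and its integer node ids are assumptions for illustration, not the textbook's data structure.

```python
class DagBuilder:
    """Creates nodes on demand, reusing any node with the same label and children."""

    def __init__(self):
        self.index = {}    # (label, children...) -> node id
        self.table = []    # node id -> (label, children...)

    def _get(self, key):
        if key not in self.index:          # identical node not seen before
            self.index[key] = len(self.table)
            self.table.append(key)
        return self.index[key]

    def leaf(self, name):
        return self._get(("id", name))

    def node(self, op, left, right):
        return self._get((op, left, right))

# a + a * (b-c) + (b-c) * d
dag = DagBuilder()
a = dag.leaf("a")
b_minus_c = dag.node("-", dag.leaf("b"), dag.leaf("c"))
left_sum = dag.node("+", a, dag.node("*", dag.leaf("a"), b_minus_c))
again = dag.node("-", dag.leaf("b"), dag.leaf("c"))   # reuses the earlier node
root = dag.node("+", left_sum, dag.node("*", again, dag.leaf("d")))

print(again == b_minus_c)   # True
print(len(dag.table))       # 9
```

The syntax tree for the same expression would have 13 nodes (7 leaves and 6 operators); sharing a, b, c and b-c brings the dag down to 9.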
34
Sanath Jayasena/Apr 2006 7-34 Review Example: for the assignment statement, a = b * -c + b * -c, give a syntax tree, dag and postfix notation Fig. 8.2, p. 464
35
Sanath Jayasena/Apr 2006 7-35 Three-Address Code (3AC)
3AC is a sequence of statements of the general form x := y op z
–x, y, z are names, constants, or generated temporaries
–op is any operator (arithmetic, logical)
3AC means each statement usually has 3 addresses (2 for operands, 1 for the result)
36
Sanath Jayasena/Apr 2006 7-36 Examples
Given the expression x+y*z, the 3AC is
t1 := y * z
t2 := x + t1
Show 3AC for (a) the syntax tree, (b) the dag discussed earlier in slide 7-34 (Fig. 8.2)
–Fig. 8.5, p. 466
37
Sanath Jayasena/Apr 2006 7-37 3AC … contd A name in a program replaced by a pointer to a symbol table entry for that name 3AC statements are like assembly code –There are flow-control statements –They can have symbolic labels –A label represents the index of a 3AC statement in an array containing the intermediate code
38
Sanath Jayasena/Apr 2006 7-38 Types of 3AC Statements
1. Assignment statements with binary operators (arithmetic or logical)
–Of the form x := y op z
2. Assignment statements with unary operators (minus, logical not, shift, etc.)
–Of the form x := op y
3. Copy statements
–Of the form x := y
39
Sanath Jayasena/Apr 2006 7-39 Types of 3AC Statements … contd
4. Unconditional jump: goto L
–Statement with label L is to be executed next
5. Conditional jump: if x relop y goto L
–A relational operator (<, =, >=, …) is applied to x and y
–If the relation holds, the statement with label L is executed next
–If not, the statement following the conditional jump is executed
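The jump statements are easiest to see when a label is just an index into the array of statements. The toy interpreter below is a sketch under that assumption; the tuple encoding of statements and the names `run`, `ARITH` and `RELOPS` are invented for this example.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}
RELOPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
          "!=": operator.ne, ">": operator.gt, ">=": operator.ge}

def run(code, env):
    """Execute 3AC-like tuples; a jump target is an index into 'code'."""
    env = dict(env)

    def val(operand):
        # Strings are names looked up in env; anything else is a constant.
        return env[operand] if isinstance(operand, str) else operand

    pc = 0
    while pc < len(code):
        stmt = code[pc]
        if stmt[0] == "binop":                 # x := y op z
            _, x, y, op, z = stmt
            env[x] = ARITH[op](val(y), val(z))
        elif stmt[0] == "copy":                # x := y
            _, x, y = stmt
            env[x] = val(y)
        elif stmt[0] == "goto":                # goto L
            pc = stmt[1]
            continue
        elif stmt[0] == "if":                  # if x relop y goto L
            _, x, relop, y, target = stmt
            if RELOPS[relop](val(x), val(y)):
                pc = target
                continue
        pc += 1                                # fall through to next statement
    return env

# Sum 1..n; index 6 is one past the last statement, so jumping there halts.
program = [
    ("copy", "s", 0),              # 0: s := 0
    ("copy", "i", 1),              # 1: i := 1
    ("if", "i", ">", "n", 6),      # 2: if i > n goto 6
    ("binop", "s", "s", "+", "i"), # 3: s := s + i
    ("binop", "i", "i", "+", 1),   # 4: i := i + 1
    ("goto", 2),                   # 5: goto 2
]
print(run(program, {"n": 4})["s"])   # 10
```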
40
Sanath Jayasena/Apr 2006 7-40 Types of 3AC Statements … contd
6. Function calls: param x, call p, n and return y
–"return y" is optional
–E.g., for call p(x1, x2, …, xn) the 3AC will be
param x1
param x2
…
param xn
call p, n
41
Sanath Jayasena/Apr 2006 7-41 Types of 3AC Statements … contd 7.Indexed assignments: x := y[i], x[i] := y –In x:=y[i] : x is set to the value in location i units beyond memory location y –In x[i]:=y : value in location i units beyond memory location x is set to the value of y –x, y and i are data objects
42
Sanath Jayasena/Apr 2006 7-42 Types of 3AC Statements … contd 8.Address & pointer assignments: x := &y, x := *y, *x := y –In x:= &y : x is set to be the location of y y denotes an l -value, x is a pointer name –In x:= *y : ( r -value of) x is set to the value in location pointed by y y is a pointer; r -value of y is a location –In *x:= y : ( r -value of) object pointed by x is set to (the r -value of) y
43
Sanath Jayasena/Apr 2006 7-43 Syntax-Dir. Translation into 3AC
When 3AC code is generated, temp names are made up for interior nodes in the syntax tree
–E.g., for E → E1 + E2, the value of E on the LHS will be computed into a new temp t
Example
–Fig. 8.6, Fig 8.7 on p. 469
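Generating a fresh temporary for every interior node can be sketched as a post-order walk over the syntax tree. The tuple encoding of the tree, the `uminus` operator name and the function `gen_3ac` are assumptions for this illustration; it is not a transcription of the textbook figures.

```python
def gen_3ac(target, expr):
    """Return 3AC lines for 'target := expr'.

    Hypothetical tree encoding: a leaf is a name, (op, e) is a unary node,
    (op, e1, e2) is a binary node.
    """
    code = []
    count = 0

    def newtemp():
        nonlocal count
        count += 1
        return f"t{count}"

    def walk(e):
        if isinstance(e, str):
            return e                        # a name needs no code
        if len(e) == 2:
            op, operand = e
            place = walk(operand)
            temp = newtemp()                # one temp per interior node
            code.append(f"{temp} := {op} {place}")
            return temp
        op, left, right = e
        left_place = walk(left)
        right_place = walk(right)
        temp = newtemp()
        code.append(f"{temp} := {left_place} {op} {right_place}")
        return temp

    place = walk(expr)
    code.append(f"{target} := {place}")
    return code

# a := b * -c + b * -c
product = ("*", "b", ("uminus", "c"))
for line in gen_3ac("a", ("+", product, product)):
    print(line)
# t1 := uminus c
# t2 := b * t1
# t3 := uminus c
# t4 := b * t3
# t5 := t2 + t4
# a := t5
```

Walking a tree recomputes b * -c; generating from a dag instead would emit the shared subexpression once.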
44
Sanath Jayasena/Apr 2006 7-44 Implementation of 3AC 3AC is an abstract form –Can be implemented in a compiler as records –(with fields for operator and operands) Three representations –Quadruples –Triples –Indirect triples
45
Sanath Jayasena/Apr 2006 7-45 (a) Quadruples
A record structure with 4 fields
–op, arg1, arg2 and result
Examples
–For x := y op z we have: y in arg1, z in arg2 and x in result
–For unary operators, arg2 is not used
–For the param operator, arg2 and result are unused
–Fig. 8.8(a), p. 471 for a := b * -c + b * -c
The contents of the fields are pointers to symbol-table (ST) entries
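A quadruple maps naturally onto a four-field record. The sketch below lists quadruples for a := b * -c + b * -c and evaluates them; the `Quad` type, the use of names instead of symbol-table pointers, and the `run_quads` helper are assumptions for illustration.

```python
from typing import NamedTuple, Optional

class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]    # None when the operator does not use arg2
    result: str

quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "a"),
]

def run_quads(quads, env):
    """Evaluate the quadruples; temporaries live in env like ordinary names."""
    env = dict(env)
    for q in quads:
        if q.op == "uminus":
            env[q.result] = -env[q.arg1]
        elif q.op == "*":
            env[q.result] = env[q.arg1] * env[q.arg2]
        elif q.op == "+":
            env[q.result] = env[q.arg1] + env[q.arg2]
        elif q.op == ":=":
            env[q.result] = env[q.arg1]
        else:
            raise ValueError(f"unknown operator {q.op}")
    return env

print(run_quads(quads, {"b": 3, "c": 2})["a"])   # -12
```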
46
Sanath Jayasena/Apr 2006 7-46 (b) Triples Temps generated in quadruples must be entered in symbol table To avoid this, we can refer to a temp value by the location of the relevant statement –We can have records with only 3 fields op, arg1 and arg2 –Fields arg1 and arg2 can be pointers to ST entries or to triple structure for temp values –Example: Fig 8.8(b), Fig. 8.9 on p. 471
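With triples, a temporary is replaced by the position of the statement that computes it. In the sketch below an integer argument is a reference to an earlier triple and a string is a name; this encoding, the `assign` operator name and `run_triples` are assumptions for illustration.

```python
# a := b * -c + b * -c as triples (op, arg1, arg2)
triples = [
    ("uminus", "c", None),   # (0)
    ("*", "b", 0),           # (1) second operand is the value of triple 0
    ("uminus", "c", None),   # (2)
    ("*", "b", 2),           # (3)
    ("+", 1, 3),             # (4)
    ("assign", "a", 4),      # (5) a := value of triple 4
]

def run_triples(triples, env):
    env = dict(env)
    results = []                     # results[i] is the value of triple i

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for op, arg1, arg2 in triples:
        if op == "uminus":
            results.append(-value(arg1))
        elif op == "*":
            results.append(value(arg1) * value(arg2))
        elif op == "+":
            results.append(value(arg1) + value(arg2))
        elif op == "assign":
            env[arg1] = value(arg2)
            results.append(env[arg1])
        else:
            raise ValueError(f"unknown operator {op}")
    return env

print(run_triples(triples, {"b": 3, "c": 2}))   # {'b': 3, 'c': 2, 'a': -12}
```

No temporary names appear in the environment afterwards, which is the point of the representation.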
47
Sanath Jayasena/Apr 2006 7-47 (c) Indirect Triples Listing of pointers to triples, rather than triples themselves Example –We can use an array to list pointers to triples in the desired order –Example: Fig 8.10 on p. 472
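Indirect triples add a separate list giving the execution order, so statements can be reordered without renumbering the references inside the triples. The following is a minimal sketch with a hypothetical encoding (integer arguments refer to positions in the pool of triples); `run_indirect` and the sample data are invented for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Pool of triples; integer arguments refer to pool positions.
pool = [
    ("+", "a", "b"),   # (0)
    ("*", "c", "d"),   # (1)
    ("-", 0, 1),       # (2) value of (0) minus value of (1)
]

def run_indirect(pool, order, env):
    """Execute the triples in the sequence given by 'order'."""
    results = {}                      # pool position -> computed value

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for index in order:
        op, arg1, arg2 = pool[index]
        results[index] = OPS[op](value(arg1), value(arg2))
    return results[order[-1]]         # value of the last statement executed

env = {"a": 1, "b": 2, "c": 3, "d": 4}
print(run_indirect(pool, [0, 1, 2], env))   # -9
print(run_indirect(pool, [1, 0, 2], env))   # -9, pool left untouched
```

Swapping the two independent statements changes only the order list; with plain triples the same move would require rewriting the references in statement (2).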
48
Sanath Jayasena/Apr 2006 7-48 Translating Language Constructs Balance of Chapter 8 in Dragon book covers details on implementing: –Declarations, scope –Assignments, array elements, fields in records –Boolean expressions –Case statements –Label renaming (called backpatching) –Function calls