1
Chapter 2 Chang Chi-Chung 2008.03 rev.1
2
A Simple Syntax-Directed Translator
This chapter contains introductory material for Chapters 3 to 8. The goal is to create a syntax-directed translator that maps infix arithmetic expressions into postfix expressions.
Building a simple compiler involves:
Defining the syntax of a programming language
Developing a source code parser (our compiler uses predictive parsing)
Implementing syntax-directed translation to generate intermediate code
3
A Code Fragment To Be Translated
The syntax-directed translator is extended to map code fragments into three-address code. See Appendix A.
Source fragment:
  {
    int i; int j; float[100] a; float v; float x;
    while ( true ) {
      do i = i + 1; while ( a[i] < v );
      do j = j - 1; while ( a[j] > v );
      if ( i >= j ) break;
      x = a[i]; a[i] = a[j]; a[j] = x;
    }
  }
Three-address code:
   1: i = i + 1
   2: t1 = a [ i ]
   3: if t1 < v goto 1
   4: j = j - 1
   5: t2 = a [ j ]
   6: if t2 > v goto 4
   7: ifFalse i >= j goto 9
   8: goto 14
   9: x = a [ i ]
  10: t3 = a [ j ]
  11: a [ i ] = t3
  12: a [ j ] = x
  13: goto 1
  14:
4
A Model of a Compiler Front End
Source program (character stream) → Lexical analyzer → token stream → Parser → syntax tree → Intermediate code generator → three-address code
Symbol table: shared by the lexical analyzer, the parser, and the intermediate code generator
5
Two Forms of Intermediate Code
Abstract syntax tree:
  do-while
    body: assign( i, +( i, 1 ) )
    condition: >( [ ]( a, i ), v )
Three-address instructions:
  1: i = i + 1
  2: t1 = a [ i ]
  3: if t1 < v goto 1
6
Syntax Definition
Syntax is defined using a context-free grammar (CFG), written in BNF (Backus-Naur Form).
A context-free grammar has four components:
A set of tokens (terminal symbols)
A set of nonterminals
A set of productions
A designated start symbol
7
Example of CFG
G = <T, N, P, S>
T = { +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
N = { list, digit }
P =
  list → list + digit
  list → list - digit
  list → digit
  digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
S = list
8
Derivations
The language of a CFG is the set of all strings (sequences of tokens) that can be generated from it by derivation:
Begin with the start symbol.
Repeatedly replace a nonterminal symbol in the current sentential form with one of the right-hand sides of a production for that nonterminal.
9
Example of the Derivations
A leftmost derivation replaces the leftmost nonterminal in each step. A rightmost derivation replaces the rightmost nonterminal in each step.
Leftmost derivation of 9 - 5 + 2:
  list ⇒ list + digit
       ⇒ list - digit + digit
       ⇒ digit - digit + digit
       ⇒ 9 - digit + digit
       ⇒ 9 - 5 + digit
       ⇒ 9 - 5 + 2
Productions:
  list → list + digit
  list → list - digit
  list → digit
  digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
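The leftmost derivation above can be reproduced mechanically. The following is a minimal sketch, not part of the original slides: the class name LeftmostDerivation and the method derive are hypothetical, and the input is assumed to be well formed (single digits separated by + or -, no spaces).

```java
import java.util.ArrayList;
import java.util.List;

class LeftmostDerivation {
    // Returns the sentential forms of the leftmost derivation of `input` under
    //   list -> list + digit | list - digit | digit
    //   digit -> 0 | 1 | ... | 9
    // Assumption: input looks like "9-5+2" (digits at even positions, operators at odd ones).
    static List<String> derive(String input) {
        List<String> steps = new ArrayList<>();
        steps.add("list");
        String suffix = "";
        // Expand the leftmost nonterminal `list`, starting from the last operator.
        for (int i = input.length() - 2; i >= 1; i -= 2) {
            suffix = " " + input.charAt(i) + " digit" + suffix;
            steps.add("list" + suffix);
        }
        // list -> digit ends the chain of list expansions.
        steps.add("digit" + suffix);
        // Replace each `digit` from left to right with the actual token.
        String[] parts = ("digit" + suffix).split(" ");
        for (int k = 0; k < parts.length; k += 2) {
            parts[k] = String.valueOf(input.charAt(k));
            steps.add(String.join(" ", parts));
        }
        return steps;
    }

    public static void main(String[] args) {
        // Prints the seven sentential forms, from "list" to "9 - 5 + 2".
        System.out.println(String.join("\n  => ", derive("9-5+2")));
    }
}
```

Each printed line differs from the previous one by exactly one production applied to the leftmost nonterminal.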
10
Parse Trees
Given a CFG, a parse tree according to the grammar is a tree with the following properties:
The root of the tree is labeled by the start symbol.
Each leaf of the tree is labeled by a terminal (= token) or by ε.
Each interior node is labeled by a nonterminal.
If A → X1 X2 … Xn is a production, then node A has immediate children X1, X2, …, Xn, where each Xi is a terminal, a nonterminal, or ε (ε denotes the empty string).
Example: for the production A → X Y Z, a node labeled A has the three children X, Y, and Z, from left to right.
11
Example of the Parse Tree
Parse tree of the string 9-5+2 using grammar G:
  list
    list
      list
        digit: 9
      -
      digit: 5
    +
    digit: 2
The sequence of leaves, read from left to right, is called the yield of the parse tree.
12
Ambiguity
Consider the context-free grammar G with the following productions:
P = string → string + string | string - string | 0 | 1 | … | 9
This grammar is ambiguous, because more than one parse tree represents the string 9-5+2.
13
Ambiguity (Cont'd)
Two parse trees for the string 9-5+2: one groups it as (9-5)+2, the other as 9-(5+2).
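The two parse trees matter because they give the expression different values. A minimal sketch of the arithmetic follows; the class and method names are hypothetical and only illustrate the two groupings.

```java
class AmbiguityDemo {
    // Grouping that corresponds to the parse tree with '+' at the root.
    static int leftGrouping() {
        return (9 - 5) + 2;
    }

    // Grouping that corresponds to the parse tree with '-' at the root.
    static int rightGrouping() {
        return 9 - (5 + 2);
    }

    public static void main(String[] args) {
        System.out.println("(9-5)+2 = " + leftGrouping());   // prints (9-5)+2 = 6
        System.out.println("9-(5+2) = " + rightGrouping());  // prints 9-(5+2) = 2
    }
}
```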
14
Associativity of Operators
Left-associative
If an operand has an operator on both sides of it, then it belongs to the operator to its left.
The string a+b+c has the same meaning as (a+b)+c.
Left-associative operators have left-recursive productions:
  left → left + term | term
Right-associative
If an operand has an operator on both sides of it, then it belongs to the operator to its right.
The string a=b=c has the same meaning as a=(b=c).
Right-associative operators have right-recursive productions:
  right → term = right | term
15
Associativity of Operators (cont'd)
Left-associative: the parse tree for a+b+c grows down toward the left.
Right-associative: the parse tree for a=b=c grows down toward the right.
16
Precedence of Operators
The string 9+5*2 has the same meaning as 9+(5*2): * has higher precedence than +.
Construct a grammar for arithmetic expressions that encodes operator precedence:
  left-associative: + -  (expr)
  left-associative: * /  (term)
Step 1: factor → digit | ( expr )
Step 2: term → term * factor | term / factor | factor
Step 3: expr → expr + term | expr - term | term
Step 4 (the complete grammar):
  expr → expr + term | expr - term | term
  term → term * factor | term / factor | factor
  factor → digit | ( expr )
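To see the precedence levels at work, here is a minimal sketch of an evaluator with one method per nonterminal of the grammar above. It is not from the slides: the class name PrecedenceEval is hypothetical, the left-recursive productions are implemented as loops (an equivalent form for left-associative operators), and the input is assumed to contain only single digits, the four operators, and parentheses, with no spaces.

```java
class PrecedenceEval {
    private final String src;
    private int pos = 0;

    private PrecedenceEval(String src) { this.src = src; }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    // expr -> expr + term | expr - term | term   (loop form, left-associative)
    private int expr() {
        int value = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // term -> term * factor | term / factor | factor   (binds tighter than expr)
    private int term() {
        int value = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;  // integer division
        }
        return value;
    }

    // factor -> digit | ( expr )
    private int factor() {
        char c = peek();
        if (Character.isDigit(c)) { pos++; return c - '0'; }
        if (c == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') throw new IllegalArgumentException("expected ) at position " + pos);
            pos++;
            return value;
        }
        throw new IllegalArgumentException("unexpected input at position " + pos);
    }

    static int eval(String s) {
        PrecedenceEval p = new PrecedenceEval(s);
        int value = p.expr();
        if (p.pos != s.length()) throw new IllegalArgumentException("trailing input at position " + p.pos);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(eval("9+5*2"));    // prints 19
        System.out.println(eval("(9+5)*2"));  // prints 28
    }
}
```

Because term() is called from expr(), every * or / is fully evaluated before the surrounding + or - is applied, which is exactly what the layered grammar expresses.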
17
An Example: Syntax of Statements
The grammar is a subset of Java statements.
  stmt → id = expression ;
       | if ( expression ) stmt
       | if ( expression ) stmt else stmt
       | while ( expression ) stmt
       | do stmt while ( expression ) ;
       | { stmts }
  stmts → stmts stmt
        | ε
This approach prevents the build-up of semicolons after statements such as if- and while-statements, which end with nested substatements.
18
Syntax-Directed Translation
Syntax-directed translation is done by attaching rules or program fragments to productions in a grammar.
In this chapter: translate infix expressions into postfix notation.
  Infix: 9 - 5 + 2
  Postfix: 9 5 - 2 +
An example production:
  expr → expr1 + term
The pseudocode of the translation:
  translate expr1 ;
  translate term ;
  handle + ;
19
Syntax-Directed Translation (Cont'd)
Two concepts (approaches) related to syntax-directed translation:
Syntax-directed definition (synthesized attributes): builds up a translation by attaching strings, computed by semantic rules, as attributes to the nodes in the parse tree.
Translation scheme (syntax-directed translation): builds up a translation by program fragments, called semantic actions, that are embedded within production bodies.
20
Syntax-directed definition
A syntax-directed definition associates:
With each grammar symbol (terminals and nonterminals), a set of attributes.
With each production, a set of semantic rules for computing the values of the attributes associated with the symbols appearing in the production.
An attribute is said to be:
Synthesized, if its value at a parse-tree node is determined from attribute values at its children and at the node itself.
Inherited, if its value at a parse-tree node is determined from attribute values at the node itself, its parent, and its siblings in the parse tree.
21
An Example: Synthesized Attributes
An annotated parse tree
Suppose a node N in a parse tree is labeled by grammar symbol X. Then X.a denotes the value of attribute a of X at node N.
Annotated parse tree for 9-5+2:
  expr.t = "95-2+"
    expr.t = "95-"
      expr.t = "9"
        term.t = "9"
          9
      -
      term.t = "5"
        5
    +
    term.t = "2"
      2
22
Semantic Rules
  Production               Semantic Rules
  expr → expr1 + term      expr.t = expr1.t || term.t || '+'
  expr → expr1 - term      expr.t = expr1.t || term.t || '-'
  expr → term              expr.t = term.t
  term → 0                 term.t = '0'
  term → 1                 term.t = '1'
  …                        …
  term → 9                 term.t = '9'
|| is the operator for string concatenation in the semantic rules.
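The semantic rules can be tried out on a hand-built parse tree. The sketch below is illustrative only: the classes SynthesizedAttr and Node, and the helpers term, eval, and sample, are hypothetical names, and the parse tree for 9-5+2 is constructed by hand rather than by a parser.

```java
import java.util.Arrays;
import java.util.List;

class SynthesizedAttr {
    static class Node {
        final String symbol;        // nonterminal name, or the terminal itself
        final List<Node> children;
        String t;                   // synthesized attribute: postfix translation
        Node(String symbol, Node... children) {
            this.symbol = symbol;
            this.children = Arrays.asList(children);
        }
    }

    // Builds the subtree for: term -> d
    static Node term(char d) {
        return new Node("term", new Node(String.valueOf(d)));
    }

    // Computes t at every node: children first, then the node itself.
    static String eval(Node n) {
        for (Node c : n.children) eval(c);
        if (n.symbol.equals("term")) {
            n.t = n.children.get(0).symbol;                 // term.t = 'digit'
        } else if (n.symbol.equals("expr") && n.children.size() == 1) {
            n.t = n.children.get(0).t;                      // expr.t = term.t
        } else if (n.symbol.equals("expr")) {
            // expr.t = expr1.t || term.t || operator
            n.t = n.children.get(0).t + n.children.get(2).t + n.children.get(1).symbol;
        }
        return n.t;  // stays null for terminal leaves, which carry no attribute here
    }

    // Hand-built parse tree for 9-5+2.
    static Node sample() {
        Node e9 = new Node("expr", term('9'));
        Node e95 = new Node("expr", e9, new Node("-"), term('5'));
        return new Node("expr", e95, new Node("+"), term('2'));
    }

    public static void main(String[] args) {
        System.out.println(eval(sample()));  // prints 95-2+
    }
}
```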
23
Depth-First Traversals
Tree traversals:
  Breadth-first
  Depth-first
    Preorder: N L R
    Inorder: L N R
    Postorder: L R N
Depth-first traversal used here: postorder, visiting children from left to right.
  procedure visit(node N) {
    for ( each child C of N, from left to right ) {
      visit(C);
    }
    evaluate semantic rules at node N;
  }
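The three depth-first orders can be compared on a small binary syntax tree. This is a minimal sketch with hypothetical names (Traversals, Node, sample); the tree is the syntax tree of 9-5+2, with + at the root.

```java
class Traversals {
    static class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
        Node(String label) { this(label, null, null); }
    }

    // Preorder: N L R
    static String preorder(Node n) {
        return n == null ? "" : n.label + preorder(n.left) + preorder(n.right);
    }

    // Inorder: L N R
    static String inorder(Node n) {
        return n == null ? "" : inorder(n.left) + n.label + inorder(n.right);
    }

    // Postorder: L R N
    static String postorder(Node n) {
        return n == null ? "" : postorder(n.left) + postorder(n.right) + n.label;
    }

    // Syntax tree for 9-5+2: '+' at the root, '-' as its left child.
    static Node sample() {
        return new Node("+", new Node("-", new Node("9"), new Node("5")), new Node("2"));
    }

    public static void main(String[] args) {
        Node tree = sample();
        System.out.println(preorder(tree));   // prints +-952
        System.out.println(inorder(tree));    // prints 9-5+2
        System.out.println(postorder(tree));  // prints 95-2+
    }
}
```

The postorder string is the postfix form of the expression, which is why postorder is the traversal used for evaluating synthesized attributes.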
24
Example: Depth-First Traversals
Postorder evaluation of the attributes in the parse tree for 9-5+2:
  term.t = 9
  expr.t = 9
  term.t = 5
  expr.t = 95-
  term.t = 2
  expr.t = 95-2+
Note: all attributes are synthesized.
25
Translation Schemes
A translation scheme is a CFG with semantic actions embedded in the production bodies.
Example:
  rest → + term { print('+') } rest1
In the parse tree, the embedded semantic action becomes an extra child of the node for rest:
  rest
    +
    term
    { print('+') }
    rest1
26
An Example: Translation Scheme
  expr → expr + term { print('+') }
  expr → expr - term { print('-') }
  expr → term
  term → 0 { print('0') }
  term → 1 { print('1') }
  …
  term → 9 { print('9') }
For the input 9 - 5 + 2, a depth-first traversal of the parse tree executes the actions in this order:
  { print('9') } { print('5') } { print('-') } { print('2') } { print('+') }
27
Parsing
Parsing is the process of determining whether a string of terminals (tokens) can be generated by a grammar.
Time complexity:
For any CFG there is a parser that takes at most O(n^3) time to parse a string of n terminals.
Linear algorithms suffice to parse essentially all languages that arise in practice.
Two kinds of methods:
Top-down: constructs a parse tree from root to leaves
Bottom-up: constructs a parse tree from leaves to root
28
Top-Down Parsing
Recursive descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is used to process the input.
One procedure is associated with each nonterminal of the grammar.
If a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input lookahead information.
Predictive parsing
A special form of recursive descent parsing.
The lookahead symbol unambiguously determines the flow of control through the procedure body for each nonterminal.
29
An Example: Top-Down Parsing
  stmt → expr ;
       | if ( expr ) stmt
       | for ( optexpr ; optexpr ; optexpr ) stmt
       | other
  optexpr → ε
          | expr
Parse tree for the input for ( ; expr ; expr ) other:
  stmt
    for
    (
    optexpr: ε
    ;
    optexpr: expr
    ;
    optexpr: expr
    )
    stmt: other
30
Pseudocode for a Predictive Parser
  stmt → expr ;
       | if ( expr ) stmt
       | for ( optexpr ; optexpr ; optexpr ) stmt
       | other
  optexpr → ε
          | expr

  void stmt() {
    switch ( lookahead ) {
      case expr:
        match(expr); match(';'); break;
      case if:
        match(if); match('('); match(expr); match(')'); stmt();
        break;
      case for:
        match(for); match('(');
        optexpr(); match(';'); optexpr(); match(';'); optexpr();
        match(')'); stmt(); break;
      case other:
        match(other); break;
      default:
        report("syntax error");
    }
  }

  void optexpr() {
    // Use of the ε-production: do nothing if the lookahead is not expr.
    if ( lookahead == expr ) match(expr);
  }

  void match(terminal t) {
    if ( lookahead == t ) lookahead = nextTerminal;
    else report("syntax error");
  }
31
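Example: Predictive Parsing
Input: for ( ; expr ; expr ) other
The parser is LL(1): it uses one symbol of lookahead. With lookahead = for, stmt() selects the for-production and executes:
  match(for) match('(') optexpr() match(';') optexpr() match(';') optexpr() match(')') stmt()
This builds the parse tree node stmt with children for ( optexpr ; optexpr ; optexpr ) stmt.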
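The predictive parser for this grammar can be sketched as runnable Java. This is an illustration under stated assumptions, not the slides' code: tokens are supplied as plain strings (so "expr" and "other" stand for whole token classes), and the class StmtParser and the helper accepts are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class StmtParser {
    private final List<String> tokens;
    private int pos = 0;

    StmtParser(String... tokens) { this.tokens = Arrays.asList(tokens); }

    // "$" marks the end of input (an assumption of this sketch).
    private String lookahead() { return pos < tokens.size() ? tokens.get(pos) : "$"; }

    private void match(String t) {
        if (lookahead().equals(t)) pos++;
        else throw new IllegalStateException("syntax error: expected " + t + ", saw " + lookahead());
    }

    // stmt -> expr ; | if ( expr ) stmt | for ( optexpr ; optexpr ; optexpr ) stmt | other
    void stmt() {
        switch (lookahead()) {
            case "expr":
                match("expr"); match(";"); break;
            case "if":
                match("if"); match("("); match("expr"); match(")"); stmt(); break;
            case "for":
                match("for"); match("(");
                optexpr(); match(";"); optexpr(); match(";"); optexpr();
                match(")"); stmt(); break;
            case "other":
                match("other"); break;
            default:
                throw new IllegalStateException("syntax error at " + lookahead());
        }
    }

    // optexpr -> expr | ε   (the ε-production consumes nothing)
    void optexpr() {
        if (lookahead().equals("expr")) match("expr");
    }

    // True if the whole token sequence is a stmt.
    static boolean accepts(String... tokens) {
        StmtParser p = new StmtParser(tokens);
        try {
            p.stmt();
            return p.pos == p.tokens.size();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("for", "(", ";", "expr", ";", "expr", ")", "other"));  // prints true
        System.out.println(accepts("for", "(", "expr", ")", "other"));                    // prints false
    }
}
```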
32
FIRST
FIRST(α) is the set of terminals that appear as the first symbols of one or more strings generated from α, where α is a sentential form.
Example:
  FIRST( stmt ) = { expr, if, for, other }
  FIRST( expr ; ) = { expr }
Grammar:
  stmt → expr ;
       | if ( expr ) stmt
       | for ( optexpr ; optexpr ; optexpr ) stmt
       | other
33
Examples: First
  type → simple
       | ^ id
       | array [ simple ] of type
  simple → integer
         | char
         | num dotdot num
  FIRST(simple) = { integer, char, num }
  FIRST(^ id) = { ^ }
  FIRST(type) = { integer, char, num, ^, array }
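FIRST sets like these can be computed by iterating until nothing changes. The following is a minimal sketch of that fixed-point computation, not an algorithm given in the slides: the class FirstSets, the map-based grammar encoding, and the convention that an empty body stands for an ε-production are all assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FirstSets {
    static final String EPS = "ε";

    // Grammar encoding: nonterminal -> list of production bodies.
    // Any symbol that is not a key of the map is treated as a terminal.
    static Map<String, Set<String>> first(Map<String, List<List<String>>> g) {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        for (String nt : g.keySet()) first.put(nt, new TreeSet<>());
        boolean changed = true;
        while (changed) {                       // repeat until no set grows
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : g.entrySet()) {
                Set<String> target = first.get(e.getKey());
                for (List<String> body : e.getValue()) {
                    boolean allNullable = true;
                    for (String sym : body) {
                        if (!g.containsKey(sym)) {          // terminal: it starts the body
                            changed |= target.add(sym);
                            allNullable = false;
                            break;
                        }
                        Set<String> f = first.get(sym);     // nonterminal: copy FIRST minus ε
                        for (String s : new ArrayList<>(f)) {
                            if (!s.equals(EPS)) changed |= target.add(s);
                        }
                        if (!f.contains(EPS)) { allNullable = false; break; }
                    }
                    if (allNullable) changed |= target.add(EPS);
                }
            }
        }
        return first;
    }

    // The type grammar from the slide above.
    static Map<String, List<List<String>>> typeGrammar() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("type", Arrays.asList(
                Arrays.asList("simple"),
                Arrays.asList("^", "id"),
                Arrays.asList("array", "[", "simple", "]", "of", "type")));
        g.put("simple", Arrays.asList(
                Arrays.asList("integer"),
                Arrays.asList("char"),
                Arrays.asList("num", "dotdot", "num")));
        return g;
    }

    public static void main(String[] args) {
        // Each set is a TreeSet, so its elements print in sorted order.
        System.out.println(first(typeGrammar()));
    }
}
```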
34
Designing a Predictive Parser
A predictive parser is a program consisting of a procedure for every nonterminal. The procedure for nonterminal A does two things:
1. It decides which A-production to use by examining the lookahead symbol. Issues to handle: left factoring, left recursion, and ε-productions.
2. It mimics the body of the chosen production.
Applying a translation scheme:
1. Construct a predictive parser, ignoring the actions.
2. Copy the actions from the translation scheme into the parser.
35
Left Factor
Problem: two or more productions for nonterminal A start with the same symbols, so the lookahead cannot choose between them.
Example:
  stmt → if ( expr ) stmt
       | if ( expr ) stmt else stmt
Use left factoring to fix it:
  stmt → if ( expr ) stmt rest
  rest → else stmt | ε
36
Left Recursion
A production is left recursive if its body for nonterminal A starts with a reference to A itself:
  A → A α | β
An example:
  expr → expr + term | term
Rewrite the left-recursive productions as right-recursive ones by using the following rules:
  A → β R
  R → α R | ε
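The rewriting rule is mechanical enough to express as a small function. The sketch below is an illustration only: the class LeftRecursion, the method eliminate, and the list-of-strings encoding of production bodies are hypothetical, and only immediate left recursion on a single nonterminal is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LeftRecursion {
    // Rewrites  A -> A α1 | ... | A αm | β1 | ... | βn
    // into      A -> β1 R | ... | βn R
    //           R -> α1 R | ... | αm R | ε      (ε is encoded as an empty body)
    static Map<String, List<List<String>>> eliminate(String a, String r, List<List<String>> bodies) {
        List<List<String>> aBodies = new ArrayList<>();
        List<List<String>> rBodies = new ArrayList<>();
        for (List<String> body : bodies) {
            if (!body.isEmpty() && body.get(0).equals(a)) {
                // Left-recursive body A α becomes R -> α R.
                List<String> alphaR = new ArrayList<>(body.subList(1, body.size()));
                alphaR.add(r);
                rBodies.add(alphaR);
            } else {
                // Other body β becomes A -> β R.
                List<String> betaR = new ArrayList<>(body);
                betaR.add(r);
                aBodies.add(betaR);
            }
        }
        rBodies.add(new ArrayList<>());  // R -> ε
        Map<String, List<List<String>>> result = new LinkedHashMap<>();
        result.put(a, aBodies);
        result.put(r, rBodies);
        return result;
    }

    // expr -> expr + term | expr - term | term, with "rest" as the new nonterminal.
    static Map<String, List<List<String>>> exprExample() {
        return eliminate("expr", "rest", Arrays.asList(
                Arrays.asList("expr", "+", "term"),
                Arrays.asList("expr", "-", "term"),
                Arrays.asList("term")));
    }

    public static void main(String[] args) {
        // prints {expr=[[term, rest]], rest=[[+, term, rest], [-, term, rest], []]}
        System.out.println(exprExample());
    }
}
```

The printed result reads as expr → term rest and rest → + term rest | - term rest | ε.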
37
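Example: Left and Right Recursive
Both grammars generate the string β α α … α.
Left recursive: the parse tree is a chain A, A, …, A that grows down toward the left, with β at the bottom.
Right recursive: the parse tree has A at the root and a chain R, R, …, R that grows down toward the right, ending in ε.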
38
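Abstract and Concrete Syntax
Abstract syntax tree for 9-5+2: +( -( 9, 5 ), 2 ), with operators at interior nodes and operands as their children.
Concrete syntax tree (parse tree) for 9-5+2: interior nodes are nonterminals such as expr and term, which act as helpers.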
39
Conclusion: Parsing and Translation Scheme
Given a CFG G as below, with semantic actions for translating into postfix notation:
  expr → expr + term { print('+') }
  expr → expr - term { print('-') }
  expr → term
  term → 0 { print('0') }
  term → 1 { print('1') }
  …
  term → 9 { print('9') }
40
Conclusion: Parsing and Translation Scheme
Step 1: Eliminate left recursion.
Technique: transform
  A → A α | A β | γ
into
  A → γ R
  R → α R | β R | ε
Use this rule to transform G.
41
Conclusion: Parsing and Translation Scheme
Result of left-recursion elimination:
  expr → term rest
  rest → + term { print('+') } rest
       | - term { print('-') } rest
       | ε
  term → 0 { print('0') }
  term → 1 { print('1') }
  …
  term → 9 { print('9') }
42
An Example: Left-Recursion-elimination
  expr → term rest
  rest → + term { print('+') } rest
       | - term { print('-') } rest
       | ε
  term → 0 { print('0') } | 1 { print('1') } | … | 9 { print('9') }
Parse tree for 9-5+2 with the embedded actions:
  expr
    term: 9 { print('9') }
    rest
      -
      term: 5 { print('5') }
      { print('-') }
      rest
        +
        term: 2 { print('2') }
        { print('+') }
        rest: ε
43
Conclusion: Parsing and Translation Scheme
Step 2: Procedures for nonterminals.
  void expr() {
    term(); rest();
  }

  void rest() {
    if ( lookahead == '+' ) {
      match('+'); term(); print('+'); rest();
    }
    else if ( lookahead == '-' ) {
      match('-'); term(); print('-'); rest();
    }
    else { } // do nothing with the input
  }

  void term() {
    if ( lookahead is a digit ) {
      t = lookahead; match(lookahead); print(t);
    }
    else report("syntax error");
  }
44
Conclusion: Parsing and Translation Scheme
Step 3: Simplifying the translator. The tail-recursive calls to rest() are replaced by iteration.
Before (tail recursive):
  void rest() {
    if ( lookahead == '+' ) {
      match('+'); term(); print('+'); rest();
    }
    else if ( lookahead == '-' ) {
      match('-'); term(); print('-'); rest();
    }
    else { }
  }
After (loop):
  void rest() {
    while ( true ) {
      if ( lookahead == '+' ) {
        match('+'); term(); print('+'); continue;
      }
      else if ( lookahead == '-' ) {
        match('-'); term(); print('-'); continue;
      }
      break;
    }
  }
45
Conclusion: Parsing and Translation Scheme
Complete program:
  import java.io.*;

  class Parser {
    static int lookahead;

    public Parser() throws IOException {
      lookahead = System.in.read();
    }

    // expr → term rest, with rest() folded in as a loop
    void expr() throws IOException {
      term();
      while ( true ) {
        if ( lookahead == '+' ) {
          match('+'); term(); System.out.write('+');
          continue;
        }
        else if ( lookahead == '-' ) {
          match('-'); term(); System.out.write('-');
          continue;
        }
        else return;
      }
    }

    // term → digit, echoing the digit to the output
    void term() throws IOException {
      if ( Character.isDigit((char)lookahead) ) {
        System.out.write((char)lookahead);
        match(lookahead);
      }
      else throw new Error("syntax error");
    }

    // Advance to the next input character if the lookahead matches t
    void match(int t) throws IOException {
      if ( lookahead == t ) lookahead = System.in.read();
      else throw new Error("syntax error");
    }
  }
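For experimenting without standard input, the same translator can be written against a string. The following is a variant sketch, not the slides' program: the class PostfixTranslator and its translate method are hypothetical, output is collected in a StringBuilder instead of written to System.out, and errors are reported with IllegalArgumentException.

```java
class PostfixTranslator {
    private final String input;
    private int pos = 0;
    private final StringBuilder out = new StringBuilder();

    private PostfixTranslator(String input) { this.input = input; }

    // -1 plays the role of end of input, as System.in.read() does.
    private int lookahead() { return pos < input.length() ? input.charAt(pos) : -1; }

    private void match(int t) {
        if (lookahead() == t) pos++;
        else throw new IllegalArgumentException("syntax error at position " + pos);
    }

    // term -> digit { print(digit) }
    private void term() {
        int c = lookahead();
        if (c != -1 && Character.isDigit((char) c)) {
            out.append((char) c);
            match(c);
        } else {
            throw new IllegalArgumentException("syntax error at position " + pos);
        }
    }

    // expr -> term rest, with rest folded in as a loop
    private void expr() {
        term();
        while (true) {
            if (lookahead() == '+') {
                match('+'); term(); out.append('+');
            } else if (lookahead() == '-') {
                match('-'); term(); out.append('-');
            } else {
                return;
            }
        }
    }

    // Translates an infix string of single digits, + and - into postfix.
    static String translate(String infix) {
        PostfixTranslator p = new PostfixTranslator(infix);
        p.expr();
        if (p.pos != infix.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return p.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("9-5+2"));  // prints 95-2+
    }
}
```

Keeping the output in a buffer makes the translator easy to check in a test, at the cost of departing from the print-as-you-go style of the translation scheme.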