CPSC 388 – Compiler Design and Construction Parsers – Syntax Directed Translation
Syntax Directed Translation
Translating from a sequence of tokens to some other form, based on the underlying syntax.
Augment the CFG: a translation rule is defined for each production. A translation rule defines the translation of the left-hand-side nonterminal as a function of:
- constants
- the right-hand-side nonterminals' translations
- the right-hand-side tokens' values (e.g., the integer value associated with an INTLIT token, or the String value associated with an ID token)
Translating Strings
1. Build the parse tree.
2. Use the translation rules to compute the translation of each nonterminal in the tree, working bottom-up (a nonterminal's translation may depend on the translations of the symbols on the right-hand side, so those must be computed first).
3. The translation of the root node is the translation of the string.
Build a Parse Tree
Build a parse tree for the input: 2*4+5

CFG                            Translation rules
===                            =================
exp → exp PLUS term            exp1.trans = exp2.trans + term.trans
exp → exp MINUS term           exp1.trans = exp2.trans - term.trans
exp → term                     exp.trans = term.trans
term → term TIMES factor       term1.trans = term2.trans * factor.trans
term → term DIVIDE factor      term1.trans = term2.trans / factor.trans
term → factor                  term.trans = factor.trans
factor → LPAREN exp RPAREN     factor.trans = exp.trans
factor → INT_LIT               factor.trans = INT_LIT.value

When a nonterminal appears on both sides of a production, subscript 1 marks the left-hand-side occurrence and subscript 2 marks the right-hand-side occurrence.
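The bottom-up evaluation of these rules can be sketched in Java. This is only an illustration, not part of the course code: the Node class, the helper methods tok, lit, and nt, and the hand-built tree in sample() are all assumptions made for the example. A real parser would build the tree from the token stream.

```java
import java.util.List;

final class ArithTranslation {
    // Hypothetical parse-tree node: a nonterminal with children, or a token leaf.
    static final class Node {
        final String symbol;   // e.g. "exp", "term", "factor", "PLUS", "INT_LIT"
        final int value;       // meaningful only for INT_LIT leaves
        final List<Node> kids; // empty for token leaves

        Node(String symbol, int value, List<Node> kids) {
            this.symbol = symbol;
            this.value = value;
            this.kids = kids;
        }
    }

    static Node tok(String symbol) { return new Node(symbol, 0, List.of()); }
    static Node lit(int v) { return new Node("INT_LIT", v, List.of()); }
    static Node nt(String symbol, Node... kids) { return new Node(symbol, 0, List.of(kids)); }

    // Computes the translation of a nonterminal node from its children's translations.
    static int trans(Node n) {
        List<Node> k = n.kids;
        if (k.size() == 1) {
            Node only = k.get(0);
            // factor -> INT_LIT uses the token value; exp -> term and term -> factor copy.
            return only.symbol.equals("INT_LIT") ? only.value : trans(only);
        }
        if (k.size() != 3) {
            throw new IllegalArgumentException("not a nonterminal node: " + n.symbol);
        }
        if (n.symbol.equals("factor")) {
            return trans(k.get(1)); // factor -> LPAREN exp RPAREN
        }
        int left = trans(k.get(0));
        int right = trans(k.get(2));
        switch (k.get(1).symbol) {
            case "PLUS":   return left + right;
            case "MINUS":  return left - right;
            case "TIMES":  return left * right;
            case "DIVIDE": return left / right;
            default: throw new IllegalArgumentException("unknown operator: " + k.get(1).symbol);
        }
    }

    // Hand-built parse tree for the input 2*4+5.
    static Node sample() {
        Node two = nt("term", nt("factor", lit(2)));
        Node twoTimesFour = nt("term", two, tok("TIMES"), nt("factor", lit(4)));
        Node five = nt("term", nt("factor", lit(5)));
        return nt("exp", nt("exp", twoTimesFour), tok("PLUS"), five);
    }

    public static void main(String[] args) {
        System.out.println(trans(sample())); // prints 13
    }
}
```

The recursion visits children before combining their results, so every right-hand-side translation is available when a rule is applied.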
Annotated Parse Tree
A parse tree can be annotated using the translation rule associated with each production: each nonterminal node is labeled with its translation. To compute the annotations, apply the translation rule at the root node and recurse into the sub-nodes as that rule dictates.
Example Annotated Parse Tree
Consider the following CFG, which defines expressions that use the three operators: +, &&, ==. Let's define a syntax-directed translation that type checks these expressions; i.e., for type-correct expressions, the translation will be the type of the expression (either INT or BOOL), and for expressions that involve type errors, the translation will be the special value ERROR. We'll use the following type rules:
1. Both operands of the + operator must be of type INT.
2. Both operands of the && operator must be of type BOOL.
3. Both operands of the == operator must have the same (non-ERROR) type.
Example Annotated Parse Tree

CFG                     Translation rules
===                     =================
exp -> exp + term       if ((exp2.trans == INT) and (term.trans == INT))
                        then exp1.trans = INT
                        else exp1.trans = ERROR
exp -> exp && term      if ((exp2.trans == BOOL) and (term.trans == BOOL))
                        then exp1.trans = BOOL
                        else exp1.trans = ERROR
exp -> exp == term      if ((exp2.trans == term.trans) and (exp2.trans != ERROR))
                        then exp1.trans = BOOL
                        else exp1.trans = ERROR
exp -> term             exp.trans = term.trans
term -> true            term.trans = BOOL
term -> false           term.trans = BOOL
term -> intliteral      term.trans = INT
term -> ( exp )         term.trans = exp.trans

Try Input: ( ) == 4
Construct Annotated Parse Tree
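The type-checking rules can be sketched in Java with one method per operator production. The class name, the method names plus, and, and equalsOp, and the three sample expressions in main are assumptions made for illustration; they do not come from the slides.

```java
final class TypeCheckTranslation {
    enum Type { INT, BOOL, ERROR }

    // exp -> exp + term
    static Type plus(Type exp2, Type term) {
        return (exp2 == Type.INT && term == Type.INT) ? Type.INT : Type.ERROR;
    }

    // exp -> exp && term
    static Type and(Type exp2, Type term) {
        return (exp2 == Type.BOOL && term == Type.BOOL) ? Type.BOOL : Type.ERROR;
    }

    // exp -> exp == term
    static Type equalsOp(Type exp2, Type term) {
        return (exp2 == term && exp2 != Type.ERROR) ? Type.BOOL : Type.ERROR;
    }

    public static void main(String[] args) {
        Type intLit = Type.INT;   // term -> intliteral
        Type boolLit = Type.BOOL; // term -> true, term -> false

        // (1 + 2) == 3 : both operands of == are INT, so the result is BOOL.
        Type ok = equalsOp(plus(intLit, intLit), intLit);
        // true == 3 : operand types differ, so the result is ERROR.
        Type bad = equalsOp(boolLit, intLit);
        // (true + 1) && false : the ERROR from + propagates through &&.
        Type propagated = and(plus(boolLit, intLit), boolLit);

        System.out.println(ok + " " + bad + " " + propagated); // prints BOOL ERROR ERROR
    }
}
```

The copy rules (exp -> term and term -> ( exp )) need no method of their own, since they pass the child's translation through unchanged.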
Create Translation Rules
The following grammar defines the language of base-2 numbers:

b -> 0
b -> 1
b -> b 0
b -> b 1

Define a syntax-directed translation so that the translation of a binary number is its base-10 value. Illustrate your translation scheme by drawing the parse tree for 1001 and annotating each nonterminal in the tree with its translation.
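One possible answer, offered as a sketch rather than the intended solution, gives the single-digit productions the translations 0 and 1 and the two-symbol productions the translations 2 * b2.trans and 2 * b2.trans + 1. The Java below assumes the input is given as a string of digits and mirrors the left-recursive shape of the grammar: the last digit is the rightmost child and the prefix is the nested b. The class and method names are hypothetical.

```java
final class BinaryTranslation {
    // b -> 0    b.trans = 0
    // b -> 1    b.trans = 1
    // b -> b 0  b1.trans = 2 * b2.trans
    // b -> b 1  b1.trans = 2 * b2.trans + 1
    static int trans(String digits) {
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("empty string is not in the language");
        }
        int last = digits.charAt(digits.length() - 1) - '0';
        if (last != 0 && last != 1) {
            throw new IllegalArgumentException("not a binary digit in: " + digits);
        }
        if (digits.length() == 1) {
            return last; // b -> 0 or b -> 1
        }
        // b -> b 0 or b -> b 1: translate the nested b first, then apply the rule.
        return 2 * trans(digits.substring(0, digits.length() - 1)) + last;
    }

    public static void main(String[] args) {
        System.out.println(trans("1001")); // prints 9
    }
}
```

For 1001, the nested nonterminals receive the translations 1, 2, 4, and 9 from the innermost b outward.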
Building AST from Parse Trees
So far, our example syntax-directed translations have produced simple values (an int or a type) as the translation of an input. In practice, however, we want the parser to build an abstract-syntax tree as the translation of an input program. That is not really so different from what we've seen so far; we just need to use tree-building operations in the translation rules instead of, e.g., arithmetic operations.
Differences between AST and Parse Tree
- Operators appear at internal nodes instead of at leaves.
- "Chains" of single productions are collapsed.
- Lists are "flattened".
- Syntactic details (e.g., parentheses, commas, semi-colons) are omitted.
In general, ASTs omit details having to do with the source language and contain only information about the essential structure of the program.
Example Abstract Syntax Tree
Construct Parse Tree for 3*(4+2)
For constructs other than expressions, the compiler writer has some choices when defining the AST -- but remember that lists (e.g., lists of declarations, lists of statements, lists of parameters) should be flattened, that operators (e.g., "assign", "while", "if") go at internal nodes, not at leaves, and that syntactic details are omitted.
Translation Rules for ASTs
First we need some Java classes for the nodes in the AST:

class ExpNode { }

class IntLitNode extends ExpNode {
    public IntLitNode(int val) {...}
}

class PlusNode extends ExpNode {
    public PlusNode( ExpNode e1, ExpNode e2 ) {...}
}

class TimesNode extends ExpNode {
    public TimesNode( ExpNode e1, ExpNode e2 ) {...}
}
Translation Rules for ASTs

CFG                     Translation rules
===                     =================
exp -> exp + term       exp1.trans = new PlusNode(exp2.trans, term.trans)
exp -> term             exp.trans = term.trans
term -> term * factor   term1.trans = new TimesNode(term2.trans, factor.trans)
term -> factor          term.trans = factor.trans
factor -> INTLITERAL    factor.trans = new IntLitNode(INTLITERAL.value)
factor -> ( exp )       factor.trans = exp.trans

Add to this CFG and Translation Rules for minus and divide.
Draw the Parse Tree for 2+3*4 and annotate it with its translation.
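To see the tree-building rules in action, here is a minimal Java sketch that applies them by hand for the input 3*(4+2). The node classes follow the names on the slides, but their fields, the toString methods (which print a prefix form), and the wrapper class AstBuildSketch are assumptions added so the example can run and display its result.

```java
final class AstBuildSketch {
    static abstract class ExpNode { }

    static final class IntLitNode extends ExpNode {
        private final int val;
        IntLitNode(int val) { this.val = val; }
        @Override public String toString() { return Integer.toString(val); }
    }

    static final class PlusNode extends ExpNode {
        private final ExpNode e1;
        private final ExpNode e2;
        PlusNode(ExpNode e1, ExpNode e2) { this.e1 = e1; this.e2 = e2; }
        @Override public String toString() { return "(+ " + e1 + " " + e2 + ")"; }
    }

    static final class TimesNode extends ExpNode {
        private final ExpNode e1;
        private final ExpNode e2;
        TimesNode(ExpNode e1, ExpNode e2) { this.e1 = e1; this.e2 = e2; }
        @Override public String toString() { return "(* " + e1 + " " + e2 + ")"; }
    }

    // Applies the translation rules bottom-up for the input 3*(4+2).
    static ExpNode sample() {
        ExpNode four = new IntLitNode(4);        // factor -> INTLITERAL
        ExpNode two = new IntLitNode(2);         // factor -> INTLITERAL
        ExpNode sum = new PlusNode(four, two);   // exp -> exp + term
        ExpNode paren = sum;                     // factor -> ( exp ): the parentheses leave no node
        ExpNode three = new IntLitNode(3);       // factor -> INTLITERAL
        return new TimesNode(three, paren);      // term -> term * factor
    }

    public static void main(String[] args) {
        System.out.println(sample()); // prints (* 3 (+ 4 2))
    }
}
```

The copy rules (exp -> term, term -> factor, factor -> ( exp )) create no nodes, which is why chains of single productions and the parentheses do not show up in the resulting AST.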