Lesson 6
CDT301 – Compiler Theory, Spring 2011
Teacher: Linus Källberg
Outline
– Code generation using syntax-directed translation
– Lexical analysis
CODE GENERATION USING SYNTAX-DIRECTED TRANSLATION
Syntax-directed translation
– Add attributes to the grammar symbols
– Add semantic actions to the grammar
  – Syntax-directed translation scheme
– Inject code into the parser
SDT example (Section 2.3 in the book)
– Expression grammar:
    expr → expr + num | expr - num | num
– Infix to postfix notation
SDT example (Section 2.3 in the book)

Infix expression    Postfix expression
(1 + 2) - …         … -
1 + (2 - 3)         1 2 3 - +
SDT example (Section 2.3 in the book)
Formal definition:
– POSTFIX(num) = num
– POSTFIX((E)) = POSTFIX(E)
– POSTFIX(E1 op E2) = POSTFIX(E1) POSTFIX(E2) op
Exercise (1)
Translate the following infix expressions into postfix notation:
a) 78
b) 3 - 2 - 1
c) ( * 3)
d) 3 * (17) / (92 + 8)
Assume conventional operator precedence and associativity.
SDT example (Section 2.3 in the book)
Translation scheme:
    expr → expr + num    { print(num.value); print('+') }
         | expr - num    { print(num.value); print('-') }
         | num           { print(num.value) }
SDT example (Section 2.3 in the book)
Extended parse tree for 1 + 2 - 3:

expr
    expr
        expr
            num (1)
            { print(num.value) }
        +
        num (2)
        { print(num.value); print('+') }
    -
    num (3)
    { print(num.value); print('-') }
Exercise (2)
Traverse the following extended parse tree in depth-first, left-to-right order and execute the semantic actions:

expr
    expr
        expr
            num (1)
            { print(num.value) }
        +
        num (2)
        { print(num.value); print('+') }
    -
    num (3)
    { print(num.value); print('-') }
Left recursion elimination
    expr → num { print(num.value) } rest
    rest → + num { print(num.value); print('+') } rest
    rest → - num { print(num.value); print('-') } rest
    rest → ε
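The grammar without left recursion maps directly onto a predictive parser, with each semantic action becoming a statement at the matching position in the code. The following is a minimal sketch, not the course's reference implementation. It assumes the input is a string in memory and that the output is collected in a buffer rather than printed; the names translate, emit, in, and out are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *in;   /* next unread input character (hypothetical) */
static char out[256];    /* postfix output built by the semantic actions */

/* Stand-in for print(): append s and a separating space to out. */
static void emit(const char *s)
{
    size_t used = strlen(out);
    snprintf(out + used, sizeof out - used, "%s ", s);
}

static void skip_spaces(void)
{
    while (*in == ' ')
        in++;
}

/* num { print(num.value) }: returns 0 if no number is found. */
static int num(void)
{
    char digits[32];
    size_t n = 0;

    skip_spaces();
    if (!isdigit((unsigned char)*in))
        return 0;
    while (isdigit((unsigned char)*in) && n < sizeof digits - 1)
        digits[n++] = *in++;
    digits[n] = '\0';
    emit(digits);
    return 1;
}

/* rest -> + num {..} rest | - num {..} rest | epsilon
   The tail recursion is written as a loop. */
static int rest(void)
{
    skip_spaces();
    while (*in == '+' || *in == '-') {
        char op[2] = { *in++, '\0' };
        if (!num())          /* num { print(num.value) } */
            return 0;
        emit(op);            /* { print('+') } or { print('-') } */
        skip_spaces();
    }
    return *in == '\0';      /* the whole input must be consumed */
}

/* expr -> num { print(num.value) } rest
   Returns the postfix string, or NULL on a syntax error. */
const char *translate(const char *infix)
{
    in = infix;
    out[0] = '\0';
    if (!num() || !rest())
        return NULL;
    return out;
}
```

With this sketch, translate("1 + 2 - 3") yields "1 2 + 3 - " (every item is followed by a space), which is the order in which the actions fire in a depth-first, left-to-right traversal of the parse tree.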
Exercise (3)
Draw the parse tree for 1 + 2 - 3 (i.e. num + num - num) with the new grammar. Include the semantic actions as leaf nodes. Then traverse the tree and execute the semantic actions.
Syntax-directed definitions
– Similar to translation schemes
– More abstract or declarative

Production            Semantic rules
expr → expr1 + num    expr.t = expr1.t || num.value || '+'
     | expr1 - num    expr.t = expr1.t || num.value || '-'
     | num            expr.t = num.value
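One way to read the semantic rules is as a function that computes and returns the attribute expr.t instead of printing as a side effect. The sketch below assumes that || means string concatenation, inserts spaces between items for readability (an assumption), and evaluates the left-recursive productions with a loop. The function name expr_t and its signature are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Computes expr.t for input of the form  num (('+' | '-') num)*.
   Returns 1 on success, 0 on a syntax error or if t is too small. */
int expr_t(const char *s, char *t, size_t size)
{
    size_t len = 0;
    char op = '\0';   /* '\0' while applying expr -> num */

    for (;;) {
        char *end;
        long value;
        int n;

        while (*s == ' ')
            s++;
        if (!isdigit((unsigned char)*s))
            return 0;                     /* expected num */
        value = strtol(s, &end, 10);      /* num.value */
        s = end;

        /* expr.t = num.value                      for expr -> num
           expr.t = expr1.t || num.value || op     otherwise;
           t already holds expr1.t, so the rest is appended. */
        if (op == '\0')
            n = snprintf(t + len, size - len, "%ld", value);
        else
            n = snprintf(t + len, size - len, " %ld %c", value, op);
        if (n < 0 || (size_t)n >= size - len)
            return 0;                     /* t too small */
        len += (size_t)n;

        while (*s == ' ')
            s++;
        if (*s == '\0')
            return 1;
        if (*s != '+' && *s != '-')
            return 0;
        op = *s++;
    }
}
```

Called with the input "1 + 2 - 3" and a sufficiently large buffer, the function leaves "1 2 + 3 -" in t.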
LEXICAL ANALYSIS
Lexical analysis
– Lexical analyzer/scanner/tokenizer
– Simplifies the parser:
  – Removes white space
  – Removes comments
  – Identifies lexemes and returns tokens
Tokens
– Name + attribute
– Attributes:
  – Line and column number
  – Identifier name/symbol table index
  – Numerical value
  – …
– Lexemes
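A token as described above could be represented in C roughly as follows. This is only a sketch: the token names, the struct layout, and the helper make_num are hypothetical, and a real scanner would list all of its tokens.

```c
/* Hypothetical token names. */
enum TokenName { TOK_NUM, TOK_ID, TOK_EOF };

struct Token {
    enum TokenName name;
    int line, column;         /* position of the lexeme in the source */
    union {
        int value;            /* TOK_NUM: numerical value */
        int symbol_index;     /* TOK_ID: index into the symbol table */
    } attr;
};

/* Builds the token for a number lexeme found at the given position. */
struct Token make_num(int value, int line, int column)
{
    struct Token tok;
    tok.name = TOK_NUM;
    tok.line = line;
    tok.column = column;
    tok.attr.value = value;
    return tok;
}
```

The union reflects that each token name uses at most one of the value attributes, while the position is useful for every token, for example in error messages.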
Differing requirements
– Allow spaces in identifiers?
  – Example: Fortran 90
– Allow keywords as identifiers?
  – Example: PL/1
– Language support for configuring the lexical analysis?
  – Example: TeX
Implementing lexical analysis
– Finite state machine?
– Hard-coded?
– Use a generator tool?
Input buffering
int lineno = 1, attribute = NONE;

int GetNextToken(void)
{
    char t;
    for (t = ReadChar(); t != 0; t = ReadChar()) {
        if (t == ' ' || t == '\t')
            ; /* Skip white space */
        else if (t == '\n')
            lineno++;
        else if ('0' <= t && t <= '9') {
            attribute = GetNum(t);
            return NUM;
        } else {
            /* Error handling */
            attribute = NONE;
            return UNKNOWN_TOKEN;
        }
    }
    return EOF; /* End of file token */
}
int GetNum(char t)
{
    int num = 0;
    for (; '0' <= t && t <= '9'; t = ReadChar()) {
        num *= 10;
        num += t - '0';
    }
    // Put back the char that caused the loop to exit
    PutBack(t);
    return num;
}
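The slides do not show ReadChar and PutBack. The following is one possible sketch, assuming the input is a NUL-terminated string in memory and that a single character of pushback is enough, which is all GetNum needs. SetInput is a hypothetical helper for choosing the input.

```c
/* Hypothetical input source: a NUL-terminated string set by SetInput. */
static const char *input = "";
static int pushed_back = -1;   /* -1 means no character is pushed back */

void SetInput(const char *text)
{
    input = text;
    pushed_back = -1;
}

/* Returns the next character, or 0 at the end of the input. */
char ReadChar(void)
{
    if (pushed_back != -1) {
        char c = (char)pushed_back;
        pushed_back = -1;
        return c;
    }
    if (*input == '\0')
        return 0;
    return *input++;
}

/* Makes c the next character that ReadChar returns (one level only). */
void PutBack(char c)
{
    pushed_back = (unsigned char)c;
}
```

A scanner that reads from a file would typically refill a buffer inside ReadChar instead, but the interface seen by GetNextToken and GetNum can stay the same.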
DFA-based scanner
DFA-based scanner

Symbol    State
<         1   4   8
>         6   3   8
=         5   2   7
other         4   8
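A transition table like this one can drive a scanner directly. The sketch below rests on an assumed reading of the table: the three state columns are taken to be the non-accepting states 0, 1, and 6 of a DFA for relational operators, state 0 is taken to have no transition on other symbols, and states 4 and 8 are taken to accept without consuming the lookahead character. All names in the code are hypothetical.

```c
enum { ERR = -1 };
enum Symbol { SYM_LT, SYM_GT, SYM_EQ, SYM_OTHER };

/* Assumed reading of the table: one row per non-accepting state. */
static const int next_state[3][4] = {
    /*             <    >    =   other */
    /* state 0 */ { 1,   6,   5,  ERR },
    /* state 1 */ { 4,   3,   2,  4   },
    /* state 6 */ { 8,   8,   7,  8   },
};

static enum Symbol classify(char c)
{
    switch (c) {
    case '<': return SYM_LT;
    case '>': return SYM_GT;
    case '=': return SYM_EQ;
    default:  return SYM_OTHER;
    }
}

/* Scans one relational operator at the start of s.
   Returns the accepting state (2, 3, 4, 5, 7 or 8) and stores the lexeme
   length in *length, or returns ERR and leaves *length untouched. */
int scan_relop(const char *s, int *length)
{
    int row = 0;   /* index into next_state: rows for states 0, 1, 6 */
    int n = 0;     /* characters consumed so far */

    for (;;) {
        int state = next_state[row][classify(s[n])];
        if (state == ERR)
            return ERR;
        if (state == 4 || state == 8) {   /* accept, s[n] not consumed */
            *length = n;
            return state;
        }
        n++;                              /* consume s[n] */
        if (state == 1)
            row = 1;
        else if (state == 6)
            row = 2;
        else {                            /* 2, 3, 5 or 7: accept */
            *length = n;
            return state;
        }
    }
}
```

Under these assumptions, "<=" ends in state 2 with length 2, while "<a" ends in state 4 with length 1 because the character after "<" is left for the next token.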
Differentiating between keywords and identifiers
Two strategies:
– Keyword table
– Test for keywords before identifiers
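The keyword table strategy can be sketched as follows: the scanner first recognizes a word as if it were an identifier and then looks it up in a table. The keywords, token names, and the function classify_word are hypothetical examples.

```c
#include <string.h>

/* Hypothetical token names. */
enum { TOK_ID, TOK_IF, TOK_ELSE, TOK_WHILE };

static const struct {
    const char *lexeme;
    int token;
} keywords[] = {
    { "if",    TOK_IF    },
    { "else",  TOK_ELSE  },
    { "while", TOK_WHILE },
};

/* Returns the keyword token for lexeme, or TOK_ID if it is not a keyword. */
int classify_word(const char *lexeme)
{
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].lexeme) == 0)
            return keywords[i].token;
    return TOK_ID;
}
```

A linear search is enough for a handful of keywords; a larger language might use a sorted table with binary search or a hash table instead.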
Error recovery
– Lexical errors are often hard to detect
  – Misspelled keywords = valid identifiers
  – Misspelled identifiers hard to detect
– Recovery strategies:
  – Panic mode
  – Try to fix the input
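In a scanner, panic mode usually means discarding input characters until one is found that can start a token. A minimal sketch, assuming a scanner that only knows numbers, '+' and '-'; the names can_start_token and panic_skip are hypothetical, and white space handling is left out.

```c
#include <ctype.h>

/* Hypothetical predicate: which characters can start a token. */
static int can_start_token(char c)
{
    return isdigit((unsigned char)c) || c == '+' || c == '-';
}

/* Panic mode: discard characters until one can start a token.
   Returns a pointer to that character, or to the terminating NUL. */
const char *panic_skip(const char *s)
{
    while (*s != '\0' && !can_start_token(*s))
        s++;
    return s;
}
```

The scanner would report the error once and then resume normal scanning at the returned position.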
Conclusion
– Code generation using syntax-directed translation
– Lexical analysis
Next time
– Stack machine code
– Generating stack machine code using SDT