Chapter 4: Top-Down Parsing, Part 1
September 8, 2018, Prof. Abdelaziz Khamis

Chapter 4, Part 1: Topics
- Top-Down Parsing
- Recursive-Descent Parsing
- Grammar of TINY in EBNF
- A Recursive-Descent Parser for TINY

Top-Down Parsing
A top-down parsing algorithm parses an input string of tokens by tracing out the steps in a leftmost derivation.
- Begin with the start symbol (the root of the parse tree).
- Work down to the leaves (the terminals) by expanding grammar rules.
Such an algorithm is called top-down because the implied traversal of the parse tree is a pre-order traversal and thus proceeds from the root to the leaves.

Top-Down Parsing (Continued)
But how does the parser choose which rule to expand? Top-down parsers come in two forms:
- Backtracking parsers systematically try every choice; when a decision turns out to be wrong, they back up and make a different choice. They are slow and can require exponential time in general.
- Predictive parsers look at one or more lookahead tokens (usually one) to pick the right choice. The two kinds of predictive parsers are:
    - Recursive-descent parsing
    - LL(1) parsing

Recursive-Descent Parsing
The grammar rule for a non-terminal A is viewed as the definition of a function that recognizes A. The right-hand side of the rule specifies the structure of the code for this function:
- Each appearance of a terminal causes a token to be matched.
- Each appearance of a non-terminal corresponds to a call of the associated function.
- Choices correspond to case or if statements.

Recursive-Descent Parsing (Continued)
Example: from rule to code
Grammar rule for factor:
    factor → ( exp ) | number
Code that recognizes a factor:
    void factor(void)
    {
        if (token == number)
            match(number);
        else if (token == '(') {
            match('(');
            exp();
            match(')');
        }
        else
            error(token);
    }
This function uses one lookahead token, held in the variable token. The helper match advances to the next token only when the lookahead is the expected one:
    void match(Token expect)
    {
        if (token == expect)
            getToken();
        else
            error(token, expect);
    }

Recursive-Descent Parsing (Continued)
A recursive-descent function can also compute values or build syntax trees:
    int factor(void)
    {
        int temp = 0;
        if (token == number) {
            temp = atoi(tokStr);    /* value of the number token */
            match(number);
        }
        else if (token == '(') {
            match('(');
            temp = exp();           /* value of the parenthesized expression */
            match(')');
        }
        else
            error(token);
        return temp;
    }

Recursive-Descent Parsing (Continued)
Consider now the grammar rule for exp:
    exp → exp addop term | term
The rule is left recursive. Both alternatives begin with the same tokens, so no lookahead test can select between them, and the corresponding code would call exp again before consuming any input, leading to an immediate infinite recursion:
    void exp(void)
    {
        if (token == ??) {
            exp();
            addop();
            term();
        }
        else
            term();
    }

Recursive-Descent Parsing (Continued)
The solution is to use the EBNF rule:
    exp → term { addop term }
The curly brackets express repetition and translate into a loop:
    void exp(void)
    {
        term();
        while (token == '+' || token == '-') {
            match(token);
            term();
        }
    }
The non-terminal addop has been eliminated as a separate function, since its only task is to match the operators + and -.

Recursive-Descent Parsing (Continued)
The left associativity implied by the curly brackets (and explicit in the original BNF) is maintained in the code, because each new term is combined with the result accumulated so far:
    int exp(void)
    {
        int temp = term();
        while (token == '+' || token == '-') {
            if (token == '+') {
                match('+');
                temp += term();
            }
            else {
                match('-');
                temp -= term();
            }
        }
        return temp;
    }

Recursive-Descent Parsing (Continued)
Right recursion, and the right associativity it implies, does not cause infinite recursion, because term() consumes input before exp() is called again:
    exp → term [ addop exp ]
    void exp(void)
    {
        term();
        if (token == '+' || token == '-') {
            match(token);
            exp();
        }
    }

Grammar of TINY in EBNF
    program       → stmt-sequence
    stmt-sequence → statement { ; statement }
    statement     → if-stmt | repeat-stmt | assign-stmt | read-stmt | write-stmt
    if-stmt       → if exp then stmt-sequence [ else stmt-sequence ] end
    repeat-stmt   → repeat stmt-sequence until exp
    assign-stmt   → identifier := exp
    read-stmt     → read identifier
    write-stmt    → write exp
    exp           → simple-exp [ comparison-op simple-exp ]
    comparison-op → < | =
    simple-exp    → term { addop term }
    addop         → + | -
    term          → factor { mulop factor }
    mulop         → * | /
    factor        → ( exp ) | number | identifier

A Recursive-Descent Parser for TINY
Sample recursive-descent code in the TINY parser. The lookahead token selects the statement kind; an unexpected token is reported and skipped:
    TreeNode * statement(void)
    {
        TreeNode * t = NULL;
        switch (token) {
            case IF :     t = if_stmt(); break;
            case REPEAT : t = repeat_stmt(); break;
            case ID :     t = assign_stmt(); break;
            case READ :   t = read_stmt(); break;
            case WRITE :  t = write_stmt(); break;
            default :
                syntaxError("unexpected token -> ");
                printToken(token, tokenString);
                token = getToken();
                break;
        }
        return t;
    }

A Recursive-Descent Parser for TINY (Continued)
Sample recursive-descent code in the TINY parser. The function follows the if-stmt rule token by token and stores the test, the then-part, and the optional else-part as the three children of the node:
    TreeNode * if_stmt(void)
    {
        TreeNode * t = newStmtNode(IfK);
        match(IF);
        if (t != NULL) t->child[0] = exp();
        match(THEN);
        if (t != NULL) t->child[1] = stmt_sequence();
        if (token == ELSE) {
            match(ELSE);
            if (t != NULL) t->child[2] = stmt_sequence();
        }
        match(END);
        return t;
    }

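The function if_stmt calls stmt_sequence, which the slides do not show. The rule stmt-sequence → statement { ; statement } again maps to a loop, and one plausible way to represent the resulting sequence is to chain the statement nodes through a sibling pointer. The sketch below illustrates that idea and is not the TINY source: it works on a pre-scanned token array, supports only a subset of statements (read identifier and, as a simplification, write identifier), counts errors instead of printing them, and the sibling field, the reduced TokenType and StmtKind enumerations, and count_statements are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { READ, WRITE, ID, SEMI, ENDFILE } TokenType;
typedef enum { ReadK, WriteK } StmtKind;

typedef struct TreeNode {
    StmtKind kind;
    struct TreeNode *sibling;   /* next statement in the sequence */
} TreeNode;

static const TokenType *stream; /* pre-scanned tokens, ending with ENDFILE */
static TokenType token;         /* lookahead token */
static int errors;              /* number of syntax errors recorded */

/* Advance the lookahead; stays on ENDFILE once it is reached. */
static void getToken(void)
{
    if (*stream != ENDFILE) stream++;
    token = *stream;
}

static void match(TokenType expected)
{
    if (token == expected) getToken();
    else errors++;
}

static TreeNode *newStmtNode(StmtKind kind)
{
    TreeNode *t = malloc(sizeof *t);
    if (t != NULL) { t->kind = kind; t->sibling = NULL; }
    return t;
}

/* statement -> read-stmt | write-stmt   (subset of the TINY rule) */
static TreeNode *statement(void)
{
    TreeNode *t = NULL;
    switch (token) {
        case READ:  t = newStmtNode(ReadK);  match(READ);  match(ID); break;
        case WRITE: t = newStmtNode(WriteK); match(WRITE); match(ID); break;
        default:    errors++; getToken(); break;  /* skip the bad token */
    }
    return t;
}

/* stmt-sequence -> statement { ; statement } */
static TreeNode *stmt_sequence(void)
{
    TreeNode *first = statement();
    TreeNode *last = first;
    while (token != ENDFILE) {
        match(SEMI);
        TreeNode *next = statement();
        if (next != NULL) {
            if (first == NULL) first = last = next;
            else { last->sibling = next; last = next; }
        }
    }
    return first;
}

/* Parses the tokens; returns the number of statements in the sibling
   chain, or -1 if any syntax error was recorded. Frees the tree. */
int count_statements(const TokenType *tokens)
{
    assert(tokens != NULL);
    stream = tokens;
    token = *stream;
    errors = 0;
    TreeNode *t = stmt_sequence();
    int n = 0;
    while (t != NULL) {
        TreeNode *next = t->sibling;
        free(t);
        t = next;
        n++;
    }
    return errors ? -1 : n;
}
```

In a parser for the full language the loop would also have to stop at the tokens that can follow a statement sequence (else, end, and until), since a sequence nested inside an if or repeat statement does not run to the end of the file.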