COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.

Overview of the Subject (COMP 3438)
Course Organization
Part I: Unix System Programming (Device Driver Development)
  Overview of Unix Sys. Prog.
  Process
  File System
  Overview of Device Driver Development
  Character Device Driver Development
  Introduction to Block Device Driver
Part II: Compiler Design
  Overview of Compiler Design
  Lexical Analysis
  Syntax Analysis (HW #4) (this lecture)

Outline
Part I: Introduction to Syntax Analysis
  1. Input (Tokens) and Output (Parse Tree)
  2. How to specify syntax? Context Free Grammar (CFG)
  3. How to obtain parse tree?
     CFG → Remove left recursion, left factoring, ambiguity → LL (Leftmost Derivation)
     CFG → (Remove ambiguity) → LR (Reverse Rightmost Derivation)
Part II: Context Free Grammar, Parse Tree and Ambiguity
Part III: Bottom-up Parsing (LR)
  SLR, Canonical LR, LALR
Part III: Top-down Parsing (LL)
  Left Recursion, Left factoring (Tutorial)
  Recursive-Descent Parsing
  Predictive Parsing (without backtracking), HW4
  Nonrecursive Predictive Parsing
Software Tool: yacc (Lab)

Part III: Intro. to Top-Down Parsing

Parsing
Goal: Given an input string and a language,
  (1) check whether the string belongs to the language;
  (2) if yes, construct the parse tree.
Example:
  Input string: aabb
  Language: {a^n b^n | n > 0}
  Parsing: aabb → Parser → parse tree with root S, whose children are a, S, b; the inner S has children a, b.
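The two goals above (membership check, then parse tree construction) can be sketched in Python. This is a minimal illustration, assuming the grammar S → a S b | a b for the language in the example; the function names `parse_s` and `parse`, and the tuple representation of tree nodes, are hypothetical choices and not part of the lecture.

```python
def parse_s(s, i=0):
    """Try to derive a prefix of s[i:] from S -> a S b | a b.

    Returns (tree, next_index) on success, or None on failure.
    Tree nodes are tuples: ("S", "a", subtree, "b") or ("S", "a", "b").
    """
    if i < len(s) and s[i] == "a":
        # Alternative 1: S -> a S b
        inner = parse_s(s, i + 1)
        if inner is not None:
            subtree, j = inner
            if j < len(s) and s[j] == "b":
                return ("S", "a", subtree, "b"), j + 1
        # Alternative 2: S -> a b
        if i + 1 < len(s) and s[i + 1] == "b":
            return ("S", "a", "b"), i + 2
    return None


def parse(s):
    """Return the parse tree if the whole string is in the language, else None."""
    result = parse_s(s)
    if result is not None and result[1] == len(s):
        return result[0]
    return None


print(parse("aabb"))  # nested tuple with root S
print(parse("aab"))   # None: not in the language
```

The requirement that the whole input be consumed (`result[1] == len(s)`) is what turns the prefix recognizer into a membership test.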

Top-down parsing methods
Top-down parsing may be viewed as an attempt to find a leftmost derivation for an input string.
Begin from the start symbol and try to regenerate the input string, substituting the correct choice of production at each step, guided by looking at the "next" terminal in the input string.
This constructs a parse tree for the input string from the root, creating the nodes of the parse tree in preorder.
[Figure: start symbol S above the input string A = 1 + 3 * 4 / 5]

Top-down parsing methods
We shall concentrate on a simple yet quite effective top-down parsing method called recursive descent.
Write a recursive recognizer (subroutine) for each grammar rule.
  If a rule succeeds, perform some action (e.g., build a tree node, emit code).
  If a rule fails, return failure. The caller may try another choice or fail.
  On failure the parser "backs up" (backtracks).
We will study an efficient way of implementing the method, called predictive parsing.

Recursive-descent parsing
Recursive-descent parsing involves executing a set of recursive procedures to process the input. A procedure is associated with each nonterminal of a grammar.
The recursive procedures can be quite easy to write, and fairly efficient if written in a language that implements recursive procedure calls efficiently.
Let us consider the grammar:
  S → cAd
  A → ab | a
The procedures defined for nonterminals S and A follow.

Procedure S()
begin
    if input symbol = 'c' then
    begin
        ADVANCE();
        if A() then
            if input symbol = 'd' then
            begin
                ADVANCE();
                return TRUE
            end
    end;
    return FALSE
end

Procedure A()
begin
    isave := input-pointer;
    if input symbol = 'a' then
    begin
        ADVANCE();
        if input symbol = 'b' then
        begin
            ADVANCE();
            return TRUE
        end
    end;
    input-pointer := isave;  /* failure to find ab */
    if input symbol = 'a' then
    begin
        ADVANCE();
        return TRUE
    end
    else
        return FALSE
end

The procedure ADVANCE() moves the input pointer to the next symbol.
Procedures S() and A() return TRUE or FALSE, depending on whether or not they have found on the input a string derived from the corresponding nonterminal.
Note that on failure a procedure should leave the input pointer where it was when the procedure was called (A() does this by restoring isave), and that on success it moves the input pointer over the substring recognized.
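The procedures S() and A() can be transcribed into runnable Python roughly as follows. This is a sketch, not the lecture's own code: the `Parser` class, the `symbol` helper, and the `accepts` wrapper are hypothetical names, and S() here also saves and restores the input pointer on failure, which the pseudocode does only in A().

```python
class Parser:
    """Recursive-descent recognizer for S -> cAd, A -> ab | a."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # the input pointer

    def symbol(self):
        """Current input symbol, or None at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else None

    def advance(self):
        """Move the input pointer to the next symbol."""
        self.pos += 1

    def S(self):
        isave = self.pos  # assumption: S also restores on failure
        if self.symbol() == "c":
            self.advance()
            if self.A() and self.symbol() == "d":
                self.advance()
                return True
        self.pos = isave
        return False

    def A(self):
        isave = self.pos
        # First alternative: A -> ab
        if self.symbol() == "a":
            self.advance()
            if self.symbol() == "b":
                self.advance()
                return True
        # Failure to find ab: back up and try A -> a
        self.pos = isave
        if self.symbol() == "a":
            self.advance()
            return True
        return False


def accepts(text):
    """True if the whole input is derived from S."""
    p = Parser(text)
    return p.S() and p.pos == len(text)


print(accepts("cabd"), accepts("cad"), accepts("cd"))
```

For the input cad, A() first tries ab, fails at d, restores the pointer, and then succeeds with the alternative a; this is the backtracking step described above.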

[Figure: Partially Completed Recursive Descent Parse for Assignments]

Difficulties with top-down parsing
Left recursion: A grammar G is said to be left-recursive if there is a derivation A ⇒⁺ Aα for some nonterminal A and some string α.
  e.g. E → E + T | E - T | T
A left-recursive grammar can cause a top-down parser to go into an infinite loop: when we try to expand A, we may again try to expand A without consuming any input.
This cycling will surely occur on an erroneous input string, and it may also occur on legal inputs, depending on the order in which the alternates for A are tried.
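The cycling can be observed directly. The sketch below writes a naive recursive procedure for E → E + T | T that tries the left-recursive alternative first; in Python the unbounded recursion surfaces as a `RecursionError` rather than a literal infinite loop. The function names are hypothetical, the E - T alternative is omitted for brevity, and T is simplified to a single `id` token.

```python
def parse_T(tokens, pos):
    """T is simplified to a single 'id' token (an assumption for this sketch)."""
    if pos < len(tokens) and tokens[pos] == "id":
        return pos + 1
    return None


def parse_E(tokens, pos):
    """Naive procedure for E -> E + T | T, trying E -> E + T first."""
    # The first symbol of the alternative is E itself, so the procedure
    # calls itself at the same position without consuming any token.
    end = parse_E(tokens, pos)
    if end is not None and end < len(tokens) and tokens[end] == "+":
        return parse_T(tokens, end + 1)
    return parse_T(tokens, pos)


def never_terminates(tokens):
    """Report whether parse_E exhausts the call stack on this input."""
    try:
        parse_E(tokens, 0)
    except RecursionError:
        return True
    return False


print(never_terminates(["id", "+", "id"]))
```

With this ordering of alternatives the recursion happens even on a legal input such as id + id, because no token is consumed before E is expanded again.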

Difficulties with top-down parsing
Nondeterminism and backtracking: If we make a sequence of erroneous expansions (due to nondeterminism) and subsequently discover a mismatch, we have to undo the semantic effects of making these erroneous expansions; e.g., entries made in the symbol table might have to be removed.
Since undoing semantic actions requires substantial overhead, it is reasonable to consider top-down parsers that do no backtracking.
One technique for avoiding nondeterminism is known as left factoring.

Grammar transformations
To overcome the difficulties with recursive-descent parsing, we place restrictions on the grammar and transform it accordingly:
  eliminating left recursion (avoids infinite loops)
  left factoring (avoids nondeterminism)

Eliminating left recursion
In general, we can eliminate immediate left recursion as follows:
(a) Group the productions as:
      A → Aα₁ | Aα₂ | … | Aαₘ | β₁ | β₂ | … | βₙ    (no βᵢ begins with an A)
(b) Replace the A-productions by:
      A → β₁A' | β₂A' | … | βₙA'
      A' → α₁A' | α₂A' | … | αₘA' | ε
The above technique cannot eliminate left recursion involving derivations of two or more steps.
  E.g., S → Aa | b, A → Ac | Sd | ε
Algorithm 4.1 in the textbook can be used to systematically eliminate left recursion.
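Steps (a) and (b) can be sketched in Python as below. This is an illustrative sketch only: it handles immediate left recursion for a single nonterminal (not the multi-step case covered by Algorithm 4.1), it assumes no α is empty, and the function name and the representation of a production as a tuple of symbols (with `()` standing for ε) are assumptions made here.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    alternatives: list of tuples of symbols; the empty tuple () is epsilon.
    Returns a dict mapping each nonterminal to its list of alternatives.
    """
    new_nt = nonterminal + "'"
    # (a) Group: alphas follow a leading A; betas do not begin with A.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: list(alternatives)}  # nothing to eliminate
    # (b) Replace: A -> beta A', and A' -> alpha A' | epsilon.
    return {
        nonterminal: [beta + (new_nt,) for beta in betas],
        new_nt: [alpha + (new_nt,) for alpha in alphas] + [()],
    }


# E -> E + T | E - T | T
grammar = eliminate_immediate_left_recursion(
    "E", [("E", "+", "T"), ("E", "-", "T"), ("T",)]
)
for head, alts in grammar.items():
    print(head, "->", " | ".join(" ".join(a) or "ε" for a in alts))
```

Applied to E → E + T | E - T | T, the sketch yields E → T E' and E' → + T E' | - T E' | ε.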

Left factoring
We need to do backtracking if there is nondeterminism, e.g. A → αβ₁ | αβ₂
After seeing the input derived from α, should we expand A to αβ₁ or αβ₂?
Left factoring can avoid backtracking due to nondeterminism in expanding a nonterminal symbol.
Basic idea: when it is not clear which of two alternative productions to use to expand a nonterminal, rewrite the productions and defer the decision until we have seen enough of the input to make the right choice.

Left factoring
Left Factoring: Given A → αβ₁ | αβ₂, change it to
      A → αA'
      A' → β₁ | β₂
Defer the decision by expanding A to αA'; after seeing the input derived from α, we then expand A' to β₁ or β₂.
Algorithm 4.2 gives a method for left factoring a grammar.
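A single left-factoring pass can be sketched in Python as follows. This is a simplified illustration, not a transcription of Algorithm 4.2: it groups alternatives by their first symbol, factors out the longest prefix common to each group, and does not repeat the process on the new nonterminals. The function names and the tuple representation of productions (with `()` for ε) are assumptions made here.

```python
def common_prefix(seqs):
    """Longest common prefix of a list of symbol tuples."""
    prefix = []
    for symbols in zip(*seqs):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return tuple(prefix)


def left_factor_once(nonterminal, alternatives):
    """Rewrite A -> alpha b1 | alpha b2 as A -> alpha A', A' -> b1 | b2."""
    # Group the alternatives by their first symbol (None for epsilon).
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None
        groups.setdefault(key, []).append(alt)

    result = {nonterminal: []}
    count = 0
    for key, group in groups.items():
        if key is None or len(group) < 2:
            result[nonterminal].extend(group)  # nothing to factor
            continue
        count += 1
        new_nt = nonterminal + "'" * count  # A', A'', ...
        alpha = common_prefix(group)
        result[nonterminal].append(alpha + (new_nt,))
        result[new_nt] = [alt[len(alpha):] for alt in group]
    return result


# A -> ab | a  becomes  A -> a A',  A' -> b | epsilon
print(left_factor_once("A", [("a", "b"), ("a",)]))
```

For the grammar used earlier, A → ab | a, the result is A → aA' with A' → b | ε, so a parser can consume a first and decide between the two alternatives afterwards without backing up.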

Top-Down Parsing
  Recursive-Descent Parsing (may have backtracking)
  Predictive Parsing (no backtracking)
  Nonrecursive Predictive Parsing (no recursion)
No Left Recursion, Left Factoring

Summary
Top-down parsing finds a leftmost derivation for an input string.
Recursive-descent parsing is a simple yet effective top-down parsing method.
In recursive-descent parsing, we use a set of recursive procedures obtained from the CFG to process the input.
We need to eliminate left recursion and nondeterminism in recursive-descent parsing.
