COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.



Overview of the Subject (COMP 3438)
Course Organization (this lecture: Syntax Analysis)
Part I: Unix System Programming (Device Driver Development)
- Overview of Unix Sys. Prog.
- Process
- File System
- Overview of Device Driver Development
- Character Device Driver Development
- Introduction to Block Device Driver
Part II: Compiler Design
- Overview of Compiler Design
- Lexical Analysis
- Syntax Analysis (HW #4)

Outline
Part I: Introduction to Syntax Analysis
1. Input (Tokens) and Output (Parse Tree)
2. How to specify syntax? Context Free Grammar (CFG)
3. How to obtain the parse tree?
   - CFG → (remove left recursion, left factoring, ambiguity) → LL (Leftmost Derivation)
   - CFG → (remove ambiguity) → LR (Reverse Rightmost Derivation)
Part II: Context Free Grammar, Parse Tree and Ambiguity
Part III: Top-down Parsing (LL)
- Left Recursion, Left Factoring (Tutorial)
- Recursive-Descent Parsing
- Predictive Parsing (without backtracking), HW4
- Nonrecursive Predictive Parsing
Part IV: Bottom-up Parsing (LR)
- SLR, Canonical LR, LALR
Software Tool: yacc (Lab)

Part III: Intro. to Top-Down Parsing

Parsing
Goal: Given an input string and a language, (1) check whether the string belongs to the language; (2) if yes, construct the parse tree.
Example: Input string: aabb. Language: {a^n b^n | n > 0}.
Parsing: the parser takes aabb and produces the parse tree S → a S b, where the inner S expands as S → a b.
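The goal stated above can be made concrete with a small sketch. It assumes the grammar S → a S b | a b (read off the parse tree in the example) and represents a parse tree as a nested tuple; the function names `parse_S` and `parse` are illustrative only and do not come from the lecture.

```python
def parse_S(s, pos=0):
    """Try to derive a prefix of s[pos:] from S.

    Returns (tree, next_pos) on success, or None on failure.
    A tree is a nested tuple such as ("S", "a", ("S", "a", "b"), "b").
    """
    if pos < len(s) and s[pos] == "a":
        # Alternative 1: S -> a S b
        inner = parse_S(s, pos + 1)
        if inner is not None:
            subtree, nxt = inner
            if nxt < len(s) and s[nxt] == "b":
                return ("S", "a", subtree, "b"), nxt + 1
        # Alternative 2: S -> a b
        if pos + 1 < len(s) and s[pos + 1] == "b":
            return ("S", "a", "b"), pos + 2
    return None


def parse(s):
    """Return the parse tree if s is in {a^n b^n | n > 0}, else None."""
    result = parse_S(s)
    if result is not None and result[1] == len(s):
        return result[0]  # the whole input was consumed
    return None
```

For the input aabb, `parse` returns a tree whose root applies S → a S b and whose inner node applies S → a b; for a string outside the language, such as aab, it returns None.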

Top-down parsing methods
Top-down parsing may be viewed as an attempt to find a leftmost derivation for an input string.
- Begin from the start symbol and try to regenerate the input string.
- Substitute the correct choice of production at each step, guided by looking at the "next" terminal in the input string.
- Construct a parse tree for the input string from the root, creating the nodes of the parse tree in preorder.
[Figure: parse tree rooted at S; the visible labels include A, =, *, 4, /, 5]

Top-down parsing methods
We shall concentrate on a rather simple and yet quite effective top-down parsing method, called recursive descent.
- Write a recursive recognizer (subroutine) for each grammar rule.
- If a rule succeeds, perform some action (e.g., build a tree node, emit code).
- If a rule fails, return failure. The caller may try another choice or fail.
- On failure, the parser "backs up".
We will study an efficient way of implementing the method, called predictive parsing.

Recursive-descent parsing
Recursive-descent parsing involves executing a set of recursive procedures to process the input. A procedure is associated with each nonterminal of a grammar. The recursive procedures can be quite easy to write and fairly efficient if written in a language that implements recursive procedure calls efficiently.
Let us consider the grammar:
S → cAd
A → ab | a
The procedures defined for nonterminals S and A follow.

Procedure S()
begin
    isave := input-pointer;
    if input symbol = 'c' then
    begin
        ADVANCE();
        if A() then
            if input symbol = 'd' then
            begin ADVANCE(); return TRUE end
    end;
    input-pointer := isave; /* failure: restore the input pointer */
    return FALSE
end

Procedure A()
begin
    isave := input-pointer;
    if input symbol = 'a' then
    begin
        ADVANCE();
        if input symbol = 'b' then
        begin ADVANCE(); return TRUE end
    end;
    input-pointer := isave; /* failure to find ab */
    if input symbol = 'a' then
    begin ADVANCE(); return TRUE end
    else return FALSE
end

The procedure ADVANCE() moves the input pointer to the next symbol. Procedures S() and A() return TRUE or FALSE, depending on whether or not they have found on the input a string derived from the corresponding nonterminal. Note that, on failure, each procedure leaves the input pointer where it was when the procedure was called, and that on success it moves the input pointer over the substring recognized.
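As a runnable counterpart to the procedures for S and A, here is a minimal Python sketch for the same grammar (S → cAd, A → ab | a). The class name `Parser`, the helper `accepts`, and the use of a string index as the input pointer are assumptions made for illustration. The sketch restores the pointer in `S` as well as in `A`, so that the stated rule (a failing procedure leaves the input pointer where it found it) holds for both.

```python
class Parser:
    """Recursive-descent recognizer for  S -> c A d,  A -> a b | a."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # the input pointer

    def symbol(self):
        """Current input symbol, or None at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else None

    def advance(self):
        """Move the input pointer to the next symbol."""
        self.pos += 1

    def S(self):
        isave = self.pos
        if self.symbol() == "c":
            self.advance()
            if self.A() and self.symbol() == "d":
                self.advance()
                return True
        self.pos = isave  # failure: leave the pointer where it started
        return False

    def A(self):
        isave = self.pos
        # First alternative: A -> a b
        if self.symbol() == "a":
            self.advance()
            if self.symbol() == "b":
                self.advance()
                return True
        self.pos = isave  # failed to find "ab": back up
        # Second alternative: A -> a
        if self.symbol() == "a":
            self.advance()
            return True
        return False


def accepts(text):
    """True if the whole of text is derived from S."""
    p = Parser(text)
    return p.S() and p.pos == len(text)
```

On the input cad, `A` first tries ab, fails at d, backs up, and then succeeds with the alternative a. That is the backtracking step the procedures are meant to illustrate.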

[Figure: Partially Completed Recursive Descent Parse for Assignments]

Difficulties with top-down parsing
Left recursion: A grammar G is said to be left-recursive if there is a derivation A ⇒+ Aα for some nonterminal A and some string α, e.g. E → E + T | E - T | T.
A left-recursive grammar can cause a top-down parser to go into an infinite loop: when we try to expand A, we may again try to expand A without consuming any input. This cycling will surely occur on an erroneous input string, and it may also occur on legal inputs, depending on the order in which the alternatives for A are tried.
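The cycling described above is easy to reproduce. The following sketch naively transcribes E → E + T | T into a procedure that tries the left-recursive alternative first; T is simplified to a single `id` token, and all function names are illustrative assumptions. Python's recursion limit stands in for the infinite loop. With this ordering of alternatives the loop occurs on every input, legal or not.

```python
def T(tokens, pos):
    """T -> id (simplified): return the next position, or None on failure."""
    if pos < len(tokens) and tokens[pos] == "id":
        return pos + 1
    return None


def E(tokens, pos=0):
    """Naive transcription of E -> E + T | T, trying E + T first."""
    after_e = E(tokens, pos)  # re-enters E at the same pos: no input consumed
    if after_e is not None and after_e < len(tokens) and tokens[after_e] == "+":
        after_t = T(tokens, after_e + 1)
        if after_t is not None:
            return after_t
    return T(tokens, pos)


def loops_forever(tokens):
    """Report whether parsing runs into Python's recursion limit."""
    try:
        E(tokens)
    except RecursionError:
        return True
    return False
```

Calling `loops_forever(["id", "+", "id"])` reports True even though the input is a legal sentence: the first action of `E` is to call `E` again at the same input position.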

Difficulties with top-down parsing
Nondeterminism and backtracking: If we make a sequence of erroneous expansions (due to nondeterminism) and subsequently discover a mismatch, we have to undo the semantic effects of those expansions; e.g., entries made in the symbol table might have to be removed. Since undoing semantic actions requires substantial overhead, it is reasonable to consider top-down parsers that do no backtracking. One technique for avoiding nondeterminism is known as left factoring.

Grammar transformations
In particular, a grammar can be transformed to overcome the difficulties with recursive-descent parsing:
- eliminating left recursion (avoids infinite loops)
- left factoring (avoids nondeterminism)

Eliminating left recursion
Left recursion can always be removed, e.g. A → A b | c. Rewrite the grammar so that each production "makes" some progress:
A → c A'
A' → b A' | ε
In general, we can eliminate all immediate left recursion by replacing
A → A α | β
with
A → β A'
A' → α A' | ε
Consider the following grammar of arithmetic expressions:
E → E + T | T
T → T * F | F
F → ( E ) | id
By eliminating left recursion, we obtain
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
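Once left recursion has been eliminated, each nonterminal of the expression grammar can be written as one procedure without any risk of cycling. The sketch below is a recognizer only (it builds no tree) for E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id. It assumes the input is already a list of token strings such as "id", "+", "*", "(", ")"; the function and helper names are illustrative.

```python
def parse_expression(tokens):
    """Return True if tokens form a sentence of the transformed grammar."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def match(expected):
        nonlocal pos
        if peek() == expected:
            pos += 1
            return True
        return False

    def E():            # E -> T E'
        return T() and E_prime()

    def E_prime():      # E' -> + T E' | ε
        if peek() == "+":
            return match("+") and T() and E_prime()
        return True     # ε: consume nothing

    def T():            # T -> F T'
        return F() and T_prime()

    def T_prime():      # T' -> * F T' | ε
        if peek() == "*":
            return match("*") and F() and T_prime()
        return True     # ε: consume nothing

    def F():            # F -> ( E ) | id
        if peek() == "(":
            return match("(") and E() and match(")")
        return match("id")

    return E() and pos == len(tokens)
```

Every procedure either consumes a token before recursing or returns immediately, so the recursion depth is bounded by the length and nesting of the input.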

Eliminating left recursion
In general, we can eliminate immediate left recursion as follows:
(a) Group the productions as
A → A α₁ | A α₂ | … | A αₘ | β₁ | β₂ | … | βₙ (no βᵢ begins with an A)
(b) Replace the A-productions by
A → β₁ A' | β₂ A' | … | βₙ A'
A' → α₁ A' | α₂ A' | … | αₘ A' | ε
The above technique cannot eliminate left recursion involving derivations of two or more steps, e.g. S → Aa | b, A → Ac | Sd | ε. Algorithm 4.1 in the textbook can be used to systematically eliminate left recursion.
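Steps (a) and (b) translate almost directly into code. The sketch below handles immediate left recursion for a single nonterminal only; it is not the textbook's Algorithm 4.1. The representation is an assumption: each alternative is a list of symbols, the empty list stands for ε, the new nonterminal is named by appending a prime, and no αᵢ is assumed to be ε.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Apply steps (a) and (b) to the productions of one nonterminal.

    alternatives is a list of right-hand sides, each a list of symbols;
    the empty list [] stands for ε. Returns a dict mapping each
    nonterminal to its new list of alternatives.
    """
    # (a) Group: A -> A α_i  versus  A -> β_j
    alphas = [alt[1:] for alt in alternatives if alt[:1] == [nonterminal]]
    betas = [alt for alt in alternatives if alt[:1] != [nonterminal]]
    if not alphas:
        return {nonterminal: alternatives}  # no immediate left recursion

    new_nt = nonterminal + "'"  # assumes this name is not already in use
    # (b) A -> β_j A'   and   A' -> α_i A' | ε
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }
```

Applied to E → E + T | T, the function yields E → T E' and E' → + T E' | ε. Applied to A → Ac | Sd | ε, it removes only the immediate recursion through Ac; the indirect recursion through S remains, which is the limitation noted above.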

Left factoring
We need backtracking if there is nondeterminism, e.g. A → α β₁ | α β₂. After seeing the input derived from α, should we expand to β₁ or β₂?
Left factoring can avoid backtracking due to nondeterminism in expanding a nonterminal symbol.
Basic idea: when it is not clear which of two alternative productions to use to expand a nonterminal, rewrite the productions and defer the decision until we have seen enough of the input to make the right choice.

Left factoring
Given A → α β₁ | α β₂, change it to
A → α A'
A' → β₁ | β₂
Defer the decision by expanding A to α A'; after seeing the input derived from α, we then expand A' to β₁ or β₂. Algorithm 4.2 gives a method for left factoring a grammar.

An Example for Left Factoring
Given A → α β₁ | α β₂, change it to
A → α A'
A' → β₁ | β₂
Example:
S → if E then S | if E then S else S | others
is changed to
S → if E then S S' | others
S' → else S | ε
The resulting grammar is still ambiguous (the dangling-else problem); we will learn how to deal with this later.
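A single left-factoring step can be sketched as follows. This is a simplified illustration, not the textbook's Algorithm 4.2: it factors only the first group of alternatives that share a leading symbol, and it assumes alternatives are lists of symbols, with [] standing for ε and the new nonterminal named by appending a prime.

```python
def common_prefix(seqs):
    """Longest common prefix of a non-empty list of symbol lists."""
    prefix = []
    for symbols in zip(*seqs):
        if any(sym != symbols[0] for sym in symbols):
            break
        prefix.append(symbols[0])
    return prefix


def left_factor_once(nonterminal, alternatives):
    """Factor one group of alternatives that share a common prefix α."""
    # Group alternatives by their first symbol (None for ε).
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)

    for first, group in groups.items():
        if first is None or len(group) < 2:
            continue
        alpha = common_prefix(group)
        new_nt = nonterminal + "'"  # assumes this name is unused
        rest = [alt for alt in alternatives if alt not in group]
        return {
            nonterminal: [alpha + [new_nt]] + rest,          # A  -> α A' | ...
            new_nt: [alt[len(alpha):] for alt in group],     # A' -> β_1 | β_2
        }
    return {nonterminal: alternatives}  # nothing to factor
```

For the if-then-else productions, the common prefix α is `if E then S`, so the function returns S → if E then S S' | others together with S' → ε | else S, matching the rewritten grammar in the example up to the order of alternatives.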

Top-Down Parsing
- Recursive-Descent Parsing (may have backtracking)
- Predictive Parsing (no backtracking)
- Nonrecursive Predictive Parsing (no recursion)
Grammar requirements: No Left Recursion; Left Factoring

Summary
- Top-down parsing finds a leftmost derivation for an input string.
- Recursive-descent parsing is a simple and yet effective top-down parsing method.
- In recursive-descent parsing, we use a set of recursive procedures obtained from the CFG to process the input.
- We need to eliminate left recursion and nondeterminism for recursive-descent parsing.