1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.

Slides:



Advertisements
Similar presentations
Compiler Construction
Advertisements

Top-Down Parsing.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Parsing III (Eliminating left recursion, recursive descent parsing)
1 Predictive parsing Recall the main idea of top-down parsing: Start at the root, grow towards leaves Pick a production and try to match input May need.
1 The Parser Its job: –Check and verify syntax based on specified syntax rules –Report errors –Build IR Good news –the process can be automated.
Professor Yihjia Tsai Tamkang University
Top-Down Parsing.
– 1 – CSCE 531 Spring 2006 Lecture 7 Predictive Parsing Topics Review Top Down Parsing First Follow LL (1) Table construction Readings: 4.4 Homework: Program.
COP4020 Programming Languages Computing LL(1) parsing table Prof. Xin Yuan.
Syntax and Semantics Structure of programming languages.
Parsing Chapter 4 Parsing2 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First.
Chapter 9 Syntax Analysis Winter 2007 SEG2101 Chapter 9.
Review: –How do we define a grammar (what are the components in a grammar)? –What is a context free grammar? –What is the language defined by a grammar?
Top-Down Parsing - recursive descent - predictive parsing
Chapter 5 Top-Down Parsing.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program. This.
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
1 Compiler Construction Syntax Analysis Top-down parsing.
컴파일러 입문 제 7 장 LL 구문 분석.
Lesson 5 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
CSI 3120, Syntactic analysis, page 1 Syntactic Analysis and Parsing Based on A. V. Aho, R. Sethi and J. D. Ullman Compilers: Principles, Techniques and.
Syntax and Semantics Structure of programming languages.
11 Chapter 4 Grammars and Parsing Grammar Grammars, or more precisely, context-free grammars, are the formalism for describing the structure of.
6/4/2016IT 3271 The most practical Parsers: Predictive parser: 1.input (token string) 2.Stacks, parsing table 3.output (syntax tree, intermediate codes)
Pembangunan Kompilator.  The parse tree is created top to bottom.  Top-down parser  Recursive-Descent Parsing ▪ Backtracking is needed (If a choice.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Top-Down Parsing The parse tree is created top to bottom. Top-down parser –Recursive-Descent Parsing.
Parsing Top-Down.
LL(1) Parser. What does LL signify ? The first L means that the scanning takes place from Left to right. The first L means that the scanning takes place.
1 Compiler Construction Syntax Analysis Top-down parsing.
Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
1 Context free grammars  Terminals  Nonterminals  Start symbol  productions E --> E + T E --> E – T E --> T T --> T * F T --> T / F T --> F F --> (F)
1 Nonrecursive Predictive Parsing  It is possible to build a nonrecursive predictive parser  This is done by maintaining an explicit stack.
Top-Down Parsing.
Syntax Analyzer (Parser)
Chapter 3 Chang Chi-Chung
Top-Down Predictive Parsing We will look at two different ways to implement a non- backtracking top-down parser called a predictive parser. A predictive.
Parsing methods: –Top-down parsing –Bottom-up parsing –Universal.
COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Chapter 2 (part) + Chapter 4: Syntax Analysis S. M. Farhad 1.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Spring 16 CSCI 4430, A Milanova 1 Announcements HW1 due on Monday February 8 th Name and date your submission Submit electronically in Homework Server.
Syntax and Semantics Structure of programming languages.
Parsing COMP 3002 School of Computer Science. 2 The Structure of a Compiler syntactic analyzer code generator program text interm. rep. machine code tokenizer.
Context free grammars Terminals Nonterminals Start symbol productions
Compilers Welcome to a journey to CS419 Lecture15: Syntax Analysis:
Table-driven parsing Parsing performed by a finite state machine.
Syntactic Analysis and Parsing
Introduction to Top Down Parser
Top-down parsing cannot be performed on left recursive grammars.
SYNTAX ANALYSIS (PARSING).
FIRST and FOLLOW Lecture 8 Mon, Feb 7, 2005.
Top-Down Parsing.
3.2 Language and Grammar Left Factoring Unclear productions
Top-Down Parsing CS 671 January 29, 2008.
Lecture 7 Predictive Parsing
Syntax Analysis source program lexical analyzer tokens syntax analyzer
Top-Down Parsing Identify a leftmost derivation for an input string
Top-Down Parsing The parse tree is created top to bottom.
Chapter 4 Top Down Parser.
Predictive Parsing Lecture 9 Wed, Feb 9, 2005.
Computing Follow(A) : All Non-Terminals
Syntax Analysis - Parsing
Lecture 7 Predictive Parsing
Nonrecursive Predictive Parsing
Context Free Grammar – Quick Review
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Presentation transcript:

1 Chapter 4: Top-Down Parsing

2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct a parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder.

Input String : >>> lm

4 1. with backtracking (making repeated scans of the input, a general form of top-down parsing) Methods: To create a procedure for each nonterminal. Approaches of Top-Down Parsing

e.g. S -> cAd A -> ab | a S( ) { if input symbol == ‘c’ A( ) { isave= input-pointer; { Advance(); if input-symbol == ‘a’ if A() { Advance(); if input-symbol == ‘d’ if input-symbol == ‘b’ { Advance(); { Advance(); return true; return true; } } return false; input-pointer = isave; } if input-symbol == ‘a’ { Advance(); return true; } else return false; } L = { cabd, cad } c ad

Problems for top-down parsing with backtracking : (1) left-recursion (can cause a top-down parser to go into an infinite loop) Def. A grammar is said to be left-recursive if it has a nonterminal A s.t. there is a derivation A => A  for some . (2) backtracking - undo not only the movement but also the semantics entering in symbol table. (3) the order the alternatives are tried (For the grammar shown above, try w = cabd where A -> a is applied first) +

7 With immediate left recursion: A -> A  |  ==> transform into A ->  A' A' ->  A' |  A A  A . A A .  ===> A  A'  ..  Elimination of Left-Recursion   …  

8 e.g. E -> E + T | T T -> T * F | F F -> (E) | id After transformation: E -> TE ' E' -> +TE' |  T -> FT ' T ' -> *FT' |  F -> (E) | id

9 General form (with left recursion): A -> A  1 | A  2 |... | A  n |  1 |  2 |... |  m After transformation: ==> A ->  1 A' |  2 A' |... |  m A' A' ->  1 A' |  2 A' |... |  n A' | 

10 How about left recursion occurred for derivation with more than two steps? e.g., S -> Aa | b A -> Ac | Sd | e where S => Aa => Sda

Algorithm: Eliminating left recursion Input Context-free Grammar G with no cycles (i.e., A => A ) or  -production Methods: 1. Arrange the nonterminals in some order A1, A2,..., An 2. for i = 1 to n do { for j = 1 to i -1 do replace each production of the form Ai -> Aj  by the production Ai ->  1  |  2  |... |  k , where Aj ->  1 |  2 |... |  k are all current Aj-production; eliminate the immediate left-recursion among the Ai- production; } +

12 An Example e.g. S -> Aa | b A -> Ac | Sd | e Step 1: ==> S -> Aa | b Step 2: ==> A -> Ac | Aad | bd | e Step 3: ==> A -> bdA' |eA' A' -> cA' |adA' | 

2. Non-backtracking (recursive-descent) parsing recursive descent : use a collection of mutually recursive routines to perform the syntax analysis. Left Factoring : A ->  1 |   2 ==> A ->  A' A' ->  1 |  2 Methods: 1. For each nonterminal A find the longest prefix  common to two or more of its alternatives. If    replace all the A productions A ->   1 |   2 |... |   n | others by A ->  A‘ | others A' ->  1 |  2 |... |  n 2. Repeat the transformation until no more found e.g. S -> iCtS | iCtSeS | a C -> b ==> S -> iCtSS' | a S' -> eS |  C -> b

14 Predicative Parsing Features: - maintains a stack rather than recursive calls - table-driven Components: 1. An input buffer with end marker ($) 2. A stack with endmarker ($) on the bottom 3. A parsing table, a two-dimensional array M[A,a], where ‘A’ is a nonterminal symbol and ‘a’ is the current input symbol (terminal/token).

15 Parsing Table M[A,a] S ( S  ( S ) S ) S  ε $

16 Algorithm: Input: An input string w and a parsing table M for grammar G. Output: A leftmost derivation of w or an error indication.

Initially w$ is in input buffer and S$ is in the stack. Method: do { Let a of w be the next input symbol and X be the top stack symbol; if X is a terminal { if X == a then pop X from stack and remove a from input; else ERROR();} else { if M[X, a] = X -> Y1Y2...Yn then 1. pop X from the stack; 2. push YnYn-1...Y1 onto the stack with Y1 on top; else ERROR(); } } while (X ≠ $) if (X == $) and (the next input symbol == $) then accept else error(); Starting Symbol of the grammar

19 An Example

22 Construction of the parsing table for predictive parser First and Follow Def. First(  ) /*  denotes grammar symbol*/ is the set of terminals that begin the string derived from . If  => , then  is also in First(  ). Def. Follow(A), A is a nonterminal, is the set of terminals a that can appear immediately to the right of A in some sentential form, that is, the set of terminals 'a' s.t. there exists a derivation of the form S =>*  A a  for some  and . If A can be the rightmost symbol in some sentential form, then  is in Follow(A). *

23 Compute First(X) for all grammar symbols X: 1. If X is terminal, then First(X) = {X}. 2. If X ->  is a production then  is in First(X). 3. If X is nonterminal and X -> Y1Y2...Yk is a production, then place 'a' in First(X) if for some i, a is in First(Yi), and  is in all of First(Y1),..., First(Yi-1); that is Y1... Yi-1 => . If  is in First(Yj) for all j = 1,2,...,k, then add  in First(X). *

24 An Example E -> TE' E' -> +TE'|  T -> FT' T' -> *FT‘ |  F -> (E) | id First(E) = First(T) = First(F) = {(, id} First(E') = {+,  } First(T') = {*,  }

25

26 Compute Follow(A) for all nonterminals A 1. Place $ in Follow(S), where S is the start symbol and $ is the input buffer endmarker. 2. If there is a production A ->  B , then everything in First(  ) except for  is placed in Follow(B). 3. If there is a production A ->  B, or a production A ->  B  where First(  ) contains , then everything in Follow(A) is in Follow(B).

27 An Example E -> TE' E' -> +TE'|  T -> FT' T' -> *FT' |  F -> (E) | id /* E is the start symbol */ Follow(E) = { $,) } // rules 1 & 2 Follow(E') = { $,) } // rule 3 Follow(T) = { +,$,) } // rules 2 & 3 Follow(T') = { +,$,) } // rule 3 Follow(F) = { *,+,$,) } // rules 2 & 3

28 E -> TE' E' -> +TE'|  T -> FT' T' -> *FT‘ |  F -> (E) | id First(E) = First(T) = First(F) = {(, id} First(E') = {+,  } First(T') = {*,  }

29 Construct a Predicative Parsing Table 1. For each production A ->  of the grammar, do steps 2 and For each terminal a in First(  ), add A ->  to M[A, a]. 3. If  is in First(  ), add A ->  to M[A, b] for each terminal b in Follow(A). If  is in First(  ) and $ is in Follow(A), add A ->  to M[A, $]. 4. Make each undefined entry of M be error.

LL(1) grammar A grammar whose parsing table has no multiply-defined entries is said to be LL(1). First 'L' : scan the input from left to right. Second 'L': produce a leftmost derivation. '1' : use one input symbol to determine parsing action. * No ambiguous or left-recursive grammar can be LL(1).

31 Properties of LL(1) grammar A grammar G is LL(1) iff whenever A ->  |  are two distinct productions of G, the following conditions hold: (1) For no terminal a do both  and  derive strings beginning with a. (based on method 2)  First(  ) ∩ First(  ) = ψ (2) At most one of  and  can derive the empty string  (based on method 3). (3) if  =>  then  does not derive any string beginning with a terminal in Follow (A) (based on methods 2 and 3).  First(  ) ∩ Follow(A) = ψ (i.e. If First(A) contains  then First(A) ∩ Follow(A) = ψ) *

32 Def. for Multiply-defined entry If G is left-recursive or ambiguous, then M will have at least one multiply-defined entry. e.g. S -> iCtSS'| a S' -> eS |  C -> b generates: M[S',e] = { S' -> , S' -> eS} with multiply- defined entry.

33 Parsing table with multiply-defined entry abeit$ SS-> aS -> iCtSS' S’ S’->  S' -> eS S’->  CC->b

34 Difficulty in predictive parsing - Left recursion elimination and left factoring make the resulting grammar hard to read and difficult to use for translation purpose. Thus: * Use predictive parser for control constructs * Use operator precedence for expressions.

35 Assignment #3b Do exercises 4.3, 4.10, 4.13, 4.15