Top-Down Parsing.

Relationship between parser types

Recursive descent
Recursive descent parsers simply try to build a top-down parse tree.
It would be better if we always knew the correct action to take.
It would be better if we could avoid recursive procedure calls during parsing.

Predictive parsers
A predictive parser always knows which production to use, so it never has to backtrack.
Example: for the productions
  stmt -> if ( expr ) stmt else stmt
        | while ( expr ) stmt
        | for ( stmt expr stmt ) stmt
a recursive descent parser always knows which production to use, because each alternative begins with a different input token.
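
The example above can be sketched as a small predictive recursive-descent recognizer. This is only an illustration under stated assumptions: the slide's grammar has no base case for stmt and does not define expr, so the sketch treats expr as a single token "expr" and adds a hypothetical production stmt -> other. The names parse_stmt, expect and ParseError are made up for this sketch.

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


def expect(tokens, pos, token):
    """Match one terminal at tokens[pos] and return the next position."""
    if pos >= len(tokens) or tokens[pos] != token:
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise ParseError(f"expected {token!r} at position {pos}, found {found!r}")
    return pos + 1


def parse_stmt(tokens, pos=0):
    """Recognize one stmt starting at tokens[pos]; return the position after it."""
    lookahead = tokens[pos] if pos < len(tokens) else None
    if lookahead == "if":        # stmt -> if ( expr ) stmt else stmt
        for token in ("if", "(", "expr", ")"):
            pos = expect(tokens, pos, token)
        pos = parse_stmt(tokens, pos)
        pos = expect(tokens, pos, "else")
        return parse_stmt(tokens, pos)
    if lookahead == "while":     # stmt -> while ( expr ) stmt
        for token in ("while", "(", "expr", ")"):
            pos = expect(tokens, pos, token)
        return parse_stmt(tokens, pos)
    if lookahead == "for":       # stmt -> for ( stmt expr stmt ) stmt
        pos = expect(tokens, pos, "for")
        pos = expect(tokens, pos, "(")
        pos = parse_stmt(tokens, pos)
        pos = expect(tokens, pos, "expr")
        pos = parse_stmt(tokens, pos)
        pos = expect(tokens, pos, ")")
        return parse_stmt(tokens, pos)
    if lookahead == "other":     # ASSUMED base case, not in the slide's grammar
        return pos + 1
    raise ParseError(f"no production for stmt starts with {lookahead!r}")
```

The single lookahead token selects the branch, and no branch is ever retried, which is what makes the parser predictive.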

Transition diagrams
Transition diagrams can describe recursive parsers, just as they can describe lexical analyzers (but the diagrams are slightly different).
Construction:
1. Eliminate left recursion from G.
2. Left factor G.
3. For each nonterminal A:
   a. Create an initial state and a final (return) state.
   b. For each production A -> X1 X2 … Xn, create a path from the initial state to the final state with edges labeled X1, X2, …, Xn.

Example transition diagrams
An expression grammar with left recursion:
  E -> E + T | T
  T -> T * F | F
  F -> ( E ) | id
The corresponding transition diagrams are drawn for the grammar with the left recursion eliminated:
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
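
The transformation used in this example, replacing A -> A α | β with A -> β A' and A' -> α A' | ε, can be sketched as follows. This is a minimal sketch that handles only immediate left recursion on a single nonterminal; the function name and the list-of-symbols representation (with an empty list standing for ε) are assumptions made for the illustration.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A' and A' -> a A' | ε.

    alternatives is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each nonterminal to its new alternatives.
    """
    # Tails of the left-recursive alternatives (the alpha parts).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives that do not start with the nonterminal (the beta parts).
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    fresh = nonterminal + "'"  # hypothetical naming scheme for the new nonterminal
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[]],  # [] is ε
    }


print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Applying the same function to T -> T * F | F yields the T and T' productions of the transformed grammar.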

The parsing table and parsing program
The table is a 2D array M[A,a], where A is a nonterminal and a is a terminal or $.
At each step, the parser considers the top-of-stack symbol X and the current input symbol a:
1. If both are $, accept.
2. If X is a terminal and X = a, pop X and advance the input. A terminal X that differs from a is an error.
3. If X is a nonterminal, consult M[X,a]. If M[X,a] is “ERROR”, call an error recovery routine. Otherwise M[X,a] is a production of the grammar, say X -> UVW; replace X on the stack with WVU (U on top).
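
The three cases above can be sketched as a short driver loop. This is a sketch, not a definitive implementation: the table is written out by hand for the expression grammar E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id, missing keys play the role of ERROR entries, and a SyntaxError stands in for the error recovery routine. The names TABLE, NONTERMINALS and ll1_parse are assumptions of the sketch.

```python
EPS = ()  # an empty right-hand side stands for ε

# M[X, a] for the expression grammar; a missing key means ERROR.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions used, in order; raise SyntaxError on bad input."""
    stack = ["$", start]          # top of stack is the end of the list
    symbols = list(tokens) + ["$"]
    i = 0
    output = []
    while True:
        X, a = stack[-1], symbols[i]
        if X == "$" and a == "$":          # case 1: accept
            return output
        if X not in NONTERMINALS:          # case 2: X is a terminal (or $)
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()
            i += 1
        else:                              # case 3: consult the table
            rhs = TABLE.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))    # push WVU so that U ends up on top
            output.append(f"{X} -> {' '.join(rhs) or 'ε'}")


for production in ll1_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

The printed productions form a leftmost derivation of the input, which is why an explicit stack can replace the recursive procedure calls.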

Predictive parsing without recursion
To get rid of the recursive procedure calls, we maintain our own stack.

Example
Use the table-driven predictive parser to parse id + id * id, assuming the parsing table for the grammar
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
Initial stack: $E
Initial input: id + id * id $

Building a predictive parse table
The construction requires two functions:
1. FIRST
2. FOLLOW

FIRST
For a string of grammar symbols α, FIRST(α) is the set of terminals that begin the strings derivable from α. If α =*> ε, then ε is also in FIRST(α).
For the grammar
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
the FIRST sets are:
  FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
  FIRST(E') = { +, ε }
  FIRST(T') = { *, ε }

FOLLOW
FOLLOW(A), for a nonterminal A, is the set of terminals that can appear immediately to the right of A in some sentential form. If A can be the last symbol in a sentential form, then $ is also in FOLLOW(A).
For the grammar
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
the FOLLOW sets are:
  FOLLOW(E) = { ), $ }
  FOLLOW(E') = FOLLOW(E) = { ), $ }
  FOLLOW(T) = { + } ∪ FOLLOW(E) = { +, ), $ }
  FOLLOW(T') = FOLLOW(T) = { +, ), $ }
  FOLLOW(F) = { *, +, ), $ }

How to compute FIRST(α)
If X is a terminal, FIRST(X) = {X}. Otherwise (X is a nonterminal):
1. If X -> ε is a production, add ε to FIRST(X).
2. If X -> Y1 … Yk is a production, place a in FIRST(X) if, for some i, a is in FIRST(Yi) and Y1 … Yi-1 =*> ε. If every Yi derives ε, add ε to FIRST(X).
Given FIRST(X) for all single symbols X, FIRST(X1 … Xn) starts as FIRST(X1) without ε. If ε ∈ FIRST(X1), add FIRST(X2) without ε, and so on. If every Xi can derive ε, add ε as well.
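
These rules can be sketched as a fixed-point iteration: keep applying them to every production until no FIRST set grows. The dictionary representation of the grammar (an empty list for an ε production) and the names compute_first and first_of_string are assumptions made for this sketch.

```python
EPS = "ε"

# The expression grammar; [] stands for an ε production.
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST(X1 ... Xn), given the FIRST sets of the nonterminals."""
    result = set()
    for symbol in symbols:
        # A terminal's FIRST set is just the terminal itself.
        symbol_first = first[symbol] if symbol in first else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result          # this symbol cannot vanish, so stop here
    result.add(EPS)                # every symbol (or none at all) can derive ε
    return result


def compute_first(grammar):
    """Iterate the FIRST rules until no set changes."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for nonterminal, alternatives in grammar.items():
            for alternative in alternatives:
                new = first_of_string(alternative, first)
                if not new <= first[nonterminal]:
                    first[nonterminal] |= new
                    changed = True
    return first


for nonterminal, symbols in compute_first(GRAMMAR).items():
    print(nonterminal, sorted(symbols))
```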

How to compute FOLLOW(A)
1. Place $ in FOLLOW(S), where S is the start symbol.
2. If A -> α B β is a production, place everything in FIRST(β) except ε in FOLLOW(B).
3. If there is a production A -> α B, or a production A -> α B β where β =*> ε, then everything in FOLLOW(A) is in FOLLOW(B).
Repeatedly apply these rules until no FOLLOW set changes.
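
The FOLLOW rules can be sketched the same way, as a loop that runs until no set changes. To keep the example short, the FIRST sets of the expression grammar are supplied by hand rather than computed; the names compute_follow and first_of_string and the data layout are assumptions of the sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for an ε production
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST sets of the nonterminals, entered by hand for this sketch.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols, first):
    """FIRST of a symbol sequence; an empty sequence yields {ε}."""
    result = set()
    for symbol in symbols:
        symbol_first = first[symbol] if symbol in first else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, first, start):
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add("$")                       # rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for alternative in alternatives:
                for i, b in enumerate(alternative):
                    if b not in grammar:         # only nonterminals have FOLLOW
                        continue
                    beta_first = first_of_string(alternative[i + 1:], first)
                    new = beta_first - {EPS}     # rule 2
                    if EPS in beta_first:        # rule 3: β is empty or β =*> ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


for nonterminal, symbols in compute_follow(GRAMMAR, FIRST, "E").items():
    print(nonterminal, sorted(symbols))
```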

Example FIRST and FOLLOW
For our favorite grammar:
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
What are FIRST() and FOLLOW() for all nonterminals?

Parse table construction with FIRST/FOLLOW
Basic idea: if A -> α and a is in FIRST(α), then we expand A to α whenever the current input is a and the top of stack is A.
Algorithm:
For each production A -> α in G:
1. For each terminal a in FIRST(α), add A -> α to M[A,a].
2. If ε ∈ FIRST(α), add A -> α to M[A,b] for each terminal b in FOLLOW(A).
3. If ε ∈ FIRST(α) and $ is in FOLLOW(A), add A -> α to M[A,$].
Finally, make each undefined entry in M[ ] an ERROR.
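
The algorithm above can be sketched in a few lines. In this sketch the FIRST and FOLLOW sets of the expression grammar are entered by hand, $ is stored inside the FOLLOW sets so that steps 2 and 3 collapse into one set union, each table cell holds a list so that multiply defined entries stay visible, and a missing key means ERROR. The names build_table and first_of_string are assumptions of the sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for an ε production
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols, first):
    """FIRST of a symbol sequence; an empty sequence yields {ε}."""
    result = set()
    for symbol in symbols:
        symbol_first = first[symbol] if symbol in first else {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    """Map (nonterminal, terminal) to the list of productions placed there."""
    table = {}
    for a, alternatives in grammar.items():
        for alpha in alternatives:
            alpha_first = first_of_string(alpha, first)
            targets = alpha_first - {EPS}          # step 1
            if EPS in alpha_first:
                targets |= follow[a]               # steps 2 and 3 ($ included)
            for terminal in targets:
                table.setdefault((a, terminal), []).append(alpha)
    return table


TABLE = build_table(GRAMMAR, FIRST, FOLLOW)
for (a, terminal), productions in sorted(TABLE.items()):
    print(f"M[{a}, {terminal}] =", [" ".join(p) or EPS for p in productions])
```

A cell whose list has more than one production is a multiply defined entry.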

Example predictive parse table construction
For our favorite grammar:
  E  -> T E'
  E' -> + T E' | ε
  T  -> F T'
  T' -> * F T' | ε
  F  -> ( E ) | id
What is the predictive parsing table?

LL(1) grammars
The predictive parse table construction can be applied to ANY grammar, but sometimes M[ ] ends up with multiply defined entries.
Example: if-else statements after left factoring:
  stmt    -> if ( expr ) stmt optelse
  optelse -> else stmt | ε
When we have “optelse” on the stack and “else” in the input, we have a choice of how to expand optelse: “else” begins the first alternative, and “else” is also in FOLLOW(optelse), so either rule is possible.

LL(1) grammars
If the predictive parsing construction for G leads to a parse table M[ ] WITHOUT multiply defined entries, we say “G is LL(1)”:
  L: left-to-right scan of the input
  L: leftmost derivation
  1: one symbol of lookahead

LL(1) grammars
Necessary and sufficient conditions for G to be LL(1): whenever A -> α | β are two distinct productions,
1. There is no terminal a such that a ∈ FIRST(α) and a ∈ FIRST(β).
2. At most one of α and β derives ε.
3. If β =*> ε, then FIRST(α) does not intersect FOLLOW(A).
This is the same as saying the predictive parser always knows what to do!
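
These conditions can be checked mechanically for each pair of alternatives of a nonterminal. The sketch below takes the FIRST set of every alternative of A together with FOLLOW(A); the function name ll1_violations and the violation labels are assumptions. The sets used for optelse were worked out by hand, with $ in FOLLOW(optelse) on the assumption that stmt is the start symbol.

```python
from itertools import combinations

EPS = "ε"


def ll1_violations(first_sets, follow_a):
    """Check the LL(1) conditions for all alternatives of one nonterminal A.

    first_sets[i] is FIRST of the i-th alternative; follow_a is FOLLOW(A).
    Returns a list of (i, j, label) tuples, empty if A passes.
    """
    problems = []
    for (i, fa), (j, fb) in combinations(enumerate(first_sets), 2):
        if (fa & fb) - {EPS}:
            problems.append((i, j, "first/first"))      # condition 1
        if EPS in fa and EPS in fb:
            problems.append((i, j, "both-epsilon"))     # condition 2
        # Condition 3, checked in both directions.
        if (EPS in fb and fa & follow_a) or (EPS in fa and fb & follow_a):
            problems.append((i, j, "first/follow"))
    return problems


# optelse -> else stmt | ε, with FOLLOW(optelse) = { else, $ }
print(ll1_violations([{"else"}, {EPS}], {"else", "$"}))
# E' -> + T E' | ε, with FOLLOW(E') = { ), $ }
print(ll1_violations([{"+"}, {EPS}], {")", "$"}))
```

The first call reports the dangling-else conflict between alternatives 0 and 1, while the second returns an empty list.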

Model of a nonrecursive predictive parser
[Figure: an input buffer (a + b $), a stack (X Y Z $), the predictive parsing program (driver), and the parsing table M.]

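Moves made by predictive parser on input id + id * id

STACK          INPUT              OUTPUT
$E             id + id * id$
$E' T          id + id * id$      E -> T E'
$E' T' F       id + id * id$      T -> F T'
$E' T' id      id + id * id$      F -> id
$E' T'         + id * id$
$E'            + id * id$         T' -> ε
$E' T +        + id * id$         E' -> + T E'
$E' T          id * id$
$E' T' F       id * id$           T -> F T'
$E' T' id      id * id$           F -> id
$E' T'         * id$
$E' T' F *     * id$              T' -> * F T'
$E' T' F       id$
$E' T' id      id$                F -> id
$E' T'         $
$E'            $                  T' -> ε
$              $                  E' -> ε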

Nonrecursive Predictive Parsing
X is the symbol on top of the stack and a is the current input symbol.
1. If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a nonterminal, the program consults entry M[X, a] of the parsing table M. This entry will be either an X-production of the grammar or an error entry. If, for example, M[X, a] = {X -> UVW}, the parser replaces X on top of the stack by WVU (with U on top). As output, we shall assume that the parser just prints the production used; any other code could be executed here. If M[X, a] = error, the parser calls an error recovery routine.

Parsing table M for grammar

NONTERMINAL   INPUT SYMBOL
              id           +               *               (            )          $
E             E -> T E'                                    E -> T E'
E'                         E' -> + T E'                                 E' -> ε    E' -> ε
T             T -> F T'                                    T -> F T'
T'                         T' -> ε         T' -> * F T'                 T' -> ε    T' -> ε
F             F -> id                                      F -> ( E )

Top-down parsing recap
RECURSIVE DESCENT parsers are easy to build, but inefficient, and might require backtracking.
TRANSITION DIAGRAMS help us build recursive descent parsers.
For LL(1) grammars, it is possible to build PREDICTIVE PARSERS with no recursion automatically:
1. Compute FIRST() and FOLLOW() for all nonterminals.
2. Fill in the predictive parsing table.
3. Use the table-driven predictive parsing algorithm.