

# 1 CMPS 450: Parsing (J. Moloney)

# 2 Parsing

Parsing =
- Check that input is well-formed.
- Build a parse tree or similar representation of the input.

Recursive Descent Parsing

A collection of recursive routines, each corresponding to one of the nonterminals of a grammar. Procedure A, corresponding to nonterminal A, scans the input and recognizes a string derived from A. If recognized, the appropriate semantic actions are taken, e.g., a node in a parse tree corresponding to A is created. In its most general form, recursive descent involves backtracking on the input.
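A minimal sketch of recursive descent with backtracking, in Python. The grammar S → a S b | a b is a hypothetical example chosen for illustration (it is not from the slides), and the names `parse_S` and `parse` are likewise assumptions. The procedure for S yields every way a string derived from S can end, so a caller can fall back to another alternative when one fails.

```python
from typing import Iterator, Optional, Tuple

# Hypothetical grammar (not from the slides):  S -> a S b | a b
# A tree node is a pair (nonterminal, list of children).

def parse_S(text: str, pos: int) -> Iterator[Tuple[int, tuple]]:
    """Procedure for nonterminal S.

    Yields (end position, parse tree) for every prefix of text[pos:]
    that can be derived from S. Yielding all candidates is what lets
    the caller backtrack.
    """
    # Alternative 1: S -> a S b
    if text.startswith("a", pos):
        for mid, subtree in parse_S(text, pos + 1):
            if text.startswith("b", mid):
                # Semantic action: create the tree node for this S.
                yield mid + 1, ("S", ["a", subtree, "b"])
    # Alternative 2: S -> a b
    if text.startswith("ab", pos):
        yield pos + 2, ("S", ["a", "b"])


def parse(text: str) -> Optional[tuple]:
    """Return a parse tree if the whole input derives from S, else None."""
    for end, tree in parse_S(text, 0):
        if end == len(text):
            return tree
    return None


if __name__ == "__main__":
    print(parse("aabb"))  # ('S', ['a', ('S', ['a', 'b']), 'b'])
    print(parse("aab"))   # None
```

On input `aabb` the first alternative is tried at position 1, fails, and the routine falls back to the second alternative; that retry is the backtracking the slide refers to.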

# 3 LL-Parsing

Specialization of recursive descent that requires no backtracking. Called LL because we discover a leftmost derivation while scanning the input from left to right.

A leftmost derivation is one where, at each step, the leftmost nonterminal is replaced.

Example: A ⇒ BC ⇒ DEC ⇒ aEC is leftmost; A ⇒ BC ⇒ BFG is not.

A left-sentential form (LSF) is any string of terminals and nonterminals derived by some leftmost derivation.

# 4 LL Parser Structure

Input, a string of terminals, is scanned left-to-right. A stack holds the tail of the current left-sentential form (LSF), which is the portion of the LSF from the leftmost nonterminal to the right. The head of the LSF, or prefix of terminals up to the first nonterminal, will always match the portion of the input that has been scanned so far.

Parser Actions
1. Match the top-of-stack, which must be a terminal, with the current input symbol. Pop the stack, and move the input pointer right.
2. Replace the top-of-stack, which must be a nonterminal A, by the right side of some production for A.

The magic of LL parsing: knowing which production to use in (2).

# 5 Example

Consider the following grammar for lists.
- E = “element” (atom or bracketed list), the start symbol.
- L = “list,” one or more elements separated by commas.
- T = “tail,” empty or comma-list.

E → [ L ] | a
L → E T
T → , L | ε

# 6 Beginning of an LL Parse

The beginning of an LL parse for the string [a,[a,a]] is:

| LSF | Stack | Remaining Input |
|---|---|---|
| E | E | [a,[a,a]] |
| [L] | [L] | [a,[a,a]] |
| [L] | L] | a,[a,a]] |
| [ET] | ET] | a,[a,a]] |
| [aT] | aT] | a,[a,a]] |
| [aT] | T] | ,[a,a]] |
| [a,L] | ,L] | ,[a,a]] |

Build the parse tree by expanding nodes corresponding to the nonterminals exactly as the nonterminals themselves are expanded.
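The two parser actions (match and replace) and the trace above can be sketched as a small stack-driven loop. This is an illustrative sketch, not the course's implementation: the names `TABLE`, `NONTERMINALS`, and `ll_parse` are assumptions, and the table entries are written out by hand to agree with the LL parsing table given for this grammar on slide 9.

```python
# Parsing table for the list grammar, keyed by (nonterminal, lookahead).
# Assumption: entries are hand-copied; a missing key means "syntax error".
TABLE = {
    ("E", "a"): ["a"],
    ("E", "["): ["[", "L", "]"],
    ("L", "a"): ["E", "T"],
    ("L", "["): ["E", "T"],
    ("T", ","): [",", "L"],
    ("T", "]"): [],            # T -> epsilon
}
NONTERMINALS = {"E", "L", "T"}


def ll_parse(text, start="E"):
    """Parse `text`; return the (stack, remaining input) snapshots taken
    before each action. Raises SyntaxError on ill-formed input."""
    tokens = [ch for ch in text if not ch.isspace()] + ["$"]
    stack = [start]            # top of stack is the last list element
    pos = 0
    trace = []
    while stack:
        trace.append(("".join(reversed(stack)), "".join(tokens[pos:-1])))
        top, lookahead = stack.pop(), tokens[pos]
        if top in NONTERMINALS:
            # Action 2: replace the nonterminal by a right side.
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no expansion for {top} on {lookahead!r}")
            stack.extend(reversed(rhs))
        elif top == lookahead:
            # Action 1: match a terminal and advance the input pointer.
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, saw {lookahead!r}")
    if tokens[pos] != "$":
        raise SyntaxError("input continues after a complete element")
    return trace


if __name__ == "__main__":
    for stack, remaining in ll_parse("[a,[a,a]]")[:7]:
        print(f"{stack:<6}{remaining}")
```

The first seven snapshots reproduce the Stack and Remaining Input columns of the table above.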

# 7 FIRST and FOLLOW

To define “LL grammar” precisely, we need two definitions.

FIRST

If α is any string, FIRST(α) is the set of terminals b that can be the first symbol of some derivation α$ ⇒* bβ. Note that if α ⇒* ε, then b could be $, which is an “honorary” terminal.

FOLLOW

FOLLOW(A) is the set of terminals that can immediately follow nonterminal A in any derivation (not necessarily leftmost) from S$, where S is the start symbol.

Algorithms for computing FIRST and FOLLOW are found in the “Dragon book.”
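One way to compute these sets is a fixed-point iteration, sketched below for the list grammar. This is a common textbook formulation, not necessarily the exact algorithm in the Dragon book. It departs from the slide's definition in one respect, as an assumption of this sketch: instead of letting $ appear in FIRST(α) when α ⇒* ε, it tracks the nonterminals that can derive ε in a separate `nullable` set. All function and variable names are illustrative.

```python
# List grammar from the slides; [] stands for the empty right side (epsilon).
GRAMMAR = {
    "E": [["[", "L", "]"], ["a"]],
    "L": [["E", "T"]],
    "T": [[",", "L"], []],
}


def first_of_string(symbols, first, nullable):
    """Return (terminals that can begin `symbols`, can `symbols` derive epsilon?)."""
    result = set()
    for sym in symbols:
        if sym not in first:           # a terminal starts the string
            result.add(sym)
            return result, False
        result |= first[sym]
        if sym not in nullable:
            return result, False
    return result, True


def compute_first_follow(grammar, start):
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    nullable = set()
    follow[start].add("$")             # S$ : the end marker follows the start symbol
    changed = True
    while changed:                     # repeat until no set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                body_first, body_nullable = first_of_string(body, first, nullable)
                if not body_first <= first[head]:
                    first[head] |= body_first
                    changed = True
                if body_nullable and head not in nullable:
                    nullable.add(head)
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue       # FOLLOW is defined for nonterminals only
                    tail_first, tail_nullable = first_of_string(
                        body[i + 1:], first, nullable)
                    new = tail_first | (follow[head] if tail_nullable else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow, nullable


if __name__ == "__main__":
    first, follow, nullable = compute_first_follow(GRAMMAR, "E")
    print(first, follow, nullable)
```

For the list grammar this gives FOLLOW(T) = { ] }, which is the fact the parsing-table example relies on.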

# 8 Definition of LL Grammar

We must decide how to expand nonterminal A at the top of the stack, knowing only the next symbol, b (the “lookahead symbol”). Production A → α is a choice if
1. b is in FIRST(α), or
2. α ⇒* ε, and b is in FOLLOW(A).

Otherwise, α, followed by whatever is below A on the stack, could not possibly derive a string beginning with b.

A grammar is LL(1) [or just “LL”] if and only if, for each A and b, there is at most one choice according to (1) and (2) above.

A table giving the choice for each nonterminal A (row) and lookahead symbol b (column) is an LL parsing table.
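Rules (1) and (2) translate almost directly into a table builder. The sketch below assumes that FIRST of each right side, whether the right side can derive ε, and the FOLLOW sets have already been worked out; the values shown for the list grammar are hand-computed assumptions, and the names `PRODUCTIONS`, `FOLLOW`, `build_ll1_table`, and `is_ll1` are illustrative.

```python
# (head, right side, FIRST of right side, can the right side derive epsilon?)
# Hand-computed for the list grammar; "" stands for the empty right side.
PRODUCTIONS = [
    ("E", "[L]", {"["}, False),
    ("E", "a", {"a"}, False),
    ("L", "ET", {"[", "a"}, False),
    ("T", ",L", {","}, False),
    ("T", "", set(), True),
]
FOLLOW = {"E": {"$", ",", "]"}, "L": {"]"}, "T": {"]"}}


def build_ll1_table(productions, follow):
    """Map (nonterminal, lookahead) to the list of right sides that are choices."""
    table = {}
    for head, rhs, first, derives_empty in productions:
        lookaheads = set(first)              # rule (1): b in FIRST(alpha)
        if derives_empty:
            lookaheads |= follow[head]       # rule (2): b in FOLLOW(A)
        for b in lookaheads:
            table.setdefault((head, b), []).append(rhs)
    return table


def is_ll1(table):
    """LL(1) means at most one choice per (nonterminal, lookahead) pair."""
    return all(len(choices) <= 1 for choices in table.values())


if __name__ == "__main__":
    table = build_ll1_table(PRODUCTIONS, FOLLOW)
    print(is_ll1(table))   # True
```

Pairs absent from the resulting dictionary are the error entries of the parsing table.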

# 9 Example

The list grammar,

E → [ L ] | a
L → E T
T → , L | ε

is LL. Here is the parsing table.

| | a | [ | ] | , | $ |
|---|---|---|---|---|---|
| E | E → a | E → [L] | | | |
| L | L → ET | L → ET | | | |
| T | | | T → ε | T → ,L | |

For example, on lookahead “,” we can only expand T by ,L. That production is a choice because “,” is in FIRST(,L), while T → ε is not a choice because “,” is not in FOLLOW(T). On the other hand, “]” is in FOLLOW(T), so T → ε is a choice on this lookahead.

Note that on lookaheads a, $, and “[” there is no choice of expansion for T. Such a situation represents an error: the input cannot be derived from the start symbol.

# 10 Example of non-LL Construct

Consider the following grammar fragment, which represents the “dangling else.”
1. <stat> → if <cond> then <stat> <else-clause>
2. <else-clause> → else <stat>
3. <else-clause> → ε

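# 11 How do we expand <else-clause> on lookahead else?

Surely, else is in FIRST(else <stat>), so production (2) is a choice. However, (3) is also a choice, because else is in FOLLOW(<else-clause>), as seen by a parse tree of the following form.

if <cond> then ( if <cond> then <stat> <else-clause> ) else <stat>

Here the parenthesized part is the inner <stat>, and the final else <stat> comes from the outer <else-clause>, so else immediately follows the inner <else-clause>.

Normally, we pick (2), even though there is a technical LL violation. This choice corresponds to the idea “an else matches the previous unmatched then.”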
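The effect of always picking production (2) can be seen in a small predictive parser. This is a sketch under stated assumptions: the routine names `parse_stat` and `parse_else_clause` are hypothetical names for the two nonterminals of the fragment, the token `c` stands in for a whole condition, and the base statement `s` is an extra alternative added only so the example has something to parse.

```python
def parse_stat(tokens, pos=0):
    """stat -> if c then stat else_clause | s
    (`c` and the `s` alternative are assumptions added for this sketch).
    Returns (tree, next position)."""
    if pos < len(tokens) and tokens[pos] == "if":
        if tokens[pos + 1:pos + 3] != ["c", "then"]:
            raise SyntaxError("expected 'c then' after 'if'")
        then_part, pos = parse_stat(tokens, pos + 3)
        else_part, pos = parse_else_clause(tokens, pos)
        return ("if", then_part, else_part), pos
    if pos < len(tokens) and tokens[pos] == "s":
        return "s", pos + 1
    raise SyntaxError(f"unexpected input at token {pos}")


def parse_else_clause(tokens, pos):
    """else_clause -> else stat | epsilon.
    On lookahead `else` both productions are choices; we always take
    the first, so an else attaches to the nearest unmatched then."""
    if pos < len(tokens) and tokens[pos] == "else":
        return parse_stat(tokens, pos + 1)
    return None, pos                      # the epsilon production


if __name__ == "__main__":
    tokens = "if c then if c then s else s".split()
    tree, end = parse_stat(tokens)
    print(tree)   # ('if', ('if', 's', 's'), None)
```

In the printed tree the else branch belongs to the inner `if`, and the outer `if` has no else branch.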

# 12 Designing LL Grammars

There are several tricks that help us put grammars into an LL form.
1. Eliminating left-recursion.
2. Left-factoring.

Eliminating Left-Recursion

A grammar with left-recursion, a nontrivial derivation of the form A ⇒* Aα, cannot possibly be LL.

# 13 Example

Consider A → AB | c

A typical derivation from A looks like: A ⇒ AB ⇒ ABB ⇒ ... ⇒ cB...B

Seeing only lookahead c, we have no clue as to how many times we should apply A → AB before applying A → c.

# 14

The “Dragon book” gives a general algorithm for eliminating left-recursion. The following simple rule applies in most cases. Replace the productions

A → Aα | β

by

A → βA’
A’ → αA’ | ε

The idea is that every leftmost derivation from A eventually uses A → β, so in the new productions, β is generated first, followed by zero or more occurrences of α. The new nonterminal A’ takes care of the α’s.
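The simple rule above can be sketched as a function over one nonterminal's productions. This handles only immediate left-recursion, not the general algorithm the slide attributes to the Dragon book. The function name and the list-of-symbols representation are assumptions of this sketch; [] stands for ε.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite  A -> A alpha_1 | ... | beta_1 | ...
    as          A  -> beta_i A'
                A' -> alpha_j A' | epsilon.

    `bodies` is a list of right sides, each a list of symbols.
    Assumes there is no production A -> A (an empty alpha).
    """
    new_head = head + "'"
    # Right sides that begin with A contribute their remainder, alpha.
    alphas = [body[1:] for body in bodies if body[:1] == [head]]
    # All other right sides are the betas.
    betas = [body for body in bodies if body[:1] != [head]]
    if not alphas:
        return {head: bodies}            # nothing to do
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }


if __name__ == "__main__":
    # L -> L , E | E
    print(eliminate_left_recursion("L", [["L", ",", "E"], ["E"]]))
    # {'L': [['E', "L'"]], "L'": [[',', 'E', "L'"], []]}
```

With A → AB | c as input, the same call yields A → cA’ and A’ → BA’ | ε.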

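# 15

The implied transformation on parse trees is suggested by the following diagram; β followed by the same sequence of α’s is derived in each.

[Diagram. Before: a left-leaning tree in which A expands to Aα repeatedly and finally to β. After: a right-leaning tree in which A expands to βA’, and A’ expands to αA’ repeatedly and finally to ε.]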

# 16 Example

A more natural grammar for lists is

E → [L] | a
L → L, E | E

We may replace the left-recursion for L by

E → [L] | a
L → EL’
L’ → , EL’ | ε

Note that L’ plays the role of T in our earlier example. However, instead of L’ → , L, the L has been expanded by its lone production.

# 17 Left-Factoring

Sometimes, two productions for a nonterminal have right sides with a common prefix. For example, the following is also a natural grammar for lists.

E → [L] | a
L → E, L | E

If L is at the top of stack, and we see a lookahead symbol, like a, that is in FIRST(E), then we cannot tell whether to expand L by E, L or just E. The trick is to “factor” the E out, and defer the decision until we have examined the portion of the input E derives. The new grammar is:

E → [L] | a
L → EL’
L’ → , L | ε

This grammar is exactly our original, with L’ in place of T.

General Rule

Replace A → αβ and A → αγ by A → αA’ and A’ → β | γ
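The general rule can be sketched as follows. This is a simplified version that factors only the prefix shared by all right sides of one nonterminal; a fuller treatment would group right sides by prefix and repeat. The names `shared_prefix` and `left_factor` are assumptions of this sketch, and [] stands for ε.

```python
def shared_prefix(bodies):
    """Longest sequence of symbols that begins every right side."""
    prefix = []
    for column in zip(*bodies):        # stops at the shortest right side
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(head, bodies):
    """Replace  A -> alpha beta | alpha gamma
    by          A -> alpha A',  A' -> beta | gamma.

    Simplification: alpha is the prefix common to ALL right sides.
    """
    prefix = shared_prefix(bodies)
    if len(bodies) < 2 or not prefix:
        return {head: bodies}          # nothing to factor
    new_head = head + "'"
    return {
        head: [prefix + [new_head]],
        new_head: [body[len(prefix):] for body in bodies],
    }


if __name__ == "__main__":
    # L -> E , L | E
    print(left_factor("L", [["E", ",", "L"], ["E"]]))
    # {'L': [['E', "L'"]], "L'": [[',', 'L'], []]}
```

The output for L matches the factored list grammar above; the productions for E share no prefix and are returned unchanged.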