Professor Yihjia Tsai Tamkang University

Slides:



Advertisements
Similar presentations
Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011.
Advertisements

Top-Down Parsing.
CS 310 – Fall 2006 Pacific University CS310 Parsing with Context Free Grammars Today’s reference: Compilers: Principles, Techniques, and Tools by: Aho,
Parsing III (Eliminating left recursion, recursive descent parsing)
COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.
Prof. Bodik CS 164 Lecture 61 Building a Parser II CS164 3:30-5:00 TT 10 Evans.
Prof. Fateman CS 164 Lecture 8 1 Top-Down Parsing CS164 Lecture 8.
1 Predictive parsing Recall the main idea of top-down parsing: Start at the root, grow towards leaves Pick a production and try to match input May need.
CS 310 – Fall 2006 Pacific University CS310 Parsing with Context Free Grammars Today’s reference: Compilers: Principles, Techniques, and Tools by: Aho,
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
1 The Parser Its job: –Check and verify syntax based on specified syntax rules –Report errors –Build IR Good news –the process can be automated.
1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.
LR(1) Languages An Introduction Professor Yihjia Tsai Tamkang University.
COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.
Top-Down Parsing.
– 1 – CSCE 531 Spring 2006 Lecture 7 Predictive Parsing Topics Review Top Down Parsing First Follow LL (1) Table construction Readings: 4.4 Homework: Program.
CPSC 388 – Compiler Design and Construction
COP4020 Programming Languages Computing LL(1) parsing table Prof. Xin Yuan.
Parsing Chapter 4 Parsing2 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First.
Top-Down Parsing - recursive descent - predictive parsing
4 4 (c) parsing. Parsing A grammar describes the strings of tokens that are syntactically legal in a PL A recogniser simply accepts or rejects strings.
Chapter 5 Top-Down Parsing.
4 4 (c) parsing. Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces.
# 1 CMPS 450 Parsing CMPS 450 J. Moloney. # 2 CMPS 450 Check that input is well-formed Build a parse tree or similar representation of input Recursive.
Parsing III (Top-down parsing: recursive descent & LL(1) )
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
1 Compiler Construction Syntax Analysis Top-down parsing.
Lesson 5 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
CSI 3120, Syntactic analysis, page 1 Syntactic Analysis and Parsing Based on A. V. Aho, R. Sethi and J. D. Ullman Compilers: Principles, Techniques and.
4 4 (c) parsing. Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces.
6/4/2016IT 3271 The most practical Parsers: Predictive parser: 1.input (token string) 2.Stacks, parsing table 3.output (syntax tree, intermediate codes)
Comp 311 Principles of Programming Languages Lecture 3 Parsing Corky Cartwright August 28, 2009.
Parsing Top-Down.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Lecture 3: Parsing CS 540 George Mason University.
1 Context free grammars  Terminals  Nonterminals  Start symbol  productions E --> E + T E --> E – T E --> T T --> T * F T --> T / F T --> F F --> (F)
1 Nonrecursive Predictive Parsing  It is possible to build a nonrecursive predictive parser  This is done by maintaining an explicit stack.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
Top-Down Parsing.
CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.
Top-Down Predictive Parsing We will look at two different ways to implement a non- backtracking top-down parser called a predictive parser. A predictive.
Parsing methods: –Top-down parsing –Bottom-up parsing –Universal.
COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Chapter 2 (part) + Chapter 4: Syntax Analysis S. M. Farhad 1.
UMBC  CSEE   1 Chapter 4 Chapter 4 (b) parsing.
Parsing III (Top-down parsing: recursive descent & LL(1) )
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Fangfang Cui Mar. 29, Overview 1. LL(k) 1. LL(k) Definition 2. LL(1) 3. Using a Parsing Table 4. First Sets 5. Follow Sets 6. Building a Parsing.
Spring 16 CSCI 4430, A Milanova 1 Announcements HW1 due on Monday February 8 th Name and date your submission Submit electronically in Homework Server.
Parsing COMP 3002 School of Computer Science. 2 The Structure of a Compiler syntactic analyzer code generator program text interm. rep. machine code tokenizer.
Programming Languages Translator
Context free grammars Terminals Nonterminals Start symbol productions
Top-down parsing cannot be performed on left recursive grammars.
Parsing with Context Free Grammars
CS 404 Introduction to Compiler Design
Top-Down Parsing.
4 (c) parsing.
Top-Down Parsing CS 671 January 29, 2008.
CS 540 George Mason University
Compiler Design 7. Top-Down Table-Driven Parsing
Ambiguity, Precedence, Associativity & Top-Down Parsing
Computing Follow(A) : All Non-Terminals
Syntax Analysis - Parsing
Nonrecursive Predictive Parsing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Professor Yihjia Tsai Tamkang University Predictive Parsing Professor Yihjia Tsai Tamkang University

Top Down Parsing Methods Simplest method is a full-backup recursive descent parse Write recursive recognizers (subroutines) for each grammar rule If rules succeeds perform some action (I.e., build a tree node, emit code, etc.) If rule fails, return failure. Caller may try another choice or fail On failure it “backs up” which might have problem if it needs to return a lexical symbol to the input stream

Problems Also remember left recursion problem Need to backtrack , suppose that you could always tell what production applied by looking at one (or more) tokens of lookahead – called predictive parsing Factoring

Summary of Recursive Descent Simple and general parsing strategy usually coupled with simple handcrafted lexer Left-recursion must be eliminated first … but that can be done automatically Unpopular because of backtracking Thought to be too inefficient In practice, backtracking is eliminated by restricting the grammar

Elimination of Immediate Left Recursion A -> A a1 | A a2 | … A am | b1 | b2 | bn ________________________________________ -> b1 A’ | b2 A’ … bn A’ ‘ -> a1 A’ | a2 A’ | … amA’ | e Doesn’t solve S->+ Sa (See ASU Chapter 4 for general algorithm w/o cycle no e)

Predictive Parsers Like recursive-descent but parser can “predict” which production to use By looking at the next few tokens No backtracking Predictive parsers accept LL(k) grammars L means “left-to-right” scan of input L means “leftmost derivation” k means “predict based on k tokens of lookahead” In practice, LL(1) is used

Summary of Recursive Descent Simple and general parsing strategy Left-recursion must be eliminated first … but that can be done automatically Unpopular because of backtracking Thought to be too inefficient In practice, backtracking is eliminated by restricting the grammar

Predictive Parsers Like recursive-descent but parser can “predict” which production to use By looking at the next few tokens No backtracking Predictive parsers accept LL(k) grammars L means “left-to-right” scan of input L means “leftmost derivation” k means “predict based on k tokens of lookahead” In practice, LL(1) is used

LL(1) Languages In recursive-descent, for each non-terminal and input token there may be a choice of production LL(1) means that for each non-terminal and token there is only one production Can be specified via 2D tables One dimension for current non-terminal to expand One dimension for next token A table entry contains one production

Predictive Parsing and Left Factoring Recall the grammar E  T + E | T T  int | int * T | ( E ) Hard to predict because For T two productions start with int For E it is not clear how to predict A grammar must be left-factored before use for predictive parsing

Left-Factoring Example Recall the grammar E  T + E | T T  int | int * T | ( E ) Factor out common prefixes of productions E  T X X  + E |  T  ( E ) | int Y Y  * T | 

LL(1) Parsing Table Example Left-factored grammar E  T X X  + E |  T  ( E ) | int Y Y  * T |  The LL(1) parsing table: int * + ( ) $ E T X X + E  T int Y ( E ) Y * T

LL(1) Parsing Table Example (Cont.) Consider the [E, int] entry “When current non-terminal is E and next input is int, use production E  T X This production can generate an int in the first place Consider the [Y,+] entry “When current non-terminal is Y and current token is +, get rid of Y” Y can be followed by + only in a derivation in which Y  

LL(1) Parsing Tables. Errors Blank entries indicate error situations Consider the [E,*] entry “There is no way to derive a string starting with * from non-terminal E”

Using Parsing Tables Method similar to recursive descent, except For each non-terminal S We look at the next token a And chose the production shown at [S,a] We use a stack to keep track of pending non-terminals We reject when we encounter an error state We accept when we encounter end-of-input

LL(1) Parsing Algorithm initialize stack = <S $> and next repeat case stack of <X, rest> : if T[X,*next] = Y1…Yn then stack  <Y1… Yn rest>; else error (); <t, rest> : if t == *next ++ then stack  <rest>; else error (); until stack == <$ >

LL(1) Parsing Example Stack Input Action E $ int * int $ T X T X $ int * int $ int Y int Y X $ int * int $ terminal Y X $ * int $ * T * T X $ * int $ terminal T X $ int $ int Y int Y X $ int $ terminal Y X $ $  X $ $  $ $ ACCEPT

Constructing Parsing Tables LL(1) languages are those defined by a parsing table for the LL(1) algorithm No table entry can be multiply defined We want to generate parsing tables from CFG

Constructing Parsing Tables (Cont.) If A  , where in the line of A we place  ? In the column of t where t can start a string derived from   * t  We say that t  First() In the column of t if  is  and t can follow an A S *  A t  We say t  Follow(A)

Computing First Sets Definition: First(X) = { t | X * t}  { | X * } Algorithm sketch (see book for details): for all terminals t do First(t)  { t } for each production X   do First(X)  {  } if X  A1 … An  and   First(Ai), 1  i  n do add First() to First(X) for each X  A1 … An s.t.   First(Ai), 1  i  n do add  to First(X) repeat steps 4 & 5 until no First set can be grown

First Sets. Example Recall the grammar First sets E  T X X  + E |  T  ( E ) | int Y Y  * T |  First sets First( ( ) = { ( } First( T ) = {int, ( } First( ) ) = { ) } First( E ) = {int, ( } First( int) = { int } First( X ) = {+,  } First( + ) = { + } First( Y ) = {*,  } First( * ) = { * }

Computing Follow Sets Definition: Follow(X) = { t | S *  X t  } Intuition If S is the start symbol then $  Follow(S) If X  A B then First(B)  Follow(A) and Follow(X)  Follow(B) Also if B *  then Follow(X)  Follow(A)

Computing Follow Sets (Cont.) Algorithm sketch: Follow(S)  { $ } For each production A   X  add First() - {} to Follow(X) For each A   X  where   First() add Follow(A) to Follow(X) repeat step(s) ___ until no Follow set grows

Follow Sets. Example Recall the grammar Follow sets E  T X X  + E |  T  ( E ) | int Y Y  * T |  Follow sets Follow( + ) = { int, ( } Follow( * ) = { int, ( } Follow( ( ) = { int, ( } Follow( E ) = {), $} Follow( X ) = {$, ) } Follow( T ) = {+, ) , $} Follow( ) ) = {+, ) , $} Follow( Y ) = {+, ) , $} Follow( int) = {*, +, ) , $}

Constructing LL(1) Parsing Tables Construct a parsing table T for CFG G For each production A   in G do: For each terminal t  First() do T[A, t] =  If   First(), for each t  Follow(A) do If   First() and $  Follow(A) do T[A, $] = 

Notes on LL(1) Parsing Tables If any entry is multiply defined then G is not LL(1) If G is ambiguous If G is left recursive If G is not left-factored And in other cases as well Most programming language grammars are not LL(1) There are tools that build LL(1) tables

References Compilers Principles, Techniques and Tools, Aho, Sethi, Ullman Chapters 2/3 http://www.cs.columbia.edu/~lerner/CS4115 http://www.cs.wisc.edu/~bodik/cs536/lectures.html