Parsing V: Bottom-up Parsing

Slides:



Advertisements
Similar presentations
Parsing II : Top-down Parsing
Advertisements

Compiler Construction
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Parsing III (Eliminating left recursion, recursive descent parsing)
ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.
Parsing V Introduction to LR(1) Parsers. from Cooper & Torczon2 LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
CS 330 Programming Languages 09 / 23 / 2008 Instructor: Michael Eckmann.
LR(1) Languages An Introduction Professor Yihjia Tsai Tamkang University.
1 CIS 461 Compiler Design & Construction Fall 2012 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #12 Parsing 4.
Parsing II : Top-down Parsing Lecture 7 CS 4318/5331 Apan Qasem Texas State University Spring 2015 *some slides adopted from Cooper and Torczon.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
– 1 – CSCE 531 Spring 2006 Lecture 8 Bottom Up Parsing Topics Overview Bottom-Up Parsing Handles Shift-reduce parsing Operator precedence parsing Readings:
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Syntax and Semantics Structure of programming languages.
10/13/2015IT 3271 Tow kinds of predictive parsers: Bottom-Up: The syntax tree is built up from the leaves Example: LR(1) parser Top-Down The syntax tree.
-Mandakinee Singh (11CS10026).  What is parsing? ◦ Discovering the derivation of a string: If one exists. ◦ Harder than generating strings.  Two major.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
1 Compiler Construction Syntax Analysis Top-down parsing.
CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 1 Chapter 4 Chapter 4 Bottom Up Parsing.
Syntax and Semantics Structure of programming languages.
Top Down Parsing - Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
COP4020 Programming Languages Parsing Prof. Xin Yuan.
Bottom-up Parsing, Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-down Parsing lecture slides from C OMP 412 Rice University Houston, Texas, Fall 2001.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Bottom-Up Parsing David Woolbright. The Parsing Problem Produce a parse tree starting at the leaves The order will be that of a rightmost derivation The.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.
Parsing V LR(1) Parsers. LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited right context (1 token) for handle recognition.
Lecture 5: LR Parsing CS 540 George Mason University.
Syntax and Semantics Structure of programming languages.
Parsing — Part II (Top-down parsing, left-recursion removal)
Chapter 4 - Parsing CSCE 343.
Programming Languages Translator
Bottom-up parsing Goal of parser : build a derivation
Bottom-Up Parsing.
Unit-3 Bottom-Up-Parsing.
Parsing IV Bottom-up Parsing
Table-driven parsing Parsing performed by a finite state machine.
Parsing — Part II (Top-down parsing, left-recursion removal)
4 (c) parsing.
Parsing Techniques.
Subject Name:COMPILER DESIGN Subject Code:10CS63
Top-Down Parsing CS 671 January 29, 2008.
4d Bottom Up Parsing.
BOTTOM UP PARSING Lecture 16.
Lecture 8 Bottom Up Parsing
Lecture 7: Introduction to Parsing (Syntax Analysis)
Bottom Up Parsing.
Compiler Construction
Parsing IV Bottom-up Parsing
LR Parsing. Parser Generators.
Parsing — Part II (Top-down parsing, left-recursion removal)
4d Bottom Up Parsing.
4d Bottom Up Parsing.
4d Bottom Up Parsing.
Compiler Construction
4d Bottom Up Parsing.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
4d Bottom Up Parsing.
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Parsing V: Bottom-up Parsing Lecture 10 CS 4318/5531 Spring 2009 Apan Qasem Texas State University *some slides adopted from Cooper and Torczon

Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production & try to match the input Bad “pick”  may need to backtrack Some grammars are backtrack-free (predictive parsing) Bottom-up parsers (LR(1), operator precedence) Start at the leaves and grow toward root As input is consumed, encode possibilities in an internal state Bottom-up parsers handle a large class of grammars Automatic parser generators such as yacc generate bottom-up parsers

Parsing Definitions : Recap The point of parsing is to construct a derivation A derivation consists of a series of rewrite steps S  0  1  2  …  n–1  n  sentence Each i is a sentential form If  contains only terminal symbols,  is a sentence in L(G) If  contains ≥ 1 non-terminals,  is a sentential form To get i from i–1, expand some NT A  i–1 by using A  Replace the occurrence of A  i–1 with  to get i In a leftmost derivation, it would be the first NT A  i–1

S  0  1  2  …  n–1  n  sentence Bottom-up Parsing A bottom-up parser builds a derivation by working from the input sentence back toward the start symbol S S  0  1  2  …  n–1  n  sentence To reduce i to i–1 match some rhs  with a substring in i then replace  with its corresponding lhs, A.

Bottom-up Parsing : Example Input : abbcde Build the parse tree in reverse Create leaf nodes for terminals in the input stream a b b c d e

Bottom-up Parsing : Example Input : abbcde Build the parse tree in reverse Create leaf nodes for terminals Leaf nodes make up the initial frontier Expand the frontier at every step a b b c d e

Bottom-up Parsing : Example Input : abbcde Scan from left to right Try to match part (or whole) of the frontier with the rhs of a production rule Look for exact matches a b b c d e left to right

Bottom-up Parsing : Example Input : abbcde Applied rule 3 Terminal b is an exact match for the rhs of rule 3 A a b b c d e left to right

Bottom-up Parsing : Example Input : abbcde Applied rule 3 Terminal b is an exact match for the rhs of rule 3 Frontier has changed A a b b c d e left to right

Bottom-up Parsing : Example Input : abbcde Applied rule 3 Terminal b is an exact match for the rhs of rule 3 Frontier has changed A a b b c d e left to right

Bottom-up Parsing : Example Input : abbcde A Abc matches rhs of rule 2 Apply rule 2 to reduce Abc to A Could we have applied rule 3 again to replace the second b in the input stream? How do we decide? How many symbols do we need to look at? A a b b c d e left to right

Bottom-up Parsing : Example Input : abbcde A B Apply rule 4 to reduce d to B A a b b c d e left to right

Bottom-up Parsing : Example Goal Input : abbcde A B aABe matches rhs of rule 1 Goal symbol is the root of the tree Consumed all input Accept string! A a b b c d e

Bottom-up Parsing : Example Goal Input : abbcde A B What kind of derivation did we apply? Reverse the process and look at which non-terminal is expanded first A a b b c d e

Bottom-up Parsing : Example Goal Input : abbcde Reverse the process: Start with Goal symbol at the root

Bottom-up Parsing : Example Goal Input : abbcde A B Apply rule 1 a e

Bottom-up Parsing : Example Goal Input : abbcde A B Apply rule 4 a d e

Bottom-up Parsing : Example Goal Input : abbcde A B Apply rule 2 A a b c d e

Bottom-up Parsing : Example 1 Goal Input : abbcde 3 2 A B What kind of derivation did we apply? Rightmost derivation in reverse 4 A a b b c d e

Bottom-up Parsing Algorithm Repeat Examine the frontier from left to right Find a substring beta in the frontier such that There is a production in the grammar of the form A ->  A ->  occurs as one step in a rightmost derivation of the sentence Replace  with A Make A the parent of each node that make up  Until frontier has only one symbol and that symbol is the Goal symbol

Bottom-up Parsing : Definitions Nodes with no parent in a partial tree form its frontier (also referred to as the upper fringe or upper frontier) Each replacement of  with A shrinks the frontier. We call this step a reduction. The matched substring  is called a handle Formally, A handle of a right-sentential form  is a pair <A,k> where A  P and k is the position in  of ’s rightmost symbol. If <A,k> is a handle, then replacing  at k with A produces the right sentential form from which  is derived in the rightmost derivation S -> … -> 1  2 -> 1 A 2 =  -> … -> sentence Because  is a right-sentential form, the substring to the right of a handle contains only terminal symbols

Bottom-up Parsing : Key Issues How do we choose the correct handle? Finding a match with a RHS of a production is not good enough Need to find a handle that leads to a valid derivation (if there is one) How do we find it quickly? How many symbols do we need to lookahead? When can we detect errors?

Choosing The Right Handle Goal A Input : ab C a b

Choosing The Right Handle Goal A Input : ab C a b

Stack-based Implementation Simpler and faster implementation than a tree-based approach Main idea Process one token at a time Fits the scanner mode Reading all tokens at once does not improve efficiency The action of getting the next token is called a shift Push token onto stack Build frontier incrementally Contents of the stack always represents the current frontier Look for handle at top of stack No need to keep track of k value of the handle k is always the top of the stack

Implementation : Shift-reduce Parser push INVALID token  next_token( ) repeat until (top of stack = Goal and token = EOF) if the top of the stack is a handle A // reduce  to A pop || symbols off the stack push A onto the stack else if (token  EOF) // shift push token else // need to shift, but out of input report an error How do errors show up? failure to find a handle hitting EOF & needing to shift (final else clause) Either generates an error

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x - 2 * y Should we reduce Expr?

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

We have two matches! Which rule do we apply? Back to x - 2 * y We have two matches! Which rule do we apply?

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x – 2 * y Should we reduce by 5 or 7?

Back to x – 2 * y 5 shifts + 9 reduces + 1 accept

Example Goal <id,x> Term Fact. Expr – <id,y> <num,2> *

Shift-reduce Parsing Shift reduce parsers are easily built and easily understood A shift-reduce parser has just four actions Shift — next word is shifted onto the stack Reduce — right end of handle is at top of stack Locate left end of handle within the stack Pop handle off stack & push appropriate lhs Accept — stop parsing & report success Error — call an error reporting/recovery routine Accept and Error are simple Shift is just a push and a call to the scanner Reduce takes |rhs| pops and 1 push If handle-finding requires state, put it in the stack  2x work Handle finding is key handle is on stack finite set of handles use a DFA !

LR(1) Parsers The name LR(1) comes from the following properties Scan input Left to right Produce a Reverse-rightmost derivation Use 1 (one) symbol lookahead A language is said to be LR(1) if it can be parsed using an LR(1) parser The LR(1) class is larger than LL(1)

CFG for List LIST  ( ELEMLIST ) ELEMLIST  ELEM ELEMLIST | ELEM ELEM  INT | LIST

FIRST and FOLLOW FIRST Sets FIRST(ABS) = FIRST(A) = {a} S  ABS FIRST(c) = {c} FIRST() = {} FIRST(a) = {a} FIRST(bSb) = {b} FIRST() = {} S  ABS | c |  A  a B  bSb

FIRST and FOLLOW FOLLOW Sets FOLLOW(S) = {EOF, b} FOLLOW(A) = {a, b, c, EOF} FOLLOW(B) = {a, b, c, EOF} S  ABS | c |  A  a B  bSb

FIRST and FOLLOW S  ABS FIRST+ Sets | c |  A  a B  bSb FIRST+() = {, EOF, b} FIRST+() = {, a, b, c, EOF} FIRST(ABS)  FIRST(c)  FIRST+() = {a}  {c}  {, b, EOF} OK FIRST(bSb)  FIRST+() = {b}  {a, b, c, EOF, } NOT OK

Left Recursion  = yu 1 = v 2 = zu P -> Qu -> Pyu P  Qu | v P  vP’ | zuP’ P’ yuP’ |  P  Qu | v Q  Py | z Foo  Foo  |  Foo   Bar Bar   Bar | 