Parsing IV Bottom-up Parsing

Slides:



Advertisements
Similar presentations
Parsing V: Bottom-up Parsing
Advertisements

A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Predictive Parsing l Find derivation for an input string, l Build a abstract syntax tree (AST) –a representation of the parsed program l Build a symbol.
Parsing III (Eliminating left recursion, recursive descent parsing)
Parsing V Introduction to LR(1) Parsers. from Cooper & Torczon2 LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
1 CIS 461 Compiler Design & Construction Fall 2012 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #12 Parsing 4.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
– 1 – CSCE 531 Spring 2006 Lecture 8 Bottom Up Parsing Topics Overview Bottom-Up Parsing Handles Shift-reduce parsing Operator precedence parsing Readings:
Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Parsing IV Bottom-up Parsing. Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves.
-Mandakinee Singh (11CS10026).  What is parsing? ◦ Discovering the derivation of a string: If one exists. ◦ Harder than generating strings.  Two major.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Top Down Parsing - Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
COP4020 Programming Languages Parsing Prof. Xin Yuan.
Bottom-up Parsing, Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-down Parsing lecture slides from C OMP 412 Rice University Houston, Texas, Fall 2001.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.
Parsing V LR(1) Parsers. LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited right context (1 token) for handle recognition.
Syntax and Semantics Structure of programming languages.
Introduction to Parsing
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
CSE 3302 Programming Languages
Parsing #1 Leonidas Fegaras.
Parsing — Part II (Top-down parsing, left-recursion removal)
Chapter 4 - Parsing CSCE 343.
Parsing Bottom Up CMPS 450 J. Moloney CMPS 450.
Programming Languages Translator
Bottom-up parsing Goal of parser : build a derivation
CS510 Compiler Lecture 4.
Bottom-Up Parsing.
Lexical and Syntax Analysis
Unit-3 Bottom-Up-Parsing.
UNIT - 3 SYNTAX ANALYSIS - II
Parsing IV Bottom-up Parsing
Table-driven parsing Parsing performed by a finite state machine.
Parsing — Part II (Top-down parsing, left-recursion removal)
4 (c) parsing.
Parsing Techniques.
Subject Name:COMPILER DESIGN Subject Code:10CS63
Regular Grammar - Finite Automaton
Lexical and Syntax Analysis
Top-Down Parsing CS 671 January 29, 2008.
4d Bottom Up Parsing.
Lecture 9: Bottom-Up Parsing
BOTTOM UP PARSING Lecture 16.
Lecture 8 Bottom Up Parsing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compiler Design 7. Top-Down Table-Driven Parsing
Lecture 7: Introduction to Parsing (Syntax Analysis)
Bottom Up Parsing.
Compiler Construction
Introduction to Parsing
Introduction to Parsing
LR Parsing. Parser Generators.
Parsing — Part II (Top-down parsing, left-recursion removal)
Bottom-Up Parsing “Shift-Reduce” Parsing
4d Bottom Up Parsing.
4d Bottom Up Parsing.
4d Bottom Up Parsing.
4d Bottom Up Parsing.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
4d Bottom Up Parsing.
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Parsing IV Bottom-up Parsing

from Cooper and Torczon Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production & try to match the input Bad “pick”  may need to backtrack Some grammars are backtrack-free (predictive parsing) Bottom-up parsers (LR(1), operator precedence) Start at the leaves and grow toward root As input is consumed, encode possibilities in an internal state Start in a state valid for legal first tokens Bottom-up parsers handle a large class of grammars from Cooper and Torczon

Bottom-up Parsing (definitions) The point of parsing is to construct a derivation A derivation consists of a series of rewrite steps S  0  1  2  …  n-1  n  sentence Each i is a sentential form If  contains only terminal symbols,  is a sentence in L(G) If  contains ≥ 1 non-terminals,  is just a sentential form To get i from i-1, expand some N  i-1 by using N Replace the occurrence of N  i-1 with  to get i In a leftmost derivation, it would be the first N  i-1 A left-sentential form occurs in a leftmost derivation A right-sentential form occurs in a rightmost derivation from Cooper and Torczon

Bottom-up Parsing A bottom-up parser builds a derivation by working from the input sentence back toward the start symbol S S  0  1  2  …  n-1  n  sentence To derive i-1 from i, it matches i =    against i, then replaces  with its corresponding lhs, N, to yield i-1 =  N  . (assuming N) In terms of the parse tree, this is working from leaves to root Nodes with no parent in a partial tree form its upper fringe Since each replacement of  in    with N shrinks the upper fringe, we call it a reduction. The parse tree need not be built, it can be simulated |parse tree nodes| = |words| + |reductions| from Cooper and Torczon

from Cooper and Torczon Finding Reductions Consider the simple grammar And the input string abbcde The trick is scanning the input and finding the next reduction The mechanism for doing this must be efficient from Cooper and Torczon

Finding Reductions (Handles) The parser must find a substring    of the tree’s frontier that matches some production N  that occurs as one step In the rightmost derivation ( N  is in RRD) Informally, we call this substring  a handle Formally, A handle of a right-sentential form  is a pair <N ,k> where N   P and k is the position in  of ’s rightmost symbol. If < N ,k> is a handle, then replacing  at k with N produces the right sentential form preceding  in the rightmost derivation. Because  is a right-sentential form, the substring to the right of a handle contains only terminal symbols (convoluted, but true)  the parser doesn’t need to scan past the handle (very far) from Cooper and Torczon

Finding Reductions (Handles) Critical Insight (Theorem?) If G is unambiguous, then every right-sentential form has a unique handle. If we can find those handles, we can build a derivation ! Sketch of Proof: G is unambiguous  rightmost derivation is unique  a unique production N  applied to take i-1 to i  a unique position k at which N  is applied  a unique handle <N  ,k> This all follows from the definitions from Cooper and Torczon

Example (a very busy slide) The expression grammar Handles for rightmost derivation of x - 2 * y See Figure 3.5 in EAC from Cooper and Torczon

Handle-pruning, Bottom-up Parsers The process of discovering a handle & reducing it to the appropriate left-hand side is called handle pruning Handle pruning forms the basis for a bottom-up parsing method To construct a rightmost derivation S  0  1  2  …  n-1  n  w Apply the following simple algorithm for i  n to 1 by -1 Find the handle <Ni i,ki > in i Replace i with Ni to generate i-1 This takes 2n steps from Cooper and Torczon

Handle-pruning, Bottom-up Parsers One implementation technique is the shift-reduce parser push INVALID token  next_token( ) repeat until (top of stack = Goal and token = EOF) if the top of the stack is a handle N then /* reduce  to N */ pop || symbols off the stack push N onto the stack else if (token  EOF) then /* shift */ push token How do errors show up? failure to find a handle hitting EOF & needing to shift (final else clause) Either generates an error Figure 3.7 in EAC from Cooper and Torczon

from Cooper and Torczon Back to x - 2 * y 5 shifts + 9 reduces + 1 accept 1. Shift until TOS is the right end of a handle 2. Find the left end of the handle & reduce from Cooper and Torczon

from Cooper and Torczon Example Goal Expr Expr - Term Term Term * Fact. <id,y> Fact. Fact. <id,x> <num,2> from Cooper and Torczon

from Cooper and Torczon Shift-reduce Parsing Shift reduce parsers are easily built and easily understood A shift-reduce parser has just four actions Shift — next word is shifted onto the stack Reduce — right end of handle is at top of stack Locate left end of handle within the stack Pop handle off stack & push appropriate lhs Accept — stop parsing & report success Error — call an error reporting/recovery routine Accept & Error are simple Shift is just a push and a call to the scanner Reduce takes |rhs| pops & 1 push If handle-finding requires state, put it in the stack  2x work Handle finding is key handle is on stack finite set of handles use a DFA ! from Cooper and Torczon