Parsing IV Bottom-up Parsing

Slides:

Advertisements

Similar presentations

Parsing V: Bottom-up Parsing

Advertisements

A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.

Predictive Parsing l Find derivation for an input string, l Build a abstract syntax tree (AST) –a representation of the parsed program l Build a symbol.

By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.

ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.

Parsing V Introduction to LR(1) Parsers. from Cooper & Torczon2 LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited.

Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)

Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.

1 CIS 461 Compiler Design & Construction Fall 2012 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #12 Parsing 4.

1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.

– 1 – CSCE 531 Spring 2006 Lecture 8 Bottom Up Parsing Topics Overview Bottom-Up Parsing Handles Shift-reduce parsing Operator precedence parsing Readings:

Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.

Parsing IV Bottom-up Parsing. Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves.

Syntax and Semantics Structure of programming languages.

Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.

Syntax and Semantics Structure of programming languages.

Bottom-up Parsing, Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.

Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.

Top-down Parsing lecture slides from C OMP 412 Rice University Houston, Texas, Fall 2001.

Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.

Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.

Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.

CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.

Parsing V LR(1) Parsers. LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited right context (1 token) for handle recognition.

Syntax and Semantics Structure of programming languages.

Introduction to Parsing

WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.

Parsing #1 Leonidas Fegaras.

Parsing — Part II (Top-down parsing, left-recursion removal)

Chapter 4 - Parsing CSCE 343.

Parsing Bottom Up CMPS 450 J. Moloney CMPS 450.

Programming Languages Translator

Bottom-up parsing Goal of parser : build a derivation

CS510 Compiler Lecture 4.

Bottom-Up Parsing.

Lexical and Syntax Analysis

Unit-3 Bottom-Up-Parsing.

UNIT - 3 SYNTAX ANALYSIS - II

Table-driven parsing Parsing performed by a finite state machine.

Parsing — Part II (Top-down parsing, left-recursion removal)

CS 404 Introduction to Compiler Design

Parsing Techniques.

Subject Name:COMPILER DESIGN Subject Code:10CS63

Regular Grammar - Finite Automaton

Lexical and Syntax Analysis

Top-Down Parsing CS 671 January 29, 2008.

Syntax Analysis source program lexical analyzer tokens syntax analyzer

4d Bottom Up Parsing.

BOTTOM UP PARSING Lecture 16.

Lecture 8 Bottom Up Parsing

Compilers Principles, Techniques, & Tools Taught by Jing Zhang

Compiler Design 7. Top-Down Table-Driven Parsing

Lecture 7: Introduction to Parsing (Syntax Analysis)

Bottom Up Parsing.

Compiler Construction

Parsing IV Bottom-up Parsing

LR Parsing. Parser Generators.

Parsing — Part II (Top-down parsing, left-recursion removal)

Bottom-Up Parsing “Shift-Reduce” Parsing

4d Bottom Up Parsing.

4d Bottom Up Parsing.

4d Bottom Up Parsing.

Compiler Construction

4d Bottom Up Parsing.

Parsing Bottom-Up Introduction.

Compilers Principles, Techniques, & Tools Taught by Jing Zhang

4d Bottom Up Parsing.

Parsing CSCI 432 Computer Science Theory

Presentation transcript:

Parsing IV Bottom-up Parsing

Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production & try to match the input Bad “pick”  may need to backtrack Some grammars are backtrack-free (predictive parsing) Bottom-up parsers (LR(1), operator precedence) Start at the leaves and grow toward root As input is consumed, encode possibilities in an internal state Start in a state valid for legal first tokens Bottom-up parsers handle a large class of grammars

Bottom-up Parsing (definitions) The point of parsing is to construct a derivation A derivation consists of a series of rewrite steps S  0  1  2  …  n–1  n  sentence Each i is a sentential form If  contains only terminal symbols,  is a sentence in L(G) If  contains ≥ 1 non-terminals,  is a sentential form To get i from i–1, expand some NT A  i–1 by using A  Replace the occurrence of A  i–1 with  to get i In a leftmost derivation, it would be the first NT A  i–1 A left-sentential form occurs in a leftmost derivation A right-sentential form occurs in a rightmost derivation

Bottom-up Parsing A bottom-up parser builds a derivation by working from the input sentence back toward the start symbol S S  0  1  2  …  n–1  n  sentence To reduce i to i–1 match some rhs  against i then replace  with its corresponding lhs, A. (assuming the production A) In terms of the parse tree, this is working from leaves to root Nodes with no parent in a partial tree form its upper fringe Since each replacement of  with A shrinks the upper fringe, we call it a reduction. The parse tree need not be built, it can be simulated |parse tree nodes| = |words| + |reductions| bottom-up

Finding Reductions Consider the simple grammar And the input string abbcde The trick is scanning the input and finding the next reduction The mechanism for doing this must be efficient

Finding Reductions (Handles) The parser must find a substring  of the tree’s frontier that matches some production A   that occurs as one step in the rightmost derivation (   A is in RRD) Informally, we call this substring  a handle Formally, A handle of a right-sentential form  is a pair <A,k> where A  P and k is the position in  of ’s rightmost symbol. If <A,k> is a handle, then replacing  at k with A produces the right sentential form from which  is derived in the rightmost derivation. Because  is a right-sentential form, the substring to the right of a handle contains only terminal symbols  the parser doesn’t need to scan past the handle (very far)

Finding Reductions (Handles) Critical Insight If G is unambiguous, then every right-sentential form has a unique handle. If we can find those handles, we can build a derivation ! Sketch of Proof: G is unambiguous  rightmost derivation is unique  a unique production A   applied to derive i from i–1  a unique position k at which A is applied  a unique handle <A,k> This all follows from the definitions

Example The expression grammar Handles for rightmost derivation of x – 2 * y This is the inverse of Figure 3.9 in EaC

Handle-pruning, Bottom-up Parsers The process of discovering a handle & reducing it to the appropriate left-hand side is called handle pruning Handle pruning forms the basis for a bottom-up parsing method To construct a rightmost derivation S  0  1  2  …  n–1  n  w Apply the following simple algorithm for i  n to 1 by –1 Find the handle <Ai i , ki > in i Replace i with Ai to generate i–1 This takes 2n steps

Handle-pruning, Bottom-up Parsers One implementation technique is the shift-reduce parser push INVALID token  next_token( ) repeat until (top of stack = Goal and token = EOF) if the top of the stack is a handle A then // reduce  to A pop || symbols off the stack push A onto the stack else if (token  EOF) then // shift push token else // need to shift, but out of input report an error How do errors show up? failure to find a handle hitting EOF & needing to shift (final else clause) Either generates an error Figure 3.7 in EAC

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x - 2 * y 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Back to x – 2 * y 5 shifts + 9 reduces + 1 accept 1. Shift until the top of the stack is the right end of a handle 2. Find the left end of the handle & reduce

Example Goal Expr Expr – Term Term Term * Fact. <id,y> Fact. <id,x> <num,2>

Shift-reduce Parsing Shift reduce parsers are easily built and easily understood A shift-reduce parser has just four actions Shift — next word is shifted onto the stack Reduce — right end of handle is at top of stack Locate left end of handle within the stack Pop handle off stack & push appropriate lhs Accept — stop parsing & report success Error — call an error reporting/recovery routine Accept & Error are simple Shift is just a push and a call to the scanner Reduce takes |rhs| pops & 1 push If handle-finding requires state, put it in the stack  2x work Handle finding is key handle is on stack finite set of handles use a DFA !

An Important Lesson about Handles To be a handle, a substring of a sentential form  must have two properties: It must match the right hand side  of some rule A   There must be some rightmost derivation from the goal symbol that produces the sentential form  with A   as the last production applied Simply looking for right hand sides that match strings is not good enough Critical Question: How can we know when we have found a handle without generating lots of different derivations? Answer: we use look ahead in the grammar along with tables produced as the result of analyzing the grammar. LR(1) parsers build a DFA that runs over the stack & finds them