Prof. Fateman, CS 164, Lecture 9: Bottom-Up Parsing

Prof. Fateman CS 164 Lecture 9, Slide 1: Bottom-Up Parsing

Prof. Fateman CS 164 Lecture 9, Slide 2: Outline
Review LL parsing
Shift-reduce parsing
The LR parsing algorithm
Constructing LR parsing tables

Prof. Fateman CS 164 Lecture 9, Slide 3: Top-Down Parsing. Review
Top-down parsing expands a parse tree from the start symbol to the leaves
–Always expand the leftmost non-terminal
(Figure: parse tree under construction for the input int * int + int)

Prof. Fateman CS 164 Lecture 9, Slide 4: Top-Down Parsing. Review
Top-down parsing expands a parse tree from the start symbol to the leaves
–Always expand the leftmost non-terminal
The leaves at any point form a string βAγ
–β contains only terminals
–The input string is βbδ
–The prefix β matches
–The next token is b
(Figure: the partially expanded parse tree over the input int * int + int)

Prof. Fateman CS 164 Lecture 9, Slide 5: Top-Down Parsing. Review
Top-down parsing expands a parse tree from the start symbol to the leaves
–Always expand the leftmost non-terminal
The leaves at any point form a string βAγ
–β contains only terminals
–The input string is βbδ
–The prefix β matches
–The next token is b
(Figure: the parse tree after one more expansion, over the input int * int + int)

Prof. Fateman CS 164 Lecture 9, Slide 6: Top-Down Parsing. Review
Top-down parsing expands a parse tree from the start symbol to the leaves
–Always expand the leftmost non-terminal
The leaves at any point form a string βAγ
–β contains only terminals
–The input string is βbδ
–The prefix β matches
–The next token is b
(Figure: the parse tree after a further expansion, over the input int * int + int)

Prof. Fateman CS 164 Lecture 9, Slide 7: Predictive Parsing. Review
A predictive parser is described by a table
–For each non-terminal A and for each token b we specify a production A → α
–When trying to expand A we use A → α if b follows next
Once we have the table
–The parsing algorithm is simple and fast
–No backtracking is necessary
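
To make the table-driven idea concrete, here is a minimal runnable sketch of a table-driven predictive parser in Python. The toy grammar (E → int T, T → + E | ε), the hand-written table, and the names are my own illustrative assumptions, not the course's parser; the point is only that once the table exists, parsing is a simple loop with no backtracking.

# Sketch of table-driven LL(1) parsing (illustrative assumptions, not the course parser).
# Toy grammar: E -> int T ;  T -> + E | epsilon
LL1_TABLE = {
    ('E', 'int'): ['int', 'T'],
    ('T', '+'):   ['+', 'E'],
    ('T', '$'):   [],                    # T -> epsilon when end-of-input follows
}

def ll1_parse(tokens):
    tokens = tokens + ['$']
    stack = ['$', 'E']                   # start symbol on top of an explicit stack
    i = 0
    while stack:
        top = stack.pop()
        if top == '$' and tokens[i] == '$':
            return True                  # accept
        if top in ('int', '+', '$'):     # terminal: must match the next input token
            if top != tokens[i]:
                return False
            i += 1
        else:                            # non-terminal: consult the table, no backtracking
            rhs = LL1_TABLE.get((top, tokens[i]))
            if rhs is None:
                return False             # empty table entry means a syntax error
            stack.extend(reversed(rhs))
    return False

print(ll1_parse(['int', '+', 'int']))    # True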

Prof. Fateman CS 164 Lecture 9, Slide 8: Notes on LL(1) Parsing Tables
If any entry is multiply defined then G is not LL(1):
–If G is ambiguous
–If G is left recursive
–If G is not left-factored
–And in other cases as well
There are tools that build LL(1) tables.
Most programming language grammars are very nearly, but not quite, LL(1).

Prof. Fateman CS 164 Lecture 9, Slide 9: What next?
LL(1) is good, but we can do better with a more powerful parsing strategy, one that allows us to look at more of the input before deciding what to do. We could try LL(2), but that would require a 3-D table. We have a better way.

Prof. Fateman CS 164 Lecture 9, Slide 10: Bottom-Up Parsing

Prof. Fateman CS 164 Lecture 9, Slide 11: Bottom-Up Parsing
Bottom-up parsing is more general than top-down parsing (it handles a larger class of grammars)
–And just as efficient
–Builds on ideas in top-down parsing
–Preferred method in practice
Also called LR parsing
–L means that tokens are read left to right
–R means that it constructs a rightmost derivation

Prof. Fateman CS 164 Lecture 9, Slide 12: An Introductory Example
LR parsers don’t need left-factored grammars and can also handle left-recursive grammars
Consider the following grammar: E → E + ( E ) | int
–Why is this not LL(1)?
Consider the string: int + ( int ) + ( int )

Prof. Fateman CS 164 Lecture 9, Slide 13: The Idea
LR parsing reduces a string to the start symbol by inverting productions:
str := input string of terminals
repeat
–Identify β in str such that A → β is a production (i.e., str = α β γ)
–Replace β by A in str (i.e., str becomes α A γ)
until str = S
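
As a bridge to the shift-reduce machinery that follows, here is a deliberately naive Python sketch of this reduce loop for the example grammar E → E + ( E ) | int. It simply reduces the leftmost substring that matches a production right-hand side, which happens to pick the correct handle for this particular grammar and reproduces the sentential forms shown on the next slides; a real LR parser replaces this brute-force search with the DFA and table described later. All names here are illustrative assumptions.

# Naive "reduce to the start symbol" loop (sketch only).
# Grammar: E -> E + ( E ) | int
PRODUCTIONS = [(['E', '+', '(', 'E', ')'], 'E'), (['int'], 'E')]

def reduce_to_start(tokens, start='E'):
    s = list(tokens)
    while s != [start]:
        for j in range(len(s)):                       # leftmost matching substring
            rule = next(((rhs, lhs) for rhs, lhs in PRODUCTIONS
                         if s[j:j + len(rhs)] == rhs), None)
            if rule is not None:
                rhs, lhs = rule
                print(' '.join(s))                    # sentential form about to be reduced
                s[j:j + len(rhs)] = [lhs]             # replace beta by A
                break
        else:
            raise SyntaxError('no reduction applies: ' + ' '.join(s))
    return s

reduce_to_start('int + ( int ) + ( int )'.split())
# prints int + ( int ) + ( int ), E + ( int ) + ( int ), E + ( E ) + ( int ),
# E + ( int ), E + ( E ), and returns ['E']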

Prof. Fateman CS 164 Lecture 9, Slide 14: A Bottom-up Parse in Detail (1)
int + (int) + (int)
(Figure: the input tokens int + ( int ) + ( int ) drawn as the leaves of the parse tree to be built)

Prof. Fateman CS 164 Lecture 9, Slide 15: A Bottom-up Parse in Detail (2)
int + (int) + (int)
E + (int) + (int)
(Figure: the first int has been reduced to E in the tree)

Prof. Fateman CS 164 Lecture 9, Slide 16: A Bottom-up Parse in Detail (3)
int + (int) + (int)
E + (int) + (int)
E + (E) + (int)
(Figure: the int inside the first parentheses has also been reduced to E)

Prof. Fateman CS 164 Lecture 9, Slide 17: A Bottom-up Parse in Detail (4)
int + (int) + (int)
E + (int) + (int)
E + (E) + (int)
E + (int)
(Figure: E + (E) has been reduced to E)

Prof. Fateman CS 164 Lecture 9, Slide 18: A Bottom-up Parse in Detail (5)
int + (int) + (int)
E + (int) + (int)
E + (E) + (int)
E + (int)
E + (E)
(Figure: the remaining int has been reduced to E)

Prof. Fateman CS 164 Lecture 9, Slide 19: A Bottom-up Parse in Detail (6)
int + (int) + (int)
E + (int) + (int)
E + (E) + (int)
E + (int)
E + (E)
E
A rightmost derivation in reverse
(Figure: the completed parse tree)

Prof. Fateman CS 164 Lecture 9, Slide 20: Important Fact #1
Important Fact #1 about bottom-up parsing:
An LR parser traces a rightmost derivation in reverse

Prof. Fateman CS 164 Lecture 9, Slide 21: Where Do Reductions Happen
Important Fact #1 has an interesting consequence:
–Let αβω be a step of a bottom-up parse
–Assume the next reduction is by A → β
–Then ω is a string of terminals
Why? Because αAω → αβω is a step in a rightmost derivation

Prof. Fateman CS 164 Lecture 9, Slide 22: Notation
Idea: Split the string into two substrings
–Right substring is as yet unexamined by parsing (a string of terminals)
–Left substring has terminals and non-terminals
The dividing point is marked by a marker, written I in these slides
–The I is not part of the string
Initially, all input is unexamined: I x1 x2 ... xn

Prof. Fateman CS 164 Lecture 9, Slide 23: Shift-Reduce Parsing
Bottom-up parsing uses only two kinds of actions:
Shift
Reduce

Prof. Fateman CS 164 Lecture 9, Slide 24: Shift
Shift: Move I one place to the right
–Shifts a terminal to the left string
E + ( I int )   shifts to   E + (int I )

Prof. Fateman CS 164 Lecture 9, Slide 25: Reduce
Reduce: Apply an inverse production at the right end of the left string
–If E → E + ( E ) is a production, then
E + (E + ( E ) I )   reduces to   E + (E I )
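
A tiny sketch of these two actions on the split-string representation, with the left string and the remaining input kept as Python lists and the marker I understood to sit between them; the representation and function names are illustrative assumptions.

# Shift and reduce on (left string, remaining input); the marker I sits between them.

def shift(left, right):
    # Move the marker one place to the right: the next terminal joins the left string.
    return left + [right[0]], right[1:]

def reduce_step(left, right, rhs, lhs):
    # Apply an inverse production at the right end of the left string.
    assert left[-len(rhs):] == rhs, 'right end of left string is not this rhs'
    return left[:-len(rhs)] + [lhs], right

left, right = ['E', '+', '('], ['int', ')']
left, right = shift(left, right)                     # E + ( int I )
left, right = reduce_step(left, right, ['int'], 'E')
print(' '.join(left), 'I', ' '.join(right))          # prints: E + ( E I )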

Shift-Reduce Example
I int + (int) + (int)$        shift

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)
E I + (int)$                  shift 3 times

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)
E I + (int)$                  shift 3 times
E + (int I )$                 red. E → int

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)
E I + (int)$                  shift 3 times
E + (int I )$                 red. E → int
E + (E I )$                   shift

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)
E I + (int)$                  shift 3 times
E + (int I )$                 red. E → int
E + (E I )$                   shift
E + (E) I $                   red. E → E + (E)

Shift-Reduce Example
I int + (int) + (int)$        shift
int I + (int) + (int)$        red. E → int
E I + (int) + (int)$          shift 3 times
E + (int I ) + (int)$         red. E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            red. E → E + (E)
E I + (int)$                  shift 3 times
E + (int I )$                 red. E → int
E + (E I )$                   shift
E + (E) I $                   red. E → E + (E)
E I $                         accept
(Figure: the completed parse tree above the tokens)

Prof. Fateman CS 164 Lecture 9, Slide 37: The Stack
The left string can be implemented by a stack
–Top of the stack is the I
Shift pushes a terminal on the stack
Reduce pops 0 or more symbols off of the stack (production rhs) and pushes a non-terminal on the stack (production lhs)

Prof. Fateman CS 164 Lecture 9, Slide 38: Key Issue: When to Shift or Reduce?
Idea: use a finite automaton (DFA) to decide when to shift or reduce
–The input is the stack
–The language consists of terminals and non-terminals
We run the DFA on the stack and we examine the resulting state X and the token tok after I
–If X has a transition labeled tok then shift
–If X is labeled with “A → β on tok” then reduce
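
The following sketch makes this decision procedure concrete: a DFA whose alphabet is grammar symbols is run over the current stack, and the resulting state plus the next token chooses shift or reduce. The transition table, reduce labels, and state numbers below are a small hand-made fragment for the running grammar E → E + ( E ) | int; they are my own assumptions chosen to be consistent with these slides, not generated output.

# Run a DFA over the stack, then decide shift vs. reduce (illustrative fragment).
DELTA = {                      # DFA transitions on terminals and non-terminals
    (0, 'int'): 1, (0, 'E'): 2,
    (2, '+'): 3,   (3, '('): 4,
    (4, 'int'): 5, (4, 'E'): 6,
    (6, '+'): 8,   (6, ')'): 7,
}
REDUCE_ON = {                  # state X labeled "A -> beta on tok"
    1: ({'$', '+'}, ('E', ['int'])),
    5: ({')', '+'}, ('E', ['int'])),
    7: ({'$', '+'}, ('E', ['E', '+', '(', 'E', ')'])),
}

def run_dfa_on_stack(stack):
    state = 0
    for sym in stack:          # the DFA's input is the stack contents
        state = DELTA[(state, sym)]
    return state

def decide(stack, tok):
    x = run_dfa_on_stack(stack)
    if (x, tok) in DELTA:                          # X has a transition labeled tok
        return 'shift'
    if x in REDUCE_ON and tok in REDUCE_ON[x][0]:  # X is labeled "A -> beta on tok"
        lhs, rhs = REDUCE_ON[x][1]
        return ('reduce', lhs, rhs)
    return 'error'

print(decide(['E', '+', '('], 'int'))              # shift
print(decide(['E', '+', '(', 'int'], ')'))         # ('reduce', 'E', ['int'])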

LR(1) Parsing. An Example
(Figure: the LR(1) DFA for this grammar, with transitions on int, +, (, ), and E; its states are labeled with actions such as “E → int on $, +”, “accept on $”, “E → int on ), +”, “E → E + (E) on $, +”, and “E → E + (E) on ), +”)
I int + (int) + (int)$        shift
int I + (int) + (int)$        E → int
E I + (int) + (int)$          shift (x3)
E + (int I ) + (int)$         E → int
E + (E I ) + (int)$           shift
E + (E) I + (int)$            E → E+(E)
E I + (int)$                  shift (x3)
E + (int I )$                 E → int
E + (E I )$                   shift
E + (E) I $                   E → E+(E)
E I $                         accept

Prof. Fateman CS 164 Lecture 9, Slide 40: Representing the DFA
Parsers represent the DFA as a 2D table
–Recall table-driven lexical analysis
Lines correspond to DFA states
Columns correspond to terminals and non-terminals
Typically columns are split into:
–Those for terminals: action table
–Those for non-terminals: goto table

Prof. Fateman CS 164 Lecture 9, Slide 41: Representing the DFA. Example
The table for a fragment of our DFA (rows 3–7 only; s = shift, r = reduce by, g = goto):

        int    +              (     )              $              E
  3                           s4
  4     s5                                                        g6
  5            r E → int            r E → int
  6            s8                   s7
  7            r E → E+(E)                         r E → E+(E)

(Figure: the corresponding DFA fragment, with states labeled “E → int on ), +” and “E → E + (E) on $, +” and transitions on (, int, ), and E)

Prof. Fateman CS 164 Lecture 9, Slide 42: The LR Parsing Algorithm
After a shift or reduce action we rerun the DFA on the entire stack
–This is wasteful, since most of the work is repeated
Faster: remember for each stack element to which state it brings the DFA. The extra memory means it is no longer a DFA, but memory is cheap.
The LR parser maintains a stack ⟨sym1, state1⟩ ... ⟨symn, staten⟩, where statek is the final state of the DFA on sym1 ... symk

Prof. Fateman CS 164 Lecture 9, Slide 43: The LR Parsing Algorithm
Let I = w$ be the initial input
Let j = 0
Let DFA state 0 be the start state
Let stack = ⟨dummy, 0⟩
repeat
  case action[top_state(stack), I[j]] of
    shift k: push ⟨I[j++], k⟩
    reduce X → A: pop |A| pairs, push ⟨X, Goto[top_state(stack), X]⟩
    accept: halt normally
    error: halt and report error
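
Below is a minimal runnable Python sketch of this driver loop. The action and goto tables are hand-built for the running grammar E → E + ( E ) | int: states 3–7 agree with the fragment shown on slide 41, while the remaining state numbers and the dictionary encoding are my own illustrative assumptions rather than the output of the course's table constructor.

# LR driver loop (sketch).  Tables hand-built for E -> E + ( E ) | int;
# states 3-7 match the table fragment on slide 41.
ACTION = {
    (0, 'int'): ('s', 1),
    (1, '+'): ('r', 'E', 1),  (1, '$'): ('r', 'E', 1),     # reduce E -> int
    (2, '+'): ('s', 3),       (2, '$'): ('acc',),
    (3, '('): ('s', 4),
    (4, 'int'): ('s', 5),
    (5, '+'): ('r', 'E', 1),  (5, ')'): ('r', 'E', 1),     # reduce E -> int
    (6, '+'): ('s', 8),       (6, ')'): ('s', 7),
    (7, '+'): ('r', 'E', 5),  (7, '$'): ('r', 'E', 5),     # reduce E -> E + ( E )
    (8, '('): ('s', 9),
    (9, 'int'): ('s', 5),
    (10, '+'): ('s', 8),      (10, ')'): ('s', 11),
    (11, '+'): ('r', 'E', 5), (11, ')'): ('r', 'E', 5),    # reduce E -> E + ( E )
}
GOTO = {(0, 'E'): 2, (4, 'E'): 6, (9, 'E'): 10}

def lr_parse(tokens):
    tokens = tokens + ['$']
    stack = [('dummy', 0)]                    # pairs <sym, state>, as on the slide
    j = 0
    while True:
        act = ACTION.get((stack[-1][1], tokens[j]))
        if act is None:
            raise SyntaxError('unexpected token: ' + tokens[j])
        if act[0] == 's':                     # shift k: push <I[j++], k>
            stack.append((tokens[j], act[1]))
            j += 1
        elif act[0] == 'r':                   # reduce X -> A: pop |A| pairs,
            _, lhs, rhs_len = act             # then push <X, Goto[top_state, X]>
            del stack[len(stack) - rhs_len:]
            stack.append((lhs, GOTO[(stack[-1][1], lhs)]))
        else:                                 # accept: halt normally
            return True

print(lr_parse('int + ( int ) + ( int )'.split()))   # True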

Prof. Fateman CS 164 Lecture 9, Slide 44: LR Parsing Notes
Can be used to parse more grammars than LL
Most programming language grammars are LR
Can be described as a simple table
There are tools for building the table
How is the table constructed?