Lecture 18 Bottom-Up Parsing or Shift-Reduce Parsing

Presentation transcript:

CS 2130 Lecture 18: Bottom-Up Parsing or Shift-Reduce Parsing

Bottom-Up Parsing
Assume we have a group of tokens. Bottom-up parsing tries to group tokens into things it can reduce, much as our precedence rules make 1 + 2 * 3 understood to be (1 + (2 * 3)). It is the same idea, but bottom-up parsing uses special operators: the Wirth-Weber operators.

Wirth-Weber Operators
x < y: y has higher precedence than x
x = y: x and y have equal precedence
x > y: x has higher precedence than y

Operation
Given: a b c d e f g h
Keep in mind that the letters represent tokens; they can be operators, operands, special symbols, etc.

Operation
Given: < a b c d e f g h >
The enclosing < and > are assumed.

Operation
Given: < a = b c d e f g h >
The parser moves through the tokens, comparing precedence.

Operation
Given: < a = b > c d e f g h >
A series starting with < and ending with > is called a handle. When a handle is found, the parser looks for a match with the right-hand side of some production.

Operation
Assume the following productions are in effect for the rest of the walkthrough:
r ::= a b
s ::= d e f
t ::= r c s

The handle < a = b > matches r ::= a b, so it is reduced to r, and we continue:
< r c d e f g h >
< r = c d e f g h >
< r = c < d e f g h >
< r = c < d = e f g h >
< r = c < d = e = f g h >
< r = c < d = e = f > g h >    (handle d e f; reduce by s ::= d e f)
< r = c s g h >
< r = c = s g h >
< r = c = s > g h >    (handle r c s; reduce by t ::= r c s)
< t g h >
< t = g h >
< t = g = h >
If there is a rule for "t g h", the parse is successful. If not, it is a syntax error.

What kind of algorithm?
Stack based. Known as the semantic stack or shift/reduce algorithm.
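The walkthrough above can be sketched as a small stack-based loop. This is only an illustrative sketch, not the lecture's own code: the relation table `REL` is read off the walkthrough by hand, the rule name `u` for "t g h" is a hypothetical addition (the slides only say "a rule for t g h"), and `#` is used as the end marker. After a reduction, the new nonterminal is put back at the front of the input so that it is compared against the stack like any other symbol.

```python
from collections import deque

END = "#"  # end-of-input marker

# Precedence relations read off the walkthrough (an assumption, not a derived table).
REL = {
    ("a", "b"): "=", ("b", "c"): ">", ("r", "c"): "=", ("c", "d"): "<",
    ("d", "e"): "=", ("e", "f"): "=", ("f", "g"): ">", ("c", "s"): "=",
    ("s", "g"): ">", ("t", "g"): "=", ("g", "h"): "=",
}

# Right-hand side -> left-hand side. "u" is a hypothetical name for the "t g h" rule.
RULES = {("a", "b"): "r", ("d", "e", "f"): "s",
         ("r", "c", "s"): "t", ("t", "g", "h"): "u"}
START = "u"


def relation(x, y):
    """Return '<', '=', '>' or None (no relationship) for adjacent symbols x y."""
    if x == END:
        return None if y == END else "<"
    if y == END:
        return ">"
    return REL.get((x, y))


def parse(tokens):
    """Shift/reduce parse; returns the reductions in the order they were applied."""
    stack = [(END, None)]          # entries are (symbol, relation on its left)
    pending = deque(tokens)
    pending.append(END)
    reductions = []
    while True:
        top, nxt = stack[-1][0], pending[0]
        if top == START and nxt == END and len(stack) == 2:
            return reductions      # only the start symbol remains: success
        rel = relation(top, nxt)
        if rel is None:
            raise ValueError(f"syntax error: no relation between {top!r} and {nxt!r}")
        if rel in ("<", "="):      # shift the symbol together with its left relation
            stack.append((pending.popleft(), rel))
            continue
        # rel == '>': a handle ends here; trace back to the nearest '<'
        handle = []
        while True:
            symbol, left_rel = stack.pop()
            handle.append(symbol)
            if left_rel == "<":
                break
        handle.reverse()
        lhs = RULES.get(tuple(handle))
        if lhs is None:
            raise ValueError(f"syntax error: no rule matches {' '.join(handle)}")
        reductions.append(f"{lhs} ::= {' '.join(handle)}")
        pending.appendleft(lhs)    # the reduced symbol is compared like any other


if __name__ == "__main__":
    for step in parse("a b c d e f g h".split()):
        print(step)
```

Storing each symbol together with the relation on its left is what makes "trace back to the nearest <" a simple pop loop.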

Example
< a = b < c = d > e < f > g >
< a = b > x = e < f > g >
< y < x = e = z = g >
< y = m >
< n >
If this is the start symbol: successful parse!

Stack walkthrough for < a = b < c = d > e < f > g >
We process the tokens, adding each token and the precedence operator on its left to the stack. Encountering a > indicates that a handle has been found and thus initiates special processing. In each stack below, the top entry is listed first.

Stack: (a, <)
Stack: (b, =) (a, <)
Stack: (c, <) (b, =) (a, <)
Stack: (d, =) (c, <) (b, =) (a, <)

Finding a ">" between d and e, we trace back to the nearest < and reduce the handle c d to x:
< a = b > x = e < f > g >
Stack: (x, >) (b, =) (a, <)
x is not really on the stack, because of the >.

We trace back to process the a b handle, reducing it to y, and then x goes on the stack:
< y < x = e < f > g >
Stack: (x, <) (y, <)

Continuing...
Stack: (e, =) (x, <) (y, <)
Stack: (f, <) (e, =) (x, <) (y, <)

Converting f to z:
< y < x = e = z = g >
Stack: (g, =) (z, =) (e, =) (x, <) (y, <)
Stack: (#, >) (g, =) (z, =) (e, =) (x, <) (y, <)
# signifies end of input.

Reducing the handle x e z g to m:
< y = m >
Stack: (#, >) (m, =) (y, <)

Reducing the handle y m to n:
< n >
Stack: (#, >) (n, <)
If n is our start symbol: successful parse!

Shift/Reduce Parsing
x < y: shift
x = y: shift
x > y: reduce
No relationship: syntax error
More than one relationship: not allowed. Actually, < together with = would be okay, since both mean shift; > together with = or with < is called a shift/reduce error.
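The decision rules on this slide fit in a few lines. The following is a minimal sketch; the function name `classify` and the returned strings are made up for illustration and do not come from the lecture.

```python
def classify(relations):
    """Map the set of relations found between two adjacent symbols to an action.

    `relations` is any iterable of '<', '=' and '>' (possibly empty).
    """
    found = set(relations)
    unknown = found - {"<", "=", ">"}
    if unknown:
        raise ValueError(f"not a precedence relation: {sorted(unknown)}")
    if not found:
        return "syntax error"            # no relationship at all
    if ">" in found and len(found) > 1:
        return "shift/reduce conflict"   # '>' mixed with '<' or '='
    return "reduce" if found == {">"} else "shift"


if __name__ == "__main__":
    for case in ([], ["<"], ["="], ["<", "="], [">"], [">", "="], [">", "<"]):
        print(case, "->", classify(case))
```

Treating the pair `<` and `=` as a plain shift mirrors the slide's remark that this combination is harmless: either way the next token goes onto the stack.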

Shift/Reduce Parsing
Reduce-reduce error: when trying to match a handle, the parser finds two rules which match.
No right-hand side match: syntax error.

Simple Precedence
Only possibilities: x < y, x = y, x > y, or no relationship exists. (No others allowed.)

Bottom-Up Parsing
No issues regarding left-recursive versus right-recursive grammars, such as those found with top-down parsing.
Note: there are grammars that will break a bottom-up parser.

Precedence
Consider our simple grammar... Between symbols there must be ?
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id

Precedence
Consider our simple grammar... Between symbols there must be =
<expr> ::= <expr> = + = <term> | <term>
<term> ::= <term> = * = <factor> | <factor>
<factor> ::= '(' = <expr> = ')' | num | id

Precedence To determine "end points" we must look at multiple rules to see how they interact... x c = d

Precedence To determine "end points" we must look at multiple rules to see how they interact... x c = d To determine what goes here...

Precedence To determine "end points" we must look at multiple rules to see how they interact... x c = d We look here.

Precedence
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id
+ = <term>
* = <factor>
+ < <factor>
* < ( = <expr> = )
+ < ( = <expr> = )

Precedence Table <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id L R <expr> <term> <factor> + * ( ) num id <expr> <term> <factor> + * ( ) num id

Precedence Table <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id L R <expr> <term> <factor> + * ( ) num id <expr> = = <term> > = > <factor> > > > + < < < < * = < < < ( < < < ) > > > num > > > id > > >

Precedence Table '(' <expr>... '(' < '(' <expr>... <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id L R <expr> <term> <factor> + * ( ) num id <expr> = = <term> > = > <factor> > > > + < < < < * = < < < ( < < < '(' <expr>... '(' < '(' <expr>... ) > > > num > > > id > > >

Precedence Table <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id L R <expr> <term> <factor> + * ( ) num id <expr> = = <term> > = > <factor> > > > + = < < < < < * = < < < ( = < < < < ) > > > num > > > id > > >

Precedence Table '(' = <expr> ')' '(' < <expr> + <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id L R <expr> <term> <factor> + * ( ) num id <expr> = = <term> > = > '(' = <expr> ')' <factor> > > > + = < < < < < * = < < < ( = < < < < ) > > > '(' < <expr> + num > > > id > > >

Resolving Ambiguity
The conflict: + = <term> versus + < <term> * <factor>
Solve by lookahead:
+ = <term> when the next symbol is + or )
+ < <term> when the next symbol is *
Ambiguity can be resolved by increasing k in LR(k), but that's not the only way: we could rewrite the grammar.

Original Grammar <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id

Sources of Ambiguity <expr> ::= <expr> + <term> | <term> <term> ::= <term> * <factor> | <factor> <factor> ::= '(' <expr> ')' | num | id

Add 2 New Rules
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id
<e> ::= <expr>
<t> ::= <term>

Modify
<expr> ::= <expr> + <t> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <e> ')' | num | id
<e> ::= <expr>
<t> ::= <term>

Original Grammar
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id

Rewritten Grammar
<expr> ::= <expr> + <t> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <e> ')' | num | id
<e> ::= <expr>
<t> ::= <term>
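To check the effect of the rewrite, the precedence relations of both grammars can be computed mechanically. The sketch below assumes the usual textbook construction of Wirth-Weber relations, which the slides do not spell out: X = Y when X Y are adjacent in a right-hand side, X < Y when X is followed by a nonterminal that can begin with Y, and X > y when X can end a nonterminal that is followed by something that is, or can begin with, the terminal y. All function and variable names are illustrative.

```python
from collections import defaultdict

ORIGINAL = {
    "<expr>": [["<expr>", "+", "<term>"], ["<term>"]],
    "<term>": [["<term>", "*", "<factor>"], ["<factor>"]],
    "<factor>": [["(", "<expr>", ")"], ["num"], ["id"]],
}
REWRITTEN = {
    "<expr>": [["<expr>", "+", "<t>"], ["<term>"]],
    "<term>": [["<term>", "*", "<factor>"], ["<factor>"]],
    "<factor>": [["(", "<e>", ")"], ["num"], ["id"]],
    "<e>": [["<expr>"]],
    "<t>": [["<term>"]],
}


def closure(grammar, pick):
    """For each nonterminal, every symbol reachable by repeatedly taking
    pick(rhs), i.e. the first (or last) symbol of a right-hand side."""
    result = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                sym = pick(rhs)
                new = {sym} | result.get(sym, set())
                if not new <= result[nt]:
                    result[nt] |= new
                    changed = True
    return result


def precedence_relations(grammar):
    """Return {(x, y): set of relations} for every related pair of symbols."""
    head = closure(grammar, lambda rhs: rhs[0])
    tail = closure(grammar, lambda rhs: rhs[-1])
    rel = defaultdict(set)
    for alternatives in grammar.values():
        for rhs in alternatives:
            for x, y in zip(rhs, rhs[1:]):
                rel[(x, y)].add("=")
                for h in head.get(y, ()):       # y is a nonterminal starting with h
                    rel[(x, h)].add("<")
                firsts = {y} | head.get(y, set())
                for t in tail.get(x, ()):       # x is a nonterminal ending with t
                    for f in firsts:
                        if f not in grammar:    # only terminals right of '>'
                            rel[(t, f)].add(">")
    return rel


def conflicts(grammar):
    """Pairs of symbols that carry more than one relation."""
    return {pair: rels
            for pair, rels in precedence_relations(grammar).items()
            if len(rels) > 1}


if __name__ == "__main__":
    print("original :", conflicts(ORIGINAL))
    print("rewritten:", conflicts(REWRITTEN))
```

Under these assumptions the original grammar has exactly two doubly-related pairs, `+` with `<term>` and `(` with `<expr>`, each carrying both `=` and `<`. These are the two cases the slides single out. The rewritten grammar has none, because `+ = <t>` and `( = <e>` no longer share a cell with `+ < <term>` and `( < <expr>`.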

Performance
Size of table is O(n^2), where n is the number of grammar symbols. If we use operator precedence, the table uses only terminals, so its size is not affected by adding non-terminals.
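As a quick illustration of the table-size claim, the cell counts for the expression grammar can be worked out directly. The symbol lists below are taken from the grammars on the earlier slides; the helper name `table_cells` is made up for this sketch.

```python
NONTERMINALS = ["<expr>", "<term>", "<factor>"]
EXTRA_NONTERMINALS = ["<e>", "<t>"]          # added by the rewritten grammar
TERMINALS = ["+", "*", "(", ")", "num", "id"]


def table_cells(symbols):
    """A precedence table has one row and one column per symbol."""
    return len(symbols) ** 2


simple_original = table_cells(NONTERMINALS + TERMINALS)
simple_rewritten = table_cells(NONTERMINALS + EXTRA_NONTERMINALS + TERMINALS)
operator_only = table_cells(TERMINALS)

print(simple_original, simple_rewritten, operator_only)  # 81 121 36
```

Adding the two new nonterminals grows the simple precedence table from 81 to 121 cells, while a table over terminals alone stays at 36.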

Question 1
a < b = c = d > e
a ? x e
What happens if there is no relationship here?

Question 2
Where do precedence relationships come from?
Make a table by hand, or write a program to make the table.
How such a program works or how to write it are topics beyond the scope of this course.

Example

1 + 2 * 3
Tokenized: num + num * num
< num > +
< <factor> > +
< <term> > +
< <expr> = +
< <expr> = + < num
< <expr> = + < num > *
< <expr> = + < <factor> > *
< <expr> = + < <term> = *
< <expr> = + < <term> = * < num
< <expr> = + < <term> = * < num >
< <expr> = + < <term> = * = <factor> >
< <expr> = + = <term> >
< <expr> >

Questions?