Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 

Slides:



Advertisements
Similar presentations
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Advertisements

CSE 5317/4305 L4: Parsing #21 Parsing #2 Leonidas Fegaras.
LR Parsing Table Costruction
Bhaskar Bagchi (11CS10058) Lecture Slides( 9 th Sept. 2013)
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
1 Bottom Up Parsing. 2 Bottom-Up Parsing l Bottom-up parsing is more general than top-down parsing »And just as efficient »Builds on ideas in top-down.
Pertemuan 12, 13, 14 Bottom-Up Parsing
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
1 CMPSC 160 Translation of Programming Languages Fall 2002 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #10 Parsing.
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Bottom Up Parsing.
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
1 LR parsing techniques SLR (not in the book) –Simple LR parsing –Easy to implement, not strong enough –Uses LR(0) items Canonical LR –Larger parser but.
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Shift/Reduce and LR(1) Professor Yihjia Tsai Tamkang University.
Bottom-up parsing Goal of parser : build a derivation
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
LESSON 24.
Joey Paquet, 2000, 2002, 2012, Lecture 6 Bottom-Up Parsing.
410/510 1 of 21 Week 2 – Lecture 1 Bottom Up (Shift reduce, LR parsing) SLR, LR(0) parsing SLR parsing table Compiler Construction.
LR Parsing Compiler Baojian Hua
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
CS 321 Programming Languages and Compilers Bottom Up Parsing.
CSc 453 Syntax Analysis (Parsing) Saumya Debray The University of Arizona Tucson.
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
CSc 453 Syntax Analysis (Parsing)
11 Outline  6.0 Introduction  6.1 Shift-Reduce Parsers  6.2 LR Parsers  6.3 LR(1) Parsing  6.4 SLR(1)Parsing  6.5 LALR(1)  6.6 Calling Semantic.
1 LR Parsers  The most powerful shift-reduce parsing (yet efficient) is: LR(k) parsing. LR(k) parsing. left to right right-most k lookhead scanning derivation.
Chapter 3-3 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR 
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
Bottom-Up Parsing Algorithms LR(k) parsing L: scan input Left to right R: produce Rightmost derivation k tokens of lookahead LR(0) zero tokens of look-ahead.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
Conflicts in Simple LR parsers A SLR Parser does not use any lookahead The SLR parsing method fails if knowing the stack’s top state and next input token.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
COMPILER CONSTRUCTION
Lec04-bottomupparser 4/13/2018 LR Parsing.
CSc 453 Syntax Analysis (Parsing)
Bottom-up Parsing.
Compiler design Bottom-up parsing Concepts
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
UNIT - 3 SYNTAX ANALYSIS - II
Bottom-up Parsing.
Table-driven parsing Parsing performed by a finite state machine.
Compiler Construction
Fall Compiler Principles Lecture 4: Parsing part 3
Bottom-Up Syntax Analysis
Syntax Analysis Part II
Implementing a LR parser engine
Subject Name:COMPILER DESIGN Subject Code:10CS63
CSC 4181 Compiler Construction Parsing
Syntax Analysis source program lexical analyzer tokens syntax analyzer
Parsing #2 Leonidas Fegaras.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Parsing #2 Leonidas Fegaras.
Bottom-up Parsing.
Chap. 3 BOTTOM-UP PARSING
Presentation transcript:

Bottom-up parsing

Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E  T + E | T T  int * T | int

Bottom-up parsing Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T E => T + E => E + T => E + int => int * T + int => int * int + int (4) (2) (3) (5) (1) int*+ Rightmost Derivation

Bottom-up parsing Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T E => T + E => E + T => E + int => int * T + int => int * int + int (4) (2) (3) (5) (1) int*+ Rightmost Derivation E => T + E => int * T + E => int * int + E => int * int + T => int * int + int Leftmost Derivation

Bottom-up parsing II Bottom-up parsing is a series of reductions (inverses of productions), the reverse of which is the rightmost derivation (1) int --> T (2) int * T --> T (3) int --> T (4) T --> E (5) T + E --> E E  T + E | T T  int * T | int E => T + E => E + T => E + int => int * T + int => int * int + int int E T * TE+ T (4) (2) (3) (5) (1) int*+ Rightmost DerivationReductions

LR(k) Most popular bottom-up parsing method is LR(k) parsing Left-to-right scanning of input Rightmost derivation With an input lookahead of k

Why LR(k) parsing? Recognizes most programming language constructs –LR(k) recognizes the body of a production in right- sentential form with k symbols of lookahead Determine when to apply reductions, A   given string  a 1 …a k w –LL(k) recognizes the use of a production after seeing the first k symbols of what the body derives Determine when to apply productions, A  a 1 …a k  given string wa 1 …a k  Possible to build efficient table-based algorithms LR(k) is a proper superset of LL(k)

Shift-reduce parsing LR parsers typically described as shift-reduce parsers Pushdown automata with 4 possible actions –Shift a: Move token a from input to stack –Reduce A   Reduce sequence  on stack to A –Accept: Accept string –Error: Reject string Compare with actions used in LL parsing –Scan a: Pop token a from stack; Match a from input –Push A   1 …  n : Pop A; Push states  n …  1 to stack

Recall LL(1) E  T X T  ( E ) | int Y X  + E |  Y  * T |  E: T: X: Y: TX (E) int Y +E *T  

Recall LL(1) FOLLOW(X) = { ), $ } FOLLOW(Y) = { +, ), $ } E: T: X: Y: TX (E) int Y +E *T   Stack E X T X Y int X Y X T * X T X Y int X Y X  … Input int * int $ * int $ int $ $ …

Shift-reduce parsing Shift S m : Push S m on stack; increment input position Parser GotoAction Stack Input $ SmSm S m-1 a1a1 aiai $ …… … Action[S,a] = {Shift, Reduce, Accept} Goto[S,A] = S

Shift-reduce parsing Reduce A   : Pop |  | symbols; push Goto[S m-|  |,A] on stack Parser GotoAction Stack Input $ SmSm S m-1 a1a1 aiai $ …… … Action[S,a] = {Shift, Reduce, Accept} Goto[S,A] = S

Shift-reduce parsing StateActionGoto int()+*$ETXY 1S9S6132 2R5S4R53 3R1 4S9S652 5R4 6S9S672 7S8 8R2 9R7 S11R710 R3 11S9S612 R6 13acc

LR tables All LR parsers use the same basic algorithm, they only differ on how the transition tables are built –SLR(0), LR(1), LALR(1) The basic problem is determining when to shift and when to reduce –Use an LR(0) automaton to determine viable prefixes

How do we build transition tables? LR(0) automata encapsulate all we need –Push-down automata with edges labeled with terminals and non-terminals –Reducing and accepting states Different capabilities due to –When to reduce –How automata are made deterministic

LR(0) automaton E: T: X: Y: TX (E) int Y +E *T   Start out with LL(1) picture –Separate automaton for each production

LR(0) automaton E: T: X: Y: TX (E) int Y +E *T   Add  - transitions to indicate possible parsing states –Intuition:  - transitions allow use to nondeterministically pick the right production to apply Apply NFA to DFA conversion

LR(0) automaton E: TX  123

LR(0) automaton E: T: TX (E) int Y  1, ( int

LR(0) automaton E: T: Y: TX (E) int Y * T 1, , ( int *

LR(0) automaton E: T: Y: TX (E) int Y * T 1,42 45,1,46 8, ( int * T (

LR(0) automaton E: T: Y: TX (E) int Y *T 1,42,10 45,1,46 8, ( int * T ( X: +E

LR(0) automaton E: T: Y: TX (E) int Y * T 1,42,10 45,1,46 8, ,415 ( int * T ( X: +E int(

LR(0) automaton E: T: Y: TX (E) int Y * T 1,42,10 45,1,46 8, ,415 ( int * T ( X: +E 1011,1,412 + int(

LR(0) automaton 1T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 11T E 5 E  T X T  ( E ) | int Y X  + E |  Y  * T |  X  + E Y  * T T  ( E ) E  T X X   Y   T  int Y acc E 13 Useful to augment grammar with rule S’  S to identify accepting state

Constructing an LR(0) automaton 1.Add a dummy start symbol [S’  S $] –Distinguishes accepting reductions 2.Make an automaton for each production 3.For transitions on non-terminals, add  - edges to the corresponding automaton 4.Apply NFA to DFA conversion

Example P’  P $ P  S | T S  A A T  A B A  a a B  b b SP P’P$ T SAA TAB Aaa Bbb

Example P’  P $ P  S | T S  A A T  A B A  a a B  b b SP P’P$ T SAA TAB Aaa Bbb

Example P’  P $ P  S | T S  A A T  A B A  a a B  b b S P’P$ T AA B aa bb P’  P $ P  S P  T S  A A T  A B A  a a B  b b

From sets to automaton Many of the important properties that we calculated from sets and constraints are encapsulated in the LR(0) automaton –Easier to calculate from set definitions –Perhaps easier to understand from automaton Examples –FIRST, FOLLOW, Items

FIRST “Possible first terminals for a non-terminal” Case 1 –For [A  a] then a  FIRST(A) Case 2 –For [A  X 1 X 2 … X n ] then FIRST(X 1 )  FIRST(A) If NULLABLE(X 1 ) then FIRST(X 2 )  FIRST(A) If NULLABLE(X 1, X 2, …, X n-1 ) then FIRST(X n )  FIRST(A) –For [A   ] then no constraint

FIRST P P’ S T A B P’  P $ P  S P  T S  A A T  A B A  a a B  b b P$ S T A A A B a b a b “Possible first terminals for a non-terminal”

FIRST P P’ S T A B FIRST(P’) = { a } FIRST(P) = { a } FIRST(S) = { a } FIRST(T) = { a } FIRST(A) = { a } FIRST(B) = { b } P’  P $ P  S P  T S  A A T  A B A  a a B  b b P$ S T A A A B a b a b

FOLLOW “Terminals that can follow a non-terminal” For [A  … X B 1 B 2 … B n ] Case 1 –FIRST(B 1 )  FOLLOW(X) Case 2 –If NULLABLE(B 1 ) then FIRST(B 2 )  FOLLOW(X) –If NULLABLE(B 1, B 2, …, B n-1 ) then FIRST(B n )  FOLLOW(X) Case 3 –If NULLABLE(B 1, B 2, …, B n ) then FOLLOW(A)  FOLLOW(X) –NULLABLE({ }) = true

FOLLOW P P’ S T A B P’  P $ P  S P  T S  A A T  A B A  a a B  b b P$ S T A A A B a b a b “Terminals that can follow a non- terminal”

FOLLOW P P’ S T A B P’  P $ P  S P  T S  A A T  A B A  a a B  b b P$ S T A A A B a b a b “Terminals that can follow a non- terminal”  - edges show where non- terminals are used Read FOLLOW constraints from graph –(1) FIRST(A)  FOLLOW(A) –(2) FOLLOW(S)  FOLLOW(A) –(3) FIRST(B)  FOLLOW(A) (1) (2)(3)

FOLLOW P P’ S T A B FOLLOW(P’) = { } FOLLOW(P) = { $ } FOLLOW(S) = { $ } FOLLOW(T) = { $ } FOLLOW(A) = { a, b, $ } FOLLOW(B) = { a, $ } P’  P $ P  S P  T S  A A T  A B A  a a B  b b P$ S T A A A B a b a b

LR(0) items A LR(0) item is –A production, A  X Y Z, with a dot in the body, e.g., A . X Y Z –Represents state of parser A  X. Y Z means that the parser has seen X so far and is looking for a string derivable from Y Z Alternate construction of LR(0) automaton

LR(0) items CLOSURE(I : item set) –I  CLOSURE(I) –If [A   B  CLOSURE(I) and [B   ] –Then [B .  CLOSURE(I) –“If we’re looking for B  and B   then we should also be looking for  ” GOTO(I, X : symbol) –If [A   X  ]  I then [A   X.  ]  GOTO(I,X) –CLOSURE(GOTO(I, X))  GOTO(I, X) –“If we’re in state I and see symbol X, we are now in state I’”

LR(0) items I1: P’ . P $ P . S P . T P . A P . a I2: P’  P. $ I6: S  A. A T  A. B I9: A  a. A I11: B  b. b I4 I1I2I3 I5 I6I7 I8 I9 I10 I11I12 I3: P’  P $. I4: P  S. I5: P  T. I7: S  A A. I8: T  A B. I10 A  a a. I12: B  b b. CLOSURE defines states GOTO defines transitions

Recap Equivalence between LR(0) automaton properties and set constraints –Set constraints may be easier to calculate –FIRST(A): The set of terminals reachable from A through  -moves –FOLLOW(A): For each incoming  -edge to non-terminal, either FIRST(B)  FOLLOW(A) or FOLLOW(B)  FOLLOW(A) depending on incident state –LR(0) items  LR(0) automaton

How do we build transition tables? LR(0) automata encapsulate all we need –Push-down automata with edges labeled with terminals and non-terminals –Reducing and accepting states Now what about the transition tables?

SLR(1) parser 1 T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 12 T E 5 X  + E Y  * T T  ( E ) E  T X X   Y   T  int Y E  T X T  ( E ) | int Y X  + E |  Y  * T |  Simple LR(1) parser

SLR(1) parser 1 T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 12 T E 5 X  + E Y  * T T  ( E ) E  T X X   Y   T  int Y When to apply  -reductions? E  T X T  ( E ) | int Y X  + E |  Y  * T | 

SLR(1) parser take 1 Always reduce? Generate Action table from automaton –For each edge S i ==> S j in LR(0) automaton, Action[S i,a] = shift S j –For each “reduce” node S i with reduction [A   ], Action[S i,  ] = reduce A   Exception: If the node corresponds to the reduction S’  S, then Action[S i,  $] = accept –All other actions are error –Conflict between actions  grammar not SLR a

SLR(1) parser 1 T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 12 T E 5 X  + E Y  * T T  ( E ) E  T X X   Y   T  int Y When to apply  -reductions? E  T X T  ( E ) | int Y X  + E |  Y  * T | 

SLR(1) parser 1 T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 12 T E 5 X  + E Y  * T T  ( E ) E  T X X   Y   T  int Y FOLLOW(X) = { ), $ } FOLLOW(Y) = { +, ), $ } When to apply  -reductions? –When current token is in FOLLOW set E  T X T  ( E ) | int Y X  + E |  Y  * T | 

SLR(1) parser take 2 Generate Action table from automaton –For each edge S i ==> S j in LR(0) automaton, Action[S i,a] = shift S j –For each “reduce” node S i with reduction [A   ], Action[S i,a] = reduce A   where a  FOLLOW(A) Exception: If the node corresponds to the reduction S’  S, Action[S i,  $] = accept –All other actions are error –Conflict between actions  grammar not SLR a

SLR(1) parser take 2 Generate Goto table from automaton –For each edge S i ==> S j in LR(0) automaton, Goto[S i,A] = S j A

SLR(1) parsing example 1T 2 ( 6E 7 int 9Y 10 X 3 ) 8 * 11T E 5 4: X  + E 6: Y  * T 2: T  ( E ) 1: E  T X 5: X   7: Y   3: T  int Y acc E 13 FOLLOW(X) = { ), $ } FOLLOW(Y) = { +, ), $ } FOLLOW(E) = { ), $ } FOLLOW(T) = { +, $ }

SLR(1) parsing example StateActionGoto int()+*$ETXY 1S9S6132 2R5S4R53 3R1 4S9S652 5R4 6S9S672 7S8 8R2 9R7 S11R710 R3 11S9S612 R6 13acc

Problem with SLR(1) S’  S $ S  L = R | R L  *R | id R  L L 2 * 4 id 5 = 6L 8 R 7 R 9 S 1 R 3 S’ 10 L  * R R  L S  L = R R  L L  id S’  S $ S  R acc 0 FOLLOW(S) = { $ } FOLLOW(R) = { =, $ } FOLLOW(L) = { =, $ } $ 11

Problem with SLR(1) S’  S $ S  L = R | R L  *R | id R  L L 2 * 4 id 5 = 6L 8 R 7 R 9 S 1 R 3 S’ 10 L  * R R  L S  L = R R  L L  id S’  S $ S  R acc 0 FOLLOW(S) = { $ } FOLLOW(R) = { =, $ } FOLLOW(L) = { =, $ } $ 11 When to shift and when to reduce?

LR(1) Use k = 1 lookahead symbols to determine when to shift rather than reduce –Reduce only when we have a matching lookahead –The set of lookahead symbols for A is some subset of FOLLOW(A) Use LR(0) automaton to give intuition about LR(1)

LR(1) example = R R S  L = R S  R L S R L  * R L  id * L L R  L R (1) FIRST($)  FOLLOW(S) (2) FIRST(=)  FOLLOW(L) (3) FOLLOW(R)  FOLLOW(L) (4) FOLLOW(S)  FOLLOW(R) (5) FOLLOW(S)  FOLLOW(R) (6) FOLLOW(L)  FOLLOW(R) S’  S $ S  L = R | R L  *R | id R  L $S S’ S’  S $ id (1) (2) (3) (4) (5) (6) FOLLOW(S) = { $ } FOLLOW(R) = { =, $ } FOLLOW(L) = { =, $ } Constraints Solutions

LR(1) example = R R S  L = R S  R L S R L  * R L  id * L L R  L R (1) FIRST($)  FOLLOW(S) (2) FIRST(=)  FOLLOW(L) (3) FOLLOW(R)  FOLLOW(L) (4) FOLLOW(S)  FOLLOW(R) (5) FOLLOW(S)  FOLLOW(R) (6) FOLLOW(L)  FOLLOW(R) S’  S $ S  L = R | R L  *R | id R  L $S S’ S’  S $ id $ = =,$ $ $ FOLLOW(S) = { $ } FOLLOW(R) = { =, $ } FOLLOW(L) = { =, $ } Constraints Solutions

LR(1) example = R R S  L = R, {$} S  R, {$} L S$ R L  * R, {=} L  id, {=} * L= $S S’ S’  S $ id L R  L, {$} R$ L R  L, {=, $} R=$ R L  * R, {=, $} L  id, {=, $} * L=$ id =

LR(1) example = R R S  L = R, {$} S  R, {$} L S$ R L  * R, {=} L  id, {=} * L= $S S’ S’  S $ id L R  L, {$} R$ L R  L, {=} R= R L  * R, {$} L  id, {$} * L$ id =

Recap 1.Use “context” of  - moves to introduce states corresponding to the terminal(s) we expect to see after non-terminal –State dependent FOLLOW –Subset of FOLLOW 2.Propagate lookahead to reduction rules 3.Perform NFA to DFA conversion

LR(1) items Equivalence between LR(1) automaton and LR(1) item sets A LR(1) item is –An LR(0) item augmented with a lookahead symbol (terminal), e.g., [A . X Y Z, a] –The item [A  X Y Z., a] calls for a reduction only if the next input symbol is a

LR(1) items CLOSURE(I : item set) –I  CLOSURE(I) –If [A   B , a]  CLOSURE(I), [B   ], and b  FIRST(  a) –Then [B . , b]  CLOSURE(I) –“If we’re looking for B  and B   then we should also be looking for  ” GOTO(I, X : symbol) –If [A   X , a]  I then [A   X. , a]  GOTO(I,X) –CLOSURE(GOTO(I, X))  GOTO(I, X) –“If we’re in state I and see symbol X, we are now in state I’” Like FOLLOW(B) but takes into account of current production; “a” term handles the case when  =  ; FIRST(a)  FOLLOW(A)

LR(1) parser Generate Action table from automaton –For each edge S i ==> S j in LR(1) automaton, Action[S i,a] = shift S j –For each “reduce” node S i with reduction [A  , a], Action[S i,a] = reduce A   Exception: If the node corresponds to the reduction [S’  S, $], then Action[S i,  $] = accept –All other actions are error –If there is a conflict between actions, grammar is not in LR(1) Goto table generated as in SLR(1) a

LALR(1) LR(1) construction can generate many more states than SLR(1) –Lookahead may only be needed for a few constructs in grammar –Merge states that only differ on lookahead symbol (i.e., identical cores) –Cannot introduce a shift-reduce conflict because shift actions only depend on core –May introduce reduce-reduce conflicts

LALR(1) example S’  S $ S  C C C  c C | d S’ . S, $ S . C C, $ C . c C, c/d C . d, c/d S’  S., $ S  C. C, $ C . c C, $ C . d, $ S  C C., $ C  c. C, $ C . c C, $ C . d, $ C  c C., $ C  d., $ C  c. C, c/d C . c C, c/d C . d, c/d C  d., c/d C  c C., c/d S C c d c c d d d c C C C

LALR(1) example S’  S $ S  C C C  c C | d S’ . S, $ S . C C, $ C . c C, c/d C . d, c/d S’  S., $ S  C. C, $ C . c C, $ C . d, $ S  C C., $ C  c. C, $ C . c C, $ C . d, $ C  c C., $ C  d., $ C  c. C, c/d C . c C, c/d C . d, c/d C  d., c/d C  c C., c/d S C c d c c d d d c C C C

LALR(1) example S’  S $ S  C C C  c C | d S’ . S, $ S . C C, $ C . c C, c/d C . d, c/d S’  S., $ S  C. C, $ C . c C, $ C . d, $ S  C C., $ C  c. C, cd$ C . c C, cd$ C . d, cd$ C  c C., cd$ C  d., cd$ S C c d c d d c C C

Dealing with ambiguity Commonly, parser generators allow methods for dealing with ambiguous grammars –Precedence and associativity rules for operators –Implemented by modifying generated parser table

JFlex and CUP JFlex, a lexer generator for Java CUP, a parser generator for Java Both take specifications and generate Java code

JFlex import java_cup.runtime.Symbol; % %class Lexer %cup %{ private Symbol symbol(int sym) { return new Symbol(sym, yyline+1, yycolumn+1); } private Symbol symbol(int sym, Object val) { return new Symbol(sym, val); } %} IntLiteral = 0 | [1-9][0-9]* new_line = \r|\n|\r\n; white_space = {new_line} | [ \t\f] % {IntLiteral} { return symbol(sym.INT, new Integer(Integer.parseInt(yytext()))); } "(" { return symbol(sym.LPAREN); } ")" { return symbol(sym.RPAREN); } "+" { return symbol(sym.PLUS); }... {white_space} { /* ignore */ }.|\n { error("Illegal character "); }

CUP /* Terminals (tokens returned by lexer). */ terminal PLUS, MINUS, SLASH, STAR, QUESTION, COLON, LPAREN, RPAREN; terminal Integer INT; non terminal Integer Exp; precedence left QUESTION; precedence left PLUS, MINUS; precedence left STAR, SLASH; Exp ::= INT:i {: RESULT = i; :} | Exp:e1 PLUS Exp:e2 {: RESULT = e1 + e2; :} | Exp:e1 MINUS Exp:e2 {: RESULT = e1 - e2; :} | Exp:e1 SLASH Exp:e2 {: RESULT = e1 / e2; :} | Exp:e1 STAR Exp:e2 {: RESULT = e1 * e2; :} | Exp:e1 QUESTION Exp:e2 COLON Exp:e3 {: RESULT = e1 == 0 ? e3 : e2; :} | LPAREN Exp:e1 RPAREN {: RESULT = e1; :} ;