1 Bottom Up Parsing
2 Bottom-Up Parsing
- Bottom-up parsing is more general than top-down parsing
  » And just as efficient
  » Builds on ideas in top-down parsing
  » Preferred method in practice
- Also called LR parsing
  » L means that tokens are read left to right
  » R means that it constructs a rightmost derivation in reverse
3 An Introductory Example
- LR parsers don't need left-factored grammars and can also handle left-recursive grammars
- Consider the following grammar: E → E + ( E ) | int
  » Why is this not LL(1)?
- Consider the string: int + ( int ) + ( int )
4 The Idea
- LR parsing reduces a string to the start symbol by inverting productions:
    str ← input string of terminals
    repeat
      » Identify β in str such that A → β is a production (i.e., str = α β γ)
      » Replace β by A in str (i.e., str becomes α A γ)
    until str = S
5 A Bottom-up Parse in Detail
Each step replaces the right-hand side of a production by its left-hand side, growing the parse tree from the leaves up:
    int + (int) + (int)
    E + (int) + (int)
    E + (E) + (int)
    E + (int)
    E + (E)
    E
Read from bottom to top, this is a rightmost derivation: the parse is a rightmost derivation in reverse.
11 Important Fact #1
Important Fact #1 about bottom-up parsing: an LR parser traces a rightmost derivation in reverse.
12 Where Do Reductions Happen
Important Fact #1 has an interesting consequence:
  » Let α β γ be a step of a bottom-up parse
  » Assume the next reduction is by A → β
  » Then γ is a string of terminals
Why? Because α A γ → α β γ is a step in a rightmost derivation.
13 Notation
- Idea: split the string into two substrings
  » The right substring is as yet unexamined by parsing (a string of terminals)
  » The left substring has terminals and non-terminals
- The dividing point is marked by an I
  » The I is not part of the string
- Initially, all input is unexamined: I x1 x2 ... xn
14 Shift-Reduce Parsing
- Bottom-up parsing uses only two kinds of actions:
  » Shift
  » Reduce
15 Shift
- Shift: move I one place to the right
  » Shifts a terminal to the left string
    E + ( I int )   becomes   E + ( int I )
16 Reduce
- Reduce: apply an inverse production at the right end of the left string
  » If E → E + ( E ) is a production, then
    E + ( E + ( E ) I )   becomes   E + ( E I )
Shift-Reduce Example
    I int + (int) + (int)$        shift
    int I + (int) + (int)$        reduce E → int
    E I + (int) + (int)$          shift 3 times
    E + (int I ) + (int)$         reduce E → int
    E + (E I ) + (int)$           shift
    E + (E) I + (int)$            reduce E → E + (E)
    E I + (int)$                  shift 3 times
    E + (int I )$                 reduce E → int
    E + (E I )$                   shift
    E + (E) I $                   reduce E → E + (E)
    E I $                         accept
28 The Stack
- The left string can be implemented by a stack
  » The top of the stack is the I
- Shift pushes a terminal on the stack
- Reduce pops 0 or more symbols off the stack (the production rhs) and pushes a non-terminal on the stack (the production lhs)
29 Key Issue: When to Shift or Reduce?
- Idea: use a finite automaton (FA) to decide when to shift or reduce
  » The input is the stack
  » The input alphabet consists of terminals and non-terminals
- We run the FA on the stack and examine the resulting state X and the token tok after I
  » If X has a transition labeled tok then shift
  » If X is labeled with "A → β on tok" then reduce
LR(1) Parsing. An Example
    0 I int + (int) + (int)$                  shift
    0 int 1 I + (int) + (int)$                reduce E → int
    0 E 2 I + (int) + (int)$                  shift 3 times
    0 E 2 + 3 ( 4 int 5 I ) + (int)$          reduce E → int
    0 E 2 + 3 ( 4 E 6 I ) + (int)$            shift
    0 E 2 + 3 ( 4 E 6 ) 7 I + (int)$          reduce E → E + (E)
    0 E 2 I + (int)$                          shift 3 times
    0 E 2 + 3 ( 4 int 5 I )$                  reduce E → int
    0 E 2 + 3 ( 4 E 6 I )$                    shift
    0 E 2 + 3 ( 4 E 6 ) 7 I $                 reduce E → E + (E)
    0 E 2 I $                                 accept
[Figure: the LR(1) DFA with states 0 to 11. State labels: state 1: E → int on $, +; state 2: accept on $; state 5: E → int on ), +; state 7: E → E + (E) on $, +; state 11: E → E + (E) on ), +.]
31 Representing the FA
- Parsers represent the deterministic FA as a 2D table
  » Recall table-driven lexical analysis
- Rows correspond to DFA states
- Columns correspond to terminals and non-terminals
- Typically columns are split into:
  » Those for terminals: the action table
  » Those for non-terminals: the goto table
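32 Representing the DFA. Example
- The table for a fragment of our DFA (sN = shift to state N, gN = goto state N, r = reduce):
    state  int   +             (     )             $             E
    …
    3                          s4
    4      s5                                                    g6
    5            r E → int           r E → int
    6            s8                  s7
    7            r E → E+(E)                       r E → E+(E)
    …
[Figure: the DFA fragment with states 3 to 7. State 5 is labeled E → int on ), +; state 7 is labeled E → E + (E) on $, +.]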
33 The LR Parsing Algorithm
- After a shift or reduce action we rerun the DFA on the entire stack
  » This is wasteful, since most of the work is repeated
- Instead, for each stack element remember the state the DFA reaches after reading it
- The LR parser maintains a stack ⟨sym1, state1⟩ ... ⟨symn, staten⟩
  » statek is the final state of the DFA on sym1 … symk
34 The LR Parsing Algorithm
    Let w be the initial input
    Let j = 0
    Let DFA state 0 be the start state
    Let stack = ⟨dummy, 0⟩
    repeat
      case action[topState(stack), w[j]] of
        shift k:        push ⟨w[j++], k⟩
        reduce X → α:   pop |α| pairs, push ⟨X, Goto[topState(stack), X]⟩
        accept:         halt normally
        error:          halt and report error
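The algorithm above can be sketched in Python. This is a minimal illustration rather than the slides' own code: the names PRODUCTIONS, ACTION, GOTO and lr_parse are made up for the example. The table encodes the 12-state DFA of the running grammar E → E + ( E ) | int; the entries for states 8 to 11 (the copies of states 3, 4, 6 and 7 used inside nested parentheses) are an assumption inferred from the state labels in the figure.

```python
# Productions of the running example, numbered for the reduce actions.
PRODUCTIONS = {
    1: ("E", ("int",)),
    2: ("E", ("E", "+", "(", "E", ")")),
}

# ACTION[state][terminal]: ("s", k) shifts to state k, ("r", p) reduces by
# production p, ("acc",) accepts. A missing entry is a syntax error.
ACTION = {
    0: {"int": ("s", 1)},
    1: {"$": ("r", 1), "+": ("r", 1)},
    2: {"+": ("s", 3), "$": ("acc",)},
    3: {"(": ("s", 4)},
    4: {"int": ("s", 5)},
    5: {")": ("r", 1), "+": ("r", 1)},
    6: {")": ("s", 7), "+": ("s", 8)},
    7: {"$": ("r", 2), "+": ("r", 2)},
    8: {"(": ("s", 9)},           # states 8-11: inferred, see lead-in
    9: {"int": ("s", 5)},
    10: {")": ("s", 11), "+": ("s", 8)},
    11: {")": ("r", 2), "+": ("r", 2)},
}
GOTO = {0: {"E": 2}, 4: {"E": 6}, 9: {"E": 10}}


def lr_parse(tokens):
    """Parse a token list ending in "$"; return the production numbers used."""
    stack = [("dummy", 0)]        # pairs of (symbol, DFA state)
    j = 0
    reductions = []
    while True:
        state = stack[-1][1]
        act = ACTION[state].get(tokens[j])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[j]!r} in state {state}")
        if act[0] == "s":
            stack.append((tokens[j], act[1]))
            j += 1
        elif act[0] == "r":
            lhs, rhs = PRODUCTIONS[act[1]]
            del stack[len(stack) - len(rhs):]          # pop |rhs| pairs
            stack.append((lhs, GOTO[stack[-1][1]][lhs]))
            reductions.append(act[1])
        else:
            return reductions
```

For example, `lr_parse("int + ( int ) + ( int ) $".split())` performs the same five reductions as the trace on the example slide, in the same order.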
35 LR Parsing Notes
- Can be used to parse more grammars than LL
- Most programming language grammars are LR
- Can be described as a simple table
- There are tools for building the table
- How is the table constructed?
36 LR Parsing and Parser Generators
37 Outline
- Implementing a shift-reduce parser
  » Synthesize a DFA
    - Captures all the possible states that the parser can be in, with state transitions on terminals and non-terminals
    - DFA for LR(0) parser
    - DFA for SLR parser
    - DFA for LR(1) parser
    - DFA for LALR(1) parser
  » Use the DFA to create a parse table
- Using parser generators
38 Key Issue: How is the DFA Constructed?
- The stack describes the context of the parse
  » What non-terminal we are looking for
  » What production rhs we are looking for
  » What we have seen so far from the rhs
  » => LR(0) item
- Each DFA state describes several such contexts
  » E.g., when we are looking for non-terminal E, we might be looking either for an int or an E + (E) rhs
  » => LR(0) item set
39 LR Example
- The grammar
    S → A $        (1)
    A → ( A )      (2)
    A → ( )        (3)
40 DFA States Based on LR(0) Items
- We need to capture how much of a given production we have scanned so far
    A → ( A )      Are we here? Or here?
[Figure: arrows point at two different positions inside the right-hand side.]
41 LR(0) Items
- We need to capture how much of a given production we have scanned so far
- The production A → ( A ) generates 4 items
  » A → • ( A )
  » A → ( • A )
  » A → ( A • )
  » A → ( A ) •
42 Example of LR(0) Items
- The grammar
    S → A $
    A → ( A )
    A → ( )
- Items
    S → • A $      S → A • $
    A → • ( A )    A → ( • A )    A → ( A • )    A → ( A ) •
    A → • ( )      A → ( • )      A → ( ) •
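Enumerating the items of a production is mechanical: one item per dot position. A small Python sketch, where the tuple representation (lhs, rhs, dot) and the function names are choices made for this illustration only. It lists every dot position, so for S → A $ it also yields the item with the dot after $, which a parser that accepts on seeing $ never needs.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of lhs -> rhs as (lhs, rhs, dot_position)."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item with "." standing in for the dot."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


GRAMMAR = [("S", ("A", "$")), ("A", ("(", "A", ")")), ("A", ("(", ")"))]
ALL_ITEMS = [item for lhs, rhs in GRAMMAR for item in lr0_items(lhs, rhs)]
```

A production with n symbols on the right-hand side has n + 1 items, which is why A → ( A ) generates four.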
43 Key idea behind LR(0) items
- If the "current state" contains the item A → α • c β and the current symbol in the input buffer is c
  » the state prompts the parser to perform a shift action
  » the next state will contain A → α c • β
- If the "state" contains the item A → α •
  » the state prompts the parser to perform a reduce action
- If the "state" contains the item S → α • $ and the input buffer is empty
  » the state prompts the parser to accept
- But what about A → α • B β, where B is a nonterminal?
44 The NFA for LR(0) items
- The transitions of LR(0) items can be represented by an NFA, in which
  » 1. each LR(0) item is a state,
  » 2. there is a transition from item A → α • X β to item A → α X • β with label X,
  » 3. there is an ε-transition from item A → α • B β to B → • γ,
  » 4. S → • A $ is the start state,
  » 5. A → α • is a final state.
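45 Example NFA for Items
- LR(0) items
    S → • A $      S → A • $
    A → • ( A )    A → ( • A )    A → ( A • )    A → ( A ) •
    A → • ( )      A → ( • )      A → ( ) •
[Figure: the NFA with one state per item, transitions labeled A, ( and ) that move the dot over a symbol, and ε-transitions from each item with the dot before A to A → • ( A ) and A → • ( ).]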
46 The DFA from LR(0) items
- After the NFA for LR(0) is constructed, the resulting DFA for LR(0) parsing can be obtained by the usual NFA2DFA construction.
- We thus require
  » ε-closure(I)
  » move(S, a)
47 Closure() of a set of items
- Closure finds all the items in the same "state"
- Fixed-point algorithm for Closure(I)
  » Every item in I is also an item in Closure(I)
  » If A → α • B β is in Closure(I) and B → • γ is an item, then add B → • γ to Closure(I)
  » Repeat until no more new items can be added to Closure(I)
48 Example of Closure
- Grammar: S → A $, A → ( A ), A → ( )
- Closure({ A → ( • A ) }) = { A → ( • A ), A → • ( A ), A → • ( ) }
49 Another Example
- Grammar: S → A $, A → ( A ), A → ( )
- Closure({ S → • A $ }) = { S → • A $, A → • ( A ), A → • ( ) }
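The fixed-point computation can be sketched with a worklist in Python. Items are written as (lhs, rhs, dot) tuples; that representation and the names GRAMMAR, NONTERMINALS and closure are assumptions of this sketch, not notation from the slides.

```python
GRAMMAR = [
    ("S", ("A", "$")),
    ("A", ("(", "A", ")")),
    ("A", ("(", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """LR(0) closure: keep adding B -> . gamma while some item has the dot before B."""
    result = set(items)
    work = list(result)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, body in GRAMMAR:
                new_item = (lhs, body, 0)
                if lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)
```

Calling `closure({("A", ("(", "A", ")"), 1)})` returns the three items of the first closure example, and `closure({("S", ("A", "$"), 0)})` returns those of the second.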
50 Goto() of a set of items
- Goto finds the new state after consuming a grammar symbol while at the current state
- Algorithm for Goto(I, X), where I is a set of items and X is a grammar symbol:
    Goto(I, X) = Closure({ A → α X • β | A → α • X β in I })
- Goto is the new set obtained by "moving the dot" over X
51 Example of Goto
- Grammar: S → A $, A → ( A ), A → ( )
- Goto({ A → ( • A ) }, A) = { A → ( A • ) }
52 Example of Goto
- Grammar: S → A $, A → ( A ), A → ( )
- Goto({ A → • ( A ) }, ( ) = { A → ( • A ), A → • ( A ), A → • ( ) }
53 Building the DFA states
- Essentially the usual NFA2DFA construction!
- Let A be the start symbol and S a new start symbol.
- Create a new rule S → A $
- Create the first state to be Closure({ S → • A $ })
- Pick a state I
  » for each item A → α • X β in I
    - find Goto(I, X)
    - if Goto(I, X) is not already a state, make one
    - add an edge labeled X from state I to the Goto(I, X) state
- Repeat until no more additions are possible
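54 DFA Example
- Grammar: S → A $, A → ( A ), A → ( )
- States
    s0: S → • A $,  A → • ( A ),  A → • ( )
    s1: S → A • $
    s2: A → ( • A ),  A → ( • ),  A → • ( A ),  A → • ( )
    s3: A → ( A • )
    s4: A → ( A ) •
    s5: A → ( ) •
- Transitions: s0 on A goes to s1, s0 on ( goes to s2, s2 on ( goes to s2, s2 on A goes to s3, s2 on ) goes to s5, s3 on ) goes to s4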
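The construction can be sketched in Python as follows. The block repeats a compact closure so that it runs on its own; the tuple representation of items and all function names are assumptions of the sketch. It does not follow the $ edge out of S → A • $, on the assumption (matching the six states of the example) that seeing $ there means accept. States are numbered in discovery order, so the numbers need not match s0 to s5 in the figure.

```python
GRAMMAR = [("S", ("A", "$")), ("A", ("(", "A", ")")), ("A", ("(", ")"))]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """LR(0) closure of a set of (lhs, rhs, dot) items."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, body in GRAMMAR:
                if lhs == rhs[dot] and (lhs, body, 0) not in result:
                    result.add((lhs, body, 0))
                    work.append((lhs, body, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then close the result."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved)


def build_states():
    """Return (states, edges); edges maps (state index, symbol) to a state index."""
    states = [closure({("S", ("A", "$"), 0)})]
    edges = {}
    i = 0
    while i < len(states):                      # states grows while we iterate
        symbols = {r[d] for _, r, d in states[i] if d < len(r)} - {"$"}
        for x in sorted(symbols):
            target = goto(states[i], x)
            if target not in states:
                states.append(target)
            edges[(i, x)] = states.index(target)
        i += 1
    return states, edges
```

Running `build_states()` yields six item sets and six edges, including the self-loop on ( at the state that contains A → ( • A ).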
55 Constructing an LR(0) Parse Engine
- Build a DFA
  » DONE
- Construct a parse table using the DFA
56 Creating the parse tables
- For each state
  » A transition to another state on a terminal symbol is a shift to that state (shift to sn)
  » A transition to another state on a non-terminal is a goto to that state (goto sn)
  » If there is an item A → α • in the state, do a reduction with that production for all terminals (reduce k)
57 Building Parse Table Example
[Figure: the LR(0) DFA from the DFA Example slide, states s0 to s5, for the grammar S → A $, A → ( A ), A → ( ).]
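Applying the three table rules to that DFA can be sketched in Python. The edge list and the two states with a completed item are read off the DFA Example figure; the dictionary layout and the names EDGES, COMPLETE, ACCEPT_STATE and build_lr0_table are assumptions made for this illustration.

```python
TERMINALS = ["(", ")", "$"]
# Production numbers as on the grammar slide.
PRODUCTIONS = {1: ("S", ("A", "$")), 2: ("A", ("(", "A", ")")), 3: ("A", ("(", ")"))}
# DFA edges: (state, symbol) -> state.
EDGES = {(0, "A"): 1, (0, "("): 2, (2, "("): 2, (2, "A"): 3, (2, ")"): 5, (3, ")"): 4}
# States that contain an item with the dot at the end, and its production.
COMPLETE = {4: 2, 5: 3}
ACCEPT_STATE = 1  # contains S -> A . $


def build_lr0_table():
    """Fill the action and goto tables; raise ValueError on a conflict."""
    action = {s: {} for s in range(6)}
    goto = {s: {} for s in range(6)}

    def put(state, terminal, entry):
        old = action[state].get(terminal)
        if old is not None and old != entry:
            raise ValueError(f"conflict in state {state} on {terminal!r}: {old} vs {entry}")
        action[state][terminal] = entry

    for (state, symbol), target in EDGES.items():
        if symbol in TERMINALS:
            put(state, symbol, ("shift", target))   # terminal edge: shift
        else:
            goto[state][symbol] = target            # non-terminal edge: goto
    put(ACCEPT_STATE, "$", ("accept",))
    for state, prod in COMPLETE.items():
        for terminal in TERMINALS:                  # LR(0): reduce on every terminal
            put(state, terminal, ("reduce", prod))
    return action, goto
```

Because LR(0) reduces in every terminal column, a state that also had a shift edge would raise a conflict here; this grammar has none.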
58 Problem With LR(0) Parsing
- No lookahead
- Vulnerable to unnecessary conflicts
  » Shift/reduce conflicts (may reduce too soon in some cases)
  » Reduce/reduce conflicts
- Solutions:
  » SLR(1) parsing: reduce only when the next symbol can occur after the nonterminal from the production
  » LR(1) parsing: systematic lookahead
59 SLR(1) Parsing
- If a state contains A → α •
  » Reduce by A → α only if the next input symbol can follow A in some derivation
- Example grammar
    S → A $
    A → a
    A → a b
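60 LR(0) Parser
- Grammar: S → A $, A → a, A → a b
[Figure: the LR(0) DFA for this grammar, states s0 to s3. The start state s0 is { S → • A $, A → • a, A → • a b }. On A the parser reaches { S → A • $ }. On a it reaches { A → a •, A → a • b }, which in LR(0) has a shift/reduce conflict. From there, on b it reaches { A → a b • }.]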
61 Creating SLR parse tables
- For each state
  » A transition to another state on a terminal symbol is a shift to that state (shift to sn) (same as LR(0))
  » A transition to another state on a non-terminal is a goto to that state (goto sn) (same as LR(0))
  » If there is an item A → α • in the state, do a reduction with that production for all terminals that can follow A
62 Follow() sets For each non-terminal A, Follow(A) is the set of terminals that can come after A in some derivation
63 Constraints for Follow()
  1. $ ∈ Follow(S), where S is the start symbol
  2. If A → α B β is a production then First(β) ⊆ Follow(B)
  3. If A → α B is a production then Follow(A) ⊆ Follow(B)
  4. If A → α B β is a production and β derives ε then Follow(A) ⊆ Follow(B)
- Note: 3 is a special case of 4.
64 Algorithm for Follow
    for all nonterminals NT
        Follow(NT) = {}
    Follow(S) = { }    // {} since we add the rule S' → S $
    for all productions A → α B β and nonterminal B at RHS
        Follow(B) = Follow(B) ∪ First(β)
    while Follow sets keep changing
        for all productions A → α B β and nonterminal B
            if (β derives ε)
                Follow(B) = Follow(B) ∪ Follow(A)
65 Augmenting Example with Follow
- Example grammar for Follow
    S → A $
    A → a
    A → a b
- Follow(S) = { }
- Follow(A) = { $ }
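A Python sketch of the Follow computation for this grammar. It folds the two phases of the algorithm into one fixed-point loop and also computes First and nullability, so that it keeps working if ε-productions are added; the names first_of and compute_follow are assumptions of the sketch. As in the example, the end marker $ is an ordinary terminal supplied by the rule S → A $, so Follow(S) starts out and stays empty.

```python
GRAMMAR = [("S", ("A", "$")), ("A", ("a",)), ("A", ("a", "b"))]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of(symbols, first, nullable):
    """Return (First set of the symbol string, whether it can derive empty)."""
    out = set()
    for sym in symbols:
        if sym not in NONTERMINALS:
            out.add(sym)
            return out, False
        out |= first[sym]
        if sym not in nullable:
            return out, False
    return out, True


def compute_follow():
    first = {n: set() for n in NONTERMINALS}
    nullable = set()
    changed = True
    while changed:                       # fixed point for First and nullable
        changed = False
        for lhs, rhs in GRAMMAR:
            f, empty = first_of(rhs, first, nullable)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if empty and lhs not in nullable:
                nullable.add(lhs)
                changed = True
    follow = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:                       # fixed point for Follow
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    f, empty = first_of(rhs[i + 1:], first, nullable)
                    new = f | (follow[lhs] if empty else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

For the example grammar, `compute_follow()` gives an empty set for S and { $ } for A, so b is not in Follow(A).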
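66 SLR Eliminates Shift/Reduce Conflict
- Grammar: S → A $, A → a, A → a b
[Figure: the same DFA as on the LR(0) Parser slide, states s0 to s3.]
- In the state containing A → a • and A → a • b, SLR reduces by A → a only on terminals in Follow(A) = { $ }
- b ∉ Follow(A), so on b the parser shifts and the conflict disappears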
67 Basic Idea Behind LR(1)
- Split states in the LR(0) DFA based on lookahead
- Reduce based on item and lookahead
68 LR(1) Items
- An LR(1) item is a pair: [A → α • β, a]
  » A → α β is a production
  » a is a terminal (the lookahead terminal)
  » LR(1) means 1 lookahead terminal
- [A → α • β, a] describes a context of the parser:
  » We are trying to find an A followed by an a, and
  » We have α already on top of the stack
  » Thus we need to see next a prefix derived from β a
69 Note
- The symbol I was used before to separate the stack from the rest of the input
  » α I γ, where α is the stack and γ is the remaining string of terminals
- In items, • is used to mark a prefix of a production rhs: [A → α • β, a]
  » Here β might contain non-terminals as well
- In both cases the stack is on the left
70 Convention
- We add to our grammar a fresh new start symbol S and a production S → E
  » Where E is the old start symbol
- The initial parsing context contains: [S → • E, $]
  » Trying to find an S as a string derived from E$
  » The final stack is
71 LR(1) Items (Cont.)
- In a context containing [E → E + • ( E ), +]
  » If ( follows then we can perform a shift to a context containing [E → E + ( • E ), +]
- In a context containing [E → E + ( E ) •, +]
  » We can perform a reduction with E → E + ( E )
  » But only if a + follows
72 LR(1) Items (Cont.)
- Consider the item [E → E + ( • E ), +]
- We expect a string derived from E ) +
- There are two productions for E: E → int and E → E + ( E )
- We describe this by extending the context with two more items:
    [E → • int, )]
    [E → • E + ( E ), )]
73 The Closure Operation
- The operation of extending the context with items is called the closure operation
    Closure(Items) =
      repeat
        for each [A → α • B β, a] in Items
          for each production B → γ
            for each b ∈ First(β a)
              add [B → • γ, b] to Items
      until Items is unchanged
74 Constructing the Parsing DFA (1)
- Construct the start context: Closure({ [S → • E, $] })
    [S → • E, $]
    [E → • E+(E), $]
    [E → • int, $]
    [E → • E+(E), +]
    [E → • int, +]
- We abbreviate as:
    [S → • E, $]
    [E → • E+(E), $/+]
    [E → • int, $/+]
75 Constructing the Parsing DFA (2)
- A DFA state is a closed set of LR(1) items
- The start state is Closure({ [S → • E, $] })
- A state that contains [A → α •, b] is labeled with "reduce with A → α on b"
- And now the transitions …
76 The DFA Transitions
- A state State that contains [A → α • y β, b] has a transition labeled y to a state transition(State, y) that contains the item [A → α y • β, b]
  » y can be a terminal or a non-terminal
    transition(State, y)
      Items ← ∅
      for each [A → α • y β, b] ∈ State
        add [A → α y • β, b] to Items
      return Closure(Items)
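The LR(1) closure and transition operations can be sketched together in Python for the running grammar. Items are (lhs, rhs, dot, lookahead) tuples; that layout and every name below are assumptions of the sketch. To stay short, the First computation assumes the grammar has no ε-productions, which holds for S → E, E → E + ( E ) | int but not in general.

```python
GRAMMAR = [("S", ("E",)), ("E", ("E", "+", "(", "E", ")")), ("E", ("int",))]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """First sets by fixed point; assumes no production has an empty rhs."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


FIRST = first_sets()


def first_of(symbols):
    """First of a non-empty symbol string (no nullable symbols assumed)."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for _, rhs, dot, a in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for b in first_of(rhs[dot + 1:] + (a,)):      # b in First(beta a)
                    for lhs, body in GRAMMAR:
                        if lhs == rhs[dot] and (lhs, body, 0, b) not in items:
                            items.add((lhs, body, 0, b))
                            changed = True
    return frozenset(items)


def transition(state, y):
    """Move the dot over y in every item that allows it, then close."""
    moved = {(l, r, d + 1, a) for l, r, d, a in state if d < len(r) and r[d] == y}
    return closure(moved)


START = closure({("S", ("E",), 0, "$")})
```

Here START holds the five items of the start context, and `transition(START, "int")` is the state labeled "E → int on $, +".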
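77 Constructing the Parsing DFA. Example.
    State 0: [S → • E, $], [E → • E+(E), $/+], [E → • int, $/+]
    State 1 (from 0 on int): [E → int •, $/+]                              E → int on $, +
    State 2 (from 0 on E): [S → E •, $], [E → E • +(E), $/+]               accept on $
    State 3 (from 2 on +): [E → E+ • (E), $/+]
    State 4 (from 3 on ( ): [E → E+( • E), $/+], [E → • E+(E), )/+], [E → • int, )/+]
    State 5 (from 4 on int): [E → int •, )/+]                              E → int on ), +
    State 6 (from 4 on E): [E → E+(E • ), $/+], [E → E • +(E), )/+]
    and so on…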
78 LR Parsing Tables. Notes
- Parsing tables (i.e. the DFA) can be constructed automatically for a CFG
- But we still need to understand the construction to work with parser generators
  » E.g., they report errors in terms of sets of items
- What kind of errors can we expect?
79 Shift/Reduce Conflicts
- If a DFA state contains both [A → α • a β, b] and [B → γ •, a]
- Then on input "a" we could either
  » Shift into state [A → α a • β, b], or
  » Reduce with B → γ
- This is called a shift-reduce conflict
80 Shift/Reduce Conflicts
- Typically due to ambiguities in the grammar
- Classic example: the dangling else
    S → if E then S | if E then S else S | OTHER
- Will have a DFA state containing
    [S → if E then S •, else]
    [S → if E then S • else S, x]
- If else follows then we can shift or reduce
- Default (bison, CUP, etc.) is to shift
  » Default behavior is as needed in this case
81 More Shift/Reduce Conflicts
- Consider the ambiguous grammar E → E + E | E * E | int
- We will have the states containing
    { [E → E * • E, +], [E → • E + E, +], … }   which on E goes to   { [E → E * E •, +], [E → E • + E, +], … }
- Again we have a shift/reduce conflict on input +
  » We need to reduce (* binds more tightly than +)
  » Recall the solution: declare the precedence of * and +
82 More Shift/Reduce Conflicts
- In bison declare precedence and associativity:
    %left '+'
    %left '*'
- In CUP:
    precedence left ADD, SUB;
    precedence left MULT, DIV;
- Precedence of a rule = that of its last terminal
  » See the CUP/bison manual for ways to override this default
- Resolve a shift/reduce conflict with a shift if:
  » no precedence is declared for either the rule or the terminal
  » the input terminal has higher precedence than the rule
  » the precedences are the same and right associative
83 Using Precedence to Solve S/R Conflicts
- Back to our example:
    { [E → E * • E, +], [E → • E + E, +], … }   which on E goes to   { [E → E * E •, +], [E → E • + E, +], … }
- Will choose reduce because the precedence of rule E → E * E is higher than that of terminal +
84 Using Precedence to Solve S/R Conflicts
- Same grammar as before: E → E + E | E * E | int
- We will also have the states
    { [E → E + • E, *], [E → • E * E, *], … }   which on E goes to   { [E → E + E •, *], [E → E • * E, *], … }
- Now we also have a shift/reduce conflict on input *
  » We choose shift because * has a higher precedence than the rule E → E + E
85 Using Precedence to Solve S/R Conflicts
- The grammar: E → E + E | E - E | E * E | int
- We will also have the states
    { [E → E + • E, -], [E → • E - E, -], … }   which on E goes to   { [E → E + E •, -], [E → E • - E, -], … }
- Now we also have a shift/reduce conflict on input -
  » We choose reduce because E → E + E and - have the same precedence and +/- is left-associative
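The resolution rules and the three examples above can be checked with a small Python sketch. It only models the rules as stated on the slides; the table PRECEDENCE and the function resolve are made-up names, and real generators such as bison and CUP have further cases (for instance non-associative operators) that are not modeled here.

```python
# Declared precedence level and associativity per terminal; a larger number
# binds more tightly. Mirrors: left +, - declared before left *.
PRECEDENCE = {"+": (1, "left"), "-": (1, "left"), "*": (2, "left")}
NONTERMINALS = {"E", "S"}


def resolve(rule_rhs, token):
    """Return "shift" or "reduce" for a conflict between reducing by a rule
    with right-hand side `rule_rhs` and shifting `token`."""
    terminals = [sym for sym in rule_rhs if sym not in NONTERMINALS]
    # Precedence of a rule = that of its last terminal, if declared.
    rule = PRECEDENCE.get(terminals[-1]) if terminals else None
    tok = PRECEDENCE.get(token)
    if rule is None or tok is None:
        return "shift"                      # no precedence declared: default
    if tok[0] > rule[0]:
        return "shift"                      # token binds more tightly
    if tok[0] == rule[0] and tok[1] == "right":
        return "shift"                      # same level, right associative
    return "reduce"
```

With these declarations, E → E * E against + reduces, E → E + E against * shifts, and E → E + E against - reduces. The dangling else, where nothing is declared, falls through to the default shift.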
86 Using Precedence to Solve S/R Conflicts
- Back to our dangling else example
    [S → if E then S •, else]
    [S → if E then S • else S, x]
- Can eliminate the conflict by declaring else with higher precedence than then
  » Or just rely on the default shift action
- But this starts to look like "hacking the parser"
- Best to avoid overuse of precedence declarations, or you'll end up with unexpected parse trees
87 Reduce/Reduce Conflicts
- If a DFA state contains both [A → α •, a] and [B → β •, a]
  » Then on input "a" we don't know which production to reduce
- This is called a reduce/reduce conflict
88 Reduce/Reduce Conflicts
- Usually due to gross ambiguity in the grammar
- Example: a sequence of identifiers
    S → ε | id | id S
- There are two parse trees for the string id
    S → id
    S → id S → id
- How does this confuse the parser?
89 More on Reduce/Reduce Conflicts
- Consider the states
    { [S' → • S, $], [S → •, $], [S → • id, $], [S → • id S, $] }
  which on id goes to
    { [S → id •, $], [S → id • S, $], [S → •, $], [S → • id, $], [S → • id S, $] }
- Reduce/reduce conflict on input $
    S' → S → id
    S' → S → id S → id
- Better to rewrite the grammar: S → ε | id S
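Both kinds of conflict can be detected by inspecting a single state. A Python sketch, using (lhs, rhs, dot, lookahead) tuples for LR(1) items; the representation and the name find_conflicts are assumptions of this illustration.

```python
def find_conflicts(state, terminals):
    """Return a list of ("shift/reduce" | "reduce/reduce", terminal) pairs."""
    # Terminals that some item in the state can shift over.
    shifts = {rhs[dot] for _, rhs, dot, _ in state
              if dot < len(rhs) and rhs[dot] in terminals}
    # Lookahead -> set of productions with the dot at the end.
    reduces = {}
    for lhs, rhs, dot, lookahead in state:
        if dot == len(rhs):
            reduces.setdefault(lookahead, set()).add((lhs, rhs))
    conflicts = []
    for lookahead in sorted(reduces):
        if lookahead in shifts:
            conflicts.append(("shift/reduce", lookahead))
        if len(reduces[lookahead]) > 1:
            conflicts.append(("reduce/reduce", lookahead))
    return conflicts


# The dangling-else state and the state reached on id in the identifier example.
DANGLING_ELSE = {
    ("S", ("if", "E", "then", "S"), 4, "else"),
    ("S", ("if", "E", "then", "S", "else", "S"), 4, "x"),
}
AFTER_ID = {
    ("S", ("id",), 1, "$"),
    ("S", ("id", "S"), 1, "$"),
    ("S", (), 0, "$"),
    ("S", ("id",), 0, "$"),
    ("S", ("id", "S"), 0, "$"),
}
```

The first state reports a shift/reduce conflict on else; the second reports a reduce/reduce conflict on $, between S → id and S → ε.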
90 Using Parser Generators
- Parser generators construct the parsing DFA given a CFG
  » Use precedence declarations and default conventions to resolve conflicts
  » The parser algorithm is the same for all grammars (and is provided as a library function)
- But most parser generators do not construct the DFA as described before
  » Because the LR(1) parsing DFA has 1000s of states even for a simple language
91 LR(1) Parsing Tables are Big
- But many states are similar, e.g.
    state 1: [E → int •, $/+]     (E → int on $, +)
  and
    state 5: [E → int •, )/+]     (E → int on ), +)
- Idea: merge the DFA states whose items differ only in the lookahead tokens
  » We say that such states have the same core
- We obtain
    state 1': [E → int •, $/+/)]     (E → int on $, +, ))
92 The Core of a Set of LR Items
- Definition: the core of a set of LR items is the set of first components
  » Without the lookahead terminals
- Example: the core of { [X → α • β, b], [Y → γ • δ, d] } is { X → α • β, Y → γ • δ }
93 LALR States
- Consider for example the LR(1) states
    { [A → α •, a], [B → β •, c] }
    { [A → α •, b], [B → β •, d] }
- They have the same core and can be merged
- The merged state contains:
    { [A → α •, a/b], [B → β •, c/d] }
- These are called LALR(1) states
  » Stands for LookAhead LR
  » Typically 10 times fewer LALR(1) states than LR(1)
94 A LALR(1) DFA
- Repeat until all states have distinct cores
  » Choose two distinct states with the same core
  » Merge the states by creating a new one with the union of all the items
  » Point edges from predecessors to the new state
  » The new state points to all the previous successors
[Figure: a DFA over states A, B, C, D, E, F before merging, and the same DFA after B and E are merged into a single state BE.]
Conversion LR(1) to LALR(1). Example.
[Figure: the LR(1) DFA with states 0 to 11 (state labels: E → int on $, +; E → int on ), +; E → E + (E) on $, +; E → E + (E) on ), +; accept on $) and the LALR(1) DFA obtained from it. The merged states are 1,5; 3,8; 4,9; 6,10; 7,11, and states 0 and 2 are unchanged. Merged state labels: E → int on $, +, ); E → E + (E) on $, +, ); accept on $.]
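Merging by core can be sketched in Python. The sketch groups states by core in one pass instead of merging pairs repeatedly, which gives the same result; the names items, core and merge_lalr are assumptions of the illustration. The sample data covers only states 3, 4, 8 and 9 of the example DFA, with item sets written out from the LR(1) construction.

```python
from collections import defaultdict


def items(lhs, rhs, dot, lookaheads):
    """Expand the shorthand [lhs -> rhs with dot, a/b] into LR(1) items."""
    return {(lhs, tuple(rhs), dot, a) for a in lookaheads}


def core(state):
    """The items of a state with the lookaheads dropped."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_lalr(states, edges):
    """Merge states with equal cores; return (merged states, edges, old->new map)."""
    groups = defaultdict(list)
    for index, state in enumerate(states):
        groups[core(state)].append(index)
    merged, remap = [], {}
    for members in groups.values():
        for index in members:
            remap[index] = len(merged)
        merged.append(frozenset().union(*(states[i] for i in members)))
    new_edges = {(remap[s], x): remap[t] for (s, x), t in edges.items()}
    return merged, new_edges, remap


RHS = ("E", "+", "(", "E", ")")
INNER = items("E", RHS, 0, ")+") | items("E", ("int",), 0, ")+")
STATES = [
    frozenset(items("E", RHS, 2, "$+")),            # LR(1) state 3
    frozenset(items("E", RHS, 3, "$+") | INNER),    # LR(1) state 4
    frozenset(items("E", RHS, 2, ")+")),            # LR(1) state 8
    frozenset(items("E", RHS, 3, ")+") | INNER),    # LR(1) state 9
]
EDGES = {(0, "("): 1, (2, "("): 3}                  # 3 on ( to 4, 8 on ( to 9
```

`merge_lalr(STATES, EDGES)` leaves two states, corresponding to "3,8" and "4,9", joined by a single edge on (. The edges stay consistent because states with equal cores have successors with equal cores.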
96 The LALR Parser Can Have Conflicts
- Consider for example the LR(1) states
    { [A → α •, a], [B → β •, b] }
    { [A → α •, b], [B → β •, c] }
- And the merged LALR(1) state
    { [A → α •, a/b], [B → β •, b/c] }
- It has a new reduce-reduce conflict on b
- In practice such cases are rare
97 Practical method to construct LALR(1) machine
- Alternative method to construct the LALR(1) finite state machine
  » Rather than constructing the LR(1) machine and then merging states
  » Adopted by many LR parser generators
- Steps
  1. Construct the cores (i.e., a DFA for LR(0)/SLR(1))
  2. Compute lookaheads
     2.1. Create propagate links between LR(1) items
     2.2. Determine spontaneous lookaheads
     2.3. Propagate until convergence
- Inter-state: along a transition on X, [A → α • X β, L1] propagates to [A → α X • β, L2], which becomes [A → α X • β, L1 ∪ L2]
- Intra-state: [B → α • A β, L] gives [A → • γ, First(β) ∪ L] if β derives ε, and [A → • γ, First(β)] otherwise
98 Example: Input LR(0) DFA
- Grammar: S → A, A → ( A ), A → ( )
    s0: S → • A,  A → • ( A ),  A → • ( )
    s1: S → A •
    s2: A → ( • A ),  A → ( • ),  A → • ( A ),  A → • ( )
    s3: A → ( A • )
    s4: A → ( A ) •
    s5: A → ( ) •
- Transitions: s0 on A goes to s1, s0 on ( goes to s2, s2 on ( goes to s2, s2 on A goes to s3, s2 on ) goes to s5, s3 on ) goes to s4
99 Example: Inter-state Propagation Links
[Figure: the same DFA with S → • A in s0 seeded with lookahead $, and a propagation link along each transition from an item to the item with the dot moved over the transition symbol.]
100 Example: Intra-state Propagation Links
[Figure: the same DFA with links inside s0 and s2 from the items with the dot before A to the closure items A → • ( A ) and A → • ( ); two candidate links are crossed out.]
101 Example: Spontaneous lookaheads insertion
[Figure: the same DFA with the spontaneous lookahead ) attached to the closure items A → • ( A ) and A → • ( ) of s2.]
102 Example: lookaheads propagation
[Figure: the same DFA after the first propagation steps; S → A • in s1 has received $.]
103 Example: lookaheads propagation
[Figure: the same DFA after propagation has converged; the lookahead sets shown are built from $ and ), with the propagation flow marked along the links.]
104 LALR vs. LR Parsing
- LALR languages are not natural
  » They are an efficiency hack on LR languages
- Any reasonable programming language has a LALR(1) grammar
- LALR(1) has become a standard for programming languages and for parser generators
105 A Hierarchy of Grammar Classes From Andrew Appel, “Modern Compiler Implementation in Java”
106 Notes on Parsing
- Parsing
  » A solid foundation: context-free grammars
  » A simple parser: LL(1)
  » A more powerful parser: LR(1)
  » An efficiency hack: LALR(1)
  » LALR(1) parser generators
107 Supplement to LR Parsing Strange Reduce/Reduce Conflicts Due to LALR Conversion (from the bison manual)
108 Strange Reduce/Reduce Conflicts (skipped)
- Consider the grammar
    S → P R ,
    NL → N | N , NL
    P → T | NL : T
    R → T | N : T
    N → id
    T → id
- P: parameters specification
  » P is id | id+ : id
- R: result specification
  » R is id | id : id
- N: a parameter or result name
- T: a type name
- NL: a list of names
109 Strange Reduce/Reduce Conflicts (skipped)
- In P an id is a
  » N when followed by , or :
  » T when followed by id
- In R an id is a
  » N when followed by :
  » T when followed by ,
- Ex: id(N1) , id(N2) : id(T3) id(N4) : id(T5) ,
- This is an LR(1) grammar.
- But it is not LALR(1). Why?
  » For obscure reasons
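110 A Few LR(1) States
    State 1: [P → • T, id], [P → • NL : T, id], [NL → • N, :], [NL → • N , NL, :], [N → • id, :], [N → • id, ,], [T → • id, id]
    State 2: [R → • T, ,], [R → • N : T, ,], [T → • id, ,], [N → • id, :]
    State 3 (from state 1 on id): [T → id •, id], [N → id •, :], [N → id •, ,]
    State 4 (from state 2 on id): [T → id •, ,], [N → id •, :]
    LALR merge of states 3 and 4: [T → id •, id/,], [N → id •, :/,]
- LALR reduce/reduce conflict on ","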
111 What Happened?
- Two distinct states were confused because they have the same core
- Fix: add dummy productions to distinguish the two confused states
- E.g., add R → id bogus
  » bogus is a terminal not used by the lexer
  » This production will never be used during parsing
  » But it distinguishes R from P
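112 A Few LR(1) States After Fix
    State 1: [P → • T, id], [P → • NL : T, id], [NL → • N, :], [NL → • N , NL, :], [N → • id, :], [N → • id, ,], [T → • id, id]
    State 2: [R → • T, ,], [R → • N : T, ,], [R → • id bogus, ,], [T → • id, ,], [N → • id, :]
    State 3 (from state 1 on id): [T → id •, id], [N → id •, :], [N → id •, ,]
    State 4 (from state 2 on id): [T → id •, ,], [N → id •, :], [R → id • bogus, ,]
- Different cores, so no LALR merging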