1
Bottom-up Parsing
2
Top-down versus Bottom-up Parsing
Top-down parsing
  Recursive descent parsing
  LL(k) parsing
  Top-down, following a leftmost derivation
    Expands from the start symbol (top) to gradually derive the input string
    Can use a parsing table to decide which production to use next
  The power is limited
    Many grammars are not LL(k)
    Left recursion elimination and left factoring can help make some grammars LL(k), but after rewriting, the grammar can be very hard to comprehend
  Space efficient
  Easy to build the parse tree
3
Top-down versus Bottom-up Parsing
Bottom-up parsing
  Also known as shift-reduce parsing
    LR family
    Precedence parsing
  Shift: shift input symbols onto the stack, waiting until a matching production can be determined
  Reduce: once a matching production is determined, reduce
  Follows the rightmost derivation in reverse
    Parses from the bottom (the leaves of the parse tree) and works up to the start symbol
  Due to the added “shift”
    More powerful
    Can handle left-recursive grammars and grammars with left factors
    Less space efficient
4
Basic Concepts How to build a predictive bottom-up parser?
Sentential form
  For a grammar G with start symbol S
  A string α is a sentential form of G if S ⇒* α
    α may contain terminals and nonterminals
    If α is in T*, then α is a sentence of L(G)
  Left sentential form: a sentential form that occurs in the leftmost derivation of some sentence
  Right sentential form: a sentential form that occurs in the rightmost derivation of some sentence
5
Basic Concepts
Example of the sentential form
  Grammar: E → E * E | E + E | ( E ) | id
  Leftmost derivation: E ⇒ E + E ⇒ E * E + E ⇒ id * E + E ⇒ id * id + E ⇒ id * id + E * E ⇒ id * id + id * E ⇒ id * id + id * id
    All the derived strings are left sentential forms
  Rightmost derivation: E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ E * E + id * id ⇒ E * id + id * id ⇒ id * id + id * id
    All the derived strings are right sentential forms
Another example
  Grammar: S → AB, A → CD, B → EF
  Leftmost: S ⇒ AB ⇒ CDB
  Rightmost: S ⇒ AB ⇒ AEF
6
Basic Concepts
Handle
  Given a rightmost derivation S ⇒ γ1 ⇒ γ2 ⇒ … ⇒ γk (= αAw) ⇒ γk+1 (= αβw) ⇒ … ⇒ γn
  γi, for all i, are right sentential forms
  From γk to γk+1, the production A → β is used
  A handle of γk+1 (= αβw) is the production A → β and the position of β in γk+1
  Informally, β is the handle
7
Basic Concepts
Theorem
  If G is unambiguous, then every right-sentential form has a unique handle
Proof
  G is unambiguous ⇒ the rightmost derivation is unique
  Consider a right-sentential form γk+1
  A unique production A → β is applied to γk, and it is applied at a unique position
  ⇒ a unique handle in γk+1
But
  During the derivation, the production rule is unique
  During the reduction, can we uniquely determine the production that was used during the derivation?
8
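Basic Concepts
Viable prefix
  A prefix of a right-sentential form that does not pass the end of the handle
  E.g., for a right-sentential form αβw with handle β: αβ, or any prefix of αβ
Example
  Grammar: E → E * E | E + E | ( E ) | id
  E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ E * E + id * id ⇒ E * id + id * id ⇒ id * id + id * id
  [Figure: the derivation above, read in one direction as derivation and in the other as parsing (reduction), with the handles, the viable prefixes, and the remaining input w marked]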
9
Meaning of LR
L: process the input from left to right
R: use the rightmost derivation, but in reverse order
  E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ E * E + id * id ⇒ E * id + id * id ⇒ id * id + id * id
[Figure: parse tree of id * id + id * id for this derivation]
10
Bottom-up Parsing
Traverse the rightmost derivation backwards
  If reduction is done arbitrarily
    It may not reduce to the start symbol
    Need backtracking
  By following the path of the rightmost derivation
    All the reductions are guaranteed to be “correct”
    Guaranteed to lead to the start symbol without backtracking
  That is: if it is always possible to correctly find the handle
How to find the handle for reduction for each right sentential form
  Use a stack to keep track of the viable prefix
  The prefix of the handle will always be at the top of the stack
11
Bottom-up Parsing
Consider a right-sentential form αβw
  Where A → β and β is a handle (let β = β’w’)
  To the right of β there is always a subsentence (in T*)
  The handle β (= β’w’) comes from the top of the stack (β’) and the current input substring (w’)
  αβ’ is a viable prefix
[Figure: the parser with its parsing table; the stack holds $ α β’ and the input tokens are w’ w]
The knowledge for recognizing β (or β’) is built into the table
12
Bottom-up Parsing
Example grammar
  X → aAB | …
  Y → aAC | …
Input aw
  Cannot know what aw should be reduced to
  Shift a onto the stack, reduce some part of w to A, shift A onto the stack, … until something is clear
Shift adds power to parsing
How to do this systematically?
13
Bottom-up Parsing
Shift-reduce operations in bottom-up parsing
  Shift the input onto the stack
    Wait for the current handle to complete or to appear
    Or wait for a handle that may complete later
  Reduce
    Once the handle is completely in the stack, reduce
  The operations are determined by the parsing table
Parsing table includes
  Action table
    Determines the action: shift or reduce
    Shift (the current handle is not completely in the stack, or not yet in the stack)
    Reduce (the current handle is completely in the stack)
  Goto table
    Determines which state to go to next
14
Parsing Table
Idea
  Build a finite automaton based on the grammar
  Follow the automaton to construct the parsing tables
Characteristic finite state automaton (CFSA)
  Is the basis for building the parsing table
  But the automaton is not a part of the parsing table
States of the automaton
  Each state is represented by a set of LR(0) items
  To keep track of what has already been seen (already in the stack)
    In other words, to keep track of the viable prefix
  To track the possible productions that may be used for reduction
State transitions
  Fired by grammar symbols (terminals or nonterminals)
15
Build the Automata
LR(0) item of a grammar G
  Is a production of G with a distinguished position •
  The position indicates how much of the handle has already been seen (is in the stack)
  For the production S → a B S, the items are
    S → • a B S
    S → a • B S
    S → a B • S
    S → a B S •
  Left of • are the parts of the handle that have already been seen
  When • reaches the end of the handle ⇒ reduction
  For the production S → ε, the single item is S → •
16
Building the Automata
Closure function Closure(I)
  I is a set of items for a grammar G
  Every item in I is in Closure(I)
  If A → α • B β is in Closure(I) and B → γ is a production in G
    Then add B → • γ to Closure(I), if it is not already there
  Apply the rule until no more new items can be added
Meaning
  When α is in the stack and B is expected next
  One of the B-production rules may be used to reduce the input to B
    It may not be a one-step reduction though
17
Building the Automata
Goto function Goto(I, X)
  X is a grammar symbol
  If A → α • X β is in I, then A → α X • β is in Goto(I, X)
  Let J denote the set constructed by this step
  All items in Closure(J) are in Goto(I, X)
Meaning
  If I is the set of valid items for some viable prefix γ
  Then Goto(I, X) is the set of valid items for the viable prefix γX
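A small Python sketch of Goto, under the same assumptions as before: items are encoded as `(head, body, dot)`, and `GRAMMAR`, `closure` and `goto` are illustrative names, not anything defined in the slides. The grammar is the augmented example grammar of the later slides.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}


def closure(items):
    """LR(0) closure: add B -> . gamma whenever the dot stands before B."""
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


I0 = closure({("S'", ("S",), 0)})
I2 = goto(I0, "E")   # S -> E .   and   E -> E . + T
I5 = goto(I0, "(")   # T -> ( . E ) plus the closure items for E and T
```

An empty result means there is no transition on that symbol from the given state.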
18
Building the Automata
Augmented grammar
  G is the grammar and S is the start symbol
  Construct G’ by adding the production S’ → S to G
  S’ is the new start symbol
  E.g.: G: S → α | β     G’: S’ → S, S → α | β
Meaning
  The start symbol may have several production rules and may be used in other nonterminals’ production rules
  Add S’ → S to force the start symbol to have a single production
  When S’ → S • is seen, it is clear that parsing is done
19
Building the Automata
Given a grammar G
Step 1: augment G
Step 2: initial state
  Construct the valid item set “I” of State 0 (the initial state)
  Add S’ → • S into I
    All expansions have to start from here
  Compute Closure(I) as the complete valid item set of state 0
    All possible expansions S can lead into
Step 3: from state I, for every grammar symbol X
  Construct J = Goto(I, X)
  Compute Closure(J)
  Create the new state with the corresponding Goto transition
    Only if the valid item set is non-empty and does not exist yet
Repeat Step 3 until no new states can be derived
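Steps 1 to 3 can be sketched as a loop over a growing list of states. The code below is a minimal illustration with assumed names (`GRAMMAR`, `SYMBOLS`, `build_automaton`); it uses the example grammar from the following slides, and the fixed symbol order in `SYMBOLS` is chosen so that the state numbers come out the same as in that example.

```python
GRAMMAR = {
    "S'": [("S",)],                       # Step 1: augmented grammar
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}
# Order in which transitions are tried (an assumption that fixes the numbering).
SYMBOLS = ["S", "E", "T", "id", "(", "+", ")"]


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def build_automaton():
    """Return (states, transitions) of the LR(0) automaton."""
    states = [closure({("S'", ("S",), 0)})]    # Step 2: state 0
    transitions = {}
    i = 0
    while i < len(states):                     # Step 3, repeated until no new state
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if not target:
                continue                       # empty item set: no transition
            if target not in states:
                states.append(target)          # new state
            transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


states, transitions = build_automaton()
print(len(states), "states")
```

Item sets are frozensets, so "does not exist yet" is a plain membership test on the list of states.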
20
Building the Automata -- Example
Grammar G
  S → E
  E → E + T | T
  T → id | ( E )
Step 1: augment G
  S’ → S
  S → E
  E → E + T | T
  T → id | ( E )
Step 2: construct Closure(I0) for State 0
  First add into I0: S’ → • S
  Compute Closure(I0)
    S’ → • S
    S → • E
    E → • E + T
    E → • T
    T → • id
    T → • ( E )
  S’ → • S: expect to see S next
  S → • E: S won’t just appear; we may have to see E first and reduce it to S using this rule
21
Building the Automata -- Example
Step 3
  I0: S’ → • S, S → • E, E → • E + T, E → • T, T → • id, T → • ( E )
  I1: Goto(I0, S) = { S’ → S • }
    No new items to be added to Closure(I1)
  I2: Goto(I0, E) = { S → E •, E → E • + T }
    No new items to be added to Closure(I2)
  I3: Goto(I0, T) = { E → T • }
    No new items to be added to Closure(I3)
  I4: Goto(I0, id) = { T → id • }
    No new items to be added to Closure(I4)
About I2
  When E is moved onto the stack (after a reduction), these two are the possible handles
  S → E • implies a reduction is to be done; it should be done if the input is in Follow(S)
  E → E • + T implies + is expected to be the next input
22
Building the Automata -- Example
Step 3
  I0: S’ → • S, S → • E, E → • E + T, E → • T, T → • id, T → • ( E )
  I5: Goto(I0, “(”) = { T → ( • E ) }
    Closure(I5) adds E → • E + T, E → • T, T → • id, T → • ( E )
    After seeing (, we expect E next
    E could be reduced from other E-production rules
    So, put the E-productions in the set
  No more moves from I0
  No possible moves from I1
  I6: Goto(I2, +) = { E → E + • T }
    Closure(I6) adds T → • id, T → • ( E )
  No possible moves from I3 and I4
23
Building the Automata -- Example
Step 3
  I7: Goto(I5, E) = { T → ( E • ), E → E • + T }
    No new items to be added to Closure(I7)
  Goto(I5, T) = I3
  Goto(I5, id) = I4
  Goto(I5, “(”) = I5
  No more moves from I5
  I8: Goto(I6, T) = { E → E + T • }
    No new items to be added to Closure(I8)
  Goto(I6, id) = I4
  Goto(I6, “(”) = I5
24
Building the Automata -- Example
Step 3
  I9: Goto(I7, “)”) = { T → ( E ) • }
    No new items to be added to Closure(I9)
  Goto(I7, +) = I6
  No possible moves from I8 and I9
25
Building the Automata -- Example
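States
  I0: S’ → • S, S → • E, E → • E + T, E → • T, T → • id, T → • ( E )
  I1: S’ → S •
  I2: S → E •, E → E • + T
  I3: E → T •
  I4: T → id •
  I5: T → ( • E ), E → • E + T, E → • T, T → • id, T → • ( E )
  I6: E → E + • T, T → • id, T → • ( E )
  I7: T → ( E • ), E → E • + T
  I8: E → E + T •
  I9: T → ( E ) •
Transitions
  From I0: on S to I1, on E to I2, on T to I3, on id to I4, on ( to I5
  From I2: on + to I6
  From I5: on E to I7, on T to I3, on id to I4, on ( to I5
  From I6: on T to I8, on id to I4, on ( to I5
  From I7: on + to I6, on ) to I9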
26
Building the Automata -- Example
See how parsing works directly on the automaton
Follow(S) = {$}
Follow(E) = {+, ), $}
Follow(T) = {+, ), $}

Stack              Input        Action
0                  id + id $    s4
0 id 4             + id $       T → id, Goto[0,T]=3
0 T 3              + id $       E → T, Goto[0,E]=2
0 E 2              + id $       s6
0 E 2 + 6          id $         s4
0 E 2 + 6 id 4     $            T → id, Goto[6,T]=8
0 E 2 + 6 T 8      $            E → E + T, Goto[0,E]=2
0 E 2              $            S → E, Goto[0,S]=1
0 S 1              $            accept

[Figure: the automaton with states I0 to I9, as on the previous slide]
27
Building the Parsing Table
Action[M, N]
  M states
  N tokens
  Actions
    Shift i: shift the input token onto the stack and go to state i
    Reduce i: reduce by the i-th production
    Accept
    Error
Goto[M, L]
  L nonterminals
  Goto[i, j] = x
    Move to state Sx
28
Building the Action Table
If state Ii has item A → α • a β, and Goto(Ii, a) = Ij
  The next symbol in the input is a
  Then Action[Ii, a] = shift Ij
  Meaning: shift “a” onto the stack and move to state Ij
    Need to wait for the handle to appear or to complete
If state Ii has item A → α •
  Then Action[Ii, b] = reduce using A → α
    For all b in Follow(A)
  Meaning: the entire handle is in the stack, need to reduce
    Need to wait to see Follow(A) to know that the handle is ready
E.g. a state with S → E • and E → E • + T
  The current input can be either in Follow(S) (reduce) or + (shift)
29
Building the Action Table
If state Ii has S’ → S •
  Then Action[Ii, $] = accept
Current state
  The action to be taken depends on the current state
    In LL, it depends on the current nonterminal on the top of the stack
    In LR, the nonterminal is not known until the reduction is done
  Who keeps track of the current state?
    The stack
    Need to push the state onto the stack as well
    The stack includes the viable prefix and the corresponding state for each symbol in the viable prefix
30
Building the Goto Table
If Goto(Ii, A) = Ij
  Then Goto[i, A] = j
Meaning
  When a reduction X → α has taken place
  The nonterminal X is added to the stack, replacing α
  What should the state be after adding X?
  This information is kept in the Goto table
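Put together, the Action and Goto rules give the following table construction. This is a hedged sketch rather than the slides' own procedure: `build_automaton`, `build_tables`, the tuple encoding of actions, and the hard-coded `FOLLOW` sets (copied from the example slides) are all assumptions of the illustration.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}
SYMBOLS = ["S", "E", "T", "id", "(", "+", ")"]
# Follow sets as given on the example slides (not computed here).
FOLLOW = {"S": {"$"}, "E": {"+", ")", "$"}, "T": {"+", ")", "$"}}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def build_automaton():
    states, transitions, i = [closure({("S'", ("S",), 0)})], {}, 0
    while i < len(states):
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if target:
                if target not in states:
                    states.append(target)
                transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


def build_tables():
    """Fill the Action and Goto tables; raise on a shift-reduce or reduce-reduce conflict."""
    states, transitions = build_automaton()
    action, goto_table = {}, {}

    def set_action(key, value):
        if key in action and action[key] != value:
            raise ValueError(f"conflict at {key}: {action[key]} vs {value}")
        action[key] = value

    for (i, symbol), j in transitions.items():
        if symbol in GRAMMAR:
            goto_table[(i, symbol)] = j            # transition on a nonterminal
        else:
            set_action((i, symbol), ("shift", j))  # transition on a terminal
    for i, state in enumerate(states):
        for head, body, dot in state:
            if dot < len(body):
                continue                           # handle not complete yet
            if head == "S'":
                set_action((i, "$"), ("accept",))
            else:
                for terminal in FOLLOW[head]:
                    set_action((i, terminal), ("reduce", head, body))
    return action, goto_table


ACTION, GOTO = build_tables()
```

For a grammar that is not SLR(1), `set_action` would raise on the first cell that receives two different entries, which is exactly where the conflicts discussed later show up.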
31
Building the Parsing Table -- Example
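Follow(S) = {$}
Follow(E) = {+, ), $}
Follow(T) = {+, ), $}

       Action table                                      Goto table
State  +            id    (     )            $           S    E    T
0                   s4    s5                             1    2    3
1                                            Acc
2      s6                                    S → E
3      E → T                    E → T        E → T
4      T → id                   T → id       T → id
5                   s4    s5                                  7    3
6                   s4    s5                                       8
7      s6                       s9
8      E → E+T                  E → E+T      E → E+T
9      T → (E)                  T → (E)      T → (E)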
32
LR Parsing Algorithm
Elements
  Parser, parsing tables, stack, input
Initialization
  Append $ to the end of the input
  Push state 0 onto the stack
  The top of the stack is always a state
    It is the current state of parsing
33
LR Parsing Algorithm
Steps
  If Action[x, a] = shift y
    x is the current state, on the top of the stack
    a is the input token
    Then shift a onto the stack and put y on top of the stack
  If Action[x, a] = reduce A → α
    Note that a is in Follow(A)
    Then
      Pop the handle α and all the states corresponding to α off the stack
      y is the state on the top of the stack after popping
      Check the Goto table: if Goto[y, A] = z
      Push A and then z onto the stack
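The steps above can be sketched as a table-driven loop. The tables below are the SLR(1) tables of the example grammar S → E, E → E + T | T, T → id | ( E ), written out by hand; the encoding (`"s"`, `"r"`, `"acc"`, production numbers 1 to 5) and the function name `parse` are assumptions of this illustration. To keep it short, the stack holds only states; the grammar symbols the slides interleave with them are implied by the states.

```python
# Production number -> (head, length of body)
PRODUCTIONS = {
    1: ("S", 1),  # S -> E
    2: ("E", 3),  # E -> E + T
    3: ("E", 1),  # E -> T
    4: ("T", 1),  # T -> id
    5: ("T", 3),  # T -> ( E )
}
ACTION = {
    (0, "id"): ("s", 4), (0, "("): ("s", 5),
    (1, "$"): ("acc",),
    (2, "+"): ("s", 6), (2, "$"): ("r", 1),
    (5, "id"): ("s", 4), (5, "("): ("s", 5),
    (6, "id"): ("s", 4), (6, "("): ("s", 5),
    (7, "+"): ("s", 6), (7, ")"): ("s", 9),
}
# States 3, 4, 8, 9 reduce on every terminal in Follow(E) = Follow(T) = {+, ), $}.
for state, production in [(3, 3), (4, 4), (8, 2), (9, 5)]:
    for terminal in ("+", ")", "$"):
        ACTION[(state, terminal)] = ("r", production)
GOTO = {(0, "S"): 1, (0, "E"): 2, (0, "T"): 3, (5, "E"): 7, (5, "T"): 3, (6, "T"): 8}


def parse(tokens):
    """Return the production numbers in the order they are used for reduction."""
    tokens = list(tokens) + ["$"]      # append $ to the end of the input
    stack = [0]                        # push state 0
    reductions = []
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:              # empty table cell: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if entry[0] == "s":            # shift: push the new state, consume the token
            stack.append(entry[1])
            pos += 1
        elif entry[0] == "r":          # reduce: pop one state per body symbol
            head, length = PRODUCTIONS[entry[1]]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(entry[1])
        else:                          # accept
            return reductions


print(parse(["id", "+", "id"]))
```

The returned list is the rightmost derivation read backwards: for `id + id` the reductions are T → id, E → T, T → id, E → E + T, S → E.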
34
LR Parsing - Example

Stack              Input        Action
0                  id + id $    s4
0 id 4             + id $       T → id, Goto[0,T]=3
0 T 3              + id $       E → T, Goto[0,E]=2
0 E 2              + id $       s6
0 E 2 + 6          id $         s4
0 E 2 + 6 id 4     $            T → id, Goto[6,T]=8
0 E 2 + 6 T 8      $            E → E + T, Goto[0,E]=2
0 E 2              $            S → E, Goto[0,S]=1
0 S 1              $            accept

Rightmost derivation: S ⇒ E ⇒ E + T ⇒ E + id ⇒ T + id ⇒ id + id
Reverse trace back: reduce the leftmost input first
35
LR Parsing -- Another Example
Grammar
  S → (S) | A
  A → Aa | a
  Follow(S) = {$, )}
  Follow(A) = {$, ), a}

States
  0: S → • (S), S → • A, A → • Aa, A → • a
  1: S → ( • S ), S → • (S), S → • A, A → • Aa, A → • a
  2: S → A •, A → A • a
  3: S → ( S • )
  4: S → ( S ) •
  5: A → a •
  6: A → Aa •

State  (     )           a          $           S    A
0      s1                s5                     ?    2
1      s1                s5                     3    2
2            S → A       s6         S → A
3            s4
4            S → (S)                S → (S)
5            A → a       A → a      A → a
6            A → Aa      A → Aa     A → Aa

Stack          Input      Action
0              ((aa))$    s1
0(1            (aa))$     s1
0(1(1          aa))$      s5
0(1(1a5        a))$       A → a, Goto[1,A]
0(1(1A2        a))$       s6
0(1(1A2a6      ))$        A → Aa, Goto[1,A]
0(1(1A2        ))$        S → A, Goto[1,S]
0(1(1S3        ))$        s4
0(1(1S3)4      )$         S → (S), Goto[1,S]
0(1S3          )$         s4
0(1S3)4        $          S → (S), Goto[0,S]
0S             $          ? accept

Unclear accepting state
36
LR Parsing -- Another Example
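Grammar
  S → (S) | AB
  A → Aa | a
  B → Bb | b
  Follow(S) = {$, )}
  Follow(A) = {a, b}
  Follow(B) = {$, ), b}

States
  0: S → • (S), S → • AB, A → • Aa, A → • a
  1: S → ( • S ), S → • (S), S → • AB, A → • Aa, A → • a
  2: S → ( S • )
  3: S → ( S ) •
  4: S → A • B, A → A • a, B → • Bb, B → • b
  5: S → AB •, B → B • b
  6: B → Bb •
  7: A → Aa •
  8: B → b •
  9: A → a •

State  (     )           a          b          $           S    A    B
0      s1                s9                                ?    4
1      s1                s9                                2    4
2            s3
3            S → (S)                           S → (S)
4                        s7         s8                               5
5            S → AB                 s6         S → AB
6            B → Bb                 B → Bb     B → Bb
7                        A → Aa     A → Aa
8            B → b                  B → b      B → b
9                        A → a      A → a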
37
LR Parsing -- Looking into the Automata
Input: ((aabbb))$
Rightmost derivation: S ⇒ (S) ⇒ ((S)) ⇒ ((AB)) ⇒ ((ABb)) ⇒ ((ABbb)) ⇒ ((Abbb)) ⇒ ((Aabbb)) ⇒ ((aabbb))
From the start symbol, we expect to see the handle (S) or AB. If either appears, it can be reduced to S. For the given input, we get (, so we shift it onto the stack and hope to get S).
If we want to get S), the input has to be gradually reduced to S. So we leave the first partial handle ( in the stack and look for the next handle. We expand all S productions in the hope of getting something reduced to S. The position is also marked on the production rule: S → ( • S ).
For the given input, we get another (, so we shift it onto the stack and hope to get the second S). Now there are two partial handles in the stack.
Next we have ‘a’ in the input. So we expect AB to be reduced to S, and we look for the next handle AB. In order to get A, all of A’s productions are put into this state. This leads to the transition to the states related to ‘a’.
[Figure: parse tree of ((aabbb)) and the LR(0) automaton with states 0 to 9 for this grammar]
38
LR Parsing -- Looking into the Automata
Input: ((aabbb))$
Stack: 0
[Figure: parse tree of ((aabbb)) and the automaton with states 0 to 9, as on the previous slide]
39
LR Parsing -- Looking into the Automata
With state 0 and input (, we look for the handle (S);
Shift ( onto the stack, go to state 1;
In state 1 we have S → ( • S ): ( is in the stack, and we need S and ) in order to have the complete handle;
Input: (aabbb))$
Stack: 0(1
[Figure: the automaton with states 0 to 9, as on the previous slides]
40
LR Parsing -- Looking into the Automata
With state 1 and input (, we look for another handle (S);
Shift ( onto the stack, go to state 1, waiting for S to become available;
The stack now has 2 partial handles, and we look for yet another handle;
Input: aabbb))$
Stack: 0(1(1
[Figure: the automaton with states 0 to 9, as on the previous slides]
41
LR Parsing -- Looking into the Automata
With state 1 and input a, the next handle can only be AB;
But A is not available (neither is B);
Shift a onto the stack, go to state 9, waiting for A to become available;
The stack now has 2 partial handles and one complete handle;
Input: abbb))$
Stack: 0(1(1a9
[Figure: the automaton with states 0 to 9, as on the previous slides]
42
LR Parsing -- Looking into the Automata
With state 9 and input a, we can perform a reduction using A → a;
The handle a in the stack is replaced by A;
Now A is the third partial handle in the stack;
We may have a handle AB or Aa;
Input: abbb))$
Stack: 0(1(1A
[Figure: the automaton with states 0 to 9, as on the previous slides]
43
LR Parsing -- Looking into the Automata
Input: abbb))$
Stack: 0(1(1A4
We go back to state 1 (the state before the reduction). There is an A in the stack; from state 1, with A, go to state 4;
With state 4 and input a, we know the next handle is Aa;
……
[Figure: the automaton with states 0 to 9, as on the previous slides]
44
LR Parsing -- The RM Derivation Backwards
Input: ((aabbb))$

Stack            Input         Action
0                ((aabbb))$    s1
0(1              (aabbb))$     s1
0(1(1            aabbb))$      s9
0(1(1a9          abbb))$       A → a
0(1(1A4          abbb))$       s7
0(1(1A4a7        bbb))$        A → Aa
0(1(1A4          bbb))$        s8
0(1(1A4b8        bb))$         B → b
0(1(1A4B5        bb))$         s6
0(1(1A4B5b6      b))$          B → Bb
0(1(1A4B5        b))$          s6
0(1(1A4B5b6      ))$           B → Bb
0(1(1A4B5        ))$           S → AB
0(1(1S2          ))$           s3
0(1(1S2)3        )$            S → (S)
0(1S2            )$            s3
0(1S2)3          $             S → (S)
0S               $             ? accept
45
LR Parsing -- The RM Derivation Backwards
Reductions in the order the parser performs them on ((aabbb))$

Stack            Input      Action      Order
0(1(1a9          abbb))$    A → a       (1)
0(1(1A4a7        bbb))$     A → Aa      (2)
0(1(1A4b8        bb))$      B → b       (3)
0(1(1A4B5b6      b))$       B → Bb      (4)
0(1(1A4B5b6      ))$        B → Bb      (5)
0(1(1A4B5        ))$        S → AB      (6)
0(1(1S2)3        )$         S → (S)     (7)
0(1S2)3          $          S → (S)     (8)
0S               $          ? accept

Traverse the rightmost derivation backwards
  S ⇒(8) (S) ⇒(7) ((S)) ⇒(6) ((AB)) ⇒(5) ((ABb)) ⇒(4) ((ABbb)) ⇒(3) ((Abbb)) ⇒(2) ((Aabbb)) ⇒(1) ((aabbb))
Bottom-up parsing
  [Figure: parse tree of ((aabbb)) with its internal nodes labeled (1) to (8) in the order they are built]
46
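SLR Parsing
LR
  L: input scanned from the left
  R: traverse the rightmost derivation path (in reverse)
LR(0) items and SLR(1)
  The LR parser we discussed is built from LR(0) items
    0 in LR(0): no lookahead symbol is carried with the item (will be clear later)
  Because it checks the next input symbol against Follow(A) before reducing by A → α, it is the SLR(1) parser
    Simple LR
    1 in SLR(1): one lookahead symbol from the input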
47
SLR and LL
Example: A → Aa | a
  Follow(A) = {a, $}
Not LL
  Left-recursive grammar
But is SLR(1)
  The first a is reduced to A
  Each remaining a is reduced together with the already generated A (Aa)
  LR is reduction based: on seeing the first ‘a’, A → a is the only choice; once there is an A, reduce Aa by A → Aa

Stack      Input    Action
0          aaa$     s3
0a3        aa$      A → a, Goto[0,A]=1
0A1        aa$      s2
0A1a2      a$       A → Aa
…          $

State  a          $          A
0      s3                    1
1      s2
2      A → Aa     A → Aa
3      A → a      A → a

Notes on the table: unclear accepting state; incorrect state transition
48
SLR and LL
Example: A → aA | a
  Follow(A) = {$}
Not LL(1)
  Productions for A have left factors
But is SLR(1)
  All ‘a’s are shifted onto the stack
  The final ‘a’, on seeing $, is reduced to ‘A’
  The ‘a’s in the stack are then reduced together with the newly generated ‘A’s
Potential shift-reduce conflict
  shift: expects to see ‘a’
  reduce: Follow(A) only has $
  ⇒ no problem

Stack        Input    Action
0            aaa$     s1
0a1          aa$      s1
0a1a1        a$       s1
0a1a1a1      $        A → a, Goto[1,A]=2
0a1a1A2      $        A → aA, Goto[1,A]=2
0a1A2        $        same as above
0A?          $

State  a      $          A
0      s1
1      s1     A → a      2
2             A → aA

Unclear accepting state
  The input string is actually acceptable
  If [0,$] is accept, the parser will accept
49
SLR and LL
Example: A → aAa | a
  Follow(A) = {$, a}
Not LL(1)
  Productions for A have left factors
Not SLR(1)
  Has a shift-reduce conflict

States
  I0: A → • aAa, A → • a
  I1 (from I0 on a, and from I1 on a): A → a • Aa, A → a •, A → • aAa, A → • a
  I2 (from I1 on A): A → aA • a
  I3 (from I2 on a): A → aAa •

Stack    Input    Action
0        aaa$     s1
0a1      aa$      A → a
0A?

Shift-reduce conflict in I1 on input a
  shift: expects to see ‘a’
  reduce: Follow(A) has $, a
  ⇒ conflict
50
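SLR and LL
Example
  S → Ax | By
  A → aA | a
  B → aB | a
  Follow(S) = {$}
  Follow(A) = {x}
  Follow(B) = {y}

States
  I0: S → • Ax, S → • By, A → • aA, A → • a, B → • aB, B → • a
  I1 (from I0 on A): S → A • x
  I2 (from I0 on B): S → B • y
  I3 (from I0 on a, and from I3 on a): A → a • A, A → a •, B → a • B, B → a •, A → • aA, A → • a, B → • aB, B → • a
  I4 (from I1 on x): S → Ax •
  I5 (from I2 on y): S → By •
  I6 (from I3 on A): A → aA •
  I7 (from I3 on B): B → aB •

Stack        Input     Action
0            aaax$     s3
0a3          aax$      s3
0a3a3        ax$       s3
0a3a3a3      x$        A → a, Goto[3,A]=6
0a3a3A6      x$        A → aA, Goto[3,A]=6
0a3A6        x$        same as above
0A1          x$        s4
0A1x4        $         S → Ax
0S           $

Unclear accepting state
  S does not appear on the right-hand side of any production
  So, there is no Goto info for S
Potential reduce-reduce conflict (A → a • and B → a • in I3)
  But Follow(A) and Follow(B) are different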
51
SLR and LL
Continue with the example
  S → Ax | By
  A → aA | a
  B → aB | a
Not LL(k)
  For S → Ax and S → By, First(Ax) and First(By) are both ‘a’
  Even with large k, First_k of both will have “aa…a”
Is SLR(1)
  No problem with A → aA and A → a: they lead to different states
  No problem with A → a and B → a: they just go back to the same state
  During parsing, ‘a’ is continuously shifted onto the stack
  When x or y appears, reduce
  By that time, it is clear which rule to use for the reduction
    Follow(A) = {x}: if seeing x, reduce with A → a
    Follow(B) = {y}: if seeing y, reduce with B → a
52
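SLR and LL
Example
  S → Ax | By
  A → Aa | a
  B → Ba | a
  Follow(S) = {$}
  Follow(A) = {x, a}
  Follow(B) = {y, a}

States
  I0: S → • Ax, S → • By, A → • Aa, A → • a, B → • Ba, B → • a
  I1 (from I0 on A): S → A • x, A → A • a
  I2 (from I0 on B): S → B • y, B → B • a
  I3 (from I0 on a): A → a •, B → a •
  I4 (from I1 on x): S → Ax •
  I5 (from I1 on a): A → Aa •
  I6 (from I2 on y): S → By •
  I7 (from I2 on a): B → Ba •

Stack    Input     Action
0        aaax$     s3
0a3      aax$      reduction: multiple productions apply

Reduce-reduce conflict in I3
  Both A and B have ‘a’ in their Follow sets
  The decision has to be made too soon, right at the first ‘a’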
53
SLR and LL
Continue with the example
  S → Ax | By
  A → Aa | a
  B → Ba | a
Not LL
  For S → Ax and S → By, First(Ax) and First(By) are both ‘a’
  Even with large k, First_k of both A and B will have “aa…a” (A and B are both in S’s productions)
Not SLR either
  Not SLR(k), for any k
  Even with large k, Follow_k of both A and B will have “aa…a”
54
SLR and LL
Example
  S → (X | [Y
  X → A) | B]
  Y → A] | B)
  A → ε
  B → ε
  Follow(S) = {$}, Follow(X) = {$}, Follow(Y) = {$}, Follow(A) = {], )}, Follow(B) = {], )}
Not SLR(1)
  Reduce-reduce conflict: both A and B have ] and ) in their Follow sets
  After (: S → ( • X, X → • A), X → • B], A → •, B → •
  After [: S → [ • Y, Y → • A], Y → • B), A → •, B → •
Is LL(1)
  The rules of each nonterminal have different first symbols
  A and B are different nonterminals
  First(A) = {ε}, First(B) = {ε}, First(X) = {), ]}, First(Y) = {), ]}, First(S) = {(, [}

      (           [           )           ]           $
S     S → (X      S → [Y
X                             X → A)      X → B]
Y                             Y → B)      Y → A]
A                             A → ε       A → ε
B                             B → ε       B → ε
55
SLR Parser Family
Consider grammar G
  S → A b c | B b d
  A → a
  B → a
  Follow(S) = {$}, Follow(A) = {b}, Follow(B) = {b}
Reduce-reduce conflict under SLR(1)
  Initial state: S’ → • S, S → • A b c, S → • B b d, A → • a, B → • a
  After a: A → a •, B → a •
  b is in the Follow sets of both A and B
G is SLR(2)
  Looking ahead two symbols resolves the conflict
  Follow_2(A) = {bc}, Follow_2(B) = {bd}
  Action[4, bc] = reduce A → a
  Action[4, bd] = reduce B → a
56
SLR Parser Family
Consider grammar G
  S → A b^(k–1) c | B b^(k–1) d
  A → a
  B → a
G is SLR(k) but not SLR(k–1)
  Need to look ahead k symbols in the Follow set
  Follow_(k–1)(A) = {b^(k–1)}, Follow_(k–1)(B) = {b^(k–1)}
  Follow_k(A) = {b^(k–1) c}, Follow_k(B) = {b^(k–1) d}
57
SLR and LR
Consider grammar G
  S → L = R
  S → R
  R → L
  L → * R
  L → id
58
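SLR and LR
Grammar
  S → L = R
  S → R
  R → L
  L → * R
  L → id
  Follow(S) = {$}, Follow(L) = {=, $}, Follow(R) = {=, $}

States
  I0: S’ → • S, S → • L = R, S → • R, L → • * R, L → • id, R → • L
  I1: S’ → S •
  I2: S → L • = R, R → L •
  I3: S → L = • R, R → • L, L → • * R, L → • id
  I4: S → L = R •
  I5: R → L •
  I6: S → R •
  I7: L → * • R, R → • L, L → • * R, L → • id
  I8: L → * R •
  I9: L → id •

State  *     id    =                $             S    L    R
0      s7    s9                                   1    2    6
1                                   Acc
2                  s3 / R → L       R → L
3      s7    s9                                        5    4
4                                   S → L = R
5                  R → L            R → L
6                                   S → R
7      s7    s9                                        5    8
8                  L → * R          L → * R
9                  L → id           L → id

Shift-reduce conflict in I2 on input =
  The shift rule S → L • = R expects =
  = is in R’s Follow set, so R → L • asks for a reduction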
59
SLR and LR
Grammar G has a shift-reduce conflict
Looking further ahead in the Follow set does not help
  Follow_k(L) = {$, =id$, =*id$, =**id$, …, =*…*id$, =*…*id, =*…*}
  Follow_k(R) = Follow_k(L)
  G is not SLR(k)
  Further lookahead will not help with distinguishing Follow_k(R) from Follow_k(L)
60
SLR and LR
What is the problem?
  The lookahead information is too crude
  Need to distinguish
    If L → * R comes from S ⇒ L = R ⇒ * R = R, then Follow(R) = {=, $}
    If L → * R comes from S ⇒ R ⇒ L ⇒ * R, then Follow(R) = {$}
Solution
  Carry the specific lookahead information with the LR(0) item
  The item becomes an LR(1) item
  Use the lookahead symbol(s) with the item to identify the correct reduction rule to apply
Canonical LR Parsing
  The parsing scheme based on LR(1) items
61
LR(1) Item
LR(1) item of a grammar G: [A → α • β, a]
  A → α • β is an LR(0) item
  a is the lookahead symbol (a terminal in Follow(A))
  [A → α • β, a] implies S ⇒* δ A η, where a is in First(η$)
  I.e., “a” follows A in a right sentential form
When [A → α •, a] is in the state
  Reduction (same as SLR)
  But only if “a” is seen in the input string
Next, need to define the Closure and Goto functions for LR(1) items
62
Building the Automata
Changes to Closure(I)
  LR(0): if A → α • B β is in Closure(I) and B → γ is a production in G, then add B → • γ to Closure(I)
  LR(1): if [A → α • B β, a] is in Closure(I) and B → γ is a production in G, then add [B → • γ, c] to Closure(I) for all c ∈ First(βa)
Changes to Goto(I, X)
  LR(0): if A → α • X β is in I, then A → α X • β is in Goto(I, X)
  LR(1): if [A → α • X β, a] is in I, then [A → α X • β, a] is in Goto(I, X)
  Simply carry the lookahead symbol over
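The LR(1) versions of Closure and Goto can be sketched as follows. This is an illustration under stated assumptions: items are encoded as `(head, body, dot, lookahead)`, the names `first_sets`, `first_of`, `closure` and `goto` are invented for the example, and First(βa) is computed with a shortcut that is only valid for grammars without ε-productions, such as the S → L = R | R grammar used here.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first_sets():
    """First sets of the nonterminals; assumes no production derives epsilon."""
    first = {nonterminal: set() for nonterminal in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                lead = body[0]
                new = first[lead] if lead in GRAMMAR else {lead}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


FIRST = first_sets()


def first_of(symbols, lookahead):
    """First(beta a): the lookahead itself if beta is empty, else First of beta's first symbol."""
    if not symbols:
        return {lookahead}
    lead = symbols[0]
    return FIRST[lead] if lead in GRAMMAR else {lead}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot, lookahead = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            # [A -> alpha . B beta, a]: add [B -> . gamma, c] for every c in First(beta a)
            for c in first_of(body[dot + 1:], lookahead):
                for production in GRAMMAR[body[dot]]:
                    item = (body[dot], production, 0, c)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    # The lookahead is simply carried over when the dot moves.
    moved = {(h, b, d + 1, a) for h, b, d, a in items if d < len(b) and b[d] == symbol}
    return closure(moved)


I0 = closure({("S'", ("S",), 0, "$")})
I_L = goto(I0, "L")   # [S -> L . = R, $] and [R -> L ., $]
```

In `I_L` the completed item R → L • carries only the lookahead $, so the reduction is not attempted on =, which is how the shift-reduce conflict of the SLR table disappears.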
63
Building the Action Table
If state Ii has item [A → α • a β, b]
  Add the shift action to the Action table (same as before)
If state Ii has [S’ → S •, $]
  Add accept to the Action table (same as before)
If state Ii has item [A → α •, b]
  Action[Ii, b] = reduce using A → α
  Not for all terminals in Follow(A)
  Only for the terminals in the lookahead part of the item
Goto table construction is the same as before
64
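LR Parsing
LR(1) item sets for S → L = R | R, R → L, L → * R | id
  Initial state: [S’ → • S, $], [S → • L = R, $], [S → • R, $], [L → • * R, =], [L → • id, =], [R → • L, $], [L → • * R, $], [L → • id, $]
  After S: [S’ → S •, $]
  After L: [S → L • = R, $], [R → L •, $]
    No longer has a conflict
    $: reduce with R → L
    =: shift
  After R: [S → R •, $]
  After L =: [S → L = • R, $], [R → • L, $], [L → • * R, $], [L → • id, $]
  After L = R: [S → L = R •, $]
  States reached after =, with lookahead $
    [R → L •, $]
    [L → * • R, $], [R → • L, $], [L → • * R, $], [L → • id, $]
    [L → * R •, $]
    [L → id •, $]
  States reached from the initial state, with lookahead = or $
    [R → L •, =/$]
    [L → * • R, =/$], [R → • L, =/$], [L → • * R, =/$], [L → • id, =/$]
    [L → * R •, =/$]
    [L → id •, =/$]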
65
LR Parsing
The parsing algorithm is the same for the LR family
  Only the table is different
LR is more powerful
  An SLR(1) grammar is always LR(1), but not vice versa
  LR(1): use one lookahead symbol in the item
  LR(k): use k lookahead symbols in the item
LR(2) grammar
  S → A b c | B b d
  A → a
  B → a
  Also SLR(2)
  Reduce-reduce conflict in LR(1)
  But no conflict in LR(2)
    I0: [S’ → • S, $], [S → • A b c, $], [S → • B b d, $], [A → • a, bc], [B → • a, bd]
    I4 (from I0 on a): [A → a •, bc], [B → a •, bd]
66
SLR and LR
Example
  S → (X | [Y
  X → A) | B]
  Y → A] | B)
  A → ε
  B → ε
  Follow(S) = {$}, Follow(X) = {$}, Follow(Y) = {$}, Follow(A) = {], )}, Follow(B) = {], )}
Not SLR(1)
Is LR(1)
  No reduce-reduce conflict any more
  LR(1) items, written as item followed by lookahead
    Initial state: S → • (X, $     S → • [Y, $
    After (: S → ( • X, $     X → • A), $     X → • B], $     A → •, )     B → •, ]
    After [: S → [ • Y, $     Y → • A], $     Y → • B), $     A → •, ]     B → •, )
67
LR Parsing
LR is more powerful than SLR
  But LR has a larger number of states
    Consumes more space
  A common programming language has hundreds of states and hundreds of terminals
    Approximately 100 × 100 table size
Can the number of states in LR be reduced?
  Some states in LR are duplicated and can be merged
LALR
  LookAhead LR
  Try to merge states in the LR(1) automaton
  When the core items in two LR(1) states are the same ⇒ merge them
68
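LALR Parsing
LR(1) item sets for S → L = R | R, R → L, L → * R | id
  Initial state: [S’ → • S, $], [S → • L = R, $], [S → • R, $], [L → • * R, =], [L → • id, =], [R → • L, $], [L → • * R, $], [L → • id, $]
  After S: [S’ → S •, $]
  After L: [S → L • = R, $], [R → L •, $]
    Still no problem
    The follow set is carried along with the item
    Which resolves the problem of the unwanted follow symbol
  After R: [S → R •, $]
  After L =: [S → L = • R, $], [R → • L, $], [L → • * R, $], [L → • id, $]
  After L = R: [S → L = R •, $]
  Block 1 (lookahead $)
    [R → L •, $]
    [L → * • R, $], [R → • L, $], [L → • * R, $], [L → • id, $]
    [L → * R •, $]
    [L → id •, $]
  Block 2 (lookahead = or $)
    [R → L •, =/$]
    [L → * • R, =/$], [R → • L, =/$], [L → • * R, =/$], [L → • id, =/$]
    [L → * R •, =/$]
    [L → id •, =/$]
  Each pair of states in these two blocks has the same core items
    They can be fully merged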
69
LALR Parsing
Can merging states introduce conflicts?
  Cannot introduce a shift-reduce conflict
  May introduce a reduce-reduce conflict
Cannot introduce a shift-reduce conflict?
  Assume two LR states I1, I2 are merged into an LALR state I
  If there is a conflict, I must have items [A → α •, a] and [B → β • a γ, b]
  In fact, α and β have to be the same, otherwise they won’t come to the same state
  If the two items came from different states, those states would have different core items and could not be merged into I
  So I1 has [A → α •, a] and [B → β • b γ, c], and I2 has [A → α •, d] and [B → β • b γ, e]
  To have a conflict, we need b = d or b = a, so the shift-reduce conflict was already there in I1 or I2!
70
LALR Parsing
Introducing a reduce-reduce conflict?
Grammar
  S → aAd | bBd | bAe | aBe
  A → c
  B → c
LR(1) states
  I0: [S’ → • S, $], [S → • aAd, $], [S → • bBd, $], [S → • bAe, $], [S → • aBe, $]
  I1: [S’ → S •, $]
  I2: [S → a • Ad, $], [S → a • Be, $], [A → • c, d], [B → • c, e]
  I3: [S → aA • d, $]
  I4: [S → aB • e, $]
  I5: [A → c •, d], [B → c •, e]
  I6: [S → b • Ae, $], [S → b • Bd, $], [A → • c, e], [B → • c, d]
  I7: [S → bA • e, $]
  I8: [S → bB • d, $]
  I9: [A → c •, e], [B → c •, d]
Merge I5 and I9
  I5-9: [A → c •, d/e], [B → c •, d/e]
  Reduce-reduce conflict
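Merging by core and checking the result for reduce-reduce conflicts can be sketched as below. The two states are I5 and I9 of this example written by hand; the item encoding `(head, body, dot, lookahead)` and the helper names `core`, `merge_by_core` and `reduce_reduce_conflicts` are assumptions of the illustration.

```python
from collections import defaultdict

# LR(1) states I5 and I9 of the example, as sets of (head, body, dot, lookahead).
I5 = {("A", ("c",), 1, "d"), ("B", ("c",), 1, "e")}
I9 = {("A", ("c",), 1, "e"), ("B", ("c",), 1, "d")}


def core(state):
    """The core of a state: its items with the lookaheads dropped."""
    return frozenset((head, body, dot) for head, body, dot, _ in state)


def merge_by_core(states):
    """Union the LR(1) states that share the same core (the LALR merge)."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= state
    return list(merged.values())


def reduce_reduce_conflicts(state):
    """Map each lookahead to the completed productions competing on it."""
    by_lookahead = defaultdict(set)
    for head, body, dot, lookahead in state:
        if dot == len(body):                       # completed item: a reduction
            by_lookahead[lookahead].add((head, body))
    return {a: prods for a, prods in by_lookahead.items() if len(prods) > 1}


merged = merge_by_core([I5, I9])
print(reduce_reduce_conflicts(I5))          # no conflict before merging
print(reduce_reduce_conflicts(merged[0]))   # conflicts on d and e after merging
```

Neither original state has two reductions on the same lookahead; the merged state has A → c and B → c competing on both d and e.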
71
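LALR Parsing
Another LALR example
  S → CC
  C → cC
  C → d
  First(C) = {c, d}, First(S) = {c, d}
  Follow(S) = {$}, Follow(C) = {c, d, $}
Is this grammar ambiguous?
  No. Every sentence has two d’s, one in the middle, separating the substrings generated by the two C’s
LR(1) states
  I0: [S’ → • S, $], [S → • CC, $], [C → • cC, c/d], [C → • d, c/d]
  I1: [S’ → S •, $]
  I2: [S → C • C, $], [C → • cC, $], [C → • d, $]
  I3: [C → c • C, c/d], [C → • cC, c/d], [C → • d, c/d]
  I4: [C → cC •, c/d]
  I5: [C → d •, c/d]
  I6: [S → CC •, $]
  I7: [C → c • C, $], [C → • cC, $], [C → • d, $]
  I8: [C → cC •, $]
  I9: [C → d •, $]
LALR table after merging I3 with I7, I4 with I8, and I5 with I9

State  c          d          $          S    C
0      s3         s5                    1    2
1                            Acc
2      s3         s5                         6
3      s3         s5                         4
4      C → cC     C → cC     C → cC
5      C → d      C → d      C → d
6                            S → CC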
72
LALR Parsing
Delay error detection?
  S → CC, C → cC, C → d
  Parse the string ccd$
LR stack
  0c3c3d5, seeing $
  In I5, reduce using C → d only if seeing c or d, not $
  ⇒ error
LR(1) states
  I0: [S’ → • S, $], [S → • CC, $], [C → • cC, c/d], [C → • d, c/d]
  I1: [S’ → S •, $]
  I2: [S → C • C, $], [C → • cC, $], [C → • d, $]
  I3: [C → c • C, c/d], [C → • cC, c/d], [C → • d, c/d]
  I4: [C → cC •, c/d]
  I5: [C → d •, c/d]
  I6: [S → CC •, $]
  I7: [C → c • C, $], [C → • cC, $], [C → • d, $]
  I8: [C → cC •, $]
  I9: [C → d •, $]
73
LALR Parsing
Delay error detection?
LALR stack
  0c3c3d5, seeing $
    Reduce using C → d, goto 4 (0c3c3C4)
  0c3c3C4, seeing $
    Reduce by C → cC, goto 4 (0c3C4)
  0c3C4, seeing $
    Reduce by C → cC, goto 2 (0C2)
  0C2, seeing $
    Error: only c, d, C are allowed here
LALR states
  I0: [S’ → • S, $], [S → • CC, $], [C → • cC, c/d], [C → • d, c/d]
  I1: [S’ → S •, $]
  I2: [S → C • C, $], [C → • cC, $], [C → • d, $]
  I3: [C → c • C, c/d/$], [C → • cC, c/d/$], [C → • d, c/d/$]
  I4: [C → cC •, c/d/$]
  I5: [C → d •, c/d/$]
  I6: [S → CC •, $]
74
LALR Parsing
LALR
  Can also be constructed using the SLR procedure
  But with lookahead symbols added
SLR, LR, LALR
  LR is the most powerful and SLR is the least powerful
  LALR(1) is the most commonly used
    Grammars for most practical programming languages are LALR(1)
    Has the same number of states as SLR(1)
75
LL and LR
If a grammar is LL, it must be LR
A grammar is LL
  Consider each nonterminal: the First sets of its productions should be non-overlapping
  If the First sets of some productions have ε in them, then consider the Follow sets for those productions
If a grammar is not LR, then it must not be LL
  It has either a reduce-reduce conflict or a shift-reduce conflict
  Proof that with these conflicts the grammar cannot be LL (next page)
Note
  We don’t know whether a language always has an LL and/or an LR grammar
76
LL and LR If a grammar G is LL, it must be LR Reduce-reduce conflict
Cannot have [A , a] and [B , a], . They won’t come to one state If a grammar G is LL, it must be LR Reduce-reduce conflict At least has [A , a] and [B , a] in one state [A , a] and [B , a] come to the same state, then one of the following must be true X A A and X B B and A and B are together in some previous state If , two productions of X have the same first set If = , X has two productions with overlapping first sets either First(A) and First(B) both have First() in them or First() = and Follow(A) and Follow(B) both have ‘a’ in them X X11, X1 X22, X2 X33, …, Xi AA, Xj BB and A , B are together in some previous state WLOG, i > j, Xj’s productions have overlapping first sets G cannot be LL Can’t have * X2 3X33, 3 - won’t get to Xi AA * X1 2 X22, 2 X1 AA, X2 BB - X1 AA won’t be here
77
LL and LR If a grammar G is LL, it must be LR Shift-reduce conflict
At least has [A , a] and [B a, b] in one state X AA and X BB and A and B a are together in some previous state X X1, X1 X22, …, Xn–1 Xn3, Xi AA, Xj B B and A , B a are together in some previous state G cannot be LL
78
Grammar Class Hierarchy
79
Bottom-up Parsing -- Summary
Read textbook Sections
Bottom-up Parsing
  Handle and viable prefix
  SLR parsing
    SLR(1), built from LR(0) items
    SLR(k)
  Canonical LR Parsing
    LR(1)
    LR(k)
  LALR