Published by Marilynn Grant. Modified over 9 years ago.
1
2016-3-16 Bottom Up Parsing PITT CS 1622
2
Bottom Up Parsing
Also known as shift-reduce parsing
More powerful than top-down parsing
  Does not need left-factored grammars
  Can handle left recursion
Attempts to construct the parse tree from an input string, beginning at the leaves and working up to the root
Process of reducing strings to a non-terminal (shift-reduce)
Uses a parse stack
  Contains symbols already parsed
  Symbols are "shifted" until the top of the stack matches the RHS of a production
  The match is "reduced" to the non-terminal on the LHS
  Eventually everything is reduced to the start symbol
3
Example grammar
  Z → b M b
  M → ( L | a
  L → M a ) | )
Considering the string w = b ( a a ) b
Rightmost derivation:
  Z ⇒ b M b ⇒ b ( L b ⇒ b ( M a ) b ⇒ b ( a a ) b
Bottom-up parsing finds handles and reduces them, so the sentential forms appear in the reverse order:
  b ( a a ) b
  b ( M a ) b
  b ( L b
  b M b
  Z
[Figure: parse tree with root Z, interior nodes M, L, M, and leaves b ( a a ) b]
4
Grammar
  E → E + E
  E → E * E
  E → ( E )
  E → id
Use # to indicate where we are in the string:
  id1 # + id2 * id3
  E # + id2 * id3
  E + # id2 * id3
  E + E # * id3
  E + E * id3 #
  E + E * E #
  E + E #
  E

  Sentential form     Handle   Production
  id1 + id2 * id3     id1      E → id
  E + id2 * id3       id2      E → id
  E + E * id3         id3      E → id
  E + E * E           E * E    E → E * E
  E + E               E + E    E → E + E
  E
5
Handle
Intuition: reduce only if it leads to the start symbol
A handle has to match the RHS of a production and, when reduced to the LHS of that rule, lead back along a rightmost derivation
Definition: Let αβw be a sentential form where
  α is an arbitrary string of symbols
  X → β is a production
  w is a string of terminals
Then β at α is a handle of αβw if S ⇒* αXw ⇒ αβw by a rightmost derivation
Handles formalize the intuition (reduce β to X), but the definition does not say how to find a handle
6
Issues
Locate the handle in a right sentential form
Decide which production to reduce it to (which of the matching RHSs)
[Figure: parse tree whose nodes are numbered 1 to 7 in the order a rightmost derivation expands them]
Never have to go into the middle of the string: working left to right, the part to the right of the handle always contains only terminals
Rightmost derivation in reverse: 7 6 5 4 3 2 1
7
Consider our usual grammar: the problem is when to reduce
  E → T + E | T
  T → int * T | int | ( E )
Consider the string: int * int + int

  Sentential form      Production
  int * int + int      T → int
  int * T + int        T → int * T
  T + int              T → int
  T + T                E → T
  T + E                E → T + E
  E

[Figure: parse tree for int * int + int with interior nodes E, T]
8
Viable Prefix
Definition: α is a viable prefix if
  there is a w such that αw is a right sentential form
  α # w is a configuration of a shift-reduce parser
Example configurations:
  b ( a # a ) b
  b ( M # a ) b
  b ( L # b
  b M # b
  Z #
Alternatively, a prefix of a rightmost-derived sentential form is viable if it does not extend past the right end of the handle
9
Properties of Viable Prefix
A prefix is viable because it can be extended by adding terminals to form a valid (rightmost-derived) sentential form
As long as the parser has viable prefixes on the stack, no parsing error has been detected
Types of bottom-up parsers
  Simple precedence
  Operator precedence
  LR family
10
Stack Implementation
[Figure: parser driver connected to a stack, a table, and the input string terminated by $]
Operations
  1. Shift: shift the input symbol onto the stack
  2. Reduce: the RHS of a production (the "handle") is at the top of the stack; decide which non-terminal to reduce it to
  3. Accept: success
  4. Error
11
Grammar
  Z → b M b
  M → ( L | a
  L → M a ) | )
String: b ( a a ) b $

  Stack           Input            Action
  $               b ( a a ) b $    shift
  $ b             ( a a ) b $      shift
  $ b (           a a ) b $        shift
  $ b ( a         a ) b $          reduce M → a
  $ b ( M         a ) b $          shift
  $ b ( M a       ) b $            shift
  $ b ( M a )     b $              reduce L → M a )
  $ b ( L         b $              reduce M → ( L
  $ b M           b $              shift
  $ b M b         $                reduce Z → b M b
  $ Z             $                accept
12
Ambiguous Grammars
Conflicts arise with ambiguous grammars
Ambiguous grammars generate conflicts, but so do other types of grammars
Example: consider the ambiguous grammar
  E → E * E | E + E | ( E ) | int

Choice 1: reduce at E * E # + int

  Sentential form      Action
  int * int + int      shift
  …                    …
  E * E # + int        reduce E → E * E
  E # + int            shift
  E + # int            shift
  E + int #            reduce E → int
  E + E #              reduce E → E + E
  E #

Choice 2: shift at E * E # + int

  Sentential form      Action
  int * int + int      shift
  …                    …
  E * E # + int        shift
  E * E + # int        shift
  E * E + int #        reduce E → int
  E * E + E #          reduce E → E + E
  E * E #              reduce E → E * E
  E #
13
Ambiguity
In the first step shown, we can either shift or reduce by E → E * E
  The choice is determined by the precedence of + and *
  The same problem arises with the associativity of * and +
Can always rewrite ambiguous grammars of this sort to encode precedence and associativity in the grammar
  Sometimes this results in convoluted grammars
  Tools have other means to encode precedence and associativity
But we must get rid of conflicts!
We know what a handle is, but it is not clear how to detect it
14
Properties of Bottom Up Parsing
Handles always appear at the top of the stack
  Never in the middle of the stack
  This justifies the use of a stack in shift-reduce parsing
General shift-reduce strategy
  If there is no handle on the stack, shift
  If there is a handle, reduce to the non-terminal
Conflicts
  If it is legal to either shift or reduce, there is a shift-reduce conflict
  If it is legal to reduce by two or more productions, there is a reduce-reduce conflict
15
LR Parsers
LR family of parsers: LR(k)
  L: left-to-right scan of the input
  R: rightmost derivation in reverse
  k: number of symbols of look ahead
Attractive because
  1. LR(k) is powerful: it handles virtually all language constructs
  2. Efficient
  3. LL(k) ⊂ LR(k)
  4. LR parsers can detect an error as soon as it is possible to do so
  5. Parsers can be generated automatically (YACC, Bison)
16
LR and LL Parsers
LR parser: each reduction needed for the parse is detected on the basis of
  Left context
  Reducible phrase
  k terminals of look ahead
LL parser: each expansion is chosen on the basis of
  Left context
  First k symbols of what the right-hand side derives (the phrase combined with what is to the right of the phrase)
17
Types of LR Parsers
SLR: simple LR
  Easiest to implement
  Not as powerful
Canonical LR
  Most powerful
  Expensive to implement
LALR: look ahead LR
  In between the previous two in power and overhead
The overall parsing algorithm is the same; only the table is different
18
Parsing Overview
Each grammar symbol X_i on the stack is paired with a state S_i (the state says it all)
Each state summarizes the information contained in the stack below it: what has been seen so far
Configuration: (S0 X1 S1 X2 S2 … Xm Sm # ai ai+1 … an $)
X1 X2 … Xm ai … an is a right sentential form
The state at the top of the stack and the current input symbol index into the parsing table to determine whether to shift or reduce
[Figure: LR parsing program reading the input string a1 a2 … $, with a stack of symbol/state pairs from S0 up to Xm Sm and a table with ACTION and GOTO parts]
19
Action
[Figure: stack and input before and after each operation. Shift pushes the next input symbol X_m+1 with state S_m+1. Reduce by X_z → X_m X_m+1 (1) pops X_m S_m and X_m+1 S_m+1, then (2) pushes X_z. GOTO then pushes the new state on top of X_z]
20
Parse Table: Action and Goto
Action[S_m, a_i] is one of
  shift S, where S is a state
  reduce by a grammar production
  accept
  error
Goto[S_m, grammar symbol]
  change to another state
21
Actions
Assume the configuration (S0 X1 S1 X2 S2 … Xm Sm # ai ai+1 … an $), with right sentential form X1 X2 … Xm ai … an
  1. Action[S_m, a_i] = shift S: shift the input and go to state S
       (S0 X1 S1 X2 S2 … Xm Sm ai S # ai+1 … $)
  2. Action[S_m, a_i] = reduce A → β, with |β| = r: pop 2r symbols
       (S0 X1 S1 … Xm-r Sm-r A S # ai ai+1 … $) where S = goto[S_m-r, A]
       Output is generated after the reduce (for example, the tree)
  3. Action[S_m, a_i] = accept: parsing is complete
  4. Action[S_m, a_i] = error: report and stop
22
Grammar
  1. S → E
  2. E → E + T
  3. E → T
  4. T → id
  5. T → ( E )

  Non-terminal   Follow
  S              $
  E              + ) $
  T              + ) $

ACTION
  State   +     id    (     )     $
  S0            S3    S4
  S1      S7                      accept
  S2      r3                r3    r3
  S3      r4                r4    r4
  S4            S3    S4
  S5      S7                S6
  S6      r5                r5    r5
  S7            S3    S4
  S8      r2                r2    r2

GOTO
  State   E     T     S
  S0      1     2
  S4      5     2
  S7            8
  (all other GOTO entries are empty)
23
Power Added to DFA
Return to the state where the non-terminal was predicted and continue; do this by counting states
Example

  Stack                    Input          Action
  S0 id S3                 + id + id $    r4, goto[S0, T]
  S0 T S2                  + id + id $    r3, goto[S0, E]
  …                        …              …
  S0 E S1 + S7 T S8        + id $         r2, goto[S0, E]
  …
24
Two Operations
Bottom-up parsing uses only two operations, shift and reduce, on a configuration written as stack . input
  Shift:   ExT . abc   becomes   ExTa . bc
  Reduce:  ExTa . bc   becomes   ExF . bc
25
LR Parsers
Can tell the handle by looking at the stack top [grammar symbol, state] and k input symbols, using a finite state automaton
In practice, k <= 1
How to construct the LR parse table from a grammar
  First construct an SLR parser
  Canonical LR and LALR augment the basic SLR technique
Two phases to construct the table
  Build a deterministic finite automaton (DFA) to go from state to state
  Build the table from the DFA
Each state answers: how do we know from the grammar where we are in the parse? It records how much of each production has already been seen
26
Notation of an LR(0) item
An item is a production with a distinguished position on the right-hand side; the position indicates how much of the production has already been seen
Example: S → a B S is a production
Items for the production:
  S → . a B S
  S → a . B S
  S → a B . S
  S → a B S .
These are called LR(0) items
Basic idea: construct a DFA that recognizes the viable prefixes; group items into sets, and each set is a state of the SLR parser
27
Construction of LR(0) items
1. Create the augmented grammar G'
     G:  S → α | β
     G': S' → S
         S → α | β
   What else is needed?
     A → c . d E : indicate a new state by consuming symbol d; this needs a goto function
     A → c d . E : what are all the possible things to see, that is, all possible derivations from E? Add the strings derivable from E; this needs a closure function
     A → c d E . : reduce to A and go to another state
2. Compute the functions closure and goto; they determine the action and goto parts of the parsing table
     closure: essentially defines what is expected
     goto: moves from one state to another by consuming a symbol
28
Closure
Closure(I), where I is a set of items, forms a state
Let N be a non-terminal
  If the distinguished point is in front of N, add each production for N with the distinguished point at the beginning of its RHS
  A → α . B β is in I: we expect to see a string derived from B
  B → . γ is added to the closure, where B → γ is a production
  Apply the rule until nothing more is added
Example grammar:
  S → E
  E → E + T
  E → T
  T → id | ( E )
Assume I = { S → . E }
Closure(I) = { S → . E, E → . E + T, E → . T, T → . id, T → . ( E ) }
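The closure rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of an item as a (lhs, rhs, dot) triple and the names GRAMMAR, closure, and show are hypothetical choices, not part of the original slides.

```python
# Grammar from the slide: S -> E, E -> E + T | T, T -> id | ( E )
# (dictionary layout is an assumption made for this sketch)
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}

def closure(items):
    """Return the LR(0) closure of a set of (lhs, rhs, dot) items."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        # If the dot is in front of a non-terminal N, add N -> . gamma
        # for every production N -> gamma.
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def show(item):
    """Render an item with '.' marking the distinguished position."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"

s0 = closure({("S", ("E",), 0)})
for line in sorted(show(i) for i in s0):
    print(line)
```

Running it on I = { S → . E } should list the same five items as the slide's Closure(I).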
29
Items
Two kinds of items
Kernel items
  Include S' → . S and all items whose point is not at the left end
Non-kernel items
  Items with the point at the left end
  Can always be added again, so they need not be kept around (saves storage)
  They are obtained as the closure of the kernel items
30
Goto Operation
Goto(I, X), where X is a grammar symbol and I is a set of items
A shift action moves from one state to another by absorbing a single symbol
The successor state contains each item with the distinguished point advanced by one grammar symbol
  If A → α . X β is in I, then the closure of A → α X . β is added to goto(I, X)
Sets correspond to viable prefixes
  If γ is a viable prefix for I, then γX is a viable prefix for goto(I, X)
Example
  Goto(I, ( ) = closure( T → ( . E ) )
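A minimal sketch of the goto operation, again assuming items are (lhs, rhs, dot) triples. The closure helper is repeated so the block runs on its own; all names here are illustrative rather than taken from the slides.

```python
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}

def closure(items):
    """LR(0) closure of a set of (lhs, rhs, dot) items."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Advance the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for (lhs, rhs, dot) in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved)

s0 = closure({("S", ("E",), 0)})
s4 = goto(s0, "(")      # the slide's example: closure(T -> ( . E ))
print(sorted(s4))
```

An empty result from goto means there is no transition on that symbol, which is how the table construction later recognizes error entries.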
31
Construction of set-of-items
Procedure items(C), where C is a collection of sets of items (states)
  begin
    C := { closure({ S' → . S }) }
    repeat
      for each set of items I in C and each grammar symbol X
          such that goto(I, X) is not empty and not in C do
        add goto(I, X) to C
    until no more sets can be added
  end
New states are generated by goto
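The procedure above can be sketched as a worklist loop. This is an illustrative version, not the course's reference code: build_states, the edge dictionary, and the item encoding are assumptions, and the state numbers it assigns need not match the S0 to S8 numbering used on the slides.

```python
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}

def closure(items):
    """LR(0) closure of a set of (lhs, rhs, dot) items."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Items with the dot advanced over `symbol`, closed."""
    moved = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == symbol}
    return closure(moved)

def build_states(start_item):
    """Collect every set of items reachable from the start state via goto."""
    symbols = sorted({s for prods in GRAMMAR.values() for rhs in prods for s in rhs})
    states = [closure({start_item})]
    edges = {}                       # (state index, symbol) -> state index
    i = 0
    while i < len(states):           # states appended below are visited later
        for x in symbols:
            target = goto(states[i], x)
            if target:
                if target not in states:
                    states.append(target)
                edges[(i, x)] = states.index(target)
        i += 1
    return states, edges

states, edges = build_states(("S", ("E",), 0))
print(len(states), "states,", len(edges), "transitions")
```

For this grammar the loop should stop at nine states, the same count as the DFA in the slides.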
32
Example
  S → E
  E → E + T | T
  T → id | ( E )
S0 = closure({[S → . E]}) = { S → . E, E → . E + T, E → . T, T → . id, T → . ( E ) }
goto(S0, E) = closure({[S → E .], [E → E . + T]})
  S1 = { S → E ., E → E . + T }
goto(S0, T) = closure({[E → T .]})
  S2 = { E → T . }
goto(S0, id) = closure({[T → id .]})
  S3 = { T → id . }
…
S8 = …
33
DFA for the previous grammar (* marks kernel items; unmarked items are added by closure)
  S0:  * S → . E      E → . E + T      E → . T      T → . id      T → . ( E )
  S1:  * S → E .      * E → E . + T
  S2:  * E → T .
  S3:  * T → id .
  S4:  * T → ( . E )      E → . E + T      E → . T      T → . id      T → . ( E )
  S5:  * T → ( E . )      * E → E . + T
  S6:  * T → ( E ) .
  S7:  * E → E + . T      T → . id      T → . ( E )
  S8:  * E → E + T .
Transitions
  S0: E → S1, T → S2, id → S3, ( → S4
  S1: + → S7
  S4: E → S5, T → S2, id → S3, ( → S4
  S5: ) → S6, + → S7
  S7: T → S8, id → S3, ( → S4
34
Building Parse Table from DFA
ACTION[state, input symbol (terminal)]
GOTO[state, non-terminal symbol]
ACTION:
  1. If [A → α . a β] is in Si, a is a terminal, and goto(Si, a) = Sj, then ACTION[Si, a] = shift j
  2. If [A → α .] is in Si, then ACTION[Si, a] = reduce A → α for all a in Follow(A)
     If there are no conflicts between 1 and 2, the grammar is SLR(1)
  3. If [S' → S .] is in Si, where S is the original start symbol, then ACTION[Si, $] = accept
GOTO:
  1. If goto(Si, A) = Sj then GOTO[Si, A] = Sj
  2. All entries not filled are errors
35
Grammar
  1. S → E
  2. E → E + T
  3. E → T
  4. T → id
  5. T → ( E )

  Non-terminal   Follow
  S              $
  E              + ) $
  T              + ) $

ACTION
  State   +     id    (     )     $
  S0            S3    S4
  S1      S7                      accept
  S2      r3                r3    r3
  S3      r4                r4    r4
  S4            S3    S4
  S5      S7                S6
  S6      r5                r5    r5
  S7            S3    S4
  S8      r2                r2    r2

GOTO
  State   E     T     S
  S0      1     2
  S4      5     2
  S7            8
  (all other GOTO entries are empty)
36
Power Added to DFA
Return to the state where the non-terminal was predicted and continue; do this by counting states

  Stack                       Input        Action
  S0                          id + id $    S3
  S0 id S3                    + id $       r4, goto[S0, T]
  S0 T S2                     + id $       r3, goto[S0, E]
  S0 E S1                     + id $       S7
  S0 E S1 + S7                id $         S3
  S0 E S1 + S7 id S3          $            r4, goto[S7, T]
  S0 E S1 + S7 T S8           $            r2, goto[S0, E]
  S0 E S1                     $            accept

[Figure: parse tree for id + id with nodes S, E, E, +, T, T, id numbered 1 to 5]
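The trace above can be reproduced with a small table-driven driver. The sketch below hard-codes the ACTION and GOTO tables for this grammar, placing each reduction under the Follow set of its left-hand side as SLR(1) prescribes. The tuple encoding of actions and the function name parse are assumptions made for illustration. Unlike the slides, the stack holds states only, so a reduction by a production with r right-hand-side symbols pops r entries instead of 2r.

```python
# production number -> (LHS, length of RHS), numbered as on the slide
PRODUCTIONS = {1: ("S", 1), 2: ("E", 3), 3: ("E", 1), 4: ("T", 1), 5: ("T", 3)}

ACTION = {
    (0, "id"): ("shift", 3), (0, "("): ("shift", 4),
    (1, "+"): ("shift", 7), (1, "$"): ("accept", None),
    (4, "id"): ("shift", 3), (4, "("): ("shift", 4),
    (5, "+"): ("shift", 7), (5, ")"): ("shift", 6),
    (7, "id"): ("shift", 3), (7, "("): ("shift", 4),
}
# Reduce entries: Follow(E) = Follow(T) = { +, ), $ }
for state, prod in ((2, 3), (3, 4), (6, 5), (8, 2)):
    for lookahead in ("+", ")", "$"):
        ACTION[(state, lookahead)] = ("reduce", prod)

GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "E"): 5, (4, "T"): 2, (7, "T"): 8}

def parse(tokens):
    """Return the production numbers used, in the order they are reduced."""
    stack = [0]                          # state stack only (no grammar symbols)
    tokens = list(tokens) + ["$"]
    pos, reductions = 0, []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:                # empty table entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            if length:
                del stack[-length:]      # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:
            return reductions

print(parse(["id", "+", "id"]))
```

For id + id the reductions come out in the order r4, r3, r4, r2, the same sequence as the Action column of the trace.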
37
Consider the grammar G
  S → A b c | B b d
  A → a
  B → a
b ∈ Follow(A) and also b ∈ Follow(B)
States
  S0: S → . A b c,  S → . B b d,  A → . a,  B → . a
  S1 (from S0 on a): A → a .,  B → a .      (conflict)
What is reduced when "a b" is seen? Reduce to A or to B? This is a reduce-reduce conflict
G is not SLR(1) but it is SLR(2)
We need two symbols of look ahead to look past b:
  b c: reduce to A
  b d: reduce to B
It is possible to extend SLR(1) to k symbols of look ahead, which allows a larger class of CFGs to be parsed
38
SLR(k)
Extend the SLR(1) definition to SLR(k) as follows
Let α, β ∈ V*
First_k(α) = { x ∈ V_T* | α ⇒* xβ and |x| <= k }
  gives
    all terminal strings of size <= k derivable from α
    all k-symbol terminal prefixes of strings derivable from α
Follow_k(B) = { w ∈ V_T* | S ⇒* αBβ and w ∈ First_k(β) }
  gives
    all k-symbol terminal strings that can follow B in some derivation
    all shorter terminal strings that can follow B in a sentential form
39
Parse Table
Let S be a state and b ∈ V_T* such that |b| <= k
  1. If [A → α .] ∈ S and b ∈ Follow_k(A), then Action(S, b) = reduce by production A → α
  2. If [D → α . a β] ∈ S, a ∈ V_T, and b ∈ First_k(a β Follow_k(D)), then Action(S, b) = shift j, where goto(S, a) = Sj
For k = 1, this definition reduces to SLR(1):
  First_1(a β Follow_1(D)) = {a}
40
SLR(k)
Consider
  S → A b^(k-1) c | B b^(k-1) d
  A → a
  B → a
This grammar is SLR(k) but not SLR(k-1)
  With fewer than k symbols of look ahead the parser cannot decide what to reduce: whether a reduces to A or to B depends on the next k symbols, b^(k-1) c or b^(k-1) d
41
Non SLR(k)
Consider another grammar G
  S → j A j | A m | a j
  A → a
Follow(A) = {j, m}
States
  S0: S → . j A j,  S → . A m,  S → . a j,  A → . a
  S1 (from S0 on a): S → a . j,  A → a .      (conflict; note that only m can follow A in this context)
  S2 (from S0 on j): S → j . A j,  A → . a      (note that only j can follow A in this context)
State S1:
  [A → a .]: reduce using this production
  [S → a . j]: shift j
Not SLR(1). Is it SLR(k)?
  Follow_k(A) contains First_k(j) = {j} and First_k(m) = {m}
  For S → a . j, First_k(j Follow_k(S)) = {j}
  So the grammar is not SLR(k) for any k
42
Why?
The look ahead is too crude
  In S1, if A → a is reduced, then m is the only symbol that can be seen next: it is the only valid look ahead
  The fact that j can follow A in another context is irrelevant
We want to limit the look ahead in a state to those symbols that might actually occur in the context represented by the state
This is done in canonical LR
  Determine the look ahead appropriate to a configuration, working backwards from the DFA
  States contain a look ahead, which is used only for reductions
43
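[Figure: how the three methods treat states reached after X Y Z … that contain the item A → B .]
  SLR(1): a single item A → B . ; reduce on every symbol of Follow(A) = {a, b, c}
  LALR(1): items A → B . , a/b and A → B . , b/c ; each look ahead set is a subset of Follow(A)
  LR(1): state splitting, with separate items such as A → B . , a and A → B . , b and A → B . , c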
44
Constructing Canonical LR
Problem: the Follow set in SLR is not precise enough
  Need to carry more information in the state for a reduction A → α
  The extra information is used only for reductions
  The extra information is incorporated into the state by redefining items to include a terminal symbol as a second component
Form of an LR(1) item: [A → α . β, a]
  where A → αβ is a production and a is a terminal or $
The look ahead only has an effect on completed items
  [A → α ., a] calls for a reduction only if the next input symbol is a
  The second component will always be in Follow(A)
[A → α . β, a] is valid when S ⇒* δAw ⇒ δαβw by a rightmost derivation and a is the first symbol of w, or $ if w is empty
45
Constructing Canonical LR
Essentially the same as building the LR(0) sets of items, only with look ahead added
  Modify the closure and goto functions
Changes for closure
  If [A → α . B β, a] is in I and B → γ is a production, then add [B → . γ, c] for each c ∈ First(βa)
Changes for goto: carry the look ahead along
  If [A → α . X β, a] ∈ I then goto(I, X) contains [A → α X . β, a]
Why?
46
Example
Grammar
  S' → S
  S → C C
  C → e C | d
S0:
  [S' → . S, $]
  [S → . C C, $]            First(ε $) = {$}
  [C → . e C, e/d]          First(C $) = {e, d}
  [C → . d, e/d]
S1: goto(S0, S) = closure([S' → S ., $])
  [S' → S ., $]
S2: goto(S0, C) = closure([S → C . C, $])
  [S → C . C, $]
  [C → . e C, $]            First(ε $) = {$}
  [C → . d, $]
47
S3: goto(S0, e) = closure([C → e . C, e/d])
  [C → e . C, e/d]
  [C → . e C, e/d]          First(ε e/d) = {e, d}
  [C → . d, e/d]
S4: goto(S0, d) = closure([C → d ., e/d])
  [C → d ., e/d]
S5: goto(S2, C) = closure([S → C C ., $])
  [S → C C ., $]
S6: goto(S2, e) = closure([C → e . C, $])
  [C → e . C, $]
  [C → . e C, $]            First(ε $) = {$}
  [C → . d, $]
S7: goto(S2, d) = closure([C → d ., $])
  [C → d ., $]
S8: goto(S3, C) = closure([C → e C ., e/d])
  [C → e C ., e/d]
S9: goto(S6, C) = closure([C → e C ., $])
  [C → e C ., $]
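The LR(1) closure used to build these states can be sketched as follows. Two simplifying assumptions are made and should be kept in mind: the grammar has no ε-productions, so First of a symbol string is just First of its first symbol, and each item carries a single look ahead, so the slide's [C → . e C, e/d] appears as two items. The names first and closure1 are hypothetical.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("e", "C"), ("d",)],
}

def first(symbol, seen=frozenset()):
    """FIRST of a single symbol. Assumes the grammar has no epsilon productions."""
    if symbol not in GRAMMAR:            # terminals and the end marker '$'
        return {symbol}
    if symbol in seen:                   # guard against left recursion
        return set()
    out = set()
    for rhs in GRAMMAR[symbol]:
        out |= first(rhs[0], seen | {symbol})
    return out

def closure1(items):
    """LR(1) closure of a set of (lhs, rhs, dot, lookahead) items."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            rest = rhs[dot + 1:] + (la,)      # the string "beta a" from the slide
            for prod in GRAMMAR[rhs[dot]]:
                for c in first(rest[0]):      # valid only without epsilon rules
                    item = (rhs[dot], prod, 0, c)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)

s0 = closure1({("S'", ("S",), 0, "$")})
for lhs, rhs, dot, la in sorted(s0):
    print(f"[{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}, {la}]")
```

A grammar with ε-productions would need the full First computation over the whole string βa; that case is deliberately left out of this sketch.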
48
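[Figure: DFA over states S0 to S9. Transitions: S0 on S to S1, on C to S2, on e to S3, on d to S4; S2 on C to S5, on e to S6, on d to S7; S3 on C to S8, on e to S3, on d to S4; S6 on C to S9, on e to S6, on d to S7]
Note that S3 and S6 are the same except for the look ahead (the same holds for S4 and S7)
In SLR(1), one state represents both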
49
Constructing Canonical LR Parse Table
Shift: same as before
Reduce: do not use the Follow set; reduce only on the look ahead
Action and GOTO
  1. If [A → α . a β, b] ∈ Si and goto(Si, a) = Sj, then Action[i, a] = shift and go to state j
  2. If [A → α ., a] ∈ Si, then Action[i, a] = reduce A → α
     (note: previously this was done for all symbols in Follow(A))
  3. If [S' → S ., $] ∈ Si, then Action[i, $] = accept
50
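Revisit SLR and LR
  S → a E a | b E b | a F b | b F a
  E → e
  F → e
SLR: reduce/reduce conflict
  Follow(E) = Follow(F) = {a, b}
  In aea the reduction is E → e, and in bea it is F → e
Can LR(1) work?
  It will not have a conflict, because states are split to take the left context into account
  Reduce to E if e is followed by a and preceded by a, or followed by b and preceded by b
  Reduce to F if e is followed by b and preceded by a, or followed by a and preceded by b
[Figure: start state { S → . a E a, S → . b E b, S → . a F b, S → . b F a } with a transition on a to { S → a . E a, S → a . F b, E → . e, F → . e }, a transition on b, and a transition on e to { E → e ., F → e . }]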
51
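States
  S0: S → . a E a,  S → . b E b,  S → . a F b,  S → . b F a
  S1 (from S0 on a): S → a . E a,  S → a . F b,  E → . e,  F → . e
  S2 (from S0 on b): S → b . E b,  S → b . F a,  E → . e,  F → . e
SLR: Follow(E) = Follow(F) = {a, b}
  On e, both S1 and S2 lead to { E → e ., F → e . }, where the Follow sets cannot separate the two reductions
LR: the look ahead sets are more precise
  S3 (from S1 on e): [E → e ., a]  [F → e ., b]
  S4 (from S2 on e): [E → e ., b]  [F → e ., a]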
52
SLR(1) and LR(1)
Every SLR(1) grammar is LR(1), but an LR(1) parser has more states than an SLR(1) parser; the difference can be orders of magnitude
LALR(1): look ahead LR parsing method
  Used in practice because most syntactic structure can be represented by LALR (not true for SLR)
  Same number of states as SLR
  Can be constructed by merging LR(1) states with the same core
53
Example
Grammar
  S' → S
  S → C C
  C → e C | d
States with look ahead e/d:
  S3: goto(S0, e) = closure([C → e . C, e/d])
    [C → e . C, e/d]  [C → . e C, e/d]  [C → . d, e/d]
  S4: goto(S0, d) = closure([C → d ., e/d])
    [C → d ., e/d]
  S8: goto(S3, C) = closure([C → e C ., e/d])
    [C → e C ., e/d]
States with look ahead $:
  S6: goto(S2, e) = closure([C → e . C, $])
    [C → e . C, $]  [C → . e C, $]  [C → . d, $]
  S7: goto(S2, d) = closure([C → d ., $])
    [C → d ., $]
  S9: goto(S6, C) = closure([C → e C ., $])
    [C → e C ., $]
54
[Figure: the LR(1) DFA over states S0 to S9, as before]
Note that S3 and S6 are the same except for the look ahead (the same holds for S4 and S7)
In SLR(1), one state represents both
55
Merging states
Can merge S3 and S6
  S3: goto(S0, e) = closure([C → e . C, e/d])
    [C → e . C, e/d]  [C → . e C, e/d]  [C → . d, e/d]
  S6: goto(S2, e) = closure([C → e . C, $])
    [C → e . C, $]  [C → . e C, $]  [C → . d, $]
  S36:
    [C → e . C, e/d/$]  [C → . e C, e/d/$]  [C → . d, e/d/$]
Similarly
  S47: [C → d ., e/d/$]
  S89: [C → e C ., e/d/$]
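Merging by core can be sketched in a few lines. The example below writes out the six LR(1) states from the slides by hand, one item per look ahead symbol, and groups them by their LR(0) core. The helper names items, core, and merge_by_core are illustrative assumptions, not an API from any parser generator.

```python
from collections import defaultdict

def items(lhs, rhs, dot, lookaheads):
    """Expand the slide notation [A -> alpha . beta, x/y] into one item per look ahead."""
    return {(lhs, tuple(rhs.split()), dot, la) for la in lookaheads}

def core(state):
    """The LR(0) core: the items with their look ahead component dropped."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

def merge_by_core(states):
    """Union the LR(1) states that share a core, as in LALR construction."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(group) for group in groups.values()]

ed, end = ("e", "d"), ("$",)
S3 = items("C", "e C", 1, ed) | items("C", "e C", 0, ed) | items("C", "d", 0, ed)
S6 = items("C", "e C", 1, end) | items("C", "e C", 0, end) | items("C", "d", 0, end)
S4, S7 = items("C", "d", 1, ed), items("C", "d", 1, end)
S8, S9 = items("C", "e C", 2, ed), items("C", "e C", 2, end)

merged = merge_by_core([S3, S4, S8, S6, S7, S9])
for state in merged:
    print(sorted(state))
```

The six input states collapse into three, corresponding to S36, S47, and S89, each carrying the look ahead set e/d/$.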
56
Effects of Merging
1. Detection of errors can be delayed
   An LALR parser will not shift another symbol after the LR parser would have declared an error, but it may perform some more reductions first
Example
  S' → S
  S → C C
  C → e C | d
  and the string eed$
Canonical LR
  Parse stack: 0 e 3 e 3 d 4
  State 4 with input $: error, because S4 = { [C → d ., e/d] }
LALR
  Stack: 0 e 36 e 36 d 47     state 47, input $: reduce C → d
  Stack: 0 e 36 e 36 C 89     reduce C → e C
  Stack: 0 e 36 C 89          reduce C → e C
  Stack: 0 C 2                state 2, input $: error
Why?
57
Effects of Merging
2. Merging of states can introduce conflicts
   It cannot introduce shift-reduce conflicts
   It can introduce reduce-reduce conflicts
Shift-reduce conflicts
  Suppose Sij, formed by merging Si and Sj, contains
    [A → α ., a]: reduce on input a
    [B → β . a γ, b]: shift on input a
  However, Si and Sj must have the same core, so the state that contains [A → α ., a] must also contain an item [B → β . a γ, c] for some c
  The shift-reduce conflict was therefore already present in Si or Sj and was not introduced by merging
Why?
58
Reduce-reduce Conflicts
  S → a E a | b E b | a F b | b F a
  E → e
  F → e
S3 (viable prefix ae):
  [E → e ., a]
  [F → e ., b]
S4 (viable prefix be):
  [E → e ., b]
  [F → e ., a]
Merging gives S34:
  [E → e ., a/b]
  [F → e ., a/b]
Both reductions are called for on inputs a and b, i.e. a reduce-reduce conflict
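A short check makes the effect of the merge concrete. In this sketch an LR(1) item is a (lhs, rhs, dot, lookahead) tuple, and reduce_reduce_conflicts is a hypothetical helper that lists the look ahead symbols on which more than one completed item asks for a reduction.

```python
def reduce_reduce_conflicts(state):
    """Map each look ahead to the productions competing to reduce on it."""
    by_lookahead = {}
    for lhs, rhs, dot, la in state:
        if dot == len(rhs):                       # completed item: calls for a reduce
            by_lookahead.setdefault(la, set()).add((lhs, rhs))
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}

# LR(1) states from the slide, one look ahead per item
S3 = {("E", ("e",), 1, "a"), ("F", ("e",), 1, "b")}
S4 = {("E", ("e",), 1, "b"), ("F", ("e",), 1, "a")}
S34 = S3 | S4                                     # same core, so LALR merges them

print(reduce_reduce_conflicts(S3))                # no conflict before merging
print(reduce_reduce_conflicts(S4))
print(sorted(reduce_reduce_conflicts(S34)))       # conflicting look aheads after merging
```

Neither S3 nor S4 has a conflict on its own; only the merged state reports competing reductions, on both a and b.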
59
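[Figure: how the three methods treat states reached after X Y Z … that contain the item A → B .]
  SLR(1): a single item A → B . ; reduce on every symbol of Follow(A) = {a, b, c}
  LALR(1): items A → B . , a/b and A → B . , b/c ; each look ahead set is a subset of Follow(A)
  LR(1): state splitting, with separate items such as A → B . , a and A → B . , b and A → B . , c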
60
Construction of LALR Parser
One solution:
  Construct the LR(1) items
  Merge states with the same core
  If there are no conflicts, you have an LALR parser
Inefficient, because building the LR(1) items is expensive in time and space
Efficient construction of LALR parsers
  Avoid construction of the LR(1) items
  Construct states containing only LR(0) kernel items
  Compute the look ahead for the kernel items
  Predict actions using the kernel items
61
LALR vs. LR Parsing
LALR languages are not natural
  They are an efficiency hack on LR languages
Any reasonable programming language has an LALR(1) grammar
LALR(1) has become a standard for programming languages and for parser generators
62
A Hierarchy of Grammar Classes
63
Using Automatic Tools: YACC
64
Using Parser Generator
Most common parser generators are LALR(1)
A parser generator constructs an LALR(1) table and reports an error when a table entry is multiply defined
  A shift and a reduce: reports a shift/reduce conflict
  Multiple reduces: reports a reduce/reduce conflict
An ambiguous grammar will generate conflicts
  The conflicts must be resolved
65
Shift/Reduce Conflicts
Typically due to ambiguities in the grammar
Classic example: the dangling else
  S → if E then S | if E then S else S | OTHER
The DFA will have a state containing
  [S → if E then S ., else]
  [S → if E then S . else S, x]
If else follows, we can shift or reduce
The default (YACC, Bison, etc.) is to shift
  The default behavior is what is needed in this case
66
More Shift/Reduce Conflicts
Consider the ambiguous grammar
  E → E + E | E * E | int
We will have states containing
  [E → E * . E, +]                        [E → E * E ., +]
  [E → . E + E, +]      on E goes to      [E → E . + E, +]
  ...                                     ...
Again we have a shift/reduce conflict on input +
  We need to reduce (* has higher precedence than +)
  Recall the solution: declare the precedence of * and +
67
More Shift/Reduce Conflicts
In YACC, declare precedence and associativity
  %left '+'
  %left '*'
Precedence of a rule = that of its last terminal
  See the yacc manual for ways to override this default
Resolve a shift/reduce conflict with a shift if:
  no precedence is declared for either the rule or the terminal
  the input terminal has higher precedence than the rule
  the precedences are the same and the terminal is right associative
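The resolution rules above can be written down as a small decision function. This is a simplified sketch of the behavior described on the slide, not yacc's actual implementation: the PRECEDENCE table, the resolve function, and the TERMINALS set are hypothetical, and yacc's %nonassoc case is left out.

```python
# Later %left lines bind tighter, so '*' gets the higher level.
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}   # terminal -> (level, associativity)

def resolve(rule_rhs, lookahead, terminals):
    """Return 'shift' or 'reduce' for a shift/reduce conflict between the
    completed rule with right-hand side `rule_rhs` and terminal `lookahead`."""
    rule_terminals = [s for s in rule_rhs if s in terminals]
    # Precedence of a rule = that of its last terminal.
    rule = PRECEDENCE.get(rule_terminals[-1]) if rule_terminals else None
    token = PRECEDENCE.get(lookahead)
    if rule is None or token is None:
        return "shift"               # no precedence declared: default is shift
    if token[0] != rule[0]:
        return "shift" if token[0] > rule[0] else "reduce"
    return "shift" if token[1] == "right" else "reduce"

TERMINALS = {"+", "*", "int", "if", "then", "else"}
print(resolve(("E", "*", "E"), "+", TERMINALS))              # rule beats terminal
print(resolve(("E", "+", "E"), "*", TERMINALS))              # terminal beats rule
print(resolve(("E", "+", "E"), "+", TERMINALS))              # same level, left assoc
print(resolve(("if", "E", "then", "S"), "else", TERMINALS))  # nothing declared
```

The four calls cover the cases listed on the slide: reduce when the rule has higher precedence, shift when the input terminal does, reduce on equal precedence with left associativity, and the default shift for the dangling else.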
68
Use Precedence to Solve S/R Conflict
  [E → E * . E, +]                        [E → E * E ., +]
  [E → . E + E, +]      on E goes to      [E → E . + E, +]
  ...                                     ...
We will choose reduce, because the precedence of the rule E → E * E is higher than that of the terminal +
  [E → E + . E, +]                        [E → E + E ., +]
  [E → . E + E, +]      on E goes to      [E → E . + E, +]
  ...                                     ...
We will choose reduce, because E → E + E and + have the same precedence and + is left-associative
69
Back to our dangling else example
  [S → if E then S ., else]
  [S → if E then S . else S, x]
Can eliminate the conflict by declaring else with higher precedence than then
But this starts to look like "hacking the tables"
Best to avoid overuse of precedence declarations, or you will end up with unexpected parse trees
70
Reduce/Reduce Conflicts
Usually due to ambiguity in the grammar
Example: a sequence of identifiers
  S → ε | id | id S
There are two parse trees for the string id
  S ⇒ id
  S ⇒ id S ⇒ id
How does this confuse the parser?
71
Reduce/Reduce Conflicts
Consider the states
  [S' → . S, $]                            [S → id ., $]
  [S → ., $]                               [S → id . S, $]
  [S → . id, $]         on id goes to      [S → ., $]
  [S → . id S, $]                          [S → . id, $]
                                           [S → . id S, $]
Reduce/reduce conflict on input "id$"
  S' ⇒ S ⇒ id
  S' ⇒ S ⇒ id S ⇒ id
Better to rewrite the grammar: S → ε | id S
72
Semantic Actions
Semantic actions are implemented for LR parsing
  Keep attributes on a semantic stack, parallel to the syntax stack
  On shift a: push the attribute for a on the semantic stack
  On reduce X → α:
    pop the attributes for α
    compute the attribute for X
    push it on the semantic stack
Creating the AST
  Bottom up
  Create leaf nodes for tokens and create internal nodes from subtrees
73
Performing Semantic Actions
Compute the value
  E → T + E1      { E.val = T.val + E1.val }
    | T           { E.val = T.val }
  T → int * T1    { T.val = int.val * T1.val }
    | int         { T.val = int.val }
Consider the parsing of the string 3 * 5 + 8
Recall: creating the AST
  E → int         E.ast = mkleaf(int.lexval)
    | E1 + E2     E.ast = mktree(plus, E1.ast, E2.ast)
    | ( E1 )      E.ast = E1.ast
A bottom-up evaluation of the ast attribute:
  E.ast = mktree(plus, mkleaf(5), mktree(plus, mkleaf(2), mkleaf(3)))
[Figure: AST with PLUS nodes over the leaves 5, 2, 3]
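The value computation for 3 * 5 + 8 can be followed step by step with an explicit semantic stack. To keep the sketch short it does not include a parser: the shift/reduce sequence in ACTIONS is written out by hand as the sequence an LR parser would produce for this input, which is an assumption of the example. RULES and run are hypothetical names.

```python
# rule name -> (number of RHS symbols, function computing the LHS attribute)
RULES = {
    "T->int":   (1, lambda v: v[0]),
    "T->int*T": (3, lambda v: v[0] * v[2]),
    "E->T":     (1, lambda v: v[0]),
    "E->T+E":   (3, lambda v: v[0] + v[2]),
}

def run(actions):
    """Replay shift/reduce actions, maintaining only the semantic stack."""
    stack = []
    for kind, arg in actions:
        if kind == "shift":
            stack.append(arg)            # token attribute (None for operators)
        else:
            count, compute = RULES[arg]
            values = stack[-count:]      # attributes of the RHS symbols
            del stack[-count:]
            stack.append(compute(values))
    return stack

# Hand-written action sequence for the input 3 * 5 + 8
ACTIONS = [
    ("shift", 3), ("shift", None), ("shift", 5),
    ("reduce", "T->int"), ("reduce", "T->int*T"),
    ("shift", None), ("shift", 8),
    ("reduce", "T->int"), ("reduce", "E->T"), ("reduce", "E->T+E"),
]

print(run(ACTIONS))
```

At the end a single attribute remains on the stack, the value of the whole expression. Swapping the lambdas for node constructors would build the AST in the same bottom-up order.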
74
Error Recovery
An error is detected when the parser consults the parsing table and finds an empty entry
  Canonical LR will not make a single reduction before announcing an error
  SLR and LALR parsers may make several reductions before announcing an error, but will not shift an erroneous symbol onto the stack
Simple error recovery
  Scan down the stack until a state S with a goto on a particular non-terminal A is found
  Discard zero or more input symbols until a symbol "a" is found that can follow A
  Push goto[S, A] on the stack and continue parsing
  Choice of A: a non-terminal representing a major program piece
    e.g. if A is 'stmt' then 'a' may be 'end' or ';'
75
Compaction of LR Parse Table
A typical language grammar with 50-100 terminals and 100 productions may have an LALR parser with several hundred states and thousands of action entries
  Often rows of the table are the same, so the rows can be shared
  Rows can be made shorter by using lists of (input, action) pairs
    This slows access to the table
76
Notes on Parsing
Parsing
  A solid foundation: CFG
  A simple parser: LL(1)
  A more powerful parser: LR(1)
  An efficient hack: LALR(1)
  LALR(1) parser generators