Chapter 4-2 Chang Chi-Chung 2007.06.07. Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.

Chapter 4-2 Chang Chi-Chung 2007.06.07

Bottom-Up Parsing
LR methods (Left-to-right scan, Rightmost derivation in reverse)
  - LR(0), SLR, Canonical LR = LR(1), LALR
Other special cases:
  - Shift-reduce parsing
  - Operator-precedence parsing

Operator-Precedence Parsing
  - Special case of shift-reduce parsing
  - See textbook section 4.6

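Bottom-Up Parsing
Grammar:
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
Rightmost derivation (read left to right):
  E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id
Reduction (the same sequence read right to left):
  id * id, F * id, T * id, T * F, T, E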

Handle Pruning
Handle
  - A handle is a substring of grammar symbols in a right-sentential form that matches the right-hand side of a production, and whose reduction is one step along the reverse of a rightmost derivation.
Grammar:
  E → E + T | T
  T → T * F | F
  F → ( E ) | id

  Right Sentential Form   Handle   Reducing Production
  id1 * id2               id1      F → id
  F * id2                 F        T → F
  T * id2                 id2      F → id
  T * F                   T * F    T → T * F

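Handle Pruning
A rightmost derivation in reverse can be obtained by "handle pruning":
  S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = ω
Handle definition
  - If S ⇒*rm αAω ⇒rm αβω, then the production A → β, in the position following α, is a handle of αβω. The string ω to the right of the handle contains only terminals.
Figure: a handle A → β in the parse tree for αβω (S at the root, A deriving β, with α to its left and ω to its right).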

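Example: Handle
Grammar:
  S → a A B e
  A → A b c | b
  B → d
Reducing handles (succeeds):
  a b b c d e
  a A b c d e     (reduce b by A → b)
  a A d e         (reduce A b c by A → A b c)
  a A B e         (reduce d by B → d)
  S               (reduce a A B e by S → a A B e)
NOT a handle: in a A b c d e the second b also matches A → b, but reducing it gives
  a A A c d e
  … ?
which is not a sentential form, so further reductions will fail.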
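The difference between a handle and a substring that merely matches a production body can be checked mechanically. The following is a minimal Python sketch for the example grammar above. It is a brute-force search over all possible reductions, not the way an LR parser finds handles, and the names `reductions` and `reducible_to_start` are hypothetical helpers introduced only for this illustration.

```python
# Productions of the example grammar as (head, body) pairs.
GRAMMAR = [
    ("S", ("a", "A", "B", "e")),
    ("A", ("A", "b", "c")),
    ("A", ("b",)),
    ("B", ("d",)),
]


def reductions(form):
    """Yield every form obtained by replacing one occurrence of a
    production body in `form` with the production's head."""
    for head, body in GRAMMAR:
        n = len(body)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == body:
                yield form[:i] + (head,) + form[i + n:]


def reducible_to_start(form, start="S", seen=None):
    """Return True if some sequence of reductions turns `form` into
    the start symbol. `seen` records forms already known to fail."""
    seen = set() if seen is None else seen
    if form == (start,):
        return True
    if form in seen:
        return False
    seen.add(form)
    return any(reducible_to_start(nxt, start, seen)
               for nxt in reductions(form))


if __name__ == "__main__":
    print(reducible_to_start(tuple("abbcde")))                    # True
    print(reducible_to_start(("a", "A", "A", "c", "d", "e")))     # False
```

The second call corresponds to the wrong choice on the slide: once the second b has been reduced to A, no sequence of reductions reaches S.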

Shift-Reduce Parsing
Shift-reduce parsing is a form of bottom-up parsing
  - A stack holds grammar symbols
  - An input buffer holds the rest of the string to be parsed
Shift-reduce parser actions
  - shift
  - reduce
  - accept
  - error

Shift-Reduce Parsing
Shift
  - Shift the next input symbol onto the top of the stack.
Reduce
  - The right end of the string to be reduced must be at the top of the stack.
  - Locate the left end of the string within the stack and decide with what nonterminal to replace the string.
Accept
  - Announce successful completion of parsing.
Error
  - Discover a syntax error and call an error recovery routine.

Shift-Reduce Parsing
Grammar:
  E → E + T | T
  T → T * F | F
  F → ( E ) | id

  Stack        Input          Action
  $            id1 * id2 $    shift
  $ id1        * id2 $        reduce by F → id
  $ F          * id2 $        reduce by T → F
  $ T          * id2 $        shift
  $ T *        id2 $          shift
  $ T * id2    $              reduce by F → id
  $ T * F      $              reduce by T → T * F
  $ T          $              reduce by E → T
  $ E          $              accept

Example: Shift-Reduce Parsing
Grammar:
  S → a A B e
  A → A b c | b
  B → d
Shift-reduce corresponds to a rightmost derivation:
  S ⇒rm a A B e ⇒rm a A d e ⇒rm a A b c d e ⇒rm a b b c d e
Reducing a sentence:
  a b b c d e
  a A b c d e
  a A d e
  a A B e
  S
The substrings replaced at each step match production right-hand sides.
Figure: the parse tree for a b b c d e, built bottom-up one reduction at a time.

Conflicts During Shift-Reduce Parsing
Conflict types
  - shift-reduce
  - reduce-reduce
Shift-reduce and reduce-reduce conflicts are caused by
  - The limitations of the LR parsing method (even when the grammar is unambiguous)
  - Ambiguity of the grammar

Shift-Reduce Conflict
Grammar:
  E → E + E
  E → E * E
  E → ( E )
  E → id
Find handles to be reduced:

  Stack      Input       Action
  $          id+id*id$   shift
  $id        +id*id$     reduce E → id
  $E         +id*id$     shift
  $E+        id*id$      shift
  $E+id      *id$        reduce E → id
  $E+E       *id$        shift (or reduce?)
  $E+E*      id$         shift
  $E+E*id    $           reduce E → id
  $E+E*E     $           reduce E → E * E
  $E+E       $           reduce E → E + E
  $E         $           accept

How to resolve conflicts?

Shift-Reduce Conflict
Ambiguous grammar:
  S → if E then S
    | if E then S else S
    | other

  Stack               Input      Action
  $…                  …$         …
  $…if E then S       else…$     shift or reduce?

The choice is between shifting else onto if E then S and reducing if E then S.
Resolve in favor of shift, so that else matches the closest if.

Reduce-Reduce Conflict
Grammar:
  C → A B
  A → a
  B → a

  Stack    Input    Action
  $        aa$      shift
  $a       a$       reduce A → a or B → a ?

Resolve in favor of reduce A → a, otherwise we are stuck and cannot proceed.

LR
LR parsers are table-driven
  - Much like the nonrecursive LL parsers.
Reasons for using LR parsing
  - LR parsing is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods.
  - LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written.
  - An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
  - The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods.

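LR(0)
An LR parser makes shift-reduce decisions by maintaining states that keep track of where it is in a parse.
An item of a grammar G is a production of G with a dot at some position of the body.
  - Example: the production A → X Y Z yields four items
      A → · X Y Z
      A → X · Y Z
      A → X Y · Z
      A → X Y Z ·
  - In the item A → X · Y Z, the part before the dot (X) is already on the stack, and the part after the dot (Y Z) is what the parser expects to derive from the upcoming input.
Note that the production A → ε has only one item, [A → ·]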
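As a small illustration of the definition, an item can be represented as a production plus a dot position. The sketch below is one possible Python encoding, assuming items are `(head, body, dot)` triples; `items_of` and `show` are hypothetical helper names, not part of the slides.

```python
def items_of(head, body):
    """Return all LR(0) items of the production head → body,
    one for each possible dot position (0 .. len(body))."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item with a middle dot marking the position."""
    head, body, dot = item
    return f"{head} → " + " ".join(body[:dot] + ("·",) + body[dot:])


if __name__ == "__main__":
    for item in items_of("A", ["X", "Y", "Z"]):
        print(show(item))
    # An ε-production has an empty body, hence exactly one item.
    print(show(items_of("A", [])[0]))
```

A production with a body of length n has n + 1 items, which is why A → ε contributes exactly one.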

LR(0)
Canonical LR(0) Collection
  - One collection of sets of LR(0) items
  - Provides the basis for constructing a DFA that is used to make parsing decisions: the LR(0) automaton.
To construct the canonical LR(0) collection for a grammar, we need
  - The augmented grammar: if G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S
  - The closure function
  - The goto function

Use of the LR(0) Automaton

Function Closure
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by two rules:
  - Add every item in I to closure(I).
  - If A → α · B β is in closure(I) and B → γ is a production, then add the item B → · γ to closure(I). Apply this rule until no more new items can be added to closure(I).
The items fall into two classes
  - Kernel items: the initial item S’ → · S, and all items whose dots are not at the left end.
  - Nonkernel items: all items with their dots at the left end, except for S’ → · S.

Example
The grammar G:
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
Let I = { E’ → · E }, then closure(I) = {
  E’ → · E
  E → · E + T
  E → · T
  T → · T * F
  T → · F
  F → · ( E )
  F → · id
}
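The closure rules can be turned into a short worklist loop. The following is a minimal Python sketch for the expression grammar above, assuming items are encoded as `(head, body, dot)` triples and that any symbol with no productions is a terminal; the names `GRAMMAR`, `closure`, and `show` are choices made for this illustration.

```python
# Augmented expression grammar: nonterminal -> list of production bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Return closure(I) for a set of (head, body, dot) items."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        # Only an item with a nonterminal right after the dot adds items.
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


def show(item):
    """Format an item with a middle dot marking the position."""
    head, body, dot = item
    return f"{head} → " + " ".join(body[:dot] + ("·",) + body[dot:])


if __name__ == "__main__":
    for line in sorted(show(i) for i in closure({("E'", ("E",), 0)})):
        print(line)
```

Running it on { E’ → · E } yields the seven items listed on the slide, in sorted rather than slide order.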

Exercise
The grammar G:
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
Let I = { E → E + · T }. Compute closure(I).

Function Goto
Function Goto(I, X)
  - I is a set of items
  - X is a grammar symbol
  - Goto(I, X) is defined to be the closure of the set of all items [A → α X · β] such that [A → α · X β] is in I.
  - The Goto function is used to define the transitions in the LR(0) automaton for a grammar.

Example
The grammar G:
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
I = {
  E’ → E ·
  E → E · + T
}
Goto(I, +) = {
  E → E + · T
  T → · T * F
  T → · F
  F → · ( E )
  F → · id
}

Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → · S]}) } (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
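The four steps above can be sketched in Python as follows. This is an illustrative implementation under the same assumptions as the slides' expression grammar, with items encoded as `(head, body, dot)` triples; the function names `closure`, `goto`, and `canonical_collection` are chosen here for readability, and the state numbers it produces depend on iteration order, so they need not match the numbering used in the parse tables.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Closure of a set of LR(0) items (head, body, dot)."""
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it,
    then take the closure of the moved items."""
    moved = {(h, b, d + 1) for h, b, d in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection(start_item):
    """Return (states, transitions) of the LR(0) automaton."""
    symbols = set(GRAMMAR) | {s for bodies in GRAMMAR.values()
                              for body in bodies for s in body}
    states = [frozenset(closure({start_item}))]
    transitions = {}
    for state in states:          # the list grows while we iterate
        for x in sorted(symbols):
            target = frozenset(goto(state, x))
            if not target:
                continue          # GOTO(I, X) is empty
            if target not in states:
                states.append(target)
            transitions[(states.index(state), x)] = states.index(target)
    return states, transitions


if __name__ == "__main__":
    states, transitions = canonical_collection(("E'", ("E",), 0))
    print(len(states), "sets of items")
```

For the expression grammar this produces 12 sets of items, matching the states 0 to 11 of the parsing table for this grammar.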

Example: The Parse of id * id

  Line   STACK       SYMBOLS     INPUT        ACTION
  (1)    0           $           id * id $    shift to 5
  (2)    0 5         $ id        * id $       reduce by F → id
  (3)    0 3         $ F         * id $       reduce by T → F
  (4)    0 2         $ T         * id $       shift to 7
  (5)    0 2 7       $ T *       id $         shift to 5
  (6)    0 2 7 5     $ T * id    $            reduce by F → id
  (7)    0 2 7 10    $ T * F     $            reduce by T → T * F
  (8)    0 2         $ T         $            reduce by E → T
  (9)    0 1         $ E         $            accept

Model of an LR Parser

Structure of the LR Parsing Table
The parsing table consists of two parts:
  - A parsing-action function ACTION
  - A goto function GOTO
ACTION[i, a], for a state i and a terminal a (or $), has one of four forms:
  - Shift j, where j is a state. The parser shifts input a onto the stack, but uses state j to represent a.
  - Reduce A → β. The parser reduces β on the top of the stack to head A.
  - Accept. The parser accepts the input and finishes parsing.
  - Error.

Structure of the LR Parsing Table (1)
The GOTO function, GOTO[Ii, A], is defined on sets of items.
  - If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.

LR-Parser Configurations
Configuration ( = LR parser state):
  (s0 s1 s2 … sm, ai ai+1 … an $)
   stack          input
It represents the right-sentential form X1 X2 … Xm ai ai+1 … an, where Xk is the grammar symbol associated with state sk.
If action[sm, ai] = shift s, then push s (representing ai) and advance the input:
  (s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm s, ai+1 … an $)
If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β|, then pop r symbols and push s (representing A):
  (s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
If action[sm, ai] = accept, then stop
If action[sm, ai] = error, then attempt recovery

Example LR Parse Table
Grammar:
  1. E → E + T
  2. E → T
  3. T → T * F
  4. T → F
  5. F → ( E )
  6. F → id

           action                            goto
  state    id    +     *     (     )     $      E    T    F
  0        s5                s4                 1    2    3
  1              s6                      acc
  2              r2    s7          r2    r2
  3              r4    r4          r4    r4
  4        s5                s4                 8    2    3
  5              r6    r6          r6    r6
  6        s5                s4                      9    3
  7        s5                s4                           10
  8              s6                s11
  9              r1    s7          r1    r1
  10             r3    r3          r3    r3
  11             r5    r5          r5    r5

s5 means shift and go to state 5; r1 means reduce by production 1.

Example
Grammar:
  0. S → E
  1. E → E + T
  2. E → T
  3. T → T * F
  4. T → F
  5. F → ( E )
  6. F → id

  Line   STACK       SYMBOLS     INPUT             ACTION
  (1)    0                       id * id + id $    shift 5
  (2)    0 5         id          * id + id $       reduce 6 goto 3
  (3)    0 3         F           * id + id $       reduce 4 goto 2
  (4)    0 2         T           * id + id $       shift 7
  (5)    0 2 7       T *         id + id $         shift 5
  (6)    0 2 7 5     T * id      + id $            reduce 6 goto 10
  (7)    0 2 7 10    T * F       + id $            reduce 3 goto 2
  (8)    0 2         T           + id $            reduce 2 goto 1
  (9)    0 1         E           + id $            shift 6
  (10)   0 1 6       E +         id $              shift 5
  (11)   0 1 6 5     E + id      $                 reduce 6 goto 3
  (12)   0 1 6 3     E + F       $                 reduce 4 goto 9
  (13)   0 1 6 9     E + T       $                 reduce 1 goto 1
  (14)   0 1         E           $                 accept
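The trace above can be reproduced with a small table-driven driver. The following Python sketch hard-codes the ACTION and GOTO tables of the expression grammar (productions 1 to 6) and returns the sequence of actions taken. The dictionary encoding and the function name `parse` are assumptions made for this illustration, and error handling is reduced to raising `SyntaxError` instead of attempting recovery.

```python
# Production number -> (head, length of body).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# ACTION[state][terminal]: "sN" = shift to N, "rN" = reduce by N.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Run the LR driver; return the list of actions performed."""
    stack = [0]                       # stack of states, s0 at the bottom
    tokens = list(tokens) + ["$"]
    pos = 0
    trace = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:               # empty table entry = error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at {pos}")
        trace.append(act)
        if act == "acc":
            return trace
        if act[0] == "s":             # shift: push state, advance input
            stack.append(int(act[1:]))
            pos += 1
        else:                         # reduce: pop |body| states, push goto
            head, size = PRODUCTIONS[int(act[1:])]
            del stack[len(stack) - size:]
            stack.append(GOTO[stack[-1]][head])


if __name__ == "__main__":
    print(parse(["id", "*", "id", "+", "id"]))
```

The returned action list has 14 entries, one per line of the trace above. Only states are kept on the stack, since the grammar symbols can be recovered from them.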

SLR Grammars
SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing
SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A)
Grammar:
  S → E
  E → id + E
  E → id
State I0:
  S → · E
  E → · id + E
  E → · id
State I2 = goto(I0, id):
  E → id · + E
  E → id ·
In state I2:
  - FOLLOW(E) = {$}, thus reduce E → id on $
  - Shift on + (goto(I2, +) = I3)

SLR Parsing Table
Reductions do not fill entire rows
Otherwise the same as LR(0)
Grammar:
  1. S → E
  2. E → id + E
  3. E → id

           action             goto
  state    id    +     $      E
  0        s2                 1
  1                    acc
  2              s3    r3
  3        s2                 4
  4                    r2

In state 2: FOLLOW(E) = {$}, thus reduce on $; shift on +

Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = { I0, I1, …, In } of sets of LR(0) items
3. State i is constructed from Ii:
   - If [A → α · a β] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
   - If [A → α ·] ∈ Ii, then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
   - If [S’ → S ·] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j (this fills the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → · S]

Example SLR Grammar and LR(0) Items
Augmented grammar:
  0. C’ → C
  1. C → A B
  2. A → a
  3. B → a
I0 = closure({[C’ → · C]}) (start state):
  C’ → · C
  C → · A B
  A → · a
I1 = goto(I0, C) = closure({[C’ → C ·]}) (final state):
  C’ → C ·
I2 = goto(I0, A):
  C → A · B
  B → · a
I3 = goto(I0, a):
  A → a ·
I4 = goto(I2, B):
  C → A B ·
I5 = goto(I2, a):
  B → a ·

Example SLR Parsing Table
Grammar:
  0. C’ → C
  1. C → A B
  2. A → a
  3. B → a

           action        goto
  state    a     $       C    A    B
  0        s3            1    2
  1              acc
  2        s5                      4
  3        r2
  4              r1
  5              r3

Transitions of the automaton (start state 0, item sets I0 to I5 as on the previous slide):
  0 --C--> 1,  0 --A--> 2,  0 --a--> 3,  2 --B--> 4,  2 --a--> 5

SLR and Ambiguity
Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR (it may still be LR(1))
Consider for example the unambiguous grammar
  S → L = R | R
  L → * R | id
  R → L
FOLLOW(R) = {=, $}
I0: S’ → · S,  S → · L = R,  S → · R,  L → · * R,  L → · id,  R → · L
I1: S’ → S ·
I2: S → L · = R,  R → L ·
I3: S → R ·
I4: L → * · R,  R → · L,  L → · * R,  L → · id
I5: L → id ·
I6: S → L = · R,  R → · L,  L → · * R,  L → · id
I7: L → * R ·
I8: R → L ·
I9: S → L = R ·
In state 2 the table entry for = is defined twice:
  action[2, =] = s6   (from S → L · = R)
  action[2, =] = r5   (reduce R → L, because = ∈ FOLLOW(R))
This is a shift-reduce conflict, so the grammar has no SLR parsing table.
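The claim FOLLOW(R) = {=, $}, which causes the conflict in state 2, can be checked with a short fixed-point computation. The Python sketch below is specialized to grammars without ε-productions (true for this grammar, and an assumption that keeps FIRST simple); `first_sets` and `follow_sets` are hypothetical helper names used only here.

```python
# Augmented grammar S' → S, S → L = R | R, L → * R | id, R → L.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST for each nonterminal, assuming no ε-productions, so
    only the first symbol of each body matters."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(start="S'"):
    """FOLLOW for each nonterminal, iterated to a fixed point."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, x in enumerate(body):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(body):        # something follows x
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                        # x ends the body
                    new = follow[head]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


if __name__ == "__main__":
    print(sorted(follow_sets()["R"]))
```

Because = is in FOLLOW(R), the SLR rule places the reduction R → L under = in state 2, where the item S → L · = R already demands a shift.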

LR(1) Grammars
SLR is too simple
LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table
LR(1) item = LR(0) item + lookahead
  LR(0) item: [A → α · β]
  LR(1) item: [A → α · β, a]

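SLR Versus LR(1)
Split the SLR states by adding LR(1) lookahead
Unambiguous grammar:
  S → L = R | R
  L → * R | id
  R → L
SLR state I2:
  S → L · = R
  R → L ·
  action[2, =] = s6
The parser should not reduce by R → L on =, because no right-sentential form begins with R =
With LR(1) lookaheads the two items are kept apart:
  [S → L · = R, $]   shift on =
  [R → L ·, $]       reduce only on $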

LR(1) Items
An LR(1) item [A → α · β, a] contains a lookahead terminal a, meaning: α is already on top of the stack, and we expect to see βa
For items of the form [A → α ·, a] the lookahead a is used to reduce A → α only if the next input is a
For items of the form [A → α · β, a] with β ≠ ε the lookahead has no effect

The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α · B β, a] ∈ closure(I), then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → · γ, b] to closure(I) if not already there
3. Repeat 2 until no new items can be added

The Goto Operation for LR(1) Items
1. For each item [A → α · X β, a] ∈ I, add the set of items closure({[A → α X · β, a]}) to goto(I, X) if not already there
2. Repeat step 1 until no more items can be added to goto(I, X)

Example
The grammar G:
  S’ → S
  S → C C
  C → c C | d
Let I = { [S’ → · S, $] }
I0 = closure(I) = {
  S’ → · S, $
  S → · C C, $
  C → · c C, c/d
  C → · d, c/d
}
goto(I0, S) = closure({ [S’ → S ·, $] }) = { [S’ → S ·, $] } = I1
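The LR(1) closure differs from the LR(0) one only in how lookaheads are propagated through FIRST(βa). The following Python sketch computes I0 for the grammar above. It assumes a grammar without ε-productions, so FIRST(βa) is FIRST of the first symbol of β, or {a} when β is empty; items are `(head, body, dot, lookahead)` tuples, and the function names are hypothetical.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}


def first_of_symbol(x, seen=frozenset()):
    """FIRST(x) for a grammar without ε-productions."""
    if x not in GRAMMAR:          # terminal
        return {x}
    if x in seen:                 # guard against left recursion
        return set()
    out = set()
    for body in GRAMMAR[x]:
        out |= first_of_symbol(body[0], seen | {x})
    return out


def closure(items):
    """Closure of a set of LR(1) items (head, body, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        head, body, dot, look = work.pop()
        if dot == len(body) or body[dot] not in GRAMMAR:
            continue
        beta = body[dot + 1:]
        # FIRST(βa): first symbol of β if β is non-empty, else a.
        lookaheads = first_of_symbol(beta[0]) if beta else {look}
        for prod in GRAMMAR[body[dot]]:
            for b in lookaheads:
                item = (body[dot], prod, 0, b)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


if __name__ == "__main__":
    for item in sorted(closure({("S'", ("S",), 0, "$")})):
        print(item)
```

The result contains six tuples, because the slide's shorthand c/d stands for two separate items with the same core.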

Exercise
The grammar G:
  S’ → S
  S → C C
  C → c C | d
Let I = { [S → C · C, $] }
I2 = closure(I) = ?
I6 = goto(I2, c) = ?

Construction of the sets of LR(1) Items
1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → · S, $]}) } (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C

LR(1) Automaton

Construction of the Canonical LR(1) Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = { I0, I1, …, In } of sets of LR(1) items
3. State i of the parser is constructed from Ii:
   - If [A → α · a β, b] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
   - If [A → α ·, a] ∈ Ii, then set action[i, a] = reduce A → α (apply only if A ≠ S’)
   - If [S’ → S ·, $] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j
5. Repeat 3 until no more entries are added
6. The initial state i is the Ii holding item [S’ → · S, $]

Example
The grammar G:
  0. S’ → S
  1. S → C C
  2. C → c C
  3. C → d

           ACTION             GOTO
  state    c     d     $      S    C
  0        s3    s4           1    2
  1                    acc
  2        s6    s7                5
  3        s3    s4                8
  4        r3    r3
  5                    r1
  6        s6    s7                9
  7                    r3
  8        r2    r2
  9                    r2

Example Grammar and LR(1) Items
Unambiguous LR(1) grammar:
  S → L = R | R
  L → * R | id
  R → L
Augment with S’ → S
LR(1) items (next slide)

I0:  [S’ → · S, $]          goto(I0, S) = I1
     [S → · L = R, $]       goto(I0, L) = I2
     [S → · R, $]           goto(I0, R) = I3
     [L → · * R, =/$]       goto(I0, *) = I4
     [L → · id, =/$]        goto(I0, id) = I5
     [R → · L, $]           goto(I0, L) = I2
I1:  [S’ → S ·, $]
I2:  [S → L · = R, $]       goto(I2, =) = I6
     [R → L ·, $]
I3:  [S → R ·, $]
I4:  [L → * · R, =/$]       goto(I4, R) = I7
     [R → · L, =/$]         goto(I4, L) = I8
     [L → · * R, =/$]       goto(I4, *) = I4
     [L → · id, =/$]        goto(I4, id) = I5
I5:  [L → id ·, =/$]
I6:  [S → L = · R, $]       goto(I6, R) = I9
     [R → · L, $]           goto(I6, L) = I10
     [L → · * R, $]         goto(I6, *) = I11
     [L → · id, $]          goto(I6, id) = I12
I7:  [L → * R ·, =/$]
I8:  [R → L ·, =/$]
I9:  [S → L = R ·, $]
I10: [R → L ·, $]
I11: [L → * · R, $]         goto(I11, R) = I13
     [R → · L, $]           goto(I11, L) = I10
     [L → · * R, $]         goto(I11, *) = I11
     [L → · id, $]          goto(I11, id) = I12
I12: [L → id ·, $]
I13: [L → * R ·, $]

Example LR(1) Parsing Table
Grammar:
  1. S’ → S
  2. S → L = R
  3. S → R
  4. L → * R
  5. L → id
  6. R → L

           action                   goto
  state    id    *     =     $      S    L    R
  0        s5    s4                 1    2    3
  1                          acc
  2                    s6    r6
  3                          r3
  4        s5    s4                      8    7
  5                    r5    r5
  6        s12   s11                     10   9
  7                    r4    r4
  8                    r6    r6
  9                          r2
  10                         r6
  11       s12   s11                     10   13
  12                         r5
  13                         r4

LALR(1) Grammars
LR(1) parsing tables have many states
LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size
Less powerful than LR(1)
  - Will not introduce shift-reduce conflicts, because shifts do not use lookaheads
  - May introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages
SLR and LALR tables for a grammar always have the same number of states, which is no more than, and typically far fewer than, the number of states in the LR(1) table.
  - For a language like C: SLR and LALR > 100 states, LR(1) > 1000 states

Constructing LALR Parsing Tables
Two ways
  - Construct the LALR parsing table from the sets of LR(1) items
      Merge (union) the states that have the same core
      Requires much space and time
  - Construct the LALR parsing table from the sets of LR(0) items
      Efficient
      Used in practice

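Example
LALR(1) table (states 3 and 6, 4 and 7, 8 and 9 merged):

           ACTION             GOTO
  state    c     d     $      S    C
  0        s36   s47          1    2
  1                    acc
  2        s36   s47               5
  36       s36   s47               89
  47       r3    r3    r3
  5                    r1
  89       r2    r2    r2

LR(1) table:

           ACTION             GOTO
  state    c     d     $      S    C
  0        s3    s4           1    2
  1                    acc
  2        s6    s7                5
  3        s3    s4                8
  4        r3    r3
  5                    r1
  6        s6    s7                9
  7                    r3
  8        r2    r2
  9                    r2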

Constructing LALR(1) Parsing Tables
1. Construct the sets of LR(1) items
2. Combine the LR(1) sets whose items share the same first part (the core)
Example:
  I4:   [L → * · R, =]    [R → · L, =]    [L → · * R, =]    [L → · id, =]
  I11:  [L → * · R, $]    [R → · L, $]    [L → · * R, $]    [L → · id, $]
Combined:
  [L → * · R, =/$]    [R → · L, =/$]    [L → · * R, =/$]    [L → · id, =/$]
Here =/$ is shorthand for two items in the same set.
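The merging step can be sketched in a few lines of Python. The example below assumes LR(1) items are `(head, body, dot, lookahead)` tuples and uses the two item sets exactly as written on this slide as input; `core` and `merge_by_core` are hypothetical helper names, and the sketch covers only the merging of item sets, not the renumbering of goto transitions.

```python
from collections import defaultdict


def core(state):
    """The core of an LR(1) item set: its items without lookaheads."""
    return frozenset((h, b, d) for h, b, d, _ in state)


def merge_by_core(states):
    """Union all LR(1) item sets that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return list(groups.values())


def with_lookahead(look):
    """The four items of the slide's example set with one lookahead."""
    return {
        ("L", ("*", "R"), 1, look),
        ("R", ("L",), 0, look),
        ("L", ("*", "R"), 0, look),
        ("L", ("id",), 0, look),
    }


if __name__ == "__main__":
    i4, i11 = with_lookahead("="), with_lookahead("$")
    merged = merge_by_core([i4, i11])
    print(len(merged), "state with", len(merged[0]), "items")
```

Each merged item appears once per lookahead, so the four cores with lookaheads =/$ are stored as eight tuples.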

Example LALR(1) Grammar
Unambiguous LR(1) grammar:
  S → L = R | R
  L → * R | id
  R → L
Augment with S’ → S
LALR(1) items (next slide)

I0:  [S’ → · S, $]          goto(I0, S) = I1
     [S → · L = R, $]       goto(I0, L) = I2
     [S → · R, $]           goto(I0, R) = I3
     [L → · * R, =/$]       goto(I0, *) = I4
     [L → · id, =/$]        goto(I0, id) = I5
     [R → · L, $]           goto(I0, L) = I2
I1:  [S’ → S ·, $]
I2:  [S → L · = R, $]       goto(I2, =) = I6
     [R → L ·, $]
I3:  [S → R ·, $]
I4:  [L → * · R, =/$]       goto(I4, R) = I7
     [R → · L, =/$]         goto(I4, L) = I9
     [L → · * R, =/$]       goto(I4, *) = I4
     [L → · id, =/$]        goto(I4, id) = I5
I5:  [L → id ·, =/$]
I6:  [S → L = · R, $]       goto(I6, R) = I8
     [R → · L, $]           goto(I6, L) = I9
     [L → · * R, $]         goto(I6, *) = I4
     [L → · id, $]          goto(I6, id) = I5
I7:  [L → * R ·, =/$]
I8:  [S → L = R ·, $]
I9:  [R → L ·, =/$]
[R → L ·, =/$] is shorthand for the two items [R → L ·, =] and [R → L ·, $]

Example LALR(1) Parsing Table
Grammar:
  1. S’ → S
  2. S → L = R
  3. S → R
  4. L → * R
  5. L → id
  6. R → L

           action                   goto
  state    id    *     =     $      S    L    R
  0        s5    s4                 1    2    3
  1                          acc
  2                    s6    r6
  3                          r3
  4        s5    s4                      9    7
  5                    r5    r5
  6        s5    s4                      9    8
  7                    r4    r4
  8                          r2
  9                    r6    r6

LL, SLR, LR, LALR Summary
LL parse tables are computed using FIRST/FOLLOW
  - Nonterminals × terminals → productions
LR parsing tables are computed using closure/goto
  - LR states × terminals → shift/reduce actions
  - LR states × nonterminals → goto state transitions
A grammar is
  - LL(1) if its LL(1) parse table has no conflicts
  - SLR if its SLR parse table has no conflicts
  - LR(1) if its LR(1) parse table has no conflicts
  - LALR(1) if its LALR(1) parse table has no conflicts

LL, SLR, LR, LALR Grammars
Figure: containment of the grammar classes. LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), and LL(1) ⊂ LR(1).

YACC
  yacc specification yacc.y → Yacc or Bison compiler → y.tab.c
  y.tab.c → C compiler → a.out
  input stream → a.out → output stream
