Chapter 3-3 Chang Chi-Chung 2015.07.03. Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR 

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
 CS /11/12 Matthew Rodgers.  What are LL and LR parsers?  What grammars do they parse?  What is the difference between LL and LR?  Why do.
CH4.1 CSE244 SLR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road, Box U-155 Storrs,
Review: LR(k) parsers a1 … a2 … an $ LR parsing program Action goto Sm xm … s1 x1 s0 output input stack Parsing table.
1 May 22, May 22, 2015May 22, 2015May 22, 2015 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University Azusa Pacific University,
LR Parsing Table Costruction
Bhaskar Bagchi (11CS10058) Lecture Slides( 9 th Sept. 2013)
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
Pertemuan 12, 13, 14 Bottom-Up Parsing
Lecture #8, Feb. 7, 2007 Shift-reduce parsing,
Bottom Up Parsing.
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
LALR Parsing Canonical sets of LR(1) items
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
LESSON 24.
410/510 1 of 21 Week 2 – Lecture 1 Bottom Up (Shift reduce, LR parsing) SLR, LR(0) parsing SLR parsing table Compiler Construction.
Bottom-Up Parsing CS308 Compiler Theory1. 2 Bottom-Up Parsing A bottom-up parser creates the parse tree of the given input starting from leaves towards.
SLR PARSING TECHNIQUES Submitted By: Abhijeet Mohapatra 04CS1019.
LR(k) Parsing CPSC 388 Ellen Walker Hiram College.
lec04-bottomupparser April 23, 2017 Bottom-Up Parsing
CS 321 Programming Languages and Compilers Bottom Up Parsing.
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
1 Compiler Construction Syntax Analysis Top-down parsing.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
1 LR Parsers  The most powerful shift-reduce parsing (yet efficient) is: LR(k) parsing. LR(k) parsing. left to right right-most k lookhead scanning derivation.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Bottom-Up Parsing A bottom-up parser creates the parse tree of the given input starting from leaves.
Lesson 9 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
Syntax Analysis - LR(0) Parsing Compiler Design Lecture (02/04/98) Computer Science Rensselaer Polytechnic.
Announcements/Reading
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
LR Parser: LR parsing is a bottom up syntax analysis technique that can be applied to a large class of context free grammars. L is for left –to –right.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
Three kinds of bottom-up LR parser SLR “Simple LR” –most restrictions on eligible grammars –built quite directly from items as just shown LR “Canonical.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
Conflicts in Simple LR parsers A SLR Parser does not use any lookahead The SLR parsing method fails if knowing the stack’s top state and next input token.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
LR Parsing. LR Parsers The most powerful shift-reduce parsing (yet efficient) is: LR(k) parsing. left to right right-mostk lookhead scanning derivation(k.
S YNTAX A NALYSIS O R P ARSING Lecture 08. B OTTOM -U P P ARSING A bottom-up parser creates the parse tree of the given input starting from leaves towards.
Announcements/Reading
Lec04-bottomupparser 4/13/2018 LR Parsing.
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
lec04-bottomupparser June 6, 2018 Bottom-Up Parsing
UNIT - 3 SYNTAX ANALYSIS - II
Compiler Construction
Fall Compiler Principles Lecture 4: Parsing part 3
LALR Parsing Canonical sets of LR(1) items
Bottom-Up Syntax Analysis
Simple, efficient;limitated
Canonical LR Parsing Tables
Syntax Analysis Part II
Subject Name:COMPILER DESIGN Subject Code:10CS63
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Chapter 4. Syntax Analysis (2)
Compiler SLR Parser.
Chapter 4. Syntax Analysis (2)
Chap. 3 BOTTOM-UP PARSING
Presentation transcript:

Chapter 3-3 Chang Chi-Chung

Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR  最右推導  沒有右遞迴  明確性文法 Other special cases:  Shift-reduce parsing  Operator-precedence parsing

Bottom-Up Parsing E → T → T * F → T * id → F * id → id * id rightmost derivation reduction E → E + T | T T → T * F | F F → ( E ) | id

LR(0) An LR parser makes shift-reduce decisions by maintaining states to keep track. An item of a grammar G is a production of G with a dot at some position of the body.  Example A → X Y Z A → . X Y Z A → X . Y Z A → X Y . Z A → X Y Z . A → X . Y Z stacknext derivations with input strings items Note that production A   has one item [A  ]

LR(0) Canonical LR(0) Collection  One collection of sets of LR(0) items  Provide the basis for constructing a DFA that is used to make parsing decisions. LR(0) automation The canonical LR(0) collection for a grammar  Augmented the grammar If G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S  Closure function  Goto function

Shift-Reduce Parsing Shift-Reduce Parsing is a form of bottom-up parsing  A stack holds grammar symbols  An input buffer holds the rest of the string to parsed. Shift-Reduce parser action  shift  reduce  accept  error

Shift-Reduce Parsing Shift  Shift the next input symbol onto the top of the stack. Reduce  The right end of the string to be reduced must be at the top of the stack.  Locate the left end of the string within the stack and decide with what nonterminal to replace the string. Accept  Announce successful completion of parsing Error  Discover a syntax error and call recovery routine

Conflicts During Shift-Reduce Parsing Conflicts Type  shift-reduce  reduce-reduce Shift-reduce and reduce-reduce conflicts are caused by  The limitations of the LR parsing method (even when the grammar is unambiguous)  Ambiguity of the grammar

Use of the LR(0) Automaton

Function Closure If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I. Create closure(I) by the two rules:  add every item in I to closure(I)  If A →α . B β is in closure(I) and B →γ is a production, then add the item B → . γ to closure(I). Apply this rule untill no more new items can be added to closure(I). Divide all the sets of items into two classes  Kernel items initial item S’ → . S, and all items whose dots are not at the left end.  Nonkernel items All items with their dots at the left end, except for S’ → . S

Example The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id Let I = { E’ → . E }, then closure(I) = { E’ → . E E → . E + T E → . T T → . T * F T → . F F → . ( E ) F → . id }

Exercise The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id Let I = { E → E + . T }

Function Goto Function Goto(I, X)  I is a set of items  X is a grammar symbol  Goto(I, X) is defined to be the closure of the set of all items [ A  α X ‧ β] such that [ A  α ‧ Xβ] is in I.  Goto function is used to define the transitions in the LR(0) automation for a grammar.

Example I = { E’ → E . E → E . + T } Goto (I, +) = { E → E + . T T → . T * F T → . F F → . ( E ) F → . id } The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id

Constructing the LR(0) Collection 1.The grammar is augmented with a new start symbol S ’ and production S’  S 2.Initially, set C = closure({[S’ S]}) (this is the start state of the DFA) 3.For each set of items I  C and each grammar symbol X  (N  T) such that GOTO(I, X)  C and goto(I, X)  , add the set of items GOTO(I, X) to C 4.Repeat 3 until no more sets can be added to C

Model of an LR Parser

Structure of the LR Parsing Table Parsing Table consists of two parts:  A parsing-action function ACTION  A goto function GOTO The Action function, Action[i, a ], have one of four forms:  Shift j, where j is a state. The action taken by the parser shifts input a to the stack, but use state j to represent a.  Reduce A→β. The action of the parser reduces β on the top of the stack to head A.  Accept The parser accepts the input and finishes parsing.  Error

Structure of the LR Parsing Table(1) The GOTO Function, GOTO[Ii, A], defined on sets of items.  If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.

LR-Parser Configurations Configuration ( = LR parser state): ($s 0 s 1 s 2 … s m, a i a i+1 … a n $) stackinput ($ X 1 X 2 … X m, a i a i+1 … a n $) If action[s m, a i ] = shift s then push s (a i ), and advance input (s 0 s 1 s 2 … s m, a i a i+1 … a n $)  (s 0 s 1 s 2 … s m s, a i+1 … a n $) If action[s m, a i ] = reduce A   and goto[s m-r, A] = s with r = |  | then pop r symbols, and push s ( push A ) ( ( s 0 s 1 s 2 … s m, a i a i+1 … a n $)  (s 0 s 1 s 2 … s m-r s, a i a i+1 … a n $) If action[s m, a i ] = accept then stop If action[s m, a i ] = error then attempt recovery

Example LR Parse Table Grammar: 1. E  E + T 2. E  T 3. T  T * F 4. T  F 5. F  ( E ) 6. F  id s5s4 s6acc r2s7r2 r4 s5s4 r6 s5s4 s5s4 s6s11 r1s7r1 r3 r5 id+*()$ ETF shift & goto 5 reduce by production #1 actiongoto state

LineSTACKSYMBOLSINPUTACTION (1)0id * id + id $shift 5 (2)0 5id* id + id $reduce 6 goto 3 (3)0 3F* id + id $reduce 4 goto 2 (4)0 2T* id + id $shift 7 (5)0 2 7T *id + id $shift 5 (6) T * id+ id $reduce 6 goto 10 (7) T * F+ id $reduce 3 goto 2 (8)0 2T+ id $reduce 2 goto 1 (9)0 1E+ id $shift 6 (10)0 1 6E + id $shift 5 (11) E + id$reduce 6 goto 3 (12) E + F$reduce 4 goto 9 (13) E + T$reduce 1 goto 1 (14)0 1$ E$accept Grammar 0. S  E 1. E  E + T 2. E  T 3. T  T * F 4. T  F 5. F  ( E ) 6. F  id Example

SLR Grammars SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing SLR eliminates some conflicts by populating the parsing table with reductions A  on symbols in FOLLOW(A) S  E E  id + E E  id State I 0 : S  E E  id + E E  id State I 2 : E  id+ E E  id goto(I 0,id)goto(I 3,+) FOLLOW(E)={$} thus reduce on $ Shift on +

SLR Parsing Table Reductions do not fill entire rows Otherwise the same as LR(0) s2 acc s3r3 s2 r2 id+$ E S  E 2. E  id + E 3. E  id FOLLOW(E)={$} thus reduce on $ Shift on +

Constructing SLR Parsing Tables Augment the grammar with S’  S Construct the set C ={ I 0, I 1, …, I n } of LR(0) items State i is constructed from Ii If [ A a  ]  I i and goto(I i, a)=I j then set action[i, a]=shift j If [ A  ]  I i then set action[i,a]=reduce A  for all a  FOLLOW(A) (apply only if A  S ’ ) If [ S’  S ] is in I i then set action[i,$]=accept If goto(I i, A)=I j then set goto[i, A]=j set goto table Repeat 3-4 until no more entries added The initial state i is the I i holding item [ S’ S ]

Example SLR Grammar and LR(0) Items Augmented grammar: 0. C’  C 1. C  A B 2. A  a 3. B  a State I 0 : C’  C C  A B A  a State I 1 : C’  C State I 2 : C  AB B  a State I 3 : A  a State I 4 : C  A B State I 5 : B  a goto(I 0,C) goto(I 0,a) goto(I 0,A) goto(I 2,a) goto(I 2,B) I 0 = closure({[C’  C]}) I 1 = goto(I 0,C) = closure({[C’  C]}) … start final

Example SLR Parsing Table s3 acc s5 r2 r1 r3 a$ CAB 12 4 State I 0 : C’  C C  A B A  a State I 1 : C’  C State I 2 : C  AB B  a State I 3 : A  a State I 4 : C  A B State I 5 : B  a start a A C B a Grammar: 0. C’  C 1. C  A B 2. A  a 3. B  a

SLR and Ambiguity Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR, maybe LR(1) Consider for example the unambiguous grammar S  L = R | R L  * R | id R  LFOLLOW(R) = {=, $} I 0 : S’  S S  L=R S  R L  *R L  id R  L I 1 : S’  S I 2 : S  L=R R  L I 3 : S  R I 4 : L  *R R  L L  *R L  id I 5 : L  id I 6 : S  L=R R  L L  *R L  id I 7 : L  *R I 8 : R  L I 9 : S  L=R action[2,=]=s6 action[2,=]=r5 no Has no SLR parsing table

LR(1) Grammars SLR too simple LR(1) parsing uses lookahead to avoid unnecessary conflicts in parsing table LR(1) item = LR(0) item + lookahead LR(0) item [A   ] LR(1) item [A  , a]

SLR Versus LR(1) Split the SLR states by adding LR(1) lookahead Unambiguous grammar S  L = R | R L  * R | id R  L I 2 : S  L=R R  L action[2,=]=s6 Should not reduce, because no right-sentential form begins with R= split R  LS  L=R

LR(1) Items An LR(1) item [ A  , a ] contains a lookahead terminal a, meaning  already on top of the stack, expect to see  a For items of the form [ A , a ] the lookahead a is used to reduce A  only if the next input is a For items of the form [ A  , a ] with  the lookahead has no effect

The Closure Operation for LR(1) Items Start with closure(I) = I If [ A B , a ]  closure(I) then for each production B  in the grammar and each terminal b  FIRST(  a) add the item [ B  , b ] to I if not already in I Repeat 2 until no new items can be added

The Goto Operation for LR(1) Items For each item [ A X , a ]  I, add the set of items closure({[A  X , a]}) to goto(I,X) if not already there Repeat step 1 until no more items can be added to goto(I,X)

Example Let I = { (S’ → S, $) } I 0 = closure(I) = { S’ → S, $ S → C C, $ C → c C, c/d C → d, c/d } goto(I 0, S) = closure( {S’ → S, $ } ) = {S’ → S, $ } = I 1 The grammar G S’ → S S → C C C → c C | d

Exercise Let I = { (S → C C, $) } I 2 = closure(I) = ? I 3 = goto(I 2, c) = ? The grammar G S’ → S S → C C C → c C | d

Construction of the sets of LR(1) Items Augment the grammar with a new start symbol S’ and production S’  S Initially, set C = closure({[S’ S, $]}) (this is the start state of the DFA) For each set of items I  C and each grammar symbol X  (N  T) such that goto(I, X)  C and goto(I, X)  , add the set of items goto(I, X) to C Repeat 3 until no more sets can be added to C

LR(1) Automation

Construction of the Canonical LR(1) Parsing Tables Augment the grammar with S’  S Construct the set C={I 0,I 1,…,I n } of LR(1) items State i of the parser is constructed from I i  If [ A a , b ]  I i and goto(I i,a)=I j then set action[i,a]=shift j  If [ A , a ]  I i then set action[i,a]=reduce A  (apply only if A  S’ )  If [ S’  S, $ ] is in I i then set action[i,$]=accept If goto(I i,A)=I j then set goto[i,A]=j Repeat 3 until no more entries added The initial state i is the I i holding item [S ’  S,$]

Example The grammar G S’ → S S → C C C → c C | d state ACTIONGOTO cd$SC 0s3s412 1acc 2s6s75 3s3s48 4r3 5r1 6s6s79 7r3 8r2 9

Example Grammar and LR(1) Items Unambiguous LR(1) grammar: S  L = R | R L  * R | id R  L Augment with S’  S LR(1) items (next slide)

[S’  S,$] goto(I 0,S)=I 1 [S  L=R,$] goto(I 0,L)=I 2 [S  R,$] goto(I 0,R)=I 3 [L  *R,=/$] goto(I 0,*)=I 4 [L  id,=/$] goto(I 0,id)=I 5 [R  L,$] goto(I 0,L)=I 2 [S’  S,$] [S  L=R,$] goto(I 0,=)=I 6 [R  L,$] [S  R,$] [L  *R,=/$] goto(I 4,R)=I 7 [R  L,=/$] goto(I 4,L)=I 8 [L  *R,=/$] goto(I 4,*)=I 4 [L  id,=/$] goto(I 4,id)=I 5 [L  id,=/$] [S  L=R,$] goto(I 6,R)=I 9 [R  L,$] goto(I 6,L)=I 10 [L  *R,$] goto(I 6,*)=I 11 [L  id,$] goto(I 6,id)=I 12 [L  *R,=/$] [R  L,=/$] [S  L=R,$] [R  L,$] [L  *R,$] goto(I 11,R)=I 13 [R  L,$] goto(I 11,L)=I 10 [L  *R,$] goto(I 11,*)=I 11 [L  id,$] goto(I 11,id)=I 12 [L  id,$] [L  *R,$] I0:I0: I1:I1: I2:I2: I3:I3: I4:I4: I5:I5: I6:I6: I7:I7: I8:I8: I9:I9: I 10 : I 12 : I 11 : I 13 :

Example LR(1) Parsing Table s5s4 acc s6r6 r3 s5s4 r5 s12s11 r4 r6 r2 r6 s12s11 r5 r4 id*=$ SLR Grammar: 1. S’  S 2. S  L = R 3. S  R 4. L  * R 5. L  id 6. R  L

LALR(1) Grammars LR(1) parsing tables have many states LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size Less powerful than LR(1)  Will not introduce shift-reduce conflicts, because shifts do not use lookaheads  May introduce reduce-reduce conflicts, but seldom do so for grammars of programming languages SLR and LALR tables for a grammar always have the same number of states, and less than LR(1) tables.  Like C, SLR and LALR >100, LR(1) > 1000

Constructing LALR Parsing Tables Two ways  Construction of the LALR parsing table from the sets of LR(1) items. Union the states Requires much space and time  Construction of the LALR parsing table from the sets of LR(0) items Efficient Use in practice.

Example state ACTIONGOTO cd$SC 0s36s4712 1acc 2s36s475 36s36s r3 5r1 89r2 state ACTIONGOTO cd$SC 0s3s412 1acc 2s6s75 3s3s48 4r3 5r1 6s6s79 7r3 8r2 9 LALR(1) LR(1)

Constructing LALR(1) Parsing Tables Construct sets of LR(1) items Combine LR(1) sets with sets of items that share the same first part [L  *R,=] [R  L,=] [L  *R,=] [L  id,=] [L  *R,$] [R  L,$] [L  *R,$] [L  id,$] I4:I4: I 11 : [L  *R,=/$] [R  L,=/$] [L  *R,=/$] [L  id,=/$] Shorthand for two items in the same set

Example LALR(1) Grammar Unambiguous LR(1) grammar: S  L = R | R L  * R | id R  L Augment with S’  S LALR(1) items (next slide)

[S’  S,$] goto(I 0,S)=I 1 [S  L=R,$] goto(I 0,L)=I 2 [S  R,$] goto(I 0,R)=I 3 [L  *R,=/$] goto(I 0,*)=I 4 [L  id,=/$] goto(I 0,id)=I 5 [R  L,$] goto(I 0,L)=I 2 [S’  S,$] [S  L=R,$] goto(I 0,=)=I 6 [R  L,$] [S  R,$] [L  *R,=/$] goto(I 4,R)=I 7 [R  L,=/$] goto(I 4,L)=I 9 [L  *R,=/$] goto(I 4,*)=I 4 [L  id,=/$] goto(I 4,id)=I 5 [L  id,=/$] [S  L=R,$] goto(I 6,R)=I 8 [R  L,$] goto(I 6,L)=I 9 [L  *R,$] goto(I 6,*)=I 4 [L  id,$] goto(I 6,id)=I 5 [L  *R,=/$] [S  L=R,$] [R  L,=/$] I0:I0: I1:I1: I2:I2: I3:I3: I4:I4: I5:I5: I6:I6: I7:I7: I8:I8: I9:I9: Shorthand for two items [R  L,=] [R  L,$]

Example LALR(1) Parsing Table s5s4 acc s6r6 r3 s5s4 r5 s5s4 r4 r2 r6 id*=$ SLR Grammar: 1. S’  S 2. S  L = R 3. S  R 4. L  * R 5. L  id 6. R  L

LL, SLR, LR, LALR Summary LL parse tables computed using FIRST/FOLLOW  Nonterminals  terminals  productions  Computed using FIRST/FOLLOW LR parsing tables computed using closure/goto  LR states  terminals  shift/reduce actions  LR states  nonterminals  goto state transitions A grammar is  LL(1) if its LL(1) parse table has no conflicts  SLR if its SLR parse table has no conflicts  LR(1) if its LR(1) parse table has no conflicts  LALR(1) if its LALR(1) parse table has no conflicts

LL, SLR, LR, LALR Grammars LL(1) LR(1) LR(0) SLR LALR(1)

YACC Yacc or Bison compiler yacc specification yacc.y y.tab.c input stream C compiler a.out output stream y.tab.c a.out