More SLR / LR(1)
Professor Yihjia Tsai, Tamkang University

2 Remember: Computing FIRST(N)
- If N → ε, FIRST(N) includes ε
- If N → aABC, FIRST(N) includes a
- If N → X1X2..., FIRST(N) includes FIRST(X1)
- If N → X1X2... and X1 ⇒* ε, FIRST(N) includes FIRST(X2)
- Obvious generalization to FIRST(α), where α is X1X2...

3 Computing FOLLOW(N)
FOLLOW(N) is computed from the productions in which N appears on the rhs.
- For the sentence symbol S, FOLLOW(S) includes $
- If A → α N β, FOLLOW(N) includes FIRST(β), apart from ε
  - because an expansion of N will be followed by an expansion from β
- If A → α N, FOLLOW(N) includes FOLLOW(A)
  - because N will be expanded in the context in which A is expanded
- If A → α N B and B ⇒* ε, FOLLOW(N) includes FOLLOW(A)
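The FIRST and FOLLOW rules above can be run as a fixed-point iteration. The following is only a minimal sketch, not the lecturer's code: the function names and the dictionary encoding of the grammar are assumptions made for illustration. The grammar is the one used later in the shift/reduce conflict example (S → A b | d c | b A c, A → d), for which the slides state Follow(S) = {$} and Follow(A) = {b, c}.

```python
from collections import defaultdict

EPS = "ε"   # marker for the empty string
END = "$"   # end-of-input marker

def first_of_seq(seq, first, nonterminals):
    """FIRST of a symbol string X1 X2 ... (the 'obvious generalization')."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)            # every symbol was nullable, or seq is empty
    return out

def compute_first(grammar):
    nonterminals = set(grammar)
    first = defaultdict(set)
    changed = True
    while changed:          # repeat until no set grows any more
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_seq(rhs, first, nonterminals)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def compute_follow(grammar, start, first):
    nonterminals = set(grammar)
    follow = defaultdict(set)
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in nonterminals:
                        continue
                    tail = first_of_seq(rhs[i + 1:], first, nonterminals)
                    new = tail - {EPS}
                    if EPS in tail:     # the rest can vanish: inherit FOLLOW(lhs)
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

# Hypothetical encoding: each alternative is a tuple of symbols.
GRAMMAR = {
    "S": [("A", "b"), ("d", "c"), ("b", "A", "c")],
    "A": [("d",)],
}
first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, "S", first)
print(sorted(follow["A"]))   # ['b', 'c']
```

An ε-production would be written as an empty tuple in this encoding; the example grammar has none.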

4 Recall our Example
A grammar to generate all palindromes over Σ = { a, b } with a center marker c:
1) S --> P
2) P --> a P a
3) P --> b P b
4) P --> c
LR parsers work with an augmented grammar in which the start symbol never appears on the right side of a production. Here the original grammar was rules 2-4.

5 Computing the Items
S0: S --> .P, P --> .aPa, P --> .bPb, P --> .c
S1: S --> P.
S2: P --> a.Pa, P --> .aPa, P --> .bPb, P --> .c
S3: P --> b.Pb, P --> .aPa, P --> .bPb, P --> .c
S4: P --> c.
S5: P --> aP.a
S6: P --> bP.b
S7: P --> aPa.
S8: P --> bPb.
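The item sets above come from repeatedly applying closure and goto. As a rough sketch (the tuple encoding of items and all function names are assumptions for illustration), the canonical collection for the augmented palindrome grammar can be computed as follows. With symbols explored in sorted order, the discovery order happens to reproduce the S0 to S8 numbering used on the slide.

```python
# Augmented palindrome grammar; each rule is (lhs, rhs).
RULES = [
    ("S", ("P",)),
    ("P", ("a", "P", "a")),
    ("P", ("b", "P", "b")),
    ("P", ("c",)),
]
NONTERMINALS = {lhs for lhs, _ in RULES}

def closure(items):
    """Items are (rule_index, dot_position) pairs."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = RULES[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # Dot is before a non-terminal: add all its rules with the dot at 0.
            for i, (lhs, _) in enumerate(RULES):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(r, d + 1) for r, d in state
             if d < len(RULES[r][1]) and RULES[r][1][d] == symbol}
    return closure(moved) if moved else None

def canonical_collection():
    states, edges = [closure({(0, 0)})], {}
    for state in states:                    # the list grows while we iterate
        symbols = {RULES[r][1][d] for r, d in state if d < len(RULES[r][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges

def show(item):
    lhs, rhs = RULES[item[0]]
    return f"{lhs} --> {''.join(rhs[:item[1]])}.{''.join(rhs[item[1]:])}"

states, edges = canonical_collection()
for n, state in enumerate(states):
    print(f"S{n}:", ", ".join(show(i) for i in sorted(state)))
```

The `edges` dictionary holds the transitions of the finite state machine, keyed by (state number, symbol).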

6 Finite State Machine
Draw the FSA. The major difference is that transitions can be labeled with both terminal and non-terminal symbols. The Goto and Action parts of the parsing table come from the FSA.

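FSA
States:
I0: S -> .P, P -> .aPa, P -> .bPb, P -> .c
I1: S -> P.
I2: P -> a.Pa, P -> .aPa, P -> .bPb, P -> .c
I3: P -> b.Pb, P -> .aPa, P -> .bPb, P -> .c
I4: P -> c.
I5: P -> aP.a
I6: P -> bP.b
I7: P -> aPa.
I8: P -> bPb.
Transitions:
I0 -P-> I1, I0 -a-> I2, I0 -b-> I3, I0 -c-> I4
I2 -P-> I5, I2 -a-> I2, I2 -b-> I3, I2 -c-> I4
I3 -P-> I6, I3 -a-> I2, I3 -b-> I3, I3 -c-> I4
I5 -a-> I7
I6 -b-> I8
Productions: 1) P -> aPa 2) P -> bPb 3) P -> c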

Parsing Table
state   a    b    c    $    P
0       S2   S3   S4        1
1                      acc
2       S2   S3   S4        5
3       S2   S3   S4        6
4       R3   R3   R3   R3
5       S7
6            S8
7       R1   R1   R1   R1
8       R2   R2   R2   R2

9 Parsing Table Contd
Si means shift the input symbol and go to state i. Rj means reduce by the jth production. Note that we are not storing all the items of a state in the table.
Example: abcba$. If we go through the parsing algorithm, we get:

10 Example Contd
Stack          Input     Action
$0             abcba$    shift
$0a2           bcba$     shift
$0a2b3         cba$      shift
$0a2b3c4       ba$       reduce by 3 (P -> c)
$0a2b3P6       ba$       shift
$0a2b3P6b8     a$        reduce by 2 (P -> bPb)
$0a2P5         a$        shift
$0a2P5a7       $         reduce by 1 (P -> aPa)
$0P1           $         accept
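The trace above is what a generic table-driven driver produces. Here is a small sketch of such a driver for the palindrome table, assuming the LR(0) convention that a reduce entry fills every terminal column of its row. The dictionary layout and the `parse` function are illustrative choices, not something the slides prescribe. The stack holds states only, since the grammar symbols shown in the trace are redundant.

```python
# Productions numbered as in the table: 1) P -> aPa  2) P -> bPb  3) P -> c
PRODUCTIONS = {1: ("P", 3), 2: ("P", 3), 3: ("P", 1)}   # rule -> (lhs, rhs length)

ACTION = {
    0: {"a": "s2", "b": "s3", "c": "s4"},
    1: {"$": "acc"},
    2: {"a": "s2", "b": "s3", "c": "s4"},
    3: {"a": "s2", "b": "s3", "c": "s4"},
    4: dict.fromkeys("abc$", "r3"),
    5: {"a": "s7"},
    6: {"b": "s8"},
    7: dict.fromkeys("abc$", "r1"),
    8: dict.fromkeys("abc$", "r2"),
}
GOTO = {0: {"P": 1}, 2: {"P": 5}, 3: {"P": 6}}

def parse(text):
    """Return (accepted, list of actions taken)."""
    tokens = list(text) + ["$"]
    stack = [0]              # stack of states
    pos = 0
    trace = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:                   # empty table entry: syntax error
            return False, trace
        if action == "acc":
            trace.append("accept")
            return True, trace
        if action[0] == "s":                 # shift: push state, consume token
            stack.append(int(action[1:]))
            pos += 1
            trace.append("shift")
        else:                                # reduce: pop |rhs| states, then goto
            lhs, length = PRODUCTIONS[int(action[1:])]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            trace.append(f"reduce {action[1:]}")

accepted, steps = parse("abcba")
print(accepted, steps)
```

For the input abcba the action sequence matches the Action column of the trace: three shifts, reduce 3, shift, reduce 2, shift, reduce 1, accept.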

11 LR(0) Summary
- LR(0) state: set of LR(0) items
- LR(0) item: a production with a dot in the RHS
- Compute LR(0) states and build the DFA
  - Use the closure operation to compute states
  - Use the goto operation to compute transitions
- Build the LR(0) parsing table from the DFA
- Use the LR(0) parsing table to determine whether to shift or reduce

12 LR(0) Limitations
- An LR(0) machine only works if every state with a reduce action has that single reduce action and nothing else
- With a more complex grammar, the construction gives states with shift/reduce or reduce/reduce conflicts
- Need to use lookahead to choose

13 A Non-LR(0) Grammar
Grammar for addition of numbers
- S → S + E | E
- E → num
Left-associative version is LR(0)
Right-associative is not LR(0)
- S → E + S | E
- E → num

14 Shift/Reduce Conflicts
An LR(0) state contains a conflict if its canonical set has two items that recommend conflicting actions.
- shift/reduce conflict: one item prompts a shift action, the other prompts a reduce action.
- reduce/reduce conflict: two items prompt reduce actions by different productions.
A grammar is said to be an LR(0) grammar if the table does not have any conflicts.

15 Shift/Reduce Conflict
Grammar:
S' -> S
S -> A b | d c | b A c
A -> d
A very simple language = {db, dc, bdc}
Follow(S) = {$}, Follow(A) = {b, c}
Part of the SLR(1) parser:
I0: S' -> .S, S -> .A b, S -> .d c, S -> .b A c, A -> .d
I1 (from I0 on d): S -> d.c, A -> d.
But since c is in Follow(A), we don't know whether to shift or reduce in I1.
D1: S' => S => dc
D2: S' => S => bAc => bdc

16 Reduce/Reduce Conflict
Grammar:
S' -> S
S -> b A e | b B d | A c
A -> d
B -> E c
E -> d
Derivations:
S' => S => Ac => dc
S' => S => bBd => bEcd => bdcd
S' => S => bAe => bde
I0: S' -> .S, S -> .b A e, S -> .b B d, S -> .A c, A -> .d
I1 (from I0 on b): S -> b.A e, S -> b.B d, A -> .d, B -> .E c, E -> .d
I2 (from I1 on d): A -> d., E -> d.
Which reduction should be taken? There is not enough context to decide!

17 SLR(1) Grammar
An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G. If a grammar G has a conflict-free SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short). Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.

18 SLR Summary
- Uses a DFA to recognize viable prefixes of grammar G
- Each state in the DFA:
  - is the set of LR(0) items valid for a viable prefix
  - "encodes" information about the symbols that have been shifted onto the stack
- Valid LR(0) items are computed by applying the closure and goto functions to the initial, valid item [S' -> .S] (the resulting sets are called the canonical collection of LR(0) items)
- Uses FOLLOW to disambiguate actions

19 SLR(1) Summary
1. If [A -> α.aβ] is in Ik and goto(Ik, a) = Ij, then set actions[k,a] to sj
2. If [A -> α.] is in Ik, then set actions[k,b] to reduce by A -> α (its rule number), for all b in FOLLOW(A)
3. If [S' -> S.] is in Ik, then set actions[k,$] to accept
Rules 1-3 may define conflicting actions for an entry in the actions table. In this case, the grammar is not SLR(1).
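Rules 1-3 translate almost directly into code. The sketch below is illustrative only: `slr_conflicts` and its data layout are assumed names, and the FOLLOW sets are passed in by hand rather than computed. It fills the action table and reports every entry that receives more than one action. It is tried on the two small grammars that appear in the SLR(1) and non-SLR(1) examples later in the deck. States are numbered in discovery order, which need not match the numbering on the slides; reductions are printed with the slides' 1-based rule numbers.

```python
from collections import defaultdict

def slr_conflicts(rules, follow):
    """rules[0] must be the augmented rule S' -> S. Returns conflicting entries."""
    nonterminals = {lhs for lhs, _ in rules}

    def closure(items):
        result, work = set(items), list(items)
        while work:
            r, d = work.pop()
            rhs = rules[r][1]
            if d < len(rhs) and rhs[d] in nonterminals:
                for i, (lhs, _) in enumerate(rules):
                    if lhs == rhs[d] and (i, 0) not in result:
                        result.add((i, 0))
                        work.append((i, 0))
        return frozenset(result)

    states = [closure({(0, 0)})]
    actions = defaultdict(set)              # (state, terminal) -> set of actions
    for k, state in enumerate(states):      # states grows during the loop
        for r, d in sorted(state):
            lhs, rhs = rules[r]
            if d < len(rhs):                # dot is before some symbol
                sym = rhs[d]
                target = closure({(r2, d2 + 1) for r2, d2 in state
                                  if d2 < len(rules[r2][1])
                                  and rules[r2][1][d2] == sym})
                if target not in states:
                    states.append(target)
                if sym not in nonterminals:             # rule 1: shift
                    actions[k, sym].add(f"s{states.index(target)}")
            elif r == 0:                                # rule 3: accept
                actions[k, "$"].add("acc")
            else:                                       # rule 2: reduce on FOLLOW
                for b in follow[lhs]:
                    actions[k, b].add(f"r{r + 1}")
    return sorted((k, t, sorted(a)) for (k, t), a in actions.items() if len(a) > 1)

SLR_OK = [("S'", ("S",)), ("S", ("d", "c", "a")),
          ("S", ("d", "A", "b")), ("A", ("c",))]
NOT_SLR = [("S'", ("S",)), ("S", ("d", "c", "a")), ("S", ("d", "A", "b")),
           ("S", ("A", "a")), ("A", ("c",))]

print(slr_conflicts(SLR_OK, {"S": {"$"}, "A": {"b"}}))
print(slr_conflicts(NOT_SLR, {"S": {"$"}, "A": {"a", "b"}}))
```

The first call reports no conflicts; the second reports a single shift/reduce clash on the terminal a, in the state that contains both S → dc.a and A → c.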

20 LR(0) Limitations
- An LR(0) machine only works if states with reduce actions have a single reduce action
- With a more complex grammar, construction gives states with shift/reduce or reduce/reduce conflicts
- Need to use lookahead to choose
[Figure: three example states, labeled OK, shift/reduce and reduce/reduce, built from the items L → L , S .   S → S . , L   L → S , L .   L → S .]

21 A Non-LR(0) Grammar
Grammar for addition of numbers
- S → S + E | E
- E → num
Left-associative version is LR(0)
Right-associative is not LR(0)
- S → E + S | E
- E → num

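22 LR(0) Parsing Table
Grammar: S → E + S | E, E → num (augmented with S' → S $)
States of the DFA:
1: S' → .S $, S → .E + S, S → .E, E → .num
2: S → E. + S, S → E.
3: S → E + .S, S → .E + S, S → .E, E → .num
4: E → num.
5: S → E + S.
6: S' → S. $
7: S' → S $.
Transitions: 1 -E-> 2, 1 -num-> 4, 1 -S-> 6, 2 -+-> 3, 3 -E-> 2, 3 -num-> 4, 3 -S-> 5, 6 -$-> 7
Table rows for states 1 and 2:
state  num     +            $       E    S
1      s4                           g2   g6
2      S → E   s3 / S → E   S → E
Shift or reduce in state 2?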

23 Solve Conflict With Lookahead
3 popular techniques for employing lookahead of 1 symbol with bottom-up parsing
- SLR: Simple LR
- LALR: LookAhead LR
- LR(1)
Each has a different means of utilizing the lookahead
- Results in different processing capabilities

24 SLR Parsing
SLR parsing = easy extension of LR(0)
- For each reduction X → β, look at the next symbol C
- Apply the reduction only if C is in FOLLOW(X)
SLR parsing table eliminates some conflicts
- Same as LR(0) table except reduction rows
- Adds reductions X → β only in the columns of symbols in FOLLOW(X)
Example: FOLLOW(S) = {$}
state  num  +    $       E    S
1      s4                g2   g6
2           s3   S → E

25 SLR Parsing Table
Reductions do not fill entire rows as before. Otherwise, same as LR(0).
Grammar: S → E + S | E, E → num
state  num  +         $           E    S
1      s4                         g2   g6
2           s3        S → E
3      s4                         g2   g5
4           E → num   E → num
5                     S → E + S
6                     s7
7                     accept

26 Class Problem
Consider:
S → L = R
S → R
L → *R
L → ident
R → L
Think of L as l-value, R as r-value, and * as a pointer dereference.
When you create the states in the SLR(1) DFA, 2 of the states are the following:
- { S → L. = R, R → L. }
- { S → R. }
Do you have any shift/reduce conflicts?

Another SLR(1) Example
Grammar:
1. S' → S
2. S → dca
3. S → dAb
4. A → c
States:
S0: S' → ● S, S → ● dca, S → ● dAb
S1: S' → S ●
S2: S → d ● ca, S → d ● Ab, A → ● c
S3: S → dc ● a, A → c ●
S4: S → dA ● b
S5: S → dca ●
S6: S → dAb ●
Transitions: S0 -S-> S1, S0 -d-> S2, S2 -c-> S3, S2 -A-> S4, S3 -a-> S5, S4 -b-> S6
        Action                    Goto
state   a    b    c    d    $     S    A
S0                     S2         1
S1                          acc
S2                S3                   4
S3      S5   R4
S4           S6
S5                          R2
S6                          R3
In S3 there is a potential shift/reduce conflict: the action could be R4 or a shift. By looking at the Follow set of A, the conflict is removed.

Non-SLR(1) example
Grammar:
1. S' → S
2. S → dca
3. S → dAb
4. S → Aa
5. A → c
States:
S0: S' → ● S, S → ● dca, S → ● dAb, S → ● Aa, A → ● c
S1: S' → S ●
S2: S → d ● ca, S → d ● Ab, A → ● c
S3: S → dc ● a, A → c ●
S4: S → dA ● b
S5: S → dca ●
S6: S → dAb ●
S7: S → A ● a
S8: S → Aa ●
S9: A → c ●
Transitions: S0 -S-> S1, S0 -d-> S2, S0 -A-> S7, S0 -c-> S9, S2 -c-> S3, S2 -A-> S4, S3 -a-> S5, S4 -b-> S6, S7 -a-> S8
S3 has a shift/reduce conflict. By looking at Follow(A), both a and b are in the follow set. So under column a we still don't know whether to reduce or shift.

29 The conflict SLR parsing table
        Action                        Goto
state   a       b    c    d    $      S    A
S0                   S9   S2          1    7
S1                             acc
S2                   S3                    4
S3      S5/R5   R5
S4              S6
S5                             R2
S6                             R3
S7      S8
S8                             R4
S9      R5      R5
Follow(A) = {a, b}

30 LR(1)
- Solution: keep more information about context
- Namely, keep track of what the next input symbol can be as part of the DFA state
- Idea: keep an input look-ahead as part of each item; these are called LR(1) items
- The look-aheads of an item for non-terminal A are always a subset of Follow(A) (may not be a proper subset)
- Can give rise to larger parsers (i.e. many more states) than SLR, but recognizes a greater number of constructs

31 LR(k) Items
The table construction algorithm for an LR(k) parser uses LR(k) items to represent the set of possible states in a parse.
An LR(k) item is a pair [α, β], where
- α is a production from G with a "." at some position in the rhs
- β is a look-ahead string containing k symbols (k is typically 1) that are terminals or $
Example LR(1) item: [A -> X . Y Z, a]

32 LR(k) Items
What's the point of the look-ahead symbols?
- Carry them along to allow us to choose the correct reduction when there is any choice
- Look-ahead symbols are bookkeeping unless the item is reducing (i.e. has a "." at the right end)
  - [A -> X . Y Z, a]: no use
  - [A -> X Y Z ., a]: use to guide reduction
The point: for [A -> α., a] and [B -> α., b], we can decide between reducing to A or to B by looking at limited right context.

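33 LR(1) DFA Construction
Grammar: S' → S $, S → E + S | E, E → num
States (each item is followed by its look-aheads):
1: S' → .S, $; S → .E + S, $; S → .E, $; E → .num, +,$
2: S → E. + S, $; S → E., $
3: S → E + .S, $; S → .E + S, $; S → .E, $; E → .num, +,$
4: E → num., +,$
5: S → E + S., $
6: S' → S., $
Transitions: 1 -E-> 2, 1 -num-> 4, 1 -S-> 6, 2 -+-> 3, 3 -E-> 2, 3 -num-> 4, 3 -S-> 5
If S' = goto(S, x), then add an edge labeled x from S to S' (here S and S' stand for DFA states).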

34 LR(1) Reductions
[Figure: the LR(1) DFA for S' → S $, S → E + S | E, E → num, with the completed items E → num., +,$ and S → E., $ marked]
Reductions correspond to LR(1) items of the form (X → γ., y).

35 LR(1) Parsing Table Construction
Same as construction of LR(0), except for reductions
- For a transition S → S' on terminal x: Table[S,x] += Shift(S')
- For a transition S → S' on non-terminal N: Table[S,N] += Goto(S')
- If I contains {(X → γ., y)} then: Table[I,y] += Reduce(X → γ)
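The three table-filling rules can be sketched in a few dozen lines. This is a simplified illustration, not a production generator: it assumes the grammar has no ε-productions (so the look-ahead of a closure item is just FIRST of the next symbol, or the inherited look-ahead), and `lr1_conflicts` together with its tuple encoding of items is a made-up name and layout. It is applied to the five-rule grammar from the non-SLR(1) example, and to the ambiguous E → E + E grammar for contrast.

```python
from collections import defaultdict

def lr1_conflicts(rules):
    """rules[0] is the augmented rule; assumes no ε-productions.
    Returns (number of states, table entries holding more than one action)."""
    nonterminals = {lhs for lhs, _ in rules}

    # FIRST sets; simple here because no symbol can derive the empty string.
    first = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            new = first[rhs[0]] if rhs[0] in nonterminals else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    def closure(items):
        """Items are (rule_index, dot_position, lookahead) triples."""
        result, work = set(items), list(items)
        while work:
            r, d, la = work.pop()
            rhs = rules[r][1]
            if d < len(rhs) and rhs[d] in nonterminals:
                if d + 1 < len(rhs):            # look-aheads = FIRST(β la)
                    nxt = rhs[d + 1]
                    las = first[nxt] if nxt in nonterminals else {nxt}
                else:
                    las = {la}
                for i, (lhs, _) in enumerate(rules):
                    if lhs == rhs[d]:
                        for b in las:
                            if (i, 0, b) not in result:
                                result.add((i, 0, b))
                                work.append((i, 0, b))
        return frozenset(result)

    states = [closure({(0, 0, "$")})]
    table = defaultdict(set)
    for k, state in enumerate(states):          # states grows during the loop
        for r, d, la in sorted(state):
            rhs = rules[r][1]
            if d < len(rhs):
                sym = rhs[d]
                target = closure({(r2, d2 + 1, l2) for r2, d2, l2 in state
                                  if d2 < len(rules[r2][1])
                                  and rules[r2][1][d2] == sym})
                if target not in states:
                    states.append(target)
                kind = "g" if sym in nonterminals else "s"   # Goto or Shift
                table[k, sym].add(f"{kind}{states.index(target)}")
            elif r == 0:
                table[k, la].add("acc")
            else:                                # Reduce only on the look-ahead
                table[k, la].add(f"r{r + 1}")
    conflicts = {key: acts for key, acts in table.items() if len(acts) > 1}
    return len(states), conflicts

NOT_SLR = [("S'", ("S",)), ("S", ("d", "c", "a")), ("S", ("d", "A", "b")),
           ("S", ("A", "a")), ("A", ("c",))]
AMBIGUOUS = [("S'", ("E",)), ("E", ("E", "+", "E")), ("E", ("num",))]

print(lr1_conflicts(NOT_SLR))
print(lr1_conflicts(AMBIGUOUS)[1])
```

For the first grammar the conflict dictionary is empty: the item A → c. carries look-ahead b in the state reached after d c and look-ahead a in the state reached after c alone, so the shift on a no longer clashes. The ambiguous grammar still conflicts on +, which no amount of look-ahead can fix.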

36 LR(1) Parsing Table Example
Grammar: S' → S $, S → E + S | E, E → num
State 1: S' → .S, $; S → .E + S, $; S → .E, $; E → .num, +,$
State 2 (from state 1 on E): S → E. + S, $; S → E., $
State 3 (from state 2 on +): S → E + .S, $; S → .E + S, $; S → .E, $; E → .num, +,$
Fragment of the parsing table:
state  +    $       E
1                   g2
2      s3   S → E

37 LALR(1) Grammars
Problem with LR(1): too many states
LALR(1) parsing (aka LookAhead LR)
- Constructs the LR(1) DFA and then merges any 2 LR(1) states whose items are identical except for lookahead
- Results in smaller parser tables
- Theoretically less powerful than LR(1)
LALR(1) grammar = a grammar whose LALR(1) parsing table has no conflicts
Example: { S → id., + ; S → E., $ } + { S → id., $ ; S → E., + } = ??
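The merge step, and the answer to the "= ??" question, can be illustrated with a short sketch. Items are written as plain strings and states as sets of (item, look-ahead) pairs purely for illustration; both helper functions are hypothetical names. Merging the two states from the slide by their common core produces a state in which both completed items claim + and $, i.e. a reduce/reduce conflict that neither LR(1) state had on its own.

```python
from collections import defaultdict

def merge_by_core(lr1_states):
    """Merge LR(1) states whose items agree once look-aheads are dropped."""
    merged = defaultdict(lambda: defaultdict(set))
    for state in lr1_states:
        core = frozenset(item for item, _ in state)
        for item, lookahead in state:
            merged[core][item].add(lookahead)
    return [dict(items) for items in merged.values()]

def reduce_reduce_conflicts(state):
    """Look-aheads claimed by more than one completed item (dot at the end)."""
    seen, clashes = set(), set()
    for item, lookaheads in state.items():
        if item.endswith("."):
            clashes |= seen & lookaheads
            seen |= lookaheads
    return clashes

# The two LR(1) states from the slide, as (item, look-ahead) pairs.
state_a = {("S -> id.", "+"), ("S -> E.", "$")}
state_b = {("S -> id.", "$"), ("S -> E.", "+")}

(lalr_state,) = merge_by_core([state_a, state_b])
print(sorted(reduce_reduce_conflicts(lalr_state)))   # ['$', '+']
```

This is the sense in which LALR(1) is theoretically less powerful than LR(1): merging can introduce reduce/reduce conflicts, although it cannot introduce new shift/reduce conflicts.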

38 LALR Parsers
LALR(1)
- Generally the same number of states as SLR (far fewer than LR(1))
- But with lookahead capability close to that of LR(1) (much better than SLR)
- Pascal programming language
  - In SLR, several hundred states
  - In LR(1), several thousand states

39 LL/LR Grammar Summary
LL parsing tables
- Non-terminals x terminals → productions
- Computed using FIRST/FOLLOW
LR parsing tables
- LR states x terminals → {shift/reduce}
- LR states x non-terminals → goto
- Computed using closure/goto operations on LR states
A grammar is:
- LL(1) if its LL(1) parsing table has no conflicts
- same for LR(0), SLR, LALR(1), LR(1)

40 Classification of Grammars
[Figure: diagram of the grammar classes LR(0), SLR, LALR(1), LR(1) and LL(1); not to scale]
- LR(k) ⊆ LR(k+1)
- LL(k) ⊆ LL(k+1)
- LL(k) ⊆ LR(k)
- LR(0) ⊆ SLR
- LALR(1) ⊆ LR(1)

41 Automate the Parsing Process
Can automate:
- The construction of LR parsing tables
- The construction of shift-reduce parsers based on these parsing tables
LALR(1) parser generators
- yacc, bison
- Not much difference compared to LR(1) in practice
- Smaller parsing tables than LR(1)
- Augment LALR(1) grammar specification with declarations of precedence, associativity
- Output: LALR(1) parser program

42 Associativity
S → S + E | E, E → num   versus   E → E + E, E → num
What happens if we run the second grammar through LALR construction?
State: { E → E + E., + ; E → E. + E, +,$ }
On +: shift/reduce conflict
- shift: 1+(2+3)
- reduce: (1+2)+3

43 Associativity (2)
If an operator is left associative
- Assign a slightly higher value to its precedence if it is on the parse stack than if it is in the input stream
- Since stack precedence is higher, reduce will take priority (which is correct for left associative)
If an operator is right associative
- Assign a slightly higher value if it is in the input stream
- Since input stream precedence is higher, shift will take priority (which is correct for right associative)

44 Precedence
E → E + E | T, T → T x T | num | (E)   versus   E → E + E | E x E | num | (E)
What happens if we run the second grammar through LALR construction? Shift/reduce conflict results:
- { E → E. + E, ... ; E → E x E., + }
- { E → E + E., x ; E → E. x E, ... }
Precedence: attach precedence indicators to terminals
Shift/reduce conflict resolved by:
1. If the precedence of the input token is greater than that of the last terminal on the parse stack, favor shift over reduce
2. If the precedence of the input token is less than or equal to that of the last terminal on the parse stack, favor reduce over shift
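The two resolution rules, combined with the associativity rule from the previous slide, fit in one small function. The sketch below is an illustration only: the numeric precedence levels are arbitrary, and the right-associative operator ^ is a hypothetical addition that does not appear in the slides' grammar. It is not how yacc or bison store their tables internally.

```python
# Higher number = binds tighter. The values themselves are arbitrary.
PRECEDENCE = {"+": 1, "x": 2, "^": 3}
# "^" is a hypothetical right-associative operator added for the tie-break case.
ASSOCIATIVITY = {"+": "left", "x": "left", "^": "right"}

def resolve(stack_terminal, input_token):
    """Pick 'shift' or 'reduce' for a shift/reduce conflict.

    stack_terminal: last terminal on the parse stack
    input_token:    the look-ahead token
    """
    on_stack = PRECEDENCE[stack_terminal]
    incoming = PRECEDENCE[input_token]
    if incoming > on_stack:
        return "shift"      # rule 1: input binds tighter
    if incoming < on_stack:
        return "reduce"     # rule 2: stack binds tighter
    # Equal precedence: left associative reduces, right associative shifts.
    return "reduce" if ASSOCIATIVITY[input_token] == "left" else "shift"

print(resolve("+", "x"))   # shift:  1 + 2 x 3 parses as 1 + (2 x 3)
print(resolve("x", "+"))   # reduce: 1 x 2 + 3 parses as (1 x 2) + 3
print(resolve("+", "+"))   # reduce: 1 + 2 + 3 parses as (1 + 2) + 3
```

For left-associative operators the equal-precedence case reduces, which is exactly rule 2 on the slide; the separate associativity lookup only matters once right-associative operators are declared.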

45 References Modern Compiler Implementation in Java, Andrew Appel, Cambridge University Press