CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155.

CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155 Storrs, CT

CH4.2 CSE244 Picture So Far
• SLR construction: based on the canonical collection of LR(0) items; gives rise to the SLR(1) parsing table (reduce actions are placed using FOLLOW sets).
• No multiply defined entries => the grammar is called "SLR(1)".
• More general class: LR(1) grammars, defined using the notion of an LR(1) item and the canonical LR(1) parsing table.

CH4.3 CSE244 LR(1) Items
• DEF. An LR(1) item is a production with a marker together with a terminal, e.g. [S → aA.Be, c].
  - Intuition: it indicates how much of a certain production we have seen already (aA), what we could expect next (Be), and a lookahead that agrees with what should follow in the input if we ever reduce by the production S → aABe.
  - By incorporating such lookahead information into the item concept we make wiser reduce decisions.
• The lookahead of an LR(1) item is used directly only when considering reduce actions (i.e. when the marker is at the rightmost position).
• The core of an LR(1) item [S → aA.Be, c] is the LR(0) item S → aA.Be.
• Different LR(1) items may share the same core.

CH4.4 CSE244 Usefulness of LR(1) items
• E.g. if we have two LR(1) items of the form [A → α., a] and [B → α., b], we take advantage of the lookahead to decide which reduction to use (the same setting would perhaps produce a reduce/reduce conflict in the SLR approach).
• How the notion of validity changes: an item [A → α1.α2, a] is valid for a viable prefix βα1 if there is a rightmost derivation that yields βAaw, which in one step yields βα1α2aw.

CH4.5 CSE244 Constructing the Canonical Collection of LR(1) items
• Initial item: [S' → .S, $]
• Closure (more refined): if [A → α.Bβ, a] belongs to the set of items and B → γ is a production of the grammar, then we add the item [B → .γ, b] for all b ∈ FIRST(βa).
• Goto (the same): a state containing [A → α.Xβ, a] moves to a state containing [A → αX.β, a] with label X.
• Every state is closed according to Closure.
• Every state has transitions according to Goto.
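The Closure and Goto rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item encoding `(head, body, dot, lookahead)`, the names `closure`, `goto` and `canonical_collection`, and the hard-coded FIRST sets are not from the slides, and the FIRST shortcut is valid only because the grammar used (S' → S, S → C C, C → c C | d) has no ε-productions.

```python
# Grammar: S' -> S, S -> C C, C -> c C | d (no epsilon productions).
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}
# Assumption: FIRST sets are supplied by hand rather than computed here.
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}}


def first_of(symbols):
    """FIRST of a non-empty symbol string (valid only without epsilon productions)."""
    head = symbols[0]
    return FIRST[head] if head in GRAMMAR else {head}


def closure(items):
    """Close a set of LR(1) items, each encoded as (head, body, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            nonterminal = body[dot]
            # Lookaheads of the new items: FIRST(beta a).
            for look in first_of(body[dot + 1:] + (la,)):
                for prod in GRAMMAR[nonterminal]:
                    item = (nonterminal, prod, 0, look)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the marker over `symbol` and close the result (None if no item moves)."""
    moved = {(h, b, d + 1, a) for (h, b, d, a) in state
             if d < len(b) and b[d] == symbol}
    return closure(moved) if moved else None


def canonical_collection():
    """All states reachable from the closure of the initial item [S' -> .S, $]."""
    start = closure({("S'", ("S",), 0, "$")})
    states, work = [start], [start]
    while work:
        state = work.pop()
        for symbol in ["S", "C", "c", "d"]:
            nxt = goto(state, symbol)
            if nxt is not None and nxt not in states:
                states.append(nxt)
                work.append(nxt)
    return states
```

Each lookahead is stored as its own item, so an item written with the lookahead c/d on a slide appears here as two tuples.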

CH4.6 CSE244 Constructing the LR(1) Parsing Table
• Shift actions (same): if [A → α.bβ, a] is in state I_k and I_k moves to state I_m with label b, then we add the action action[k, b] = "shift m".
• Reduce actions (more refined): if [A → α., a] is in state I_k (with A different from S'), then we add the action "Reduce A → α" into action[k, a]. Observe that we don't use information from FOLLOW(A) anymore.
• Accept: if [S' → S., $] is in state I_k, then action[k, $] = "accept".
• Goto part of the table is as before.

CH4.7 CSE244 Example I
S' → S
S → C C
C → c C | d
FIRST(S) = {c, d}
FIRST(C) = {c, d}
• Construction

CH4.8 CSE244 Example II
S' → S
S → L = R | R
L → * R | id
R → L
FIRST(S) = {*, id}
FIRST(L) = {*, id}
FIRST(R) = {*, id}
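The FIRST sets listed for Example I and Example II can be reproduced with a small fixed-point computation. The sketch below is one common way to do it, not necessarily the one used in the course; the function name `first_sets` and the dictionary encoding of grammars are assumptions made for illustration.

```python
def first_sets(grammar):
    """Fixed-point computation of FIRST for every nonterminal.

    `grammar` maps a nonterminal to a list of bodies (tuples of symbols).
    An empty tuple stands for an epsilon production; epsilon is recorded as "".
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                all_nullable = True
                for sym in body:
                    if sym not in grammar:          # terminal symbol
                        first[head].add(sym)
                        all_nullable = False
                        break
                    first[head] |= first[sym] - {""}
                    if "" not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:                    # every symbol can vanish
                    first[head].add("")
                changed |= len(first[head]) != before
    return first


EXAMPLE_1 = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}
EXAMPLE_2 = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}
```

Calling `first_sets(EXAMPLE_2)` gives {*, id} for S, L and R, in agreement with the slide.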

CH4.9 CSE244 LR(1) is more general than SLR(1)
Grammar: S' → S, S → L = R | R, L → * R | id, R → L
I_0 = { [S' → .S, $], [S → .L = R, $], [S → .R, $], [L → .* R, =/$], [L → .id, =/$], [R → .L, $] }
I_1 = { [S' → S., $] }
I_2 = { [S → L. = R, $], [R → L., $] }
I_3 = { [S → R., $] }
I_4 = { [L → *.R, =/$], [R → .L, =/$], [L → .* R, =/$], [L → .id, =/$] }
I_5 = { [L → id., =/$] }
I_6 = { [S → L =. R, $], [R → .L, $], [L → .* R, $], [L → .id, $] }
I_7 = { [L → *R., =/$] }
I_8 = { [R → L., =/$] }
I_9 = { [L → *.R, $], [R → .L, $], [L → .* R, $], [L → .id, $] }
I_10 = { [L → *R., $] }
I_11 = { [L → id., $] }
I_12 = { [R → L., $] }
• action[2, =] ? In I_2 the reduce by R → L is entered only under the lookahead $, so action[2, =] = s6 (because of S → L. = R). THERE IS NO CONFLICT ANYMORE.
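To make the difference in state 2 concrete, the sketch below fills in the action row of I_2 twice: once placing the reduce under every terminal of FOLLOW(R), as SLR does, and once under the item's own lookahead, as LR(1) does. The function name `actions`, the item encoding, and the hand-supplied value FOLLOW(R) = {=, $} are assumptions introduced for this illustration.

```python
# State I_2 as LR(1) items: (head, body, marker position, lookahead).
I2 = [
    ("S", ("L", "=", "R"), 1, "$"),
    ("R", ("L",), 1, "$"),
]
SHIFT_TARGET = {"=": 6}              # I_2 moves to I_6 with label "="
FOLLOW = {"R": {"=", "$"}}           # assumption: FOLLOW(R) supplied by hand


def actions(state, use_item_lookahead):
    """Return {terminal: set of actions} for one state.

    use_item_lookahead=True  -> LR(1) rule: reduce only under the item's lookahead
    use_item_lookahead=False -> SLR rule: reduce under every terminal in FOLLOW(head)
    """
    row = {}
    for head, body, dot, lookahead in state:
        if dot < len(body):
            symbol = body[dot]
            if symbol in SHIFT_TARGET:
                row.setdefault(symbol, set()).add(f"s{SHIFT_TARGET[symbol]}")
        else:
            terminals = {lookahead} if use_item_lookahead else FOLLOW[head]
            for t in terminals:
                row.setdefault(t, set()).add(f"r({head} -> {' '.join(body)})")
    return row


slr_row = actions(I2, use_item_lookahead=False)
lr1_row = actions(I2, use_item_lookahead=True)
```

With the SLR rule the entry for "=" holds both a shift and a reduce; with the LR(1) rule it holds only the shift.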

CH4.10 CSE244 LALR Parsing
• Canonical sets of LR(1) items
  - Number of states much larger than in the SLR construction
  - LR(1): on the order of thousands of states for a standard programming language
  - SLR(1): on the order of hundreds of states for a standard programming language
• LALR(1) (lookahead-LR): a tradeoff
  - Collapse states of the LR(1) table that have the same core (the "LR(0)" part of each state)
  - LALR never introduces a shift/reduce conflict if LR(1) doesn't
  - It might introduce a reduce/reduce conflict (that did not exist in the LR(1) table)
  - Still much better than SLR(1) (handles a larger set of grammars)
  - ... but the table is smaller than for LR(1), actually about the size of SLR(1)
• What Yacc and most compilers employ.

CH4.11 CSE244 Collapsing states with the same core
• E.g., if I_3 and I_6 collapse, then whenever the LALR(1) parser puts I_36 onto the stack, the LR(1) parser would have either I_3 or I_6.
• A shift/reduce conflict would not be introduced by the LALR "collapse".
• Indeed, if the LALR(1) table has a shift/reduce conflict, this conflict must also exist in the LR(1) version: two states with the same core have the same outgoing arrows, so the shift is already present in whichever original state contributed the reduce.
• On the other hand, a reduce/reduce conflict may be introduced.
• Still, LALR(1) is preferred: table size proportional to SLR(1).
• Direct construction is also possible.
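The collapse step can be sketched as grouping LR(1) states by their core and taking the union of the items in each group. The code below is an illustration only: the function name `merge_by_core`, the "+"-joined state names, and the choice of sample states (taken from the collection for S → L = R | R, L → * R | id, R → L) are assumptions, and the renumbering of transitions is left out.

```python
from collections import defaultdict


def merge_by_core(states):
    """Merge LR(1) states that share a core.

    `states` maps a state name to a set of items (head, body, dot, lookahead).
    The core of a state is its set of items with the lookahead dropped.
    """
    groups = defaultdict(list)
    for name, items in states.items():
        core = frozenset((h, b, d) for (h, b, d, _) in items)
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        # The merged state keeps every lookahead of every member state.
        merged["+".join(names)] = set().union(*(states[n] for n in names))
    return merged


STATES = {
    "I3": {("S", ("R",), 1, "$")},
    "I5": {("L", ("id",), 1, "="), ("L", ("id",), 1, "$")},
    "I8": {("R", ("L",), 1, "="), ("R", ("L",), 1, "$")},
    "I11": {("L", ("id",), 1, "$")},
    "I12": {("R", ("L",), 1, "$")},
}
merged = merge_by_core(STATES)
```

Here I5 and I11 share the core L → id. and I8 and I12 share the core R → L., so five states become three.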

CH4.12 CSE244 Error Recovery in LR Parsing
• An error occurs when, for a given stack $...I_i and input symbols s…s'…$, it holds that action[i, s] = empty.
• Panic-mode error recovery.

CH4.13 CSE244 Panic Recovery Strategy I
• Scan down the stack until a state I_j is found such that:
  - I_j moves with the non-terminal A to some state I_k
  - I_k moves with s' to some state I_k'
• Proceed as follows:
  - Pop all states above I_j
  - Push A and state I_k
  - Discard all symbols from the input until s'
• There may be many choices as above.
• [Essentially, the parser in this way determines that a string produced by A has an error; it assumes it is correct and advances.]
• Error message: construct of type "A" has error at location X

CH4.14 CSE244 Panic Recovery Strategy II
• Scan down the stack until a state I_j is found such that:
  - I_j moves with the terminal t to some state I_k
  - I_k has a valid action on s'
• Proceed as follows:
  - Pop all states above I_j
  - Push t and state I_k
  - Discard all symbols from the input until s'
• There may be many choices as above.
• Error message: "missing t"

CH4.15 CSE244 Example
E' → E
E → E + E | E * E | ( E ) | id

        action                            goto
State   id    +     *     (     )     $     E
0       s3    e1    e1    s2    e2    e1    1
1       e3    s4    s5    e3    e2    acc
2       s3    e1    e1    s2    e2    e1    6
3       r4    r4    r4    r4    r4    r4
4       s3    e1    e1    s2    e2    e1    7
5       s3    e1    e1    s2    e2    e1    8
6       e3    s4    s5    e3    s9    e4
7       r1    r1    s5    r1    r1    r1
8       r2    r2    r2    r2    r2    r2
9       r3    r3    r3    r3    r3    r3

CH4.16 CSE244 Collection of LR(0) items
E' → E
E → E + E | E * E | ( E ) | id
I_0 = { E' → .E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_1 = { E' → E., E → E. + E, E → E. * E }
I_2 = { E → (. E ), E → .E + E, E → .E * E, E → .( E ), E → .id }
I_3 = { E → id. }
I_4 = { E → E +. E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_5 = { E → E *. E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_6 = { E → ( E. ), E → E. + E, E → E. * E }
I_7 = { E → E + E., E → E. + E, E → E. * E }
I_8 = { E → E * E., E → E. + E, E → E. * E }
I_9 = { E → ( E ). }
Follow(E') = {$}
Follow(E) = {+, *, ), $}

CH4.17 CSE244 The parsing table

State   id    +       *       (     )     $     E
0       s3                    s2                1
1             s4      s5                  acc
2       s3                    s2                6
3             r4      r4            r4    r4
4       s3                    s2                7
5       s3                    s2                8
6             s4      s5            s9
7             s4/r1   s5/r1         r1    r1
8             s4/r2   s5/r2         r2    r2
9             r3      r3            r3    r3

CH4.18 CSE244 Error-handling

State   id    +       *       (     )     $     E
0       s3    e1              s2                1
1             s4      s5                  acc
2       s3                    s2                6
3             r4      r4            r4    r4
4       s3                    s2                7
5       s3                    s2                8
6             s4      s5            s9
7             s4/r1   s5/r1         r1    r1
8             s4/r2   s5/r2         r2    r2
9             r3      r3            r3    r3

CH4.19 CSE244 Error-handling
I_0 = { E' → .E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_2 = { E → (. E ), E → .E + E, E → .E * E, E → .( E ), E → .id }
I_5 = { E → E *. E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_8 = { E → E * E., E → E. + E, E → E. * E }
• e1: push E onto the stack and move to state 1; message: "missing operand"
• e1 (alternative): push id onto the stack and change to state 3; message: "missing operand"

CH4.20 CSE244 Error-handling

State   id    +       *       (     )     $     E
0       s3    e1      e1      s2          e1    1
1             s4      s5                  acc
2       s3                    s2                6
3             r4      r4            r4    r4
4       s3                    s2                7
5       s3                    s2                8
6             s4      s5            s9
7             s4/r1   s5/r1         r1    r1
8             s4/r2   s5/r2         r2    r2
9             r3      r3            r3    r3

CH4.21 CSE244 Error-handling

State   id    +       *       (     )     $     E
0       s3    e1      e1      s2    e2    e1    1
1             s4      s5            e2    acc
2       s3                    s2                6
3             r4      r4            r4    r4
4       s3    e1              s2                7
5       s3                    s2                8
6             s4      s5            s9
7             s4/r1   s5/r1         r1    r1
8             s4/r2   s5/r2         r2    r2
9             r3      r3            r3    r3

CH4.22 CSE244 Error-handling
• e2: remove ")" from the input; message: "unbalanced right parenthesis"
• Try the input id+)
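The suggested input id+) can be traced with a small table-driven driver. The sketch below uses the action/goto table of the example grammar with its error entries and the routines e1, e2 and e3 as the slides describe them (e1 in its push-id, state 3 form). The routine for e4 is not described in these slides; the version here (push state 9, "missing right parenthesis") is an assumption following the usual textbook treatment. The name `parse` and the table encoding are also illustrative choices.

```python
# Productions: 1: E -> E + E, 2: E -> E * E, 3: E -> ( E ), 4: E -> id.
BODY_LEN = {1: 3, 2: 3, 3: 3, 4: 1}
COLS = ["id", "+", "*", "(", ")", "$"]
ROWS = [                     # one row per state 0..9
    "s3 e1 e1 s2 e2 e1",
    "e3 s4 s5 e3 e2 acc",
    "s3 e1 e1 s2 e2 e1",
    "r4 r4 r4 r4 r4 r4",
    "s3 e1 e1 s2 e2 e1",
    "s3 e1 e1 s2 e2 e1",
    "e3 s4 s5 e3 s9 e4",
    "r1 r1 s5 r1 r1 r1",
    "r2 r2 r2 r2 r2 r2",
    "r3 r3 r3 r3 r3 r3",
]
ACTION = [dict(zip(COLS, row.split())) for row in ROWS]
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}    # goto on the nonterminal E


def parse(tokens):
    """Run the LR driver and return the list of error messages issued."""
    stack, pos, messages = [0], 0, []
    tokens = list(tokens) + ["$"]
    while True:
        act = ACTION[stack[-1]][tokens[pos]]
        if act == "acc":
            return messages
        if act[0] == "s":                       # shift
            stack.append(int(act[1:]))
            pos += 1
        elif act[0] == "r":                     # reduce
            del stack[-BODY_LEN[int(act[1:])]:]
            stack.append(GOTO_E[stack[-1]])
        elif act == "e1":                       # pretend an id was read
            stack.append(3)
            messages.append("missing operand")
        elif act == "e2":                       # drop the ")" from the input
            pos += 1
            messages.append("unbalanced right parenthesis")
        elif act == "e3":                       # pretend a + was read
            stack.append(4)
            messages.append("missing operator")
        else:                                   # e4: assumption, see lead-in
            stack.append(9)
            messages.append("missing right parenthesis")
```

On id+) the driver shifts id, reduces it to E, shifts +, then hits e2 in state 4 on ")" and, once the ")" is gone, e1 in state 4 on $, after which the parse completes.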

CH4.23 CSE244 Error-handling: state 1

State   id    +       *       (     )     $     E
0       s3    e1      e1      s2    e2    e1    1
1       e3    s4      s5                  acc
2       s3                    s2                6
3             r4      r4            r4    r4
4       s3                    s2                7
5       s3                    s2                8
6             s4      s5            s9
7             s4/r1   s5/r1         r1    r1
8             s4/r2   s5/r2         r2    r2
9             r3      r3            r3    r3

CH4.24 CSE244 Error-Handling
I_1 = { E' → E., E → E. + E, E → E. * E }
I_3 = { E → id. }
I_4 = { E → E +. E, E → .E + E, E → .E * E, E → .( E ), E → .id }
I_6 = { E → ( E. ), E → E. + E, E → E. * E }
I_7 = { E → E + E., E → E. + E, E → E. * E }
I_9 = { E → ( E ). }
• e3: push + onto the stack and change to state 4; message: "missing operator"

CH4.25 CSE244 Intro to Translation
• Side-effects and translation schemes.
• Do the construction as before, but:
  - A side-effect in front of a symbol is executed when the parser makes the move on that symbol to another state.
  - Side-effects at the rightmost end of a production are executed during reduce actions.
E' → E
E → E + E {print(+)}
  | E * E {print(*)}
  | {parenthesis++} ( E ) {parenthesis--}
  | id { print(id); print(parenthesis); }
• Do for example id*(id+id)$
• Side-effects are attached to the symbols to the right of them.
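The suggested run on id*(id+id)$ can be simulated by attaching the side-effects to an LR driver. The sketch below is an assumption-laden illustration: it uses the SLR table of the expression grammar with the s4/r1, s5/r1, s4/r2 and s5/r2 entries resolved by the usual precedence and left-associativity convention (a choice not made in these slides, and not reached by this particular input), it collects the printed values in a list instead of printing them, and the names `translate` and `ROWS` are hypothetical.

```python
# Productions: 1: E -> E + E, 2: E -> E * E, 3: E -> ( E ), 4: E -> id.
BODY_LEN = {1: 3, 2: 3, 3: 3, 4: 1}
COLS = ["id", "+", "*", "(", ")", "$"]
ROWS = [                     # "--" marks an empty (error) entry
    "s3 -- -- s2 -- --",
    "-- s4 s5 -- -- acc",
    "s3 -- -- s2 -- --",
    "-- r4 r4 -- r4 r4",
    "s3 -- -- s2 -- --",
    "s3 -- -- s2 -- --",
    "-- s4 s5 -- s9 --",
    "-- r1 s5 -- r1 r1",     # conflicts resolved by convention (assumption)
    "-- r2 r2 -- r2 r2",
    "-- r3 r3 -- r3 r3",
]
ACTION = [dict(zip(COLS, row.split())) for row in ROWS]
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}


def translate(tokens):
    """Parse `tokens` and return the sequence of values the side-effects print."""
    stack, pos, out, parenthesis = [0], 0, [], 0
    tokens = list(tokens) + ["$"]
    while True:
        act = ACTION[stack[-1]][tokens[pos]]
        if act == "acc":
            return out
        if act == "--":
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        n = int(act[1:])
        if act[0] == "s":
            if tokens[pos] == "(":      # {parenthesis++} sits in front of "("
                parenthesis += 1
            stack.append(n)
            pos += 1
        else:                           # side-effects at the right end: on reduce
            if n == 1:
                out.append("+")
            elif n == 2:
                out.append("*")
            elif n == 3:
                parenthesis -= 1
            else:
                out.extend(["id", parenthesis])
            del stack[-BODY_LEN[n]:]
            stack.append(GOTO_E[stack[-1]])


result = translate(["id", "*", "(", "id", "+", "id", ")"])
```

Under these assumptions the run emits id 0, id 1, id 1, then + and finally *, which is the postfix form of the expression with the nesting depth recorded after each id.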