Compiler design Bottom-up parsing: Canonical LR and LALR

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Bottom up Parsing Bottom up parsing trys to transform the input string into the start symbol. Moves through a sequence of sentential forms (sequence of.
Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II.
Mooly Sagiv and Roman Manevich School of Computer Science
Lexical and Syntactic Analysis Here, we look at two of the tasks involved in the compilation process –Given source code, we need to first break it into.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
Lecture #8, Feb. 7, 2007 Shift-reduce parsing,
CS 330 Programming Languages 09 / 23 / 2008 Instructor: Michael Eckmann.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
Bottom-up parsing Goal of parser : build a derivation
LALR Parsing Canonical sets of LR(1) items
Joey Paquet, 2000, 2002, 2012, Lecture 6 Bottom-Up Parsing.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
11 Outline  6.0 Introduction  6.1 Shift-Reduce Parsers  6.2 LR Parsers  6.3 LR(1) Parsing  6.4 SLR(1)Parsing  6.5 LALR(1)  6.6 Calling Semantic.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Eliminating Left-Recursion Where some of a nonterminal’s productions are left-recursive, top-down parsing is not possible “Immediate” left-recursion can.
Lec04-bottomupparser 4/13/2018 LR Parsing.
Compiler design Bottom-up parsing Concepts
Bottom-Up Parsing.
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
Unit-3 Bottom-Up-Parsing.
UNIT - 3 SYNTAX ANALYSIS - II
Table-driven parsing Parsing performed by a finite state machine.
Chapter 4 Syntax Analysis.
Compiler Construction
Fall Compiler Principles Lecture 4: Parsing part 3
LALR Parsing Canonical sets of LR(1) items
Bottom-Up Syntax Analysis
Syntax Analysis Part II
4 (c) parsing.
Shift Reduce Parsing Unit -3
Subject Name:COMPILER DESIGN Subject Code:10CS63
4d Bottom Up Parsing.
Parsing #2 Leonidas Fegaras.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compiler Design 7. Top-Down Table-Driven Parsing
Bottom Up Parsing.
R.Rajkumar Asst.Professor CSE
LALR Parsing Adapted from Notes by Profs Aiken and Necula (UCB) and
Ambiguity in Grammar, Error Recovery
Compiler design Error recovery in top-down predictive parsing
Compiler Construction
LL and Recursive-Descent Parsing Hal Perkins Autumn 2011
Parsing #2 Leonidas Fegaras.
Bottom-Up Parsing “Shift-Reduce” Parsing
4d Bottom Up Parsing.
5. Bottom-Up Parsing Chih-Hung Wang
Kanat Bolazar February 16, 2010
Parsing Bottom-Up LR Table Construction.
4d Bottom Up Parsing.
LL and Recursive-Descent Parsing Hal Perkins Autumn 2009
Parsing Bottom-Up LR Table Construction.
Predictive Parsing Program
LL and Recursive-Descent Parsing Hal Perkins Winter 2008
Lecture 11 LR Parse Table Construction
Compiler design Bottom-up parsing: Canonical LR and LALR
Compiler design Review COMP 442/6421 – Compiler Design
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Compiler design Bottom-up parsing: Canonical LR and LALR COMP 442/6421 – Compiler Design Compiler design Bottom-up parsing: Canonical LR and LALR Error recovery in LR parsing Parsing: concluding comments Joey Paquet, 2000-2017

Problems with LR parsers COMP 442/6421 – Compiler Design Problems with LR parsers The SLR parser breaks down when there are conflicts between shift/recude: Shift-reduce conflict: it cannot decide whether to shift or reduce Reduce-reduce conflict: A given configuration implies more than one possible reduction, e.g. S  if E then S | if E then S else S State 3 : S  id FOLLOW(S) = {$} V[id] E  id FOLLOW(E) = {$,+,=} Z  S S  E = E S  id E  E + id E  id Joey Paquet, 2000-2017

Problem: the FOLLOW set is not discriminating enough COMP 442/6421 – Compiler Design Canonical LR parsers Problem: the FOLLOW set is not discriminating enough For example, the previous example problem is not without solution: if the next token is either + or =, reduce id to E if the next token is $, reduce id to S Solution: use lookahead sets in the items generation process to eliminate ambiguities Joey Paquet, 2000-2017

COMP 442/6421 – Compiler Design Canonical LR parsers State 0: [Z  S:{$}] : [Z  S:{$}] : V[S] State 1 V[] [S  id:{$}] [S  E=E:{$}] : V[E] State 2 [S  E=E:{$}] [E  E+id:{=,+}] : V[E] State 2 [E  E+id:{=}] [E  id]:{=,+}] : V[id] State 3 [E  id:{=}] [S  id:{$}] : V[id] State 3 [E  E+id:{+}] [E  id:{+}] State 1: [Z  S:{$}] : [Z  S:{$}] : accept Final State V[S] State 2: [S  E= E:{$}] : [S  E=E:{$}] : V[E=] State 4 V[E] [E  E+id:{=,+}] [E  E+id:{=,+}] : V[E+] State 5 State 3: [E  id]:{=,+}] : [E  id:{=,+}] : handle (r4) V[id] [S  id :{$}] [S  id:{$}] : handle (r2) State 4: [S  E=E:{$}] : [S  E=E:{$}] : V[E=E] State 6 V[E=] [E  E+id:{$}] [E  E+id:{+,$}] : V[E=E] State 6 [E  id:{$}] [E  id:{+,$}] : V[E=id] State 7 Joey Paquet, 2000-2017

COMP 442/6421 – Compiler Design Canonical LR parsers State 5: [E  E+id:{=,+}] : [E  E+id:{=,+}] : V[E+id] State 8 V[E+] State 6: [S  E=E:{$}] : [S  E=E:{$}] : handle (r1) V[E=E] [E  E+id:{+,$}] [E  E+id:{+,$}] : V[E=E+] State 9 State 7: [E  id:{+,$}] : [E  id:{+,$}] : handle (r4) V[E=id] State 8: [E  E+id:{=,+}] : [E  E+id:{=,+}] : handle (r3) V[E+id] State 9: [E  E+id:{+,$}] : [E  E+id:{+,$}] : V[E=E+id] State 10 V[E=E+] State 10: [E  E+id:{+,$}] : [E  E+id:{+,$}] : handle (r3) V[E=E+id] Joey Paquet, 2000-2017

CLR parsing table Z  S 1 S  E = E 2 S  id 3 E  E + id 4 E  id COMP 442/6421 – Compiler Design CLR parsing table state action goto id = + $ S E s3 1 2 acc s4 s5 3 r4 r2 4 s7 6 5 s8 s9 r1 7 8 r3 9 s10 10 Z  S 1 S  E = E 2 S  id 3 E  E + id 4 E  id Joey Paquet, 2000-2017

COMP 442/6421 – Compiler Design LALR Problem: CLR generates more states than SLR, e.g. for a typical programming language, the SLR table has hundreds of states whereas a CLR table has thousands of states Solution: merge similar states, i.e. states that have the same item sets, but not necessarily with the same lookahead sets. Merge such states into one state, and merge the lookahead sets of items that are duplicated. Joey Paquet, 2000-2017

LALR A. State 0: {[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], COMP 442/6421 – Compiler Design LALR A. State 0: {[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} B. State 1: {[Z  S:{$}]} C. State 2: {[S  E=E:{$}],[E  E+id:{=,+}]} D. State 3: {[E  id:{=,+}],[S  id:{$}]} E. State 4: {[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} F. State 5: {[E  E+id:{=,+}]} State 9: {[E  E+id:{+,$}]} G. State 6: {[S  E=E:{$}],[E  E+id:{+,$}]} H. State 7: {[E  id:{+,$}]} I. State 8: {[E  E+id:{=,+}]} State 10: {[E  E+id:{+,$}]} State 0: {[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} State 1: {[Z  S:{$}]} State 2: {[S  E=E:{$}],[E  E+id:{=,+}]} State 3: {[E  id:{=,+}],[S  id:{$}]} State 4: {[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} State 5: {[E  E+id:{=,+,$}]} State 6: {[S  E=E:{$}],[E  E+id:{+,$}]} State 7: {[E  id:{+,$}]} State 8: {[E  E+id:{=,+,$}]} Joey Paquet, 2000-2017

LALR parsing table 1 Z  S 2 S  E = E 3 S  id 4 E  E + id 5 E  id COMP 442/6421 – Compiler Design LALR parsing table state action goto id = + $ S E s3 1 2 acc s4 s5 3 r4 r2 4 s7 6 5 s8 r1 7 8 r3 1 Z  S 2 S  E = E 3 S  id 4 E  E + id 5 E  id Joey Paquet, 2000-2017

Error recovery in LR parsing COMP 442/6421 – Compiler Design Error recovery in LR parsing Joey Paquet, 2000-2017

Error detection in LR parsers COMP 442/6421 – Compiler Design Error detection in LR parsers An error is detected when the parser consults the action table and finds an empty entry Each empty entry potentially represents a different and specific syntax error If we come onto an empty entry on the goto table, it means that there is an error in the table itself Joey Paquet, 2000-2017

Error recovery in LR parsers COMP 442/6421 – Compiler Design Error recovery in LR parsers onError() 1. pop() until a state s is on top of the stack and T is not empty where s has at least one entry in the goto table under non-terminal N S is the set of states in the goto table for s T is the set of terminals for which the elements of S have a shift or accept entry in the action table 4. lookahead = nextToken() until lookahead x is in T 5. push the N corresponding to x and its corresponding state in S 6. resume parse Joey Paquet, 2000-2017

Error recovery in LR parsers COMP 442/6421 – Compiler Design Error recovery in LR parsers This method of error recovery attempts to eliminate the remainder of the current phrase derivable from a non-terminal A which contains a syntactic error. Part of that A was already parsed, which resulted in a sequence of states on top of the stack. The remainder of this phrase is still in the input. The parser attempts to skip over the remainder of this phrase by looking for a symbol on the input that can legitimately follow A. By removing states from the stack, skipping over the input, and pushing GOTO(s, A) on the stack, the parser pretends that it has successfully reduced an instance of A and resumes normal parsing. Example: in many programming languages, statements end with a “;” if an error is found in a statement, the stack is popped until we get V[<statement>] on top of the stack nextToken() until the next “;” is found resume parse Joey Paquet, 2000-2017

Error recovery in LR parsers: example COMP 442/6421 – Compiler Design Error recovery in LR parsers: example state action goto id + * ( ) $ E T F s5 e1 s4 e0 1 2 3 e2 s6 ee e3 acc r2 s7 r4 4 8 5 r6 6 e4 9 7 e5 10 r1 s11 r3 11 r5 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id Joey Paquet, 2000-2017

Error recovery in LR parsers: example COMP 442/6421 – Compiler Design Error recovery in LR parsers: example stack input action 1 id)id*id)$ shift 5 2 0id5 )id*id)$ reduce (F  id) 3 0F3 reduce (T  F) 4 0T2 reduce (E  T) 5 0E1 e3 6 *id)$ shift 7 7 0T2*7 id)$ 8 0T2*7id5 )$ 9 0T2*7F10 reduce (T  T * F) 10 11 12 $ accept e0 unexpected end of program e1 missing operand e2 missing operator e3 mismatched parenthesis e4 missing term e5 factor expected e6 + or ) expected ee parser error 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id Joey Paquet, 2000-2017

Parsing: general comments COMP 442/6421 – Compiler Design Parsing: general comments Joey Paquet, 2000-2017

COMP 442/6421 – Compiler Design Parser generators For real-life programming languages, construction of the table is extremely laborious and error-prone (several thousands of states) Table construction follows strictly defined rules that can be implemented as a program called a parser generator Yacc (Yet Another Compiler-Compiler) generates a LALR(1) parser code and table Grammar is given in input in (near) BNF Detects conflicts and resolves some conflicts automatically Requires a minimal number of changes to the grammar Each right hand side is associated with a custom semantic action to generate the symbol table and code Joey Paquet, 2000-2017

Which method to use? We have seen COMP 442/6421 – Compiler Design Which method to use? We have seen recursive descent predictive table-driven predictive SLR, CLR, LALR Parser generator For real-life languages, the recursive descent parser lacks maintainability The need for changing the grammar is a disadvantage for predictive parsers Code generation and error detection is more difficult and less accurate in top-down parsers Most compilers are now implemented using the LR method, using parser generators Other more recent ones are generating top-down parsing methods (e.g. JavaCC) Joey Paquet, 2000-2017