241-437 Compiler Structures: Bottom-up (LR) Parsing

Presentation transcript:

Compilers: Bottom-up/6
Compiler Structures, Semester 1
Bottom-up (LR) Parsing
Objective
– describe bottom-up (LR) parsing using shift-reduce and parse tables
– explain how LR parse tables are generated

Overview
1. What is an LR Parser?
2. Bottom-up using Shift-Reduce
3. Building an LR Parser
4. Generating the Parse Table
5. LR Conflicts
6. LL, SLR, LR, LALR Grammars

In this lecture
[Figure: compiler pipeline. Source Program → Front End (Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Int. Code Generator) → Intermediate Code → Back End (Code Optimizer → Target Code Generator) → Target Lang. Prog.]
This lecture concentrates on bottom-up parsing.

1. What is an LR Parser?
An LR parser reads its input tokens from Left to right and produces a Rightmost derivation (in reverse). The parse tree is built bottom-up, starting from the leaves and working upwards to the start symbol.

LR in Action
Grammar:
  S => a A B e
  A => A b c | b
  B => d
Task: parse "a b b c d e". The parse tree corresponds to a rightmost derivation:
  S ⇒ a A B e ⇒ a A d e ⇒ a A b c d e ⇒ a b b c d e
Reducing the sentence runs that derivation backwards:
  a b b c d e
  a A b c d e    (using A => b)
  a A d e        (using A => A b c)
  a A B e        (using B => d)
  S              (using S => a A B e)
At each step, the symbols being replaced match a production's right-hand side.

LR(k) Parsing
The k is the number of input tokens that are looked at when deciding which production to use.
– e.g. LR(0), LR(1)
We'll be using a variation of LR(0) parsing in this chapter.

LR versus LL
– LR can deal with more complex (powerful) grammars than LL (top-down parsers).
– LR can detect errors quicker than LL.
– LR parsers can be implemented very efficiently, but they're difficult to build by hand (unlike LL parsers).

2. Bottom-up using Shift-Reduce
The usual way of implementing bottom-up parsing is by using shift-reduce:
– 'shift' means read in a new input token and push it onto a stack
– 'reduce' means group several symbols into a single non-terminal by choosing a production to use 'backwards': the symbols are popped off the stack, and the production's non-terminal is pushed onto it

Shift-Reduce Parsing
Grammar:
  S => a A B e
  A => A b c | b
  B => d
Stack          Input            Action
$              a b b c d e $    Shift a
$ a            b b c d e $      Shift b
$ a b          b c d e $        Reduce A => b
$ a A          b c d e $        Shift b
$ a A b        c d e $          Shift c
$ a A b c      d e $            Reduce A => A b c
$ a A          d e $            Shift d
$ a A d        e $              Reduce B => d
$ a A B        e $              Shift e
$ a A B e      $                Reduce S => a A B e
$ S            $
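The trace above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: it does not decide which action to take (that is the job of the parse table introduced next), it only applies a given list of actions to the stack. The names `RULES`, `replay`, and `ACTIONS` are made up for this illustration, and the rule numbers follow the numbering used later in the lecture.

```python
# Grammar from the slide. Rule numbers are an assumption taken from the
# numbering used later in the lecture (1: S, 2-3: A, 4: B).
RULES = {
    1: ("S", ["a", "A", "B", "e"]),
    2: ("A", ["A", "b", "c"]),
    3: ("A", ["b"]),
    4: ("B", ["d"]),
}

def replay(tokens, actions):
    """Apply a given list of shift/reduce actions; return (stack, unread input)."""
    stack = []
    remaining = list(tokens)
    for action in actions:
        if action == "shift":
            # shift: read the next input token and push it onto the stack
            stack.append(remaining.pop(0))
        else:
            # reduce: `action` is a rule number; its body must be on top of the stack
            head, body = RULES[action]
            if stack[-len(body):] != body:
                raise ValueError(f"rule {action} does not match stack {stack}")
            del stack[-len(body):]   # pop the body symbols...
            stack.append(head)       # ...and push the production's non-terminal
    return stack, remaining

# The action column of the trace: shifts, and reductions by rule number.
ACTIONS = ["shift", "shift", 3, "shift", "shift", 2, "shift", 4, "shift", 1]
final_stack, leftover = replay("a b b c d e".split(), ACTIONS)
print(final_stack, leftover)
```

A naive parser that reduces whenever the top of the stack matches some right-hand side would go wrong here: after the second b is shifted, the stack top also matches A => b, which is the wrong choice. Choosing correctly is what the parse table is for.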

3. Building an LR Parser
The standard way of writing a shift-reduce LR parser is to generate a parse table for the grammar and 'plug' that into a standard LR parser framework. The table has two main parts: actions and gotos.

3.1. Inside an LR Parser
[Figure: the LR parser reads the input tokens a1 a2 … ai … an $, keeps a stack X0 s0 … Xm-1 sm-1 Xm sm, consults the parse table (actions and gotos), and outputs a parse tree.]
– each X on the stack is a terminal or non-terminal, and each s is a state
– the stack is changed with push and pop
– possible actions are shift, reduce, accept, error
– gotos involve state changes
– the parse table is the part you create

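Parse Table for the Example
Rules:
  1: S => a A B e
  2: A => A b c
  3: A => b
  4: B => d
        Action part                     Goto part
State   a    b    c    d    e    $    |  S    A    B
0       s1   -    -    -    -    -    |  -    -    -
1       -    s3   -    -    -    -    |  -    2    -
2       -    s5   -    s6   -    -    |  -    -    4
3       -    r3   -    r3   -    -    |  -    -    -
4       -    -    -    -    s7   -    |  -    -    -
5       -    -    s8   -    -    -    |  -    -    -
6       -    -    -    -    r4   -    |  -    -    -
7       -    -    -    -    -    acc  |  -    -    -
8       -    r2   -    r2   -    -    |  -    -    -
– sN means shift and go to state N
– rN means reduce by production number N
– '-' marks an empty cell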

Table Algorithm
The loop has four branches, one for each of the four possible actions that can be in a table cell.
  push( <$, 0> );    /* push <symbol, state> pair */
  currToken = scanner();
  while (1) {
    <symbol, state> = pair on top of stack;
    if (action[state, currToken] == <shift state'>) {
      push( <currToken, state'> );
      currToken = scanner();
    }
    else if (action[state, currToken] == <reduce ruleNum>) {
      A --> β is rule number ruleNum;
      bodySize = numElements(β);
      pop bodySize pairs off stack;
      state' = state part of pair on top of stack;
      push( <A, goto[state', A]> );
    }
    else if (action[state, currToken] == accept) {
      S --> β is the start symbol production;
      bodySize = numElements(β);
      pop bodySize pairs off stack;
      state' = state part of pair on top of stack;
      if (state' == 0) break;    // success; can now stop
      else error();
    }
    else error();
  }  // of while loop
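The algorithm above translates almost line for line into a runnable program. This is a minimal Python sketch rather than the lecture's own implementation: the `ACTION` and `GOTO` dictionaries are transcribed by hand from the example grammar's SLR table (S => a A B e, A => A b c | b, B => d), and the names `RULES`, `ACTION`, `GOTO`, and `lr_parse` are assumptions made for the illustration.

```python
# Rule number -> (head, body length), numbered as in the lecture.
RULES = {1: ("S", 4), 2: ("A", 3), 3: ("A", 1), 4: ("B", 1)}

# SLR table for the example grammar, transcribed by hand (assumption).
ACTION = {
    (0, "a"): ("shift", 1),
    (1, "b"): ("shift", 3),
    (2, "b"): ("shift", 5), (2, "d"): ("shift", 6),
    (3, "b"): ("reduce", 3), (3, "d"): ("reduce", 3),
    (4, "e"): ("shift", 7),
    (5, "c"): ("shift", 8),
    (6, "e"): ("reduce", 4),
    (7, "$"): ("accept", None),
    (8, "b"): ("reduce", 2), (8, "d"): ("reduce", 2),
}
GOTO = {(1, "A"): 2, (2, "B"): 4}

def lr_parse(tokens):
    """Return the rule numbers used, in order, or raise SyntaxError."""
    stack = [("$", 0)]                 # <symbol, state> pairs
    tokens = list(tokens) + ["$"]      # end-of-input marker
    pos = 0
    used = []
    while True:
        state = stack[-1][1]
        tok = tokens[pos]
        kind, arg = ACTION.get((state, tok), ("error", None))  # empty cell = error
        if kind == "shift":
            stack.append((tok, arg))
            pos += 1
        elif kind == "reduce":
            head, size = RULES[arg]
            del stack[-size:]                        # pop the rule body
            exposed = stack[-1][1]                   # state' in the pseudocode
            stack.append((head, GOTO[(exposed, head)]))
            used.append(arg)
        elif kind == "accept":
            _head, size = RULES[1]
            del stack[-size:]                        # pop the start rule's body
            if stack[-1][1] != 0:
                raise SyntaxError("accept reached with a non-empty stack")
            used.append(1)
            return used
        else:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")

print(lr_parse("a b b c d e".split()))
```

The returned list is the sequence of reductions, which is the rightmost derivation read backwards.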

Table Parsing Example
Grammar:
  S => a A B e
  A => A b c | b
  B => d
Stack               Input            Action
$0                  a b b c d e $    Shift 1
$0,a1               b b c d e $      Shift 3
$0,a1,b3            b c d e $        Reduce A => b
$0,a1,A2            b c d e $        Shift 5
$0,a1,A2,b5         c d e $          Shift 8
$0,a1,A2,b5,c8      d e $            Reduce A => A b c
$0,a1,A2            d e $            Shift 6
$0,a1,A2,d6         e $              Reduce B => d
$0,a1,A2,B4         e $              Shift 7
$0,a1,A2,B4,e7      $                Accept S => a A B e
$0                  $
– Reduce A => b: pop 1 pair; state' == 1; push(A, goto(1, A)) = push(A, 2)
– Reduce A => A b c: pop 3 pairs; state' == 1; push(A, goto(1, A)) = push(A, 2)

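The LR Parse Stack
The parse stack holds the branches of the tree being built bottom-up. For example:
– the stack $0,a1,A2,b5,c8 represents four branches: the leaf a, the subtree A (with child b), and the leaves b and c
– the next stack, $0,a1,A2, represents two branches: the leaf a, and a new subtree A whose children are the earlier A (with child b), b, and c
– later, $0,a1,A2,B4,e7 represents four branches: the leaf a, the subtree A, the subtree B (with child d), and the leaf e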

4. Generating the Parse Table
The example parse table was generated using the SLR (simple LR) algorithm
– an extension of LR(0) which uses the grammar's FOLLOW() sets
Other LR algorithms can also be used to make a parse table:
– e.g. LR(1), LALR(1)

Supporting Techniques
SLR table generation makes use of three techniques:
– LR(0) items
– the closure() function
– the goto() function
I'll explain each one first, before the table generation algorithm.

LR(0) Items
An LR(0) item is a grammar production with a dot (•) at some position of the right-hand side. So, a production A => X Y Z has four items:
  A => • X Y Z
  A => X • Y Z
  A => X Y • Z
  A => X Y Z •
Production A => ε has one item: A => •
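One possible way to represent items in code is as a production plus an integer dot position. The sketch below is only an illustration; the tuple layout `(head, body, dot)` and the helper names `lr0_items` and `show` are assumptions, not something the slides prescribe.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head => body as (head, body, dot) tuples."""
    # A body of n symbols has n + 1 dot positions, so an empty body has one.
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]

def show(item):
    """Format an item with a visible dot, e.g. 'A => X • Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["•"] + list(body[dot:])
    return f"{head} => {' '.join(symbols)}"

for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
print(show(lr0_items("A", [])[0]))  # the single item of an empty production
```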

The closure() Function
The closure() function generates a set of LR(0) items.
Assume that the grammar only has one production for the start symbol S, S => α. The initial closure set is:
  closure( { S => • α } )
If A => α • B β is in the set, then for each production B => γ, add the item B => • γ to the set, if it's not already there.
Repeat until no new items can be added to the set.

Example use of closure()
Grammar:
  S => E
  E => E + T | T
  T => T * F | F
  F => ( E )
  F => id
closure({ S => • E }) is built in steps:
  start:           { S => • E }
  add E => • ...:  { S => • E, E => • E + T, E => • T }
  add T => • ...:  { S => • E, E => • E + T, E => • T, T => • T * F, T => • F }
  add F => • ...:  { S => • E, E => • E + T, E => • T, T => • T * F, T => • F, F => • ( E ), F => • id }
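The closure rule can be sketched as a small worklist loop. This is an illustrative Python version under a few assumptions: items are `(head, body, dot)` tuples, the grammar is a list of `(head, body)` pairs, and the names `GRAMMAR` and `closure` are chosen for the example.

```python
# The expression grammar from the slide, as (head, body) pairs.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]

def closure(items, grammar):
    """Close a set of (head, body, dot) items under the rule on the slide."""
    result = set(items)
    worklist = list(result)
    while worklist:
        _head, body, dot = worklist.pop()
        if dot == len(body):
            continue                      # dot at the end: nothing to expand
        next_symbol = body[dot]           # the symbol right after the dot
        for head, rhs in grammar:
            new_item = (head, rhs, 0)     # B => • γ
            if head == next_symbol and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)

start_set = closure({("S", ("E",), 0)}, GRAMMAR)
print(len(start_set))  # the final set in the example has seven items
```

Terminals after the dot add nothing, because no production has a terminal as its head.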

The goto() Function
goto(I_n, X) takes as input an existing closure set I_n and a terminal/non-terminal symbol X. The output is a new closure set I_(n+1):
– for each item A => α • X β in I_n, add closure({ A => α X • β }) to I_(n+1)
– repeat until no more items can be added to I_(n+1)
In the state graph this is drawn as an edge labelled X from I_n to I_(n+1).

goto() Example 1
Grammar:
  S => A B    // rule 1, for start symbol
  A => a
  B => b
Initial state:
  I0 = closure({ S => • A B }) = { S => • A B, A => • a }
From I0:
  goto(I0, A) = closure({ S => A • B }) = { S => A • B, B => • b }    // call it I1
  goto(I0, a) = closure({ A => a • }) = { A => a • }    // call it I2
From I1:
  goto(I1, B) = closure({ S => A B • }) = { S => A B • }    // call it I3
  – this is the end of the S production, so I3 is the end state
  goto(I1, b) = closure({ B => b • }) = { B => b • }    // call it I4
Edges of the state graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4

goto() Example 2
Grammar:
  S => a A B e    // rule 1, for start symbol
  A => A b c | b
  B => d
Initial state:
  I0 = closure({ S => • a A B e }) = { S => • a A B e }
From I0:
  goto(I0, a) = closure({ S => a • A B e }) = { S => a • A B e, A => • A b c, A => • b }    // call it I1
From I1:
  goto(I1, A) = closure({ S => a A • B e, A => A • b c }) = { S => a A • B e, A => A • b c, B => • d }    // call it I2
  goto(I1, b) = closure({ A => b • }) = { A => b • }    // call it I3
From I2:
  goto(I2, B) = closure({ S => a A B • e }) = { S => a A B • e }    // call it I4
Others:
– I5 = goto(I2, b): { A => A b • c }
– I6 = goto(I2, d): { B => d • }
– I7 = goto(I4, e): { S => a A B e • }    // end of start symbol rule
– I8 = goto(I5, c): { A => A b c • }
Edges of the state graph: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8
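The state graph for Example 2 can be reproduced by applying goto() repeatedly until no new sets appear. The following Python sketch is one way to do it, assuming items are `(head, body, dot)` tuples; the function names and the choice to try symbols in sorted order are assumptions of the illustration. For this grammar that order happens to give the same state numbers as the slides, but in general the numbering depends on the order in which states are discovered.

```python
GRAMMAR = [
    ("S", ("a", "A", "B", "e")),   # rule 1, for the start symbol
    ("A", ("A", "b", "c")),
    ("A", ("b",)),
    ("B", ("d",)),
]

def closure(items):
    """Close a set of (head, body, dot) items over GRAMMAR."""
    result = set(items)
    worklist = list(result)
    while worklist:
        _head, body, dot = worklist.pop()
        if dot == len(body):
            continue
        for head, rhs in GRAMMAR:
            item = (head, rhs, 0)
            if head == body[dot] and item not in result:
                result.add(item)
                worklist.append(item)
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in state if d < len(b) and b[d] == symbol}
    return closure(moved)

def build_states():
    """Return (states, edges); edges maps (state number, symbol) -> state number."""
    head, body = GRAMMAR[0]
    states = [closure({(head, body, 0)})]
    edges = {}
    i = 0
    while i < len(states):             # `states` grows while we scan it
        state = states[i]
        symbols = sorted({b[d] for (_h, b, d) in state if d < len(b)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(i, symbol)] = states.index(target)
        i += 1
    return states, edges

states, edges = build_states()
print(len(states), "states")
for (src, symbol), dst in sorted(edges.items()):
    print(f"I{src} --{symbol}--> I{dst}")
```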

Using goto() to make a Table
The columns of the table should be the grammar's terminals, $, and non-terminals.
The rows should be the numbers 0, 1, …, n of the sets I0, I1, …, I_n (what we've been calling states).

Stage 1
In stage 1, we add the shift, goto, and accept entries to the table.
– action[i, a] gets <shift j> if goto(I_i, a) == I_j, for a terminal a
– goto[i, A] gets j if goto(I_i, A) == I_j, for a non-terminal A
– action[i, $] gets accept if S => α • is in I_i (there must be only one S rule)

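Example Grammar 1
  S => A B
  A => a
  B => b
State graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4
Table after stage 1:
        action[]          goto[]
State   a    b    $    |  S    A    B
0       s2   -    -    |  -    1    -
1       -    s4   -    |  -    -    3
2       -    -    -    |  -    -    -
3       -    -    acc  |  -    -    -
4       -    -    -    |  -    -    -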

Stage 2
In stage 2, we add the reduce and error entries to the table.
– action[i, a] gets <reduce ruleNum> if [A => α •] is in I_i, A is not S, a is in FOLLOW(A), and A => α is rule number ruleNum
After filling the table cells with shift, goto, accept, and reduce actions, any remaining empty cells will trigger an error() call.
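Both stages can be sketched in a few lines once the states, the goto() edges, and the FOLLOW sets are known. The Python below is an illustration for Example Grammar 1 only; the states, edges, and FOLLOW sets are typed in by hand rather than computed, items are stored as `(rule number, dot position)` pairs, and all names are assumptions of the sketch.

```python
# Example Grammar 1, numbered as in the lecture.
RULES = {1: ("S", ("A", "B")), 2: ("A", ("a",)), 3: ("B", ("b",))}
NONTERMINALS = {"S", "A", "B"}

# Hand-entered results of closure()/goto(); items are (rule number, dot).
STATES = {
    0: {(1, 0), (2, 0)},   # S => • A B,  A => • a
    1: {(1, 1), (3, 0)},   # S => A • B,  B => • b
    2: {(2, 1)},           # A => a •
    3: {(1, 2)},           # S => A B •
    4: {(3, 1)},           # B => b •
}
EDGES = {(0, "A"): 1, (0, "a"): 2, (1, "B"): 3, (1, "b"): 4}
FOLLOW = {"S": {"$"}, "A": {"b"}, "B": {"$"}}

def build_slr_table():
    action, goto_table = {}, {}
    # Stage 1: shift and goto entries come straight from the edges.
    for (i, symbol), j in EDGES.items():
        if symbol in NONTERMINALS:
            goto_table[(i, symbol)] = j
        else:
            action[(i, symbol)] = ("shift", j)
    for i, items in STATES.items():
        for rule, dot in items:
            head, body = RULES[rule]
            if dot != len(body):
                continue                      # dot not at the end: no entry
            if rule == 1:
                action[(i, "$")] = ("accept", None)     # stage 1: accept
            else:
                for token in FOLLOW[head]:              # stage 2: reduce
                    action[(i, token)] = ("reduce", rule)
    return action, goto_table                 # missing keys mean error

action, goto_table = build_slr_table()
for key in sorted(action):
    print(key, action[key])
```

This sketch overwrites a cell silently if two actions land in it, so it is only suitable for grammars without conflicts.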

Finishing the Example Table
The reduce states are the state boxes at the leaves of the closure graph
– but exclude the end state
– in general, a reduce state is any state containing an item with the dot at the right-hand end
For the example 1 grammar, there are two boxes at the leaves: I2 and I4.
State graph: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4

I2 Reduction
Grammar: 1: S => A B, 2: A => a, 3: B => b
I2 = { A => a • }
– A => a is rule number 2
– FOLLOW(A) == FIRST(B) = { b }
So action[2, b] gets <reduce 2>.

I4 Reduction
I4 = { B => b • }
– B => b is rule number 3
– FOLLOW(B) = { $ }
So action[4, $] gets <reduce 3>.

Adding Reduce Entries
Grammar: 1: S => A B, 2: A => a, 3: B => b
        action[]          goto[]
State   a    b    $    |  S    A    B
0       s2   -    -    |  -    1    -
1       -    s4   -    |  -    -    3
2       -    r2   -    |  -    -    -
3       -    -    acc  |  -    -    -
4       -    -    r3   |  -    -    -

Using the Example 1 Table
Grammar: 1: S => A B, 2: A => a, 3: B => b
Stack        Input    Action
$0           a b $    Shift 2
$0,a2        b $      Reduce 2 (A => a)
$0,A1        b $      Shift 4
$0,A1,b4     $        Reduce 3 (B => b)
$0,A1,B3     $        Accept (S => A B)
$0           $
– Reduce 2: pop 1 pair; state' = 0; push(A, goto(0, A)) == push(A, 1)
– Reduce 3: pop 1 pair; state' = 1; push(B, goto(1, B)) == push(B, 3)

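Example Grammar 2
  S => a A B e
  A => A b c | b
  B => d
State graph: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8
Table after stage 1:
        action[]                        goto[]
State   a    b    c    d    e    $    |  S    A    B
0       s1   -    -    -    -    -    |  -    -    -
1       -    s3   -    -    -    -    |  -    2    -
2       -    s5   -    s6   -    -    |  -    -    4
3       -    -    -    -    -    -    |  -    -    -
4       -    -    -    -    s7   -    |  -    -    -
5       -    -    s8   -    -    -    |  -    -    -
6       -    -    -    -    -    -    |  -    -    -
7       -    -    -    -    -    acc  |  -    -    -
8       -    -    -    -    -    -    |  -    -    -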

Reduce States
For the example 2 grammar, there are three boxes at the leaves: I3, I6, and I8.

I3 Reduction
Grammar: 1: S => a A B e, 2: A => A b c, 3: A => b, 4: B => d
I3 = { A => b • }
– A => b is rule number 3
– FOLLOW(A) = {b} ∪ FIRST(B) = {b, d}
So action[3, b] and action[3, d] get <reduce 3>.

I6 Reduction
I6 = { B => d • }
– B => d is rule number 4
– FOLLOW(B) = {e}
So action[6, e] gets <reduce 4>.

I8 Reduction
I8 = { A => A b c • }
– A => A b c is rule number 2
– FOLLOW(A) = {b, d}
So action[8, b] and action[8, d] get <reduce 2>.
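The FOLLOW sets used in these reductions can be computed with a short fixed-point loop. The Python below is a simplified sketch that assumes the grammar has no empty (ε) productions, which holds for the example grammar but not in general; the function names are made up for the illustration.

```python
GRAMMAR = [
    ("S", ("a", "A", "B", "e")),
    ("A", ("A", "b", "c")),
    ("A", ("b",)),
    ("B", ("d",)),
]
NONTERMINALS = {head for head, _body in GRAMMAR}

def first_sets():
    """FIRST of each non-terminal. Assumes no production has an empty body."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            lead = body[0]
            new = first[lead] if lead in NONTERMINALS else {lead}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(start="S"):
    """FOLLOW of each non-terminal, under the same no-epsilon assumption."""
    first = first_sets()
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add("$")                 # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, symbol in enumerate(body):
                if symbol not in NONTERMINALS:
                    continue
                if i + 1 < len(body):      # something comes after `symbol`
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                      # `symbol` ends the body
                    new = follow[head]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow

follow = follow_sets()
print({nt: sorted(tokens) for nt, tokens in sorted(follow.items())})
```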

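Adding Reduce Entries
Grammar: 1: S => a A B e, 2: A => A b c, 3: A => b, 4: B => d
        action[]                        goto[]
State   a    b    c    d    e    $    |  S    A    B
0       s1   -    -    -    -    -    |  -    -    -
1       -    s3   -    -    -    -    |  -    2    -
2       -    s5   -    s6   -    -    |  -    -    4
3       -    r3   -    r3   -    -    |  -    -    -
4       -    -    -    -    s7   -    |  -    -    -
5       -    -    s8   -    -    -    |  -    -    -
6       -    -    -    -    r4   -    |  -    -    -
7       -    -    -    -    -    acc  |  -    -    -
8       -    r2   -    r2   -    -    |  -    -    -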

5. LR Conflicts
An LR conflict occurs when a cell in the action part of the parse table contains more than one action.
There are two kinds of conflict:
– shift/reduce and reduce/reduce
Conflicts appear because of:
– grammar ambiguity
– limitations of the SLR parsing method (even when the grammar is unambiguous)

Shift/Reduce
A shift/reduce conflict occurs when the parser cannot decide whether to shift the next symbol or reduce with a production
– typically, the default action is to shift

Dangling Else Example
Grammar rule:
  IfStmt => if Expr then Stmt
          | if Expr then Stmt else Stmt
Example:
  if (a == 1) then
    if (b == 4) then x = 2;
    else ...    <-- this goes with which 'if'?

On the Stack
Stack                      Input       Action
$ …                        … $         …
$ … if Expr then Stmt      else … $    shift or reduce?
Choose shift, so the else matches the closest if.

Reduce/Reduce
A reduce/reduce conflict occurs when the parser cannot decide which production to use to make a reduction.
– typically, the first suitable production is used

Example
Grammar:
  C => A B
  A => a
  B => a
Stack    Input    Action
$        a a $    shift
$ a      a $      reduce A => a or B => a ?
Choose A => a, since it's the first suitable one.
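A table generator can detect both kinds of conflict at the moment it tries to write a second action into a cell. The Python sketch below applies the two default policies described above: prefer shift for shift/reduce, and prefer the first production (taken here to mean the lower rule number, which is an assumption) for reduce/reduce. The function name `add_action` and the state and rule numbers in the demonstration are hypothetical and are not taken from a real table.

```python
def add_action(table, state, token, new):
    """Store `new` in table[(state, token)]; return the conflict kind, if any.

    Actions are tuples such as ("shift", 10) or ("reduce", 2).
    """
    key = (state, token)
    old = table.get(key)
    if old is None or old == new:
        table[key] = new
        return None                           # no conflict
    kinds = {old[0], new[0]}
    if kinds == {"shift", "reduce"}:
        conflict = "shift/reduce"
        winner = old if old[0] == "shift" else new       # default: shift
    elif kinds == {"reduce"}:
        conflict = "reduce/reduce"
        winner = min(old, new, key=lambda act: act[1])   # default: first rule
    else:
        raise ValueError(f"unexpected clash in cell {key}: {old} vs {new}")
    table[key] = winner
    return conflict

# Hypothetical cells: a dangling-else style clash, then two reductions.
table = {}
add_action(table, 9, "else", ("reduce", 1))
print(add_action(table, 9, "else", ("shift", 10)), table[(9, "else")])
add_action(table, 2, "a", ("reduce", 3))
print(add_action(table, 2, "a", ("reduce", 2)), table[(2, "a")])
```

A real generator would normally also report each conflict to the user, since a silently applied default can hide a grammar bug.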

6. LL, SLR, LR, LALR Grammars
[Figure: ovals labelled LL(1), LR(0), SLR, LALR(1), and LR(1). The LR ovals are nested: LR(0) inside SLR, inside LALR(1), inside LR(1).]
– the ovals represent the complexity of the grammars that the notation can handle
– we've been using SLR in this chapter
– LL(1) was used in chapter 5 on top-down parsing

LR(1) Grammars
LR(1) parsing uses one token of lookahead to avoid conflicts in the parsing table.
It can deal with more complex/powerful grammars than LR(0) or SLR.
An LR(1) grammar takes longer to convert into a parse table.

LALR(1) Grammars
LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce the size of the parse table.
LALR(1) is less powerful than LR(1)
– it may introduce reduce/reduce conflicts, but that's not likely for programming language grammars
LALR(1) is used by the YACC parsing tool
– see next chapter
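The slides do not spell out how LR(1) states are combined. The usual approach is to merge states whose items are identical once the lookahead tokens are ignored (the same "core"). The Python sketch below shows only that grouping step, under the assumption that an LR(1) item is a `(head, body, dot, lookahead)` tuple; re-pointing the goto edges at the merged states is left out. The three sample states use a small made-up grammar fragment (C => c C | d) that is not from this lecture.

```python
from collections import defaultdict

def core(state):
    """Drop the lookahead from every LR(1) item, leaving LR(0) items."""
    return frozenset((head, body, dot) for (head, body, dot, _la) in state)

def merge_by_core(states):
    """Union together all LR(1) states that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]

# Hypothetical LR(1) states: the first two differ only in their lookaheads.
state_x = frozenset({("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")})
state_y = frozenset({("C", ("d",), 1, "$")})
state_z = frozenset({("C", ("c", "C"), 2, "$")})

merged = merge_by_core([state_x, state_y, state_z])
print(len(merged), "states after merging")
```

Merging lookaheads this way is what can introduce the reduce/reduce conflicts mentioned above: two reductions that were kept apart by different lookaheads in separate LR(1) states may end up in the same cell.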