Lecture 5: LR Parsing CS 540 George Mason University.

Slides:



Advertisements
Similar presentations
Parsing V: Bottom-up Parsing
Advertisements

A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II.
Review: LR(k) parsers a1 … a2 … an $ LR parsing program Action goto Sm xm … s1 x1 s0 output input stack Parsing table.
1 May 22, May 22, 2015May 22, 2015May 22, 2015 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University Azusa Pacific University,
LR Parsing Table Costruction
Bhaskar Bagchi (11CS10058) Lecture Slides( 9 th Sept. 2013)
– 1 – CSCE 531 Spring 2006 Lecture 9 SLR Parse Table Construction Topics SLR Parse Table Construction Sets of Items Closure(I) Readings: 4.7 Homework:
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
Pertemuan 12, 13, 14 Bottom-Up Parsing
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Lecture #8, Feb. 7, 2007 Shift-reduce parsing,
Parsing V Introduction to LR(1) Parsers. from Cooper & Torczon2 LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited.
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
1 CIS 461 Compiler Design & Construction Fall 2012 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #12 Parsing 4.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
LALR Parsing Canonical sets of LR(1) items
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
Syntax and Semantics Structure of programming languages.
Joey Paquet, 2000, 2002, 2012, Lecture 6 Bottom-Up Parsing.
410/510 1 of 21 Week 2 – Lecture 1 Bottom Up (Shift reduce, LR parsing) SLR, LR(0) parsing SLR parsing table Compiler Construction.
 an efficient Bottom-up parser for a large and useful class of context-free grammars.  the “ L ” stands for left-to-right scan of the input; the “ R.
1 Compiler Construction Syntax Analysis Top-down parsing.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 1 Chapter 4 Chapter 4 Bottom Up Parsing.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
1 LR Parsers  The most powerful shift-reduce parsing (yet efficient) is: LR(k) parsing. LR(k) parsing. left to right right-most k lookhead scanning derivation.
Chapter 3-3 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR 
Lesson 9 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Syntax and Semantics Structure of programming languages.
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
111 Chapter 6 LR Parsing Techniques Prof Chung. 1.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
Three kinds of bottom-up LR parser SLR “Simple LR” –most restrictions on eligible grammars –built quite directly from items as just shown LR “Canonical.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
Conflicts in Simple LR parsers A SLR Parser does not use any lookahead The SLR parsing method fails if knowing the stack’s top state and next input token.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
Syntax and Semantics Structure of programming languages.
Announcements/Reading
Lec04-bottomupparser 4/13/2018 LR Parsing.
Programming Languages Translator
Bottom-up parsing Goal of parser : build a derivation
Compiler design Bottom-up parsing Concepts
lec04-bottomupparser June 6, 2018 Bottom-Up Parsing
Unit-3 Bottom-Up-Parsing.
UNIT - 3 SYNTAX ANALYSIS - II
CS 488 Spring 2012 Lecture 4 Bapa Rao Cal State L.A.
Table-driven parsing Parsing performed by a finite state machine.
Compiler Construction
Fall Compiler Principles Lecture 4: Parsing part 3
LALR Parsing Canonical sets of LR(1) items
Syntax Analysis Part II
Subject Name:COMPILER DESIGN Subject Code:10CS63
CS 540 George Mason University
Syntax Analysis source program lexical analyzer tokens syntax analyzer
4d Bottom Up Parsing.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Bottom Up Parsing.
Compiler SLR Parser.
Chap. 3 BOTTOM-UP PARSING
Presentation transcript:

Lecture 5: LR Parsing CS 540 George Mason University

CS 540 Spring 2009 GMU2 Static Analysis - Parsing Scanner (lexical analysis) Parser (syntax analysis) Code Optimizer Semantic Analysis (IC generator) Code Generator Symbol Table Source language tokens Syntatic structure Syntatic/semantic structure Target language Syntax described formally Tokens organized into syntax tree that describes structure Error checking

CS 540 Spring 2009 GMU3 LL vs. LR LR (shift reduce) is more powerful than LL (predictive parsing) Can detect a syntactic error as soon as possible. LR is difficult to do by hand (unlike LL)

CS 540 Spring 2009 GMU4 LR(k) Parsing – Bottom Up Construct parse tree from leaves, ‘reducing’ the string to the start symbol (and a single tree) During parse, we have a ‘forest’ of trees Shift-reduce parsing –‘Shift’ a new input symbol –‘Reduce’ a group of symbols to a single non- terminal –Choice is made using the k lookaheads LR(1)

CS 540 Spring 2009 GMU5 Example S  a T R e T  T b c | b R  d Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e S a T R e T b c d b LR parsing corresponds to the rightmost derivation in reverse.

CS 540 Spring 2009 GMU6 Shift Reduce Parsing S  a T R e T  T b c | b R  d Remaining input: abbcde Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU7 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b  Remaining input: bcde Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU8 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b   T Remaining input: bcde Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU9 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c    T b c Remaining input: de Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU10 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c Reduce T  T b c     T b c T Remaining input: de Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU11 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c Reduce T  T b c Shift d      T b c T d Remaining input: e Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU12 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c Reduce T  T b c Shift d Reduce R  d       T b c T d R Remaining input: e Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU13 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c Reduce T  T b c Shift d Reduce R  d Shift e        T b c T d R e Remaining input: Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU14 Shift Reduce Parsing S  a T R e T  T b c | b R  d a b Shift a, Shift b Reduce T  b Shift b, Shift c Reduce T  T b c Shift d Reduce R  d Shift e Reduce S  a T R e         T b c T d R e S Remaining input: Rightmost derivation: S  a T R e  a T d e  a T b c d e  a b b c d e

CS 540 Spring 2009 GMU15 LR Parsing Data Structures: –Stack – contains symbol/state pairs. The state on top of stack summarizes the information below. –Tables: Action: state x   reduce/shift/accept/error Goto: state x V n  state

CS 540 Spring 2009 GMU16 Example LR Table Stateabcde$STR 0s1 1s32 2s5s64 3r3 4s7 5s8 6r4 7acc 8r2 1: S  a T R e 2: T  T b c 3: T  b 4: R  d Action table Goto table s means shift to to some state r means reduce by some production

CS 540 Spring 2009 GMU17 Algorithm: LR(1) push($,0); /* always pushing a symbol/state pair */ lookahead = yylex(); loop s = top(); /*always a state */ if action[s,lookahead] = shift s’ push(lookahead,s’); lookahead = yylex(); else if action[s,lookahead] = reduce A   pop size of  pairs s’ = state on top of stack push(A,goto[s’,A]); else if action[s,lookahead] = accept then return else error(); end loop;

CS 540 Spring 2009 GMU18 LR Parsing Example 1 StackInputAction $0$0a b b c d e $s1

CS 540 Spring 2009 GMU19 LR Parsing Example 1 StackInputAction $0a b b c d e $s1 $0,a1b b c d e $s3

CS 540 Spring 2009 GMU20 LR Parsing Example 1 StackInputAction $0a b b c d e $s1 $0,a1b b c d e $s3 $0,a1,b3b c d e $ r3 (T  b)

CS 540 Spring 2009 GMU21 LR Parsing Example 1 StackInputAction $0a b b c d e $s1 $0,a1b b c d e $s3 $0,a1,b3b c d e $ r3 (T  b) $0,a1,T2b c d e $s5 $0,a1,T2,b5c d e $s8 $0,a1,T2,b5,c8d e $ r2 (T  T b c) goto(T,1)=2

CS 540 Spring 2009 GMU22 LR Parsing Example 1 StackInputAction $0a b b c d e $s1 $0,a1b b c d e $s3 $0,a1,b3b c d e $ r3 (T  b) $0,a1,T2b c d e $s5 $0,a1,T2,b5c d e $s8 $0,a1,T2,b5,c8d e $ r2 (T  T b c) $0,a1,T2d e $s6 $0,a1,T2,d6e $ r4 (R  d) goto(T,1)=2

CS 540 Spring 2009 GMU23 LR Parsing Example 1 StackInputAction $0a b b c d e $s1 $0,a1b b c d e $s3 $0,a1,b3b c d e $ r3 (T  b) $0,a1,T2b c d e $s5 $0,a1,T2,b5c d e $s8 $0,a1,T2,b5,c8d e $ r2 (T  T b c) $0,a1,T2d e $s6 $0,a1,T2,d6e $ r4 (R  d) $0,a1,T2,R4e $s7 $0,a1,T2,R4,e7$accept! goto(R,2)=4

CS 540 Spring 2009 GMU24 LR Parse Stack During LR parsing, there is always a ‘forest’ of trees. Parse stack holds root of each of these trees: –For example, that stack $0,a1,T2,b5,c8 represents the corresponding forest a b T b c

CS 540 Spring 2009 GMU25 The next stack: $0,a1,T2 a b T b c T Later, we have $0,a1,T2,R6,e7 a b T b c T d R e

CS 540 Spring 2009 GMU26 Where does the table come from? Handle – “a substring that matches the right side of a production and whose reduction to the non-terminal represents one step along the reverse of a rightmost derivation” Using the grammar, want to create a DFA to find handles.

CS 540 Spring 2009 GMU27 SLR parsing Simplest LR algorithm Provide an understanding of –the basic mechanics of shift/reduce parsing –source of shift/reduce and reduce/reduce conflicts There are better (more powerful) algorithms (LALR, LR) but we won’t study them here.

CS 540 Spring 2009 GMU28 Generating SLR parse tables Augmented grammar: grammar with new start symbol and production S’  S where S is old start symbol. –Augmentation only required if there is no single production to signal the end. Construct C = {…} the LR(0) items Construct Action table for state i of parser: All undefined entries are error

CS 540 Spring 2009 GMU29 LR(0) items Canonical LR(0) collections are the basis for constructing SLR (simple LR) parsers Defn: LR(0) item of a grammar G is a production of G with a dot at some point on the right side. A  X Y Z yields four different LR(0) items: –A . X Y Z –A  X. Y Z –A  X Y. Z –A  X Y Z. A   yields one item –A .

CS 540 Spring 2009 GMU30 Closure(I) function Closure(I) where I is a set of LR(0) items = –Every item in I (kernel) and –If A  . B  in closure(I) and B   is a production, add B .  to closure(I) (if not already there). –Keep applying this rule until no more items can be added.

CS 540 Spring 2009 GMU31 Closure Example E’  E E  E + T | T T  T * F | F F  ( E ) | id Closure({T  T *. F}) = {T  T *. F, F . ( E ), F . id} Closure({E  E. + T, F . id}) = {E  E. + T, F . id}

CS 540 Spring 2009 GMU32 Closure Example E’  E E  E + T | T T  T * F | F F  ( E ) | id Closure({F  (. E )} = {F  (. E ), E . E + T, E . T} = {F  (. E ), E . E + T, E . T, T . T * F, T . F} = {F  (. E ), E . E + T, E . T, T . T * F, T . F, F . Id, F . ( E ) }

CS 540 Spring 2009 GMU33 Goto function Goto(I,X), where I is a set of items and X is a grammar symbol, is the closure(A   X.  where A  . X  is in I. Ex: Goto({E’  E., E  E. + T},+) = closure({E  E +. T}) = {E  E +. T, T . T * F, T . F, F . id, F . ( E )}

CS 540 Spring 2009 GMU34 Goto function Goto({T  T *. F, T . F},F) = closure({T  T * F., T  F. }) = {T  T * F., T  F.} Goto({E’  E., E  E +. T},+) = closure(  ) =  since + does not occur before the. symbol

CS 540 Spring 2009 GMU35 Algorithm: Finding canonical collection C = {I 0,I 1,…,I n } for grammar G C ={closure({S’ . S})} for start symbol S’ Repeat –For each I k in C and grammar symbol X such that Goto(I k,X) is not empty and not in C Add Goto(I k,X) to C I0I0

CS 540 Spring 2009 GMU36 Example 1 Grammar: S  a T R e, T  T b c | b, R  d I 0 : S . a T R e Goto({S . a T R e },a) = I 1 I 1 : S  a. T R e Goto({S  a. T R e, T . T b c},T) T . T b c = I 2 T . b Goto({T . b },b) = I 3 I 2 : S  a T. R e goto 4 T  T. b c goto 5 R . d goto 6 I0I0 I1I1 I2I2 I3I3 I6I6 I5I5 I4I4 a bT R b d kernel of each item set is in blue

CS 540 Spring 2009 GMU37 Example 1 Grammar: S  a T R e, T  T b c | b, R  d I 3 : T  b. reduce I 4 : S  a T R. e goto state 7 I 5 : T  T b. c goto state 8 I 6 : R  d. reduce I 7 : S  a T R e. reduce I 8 : T  T b c. reduce I0I0 I1I1 I2I2 I3I3 I6I6 I5I5 I4I4 a bT R b d I8I8 I7I7 ec

CS 540 Spring 2009 GMU38 Algorithm: Canonical sets state = 0; max_state = 1; kernel[0] = [S’ . S] loop c = closure(kernel[state]); for t in c, where all productions are form A  . B  if exists k <= state where t = kernel[k] then goto(state,B) = k; else kernel[max_state] = goto(state,B) = t; max_state++; state++; until state+1 = max_state;

CS 540 Spring 2009 GMU39 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 0 : S’ . S S . A S S . b A . S A A . c I 1 : S’  S. A  S. A A . S A A . c S . A S S . b

CS 540 Spring 2009 GMU40 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 2 : S  A. S S . A S S . b A . S A A . c I 3 : A  c. I 4 : S  b. I0I0 I1I1 I2I2 I3I3 I4I4 S A c b So far: c b I5I5 I6I6 A S I7I7 S A c b

CS 540 Spring 2009 GMU41 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 5 : S  A. S I 6 : A  S. A I 7 : S  A S. A  S A. A . S A A  S. A S . A S A . c A . S A S . b S . A S A . c A . S A S . b S . A S A . c S . b

CS 540 Spring 2009 GMU42 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 0 : S’ . S I 1 : S’  S. A  S. A I 2 : S  A. S I 3 : A  c. I 4 : S  b. I 5 : S  A. S A  S A. I 6 : A  S. A I 7 : S  A S. A  S. A I0I0 I1I1 I2I2 I3I3 I4I4 S A c b So far: c b I5I5 I6I6 A S I7I7 S A c b A A S A S S I5—I7 also have connections to I3 and I4

CS 540 Spring 2009 GMU43 Generating SLR parse tables Construct C = {…} the LR(0) items as in previous slides Action table for state i of parser: –If [A  . a  ] in I i, goto(I i,a) = I j then action[i,a] = shift j –If [A  .,b] in I i, where A is not S’, then action[i,a] = reduce A   for all a in FOLLOW(A) –If [S’  S,$] in I i, set action[i,$] = accept All undefined entries are error Goto Table for state i of parser: –If [A  . B] in I i and goto(I i,B) = I j then goto[i,B] = j

CS 540 Spring 2009 GMU44 Example 2 Grammar: S’  S, S  A S | b, A  S A | c FirstFollow S’cb$ S $cb Acb

CS 540 Spring 2009 GMU45 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 0 : S’ . S goto 1 S . A S goto 2 S . b goto 3 A . S A goto 1 A . c goto 4 I 1 : S’  S. reduce A  S. A goto 5 A . S A goto 6 A . c goto 4 S . A S goto 5 S . b goto 3 Statecb$SA 0s4s312 1s4s3acc

CS 540 Spring 2009 GMU46 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 2 : S  A. S S . A S S . b A . S A A . A I 3 : S  b. I 4 : A  c. I0I0 I1I1 I2I2 I4I4 I3I3 S A c b So far: c b I5I5 I6I6 A S I7I7 S A c b

CS 540 Spring 2009 GMU47 LR Table for Example 2 Statecb$SA 0s4s312 1s4s3acc65 2s4s372 3r3 4r : S’  S 2: S  A S 3: S  b 4: A  S A 5: A  c

CS 540 Spring 2009 GMU48 Example 2 Grammar: S’  S, S  A S | b, A  S A | c I 5 : S  A. S I 6 : A  S. A I 7 : S  A S. A  S A. A . S A A  S. A S . A S A . c A . S A S . b S . A S A . c A . S A S . b S . A S A . c S . b

CS 540 Spring 2009 GMU49 LR Table for Example 2 Statecb$SA 0s4s312 1s4s3acc65 2s4s372 3r3 4r5 5s4/r4s3/r472 6s4s365 7s4/r2s3/r2r265 1: S’  S 2: S  A S 3: S  b 4: A  S A 5: A  c

CS 540 Spring 2009 GMU50 LR Conflicts Shift/reduce –When it cannot be determined whether to shift the next symbol or reduce by a production –Typically, the default is to shift. –Examples: previous grammar, dangling else if_stmt  if expr then stmt | if expr then stmt else stmt if ex1 then if ex2 then stmt; else  which ‘if’ owns this else??

CS 540 Spring 2009 GMU51 LR Conflicts Reduce/reduce –When it cannot be determined which production to reduce by –Example: stmt  id ( expr_list )  function call expr  id ( expr_list )  array (as in Ada) –Convention: use first production in grammar or use more powerful technique

CS 540 Spring 2009 GMU52 Error Recovery in LR parsing Just as with LL, we typically want to discard some part of the input and resume parsing from some ‘known’ point. –Search back in the stack for some non-terminal A (how to choose A?) then process input until find token in Follow(A) –Can also decorate the LR table with error recovery routines tailored to the state and token – more complicated to get right.