
Syntax Analysis – Part VI: LR(1) Parsing
EECS 483 – Lecture 9
University of Michigan
Wednesday, October 4, 2006

Announcements/Reading
- Project 1 grading: Room 4817 CSE (room at top of the stairs by the restrooms)
- Project 2: details still in progress; hopefully available by Friday

From Last Time: LR(1) Parsing
- Get as much as possible out of 1 lookahead symbol when building the parsing table
- LR(1) grammar = recognizable by a shift/reduce parser with 1 symbol of lookahead
- LR(1) parsing uses concepts similar to LR(0)
  - Parser state = set of items
  - LR(1) item = LR(0) item + a lookahead symbol that may follow the production
    - LR(0) item: S → . S + E
    - LR(1) item: S → . S + E , +
- The lookahead only has an impact on REDUCE operations: reduce only when the lookahead equals the next input symbol
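Such items can be represented concretely. Below is a minimal Python sketch; the names `Item` and `show` are my own, not from any particular tool.

```python
# A concrete representation of LR(1) items as a minimal Python sketch;
# the names Item and show are illustrative, not from the original slides.
from collections import namedtuple

Item = namedtuple("Item", ["lhs", "rhs", "dot", "lookahead"])

def show(item):
    """Render an item in the slide notation: X -> alpha . beta , y"""
    rhs = list(item.rhs)
    rhs.insert(item.dot, ".")
    return f"{item.lhs} -> {' '.join(rhs)} , {item.lookahead}"

# The LR(0) item S -> . S + E extended with lookahead '+' gives the LR(1) item:
lr1 = Item(lhs="S", rhs=("S", "+", "E"), dot=0, lookahead="+")
print(show(lr1))  # S -> . S + E , +
```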

LR(1) States
- LR(1) state = set of LR(1) items
- LR(1) item = (X → α . β , y)
  - Meaning: α has already been matched at the top of the stack; next we expect to see β y
- Shorthand notation: (X → α . β , {x1, ..., xn}) means (X → α . β , x1) ... (X → α . β , xn)
- Need to extend the closure and goto operations
Example state:
  S → S . + E , {+,$}
  S → S + . E , num

LR(1) Closure
The LR(1) closure operation:
- Start with Closure(S) = S
- For each item X → α . Y β , z in S and for each production Y → γ, add the item Y → . γ , w to the closure of S, for every w in FIRST(βz)
- Repeat until nothing changes
Similar to LR(0) closure, but also keeps track of the lookahead symbol
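The steps above transcribe fairly directly into code. The following is a sketch, not the slides' own implementation: items are tuples `(lhs, rhs, dot, lookahead)`, and the `first` helper assumes no nullable non-terminals and no FIRST-level left recursion (both hold for the sum grammar used here).

```python
# A Python transcription of the LR(1) closure rule; all names are mine.

def first(symbols, grammar, terminals):
    """FIRST of a symbol string. Assumes no nullable non-terminals and
    no FIRST-level left recursion (both hold for the sum grammar)."""
    sym = symbols[0]
    if sym in terminals:
        return {sym}
    return {t for rhs in grammar[sym] for t in first(rhs, grammar, terminals)}

def closure(items, grammar, terminals):
    """For each item (X -> alpha . Y beta, z) and production Y -> gamma,
    add (Y -> . gamma, w) for every w in FIRST(beta z); repeat to a fixpoint."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot, z) in list(result):
            if dot < len(rhs) and rhs[dot] not in terminals:  # dot before non-terminal Y
                Y, beta = rhs[dot], rhs[dot + 1:]
                for gamma in grammar[Y]:
                    for w in first(beta + (z,), grammar, terminals):
                        if (Y, gamma, 0, w) not in result:
                            result.add((Y, gamma, 0, w))
                            changed = True
    return result

# The sum grammar from the slides: S' -> S $ ; S -> E + S | E ; E -> num
grammar = {"S'": [("S", "$")], "S": [("E", "+", "S"), ("E",)], "E": [("num",)]}
terminals = {"num", "+", "$"}
state = closure({("S'", ("S", "$"), 0, "#")}, grammar, terminals)  # '#' = dummy lookahead
for item in sorted(state, key=str):
    print(item)
```

Running this reproduces the start state of the next slide, including E → . num with both lookaheads + and $.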

LR(1) Start State
- Initial state: start with (S' → . S , $), then apply the closure operation
- Example: the sum grammar
    S' → S $
    S → E + S | E
    E → num
  Applying closure to { S' → . S , $ } yields:
    S' → . S , $
    S → . E + S , $
    S → . E , $
    E → . num , {+,$}

LR(1) Goto Operation
- The LR(1) goto operation describes the transitions between LR(1) states
- Algorithm: for a state I and a symbol Y (as before):
  If the item [X → α . Y β , z] is in I, then
  Goto(I, Y) = Closure({ X → α Y . β , z })
- Example (grammar: S' → S$ ; S → E + S | E ; E → num):
    S1 = { S → E . + S , $
           S → E . , $ }
    Goto(S1, '+') = S2 = Closure({ S → E + . S , $ })
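The goto step itself is small: advance the dot over the symbol in every item that expects it, then close the resulting kernel. A sketch, with the closure operation passed in as a parameter (`close`) and all names my own:

```python
# Goto sketch: advance the dot over `symbol`, then close the kernel.

def goto(items, symbol, close):
    """Items are tuples (lhs, rhs, dot, lookahead); `close` stands in
    for the closure operation from the previous slide."""
    kernel = {(lhs, rhs, dot + 1, z)
              for (lhs, rhs, dot, z) in items
              if dot < len(rhs) and rhs[dot] == symbol}
    return close(kernel)

# State S1 from the slide: { S -> E . + S , $ ; S -> E . , $ }
s1 = {("S", ("E", "+", "S"), 1, "$"), ("S", ("E",), 1, "$")}

# With the identity function in place of a full closure, Goto(S1, '+')
# yields exactly the kernel of S2: { S -> E + . S , $ }
print(goto(s1, "+", close=lambda kernel: kernel))
```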

Class Problem
Given the grammar:
  S' → S $
  S → E + S | E
  E → num
1. Compute: Closure(I = { S → E + . S , $ })
2. Compute: Goto(I, num)
3. Compute: Goto(I, E)

LR(1) DFA Construction
Grammar: S' → S$ ; S → E + S | E ; E → num

State 1:  S' → . S , $           on S   → State 2
          S → . E + S , $        on E   → State 3
          S → . E , $            on num → State 4
          E → . num , {+,$}
State 2:  S' → S . , $
State 3:  S → E . + S , $        on +   → State 5
          S → E . , $
State 4:  E → num . , {+,$}
State 5:  S → E + . S , $        on S   → State 6
          S → . E + S , $        on E   → State 3
          S → . E , $            on num → State 4
          E → . num , {+,$}
State 6:  S → E + S . , $

LR(1) Reductions
- Reductions correspond to LR(1) items of the form (X → α . , y)
- In the DFA above, the reduce items are:
    S' → S . , $
    E → num . , {+,$}
    S → E . , $
    S → E + S . , $

LR(1) Parsing Table Construction
Same as the LR(0) construction, except for the reductions:
- For a transition S → S' on terminal x: Table[S,x] += Shift(S')
- For a transition S → S' on non-terminal N: Table[S,N] += Goto(S')
- If state I contains the item (X → α . , y), then: Table[I,y] += Reduce(X → α)
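Putting closure, goto, and the table rules together gives a small end-to-end construction. This sketch uses the conventional augmented start S' → S with $ as the end-of-input lookahead (a slight variation on the slides' S' → S $), and every name in it is illustrative rather than from any real generator:

```python
# End-to-end LR(1) DFA + table construction for the sum grammar (sketch).
from collections import defaultdict

GRAMMAR = {"S'": [("S",)], "S": [("E", "+", "S"), ("E",)], "E": [("num",)]}
TERMINALS = {"num", "+", "$"}

def first(seq):
    sym = seq[0]
    if sym in TERMINALS:
        return {sym}
    return {t for rhs in GRAMMAR[sym] for t in first(rhs)}

def closure(items):
    items, work = set(items), list(items)
    while work:
        lhs, rhs, dot, z = work.pop()
        if dot < len(rhs) and rhs[dot] not in TERMINALS:
            for gamma in GRAMMAR[rhs[dot]]:
                for w in first(rhs[dot + 1:] + (z,)):
                    item = (rhs[dot], gamma, 0, w)
                    if item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)

def goto(state, sym):
    return closure({(l, r, d + 1, z) for (l, r, d, z) in state
                    if d < len(r) and r[d] == sym})

def build():
    start = closure({("S'", ("S",), 0, "$")})
    states, work = [start], [start]
    table = defaultdict(set)           # (state index, symbol) -> actions
    while work:
        state = work.pop()
        i = states.index(state)
        for (lhs, rhs, dot, z) in state:
            if dot == len(rhs):                       # item (X -> alpha . , z)
                table[i, z].add(("reduce", lhs, rhs))
            else:
                nxt = goto(state, rhs[dot])
                if nxt not in states:
                    states.append(nxt)
                    work.append(nxt)
                kind = "shift" if rhs[dot] in TERMINALS else "goto"
                table[i, rhs[dot]].add((kind, states.index(nxt)))
    return states, table

states, table = build()
print(len(states), "states;",
      "conflict-free" if all(len(v) == 1 for v in table.values()) else "conflicts!")
```

For this grammar the construction yields six states and a conflict-free table, matching the DFA sketched on the earlier slide.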

LR(1) Parsing Table Example
Grammar: S' → S$ ; S → E + S | E ; E → num

State 1:  S' → . S , $           State 2:  S → E . + S , $
          S → . E + S , $                  S → E . , $
          S → . E , $
          E → . num , {+,$}      State 3:  S → E + . S , $
                                           S → . E + S , $
                                           S → . E , $
                                           E → . num , {+,$}

Fragment of the parsing table:

  State |  +  |    $     |  E
  ------+-----+----------+-----
    1   |     |          |  g2
    2   | s3  | r(S → E) |

Class Problem
Compute the LR(1) DFA for the following grammar:
  E → E + T | T
  T → T F | F
  F → F * | a | b

LALR(1) Grammars
- Problem with LR(1): too many states
- LALR(1) parsing (aka LookAhead LR):
  - Construct the LR(1) DFA, then merge any two LR(1) states whose items are identical except for the lookaheads
  - Results in smaller parser tables
  - Theoretically less powerful than LR(1)
- LALR(1) grammar = a grammar whose LALR(1) parsing table has no conflicts
Example: can these two states be merged, and does the merge introduce a conflict?
    S → id . , +        S → id . , $
    S → E . , $         S → E . , +

LALR Parsers
- LALR(1):
  - Generally the same number of states as SLR (much smaller than LR(1))
  - But with the same lookahead capability as LR(1) (much better than SLR)
- Example: the Pascal programming language
  - In SLR: several hundred states
  - In LR(1): several thousand states

LL/LR Grammar Summary
- LL parsing tables:
  - Non-terminals × terminals → productions
  - Computed using FIRST/FOLLOW
- LR parsing tables:
  - LR states × terminals → {shift/reduce}
  - LR states × non-terminals → goto
  - Computed using the closure/goto operations on LR states
- A grammar is LL(1) if its LL(1) parsing table has no conflicts; the same holds for LR(0), SLR, LALR(1), and LR(1)

Classification of Grammars
- LR(k) ⊂ LR(k+1)
- LL(k) ⊂ LL(k+1)
- LL(k) ⊂ LR(k)
- LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1)
(Venn diagram, not to scale: LR(0) inside SLR inside LALR(1) inside LR(1), with LL(1) overlapping LR(1))

Automate the Parsing Process
- Can automate:
  - The construction of LR parsing tables
  - The construction of shift-reduce parsers based on these parsing tables
- LALR(1) parser generators: yacc, bison
  - Not much difference compared to LR(1) in practice
  - Smaller parsing tables than LR(1)
  - Augment the LALR(1) grammar specification with declarations of precedence and associativity
  - Output: an LALR(1) parser program

Associativity
Left-recursive grammar:   S → S + E | E ; E → num
Ambiguous grammar:        E → E + E | num
What happens if we run the ambiguous grammar through the LALR construction? The state containing
  E → E + E . , +
  E → E . + E , {+,$}
has a shift/reduce conflict on the input 1 + 2 + 3:
  shift:  1 + (2 + 3)
  reduce: (1 + 2) + 3

Associativity (2)
- If an operator is left-associative:
  - Assign a slightly higher precedence to it when it is on the parse stack than when it is in the input stream
  - Since the stack precedence is higher, reduce takes priority (correct for left associativity)
- If an operator is right-associative:
  - Assign it a slightly higher precedence when it is in the input stream
  - Since the input-stream precedence is higher, shift takes priority (correct for right associativity)

Precedence
Grammar with precedence encoded in the rules:  E → E + T | T ; T → T x T | num | (E)
Ambiguous grammar:                             E → E + E | E x E | num | (E)
What happens if we run the ambiguous grammar through the LALR construction? States containing items such as
  E → E . + E , ...       E → E x E . , +
  E → E + E . , x         E → E . x E , ...
produce shift/reduce conflicts.
Precedence: attach precedence indicators to the terminals. A shift/reduce conflict is then resolved as follows:
1. If the precedence of the input token is greater than that of the last terminal on the parse stack, favor shift over reduce
2. If the precedence of the input token is less than or equal to that of the last terminal on the parse stack, favor reduce over shift
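The two resolution rules above can be sketched in a few lines. The precedence table and the function name are hypothetical; higher numbers bind tighter:

```python
# Sketch of precedence-based shift/reduce conflict resolution.
PRECEDENCE = {"+": 1, "x": 2}   # hypothetical table: higher = binds tighter

def resolve(stack_terminal, input_token):
    """Shift iff the input token has strictly greater precedence than the
    last terminal on the parse stack; otherwise reduce. Ties reduce,
    which makes operators of equal precedence left-associative."""
    if PRECEDENCE[input_token] > PRECEDENCE[stack_terminal]:
        return "shift"
    return "reduce"

print(resolve("+", "x"))   # in 1 + 2 x 3: shift, giving 1 + (2 x 3)
print(resolve("x", "+"))   # in 1 x 2 + 3: reduce, giving (1 x 2) + 3
print(resolve("+", "+"))   # ties reduce: + is left-associative
```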

Abstract Syntax Tree (AST) - Review
- Derivation = sequence of applied productions: S → E + S → 1 + S → 1 + E → 1 + 2
- Parse tree = graph representation of a derivation
  - Does not capture the order in which the productions were applied
- AST discards unnecessary information from the parse tree
(Figure: a parse tree and the corresponding, much smaller AST for an example expression)

Implicit AST Construction
- LL/LR parsing techniques implicitly build the AST
- The parse tree is captured in the derivation
  - LL parsing: AST represented by the applied productions
  - LR parsing: AST represented by the applied reductions
- We want to explicitly construct the AST during the parsing phase

AST Construction - LL
Grammar:
  S → E S'
  S' → ε | + S
  E → num | (S)
LL parsing: extend the procedures for the non-terminals.

Recognizer only:

  void parse_S() {
    switch (token) {
    case num:
    case '(':
      parse_E();
      parse_S'();
      return;
    default:
      ParseError();
    }
  }

Extended to build the AST:

  Expr parse_S() {
    switch (token) {
    case num:
    case '(':
      Expr left = parse_E();
      Expr right = parse_S'();
      if (right == NULL)
        return left;
      else
        return new Add(left, right);
    default:
      ParseError();
    }
  }
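The same extension can be made runnable. Below is a Python sketch of the extended LL procedures for this grammar; the tokenizer and the tuple-based AST (`("Num", n)`, `("Add", l, r)`) are my own minimal stand-ins, not part of the original slides.

```python
# A runnable sketch of the extended LL procedures for the grammar
#   S -> E S' ;  S' -> empty | + S ;  E -> num | ( S )

def tokenize(text):
    for ch in "()+":
        text = text.replace(ch, f" {ch} ")
    return text.split()

class Parser:
    def __init__(self, text):
        self.toks = tokenize(text) + ["$"]   # "$" marks end of input
        self.pos = 0

    def token(self):
        return self.toks[self.pos]

    def eat(self):
        self.pos += 1

    def parse_S(self):                       # S -> E S'
        if self.token().isdigit() or self.token() == "(":
            left = self.parse_E()
            right = self.parse_S_prime()
            return left if right is None else ("Add", left, right)
        raise SyntaxError(self.token())

    def parse_S_prime(self):                 # S' -> empty | + S
        if self.token() == "+":
            self.eat()
            return self.parse_S()
        return None                          # empty production: no subtree

    def parse_E(self):                       # E -> num | ( S )
        if self.token().isdigit():
            node = ("Num", int(self.token()))
            self.eat()
            return node
        if self.token() == "(":
            self.eat()
            node = self.parse_S()
            assert self.token() == ")"
            self.eat()
            return node
        raise SyntaxError(self.token())

print(Parser("1 + (2 + 3)").parse_S())
```

Note the shape mirrors the slide's code exactly: each non-terminal's procedure returns the sub-tree it parsed, and the S' procedure returns None for the empty production.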

AST Construction - LR
- We again need to add code for explicit AST construction
- AST construction mechanism:
  - Store parts of the tree on the stack
  - For each non-terminal symbol X on the stack, also store the sub-tree rooted at X
  - Whenever the parser performs a reduce operation for a production X → γ, create an AST node for X

AST Construction for LR - Example
Grammar: S → E + S | E ; E → num | (S)
Input string: "1 + 2 + 3"
Before the reduction by S → E + S, the stack holds (bottom to top):
  Num(1)   +   Add(Num(2), Num(3))
After the reduction:
  Add(Num(1), Add(Num(2), Num(3)))
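The reduction step above can be sketched directly: the stack holds '+' tokens and AST sub-trees for non-terminals, and each reduce by S → E + S pops three entries and pushes one Add node. The reduce sequence for "1 + 2 + 3" is hard-coded here; a real parser would consult its table.

```python
# Sketch of AST construction on the LR parser stack (hard-coded reduces).
stack = [("Num", 1), "+", ("Num", 2), "+", ("Num", 3)]

def reduce_E_plus_S(stack):
    """Reduce by S -> E + S: pop the sub-trees for E and S (and the '+')
    and push the combined Add node."""
    s = stack.pop()
    plus = stack.pop()
    e = stack.pop()
    assert plus == "+"
    stack.append(("Add", e, s))

reduce_E_plus_S(stack)   # stack is now: Num(1), '+', Add(Num(2), Num(3))
reduce_E_plus_S(stack)   # stack is now: Add(Num(1), Add(Num(2), Num(3)))
print(stack)
```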

Problems
- Unstructured code: parsing code is mixed with AST construction code
- Automatic parser generators:
  - The generated parser needs to contain the AST construction code
  - How do we construct a customized AST data structure using an automatic parser generator?
- We may also want to perform other actions concurrently with the parsing phase
  - E.g., semantic checks
  - This can reduce the number of compiler passes

Syntax-Directed Definition
Solution: syntax-directed definition
- Extends each grammar production with an associated semantic action (code):
    S → E + S   { action }
- The parser generator adds these actions into the generated parser
- Each action is executed when the corresponding production is reduced

Semantic Actions
- Actions = C code (for bison/yacc)
- The actions access the parser stack
  - Parser generators extend the stack of symbols with entries for user-defined structures (e.g., parse trees)
- The action code must be able to refer to the grammar symbols in the productions
  - It must distinguish multiple occurrences of the same non-terminal, and RHS vs. LHS occurrences, as in E → E + E
  - Use dollar variables in yacc/bison ($$, $1, $2, etc.):
      expr ::= expr PLUS expr   { $$ = $1 + $3; }

Building the AST
- Use semantic actions to build the AST
- The AST is built bottom-up along with the parsing
- Recall: a user-defined type for the objects on the stack (%union)

  expr ::= NUM              { $$ = new Num($1.val); }
  expr ::= expr PLUS expr   { $$ = new Add($1, $3); }
  expr ::= expr MULT expr   { $$ = new Mul($1, $3); }
  expr ::= LPAR expr RPAR   { $$ = $2; }