Compiler Principles, Fall 2014-2015. Lecture 4: Parsing part 3. Roman Manevich, Ben-Gurion University.

Presentation transcript:


Tentative syllabus
- Front End: Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR), Attribute Grammars
- Intermediate Representation: Lowering
- Optimizations: Local Optimizations, Dataflow Analysis, Loop Optimizations
- Code Generation: Register Allocation, Instruction Selection
- mid-term exam

Previously
- Top-down parsing
  - Recursive descent
  - Handling conflicts
  - LL(k) via pushdown automata

Agenda
- Shift-reduce (LR) parsing model
- Building the LR parsing table
- Types of conflicts

Shift-reduce parsing

Some terminology
- The opposite of derivation is called reduction
  - Let A → α be a production rule
  - Let βAµ be a sentential form
  - A reduction replaces α with A: βαµ → βAµ
- A handle is a substring that is reduced during a series of steps in a rightmost derivation

Using shift and reduce to parse
Grammar: E → E + (E), E → i
Input: 1 + (2) + (3)

Stack        Input            Action
             1 + (2) + (3)    shift
1            + (2) + (3)      reduce
E            + (2) + (3)      shift
E +          (2) + (3)        shift
E + (        2) + (3)         shift
E + (2       ) + (3)          reduce
E + (E       ) + (3)          shift
E + (E)      + (3)            reduce
E            + (3)            shift
E +          (3)              shift
E + (        3)               shift
E + (3       )                reduce
E + (E       )                shift
E + (E)                       reduce
E                             accept

On each step we either:
- shift a symbol from the input to the stack, or
- reduce symbols on the stack
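The trace above can be reproduced with a few lines of Python. This is only a minimal sketch for this particular grammar: it reduces greedily whenever the top of the stack matches a right-hand side, which happens to work here but is not how a general LR parser chooses its moves. The function name `shift_reduce` is hypothetical.

```python
# Grammar from the slide: E -> E + ( E ) | i, where i is a number token.
# Assumption: greedy "reduce whenever a right-hand side is on top" is
# enough for this grammar; real LR parsers consult a table instead.
def shift_reduce(tokens):
    """Return the list of actions taken while parsing `tokens`."""
    stack, actions, pos = [], [], 0
    while True:
        if stack and stack[-1].isdigit():
            # Reduce by E -> i: a number on top of the stack is a handle.
            stack[-1] = "E"
            actions.append("reduce")
        elif stack[-5:] == ["E", "+", "(", "E", ")"]:
            # Reduce by E -> E + ( E ): the five topmost symbols are a handle.
            stack[-5:] = ["E"]
            actions.append("reduce")
        elif pos < len(tokens):
            # Nothing to reduce: move the next input token onto the stack.
            stack.append(tokens[pos])
            pos += 1
            actions.append("shift")
        elif stack == ["E"]:
            actions.append("accept")
            return actions
        else:
            raise SyntaxError(f"cannot parse, stack = {stack}")


actions = shift_reduce("1 + ( 2 ) + ( 3 )".split())
print(actions)
```

The printed list has one entry per row of the table: fourteen shift/reduce steps followed by accept.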

How will the parser know what to do?
- A state will keep the info gathered so far
- A stack will maintain formerly reduced handles and partially reduced handles
- A table will tell it “what to do” based on
  - Current state,
  - Symbol on top of stack, and
  - k next tokens (k ≥ 0)

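Model of an LR parser
[Figure: the LR parsing program holds the current state, reads the input (a token string ending in $), maintains a stack, consults the parser table, and produces the output.]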

States and LR(0) items
Grammar: E → E * B | E + B | B; B → 0 | 1
- The state will “remember” the potential derivation rules given the part that was already identified
- For example, if we have already identified E then the state will remember the two alternatives: (1) E → E * B, (2) E → E + B
- Actually, we will also remember where we are in each of them: (1) E → E ● * B, (2) E → E ● + B
- A derivation rule with a location marker is called an LR(0) item
- The state is actually a set of LR(0) items
  - For example: q13 = { E → E ● * B, E → E ● + B }

Intuition
- We gather the input token by token until we find a right-hand side of a rule, and then we replace it with the nonterminal on the left side
  - Going over a token and remembering it in the stack is a shift
  - Each shift moves to a state that remembers what we’ve seen so far
- A reduce replaces a string in the stack with the nonterminal that derives it

Why do we need the stack?
(The grammar E → E + (E), E → i and the trace of 1 + (2) + (3) are the same as on the slide “Using shift and reduce to parse”.)
- Suppose so far we have discovered E → 1 and gathered information on “E +”
- In the given grammar this can only mean E → E + ● (E)
- Suppose state q represents this possibility
- Now the next token is (, and we need to ignore q for a minute and work on E → 2 to obtain E + (E)
- Therefore, we push q to the stack, and after identifying E, we pop it to continue

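LR parser stack
[Figure: the model of an LR parser, with the stack now holding alternating state and symbol entries (the figure shows states 0 and 5 and symbols id and T) and the table split into an action part and a goto part.]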

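LR parsing table
- One row per state
- Columns for terminals hold the shift/reduce actions (the action part)
- Columns for non-terminals hold the goto part
- Entries:
  - sn: shift state n
  - rk: reduce by rule k
  - gm: goto state m
  - acc: accept
  - empty: error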

LR(0) parser table example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )

        action                        goto
STATE   id    +     (     )     $     E     T
0       s5          s7                g1    g6
1             s3                s2
2       acc
3       s5          s7                      g4
4       r3
5       r4
6       r2
7       s5          s7                g8    g6
8             s3          s9
9       r5

- A reduce state (4, 5, 6, 9) always holds rk in the entire row
- Every other state always holds an entire row of shifts and gotos (possibly accept)

LR parser moves

Shift move
[Figure: the LR parsing program with the current state q on top of the stack and t as the next input token.]
- If action[q, t] = sn then push t, push n
- q is the current state; n is the next state

Result of shift
[Figure: the stack now ends with q, t, n, with n on top.]
- If action[q, t] = sn then push t, push n

Reduce move
[Figure: the stack holds … q … qn with qn on top; the next input token is t; the table selects rule k.]
- If action[qn, t] = rk
- Production: (k) A → σ1 … σn
- Top of stack looks like q σ1 q1 … σn qn (bottom to top)
  1. Pop qn σn … q1 σ1 (2*n stack entries), which uncovers q
  2. If goto[q, A] = q’ then push A, push q’

Result of reduce move
[Figure: the stack now ends with q, A, q’, with q’ on top; the input, starting at t, is unchanged.]
- The 2*n entries for σ1 … σn were popped, then A and q’ = goto[q, A] were pushed
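As a small illustration of the reduce move, the sketch below applies rule (4) T → id to the stack 0 id 5, using the goto entries for state 0 from the LR(0) parser table example. The function name `reduce_move` and the dictionary encoding of the goto part are assumptions made for this example.

```python
def reduce_move(stack, lhs, rhs_len, goto):
    """Reduce by A -> s1 ... sn on a stack shaped like q s1 q1 ... sn qn."""
    if rhs_len:
        # Pop 2*n entries: one symbol and one state per right-hand-side symbol.
        del stack[-2 * rhs_len:]
    q = stack[-1]                      # state uncovered by the pops
    stack += [lhs, goto[(q, lhs)]]     # push A, then q' = goto[q, A]
    return stack


# goto part of state 0 in the example table: g1 on E, g6 on T
stack = reduce_move([0, "id", 5], "T", 1, {(0, "E"): 1, (0, "T"): 6})
print(stack)  # [0, 'T', 6]
```

The guard on `rhs_len` only matters for empty right-hand sides, where nothing should be popped.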

Accept move
[Figure: state q on top of the stack, next input token t.]
- If action[q, t] = accept, parsing is completed

Error move
[Figure: state q on top of the stack, next input token t.]
- If action[q, t] = error, parsing has discovered a syntactic error

Example of shift-reduce parser run

Parsing id+id$
Grammar and parsing table: as in the LR(0) parser table example.
Initialize the stack with state 0.

Stack               Input        Action
0                   id + id $    s5
0 id 5              + id $       r4   (pop id 5, then push T 6)
0 T 6               + id $       r2
0 E 1               + id $       s3
0 E 1 + 3           id $         s5
0 E 1 + 3 id 5      $            r4
0 E 1 + 3 T 4       $            r3
0 E 1               $            s2
0 E 1 $ 2                        acc
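The run above can be replayed with a small table-driven loop. This is a sketch under stated assumptions: the table entries are copied from the LR(0) parser table example, while the names `SHIFT`, `WHOLE_ROW`, `GOTO`, `RULES` and `parse`, and the split of the action part into two dictionaries, are choices made only for this illustration.

```python
# Shift entries of the action part: (state, token) -> "sn"
SHIFT = {
    (0, "id"): "s5", (0, "("): "s7",
    (1, "+"): "s3", (1, "$"): "s2",
    (3, "id"): "s5", (3, "("): "s7",
    (7, "id"): "s5", (7, "("): "s7",
    (8, "+"): "s3", (8, ")"): "s9",
}
# LR(0) rows that hold one action regardless of the next token
WHOLE_ROW = {2: "acc", 4: "r3", 5: "r4", 6: "r2", 9: "r5"}
GOTO = {(0, "E"): 1, (0, "T"): 6, (3, "T"): 4, (7, "E"): 8, (7, "T"): 6}
# rule number -> (left-hand side, length of right-hand side)
RULES = {2: ("E", 1), 3: ("E", 3), 4: ("T", 1), 5: ("T", 3)}


def parse(tokens):
    """Run the LR driver and return the sequence of actions taken."""
    stack, pos, trace = [0], 0, []
    while True:
        state = stack[-1]
        tok = tokens[pos] if pos < len(tokens) else None
        act = WHOLE_ROW.get(state) or SHIFT.get((state, tok))
        if act is None:                      # empty table cell: error move
            raise SyntaxError(f"unexpected {tok!r} in state {state}")
        trace.append(act)
        if act == "acc":                     # accept move
            return trace
        if act[0] == "s":                    # shift move: push t, push n
            stack += [tok, int(act[1:])]
            pos += 1
        else:                                # reduce move by rule k
            lhs, n = RULES[int(act[1:])]
            del stack[-2 * n:]               # pop 2*n entries
            stack += [lhs, GOTO[(stack[-1], lhs)]]


trace = parse(["id", "+", "id", "$"])
print(trace)
```

The returned list matches the Action column of the run: s5, r4, r2, s3, s5, r4, r3, s2, acc.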

Constructing an LR(0) parsing table

Overall process
1. Construct a (determinized) transition diagram from LR(0) items
2. If there are conflicts, stop: the grammar is not LR(0)
3. Otherwise, fill table entries from the diagram

LR(0) item
N → α ● β
- α: already matched (from the input)
- β: to be matched
- Hypothesis about αβ being a possible handle: so far we’ve matched α, expecting to see β

LR(0) items
- N → α ● β is a shift item
- N → αβ ● is a reduce item

LR(0) items enumeration example
All items can be obtained by placing a dot at every position for every production.

Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )

LR(0) items:
1: S → ● E $
2: S → E ● $
3: S → E $ ●
4: E → ● T
5: E → T ●
6: E → ● E + T
7: E → E ● + T
8: E → E + ● T
9: E → E + T ●
10: T → ● id
11: T → id ●
12: T → ● ( E )
13: T → ( ● E )
14: T → ( E ● )
15: T → ( E ) ●
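The enumeration is mechanical, as the following sketch shows. The representation of an item as a `(lhs, rhs, dot)` tuple and the names `lr0_items` and `show` are assumptions made for illustration; the dot is printed as `.` instead of ●.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]


def lr0_items(grammar):
    """Yield (lhs, rhs, dot) for every dot position of every production."""
    for lhs, rhs in grammar:
        for dot in range(len(rhs) + 1):
            yield lhs, rhs, dot


def show(item):
    """Render an item with '.' marking the dot position."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


items = list(lr0_items(GRAMMAR))
for number, item in enumerate(items, start=1):
    print(f"{number}: {show(item)}")
```

A production with n right-hand-side symbols contributes n + 1 items, which gives 3 + 2 + 4 + 2 + 4 = 15 items for this grammar, in the same order as the list above.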

Operations for transition diagram construction
- Initial = { S’ → ● S $ }
- For an item set I, Closure(I) is the least solution of:
  Closure(I) = I ∪ { X → ● µ | X → µ is in the grammar and N → α ● Xβ is in Closure(I) }
- Goto(I, σ) = { N → ασ ● β | N → α ● σβ is in I }
  - σ is either a terminal or a nonterminal
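A possible rendering of the two operations in Python is sketched below, assuming items are `(lhs, rhs, dot)` tuples and the grammar is the five-rule expression grammar used throughout the lecture. The worklist formulation of `closure` is one common way to compute the least solution; it is not taken from the slides.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add X -> . mu for every nonterminal X that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, prod in GRAMMAR:
                new = (lhs, prod, 0)
                if lhs == rhs[dot] and new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item where it follows the dot."""
    return frozenset(
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    )


q0 = closure({("S", ("E", "$"), 0)})
after_E = goto(q0, "E")
print(len(q0), sorted(after_E))
```

Here `q0` contains the five items with the dot at the start, and `after_E` contains S → E ● $ and E → E ● + T.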

Initial example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = { S → ● E $ }

Closure example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = { S → ● E $ }
Closure({ S → ● E $ }) =
  S → ● E $
  E → ● T
  E → ● E + T
  T → ● id
  T → ● ( E )

Goto example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = { S → ● E $ }
Closure({ S → ● E $ }) =
  S → ● E $
  E → ● T
  E → ● E + T
  T → ● id
  T → ● ( E )
Goto({ S → ● E $, E → ● E + T, T → ● id }, E) = { S → E ● $, E → E ● + T }

Constructing the transition diagram
1. Start with state 0 containing the item set Closure({ S → ● E $ })
2. Repeat until no new states are discovered
   - For every state p containing item set Ip, and symbol N, compute state q containing item set Iq = Closure(Goto(Ip, N))
Why does it terminate? (A grammar has finitely many LR(0) items, hence finitely many distinct item sets, so only finitely many states can be discovered.)
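The two construction steps can be sketched as follows, under the assumption that items are `(lhs, rhs, dot)` tuples. The helper names are hypothetical, and the state numbers produced here depend on the order in which symbols are tried, so they need not match the q0 to q9 numbering used on the slides.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = sorted({s for _, rhs in GRAMMAR for s in rhs})


def closure(items):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs, prod in GRAMMAR:
                if lhs == rhs[dot] and (lhs, prod, 0) not in result:
                    result.add((lhs, prod, 0))
                    work.append((lhs, prod, 0))
    return frozenset(result)


def goto(items, symbol):
    return frozenset((l, r, d + 1) for l, r, d in items
                     if d < len(r) and r[d] == symbol)


def build_automaton(start_item):
    """Return (states, transitions) of the LR(0) transition diagram."""
    states = [closure({start_item})]          # step 1: state 0
    transitions = {}
    for p, item_set in enumerate(states):     # the list grows while we iterate
        for sym in SYMBOLS:
            target = closure(goto(item_set, sym))
            if not target:
                continue                      # no item has the dot before sym
            if target not in states:          # step 2: a new state was discovered
                states.append(target)
            transitions[(p, sym)] = states.index(target)
    return states, transitions


states, transitions = build_automaton(("S", ("E", "$"), 0))
print(len(states), len(transitions))
```

For this grammar the loop discovers ten states, the same number as in the lecture's automaton, and then stops because every further Closure(Goto(...)) is an item set that has already been seen.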

LR(0) automaton example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )

States:
q0: S → ● E $, E → ● T, E → ● E + T, T → ● id, T → ● ( E )
q1: S → E ● $, E → E ● + T
q2: S → E $ ●
q3: E → E + ● T, T → ● id, T → ● ( E )
q4: E → E + T ●
q5: T → id ●
q6: E → T ●
q7: T → ( ● E ), E → ● T, E → ● E + T, T → ● id, T → ● ( E )
q8: T → ( E ● ), E → E ● + T
q9: T → ( E ) ●

Transitions:
q0: E to q1, T to q6, id to q5, ( to q7
q1: + to q3, $ to q2
q3: T to q4, id to q5, ( to q7
q7: E to q8, T to q6, id to q5, ( to q7
q8: + to q3, ) to q9

LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initialize:
q0: S → ● E $

LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Apply Closure:
q0: S → ● E $, E → ● T, E → ● E + T, T → ● id, T → ● ( E )

LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Compute the transitions out of q0:
q0: S → ● E $, E → ● T, E → ● E + T, T → ● id, T → ● ( E )
q0 on T: q6 = { E → T ● }
q0 on (: { T → ( ● E ), E → ● T, E → ● E + T, T → ● id, T → ● ( E ) }
q0 on id: q5 = { T → id ● }
q0 on E: q1 = { S → E ● $, E → E ● + T }

LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
The completed automaton has the states q0 to q9 and the transitions shown on the slide “LR(0) automaton example”.
- A terminal transition corresponds to a shift action in the parse table
- A non-terminal transition corresponds to a goto action in the parse table
- A single reduce item corresponds to a reduce action

LR(0) conflicts

Conflicts
- We can construct a diagram for every grammar, but some may introduce conflicts
- shift-reduce conflict: an item set contains at least one shift item and one reduce item
- reduce-reduce conflict: an item set contains two reduce items
What about shift-shift conflicts?

Shift-reduce conflict example
Grammar: S → E $, E → T, E → E + T, T → id, T → ( E ), T → id[E]
q0: S → ● E $, E → ● T, E → ● E + T, T → ● id, T → ● ( E ), T → ● id[E]
q0 on id: q5 = { T → id ●, T → id ● [E] }, a shift/reduce conflict
(The transitions from q0 on T, ( and E are elided in the figure.)

Reduce-reduce conflict example
Grammar: S → E $, E → T, E → V, E → E + T, T → id, V → id, T → ( E )
q0: S → ● E $, E → ● T, E → ● V, E → ● E + T, T → ● id, V → ● id, T → ● ( E )
q0 on id: q5 = { T → id ●, V → id ● }, a reduce/reduce conflict
(The transitions from q0 on T, ( and E are elided in the figure.)
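The two conflict definitions can be checked mechanically on the q5 states of the examples above. In this sketch an item is a `(lhs, rhs, dot)` tuple and the function name `conflicts` is hypothetical; an item counts as a shift item whenever the dot is not at the end, following the lecture's definition.

```python
def conflicts(item_set):
    """Classify an LR(0) item set using the lecture's conflict definitions."""
    reduce_items = [it for it in item_set if it[2] == len(it[1])]
    shift_items = [it for it in item_set if it[2] < len(it[1])]
    found = []
    if shift_items and reduce_items:
        found.append("shift-reduce")
    if len(reduce_items) >= 2:
        found.append("reduce-reduce")
    return found


# q5 of the shift-reduce example: { T -> id . , T -> id . [ E ] }
q5_sr = {("T", ("id",), 1), ("T", ("id", "[", "E", "]"), 1)}
# q5 of the reduce-reduce example: { T -> id . , V -> id . }
q5_rr = {("T", ("id",), 1), ("V", ("id",), 1)}
# q6 of the original grammar: { E -> T . }
q6 = {("E", ("T",), 1)}

print(conflicts(q5_sr), conflicts(q5_rr), conflicts(q6))
```

A state without conflicts, such as q6 = { E → T ● }, yields an empty list.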

LR(0) conflicts
- Any grammar with an ε-rule cannot be LR(0) (under the conflict definition above, where every item with a symbol after the dot counts as a shift item)
- Inherent shift/reduce conflict
  - A → ε ● is a reduce item
  - P → α ● Aβ is a shift item
  - A → ε ● can always be predicted from P → α ● Aβ
- Similar to FIRST-FOLLOW conflicts in LL(1) parsing
  - Similar solution

LR parsing variants

LR variants
- LR(0): what we’ve seen so far
- SLR(1): removes infeasible reduce actions via FOLLOW set reasoning
- LR(1): LR(0) with one lookahead token in items
- LALR(1): LR(1) with merging of states with the same LR(0) component

SLR parsing

SLR parsing
- A handle should not be reduced to a non-terminal N if the lookahead is a token that cannot follow N
- A reduce item N → α ● is applicable only when the lookahead is in FOLLOW(N)
  - If b is not in FOLLOW(N), there is no terminating derivation S ⇒* βNb, and thus it is safe to remove the reduce item from the conflicted state
- Differs from LR(0) only in the ACTION table
  - Now a row in the parsing table may contain both shift actions and reduce actions, and we need to consult the current token to decide which one to take

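SLR action table
LR(0), no look-ahead (one action per state):

state   action
q0      shift
q1      shift
q2
q3      shift
q4      E → E+T
q5      T → id
q6      E → T
q7      shift
q8      shift
q9      T → (E)

vs. SLR, 1 token of look-ahead:
[Table: one row per state 0 to 9 and one column per lookahead token from the input: id, +, (, ), [, ], $. Legible row contents: 0 shift, 2 accept, 3 shift, 4 E → E+T, 5 T → id, 6 E → T, 7 shift, 9 T → (E).]

- The grammar is as before, with T → id[E] next to T → id
- In state 5 the LR(0) entry would be the conflicting “r5, s6”
- [ is not in FOLLOW(T), so with lookahead [ the reduce by T → id is not applicable and only the shift remains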
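The claim that [ is not in FOLLOW(T) can be checked by computing the FOLLOW sets of the extended grammar. The sketch below is a simplified fixed-point computation that assumes the grammar has no ε-rules, which holds here; the function names are hypothetical, and $ is treated as an ordinary terminal because it appears in the rule S → E $.

```python
GRAMMAR = [
    ("S", ["E", "$"]),
    ("E", ["T"]),
    ("E", ["E", "+", "T"]),
    ("T", ["id"]),
    ("T", ["(", "E", ")"]),
    ("T", ["id", "[", "E", "]"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets for a grammar without epsilon-rules."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    """FOLLOW sets for a grammar without epsilon-rules."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):          # something follows sym in this rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                         # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = follow_sets()
print(sorted(follow["T"]))
```

FOLLOW(T) comes out as { $, ), +, ] }, so an SLR table keeps the reduce by T → id only in those four columns of state 5 and leaves the [ column to the shift.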

Next lecture: SLR/LR(1)/LALR(1)/Parser generation