Finishing Tool Construction


Before We Start
Note that previously, we constructed the readahead FSM with lookahead that switches to initial readback states, and the readback FSM with lookback that switches to a reduce state.
What's left?
  Semantic transitions
  Finalizing readahead and reduce states
  Detecting conflicts
  Converting to tables

Constructing Semantic Action States

Recall: There Are 2 Kinds of Semantic Action Transitions
Tree building: E -> 1 --E--> 2 --'+'--> 3 --T--> 4 --#buildTree ['+']--> 5
Non-tree building: G -> 1 --parser--> 2 --#type []--> 3 --E--> 4
The labels on the arrows are the transition names.

Each Will Have Corresponding Transitions in the Readahead FSM
Tree building: Ra1 --E--> Ra2 --'+'--> Ra3 --T--> Ra4 --#buildTree ['+']--> Ra5
Non-tree building: Ra6 --parser--> Ra7 --#type []--> Ra8 --E--> Ra9
But the final tables have semantic tables corresponding to semantic states (NOT TRANSITIONS).

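What We Need to Do
Rip out the semantic transition and plug in a semantic state in its place.
Tree building: Ra1 --E--> Ra2 --'+'--> Ra3 --T--> Ra4 --#buildTree ['+']--> Ra5. The transition becomes semantic action state 10 (#buildTree ['+']): ??? --> 10 --> ???
Non-tree building: Ra1 --parser--> Ra2 --#type []--> Ra3 --E--> Ra4. The transition becomes semantic action state 20 (#type []): ??? --> 20 --> ???
The new state needs some lookahead leading into it, and it needs to go somewhere.
What's the difference (if any)?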

Recall
    buildTree: rootNode
        "Pick up the children from the tree stack between left and right inclusive (provided they're not nil) and build a tree with the given label. Store it in instance variable newTree so a reduce table can use it."
        | children |
        children := (left to: right)
            collect: [:index | treeStack at: index]
            when: [:index | (treeStack at: index) notNil].
        newTree := Tree new label: rootNode; children: children.
Who sets up left?

Recall buildTree: rootNode
The readback sets up left, so buildTree: can't run until the readback is completely done.
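For readers who do not use Smalltalk, here is a rough Python rendering of buildTree:. It is only a sketch: the `Parser` and `Tree` classes are hypothetical stand-ins that hold just the fields the method touches, and indices are 0-based here whereas Smalltalk collections are 1-based.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Tree:
    label: str
    children: List[Any] = field(default_factory=list)


class Parser:
    """Hypothetical stand-in holding only the state that buildTree: uses."""

    def __init__(self, tree_stack: List[Optional[Any]]):
        self.tree_stack = tree_stack
        self.left = 0    # set up by the readback, per the slide
        self.right = 0
        self.new_tree: Optional[Tree] = None

    def build_tree(self, root_node: str) -> None:
        # Children are the non-nil entries between left and right inclusive.
        children = [
            self.tree_stack[index]
            for index in range(self.left, self.right + 1)
            if self.tree_stack[index] is not None
        ]
        # Stored so that a reduce table can pick it up afterwards.
        self.new_tree = Tree(root_node, children)


parser = Parser(["E-tree", None, "T-tree"])
parser.left, parser.right = 0, 2   # normally the readback's job
parser.build_tree("+")
print(parser.new_tree)
```

The demo sets left and right by hand only because there is no readback in the sketch; in the real parser the method would see stale values if it ran any earlier.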

Tree Building Semantic Actions Must Be Pushed to the End of Readback
Readahead: Ra1 --E--> Ra2 --'+'--> Ra3 --T--> Ra4 --#buildTree ['+']--> Ra5
Readback: Rb --T4--> Rb --'+'3--> Rb --E2--> Rb --lookback--> Reduce to E
Semantic action state 10 (#buildTree ['+']) goes between the lookback and the reduce state.
I.e., push the sem state to the end of readback; i.e., in front of the reduce state:
1. Duplicate the lookahead transitions (the follow of its left part).
2. Replace ALL reachable lookbacks' gotos by the semantic action state.
3. Make the semantic state goto the reduce table.
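The three steps can be sketched in Python as a rewrite over a flat list of transitions. Everything here is an assumption made for illustration: the `Transition` record, the state names, the lookback label `|-1`, and the follow set `{"+", "-"}` are hypothetical, and the real tool may organize its states quite differently.

```python
from collections import deque
from typing import List, NamedTuple, Set


class Transition(NamedTuple):
    src: str
    kind: str    # "read", "look", "lookback", "semantic", or "goto"
    label: str
    goto: str


def push_tree_action(transitions: List[Transition], ra_state: str,
                     follow: Set[str], rb_start: str,
                     reduce_state: str, sem_state: str) -> List[Transition]:
    # The semantic transition to rip out of the readahead state.
    action = next(t for t in transitions
                  if t.src == ra_state and t.kind == "semantic")

    # Collect every state reachable from the start of the readback.
    reachable, queue = {rb_start}, deque([rb_start])
    while queue:
        state = queue.popleft()
        for t in transitions:
            if t.src == state and t.goto not in reachable:
                reachable.add(t.goto)
                queue.append(t.goto)

    result = []
    for t in transitions:
        if t is action:
            # Step 1: look transitions on the follow of the left part.
            result.extend(Transition(ra_state, "look", a, rb_start)
                          for a in sorted(follow))
        elif (t.kind == "lookback" and t.src in reachable
              and t.goto == reduce_state):
            # Step 2: reachable lookbacks now go to the semantic state.
            result.append(t._replace(goto=sem_state))
        else:
            result.append(t)
    # Step 3: the semantic state goes to the reduce state.
    result.append(Transition(sem_state, "goto", action.label, reduce_state))
    return result


before = [
    Transition("Ra4", "semantic", "#buildTree ['+']", "Ra5"),
    Transition("Rb1", "read", "T4", "Rb2"),
    Transition("Rb2", "read", "'+'3", "Rb3"),
    Transition("Rb3", "read", "E2", "Rb4"),
    Transition("Rb4", "lookback", "|-1", "RedE"),
]
after = push_tree_action(before, "Ra4", {"+", "-"}, "Rb1", "RedE", "Sem10")
for t in after:
    print(t)
```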

Non-Tree Building Semantic Actions Must Run WHERE THEY ARE
Before: Ra1 --parser--> Ra2 --#type []--> Ra3 --E--> Ra4
After: Ra1 --parser--> Ra2 --Follow(Ra2)--> Semantic Action #type [] (state 10) --> Ra3 --E--> Ra4
1. FOLLOW(Ra2) is computed from its e-successors. There is an extra slide in the Follow set notes with more details.

Finalizing Readahead and Reduce States

Recall
Readahead FSMs contain both terminal and nonterminal transitions.
Readahead tables contain only terminal transitions. Where do the nonterminal transitions go?
Reduce tables contain the nonterminal transitions.

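This Is the Readahead FSM for Grammar
E -> E + T | E - T | T .
T -> T * P | T / P | P .
P -> '(' E ')' | i .
State 1: on '|-' to 2
State 2: on E to 3; on T to 7; on P to 10; on '(' to 11; on i to 14
State 3: on {EndOfFile} to 4; on +, - to 5
State 4: final, @E'
State 5: on T to 6; on P to 10; on '(' to 11; on i to 14
State 6: final, @E; on *, / to 8
State 7: final, @E; on *, / to 8
State 8: on P to 9; on '(' to 11; on i to 14
State 9: final, @T
State 10: final, @T
State 11: on E to 12; on T to 7; on P to 10; on '(' to 11; on i to 14
State 12: on ')' to 13; on +, - to 5
State 13: final, @P
State 14: final, @P
Note: All nonterminal transitions have attributes (whether or not they are shown).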

Finalizing E
The E transitions of the readahead FSM move into reduce state Red E, each keeping its attributes: from 2 to 3 and from 11 to 12.

Finalizing T
The T transitions of the readahead FSM move into reduce state Red T, each keeping its attributes: from 5 to 6, from 2 to 7, and from 11 to 7.

Finalizing P
The P transitions of the readahead FSM move into reduce state Red P, each keeping its attributes: from 5 to 10, from 2 to 10, from 8 to 9, and from 11 to 10.

Finalizing the Readahead FSM
No more nonterminal transitions. The readahead states keep only their terminal transitions ('|-', {EndOfFile}, +, -, *, /, '(', ')', i); the nonterminal transitions now live in the reduce states:
Red E: 2 to 3; 11 to 12
Red T: 5 to 6; 2 to 7; 11 to 7
Red P: 5 to 10; 2 to 10; 8 to 9; 11 to 10
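A minimal Python sketch of the finalizing step, under stated assumptions: the transition list is our reading of the readahead FSM for the expression grammar E -> E + T | E - T | T, T -> T * P | T / P | P, P -> '(' E ')' | i; attributes are left out; and `finalize`, `RA_TRANSITIONS`, and `NONTERMINALS` are hypothetical names rather than anything from the course tool.

```python
from collections import defaultdict

NONTERMINALS = {"E", "T", "P"}

# (from state, symbol, to state); attributes omitted for brevity.
RA_TRANSITIONS = [
    (1, "|-", 2),
    (2, "E", 3), (2, "T", 7), (2, "P", 10), (2, "(", 11), (2, "i", 14),
    (3, "{EndOfFile}", 4), (3, "+", 5), (3, "-", 5),
    (5, "T", 6), (5, "P", 10), (5, "(", 11), (5, "i", 14),
    (6, "*", 8), (6, "/", 8),
    (7, "*", 8), (7, "/", 8),
    (8, "P", 9), (8, "(", 11), (8, "i", 14),
    (11, "E", 12), (11, "T", 7), (11, "P", 10), (11, "(", 11), (11, "i", 14),
    (12, ")", 13), (12, "+", 5), (12, "-", 5),
]


def finalize(transitions, nonterminals):
    """Split transitions into readahead (terminal) ones and reduce states."""
    readahead = []
    reduce_states = defaultdict(dict)
    for src, symbol, dst in transitions:
        if symbol in nonterminals:
            # Keyed by the state on top of the stack when the reduce runs.
            reduce_states[symbol][src] = dst
        else:
            readahead.append((src, symbol, dst))
    return readahead, dict(reduce_states)


readahead, reduce_states = finalize(RA_TRANSITIONS, NONTERMINALS)
for nonterminal in sorted(reduce_states):
    print("Red", nonterminal, reduce_states[nonterminal])
```

Keying each reduce state by the source readahead state mirrors the "table number at top of table number stack" entries that reduce tables use.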

Recall: What Do Tables Look Like
(ReadaheadTable 1 (Integer 'RSN' 27) (Identifier 'RSN' 4) ('(' 'RS' 5))
(ReadbackTable 21 ((Term 12) 'RSN' 40) ((Term 3) 'RSN' 40))
(ShiftbackTable 25 1 35)
(ReduceTable 35 Expression (1 'RSN' 2)(5 'L' 10)(8 'RN' 13)(9 'RSN' 14)(15 'RSN' 17))
(SemanticTable 39 buildTree: '+' 35)
(AcceptTable 43))
The first number in each table is the table number.
ReadaheadTable: triples symbol × attributes × goto. Reads come from building the RA states; looks come from the lookahead follow sets (NO NONTERMINAL TRANSITIONS).
ReadbackTable: triples pair × attributes × goto, where pair is symbol × table number. Reads come from building the RB states; looks come from the lookbacks.
ShiftbackTable: shift amount and goto.
ReduceTable: the nonterminal to reduce to, then triples "table number at top of table number stack" × attributes × goto. These transitions were originally in the RA states.
SemanticTable: "action followed by 0 or more parameters" × goto.
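To make the table layout concrete, here is a small Python sketch that reads parenthesized table text of this shape into nested lists. It is an illustration only: `read_tables` is a hypothetical helper, the tool's real table reader is not shown in the slides, and the sketch assumes balanced parentheses and single-quoted symbols.

```python
import re

# A token is a parenthesis, a single-quoted symbol, or a bare word.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")


def read_tables(text):
    """Parse parenthesized table text into nested Python lists."""
    stack = [[]]
    for token in TOKEN.findall(text):
        if token == "(":
            stack.append([])
        elif token == ")":
            finished = stack.pop()
            stack[-1].append(finished)
        elif token.startswith("'"):
            stack[-1].append(token[1:-1])   # drop the quotes
        elif token.isdigit():
            stack[-1].append(int(token))    # table numbers and gotos
        else:
            stack[-1].append(token)
    return stack[0]


TEXT = """
(ReadaheadTable 1 (Integer 'RSN' 27) (Identifier 'RSN' 4) ('(' 'RS' 5))
(ShiftbackTable 25 1 35)
(SemanticTable 39 buildTree: '+' 35)
(AcceptTable 43)
"""

tables = read_tables(TEXT)
for table in tables:
    kind, number, *rest = table
    print(kind, number, rest)
```

The quoted-symbol alternative comes before the bare-word one in the pattern so that a quoted parenthesis such as '(' is read as a symbol rather than as nesting.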

Detecting Conflicts

Conflicts
A conflict is a situation where the tables do not work.
Example: Ra --a--> R1 (NOT YET at the right end of a handle) together with Ra --{a}--> R2 (at the right end of a handle).
This indicates that a readahead state (or table) believes it can be simultaneously at the right end of a handle and NOT at the right end.
Tables with conflicts cannot be used.

There Are 3 Kinds of Conflicts
A lookahead conflict: can't tell if we are at the right end.
A lookback conflict: can't tell if we are at the left end.
An attribute conflict: attributes conflict; e.g., simultaneously stack and noStack.
The only recourse is to give an error message and indicate that tables cannot be built.

Lookahead Conflicts
Lookahead conflicts (readahead state transitions that are equal when you ignore the read/look attributes) require error messages.
Look-look conflicts: Ra --{a}--> R1 (this will be lookahead for some A) together with Ra --{a}--> R2 (this will be lookahead for some B). Not deterministic.
Read-look conflicts: Ra --a--> R1 (this means a production is not yet finished) together with Ra --{a}--> R2 (this means a production is finished for some A). Not deterministic.

Lookback Conflicts
Lookback conflicts (readback state transitions that are equal when you ignore the read/look attributes) require error messages.
Read-look conflicts: Rb --Mp--> R1 together with Rb --{Mp}--> R2. One transition means we reached the left end of the handle; the other means we DID NOT reach the left end of the handle. Not deterministic.

Attribute Conflicts (in Readahead/Readback States)
Attribute conflicts (conflicts in attributes other than read/look; i.e., transitions that are equal when you ignore the attributes) require error messages.
Node conflicts: Ra --a with node--> R1 together with Ra --a with noNode--> R2. Not deterministic.
Stack conflicts: Ra --a with stack--> R1 together with Ra --a with noStack--> R2. Not deterministic.
Keep conflicts: Ra --a with keep--> R1 together with Ra --a with noKeep--> R2. Not deterministic.
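The three kinds of conflict can be detected with one pass over the transitions of a state. The sketch below is a simplified illustration: the `Edge` record, the attribute spellings, and the classification order are assumptions, and a transition counts as a look when its attribute set contains "look".

```python
from itertools import combinations
from typing import FrozenSet, List, NamedTuple, Tuple


class Edge(NamedTuple):
    symbol: str
    attributes: FrozenSet[str]   # e.g. {"read", "stack", "node"}
    goto: str


def find_conflicts(state: str, edges: List[Edge]) -> List[Tuple[str, str, str]]:
    """Return (state, kind, symbol) for every conflicting pair of edges."""
    conflicts = []
    for a, b in combinations(edges, 2):
        if a.symbol != b.symbol or a == b:
            continue   # different symbols, or an exact duplicate
        a_look = "look" in a.attributes
        b_look = "look" in b.attributes
        if a_look != b_look:
            kind = "read-look"
        elif a.attributes != b.attributes:
            kind = "attribute"   # node, stack, or keep disagreement
        elif a_look:
            kind = "look-look"
        else:
            kind = "read-read"   # not expected from a deterministic FSM
        conflicts.append((state, kind, a.symbol))
    return conflicts


def edge(symbol, attributes, goto):
    return Edge(symbol, frozenset(attributes), goto)


print(find_conflicts("Ra", [edge("a", {"read", "stack"}, "R1"),
                            edge("a", {"look", "stack"}, "R2")]))
print(find_conflicts("Ra", [edge("a", {"look"}, "R1"),
                            edge("a", {"look"}, "R2")]))
print(find_conflicts("Ra", [edge("a", {"read", "node"}, "R1"),
                            edge("a", {"read", "noNode"}, "R2")]))
```

Running the same function over readback states, with pairs such as Mp as the symbols, would report lookback conflicts in the same way.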

Converting to Tables

Mostly a Printing Task
Just output the states but call them tables.
Some things can be optimized away.
It must be possible to renumber your states with globally consecutive numbers (with all tables of the same type being consecutive).
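Renumbering can be sketched in a few lines of Python. The dictionary layout, the state names, and the `TYPE_ORDER` list are assumptions made for illustration; the slides only require that numbers be globally consecutive with tables of the same type grouped together.

```python
# Assumed ordering of table types; any fixed order would satisfy the rule.
TYPE_ORDER = ["Readahead", "Readback", "Shiftback",
              "Reduce", "Semantic", "Accept"]


def renumber(states):
    """Give states consecutive numbers, grouped by type, and fix the gotos."""
    # sorted() is stable, so states of one type keep their relative order.
    ordered = sorted(states, key=lambda s: TYPE_ORDER.index(s["type"]))
    number = {s["name"]: n for n, s in enumerate(ordered, start=1)}
    return [
        {
            "number": number[s["name"]],
            "type": s["type"],
            "gotos": [number[g] for g in s["gotos"]],
        }
        for s in ordered
    ]


states = [
    {"name": "Red E", "type": "Reduce", "gotos": ["Ra3"]},
    {"name": "Ra2", "type": "Readahead", "gotos": ["Ra3", "Red E"]},
    {"name": "Ra3", "type": "Readahead", "gotos": []},
    {"name": "Acc", "type": "Accept", "gotos": []},
]
tables = renumber(states)
for table in tables:
    print(table)
```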

What Can Be Optimized
Convert pair-reading RbStates to state-reading RbStates. (Current parsers can't handle them.)
Merge RbStates if no conflict results when the transitions are combined.
Eliminate RaStates where all transitions are lookaheads to the same successor. Transitions that go to them must be replaced by their successors.
Eliminate RbStates where all lookbacks are to the same successor. Transitions that go to them must be replaced by their successors.
Replace RbStates by shift states if you can:
  If all transitions go to the same state, use shift 1, goto state.
  Shift m followed by shift n becomes shift m+n.
Unreachable states need to be discarded.
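The two shift rules can be sketched as follows. This is an illustration under assumptions: readback transitions are modeled as (pair, goto) tuples, shift states as a dictionary from name to (amount, goto), and the names `to_shift` and `merge_shifts` are hypothetical.

```python
def to_shift(rb_transitions):
    """If every transition goes to the same state, return shift 1 to it."""
    gotos = {goto for _, goto in rb_transitions}
    if len(gotos) == 1:
        return (1, gotos.pop())
    return None   # the state must stay a readback state


def merge_shifts(shifts):
    """Collapse shift m followed by shift n into shift m+n."""
    merged = {}
    for name, (amount, goto) in shifts.items():
        seen = {name}   # guards against an accidental cycle
        while goto in shifts and goto not in seen:
            seen.add(goto)
            step, goto = shifts[goto]
            amount += step
        merged[name] = (amount, goto)
    return merged


print(to_shift([("T4", "Rb2"), ("P4", "Rb2")]))
print(to_shift([("T4", "Rb2"), ("P4", "Rb3")]))

shifts = {"S1": (1, "S2"), "S2": (2, "Red35")}
merged = merge_shifts(shifts)
print(merged)
```

After merging, a shift state that is no longer the target of any transition (S2 in the demo, if only S1 pointed to it) is one of the unreachable states that needs to be discarded.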

The Complete Table Building Algorithm (7 slides)

The Table Building Algorithm
Build the grammar G and augment it with an additional production G' -> |- G {EndOfFile}.
Convert the right parts to FSMs and compute Follow sets, denoted @A if A is a nonterminal.
Build the reduce states (one per nonterminal) and the sole accept state.

Remember
A readahead state is a collection of right part states.
The right part FSM for E has states 1 to 8, with transitions on E, '+', '-', and T. Each of these is a right part state.
A readahead state: {1, 3, 8}. It's a final readahead state since one right part state is final.

The Table Building Algorithm
Build the readahead states from the initial right part state IG' of goal item G'.
Initial RaState: IG' followed by the closure (*).
Successor RaStates of R: for each symbol M, the successor is Q = R M followed by the closure (*).
Transition: R --M--> Q
Q, the M-successor of R, was originally computed without the closure. The closure is added before successors of Q are computed.

Remember
A readback state is a collection of <right part state, readahead state> pairs.
Take readahead states Ra10 = {1, 3, 8} and Ra20 = {5, 3, 9}; each of their members is a right part state.
A readback state: {<1, Ra10>, <3, Ra10>, <3, Ra20>, <9, Ra20>}. It's a final readback state since one right part state is initial; i.e., 1 (indicating we're at the left end).

The Table Building Algorithm
Build the readback states from final readahead states, one RbFSM per nonterminal A.
Exception: If A is G', add @A transitions to the Accept state.
Build the initial RbState Rb: If FA is the set of READBACK items in Ra where the right part state is final and associated with nonterminal A (there can be more than one nonterminal), add lookahead transitions for @A to the initial RbState Rb.
Initial RbState: Rb = FA followed by the closure (*) over invisible transitions.
New transition: Ra --@A--> Rb
Build the successor RbStates of R: for each visible Mp, the successor is Q = R Mp followed by the closure (*) over invisible transitions.
Transition: R --Mp--> Q
Q, the Mp-successor of R, was originally computed without the closure. The closure is added before successors of Q are computed.

The Table Building Algorithm
Build the lookback transitions. They contain pairs like p10, where p is an initial right part state.
Let Rb be a final RbState associated with A; i.e., one for which I is non-empty, where I is the set of readback items in Rb with initial right part states.
Compute the lookback L for A BY LOOKING FOR AN ITEM WITH AN UP:
L = {Mp | I (up | invisible)* Mp is not empty}, where Mp is visible.
New transition: Rb --Mp--> Red A, for each Mp in L.
In words, there is lookback if there is an up, and you get the lookback by going up and left over invisibles as far as you can... AND THEN, you encounter a visible. THE VISIBLE IS THE LOOKBACK.

The Table Building Algorithm
Build semantic action states. There are 2 cases:
It's not tree building: RaState p --semantic action s--> RaState q. Compute the follow set of the state and replace the transition by RaState p --{@p}--> Semantic State s --> RaState q.
It is tree building (the transition was associated with a left part A): RaState p --semantic action '?'--> RaState q, where q is final with @A. Replace the transition by a set of look transitions for @A: RaState p --{@A}--> Target. Push the sem state for the action right until it is between the lookback and the reduce state Red A. There can be many such places TO PUSH TO.

The Table Building Algorithm
Report lookahead conflicts; i.e., transitions that are equal when you ignore the attributes.
Detect look-look conflicts (give error message): Ra --{a}--> R1 (this will be lookahead for some A) together with Ra --{a}--> R2 (this will be lookahead for some B). Not deterministic.
Detect read-look conflicts (give error message): Ra --a--> R1 (this means a production is not yet finished) together with Ra --{a}--> R2 (this means a production is finished for some A). Not deterministic.
Detect other attribute conflicts (give error message): Ra --a with node--> R1 together with Ra --a with noNode--> R2 (or stack versus noStack, or keep versus noKeep). Not deterministic.

The Table Building Algorithm
Report lookback conflicts; i.e., transitions that are equal when you ignore the attributes.
Detect read-look conflicts (give error message): Rb --Mp--> R1 together with Rb --{Mp}--> R2. Not deterministic.
This means there is a situation where we are simultaneously at the left end and not yet at the left end, and we don't know what to do.

The Table Building Algorithm
Finalize readahead and reduce states by moving all RaState nonterminal transitions into the reduce states.
For example, the 2 transitions Ra2 --E--> Ra3 and Ra11 --E--> Ra12 in the readahead states become the same 2 transitions in the reduce state Red E: attrib1, 2 to 3; attrib2, 11 to 12.
Eliminate the initial RAState. We don't need it to place "|-" in the parse stack. The successor is the new initial state.

The Table Building Algorithm
Optimize the states (optional; i.e., if you can):
  Convert pair-reading RbStates to state-reading RbStates.
  Merge RbStates if no conflict results when the transitions are combined.
  Eliminate RaStates where all transitions are lookaheads to the same successor. Transitions that go to them must be replaced by their successors.
  Eliminate RbStates where all lookbacks are to the same successor. Transitions that go to them must be replaced by their successors.
  Replace RbStates by shift states if you can.
Renumber the states in ascending order by type (all states of the same type being consecutive).
Output the states as tables.

The Summary
The steps below are spread, in order, over Assignment #1 through Assignment #4:
  Use and finish the scanner/parser
  Build finite state machines
  Build grammar and follow sets
  Build reduce and accept states
  Build readahead states with lookahead bridges
  Build semantic action states
  Build readback states with lookback bridges
  Report lookahead conflicts
  Report lookback conflicts
  Finalize reduce states
  Eliminate the initial RAState
  Optimize states
  Renumber the states and output

What About Scanner Tables?
Multiple goals.
No readback.
No reduce tables (the alternative is to loop back to the initial scanner readahead state).

Done