CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU.



CPSC46002 Predictive Parsing Summary
- First and Follow sets are used to construct predictive parsing tables
  - For non-terminal A and input t, use a production A → α where t ∈ First(α)
  - For non-terminal A and input t, if ε ∈ First(A) and t ∈ Follow(A), use a production A → α where ε ∈ First(α)
- Recursive descent without backtracking does not need the parse table explicitly

CPSC46003 Bottom-Up Parsing (1)
- Bottom-up parsing is more general than top-down parsing
  - And just as efficient
  - Builds on ideas in top-down parsing
- Bottom-up is the preferred method in practice

CPSC46004 Bottom-Up Parsing (2)
- Table-driven, using an explicit stack (non-recursive)
- The stack can be viewed as containing both terminals and nonterminals
- Basic operations:
  - Shift: move terminals from the input stream onto the stack until the right-hand side of an appropriate production has been identified on top of the stack
  - Reduce: replace the symbols on top of the stack that match the right-hand side of an appropriate production with the nonterminal on the left-hand side of that production

CPSC46005 An Introductory Example
- Bottom-up parsers don't need left-factored grammars
- Hence we can revert to the "natural" grammar for our example:
    E → T + E | T
    T → num * T | num | (E)
- Consider the string: num * num + num

CPSC46006 The Idea
Bottom-up parsing reduces a string to the start symbol by inverting productions:

    Input               Production used
    num * num + num
    num * T + num       T → num
    T + num             T → num * T
    T + T               T → num
    T + E               E → T
    E                   E → T + E
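The reductions listed on this slide can be replayed mechanically. The following is a minimal Python sketch; the helper name `reduce_once` and the explicit positions are assumptions made for illustration, since the slide does not say how the parser finds where to reduce.

```python
def reduce_once(form, lhs, rhs, start):
    """Replace form[start:start+len(rhs)], which must equal rhs, with lhs."""
    assert form[start:start + len(rhs)] == rhs, "rhs does not match here"
    return form[:start] + [lhs] + form[start + len(rhs):]


form = "num * num + num".split()

# (lhs, rhs, position) for each inverted production, in the order used above.
steps = [
    ("T", ["num"], 2),             # num * num + num -> num * T + num
    ("T", ["num", "*", "T"], 0),   # -> T + num
    ("T", ["num"], 2),             # -> T + T
    ("E", ["T"], 2),               # -> T + E
    ("E", ["T", "+", "E"], 0),     # -> E
]

trace = [" ".join(form)]
for lhs, rhs, pos in steps:
    form = reduce_once(form, lhs, rhs, pos)
    trace.append(" ".join(form))

for line in trace:
    print(line)
```

Each printed line is one row of the "Input" column, ending with the start symbol E.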

CPSC46007 Right-most Derivation
- In a right-most derivation, the rightmost nonterminal of a sentential form is replaced at each derivation step.
- Question: find the rightmost derivation of the string num * num + num

CPSC46008 Observation
- Read the productions found by the bottom-up parse in reverse (i.e., from bottom to top)
- This is a rightmost derivation!

    num * num + num
    num * T + num       T → num
    T + num             T → num * T
    T + T               T → num
    T + E               E → T
    E                   E → T + E

CPSC46009 Important Facts A bottom-up parser traces a rightmost derivation in reverse
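To check this fact on the running example, the sketch below applies the same productions in the opposite order, each time rewriting the rightmost nonterminal. The function name `rightmost_step` is hypothetical; the grammar is the one from the introductory example.

```python
GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["num", "*", "T"], ["num"], ["(", "E", ")"]],
}


def rightmost_step(form, lhs, rhs):
    """Rewrite the rightmost nonterminal of form, which must be lhs, using lhs -> rhs."""
    idx = max(i for i, sym in enumerate(form) if sym in GRAMMAR)
    assert form[idx] == lhs and rhs in GRAMMAR[lhs]
    return form[:idx] + rhs + form[idx + 1:]


form = ["E"]
derivation = [" ".join(form)]
# The productions of the bottom-up parse, read from last to first.
for lhs, rhs in [
    ("E", ["T", "+", "E"]),
    ("E", ["T"]),
    ("T", ["num"]),
    ("T", ["num", "*", "T"]),
    ("T", ["num"]),
]:
    form = rightmost_step(form, lhs, rhs)
    derivation.append(" ".join(form))

print(" => ".join(derivation))
```

The assertion inside `rightmost_step` fails if a step would rewrite anything other than the rightmost nonterminal, so a clean run confirms the derivation is rightmost.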

CPSC A Bottom-up Parse
    E
    T + E
    T + T
    T + num
    num * T + num
    num * num + num
[Figure: parse tree for num * num + num, built from the leaves up to the root E]

CPSC A Bottom-up Parse in Detail (1)
    num * num + num
[Figure: the input tokens as leaves; no tree nodes yet]

CPSC A Bottom-up Parse in Detail (2)
    num * T + num
[Figure: T added above the second num]

CPSC A Bottom-up Parse in Detail (3)
    T + num
[Figure: T added above num * T]

CPSC A Bottom-up Parse in Detail (4)
    T + T
[Figure: T added above the last num]

CPSC A Bottom-up Parse in Detail (5)
    T + E
[Figure: E added above the last T]

CPSC A Bottom-up Parse in Detail (6)
    E
[Figure: root E added above T + E, completing the parse tree]

CPSC Bottom-up Parsing
A trivial bottom-up parsing algorithm:

    Let I = input string
    repeat
        pick a non-empty substring β of I where X → β is a production
        if no such β, backtrack
        replace one β by X in I
    until I = "S" (the start symbol) or all possibilities are exhausted
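The trivial algorithm can be sketched directly as a recursive search with backtracking. This is only an illustration, not an efficient parser: the name `naive_parse` and the `seen` set (which avoids re-exploring a sentential form) are additions not found on the slide, and the search can still take exponential time.

```python
GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["num", "*", "T"], ["num"], ["(", "E", ")"]],
}


def naive_parse(form, grammar, start, seen=None):
    """Return True if form can be reduced to [start] by some sequence of reductions."""
    seen = set() if seen is None else seen
    if form == [start]:
        return True
    key = tuple(form)
    if key in seen:            # this sentential form already failed
        return False
    seen.add(key)
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == rhs:
                    reduced = form[:i] + [lhs] + form[i + n:]
                    if naive_parse(reduced, grammar, start, seen):
                        return True
    return False               # every choice failed: backtrack


print(naive_parse("num * num + num".split(), GRAMMAR, "E"))
print(naive_parse("num + * num".split(), GRAMMAR, "E"))
```

Termination here relies on this grammar having no empty right-hand sides and no cycles of unit productions, so sentential forms never grow and never repeat indefinitely.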

CPSC Observations
- Termination of the algorithm: when does it terminate, if at all?
- Running time of the algorithm
- If there is more than one choice for the substring to be replaced (reduced), which one should be chosen?

CPSC Where Do Reductions Happen
Recall: a bottom-up parser traces a rightmost derivation in reverse
- Let αβω be a rightmost sentential form
- Assume the next reduction is by X → β
- Then ω is a string of terminals
- Why? Because αXω → αβω is a step in a right-most derivation

CPSC Shift-Reduce Parsing Bottom-up parsing uses only two kinds of actions: Shift Reduce

CPSC Shift
- Shift: move # (marking the part of the input that has been processed) one place to the right
  - Shifts a terminal to the left string
    ABC#xyz ⇒ ABCx#yz

CPSC Reduce
- Apply an inverse production at the right end of the left string
  - If A → xy is a production, then
    Cbxy#ijk ⇒ CbA#ijk

CPSC The Example with Shift-Reduce Parsing

    Configuration             Action
    # num * num + num         shift
    num # * num + num         shift
    num * # num + num         shift
    num * num # + num         reduce T → num
    num * T # + num           reduce T → num * T
    T # + num                 shift
    T + # num                 shift
    T + num #                 reduce T → num
    T + T #                   reduce E → T
    T + E #                   reduce E → T + E
    E #
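The two actions are easy to express on a configuration made of a left string and the remaining input. The sketch below replays the action sequence of this example; the action script is supplied by hand, because deciding between shift and reduce is exactly the problem the later slides address. The function names are illustrative.

```python
def shift(left, right):
    """ABC # xyz  =>  ABCx # yz: move the next input token onto the left string."""
    return left + [right[0]], right[1:]


def reduce(left, lhs, rhs):
    """Replace rhs at the right end of the left string with lhs."""
    assert rhs and left[-len(rhs):] == rhs, "rhs is not at the right end"
    return left[:-len(rhs)] + [lhs]


def show(left, right):
    return " ".join(left + ["#"] + right)


left, right = [], "num * num + num".split()
script = [
    "shift", "shift", "shift",
    ("T", ["num"]), ("T", ["num", "*", "T"]),
    "shift", "shift",
    ("T", ["num"]), ("E", ["T"]), ("E", ["T", "+", "E"]),
]

trace = [show(left, right)]
for step in script:
    if step == "shift":
        left, right = shift(left, right)
    else:
        left = reduce(left, *step)
    trace.append(show(left, right))

for line in trace:
    print(line)
```

Because `left` only ever changes at its right end, a Python list used this way already behaves as the parser stack.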

CPSC A Shift-Reduce Parse in Detail (1)
    # num * num + num
[Figure: input tokens num * num + num with the position marker before the first token]

CPSC A Shift-Reduce Parse in Detail (2)
    num # * num + num

CPSC A Shift-Reduce Parse in Detail (3)
    num * # num + num

CPSC A Shift-Reduce Parse in Detail (4)
    num * num # + num

CPSC A Shift-Reduce Parse in Detail (5)
    num * T # + num
[Figure: T added above the second num]

CPSC A Shift-Reduce Parse in Detail (6)
    T # + num
[Figure: T added above num * T]

CPSC A Shift-Reduce Parse in Detail (7)
    T + # num

CPSC A Shift-Reduce Parse in Detail (8)
    T + num #

CPSC A Shift-Reduce Parse in Detail (9)
    T + T #
[Figure: T added above the last num]

CPSC A Shift-Reduce Parse in Detail (10)
    T + E #
[Figure: E added above the last T]

CPSC A Shift-Reduce Parse in Detail (11)
    E #
[Figure: root E added above T + E, completing the parse tree]

CPSC The Stack
- The left string can be implemented by a stack
  - The top of the stack is at the #
- Shift pushes a terminal on the stack
- Reduce pops 0 or more symbols off the stack (the production's right-hand side) and pushes a non-terminal on the stack (the production's left-hand side)

CPSC Key Issue (will be resolved by algorithms)
- How do we decide when to shift or reduce?
  - Consider the step num # * num + num
  - We could reduce by T → num, giving T # * num + num
  - A fatal mistake: there is then no way to reduce to the start symbol E

CPSC Conflicts
- Generic shift-reduce strategy:
  - If there is a handle on top of the stack, reduce
  - Otherwise, shift
- But what if there is a choice?
  - If it is legal to shift or reduce, there is a shift-reduce conflict
  - If it is legal to reduce by two different productions, there is a reduce-reduce conflict

CPSC Conflict Example
Consider the ambiguous grammar:
    E → E + E | E * E | (E) | num

CPSC One Shift-Reduce Parse

    Input                     Action
    # num * num + num         shift
    ...                       ...
    E * E # + num             reduce E → E * E
    E # + num                 shift
    E + # num                 shift
    E + num #                 reduce E → num
    E + E #                   reduce E → E + E
    E #

CPSC Another Shift-Reduce Parse

    Input                     Action
    # num * num + num         shift
    ...                       ...
    E * E # + num             shift
    E * E + # num             shift
    E * E + num #             reduce E → num
    E * E + E #               reduce E → E + E
    E * E #                   reduce E → E * E
    E #

CPSC Observations
- At the step E * E # + num we can either shift or reduce by E → E * E
- The choice determines the relative precedence of + and * (and, for repeated operators, their associativity)
- As noted previously, the grammar can be rewritten to enforce precedence
- Precedence declarations are an alternative
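The effect of that choice can be seen by running both policies on the same input. The sketch below is a deliberately tiny shift-reduce parser for the ambiguous grammar restricted to E → E + E | E * E | num (parentheses are left out to keep it short); the `prefer_reduce` flag is an illustrative stand-in for how a real parser would resolve the conflict. Expressions on the stack are tuples, operators are plain strings.

```python
def parse(tokens, prefer_reduce):
    """Shift-reduce parse; on a shift-reduce conflict follow prefer_reduce."""
    stack, rest = [], list(tokens)

    def binary_handle_on_top():
        return (len(stack) >= 3
                and isinstance(stack[-1], tuple)
                and stack[-2] in ("+", "*")
                and isinstance(stack[-3], tuple))

    while True:
        if stack and stack[-1] == "num":
            stack[-1] = ("num",)                     # reduce E -> num
        elif binary_handle_on_top() and (prefer_reduce or not rest):
            right, op, left = stack.pop(), stack.pop(), stack.pop()
            stack.append((op, left, right))          # reduce E -> E op E
        elif rest:
            stack.append(rest.pop(0))                # shift
        else:
            return stack


tokens = "num * num + num".split()
reduce_first = parse(tokens, prefer_reduce=True)
shift_first = parse(tokens, prefer_reduce=False)
print(reduce_first)
print(shift_first)
```

Reducing at E * E # + num groups the product first, while shifting groups the sum first, which matches the two parses shown on the preceding slides.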

CPSC Overview
- LR(k) parsing
  - L: scan input Left to right
  - R: produce Rightmost derivation
  - k tokens of lookahead
- LR(0): zero tokens of look-ahead
- SLR: Simple LR, like LR(0) but uses FOLLOW sets to build more "precise" parsing tables

CPSC Basic Terminologies
- Handle: a substring that matches the right side of a production, and whose replacement by that production's left side constitutes one step of the rightmost derivation (in reverse) of the string from the start nonterminal of the grammar

CPSC Model of Shift-Reduce Parsing
- Stack + input = current right-sentential form.
- Locate the handle during parsing: shift zero or more terminals (tokens) onto the stack until a handle β is on top of the stack.
- Replace the handle with the proper non-terminal (handle pruning): reduce β to A where A → β

CPSC Model of an LR Parser

CPSC Problem: when to shift, when to reduce?
- Recall the grammar:
    E → T + E | T
    T → num * T | num | (E)
- How do we know when to reduce and when to shift?

CPSC Model of Shift-Reduce Parsing
- Stack + input = current right-sentential form.
- Locate the handle during parsing: shift zero or more terminals onto the stack until a handle β is on top of the stack.
- Replace the handle with the proper non-terminal (handle pruning)

CPSC What we need to know to do LR parsing
- LR(0) states
  - Describe the states in which the parser can be
  - Note: LR(0) states are used by both LR(0) and SLR parsers
- Parsing tables
  - Transitions between LR(0) states
  - Actions to take at each transition: shift, reduce, accept, error
- How to construct LR(0) states
- How to construct parsing tables
- How to drive the parser

CPSC An LR(0) state = a set of LR(0) items
- An LR(0) item [X → α.β] says that
  - the parser is looking for an X
  - it has an α on top of the stack
  - it expects to find in the input a string derived from β
- Notes:
  - [X → α.aβ], with a a terminal, means that if a is on the input, it can be shifted. That is:
    - a is a correct token to see on the input, and
    - shifting a would not "over-shift" (the stack is still a viable prefix)
  - [X → α.] means that we could reduce by X → α

CPSC LR(0) states
    { S' → .E,  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { S' → E. }
    { E → T.,  E → T. + E }
    { T → num. * T,  T → num. }
    { T → (.E),  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { E → T + E. }
    { E → T + .E,  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { T → num * .T,  T → .(E),  T → .num * T,  T → .num }
    { T → num * T. }
    { T → (E.) }
    { T → (E). }
[Figure: DFA over these states, with transitions labelled E, T, (, num, *, and )]
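The item sets on this slide can be generated with the usual closure and goto operations. The sketch below is one way to do it in Python; the representation of an item as a `(lhs, rhs, dot)` tuple and the function names are choices made for this illustration, not something the slides prescribe.

```python
GRAMMAR = [
    ("S'", ("E",)),
    ("E", ("T",)),
    ("E", ("T", "+", "E")),
    ("T", ("(", "E", ")")),
    ("T", ("num", "*", "T")),
    ("T", ("num",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add X -> .gamma for every nonterminal X that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot] and (lhs2, rhs2, 0) not in result:
                    result.add((lhs2, rhs2, 0))
                    work.append((lhs2, rhs2, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over symbol wherever possible, then take the closure."""
    return closure({(l, r, d + 1) for l, r, d in items
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    """All LR(0) states reachable from the start state."""
    start = closure({("S'", ("E",), 0)})
    states, work = {start}, [start]
    while work:
        state = work.pop()
        for symbol in {r[d] for _, r, d in state if d < len(r)}:
            target = goto(state, symbol)
            if target not in states:
                states.add(target)
                work.append(target)
    return states


def show(item):
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


states = canonical_collection()
print(len(states), "states")
for item in sorted(show(i) for i in closure({("S'", ("E",), 0)})):
    print(item)
```

For this grammar the collection has the same eleven item sets as the slide; the order in which they are discovered is arbitrary.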

CPSC SLR Parsing
- Remember the state of the automaton on each prefix of the stack
- Change the stack to contain pairs ⟨Symbol, DFA State⟩

CPSC SLR Parsing (Contd.)
- For a stack ⟨sym_1, state_1⟩ ... ⟨sym_n, state_n⟩, state_n is the final state of the DFA on sym_1 … sym_n
- Detail: the bottom of the stack is ⟨any, start⟩ where
  - any is any dummy symbol
  - start is the start state of the DFA

CPSC Goto Table
- Define Goto[i, A] = j if the DFA has a transition from state i to state j on A, where A is a nonterminal
- Goto is just the transition function of the DFA
  - One of two parsing tables

CPSC Parser Moves
- Shift x
  - Push ⟨a, x⟩ on the stack
  - a is the current input
  - x is a DFA state
- Reduce A → α
  - As before
- Accept
- Error

CPSC Action Table
For each state s_i and terminal a:
- If s_i has item X → α.aβ and there is a transition on terminal a from state i to state j, then Action[i, a] = shift j
- If s_i has item X → α. and a ∈ Follow(X) and X ≠ S', then Action[i, a] = reduce X → α
- If s_i has item S' → S., then Action[i, $] = accept
- Otherwise, Action[i, a] = error

CPSC SLR Parsing Algorithm

    Let I = w$ be the initial input
    Let j = 0
    Let DFA state 1 have item S' → .S
    Let stack = ⟨dummy, 1⟩
    repeat
        case action[top_state(stack), I[j]] of
            shift k:        push ⟨I[j++], k⟩
            reduce X → α:   pop |α| pairs, I[--j] = X    // prepend X to input
            accept:         halt normally
            error:          halt and report error
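Putting the pieces together, the following Python sketch builds the Action and Goto tables for the example grammar and runs the driver loop. Several things here are assumptions of the sketch rather than the slides' method: the FOLLOW sets are worked out by hand instead of computed, items are `(production index, dot)` pairs, states are numbered from 0, the stack holds only states, and a reduce pushes Goto[state, X] directly instead of prepending X to the input.

```python
GRAMMAR = [                      # production 0 is the augmented start S' -> E
    ("S'", ("E",)),
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("num", "*", "T")),
    ("T", ("num",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
# FOLLOW sets computed by hand for this grammar (assumption of this sketch).
FOLLOW = {"E": {"$", ")"}, "T": {"+", "$", ")"}}


def closure(items):
    result, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for p, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (p, 0) not in result:
                    result.add((p, 0))
                    work.append((p, 0))
    return frozenset(result)


def goto(items, symbol):
    return closure({(p, d + 1) for p, d in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol})


def set_action(action, key, value):
    assert action.get(key, value) == value, f"conflict at {key}"
    action[key] = value


def build_tables():
    states, action, goto_table = [closure({(0, 0)})], {}, {}
    i = 0
    while i < len(states):       # states grows while we scan it
        for p, d in states[i]:
            lhs, rhs = GRAMMAR[p]
            if d < len(rhs):     # dot before a symbol: shift or goto
                target = goto(states[i], rhs[d])
                if target not in states:
                    states.append(target)
                j = states.index(target)
                if rhs[d] in NONTERMINALS:
                    goto_table[i, rhs[d]] = j
                else:
                    set_action(action, (i, rhs[d]), ("shift", j))
            elif p == 0:         # S' -> E .
                set_action(action, (i, "$"), ("accept",))
            else:                # X -> alpha . : reduce on Follow(X)
                for a in FOLLOW[lhs]:
                    set_action(action, (i, a), ("reduce", p))
        i += 1
    return action, goto_table


def parse(tokens, action, goto_table):
    """Return the productions used, in reduction order, or None on error."""
    stack, tokens, pos, used = [0], list(tokens) + ["$"], 0, []
    while True:
        act = action.get((stack[-1], tokens[pos]))
        if act is None:
            return None                          # error entry
        if act[0] == "shift":
            stack.append(act[1])
            pos += 1
        elif act[0] == "reduce":
            lhs, rhs = GRAMMAR[act[1]]
            del stack[len(stack) - len(rhs):]    # pop |rhs| states
            stack.append(goto_table[stack[-1], lhs])
            used.append(act[1])
        else:
            return used                          # accept


ACTION, GOTO = build_tables()
print(parse("num * num + num".split(), ACTION, GOTO))
```

The returned list of production indices, read backwards, is the rightmost derivation of the input. `set_action` raises if two different actions land in the same table cell, which is how a shift-reduce or reduce-reduce conflict would show up for a grammar that is not SLR.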

CPSC Notes on SLR Parsing Algorithm
- Note that the algorithm uses only the DFA states and the input
  - The stack symbols are never used!
- However, we still need the symbols for semantic actions

CPSC The Compiler So Far
- Lexical analysis
  - Detects inputs with illegal tokens
- Parsing
  - Detects inputs with ill-formed parse trees
- Semantic analysis
  - Last "front end" phase
  - Catches all remaining errors

CPSC Typical Semantic Errors
- Multiple declarations: a variable should be declared (in the same scope) at most once.
- Undeclared variable: a variable should not be used before being declared.
- Type mismatch: the type of the left-hand side of an assignment should match the type of the right-hand side.
- Wrong arguments: methods should be called with the right number and types of arguments.

CPSC Sample Semantic Analyzer
For each scope in the program:
- Process the declarations
  - add new entries to the symbol table (or a similar structure), and
  - report any variables that are multiply declared
- Process the statements
  - find uses of undeclared variables,
  - use the symbol-table information to determine the type of each expression, and to find type errors
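A symbol table with one dictionary per open scope is enough to illustrate the first two checks. The class below is a minimal sketch with hypothetical method names; the declarations it is exercised with loosely mirror a program with an outer block and two procedures, plus one duplicate declaration and one undeclared use added purely to trigger the errors.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its declared type."""

    def __init__(self):
        self.scopes = [{}]
        self.errors = []

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, typ):
        if name in self.scopes[-1]:          # only the current scope matters
            self.errors.append(f"multiple declaration of {name}")
        else:
            self.scopes[-1][name] = typ

    def lookup(self, name):
        for scope in reversed(self.scopes):  # innermost declaration wins
            if name in scope:
                return scope[name]
        self.errors.append(f"undeclared variable {name}")
        return None


table = SymbolTable()
table.declare("T", "type")
table.declare("x", "T")

table.enter_scope()                  # procedure Q(x: integer)
table.declare("x", "integer")
table.declare("c", "constant")
x_in_q = table.lookup("x")
table.exit_scope()

table.enter_scope()                  # procedure R
table.declare("b", "Boolean")
table.declare("c", "Boolean")
x_in_r = table.lookup("x")
table.declare("b", "integer")        # illustrative duplicate declaration
table.lookup("z")                    # illustrative undeclared use
table.exit_scope()

print(x_in_q, x_in_r, table.errors)
```

Type checking would build on `lookup`: the type it returns for each name is what an expression checker compares on the two sides of an assignment.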

CPSC Scope Rules for Pascal-
- Rule 6.1: All constants, types, variables, and procedures defined in the same block must have different names.
- Rule 6.2: A constant, type, or variable defined in a block is normally known from the end of its declaration to the end of the block. A procedure defined in a block B is normally known from the beginning of the procedure to the end of the block B.
- Rule 6.3: Consider a block Q that defines an object x. If Q contains a block R that defines another object named x, the first object is unknown in the scope of the second object.

CPSC Pascal- Program (1)

    {Begin Standard Block}
    program P;
    type T = array[1..100] of integer;
    var x: T;

    procedure Q(x: integer);
    const c = 13;
    begin ... x ... end {Q};

    procedure R;
    var b, c: Boolean;
    begin ... x ... end {R};

    begin ... end. {P}
    {End Standard Block}

CPSC Pascal- Program (2)

    {Constant = Numeral | ConstantName.}
    procedure Constant(Stop: Symbols);
    begin
      if Symbol = Numeral1 then
        Expect(Numeral, Stop)
      else if Symbol = Name1 then
        begin
          Find(Argument);
          Expect(Name1, Stop)
        end
      else
        SyntaxError(Stop)
    end;

CPSC Pascal- Program (3)

    {ConstantDefinition = ConstantName '=' Constant ';'.}
    procedure ConstantDefinition(stop: Symbols);
    begin
      ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop);
      Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop);
      Constant(Symbols[Semicolon1] + Stop);
      Define(Name);
      Expect(Semicolon1, Stop)
    end;

CPSC Pascal- Program (4)

    {Program = 'program' ProgramName ';' BlockBody '.'}
    procedure Programx(Stop: Symbols);
    begin
      Expect(Program1, Symbols[Name1, Semicolon1, Period1] + BlockSymbols + Stop);
      Expect(Name1, Symbols[Semicolon1, Period1] + BlockSymbols + Stop);
      Expect(Semicolon1, Symbols[Period1] + BlockSymbols + Stop);
      NewBlock;
      BlockBody(Symbols[Period1] + Stop);
      EndBlock;
      Expect(Period1, Stop)
    end;

CPSC Pascal- Program (5-1)

    {Constant = Numeral | ConstantName.}
    procedure Constant(var Value: integer; var Typex: Pointer; Stop: Symbols);
    begin
      if Symbol = Numeral1 then
        begin
          Value := Argument;
          Typex := TypeInteger;
          Expect(Numeral, Stop)
        end
      else if Symbol = Name1 then
        begin
          Find(Argument, Object);
          if = Constantx then
            begin
              Value :=
              Typex :=
            end

CPSC Pascal- Program (5-2)

          else
            begin
              KindError(object);
              Value := 0;
              Typex := TypeUniversal;
            end;
          Expect(Name1, Stop)
        end
      else
        begin
          SyntaxError(Stop);
          Value := 0;
          Typex := TypeUniversal;
        end;

CPSC Pascal- Program (6)

    {ConstantDefinition = ConstantName '=' Constant ';'.}
    procedure ConstantDefinition(stop: Symbols);
    var Name, Value: integer;
        Constx, Typex: Pointer;
    begin
      ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop);
      Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop);
      Constant(Value, Typex, Symbols[Semicolon1] + Stop);
      Define(Name, Constantx, Constx);
      := Value;
      := Typex;
      Expect(Semicolon1, Stop)
    end;

CPSC Static and Dynamic Scope

    #include <stdio.h>

    int x = 1;
    char y = 'a';

    void p() {
      double x = 2.5;
      printf("%c\n", y);
      { int y[10]; }
    }

    void q() {
      int y = 42;
      printf("%d\n", x);
      p();
    }

    int main() {
      char x = 'b';
      q();
      return 0;
    }
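The difference between the two scoping disciplines can be simulated by changing only how a name is looked up. The sketch below assumes the slide's program has globals `int x = 1` and `char y = 'a'`, that `main` declares `char x = 'b'` and calls `q`, that `q` declares `int y = 42`, prints `x` and calls `p`, and that `p` declares `double x = 2.5` and prints `y`. Modelling each activation as a dictionary is an assumption of this illustration, not how C is implemented.

```python
GLOBALS = {"x": 1, "y": "a"}


def lookup(name, call_stack, dynamic):
    """Static: current frame, then globals. Dynamic: every caller's frame first."""
    frames = reversed(call_stack) if dynamic else call_stack[-1:]
    for frame in frames:
        if name in frame:
            return frame[name]
    return GLOBALS[name]


def run(dynamic):
    printed = []
    stack = [{"x": "b"}]                            # main: char x = 'b'
    stack.append({"y": 42})                         # q:    int y = 42
    printed.append(lookup("x", stack, dynamic))     # q prints x
    stack.append({"x": 2.5})                        # p:    double x = 2.5
    printed.append(lookup("y", stack, dynamic))     # p prints y
    return printed


static_result = run(dynamic=False)
dynamic_result = run(dynamic=True)
print(static_result)
print(dynamic_result)
```

Under static scope `q` sees the global `x` and `p` sees the global `y`. Under dynamic scope `q` sees the `x` of `main` and `p` sees the `y` of `q`; with the slide's format strings those values would print as 98 (the character code of 'b') and as the character with code 42, which is '*'.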