COMPILER CONSTRUCTION


COMPILER CONSTRUCTION Principles and Practice Kenneth C. Louden

5. Bottom-Up Parsing PART ONE

Contents
PART ONE
5.1 Overview of Bottom-Up Parsing
5.2 Finite Automata of LR(0) Items and LR(0) Parsing
PART TWO
5.3 SLR(1) Parsing
5.4 General LR(1) and LALR(1) Parsing
5.5 Yacc: An LALR(1) Parser Generator
PART THREE
5.6 Generation of a TINY Parser Using Yacc
5.7 Error Recovery in Bottom-Up Parsers

5.1 Overview of Bottom-Up Parsing

A bottom-up parser uses an explicit stack to perform a parse. The parsing stack contains tokens and nonterminals, as well as some extra state information. The stack is empty at the beginning of a bottom-up parse and contains the start symbol at the end of a successful parse. A schematic for bottom-up parsing:

$                  InputString$
. . .
$ StartSymbol      $               accept

where the parsing stack is on the left, the input is in the center, and the actions of the parser are on the right.

A bottom-up parser has two possible actions (besides "accept"):
Shift a terminal from the front of the input to the top of the stack.
Reduce a string α at the top of the stack to a nonterminal A, given the BNF choice A → α.
A bottom-up parser is thus sometimes called a shift-reduce parser. One further feature of bottom-up parsers is that grammars are always augmented with a new start symbol, via a new production S' → S.

Example 5.1 The augmented grammar for balanced parentheses:
S' → S
S → (S)S | ε
A bottom-up parse of the string ( ) using this grammar is given in the following table.

    Parsing stack   Input    Action
1   $               ( )$     shift
2   $(              )$       reduce S → ε
3   $(S             )$       shift
4   $(S)            $        reduce S → ε
5   $(S)S           $        reduce S → (S)S
6   $S              $        reduce S' → S
7   $S'             $        accept

Example 5.2 The augmented grammar for rudimentary arithmetic expressions:
E' → E
E → E + n | n
A bottom-up parse of the string n + n using this grammar is given in the following table.

    Parsing stack   Input    Action
1   $               n+n$     shift
2   $n              +n$      reduce E → n
3   $E              +n$      shift
4   $E+             n$       shift
5   $E+n            $        reduce E → E + n
6   $E              $        reduce E' → E
7   $E'             $        accept
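To make the two actions concrete, here is a minimal Python sketch (an illustration, not from the text) that replays the parse of n + n by hand: shift moves a token onto the stack, reduce pops a handle and pushes the nonterminal. The action sequence is hard-coded; deciding automatically when to shift and when to reduce is exactly what the LR(0) machinery of Section 5.2 supplies.

```python
# Minimal shift-reduce mechanics for E' -> E, E -> E + n | n (Example 5.2).
# The action sequence is hard-coded; LR(0) parsing chooses it automatically.

def shift(stack, tokens):
    stack.append(tokens.pop(0))          # move the next input token onto the stack

def reduce(stack, lhs, rhs):
    assert stack[-len(rhs):] == rhs      # the handle must sit on top of the stack
    del stack[-len(rhs):]                # pop the handle ...
    stack.append(lhs)                    # ... and push the nonterminal

stack, tokens = [], list("n+n") + ["$"]
def trace(act): print("$" + "".join(stack), "".join(tokens), act, sep="\t")

shift(stack, tokens);                 trace("shift")
reduce(stack, "E", ["n"]);            trace("reduce E -> n")
shift(stack, tokens);                 trace("shift")
shift(stack, tokens);                 trace("shift")
reduce(stack, "E", ["E", "+", "n"]);  trace("reduce E -> E + n")
reduce(stack, "E'", ["E"]);           trace("reduce E' -> E")
print("accept" if stack == ["E'"] and tokens == ["$"] else "error")
```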

The handle of a right sentential form consists of:
A string, together with
The position in the right sentential form where it occurs, and
The production used to reduce it.
Determining the next handle is the main task of a shift-reduce parser. (In the parse of Example 5.2 above, at each reduce step the handle string sits on top of the stack.)

Note:
The string of a handle forms a complete right-hand side of one production.
The rightmost position of the handle string is at the top of the stack.
To be the handle, it is not enough for the string at the top of the stack to match the right-hand side of a production. Indeed, if an ε-production is available for reduction, as in Example 5.1, then its right-hand side (the empty string) is always at the top of the stack. Reductions occur only when the resulting string is indeed a right sentential form. For example, in step 3 of Table 5.1 a reduction by S → ε could be performed, but the resulting string ( S S ) is not a right sentential form, and thus ε is not the handle at this position in the sentential form ( S ).

5.2 FINITE AUTOMATA OF LR(0) ITEMS AND LR(0) PARSING

5.2.1 LR(0) Items

An LR(0) item of a context-free grammar is a production choice with a distinguished position in its right-hand side. The distinguished position is indicated by a period. Example: if A → α is a production choice, and if β and γ are any two strings of symbols (including the empty string ε) such that βγ = α, then A → β·γ is an LR(0) item. These are called LR(0) items because they contain no explicit reference to lookahead.

Example 5.3 The grammar of Example 5.1:
S' → S
S → (S)S | ε
This grammar has three production choices and eight items:
S' → ·S
S' → S·
S → ·(S)S
S → (·S)S
S → (S·)S
S → (S)·S
S → (S)S·
S → ·

Example 5.4 The grammar of Example 5.2:
E' → E
E → E + n | n
There are the following eight items:
E' → ·E
E' → E·
E → ·E + n
E → E·+ n
E → E +·n
E → E + n·
E → ·n
E → n·
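The items of a grammar can be listed mechanically by placing the dot at every position of every right-hand side. The following sketch (the grammar encoding and names are illustrative assumptions) prints the eight items of Example 5.4.

```python
# Enumerate the LR(0) items of a grammar: one item per dot position
# in each production's right-hand side.
grammar = [("E'", ["E"]),
           ("E",  ["E", "+", "n"]),
           ("E",  ["n"])]

def items(grammar):
    for lhs, rhs in grammar:
        for dot in range(len(rhs) + 1):
            yield lhs, rhs[:dot], rhs[dot:]

for lhs, before, after in items(grammar):
    print(lhs, "->", *before, ".", *after)
```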

5.2.2 Finite Automata of Items

The LR(0) items can be used as the states of a finite automaton that maintains information about the parsing stack and the progress of a shift-reduce parse. This will start out as a nondeterministic finite automaton (NFA). From the NFA of LR(0) items we construct the DFA of sets of LR(0) items using the subset construction of Chapter 2. What are the transitions of the NFA of LR(0) items? Consider the item A → α·γ, and suppose γ begins with the symbol X, which may be either a token or a nonterminal, so that the item can be written as A → α·Xη.

There is a transition on the symbol X from the state represented by this item to the state represented by the item A → αX·η. In graphical form:

A → α·Xη   --X-->   A → αX·η

(1) If X is a token, then this transition corresponds to a shift of X from the input to the top of the stack during a parse.
(2) If X is a nonterminal, then the interpretation of this transition is more complex, since X will never appear as an input symbol.

In fact, such a transition will still correspond to the pushing of X onto the stack during a parse; this can only occur during a reduction by a production X → β. Such a reduction must be preceded by the recognition of a β, and the state given by the initial item X → ·β represents the beginning of this process (the dot indicating that we are about to recognize a β). Thus, for every item A → α·Xη we must add an ε-transition

A → α·Xη   --ε-->   X → ·β

for every production choice X → β of X, indicating that X can be produced by recognizing any of the right-hand sides of its production choices.

The start state of the NFA should correspond to the initial state of the parser: the stack is empty, and we are about to recognize an S, where S is the start symbol of the grammar. Thus, any initial item S → ·α constructed from a production choice for S could serve as a start state. Unfortunately, there may be many such production choices for S. The solution is to augment the grammar by a single production S' → S, where S' is a new nonterminal.

S' then becomes the start symbol of the augmented grammar, and the initial item S' → ·S becomes the start state of the NFA. This is the reason we have augmented the grammars of the previous examples. The NFA will have no accepting states at all: the purpose of the NFA is to keep track of the state of a parse, not to recognize strings; the parser itself will decide when to accept, and the NFA need not contain that information.
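A small sketch can make this construction explicit. The code below is illustrative (the item representation and helper names are assumptions, not the book's notation); it lists every X-transition and every ε-transition of the NFA of items for the augmented grammar of Example 5.1.

```python
# NFA of LR(0) items for  S' -> S,  S -> ( S ) S | eps.
# An item A -> alpha . X eta has an X-transition to A -> alpha X . eta and,
# when X is a nonterminal, eps-transitions to every initial item of X.
grammar = [("S'", ["S"]), ("S", ["(", "S", ")", "S"]), ("S", [])]
nonterminals = {lhs for lhs, _ in grammar}

def nfa_edges(grammar):
    for lhs, rhs in grammar:
        for dot in range(len(rhs)):
            item = (lhs, tuple(rhs[:dot]), tuple(rhs[dot:]))
            succ = (lhs, tuple(rhs[:dot + 1]), tuple(rhs[dot + 1:]))
            X = rhs[dot]
            yield item, X, succ                                    # shift/goto edge
            if X in nonterminals:
                for lhs2, rhs2 in grammar:
                    if lhs2 == X:
                        yield item, "eps", (lhs2, (), tuple(rhs2)) # closure edge

for src, label, dst in nfa_edges(grammar):
    print(src, f"--{label}-->", dst)
```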

Example 5.5 The NFA with the eight LR(0) items of the grammar of Example 5.1 is shown in Figure 5.1. Note: every item in the figure with a dot before the nonterminal S has an ε-transition to every initial item of S.

Example 5.6 The NFA of the LR(0) items associated with the grammar of Example 5.2 is shown in Figure 5.2. Note: the initial item E → ·E + n has an ε-transition to itself. (This situation will occur in all grammars with immediate left recursion.)

In order to describe the use of items to keep track of the parsing state, we must construct the DFA of sets of items corresponding to the NFA of items, according to the subset construction. We now perform the subset construction for the two NFAs just given.
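In code, the subset construction for LR(0) items is usually phrased with two helpers: a closure over the ε-transitions and a goto over the X-transitions. The sketch below is illustrative (the function names, grammar encoding, and the particular calls shown are assumptions), using the grammar of Example 5.2.

```python
# closure: add the initial items reachable through eps-transitions.
# goto_:   follow the transitions on one grammar symbol, then close again.
grammar = [("E'", ["E"]), ("E", ["E", "+", "n"]), ("E", ["n"])]
nonterminals = {lhs for lhs, _ in grammar}

def closure(items):
    items, changed = set(items), True
    while changed:
        changed = False
        for lhs, before, after in list(items):
            if after and after[0] in nonterminals:
                for lhs2, rhs2 in grammar:
                    new = (lhs2, (), tuple(rhs2))
                    if lhs2 == after[0] and new not in items:
                        items.add(new)
                        changed = True
    return frozenset(items)

def goto_(items, X):
    moved = {(lhs, before + (X,), after[1:])
             for lhs, before, after in items if after and after[0] == X}
    return closure(moved)

state0 = closure({("E'", (), ("E",))})   # start state of the DFA
print(sorted(state0))
print(sorted(goto_(state0, "E")))        # state reached on E: { E'->E. , E->E.+n }
```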

Example 5.7 Consider the NFA of Figure 5.1.

Example 5.8 Consider the NFA of Figure 5.2.

5.2.3 The LR(0) Parsing Algorithm

The algorithm depends on keeping track of the current state in the DFA of sets of items. To do so, we modify the parsing stack to store not only symbols but also state numbers, pushing the new state number onto the parsing stack after each push of a symbol. Schematically:

$ 0           InputString$
$ 0 n 2       Rest of the input string$

Definition: The LR(0) parsing algorithm. Let s be the current state (at the top of the parsing stack). Then actions are defined as follows:
1. If state s contains any item of the form A → α·Xβ, where X is a terminal, then the action is to shift the current input token onto the stack. If this token is X, and state s contains item A → α·Xβ, then the new state to be pushed on the stack is the state containing the item A → αX·β. If this token is not X for some item in state s of the form just described, an error is declared.

2. If state s contains any complete item (an item of the form A → γ·), then the action is to reduce by the rule A → γ. A reduction by the rule S' → S, where S is the start symbol, is equivalent to acceptance, provided the input is empty, and is an error if the input is not empty. In all other cases, the new state is computed as follows:
Remove the string γ and all of its corresponding states from the parsing stack.
Correspondingly, back up in the DFA to the state from which the construction of γ began.
Again, by the construction of the DFA, this state must contain an item of the form B → α·Aβ.
Push A onto the stack, and push (as the new state) the state containing the item B → αA·β.
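These two rules translate almost directly into a table-driven driver. The following sketch is illustrative: the table layout (an action entry per state and a goto_tab keyed on (state, symbol) pairs) and all names are assumptions, not the book's notation. A concrete table in this format appears after the parsing-table slide below.

```python
# Illustrative LR(0) driver.  action[s] is "shift" or ("reduce", lhs, rhs);
# goto_tab[(s, X)] is the state pushed after a shift of token X or after
# pushing the nonterminal X following a reduction.
def lr0_parse(tokens, action, goto_tab, start_symbol):
    stack = ["$", 0]                      # alternating symbols and states
    tokens = list(tokens) + ["$"]
    while True:
        s = stack[-1]
        if action[s] == "shift":          # rule 1: shift token, push new state
            tok = tokens.pop(0)
            if (s, tok) not in goto_tab:
                return "error"
            stack += [tok, goto_tab[(s, tok)]]
        else:                             # rule 2: reduce by A -> gamma
            _, lhs, rhs = action[s]
            if lhs == start_symbol:       # reduction by S' -> S
                return "accept" if tokens == ["$"] else "error"
            if rhs:                       # pop gamma and its states
                del stack[-2 * len(rhs):]
            stack += [lhs, goto_tab[(stack[-1], lhs)]]
```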

Example 5.9 Consider the grammar A → (A) | a, augmented with A' → A. The LR(0) parse of the string ((a)) is given in the following table.

    Parsing stack          Input     Action
1   $ 0                    ((a))$    shift
2   $ 0 ( 3                (a))$     shift
3   $ 0 ( 3 ( 3            a))$      shift
4   $ 0 ( 3 ( 3 a 2        ))$       reduce A → a
5   $ 0 ( 3 ( 3 A 4        ))$       shift
6   $ 0 ( 3 ( 3 A 4 ) 5    )$        reduce A → (A)
7   $ 0 ( 3 A 4            )$        shift
8   $ 0 ( 3 A 4 ) 5        $         reduce A → (A)
9   $ 0 A 1                $         accept

The parsing table: the DFA of sets of items and the actions specified by the LR(0) parsing algorithm can be combined into a parsing table, so that LR(0) parsing becomes a table-driven parsing method.

An example of such a parsing table, for the grammar of Example 5.9:

State   Action    Rule        Input            Goto
                              (     a     )    A
0       shift                 3     2          1
1       reduce    A' → A
2       reduce    A → a
3       shift                 3     2          4
4       shift                             5
5       reduce    A → (A)

One column is reserved to indicate the action for each state;
A further column is used to indicate the grammar choice for the reduction;
For each token, there is a column to record the new state;
Transitions on nonterminals are listed in the Goto section.
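Encoded in the format used by the lr0_parse sketch above, this table reads as follows; the state numbers are the ones appearing in the parse of Example 5.9, and the encoding itself is an illustrative assumption.

```python
# LR(0) parsing table for A' -> A, A -> ( A ) | a, in the driver's format.
action = {
    0: "shift",
    1: ("reduce", "A'", ["A"]),           # acceptance when the input is exhausted
    2: ("reduce", "A",  ["a"]),
    3: "shift",
    4: "shift",
    5: ("reduce", "A",  ["(", "A", ")"]),
}
goto_tab = {
    (0, "("): 3, (0, "a"): 2, (0, "A"): 1,
    (3, "("): 3, (3, "a"): 2, (3, "A"): 4,
    (4, ")"): 5,
}
print(lr0_parse(list("((a))"), action, goto_tab, "A'"))   # prints: accept
```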

End of Part One THANKS