LR-Grammars: LR(0), LR(1), and LR(k)



Deterministic Context-Free Languages (DCFLs)
The family of languages accepted by deterministic pushdown automata (DPDAs). Many programming languages can be described by means of DCFLs.

Prefix and Proper Prefix
A prefix of a string is any number of leading symbols of that string. Example: the prefixes of abc are ε, a, ab, and abc.
A proper prefix of a string is a prefix other than the string itself. The proper prefixes of abc are ε, a, and ab.

Prefix Property
A context-free language (CFL) L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L.
This is not considered a severe restriction. Why? Because we can easily convert a DCFL to a DCFL with the prefix property by introducing an endmarker.
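
The prefix definitions and the endmarker trick can be tried out on a small finite language. The following is a minimal sketch, assuming words are Python strings; the function names and the sample language {ab, abab} are hypothetical and are not part of the slides.

```python
def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def proper_prefixes(w):
    """Every prefix of w except w itself."""
    return prefixes(w)[:-1]


def has_prefix_property(language):
    """True if no word of the (finite) language is a proper prefix of another."""
    words = set(language)
    return not any(p in words for w in words for p in proper_prefixes(w))


def add_endmarker(language, marker="$"):
    """Append an endmarker (assumed not to occur in any word) to every word."""
    return {w + marker for w in language}


sample = {"ab", "abab"}                            # hypothetical finite language
print(prefixes("abc"))                             # ['', 'a', 'ab', 'abc']
print(has_prefix_property(sample))                 # False: ab is a proper prefix of abab
print(has_prefix_property(add_endmarker(sample)))  # True
```

The check only works for finite sets of words; for an infinite DCFL the same idea is carried out on the automaton rather than by enumeration.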

Suffix and Proper Suffix
A suffix of a string is any number of trailing symbols of that string. A proper suffix of a string is a suffix other than the string itself.

Example Grammar
This is the grammar that will be used in many of the examples:
S' → Sc
S → SA | A
A → aSb | ab

LR-Grammar
Left-to-right scan of the input producing a rightmost derivation. Simply: L stands for left-to-right, R stands for rightmost derivation.

LR-Items
An item (for a given CFG) is a production with a dot anywhere in the right side, including the beginning and the end. In the event of an ε-production B → ε, B → · is an item.

Example: Items
Given our example grammar S' → Sc, S → SA | A, A → aSb | ab, the items for the grammar are:
S' → ·Sc, S' → S·c, S' → Sc·
S → ·SA, S → S·A, S → SA·, S → ·A, S → A·
A → ·aSb, A → a·Sb, A → aS·b, A → aSb·, A → ·ab, A → a·b, A → ab·
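
The item list above can be generated mechanically by placing the dot at every position of every production. This is a small sketch; the encoding of the grammar as (head, body) pairs and the names `items` and `show` are assumptions made for illustration.

```python
# The example grammar as (head, body) pairs; each body is a tuple of symbols.
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def items(grammar):
    """One item (head, body, dot) for every dot position in every production."""
    return [(head, body, dot)
            for head, body in grammar
            for dot in range(len(body) + 1)]


def show(item):
    """Render an item with the dot in place, e.g. A → a·Sb."""
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"


for item in items(GRAMMAR):
    print(show(item))
```

An ε-production is encoded with an empty body, so it yields the single item B → ·, matching the definition of an item.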

Some Notation
⇒* = zero or more steps in a derivation
⇒*rm = zero or more steps in a rightmost derivation
⇒rm = a single step in a rightmost derivation

Right-Sentential Form
A string α of terminals and variables is called a sentential form if S ⇒* α. A right-sentential form is a sentential form that can be derived by a rightmost derivation.

More terms
Handle: a substring which matches the right-hand side of a production and whose reduction represents one step in the derivation.
More formally, a handle of a right-sentential form γ for CFG G is a substring β such that S ⇒*rm δAw ⇒rm δβw and δβw = γ.
If the grammar is unambiguous and has no useless symbols, then the rightmost derivation of a right-sentential form, and therefore its handle, is unique.

Example
Given our example grammar S' → Sc, S → SA | A, A → aSb | ab, an example rightmost derivation is:
S' ⇒ Sc ⇒ SAc ⇒ SaSbc
Therefore we can say that SaSbc is a right-sentential form, and its handle is aSb.
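
The derivation in this example can be replayed step by step by always expanding the rightmost variable. A minimal sketch, assuming a sentential form is a list of symbols and that every symbol in a production body is a single character; `rightmost_step` is a hypothetical helper name.

```python
NONTERMINALS = {"S'", "S", "A"}


def rightmost_step(form, head, body):
    """Replace the rightmost variable of `form` (which must be `head`) by `body`."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in NONTERMINALS:
            if form[i] != head:
                raise ValueError(f"rightmost variable is {form[i]}, not {head}")
            return form[:i] + list(body) + form[i + 1:]
    raise ValueError("no variable left to expand")


form = ["S'"]
for head, body in [("S'", "Sc"), ("S", "SA"), ("A", "aSb")]:
    form = rightmost_step(form, head, body)
    print("".join(form))          # Sc, then SAc, then SaSbc
```

The body inserted by the last step, aSb, is exactly the handle of SaSbc: reducing it to A undoes that step of the derivation.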

More terms
Viable prefix (of a right-sentential form γ): any prefix of γ ending no farther right than the right end of a handle of γ.
Complete item: an item in which the dot is the rightmost symbol.

Example
Given our example grammar S' → Sc, S → SA | A, A → aSb | ab, consider the right-sentential form abc:
S' ⇒*rm Ac ⇒ abc
An item A → α·β is valid for a viable prefix γ if there is a rightmost derivation S ⇒*rm δAw ⇒rm δαβw with δα = γ.
Valid items:
A → ab· for prefix ab
A → a·b for prefix a
A → ·ab for prefix ε
A → ab· is a complete item, so Ac is the right-sentential form preceding abc.

LR(0)
Left-to-right scan of the input producing a rightmost derivation with a look-ahead (on the input) of 0 symbols. LR(0) grammars are a restricted type of CFG and the first in the family of LR-grammars. They define exactly the DCFLs having the prefix property.

Computing Sets of Valid Items
The definition of LR(0), and the method of accepting L(G) for an LR(0) grammar G by a DPDA, both depend on knowing the set of valid items for each viable prefix γ. For every CFG G, the set of viable prefixes is a regular set. This regular set is accepted by an NFA whose states are the items for G.

Continued
Given an NFA (whose states are the items for G) that accepts this regular set, we can apply the subset construction to the NFA and obtain a DFA. The state this DFA reaches on reading a viable prefix γ is the set of valid items for γ.

NFA M
NFA M recognizes the viable prefixes for the CFG G = (V, T, P, S).
M = (Q, V ∪ T, δ, q0, Q), where Q is the set of items for G plus the state q0.
Three Rules
1. δ(q0, ε) = {S → ·α | S → α is a production}
2. δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production}. This allows expansion of a variable B appearing immediately to the right of the dot.
3. δ(A → α·Xβ, X) = {A → αX·β}. This permits moving the dot over any grammar symbol X if X is the next input symbol.

Theorem 10.9
The NFA M has the property that δ(q0, γ) contains A → α·β iff A → α·β is valid for γ.
This theorem gives a method for computing the sets of valid items for any viable prefix. Note: M is an NFA, and it can be converted to a DFA. Then, by inspecting each state of the DFA, it can be determined whether the grammar is LR(0).

Definition of LR(0) Grammar
G is an LR(0) grammar if:
1. The start symbol does not appear on the right side of any production.
2. For every viable prefix γ of G, whenever A → α· is a complete item valid for γ, it is unique; i.e., no other complete item, and no item with a terminal to the right of the dot, is valid for γ.
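
The subset construction and the LR(0) test can be sketched for the example grammar as follows. This is an illustrative sketch and not the slides' own construction: `closure` follows the ε-moves of the NFA, `goto` follows a move over a grammar symbol, and the encoding of items as (production index, dot) pairs, the function names, and the assumption that every symbol in a body is a single character are all introduced here.

```python
GRAMMAR = [("S'", "Sc"), ("S", "SA"), ("S", "A"), ("A", "aSb"), ("A", "ab")]
VARIABLES = {head for head, _ in GRAMMAR}


def closure(items):
    """Follow the ε-moves: add B → ·γ whenever some item has the dot before B."""
    items, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in VARIABLES:
            for q, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def build_dfa():
    """Subset construction: each DFA state is a set of (production index, dot)."""
    states = [closure({(0, 0)})]          # q0's ε-move leads to S' → ·Sc
    transitions = {}
    for state in states:                  # the list grows while we iterate
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for x in sorted(symbols):
            target = goto(state, x)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), x)] = states.index(target)
    return states, transitions


def is_lr0(states):
    """Check both conditions of the LR(0) definition on the DFA states."""
    start = GRAMMAR[0][0]
    if any(start in body for _, body in GRAMMAR):
        return False                      # start symbol on a right side
    for state in states:
        complete = [p for p, d in state if d == len(GRAMMAR[p][1])]
        shifts = [p for p, d in state
                  if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] not in VARIABLES]
        if complete and (len(complete) > 1 or shifts):
            return False                  # a complete item that is not alone
    return True


states, transitions = build_dfa()
print(len(states), is_lr0(states))        # 9 True
```

Only states reachable from the start state are built, so every state inspected corresponds to at least one viable prefix.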

Facts we now know:
Every LR(0) grammar generates a DCFL.
Every DCFL with the prefix property has an LR(0) grammar.
Every language with an LR(0) grammar has the prefix property.
L is a DCFL iff L$ has an LR(0) grammar, where $ is an endmarker.

DPDAs from LR(0) Grammars
We trace out the rightmost derivation in reverse. The stack holds a viable prefix (of a right-sentential form) interleaved with states of the DFA.
Viable prefix: X1X2…Xk
States: s0, s1, …, sk, where si is the DFA state reached on X1…Xi
Stack: s0X1s1…Xksk

Reduction
If sk contains A → α·, then A → α· is valid for X1X2…Xk, so α is a suffix of X1X2…Xk. Let α = Xi+1…Xk. There exists a w such that X1…Xkw is a right-sentential form.

Reduction Continued
There is a derivation: S ⇒*rm X1…XiAw ⇒rm X1…Xkw
To obtain the right-sentential form preceding X1…Xkw in a rightmost derivation, we reduce α to A. Therefore, we pop Xi+1…Xk from the stack and push A onto the stack.

Shift
If sk contains only incomplete items, then the right-sentential form preceding X1…Xkw cannot be obtained by a reduction at this point. Instead, we simply “shift” the next input symbol onto the stack.
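
The reduce and shift moves can be put together into a small parser for the example grammar. This is a sketch under assumptions, not the DPDA construction itself: the stack holds only the DFA states (the grammar symbols X1…Xk are implied by them), states are computed on the fly with hypothetical `closure` and `goto` helpers, and every symbol in a production body is assumed to be a single character.

```python
GRAMMAR = [("S'", "Sc"), ("S", "SA"), ("S", "A"), ("A", "aSb"), ("A", "ab")]
VARIABLES = {head for head, _ in GRAMMAR}


def closure(items):
    """Add B → ·γ whenever some item has the dot immediately before B."""
    items, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in VARIABLES:
            for q, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)


def goto(state, symbol):
    """DFA transition: move the dot over `symbol`, or None if no item allows it."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def parse(w):
    """Return the reductions performed (a rightmost derivation in reverse),
    or None if w is rejected."""
    stack = [closure({(0, 0)})]                     # s0
    reductions, pos = [], 0
    while True:
        state = stack[-1]
        complete = [p for p, d in state if d == len(GRAMMAR[p][1])]
        if complete:                                # reduce
            head, body = GRAMMAR[complete[0]]       # unique because G is LR(0)
            reductions.append(f"{head} → {body}")
            if head == "S'":                        # reduced to the start symbol
                return reductions if pos == len(w) else None
            stack = stack[:len(stack) - len(body)]  # pop one state per symbol of α
            target = goto(stack[-1], head)          # push A (via its state)
        else:                                       # shift
            target = goto(state, w[pos]) if pos < len(w) else None
            pos += 1
        if target is None:
            return None
        stack.append(target)


print(parse("aabbc"))
print(parse("ab"))      # None: the endmarker-like c is missing
```

Reading the returned list from last to first gives the rightmost derivation of the input.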

Theorem 10.10
If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M, where N(M) is the language M accepts by empty stack (null stack).

Proof
Construct from G the DFA D whose transition function recognizes G's viable prefixes. The stack symbols of M are the grammar symbols of G and the states of D. M has start state q and other states used to perform reductions.

We know that:
If G is LR(0), then a reduction is the only way to get the preceding right-sentential form when the state of the DFA on top of the stack contains a complete item. When M starts on input w, it constructs a rightmost derivation for w in reverse order.

What we need to prove:
When a shift is called for and the top DFA state on the stack has only incomplete items, there is no handle among the grammar symbols on the stack. (Note: if there were a handle, then some DFA state on the stack would have a complete item.)

Suppose there exists a state containing a complete item A → α·.
Each state, when first put onto the stack, is on top of the stack. The handle α would then immediately be reduced to A. Therefore, a state containing a complete item cannot possibly become buried in the stack.

Proof continued
Acceptance occurs when the top of the stack contains the start symbol. By the definition of LR(0) grammars, the start symbol cannot appear on the right side of a production, so L(G) always has the prefix property if G is LR(0).

Conclusion of Proof
Thus, if w is in L(G), M finds the rightmost derivation of w, reduces w to S, and accepts. Conversely, if M accepts w, then the sequence of right-sentential forms provides a derivation of w from S. Hence N(M) = L(G).

Corollary of Theorem 10.10
Every LR(0) grammar is unambiguous. Why? The rightmost derivation of w is unique (given the construction we provided).

LR(1) Grammars
LR grammars with 1 symbol of look-ahead. All and only the deterministic CFLs have LR(1) grammars.
They are of great importance to compiler design. Why? Because they are broad enough to include the syntax of almost all programming languages, yet restrictive enough to have efficient parsers (that are essentially DPDAs).

LR(1) Item
An LR(1) item consists of an LR(0) item followed by a look-ahead set consisting of terminals and/or the special symbol $, which marks the right end of the string.
General form: A → α·β, {a1, a2, …, an}
The set of LR(1) items forms the states of an NFA that recognizes viable prefixes; the set of valid items for each viable prefix is computed by converting this NFA to a DFA.

A grammar is LR(1) if:
1. The start symbol does not appear on the right side of any production.
2. Whenever the set of items I valid for some viable prefix includes a complete item A → α·, {a1, …, an}, then:
(a) no ai appears immediately to the right of the dot in any item of I, and
(b) if B → β·, {b1, …, bk} is another complete item in I, then ai ≠ bj for any 1 ≤ i ≤ n and 1 ≤ j ≤ k.
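
Conditions (a) and (b) can be checked on a single set of items with a few set intersections. A minimal sketch, assuming an item is a tuple (head, body, dot, lookaheads) with single-character symbols; the function name and the sample item sets (taken from a hypothetical expression grammar, not from the slides) are illustrative only.

```python
def lr1_conflicts(item_set):
    """List the violations of conditions (a) and (b) within one set of items.

    Each item is (head, body, dot, lookaheads), with `lookaheads` a set of
    terminals and/or "$".
    """
    complete = [it for it in item_set if it[2] == len(it[1])]
    after_dot = {body[dot] for _, body, dot, _ in item_set if dot < len(body)}
    found = []
    for i, (head, body, _, lookaheads) in enumerate(complete):
        clash = lookaheads & after_dot               # condition (a)
        if clash:
            found.append(("shift/reduce", f"{head} → {body}·", sorted(clash)))
        for _, _, _, other_lookaheads in complete[i + 1:]:
            clash = lookaheads & other_lookaheads    # condition (b)
            if clash:
                found.append(("reduce/reduce", f"{head} → {body}·", sorted(clash)))
    return found


# Hypothetical item sets used only to exercise the check.
ok = [("E", "T", 1, {"$", "+"}), ("T", "T*F", 1, {"$", "+", "*"})]
bad = [("E", "T", 1, {"$", "*"}), ("T", "T*F", 1, {"$"})]
print(lr1_conflicts(ok))     # []
print(lr1_conflicts(bad))    # [('shift/reduce', 'E → T·', ['*'])]
```

A grammar satisfies condition 2 when this check returns an empty list for every set of items valid for some viable prefix.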

Accepting an LR(1) Language
The DPDA is similar to the one used with LR(0) grammars. However, it is allowed to use the next input symbol in its decision making. This is accomplished by appending a $ to the end of the input and having the DPDA keep the next input symbol as part of its state.

LR(1) Rules for Reduce/Shift
1. If the top set of items has a complete item A → α·, {a1, a2, …, an}, where A ≠ S, reduce by A → α if the current input symbol is in {a1, a2, …, an}.
2. If the top set of items has an item S → α·, {$}, then reduce by S → α and accept if the current symbol is $ (i.e., the end of the input is reached).
3. If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift.

Regarding the Rules
The LR(1) conditions guarantee that at most one of the rules applies for any input symbol or $. For practicality, the information is often summarized in a table:
Rows: sets of items
Columns: terminals and $
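
One row of such a table can be derived from a set of LR(1) items by applying the three reduce/shift rules to each item. A rough sketch, assuming the item set is conflict-free, items are tuples (head, body, dot, lookaheads) with single-character symbols, and the start symbol is named S; `table_row` and the sample items (from a hypothetical expression grammar) are illustrative names, not part of the slides.

```python
def table_row(item_set, terminals, start="S"):
    """Actions for one set of items, keyed by the current input symbol."""
    row = {}
    for head, body, dot, lookaheads in item_set:
        if dot == len(body):                       # complete item: rules 1 and 2
            for a in lookaheads:
                accept = head == start and a == "$"
                row[a] = "accept" if accept else f"reduce {head} → {body}"
        elif body[dot] in terminals:               # terminal after the dot: rule 3
            row[body[dot]] = "shift"
    return row


# Hypothetical item set: E → T·, {$, +} and T → T·*F, {$, +, *}
row = table_row(
    [("E", "T", 1, {"$", "+"}), ("T", "T*F", 1, {"$", "+", "*"})],
    terminals={"+", "*"},
)
for symbol in sorted(row):
    print(symbol, row[symbol])
```

Input symbols missing from a row are errors: none of the three rules applies, so the input is rejected.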