LR(1) grammars The Chinese University of Hong Kong Fall 2010

Presentation transcript:

CSCI 3130: Automata theory and formal languages
LR(1) grammars
Andrej Bogdanov, The Chinese University of Hong Kong, Fall 2010
http://www.cse.cuhk.edu.hk/~andrejb/csc3130

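LR(0) parsing review
Motivation: fast parsing for programming languages.
A parser generator takes a CFG G and outputs a "PDA" for parsing G, or reports an error if G is not LR(0).
Example grammar: A → aAb | ab
[Figure: LR(0) automaton for this grammar, with states 1 to 5 built from the items A → •aAb, A → •ab, A → a•Ab, A → a•b, A → aA•b, A → aAb•, A → ab•]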

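Parsing computer programs
Example program: if (n == 0) { return x; } else { return x + 1; }
[Figure: parse tree of this program. Statement expands to: if ParExpression Statement else Statement. ParExpression expands to ( Expression ). Each inner Statement expands to Block ...]
Most programming language CFGs are not LR(0)!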

LR(0) parsing review
Grammar: A → aAb | ab
LR(0) automaton:
  state 1: A → •aAb, A → •ab
  state 2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  state 3: A → aA•b
  state 4: A → aAb•
  state 5: A → ab•
  transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4
Parsing the input aabb (S = shift, R = reduce):
  stack    state  action
  (empty)  1      S
  1        2      S
  12       2      S
  122      5      R
  12       3      S
  123      4      R
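The shift/reduce trace on this slide can be reproduced with a small driver. The following is a minimal sketch, not the course's implementation: the tables GOTO and REDUCE are read off the automaton for A → aAb | ab, and the names lr0_parse, GOTO, REDUCE and TERMINALS are hypothetical. It assumes, as in the slide, that the stack holds the earlier states and the current state is kept separately.

```python
# Transitions of the LR(0) automaton for A -> aAb | ab: (state, symbol) -> state
GOTO = {(1, "a"): 2, (2, "a"): 2, (2, "A"): 3, (2, "b"): 5, (3, "b"): 4}
# States whose only valid item is complete, with the production to reduce by
REDUCE = {4: ("A", "aAb"), 5: ("A", "ab")}
TERMINALS = {"a", "b"}

def lr0_parse(text):
    """Return (accepted, trace); trace rows are (stack, state, action)."""
    stack, state, pos, trace = [], 1, 0, []
    while True:
        shown = "".join(map(str, stack))
        if state in REDUCE:
            head, body = REDUCE[state]
            trace.append((shown, state, "R"))
            path = stack + [state]
            del path[-len(body):]          # pop one state per body symbol
            if (path[-1], head) not in GOTO:
                # Back in state 1 with the start variable A: accept only
                # if the whole input has been read.
                return pos == len(text), trace
            stack, state = path, GOTO[(path[-1], head)]
        elif (pos < len(text) and text[pos] in TERMINALS
              and (state, text[pos]) in GOTO):
            trace.append((shown, state, "S"))
            stack, state, pos = stack + [state], GOTO[(state, text[pos])], pos + 1
        else:
            return False, trace            # no action applies: reject

accepted, trace = lr0_parse("aabb")
for row in trace:
    print(row)
```

On the input aabb the rows printed are the six rows of the slide's table, with the empty stack shown as an empty string.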

Meaning of LR(0) items
In the item A → α•Xβ the dot marks the focus: the parser is about to look at X, and β is the undiscovered part.
NFA transitions from A → α•Xβ go to:
  X → •γ: shift focus to the subtree rooted at X (if X is a nonterminal)
  A → αX•β: move past the subtree rooted at X

Outline of LR(0) parsing algorithm
An LR(0) parser has two kinds of actions:
  shift (S): no complete item is valid
  reduce (R): there is one valid item, and it is complete
What if:
  some valid items are complete, some not: S/R conflict
  more than one valid complete item: R/R conflict
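The case analysis above can be written as a short check on a set of valid items. This is an illustrative sketch only: items are assumed to be encoded as (head, body, dot position) tuples, and the function name classify is hypothetical.

```python
def classify(items):
    """Return ["S"], ["R"], or the list of conflicts for a set of LR(0) items.

    Each item is (head, body, dot); it is complete when the dot is at the end.
    """
    complete = [it for it in items if it[2] == len(it[1])]
    incomplete = [it for it in items if it[2] < len(it[1])]
    conflicts = []
    if complete and incomplete:
        conflicts.append("S/R")   # some valid items complete, some not
    if len(complete) > 1:
        conflicts.append("R/R")   # more than one valid complete item
    if conflicts:
        return conflicts
    return ["R"] if complete else ["S"]

# Items valid after reading "a" with the grammar A -> aAb | ab
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
# Items valid after reading "a" with S -> A | Bc, A -> aA | a, B -> a | ab
print(classify([("A", "aA", 1), ("A", "a", 1), ("B", "a", 1),
                ("B", "ab", 1), ("A", "aA", 0), ("A", "a", 0)]))
```

The first call reports a plain shift; the second reports both kinds of conflict, which is why that second grammar is not LR(0).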

Hierarchy of context-free grammars
[Figure: nested classes, from largest to smallest]
  context-free grammars: CYK algorithm (slow)
  LR(1) grammars: allow some conflicts; conflicts can be resolved by lookahead
  LR(0) grammars: LR(0) parsing algorithm

A CFG that is not LR(0)
S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)
input: (nothing read yet)
valid LR(0) items: S → •A, S → •Bc, A → •aA, A → •a, B → •a, B → •ab
Next step: read a and update the valid LR(0) items.

A CFG that is not LR(0)
S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)
input: a
valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
S/R, R/R conflicts! Possible actions: R(4), R(5), S(6)
[Figure: peek inside the possible parse trees, one with S over A and one with S over B c]

Lookahead
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
input read so far: a
peek inside: the next symbol is a
valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
[Figure: the parse tree must look like this: S over A, with A expanding to a A]
action: shift

Lookahead
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
input read so far: a a
peek inside: the next symbol is a
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
[Figure: the parse tree must look like this: S over A, with A expanding to a A and the inner A still growing]
action: shift

Lookahead
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
input read so far: a a a
peek inside: the next symbol is ε (end of input)
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
[Figure: the parse tree must look like this: S over A, with the last A expanding to a]
action: reduce

LR(0) items vs. LR(1) items
Grammar: A → aAb | ab
LR(0) item: A → a•Ab
LR(1) item: [A → a•Ab, b]
[Figure: partial parse trees for the two kinds of items]

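LR(1) items
[A → α•β, x]: the subtree rooted at A is followed by the terminal x
[A → α•β, ε]: the subtree rooted at A is followed by the end of the input
[Figure: two partial parse trees with root A and children α • β, the first followed by x]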

Generating an LR(1) parser
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
Pipeline: CFG → NFA (states are LR(1) items) → DFA + stack (may have S/R, R/R conflicts)
A CFG is LR(1) if conflicts can always be resolved with one symbol of lookahead.

NFA for LR(0) parsing
Notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
Transitions:
  q0 --ε--> S → •α, for every LR(0) item S → •α
  A → α•Xβ --X--> A → αX•β, for every LR(0) item A → α•Xβ
  A → α•Cβ --ε--> C → •δ, for every pair of LR(0) items A → α•Cβ, C → •δ

NFA for LR(1) parsing
Notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
Transitions:
  q0 --ε--> [S → •α, ε], for every item S → •α
  [A → α•Xβ, x] --X--> [A → αX•β, x], for every LR(1) item [A → α•Xβ, x]
  [A → α•Cβ, x] --ε--> [C → •δ, y], for every LR(1) item [A → α•Cβ, x], every production C → δ, and every y in FIRST(βx)

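Explaining the transitions
[A → α•Xβ, x] --X--> [A → αX•β, x]
[Figure: parse tree for A with children α, X, β, followed by x; the dot moves past X and the lookahead x is unchanged]
[A → α•Cβ, x] --ε--> [C → •δ, y], where y ∈ FIRST(βx)
[Figure: parse tree for A with children α, C, β, followed by x; the subtree C → δ is followed by y]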

FIRST sets
FIRST(γ) are all leftmost terminals in derivations γ ⇒ ...
They are used in the transitions [A → α•Cβ, x] --ε--> [C → •δ, y], one for every y in FIRST(βx).
Example grammar: S → A (1) | cB (2), A → aA (3) | a (4), B → a (5) | ab (6)
  γ     FIRST(γ)
  a     {a}
  A     {a}
  S     {a, c}
  cA    {c}
  BA    {a}
  ε     ∅
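The FIRST table for the example grammar on this slide can be checked mechanically. The sketch below is a simplified fixed-point computation that assumes the grammar has no ε-productions (true for this example), so FIRST of a string depends only on its first symbol. The names GRAMMAR, first_sets and first_of are hypothetical; variables are the dictionary keys and every other character is a terminal.

```python
# Example grammar from this slide: S -> A | cB, A -> aA | a, B -> a | ab
GRAMMAR = {"S": ["A", "cB"], "A": ["aA", "a"], "B": ["a", "ab"]}

def first_sets(grammar):
    """FIRST for every variable; assumes no production has an empty body."""
    first = {variable: set() for variable in grammar}
    changed = True
    while changed:                      # repeat until nothing new is added
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def first_of(gamma, first):
    """FIRST of a string of terminals and variables (empty string gives the empty set)."""
    if not gamma:
        return set()
    x = gamma[0]
    return set(first[x]) if x in first else {x}

FIRST = first_sets(GRAMMAR)
for gamma in ["a", "A", "S", "cA", "BA", ""]:
    print(repr(gamma), sorted(first_of(gamma, FIRST)))
```

A grammar with ε-productions would need the usual extra bookkeeping for nullable variables, which this sketch deliberately leaves out.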

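Example: Constructing the NFA
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
First transitions:
  q0 --ε--> [S → •A, ε]
  q0 --ε--> [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε]
  [S → •A, ε] --ε--> [A → •a, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → •Bc, ε] --ε--> [B → •a, c]
  [S → •Bc, ε] --ε--> [B → •ab, c]
  ...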

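Example: Constructing the NFA
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
Complete NFA:
  q0 --ε--> [S → •A, ε]
  q0 --ε--> [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε]
  [S → •A, ε] --ε--> [A → •a, ε]
  [A → •aA, ε] --a--> [A → a•A, ε]
  [A → a•A, ε] --A--> [A → aA•, ε]
  [A → a•A, ε] --ε--> [A → •aA, ε]
  [A → a•A, ε] --ε--> [A → •a, ε]
  [A → •a, ε] --a--> [A → a•, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → B•c, ε] --c--> [S → Bc•, ε]
  [S → •Bc, ε] --ε--> [B → •a, c]
  [S → •Bc, ε] --ε--> [B → •ab, c]
  [B → •a, c] --a--> [B → a•, c]
  [B → •ab, c] --a--> [B → a•b, c]
  [B → a•b, c] --b--> [B → ab•, c]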

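Example: Convert NFA to DFA
S → A | Bc, A → aA | a, B → a | ab
DFA states:
  state 1: [S → •A, ε], [S → •Bc, ε], [A → •aA, ε], [A → •a, ε], [B → •a, c], [B → •ab, c]
  state 2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [B → a•b, c], [A → a•, ε], [B → a•, c]
  state 3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]
  state 4: [A → aA•, ε]
  state 5: [S → A•, ε]
  state 6: [S → B•c, ε]
  state 7: [S → Bc•, ε]
  state 8: [B → ab•, c]
Transitions:
  shift terminal: 1 --a--> 2, 2 --a--> 3, 3 --a--> 3, 2 --b--> 8, 6 --c--> 7
  shift variable: 1 --A--> 5, 1 --B--> 6, 2 --A--> 4, 3 --A--> 4
  reduce: states 2, 3, 4, 5, 7, 8 contain complete items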
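The DFA can also be generated directly, folding the ε-transitions of the item NFA into a closure step and the symbol transitions into a goto step. This is a sketch under stated assumptions, not the lecture's code: the marker "$" stands in for the slides' ε lookahead, the grammar is assumed to have no ε-productions, and the names closure, goto, build_dfa and first are hypothetical. The state numbers it produces need not match the numbering on the slide.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = "$"   # assumption: plays the role of the lookahead written ε on the slides

def first(symbols, lookahead):
    """FIRST(βx) for a grammar without ε-productions."""
    if not symbols:
        return {lookahead}
    todo, seen, out = [symbols[0]], set(), set()
    while todo:
        s = todo.pop()
        if s not in GRAMMAR:
            out.add(s)                                   # terminal
        elif s not in seen:
            seen.add(s)
            todo.extend(body[0] for body in GRAMMAR[s])  # expand variable
    return out

def closure(items):
    """Follow ε-transitions: [A -> α•Cβ, x] to [C -> •δ, y], y in FIRST(βx)."""
    items = set(items)
    todo = list(items)
    while todo:
        head, body, dot, x = todo.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for y in first(body[dot + 1:], x):
                for delta in GRAMMAR[body[dot]]:
                    new = (body[dot], delta, 0, y)
                    if new not in items:
                        items.add(new)
                        todo.append(new)
    return frozenset(items)

def goto(state, symbol):
    """Move the dot past `symbol` in every item that allows it."""
    return closure({(h, b, d + 1, x) for (h, b, d, x) in state
                    if d < len(b) and b[d] == symbol})

def build_dfa():
    start = closure({("S", body, 0, END) for body in GRAMMAR["S"]})
    states, edges, todo = [start], {}, [start]
    while todo:
        state = todo.pop()
        for symbol in sorted({b[d] for (h, b, d, x) in state if d < len(b)}):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                todo.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges

states, edges = build_dfa()
print(len(states), "states,", len(edges), "transitions")
```

For this grammar the construction yields the same eight item sets and nine transitions as the slide.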

Example: Resolving conflicts by lookahead
S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
State 2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [B → a•b, c], [A → a•, ε], [B → a•, c]
  next  action
  a     shift
  b     shift
  c     reduce B → a
  ε     reduce A → a
State 3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]
  next  action
  a     shift
  b     error
  c     error
  ε     reduce A → a
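The two tables follow from a simple rule: shift if some item has the next symbol right after its dot, and reduce by a complete item only if its lookahead equals the next symbol. A minimal sketch of that rule follows; items are assumed to be (head, body, dot, lookahead) tuples, "$" stands in for the ε lookahead, and the name actions is hypothetical.

```python
END = "$"   # assumption: stands for the lookahead written ε on the slides

STATE_2 = [("A", "aA", 1, END), ("A", "aA", 0, END), ("A", "a", 0, END),
           ("B", "ab", 1, "c"), ("A", "a", 1, END), ("B", "a", 1, "c")]
STATE_3 = [("A", "aA", 1, END), ("A", "aA", 0, END), ("A", "a", 0, END),
           ("A", "a", 1, END)]

def actions(items, next_symbol):
    """Set of actions allowed in a DFA state when `next_symbol` is next."""
    found = set()
    for head, body, dot, lookahead in items:
        if dot == len(body):
            if lookahead == next_symbol:         # complete item, lookahead matches
                found.add(f"reduce {head} -> {body}")
        elif body[dot] == next_symbol:           # terminal right after the dot
            found.add("shift")
    return found or {"error"}

for name, state in [("state 2", STATE_2), ("state 3", STATE_3)]:
    for symbol in ["a", "b", "c", END]:
        print(name, symbol, sorted(actions(state, symbol)))
```

A conflict that lookahead fails to resolve would show up as a returned set with more than one action; for states 2 and 3 every set has exactly one element.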

Example: Reconstruct the parse tree
Parsing the input abc (S = shift, R = reduce):
  stack    state  action
  (empty)  1      S
  1        2      S
  12       8      R
  1        6      S
  16       7      R
[Figure: the DFA with states 1 to 8, and the resulting parse tree: S expands to B c, and B expands to a b]
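The trace above can be replayed with a table-driven LR(1) driver. The sketch below hard-codes the DFA transitions and the lookahead-dependent reductions for S → A | Bc, A → aA | a, B → a | ab, using the state numbers 1 to 8 from the slides. The names EDGES, REDUCE and lr1_parse are hypothetical, and "$" again stands in for the ε lookahead.

```python
END = "$"            # assumption: end-of-input marker for the ε lookahead
TERMINALS = {"a", "b", "c"}
# DFA transitions: (state, symbol) -> state
EDGES = {(1, "a"): 2, (1, "A"): 5, (1, "B"): 6, (2, "a"): 3, (2, "b"): 8,
         (2, "A"): 4, (3, "a"): 3, (3, "A"): 4, (6, "c"): 7}
# Reductions: (state, lookahead) -> (head, body)
REDUCE = {(2, "c"): ("B", "a"), (2, END): ("A", "a"), (3, END): ("A", "a"),
          (4, END): ("A", "aA"), (5, END): ("S", "A"), (7, END): ("S", "Bc"),
          (8, "c"): ("B", "ab")}

def lr1_parse(text):
    """Return (accepted, trace); trace rows are (stack, state, action)."""
    stack, state, pos, trace = [], 1, 0, []
    while True:
        look = text[pos] if pos < len(text) else END
        shown = "".join(map(str, stack))
        if (state, look) in REDUCE:
            head, body = REDUCE[(state, look)]
            trace.append((shown, state, "R"))
            if head == "S":                   # start variable reduced at end of input
                return True, trace
            path = stack + [state]
            del path[-len(body):]             # pop one state per body symbol
            stack, state = path, EDGES[(path[-1], head)]
        elif look in TERMINALS and (state, look) in EDGES:
            trace.append((shown, state, "S"))
            stack, state, pos = stack + [state], EDGES[(state, look)], pos + 1
        else:
            return False, trace               # error entry: reject

accepted, trace = lr1_parse("abc")
for row in trace:
    print(row)
```

On abc the driver prints the five rows of the slide's table. Strings such as aaa and ac are also accepted, while ab is rejected in state 8 because the only lookahead allowed there is c.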