LR(1) grammars The Chinese University of Hong Kong Fall 2011


CSCI 3130: Formal languages and automata theory
LR(1) grammars
Andrej Bogdanov, The Chinese University of Hong Kong, Fall 2011
http://www.cse.cuhk.edu.hk/~andrejb/csc3130

LR(0) parsing review
Motivation: fast parsing for programming languages.
A parser generator takes a CFG G and produces a "PDA" for parsing G, or reports an error if G is not LR(0).
Example: A → aAb | ab. The parsing automaton has five states:
  1: A → •aAb, A → •ab
  2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  3: A → aA•b
  4: A → aAb•
  5: A → ab•

Parsing computer programs
  if (n == 0) { return x; } else { return x + 1; }
Parse tree (figure): Statement expands to if ParExpression Statement else Statement, where ParExpression is ( Expression ) and each inner Statement is a Block ...
CFGs of programming languages are not LR(0).

LR(0) parsing review
Grammar: A → aAb | ab
States:
  1: A → •aAb, A → •ab
  2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  3: A → aA•b
  4: A → aAb•
  5: A → ab•
Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4
Parsing the input aabb (S = shift, R = reduce):
  stack    state  action
  (empty)  1      S
  1        2      S
  12       2      S
  122      5      R
  12       3      S
  123      4      R
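The trace above can be replayed mechanically. The following is a minimal sketch, assuming the transition table is transcribed by hand from the slide; the names `GOTO`, `REDUCE` and `lr0_trace` are illustrative and not part of the lecture.

```python
# LR(0) automaton for A -> aAb | ab, transcribed by hand (state numbers as on the slide).
GOTO = {(1, "a"): 2, (2, "a"): 2, (2, "A"): 3, (2, "b"): 5, (3, "b"): 4}
# States whose only valid item is complete, with the production to reduce by.
REDUCE = {4: ("A", "aAb"), 5: ("A", "ab")}

def lr0_trace(word):
    """Return the (stack, state, action) rows of the parse, or None if word is rejected."""
    path, pos, rows = [1], 0, []          # path = stack of earlier states + current state
    while True:
        state = path[-1]
        stack = "".join(map(str, path[:-1]))
        if state in REDUCE:
            rows.append((stack, state, "R"))
            lhs, rhs = REDUCE[state]
            del path[-len(rhs):]          # pop one state per right-hand-side symbol
            if (path[-1], lhs) in GOTO:
                path.append(GOTO[path[-1], lhs])
            else:
                # Back in the start state holding the start variable:
                # accept only if the whole input has been read.
                return rows if path == [1] and pos == len(word) else None
        elif pos < len(word) and (state, word[pos]) in GOTO:
            rows.append((stack, state, "S"))
            path.append(GOTO[state, word[pos]])
            pos += 1
        else:
            return None

for row in lr0_trace("aabb"):
    print(row)
```

For the input aabb the rows come out in the same order as the table on the slide; strings outside the language, such as aab, yield None.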

Meaning of LR(0) items
In the item A → α•Xβ the dot marks the focus; the part to the right of the dot is still undiscovered.
PDA transitions:
  A → α•Xβ  --X-->  A → αX•β    move past the subtree rooted at X
  A → α•Xβ  --ε-->  X → •γ      shift focus to the subtree rooted at X

Outline of LR(0) parsing algorithm
An LR(0) parser has two kinds of actions: shift (S) and reduce (R). What if:
  no complete item is valid                      shift (S)
  there is one valid item, and it is complete    reduce (R)
  some valid items are complete, some are not    S/R conflict
  more than one valid complete item              R/R conflict
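The four cases can be written as a small decision function. This is only a sketch: the encoding of an item as a tuple `(lhs, rhs, dot)` and the name `classify` are assumptions made for illustration, and the set of valid items is assumed to be non-empty.

```python
def classify(items):
    """Decide the LR(0) action for a non-empty set of valid items.

    Each item is (lhs, rhs, dot); it is complete when the dot is at the end of rhs.
    Returns a list because a state can have both kinds of conflict at once.
    """
    items = set(items)
    complete = {it for it in items if it[2] == len(it[1])}
    if not complete:
        return ["shift"]
    if len(items) == 1:
        return ["reduce"]
    conflicts = []
    if complete != items:                 # some items complete, some not
        conflicts.append("S/R conflict")
    if len(complete) > 1:                 # more than one complete item
        conflicts.append("R/R conflict")
    return conflicts

# State reached after reading a in the grammar A -> aAb | ab: nothing is complete.
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
# State holding only A -> ab. : a single complete item.
print(classify([("A", "ab", 2)]))
```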

Hierarchy of context-free grammars
LR(0) grammars sit inside the LR(1) grammars, which sit inside the context-free grammars.
LR(1) grammars allow some conflicts, as long as the conflicts can be resolved by lookahead.

A CFG that is not LR(0)
Grammar: S → A | Bc, A → aA | a, B → a | ab
input: a
valid LR(0) items at the start, before a is read:
  S → •A, S → •Bc, A → •aA, A → •a, B → •a, B → •ab

A CFG that is not LR(0)
Grammar: S → A | Bc, A → aA | a, B → a | ab
input: a
valid LR(0) items after reading a:
  A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
S/R, R/R conflicts!
Possible parse trees (figure): the a that was just read may sit under A or under B, with c still to come in the second case. Idea: peek inside the rest of the input.
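The sets of valid items on these slides can be recomputed with the usual closure and advance-the-dot steps. A minimal sketch follows, assuming single-character symbols (uppercase for variables) and a prefix that is shifted without any reductions in between; `closure`, `advance` and `valid_items` are illustrative names.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}

def closure(items):
    """Add X -> .gamma for every variable X that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], prod, 0)
                if new not in items:
                    items.add(new)
                    work.append(new)
    return items

def advance(items, symbol):
    """Move the dot past symbol in every item that allows it, then close."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved)

def valid_items(prefix, start="S"):
    """Valid LR(0) items after shifting the symbols of prefix one by one."""
    items = closure({(start, rhs, 0) for rhs in GRAMMAR[start]})
    for symbol in prefix:
        items = advance(items, symbol)
    return items

after_a = valid_items("a")
complete = sorted(it for it in after_a if it[2] == len(it[1]))
print(len(after_a), "valid items;", "complete:", complete)
```

After reading a, two of the six valid items are complete (A → a• and B → a•) while others are not, which is exactly the S/R and R/R situation described above.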

Lookahead
Grammar: S → A | Bc, A → aA | a, B → a | ab
input read so far: a        next symbol (peek inside): a
valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
action: shift

Lookahead
Grammar: S → A | Bc, A → aA | a, B → a | ab
input read so far: a a      next symbol (peek inside): a
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
S/R conflict
action: shift

Lookahead
Grammar: S → A | Bc, A → aA | a, B → a | ab
input read so far: a a a    next symbol: ε (end of input)
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
action: reduce

LR(0) items vs. LR(1) items
Grammar: A → aAb | ab
  LR(0) item: A → a•Ab
  LR(1) item: [A → a•Ab, b]

LR(1) items
  [A → α•β, x]   the subtree rooted at A is followed by the terminal x
  [A → α•β, ε]   the subtree rooted at A is followed by the end of the input

Generating an LR(1) parser
Grammar: S → A | Bc, A → aA | a, B → a | ab
  NFA: its states are LR(1) items.
  DFA with stack: may have S/R and R/R conflicts.
  In an LR(1) CFG, conflicts can always be resolved with one symbol of lookahead.

NFA for LR(0) parsing
Notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
Transitions:
  q0        --ε-->  S → •α      for every LR(0) item S → •α
  A → α•Xβ  --X-->  A → αX•β    for every LR(0) item A → α•Xβ
  A → α•Cβ  --ε-->  C → •δ      for every pair of LR(0) items A → α•Cβ, C → •δ

NFA for LR(1) parsing
Notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
Transitions:
  q0             --ε-->  [S → •α, ε]      for every item S → •α
  [A → α•Xβ, x]  --X-->  [A → αX•β, x]    for every LR(1) item [A → α•Xβ, x]
  [A → α•Cβ, x]  --ε-->  [C → •δ, y]      for every LR(1) item [A → α•Cβ, x], production C → δ, and every y in FIRST(βx)
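The transition rules translate almost line by line into code. The sketch below makes several assumptions that are not in the slides: symbols are single characters, no variable derives the empty string (true for the example grammar), the lookahead ε is encoded as the empty string `END`, and an ε-transition carries the label `None`. The helper names are illustrative.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = ""  # stands for the lookahead ε (end of input) used on the slides

def first(gamma, x):
    """FIRST(gamma x), assuming no variable derives the empty string."""
    if not gamma:
        return {x}
    if gamma[0] not in GRAMMAR:
        return {gamma[0]}
    out, todo, seen = set(), [gamma[0]], set()
    while todo:                      # walk the "can start with" relation
        var = todo.pop()
        if var in seen:
            continue
        seen.add(var)
        for rhs in GRAMMAR[var]:
            if rhs[0] in GRAMMAR:
                todo.append(rhs[0])
            else:
                out.add(rhs[0])
    return out

def nfa_moves(item):
    """Yield (label, target) pairs for one LR(1) item; label None is an ε-move."""
    lhs, rhs, dot, x = item
    if dot == len(rhs):
        return
    sym = rhs[dot]
    yield sym, (lhs, rhs, dot + 1, x)            # move past X, lookahead unchanged
    if sym in GRAMMAR:                           # shift focus to the subtree below C
        for y in sorted(first(rhs[dot + 1:], x)):
            for delta in GRAMMAR[sym]:
                yield None, (sym, delta, 0, y)

def reachable_items(start="S"):
    """All LR(1) items reachable from q0, i.e. the states of the NFA."""
    todo = [(start, rhs, 0, END) for rhs in GRAMMAR[start]]
    seen = set(todo)
    while todo:
        for _, target in nfa_moves(todo.pop()):
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen

for label, target in nfa_moves(("S", "Bc", 0, END)):
    print(label, target)
```

From [S → •Bc, ε] the ε-moves lead to B-items with lookahead c, because FIRST(cε) = {c}.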

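Explaining the transitions
  [A → α•Xβ, x]  --X-->  [A → αX•β, x]
    Moving past X stays inside the same subtree rooted at A, so the lookahead x is unchanged.
  [A → α•Cβ, x]  --ε-->  [C → •δ, y]
    The subtree rooted at C is followed by β and then x, so y ∈ FIRST(βx).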

FIRST sets
FIRST(γ) is the set of all leftmost terminals in derivations γ ⇒ ...
It is used in the transition [A → α•Cβ, x] --ε--> [C → •δ, y], for every y in FIRST(βx).
Example: S → A | cB, A → aA | a, B → a | ab
  γ     FIRST(γ)
  a     {a}
  A     {a}
  S     {a, c}
  cA    {c}
  BA    {a}
  ε     ∅
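The table can be checked with a small fixed-point computation. This is a sketch under the assumption that no production has an empty right-hand side, which holds for the example grammar; `first_sets` and `first` are illustrative names.

```python
GRAMMAR = {"S": ["A", "cB"], "A": ["aA", "a"], "B": ["a", "ab"]}

def first_sets(grammar):
    """FIRST of every variable, assuming no production has an empty right-hand side."""
    first = {var: set() for var in grammar}
    changed = True
    while changed:                       # repeat until no set grows any more
        changed = False
        for var, productions in grammar.items():
            for rhs in productions:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[var]:
                    first[var] |= new
                    changed = True
    return first

def first(gamma, grammar=GRAMMAR):
    """FIRST of a string of symbols; the empty string has no leftmost terminal."""
    if not gamma:
        return set()
    head = gamma[0]
    return set(first_sets(grammar)[head]) if head in grammar else {head}

for gamma in ["a", "A", "S", "cA", "BA", ""]:
    print(repr(gamma), sorted(first(gamma)))
```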

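Example: Construct the NFA
Productions: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
First steps:
  q0           --ε-->  [S → •A, ε], [S → •Bc, ε]
  [S → •A, ε]  --A-->  [S → A•, ε]
  [S → •A, ε]  --ε-->  [A → •aA, ε], [A → •a, ε]
  [S → •Bc, ε] --B-->  [S → B•c, ε]
  [S → •Bc, ε] --ε-->  [B → •a, c], [B → •ab, c]
  . . .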

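Example: Construct the NFA
Productions: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
Complete NFA:
  q0            --ε-->  [S → •A, ε], [S → •Bc, ε]
  [S → •A, ε]   --A-->  [S → A•, ε]
  [S → •A, ε]   --ε-->  [A → •aA, ε], [A → •a, ε]
  [A → •aA, ε]  --a-->  [A → a•A, ε]
  [A → a•A, ε]  --A-->  [A → aA•, ε]
  [A → a•A, ε]  --ε-->  [A → •aA, ε], [A → •a, ε]
  [A → •a, ε]   --a-->  [A → a•, ε]
  [S → •Bc, ε]  --B-->  [S → B•c, ε]
  [S → B•c, ε]  --c-->  [S → Bc•, ε]
  [S → •Bc, ε]  --ε-->  [B → •a, c], [B → •ab, c]
  [B → •a, c]   --a-->  [B → a•, c]
  [B → •ab, c]  --a-->  [B → a•b, c]
  [B → a•b, c]  --b-->  [B → ab•, c]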

Example: Convert NFA to DFA
Grammar: S → A | Bc, A → aA | a, B → a | ab
Legend (figure): shift variable, shift terminal, reduce
DFA states:
  1: [S → •A, ε], [S → •Bc, ε], [A → •aA, ε], [A → •a, ε], [B → •a, c], [B → •ab, c]
  2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [B → a•b, c], [A → a•, ε], [B → a•, c]
  3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]
  4: [A → aA•, ε]
  5: [S → A•, ε]
  6: [S → B•c, ε]
  7: [S → Bc•, ε]
  8: [B → ab•, c]
Transitions: 1 --a--> 2, 1 --A--> 5, 1 --B--> 6, 2 --a--> 3, 2 --A--> 4, 2 --b--> 8, 3 --a--> 3, 3 --A--> 4, 6 --c--> 7
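The conversion is the ordinary subset construction. The sketch below assumes the NFA of the example is typed in by hand as (source, label, target) triples, with `None` for ε and `eps` for the lookahead ε; the state names and function names are illustrative. The numbering it produces follows the order of discovery and need not agree with the numbering on the slide.

```python
# NFA of the running example; item "S->.A,eps" stands for [S -> .A, ε].
NFA = [
    ("q0", None, "S->.A,eps"), ("q0", None, "S->.Bc,eps"),
    ("S->.A,eps", "A", "S->A.,eps"),
    ("S->.A,eps", None, "A->.aA,eps"), ("S->.A,eps", None, "A->.a,eps"),
    ("A->.aA,eps", "a", "A->a.A,eps"), ("A->a.A,eps", "A", "A->aA.,eps"),
    ("A->a.A,eps", None, "A->.aA,eps"), ("A->a.A,eps", None, "A->.a,eps"),
    ("A->.a,eps", "a", "A->a.,eps"),
    ("S->.Bc,eps", "B", "S->B.c,eps"), ("S->B.c,eps", "c", "S->Bc.,eps"),
    ("S->.Bc,eps", None, "B->.a,c"), ("S->.Bc,eps", None, "B->.ab,c"),
    ("B->.a,c", "a", "B->a.,c"),
    ("B->.ab,c", "a", "B->a.b,c"), ("B->a.b,c", "b", "B->ab.,c"),
]

def eps_closure(states):
    """All NFA states reachable from states through ε-transitions only."""
    out, stack = set(states), list(states)
    while stack:
        cur = stack.pop()
        for src, label, dst in NFA:
            if src == cur and label is None and dst not in out:
                out.add(dst)
                stack.append(dst)
    return frozenset(out)

def move(states, symbol):
    """Follow symbol from every NFA state in the set, then take the ε-closure."""
    return eps_closure({dst for src, label, dst in NFA if src in states and label == symbol})

def subset_construction():
    start = eps_closure({"q0"}) - {"q0"}     # keep only the items, as on the slide
    states, todo, delta = [start], [start], {}
    while todo:
        cur = todo.pop(0)
        symbols = sorted({label for src, label, _ in NFA if src in cur and label is not None})
        for symbol in symbols:
            nxt = move(cur, symbol)
            if nxt not in states:
                states.append(nxt)
                todo.append(nxt)
            delta[states.index(cur) + 1, symbol] = states.index(nxt) + 1
    return states, delta

states, delta = subset_construction()
print(len(states), "DFA states,", len(delta), "transitions")
```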

Example: Resolve conflicts by lookahead
Productions: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
Legend (figure): shift variable, shift terminal, reduce
State 2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [B → a•b, c], [A → a•, ε], [B → a•, c]
  next  action
  a     shift
  b     shift
  c     reduce B
  ε     reduce A
State 3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]
  next  action
  a     shift
  b     error
  c     error
  ε     reduce A
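Both tables follow from one rule: shift on a terminal that appears right after a dot, reduce on the lookahead of a complete item, and report an error otherwise. A minimal sketch, assuming items are tuples `(lhs, rhs, dot, lookahead)` with the empty string standing for ε; `actions` is an illustrative name.

```python
def actions(items, terminals="abc"):
    """Map each next symbol ('' = end of input) to the action chosen by lookahead."""
    table = {}
    for nxt in list(terminals) + [""]:
        found = set()
        for lhs, rhs, dot, lookahead in items:
            if dot < len(rhs) and rhs[dot] == nxt:
                found.add("shift")                  # nxt can be read next
            elif dot == len(rhs) and lookahead == nxt:
                found.add("reduce " + lhs)          # complete item expecting nxt
        # More than one entry would mean the conflict survives one symbol of lookahead.
        table[nxt] = " / ".join(sorted(found)) if found else "error"
    return table

STATE_2 = [("A", "aA", 1, ""), ("A", "aA", 0, ""), ("A", "a", 0, ""),
           ("B", "ab", 1, "c"), ("A", "a", 1, ""), ("B", "a", 1, "c")]
STATE_3 = [("A", "aA", 1, ""), ("A", "aA", 0, ""), ("A", "a", 0, ""), ("A", "a", 1, "")]

print(actions(STATE_2))
print(actions(STATE_3))
```

Every entry for states 2 and 3 is a single action, so the conflicts of the LR(0) construction disappear once the next symbol is taken into account.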

Example: Reconstruct the parse tree
DFA states 1 to 8 as in the conversion step; in particular 6 is [S → B•c, ε], 7 is [S → Bc•, ε] and 8 is [B → ab•, c].
Parsing the input abc (S = shift, R = reduce):
  stack    state  action
  (empty)  1      S
  1        2      S
  12       8      R
  1        6      S
  16       7      R
Parse tree: S has children B and c; B has children a and b.
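The trace, together with the reductions that make up the parse tree, can be reproduced by a table-driven loop. In this sketch the tables are transcribed by hand from the DFA and the lookahead tables of the example; `GOTO`, `REDUCE` and `lr1_parse` are illustrative names, and the empty string stands for the lookahead ε.

```python
# Transitions of the example DFA (state numbers as on the slides).
GOTO = {(1, "a"): 2, (1, "A"): 5, (1, "B"): 6, (2, "a"): 3, (2, "A"): 4,
        (2, "b"): 8, (3, "a"): 3, (3, "A"): 4, (6, "c"): 7}
# REDUCE[state][lookahead] = production to reduce by; "" means end of input.
REDUCE = {2: {"": ("A", "a"), "c": ("B", "a")}, 3: {"": ("A", "a")},
          4: {"": ("A", "aA")}, 5: {"": ("S", "A")},
          7: {"": ("S", "Bc")}, 8: {"c": ("B", "ab")}}

def lr1_parse(word):
    """Return (trace rows, reductions in order) or None if word is rejected."""
    path, pos, rows, reductions = [1], 0, [], []
    while True:
        state = path[-1]
        look = word[pos] if pos < len(word) else ""
        stack = "".join(map(str, path[:-1]))
        if (state, look) in GOTO:                    # shift the next terminal
            rows.append((stack, state, "S"))
            path.append(GOTO[state, look])
            pos += 1
        elif look in REDUCE.get(state, {}):          # lookahead selects a reduction
            rows.append((stack, state, "R"))
            lhs, rhs = REDUCE[state][look]
            reductions.append((lhs, rhs))
            del path[-len(rhs):]                     # pop one state per symbol of rhs
            if lhs == "S":
                return rows, reductions              # start variable found: accept
            path.append(GOTO[path[-1], lhs])
        else:
            return None

rows, reductions = lr1_parse("abc")
for row in rows:
    print(row)
print(reductions)
```

Read from last to first, the reductions give the parse tree top down: S → Bc, then B → ab.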