Download presentation
Presentation is loading. Please wait.
Published byMartine Boivin Modified over 6 years ago
1
LR(0) grammars The Chinese University of Hong Kong Fall 2010
CSCI 3130: Automata theory and formal languages LR(0) grammars Andrej Bogdanov
2
Parsing computer programs
if (n == 0) { return x; } First the javac compiler does a lexical analysis: if (ID == INT_LIT) { return ID; } ID = identifier (name of variable, procedure, class, ...) INT_LIT = integer literal (value) The alphabet of java CFG consists of symbols like: S = {if, return, (, ) {, }, ;, ==, ID, INT_LIT, ...}
3
Parsing computer programs
if (n == 0) { return x; } Statement if ParExpression Statement ( Expression ) Expression ExpressionRest Infixop Expression Literal Primary Identifier Block { BlockStatements } BlockStatement return Expression ; == INT_LIT ID if (ID == INT_LIT) { return ID; } the parse tree of a java statement
4
CFG of the java programming language
Identifier: ID QualifiedIdentifier: Identifier { . Identifier } Literal: IntegerLiteral FloatingPointLiteral CharacterLiteral StringLiteral BooleanLiteral NullLiteral Expression: Expression1 [AssignmentOperator Expression1]] AssignmentOperator: = += -= *= /= &= |= … from /second_edition/html/syntax.doc.html#52996
5
Parsing java programs Simple java program: about 500 symbols …
class Point2d { /* The X and Y coordinates of the point--instance variables */ private double x; private double y; private boolean debug; // A trick to help with debugging public Point2d (double px, double py) { // Constructor x = px; y = py; debug = false; // turn off debugging } public Point2d () { // Default constructor this (0.0, 0.0); // Invokes 2 parameter Point2D constructor // Note that a this() invocation must be the BEGINNING of // statement body of constructor public Point2d (Point2d pt) { // Another consructor x = pt.getX(); y = pt.getY(); … Simple java program: about 500 symbols
6
Parsing algorithms How long would it take to parse this?
Can we parse faster? No! CYK is the fastest known general-purpose parsing algorithm exhaustive algorithm about 1080 years (longer than life of universe) CYK algorithm about 1 week!
7
Another way of thinking
Scientist: Find an algorithm that can parse strings in any grammar Engineer: Design your grammar so it has a very fast parsing algorithm
8
Parsing left to right input: abaabbc Try to match to the left of •
S Tc(1) T TA(2) | A(3) A aTb(4) | ab(5) S input: abaabbc T A T T A A • a • b • a • a • b • b • c • Try to match to the left of •
9
Items An item is a production augmented with a •
S Tc(1) T TA(2) T A(3) A aTb(4) A ab(5) S •Tc S T•c S Tc• T •TA T T•A T TA• T •A T A• A •aTb A a•Tb A aT•b A aTb• A •ab A a•b A ab• An item is a production augmented with a • The item is complete if the • is the last symbol
10
Meaning of items Items represent possibilities
S Tc(1) T TA(2) | A(3) A aTb(4) | ab(5) A a•Ab A a•b a b c • a b A c T • A aT•b Items represent possibilities at various stages of the parsing process
11
Meaning of items When a complete item occurs,
S Tc(1) T TA(2) | A(3) A aTb(4) | ab(5) A a•Ab A a•b a b c • A a b A c T • a b A c T • A aT•b A aTb• When a complete item occurs, a part of the parse tree is discovered
12
LR(0) parsing A aAb | ab a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree valid items registry A •aAb A •ab a b •
13
LR(0) parsing A aAb | ab a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree valid items registry A a•Ab A a•b a b • A •aAb A •ab
14
LR(0) parsing A aAb | ab a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree A a•Ab valid items registry A a•Ab A a•b a b • A a•Ab A a•b
15
LR(0) parsing A aAb | ab a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree A a•Ab valid items registry A a•Ab A a•b a b • A •aAb A •ab
16
LR(0) parsing A aAb | ab a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree A a•Ab A a•Ab valid items registry A a•Ab A ab• a b • A •aAb A •ab
17
LR(0) parsing A aAb | ab A a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree A a•Ab valid items registry A A aA•b A ab• a b •
18
LR(0) parsing A aAb | ab A A a b Move from left to right
Keep track of all possible valid items Prune the invalid items When a complete item occurs, build part of parse tree A valid items registry A A aAb• a b •
19
LR(0) two kinds of actions
valid items registry A a•Ab A a•b A •aAb A •ab valid items registry A ab• A a b a b • • • no complete item in registry exactly one complete item shift reduce
20
LR(0) implementation: first take
stack valid items action A aAb | ab A •aAb A •ab S A a•Ab A a•b A •aAb A •ab a S A a•Ab A a•b A •aAb A •ab aa S aab A ab• R A aA A aA•b S A aAb R A aAb• a b • • • • • A
21
valid item update rules
a, b: terminals A, B, C: variables a, b, d: mixed strings notation initial valid items: S •a read a shift updates: A a•ab A aa•b read b A a•ab disappear read x A a•Bb disappear reduce B A a•Bb A aB•b reduce updates: reduce C A a•Bb disappear common updates: A a•Bb B •d
22
valid item update rules
a, b: terminals A, B, C: variables a, b, d: mixed strings X: terminal or variable notation initial valid items: S •a A a•ab A aa•b read a shift updates: A a•Xb A aX•b X A a•Bb A aB•b reduce B reduce updates: common updates: A a•Bb B •d
23
LR(0) parsing: NFA representation
a, b: terminals A, B, C: variables a, b, d: mixed strings X: terminal or variable notation e q0 S •a For every item S •a X A •X A X• For every item A •X e A •C C •d For every pair of items A •C, C •d
24
NFA example A aAb | ab a A b A •aAb A a•Ab A aA•b A aAb•
q0 A •ab A a•b A ab• a b A aAb | ab NFA alphabet is S = {a, b, A} start state is q0 other states are items
25
NFA to DFA conversion a A b A •aAb A a•Ab A aA•b A aAb• q0
qdie A ab• qdie b, A b
26
Shift states and reduce states
2 3 4 A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• b 1 2 3 are shift states 4 5 are reduce states
27
LR(0) parsing: take 2 A aAb | ab a b a A b A a•Ab A a•b A •aAb
3 4 A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• b stack state action 1 S A aAb | ab a 2 S ? How do we know what to reduce to? aa 2 S aab 5 R a b • • • •
28
remember state in stack!
2 3 4 A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• backtrack two steps b stack state action 1 S A aAb | ab 1a 2 S 1a2a 2 S A 1a2a2b 5 R 1a2A a b • • • •
29
remember state in stack!
2 3 4 A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• b stack state action 1 S A aAb | ab 1a 2 S A 1a2a 2 S A 1a2a2b 5 R 1a2A 3 S a b • • 1a2A3b 4 R
30
PDA for LR(0) parsing A aAb | ab A A a b a A b A a•Ab A a•b
2 3 4 A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• b stack state action 1 S A aAb | ab 1 2 S A 12 2 S A 122 5 R 12 3 S a b • • • • • 123 4 R
31
PDA for LR(0) parsing •A a e, e/$ A b A a•Ab A a•b A •aAb
2 3 4 e, e/$ A b A a•Ab A a•b A •aAb A •ab A aA•b A aAb• 1 A •aAb A •ab a 5 A ab• A, $/e b A• pop stateb, statea pop stateb, stateA, statea take A transition out of statea take A transition out of statea push statea push statea
32
Example 1 L = {w#wR : w ∈ {a, b}*} A aAa | bAb | # a A a A •aAa e
q0 A •# A #• e e e b A b A •bAb e A b•Ab A bA•b A bAb•
33
Example 1 A aAa | bAb | e A 4 A A 4 b a # a b • • • • • •
stack state action e 1 S a, b 1 2 S 1 2 12 2 S A •aAa A a•Aa 122 6 R a, b A •bAb A b•Ab 122 3 S A •# A •aAa R A •bAb 12 3 S # # A •# 123 5 R 6 A #• A 3 a A aA•a A 4 4 A aAa• A bA•b A b A 5 4 A bAb• b a # a b • • • • • •
34
LR(0) grammars and deterministic PDAs
The PDA for LR(0) parsing is deterministic Some CFLs require non-deterministic PDAs, e.g. What goes wrong when we do LR(0) parsing on L? L = {wwR : w ∈ {a, b}*}
35
Example 2 L = {wwR : w ∈ {a, b}*} A aAa | bAb | e A •aAa A a•Aa
q0 e
36
Example 2 input: abba L = {wwR : w ∈ {a, b}*} A aAa | bAb | e
shift-reduce conflict A •bAb A • A a A aA•a 4 A aAa• A bA•b input: abba b 4 A bAb•
37
Parsing computer programs
if (n == 0) { return x; } else { return x + 1; } Statement if ParExpression Statement ( Expression ) Expression ExpressionRest Infixop Expression Literal Primary Identifier Block { BlockStatements } BlockStatement return Expression ; Statement == INT_LIT ID
38
Parsing computer programs
if (n == 0) { return x; } else { return x + 1; } Statement if ParExpression Statement else Statement ( Expression ) Block Block ... ... ... LR(0) parsers cannot tell apart if ... then from if ... then ... else ID
39
When you can’t LR(0) parse
Algorithm can perform two actions: What if: no complete item is valid there is one valid item, and it is complete shift (S) reduce (R) some valid items complete, some not more than one valid complete item S / R conflict R / R conflict
40
Hierarchy of context-free grammars
parse using CYK algorithm (slow) LR(∞) grammars … to be continued… java perl python … LR(1) grammars LR(0) grammars parse using LR(0) algorithm
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.