Published by Arthur Palmer. Modified over 8 years ago.
1
LR(0) grammars
CSCI 3130: Automata theory and formal languages, Fall 2011
Andrej Bogdanov, The Chinese University of Hong Kong
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
2
Parsing computer programs
First the javac compiler does a lexical analysis:
if (n == 0) { return x; }   becomes   if (ID == INT_LIT) { return ID; }
ID = identifier (name of variable, procedure, class, ...)
INT_LIT = integer literal (value)
The alphabet of the Java CFG consists of symbols like:
Σ = { if, return, (, ), {, }, ;, ==, ID, INT_LIT, ... }
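The lexical analysis step can be illustrated with a minimal sketch. This is not how javac is implemented; the regular expression, the `KEYWORDS` set and the function name `lex` are assumptions made for this example, and only the handful of symbols shown on the slide are recognized.

```python
import re

# Hypothetical token pattern: integer literal, word, or punctuation.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(==|[(){};=]))")
KEYWORDS = {"if", "return"}  # assumed subset of the Java keywords

def lex(source):
    """Turn Java-like source text into a list of grammar symbols."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        number, word, punct = m.groups()
        if number is not None:
            tokens.append("INT_LIT")          # integer literal
        elif word is not None:
            tokens.append(word if word in KEYWORDS else "ID")
        else:
            tokens.append(punct)              # punctuation stands for itself
        pos = m.end()
    return tokens

print(" ".join(lex("if (n == 0) { return x; }")))
# prints: if ( ID == INT_LIT ) { return ID ; }
```

The parser then works on this list of symbols rather than on the raw characters.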
3
Parsing computer programs
if (n == 0) { return x; }
if (ID == INT_LIT) { return ID; }
[Figure: the parse tree of a Java statement. The root Statement expands to if ParExpression Statement. ParExpression expands to ( Expression ), whose Expression derives ID == INT_LIT through ExpressionRest, Infixop, Primary, Identifier and Literal. The inner Statement is a Block { BlockStatements } whose BlockStatement is the Statement return Expression ; with Expression deriving ID.]
4
CFG of the Java programming language
Identifier:
    ID
QualifiedIdentifier:
    Identifier { . Identifier }
Literal:
    IntegerLiteral
    FloatingPointLiteral
    CharacterLiteral
    StringLiteral
    BooleanLiteral
    NullLiteral
Expression:
    Expression1 [AssignmentOperator Expression1]
AssignmentOperator:
    =
    +=
    -=
    *=
    /=
    &=
    |=
    …
from http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html#52996
5
Parsing Java programs
class Point2d {
    /* The X and Y coordinates of the point--instance variables */
    private double x;
    private double y;
    private boolean debug; // A trick to help with debugging

    public Point2d (double px, double py) { // Constructor
        x = px;
        y = py;
        debug = false; // turn off debugging
    }

    public Point2d () { // Default constructor
        this (0.0, 0.0); // Invokes 2 parameter Point2d constructor
    }

    // Note that a this() invocation must be the BEGINNING of
    // statement body of constructor
    public Point2d (Point2d pt) { // Another constructor
        x = pt.getX();
        y = pt.getY();
    }
    …
Simple Java program: about 500 symbols
6
Parsing algorithms
How long would it take to parse this program?
try all parse trees: about 10^80 years
CYK algorithm: about 1 week!
Can we parse faster?
No! CYK is the fastest known general-purpose parsing algorithm for CFGs
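For reference, here is a minimal sketch of a CYK recognizer, assuming the grammar is already in Chomsky normal form. The example grammar (S → AB | AT, T → SB, A → a, B → b, which generates aⁿbⁿ for n ≥ 1) and all names in the code are illustrative assumptions, not taken from the slides. The three nested loops over substring length, start position and split point are what make the running time cubic in the input length.

```python
def cyk(word, unit_rules, pair_rules, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unit_rules: dict terminal -> set of variables V with V -> terminal
    pair_rules: dict (B, C)   -> set of variables V with V -> B C
    """
    n = len(word)
    if n == 0:
        return False  # this sketch ignores the empty string
    # table[i][k] = set of variables deriving word[i : i + k + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unit_rules.get(ch, ()))
    for length in range(2, n + 1):              # substring length
        for i in range(n - length + 1):         # start position
            for split in range(1, length):      # length of the left part
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= pair_rules.get((b, c), set())
    return start in table[0][n - 1]

# Hypothetical CNF grammar for { a^n b^n : n >= 1 }.
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk("aabb", UNIT, PAIR))  # prints: True
```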
7
Another way of thinking
Scientist: Find an algorithm that can parse any CFG
Engineer: Design your CFG so it can be parsed very quickly
8
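Parsing left to right
S → Tc (1)
T → TA (2) | A (3)
A → aTb (4) | ab (5)
input: abaabbc
Try to match to the left of •
[Figure: the input abaabbc is read left to right; whenever the symbols to the left of • match the right-hand side of a production, a node is added, and the parse tree grows bottom-up until it reaches S.]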
9
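Items
An item is a production augmented with a •
The item is complete if the • is the last symbol
For example, A → a•Tb is an item that is not complete, while A → ab• is complete.
S → Tc (1)
T → TA (2)
T → A (3)
A → aTb (4)
A → ab (5)
[Figure: example items for productions (1) to (5)]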
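One possible way to represent items in code is a production plus the position of the •. The class name `Item`, its fields and the helper `all_items` are assumptions made for this sketch; only the grammar comes from the slide.

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    dot: int  # number of right-hand-side symbols to the left of the dot

    def is_complete(self):
        # Complete means the dot is the last symbol.
        return self.dot == len(self.rhs)

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} → {''.join(symbols)}"

# Productions (1) to (5) from the slide.
GRAMMAR = [("S", ("T", "c")), ("T", ("T", "A")), ("T", ("A",)),
           ("A", ("a", "T", "b")), ("A", ("a", "b"))]

def all_items(grammar):
    """Every production with the dot in every possible position."""
    return [Item(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]

for item in all_items(GRAMMAR):
    print(item, "(complete)" if item.is_complete() else "")
```

A production with k symbols on the right gives k + 1 items, so this grammar has 15 items, 5 of which are complete.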
10
Meaning of items
S → Tc (1)
T → TA (2) | A (3)
A → aTb (4) | ab (5)
Items represent possibilities at various stages of the parsing process
[Figure: a partially parsed input together with the items of A → aTb and A → ab that are still possible at the current position]
11
Meaning of items
S → Tc (1)
T → TA (2) | A (3)
A → aTb (4) | ab (5)
When a complete item occurs, a part of the parse tree is discovered
[Figure: once the item of A → ab becomes complete, the node A with children a and b is added to the parse tree]
12
LR(0) parsing
Move from left to right
Keep track of all possible valid items
Prune the invalid items
When a complete item occurs, build part of parse tree
grammar: A → aAb | ab
input: aabb
[Figure: valid items registry holding items of A → aAb and A → ab]
13
LR(0) parsing
[Figure: animation over slides 13 to 18. The same four steps are traced on input aabb for the grammar A → aAb | ab: after each symbol is read the valid items registry is updated, invalid items are pruned, and each complete item adds a node A to the parse tree.]
19
Two kinds of actions
shift: no complete item in registry
reduce: exactly one complete item
[Figure: two valid items registries for A → aAb | ab on input aabb, one with no complete item (shift) and one holding a single complete item of A → ab (reduce to A)]
20
LR(0) implementation: first take
A → aAb | ab
input: aabb
stack     action
(empty)   S
a         S
aa        S
aab       R
aA        S
aAb       R
A
[Figure: the valid items for each stack content]
21
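Valid item update rules
notation: a, b : terminals; A, B, C : variables; α, β, δ : mixed strings
initial valid items: S → •α
shift updates:
    A → α•aβ on read a becomes A → αa•β
    A → α•aβ on read b disappears
    A → α•Bβ on read x disappears
reduce updates:
    A → α•aβ disappears
    A → α•Bβ on reduce B becomes A → αB•β
    A → α•Bβ on reduce C disappears
common updates:
    whenever A → α•Bβ is valid, every B → •δ is also valid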
22
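Valid item update rules
notation: a, b : terminals; A, B, C : variables; α, β, δ : mixed strings; X : terminal or variable
initial valid items: S → •α
shift updates: A → α•aβ on read a becomes A → αa•β
reduce updates: A → α•Bβ on reduce B becomes A → αB•β
common updates: whenever A → α•Cβ is valid, every C → •δ is also valid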
23
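LR(0) parsing: NFA representation
notation: a, b : terminals; A, B, C : variables; α, β, δ : mixed strings; X : terminal or variable
For every item S → •α: an ε-transition from the start state q0 to S → •α
For every item A → α•Xβ: a transition labeled X from A → α•Xβ to A → αX•β
For every pair of items A → α•Cβ, C → •δ: an ε-transition from A → α•Cβ to C → •δ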
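The three rules above can be turned into a short construction. This is a sketch under stated assumptions: productions are (variable, string) pairs with one-character symbols, an item is a tuple (lhs, rhs, dot), the empty label "" stands for ε, and the function name `build_nfa` is made up for this example.

```python
def build_nfa(grammar, start):
    """Return (items, edges) of the item NFA; each edge is (source, label, target)."""
    items = [(lhs, rhs, dot)
             for lhs, rhs in grammar
             for dot in range(len(rhs) + 1)]
    edges = []
    for lhs, rhs, dot in items:
        # Rule 1: q0 --ε--> S → •α for every production of the start variable.
        if lhs == start and dot == 0:
            edges.append(("q0", "", (lhs, rhs, 0)))
        if dot < len(rhs):
            x = rhs[dot]
            # Rule 2: A → α•Xβ --X--> A → αX•β.
            edges.append(((lhs, rhs, dot), x, (lhs, rhs, dot + 1)))
            # Rule 3: A → α•Cβ --ε--> C → •δ for every production C → δ.
            for lhs2, rhs2 in grammar:
                if lhs2 == x:
                    edges.append(((lhs, rhs, dot), "", (lhs2, rhs2, 0)))
    return items, edges

GRAMMAR = [("A", "aAb"), ("A", "ab")]
items, edges = build_nfa(GRAMMAR, "A")
print(len(items), "items,", len(edges), "transitions")
# prints: 7 items, 9 transitions
```

For A → aAb | ab this gives the 7 items as states (plus q0), 5 transitions labeled a, b or A, and 4 ε-transitions.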
24
NFA example
A → aAb | ab
NFA alphabet is Σ = {a, b, A}
start state is q0
other states are items
[Figure: NFA with start state q0 and the items of A → aAb and A → ab as states, with transitions labeled a, A and b]
25
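NFA to DFA conversion
[Figure: the NFA for A → aAb | ab next to the equivalent DFA. Each DFA state is a set of items; transitions that have no target in the NFA, such as those labeled b, A or a, A in the figure, lead to the dead state q_die.]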
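A sketch of the conversion, written directly on sets of items: following ε-transitions corresponds to `closure`, and following a labeled transition corresponds to `goto`. The function names and the 0-based state numbering are assumptions of this example, so the numbers differ from the ones used on the slides, and the dead state q_die is represented implicitly by a missing transition.

```python
def closure(items, grammar):
    """Add C → •δ for every item with the dot in front of a variable C."""
    items = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            for lhs2, rhs2 in grammar:
                new = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new not in items:
                    items.add(new)
                    work.append(new)
    return frozenset(items)

def goto(state, symbol, grammar):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol}
    return closure(moved, grammar)

def build_dfa(grammar, start):
    initial = closure({(l, r, 0) for l, r in grammar if l == start}, grammar)
    states, transitions, work = [initial], {}, [initial]
    while work:
        state = work.pop()
        # Only symbols that appear right after a dot have a live target.
        for symbol in sorted({r[d] for _, r, d in state if d < len(r)}):
            target = goto(state, symbol, grammar)
            if target not in states:
                states.append(target)
                work.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions

states, transitions = build_dfa([("A", "aAb"), ("A", "ab")], "A")
print(len(states), "states,", len(transitions), "transitions")
# prints: 5 states, 5 transitions
```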
26
Shift states and reduce states
1, 2, 3 are shift states
4, 5 are reduce states
[Figure: the DFA for A → aAb | ab with its states numbered 1 to 5]
27
LR(0) parsing: second take
A → aAb | ab
input: aabb
stack     state   action
(empty)   1       S
a         2       S
aa        2       S
aab       5       R
?
How do we know what to reduce to?
[Figure: the DFA for A → aAb | ab with states 1 to 5]
28
remember state in stack!
A → aAb | ab
input: aabb
stack      action
1          S
1a2        S
1a2a2      S
1a2a2b5    R (reduce to A: backtrack two steps)
1a2A
[Figure: the DFA for A → aAb | ab with states 1 to 5]
29
remember state in stack!
A → aAb | ab
input: aabb
stack      action
1          S
1a2        S
1a2a2      S
1a2a2b5    R
1a2A3      S
1a2A3b4    R
A
[Figure: the DFA for A → aAb | ab with states 1 to 5]
30
PDA for LR(0) parsing
A → aAb | ab
input: aabb
stack   action
1       S
12      S
122     S
1225    R
123     S
1234    R
A
[Figure: the DFA for A → aAb | ab with states 1 to 5]
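The trace above can be reproduced with a small state-stack parser. This is a sketch, not the lecture's implementation: the transition table is hard-coded and inferred from the trace (1 on a goes to 2, 2 on a stays in 2, 2 on b goes to 5, 2 on A goes to 3, 3 on b goes to 4), and the names `TRANSITIONS`, `REDUCE` and `parse` are assumptions of this example.

```python
# DFA for A → aAb | ab, using the state numbers from the slides.
TRANSITIONS = {(1, "a"): 2, (2, "a"): 2, (2, "b"): 5, (2, "A"): 3, (3, "b"): 4}
# Reduce states: state -> (variable, length of the right-hand side).
REDUCE = {4: ("A", 3), 5: ("A", 2)}

def parse(word, start="A"):
    """Return the list of (stack, action) steps, or None if the word is rejected."""
    stack, pos, trace = [1], 0, []
    while True:
        state = stack[-1]
        if state in REDUCE:
            variable, length = REDUCE[state]
            trace.append(("".join(map(str, stack)), "R"))
            del stack[-length:]                 # backtrack `length` steps
            if stack == [1] and variable == start and pos == len(word):
                return trace                    # whole input reduced to the start variable
            target = TRANSITIONS.get((stack[-1], variable))
            if target is None:
                return None                     # no transition on the variable
            stack.append(target)
        else:
            if pos == len(word):
                return None                     # input ended in a shift state
            target = TRANSITIONS.get((state, word[pos]))
            if target is None:
                return None                     # the DFA would enter q_die
            trace.append(("".join(map(str, stack)), "S"))
            stack.append(target)
            pos += 1

for stack, action in parse("aabb"):
    print(stack, action)
```

Only states are kept on the stack, as in the PDA; the symbol that led to each state can be read off the DFA.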
31
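PDA for LR(0) parsing
In the reduce state for A → ab: pop the states pushed for b and a, take the transition labeled A out of the state now on top of the stack, and push the state it leads to
In the reduce state for A → aAb: pop the states pushed for b, A and a, take the transition labeled A out of the state now on top of the stack, and push the state it leads to
Start and end transitions: ε, ε/$ pushes the bottom-of-stack marker; A, $/ε pops it once the input has been reduced to A
[Figure: the DFA for A → aAb | ab with states 1 to 5, extended with these stack operations]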
32
Example 1
L = {w#w^R : w ∈ {a, b}*}
A → aAa | bAb | #
[Figure: the NFA for this grammar, with start state q0 and transitions labeled a, b, # and A]
33
Example 1
A → aAa | bAb | #
input: ba#ab
stack    action
1        S
14       S
143      S
1432     R
1435     S
14357    R
146      S
1468     R
A
[Figure: the DFA for this grammar with states numbered 1 to 8, and the parse tree built during the trace]
34
LR(0) grammars and deterministic PDAs
The PDA for LR(0) parsing is deterministic
Some CFLs require non-deterministic PDAs, e.g. L = {ww^R : w ∈ {a, b}*}
What goes wrong when we do LR(0) parsing on L?
35
Example 2
L = {ww^R : w ∈ {a, b}*}
A → aAa | bAb | ε
[Figure: the NFA for this grammar, with start state q0 and transitions labeled a, b and A]
36
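Example 2
L = {ww^R : w ∈ {a, b}*}
A → aAa | bAb | ε
input: abba
shift-reduce conflict
The item A → • is complete as soon as it is valid, but it always occurs together with items that are not complete, such as A → •aAa and A → •bAb.
[Figure: the DFA for this grammar; the states that contain A → • also contain items that are not complete]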
37
Parsing computer programs
if (n == 0) { return x; } else { return x + 1; }
[Figure: the parse tree of if (n == 0) { return x; } from before, with root Statement expanding to if ParExpression Statement, and the else branch still to be read]
38
Parsing computer programs
if (n == 0) { return x; } else { return x + 1; }
[Figure: the parse tree with root Statement expanding to if ParExpression Statement else Statement, where both inner Statements are Blocks]
LR(0) parsers cannot tell apart if ... then from if ... then ... else
39
When you can't LR(0) parse
LR(0) parser can perform two actions:
    no complete item is valid: shift (S)
    there is one valid item, and it is complete: reduce (R)
What if:
    some valid items complete, some not: S / R conflict
    more than one valid complete item: R / R conflict
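The two conflict conditions can be checked mechanically by building every set of valid items and inspecting it. The sketch below assumes productions are (variable, string) pairs with one-character symbols and the empty string standing for ε; the function names `lr0_states`, `conflicts` and `is_lr0` are made up for this example.

```python
def lr0_states(grammar, start):
    """All reachable sets of valid items (the DFA states)."""
    def closure(items):
        items = set(items)
        work = list(items)
        while work:
            _, rhs, dot = work.pop()
            for lhs2, rhs2 in grammar:
                new = (lhs2, rhs2, 0)
                if dot < len(rhs) and rhs[dot] == lhs2 and new not in items:
                    items.add(new)
                    work.append(new)
        return frozenset(items)

    states = [closure((l, r, 0) for l, r in grammar if l == start)]
    for state in states:  # the list grows while we iterate over it
        for symbol in sorted({r[d] for _, r, d in state if d < len(r)}):
            target = closure((l, r, d + 1) for l, r, d in state
                             if d < len(r) and r[d] == symbol)
            if target not in states:
                states.append(target)
    return states

def conflicts(state):
    """Apply the two 'What if' cases from the slide to one state."""
    complete = [i for i in state if i[2] == len(i[1])]
    incomplete = [i for i in state if i[2] < len(i[1])]
    found = []
    if complete and incomplete:
        found.append("S/R")   # some valid items complete, some not
    if len(complete) > 1:
        found.append("R/R")   # more than one valid complete item
    return found

def is_lr0(grammar, start):
    return not any(conflicts(s) for s in lr0_states(grammar, start))

print(is_lr0([("A", "aAb"), ("A", "ab")], "A"))              # prints: True
print(is_lr0([("A", "aAa"), ("A", "bAb"), ("A", "#")], "A"))  # prints: True
print(is_lr0([("A", "aAa"), ("A", "bAb"), ("A", "")], "A"))   # prints: False
```

The third grammar is Example 2: its start state already contains the complete item A → • next to A → •aAa, which is the shift-reduce conflict.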
40
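Hierarchy of context-free grammars
context-free grammars ⊇ LR(∞) grammars ⊇ … ⊇ LR(1) grammars ⊇ LR(0) grammars
LR(0) grammars: parse using LR(0) algorithm
[Figure: nested regions for the grammar classes, with java, perl, python, … marked inside the hierarchy]
to be continued…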