Published by Abraham Chandler. Modified over 6 years ago.
1
Presentation Outline
• Review of Lexical Analysis
• Introduction to Syntax Analysis
• Context Free Grammar
• Parsing
• Grammar Ambiguity
• Top Down Parser
• Bottom Up Parser
2
Introduction to Syntax Analysis
Every programming language has precise rules that prescribe the syntactic structure of well-formed programs. The Syntax Analysis phase of a compiler has two major goals:
1. Check the input program to determine whether it is syntactically correct.
2. Produce either a complete parse tree or at least trace the structure of the complete parse tree for syntactically correct input.
3
Some Basic Definitions
• syntax: the way in which words are put together to form phrases, clauses, or sentences; in a programming language, the rules governing the formation of statements.
• syntax analysis: the task concerned with fitting a sequence of tokens into a specified syntax.
• parsing: breaking a sentence down into its component parts of speech, with an explanation of the form, function, and syntactical relationship of each part.
4
Some Basic Definitions
Syntactic structure: the syntactic structure of programming languages can be informally expressed by the following diagram.
[Diagram: Program, Block(s), Statement(s), Expression(s), Token(s), nested from outermost to innermost]
5
Context-free Grammars
6
Context-free Grammars
Definition: CFG = (V, T, P, S)
❚ V : Finite set of variables/non-terminals
❚ T : Alphabet/Finite set of terminals, with V ∩ T = ∅
❚ P : Finite set of rules/productions
❚ S : Start symbol, S ∈ V
Rule : A → α, with A ∈ V and α ∈ (V ∪ T)*
7
Context-free Grammars
Definition
❚ Context-freeness: An A-rule can be applied whenever A occurs in a string, irrespective of the context (that is, the non-terminals and terminals around A).
8
Context-free Grammars
Derivation
❚ One-step derivation: uAv ⇒ uαv if A → α ∈ P
❚ w is derivable from v in a CFG if there is a finite sequence of rule applications such that v ⇒ w1 ⇒ ... ⇒ wn ⇒ w
In this case we write the derivation as v ⇒* w
9
Context-free Grammars
Derivation
A derivation v ⇒* w is called a:
❚ Leftmost derivation: if in every step the leftmost variable is selected for replacement
❚ Rightmost derivation: if in every step the rightmost variable is selected for replacement
10
Context-free Grammars
Example 1
Let G = ({S}, {a, b}, P, S), where P consists of:
❚ S → aSa, S → bSb, and S → λ.
❚ Some derivations from this grammar:
❚ S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa
❚ S ⇒ bSb ⇒ baSab ⇒ baab, and so on.
❚ In general, S ⇒ … ⇒ ww^R for w ∈ {a, b}*, where w^R is the reverse of w. Hence L(G) = {ww^R : w ∈ {a, b}*}
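The derivations of Example 1 can be replayed mechanically. The following is a minimal Python sketch, assuming the grammar is stored as a dictionary from each variable to its right-hand sides, with the empty string standing in for λ. The names `leftmost_step` and `derive` are illustrative and do not come from the slides.

```python
# Example 1 as a 4-tuple (V, T, P, S); "" stands in for λ (an assumption of this sketch).
V = {"S"}
T = {"a", "b"}
P = {"S": ["aSa", "bSb", ""]}
START = "S"


def leftmost_step(sentential, rhs):
    """Replace the leftmost variable of `sentential` with the right-hand side `rhs`."""
    for i, symbol in enumerate(sentential):
        if symbol in V:
            if rhs not in P[symbol]:
                raise ValueError(f"{symbol} -> {rhs!r} is not a production")
            return sentential[:i] + rhs + sentential[i + 1:]
    raise ValueError("no variable left to expand")


def derive(choices):
    """Apply the chosen right-hand sides in order and return every sentential form."""
    forms = [START]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms


steps = derive(["aSa", "aSa", "bSb", ""])
print(" => ".join(steps))
```

Because this grammar has a single variable, every derivation is both leftmost and rightmost; the final string is always of the form ww^R.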
11
Context-free Grammars
Example 2
G = ({S, A, B}, {a, b}, {S → AB, A → aA | λ, B → Bb | λ}, S)
L(G) = L(a*b*)
Leftmost derivation: S ⇒ AB ⇒ aAB ⇒ aB ⇒ aBb ⇒ ab
Rightmost derivation: S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aAb ⇒ ab
12
Context-free Grammars
Example 4
Consider the CFG: G = ({S}, {a, b}, {S → λ, S → aSb}, S)
❚ Derivation of aabb: S ⇒ aSb ⇒ aaSbb ⇒ aabb
13
Context-free Grammars
Example 5
Consider the CFG G:
S → aSa | aBa
B → bB | b
L(B) = {b^m | m > 0}
L(S) = {a^n b^m a^n | n > 0, m > 0}
L(G) = L(S)
14
Context-free Grammars
Example 6
Consider the CFG G1:
S → aSa | B
B → bB | λ
The language generated by G1 is: L(G1) = {a^n b^m a^n | n ≥ 0, m ≥ 0}
Consider the CFG G2:
S → abSc | λ
The language generated by G2 is: L(G2) = {(ab)^n c^n | n ≥ 0}
15
Context-free Grammars
Example 1
Consider the CFG G = ({S}, {a, b}, {S → λ, S → aSb}, S)
❚ The derivation of aabb is: S ⇒ aSb ⇒ aaSbb ⇒ aabb
❚ Derivation tree: the root S has children a, S, b; the inner S again has children a, S, b; the innermost S derives λ.
16
Context-free Grammars
Example 2
A ⇒ 0A1 ⇒ 00A11 ⇒ 00B11 ⇒ 00#11
[Figure: derivation tree with root A and leaves 0 0 # 1 1]
17
Context-free Grammars
Example 3
<EXPR> → <EXPR> + <EXPR>
<EXPR> → <EXPR> * <EXPR>
<EXPR> → ( <EXPR> )
<EXPR> → a
Build a parse tree for a + a * a
[Figure: parse tree with root <EXPR> and leaves a + a * a]
18
CFG: Parsing
Recognition of strings in a language
19
CFG: Parsing
Generative aspect of CFG: By now it should be clear how, from a CFG G, you can derive strings w ∈ L(G).
Analytical aspect: Given a CFG G and a string w, how do you decide whether w ∈ L(G) and, if so, how do you determine the derivation tree or the sequence of production rules that produces w? This is called the problem of parsing.
20
CFG: Parsing
Parser: a program that determines whether a string w ∈ L(G) by constructing a derivation. Equivalently, it searches the graph of G.
• Top-down parsers construct the derivation tree from root to leaves, following a leftmost derivation.
• Bottom-up parsers construct the derivation tree from leaves to root, following a rightmost derivation in reverse.
21
CFG: Parsing
Parse trees (= Derivation Trees)
A parse tree is a graphical representation of a derivation sequence of a sentential form. Tree nodes represent symbols of the grammar (nonterminals or terminals) and tree edges represent derivation steps.
22
CFG: Parsing
Parse Tree: Example
Given the following grammar:
E → E + E | E * E | ( E ) | - E | id
Is the string -(id + id) a sentence in this grammar? Yes, because there is the following derivation:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)
23
CFG: Parsing
Parse Tree: Example 1
E → E + E | E * E | ( E ) | - E | id
Let's examine this derivation:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)
[Figure: the parse tree grows one step at a time, from the root E down to the leaves - ( id + id )]
This is a top-down derivation because we start building the parse tree at the top.
24
CFG: Parsing
Parse Tree: Example 2
S → SS | a | b
ab ∈ L(S)
Leftmost derivation: S ⇒ SS ⇒ aS ⇒ ab
[Figure: derivation trees after each step of the leftmost derivation]
25
CFG: Parsing
Parse Tree: Example 2
Rightmost derivation: S ⇒ SS ⇒ Sb ⇒ ab
[Figure: derivation trees after each step of the rightmost derivation, and the same trees built bottom-up by following the rightmost derivation in reverse]
26
CFG: Parsing
Example 3
Consider the CFG G:
S → A
A → T | A + T
T → b | ( A )
Show that (b)+b ∈ L(G).
S ⇒ A ⇒ A + T ⇒ T + T ⇒ ( A ) + T ⇒ ( T ) + T ⇒ ( b ) + T ⇒ ( b ) + b
[Figure: the derivation tree for (b)+b, built one step at a time]
27
CFG: Parsing
Practical Parsers
Language/Grammar designed to enable deterministic (directed and backtrack-free) searches.
• Top-down parsers: LL(k) languages. E.g., Pascal, Ada, etc. Better error diagnosis and recovery.
• Bottom-up parsers: LALR(1), LR(k) languages. E.g., C/C++, Java, etc. Handles left recursion in the grammar.
• Backtracking parsers. E.g., Prolog interpreter.
28
Grammar Ambiguity
Definition: a string is derived ambiguously in a context-free grammar if it has two or more different parse trees.
Definition: a grammar is ambiguous if it generates some string ambiguously.
29
Grammar Ambiguity
A string w ∈ L(G) is derived ambiguously if it has more than one derivation tree (or equivalently, more than one leftmost derivation, or more than one rightmost derivation). A grammar is ambiguous if some string is derived ambiguously.
Typical example: the rules S → 0 | 1 | S+S | SS give
S ⇒ S+S ⇒ SS+S ⇒ 0S+S ⇒ 01+S ⇒ 01+1
versus
S ⇒ SS ⇒ 0S ⇒ 0S+S ⇒ 01+S ⇒ 01+1
30
Grammar Ambiguity
The ambiguity of 01+1 is shown by the two different parse trees:
[Figure: one tree applies S → S+S at the root, grouping the string as (01)+1; the other applies S → SS at the root, grouping it as 0(1+1)]
31
Grammar Ambiguity
Note that the two different derivations
S ⇒ S+S ⇒ 0+S ⇒ 0+1 and S ⇒ S+S ⇒ S+1 ⇒ 0+1
do not make the string 0+1 ambiguous, because they have the same parse tree.
[Figure: the single parse tree of 0+1, with root S and children S, +, S]
Ambiguity causes trouble when trying to interpret strings like: “She likes men who love women who don't smoke.”
Solutions: use parentheses, or use precedence rules such as a+(bc) = a+bc ≠ (a+b)c.
32
Grammar Ambiguity
Example
<EXPR> → <EXPR> + <EXPR>
<EXPR> → <EXPR> * <EXPR>
<EXPR> → ( <EXPR> )
<EXPR> → a
Build a parse tree for a + a * a
[Figure: two different parse trees for a + a * a, one with + at the root and one with * at the root]
33
Grammar Ambiguity
Example
E → E + E | E * E | ( E ) | - E | id
Find a derivation for the expression: id + id * id
[Figure: two derivation trees for id + id * id, one with E + E at the root and one with E * E at the root]
Which derivation tree is correct?
34
Grammar Ambiguity
Example
E → E + E | E * E | ( E ) | - E | id
Find a derivation for the expression: id + id * id
According to the grammar, both are correct.
[Figure: the two derivation trees for id + id * id]
A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar.
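One way to check the claim for a concrete sentence is to count its parse trees. The sketch below is a hedged illustration written for this particular grammar only (E → E + E | E * E | ( E ) | - E | id); the function name `count_parse_trees` and the token-list input format are assumptions of the sketch, not part of the slides.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under E -> E+E | E*E | (E) | -E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E.
        if j <= i:
            return 0
        total = 0
        if j - i == 1 and tokens[i] == "id":
            total += 1                                  # E -> id
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                # E -> ( E )
        if tokens[i] == "-":
            total += count(i + 1, j)                    # E -> - E
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # E -> E + E or E -> E * E, with the operator at position k
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))
```

A count above 1 for any sentence shows that the grammar is ambiguous; a count of 1 for one sentence says nothing about the others.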
35
Grammar Ambiguity
One way to resolve ambiguity is to associate precedence to the operators.
Example: * has precedence over +
1 + 2 * 3 = 1 + (2 * 3)
1 + 2 * 3 ≠ (1 + 2) * 3
Associativity and precedence information is typically used to disambiguate non-fully parenthesized expressions containing unary prefix/postfix operators or binary infix operators.
36
Grammar Ambiguity
Example
Grammar: stm → if expr then stm | if expr then stm else stm
Ambiguity: in "if B1 then if B2 then S1 else S2", the else can attach to either if:
if B1 then (if B2 then S1 else S2)
vs
if B1 then (if B2 then S1) else S2
37
Grammar Ambiguity
Quiz 1
Is the following grammar ambiguous?
S → PC | AQ
P → aPb | λ
C → cC | λ
Q → bQc | λ
A → aA | λ
Yes: consider the string abc
38
Grammar Ambiguity
Quiz 2
Is the following grammar ambiguous?
S → aS | Sb | ab | λ
Yes: consider ab
39
Grammar Ambiguity
Quiz
Is the following grammar ambiguous?
S → SS | λ
Yes: the derivation S ⇒ SS ⇒ SSS ⇒ … has a cyclic structure.
(Illustrates ambiguous grammar with cycles.)
40
Grammar Applications
Programming Languages
Programming languages are often defined as Context Free Grammars in Backus-Naur Form (BNF).
Example:
<if_statement> ::= IF <expression><then_clause><else_clause>
<expression> ::= <term> | <expression>+<term>
<term> ::= <factor> | <term>*<factor>
• Variables are written as <a variable name>.
• The arrow → is replaced by ::=
• Here, IF, + and * are terminals.
“Syntax Checking” is checking whether a program is an element of the language generated by the CFG of the programming language.
41
Grammar Applications
Compiler Syntax Analysis
Compiler: Source Program → Scanner → Parser → Semantic Analysis → Intermediate Code Generation → Optimizer → Code Generation → Target Program
The Parser is the part of the compiler that uses the grammar.
42
[Figure: the Source Program supplies the lexical analyzer with characters ("get next char" / "next char"); the lexical analyzer supplies the syntax analyzer with tokens ("get next token" / "next token"); the diagram also shows the symbol table, which contains a record for each identifier]
1. Lexical analyzer: uses Regular Expressions to define tokens and Finite Automata to recognize tokens.
2. Syntax analyzer: uses Top-down parsing or Bottom-up parsing to construct a Parse tree.
43
Syntax errors
Parsing errors include:
1. misspelling of an identifier, keyword, or operator
2. arithmetic expression with unbalanced parentheses
3. punctuation errors such as using a comma in place of a semicolon
4. missing brackets, semicolons, etc.
44
Error recovery
The error handler in a parser has the following jobs:
1. report the presence of errors clearly and accurately
2. recover from each error quickly
3. not slow down the processing of correct programs
45
Example: The following C code shows some examples of syntax errors:
#include<stdio.h>
int max(int I int j)
{ if(i>j) return(i)
  return(j);
}
void main()
  int x, y
  scanf("%d %d", x, y);
  printf("%d", max(x,y) ;
46
Example: A typical compilation of this erroneous program gives the following list of errors:
1. error C2235: ';' in formal parameter list
2. error C2059: syntax error : ')'
3. error C2239: unexpected token 'f' following declaration of 'j'
4. error C2078: too many initializers
5. error C2660: 'max' : function does not take 2 parameters
6. error C2143: syntax error : missing ')' before ';'
47
Example: The correct version of this program is
#include<stdio.h>
int max(int i, int j)
{ if(i>j) return(i);
  return(j);
}
int main(void)
{ int x, y;
  scanf("%d %d", &x, &y);   /* scanf needs the addresses of x and y */
  printf("%d", max(x,y));
  return 0;
}
48
Significance of Context-Free Grammars
Grammars offer several significant advantages:
1. Easy to understand and construct programs
2. Easy parsing
3. Easy error detection and handling
4. Easy language extension
49
Parsing
• Top Down Parsing
  – Predictive Parsing
  – LL(k) Parsing
  – Left Recursion
  – Left Factoring
• Bottom Up Parsing
  – Shift-reduce Parsing
  – LR(k) Parsing
50
Parsing
• Top Down Parsing
  – Predictive Parsing
  – LL(k) Parsing
  – Left Recursion
  – Left Factoring
• Bottom Up Parsing
  – Shift-reduce Parsing
  – LR(k) Parsing
Top-down parsers start constructing the parse tree at the top (root) of the tree and move down towards the leaves. They are easy to implement by hand, but work only with restricted grammars. Example: predictive parsers.
51
Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
A top-down parser might loop forever when parsing an expression E using this grammar:
E ⇒ E + T ⇒ E + T + T ⇒ E + T + T + T ⇒ …
52
Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
A grammar that has at least one production of the form A → Aα is a left-recursive grammar.
Top-down parsers do not work with left-recursive grammars.
Left recursion can often be eliminated by rewriting the grammar.
53
Left Recursion
This left-recursive grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
can be rewritten to eliminate the immediate left recursion:
E → TE’
E’ → +TE’ | λ
T → FT’
T’ → *FT’ | λ
F → ( E ) | id
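The rewriting follows a fixed pattern: A → Aα | β becomes A → βA’ and A’ → αA’ | λ. A minimal Python sketch of that pattern is given below, assuming each alternative is a list of symbols and the empty list stands for λ. The function name and the convention of naming the new non-terminal by appending an apostrophe are assumptions of the sketch, and only immediate left recursion is handled.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A alpha | beta as A -> beta A' and A' -> alpha A' | lambda.

    `grammar` maps each non-terminal to a list of alternatives; an
    alternative is a list of symbols and [] stands for lambda.
    """
    result = {}
    for head, alternatives in grammar.items():
        # alpha parts of the left-recursive alternatives A -> A alpha
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # beta parts: the alternatives that do not start with A
        others = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        new_head = head + "'"  # assumed not to clash with an existing name
        result[head] = [beta + [new_head] for beta in others]
        result[new_head] = [alpha + [new_head] for alpha in recursive] + [[]]
    return result


left_recursive = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
rewritten = eliminate_immediate_left_recursion(left_recursive)
for head, alternatives in rewritten.items():
    print(head, "->", " | ".join(" ".join(alt) or "λ" for alt in alternatives))
```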
54
Predictive Parsing
Consider the grammar:
stmt → if expr then stmt else stmt
     | while expr do stmt
     | begin stmt_list end
A parser for this simple grammar can be written with the following structure:
switch(gettoken())
{ case if: …. break;
  case while: …. break;
  case begin: …. break;
  default: reject input;
}
Based only on the first token, the parser knows which rule to use to derive a statement. Therefore this is called a predictive parser.
55
Left Factoring
The following grammar:
stmt → if expr then stmt else stmt
     | if expr then stmt
cannot be parsed by a predictive parser that looks one element ahead. But the grammar can be rewritten:
stmt → if expr then stmt stmt’
stmt’ → else stmt | λ
where λ is the empty string. Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring.
56
Left Factoring
The basic idea is, in general, as follows:
1. let A → αβ1 | αβ2 be two production rules for the nonterminal symbol A
2. if the input begins with a nonempty string derived from α
3. and we do not know whether to expand A to αβ1 or αβ2
4. then we may defer the decision by expanding A to αA'
5. after seeing the input derived from α, we expand A' to β1 or to β2
6. this means, left-factored, the original productions become
A → αA'
A' → β1 | β2
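The steps above can be sketched in Python as a single round of left factoring for one non-terminal. This is only an illustration under stated assumptions: alternatives are lists of symbols, [] stands for λ, alternatives are grouped by their first symbol, and the helper names `common_prefix` and `left_factor` are invented for the sketch.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(head, alternatives):
    """One round of left factoring: A -> alpha b1 | alpha b2 becomes
    A -> alpha A' and A' -> b1 | b2 ([] stands for lambda)."""
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None          # group by first symbol
        groups.setdefault(key, []).append(alt)
    result = {head: []}
    counter = 0
    for key, group in groups.items():
        if key is None or len(group) == 1:
            result[head].extend(group)         # nothing to factor
            continue
        alpha = common_prefix(group)
        counter += 1
        new_head = head + "'" * counter        # assumed to be a fresh name
        result[head].append(alpha + [new_head])
        result[new_head] = [alt[len(alpha):] for alt in group]
    return result


factored = left_factor("stmt", [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
])
print(factored)
```

Applied to the if-then-else grammar, the sketch yields stmt → if expr then stmt stmt' and stmt' → else stmt | λ. A full implementation would repeat the round until no two alternatives of any non-terminal share a prefix.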
57
A Predictive Parser
How does it work?
1. Construct the parsing table from the given grammar
2. Apply the predictive parsing algorithm to construct the parse tree
58
A Predictive Parser
1. Construct the parsing table from the given grammar
The following algorithm shows how we can construct the parsing table:
Input: a grammar G
Output: the corresponding parsing table M
Method: For each production A → α of the grammar, do the following steps:
1. For each terminal a in FIRST(α), add A → α to M[A, a].
2. If λ is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A).
3. If λ is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
How do we construct the FIRST and FOLLOW sets?
59
The Parsing Table
Example
Given this grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
Here ε = λ = empty string.
How is this parsing table built?
PARSING TABLE:
NON-TERMINAL | id      | +         | *         | (       | )      | $
E            | E → TE’ |           |           | E → TE’ |        |
E’           |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T            | T → FT’ |           |           | T → FT’ |        |
T’           |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F            | F → id  |           |           | F → (E) |        |
60
FIRST and FOLLOW
We need to build a FIRST set and a FOLLOW set for each symbol in the grammar.
The elements of FIRST and FOLLOW are terminal symbols (FIRST may also contain ε, and FOLLOW may contain the end marker $).
FIRST(α) is the set of terminal symbols that can begin any string derived from α.
FOLLOW(α) is the set of terminal symbols that can follow α:
t ∈ FOLLOW(α) ↔ ∃ derivation containing αt
61
Rules to Create FIRST
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
FIRST rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → ε, then ε ∈ FIRST(X)
3. If X → Y1Y2 ••• Yk and Y1 ••• Yi-1 ⇒* ε and a ∈ FIRST(Yi), then a ∈ FIRST(X)
SETS:
FIRST(id) = {id}, FIRST(*) = {*}, FIRST(+) = {+}, FIRST(() = {(}, FIRST()) = {)}
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E) = FIRST(T) = {(, id}
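The three FIRST rules can be applied repeatedly until no set changes. The following Python sketch shows one way to do that for the grammar on this slide; the dictionary encoding of the grammar and the names `first_of_string` and `compute_first` are assumptions made for the illustration.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yk, given the FIRST sets found so far."""
    result = set()
    for symbol in symbols:
        if symbol == EPS:
            continue
        # Rule 1: a symbol without an entry in `first` is a terminal.
        symbol_first = first.get(symbol, {symbol})
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result                 # Yi cannot vanish: stop here (rule 3)
    result.add(EPS)                       # every Yi can derive ε (rule 2)
    return result


def compute_first(grammar):
    """Iterate the rules until the FIRST sets stop growing."""
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


first_sets = compute_first(GRAMMAR)
for head in GRAMMAR:
    print(f"FIRST({head}) =", sorted(first_sets[head]))
```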
62
Rules to Create FOLLOW
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
FIRST sets: FIRST(E’) = {+, ε}; FIRST(T’) = {*, ε}; FIRST(F) = FIRST(T) = FIRST(E) = {(, id}
FOLLOW rules (A and B are non-terminals, α and β are strings of grammar symbols):
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If A → αBβ and a ∈ FIRST(β) and a ≠ ε, then a ∈ FOLLOW(B)
3. If A → αB and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)
3a. If A → αBβ and β ⇒* ε and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)
SETS:
FOLLOW(E) = {$}, extended to {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {), $}
63
Rules to Create FOLLOW
By rule 2 applied to E → TE’ and E’ → +TE’, every non-ε element of FIRST(E’) is in FOLLOW(T), so + ∈ FOLLOW(T).
SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {), $}, extended to {+, ), $}
64
Rules to Create FOLLOW
By rule 3 applied to T → FT’, every element of FOLLOW(T) is in FOLLOW(T’).
SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
65
Rules to Create FOLLOW
By rule 3a applied to T → FT’ (with T’ ⇒* ε), every element of FOLLOW(T) is in FOLLOW(F).
SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, ), $}
66
Rules to Create FOLLOW
By rule 2 applied to T → FT’ and T’ → *FT’, every non-ε element of FIRST(T’) is in FOLLOW(F), so * ∈ FOLLOW(F).
SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, ), $}, extended to {+, *, ), $}
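The FOLLOW rules can likewise be iterated to a fixed point. The sketch below is a hedged illustration: it takes the FIRST sets as given on the slides rather than recomputing them, and the names `first_of_string` and `compute_follow` as well as the dictionary encoding are assumptions of the sketch.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST sets of the non-terminals, copied from the slides.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; the empty string gives {ε}."""
    result = set()
    for symbol in symbols:
        if symbol == EPS:
            continue
        symbol_first = FIRST.get(symbol, {symbol})   # terminals: {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {head: set() for head in grammar}
    follow[start].add(END)                           # rule 1
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, symbol in enumerate(alt):
                    if symbol not in grammar:        # only non-terminals B
                        continue
                    beta_first = first_of_string(alt[i + 1:])
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:            # rules 3 and 3a
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


follow_sets = compute_follow(GRAMMAR, "E")
for head in GRAMMAR:
    print(f"FOLLOW({head}) =", sorted(follow_sets[head]))
```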
67
Rules to Build Parsing Table
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
FIRST SETS: FIRST(E’) = {+, ε}; FIRST(T’) = {*, ε}; FIRST(F) = FIRST(T) = FIRST(E) = {(, id}
FOLLOW SETS: FOLLOW(E) = FOLLOW(E’) = {), $}; FOLLOW(T) = FOLLOW(T’) = {+, ), $}; FOLLOW(F) = {+, *, ), $}
1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
PARSING TABLE M:
NON-TERMINAL | id      | +         | *         | (       | )      | $
E            | E → TE’ |           |           | E → TE’ |        |
E’           |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T            | T → FT’ |           |           | T → FT’ |        |
T’           |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F            | F → id  |           |           | F → (E) |        |
72
Rules to Build Parsing Table
1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
2. If A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
For example, E’ → ε is entered in M[E’, )] because ) ∈ FOLLOW(E’), and T’ → ε is entered in M[T’, +] and M[T’, )] because FOLLOW(T’) contains + and ).
74
Rules to Build Parsing Table
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
FIRST SETS: FIRST(E’) = {+, ε}; FIRST(T’) = {*, ε}; FIRST(F) = FIRST(T) = FIRST(E) = {(, id}
FOLLOW SETS: FOLLOW(E) = FOLLOW(E’) = {), $}; FOLLOW(T) = FOLLOW(T’) = {+, ), $}; FOLLOW(F) = {+, *, ), $}
1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
2. If A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
3. If A → α: if ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
PARSING TABLE M:
NON-TERMINAL | id      | +         | *         | (       | )      | $
E            | E → TE’ |           |           | E → TE’ |        |
E’           |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T            | T → FT’ |           |           | T → FT’ |        |
T’           |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F            | F → id  |           |           | F → (E) |        |
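The three table-building rules translate almost directly into code. The Python sketch below assumes the FIRST and FOLLOW sets listed on the slides and stores the table as a dictionary keyed by (non-terminal, terminal); the names `build_table`, `M` and `conflicts` are illustrative only. Because $ is stored in the FOLLOW sets, rules 2 and 3 are handled by the same line.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST and FOLLOW sets copied from the slides.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", END}, "E'": {")", END}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"+", "*", ")", END}}


def first_of_string(symbols):
    """FIRST of the right-hand side of a production."""
    result = set()
    for symbol in symbols:
        if symbol == EPS:
            continue
        symbol_first = FIRST.get(symbol, {symbol})   # terminals: {symbol}
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as {(non-terminal, terminal): [right-hand sides]}."""
    table = {}
    for head, alternatives in grammar.items():
        for alt in alternatives:
            alt_first = first_of_string(alt)
            columns = alt_first - {EPS}              # rule 1
            if EPS in alt_first:
                columns |= FOLLOW[head]              # rules 2 and 3
            for terminal in columns:
                table.setdefault((head, terminal), []).append(alt)
    return table


M = build_table(GRAMMAR)
# A cell holding more than one production would mean the grammar is not LL(1).
conflicts = {cell: alts for cell, alts in M.items() if len(alts) > 1}
print(len(M), "filled cells,", len(conflicts), "conflicts")
```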
75
A Predictive Parser
2. Apply the predictive parsing algorithm to construct the parse tree
The following algorithm shows how to construct the sequence of parsing moves for an input string w$ with respect to a given grammar G. Initially the stack holds the start symbol on top of $.
set ip to point to the first symbol of the input string w$
repeat
  let X = Top(stack) and a = Current-Input(ip)
  if X is a terminal or $ then
    if X = a then Pop(stack) and advance ip
    else error()
  else
    if M[X, a] = X → Y1Y2…Yk then
    begin
      Pop(stack);
      Push Yk, …, Y2, Y1 onto the stack, with Y1 on top;
      Output the production X → Y1Y2…Yk
    end
    else error()
until Top(stack) = $ (i.e. the stack becomes empty)
76
A Predictive Parser
2. Apply the predictive parsing algorithm to construct the parse tree
Example
Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
Parsing Table:
NON-TERMINAL | id      | +         | *         | (       | )      | $
E            | E → TE’ |           |           | E → TE’ |        |
E’           |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T            | T → FT’ |           |           | T → FT’ |        |
T’           |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F            | F → id  |           |           | F → (E) |        |
77
A Predictive Parser
Predictive Parsing Program
INPUT: id + id * id $ (ip points to the first id)
STACK: E $, then T E’ $ after applying M[E, id] = E → TE’
OUTPUT: E → TE’
78
A Predictive Parser
Predictive Parsing Program
INPUT: id + id * id $
STACK: T E’ $, then F T’ E’ $ after applying M[T, id] = T → FT’
OUTPUT: E → TE’, T → FT’
79
A Predictive Parser
Predictive Parsing Program
INPUT: id + id * id $
STACK: F T’ E’ $, then id T’ E’ $ after applying M[F, id] = F → id
OUTPUT: E → TE’, T → FT’, F → id
80
A Predictive Parser
Predictive Parsing Program
Action when Top(Stack) = input ≠ $ : Pop stack, advance input.
INPUT: id + id * id $
STACK: id T’ E’ $, then T’ E’ $ after matching id; ip advances to +
OUTPUT: E → TE’, T → FT’, F → id
81
A Predictive Parser
Predictive Parsing Program
INPUT: id + id * id $ (ip points to +)
STACK: T’ E’ $, then E’ $ after applying M[T’, +] = T’ → ε
OUTPUT: E → TE’, T → FT’, F → id, T’ → ε
82
A Predictive Parser
The predictive parser proceeds in this fashion, emitting the following productions:
E’ → +TE’
T → FT’
F → id
T’ → *FT’
F → id
T’ → ε
E’ → ε
[Figure: the completed parse tree for id + id * id]
When Top(Stack) = input = $, the parser halts and accepts the input string.
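The whole trace can be reproduced with a short table-driven parser. The Python sketch below hard-codes the parsing table from the slides as a dictionary; the function name `ll1_parse`, the use of `SyntaxError` for rejected input, and the "->" output format are assumptions made for the illustration.

```python
EPS, END = "ε", "$"

# Parsing table M from the slides: (non-terminal, lookahead) -> right-hand side.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [EPS], ("E'", END): [EPS],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPS], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPS], ("T'", END): [EPS],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions of the leftmost derivation, or raise SyntaxError."""
    tokens = list(tokens) + [END]
    stack = [END, start]                  # top of the stack is the end of the list
    ip = 0
    output = []
    while stack[-1] != END:
        top, current = stack[-1], tokens[ip]
        if top not in NONTERMINALS:       # terminal on top: it must match the input
            if top != current:
                raise SyntaxError(f"expected {top!r}, found {current!r}")
            stack.pop()
            ip += 1
        else:                             # non-terminal on top: consult M[X, a]
            body = TABLE.get((top, current))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {current}]")
            stack.pop()
            # Push Yk ... Y1 so that Y1 ends up on top; ε pushes nothing.
            stack.extend(s for s in reversed(body) if s != EPS)
            output.append(f"{top} -> {' '.join(body)}")
    if tokens[ip] != END:
        raise SyntaxError(f"unexpected {tokens[ip]!r} after the end of the sentence")
    return output


productions = ll1_parse("id + id * id".split())
print("\n".join(productions))
```

The printed productions, read top to bottom, form the leftmost derivation of id + id * id.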
83
LL(k) Parser
This parser parses from left to right and produces a leftmost derivation. It looks 1 symbol ahead to choose its next action. Therefore, it is known as an LL(1) parser.
An LL(k) parser looks k symbols ahead to decide its action.
LL(1) grammar: a grammar whose parsing table has no multiply-defined entries.
LL(1) grammars enjoy several nice properties: for example, they are not ambiguous and not left recursive.
84
LL(k) Parser
LL(1) grammar: a grammar whose parsing table has no multiply-defined entries.
Example 1
The grammar
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
whose PARSING TABLE is:
NON-TERMINAL | id      | +         | *         | (       | )      | $
E            | E → TE’ |           |           | E → TE’ |        |
E’           |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T            | T → FT’ |           |           | T → FT’ |        |
T’           |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F            | F → id  |           |           | F → (E) |        |
is an LL(1) grammar.
85
LL(k) Parser
LL(1) grammar: a grammar whose parsing table has no multiply-defined entries.
Example 2
The grammar
S → iEtSS’ | a
S’ → eS | ε
E → b
whose PARSING TABLE is:
NON-TERMINAL | a     | b     | e               | i          | t | $
S            | S → a |       |                 | S → iEtSS’ |   |
S’           |       |       | S’ → ε, S’ → eS |            |   | S’ → ε
E            |       | E → b |                 |            |   |
is NOT an LL(1) grammar, because the entry M[S’, e] is multiply defined.
86
Parsing
• Top Down Parsing
  – Predictive Parsing
  – LL(k) Parsing
  – Left Recursion
  – Left Factoring
• Bottom Up Parsing
  – Shift-reduce Parsing
  – LR(k) Parsing
87
Bottom-Up Parsers
Bottom-up parsers build the nodes at the bottom of the parse tree first. They are suitable for automatic parser generation and handle a larger class of grammars. Examples: shift-reduce parsers (or LR(k) parsers).
88
Bottom-up Parsing
• No problem with left recursion
• Widely used in practice
• LR(1), SLR(1), LALR(1)
89
Grammar Hierarchy
[Figure: nested grammar classes inside the non-ambiguous CFGs: CLR(1), LALR(1), SLR(1), and LL(1)]
90
Bottom-up Parsing
• Works from tokens to start-symbol
• Repeat:
  – identify handle, a reducible sequence:
    • non-terminal is not constructed but
    • all its children have been constructed
  – reduce: construct non-terminal and update stack
• Until reducing to start-symbol
91
Bottom-up Parsing
E → E + (E)
E → i
i = 0, 1, 2, …, 9
[Figure: the parse tree for 1 + (2) + (3), built from the leaves 1 + ( 2 ) + ( 3 ) up to the root E]
92
Bottom-up Parsing
• Is the following grammar LL(1)?
E → E + (E)
E → i
❚ NO (it is left recursive)
❚ Example input: 1 + (2) + (3)
❚ But this is a useful grammar
93
Bottom-Up Parser
A bottom-up parser, or a shift-reduce parser, begins at the leaves and works up to the top of the tree. The reduction steps trace a rightmost derivation in reverse.
Consider the grammar:
S → aABe
A → Abc | b
B → d
We want to parse the input string abbcde.
94
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a b b c d e $
OUTPUT: (nothing yet)
95
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a A b c d e $
OUTPUT: A → b (the first b is reduced to A)
97
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a A b c d e $
OUTPUT: A → b
We are not reducing the second b here in this example. A parser would reduce, get stuck, and then backtrack!
98
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a A b c d e $ (the handle A b c is reduced by A → Abc)
OUTPUT: A → b, A → Abc
99
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a A d e $
OUTPUT: A → b, A → Abc
100
Bottom-Up Parser Example
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a A d e $ (the handle d is reduced by B → d)
OUTPUT: A → b, A → Abc, B → d
101
Bottom-Up Parser Example A B A b c d b Bottom-Up Parsing Program
INPUT: a A B e $ OUTPUT: A B A b c d b Production S → aABe Bottom-Up Parsing A → Abc Program A → b B → d
102
Bottom-Up Parser Example a A B e Bottom-Up Parsing Program OUTPUT:
INPUT: a A B e $ OUTPUT: S a A B e A b c d b Production S → aABe Bottom-Up Parsing A → Abc Program A → b B → d
103
Bottom-Up Parser Example a A B e Bottom-Up Parsing Program
OUTPUT: INPUT: S $ S a A B e A b c d b Production S → aABe Bottom-Up Parsing A → Abc Program A → b B → d This parser is known as an LR Parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order.
104
Bottom-Up Parser Example
Scanning the productions to match handles in the input string, together with backtracking, makes the method used in the previous example very inefficient. Can we do better?
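To make the cost concrete, here is a minimal sketch of the naive method, assuming the grammar S → aABe, A → Abc | b, B → d from the example. The function name `reduce_to_start` and the string encoding of sentential forms are hypothetical choices for illustration. It tries every production at every position and backtracks when a choice leads to a dead end, which is exactly the repeated scanning the slide calls inefficient.

```python
# Productions as (head, body) pairs; each symbol is a single character.
PRODUCTIONS = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]


def reduce_to_start(form, start="S", seen=None):
    """Return the list of sentential forms from `form` down to `start`,
    or None if no sequence of reductions reaches the start symbol."""
    if seen is None:
        seen = set()
    if form == start:
        return [form]
    if form in seen:  # already explored this form and it was a dead end
        return None
    seen.add(form)
    for head, body in PRODUCTIONS:
        pos = form.find(body)
        while pos != -1:
            # Replace one occurrence of the body by the head (a reduction).
            reduced = form[:pos] + head + form[pos + len(body):]
            rest = reduce_to_start(reduced, start, seen)
            if rest is not None:
                return [form] + rest
            pos = form.find(body, pos + 1)  # backtrack: try the next match
    return None


print(reduce_to_start("abbcde"))
# ['abbcde', 'aAbcde', 'aAde', 'aABe', 'S']
```

For a string outside the language, such as abcde, the search explores every possible reduction before returning None. An LR parser avoids this search by consulting precomputed tables.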
105
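LR Parser Example
[Figure: model of an LR parser. The LR Parsing Program reads the Input, maintains a Stack, consults the action and goto tables, and produces the Output.]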
106
Shift reduce parser
1. Construct the action-goto table from the given grammar
2. Apply the shift-reduce parsing algorithm to construct the parse tree
107
Shift reduce parser
1. Construct the action-goto table from the given grammar
This is what makes the difference between the different types of shift-reduce parsing, such as SLR, CLR, and LALR.
In this course, due to shortage of time, we will not study how to construct the action-goto table.
108
Shift reduce parser
2. Apply the shift-reduce parsing algorithm to construct the parse tree
The following algorithm shows how we can construct the table of parsing moves for an input string w$ with respect to a given grammar G.
  set ip to point to the first symbol of the input string w$
  repeat forever
  begin
    if action[top(stack), current-input(ip)] = shift(s) then
    begin
      push current-input(ip) then s on top of the stack
      advance ip to the next input symbol
    end
    else if action[top(stack), current-input(ip)] = reduce A → β then
    begin
      pop 2*|β| symbols off the stack;
      push A then goto[top(stack), A] on top of the stack;
      output the production A → β
    end
    else if action[top(stack), current-input(ip)] = accept then
      return
    else
      error()
  end
109
LR Parser Example
The following grammar:
  (1) E → E + T
  (2) E → T
  (3) T → T * F
  (4) T → F
  (5) F → ( E )
  (6) F → id
can be parsed with this action and goto table:

          action                          goto
  State |  id    +    *    (    )    $   |  E   T   F
    0   |  s5             s4             |  1   2   3
    1   |       s6                  acc  |
    2   |       r2   s7        r2   r2   |
    3   |       r4   r4        r4   r4   |
    4   |  s5             s4             |  8   2   3
    5   |       r6   r6        r6   r6   |
    6   |  s5             s4             |      9   3
    7   |  s5             s4             |          10
    8   |       s6             s11       |
    9   |       r1   s7        r1   r1   |
   10   |       r3   r3        r3   r3   |
   11   |       r5   r5        r5   r5   |

s represents shift, r represents reduce, acc represents accept, empty represents error.
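The table-driven algorithm can be sketched in a few lines of code. The following is an illustrative sketch, not a definitive implementation: the names `GRAMMAR`, `ACTION`, `GOTO`, and `lr_parse`, and the dictionary encoding of the table, are assumptions made for this example. The table entries are those of the slide, with the initial state numbered 0.

```python
# Productions, numbered as on the slide: number -> (head, body)
GRAMMAR = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}

# action[state][terminal]; a missing entry means error
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# goto[state][non-terminal]
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Parse a token list; return the production numbers in reduction order."""
    tokens = list(tokens) + ["$"]
    stack = [0]          # alternates states and grammar symbols, state on top
    ip = 0               # index of the current input symbol
    output = []
    while True:
        act = ACTION[stack[-1]].get(tokens[ip])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[ip]!r} at position {ip}")
        if act == "acc":
            return output
        if act[0] == "s":                      # shift: push symbol, then state
            stack.append(tokens[ip])
            stack.append(int(act[1:]))
            ip += 1
        else:                                  # reduce by production number
            number = int(act[1:])
            head, body = GRAMMAR[number]
            del stack[len(stack) - 2 * len(body):]   # pop 2*|body| entries
            stack.append(head)
            stack.append(GOTO[stack[-2]][head])      # state exposed by the pop
            output.append(number)


print(lr_parse(["id", "*", "id", "+", "id"]))  # [6, 4, 6, 3, 2, 6, 4, 1]
```

The returned list reads as F → id, T → F, F → id, T → T * F, E → T, F → id, T → F, E → E + T, which is a rightmost derivation of id * id + id in reverse.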
110
LR Parser Example
GRAMMAR: (1) E → E + T  (2) E → T  (3) T → T * F  (4) T → F  (5) F → ( E )  (6) F → id
The action and goto table is the one given on the previous slide. In the slides that follow, the stack is written from bottom to top.
INPUT: id * id + id $
STACK: 0
OUTPUT: none
MOVE: action[0, id] = s5
111
LR Parser Example
INPUT: id * id + id $
STACK: 0 id 5
OUTPUT: F → id
MOVE: action[5, *] = r6, reduce by (6) F → id
112
LR Parser Example
INPUT: id * id + id $
STACK: 0
OUTPUT: F → id
MOVE: goto[0, F] = 3
113
LR Parser Example
INPUT: id * id + id $
STACK: 0 F 3
OUTPUT: F → id, T → F
MOVE: action[3, *] = r4, reduce by (4) T → F
114
LR Parser Example
INPUT: id * id + id $
STACK: 0
OUTPUT: F → id, T → F
MOVE: goto[0, T] = 2
115
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2
OUTPUT: F → id, T → F
MOVE: action[2, *] = s7
116
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2 * 7
OUTPUT: F → id, T → F
MOVE: action[7, id] = s5
117
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2 * 7 id 5
OUTPUT: F → id, T → F, F → id
MOVE: action[5, +] = r6, reduce by (6) F → id
118
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2 * 7
OUTPUT: F → id, T → F, F → id
MOVE: goto[7, F] = 10
119
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2 * 7 F 10
OUTPUT: F → id, T → F, F → id, T → T * F
MOVE: action[10, +] = r3, reduce by (3) T → T * F
120
LR Parser Example
INPUT: id * id + id $
STACK: 0
OUTPUT: F → id, T → F, F → id, T → T * F
MOVE: goto[0, T] = 2
121
LR Parser Example
INPUT: id * id + id $
STACK: 0 T 2
OUTPUT: F → id, T → F, F → id, T → T * F, E → T
MOVE: action[2, +] = r2, reduce by (2) E → T
122
LR Parser Example
INPUT: id * id + id $
STACK: 0
OUTPUT: F → id, T → F, F → id, T → T * F, E → T
MOVE: goto[0, E] = 1
123
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1
OUTPUT: F → id, T → F, F → id, T → T * F, E → T
MOVE: action[1, +] = s6
124
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6
OUTPUT: F → id, T → F, F → id, T → T * F, E → T
MOVE: action[6, id] = s5
125
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6 id 5
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id
MOVE: action[5, $] = r6, reduce by (6) F → id
126
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id
MOVE: goto[6, F] = 3
127
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6 F 3
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id, T → F
MOVE: action[3, $] = r4, reduce by (4) T → F
128
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id, T → F
MOVE: goto[6, T] = 9
129
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1 + 6 T 9
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id, T → F, E → E + T
MOVE: action[9, $] = r1, reduce by (1) E → E + T
130
LR Parser Example
INPUT: id * id + id $
STACK: 0
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id, T → F, E → E + T
MOVE: goto[0, E] = 1
131
LR Parser Example
INPUT: id * id + id $
STACK: 0 E 1
OUTPUT: F → id, T → F, F → id, T → T * F, E → T, F → id, T → F, E → E + T
MOVE: action[1, $] = acc, the input is accepted
132
Constructing Parsing Tables
All LR parsers use the same parsing program that we demonstrated in the previous slides. What differentiates the LR parsers are the action and the goto tables:
• Simple LR (SLR): succeeds for the fewest grammars, but is the easiest to implement.
• Canonical LR: succeeds for the most grammars, but is the hardest to implement. It splits states when necessary to prevent reductions that would get the parser stuck.
• Lookahead LR (LALR): succeeds for most common syntactic constructions used in programming languages, and produces LR tables much smaller than canonical LR.
133
Grammar Hierarchy
[Figure: hierarchy of grammar classes showing Non-ambiguous CFG, CLR(1), LALR(1), SLR(1), and LL(1).]
134
Parsing
• How parser works?
  – Top Down Parsing
    • Predictive Parsing
    • LL(k) Parsing
    • Left Recursion
    • Left Factoring
  – Bottom Up Parsing
    • Shift-reduce Parsing
    • LR(k) Parsing
• How to write parser?
135
[Figure: the front end of a compiler. The lexical analyzer reads the source program one character at a time ("get next char" / "next char") and hands the syntax analyzer one token at a time ("get next token" / "next token"). Both use the symbol table, which contains a record for each identifier.]
1. Lexical analyzer: uses regular expressions to define tokens and finite automata to recognize tokens.
2. Syntax analyzer: uses top-down parsing or bottom-up parsing to construct a parse tree.
136
How to write a parser? Yacc
137
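Yacc
[Figure: phases of a compiler and the tools that generate them. Source program → lexical analysis (generated by Lex from the token description) → syntax analysis (generated by Yacc from the language grammar) → intermediate representation → code generation → target program.]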
138
How to write an LR parser?
General approach: the construction is done automatically by a tool such as the Unix program yacc.
1. Use the grammar of the source language to write a simple yacc program and save it in a file named name.y.
2. Use the Unix program yacc to compile name.y, which produces a C (parser) program named y.tab.c.
3. Compile and link the C program y.tab.c in the normal way to obtain the required parser.
139
LR parser generators
Yacc: Yet another compiler compiler
• Automatically generates LALR parsers
• Created by S.C. Johnson in the 1970s
140
Using Yacc
[Figure: Yacc source program (filename.y) → Yacc compiler → y.tab.c → C compiler → a.out. The resulting a.out is the parser: input tokens → a.out → parse tree.]
141
Yacc
[Figure: lexer spec → LEX → .c → C compiler → lexical analyzer; parser spec → Yacc → .c → C compiler → parser. At run time: source program → lexical analyzer → tokens → parser.]
142
Yacc Example
[Figure: the same pipeline applied to the input "tomatoes + potatoes + carrots". The lexical analyzer produces the tokens id1, PLUS, id2, PLUS, id3, and the parser builds a parse tree with two + nodes over the leaves id1, id2, and id3.]
143
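How to write parser
[Figure: source program → Scanner → token → Parser, with both using the symbol table. The Scanner is lex.yy.c, generated by Lex from the Lex spec (.l); the Parser is y.tab.c, generated by Yacc from the yacc spec (.y).]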
145
How to write a yacc program
myfile.y
  %{
    < C global variables, prototypes, comments >
  %}
  [DEFINITION SECTION]
  %%
  [PRODUCTION RULES SECTION]
  %%
  < C auxiliary subroutines >

• %{ ... %}: this part will be embedded into myfile.tab.c.
• Definition section: contains token declarations. Tokens are recognized in the lexer.
• Production rules section: defines how to "understand" the input language, and what actions to take for each "sentence".
• Auxiliary subroutines: any user code, for example a main function that calls the parser function yyparse().
146
Example: PRODUCTION RULES SECTION
Grammar:
  statement → expression
  expression → expression + expression
             | expression - expression
             | expression * expression
             | expression / expression
             | NUMBER
Yacc rules:
  statement : expression { printf(" = %g\n", $1); }
            ;
  expression : expression '+' expression { $$ = $1 + $3; }
             | expression '-' expression { $$ = $1 - $3; }
             | expression '*' expression { $$ = $1 * $3; }
             | expression '/' expression { $$ = $1 / $3; }
             | NUMBER { $$ = $1; }
             ;