Published by Peter Morrison. Modified over 9 years ago.
1
Unit-3 Parsing Theory (Syntax Analyzer)
PREPARED BY: PROF. HARISH I RATHOD
COMPUTER ENGINEERING DEPARTMENT
GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE
COMPILER DESIGN (170701)
2
Introduction
Syntax analysis is the second phase of a compiler, following lexical analysis.
It checks the syntax of the source language.
It takes the tokens from the lexical analyzer and groups them so that programming structures can be recognized.
GPERI – CD - UNIT-3
3
Introduction
If, after grouping the tokens, some construct cannot be recognized, a syntax error is generated.
The syntax analyzer is a major component of the front end of a compiler.
The syntax of a programming language is specified with a notation called context-free grammar.
4
Role of the parser
It obtains a string of tokens from the lexical analyzer.
It groups the tokens to identify larger structures in the program.
It should report any syntax error in the program.
It should recover from the error so that it can continue to process the rest of the input.
5
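Role of the parser
[Figure: the source program enters the lexical analyzer, which returns a token to the parser on each getNextToken request. The parser produces a parse tree or reports a syntax error. Both the lexical analyzer and the parser use the symbol table.]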
6
Context-Free Grammar
A grammar involves four quantities: terminals, non-terminals, a start symbol and productions.
One non-terminal is selected as the start symbol.
Each production consists of a non-terminal, followed by an arrow (->) or (:=), followed by a string of non-terminals and terminals.
7
Context-Free Grammar
A context-free grammar (CFG) is defined as a 4-tuple (VN, ∑, P, S), where:
VN = set of non-terminals
∑ = set of terminals
S = the start symbol
P = set of production rules, each of the form
    one non-terminal -> finite string of terminals and/or non-terminals
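The 4-tuple can be written down directly as a small data structure. The following is a minimal sketch, not part of the original slides: the class name `Grammar`, its field names and the `is_well_formed` check are illustrative assumptions, and the sample grammar is S -> a S b | a b.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """Hypothetical container for a CFG (VN, Sigma, P, S)."""
    nonterminals: frozenset  # VN
    terminals: frozenset     # Sigma
    productions: tuple       # P: (head, body) pairs, body is a tuple of symbols
    start: str               # S

    def is_well_formed(self):
        """Check the conditions stated in the definition."""
        if self.start not in self.nonterminals:
            return False
        if self.nonterminals & self.terminals:
            return False  # the two symbol sets must be disjoint
        symbols = self.nonterminals | self.terminals
        return all(
            head in self.nonterminals and all(s in symbols for s in body)
            for head, body in self.productions
        )


# Sample grammar: S -> a S b | a b
G = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=(("S", ("a", "S", "b")), ("S", ("a", "b"))),
    start="S",
)
print(G.is_well_formed())
```

Keeping each production as a (head, body) pair mirrors the rule that the left side is exactly one non-terminal, which is what makes the grammar context-free.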
8
Context-Free Grammar
Example:
    stmt -> if ( expr ) stmt else stmt
Where:
Non-terminals: stmt, expr
Terminals: if, (, ), else
Start symbol: stmt
9
Context-Free Grammar
Example:
    expression -> expression + term
    expression -> expression - term
    expression -> term
    term -> term * factor
    term -> term / factor
    term -> factor
10
Context-Free Grammar
Example (continued):
    factor -> ( expression )
    factor -> id
11
Context-Free Grammar
Notational conventions, terminal symbols:
Lowercase letters such as a, b, c.
Operator symbols such as +, *, -, /.
Punctuation symbols such as parentheses and commas.
The digits 0, 1, ..., 9.
Boldface strings such as id or if, each of which represents a single terminal symbol.
12
Context-Free Grammar
Notational conventions, non-terminal symbols:
Uppercase letters such as A, B, C.
The letter S, which, when it appears, is usually the start symbol.
Lowercase italic names such as expr or stmt.
13
Derivation
The construction of a parse tree can be made precise by taking a derivational view, in which productions are treated as rewriting rules.
Beginning with the start symbol, each rewriting step replaces a non-terminal by the body of one of its productions.
Example grammar:
    E -> E + E | E * E | - E | ( E ) | id
14
Derivation
    list -> list + digit
    list -> list - digit
    list -> digit
    digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
15
Derivation
    list => list + digit
         => list - digit + digit
         => digit - digit + digit
         => 9 - digit + digit
         => 9 - 5 + digit
         => 9 - 5 + 2
16
Derivation
This is an example of a leftmost derivation, because the leftmost non-terminal is replaced in each step.
Likewise, a rightmost derivation replaces the rightmost non-terminal in each step.
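A leftmost derivation can be replayed mechanically. The sketch below is illustrative only: it encodes the list/digit grammar as a Python dictionary and reproduces the derivation of 9 - 5 + 2. The names `GRAMMAR`, `leftmost_step` and `derive` are assumptions made for this example.

```python
# list -> list + digit | list - digit | digit ;  digit -> 0 | 1 | ... | 9
GRAMMAR = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[str(d)] for d in range(10)],
}


def leftmost_step(form, body):
    """Rewrite the leftmost non-terminal of `form` using production body `body`."""
    for i, sym in enumerate(form):
        if sym in GRAMMAR:  # first non-terminal found from the left
            if body not in GRAMMAR[sym]:
                raise ValueError(f"{sym} -> {' '.join(body)} is not a production")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")


def derive(start, bodies):
    """Apply the chosen production bodies one after another, leftmost first."""
    forms = [[start]]
    for body in bodies:
        forms.append(leftmost_step(forms[-1], body))
    return [" ".join(form) for form in forms]


steps = derive("list", [
    ["list", "+", "digit"],
    ["list", "-", "digit"],
    ["digit"],
    ["9"],
    ["5"],
    ["2"],
])
print(" => ".join(steps))
```

Changing the loop in `leftmost_step` to scan from the right would give the corresponding rightmost derivation, with a different order of production choices.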
17
Derivation
Construct a CFG for the language L = {w c wR : w ∈ {a, b}*}, where wR is the reverse of w.
Sol: G = (VN, ∑, P, S)
Here, VN = {S}, ∑ = {a, b, c}
The production rules P are defined as:
    S -> a S a
    S -> b S b
    S -> c
18
Parse Tree
A string generated by a context-free grammar can be represented by a hierarchical structure called a tree.
Trees that represent derivations are called derivation trees, parse trees or syntax trees.
19
Parse Tree
Characteristics of a parse tree:
The root of the tree is labeled by the start symbol.
Each leaf of the tree is labeled by a terminal (a token or ϵ).
Each interior node is labeled by a non-terminal.
If A -> X1 X2 ... Xn is a production, then node A has immediate children X1, X2, ..., Xn, where each Xi is a terminal, a non-terminal or ϵ (ϵ denotes the empty string).
20
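Parse Tree - Example
[Figure: parse tree for 9 - 5 + 2 using the list/digit grammar. The root list has children list, + and digit (2); the inner list has children list, - and digit (5); the innermost list has the single child digit (9).]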
21
Exercise
Write a CFG that generates strings having an equal number of a's and b's.
Sol: CFG G = (VN, ∑, P, S) where VN = {S}, ∑ = {a, b}
P is defined as:
    S -> aSb
    S -> bSa
    S -> SS
    S -> ϵ
The rule S -> SS is needed for strings such as abba, which begin and end with the same letter.
22
Exercise
Construct a CFG for the language L = {a^n b^n : n >= 1}.
Sol: CFG G = (VN, ∑, P, S) where VN = {S}, ∑ = {a, b}
P is defined as:
    S -> aSb
    S -> ab
23
Exercise
Write a CFG that generates strings of balanced parentheses.
Sol: The grammar accepts strings in which left and right parentheses are balanced, e.g. (), ((( ))).
CFG G = (VN, ∑, P, S) where VN = {S}, ∑ = { (, ) }
P is given by:
    S -> SS
    S -> (S)
    S -> ϵ
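Membership in the language of balanced parentheses can be checked without building a parse tree. The sketch below is an assumption-labeled shortcut, not a parser for the grammar itself: it tracks the nesting depth, which accepts exactly the strings derivable from S -> SS | (S) | ϵ. The function name `is_balanced` is hypothetical.

```python
def is_balanced(s):
    """Return True if s consists only of parentheses and they are balanced."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its matching '('
                return False
        else:
            return False  # symbol outside the terminal set { (, ) }
    return depth == 0  # every '(' must have been closed


for sample in ["()", "((()))", "()()", ")(", "(()"]:
    print(repr(sample), is_balanced(sample))
```

The empty string is accepted, matching the production S -> ϵ.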
24
Exercise
A CFG is given by the productions:
    S -> a | a A S
    A -> b S
Obtain the derivation tree of the word a b a a b a a.
25
Exercise
Given the grammar G = (VN, ∑, P, S) where VN = {E}, S = E, ∑ = {id, +, *, (, )} and P consists of
    E -> E + E | E * E | (E) | id
Obtain the derivation trees for id * id + id and (id + id) * id.
26
Ambiguity
A grammar is said to be ambiguous if there exists more than one parse tree for the same sentence.
Example:
    S -> aSbS | bSaS | ϵ
The string abab has two different parse trees.
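The claim about abab can be checked by counting parse trees. The following sketch is written specifically for the grammar S -> aSbS | bSaS | ϵ and is not a general parser; the function name `count_parse_trees` is an assumption made for illustration.

```python
from functools import lru_cache


def count_parse_trees(w):
    """Count parse trees of w under S -> aSbS | bSaS | ϵ."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving the substring w[i:j] from S.
        if i == j:
            return 1  # only S -> ϵ derives the empty string
        other = {"a": "b", "b": "a"}.get(w[i])
        if other is None:
            return 0  # symbol outside {a, b}
        total = 0
        # S -> x S y S: w[i] is x; try every position k for the matching y.
        for k in range(i + 1, j):
            if w[k] == other:
                total += count(i + 1, k) * count(k + 1, j)
        return total

    return count(0, len(w))


print(count_parse_trees("ab"))    # 1
print(count_parse_trees("abab"))  # 2
```

A count greater than one for any string shows that the grammar is ambiguous; for abab the two trees differ in whether the first a is matched with the first or the last b.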
27
Ambiguity
A classical example of an ambiguous grammar is the if-then-else construct of many programming languages.
Most languages have both if-then and if-then-else versions of the statement.
The grammar rules for it are as follows:
    stmt -> if condition then stmt else stmt
          | if condition then stmt
28
Ambiguity
Consider the following code segment:
    if a>b then if c>d then x=y else x=z
29
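Ambiguity
Leftmost derivation
[Figure: first parse tree. The root stmt expands to if condition then stmt else stmt with condition a>b and else-branch x=z. The then-branch stmt expands to if condition then stmt with condition c>d and body x=y. The else is attached to the outer if.]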
30
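Ambiguity
Rightmost derivation
[Figure: second parse tree. The root stmt expands to if condition then stmt with condition a>b. The inner stmt expands to if condition then stmt else stmt with condition c>d, then-branch x=y and else-branch x=z. The else is attached to the inner if.]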
31
Eliminating Ambiguity
Ambiguities may be eliminated by rewriting the grammar.
The if-then-else grammar may be rewritten with matched (m_stmt) and unmatched (unm_stmt) statements:
    stmt -> m_stmt | unm_stmt
    m_stmt -> if condition then m_stmt else m_stmt
            | other_stmt
    unm_stmt -> if condition then stmt
              | if condition then m_stmt else unm_stmt
32
Eliminating Ambiguity
Another technique is to modify the language slightly.
Many languages require that an if have a matching endif.
The grammar is then modified as:
    stmt -> if condition then stmt else stmt endif
          | if condition then stmt endif
33
Eliminating Ambiguity
Example grammar:
    E -> I
    E -> E + E
    E -> E * E
    E -> (E)
    I -> a | b | c
This grammar is ambiguous for two reasons:
1. The precedence of operators is not respected.
2. A sequence of identical operators can group either from the left or from the right.
If the grammar is rewritten to fix precedence and associativity, the ambiguity is removed.
34
Eliminating Ambiguity
The unambiguous grammar:
    E -> T | E + T
    T -> F | T * F
    F -> I | (E)
    I -> a | b | c
35
Eliminating Ambiguity
The parse tree for a + b * c under the unambiguous grammar, built up step by step on the original slides:
    E
      E
        T
          F
            I
              a
      +
      T
        T
          F
            I
              b
        *
        F
          I
            c
42
Left Recursion
A grammar is left recursive if it has a non-terminal A such that A derives Aα, in one or more steps, for some string α.
The presence of left recursion creates difficulties when designing parsers.
Types of left recursion:
Immediate left recursion
General left recursion
43
Left Recursion
Immediate left recursion:
It happens when a non-terminal A has a production rule of the form
    A -> Aα
In other words, a production is left recursive if the leftmost symbol on its right side is the same as the non-terminal on its left side.
44
Left Recursion
Immediate left recursion (continued):
It can be eliminated by introducing a new non-terminal symbol, say A'.
The rule
    A -> Aα | β
is replaced by:
    A -> βA'
    A' -> αA' | ϵ
45
Left Recursion
Immediate left recursion (continued):
In general, the rule
    A -> Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
where no βi begins with A, is replaced by:
    A -> β1A' | β2A' | ... | βnA'
    A' -> α1A' | α2A' | ... | αmA' | ϵ
46
Left Recursion
Immediate left recursion (continued):
Example:
    E -> E + T | T
    T -> T * F | F
    F -> (E) | id
After eliminating immediate left recursion:
    E -> TE'
    E' -> +TE' | ϵ
    T -> FT'
    T' -> *FT' | ϵ
    F -> (E) | id
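The transformation for immediate left recursion is mechanical enough to sketch in a few lines. The code below is an illustrative sketch, not a production tool: each production body is a list of symbols, the empty list stands for ϵ, the new non-terminal is named by appending a prime, and the function name `eliminate_immediate` is an assumption. It also assumes every α is non-empty.

```python
def eliminate_immediate(head, bodies):
    """Remove immediate left recursion from the productions of `head`.

    bodies: list of production bodies, each a list of symbols ([] is ϵ).
    Returns a dict mapping each non-terminal to its new list of bodies.
    """
    # A -> A α : keep only the α part of each left-recursive body.
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # A -> β : bodies that do not start with A.
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # nothing to do
    new = head + "'"
    return {
        head: [beta + [new] for beta in betas],             # A  -> β A'
        new: [alpha + [new] for alpha in alphas] + [[]],    # A' -> α A' | ϵ
    }


print(eliminate_immediate("E", [["E", "+", "T"], ["T"]]))
print(eliminate_immediate("T", [["T", "*", "F"], ["F"]]))
print(eliminate_immediate("F", [["(", "E", ")"], ["id"]]))
```

Applied to E and T this yields E -> TE', E' -> +TE' | ϵ and T -> FT', T' -> *FT' | ϵ, while the productions of F are returned unchanged.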
47
Left Recursion
General left recursion:
Even when there is no immediate left recursion, several production rules may act together to give general left recursion.
For example:
    S -> Aa
    A -> Sb | c
Here S is left recursive, because:
    S => Aa => Sba
48
Left Recursion
Algorithm to eliminate left recursion:
1. Arrange the non-terminals in some order, say A1, A2, ..., Am.
2. For i = 1 to m do
     for j = 1 to i-1 do
       replace each production of the form Ai -> Aj γ by
         Ai -> δ1γ | δ2γ | ... | δkγ
       where Aj -> δ1 | δ2 | ... | δk are all the current Aj productions
     eliminate immediate left recursion among the Ai productions.
49
Left Recursion
Example:
    S -> Aa
    A -> Sb | c
The order of non-terminals is S, A.
For i = 1, the rule S -> Aa has no immediate left recursion.
For i = 2, A -> Sb | c is modified to A -> Aab | c, which has immediate left recursion. It is eliminated by rewriting the rule as:
    A -> cA'
    A' -> abA' | ϵ
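The whole procedure can be sketched as a short function. This is a minimal sketch under stated assumptions: the grammar is a dictionary from non-terminal to a list of bodies, each body is a list of symbols with [] standing for ϵ, the input has no cycles and no ϵ-productions, and immediate left recursion is removed for each non-terminal as soon as its substitutions are done. The function name `eliminate_left_recursion` is hypothetical.

```python
def eliminate_left_recursion(grammar, order):
    """Return a copy of `grammar` without left recursion.

    grammar: dict mapping non-terminal -> list of bodies (lists of symbols).
    order:   the chosen ordering A1, A2, ..., Am of the non-terminals.
    """
    g = {head: [list(body) for body in bodies] for head, bodies in grammar.items()}
    for i, ai in enumerate(order):
        # Substitute every earlier non-terminal Aj that starts a body of Ai.
        for aj in order[:i]:
            expanded = []
            for body in g[ai]:
                if body and body[0] == aj:
                    expanded.extend(delta + body[1:] for delta in g[aj])
                else:
                    expanded.append(body)
            g[ai] = expanded
        # Remove immediate left recursion among the Ai productions.
        alphas = [body[1:] for body in g[ai] if body and body[0] == ai]
        if alphas:
            betas = [body for body in g[ai] if not body or body[0] != ai]
            new = ai + "'"
            g[ai] = [beta + [new] for beta in betas]
            g[new] = [alpha + [new] for alpha in alphas] + [[]]
    return g


result = eliminate_left_recursion(
    {"S": [["A", "a"]], "A": [["S", "b"], ["c"]]},
    order=["S", "A"],
)
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) if b else "ϵ" for b in bodies))
```

For the grammar S -> Aa, A -> Sb | c with the order S, A, the result is S -> Aa, A -> cA', A' -> abA' | ϵ. A different ordering of the non-terminals generally produces a different, but equivalent, grammar.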