1
Midterm Content – Excerpted from PPTs
2
Chapter 3: Lexical Analysis and Flex
Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 Storrs, CT (860) Material for course thanks to: Laurent Michel Aggelos Kiayias Robert LeBarre
3
Rules for Specifying Regular Expressions:
ε is a regular expression denoting {ε}.
If a is in Σ, a is a regular expression that denotes {a}.
Let r and s be regular expressions with languages L(r) and L(s). Then:
(a) (r) | (s) is a regular expression denoting L(r) ∪ L(s)
(b) (r)(s) is a regular expression denoting L(r)L(s)
(c) (r)* is a regular expression denoting (L(r))*
(d) (r) is a regular expression denoting L(r)
All are left-associative. Precedence, highest to lowest: *, concatenation, |.
4
Algebraic Properties of Regular Expressions
AXIOM                              DESCRIPTION
r | s = s | r                      | is commutative
r | (s | t) = (r | s) | t          | is associative
(r s) t = r (s t)                  concatenation is associative
r ( s | t ) = r s | r t            concatenation distributes over |
( s | t ) r = s r | t r
εr = r                             ε is the identity element for concatenation
rε = r
r* = ( r | ε )*                    relation between * and ε
r** = r*                           * is idempotent
5
Towards Token Definition
Regular Definitions: Associate names with regular expressions.
For example, PASCAL IDs:
    letter → A | B | C | … | Z | a | b | … | z
    digit → 0 | 1 | 2 | … | 9
    id → letter ( letter | digit )*
Shorthand notation:
    “+” : one or more        r* = r+ | ε        r+ = r r*
    “?” : zero or one
    [range] : set range of characters (replaces “|”)        [A-Z] = A | B | C | … | Z
Using shorthand, PASCAL IDs:
    id → [A-Za-z][A-Za-z0-9]*
We’ll use both techniques.
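The two ways of writing the PASCAL identifier definition can be compared with Python's `re` module. This is only a sketch: the names `ID_SHORTHAND`, `ID_LONGHAND`, and `is_id` are made up for illustration, and Python's regex syntax stands in for the Flex patterns the chapter is about.

```python
import re

# Shorthand form from the slide: id -> [A-Za-z][A-Za-z0-9]*
ID_SHORTHAND = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Longhand form: build "letter" and "digit" as explicit alternations,
# mirroring the regular definitions letter -> A | B | ... and digit -> 0 | 1 | ...
LETTER = "(?:" + "|".join("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz") + ")"
DIGIT = "(?:" + "|".join("0123456789") + ")"
ID_LONGHAND = re.compile(f"{LETTER}(?:{LETTER}|{DIGIT})*")


def is_id(text, pattern=ID_SHORTHAND):
    """True when the whole string matches the identifier pattern."""
    return pattern.fullmatch(text) is not None
```

Both patterns should agree on every input, for example `is_id("x1")` and `is_id("x1", ID_LONGHAND)`.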
6
Epsilon-Transitions Given the regular expression: (a (b*c)) | (a (b |c+)?) Find a transition diagram NFA that recognizes it. Solution ?
7
NFA to DFA Conversion – Main Idea
Look at the states reachable without consuming any input, and aggregate them into macro states.
8
Final Result: A macro state is final if one of the NFA states it contains was final.
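The macro-state idea can be sketched as the usual subset construction. The NFA encoding below (a dict of dicts, with `None` as the ε label) and the tiny NFA for a* are assumptions made for this illustration, not something taken from the slides.

```python
from collections import deque

EPS = None  # label used here for epsilon-moves (an assumption of this sketch)


def eps_closure(nfa, states):
    """All NFA states reachable from `states` without consuming any input."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(nfa, start, finals, alphabet):
    """Subset construction: every DFA state is a frozenset of NFA states."""
    d_start = eps_closure(nfa, {start})
    dfa, work = {}, deque([d_start])
    while work:
        macro = work.popleft()
        if macro in dfa:
            continue
        dfa[macro] = {}
        for sym in alphabet:
            moved = {t for s in macro for t in nfa.get(s, {}).get(sym, ())}
            if moved:
                target = eps_closure(nfa, moved)
                dfa[macro][sym] = target
                work.append(target)
    # A macro state is final if it contains at least one final NFA state.
    d_finals = {m for m in dfa if m & set(finals)}
    return dfa, d_start, d_finals


# Hypothetical NFA for a*: 0 -ε-> 1, 0 -ε-> 3, 1 -a-> 2, 2 -ε-> 1, 2 -ε-> 3
NFA = {0: {EPS: [1, 3]}, 1: {"a": [2]}, 2: {EPS: [1, 3]}}
DFA, D_START, D_FINALS = nfa_to_dfa(NFA, start=0, finals={3}, alphabet="a")
```

Symbols with no entry in a macro state's row are simply missing here; a full DFA would route them to an explicit error state.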
9
Regular Expression to NFA Construction
We now focus on transforming a Reg. Expr. to an NFA This construction allows us to take: Regular Expressions (which describe tokens) To an NFA (to characterize language) To a DFA (which can be computerized) The construction process is componentwise Builds NFA from components of the regular expression in a special order with particular techniques. NOTE: Construction is syntax-directed translation, i.e., syntax of regular expression is determining factor for NFA construction and structure.
10
Construct NFA Solutions
[Figure: transition diagrams for the NFAs of ε, a, b, and ab]
11
Construct NFA Solutions
[Figure: transition diagrams for the NFAs of ε | ab, a*, and ( ε | ab )*]
12
Construction Algorithm : R.E. NFA
Construction Process:
1st: Identify subexpressions of the regular expression: ε, symbols, r | s, rs, r*
2nd: Characterize “pieces” of NFA for each subexpression
13
Piecing Together NFAs
1. For ε in the regular expression, construct the NFA accepting L(ε): start → i, with an ε-move from i to f
2. For a ∈ Σ in the regular expression, construct the NFA accepting L(a): start → i, with a move on a from i to f
14
Piecing Together NFAs – continued(1)
3.(a) If s, t are regular expressions and N(s), N(t) their NFAs, then s | t has an NFA accepting L(s) ∪ L(t):
[Figure: start → i, ε-moves from i into N(s) and N(t), ε-moves from N(s) and N(t) into f]
where i and f are new start / final states, and ε-moves are introduced from i to the old start states of N(s) and N(t) as well as from all of their final states to f.
15
Piecing Together NFAs – continued(2)
3.(b) If s, t are regular expressions and N(s), N(t) their NFAs, then st (concatenation) has an NFA accepting L(s) L(t):
[Figure: start → i, N(s) followed by N(t), ending in f]
Alternative: overlap.
Here i is the start state of N(s) (or new under the alternative) and f is the final state of N(t) (or new). Overlap maps final states of N(s) to the start state of N(t).
16
Piecing Together NFAs – continued(3)
3.(c) If s is a regular expression and N(s) its NFA, then s* (Kleene star) has the NFA:
[Figure: start → i, N(s), f]
where:
    i is the new start state and f is the new final state
    ε-move i to f (to accept the null string)
    ε-moves i to old start, old final(s) to f
    ε-move old final to old start (WHY?)
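The pieces above can be assembled in code. The sketch below is one possible encoding, not the course's: an NFA is a `(start, final, edges)` triple, `None` stands for ε, concatenation uses an ε-move rather than the overlap alternative, and `accepts` is a helper added only so the result can be exercised. The sample expression (ab*c) | (a(b|c*)) is the one used as a worked example in these slides.

```python
import itertools

EPS = None                 # epsilon label (a convention of this sketch)
_ids = itertools.count()   # supplies fresh state numbers


def symbol(a):
    """Rules 1 and 2: start i --a--> f (pass a=EPS for the epsilon NFA)."""
    i, f = next(_ids), next(_ids)
    return i, f, [(i, a, f)]


def union(s, t):
    """Rule 3(a): new i and f, with epsilon-moves into and out of N(s), N(t)."""
    (si, sf, se), (ti, tf, te) = s, t
    i, f = next(_ids), next(_ids)
    return i, f, se + te + [(i, EPS, si), (i, EPS, ti), (sf, EPS, f), (tf, EPS, f)]


def concat(s, t):
    """Rule 3(b): join the final state of N(s) to the start state of N(t)."""
    (si, sf, se), (ti, tf, te) = s, t
    return si, tf, se + te + [(sf, EPS, ti)]


def star(s):
    """Rule 3(c): i->f, i->old start, old final->f, old final->old start."""
    si, sf, se = s
    i, f = next(_ids), next(_ids)
    return i, f, se + [(i, EPS, f), (i, EPS, si), (sf, EPS, f), (sf, EPS, si)]


def accepts(nfa, text):
    """Simulate the NFA by tracking the set of currently reachable states."""
    start, final, edges = nfa

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, lab, dst in edges:
                if src == q and lab is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({d for s, lab, d in edges if s in current and lab == ch})
    return final in current


# (a b* c) | (a (b | c*))
left = concat(symbol("a"), concat(star(symbol("b")), symbol("c")))
right = concat(symbol("a"), union(symbol("b"), star(symbol("c"))))
example = union(left, right)
```

The ε-move from the old final state back to the old start state in `star` is what lets the machine read the starred piece more than once.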
17
Detailed Example
See example 3.16 in textbook for (a | b)*abb
2nd Example: (ab*c) | (a(b|c*))
Parse tree for this regular expression:
[Figure: parse tree with nodes r0 through r13 over the symbols a, b, c and the operators *, |, ( )]
What is the NFA? Let’s construct it!
18
Detailed Example – Construction(1)
r1 : b*    r2 : c
[Figure: NFAs for r1 and r2]
19
Detailed Example – Construction(2)
r4 : r1 r2
r5 : r3 r4
[Figure: NFAs for r4 (b*c) and r5 (ab*c)]
20
Detailed Example – Construction(3)
r6 : c    r7 : b    r8 : c*    r11 : a
[Figure: NFAs for r6, r7, r8, and r11]
21
Detailed Example – Construction(4)
r9 : r7 | r8
r10 : r9
r12 : r11 r10
[Figure: NFAs for r9, r10, and r12]
22
Organized by Portion of Parse Tree
r1 : b*
r4 : r1 r2
r5 : r3 r4
[Figure: NFAs for the left portion of the parse tree, ab*c]
23
Organized by Portion of Parse Tree
r6 : c    r7 : b    r8 : c*    r11 : a
r9 : r7 | r8
r10 : r9
r12 : r11 r10
[Figure: NFAs for the right portion of the parse tree, a(b|c*)]
24
Detailed Example – Final Step
r13 : r5 | r12
[Figure: the complete NFA, with states numbered 1 to 17 and transitions on a, b, and c]
25
Converting NFA to DFA – 1st Look
[Figure: NFA with states 0 to 8 and transitions on a, b, and c]
From state 0, where can we move without consuming any input?
This forms a new state: 0,1,2,6,8
What transitions are defined for this new state?
26
Showing the Steps
[Figure: DFA under construction with macro states {0, 1, 2, 6, 8}, {3}, {1, 2, 5, 6, 7, 8}, and {1, 2, 4, 5, 6, 8}, connected by transitions on a, b, and c]
27
The Resulting DFA
[Figure: DFA over the macro states {0, 1, 2, 6, 8}, {3}, {1, 2, 5, 6, 7, 8}, and {1, 2, 4, 5, 6, 8}, relabeled A, B, C, D, with transitions on a, b, and c]
Which states are FINAL states?
How do we handle alphabet symbols not defined for A, B, C, D?
28
Chapter 4: Syntax Analysis Part 1: Grammar Concepts
29
Context Free Grammars : Concepts & Terminology
Definition: A Context Free Grammar, CFG, is described by T, NT, S, PR, where:
T: Terminals / tokens of the language
NT: Non-terminals to denote sets of strings generatable by the grammar & in the language
S: Start symbol, S ∈ NT, which defines all strings of the language
PR: Production rules to indicate how T and NT are combined to generate valid strings of the language.
    PR: NT → (T | NT)*
Like a Regular Expression / DFA / NFA, a Context Free Grammar is a mathematical model!
30
Context Free Grammars : A First Look
assign_stmt → id := expr ;
expr → term operator term
term → id
term → real
term → integer
operator → +
operator → -
What do “BLUE” symbols represent? What do “BLACK” symbols represent?
Derivation: A sequence of grammar rule applications and substitutions that transform a starting non-terminal into a collection of terminals / tokens.
Simply stated: Grammars / production rules allow us to “rewrite” and “identify” correct syntax.
31
Example Grammar
expr → expr op expr
expr → ( expr )
expr → - expr
expr → id
op → +
op → -
op → *
op → /
op → ↑
Black : NT    Blue : T    expr : S    9 production rules
To simplify / standardize notation, we offer a synopsis of terminology.
32
Leftmost and Rightmost Derivations
Leftmost: Replace the leftmost non-terminal symbol
    E ⇒lm E A E ⇒lm id A E ⇒lm id * E ⇒lm id * id
Rightmost: Replace the rightmost non-terminal symbol
    E ⇒rm E A E ⇒rm E A id ⇒rm E * id ⇒rm id * id
Important Notes: for a production A → δ
    If βAγ ⇒lm βδγ, what’s true about β?
    If βAγ ⇒rm βδγ, what’s true about γ?
Derivations: Actions to parse input can be represented pictorially in a parse tree.
33
Examples of LM Derivations
E → E A E | ( E ) | -E | id
A → + | - | * | / | ↑
A leftmost derivation of id + id * id:
    E ⇒ E A E ⇒ E A E A E ⇒ id A E A E ⇒ id + E A E ⇒ id + id A E ⇒ id + id * E ⇒ id + id * id
Another leftmost derivation of id + id * id:
    E ⇒ E A E ⇒ id A E ⇒ id + E ⇒ id + E A E ⇒ id + id A E ⇒ id + id * E ⇒ id + id * id
34
Examples of RM Derivations
E → E A E | ( E ) | -E | id
A → + | - | * | / | ↑
A rightmost derivation of id + id * id:
    E ⇒ E A E ⇒ E A E A E ⇒ E A E A id ⇒ E A E * id ⇒ E A id * id ⇒ E + id * id ⇒ id + id * id
Another rightmost derivation of id + id * id:
    E ⇒ E A E ⇒ E A id ⇒ E * id ⇒ E A E * id ⇒ E A id * id ⇒ E + id * id ⇒ id + id * id
35
Resolving Grammar Problems/Difficulties
How are Regular Expressions and Context Free Grammars related?
What are possible grammar problems?
    Ambiguous Grammars: Where does the else match?
    Left Recursion in Grammar: How do you know when to stop a derivation?
    Epsilon Moves (ε-Moves): How do empty moves impact grammars?
    Cycles: What if the grammar cycles back to an earlier rule?
    Left Factoring: What if two grammar rules are too similar?
36
Resolving Grammar Difficulties: Motivation
Humans write / develop grammars.
Different parsing approaches have different needs: LL(k), Recursive, LR(k), LALR(k); Top-Down vs. Bottom-Up.
For 1: remove “errors”. For 2: put / redesign grammar.
Grammar problems: ambiguity, left recursion, ε-moves, cycles, left factoring.
37
Resolving Problems: Ambiguous Grammars
Consider the following grammar segment:
    stmt → if expr then stmt
         | if expr then stmt else stmt
         | other (any other statement)
What’s the problem here?
Consider the program: if e1 then if e2 then s1 else s2
Else must match to previous then. Structure indicates parse subtree for expression.
38
Resulting Parse Tree
Easy case: if e1 then s1 else if e2 then s2 else s3
Else must match to previous then.
39
Example : What Happens with this string?
if E1 then if E2 then S1 else S2
How is this parsed?
    if E1 then ( if E2 then S1 else S2 )
vs.
    if E1 then ( if E2 then S1 ) else S2
What’s the issue here?
40
Parse Trees for Example
if e1 then if e2 then s1 else s2 Form 1: Form 2: What’s the issue here ?
41
Removing Ambiguity
Take original grammar:
    stmt → if expr then stmt
         | if expr then stmt else stmt
         | other (any other statement)
Revise to remove ambiguity:
    stmt → matched_stmt | unmatched_stmt
    matched_stmt → if expr then matched_stmt else matched_stmt | other
    unmatched_stmt → if expr then stmt
                   | if expr then matched_stmt else unmatched_stmt
How does this grammar work?
42
How does this grammar work ?
matched_stmt:
    Forces the else to match to the if
    The then and else can also be matched_stmt
    Else always matches the previous then regardless of the nesting
unmatched_stmt:
    Allows a link to stmt (matched/unmatched)
    Second option allows matched or unmatched
stmt → matched_stmt | unmatched_stmt
matched_stmt → if expr then matched_stmt else matched_stmt | other
unmatched_stmt → if expr then stmt
               | if expr then matched_stmt else unmatched_stmt
43
Resolving Difficulties : Left Recursion
A left recursive grammar has rules that support the derivation A ⇒+ Aα, for some α.
Top-Down parsing can’t reconcile this type of grammar, since it could consistently make a choice which wouldn’t allow termination. We don’t know when to choose β!
    A ⇒ Aα ⇒ Aαα ⇒ Aααα … etc., for A → Aα | β
Transform the left recursive grammar:
    A → Aα | β
to the following:
    A → βA’
    A’ → αA’ | ε
44
Why is Left Recursion a Problem ?
Input: id + id + id
Consider:
    E → E + T | T
    T → T * F | F
    F → ( E ) | id
How do you pick the right one?
    E ⇒ E + T ⇒ T + T
    E ⇒ E + T ⇒ E + T + T ⇒ T + T + T
    E ⇒ E + T ⇒ E + T + T ⇒ E + T + T + T
What’s the problem here?
46
Resolving Difficulties : Left Recursion
Informal Discussion:
Take all productions for A and order as:
    A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
where no βi begins with A.
Now apply concepts of previous slide:
    A → β1A’ | β2A’ | … | βnA’
    A’ → α1A’ | α2A’ | … | αmA’ | ε
For our example, each rule has the form A → Aα1 | β1:
    E → E + T | T
    T → T * F | F
47
Resolving Difficulties : Left Recursion
For: A → Aα | β
Apply: A → βA’
       A’ → αA’ | ε
For our example (A → Aα1 | β1):
    E → E + T | T
    T → T * F | F
becomes:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
48
Left Recursion Summary
This only works for direct (obvious, can see) left recursion:
Take all productions for A and order as:
    A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
where no βi begins with A.
Now apply concepts of previous slide:
    A → β1A’ | β2A’ | … | βnA’
    A’ → α1A’ | α2A’ | … | αmA’ | ε
For our example:
    E → E + T | T
    T → T * F | F
    F → ( E ) | id
becomes:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
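The transformation for direct left recursion can be sketched in a few lines. The grammar encoding (a dict from non-terminal to a list of productions, each a list of symbols, with `[]` standing for ε) and the function name are assumptions of this sketch.

```python
def remove_immediate_left_recursion(nt, productions):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn
    as A -> b1 A' | ... | bn A' and A' -> a1 A' | ... | am A' | epsilon."""
    alphas = [p[1:] for p in productions if p and p[0] == nt]
    betas = [p for p in productions if not p or p[0] != nt]
    if not alphas:                       # no direct left recursion: keep as is
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],   # [] is epsilon
    }


GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

RESULT = {}
for head, prods in GRAMMAR.items():
    RESULT.update(remove_immediate_left_recursion(head, prods))
```

On the expression grammar this yields the E, E', T, T', F rules shown in the summary above.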
49
Resolving Difficulties : Left Recursion
Problem: If left recursion is two-or-more levels deep, this isn’t enough.
For the grammar below: Where is direct left recursion? What can happen in a derivation of adcca?
    S → Aa | b
    A → Ac | Sd | ε
    S ⇒ Aa        (S → Aa)
      ⇒ Aca       (A → Ac)
      ⇒ Acca      (A → Ac)
      ⇒ Sdcca     (A → Sd)
      ⇒ Aadcca    (S → Aa)
      ⇒ adcca     (A → ε)
50
Resolving Difficulties : Left Recursion
Problem: If left recursion is two-or-more levels deep, this isn’t enough.
    S → Aa | b
    A → Ac | Sd | ε
    S ⇒ Aa ⇒ Aca ⇒ Acca ⇒ Sdcca
Algorithm:
Input: Grammar G with no cycles or ε-productions
Output: An equivalent grammar with no left recursion
Arrange the non-terminals in some order A1, A2, …, An
for i := 1 to n do begin
    for j := 1 to i – 1 do begin
        replace each production of the form Ai → Ajγ
        by the productions Ai → δ1γ | δ2γ | … | δkγ
        where Aj → δ1 | δ2 | … | δk are all current Aj productions;
    end { j for loop }
    eliminate the immediate left recursion among Ai productions
end { i for loop }
51
First –Rename Non-Terminals and Set Values of two for Loops
Original Grammar:
    S → Aa | b
    A → Ac | Sd | ε
Renamed Grammar:
    A1 → A2a | b
    A2 → A2c | A1d | ε
General Algorithm for Loops (n = number of NTs):
    for i := 1 to n do begin
        for j := 1 to i – 1 do
Specific Algorithm for Loops:
    for i := 1 to 2 do begin
52
Using the Algorithm
Apply the algorithm to:
    A1 → A2a | b
    A2 → A2c | A1d | ε
i = 1:
    j loop is j = 1 to 0, so it doesn’t execute
    eliminate the immediate left recursion among A1 productions: for A1 there is no left recursion
i = 2:
    for j = 1 to 1 do
    Take productions A2 → A1γ and replace with A2 → δ1γ | δ2γ | … | δkγ
    where A1 → δ1 | δ2 | … | δk are the A1 productions
    In our case, in A2 → A1d replace A1 using A1 → A2a | b, so A2 → A1d becomes A2 → A2ad | bd
What’s left:
    A1 → A2a | b
    A2 → A2c | A2ad | bd | ε
Are we done?
53
Using the Algorithm
No! We must now remove A2 direct left recursion!
    A1 → A2a | b
    A2 → A2c | A2ad | bd | ε
Recall:
    A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
    A → β1A’ | β2A’ | … | βnA’
    A’ → α1A’ | α2A’ | … | αmA’ | ε
Apply to the above case. What do you get?
    A1 → A2a | b
    A2 → bd A2’ | A2’
    A2’ → c A2’ | ad A2’ | ε
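The two nested loops can be written out as a short function. This is a sketch under the same assumed encoding as before (productions as lists of symbols, `[]` for ε); the function name and the `order` parameter are illustrative, and the ε-production in the sample grammar is handled here only because it happens not to interfere.

```python
def eliminate_left_recursion(grammar, order):
    """grammar: dict non-terminal -> list of productions ([] = epsilon).
    order: the non-terminals A1..An in the order used by the algorithm."""
    g = {nt: [list(p) for p in prods] for nt, prods in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:
                    # Ai -> Aj gamma becomes Ai -> delta gamma for every Aj -> delta
                    expanded.extend(delta + p[1:] for delta in g[aj])
                else:
                    expanded.append(p)
            g[ai] = expanded
        # eliminate the immediate left recursion among the Ai productions
        alphas = [p[1:] for p in g[ai] if p and p[0] == ai]
        if alphas:
            betas = [p for p in g[ai] if not p or p[0] != ai]
            new_nt = ai + "'"
            g[ai] = [beta + [new_nt] for beta in betas]
            g[new_nt] = [alpha + [new_nt] for alpha in alphas] + [[]]
    return g


GRAMMAR = {
    "S": [["A", "a"], ["b"]],
    "A": [["A", "c"], ["S", "d"], []],
}
RESULT = eliminate_left_recursion(GRAMMAR, order=["S", "A"])
```

With S as A1 and A as A2, `RESULT` holds S → A a | b, A → b d A' | A', and A' → c A' | a d A' | ε.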
54
Apply Algorithm to Revised Earlier Ex
E → E + T | T
T → T * F | F
F → E | ( E ) | id
Renamed:
    A1 → A1 + A2 | A2
    A2 → A2 * A3 | A3
    A3 → A1 | ( A1 ) | id
i = 1:
    j loop is j = 1 to 0, so it doesn’t execute
    eliminate the immediate left recursion among A1 productions:
        A1 → A2 A1’
        A1’ → + A2 A1’ | ε
Grammar is now:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A2 * A3 | A3
    A3 → A1 | ( A1 ) | id
55
Apply Algorithm to Revised Earlier Ex
E → E + T | T
T → T * F | F
F → E | ( E ) | id
Renamed:
    A1 → A1 + A2 | A2
    A2 → A2 * A3 | A3
    A3 → A1 | ( A1 ) | id
i = 2:
    j loop is j = 1 to 1, so it executes one time
    Does A2 → A1γ? No, so don’t do anything
    eliminate the immediate left recursion among A2 productions:
        A2 → A3 A2’
        A2’ → * A3 A2’ | ε
Grammar is now:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A1 | ( A1 ) | id
56
Apply Algorithm to Revised Earlier Ex
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A1 | ( A1 ) | id
i = 3:
    j loop is j = 1 to 2, so it executes two times
    j = 1: Does A3 → A1γ? Yes! Replace A1 with A2 A1’ in A3 → A1:
        A3 → A2 A1’ | ( A1 ) | id
Grammar is now:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A2 A1’ | ( A1 ) | id
57
Apply Algorithm to Revised Earlier Ex
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A2 A1’ | ( A1 ) | id
i = 3:
    j loop is j = 1 to 2, so it executes two times
    j = 2: Does A3 → A2γ? Yes! Replace A2 with A3 A2’ in A3 → A2 A1’:
        A3 → A3 A2’ A1’ | ( A1 ) | id
Grammar is now:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A3 A2’ A1’ | ( A1 ) | id
Next: eliminate the immediate left recursion among A3 productions
58
Apply Algorithm to Revised Earlier Ex
eliminate the immediate left recursion among A3 productions:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → A3 A2’ A1’ | ( A1 ) | id
gives:
    A3 → ( A1 ) A3’ | id A3’
    A3’ → A2’ A1’ A3’ | ε
Final Result:
    A1 → A2 A1’
    A1’ → + A2 A1’ | ε
    A2 → A3 A2’
    A2’ → * A3 A2’ | ε
    A3 → ( A1 ) A3’ | id A3’
    A3’ → A2’ A1’ A3’ | ε
59
Removing Difficulties : ε-Moves, A First Look
Very simple: A → ε and B → uAv implies add B → uv to the grammar G.
Replace A with ε on all r.h.s. rules. Why does this work?
Examples:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
becomes:
    E → TE’ | T
    E’ → + TE’ | + T
    T → FT’ | F
    T’ → * FT’ | * F
    F → ( E ) | id
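A compact way to apply the "add B → uv" idea to every nullable non-terminal at once is sketched below. The encoding (`[]` for ε) and the function name are assumptions; the sketch also drops ε from the start symbol if the start symbol is nullable, so it is only language-preserving when ε is not in the language.

```python
from itertools import product


def remove_epsilon_productions(grammar):
    """grammar: dict non-terminal -> list of productions ([] = epsilon)."""
    # Step 1: find the nullable non-terminals (those that can derive epsilon).
    nullable, changed = set(), True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            if nt not in nullable and any(all(s in nullable for s in p) for p in prods):
                nullable.add(nt)
                changed = True
    # Step 2: for every production, emit each variant in which a nullable
    # symbol is either kept or dropped; discard the empty variant.
    result = {}
    for nt, prods in grammar.items():
        out = []
        for p in prods:
            choices = [((s,), ()) if s in nullable else ((s,),) for s in p]
            for combo in product(*choices):
                variant = [s for part in combo for s in part]
                if variant and variant not in out:
                    out.append(variant)
        result[nt] = out
    return result


GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
RESULT = remove_epsilon_productions(GRAMMAR)
```

For the expression grammar only E' and T' are nullable, so each rule mentioning them gains one extra alternative.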
60
In Class Left Recursion Example Spring 92 Midterm
    S → A A | ε
    A → S B | x
    B → A y | y
for i := 1 to 3 do begin (for S, A, and B)
    i=1: Just look for direct left recursion of S
         eliminate the immediate left recursion among S productions
    i=2: j=1 Does A → Sγ? If so, replace S with all alternatives
         eliminate the immediate left recursion among A productions
    i=3: j=1 Does B → Sγ? If so, replace S with all alternatives
    i=3: j=2 Does B → Aγ? If so, replace A with all alternatives
         eliminate the immediate left recursion among B productions
end { i for loop }
61
In Class Left Recursion Example Spring 92 Midterm
    S → A A | ε
    A → S B | x
    B → A y | y
i=1, no j loop: no direct left recursion in S
    S → A A | ε
    A → S B | x
    B → A y | y
i=2, j=1: A → S B, so replace S with all alternatives
    S → A A | ε
    A → A A B | B | x
    B → A y | y
Remove left recursion in A:
    S → A A | ε
    A → B A’ | x A’
    A’ → A B A’ | ε
    B → A y | y
i=3, j=1: no B → Sγ
i=3, j=2: B → A y, so replace A with all alternatives
    S → A A | ε
    A → B A’ | x A’
    A’ → A B A’ | ε
    B → B A’ y | x A’ y | y
Final: remove left recursion in B
    S → A A | ε
    A → B A’ | x A’
    A’ → A B A’ | ε
    B → x A’ y B’ | y B’
    B’ → A’ y B’ | ε
62
Removing Difficulties : ε-Moves
Very simple: A → ε and B → uAv implies add B → uv to the grammar G.
    A1 → A2 a | b
    A2 → bd A2’ | A2’
    A2’ → c A2’ | ad A2’ | ε
First do: A2’ → ε
Add: A2 → bd | ε        A2’ → c | ad
Revised Grammar:
    A1 → A2 a | b
    A2 → bd A2’ | bd | A2’ | ε
    A2’ → c A2’ | ad A2’ | c | ad
Now do: A2 → ε
Add: A1 → a
Final Grammar:
    A1 → A2 a | b | a
    A2 → bd A2’ | bd | A2’
    A2’ → c A2’ | ad A2’ | c | ad
63
Removing Difficulties : ε-Moves, Fall 91 Midterm
    S → ( L ) | E b | a
    L → c L’ | ( L ) L’ | E b L’ | a L’
    L’ → , S L’ | ε
    E → [ E ] E’ | c L’ c E’ | ( L ) L’ c E’ | a L’ c E’
    E’ → d E’ | b L’ c E’ | ε
Do: L’ → ε
    S → ( L ) | E b | a
    L → c L’ | ( L ) L’ | E b L’ | a L’ | c | ( L ) | E b | a
    L’ → , S L’ | , S
    E → [ E ] E’ | c L’ c E’ | ( L ) L’ c E’ | a L’ c E’ | c c E’ | ( L ) c E’ | a c E’
    E’ → d E’ | b L’ c E’ | b c E’ | ε
Do: E’ → ε
    S → ( L ) | E b | a
    L → c L’ | ( L ) L’ | E b L’ | a L’ | c | ( L ) | E b | a
    L’ → , S L’ | , S
    E → [ E ] E’ | c L’ c E’ | ( L ) L’ c E’ | a L’ c E’ | c c E’ | ( L ) c E’ | a c E’
      | [ E ] | c L’ c | ( L ) L’ c | a L’ c | c c | ( L ) c | a c
    E’ → d E’ | b L’ c E’ | b c E’ | d | b L’ c | b c
64
Removing Difficulties : ε-Moves, Spring 92 Midterm
    S → A A | ε
    A → B A’ | x A’
    A’ → A B A’ | ε
    B → x A’ y B’ | y B’
    B’ → A’ y B’ | ε
Do: A’ → ε
    S → A A | ε
    A → B A’ | x A’ | B | x
    A’ → A B A’ | A B
    B → x A’ y B’ | y B’ | x y B’
    B’ → A’ y B’ | y B’ | ε
Do: B’ → ε
    S → A A | ε
    A → B A’ | x A’ | B | x
    A’ → A B A’ | A B
    B → x A’ y B’ | y B’ | x y B’ | x A’ y | y | x y
    B’ → A’ y B’ | y B’ | A’ y | y
65
Removing Difficulties : Left Factoring
Problem: Uncertain which of 2 rules to choose:
    stmt → if expr then stmt else stmt
         | if expr then stmt
When do you know which one is valid?
What do you notice about the two rules?
66
Removing Difficulties : Left Factoring
How can you fix the problem? Abstract out commonalities and rework the rule.
    stmt → if expr then stmt else stmt
         | if expr then stmt
What’s the general form of stmt?
    A → αβ1 | αβ2
    α : if expr then stmt
    β1 : else stmt
    β2 : ε
Transform to:
    A → αA’
    A’ → β1 | β2
EXAMPLE:
    stmt → if expr then stmt rest
    rest → else stmt | ε
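One round of this rework can be automated by grouping alternatives on their first symbol and pulling out the longest common prefix. The helper names, the primed naming scheme for the new non-terminal, and the list encoding (`[]` for ε) are assumptions of this sketch; a complete left-factoring pass would repeat the step until nothing changes.

```python
def common_prefix(sequences):
    """Longest prefix shared by every sequence in the list."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor_once(nt, productions):
    """A -> alpha b1 | alpha b2 becomes A -> alpha A' and A' -> b1 | b2."""
    groups = {}
    for p in productions:
        groups.setdefault(p[0] if p else None, []).append(p)
    result = {nt: []}
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)          # nothing to factor here
            continue
        prefix = common_prefix(group)
        new_nt = nt + "'" * len(result)       # A', A'', ... (hypothetical naming)
        result[nt].append(prefix + [new_nt])
        result[new_nt] = [p[len(prefix):] for p in group]
    return result


STMT = [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
]
RESULT = left_factor_once("stmt", STMT)
```

Here `stmt'` plays the role of `rest` in the slide's example.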
67
Chapter 4: Syntax Analysis Part 2: Top-Down Parsing
68
LL(1) Grammars for Top-Down Parsing
L : Scan input from Left to Right
L : Construct a Leftmost Derivation
1 : Use “1” input symbol as lookahead in conjunction with stack to decide on the parsing action
LL(1) grammars have no multiply-defined entries in the parsing table.
Properties of LL(1) grammars:
    Grammar can’t be ambiguous or left recursive
    Grammar is LL(1) when, for A → α | β:
    1. α & β do not derive strings starting with the same terminal a
    2. Either α or β can derive ε, but not both.
Note: It may not be possible for a grammar to be manipulated into an LL(1) grammar
69
Non-Recursive / Table Driven
[Figure: the input buffer (string + terminator $) feeds the Predictive Parsing Program, which uses a stack and Parsing Table M[A,a] and produces the output]
Stack: NT + T symbols of CFG, with $ as the empty stack symbol
Parsing Table M[A,a]: what actions the parser should take based on stack / input
General parser behavior (X : top of stack, a : current input):
1. When X = a = $: halt, accept, success
2. When X = a ≠ $: POP X off stack, advance input, go to 1.
3. When X is a non-terminal, examine M[X,a]:
   if it is an error, call recovery routine
   if M[X,a] = {X → UVW}, POP X, PUSH W, V, U (reverse order, so U ends up on top)
   DO NOT expend any input
70
Algorithm for Non-Recursive Parsing
Set ip to point to the first symbol of w$;      (ip is the input pointer)
repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else    /* X is a non-terminal */
        if M[X,a] = X → Y1Y2…Yk then begin
            pop X from stack;
            push Yk, Yk-1, … , Y1 onto stack, with Y1 on top
            output the production X → Y1Y2…Yk
            (may also execute other code based on the production used)
        end
        else error()
until X = $    /* stack is empty */
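The loop above can be tried out directly. The sketch below hard-codes the parsing table for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id as a Python dict; the dict layout, the `parse` name, and the use of `SyntaxError` in place of `error()` are assumptions of this illustration.

```python
# M[X, a] -> right-hand side of the production to apply ([] is epsilon)
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the productions used, in order (a leftmost derivation)."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]            # top of stack is the end of the list
    ip, output = 0, []
    while True:
        x, a = stack[-1], tokens[ip]
        if x == "$" and a == "$":
            return output           # halt, accept
        if x not in NONTERMINALS:   # terminal (or $) on top: must match input
            if x != a:
                raise SyntaxError(f"expected {x!r}, got {a!r}")
            stack.pop()
            ip += 1
        elif (x, a) in TABLE:
            body = TABLE[(x, a)]
            stack.pop()
            stack.extend(reversed(body))   # push Yk ... Y1, leaving Y1 on top
            output.append((x, body))
        else:
            raise SyntaxError(f"no table entry for M[{x}, {a}]")


STEPS = parse(["id", "+", "id", "*", "id"])
```

`STEPS` lists the productions in the order a leftmost derivation of id + id * id would apply them.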
71
Example
Our well-worn example!
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
Table M:
Non-terminal | INPUT SYMBOL
             | id       | +          | *          | (        | )       | $
E            | E→TE’    |            |            | E→TE’    |         |
E’           |          | E’→+TE’    |            |          | E’→ε    | E’→ε
T            | T→FT’    |            |            | T→FT’    |         |
T’           |          | T’→ε       | T’→*FT’    |          | T’→ε    | T’→ε
F            | F→id     |            |            | F→(E)    |         |
72
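Trace of Example
STACK        INPUT              OUTPUT
$E           id + id * id$
$E’T         id + id * id$      E → TE’
$E’T’F       id + id * id$      T → FT’
$E’T’id      id + id * id$      F → id
$E’T’        + id * id$
$E’          + id * id$         T’ → ε
$E’T+        + id * id$         E’ → +TE’
$E’T         id * id$
$E’T’F       id * id$           T → FT’
$E’T’id      id * id$           F → id
$E’T’        * id$
$E’T’F*      * id$              T’ → *FT’
$E’T’F       id$
$E’T’id      id$                F → id
$E’T’        $
$E’          $                  T’ → ε
$            $                  E’ → ε
Rows with no output follow a step that expends input (a terminal is matched and popped).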
73
Leftmost Derivation for the Example
The leftmost derivation for the example is as follows:
E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’ ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’ ⇒ id + id * id E’ ⇒ id + id * id
74
What’s the Missing Puzzle Piece ?
Constructing the Parsing Table M!
1st: Calculate First & Follow for Grammar
2nd: Apply Construction Algorithm for Parsing Table
Conceptual Perspective:
First: Let α be a string of grammar symbols. First(α) is the set of terminals that can appear first in any string derived from α. NOTE: If α ⇒* ε, then ε is in First(α).
Follow: Let A be a non-terminal. Follow(A) is the set of terminals that can appear directly to the right of A in some sentential form (S ⇒* αAaβ, for some α and β). NOTE: If S ⇒* αA, then $ is in Follow(A).
75
Conceptually – What is First?
Calculated for each non-terminal of the grammar: First(E), First(E’), First(T), First(T’), First(F)
What is the first terminal that the non-terminal turns into?
Consider the grammar below:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
What is First(F)?     { ( , id }
What is First(T’)?    { * }
What is First(E’)?    { + }
What is First(T)?     { ( , id }
What is First(E)?     { ( , id }
76
Conceptually – What is First?
So far:
    First(F) = { ( , id }    First(E’) = { + }    First(T’) = { * }
What is unique about the E’ and T’ rules? E’ → ε and T’ → ε, so add ε to the First calculations.
Now, what is First(T)?
    First(T) = First(FT’) = First(F), “+” First(T’) if ε is in First(F)
So finally, what is First(E)?
    First(E) = First(TE’) = First(T), “+” First(E’) if ε is in First(T)
Result:
    First(F) = { ( , id }
    First(E’) = { +, ε }
    First(T’) = { *, ε }
    First(T) = { ( , id }
    First(E) = { ( , id }
77
Computing First(X) : All Grammar Symbols
1. If X is a terminal, First(X) = {X}
2. If X → ε is a production rule, add ε to First(X)
3. If X is a non-terminal, and X → Y1Y2…Yk is a production rule:
   Place First(Y1) – ε in First(X)
   if Y1 ⇒* ε, place First(Y2) – ε in First(X)
   if Y2 ⇒* ε, place First(Y3) – ε in First(X) (ε is in First(Y1) & First(Y2))
   …
   if Yk-1 ⇒* ε, place First(Yk) – ε in First(X) (ε is in First(Y1) to First(Yk-1))
   if ε is in First(Y1) to First(Yk), add ε to First(X)
NOTE: As soon as some Yi does not derive ε, stop.
May repeat 1, 2, and 3 above for each Yj.
78
Computing First(X) : All Grammar Symbols - continued
Informally, suppose we want to compute
    First(X1 X2 … Xn) = First(X1)
                        “+” First(X2) if ε is in First(X1)
                        “+” First(X3) if ε is in First(X2)
                        “+” …
                        “+” First(Xn) if ε is in First(Xn-1)
Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i
Note 2: For First(X1), if X1 → Z1 Z2 … Zm, then we need to compute First(Z1 Z2 … Zm)!
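These rules translate into a small fixed-point computation: keep applying them until no First set grows. The sketch assumes the dict-of-lists grammar encoding used for illustration here (`[]` for ε, any symbol without an entry is a terminal) and writes ε as the string "ε".

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first, grammar):
    """First(X1 X2 ... Xn): move right only while the prefix can vanish."""
    result = set()
    for x in symbols:
        fx = first[x] if x in grammar else {x}   # terminal: First(x) = {x}
        result |= fx - {EPS}
        if EPS not in fx:
            return result                        # Xi cannot derive epsilon: stop
    result.add(EPS)                              # every Xi can derive epsilon
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                               # repeat until nothing is added
        changed = False
        for nt, prods in grammar.items():
            for p in prods:
                new = first_of_string(p, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

An empty production makes `first_of_string` return {ε} straight away, which is how rule 2 is covered.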
79
Conceptually: What is First (E, T, …) in Derivation?
The leftmost derivation for the example is as follows:
INPUT: id + id * id $
E $ ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’ ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’ ⇒ id + id * id E’ ⇒ id + id * id $
First(E,F,T) = { (, id }
First(E’) = { +, ε }
First(T’) = { *, ε }
80
Example: Computing First for:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E): First(TE’) = First(T) “+” First(E’); not First(E’), since T does not derive ε
First(T): First(FT’) = First(F) “+” First(T’); not First(T’), since F does not derive ε
First(F): First((E)) “+” First(id) = “(” and “id”
Overall:
    First(E) = First(T) = First(F) = { ( , id }
    First(E’) = { + , ε }
    First(T’) = { * , ε }
81
Conceptually – What is Follow?
Again, calculated for each non-terminal of the grammar: Follow(E), Follow(E’), Follow(T), Follow(T’), Follow(F)
What is the first terminal that follows the non-terminal in a derivation?
Consider the grammar below and look at where each non-terminal appears on the right hand side of each rule:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
82
Conceptually – What is Follow?
Examine where each non-terminal appears on the right hand side of each rule.
What is Follow(E)? Rule: F → ( E ). So Follow(E) = { ) }
What is Follow(T)? Rules: E → TE’ and E’ → +TE’. So Follow(T) = { + } = First(E’) w/o ε
What is Follow(F)? Rules: T → FT’ and T’ → *FT’. So Follow(F) = { * } = First(T’) w/o ε
First(E) = { ( , id }    First(T) = { ( , id }    First(F) = { ( , id }
First(E’) = { + , ε }    First(T’) = { * , ε }
83
Conceptually – What is Follow? Are we Done Yet?
Some other things to consider:
Assume that the input stream has a final token “$”.
So for the start symbol E, $ follows any possible input, so Follow(E) = { ), $ }
Consider Follow(T) in E → TE’, where First(E’) contains ε:
    Since E → TE’ and ε is in First(E’), place Follow(E) into Follow(T)
    So Follow(T) = { +, ), $ }
First(E) = { ( , id }    First(T) = { ( , id }    First(F) = { ( , id }
First(E’) = { + , ε }    First(T’) = { * , ε }
84
Let’s Explore in Detail
Consider the derivation for input id$:
    E$ ⇒ TE’$ ⇒ FT’E’$ ⇒ id T’E’$ ⇒ id E’$ ⇒ id$
What is in Follow(T) if T’ ⇒ ε and E’ ⇒ ε? Whatever follows E!
In the rule, if E’ ⇒ ε, E → TE’ becomes E → T, and whatever follows T is what follows E!
So what do we add in?
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E,F,T) = { (, id }    First(E’) = { +, ε }    First(T’) = { *, ε }
Follow(E) = { ), $ }    Follow(F) = { * }    Follow(T) = { +, ) }
85
Computing Follow(A) : All Non-Terminals
1. Place $ in Follow(S), where S is the start symbol and $ signals end of input
2. If there is a production A → αBβ, then everything in First(β) is in Follow(B) except for ε.
3. If A → αB is a production, or A → αBβ and β ⇒* ε (First(β) contains ε), then everything in Follow(A) is in Follow(B)
   (Whatever followed A must follow B, since nothing follows B from the production rule)
We’ll calculate Follow for two grammars.
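The three rules can also be run to a fixed point. To keep the sketch short, the First sets for the expression grammar are entered by hand from the values given in these slides; the encoding (`[]` for ε, "ε" as a string) and all function names are assumptions.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# First sets as given for this grammar in the slides.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols):
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})        # terminals are their own First set
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                        # rule 1
    changed = True
    while changed:
        changed = False
        for a, prods in grammar.items():
            for p in prods:
                for k, b in enumerate(p):
                    if b not in grammar:          # only non-terminals get Follow
                        continue
                    rest = first_of_string(p[k + 1:])
                    new = rest - {EPS}            # rule 2
                    if EPS in rest:               # rule 3 (beta empty or nullable)
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, start="E")
```

When B is the last symbol of a production, `first_of_string([])` returns {ε}, so rule 3 applies automatically.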
86
Conceptually: What is Follow in Derivation?
The leftmost derivation for the example is as follows:
INPUT: id + id * id $
E$ ⇒ TE’$ ⇒ FT’E’$ ⇒ id T’E’$ ⇒ id E’$ ⇒ id + TE’$ ⇒ id + FT’E’$ ⇒ id + id T’E’$ ⇒ id + id * FT’E’$ ⇒ id + id * id T’E’$ ⇒ id + id * id E’$ ⇒ id + id * id $
What is Follow(E)? What is Follow(F)? First(T’)
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E,F,T) = { (, id }    First(E’) = { +, ε }    First(T’) = { *, ε }
Follow(E) = { ), $ }    Follow(F) = { * }    Follow(T) = { +, ) }
88
Example: Compute Follow for:
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E) = { ( , id }    First(T) = { ( , id }    First(F) = { ( , id }
First(E’) = { + , ε }    First(T’) = { * , ε }
Follow(E): contains $ since E is the start symbol. Also, since F → (E), First(“)”) is in Follow(E). Thus Follow(E) = { ) , $ }
Follow(E’): E → TE’ implies Follow(E) is in Follow(E’), and Follow(E’) = { ) , $ }. E’ is at the end of the rule, so whatever follows E follows E’.
Follow(T): E → TE’ implies put in First(E’). Since E’ ⇒* ε, put in Follow(E). Since E’ → +TE’, put in First(E’), and since E’ ⇒* ε, put in Follow(E’). Thus Follow(T) = { +, ), $ }.
Follow(T’), Follow(F): You do these!
89
Computing Follow : 2nd Example
Recall:
    S → i E t S S’ | a
    S’ → e S | ε
    E → b
First(S) = { i, a }    First(S’) = { e, ε }    First(E) = { b }
Follow(S): contains $, since S is the start symbol
    Since S → i E t S S’, put in First(S’) (not ε)
    Since S’ ⇒* ε, put in Follow(S)
    Since S’ → eS, put in Follow(S’)
    So Follow(S) = { e, $ }
Follow(S’) = Follow(S). HOW?
Follow(E) = { t }
90
First & Follow – Examine the Derivation
Consider the following derivation:
E ⇒ TE’ ⇒ FT’E’ ⇒ ( E ) T’E’ ⇒ ( TE’ ) T’E’ ⇒ ( FT’E’ ) T’E’ ⇒ ( id T’E’ ) T’E’ ⇒ ( id E’ ) T’E’ ⇒ ( id ) T’E’ ⇒ ( id ) * FT’E’ ⇒ ( id ) * id T’E’ ⇒ ( id ) * id E’ ⇒ ( id ) * id + TE’ ⇒ ( id ) * id + FT’E’ ⇒ ( id ) * id + id T’E’ ⇒* ( id ) * id + id$
91
First & Follow – Examine the Derivation
Consider the following derivation. What’s First for each non-terminal? (Still needs your First(F).)
E ⇒ TE’ ⇒ FT’E’ ⇒ ( E ) T’E’ ⇒ ( TE’ ) T’E’ ⇒ ( FT’E’ ) T’E’ ⇒ ( id T’E’ ) T’E’ ⇒ ( id E’ ) T’E’ ⇒ ( id ) T’E’ ⇒ ( id ) * FT’E’ ⇒ ( id ) * id T’E’ ⇒ ( id ) * id E’ ⇒ ( id ) * id + TE’ ⇒ ( id ) * id + FT’E’ ⇒ ( id ) * id + id T’E’ ⇒* ( id ) * id + id$
Highlighted on the slide: T’ and E’
92
First & Follow – Examine the Derivation
Consider the following derivation. What’s Follow for each non-terminal? (Still needs your Follow(F).)
E ⇒ TE’ ⇒ FT’E’ ⇒ ( E ) T’E’ ⇒ ( TE’ ) T’E’ ⇒ ( FT’E’ ) T’E’ ⇒ ( id T’E’ ) T’E’ ⇒ ( id E’ ) T’E’ ⇒ ( id ) T’E’ ⇒ ( id ) * FT’E’ ⇒ ( id ) * id T’E’ ⇒ ( id ) * id E’ ⇒ ( id ) * id + TE’ ⇒ ( id ) * id + FT’E’ ⇒ ( id ) * id + id T’E’ ⇒* ( id ) * id + id$
Highlighted on the slide: T’ and E’
93
First & Follow – Examine the Derivation
Consider the following derivation. What’s First for each non-terminal? What’s Follow for each non-terminal? (Still needs your First(F) and Follow(F).)
E ⇒ TE’ ⇒ FT’E’ ⇒ ( E ) T’E’ ⇒ ( TE’ ) T’E’ ⇒ ( FT’E’ ) T’E’ ⇒ ( id T’E’ ) T’E’ ⇒ ( id E’ ) T’E’ ⇒ ( id ) T’E’ ⇒ ( id ) * FT’E’ ⇒ ( id ) * id T’E’ ⇒ ( id ) * id E’ ⇒ ( id ) * id + TE’ ⇒ ( id ) * id + FT’E’ ⇒ ( id ) * id + id T’E’ ⇒* ( id ) * id + id$
Highlighted on the slide: T’ and E’
94
In Class Exercise One – First and Follow
    A → B A’
    A’ → o B A’
    A’ → ε
    B → D
    B → ( C )
    C → D C’
    C’ → a D C’
    C’ → ε
    D → x
    D → n x
First(A) = { x, n, ( }    First(A’) = { o, ε }    First(B) = { x, n, ( }
First(C) = { x, n }    First(C’) = { a, ε }    First(D) = { x, n }
Follow(A) = { $ }    Follow(A’) = { $ }    Follow(B) = { $, o }
Follow(C) = { ) }    Follow(C’) = { ) }    Follow(D) = { $, a, o, ) }
95
In Class Exercise Two – First and Follow
Original grammar:
    S → ( L ) | E b | a
    L → L , S | S | c
    E → [ E ] | L c | E d
After removing left recursion:
    S → ( L ) | E b | a
    L → c L’ | ( L ) L’ | E b L’ | a L’
    L’ → , S L’ | ε
    E → [ E ] E’ | c L’ c E’ | ( L ) L’ c E’ | a L’ c E’
    E’ → d E’ | b L’ c E’ | ε
First(S) = { (, a, [, c }    First(L) = { (, a, [, c }    First(E) = { (, a, [, c }
First(E’) = { b, d, ε }    First(L’) = { “,”, ε }
Follow(S) = { $ }    Follow(L) = { c, “,”, ) }    Follow(E) = { b, d, ] }
Follow(L’) = { c, “,”, ) }    Follow(E’) = { b, d, ] }
96
In Class Exercise Three– First and Follow
    S → A A | ε
    A → B A’ | z A’
    A’ → A B A’ | ε
    B → x A’ y B’ | y B’
    B’ → A’ y B’ | ε
First(S) = { x, y, z, ε }    First(A) = { x, y, z }    First(A’) = { x, y, z, ε }
First(B) = { x, y }    First(B’) = { x, y, z, ε }
Follow(S) = { $ }    Follow(A) = { x, y, z, $ }    Follow(A’) = { x, y, z, $ }
Follow(B) = { x, y, z, $ }    Follow(B’) = { x, y, z, $ }
97
Constructing Top-Down Parsing Table
Algorithm:
1. Repeat Steps 2 & 3 for each rule A → α
2. Terminal a in First(α)? Add A → α to M[A, a]
3.1 ε in First(α)? Add A → α to M[A, b] for all terminals b in Follow(A).
3.2 ε in First(α) and $ in Follow(A)? Add A → α to M[A, $]
4. All undefined entries are errors.
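The construction steps can be sketched as a single loop over the productions. First and Follow are entered by hand for the expression grammar to keep the example short; the dict encoding, the names, and the choice to store a list per table cell (so that a multiply-defined entry shows up as a list of length two or more) are assumptions of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols, first):
    result = set()
    for x in symbols:
        fx = first.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    table = {}
    for a, prods in grammar.items():
        for p in prods:
            fs = first_of_string(p, first)
            targets = fs - {EPS}              # step 2
            if EPS in fs:
                targets |= follow[a]          # steps 3.1 and 3.2 ($ is in Follow)
            for t in targets:
                table.setdefault((a, t), []).append(p)
    return table                              # step 4: missing keys are errors


TABLE = build_table(GRAMMAR, FIRST, FOLLOW)
IS_LL1 = all(len(entries) == 1 for entries in TABLE.values())
```

Because $ is stored in the Follow sets like any other terminal, steps 3.1 and 3.2 collapse into one line here.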
98
Step 2: Each rule A → α, using First(α)
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E,F,T) = { (, id }    First(E’) = { +, ε }    First(T’) = { *, ε }
Follow(E,E’) = { ), $ }    Follow(F) = { *, +, ), $ }    Follow(T,T’) = { +, ), $ }
Rules considered: F → ( E ), F → id, E → TE’, T → FT’, E’ → + TE’, T’ → * FT’, E’ → ε, T’ → ε
Table M:
Non-terminal | INPUT SYMBOL
             | id       | +          | *          | (        | )       | $
E            | E→TE’    |            |            | E→TE’    |         |
E’           |          | E’→+TE’    |            |          | E’→ε    | E’→ε
T            | T→FT’    |            |            | T→FT’    |         |
T’           |          | T’→ε       | T’→*FT’    |          | T’→ε    | T’→ε
F            | F→id     |            |            | F→(E)    |         |
99
Constructing Top-Down Parsing Table Example 2
    E → TE’
    E’ → + TE’ | ε
    T → FT’
    T’ → * FT’ | ε
    F → ( E ) | id
First(E,F,T) = { (, id }    First(E’) = { +, ε }    First(T’) = { *, ε }
Follow(E,E’) = { ), $ }    Follow(F) = { *, +, ), $ }    Follow(T,T’) = { +, ), $ }
Expression Example:
E → TE’ : First(TE’) = First(T) = { (, id }    (by rule 2)
    M[E, ( ] : E → TE’
    M[E, id ] : E → TE’
E’ → +TE’ : First(+TE’) = { + }    (by rule 2)
    M[E’, +] : E’ → +TE’
E’ → ε : ε in First(ε)    (by rule 3)
    M[E’, )] : E’ → ε (3.1)
    M[E’, $] : E’ → ε (3.2)    (due to Follow(E’))
T’ → ε : ε in First(ε)    (by rule 3)
    M[T’, +] : T’ → ε (3.1)
    M[T’, )] : T’ → ε (3.1)
    M[T’, $] : T’ → ε (3.2)    (due to Follow(T’))
101
Constructing Top-Down Parsing Table Example 2
    S → i E t S S’ | a
    S’ → e S | ε
    E → b
First(S) = { i, a }    First(S’) = { e, ε }    First(E) = { b }
Follow(S) = { e, $ }    Follow(S’) = { e, $ }    Follow(E) = { t }
Rules and their First sets:
    S → i E t S S’    First(i E t S S’) = { i }
    S → a             First(a) = { a }
    E → b             First(b) = { b }
    S’ → e S          First(eS) = { e }
    S’ → ε            First(ε) = { ε }, Follow(S’) = { e, $ }
Non-terminal | INPUT SYMBOL
             | a        | b        | e                  | i               | t    | $
S            | S → a    |          |                    | S → iEtSS’      |      |
S’           |          |          | S’ → ε, S’ → eS    |                 |      | S’ → ε
E            |          | E → b    |                    |                 |      |
102
First & Follow – Derivation to Parsing Table
Consider the following derivation of ( id ) * id + id$ (input). What are the implications for the M-table?
E ⇒ TE’ ⇒ FT’E’ ⇒ ( E ) T’E’
    1. E → TE’ and ( in First(E): M[E, ( ]
    2. T → FT’ and ( in First(T): M[T, ( ]
    3. F → ( E ) and ( in First(F): M[F, ( ]
⇒ ( TE’ ) T’E’ ⇒ ( FT’E’ ) T’E’ ⇒ ( id T’E’ ) T’E’ ⇒ ( id E’ ) T’E’ ⇒ ( id ) T’E’
    4. E’ → ε and ) in Follow(E’): M[E’, ) ]
⇒ ( id ) * FT’E’ ⇒ ( id ) * id T’E’ ⇒ ( id ) * id E’ ⇒ ( id ) * id + TE’ ⇒ ( id ) * id + FT’E’ ⇒ ( id ) * id + id T’E’ ⇒* ( id ) * id + id$
    5. Since $ in Follow(T’), T’ → ε: M[T’, $ ]
    6. Since $ in Follow(E’), E’ → ε: M[E’, $ ]
103
Detailed Example
Step 1: Compute First and Follow
    S → E $
    E → T E’
    E’ → + T E’ | ε
    T → F T’
    T’ → * F T’ | ε
    F → ( E ) | Id
Overall:
    First(S) = { }
    First(E) = First(T) = First(F) = { ( , id }
    First(E’) = { + , ε }
    First(T’) = { * , ε }
    Follow(E) = Follow(E’) = { ), $ }
    Follow(T) = Follow(T’) = { +, ), $ }
    Follow(F) = { +, *, ), $ }
104
Detailed Example
Step 2: Build the parser table
Step 3: Input: Id + Id * Id $
    S → E $
    E → T E’
    E’ → + T E’ | ε
    T → F T’
    T’ → * F T’ | ε
    F → ( E ) | Id
Parser Table:
NT   | Input Symbols
     | Id         | +           | *            | (          | )        | $
S    | S → E$     |             |              | S → E$     |          |
E    | E → TE’    |             |              | E → TE’    |          |
E’   |            | E’ → +TE’   |              |            | E’ → ε   | E’ → ε
T    | T → FT’    |             |              | T → FT’    |          |
T’   |            | T’ → ε      | T’ → *FT’    |            | T’ → ε   | T’ → ε
F    | F → Id     |             |              | F → (E)    |          |