1
Normal forms and parsing
The Chinese University of Hong Kong Fall 2008 CSC 3130: Automata theory and formal languages Normal forms and parsing Andrej Bogdanov
2
Testing membership and parsing
Given a grammar such as S → 0S1 | 1S0S1 | T, T → S | ε: How can we know if a string x is in its language? If so, can we reconstruct a parse tree for x?
3
First attempt. Grammar: S → 0S1 | 1S0S1 | T, T → S | ε. Input: x = 00111
Maybe we can try all possible derivations: S ⇒ 0S1 ⇒ 00S11 | 01S0S11 | 0T1 | ...; S ⇒ 1S0S1 ⇒ 10S10S1 | ...; S ⇒ T ⇒ S | ... When do we stop?
4
Problems. Grammar: S → 0S1 | 1S0S1 | T, T → S | ε. Input: x = 00111
How do we know when to stop? S ⇒ 0S1 ⇒ 00S11 | 01S0S11 | 0T1 | ...; S ⇒ 1S0S1 ⇒ 10S10S1 | ...
5
Problems. Grammar: S → 0S1 | 1S0S1 | T, T → S | ε. Input: x = 01011
Idea: Stop the derivation when its length exceeds |x|. This is not right because of ε-productions: in S ⇒ 0S1 ⇒ 01S0S11 ⇒* 01S011 ⇒* 01011 the sentential forms have lengths 1, 3, 7, 6, 5. We might want to eliminate ε-productions too.
6
Problems. Grammar: S → 0S1 | 1S0S1 | T, T → S | ε. Input: x = 00111
Loops among the variables (S → T → S) might make us go on forever. We might want to eliminate such loops.
7
Unit productions. A unit production is a production of the form A1 → A2, where A1 and A2 are both variables. Example grammar: S → 0S1 | 1S0S1 | T, T → S | R | ε, R → 0SR. Unit productions: S → T, T → S, T → R.
8
Removal of unit productions
If there is a cycle of unit productions A1 → A2 → ... → Ak → A1, delete it and replace everything with A1. Example: S → 0S1 | 1S0S1 | T, T → S | R | ε, R → 0SR becomes S → 0S1 | 1S0S1, S → R | ε, R → 0SR. T is replaced by S in the {S, T} cycle.
9
Removal of unit productions
For other unit productions, replace every chain A1 → A2 → ... → Ak → α by productions A1 → α, ..., Ak → α. Example: S → 0S1 | 1S0S1 | R | ε, R → 0SR becomes S → 0S1 | 1S0S1 | 0SR | ε, R → 0SR. Here S → R → 0SR is replaced by S → 0SR, R → 0SR.
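A minimal Python sketch of unit-production removal, for illustration only. It uses the unit-closure formulation rather than the two-step cycle/chain procedure described on the slides: every variable inherits the non-unit right-hand sides of all variables it can reach through unit productions. The function name `remove_unit_productions` and the encoding (a dict of strings, uppercase letters as variables, `""` for ε) are assumptions of this sketch.

```python
def remove_unit_productions(grammar):
    """grammar: dict mapping a variable to a list of right-hand sides (strings)."""
    def is_unit(body):
        # A unit production has a right-hand side that is a single variable.
        return len(body) == 1 and body in grammar

    result = {}
    for var in grammar:
        # Collect every variable reachable from var through unit productions.
        reachable, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # var inherits all non-unit right-hand sides of the reachable variables.
        result[var] = sorted({body for r in reachable
                              for body in grammar[r] if not is_unit(body)})
    return result


example = {"S": ["0S1", "1S0S1", "T"], "T": ["S", "R", ""], "R": ["0SR"]}
print(remove_unit_productions(example)["S"])
```

Unlike the slide's procedure, this variant does not merge T into S. T keeps its own (now redundant) productions and simply becomes unreachable from S.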
10
Removal of ε-productions
A variable N is nullable if there is a derivation N ⇒* ε. How to remove ε-productions (except from S): Find all nullable variables N1, ..., Nk. For i = 1 to k: for every production of the form A → αNiβ, add another production A → αβ; if Ni → ε is a production, remove it. If S is nullable, add the special production S → ε.
11
Example: find the nullable variables. Grammar: S → ACD, A → a, B → ε, C → ED | ε, D → BC | b, E → b. Nullable variables: B, C, D.
Step: Find all nullable variables N1, ..., Nk.
12
Finding nullable variables
To find nullable variables, we work backwards. First, mark all variables A such that A → ε as nullable. Then, as long as there are productions of the form A → A1…Ak where all of A1, …, Ak are marked as nullable, mark A as nullable.
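The marking procedure can be sketched in Python as follows. This is an illustrative sketch, not code from the lecture: the name `nullable_variables` and the grammar encoding (a dict from variable to a list of right-hand-side strings, with `""` standing for ε) are assumptions.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    # First, mark every variable that has an ε-production (written "").
    nullable = {var for var, bodies in grammar.items() if "" in bodies}
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # Mark var if some right-hand side consists only of marked variables.
            if any(body and all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable


# Example grammar: S → ACD, A → a, B → ε, C → ED | ε, D → BC | b, E → b
example = {
    "S": ["ACD"], "A": ["a"], "B": [""],
    "C": ["ED", ""], "D": ["BC", "b"], "E": ["b"],
}
print(sorted(nullable_variables(example)))  # ['B', 'C', 'D']
```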
13
Eliminating ε-productions
Grammar: S → ACD, A → a, B → ε, C → ED | ε, D → BC | b, E → b. Nullable variables: B, C, D. Productions added along the way: D → C, S → AD, D → B, D → ε, S → AC, S → A, C → E. Procedure: For i = 1 to k: for every production of the form A → αNiβ, add another production A → αβ; if Ni → ε is a production, remove it.
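As a hedged illustration, the elimination step can be written in Python. This sketch uses a batch variant of the loop above: for each production it generates every right-hand side obtained by dropping some subset of nullable symbols, and never emits an empty right-hand side. On the example grammar it gives the same result as processing B, C, D one at a time. The name `eliminate_epsilon`, the `start` parameter, and the string encoding are assumptions; the set of nullable variables is passed in as given.

```python
from itertools import product


def eliminate_epsilon(grammar, nullable, start="S"):
    """grammar: dict variable -> list of right-hand sides ("" stands for ε)."""
    new_grammar = {}
    for var, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; other symbols are kept.
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:  # skip empty right-hand sides (no ε-productions)
                    new_bodies.add(candidate)
        new_grammar[var] = sorted(new_bodies)
    if start in nullable:
        new_grammar[start].append("")  # the special production S → ε
    return new_grammar


example = {
    "S": ["ACD"], "A": ["a"], "B": [""],
    "C": ["ED", ""], "D": ["BC", "b"], "E": ["b"],
}
for var, bodies in eliminate_epsilon(example, {"B", "C", "D"}).items():
    print(var, "→", " | ".join(bodies))
```

Note that B ends up with no productions at all, so it can no longer derive any string.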
14
Recap. After eliminating ε-productions and unit productions, we know that every derivation S ⇒* a1…ak, where a1, …, ak are terminals, doesn't shrink in length and doesn't go into cycles. Exception: S → ε. We will not use this rule at all, except to check if ε ∈ L. Note: ε-productions must be eliminated before unit productions.
15
Example: testing membership
Grammar: S → 0S1 | 1S0S1 | T, T → S | ε. After eliminating unit and ε-productions: S → ε | 01 | 101 | 0S1 | 10S1 | 1S01 | 1S0S1. Input: x = 00111.
Derivations from S:
S ⇒ 01, 101
S ⇒ 0S1 ⇒ 0011, 01011, 00S11 (strings of length ≥ 6 only), and otherwise only strings of length ≥ 6
S ⇒ 10S1 ⇒ 10011, and otherwise strings of length ≥ 6
S ⇒ 1S01 ⇒ 10101, and otherwise strings of length ≥ 6
S ⇒ 1S0S1: only strings of length ≥ 6
16
Algorithm 1 for testing membership
We can now use the following algorithm to check if a string x is in the language of G:
Eliminate all ε-productions and unit productions.
If x = ε and S → ε, accept; else delete S → ε.
Let X := S.
While some new production P can be applied to X:
Apply P to X.
If X = x, accept.
If |X| > |x|, backtrack.
If no more productions can be applied to X, reject.
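A rough Python sketch of this search, under stated assumptions: the grammar has already had its ε-productions and unit productions eliminated (apart from possibly S → ε), it is encoded as a dict of right-hand-side strings with `""` for ε, and backtracking is realized with an explicit stack over leftmost derivations. The name `derives` is made up for this sketch.

```python
def derives(grammar, x, start="S"):
    """Return True if x can be derived from start by exhaustive bounded search."""
    if x == "":
        return "" in grammar[start]
    seen = {start}
    stack = [start]
    while stack:
        form = stack.pop()  # backtrack to the most recent unexplored form
        if form == x:
            return True
        # Locate the leftmost variable; forms with none are terminal strings.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            continue
        for body in grammar[form[idx]]:
            if body == "":
                continue  # S → ε is only used for the x = ε check above
            new_form = form[:idx] + body + form[idx + 1:]
            # Prune forms longer than x; derivations never shrink.
            if len(new_form) <= len(x) and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return False


g = {"S": ["", "01", "101", "0S1", "10S1", "1S01", "1S0S1"]}
print(derives(g, "00111"), derives(g, "01011"))  # False True
```

The search terminates because only finitely many sentential forms have length at most |x|, and each is explored at most once.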
17
Practical limitations of Algorithm I
The previous algorithm can be very slow if x is long. Example: if G is the CFG of the Java programming language and x is the code of a 200-line Java program, the algorithm might take an enormous number of steps! There is a faster algorithm, but it requires that we do some more transformations on the grammar.
18
Chomsky Normal Form. A grammar is in Chomsky Normal Form if every production (except possibly S → ε) is of the type A → BC or A → a. Conversion to Chomsky Normal Form is easy. First, replace terminals with new variables: A → BcDE becomes A → BCDE, C → c. Then, break up sequences with new variables: A → BCDE becomes A → BX1, X1 → CX2, X2 → DE.
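The two conversion steps might be sketched in Python like this. It is an illustration under several assumptions, not the lecture's code: productions are `(head, body)` pairs with `body` a tuple of symbols, terminals are lowercase letters, ε-productions and unit productions have already been removed, and the generated names `T_c` and `X1`, `X2`, ... do not clash with existing variables. The sketch writes `T_c` where the slide reuses the letter `C`.

```python
def to_cnf(productions):
    """Convert a list of (head, body) productions to Chomsky Normal Form."""
    result = []
    terminal_var = {}  # terminal -> hypothetical variable standing in for it
    counter = 0
    for head, body in productions:
        if len(body) == 1:
            result.append((head, body))  # already of the form A → a
            continue
        # Step 1: replace terminals with new variables.
        symbols = []
        for sym in body:
            if sym.islower():
                if sym not in terminal_var:
                    terminal_var[sym] = "T_" + sym
                    result.append((terminal_var[sym], (sym,)))
                symbols.append(terminal_var[sym])
            else:
                symbols.append(sym)
        # Step 2: break up sequences longer than two with fresh variables.
        while len(symbols) > 2:
            counter += 1
            fresh = f"X{counter}"
            result.append((head, (symbols[0], fresh)))
            head, symbols = fresh, symbols[1:]
        result.append((head, tuple(symbols)))
    return result


for head, body in to_cnf([("A", ("B", "c", "D", "E"))]):
    print(head, "→", " ".join(body))
```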
19
Exercise. Convert this CFG into Chomsky Normal Form: S → ε | ADDA, A → a, C → c, D → bCb
20
Algorithm 2 for testing membership
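Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a. Input: x = baaba.
Idea: We generate each substring of x bottom up. Each table row lists, for the substrings of one length (left to right by starting position), the variables that derive them:
length 5: SAC
length 4: –, SAC
length 3: –, B, B
length 2: SA, B, SC, SA
length 1: B, AC, AC, B, AC
x: b, a, a, b, a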
21
Parse tree reconstruction
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a. Input: x = baaba. Tracing back the derivations recorded in the table, we obtain the parse tree.
22
Cocke-Younger-Kasami algorithm
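Input: Grammar G in CNF, string x = x1…xk.
Table cells: cell ij (for i ≤ j) stands for the substring xi…xj. Cells 11, 22, ..., kk form the bottom row above x1, ..., xk, cells 12, 23, ... form the next row, and cell 1k is at the top.
For i = 1 to k: if there is a production A → xi, put A in table cell ii.
For b = 2 to k (b is the length of the substring):
For s = 1 to k – b + 1:
Set t = s + b – 1.
For j = s to t – 1:
If there is a production A → BC where B is in cell sj and C is in cell (j+1)t, put A in cell st.
Cell ij remembers all variables that can derive the substring xi…xj.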
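A compact Python sketch of the table-filling procedure, for illustration. It fixes one indexing convention as an assumption of the sketch: cell (s, t) covers the substring x_s…x_t (1-indexed, inclusive), b is the substring length, and the split point j runs from s to t − 1 so that the two halves are x_s…x_j and x_(j+1)…x_t. The name `cyk_table` and the grammar encoding (a dict of right-hand-side strings, each a single terminal or two variables) are also assumptions.

```python
def cyk_table(grammar, x):
    """Fill the CYK table for a non-empty string x and a grammar in CNF."""
    k = len(x)
    table = {}
    # Substrings of length 1: variables with a production A → x_i.
    for i in range(1, k + 1):
        table[(i, i)] = {A for A, bodies in grammar.items() if x[i - 1] in bodies}
    # Longer substrings, shortest first.
    for b in range(2, k + 1):
        for s in range(1, k - b + 2):
            t = s + b - 1
            cell = set()
            for j in range(s, t):
                for A, bodies in grammar.items():
                    for body in bodies:
                        # A → BC with B deriving x_s..x_j and C deriving x_(j+1)..x_t
                        if (len(body) == 2 and body[0] in table[(s, j)]
                                and body[1] in table[(j + 1, t)]):
                            cell.add(A)
            table[(s, t)] = cell
    return table


g = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}
table = cyk_table(g, "baaba")
print(sorted(table[(1, 5)]))  # ['A', 'C', 'S']
print("S" in table[(1, 5)])   # True, so baaba is in the language
```

The three nested loops over b, s, and j give a running time cubic in the length of x, times the number of productions.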