1
NORMAL FORMS FDP ON THEORY OF COMPUTING
By G Sudha Sadasivam Assistant Professor, CSE
2
CONTEXT FREE GRAMMAR
Grammars, NFAs and DFAs all describe formal languages.
English sentence rules:
<Sentence> → <Noun Phrase> <Verb Phrase>
<Noun Phrase> → <Article> <Noun> | <Noun>
<Verb Phrase> → <Verb> | <Verb> <Noun Phrase>
<Article> → a | the
<Noun> → Sita | boy | girl | ball | dog | ...
<Verb> → caught | saw | took | ...
3
CONTENTS
CFG
Chomsky hierarchy
Normal forms
Unit productions
Useless productions
CNF
Applications
4
Parse tree for "Sita took the ball":
Sentence → NP VP
NP → Noun → Sita
VP → Verb NP, with Verb → took
NP → Article Noun → the ball
5
Chomsky Hierarchy (Avram Noam Chomsky)
Grammar G = (VN, VT, P, S), where VN is the set of nonterminals (variables), VT is the set of terminals (the alphabet ∑), P is the set of productions, and S is the start symbol.
In a CFG, rules are of the form A → w, with A ∈ VN and w ∈ (VT ∪ VN)*.
Type 0, Unrestricted: α → β with α ∈ (VT ∪ VN)+ and β ∈ (VT ∪ VN)*
Type 1, Context-Sensitive: α1 A α2 → α1 β α2 with A ∈ VN, α1, α2 ∈ (VT ∪ VN)* and β ∈ (VT ∪ VN)+
Type 2, Context-Free: A → β with A ∈ VN and β ∈ (VT ∪ VN)*
Type 3, Regular (Finite): A → bB or A → b with A, B ∈ VN and b ∈ VT*
6
CFG
Context-free means a variable A can be replaced with w wherever it occurs, regardless of the surrounding symbols.
CFGs are powerful enough to describe programming languages.
Efficient parsing algorithms (LR and LL parsers) are simple to construct for CFGs.
BNF (Backus-Naur Form) is used to represent CFGs, e.g. S → aSb | λ
Panini described Sanskrit using a CFG.
Venpa is governed by a CFG.
CFGs are also applied to text mining in biomedicine.
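As a small illustration of the grammar S → aSb | λ, the sketch below checks whether a string can be derived from S by undoing one production at a time. The function name `derives_S` is an assumption made for this example only.

```python
def derives_S(w: str) -> bool:
    """Return True if w can be derived from S -> a S b | λ."""
    if w == "":
        return True                      # S -> λ
    if len(w) >= 2 and w[0] == "a" and w[-1] == "b":
        return derives_S(w[1:-1])        # S -> a S b, recurse on the middle part
    return False


print(derives_S("aabb"))   # a string of the form a^n b^n
print(derives_S("abab"))   # not of that form
```

The grammar generates exactly the strings with n a's followed by n b's, which is why the check peels one a and one b per step.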
7
Normal Forms
A normal form (NF) for a grammar imposes additional conditions on its productions while remaining equivalent to the given grammar.
Two types: Chomsky NF (CNF) and Greibach NF (GNF).
Productions take a simple form.
Rules in CNF have both theoretical and practical implications.
8
[Figure: grammar classes T0, T1, T2 and T3, with GNF and CNF marked]
9
Examples
CNF: the CYK membership algorithm uses it to decide whether a string is in the language represented by the grammar.
GNF: used for conversion from a CFG to an NPDA and vice versa.
10
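CNF
In a CFG, the RHS can be any combination of variables (V) and terminals (T).
Eg: NP → the N is reduced to DET → the and NP → DET N.
A λ-free CFG is said to be in CNF if its productions are of the form A → a or B → CD, with A, B, C, D ∈ VN and a ∈ VT.
If the CFG is not λ-free, then include S → λ.
As the second form has exactly two variables on the RHS, such grammars are called binary grammars.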
11
GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*.
GNF grammars tend to be long.
12
1. λ-free languages
Let L be any CF language.
A grammar G whose language contains λ gets the productions S0 → S | λ.
λ-productions are of the form A → λ.
A variable A with A ⇒* λ is nullable.
In this case the λ-productions can be removed.
If L(G) is λ-free, then there is an equivalent G1 having no λ-productions.
13
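STEPS
1. Find the set VN of all nullable variables (here VN denotes the nullable set):
   For each production A → λ, put A in VN.
   Repeat until no more variables can be added: for each production B → A1 A2 A3 A4 where A1, A2, A3, A4 are all in VN, add B to VN.
2. For each production A → x1 x2 … xm with m >= 1, put into P1 that production and every production generated by replacing nullable variables with λ in all combinations, except A → λ itself.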
14
Example:
S → ABaC
A → BC
B → b | λ
C → D | λ
D → d
Nullable set is {A, B, C}.
Result:
S → ABaC | BaC | AaC | ABa | aC | Ba | Aa | a
A → BC | C | B
B → b
C → D
D → d
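The two steps for removing λ-productions can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes every symbol is a single character, a grammar is a dict from a variable to a list of right-hand-side strings, and the empty string stands for λ. The function names are hypothetical.

```python
from itertools import product


def nullable_variables(grammar):
    """Step 1: collect every variable A with A =>* λ."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body of only nullable symbols (or the empty body) makes head nullable.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_lambda(grammar):
    """Step 2: keep or drop each nullable symbol in all combinations."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                new_body = "".join(choice)
                if new_body and new_body not in new_bodies:   # never emit A -> λ
                    new_bodies.append(new_body)
        result[head] = new_bodies
    return result


G = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
print(sorted(nullable_variables(G)))
for head, bodies in remove_lambda(G).items():
    print(head, "->", " | ".join(bodies))
```

On the example grammar this reproduces the nullable set {A, B, C} and the eight alternatives for S listed in the result.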
15
2. Substitution Rule
G: A → x1 B x2 and B → y1 | y2 | … | yn
G1: A → x1 y1 x2 | x1 y2 x2 | x1 y3 x2 | … | x1 yn x2 and B → y1 | y2 | … | yn
Then L(G) = L(G1).
Example: A → a | aaA | abBc and B → abbA | b
Then A → a | aaA | ababbAc | abbc
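The substitution rule can be sketched in a few lines. The representation (single-character symbols, a dict from variable to a list of right-hand-side strings) and the name `substitute` are assumptions for illustration; only the first occurrence of the variable in each body is replaced, which matches the form A → x1 B x2 used in the rule.

```python
def substitute(grammar, head, var):
    """Replace var in the bodies of head by each alternative of var."""
    new_bodies = []
    for body in grammar[head]:
        if var in body:
            i = body.index(var)                      # position of B in x1 B x2
            for alt in grammar[var]:
                new_bodies.append(body[:i] + alt + body[i + 1:])
        else:
            new_bodies.append(body)                  # bodies without B are kept
    result = dict(grammar)
    result[head] = new_bodies
    return result


G = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
print(substitute(G, "A", "B")["A"])
```

The rules for B are left in place, as in G1 above; they can be dropped later if B becomes unreachable.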
16
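3. Removing Useless Productions
Useless productions are those that do not take part in any derivation of a terminal string.
A variable A is useful if S ⇒* xAy ⇒* w for some terminal string w.
Useless variables:
1. Cannot be reached from S. Eg: S → A; A → aA | λ; B → bA (B is unreachable)
2. Cannot derive a terminal string. Eg: S → A | b; A → aA (A never terminates)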
17
Identify the variables that can lead to a terminal string
1. Set V1 of G1 = (V1, T1, P1, S1) to the empty set.
2. Repeat until no more variables are added: for each production A → x1x2x3…xn with every xi in V1 ∪ T, add A to V1.
3. Add to P1 all productions in P whose symbols are all in V1 ∪ T.
4. Eliminate variables that cannot be reached from S, using a dependency graph: a useful variable is one that is reached from the start symbol S.
18
Eg1: S → aSb | λ | A; A → aA
A is useless: A cannot derive a terminal string.
Eg2: S → A; A → aA | λ; B → bA
B is useless: B is not reachable from the start symbol.
19
Eg3:
1: S → aS | A | C
2: A → a
3: B → aa
4: C → aCb
1) C is useless since it does not derive a terminal string.
2) Reachability graph (S → A, with B unconnected): B is not reachable.
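Both checks, "derives a terminal string" and "reachable from S", can be sketched as below. This is an illustrative sketch under assumptions: symbols are single characters, uppercase letters are variables, anything else is a terminal, and the empty string stands for λ. The name `remove_useless` is hypothetical.

```python
def remove_useless(grammar, start="S"):
    """Drop variables that derive no terminal string, then unreachable ones."""

    def body_ok(body, good):
        # Every symbol is a terminal or an already accepted variable.
        return all(not s.isupper() or s in good for s in body)

    # Step 1: variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(body_ok(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    pruned = {h: [b for b in bodies if body_ok(b, generating)]
              for h, bodies in grammar.items() if h in generating}

    # Step 2: variables reachable from the start symbol (dependency graph walk).
    reachable, stack = set(), [start]
    while stack:
        v = stack.pop()
        if v in reachable or v not in pruned:
            continue
        reachable.add(v)
        for body in pruned[v]:
            stack.extend(s for s in body if s.isupper())
    return {h: bodies for h, bodies in pruned.items() if h in reachable}


G = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
print(remove_useless(G))
```

For Eg3 the sketch first drops C (no terminal string), then B (not reachable), leaving S → aS | A and A → a. The order matters: testing reachability first could keep variables that are reachable only through a non-generating variable.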
20
4. Removing Unit Productions
Productions of the form A → B are unit productions.
1. Find every pair with A ⇒* B from a dependency graph.
2. Add to P1 all non-unit productions from P.
3. If A ⇒* B and B → y1 | y2 | … | yn, then add A → y1 | y2 | … | yn to P1.
21
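Example:
S → Aa | B
A → a | bc | B
B → A | bb
1. Non-unit rules: S → Aa; A → a | bc; B → bb
2. Dependency graph: S → B, A → B, B → A
   As S ⇒* B, S → bb is added.
   As S ⇒* A (through B), S → a | bc is added.
   As A ⇒* B, A → bb is added.
   As B ⇒* A, B → a | bc is added.
Result:
S → Aa | a | bc | bb
A → a | bc | bb
B → bb | a | bc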
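A minimal sketch of unit-production removal for the grammar S → Aa | B; A → a | bc | B; B → A | bb. It assumes single-character symbols with uppercase letters as variables and a dict from variable to right-hand-side strings; `remove_unit` is a hypothetical name. Note that S reaches A through B (S ⇒ B ⇒ A), so S also inherits a and bc.

```python
def remove_unit(grammar):
    """Replace unit productions A -> B by the non-unit bodies reachable from A."""

    def is_unit(body):
        return len(body) == 1 and body.isupper()

    result = {}
    for head in grammar:
        # All variables B with head =>* B using unit productions only.
        closure, stack = {head}, [head]
        while stack:
            v = stack.pop()
            for body in grammar.get(v, []):
                if is_unit(body) and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # Copy the non-unit bodies of every variable in the closure.
        bodies = []
        for v in [head] + sorted(closure - {head}):
            for body in grammar.get(v, []):
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result


G = {"S": ["Aa", "B"], "A": ["a", "bc", "B"], "B": ["A", "bb"]}
for head, bodies in remove_unit(G).items():
    print(head, "->", " | ".join(bodies))
```

The closure step plays the role of the dependency graph: it follows unit edges transitively rather than one step at a time.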
22
Altogether
1. Remove λ-productions.
2. Remove unit productions.
3. Remove useless productions and non-terminals.
23
CNF
Rules are of the form A → BC or A → a.
Eg: S → AS | a; A → SA | b
Steps:
1. Eliminate λ, unit and useless productions.
2. Add productions already of the form A → a or A → BC to P1.
3. Consider each remaining production A → x1x2x3…xn:
   If n = 1, then x1 must be a terminal.
   If n >= 2, introduce a variable Ba with Ba → a for each terminal a ∈ T that occurs, convert A → x1x2x3…xn to A → B1 B2 … Bn, and add the productions Ba → a to P1.
   For n > 2, introduce new variables D1, D2, …
   Eg: A → BCDE is written as A → B D1; D1 → C D2; D2 → D E
24
Example:
G: S → ABa; A → aab; B → Ac
Result:
Na → a; Nb → b; Nc → c
S → A X1; X1 → B Na
A → Na X2; X2 → Na Nb
B → A Nc
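The conversion of G: S → ABa; A → aab; B → Ac can be sketched as below. The sketch assumes λ, unit and useless productions are already removed, that original symbols are single characters with uppercase variables, and that the generated names Na, Nb, … and X1, X2, … do not clash with existing variables. The function name `to_cnf` is hypothetical.

```python
def to_cnf(grammar):
    """Return a list of (head, body) rules in CNF; body is a tuple of symbols."""
    rules = []
    term_var = {}      # terminal -> its new variable, e.g. "a" -> "Na"
    counter = 0        # numbering for the chain variables X1, X2, ...
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                rules.append((head, (body,)))        # A -> a is already in CNF
                continue
            # Replace each terminal a inside a long body by the variable Na.
            symbols = []
            for s in body:
                if s.isupper():
                    symbols.append(s)
                else:
                    term_var.setdefault(s, "N" + s)
                    symbols.append(term_var[s])
            # Break bodies longer than two into a chain of binary rules.
            current = head
            while len(symbols) > 2:
                counter += 1
                new_var = f"X{counter}"
                rules.append((current, (symbols[0], new_var)))
                current, symbols = new_var, symbols[1:]
            rules.append((current, tuple(symbols)))
    # Finally add Na -> a for every terminal that was replaced.
    rules.extend((var, (t,)) for t, var in term_var.items())
    return rules


G = {"S": ["ABa"], "A": ["aab"], "B": ["Ac"]}
for head, body in to_cnf(G):
    print(head, "->", " ".join(body))
```

Bodies are returned as tuples because the generated variable names are longer than one character.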
25
Eg1: Remove useless symbols
Eg2: Convert to CNF
A → BD
A → a
S → λ
S → AA | CD | bB
A → aA | a
B → bB | bC
C → cB
D → dD | d
26
Eg3:
S → A | ABa | AbA
A → Aa | λ
B → Bb | BC
C → CB | CA | bB
27
THANK YOU
29
GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*.
GNF grammars tend to be long.
GNF can be used to construct a PDA that recognises the language of a CFG.
Both GNF and s-grammars require that rules have the form A → ax.
s-grammars additionally require that the first terminal of all the A-rules be distinct, i.e. each pair (A, a) occurs in at most one rule.
GNF does not impose such a restriction.
30
1. Remove λ-productions.
2. Remove unit productions.
3. Remove useless productions and non-terminals.
4. Convert to CNF.
5. Convert to GNF.
31
GNF
Eg1: S → AB; A → aA | bB | b; B → b
In GNF: S → aAB | bBB | bB; A → aA | bB | b; B → b
Eg2: S → abSb | aa
In GNF: S → aYSY | aX; X → a; Y → b
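The GNF condition (every rule is A → ax with a a terminal and x a string of variables) and the stricter s-grammar condition (no two A-rules start with the same terminal) are easy to check mechanically. The sketch below assumes single-character symbols, uppercase variables and a dict from variable to right-hand-side strings; the function names are hypothetical.

```python
def is_gnf(grammar):
    """True if every body is one terminal followed only by variables."""
    return all(
        len(body) > 0
        and not body[0].isupper()
        and all(s.isupper() for s in body[1:])
        for bodies in grammar.values()
        for body in bodies
    )


def is_s_grammar(grammar):
    """True if the grammar is in GNF and each pair (A, a) occurs at most once."""
    return is_gnf(grammar) and all(
        len({body[0] for body in bodies}) == len(bodies)
        for bodies in grammar.values()
    )


original = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
converted = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
print(is_gnf(original), is_gnf(converted), is_s_grammar(converted))
```

Here S → AB; A → aA | bB | b; B → b is not in GNF because S → AB starts with a variable, while S → aAB | bBB | bB; A → aA | bB | b; B → b is in GNF but is not an s-grammar, since two S-rules start with b.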
32
Removing direct left recursions
A → Aα | β is equivalent to A → βA'; A' → αA' | λ
Eg: S → Sa | b is equivalent to S → bS'; S' → aS' | λ
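The rewrite of A → Aα | β into A → βA'; A' → αA' | λ can be sketched as follows. It assumes bodies are plain strings, the empty string stands for λ, every α is non-empty, and the new variable is written by appending a prime to the head. The function name is hypothetical.

```python
def remove_direct_left_recursion(head, bodies):
    """Split bodies into A -> A alpha and A -> beta, then rewrite them."""
    alphas = [b[len(head):] for b in bodies if b.startswith(head)]
    betas = [b for b in bodies if not b.startswith(head)]
    if not alphas:
        return {head: list(bodies)}          # nothing to do, no left recursion
    new = head + "'"                         # the fresh variable A'
    return {
        head: [beta + new for beta in betas],            # A  -> beta A'
        new: [alpha + new for alpha in alphas] + [""],   # A' -> alpha A' | λ
    }


print(remove_direct_left_recursion("S", ["Sa", "b"]))
```

The result still contains a λ-production for the primed variable; it can be removed afterwards with the λ-removal steps if a λ-free grammar is needed.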
33
Answer:
S → A | C
A → BA' | aA'
A' → aBA' | aCA' | λ
B → CbB'
B' → bB' | λ
C → cC | c
34
1. Grammar G:
2. Removing left recursion:
3. BAA is out of order
4. Take each rule that does not have a terminal at the start and follow the derivation until a terminal is produced
35
answer
36
Construct GNF for Answer
37
CYK algorithm
Cocke-Younger-Kasami algorithm:
To recognise a CF language, i.e. to decide membership of strings in a CFL.
To construct a possible parse tree.
Can also process stochastic CFGs, where probabilities are stored in the table.
Asymptotic time complexity is Θ(n³), where n is the length of the string.
38
Input string a1 ... an of length n. G has r nonterminals R1 ... Rr, and R1 is the start symbol.
P[n, n, r] is an array of booleans initialized to false.
For each i = 1 to n
    For each unit production Rj → ai
        set P[i, 1, j] = true
For each i = 2 to n -- Length of span
    For each j = 1 to n-i+1 -- Start of span
        For each k = 1 to i-1 -- Partition of span
            For each production RA → RB RC
                if P[j, k, B] and P[j+k, i-k, C] then set P[j, i, A] = true
if P[1, n, 1] is true then
    string is member of language
else
    string is not member of language
39
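CYK Parsers
Input string: w = w1w2w3…wn
wij = wi wi+1 … wj and Vij = { A ∈ V | A ⇒* wij }
w belongs to L iff S ∈ V1n.
Vii is found by examining the RHS of rules: A ∈ Vii iff A → wi is a production.
For i < j, Vij is the set of all A with a production A → BC such that B ∈ Vik and C ∈ Vk+1,j for some k with i <= k < j.
Time is O(n³), where n is the string length.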
40
Example: S → AB; A → BB | a; B → AB | b
Test whether the string aabbb is generated.
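A minimal CYK sketch for the grammar S → AB; A → BB | a; B → AB | b, computing the sets Vij bottom-up by span length. It assumes the grammar is already in CNF, symbols are single characters, and the grammar is a dict from variable to right-hand-side strings; `cyk_table` and `cyk_accepts` are hypothetical names.

```python
def cyk_table(grammar, w):
    """Return V with V[(i, j)] = variables deriving w_i..w_j (1-based, inclusive)."""
    n = len(w)
    V = {}
    # Spans of length 1: A is in V_ii iff A -> w_i is a production.
    for i in range(1, n + 1):
        V[(i, i)] = {A for A, bodies in grammar.items() if w[i - 1] in bodies}
    # Longer spans: try every split point k and every rule A -> BC.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for A, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in V[(i, k)]
                                and body[1] in V[(k + 1, j)]):
                            cell.add(A)
            V[(i, j)] = cell
    return V


def cyk_accepts(grammar, w, start="S"):
    """True iff the start symbol is in V_1n."""
    return len(w) > 0 and start in cyk_table(grammar, w)[(1, len(w))]


G = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
V = cyk_table(G, "aabbb")
print(sorted(V[(1, 5)]), cyk_accepts(G, "aabbb"))
```

For aabbb the top cell V15 works out to {S, B}, so the string is in the language. The three nested loops over span length, start position and split point give the cubic running time in the string length.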
41
String: aaabbb
42
String : baaa