NORMAL FORMS FDP ON THEORY OF COMPUTING


1 NORMAL FORMS FDP ON THEORY OF COMPUTING
By G Sudha Sadasivam Assistant Professor, CSE

2 CONTEXT FREE GRAMMAR
Formal languages are described by grammars; NFAs and DFAs recognise the regular ones.
English sentence rules:
<Sentence> → <Noun Phrase> <Verb Phrase>
<Noun Phrase> → <Article> <Noun> | <Noun>
<Verb Phrase> → <Verb> | <Verb> <Noun Phrase>
<Article> → a | the
<Noun> → Sita | boy | girl | ball | dog | ...
<Verb> → caught | saw | took | ...

3 CONTENTS
CFG
Chomsky hierarchy
Normal forms
Unit productions
Useless productions
CNF
Applications

4 [Figure: parse tree of the sentence "Sita took the ball", with nodes Sentence, NP, VP, Noun, Verb and Article]

5 Chomsky Hierarchy - Avram Noam Chomsky
Grammar G = (VN, VT, P, S), where VN is the set of nonterminals (variables), VT is the set of terminals (the alphabet ∑), P is the set of productions, and S is the start symbol.
In a CFG, rules are of the form A → w with A ∈ VN and w ∈ (VT ∪ VN)*.
Type  Name               Productions (P)
0     Unrestricted       α → β with α ∈ (VT ∪ VN)+ and β ∈ (VT ∪ VN)*
1     Context-Sensitive  α1Aα2 → α1βα2 with A ∈ VN, α1, α2 ∈ (VT ∪ VN)* and β ∈ (VT ∪ VN)+
2     Context-Free       A → β with A ∈ VN and β ∈ (VT ∪ VN)*
3     Regular, Finite    A → βB or A → β with A, B ∈ VN and β ∈ VT*

6 CFG
Context-free means a variable A can be replaced with w wherever it occurs, regardless of the surrounding symbols.
CFGs are powerful enough to describe programming languages.
CFGs are simple enough to allow efficient parsing algorithms such as LR and LL parsers.
BNF (Backus-Naur form) is used to represent CFGs, e.g. S → aSb | λ
Panini described Sanskrit using a CFG.
Venpa is governed by a CFG.
Text mining in biomedicine

7 Normal Forms
A normal form (NF) for a grammar imposes additional conditions on its productions while remaining equivalent to the given grammar.
Two types: Chomsky NF (CNF) and Greibach NF (GNF).
The productions have a simple form.
Rules in CNF have both theoretical and practical implications.

8 [Figure: diagram of the grammar types T0, T1, T2 and T3, with the normal forms GNF and CNF]

9 Examples
CNF: the CYK membership algorithm, which decides whether a string is in the language represented by the grammar.
GNF: used for conversion from a CFG to an NPDA (nondeterministic pushdown automaton) and vice versa.

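10 CNF
In a CFG, the RHS can be any combination of variables (V) and terminals (T).
Eg: NP → the N is reduced to DET → the and NP → DET N.
A λ-free CFG is said to be in CNF if its productions are of the form A → a or B → CD, with A, B, C, D ∈ VN and a ∈ VT.
If the CFG is not λ-free, then S → λ is included.
As the 2nd form of production has exactly two variables on the RHS, such grammars are called binary grammars.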

11 GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*.
The right-hand sides can be long, since x may contain any number of variables.

12 1. λ-free languages
Let L be any context-free language generated by a grammar G.
λ-productions are productions of the form A → λ; a variable A with A ⇒* λ is nullable.
If λ is not in L(G), the λ-productions can be removed: there is an equivalent grammar G1 having no λ-productions.
If λ is in L(G), a new start symbol S0 with the productions S0 → S | λ is added.

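13 STEPS
1. Find the set VN of all nullable variables (on this slide VN denotes the nullable set, not the full set of variables).
   For every production A → λ, put A in VN.
   Repeat until nothing more can be added: for a production B → A1 A2 … An where all of A1, A2, …, An are in VN, add B to VN.
2. For each production A → x1 x2 … xm with m >= 1, put into P1 that production and every production generated by replacing nullable variables with λ in all combinations, except A → λ itself.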

14 Example
S → ABaC
A → BC
B → b | λ
C → D | λ
D → d
Nullable variables: {A, B, C}
Result:
S → ABaC | BaC | AaC | ABa | aC | Ba | Aa | a
A → BC | C | B
B → b
C → D
D → d
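The two steps of λ-removal can be sketched in Python and checked against the example of slide 14. This is only an illustration under stated assumptions: the grammar is held as a dict from each variable to a list of bodies, every body is a tuple of one-character symbols, and the empty tuple stands for λ. The names nullable_variables and remove_lambda are made up for this sketch.

```python
from itertools import product

def nullable_variables(grammar):
    """Step 1: collect every variable A with A =>* lambda."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body () is a lambda-production; all() over it is True.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda(grammar):
    """Step 2: keep or drop each nullable symbol in all combinations."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # A nullable symbol offers two choices (keep, drop); others one.
            options = [((sym,), ()) if sym in nullable else ((sym,),)
                       for sym in body]
            for choice in product(*options):
                new_body = tuple(s for part in choice for s in part)
                if new_body:  # never emit A -> lambda
                    new_bodies.add(new_body)
        result[head] = new_bodies
    return result

# Grammar of slide 14 (assumed encoding: () is lambda).
G = {
    "S": [tuple("ABaC")],
    "A": [tuple("BC")],
    "B": [tuple("b"), ()],
    "C": [tuple("D"), ()],
    "D": [tuple("d")],
}
G1 = remove_lambda(G)
```

For this grammar the nullable set comes out as {A, B, C}, and G1["S"] holds the eight bodies listed on the slide.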

15 2. Substitution Rule
G: A → x1 B x2 and B → y1 | y2 | ... | yn
G1: A → x1 y1 x2 | x1 y2 x2 | x1 y3 x2 | … | x1 yn x2 and B → y1 | y2 | ... | yn
Then L(G) = L(G1).
Example: A → a | aaA | abBc and B → abbA | b
Then A → a | aaA | ababbAc | abbc
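A minimal Python sketch of the substitution rule, applied to the example above. It assumes bodies are tuples of one-character symbols and, for simplicity, replaces only the first occurrence of the variable in each body. The function name substitute is hypothetical.

```python
def substitute(grammar, head, var):
    """Replace var in the bodies of head by every body of var."""
    new_bodies = []
    for body in grammar[head]:
        if var not in body:
            new_bodies.append(body)  # nothing to substitute
            continue
        i = body.index(var)          # first occurrence only (assumption)
        for replacement in grammar[var]:
            new_bodies.append(body[:i] + replacement + body[i + 1:])
    result = dict(grammar)           # the rules of var itself are kept
    result[head] = new_bodies
    return result

# Example of slide 15: substitute B in the A-rules.
G = {
    "A": [tuple("a"), tuple("aaA"), tuple("abBc")],
    "B": [tuple("abbA"), tuple("b")],
}
G1 = substitute(G, "A", "B")
```

Here G1["A"] corresponds to A → a | aaA | ababbAc | abbc, while the B-rules are left unchanged.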

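16 3. Removing Useless Productions
Useless productions are productions that do not take part in any derivation of a terminal string.
A variable A is useful if S ⇒* xAy ⇒* w for some terminal string w.
Useless variables:
1. Cannot be reached from S. Eg: S → A; A → aA | λ; B → bA (B is unreachable)
2. Cannot derive a terminal string. Eg: S → A | b; A → aA (A never terminates)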

17 Identify the variables that can lead to a terminal string
Let G1 = (V1, T1, P1, S1). Set V1 to the empty set.
Repeat until no more variables are added: for every production A → x1x2x3…xn with all xi in V1 ∪ T, add A to V1.
Add to P1 all productions in P whose symbols are all in V1 ∪ T.
Then eliminate the variables that cannot be reached from S, using a dependency graph.
A useful variable must also be reachable from the start symbol (S).

18 Eg1: S → aSb | λ | A; A → aA
A is useless, since it cannot derive a terminal string.
Eg2: S → A; A → aA | λ; B → bA
B is useless, since it is not reachable from the start symbol.

19 Eg3
1: S → aS | A | C
2: A → a
3: B → aa
4: C → aCb
1) C is useless since it does not derive a terminal string.
2) Reachability graph: B is not reachable from S.
[Figure: dependency graph with nodes S, A, B]
Result: S → aS | A; A → a
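Both passes (variables that derive a terminal string, then variables reachable from S) can be sketched in Python and tried on Eg3. The sketch assumes every variable appears as a key of the grammar dict, so any symbol that is not a key is treated as a terminal. The name remove_useless is made up for illustration.

```python
def remove_useless(grammar, start):
    """Drop non-generating variables first, then unreachable ones."""
    def usable(body, generating):
        # A symbol is fine if it is a terminal or a generating variable.
        return all(s in generating or s not in grammar for s in body)

    # Pass 1: variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    pruned = {h: [b for b in bodies if usable(b, generating)]
              for h, bodies in grammar.items() if h in generating}

    # Pass 2: variables reachable from the start symbol.
    reachable = set()
    stack = [start]
    while stack:
        v = stack.pop()
        if v in reachable or v not in pruned:
            continue
        reachable.add(v)
        for body in pruned[v]:
            stack.extend(s for s in body if s in pruned)
    return {h: bodies for h, bodies in pruned.items() if h in reachable}

# Eg3: C never terminates, B is unreachable.
G = {
    "S": [tuple("aS"), tuple("A"), tuple("C")],
    "A": [tuple("a")],
    "B": [tuple("aa")],
    "C": [tuple("aCb")],
}
G1 = remove_useless(G, "S")
```

The order matters: removing non-generating variables first can make further variables unreachable, so the reachability pass runs second.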

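20 4. Removing Unit Productions
Productions of the form A → B, with A and B variables, are unit productions.
Find all pairs with A ⇒* B from a dependency graph.
Add to P1 all non-unit productions from P.
If A ⇒* B and B → y1 | y2 | ... | yn are the non-unit productions of B, then add A → y1 | y2 | ... | yn.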

21 G: S → Aa | B; A → a | bc | B; B → A | bb
1. Non-unit rules: S → Aa; A → a | bc; B → bb
2. As S ⇒* B, S → bb is added. As S ⇒* A (through B), S → a | bc is added. As A ⇒* B, A → bb is added. As B ⇒* A, B → a | bc is added.
Result: S → Aa | bb | a | bc; A → a | bc | bb; B → bb | a | bc
[Figure: dependency graph with nodes S, A, B]
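The procedure can be sketched in Python: compute, for every variable A, the set of variables B with A ⇒* B through unit productions, then copy the non-unit bodies of each such B to A. Bodies are assumed to be tuples of one-character symbols, and remove_unit_productions is a hypothetical name. For the grammar of slide 21, S reaches A through B, so S also receives a and bc.

```python
def remove_unit_productions(grammar):
    """Replace unit productions A -> B by the non-unit bodies of B."""
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # reach[a] = all variables b with a =>* b using unit productions only.
    reach = {v: {v} for v in variables}
    changed = True
    while changed:
        changed = False
        for a in variables:
            for b in list(reach[a]):
                for body in grammar[b]:
                    if is_unit(body) and body[0] not in reach[a]:
                        reach[a].add(body[0])
                        changed = True

    # Every variable inherits the non-unit bodies of what it reaches.
    return {a: {body for b in reach[a] for body in grammar[b] if not is_unit(body)}
            for a in variables}

# Grammar of slide 21.
G = {
    "S": [tuple("Aa"), tuple("B")],
    "A": [tuple("a"), tuple("bc"), tuple("B")],
    "B": [tuple("A"), tuple("bb")],
}
G1 = remove_unit_productions(G)
```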

22 Altogether
1. Remove λ productions.
2. Remove unit productions.
3. Remove useless productions and non-terminals.

23 CNF
CNF: rules are of the form A → BC or A → a
Eg: S → AS | a; A → SA | b
Steps
1. Eliminate λ, unit and useless productions.
2. Add the productions that are already of the form A → a or A → BC to P1.
3. Consider A → x1x2x3…xn.
   If n = 1, then x1 must be a terminal (T).
   If n >= 2, introduce Ba → a for each terminal a ∈ T that occurs, and convert A → x1x2x3…xn to A → B1 B2 … Bn, where Bi = xi if xi is a variable and Bi = Ba if xi is the terminal a. Add each Ba → a to P1.
4. For n > 2, introduce new variables D1, D2, …
   Eg: A → BCDE is written as A → B D1; D1 → C D2; D2 → D E

24 G: S → ABa; A → aab; B → Ac
Result:
Na → a; Nb → b; Nc → c
S → A X1; X1 → B Na
A → Na X2; X2 → Na Nb
B → A Nc
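The conversion can be sketched in Python and checked on the grammar of slide 24. It assumes that λ, unit and useless productions are already gone, that terminals are the symbols that are not keys of the dict, and that the generated names ("N" plus the terminal, and X1, X2, …) do not clash with existing variables. The function name to_cnf is hypothetical.

```python
def to_cnf(grammar):
    """Convert a cleaned-up CFG to Chomsky normal form."""
    variables = set(grammar)
    cnf = {}
    counter = 0
    for head, bodies in grammar.items():
        cnf.setdefault(head, [])
        for body in bodies:
            if len(body) == 1:
                cnf[head].append(body)  # already of the form A -> a
                continue
            # Replace every terminal a in a long body by a proxy Na.
            symbols = []
            for sym in body:
                if sym in variables:
                    symbols.append(sym)
                else:
                    proxy = "N" + sym
                    cnf[proxy] = [(sym,)]
                    symbols.append(proxy)
            # Split bodies longer than two with fresh variables X1, X2, ...
            current = head
            while len(symbols) > 2:
                counter += 1
                fresh = "X" + str(counter)
                cnf.setdefault(current, []).append((symbols[0], fresh))
                current = fresh
                symbols = symbols[1:]
            cnf.setdefault(current, []).append(tuple(symbols))
    return cnf

# Grammar of slide 24.
G = {"S": [tuple("ABa")], "A": [tuple("aab")], "B": [tuple("Ac")]}
CNF = to_cnf(G)
```

With this naming scheme the output coincides with the result shown on the slide; a different numbering of the fresh variables would be equally valid.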

25 Remove useless symbols
Eg1: Remove useless symbols
Eg2: Convert to CNF
A → BD; A → a; S → λ
S → AA | CD | bB; A → aA | a; B → bB | bC; C → cB; D → dD | d

26 Eg3: S → A | ABa | AbA; A → Aa | λ; B → Bb | BC; C → CB | CA | bB

27 THANK YOU


29 GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*.
The right-hand sides can be long, since x may contain any number of variables.
GNF can be used to construct a PDA that recognises the language of a CFG.
Both GNF and s-grammars require that rules have the form A → ax. An s-grammar additionally requires that the first terminal a of all the A-rules be distinct; GNF does not impose such a restriction.

30 Steps
1. Remove λ productions.
2. Remove unit productions.
3. Remove useless productions and non-terminals.
4. Convert to CNF.
5. Convert to GNF.

31 GNF
Eg1: S → AB; A → aA | bB | b; B → b
GNF: S → aAB | bBB | bB; A → aA | bB | b; B → b
Eg2: S → abSb | aa
GNF: S → aYSY | aX; X → a; Y → b
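Eg2 covers only the easy part of the GNF conversion: every body already starts with a terminal, so the remaining terminals just need proxy variables. A minimal Python sketch of that single step follows; it is not a full GNF conversion. It assumes non-empty bodies that begin with a terminal, and the proxy names X and Y are passed in by the caller. The function name replace_inner_terminals is made up.

```python
def replace_inner_terminals(grammar, proxy):
    """Keep the leading terminal, replace later terminals by proxy variables."""
    variables = set(grammar)
    result = {}
    used = set()
    for head, bodies in grammar.items():
        result[head] = []
        for body in bodies:
            rest = []
            for sym in body[1:]:
                if sym in variables:
                    rest.append(sym)
                else:
                    used.add(sym)          # remember which proxies are needed
                    rest.append(proxy[sym])
            result[head].append((body[0],) + tuple(rest))
    # Add a rule such as X -> a for every proxy that was actually used.
    for terminal in sorted(used):
        result[proxy[terminal]] = [(terminal,)]
    return result

# Eg2 of slide 31 with the proxies X -> a and Y -> b.
G = {"S": [tuple("abSb"), tuple("aa")]}
GNF = replace_inner_terminals(G, {"a": "X", "b": "Y"})
```

Eg1 needs the substitution rule instead: the leading variable A in S → AB is replaced by each body of A.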

32 Removing direct left recursions
For A → A α | β, the equivalent is A → β A’ ; A’ → α A’ | λ
Eg: S → Sa | b is equivalent to S → b S’ ; S’ → a S’ | λ
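The rewriting of direct left recursion can be sketched in Python for one variable at a time. The sketch assumes bodies are tuples, the empty tuple stands for λ, there is no rule A → A, and the new variable is named by appending an apostrophe. The function name remove_direct_left_recursion is hypothetical.

```python
def remove_direct_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A' and A' -> alpha A' | lambda."""
    alphas = [b[1:] for b in bodies if b and b[0] == head]  # left-recursive tails
    betas = [b for b in bodies if not b or b[0] != head]    # all other bodies
    if not alphas:
        return {head: list(bodies)}                          # nothing to do
    new = head + "'"
    return {
        head: [beta + (new,) for beta in betas],
        new: [alpha + (new,) for alpha in alphas] + [()],   # () is lambda
    }

# Example of slide 32: S -> Sa | b.
result = remove_direct_left_recursion("S", [("S", "a"), ("b",)])
```

For the slide's example this gives S → b S' and S' → a S' | λ.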

33 Answer: S → A | C; A → B A’ | a A’; A’ → aBA’ | aCA’; B → Cb B’; B’ → bB’; C → cC | c

34 1. Grammar G:
2. Removing left recursion:
3. BAA is out of order
4. Take each rule that does not have a terminal at the start and follow the derivation until a terminal is produced

35 answer

36 Construct GNF for Answer

37 CYK algorithm
Cocke-Younger-Kasami algorithm
Uses: to recognise a CF language, to prove membership of strings in a CFL, and to construct a possible parse tree.
It can also process stochastic CFGs, where probabilities are stored in the table.
Asymptotic time complexity is Θ(n³), where n is the length of the string.

38 The input string a1 ... an has length n. The grammar G has r nonterminals R1 ... Rr, and R1 is the start symbol.
P[n, n, r] is an array of booleans initialized to false.
For each i = 1 to n
    For each terminal production Rj → ai
        set P[i, 1, j] = true
For each i = 2 to n    -- length of span
    For each j = 1 to n-i+1    -- start of span
        For each k = 1 to i-1    -- partition of span
            For each production RA → RB RC
                if P[j, k, B] and P[j+k, i-k, C] then set P[j, i, A] = true
If P[1, n, 1] is true then the string is a member of the language, else it is not.

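39 CYK Parsers
Input string: w = w1 w2 w3 … wn
wij = wi wi+1 … wj and Vij = { A ∈ V | A ⇒* wij }
w belongs to L iff S ∈ V1n.
Vii is found by examining the RHS of the rules: Vii = { A | A → wi is a production }.
For i < j, Vij is the union over k = i, …, j-1 of { A | A → BC with B ∈ Vik and C ∈ Vk+1,j }.
The running time is O(n³), where n is the string length.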

40 S → AB; A → BB | a; B → AB | b
String to test: aabbb
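A minimal Python sketch of the CYK membership test for this grammar. It assumes the grammar is already in CNF and is stored as a dict from variables to lists of tuple bodies; table[i][j] plays the role of Vij with 0-based indices. The function name cyk is made up for illustration.

```python
def cyk(grammar, start, word):
    """Return True if word is derivable from start (grammar must be in CNF)."""
    n = len(word)
    if n == 0:
        return False  # lambda is not handled in this sketch
    # table[i][j] = set of variables deriving word[i..j] (inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Substrings of length 1: rules of the form A -> a.
    for i, ch in enumerate(word):
        for head, bodies in grammar.items():
            if (ch,) in bodies:
                table[i][i].add(head)

    # Longer substrings: try every split point k and every rule A -> BC.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for head, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[i][k]
                                and body[1] in table[k + 1][j]):
                            table[i][j].add(head)
    return start in table[0][n - 1]

# Grammar of slide 40.
G = {
    "S": [("A", "B")],
    "A": [("B", "B"), ("a",)],
    "B": [("A", "B"), ("b",)],
}
```

Working the table by hand for aabbb gives V15 = {S, B}, so the string is in the language; cyk(G, "S", "aabbb") agrees.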

41 String: aaabbb

42 String : baaa

