NORMAL FORMS FDP ON THEORY OF COMPUTING

Presentation transcript:

NORMAL FORMS FDP ON THEORY OF COMPUTING By G Sudha Sadasivam Assistant Professor, CSE

CONTEXT FREE GRAMMAR
Grammars, like NFAs and DFAs, describe formal languages.
English sentence rules:
<Sentence> → <Noun Phrase> <Verb Phrase>
<Noun Phrase> → <Article> <Noun> | <Noun>
<Verb Phrase> → <Verb> | <Verb> <Noun Phrase>
<Article> → a | the
<Noun> → Sita | boy | girl | ball | dog | ...
<Verb> → caught | saw | took | ...

CONTENTS: CFG; Chomsky hierarchy; Normal forms; Unit productions; Useless productions; CNF; Applications

[Figure: parse tree for "Sita took the ball". Sentence → NP VP; NP → Noun (Sita); VP → Verb (took) NP; NP → Article (the) Noun (ball)]

Chomsky Hierarchy (Avram Noam Chomsky)
A grammar is G = (VN, VT, P, S), where VN is the set of nonterminals (variables), VT is the set of terminals (the alphabet ∑), P is the set of productions, and S is the start symbol.
In a CFG, rules are of the form A → w, with A ∈ VN and w ∈ (VT ∪ VN)*.
Type, Name: Productions (P)
Type 0, Unrestricted: α → β with α ∈ (VT ∪ VN)+ and β ∈ (VT ∪ VN)*
Type 1, Context-Sensitive: α1 A α2 → α1 β α2 with A ∈ VN, α1, α2 ∈ (VT ∪ VN)* and β ∈ (VT ∪ VN)+
Type 2, Context-Free: A → β with A ∈ VN and β ∈ (VT ∪ VN)*
Type 3, Regular (finite-state): A → bB or A → b with A, B ∈ VN and b ∈ VT*

CFG
Context-free means a variable A can be replaced by w wherever it occurs, regardless of the surrounding symbols.
CFGs are powerful enough to describe programming languages.
CFGs are simple enough to allow efficient parsing algorithms, such as LR and LL parsers.
BNF (Backus-Naur form) is used to represent CFGs. Example: S → aSb | λ
Panini described Sanskrit using a CFG.
Venpa is governed by a CFG.
Application: text mining in biomedicine.

Normal Forms
A normal form (NF) of a grammar imposes additional conditions on its productions while remaining equivalent to the given grammar.
Two types: Chomsky NF (CNF) and Greibach NF (GNF).
Productions take a simple form.
Rules in CNF have both theoretical and practical implications.

[Figure: diagram of grammar types T0, T1, T2 and T3, with GNF and CNF marked]

Examples
CNF: used by the CYK membership algorithm to find whether a string is in the language represented by the grammar.
GNF: used for conversion from a CFG to an NPDA (nondeterministic pushdown automaton) and vice versa.

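CNF
In a CFG, the RHS can be any combination of variables (V) and terminals (T).
E.g. NP → the N is reduced to DET → the and NP → DET N.
A λ-free CFG is said to be in CNF if its productions are of the form A → a or B → CD, with A, B, C, D ∈ VN and a ∈ VT.
If the CFG is not λ-free, then S → λ is also included.
As the second kind of production has exactly two variables on the RHS, such grammars are also called binary grammars.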

GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*. The resulting productions tend to be long.

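1. λ-free languages
λ-productions are productions of the form A → λ. A variable A with A ⇒* λ is nullable.
Let L be any CF language generated by G. If λ ∉ L (L is λ-free), then there is an equivalent grammar G1 with no λ-productions, so the λ-productions can be removed.
If λ ∈ L, a new start symbol S0 with S0 → S | λ is added after the other λ-productions are removed.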

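STEPS
1. Find the set VN of all nullable variables (on this slide VN denotes the nullable set, not the set of all nonterminals):
 - For every production A → λ, put A in VN.
 - Repeat until no more variables are added: for each production B → A1 A2 … An where A1, A2, …, An are all in VN, add B to VN.
2. For each production A → x1 x2 … xm with m ≥ 1, put into P1 that production together with the productions generated by replacing nullable variables with λ in all possible combinations. If all xi are nullable, the production A → λ is not put into P1.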

Example:
S → ABaC
A → BC
B → b | λ
C → D | λ
D → d
The nullable set is {A, B, C}.
Result:
S → ABaC | BaC | AaC | ABa | aC | Ba | Aa | a
A → BC | C | B
B → b
C → D
D → d
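The two steps above can be sketched in Python. This is a minimal illustration, not code from the slides: it assumes every symbol is a single character, uppercase letters are variables, and the empty string "" stands for λ. The names `nullable_variables` and `remove_lambda_productions` are made up for this example.

```python
from itertools import product

def nullable_variables(grammar):
    """Return the set of variables A with A =>* lambda ('' stands for lambda)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # all() over the empty body '' is True, which covers A -> lambda
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda_productions(grammar):
    """Build P1: keep or drop each nullable symbol in every combination."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:  # never add A -> lambda to P1
                    new_bodies.add(candidate)
        result[head] = new_bodies
    return result

# The example grammar from the slide.
grammar = {
    "S": {"ABaC"},
    "A": {"BC"},
    "B": {"b", ""},
    "C": {"D", ""},
    "D": {"d"},
}
cleaned = remove_lambda_productions(grammar)
for head, bodies in cleaned.items():
    print(head, "->", " | ".join(sorted(bodies)))
```

Sets are used for the right-hand sides so that duplicate productions produced by different combinations collapse automatically.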

2. Substitution Rule
G: A → x1 B x2 and B → y1 | y2 | … | yn
G1: A → x1 y1 x2 | x1 y2 x2 | x1 y3 x2 | … | x1 yn x2 and B → y1 | y2 | … | yn
Then L(G) = L(G1).
Example:
A → a | aaA | abBc
B → abbA | b
Then A → a | aaA | ababbAc | abbc, and B → abbA | b is unchanged.
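As a rough illustration of the substitution rule, the following Python sketch replaces the first occurrence of a variable in each right-hand side with every alternative of that variable. It assumes single-character symbols; the function name `substitute` is hypothetical.

```python
def substitute(grammar, head, target):
    """Apply A -> x1 B x2  ==>  A -> x1 y1 x2 | ... | x1 yn x2 for B -> y1 | ... | yn.

    Only the first occurrence of `target` in each body of `head` is replaced;
    the rule can be applied again if further occurrences remain.
    """
    new_bodies = []
    for body in grammar[head]:
        x1, found, x2 = body.partition(target)
        if found:
            new_bodies.extend(x1 + y + x2 for y in grammar[target])
        else:
            new_bodies.append(body)
    result = dict(grammar)
    result[head] = new_bodies
    return result

# The example from the slide: substitute B in the A-rules.
grammar = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
result = substitute(grammar, "A", "B")
print(result["A"])
```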

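3. Removing Useless Productions
Useless productions are those that do not take part in any derivation of a terminal string.
A variable A is useful if there is a derivation S ⇒* xAy ⇒* w with w ∈ VT*.
Useless variables:
1. Cannot be reached from S. E.g. S → A; A → aA | λ; B → bA (B is unreachable).
2. Cannot derive a terminal string. E.g. S → A | b; A → aA (A never terminates).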

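Steps
1. Identify the variables that can lead to a terminal string:
 - Set V1 (of G1 = (V1, T1, P1, S)) to the empty set.
 - Repeat until no more variables are added: for each production A → x1 x2 … xn with all xi in V1 ∪ T, add A to V1.
 - Add to P1 all productions in P whose symbols are all in V1 ∪ T.
2. Eliminate the variables that cannot be reached from S, using a dependency graph: a variable is useful only if it can be reached from the start symbol S.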

Eg1: S → aSb | λ | A; A → aA
A is useless: it cannot derive a terminal string.
Eg2: S → A; A → aA | λ; B → bA
B is useless: it is not reachable from the start symbol.

Eg3:
1: S → aS | A | C
2: A → a
3: B → aa
4: C → aCb
1) C is useless since it does not derive a terminal string.
2) Reachability graph (S → A, with B isolated): B is not reachable from S.
Remaining grammar: S → aS | A; A → a
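The two passes (first keep variables that derive a terminal string, then keep variables reachable from S) can be sketched as follows. This is an illustrative Python version under the same assumptions as before: single-character symbols, uppercase for variables. The name `remove_useless` is not from the slides.

```python
def remove_useless(grammar, start):
    """Drop variables that cannot derive a terminal string or cannot be reached."""
    def is_var(sym):
        return sym.isupper()

    # Pass 1: variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            if any(all(s in generating or not is_var(s) for s in b) for b in bodies):
                generating.add(head)
                changed = True
    pruned = {
        head: [b for b in bodies if all(s in generating or not is_var(s) for s in b)]
        for head, bodies in grammar.items()
        if head in generating
    }

    # Pass 2: variables reachable from the start symbol (dependency graph walk).
    reachable, frontier = set(), [start]
    while frontier:
        var = frontier.pop()
        if var in reachable or var not in pruned:
            continue
        reachable.add(var)
        for body in pruned[var]:
            frontier.extend(s for s in body if is_var(s))
    return {head: bodies for head, bodies in pruned.items() if head in reachable}

# Eg3 from the slide.
grammar = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
result = remove_useless(grammar, "S")
print(result)
```

The order of the passes matters: removing non-generating variables first can make further variables unreachable, so reachability is checked last.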

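4. Removing Unit Productions
Productions of the form A → B, with A, B ∈ VN, are unit productions.
1. Add to P1 all non-unit productions from P.
2. Find all pairs with A ⇒* B (using unit productions only) from a dependency graph.
3. If A ⇒* B and B → y1 | y2 | … | yn are the non-unit productions of B, then add A → y1 | y2 | … | yn to P1.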

Example:
S → Aa | B
A → a | bc | B
B → A | bb
1. Non-unit rules: S → Aa; A → a | bc; B → bb
2. Dependency graph: S → B, B → A, A → B, so S ⇒* B, S ⇒* A, A ⇒* B and B ⇒* A.
 As S ⇒* B, S → bb is added. As S ⇒* A, S → a | bc is added.
 As A ⇒* B, A → bb is added.
 As B ⇒* A, B → a | bc is added.
Result:
S → Aa | a | bc | bb
A → a | bc | bb
B → bb | a | bc
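A small Python sketch of unit-production removal follows. It assumes single-character symbols with uppercase variables, and `remove_unit_productions` is an illustrative name. For each variable it collects every variable reachable through unit productions alone, then copies over their non-unit productions. Note that in the example S reaches A through B, so S also receives a | bc.

```python
def remove_unit_productions(grammar):
    """Replace unit productions A -> B by the non-unit productions of every
    variable reachable from A through unit productions only."""
    def is_unit(body):
        return len(body) == 1 and body.isupper()

    result = {}
    for head in grammar:
        # Walk the dependency graph of unit productions starting at `head`.
        closure, frontier = {head}, [head]
        while frontier:
            var = frontier.pop()
            for body in grammar.get(var, ()):
                if is_unit(body) and body not in closure:
                    closure.add(body)
                    frontier.append(body)
        result[head] = {
            body
            for var in closure
            for body in grammar.get(var, ())
            if not is_unit(body)
        }
    return result

# The example from the slide.
grammar = {"S": {"Aa", "B"}, "A": {"a", "bc", "B"}, "B": {"A", "bb"}}
result = remove_unit_productions(grammar)
for head, bodies in result.items():
    print(head, "->", " | ".join(sorted(bodies)))
```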

Altogether
1. Remove λ-productions.
2. Remove unit productions.
3. Remove useless productions and nonterminals.

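CNF
Rules are of the form A → BC or A → a. E.g. S → AS | a; A → SA | b
Steps
1. Eliminate λ-productions, unit productions and useless productions.
2. Add productions already of the form A → a or A → BC to P1.
3. Consider each remaining production A → x1 x2 … xn:
 - If n = 1, then x1 must be a terminal (no unit productions remain).
 - If n ≥ 2, introduce a new variable Ba with Ba → a for each terminal a that occurs, convert A → x1 x2 … xn to A → B1 B2 … Bn (where Bi = xi if xi is a variable and Bi = Ba if xi = a), and add the productions Ba → a to P1.
4. For n > 2, introduce new variables D1, D2, … to break the RHS into pairs. E.g. A → BCDE is written as A → B D1; D1 → C D2; D2 → DE.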

G: S → ABa; A → aab; B → Ac
Result:
Na → a; Nb → b; Nc → c
S → A X1; X1 → B Na
A → Na X2; X2 → Na Nb
B → A Nc
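The conversion of this example can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the input has no λ-productions or unit productions, each right-hand side is a tuple of symbols, terminals are lowercase strings, and the generated names (Na, Nb, …, X1, X2, …) do not clash with existing variables. The function name `to_cnf` is hypothetical.

```python
def to_cnf(grammar):
    """Convert a grammar without lambda or unit productions to CNF."""
    result = {head: [] for head in grammar}
    terminal_rules = {}
    fresh = 0
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                result[head].append(body)  # A -> a is already in CNF
                continue
            # Replace each terminal a by a variable Na with Na -> a.
            symbols = []
            for sym in body:
                if sym.islower():
                    terminal_rules["N" + sym] = [(sym,)]
                    symbols.append("N" + sym)
                else:
                    symbols.append(sym)
            # Break bodies longer than two into a chain of binary rules.
            current = head
            while len(symbols) > 2:
                fresh += 1
                new_var = f"X{fresh}"
                result.setdefault(current, []).append((symbols[0], new_var))
                current, symbols = new_var, symbols[1:]
            result.setdefault(current, []).append(tuple(symbols))
    result.update(terminal_rules)
    return result

# G from the slide: S -> ABa, A -> aab, B -> Ac
grammar = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_cnf(grammar)
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

Tuples are used instead of plain strings here so that multi-character variable names such as X1 and Na remain single symbols.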

Exercises
Eg1: Remove useless symbols.
Eg2: Convert to CNF.
A → BD; A → a; S → λ
S → AA | CD | bB; A → aA | a; B → bB | bC; C → cB; D → dD | d

Eg3: S → A | ABa | AbA; A → Aa | λ; B → Bb | BC; C → CB | CA | bB

THANK YOU


GNF
Productions are of the form A → ax, where a ∈ VT and x ∈ VN*. The resulting productions tend to be long.
GNF can be used to construct a PDA that recognises the language of a CFG.
Both GNF and s-grammars require that rules have the form A → ax. s-grammars additionally require that the leading terminals of all the A-rules be distinct; GNF does not impose such a restriction.

1. Remove λ-productions.
2. Remove unit productions.
3. Remove useless productions and nonterminals.
4. Convert to CNF.
5. Convert to GNF.

GNF
Eg1: S → AB; A → aA | bB | b; B → b
GNF: S → aAB | bBB | bB; A → aA | bB | b; B → b
Eg2: S → abSb | aa
GNF: S → aYSY | aX; X → a; Y → b

Removing direct left recursions
For A → A α | β, the equivalent is A → β A'; A' → α A' | λ
E.g. S → Sa | b is equivalent to S → b S'; S' → a S' | λ
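The transformation above can be written as a small Python sketch. The representation (right-hand sides as tuples of symbols, the empty tuple for λ) and the name `remove_direct_left_recursion` are assumptions made for illustration.

```python
def remove_direct_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | lambda.

    Each body is a tuple of symbols; the empty tuple () stands for lambda.
    """
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: list(bodies)}  # nothing to do
    new_head = head + "'"
    return {
        head: [beta + (new_head,) for beta in betas],
        new_head: [alpha + (new_head,) for alpha in alphas] + [()],
    }

# The example from the slide: S -> Sa | b
rules = remove_direct_left_recursion("S", [("S", "a"), ("b",)])
print(rules)
```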

Answer:
S → A | C
A → B A' | a A'
A' → aBA' | aCA'
B → Cb B'
B' → bB'
C → cC | c

1. Grammar G:
2. Removing left recursion:
3. B → AA is out of order
4. Take each rule that does not have a terminal at the start and follow the derivation until a terminal is produced

answer

Construct GNF for Answer

CYK algorithm
Cocke-Younger-Kasami algorithm
- To recognise a CF language, i.e. to decide membership of strings in a CFL.
- To construct a possible parse tree.
- Can also process stochastic CFGs, where probabilities are stored in the table.
- Worst-case time complexity is Θ(n^3), where n is the length of the string.

Input: string a1 … an of length n. The grammar (in CNF) has r nonterminals R1 … Rr, and R1 is the start symbol.
P[n, n, r] is an array of booleans initialized to false.
For each i = 1 to n
  For each terminal production Rj → ai, set P[i, 1, j] = true.
For each i = 2 to n -- length of span
  For each j = 1 to n-i+1 -- start of span
    For each k = 1 to i-1 -- partition of span
      For each production RA → RB RC
        if P[j, k, B] and P[j+k, i-k, C] then set P[j, i, A] = true
If P[1, n, 1] is true then the string is a member of the language, else it is not.

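CYK Parsers
Input string: w = w1 w2 w3 … wn; wij = wi wi+1 … wj and Vij = { A ∈ V | A ⇒* wij }.
w belongs to L iff S ∈ V1n.
Vii is found by examining the RHS of the rules: A ∈ Vii iff A → wi is a production.
For j > i, Vij is the union over k = i … j-1 of { A | A → BC with B ∈ Vik and C ∈ V(k+1)j }.
Time complexity is O(n^3), where n is the string length.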

Grammar: S → AB; A → BB | a; B → AB | b
String to test: aabbb
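A compact Python sketch of the CYK table construction, applied to this grammar, is given below. The encoding of the grammar as two dictionaries (`unary` for rules A → a and `binary` for rules A → BC) and the function name `cyk` are choices made for this illustration, not part of the slides.

```python
def cyk(word, unary, binary, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unary:  dict terminal -> set of variables A with A -> terminal
    binary: dict (B, C)   -> set of variables A with A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # lambda is not handled in this sketch
    # table[i][j] holds V_ij for the substring word[i..j] (0-based, inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][i] = set(unary.get(ch, ()))
    for length in range(2, n + 1):          # length of span
        for i in range(n - length + 1):     # start of span
            j = i + length - 1
            for k in range(i, j):           # partition of span
                for b in table[i][k]:
                    for c in table[k + 1][j]:
                        table[i][j] |= binary.get((b, c), set())
    return start in table[0][n - 1]

# S -> AB, A -> BB | a, B -> AB | b
unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "B"): {"S", "B"}, ("B", "B"): {"A"}}
print(cyk("aabbb", unary, binary))
```

For aabbb the top cell V15 contains S (via S → AB with A ∈ V11 and B ∈ V25), so the string is in the language.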

String: aaabbb

String : baaa