1 Chomsky Normal Form of CFG’s Definition Purpose Method of Constuction.

Slides:



Advertisements
Similar presentations
Closure Properties of CFL's
Advertisements

Context-Free Grammars
Simplifying CFGs There are several ways in which context-free grammars can be simplified. One natural way is to eliminate useless symbols those that cannot.
About Grammars CS 130 Theory of Computation HMU Textbook: Sec 7.1, 6.3, 5.4.
Fall 2004COMP 3351 Simplifications of Context-Free Grammars.
Prof. Busch - LSU1 Simplifications of Context-Free Grammars.
SIMPLIFYING GRAMMARS Definition: A useless symbol of a context-free
Context Free Grammars.
Lecture 17 Naveen Z Quazilbash Simplification of Grammars.
Dept. of Computer Science & IT, FUUAST Automata Theory 2 Automata Theory VII.
FORMAL LANGUAGES, AUTOMATA, AND COMPUTABILITY
Western Michigan University CS6800 Advanced Theory of Computation Spring 2014 By Abduljaleel Alhasnawi & Rihab Almalki.
CS 3240 – Chapter 6.  6.1: Simplifying Grammars  Substitution  Removing useless variables  Removing λ  Removing unit productions  6.2: Normal Forms.
Chapter 4 Normal Forms for CFGs Chomsky Normal Form n Defn A CFG G = (V, , P, S) is in chomsky normal form if each rule in G has one of.
CS5371 Theory of Computation
CS 310 – Fall 2006 Pacific University CS310 Pushdown Automata Sections: 2.2 page 109 October 11, 2006.
Lecture Note of 12/22 jinnjy. Outline Chomsky Normal Form and CYK Algorithm Pumping Lemma for Context-Free Languages Closure Properties of CFL.
1 CSC 3130: Automata theory and formal languages Tutorial 4 KN Hung Office: SHB 1026 Department of Computer Science & Engineering.
Normal Forms for CFG’s Eliminating Useless Variables Removing Epsilon
1 Context-Free Grammars Formalism Derivations Backus-Naur Form Left- and Rightmost Derivations.
Normal forms for Context-Free Grammars
1 Module 32 Chomsky Normal Form (CNF) –4 step process.
How to Convert a Context-Free Grammar to Greibach Normal Form
January 15, 2014CS21 Lecture 61 CS21 Decidability and Tractability Lecture 6 January 16, 2015.
1 Background Information for the Pumping Lemma for Context-Free Languages Definition: Let G = (V, T, P, S) be a CFL. If every production in P is of the.
Cs466(Prasad)L8Norm1 Normal Forms Chomsky Normal Form Griebach Normal Form.
Context-Free Grammars Chapter 3. 2 Context-Free Grammars and Languages n Defn A context-free grammar is a quadruple (V, , P, S), where  V is.
Homework #7 Solutions. #1. Use the pumping lemma for CFL’s to show L = {a i b j a i b j | i, j > 0} is not a CFL. Proof by contradiction using the Pumping.
Decision Properties of Regular Languages
Properties of Context-Free Languages
CS 3813 Introduction to Formal Languages and Automata Chapter 6 Simplification of Context-free Grammars and Normal Forms These class notes are based on.
1 Properties of Context-free Languages Reading: Chapter 7.
Lecture 21: Languages and Grammars. Natural Language vs. Formal Language.
Chapter 4 Context-Free Languages Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1.
Context-Free Grammars
Lecture 16 Oct 18 Context-Free Languages (CFL) - basic definitions Examples.
CONVERTING TO CHOMSKY NORMAL FORM
Formal Languages Context free languages provide a convenient notation for recursive description of languages. The original goal of CFL was to formalize.
Context-Free Grammars Normal Forms Chapter 11. Normal Forms A normal form F for a set C of data objects is a form, i.e., a set of syntactically valid.
Normal Forms for Context-Free Grammars Definition: A symbol X in V  T is useless in a CFG G=(V, T, P, S) if there does not exist a derivation of the form.
The Pumping Lemma for Context Free Grammars. Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar.
Context-Free Grammars – Chomsky Normal Form Lecture 16 Section 2.1 Wed, Sep 26, 2007.
CSCI 2670 Introduction to Theory of Computing September 21, 2004.
1 Parse Trees Definitions Relationship to lm and rm Derivations Ambiguity in Grammars.
CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Ambiguity.
Context Free Grammar. Introduction Why do we want to learn about Context Free Grammars?  Used in many parsers in compilers  Yet another compiler-compiler,
Section 12.4 Context-Free Language Topics
Chapter 6 Simplification of Context-free Grammars and Normal Forms These class notes are based on material from our textbook, An Introduction to Formal.
1 A well-parenthesized string is a string with the same number of (‘s as )’s which has the property that every prefix of the string has at least as many.
1 Simplification of Context-Free Grammars Some useful substitution rules. Removing useless productions. Removing -productions. Removing unit-productions.
Closure Properties Lemma: Let A 1 and A 2 be two CF languages, then the union A 1  A 2 is context free as well. Proof: Assume that the two grammars are.
1 Chapter 6 Simplification of CFGs and Normal Forms.
Grammars A grammar is a 4-tuple G = (V, T, P, S) where 1)V is a set of nonterminal symbols (also called variables or syntactic categories) 2)T is a finite.
Context-Free Languages
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Normal forms.
Chomsky Normal Form.
1 A well-parenthesized string is a string with the same number of (‘s as )’s which has the property that every prefix of the string has at least as many.
About Grammars Hopcroft, Motawi, Ullman, Chap 7.1, 6.3, 5.4.
Normal Forms for CFG’s Eliminating Useless Variables Removing Epsilon
7. Properties of Context-Free Languages
Simplifications of Context-Free Grammars
Simplifications of Context-Free Grammars
Lecture 14 Grammars – Parse Trees– Normal Forms
Jaya Krishna, M.Tech, Assistant Professor
Properties of Context-free Languages
7. Properties of Context-Free Languages
Chapter 6 Simplification of Context-free Grammars and Normal Forms
Properties of Context-Free Languages
Normal forms and parsing
Context-Free Languages
Presentation transcript:

1 Chomsky Normal Form of CFG’s Definition Purpose Method of Constuction

2 uA construct used to establish properties of context-free languages (CFLs)  Every CFL without  can be generated by a CFG in Chomsky normal form.  To show that language without  is a CFL it is sufficient to show that it has a CFG in Chomsky normal form. uTypical approach to closure properites Chomsky Normal Form: Purpose

3 Chomsky Normal Form: Definition A context free grammar (CFG) in which all production are of the form A->BC or A->a, where A, B and C are variables and a is a terminal

4 uEliminate “useless: symbols  Variables or terminals that do not appear in any derivation of a terminal string from the start symbol  Eliminate  -productions  A->  uEliminate unit-productions wA->B for variables A and B Chomsky Normal Form: method of construction

5 uFor each elimination task, a method will be defined reclusively by an inductive proof. uOrder in which tasks are preformed is important Chomsky Normal Form: method of construction - 2

6 Generating and Reachable Symbols uX is generating if X =>* w (terminal string) uIf X is a terminal, then it can generate itself in zero steps.  X is reachable if S =>*  X  for some  and , (S is a start symbol) uAny symbol that is not generating and reachable is useless

7 Induction to find generating variables uBasis: If there is a production A -> w, where w is a terminal string, then A is generating. uInduction: If there is a production A -> , where  consists only of terminals and variables known to derive a terminal string, then A derives a terminal string; hence is generating.

8 Algorithm to eliminate non- generating variables 1.Discover all variables that derive terminal strings. 2.For all other variables, remove all productions in which they appear either on the LHS or RHS of ->.

9 Example: finding generating variables S->AB|C, A->aA|a, B->bB, C->c uBasis: A and C are generating due to productions A->a and C->c. uInduction: S is generating due to production S->C. uEliminate B->bB and S->AB uResult: S->C, A->aA|a, C->c uStill have unreachable variables

10 Finding reachable symbols uBasis: Obviously, start symbol is reachable. uInduction: if we can reach A, and there is a production A-> , then we can reach all symbols of . uIn result from previous slide wS->C, A->aA|a, C->c uOnly S and C are reachable

11 Epsilon Productions  Theorem: If L is a CFL with no empty string, then it has a CFG which can be put in Chomsky form with no  -productions.  A->  is clearly an  -production  To eliminate all types  -productions, we must first discover the nullable variables,  i.e. variables A such that A =>* ε.

12 Inductive definition of nullable symbols  Basis: If there is a production A -> ε, then A is nullable. uInduction: If there is a production A -> , and all symbols of  are nullable, then A is nullable.

13 Example: Nullable Symbols S->AB, A->aA| ε, B->bB|A  A is nullable because of A -> ε. uB is nullable because of B -> A. uS is nullable because of S -> AB.

14 Algorithm to eliminate  -productions uIdentify all nullable symbols. uConsider each production A->X 1 …X n that contains nullable symbols uSuppose A->X 1 …X n contains m<n nullable symbols uConstruct a family of productions with 2 m members that are all combinations of nullable symbols present or absent uIf m=n exclude case with all symbols absent

15 Eliminating  -productions  The new CFG with no  -productions consist of all families of productions derived from productions with nullable symbols uPlus all productions from the original CFG that did not contain nullable symbols

16 Example: Eliminating ε -Productions S->ABC, A->aA| ε, B->bB| ε, C-> ε uA, B, C, and S are all nullable. uProductions S->ABC|AB|AC|BC|A|B|C come from S->ABC uProductions A->aA|a come from A->aA uProductions B->bB|b come from B->bB

17 Eliminating ε -Productions continued S->ABC, A->aA| ε, B->bB| ε, C-> ε uNo contribution to CNF from original CFG uC is not generating uEliminate C in productions of the new CFG S -> ABC | AB | AC | BC | A | B | C A -> aA | a B -> bB | b

18 Define Unit Productions uA unit production is a production whose right side consists of exactly one variable. uA->a is not a unit production if a is terminal uEliminate by expansion is most common approach

19 Eliminate by expansion uIn the CFG defined by wE->T|E+T wT->F|T*F wF->I|(E) wI->a|Ia uE->T eliminated by E->F|T*F|E+T uE->F eliminated by E->I|(E)|T*F|E+T uE->I eliminated by E->a|Ia|(E)|T*F|E+T

20 Eliminate by expansion uWill not work on cycles of unit productions wA->B wB->C wC->A uAlternative: find all pairs (A,B) such that A=>*B by a sequence of unit productions u Works in all cases.

21 Alternative to expansion in eliminating unit productions uBasic idea: If A=>*B by a series of unit productions, and B->  is a non- unit-production, then add production A->  and drop the unit productions. uExample

22 Example of basic idea uIn the CFG defined by wE->T|E+T wT->F|T*F wF->I|(E) wI->a|Ia uE=>*I by the series of unit productions E->T, T->F, F->I uI->a is a non-unit production. uReplace by E->a uE->a|Ia|(E)|T*F|E+T (same as expansion method)

23 Pair search defined by induction uFind all pairs (A,B) such that A=>*B by a sequence of unit productions only. uBasis: A=>*A, therefor (A,A). uInduction: If we have found (A,B), and B->C is a unit production, then add (A,C)

24 Example of pair search uIn CFG defined by wE->T|E+T wT->F|T*F wF->I|(E) wI->a|Ia uObviously (E,T), (T,F), (F,I) u(T,I) and (E,F) also

25 Cleaning up a Grammar  Theorem: if L is a CFL, then there is a CFG for L – { ε } that has: 1.No useless symbols. 2.No ε -productions. 3.No unit productions. uevery right side of a production is either a single terminal or has length > 2.

26 Clean-up continued uProof: Start with a CFG for L. uPerform the following steps in order: 1.Eliminate ε -productions. 2.Eliminate unit productions. 3.Eliminate variables that derive no terminal string. 4.Eliminate variables not reached from the start symbol. Must be first. Can create unit productions and useless variables.

27 Chomsky Normal Form uA CFG is said to be in Chomsky Normal Form if every production is of one of these two forms: 1.A -> BC (right side is two variables). 2.A -> a (right side is a single terminal).  Theorem: If L is a CFL, then L – { ε } has a CFG in CNF.

28 Proof by construction uStep 1: “Clean” the grammar, so every production has right side either a single terminal or length >2. uStep 2: For each right side  a single terminal, make the right side all variables. wFor each terminal a create new variable A a and production A a -> a. (not a unit production) wReplace a by A a in right sides of productions.

29 Example: Step 2 uConsider production A -> BcDe. uWe need variables A c and A e. with productions A c -> c and A e -> e. wNote: you create at most one variable for each terminal, and use it everywhere it is needed. uReplace A -> BcDe by A -> BA c DA e.

30 CNF construction: final step uStep 3: Break right sides longer than 2 into a chain of productions with right sides of two variables. uExample: A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE. wF and G must be used nowhere else.

31 Example text p266 S->AB A->aAA|  B->bBB| 

32 Assignment 11, Due Exercise text p 275 and 277

33