Section 12.4 Context-Free Language Topics

Algorithm. Remove Λ-productions from grammars for languages without Λ.
1. Find the nonterminals that derive Λ.
2. For each production A → w, construct all productions A → w′, where w′ is obtained from w by removing one or more occurrences of the nonterminals found in step 1.
3. Combine the original productions with those of step 2 and eliminate any Λ-productions.

Example. Remove Λ-productions from the grammar
S → ABc
A → aA | Λ
B → bB | Λ

Solution.
Step 1: The nonterminals A and B derive Λ.
Step 2: From the production S → ABc we construct S → Bc | Ac | c. From the production A → aA we construct A → a. From the production B → bB we construct B → b.
Step 3:
S → ABc | Bc | Ac | c
A → aA | a
B → bB | b

Quiz. Remove Λ-productions from
S → ABc | Ab | c
A → ABa | Λ
B → Bbc | Λ

Solution.
S → ABc | Ab | c | Bc | Ac | b
A → ABa | Ba | Aa | a
B → Bbc | bc
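The three steps above can be sketched in Python. This is an illustrative sketch, not code from the slides: the grammar representation (a dict from nonterminal to a set of right sides, each a tuple of one-character symbols, with the empty tuple standing for Λ) and the helper names `parse`, `nullable_nonterminals`, and `remove_lambda_productions` are assumptions made for the example.

```python
from itertools import product

def parse(text):
    """Hypothetical helper: turn 'aA | Λ' into {('a', 'A'), ()}."""
    return {tuple(alt.strip().replace("Λ", "")) for alt in text.split("|")}

def nullable_nonterminals(grammar):
    """Step 1: find every nonterminal that derives the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body derives Λ when all of its symbols are nullable;
            # the empty body () satisfies this trivially.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda_productions(grammar):
    """Steps 2 and 3: add the shortened productions, then drop A -> Λ."""
    nullable = nullable_nonterminals(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or removed; others are kept.
            choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for pick in product(*choices):
                candidate = tuple(s for part in pick for s in part)
                if candidate:  # skip the empty right side
                    new_bodies.add(candidate)
        result[head] = new_bodies
    return result

# The example grammar from the slide.
example = {"S": parse("ABc"), "A": parse("aA | Λ"), "B": parse("bB | Λ")}
assert remove_lambda_productions(example) == {
    "S": parse("ABc | Bc | Ac | c"),
    "A": parse("aA | a"),
    "B": parse("bB | b"),
}
```

Keeping right sides in sets makes duplicate productions (such as the second S → c in the quiz) collapse automatically.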

Chomsky Normal Form. Productions have one of the following forms:
A → a (a is a terminal)
A → BC
S → Λ (if Λ is in the language)

Advantages: Parse trees are binary, which makes them easy to represent. Any string of length n > 0 can be derived in 2n - 1 steps.

Algorithm. Transform a grammar to Chomsky normal form (with the additional property that the start symbol does not occur on the right side of any production).
1. If the start symbol S occurs on some right side, create a new start symbol S′ and a new production S′ → S.
2. Remove each production A → Λ (if A ≠ S) by the previous algorithm. (If S → Λ is removed, add it back.)
3. Remove unit productions (i.e., A → B): If A → B or A ⇒+ B (A derives B in one or more steps), then construct the productions A → w, where B → w is not a unit production. Now remove all unit productions.
4. For each production whose right side has two or more symbols, replace all occurrences of each terminal a with a new nonterminal A and also add the new production A → a.
5. Replace each production B → C1…Cn with n > 2 by B → C1D, where D → C2…Cn. Repeat this step until all right sides have length two.

Example. Construct a Chomsky normal form for the grammar
S → aSb | D
D → Dc | Λ

Solution.
Step 1: Add the production S′ → S.
Step 2:
S′ → S | Λ
S → aSb | ab | D
D → Dc | c
Step 3:
S′ → aSb | ab | Dc | c | Λ
S → aSb | ab | Dc | c
D → Dc | c
Step 4:
S′ → ASB | AB | DC | c | Λ
S → ASB | AB | DC | c
D → DC | c
A → a
B → b
C → c
Step 5: Replace S′ → ASB and S → ASB by S′ → AE, S → AE, and E → SB.

Quiz. Construct a Chomsky normal form for the grammar
S → aTbb | U | Λ
T → cT | c
U → Ud | d

Solution.
Step 1: No change.
Step 2: No change.
Step 3: Remove the unit production S → U to obtain S → aTbb | Ud | d | Λ.
Step 4: Transform right sides of length at least two into strings of nonterminals.
S → ATBB | UD | d | Λ
T → CT | c
U → UD | d
A → a
B → b
C → c
D → d
Step 5: Replace S → ATBB with the productions S → AE, E → TF, F → BB.
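Steps 4 and 5 of the Chomsky algorithm are mechanical enough to sketch in Python. The sketch below assumes steps 1 to 3 are already done and starts from the quiz grammar after step 3. The representation (dict of nonterminal to set of tuples) and all function names are assumptions for illustration, and the fresh nonterminals are called `T_a`, `X1`, `X2` here rather than A, E, F as in the slide; the sketch also assumes those fresh names do not clash with existing nonterminals.

```python
def parse(text):
    """Hypothetical helper: turn 'cT | Λ' into {('c', 'T'), ()}."""
    return {tuple(alt.strip().replace("Λ", "")) for alt in text.split("|")}

def lift_terminals(grammar):
    """Step 4: in right sides of length >= 2, replace terminal a by T_a."""
    result = {head: set() for head in grammar}
    extra = {}
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) < 2:
                result[head].add(body)
                continue
            new_body = []
            for sym in body:
                if sym in grammar:          # nonterminal: keep as is
                    new_body.append(sym)
                else:                       # terminal: use T_sym, add T_sym -> sym
                    extra["T_" + sym] = {(sym,)}
                    new_body.append("T_" + sym)
            result[head].add(tuple(new_body))
    result.update(extra)
    return result

def binarize(grammar):
    """Step 5: split B -> C1...Cn (n > 2) into B -> C1 D and D -> C2...Cn."""
    result = {head: set() for head in grammar}
    names = {}  # identical tails share one fresh nonterminal
    for head, bodies in grammar.items():
        for body in bodies:
            current = head
            while len(body) > 2:
                tail = body[1:]
                if tail not in names:
                    names[tail] = f"X{len(names) + 1}"
                result.setdefault(current, set()).add((body[0], names[tail]))
                current, body = names[tail], tail
            result.setdefault(current, set()).add(body)
    return result

def is_cnf(grammar, start):
    """Check the three allowed production forms."""
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 0 and head != start:
                return False
            if len(body) == 1 and body[0] in grammar:
                return False
            if len(body) == 2 and not all(s in grammar for s in body):
                return False
            if len(body) > 2:
                return False
    return True

# Quiz grammar after step 3 (unit production S -> U already removed).
after_step_3 = {
    "S": parse("aTbb | Ud | d | Λ"),
    "T": parse("cT | c"),
    "U": parse("Ud | d"),
}
cnf = binarize(lift_terminals(after_step_3))
assert is_cnf(cnf, "S")
```

Sharing one fresh nonterminal per distinct tail mirrors the example solution, where S′ → ASB and S → ASB both reuse E → SB.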

Greibach Normal Form. Productions have one of the following forms:
A → b (b is a terminal)
A → bD1…Dk
S → Λ (if Λ is in the language)

Advantage: Any string of length n > 0 can be derived in n steps.

Algorithm (idea). Transform a context-free grammar to Greibach normal form.
1. Perform steps 1, 2, and 3 of the Chomsky algorithm.
2. Remove all left recursion, including indirect left recursion, without adding Λ-productions.
3. Make substitutions to transform the grammar into the proper form.

Example. Put the following grammar into Greibach normal form.
S → AB | Ac | d
A → aA | a
B → Ab | c

Solution. Steps 1 (Chomsky steps 1, 2, and 3) and 2 are not needed.
Step 3: Replace A in S → AB | Ac | d with aA | a to obtain S → aAB | aB | aAc | ac | d. Replace A in B → Ab | c with the right side of A → aA | a to obtain B → aAb | ab | c. Now add the new productions C → c and D → b and make the appropriate replacements to obtain the proper form:
S → aAB | aB | aAC | aC | d
A → aA | a
B → aAD | aD | c
C → c
D → b
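The substitution step of the example can be replayed in Python. This is only a sketch of step 3 for a grammar that already has no left recursion; the grammar representation and the names `substitute_leading`, `lift_inner_terminals`, and `is_gnf` are assumptions for illustration, not part of the slides.

```python
def parse(text):
    """Hypothetical helper: turn 'aA | a' into {('a', 'A'), ('a',)}."""
    return {tuple(alt.strip()) for alt in text.split("|")}

def substitute_leading(grammar, head, target):
    """Replace a leading `target` in the productions of `head` by each
    right side of `target`."""
    new_bodies = set()
    for body in grammar[head]:
        if body and body[0] == target:
            for replacement in grammar[target]:
                new_bodies.add(replacement + body[1:])
        else:
            new_bodies.add(body)
    return {**grammar, head: new_bodies}

def lift_inner_terminals(grammar, names):
    """Replace terminals after the first position by the nonterminals given
    in `names` (terminal -> new nonterminal) and add the new productions."""
    result = {}
    for head, bodies in grammar.items():
        result[head] = {
            body[:1] + tuple(names.get(sym, sym) for sym in body[1:])
            for body in bodies
        }
    for terminal, nonterminal in names.items():
        result.setdefault(nonterminal, set()).add((terminal,))
    return result

def is_gnf(grammar, start):
    """Each right side is a terminal followed by nonterminals, or S -> Λ."""
    for head, bodies in grammar.items():
        for body in bodies:
            if not body:
                if head != start:
                    return False
            elif body[0] in grammar or any(s not in grammar for s in body[1:]):
                return False
    return True

grammar = {"S": parse("AB | Ac | d"), "A": parse("aA | a"), "B": parse("Ab | c")}
grammar = substitute_leading(grammar, "S", "A")
grammar = substitute_leading(grammar, "B", "A")
gnf = lift_inner_terminals(grammar, {"c": "C", "b": "D"})
assert gnf == {
    "S": parse("aAB | aB | aAC | aC | d"),
    "A": parse("aA | a"),
    "B": parse("aAD | aD | c"),
    "C": parse("c"),
    "D": parse("b"),
}
assert is_gnf(gnf, "S")
```

The order of substitutions matters in general: a nonterminal should only be substituted once its own right sides already begin with terminals, as A's do here.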

Properties of Context-Free Languages. Knowing some properties of context-free languages helps us argue, by way of contradiction (BWOC), that certain languages are not context-free.

The Pumping Lemma. If L is an infinite context-free language, then any grammar for L must be recursive, so there must be derivations of the following form, where u, v, w, x, and y are terminal strings:
S ⇒+ uNy
N ⇒+ vNx (where v and x are not both Λ)
N ⇒+ w

These derivations lead to derivations like
S ⇒+ uNy ⇒+ uvNxy ⇒+ u v^2 N x^2 y ⇒+ u v^k N x^k y ⇒+ u v^k w x^k y ∈ L for all k ∈ ℕ.

This is the basis for the Pumping Lemma: There is an integer m > 0 such that if z ∈ L and |z| ≥ m, then z has the form z = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ ℕ.

Note: The number m depends on the grammar, as we'll see in the following example.

Example. Suppose we have the following grammar for {Λ, bbc} ∪ {a b c^n d | n ∈ ℕ}:
S → aNd | bbc | Λ
N → Nc | b

Here are a few derivations:
S ⇒ aNd ⇒ abd
S ⇒ aNd ⇒ aNcd ⇒ abcd
S ⇒ aNd ⇒ aNcd ⇒ aNccd ⇒ abccd
S ⇒+ a b c^k d for any k in ℕ.

For this grammar m = 4 can be used in the pumping lemma because any derivation of a string z with |z| ≥ 4 must use the nonterminal N. For example, if |z| = 8 and z = abcccccd, then the pumping lemma factors z = abcccccd = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ 4 and u v^k w x^k y ∈ L for all k ∈ ℕ. In this case let u = a, v = Λ, w = b, x = c, and y = ccccd.

Example. The language L = {a^n b^n c^(n+k) | k, n ∈ ℕ} is not context-free.

Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma applies. Choose z = a^m b^m c^m, where m is the positive integer from the lemma. Then z = a^m b^m c^m = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ ℕ.

Observe that neither v nor x can contain distinct letters. For example, if v = …a…b…, then v^2 = …a…b……a…b…, which can't appear as a substring of any string in L. So v and x must each be a string of repeated occurrences of a single letter.

Now since |vwx| ≤ m, there are two possible places in a^m b^m c^m where v and x can occur:
(1) v and x occur in a^m b^m.
(2) v and x occur in b^m c^m.

But we obtain the following contradictions because v and x are not both Λ.
(1) Let k = 2 to obtain u v^2 w x^2 y = a^(m+i) b^(m+j) c^m, where i > 0 or j > 0. Either the numbers of a's and b's differ, or there are more a's than c's. So u v^2 w x^2 y ∉ L.
(2) Let k = 0 to obtain uwy = a^m b^(m-i) c^(m-j), where i > 0 or j > 0. Either there are fewer b's than a's, or fewer c's than a's. So uwy ∉ L.

These contradictions imply that L is not context-free. QED.
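Both examples can be sanity-checked by brute force for small values. The Python sketch below is not a proof: it only confirms the factorization u = a, v = Λ, w = b, x = c, y = ccccd for a few values of k, and it confirms that for m = 1, …, 5 no factorization of a^m b^m c^m survives pumping with k = 0, 1, 2. The function names and the regular-expression membership tests are assumptions made for this illustration.

```python
import re

def in_first_language(s):
    """Membership in {Λ, bbc} ∪ {a b c^n d | n >= 0}."""
    return s in ("", "bbc") or re.fullmatch(r"abc*d", s) is not None

def in_second_language(s):
    """Membership in {a^n b^n c^(n+k) | k, n >= 0}."""
    match = re.fullmatch(r"(a*)(b*)(c*)", s)
    if match is None:
        return False
    a, b, c = (len(group) for group in match.groups())
    return a == b and c >= a

def splits(z, m):
    """Yield every factorization z = uvwxy with 1 <= |vx| <= |vwx| <= m."""
    n = len(z)
    for start in range(n + 1):                           # u = z[:start]
        for end in range(start, min(start + m, n) + 1):  # vwx = z[start:end]
            for i in range(start, end + 1):              # v = z[start:i]
                for j in range(i, end + 1):              # w = z[i:j], x = z[j:end]
                    if (i - start) + (end - j) >= 1:
                        yield z[:start], z[start:i], z[i:j], z[j:end], z[end:]

def pumps(parts, member, ks):
    """True if u v^k w x^k y is in the language for every k in ks."""
    u, v, w, x, y = parts
    return all(member(u + v * k + w + x * k + y) for k in ks)

# The factorization from the first example pumps for k = 0..5.
assert pumps(("a", "", "b", "c", "ccccd"), in_first_language, range(6))

# For small m, no factorization of a^m b^m c^m pumps in the second language.
for m in range(1, 6):
    z = "a" * m + "b" * m + "c" * m
    assert not any(pumps(p, in_second_language, (0, 1, 2)) for p in splits(z, m))
```

Checking k = 0 and k = 2 is enough here because those are exactly the two values the proof uses in cases (2) and (1).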

Example/Quiz. Prove that the language L = {ss | s ∈ {a, b}*} is not context-free.

Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma applies. Choose z = a^m b^m a^m b^m, where m is the positive integer from the lemma. Then z = a^m b^m a^m b^m = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ ℕ.

Now since |vwx| ≤ m, there are three possible places in a^m b^m a^m b^m where v and x can occur:
(1) v and x occur in a^m b^m (on the left of z).
(2) v and x occur in b^m a^m (in the center of z).
(3) v and x occur in a^m b^m (on the right of z).

Notice that v and x can consist only of repetitions of a single letter. For example, in case (1) suppose v = a^i b^j for some i > 0 and j > 0 and x = b^n for some n ≥ 0. Then, letting k = 0, we would obtain uwy = a^(m-i) b^(m-j-n) a^m b^m, which cannot be in L. The argument is similar for the other cases. So v and x must consist only of repetitions of a single letter.

We need to find a contradiction in each of the three cases. We'll do it by using k = 0. The lemma tells us that uwy ∈ L. But we obtain the following contradictions because v and x are not both Λ.
(1) uwy = a^(m-i) b^(m-j) a^m b^m, where either i > 0 or j > 0. So uwy ∉ L.
(2) uwy = a^m b^(m-i) a^(m-j) b^m, where either i > 0 or j > 0. So uwy ∉ L.
(3) uwy = a^m b^m a^(m-i) b^(m-j), where either i > 0 or j > 0. So uwy ∉ L.

Therefore L is not context-free. QED.

Remark: Be careful that the choice of z is not in a context-free sublanguage of L. For example, if we chose z = (ab)^m (ab)^m in the preceding example, we would not get any contradictions.
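The remark about the choice of z can be illustrated with a small brute-force search in Python. This is a finite check under stated assumptions, not a proof: it tries only m ≤ 4 and k ≤ 4, and the helper names `is_square`, `splits`, and `pumps` are made up for the illustration.

```python
def is_square(s):
    """Membership in {ss | s in {a, b}*}."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and set(s) <= {"a", "b"} and s[:half] == s[half:]

def splits(z, m):
    """Yield every factorization z = uvwxy with 1 <= |vx| <= |vwx| <= m."""
    n = len(z)
    for start in range(n + 1):                           # u = z[:start]
        for end in range(start, min(start + m, n) + 1):  # vwx = z[start:end]
            for i in range(start, end + 1):              # v = z[start:i]
                for j in range(i, end + 1):              # w = z[i:j], x = z[j:end]
                    if (i - start) + (end - j) >= 1:
                        yield z[:start], z[start:i], z[i:j], z[j:end], z[end:]

def pumps(parts, ks):
    """True if u v^k w x^k y is a square for every k in ks."""
    u, v, w, x, y = parts
    return all(is_square(u + v * k + w + x * k + y) for k in ks)

# Good choice: for z = a^m b^m a^m b^m every factorization already fails
# at k = 0, which is the value used in the proof.
for m in range(1, 5):
    z = ("a" * m + "b" * m) * 2
    assert not any(pumps(p, (0,)) for p in splits(z, m))

# Poor choice: z = (ab)^m (ab)^m lies in the context-free sublanguage
# {(ab)^(2n)}, and with m = 4 some factorization pumps for k = 0..4.
m = 4
z = "ab" * m * 2
assert any(pumps(p, range(5)) for p in splits(z, m))
# One such factorization is v = "ab", w = Λ, x = "ab".
assert pumps(("", "ab", "", "ab", "ab" * 6), range(5))
```

With v = x = ab, pumping always adds or removes an even number of copies of ab, so every pumped string is again of the form (ab)^(2n) and the lemma yields no contradiction.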