Context-Free Grammars Pushdown Store Automata Properties of the CFLs


Chapter 3: Context-Free Grammars, Pushdown Store Automata, Properties of the CFLs

Context-Free Grammars
G = 〈V, T, P, S〉, where V = variables (capital letters); T = terminals (small letters); P ⊆ V × (V ∪ T)* is the set of rules (or productions); and S ∈ V is the start symbol. Note that V and T must be disjoint.
Example: V = {S}; T = {a, b}; P = {(S, ε), (S, aSb)}. Or, S → ε | aSb.
S ⇒ aSb ⇒ aaSbb ⇒ aaεbb = a²b² ∈ L(G) = {aⁿbⁿ : n ≥ 0}
Writing A → α₁ | … | αᵢ means (A, αⱼ) is a rule in G for 1 ≤ j ≤ i.
If A → β is a rule and α, γ ∈ (V ∪ T)*, we say αAγ derives (⇒) αβγ.
L(G) = {w ∈ T* : S ⇒⁺ w}.
9/18/2018 Theory of Computation: Chapter 3
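The definition of L(G) can be tried out directly on the example grammar. The sketch below is only an illustration: the dictionary encoding of P, the name generate, and the length bound max_len are assumptions of this example, and the search is guaranteed to stop for this particular grammar rather than for every CFG.

```python
from collections import deque

# Rules of the example grammar S -> ε | aSb; the empty string stands for ε.
RULES = {"S": ["", "aSb"]}

def generate(rules, start="S", max_len=6):
    """Return all terminal strings of length <= max_len derivable from start."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Find the leftmost variable in the sentential form, if any.
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:
            found.add(form)  # no variables left: a string of L(G)
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            terminals = sum(1 for c in new if c not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

print(generate(RULES))
```

Each string found has the form aⁿbⁿ, matching the derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb shown on the slide.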

Equal number of a’s and b’s: L = {w ∈ {a, b}* : |w|_a = |w|_b}
S → ε | aB | bA (equal number)
A → a | aS | bAA (an extra a)
B → b | bS | aBB (an extra b)
L(G) ⊆ L: Show S ⇒* ω only if |ω|_{a,A} = |ω|_{b,B}, by proving that each production preserves the surplus of a’s (and A’s) over b’s (and B’s); so a terminal string ω has an equal number of each.
L ⊆ L(G): Show by induction that S ⇒* all strings with an equal number, A ⇒* all strings with an extra a, and B ⇒* all strings with an extra b. Basis: S ⇒ ε; A ⇒ a; B ⇒ b.
Alternatively, use the grammar S → aSbS | bSaS | ε. Prove it correct with a diagram graphing #a’s − #b’s for every prefix.

Simultaneous induction
Inductive cases (B is similar to A):
S: w has equal numbers of a’s and b’s.
  If w = az, then z has an extra b, so S ⇒ aB ⇒* az by IH.
  If w = bz, then z has an extra a, so S ⇒ bA ⇒* bz by IH.
A: w has an extra a.
  If w = az, then z has equal numbers, so A ⇒ aS ⇒* az.
  If w = bz, then z has 2 extra a’s, so z = xy with x and y each having one extra a, and A ⇒ bAA ⇒* bxy.
Draw a diagram illustrating the graph of the discrete function mapping each string to the #a’s − #b’s it contains.

Regular languages are context-free
Expression      Grammar
Ø               nothing (no rules)
a               S → a
r₁, r₂          By IH: rules for S₁ and S₂ are given (variable sets must be disjoint)
r₁ + r₂         S → S₁ | S₂
r₁r₂            S → S₁S₂
r₁*             S → ε | SS₁
A Regular Grammar only has productions of the form B → a; B → aC. (p. 219, Def. 6.3)

Parse Trees and Derivations
S → aB | bA | ε
A → a | aS | bAA
B → b | bS | aBB
[Figure: parse tree for bbabaaba]
Unique derivations with respect to the parse tree:
Left: S ⇒ bA ⇒ bbAA ⇒ bbaSA ⇒ bbabAA ⇒ bbabaA ⇒ bbabaaS ⇒ bbabaabA ⇒ bbabaaba
Right: S ⇒ bA ⇒ bbAA ⇒ bbAaS ⇒ bbAabA ⇒ bbAaba ⇒ bbaSaba ⇒ bbabAaba ⇒ bbabaaba

Alternative parse tree
S → aB | bA | ε
A → a | aS | bAA
B → b | bS | aBB
[Figure: a second parse tree for the same string]
Since one string has two different parse trees, the grammar is said to be ambiguous.

Balanced Parentheses
Let f(w) = |w|_( − |w|_). A string w ∈ { ( , ) }* is balanced if:
1. f(w) = 0
2. f(w′) ≥ 0 for all prefixes w′ of w
Claim: The grammar S → ε | SS | (S) generates all balanced strings.
Proof: By induction on |w|. If |w| = 0 then S ⇒ ε. Else there are two cases.
If w = xy for nontrivial balanced x and y, then S ⇒⁺ x and S ⇒⁺ y by IH, so use S ⇒ SS ⇒⁺ xy.
Otherwise f(w′) never touches 0 in the middle. Therefore, f(w′) > 0 for all non-trivial proper prefixes. So w = (z) where z is balanced, and S ⇒⁺ z by IH. Now use S ⇒ (S) ⇒⁺ (z) = w.
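The two characterizations of balance can be compared by brute force on short strings. This is a minimal sketch, not part of the slides: the names f, is_balanced and derives are chosen for this illustration, and derives simply tries the case split of the proof (w = xy or w = (z)) recursively, which is only practical for short inputs.

```python
from functools import lru_cache
from itertools import product

def f(w):
    """Surplus of open over closed parentheses."""
    return w.count("(") - w.count(")")

def is_balanced(w):
    """Non-inductive definition: f(w) = 0 and f is never negative on a prefix."""
    return f(w) == 0 and all(f(w[:i]) >= 0 for i in range(len(w) + 1))

@lru_cache(maxsize=None)
def derives(w):
    """True if S =>* w for the grammar S -> ε | SS | (S)."""
    if w == "":
        return True                      # S -> ε
    for i in range(1, len(w)):           # S -> SS with nontrivial x, y
        if derives(w[:i]) and derives(w[i:]):
            return True
    # S -> (S)
    return len(w) >= 2 and w[0] == "(" and w[-1] == ")" and derives(w[1:-1])

# Compare both definitions on every string of length at most 8.
agree = all(is_balanced(w) == derives(w)
            for n in range(9)
            for w in map("".join, product("()", repeat=n)))
print(agree)
```

Agreement on all short strings is evidence for both claims (the grammar generates all balanced strings, and only balanced strings), but it is not a substitute for the induction.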

Claim: S → ε | SS | (S) generates only balanced strings.
Proof: By induction on the length of a derivation.
S ⇒ ε: f(ε) = 0 trivially.
S ⇒ SS ⇒⁺ xy: Therefore S ⇒⁺ x and S ⇒⁺ y, and hence f(xy) = f(x) + f(y) = 0 + 0 = 0 (by IH). A prefix w′ of xy is either a prefix of x, or xy′ for a prefix y′ of y, in which case f(xy′) = f(x) + f(y′) = f(y′). So f(w′) ≥ 0 by IH in either case.
S ⇒ (S) ⇒⁺ (z): f((z)) = 1 + f(z) − 1 = 0 by IH. A non-empty proper prefix of (z) is (z′ for a prefix z′ of z. So f((z′) = 1 + f(z′) > 0 by the induction hypothesis.

Discussion
Define what matching parentheses are:
In the context of the non-inductive definition
In terms of the grammar
Prove that in a balanced string, open and closed parentheses are uniquely matched in pairs that are nested properly.
The open and closed parentheses in x(y)z match ⇔ y is balanced.

Remove Unproductive Symbols
A variable A is productive in G if A ⇒⁺ w for some w in Σ*.
To find all productive variables, work backwards from the terminal strings.
Start: T₁ = Ø
Loop: If A → α is a rule with α ∈ (T₁ ∪ Σ)*, then add A to T₁.
This finds all variables that can produce a terminal string. Remove all rules containing an unproductive variable.
Example: (eliminates D)
S → Aa | B | D
B → bC
D → Da
C → abd | AB
A → aA | bA | B
Needs a better example.
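The backwards fixed-point computation of T₁ can be sketched as follows. The dictionary encoding of the rules (one character per symbol) and the function name productive are assumptions made for this illustration.

```python
def productive(rules, terminals):
    """Return T1, the set of variables that derive some terminal string."""
    t1 = set()
    changed = True
    while changed:                       # repeat until nothing new is added
        changed = False
        for a, bodies in rules.items():
            # A joins T1 once some body consists only of terminals and T1 members.
            if a not in t1 and any(
                all(x in t1 or x in terminals for x in body) for body in bodies
            ):
                t1.add(a)
                changed = True
    return t1

# The example grammar from the slide; terminals are a, b, d.
G = {
    "S": ["Aa", "B", "D"],
    "B": ["bC"],
    "D": ["Da"],
    "C": ["abd", "AB"],
    "A": ["aA", "bA", "B"],
}
print(productive(G, set("abd")))
```

On the slide's grammar the loop adds C first (from C → abd), then B, A and S, and never adds D, which agrees with the note that the example eliminates D.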

Remove Unreachable Symbols
A variable A is reachable if S ⇒* αAβ for some α, β.
To find all reachable variables, work forwards from S.
Start: T₂ = {S}
Repeat: If A → α for A ∈ T₂, then add all variables in α to T₂.
This yields all variables that can be reached from the start variable. Remove all rules containing an unreachable variable.
Example: (eliminates C)
S → Aa | B
B → b
A → aA | bA | B
C → abd
Needs a better example.

Removing Useless Symbols
Definition: A variable A is useful if S ⇒* αAβ ⇒⁺ w ∈ Σ*, i.e. it participates in a derivation of a terminal string. (This holds even if all strings could be derived without using A, i.e. A is redundant.) N.b. A must be productive and reachable.
Theorem: Every non-empty CFL can be generated by a CFG without useless symbols.
Proof: (1) First remove unproductive variables to get G₁. (2) Then remove unreachable variables to get G₂.
Take any A ∈ T₂. By (2), there are α and β such that S ⇒₂* αAβ. But since αAβ ∈ (T₂ ∪ Σ)* and T₂ ⊆ T₁, (1) gives αAβ ⇒₁* w. But since all variables in this derivation are reachable from S, they are in T₂ also, and hence αAβ ⇒₂* w. Therefore A is useful in G₂.
Following Martin 4th edition exercise 4.53 p. 162.
Reversing the order doesn’t work. E.g. S → AB; A → a?
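The effect of the order of the two steps can be checked on the slide's tentative example S → AB; A → a. The sketch below is an illustration under assumed names (productive, reachable, restrict, remove_useless, wrong_order) and an assumed encoding in which every symbol is one character and anything outside the terminal set is a variable.

```python
def productive(rules, terminals):
    """Variables that derive a terminal string (the set T1)."""
    t1, changed = set(), True
    while changed:
        changed = False
        for a, bodies in rules.items():
            if a not in t1 and any(
                all(x in t1 or x in terminals for x in b) for b in bodies
            ):
                t1.add(a)
                changed = True
    return t1

def reachable(rules, terminals, start):
    """Variables reachable from the start variable (the set T2)."""
    t2, todo = {start}, [start]
    while todo:
        a = todo.pop()
        for body in rules.get(a, []):
            for x in body:
                if x not in terminals and x not in t2:
                    t2.add(x)
                    todo.append(x)
    return t2

def restrict(rules, keep, terminals):
    """Drop every rule that mentions a variable outside keep."""
    out = {}
    for a, bodies in rules.items():
        if a in keep:
            kept = [b for b in bodies if all(x in keep or x in terminals for x in b)]
            if kept:
                out[a] = kept
    return out

def remove_useless(rules, terminals, start="S"):
    g1 = restrict(rules, productive(rules, terminals), terminals)      # step (1)
    return restrict(g1, reachable(g1, terminals, start), terminals)    # step (2)

def wrong_order(rules, terminals, start="S"):
    g1 = restrict(rules, reachable(rules, terminals, start), terminals)
    return restrict(g1, productive(g1, terminals), terminals)

G = {"S": ["AB"], "A": ["a"]}
print(remove_useless(G, {"a"}), wrong_order(G, {"a"}))
```

In the correct order nothing survives, because B is unproductive and so S is too. In the reversed order the rule A → a is left behind even though A is no longer reachable.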

Removing Empty productions (except S → ε)
Find the set N = {A : A ⇒⁺ ε} of nullable variables: let N = ∅; add A to N if A → α for some α ∈ N*.
For each rule A → X₁ … Xᵢ, add all rules of the form A → α₁…αᵢ where αⱼ = Xⱼ if Xⱼ ∉ N, and αⱼ = (Xⱼ or ε) if Xⱼ ∈ N.
Remove all A → ε.
Example:
S → ABCA
A → CD
B → Cb
C → a | ε
D → bD | ε
Nullable variables: {C, D, A}
Add: D → b; B → b; A → C | D; S → BCA | ABA | ABC | BA | BC | AB | B
Remove: C → ε; D → ε
p. 101 Kelley; Martin p. 233 (simplified version)
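The nullable-set computation and the rule expansion can be sketched as below. The encoding (single-character symbols, the empty string for ε) and the names nullable and remove_empty are assumptions of this example; the sketch also drops S → ε if it arises, which the slide's title treats as an exception.

```python
from itertools import product

def nullable(rules):
    """Return N, the set of variables A with A =>+ ε."""
    n, changed = set(), True
    while changed:
        changed = False
        for a, bodies in rules.items():
            # all() over an empty body is True, so A -> ε makes A nullable.
            if a not in n and any(all(x in n for x in b) for b in bodies):
                n.add(a)
                changed = True
    return n

def remove_empty(rules):
    """Add the shortened rules for nullable symbols and drop every A -> ε."""
    n = nullable(rules)
    out = {}
    for a, bodies in rules.items():
        new = set()
        for b in bodies:
            # A nullable symbol may be kept or dropped; any other symbol is kept.
            choices = [(x, "") if x in n else (x,) for x in b]
            for pick in product(*choices):
                body = "".join(pick)
                if body:                 # never emit an ε rule
                    new.add(body)
        out[a] = sorted(new)
    return out

G = {"S": ["ABCA"], "A": ["CD"], "B": ["Cb"], "C": ["a", ""], "D": ["bD", ""]}
print(nullable(G))
print(remove_empty(G))
```

For the slide's grammar this reproduces the nullable set {C, D, A} and the eight right-hand sides listed for S.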

Removing Unit productions
Take the transitive closure of all unit productions to determine the unit paths A ⇒⁺ B.
If A ⇒⁺ B by a unit path and B → α with α ∉ V, then add A → α.
Remove all unit productions: A → B.
Example:
S → S + T | T [expressions]
T → T × F | F [terms]
F → (S) | e [factors]
Unit productions: S → T → F
Add: S → T × F | (S) | e; T → (S) | e
Remove: S → T; T → F
p. 103 Kelley; Martin p. 237
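The closure over unit paths can be sketched as follows. The name remove_units and the single-character encoding are assumptions of this illustration, and the character * stands in for the × of the slide's expression grammar.

```python
def remove_units(rules):
    """Replace unit productions A -> B by the non-unit bodies reachable via unit paths."""
    variables = set(rules)
    # unit[a] starts as the direct unit productions of a.
    unit = {a: {b for b in rules[a] if b in variables} for a in variables}
    changed = True
    while changed:                       # transitive closure of the unit paths
        changed = False
        for a in variables:
            for b in list(unit[a]):
                extra = unit[b] - unit[a]
                if extra:
                    unit[a] |= extra
                    changed = True
    out = {}
    for a in rules:
        bodies = [b for b in rules[a] if b not in variables]   # keep non-unit rules
        for b in sorted(unit[a] - {a}):
            bodies += [r for r in rules[b] if r not in variables and r not in bodies]
        out[a] = bodies
    return out

G = {"S": ["S+T", "T"], "T": ["T*F", "F"], "F": ["(S)", "e"]}
print(remove_units(G))
```

The result gives S the bodies S+T, T*F, (S) and e, and T the bodies T*F, (S) and e, which matches the rules added on the slide.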

More examples (remove empty & unit rules)
Original grammar:
S → aB | bA | CD
A → a | aS | bAA
B → b | bS | aBB
C → D | bC | aC | ε
D → DD
Find epsilon paths: C ⇒⁺ ε. Add: S → D; C → b | a.
After removing C → ε:
S → aB | bA | D | CD
A → a | aS | bAA
B → b | bS | aBB
C → D | b | bC | a | aC
D → DD
Unit paths: C ⇒⁺ D; S ⇒⁺ D. Add: S → DD; C → DD.

Grammar simplification outline
1. Eliminate empty productions: A → ε (augments P)
2. Eliminate unit productions: B → C (augments P)
3. (optional) Eliminate useless symbols (reduces P)
Example: D is unproductive (D ⇏⁺ w ∈ T*); once the rules containing D are removed, C is unreachable (S ⇏* …C…).
S → aB | bA | DD | CD
A → a | aS | bAA
B → b | bS | aBB
C → DD | b | bC | a | aC
D → DD

Chomsky Normal Form
Theorem: Every CFL without ε can be generated by a CNF grammar with rules of the form A → BC or A → a.
Proof: Take cases on rules A → X₁…Xᵢ in the grammar, Xⱼ ∈ V ∪ T.
Remove ε and unit productions to eliminate the cases i = 0, 1 (a rule A → a with i = 1 is already in CNF).
So for i ≥ 2: For each terminal b, replace occurrences Xⱼ = b by a new variable B and add B → b.
Now, for all rules of the form A → B₁ … Bᵢ where i > 2, add new variables D₁ … Dᵢ₋₂ and productions:
A → B₁D₁; D₁ → B₂D₂; … Dⱼ₋₁ → BⱼDⱼ; … Dᵢ₋₂ → Bᵢ₋₁Bᵢ

CNF example
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
Replace terminals by variables:
S → C₁B | C₂A
C₁ → a; C₂ → b
A → a | C₁S | C₂AA
B → b | C₂S | C₁BB
Replace variables in longer strings:
S → C₁B | C₂A
C₁ → a; C₂ → b
A → a | C₁S | C₂D
D → AA
B → b | C₂S | C₁E
E → BB
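The two replacement steps can be sketched in code. This is an illustration under stated assumptions: the input has no ε or unit rules, bodies are tuples of symbols, and the fresh names C_a, C_b, D1, D2 are invented here (the slide calls them C₁, C₂, D, E) and are assumed not to clash with existing variables.

```python
def to_cnf(rules, terminals):
    """Convert a grammar without ε and unit rules into Chomsky Normal Form."""
    out, term_var, counter = {}, {}, 0

    def add(head, body):
        out.setdefault(head, []).append(tuple(body))

    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 1:
                add(head, body)          # A -> a is already in CNF
                continue
            # Step 1: replace each terminal in a long body by its own variable.
            syms = []
            for x in body:
                if x in terminals:
                    if x not in term_var:
                        term_var[x] = "C_" + x
                        add(term_var[x], (x,))
                    syms.append(term_var[x])
                else:
                    syms.append(x)
            # Step 2: break bodies longer than 2 into a chain of binary rules.
            cur = head
            while len(syms) > 2:
                counter += 1
                fresh = "D" + str(counter)
                add(cur, (syms[0], fresh))
                cur, syms = fresh, syms[1:]
            add(cur, syms)
    return out

G = {
    "S": [("a", "B"), ("b", "A")],
    "A": [("a",), ("a", "S"), ("b", "A", "A")],
    "B": [("b",), ("b", "S"), ("a", "B", "B")],
}
cnf = to_cnf(G, {"a", "b"})
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

Up to the names of the new variables, the output is the grammar at the end of the slide: A → b A A becomes A → C_b D1 with D1 → A A, and likewise for B.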

Greibach Normal Form: (V → TV*)
Goal: Get all rules into the form A → aB₁ … Bₙ (n ≥ 0). Start in CNF.
Method: Number the variables A₁, …, Aᵣ (terminals = ∞).
For i = 1, …, r: substitute so that every rule has the form Aᵢ → Aⱼγ with j ≥ i. Then use the turnaround lemma to get j > i:
Change: A → Aα₁ | … | Aαᵤ | β₁ | … | βᵥ (i.e. A ⇒* βα*)
to: A → β₁ | … | βᵥ | β₁B | … | βᵥB (n.b. each β starts with a symbol numbered above A)
and: B → α₁ | … | αᵤ | α₁B | … | αᵤB (number B ≤ 0)
We must have Aᵣ → aγ. For i = r, …, 1, replace Aⱼ in Aᵢ → Aⱼγ (once).
Observe that no B → γ begins with another B (by induction). So replace the first symbol of γ (once).
Def. of GNF in Kelley is wrong? But see 3.9.5
Note: all leftmost derivations are like S ⇒* T*V*?

Example (refer to the CNF conversion example)
S = A₁ → A₄A₃ | A₅A₂ | A₁A₁ (A₁A₁ added for interest)
A = A₂ → a | A₄A₁ | A₅A₆
B = A₃ → b | A₅A₁ | A₄A₇
C₁ = A₄ → a
C₂ = A₅ → b
D = A₆ → A₂A₂
E = A₇ → A₃A₃
Apply the turnaround lemma to A₁ (n.b. terminals are numbered ∞):
A₁ → A₄A₃ | A₅A₂ | A₄A₃B | A₅A₂B
B → A₁ | A₁B

Continued
Substitute up: for i = 1, …, r replace Aⱼ in Aᵢ → Aⱼα, so that j ≥ i:
once:
A₆ → aA₂ | A₄A₁A₂ | A₅A₆A₂
A₇ → bA₃ | A₅A₁A₃ | A₄A₇A₃
again:
A₆ → aA₂ | aA₁A₂ | bA₆A₂
A₇ → bA₃ | bA₁A₃ | aA₇A₃
Substitute down: for i = r, …, 1 replace Aⱼ in Aᵢ → Aⱼα, making j > i:
A₃ → b | bA₁ | aA₇
A₂ → a | aA₁ | bA₆
A₁ → aA₃ | bA₂ | aA₃B | bA₂B
B → aA₃ | bA₂ | aA₃B | bA₂B | aA₃BB | bA₂BB

Pushdown Automata
A pushdown store automaton is a finite automaton with a stack. The stack always starts out with a bottom of stack symbol (Z).
Transitions: an arc from p to q labeled σ, A|γ means (q, γ) ∈ Δ(p, σ, A): read σ, pop A, and push γ (right-to-left, so the first symbol of γ ends up on top).
Stack before: Aβ. Stack after: γβ. (The bottom of the stack is at the right.)
Important: A must be on top of the stack, but we cannot sense or test for an empty stack!

Example: L = {aⁿbⁿ : n ≥ 1}
Σ = {a, b}; Γ = {C, Z}. What is the grammar?
Moves (states s, q, f; initial stack Z):
s to s: a, Z|CZ and a, C|CC
s to q: b, C|ε
q to q: b, C|ε
q to f: ε, Z|Z
[Figure: execution tables listing stack, state, and input for sample runs; see Exercise 3.7.1 in Kelley.]

Formal PDA (inherently nondeterministic)
M = 〈Q, Σ, Γ, Δ, s, Z, F〉
Γ is the set of stack symbols (capital letters)
Δ ⊆ (Q × (Σ ∪ {ε}) × Γ) × (Q × Γ*). Note: ε-transitions are allowed.
Meaning: ((p, σ, A), (q, γ)) ∈ Δ iff in state p, upon reading σ (or nothing) on the input, and A on the stack, M could move to state q, consuming σ (or nothing), popping A, and pushing γ.
Note: While the input shrinks (or stays the same), the stack may grow.
Define: (p, σx, Aβ) ⊦ (q, x, γβ)
L(M) = {w ∈ Σ* : (s, w, Z) ⊦* (f, ε, γ) for some f ∈ F}
Acceptance does not require empty stack, but all input must be read.
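The relation ⊦ and final-state acceptance can be simulated by searching over configurations (state, remaining input, stack). The sketch below is an illustration only: the table DELTA is one reading of the machine for {aⁿbⁿ : n ≥ 1} from the earlier example slide, and the names accepts and max_steps are assumptions; the step cap is a guard, since a PDA with ε-loops could otherwise produce unboundedly many configurations.

```python
from collections import deque

# (state, input symbol or "" for ε, popped symbol) -> list of (new state, pushed string)
DELTA = {
    ("s", "a", "Z"): [("s", "CZ")],
    ("s", "a", "C"): [("s", "CC")],
    ("s", "b", "C"): [("q", "")],
    ("q", "b", "C"): [("q", "")],
    ("q", "", "Z"): [("f", "Z")],
}

def accepts(delta, finals, w, start="s", bottom="Z", max_steps=10_000):
    """Final-state acceptance: some run reads all of w and ends in a final state."""
    first = (start, 0, bottom)           # (state, position in w, stack; top is index 0)
    seen, queue, steps = {first}, deque([first]), 0
    while queue and steps < max_steps:
        steps += 1
        state, i, stack = queue.popleft()
        if i == len(w) and state in finals:
            return True
        if not stack:
            continue                     # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        options = [("", i)]              # ε-move: consume nothing
        if i < len(w):
            options.append((w[i], i + 1))
        for sym, j in options:
            for new_state, push in delta.get((state, sym, top), []):
                cfg = (new_state, j, push + rest)
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

for word in ["ab", "aabb", "", "aab", "abb", "ba"]:
    print(repr(word), accepts(DELTA, {"f"}, word))
```

Breadth-first search is used because Δ is a relation rather than a function, so several configurations may follow from one.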

Example: L = {wcwᴿ : w ∈ {a, b}*}
Σ = {a, b, c}; Γ = {A, B, Z}. What is the grammar?
Underscore is an abbreviation for any stack symbol.
Moves (states s, q, f; initial stack Z):
s to s: a, _|A_ and b, _|B_
s to q: c, _|_
q to q: a, A|ε and b, B|ε
q to f: ε, Z|Z
Similar: {wwᴿ : w ∈ {a, b}*}, with grammar S → aSa | bSb | ε.
Exercise 3.7.4 in Kelley

Example: L = {w ∈ {a, b}* : |w|_a = |w|_b}
What was the grammar?
Moves (states q, f; initial stack C):
q to q: a, _|A_ and a, B|ε
q to q: b, _|B_ and b, A|ε
q to f: ε, C|ε
[Figure: execution table listing stack, state, and input for a sample run.]

Empty stack acceptance
Final state acceptance: L(M) = {w ∈ Σ* : (s, w, Z) ⊦* (f, ε, γ) for some f ∈ F}
Empty stack acceptance: {w ∈ Σ* : (s, w, Z) ⊦* (q, ε, ε) for any q ∈ Q}
Final state PDAs have equivalent empty stack PDAs, and vice versa.
By adding a new bottom-of-stack symbol Z′ and a new start state s′ with the move ε, Z′|ZZ′ from s′ to s, and either:
A final state to accept any ‘empty’ stack: ε, Z′|Z′ from each state q to a new final state f;
or a ‘final’ state to empty the stack: ε, _|_ from each final state f to a new state q, which loops on ε, _|ε.

CFG → PDA
Theorem: Every CFL is accepted by some PDA.
Proof: In a GNF leftmost derivation, S ⇒* αβ where α ∈ T*, β ∈ V*.
Idea: Let Γ = V; Σ = T; Q = {q}; Z = S. Construct a single state empty stack acceptor where β goes onto the stack and α is consumed.
Construction: Each rule A → aB₁…Bᵢ in P becomes a move a, A|B₁…Bᵢ from q to q in M.
Example: S → aBS | bAS | ε; A → bAA | a; B → aBB | b

Proof of GNF to PDA construction
Idea: Do induction on the length n of a leftmost derivation, preserving:
S ⇒* αβ ⇔ (q, α, S) ⊦* (q, ε, β) for α ∈ Σ*, β ∈ Γ*
The input consumed is the string of generated terminals, and the remaining variables are the contents of the stack.
Base Case: If n = 0, then α = ε and β = S.
Induction Hypothesis: Suppose S ⇒ⁿ αβ iff (q, α, S) ⊦ⁿ (q, ε, β).
Induction Step: Let A → aB₁…Bᵢ be the last rule applied in a derivation. Then S ⇒ⁿ αAβ ⇒ αaB₁…Bᵢβ iff (q, αa, S) ⊦ⁿ (q, a, Aβ) ⊦ (q, ε, B₁…Bᵢβ).

General method
Take any CF grammar (without restrictions). Let Γ = T ∪ V (keep Σ = T).
Construction: (still a single state empty stack acceptor, with state q and initial stack S)
ε, A|γ for each rule A → γ
σ, σ|ε for each symbol σ ∈ T
The two kinds of move are mutually exclusive since V ∩ T = ∅.
Add a pop S move in case language contains ε.

PDA → CFG
Theorem: The language accepted by a PDA can be generated by a CFG.
Proof: Let M be an empty stack acceptor. Construct G from the variables [q, A, p] ∈ Q × Γ × Q, which generate the strings that take M from state q to state p with the position occupied by A on top of the stack removed.
For start state s and bottom of stack symbol Z, use the (GNF) productions: S → [s, Z, q] for each q ∈ Q.
In addition, whenever M has a move a, A|B₁…Bᵢ from q to r, add [q, A, qᵢ] → a[r, B₁, q₁][q₁, B₂, q₂] … [qᵢ₋₁, Bᵢ, qᵢ] for each q₁, …, qᵢ ∈ Q.
If i = 0 it is a pure pop move a, A|ε, and the rule becomes [q, A, r] → a.

Conversion template
Idea: [q, A, p] ⇒* w iff M can go from q to p with the net effect w, A|ε.
That is, the net effect is to consume w and erase A from the stack: the stack diminishes by 1 and does not go below that point at any earlier time.
For a move a, A|B₁…Bᵢ from q to r: [q, A, qᵢ] → a[r, B₁, q₁][q₁, B₂, q₂] … [qᵢ₋₁, Bᵢ, qᵢ], where q₁, …, qᵢ are unknown states.
If i = 0, this becomes [q, A, r] → a.
[Figure: stack contents B₁ … Bᵢ replacing A, and the path q, r, q₁, …, qᵢ = p that consumes w.]

Example:
Moves (states q, f; initial stack S):
q to q: a, S|AS; a, A|AA; a, B|ε; b, S|BS; b, A|ε; b, B|BB
q to f: ε, S|ε
Productions:
S → [q, S, f]
[q, S, f] → a[q, A, q][q, S, f] | a[q, A, f][f, S, f] | b[q, B, q][q, S, f] | b[q, B, f][f, S, f] | ε
[f, S, f] → (nothing)
[q, A, q] → a[q, A, q][q, A, q] | a[q, A, f][f, A, q] | b
[q, B, q] → b[q, B, q][q, B, q] | b[q, B, f][f, B, q] | a
[f, _, q] → (nothing)
Simplified:
S → aAS | bBS | ε
A → aAA | b
B → bBB | a

Closure Properties
Fact: The context-free languages are closed under +, ·, and *.
Proof: See the proof that all regular languages are context-free.
Fact: If L is CF, and R is regular, then L ∩ R is context-free.
Proof: Let L be recognized by a PDA M_L, and R by an FA M_R. Run them in parallel (for A ∈ Γ, a ∈ Σ ∪ {ε}, α ∈ Γ*):
the product machine has the move a, A|α from (p, q) to (p′, q′) iff M_L has the move a, A|α from p to p′ and M_R has the move a from q to q′. I.e. q = q′ when a = ε.
Accept iff empty stack and q′ final. Does this work for two PDAs?
Also see the proof in Kelley p. 113; Thm. 8.4 Martin

Non-Closure Properties Fact: The context-free languages are not closed under intersection: {aⁿbⁿ : n ≥ 0}c* ∩ a*{bⁿcⁿ : n ≥ 0} = {aⁿbⁿcⁿ : n ≥ 0} (Which we will see later is not a context-free language.) Corollary: The context-free languages are not closed under complementation. Reason: If they were, DeMorgan’s rules would imply closure under intersection, which we already know is false.

Pumping Lemma for context-free languages
Lemma: Let L be an infinite CFL, ε ∉ L. Then there is a k ≥ 0 such that if z ∈ L and |z| > k, then z can be written as z = uvwxy with |vwx| ≤ k, |vx| ≥ 1, and uvⁱwxⁱy ∈ L for all i ≥ 0.
Proof: Let G be a CFG for L in CNF, with n variables. Let k = 2ⁿ and suppose z ∈ L, |z| > k.
Since there are at most 2ⁿ nodes at level n (root = level 0) of the parse tree, there must be a variable at level n + 1 because |z| > 2ⁿ (recall leaves are Cⱼ → σ).
Take a longest path from the root. Among the last n + 1 variables along this path, there must be a repetition, since G has only n variables. Pick the last one, and call it A.
So S ⇒* uAy ⇒* uvAxy ⇒* uvwxy = z.
A ⇒⁺ vAx means |vx| ≠ 0, for otherwise A ⇒⁺ A would contradict CNF. And height ≤ n + 1 implies |vwx| ≤ 2ⁿ. (The height of a one node tree is one.)
Furthermore, A ⇒* vⁱAxⁱ for every i means S ⇒* uAy ⇒* uvⁱAxⁱy ⇒* uvⁱwxⁱy.

Examples
Example: L = {aⁿbⁿcⁿ : n ≥ 0} is not context-free.
Proof: Let n be the constant of the pumping lemma and pick aⁿbⁿcⁿ = uvwxy ∈ L, |vwx| ≤ n, |vx| ≥ 1. Since |vwx| ≤ n, one of a, b, c does not appear in v or x, hence pumping them will exclude a symbol: its count stays at n while the count of some other symbol grows. So uv²wx²y ∉ L.
Example: L = {ww : w ∈ {a, b}*} is not CF.
Proof: Suppose L is context-free. Then L′ = L ∩ a*b*a*b* would be also. Pick aⁿbⁿaⁿbⁿ = uvwxy ∈ L′, where |vwx| ≤ n and |vx| ≥ 1. Consider all possible cases of where vwx could lie in aⁿbⁿaⁿbⁿ and see that pumping it will always result in a string uv²wx²y ∉ L′.
[Martin, 4th ed. p. 209] has a good explanation of {ww : w ∈ {a, b}*} (Example 6.4).
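The case analysis in the first example can be checked mechanically for one small constant. This is a sketch under assumptions: the names in_abc, in_ab and pumpable are invented for the illustration, it fixes k = 4 rather than proving anything for all k, and it tests only i = 0 and i = 2, which is enough to refute a split but not to confirm one for every i.

```python
def in_abc(s):
    """Membership in {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def in_ab(s):
    """Membership in {a^n b^n : n >= 0}, used as a context-free contrast."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def pumpable(z, k, member):
    """True if some split z = uvwxy with |vwx| <= k and |vx| >= 1
    keeps u v^i w x^i y in the language for i = 0 and i = 2."""
    n = len(z)
    for start in range(n + 1):
        for end in range(start + 1, min(start + k, n) + 1):   # vwx = z[start:end]
            for p in range(start, end + 1):                   # v = z[start:p]
                for q in range(p, end + 1):                   # x = z[q:end]
                    u, v, w = z[:start], z[start:p], z[p:q]
                    x, y = z[q:end], z[end:]
                    if v + x and all(
                        member(u + v * i + w + x * i + y) for i in (0, 2)
                    ):
                        return True
    return False

print(pumpable("aaaabbbbcccc", 4, in_abc))   # no split survives pumping
print(pumpable("aaaabbbb", 4, in_ab))        # v = a, x = b around the middle works
```

The first call finds no admissible split for a⁴b⁴c⁴, in line with the argument that vwx cannot touch all three letters, while the contrast language {aⁿbⁿ} does admit one.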

Emptiness / Finitude for CF grammars
Emptiness: Determine the set of productive variables, i.e. those that can generate terminal strings. The grammar generates a non-empty language if and only if S is productive, i.e. S ⇒* w.
Algorithm: See if S ∈ V∞, the fixed-point of V ← {A : A → α ∈ (V ∪ T)*}.
Finiteness: (conceptually superior to the classical proof using the PL)
Algorithm: Remove useless symbols and convert to CNF. |L(G)| < ∞ iff the digraph 〈V, E〉 is acyclic, where:
V = {variables}
E = {(A, B) : A → BC or A → CB}
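The finiteness test amounts to cycle detection in the digraph 〈V, E〉. The sketch below assumes the grammar is already in CNF with useless symbols removed, as the algorithm requires; the name is_finite and the tuple encoding of rule bodies are choices made for this illustration.

```python
def is_finite(rules):
    """rules: dict head -> list of bodies (tuples of length 1 or 2), in CNF
    and without useless symbols. True iff the variable digraph is acyclic."""
    # E = {(A, B) : A -> BC or A -> CB}
    edges = {a: {x for body in bodies if len(body) == 2 for x in body}
             for a, bodies in rules.items()}
    WHITE, GREY, BLACK = 0, 1, 2
    color = {a: WHITE for a in edges}

    def has_cycle(a):
        color[a] = GREY                  # a is on the current search path
        for b in edges.get(a, ()):
            if color.get(b, WHITE) == GREY:
                return True              # back edge: a cycle through b
            if color.get(b, WHITE) == WHITE and has_cycle(b):
                return True
        color[a] = BLACK
        return False

    return not any(color[a] == WHITE and has_cycle(a) for a in edges)

finite_g = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}      # L = {ab}
infinite_g = {"S": [("S", "A"), ("a",)], "A": [("a",)]}           # L = a+
print(is_finite(finite_g), is_finite(infinite_g))
```

A cycle means some useful variable can reappear below itself in a parse tree, which is exactly the situation the pumping lemma exploits to produce infinitely many strings.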

CYK (Cocke–Younger–Kasami) algorithm
Membership testing x ∈ L can be done in cubic time, O(|x|³), via dynamic programming (agglomeration, or a “bottom-up” algorithm).
Idea: Let xᵢⱼ be the length j substring of x starting at position i. For each i and j, determine the sets of variables V(i, j) = {A ∈ V : A ⇒* xᵢⱼ}. Then for any string x of length n, x ∈ L iff S ∈ V(1, n).
Algorithm: Start with a grammar G in CNF, x ∈ T⁺. By induction on j:
j = 1: V(i, 1) = {A ∈ V : A → xᵢ₁, the ith symbol of x}
j > 1: V(i, j) = {A ∈ V : A → BC; B ∈ V(i, k); C ∈ V(i + k, j − k); 1 ≤ k < j}
In parallel O(log n) time for unambiguous context-free languages. See H&U p. 140 for the sequential algorithm.

Diagram and example
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Is bbab generated? Box (i, j) represents V(i, j).
V(i, 1) = {A : A → xᵢ₁}
V(i, j) = {A : A → BC; where B ∈ V(i, k); C ∈ V(i + k, j − k); 1 ≤ k < j}
           j = 1    j = 2    j = 3    j = 4
i = 1 (b)  B        ∅        A        S, C
i = 2 (b)  B        A, S     S, C
i = 3 (a)  A, C     S, C
i = 4 (b)  B
Since S ∈ V(1, 4), the string bbab is generated.
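The table can be recomputed with a direct transcription of the two defining equations. This is a minimal sketch: the function name cyk and the encoding of rule bodies as strings of one or two characters are assumptions of this example, and the input is assumed to be a non-empty string over T.

```python
def cyk(rules, x):
    """Return the table V, where V[(i, j)] is the set of variables deriving the
    length-j substring of x that starts at position i (1-based)."""
    n = len(x)
    V = {}
    for i in range(1, n + 1):                      # j = 1: rules A -> x_i
        V[(i, 1)] = {a for a, bodies in rules.items() if x[i - 1] in bodies}
    for j in range(2, n + 1):                      # longer substrings
        for i in range(1, n - j + 2):
            cell = set()
            for k in range(1, j):                  # split into lengths k and j - k
                for a, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in V[(i, k)]
                                and body[1] in V[(i + k, j - k)]):
                            cell.add(a)
            V[(i, j)] = cell
    return V

G = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}
table = cyk(G, "bbab")
print(sorted(table[(1, 4)]))
print("S" in table[(1, 4)])
```

The three nested loops over j, i and k give the cubic bound in |x|; the inner loops over the rules contribute only a factor that depends on the grammar.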