CSCI 3130: Formal languages and automata theory Andrej Bogdanov The Chinese University of Hong Kong Limitations of pushdown automata Fall 2011
Non-context-free languages
L_1 = {a^n b^n : n ≥ 0}
L_2 = {s : s has the same number of a's and b's}
L_3 = {a^n b^n c^n : n ≥ 0}
L_4 = {s s^R : s ∈ {a, b}*}
L_5 = {ss : s ∈ {a, b}*}
These are not regular. Are they context-free? L_1 ✔, L_2 ✔, L_3 ?
An attempt
Let's try to design a CFG or PDA for L_3 = {a^n b^n c^n : n ≥ 0}.
CFG: S → aBc | ε, B → ??
PDA: read a / push x, read b / pop x, ???
What would happen if...
Suppose we could construct some CFG for L_3, e.g.
S → BC
B → CS | b
C → SB | a
...
Let's do some long derivations:
S ⇒ BC ⇒ CSC ⇒ aSC ⇒ aBCC ⇒ abCC ⇒ abaC ⇒ abaSB ⇒ abaBCB ⇒ ababCB ⇒ ababaB ⇒ ababab
Repetition in long derivations
If a derivation is long enough, some variable must appear twice on the same path in the parse tree.
[Figure: parse tree of the derivation below, with yield ababab]
S ⇒ BC ⇒ CSC ⇒ aSC ⇒ aBCC ⇒ abCC ⇒ abaC ⇒ abaSB ⇒ abaBCB ⇒ ababCB ⇒ ababaB ⇒ ababab
Pumping example
Then we can “cut and paste” part of the parse tree.
[Figure: the parse tree of ababab, and the tree obtained by pasting a copy of the repeated subtree into itself, with yield ababbabb]
ababab → ababbabb ✗
Pumping example
We can repeat this many times:
ababab ✗, ababbabb ✗, ababbbabbb, ..., aba b^n a b^n
Every sufficiently large derivation will have a middle part that can be repeated indefinitely.
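The family of strings on this slide can be generated with a few lines of Python. This is a sketch under an assumption: the split u = aba, v = b, w = a, x = b, y = (empty) is inferred from the strings listed on the slide, and `pump` is a helper name introduced here.

```python
def pump(u, v, w, x, y, i):
    """Return u v^i w x^i y."""
    return u + v * i + w + x * i + y

# Assumed split of ababab, read off from the strings on the slide.
u, v, w, x, y = "aba", "b", "a", "b", ""

for i in range(1, 4):
    print(i, pump(u, v, w, x, y, i))
```

Taking i = 1 gives back ababab, and each larger i adds one b to both pumped parts.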
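Pumping in general
[Figure: parse trees in which a variable A appears twice on one path. Removing or repeating the part of the tree between the two occurrences of A turns uvwxy into uwy, uv^2 w x^2 y, uv^3 w x^3 y, ...]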
Example
L_3 = {a^n b^n c^n : n ≥ 0}
If L_3 has a context-free grammar G, then every sufficiently long string generated by G can be split as uvwxy, with v and x not both empty, so that uv^2 w x^2 y, uv^3 w x^3 y, uwy, ... are also generated by G.
What happens for a^n b^n c^n? No matter how it is split, uv^2 w x^2 y ∉ L_3!
Pumping lemma for context-free languages
Pumping lemma: For every context-free language L there exists a number n such that every string z in L of length at least n can be written as z = uvwxy, where
|vwx| ≤ n
|vx| ≥ 1
for every i ≥ 0, the string uv^i w x^i y is in L.
Pumping lemma for context-free languages
So to prove L is not context-free, it is enough to show that for every n there exists z in L with |z| ≥ n such that, for every way of writing z = uvwxy where |vwx| ≤ n and |vx| ≥ 1, the string uv^i w x^i y is not in L for some i ≥ 0.
Proving a language is not context-free
Like for regular languages, you need a strategy that always wins you this game:
1. Donald chooses n.
2. You choose z ∈ L with |z| ≥ n.
3. Donald writes z = uvwxy, where |vwx| ≤ n and |vx| ≥ 1 (at least one of v and x is not empty).
4. You choose i.
You win if uv^i w x^i y ∉ L.
Example
L_3 = {a^n b^n c^n : n ≥ 0}
1. Donald chooses n.
2. You choose z = a^n b^n c^n.
3. Donald writes z = uvwxy (|vwx| ≤ n, |vx| ≥ 1).
4. You choose i = ?
You win if uv^i w x^i y ∉ L_3.
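Example
Choose i = 2.
Case 1: v or x contains two kinds of symbols. Then uv^2 w x^2 y is not in L_3 because the pattern is wrong: the symbols are no longer in the order a...a b...b c...c.
Case 2: v and x each contain at most one kind of symbol. Then pumping adds copies of at least one but at most two of the three symbols, so uv^2 w x^2 y does not have the same number of a's, b's and c's.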
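For small n, the case analysis above can be checked by brute force. This is a sanity check rather than a proof, and the helper names (`in_L3`, `splits`, `pump`, `you_win`) are introduced here purely for illustration. The code plays Donald's move exhaustively: it tries every split z = uvwxy with |vwx| ≤ n and |vx| ≥ 1 and checks that one of the allowed exponents pushes the string out of the language.

```python
from itertools import combinations_with_replacement

def in_L3(s):
    """Membership test for L_3 = {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def splits(z, n):
    """Yield every (u, v, w, x, y) with z = uvwxy, |vwx| <= n and |vx| >= 1."""
    # Cut points a <= b <= c <= d mark the boundaries of v, w and x inside z.
    for a, b, c, d in combinations_with_replacement(range(len(z) + 1), 4):
        if d - a <= n and (b - a) + (d - c) >= 1:
            yield z[:a], z[a:b], z[b:c], z[c:d], z[d:]

def pump(u, v, w, x, y, i):
    """Return u v^i w x^i y."""
    return u + v * i + w + x * i + y

def you_win(z, n, member, exponents=(0, 2)):
    """True if, for every legal split of z, some exponent leaves the language."""
    return all(
        any(not member(pump(*parts, i)) for i in exponents)
        for parts in splits(z, n)
    )

for n in range(1, 5):
    z = "a" * n + "b" * n + "c" * n
    print(n, you_win(z, n, in_L3, exponents=(2,)))
```

Passing `exponents=(2,)` mirrors the slide, where the choice i = 2 is enough in both cases.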
More examples
L_1 = {a^n b^n : n ≥ 0} ✔
L_2 = {s : s has the same number of a's and b's} ✔
L_3 = {a^n b^n c^n : n ≥ 0} ✘
L_4 = {s s^R : s ∈ {a, b}*} ✔
L_5 = {ss : s ∈ {a, b}*}
Which is context-free?
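Example
L_5 = {ss : s ∈ {a, b}*}
1. Donald chooses n.
2. You choose z = a^n b a^n b.
3. Donald writes z = uvwxy.
4. You choose i = ?
What if Donald takes v = a and x = a on either side of the first b, with w = b? Then uv^i w x^i y = a^(n-1+i) b a^(n-1+i) b is of the form ss for every i, so this choice of z does not win.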
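To see concretely why z = a^n b a^n b is a poor choice, here is a small sketch with one split that survives pumping. The particular split (v and x are the single a's next to the first b, with w = b) is an assumption about what the slide's figure shows, and `in_L5` is a helper name introduced here.

```python
def in_L5(s):
    """Membership test for L_5 = {ss : s in {a, b}*}."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]

n = 5
z = "a" * n + "b" + "a" * n + "b"

# Assumed split: v and x are the a's on either side of the first b.
u, v, w, x, y = "a" * (n - 1), "a", "b", "a", "a" * (n - 1) + "b"
assert u + v + w + x + y == z and len(v + w + x) <= n and len(v + x) >= 1

for i in range(4):
    pumped = u + v * i + w + x * i + y
    print(i, pumped, in_L5(pumped))
```

Every pumped string stays in L_5, so with this z Donald has a winning reply and a different z is needed.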
Example
L_5 = {ss : s ∈ {a, b}*}
1. Donald chooses n.
2. You choose z = a^n b^n a^n b^n.
3. Donald writes z = uvwxy. Recall that |vwx| ≤ n.
4. You choose i = ?
Example
Three cases:
Case 1: vwx is in the first half of a^n b^n a^n b^n
Case 2: vwx is in the middle part (b^n a^n) of a^n b^n a^n b^n
Case 3: vwx is in the second half of a^n b^n a^n b^n
Since |vwx| ≤ n, vwx touches at most two adjacent blocks, so one of these cases always applies.
Example
Apply pumping with i = 0:
Case 1: uwy looks like a^j b^k a^n b^n, where j < n or k < n
Case 2: uwy looks like a^n b^j a^k b^n, where j < n or k < n
Case 3: uwy looks like a^n b^n a^j b^k, where j < n or k < n
Example
In each of the three cases, uv^0 w x^0 y = uwy is not of the form ss.
This covers all the cases, so L_5 = {ss : s ∈ {a, b}*} is not context-free.
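The conclusion can be sanity-checked for small n by enumerating every split Donald could choose for z = a^n b^n a^n b^n and pumping with i = 0. This is a brute-force sketch, not a replacement for the case analysis, and `in_L5` and `surviving_splits` are names made up for this example.

```python
def in_L5(s):
    """Membership test for L_5 = {ss : s in {a, b}*}."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]

def surviving_splits(z, n, i=0):
    """Splits z = uvwxy (|vwx| <= n, |vx| >= 1) whose pumped string is still in L_5."""
    found = []
    for start in range(len(z) + 1):
        # vwx = z[start:end], with 1 <= |vwx| <= n
        for end in range(start + 1, min(start + n, len(z)) + 1):
            for p in range(start, end + 1):      # v = z[start:p]
                for q in range(p, end + 1):      # w = z[p:q], x = z[q:end]
                    u, v, w, x, y = z[:start], z[start:p], z[p:q], z[q:end], z[end:]
                    if v + x and in_L5(u + v * i + w + x * i + y):
                        found.append((u, v, w, x, y))
    return found

for n in range(1, 6):
    z = "a" * n + "b" * n + "a" * n + "b" * n
    print(n, surviving_splits(z, n))
```

An empty list for a given n means that no split survives i = 0, which matches the three cases above. Running the same function on a^n b a^n b with i = 1 or i = 2 would return the splits that make that earlier choice of z fail.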