CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Closure properties Limitations of regular languages Fall 2009
Operations that preserve regularity
We saw three operations that preserve regularity:
–Union: If L, L’ are regular languages, so is $L \cup L'$
–Concatenation: If L, L’ are regular languages, so is LL’
–Star: If L is a regular language, so is L*
Exercise: If L is regular, is $L^4$ also regular?
Answer: Yes, because $L^4 = ((LL)L)L$
Example
The language L of strings that end in 101 is regular: (0+1)*101
How about the language $\overline{L}$ of strings that do not end in 101?
Example
Hint: A string does not end in 101 if and only if it ends in one of the following patterns:
000, 001, 010, 011, 100, 110, 111
(or it has length 0, 1, or 2)
So $\overline{L}$ can be described by the regular expression
(0+1)*(000 + 001 + 010 + 011 + 100 + 110 + 111) + ε + (0 + 1) + (0 + 1)(0 + 1)
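The regular expression above can be checked by brute force. The following is a minimal sketch, assuming Python's `re` syntax (which writes union as `|` rather than `+`); the names `NOT_ENDING_IN_101`, `all_strings` and `agrees_with_definition` are made up for this illustration.

```python
import re
from itertools import product

# Python's `re` writes union as `|` where the slides write `+`.
# The three short cases (length 0, 1, 2) are folded into `(0|1)?(0|1)?`.
NOT_ENDING_IN_101 = re.compile(
    r"(0|1)*(000|001|010|011|100|110|111)|(0|1)?(0|1)?"
)

def all_strings(max_len, alphabet="01"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

def agrees_with_definition(max_len=10):
    """True if the regex accepts exactly the strings that do not end in 101."""
    return all(
        (NOT_ENDING_IN_101.fullmatch(x) is not None) == (not x.endswith("101"))
        for x in all_strings(max_len)
    )

print(agrees_with_definition())
```

The check only covers strings up to a fixed length, so it is evidence rather than a proof.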
Complement
The complement $\overline{L}$ of a language L is the set of all strings that are not in L
Examples (Σ = {0, 1})
–$L_1$ = all strings that end in 101
–$\overline{L_1}$ = all strings that do not end in 101 = all strings that end in 000, …, 111 (other than 101) or have length 0, 1, or 2
–$L_2$ = 1* = {ε, 1, 11, 111, …}
–$\overline{L_2}$ = all strings that contain at least one 0 = (0 + 1)*0(0 + 1)*
Closure under complement
If L is a regular language, is $\overline{L}$ also regular?
The previous examples indicate that the answer should be yes
Theorem: If L is a regular language, so is $\overline{L}$.
Proof of closure under complement
To argue this, we can use any of the equivalent definitions for regular languages: regular expression, NFA, or DFA
The DFA definition will be most convenient
–We will assume L is accepted by a DFA, and show the same for $\overline{L}$
Proof of closure under complement
Suppose L is regular. Then it is accepted by a DFA M
Now consider the DFA M’ obtained from M by swapping its accepting and rejecting states
Proof of closure under complement
Now for every input x ∈ Σ*:
M accepts x
⇔ After processing x, M ends in an accepting state
⇔ After processing x, M’ ends in a rejecting state
⇔ M’ rejects x
So the language of M’ is $\overline{L}$, and therefore $\overline{L}$ is regular
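The state-swapping construction can be tried out on a concrete machine. The sketch below assumes a particular 4-state DFA for "ends in 101" whose state names (the longest suffix read so far that is a prefix of 101) are chosen for this illustration and do not come from the slides.

```python
from itertools import product

# Hypothetical DFA M for "ends in 101". Each state is the longest suffix
# of the input read so far that is also a prefix of 101.
STATES = {"", "1", "10", "101"}
START = ""
ACCEPT = {"101"}
DELTA = {
    ("", "0"): "",       ("", "1"): "1",
    ("1", "0"): "10",    ("1", "1"): "1",
    ("10", "0"): "",     ("10", "1"): "101",
    ("101", "0"): "10",  ("101", "1"): "1",
}

# M' keeps the same states and transitions; only the accepting set is swapped.
COMPLEMENT_ACCEPT = STATES - ACCEPT

def accepts(accepting, x):
    """Run the DFA on x and report whether it ends in `accepting`."""
    state = START
    for symbol in x:
        state = DELTA[(state, symbol)]
    return state in accepting

def all_strings(max_len, alphabet="01"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

# M' accepts x exactly when M rejects x.
print(all(accepts(COMPLEMENT_ACCEPT, x) != accepts(ACCEPT, x)
          for x in all_strings(8)))
```

The swap works because a DFA ends in exactly one state on every input; the same trick applied to an NFA would not in general give the complement.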
Intersection
The intersection $L \cap L'$ is the set of strings that are in both L and L’
Examples:
–L = (0 + 1)*111, L’ = 1*: $L \cap L'$ = 1*111
–L = (0 + 1)*101, L’ = 1*: $L \cap L'$ = ∅
If L, L’ are regular, is $L \cap L'$ also regular?
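Closure under intersection
Theorem: If L and L’ are regular languages, so is $L \cap L'$.
Proof:
L regular, L’ regular
⇒ $\overline{L}$ regular, $\overline{L'}$ regular
⇒ $\overline{L} \cup \overline{L'}$ regular
⇒ $\overline{\overline{L} \cup \overline{L'}}$ regular
But $L \cap L' = \overline{\overline{L} \cup \overline{L'}}$ (De Morgan's law), so $L \cap L'$ is regular.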
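The chain of implications can be mirrored on DFAs. The sketch below makes one assumption beyond the slide: union is implemented with the product construction (running both DFAs in parallel), a standard technique the proof above does not spell out. The tuple layout `(states, start, accept, delta)` and the two example machines are made up for this illustration.

```python
from itertools import product

def run(dfa, x):
    """Return True if the DFA (states, start, accept, delta) accepts x."""
    states, start, accept, delta = dfa
    q = start
    for symbol in x:
        q = delta[(q, symbol)]
    return q in accept

def complement(dfa):
    """Swap accepting and rejecting states."""
    states, start, accept, delta = dfa
    return (states, start, states - accept, delta)

def union(d1, d2, alphabet="01"):
    """Product DFA that accepts when at least one component accepts."""
    s1, q1, f1, t1 = d1
    s2, q2, f2, t2 = d2
    states = set(product(s1, s2))
    accept = {(p, q) for (p, q) in states if p in f1 or q in f2}
    delta = {((p, q), c): (t1[(p, c)], t2[(q, c)])
             for (p, q) in states for c in alphabet}
    return (states, (q1, q2), accept, delta)

def intersection(d1, d2):
    """De Morgan: L ∩ L' is the complement of (complement L ∪ complement L')."""
    return complement(union(complement(d1), complement(d2)))

# L = (0+1)*111: the state counts trailing 1s, capped at 3.
ENDS_IN_111 = (
    {0, 1, 2, 3}, 0, {3},
    {**{(k, "1"): min(k + 1, 3) for k in range(4)},
     **{(k, "0"): 0 for k in range(4)}},
)
# L' = 1*: a single 0 sends the machine to a dead state.
ALL_ONES = (
    {"ok", "dead"}, "ok", {"ok"},
    {("ok", "1"): "ok", ("ok", "0"): "dead",
     ("dead", "1"): "dead", ("dead", "0"): "dead"},
)

BOTH = intersection(ENDS_IN_111, ALL_ONES)
print(run(BOTH, "1111"), run(BOTH, "0111"), run(BOTH, "11"))
```

On these two machines the result should behave like 1*111, matching the first intersection example.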
Reversal
The reversal $w^R$ of a string w is w written backwards
Example: w = cave, $w^R$ = evac
The reversal $L^R$ of a language L is the language obtained by reversing all its strings
Example: L = {cat, dog}, $L^R$ = {tac, god}
Reversal of regular languages
L = all strings that end in 101 is regular: (0+1)*101
How about $L^R$?
This is the language of all strings beginning in 101
Yes, because it is represented by 101(0+1)*
Closure under reversal
Theorem: If L is a regular language, so is $L^R$.
Proof
–We will use the representation of regular languages by regular expressions (rather than by NFAs or DFAs)
Proof of closure under reversal
If L is regular, then there is a regular expression E that describes it
We will give a systematic way of reversing E
Recall that a regular expression can be of the following types:
–Special expressions ∅ and ε
–Alphabet symbols a, b, …
–The union, concatenation, or star of simpler expressions
In each of these cases we show how to do a reversal
Proof of closure under reversal
regular expression E        reversal $E^R$
∅                           ∅
ε                           ε
a (alphabet symbol)         a
$E_1 + E_2$                 $E_1^R + E_2^R$
$E_1 E_2$                   $E_2^R E_1^R$
$E_1^*$                     $(E_1^R)^*$
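The case analysis translates directly into a recursive function. This is a minimal sketch under an assumed representation: regular expressions are nested tuples tagged `"empty"`, `"eps"`, `"sym"`, `"union"`, `"concat"` and `"star"`, and `language` enumerates the strings an expression describes up to a length bound. None of these names come from the slides.

```python
def reverse(e):
    """Return an expression for the reversal of the language of e."""
    kind = e[0]
    if kind in ("empty", "eps", "sym"):
        return e                                    # ∅, ε and a are unchanged
    if kind == "union":
        return ("union", reverse(e[1]), reverse(e[2]))
    if kind == "concat":
        return ("concat", reverse(e[2]), reverse(e[1]))   # order is swapped
    if kind == "star":
        return ("star", reverse(e[1]))
    raise ValueError(f"unknown expression type: {kind}")

def language(e, max_len):
    """Set of all strings of length <= max_len described by e."""
    kind = e[0]
    if kind == "empty":
        return set()
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {e[1]} if max_len >= 1 else set()
    if kind == "union":
        return language(e[1], max_len) | language(e[2], max_len)
    if kind == "concat":
        left, right = language(e[1], max_len), language(e[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":
        base = language(e[1], max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:                             # grow until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression type: {kind}")

def concat_all(*parts):
    """Left-nested concatenation of several expressions."""
    expr = parts[0]
    for part in parts[1:]:
        expr = ("concat", expr, part)
    return expr

ZERO, ONE = ("sym", "0"), ("sym", "1")
ANY = ("star", ("union", ZERO, ONE))
ENDS_IN_101 = concat_all(ANY, ONE, ZERO, ONE)       # (0+1)*101

reversed_strings = {x[::-1] for x in language(ENDS_IN_101, 6)}
print(language(reverse(ENDS_IN_101), 6) == reversed_strings)
```

The only rule that changes the shape of the expression is concatenation, where the two reversed parts trade places.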
A question
$L^{DUP} = \{ww : w \in L\}$
Ex. L = {cat, dog}, $L^{DUP}$ = {catcat, dogdog}
If L is regular, is $L^{DUP}$ also regular?
A question
Let’s try with regular expression: is $L^{DUP} = LL$?
–No: for L = {a, b}, $L^{DUP}$ = {aa, bb} but LL = {aa, ab, ba, bb}
Let’s try with NFA:
[Figure: NFA for L, with states $q_0$ and $q_1$ marked]
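For a finite language the difference between duplication and concatenation can be computed directly. A small sketch, with the helper names `dup` and `concat` invented for this illustration:

```python
def dup(language):
    """L^DUP = {ww : w in L} for a finite language given as a set of strings."""
    return {w + w for w in language}

def concat(first, second):
    """Concatenation {xy : x in first, y in second} of two finite languages."""
    return {x + y for x in first for y in second}

L = {"a", "b"}
print(sorted(dup(L)))        # ['aa', 'bb']
print(sorted(concat(L, L)))  # ['aa', 'ab', 'ba', 'bb']
```

Concatenation picks the two halves independently, while duplication forces them to be the same string, which is what a finite automaton would have to remember.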
An example
L = 0*1 is regular
L = {1, 01, 001, 0001, ...}
$L^{DUP}$ = {11, 0101, 001001, 00010001, ...} = $\{0^n 1 0^n 1 : n \ge 0\}$
Let’s try to design an NFA for $L^{DUP}$
An example
$L^{DUP}$ = {11, 0101, 001001, 00010001, ...} = $\{0^n 1 0^n 1 : n \ge 0\}$
Non-regular languages
A non-regular language
Another example: $L = \{0^n 1^n : n \ge 0\}$ is not regular.
We reason by contradiction:
–Suppose we have managed to construct a DFA M for L
–We argue something must be wrong with this DFA
–In particular, M must accept some strings outside L
A non-regular language
Let M be the imaginary DFA for L, with n states
What happens when we run M on input $x = 0^{n+1} 1^{n+1}$?
–M better accept, because x ∈ L
A non-regular language
What happens when we run M on input $x = 0^{n+1} 1^{n+1}$?
–M better accept, because x ∈ L
–But since M has n states, it must revisit at least one of its states, call it r, while reading $0^{n+1}$
Pigeonhole principle
Suppose you are tossing n + 1 balls into n bins. Then two balls end up in the same bin.
Here, balls are 0s, bins are states:
If you have a DFA with n states and it reads n + 1 consecutive 0s, then it must end up in the same state twice.
A non-regular language
What happens when we run M on input $x = 0^{n+1} 1^{n+1}$?
–M better accept, because x ∈ L
–But since M has n states, it must revisit at least one of its states, call it r, while reading $0^{n+1}$
–But then the DFA must contain a loop of 0s through r
A non-regular language
The DFA will then also accept strings that go around the loop multiple times
But such strings have more 0s than 1s, so they are not in L!
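The loop argument applies to any DFA with n states, which can be illustrated by running it on arbitrary machines. In this sketch the DFAs are randomly generated stand-ins for the imaginary M, and the helper names (`find_zero_loop`, `random_dfa`, `same_fate`) are hypothetical.

```python
import random

def state_after(delta, start, x):
    """State the DFA is in after reading x from `start`."""
    q = start
    for symbol in x:
        q = delta[(q, symbol)]
    return q

def find_zero_loop(delta, start, n):
    """Return (j, k) with k >= 1 and j + k <= n such that reading 0^j and
    0^(j+k) from `start` ends in the same state (pigeonhole on n states)."""
    first_seen = {}
    q = start
    for zeros_read in range(n + 1):      # n + 1 visits, but only n states
        if q in first_seen:
            return first_seen[q], zeros_read - first_seen[q]
        first_seen[q] = zeros_read
        q = delta[(q, "0")]
    raise AssertionError("unreachable: n + 1 visits to n states must repeat")

def random_dfa(n, seed):
    """Transition table of an arbitrary DFA over {0, 1} with states 0..n-1."""
    rng = random.Random(seed)
    return {(q, c): rng.randrange(n) for q in range(n) for c in "01"}

def same_fate(n, seed):
    """True if the DFA cannot tell 0^(n+1) 1^(n+1) from a pumped version."""
    delta = random_dfa(n, seed)
    j, k = find_zero_loop(delta, 0, n)
    x = "0" * (n + 1) + "1" * (n + 1)             # in L
    pumped = "0" * (n + 1 + k) + "1" * (n + 1)    # not in L
    return state_after(delta, 0, x) == state_after(delta, 0, pumped)

print(all(same_fate(n, seed) for n in range(1, 8) for seed in range(20)))
```

Whatever the accepting states are, both strings end in the same state, so the machine accepts both or rejects both and cannot recognize L.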
General method for showing non-regularity
Every regular language L has a property:
For every sufficiently long input z in L, there is a “middle part” in z that, even if repeated several times, keeps the input inside L
$z = a_1 \dots a_k \, a_{k+1} \dots a_{n-1} a_n \, a_{n+1} \dots a_m$
Pumping lemma for regular languages
Pumping lemma: For every regular language L there exists a number n such that for every string z in L of length at least n, we can write z = uvw where
–|uv| ≤ n
–|v| ≥ 1
–For every i ≥ 0, the string $uv^i w$ is in L.
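For a language given by a DFA, the decomposition z = uvw can be read off from the first state the run revisits. The sketch below assumes a hypothetical 4-state DFA for "ends in 101" and takes n to be its number of states; `pumping_split` is an illustrative name, not part of the lecture.

```python
from itertools import product

# Hypothetical DFA for "ends in 101"; each state is the longest suffix of
# the input read so far that is also a prefix of 101.
START, ACCEPT, N_STATES = "", {"101"}, 4
DELTA = {
    ("", "0"): "",       ("", "1"): "1",
    ("1", "0"): "10",    ("1", "1"): "1",
    ("10", "0"): "",     ("10", "1"): "101",
    ("101", "0"): "10",  ("101", "1"): "1",
}

def accepts(x):
    """Run the DFA on x."""
    q = START
    for symbol in x:
        q = DELTA[(q, symbol)]
    return q in ACCEPT

def pumping_split(z):
    """Split z into (u, v, w) with |uv| <= N_STATES and |v| >= 1, where v
    labels a loop in the run. Guaranteed to succeed when |z| >= N_STATES."""
    first_seen = {START: 0}
    q = START
    for pos, symbol in enumerate(z, start=1):
        q = DELTA[(q, symbol)]
        if q in first_seen:                 # state revisited: v is the loop
            a = first_seen[q]
            return z[:a], z[a:pos], z[pos:]
        first_seen[q] = pos
    raise ValueError("no state repeats; z is shorter than the number of states")

u, v, w = pumping_split("1101")
print((u, v, w), [accepts(u + v * i + w) for i in range(4)])
```

Because v starts and ends in the same state, repeating or deleting it does not change where the run finishes, which is the third condition of the lemma.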
Proving non-regularity
If L is regular, then:
There exists n such that for every z in L of length at least n, we can write z = uvw where |uv| ≤ n, |v| ≥ 1 and for every i ≥ 0, the string $uv^i w$ is in L.
So to prove L is not regular, it is enough to show:
For every n there exists z in L of length at least n, such that for every way of writing z = uvw where |uv| ≤ n and |v| ≥ 1, the string $uv^i w$ is not in L for some i ≥ 0.
Proving non-regularity
For every n there exists z in L of length at least n, such that for every way of writing z = uvw where |uv| ≤ n and |v| ≥ 1, the string $uv^i w$ is not in L for some i ≥ 0.
This is a game between you and an imagined adversary:
–Round 1: the adversary chooses n; you choose z ∈ L with |z| ≥ n
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i
You win if $uv^i w \notin L$
Arguing non-regularity
You need to give a strategy that, regardless of what the adversary does, always wins you the game
–Round 1: the adversary chooses n; you choose z ∈ L
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i
You win if $uv^i w \notin L$
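Example
$L = \{0^n 1^n : n \ge 0\}$
–Round 1: the adversary chooses n; you choose $z = 0^{n+1} 1^{n+1}$
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i = 2
Since |uv| ≤ n, both u and v consist of 0s: $u = 0^j$, $v = 0^k$ with k ≥ 1, and $w = 0^l 1^{n+1}$, where j + k + l = n + 1
$uv^2 w = 0^{j+2k+l} 1^{n+1} = 0^{n+1+k} 1^{n+1} \notin L$, so you win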
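Example
$L^{DUP} = \{0^n 1 0^n 1 : n \ge 0\}$
–Round 1: the adversary chooses n; you choose $z = 0^{n+1} 1 0^{n+1} 1$
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i = 2
Since |uv| ≤ n, both u and v consist of 0s: $u = 0^j$, $v = 0^k$ with k ≥ 1, and $w = 0^l 1 0^{n+1} 1$, where j + k + l = n + 1
$uv^2 w = 0^{j+2k+l} 1 0^{n+1} 1 = 0^{n+1+k} 1 0^{n+1} 1 \notin L^{DUP}$, so you win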
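For small n the game can be played exhaustively: enumerate every split the adversary is allowed to make and check that the fixed choice i = 2 always leaves the language. This is a sketch with invented helper names (`splits`, `in_L`, `in_L_dup`, `you_win`); it checks finitely many values of n and does not replace the argument above.

```python
def splits(z, n):
    """Every way to write z = u v w with |uv| <= n and |v| >= 1."""
    for uv_len in range(1, min(n, len(z)) + 1):
        for u_len in range(uv_len):
            yield z[:u_len], z[u_len:uv_len], z[uv_len:]

def in_L(x):
    """Membership in L = {0^n 1^n : n >= 0}."""
    half = len(x) // 2
    return x == "0" * half + "1" * half

def in_L_dup(x):
    """Membership in L^DUP = {0^n 1 0^n 1 : n >= 0}."""
    half = len(x) // 2
    block = "0" * (half - 1) + "1"
    return x == block + block

def you_win(member, z, n, i):
    """True if z is in the language and every legal split of z falls out of
    the language when v is repeated i times."""
    return member(z) and all(
        not member(u + v * i + w) for u, v, w in splits(z, n)
    )

for n in range(1, 10):
    z_plain = "0" * (n + 1) + "1" * (n + 1)
    z_dup = ("0" * (n + 1) + "1") * 2
    assert you_win(in_L, z_plain, n, i=2)
    assert you_win(in_L_dup, z_dup, n, i=2)
print("your strategy wins for n = 1..9")
```

Passing a different membership test and a different z to `you_win` lets the same harness try out strategies for other languages.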
Which of these are regular?
Σ = {1}:
$L_1 = \{1^n : n \text{ is divisible by } 3\}$
$L_2 = \{1^n : n \text{ is prime}\}$
Σ = {0, 1}:
$L_3$ = {x: x has same number of 0s and 1s}
$L_4$ = {x: x has same number of patterns 01 and 10}
$L_5$ = {x: x has more 0s than 1s}
$L_6$ = {x: x has different number of 0s and 1s}
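Example
$L_3$ = {x: x has same number of 0s and 1s}
–Round 1: the adversary chooses n; you choose $z = 0^{n+1} 1^{n+1}$
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i = 2
Since |uv| ≤ n: $u = 0^j$, $v = 0^k$ with k ≥ 1, and $w = 0^l 1^{n+1}$, where j + k + l = n + 1
$uv^2 w = 0^{j+2k+l} 1^{n+1} = 0^{n+1+k} 1^{n+1} \notin L_3$, so you win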
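Example
$L_4$ = {x: x has same number of 01s and 10s}
–Round 1: the adversary chooses n; you choose $z = (01)^{n+1} (10)^{n+1}$
–Round 2: the adversary writes z = uvw; you choose i = ?
[Figure: a split z = uvw with 5 01 patterns and 5 10 patterns; the string with v repeated has 6 01 patterns and 6 10 patterns; the string with v removed has 4 01 patterns and 4 10 patterns]
$L_4$ is regular!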
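Example
$L_4$ = {x: x has same number of 01s and 10s}
[Figure: DFA for $L_4$ with start state $q_0$ and states $r_0$, $r_1$, $s_0$, $s_1$; transitions labelled 0 and 1; annotations “more 10s” and “more 01s”]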
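One way to see why $L_4$ is regular: the patterns 01 and 10 must alternate, so their counts are equal exactly when the string is empty or its first and last symbols agree. The sketch below encodes that idea as a 5-state DFA. The state names are borrowed from the figure, but the transitions and accepting states are a reconstruction made for this illustration and may differ from the diagram on the slide.

```python
from itertools import product

def count(x, pattern):
    """Number of positions where the 2-symbol `pattern` occurs in x."""
    return sum(1 for i in range(len(x) - 1) if x[i:i + 2] == pattern)

# Assumed reading of the states:
#   r0 / r1: the input started with 0 and the last symbol is 0 / 1
#   s1 / s0: the input started with 1 and the last symbol is 1 / 0
START = "q0"
ACCEPT = {"q0", "r0", "s1"}       # empty input, or first symbol == last symbol
DELTA = {
    ("q0", "0"): "r0", ("q0", "1"): "s1",
    ("r0", "0"): "r0", ("r0", "1"): "r1",
    ("r1", "0"): "r0", ("r1", "1"): "r1",
    ("s1", "1"): "s1", ("s1", "0"): "s0",
    ("s0", "1"): "s1", ("s0", "0"): "s0",
}

def accepts(x):
    """Run the DFA on x."""
    q = START
    for symbol in x:
        q = DELTA[(q, symbol)]
    return q in ACCEPT

# Compare the DFA with the definition on all strings of length <= 10.
print(all(
    accepts("".join(t)) == (count("".join(t), "01") == count("".join(t), "10"))
    for n in range(11) for t in product("01", repeat=n)
))
```

In the two rejecting states the counts differ by exactly one, so a finite amount of memory is enough even though the counts themselves are unbounded.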
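Example
$L_5$ = {x: x has more 0s than 1s}
–Round 1: the adversary chooses n; you choose $z = 0^{n+1} 1^n$
–Round 2: the adversary writes z = uvw (|uv| ≤ n, |v| ≥ 1); you choose i = 0
Since |uv| ≤ n: $u = 0^j$, $v = 0^k$ with k ≥ 1, and $w = 0^l 1^n$, where j + k + l = n + 1
$uv^0 w = uw = 0^{j+l} 1^n \notin L_5$, because $j + l = n + 1 - k \le n$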
Example
$L_6$ = {x: x has different number of 0s than 1s}
–Round 1: the adversary chooses n; you choose z = ?
There is an easier way!
$L_3$ = {x: x has same number of 0s and 1s} = $\overline{L_6}$
If $L_6$ is regular, then $L_3 = \overline{L_6}$ is also regular
But $L_3$ is not regular, so $L_6$ cannot be regular
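Example
$L_2 = \{1^p : p \text{ is prime}\}$
–Round 1: the adversary chooses n; you choose $z = 1^p$, where p ≥ n + 2 is prime
–Round 2: the adversary writes $z = uvw = 1^a 1^b 1^c$ (that is, $u = 1^a$, $v = 1^b$, $w = 1^c$); you choose i = a + c
$uv^i w = 1^a 1^{ib} 1^c = 1^{a+ib+c} = 1^{a+(a+c)b+c} = 1^{(a+c)(b+1)}$
Both factors are at least 2 (b ≥ 1, and $a + c = p - b \ge p - n \ge 2$), so the length is composite and $uv^i w \notin L_2$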
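The arithmetic in this strategy can be checked numerically. The sketch assumes the prime is chosen with p ≥ n + 2, so that the factor a + c is at least 2 for every legal split; the function names are illustrative only.

```python
import math

def is_prime(m):
    """Trial division primality test."""
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))

def first_prime_at_least(m):
    """Smallest prime p with p >= m."""
    while not is_prime(m):
        m += 1
    return m

def you_win(n):
    """Play z = 1^p with p >= n + 2 prime against every legal split
    u = 1^a, v = 1^b, w = 1^c (a + b <= n, b >= 1), answering i = a + c."""
    p = first_prime_at_least(n + 2)      # assumption: p >= n + 2, not just p > n
    for uv_len in range(1, n + 1):
        for a in range(uv_len):
            b, c = uv_len - a, p - uv_len
            i = a + c
            pumped_length = a + i * b + c
            assert pumped_length == (a + c) * (b + 1)
            if is_prime(pumped_length):
                return False
    return True

print(all(you_win(n) for n in range(1, 40)))
```

With p = n + 1 the split a = 0, b = n, c = 1 would give i = 1 and leave z unchanged, which is why the sketch asks for p ≥ n + 2.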