CS 3240 – Chapter 4
Closure Properties
Algorithms for Elementary Questions:
  ▪ Is a given word, w, in L?
  ▪ Is L empty, finite or infinite?
  ▪ Are L1 and L2 the same set?
Detecting non-regular languages
CS Properties of Regular Languages
Closure of operations
If x and y are in the same set, is x op y also?
Example: The integers are closed under addition
  ▪ They are not closed under division
Regular languages are closed under all the typical set operations!
Regular languages are closed under:
  ▪ Kleene Star (*)
  ▪ Union (+)
  ▪ Concatenation (xy)
  (By definition!)
They are also closed under:
  ▪ Complement (swap accepting and non-accepting states of a DFA ✓)
  ▪ Intersection
  ▪ Set difference
  ▪ Reversal (already proved in homework #12, 2.3 ✓)
Proof from set theory:
L1 ∩ L2 = (L1’ ∪ L2’)’
Since regular languages are closed under complement and union, they must be closed under intersection also! QED
Note how the intersection is never shaded
L1’ ∪ L2’ shades everything but where they overlap
Therefore, (L1’ ∪ L2’)’ is the overlap (intersection)
A – B: Everything that is in A but not in B
A – B = A ∩ B’
We have already shown that regular languages are closed under intersection and complement. QED
Start with a composite start state
  ▪ Consisting of the two start states
Follow all out-edges simultaneously
  ▪ As we did for NFA-to-DFA conversion
For union, any composite state containing an original final state is a final state in the result
  ▪ Because one of the machines accepts there
For intersection, any composite state containing an original final state from each original machine is a final state in the result
  ▪ Because both of the machines accept there
How would you construct the difference machine?
[Figure: the two example machines over {a, b}: Double-a (states x1, x2, x3; x1 is the start state, x3 is final) and EVEN-EVEN]
(xi, yi)  | a         | b
----------+-----------+----------
(x1, y1)  | (x2, y3)  | (x1, y2)
(x2, y3)  | (x3, y1)  | (x1, y4)
(x1, y2)  | (x2, y4)  | (x1, y1)
(x3, y1)  | (x3, y3)  | (x3, y2)
(x1, y4)  | (x2, y2)  | (x1, y3)
(x2, y4)  | (x3, y2)  | (x1, y3)
(x3, y3)  | (x3, y1)  | (x3, y4)
(x3, y2)  | (x3, y4)  | (x3, y1)
(x2, y2)  | (x3, y4)  | (x1, y1)
(x1, y3)  | (x2, y1)  | (x1, y4)
(x3, y4)  | (x3, y2)  | (x3, y3)
(x2, y1)  | (x3, y3)  | (x1, y2)

For union: assign accepting states where either original xi or yi accepts.
For intersection: assign accepting states only where both original xi and yi accept simultaneously.
  ▪ No need to compute (L1’ ∪ L2’)’!
For difference: assign accepting states where the first machine accepts and the second does not.
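The table and the accepting-state rules above can be reproduced with a short sketch. This is only an illustration under stated assumptions: the dictionary encoding and the names `DOUBLE_A`, `EVEN_EVEN`, `product_dfa` and `accepts` are hypothetical, and y1 is assumed to be both the start state and the only accepting state of EVEN-EVEN.

```python
# Transition functions as dicts keyed by (state, symbol).
# Double-a: x1 is the start state, x3 is final (assumed from the figure).
DOUBLE_A = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}
# EVEN-EVEN: y1 is assumed to be the start state and the only final state.
EVEN_EVEN = {
    ("y1", "a"): "y3", ("y1", "b"): "y2",
    ("y2", "a"): "y4", ("y2", "b"): "y1",
    ("y3", "a"): "y1", ("y3", "b"): "y4",
    ("y4", "a"): "y2", ("y4", "b"): "y3",
}

def product_dfa(d1, s1, f1, d2, s2, f2, mode, alphabet="ab"):
    """Build the reachable part of the combined machine."""
    start = (s1, s2)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = state = todo.pop()
        for c in alphabet:
            nxt = (d1[p, c], d2[q, c])   # follow both out-edges at once
            delta[state, c] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    rule = {
        "union": lambda p, q: p in f1 or q in f2,
        "intersection": lambda p, q: p in f1 and q in f2,
        "difference": lambda p, q: p in f1 and q not in f2,
    }[mode]
    finals = {(p, q) for (p, q) in seen if rule(p, q)}
    return delta, start, finals

def accepts(delta, start, finals, w):
    state = start
    for c in w:
        state = delta[state, c]
    return state in finals

d, s, f = product_dfa(DOUBLE_A, "x1", {"x3"}, EVEN_EVEN, "y1", {"y1"}, "intersection")
print(len({state for (state, _c) in d}))  # number of composite states
print(accepts(d, s, f, "aabb"))
```

Only the accepting-state rule changes between the three operations; the transitions of the combined machine are the same in every case.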
[Figure: The resulting machine]
Given a word, w, and a regular language, L, can we answer the question: Is w ∊ L?
You tell me…
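One way to answer the membership question is simply to run a DFA for L on w. The sketch below assumes the DFA is given as a dictionary keyed by (state, symbol); the function name `accepts` and the example machine (a DFA for strings containing "aa") are illustrative assumptions.

```python
def accepts(delta, start, finals, w):
    """Run the DFA on w, taking one transition per symbol."""
    state = start
    for symbol in w:
        state = delta[state, symbol]
    return state in finals

# Hypothetical example machine: strings over {a, b} containing "aa".
DOUBLE_A = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}

print(accepts(DOUBLE_A, "x1", {"x3"}, "abaab"))  # True
print(accepts(DOUBLE_A, "x1", {"x3"}, "abab"))   # False
```

The loop takes one step per symbol, so membership is decided in time proportional to the length of w.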
Is L empty? A graph theory problem:
Find a path from the start to a final state in the associated FA
Algorithm:
  “mark” the start state
  repeat:
    mark any state with an incoming edge from a previously marked state
  until an accepting state is marked or no new states were marked at all
L is empty iff no accepting state ever gets marked
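The marking algorithm can be sketched as follows. The dictionary encoding of the FA and the name `is_empty` are assumptions made for illustration.

```python
def is_empty(delta, start, finals):
    """Return True iff no accepting state is reachable from the start state."""
    marked = {start}
    changed = True
    while changed and not (marked & finals):
        changed = False
        for (src, _symbol), dst in delta.items():
            # Mark any state with an incoming edge from a marked state.
            if src in marked and dst not in marked:
                marked.add(dst)
                changed = True
    return not (marked & finals)

# Hypothetical machines: in the second one the final state "f" is unreachable.
REACHABLE = {("s", "a"): "f", ("f", "a"): "f"}
UNREACHABLE = {("s", "a"): "s", ("f", "a"): "s"}

print(is_empty(REACHABLE, "s", {"f"}))    # False
print(is_empty(UNREACHABLE, "s", {"f"}))  # True
```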
Another way to test for emptiness:
Attempt to convert the associated FA to a regular expression
  ▪ By the state bypass and elimination algorithm
If you get a regular expression, then at least one string is accepted
Suppose a minimal machine, M, for the language L has p states
If M accepts any non-empty words at all, it must accept one of length ≤ p
  ▪ Why?
So…
Systematically try all possible strings in Σ* of length 1 through p.
If none are accepted, then no non-empty strings at all are in L.
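The brute-force search above can be sketched in a few lines. The function name and the DFA encoding are assumptions. The sketch takes p to be the number of states of whatever DFA it is given, which is still a valid bound even if that DFA is not minimal. The number of candidate strings grows exponentially with p, so this is practical for small machines only.

```python
from itertools import product

def shortest_nonempty_word(delta, start, finals, alphabet):
    """Try every string of length 1..p; return the first accepted one, or None."""
    p = len({src for (src, _symbol) in delta})  # number of states
    for n in range(1, p + 1):
        for letters in product(alphabet, repeat=n):
            state = start
            for c in letters:
                state = delta[state, c]
            if state in finals:
                return "".join(letters)
    return None

# Hypothetical example machines.
DOUBLE_A = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}
ONLY_LAMBDA = {("q", "a"): "dead", ("q", "b"): "dead",
               ("dead", "a"): "dead", ("dead", "b"): "dead"}

print(shortest_nonempty_word(DOUBLE_A, "x1", {"x3"}, "ab"))   # aa
print(shortest_nonempty_word(ONLY_LAMBDA, "q", {"q"}, "ab"))  # None
```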
Is L finite or infinite?
Convert its machine to a regular expression
  ▪ It is infinite iff it has a star (applied to something other than λ)
Another way: A language is infinite if there is a cycle in an accepting path
  ▪ A (tedious) graph theory problem
Suppose L’s minimal machine, M, has p states
Any path of length p has (or is) a cycle
And any cycle must have or be a cycle of length p or less
  ▪ Because a state is revisited after at most p characters
So, infinite languages have a machine with at least one cycle of length p or less in an accepting path*
And all non-empty languages have a string of length p or less (already showed that)…
Let m denote the length of a cycle in an accepting path
  ▪ We know m ≤ p
Let k be the length of a string in L such that k ≤ p
  ▪ There has to be one if the language is infinite!
Then strings of length k + im are accepted, i ≥ 0
  ▪ By traversing the cycle i times
But k + im ≤ p + ip = (i+1)p
The lengths k, k + m, k + 2m, … grow by m ≤ p at each step
  ▪ So the first one to reach p is less than 2p
So, there must be some i such that p ≤ k + im < 2p
Conversely, any accepted string of length p or more forces a cycle, so L is infinite
Procedure: Test all strings of length p through 2p – 1; L is infinite iff at least one is accepted
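The procedure can be sketched directly. The names `is_infinite`, `DOUBLE_A` and `JUST_AB` and the DFA encoding are assumptions for illustration. Here p is the number of states of the given DFA. Like any exhaustive search over strings, it is exponential in p.

```python
from itertools import product

def is_infinite(delta, start, finals, alphabet):
    """L is infinite iff the DFA accepts some string of length p..2p-1."""
    p = len({src for (src, _symbol) in delta})  # number of states
    for n in range(p, 2 * p):
        for letters in product(alphabet, repeat=n):
            state = start
            for c in letters:
                state = delta[state, c]
            if state in finals:
                return True
    return False

# Hypothetical example machines.
DOUBLE_A = {                      # strings containing "aa": infinite
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}
JUST_AB = {                       # the single string "ab": finite
    ("q0", "a"): "q1", ("q0", "b"): "d",
    ("q1", "a"): "d",  ("q1", "b"): "q2",
    ("q2", "a"): "d",  ("q2", "b"): "d",
    ("d", "a"): "d",   ("d", "b"): "d",
}

print(is_infinite(DOUBLE_A, "x1", {"x3"}, "ab"))  # True
print(is_infinite(JUST_AB, "q0", {"q2"}, "ab"))   # False
```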
Are L1 and L2 the same? That is, are they the same set of strings?
Set-theoretic argument: Two sets are equal iff their symmetric difference is empty (denoted by A ∆ B or A ⊖ B)
A ∆ B = (A ∪ B) – (A ∩ B) = (A – B) ∪ (B – A)
But A – B = A ∩ B’, and B – A = B ∩ A’
So L1 = L2 iff (L1 ∩ L2’) ∪ (L1’ ∩ L2) = ∅
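Combined with the product construction and the emptiness test, the symmetric-difference argument gives an equivalence check: walk the two DFAs in lockstep and look for a reachable pair of states where exactly one machine accepts. The sketch below is one possible realization, and the function name and example machines are assumptions.

```python
def equivalent(d1, s1, f1, d2, s2, f2, alphabet):
    """True iff the symmetric difference of the two languages is empty."""
    start = (s1, s2)
    seen, todo = {start}, [start]
    while todo:
        p, q = todo.pop()
        if (p in f1) != (q in f2):      # exactly one machine accepts here
            return False
        for c in alphabet:
            nxt = (d1[p, c], d2[q, c])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Hypothetical machines: both accept strings with an even number of a's.
EVEN_A2 = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}
EVEN_A4 = {}
for i in range(4):                      # counts a's modulo 4
    EVEN_A4[i, "a"] = (i + 1) % 4
    EVEN_A4[i, "b"] = i

print(equivalent(EVEN_A2, "e", {"e"}, EVEN_A4, 0, {0, 2}, "ab"))  # True
print(equivalent(EVEN_A2, "e", {"e"}, EVEN_A4, 0, {0}, "ab"))     # False
```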
Not all languages are regular
We need to recognize whether languages are regular or not
  ▪ We don’t want to waste time using regular language processing techniques where they don’t apply
Consider a^n b^n
ab is regular
ab + aabb = a^n b^n, 1 ≤ n ≤ 2, is regular
Any finite language is regular (why?)
But a^n b^n, n ≥ 0 is not regular (why not?)
How do we prove it’s not regular?
Finite automata don’t have unlimited counting capability
  ▪ They only have a fixed number of states
Intuitively, we see that an automaton can’t keep track of counts for a^n b^n where n is arbitrarily large
But intuition is often faulty. We need a proof!
Any accepted string of length p (the number of states) or greater forces a cycle in an accepting path.
In other words, at least one state is visited a second time
And that “revisit” must happen within the first p characters of the string
  ▪ Because that’s when the (p+1)th state is entered
This could be any state (start, final, other)
Consider a^k b^k, where k is greater than the number of states in a supposed DFA accepting all a^n b^n, n ≥ 0
Before the first b is encountered, a state has been visited at least twice (because there are more a’s than states)
Suppose the length of the associated cycle is m
Then the string a^(k+im) b^k is also accepted, for every i ≥ 0!
  ▪ But for i ≥ 1 it has more a’s than b’s, so it is not of the form a^n b^n
This contradicts the existence of a DFA that accepts a^n b^n
[Figure: The first “revisit”]
For every infinite regular language, L, there is a number, p, such that every string, s, in L, where |s| ≥ p, can be partitioned into three concatenated substrings, s = xyz, such that:
1. |y| > 0
2. |xy| ≤ p
3. xy*z ⊆ L (that is, xy^i z ∈ L for every i ≥ 0)
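For a language that is regular, a partition satisfying all three conditions can be read off a DFA by cutting the string at the first revisited state. The sketch below illustrates this; the function name `pump_split`, the DFA encoding, and the example machine are assumptions.

```python
def pump_split(delta, start, s):
    """Split s as x, y, z where y spells out the first cycle on s's path."""
    first_seen = {start: 0}             # state -> number of symbols read so far
    state = start
    for pos, c in enumerate(s, start=1):
        state = delta[state, c]
        if state in first_seen:         # the first "revisit"
            i = first_seen[state]
            return s[:i], s[i:pos], s[pos:]
        first_seen[state] = pos
    return None                         # s is shorter than the number of states

def accepts(delta, start, finals, w):
    state = start
    for c in w:
        state = delta[state, c]
    return state in finals

# Hypothetical example: the 3-state DFA for strings containing "aa".
DOUBLE_A = {
    ("x1", "a"): "x2", ("x1", "b"): "x1",
    ("x2", "a"): "x3", ("x2", "b"): "x1",
    ("x3", "a"): "x3", ("x3", "b"): "x3",
}

x, y, z = pump_split(DOUBLE_A, "x1", "abaa")
print((x, y, z))                        # ('', 'ab', 'aa')
print(all(accepts(DOUBLE_A, "x1", {"x3"}, x + y * i + z) for i in range(6)))
```

Because the revisit happens within the first p symbols, the returned pieces satisfy |y| > 0 and |xy| ≤ p, where p is the number of states.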
You can only use the pumping lemma to show that a language is not regular
  ▪ By showing it fails the “pumping” conditions of infinite regular languages
  ▪ Note: Some non-regular languages pump!
The trick is to find a convenient string
Usually the condition |xy| ≤ p is also key
Sometimes pumping down (i = 0) is easiest
Consider the string a^p b^p
It is in the language a^n b^n
It is long enough (≥ p in length)
Now let a^p b^p = xyz
Remember |xy| ≤ p
What can you conclude about y?
You can treat proving a language non-regular as a “game”:
1. You pick a string, s, in L, where |s| ≥ p
  ▪ You may pick any such string; choose wisely!
2. Opponent picks x, y, and z
  ▪ But must obey |xy| ≤ p and |y| > 0
3. You show it can’t be “pumped”
  ▪ Because a pumped string falls “outside” the language
You must anticipate all possible partitions xyz
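The game can be played exhaustively for one fixed value of p. This is a sanity check under assumptions, not a proof: the value p = 5 and the function names are hypothetical, and a real proof must handle every p and every partition symbolically. The sketch tries every legal move of the opponent on s = a^p b^p and checks that pumping down (i = 0) or up (i = 2) always leaves the language.

```python
def in_anbn(w):
    """Membership test for the language a^n b^n, n >= 0."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def every_split_fails(s, p, in_language):
    """True iff each partition with |xy| <= p and |y| > 0 breaks for i = 0 or 2."""
    for end in range(1, p + 1):         # |xy| = end <= p
        for begin in range(end):        # |y| = end - begin > 0
            x, y, z = s[:begin], s[begin:end], s[end:]
            if all(in_language(x + y * i + z) for i in (0, 2)):
                return False            # the opponent found a split that survives
    return True

p = 5                                   # hypothetical pumping length
s = "a" * p + "b" * p
print(every_split_fails(s, p, in_anbn))  # True: y is always a block of a's
```

For contrast, running the same check on a regular language such as a* returns False, because the opponent can find a split that survives pumping.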
a^i b^j, i > j
PALINDROME: w = w^R (same backwards and forwards)
ww (equal halves)
PRIME (a^m where m is prime)
SQUARE (a^m where m is a perfect square)
Strings with an equal number of a’s and b’s
NOTPRIME
NOTPRIME is pumpable!
Let y = the whole string (a^(km))
  ▪ The number of a’s will always be a multiple of km, hence not prime
  ▪ Note: zero is not a prime number
  ▪ Note: taking y to be the whole string ignores the |xy| ≤ p condition, so this shows only that condition 3 can be met
This does not violate the pumping lemma
  ▪ The pumping lemma draws no conclusion about non-regular languages
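The NOTPRIME claim can be spot-checked numerically. With x and z empty and y the whole string, the pumped string xy^i z has i times the original number of a's. The sketch below checks only a finite range of i, so it illustrates the argument rather than proving it. The function names and the bound `max_i` are assumptions.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def pumps_with_whole_string(length, max_i=20):
    """With x = z = empty and y = a^length, xy^i z has i * length a's."""
    return all(not is_prime(i * length) for i in range(max_i + 1))

# Composite lengths: every pumped length stays non-prime.
print(all(pumps_with_whole_string(n) for n in (4, 6, 8, 9, 10, 15)))  # True
# A prime length is not in NOTPRIME to begin with (i = 1 gives a prime).
print(pumps_with_whole_string(7))                                     # False
```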