Poetry The Pumping Lemma Any regular language L has a magic number p And any long-enough word in L has the following property: Amongst its first p symbols is a segment you can find Whose repetition or omission leaves x amongst its kind. So if you find a language L which fails this acid test, And some long word you pump becomes distinct from all the rest, By contradiction you have shown that language L is not A regular guy, resiliant to the damage you have wrought. But if, upon the other hand, x stays within its L, Then either L is regular, or else you chose not well. For w is xyz, and y cannot be null, And y must come before p symbols have been read in full. As mathematical postscript, an addendum to the wise: The basic proof we outlined here does certainly generalize. So there is a pumping lemma for all languages context-free, Although we do not have the same for those that are r.e. -- Martin Cohn
More on showing L non-regular Algorithms and Decision Procedures for Regular Languages MA/CSSE 474 Theory of Computation
Your Questions? Previous class days' material (especially the Pumping Theorem) Reading Assignments HW 7 problems Why Problem 10 on HW6? Anything else
Using The Pumping Theorem to show that L is not Regular: We use the contrapositive of the theorem: If some long enough string in L is not "pumpable", then L is not regular. What we need to show in order to show L non-regular: ( k 1 ( a string w L (|w| k and ( x, y, z ((w = xyz ∧ |xy| k ∧ y ) → q 0 (xy q z ∉ L)))))) → L is not regular. Before our next class meeting: Be sure that you are convinced that this really is the contrapositive of the pumping theorem.
Example: { a n b n : n 0} is not Regular k is the number from the Pumping Theorem. We don't get to choose it. Choose w to be a k/2 b k/2 (“long enough”). 1 2 a a a a a … a a a a a b b b b … b b b b b b x y z Adversary chooses x, y, z with the required properties: |xy| k, y , We must show ∃ q 0 (xy q z ∉ L). Three cases to consider: ● y entirely in region 1: ● y partly in region 1, partly in 2: ● y entirely in region 2: For each case, we must find at least one value of q that takes xy q z outside the language L. The most common q values to use are q=0 and q=2.
A Complete Proof We prove that L = { a n b n : n 0} is not regular If L were regular, then there would exist some k such that any string w where |w| k must satisfy the conditions of the theorem. Let w = a k/2 b k/2 . Since |w| k, w must satisfy the conditions of the pumping theorem. So, for some x, y, and z, w = xyz, |xy| k, y , and q 0, xy q z is in L. We show that no such x, y, and z exist. There are 3 cases for where y could occur: We divide w into two regions: aaaaa ….. aaaaaa | bbbbb ….. bbbbbb 1 | 2 So y can fall in: ● (1): y = a p for some p. Since y , p must be greater than 0. Let q = 2. The resulting string is a k+p b k. But this string is not in L, since it has more a ’s than b ’s. ● (2): y = b p for some p. Since y , p must be greater than 0. Let q = 2. The resulting string is a k b k+p. But this string is not in L, since it has more b ’s than a ’s. ● (1, 2): y = a p b r for some non-zero p and r. Let q = 2. The resulting string will have interleaved a ’s and b ’s, and so is not in L. There exists one long string in L for which no pumpable x, y, z exist. So L is not regular.
What You Should Write (read these details later) We prove that L = { a n b n : n 0} is not regular Let w = a k/2 b k/2 . (If not completely obvious, as in this case, show that w is in fact in L.) aaaaa ….. aaaaaa | bbbbb ….. bbbbbb 1 | 2 There are three possibilities for y: ● (1): y = a p for some p. Since y , p must be greater than 0. Let q = 2. The resulting string is a k+p b k. But this string is not in L, since it has more a ’s than b ’s.. ● (2): y = b p for some p. Since y , p must be greater than 0. Let q = 2. The resulting string is a k b k+p. But this string is not in L, since it has more b ’s than a ’s. ● (1, 2): y = a p b r for some non-zero p and r. Let q = 2. The resulting string will have interleaved a ’s and b ’s, and so is not in L. Thus L is not regular.
Using the Pumping Theorem Effectively ● To choose w: ● Choose a w that is in the part of L that makes it not regular. ● Choose a w that is only barely in L. ● Choose a w with as homogeneous as possible an initial region of length at least k. ● To choose q: ● Try letting q be either 0 or 2. ● If that doesn’t work, analyze L to see if there is some other specific value that will work.
A better choice for w Second try. A choice of w that makes it easier: Choose w to be a k b k (We get to choose any w whose length is at least k). 1 2 a a a a a … a a a a a b b b b … b b b b b b x y z We show that there is no x, y, z with the required properties: |xy| k, y , q 0 (xy q z is in L). Since |xy| k, y must be in region 1. So y = a p for some p 1. Let q = 2, producing: a k+p b k which L, since it has more a ’s than b ’s. We only have to find one q that takes us outside of L.
Recap: Using the Pumping Theorem If L is regular, then every “long” string in L is pumpable. To show that L is not regular, we find one string that isn’t. To use the Pumping Theorem to show that a language L is not regular, we must: 1. Choose a string w where |w| k. Since we do not know what k is, we must describe w in terms of k. 2. Divide the possibilities for y into a set of equivalence classes that can be considered together. 3. For each such class of possible y values where |xy| k and y : Choose a value for q such that xy q z is not in L.
Where we are so far To show a language L to be non-regular: –Myhill-Nerode theorem Number of equivalence classes for L is infinite –Pumping Theorem –Closure properties, in conjunction with languages already shown to be non-regular.
Using the Closure Properties to prove a language non-regular The two most useful properties are closure under: Intersection Complement
Using the Closure Properties L = {w { a, b }*: # a (w) = # b (w)} If L were regular, then: L = L _______ would also be regular. But it isn’t.
L = { a i b j : i, j 0 and i j} Try to use the Pumping Theorem. What would you choose for w?
L = { a i b j : i, j 0 and i j} Try to use the Pumping Theorem Let w = a k b k+k!. Then y = a m for some nonzero m, m ≤ k. Let q = (k!/m) + 1 (i.e., pump in (k!/m) times). Note that (k!/m) must be an integer because m ≤ k. The number of a ’s in the new string is k + (k!/m)m = k + k!. So the new string is a k+k! b k+k!, which has equal numbers of a ’s and b ’s, thusis not in L.
L = { a i b j : i, j 0 and i j} An easier way: If L is regular then so is L. Is it?
L = { a i b j : i, j 0 and i j} An easier way: If L is regular then so is L. Is it? L = A n B n {out of order} If L is regular, then so is L = L a * b * = ___________
L = { a i b j c k : i, j, k ≥ 0 and (if i = 1 then j = k)} We will show that every string in L of length at least 1 is pumpable. Does that imply that L is regular? We shall see! Rewrite the final condition as: (i 1) or (j = k)
L = { a i b j c k : i, j, k ≥ 0 and (i 1 or j = k)} Every string in L of length at least 1 is pumpable: If i = 0 then: if j 0, let y be b ; otherwise, let y be c. Pump in or out. Then i will still be 0 and thus not equal to 1, so the resulting string is in L. If i = 1 then: let y be a. Pump in or out. Then i will no longer equal 1, so the resulting string is in L. If i = 2 then: let y be aa. Pump in or out. Then i cannot equal 1, so the resulting string is in L. If i > 2 then: let y be a. Pump out once or in any number of times. Then i cannot equal 1, so the resulting string is in L.
L = { a i b j c k : i, j, k ≥ 0 and (i 1 or j = k)} But the closure theorems help. If L is regular, then so is: L = L ab * c *. L = { ab k c k : k ≥ 0} Can easily use Pumping Theorem to show that L is not regular
L = { a i b j c k : i, j, k ≥ 0 and (i 1 or j = k)} An Alternative If L is regular, then so is L R : L R = { c k b j a i : i, j, k ≥ 0 and (i 1 or j = k)} Use Pumping to show that L R is not regular:
Exploiting Problem-Specific Knowledge L = {w {0, 1, 2, 3, 4,5, 6, 7}*: w is the octal representation of a nonnegative integer that is divisible by 7}
Is English Regular? Is English finite?
In the event that the Purchaser defaults in the payment of any installment of purchase price, taxes, insurance, interest, or the annual charge described elsewhere herein, or shall default in the performance of any other obligations set forth in this Contract, the Seller may: at his option: (a) Declare immediately due and payable the entire unpaid balance of purchase price, with accrued interest, taxes, and annual charge, and demand full payment thereof, and enforce conveyance of the land by termination of the contract or according to the terms hereof, in which case the Purchaser shall also be liable to the Seller for reasonable attorney's fees for services rendered by any attorney on behalf of the Seller, or (b) sell said land and premises or any part thereof at public auction, in such manner, at such time and place, upon such terms and conditions, and upon such public notice as the Seller may deem best for the interest of all concerned, consisting of advertisement in a newspaper of general circulation in the county or city in which the security property is located at least once a week for Three (3) successive weeks or for such period as applicable law may require and, in case of default of any purchaser, to re-sell with such postponement of sale or resale and upon such public notice thereof as the Seller may determine, and upon compliance by the Purchaser with the terms of sale, and upon judicial approval as may be required by law, convey said land and premises in fee simple to and at the cost of the Purchaser, who shall not be liable to see to the application of the purchase money; and from the proceeds of the sale: First to pay all proper costs and charges, including but not limited to court costs, advertising expenses, auctioneer's allowance, the expenses, if any required to correct any irregularity in the title, premium for Seller's bond, auditor's fee, attorney's fee, and all other expenses of sale occurred in and about the protection and execution of this contract, and all moneys advanced for taxes, assessments, insurance, and with interest thereon as provided herein, and all taxes due upon said land and premises at time of sale, and to retain as compensation a commission of five percent (5%) on the amount of said sale or sales; SECOND, to pay the whole amount then remaining unpaid of the principal of said contract, and interest thereon to date of payment, whether the same shall be due or not, it being understood and agreed that upon such sale before maturity of the contract the balance thereof shall be immediately due and payable; THIRD, to pay liens of record against the security property according to their priority of lien and to the extent that funds remaining in the hands of the Seller are available; and LAST, to pay the remainder of said proceeds, if any, to the vendor, his heirs, personal representatives, successors or assigns upon the delivery and surrender to the vendee of possession of the land and premises, less costs and excess of obtaining possession.
Is English Regular? ● The rat ran. ● The rat that the cat saw ran. ● The rat that the cat that the dog chased saw ran. Let: A = { cat, rat, dog, bird, bug, pony } V = { ran, saw, chased, flew, sang, frolicked }. Let L = English { The A ( that the A)* V* V}. L = { The A ( that the A) n V n V, n 0}. Let w = The cat ( that the rat ) k saw k ran.
Functions from one Language to Another Let firstchars(L) = {w : y L (y = cx, c L, x L *, and w c*)} Are the regular languages closed under firstchars? Lfirstchars(L) a*b*a*b* ca * cb *
Defining Functions from one Language to Another Let chop(L) = {w : x L (x = x 1 cx 2, x 1 L *, x 2 L *, c L, |x 1 | = |x 2 |, and w = x 1 x 2 )} Are the regular languages closed under chop? Lchop(L) a*b*a*b* a * db * Recap: Give an English description of the relationship between chop(L) and L
Defining Functions from One Language to Another Let maxstring(L) = {w: w L, z * (z wz L)} Are the regular languages closed under maxstring? Lmaxstring(L) a*b*a*b* ab * a a*b*aa*b*a
Defining Functions from One Language to Another Let mix(L) = {w: x, y, z (x L, x = yz, |y| = |z|, w = yz R )}. Are the regular languages closed under mix? Lmix(L) ( a b )* ( ab )* ( ab )* a ( ab )*
Decision Procedures A decision procedure is an algorithm whose result is a Boolean value. It must: ● Halt ● Be correct Important decision procedures exist for regular languages: ● Given an FSM M and a string s, does M accept s? ● Given a regular expression and a string w, does generate w?
Membership We can answer the membership question by running an FSM. But we must be careful if it's an NDFSM:
Membership decideFSM(M: FSM, w: string) = If ndfsmsimulate(M, w) accepts then return True else return False. decideregex( : regular expression, w: string) = From , use regextofsm to construct an FSM M such that L( ) = L(M). Return decideFSM(M, w). Recall that ndfsmsimulate takes epsilon-closure at every stage, so there is no danger of getting into an infinite loop.
Emptiness and Finiteness ● Given an FSM M, is L(M) empty? ● Given an FSM M, is L(M) = M *? ● Given an FSM M, is L(M) finite? ● Given an FSM M, is L(M) infinite? ● Given two FSMs M 1 and M 2, are they equivalent?
Emptiness Given an FSM M, is L(M) empty? The simulation approach: The graph analysis approach: 1. Let M = ndfsmtodfsm(M). 2. For each string w in * such that |w| < |K M | do: Run decideFSM(M, w). 3. If M accepts at least one such string, return False. Else return True. 1. Mark all states that are reachable via some path from the start state of M. 2. If at least one marked state is an accepting state, return False. Else return True.
Totality Given an FSM M, is L(M) = M *?
Finiteness Given an FSM M, is L(M) finite? The graph analysis approach: The simulation approach
Equivalence ● Given two FSMs M 1 and M 2, are they equivalent? In other words, is L(M 1 ) = L(M 2 )? We can describe two different algorithms for answering this question.
Equivalence ● Given two FSMs M 1 and M 2, are they equivalent? In other words, is L(M 1 ) = L(M 2 )? equalFSMs 1 (M 1 : FSM, M 2 : FSM) = 1. M 1 = buildFSMcanonicalform(M 1 ). 2. M 2 = buildFSMcanonicalform(M 2 ). 3. If M 1 and M 2 are equal, return True, else return False.
Equivalence ● Given two FSMs M 1 and M 2, are they equivalent? In other words, is L(M 1 ) = L(M 2 )? equalFSMs 2 (M 1 : FSM, M 2 : FSM) = 1. Construct M A to accept L(M 1 ) - L(M 2 ). 2. Construct M B to accept L(M 2 ) - L(M 1 ). 3. Construct M C to accept L(M A ) L(M B ). 4. Return emptyFSM(M C ). Observe that M 1 and M 2 are equivalent iff: (L(M 1 ) - L(M 2 )) (L(M 2 ) - L(M 1 )) = .
Minimality ● Given DFSM M, is M minimal?
Answering Specific Questions Given two regular expressions 1 and 2, is: (L( 1 ) L( 2 )) – { } ? 1. From 1, construct an FSM M 1 such that L( 1 ) = L(M 1 ). 2. From 2, construct an FSM M 2 such that L( 2 ) = L(M 2 ). 3. Construct M such that L(M ) = L(M 1 ) L(M 2 ). 4. Construct M such that L(M ) = { }. 5. Construct M such that L(M ) = L(M ) - L(M ). 6. If L(M ) is empty return False; else return True.
Answering Specific Questions Given two regular expressions 1 and 2, are there at least 3 strings that are generated by both of them?
Summary of Algorithms ● Operate on FSMs without altering the language that is accepted: ● Ndfsmtodfsm ● MinDFSM
Summary of Algorithms ● Compute functions of languages defined as FSMs: ● Given FSMs M 1 and M 2, construct a FSM M 3 such that L(M 3 ) = L(M 2 ) L(M 1 ). ● Given FSMs M 1 and M 2, construct a new FSM M 3 such that L(M 3 ) = L(M 2 ) L(M 1 ). ● Given FSM M, construct an FSM M* such that L(M*) = (L(M))*. ● Given a DFSM M, construct an FSM M* such that L(M*) = L(M). ● Given two FSMs M 1 and M 2, construct an FSM M 3 such that L(M 3 ) = L(M 2 ) L(M 1 ). ● Given two FSMs M 1 and M 2, construct an FSM M 3 such that L(M 3 ) = L(M 2 ) - L(M 1 ). ● Given an FSM M, construct an FSM M* such that L(M*) = (L(M)) R. ● Given an FSM M, construct an FSM M* that accepts letsub(L(M)).
Algorithms, Continued ● Converting between FSMs and regular expressions: ● Given a regular expression , construct an FSM M such that: L( ) = L(M) ● Given an FSM M, construct a regular expression such that: L( ) = L(M) ● Algorithms that implement operations on languages defined by regular expressions: any operation that can be performed on languages defined by FSMs can be implemented by converting all regular expressions to equivalent FSMs and then executing the appropriate FSM algorithm.
Algorithms, Continued ● Converting between FSMs and regular grammars: ● Given a regular grammar G, construct an FSM M such that: L(G) = L(M) ● Given an FSM M, construct a regular grammar G such that: L(G) = L(M).
Algorithms: Decision Procedures ● Decision procedures that answer questions about languages defined by FSMs: ● Given an FSM M and a string s, decide whether s is accepted by M. ● Given an FSM M, decide whether L(M) is empty. ● Given an FSM M, decide whether L(M) is finite. ● Given two FSMs, M 1 and M 2, decide whether L(M 1 ) = L(M 2 ). ● Given an FSM M, is M minimal? ● Decision procedures that answer questions about languages defined by regular expressions: Again, convert the regular expressions to FSMs and apply the FSM algorithms.
Context-Free Grammars Chapter 11
Languages and Machines
Rewrite Systems and Grammars A rewrite system (or production system or rule-based system) is: ● a list of rules, and ● an algorithm for applying them. Each rule has a left-hand side and a right hand side. Example rules: S a S b a S a S b b S ab S a
Simple-rewrite simple-rewrite(R: rewrite system, w: initial string) = 1. Set working-string to w. 2. Until told by R to halt do: Match the lhs of some rule against some part of working-string. Replace the matched part of working-string with the rhs of the rule that was matched. 3. Return working-string.
A Rewrite System Formalism A rewrite system formalism specifies: ● The form of the rules ● How simple-rewrite works: ● How to choose rules? ● When to quit?
An Example w = S a S Rules: [1] S a S b [2] a S ● What order to apply the rules? ● When to quit?
Rule Based Systems ● Expert systems ● Cognitive modeling ● Business practice modeling ● General models of computation ● Grammars
Grammars Define Languages A grammar, G, is a set of rules that are stated in terms of two alphabets: a terminal alphabet, , that contains the symbols that make up the strings in L(G), and a nonterminal alphabet, the elements of which will function as working symbols that will be used while the grammar is operating. These symbols will disappear by the time the grammar finishes its job and generates a string. A grammar has a unique start symbol, often called S.
Using a Grammar to Derive a String Simple-rewrite (G, S) will generate the strings in L(G). We will use the symbol to indicate steps in a derivation. In our example: [1] S a S b [2] a S A derivation could begin with: S a S b aa S bb …
Generating Many Strings Multiple rules may match. Given: S a S b, S b S a, and S Derivation so far: S a S b aa S bb Three choices at the next step: S a S b aa S bb aaa S bbb (using rule 1), S a S b aa S bb aab S abb (using rule 2), S a S b aa S bb aabb (using rule 3).
Generating Many Strings One rule may match in more than one way. Given: S a TT b, T b T a, and T Derivation so far: S a TT b Two choices at the next step: S a TT b ab T a T b S a TT b a T b T ab
When to Stop May stop when: 1.The working string no longer contains any nonterminal symbols (including, when it is ). In this case, we say that the working string is generated by the grammar. Example: S a S b aa S bb aabb
When to Stop May stop when: 2.There are nonterminal symbols in the working string but none of them is in a substring that is the left-hand side of any rule in the grammar. In this case, we have a blocked or non-terminated derivation but no generated string. Example: Rules:S a S b, S b T a, and S Derivations: S a S b ab T ab [blocked]
When to Stop It is possible that neither (1) nor (2) is achieved. Example: G contains only the rules S B a and B b B, with S the start symbol. Then all derivations proceed as: S B a b B a bb B a bbb B a bbbb B a ...
Context-free Grammars, Languages, and PDAs Context-free Language Context-free Grammar PDA L Accepts
Context-Free Grammars No restrictions on the form of the right hand sides. S ab D e FG ab But require single non-terminal on left hand side. S but not ASB
a n b n Balanced Parentheses language a m b n : m>= n
Context-Free Grammars A context-free grammar G is a quadruple, (V, , R, S), where: ● V is the rule alphabet, which contains nonterminals and terminals. ● (the set of terminals) is a subset of V, ● R (the set of rules) is a finite subset of (V - ) V*, ● S (the start symbol) is an element of V - . Example: ({S, a, b }, { a, b }, {S a S b, S }, S) Rules are also known as productions.