Properties of Regular Languages

Slides:



Advertisements
Similar presentations
Properties of Regular Sets
Advertisements

Lecture 9,10 Theory of AUTOMATA
Nonregular languages Sipser 1.4 (pages 77-82). CS 311 Mount Holyoke College 2 Nonregular languages? We now know: –Regular languages may be specified either.
Nonregular languages Sipser 1.4 (pages 77-82). CS 311 Fall Nonregular languages? We now know: –Regular languages may be specified either by regular.
1 Regular Expressions Highlights: –A regular expression is used to specify a language, and it does so precisely. –Regular expressions are very intuitive.
1 More Properties of Regular Languages. 2 We have proven Regular languages are closed under: Union Concatenation Star operation Reverse.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Limitations.
Courtesy Costas Busch - RPI1 Non-regular languages.
FSA Lecture 1 Finite State Machines. Creating a Automaton  Given a language L over an alphabet , design a deterministic finite automaton (DFA) M such.
1 Background Information for the Pumping Lemma for Context-Free Languages Definition: Let G = (V, T, P, S) be a CFL. If every production in P is of the.
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
Regular Languages A language is regular over  if it can be built from ;, {  }, and { a } for every a 2 , using operators union ( [ ), concatenation.
Formal Language Finite set of alphabets Σ: e.g., {0, 1}, {a, b, c}, { ‘{‘, ‘}’ } Language L is a subset of strings on Σ, e.g., {00, 110, 01} a finite language,
1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 5 School of Innovation, Design and Engineering Mälardalen University 2012.
Introduction to CS Theory Lecture 3 – Regular Languages Piotr Faliszewski
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Closure.
TK PrasadPumping Lemma1 Nonregularity Proofs. TK PrasadPumping Lemma2 Grand Unification Regular Languages: Grand Unification (Parallel Simulation) (Rabin.
Regular Expressions and Languages A regular expression is a notation to represent languages, i.e. a set of strings, where the set is either finite or contains.
Chapter 4 Pumping Lemma Properties of Regular Languages Decidable questions on Regular Languages.
CS 3240 – Chapter 4.  Closure Properties  Algorithms for Elementary Questions:  Is a given word, w, in L?  Is L empty, finite or infinite?  Are L.
Class Discussion Can you draw a DFA that accepts the language {a k b k | k = 0,1,2,…} over the alphabet  ={a,b}?
Properties of Regular Languages
Chapter 6 Properties of Regular Languages. 2 Regular Sets and Languages  Claim(1). The family of languages accepted by FSAs consists of precisely the.
CS 203: Introduction to Formal Languages and Automata
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen.
General Discussion of “Properties” The Pumping Lemma Membership, Emptiness, Etc.
Theory of Languages and Automata By: Mojtaba Khezrian.
Nonregular Languages Section 2.4 Wed, Oct 5, 2005.
CSE 105 theory of computation
Foundations of Computing Science
Regular Languages, Regular Operations, Closure
Standard Representations of Regular Languages
CSE322 PUMPING LEMMA FOR REGULAR SETS AND ITS APPLICATIONS
CSE 105 theory of computation
PROPERTIES OF REGULAR LANGUAGES
Properties of Regular Languages
FORMAL LANGUAGES AND AUTOMATA THEORY
CSE 3813 Introduction to Formal Languages and Automata
PDAs Accept Context-Free Languages
Nonregular Languages Section 2.4 Wed, Oct 5, 2005.
Pushdown Automata Reading: Chapter 6.
Context-Free Languages
Hierarchy of languages
Properties of Regular Languages
Properties of Regular Languages
Infiniteness Test The Pumping Lemma Nonregular Languages
CS 154, Lecture 3: DFANFA, Regular Expressions.
4. Properties of Regular Languages
Definition: Let G = (V, T, P, S) be a CFL
Introduction to Finite Automata
Deterministic PDAs - DPDAs
Finite Automata Reading: Chapter 2.
Elementary Questions about Regular Languages
FORMAL LANGUAGES, AUTOMATA, AND COMPUTABILITY
Properties of Context-Free Languages
Closure Properties of Regular Languages
Chapter 4 Properties of Regular Languages
CS21 Decidability and Tractability
Non-Regular Languages
Recap lecture 26 Example of nonregular language, pumping lemma version I, proof, examples,
Instructor: Aaron Roth
Instructor: Aaron Roth
CSE 105 theory of computation
Recap lecture 18 NFA corresponding to union of FAs,example, NFA corresponding to concatenation of FAs,examples, NFA corresponding to closure of an FA,examples.
Instructor: Aaron Roth
Recap lecture 25 Intersection of two regular languages is regular, examples, non regular languages, example.
CSC312 Automata Theory Kleene’s Theorem Lecture # 12
CSE 105 theory of computation
Presentation transcript:

Properties of Regular Languages Reading: Chapter 4

Pumping Lemma for Regular Languages Lemma: (the pumping lemma) Let M be a DFA with |Q| = n states, and let x be a string in L(M), where |x| >= n. Then x = uvw, where u,v, and w are all in Σ* and: 1<= |uv| <= n |v| >= 1 uviw is in L(M), for all i >= 0 Note this statement of the pumping lemma is slightly different than the books’.

Pumping Lemma for Regular Languages Proof: Let M be a DFA where |Q| = n, and let x be a string in L(M) where |x|>=n. Furthermore, let x = a1a2 … am where δ(q0, a1a2 … ap) = qjp. a1 a2 a3 … am qj0 qj1 qj2 qj3… qjm m>=n and qj0 is q0 Consider the first n symbols, and first n+1 states on the above path: a1 a2 a3 … an qj0 qj1 qj2 qj3… qjn Since |Q| = n, it follows from the pigeon-hole principle that js = jt for some 0<=s<t<=n, i.e., some state appears on this path twice (perhaps many states appear more than once, but at least one does).

… … … same cycle q0 qj1 qjs qjt qjn q0 qjs = qjt qjn a1 a2 as as+1 at … … … same cycle q0 qj1 qjs qjt a1 a2 as as+1 at at+1 qjn an q0 qjs = qjt qjn a1…as at+1…an as+1…at

Since 0<=s<t<=n and uv = a1…at it follows that: Let: u = a1…as v = as+1…at Since 0<=s<t<=n and uv = a1…at it follows that: 1<= |v| and therefore 1<=|uv| |uv|<=n and therefore 1 <= |uv| <= n In addition, let: w = at+1…am It follows that uviw = a1…as(as+1…at)iat+1…am is in L(M), for all i >= 0.  In other words, when processing the accepted string x, the cycle was traversed once, but could have been traversed as many times as desired, and the resulting string would still be accepted.

q0 0,1 q4 q2 q1 q3 1 Example: n = 5 x = 0001000 is in L(M) 0 0 0 1 0 0 0 q0 q1 q2 q3 q1 q2 q3 q4 first n+1 states u = 0 v = 001 w = 000 uviw is in L(M), i.e., 0(001)i000 is in L(M), for all i>= 0 q0 0,1 q4 q2 q1 q3 1

Note this does not mean that every string accepted by the DFA has this form: 001 is in L(M) but is not of the form 0(001)i000 Similarly, this doesn’t even mean that every long string accepted by the DFA has this form: 0011111 is in L(M), is very long, but is not of the form 0(001)i000 Note, however, in this latter case 0011111 could be similarly decomposed. q0 0,1 q4 q2 q1 q3 1

Note: It may be the case that no x in L(M) has |x| >= n. q0 0,1 q1 q2

Example: n = 4 x = bbbab is in L(M) b b b a b |x| = 5 q0 q0 q0 q0 q1 q3 u = ε v = b w = bbab (b)ibbab is in L(M), for all i>=0 a b a q0 q1 b b a q2 q3 a,b

Example: n = 4 x = bbbab is in L(M) b b b a b |x| = 5 q0 q0 q0 q0 q1 q3 u = b v = b w = bab b(b)ibab is in L(M), for all i>=0 a b a q0 q1 b b a q2 q3 a,b

Example: n = 4 x = bbbab is in L(M) b b b a b |x| = 5 q0 q0 q0 q0 q1 q3 u = bb v = b w = ab bb(b)iab is in L(M), for all i>=0 a b a q0 q1 b b a q2 q3 a,b

Example: n = 4 x = bbbab is in L(M) b b b a b |x| = 5 q0 q0 q0 q0 q1 q3 u = b v = bb w = ab b(bb)iab is in L(M), for all i>=0 a b a q0 q1 b b a q2 q3 a,b

NonRegularity Example Theorem: The language: L = {0k1k | k>=0} (1) is not regular. Proof: (by contradiction) Suppose that L is regular. Then there exists a DFA M such that: L = L(M) (2) We will show that M accepts some strings not in L, contradicting (2). Suppose that M has n states, and consider a string x=0m1m, where m>>n. By (1), x is in L. By (2), x is also in L(M). Note that L basically corresponds to balanced parenthesis.

Also, since x = 0m1m it follows that uv is a substring of 0m. Since x is very long, i.e., |x| = 2*m >> n, it follows from the pumping lemma that: x = uvw 1 <= |uv| <= n 1 <= |v|, and uviw is in L(M), for all i >= 0 Since 1<=|uv|<=n and n<<m, it follows that 1<=|uv|<=m. Also, since x = 0m1m it follows that uv is a substring of 0m. In other words v=0j, for some j>=1. Since uviw is in L(M), for all i>=0, it follows that 0m+cj1m is in L(M), for all c>=1. But by (2), 0m+cj1m is in L, for any c>=1, a contradiction.  By the way, 0n1n corresponds to balanced parenthasis. What does this say about DFAs? They can’t count arbitrarily high. But can we do this with a program? Of course. Count the number of 0’s, count the number of 1’s, and then compare them with an…IF STATEMENT! Told you if-statements would be helpful! So what does this all mean…are DFAs a good model of general computing? Of course not, they can’t even count! A very primitive device.

NonRegularity Example Theorem: The language: L = {0k1k2k | k>=0} (1) is not regular. Proof: (by contradiction) Suppose that L is regular. Then there exists a DFA M such that: L = L(M) (2) We will show that M accepts some strings not in L, contradicting (2). Suppose that M has n states, and consider a string x=0m1m2m, where m>>n. By (1), x is in L. By (2), x is also in L(M).

Also, since x = 0m1m2m it follows that uv is a substring of 0m. Since is very long, i.e., |x| = 3*m >> n, it follows from the pumping lemma that: x = uvw 1 <= |uv| <= n 1 <= |v|, and uviw is in L(M), for all i >= 0 Since 1<=|uv|<=n and n<<m, it follows that 1<=|uv|<=m. Also, since x = 0m1m2m it follows that uv is a substring of 0m. In other words v=0j, for some j>=1. Since uviw is in L(M), for all i>=0, it follows that 0m+cj1m2m is in L(M), for all c>=1. But by (2), 0m+cj1m2m is in L, for any c>=1, a contradiction.  Note that the above proof is almost identical to the previous proof.

NonRegularity Example Theorem: The language: L = {0m1n2m+n | m,n >=0} (1) is not regular. Proof: (by contradiction) Suppose that L is regular. Then there exists a DFA M such that: L = L(M) (2) We will show that M accepts some strings not in L, contradicting (2). Suppose that M has n states, and consider a string x=0m1n2m+n, where m>>n. By (1), x is in L. By (2), x is also in L(M).

Since |x| = m >> n, it follows from the pumping lemma that: x = uvw 1 <= |uv| <= n 1 <= |v|, and uviw is in L(M), for all i >= 0 Since 1<=|uv|<=n and n<<m, it follows that 1<=|uv|<=m. Also, since x = 0m1n2m+n it follows that uv is a substring of 0m. In other words v=0j, for some j>=1. Since uviw is in L(M), for all i>=0, it follows that 0m+cj1m2m+n is in L(M), for all c>=1. In other words v can be “pumped” as many times as we like, and we still get a string in L(M). But by (2), 0m+cj1n2m+n is in L, for any c>=1, a contradiction.  Note that the above proof is almost identical to the previous proof.

By the way… {0n1n | 0=<n} Is not regular {0n1n | 0=<n<=k, for some fixed k} Is regular, for any fixed k. For k=3: L = {ε, 01, 0011, 000111} q0 q1 q2 q3 1 1 1 1 0,1 1 1 q7 q6 q5 q4 0,1

The following theorems will all exploit the following observations… Let Σ be an alphabet, and let |Σ| = k. How many strings of length n are there? kn For example, if Σ ={0,1} and n = 3, then there are 8 strings of length 3.

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA. Then L(M) is non-empty iff there exists an x in L(M) such that |x| < |Q|. Proof: (if) Suppose there exists a string x in L(M) such that |x| < |Q|. Then clearly L(M) is non-empty.

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA. Then L(M) is non-empty iff there exists an x in L(M) such that |x| < |Q|. (only if) By contradiction. Suppose that L(M) is non-empty, but that there exists no string x in L(M) such that |x| < |Q|. It follows that |y| >= |Q|, for all y in L(M). Let z be a string of shortest length in L(M). Then |z| >= |Q|. By the pumping lemma z=uvw, |v|>=1 and uviw is in L(M) for all i>=0. But then uv0w = uw is in L(M) and: |uw| = |z| - |v| <= |z| - 1 because |v| >= 1 < |z| Since uw is in L(M), it follows that z is not a string of shortest length in L(M), a contradiction.

Corollary: Let M = (Q, Σ, δ, q0, F) be a DFA Corollary: Let M = (Q, Σ, δ, q0, F) be a DFA. Then there is an algorithm to determine if L(M) is empty. Proof: From the theorem it follows that if L(M) is non-empty then there exists a string x where |x| < n and n = |Q| such that M accepts x. We can try running M on each string of length < n to see if any are accepted. If one is accepted then L(M) is non-empty, otherwise L(M) is empty. Since the number of states |Q| and the number of input symbols |Σ| are both fixed, it follows that there are at most a finite number of strings (of length less than |Q|) that need to be tested. Note that a simple graph search algorithm works here too…in fact, the above algorithm is a graph search…

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA. Then L(M) is finite iff |x| < |Q| for all x in L(M). Proof: (if) Suppose that |x| < |Q| for all x in L(M). Since the number of states |Q| and the number of input symbols |Σ| are both fixed, it follows that there are at most a finite number of strings of length less than |Q|. It follows that L(M) is finite (exercise: give an upper bound on the number of such strings). (only if) By contradiction. Suppose that L(M) is finite, but that |x| >= |Q| for some x in L(M). From the pumping lemma it follows that x=uvw, |v|>=1 and uviw is in L(M) for all i>=0. But then L(M) would be infinite, a contradiction. 

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA. Then L(M) is infinite iff there exists an x in L(M) such that |x| >= |Q|. Proof: (if) Suppose there exists an x in L(M) such that |x| >= |Q|. From the pumping lemma it follows that x=uvw, |v|>=1 and uviw is in L(M) for all i>=0. Therefore L(M) is infinite. (only if) By contradiction. Suppose that L(M) is infinite, but that there is no x in L(M) such that |x| >= |Q|. It follows that each x in L(M) has length less than |Q|. Since the number of states |Q| and the number of input symbols |Σ| are both fixed, it follows that there are at most a finite number of strings of length less than |Q|. It follows that L(M) is finite. A contradiction.  Note that the above also follows directly from the previous theorem.

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA and let n = |Q| Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA and let n = |Q|. Then L(M) is infinite iff there exists an x in L(M) such that n <= |x| < 2n. Proof: (left as an exercise; similar to the previous theorem). Corollary: Let M = (Q, Σ, δ, q0, F) be a DFA. Then there is an algorithm to determine if L(M) is infinite. Proof: (left as an exercise).

Closure Properties of Regular Languages Consider various operations on languages: = {x | x is in Σ* and x is not in L} = Σ* - L = {x | x is in L1 or L2} = {x | x is in L1 and L2} = {x | x is in L1 but not in L2} L1L2 = {xy | x is in L1 and y is in L2} L* = Li = L0 U L1 U L2 U… L+ = Li = L1 U L2 U… A closure property for the regular languages states that if a set of one or more languages is regular, then some operation on that set results in a regular language.

Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA Theorem: Let M = (Q, Σ, δ, q0, F) be a DFA. Then M’ = (Q, Σ, δ, q0, Q-F) is a DFA where L(M’) = Σ* - L(M). Proof: (left as an exercise). Corollary: The regular languages are closed under complementation, i.e., if L is a regular language then so is . Does the above construction work for NFAs?

Theorem: The regular languages are closed with respect to union, concatenation and Kleene closure. Proof: Let L1 and L2 be regular languages. Then there exist regular expressions r1 and r2 such that L(r1) = L1 and L(r2) = L2. But then r1 + r2 is a regular expression representing . Similarly for concatenation and Kleene closure.

How about Intersection? Lemma: Let L1 and L2 be subsets of Σ*. Then .

Theorem: Let L1 and L2 be regular languages. Then is a regular language. Proof: (algebraic proof) Let L1 and L2 be regular languages. Then: is regular, because the regular languages are closed wrt complementation is regular, because the regular languages are closed wrt union But by the lemma: 

A direct construction of a DFA M such that : Let: M1 = (Q1, Σ, δ1,p0,F1), where Q1 = {p0, p1,…} M2 = (Q2, Σ, δ2,q0,F2), where Q2 = {q0, q1,…} where L(M1) = L1 and L(M2) = L2. Construct M where: Q = Q1 x Q2 Note that M has a state for each = {[p0, q0], [p0, q1], [p0, q2],…} pair of states in M1 and M2 Σ = as with M1 and M2 F = F1 x F2 start state = [p0, q0] δ([pi, qj], a) = [δ1(pi, a), δ2(qj, a)] for all [pi, qj] in Q and a in Σ

p0 p1 1 1 q0 q1 1 q2 0,1 Example: The construct gives: Q = Q1 x Q2 = {[p0, q0], [p0, q1], [p0, q2], [p1, q0], [p1, q1], [p1, q2]} Σ = {0, 1} F = {[p1, q2]} start state = [p0, q0] δ – Some examples: δ([p0, q2], 0) = [δ1(p0, 0), δ2(q2, 0)] = [p1, q2] δ([p1, q2], 1) = [δ1(p1, 1), δ2(q2, 1)] = [p1, q2] p0 p1 1 1 q0 q1 1 q2 0,1

Question: What if Σ1 does not equal Σ2? Final Result: Question: What if Σ1 does not equal Σ2? Try all three machines on: (what happens?) 0100 0101 0011 1010 1 1 1 p0, q0 p0, q1 p0, q2 p1, q0 p1, q1 p1, q2 1 1 1