Chapter 2 Regular Languages & Finite Automata

Regular Expressions
A finitary denotation of a regular language over Σ.

r        L(r)
Ø        L(Ø) = Ø
a        L(a) = {a}, where a ∈ Σ
r + s    L(r) ∪ L(s), where r and s are regular expressions
rs       L(r) · L(s), where r and s are regular expressions
r*       L(r)*, where r is a regular expression

Examples:
L((a + b)*a) = {xa : x ∈ {a, b}*} = {w : w ends in a}
L(0* + 0*(1 + 11)[00*(1 + 11)]*0*) = {w : w does not contain 111 as a substring}

Example: r = (0 + 10)*(1 + 10)*
Claim: L(r) = {w : every pair of adjacent zeros appears before any pair of adjacent ones}
Justification: w ∈ L(r) implies w = w₁w₂ with w₁ ∈ (0 + 10)* and w₂ ∈ (1 + 10)*. Since a string in (0 + 10)* cannot contain double ones and a string in (1 + 10)* cannot contain double zeros, we can’t have a 11 before a 00. So every double zero appears before any double one.

Continued: r = (0 + 10)*(1 + 10)*
Conversely, any string w with the property (that is, w ≠ …11…00…, so w = …00…00…11…11…) can be written w = xyz, where x is the shortest prefix containing all the double zeros (i.e. ε or ending in 00) and z is the shortest suffix containing all the double ones (i.e. ε or beginning with 11). This is possible precisely because w satisfies the requirement of all double zeros before any double ones.
Now see that y must be of the form (10)* when x and z are both non-empty. (If x or z is ε, then y is merely alternating and may begin with 0 or end with 1, but it is still absorbed by the neighboring factor.)
Since x ∈ (0 + 10)* and z ∈ (1 + 10)*, w = xyz ∈ L((0 + 10)*(10)*(1 + 10)*) = L(r), since (10)* is subsumed in its neighbors.
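The claim about r = (0 + 10)*(1 + 10)* can be sanity-checked by brute force. The following is a minimal sketch, not part of the slides: it assumes Python's `re` module as the matcher (where union is written `|` rather than `+`), and the helper names `in_language`, `zeros_before_ones` and `claim_holds_up_to` are hypothetical. A bounded check like this is evidence, not a proof.

```python
import re
from itertools import product

# r = (0 + 10)*(1 + 10)*, written in Python's regex syntax.
PATTERN = re.compile(r"(?:0|10)*(?:1|10)*")


def in_language(w: str) -> bool:
    """True if the whole string w matches r."""
    return PATTERN.fullmatch(w) is not None


def zeros_before_ones(w: str) -> bool:
    """True if every occurrence of 00 starts before every occurrence of 11."""
    last_00 = w.rfind("00")
    first_11 = w.find("11")
    return last_00 == -1 or first_11 == -1 or last_00 < first_11


def claim_holds_up_to(max_len: int) -> bool:
    """Compare both descriptions of the language on all words up to max_len."""
    return all(
        in_language(w) == zeros_before_ones(w)
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
    )
```

For example, `in_language("0011")` holds while `in_language("1100")` does not, and `claim_holds_up_to(10)` compares the two descriptions on every binary word of length at most 10.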

Regular Expression Identities
Facts:
Ø is the additive identity: Ø + r = r = r + Ø
Ø is the multiplicative zero: Ø·r = Ø = r·Ø
Ø* is the multiplicative identity: Ø*·r = r·Ø* = r
r + r = r
r*r* = r*
(r*)* = r*
r(st) = (rs)t
r + s = s + r
(r + s) + t = r + (s + t)
r(s + t) = rs + rt
(r + s)t = rt + st
(r*s*)* = (r + s)* [HW problem]
Regular operations are monotone (e.g. ‘*’): r ⊆ s ⇒ f(r) ⊆ f(s)

Disjunctive Normal Form (DNF)
Theorem: Every regular expression can be written as r₁ + … + rₙ where each rᵢ does not contain the ‘+’ symbol.
Proof: By structural induction. The base cases are trivial, as is r + s.
r · s = (r₁ + … + rₙ)·(s₁ + … + sₘ) by the induction hypothesis. Now use the distributive laws to finish.
r* = (r₁ + … + rₙ)* by IH = (r₁* ··· rₙ*)*, which follows from the identity (a + b)* = (a*b*)* by induction.

Deterministic Finite Automata: DFA
Formal syntax: M = ⟨Q, Σ, δ, s, F⟩
Q: finite set of states
Σ: alphabet
δ: transition function, δ : Q × Σ → Q (current state and input symbol to next state)
s: start state, s ∈ Q
F: final states, F ⊆ Q
[Figure: an input tape holding a string over Σ, a read head, and a finite control.]

DFA Example: parity
Q = {q₀, q₁}, Σ = {0, 1}, s = q₀, F = {q₀}
L(M) = {w ∈ {0, 1}* : w has an even number of ones}

q     σ    δ(q, σ)
q₀    0    q₀
q₀    1    q₁
q₁    0    q₁
q₁    1    q₀

E.g. let w = 0110 ∈ L(M):
(q₀, 0110) ⊢_M (q₀, 110) ⊢_M (q₁, 10) ⊢_M (q₀, 0) ⊢_M (q₀, ε)

Algorithm for DFA
    q := s                   {M begins in state s}
    h := 1                   {with head leftmost}
    while σ(h) <> blank      {as long as head is reading a symbol}
        q := δ(q, σ(h))      {change state}
        h := h + 1           {move head right one symbol}
    accept := q in F         {accept if we end in a final state}

Formal definitions for semantics:
A configuration of M is an element (q, w) of Q × Σ*, where w is the part of the input that hasn’t been read yet.
The yields function ⊢_M : Q × Σ⁺ → Q × Σ* is given by (q, σw) ⊢_M (δ(q, σ), w).
M accepts w ⇔ (s, w) ⊢_M* (f, ε) for some f ∈ F.
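The algorithm and the configuration semantics above can be sketched in Python. This is an illustrative sketch under assumptions, not the slides' own code: the transition function is stored as a dictionary keyed by (state, symbol), and the names `dfa_accepts`, `configurations` and `PARITY` are hypothetical. The data is the parity machine from the earlier example.

```python
def dfa_accepts(delta, start, finals, word):
    """Run the DFA on word and report whether it ends in a final state."""
    q = start                       # M begins in state s
    for symbol in word:             # the loop plays the role of the moving head
        q = delta[(q, symbol)]      # change state
    return q in finals              # accept if we end in a final state


def configurations(delta, start, word):
    """List the configurations (state, unread input) visited on word."""
    q = start
    trace = [(q, word)]
    for i, symbol in enumerate(word):
        q = delta[(q, symbol)]
        trace.append((q, word[i + 1:]))
    return trace


# Parity DFA: q0 means "even number of ones so far".
PARITY = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
```

Calling `configurations(PARITY, "q0", "0110")` reproduces the chain of configurations shown for w = 0110, with the empty string standing for ε.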

DFA Example #2
L(M) = {w : w is a sequence of pairs ab or ba}
[State diagram: states q₀, q₁, q₂, q₃ with transitions labelled a and b.]

More DFA Examples
Example of using minus: L(M) = 1*0(0 + 1)* = Σ* − 1*
[State diagram for this example, with edges labelled 0 and 1.]

A simplified finite automaton M recognizing (0 + 1)*10:

State    Input 0    Input 1
A        A          B
B        C          B
C        A          B
[State diagram: states A, B, C.]

DFA closure properties
Let M₁ = (Q₁, Σ, s₁, A₁, δ₁) and M₂ = (Q₂, Σ, s₂, A₂, δ₂) accept the languages L₁ and L₂ respectively.
Let M = (Q₁ × Q₂, Σ, (s₁, s₂), A, δ), where δ((p₁, p₂), a) = (δ₁(p₁, a), δ₂(p₂, a)) for any p₁ ∈ Q₁, p₂ ∈ Q₂, and a ∈ Σ.
Claim:
1. If A = {(q₁, q₂) : q₁ ∈ A₁ or q₂ ∈ A₂}, M accepts L₁ ∪ L₂.
2. If A = {(q₁, q₂) : q₁ ∈ A₁ & q₂ ∈ A₂}, M accepts L₁ ∩ L₂.
3. If A = {(q₁, q₂) : q₁ ∈ A₁ & q₂ ∉ A₂}, M accepts L₁ − L₂.
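The product construction can be sketched as follows. This is a minimal illustration under assumptions: a DFA is a tuple (states, alphabet, start, accepting, delta) in the same order as above, `keep` is a hypothetical parameter that decides acceptance of a pair from the two component verdicts, and the two sample machines `EVEN_ONES` and `ENDS_IN_0` are invented for the example.

```python
from itertools import product


def run(dfa, word):
    """Return True if the DFA accepts word."""
    _states, _alphabet, start, accepting, delta = dfa
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in accepting


def product_dfa(m1, m2, keep):
    """Build the product machine; keep(in_A1, in_A2) selects the accepting pairs."""
    q1s, alphabet, s1, a1, d1 = m1
    q2s, _, s2, a2, d2 = m2
    states = set(product(q1s, q2s))
    delta = {
        ((p1, p2), a): (d1[(p1, a)], d2[(p2, a)])
        for (p1, p2) in states
        for a in alphabet
    }
    accepting = {(p1, p2) for (p1, p2) in states if keep(p1 in a1, p2 in a2)}
    return states, alphabet, (s1, s2), accepting, delta


# Two hypothetical sample machines over {0, 1}.
EVEN_ONES = ({"e", "o"}, {"0", "1"}, "e", {"e"},
             {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"})
ENDS_IN_0 = ({"n", "y"}, {"0", "1"}, "n", {"y"},
             {("n", "0"): "y", ("n", "1"): "n", ("y", "0"): "y", ("y", "1"): "n"})

UNION = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x or y)
INTERSECTION = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x and y)
DIFFERENCE = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x and not y)
```

Only the accepting set changes between the three claims; the states and transitions of the product are the same in each case.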

Nondeterministic Finite Automata: NFA
Same as DFA except Δ : Q × Σ → 2^Q, i.e. there is an edge from p to q labelled σ iff q ∈ Δ(p, σ).
The yields relation is no longer a function: (q, σx) ⊢_M (q′, x) if q′ ∈ Δ(q, σ).
⊢_M* is the same as before (meaning: all valid paths).
Acceptance: M accepts w ⇔ ∃ f ∈ F such that (s, w) ⊢_M* (f, ε).

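NFA Example: search for x
x = σ₁ … σₙ
[State diagram: in summary, s leads to f on x, with a Σ self-loop on each of s and f. In detail, q₀ leads to q₁ on σ₁, and so on up to qₙ on σₙ, with Σ self-loops on q₀ and qₙ.]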
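A sketch of the search automaton and of NFA acceptance, tracking the set of all states reachable on the input read so far. The representation is an assumption made for illustration: Δ is a dictionary from (state, symbol) to a set of states, states are numbered 0 to n, and `search_nfa` and `nfa_accepts` are hypothetical names.

```python
def search_nfa(x, alphabet):
    """NFA with states 0..n that accepts every word containing x as a substring."""
    n = len(x)
    delta = {}
    for a in alphabet:
        delta.setdefault((0, a), set()).add(0)    # Σ self-loop on the start state
        delta.setdefault((n, a), set()).add(n)    # Σ self-loop on the final state
    for i, symbol in enumerate(x):
        delta.setdefault((i, symbol), set()).add(i + 1)   # spine spelling out x
    return delta, 0, {n}


def nfa_accepts(delta, start, finals, word):
    """Follow all valid paths at once by keeping the set of current states."""
    current = {start}
    for a in word:
        current = set().union(*(delta.get((q, a), set()) for q in current))
    return bool(current & finals)
```

For instance, with `delta, s, F = search_nfa("bb", "ab")`, the call `nfa_accepts(delta, s, F, "abba")` succeeds because one of the nondeterministic paths guesses where the occurrence of bb starts.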

NFA to DFA Theorem
Definitions:
δ(p, ε) ≡ p
δ(P, σ) ≡ {δ(p, σ) : p ∈ P}
δ(P, wσ) = δ(δ(P, w), σ)
Δ(p, ε) ≡ {p}
Δ(P, σ) ≡ ⋃ {Δ(p, σ) : p ∈ P}
Δ(P, wσ) = Δ(Δ(P, w), σ)
Theorem: For every NFA M = ⟨Q, Σ, s, F, Δ⟩, there is an equivalent DFA.
Proof: Define the DFA M′ = ⟨2^Q, Σ, {s}, {P ⊆ Q : P ∩ F ≠ Ø}, δ⟩, where δ(P, σ) = Δ(P, σ) for P ⊆ Q.
Idea: a single state in M′ is a set of states in M. Show M can reach f ∈ F iff M′ reaches a state containing f.

NFA to DFA Example
M = ⟨{q₀, q₁}, {0, 1}, Δ, q₀, {q₁}⟩
L(M) = {w ∈ {0, 1}* : w doesn’t begin with 10}

Δ     0           1
q₀    {q₀, q₁}    {q₁}
q₁    Ø           …

Try 110.

M′ = ⟨{Ø, {q₀}, {q₁}, {q₀, q₁}}, {0, 1}, δ, {q₀}, {{q₀}, {q₁}, {q₀, q₁}}⟩

δ       0           1
Ø       Ø           Ø
{q₀}    {q₀, q₁}    {q₁}
{q₁}    Ø           …
[State diagrams for M and M′.]

NFA to DFA Proof
Show that Δ(s, w) = δ({s}, w) by induction on |w|.
Basis: |w| = 0 ⇒ w = ε ⇒ Δ(s, ε) = {s} = δ({s}, ε) by definition.
Induction: Take wσ. Then δ({s}, wσ) ≡ δ(δ({s}, w), σ) and Δ(s, wσ) ≡ Δ(Δ(s, w), σ). By the IH, δ({s}, w) = Δ(s, w); call this set P. But δ(P, σ) = Δ(P, σ) by definition!
[Figure: in the NFA, w leads from s to some p ∈ P and σ leads from p to some r ∈ R. In the DFA, w leads from {s} to P (by the IH) and σ leads from P to R (by construction).]

Another NFA to DFA Example
L = {w ∈ {a, b}* : bb ⊆ w}, i.e. w contains bb as a substring

NFA:
Δ    a      b
s    {s}    {s, p}
p    Ø      {q}
q    {q}    {q}

DFA:
δ            a         b
{s}          {s}       {s, p}
{s, p}       {s}       {s, p, q}
{s, p, q}    {s, q}    {s, p, q}
{s, q}       {s, q}    {s, p, q}
[State diagrams for the NFA and the DFA.]
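The subset construction can be sketched in a few lines. Two assumptions are worth flagging: this sketch builds only the subsets reachable from {s}, whereas the theorem's M′ uses all of 2^Q, and the final state of the sample NFA is taken to be q, which is inferred from the language rather than stated in the transcript. The names `subset_construction` and `BB_NFA` are hypothetical.

```python
def subset_construction(delta, start, finals, alphabet):
    """Return (states, dfa_delta, dfa_start, dfa_finals) for the reachable subsets."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        P = todo.pop()
        for a in alphabet:
            # δ(P, a) = union of Δ(p, a) over p in P
            R = frozenset(q for p in P for q in delta.get((p, a), ()))
            dfa_delta[(P, a)] = R
            if R not in seen:
                seen.add(R)
                todo.append(R)
    dfa_finals = {P for P in seen if P & finals}   # P ∩ F ≠ Ø
    return seen, dfa_delta, start_set, dfa_finals


# NFA for "contains bb"; missing entries stand for Ø.
BB_NFA = {
    ("s", "a"): {"s"}, ("s", "b"): {"s", "p"},
    ("p", "b"): {"q"},
    ("q", "a"): {"q"}, ("q", "b"): {"q"},
}

STATES, DELTA, START, FINALS = subset_construction(BB_NFA, "s", {"q"}, "ab")
```

On this input the reachable subsets are exactly the four rows of the DFA table: {s}, {s, p}, {s, p, q} and {s, q}.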

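NFA with ε-moves
Extend Δ to a relation Δ ⊆ Q × (Σ ∪ {ε}) × Q so that the machine can change state without consuming any input (an edge from p to q labelled ε).
Example:

Δ     0       1       2       ε
q₀    {q₀}    Ø       Ø       {q₁}
q₁    Ø       {q₁}    Ø       {q₂}
q₂    Ø       Ø       {q₂}    Ø
[State diagram: q₀, q₁, q₂ with self-loops on 0, 1, 2 respectively and ε-edges from q₀ to q₁ and from q₁ to q₂.]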

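Removing ε-moves
Let Δ_ε* = Δ* ∩ [Q × {ε} × Q], the reflexive transitive closure of the ε-edges. So Δ_ε*(p) = {q : (p, q) ∈ Δ_ε*}.
Extend Δ to Δ′(p, σ) = Δ(Δ_ε*(p), σ).
Extend F to F′ = {p : Δ_ε*(p) ∩ F ≠ Ø}.
Remove all ε-edges and claim that the new machine accepts the same language as the old.
Idea: break paths in the old machine into segments, each a run of ε-edges followed by one symbol edge.
[Figure: an ε-path from p to q followed by an edge on σ is replaced by a single edge on σ; an accepting path reading σ₁, …, σᵢ, …, σₙ with ε-edges in between, ending in f.]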

NFA with ε-moves, example continued

Δ     0       1       2
q₀    {q₀}    {q₁}    {q₂}
q₁    Ø       {q₁}    {q₂}
q₂    Ø       Ø       {q₂}
[State diagram: the machine q₀, q₁, q₂ from the previous example, with its ε-edges.]
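The ε-removal step can be sketched directly from the definitions Δ′(p, σ) = Δ(Δ_ε*(p), σ) and F′ = {p : Δ_ε*(p) ∩ F ≠ Ø}. As assumptions for illustration, the ε-edges are kept in a separate dictionary `eps`, the final state of the example is taken to be q₂ (the transcript does not state it), and the function names are hypothetical.

```python
def eps_closure(eps, p):
    """All states reachable from p by zero or more ε-edges."""
    seen, stack = {p}, [p]
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def remove_eps(states, alphabet, delta, eps, finals):
    """Return (new_delta, new_finals) for an equivalent NFA without ε-edges."""
    new_delta, new_finals = {}, set()
    for p in states:
        closure = eps_closure(eps, p)
        for a in alphabet:
            # Δ′(p, a): take ε-steps first, then one step on a
            new_delta[(p, a)] = set().union(
                *(delta.get((q, a), set()) for q in closure)
            )
        if closure & finals:
            new_finals.add(p)
    return new_delta, new_finals


# The example machine: self-loops on 0, 1, 2 and ε-edges q0 -> q1 -> q2.
DELTA = {("q0", "0"): {"q0"}, ("q1", "1"): {"q1"}, ("q2", "2"): {"q2"}}
EPS = {"q0": {"q1"}, "q1": {"q2"}}

NEW_DELTA, NEW_FINALS = remove_eps({"q0", "q1", "q2"}, "012", DELTA, EPS, {"q2"})
```

The resulting `NEW_DELTA` agrees with the table above, and under the stated assumption about the final state every state becomes final, since q₂ lies in every ε-closure.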

Regular Language to NFA
Theorem: Let r be a regular expression. Then L(r) = L(M) for some NFA M.
Proof:
Basis: r = Ø gives L(M) = Ø; r = a gives L(M) = {a}.
Induction: L(r) = L(M_r) and L(s) = L(M_s) by IH.
r + s: L(M) = {ε}L(M_r) ∪ {ε}L(M_s) = L(r) ∪ L(s) = L(r + s)
r · s: L(M) = L(M_r) · {ε} · L(M_s) = L(r) · L(s) = L(r · s)
r*: L(M) = {ε} ∪ L(M_r)⁺ = L(r)* = L(r*)
[Figures: the machines for Ø and a, and the machines for r + s, r · s and r* built from M_r and M_s with ε-edges.]
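The inductive proof suggests a direct construction. The sketch below follows the common Thompson-style layout (one start and one final state per machine, joined by ε-edges), which may differ in detail from the diagrams on the slide. A machine is assumed to be a triple (start, final, edges) with `None` as the ε label, all function names are hypothetical, and each sub-machine should be used only once because the builders share its states.

```python
from itertools import count

_fresh = count()  # supplies globally unique state numbers


def empty():
    """Machine for Ø: no edges, so the final state is unreachable."""
    return next(_fresh), next(_fresh), []


def sym(a):
    """Machine for a single symbol a."""
    s, f = next(_fresh), next(_fresh)
    return s, f, [(s, a, f)]


def union(m, n):
    s, f = next(_fresh), next(_fresh)
    return s, f, m[2] + n[2] + [
        (s, None, m[0]), (s, None, n[0]), (m[1], None, f), (n[1], None, f)]


def concat(m, n):
    return m[0], n[1], m[2] + n[2] + [(m[1], None, n[0])]


def star(m):
    s, f = next(_fresh), next(_fresh)
    return s, f, m[2] + [
        (s, None, f), (s, None, m[0]), (m[1], None, m[0]), (m[1], None, f)]


def accepts(machine, word):
    """Simulate the ε-NFA by tracking ε-closed sets of states."""
    start, final, edges = machine

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            p = stack.pop()
            for (a, label, b) in edges:
                if a == p and label is None and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({b for (a, label, b) in edges
                           if a in current and label == ch})
    return final in current


# (10*1 + 0)*, the parity expression used in the next example.
PARITY = star(union(concat(sym("1"), concat(star(sym("0")), sym("1"))), sym("0")))
```

With this machine, `accepts(PARITY, w)` should hold exactly for the binary words with an even number of ones.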

Regular Expression for Parity: (10*1 + 0)*
[State diagrams: M_0; M_0*, simplified by eliminating ε-transitions and identifying equivalent states; M_1; M_10*1, before and after simplification.]

Example continued
[State diagram: M_(10*1+0). Use ε-closure, eliminate unreachable states, and combine final states together.]
[State diagram: M_(10*1+0)*. Simplify using ε-closure, since no transitions enter the start state and since the start state and final state are equivalent.]

DFA → Regular Language (classical method)
Theorem: Let M be a DFA. Then L(M) = L(r) for some regular expression r.
Proof: Number the states Q = {s = q₁, …, qₙ} (no q₀). Let R_ij^k be the set of strings from Σ* which take M from state q_i to state q_j without passing through any state numbered higher than k.
R_ij^0 = {a : δ(q_i, a) = q_j} if i ≠ j
R_ij^0 = {a : δ(q_i, a) = q_j} ∪ {ε} if i = j
R_ij^k = R_ij^(k−1) + R_ik^(k−1) ∙ (R_kk^(k−1))* ∙ R_kj^(k−1), where each R^(k−1) is regular by IH
L(M) = ∪ {R_1j^n : q_j ∈ F} is a finite union of regular sets.
[Figure: a path from q_i to q_j either stays within states numbered ≤ k − 1 or passes through q_k.]

DFA → Regular Language Proof
Claim: Each R_ij^k is regular.
Proof by induction on k.
Basis: R_ij^0 = L(r_ij^0) where r_ij^0 = a₁ + … + a_m + Ø*, with each a_i ∈ R_ij^0 and the term Ø* included if ε ∈ R_ij^0.
Induction step: R_ij^k = L(r_ij^k) where r_ij^k = r_ij^(k−1) + r_ik^(k−1) (r_kk^(k−1))* r_kj^(k−1).
L(M) is a finite union of regular sets, hence regular. □

DFA for Parity Using the r_ij^k Method
L(M) = r_11^2 = r_12^1 (r_22^1)* r_21^1 + r_11^1 = 0*1(10*1 + 0)*10* + 0*

          k = 0    k = 1
r_11^k    0 + ε    (0 + ε)(0 + ε)*(0 + ε) + (0 + ε) = 0*
r_12^k    1        (0 + ε)0*1 + 1 = 0*1
r_21^k    1        10*(0 + ε) + 1 = 10*
r_22^k    0 + ε    10*1 + 0 + ε
[State diagram: states q₁ and q₂.]

FA → Regular Language
Start: Number the states s = q₀, …, qₙ.
Idea: find a solution to the problem A_i = {w ∈ Σ* : Δ(q_i, w) ∩ F ≠ Ø} when i = 0.
Solve: the mutually recursive equations A_i = ∑ {σA_j : q_j ∈ Δ(q_i, σ), σ ∈ Σ} + {ε : if q_i ∈ F}.
Show: they can be solved by a regular expression.

Arden’s Lemma
Lemma: The recursive equation X = AX + B, where A and B are languages and ε ∉ A, has the unique solution X = A*B.
Proof: Obviously A(A*B) + B = (A⁺ + ε)B = A*B, so A*B is a solution.
Clearly B ⊆ X ⇒ AB ⊆ X ⇒ … ⇒ A*B ⊆ X for any solution X, which means A*B is minimal.
If a larger solution L existed, then C = L \ A*B ≠ Ø. Then A*B + C = A(A*B + C) + B = A⁺B + AC + B = A*B + AC. Now, C is disjoint from A*B, so (A*B + C) ∩ C = (A*B + AC) ∩ C ⇒ C = AC ∩ C ⇒ C ⊆ AC.
Let x ∈ AC be of minimal length. Then x = yz, y ∈ A, z ∈ C. But ε ∉ A by hypothesis ⇒ z ∈ AC with |z| < |x|, contradiction.
Note: ε ∉ A is not a restriction because in any FA, an epsilon loop from any state to itself can be removed.
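The lemma can be checked numerically on length-bounded languages. This is a sketch under assumptions, not a proof: languages are finite Python sets of strings truncated to length at most n, and the helper names `concat`, `star` and `all_words` are hypothetical. The sample instance is X = (0 + 10*1)X + ε, whose solution should be the even-parity language.

```python
from itertools import product


def concat(A, B, n):
    """Concatenation of two languages, keeping words of length <= n."""
    return {a + b for a in A for b in B if len(a) + len(b) <= n}


def star(A, n):
    """Kleene star of A, keeping words of length <= n."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A, n) - result   # only genuinely new words
        result |= frontier
    return result


def all_words(alphabet, n):
    """Every word over alphabet of length <= n."""
    return {"".join(t) for k in range(n + 1) for t in product(alphabet, repeat=k)}


N = 8
A = {"0"} | {"1" + "0" * k + "1" for k in range(N)}   # 0 + 10*1, truncated
B = {""}                                              # ε
X = concat(star(A, N), B, N)                          # candidate solution A*B
```

Here `X == concat(A, X, N) | B` confirms that A*B satisfies the equation up to length N, and X coincides with the words of length at most N that have an even number of ones.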

Arden’s Lemma Example
A₁ = 1A₀ + 0A₁ = 0*1A₀
A₀ = ε + 0A₀ + 1A₁
A₀ = ε + 0A₀ + 10*1A₀ = ε + (0 + 10*1)A₀ = (0 + 10*1)*ε = (0 + 10*1)*
[State diagram: states q₀ and q₁.]

Picture for Arden’s Lemma
[Figure: the map x ↦ Ax + B; the solution A*B is built up from B, AB, A²B, …, and any extra part C of a larger solution would have to satisfy C ⊆ AC.]

Solving Recursive Equations
Aₙ is given in terms of A₀, …, Aₙ: use Arden’s lemma to eliminate Aₙ, leaving Aₙ in terms of A₀, …, Aₙ₋₁.
Aₙ₋₁ is given in terms of A₀, …, Aₙ: substitute for Aₙ, leaving Aₙ₋₁ in terms of A₀, …, Aₙ₋₁, then apply Arden’s lemma again.
…
A_i is given in terms of A₀, …, Aₙ: repeated substitution for A_{i+1}, …, Aₙ leaves A_i in terms of A₀, …, A_i, then apply Arden’s lemma.
Note that after each phase A_i = … A_{j<i} …. In particular, A₀ is solved.

Pumping Lemma
Theorem: Let L be an infinite regular language. Then there is an n such that every w ∈ L with |w| ≥ n can be written as w = uvx with |v| ≥ 1 and |uv| ≤ n such that for all i ≥ 0, uvⁱx ∈ L.
Proof: Let L = L(M) for some DFA M with n states. Running M on w ∈ L with |w| ≥ n means it visits ≥ n + 1 states, so some state appears twice among the first n + 1 states visited (PHP). Let u be the input read before the first visit to that state and v the input read between the two visits. Then |v| ≥ 1, |uv| ≤ n, and uvⁱx ∈ L(M) for all i ≥ 0.
Temporally: a state appears twice on the path from start to final.
Spatially: we must pass through a loop on the diagram.
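The proof is constructive, and the split w = uvx can be computed by recording when each state is first visited. The following sketch assumes a DFA stored as a dictionary keyed by (state, symbol); `pump_split`, `dfa_accepts` and `PARITY` are hypothetical names, and the sample data is the parity machine.

```python
def dfa_accepts(delta, start, finals, word):
    """Run the DFA on word and report acceptance."""
    q = start
    for symbol in word:
        q = delta[(q, symbol)]
    return q in finals


def pump_split(delta, start, word):
    """Return (u, v, x) where v traces the first loop, or None if no state repeats."""
    first_visit = {start: 0}        # state -> number of symbols read at first visit
    q = start
    for i, symbol in enumerate(word, start=1):
        q = delta[(q, symbol)]
        if q in first_visit:        # pigeonhole: this state was seen before
            j = first_visit[q]
            return word[:j], word[j:i], word[i:]
        first_visit[q] = i
    return None


PARITY = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
```

For w = 1001 the first repeated state is q₁, which gives u = 1, v = 0, x = 01, and every pumped word uvⁱx still has exactly two ones. Because the split stops at the first repetition, |uv| never exceeds the number of states.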

Irregularity Method
Take an infinite language L and assume it is regular, toward a contradiction via the pumping lemma. I.e. show:
∀n, ∃w ∈ L, |w| ≥ n, ∀uvx = w with |uv| ≤ n and v ≠ ε, ∃i ≥ 0: uvⁱx ∉ L
Example: Suppose L = {aᵐbᵐ : m ≥ 0} is regular. Given any n, take w = aⁿbⁿ. Since w = uvx with |uv| ≤ n and |v| ≥ 1, v ∈ a⁺. Choose i = 0 to get uv⁰x = a^(n−|v|) bⁿ ∉ L. Contradiction.
Example: Suppose L = {0^(i²) : i ≥ 1} is regular. Take w = 0^(n²) = uvx, with 1 ≤ |v| ≤ n. So uv²x = 0^(n²+|v|), but n² < n² + |v| < (n + 1)² = n² + 2n + 1. So uv²x ∉ L. Contradiction to PL.
Using closure properties (intersection) to show irregularity:
Example: For L = {ww^R : w ∈ {a, b}*}, let L′ = L ∩ a*bba* = {aⁿb²aⁿ : n ≥ 0}, which is easy to show (by PL) irregular.

Pumping Lemma (explanation)
Idea: If L is infinite and regular, it must satisfy:
∃n ∀w ∈ L with |w| ≥ n, ∃uvx = w with |v| ≥ 1 and |uv| ≤ n, ∀i ≥ 0: uvⁱx ∈ L
I.e. if L is infinite and doesn’t satisfy the property, then it can’t be regular:
∀n ∃w ∈ L with |w| ≥ n, ∀uvx = w with |v| ≥ 1 and |uv| ≤ n, ∃i ≥ 0: uvⁱx ∉ L

Decision Algorithms for Regular Sets
Suppose L is given by an FA M (no ε-transitions) with start state s. Let → be the directed graph of M (ignore transition labels). Then
L(M) ≠ Ø iff s →* f for some final state f.
|L(M)| = ∞ iff s →* q →⁺ q →* f for some state q and some final state f.
Equivalence: L₁ = L₂ iff (L₁ ⊆ L₂ and L₂ ⊆ L₁) ⇔ (L₁ ∪ L₂) ∖ (L₁ ∩ L₂) = Ø.
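Both graph conditions reduce to reachability searches. The sketch below assumes the automaton is stored as a dictionary from (state, symbol) to a set of successor states, with missing entries meaning Ø; the function names and the two sample machines are hypothetical.

```python
def successors(delta):
    """Forget the labels: map each state to the states one edge away."""
    succ = {}
    for (p, _symbol), targets in delta.items():
        succ.setdefault(p, set()).update(targets)
    return succ


def reachable(succ, sources):
    """All states q with p →* q for some p in sources."""
    seen, stack = set(sources), list(sources)
    while stack:
        p = stack.pop()
        for q in succ.get(p, ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def is_nonempty(delta, start, finals):
    """s →* f for some final state f."""
    return bool(reachable(successors(delta), {start}) & set(finals))


def is_infinite(delta, start, finals):
    """s →* q →⁺ q →* f for some state q and final state f."""
    succ = successors(delta)
    for q in reachable(succ, {start}):
        on_cycle = q in reachable(succ, succ.get(q, set()))
        if on_cycle and reachable(succ, {q}) & set(finals):
            return True
    return False


PARITY = {("q0", "0"): {"q0"}, ("q0", "1"): {"q1"},
          ("q1", "0"): {"q1"}, ("q1", "1"): {"q0"}}
ONLY_AB = {(0, "a"): {1}, (1, "b"): {2}}   # accepts just the word ab
```

The parity machine is both nonempty and infinite, while `ONLY_AB` with final state 2 is nonempty but finite because no reachable state lies on a cycle.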

Closure Properties Example
Using closure properties to prove non-regularity.
Show {aⁿbaⁿ : n ≥ 1} is not regular.
Define
h₁(a) = a, h₁(b) = ba, h₁(c) = a
h₂(a) = 0, h₂(b) = 1, h₂(c) = 1
h₁⁻¹({aⁿbaⁿ : n ≥ 1}) ⊆ {(a + c)ⁿ b (a + c)^(n−1) : n ≥ 1}
so h₁⁻¹({aⁿbaⁿ : n ≥ 1}) ∩ a*bc* = {aⁿbc^(n−1) : n ≥ 1}
and h₂({aⁿbc^(n−1) : n ≥ 1}) = {0ⁿ 1 1^(n−1) : n ≥ 1} = {0ⁿ1ⁿ : n ≥ 1}, which is not regular.

Decision Algorithms for Regular Sets
If L is regular: Does L = 0 (Ø)? Does L = 1 (Ø*)? Is L finite?
For a regular expression, this can be answered recursively.

Is r = Ø?
Basis: Ø = Ø; a ≠ Ø
Induction:
r + s = Ø ⇔ r = Ø and s = Ø
r∙s = Ø ⇔ r = Ø or s = Ø
r* ≠ Ø

Is r = Ø*?
Basis: Ø ≠ Ø*; a ≠ Ø*
Induction:
r + s = Ø* ⇔ r or s = Ø* and the other = Ø or Ø*
r∙s = Ø* ⇔ r = Ø* = s
r* = Ø* ⇔ r = Ø* or r = Ø

Is |r| < ∞?
Basis: |Ø| < ∞; |a| < ∞
Induction:
|r + s| < ∞ ⇔ |r| and |s| are both < ∞
|r∙s| < ∞ ⇔ r or s = Ø, or |r| and |s| are both < ∞
|r*| < ∞ ⇔ r = Ø or r = Ø*
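The three recursions translate almost line for line into code. In this sketch a regular expression is assumed to be a nested tuple with tags "empty", "sym", "alt", "cat" and "star"; that encoding and the function names are hypothetical choices made for illustration.

```python
def is_empty(r):
    """Does r denote Ø?"""
    tag = r[0]
    if tag == "empty":
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return is_empty(r[1]) and is_empty(r[2])
    if tag == "cat":
        return is_empty(r[1]) or is_empty(r[2])
    return False                                  # star: r* always contains ε


def is_epsilon(r):
    """Does r denote Ø*, i.e. the language {ε}?"""
    tag = r[0]
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return ((is_epsilon(r[1]) and (is_empty(r[2]) or is_epsilon(r[2])))
                or (is_epsilon(r[2]) and is_empty(r[1])))
    if tag == "cat":
        return is_epsilon(r[1]) and is_epsilon(r[2])
    return is_epsilon(r[1]) or is_empty(r[1])     # star


def is_finite(r):
    """Does r denote a finite language?"""
    tag = r[0]
    if tag in ("empty", "sym"):
        return True
    if tag == "alt":
        return is_finite(r[1]) and is_finite(r[2])
    if tag == "cat":
        return (is_empty(r[1]) or is_empty(r[2])
                or (is_finite(r[1]) and is_finite(r[2])))
    return is_empty(r[1]) or is_epsilon(r[1])     # star


EMPTY = ("empty",)
EPS = ("star", EMPTY)      # Ø*
A = ("sym", "a")
```

For example, `is_finite(("cat", ("star", A), EMPTY))` is true even though a* alone is infinite, because concatenation with Ø yields Ø.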

Simplifying regular expressions
Fact: If we let Ø* = ε, then every non-empty regular expression can be written without the use of the empty set.
Reason: Ø can be removed bottom-up from every subexpression because it behaves like the additive identity and multiplicative zero. The only exception is Kleene star, Ø*.
Fact: Every regular language without ε can be written without the use of ε.
Reason: Ø* can be removed top-down from every subexpression without ε because it behaves like the multiplicative identity.
Fact: Once these exceptional cases are removed, a regular expression denotes an infinite language iff it contains a Kleene star (*).

Decision algorithm for Regular Sets (classical treatment)
Assume all regular languages are represented by a DFA M with n states.
(1) L(M) is nonempty ⇔ ∃w, |w| < n, w ∈ L(M)
Proof: (1) (⇐) obvious.
(⇒) Let w be a minimal length word accepted by M. If |w| ≥ n, then by the pumping lemma, w = uvx, |v| ≥ 1, and ux ∈ L(M), which contradicts minimality of |w|. Therefore |w| < n.

Decision algorithm for Regular Sets (classical treatment), cont.
(2) L(M) is infinite ⇔ ∃w, n ≤ |w| < 2n, w ∈ L(M)
Proof: (2) (⇐) If ∃w ∈ L(M), n ≤ |w| < 2n, then by the pumping lemma, w = uvx and uvⁱx ∈ L(M) ∀i ≥ 0 (|v| ≠ 0), which implies L(M) is infinite.
(⇒) Suppose L(M) is infinite, with w ∈ L(M) of minimal length ≥ n. If |w| ≥ 2n, then by the pumping lemma, w = uvx with 1 ≤ |v| ≤ n. Then ux ∈ L(M) and n ≤ |ux| < |w|, which is a contradiction. □
(3) Equivalence: There is an algorithm to determine if L(M₁) = L(M₂).
Proof: (L(M₁) ∩ L′(M₂)) ∪ (L′(M₁) ∩ L(M₂)) = Ø ⇔ L(M₁) = L(M₂), where L′ denotes the complement of L. □
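Read literally, tests (1) to (3) are finite searches over bounded-length words. The sketch below assumes a DFA given as a tuple (states, alphabet, start, finals, delta); for (3) it relies on the product of the two machines having at most n₁·n₂ states, so a distinguishing word, if any, is shorter than that. All names and sample machines are hypothetical, and the search is exponential in n, so this illustrates decidability rather than an efficient procedure.

```python
from itertools import product


def run(dfa, word):
    """Return True if the DFA accepts word."""
    _states, _alphabet, start, finals, delta = dfa
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in finals


def words(alphabet, lo, hi):
    """Yield every word w over alphabet with lo <= |w| < hi."""
    for length in range(lo, hi):
        for letters in product(sorted(alphabet), repeat=length):
            yield "".join(letters)


def is_nonempty(dfa):
    n = len(dfa[0])
    return any(run(dfa, w) for w in words(dfa[1], 0, n))          # test (1)


def is_infinite(dfa):
    n = len(dfa[0])
    return any(run(dfa, w) for w in words(dfa[1], n, 2 * n))      # test (2)


def equivalent(m1, m2):
    bound = len(m1[0]) * len(m2[0])   # number of states of the product machine
    return not any(run(m1, w) != run(m2, w) for w in words(m1[1], 0, bound))


PARITY = ({"e", "o"}, {"0", "1"}, "e", {"e"},
          {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"})
# Counts ones modulo 4 and accepts on 0 or 2: the same language, more states.
PARITY4 = ({0, 1, 2, 3}, {"0", "1"}, 0, {0, 2},
           {(i, c): (i + int(c)) % 4 for i in range(4) for c in "01"})
ENDS_IN_0 = ({"n", "y"}, {"0", "1"}, "n", {"y"},
             {("n", "0"): "y", ("n", "1"): "n", ("y", "0"): "y", ("y", "1"): "n"})
```

Here `equivalent(PARITY, PARITY4)` holds, while `equivalent(PARITY, ENDS_IN_0)` fails already on the empty word.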