Regular Expressions Section 1.3 (also 1.1, 1.2) CSC 4170 Theory of Computation
Regular operations 1.3.a
Union: L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2}
  {Good,Bad} ∪ {Boy,Girl} =
  {0,00,000,…} ∪ {1,11,111,…} =
  L ∪ ∅ =
Concatenation: L1 ∘ L2 = {xy | x ∈ L1 and y ∈ L2}
  {Good,Bad} ∘ {Boy,Girl} =
  {0,00,000,…} ∘ {1,11,111,…} =
  L ∘ ∅ =
Star: L* = {x1…xk | k ≥ 0 and each xi ∈ L}
  {Boy,Girl}* =
  {0,00,000,…}* =
  ∅* =
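The three operations can be tried out on small finite languages. The following is a minimal sketch, assuming languages are modeled as Python sets of strings; the function names are illustrative, and because L* is infinite whenever L contains a nonempty string, `star` only builds the part made of at most `max_k` pieces.

```python
from itertools import product

def union(l1, l2):
    """L1 ∪ L2: the strings that are in L1 or in L2."""
    return l1 | l2

def concat(l1, l2):
    """L1 ∘ L2: every x from L1 followed by every y from L2."""
    return {x + y for x, y in product(l1, l2)}

def star(l, max_k):
    """Finite part of L*: concatenations of at most max_k strings from L."""
    result = {""}            # k = 0 contributes the empty string
    layer = {""}
    for _ in range(max_k):
        layer = concat(layer, l)
        result |= layer
    return result

a = {"Good", "Bad"}
b = {"Boy", "Girl"}
print(sorted(union(a, b)))     # ['Bad', 'Boy', 'Girl', 'Good']
print(sorted(concat(a, b)))    # ['BadBoy', 'BadGirl', 'GoodBoy', 'GoodGirl']
print(concat(a, set()))        # set()
print(star(set(), 3))          # {''}
```

The last two lines show the behavior of the empty language: concatenating anything with ∅ leaves no strings at all, while the star of ∅ still contains the empty string because k = 0 is allowed.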
Regular expressions 1.3.b
We say that R is a regular expression (RE) iff R is one of the following:
1. a, where a is a symbol of the alphabet
2. ε
3. ∅
4. (R1) ∪ (R2), where R1 and R2 are RE
5. (R1) ∘ (R2), where R1 and R2 are RE
6. (R1)*, where R1 is an RE
What language is represented by the expression:
1. {a}
2. {ε}
3. ∅
4. The union of the languages represented by R1 and R2
5. The concatenation of the languages represented by R1 and R2
6. The star of the language represented by R1
Conventions:
The symbol ∘ is often omitted in RE.
Some parentheses can be omitted. The precedence order for the operators is: * (highest), ∘ (medium), ∪ (lowest).
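The six-case definition maps directly onto a recursive data type, and the "what language is represented" column onto a recursive function. The sketch below is one possible encoding, not part of the slides: the class names (`Sym`, `Eps`, `Empty`, `Union`, `Concat`, `Star`) and the function `language` are assumptions made for illustration, and only strings up to a chosen length are enumerated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:          # case 1: a single alphabet symbol
    a: str

@dataclass(frozen=True)
class Eps:          # case 2: ε
    pass

@dataclass(frozen=True)
class Empty:        # case 3: ∅
    pass

@dataclass(frozen=True)
class Union:        # case 4: (R1) ∪ (R2)
    r1: object
    r2: object

@dataclass(frozen=True)
class Concat:       # case 5: (R1) ∘ (R2)
    r1: object
    r2: object

@dataclass(frozen=True)
class Star:         # case 6: (R1)*
    r1: object

def language(r, max_len):
    """Strings of length <= max_len in the language represented by r."""
    if isinstance(r, Sym):
        return {r.a} if len(r.a) <= max_len else set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Union):
        return language(r.r1, max_len) | language(r.r2, max_len)
    if isinstance(r, Concat):
        return {x + y
                for x in language(r.r1, max_len)
                for y in language(r.r2, max_len)
                if len(x + y) <= max_len}
    if isinstance(r, Star):
        base = language(r.r1, max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:                      # grow by one more piece each round
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")

# 0*10*, with the omitted ∘ and parentheses written out in full.
r = Concat(Concat(Star(Sym("0")), Sym("1")), Star(Sym("0")))
print(sorted(language(r, 3), key=lambda s: (len(s), s)))
# ['1', '01', '10', '001', '010', '100']
```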
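Regular languages 1.3.c
A language is said to be regular iff it can be represented by a regular expression.
Language: {11}    Expression:
Language: {Boy, Girl, Good, Bad}    Expression:
Language: {ε,0,00,000,0000,…}    Expression:
Language: {0,00,000,0000,…}    Expression:
Language: {ε,01,0101,010101,…}    Expression:
Language: {x | x = 0^k where k is a multiple of 2 or 3}    Expression:
Language: {x | x is divisible by 8}    Expression:
Language: {x | x MOD 4 = 3}    Expression: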
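A candidate expression for one of these languages can be sanity-checked by brute force. The sketch below assumes the slide notation is translated to Python's `re` syntax by replacing ∪ with `|`; the helper names `strings` and `agrees` are illustrative, and the candidates shown are only possible answers, not the ones from the lecture. Agreement up to a length bound is evidence, not a proof.

```python
import re
from itertools import product

def strings(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def agrees(expr, predicate, alphabet, max_len=10):
    """True if `expr` matches exactly the strings satisfying `predicate`,
    for every string over `alphabet` up to length max_len."""
    pattern = re.compile(expr.replace("∪", "|").replace(" ", ""))
    return all(bool(pattern.fullmatch(s)) == predicate(s)
               for s in strings(alphabet, max_len))

def two_or_three(s):
    """s = 0^k where k is a multiple of 2 or 3."""
    return len(s) % 2 == 0 or len(s) % 3 == 0

print(agrees("(00)* ∪ (000)*", two_or_three, "0"))                    # True
print(agrees("(00)*", two_or_three, "0"))                             # False
print(agrees("00*", lambda s: len(s) >= 1, "0"))                      # True
print(agrees("(01)*", lambda s: s == "01" * (len(s) // 2), "01"))     # True
```

The second call fails because `(00)*` misses 000, which shows how a wrong candidate is caught.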
Exercising reading regular expressions 1.3.d
Expression: 0*10*    Language:
Expression: (Good ∪ Bad)(Boy ∪ Girl)    Language:
Expression: (Tom ∪ Bob)_is_(good ∪ bad)    Language:
Expression:    Language: {Name_is_adjective | Name is an uppercase letter followed by zero or more lowercase letters, and adjective is a lowercase letter followed by zero or more lowercase letters}
Expression: (0 ∪ 1)*101(0 ∪ 1)*    Language:
Expression: ((0 ∪ 1)(0 ∪ 1))*    Language:
Regular languages and DFA-recognizable languages are the same 1.3.e Theorem 1.54* A language is regular if and only if some NFA (DFA) recognizes it. In other words, a) [The “only if” part] For every regular expression there is an NFA that recognizes exactly the language represented by that expression. b) [The “if” part] For every NFA there is a regular expression that represents exactly the language recognized by that NFA.
Constructing an NFA from a regular expression: Base cases 1.3.f
Case of a, where a is a symbol of the alphabet.
Case of ε.
Case of ∅.
[Figures: one NFA for each base case]
Constructing an NFA from a regular expression: Case of union 1.3.g
Case of (R1) ∪ (R2), where R1 and R2 are RE.
First, construct NFAs N1 and N2 from R1 and R2:
[Figure: N1 with start state s1, and N2 with start state s2]
Then, combine them in the following way:
[Figure: N1 and N2 combined into one NFA]
Constructing an NFA from a regular expression: Case of concatenation 1.3.h
Case of (R1) ∘ (R2), where R1 and R2 are RE.
First, construct NFAs N1 and N2 from R1 and R2:
[Figure: N1, and N2 with start state s2]
Then, combine them in the following way:
[Figure: N1 and N2 combined into one NFA, with start state s1]
Constructing an NFA from a regular expression: Case of star 1.3.i
Case of (R1)*, where R1 is an RE.
First, construct an NFA N1 from R1:
[Figure: N1 with start state s1]
Then, extend it in the following way:
[Figure: N1 extended to recognize the star of its language]
Constructing an NFA from a regular expression: An example 1.3.j
#(0 ∪ 1)* ∪ (0 ∪ 1)*#
[Figure: the NFA constructed from this expression; surviving text from the diagram: 0, 0, 1, #, #011]
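The base cases and the three combining steps can be sketched in code. Since the diagrams are not preserved here, the sketch follows the standard textbook construction and assumes that is what the figures showed: union adds a new start state with ε-arrows to s1 and s2, concatenation adds ε-arrows from the accept states of N1 to s2, and star adds a new accepting start state plus ε-arrows from the accept states of N1 back to s1. The class `NFA` and all function names are hypothetical; ε is encoded as the symbol `None`.

```python
from itertools import count

fresh = count()                      # supplies unused state names

class NFA:
    def __init__(self, start, accepts, delta):
        # delta maps (state, symbol) to a set of states; symbol None means ε
        self.start, self.accepts, self.delta = start, set(accepts), delta

    def closure(self, states):
        """All states reachable from `states` using only ε-arrows."""
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p in self.delta.get((q, None), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    def accepts_string(self, w):
        current = self.closure({self.start})
        for ch in w:
            moved = set()
            for q in current:
                moved |= self.delta.get((q, ch), set())
            current = self.closure(moved)
        return bool(current & self.accepts)

def merged(*deltas):
    """Copy several transition tables into one new table."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out

def add_eps(delta, src, dst):
    delta.setdefault((src, None), set()).add(dst)

def symbol(a):                       # case of a
    s, f = next(fresh), next(fresh)
    return NFA(s, {f}, {(s, a): {f}})

def epsilon():                       # case of ε: the start state accepts
    s = next(fresh)
    return NFA(s, {s}, {})

def empty():                         # case of ∅: no accept states
    return NFA(next(fresh), set(), {})

def union(n1, n2):
    s = next(fresh)
    delta = merged(n1.delta, n2.delta)
    add_eps(delta, s, n1.start)
    add_eps(delta, s, n2.start)
    return NFA(s, n1.accepts | n2.accepts, delta)

def concat(n1, n2):
    delta = merged(n1.delta, n2.delta)
    for f in n1.accepts:
        add_eps(delta, f, n2.start)
    return NFA(n1.start, n2.accepts, delta)

def star(n1):
    s = next(fresh)
    delta = merged(n1.delta)
    add_eps(delta, s, n1.start)
    for f in n1.accepts:
        add_eps(delta, f, n1.start)
    return NFA(s, n1.accepts | {s}, delta)

def any_bit():                       # (0 ∪ 1), built from fresh states each time
    return union(symbol("0"), symbol("1"))

# (0 ∪ 1)*101(0 ∪ 1)*
n = concat(concat(concat(concat(star(any_bit()), symbol("1")),
                         symbol("0")), symbol("1")), star(any_bit()))
print(n.accepts_string("0010100"))   # True
print(n.accepts_string("0110"))      # False
```

Each sub-NFA is built from fresh states, so the same NFA object should not be passed twice to one combining function.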
GNFA 1.3.k
[Figure: a GNFA with transitions labeled great, (great)*, grand, mother, father, grand]
Input string shown with the figure: greatgreatgreatgrandfather
About ∅-transitions 1.3.l
[Figure: a GNFA with transitions labeled great, (great)*, grand, mother, father, grand]
Adding or removing ∅-transitions does not change the recognized language.
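The same GNFA simplified 1.3.m
[Figure: GNFA with transitions labeled great, grand, mother, father]
Ripping a state out 1.3.n
[Figure: GNFA after ripping out one state; surviving label text: mother, father, grand, (great)*]
Eliminating parallel transitions 1.3.o
[Figure: GNFA with transitions labeled mother ∪ father and (great)*grand]
Again ripping out 1.3.p
[Figure: GNFA with the single transition ((great)*grand)(mother ∪ father)]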
How, exactly, to do ripping out 1.3.q1-q3
Assume we are ripping out the state r from a GNFA that has no parallel transitions. Let L be the label of the loop from r to r (if there is no loop, then L = ∅).
1. For every pair s1, s2 of states such that there is an R1-labeled transition from s1 to r and an R2-labeled transition from r to s2, add an R1 L* R2-labeled transition from s1 to s2;
2. Delete r together with all its incoming and outgoing transitions.
[Figure: before ripping, transitions labeled R and S lead into r, r has a loop labeled L, and a transition labeled T leaves r; after ripping, the new transitions are labeled RL*T and SL*T]
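The two ripping steps can be sketched on a GNFA stored as a dictionary from (source, target) pairs to labels. This representation is an assumption made for illustration, as are the function names and the state names p, q, t; ε is written as the empty string, and a missing loop contributes nothing to the new label because ∅* = {ε}.

```python
def wrap(label):
    """Parenthesize any label longer than one symbol so that
    concatenation and * apply to the whole label."""
    return label if len(label) <= 1 else f"({label})"

def rip(gnfa, r):
    """Return a copy of `gnfa` ({(src, dst): label}) with state r ripped out."""
    loop = gnfa.get((r, r))
    middle = wrap(loop) + "*" if loop else ""
    incoming = {s: lab for (s, d), lab in gnfa.items() if d == r and s != r}
    outgoing = {d: lab for (s, d), lab in gnfa.items() if s == r and d != r}
    # Step 2: delete r together with all its incoming and outgoing transitions.
    result = {k: lab for k, lab in gnfa.items() if r not in k}
    # Step 1: add an R1 L* R2 transition for every pair (s1, s2).
    for s1, r1 in incoming.items():
        for s2, r2 in outgoing.items():
            new = f"{wrap(r1)}{middle}{wrap(r2)}"
            if (s1, s2) in result:       # would be parallel: merge with ∪
                new = f"{result[(s1, s2)]} ∪ {new}"
            result[(s1, s2)] = new
    return result

# Transitions R and S enter r, L loops on r, and T leaves r.
g = {("p", "r"): "R", ("q", "r"): "S", ("r", "r"): "L", ("r", "t"): "T"}
print(rip(g, "r"))   # {('p', 't'): 'RL*T', ('q', 't'): 'SL*T'}
```

Because the dictionary holds at most one label per pair of states, a newly created transition that would be parallel to an existing one is merged with ∪ on the spot.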
How, exactly, to eliminate parallel transitions 1.3.r
Whenever you see parallel transitions labeled with R1 and R2, replace them by a single transition labeled with R1 ∪ R2.
[Figure: two parallel transitions labeled R1 and R2, replaced by one transition labeled R1 ∪ R2]
Repeat until there are no parallel transitions remaining.
From NFA to RE 1.3.s
[Figure: the example NFA; surviving transition labels: a, b, b, b]
From NFA to RE: Step t
Step 1: If there are incoming arrows to the start state, or the start state is an accept state, then add a new start state and connect it with an ε-arrow to the old start state.
[Figure: the example NFA after Step 1]
From NFA to RE: Step u
Step 2: If there is more than one accept state, or none, or there is an accept state that has outgoing arrows, then add a new accept state, make all the old accept states non-accept states, and connect each of them with an ε-arrow to the new accept state.
[Figure: the example NFA after Step 2]
From NFA to RE: Step v
Step 3: Eliminate all parallel transitions.
[Figure: the example NFA after Step 3]
From NFA to RE: Step w1-w6
Step 4: While there are internal states (states that are neither the start nor the accept state), do the following:
Step 4.1: Select an internal state and rip it out;
Step 4.2: Eliminate all parallel transitions.
[Figures: the example GNFA after each application of Steps 4.1 and 4.2. Labels that survive from the diagrams:]
w1: b, b, a, b, aa, ab
w2: b, b ∪ aa, a, b, ab
w3: b, a(b ∪ aa)*, b(b ∪ aa)*, b(b ∪ aa)*ab, a(b ∪ aa)*ab
w4: a(b ∪ aa)*, b(b ∪ aa)*, b(b ∪ aa)*ab, b ∪ a(b ∪ aa)*ab
w5: a(b ∪ aa)* and (b ∪ a(b ∪ aa)*ab) (b(b ∪ aa)*ab)* (b(b ∪ aa)*)
w6: ( (b ∪ a(b ∪ aa)*ab) (b(b ∪ aa)*ab)* (b(b ∪ aa)*) ) ∪ ( a(b ∪ aa)* )
From NFA to RE: Step x
Step 5: Return the label of the only remaining arrow (if there is no arrow, return ∅).
[Figure: the single remaining arrow, labeled ( (b ∪ a(b ∪ aa)*ab) (b(b ∪ aa)*ab)* (b(b ∪ aa)*) ) ∪ ( a(b ∪ aa)* )]
Claim: The resulting RE represents exactly the language recognized by the original NFA. This completes the proof of the theorem.
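Steps 1 to 5 can be put together in a short program and the claim tested by brute force. The sketch below makes several assumptions that are not from the slides: the function names are illustrative, Steps 1 and 2 are applied unconditionally (which is always safe), the reserved state names START and ACCEPT must not clash with the NFA's own states, and the example NFA is a hypothetical one chosen to be consistent with the labels that survive on the slides, since the original diagram is not preserved. The check compares the resulting expression, translated to Python `re` syntax, with a direct simulation of the NFA on all strings up to length 7.

```python
import re
from itertools import product

def group(label):
    """Parenthesize a label that contains ∪, the lowest-precedence operator."""
    return f"({label})" if "∪" in label else label

def ripped_label(r1, loop, r2):
    """The label R1 L* R2 with ε factors dropped (no loop: L = ∅, ∅* = ε)."""
    if loop in (None, "ε"):
        star = ""
    else:
        star = (f"({loop})" if len(loop) > 1 else loop) + "*"
    left = "" if r1 == "ε" else group(r1)
    right = "" if r2 == "ε" else group(r2)
    return (left + star + right) or "ε"

def add_arrow(g, src, dst, label):
    """Insert an arrow, merging with an existing parallel one via ∪."""
    g[(src, dst)] = f"{g[(src, dst)]} ∪ {label}" if (src, dst) in g else label

def nfa_to_re(states, arrows, start, accepts):
    """states: all NFA states, in ripping order; arrows: (src, label, dst)."""
    g = {}
    add_arrow(g, "START", start, "ε")                  # Step 1
    for f in accepts:                                  # Step 2
        add_arrow(g, f, "ACCEPT", "ε")
    for src, label, dst in arrows:                     # Step 3
        add_arrow(g, src, dst, label)
    for r in states:                                   # Step 4
        ins = [(s, lab) for (s, d), lab in g.items() if d == r and s != r]
        outs = [(d, lab) for (s, d), lab in g.items() if s == r and d != r]
        loop = g.get((r, r))
        g = {k: lab for k, lab in g.items() if r not in k}
        for (s1, r1), (s2, r2) in product(ins, outs):
            add_arrow(g, s1, s2, ripped_label(r1, loop, r2))
    return g.get(("START", "ACCEPT"), "∅")             # Step 5

def nfa_accepts(arrows, start, accepts, w):
    """Direct simulation; valid for NFAs without ε-arrows."""
    current = {start}
    for ch in w:
        current = {d for s, lab, d in arrows if s in current and lab == ch}
    return bool(current & set(accepts))

def to_python(expr):
    """Translate slide notation to Python `re` syntax."""
    return expr.replace("∪", "|").replace("ε", "").replace(" ", "")

# Hypothetical example NFA: start state s, accept state y.
arrows = [("s", "a", "y"), ("s", "b", "z"), ("y", "b", "y"), ("y", "a", "x"),
          ("x", "a", "y"), ("x", "b", "z"), ("z", "b", "y")]
expr = nfa_to_re(["x", "y", "z", "s"], arrows, "s", {"y"})
print(expr)

pattern = re.compile(to_python(expr))
words = ["".join(t) for n in range(8) for t in product("ab", repeat=n)]
ok = all(bool(pattern.fullmatch(w)) == nfa_accepts(arrows, "s", {"y"}, w)
         for w in words)
print(ok)   # True
```

The printed expression may carry more parentheses than a hand-derived one, and the order of the ∪ operands depends on the order in which states are ripped out, but the language it represents is the same.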