Chapter 3: Regular Languages
3.1: Regular Expressions (1)
Regular Expression (RE): E is a regular expression over Σ if E is one of:
  ε
  a, where a ∈ Σ
If r and s are regular expressions (REs), then the following expressions are also regular:
  r | s (r or s)
  rs (r followed by s)
  r* (r repeated zero or more times)
Each RE denotes a corresponding regular language (RL).
3.1: Regular Expressions (2)
Regular Language (RL): L is a regular language over Σ if L is one of:
  ∅, the empty set { }
  {ε}, the set that contains only the empty string
  {a}, where a ∈ Σ
If R and S are regular languages (RLs), then the following languages are also regular:
  R ∪ S = {w | w ∈ R or w ∈ S}
  RS = {rs | r ∈ R and s ∈ S}
  R* = R^0 ∪ R^1 ∪ R^2 ∪ R^3 ∪ …
3.1: Regular Expressions (3)
Rules for Specifying Regular Expressions:
1. ε is a regular expression; L = {ε}
2. If a is in Σ, a is a regular expression; L = {a}, the set containing the string a
3. Let r and s be regular expressions with languages L(r) and L(s). Then
   a. r | s is an RE denoting L(r) ∪ L(s)
   b. rs is an RE denoting L(r)L(s)
   c. r* is an RE denoting (L(r))*
   d. (r) is an RE denoting L(r); extra parentheses do not change the language
Precedence, highest first: *, concatenation, |
3.1: Regular Expressions (4)
Examples:
  {0, 1}{00, 11} = {000, 011, 100, 111}
  {0}* = {ε, 0, 00, 000, 0000, …}
  {10, 01}* = {ε, 10, 1010, …, 01, 0101, …, 1001, …, 0110, …}
Notational shorthand:
  L^0 = {ε}
  L^i = L L^(i-1)
  L^+ = L L*
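The language operations on this slide can be tried out on small finite sets. The following is a minimal Python sketch, not part of the original slides; the function names `concat`, `power` and `star_up_to` are illustrative, and because L* is infinite in general, `star_up_to` only builds the finite approximation L^0 ∪ … ∪ L^n.

```python
def concat(R, S):
    """RS = {rs | r in R and s in S}."""
    return {r + s for r in R for s in S}

def power(L, i):
    """L^0 = {""} (the set containing only the empty string), L^i = L L^(i-1)."""
    result = {""}
    for _ in range(i):
        result = concat(L, result)
    return result

def star_up_to(L, n):
    """Finite approximation of L*: the union of L^0, L^1, ..., L^n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

# The empty string "" plays the role of ε.
print(sorted(concat({"0", "1"}, {"00", "11"})))
print(sorted(star_up_to({"0"}, 3), key=len))
print(sorted(power({"10", "01"}, 2)))
```

Representing languages as Python sets of strings keeps the code close to the set-builder definitions; it only works for finite languages, which is why the star is truncated.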
3.1: Regular Expressions (5)
Let L be the language over {a, b} in which each string contains the substring bb:
  L = {a, b}*{bb}{a, b}*
L is a regular language (RL). Why?
  {a} and {b} are RLs
  {a, b} = {a} ∪ {b} is an RL
  {a, b}* is an RL
  {b}{b} = {bb} is also an RL
  Then L = {a, b}*{bb}{a, b}* is an RL
3.1: Regular Expressions (6)
Let L be the language over {a, b} in which each string begins and ends with an a and contains at least one b:
  L = {a}{a, b}*{b}{a, b}*{a}
L is a regular language (RL). Why?
  {a} and {b} are RLs
  {a, b} = {a} ∪ {b} is an RL
  {a, b}* is an RL
  Then L = {a}{a, b}*{b}{a, b}*{a} is an RL
3.1: Regular Expressions (7)
L = {a, b}*{bb}{a, b}*           RE = (a|b)*bb(a|b)*
L = {a}{a, b}*{b}{a, b}*{a}      RE = a(a|b)*b(a|b)*a
The RE (a)|((b)*(c)) is equivalent to a|b*c.
REs r and s are equivalent (r = s) iff r and s represent the same language.
Example: r = a|b, s = b|a. Then r = s. Why? Because L(r) = L(s) = {a, b}.
3.1: Regular Expressions (8)
Let Σ = {a, b}
  RE a|b           L = {a, b}
  RE (a|b)(a|b)    L = {aa, ab, ba, bb}
  RE aa|ab|ba|bb   same as above
  RE a*            L = {ε, a, aa, aaa, …}
  RE (a|b)*        L = set of all strings of a's and b's, including ε
  RE (a*b*)*       same as above
  RE a|a*b         L = {a, b, ab, aab, aaab, …}
3.1: Regular Expressions (9)
Algebraic Properties of Regular Expressions (axioms):
  r | s = s | r
  r | (s | t) = (r | s) | t
  (r s) t = r (s t)
  εr = rε = r
  r* = (r*)* = (r | ε)+ = r+ | ε
  r (s | t) = r s | r t
  (s | t) r = s r | t r
  r** = r*
  r+ = r r*
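Identities like these can be spot-checked mechanically by comparing the two sides on all strings up to a fixed length. The sketch below is an illustration rather than a proof: it assumes Python's `re` module as the matcher (with `re.fullmatch` so the whole string must match), writes ε as an empty alternative or empty group, and uses arbitrary sample sub-expressions for r, s and t. The helper names `language` and `equivalent_up_to` are hypothetical.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """All strings over `alphabet` of length <= max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    return {
        w
        for n in range(max_len + 1)
        for w in map("".join, product(alphabet, repeat=n))
        if rx.fullmatch(w)
    }

def equivalent_up_to(p, q, **kw):
    """True if p and q accept the same strings up to the length bound (evidence, not proof)."""
    return language(p, **kw) == language(q, **kw)

# Arbitrary sample sub-expressions, each parenthesised so it can be substituted safely.
r, s, t = "(ab)", "(b|a)", "(ba*)"

identities = [
    (f"{r}|{s}", f"{s}|{r}"),                # r | s = s | r
    (f"{r}|({s}|{t})", f"({r}|{s})|{t}"),    # r | (s | t) = (r | s) | t
    (f"(){r}", r),                           # εr = r, with () standing for ε
    (f"{r}()", r),                           # rε = r
    (f"{r}*", f"({r}*)*"),                   # r* = (r*)*  (Python rejects a literal r**)
    (f"{r}*", f"({r}|)+"),                   # r* = (r | ε)+
    (f"{r}*", f"{r}+|"),                     # r* = r+ | ε
    (f"{r}({s}|{t})", f"{r}{s}|{r}{t}"),     # r (s | t) = r s | r t
    (f"({s}|{t}){r}", f"{s}{r}|{t}{r}"),     # (s | t) r = s r | t r
    (f"{r}+", f"{r}{r}*"),                   # r+ = r r*
]

for left, right in identities:
    print(left, "=", right, equivalent_up_to(left, right))
```

A bounded comparison can refute a claimed identity with a counterexample but can only support, never establish, a true one.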
3.1: Regular Expressions (10)
  abc          concatenation ("followed by")
  a | b | c    alternation ("or")
  *            zero or more occurrences
  +            one or more occurrences
3.1: Regular Expressions (11)
  All strings of 1s and 0s: (0 | 1)*
  All strings of 1s and 0s beginning with a 1: 1(0 | 1)*
3.1: Regular Expressions (12)
  All strings containing two or more 0s: (1|0)*0(1|0)*0(1|0)*
  All strings containing an even number of 0s: (1*01*01*)* | 1*
3.1: Regular Expressions (13)
  All strings containing an even number of 0s and an even number of 1s. Let X stand for (00 | 11):
    X* | (X* (01 | 10) X* (01 | 10) X*)*
    OR
    (00 | 11)* ((01 | 10)(00 | 11)*(01 | 10)(00 | 11)*)*
  All strings of alternating 0s and 1s: (ε | 1)(01)*(ε | 0)
3.1: Regular Expressions (14)
  Strings over the alphabet {0, 1} with no consecutive 0's:
    (1 | 01)*(0 | ε)
    1*(01+)*(0 | ε)
    1*(011*)*(0 | ε)
  Strings over the alphabet {a, b} with exactly three b's: a*ba*ba*ba*
  Strings over the alphabet {a, b, c} containing (at least once) bc: (a|b|c)*bc(a|b|c)*
3.1: Regular Expressions (15)
  Strings over the alphabet {a, b} in which substrings ab and ba occur an unequal number of times: (a+b+)+ | (b+a+)+
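Expressions such as the ones on these slides are easy to get subtly wrong, so it can help to compare each one against a direct statement of the property on every short string. The following Python sketch does that, assuming Python's `re` syntax (ε written as an empty alternative) and a length bound of 8; the helper names `words` and `agrees` are illustrative, and agreement up to the bound is evidence rather than proof.

```python
import re
from itertools import product

def words(alphabet, max_len):
    """Every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

def agrees(pattern, predicate, alphabet, max_len=8):
    """True if the pattern accepts exactly the strings satisfying the predicate (up to max_len)."""
    rx = re.compile(pattern)
    return all(bool(rx.fullmatch(w)) == predicate(w) for w in words(alphabet, max_len))

X, Y = "(00|11)", "(01|10)"
even0 = lambda w: w.count("0") % 2 == 0
even_both = lambda w: w.count("0") % 2 == 0 and w.count("1") % 2 == 0
no_00 = lambda w: "00" not in w

checks = [
    ("(1*01*01*)*|1*", even0, "01"),
    (f"{X}*|({X}*{Y}{X}*{Y}{X}*)*", even_both, "01"),
    (f"{X}*({Y}{X}*{Y}{X}*)*", even_both, "01"),
    ("(|1)(01)*(|0)", lambda w: "00" not in w and "11" not in w, "01"),
    ("(1|01)*(0|)", no_00, "01"),
    ("1*(01+)*(0|)", no_00, "01"),
    ("1*(011*)*(0|)", no_00, "01"),
    ("a*ba*ba*ba*", lambda w: w.count("b") == 3, "ab"),
    ("(a+b+)+|(b+a+)+", lambda w: w.count("ab") != w.count("ba"), "ab"),
]

for pattern, predicate, alphabet in checks:
    print(pattern, agrees(pattern, predicate, alphabet))
```

The last check also shows why the expression works: the counts of ab and ba differ exactly when the string's first and last symbols differ.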
3.1: Regular Expressions (16)
Describe the following in English:
  (0|1)*         all strings over {0, 1}
  b*ab*ab*ab*    all strings over {a, b} with exactly 3 a's
3.1: Regular Expressions (17)
Examples of RE:
  01* = {0, 01, 011, 0111, …}
  (01*)(01) = {001, 0101, 01101, …}
  (0 | 1)* = {ε, 0, 1, 00, 01, 10, 11, …}, i.e., all strings of 0s and 1s
  (0 | 1)*00(0 | 1)* = {00, 1001, …}, i.e., all strings of 0s and 1s containing a "00"
3.1: Regular Expressions (18)
More Examples of RE:
  (1 | 10)*     ε and all strings starting with "1" and containing no "00"
  (0 | 1)*011   all strings ending with "011"
  0*1*          all strings with no "0" after a "1"
  00*11*        all strings with at least one "0" and one "1", and no "0" after a "1"
3.1: Regular Expressions (19)
What language does the following RE represent?
  ((0 | 1)(0 | 1))* | ((0 | 1)(0 | 1)(0 | 1))*
3.1: Regular Expressions (20)
Home Study: Construct an RE over Σ = {0, 1} for each of the following:
  The set of all strings with no two consecutive "0"s
  The set of all strings in which no prefix has two or more "0"s more than "1"s, nor two or more "1"s more than "0"s
  The set of all strings ending with "00"
  The set of all strings with 3 consecutive 0's
  The set of all strings beginning with "1" which, when interpreted as a binary number, are divisible by 5
  The set of all strings with a "1" at the 5th position from the right
  The set of all strings not containing 101 as a substring
Connection Between RE & RL (1)
A language L is called regular if and only if there exists some DFA M such that L = L(M).
Since every DFA has an equivalent NFA (and vice versa), a language L is regular if and only if there exists some NFA N such that L = L(N).
If we have an RE r, we can construct an NFA that accepts L(r).
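Connection Between RE & RL (2)
0. For ∅ in the regular expression, construct an NFA with a start state and no transition into a final state: L = { } = ∅
1. For ε in the regular expression, construct an NFA whose start state reaches the final state on an ε-move: L = {ε}
2. For a ∈ Σ in the regular expression, construct an NFA whose start state reaches the final state on a: L = {a}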
Connection Between RE & RL (3)
3.(a) If s and t are regular expressions and M_s and M_t are their NFAs, then s|t has the NFA:
[Figure: new start state i and new final state f around M_s and M_t]
where i and f are new start / final states, and ε-moves are introduced from i to the old start states of M_s and M_t as well as from all of their final states to f.
L = L(M_s) ∪ L(M_t)
Connection Between RE & RL (4)
3.(b) If s and t are regular expressions and M_s and M_t are their NFAs, then st (concatenation) has the NFA:
[Figure: M_s followed by M_t, from start state i to final state f]
where i is the start state of M_s (or new under the alternative) and f is the final state of M_t (or new). The overlap maps the final states of M_s to the start state of M_t.
L = L(M_s)L(M_t)
Connection Between RE & RL (5)
3.(c) If s is a regular expression and M_s its NFA, then s* (Kleene star) has the NFA:
[Figure: new start state i and new final state f around M_s]
where i is a new start state and f is a new final state:
  ε-move from i to f (to accept the null string)
  ε-moves from i to the old start state and from the old final state(s) to f
  ε-move from the old final state to the old start state (WHY?)
L = L(M_s)*
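The base cases and rules 3(a) to 3(c) translate almost directly into code. The following is a minimal Python sketch under stated assumptions, not the slides' own implementation: an NFA is a start state, a single final state and a list of (source, label, destination) arcs; the label `None` stands for an ε-move; and concatenation links M_s to M_t with an ε-move instead of overlapping the two states. All names (`NFA`, `symbol`, `union`, `concat`, `star`, `accepts`) are illustrative.

```python
from itertools import count

_ids = count()   # source of fresh state numbers
EPS = None       # label used for ε-moves

class NFA:
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

def symbol(a):
    """Rule 2: start --a--> final."""
    i, f = next(_ids), next(_ids)
    return NFA(i, f, [(i, a, f)])

def union(s, t):
    """Rule 3(a): new i and f, ε-moves from i to both starts and from both finals to f."""
    i, f = next(_ids), next(_ids)
    extra = [(i, EPS, s.start), (i, EPS, t.start), (s.final, EPS, f), (t.final, EPS, f)]
    return NFA(i, f, s.edges + t.edges + extra)

def concat(s, t):
    """Rule 3(b), using an ε-move from the final state of M_s to the start of M_t."""
    return NFA(s.start, t.final, s.edges + t.edges + [(s.final, EPS, t.start)])

def star(s):
    """Rule 3(c): i to f, i to old start, old final to f, and old final back to old start."""
    i, f = next(_ids), next(_ids)
    extra = [(i, EPS, f), (i, EPS, s.start), (s.final, EPS, f), (s.final, EPS, s.start)]
    return NFA(i, f, s.edges + extra)

def closure(nfa, states):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of currently possible states."""
    current = closure(nfa, {nfa.start})
    for ch in w:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = closure(nfa, moved)
    return nfa.final in current

# (a|b)*ba, built bottom-up; every sub-NFA is created fresh so no states are shared.
m = concat(concat(star(union(symbol("a"), symbol("b"))), symbol("b")), symbol("a"))
for w in ["ba", "abba", "ab", ""]:
    print(repr(w), accepts(m, w))
```

The ε-move from the old final state back to the old start state in `star` is what lets the automaton read more than one copy of a string from L(M_s); without it, only zero or one repetition would be accepted.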
Connection Between RE & RL (6)-(8)
Build an NFA-ε that accepts (a|b)*ba
[Figures: the NFAs for a, b, a|b and (a|b)*, then the complete NFA-ε for (a|b)*ba, built up over three slides]
Connection Between RE & RL (9)-(12)
Decomposition for the regular expression (ab*c) | (a(b|c*)):
  r0 = b
  r1 = r0* = b*
  r2 = c
  r3 = a
  r4 = r1 r2 = b*c
  r5 = r3 r4 = ab*c
  r6 = c
  r7 = b
  r8 = r6* = c*
  r9 = r7 | r8 = b|c*
  r10 = (r9)
  r11 = a
  r12 = r11 r10 = a(b|c*)
  r13 = r5 | r12
What is the NFA? Let's construct it!
[Figures: the NFAs for r0 through r13, built bottom-up from the decomposition]
Connection Between RE & RL (13)-(14)
Let's try a(b | c)*
  1. a, b, and c
  2. b | c
  3. (b | c)*
  4. a(b | c)*
[Figures: the NFA at each step, using states S0 through S9; the last slide also shows a smaller two-state automaton (S0, S1) with arcs labeled a and b | c]
Connection Between RE & RL (15)-(16)
Let there be 3 patterns: a, abb, a*b+
[Figure: one NFA per pattern]
NFA for a | abb | a*b+
[Figure: the three NFAs combined into a single NFA with start state 0]
Regular Expression to NFA-ε: (a | ba)*a
Parse the expression top-down into a tree:
  First parsing step: the root is the concatenation of (a|ba)* and a
  Second parsing step: (a|ba)* becomes * applied to a|ba
  Third parsing step: a|ba becomes | applied to a and ba
  Fourth parsing step: ba becomes the concatenation of b and a
Convert the tree bottom-up into an NFA-ε:
  Identify the leaf nodes (the symbols a, b, a, a) and convert each to a single-transition NFA
  Identify a convertible node (one whose children have all been converted), convert it, and repeat
  The nodes are converted in the order: the concatenation ba, then |, then *, then the root concatenation (the final node)
[Figures: the parse tree and the partial NFAs after each step]
Expression Graphs (1)
NFA to RE
  If L is accepted by some NFA-ε, then L is represented by some regular expression
  An expression graph is like a state diagram, but it can have regular expressions as labels on arcs
  An NFA-ε is an expression graph
  An expression graph can be reduced to one with just two states
  If we reduce an NFA-ε in this way, the arc label then corresponds to the regular expression representing it
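Expression Graphs (2)
State elimination: if state i has an incoming arc w_ji from state j, a self-loop w_ii, and an outgoing arc w_ik to state k, remove i and add an arc from j to k labeled w_ji (w_ii)* w_ik (just w_ji w_ik when i has no self-loop).
Final forms:
  One state (start = final) with self-loop w: RE = w*
  Two states, with loop w1 on the start state, arc w2 from start to final, loop w3 on the final state, and arc w4 from final back to start: RE = (w1)* w2 (w3 | w4 (w1)* w2)*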
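Expression Graphs (3)
Merge edges: parallel arcs labeled a, b, and c between the same pair of states are replaced by one arc labeled a | b | c.
Replace state by edges: if q0 reaches p on a, p has a self-loop c, and p reaches q1 on b, then p is removed and q0 reaches q1 on ac*b.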
Expression Graphs (4)
Let G be the state diagram of a finite automaton
Let m be the number of final states of G
Make m copies of G, each of which has one final state. Call these graphs G1, G2, …, Gm
For each Gt
  Repeat
    Do the steps in the previous slide
  Until the only states in Gt are the start state and the single final state
  Determine the RE of Gt
The RE of G is obtained by joining the REs of each Gt by "or" (|)
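The procedure above can be sketched in Python as follows. This is one possible rendering under stated assumptions, not a definitive implementation: the expression graph is a dictionary mapping (source, destination) pairs to arc labels written in Python `re` syntax, parallel arcs are merged with `|` as they arise, and the result is over-parenthesised rather than simplified. The function names and the sample graph `G` are hypothetical; `G` was chosen so that its language is b* | b*ccb*.

```python
import re
from itertools import product

def alt(r, s):
    """Merge two parallel arc labels into r|s (None means 'no arc')."""
    if r is None:
        return s
    if s is None:
        return r
    return f"{r}|{s}"

def star(w):
    """(w)* for a loop label, or the empty string when there is no loop."""
    return f"({w})*" if w is not None else ""

def eliminate(g, k):
    """Remove state k; each path j -> k -> m becomes an arc w_jk (w_kk)* w_km."""
    loop = star(g.get((k, k)))
    ins = [(j, w) for (j, q), w in g.items() if q == k and j != k]
    outs = [(m, w) for (p, m), w in g.items() if p == k and m != k]
    new = {arc: w for arc, w in g.items() if k not in arc}
    for j, w_in in ins:
        for m, w_out in outs:
            new[(j, m)] = alt(new.get((j, m)), f"({w_in}){loop}({w_out})")
    return new

def regex_for(g, states, start, final):
    """RE of the copy of the graph whose only final state is `final`."""
    for k in states:
        if k not in (start, final):
            g = eliminate(g, k)
    w1 = g.get((start, start))
    if start == final:
        return star(w1) or "()"          # () matches only the empty string
    w2, w3, w4 = g.get((start, final)), g.get((final, final)), g.get((final, start))
    if w2 is None:
        return None                      # final state unreachable: empty language
    back = f"({w4}){star(w1)}({w2})" if w4 is not None else None
    return f"{star(w1)}({w2}){star(alt(w3, back))}"   # (w1)* w2 (w3 | w4 (w1)* w2)*

def to_regex(g, states, start, finals):
    """Join the REs of the single-final-state copies with |."""
    parts = [regex_for(g, states, start, f) for f in finals]
    parts = [p for p in parts if p is not None]
    return "|".join(f"({p})" for p in parts) if parts else None

def same_language(p, q, alphabet="bc", max_len=7):
    """Bounded comparison of two patterns on all short strings (evidence, not proof)."""
    ws = ("".join(t) for n in range(max_len + 1) for t in product(alphabet, repeat=n))
    return all(bool(re.fullmatch(p, w)) == bool(re.fullmatch(q, w)) for w in ws)

# Assumed sample graph: start state 1, final states 1 and 3.
G = {(1, 1): "b", (1, 2): "c", (2, 3): "c", (3, 3): "b"}
result = to_regex(G, [1, 2, 3], start=1, finals=[1, 3])
print(result, same_language(result, "b*|b*ccb*"))
```

Wrapping every label in parentheses before concatenating or starring it avoids precedence mistakes when a label already contains `|`, at the cost of a less readable result.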
Expression Graphs (5)-(7)
Example: G has states 1, 2 and 3, with arcs labeled b and c; G1 and G2 are its two single-final-state copies.
[Figures: G, G1, G2 and their step-by-step reductions]
G1 reduces to b*
G2 reduces to b*ccb*
RE for G: b* | b*ccb*