Published by Cynthia Perkins. Modified over 9 years ago.
1
MA/CSSE 474 Theory of Computation Regular Expressions Intro
2
Your Questions?
– Previous class days' material
– Reading Assignments
– HW5 problems
– Anything else
Still more language ambiguity!
3
Regular Languages
[Diagram: a Regular Expression describes a Regular Language; a Finite State Machine accepts a Regular Language.]
4
Regular Expressions
The regular expressions over an alphabet Σ are the strings that can be obtained as follows:
1. ∅ is a regular expression.
2. ε is a regular expression.
3. Every element of Σ is a regular expression.
4. If α, β are regular expressions, then so is αβ.
5. If α, β are regular expressions, then so is α ∪ β.
6. If α is a regular expression, then so is α*.
7. If α is a regular expression, then so is α⁺.
8. If α is a regular expression, then so is (α).
#7 is here for convenience only (syntactic sugar); many authors do not include ⁺ in the list of r.e. builders.
5
Regular Expression Examples
If Σ = {a, b}, the following are regular expressions:
a
(a ∪ b)*
(abba)⁺ (a ∪ bab)
6
Regular Expressions Define Languages
Define L, a semantic interpretation function for regular expressions (let α and β be arbitrary regular expressions over alphabet Σ).
1. L(∅) = ∅.
2. L(ε) = {ε}.
3. If c ∈ Σ, L(c) = {c}.
4. L(αβ) = L(α) L(β).
5. L(α ∪ β) = L(α) ∪ L(β).
6. L(α*) = (L(α))*.
7. L(α⁺) = L(αα*) = L(α) (L(α))*. If L(α) is equal to ∅, then L(α⁺) is also equal to ∅. Otherwise L(α⁺) is the language that is formed by concatenating together one or more strings drawn from L(α).
8. L((α)) = L(α).
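The eight clauses of L can be turned into a small evaluator. The sketch below is only an illustration: the tuple encoding of regular expressions and the function name `lang` are assumptions made here, not part of the slides. Because L(α*) is usually infinite, the function returns only the strings of length at most `max_len`.

```python
# Hypothetical encoding of regular expressions as nested tuples:
#   ("empty",)       the expression ∅
#   ("eps",)         the expression ε
#   ("sym", c)       a single symbol c of the alphabet
#   ("cat", x, y)    concatenation xy
#   ("union", x, y)  union x ∪ y
#   ("star", x)      Kleene star x*
#   ("plus", x)      x⁺
# Rule 8 (parentheses) needs no case: grouping is implicit in the nesting.

def lang(regex, max_len):
    """Return the strings of L(regex) whose length is at most max_len."""
    kind = regex[0]
    if kind == "empty":                      # rule 1: L(∅) = ∅
        return set()
    if kind == "eps":                        # rule 2: L(ε) = {ε}
        return {""}
    if kind == "sym":                        # rule 3: L(c) = {c}
        return {regex[1]} if max_len >= 1 else set()
    if kind == "cat":                        # rule 4: L(αβ) = L(α) L(β)
        left = lang(regex[1], max_len)
        right = lang(regex[2], max_len)
        return {u + v for u in left for v in right
                if len(u) + len(v) <= max_len}
    if kind == "union":                      # rule 5: L(α ∪ β) = L(α) ∪ L(β)
        return lang(regex[1], max_len) | lang(regex[2], max_len)
    if kind == "star":                       # rule 6: L(α*) = (L(α))*
        base = lang(regex[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                      # stops: only finitely many short strings
            frontier = {u + v for u in frontier for v in base
                        if len(u) + len(v) <= max_len} - result
            result |= frontier
        return result
    if kind == "plus":                       # rule 7: L(α⁺) = L(αα*)
        return lang(("cat", regex[1], ("star", regex[1])), max_len)
    raise ValueError(f"unknown regular expression node: {kind!r}")


a, b = ("sym", "a"), ("sym", "b")
print(sorted(lang(("star", ("union", a, b)), 2)))
```

Evaluating `("star", ("empty",))` gives `{""}` and `("plus", ("empty",))` gives the empty set, matching the remark in rule 7 about L(α) = ∅.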
7
The Role of the Rules
Rules 1, 3, 4, 5, and 6 give the language its power to define sets.
Rule 8 has as its only role grouping other operators.
Rules 2 and 7 appear to add functionality to the regular expression language, but they don't: ε defines the same language as ∅*, and α⁺ defines the same language as αα*.
2. ε is a regular expression.
7. If α is a regular expression, then so is α⁺.
8
Operator Precedence in Regular Expressions
           Regular Expressions     Arithmetic Expressions
Highest    Kleene star and ⁺       exponentiation
           concatenation           multiplication
Lowest     union                   addition
Example    a b* ∪ c d*             x y² + i j²
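Python's `re` module follows the same precedence order, writing union as `|`, so it can be used to probe how `a b* ∪ c d*` groups. This is a small sketch; the helper name `matches` is made up for the illustration.

```python
import re

def matches(pattern, w):
    """True if the whole string w is in the language of pattern."""
    return re.fullmatch(pattern, w) is not None

# "ab*|cd*" is read as (a(b*)) | (c(d*)).
print(matches("ab*|cd*", "abbb"))    # star applies to b only, so this is accepted
print(matches("ab*|cd*", "abab"))    # rejected: the star does not cover "ab"
print(matches("ab*|cd*", "acd"))     # rejected: union binds loosest
print(matches("a(b*|c)d*", "acd"))   # accepted once parentheses regroup the union
```

Parentheses, rule 8 of the definition, are the only way to override this order.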
9
Analyzing a Regular Expression
L((a ∪ b)*b) = L((a ∪ b)*) L(b)
             = (L((a ∪ b)))* L(b)
             = (L(a) ∪ L(b))* L(b)
             = ({a} ∪ {b})* {b}
             = {a, b}* {b}.
10
From English to reg exps
L = {w ∈ {a, b}*: |w| is even}
L = {w ∈ {0, 1}*: w is a binary representation of a multiple of 4}
L = {w ∈ {a, b}*: w contains an odd number of a's}
11
Hidden: Going the Other Way
L = {w ∈ {a, b}*: |w| is even}
    ((a ∪ b)(a ∪ b))*
    (aa ∪ ab ∪ ba ∪ bb)*
L = {w ∈ {0, 1}*: w is a binary representation of a multiple of 4}
    0 ∪ 1(0 ∪ 1)*00
L = {w ∈ {a, b}*: w contains an odd number of a's}
    b* (ab*ab*)* a b*
    b* a b* (ab*ab*)*
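One way to gain confidence in answers like these is to compare each expression with its English description on every short string. The sketch below does that with Python's `re` (union written as `|`). The helper names are invented for the illustration, and it assumes that "binary representation" means no leading zeros, which is what the expression for multiples of 4 suggests.

```python
import re
from itertools import product

def first_mismatch(pattern, predicate, alphabet, max_len=8):
    """Return the shortest string on which pattern and predicate disagree, or None."""
    compiled = re.compile(pattern)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if (compiled.fullmatch(w) is not None) != predicate(w):
                return w
    return None

def is_multiple_of_4(w):
    # Assumption: no leading zeros, so zero itself is written "0".
    if w == "0":
        return True
    return w.startswith("1") and int(w, 2) % 4 == 0

def even_length(w):
    return len(w) % 2 == 0

def odd_number_of_as(w):
    return w.count("a") % 2 == 1

CHECKS = [
    ("((a|b)(a|b))*", even_length, "ab"),
    ("(aa|ab|ba|bb)*", even_length, "ab"),
    ("0|1(0|1)*00", is_multiple_of_4, "01"),
    ("b*(ab*ab*)*ab*", odd_number_of_as, "ab"),
    ("b*ab*(ab*ab*)*", odd_number_of_as, "ab"),
]

for pattern, predicate, alphabet in CHECKS:
    print(pattern, first_mismatch(pattern, predicate, alphabet))
```

A result of `None` only says that no disagreement exists up to `max_len`; it is a sanity check, not a proof of equivalence.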
12
The Details Matter
a* ∪ b* ≠ (a ∪ b)*
(ab)* ≠ a*b*
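Short witness strings make these inequalities concrete. The following sketch lists every string up to a given length that belongs to exactly one of two languages; the function name `difference` is hypothetical, and Python's `|` stands in for ∪.

```python
import re
from itertools import product

def difference(p1, p2, alphabet="ab", max_len=2):
    """List the strings (shortest first) matched by exactly one of the two patterns."""
    r1, r2 = re.compile(p1), re.compile(p2)
    witnesses = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if (r1.fullmatch(w) is None) != (r2.fullmatch(w) is None):
                witnesses.append(w)
    return witnesses

print(difference("a*|b*", "(a|b)*"))   # strings mixing a and b
print(difference("(ab)*", "a*b*"))     # e.g. a single a is in a*b* but not in (ab)*
```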
13
More Regular Expression Examples
L((aa*) ∪ ε) =
L((a ∪ ε)*) =
L = {w ∈ {a, b}*: there is no more than one b in w}
L = {w ∈ {a, b}*: no two consecutive letters in w are the same}
14
The Details Matter
L_1 = {w ∈ {a, b}*: every a is immediately followed by a b}
A regular expression for L_1:
An FSM for L_1:
L_2 = {w ∈ {a, b}*: every a has a matching b somewhere}
A regular expression for L_2:
An FSM for L_2:
15
Kleene's Theorem
Finite state machines and regular expressions define the same class of languages. To prove this, we must show:
Theorem: Any language that can be defined by a regular expression can be accepted by some FSM and so is regular.
Theorem: Every regular language (i.e., every language that can be accepted by some DFSM) can be defined with a regular expression.
16
For Every Regular Expression There is a Corresponding FSM
We'll show this by construction. An FSM for:
∅:
A single element c of Σ:
ε:
17
Union
If α is the regular expression β ∪ γ and if both L(β) and L(γ) are regular:
18
Concatenation
If α is the regular expression βγ and if both L(β) and L(γ) are regular:
19
Kleene Star
If α is the regular expression β* and if L(β) is regular:
20
An Example
(b ∪ ab)*
An FSM for b
An FSM for a
An FSM for b
An FSM for ab:
21
An Example
(b ∪ ab)*
An FSM for (b ∪ ab):
22
An Example
(b ∪ ab)*
An FSM for (b ∪ ab)*:
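The union, concatenation, and Kleene star steps can be sketched in code as operations that glue smaller machines together with ε-transitions. The version below is one standard way to do it and is an assumption about the details; the diagrams on the slides may wire the machines slightly differently. The class and function names are made up for the sketch.

```python
from itertools import count, product

_fresh = count()  # source of state names that are never reused

class NFA:
    def __init__(self, start, accepting, edges):
        self.start = start
        self.accepting = set(accepting)
        self.edges = edges  # list of (source, label, target); label None means ε

def symbol(c):
    """Machine for a single element c of the alphabet."""
    s, t = next(_fresh), next(_fresh)
    return NFA(s, {t}, [(s, c, t)])

def union(m1, m2):
    """New start state with ε-transitions to both old start states."""
    s = next(_fresh)
    edges = m1.edges + m2.edges + [(s, None, m1.start), (s, None, m2.start)]
    return NFA(s, m1.accepting | m2.accepting, edges)

def concat(m1, m2):
    """ε-transition from every accepting state of m1 to the start of m2."""
    edges = m1.edges + m2.edges + [(q, None, m2.start) for q in m1.accepting]
    return NFA(m1.start, m2.accepting, edges)

def star(m):
    """New accepting start state; accepting states loop back to the old start."""
    s = next(_fresh)
    edges = m.edges + [(s, None, m.start)]
    edges += [(q, None, m.start) for q in m.accepting]
    return NFA(s, m.accepting | {s}, edges)

def eps_closure(m, states):
    """All states reachable from `states` using only ε-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, w):
    current = eps_closure(m, {m.start})
    for c in w:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == c}
        current = eps_closure(m, moved)
    return bool(current & m.accepting)

def language_up_to(m, alphabet, max_len):
    """Set of accepted strings of length at most max_len."""
    return {"".join(p) for n in range(max_len + 1)
            for p in product(alphabet, repeat=n) if accepts(m, "".join(p))}

# The running example (b ∪ ab)*, built bottom-up as on the slides.
machine = star(union(symbol("b"), concat(symbol("a"), symbol("b"))))
print(sorted(language_up_to(machine, "ab", 3)))
```

Each call to `symbol` creates fresh states, so the two machines for b in the example do not share states. The resulting machine is nondeterministic with ε-transitions, so `accepts` simulates it by tracking the set of states it could be in.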
23
For Every FSM There is a Corresponding Regular Expression
We'll show this by construction. The construction is different from the textbook's.
Let M = ({q_1, …, q_n}, Σ, δ, q_1, A) be a DFSM. Define R_ijk to be the set of all strings x ∈ Σ* such that (q_i, x) |-*_M (q_j, ε), and if (q_i, y) |-*_M (q_ℓ, ε) for any prefix y of x (except y = ε and y = x), then ℓ ≤ k.
That is, R_ijk is the set of all strings that take us from q_i to q_j without passing through any intermediate states numbered higher than k.
In this case, "passing through" means both entering and leaving. Note that either i or j (or both) may be greater than k.
24
Example: R_ijk
R_ijk is the set of all strings that take us from q_i to q_j without passing through any intermediate states numbered higher than k.
In this case, "passing through" means both entering and leaving. Note that either i or j (or both) may be greater than k.
R_000, R_010, R_011, R_021, R_022, R_232, R_233
25
DFA → Reg. Exp. construction
R_ijk is the set of all strings that take M from q_i to q_j without passing through any intermediate states numbered higher than k.
Examples: R_ijn is the set of all strings that take M from q_i to q_j, since no state is numbered higher than n.
Also note that L(M) is the union of R_1jn over all q_j in A.
We will show that for all i, j ∈ {1, …, n} and all k ∈ {0, …, n}, R_ijk is defined by a regular expression.
– We already know that the union of languages defined by reg. exps. is defined by a reg. exp.
26
DFA → Reg. Exp. continued
R_ijk is the set of all strings that take M from q_i to q_j without passing through any intermediate states numbered higher than k. It can be computed recursively:
Base cases (k = 0):
– If i ≠ j, R_ij0 = {a ∈ Σ : δ(q_i, a) = q_j}
– If i = j, R_ii0 = {a ∈ Σ : δ(q_i, a) = q_i} ∪ {ε}
Recursive case (k > 0): R_ijk is R_ij(k-1) ∪ R_ik(k-1) (R_kk(k-1))* R_kj(k-1)
We show by induction that each R_ijk is defined by some regular expression r_ijk.
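The recursion can be checked against the definition on a small machine. The sketch below computes the sets R_ijk, truncated to strings of length at most `max_len`, and compares them with a direct test of "no intermediate state numbered higher than k". All names are invented for the illustration, and the three-state transition table `DELTA` is an assumed example rather than something this slide specifies.

```python
from itertools import product

# Assumed example DFSM: DELTA[(i, a)] = j means δ(q_i, a) = q_j.
DELTA = {(1, "0"): 2, (1, "1"): 3,
         (2, "0"): 1, (2, "1"): 3,
         (3, "0"): 2, (3, "1"): 2}

def concat(X, Y, max_len):
    return {u + v for u in X for v in Y if len(u) + len(v) <= max_len}

def star(X, max_len):
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, X, max_len) - result
        result |= frontier
    return result

def r_sets(delta, n, max_len):
    """R[i, j, k] = strings of R_ijk with length <= max_len, via the recursion."""
    symbols = {a for (_, a) in delta}
    states = range(1, n + 1)
    R = {}
    for i in states:                      # base cases, k = 0
        for j in states:
            R[i, j, 0] = {a for a in symbols if delta[i, a] == j}
            if i == j:
                R[i, j, 0].add("")        # the empty string ε
    for k in states:                      # recursive case, k > 0
        loop = star(R[k, k, k - 1], max_len)
        for i in states:
            for j in states:
                via_k = concat(concat(R[i, k, k - 1], loop, max_len),
                               R[k, j, k - 1], max_len)
                R[i, j, k] = R[i, j, k - 1] | via_k
    return R

def in_r_directly(delta, i, j, k, w):
    """Definition of R_ijk: w leads q_i to q_j, intermediate states numbered <= k."""
    state = i
    for position, a in enumerate(w):
        state = delta[state, a]
        if position < len(w) - 1 and state > k:
            return False
    return state == j

def recursion_matches_definition(delta, n, max_len):
    R = r_sets(delta, n, max_len)
    symbols = sorted({a for (_, a) in delta})
    words = ["".join(p) for m in range(max_len + 1)
             for p in product(symbols, repeat=m)]
    return all((w in R[i, j, k]) == in_r_directly(delta, i, j, k, w)
               for i in range(1, n + 1) for j in range(1, n + 1)
               for k in range(n + 1) for w in words)

print(recursion_matches_definition(DELTA, 3, 5))
print(sorted(r_sets(DELTA, 3, 5)[1, 1, 2]))
```

Truncating by length is safe here because every piece of a string of length at most `max_len` is itself that short.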
27
DFA → Reg. Exp. Proof pt. 1
Base case definition (k = 0):
– If i ≠ j, R_ij0 = {a ∈ Σ : δ(q_i, a) = q_j}
– If i = j, R_ii0 = {a ∈ Σ : δ(q_i, a) = q_i} ∪ {ε}
Base case proof: R_ij0 is a finite set of symbols, each of which is either ε or a single symbol from Σ. So R_ij0 can be defined by the reg. exp. r_ij0 = a_1 ∪ a_2 ∪ … ∪ a_p (or a_1 ∪ a_2 ∪ … ∪ a_p ∪ ε if i = j), where {a_1, a_2, …, a_p} is the set of all symbols a such that δ(q_i, a) = q_j.
Note that if M has no direct transitions from q_i to q_j, then r_ij0 is ∅ (it is ε if i = j and there is no "loop" on that state).
28
DFA → Reg. Exp. Proof pt. 2
Recursive definition (k > 0): R_ijk is R_ij(k-1) ∪ R_ik(k-1) (R_kk(k-1))* R_kj(k-1)
Induction hypothesis: For each ℓ and m, there is a regular expression r_ℓm(k-1) such that L(r_ℓm(k-1)) = R_ℓm(k-1).
Induction step: By the recursive parts of the definition of regular expressions and the languages they define, and by the above recursive definition of R_ijk:
R_ijk = L(r_ij(k-1) ∪ r_ik(k-1) (r_kk(k-1))* r_kj(k-1))
29
DFA → Reg. Exp. Proof pt. 3
We showed by induction that each R_ijk is defined by some regular expression r_ijk.
In particular, for all q_j ∈ A, there is a regular expression r_1jn that defines R_1jn.
Then L(M) = L(r_1(j_1)n ∪ … ∪ r_1(j_p)n), where A = {q_(j_1), …, q_(j_p)}.
30
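An Example
[Figure: DFSM with start state q_1 and states q_2, q_3. Transitions, as recorded in the k = 0 column: q_1 to q_2 on 0, q_1 to q_3 on 1, q_2 to q_1 on 0, q_2 to q_3 on 1, q_3 to q_2 on 0 and on 1.]

          k = 0     k = 1      k = 2
r_11k     ε         ε          (00)*
r_12k     0         0          0(00)*
r_13k     1         1          0*1
r_21k     0         0          0(00)*
r_22k     ε         ε ∪ 00     (00)*
r_23k     1         1 ∪ 01     0*1
r_31k     ∅         ∅          (0 ∪ 1)(00)*0
r_32k     0 ∪ 1     0 ∪ 1      (0 ∪ 1)(00)*
r_33k     ε         ε          ε ∪ (0 ∪ 1)0*1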
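The table can be cross-checked mechanically. The sketch below builds expressions r_ijk by following the recursion literally, with only trivial simplifications, and then confirms that the k = 2 results define the same strings as the simplified table entries, at least up to length 6. It writes the expressions in Python `re` syntax (`|` for ∪, the empty pattern for ε, `None` for ∅). The helper names and the hand-transcribed `TABLE_K2` are assumptions of this illustration.

```python
import re
from itertools import product

# Transitions read off the k = 0 column: DELTA[(i, a)] = j means δ(q_i, a) = q_j.
DELTA = {(1, "0"): 2, (1, "1"): 3,
         (2, "0"): 1, (2, "1"): 3,
         (3, "0"): 2, (3, "1"): 2}

def union(x, y):
    if x is None:                 # ∅ ∪ y = y
        return y
    if y is None:
        return x
    return x if x == y else f"(?:{x}|{y})"

def cat(x, y):
    if x is None or y is None:    # concatenating with ∅ gives ∅
        return None
    return x + y

def star(x):
    # ∅* and ε* both define {ε}, written here as the empty pattern.
    return "" if not x else f"(?:{x})*"

def build(delta, n):
    """Return r[i, j, k] for all i, j in 1..n and k in 0..n."""
    symbols = sorted({a for (_, a) in delta})
    states = range(1, n + 1)
    r = {}
    for i in states:
        for j in states:
            expr = "" if i == j else None
            for a in symbols:
                if delta[i, a] == j:
                    expr = union(expr, a)
            r[i, j, 0] = expr
    for k in states:
        for i in states:
            for j in states:
                via_k = cat(cat(r[i, k, k - 1], star(r[k, k, k - 1])),
                            r[k, j, k - 1])
                r[i, j, k] = union(r[i, j, k - 1], via_k)
    return r

def language(pattern, max_len=6):
    """Strings over {0, 1} of length <= max_len that the pattern defines."""
    if pattern is None:
        return set()
    compiled = re.compile(pattern)
    return {"".join(p) for m in range(max_len + 1)
            for p in product("01", repeat=m) if compiled.fullmatch("".join(p))}

# The k = 2 column of the table; ε ∪ x is written as (x)? here.
TABLE_K2 = {
    (1, 1): "(00)*",       (1, 2): "0(00)*",     (1, 3): "0*1",
    (2, 1): "0(00)*",      (2, 2): "(00)*",      (2, 3): "0*1",
    (3, 1): "(0|1)(00)*0", (3, 2): "(0|1)(00)*", (3, 3): "((0|1)0*1)?",
}

r = build(DELTA, 3)
for (i, j), simplified in TABLE_K2.items():
    print(i, j, language(r[i, j, 2]) == language(simplified))
```

The generated expressions are much longer than the table entries because no algebraic simplification is applied; agreement on short strings is evidence of equivalence, not a proof.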