Published by Debra Wade. Modified over 9 years ago.
1
Regular Expressions 15-211 Fundamental Data Structures and Algorithms Peter Lee March 13, 2003
3
Announcements
Homework #4 is due on Monday, March 17, at 11:59pm!
Reading: Handout (from last time)
4
Recap: FSMs
5
Finite State Machines (FSMs)
Input string → M → {Yes, No}
M = (Σ, S, q0, F, δ), where Σ is the input alphabet, S is the state set, q0 is the initial state, F is the set of final states, and δ is the transition function.
6
Transition functions
A deterministic finite automaton (DFA) has a transition function δ: S × Σ → S.
Can extend δ: S × Σ → S to δ′: S × Σ* → S. Inductively:
δ′(q, ε) = q
δ′(q, aw) = δ′(δ(q, a), w)
7
DFA example
Which strings of a's and b's are accepted?
Transition function: { δ(q0,a) → q1, δ(q0,b) → q0, δ(q1,a) → q2, δ(q1,b) → q1, δ(q2,a) → q2, δ(q2,b) → q2 }
[Figure: state diagram with states 0, 1, 2 and edges labeled a and b]
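The extended transition function can be tried out on this machine with a short Python sketch. The transition table is copied from the slide; the final-state set {q2} is an assumption, because the transcript does not preserve which state was marked as final.

```python
# Transition table from the slide: (state, symbol) -> next state.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
START = "q0"
FINAL = {"q2"}  # assumption: the final state is not recoverable from the transcript


def delta_ext(q, w):
    """Extended transition function: d'(q, eps) = q, d'(q, aw) = d'(d(q, a), w)."""
    for symbol in w:  # iterative form of the inductive definition
        q = DELTA[(q, symbol)]
    return q


def accepts(w):
    """True if the machine ends in a final state after reading w."""
    return delta_ext(START, w) in FINAL
```

Under that assumption the machine accepts exactly the strings that contain at least two a's; with a different final state the answer changes.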
8
Nondeterministic FSMs (NFAs)
NFAs can transition to more than one state on any input: δ: S × Σ → P(S)
As before, can extend: δ′: S × Σ* → P(S). Inductively:
δ′(q, ε) = {q}
δ′(q, aw) = ⋃_{p ∈ δ(q, a)} δ′(p, w)
9
NFA example
Transition function: { δ(q0,a) → {q0,q1}, δ(q0,b) → {q1}, δ(q1,a) → ∅, δ(q1,b) → {q0,q1} }
[Figure: state diagram with states 0 and 1 and edges labeled a and b]
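The inductive definition of the extended NFA transition function translates almost literally into a recursive Python sketch. The table is the one on the slide; the function name is illustrative. The direct recursion can revisit the same state many times, so it is meant to mirror the definition, not to be efficient.

```python
# Transition function from the slide: (state, symbol) -> set of next states.
DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q1"},
    ("q1", "a"): set(),
    ("q1", "b"): {"q0", "q1"},
}


def delta_ext(q, w):
    """d'(q, eps) = {q}; d'(q, aw) = union of d'(p, w) over all p in d(q, a)."""
    if w == "":
        return {q}
    a, rest = w[0], w[1:]
    result = set()
    for p in DELTA[(q, a)]:
        result |= delta_ext(p, rest)
    return result
```

For instance, reading "ba" from q0 leads nowhere (the empty set), because the only state reachable on b is q1 and q1 has no move on a.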
10
Questions
1. Are there languages L that can be accepted by NFAs but not DFAs? No! Today: the proof.
2. What practical uses are there for FSMs? After the proof…
11
The Idea An NFA can be in more than one state at a time Define a DFA whose states correspond with all combinations of the NFA states
12
Another handy extension
Extend δ: S × Σ → P(S) to δ′: S × Σ* → P(S) to δ′′: P(S) × Σ* → P(S)
δ′′({q1, …, qn}, w) = ⋃_{1 ≤ i ≤ n} δ′(qi, w)
13
NFA into a DFA example
NFA: [Figure: the NFA from the earlier example, with states 0 and 1 and edges labeled a and b]
In the DFA, construct these states: S = {[], [q0], [q1], [q0,q1]}
Each state in the DFA represents a set of states in the NFA.
14
NFA into a DFA example
DFA: S = {[], [q0], [q1], [q0,q1]}
What is δ for the DFA?
δ([],a) = [] and δ([],b) = []
δ([q0],a) = [q0,q1] and δ([q0],b) = [q1]
δ([q1],a) = [] and δ([q1],b) = [q0,q1]
δ([q0,q1],a) = [q0,q1] and δ([q0,q1],b) = [q0,q1]
[Figure: state diagram of the resulting DFA]
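The table above can be reproduced mechanically. The following Python sketch builds the DFA transition function over all subsets of the NFA states; frozensets stand in for the bracketed state names, and the function and variable names are illustrative choices.

```python
from itertools import combinations

# NFA transition function from the earlier example slide.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q1"},
    ("q1", "a"): set(),
    ("q1", "b"): {"q0", "q1"},
}
STATES = ["q0", "q1"]
ALPHABET = ["a", "b"]


def subset_construction(states, alphabet, delta):
    """Return the DFA table mapping (set of NFA states, symbol) -> set of NFA states."""
    subsets = [frozenset(c)
               for r in range(len(states) + 1)
               for c in combinations(states, r)]
    dfa = {}
    for subset in subsets:
        for symbol in alphabet:
            target = set()
            for q in subset:  # union of the NFA moves of every member state
                target |= delta[(q, symbol)]
            dfa[(subset, symbol)] = frozenset(target)
    return dfa


DFA_DELTA = subset_construction(STATES, ALPHABET, NFA_DELTA)
```

This enumerates all 2^n subsets, as on the slide; a practical version would typically generate only the subsets reachable from [q0].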
15
The theorem Thm: Let L be a language accepted by an NFA. Then there exists a DFA that also accepts L. Proof: Let’s use the construction shown on the previous slides. We must prove that the DFA accepts the same language as the NFA.
16
The proof
More formally: Let M = (Σ, S, q0, F, δ) be the NFA, and M′ = (Σ, S′, q0′, F′, δ′) be the DFA.
We want to prove that, given any input string w, δ′(q0′, w) = [qi, qj, …, qk] iff δ(q0, w) = {qi, qj, …, qk}.
17
By induction (of course!) Base case: Trivial for the empty input string. Induction hypothesis: Assume true for all input strings of length n or less.
18
By induction…
Let wa be a string of length n+1. Then δ′(q0′, wa) = δ′(δ′(q0′, w), a).
By the IH, δ′(q0′, w) = [qi, qj, …, qk] iff δ(q0, w) = {qi, qj, …, qk}.
And by definition of δ′, δ′([qi, qj, …, qk], a) = [qa, qb, …, qc] iff δ({qi, qj, …, qk}, a) = {qa, qb, …, qc}.
Thus, δ′(q0′, wa) = [qa, qb, …, qc] iff δ(q0, wa) = {qa, qb, …, qc}.
19
Regular Languages
20
Regular languages
The language accepted by M: L(M) = {w | δ′(q0, w) ∈ F}
Can also say: the language recognized by M, or the language decided by M.
When M is an FSM, we say that the language is regular.
21
Another question
Is the complement of a regular language also regular? L′ = Σ* − L
Hint 1: Is there a way to construct a complement machine?
Hint 2: Consider the final states…
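One way to follow the two hints is to keep a DFA's transitions and swap its final and non-final states. The sketch below does this for the DFA from the earlier example slide; the final-state set {q2} is again an assumption, since the transcript does not show which state was final.

```python
# DFA transition table from the earlier example slide.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
STATES = {"q0", "q1", "q2"}
FINAL = {"q2"}             # assumption: not recoverable from the transcript
CO_FINAL = STATES - FINAL  # final states of the complement machine


def accepts(w, final):
    """Run the DFA from q0 and test membership of the end state in `final`."""
    q = "q0"
    for symbol in w:
        q = DELTA[(q, symbol)]
    return q in final
```

The swap relies on the machine being deterministic with a transition defined for every state and symbol, so each string ends in exactly one state. An NFA would first have to be converted to a DFA.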
22
Closure properties What about union? Intersection? Product?
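For union and intersection, one standard approach is to run two DFAs side by side on pairs of states. The sketch below illustrates it on two hypothetical machines that are not from the lecture (an even number of a's, and strings ending in b). It does not cover the product (concatenation) case.

```python
from itertools import product

# Two hypothetical DFAs over {a, b}: (transition table, start state, final states).
EVEN_A = ({("e", "a"): "o", ("e", "b"): "e",
           ("o", "a"): "e", ("o", "b"): "o"}, "e", {"e"})
ENDS_B = ({("n", "a"): "n", ("n", "b"): "y",
           ("y", "a"): "n", ("y", "b"): "y"}, "n", {"y"})


def pair_dfa(m1, m2, combine):
    """Build a DFA on state pairs; `combine` decides which pairs are final."""
    d1, s1, f1 = m1
    d2, s2, f2 = m2
    states1 = {q for q, _ in d1}
    states2 = {q for q, _ in d2}
    alphabet = {a for _, a in d1}
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for p, q in product(states1, states2) for a in alphabet}
    final = {(p, q) for p, q in product(states1, states2)
             if combine(p in f1, q in f2)}
    return delta, (s1, s2), final


def accepts(machine, w):
    """Run a DFA given as (transition table, start state, final states)."""
    delta, q, final = machine
    for a in w:
        q = delta[(q, a)]
    return q in final


INTERSECTION = pair_dfa(EVEN_A, ENDS_B, lambda x, y: x and y)
UNION = pair_dfa(EVEN_A, ENDS_B, lambda x, y: x or y)
```

Only the choice of final pairs differs between the two results; the transitions are identical.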
23
A Digression
24
Cheating vs Collaboration
25
A scenario Alice and Bob are excellent students. There is virtually no doubt that they can easily do “A” work in 15-211. But even so, 15-211 is a lot of work. And the time required might be better spent in another course, which is harder, and possibly more important.
26
A scenario, cont’d So, to save time, Alice and Bob decide to work together on the 15-211 homeworks. They work together and hand in essentially the same programs. Alice writes a comment into her version of the code, explaining that she has collaborated with Bob. Bob does not do this.
27
A scenario, cont’d Did Alice cheat? What about Bob?
28
A second scenario Bob works very hard on his 15-211 assignment. He gets everything working and hands it in 3 days early. He then discusses his solution with Alice. After discussing with Alice, Bob realizes that his solution is O(n²), whereas the best solutions are O(n log n).
29
Second scenario, cont’d Bob uses this new knowledge and rewrites his assignment so that it runs in O(n log n) time, and hands it in. Later, after further discussion with Alice, he realizes that his code, while acceptably fast, is still written poorly.
30
Second scenario, cont’d Bob has learned a lot already, but is concerned that his grade will not reflect his state of knowledge. Bob thus copies Alice’s code, makes some minor modifications, and hands it in. What has happened here?
31
Regular Expressions
32
A regular language can always be described using a regular expression.
Examples:
(01)*
00
(a|b)*ab
this|that|theother
0*1*2*
01*|0 = 01*
00*11*22* = 0+1+2+ (where + means one or more repetitions)
(1|0)*00(0|1)*
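These examples happen to be valid in Python's `re` syntax as written, so they can be tried directly. The sketch below is only an experiment, not part of the lecture: `matches` tests whole-string membership, and `agree` compares two expressions on every string up to a given length, which is evidence for the two equalities above, not a proof.

```python
import re
from itertools import product


def matches(pattern, s):
    """True if the whole string s is in the language described by pattern."""
    return re.fullmatch(pattern, s) is not None


def agree(p1, p2, alphabet, max_len):
    """Check that two patterns accept the same strings of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if matches(p1, s) != matches(p2, s):
                return False
    return True
```

For example, `agree("01*|0", "01*", "01", 6)` compares the two sides of the first equality on all binary strings of length at most 6.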
33
More examples
[.?!][\]\"')]*($|\t|  )[ \t\n]*
[.?!][]"')]*($| |)[ ]*
Emacs regexp:
Any of . ? !
followed by zero or more of ] " ' )
followed by any of end-of-line, tab, two spaces
followed by zero or more of space, tab, newline
[Demo of emacs, sed, grep…]
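The four-part description can be tried out in Python. Emacs and Python use different regexp syntaxes, so the pattern below is a translation made for illustration (an assumption, not the Emacs original), and `first_sentence_end` is a hypothetical helper name.

```python
import re

# Python rendering of the description on the slide:
#   one of . ? !, then any number of ] " ' ), then end-of-line, tab, or two
#   spaces, then any number of spaces, tabs, or newlines.
SENTENCE_END = re.compile(r'[.?!][\]"\')]*($|\t|  )[ \t\n]*')


def first_sentence_end(text):
    """Return the text matched at the first sentence end, or None if there is none."""
    m = SENTENCE_END.search(text)
    return m.group(0) if m else None
```

Note that a period followed by a single space is not treated as a sentence end by this pattern, which matches the "two spaces" rule in the description.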
34
Regular expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
35
Regular expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
36
Regular expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
Invariant: Every machine must have exactly one final state.
37
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
[Figure: machine with a single transition labeled a]
38
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
39
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are
40
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are: L_{R+S} = L_R ∪ L_S
41
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are: L_{R+S} = L_R ∪ L_S
[Figure: the machines for R and S]
42
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are: L_{R+S} = L_R ∪ L_S
Invariant: Every machine must have exactly one final state.
[Figure: the machines for R and S]
43
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are: L_{R+S} = L_R ∪ L_S
Add a new final state with transitions from old final states if necessary.
Invariant: Every machine must have exactly one final state.
[Figure: the machines for R and S]
44
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression: L_ε = {ε}
∅ is a regular expression: L_∅ = {}
a is a regular expression: L_a = {a}
R+S is a regular expression if R and S are: L_{R+S} = L_R ∪ L_S
[Figure: the machines for R and S]
45
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression
∅ is a regular expression
a is a regular expression
R+S is a regular expression if R and S are
RS is a regular expression if R and S are: L_{RS} = {uv | u ∈ L_R and v ∈ L_S}
[Figure: the machines for R and S]
46
Regular Expressions
Inductive definition. Let Σ = {a,b}.
ε is a regular expression
∅ is a regular expression
a is a regular expression
R+S is a regular expression if R and S are
RS is a regular expression if R and S are
R* is a regular expression if R is: L_{R*} = ⋃_{i ≥ 0} (L_R)^i
[Figure: the machine for R]
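Each case of the inductive definition can be paired with a small machine that keeps the one-final-state invariant. The Python sketch below uses a Thompson-style construction with ε-moves; this is a standard technique and not necessarily the exact construction drawn on the slides. All names are illustrative, and `None` is used as the ε label by assumption.

```python
import itertools

_ids = itertools.count()  # source of fresh state names


def _merge(*edge_maps):
    """Combine edge maps of the form (state, symbol) -> set of states."""
    merged = {}
    for edges in edge_maps:
        for key, targets in edges.items():
            merged.setdefault(key, set()).update(targets)
    return merged


# A machine is (start, final, edges) with exactly one final state.
def epsilon():
    s, f = next(_ids), next(_ids)
    return (s, f, {(s, None): {f}})


def empty():
    s, f = next(_ids), next(_ids)
    return (s, f, {})  # the final state exists but is unreachable


def symbol(a):
    s, f = next(_ids), next(_ids)
    return (s, f, {(s, a): {f}})


def union(r, s):  # R+S: new start and new final state
    start, final = next(_ids), next(_ids)
    return (start, final, _merge(r[2], s[2],
                                 {(start, None): {r[0], s[0]}},
                                 {(r[1], None): {final}},
                                 {(s[1], None): {final}}))


def concat(r, s):  # RS: link the final state of R to the start of S
    return (r[0], s[1], _merge(r[2], s[2], {(r[1], None): {s[0]}}))


def star(r):  # R*: allow skipping R or repeating it
    start, final = next(_ids), next(_ids)
    return (start, final, _merge(r[2],
                                 {(start, None): {r[0], final}},
                                 {(r[1], None): {r[0], final}}))


def _closure(states, edges):
    """All states reachable from `states` using only epsilon moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in edges.get((q, None), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def accepts(machine, w):
    start, final, edges = machine
    current = _closure({start}, edges)
    for a in w:
        step = set()
        for q in current:
            step |= edges.get((q, a), set())
        current = _closure(step, edges)
    return final in current
```

As a usage example, (a+b)*ab is built as `concat(concat(star(union(symbol("a"), symbol("b"))), symbol("a")), symbol("b"))`. Each sub-expression should be built with its own call, since reusing one machine object in two places would share its states.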
47
Regular Expressions
The language described by a regular expression can be accepted by an FSM: RE → NFA, NFA → DFA.
A regular language can always be described using a regular expression: DFA → RE.
48
Regular Expressions Membership in a regular language can be tested in time linear in the size of the input string.
49
Building FSMs
An FSM is a directed graph.
How large is the input alphabet? How many states? How fast must it run? How to get the lowest constant factor? How to minimize space?
Representations: matrix, array of lists, hashtable, overlapping hashtable, switch statement.
Matrix example (rows are states, columns are input symbols):
     a  b
0    1  1
1    2  3
2    1  0
3    2  3
4    1  4
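The matrix representation might look like this in Python. The entries follow the a/b table as it survives on the slide, read as one row per state; the start state 0 is an assumption, and the slide gives no final states, so the sketch only reports the state reached.

```python
# Rows are states 0..4; columns are the input symbols a and b.
TABLE = [
    [1, 1],  # state 0
    [2, 3],  # state 1
    [1, 0],  # state 2
    [2, 3],  # state 3
    [1, 4],  # state 4
]
COLUMN = {"a": 0, "b": 1}


def run(w, start=0):  # assumption: state 0 is the start state
    """Return the state reached after reading w: one table lookup per symbol."""
    q = start
    for symbol in w:
        q = TABLE[q][COLUMN[symbol]]
    return q
```

Each input symbol costs a single indexed lookup, which is where the linear running time in the length of the input comes from. The matrix uses space proportional to the number of states times the alphabet size, which is the trade-off the other representations address.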
50
Manipulating FSMs
Eliminate unreachable states
Transform NFA into DFA
Transform NFA into NFA
Minimize DFA
Create FSM from regular expression
Create regular expression from FSM
Test equivalence of FSMs
Test emptiness of FSM language
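Two of these operations, finding unreachable states and testing emptiness, reduce to a graph search from the start state. The sketch below assumes a DFA stored as a dictionary and uses a small hypothetical machine that is not from the lecture.

```python
from collections import deque


def reachable(start, alphabet, delta):
    """Breadth-first search over the transition graph from the start state."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        for a in alphabet:
            p = delta.get((q, a))
            if p is not None and p not in seen:
                seen.add(p)
                queue.append(p)
    return seen


def is_empty(start, alphabet, delta, final):
    """The language is empty iff no final state is reachable."""
    return not (reachable(start, alphabet, delta) & set(final))


# Hypothetical DFA over {a, b}: state 2 cannot be reached from state 0.
EXAMPLE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 0,
    (2, "a"): 0, (2, "b"): 2,
}
```

States outside `reachable(...)` can be dropped without changing the language, which is also a common first step before minimizing a DFA.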