Lecture 8: NFA Subset Construction & Epsilon Transitions
CSCE 355 Foundations of Computation
Topics: Regular expressions; Thompson Construction; Examples of Thompson Construction
June 10, 2015
Test 1 post mortem: problems due Thursday
Last Time: Readings 2.3
  Mutual induction proof revisited
  Languages denoted by regular expressions
  Examples
  Ruby regular expressions
  Test 1, #6: Prove that the number of nodes in an m-ary tree of height h is at most 1 + m + m^2 + … + m^h, and leave the answer as a sum.
New: Readings sections 2.2.3-2.4
  Author’s website: solutions online
  ε-NFA
  NFA and ε-NFA to DFA: subset construction
  Regular expressions
  Relations between machines and regular expressions
Figure 2.20 - ε-NFA for fixed decimals
Given the transition table:
  State   ε       +,-     .       0,1,2,…,9
  q0      {q1}    {q1}    ∅       ∅
  q1      ∅       ∅       {q2}    {q1,q4}
  q2      ∅       ∅       ∅       {q3}
  q3      {q5}    ∅       ∅       {q3}
  q4      ∅       ∅       {q3}    ∅
  q5      ∅       ∅       ∅       ∅
Draw the transition diagram.
Compute ECLOSE for each state.
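The ε-closure computation asked for on this slide can be sketched in Python. This is a minimal sketch, not the textbook's code: the table below is the decimal-number ε-NFA as I read it from the course text, and the column keys "eps", "sign", "dot", and "digit" are names made up for this example.

```python
# Transition table of the decimal-number ε-NFA.
# Assumption: "eps", "sign", "dot", "digit" are hypothetical column names
# standing for ε, {+,-}, '.', and {0,...,9}. Missing entries mean ∅.
DELTA = {
    "q0": {"eps": {"q1"}, "sign": {"q1"}},
    "q1": {"dot": {"q2"}, "digit": {"q1", "q4"}},
    "q2": {"digit": {"q3"}},
    "q3": {"eps": {"q5"}, "digit": {"q3"}},
    "q4": {"dot": {"q3"}},
    "q5": {},
}


def eclose(state, delta=DELTA):
    """Return every state reachable from `state` using only ε-moves."""
    closure = {state}
    stack = [state]
    while stack:
        q = stack.pop()
        for r in delta[q].get("eps", set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


# ECLOSE for each state, collected into one table.
ECLOSE = {q: eclose(q) for q in DELTA}
```

Only q0 and q3 have outgoing ε-moves in this table, so they are the only states whose closure contains more than the state itself.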
Delta-hat (δ̂) with ε-transitions
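For an ε-NFA the extended transition function is usually defined by δ̂(q, ε) = ECLOSE(q), and for w = xa by taking every state p in δ̂(q, x), following a, and closing the result under ε-moves. The sketch below follows that definition on the decimal-number ε-NFA; the empty-string marker EPS and the helper names are assumptions made for this example.

```python
EPS = ""  # assumed marker for an ε-move; not notation from the slides


def build_delta():
    """Decimal-number ε-NFA, keyed by (state, symbol)."""
    delta = {}

    def add(state, symbols, targets):
        for a in symbols:
            delta.setdefault((state, a), set()).update(targets)

    digits = "0123456789"
    add("q0", [EPS, "+", "-"], {"q1"})
    add("q1", digits, {"q1", "q4"})
    add("q1", ".", {"q2"})
    add("q2", digits, {"q3"})
    add("q3", digits, {"q3"})
    add("q3", [EPS], {"q5"})
    add("q4", ".", {"q3"})
    return delta


DELTA = build_delta()


def eclose(states):
    """Close a set of states under ε-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def delta_hat(q, w):
    current = eclose({q})          # basis: δ̂(q, ε) = ECLOSE(q)
    for a in w:                    # induction on w = xa
        moved = set()
        for p in current:
            moved |= DELTA.get((p, a), set())
        current = eclose(moved)
    return current


def accepts(w):
    """q5 is the accepting state of this ε-NFA."""
    return "q5" in delta_hat("q0", w)
```

For example, delta_hat("q0", "5.6") passes through {q0, q1}, {q1, q4}, and {q2, q3, q5} before ending in {q3, q5}.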
Eliminating ε-transitions
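One standard way to eliminate ε-transitions is the modified subset construction: each DFA state is an ε-closed set of ε-NFA states. The following is a sketch under stated assumptions, not the slide's own construction: EPS, the function names, and the small two-state ε-NFA for a*b* are all introduced here for illustration.

```python
from collections import deque

EPS = None  # assumed marker for an ε-move


def eclose(states, delta):
    """Close a set of states under ε-moves; frozen so it can be a DFA state."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def eps_nfa_to_dfa(delta, start, finals, alphabet):
    """Build only the DFA states reachable from ECLOSE(start)."""
    d_start = eclose({start}, delta)
    d_delta, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            moved = set()
            for p in S:
                moved |= delta.get((p, a), set())
            T = eclose(moved, delta)
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    # A DFA state accepts if it contains at least one accepting NFA state.
    d_finals = {S for S in seen if S & finals}
    return d_start, d_delta, d_finals


def dfa_accepts(dfa, w):
    state, d_delta, d_finals = dfa
    for a in w:
        state = d_delta[(state, a)]
    return state in d_finals


# Hypothetical ε-NFA for a*b*: q0 loops on a, ε-moves to q1, q1 loops on b.
DELTA = {("q0", "a"): {"q0"}, ("q0", EPS): {"q1"}, ("q1", "b"): {"q1"}}
DFA = eps_nfa_to_dfa(DELTA, "q0", {"q1"}, "ab")
```

The resulting DFA has three states: {q0, q1}, {q1}, and the empty set, which acts as the dead state.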
Regular Expressions
DFAs recognize languages.
NFAs recognize languages.
Regular expressions denote languages, so that we can write the description of a language as a combination of less complex languages.
Examples:
  r = a*b*               L(r) = {w in {a,b}* | all a’s come before any b’s}
  r = (0+1)*000(0+1)*    L(r) = {w in {0,1}* | w contains 000}
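The two example expressions can be tried out with Python's re module. Note that practical regex syntax writes the union operator + of the slides as |. The helper name in_language is made up for this sketch.

```python
import re

# Slide notation a*b* carries over directly.
A_THEN_B = re.compile(r"a*b*")
# Slide notation (0+1)*000(0+1)*; the union "+" becomes "|" in Python.
HAS_000 = re.compile(r"(0|1)*000(0|1)*")


def in_language(pattern, w):
    """True when the whole string w matches, i.e. w is in L(r)."""
    return pattern.fullmatch(w) is not None
```

fullmatch is used rather than search because membership in L(r) is a statement about the entire string, not about some substring.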
Grep
Unix utility
  man grep
  man -k regexp
Recursive Definition of Reg Expr
Definition of regular expressions over an alphabet Σ
Base cases:
  If a ∈ Σ, then a is a regular expression and denotes L(a) = { a }.
  ε is a regular expression and denotes L(ε) = { ε }.
  A variable, usually a capital letter such as L, is a regular expression, and of course L(L) = L.
Recursive definition: if r and s are regular expressions denoting the languages L(r) and L(s), then
  rs is a regular expression denoting L(rs) = L(r)L(s)
  r+s is a regular expression denoting L(r+s) = L(r) ∪ L(s)
  r* is a regular expression denoting L(r*) = [L(r)]*
  (r) is a regular expression denoting L((r)) = L(r)
Thompson Construction
Based on the recursive (inductive) definition of regular expressions.
We describe NFAs (with epsilon moves) that recognize the base cases.
Then, assuming we have NFAs for smaller expressions r and s, we construct NFAs for
  r + s
  rs
  r*
Recursive Definition of Reg Expr (recap)
Definition of regular expressions over an alphabet Σ
Base cases:
  If a ∈ Σ, then a is a regular expression and denotes L(a) = { a }.
  ε is a regular expression and denotes L(ε) = { ε }.
  A variable, usually a capital letter such as L, is a regular expression, and of course L(L) = L.
Recursive definition: if r and s are regular expressions denoting the languages L(r) and L(s), then
  rs is a regular expression denoting L(r)L(s)
  r+s is a regular expression denoting L(r) ∪ L(s)
  r* is a regular expression denoting [L(r)]* = L(r)^0 ∪ L(r)^1 ∪ L(r)^2 ∪ …
Thompson Construction Base cases
Thompson Construction Recursive cases
Thompson Construction Recursive cases
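The base and recursive cases can be sketched as one small Python module that builds an ε-NFA fragment per regular-expression operator. This is an illustrative sketch, not the slides' diagrams: the Frag class, the EPS marker, and every function name are assumptions, and each fragment should be used in only one larger expression because state numbers are not copied.

```python
from itertools import count

EPS = None          # assumed marker for an ε-move
_ids = count()      # supplies fresh state numbers


class Frag:
    """An ε-NFA fragment with one start state and one accepting state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def _merged(*parts):
    """Copy the edges of the sub-fragments into one new dictionary."""
    edges = {}
    for part in parts:
        for key, targets in part.edges.items():
            edges.setdefault(key, set()).update(targets)
    return edges


def _link(frag, src, sym, dst):
    frag.edges.setdefault((src, sym), set()).add(dst)


def symbol(a):                       # base case: a single symbol a
    frag = Frag(next(_ids), next(_ids), {})
    _link(frag, frag.start, a, frag.accept)
    return frag


def epsilon():                       # base case: ε
    return symbol(EPS)


def union(r, s):                     # recursive case: r + s
    frag = Frag(next(_ids), next(_ids), _merged(r, s))
    for part in (r, s):
        _link(frag, frag.start, EPS, part.start)
        _link(frag, part.accept, EPS, frag.accept)
    return frag


def concat(r, s):                    # recursive case: rs
    frag = Frag(r.start, s.accept, _merged(r, s))
    _link(frag, r.accept, EPS, s.start)
    return frag


def star(r):                         # recursive case: r*
    frag = Frag(next(_ids), next(_ids), _merged(r))
    _link(frag, frag.start, EPS, r.start)
    _link(frag, frag.start, EPS, frag.accept)   # zero repetitions
    _link(frag, r.accept, EPS, r.start)         # go around again
    _link(frag, r.accept, EPS, frag.accept)
    return frag


def eclose(states, edges):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in edges.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(frag, w):
    """Simulate the ε-NFA on w by tracking the set of current states."""
    current = eclose({frag.start}, frag.edges)
    for a in w:
        moved = set()
        for p in current:
            moved |= frag.edges.get((p, a), set())
        current = eclose(moved, frag.edges)
    return frag.accept in current


def any_bit():
    return union(symbol("0"), symbol("1"))


# (0+1)*000(0+1)* built bottom-up from the cases above.
three_zeros = concat(concat(symbol("0"), symbol("0")), symbol("0"))
CONTAINS_000 = concat(concat(star(any_bit()), three_zeros), star(any_bit()))
```

Each operator adds at most two new states, so the ε-NFA has a number of states linear in the size of the expression.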
Thompson Construction Examples
2008 Sample Test 1 outline
Proof techniques
  Inductive proof
  Mutual induction proof
Given DFA
  Transition diagram
  Input “abaa”
  L(M)
Give DFA for L
NFA
  NFA for L
  NFA → DFA (subset construction)
ε-NFA
  ε-NFA for L
  ε-closure (ECLOSE in text)
  ε-NFA → DFA (subset construction)
Ruby regular expressions
Design a DFA that accepts language L
Example
For a DFA D, how do you prove L(D) = L?
Mutual Induction Proof
Define three statements for a mutual induction proof that could help in proving that L(M) = L = {x ∈ {0, 1}* | x has a number of zeroes divisible by 3}
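One plausible choice of the three statements is S_i(w): δ̂(q0, w) = q_i if and only if the number of zeroes in w is congruent to i mod 3, for i = 0, 1, 2. The slide does not give M, so the three-state DFA below is an assumption made for this sketch; the code only checks the statements by brute force on short strings, which is a sanity check and not a proof.

```python
from itertools import product

# Assumed DFA M: state i means "number of zeroes so far is congruent to i mod 3".
DELTA = {(i, "0"): (i + 1) % 3 for i in range(3)}   # a 0 advances the count
DELTA.update({(i, "1"): i for i in range(3)})       # a 1 leaves the state alone
START, FINALS = 0, {0}


def delta_hat(q, w):
    """Extended transition function of the DFA."""
    for a in w:
        q = DELTA[(q, a)]
    return q


def check_statements(max_len):
    """Check S_0, S_1, S_2 on every binary string of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            # S_i(w): delta_hat(q0, w) = q_i  iff  zeroes(w) mod 3 = i
            if delta_hat(START, w) != w.count("0") % 3:
                return False
    return True
```

Since S_0 says that M ends in its accepting state exactly when the count of zeroes is divisible by 3, proving all three statements together gives L(M) = L.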
HW solutions
2.5.1
  State\input   ε     a      b      c
  p             ∅     {p}    {q}    {r}
  q
  *r
  ε-closure(p) = {p}
2.5.3 a, b
  a) The set of strings consisting of zero or more a’s followed by zero or more b’s followed by zero or more c’s.
  b) The set of strings consisting of either 01 repeated one or more times or 010 repeated one or more times.
2.3.2 Subset construction (NFA without ε): NFA → DFA
NFA:
  State\input   0         1
  p             {q, s}    {q}
  *q            {r}       {q, r}
  r             {s}       {p}
  *s            ∅
DFA (first row):
  State\input   0         1
  {p}           {q, s}    {q}
2.3.4 (done before)
  a) The set of strings over {0, 1, …, 9} such that the final digit has appeared before.
  b) The set of strings over {0, 1, …, 9} such that the final digit has not appeared before.
  c) The set of strings of 0’s and 1’s such that there are two 0’s separated by a number of positions that is a multiple of 4. Note that 0 is an allowable multiple of 4.
Tenth symbol from the right is a ‘1’
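This language is a common illustration of how much the subset construction can grow: an NFA with 11 states guesses which 1 is the tenth symbol from the right, while the reachable DFA must remember the last ten symbols. The sketch below builds the NFA for a general n and runs a plain subset construction (no ε-moves); the function names and state numbering are assumptions for this example.

```python
from collections import deque


def nth_from_right_nfa(n):
    """NFA over {0,1}: state 0 loops, guesses on a 1, then counts n-1 symbols."""
    delta = {(0, "0"): {0}, (0, "1"): {0, 1}}
    for i in range(1, n):
        delta[(i, "0")] = {i + 1}
        delta[(i, "1")] = {i + 1}
    return delta, 0, {n}            # state n is accepting, with no moves out


def subset_construction(delta, start, finals, alphabet="01"):
    """Return (start, transitions, accepting states, all reachable states)."""
    d_start = frozenset({start})
    d_delta, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            T = frozenset(r for p in S for r in delta.get((p, a), ()))
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    d_finals = {S for S in seen if S & finals}
    return d_start, d_delta, d_finals, seen


def dfa_accepts(d_start, d_delta, d_finals, w):
    state = d_start
    for a in w:
        state = d_delta[(state, a)]
    return state in d_finals


# Tenth symbol from the right is a 1.
START10, DELTA10, FINALS10, STATES10 = subset_construction(*nth_from_right_nfa(10))
```

With n = 10 every reachable DFA state contains state 0 together with some subset of states 1 through 10, which is where the count of 2^10 reachable states comes from.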
Pop Quiz Induction proof
Homework: 2.5.1 2.5.3 a,b Regular expressions