CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong NFA to DFA conversion and regular expressions Fall 2009
NFAs are as powerful as DFAs
Obviously, an NFA can do everything a DFA can do. But can it do more? No!
Theorem: If a language L is accepted by some NFA, then it is also accepted by some DFA.
Proof of theorem
We will show a general way to convert every NFA into an equivalent DFA.
Step 1: Simplify the NFA by eliminating ε-transitions
Step 2: Convert the simplified NFA (without ε-transitions) into a DFA
We do Step 2 first.
NFA to DFA conversion intuition
[Figure: an NFA with states q0, q1, q2 and transitions on 0 and 1, next to an equivalent DFA whose states are labeled "q0", "q0 or q1", and "q0 or q2", that is, {q0}, {q0, q1}, and {q0, q2}]
General method

                   NFA               DFA
states             q0, q1, …, qn     ∅, {q0}, {q1}, {q0, q1}, …, {q0, …, qn} (one for each subset of states in the NFA)
initial state      q0                {q0}
transitions        δ                 δ'({q_i1, …, q_ik}, a) = δ(q_i1, a) ∪ … ∪ δ(q_ik, a)
accepting states   F ⊆ Q             F' = {S : S contains some state in F}
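The table above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the function names, the dictionary encoding of δ, and the example NFA (which accepts strings ending in 01) are all assumptions made for this sketch, and the NFA is assumed to have no ε-transitions.

```python
from itertools import combinations

def subset_construction(states, alphabet, delta, start, accepting):
    """Build a DFA with one state for each subset of the NFA's states.

    delta maps (state, symbol) to a set of states; a missing key means
    "no move". The NFA is assumed to have no epsilon-transitions.
    """
    ordered = sorted(states)
    subsets = [frozenset(c) for k in range(len(ordered) + 1)
               for c in combinations(ordered, k)]
    dfa_delta = {}
    for S in subsets:
        for a in alphabet:
            # delta'(S, a) is the union of delta(q, a) over all q in S
            target = set()
            for q in S:
                target |= delta.get((q, a), set())
            dfa_delta[(S, a)] = frozenset(target)
    dfa_start = frozenset({start})
    # A subset is accepting if it contains some accepting NFA state
    dfa_accepting = {S for S in subsets if S & set(accepting)}
    return subsets, dfa_delta, dfa_start, dfa_accepting

def dfa_accepts(dfa_delta, dfa_start, dfa_accepting, word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    S = dfa_start
    for a in word:
        S = dfa_delta[(S, a)]
    return S in dfa_accepting

# Hypothetical example NFA (an assumption of this sketch):
# it accepts the strings over {0, 1} that end in 01.
nfa_delta = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
subsets, d, s0, acc = subset_construction(
    {"q0", "q1", "q2"}, "01", nfa_delta, "q0", {"q2"})
print(len(subsets))                      # 8
print(dfa_accepts(d, s0, acc, "1101"))   # True
print(dfa_accepts(d, s0, acc, "0110"))   # False
```

A 3-state NFA yields 2^3 = 8 subset states, most of which may never be visited on any input.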
Why the method works
After reading n symbols, the DFA is in state {q_i1, …, q_ik} if and only if q_i1, …, q_ik are exactly the states the NFA can be in after reading the same symbols.
At the end, the DFA accepts when it is in a state that contains some accepting state of the NFA.
So the DFA accepts exactly when the NFA accepts.
Converting via the general method
[Figure: the NFA with states q0, q1, q2 and the DFA produced by the general method, with states {q0}, {q1}, {q2}, {q0, q1}, {q0, q2}, {q1, q2}, {q0, q1, q2}]
After eliminating the dead states and transitions, we end up with the same picture.
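One way to avoid ever creating the unused subsets is to explore only the subsets reachable from the initial state. The sketch below assumes the same dictionary encoding of δ and a hypothetical NFA for strings ending in 01; neither is taken from the slides.

```python
from collections import deque

def reachable_dfa(alphabet, delta, start):
    """Subset construction that only creates subsets reachable from {start}."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # Union of the NFA moves on symbol a from every state in S
            T = frozenset(r for q in S for r in delta.get((q, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return seen, dfa_delta

# Hypothetical NFA (an assumption of this sketch): strings ending in 01.
nfa_delta = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
reachable, d = reachable_dfa("01", nfa_delta, "q0")
print(sorted(sorted(S) for S in reachable))
# [['q0'], ['q0', 'q1'], ['q0', 'q2']]
```

For this NFA only three of the eight subsets are reachable.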
Proof of theorem
We will show a general way to convert every NFA into an equivalent DFA.
Step 1: Simplify the NFA by eliminating ε-transitions
Step 2: Convert the simplified NFA (without ε-transitions) into a DFA
Eliminating ε-transitions
[Figure: an NFA with ε-transitions on states q0, q1, q2]
Transition table of the corresponding NFA (rows: states q0, q1, q2; columns: inputs 0, 1): [only the entries {q1, q2} and {q0, q1, q2} are legible]
Accepting states of the NFA: q0, q1, q2
[Figure: the resulting NFA without ε-transitions on states q0, q1, q2]
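Eliminating ε-transitions: General method
For every state q_i and every symbol a, replace every path out of q_i that consists of ε-transitions and exactly one a-transition by a single a-transition from q_i to the state where the path ends.
For every accept state q_f, make accepting all states from which q_f can be reached via ε-transitions.
[Figures: example paths from q_i through ε-transitions and one a-transition, each replaced by a direct a-transition; states with ε-paths into q_f]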
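The two rules can be sketched with an ε-closure helper. This is an illustration under assumptions: the empty string "" stands for ε, δ is a dictionary from (state, label) to a set of states, and the three-state example NFA is hypothetical rather than the one drawn on the slides.

```python
EPS = ""  # label used for epsilon-transitions (a convention of this sketch)

def eps_closure(delta, q):
    """All states reachable from q using only epsilon-transitions, including q."""
    closure, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def eliminate_eps(states, alphabet, delta, accepting):
    """Return (new_delta, new_accepting) for an equivalent NFA without epsilons."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            targets = set()
            # Paths of the form: epsilons, one a-transition, epsilons
            for p in eps_closure(delta, q):
                for r in delta.get((p, a), ()):
                    targets |= eps_closure(delta, r)
            if targets:
                new_delta[(q, a)] = targets
    # A state becomes accepting if it reaches an accepting state via epsilons
    new_accepting = {q for q in states
                     if eps_closure(delta, q) & set(accepting)}
    return new_delta, new_accepting

# Hypothetical epsilon-NFA (an assumption of this sketch)
delta = {
    ("q0", EPS): {"q1"},
    ("q0", "1"): {"q1"},
    ("q1", EPS): {"q2"},
    ("q2", "0"): {"q0"},
}
new_delta, new_accepting = eliminate_eps(
    ["q0", "q1", "q2"], "01", delta, {"q2"})
print(sorted(new_delta[("q0", "1")]))   # ['q1', 'q2']
print(sorted(new_delta[("q0", "0")]))   # ['q0', 'q1', 'q2']
print(sorted(new_accepting))            # ['q0', 'q1', 'q2']
```

Taking closures on both sides of the a-transition is one of several equivalent ways to carry out the replacement; the result has no ε-labeled entries at all.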
Regular expressions
Operations on strings
Given two strings s = a1…an and t = b1…bm, we define their concatenation st = a1…an b1…bm.
Example: s = abb, t = cba, st = abbcba
We define s^n as the concatenation ss…s (n times).
Example: s = 011, s^3 = 011011011
Operations on languages
The concatenation of languages L1 and L2 is L1L2 = {st : s ∈ L1, t ∈ L2}.
Similarly, we write L^n for LL…L (n times).
The union of languages L1 ∪ L2 is the set of all strings that are in L1 or in L2.
Example: L1 = {01, 0}, L2 = {ε, 1, 11, 111, …}. What are L1L2 and L1 ∪ L2?
Operations on languages
The star (Kleene closure) of L is the set of all strings made up of zero or more chunks from L:
L* = L^0 ∪ L^1 ∪ L^2 ∪ …
– This always contains ε, and it is infinite unless L = ∅ or L = {ε}
Example: L1 = {01, 0}, L2 = {ε, 1, 11, 111, …}. What are L1* and L2*?
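The language operations can be tried out on finite sets of Python strings. This is only a sketch: the function names are made up here, the infinite language L2 is cut off after 111, and the star is computed only up to a length bound, since L* is usually infinite.

```python
def concat(L1, L2):
    """Concatenation of languages: every s followed by every t."""
    return {s + t for s in L1 for t in L2}

def power(L, n):
    """L^n, with L^0 = {empty string}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """The strings of L* that have length at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more chunk from L
        frontier = {w for w in concat(frontier, L)
                    if len(w) <= max_len} - result
        result |= frontier
    return result

L1 = {"01", "0"}
L2 = {"", "1", "11", "111"}  # finite truncation of {ε, 1, 11, 111, …}

print(sorted(concat(L1, L2)))      # ['0', '01', '011', '0111', '01111']
print(sorted(L1 | L2))             # ['', '0', '01', '1', '11', '111']
print(sorted(power(L1, 2)))        # ['00', '001', '010', '0101']
print(sorted(star_up_to(L1, 3)))   # ['', '0', '00', '000', '001', '01', '010']
```

Python's `|` on sets plays the role of ∪, and the empty string `""` plays the role of ε.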
Constructing languages with operations
Let’s fix an alphabet, say Σ = {0, 1}.
We can construct languages by starting with simple ones, like {0}, {1}, and combining them:
{0}({0} ∪ {1})*, written 0(0+1)*: all strings that start with 0
({0}{1}*) ∪ ({1}{0}*), written 01*+10*
Regular expressions
A regular expression over Σ is an expression formed using the following rules:
– The symbol ∅ is a regular expression
– The symbol ε is a regular expression
– For every a ∈ Σ, the symbol a is a regular expression
– If R and S are regular expressions, so are R+S, RS and R*.
A language is regular if it is represented by a regular expression.
Examples (Σ = {0, 1})
01* = 0(1*) = {0, 01, 011, 0111, …}: 0 followed by any number of 1s
(01*)(01) = {001, 0101, 01101, 011101, …}: 0 followed by any number of 1s and then 01
Examples
0+1 = {0, 1}: strings of length 1
(0+1)* = {ε, 0, 1, 00, 01, 10, 11, …}: any string
(0+1)*01(0+1)*: any string that contains the pattern 01
(0+1)*010: any string that ends in 010
Examples
(0+1)(0+1): strings of length 2
(0+1)(0+1)(0+1): strings of length 3
((0+1)(0+1))*: strings of even length
((0+1)(0+1)(0+1))*: strings of length divisible by 3
((0+1)(0+1))*+((0+1)(0+1)(0+1))*: all strings whose length is even or divisible by 3 = strings of length 0, 2, 3, 4, 6, 8, 9, 10, 12, ...
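The list of lengths can be checked with Python's `re` module, writing the slide's + as `|`. Since membership here depends only on the length of the string, this sketch (an illustration, not part of the lecture) tests one representative string per length.

```python
import re

# ((0+1)(0+1))* + ((0+1)(0+1)(0+1))* in Python's regex syntax
even_or_div3 = re.compile(r"((0|1)(0|1))*|((0|1)(0|1)(0|1))*")

# fullmatch requires the whole string to match, as in the formal definition
lengths = [n for n in range(13) if even_or_div3.fullmatch("0" * n)]
print(lengths)  # [0, 2, 3, 4, 6, 8, 9, 10, 12]
```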
Examples
(0+1)(0+1): strings of length 2
(0+1)(0+1)(0+1): strings of length 3
(0+1)(0+1)+(0+1)(0+1)(0+1): strings of length 2 or 3
((0+1)(0+1)+(0+1)(0+1)(0+1))*: strings that can be broken in blocks, where each block has length 2 or 3
Examples
((0+1)(0+1)+(0+1)(0+1)(0+1))*: strings that can be broken in blocks, where each block has length 2 or 3
This includes all strings except those of length 1:
((0+1)(0+1)+(0+1)(0+1)(0+1))* = all strings except 0 and 1
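A brute-force check of this claim, as a sketch: enumerate every string over {0, 1} up to a small length bound and list those the expression fails to match. The helper names and the bound of 6 are choices made for this illustration; a finite check supports the claim but does not prove it.

```python
import re
from itertools import product

def all_strings(max_len, alphabet="01"):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def language(pattern, max_len):
    """The strings of length <= max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    return {w for w in all_strings(max_len) if rx.fullmatch(w)}

# ((0+1)(0+1) + (0+1)(0+1)(0+1))* in Python's regex syntax
blocks = r"((0|1)(0|1)|(0|1)(0|1)(0|1))*"
missing = set(all_strings(6)) - language(blocks, 6)
print(sorted(missing))  # ['0', '1']
```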
Examples
(1+01+001)*(ε+0+00)
(1+01+001)*: there can be at most two 0s between consecutive 1s
(ε+0+00): ends in at most two 0s
Guess: (1+01+001)*(ε+0+00) = {x : x does not contain 000}, that is, there are never three consecutive 0s
Examples
Write a regular expression for all strings with two consecutive 0s. Σ = {0, 1}
(0+1)*00(0+1)*: (anything) 00 (anything else)
Examples
Write a regular expression for all strings that do not contain two consecutive 0s. Σ = {0, 1}
... at most one 0 in every block ending in 1: (1 + 01)
... and at most one 0 in the last block: (ε + 0)
(1 + 01)*(ε + 0): blocks ending in 1, then the last block
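Both answers about consecutive 0s can be compared against a direct substring test on all short strings. This is a sketch with made-up helper names; ε is written as an empty alternative in Python's regex syntax, and the bound of 8 is arbitrary.

```python
import re
from itertools import product

def all_strings(max_len, alphabet="01"):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def matches(pattern, max_len):
    """The strings of length <= max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    return {w for w in all_strings(max_len) if rx.fullmatch(w)}

MAX_LEN = 8
universe = set(all_strings(MAX_LEN))
has_00 = {w for w in universe if "00" in w}

contains_00 = matches(r"(0|1)*00(0|1)*", MAX_LEN)   # (0+1)*00(0+1)*
avoids_00 = matches(r"(1|01)*(|0)", MAX_LEN)        # (1+01)*(ε+0)

print(contains_00 == has_00)             # True
print(avoids_00 == universe - has_00)    # True
```

The two expressions describe complementary languages, so together they cover every string exactly once.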
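Examples
Write a regular expression for all strings with an even number of 0s. Σ = {0, 1}
even number of zeros = (two zeros)*
two zeros = 1*01*01*
(1*01*01*)* matches ε and every string with a positive even number of 0s, but it misses nonempty strings with no 0s, such as 1. Moving the leading 1s outside the star fixes this:
1*(01*01*)*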
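A brute-force comparison of (1*01*01*)* and 1*(01*01*)* against the property "even number of 0s" shows which strings each form misses. This is a sketch for illustration; the helper name and the length bound of 6 are arbitrary choices.

```python
import re
from itertools import product

def all_strings(max_len, alphabet="01"):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

starred_pairs = re.compile(r"(1*01*01*)*")     # (two zeros)*
leading_ones = re.compile(r"1*(01*01*)*")      # leading 1s outside the star

even = [w for w in all_strings(6) if w.count("0") % 2 == 0]
odd = [w for w in all_strings(6) if w.count("0") % 2 == 1]

missed_by_pairs = [w for w in even if not starred_pairs.fullmatch(w)]
missed_by_leading = [w for w in even if not leading_ones.fullmatch(w)]
wrongly_matched = [w for w in odd if leading_ones.fullmatch(w)]

print(missed_by_pairs)     # ['1', '11', '111', '1111', '11111', '111111']
print(missed_by_leading)   # []
print(wrongly_matched)     # []
```

The only strings the first form misses are the nonempty strings made entirely of 1s.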
Main theorem for regular languages
Theorem: A language is regular if and only if it is the language of some DFA.
[Diagram: DFA, NFA, and regular expression, all describing the regular languages]
Road map
[Diagram: conversions between regular expression, NFA, NFA without ε, and DFA]
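Examples: regular expression → NFA
R1 = 0, R2 = 0 + 1, R3 = (0 + 1)*
[Figures: NFA for R1 with states q0, q1 and a transition on 0; NFA M2 for R2 with states q0, …, q5 and transitions on 0 and 1; NFA for R3 with new states q0’, q1’ built around M2]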
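General method
regular expr → NFA
∅: [diagram: a single state q0 that is not accepting]
ε: [diagram: a single state q0 that is accepting]
symbol a: [diagram: q0 to q1 on a]
RS: [diagram: q0, then M_R, then M_S, then q1, connected in sequence]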
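General method continued
regular expr → NFA
R + S: [diagram: q0 branching to M_R and M_S in parallel, both leading to q1]
R*: [diagram: q0, M_R, q1, where M_R can be traversed zero or more times]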
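The construction can be sketched as a set of Python combinators, each returning an NFA with one start state and one accept state, glued together by ε-transitions. This is an assumed, simplified variant for illustration: all names are hypothetical, ε is represented by the label `None`, and the machine for ε uses two states joined by an ε-transition rather than a single accepting state.

```python
from itertools import count

_ids = count()  # fresh state numbers

class NFA:
    """An NFA with a single start state and a single accept state."""
    def __init__(self):
        self.start = next(_ids)
        self.accept = next(_ids)
        self.edges = []  # (source, label, target); label None means epsilon

def empty():
    return NFA()  # no edges, so the accept state is unreachable

def epsilon():
    m = NFA()
    m.edges = [(m.start, None, m.accept)]
    return m

def symbol(a):
    m = NFA()
    m.edges = [(m.start, a, m.accept)]
    return m

def concat(r, s):
    m = NFA()
    m.edges = r.edges + s.edges + [
        (m.start, None, r.start), (r.accept, None, s.start),
        (s.accept, None, m.accept)]
    return m

def union(r, s):
    m = NFA()
    m.edges = r.edges + s.edges + [
        (m.start, None, r.start), (m.start, None, s.start),
        (r.accept, None, m.accept), (s.accept, None, m.accept)]
    return m

def star(r):
    m = NFA()
    m.edges = r.edges + [
        (m.start, None, m.accept),   # zero repetitions
        (m.start, None, r.start),
        (r.accept, None, r.start),   # repeat
        (r.accept, None, m.accept)]
    return m

def closure(edges, states):
    """Add every state reachable from the given states via epsilon edges."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for src, label, dst in edges:
            if src == p and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, word):
    current = closure(m.edges, {m.start})
    for a in word:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == a}
        current = closure(m.edges, moved)
    return m.accept in current

# 0(0+1)*: all strings that start with 0
m = concat(symbol("0"), star(union(symbol("0"), symbol("1"))))
print(accepts(m, "011"))  # True
print(accepts(m, "10"))   # False
print(accepts(m, ""))     # False
```

Each sub-expression should be built with its own call (for example, two separate `symbol("0")` calls), because the combinators reuse the state numbers of their arguments.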