Finite Automata & Regular Languages Sipser, Chapter 1
Deterministic Finite Automata A DFA or deterministic finite automaton M is a 5-tuple, M = (Q, Σ, δ, q_0, F), where: Q is a finite set of states of M; Σ is the finite input alphabet of M; δ: Q × Σ → Q is the state transition function; q_0 ∈ Q is the start state of M; F ⊆ Q is the set of accepting states or final states of M.
DFA Example [State diagram of M with states q_0 and q_1.] Q = { q_0, q_1 }, Σ = { 0, 1 }, F = { q_1 }. State table:
         0      1
q_0     q_0    q_1
q_1     q_1    q_0
State table & state transition function The state table and the state transition function δ carry the same information. State transition function: δ(q_0, 0) = q_0, δ(q_0, 1) = q_1, δ(q_1, 0) = q_1, δ(q_1, 1) = q_0. State table:
         0      1
q_0     q_0    q_1
q_1     q_1    q_0
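One way to make the state table concrete is a dictionary keyed by (state, symbol) pairs. The sketch below assumes states and symbols are plain strings; the names `Q`, `SIGMA`, `START`, `FINALS`, `DELTA`, and `is_total` are illustrative choices, not notation from the slides.

```python
# Components of the example DFA; the Python names are illustrative assumptions.
Q = {"q0", "q1"}
SIGMA = {"0", "1"}
START = "q0"
FINALS = {"q1"}

# The state table: DELTA[(state, symbol)] gives the next state.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def is_total(delta, states, alphabet):
    """True if delta maps every (state, symbol) pair to a state in `states`."""
    return (len(delta) == len(states) * len(alphabet)
            and all(delta.get((q, s)) in states
                    for q in states for s in alphabet))

print(is_total(DELTA, Q, SIGMA))  # True
```

The totality check reflects the requirement that a DFA has exactly one successor for every state and input symbol.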
State transitions If q, q' ∈ Q, s ∈ Σ, and δ(q, s) = q', then we say that q' is an s-successor of q, or that there is a transition from q to q' on input s, and we write q --s--> q'. Example: since δ(q_0, 1) = q_1, there is a transition from q_0 to q_1 on input 1, and we write q_0 --1--> q_1.
State sequences If a string of input symbols w = s_0 s_1 s_2 … s_{k-1} takes M from initial state q_0 to state q_k, namely q_0 --s_0--> q_1 --s_1--> q_2 --s_2--> q_3 … --s_{k-1}--> q_k, then we say that q_k is a w-successor of q_0, and write q_0 --w--> q_k. Also, q_0 q_1 q_2 … q_k is called an admissible state sequence for w.
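The admissible state sequence for a string can be computed by following the transition function one symbol at a time. This is a minimal sketch using the two-state example DFA; `state_sequence` and the dictionary encoding of δ are assumptions made for illustration.

```python
# Transition function of the two-state example DFA (dictionary encoding assumed).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def state_sequence(delta, start, w):
    """Return [q_0, q_1, ..., q_k], the states visited while reading w."""
    seq = [start]
    for s in w:
        seq.append(delta[(seq[-1], s)])
    return seq

print(state_sequence(DELTA, "q0", "1101"))
# ['q0', 'q1', 'q0', 'q0', 'q1']
```

The last element of the returned list is the w-successor of the start state.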
Strings accepted by a DFA Let M = (Q, Σ, δ, q_0, F) be a DFA, and let w = s_0 s_1 s_2 … s_{k-1} ∈ Σ* be a string over alphabet Σ. Then M accepts w if there exists an admissible state sequence q_0 q_1 q_2 … q_k for w, starting at initial state q_0 and ending with state q_k, where q_k ∈ F. That is, M accepts input string w if M ends up in one of the final states after reading all of w.
Language recognized by a DFA The language L(M) that is recognized by a DFA, M = (Q, Σ, δ, q_0, F), is the set of all strings accepted by M. That is, L(M) = { w ∈ Σ* | M accepts w } = { w ∈ Σ* | q_0 --w--> q_k, q_k ∈ F }. Example: For the previous DFA, L(M) is the set of all strings of 0s and 1s with odd parity, that is, with an odd number of 1s.
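Acceptance and the language L(M) can be explored by simulation. The sketch below runs the two-state example DFA and lists every accepted string up to a chosen length; the helper names `accepts` and `language_up_to` are hypothetical, and L(M) itself is infinite, so only a finite prefix of it is enumerated.

```python
from itertools import product

# Two-state example DFA (dictionary encoding assumed for illustration).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def accepts(delta, start, finals, w):
    """Run the DFA on w and report whether it ends in a final state."""
    q = start
    for s in w:
        q = delta[(q, s)]
    return q in finals

def language_up_to(delta, start, finals, alphabet, n):
    """Accepted strings of length <= n, ordered by length, then alphabetically."""
    return ["".join(t)
            for k in range(n + 1)
            for t in product(sorted(alphabet), repeat=k)
            if accepts(delta, start, finals, "".join(t))]

print(language_up_to(DELTA, "q0", {"q1"}, {"0", "1"}, 2))
# ['1', '01', '10']
```

Every listed string has an odd number of 1s, which matches the description of L(M) for this machine.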
DFA Example 2 Recognizer for 11*01*. [State diagram with states A, B, C, and D, where D is a trap state that loops on 0,1.]
DFA Example 2 M = (Q, Σ, δ, q_0, F), L(M) = 11*01*, Q = { q_0 = A, B, C, D }, Σ = { 0, 1 }, F = { C }. State table:
       0    1
A      D    B
B      C    B
C      D    C
D      D    D
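As a sanity check, the state table for Example 2 can be compared against a regular expression matcher on all short strings. This sketch uses Python's `re` module as the reference; `DELTA2`, `accepts2`, and `agrees_with_regex` are illustrative names, and agreement on strings up to a bounded length is evidence, not a proof.

```python
import re
from itertools import product

# State table of Example 2: DELTA2[(state, symbol)] -> next state.
DELTA2 = {
    ("A", "0"): "D", ("A", "1"): "B",
    ("B", "0"): "C", ("B", "1"): "B",
    ("C", "0"): "D", ("C", "1"): "C",
    ("D", "0"): "D", ("D", "1"): "D",   # D is the trap state
}

def accepts2(w, start="A", finals=frozenset({"C"})):
    """Simulate the Example 2 DFA on the string w."""
    state = start
    for s in w:
        state = DELTA2[(state, s)]
    return state in finals

def agrees_with_regex(max_len):
    """Compare the DFA with the pattern 11*01* on all strings up to max_len."""
    pattern = re.compile(r"11*01*")
    for k in range(max_len + 1):
        for t in product("01", repeat=k):
            w = "".join(t)
            if accepts2(w) != (pattern.fullmatch(w) is not None):
                return False
    return True

print(agrees_with_regex(8))  # True
```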
DFA Example 3 Modulo 3 counter. [State diagram with states A, B, and C; edges are labeled with the inputs 0, 1, 2, and R.]
DFA Example 3 M = (Q, Σ, δ, q_0, F), Q = { q_0 = A, B, C }, Σ = { 0, 1, 2, R }, F = { A }. State table:
       0    1    2    R
A      A    B    C    A
B      B    C    A    A
C      C    A    B    A
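Reading the table with A, B, C standing for remainders 0, 1, 2, the machine appears to track the sum of the digits seen since the last R, modulo 3. The sketch below encodes the table and checks that interpretation on sample inputs; the interpretation and the helper names `run` and `expected` are assumptions drawn from the table, not statements from the slides.

```python
# State table of the modulo 3 counter: TABLE[state][symbol] -> next state.
TABLE = {
    "A": {"0": "A", "1": "B", "2": "C", "R": "A"},
    "B": {"0": "B", "1": "C", "2": "A", "R": "A"},
    "C": {"0": "C", "1": "A", "2": "B", "R": "A"},
}

def run(w, state="A"):
    """Return the state reached from `state` after reading w."""
    for s in w:
        state = TABLE[state][s]
    return state

def expected(w):
    """Assumed meaning: digit sum since the last R, taken modulo 3."""
    tail = w.rsplit("R", 1)[-1]          # digits after the last reset
    return "ABC"[sum(int(c) for c in tail) % 3]

print(run("1221"), expected("1221"))  # A A
print(run("12R2"), expected("12R2"))  # C C
```

Since F = { A }, a string is accepted exactly when this running sum is divisible by 3.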
Regular Languages A language L ⊆ Σ* is called regular if there exists a DFA M such that L(M) = L. Earlier, we defined a language L ⊆ Σ* as regular if there exists a T3 or regular (left-linear or right-linear) grammar G such that L(G) = L. We shall prove that these two definitions are equivalent.
Operations on Regular Languages Let A and B be regular languages. Union: A ∪ B = { x | x ∈ A or x ∈ B }. Concatenation: AB = { xy | x ∈ A and y ∈ B }. Kleene Closure (A-star): A* = { x_1 x_2 x_3 ... x_k | k ≥ 0 and each x_i ∈ A }.
Examples of regular operations A = { good, bad }, B = { boy, girl }. A ∪ B = { good, bad, boy, girl }. AB = { goodboy, goodgirl, badboy, badgirl }. A* = { λ, good, bad, goodgood, goodbad, badgood, badbad, … }, where λ is the empty string.
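For finite languages the three operations can be computed directly on Python sets. This is a small sketch; because A* is infinite whenever A contains a nonempty string, the hypothetical helper `star_up_to` only builds strings made of at most k factors, and the empty string λ is represented by `""`.

```python
def union(a, b):
    """A ∪ B for languages given as sets of strings."""
    return a | b

def concat(a, b):
    """AB = { xy | x in A and y in B }."""
    return {x + y for x in a for y in b}

def star_up_to(a, k):
    """The part of A* built from at most k factors (A* itself may be infinite)."""
    result = {""}            # k = 0 contributes the empty string
    layer = {""}
    for _ in range(k):
        layer = concat(layer, a)
        result |= layer
    return result

A = {"good", "bad"}
B = {"boy", "girl"}
print(sorted(union(A, B)))   # ['bad', 'boy', 'girl', 'good']
print(sorted(concat(A, B)))  # ['badboy', 'badgirl', 'goodboy', 'goodgirl']
print(len(star_up_to(A, 2))) # 7
```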
Closure under Union If A and B are regular languages, then their union, A ∪ B, is a regular language.
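Union Machine M(A ∪ B) [State diagram: machine M(A) with states q0, q1, q2 and machine M(B) with states p0, p1, p2, combined using a new state r0; q1, q2, p1, and p2 are marked F.]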
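Closure under union can also be shown without leaving DFAs, using the product construction: run both machines in lockstep and accept when either one accepts. Note that this is a different technique from the union machine pictured on the slide. The sketch below assumes each DFA is a tuple (states, alphabet, delta, start, finals) over a shared alphabet; the two sample machines `ODD` and `END0` are made up for the demonstration.

```python
from itertools import product

def run(dfa, w):
    """Simulate a DFA given as (states, alphabet, delta, start, finals)."""
    _, _, delta, q, finals = dfa
    for s in w:
        q = delta[(q, s)]
    return q in finals

def union_dfa(m1, m2):
    """Product construction: states are pairs, final if either component is."""
    q1, sigma, d1, s1, f1 = m1
    q2, _, d2, s2, f2 = m2
    states = set(product(q1, q2))
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for (p, q) in states for a in sigma}
    finals = {(p, q) for (p, q) in states if p in f1 or q in f2}
    return states, sigma, delta, (s1, s2), finals

# Hypothetical sample machines: odd number of 1s, and strings ending in 0.
ODD = ({"e", "o"}, {"0", "1"},
       {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"},
       "e", {"o"})
END0 = ({"n", "z"}, {"0", "1"},
        {("n", "0"): "z", ("n", "1"): "n", ("z", "0"): "z", ("z", "1"): "n"},
        "n", {"z"})

U = union_dfa(ODD, END0)
print(run(U, "110"), run(U, "11"))  # True False
```

Changing `or` to `and` in the definition of `finals` would give a machine for the intersection instead.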
Closure under Concatenation If A and B are regular languages, then their concatenation, AB, is a regular language.
Concatenation Machine M(AB)
Closure under Kleene Star If A is a regular language, then the Kleene closure of A, A*, is also a regular language.
Kleene Closure Machine M(A*)
NFAs: Nondeterministic Finite Automata An NFA differs from a DFA in four ways: it may have lambda (λ) transitions; it may have more than one initial state; on input a, a state q may have no transition out; on input a, a state q may have more than one transition out.
NFAs A nondeterministic finite automaton M is a five-tuple M = (Q, Σ, R, I, F), where: Q is a finite set of states; Σ is the (finite) input alphabet; R is the transition relation, R ⊆ Q × Σ × Q; I ⊆ Q is the set of initial states; F ⊆ Q is the set of final states.
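An NFA can be simulated by tracking the set of all states it could currently be in. The sketch below represents the transition relation R as a set of (state, symbol, state) triples and ignores λ transitions for simplicity; the three-state machine for "strings containing 11" is a made-up example, and `nfa_accepts` is a hypothetical helper name.

```python
def nfa_accepts(R, I, F, w):
    """Accept w if some path from an initial state ends in a final state."""
    current = set(I)                      # all states reachable so far
    for s in w:
        current = {q2 for (q1, a, q2) in R if q1 in current and a == s}
    return bool(current & set(F))

# Hypothetical NFA for binary strings that contain 11 (no lambda transitions).
R = {
    ("p", "0", "p"), ("p", "1", "p"),     # wait in p
    ("p", "1", "q"),                      # guess that 11 starts here
    ("q", "1", "r"),
    ("r", "0", "r"), ("r", "1", "r"),     # 11 has been seen
}
I = {"p"}
F = {"r"}

print(nfa_accepts(R, I, F, "0110"))  # True
print(nfa_accepts(R, I, F, "0101"))  # False
```

If the set of current states becomes empty, every path has died and the string is rejected, which is how a missing transition is handled.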
Example NFAs NFA that recognizes the language 0*1 ∪ 1*0. NFA that recognizes the language (0 ∪ 1)*11(0 ∪ 1)*.
Converting NFAs to DFAs Given an NFA, M = (Q, Σ, R, I, F), build a DFA, M' = (Q', Σ, δ, S_0, F'), as follows. The states S_0, S_1, S_2, … of M' are sets of states of M. The initial state of M' is obtained by putting together all the initial states of M and all states reachable from those by λ transitions, and calling this set S_0.
Converting NFAs to DFAs For each state S_k already in Q' and for each input symbol a ∈ Σ, put together into a set S_j all states of M reachable from each state in S_k on input a. This set S_j may or may not already be in Q'; if it is not, add it to Q'. It may also be the empty set ∅. Add to δ the transition from S_k to S_j on input a. A state of M' belongs to F' if it contains at least one final state of M. Since there are only finitely many subsets of states of M, this procedure stops after a finite number of steps.
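The conversion procedure can be sketched as a worklist algorithm over sets of NFA states. Several details here are assumptions: λ transitions are labeled with the empty string, the λ-closure is also applied after each input step (the usual convention when λ transitions are present), DFA states are `frozenset`s, and the small NFA for 0*1 ∪ 1*0 is made up for the demonstration.

```python
from collections import deque

LAMBDA = ""  # assumption: the empty string labels lambda transitions

def closure(R, states):
    """All states reachable from `states` using only lambda transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in R:
            if p == q and a == LAMBDA and r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def subset_construction(sigma, R, I, F):
    """Build (states, delta, start, finals) of a DFA equivalent to the NFA."""
    start = closure(R, I)
    states, delta, todo = {start}, {}, deque([start])
    while todo:
        S = todo.popleft()
        for a in sorted(sigma):
            step = {r for (p, b, r) in R if p in S and b == a}
            T = closure(R, step)          # may be the empty set (dead state)
            delta[(S, a)] = T
            if T not in states:
                states.add(T)
                todo.append(T)
    finals = {S for S in states if S & set(F)}
    return states, delta, start, finals

def dfa_accepts(delta, start, finals, w):
    """Simulate the constructed DFA on w."""
    S = start
    for a in w:
        S = delta[(S, a)]
    return S in finals

# Hypothetical NFA for 0*1 ∪ 1*0 with two lambda moves out of the start state.
R = {("s", LAMBDA, "a"), ("s", LAMBDA, "b"),
     ("a", "0", "a"), ("a", "1", "f"),
     ("b", "1", "b"), ("b", "0", "g")}
states, delta, start, finals = subset_construction({"0", "1"}, R, {"s"}, {"f", "g"})
print(dfa_accepts(delta, start, finals, "001"))  # True
print(dfa_accepts(delta, start, finals, "010"))  # False
```

The loop terminates because each subset of NFA states is added to the worklist at most once.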
Example conversions Convert the NFA for the language (0 ∪ 1)*00 ∪ (0 ∪ 1)*11 to a DFA. [State diagram of the NFA with states A, B, C, D, E, F.]
State transition table of NFA
       0      1      λ
A     A,B     A      -
B      C      -      -
C      -      -      -
D      D     D,E     -
E      -      F      -
F      -      -      -
State table of DFA
             0          1
A,D         A,B,D      A,D,E
A,B,D       A,B,C,D    A,D,E
A,D,E       A,B,D      A,D,E,F
A,B,C,D     A,B,C,D    A,D,E
A,D,E,F     A,B,D      A,D,E,F
State diagram of DFA [State diagram with states AD, ABD, ADE, ABCD, ADEF.]
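The converted DFA can be checked by brute force against the language it is meant to recognize, the binary strings ending in 00 or 11. The sketch below assumes the DFA starts in AD, that AD goes to ABD on 0 and to ADE on 1 (and so on, following the subset construction from the NFA table), and that the accepting states are those whose names contain C or F. The slides do not list the final states explicitly, so that last point is an assumption.

```python
from itertools import product

# DFA obtained from the example NFA; state names follow the state diagram.
DFA_TABLE = {
    "AD":   {"0": "ABD",  "1": "ADE"},
    "ABD":  {"0": "ABCD", "1": "ADE"},
    "ADE":  {"0": "ABD",  "1": "ADEF"},
    "ABCD": {"0": "ABCD", "1": "ADE"},
    "ADEF": {"0": "ABD",  "1": "ADEF"},
}
# Assumption: C and F are the NFA's final states.
ACCEPTING = {name for name in DFA_TABLE if "C" in name or "F" in name}

def accepts(w, start="AD"):
    """Simulate the converted DFA on w."""
    state = start
    for s in w:
        state = DFA_TABLE[state][s]
    return state in ACCEPTING

def matches_language(max_len):
    """Check the DFA against 'ends in 00 or 11' on all strings up to max_len."""
    return all(accepts(w) == (w.endswith("00") or w.endswith("11"))
               for k in range(max_len + 1)
               for w in map("".join, product("01", repeat=k)))

print(matches_language(10))  # True
```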
Regular Expressions (r.e.) If a ∈ Σ, then the set a = {a} is an r.e. The set λ = {λ} is an r.e. The set ∅ = { } is an r.e. If R and S are r.e.s, then (R ∪ S) is an r.e. If R and S are r.e.s, then (RS) is an r.e. If R is an r.e., then (R)* is an r.e. Any r.e. is obtained by a finite number of applications of the above rules.
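The inductive definition maps naturally onto a small syntax tree with one node type per rule. The sketch below evaluates such a tree to the set of strings of length at most n that it denotes; the class names (`Sym`, `Lam`, `Empty`, `Union`, `Concat`, `Star`) and the function `lang` are illustrative assumptions, and the length bound is needed because the denoted language may be infinite.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:          # a single symbol a
    a: str

@dataclass(frozen=True)
class Lam:          # the empty string
    pass

@dataclass(frozen=True)
class Empty:        # the empty language
    pass

@dataclass(frozen=True)
class Union:        # (R ∪ S)
    r: object
    s: object

@dataclass(frozen=True)
class Concat:       # (RS)
    r: object
    s: object

@dataclass(frozen=True)
class Star:         # (R)*
    r: object

def lang(e, n):
    """Strings of length <= n in the language denoted by expression e."""
    if isinstance(e, Sym):
        return {e.a} if n >= 1 else set()
    if isinstance(e, Lam):
        return {""}
    if isinstance(e, Empty):
        return set()
    if isinstance(e, Union):
        return lang(e.r, n) | lang(e.s, n)
    if isinstance(e, Concat):
        return {x + y for x in lang(e.r, n) for y in lang(e.s, n)
                if len(x + y) <= n}
    if isinstance(e, Star):
        base = lang(e.r, n) - {""}        # empty factors add nothing new
        result, layer = {""}, {""}
        while layer:                      # extend until no new strings appear
            layer = {x + y for x in layer for y in base
                     if len(x + y) <= n} - result
            result |= layer
        return result
    raise TypeError(f"not a regular expression node: {e!r}")

# (0 ∪ 1)*11
ends_in_11 = Concat(Star(Union(Sym("0"), Sym("1"))), Concat(Sym("1"), Sym("1")))
print(sorted(lang(ends_in_11, 3), key=lambda w: (len(w), w)))
# ['11', '011', '111']
```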
REs and Regular Languages R.E.s are shorthand notation for regular languages.
Regex: REs in Unix [a-f], [^a-f] R*, R+, R? {R} RS R|S
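Most of these operators can be tried out in Python's `re` module, which is not a Unix tool but shares the character class, repetition, concatenation, and alternation syntax shown here. The patterns and test strings below are made-up examples, `matches` is a hypothetical helper, and the `{R}` form is left out because its intended meaning is not clear from the slide.

```python
import re

def matches(pattern, text):
    """True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text, whether a full match is expected)
EXAMPLES = [
    (r"[a-f]",  "c",    True),    # character class
    (r"[a-f]",  "g",    False),
    (r"[^a-f]", "g",    True),    # negated character class
    (r"ab*",    "abbb", True),    # R*: zero or more
    (r"ab+",    "a",    False),   # R+: one or more
    (r"ab?",    "ab",   True),    # R?: zero or one
    (r"ab|cd",  "cd",   True),    # R|S: alternation
]

for pattern, text, expected in EXAMPLES:
    print(pattern, text, matches(pattern, text) == expected)
```

Each printed line should end in `True`, meaning the matcher behaved as the table predicts.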
Minimization of DFAs Partition the states into subsets of equivalent (indistinguishable) states and merge each subset into a single state (Myhill-Nerode Theorem).
NFAs, DFAs, & Lexical Analyzer Generators Aho, Sethi, Ullman, “Compilers: Principles, Techniques, and Tools”: Sec 3.6: Finite Automata; Sec 3.7: From REs to NFAs (Thompson’s Construction); Sec 3.8: Design of a Lexical Analyzer Generator; Sec 3.9: Optimization of DFA-based Lexical Analyzers.