CPSC 121: Models of Computation 2008/9 Winter Term 2 DFAs in Depth Steve Wolfman, based on notes by Patrice Belleville and others
Lecture Prerequisites None.
Learning Goals: “Pre-Class” None.
Learning Goals: In-Class
By the end of this unit, you should be able to do the following for the exam:
- Build a DFA to recognize a particular language.
- Identify the language recognized by a particular DFA.
- Connect the graphical and formal representations of DFAs and illustrate how each part works.
Discuss point of learning goals.
Learning Goals: In-Class
By the end of this unit, you should be able to do the following for yourself:
- Use the formalisms that we’ve learned so far (especially power sets and set union) to illustrate that nondeterministic finite automata are no more powerful than deterministic finite automata.
- Demonstrate through a simple contradiction argument that there are interesting properties of programs that cannot be computed in general.
- Describe how the theoretical story 121 has been telling connects to the branch of CS theory called automata theory.
Discuss point of learning goals.
Outline
- Formally Specifying DFAs
- Designing DFAs
- Analyzing DFAs: Can DFAs Count?
- Non-Deterministic Finite Automata: Are They More Powerful than DFAs?
- Combining DFA Languages
- CS Theory and the Halting Problem
Reminder: Formal DFA Specification
- Q is the (finite) set of states.
- q0 is the start state, and q0 ∈ Q.
- F is the set of accepting states, and F ⊆ Q.
- Σ is the (finite) set of letters in the input language.
- δ: Q × Σ → Q is the transition function.
[Figure: example DFA diagram with arcs labelled a and b.]
One Way to Formalize DFA Operation
The algorithm “RunDFA” runs a given DFA on an input, resulting in “Yes”, “No”, or (if the input is invalid) “Huh?”
RunDFA((Q, q0, F, Σ, δ), s):
1. If s is empty, then: if q0 ∈ F, return “Yes”, else return “No”.
2. Otherwise, let s0 be the first element of s and srest be the rest of s.
3. If s0 ∉ Σ, return “Huh?”
4. Otherwise, let q' = δ(q0, s0). Return RunDFA((Q, q', F, Σ, δ), srest).
[Figure: example DFA diagram with arcs labelled a and b.]
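The recursion in RunDFA translates almost line for line into code. The following is a minimal Java sketch, not the course's implementation: the class and method names, the use of strings for states, and the example machine (strings over {a, b} that end in b) are all assumptions made for illustration. Q is left implicit as the key set of the transition table.

```java
import java.util.Map;
import java.util.Set;

// Sketch only: class, method, and state names are hypothetical, not from the slides.
final class RunDfaSketch {
    // delta maps each state to (letter -> next state) and is assumed to be total
    // over sigma. Q is left implicit as the key set of delta.
    static String runDfa(String q0, Set<String> f, Set<Character> sigma,
                         Map<String, Map<Character, String>> delta, String s) {
        if (s.isEmpty()) {
            return f.contains(q0) ? "Yes" : "No";
        }
        char s0 = s.charAt(0);
        String sRest = s.substring(1);
        if (!sigma.contains(s0)) {
            return "Huh?";
        }
        String qPrime = delta.get(q0).get(s0);
        // Same DFA, but "started" in q'; this mirrors step 4 of the algorithm.
        return runDfa(qPrime, f, sigma, delta, sRest);
    }

    // Assumed example machine: accepts strings over {a, b} that end in b.
    static String endsInB(String s) {
        Map<String, Map<Character, String>> delta = Map.of(
                "noB", Map.of('a', "noB", 'b', "sawB"),
                "sawB", Map.of('a', "noB", 'b', "sawB"));
        return runDfa("noB", Set.of("sawB"), Set.of('a', 'b'), delta, s);
    }

    public static void main(String[] args) {
        System.out.println(endsInB("abbab")); // Yes
        System.out.println(endsInB("aba"));   // No
        System.out.println(endsInB("abc"));   // Huh?
    }
}
```

Note that “Huh?” is only reported once the run reaches the invalid letter, exactly as in the recursive definition.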
Formal DFA Example
A DFA is a five-tuple: (Q, Σ, δ, q0, F).
[Figure: DFA diagram with arcs labelled “a,b”, “b”, and “a”.]
What is Q? Σ? q0? F?
What does this do? Let’s try it!
What is the problem with δ?
Formal DFA Example (Fixed)
A DFA is a five-tuple: (Q, Σ, δ, q0, F).
[Figure: corrected DFA diagram with arcs labelled “a,b”, “a,b”, “b”, “a”, “a”, and “b”.]
What is Q? Σ? δ? q0? F?
Running the DFA
How does the DFA runner work?
[Figure: the corrected DFA diagram from the previous slide.]
Input: abbab
Only partly on the exam. (You should be able to simulate a DFA, but needn’t use RunDFA.)
Strategy for Designing a DFA
To design a DFA for some language, try:
1. Should the empty string be accepted? If so, draw an accepting start state; else, draw a rejecting one.
2. From that state, consider what each letter would lead to. Group letters that lead to the same outcome. (Some may lead back to the start state if they’re no different from “starting over”.)
3. Decide for each new state whether it’s accepting.
4. Repeat steps 2 and 3 for new states as long as you have unfinished states, always trying to reuse existing states!
Also, as you go, give your states names that are as descriptive as you can make them, to help figure out when to reuse them!
Problem: Design a DFA to Recognize “Decimal Numbers”
Is this a decimal number?
Let’s build a DFA to check whether an input is a decimal number. Which of the following are decimal numbers?
3.5    2.011    -6    5.    -3.1415926536    .25    --3    3-5    5.2.5    .
What are the rules for whether something is a decimal number?
Is this a decimal number?
Rules for a decimal number: -?[0-9]+(\.[0-9]+)?
- Something followed by ? is optional.
- Anything followed by a + can be repeated (it appears one or more times).
- [0-9] is a digit, an element of the set {0, 1, …, 9}.
- “-” is literally “-”.
- “\.” means a literal “.” (period).
- () are used to group elements together.
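One way to check your answers for the candidate strings is to hand the same rule to a regular-expression library. This is a small Java sketch using the standard java.util.regex.Pattern class; the class name DecimalRegex and the method isDecimal are made up for illustration.

```java
import java.util.regex.Pattern;

// Sketch only: DecimalRegex and isDecimal are hypothetical names.
final class DecimalRegex {
    // The slide's rule; the backslash is doubled because it sits inside a Java string literal.
    static final Pattern DECIMAL = Pattern.compile("-?[0-9]+(\\.[0-9]+)?");

    static boolean isDecimal(String s) {
        // matches() succeeds only if the WHOLE string fits the pattern.
        return DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"3.5", "2.011", "-6", "5.", "-3.1415926536",
                               ".25", "--3", "3-5", "5.2.5", "."};
        for (String c : candidates) {
            System.out.println(c + " -> " + isDecimal(c));
        }
    }
}
```

Under this rule a number needs at least one digit on each side of the period, so “5.” and “.25” are rejected.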
Problem: Design a DFA to Recognize “Decimal Numbers” Problem: Design a DFA to recognize decimal numbers, defined using the regular expression: -?[0-9]+(\.[0-9]+)?
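Here is one possible solution written as code rather than as a diagram. It is a sketch under assumptions: the state names (START, SIGN, INT, DOT, FRAC, DEAD) and the class DecimalDfa are invented here, and other correct DFAs exist. Each state name describes what has been read so far, which is the naming habit the design strategy recommends.

```java
// Sketch only: one possible DFA for -?[0-9]+(\.[0-9]+)? with hypothetical state names.
final class DecimalDfa {
    enum State { START, SIGN, INT, DOT, FRAC, DEAD }

    // The transition function: exactly one next state for every (state, letter) pair.
    static State step(State q, char c) {
        boolean digit = c >= '0' && c <= '9';
        switch (q) {
            case START: return digit ? State.INT : (c == '-' ? State.SIGN : State.DEAD);
            case SIGN:  return digit ? State.INT : State.DEAD;
            case INT:   return digit ? State.INT : (c == '.' ? State.DOT : State.DEAD);
            case DOT:   return digit ? State.FRAC : State.DEAD;
            case FRAC:  return digit ? State.FRAC : State.DEAD;
            default:    return State.DEAD; // DEAD is a trap: nothing leads back out
        }
    }

    static boolean accepts(String s) {
        State q = State.START;
        for (char c : s.toCharArray()) {
            q = step(q, c);
        }
        // Accepting states: a complete integer part or a complete fractional part.
        return q == State.INT || q == State.FRAC;
    }

    public static void main(String[] args) {
        System.out.println(accepts("-3.1415926536")); // true
        System.out.println(accepts("5."));            // false
    }
}
```

The explicit DEAD state is what makes the transition function total: every “wrong” letter still has somewhere to go.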
More DFA Design Problems
- Design a DFA that recognizes words that have two or more contiguous sequences of vowels. (This is a rough stab at “multisyllabic” words.)
- Design a DFA that recognizes base 4 numbers with at least one digit in which the digits are in sorted order.
- Design a DFA that recognizes strings containing the letters a, b, and c in which all the a’s come before the first c and no two b’s neighbour each other.
Can a DFA Count?
Problem: Design a DFA to recognize a^n b^n: the language that includes all strings that start with some number of a’s and end with the same number of b’s.
DFAs Can’t Count
Let’s prove it. The heart of the proof is the insight that a DFA can only have a finite number of states and so can only “count up” a finite number of a’s. We proceed by contradiction. Assume some DFA recognizes the language a^n b^n.
DFAs Can’t Count
Assume DFA counter recognizes a^n b^n. counter must have some finite number of states k = |Q_counter|. Consider the input a^k b^k. counter must repeat (at least) one state as it processes the k a’s in that string. (It starts in q0 and transitions k times, so it visits k+1 states; if it didn’t repeat a state, those would be k+1 distinct states, which is greater than |Q_counter|.) (In fact, it must repeat the state it’s in just before the first b.)
DFAs Can’t Count
Consider the input a^k b^k. counter must repeat (at least) one state as it processes the k a’s in that string. counter accepts a^k b^k, meaning it ends in an accepting state.
[Figure: a chain of states linked by a arcs, with a loop back to an earlier state, followed by b arcs to an accepting state. Doesn’t necessarily look like this, but similar.]
DFAs Can’t Count
Now, consider the number of a’s that are processed between the first and second time visiting the repeated state. Call it r. (Note: r > 0.) Give counter a^(k-r) b^k instead.
[Figure: the same chain of states with the loop of r a arcs highlighted.]
DFAs Can’t Count
Give counter a^(k-r) b^k instead. Skipping the r a’s of the loop leaves counter in the same state just before the first b, so it processes the b’s exactly as before: counter must accept a^(k-r) b^k. But a^(k-r) b^k is not in the language a^n b^n, because r > 0 means k-r ≠ k. This is a contradiction! QED
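The pigeonhole step of this proof can be checked mechanically on any concrete DFA. The Java sketch below assumes a hypothetical table representation (states 0 to k-1, start state 0, delta[q][0] for the letter a and delta[q][1] for the letter b); the names run, loopLength, and example are illustrative and the example table is arbitrary. It finds r and confirms that a^k b^k and a^(k-r) b^k drive the machine to the same final state, so no choice of accepting states can separate them.

```java
import java.util.Arrays;

// Sketch only: the table encoding and all names here are assumptions for illustration.
final class CountingDemo {
    // Runs the DFA from state 0 on a^numA b^numB and returns the final state.
    static int run(int[][] delta, int numA, int numB) {
        int q = 0;
        for (int i = 0; i < numA; i++) q = delta[q][0];
        for (int i = 0; i < numB; i++) q = delta[q][1];
        return q;
    }

    // Reads a's until some state is visited a second time and returns r, the
    // number of a's read between the two visits. With k states, the k+1 visits
    // made while reading a^k must contain a repeat, so r is always found.
    static int loopLength(int[][] delta) {
        int k = delta.length;
        int[] seenAt = new int[k];
        Arrays.fill(seenAt, -1);
        int q = 0;
        for (int j = 0; j <= k; j++) {
            if (seenAt[q] >= 0) return j - seenAt[q];
            seenAt[q] = j;
            q = delta[q][0];
        }
        throw new IllegalStateException("unreachable: k+1 visits to k states must repeat");
    }

    // An arbitrary 4-state table, standing in for a supposed "counter".
    static int[][] example() {
        return new int[][] {{1, 3}, {2, 0}, {1, 3}, {3, 3}};
    }

    public static void main(String[] args) {
        int[][] delta = example();
        int k = delta.length;
        int r = loopLength(delta);
        System.out.println("r = " + r);
        System.out.println("same final state: " + (run(delta, k, k) == run(delta, k - r, k)));
    }
}
```

The same check succeeds for every table you substitute for example(), which is the content of the proof.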
Let’s Try Something More Powerful (?): NFAs
An NFA is like a DFA except:
- Any number (zero or more) of arcs can lead from each state for each letter of the alphabet.
- There may be many start states.
When we run a DFA, we put a finger on the current state and then read one letter of the input at a time, transitioning from state to state. When we run an NFA, we may need many fingers to keep track of many current states. When we transition, we go to all the possible next states.
NFAs Continued
An NFA is like a DFA except:
- Any number (zero or more) of arcs can lead from each state for each letter of the alphabet.
- There may be many start states.
A DFA accepts when we end in an accepting state. An NFA accepts when at least one of the states we end in is accepting.
Another way to think of NFAs An NFA is a non-deterministic finite automaton. Instead of doing one predictable thing at each state given an input, it may have choices of what to do. If one of the choices leads to accepting the input, the NFA will make that choice. (It “magically” knows which choice to make.)
Decimal Number NFA... Is Easy!
[Figure: NFA for -?[0-9]+(\.[0-9]+)? with states s1, s2, int, dot, and real, and arcs labelled “-” and “0-9”.]
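The “many fingers” way of running an NFA can be sketched in Java as a set of current states. The arcs below are one reading of the decimal-number NFA, reconstructed as an assumption because the diagram itself is not reproduced in these notes: s1 and s2 are both start states, and int and real are accepting. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: the arcs are an assumed reconstruction of the decimal-number NFA.
final class DecimalNfa {
    // All arcs out of one state on one letter: possibly none, possibly several.
    static Set<String> arcs(String state, char c) {
        boolean digit = c >= '0' && c <= '9';
        Set<String> next = new HashSet<>();
        switch (state) {
            case "s1":   if (c == '-') next.add("s2"); break;
            case "s2":   if (digit) { next.add("s2"); next.add("int"); } break;
            case "int":  if (c == '.') next.add("dot"); break;
            case "dot":  if (digit) next.add("real"); break;
            case "real": if (digit) next.add("real"); break;
            default:     break;
        }
        return next;
    }

    static boolean accepts(String input) {
        // One finger on each start state.
        Set<String> fingers = new HashSet<>(Set.of("s1", "s2"));
        for (char c : input.toCharArray()) {
            Set<String> moved = new HashSet<>();
            for (String q : fingers) {
                moved.addAll(arcs(q, c)); // go to ALL possible next states
            }
            fingers = moved;
        }
        // Accept if at least one finger rests on an accepting state.
        return fingers.contains("int") || fingers.contains("real");
    }

    public static void main(String[] args) {
        System.out.println(accepts("3.5"));  // true
        System.out.println(accepts(".25"));  // false
    }
}
```

A finger with no matching arc simply disappears, so no explicit dead state is needed.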
Are NFAs More Powerful than DFAs?
Problem: Prove that every language an NFA can recognize can be recognized by a DFA as well.
Strategy: Given an NFA, show how to build a DFA (i.e., Q, q0, F, Σ, and δ) that accepts/rejects the same strings.
Hint: Formally, what is the “group of states” an NFA can be in at any given time? Build the parts of the DFA based on that (i.e., Q is the set of all such things).
Worked Problem: Are NFAs More Powerful than DFAs? Problem: Prove that every language an NFA can recognize can be recognized by a DFA as well. WLOG, consider an arbitrary NFA “N”. We’ll build a DFA “D” that does the same thing. We now build each element of the five-tuple defining D. We do so in such a way that D faithfully simulates N’s operation.
Worked Problem: Are NFAs More Powerful than DFAs?
- N’s states are Q_N. Let Q_D = P(Q_N). (Each state in D represents being in a subset of the states in N.)
- N starts in some subset Q0_N of Q_N. Let q0_D = Q0_N. (D’s start state represents being in the set of start states in N.)
- N has accepting states F_N. Let F_D = {S ⊆ Q_N | S ∩ F_N ≠ ∅}. (A state in D is accepting if it represents a set of states in N that contains at least one accepting state.)
- N has an alphabet Σ_N. Let Σ_D = Σ_N. (They operate on the same strings.)
- N has zero or more arcs leading out of each state in Q_N for each letter in Σ_N. δ_D(q_D, s) operates on a state q_D that actually represents a set of states from Q_N. So: δ_D(q_D, s) is the union, over every q ∈ q_D, of the set of states that N’s arcs lead to from q on letter s. (D moves to the set of all states N could be in next.)
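The construction can be sketched in Java with sets of NFA states serving as DFA states. This is an illustration under assumptions, not the course's code: the map-based NFA encoding, all names, and the example NFA (a reading of the decimal-number NFA in which the letter d stands for any digit) are invented here. Rather than materialising all of P(Q_N), the sketch builds only the subsets reachable from the start set.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: the encoding and every name here are assumptions for illustration.
final class SubsetConstruction {
    // D's transition: the union of N's arcs over every state in the set qD.
    static Set<String> deltaD(Map<String, Map<Character, Set<String>>> deltaN,
                              Set<String> qD, char letter) {
        Set<String> union = new TreeSet<>();
        for (String q : qD) {
            union.addAll(deltaN.getOrDefault(q, Map.of()).getOrDefault(letter, Set.of()));
        }
        return union;
    }

    // A state of D accepts if it contains at least one accepting state of N.
    static boolean isAccepting(Set<String> qD, Set<String> fN) {
        return !Collections.disjoint(qD, fN);
    }

    // Builds D's transition table for the subsets reachable from the start set.
    static Map<Set<String>, Map<Character, Set<String>>> build(
            Map<String, Map<Character, Set<String>>> deltaN,
            Set<String> startN, Set<Character> sigma) {
        Map<Set<String>, Map<Character, Set<String>>> table = new HashMap<>();
        Deque<Set<String>> todo = new ArrayDeque<>();
        todo.add(new TreeSet<>(startN));
        while (!todo.isEmpty()) {
            Set<String> qD = todo.remove();
            if (table.containsKey(qD)) continue; // already expanded
            Map<Character, Set<String>> row = new HashMap<>();
            for (char letter : sigma) {
                Set<String> next = deltaD(deltaN, qD, letter);
                row.put(letter, next);
                todo.add(next);
            }
            table.put(qD, row);
        }
        return table;
    }

    // Assumed example NFA: decimal numbers, with 'd' standing for any digit.
    static Map<String, Map<Character, Set<String>>> decimalNfa() {
        Map<String, Map<Character, Set<String>>> deltaN = new HashMap<>();
        deltaN.put("s1", Map.of('-', Set.of("s2")));
        deltaN.put("s2", Map.of('d', Set.of("s2", "int")));
        deltaN.put("int", Map.of('.', Set.of("dot")));
        deltaN.put("dot", Map.of('d', Set.of("real")));
        deltaN.put("real", Map.of('d', Set.of("real")));
        return deltaN;
    }

    public static void main(String[] args) {
        System.out.println(
            build(decimalNfa(), Set.of("s1", "s2"), Set.of('-', '.', 'd')).keySet());
    }
}
```

For this five-state example, only six of the 2^5 = 32 possible subsets turn out to be reachable, including the empty set, which plays the role of a dead state.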
Decimal Number NFA... as a DFA
[Figure: DFA whose states are the sets {s1, s2}, {s2}, {s2, int}, {dot}, {real}, and { }, with arcs labelled “-”, “.”, “0-9”, and “else”.]
LOTS of states left off that are unreachable from the start state.
Combining Languages
Problem: Given a DFA that recognizes language A and a DFA that recognizes language B, can you build a DFA that recognizes:
- Any string in language A or B?
- Only strings in both languages A and B?
- Only strings that are composed of a string from A followed by a string from B?
Worked Problem: Combining Languages
Problem: Design a combined DFA “C” for any string in BOTH language A and B.
Strategy: build a DFA that “keeps track” of where it is in both DFA “A” and DFA “B”.
- Let Q_C = Q_A × Q_B.
- Let q0_C = (q0_A, q0_B).
- Let F_C = {(q_A, q_B) ∈ Q_C | q_A ∈ F_A ∧ q_B ∈ F_B}. (Only “double” states that accept in both DFA A and B.)
- Let Σ_C = Σ_A (which must also equal Σ_B).
- Let δ_C((q_A, q_B), s) = (δ_A(q_A, s), δ_B(q_B, s)). (Transition to the next “double” state by tracking where A and B would go on each part of the state.)
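This “double state” construction can be sketched in Java using a table-based DFA over the alphabet {a, b}. Everything named here is an assumption for illustration: the Dfa class, the encoding of the pair (q_A, q_B) as the single integer q_A * |Q_B| + q_B, and the two example machines (an even number of a’s, and strings ending in b).

```java
// Sketch only: the representation and all names are assumptions for illustration.
final class ProductDfaSketch {
    // A DFA over {a, b}: states are 0..n-1, the start state is 0,
    // delta[q][0] is the move on 'a' and delta[q][1] is the move on 'b'.
    static final class Dfa {
        final int[][] delta;
        final boolean[] accepting;

        Dfa(int[][] delta, boolean[] accepting) {
            this.delta = delta;
            this.accepting = accepting;
        }

        // Assumes s contains only the letters a and b.
        boolean accepts(String s) {
            int q = 0;
            for (char c : s.toCharArray()) q = delta[q][c - 'a'];
            return accepting[q];
        }
    }

    // Builds C with one state per pair; (qA, qB) is encoded as qA * nB + qB,
    // so the start pair (0, 0) is state 0 of C.
    static Dfa intersect(Dfa a, Dfa b) {
        int nA = a.delta.length, nB = b.delta.length;
        int[][] delta = new int[nA * nB][2];
        boolean[] accepting = new boolean[nA * nB];
        for (int qA = 0; qA < nA; qA++) {
            for (int qB = 0; qB < nB; qB++) {
                int qC = qA * nB + qB;
                // Accept only when both components accept.
                accepting[qC] = a.accepting[qA] && b.accepting[qB];
                for (int letter = 0; letter < 2; letter++) {
                    // Move each component the way its own DFA would move.
                    delta[qC][letter] = a.delta[qA][letter] * nB + b.delta[qB][letter];
                }
            }
        }
        return new Dfa(delta, accepting);
    }

    static Dfa evenAs() {
        return new Dfa(new int[][] {{1, 0}, {0, 1}}, new boolean[] {true, false});
    }

    static Dfa endsInB() {
        return new Dfa(new int[][] {{0, 1}, {0, 1}}, new boolean[] {false, true});
    }

    public static void main(String[] args) {
        Dfa c = intersect(evenAs(), endsInB());
        System.out.println(c.accepts("aab")); // true
        System.out.println(c.accepts("ab"));  // false
    }
}
```

Replacing && with || in the accepting test gives a DFA for strings in language A or B; the “A followed by B” case does not fall out of this construction so directly.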
Another Punch Line to the 121 Story: Automata Theory
NFAs are no more powerful than DFAs (although we might need 2^n states in a DFA to simulate an NFA with n states). One direction that CS theory goes is to explore what different models of computation are capable of and how they’re related. Are all models limited?
The Halting Problem
Remember our decision algorithms? On a given problem, they say “yes” or “no”. If we’re really smart, we can probably write a program to solve any given decision problem (in a finite amount of time), right? How about this one: given a program and its input, decide whether the program ever finishes (halts). In other words, does it avoid going into an infinite loop? A closely related question, whether an algorithm can decide every mathematical statement (the Entscheidungsproblem), was posed by David Hilbert in 1928 and was one of the deepest questions in math (and CS!) of its time.
The Halting Problem
OK, let’s say we write the code for this inside the method doesHalt:

    public static boolean doesHalt(String program, String input) {
        // Don’t know what goes here,
        // but we assume (for contradiction!) it exists.
    }
The Halting Problem
Now, remember Russell’s Paradox and the self-referential “set that contains all sets that do not contain themselves”? Let’s build a program that takes a program as input and...

    public static void main(String[] args) {
        // Assume the input is args[0], the first command-line argument.
        if (doesHalt(args[0], args[0])) {
            while (true) { }  // loop forever
        } else {
            return;
        }
    }

Let’s call our program Russ, for short. It halts if and only if its input, run on itself, wouldn’t halt.
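The Halting Problem
Now, run Russ with Russ itself as input. Does Russ halt under those circumstances or not? If Russ halts, then doesHalt(Russ, Russ) returned true, so Russ looped forever; if Russ doesn’t halt, then doesHalt returned false, so Russ returned immediately. Either way we have a contradiction, so doesHalt cannot exist. QED, Halt, Full Stop. The proof is due to Alan Turing (1936); its self-referential trick echoes Kurt Gödel’s incompleteness proof.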