Instructor: Aaron Roth

Presentation transcript:

CIS 262 Automata, Computability, and Complexity, Spring 2019
http://www.seas.upenn.edu/~cse262/
Instructor: Aaron Roth, aaroth@cis.upenn.edu
Lecture: February 25, 2019

Recap: Lower Bounds on State Complexity
Definition: Strings u and v are distinguishable with respect to a language L if there exists a string w such that only one of u.w and v.w is in L.
If strings u and v are distinguishable with respect to L, then a DFA for L cannot end up in the same state after reading u and after reading v.
If there is a set S of k strings such that every pair of strings in S is distinguishable, then a machine for L must have at least k states.
If there is an infinite set S of pairwise distinguishable strings, what can we conclude?

Proving Non-regularity
L = { w | count(w,a) = count(w,b) }
Consider S = { ε, a, aa, aaa, … } = { a^k | k >= 0 }.
S contains infinitely many strings.
S contains pairwise distinguishable strings: consider two strings a^i and a^j with i != j.
a^i.b^i has an equal number of a’s and b’s, so it is in L.
a^j.b^i has an unequal number of a’s and b’s, so it is not in L.
Conclusion: no finite number of states suffices to accept L. For every number k, a DFA for L must have at least k states. No DFA can accept L, that is, L is not regular!
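
The distinguishing suffix used on this slide can be checked mechanically for small cases. The following is a minimal sketch, not part of the lecture: the helper names `in_L` and `distinguishes` and the bound `BOUND` are chosen here for illustration only, and a finite check is of course no substitute for the proof.

```python
def in_L(w: str) -> bool:
    """Membership in L = { w | count(w, a) = count(w, b) }."""
    return w.count("a") == w.count("b")


def distinguishes(u: str, v: str, w: str) -> bool:
    """True if exactly one of u.w and v.w is in L."""
    return in_L(u + w) != in_L(v + w)


# Assumption: we only test exponents below a small, arbitrary bound.
BOUND = 8

# For every pair i != j, the suffix b^i should separate a^i from a^j.
all_separated = all(
    distinguishes("a" * i, "a" * j, "b" * i)
    for i in range(BOUND)
    for j in range(BOUND)
    if i != j
)
print(all_separated)
```

The check mirrors the slide: a^i.b^i is balanced, while a^j.b^i is not whenever i != j.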

Regularity and Distinguishability
Theorem: If there exists an infinite set S of pairwise distinguishable (w.r.t. L) strings, then L is not regular.
Proof: Suppose S is an infinite set of pairwise distinguishable strings. To prove: L is not regular.
Assume to the contrary that L is regular. By definition, there exists a DFA M that accepts L.
We know that if there are k pairwise distinguishable strings, then k is a lower bound on the number of states of any DFA for L.
Hence, number of strings in S <= number of states of M.
So S cannot be infinite, a contradiction!
The converse of the theorem also holds! (We won’t prove it.)

Proving Non-regularity
To prove that a language L is not regular, identify a set S of strings such that
1. S is infinite
2. for every pair of distinct strings u and v in S, u and v are distinguishable w.r.t. L (that is, find a string w such that only one of u.w and v.w is in L)
The textbook contains an alternative method for showing that a language L is not regular, called the Pumping Lemma method (section 1.4).
You can use this method in your answers as long as you use it correctly.

Example
Language L = { w.w | w in {a,b}* }
L = { ε, aa, bb, abab, aaaa, baba, bbbb, … }
A string w belongs to L if w can be split into two identical halves.
Is L regular?
As the machine scans the input from left to right, is the amount of information that needs to be stored bounded a priori, independent of the current input?

Proving Non-regularity
L = { w.w | w in {a,b}* }
Consider S = { ε, a, aa, aaa, … } = { a^k | k >= 0 }.
S contains infinitely many strings.
S contains pairwise distinguishable strings: consider two strings a^i and a^j with i != j.
a^i.a^i can be split into two identical halves, so it is in L.
a^j.a^i cannot be split into two identical halves, so it is not in L.
Conclusion: L is not regular.
Is the proof correct?

Bug in the Proof
Consider two strings a^i and a^j with i != j.
a^i.a^i can be split into two identical halves, so it is in L.
a^j.a^i cannot be split into two identical halves, so it is not in L. This conclusion is false!
Note that this claim should hold for all values of i and j with i != j.
But consider the case when i = 4 and j = 2: the string a^2.a^4 is a^6, and it is in L!

Correct Proof of Non-regularity
L = { w.w | w in {a,b}* }
Consider S = { ε, a, aa, aaa, … } = { a^k | k >= 0 }.
S contains infinitely many strings.
S contains pairwise distinguishable strings: consider two strings a^i and a^j with i != j.
a^i.(b a^i b) = a^i b a^i b can be split into two identical halves, so it is in L.
Now consider a^j.(b a^i b) = a^j b a^i b. Since i != j, if we split this string into two parts of equal length, the first part cannot end with b, but the second part does. So the two parts cannot be identical, and the string is not in L.
Conclusion: L is not regular.
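
Both the bug in the first attempt and the corrected suffix can be tested on small cases. The sketch below is illustrative only: `in_L`, `distinguishes`, and the bound `BOUND` are names chosen here, not part of the lecture, and the finite check only supports the proof rather than replacing it.

```python
def in_L(s: str) -> bool:
    """Membership in L = { w.w | w in {a,b}* }: two identical halves."""
    half, remainder = divmod(len(s), 2)
    return remainder == 0 and s[:half] == s[half:]


def distinguishes(u: str, v: str, w: str) -> bool:
    """True if exactly one of u.w and v.w is in L."""
    return in_L(u + w) != in_L(v + w)


# The flawed suffix a^i fails for i = 4, j = 2: a^8 and a^6 are both in L.
flawed_ok = distinguishes("a" * 4, "a" * 2, "a" * 4)

# The corrected suffix b a^i b separates a^i from a^j for all i != j tested.
BOUND = 8  # assumption: small, arbitrary bound for the finite check
corrected_ok = all(
    distinguishes("a" * i, "a" * j, "b" + "a" * i + "b")
    for i in range(BOUND)
    for j in range(BOUND)
    if i != j
)
print(flawed_ok, corrected_ok)
```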

Modified Example
L = { w.w | w in {a}* }
Is L regular?
L = { ε, aa, aaaa, aaaaaa, … }
L = { w | w contains only a’s and has even length }
Regular!

Another Example
Σ = { a }, L = { w | length of w is a perfect square }
L = { ε, a, a^4, a^9, a^16, … }
Is L regular?

Proof of Non-regularity
Σ = { a }, L = { w | length of w is a perfect square }
Consider S = { ε, a, aa, aaa, … } = { a^k | k >= 0 }.
S contains infinitely many strings.
To show that S contains pairwise distinguishable strings, consider two strings a^i and a^j with i < j.
Goal: find a value p (that depends on i and j) such that i+p is a perfect square but j+p is guaranteed not to be a perfect square.
If we succeed, then a^i.a^p is in L, but a^j.a^p is not in L. Hence, strings a^i and a^j are distinguishable w.r.t. L, and so L is not regular.
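
The slide leaves the choice of p open. One choice that appears to work (an assumption made here, not taken from the slides) is p = j^2 - i: then i+p = j^2 is a perfect square, while j+p = j^2 + (j-i) lies strictly between j^2 and (j+1)^2 because 0 < j-i <= j < 2j+1. The sketch below checks this candidate on small values; the function names and the bound are hypothetical.

```python
import math


def is_square(n: int) -> bool:
    """True if n is a perfect square (n >= 0)."""
    root = math.isqrt(n)
    return root * root == n


def witness(i: int, j: int) -> int:
    """Candidate p for 0 <= i < j (assumed choice: p = j^2 - i)."""
    return j * j - i


BOUND = 30  # assumption: small, arbitrary bound for the finite check
witness_works = all(
    is_square(i + witness(i, j)) and not is_square(j + witness(i, j))
    for j in range(1, BOUND)
    for i in range(j)
)
print(witness_works)
```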

Decision Problems
Example problems:
Given a DFA M and an input string w, is w in L(M)?
Given a DFA M, is L(M) an empty set?
Given DFAs M and M’, are they equivalent, that is, L(M) = L(M’)?
Here the input itself is a machine, and we want to design a program/machine that analyzes the behavior or the semantics of the input machine to answer the specified question.
Such problems are (typically) solvable when the input machine is a DFA (or an NFA, or a regular expression).
The program that solves such a problem, though, is a general program (that is, not a DFA itself). We will describe it informally, but one can easily code it up (and also formalize it as a Turing machine).

Membership Problem
Given a DFA M and an input string w, is w in L(M)?
Initialize the state of M to its initial state.
For each symbol in w, update the state by applying the transition function.
If the state at the end is in F, w is in L(M), else not.
What is the time complexity of this program? Number of steps: linear in the length of w.
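
The membership procedure can be sketched in a few lines. The dictionary encoding of a DFA used here (keys `start`, `finals`, `delta`) and the example machine `EVEN_A` are assumptions made for illustration, not something the slides prescribe.

```python
def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = dfa["start"]                       # start in the initial state
    for symbol in w:                           # one transition per symbol
        state = dfa["delta"][(state, symbol)]
    return state in dfa["finals"]


# Hypothetical example: strings over {a, b} with an even number of a's.
EVEN_A = {
    "start": "even",
    "finals": {"even"},
    "delta": {
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
}

print(accepts(EVEN_A, "abba"))  # two a's
print(accepts(EVEN_A, "ab"))    # one a
```

The loop body runs once per input symbol, which matches the linear bound stated on the slide.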

Emptiness Problem
Given a DFA M, is L(M) non-empty? Does M accept some string?
Same as: is a final state reachable from the initial state by following a sequence of transitions? This is a graph reachability problem.
Solution: Compute the set Reach of states reachable from q0.
Reach initialized to { q0 }
Repeat
    If there is a state q in Reach and a symbol s such that q’ = d(q, s) is not in Reach, then add q’ to Reach
Until Reach does not change
If Reach contains a state in F, L(M) is non-empty, else it is empty.
Complexity: If M has n states, the loop executes at most n times.
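
The reachability computation can be sketched as follows. This version uses a worklist rather than the repeat-until loop on the slide, which is a common variation and not necessarily how the lecture implements it. The dictionary encoding of the DFA and the example machine are assumptions made for illustration.

```python
def reachable_states(dfa):
    """Return the set of states reachable from the initial state."""
    reach = {dfa["start"]}
    worklist = [dfa["start"]]
    while worklist:
        q = worklist.pop()
        for s in dfa["alphabet"]:
            nxt = dfa["delta"][(q, s)]
            if nxt not in reach:       # newly discovered state
                reach.add(nxt)
                worklist.append(nxt)
    return reach


def is_nonempty(dfa):
    """L(M) is non-empty iff some final state is reachable."""
    return bool(reachable_states(dfa) & dfa["finals"])


# Hypothetical example: the only final state, 2, is unreachable from 0.
M = {
    "alphabet": ["a"],
    "start": 0,
    "finals": {2},
    "delta": {(0, "a"): 1, (1, "a"): 0, (2, "a"): 2},
}
print(reachable_states(M), is_nonempty(M))
```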

Emptiness of Intersection
Given DFAs M and M’, is there a string that they both accept? Is the intersection of L(M) and L(M’) non-empty?
Step 1: Construct the product machine N that accepts the intersection of the two languages. Recall: state of N = (state of M, state of M’).
Step 2: Check if L(N) is non-empty.
Complexity? If M and M’ have n states each, then N has n^2 states. Step 2 is linear in the number of states of N. So the overall running time is quadratic in n.
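
A minimal sketch of the two steps, under stated assumptions: the product machine is explored on the fly (only reachable pairs are visited) instead of being built in full first, and both DFAs are assumed to share one alphabet. The helper `mod_counter` and the dictionary encoding are hypothetical, introduced only to make the example self-contained.

```python
def mod_counter(n, finals):
    """Hypothetical helper: DFA over {a} tracking (input length mod n)."""
    return {
        "alphabet": ["a"],
        "start": 0,
        "finals": set(finals),
        "delta": {(q, "a"): (q + 1) % n for q in range(n)},
    }


def intersection_nonempty(m1, m2):
    """Search the product machine for a pair of final states."""
    start = (m1["start"], m2["start"])
    seen, stack = {start}, [start]
    while stack:
        p, q = stack.pop()
        if p in m1["finals"] and q in m2["finals"]:
            return True                # both machines accept some string
        for s in m1["alphabet"]:       # assumes a shared alphabet
            nxt = (m1["delta"][(p, s)], m2["delta"][(q, s)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


even = mod_counter(2, {0})    # even length
odd = mod_counter(2, {1})     # odd length
mult3 = mod_counter(3, {0})   # length divisible by 3
print(intersection_nonempty(even, odd), intersection_nonempty(odd, mult3))
```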

Equivalence Problem
Given DFAs M and M’, are they equivalent? Is L(M) = L(M’)?
Note: two DFAs can look very different in terms of states and transitions but still be equivalent.
Observe: L(M) = L(M’) iff
1. there is no w that M accepts and M’ rejects, and
2. there is no w that M’ accepts and M rejects.
Condition 1 is equivalent to checking that the intersection of L(M) and ~L(M’) is empty: complement M’ to ~M’ (by toggling accept/reject states), then check if the intersection of L(M) and L(~M’) is empty.
Condition 2 is equivalent to checking that the intersection of L(~M) and L(M’) is empty.
Complexity: Quadratic
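
The two emptiness checks can be folded into a single pass over the product machine: a reachable pair (p, q) in which exactly one component is final corresponds to a string accepted by one machine and rejected by the other. The sketch below uses this combined variant, which is an implementation choice made here rather than the exact procedure on the slide. The helper `mod_counter` and the dictionary encoding are hypothetical.

```python
def mod_counter(n, finals):
    """Hypothetical helper: DFA over {a} tracking (input length mod n)."""
    return {
        "alphabet": ["a"],
        "start": 0,
        "finals": set(finals),
        "delta": {(q, "a"): (q + 1) % n for q in range(n)},
    }


def equivalent(m1, m2):
    """True iff no reachable product state has exactly one final component."""
    start = (m1["start"], m2["start"])
    seen, stack = {start}, [start]
    while stack:
        p, q = stack.pop()
        if (p in m1["finals"]) != (q in m2["finals"]):
            return False               # some string separates the machines
        for s in m1["alphabet"]:       # assumes a shared alphabet
            nxt = (m1["delta"][(p, s)], m2["delta"][(q, s)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True


even2 = mod_counter(2, {0})       # even length, 2 states
even4 = mod_counter(4, {0, 2})    # even length, 4 states
mult4 = mod_counter(4, {0})       # length divisible by 4
print(equivalent(even2, even4), equivalent(even2, mult4))
```

The first pair shows two DFAs with different numbers of states that accept the same language.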

Minimization Problem
Given a DFA M, find an equivalent DFA with the least number of states.
[Figure: a four-state DFA with states q0, q1, q2, q3 and transitions labeled a and b, and the three-state DFA with states q01, q2, q3 obtained from it.]
Is this DFA minimal, or can we find a smaller equivalent one?
States q0 and q1 are “equivalent” and hence can be merged to obtain a three-state DFA, which is minimal.

Distinguishable States
Consider DFA M = (Q, Σ, q0, F, d).
Two states q and q’ are distinguishable if there exists a string w such that only one of d*(q,w) and d*(q’,w) is in F.
That is, in state q, if the remaining input is w, the machine should accept, but in q’, with the same remaining input, it should reject (or vice versa).
Claim: Distinguishable states cannot be merged.
Almost true: it holds as long as q and q’ are reachable from the initial state.

Distinguishable States
Consider DFA M = (Q, Σ, q0, F, d).
Consider two states q and q’ that are reachable and distinguishable.
q is reachable, so there is a string u such that d*(q0,u) = q.
q’ is reachable, so there is a string u’ such that d*(q0,u’) = q’.
Since q and q’ are distinguishable, there is a string w such that only one of d*(q,w) and d*(q’,w) is in F.
So only one of u.w and u’.w is in L(M).
Thus u and u’ are distinguishable: in any machine equivalent to M, the states after processing u and u’ must be distinct, that is, states q and q’ cannot be merged.

Minimization Algorithm
Consider DFA M = (Q, Σ, q0, F, d).
Step 1: Compute the set of states in Q that are reachable from q0. Remove all unreachable states.
Step 2: Compute all pairs of distinguishable states.
Step 3: Merge equivalent (i.e. indistinguishable) states.

Example Minimization
Step 1: Find states reachable from q0, and remove the rest.
[Figure: DFA with states q0, q1, q2, q3, q4, q5, q6 and transitions labeled a and b.]

Computing Distinguishable States
Two states q and q’ are distinguishable if there is a string w that takes one to a final state and the other to a non-final state.
S := { (q, q’) | exactly one of q and q’ is in F }
    /* clearly such pairs are distinguishable by the empty string */
Repeat
    S := S U { (q, q’) | there exists s such that (d(q, s), d(q’, s)) is in S }
    /* suppose d(q, s) = p and d(q’, s) = p’; if string w distinguishes states p and p’, then the string s.w takes exactly one of q and q’ to a final state, so it distinguishes them */
Until S does not change
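
The fixed-point computation above can be sketched directly. In this sketch, pairs are stored as unordered `frozenset`s so that (q, q’) and (q’, q) are the same entry; that representation, the dictionary encoding of the DFA, and the helper `mod_counter` are assumptions made for illustration.

```python
from itertools import combinations


def mod_counter(n, finals):
    """Hypothetical helper: DFA over {a} tracking (input length mod n)."""
    return {
        "states": set(range(n)),
        "alphabet": ["a"],
        "start": 0,
        "finals": set(finals),
        "delta": {(q, "a"): (q + 1) % n for q in range(n)},
    }


def distinguishable_pairs(dfa):
    """Return the set S of distinguishable (unordered) state pairs."""
    finals, delta = dfa["finals"], dfa["delta"]
    pairs = list(combinations(sorted(dfa["states"]), 2))
    # Base case: exactly one state of the pair is final.
    S = {frozenset(pq) for pq in pairs if (pq[0] in finals) != (pq[1] in finals)}
    changed = True
    while changed:                     # repeat until S does not change
        changed = False
        for p, q in pairs:
            if frozenset((p, q)) in S:
                continue
            # Some symbol leads to an already-distinguished pair.
            if any(frozenset((delta[(p, s)], delta[(q, s)])) in S
                   for s in dfa["alphabet"]):
                S.add(frozenset((p, q)))
                changed = True
    return S


S_even = distinguishable_pairs(mod_counter(4, {0, 2}))
print(frozenset((0, 2)) in S_even, frozenset((0, 1)) in S_even)
```

In the example, states 0 and 2 are never separated, so they are equivalent, while 0 and 1 are separated already by the empty string.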

Computing Distinguishable States
Correctness of the algorithm:
Only distinguishable pairs are added to S.
If a pair of states is distinguishable, then it does get added to S (exercise: try proving this by induction on the length of w).
How many iterations of the loop are possible?

Example Minimization
Step 2: Compute the set of distinguishable state-pairs.
[Figure: DFA with states q0, q1, q2, q3, q4, q5 and transitions labeled a and b.]
S := { Pairs (q, q’) such that only one of them is final }
Illustration: a pair (q, q’) is in S if the two states have different colors.

Example Minimization
After one iteration of the loop: q3 can be distinguished from q0, q1, and q2.
[Figure: DFA with states q0, q1, q2, q3, q4, q5 and transitions labeled a and b.]

Example Minimization
Next iteration: no more pairs can be added to S.
That is: target states of transitions from states with the same color on the same symbol have the same color.
[Figure: DFA with states q0, q1, q2, q3, q4, q5 and transitions labeled a and b.]

Example Minimization
States q0, q1, q2 are all equivalent, and so are states q4, q5.
Merge equivalent states into a single one!
[Figure: the DFA with states q0, q1, q2, q3, q4, q5, and the merged DFA with states q012, q3, q45.]

Step 3: Building Minimal DFA
Q: set of reachable states of the given DFA M.
S: set of pairs of distinguishable states.
Two states are equivalent if they are not distinguishable.
Note: if p and q are equivalent, and q and r are equivalent, then so are states p and r.
Thus equivalence naturally leads to a “partitioning” of Q into equivalence classes.
In the previous example, Q is partitioned into three classes: {q0, q1, q2}, {q3}, {q4, q5}.

Step 3: Definition of Minimal(M)
States = partitions of Q (as defined by the notion of equivalence).
Initial state = the partition P containing the initial state of M.
Final states = a partition P is final if all its states are final states of M.
Transition function: if d(q, s) = q’ in M, state q is in partition P, and state q’ is in partition P’, then d’(P, s) = P’ in Minimal(M).
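
The three steps can be put together in one short sketch. It is a straightforward rendering of the definitions above and is not tuned for efficiency. The dictionary encoding of the DFA, the use of `frozenset`s as names for partitions, and the example machine `M` are assumptions made for illustration.

```python
from itertools import combinations


def minimize(dfa):
    """Return Minimal(M); each new state is a frozenset of old states."""
    alphabet, delta, finals = dfa["alphabet"], dfa["delta"], dfa["finals"]
    # Step 1: keep only the states reachable from the initial state.
    reach, stack = {dfa["start"]}, [dfa["start"]]
    while stack:
        q = stack.pop()
        for s in alphabet:
            nxt = delta[(q, s)]
            if nxt not in reach:
                reach.add(nxt)
                stack.append(nxt)
    # Step 2: compute distinguishable pairs by fixed-point iteration.
    pairs = list(combinations(sorted(reach), 2))
    dist = {frozenset(pq) for pq in pairs
            if (pq[0] in finals) != (pq[1] in finals)}
    changed = True
    while changed:
        changed = False
        for p, q in pairs:
            if frozenset((p, q)) not in dist and any(
                frozenset((delta[(p, s)], delta[(q, s)])) in dist
                for s in alphabet
            ):
                dist.add(frozenset((p, q)))
                changed = True

    # Step 3: merge equivalent states into partitions.
    def block(q):
        return frozenset(r for r in reach if frozenset((q, r)) not in dist)

    blocks = {block(q) for q in reach}
    return {
        "states": blocks,
        "alphabet": alphabet,
        "start": block(dfa["start"]),
        "finals": {b for b in blocks if b <= finals},
        "delta": {(b, s): block(delta[(next(iter(b)), s)])
                  for b in blocks for s in alphabet},
    }


# Hypothetical example: even-length strings over {a}, with redundant states
# 2 and 3 and an unreachable state 4.
M = {
    "alphabet": ["a"], "start": 0, "finals": {0, 2},
    "delta": {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0, (4, "a"): 0},
}
minimal = minimize(M)
print(len(minimal["states"]))
```

Picking any representative of a partition to define the transition is safe because equivalent states move to equivalent states on every symbol.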

Properties of Minimization
All states in Minimal(M) are distinguishable from one another, so this is the smallest number of states possible!
If we start with a different DFA M’ such that L(M) = L(M’), and apply the minimization algorithm to M’, we will get exactly the same minimal DFA! (The state names may differ, but the structure is identical.)
The minimal DFA is canonical: it depends only on the language it accepts!

Regular Expression Compilation: Summary
Regular expression r -> ε-NFA M(r) -> DFA M’(r) -> DFA Minimal(M’(r))