Instructor: Aaron Roth aaroth@cis.upenn.edu CIS 262 Automata, Computability, and Complexity Spring 2019 http://www.seas.upenn.edu/~cse262/ Lecture: April 1, 2019
Recap: Problems about Turing Machines
Recognizable but undecidable:
-- Membership: Given M and w, does M accept w?
-- Halting: Given M and w, does M halt on w?
-- Non-emptiness: Given M, is there an input w that M accepts?
Unrecognizable:
-- Non-membership: Given M and w, does M not accept w?
-- Emptiness: Given M, is L(M) empty?
-- Equivalence: Given M and M’, do they accept exactly the same inputs?
Bottom line: Questions about the semantics of a TM, i.e., about its executions and the inputs it accepts, are undecidable.
What does it mean for real programs?
Checking whether a program is correct or buggy is undecidable.
Such a verifier would be greatly desirable for ensuring software reliability, but it does not exist!
[Diagram: a Program and a Correctness specification are fed to a Verifier, which answers yes/proof or no/bug]
More Undecidable Problems?
Questions about what a TM/program computes are undecidable. What other computational problems are undecidable?
-- Emptiness problem for Linear Bounded Automata
-- Post Correspondence Problem
-- Hilbert’s 10th Problem: finding integer roots of a polynomial
-- …
To prove that a problem is undecidable, we reduce a known undecidable problem such as ATM to it, but now the reduction proof is non-trivial.
Restricted Turing Machines?
Two models at the two ends of computing power:
1. Finite automata (DFA):
-- Number of states is independent of the input
-- Can check only for regular patterns in the input
-- Analysis questions (e.g. equivalence of DFAs) are solvable
2. Turing machines:
-- Unlimited number of tape cells
-- Model general-purpose computation
-- Analysis questions (e.g. does M accept w) are undecidable
Natural question: Is there a restriction of the TM model such that
a. its computational power is greater than that of regular languages, and
b. at least some analysis questions are decidable?
Linear Bounded Automata
[Tape diagram: cells a b b a b b a _ , machine in state q0]
Like a TM, but if the input w consists of k symbols, then it uses only the first k cells of the tape.
One way to enforce this restriction: on reading a blank, the TM must move left.
It can still go back and forth over the first k cells, using them as memory.
What can an LBA compute?
[Tape diagram: cells a b b a b b a _ , machine in state q0]
-- PALINDROMES = { w | w is a palindrome }: the TM we designed is an LBA
-- PRIMES = { 0^p | p is a prime number }
-- All regular languages can be accepted by an LBA
But it is not a general-purpose model of computation: an LBA can use only O(n) memory, where n is the length of the input.
Membership Problem for LBAs
[Tape diagram: cells a b b a b b a _ , machine in state q]
ALBA = { <M,w> | M is an LBA and M accepts w }
Does an LBA accept a given input?
On an input w with k symbols, how many distinct configurations are possible?
A configuration is given by the tape content, the state of M, and the position of the head.
So if |w| = k, there are at most |Q| · k · |Γ|^k distinct configurations.
Note: a standard TM on a given input w has infinitely many distinct configurations, since the number of tape cells it uses is unbounded.
Membership Problem for LBAs
ALBA = { <M,w> | M is an LBA and M accepts w }
Does an LBA accept a given input? Decidable!
If M accepts w, then it must accept within |Q| · |w| · |Γ|^|w| steps (otherwise some configuration repeats, and M loops forever).
So a decider can simulate M on w for that many steps and reject if M has not accepted by then.
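The decision procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the transition function is given as a dictionary `delta` (a hypothetical encoding), the head is simply clamped to the k input cells rather than bouncing off a blank, and a missing transition is treated as a reject. Instead of counting steps against the bound, the sketch remembers the configurations seen so far, which detects the same repetition.

```python
def lba_accepts(delta, q0, qa, qr, w, blank="_"):
    """Decide acceptance for an LBA by simulating until it halts or repeats.

    delta maps (state, symbol) -> (new_state, written_symbol, move),
    where move is "L" or "R" (a hypothetical encoding for this sketch).
    """
    tape = list(w) or [blank]      # assumption: empty input gets one blank cell
    state, head = q0, 0
    seen = set()                   # configurations visited so far
    while True:
        if state == qa:
            return True
        if state == qr:
            return False
        config = (state, head, tuple(tape))
        if config in seen:         # a repeated configuration means M loops forever
            return False
        seen.add(config)
        key = (state, tape[head])
        if key not in delta:       # assumption: a missing move rejects
            return False
        state, tape[head], move = delta[key]
        if move == "R":
            head = min(head + 1, len(tape) - 1)   # stay within the k cells
        else:
            head = max(head - 1, 0)


# Hypothetical machine: scan right and accept on the first b.
scan_for_b = {
    ("q0", "a"): ("q0", "a", "R"),
    ("q0", "b"): ("qa", "b", "R"),
}
print(lba_accepts(scan_for_b, "q0", "qa", "qr", "aab"))
print(lba_accepts(scan_for_b, "q0", "qa", "qr", "aaa"))
```

The loop always terminates because the set of seen configurations can never grow beyond the finite number of configurations available on k cells.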
Emptiness/Non-emptiness Problem for LBAs
NELBA = { <M> | M is an LBA and L(M) is non-empty }
Does a given LBA accept some input?
Recognizable (since an LBA is just a special kind of TM).
Is it decidable? Can we find a bound K depending on, say, the number of states of M, such that if M accepts some input, then it must accept an input of length <= K?
No! The undecidability proof is by reduction from the membership problem for TMs.
Encoding TM Execution as a String
Consider TM M = (Q, q0, qa, qr, Γ, Σ, _, δ).
Configuration C = u q v such that u, v are strings over Γ and q is in Q; that is, C is a string of the form Γ* . Q . Γ*
[Tape diagram: tape contents u1 u2 u3 … uk v1 v2 v3 … vm followed by blanks, machine in state q]
A finite execution of M on an input w is encoded as a string over Γ ∪ Q ∪ {#}:
# C0 # C1 # C2 # … # Ck #
where C0 = q0 w, and each configuration Ci+1 = Next(Ci) is obtained by executing one step of M.
Such a string encodes an accepting execution if the state in Ck equals qa.
Checking if a string encodes a TM execution
Consider TM M = (Q, q0, qa, qr, Γ, Σ, _, δ) and input w to M.
Consider the following problem: Given a string x over Γ ∪ Q ∪ {#}, does it encode an accepting execution of M on w?
That is, is it of the form
# C0 # C1 # C2 # … # Ck #
where each block between #’s is a configuration of M such that
1. C0 equals q0 w,
2. each Ci+1 = Next(Ci), and
3. Ck is an accepting configuration?
Checking if a string encodes a TM execution
[State diagram of TM M: states q0, q1, q3; transition labels a/c, R; b/a, L; a/R]
Input w = a b a a
x = # q0 a b a a # c q1 b a a # q3 c a a a # …
Claim: { x | x encodes the accepting execution of M on w } is decidable, and in fact can be decided by an LBA.
-- No additional memory is needed
-- It suffices to go back and forth on x, checking all requirements
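To make the three requirements concrete, here is a small Python sketch of such a check. It is plain Python rather than an LBA, so it only illustrates what has to be verified, not the memory bound. The token-tuple encoding, the dictionary `delta`, and the names `next_config` and `encodes_accepting_execution` are assumptions of this sketch. The first two transitions are read off the example string above; the last one, into qa, is hypothetical and added only so that the example has an accepting run.

```python
def next_config(config, delta, states, blank="_"):
    """One step of the TM on a configuration u q v, given as a tuple of tokens."""
    i = next(k for k, t in enumerate(config) if t in states)
    q = config[i]
    u, v = list(config[:i]), list(config[i + 1:])
    if not v:
        v = [blank]                      # head sits on a fresh blank cell
    if (q, v[0]) not in delta:
        return None                      # no move defined
    q2, b, move = delta[(q, v[0])]
    if move == "R":
        return tuple(u + [b, q2] + v[1:])
    if not u:
        return tuple([q2, b] + v[1:])    # assumption: head stays at the left end
    return tuple(u[:-1] + [q2, u[-1], b] + v[1:])


def encodes_accepting_execution(x, w, delta, states, q0, qa):
    """Check that the token tuple x has the form # C0 # C1 # ... # Ck #."""
    if len(x) < 2 or x[0] != "#" or x[-1] != "#":
        return False
    configs, current = [], []
    for token in x[1:]:
        if token == "#":
            configs.append(tuple(current))
            current = []
        else:
            current.append(token)
    if configs[0] != (q0, *w):                       # requirement 1
        return False
    if any(sum(t in states for t in c) != 1 for c in configs):
        return False                                 # each block has one state
    for c, c_next in zip(configs, configs[1:]):      # requirement 2
        if next_config(c, delta, states) != c_next:
            return False
    return qa in configs[-1]                         # requirement 3


states = {"q0", "q1", "q3", "qa"}
delta = {
    ("q0", "a"): ("q1", "c", "R"),
    ("q1", "b"): ("q3", "a", "L"),
    ("q3", "c"): ("qa", "c", "R"),   # hypothetical accepting step
}
w = ("a", "b", "a", "a")
x = tuple("# q0 a b a a # c q1 b a a # q3 c a a a # c qa a a a #".split())
print(encodes_accepting_execution(x, w, delta, states, "q0", "qa"))
```

The check compares each block with the exact successor of the previous one, so this sketch does not allow extra trailing blanks in a configuration.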
Undecidability of NELBA
ATM = { <M, w> | M is a TM and M accepts w }
NELBA = { <M> | M is an LBA and L(M) is non-empty }
Given TM M = (Q, q0, qa, qr, Γ, Σ, _, δ) and input w to M, construct an LBA M’ with input alphabet Γ ∪ Q ∪ {#} such that M’ accepts a string x exactly when x encodes the accepting execution of M on w.
Thus, <M’> is in NELBA if and only if <M,w> is in ATM.
So ATM reduces to NELBA, and NELBA is undecidable.
Post Correspondence Problem (PCP)
Input: List of pairs of strings
Pair No.   Left    Right
1          a       aaa
2          abaaa   ab
3          ab      b
Consider two strings Left and Right, initially both empty.
In each round, pick a pair, and extend Left by the first string in the pair and Right by the second string.
Goal: Make Left and Right equal (at least one round must be played).
Question: Is there a way to achieve the goal?
Pick pair 2: Left = abaaa, Right = ab
Pick pair 1: Left = abaaaa, Right = abaaa
Pick pair 1: Left = abaaaaa, Right = abaaaaaa
Pick pair 3: Left = abaaaaaab, Right = abaaaaaab
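Another PCP Example
Pair No.   Left   Right
1          ab     aba
2          baa    aa
3          aba    baa
Search tree of (Left, Right) values, starting from (ε, ε):
(ε, ε)
   pick 1: (ab, aba)
      pick 1: (abab, abaaba)
      pick 2: (abbaa, abaaa)
      pick 3: (ababa, ababaa)
   pick 2: (baa, aa)
   pick 3: (aba, baa)
Convince yourself that it is impossible to achieve Left = Right.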
Post Correspondence Problem (PCP)
Input: List of pairs of strings (u_i, v_i), for i = 1, 2, …, k
Is there a non-empty sequence of integers i_1, i_2, …, i_n such that u_{i_1} u_{i_2} … u_{i_n} = v_{i_1} v_{i_2} … v_{i_n}?
Examples:
-- Input = { (a, aaa), (abaaa, ab), (ab, b) }. Answer: Yes, the desired sequence is 2, 1, 1, 3
-- Input = { (ab, aba), (baa, aa), (aba, baa) }. Answer: No
Question: Can we write a computer program to solve PCP?
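Checking a proposed sequence is the easy part, as the short Python sketch below shows; the hard question is whether a sequence exists at all. The function name `is_pcp_solution` and the list-of-tuples encoding are choices made for this illustration.

```python
def is_pcp_solution(pairs, sequence):
    """Check whether a sequence of 1-based pair numbers makes Left equal Right."""
    if not sequence:                      # at least one round must be played
        return False
    left = "".join(pairs[i - 1][0] for i in sequence)
    right = "".join(pairs[i - 1][1] for i in sequence)
    return left == right


first_example = [("a", "aaa"), ("abaaa", "ab"), ("ab", "b")]
print(is_pcp_solution(first_example, [2, 1, 1, 3]))   # both sides are abaaaaaab
print(is_pcp_solution(first_example, [2, 1, 1]))      # Left is still one symbol short
```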
Recognizability of PCP
Pair No.   Left   Right
1          ab     aba
2          baa    aa
3          aba    baa
Search tree of (Left, Right) values, starting from (ε, ε):
(ε, ε)
   pick 1: (ab, aba)
      pick 1: (abab, abaaba)
      pick 2: (abbaa, abaaa)
      pick 3: (ababa, ababaa)
   pick 2: (baa, aa)
   pick 3: (aba, baa)
A TM/program can explore all possible solutions systematically, essentially by exploring all paths in a tree such as the one above.
If the answer is YES, this search is guaranteed to find a solution.
Conclusion: PCP is recognizable!
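A breadth-first version of this search might look like the Python sketch below. Two things are assumptions of the sketch rather than part of the argument: it prunes branches in which neither string is a prefix of the other (such branches can never be completed), and it stops after `max_rounds` levels so that the example terminates. A true recognizer would run without that bound and may therefore never halt on NO instances.

```python
from collections import deque


def pcp_search(pairs, max_rounds):
    """Breadth-first search for a PCP solution using at most max_rounds picks.

    Returns a list of 1-based pair numbers, or None if none is found
    within the bound (which does not prove that no solution exists).
    """
    queue = deque([((), "", "")])        # (picks so far, Left, Right)
    while queue:
        seq, left, right = queue.popleft()
        if len(seq) == max_rounds:
            continue
        for i, (u, v) in enumerate(pairs, start=1):
            new_left, new_right = left + u, right + v
            # Prune: one string must be a prefix of the other.
            if not (new_left.startswith(new_right)
                    or new_right.startswith(new_left)):
                continue
            new_seq = seq + (i,)
            if new_left == new_right:
                return list(new_seq)
            queue.append((new_seq, new_left, new_right))
    return None


yes_instance = [("a", "aaa"), ("abaaa", "ab"), ("ab", "b")]
no_instance = [("ab", "aba"), ("baa", "aa"), ("aba", "baa")]
print(pcp_search(yes_instance, 10))
print(pcp_search(no_instance, 20))
```

On the second instance the only surviving branch is pair 1 followed by pair 3 repeatedly, with Right always one a ahead of Left, so the bounded search returns None.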
PCP is Undecidable!
Input: List of pairs of strings (u_i, v_i), for i = 1, 2, …, k
Is there a non-empty sequence of integers i_1, i_2, …, i_n such that u_{i_1} u_{i_2} … u_{i_n} = v_{i_1} v_{i_2} … v_{i_n}?
The problem seems simple, and intuition says that if there is no solution, a program should be able to detect it.
But intuition is wrong in this case! The problem is equivalent to the halting problem!
To simplify the proof, consider a slight variant of PCP in which we require i_1 = 1 (that is, the solution sequence must pick the first pair in the first round).
The proof reduces the membership problem for TMs to this modified PCP.
Reduction from Membership Problem for TM
Consider TM M = (Q, q0, qa, qr, Γ, Σ, _, δ).
Configuration C = u q v such that u, v are strings over Γ and q is in Q; that is, C is a string of the form Γ* . Q . Γ*
An accepting execution of M on input w is encoded as a string over Γ ∪ Q ∪ {#}:
# C0 # C1 # C2 # … # Ck #
where C0 = q0 w, each configuration Ci+1 = Next(Ci) is obtained by executing one step of M, and the state in Ck equals qa.
Goal: Construct a list of pairs as input to PCP such that Left and Right can be made equal if and only if there is an accepting execution of M on w.
Reduction from Membership Problem for TM
Consider TM M = (Q, q0, qa, qr, Γ, Σ, _, δ) and input string w for M.
First pair for PCP: ( #, # q0 w # )
In the modified version, a solution must start with the first pair, so initially Left = # and Right = # q0 w #
Intuitively, Left and Right both encode sequences of configurations of M, capturing the execution of M on w, but Left is one configuration behind Right.
Reduction from Membership Problem for TM
TM M = (Q, q0, qa, qr, Γ, Σ, _, δ) and input string w for M
First pair: ( #, # q0 w # )
As Left and Right get extended, they look like:
Left  = # C0 # C1 # … # Cn #
Right = # C0 # C1 # … # Cn # Cn+1 #
The pairs in the PCP input are chosen so that the only way to continue is to extend Left by Cn+1 and Right by the configuration Cn+2, obtained by executing one step of the TM in configuration Cn+1.
Sample Extension: Right Move
For example, suppose Γ = { a, b } and δ(q3, a) = (q5, b, R).
Suppose Left and Right look like
Left  = # … #
Right = # … # a b a a q3 a b #
The PCP list contains the following pairs: (a, a), (b, b), (q3 a, b q5)
In our solution, the only way to extend the sequences leads to:
Left  = # … # a b a a q3 a b #
Right = # … # a b a a q3 a b # a b a a b q5 b #
Sample Extension: Left Move
Suppose δ(q5, b) = (q2, b, L), and Left and Right look like
Left  = # … #
Right = # … # a b a a b q5 b #
The PCP list contains (b q5 b, q2 b b), besides (a, a) and (b, b).
In our solution, the only way to extend the sequences leads to:
Left  = # … # a b a a b q5 b #
Right = # … # a b a a b q5 b # a b a a q2 b b #
Intuition: The successor of a configuration “u q v” looks almost the same as the configuration itself, except for the symbols just before and just after q, which change according to the transitions of the TM.
Encoding Acceptance
Left and Right look like
Left  = # C0 # C1 # … # Cn #
Right = # C0 # C1 # … # Cn # Cn+1 #
All pairs so far ensure that Left is shorter than Right.
When can Left “catch up” with Right? When the state in the configuration equals the accepting state qa.
Add pairs: (a qa, qa) and (qa a, qa) for every tape symbol a
Encoding Acceptance
Suppose Left and Right look like
Left  = # … #
Right = # … # a b qa b #
We have pairs: (a qa, qa), (qa a, qa), (b qa, qa), (qa b, qa)
The above can be extended to
Left  = # … # a b qa b #
Right = # … # a b qa b # a b qa #
and finally to
Left  = # … # a b qa b # a b qa # a qa #
Right = # … # a b qa b # a b qa # a qa # qa #
Add also the pair (qa # #, #) to ensure the final match.
Reduction from Membership Problem for TM
Given TM M = (Q, q0, qa, qr, Γ, Σ, _, δ) and input string w for M, consider the following input to PCP:
-- First pair: ( #, # q0 w # )   /* sets up initial configuration */
-- For each a in Γ: (a, a)   /* allows copying of tape cells */
-- For each q in Q and a in Γ:
   if δ(q, a) = (q’, b, R) then (q a, b q’)   /* simulates right move */
   if δ(q, a) = (q’, b, L) then for each c in Γ, (c q a, q’ c b)   /* simulates left move */
-- (#, #)   /* copies the separator, as in the sample extensions */
-- (#, _ #)   /* allows expanding tape */
-- For each a in Γ: (a qa, qa) and (qa a, qa)   /* allows Left to catch up */
-- (qa # #, #)   /* allows Left to finally match */
PCP has a solution (starting with the first pair) if and only if M accepts w.
So ATM reduces to PCP, and PCP is undecidable.
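The construction is mechanical enough to write down as a short Python sketch. The encoding is an assumption of this illustration: each side of a pair is a tuple of tokens (so that multi-character state names such as q5 stay intact), the transition function is a dictionary `delta`, and `build_mpcp_pairs` is a hypothetical name. The sketch also includes the plain separator pair (#, #), which the sample extensions rely on when a configuration is copied without growing.

```python
def build_mpcp_pairs(tape_alphabet, delta, q0, qa, w, blank="_"):
    """Build the pair list for the modified PCP from a TM and an input w.

    delta maps (state, symbol) -> (new_state, written_symbol, move),
    with move "L" or "R". Each side of a pair is a tuple of tokens.
    """
    pairs = [(("#",), ("#", q0, *w, "#"))]            # initial configuration
    for a in tape_alphabet:
        pairs.append(((a,), (a,)))                    # copy a tape cell
    for (q, a), (q2, b, move) in delta.items():
        if move == "R":
            pairs.append(((q, a), (b, q2)))           # right move
        else:
            for c in tape_alphabet:
                pairs.append(((c, q, a), (q2, c, b))) # left move
    pairs.append((("#",), ("#",)))                    # copy the separator
    pairs.append((("#",), (blank, "#")))              # expand the tape
    for a in tape_alphabet:
        pairs.append(((a, qa), (qa,)))                # Left catches up
        pairs.append(((qa, a), (qa,)))
    pairs.append(((qa, "#", "#"), ("#",)))            # final match
    return pairs


# Transitions taken from the two sample extensions; q0, qa and w are placeholders.
delta = {("q3", "a"): ("q5", "b", "R"), ("q5", "b"): ("q2", "b", "L")}
pairs = build_mpcp_pairs(["a", "b", "_"], delta, "q0", "qa", ("a", "b"))
for left, right in pairs[:3]:
    print(" ".join(left), "|", " ".join(right))
```

The number of pairs grows linearly with the number of transitions and with the size of the tape alphabet, so the reduction itself is easy to compute; all of the difficulty is pushed into solving the resulting PCP instance.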