Undecidable problems for CFGs

Slides:



Advertisements
Similar presentations
CSCI 3130: Formal languages and automata theory Tutorial 9
Advertisements

INHERENT LIMITATIONS OF COMPUTER PROGRAMS CSci 4011.
INHERENT LIMITATIONS OF COMPUTER PROGRAMS CSci 4011.
February 6, 2015CS21 Lecture 141 CS21 Decidability and Tractability Lecture 14 February 6, 2015.
1 Section 14.1 Computability Some problems cannot be solved by any machine/algorithm. To prove such statements we need to effectively describe all possible.
Reducibility A reduction is a way of converting one problem into another problem in such a way that a solution to the second problem can be used to solve.
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Regular.

Fall 2006Costas Busch - RPI1 Pushdown Automata PDAs.
CS5371 Theory of Computation Lecture 10: Computability Theory I (Turing Machine)
CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG)
Fall 2005Costas Busch - RPI1 Pushdown Automata PDAs.
CS5371 Theory of Computation Lecture 12: Computability III (Decidable Languages relating to DFA, NFA, and CFG)
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong DFA minimization.
CS490 Presentation: Automata & Language Theory Thong Lam Ran Shi.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Turing Machines.
CSCI 3130: Formal languages and automata theory Andrej Bogdanov The Chinese University of Hong Kong Turing.
 2005 SDU Lecture13 Reducibility — A methodology for proving un- decidability.
Turing Machine. Turing machine The mathematical models (FAs, TGs, PDAs) that have been discussed so far can decide whether a string is accepted or not.
Lecture 9  2010 SDU Lecture9: Turing Machine.  2010 SDU 2 Historical Note Proposed by Alan Turing in 1936 in: On Computable Numbers, with an application.
CSCI 3130: Formal languages and automata theory Andrej Bogdanov The Chinese University of Hong Kong Undecidable.
CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Pushdown.
1 Section 13.1 Turing Machines A Turing machine (TM) is a simple computer that has an infinite amount of storage in the form of cells on an infinite tape.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Pushdown.
CSCI 3130: Formal languages and automata theory Tutorial 7 Chin.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Undecidable.
CSCI 3130: Formal languages and automata theory Andrej Bogdanov The Chinese University of Hong Kong Decidable.
CS 154 Formal Languages and Computability April 28 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron Mak.
Theory of Computation Automata Theory Dr. Ayman Srour.
Lecture 11  2004 SDU Lecture7 Pushdown Automaton.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Undecidable.
February 1, 2016CS21 Lecture 121 CS21 Decidability and Tractability Lecture 12 February 1, 2016.
TMs for {a2n | n  0} {a2n | n  0}
More variants of Turing Machines
PDAs Accept Context-Free Languages
Ambiguity Parsing algorithms
Turing Machines.
CS21 Decidability and Tractability
Theorem 29 Given any PDA, there is another PDA that accepts exactly the same language with the additional property that whenever a path leads to ACCEPT,
Intro to Theory of Computation
Pushdown Automata PDAs
Pushdown Automata PDAs
Pushdown Automata PDAs
Pushdown Automata PDAs
Undecidable problems for CFGs
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
Reducibility The Chinese University of Hong Kong Fall 2010
LR(0) grammars The Chinese University of Hong Kong Fall 2010
Pushdown automata and CFG ↔ PDA conversions
LR(1) grammars The Chinese University of Hong Kong Fall 2010
Chapter 5 Pushdown Automata
Decidability Turing Machines Coded as Binary Strings
Decidability Turing Machines Coded as Binary Strings
Decidable and undecidable languages
Reducability Sipser, pages
CS21 Decidability and Tractability
More undecidable languages
Recap lecture 44 Decidability, whether a CFG generates certain string (emptiness), examples, whether a nonterminal is used in the derivation of some word.
Decidability and Tractability
LR(1) grammars The Chinese University of Hong Kong Fall 2011
Variants of Turing Machines
… NPDAs continued.
CS 461 – Nov. 14 Section 5.2 – Undecidable problems starting from ATM:
Pushdown automata The Chinese University of Hong Kong Fall 2011
Subject Name: FORMAL LANGUAGES AND AUTOMATA THEORY
More undecidable languages
Normal forms and parsing
Automata, Grammars and Languages
Presentation transcript:

Undecidable problems for CFGs The Chinese University of Hong Kong Fall 2010 CSCI 3130: Formal languages and automata theory Undecidable problems for CFGs Andrej Bogdanov http://www.cse.cuhk.edu.hk/~andrejb/csc3130

Decidable vs. undecidable “DFA M accepts w” “TM M accepts w” “CFG G generates w” “TM M halts on w” “DFAs M and M’ accept same inputs” “TM M accepts some input” “TM M and M’ accept same inputs” “CFG G generates all inputs” “CFG G is ambiguous” more? ?

Representing computations L1 = {w%w: w ∈{a, b}*} a/aR b/bR x/xR %/%R a/aL abbaa%abbaa q0 q1 q2 b/bL a/aL a/xL b/bL xbbaa%abbaa q1 x/xR x/xL a/xR xbbaa%abbaa ... q1 %/%R ☐/☐R %/%R q0 q7 qa q5 q6 b/xR b/xL xbbaa%abbaa q2 q3 q4 xbbaa%xbbaa q5 %/%R ... xxxxx%xxxxx qa a/aR x/xR b/bR x/xR

Configurations A configuration consists of the current state, the head position, and tape contents configuration … a b ☐ q1 abq1a q1 qacc a/bR … a b ☐ qacc abbqacc

Computation histories a/aR b/bR x/xR q0a0b0%0a0b0 x0q1b0%0a0b0 x0b6q1%0a0b0 x0b6%0q2a0b0 x0b6q5%0x1b0 x0q6b0%0x0b0 %/%R a/aL q1 q2 b/bL a/aL a/xL b/bL x/xR x/xL a/xR %/%R ☐/☐R %/%R q0 q7 qa q5 q6 b/xR b/xL ... q3 q4 %/%R x0x0%0x0x0q7 x0x0%0x0x0☐aqa a/aR x/xR b/bR x/xR computation history

Computation histories as strings If M halts on w, the computation history of (M, w) is the sequence of configurations C1, ..., Cl that M goes through on input w. q0a0b0%0a0b0 x0q1b0%0a0b x0x0%0x0x0q7 x0x0%0x0x0qa ... #q0ab%ab#xq1b%ab# . . . #xx%xx☐qa# C1 C2 Cl The computation history can be written as a string hist over alphabet ∪Q∪{#} accepting history: M accepts w qacc occurs in hist rejecting history: M rejects w qrej occurs in hist

Undecidable problems for CFGs ALLCFG = {〈G〉: G is a CFG that generates all strings} The language ALLCFG is undecidable. We will argue that If ALLCFG can be decided, so can ATM.

Undecidable problems for CFGs accept if G generates all strings reject if not accept if M rej/loops w construct G A 〈G〉 〈M, w〉 reject if M accepts w G generates all strings if M rejects or loops on w G fails to generate some string if M accepts w

Undecidable problems for CFGs G fails to generate some string 〈M, w〉 construct G 〈G〉 M accepts w The alphabet of G will be ∪Q∪{#} G will generate all strings except the computation history of (M, w), if it is accepting First we construct a PDA P, then convert it to CFG G

Undecidability via computation histories candidate computation history hist of (M, w) P accept everything except accepting hist #q0ab%ab#xq1b%ab# . . . #xx%xx☐qa# reject P: On input hist, // try to spot a mistake in hist If hist is not of the form #w1#w2#...#wk#, accept. q0 e If w1 ≠ q0w or wk does not contain qa, accept. If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept. Otherwise, hist is an accepting history, so reject.

Computation is local #0q0a0b0%0a0b0 #0x0q1b0%0a0b0 #0x0b6q1%0a0b0 a/aR b/bR x/xR #0q0a0b0%0a0b0 #0x0q1b0%0a0b0 #0x0b6q1%0a0b0 #0x0b6%0q2a0b0 #0x0b6q5%0x1b0 #0x0q6b0%0x0b0 %/%R a/aL q1 q2 b/bL a/aL a/xL b/bL x/xR x/xL a/xR %/%R ☐/☐R %/%R q0 q7 qa q5 q6 b/xR b/xL ... q3 q4 %/%R #0x0x0%0x0x0q7 #0x0x0%0x0x0☐aqa a/aR x/xR b/bR x/xR Changes between configurations always occur around the head

Valid and invalid transition windows valid windows invalid windows … 6a3b0x0 … … 0a6b0x0 …0 … 6q2a0b0 … … 0a6b0q2 …0 … 6a3q2a0 … … 0q5a6x0 …0 … 6q2q2a0 … … 0q2q2x3 …0 q2 a/xL … 6a3b0a0 … … 0a6b0q5 …0 … 6a3q2a0 … … 0q5a6b0 …0 q5 … 6a3a0☐0 … … 0x6a0☐0 …0 … 6a3q2a0 … … 0a6q5x0 …0

Implementing P If two consecutive blocks wi#wi+1 do not follow from the transitions of M, accept: #0x0b6%0q2a0b #0x0b6q5%0x1b For every position of wi: offset Remember offset from # in wi on stack Remember first row of window in state After reaching the next #: Pop offset from # from stack as you consume input Remember second row of window in state If window is invalid, accept; Otherwise reject.

Recap of computation history method ALLCFG = {〈G〉: G is a CFG that generates all strings} If ALLCFG can be decided, so can ATM. G accepts all strings except accepting computation histories of (M, w) We first construct a PDA P, then convert it to G 〈M, w〉 construct G 〈G〉

The Post Correspondence Problem Input: A set of tiles like this Given an infinite supply of such tiles, can you match top and bottom? bab cc c ab a ab baa a a baba bab e a ab baa a bab e c ab c ab bab cc a baba

Undecidability of PCP PCP = {〈T〉: T is a collection of tiles that contains a top-bottom match} The language PCP is undecidable. Proof: We will show that If PCP can be decided, so can ATM.

Undecidability of PCP 〈M, w〉 T (collection of tiles) M accepts w T contains a match Idea: Matches represent accepting histories #q0ab%ab#xq1b%ab#...#xx%xx☐qa# #q0ab%ab#xq1b%ab#...#xx%xx☐qa# e #q0ab%ab #q0a #xq1 b a % a b # xq1% x%q2 …

An assumption We will assume that one of the PCP tiles is marked as a starting tile Later we’ll see how to “simulate” the starting tile by an ordinary tile s bab cc c ab a ab baa a a baba

for each valid window with state qi in top middle Undecidability of PCP 〈M, w〉 T (collection of tiles) M accepts w T contains a match On input 〈M, w〉 we construct these tiles for PCP e #q0w s x1qix2 x3x4x5 for each valid window with state qi in top middle xx for all x in G∪{#} xqa qa ☐# qax ☐#qix1 ☐#x2x3 qa## #

Undecidability of PCP tile type purpose e #q0w s represents initial configuration represent valid transitions between configurations xx x1qix2 x3x4x5 add blank spaces before # if necessary ☐# ☐#qix1 ☐#x2x3 xqa qa qax qa## # complete match if computation accepts

Undecidability of PCP #q0a%ab#xq1%ab#...#xx%xxq7☐#xx%xx☐qa# accepting computation history #q0a b % ab# xq1b%ab#... #xx%x xq7☐ # #q0ab%ab #xq1 b % ab# ...#xx%xxq7☐ #xx%x x☐qa # s e #q0w x1qix2 x3x4x5 ☐#qix1 ☐#x2x3 ☐# xx

Undecidability of PCP Once the accepting state symbol occurs, the last two tiles can “eat up” the rest of the symbols #xx%xx ☐qa #xx%x xqa #xx%xqa#... # qa## #xx%xx☐qa #xx%xx qa #xx%x qa #...#qa # # xx xqa qa qax qa qa## #

for each valid window of this form Undecidability of PCP If M rejects on input w, then qrej appears on bottom at some point, but it cannot be matched on top If M loops on w, then matching keeps going forever s e #q0w a1qia3 b1b2b3 ☐#qia2 ☐#b1b2 ☐# xx xqa qa qax qa qa## # for each valid window of this form for all x in G∪{#}

Getting rid of the starting tile We assumed that one tile marked as starting tile We can remove assumption by changing tiles a bit s a aba ba bb b c cca a *a* *a*b*a b*a* *b*b b* *c c*c*a* *a  * “starting tile” begins with * “middle tiles” “ending tile” matches last *

Getting rid of the starting tile aba ba bb b c b c cca a *a* *a*b*a b*a* *b*b b* *c b* *c c*c*a* *a  * can only use as starting tile can only use to complete match

Ambiguity of CFGs AMB = {G: G is an ambiguous CFG} The language AMB is undecidable. We will argue that If AMB can be decided, so can PCP.

Ambiguity of CFGs T G Step 1: Number the tiles (collection of tiles) (CFG) If T can be matched, then G is ambiguous If T cannot be matched, then G is unambiguous Step 1: Number the tiles 1 2 3 bab cc c ab a ab

Ambiguity of CFGs T G Terminals: a, b, c, 1, 2, 3 Variables: S, T, B (collection of tiles) (CFG) Terminals: a, b, c, 1, 2, 3 Variables: S, T, B Productions: T → babT1 B → ccB1 T → cT2 B → abB2 T → aT3 B → abB3 bab cc c ab a 1 2 3 S → T | B

Ambiguity of CFGs T G Terminals: a,b,c,1,2,3 Variables: S, T, B 1 2 3 (collection of tiles) (CFG) Terminals: a,b,c,1,2,3 Variables: S, T, B 1 2 3 Productions: bab cc c ab a ab S → T | B T → babT1 T → cT2 T → aT3 T → bab1 T → c2 T → a3 B → ccB1 B → abB2 B → abB3 B → cc1 B → ab2 B → ab3

Ambiguity of CFGs Each sequence of tiles gives two derivations If the tiles match, these two derive the same string 1 2 2 bab cc c ab c ab S ⇒ T ⇒ babT1 ⇒ babcT21⇒ babcc221 S ⇒ B ⇒ ccB1 ⇒ ccabB21⇒ ccabab221

Ambiguity of CFGs T G ✓ ✓ Argue by contradiction: (collection of tiles) (CFG) ✓ If T can be matched, then G is ambiguous ✓ If T cannot be matched, then G is unambiguous Argue by contradiction: If G is ambiguous then ambiguity must look like this S T a1 n1 ai ni a2 n2 … B b1 m1 bj mj b2 m2 Then n1...ni = m1…mj So there is a match a1 b1 a2 b2 ai bi n1 n2 ni …