1
Welcome to CS154 Why Study Automata? What the Course is About Administrivia
2
How Could That Be? Regular expressions are used in many systems.
E.g., UNIX a.*b. E.g., DTD’s describe XML tags with a RE format like person (name, addr, child*). Finite automata model protocols, electronic circuits. Theory is used in model-checking.
3
How? – (2) Context-free grammars are used to describe the syntax of essentially every programming language. Not to forget their important role in describing natural languages. And DTD's, taken as a whole, are really CFG's.
4
How? – (3) When developing solutions to real problems, we often confront the limitations of what software can do. Undecidable things – no program whatever can do it. Intractable things – there are programs, but no fast programs. CS154 gives you the tools.
5
Other Good Stuff in CS154 We'll learn how to deal formally with discrete systems. Proofs: You never really prove a program correct, but you need to be thinking of why a tricky technique really works. We'll gain experience with abstract models and constructions, which model things like layered software architectures.
6
Course Outline Regular Languages and their descriptors:
Finite automata, nondeterministic finite automata, regular expressions. Algorithms to decide questions about regular languages, e.g., is it empty? Closure properties of regular languages.
7
Course Outline – (2) Context-free languages and their descriptors:
Context-free grammars, pushdown automata. Decision and closure properties.
8
Course Outline – (3) Recursive and recursively enumerable languages.
Turing machines, decidability of problems. The limit of what can be computed. Intractable problems. Problems that (appear to) require exponential time. NP-completeness and beyond.
9
Text Hopcroft, Motwani, Ullman, Introduction to Automata Theory, Languages, and Computation, 3rd Edition. Course covers essentially the entire book.
10
Finite Automata Motivation An Example
11
Informal Explanation Finite automata are finite collections of states with transition rules that take you from one state to another. Original application was sequential switching circuits, where the “state” was the settings of internal bits. Today, several kinds of software can be modeled by FA.
12
Representing FA Simplest representation is often a graph.
Nodes = states. Arcs indicate state transitions. Labels on arcs tell what causes the transition.
13
Example: Recognizing Strings Ending in “ing”
[State diagram: states "nothing" (start), "Saw i", "Saw in", "Saw ing"; i takes any state to "Saw i", n takes "Saw i" to "Saw in", g takes "Saw in" to "Saw ing"; any other symbol returns to "nothing".]
14
Automata to Code In C/C++, make a piece of code for each state. This code: Reads the next input. Decides on the next state. Jumps to the beginning of the code for that state.
15
Example: Automata to Code
2: /* i seen */
   c = getNextInput();
   if (c == 'n') goto 3;
   else if (c == 'i') goto 2;
   else goto 1;
3: /* "in" seen */
   . . .
16
Automata to Code – Thoughts
How would you do this in Java, which has no goto? You don’t really write code like this. Rather, a code generator takes a “regular expression” describing the pattern(s) you are looking for. Example: .*ing works in grep.
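For comparison, here is a goto-free version of the same idea as a table-driven loop, the usual answer in Java-like languages. It is a minimal sketch: the input string and the state numbering are invented for illustration, not taken from the slides.

#include <stdio.h>

/* States: 0 = nothing, 1 = saw i, 2 = saw in, 3 = saw ing. */
int main() {
    const char *input = "swimming";               /* hypothetical input */
    int state = 0;
    for (const char *p = input; *p != '\0'; p++) {
        switch (state) {
        case 0: state = (*p == 'i') ? 1 : 0; break;
        case 1: state = (*p == 'n') ? 2 : (*p == 'i') ? 1 : 0; break;
        case 2: state = (*p == 'g') ? 3 : (*p == 'i') ? 1 : 0; break;
        case 3: state = (*p == 'i') ? 1 : 0; break;  /* after "ing", an i restarts a match */
        }
    }
    printf(state == 3 ? "ends in ing\n" : "does not end in ing\n");
    return 0;
}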
17
Example: Protocol for Sending Data
[State diagram: states Ready (start) and Sending; "data in" takes Ready to Sending, "ack" returns Sending to Ready, and "timeout" loops at Sending.]
18
Extended Example Thanks to Jay Misra for this example.
On a distant planet, there are three species, a, b, and c. Any two different species can mate. If they do: The participants die. Two children of the third species are born.
19
Strange Planet – (2) Observation: the number of individuals never changes. The planet fails if at some point all individuals are of the same species. Then, no more breeding can take place. State = sequence of three integers – the numbers of individuals of species a, b, and c.
20
Strange Planet – Questions
In a given state, must the planet eventually fail? In a given state, is it possible for the planet to fail, if the wrong breeding choices are made?
21
Questions – (2) These questions mirror real ones about protocols.
“Can the planet fail?” is like asking whether a protocol can enter some undesired or error state. “Must the planet fail” is like asking whether a protocol is guaranteed to terminate. Here, “failure” is really the good condition of termination.
22
Strange Planet – Transitions
An a-event occurs when individuals of species b and c breed and are replaced by two a’s. Analogously: b-events and c-events. Represent these by symbols a, b, and c, respectively.
23
Strange Planet with 2 Individuals
[Diagram: the states are 200, 020, 002 and 011, 101, 110; an a-event takes 011 to 200, a b-event takes 101 to 020, and a c-event takes 110 to 002.] Notice: all states are "must-fail" states.
24
Strange Planet with 3 Individuals
[Diagram: 111 goes on a to 300, on b to 030, and on c to 003; the six mixed states 210, 102, 021 and 201, 120, 012 each have a single transition and form two cycles.] Notice: four states (300, 030, 003, and 111) are "must-fail" states. The others are "can't-fail" states. State 111 has several transitions.
25
Strange Planet with 4 Individuals
[Diagram: the fifteen states with 4 individuals; e.g., 211 goes on a to 400, on b to 130, and on c to 103.] Notice: states 400, 040, and 004 are must-fail states. All other states are "might-fail" states.
26
Taking Advantage of Symmetry
The ability to fail depends only on the set of numbers of the three species, not on which species has which number. Let’s represent states by the list of counts, sorted by largest-first. Only one transition symbol, x.
27
The Cases 2, 3, 4 [Diagrams: for n = 2, 110 goes on x to 200; for n = 3, 111 goes on x to 300, and 210 loops to itself on x; for n = 4, 310 goes on x to 220, 220 to 211, and 211 goes on x to either 400 or 310.] Notice: for the case n = 4, there is nondeterminism: different transitions are possible from 211 on the same input.
28
5 Individuals [Diagram: 410 goes on x to 320, 320 to 221; 221 goes to 410 or 311; 311 goes to 500 or 320.] Notice: 500 is a must-fail state; all others are might-fail states.
29
6 Individuals [Diagram: 510 goes on x to 420, 420 to 321; 321 goes to 510, 420, or itself; 411 goes to 600 or 330; 330 goes to 222, and 222 to 411.] Notice: 600 is a must-fail state; 510, 420, and 321 are can't-fail states; 411, 330, and 222 are "might-fail" states.
30
7 Individuals [Diagram: the eight states 700, 610, 520, 511, 430, 421, 331, 322; e.g., 511 goes on x to 700 or 430, and 322 goes to 511 or 421.] Notice: 700 is a must-fail state; all others are might-fail states.
31
Questions for Thought Without symmetry, how many states are there with n individuals? What if we use symmetry? For n individuals, how do you tell whether a state is "must-fail," "might-fail," or "can't-fail"? (A mechanical approach for small n is sketched below.)
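The classifications can be computed by iterating to a fixpoint: a state is must-fail if it has failed or every transition leads to a must-fail state, and can fail if it has failed or some transition leads to a state that can fail. Below is a hedged sketch (the state encoding and names are our own, not from the slides):

#include <cstdio>
#include <array>
#include <map>
#include <vector>
using namespace std;

typedef array<int,3> St;              // counts of species a, b, c

bool failed(St s) {                   // all individuals of one species
    return (s[0] == 0) + (s[1] == 0) + (s[2] == 0) >= 2;
}

vector<St> succ(St s) {               // breeding events possible from s
    vector<St> r;
    for (int i = 0; i < 3; i++)
        for (int j = i + 1; j < 3; j++)
            if (s[i] > 0 && s[j] > 0) {
                St t = s;
                t[i]--; t[j]--; t[3 - i - j] += 2;
                r.push_back(t);
            }
    return r;
}

int main() {
    const int n = 4;                  // population size; try other values
    vector<St> states;
    for (int a = 0; a <= n; a++)
        for (int b = 0; a + b <= n; b++)
            states.push_back(St{a, b, n - a - b});
    map<St,bool> must, can;
    for (St s : states) must[s] = can[s] = failed(s);
    for (bool changed = true; changed; ) {        // iterate to a fixpoint
        changed = false;
        for (St s : states) {
            vector<St> nx = succ(s);
            bool allMust = !nx.empty(), anyCan = false;
            for (St t : nx) { allMust = allMust && must[t]; anyCan = anyCan || can[t]; }
            if (!must[s] && allMust) { must[s] = true; changed = true; }
            if (!can[s] && anyCan)   { can[s] = true;  changed = true; }
        }
    }
    for (St s : states)
        printf("%d%d%d: %s\n", s[0], s[1], s[2],
               must[s] ? "must-fail" : can[s] ? "might-fail" : "can't-fail");
    return 0;
}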
32
Introduction to Finite Automata
Languages Deterministic Finite Automata Representations of Automata
33
Alphabets An alphabet is any finite set of symbols.
Examples: ASCII, Unicode, {0,1} (binary alphabet ), {a,b,c}.
34
Strings The set of strings over an alphabet Σ is the set of lists, each element of which is a member of Σ. Strings shown with no commas, e.g., abc. Σ* denotes this set of strings. ε stands for the empty string (string of length 0).
35
Example: Strings {0,1}* = {ε, 0, 1, 00, 01, 10, 11, 000, 001, …}. Subtlety: 0 as a string and 0 as a symbol look the same. Context determines the type.
36
Languages A language is a subset of Σ* for some alphabet Σ.
Example: The set of strings of 0's and 1's with no two consecutive 1's. L = {ε, 0, 1, 00, 01, 10, 000, 001, 010, 100, 101, 0000, 0001, 0010, 0100, 0101, 1000, 1001, 1010, …} Hmm… 1 of length 0, 2 of length 1, 3 of length 2, 5 of length 3, 8 of length 4. I wonder how many of length 5?
37
Deterministic Finite Automata
A formalism for defining languages, consisting of: A finite set of states (Q, typically). An input alphabet (Σ, typically). A transition function (δ, typically). A start state (q0, in Q, typically). A set of final states (F ⊆ Q, typically). “Final” and “accepting” are synonyms.
38
The Transition Function
Takes two arguments: a state and an input symbol. δ(q, a) = the state that the DFA goes to when it is in state q and input a is received.
39
Graph Representation of DFA’s
Nodes = states. Arcs represent transition function. Arc from state p to state q labeled by all those input symbols that have transitions from p to q. Arrow labeled “Start” to the start state. Final states indicated by double circles.
40
Example: Graph of a DFA Accepts all strings without two consecutive 1's. [Diagram: start state A ("previous string OK, does not end in 1") loops to itself on 0 and goes to B on 1; B ("previous string OK, ends in a single 1") goes back to A on 0 and to C on 1; C ("consecutive 1's have been seen") loops to itself on 0,1. A and B are accepting.]
41
Alternative Representation: Transition Table
Rows = states, columns = input symbols, final states starred, arrow marks the start state:
         0  1
 -> *A   A  B
    *B   A  C
     C   C  C
42
Extended Transition Function
We describe the effect of a string of inputs on a DFA by extending δ to a state and a string. Induction on length of string. Basis: δ(q, ε) = q Induction: δ(q,wa) = δ(δ(q,w),a) w is a string; a is an input symbol.
43
Extended δ: Intuition Convention:
…, w, x, y, z are strings. a, b, c,… are single symbols. Extended δ is computed for state q and inputs a1a2…an by following a path in the transition graph, starting at q and selecting the arcs with labels a1, a2,…,an in turn.
44
Example: Extended Delta
         0  1
 -> *A   A  B
    *B   A  C
     C   C  C
δ(B,011) = δ(δ(B,01),1) = δ(δ(δ(B,0),1),1) = δ(δ(A,1),1) = δ(B,1) = C
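As a quick sanity check, the induction can be run as a loop. This sketch hard-codes the table above; the numbering 0=A, 1=B, 2=C is our own:

#include <cstdio>

int delta(int q, char a) {                     // the table above: 0=A, 1=B, 2=C
    static const int t[3][2] = {{0,1}, {0,2}, {2,2}};
    return t[q][a - '0'];
}

int extendedDelta(int q, const char *w) {      // delta(q, wa) = delta(delta(q, w), a)
    for (; *w != '\0'; w++) q = delta(q, *w);
    return q;
}

int main() {
    printf("delta(B, 011) = %c\n", 'A' + extendedDelta(1, "011"));   /* prints C */
    return 0;
}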
45
Delta-hat In the book, the extended δ has a "hat" (δ̂) to distinguish it from δ itself. The hat is not really needed, because the two agree when the string is a single symbol: δ̂(q, a) = δ(δ̂(q, ε), a) = δ(q, a). So we use δ for the extended delta as well.
46
Language of a DFA Automata of all kinds define languages.
If A is an automaton, L(A) is its language. For a DFA A, L(A) is the set of strings labeling paths from the start state to a final state. Formally: L(A) = the set of strings w such that δ(q0, w) is in F.
47
Example: String in a Language
String 101 is in the language of the DFA above. Start at A.
48
Example: String in a Language
String 101 is in the language of the DFA. Follow the arc labeled 1, from A to B.
49
Example: String in a Language
String 101 is in the language of the DFA. Then the arc labeled 0 from current state B, back to A.
50
Example: String in a Language
String 101 is in the language of the DFA. Finally the arc labeled 1 from current state A, to B. The result is an accepting state, so 101 is in the language.
51
Example – Concluded The language of our example DFA is:
{w | w is in {0,1}* and w does not have two consecutive 1's} Read a set-former as: "the set of strings w such that these conditions about w are true."
52
Proofs of Set Equivalence
Often, we need to prove that two descriptions of sets are in fact the same set. Here, one set is “the language of this DFA,” and the other is “the set of strings of 0’s and 1’s with no consecutive 1’s.”
53
Proofs – (2) In general, to prove S=T, we need to prove two parts: S ⊆ T and T ⊆ S. That is: If w is in S, then w is in T. If w is in T, then w is in S. As an example, let S = the language of our running DFA, and T = “no consecutive 1’s.”
54
Part 1: S ⊆ T To prove: if w is accepted by the DFA above, then w has no consecutive 1's. Proof is an induction on the length of w. Important trick: Expand the inductive hypothesis to be more detailed than you need.
55
The Inductive Hypothesis
If δ(A, w) = A, then w has no consecutive 1's and does not end in 1. If δ(A, w) = B, then w has no consecutive 1's and ends in a single 1. Basis: |w| = 0 ("|w|" is the length of w); i.e., w = ε. (1) holds since ε has no 1's at all. (2) holds vacuously, since δ(A, ε) is not B. Important concept: If the "if" part of "if..then" is false, the statement is true.
56
Inductive Step Assume (1) and (2) are true for strings shorter than w, where |w| is at least 1. Because w is not empty, we can write w = xa, where a is the last symbol of w, and x is the string that precedes it. The IH is true for x.
57
Inductive Step – (2) Need to prove (1) and (2) for w = xa. (1) for w is: If δ(A, w) = A, then w has no consecutive 1's and does not end in 1. Since δ(A, w) = A, δ(A, x) must be A or B, and a must be 0 (look at the DFA). By the IH, x has no 11's. Thus, w has no 11's and does not end in 1.
58
Inductive Step – (3) Now, prove (2) for w = xa: If δ(A, w) = B, then w has no 11's and ends in 1. Since δ(A, w) = B, δ(A, x) must be A, and a must be 1 (look at the DFA). By the IH, x has no 11's and does not end in 1. Thus, w has no 11's and ends in 1.
59
Part 2: T ⊆ S Now, we must prove: if w has no 11's, then w is accepted by the DFA. Contrapositive: If w is not accepted by the DFA, then w has 11. Key idea: the contrapositive of "if X then Y" is the equivalent statement "if not Y then not X."
60
Using the Contrapositive
Every w gets the DFA to exactly one state. Simple inductive proof based on: Every state has exactly one transition on 1, one transition on 0. The only way w is not accepted is if it gets to C.
61
Using the Contrapositive – (2)
The only way to get to C [formally: δ(A,w) = C] is if w = x1y, where x gets the DFA to B and x1 is the prefix that first reaches C; y is the rest of w. If δ(A,x) = B, then surely x = z1 for some z. Thus, w = z11y and has 11.
62
Regular Languages A language L is regular if it is the language accepted by some DFA. Note: the DFA must accept only the strings in L, no others. Some languages are not regular. Intuitively, regular languages “cannot count” to arbitrarily high integers.
63
Example: A Nonregular Language
L1 = {0^n1^n | n ≥ 1} Note: a^i is conventional for i a's. Thus, 0^4 = 0000, e.g. Read: "The set of strings consisting of n 0's followed by n 1's, such that n is at least 1." Thus, L1 = {01, 0011, 000111, …}
64
Another Example L2 = {w | w in {(, )}* and w is balanced }
Note: alphabet consists of the parenthesis symbols ’(’ and ’)’. Balanced parens are those that can appear in an arithmetic expression. E.g.: (), ()(), (()), (()()),…
65
But Many Languages are Regular
Regular Languages can be described in many ways, e.g., regular expressions. They appear in many contexts and have many useful properties. Example: the set of strings that represent floating-point numbers in your favorite language is a regular language.
66
Example: A Regular Language
L3 = { w | w in {0,1}* and w, viewed as a binary integer, is divisible by 23} The DFA: 23 states, named 0, 1,…,22. Correspond to the 23 remainders of an integer divided by 23. Start and only final state is 0.
67
Transitions of the DFA for L3
If string w represents integer i, then assume δ(0, w) = i%23. Then w0 represents integer 2i, so we want δ(i%23, 0) = (2i)%23. Similarly: w1 represents 2i+1, so we want δ(i%23, 1) = (2i+1)%23. Example: δ(15,0) = 30%23 = 7; δ(11,1) = 23%23 = 0. Key idea: design a DFA by figuring out what each state needs to remember about the past.
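In code this whole DFA collapses to one line of arithmetic, since state i on bit d goes to (2i+d)%23. A minimal sketch; the test string is our own:

#include <cstdio>

int main() {
    const char *w = "101110";                 /* 46 in binary; hypothetical input */
    int state = 0;                            /* start state; also the only final state */
    for (const char *p = w; *p != '\0'; p++)
        state = (2 * state + (*p - '0')) % 23;    /* delta(i,0) = 2i%23, delta(i,1) = (2i+1)%23 */
    printf("%s is %sdivisible by 23\n", w, state == 0 ? "" : "not ");
    return 0;
}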
68
Another Example L4 = { w | w in {0,1}* and w, viewed as the reverse of a binary integer, is divisible by 23} Example: 011101 is in L4, because its reverse, 101110, is 46 in binary. Hard to construct the DFA. But a theorem says the reverse of a regular language is also regular.
69
Nondeterministic Finite Automata
Nondeterminism Subset Construction
70
Nondeterminism A nondeterministic finite automaton has the ability to be in several states at once. Transitions from a state on an input symbol can be to any set of states.
71
Nondeterminism – (2) Start in one start state.
Accept if any sequence of choices leads to a final state. Intuitively: the NFA always “guesses right.”
72
Example: Moves on a Chessboard
States = squares. Inputs = r (move to an adjacent red square) and b (move to an adjacent black square). Start state, final state are in opposite corners.
73
Example: Chessboard – (2)
States are the squares 1–9 of a 3×3 board, numbered row by row; start state 1, final state 9:
        r          b
 1      {2,4}      {5}
 2      {4,6}      {1,3,5}
 3      {2,6}      {5}
 4      {2,8}      {1,5,7}
 5      {2,4,6,8}  {1,3,7,9}
 6      {2,8}      {3,5,9}
 7      {4,8}      {5}
 8      {4,6}      {5,7,9}
*9      {6,8}      {5}
On input rbb: {1} → {2,4} → {1,3,5,7} → {1,3,5,7,9}. Accept, since the final state 9 is reached.
74
Formal NFA A finite set of states, typically Q.
An input alphabet, typically Σ. A transition function, typically δ. A start state in Q, typically q0. A set of final states F ⊆ Q.
75
Transition Function of an NFA
δ(q, a) is a set of states. Extend to strings as follows: Basis: δ(q, ε) = {q} Induction: δ(q, wa) = the union over all states p in δ(q, w) of δ(p, a)
76
Language of an NFA A string w is accepted by an NFA if δ(q0, w) contains at least one final state. The language of the NFA is the set of strings it accepts.
77
Example: Language of an NFA
For our chessboard NFA we saw that rbb is accepted. If the input consists of only b's, the set of accessible states alternates between {5} and {1,3,7,9}, so only even-length, nonempty strings of b's are accepted. What about strings with at least one r?
78
Equivalence of DFA’s, NFA’s
A DFA can be turned into an NFA that accepts the same language. If δD(q, a) = p, let the NFA have δN(q, a) = {p}. Then the NFA is always in a set containing exactly one state – the state the DFA is in after reading the same input.
79
Equivalence – (2) Surprisingly, for any NFA there is a DFA that accepts the same language. Proof is the subset construction. The number of states of the DFA can be exponential in the number of states of the NFA. Thus, NFA’s accept exactly the regular languages.
80
Subset Construction Given an NFA with states Q, inputs Σ, transition function δN, start state q0, and final states F, construct an equivalent DFA with: States 2^Q (the set of subsets of Q). Inputs Σ. Start state {q0}. Final states = all those sets containing a member of F.
81
Critical Point The DFA states have names that are sets of NFA states.
But as a DFA state, an expression like {p,q} must be read as a single symbol, not as a set. Analogy: a class of objects whose values are sets of objects of another class.
82
Subset Construction – (2)
The transition function δD is defined by: δD({q1,…,qk}, a) is the union over all i = 1,…,k of δN(qi, a). Example: We’ll construct the DFA equivalent of our “chessboard” NFA.
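Here is a hedged sketch of the lazy construction in code, using the chessboard NFA's table as reconstructed above; the representation (sets of ints as DFA state names) is our own:

#include <cstdio>
#include <map>
#include <queue>
#include <set>
using namespace std;

typedef set<int> SS;                           // a DFA state is a set of NFA states

int main() {
    map<pair<int,char>, SS> deltaN = {
        {{1,'r'},{2,4}},     {{1,'b'},{5}},
        {{2,'r'},{4,6}},     {{2,'b'},{1,3,5}},
        {{3,'r'},{2,6}},     {{3,'b'},{5}},
        {{4,'r'},{2,8}},     {{4,'b'},{1,5,7}},
        {{5,'r'},{2,4,6,8}}, {{5,'b'},{1,3,7,9}},
        {{6,'r'},{2,8}},     {{6,'b'},{3,5,9}},
        {{7,'r'},{4,8}},     {{7,'b'},{5}},
        {{8,'r'},{4,6}},     {{8,'b'},{5,7,9}},
        {{9,'r'},{6,8}},     {{9,'b'},{5}}};
    map<SS, map<char, SS>> deltaD;             // the DFA being built, lazily
    queue<SS> work;
    work.push(SS{1});                          // start state {q0}
    while (!work.empty()) {
        SS S = work.front(); work.pop();
        if (deltaD.count(S)) continue;         // state already constructed
        for (char a : {'r', 'b'}) {
            SS T;                              // union of deltaN(q, a) over q in S
            for (int q : S)
                for (int p : deltaN[{q, a}]) T.insert(p);
            deltaD[S][a] = T;
            work.push(T);                      // skipped later if already done
        }
    }
    printf("constructed %zu DFA states\n", deltaD.size());   /* prints 7 */
    return 0;
}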
83
Example: Subset Construction
Start with the DFA start state {1} and construct its transitions:
             r      b
{1}          {2,4}  {5}
The sets {2,4} and {5} become new DFA states to process.
Alert: What we're doing here is the lazy form of DFA construction, where we only construct a state if we are forced to.
84
Example: Subset Construction
Next, fill in the rows for {2,4} and {5}:
             r          b
{1}          {2,4}      {5}
{2,4}        {2,4,6,8}  {1,3,5,7}
{5}          {2,4,6,8}  {1,3,7,9}
New states to process: {2,4,6,8} and {1,3,5,7}.
85
Example: Subset Construction
The row for {5} also introduced {1,3,7,9}, which is starred as final because it contains the NFA's final state 9. Still to process: {2,4,6,8}, {1,3,5,7}, *{1,3,7,9}.
86
Example: Subset Construction
The row for {2,4,6,8} introduces the final state *{1,3,5,7,9}:
{2,4,6,8}    {2,4,6,8}  {1,3,5,7,9}
Still to process: {1,3,5,7}, *{1,3,7,9}, *{1,3,5,7,9}.
87
Example: Subset Construction
The row for {1,3,5,7} introduces nothing new:
{1,3,5,7}    {2,4,6,8}  {1,3,5,7,9}
88
Example: Subset Construction
Nor does the row for *{1,3,7,9}:
*{1,3,7,9}   {2,4,6,8}  {5}
89
Example: Subset Construction
The row for *{1,3,5,7,9} completes the DFA:
             r          b
{1}          {2,4}      {5}
{2,4}        {2,4,6,8}  {1,3,5,7}
{5}          {2,4,6,8}  {1,3,7,9}
{2,4,6,8}    {2,4,6,8}  {1,3,5,7,9}
{1,3,5,7}    {2,4,6,8}  {1,3,5,7,9}
*{1,3,7,9}   {2,4,6,8}  {5}
*{1,3,5,7,9} {2,4,6,8}  {1,3,5,7,9}
90
Proof of Equivalence: Subset Construction
The proof is almost a pun. Show by induction on |w| that δN(q0, w) = δD({q0}, w) Basis: w = ε: δN(q0, ε) = δD({q0}, ε) = {q0}.
91
Induction Assume IH for strings shorter than w.
Let w = xa; IH holds for x. Let δN(q0, x) = δD({q0}, x) = S. Let T = the union over all states p in S of δN(p, a). Then δN(q0, w) = δD({q0}, w) = T. For NFA: the extension of δN. For DFA: definition of δD plus extension of δD. That is, δD(S, a) = T; then extend δD to w = xa.
92
NFA’s With ε-Transitions
We can allow state-to-state transitions on ε input. These transitions are done spontaneously, without looking at the input string. A convenience at times, but still only regular languages are accepted.
93
Example: ε-NFA
        0    1    ε
 A      {E}  {B}  ∅
 B      ∅    {C}  {D}
 C      ∅    {D}  ∅
*D      ∅    ∅    ∅
 E      {F}  ∅    {B,C}
 F      {D}  ∅    ∅
(D is the final state.)
94
Closure of States CL(q) = set of states you can reach from state q following only arcs labeled ε. Example: CL(A) = {A}; CL(E) = {B, C, D, E}. Closure of a set of states = union of the closure of each state.
95
Extended Delta Basis: δ̂(q, ε) = CL(q). Induction: δ̂(q, xa) is computed as follows: Start with δ̂(q, x) = S. Take the union of CL(δ(p, a)) for all p in S. Intuition: δ̂(q, w) is the set of states you can reach from q following a path labeled w. And notice that δ(q, a), for symbol a, is not that set of states.
96
Example: Extended Delta
δ̂(A, ε) = CL(A) = {A}. δ̂(A, 0) = CL({E}) = {B, C, D, E}. δ̂(A, 01) = CL({C, D}) = {C, D}. The language of an ε-NFA is the set of strings w such that δ̂(q0, w) contains a final state.
97
Equivalence of NFA, ε-NFA
Every NFA is an ε-NFA. It just has no transitions on ε. Converse requires us to take an ε-NFA and construct an NFA that accepts the same language. We do so by combining ε–transitions with the next transition on a real input. Warning: This treatment is a bit different from that in the text.
98
Picture of ε-Transition Removal
[Diagram: a path through an ε-NFA, drawn as blocks of ε-transitions surrounding transitions on real input symbols.]
99
Picture of ε-Transition Removal
[Diagram: the text's construction goes from the ε-NFA all the way to a DFA, combining ε-transitions with each transition on a real symbol a and performing the subset construction.]
100
Picture of ε-Transition Removal
[Diagram: our construction goes from the ε-NFA only to an ordinary NFA, combining ε-transitions with each transition on a real symbol a, with no subset construction.]
101
Equivalence – (2) Start with an ε-NFA with states Q, inputs Σ, start state q0, final states F, and transition function δE. Construct an “ordinary” NFA with states Q, inputs Σ, start state q0, final states F’, and transition function δN.
102
Equivalence – (3) Compute δN(q, a) as follows:
Let S = CL(q). δN(q, a) is the union over all p in S of δE(p, a). F’ = the set of states q such that CL(q) contains a state of F. Intuition: δN incorporates ε–transitions before using a but not after.
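The closure itself is a tiny reachability loop. A sketch, with the example's ε-arcs hard-coded (the 'e' marker for ε is our own convention):

#include <cstdio>
#include <map>
#include <set>
using namespace std;

typedef set<char> SS;
map<pair<char,char>, SS> deltaE;       // (state, symbol); 'e' stands for epsilon

SS closure(char q) {                   // CL(q): follow only epsilon-arcs
    SS cl = {q};
    for (bool changed = true; changed; ) {
        changed = false;
        for (char p : SS(cl))          // iterate over a copy while inserting
            for (char r : deltaE[{p, 'e'}])
                if (cl.insert(r).second) changed = true;
    }
    return cl;
}

int main() {
    deltaE[{'B','e'}] = {'D'};         // the example's epsilon-arcs: B->D, E->B, E->C
    deltaE[{'E','e'}] = {'B','C'};
    for (char q : closure('E')) printf("%c ", q);   /* prints B C D E */
    printf("\n");
    return 0;
}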
103
Equivalence – (4) Prove by induction on |w| that
CL(δN(q0, w)) = δ̂E(q0, w). Thus, the ε-NFA accepts w if and only if the "ordinary" NFA does.
104
Example: ε-NFA-to-NFA
Interesting closures: CL(B) = {B,D}; CL(E) = {B,C,D,E}.
ε-NFA:
        0    1    ε
 A      {E}  {B}  ∅
 B      ∅    {C}  {D}
 C      ∅    {D}  ∅
*D      ∅    ∅    ∅
 E      {F}  ∅    {B,C}
 F      {D}  ∅    ∅
Resulting NFA:
        0    1
 A      {E}  {B}
*B      ∅    {C}
 C      ∅    {D}
*D      ∅    ∅
*E      {F}  {C,D}
 F      {D}  ∅
B and E become final since their closures include the final state D. δN(E,1) = {C,D} since the closure of E includes B and C, which have transitions on 1 to C and D.
105
Summary DFA’s, NFA’s, and ε–NFA’s all accept exactly the same set of languages: the regular languages. The NFA types are easier to design and may have exponentially fewer states than a DFA. But only a DFA can be implemented!
106
Definitions Equivalence to Finite Automata
Regular Expressions Definitions Equivalence to Finite Automata
107
RE’s: Introduction Regular expressions are an algebraic way to describe languages. They describe exactly the regular languages. If E is a regular expression, then L(E) is the language it defines. We’ll describe RE’s and their languages recursively.
108
RE’s: Definition Basis 1: If a is any symbol, then a is a RE, and L(a) = {a}. Note: {a} is the language containing one string, and that string is of length 1. Basis 2: ε is a RE, and L(ε) = {ε}. Basis 3: ∅ is a RE, and L(∅) = ∅.
109
RE's: Definition – (2) Induction 1: If E1 and E2 are regular expressions, then E1+E2 is a regular expression, and L(E1+E2) = L(E1) ∪ L(E2). Induction 2: If E1 and E2 are regular expressions, then E1E2 is a regular expression, and L(E1E2) = L(E1)L(E2). Concatenation: the set of strings wx such that w is in L(E1) and x is in L(E2).
110
RE's: Definition – (3) Induction 3: If E is a RE, then E* is a RE, and L(E*) = (L(E))*. Closure, or "Kleene closure" = set of strings w1w2…wn, for some n ≥ 0, where each wi is in L(E). Note: when n=0, the string is ε.
111
Precedence of Operators
Parentheses may be used wherever needed to influence the grouping of operators. Order of precedence is * (highest), then concatenation, then + (lowest).
112
Examples: RE’s L(01) = {01}. L(01+0) = {01, 0}. L(0(1+0)) = {01, 00}.
Note order of precedence of operators. L(0*) = {ε, 0, 00, 000,… }. L((0+10)*(ε+1)) = all strings of 0’s and 1’s without two consecutive 1’s.
113
Equivalence of RE’s and Automata
We need to show that for every RE, there is an automaton that accepts the same language. Pick the most powerful automaton type: the ε-NFA. And we need to show that for every automaton, there is a RE defining its language. Pick the most restrictive type: the DFA.
114
Converting a RE to an ε-NFA
Proof is an induction on the number of operators (+, concatenation, *) in the RE. We always construct an automaton of a special form (next slide).
115
Form of ε-NFA’s Constructed
[Diagram: the constructed ε-NFA always has a unique start state with no arcs in from outside and a unique "final" state with no arcs leaving; the start state is the only state with external predecessors, and the "final" state is the only state with external successors.]
116
RE to ε-NFA: Basis [Diagrams: for symbol a, a start and a final state joined by an arc labeled a; for ε, an arc labeled ε; for ∅, two states with no arc between them.]
117
RE to ε-NFA: Induction 1 – Union
For E1 + E2: [Diagram: a new start state with ε-arcs to the start states of the automata for E1 and E2, and ε-arcs from their final states to a new final state.]
118
RE to ε-NFA: Induction 2 – Concatenation
For E1E2: [Diagram: the automata for E1 and E2 in series, with an ε-arc from the final state of E1 to the start state of E2.]
119
RE to ε-NFA: Induction 3 – Closure
For E*: [Diagram: a new start state and a new final state, with ε-arcs that either bypass the automaton for E entirely or pass through it, with an ε-arc from E's final state back around so E can be traversed any number of times.]
120
DFA-to-RE A strange sort of induction.
States of the DFA are assumed to be 1,2,…,n. We construct RE's for the labels of restricted sets of paths. Basis: single arcs or no arc at all. Induction: paths that are allowed to traverse the next state, in order.
121
k-Paths A k-path is a path through the graph of the DFA that goes through no state numbered higher than k. Endpoints are not restricted; they can be any state.
122
Example: k-Paths [Diagram: a three-state DFA with states 1, 2, 3; arcs 1→2, 2→3, and 3→1 labeled 0, and arcs 1→3, 2→1, and 3→2 labeled 1.]
0-paths from 2 to 3: RE for labels = 0.
1-paths from 2 to 3: RE for labels = 0+11.
2-paths from 2 to 3: RE for labels = (10)*0+1(01)*1
3-paths from 2 to 3: RE for labels = ??
123
k-Path Induction Let R_ij^k be the regular expression for the set of labels of k-paths from state i to state j. Basis: k=0. R_ij^0 = sum of the labels of the arcs from i to j; ∅ if no such arc. But add ε if i=j.
124
Example: Basis R_12^0 = 0. R_11^0 = ∅ + ε = ε.
125
k-Path Inductive Case A k-path from i to j either:
Never goes through state k, or Goes through k one or more times.
R_ij^k = R_ij^(k-1) + R_ik^(k-1) (R_kk^(k-1))* R_kj^(k-1)
(The first term covers paths that don't go through k; the second covers going from i to k the first time, zero or more times from k back to k, and then from k to j.)
126
Illustration of Induction
[Diagram: a k-path from i to j decomposed as a path to k, several loops from k back to k through states < k, and a path from k to j; paths not going through k at all are the other case.]
127
Final Step The RE with the same language as the DFA is the sum (union) of R_ij^n, where: n is the number of states; i.e., paths are unconstrained. i is the start state. j is one of the final states.
128
Example R_23^3 = R_23^2 + R_23^2 (R_33^2)* R_33^2 = R_23^2 (R_33^2)*
R_23^2 = (10)*0+1(01)*1
R_33^2 = 0(01)*(1+00) + 1(10)*(0+11)
R_23^3 = [(10)*0+1(01)*1] [0(01)*(1+00) + 1(10)*(0+11)]*
129
Summary Each of the three types of automata (DFA, NFA, ε-NFA) we discussed, and regular expressions as well, define exactly the same set of languages: the regular languages.
130
Algebraic Laws for RE’s
Union and concatenation behave sort of like addition and multiplication. + is commutative and associative; concatenation is associative. Concatenation distributes over +. Exception: Concatenation is not commutative.
131
Identities and Annihilators
∅ is the identity for +. R + ∅ = R. ε is the identity for concatenation. εR = Rε = R. ∅ is the annihilator for concatenation. ∅R = R∅ = ∅.
132
Decision Properties of Regular Languages
General Discussion of “Properties” The Pumping Lemma Membership, Emptiness, Etc.
133
Properties of Language Classes
A language class is a set of languages. We have one example: the regular languages. We’ll see many more in this class. Language classes have two important kinds of properties: Decision properties. Closure properties.
134
Representation of Languages
Representations can be formal or informal. Example (formal): represent a language by a RE or DFA defining it. Example (informal): a logical or prose statement about its strings: {0^n1^n | n is a nonnegative integer} "The set of strings consisting of some number of 0's followed by the same number of 1's."
135
Decision Properties A decision property for a class of languages is an algorithm that takes a formal description of a language (e.g., a DFA) and tells whether or not some property holds. Example: Is language L empty?
136
Subtle Point: Representation Matters
You might imagine that the language is described informally, so if my description is "the empty language" then yes, otherwise no. But the representation is a DFA (or a RE that you will convert to a DFA). Can you tell if L(A) = ∅ for DFA A?
137
Why Decision Properties?
When we talked about protocols represented as DFA’s, we noted that important properties of a good protocol were related to the language of the DFA. Example: “Does the protocol terminate?” = “Is the language finite?” Example: “Can the protocol fail?” = “Is the language nonempty?”
138
Why Decision Properties – (2)
We might want a "smallest" representation for a language, e.g., a minimum-state DFA or a shortest RE. If you can't decide "are these two languages the same?" — i.e., whether two DFA's define the same language — you can't find a "smallest" one.
139
Closure Properties A closure property of a language class says that given languages in the class, an operator (e.g., union) produces another language in the same class. Example: the regular languages are obviously closed under union, concatenation, and (Kleene) closure. Use the RE representation of languages.
140
Why Closure Properties?
Helps construct representations. Helps show (informally described) languages not to be in the class.
141
Example: Use of Closure Property
We can easily prove L1 = {0^n1^n | n ≥ 1} is not a regular language. L2 = the set of strings with an equal number of 0's and 1's isn't either, but that fact is trickier to prove. Regular languages are closed under ∩. If L2 were regular, then L2 ∩ L(0*1*) = L1 would be, but it isn't.
142
The Membership Question
Our first decision property is the question: “is string w in regular language L?” Assume L is represented by a DFA A. Simulate the action of A on the sequence of input symbols forming w.
143
Example: Testing Membership
[Animation: the DFA A–B–C from before processes an input string one symbol at a time, tracking the current state and the next input symbol.]
149
What if the Regular Language Is not Represented by a DFA?
There is a circle of conversions from one form to another: a RE can be converted to an ε-NFA, an ε-NFA to an NFA, an NFA to a DFA, and a DFA back to a RE.
150
The Emptiness Problem Given a regular language, does the language contain any string at all? Assume the representation is a DFA. Construct its transition graph. Compute the set of states reachable from the start state. If any final state is reachable, then yes, else no.
151
The Infiniteness Problem
Is a given regular language infinite? Start with a DFA for the language. Key idea: if the DFA has n states, and the language contains any string of length n or more, then the language is infinite. Otherwise, the language is surely finite, limited to strings of length less than n.
152
Proof of Key Idea If an n-state DFA accepts a string w of length n or more, then there must be a state that appears twice on the path labeled w from the start state to a final state. Because there are at least n+1 states along the path.
153
Proof – (2) [Diagram: w = xyz, where y is the piece between the two occurrences of the repeated state.] Then xy^iz is in the language for all i ≥ 0. Since y is not ε, we see an infinite number of strings in L.
154
Infiniteness – Continued
We do not yet have an algorithm. There are an infinite number of strings of length ≥ n, and we can't test them all. Second key idea: if there is a string of length ≥ n (= number of states) in L, then there is a string of length between n and 2n-1.
155
Proof of 2nd Key Idea Remember: we can choose y to be the first cycle on the path. So |xy| ≤ n; in particular, 1 ≤ |y| ≤ n. Thus, if w is of length 2n or more, there is a shorter string in L that is still of length at least n. Keep shortening to reach [n, 2n-1].
156
Completion of Infiniteness Algorithm
Test for membership all strings of length between n and 2n-1. If any are accepted, then infinite, else finite. A terrible algorithm. Better: find cycles between the start state and a final state.
157
Finding Cycles Eliminate states not reachable from the start state.
Eliminate states that do not reach a final state. Test if the remaining transition graph has any cycles.
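A sketch of the last step in code: after trimming, one depth-first search with the usual three colors finds a cycle if there is one. The tiny graph in main is invented for illustration:

#include <cstdio>
#include <vector>
using namespace std;

// assumes states not reachable from the start state, and states that
// cannot reach a final state, have already been removed
bool dfs(int u, vector<int> &color, const vector<vector<int>> &adj) {
    color[u] = 1;                              // gray: on the current DFS path
    for (int v : adj[u]) {
        if (color[v] == 1) return true;        // back edge: a cycle
        if (color[v] == 0 && dfs(v, color, adj)) return true;
    }
    color[u] = 2;                              // black: finished
    return false;
}

int main() {
    vector<vector<int>> adj = {{1}, {0, 2}, {}};    // 0 <-> 1 -> 2: has a cycle
    vector<int> color(adj.size(), 0);
    bool cyclic = false;
    for (int u = 0; u < (int)adj.size() && !cyclic; u++)
        if (color[u] == 0) cyclic = dfs(u, color, adj);
    printf("language is %s\n", cyclic ? "infinite" : "finite");
    return 0;
}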
158
The Pumping Lemma We have, almost accidentally, proved a statement that is quite useful for showing certain languages are not regular. Called the pumping lemma for regular languages.
159
Statement of the Pumping Lemma
For every regular language L, there is an integer n (the number of states of a DFA for L), such that for every string w in L of length ≥ n, we can write w = xyz such that: |xy| ≤ n. |y| ≥ 1. For all i ≥ 0, xy^iz is in L. (Here y is the labels along the first cycle on the path labeled w.)
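Written out with explicit quantifiers (a transcription of the bullets above into LaTeX):

\[
\forall\, L \text{ regular}\;\; \exists\, n \;\;
\forall\, w \in L,\ |w| \ge n \;\;
\exists\, x, y, z:\;
w = xyz,\quad |xy| \le n,\quad |y| \ge 1,\quad
\forall\, i \ge 0:\ x y^i z \in L.
\]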
160
Example: Use of Pumping Lemma
We have claimed {0^k1^k | k ≥ 1} is not a regular language. Suppose it were. Then there would be an associated n for the pumping lemma. Let w = 0^n1^n. We can write w = xyz, where x and y consist of 0's, and y ≠ ε. But then xyyz would be in L, and this string has more 0's than 1's.
161
Decision Property: Equivalence
Given regular languages L and M, is L = M? Algorithm involves constructing the product DFA from DFA's for L and M. Let these DFA's have sets of states Q and R, respectively. Product DFA has set of states Q × R. I.e., pairs [q, r] with q in Q, r in R.
162
Product DFA – Continued
Start state = [q0, r0] (the start states of the DFA’s for L, M). Transitions: δ([q,r], a) = [δL(q,a), δM(r,a)] δL, δM are the transition functions for the DFA’s of L, M. That is, we simulate the two DFA’s in the two state components of the product DFA.
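A sketch of the lock-step simulation, with two made-up one-bit DFA's standing in for the machines for L and M (the tables are hypothetical, not the example on the next slide):

#include <cstdio>
#include <map>
using namespace std;

int main() {
    // hypothetical transition tables for the two DFA's
    map<pair<char,char>, char> deltaL = {{{'A','0'},'A'}, {{'A','1'},'B'},
                                         {{'B','0'},'A'}, {{'B','1'},'B'}};
    map<pair<char,char>, char> deltaM = {{{'C','0'},'D'}, {{'C','1'},'C'},
                                         {{'D','0'},'D'}, {{'D','1'},'D'}};
    pair<char,char> state = {'A', 'C'};            // start state [q0, r0]
    for (char a : {'0', '1', '1'})                 // run both DFA's on input 011
        state = {deltaL[{state.first, a}], deltaM[{state.second, a}]};
    printf("product DFA ends in [%c,%c]\n", state.first, state.second);  /* [B,D] */
    return 0;
}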
163
Example: Product DFA [Diagram: a DFA with states A, B and a DFA with states C, D, both over {0,1}, and their product automaton on states [A,C], [A,D], [B,C], [B,D]; each product state simulates both components.]
164
Equivalence Algorithm
Make the final states of the product DFA be those states [q, r] such that exactly one of q and r is a final state of its own DFA. Thus, the product accepts w iff w is in exactly one of L and M.
165
Example: Equivalence [Diagram: the same product automaton, with the states [q, r] where exactly one of q and r is accepting marked as final.]
166
Equivalence Algorithm – (2)
The product DFA’s language is empty iff L = M. But we already have an algorithm to test whether the language of a DFA is empty.
167
Decision Property: Containment
Given regular languages L and M, is L ⊆ M? Algorithm also uses the product automaton. How do you define the final states [q, r] of the product so its language is empty iff L ⊆ M? Answer: q is final; r is not.
168
Example: Containment [Diagram: the same product automaton, with final states those [q, r] where q is accepting and r is not. Note: the only final state is unreachable, so containment holds.]
169
The Minimum-State DFA for a Regular Language
In principle, since we can test for equivalence of DFA's, we can, given a DFA A, find the DFA with the fewest states accepting L(A). Test all smaller DFA's for equivalence with A. But that's a terrible algorithm.
170
Efficient State Minimization
Construct a table with all pairs of states. If you find a string that distinguishes two states (takes exactly one to an accepting state), mark that pair. Algorithm is a recursion on the length of the shortest distinguishing string.
171
State Minimization – (2)
Basis: Mark a pair if exactly one is a final state. Induction: mark [q, r] if there is some input symbol a such that [δ(q,a), δ(r,a)] is marked. After no more marks are possible, the unmarked pairs are equivalent and can be merged into one state.
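A sketch of the table-filling algorithm on the seven-state example coming up on the next slides (states A..G encoded as 0..6, inputs r and b as 0 and 1, F and G final):

#include <cstdio>

int main() {
    const int N = 7;                           // states A..G as 0..6
    int delta[N][2] = {{1,2},{3,4},{3,5},{3,6},{3,6},{3,2},{3,6}};  // [state][r=0, b=1]
    bool fin[N]    = {false,false,false,false,false,true,true};     // F and G final
    bool marked[N][N] = {false};
    for (int p = 0; p < N; p++)                // basis: exactly one of the pair is final
        for (int q = p + 1; q < N; q++)
            marked[p][q] = (fin[p] != fin[q]);
    for (bool changed = true; changed; ) {     // induction: repeat until no new marks
        changed = false;
        for (int p = 0; p < N; p++)
            for (int q = p + 1; q < N; q++) {
                if (marked[p][q]) continue;
                for (int a = 0; a < 2; a++) {
                    int r = delta[p][a], s = delta[q][a];
                    if (r > s) { int t = r; r = s; s = t; }
                    if (r != s && marked[r][s]) { marked[p][q] = changed = true; }
                }
            }
    }
    for (int p = 0; p < N; p++)                // unmarked pairs are equivalent
        for (int q = p + 1; q < N; q++)
            if (!marked[p][q]) printf("%c = %c\n", 'A' + p, 'A' + q);  /* prints D = E */
    return 0;
}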
172
Transitivity of “Indistinguishable”
If state p is indistinguishable from q, and q is indistinguishable from r, then p is indistinguishable from r. Proof: If the outcome (accept or don't) of p and q on input w is the same, and the outcome of q and r on w is the same, then likewise the outcome of p and r.
173
Constructing the Minimum-State DFA
Suppose q1,…,qk are indistinguishable states. Replace them by one state q. Then δ(q1, a),…, δ(qk, a) are all indistinguishable states. Key point: otherwise, we should have marked at least one more pair. Let δ(q, a) = the representative state for that group.
174
Example: State Minimization
Remember this DFA? It was constructed for the chessboard NFA by the subset construction. Here it is with more convenient state names (A = {1}, B = {2,4}, C = {5}, D = {2,4,6,8}, E = {1,3,5,7}, F = {1,3,7,9}, G = {1,3,5,7,9}):
        r  b
  A     B  C
  B     D  E
  C     D  F
  D     D  G
  E     D  G
 *F     D  C
 *G     D  G
175
Example – Continued Start with marks for the pairs with exactly one of the two final states F and G: every pair [X, F] and [X, G] with X in {A,…,E} is marked; [F, G] is not.
176
Example – Continued Input r gives no help, because the pair [B, D] is not marked (on r, every state goes to B or D).
177
Example – Continued But input b distinguishes {A,B,F} from {C,D,E,G}. For example, [A, C] gets marked because [C, F] is marked.
178
Example – Continued [C, D] and [C, E] are marked because of transitions on b to the marked pair [F, G].
179
Example – Continued [A, B] is marked because of transitions on r to the marked pair [B, D]. [D, E] can never be marked, because on both inputs they go to the same state.
180
Example – Concluded The only unmarked pair is [D, E], so replace D and E by a single state H. The result is the minimum-state DFA:
        r  b
  A     B  C
  B     H  H
  C     H  F
  H     H  G
 *F     H  C
 *G     H  G
181
Eliminating Unreachable States
Unfortunately, combining indistinguishable states could leave us with unreachable states in the “minimum-state” DFA. Thus, before or after, remove states that are not reachable from the start state.
182
Clincher We have combined states of the given DFA wherever possible.
Could there be another, completely unrelated DFA with fewer states? No. The proof considers the DFA we derived together with the hypothetical better DFA.
183
Proof: No Unrelated, Smaller DFA
Let A be our minimized DFA; let B be a smaller equivalent. Consider an automaton with the states of A and B combined. Use “distinguishable” in its contrapositive form: If states q and p are indistinguishable, so are δ(q, a) and δ(p, a).
184
Inferring Indistinguishability
[Diagram: the start states q0 of A and p0 of B must be indistinguishable, because L(A) = L(B); and if states q and p are indistinguishable, then the states r and s reached from them on the same symbol b must also be indistinguishable.]
185
Inductive Hypothesis Every state q of A is indistinguishable from some state of B. Induction is on the length of the shortest string taking you from the start state of A to q.
186
Proof – (2) Basis: Start states of A and B are indistinguishable, because L(A) = L(B). Induction: Suppose w = xa is a shortest string getting A to state q. By the IH, x gets A to some state r that is indistinguishable from some state p of B. Then δ(r, a) = q is indistinguishable from δ(p, a).
187
Proof – (3) However, two states of A cannot be indistinguishable from the same state of B, or they would be indistinguishable from each other. Violates transitivity of “indistinguishable.” Thus, B has at least as many states as A.
188
Closure Properties of Regular Languages
Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism
189
Closure Properties Recall a closure property is a statement that a certain operation on languages, when applied to languages in a class (e.g., the regular languages), produces a result that is also in that class. For regular languages, we can use any of its representations to prove a closure property.
190
Closure Under Union If L and M are regular languages, so is L ∪ M. Proof: Let L and M be the languages of regular expressions R and S, respectively. Then R+S is a regular expression whose language is L ∪ M.
191
Closure Under Concatenation and Kleene Closure
Same idea: RS is a regular expression whose language is LM. R* is a regular expression whose language is L*.
192
Closure Under Intersection
If L and M are regular languages, then so is L ∩ M. Proof: Let A and B be DFA's whose languages are L and M, respectively. Construct C, the product automaton of A and B. Make the final states of C be the pairs consisting of final states of both A and B.
193
Example: Product DFA for Intersection
[Diagram: the product DFA, with final states those [q, r] where both q and r are accepting.]
194
Closure Under Difference
If L and M are regular languages, then so is L – M = strings in L but not M. Proof: Let A and B be DFA’s whose languages are L and M, respectively. Construct C, the product automaton of A and B. Make the final states of C be the pairs where A-state is final but B-state is not.
195
Example: Product DFA for Difference
[Diagram: the product DFA, with final states those [q, r] where q is accepting and r is not. Notice: the difference is the empty language.]
196
Closure Under Complementation
The complement of a language L (with respect to an alphabet Σ such that Σ* contains L) is Σ* – L. Since Σ* is surely regular, the complement of a regular language is always regular.
197
Closure Under Reversal
Recall the example of a DFA that accepted the binary strings that, as integers, were divisible by 23. We said that the language of binary strings whose reversal was divisible by 23 was also regular, but the DFA construction was very tricky. Good application of reversal-closure.
198
Closure Under Reversal – (2)
Given language L, L^R is the set of strings whose reversal is in L. Example: L = {0, 01, 100}; L^R = {0, 10, 001}. Proof: Let E be a regular expression for L. We show how to reverse E, to provide a regular expression E^R for L^R.
199
Reversal of a Regular Expression
Basis: If E is a symbol a, ε, or ∅, then E^R = E. Induction: If E is F+G, then E^R = F^R + G^R. If E is FG, then E^R = G^R F^R. If E is F*, then E^R = (F^R)*.
200
Example: Reversal of a RE
Let E = 01* + 10*. E^R = (01* + 10*)^R = (01*)^R + (10*)^R = (1*)^R 0^R + (0*)^R 1^R = (1^R)* 0 + (0^R)* 1 = 1*0 + 0*1.
201
Homomorphisms A homomorphism on an alphabet is a function that gives a string for each symbol in that alphabet. Example: h(0) = ab; h(1) = ε. Extend to strings by h(a1…an) = h(a1)…h(an). Example: h(01010) = ababab.
202
Closure Under Homomorphism
If L is a regular language, and h is a homomorphism on its alphabet, then h(L) = {h(w) | w is in L} is also a regular language. Proof: Let E be a regular expression for L. Apply h to each symbol in E. Language of resulting RE is h(L).
203
Example: Closure under Homomorphism
Let h(0) = ab; h(1) = ε. Let L be the language of regular expression 01* + 10*. Then h(L) is the language of regular expression abε* + ε(ab)*. Note: use parentheses to enforce the proper grouping.
204
Example – Continued abε* + ε(ab)* can be simplified.
ε* = ε, so abε* = abε. ε is the identity under concatenation. That is, εE = Eε = E for any RE E. Thus, abε* + ε(ab)* = abε + ε(ab)* = ab + (ab)*. Finally, L(ab) is contained in L((ab)*), so a RE for h(L) is (ab)*.
205
Inverse Homomorphisms
Let h be a homomorphism and L a language whose alphabet is the output alphabet of h. h^-1(L) = {w | h(w) is in L}.
206
Example: Inverse Homomorphism
Let h(0) = ab; h(1) = ε. Let L = {abab, baba}. h^-1(L) = the language with two 0's and any number of 1's = L(1*01*01*). Notice: no string maps to baba; any string with exactly two 0's maps to abab.
207
Closure Proof for Inverse Homomorphism
Start with a DFA A for L. Construct a DFA B for h^-1(L) with: The same set of states. The same start state. The same final states. Input alphabet = the symbols to which homomorphism h applies.
208
Proof – (2) The transitions for B are computed by applying h to an input symbol a and seeing where A would go on the sequence of input symbols h(a). Formally, δB(q, a) = δA(q, h(a)).
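A sketch of the formula δB(q, a) = δA(q, h(a)) in code; h is the running example's homomorphism, while DFA A and its states P, Q are invented for illustration:

#include <cstdio>
#include <map>
#include <string>
using namespace std;

map<pair<char,char>, char> deltaA;             // a hypothetical DFA A over {a, b}

char deltaAhat(char q, const string &w) {      // A's extended delta
    for (char c : w) q = deltaA[{q, c}];
    return q;
}

int main() {
    deltaA[{'P','a'}] = 'Q'; deltaA[{'P','b'}] = 'P';
    deltaA[{'Q','a'}] = 'Q'; deltaA[{'Q','b'}] = 'P';
    map<char,string> h = {{'0', "ab"}, {'1', ""}};   // h(0) = ab, h(1) = epsilon
    for (char q : {'P', 'Q'})                        // deltaB(q, a) = deltaAhat(q, h(a))
        for (char a : {'0', '1'})
            printf("deltaB(%c, %c) = %c\n", q, a, deltaAhat(q, h[a]));
    return 0;
}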
209
Example: Inverse Homomorphism Construction
[Diagram: a DFA A over {a, b} with states A, B, C, and the constructed DFA B over {0, 1} on the same states; δB(q, 1) = q since h(1) = ε, and δB(q, 0) = δA(q, ab) since h(0) = ab.]
210
Proof – (3) Induction on |w| shows that δB(q0, w) = δA(q0, h(w)).
Basis: w = ε. δB(q0, ε) = q0, and δA(q0, h(ε)) = δA(q0, ε) = q0.
211
Proof – (4) Induction: Let w = xa; assume IH for x.
δB(q0, w) = δB(δB(q0, x), a). = δB(δA(q0, h(x)), a) by the IH. = δA(δA(q0, h(x)), h(a)) by definition of the DFA B. = δA(q0, h(x)h(a)) by definition of the extended delta. = δA(q0, h(w)) by def. of homomorphism.
212
Context-Free Grammars
Formalism Derivations Backus-Naur Form Left- and Rightmost Derivations
213
Informal Comments A context-free grammar is a notation for describing languages. It is more powerful than finite automata or RE’s, but still cannot define all possible languages. Useful for nested structures, e.g., parentheses in programming languages.
214
Informal Comments – (2) Basic idea is to use “variables” to stand for sets of strings (i.e., languages). These variables are defined recursively, in terms of one another. Recursive rules (“productions”) involve only concatenation. Alternative rules for a variable allow union.
215
Example: CFG for {0^n1^n | n ≥ 1}
Productions: S -> 01 S -> 0S1 Basis: 01 is in the language. Induction: if w is in the language, then so is 0w1.
216
CFG Formalism Terminals = symbols of the alphabet of the language being defined. Variables = nonterminals = a finite set of other symbols, each of which represents a language. Start symbol = the variable whose language is the one being defined.
217
Productions A production has the form variable -> string of variables and terminals. Convention: A, B, C,… are variables. a, b, c,… are terminals. …, X, Y, Z are either terminals or variables. …, w, x, y, z are strings of terminals only. α, β, γ,… are strings of terminals and/or variables.
218
Example: Formal CFG Here is a formal CFG for {0^n1^n | n ≥ 1}.
Terminals = {0, 1}. Variables = {S}. Start symbol = S. Productions = S -> 01 S -> 0S1
219
Derivations – Intuition
We derive strings in the language of a CFG by starting with the start symbol, and repeatedly replacing some variable A by the right side of one of its productions. That is, the “productions for A” are those that have A on the left side of the ->.
220
Derivations – Formalism
We say αAβ => αγβ if A -> γ is a production. Example: S -> 01; S -> 0S1. S => 0S1 => 00S11 => 000111
221
Iterated Derivation =>* means “zero or more derivation steps.”
Basis: α =>* α for any string α. Induction: if α =>* β and β => γ, then α =>* γ.
222
Example: Iterated Derivation
S -> 01; S -> 0S1. S => 0S1 => 00S11 => 000111 So S =>* S; S =>* 0S1; S =>* 00S11; S =>* 000111
223
Sentential Forms Any string of variables and/or terminals derived from the start symbol is called a sentential form. Formally, α is a sentential form iff S =>* α.
224
Language of a Grammar If G is a CFG, then L(G), the language of G, is {w | S =>* w}. Note: w must be a terminal string, S is the start symbol. Example: G has productions S -> ε and S -> 0S1. L(G) = {0^n1^n | n ≥ 0}. Note: ε is a legitimate right side.
225
Context-Free Languages
A language that is defined by some CFG is called a context-free language. There are CFL’s that are not regular languages, such as the example just given. But not all languages are CFL’s. Intuitively: CFL’s can count two things, not three.
226
BNF Notation Grammars for programming languages are often written in BNF (Backus-Naur Form ). Variables are words in <…>; Example: <statement>. Terminals are often multicharacter strings indicated by boldface or underline; Example: while or WHILE.
227
BNF Notation – (2) Symbol ::= is often used for ->.
Symbol | is used for “or.” A shorthand for a list of productions with the same left side. Example: S -> 0S1 | 01 is shorthand for S -> 0S1 and S -> 01.
228
BNF Notation – Kleene Closure
Symbol … is used for "one or more." Example: <digit> ::= 0|1|2|3|4|5|6|7|8|9 <unsigned integer> ::= <digit>… Note: that's not exactly the * of RE's. Translation: Replace α… with a new variable A and productions A -> Aα | α.
229
Example: Kleene Closure
Grammar for unsigned integers can be replaced by: U -> UD | D D -> 0|1|2|3|4|5|6|7|8|9
230
BNF Notation: Optional Elements
Surround one or more symbols by […] to make them optional. Example: <statement> ::= if <condition> then <statement> [; else <statement>] Translation: replace [α] by a new variable A with productions A -> α | ε.
231
Example: Optional Elements
Grammar for if-then-else can be replaced by: S -> iCtSA A -> ;eS | ε
232
BNF Notation – Grouping
Use {…} to surround a sequence of symbols that need to be treated as a unit. Typically, they are followed by a … for “one or more.” Example: <statement list> ::= <statement> [{;<statement>}…]
233
Translation: Grouping
You may, if you wish, create a new variable A for {α}. One production for A: A -> α. Use A in place of {α}.
234
Example: Grouping L -> S [{;S}…]
Replace by:
L -> S [A…]
A -> ;S
(A stands for {;S}.)
Then by:
L -> SB
B -> A… | ε
A -> ;S
(B stands for [A…], i.e., zero or more A's.)
Finally by:
L -> SB
B -> C | ε
C -> AC | A
A -> ;S
(C stands for A… .)
235
Leftmost and Rightmost Derivations
Derivations allow us to replace any of the variables in a string. Leads to many different derivations of the same string. By forcing the leftmost variable (or alternatively, the rightmost variable) to be replaced, we avoid these “distinctions without a difference.”
236
Leftmost Derivations Say wAα =>lm wβα if w is a string of terminals only and A -> β is a production. Also, α =>*lm β if α becomes β by a sequence of 0 or more =>lm steps.
237
Example: Leftmost Derivations
Balanced-parentheses grammar: S -> SS | (S) | () S =>lm SS =>lm (S)S =>lm (())S =>lm (())() Thus, S =>*lm (())() S => SS => S() => (S)() => (())() is a derivation, but not a leftmost derivation.
238
Rightmost Derivations
Say αAw =>rm αβw if w is a string of terminals only and A -> β is a production. Also, α =>*rm β if α becomes β by a sequence of 0 or more =>rm steps.
239
Example: Rightmost Derivations
Balanced-parentheses grammar: S -> SS | (S) | () S =>rm SS =>rm S() =>rm (S)() =>rm (())() Thus, S =>*rm (())() S => SS => SSS => S()S => ()()S => ()()() is neither a rightmost nor a leftmost derivation.
240
Parse Trees Definitions Relationship to Left- and Rightmost Derivations Ambiguity in Grammars
241
Parse Trees Parse trees are trees labeled by symbols of a particular CFG. Leaves: labeled by a terminal or ε. Interior nodes: labeled by a variable. Children are labeled by the right side of a production for the parent. Root: must be labeled by the start symbol.
242
Example: Parse Tree S -> SS | (S) | () [Parse tree: root S with two children S, S; the left S expands as ( S ), whose inner S expands as ( ); the right S expands as ( ).]
243
Yield of a Parse Tree The concatenation of the labels of the leaves in left-to-right order (that is, in the order of a preorder traversal) is called the yield of the parse tree. Example: the yield of the tree above is (())().
244
Parse Trees, Left- and Rightmost Derivations
For every parse tree, there is a unique leftmost, and a unique rightmost derivation. We’ll prove: If there is a parse tree with root labeled A and yield w, then A =>*lm w. If A =>*lm w, then there is a parse tree with root A and yield w.
245
Proof – Part 1 Induction on the height (length of the longest path from the root) of the tree. Basis: height 1. The tree is a root A with leaf children a1…an, so A -> a1…an must be a production. Thus, A =>*lm a1…an.
246
Part 1 – Induction Assume (1) for trees of height < h, and let this tree have height h, with root A, children X1…Xn, and yield w1…wn (the subtree rooted at Xi has yield wi). By the IH, Xi =>*lm wi. Note: if Xi is a terminal, then Xi = wi. Thus, A =>lm X1…Xn =>*lm w1X2…Xn =>*lm w1w2X3…Xn =>*lm … =>*lm w1…wn.
247
Proof: Part 2 Given a leftmost derivation of a terminal string, we need to prove the existence of a parse tree. The proof is an induction on the length of the derivation.
248
Part 2 – Basis If A =>*lm a1…an by a one-step derivation, then there must be a parse tree with root A and leaves a1…an.
249
Part 2 – Induction Assume (2) for derivations of fewer than k steps (k > 1), and let A =>*lm w be a k-step derivation. First step is A =>lm X1…Xn. Key point: w can be divided so the first portion is derived from X1, the next is derived from X2, and so on. If Xi is a terminal, then wi = Xi.
250
Induction – (2) That is, Xi =>*lm wi for all i such that Xi is a variable. And the derivation takes fewer than k steps. By the IH, if Xi is a variable, then there is a parse tree with root Xi and yield wi. Thus, there is a parse tree with root A, children X1…Xn, and yield w1…wn.
251
Parse Trees and Rightmost Derivations
The ideas are essentially the mirror image of the proof for leftmost derivations. Left to the imagination.
252
Parse Trees and Any Derivation
The proof that you can obtain a parse tree from a leftmost derivation doesn’t really depend on “leftmost.” First step still has to be A => X1…Xn. And w still can be divided so the first portion is derived from X1, the next is derived from X2, and so on.
253
Ambiguous Grammars A CFG is ambiguous if there is a string in the language that is the yield of two or more parse trees. Example: S -> SS | (S) | () Two parse trees for ()()() on next slide.
254
Example – Continued [Two parse trees for ()()(): in one, the root's left child derives ()() and the right child derives (); in the other, the left child derives () and the right child derives ()().]
255
Ambiguity, Left- and Rightmost Derivations
If there are two different parse trees, they must produce two different leftmost derivations by the construction given in the proof. Conversely, two different leftmost derivations produce different parse trees by the other part of the proof. Likewise for rightmost derivations.
256
Ambiguity, etc. – (2) Thus, equivalent definitions of “ambiguous grammar’’ are: There is a string in the language that has two different leftmost derivations. There is a string in the language that has two different rightmost derivations.
257
Ambiguity is a Property of Grammars, not Languages
For the balanced-parentheses language, here is another CFG, which is unambiguous. B -> (RB | ε R -> ) | (RR B, the start symbol, derives balanced strings. R generates strings that have one more right paren than left.
258
Example: Unambiguous Grammar
B -> (RB | ε R -> ) | (RR Construct a unique leftmost derivation for a given balanced string of parentheses by scanning the string from left to right. If we need to expand B, then use B -> (RB if the next symbol is “(” and ε if at the end. If we need to expand R, use R -> ) if the next symbol is “)” and (RR if it is “(”.
259
The Parsing Process Remaining Input: (())()
Steps of leftmost derivation: B (Grammar: B -> (RB | ε; R -> ) | (RR.)
260
The Parsing Process Remaining Input: ())()
Steps of leftmost derivation: B => (RB
261
The Parsing Process Remaining Input: ))()
Steps of leftmost derivation: B => (RB => ((RRB
262
The Parsing Process Remaining Input: )() Steps of leftmost derivation:
B => (RB => ((RRB => (()RB
263
The Parsing Process Remaining Input: () Steps of leftmost derivation:
B => (RB => ((RRB => (()RB => (())B
264
The Parsing Process Remaining Input: ) Steps of leftmost derivation:
B => (RB => ((RRB => (()RB => (())B => (())(RB
265
The Parsing Process Remaining Input: Steps of leftmost derivation:
B => (RB => ((RRB => (()RB => (())B => (())(RB => (())()B
266
The Parsing Process Remaining Input: Steps of leftmost derivation:
B => (RB => ((RRB => (()RB => (())B => (())(RB => (())()B => (())()
267
LL(1) Grammars As an aside, a grammar such as B -> (RB | ε, R -> ) | (RR, where you can always figure out the production to use in a leftmost derivation by scanning the given string left-to-right and looking only at the next one symbol, is called LL(1). "Leftmost derivation, left-to-right scan, one symbol of lookahead."
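Because the grammar is LL(1), the derivation above can be turned directly into a recursive-descent recognizer, one function per variable. A minimal sketch:

#include <cstdio>

const char *p;                     // next unread input symbol

bool parseR();                     // forward declaration

bool parseB() {                    // B -> (RB | epsilon
    if (*p == '(') { p++; return parseR() && parseB(); }
    return true;                   // epsilon: consume nothing
}

bool parseR() {                    // R -> ) | (RR
    if (*p == ')') { p++; return true; }
    if (*p == '(') { p++; return parseR() && parseR(); }
    return false;                  // neither ( nor ): no production applies
}

int main() {
    const char *input = "(())()";
    p = input;
    bool ok = parseB() && *p == '\0';     // B must consume the entire string
    printf("%s is %sbalanced\n", input, ok ? "" : "not ");
    return 0;
}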
268
LL(1) Grammars – (2) Most programming languages have LL(1) grammars.
LL(1) grammars are never ambiguous.
269
Inherent Ambiguity It would be nice if for every ambiguous grammar, there were some way to “fix” the ambiguity, as we did for the balanced-parentheses grammar. Unfortunately, certain CFL’s are inherently ambiguous, meaning that every grammar for the language is ambiguous.
270
Example: Inherent Ambiguity
The language {0^i1^j2^k | i = j or j = k} is inherently ambiguous. Intuitively, at least some of the strings of the form 0^n1^n2^n must be generated by two different parse trees, one based on checking the 0's and 1's, the other based on checking the 1's and 2's.
271
One Possible Ambiguous Grammar
S -> AB | CD
A -> 0A1 | 01   (A generates equal 0's and 1's)
B -> 2B | 2     (B generates any number of 2's)
C -> 0C | 0     (C generates any number of 0's)
D -> 1D2 | 12   (D generates equal 1's and 2's)
And there are two derivations of every string with equal numbers of 0's, 1's, and 2's. E.g.:
S => AB => 01B => 012
S => CD => 0D => 012
272
Normal Forms for CFG’s Eliminating Useless Variables Removing Epsilon
Removing Unit Productions Chomsky Normal Form
273
Variables That Derive Nothing
Consider: S -> AB, A -> aA | a, B -> AB Although A derives all strings of a’s, B derives no terminal strings (can you prove this fact?). Thus, S derives nothing, and the language is empty.
274
Testing Whether a Variable Derives Some Terminal String
Basis: If there is a production A -> w, where w has no variables, then A derives a terminal string. Induction: If there is a production A -> α, where α consists only of terminals and variables known to derive a terminal string, then A derives a terminal string.
275
Testing – (2) Eventually, we can find no more variables.
An easy induction on the order in which variables are discovered shows that each one truly derives a terminal string. Conversely, any variable that derives a terminal string will be discovered by this algorithm.
276
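This discovery procedure is a simple fixed-point computation. Below is a minimal C sketch (ours, not from the text), hard-coding the grammar S -> AB | C, A -> aA | a, B -> bB, C -> c that appears in the example a few slides ahead; upper-case letters are variables, everything else a terminal.

#include <stdio.h>

struct prod { char lhs; const char *rhs; };
static struct prod G[] = {
    {'S', "AB"}, {'S', "C"}, {'A', "aA"}, {'A', "a"},
    {'B', "bB"}, {'C', "c"}
};
static int nprods = 6;

int main(void) {
    int derives[128] = {0};      /* derives['X'] = 1 once X is discovered */
    int changed = 1;
    while (changed) {            /* iterate until no new variable is found */
        changed = 0;
        for (int i = 0; i < nprods; i++) {
            if (derives[(int)G[i].lhs]) continue;
            int ok = 1;          /* right side all terminals/discovered vars? */
            for (const char *s = G[i].rhs; *s; s++)
                if (*s >= 'A' && *s <= 'Z' && !derives[(int)*s]) ok = 0;
            if (ok) { derives[(int)G[i].lhs] = 1; changed = 1; }
        }
    }
    for (char v = 'A'; v <= 'Z'; v++)
        if (derives[(int)v]) printf("%c derives a terminal string\n", v);
    return 0;    /* prints A, C, S but not B, matching the example */
}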
Proof of Converse The proof is an induction on the height of the least-height parse tree by which a variable A derives a terminal string. Basis: Height = 1. The tree is a root A whose children a1, …, an are all terminals. Then the basis of the algorithm tells us that A will be discovered.
277
Induction for Converse
Assume IH for parse trees of height < h, and suppose A derives a terminal string via a parse tree of height h: the root A has children X1, …, Xn, whose subtrees yield w1, …, wn. By IH, those Xi’s that are variables are discovered. Thus, A will also be discovered, because it has a right side of terminals and/or discovered variables.
278
Algorithm to Eliminate Variables That Derive Nothing
Discover all variables that derive terminal strings. For all other variables, remove all productions in which they appear either on the left or the right.
279
Example: Eliminate Variables
S -> AB | C, A -> aA | a, B -> bB, C -> c Basis: A and C are identified because of A -> a and C -> c. Induction: S is identified because of S -> C. Nothing else can be identified. Result: S -> C, A -> aA | a, C -> c
280
Unreachable Symbols Another way a terminal or variable deserves to be eliminated is if it cannot appear in any derivation from the start symbol. Basis: We can reach S (the start symbol). Induction: if we can reach A, and there is a production A -> α, then we can reach all symbols of α.
281
Unreachable Symbols – (2)
Easy inductions in both directions show that when we can discover no more symbols, then we have all and only the symbols that appear in derivations from S. Algorithm: Remove from the grammar all symbols not discovered (i.e., not reachable from S) and all productions that involve these symbols.
282
Eliminating Useless Symbols
A symbol is useful if it appears in some derivation of some terminal string from the start symbol. Otherwise, it is useless. Eliminate all useless symbols by: (1) eliminate symbols that derive no terminal string; (2) eliminate unreachable symbols.
283
Example: Useless Symbols – (2)
S -> AB, A -> C, C -> c, B -> bB In that order: B derives no terminal string, so eliminating it removes S -> AB, and then A, C, and c are unreachable. If we eliminated unreachable symbols first, we would find everything is reachable; A, C, and c would never get eliminated.
284
Why It Works After step (1), every symbol remaining derives some terminal string. After step (2), the only symbols remaining are reachable from S. In addition, they still derive a terminal string, because such a derivation can only involve symbols reachable from S.
285
Epsilon Productions We can almost avoid using productions of the form A -> ε (called ε-productions). The problem is that ε cannot be in the language of any grammar that has no ε-productions. Theorem: If L is a CFL, then L – {ε} has a CFG with no ε-productions.
286
Nullable Symbols To eliminate ε-productions, we first need to discover the nullable variables = variables A such that A =>* ε. Basis: If there is a production A -> ε, then A is nullable. Induction: If there is a production A -> α, and all symbols of α are nullable, then A is nullable.
287
Example: Nullable Symbols
S -> AB, A -> aA | ε, B -> bB | A Basis: A is nullable because of A -> ε. Induction: B is nullable because of B -> A. Then, S is nullable because of S -> AB.
288
Proof of Nullable-Symbols Algorithm
The proof that this algorithm finds all and only the nullable variables is very much like the proof that the algorithm for symbols that derive terminal strings works. Do you see the two directions of the proof? On what is each induction?
289
Eliminating ε-Productions
Key idea: turn each production A -> X1…Xn into a family of productions. For each subset of nullable X’s, there is one production with those eliminated from the right side “in advance.” Except, if all X’s are nullable, do not make a production with ε as the right side.
290
Example: Eliminating ε-Productions
S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε A, B, C, and S are all nullable. New grammar: S -> ABC | AB | AC | BC | A | B | C A -> aA | a B -> bB | b Note: C is now useless. Eliminate its productions.
291
Why it Works Prove that for all variables A:
If w ≠ ε and A =>*old w, then A =>*new w. If A =>*new w, then w ≠ ε and A =>*old w. Then, letting A be the start symbol proves that L(new) = L(old) – {ε}. (1) is an induction on the number of steps by which A derives w in the old grammar.
292
Proof of 1 – Basis If the old derivation is one step, then A -> w must be a production. Since w ≠ ε, this production also appears in the new grammar. Thus, A =>new w.
293
Proof of 1 – Induction Let A =>*old w be an n-step derivation, and assume the IH for derivations of fewer than n steps. Let the first step be A =>old X1…Xm. Then w can be broken into w = w1…wm, where Xi =>*old wi, for all i, in fewer than n steps.
294
Induction – Continued By the IH, if wi ≠ ε, then Xi =>*new wi.
Also, the new grammar has a production with A on the left, and just those Xi’s on the right such that wi ≠ ε. Note: they can’t all be ε, because w ≠ ε. Follow a use of this production by the derivations Xi =>*new wi to show that A derives w in the new grammar.
295
Proof of Converse We also need to show part (2) – if w is derived from A in the new grammar, then it is also derived in the old. Induction on number of steps in the derivation. We’ll leave the proof for reading in the text.
296
Unit Productions A unit production is one whose right side consists of exactly one variable. These productions can be eliminated. Key idea: If A =>* B by a series of unit productions, and B -> α is a non-unit production, then add production A -> α. Then, drop all unit productions.
297
Unit Productions – (2) Find all pairs (A, B) such that A =>* B by a sequence of unit productions only. Basis: Surely (A, A). Induction: If we have found (A, B), and B -> C is a unit production, then add (A, C).
298
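The pair-finding is another fixed-point computation. A minimal C sketch (ours, not from the text), using the hypothetical unit productions E -> T and T -> F of an expression grammar:

#include <stdio.h>

int main(void) {
    enum { E, T, F, N };             /* variables indexed 0..N-1 */
    const char *name[] = {"E", "T", "F"};
    int unit[3][3] = {0}, pair[3][3] = {0};
    unit[E][T] = 1;                  /* E -> T */
    unit[T][F] = 1;                  /* T -> F */
    for (int a = 0; a < N; a++) pair[a][a] = 1;   /* basis: (A, A) */
    int changed = 1;
    while (changed) {    /* induction: (A,B) found and B -> C gives (A,C) */
        changed = 0;
        for (int a = 0; a < N; a++)
            for (int b = 0; b < N; b++)
                if (pair[a][b])
                    for (int c = 0; c < N; c++)
                        if (unit[b][c] && !pair[a][c])
                            { pair[a][c] = 1; changed = 1; }
    }
    for (int a = 0; a < N; a++)
        for (int b = 0; b < N; b++)
            if (pair[a][b]) printf("(%s, %s)\n", name[a], name[b]);
    return 0;   /* includes (E, F), found transitively */
}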
Proof That We Find Exactly the Right Pairs
By induction on the order in which pairs (A, B) are found, we can show A =>* B by unit productions. Conversely, by induction on the number of steps in the derivation by unit productions of A =>* B, we can show that the pair (A, B) is discovered.
299
Proof That the Unit-Production-Elimination Algorithm Works
Basic idea: there is a leftmost derivation A =>*lm w in the new grammar if and only if there is such a derivation in the old. A sequence of unit productions and a non-unit production is collapsed into a single production of the new grammar.
300
Cleaning Up a Grammar Theorem: if L is a CFL, then there is a CFG for L – {ε} that has: No useless symbols. No ε-productions. No unit productions. I.e., every right side is either a single terminal or has length ≥ 2.
301
Cleaning Up – (2) Proof: Start with a CFG for L.
Perform the following steps in order: (1) eliminate ε-productions (this step must be first; it can create unit productions or useless variables); (2) eliminate unit productions; (3) eliminate variables that derive no terminal string; (4) eliminate variables not reached from the start symbol.
302
Chomsky Normal Form A CFG is said to be in Chomsky Normal Form if every production is of one of these two forms: A -> BC (right side is two variables). A -> a (right side is a single terminal). Theorem: If L is a CFL, then L – {ε} has a CFG in CNF.
303
Proof of CNF Theorem Step 1: “Clean” the grammar, so every production right side is either a single terminal or of length at least 2. Step 2: For each right side that is not a single terminal, make the right side all variables. For each terminal a create new variable Aa and production Aa -> a. Replace a by Aa in right sides of length ≥ 2.
304
Example: Step 2 Consider production A -> BcDe.
We need variables Ac and Ae, with productions Ac -> c and Ae -> e. Note: you create at most one variable for each terminal, and use it everywhere it is needed. Replace A -> BcDe by A -> BAcDAe.
305
CNF Proof – Continued Step 3: Break right sides longer than 2 into a chain of productions with right sides of two variables. Example: A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE. F and G must be used nowhere else.
306
Example of Step 3 – Continued
Recall A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE. In the new grammar, A => BF => BCG => BCDE. More importantly: Once we choose to replace A by BF, we must continue to BCG and BCDE. Because F and G have only one production.
307
CNF Proof – Concluded We must prove that Steps 2 and 3 produce new grammars whose languages are the same as the previous grammar. Proofs are of a familiar type and involve inductions on the lengths of derivations.
308
Pushdown Automata Definition Moves of the PDA Languages of the PDA Deterministic PDA’s
309
Pushdown Automata The PDA is an automaton equivalent to the CFG in language-defining power. Only the nondeterministic PDA defines all the CFL’s. But the deterministic version models parsers. Most programming languages have deterministic PDA’s.
310
Intuition: PDA Think of an ε-NFA with the additional power that it can manipulate a stack. Its moves are determined by: The current state (of its “NFA”), The current input symbol (or ε), and The current symbol on top of its stack.
311
Intuition: PDA – (2) Being nondeterministic, the PDA can have a choice of next moves. In each choice, the PDA can: Change state, and also Replace the top symbol on the stack by a sequence of zero or more symbols. Zero symbols = “pop.” Many symbols = sequence of “pushes.”
312
PDA Formalism A PDA is described by:
A finite set of states (Q, typically). An input alphabet (Σ, typically). A stack alphabet (Γ, typically). A transition function (δ, typically). A start state (q0, in Q, typically). A start symbol (Z0, in Γ, typically). A set of final states (F ⊆ Q, typically).
313
Conventions a, b, … are input symbols. But sometimes we allow ε as a possible value. …, X, Y, Z are stack symbols. …, w, x, y, z are strings of input symbols. α, β, … are strings of stack symbols.
314
The Transition Function
Takes three arguments: A state, in Q. An input, which is either a symbol in Σ or ε. A stack symbol in Γ. δ(q, a, Z) is a set of zero or more actions of the form (p, α). p is a state; α is a string of stack symbols.
315
Actions of the PDA If δ(q, a, Z) contains (p, α) among its actions, then one thing the PDA can do in state q, with a at the front of the input, and Z on top of the stack is: Change the state to p. Remove a from the front of the input (but a may be ε). Replace Z on the top of the stack by α.
316
Example: PDA Design a PDA to accept {0^n1^n | n ≥ 1}. The states:
q = start state. We are in state q if we have seen only 0’s so far. p = we’ve seen at least one 1 and may now proceed only if the inputs are 1’s. f = final state; accept.
317
Example: PDA – (2) The stack symbols:
Z0 = start symbol. Also marks the bottom of the stack, so we know when we have counted the same number of 1’s as 0’s. X = marker, used to count the number of 0’s seen on the input.
318
Example: PDA – (3) The transitions: δ(q, 0, Z0) = {(q, XZ0)}.
δ(q, 0, X) = {(q, XX)}. These two rules cause one X to be pushed onto the stack for each 0 read from the input. δ(q, 1, X) = {(p, ε)}. When we see a 1, go to state p and pop one X. δ(p, 1, X) = {(p, ε)}. Pop one X per 1. δ(p, ε, Z0) = {(f, Z0)}. Accept at bottom.
319
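Since this particular PDA happens to be deterministic, its moves can be simulated directly, without search. A minimal C sketch (ours, not from the text) of the transitions just listed, with the character 'Z' standing in for Z0:

#include <stdio.h>

/* Simulates the 0^n 1^n PDA; stack kept in an array, top at index top. */
static int accepts(const char *w) {
    char stack[256] = "Z";       /* 'Z' plays the role of Z0 */
    int top = 0;
    char state = 'q';
    for (; *w; w++) {
        if (state == 'q' && *w == '0')
            stack[++top] = 'X';                    /* push one X per 0 */
        else if (state == 'q' && *w == '1' && stack[top] == 'X')
            { state = 'p'; top--; }                /* first 1: go to p, pop */
        else if (state == 'p' && *w == '1' && stack[top] == 'X')
            top--;                                 /* pop one X per 1 */
        else
            return 0;                              /* no move: reject */
    }
    /* epsilon-move delta(p, eps, Z0) = {(f, Z0)}: accept at bottom */
    return state == 'p' && stack[top] == 'Z';
}

int main(void) {
    printf("%d %d %d\n", accepts("000111"), accepts("0001111"),
           accepts("01"));       /* prints 1 0 1 */
    return 0;
}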
Actions of the Example PDA
[Eight animation frames showing the PDA processing input 000111: in state q an X is pushed for each 0 read, so the stack grows from Z0 to XZ0, XXZ0, XXXZ0; the first 1 sends the PDA to state p, and each 1 pops one X; with the input consumed and Z0 exposed, the ε-move takes the PDA to final state f.]
327
Instantaneous Descriptions
We can formalize the pictures just seen with an instantaneous description (ID). An ID is a triple (q, w, α), where: q is the current state. w is the remaining input. α is the stack contents, top at the left.
328
The “Goes-To” Relation
To say that ID I can become ID J in one move of the PDA, we write I⊦J. Formally, (q, aw, Xβ)⊦(p, w, αβ) for any w and β, if δ(q, a, X) contains (p, α). Extend ⊦ to ⊦*, meaning “zero or more moves,” by: Basis: I⊦*I. Induction: If I⊦*J and J⊦K, then I⊦*K.
329
Example: Goes-To Using the previous example PDA, we can describe the sequence of moves on input 000111 by: (q, 000111, Z0)⊦(q, 00111, XZ0)⊦ (q, 0111, XXZ0)⊦(q, 111, XXXZ0)⊦ (p, 11, XXZ0)⊦(p, 1, XZ0)⊦(p, ε, Z0)⊦ (f, ε, Z0) Thus, (q, 000111, Z0)⊦*(f, ε, Z0). What would happen on input 0001111?
330
Answer (q, 0001111, Z0)⊦(q, 001111, XZ0)⊦ (q, 01111, XXZ0)⊦(q, 1111, XXXZ0)⊦ (p, 111, XXZ0)⊦(p, 11, XZ0)⊦(p, 1, Z0)⊦ (f, 1, Z0) The last move is legal because a PDA can use ε input even if input remains. Note the last ID has no move. 0001111 is not accepted, because the input is not completely consumed.
331
Aside: FA and PDA Notations
We represented moves of a FA by an extended δ, which did not mention the input yet to be read. We could have chosen a similar notation for PDA’s, where the FA state is replaced by a state-stack combination, like the pictures just shown.
332
FA and PDA Notations – (2)
Similarly, we could have chosen a FA notation with ID’s. Just drop the stack component. Why the difference? My theory: FA tend to model things like protocols, with indefinitely long inputs. PDA model parsers, which are given a fixed program to process.
333
Language of a PDA The common way to define the language of a PDA is by final state. If P is a PDA, then L(P) is the set of strings w such that (q0, w, Z0) ⊦* (f, ε, α) for final state f and any α.
334
Language of a PDA – (2) Another language defined by the same PDA is by empty stack. If P is a PDA, then N(P) is the set of strings w such that (q0, w, Z0) ⊦* (q, ε, ε) for any state q.
335
Equivalence of Language Definitions
If L = L(P), then there is another PDA P’ such that L = N(P’). If L = N(P), then there is another PDA P’’ such that L = L(P’’).
336
Proof: L(P) -> N(P’) Intuition
P’ will simulate P. If P accepts, P’ will empty its stack. P’ has to avoid accidentally emptying its stack, so it uses a special bottom-marker to catch the case where P empties its stack without accepting.
337
Proof: L(P) -> N(P’)
P’ has all the states, symbols, and moves of P, plus: Stack symbol X0, used to guard the stack bottom against accidental emptying. New start state s and “erase” state e. δ(s, ε, X0) = {(q0, Z0X0)}. Get P started. δ(f, ε, X) = δ(e, ε, X) = {(e, ε)} for any final state f of P and any stack symbol X.
338
Proof: N(P) -> L(P’’) Intuition
P” simulates P. P” has a special bottom-marker to catch the situation where P empties its stack. If so, P” accepts.
339
Proof: N(P) -> L(P’’)
P’’ has all the states, symbols, and moves of P, plus: Stack symbol X0, used to guard the stack bottom. New start state s and final state f. δ(s, ε, X0) = {(q0, Z0X0)}. Get P started. δ(q, ε, X0) = {(f, ε)} for any state q of P.
340
Deterministic PDA’s To be deterministic, there must be at most one choice of move for any state q, input symbol a, and stack symbol X. In addition, there must not be a choice between using input ε or real input. Formally, δ(q, a, X) and δ(q, ε, X) cannot both be nonempty.
341
Equivalence of PDA, CFG Conversion of CFG to PDA Conversion of PDA to CFG
342
Overview When we talked about closure properties of regular languages, it was useful to be able to jump between RE and DFA representations. Similarly, CFG’s and PDA’s are both useful to deal with properties of the CFL’s.
343
Overview – (2) Also, PDA’s, being “algorithmic,” are often easier to use when arguing that a language is a CFL. Example: It is easy to see how a PDA can recognize balanced parentheses; not so easy as a grammar. But all depends on knowing that CFG’s and PDA’s both define the CFL’s.
344
Converting a CFG to a PDA
Let L = L(G). Construct PDA P such that N(P) = L. P has: One state q. Input symbols = terminals of G. Stack symbols = all symbols of G. Start symbol = start symbol of G.
345
Intuition About P Given input w, P will step through a leftmost derivation of w from the start symbol S. Since P can’t know what this derivation is, or even what the end of w is, it uses nondeterminism to “guess” the production to use at each step.
346
Intuition – (2) At each step, P represents some left-sentential form (step of a leftmost derivation). If the stack of P is α, and P has so far consumed x from its input, then P represents the left-sentential form xα. At empty stack, the input consumed is a string in L(G).
347
Transition Function of P
δ(q, a, a) = {(q, ε)}. (Type 1 rules) This step does not change the LSF represented, but “moves” responsibility for a from the stack to the consumed input. If A -> β is a production of G, then δ(q, ε, A) contains (q, β). (Type 2 rules) Guess a production for A, and represent the next LSF in the derivation.
348
Proof That N(P) = L(G) We need to show that (q, wx, S) ⊦* (q, x, α) for any x if and only if S =>*lm wα. Part 1: “only if” is an induction on the number of steps made by P. Basis: 0 steps. Then α = S, w = ε, and S =>*lm S is surely true.
349
Induction for Part 1 Consider n moves of P: (q, wx, S) ⊦* (q, x, α) and assume the IH for sequences of n-1 moves. There are two cases, depending on whether the last move uses a Type 1 or Type 2 rule.
350
Use of a Type 1 Rule The move sequence must be of the form (q, yax, S) ⊦* (q, ax, aα) ⊦ (q, x, α), where ya = w. By the IH applied to the first n-1 steps, S =>*lm yaα. But ya = w, so S =>*lm wα.
351
Use of a Type 2 Rule The move sequence must be of the form (q, wx, S) ⊦* (q, x, Aβ) ⊦ (q, x, γβ), where A -> γ is a production and α = γβ. By the IH applied to the first n-1 steps, S =>*lm wAβ. Thus, S =>*lm wγβ = wα.
352
Proof of Part 2 (“if”) We also must prove that if S =>*lm wα, then (q, wx, S) ⊦* (q, x, α) for any x. Induction on number of steps in the leftmost derivation. Ideas are similar; read in text.
353
Proof – Completion We now have (q, wx, S) ⊦* (q, x, α) for any x if and only if S =>*lm wα. In particular, let x = α = ε. Then (q, w, S) ⊦* (q, ε, ε) if and only if S =>*lm w. That is, w is in N(P) if and only if w is in L(G).
354
From a PDA to a CFG Now, assume L = N(P).
We’ll construct a CFG G such that L = L(G). Intuition: G will have variables generating exactly the inputs that cause P to have the net effect of popping a stack symbol X while going from state p to state q. P never gets below this X while doing so.
355
Variables of G G’s variables are of the form [pXq].
This variable generates all and only the strings w such that (p, w, X) ⊦*(q, ε, ε). Also a start symbol S we’ll talk about later.
356
Productions of G Each production for [pXq] comes from a move of P in state p with stack symbol X. Simplest case: δ(p, a, X) contains (q, ε). Then the production is [pXq] -> a. Note a can be an input symbol or ε. Here, [pXq] generates a, because reading a is one way to pop X and go from p to q.
357
Productions of G – (2) Next simplest case: δ(p, a, X) contains (r, Y) for some state r and symbol Y. G has production [pXq] -> a[rYq]. We can erase X and go from p to q by reading a (entering state r and replacing the X by Y) and then reading some w that gets P from r to q while erasing the Y. Note: [pXq] =>* aw whenever [rYq] =>* w.
358
Productions of G – (3) Third simplest case: δ(p, a, X) contains (r, YZ) for some state r and symbols Y and Z. Now, P has replaced X by YZ. To have the net effect of erasing X, P must erase Y, going from state r to some state s, and then erase Z, going from s to q.
359
Picture of Action of P [Picture: in state p with X on top, P reads a and enters state r with YZ replacing X, w x still to be read; it erases Y while consuming w, reaching state s with Z on top and x to be read; it then erases Z while consuming x, reaching state q.]
360
Third-Simplest Case – Concluded
Since we do not know state s, we must generate a family of productions: [pXq] -> a[rYs][sZq] for all states s. [pXq] =>* awx whenever [rYs] =>* w and [sZq] =>* x.
361
Productions of G: General Case
Suppose δ(p, a, X) contains (r, Y1…Yk) for some state r and k ≥ 3. Generate the family of productions [pXq] -> a[rY1s1][s1Y2s2]…[sk-2Yk-1sk-1][sk-1Ykq] for all choices of states s1, …, sk-1.
362
Completion of the Construction
We can prove that (q0, w, Z0)⊦*(p, ε, ε) if and only if [q0Z0p] =>* w. Proof is in text; it is two easy inductions. But state p can be anything. Thus, add to G another variable S, the start symbol, and add productions S -> [q0Z0p] for each state p.
363
The Pumping Lemma for CFL’s
Statement Applications
364
Intuition Recall the pumping lemma for regular languages.
It told us that if there was a string long enough to cause a cycle in the DFA for the language, then we could “pump” the cycle and discover an infinite sequence of strings that had to be in the language.
365
Intuition – (2) For CFL’s the situation is a little more complicated.
We can always find two pieces of any sufficiently long string to “pump” in tandem. That is: if we repeat each of the two pieces the same number of times, we get another string in the language.
366
Statement of the CFL Pumping Lemma
For every context-free language L There is an integer n, such that For every string z in L of length ≥ n There exists z = uvwxy such that: |vwx| ≤ n. |vx| ≥ 1. For all i ≥ 0, uv^iwx^iy is in L.
367
Proof of the Pumping Lemma
Start with a CNF grammar for L – {ε}. Let the grammar have m variables. Pick n = 2^m. Let |z| ≥ n. We claim (“Lemma 1”) that a parse tree with yield z must have a path of length m+2 or more.
368
Proof of Lemma 1 If all paths in the parse tree of a CNF grammar are of length ≤ m+1, then the longest yield has length 2^(m-1). [Picture: a path of m variables ending in one terminal; such a tree has at most 2^(m-1) terminals in its yield.]
369
Back to the Proof of the Pumping Lemma
Now we know that the parse tree for z has a path with at least m+1 variables. Consider some longest path. There are only m different variables, so among the lowest m+1 we can find two nodes with the same label, say A. The parse tree thus looks like:
370
Parse Tree in the Pumping-Lemma Proof
[Picture: on a longest path, two nodes labeled A, one above the other; the yield splits as z = uvwxy. v and x can’t both be ε, and |vwx| ≤ 2^m = n because a longest path was chosen.]
371
Pump Zero Times [Picture: replacing the subtree rooted at the upper A by the subtree rooted at the lower A (with yield w) gives a parse tree with yield uwy.]
372
Pump Twice [Picture: inserting a second copy of the portion of the tree between the two A’s gives a parse tree with yield uv^2wx^2y.]
373
Pump Thrice [Picture: a third copy gives yield uv^3wx^3y. Etc., etc.]
374
Using the Pumping Lemma
Non-CFL’s typically involve trying to match two pairs of counts or match two strings. Example: The text uses the pumping lemma to show that {ww | w in (0+1)*} is not a CFL.
375
Using the Pumping Lemma – (2)
{0^i10^i | i ≥ 1} is a CFL. We can match one pair of counts. But L = {0^i10^i10^i | i ≥ 1} is not. We can’t match two pairs, or three counts as a group. Proof using the pumping lemma. Suppose L were a CFL. Let n be L’s pumping-lemma constant.
376
Using the Pumping Lemma – (3)
Consider z = 0^n10^n10^n. We can write z = uvwxy, where |vwx| ≤ n and |vx| ≥ 1. Case 1: vx has no 0’s. Then vx consists only of 1’s, so uwy has at most one 1, which no string in L does.
377
Using the Pumping Lemma – (4)
Still considering z = 0^n10^n10^n. Case 2: vx has at least one 0. vwx is too short (length ≤ n) to extend to all three blocks of 0’s in 0^n10^n10^n. Thus, uwy has at least one block of n 0’s, and at least one block with fewer than n 0’s. Thus, uwy is not in L.
378
Properties of Context-Free Languages
Decision Properties Closure Properties
379
Summary of Decision Properties
As usual, when we talk about “a CFL” we really mean “a representation for the CFL,” e.g., a CFG or a PDA accepting by final state or empty stack. There are algorithms to decide if: String w is in CFL L. CFL L is empty. CFL L is infinite.
380
Non-Decision Properties
Many questions that can be decided for regular sets cannot be decided for CFL’s. Example: Are two CFL’s the same? Example: Are two CFL’s disjoint? How would you do that for regular languages? Need theory of Turing machines and decidability to prove no algorithm exists.
381
Testing Emptiness We already did this.
We learned to eliminate variables that generate no terminal string. If the start symbol is one of these, then the CFL is empty; otherwise not.
382
Testing Membership Want to know if string w is in L(G).
Assume G is in CNF. Or convert the given grammar to CNF. w = ε is a special case, solved by testing if the start symbol is nullable. Algorithm (CYK) is a good example of dynamic programming and runs in time O(n^3), where n = |w|.
383
CYK Algorithm Let w = a1…an.
We construct an n-by-n triangular array of sets of variables. Xij = {variables A | A =>* ai…aj}. Induction on j–i+1, the length of the derived string. Finally, ask if S is in X1n.
384
CYK Algorithm – (2) Basis: Xii = {A | A -> ai is a production}.
Induction: Xij = {A | there is a production A -> BC and an integer k, with i ≤ k < j, such that B is in Xik and C is in Xk+1,j}.
385
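The table-filling just described is short enough to code directly. A minimal C sketch (ours, not from the text), hard-coding the grammar of the example that follows; each Xij is a bitmask over {S, A, B, C} (0-based array indices), and body2 is a hypothetical helper listing the variable-pair productions.

#include <stdio.h>
#include <string.h>

enum { S = 1, A = 2, B = 4, C = 8 };

static int body2(int b, int c) {      /* variables with a production -> BC */
    int r = 0;
    if ((b & A) && (c & B)) r |= S;   /* S -> AB */
    if ((b & B) && (c & C)) r |= A;   /* A -> BC */
    if ((b & A) && (c & C)) r |= B;   /* B -> AC */
    return r;
}

int main(void) {
    const char *w = "ababa";
    int n = strlen(w), X[16][16] = {{0}};
    for (int i = 0; i < n; i++)             /* basis: length-1 substrings */
        X[i][i] = (w[i] == 'a') ? (A | C) : (B | C);
    for (int len = 2; len <= n; len++)
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            for (int k = i; k < j; k++)     /* split ai..ak | ak+1..aj */
                X[i][j] |= body2(X[i][k], X[k + 1][j]);
        }
    printf("S in X1%d? %s\n", n, (X[0][n - 1] & S) ? "yes" : "no");
    return 0;    /* prints "no" for ababa, as the example's table shows */
}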
Example: CYK Algorithm
Grammar: S -> AB, A -> BC | a, B -> AC | b, C -> a | b String w = ababa
X12={B,S} X23={A} X34={B,S} X45={A}
X11={A,C} X22={B,C} X33={A,C} X44={B,C} X55={A,C}
386
Example: CYK Algorithm
Grammar: S -> AB, A -> BC | a, B -> AC | b, C -> a | b String w = ababa
X13={} (so far; the split X11 with X23 yields nothing)
X12={B,S} X23={A} X34={B,S} X45={A}
X11={A,C} X22={B,C} X33={A,C} X44={B,C} X55={A,C}
387
Example: CYK Algorithm
Grammar: S -> AB, A -> BC | a, B -> AC | b, C -> a | b String w = ababa
X13={A} X24={B,S} X35={A}
X12={B,S} X23={A} X34={B,S} X45={A}
X11={A,C} X22={B,C} X33={A,C} X44={B,C} X55={A,C}
388
Example: CYK Algorithm
Grammar: S -> AB, A -> BC | a, B -> AC | b, C -> a | b String w = ababa
X14={B,S}
X13={A} X24={B,S} X35={A}
X12={B,S} X23={A} X34={B,S} X45={A}
X11={A,C} X22={B,C} X33={A,C} X44={B,C} X55={A,C}
389
Example: CYK Algorithm
Grammar: S -> AB, A -> BC | a, B -> AC | b, C -> a | b String w = ababa
X15={A}
X14={B,S} X25={A}
X13={A} X24={B,S} X35={A}
X12={B,S} X23={A} X34={B,S} X45={A}
X11={A,C} X22={B,C} X33={A,C} X44={B,C} X55={A,C}
390
Testing Infiniteness The idea is essentially the same as for regular languages. Use the pumping lemma constant n. If there is a string in the language of length between n and 2n-1, then the language is infinite; otherwise not. Let’s work this out in class.
391
Closure Properties of CFL’s
CFL’s are closed under union, concatenation, and Kleene closure. Also, under reversal, homomorphisms and inverse homomorphisms. But not under intersection or difference.
392
Closure of CFL’s Under Union
Let L and M be CFL’s with grammars G and H, respectively. Assume G and H have no variables in common. Names of variables do not affect the language. Let S1 and S2 be the start symbols of G and H.
393
Closure Under Union – (2)
Form a new grammar for L ∪ M by combining all the symbols and productions of G and H. Then, add a new start symbol S. Add productions S -> S1 | S2.
394
Closure Under Union – (3)
In the new grammar, all derivations start with S. The first step replaces S by either S1 or S2. In the first case, the result must be a string in L(G) = L, and in the second case a string in L(H) = M.
395
Closure of CFL’s Under Concatenation
Let L and M be CFL’s with grammars G and H, respectively. Assume G and H have no variables in common. Let S1 and S2 be the start symbols of G and H.
396
Closure Under Concatenation – (2)
Form a new grammar for LM by starting with all symbols and productions of G and H. Add a new start symbol S. Add production S -> S1S2. Every derivation from S results in a string in L followed by one in M.
397
Closure Under Star Let L have grammar G, with start symbol S1.
Form a new grammar for L* by introducing to G a new start symbol S and the productions S -> S1S | ε. A rightmost derivation from S generates a sequence of zero or more S1’s, each of which generates some string in L.
398
Closure of CFL’s Under Reversal
If L is a CFL with grammar G, form a grammar for LR by reversing the right side of every production. Example: Let G have S -> 0S1 | 01. The reversal of L(G) has grammar S -> 1S0 | 10.
399
Closure of CFL’s Under Homomorphism
Let L be a CFL with grammar G. Let h be a homomorphism on the terminal symbols of G. Construct a grammar for h(L) by replacing each terminal symbol a by h(a).
400
Example: Closure Under Homomorphism
G has productions S -> 0S1 | 01. h is defined by h(0) = ab, h(1) = ε. h(L(G)) has the grammar with productions S -> abS | ab.
401
Closure of CFL’s Under Inverse Homomorphism
Here, grammars don’t help us. But a PDA construction serves nicely. Intuition: Let L = L(P) for some PDA P. Construct PDA P’ to accept h-1(L). P’ simulates P, but keeps, as one component of a two-component state, a buffer that holds the result of applying h to one input symbol.
402
Architecture of P’ [Picture: input 0 0 1 1; P’ applies h to the input symbol just read, holding h(0) in a buffer; it reads the first remaining symbol in the buffer as if it were input to P, and maintains the state of P and the stack of P.]
403
Formal Construction of P’
States are pairs [q, b], where: q is a state of P. b is a suffix of h(a) for some symbol a. Thus, only a finite number of possible values for b. Stack symbols of P’ are those of P. Start state of P’ is [q0, ε].
404
Construction of P’ – (2) Input symbols of P’ are the symbols to which h applies. Final states of P’ are the states [q, ε] such that q is a final state of P.
405
Transitions of P’ δ’([q, ε], a, X) = {([q, h(a)], X)} for any input symbol a of P’ and any stack symbol X. When the buffer is empty, P’ can reload it. δ’([q, bw], ε, X) contains ([p, w], α) if δ(q, b, X) contains (p, α), where b is either an input symbol of P or ε. Simulate P from the buffer.
406
Proving Correctness of P’
We need to show that L(P’) = h-1(L(P)). Key argument: P’ makes the transition ([q0, ε], w, Z0)⊦*([q, x], ε, α) if and only if P makes the transition (q0, y, Z0)⊦*(q, ε, α), where h(w) = yx and x is a suffix of h(a) for the last symbol a of w. Proof in both directions is an induction on the number of moves made.
407
Nonclosure Under Intersection
Unlike the regular languages, the class of CFL’s is not closed under ∩. We know that L1 = {0^n1^n2^n | n ≥ 1} is not a CFL (use the pumping lemma). However, L2 = {0^n1^n2^i | n ≥ 1, i ≥ 1} is. CFG: S -> AB, A -> 0A1 | 01, B -> 2B | 2. So is L3 = {0^i1^n2^n | n ≥ 1, i ≥ 1}. But L1 = L2 ∩ L3.
408
Nonclosure Under Difference
We can prove something more general: Any class of languages that is closed under difference is closed under intersection. Proof: L ∩ M = L – (L – M). Thus, if CFL’s were closed under difference, they would be closed under intersection, but they are not.
409
Intersection with a Regular Language
Intersection of two CFL’s need not be context free. But the intersection of a CFL with a regular language is always a CFL. Proof involves running a DFA in parallel with a PDA, and noting that the combination is a PDA. PDA’s accept by final state.
410
DFA and PDA in Parallel [Picture: the input feeds a DFA and a PDA (with its stack) running in parallel; accept if both accept. The combination looks like the state of one PDA.]
411
Formal Construction Let the DFA A have transition function δA.
Let the PDA P have transition function δP. States of combined PDA are [q,p], where q is a state of A and p a state of P. δ([q,p], a, X) contains ([δA(q,a), r], α) if δP(p, a, X) contains (r, α). Note a could be ε, in which case δA(q,a) = q.
412
Formal Construction – (2)
Accepting states of combined PDA are those [q,p] such that q is an accepting state of A and p is an accepting state of P. Easy induction: ([q0,p0], w, Z0)⊦*([q,p], ε, α) if and only if δA(q0, w) = q and, in P, (p0, w, Z0)⊦*(p, ε, α).
413
Undecidability Everything is an Integer Countable and Uncountable Sets
Turing Machines Recursive and Recursively Enumerable Languages
414
Integers, Strings, and Other Things
Data types have become very important as a programming tool. But at another level, there is only one type, which you may think of as integers or strings. Key point: Strings that are programs are just another way to think about the same one data type.
415
Example: Text Strings of ASCII or Unicode characters can be thought of as binary strings, with 8 or 16 bits/character. Binary strings can be thought of as integers. It makes sense to talk about “the i-th string.”
416
Binary Strings to Integers
There’s a small glitch: If you think simply of binary integers, then strings like 101, 0101, 00101,… all appear to be “the fifth string.” Fix by prepending a “1” to the string before converting to an integer. Thus, 101, 0101, and 00101 are the 13th, 21st, and 37th strings, respectively.
417
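A minimal C sketch (ours, not from the text) of the prepend-a-1 trick; index_of is a hypothetical name.

#include <stdio.h>

/* Maps a binary string to its position in the enumeration of strings. */
static unsigned long index_of(const char *w) {
    unsigned long n = 1;          /* the prepended 1 */
    for (; *w; w++)
        n = 2 * n + (*w - '0');   /* read 1w as a base-2 integer */
    return n;
}

int main(void) {
    printf("%lu %lu %lu\n", index_of("101"), index_of("0101"),
           index_of("00101"));    /* prints 13 21 37, as in the text */
    return 0;
}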
Example: Images Represent an image in (say) GIF.
The GIF file is an ASCII string. Convert string to binary. Convert binary string to integer. Now we have a notion of “the i-th image.”
418
Example: Proofs A formal proof is a sequence of logical expressions, each of which follows from the ones before it. Encode mathematical expressions of any kind in Unicode. Convert expression to a binary string and then an integer.
419
Proofs – (2) But a proof is a sequence of expressions, so we need a way to separate them. Also, we need to indicate which expressions are given.
420
Proofs – (3) Quick-and-dirty way to introduce new symbols into binary strings: Given a binary string, precede each bit by 0. Example: 101 becomes 010001. Use strings of two or more 1’s as the special symbols. Example: 111 = “the following expression is given”; 11 = “end of expression.”
421
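A minimal C sketch (ours, not from the text) of the bit-doubling encoding; encode is a hypothetical name.

#include <stdio.h>

/* Precede each bit by 0, freeing runs of two or more 1's as separators. */
static void encode(const char *w, char *out) {
    while (*w) { *out++ = '0'; *out++ = *w++; }
    *out = '\0';
}

int main(void) {
    char buf[64];
    encode("101", buf);
    printf("%s\n", buf);          /* prints 010001 */
    return 0;
}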
Example: Encoding Proofs
[Picture of an encoded proof: 111 (“a given expression follows”), an expression, 11 (“end of expression”), 111 (“a given expression follows”), another expression, 11 (end). A 1 inside an expression’s code is preceded by 0, so it could not be part of the “end” marker.]
422
Example: Programs Programs are just another kind of data.
Represent a program in ASCII. Convert to a binary string, then to an integer. Thus, it makes sense to talk about “the i-th program.” Hmm…There aren’t all that many programs.
423
Finite Sets Intuitively, a finite set is a set for which there is a particular integer that is the count of the number of members. Example: {a, b, c} is a finite set; its cardinality is 3. It is impossible to find a 1-1 mapping between a finite set and a proper subset of itself.
424
Infinite Sets Formally, an infinite set is a set for which there is a 1-1 correspondence between itself and a proper subset of itself. Example: the positive integers {1, 2, 3,…} is an infinite set. There is a 1-1 correspondence 1<->2, 2<->4, 3<->6,… between this set and a proper subset (the set of even integers).
425
Countable Sets A countable set is a set with a 1-1 correspondence with the positive integers. Hence, all countable sets are infinite. Example: All integers. 0<->1; -i <-> 2i; +i <-> 2i+1. Thus, order is 0, -1, 1, -2, 2, -3, 3,… Examples: set of binary strings, set of Java programs.
426
Example: Pairs of Integers
Order the pairs of positive integers first by sum, then by first component: [1,1], [2,1], [1,2], [3,1], [2,2], [1,3], [4,1], [3,2],…, [1,4], [5,1],… Interesting exercise: figure out the function f(i,j) such that the pair [i,j] corresponds to the integer f(i,j) in this order.
427
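A minimal C sketch (ours, not from the text) that generates the pairs in exactly this order; we enumerate rather than give the closed form for f(i,j), which is the exercise.

#include <stdio.h>

int main(void) {
    int count = 0;
    for (int sum = 2; count < 10; sum++)                  /* first by sum */
        for (int i = sum - 1; i >= 1 && count < 10; i--)  /* then by i */
            printf("%d: [%d,%d]\n", ++count, i, sum - i);
    return 0;   /* [1,1], [2,1], [1,2], [3,1], [2,2], [1,3], ... */
}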
Enumerations An enumeration of a set is a 1-1 correspondence between the set and the positive integers. Thus, we have seen enumerations for strings, programs, proofs, and pairs of integers.
428
How Many Languages? Are the languages over {0,1}* countable?
No; here’s a proof. Suppose we could enumerate all languages over {0,1}* and talk about “the i-th language.” Consider the language L = { w | w is the i-th binary string and w is not in the i-th language}.
429
Proof – Continued Clearly, L is a language over {0,1}*.
Thus, it is the j-th language for some particular j. Let x be the j-th string. Is x in L? If so, x is not in L by definition of L. If not, then x is in L by definition of L. (Recall: L = {w | w is the i-th binary string and w is not in the i-th language}.)
430
Diagonalization Picture
[Picture: an infinite 0/1 table; rows are the languages 1, 2, 3, …, columns are the strings 1, 2, 3, …, and the entry for (i, j) tells whether the j-th string is in the i-th language.]
431
Diagonalization Picture
[Picture: flip each diagonal entry of the table. The result can’t be a row – it disagrees in an entry of each row.]
432
Proof – Concluded We have a contradiction: x is neither in L nor not in L, so our sole assumption (that there was an enumeration of the languages) is wrong. Comment: This is really bad; there are more languages than programs. E.g., there are languages with no membership algorithm.
433
Hungarian Arguments We have shown the existence of a language with no algorithm to test for membership, but we have no way to exhibit a particular language with that property. A proof by counting the things that fail and claiming they are fewer than all things is called a Hungarian argument.
434
Turing-Machine Theory
The purpose of the theory of Turing machines is to prove that certain specific languages have no algorithm. Start with a language about Turing machines themselves. Reductions are used to prove more common questions undecidable.
435
Picture of a Turing Machine
[Picture: a finite control (the state) with a head scanning one square of an infinite tape whose squares contain tape symbols, e.g. … A B C A D …, chosen from a finite alphabet.] Action: based on the state and the tape symbol under the head: change state, rewrite the symbol, and move the head one square.
436
Why Turing Machines? Why not deal with C programs or something like that? Answer: You can, but it is easier to prove things about TM’s, because they are so simple. And yet they are as powerful as any computer. More so, in fact, since they have infinite memory.
437
Then Why Not Finite-State Machines to Model Computers?
In principle, you could, but it is not instructive. Programming models don’t build in a limit on memory. In practice, you can go to Fry’s and buy another disk. But finite automata are vital at the chip level (model-checking).
438
Turing-Machine Formalism
A TM is described by: A finite set of states (Q, typically). An input alphabet (Σ, typically). A tape alphabet (Γ, typically; contains Σ). A transition function (δ, typically). A start state (q0, in Q, typically). A blank symbol (B, in Γ- Σ, typically). All tape except for the input is blank initially. A set of final states (F ⊆ Q, typically).
439
Conventions a, b, … are input symbols. …, X, Y, Z are tape symbols.
…, w, x, y, z are strings of input symbols. α, β, … are strings of tape symbols.
440
The Transition Function
Takes two arguments: A state, in Q. A tape symbol in Γ. δ(q, Z) is either undefined or a triple of the form (p, Y, D). p is a state. Y is the new tape symbol. D is a direction, L or R.
441
Actions of the TM If δ(q, Z) = (p, Y, D) then, in state q, scanning Z under its tape head, the TM: Changes the state to p. Replaces Z by Y on the tape. Moves the head one square in direction D. D = L: move left; D = R: move right.
442
Example: Turing Machine
This TM scans its input right, looking for a 1. If it finds one, it changes it to a 0, goes to final state f, and halts. If it reaches a blank, it changes it to a 1 and moves left.
443
Example: Turing Machine – (2)
States = {q (start), f (final)}. Input symbols = {0, 1}. Tape symbols = {0, 1, B}. δ(q, 0) = (q, 0, R). δ(q, 1) = (f, 0, R). δ(q, B) = (q, 1, L).
444
Simulation of TM δ(q, 0) = (q, 0, R) δ(q, 1) = (f, 0, R) δ(q, B) = (q, 1, L)
[Six animation frames on input 00: in state q the TM moves right past the 0’s, reaches a blank and rewrites it as 1 while moving left, moves right again, finds that 1, rewrites it as 0, and enters state f. No move is possible in f; the TM halts and accepts.]
450
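For concreteness, a minimal C sketch (ours, not from the text) that runs this transition table on input 00, using the character 'B' for the blank; it reproduces the ID sequence q00⊦0q0⊦00q⊦0q01⊦00q1⊦000f shown a few slides ahead.

#include <stdio.h>
#include <string.h>

int main(void) {
    char tape[32];
    memset(tape, 'B', sizeof tape);
    memcpy(tape + 8, "00", 2);        /* input 00, blanks elsewhere */
    int head = 8;
    char state = 'q';
    while (state == 'q') {            /* f has no moves, so stop there */
        char c = tape[head];
        if (c == '0') head++;                               /* (q,0,R) */
        else if (c == '1') { tape[head] = '0'; state = 'f'; head++; }
                                                            /* (q,1)->(f,0,R) */
        else { tape[head] = '1'; head--; }                  /* (q,B)->(q,1,L) */
    }
    printf("halted in state %c, tape around input: %.4s\n",
           state, tape + 8);          /* prints: halted in state f, 000B */
    return 0;
}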
Instantaneous Descriptions of a Turing Machine
Initially, a TM has a tape consisting of a string of input symbols surrounded by an infinity of blanks in both directions. The TM is in the start state, and the head is at the leftmost input symbol.
451
TM ID’s – (2) An ID is a string αqβ, where αβ is the tape between the leftmost and rightmost nonblanks (inclusive). The state q is immediately to the left of the tape symbol scanned. If q is at the right end, it is scanning B. If q is scanning a B at the left end, then consecutive B’s at and to the right of q are part of β.
452
TM ID’s – (3) As for PDA’s we may use symbols ⊦ and ⊦* to represent “becomes in one move” and “becomes in zero or more moves,” respectively, on ID’s. Example: The moves of the previous TM are q00⊦0q0⊦00q⊦0q01⊦00q1⊦000f
453
Formal Definition of Moves
If δ(q, Z) = (p, Y, R), then αqZβ⊦αYpβ. If Z is the blank B, then also αq⊦αYp. If δ(q, Z) = (p, Y, L), then: For any X, αXqZβ⊦αpXYβ. In addition, qZβ⊦pBYβ.
454
Languages of a TM A TM defines a language by final state, as usual.
L(M) = {w | q0w⊦*I, where I is an ID with a final state}. Or, a TM can accept a language by halting. H(M) = {w | q0w⊦*I, and there is no move possible from ID I}.
455
Equivalence of Accepting and Halting
If L = L(M), then there is a TM M’ such that L = H(M’). If L = H(M), then there is a TM M” such that L = L(M”).
456
Proof of 1: Acceptance -> Halting
Modify M to become M’ as follows: For each accepting state of M, remove any moves, so M’ halts in that state. Avoid having M’ accidentally halt. Introduce a new state s, which runs to the right forever; that is δ(s, X) = (s, X, R) for all symbols X. If q is not accepting, and δ(q, X) is undefined, let δ(q, X) = (s, X, R).
457
Proof of 2: Halting -> Acceptance
Modify M to become M” as follows: Introduce a new state f, the only accepting state of M”. f has no moves. If δ(q, X) is undefined for any state q and symbol X, define it by δ(q, X) = (f, X, R).
458
Recursively Enumerable Languages
We now see that the classes of languages defined by TM’s using final state and halting are the same. This class of languages is called the recursively enumerable languages. Why? The term actually predates the Turing machine and refers to another notion of computation of functions.
459
Recursive Languages An algorithm is a TM that is guaranteed to halt whether or not it accepts. If L = L(M) for some TM M that is an algorithm, we say L is a recursive language. Why? Again, don’t ask; it is a term with a history.
460
Example: Recursive Languages
Every CFL is a recursive language. Use the CYK algorithm. Every regular language is a CFL (think of its DFA as a PDA that ignores its stack); therefore every regular language is recursive. Almost anything you can think of is recursive.
461
More About Turing Machines
“Programming Tricks” Restrictions Extensions Closure Properties
462
Overview At first, the TM doesn’t look very powerful.
Can it really do anything a computer can? We’ll discuss “programming tricks” to convince you that it can simulate a real computer.
463
Overview – (2) We need to study restrictions on the basic TM model (e.g., tapes infinite in only one direction). Assuming a restricted form makes it easier to talk about simulating arbitrary TM’s. That’s essential to exhibit a language that is not recursively enumerable.
464
Overview – (3) We also need to study generalizations of the basic model. Needed to argue there is no more powerful model of what it means to “compute.” Example: A nondeterministic TM with 50 six-dimensional tapes is no more powerful than the basic model.
465
Programming Trick: Multiple Tracks
Think of tape symbols as vectors with k components. Each component chosen from a finite alphabet. Makes the tape appear to have k tracks. Let input symbols be blank in all but one track.
466
Picture of Multiple Tracks
[Picture: a 3-track tape scanned by state q. Each column, e.g. [X, Y, Z], represents one tape symbol; a column that is blank except for 0 in one track represents the input symbol 0, and an all-blank column represents the blank.]
467
Programming Trick: Marking
A common use for an extra track is to mark certain positions. Almost all cells hold B (blank) in this track, but several hold special symbols (marks) that allow the TM to find particular places on the tape.
468
Marking [Picture: an extra track holds X to mark one position; the cell holding Y is marked, while those holding W and Z are unmarked (B in the mark track).]
469
Programming Trick: Caching in the State
The state can also be a vector. First component is the “control state.” Other components hold data from a finite alphabet.
470
Example: Using These Tricks
This TM doesn’t do anything terribly useful; it copies its input w infinitely. Control states: q: Mark your position and remember the input symbol seen. p: Run right, remembering the symbol and looking for a blank. Deposit symbol. r: Run left, looking for the mark.
471
Example – (2) States have the form [x, Y], where x is q, p, or r and Y is 0, 1, or B. Only p uses 0 and 1. Tape symbols have the form [U, V]. U is either X (the “mark”) or B. V is 0, 1 (the input symbols) or B. [B, B] is the TM blank; [B, 0] and [B, 1] are the inputs.
472
The Transition Function
Convention: a and b each stand for “either 0 or 1.” δ([q,B], [B,a]) = ([p,a], [X,a], R). In state q, copy the input symbol under the head (i.e., a) into the state. Mark the position read. Go to state p and move right.
473
Transition Function – (2)
δ([p,a], [B,b]) = ([p,a], [B,b], R). In state p, search right, looking for a blank symbol (not just B in the mark track). δ([p,a], [B,B]) = ([r,B], [B,a], L). When you find a B, replace it by the symbol (a) carried in the “cache.” Go to state r and move left.
474
Transition Function – (3)
δ([r,B], [B,a]) = ([r,B], [B,a], L). In state r, move left, looking for the mark. δ([r,B], [X,a]) = ([q,B], [B,a], R). When the mark is found, go to state q and move right. But remove the mark from where it was. q will place a new mark and the cycle repeats.
475
Simulation of the TM
[Seven animation frames of the copying TM: in state q it marks its position (X in the mark track) and caches the symbol it reads; in state p it runs right to the first blank and deposits the cached symbol; in state r it runs left back to the mark, removes it, and the cycle repeats one square to the right.]
482
Semi-infinite Tape We can assume the TM never moves left from the initial position of the head. Let this position be 0; positions to the right are 1, 2, … and positions to the left are –1, –2, … New TM has two tracks. Top holds positions 0, 1, 2, … Bottom holds a marker, positions –1, –2, …
483
Simulating Infinite Tape by Semi-infinite Tape
State remembers whether it is simulating the upper or lower track; reverse directions for the lower track. [Picture: state q carries a U/L flag; a * is put in the bottom track of position 0 at the first move. Squares beyond the input need no initialization, because they are initially B.]
484
More Restrictions – Read in Text
Two stacks can simulate one tape. One holds positions to the left of the head; the other holds positions to the right. In fact, by a clever construction, the two stacks can be restricted to counters: only two stack symbols, one of which can only appear at the bottom. Factoid: Invented by Pat Fischer, whose main claim to fame is that he was a victim of the Unabomber.
485
Extensions More general than the standard TM.
But still only able to define the RE languages. Multitape TM. Nondeterministic TM. Store for key-value pairs.
486
Multitape Turing Machines
Allow a TM to have k tapes for any fixed k. Move of the TM depends on the state and the symbols under the head for each tape. In one move, the TM can change state, write symbols under each head, and move each head independently.
487
Simulating k Tapes by One
Use 2k tracks. Each tape of the k-tape machine is represented by a track. The head position for each track is represented by a mark on an additional track.
488
Picture of Multitape Simulation
[Picture: one tape with four tracks, scanned by state q: a track with X marking the head position for tape 1; tape 1’s contents (A B C A C B); a track with X marking the head position for tape 2; tape 2’s contents (U V U U W V).]
489
Nondeterministic TM’s
Allow the TM to have a choice of move at each step. Each choice is a state-symbol-direction triple, as for the deterministic TM. The TM accepts its input if any sequence of choices leads to an accepting state.
490
Simulating a NTM by a DTM
The DTM maintains on its tape a queue of ID’s of the NTM. A second track is used to mark certain positions: A mark for the ID at the head of the queue. A mark to help copy the ID at the head and make a one-move change.
491
Picture of the DTM Tape
[Picture: the tape holds the queue ID0 # ID1 # … # IDk # IDk+1 … # IDn #, with the new ID being written at the rear. A mark X on the second track indicates the front of the queue (IDk); a mark Y shows where IDk is being copied with a move.]
492
Operation of the Simulating DTM
The DTM finds the ID at the current front of the queue. It looks for the state in that ID so it can determine the moves permitted from that ID. If there are m possible moves, it creates m new ID’s, one for each move, at the rear of the queue.
493
Operation of the DTM – (2)
The m new ID’s are created one at a time. After all are created, the marker for the front of the queue is moved one ID toward the rear of the queue. However, if a created ID has an accepting state, the DTM instead accepts and halts.
494
Why the NTM -> DTM Construction Works
There is an upper bound, say k, on the number of choices of move of the NTM for any state/symbol combination. Thus, any ID reachable from the initial ID by n moves of the NTM will be constructed by the DTM after constructing at most (k^(n+1)-k)/(k-1) ID’s, the sum k + k^2 + … + k^n.
495
Why? – (2) If the NTM accepts, it does so in some sequence of n choices of move. Thus the ID with an accepting state will be constructed by the DTM in some large number of its own moves. If the NTM does not accept, there is no way for the DTM to accept.
496
Taking Advantage of Extensions
We now have a really good situation. When we discuss construction of particular TM’s that take other TM’s as input, we can assume the input TM is as simple as possible. E.g., one, semi-infinite tape, deterministic. But the simulating TM can have many tapes, be nondeterministic, etc.
497
Real Computers Recall that, since a real computer has finite memory, it is in a sense weaker than a TM. Imagine a computer with an infinite store for name-value pairs. Generalizes an address space.
498
Simulating a Name-Value Store by a TM
The TM uses one of several tapes to hold an arbitrarily large sequence of name-value pairs in the format #name*value#… Mark, using a second track, the left end of the sequence. A second tape can hold a name whose value we want to look up.
499
Lookup Starting at the left end of the store, compare the lookup name with each name in the store. When we find a match, take what follows between the * and the next # as the value.
500
Insertion Suppose we want to insert name-value pair (n, v), or replace the current value associated with name n by v. Perform lookup for name n. If not found, add n*v# at the end of the store.
501
Insertion – (2) If we find #n*v’#, we need to replace v’ by v.
If v is shorter than v’, you can leave blanks to fill out the replacement. But if v is longer than v’, you need to make room.
502
Insertion – (3) Use a third tape to copy everything from the first tape at or to the right of v’. Mark the position of the * to the left of v’ before you do. Copy from the third tape to the first, leaving enough room for v. Write v where v’ was.
503
Closure Properties of Recursive and RE Languages
Both closed under union, concatenation, star, reversal, intersection, inverse homomorphism. Recursive closed under difference, complementation. RE closed under homomorphism.
504
Union Let L1 = L(M1) and L2 = L(M2).
Assume M1 and M2 are single-semi-infinite-tape TM’s. Construct 2-tape TM M to copy its input onto the second tape and simulate the two TM’s M1 and M2 each on one of the two tapes, “in parallel.”
505
Union – (2) Recursive languages: If M1 and M2 are both algorithms, then M will always halt in both simulations. Accept if either accepts. RE languages: accept if either accepts, but you may find both TM’s run forever without halting or accepting.
506
Picture of Union/Recursive
[Picture: input w feeds both M1 and M2 inside M. The Accept outputs feed an OR: M accepts if either accepts. The Reject outputs feed an AND: M rejects only if both reject. Remember: “Reject” = halt without accepting.]
507
Picture of Union/RE [Picture: input w feeds both M1 and M2 inside M; the Accept outputs feed an OR, so M accepts if either accepts.]
508
Intersection/Recursive – Same Idea
[Picture: as for union, but now the Accept outputs of M1 and M2 feed an AND and the Reject outputs feed an OR.]
509
Intersection/RE [Picture: input w feeds both M1 and M2 inside M; the Accept outputs feed an AND, so M accepts if both accept.]
510
Difference, Complement
Recursive languages: both TM’s will eventually halt. Accept if M1 accepts and M2 does not. Corollary: Recursive languages are closed under complementation. RE Languages: can’t do it; M2 may never halt, so you can’t be sure input is in the difference.
511
Concatenation/RE Let L1 = L(M1) and L2 = L(M2).
Assume M1 and M2 are single-semi-infinite-tape TM’s. Construct 2-tape Nondeterministic TM M: Guess a break in input w = xy. Move y to second tape. Simulate M1 on x, M2 on y. Accept if both accept.
512
Concatenation/Recursive
Can’t use a NTM. Systematically try each break w = xy. M1 and M2 will eventually halt for each break. Accept if both accept for any one break. Reject if all breaks tried and none lead to acceptance.
513
Star Same ideas work for each case.
RE: guess many breaks, accept if M1 accepts each piece. Recursive: systematically try all ways to break input into some number of pieces.
514
Reversal Start by reversing the input.
Then simulate the TM for L, to accept w if and only if wR is in L. Works for either Recursive or RE languages.
515
Inverse Homomorphism Apply h to input w. Simulate TM for L on h(w).
Accept w iff h(w) is in L. Works for Recursive or RE.
516
Homomorphism/RE Let L = L(M1).
Design NTM M to take input w and guess an x such that h(x) = w. M accepts whenever M1 accepts x. Note: won’t work for Recursive languages.
517
Decidability Turing Machines Coded as Binary Strings
Diagonalizing over Turing Machines Problems as Languages Undecidable Problems
518
Binary-Strings from TM’s
We shall restrict ourselves to TM’s with input alphabet {0, 1}. Assign positive integers to the three classes of elements involved in moves: States: q1 (start state), q2 (final state), q3, … Symbols: X1 (0), X2 (1), X3 (blank), X4, … Directions: D1 (L) and D2 (R).
519
Binary Strings from TM’s – (2)
Suppose δ(qi, Xj) = (qk, Xl, Dm). Represent this rule by the string 0^i10^j10^k10^l10^m. Key point: since integers i, j, … are all > 0, there cannot be two consecutive 1’s in these strings.
520
Binary Strings from TM’s – (3)
Represent a TM by concatenating the codes for each of its moves, separated by 11 as punctuation. That is: Code1 11 Code2 11 Code3 11 …
521
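A minimal C sketch (ours, not from the text) of this coding; zeros and code_move are hypothetical helper names. As an example, the move δ(q, 1) = (f, 0, R) of the earlier example TM, with q = q1, f = q2, 1 = X2, 0 = X1, and R = D2, codes as 010010010100.

#include <stdio.h>

static void zeros(int n) { while (n-- > 0) putchar('0'); }

/* Code delta(q_i, X_j) = (q_k, X_l, D_m) as 0^i 1 0^j 1 0^k 1 0^l 1 0^m. */
static void code_move(int i, int j, int k, int l, int m) {
    zeros(i); putchar('1');
    zeros(j); putchar('1');
    zeros(k); putchar('1');
    zeros(l); putchar('1');
    zeros(m);
}

int main(void) {
    code_move(1, 2, 2, 1, 2);   /* delta(q1, X2) = (q2, X1, D2) */
    putchar('\n');              /* prints 010010010100 */
    return 0;
}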
Enumerating TM’s and Binary Strings
Recall we can convert binary strings to integers by prepending a 1 and treating the resulting string as a base-2 integer. Thus, it makes sense to talk about “the i-th binary string” and about “the i-th Turing machine.” Note: if i makes no sense as a TM, assume the i-th TM accepts nothing.
522
Table of Acceptance
[Picture: an infinite table with rows indexed by TM i = 1, 2, 3, … and columns by string j = 1, 2, 3, …. The entry x in row i, column j is 0 if the i-th TM does not accept the j-th string; 1 means it does.]
523
Diagonalization Again
Whenever we have a table like the one on the previous slide, we can diagonalize it. That is, construct a sequence D by complementing each bit along the major diagonal. Formally, D = a1a2…, where ai = 0 if the (i, i) table entry is 1, and vice-versa.
524
The Diagonalization Argument
Could D be a row (representing the language accepted by a TM) of the table? Suppose it were the j-th row. But D disagrees with the j-th row at the j-th column. Thus D is not a row.
525
Diagonalization – (2) Consider the diagonalization language Ld = {w | w is the i-th string, and the i-th TM does not accept w}. We have shown that Ld is not a recursively enumerable language; i.e., it has no TM.
526
Problems Informally, a “problem” is a yes/no question about an infinite set of possible instances. Example: “Does graph G have a Hamilton cycle (cycle that touches each node exactly once)? Each undirected graph is an instance of the “Hamilton-cycle problem.”
527
Problems – (2) Formally, a problem is a language.
Each string encodes some instance. The string is in the language if and only if the answer to this instance of the problem is “yes.”
528
Example: A Problem About Turing Machines
We can think of the language Ld as a problem. “Does this TM not accept its own code?” Aside: We could also think of it as a problem about binary strings. Do you see how to phrase it?
529
Decidable Problems A problem is decidable if there is an algorithm to answer it. Recall: An “algorithm,” formally, is a TM that halts on all inputs, accepted or not. Put another way, “decidable problem” = “recursive language.” Otherwise, the problem is undecidable.
530
Bullseye Picture
[Picture: nested regions. Innermost: decidable problems = recursive languages. Around them: the recursively enumerable languages – are there any languages in between? Outermost: the not recursively enumerable languages, where Ld lives.]
531
From the Abstract to the Real
While the fact that Ld is undecidable is interesting intellectually, it doesn’t impact the real world directly. We first shall develop some TM-related problems that are undecidable, but our goal is to use the theory to show some real problems are undecidable.
532
Examples: Undecidable Problems
Can a particular line of code in a program ever be executed? Is a given context-free grammar ambiguous? Do two given CFG’s generate the same language?
533
The Universal Language
An example of a recursively enumerable, but not recursive language is the language Lu of a universal Turing machine. That is, the UTM takes as input the code for some TM M and some binary string w and accepts if and only if M accepts w.
534
Designing the UTM Inputs are of the form: Code for M 111 w
Note: A valid TM code never has 111, so we can split M from w. The UTM must accept its input if and only if M is a valid TM code and that TM accepts w.
535
The UTM – (2) The UTM will have several tapes.
Tape 1 holds the input M111w Tape 2 holds the tape of M. Mark the current head position of M. Tape 3 holds the state of M.
536
The UTM – (3) Step 1: The UTM checks that M is a valid code for a TM.
E.g., all moves have five components, no two moves have the same state/symbol as first two components. If M is not valid, its language is empty, so the UTM immediately halts without accepting.
537
The UTM – (4) Step 2: The UTM examines M to see how many of its own tape squares it needs to represent one symbol of M. Step 3: Initialize Tape 2 to represent the tape of M with input w, and initialize Tape 3 to hold the start state.
538
The UTM – (5) Step 4: Simulate M.
Look for a move on Tape 1 that matches the state on Tape 3 and the tape symbol under the head on Tape 2. If found, change the symbol and move the head marker on Tape 2 and change the State on Tape 3. If M accepts, the UTM also accepts.
539
A Question Do we see anything like universal Turing machines in real life?
540
Proof That Lu is Recursively Enumerable, but not Recursive
We designed a TM for Lu, so it is surely RE. Suppose it were recursive; that is, we could design a UTM U that always halted. Then we could also design an algorithm for Ld, as follows.
541
Proof – (2) Given input w, we can decide if it is in Ld by the following steps. Check that w is a valid TM code. If not, then its language is empty, so w is in Ld. If valid, use the hypothetical algorithm to decide whether w111w is in Lu. If so, then w is not in Ld; else it is.
542
Proof – (3) But we already know there is no algorithm for Ld.
Thus, our assumption that there was an algorithm for Lu is wrong. Lu is RE, but not recursive.
543
Bullseye Picture [Diagram: the same concentric regions. Innermost: decidable problems = recursive languages. Next ring: recursively enumerable languages, now containing Lu. Outside: not recursively enumerable, containing Ld. All languages outside the innermost region are undecidable.]
544
More Undecidable Problems
Rice’s Theorem Post’s Correspondence Problem Some Real Problems
545
Properties of Languages
Any set of languages is a property of languages. Example: The infiniteness property is the set of infinite languages.
546
Properties of Languages – (2)
As always, languages must be defined by some descriptive device. The most general device we know is the TM. Thus, we shall think of a property as a problem about Turing machines. Let LP be the set of binary TM codes for TM’s M such that L(M) has property P.
547
Trivial Properties There are two (trivial) properties P for which LP is decidable. The always-false property, which contains no RE languages. The always-true property, which contains every RE language. Rice’s Theorem: For every other property P, LP is undecidable.
548
Plan for Proof of Rice’s Theorem
Lemma needed: recursive languages are closed under complementation. We need the technique known as reduction, where an algorithm converts instances of one problem to instances of another. Then, we can prove the theorem.
549
Closure of Recursive Languages Under Complementation
If L is a language over alphabet Σ, then the complement of L is Σ* - L. Denote the complement of L by Lc. Lemma: If L is recursive, so is Lc. Proof: Let L = L(M) for a TM M. Construct M’ for Lc. M’ has one final state, the new state f.
550
Proof – Concluded M’ simulates M.
But if M enters an accepting state, M’ halts without accepting. If M halts without accepting, M’ instead has a move taking it to state f. In state f, M’ halts.
551
Reductions A reduction from language L to language L’ is an algorithm (TM that always halts) that takes a string w and converts it to a string x, with the property that: x is in L’ if and only if w is in L.
552
TM’s as Transducers We have regarded TM’s as acceptors of strings.
But we could just as well visualize TM’s as having an output tape, where a string is written prior to the TM halting. Such a TM translates its input to its output.
553
Reductions – (2) If we reduce L to L’, and L’ is decidable, then the algorithm for L’ + the algorithm of the reduction shows that L is also decidable. Used in the contrapositive: If we know L is not decidable, then L’ cannot be decidable.
554
Reductions – Aside This form of reduction is not the most general.
Example: We “reduced” Ld to Lu, but in doing so we had to complement answers. More in NP-completeness discussion on Karp vs. Cook reductions.
555
Proof of Rice’s Theorem
We shall show that for every nontrivial property P of the RE languages, LP is undecidable. We show how to reduce Lu to LP. Since we know Lu is undecidable, it follows that LP is also undecidable.
556
The Reduction Our reduction algorithm must take M and w and produce a TM M’. L(M’) has property P if and only if M accepts w. M’ has two tapes, used to: Simulate another TM ML on the input to M’. Simulate M on w. Note: neither M, ML, nor w is input to M’.
557
The Reduction – (2) Assume that the empty language ∅ does not have property P.
If it does, consider the complement of P, which would also be decidable by the lemma. Let L be any language with property P, and let ML be a TM that accepts L. M’ is constructed to work as follows (next slide).
558
Design of M’ On the second tape, write w and then simulate M on w.
If M accepts w, then simulate ML on the input x to M’, which appears initially on the first tape. M’ accepts its input x if and only if ML accepts x.
559
Action of M’ if M Accepts w [Diagram: on input x, M’ first simulates M on input w; on accept, it simulates ML on input x, and accepts x iff x is in L(ML).]
560
Design of M’ – (2) Suppose M accepts w.
Then M’ simulates ML and therefore accepts x if and only if x is in L. That is, L(M’) = L, L(M’) has property P, and M’ is in LP.
561
Design of M’ – (3) Suppose M does not accept w.
Then M’ never starts the simulation of ML, and never accepts its input x. Thus, L(M’) = ∅, and L(M’) does not have property P. That is, M’ is not in LP.
562
Action of M’ if M Does not Accept w
[Diagram: on input x, M’ simulates M on input w; the simulation never accepts, so nothing else happens and x is not accepted.]
563
Design of M’ – Conclusion
Thus, the algorithm that converts M and w to M’ is a reduction of Lu to LP. Thus, LP is undecidable.
564
Picture of the Reduction
[Diagram: the pair (M, w) enters a real reduction algorithm, which outputs M’; M’ is fed to a hypothetical algorithm for property P, which accepts iff M accepts w and otherwise halts without accepting. The combination would be an algorithm for Lu, which doesn’t exist.]
565
Applications of Rice’s Theorem
We now have any number of undecidable questions about TM’s: Is L(M) a regular language? Is L(M) a CFL? Does L(M) include any palindromes? Is L(M) empty? Does L(M) contain more than 1000 strings? Etc., etc.
566
Post’s Correspondence Problem
But we’re still stuck with problems about Turing machines only. Post’s Correspondence Problem (PCP) is an example of a problem that does not mention TM’s in its statement, yet is undecidable. From PCP, we can prove many other non-TM problems undecidable.
567
PCP Instances An instance of PCP is a list of pairs of nonempty strings over some alphabet Σ. Say (w1, x1), (w2, x2), …, (wn, xn). The answer to this instance of PCP is “yes” if and only if there exists a nonempty sequence of indices i1,…,ik, such that wi1…wik = xi1…xik. (Here wi1 means “w sub i sub 1,” etc.) In text: a pair of lists.
568
Example: PCP Let the alphabet be {0, 1}.
Let the PCP instance consist of the two pairs (0, 01) and (100, 001). We claim there is no solution. You can’t start with (100, 001), because the first characters don’t match.
569
Example: PCP – (2) Recall: pairs are (0, 01) and (100, 001). A partial solution must start with the first pair: the two strings are 0 and 01. Adding the second pair keeps the match going: 0100 and 01001 – and we can repeat it as many times as we like. But the second string always stays one character longer, so we can never make the first string as long as the second.
570
Example: PCP – (3) Suppose we add a third pair, so the instance becomes: 1 = (0, 01); 2 = (100, 001); 3 = (110, 10). Now 1,3 is a solution; both strings are 0110. In fact, any sequence of indexes in 12*3 is a solution.
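Although PCP is undecidable, a “yes” answer can always be confirmed by search, so the instances with solutions form an RE set. Here is a C++ sketch (ours) of a depth-bounded search; rerun with ever larger depth limits, it is a semi-decision procedure that finds a solution whenever one exists but can never report “no solution.”

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// True if some nonempty index sequence of length <= depth makes the
// two concatenations (a from first components, b from second) equal.
bool search(const std::vector<std::pair<std::string, std::string>>& p,
            const std::string& a, const std::string& b, int depth) {
    if (!a.empty() && a == b) return true;
    if (depth == 0) return false;
    for (const auto& pr : p) {
        std::string na = a + pr.first, nb = b + pr.second;
        auto n = std::min(na.size(), nb.size());
        // prune: extend only partial matches where one string is a
        // prefix of the other
        if (na.compare(0, n, nb, 0, n) == 0 && search(p, na, nb, depth - 1))
            return true;
    }
    return false;
}
// search({{"0","01"},{"100","001"},{"110","10"}}, "", "", 2) returns
// true via the solution 1,3; with only the first two pairs it cannot.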
571
Proving PCP is Undecidable
We’ll introduce the modified PCP (MPCP) problem. Same as PCP, but the solution must start with the first pair in the list. We reduce Lu to MPCP. But first, we’ll reduce MPCP to PCP.
572
Example: MPCP The list of pairs (0, 01), (100, 001), (110, 10), as an instance of MPCP, has a solution as we saw. However, if we reorder the pairs, say (110, 10), (0, 01), (100, 001) there is no solution. No string 110… can ever equal a string 10… .
573
Representing PCP or MPCP Instances
Since the alphabet can be arbitrarily large, we need to code symbols. Say the i-th symbol will be coded by “a” followed by i in binary. Commas and parentheses can represent themselves.
574
Representing Instances – (2)
Thus, we have a finite alphabet in which all instances of PCP or MPCP can be represented. Let LPCP and LMPCP be the languages of coded instances of PCP or MPCP, respectively, that have a solution.
575
Reducing LMPCP to LPCP Take an instance of LMPCP and do the following, using new symbols * and $. For the first string of each pair, add * after every character. For the second string of each pair, add * before every character. Add pair ($, *$). Make another copy of the first pair, with *’s and an extra * prepended to the first string.
576
Example: LMPCP to LPCP MPCP instance, in order: (110, 10)
(0, 01) (100, 001) Add *’s PCP instance: (1*1*0*, *1*0) (0*, *0*1) (1*0*0*, *0*0*1) ($, *$) Ender (*1*1*0*, *1*0) Special pair version of first MPCP choice – only possible start for a PCP solution.
577
LMPCP to LPCP – (2) If the MPCP instance has a solution string w, then padding with stars fore and aft, followed by a $ is a solution string for the PCP instance. Use same sequence of indexes, but special pair to start. Add ender pair as the last index.
578
LMPCP to LPCP – (3) Conversely, the indexes of a PCP solution give us a MPCP solution. First index must be special pair – replace by first pair. Remove ender.
579
Reducing Lu to LMPCP We use MPCP to simulate the sequence of ID’s that M executes with input w. If q0w⊦I1⊦I2⊦ … is the sequence of ID’s of M with input w, then any solution to the MPCP instance we can construct will begin with this sequence of ID’s. # separates ID’s and also serves to represent blanks at the end of an ID.
580
Reducing Lu to LMPCP – (2)
But until M reaches an accepting state, the string formed by concatenating the second components of the chosen pairs will always be a full ID ahead of the string formed from the first components. If M accepts, we can even out the difference and solve the MPCP instance.
581
Reducing Lu to LMPCP – (3)
Key assumption: M has a semi-infinite tape; it never moves left from its initial head position. Alphabet of MPCP instance: state and tape symbols of M (assumed disjoint) plus special symbol # (assumed not a state or tape symbol).
582
Reducing Lu to LMPCP – (4)
First MPCP pair: (#, #q0w#). We start out with the second string having the initial ID and a full ID ahead of the first. (#, #). We can add ID-enders to both strings. (X, X) for all tape symbols X of M. We can copy a tape symbol from one ID to the next.
583
Example: Copying Symbols
Suppose we have chosen MPCP pairs to simulate some number of steps of M, and the partial strings from these pairs are … # (first components) and … #ABqCD# (second components). Choosing pairs (A, A) and (B, B) copies the first two symbols into the next ID, giving … #AB and … #ABqCD#AB.
584
Reducing Lu to LMPCP – (5)
For every state q of M and tape symbol X, there are pairs: (qX, Yp) if δ(q, X) = (p, Y, R). (ZqX, pZY) if δ(q, X) = (p, Y, L) [any Z]. Also, if the blank is read, # can substitute for it: (q#, Yp#) if δ(q, B) = (p, Y, R). (Zq#, pZY#) if δ(q, B) = (p, Y, L) [any Z].
585
Example: Copying Symbols – (2)
Continuing the previous example, if δ(q, C) = (p, E, R), then the pair (qC, Ep) extends the strings to … #ABqC and … #ABqCD#ABEp; copying D and # then gives … #ABqCD# and … #ABqCD#ABEpD#. If M instead moves left, we should not have copied B if we wanted a solution.
586
Reducing Lu to LMPCP – (6)
If M reaches an accepting state f, then f “eats” the neighboring tape symbols, one or two at a time, to enable M to reach an “ID” that is essentially empty. The MPCP instance has pairs (XfY, f), (fY, f), and (Xf, f) for all tape symbols X and Y. To even up the strings and solve: (f##, #).
587
Example: Cleaning Up After Acceptance
Suppose the second string is ahead by the accepting ID ABfCDE#. Pairs (A, A), (BfC, f), (D, D), (E, E), (#, #) reduce the excess to AfDE#; then (AfD, f), (E, E), (#, #) leave fE#; then (fE, f), (#, #) leave f#; finally (f##, #) makes the two strings equal.
588
CFG’s from PCP We are going to prove that the ambiguity problem (is a given CFG ambiguous?) is undecidable. As with PCP instances, CFG instances must be coded to have a finite alphabet. Let a followed by a binary integer i represent the i-th terminal.
589
CFG’s from PCP – (2) Let A followed by a binary integer i represent the i-th variable. Let A1 be the start symbol. Symbols ->, comma, and ε represent themselves. Example: S -> 0S1 | A, A -> c is represented by A1->a1A1a10,A1->A10,A10->a11
590
CFG’s from PCP – (3) Consider a PCP instance with k pairs.
i-th pair is (wi, xi). Assume index symbols a1,…, ak are not in the alphabet of the PCP instance. The list language for w1,…, wk has a CFG with productions A -> wiAai and A -> wiai for all i = 1, 2,…, k.
591
List Languages Similarly, from the second components of each pair, we can construct a list language with productions B -> xiBai and B -> xiai for all i = 1, 2,…, k. These languages each consist of the concatenation of strings from the first or second components of pairs, followed by the reverse of their indexes.
592
Example: List Languages
Consider PCP instance (a,ab), (baa,aab), (bba,ba). Use 1, 2, 3 as the index symbols for these pairs in order. A -> aA1 | baaA2 | bbaA3 | a1 | baa2 | bba3 B -> abB1 | aabB2 | baB3 | ab1 | aab2 | ba3
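Emitting these productions from the pairs is mechanical; a C++ sketch (ours):

#include <iostream>
#include <string>
#include <vector>

// Print A -> w_i A a_i | w_i a_i for each pair, writing the index
// symbol a_i simply as the number i.
void listGrammar(char var, const std::vector<std::string>& w) {
    for (std::size_t i = 0; i < w.size(); ++i)
        std::cout << var << " -> " << w[i] << var << i + 1
                  << " | " << w[i] << i + 1 << "\n";
}
// listGrammar('A', {"a", "baa", "bba"}) and
// listGrammar('B', {"ab", "aab", "ba"}) print the two grammars above.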
593
Reduction of PCP to the Ambiguity Problem
Given a PCP instance, construct grammars for the two list languages, with variables A and B. Add productions S -> A | B. The resulting grammar is ambiguous if and only if there is a solution to the PCP instance.
594
Example: Reduction to Ambiguity
A -> aA1 | baaA2 | bbaA3 | a1 | baa2 | bba3 B -> abB1 | aabB2 | baB3 | ab1 | aab2 | ba3 S -> A | B There is a solution 1, 3. Note abba31 has leftmost derivations: S => A => aA1 => abba31 S => B => abB1 => abba31
595
Proof the Reduction Works
In one direction, if i1,…, ik is a solution, then wi1…wik aik…ai1 equals xi1…xik aik…ai1 and has two derivations, one starting S -> A, the other starting S -> B. Conversely, there can only be two derivations of the same terminal string if they begin with different first productions. Why? Next slide.
596
Proof – Continued If the two derivations begin with the same first step, say S -> A, then the sequence of index symbols uniquely determines which productions are used. Each except the last would be the one with A in the middle and that index symbol at the end. The last is the same, but no A in the middle.
597
Example: S =>A=>*…2321
[Parse tree: A => w1 A 1; that A derives w2 A 2; that A derives w3 A 3; that A derives w2 … 2; each production used is determined by the index symbol at the end of what it derives.]
598
More “Real” Undecidable Problems
To show things like CFL-equivalence to be undecidable, it helps to know that the complement of a list language is also a CFL. We’ll construct a deterministic PDA for the complement language.
599
DPDA for the Complement of a List Language
Start with a bottom-of-stack marker. While PCP symbols arrive at the input, push them onto the stack. After the first index symbol arrives, start checking the stack for the reverse of the corresponding string.
600
Complement DPDA – (2) The DPDA accepts after every input, with one exception. If the input has consisted so far of only PCP symbols and then index symbols, and the bottom-of-stack marker is exposed after reading an index symbol, do not accept.
601
Using the Complements For a given PCP instance, let LA and LB be the list languages for the first and second components of pairs. Let LAc and LBc be their complements. All these languages are CFL’s.
602
Using the Complements Consider LAc ∪ LBc. Also a CFL.
LAc ∪ LBc = Σ* if and only if the PCP instance has no solution. Why? A solution i1,…, in implies wi1…win ain…ai1 is not in LAc, and the equal string xi1…xin ain…ai1 is not in LBc. Conversely, anything missing from the union is a solution.
603
Undecidability of “= Σ*”
We have reduced PCP to the problem: is a given CFL equal to all strings over its terminal alphabet?
604
Undecidability of “CFL is Regular”
Also undecidable: is a CFL a regular language? Same reduction from PCP. Proof: One direction: If LAc ∪ LBc = Σ*, then it surely is regular.
605
“= Regular” – (2) Conversely, we can show that if L = LAc LBc is not Σ*, then it can’t be regular. Proof: Suppose wx is a solution to PCP, where x is the indices. Define homomorphism h(0) = w and h(1) = x.
606
“= Regular” – (3) h(0n1n) is not in L, because the repetition of any solution is also a solution. However, h(y) is in L for any other y in {0,1}*. If L were regular, so would be h–1(L), and so would be its complement = {0n1n |n > 1}.
607
Intractable Problems Time-Bounded Turing Machines Classes P and NP
Polynomial-Time Reductions
608
Time-Bounded TM’s A Turing machine that, given an input of length n, always halts within T(n) moves is said to be T(n)-time bounded. The TM can be multitape. Sometimes, it can be nondeterministic. The deterministic, multitape case corresponds roughly to “an O(T(n)) running-time algorithm.”
609
The class P If a DTM M is T(n)-time bounded for some polynomial T(n), then we say M is polynomial-time (“polytime ”) bounded. And L(M) is said to be in the class P. Important point: when we talk of P, it doesn’t matter whether we mean “by a computer” or “by a TM” (next slide).
610
Polynomial Equivalence of Computers and TM’s
A multitape TM can simulate a computer that runs for time O(T(n)) in at most O(T^2(n)) of its own steps. If T(n) is a polynomial, so is T^2(n).
611
Examples of Problems in P
Is w in L(G), for a given CFG G? Input = w. Use CYK algorithm, which is O(n^3). Is there a path from node x to node y in graph G? Input = x, y, and G. Use Dijkstra’s algorithm, which is O(n log n) on a graph of n nodes and arcs.
612
Running Times Between Polynomials
You might worry that something like O(n log n) is not a polynomial. However, to be in P, a problem only needs an algorithm that runs in time less than some polynomial. Surely O(n log n) is less than the polynomial O(n2).
613
A Tricky Case: Knapsack
The Knapsack Problem is: given positive integers i1, i2 ,…, in, can we divide them into two sets with equal sums? Perhaps we can solve this problem in polytime by a dynamic-programming algorithm: Maintain a table of all the differences we can achieve by partitioning the first j integers.
614
Knapsack – (2) Basis: j = 0. Initially, the table has “true” for 0 and “false” for all other differences. Induction: To consider ij, start with a new table, initially all false. Then, set k to true if, in the old table, there is a value m that was true, and k is either m+ij or m-ij.
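Here is the dynamic program as a C++ sketch (ours), keeping each table of achievable differences as a set:

#include <set>
#include <vector>

// True iff the integers can be divided into two sets with equal sums.
bool equalSplit(const std::vector<int>& ints) {
    std::set<int> diffs = {0};        // basis: no integers, difference 0
    for (int x : ints) {              // induction: consider the next integer
        std::set<int> next;
        for (int m : diffs) {
            next.insert(m + x);       // x joins the first set
            next.insert(m - x);       // or the second
        }
        diffs = next;
    }
    return diffs.count(0) > 0;        // difference 0 means equal sums
}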
615
Knapsack – (3) Suppose we measure running time in terms of the sum of the integers, say m. Each table needs only space O(m) to represent all the positive and negative differences we could achieve. Each table can be constructed in time O(m).
616
Knapsack – (4) Since n < m, we can build the final table in O(m^2) time. From that table, we can see if 0 is achievable and solve the problem.
617
Subtlety: Measuring Input Size
“Input size” has a specific meaning: the length of the representation of the problem instance as it is input to a TM. For the Knapsack Problem, you cannot always write the input in a number of characters that is polynomial in either the number-of or sum-of the integers.
618
Knapsack – Bad Case Suppose we have n integers, each of which is around 2^n. We can write integers in binary, so the input takes O(n^2) space to write down. But the tables require space O(n 2^n). They therefore require at least that order of time to construct.
619
Bad Case – (2) Thus, the proposed “polynomial” algorithm actually takes time O(n^2 2^n) on an input of length O(n^2). Or, since we like to use n as the input size, it takes time O(n 2^sqrt(n)) on an input of length n. In fact, it appears no algorithm solves Knapsack in polynomial time.
620
Redefining Knapsack We are free to describe another problem, call it Pseudo-Knapsack, where integers are represented in unary. Pseudo-Knapsack is in P.
621
The Class NP The running time of a nondeterministic TM is the maximum number of steps taken along any branch. If that time bound is polynomial, the NTM is said to be polynomial-time bounded. And its language/problem is said to be in the class NP.
622
Example: NP The Knapsack Problem is definitely in NP, even using the conventional binary representation of integers. Use nondeterminism to guess one of the subsets. Sum the two subsets and compare.
623
P Versus NP Originally a curiosity of Computer Science, the question “P = NP?” is now recognized by mathematicians as one of the most important open problems. There are thousands of problems that are in NP but appear not to be in P. But no proof that they aren’t really in P.
624
Complete Problems One way to address the P = NP question is to identify complete problems for NP. An NP-complete problem has the property that if it is in P, then every problem in NP is also in P. Defined formally via “polytime reductions.”
625
Complete Problems – Intuition
A complete problem for a class embodies every problem in the class, even if it does not appear so. Compare: PCP embodies every TM computation, even though it does not appear to do so. Strange but true: Knapsack embodies every polytime NTM computation.
626
Polytime Reductions Goal: find a way to show problem L to be NP-complete by reducing every language/problem in NP to L in such a way that if we had a deterministic polytime algorithm for L, then we could construct a deterministic polytime algorithm for any problem in NP.
627
Polytime Reductions – (2)
We need the notion of a polytime transducer – a TM that: Takes an input of length n. Operates deterministically for some polynomial time p(n). Produces an output on a separate output tape. Note: output length is at most p(n).
628
Polytime Transducer [Diagram: a TM with a state control, an input tape holding an input of length n, scratch tapes, and an output tape holding an output of length < p(n).]
Remember: important requirement is that time < p(n).
629
Polytime Reductions – (3)
Let L and M be languages. Say L is polytime reducible to M if there is a polytime transducer T such that for every input w to T, the output x = T(w) is in M if and only if w is in L.
630
Picture of Polytime Reduction
[Diagram: the transducer T sends each string in L to a string in M, and each string not in L to a string not in M.]
631
NP-Complete Problems A problem/language M is said to be NP-complete if for every language L in NP, there is a polytime reduction from L to M. Fundamental property: if M has a polytime algorithm, then L also has a polytime algorithm. I.e., if M is in P, then every L in NP is also in P, or “P = NP.”
632
The Plan All of NP polytime reduces to SAT, which is therefore NP-complete. SAT polytime reduces to 3-SAT. 3-SAT polytime reduces to many other problems; they’re all NP-complete.
633
Proof That Polytime Reductions “Work”
Suppose M has an algorithm of polynomial time q(n). Let L have a polytime transducer T to M, taking polynomial time p(n). The output of T, given an input of length n, is at most of length p(n). The algorithm for M on the output of T takes time at most q(p(n)).
634
Proof – (2) We now have a polytime algorithm for L:
Given w of length n, use T to produce x of length < p(n), taking time < p(n). Use the algorithm for M to tell if x is in M in time < q(p(n)). Answer for w is whatever the answer for x is. Total time < p(n) + q(p(n)) = a polynomial.
635
The Satisfiability Problem
Cook’s Theorem: An NP-Complete Problem Restricted SAT: CSAT, 3SAT
636
Boolean Expressions Boolean, or propositional-logic expressions are built from variables and constants using the operators AND, OR, and NOT. Constants are true and false, represented by 1 and 0, respectively. We’ll use concatenation (juxtaposition) for AND, + for OR, - for NOT, unlike the text.
637
Example: Boolean expression
(x+y)(-x + -y) is true only when variables x and y have opposite truth values. Note: parentheses can be used at will, and are needed to modify the precedence order NOT (highest), AND, OR.
638
The Satisfiability Problem (SAT )
Study of boolean functions generally is concerned with the set of truth assignments (assignments of 0 or 1 to each of the variables) that make the function true. NP-completeness needs only a simpler question (SAT): does there exist a truth assignment making the function true?
639
Example: SAT (x+y)(-x + -y) is satisfiable.
There are, in fact, two satisfying truth assignments: x=0; y=1. x=1; y=0. x(-x) is not satisfiable.
640
SAT as a Language/Problem
An instance of SAT is a boolean function. Must be coded in a finite alphabet. Use special symbols (, ), +, - as themselves. Represent the i-th variable by symbol x followed by integer i in binary.
641
Example: Encoding for SAT
(x+y)(-x + -y) would be encoded by the string (x1+x10)(-x1+-x10)
642
SAT is in NP There is a multitape NTM that can decide if a Boolean formula of length n is satisfiable. The NTM takes O(n^2) time along any path. Use nondeterminism to guess a truth assignment on a second tape. Replace all variables by guessed truth values. Evaluate the formula for this assignment. Accept if true.
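A deterministic machine can only replace the nondeterministic guess by enumeration, which is exponential; the NTM’s point is that each single guess is checked in polynomial time. A C++ sketch of the brute-force analog (ours), taking the formula in CNF for simplicity:

#include <cstdint>
#include <cstdlib>
#include <vector>

using Clause = std::vector<int>;   // literals: +v or -v, variables 1..v

// Try all 2^v truth assignments; each check is linear in formula size.
bool satisfiable(const std::vector<Clause>& cnf, int v) {
    for (std::uint64_t a = 0; a < (std::uint64_t(1) << v); ++a) {
        bool allSat = true;
        for (const Clause& c : cnf) {
            bool sat = false;
            for (int lit : c) {
                bool val = (a >> (std::abs(lit) - 1)) & 1;
                if ((lit > 0) == val) { sat = true; break; }
            }
            if (!sat) { allSat = false; break; }
        }
        if (allSat) return true;
    }
    return false;
}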
643
Cook’s Theorem SAT is NP-complete.
Really a stronger result: formulas may be in conjunctive normal form (CSAT) – later. To prove, we must show how to construct a polytime reduction from each language L in NP to SAT. Start by assuming the most restricted possible form of NTM for L (next slide).
644
Assumptions About NTM for L
One tape only. Head never moves left of the initial position. States and tape symbols are disjoint. Key Points: States can be named arbitrarily, and the constructions many-tapes-to-one and two-way- infinite-tape-to-one at most square the time.
645
More About the NTM M for L
Let p(n) be a polynomial time bound for M. Let w be an input of length n to M. If M accepts w, it does so through a sequence I0⊦I1⊦…⊦Ip(n) of p(n)+1 ID’s. Assume trivial move from a final state. Each ID is of length at most p(n)+1, counting the state.
646
From ID Sequences to Boolean Functions
The Boolean function that the transducer for L will construct from w will have (p(n)+1)^2 “variables.” Let variable Xij represent the j-th position of the i-th ID in the accepting sequence for w, if there is one. i and j each range from 0 to p(n).
647
Picture of Computation as an Array
I0 (initial ID): X00 X01 … X0,p(n) I1: X10 X11 … X1,p(n) . . . Ip(n): Xp(n),0 Xp(n),1 … Xp(n),p(n)
648
Intuition From M and w we construct a boolean formula that forces the X’s to represent one of the possible ID sequences of NTM M with input w, if it is to be satisfiable. It is satisfiable iff some sequence leads to acceptance.
649
From ID’s to Boolean Variables
The Xij’s are not boolean variables; they are states and tape symbols of M. However, we can represent the value of each Xij by a family of Boolean variables yijA, for each possible state or tape symbol A. yijA is true if and only if Xij = A.
650
Points to Remember The boolean function has components that depend on n. These must be of size polynomial in n. Other pieces depend only on M. No matter how many states/symbols M has, these are of constant size. Any logical formula about a set of variables whose size is independent of n can be written somehow.
651
Designing the Function
We want the Boolean function that describes the Xij’s to be satisfiable if and only if the NTM M accepts w. Four conditions: Unique: only one symbol per position. Starts right: initial ID is q0w. Moves right: each ID follows from the next by a move of M. Finishes right: M accepts.
652
Unique Take the AND over all i, j, and all Y ≠ Z of (-yijY + -yijZ).
That is, it is not possible for Xij to be both symbols Y and Z.
653
Starts Right The Boolean Function needs to assert that the first ID is the correct one with w = a1…an as input. X00 = q0. X0i = ai for i = 1,…, n. X0i = B (blank) for i = n+1,…, p(n). Formula is the AND of y0iZ for all i, where Z is the symbol in position i.
654
Finishes Right Somewhere, there must be an accepting state.
Form the OR of Boolean variables yijq, where i and j are arbitrary and q is an accepting state. Note: differs from text.
655
Running Time So Far Unique requires O(p^2(n)) symbols be written.
Parentheses, signs, propositional variables. Algorithm is easy, so it takes no more time than O(p^2(n)). Starts Right takes O(p(n)) time. Finishes Right takes O(p^2(n)) time. Variation over symbols Y and Z is independent of n, so covered by the constant; variation over i and j gives the O(p^2(n)) factor.
656
Running Time – (2) Caveat: Technically, the propositions that are output of the transducer must be coded in a fixed alphabet, e.g., x10011 rather than yijA. Thus, the time and output length have an additional factor O(log n) because there are O(p^2(n)) variables. But log factors do not affect polynomials.
657
Moves Right Xij = Xi-1,j whenever the state is none of Xi-1,j-1, Xi-1,j, or Xi-1,j+1. For each i and j, construct a formula that says (in propositional variables) the OR of “Xij = Xi-1,j” and all yi-1,k,A where A is a state symbol (k = j-1, j, or j+1). Note: “Xij = Xi-1,j” is the OR of yijA AND yi-1,jA over all symbols A; this works because Unique assures only one yijA is true.
658
Constraining the Next Symbol
[Diagram: two consecutive ID rows. Where row i-1 reads … A B C … with no state nearby, the symbol below B in row i must be B (easy case). Where row i-1 reads … A q C …, all three positions below may depend on the move of M (hard case).]
659
Moves Right – (2) In the case where the state is nearby, we need to write an expression that: Picks one of the possible moves of the NTM M. Enforces the condition that when Xi-1,j is the state, the values of Xi,j-1, Xi,j, and Xi,j+1. are related to Xi-1,j-1, Xi-1,j, and Xi-1,j+1 in a way that reflects the move.
660
Example: Moves Right Suppose δ(q, A) contains (p, B, L). Then one option for any i, j, and C is: row i-1 holds C q A and row i holds p C B. If δ(q, A) contains (p, B, R), then an option for any i, j, and C is: row i-1 holds C q A and row i holds C B p.
661
Moves Right – (3) For each possible move, express the constraints on the six X’s by a Boolean formula. For each i and j, take the OR over all possible moves. Take the AND over all i and j. Small point: for edges (e.g., state at 0), assume invisible symbols are blank.
662
Running Time We have to generate O(p^2(n)) Boolean formulas, but each is constructed from the moves of the NTM M, which is fixed in size, independent of the input w. Takes time O(p^2(n)) and generates an output of that length. Times log n, because variables must be coded in a fixed alphabet.
663
Cook’s Theorem – Finale
In time O(p^2(n) log n) the transducer produces a boolean formula, the AND of the four components: Unique, Starts, Finishes, and Moves Right. If M accepts w, the ID sequence gives us a satisfying truth assignment. If satisfiable, the truth values tell us an accepting computation of M.
664
Picture So Far We have one NP-complete problem: SAT.
In the future, we shall do polytime reductions of SAT to other problems, thereby showing them NP-complete. Why? If we polytime reduce SAT to X, and X is in P, then so is SAT, and therefore so is all of NP.
665
Conjunctive Normal Form
A Boolean formula is in Conjunctive Normal Form (CNF) if it is the AND of clauses. Each clause is the OR of literals. A literal is either a variable or the negation of a variable. Problem CSAT : is a Boolean formula in CNF satisfiable?
666
Example: CNF (x + -y + z)(-x)(-w + -x + y + z)
667
NP-Completeness of CSAT
The proof of Cook’s theorem can be modified to produce a formula in CNF. Unique is already the AND of clauses. Starts Right is the AND of clauses, each with one variable. Finishes Right is the OR of variables, i.e., a single clause.
668
NP-Completeness of CSAT – (2)
Only Moves Right is a problem, and not much of a problem. It is the product of formulas for each i and j. Those formulas are fixed, independent of n.
669
NP-Completeness of CSAT – (3)
You can convert any formula to CNF. It may exponentiate the size of the formula and therefore take time to write down that is exponential in the size of the original formula, but these numbers are all fixed for a given NTM M and independent of n.
670
k-SAT If a boolean formula is in CNF and every clause consists of exactly k literals, we say the boolean formula is an instance of k-SAT. Say the formula is in k-CNF. Example: 3-SAT formula (x + y + z)(x + -y + z)(x + y + -z)(x + -y + -z)
671
k-SAT Facts Every boolean formula has an equivalent CNF formula.
But the size of the CNF formula may be exponential in the size of the original. Not every boolean formula has a k-SAT equivalent. 2SAT is in P; 3SAT is NP-complete.
672
Proof: 2SAT is in P (Sketch)
Pick an assignment for some variable, say x = true. Any clause with –x forces the other literal to be true. Example: (-x + -y) forces y to be false. Keep seeing what other truth values are forced by variables with known truth values.
673
Proof – (2) One of three things can happen:
You reach a contradiction (e.g., z is forced to be both true and false). You reach a point where no more variables have their truth value forced, but some clauses are not yet made true. You reach a satisfying truth assignment.
674
Proof – (3) Case 1: (Contradiction) There can only be a satisfying assignment if you use the other truth value for x. Simplify the formula by replacing x by this truth value and repeat the process. Case 3: You found a satisfying assignment, so answer “yes.”
675
Proof – (4) Case 2: (You force values for some variables, but other variables and clauses are not affected). Adopt these truth values, eliminate the clauses that they satisfy, and repeat. In Cases 1 and 2 you have spent O(n^2) time and have reduced the length of the formula by at least 1, so O(n^3) total.
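A C++ sketch of exactly this propagation procedure (ours); clauses are pairs of integer literals, +v for a variable and -v for its negation:

#include <cstdlib>
#include <utility>
#include <vector>

// Truth values: +1 true, -1 false, 0 unknown.
static int litVal(int lit, const std::vector<int>& val) {
    int v = val[std::abs(lit)];
    return lit > 0 ? v : -v;
}

// Propagate forced values; returns false on a contradiction (Case 1).
static bool propagate(const std::vector<std::pair<int,int>>& cls,
                      std::vector<int>& val) {
    for (bool changed = true; changed; ) {
        changed = false;
        for (auto [a, b] : cls) {
            int va = litVal(a, val), vb = litVal(b, val);
            if (va == 1 || vb == 1) continue;        // clause already true
            if (va == -1 && vb == -1) return false;  // both literals false
            int f = (va == -1) ? b : (vb == -1) ? a : 0;
            if (f != 0) {                            // other literal forced true
                val[std::abs(f)] = f > 0 ? 1 : -1;
                changed = true;
            }
        }
    }
    return true;
}

bool twoSat(const std::vector<std::pair<int,int>>& cls, int nVars) {
    std::vector<int> val(nVars + 1, 0);
    for (int v = 1; v <= nVars; ++v) {
        if (val[v] != 0) continue;
        std::vector<int> save = val;
        val[v] = 1;                        // pick x = true
        if (!propagate(cls, val)) {        // Case 1: x must be false instead
            val = save;
            val[v] = -1;
            if (!propagate(cls, val)) return false;
        }                                  // Case 2: keep forced values, go on
    }
    return true;                           // Case 3: satisfying assignment
}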
676
3SAT This problem is NP-complete. Clearly it is in NP, since SAT is.
It is not true that every Boolean formula can be converted to an equivalent 3-CNF formula, even if we exponentiate the size of the formula.
677
3SAT – (2) But we don’t need equivalence.
We need to reduce every CNF formula F to some 3-CNF formula that is satisfiable if and only if F is. Reduction involves introducing new variables into long clauses, so we can split them apart.
678
Reduction of CSAT to 3SAT
Let (x1+…+xn) be a clause in some CSAT instance, with n ≥ 4. Note: the x’s are literals, not variables; any of them could be negated variables. Introduce new variables y1,…,yn-3 that appear in no other clause.
679
CSAT to 3SAT – (2) Replace (x1+…+xn) by (x1+x2+y1)(x3+y2+ -y1) … (xi+yi-1+ -yi-2) … (xn-2+yn-3+ -yn-4)(xn-1+xn+ -yn-3) If there is a satisfying assignment of the x’s for the CSAT instance, then one of the literals xi must be made true. Assign yj = true if j < i-1 and yj = false for larger j.
680
CSAT to 3SAT – (3) We are not done.
We also need to show that if the resulting 3SAT instance is satisfiable, then the original CSAT instance was satisfiable.
681
CSAT to 3SAT – (4) Suppose (x1+x2+y1)(x3+y2+ -y1) … (xn-2+yn-3+ -yn-4)(xn-1+xn+ -yn-3) is satisfiable, but none of the x’s is true. The first clause forces y1 = true. Then the second clause forces y2 = true. And so on … all the y’s must be true. But then the last clause is false.
682
CSAT to 3SAT – (5) There is a little more to the reduction, for handling clauses of 1 or 2 literals. Replace (x) by (x+y1+y2) (x+y1+ -y2) (x+ -y1+y2) (x+ -y1+ -y2). Replace (w+x) by (w+x+y)(w+x+ -y). Remember: the y’s are different variables for each CNF clause.
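All of the rules, for long and short clauses alike, fit in one function; a C++ sketch (ours), with literals as integers and next supplying fresh y variables:

#include <vector>

using Clause = std::vector<int>;   // literals: +v or -v

// Split one CNF clause into 3-literal clauses, satisfiability-preserving.
std::vector<Clause> toThreeSat(const Clause& c, int& next) {
    std::vector<Clause> out;
    int n = static_cast<int>(c.size());
    if (n == 1) {                           // (x): four clauses, two fresh y's
        int y1 = next++, y2 = next++;
        for (int s1 : {y1, -y1})
            for (int s2 : {y2, -y2})
                out.push_back({c[0], s1, s2});
    } else if (n == 2) {                    // (w+x): two clauses, one fresh y
        int y = next++;
        out.push_back({c[0], c[1], y});
        out.push_back({c[0], c[1], -y});
    } else if (n == 3) {
        out.push_back(c);                   // already 3 literals
    } else {                                // long clause: chain n-3 fresh y's
        int y = next++;
        out.push_back({c[0], c[1], y});
        for (int i = 2; i < n - 2; ++i) {
            int y2 = next++;
            out.push_back({c[i], -y, y2});
            y = y2;
        }
        out.push_back({c[n - 2], c[n - 1], -y});
    }
    return out;
}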
683
CSAT to 3SAT Running Time
This reduction is surely polynomial. In fact it is linear in the length of the CSAT instance. Thus, we have polytime-reduced CSAT to 3SAT. Since CSAT is NP-complete, so is 3SAT.
684
More NP-Complete Problems
NP-Hard Problems Tautology Problem Node Cover Knapsack
685
Next Steps We can now reduce 3SAT to a large number of problems, either directly or indirectly. Each reduction must be polytime. Usually we focus on length of the output from the transducer, because the construction is easy. But key issue: must be polytime.
686
Next Steps – (2) Another essential part of an NP-completeness proof is showing the problem is in NP. Sometimes, we can only show a problem NP-hard = “if the problem is in P, then P = NP,” but the problem may not be in NP.
687
Example: NP-Hard Problem
The Tautology Problem is: given a Boolean formula, is it satisfied by all truth assignments? Example: x + -x + yz Not obviously in NP, but its complement is. Guess a truth assignment; accept if that assignment doesn’t satisfy the formula.
688
Key Point Regarding Tautology
An NTM can guess a truth assignment and decide whether formula F is satisfied by that assignment in polytime. But if the NTM accepts when it guesses a satisfying assignment, it will accept F whenever F is in SAT, not Tautology.
689
Co-NP A problem/language whose complement is in NP is said to be in Co-NP. Note: P is closed under complementation. Thus, P ⊆ Co-NP. Also, if P = NP, then P = NP = Co-NP.
690
Tautology is NP-Hard While we can’t prove Tautology is in NP, we can prove it is NP-hard. Suppose we had a polytime algorithm for Tautology. Take any Boolean formula F and convert it to -(F). Obviously linear time.
691
Tautology is NP-Hard – (2)
F is satisfiable if and only if -(F) is not a tautology. Use the hypothetical polytime algorithm for Tautology to test if -(F) is a tautology. Say “yes, F is in SAT” if -(F) is not a tautology and say “no” otherwise. Then SAT would be in P, and P = NP.
692
Historical Comments There were actually two notions of “NP-complete” that differ subtly (and differ only if P ≠ NP). Steve Cook, in his 1971 paper, was really concerned with the question “why is Tautology hard?” Remember: theorems are really logical tautologies.
693
History – (2) Cook used “if problem X is in P, then P = NP” as the definition of “X is NP- hard.” Today called Cook completeness. In 1972, Richard Karp wrote a paper showing many of the key problems in Operations Research to be NP- complete.
694
History – (3) Karp’s paper moved “NP-completeness” from a concept about theorem proving to an essential for any study of algorithms. But Karp used the definition of NP- completeness “exists a polytime reduction,” as we have. Called Karp completeness.
695
Cook Vs. Karp Completeness
In practice, there is very little difference. Biggest difference: for Tautology, Cook lets us flip the answer after a polytime reduction. In principle, Cook completeness could be much more powerful, or (if P = NP) exactly the same.
696
Cook Vs. Karp – (2) But there is one important reason we prefer Karp-completeness. Suppose I had an algorithm for some NP-complete problem that ran in time O(n^log n). A function that is bigger than any polynomial, yet smaller than the exponentials like 2^n.
697
Cook Vs. Karp – (3) If “NP-complete” means Karp-completeness, I can conclude that all of NP can be solved in time O(n^f(n)), where f(n) is some function of the form c log^k n. Still faster than any exponential, and faster than we have a right to expect. But if I use Cook-completeness, I cannot say anything of this type.
698
The Node Cover Problem Given a graph G, we say N is a node cover for G if every edge of G has at least one end in N. The problem Node Cover is: given a graph G and a “budget” k, does G have a node cover of k or fewer nodes?
699
Example: Node Cover [Graph diagram omitted.] One possible node cover of size 3: {B, C, E}.
700
NP-Completeness of Node Cover
Reduction from 3SAT. For each clause (X+Y+Z) construct a “column” of three nodes, all connected by vertical edges. Add a horizontal edge between nodes that represent any variable and its negation. Budget = twice the number of clauses.
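A C++ sketch of the construction (ours); clauses are triples of integer literals, and node number 3i + a stands for the a-th literal of clause i:

#include <array>
#include <utility>
#include <vector>

struct NodeCoverInstance {
    std::vector<std::pair<int,int>> edges;
    int budget;
};

NodeCoverInstance fromThreeSat(const std::vector<std::array<int,3>>& cls) {
    NodeCoverInstance g;
    int c = static_cast<int>(cls.size());
    for (int i = 0; i < c; ++i)      // vertical edges: a triangle per column
        for (int a = 0; a < 3; ++a)
            for (int b = a + 1; b < 3; ++b)
                g.edges.push_back({3*i + a, 3*i + b});
    for (int i = 0; i < c; ++i)      // horizontal edges: literal vs. negation
        for (int j = i + 1; j < c; ++j)
            for (int a = 0; a < 3; ++a)
                for (int b = 0; b < 3; ++b)
                    if (cls[i][a] == -cls[j][b])
                        g.edges.push_back({3*i + a, 3*j + b});
    g.budget = 2 * c;                // twice the number of clauses
    return g;
}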
701
Example: The Reduction to Node Cover
(x + y + z)(-x + -y + -z)(x + -y + z)(-x + y + -z) [Diagram: four columns of three nodes, one per clause – (x, y, z), (-x, -y, -z), (x, -y, z), (-x, y, -z) – with vertical edges within each column and horizontal edges joining each variable to its negation.] Budget = 8
702
Example: Reduction – (2)
A node cover must have at least two nodes from every column, or some vertical edge is not covered. Since the budget is twice the number of columns, there must be exactly two nodes in the cover from each column. Satisfying assignment corresponds to the node in each column not selected.
703
Example: Reduction – (3)
(x + y + z)(-x + -y + -z)(x + -y + z)(-x + y + -z) Truth assignment: x = y = T; z = F. Pick a true node in each column. [Diagram: in each column, one node whose literal is true under the assignment is marked.]
704
Example: Reduction – (4)
(x + y + z)(-x + -y + -z)(x + -y + z)(-x + y + -z) Truth assignment: x = y = T; z = F. The other nodes form a node cover. [Diagram: the two unmarked nodes of each column are selected.]
705
Proof That the Reduction Works
The reduction is clearly polytime. Need to show: If we construct from 3SAT instance F a graph G and a budget k, then G has a node cover of size k if and only if F is satisfiable.
706
Proof: If Suppose we have a satisfying assignment A for F.
For each clause of F, pick one of its three literals that A makes true. Put in the node cover the two nodes for that clause that do not correspond to the selected literal. Total = k nodes – meets budget.
707
Proof: If – (2) The selected nodes cover all vertical edges.
Why? Any two nodes for a clause cover the triangle of vertical edges for that clause. Horizontal edges are also covered. A horizontal edge connects nodes for some x and -x. One is false in A and therefore its node must be selected for the node cover.
708
Proof: Only If Suppose G has a node cover with at most k nodes.
One node cannot cover the vertical edges of any column, so each column has exactly 2 nodes in the cover. Construct a satisfying assignment for F by making true the literal for any node not in the node cover.
709
Proof: Only If – (2) Worry: What if there are unselected nodes corresponding to both x and -x? Then we would not have a truth assignment. But there is a horizontal edge between these nodes. Thus, at least one is in the node cover.
710
Optimization Problems
NP-complete problems are always yes/no questions. In practice, we tend to want to solve optimization problems, where our task is to minimize (or maximize) a parameter subject to some constraints.
711
Example: Optimization Problem
People who care about node covers would ask: Given this graph, what is the smallest number of nodes I can pick to form a node cover? If I can solve that problem in polytime, then I can solve the yes/no version.
712
Example – Continued Polytime algorithm: given graph G and budget k, solve the optimization problem for G. If the smallest node cover for G is of size k or less, answer “yes’; otherwise answer “no.”
713
Optimization Problems – (2)
Optimization problems are never, strictly speaking, in NP. They are not yes/no. But there is a Cook reduction from the yes/no version to the optimization version.
714
Optimization Problems – (3)
That is enough to show that if the optimization version of an NP-complete problem can be solved in polytime, then P = NP. A strong argument that you cannot solve the optimization version of an NP-complete problem in polytime.
715
The Knapsack Problem We shall prove NP-complete a version of Knapsack with a budget: Given a list L of integers and a budget k, is there a subset of L whose sum is exactly k? Later, we’ll reduce this version of Knapsack to our earlier one: given an integer list L, can we divide it into two equal parts?
716
Knapsack is in NP Guess a subset of the list L. Add ‘em up.
Accept if the sum is k.
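The checking half is trivially polynomial; a C++ sketch of the verifier for one guessed subset, given as a bitmask (ours):

#include <cstddef>
#include <cstdint>
#include <vector>

// Does the guessed subset of L sum to exactly k?
bool checkGuess(const std::vector<long long>& L,
                std::uint64_t mask, long long k) {
    long long sum = 0;
    for (std::size_t i = 0; i < L.size(); ++i)
        if ((mask >> i) & 1)
            sum += L[i];
    return sum == k;
}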
717
Polytime Reduction of 3SAT to Knapsack
Given 3SAT instance F, we need to construct a list L and a budget k. Suppose F has c clauses and v variables. L will have base-32 integers of length c+v, and there will be 3c+2v of them.
718
Picture of Integers for Literals
[Diagram: the integer for literal xi or -xi has v + c base-32 digits: a 1 in the i-th of the v variable positions, a 1 in each of the c clause positions whose clause this literal makes true, and 0 everywhere else.]
719
Pictures of Integers for Clauses
[Diagram: for the i-th clause there are three integers, holding digit 5, 6, and 7, respectively, in the i-th clause position, and 0 everywhere else.]
720
Example: Base-32 Integers
(x + y + z)(x + -y + -z) c = 2; v = 3. Assume x, y, z are variables 1, 2, 3, respectively. Clauses are 1, 2 in order given.
721
Example: (x + y + z)(x + -y + -z)
For x: 00111; for -x: 00100. For y: 01001; for -y: 01010. For z: 10001; for -z: 10010. For the first clause: 00005, 00006, 00007. For the second clause: 00050, 00060, 00070. (Digits left to right: z, y, x, second clause, first clause.)
722
The Budget k = 8(1 + 32 + 32^2 + … + 32^(c-1)) + 32^c(1 + 32 + 32^2 + … + 32^(v-1))
That is, 8 for the position of each clause and 1 for the position of each variable. Key Point: Base-32 is high enough that there can be no carries between positions.
723
Key Point: Details Among all the integers, the sum of digits in the position for a variable is 2. And for a clause, it is at most 3 + 5 + 6 + 7 = 21: 1’s for the three literals in the clause, and 5, 6, and 7 for the three integers for that clause. Thus, the budget must be satisfied on a digit-by-digit basis.
724
Key Point: Concluded Thus, if a set of integers matches the budget, it must include exactly one of the integers for x and -x. For each clause, at least one of the integers for literals must have a 1 there, so we can choose either 5, 6, or 7 to make 8 in that position.
725
Proof the Reduction Works
Each integer can be constructed from the 3SAT instance F in time proportional to its length. Thus, the reduction is O(n^2). If F is satisfiable, take a satisfying assignment A. Pick integers for those literals that A makes true.
726
Proof the Reduction Works – (2)
The selected integers sum to between 1 and 3 in the digit for each clause. For each clause, choose the integer with 5, 6, or 7 in that digit to make a sum of 8. These selected integers sum to exactly the budget.
727
Proof: Converse We must also show that a sum of integers equal to the budget k implies F is satisfiable. In each digit for a variable x, either the integer for x or the integer for -x, but not both, is selected. Let truth assignment A make this literal true.
728
Proof: Converse – (2) In the digits for the clauses, a sum of 8 can only be achieved if among the integers for the variables, there is at least one 1 in that digit. Thus, truth assignment A makes each clause true, so it satisfies F.
729
The Partition-Knapsack Problem
This problem is what we originally referred to as “knapsack.” Given a list of integers L, can we partition it into two disjoint sets whose sums are equal? Partition-Knapsack is NP-complete; reduction from Knapsack.
730
Reduction of Knapsack to Partition-Knapsack
Given instance (L, k) of Knapsack, compute the sum s of all the integers in L. Linear in input size. Output is L followed by two integers: s and 2k. Example: L = 3, 4, 5, 6; k = 7. Partition-Knapsack instance = 3, 4, 5, 6, 14, 18. One solution: {3, 4, 18} and {5, 6, 14}.
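The whole reduction is a few lines of code; a C++ sketch (ours):

#include <vector>

// Knapsack instance (L, k) -> Partition-Knapsack instance: L, then s, then 2k.
std::vector<long long> toPartition(std::vector<long long> L, long long k) {
    long long s = 0;
    for (long long x : L) s += x;   // s = sum of all integers in L
    L.push_back(s);
    L.push_back(2 * k);
    return L;
}
// toPartition({3, 4, 5, 6}, 7) yields {3, 4, 5, 6, 18, 14}, matching the
// example above (the order of the two added integers is immaterial).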
731
Proof That Reduction Works
The sum of all integers in the output instance is 2(s+k). Thus, the two partitions must each sum to s+k. If the input instance has a subset of L that sums to k, then pick it plus the integer s to solve the output instance.
732
Proof: Converse Suppose the output instance of Partition-Knapsack has a solution. The integers s and 2k cannot be in the same partition. Because their sum, s + 2k, is more than half of the total 2(s+k). Thus, the subset of L that is in the partition with s sums to k. Thus, it solves the Knapsack instance.
733
Relationship to Other Classes
Polynomial Space The classes PS and NPS Relationship to Other Classes Equivalence PS = NPS A PS-Complete Problem
734
Polynomial-Space-Bounded TM’s
A TM M is said to be polyspace-bounded if there is a polynomial p(n) such that, given input of length n, M never uses more than p(n) cells of its tape. L(M) is in the class polynomial space, or PS.
735
Nondeterministic Polyspace
If we allow a TM M to be nondeterministic but to use only p(n) tape cells in any sequence of ID’s when given input of length n, we say M is a nondeterministic polyspace-bounded TM. And L(M) is in the class nondeterministic polyspace, or NPS.
736
Relationship to Other Classes
Obviously, P ⊆ PS and NP ⊆ NPS. If you use polynomial time, you cannot reach more than a polynomial number of tape cells. Alas, it is not even known whether P = PS or NP = PS. On the other hand, we shall show PS = NPS.
737
Exponential Polytime Classes
A DTM M runs in exponential polytime if it makes at most c^p(n) steps on input of length n, for some constant c and polynomial p. Say L(M) is in the class EP. If M is an NTM instead, say L(M) is in the class NEP (nondeterministic exponential polytime).
738
More Class Relationships
P ⊆ NP ⊆ PS ⊆ EP, and at least one of these containments is proper. A diagonalization proof shows that P ≠ EP. PS ⊆ EP requires proof. Key Point: A polyspace-bounded TM has only c^p(n) different ID’s. We can count to c^p(n) in polyspace and stop the TM after it has surely repeated an ID.
739
Proof PS ⊆ EP Let M be a p(n)-space bounded DTM with s states and t tape symbols. Assume M has only one semi-infinite tape. The number of possible ID’s of M is s p(n) t^p(n): s states, p(n) positions of the tape head, and t^p(n) possible tape contents.
740
Proof PS ⊆ EP – (2)
Note that (t+1)^(p(n)+1) > p(n) t^p(n). Use the binomial expansion: (t+1)^(p(n)+1) = t^(p(n)+1) + (p(n)+1)t^p(n) + … Also, s = (t+1)^c, where c = log base t+1 of s. Thus, s p(n) t^p(n) < (t+1)^(p(n)+1+c). We can count to the maximum number of ID’s on a separate tape using base t+1 and p(n)+1+c cells – a polynomial.
741
Proof PS ⊆ EP – (3) Redesign M to have a second tape and to count on that tape to s p(n) t^p(n). The new TM M’ is polyspace bounded. M’ halts if its counter exceeds s p(n) t^p(n). If M accepts, it does so without repeating an ID. Thus, M’ is exponential-polytime bounded, proving L(M) is in EP.
742
Savitch’s Theorem: PS = NPS
Key Idea: a polyspace NTM has “only” c^p(n) different ID’s it can enter. Implement a deterministic, recursive function that decides, about the NTM, whether I⊦*J in at most m moves. Assume m < c^p(n), since if the NTM accepts, it does so without repeating an ID.
743
Savitch’s Theorem – (2) Recursive doubling trick: to tell if I⊦*J in at most m moves, search for an ID K such that I⊦*K and K⊦*J, each in at most m/2 moves. Complete algorithm: ask if I0⊦*J in at most c^p(n) moves, where I0 is the initial ID with given input w of length n, and J is any of the ID’s with an accepting state and length at most p(n).
744
Recursive Doubling boolean function f(I, J, m) {
if (m <= 1) return (I == J || I⊦J); /* basis: zero moves or one move */
for (all ID’s K using p(n) tape)
if (f(I, K, m/2) && f(K, J, m/2)) return true;
return false; }
745
Stack Implementation of f
[Stack of active calls, each frame holding its arguments in O(p(n)) space: (I, J, m), (I, K, m/2), (L, K, m/4), . . . , (M, N, 1). There are O(p(n)) frames, so O(p^2(n)) space in all.]
746
Space for Recursive Doubling
f(I, J, m) requires space O(p(n)) to store I, J, m, and the current K. m need not be more than c^p(n), so it can be stored in O(p(n)) space. How many calls to f can be active at once? Largest m is c^p(n).
747
Space for Recursive Doubling – (2)
Each call with third argument m results in only one call with argument m/2 at any one time. Thus, at most log2 c^p(n) = O(p(n)) calls can be active at any one time. Total space needed by the DTM is therefore O(p^2(n)) – a polynomial.
748
PS-Complete Problems A problem P in PS is said to be PS-complete if there is a polytime reduction from every problem in PS to P. Note: it has to be polytime, not polyspace, because: Polyspace can exponentiate the output size. Without polytime, we could not deal with the question P = PS?
749
What PS-Completeness Buys
If some PS-complete problem is: In P, then P = PS. In NP, then NP = PS.
750
Quantified Boolean Formulas
We shall meet a PS-complete problem, called QBF : is a given quantified boolean formula true? But first we meet the QBF’s themselves. We shall give a recursive (inductive) definition of QBF’s along with the definition of free/bound variable occurrences.
751
QBF’s – (2) First-order predicate logic, with variables restricted to true/false. Basis: Constants 0 (false) and 1 (true) are QBF’s. A variable is a QBF, and that variable occurrence is free in this QBF.
752
QBF’s – (3) Induction: If E and F are QBF’s, so are:
E AND F, E OR F, and NOT F. Variables are bound or free as in E or F. (∀x)E and (∃x)E for any variable x. All free occurrences of x are bound to this quantifier, and other occurrences of variables are free/bound as in E. Use parentheses to group as needed. Precedence: quantifiers, NOT, AND, OR.
753
Example: QBF (∀x)(∃y)(((∃x)(x OR y)) AND NOT (x AND y)) [The x in (x OR y) is bound by the inner ∃x; the x in (x AND y) is bound by the outer ∀x; both occurrences of y are bound by ∃y.]
754
Evaluating QBF’s In general, a QBF is a function from truth assignments for its free variables to {0, 1} (false/true). Important special case: no free variables; a QBF is either true or false. We shall give the evaluation only for these formulas.
755
Evaluating QBF’s – (2) Induction on the number of operators, including quantifiers. Stage 1: eliminate quantifiers. Stage 2: evaluate variable-free formulas. Basis: 0 operators. Expression can only be 0 or 1, because there are no free variables. Truth value is 0 or 1, respectively.
756
Induction Expression is NOT E, E OR F, or E AND F.
Evaluate E and F; apply the boolean operator to the results. Expression is (∀x)E. Construct E0 = E with each x bound to this quantifier replaced by 0, and analogously E1. The expression is true iff both E0 and E1 are true. Expression is (∃x)E. Same, but the expression is true iff either E0 or E1 is true.
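A C++ sketch of this evaluator over a hypothetical syntax-tree type (the Node representation is ours, not from the text). The binding of a quantified variable is saved and restored so that an inner quantifier can shadow an outer one on the same variable, as happens in the example that follows:

#include <map>

struct Node {
    enum Kind { CONST, VAR, NOT, AND, OR, FORALL, EXISTS } kind;
    bool value;                  // used by CONST
    int var;                     // used by VAR, FORALL, EXISTS
    const Node *left, *right;    // right is unused by NOT and quantifiers
};

// env holds the current 0/1 binding of each quantified variable.
bool eval(const Node* e, std::map<int,bool>& env) {
    switch (e->kind) {
        case Node::CONST: return e->value;
        case Node::VAR:   return env.at(e->var);  // closed formula: always bound
        case Node::NOT:   return !eval(e->left, env);
        case Node::AND:   return eval(e->left, env) && eval(e->right, env);
        case Node::OR:    return eval(e->left, env) || eval(e->right, env);
        default: {                                // FORALL or EXISTS
            auto it = env.find(e->var);
            bool had = (it != env.end());
            bool prev = had && it->second;
            env[e->var] = false; bool v0 = eval(e->left, env);  // E0
            env[e->var] = true;  bool v1 = eval(e->left, env);  // E1
            if (had) env[e->var] = prev; else env.erase(e->var);
            return e->kind == Node::FORALL ? (v0 && v1) : (v0 || v1);
        }
    }
    return false;  // unreachable
}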
757
Example: Evaluation (∀x)(∃y)(((∃x)(x OR y)) AND NOT (x AND y))
Substitute x = 0 for the outer quantifier: (∃y)(((∃x)(x OR y)) AND NOT (0 AND y)) Substitute x = 1 for the outer quantifier: (∃y)(((∃x)(x OR y)) AND NOT (1 AND y))
758
Example: Evaluation – (2)
Let’s follow the x = 0 subproblem: (∃y)(((∃x)(x OR y)) AND NOT (0 AND y)) Two cases: y = 0 and y = 1. ((∃x)(x OR 0)) AND NOT (0 AND 0) ((∃x)(x OR 1)) AND NOT (0 AND 1)
759
Example: Evaluation – (3)
Let’s follow the y = 0 subproblem: ((∃x)(x OR 0)) AND NOT (0 AND 0) Need to evaluate (∃x)(x OR 0). x = 0: 0 OR 0 = 0. x = 1: 1 OR 0 = 1. Hence, value is 1. Answer is 1 AND NOT (0 AND 0) = 1.
760
Example: Evaluation – (4)
Let’s follow the y = 1 subproblem: ((∃x)(x OR 1)) AND NOT (0 AND 1) Need to evaluate (∃x)(x OR 1). x = 0: 0 OR 1 = 1. x = 1: 1 OR 1 = 1. Hence, value is 1. Answer is 1 AND NOT (0 AND 1) = 1.
761
Example: Evaluation – (5)
Now we can resolve the (outermost) x = 0 subproblem: (∃y)(((∃x)(x OR y)) AND NOT (0 AND y)) We found both of its subproblems are true. We only needed one, since the quantifier is ∃y. Hence, 1.
762
Example: Evaluation – (6)
Next, we must deal with the x = 1 case: (∃y)(((∃x)(x OR y)) AND NOT (1 AND y)) It also has the value 1, because the subproblem y = 0 evaluates to 1. Hence, the entire QBF has value 1.
763
The QBF Problem The problem QBF is: given a QBF with no free variables, is its value 1 (true)?
Theorem: QBF is PS-complete. Comment: What makes QBF extra hard? Alternation of quantifiers. Example: if only ∃ is used, then the problem is really SAT.
764
Part I: QBF is in PS Suppose we are given QBF F of length n.
F has at most n operators. We can evaluate F using a stack of subexpressions that never has more than n subexpressions, each of length < n. Thus, space used is O(n^2).
765
QBF is in PS – (2) Suppose we have subexpression E on top of the stack, and E = G OR H. Push G onto the stack. Evaluate it recursively. If true, return true. If false, replace G by H, and return what H returns.
766
QBF is in PS – (3) Cases E = G AND H and E = NOT G are handled similarly. If E = (∃x)G, then treat E as if it were E = E0 OR E1. Observe: the difference between ∃ and OR is succinctness; you don’t write both E0 and E1. But E0 and E1 must be almost the same. If E = (∀x)G, then treat E as if it were E = E0 AND E1.
767
Part II: All of PS Polytime Reduces to QBF
Recall that if a polyspace-bounded TM M accepts its input w of length n, then it does so in at most c^p(n) moves, where c is a constant and p is a polynomial. Use recursive doubling to construct a QBF saying “there is a sequence of at most c^p(n) moves of M leading to acceptance of w.”
768
Polytime Reduction: The Variables
We need collections of boolean variables that together represent one ID of M. A variable ID I is a collection of O(p(n)) variables yj,A. True iff the j-th position of the ID I is A (a state or tape symbol). 0 < j < p(n)+1 = length of an ID.
769
The Variables – (2) We shall need O(p(n)) variable ID’s.
So the total number of boolean variables is O(p^2(n)). Shorthand: (∃I), where I is a variable ID, is short for (∃y1)(∃y2)(…), where the y’s are the boolean variables belonging to I. Similarly (∀I).
770
Structure of the QBF The QBF is (∃I0)(∃If)(S AND N AND F AND U), where: I0 and If are variable ID’s representing the start and accepting ID’s respectively. U = “unique” = one symbol per position. S = “starts right”: I0 = q0w. F = “finishes right” = If accepts. N = “moves right.”
771
Structure of U, S, and F U is as done for Cook’s theorem.
S asserts that the first n+1 symbols of I0 are q0w, and other symbols are blank. F asserts one of the symbols of If is a final state. All are easy to write in O(p(n)) time.
772
Structure of QBF N N(I0,If) needs to say that I0⊦*If by at most c^p(n) moves. We construct subexpressions N0, N1, N2,… where Ni(I,J) says “I⊦*J by at most 2^i moves.” N is Nk, where k = log2 c^p(n) = O(p(n)). Note: differs from text, where the subscripts exponentiate.
773
Constructing the Ni’s Basis: N0(I, J) says “I=J OR I⊦J.”
If I represents variables yj,A and J represents variables zj,A, we say I=J by the boolean expression for yj,A = zj,A for all j and A. Remember: a=b is (a AND b) OR (NOT a AND NOT b). I⊦J uses the same idea as for SAT.
774
Induction Suppose we have constructed Ni and want to construct Ni+1.
Ni+1(I, J) = “there exists K such that Ni(I, K) and Ni(K, J).” We must be careful: We must write O(p(n)) formulas, each in polynomial time.
775
Induction – (2) If each formula used two copies of the previous formula, times and sizes would exponentiate. Trick: use to make one copy of Ni serve for two. Ni+1(I, J) = “if (P,Q) = (I,K) or (P,Q) = (K,J), then Ni(P, Q).”
776
Induction – (3) More formally, Ni+1(I, J) = (∃K)(∀P)(∀Q)( ((P ≠ I OR Q ≠ K) AND (P ≠ K OR Q ≠ J)) OR Ni(P, Q) ) That is, either the pair (P,Q) is neither (I,K) nor (K,J), or P⊦*Q in at most 2^i moves. [The tests like P ≠ I are expressed in terms of the boolean variables of the ID’s.]
777
Induction – (4) We can thus write Ni+1 in time O(p(n)) plus the time it takes to write Ni. Remember: N is Nk, where k = log2cp(n) = O(p(n)). Thus, we can write N in time O(p2(n)). Finished!! The whole QBF for w can be written in polynomial time.