1 Strings and Language Operations: Concatenation, Exponentiation, Kleene Star, Regular Expressions
2 Strings and Language Operations: Concatenation, Exponentiation, Kleene star. Regular expressions.
3 String Concatenation: If x and y are strings over alphabet Σ, the concatenation of x and y is the string xy formed by writing the symbols of x and the symbols of y consecutively. Suppose x = abb and y = ba. Then xy = abbba and yx = baabb.
4 Properties of String Concatenation Suppose x, y, and z are strings. Concatenation is not commutative. xy is not guaranteed to be equal to yx Concatenation is associative (xy)z = x(yz) = xyz The empty string is the identity for concatenation x/\ = /\x = x
5 Language Concatenation: Suppose L1 and L2 are languages (sets of strings). The concatenation of L1 and L2, denoted L1 L2, is defined as L1 L2 = { xy | x ∈ L1 and y ∈ L2 }. Example: Let L1 = { ab, bba } and L2 = { aa, b, ba }. What is L1 L2? Solution: Let x1 = ab, x2 = bba, y1 = aa, y2 = b, y3 = ba. Then L1 L2 = { x1 y1, x1 y2, x1 y3, x2 y1, x2 y2, x2 y3 } = { abaa, abb, abba, bbaaa, bbab, bbaba }.
6 Language Concatenation is not commutative: Let L1 = { aa, bb, ba } and L2 = { /\, aba }. Let x1 = aa, x2 = bb, x3 = ba, y1 = /\, y2 = aba. L1 L2 = { x1 y1, x1 y2, x2 y1, x2 y2, x3 y1, x3 y2 } = { aa, aaaba, bb, bbaba, ba, baaba }. L2 L1 = { y1 x1, y1 x2, y1 x3, y2 x1, y2 x2, y2 x3 } = { aa, bb, ba, abaaa, ababb, ababa }. L2 L2 = { y1 y1, y1 y2, y2 y1, y2 y2 } = { /\, aba, aba, abaaba } = { /\, aba, abaaba } (dropped extra aba).
7 Associativity of Language Concatenation: (L1 L2) L3 = L1 (L2 L3) = L1 L2 L3. Example: Let L1 = {a,b}, L2 = {c,d}, and L3 = {e,f}. L1 L2 L3 = ({a,b}{c,d}){e,f} = {ac, ad, bc, bd}{e,f} = { ace, acf, ade, adf, bce, bcf, bde, bdf }. L1 L2 L3 = {a,b}({c,d}{e,f}) = {a,b}{ce, cf, de, df} = { ace, acf, ade, adf, bce, bcf, bde, bdf }.
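The definition of language concatenation can be sketched in standard C++ for finite languages. This is only an illustration: the names Language and concat are hypothetical (they do not come from the slides), a language is modeled as a std::set of strings, and the empty std::string stands in for /\.

```cpp
#include <cassert>
#include <set>
#include <string>

// Assumption: a finite language is modeled as a set of strings; "" plays the role of /\.
using Language = std::set<std::string>;

// Returns { xy | x in l1 and y in l2 }.
Language concat(const Language& l1, const Language& l2) {
    Language result;
    for (const std::string& x : l1) {
        for (const std::string& y : l2) {
            result.insert(x + y);  // a std::set silently drops duplicate strings
        }
    }
    return result;
}
```

With L1 = { ab, bba } and L2 = { aa, b, ba } this produces the six strings of the worked example. Concatenating with the empty set yields the empty set, because the inner loop never runs.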
8 Special Cases What language is the identity for language concatenation? The set containing only the empty string /\: {/\} Example {aab,ba,abc}{/\} = {/\}{aab,ba,abc} = {aab,ba,abc} What about {}? For any language L, L {} = {} L = {} Thus {} for concatenation is like 0 for multiplication Example {aab,ba,abc}{} = {}{aab,ba,abc} = {} The intuitive reason is that we must choose a string from both sets that are being concatenated, but there is nothing to choose from {}.
9 Exponentiation: We use exponentiation to indicate the number of items being concatenated. Symbols: a^3 = aaa. Strings: x^3 = xxx. Sets of symbols (Σ, for example): Σ^3 = ΣΣΣ = { x ∈ Σ* | |x| = 3 }. Sets of strings (languages): L^3 = LLL.
10 Examples of Exponentiation: Let x = abb, Σ = {a,b}, L = {ab,b}. a^4 = aaaa. x^3 = (abb)(abb)(abb) = abbabbabb. Σ^3 = ΣΣΣ = {a,b}{a,b}{a,b} = {aaa, aab, aba, abb, baa, bab, bba, bbb}. L^3 = LLL = {ab,b}{ab,b}{ab,b} = {ababab, ababb, abbab, abbb, babab, babb, bbab, bbb}.
11 Results of Exponentiation: Exponentiation of a symbol or a string results in a string. Exponentiation of a set of symbols or a set of strings results in a set of strings. a symbol → a string; a string → a string; a set of symbols → a set of strings; a set of strings → a set of strings.
12 Special Cases of Exponentiation: a^0 = /\. x^0 = /\. Σ^0 = { /\ }. L^0 = { /\ } for any language L. Examples: {aa,bb}^0 = { /\ }; { a, aa, aaa, aaaa, … }^0 = { /\ }; { /\ }^0 = { /\ }; ∅^0 = { }^0 = { /\ }.
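Exponentiation of a finite language is repeated concatenation, which can be sketched as follows. The names Language, concat, and power are hypothetical helpers introduced for this illustration; the empty std::string stands in for /\.

```cpp
#include <cassert>
#include <set>
#include <string>

// Assumption: a finite language is modeled as a set of strings; "" plays the role of /\.
using Language = std::set<std::string>;

// Returns { xy | x in l1 and y in l2 }.
Language concat(const Language& l1, const Language& l2) {
    Language result;
    for (const std::string& x : l1)
        for (const std::string& y : l2)
            result.insert(x + y);
    return result;
}

// Returns l concatenated with itself k times.
// The starting value { /\ } is the identity for concatenation, so power(l, 0) is { /\ }
// for every l, including the empty language.
Language power(const Language& l, unsigned k) {
    Language result;
    result.insert("");
    for (unsigned i = 0; i < k; ++i) {
        result = concat(result, l);
    }
    return result;
}
```

Calling power with L = {ab, b} and k = 3 gives the eight strings listed for L^3, and k = 0 gives { /\ } regardless of L.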
13 Kleene Star: Kleene * is a unary operation on languages. Kleene * is not an operation on strings; however, see the pages on regular expressions. L* represents any finite number of concatenations of L: L* = ∪_{k≥0} L^k = L^0 ∪ L^1 ∪ L^2 ∪ … For any L, /\ is always an element of L* because L^0 = { /\ }. Thus, for any L, L* ≠ { }.
14 Example of Kleene Star: Let L = {aa}. L^0 = { /\ }, L^1 = L = { aa }, L^2 = { aaaa }, L^3 = … L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ … = { /\, aa, aaaa, aaaaaa, … } = the set of all strings that can be obtained by concatenating 0 or more copies of aa.
15 Example of Kleene Star: Let L = {aa, b}. L^0 = { /\ }, L^1 = L = { aa, b }, L^2 = LL = { aaaa, aab, baa, bb }, L^3 = … L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ … = the set of all strings that can be obtained by concatenating 0 or more copies of aa and b.
16 Regular Languages: Regular languages are languages that can be obtained from the very simple languages over Σ, using only union, concatenation, and Kleene star.
17 Examples of Regular Languages: {aab} (i.e. {a}{a}{b}). {aa,b} (i.e. {a}{a} ∪ {b}). {a,b}*: language of strings that can be obtained by concatenating any number of a's and b's. {bb}{a,b}*: language of strings that begin with bb (followed by any number of a's and b's). {a}*{bb,/\}: language of strings that begin with any number of a's and end with an optional bb. {a}* ∪ {b}*: language of strings that consist of only a's or only b's (including /\).
18 Regular Expressions: We can simplify the formula for regular languages slightly by leaving out the set brackets { } and replacing ∪ with +. The results are called regular expressions.
19 Examples of Regular Expressions
    Set notation                       Regular expression
    {aab}                              aab
    {aa,b} = {aa} ∪ {b}                aa+b
    {a,b}* = ({a} ∪ {b})*              (a+b)*
    {bb}{a,b}* = {bb}({a} ∪ {b})*      bb(a+b)*
    {a}*{bb,/\} = {a}*({bb} ∪ {/\})    a*(bb+/\)
    {a}* ∪ {b}*                        a*+b*
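L* is infinite whenever L contains a nonempty string, so a program can only list a finite slice of it. The sketch below lists every string of L* up to a chosen length bound; the names Language and kleeneStarUpTo and the length-bound parameter are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Assumption: a finite language is modeled as a set of strings; "" plays the role of /\.
using Language = std::set<std::string>;

// Returns every string of L* whose length is at most maxLen.
Language kleeneStarUpTo(const Language& l, std::size_t maxLen) {
    Language result;
    result.insert("");           // L^0 = { /\ } is always part of L*
    Language frontier = result;  // strings discovered in the previous round
    while (!frontier.empty()) {
        Language next;
        for (const std::string& prefix : frontier) {
            for (const std::string& piece : l) {
                const std::string candidate = prefix + piece;
                // Keep only new strings within the bound; this also guarantees termination.
                if (candidate.size() <= maxLen && result.insert(candidate).second) {
                    next.insert(candidate);
                }
            }
        }
        frontier = next;
    }
    return result;
}
```

For L = {aa, b} and a bound of 3 the result is { /\, b, aa, bb, aab, baa, bbb }.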
20 String or Language? Consider the regular expression a*(bb+/\). a*(bb+/\) is a string over alphabet {a, b, *, +, /\, (, )}. a*(bb+/\) represents a language over alphabet {a, b}: the language of strings over {a,b} that begin with any number of a's and end with an optional bb. Some regular expressions look just like strings over alphabet {a,b}: regular expression aaba represents the language {aaba}, and regular expression /\ represents the language {/\}. It should be clear from the context whether a sequence of symbols is a regular expression or just a string.
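The regular expression a*(bb+/\) can be tried out with the standard C++ library. Note the notational assumption: std::regex (default ECMAScript syntax) writes union as | rather than +, and here "bb or the empty string" is written as the optional group (bb)?. The function name inLanguage is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Tests whether s belongs to the language represented by a*(bb+/\).
// Assumption: the pattern "a*(bb)?" is the std::regex spelling of that expression.
bool inLanguage(const std::string& s) {
    static const std::regex pattern("a*(bb)?");
    // regex_match succeeds only if the whole string matches the pattern.
    return std::regex_match(s, pattern);
}
```

The pattern itself is a string over a larger alphabet, while the strings it accepts are strings over {a, b}, which mirrors the string-versus-language distinction made on the slide.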
21 Module 1: Course Overview Course: CSE 460 Instructor: Dr. Eric Torng TA: Phillip Mienk
22 What is this course? Philosophy of computing course: we take a step back to think about computing in broader terms. Science of computing course: we study fundamental ideas/results that shape the field of computer science. "Applied" computing course: we study a broad range of material with relevance to computing today.
23 Philosophy Phil. of life What is the purpose of life? What are we capable of accomplishing in life? Are there limits to what we can do in life? Why do we drive on parkways and park on driveways? Phil. of computing What is the purpose of programming? What can we achieve through programming? Are there limits to what we can do with programs? Why don’t debuggers actually debug programs?
24 Science: Physics: study of fundamental physical laws and phenomena like gravity and electricity. Engineering: governed by physical laws. Our material: study of fundamental computational laws and phenomena like undecidability and universal computers. Programming: governed by computational laws.
25 Applied computing Applications are not immediately obvious In some cases, seeing the applicability of this material requires advanced abstraction skills Every year, there are people who leave this course unable to see the applicability of the material Others require more material in order to completely understand their application for example, to understand how regular expressions and context-free grammars are applied to the design of compilers, you need to take a compilers course
26 Some applications Important programming languages regular expressions (perl) finite state automata (used in hardware design) context-free grammars Proofs of program correctness Subroutines Using them to prove problems are unsolvable String searching/Pattern matching Algorithm design concepts such as recursion
27 Fundamental Theme * What are the capabilities and limitations of computers and computer programs? What can we do with computers/programs? Are there things we cannot do with computers/programs?
28 Module 2: Fundamental Concepts Problems Programs Programming languages
29 Problems We view solving problems as the main application for computer programs
30 Definition: A problem is a mapping or function between a set of inputs and a set of outputs. Example problem: Sorting. Inputs (4,2,3,1) and (3,1,2,4) map to output (1,2,3,4); input (7,5,1) maps to output (1,5,7); input (1,2,3) maps to output (1,2,3).
31 How to specify a problem Input Describe what an input instance looks like Output Describe what task should be performed on the input In particular, describe what output should be produced
32 Example Problem Specifications*: Sorting problem. Input: integers n1, n2, ..., nk. Output: n1, n2, ..., nk in nondecreasing order. Find element problem. Input: integers n1, n2, …, nk and a search key S. Output: yes if S is in n1, n2, …, nk, no otherwise.
33 Programs Programs solve problems
34 Purpose Why do we write programs? One answer To solve problems What does it mean to solve a problem? Informal answer: For every legal input, a correct output is produced. Formal answer: To be given later
35 Programming Language Definition A programming language defines what constitutes a legal program Example: a pseudocode program may not be a legal C++ program which may not be a legal C program A programming language is typically referred to as a “computational model” in a course like this.
36 C++: Our programming language will be C++ with minor modifications. The main procedure will use input parameters in a fashion similar to other procedures (no argc/argv). Output will be returned; its type is specified by the return type of the main function.
37 Maximum Element Problem Input integer n >= 1 List of n integers Output The largest of the n integers
38 C++ Program which solves the Maximum Element Problem*
    int main(int A[], int n) {
        int i, max;
        if (n < 1)                    /* illegal input: n must be at least 1 */
            return ("Illegal Input");
        max = A[0];
        for (i = 1; i < n; i++)       /* keep the largest value seen so far */
            if (A[i] > max)
                max = A[i];
        return (max);
    }
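The slide's program is written in the course's modified C++, where main takes the problem input directly. For comparison, here is a minimal sketch of the same algorithm in standard C++. The function name maxElement and the choice to signal illegal input with an exception are assumptions of this sketch, not part of the course's model.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the largest of the integers in a.
// Assumption: an empty list is the illegal input and is reported by throwing.
int maxElement(const std::vector<int>& a) {
    if (a.empty()) {
        throw std::invalid_argument("Illegal Input");
    }
    int best = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (a[i] > best) {
            best = a[i];  // keep the largest value seen so far
        }
    }
    return best;
}
```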
39 Fundamental Theme Exploring capabilities and limitations of C++ programs
40 Restating the Fundamental Theme * We will study the capabilities and limits of C++ programs Specifically, we will try and identify What problems can be solved by C++ programs What problems cannot be solved by C++ programs
41 Question: Is C++ general enough? Or is it possible that there exists some problem Π such that Π can be solved by some program P in some other reasonable programming language but Π cannot be solved by any C++ program?
42 Church’s Thesis (modified) We have no proof of an answer, but it is commonly accepted that the answer is no. Church’s Thesis (three identical statements) C++ is a general model of computation Any algorithm can be expressed as a C++ program If some algorithm cannot be expressed by a C++ program, it cannot be expressed in any reasonable programming language
43 Summary * Problems When we talk about what programs can or cannot “DO”, we mean what PROBLEMS can or cannot be solved
44 Module 3: Classifying Problems One of the main themes of this course will be to classify problems in various ways By solvability Solvable, “half-solvable”, unsolvable We will focus our study on decision problems function (one correct answer for every input) finite range (yes or no is the correct output)
45 Classification Process: Take some set of problems and partition it into two or more subsets of problems, where membership in a subset is based on some shared problem characteristic. (Figure: a set of problems partitioned into Subset 1, Subset 2, and Subset 3.)
46 Classify by Solvability: The criterion used is whether or not the problem is solvable; that is, does there exist a C++ program which solves the problem? (Figure: the set of all problems partitioned into solvable problems and unsolvable problems.)
47 Function Problems: We will focus on problems where the mapping from input to output is a function. (Figure: the set of all problems partitioned into non-function problems and function problems.)
48 General (Relation) Problem the mapping is a relation that is, more than one output is possible for a given input Inputs Outputs
49 Criteria for Function Problems mapping is a function unique output for each input Inputs Outputs
50 Example Non-Function Problem Divisor Problem Input: Positive integer n Output: A positive integral divisor of n Inputs Outputs
51 Example Function Problems Sorting Multiplication Problem Input: 2 integers x and y Output: xy Inputs Outputs 2,5 10
52 Another Example * Maximum divisor problem Input: Positive integer n Output: size of maximum divisor of n smaller than n Inputs Outputs
53 Decision Problems: We will focus on function problems where the correct answer is always yes or no. (Figure: the set of function problems partitioned into non-decision problems and decision problems.)
54 Criteria for Decision Problems Output is yes or no range = {Yes, No} Note, problem must be a function problem only one of Yes/No is correct Inputs Outputs Yes No
55 Example: Decision sorting. Input: list of integers. Yes/No question: Is the list in nondecreasing order? Example inputs and outputs: (1,3,2,4) maps to No; (1,2,3,4) maps to Yes.
56 Another Example: Decision multiplication. Input: three integers x, y, z. Yes/No question: Is xy = z? Example inputs and outputs: (3,5,14) maps to No; (3,5,15) maps to Yes.
57 A Third Example *: Decision divisor problem. Input: two integers x and y. Yes/No question: Is y a divisor of x? Example inputs and outputs: (14,5) maps to No; (14,7) maps to Yes.
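Each of the three example decision problems can be sketched as a function returning bool, with true standing for yes and false for no. The function names are hypothetical, and the treatment of y = 0 in the divisor test is an assumption, since the slides do not say how that input should be answered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Decision sorting: is the list in nondecreasing order?
bool isNondecreasing(const std::vector<int>& list) {
    for (std::size_t i = 1; i < list.size(); ++i) {
        if (list[i - 1] > list[i]) return false;
    }
    return true;
}

// Decision multiplication: is xy = z?
// The product is computed in a wider type so that it cannot overflow for int inputs.
bool isProduct(int x, int y, int z) {
    return static_cast<long long>(x) * y == z;
}

// Decision divisor: is y a divisor of x?
// Assumption: 0 is treated as a divisor of 0 only.
bool isDivisor(int x, int y) {
    if (y == 0) return x == 0;
    return x % y == 0;
}
```

Every function returns exactly one of the two answers for every input, which is what makes these function problems with range {Yes, No}.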
58 Focus on Decision Problems Set of All Problems Solvable ProblemsUnsolvable Problems Decision Problems Other Probs When studying solvability, we are going to focus specifically on decision problems There is no loss of generality, but we will not explore that here
59 Finite Domain Problems: These problems have only a finite number of inputs. (Figure: the set of all problems partitioned into finite domain problems and infinite domain problems.)
60 Lack of Generality: All finite domain problems can be solved using the "table lookup" idea. (Figure: the set of all problems split into solvable and unsolvable problems and into finite domain and infinite domain problems; the region of unsolvable finite domain problems is empty.)
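61 Table Lookup Program
    int main(string x) {
        /* one hard-coded answer per legal input; C++ cannot switch on a string */
        if (x == "Bill") return (3);
        if (x == "Judy") return (25);
        if (x == "Tom") return (30);
        cerr << "Illegal input\n";    /* any other input is illegal */
    }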
62 Key Concepts Classification Theme Decision Problems Important subset of problems We can focus our attention on decision problems without loss of generality Same is not true for finite domain problems Table lookup
63 Module 4: Formal Definition of Solvability Analysis of decision problems Two types of inputs:yes inputs and no inputs Language recognition problem Analysis of programs which solve decision problems Four types of inputs: yes, no, crash, loop inputs Solving and not solving decision problems Classifying Decision Problems Formal definition of solvable and unsolvable decision problems
64 Analyzing Decision Problems Can be defined by two sets
65 Decision Problems and Sets: Decision problems consist of 3 sets: the set of legal input instances (or universe of input instances), the set of "yes" input instances, and the set of "no" input instances. (Figure: the set of all legal inputs partitioned into yes inputs and no inputs.)
66 Redundancy * Only two of these sets are needed; the third is redundant Given The set of legal input instances (or universe of input instances) –This is given by the description of a typical input instance The set of “yes” input instances –This is given by the yes/no question We can compute The set of “no” input instances
67 Typical Input Universes: Σ*: the set of all finite-length strings over a finite alphabet Σ. Examples: {a}*: {/\, a, aa, aaa, aaaa, aaaaa, … }; {a,b}*: {/\, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, … }; {0,1}*: {/\, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, … }. The set of all integers. If the input universe is understood, a decision problem can be specified by just giving the set of yes input instances.
68 Language Recognition Problem: Input universe: Σ* for some finite alphabet Σ. Yes input instances: some set L subset of Σ*. No input instances: Σ* - L. When Σ is understood, a language recognition problem can be specified by just stating what L is.
69 Language Recognition Problem *: Traditional formulation. Input: a string x over some finite alphabet Σ. Task: is x in some language L subset of Σ*? 3 set formulation. Input universe: Σ* for a finite alphabet Σ. Yes input instances: some set L subset of Σ*. No input instances: Σ* - L. When Σ is understood, a language recognition problem can be specified by just stating what L is.
70 Equivalence of Decision Problems and Languages All decision problems can be formulated as language recognition problems Simply develop an encoding scheme for representing all inputs of the decision problem as strings over some fixed alphabet The corresponding language is just the set of strings encoding yes input instances In what follows, we will often use decision problems and languages interchangeably
71 Visualization * (Figure: an encoding scheme over alphabet Σ maps the yes inputs of the original decision problem to language L and the no inputs to Σ* - L, giving the corresponding language recognition problem.)
72 Analyzing Programs which Solve Decision Problems Four possible outcomes
73 Program Declaration *: Suppose a program P is designed to solve some decision problem Π. What does P's declaration look like? What should P return on a yes input instance? What should P return on a no input instance?
74 Program Declaration II: Suppose a program P is designed to solve a language recognition problem Π. What does P's declaration look like? bool main(string x) { We will assume that the string declaration is correctly defined for the input alphabet Σ. If Σ = {a,b}, then string will define variables consisting of only a's and b's. If Σ = {a, b, …, z, A, …, Z}, then string will define variables consisting of any string of alphabet characters.
75 Programs and Inputs Notation P denotes a program x denotes an input for program P 4 possible outcomes of running P on x P halts and says yes: P accepts input x P halts and says no: P rejects input x P halts without saying yes or no: P crashes on input x We typically ignore this case as it can be combined with rejects P never halts: P infinite loops on input x
76 Programs and the Set of Legal Inputs Based on the 4 possible outcomes of running P on x, P partitions the set of legal inputs into 4 groups Y(P): The set of inputs P accepts When the problem is a language recognition problem, Y(P) is often represented as L(P) N(P): The set of inputs P rejects C(P): The set of inputs P crashes on I(P): The set of inputs P infinite loops on Because L(P) is often used in place of Y(P) as described above, we use notation I(P) to represent this set
77 Illustration (Figure: the set of all inputs partitioned into Y(P), N(P), C(P), and I(P).)
78 Analyzing Programs and Decision Problems Distinguish the two carefully
79 Program solving a decision problem: Formal definition: A program P solves decision problem Π if and only if the set of legal inputs for P is identical to the set of input instances of Π, Y(P) is the same as the set of yes input instances for Π, and N(P) is the same as the set of no input instances for Π. Otherwise, program P does not solve problem Π. Note: C(P) and I(P) must be empty in order for P to solve problem Π.
80 Solvable Problem: A decision problem Π is solvable if and only if there exists some C++ program P which solves Π. When the decision problem is a language recognition problem for language L, we often say that L is solvable or L is decidable. A decision problem Π is unsolvable if and only if no C++ program P solves Π. A similar comment applies as above.
81 Illustration of Solvability (Figure: the inputs of program P, partitioned into Y(P), N(P), C(P), and I(P), drawn against the inputs of problem Π, partitioned into yes inputs and no inputs.)
82 Program half-solving a problem: Formal definition: A program P half-solves problem Π if and only if the set of legal inputs for P is identical to the set of input instances of Π, Y(P) is the same as the set of yes input instances for Π, and N(P) union C(P) union I(P) is the same as the set of no input instances for Π. Otherwise, program P does not half-solve problem Π. Note: C(P) and I(P) need not be empty.
83 Half-solvable Problem: A decision problem Π is half-solvable if and only if there exists some C++ program P which half-solves Π. When the decision problem is a language recognition problem for language L, we often say that L is half-solvable. A decision problem Π is not half-solvable if and only if no C++ program P half-solves Π.
84 Illustration of Half-Solvability * (Figure: the inputs of program P, partitioned into Y(P), N(P), C(P), and I(P), drawn against the inputs of problem Π, partitioned into yes inputs and no inputs.)
85 Hierarchy of Decision Problems: The set of half-solvable decision problems is a proper subset of the set of all decision problems. The set of solvable decision problems is a proper subset of the set of half-solvable decision problems. (Figure: nested regions, with solvable inside half-solvable inside all decision problems.)
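The difference between solving and half-solving can be made concrete on a small finite universe. The sketch below is a model only: whether a real program loops cannot be observed by running it, so the behavior of P is supplied as a hypothetical table from inputs to outcomes. The names Outcome, Behavior, solves, and halfSolves are assumptions of this sketch, and the table is assumed to list every legal input.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// The four possible outcomes of running a program on an input.
enum class Outcome { Accept, Reject, Crash, Loop };

// Assumption: the behavior of P is given as a table with one entry per legal input.
using Behavior = std::map<std::string, Outcome>;

// P solves the problem: Y(P) equals the yes inputs and N(P) equals the no inputs,
// so C(P) and I(P) must be empty.
bool solves(const Behavior& p, const std::set<std::string>& yesInputs) {
    for (const auto& entry : p) {
        const bool isYes = yesInputs.count(entry.first) > 0;
        if (isYes && entry.second != Outcome::Accept) return false;
        if (!isYes && entry.second != Outcome::Reject) return false;
    }
    return true;
}

// P half-solves the problem: Y(P) equals the yes inputs; on a no input
// P may reject, crash, or loop.
bool halfSolves(const Behavior& p, const std::set<std::string>& yesInputs) {
    for (const auto& entry : p) {
        const bool isYes = yesInputs.count(entry.first) > 0;
        if (isYes != (entry.second == Outcome::Accept)) return false;
    }
    return true;
}
```

Any table accepted by solves is also accepted by halfSolves, while a table that loops or crashes on some no input is accepted only by halfSolves.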
86 Why study half-solvable problems? A correct program must halt on all inputs Why then do we define and study half-solvable problems? One Answer: the set of half-solvable problems is the natural class of problems associated with general computational models like C++ Every program half-solves some decision problem Some programs do not solve any decision problem In particular, programs which do not halt do not solve their corresponding decision problems
87 Key Concepts Four possible outcomes of running a program on an input The four subsets every program divides its set of legal inputs into Formal definition of a program solving (half-solving) a decision problem a problem being solvable (half-solvable) Be precise: with the above two statements!
88 Module 5 Topics Proof of the existence of unsolvable problems Proof Technique –There are more problems/languages than there are programs/algorithms –Countable and uncountable infinities
89 Overview We will show that there are more problems than programs Actually more problems than programs in any computational model (programming language) Implication Some problems are not solvable
90 Preliminaries Define set of problems Observation about programs
91 Define set of problems We will restrict the set of problems to be the set of language recognition problems over the alphabet {a}. That is Universe: {a}* Yes Inputs: Some language L subset of {a}* No Inputs: {a}* - L
92 Set of Problems *: The number of distinct problems is given by the number of languages L subset of {a}*. 2^({a}*) is our shorthand for this set of subset languages. Examples of languages L subset of {a}*: 0 elements: { }. 1 element: {/\}, {a}, {aa}, {aaa}, {aaaa}, … 2 elements: {/\, a}, {/\, aa}, {a, aa}, … Infinite # of elements: {a^n | n is even}, {a^n | n is prime}, {a^n | n is a perfect square}.
93 Infinity and {a}*: All strings in {a}* have finite length. The number of strings in {a}* is infinite. The number of languages L in 2^({a}*) is infinite. The number of strings in a language L in 2^({a}*) may be finite or infinite.
94 Define set of programs: The set of programs we will consider is the set of legal C++ programs as defined in earlier lectures. Key observation: each C++ program can be thought of as a finite-length string over alphabet Σ_P, where Σ_P = {a, …, z, A, …, Z, 0, …, 9, white space, punctuation}.
95 Example * int main(int A[], int n){ {26 characters including newline} int i, max; {13 characters including initial tab} {1 character: newline} if (n < 1) {12 characters} return (“Illegal Input”); {28 characters including 2 tabs} max = A[0]; {13 characters} for (i = 1; i < n; i++) {25 characters} if (A[i] > max) {18 characters} max = A[i]; {15 characters} return (max); {15 characters} } {2 characters including newline}
96 Number of programs: The set of legal C++ programs is clearly infinite. It is also no more than |Σ_P*|, where Σ_P = {a, …, z, A, …, Z, 0, …, 9, white space, punctuation}.
97 Goal: Show that the number of languages L in 2^({a}*) is greater than the number of strings in Σ_P*, where Σ_P = {a, …, z, A, …, Z, 0, …, 9, white space, punctuation}. Problem: both are infinite.
98 How do we compare the relative sizes of infinite sets? Bijection (yes) Proper subset (no)
99 Bijections Two sets have EQUAL size if there exists a bijection between them bijection is a 1-1 and onto function between two sets Examples Set {1, 2, 3} and Set {A, B, C} Positive even numbers and positive integers
100 Bijection Example: Positive integers and positive even integers are paired by i ↔ 2i (1 ↔ 2, 2 ↔ 4, 3 ↔ 6, …).
101 Proper subset Finite sets S1 proper subset of S2 implies S2 is strictly bigger than S1 Example –women proper subset of people –number of women less than number of people Infinite sets Counterexample even numbers and integers
102 Two sizes of infinity Countable Uncountable
103 Countably infinite set S * Definition 1 S is equal in size (bijection) to N N is the set of natural numbers {1, 2, 3, …} Definition 2 (Key property) There exists a way to list all the elements of set S (enumerate S) such that the following is true Every element appears at a finite position in the infinite list
104 Uncountable infinity *: Any infinite set which is not countably infinite. Examples: the set of real numbers; 2^({a}*), the set of all languages L which are a subset of {a}*. There are further gradations within this class, but we ignore them.
105 Proof
106 (1) The set of all legal C++ programs is countably infinite: Every C++ program is a finite string. Thus, the set of all legal C++ programs is a language L_C. This language L_C is a subset of Σ_P*.
107 For any alphabet Σ, Σ* is countably infinite: Enumeration ordering: all length 0 strings (|Σ|^0 = 1 string: /\), all length 1 strings (|Σ| strings), all length 2 strings (|Σ|^2 strings), … Thus, Σ_P* is countably infinite.
108 Example with alphabet {a,b} *: Length 0 strings: 0 and /\. Length 1 strings: 1 and a, 2 and b. Length 2 strings: 3 and aa, 4 and ab, 5 and ba, 6 and bb, ... Question: write a program that takes a number as input and computes the corresponding string as output.
109 (2) The set of languages in 2^({a}*) is uncountably infinite: Diagonalization proof technique. "Algorithmic" proof. Typically presented as a proof by contradiction.
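One possible way to compute the string at a given position of this enumeration is sketched below. It is not the course's reference solution; the function name nthString and the representation of the alphabet as a std::string of distinct symbols (in the order used for listing) are assumptions. The idea is that position n corresponds to n written in "bijective base k", where k is the alphabet size.

```cpp
#include <cassert>
#include <string>

// Returns the string at position n when all strings over `alphabet` are listed
// by increasing length, and within one length in the order given by `alphabet`.
// Position 0 is the empty string /\.
std::string nthString(unsigned long long n, const std::string& alphabet = "ab") {
    assert(!alphabet.empty());
    const unsigned long long k = alphabet.size();
    std::string result;
    while (n > 0) {
        --n;  // shift so that the last symbol is n % k, counting from 0
        result.insert(result.begin(), alphabet[n % k]);
        n /= k;
    }
    return result;
}
```

For the alphabet {a,b} this maps 0, 1, 2, 3, 4, 5, 6 to /\, a, b, aa, ab, ba, bb. Every string appears at some finite position, which is the key property of a countably infinite set.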
110 Algorithm Overview * To prove this set is uncountably infinite, we construct an algorithm D that behaves as follows: Input A countably infinite list of languages L[] subset of {a}* Output A language D(L[]) which is a subset of {a}* that is not on list L[]
111 Visualizing D List L[] L[0] L[1] L[2] L[3]... Language D(L[]) not in list L[] Algorithm D
112 Why existence of D implies result: If the number of languages in 2^({a}*) is countably infinite, there exists a list L[] such that L[] is complete (it contains every language in 2^({a}*)) and L[] is countably infinite. The existence of algorithm D implies that no list of languages in 2^({a}*) is both complete and countably infinite. Specifically, the existence of D shows that any countably infinite list of languages is not complete.
113 Visualizing One Possible L[ ] * (Table: rows L[0], L[1], L[2], L[3], L[4], ...; columns /\, a, aa, aaa, aaaa, ...; each entry is IN or OUT.) #Rows is countably infinite: given. #Cols is countably infinite: {a}* is countably infinite. Consider each string to be a feature: a set contains or does not contain each string.
114 Constructing D(L[ ]) *: We construct D(L[]) by using a unique feature (string) to differentiate D(L[]) from L[i]. Typically we use the ith string for language L[i]; thus the name diagonalization. (Table: rows L[0], L[1], L[2], L[3], L[4], ... and a final row D(L[]); columns /\, a, aa, aaa, aaaa, ...; each entry is IN or OUT, and the entry of D(L[]) in column i is the opposite of the entry of L[i] in column i.)
115 Questions *: Do we need to use the diagonal? Every other column and every row? Every other row and every column? What properties are needed to construct D(L[])? (Table: rows L[0], L[1], L[2], L[3], L[4], ...; columns /\, a, aa, aaa, aaaa, ...; each entry is IN or OUT.)
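The diagonal construction can be sketched on a finite prefix of a list of languages over {a}. In this sketch a language is modeled by a membership predicate on n, meaning "a^n is in the language"; the names UnaryLanguage and diagonal are hypothetical, and the behavior beyond the listed prefix is an arbitrary choice made only to keep the example finite.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Assumption: a language over {a} is modeled by a predicate on n ("is a^n in the language?").
using UnaryLanguage = std::function<bool(std::size_t)>;

// Returns the membership test of D(L[]):
// a^i is in D(L[]) exactly when a^i is NOT in L[i].
UnaryLanguage diagonal(const std::vector<UnaryLanguage>& list) {
    return [list](std::size_t i) {
        // Beyond the listed prefix the choice is arbitrary; this sketch answers OUT.
        return i < list.size() ? !list[i](i) : false;
    };
}
```

By construction D(L[]) disagrees with L[i] on the string a^i for every listed i, so it cannot be equal to any language on the list.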
116 Visualization Solvable Problems The set of solvable problems is a proper subset of the set of all problems. All problems
117 Summary Equal size infinite sets: bijections Countable and uncountable infinities More languages than algorithms Number of algorithms countably infinite Number of languages uncountably infinite Diagonalization technique Construct D(L[]) using infinite set of features The set of solvable problems is a proper subset of the set of all problems
118 Module 6 Topics Program behavior problems Input of problem is a program/algorithm Definition of type program Program correctness Testing versus Proving
119 Number Theory Problems These are problems where we investigate properties of numbers Primality Input: Positive integer n Yes/No Question: Is n a prime number? Divisor Input: Integers m,n Yes/No question: Is m a divisor of n?
120 Graph Theory Problems: These are problems where we investigate properties of graphs. Connected: Input: graph G. Yes/No question: Is G a connected graph? Subgraph: Input: graphs G1 and G2. Yes/No question: Is G1 a subgraph of G2?
121 Program Behavior Problems *: These are problems where we investigate properties of programs and how they behave. Give an example problem with one input program P. Give an example problem with two input programs P1 and P2.
122 Program Representation: Program variables: abstractly, we define the type "program" (graph G, program P). More concretely, we define type program to be a string over the program alphabet Σ_P = {a, …, z, A, …, Z, 0, …, 9, punctuation, white space}. Note, many strings over Σ_P are not legal programs; we consider them to be programs that always crash. Possible declaration of main procedure: bool main(program P)
123 Program correctness * How do we determine whether or not a program P we have written is correct? What are some weaknesses of this approach? What might be a better approach?
124 Testing versus Analyzing (Figure: testing feeds test inputs x1, x2, x3, ... to program P and observes outputs P(x1), P(x2), P(x3), ...; analyzing feeds program P itself to an analyzer, which outputs an analysis of program P.)
125 2 Program Behavior Problems *: Correctness: Input: program P. Yes/No question: Does P correctly solve the primality problem? Functional equivalence: Input: programs P1, P2. Yes/No question: Is program P1 functionally equivalent to program P2?
126 Module 7 Halting Problem Fundamental program behavior problem A specific unsolvable problem Diagonalization technique revisited Proof more complex
127 Definition Input Program P Assume the input to program P is a single unsigned int –This assumption is not necessary, but it simplifies the following unsolvability proof –To see the full generality of the halting problem, remove this assumption Nonnegative integer x, an input for program P Yes/No Question Does P halt when run on x? Notation Use H as shorthand for halting problem when space is a constraint
128 Example Input *
    Program with one input of type unsigned int:
    bool main(unsigned int Q) {
        unsigned int i = 2;
        if ((Q == 0) || (Q == 1)) return false;
        while (i < Q) {               /* try every candidate divisor below Q */
            if (Q % i == 0) return (false);
            i++;
        }
        return (true);
    }
    Input x: 4
129 Three key definitions
130 Definition of list L *: Σ_P* is countably infinite, where Σ_P = {characters, digits, white space, punctuation}. Type program will be type string with Σ_P as the alphabet. Define L to be the strings in Σ_P* listed in enumeration order: length 0 strings first, length 1 strings next, … Every program is a string in Σ_P*. For simplicity, consider only programs that have one input; the type of this input is an unsigned int. Consider strings in Σ_P* that are not legal programs to be programs that always crash (and thus halt on all inputs).
131 Definition of P_H *: If H is solvable, some program must solve H. Let P_H be a procedure which solves H. We declare it as a procedure because we will use P_H as a subroutine. Declaration of P_H: bool P_H(program P, unsigned int x). In general, the type of x should be the type of the input to P. Comments: We do not know how P_H works. However, if H is solvable, we can build programs which call P_H as a subroutine.
132 Definition of program D
    bool main(unsigned int y) /* main for program D */
    {
        program P = generate(y);
        if (P_H(P, y))
            while (1 > 0);
        else
            return (yes);
    }
    /* generate the yth string in Σ_P* in enumeration order */
    program generate(unsigned int y)
    /* code for program of slide 21 from module 5 did this for {a,b}* */
    bool P_H(program P, unsigned int x)
    /* how P_H solves H is unknown */
133 Generating P_y from y *: We won't go into this in detail here. This was the basis of the question at the bottom of slide 21 of lecture 5 (the alphabet for that problem was {a,b} instead of Σ_P). This is the main place where our assumption about the input type for program P is important; for other input types, how to do this would vary. Specification: 0 maps to program /\ (the empty string), 1 maps to program a, 2 maps to program b, 3 maps to program c, …, 26 maps to program z, 27 maps to program A, …
134 Proof that H is not solvable
135 Argument Overview *: H is solvable → P_H exists (definition of solvability) → D exists (D's code) → D is on list L (L is the list of all programs). Contrapositive: D is NOT on list L → D does NOT exist → P_H does NOT exist → H is NOT solvable. (p → q is logically equivalent to (not q) → (not p).)
136 Proving D is not on list L Use list L to specify a program behavior B that is distinct from all real program behaviors (for programs with one input of type unsigned int) Diagonalization argument similar to the one for proving the number of languages over {a} is uncountably infinite No program P exists that exhibits program behavior B Argue that D exhibits program behavior B Thus D cannot exist and thus is not on list L
137 Non-existent program behavior B
138 Visualizing List L * [Table: one row per program P 0, P 1, P 2, P 3, P 4, ...; one column per nonnegative integer input; each entry is H (halts) or NH (does not halt).] #Rows is countably infinite because P* is countably infinite. #Cols is countably infinite because the set of nonnegative integers is countably infinite. Consider each number to be a feature: a program halts or doesn’t halt on each integer. We have a fixed L this time.
139 Diagonalization to specify B * [Table: the same H/NH table for P 0, P 1, P 2, P 3, P 4, ..., with an added row B whose entry in column i is the opposite of the entry of P i in column i.] We specify a non-existent program behavior B by using a unique feature (number) to differentiate B from P i.
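On any finite corner of the table, the diagonal construction can be carried out mechanically. The sketch below is an illustration only: the name `diagonal_behavior` and the boolean encoding (true for H, false for NH) are assumptions, the table is assumed to be square, and the real list L is infinite.

```cpp
#include <cstddef>
#include <vector>

// table[i][j] is true when program P_i halts on input j.
// Returns the first entries of behavior B: B halts on input i exactly when
// P_i does not halt on input i, so B differs from every row on the diagonal.
std::vector<bool> diagonal_behavior(const std::vector<std::vector<bool>>& table) {
    std::vector<bool> b;
    for (std::size_t i = 0; i < table.size(); ++i) {
        b.push_back(!table[i][i]);
    }
    return b;
}
```

Because B disagrees with P i on input i for every i, no row of the table can equal B.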
140 Arguing D exhibits program behavior B
141 Code for D
bool main(unsigned int y) /* main for program D */
{
    program P = generate(y);
    if (P_H(P, y)) while (1 > 0);
    else return (yes);
}
/* generate the yth string in P* in enumeration order */
program generate(unsigned int y)
/* code for extra credit program of slide 21 from lecture 5 did this for {a,b}* */
bool P_H(program P, unsigned int x)
/* how P_H solves H is unknown */
142 Visualization of D in action on input y Program D with input y (type for y: unsigned int): Given input y, generate the program (string) P y. Run P H on P y and y; this is guaranteed to halt since P H solves H. IF (P H (P y,y)) while (1>0); else return (yes); [Table: rows P 0, P 1, P 2, ..., P y, ... and a row for D; in column y, the entry for D is the opposite of the entry for P y.]
143 Alternate Proof
144 Alternate Proof Overview For every program P y, there is a number y that we associate with it The number we use to distinguish program P y from D is this number y Using this idea, we can arrive at a contradiction without explicitly using the table L The diagonalization is hidden
145 H is not solvable, proof II Assume H is solvable Let P H be the program which solves H Use P H to construct program D which cannot exist Contradiction This means program P H cannot exist. This implies H is not solvable D is the same as before
146 Arguing D cannot exist If D is a program, it must have an associated number y. What does D do on this number y? There are 2 cases. Case 1: D halts on y. This means P H (D,y) = NO (definition of D). This means D does not halt on y (P H solves H). Contradiction: this case is not possible.
147 Continued Case 2: D does not halt on this number y. This means P H (D,y) = YES (definition of D). This means D halts on y (P H solves H). Contradiction: this case is not possible. Neither case is possible, but one must hold for D to exist. Thus D cannot exist.
148 Implications * The Halting Problem is one of the simplest problems we can formulate about program behavior We can use the fact that it is unsolvable to show that other problems about program behavior are also unsolvable This has important implications restricting what we can do in the field of software engineering In particular, “perfect” debuggers/testers do not exist We are forced to “test” programs for correctness even though this approach has many flaws
149 Summary Halting Problem definition Basic problem about program behavior Halting Problem is unsolvable We have identified a specific unsolvable problem Diagonalization technique Proof more complicated because we actually need to construct D, not just give a specification B
150 Module 8 Closure Properties Definition Language class definition set of languages Closure properties and first-order logic statements For all, there exists
151 Closure Properties A set is closed under an operation if applying the operation to elements of the set produces another element of the set Example/Counterexample set of integers and addition set of integers and division
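152 Integers and Addition [Figure: 2 and 5 are inside the set Integers, and their sum 7 is also inside Integers.]
153 Integers and Division [Figure: 2 and 5 are inside the set Integers, but the quotient 2/5 = .4 lies outside Integers.]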
154 Language Classes We will be interested in closure properties of language classes A language class is a set of languages Thus, the elements of a language class (set of languages) are languages which are sets themselves Crucial Observation When we say that a language class is closed under some set operation, we apply the set operation to the languages (elements of the language classes) rather than the language classes themselves
155 Example Language Classes * In all these examples, we do not explicitly state what the underlying alphabet is Finite languages Languages with a finite number of strings CARD-3 Languages with at most 3 strings
156 Finite Sets and Set Union * [Figure: {0,1,00} and {0,1,11} are inside Finite Sets, and their union {0,1,00,11} is also inside Finite Sets.]
157 CARD-3 and Set Union [Figure: {0,1,00} and {0,1,11} are inside CARD-3, but their union {0,1,00,11} has 4 elements and lies outside CARD-3.] CARD-3: sets with at most 3 elements
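The counterexample can be checked directly. This is a small sketch, not part of the original slides; the names `Language`, `in_card3`, and `set_union` are hypothetical, and languages are represented as finite sets of strings.

```cpp
#include <set>
#include <string>

using Language = std::set<std::string>;

// Membership test for the language class CARD-3 (at most 3 strings).
bool in_card3(const Language& l) { return l.size() <= 3; }

// Set union of two languages.
Language set_union(const Language& a, const Language& b) {
    Language result = a;
    result.insert(b.begin(), b.end());
    return result;
}
```

Calling `set_union` on {0,1,00} and {0,1,11} yields a set of 4 strings, so `in_card3` holds for both operands but fails for the result. One such pair is enough to show that CARD-3 is not closed under union.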
158 Finite Sets and Set Complement [Figure: {0,1,01} is inside Finite Sets, but its complement {/\,00,10,11,000,...} is infinite and lies outside Finite Sets.]
159 Infinite Number of Facts A closure property often represents an infinite number of facts. Example: The set of finite languages is closed under the set union operation. {} union {} is a finite language, {} union {0} is a finite language, ..., {0} union {} is a finite language, ...
160 First-order logic and closure properties * A way to formally write (not prove) a closure property For all L 1,...,L k in LC, op (L 1,... L k ) in LC Only one expression is needed because of the for all quantifier Number of languages k is determined by arity of the operation op
161 Example F-O logic statements * For all L 1,L 2 in FINITE, L 1 union L 2 in FINITE For all L 1,L 2 in CARD-3, L 1 union L 2 in CARD-3 For all L in FINITE, L c in FINITE For all L in CARD-3, L c in CARD-3
162 Stating a closure property is false What is true if a set is not closed under some k-ary operator? There exist k elements of that set which, when combined together under the given operator, produce an element not in the set There exists L 1,...,L k in LC, op (L 1, …, L k ) not in LC Example Finite sets and set complement
163 Complementing a F-O logic statement Complement “For all L 1,L 2 in CARD-3, L 1 union L 2 in CARD-3” not (For all L 1,L 2 in CARD-3, L 1 union L 2 in CARD-3) There exists L 1,L 2 in CARD-3, not (L 1 union L 2 in CARD-3) There exists L 1,L 2 in CARD-3, L 1 union L 2 not in CARD-3
164 Proving/Disproving * Which is easier and why? Proving a closure property is true Proving a closure property is false
165 Module 9 Recursive and r.e. language classes representing solvable and half-solvable problems Proofs of closure properties for the set of recursive (solvable) languages for the set of r.e. (half-solvable) languages Generic element/template proof technique Relationship between RE and REC pseudoclosure property
166 RE and REC language classes REC: A solvable language is commonly referred to as a recursive language for historical reasons. REC is defined to be the set of solvable or recursive languages. RE: A half-solvable language is commonly referred to as a recursively enumerable or r.e. language. RE is defined to be the set of r.e. or half-solvable languages.
167 Why study closure properties of RE and REC? It tests how well we really understand the concepts we encounter: language classes, REC, solvability, half-solvability. It highlights the concept of subroutines and how we can build on previous algorithms to construct new algorithms; we don’t have to build our algorithms from scratch every time.
168 Example Application * Setting I have two programs which can solve the language recognition problems for L 1 and L 2 I want a program which solves the language recognition problem for L 1 intersect L 2 Question Do I need to develop a new program from scratch or can I use the existing programs to help? Does this depend on which languages L 1 and L 2 I am working with?
169 Closure Properties of REC * We now prove REC is closed under two set operations Set Complement Set Intersection In these proofs, we try to highlight intuition and common sense
170 Set Complement Example * Even: the set of even length strings over {0,1} Complement of Even? Odd: the set of odd length strings over {0,1} Is Odd recursive (solvable)? How is the program P’ that solves Odd related to the program P that solves Even?
171 Set Complement Lemma If L is a solvable language, then L complement is a solvable language Proof Let L be an arbitrary solvable language First line comes from For all L in REC Let P be the C++ program which solves L P exists by definition of REC
172 Set Complement Lemma (proof continued) Modify P to form P’ as follows: P’ is identical to P except at the very end, where it complements the answer (Yes -> No, No -> Yes). Program P’ solves L complement: it halts on all inputs and answers correctly. Thus L complement is solvable, by the definition of solvable.
173 P’ Illustration [Figure: input x feeds P inside P’; a YES from P becomes a No from P’, and a No from P becomes a YES from P’.]
174 Code for P’
bool main(string y)
{
    if (P(y)) return no;
    else return yes;
}
bool P(string y)
/* details deleted; key fact is P is guaranteed to halt on all inputs */
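For a concrete instance, take L = Even from the earlier example. In the sketch below, `P` is a stand-in decider for Even and `P_prime` is a hypothetical name for P’; neither comes from the slides.

```cpp
#include <string>

// Stand-in for P: decides Even, the set of even length strings. Always halts.
bool P(const std::string& y) { return y.size() % 2 == 0; }

// P': identical to P except that the final answer is complemented,
// so it decides Odd, the complement of Even.
bool P_prime(const std::string& y) { return !P(y); }
```

The construction relies only on P halting on every input; it never looks inside P.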
175 Set Intersection Example * Even: the set of even length strings over {0,1} Mod-5: the set of strings of length a multiple of 5 over {0,1} What is Even intersection Mod-5? Mod-10: the set of strings of length a multiple of 10 over {0,1} How is the program P 3 (Mod-10) related to programs P 1 (Even) and P 2 (Mod-5)?
176 Set Intersection Lemma * If L 1 and L 2 are solvable languages, then L 1 intersection L 2 is a solvable language Proof Let L 1 and L 2 be arbitrary solvable languages Let P 1 and P 2 be programs which solve L 1 and L 2, respectively
177 Set Intersection Lemma (proof continued) Construct program P 3 from P 1 and P 2 as follows: P 3 runs both P 1 and P 2 on the input string; if both say yes, P 3 says yes; otherwise, P 3 says no. P 3 solves L 1 intersection L 2: it halts on all inputs (because P 1 and P 2 do) and answers correctly. Thus L 1 intersection L 2 is a solvable language.
178 P 3 Illustration [Figure: the input feeds P 1 and P 2 inside P 3; their Yes/No outputs go through an AND gate, which produces the Yes/No output of P 3.]
179 Code for P 3
bool main(string y)
{
    if (P_1(y) && P_2(y)) return yes;
    else return no;
}
bool P_1(string y)
/* details deleted; key fact is P_1 always halts. */
bool P_2(string y)
/* details deleted; key fact is P_2 always halts. */
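Using the Even and Mod-5 example, the construction can be sketched as follows. The function names `P1`, `P2`, and `P3` mirror the slide, but their bodies are stand-ins written for this illustration.

```cpp
#include <string>

// Stand-in for P 1: decides Even. Always halts.
bool P1(const std::string& y) { return y.size() % 2 == 0; }

// Stand-in for P 2: decides Mod-5. Always halts.
bool P2(const std::string& y) { return y.size() % 5 == 0; }

// P 3: runs both deciders and ANDs their answers, so it decides
// Even intersection Mod-5, which is Mod-10.
bool P3(const std::string& y) { return P1(y) && P2(y); }
```

Running the two deciders one after the other is safe here only because both are guaranteed to halt.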
180 Other Closure Properties Unary Operations Language Reversal Kleene Star Binary Operations Set Union Set Difference Symmetric Difference Concatenation
181 Closure Properties of RE * We now try to prove RE is closed under the same two set operations Set Intersection Set Complement In these proofs We define a more formal proof methodology We gain more intuition about the differences between solvable and half-solvable problems
182 RE Closed Under Set Intersection Expressing this closure property as an infinite set of facts Let L i denote the ith r.e. language L 1 intersect L 1 is in RE L 1 intersect L 2 is in RE... L 2 intersect L 1 is in RE...
183 Generic Element or Template Proofs Since there are an infinite number of facts to prove, we cannot prove them all individually Instead, we create a single proof that proves each fact simultaneously I like to call these proofs generic element or template proofs
184 Basic Proof Ideas Name your generic objects In this case, we use L 1 and L 2 Only use facts which apply to any relevant objects We will only use the fact that there must exist P 1 and P 2 which half-solve L 1 and L 2 Work from both ends of the proof The first and last lines are usually obvious, and we can often work our way in
185 Set Intersection Example * Let L 1 and L 2 be arbitrary r.e. languages. There exist P 1 and P 2 s.t. Y(P 1 )=L 1 and Y(P 2 )=L 2, by definition of half-solvable languages. Construct program P 3 from P 1 and P 2 (note, we can assume very little about P 1 and P 2). Prove program P 3 half-solves L 1 intersection L 2. Thus there exists a program P which half-solves L 1 intersection L 2. Therefore L 1 intersection L 2 is an r.e. language.
186 Constructing P 3 * Run P 1 and P 2 in parallel: one instruction of P 1, then one instruction of P 2, and so on. If both halt and say yes, halt and say yes. If both halt but at least one does not say yes, halt and say no. If either one never halts, P 3 never halts.
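187 P 3 Illustration [Figure: the input feeds P 1 and P 2 inside P 3; each produces Yes, No, or no output (-); an AND gate combines the two and produces the Yes/No/- output of P 3.]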
188 Code for P 3
bool main(string y)
{
    parallel-execute(P_1(y), P_2(y)) until both return;
    if (P_1(y) && P_2(y)) return yes;
    else return no;
}
bool P_1(string y)
/* key fact is P_1 only guaranteed to halt on yes input instances */
bool P_2(string y)
/* key fact is P_2 only guaranteed to halt on yes input instances */
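The parallel-execute step can be approximated in ordinary code by giving both programs a growing step budget. Everything in this sketch is an assumption made for illustration: a program is modeled as a step-bounded simulator (`Program`), `contains` builds a toy half-solver, and `max_budget` exists only so the sketch can be tested. The real P 3 has no cap and simply runs forever where this version returns std::nullopt. The code needs C++17 for std::optional.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Given an input and a step budget, a Program reports its answer if it halts
// within the budget, and std::nullopt if it is still running.
using Program = std::function<std::optional<bool>(const std::string&, unsigned long)>;

// Toy half-solver for "strings containing c": accepts once it has scanned up
// to the first c, and never halts when c is absent.
Program contains(char c) {
    return [c](const std::string& x, unsigned long steps) -> std::optional<bool> {
        std::size_t pos = x.find(c);
        if (pos != std::string::npos && steps > pos) return true;
        return std::nullopt;
    };
}

// Interleaves p1 and p2 by raising a shared step budget one step at a time.
// Returns the AND of their answers once both have halted.
std::optional<bool> intersect_re(const Program& p1, const Program& p2,
                                 const std::string& x, unsigned long max_budget) {
    for (unsigned long budget = 1; budget <= max_budget; ++budget) {
        std::optional<bool> a = p1(x, budget);
        std::optional<bool> b = p2(x, budget);
        if (a.has_value() && b.has_value()) {
            return *a && *b;
        }
    }
    return std::nullopt;  // stands in for "P 3 is still looping"
}
```

With `contains('a')` and `contains('b')`, the input "xxab" is accepted, while on "aaa" the second program never halts and so the combined program never answers.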
189 Proving P 3 Is Correct * 2 steps to showing P 3 half-solves L 1 intersection L 2 For all x in L 1 intersection L 2, must show P 3 accepts x –halts and says yes For all x not in L 1 intersection L 2, must show P 3 does what?
190 Part 1 of Correctness Proof P 3 accepts x in L 1 intersection L 2. Let x be an arbitrary string in L 1 intersection L 2 (note, this subproof is a generic element proof). P 1 accepts x: L 1 intersection L 2 is a subset of L 1, and P 1 accepts all strings in L 1. P 2 accepts x, by the same argument for L 2. P 3 accepts x: we reach the AND gate because of the 2 previous facts, and since both P 1 and P 2 accept, AND evaluates to YES.
191 Part 2 of Correctness Proof P 3 does not accept x not in L 1 intersection L 2. Let x be an arbitrary string not in L 1 intersection L 2. By definition of intersection, this means x is not in L 1 or x is not in L 2. Case 1: x is not in L 1. There are 2 possibilities. If P 1 rejects (or crashes on) x, one input to the AND gate is No, so the output cannot be yes and P 3 does not accept x. If P 1 loops on x, one input never reaches the AND gate, so there is no output and P 3 loops on x. Either way, P 3 does not accept x when x is not in L 1. Case 2: x is not in L 2. Essentially identical analysis. Therefore P 3 does not accept x not in L 1 intersection L 2.
192 RE closed under set complement? * First-order logic formulation? What this really means Let L i denote the ith r.e. language L 1 complement is in RE L 2 complement is in RE...
193 Set complement proof overview Let L be an arbitrary r.e. language. There exists P s.t. Y(P)=L, by definition of r.e. languages. Construct program P’ from P (note, we can assume very little about P). Prove program P’ half-solves L complement. Thus there exists a program P’ which half-solves L complement. Therefore L complement is an r.e. language.
194 Constructing P’ * What did we do in recursive case? Run P and then just complement answer at end Accept -> Reject Reject -> Accept Does this work in this case? No. Why not?
195 Other closure properties Unary Operations Language reversal Kleene Closure Binary operations union (on practice hw) concatenation Not closed Set difference (on practice hw)
196 Closure Property Applications How can we use closure properties to prove a language L T is r.e. or recursive? Unary operator op (e.g. complement) 1) Find a known r.e. or recursive language L 2) Show L T = L op Binary operator op (e.g. intersection) 1) Find 2 known r.e or recursive languages L 1 and L 2 2) Show L T = L 1 op L 2
197 Closure Property Applications How can we use closure properties to prove a language L T is not r.e. or recursive? Unary operator op (e.g. complement) 1) Find a known not r.e. or non-recursive language L 2) Show L T op = L Binary operator op (e.g. intersection) 1) Find a known r.e. or recursive language L 1 2) Find a known not r.e. or non-recursive language L 2 3) Show L 2 = L 1 op L T
198 Example * Looping Problem Input Program P Input x for program P Yes/No Question Does P loop on x? Looping Problem is unsolvable Looping Problem complement = H
199 Closure Property Applications Proving a new closure property Theorem: Unsolvable languages are closed under set complement Let L be an arbitrary unsolvable language If L c is solvable, then L is solvable (L c ) c = L Solvable languages closed under complement However, we are assuming that L is unsolvable Therefore, we can conclude that L c is unsolvable Thus, unsolvable languages are closed under complement
200 Pseudo Closure Property * Lemma: If L and L c are half-solvable, then L is solvable. Question: What about L c ?
201 High Level Proof Let L be an arbitrary language where L and L c are both half-solvable Let P 1 and P 2 be the programs which half-solve L and L c, respectively Construct program P 3 from P 1 and P 2 Argue P 3 solves L L is solvable
202 Constructing P 3 Problem Both P 1 and P 2 may loop on some input strings, and we need P 3 to halt on all input strings Key Observation On all input strings, one of P 1 and P 2 is guaranteed to halt. Why?
203 Illustration ** [Figure: the set of all strings is split into L, where P 1 halts, and L c, where P 2 halts.]
204 Construction and Proof P 3 ’s Operation Run P 1 and P 2 in parallel on the input string x until one accepts x Guaranteed to occur given previous argument Also, only one program will accept any string x IF P 1 is the accepting machine THEN yes ELSE no
205 P 3 Illustration [Figure: the input feeds P 1 and P 2 inside P 3; a Yes from P 1 becomes a Yes from P 3, and a Yes from P 2 becomes a No from P 3.]
206 Code for P 3 *
bool main(string y)
{
    parallel-execute(P_1(y), P_2(y)) until one returns yes;
    if (P_1 returned yes) return yes;
    else return no; /* P_2 returned yes */
}
bool P_1(string y)
/* guaranteed to halt on strings in L */
bool P_2(string y)
/* guaranteed to halt on strings in L c */
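A runnable sketch of this construction follows. It makes several assumptions for illustration: a program is modeled as a step-bounded simulator (`Program`), `contains_a` and `lacks_a` are toy half-solvers for a language L and its complement, and `decide` is a hypothetical name for P 3. The code needs C++17 for std::optional.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Given an input and a step budget, a Program reports its answer if it halts
// within the budget, and std::nullopt if it is still running.
using Program = std::function<std::optional<bool>(const std::string&, unsigned long)>;

// Toy half-solver for L = strings containing 'a'; never halts outside L.
Program contains_a() {
    return [](const std::string& x, unsigned long steps) -> std::optional<bool> {
        std::size_t pos = x.find('a');
        if (pos != std::string::npos && steps > pos) return true;
        return std::nullopt;
    };
}

// Toy half-solver for L complement; never halts on strings in L.
Program lacks_a() {
    return [](const std::string& x, unsigned long steps) -> std::optional<bool> {
        if (x.find('a') == std::string::npos && steps > x.size()) return true;
        return std::nullopt;
    };
}

// P 3: raises the shared step budget until one program accepts. The loop ends
// on every input because each string is in exactly one of L and L complement.
bool decide(const Program& p1, const Program& p2, const std::string& x) {
    for (unsigned long budget = 1;; ++budget) {
        std::optional<bool> a = p1(x, budget);
        if (a.has_value() && *a) return true;   // P 1 accepted: x is in L
        std::optional<bool> b = p2(x, budget);
        if (b.has_value() && *b) return false;  // P 2 accepted: x is in L complement
    }
}
```

Unlike the intersection construction for RE, this loop needs no artificial cap: the key observation guarantees that one of the two programs accepts after finitely many steps.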
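207 RE and REC [Figure: nested regions REC inside RE inside All Languages, with a language L and its complement L c marked.]
208 RE and REC [Figure: nested regions REC inside RE inside All Languages, with a language L and several candidate positions for L c marked.] Are there any languages L in RE - REC?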
209 Module 10 Universal Algorithms moving beyond one problem at a time operating system/general purpose computer
210 Observation So far, each program solves one specific problem Divisor Sorting Multiplication Language L
211 Universal Problem/Program Universal Problem (nonstandard term) Input: Program P, Input x to program P. Task: Compute P(x). Universal Program: Program which solves the universal problem. Universal Turing machine
212 Example Input *
Program P:
int main(A[6]) {
    int i,temp;
    for (i=1;i<=3;i++)
        if (A[i] > A[i+3]) {
            temp = A[i+3];
            A[i+3] = A[i];
            A[i] = temp;
        }
    for (i=1; i<=5; i++)
        for (j=i+1;j<=6;j++)
            if (A[j-1] > A[j]) {
                temp = A[j];
                A[j] = A[j-1];
                A[j-1] = temp;
            }
}
Input: A[1] = 6, A[2] = 4, A[3] = 2, A[4] = 3, A[5] = 5, A[6] = 1
213 Organization
Universal Program’s Memory holds three things: Program P, Program P’s Memory, and the Program Counter.
Program P:
int main(A[6]){
    int i,temp;
    for (i=1;i<=3;i++)
        if (A[i] > A[i+3]) {
            temp = A[i+3];
            A[i+3] = A[i];
            A[i] = temp;
        }
    for (i=1; i<=5; i++)
        for (j=i+1;j<=6;j++)
            if (A[j-1] > A[j]) {
                temp = A[j];
                A[j] = A[j-1];
                A[j-1] = temp;
            }
}
Program P’s Memory: int A[6],i,temp;
Program Counter: Line 1
214 Description of Universal Program Basic Loop Find current line of program P Execute current line of program P Update program P’s memory Update program counter Return to Top of Loop
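The basic loop can be sketched for a toy language. The instruction set below (Inc, Dec, Jnz, Halt), the `Instr` layout, and the function `run` are all assumptions invented for this illustration; a real universal program would interpret a full programming language in the same way.

```cpp
#include <cstddef>
#include <vector>

enum class Op { Inc, Dec, Jnz, Halt };

struct Instr {
    Op op;
    std::size_t cell;    // memory cell the instruction reads or writes
    std::size_t target;  // jump destination, used only by Jnz
};

// Runs program p on the given memory and returns the final memory.
// Stops at Halt or when the program counter runs past the last line.
std::vector<int> run(const std::vector<Instr>& p, std::vector<int> memory) {
    std::size_t pc = 0;                        // program counter
    while (pc < p.size()) {
        const Instr& cur = p[pc];              // find current line of P
        if (cur.op == Op::Halt) break;
        switch (cur.op) {                      // execute it, update memory and pc
            case Op::Inc: memory[cur.cell] += 1; pc += 1; break;
            case Op::Dec: memory[cur.cell] -= 1; pc += 1; break;
            case Op::Jnz: pc = (memory[cur.cell] != 0) ? cur.target : pc + 1; break;
            case Op::Halt: break;              // handled before the switch
        }
    }
    return memory;
}
```

As a usage example, the six-line program {Jnz 0 2, Halt, Dec 0, Inc 1, Jnz 0 2, Halt} adds cell 0 into cell 1. Note that `run` does not return when the interpreted program loops forever, for example a Jnz that jumps to itself on a nonzero cell; the universal program inherits the halting behavior of P on x.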
215 Past, Present, Future Turing came up with the concept of a universal program (Universal Turing machine) in the 1930’s This is well before the invention of the general purpose computer People were still thinking of computing devices as special-purpose devices (calculators, etc.) Turing helped move people beyond this narrow perspective Turing/Von Neumann perspective Computers are general purpose/universal algorithms Focused on computation Stand-alone Today, we are moving beyond this view Computation, communication, cyberspace However, results in Turing perspective still relevant
216 Halting Problem Revisited * Halting Problem is half-solvable. Modified Universal Program (MUP) half-solves H: (1) Run P on x. (2) Output yes; this step is only executed if the first step halts. Behavior: What does MUP do on all yes instances of H? What does MUP do on all no inputs of H?
217 Debuggers How are debuggers like gdb or ddd related to universal programs? How do debuggers simplify the debugging process?
218 RE and REC We now have a problem that is half-solvable but not solvable. What do we now know about the complement of the Halting Problem? What additional fact about RE and set complement does this prove?
219 RE and REC [Figure: nested regions REC inside RE inside All Languages; H lies in RE but outside REC, and H c lies outside RE.]
220 Summary Universal Programs 1930’s, Turing Introduces general purpose computing concept Not a super intelligent program, merely a precise follower of instructions Halting Problem half-solvable but not solvable RE not closed under set complement