Numerical Representation of Strings So you see that we can easily replace our usual digit range from 0 to n-1 with the range 1 to n. In our representation of strings on the alphabet A, we use the range from 1 to n to have a bijective mapping between strings and numbers. When using a range from 0 to n-1, there is no such mapping, because different strings correspond to the same number, for example: 87 = 087 = 0087 = 00087 = … November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings Now let us return to our general case of associating numbers with strings. For an alphabet A = {s1, …, sn}, the string w = sik sik-1 … si1 , si0 is called the base n notation for the number x as defined by x = iknk + ik-1nk-1 + … + i1n1 + i0 . When n is fixed, we can consider a function of one or more variables on A* as a function of the corresponding numbers. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings So we can speak of an m-ary partial function on A* with values in A* as being partially computable, or when it is total, we can speak of it as being computable. Similarly, we can say that an m-ary function on A* is primitive recursive. But what about predicates? Notice that for an alphabet A = {s1, …, sn} the value s1 denotes 1 in base n notation. Thus an m-ary predicate on A* is simply a total m-ary function on A* whose values are either s1 (“true”) or 0 (“false”). Therefore, it makes sense to speak of an m-ary predicate on A* as being computable. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings For a given alphabet A, any subset of A* (any set of strings on A) is called a language on A. By associating numbers with the elements of A*, we can speak of a language on A as being r.e., recursive, or primitive recursive. Notice that our base n notation even works for n = 1, that is, an alphabet containing only one symbol. For example, if A = {s1} then x = 1 is represented by the string w = s1 x = 7 is represented by the string w = s1s1s1s1s1s1s1 : November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings To avoid confusion, we will not use decimal digits as symbols in our alphabets. Accordingly, a string of decimal digits will always be meant to refer to a number. We will now define some useful functions for string operations. The first function will be CONCAT. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings Let A be some fixed alphabet containing n symbols, A = {s1, …, sn}. For each m 1, we define CONCATn(m) as follows: CONCATn(1)(u) = u CONCATn(m+1)(u1, …, um, um+1) = zum+1, where z = CONCATn(m)(u1, …, um). This means that for given strings u1, …, um A*, CONCATn(m)(u1, …, um) is obtained by concatenating the strings u1, …, um. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings We will usually omit the superscript, i.e., the number of arguments. For example: CONCAT3 (s3s1s2, s1s3) = s3s1s2s1s3 Translating this equation from strings to numbers: CONCAT3 (32, 6) = 294 It is also true that CONCAT5 (s3s1s2, s1s3) = s3s1s2s1s3 CONCAT5 (82, 8) = 2058 Obviously, the definition of CONCAT depends on the base n. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings We will now look at several primitive recursive functions on strings. These functions will be useful for our further discussion of calculation on strings. It will be helpful to remember the functions g and h: g(0, n, x) = x g(m + 1, n, x) = Q+(g(m, n, x), n), then g(m, n, x) = um. h(m, n, x) = R+(g(m, n, x), n), then im = h(m, n, x), m = 0, …, k. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 1.) f(u) = |u| This “length” function is defined on A* and yields a natural number. How can we compute f on the number associated with A*? For each t, the number tj=0 nj has the base n representation s1[t+1]. (The expression s[m] stands for the string that consists of m times the symbol s.) November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings For example, given the alphabet A = {s1, s2 ,s3} and t = 2: The string representation of tj=0 3j = s1s1s1. This number is the smallest number whose base 3 representation contains t + 1 symbols. If we subtract 1 from the string s1s1s1, it becomes s3s3. So the length |u| can be defined as follows: |u| = mint≤u (tj=0 nj > u). Obviously, this function is primitive recursive. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 2.) g(u, v) = CONCATn(u, v) This function is primitive recursive because it is defined by the following equation: CONCATn(u, v) = un|v| + v Example: Alphabet A = {1, 2, 3, 4, 5, 6, 7, 8, 9, X}: CONCAT10(13, 478) = 131000 + 478 = 13478 November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 3.) CONCATn(m)(u1, …, um) This function is primitive recursive for each m, n 1. It follows at once from the previous example using composition. Example: Alphabet A = {1, 2, 3, 4, 5, 6, 7, 8, 9, X}, m = 3: CONCATn(u, v, w) = un|v| + |w| + vn|w| + w CONCAT10(13, 478, 9) = 1310000 + 478 10 + 9 = 134789 November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 4.) RTENDn(w) = h(0, n, w) Remember: im = h(m, n, x), m = 0, …, k. RTENDn gives the rightmost symbol of a given word. We know that h is primitive recursive, so RTENDn is also primitive recursive. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 5.) LTENDn(w) = h(|w| - 1, n, w) Corresponding to RTENDn, LTENDn gives the leftmost symbol of a given word. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 6.) RTRUNCn(w) = g(1, n, w) Remember: g(m, n, x) = um u0 = iknk + ik-1nk-1 + … + i1n1 + i0 u1 = iknk-1 + ik-1nk-2 + … + i1 RTRUNCn gives the result of removing the rightmost symbol from a given nonempty string. When we can omit the reference to the base n, we often write w- for RTRUNCn(w). Note that 0- = 0. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings 7.) LTRUNCn(w) = w - (LTENDn(w) n|w| -1) LTRUNCn gives the result of removing the leftmost symbol from a given nonempty string. Example: Alphabet A = {1, 2, 3, 4, 5, 6, 7, 8, 9, X}: LTRUNC10(3478) = 3478 - 31000 = 478. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings We will use these newly introduced primitive recursive functions to prove the computability of a pair of functions that can be used in changing base. Let 1 n < l. Let A Ã, where A is an alphabet of n symbols and à is an alphabet of l symbols. So whenever a string belongs to A*, it also belongs to Ã*. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings For any x N, let w be the word in A* that represents x in base n. Then we write UPCHANGEn,l(x) for the number which w represents in base l. Examples: UPCHANGE2,6(5) = 13 The representation of 5 in base 2 is s2s1. In base 6, s2s1 represents the number 13. UPCHANGE1,5(3) = 31 The representation of 3 in base 1 is s1s1s1. In base 5, s1s1s1 represents the number 31. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III
Numerical Representation of Strings Correspondingly, we define the following: For x N, let w be the string in Ã* which represents x in base l. Let w’ be obtained from w by crossing out all of the symbols that belong to à – A, so w’ A*. We write DOWNCHANGEn,l(x) for the number which w’ represents in base n. Example: DOWNCHANGE2,6(109) = 5 The representation of 109 in base 6 is s2s6s1. We cross out the s6 and get s2s1. In base 2, s2s1 represents the number 5. November 21, 2017 Theory of Computation Lecture 19: Calculations on Strings III