Nonconstructive methods in finite automata Rūsiņš Freivalds (University of Latvia)
To explain the main idea, we consider the following probabilistic automaton with 16 states $q_1, \dots, q_{16}$. The initial probability distribution: 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 0, 0, 0, 0, 0, 0, 0, 0. The first 8 states are accepting, the last 8 states are rejecting. There are $2^8$ possible input letters. Each letter interchanges the states $q_i \leftrightarrow q_{i+8}$ for some set of indices $i \in \{1, \dots, 8\}$. The input word is in the language if all the non-zero probabilities have returned to the first 8 states. It is easy to prove that any deterministic automaton needs $2^8$ states for this language. Unfortunately, the probabilistic automaton has a non-isolated cut-point.
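The behaviour of this automaton can be illustrated with a minimal Python sketch. It assumes that each of the $2^8$ letters is given as a set of indices $i$ (0-based here) whose states $q_i$ and $q_{i+8}$ are interchanged; the function names are illustrative and not from the talk.

```python
from fractions import Fraction

N = 8  # indices 0..7 are accepting states, 8..15 are rejecting states


def initial_distribution():
    """Probability 1/8 on each accepting state, 0 on each rejecting state."""
    return [Fraction(1, N)] * N + [Fraction(0)] * N


def apply_letter(dist, letter):
    """Apply one input letter, modelled (assumption) as a set of indices i in 0..7.

    For every i in the letter, the probabilities of states i and i+8 are swapped.
    """
    new = list(dist)
    for i in letter:
        new[i], new[i + N] = new[i + N], new[i]
    return new


def acceptance_probability(word):
    """Total probability of the accepting states after reading the word."""
    dist = initial_distribution()
    for letter in word:
        dist = apply_letter(dist, letter)
    return sum(dist[:N])


def in_language(word):
    """The word is in the language iff all probability is back in the first 8 states."""
    return acceptance_probability(word) == 1
```

For instance, `in_language([{2, 4, 5}, {3, 4, 5}, {2, 3}])` holds, because every index is swapped an even number of times.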
After a letter that interchanges $q_3, q_5, q_6$ with $q_{11}, q_{13}, q_{14}$, the probability distribution is: 1/8, 1/8, 0, 1/8, 0, 0, 1/8, 1/8, 0, 0, 1/8, 0, 1/8, 1/8, 0, 0. Some non-zero probabilities are now in rejecting states, so the word read so far is not in the language.
Another reachable distribution, in which only $q_3$ and $q_4$ are interchanged with $q_{11}$ and $q_{12}$: 1/8, 1/8, 0, 0, 1/8, 1/8, 1/8, 1/8, 0, 0, 1/8, 1/8, 0, 0, 0, 0. Again the word read so far is not in the language.
Linear codes are the simplest class of codes. The alphabet used is a fixed choice of a finite field $GF(q) = F_q$ with $q$ elements. For most of this paper we consider the special case $GF(2) = F_2$. These codes are binary codes. A generating matrix $G$ for a linear $[n, k]$ code over $F_q$ is a $k$-by-$n$ matrix with entries in the finite field $F_q$ whose rows are linearly independent. The linear code corresponding to the matrix $G$ consists of all the $q^k$ possible linear combinations of rows of $G$. The requirement of linear independence is equivalent to saying that all the $q^k$ linear combinations are distinct.
The linear combinations of the rows of $G$ are called codewords. However, we are interested in something more. We need the codewords to be not merely distinct but also as far as possible from one another in terms of Hamming distance. The Hamming distance between two vectors $v = (v_1, \dots, v_n)$ and $w = (w_1, \dots, w_n)$ is the number of indices $i$ such that $v_i \neq w_i$.
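As a small illustration of these definitions, the following Python sketch enumerates the codewords of a binary linear code and computes its minimum Hamming distance by brute force. The function names and the example matrix are hypothetical, chosen only for illustration.

```python
from itertools import product


def hamming_distance(v, w):
    """Number of indices i with v[i] != w[i]."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(v, w))


def codewords(G):
    """All 2^k linear combinations (over GF(2)) of the k rows of G."""
    k, n = len(G), len(G[0])
    words = []
    for coeffs in product((0, 1), repeat=k):
        word = tuple(
            sum(c * row[j] for c, row in zip(coeffs, G)) % 2 for j in range(n)
        )
        words.append(word)
    return words


def minimum_distance(G):
    """Smallest Hamming distance between two distinct codewords (brute force)."""
    words = codewords(G)
    return min(
        hamming_distance(v, w)
        for i, v in enumerate(words)
        for w in words[i + 1:]
    )


# Hypothetical [4, 2] binary code used only as an example.
EXAMPLE_G = [[1, 0, 1, 1],
             [0, 1, 0, 1]]
```

The brute-force search is exponential in $k$, so it is only meant for very small examples.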
The textbook by P. Garrett, "The Mathematics of Coding Theory" (2004), contains Theorem A. For any integer $n \geq 4$ there is a $[2n, n]$ binary code with a minimum distance between the codewords of at least $n/10$. However, the proof of this theorem has a serious defect. It is non-constructive. It means that we cannot find these codes or describe them in a useful manner. This is why P. Garrett calls them mirage codes.
Definition. A generating matrix $G$ of a linear code is called cyclic if, along with an arbitrary row $(v_1, v_2, v_3, \dots, v_n)$, the matrix $G$ also contains the row $(v_2, v_3, \dots, v_n, v_1)$. We would wish to prove a reasonable counterpart of Theorem A for cyclic mirage codes, but this attempt fails. Instead we construct a slightly more complicated structure of mirage codes for which a counterpart of Theorem A can be proved.
We consider binary generating matrices. Let $p$ be an odd prime number, and $x$ be a binary word of length $p$. The generating matrix $G(p, x)$ has $p$ rows and $2p$ columns. Let $x = x_1 x_2 x_3 \dots x_p$. The first $p$ columns (and all $p$ rows) make a unit matrix with elements 1 on the main diagonal and 0 in all the other positions. The last $p$ columns (and all $p$ rows) make a cyclic matrix with $x = x_1 x_2 x_3 \dots x_p$ as the first row, $x_p x_1 x_2 x_3 \dots x_{p-1}$ as the second row, and so on:
$$\begin{pmatrix} x_1 & x_2 & x_3 & x_4 & \dots & x_p \\ x_p & x_1 & x_2 & x_3 & \dots & x_{p-1} \\ x_{p-1} & x_p & x_1 & x_2 & \dots & x_{p-2} \\ \vdots & & & & & \vdots \\ x_2 & x_3 & x_4 & x_5 & \dots & x_1 \end{pmatrix}$$
Lemma. For arbitrary $x$, if $h_1 h_2 h_3 \dots h_p \, h_{p+1} h_{p+2} h_{p+3} \dots h_{2p}$ is a codeword in the linear code corresponding to $G(p, x)$, then $h_p h_1 h_2 \dots h_{p-1} \, h_{2p} h_{p+1} h_{p+2} \dots h_{2p-1}$ is also a codeword.
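The construction of $G(p, x)$ and the shift lemma can be checked by brute force for small $p$. The sketch below is an illustration under the reading given above (identity block followed by a cyclic block whose $i$-th row is $x$ shifted right by $i-1$ positions); all function names are hypothetical.

```python
from itertools import product


def generating_matrix(x):
    """G(p, x): p-by-2p matrix, identity block followed by the cyclic block of x."""
    p = len(x)
    return [
        [int(i == j) for j in range(p)] + [x[(j - i) % p] for j in range(p)]
        for i in range(p)
    ]


def codewords(G):
    """Set of all GF(2) linear combinations of the rows of G."""
    k, n = len(G), len(G[0])
    return {
        tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % 2 for j in range(n))
        for coeffs in product((0, 1), repeat=k)
    }


def shift_halves(h):
    """Cyclically shift each half of the word h one position to the right."""
    p = len(h) // 2
    left, right = h[:p], h[p:]
    return left[-1:] + left[:-1] + right[-1:] + right[:-1]


def closed_under_shift(x):
    """Check the lemma for one x: shifting both halves of a codeword gives a codeword."""
    words = codewords(generating_matrix(x))
    return all(shift_halves(h) in words for h in words)
```

Running `closed_under_shift` over all binary words of length 5 is a quick sanity check; it does not replace the proof.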
There are $2^p$ codewords of length $2p$. If the codeword is obtained as a linear combination with the coefficients $c_1, c_2, \dots, c_p$, then the first $p$ components of the codeword equal $c_1 c_2 \dots c_p$. We denote by $R(x, c_1 c_2 \dots c_p)$ the subword containing the last $p$ components of this codeword. Lemma. If $c_1 c_2 \dots c_p = 00 \dots 0$, then $R(x, c_1 c_2 \dots c_p) = 00 \dots 0$ for arbitrary $x$.
Definition. We call a word trivial if all its symbols are equal. Otherwise we call the word nontrivial. Lemma. If $c_1 c_2 \dots c_p$ is trivial, then $R(x, c_1 c_2 \dots c_p)$ is trivial for arbitrary $x$. Proof. For $c_1 c_2 \dots c_p = 11 \dots 1$, every symbol of $R(x, c_1 c_2 \dots c_p)$ equals $x_1 + x_2 + \dots + x_p \bmod 2$; for $c_1 c_2 \dots c_p = 00 \dots 0$, every symbol is 0. Lemma. If $x$ is trivial, then $R(x, c_1 c_2 \dots c_p)$ is trivial for arbitrary $c_1 c_2 \dots c_p$.
Definition. The word $x = x_1 x_2 x_3 x_4 \dots x_p$ is called a cyclic shift of the word $y = y_1 y_2 \dots y_p$ if there exists $i$ such that $x_1 = y_{1+i}, x_2 = y_{2+i}, \dots, x_p = y_{p+i}$, where the indices are taken modulo $p$. If $\gcd(i, p) = 1$, then we say that this cyclic shift is nontrivial. Lemma. If $x$ is a cyclic shift of $y$, then $R(x, c_1 c_2 \dots c_p)$ is a cyclic shift of $R(y, c_1 c_2 \dots c_p)$. Lemma. If $p$ is an odd prime, $x$ is a nontrivial word and $y$ is a nontrivial cyclic shift of $x$, then $x \neq y$. Lemma. If $p$ is an odd prime and $c_1 c_2 \dots c_p$ is nontrivial, then the set $T_{c_1 c_2 \dots c_p} = \{ R(x, c_1 c_2 \dots c_p) \mid x \in \{0, 1\}^p \text{ and } R(x, c_1 c_2 \dots c_p) \text{ is nontrivial} \}$ has a cardinality which is a multiple of $p$.
If q is a prime number, the set of the codewords with the operation "component-wise addition" is a group. Finite groups have useful properties. We single out Lagrange's Theorem. The order of a finite group is the number of elements in it. Lagrange's Theorem. Let GR be a finite group. Let H be a subgroup of GR. Then the order of H divides the order of GR.
For arbitrary fixed $c_1 c_2 \dots c_p$, the set $\{ R(x, c_1 c_2 \dots c_p) \mid x \in \{0, 1\}^p \}$ with the algebraic operation "component-wise addition modulo 2" is a group. We denote this group by $B$. By $D$ we denote the group of all $2^p$ binary words of length $p$ with the same operation. Lemma. For arbitrary $c_1 c_2 \dots c_p$, $x$ and $y$, $R(x, c_1 c_2 \dots c_p) + R(y, c_1 c_2 \dots c_p) = R(x + y, c_1 c_2 \dots c_p)$. In other words, the map $D \to B$ defined by $x \mapsto R(x, c_1 c_2 \dots c_p)$ is a group homomorphism. The kernel of the group homomorphism is the set $\ker_0 = \{ x \mid R(x, c_1 c_2 \dots c_p) = 00 \dots 0 \}$. The image of the group homomorphism is the set $B$. For arbitrary $z$ in $B$, by $\ker_z$ we denote the set $\ker_z = \{ x \mid R(x, c_1 c_2 \dots c_p) = z \}$.
Lemma. For arbitrary $z$ in $B$, $\mathrm{card}(\ker_z) = \mathrm{card}(\ker_0)$. Lemma. For arbitrary $z$ in $B$, $\mathrm{card}(\ker_z) = \mathrm{card}(D)/\mathrm{card}(B)$. Lemma. If $x$ contains $p - 1$ zeroes and a single one, and $c_1 c_2 \dots c_p$ is nontrivial, then $R(x, c_1 c_2 \dots c_p)$ is nontrivial. Proof. For such an $x$, the number of ones in $R(x, c_1 c_2 \dots c_p)$ is the same as the number of ones in $c_1 c_2 \dots c_p$.
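The homomorphism property and the equal size of the sets $\ker_z$ can be observed on a small example. The sketch below assumes that $R(x, c)$ is the sum modulo 2 of the rows of the cyclic block selected by $c$, with row $i$ equal to $x$ shifted right by $i - 1$ positions; the helper names are illustrative.

```python
from collections import Counter
from itertools import product


def R(x, c):
    """Last p components of the codeword of G(p, x) with coefficients c (over GF(2))."""
    p = len(x)
    return tuple(
        sum(c[i] * x[(j - i) % p] for i in range(p)) % 2 for j in range(p)
    )


def add(u, v):
    """Component-wise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


def is_homomorphism(c):
    """Check R(x, c) + R(y, c) == R(x + y, c) for all binary words x, y."""
    words = list(product((0, 1), repeat=len(c)))
    return all(
        add(R(x, c), R(y, c)) == R(add(x, y), c) for x in words for y in words
    )


def fiber_sizes(c):
    """Map each z in the image B to card(ker_z), the number of x with R(x, c) = z."""
    return Counter(R(x, c) for x in product((0, 1), repeat=len(c)))
```

For $c = 11000$ and $p = 5$ the image has 16 elements and every $\ker_z$ has 2 elements, in line with $\mathrm{card}(\ker_z) = \mathrm{card}(D)/\mathrm{card}(B)$.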
Consider the sequence $2^0, 2^1, 2^2, \dots, 2^{p-2}, 2^{p-1}, 2^p, \dots$ and the corresponding sequence of the remainders of these numbers modulo $p$: $r_0, r_1, r_2, \dots, r_{p-2}, r_{p-1}, r_p, \dots$ Fermat's little theorem asserts that if $p$ is a prime exceeding 2, then $r_{p-1} = 1$. If $p - 1$ is the smallest positive index $k$ such that $r_k = 1$, then we say that 2 is a primitive root modulo $p$.
In 1927 Emil Artin made a famous conjecture, the validity of which is still an open problem: if $r$ is neither $-1$ nor a square, then $r$ is a primitive root for infinitely many primes.
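The definition can be tested directly by computing the remainders $r_k$ until the first return to 1. The following short Python sketch does this; the function names are illustrative only.

```python
from math import gcd


def multiplicative_order(a, p):
    """Smallest k >= 1 with a^k congruent to 1 modulo p (requires gcd(a, p) == 1)."""
    if gcd(a, p) != 1:
        raise ValueError("a and p must be coprime")
    k, r = 1, a % p
    while r != 1:
        r = (r * a) % p
        k += 1
    return k


def two_is_primitive_root(p):
    """True if p - 1 is the first positive index k with 2^k = 1 modulo p (p an odd prime)."""
    return multiplicative_order(2, p) == p - 1
```

Among the odd primes below 30, the test succeeds for 3, 5, 11, 13, 19 and 29, and fails for 7, 17 and 23.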
Lemma. If $p$ is an odd prime such that 2 is a primitive root modulo $p$ and $c_1 c_2 \dots c_p$ is nontrivial, then the set $B = \{ R(x, c_1 c_2 \dots c_p) \mid x \in \{0, 1\}^p \}$ is either of cardinality $2^p$ or of cardinality $2^{p-1}$; equivalently, every $z$ in $B$ is the image of either 1 or 2 words $x$. Proof. By Lagrange's Theorem the order of the group $B$ divides the order $2^p$ of the group $D$. Hence the order of $B$ is $2^b$ for some integer $b$. The neutral element of these groups is the word $00 \dots 0$. It belongs to every subgroup. There are two possible cases: 1) $11 \dots 1$ is in $B$, 2) $11 \dots 1$ is not in $B$.
Case 1) $11 \dots 1$ is in $B$. Let $T_{c_1 c_2 \dots c_p} = \{ R(x, c_1 c_2 \dots c_p) \mid x \in \{0, 1\}^p \text{ and } R(x, c_1 c_2 \dots c_p) \text{ is nontrivial} \}$. Then $\mathrm{card}(T) = \mathrm{card}(B) - 2$, and $\mathrm{card}(T)$ is a multiple of $p$. Hence $2^b = \mathrm{card}(B) \equiv 2 \pmod p$ and $2^{b-1} \equiv 1 \pmod p$. Since 2 is a primitive root modulo $p$, either $2^{b-1} = 2^{p-1}$ or $2^{b-1} = 2^0$. If $2^{b-1} = 2^{p-1}$, then $2^b = 2^p$ and for this fixed $c_1 c_2 \dots c_p$ the map $x \mapsto R(x, c_1 c_2 \dots c_p)$ takes distinct words $x$ into distinct words $R(x, c_1 c_2 \dots c_p)$. If $2^{b-1} = 2^0$, then $2^b = 2$ and $B = \{ 00 \dots 0, 11 \dots 1 \}$, but this is impossible, because $B$ contains a nontrivial word (take $x$ with exactly one symbol 1).
Case 2) $11 \dots 1$ is not in $B$. Let $T_{c_1 c_2 \dots c_p} = \{ R(x, c_1 c_2 \dots c_p) \mid x \in \{0, 1\}^p \text{ and } R(x, c_1 c_2 \dots c_p) \text{ is nontrivial} \}$. Then $\mathrm{card}(T) = \mathrm{card}(B) - 1$, and $\mathrm{card}(T)$ is a multiple of $p$. Hence $2^b \equiv 1 \pmod p$. Since 2 is a primitive root modulo $p$, either $2^b = 2^{p-1}$ or $2^b = 2^0$. If $2^b = 2^{p-1}$, then $\mathrm{card}(B) = 2^{p-1}$. Hence for arbitrary $z$ in $T_{c_1 c_2 \dots c_p}$, $\mathrm{card}(\ker_z) = 2$. If $2^b = 2^0$, then $B = \{ 00 \dots 0 \}$, but this is impossible for the same reason.
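The conclusion of this case analysis, that the image has $2^p$ or $2^{p-1}$ elements, can be checked exhaustively for small primes. The sketch below is only a sanity check under the same assumed reading of $R(x, c)$ as a cyclic convolution over GF(2); it also shows that the statement can fail when 2 is not a primitive root (here $p = 7$). Function names are hypothetical.

```python
from itertools import product


def R(x, c):
    """Last p components of the codeword of G(p, x) with coefficients c (over GF(2))."""
    p = len(x)
    return tuple(
        sum(c[i] * x[(j - i) % p] for i in range(p)) % 2 for j in range(p)
    )


def image_size(c):
    """Cardinality of B = { R(x, c) : x in {0,1}^p } for a fixed c."""
    return len({R(x, c) for x in product((0, 1), repeat=len(c))})


def image_sizes(p):
    """Set of cardinalities of B over all nontrivial coefficient words c of length p."""
    return {
        image_size(c)
        for c in product((0, 1), repeat=p)
        if len(set(c)) > 1  # nontrivial: not all symbols equal
    }
```

For $p = 3$ and $p = 5$ only the sizes $2^p$ and $2^{p-1}$ occur. For $p = 7$ and $c = 1101000$ the image has only 16 elements, so the primitive-root hypothesis cannot simply be dropped.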
Definition. We say that the numbering $\psi = \{ \psi_0(x), \psi_1(x), \psi_2(x), \dots \}$ of 1-argument partial recursive functions is computable if the 2-argument function $U(n, x) = \psi_n(x)$ is partial recursive.
Definition. We say that a numbering $\psi$ is reducible to the numbering $\xi$ if there exists a total recursive function $f(n)$ such that, for all $n$ and $x$, $\psi_n(x) = \xi_{f(n)}(x)$.
Definition. We say that a computable numbering $\varphi$ of all 1-argument partial recursive functions is a Gödel numbering if every computable numbering (of any class of 1-argument partial recursive functions) is reducible to $\varphi$.
Definition. We say that a Gödel numbering $\kappa$ is a Kolmogorov numbering if for arbitrary computable numbering $\psi$ (of any class of 1-argument partial recursive functions) there exist constants $c > 0$, $d > 0$, and a total recursive function $f(n)$ such that:
- for all $n$ and $x$, $\psi_n(x) = \kappa_{f(n)}(x)$,
- for all $n$, $f(n) \leq cn + d$.
There exist many distinct Kolmogorov numberings. We now fix one of them and denote it by $\kappa$. Since Kolmogorov numberings give indices for all partial recursive functions, for arbitrary $x$ and $p$ there is an $i$ such that $\kappa_i(p) = x$. Let $i(x, p)$ be the minimal $i$ such that $\kappa_i(p) = x$. It is easy to see that if $x_1 \neq x_2$, then $i(x_1, p) \neq i(x_2, p)$. We consider all binary words $x$ of length $p$ and denote by $x(p)$ the word $x$ such that $i(x, p)$ exceeds $i(y, p)$ for all binary words $y$ of length $p$ different from $x$. Since the $2^p$ indices $i(x, p)$ are distinct, it is obvious that $i(x(p), p) \geq 2^p - 1$.
We introduce a partial recursive function $\mu(z, \varepsilon, p)$ defined as follows. Above, when defining $G(p, x)$, we considered the auxiliary function $R(x, c_1 c_2 \dots c_p)$. To define $\mu(z, \varepsilon, p)$ we consider all $2^p$ binary words $x$ of length $p$.
- If $z$ is not a binary word of length $2p$, then $\mu(z, \varepsilon, p)$ is not defined.
- If $\varepsilon$ is not in $\{0, 1\}$, then $\mu(z, \varepsilon, p)$ is not defined.
- If $z$ is a binary word of length $2p$ and $\varepsilon$ is in $\{0, 1\}$, then we consider all $x$ in $\{0, 1\}^p$ such that $R(x, d(z)) = e(z)$, where, as the lengths suggest, $d(z)$ denotes the first $p$ symbols of $z$ and $e(z)$ the last $p$ symbols.
- If there are no such $x$, then $\mu(z, \varepsilon, p)$ is not defined.
- If there is only one such $x$, then $\mu(z, \varepsilon, p) = x$.
- If there are two such $x$, then $\mu(z, \varepsilon, p)$ is the first such $x$ in the lexicographical order for $\varepsilon = 1$, and the second such $x$ in the lexicographical order for $\varepsilon = 0$.
- If there are more than two such $x$, then $\mu(z, \varepsilon, p)$ is not defined.
Now we introduce a computable numbering of some partial recursive functions. This numbering is independent of $p$. For each $p$ (independently of other values of $p$) we order the set of all the $2^{2p}$ binary words $z$ of length $2p$: $z_0, z_1, z_2, \dots, z_{2^{2p}-1}$. We define $z_0$ as the word $00 \dots 0$. The words $z_1, z_2, \dots, z_{2p}$ are the words with exactly one symbol 1. We strictly follow the rule "if the word $z_i$ contains fewer symbols 1 than the word $z_j$, then $i < j$". Words with an equal number of symbols 1 are ordered lexicographically. Hence $z_{2^{2p}-1} = 11 \dots 1$.
For each $p$, we define
$\Psi_0(p) = \mu(z_0, 0, p)$, $\Psi_1(p) = \mu(z_0, 1, p)$,
$\Psi_2(p) = \mu(z_1, 0, p)$, $\Psi_3(p) = \mu(z_1, 1, p)$,
$\Psi_4(p) = \mu(z_2, 0, p)$, $\Psi_5(p) = \mu(z_2, 1, p)$,
...
$\Psi_{2^{2p+1}-2}(p) = \mu(z_{2^{2p}-1}, 0, p)$, $\Psi_{2^{2p+1}-1}(p) = \mu(z_{2^{2p}-1}, 1, p)$.
For $j \geq 2^{2p+1}$, $\Psi_j(p)$ is undefined.
We have fixed a Kolmogorov numbering $\kappa$ and we have just constructed a computable numbering $\Psi$ of some partial recursive functions. Lemma. There exist constants $c > 0$ and $d > 0$ (independent of $p$) such that for arbitrary $i$ there is a $j$ such that $\Psi_i(t) = \kappa_j(t)$ for all $t$, and $j \leq ci + d$. We consider generating matrices $G(p, x(p))$ for linear codes where $p$ is an odd prime such that 2 is a primitive root modulo $p$ and, as defined above, $x(p)$ is an arbitrary binary word of length $p$ such that $\kappa_i(p) = x(p)$ implies $i \geq 2^p - 1$. We denote the corresponding linear code by $LC_2(p)$. We prove several lemmas showing that, if $p$ is sufficiently large, then the Hamming distance between two arbitrary distinct codewords is no less than $4p/19$.
A Hermitian matrix (or self-adjoint matrix) is a square matrix with complex entries which is equal to its own conjugate transpose; that is, the element in the $i$th row and $j$th column is equal to the complex conjugate of the element in the $j$th row and $i$th column, for all indices $i$ and $j$: $a_{ij} = \overline{a_{ji}}$, or written with the conjugate transpose: $A = A^{*}$. If the eigenvalues of a Hermitian matrix are all positive, then the matrix is positive definite; if they are all non-negative, then the matrix is positive semidefinite.
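The defining condition is easy to test entry by entry. The following minimal Python sketch uses plain nested lists and built-in complex numbers; the function names and the example matrices are hypothetical and are not the example from the original slide.

```python
def conjugate_transpose(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[A[j][i].conjugate() for j in range(rows)] for i in range(cols)]


def is_hermitian(A):
    """True if A is square and equals its own conjugate transpose."""
    is_square = all(len(row) == len(A) for row in A)
    return is_square and A == conjugate_transpose(A)


# Hypothetical examples: the first matrix is Hermitian, the second is not,
# because its off-diagonal entries are equal instead of complex conjugates.
HERMITIAN_EXAMPLE = [[2, 1 + 1j],
                     [1 - 1j, 3]]
NON_HERMITIAN_EXAMPLE = [[1, 1j],
                         [1j, 1]]
```

Checking positive semidefiniteness would additionally require the eigenvalues, which this sketch does not compute.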
A density matrix is a self-adjoint (or Hermitian) positive-semidefinite matrix (possibly infinite-dimensional) of trace one that describes the statistical state of a quantum system. The formalism was introduced by John von Neumann (according to other sources, independently by Lev Landau and Felix Bloch) in 1927.
A finite quantum automaton with mixed states is described by the initial density matrix, the unitary matrices corresponding to the input letters, and the measurements.
$p_1 \; p_2 \; d_1 \; p_3 \; d_2 \; d_3 \; d_4$
When studying Hamming distances and Hamming codes for binary strings we use the independence of distinct symbols in the string. There is no such independence in permutations. Hamming distance between any two n-permutations either equals 0 or is no less than 2.
We wish to construct a set of n-permutations containing as many elements as possible such that the Hamming distance between any two of them is as large as possible.
There are n! n-permutations. We now wish to construct a subset of them such that the Hamming distance between any two of them is at least 3. How large can the subset be?
Even and Odd Permutations
A permutation is even if and only if its diagram has an even number of crossings.
There are n! n-permutations. Half of them are even, half of them are odd. The even permutations are a subgroup, the odd permutations are a coset. The number of these permutations is large enough. Unfortunately, the Hamming distance is too small.
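These statements can be checked for small $n$ by brute force. The sketch below counts crossings as inversions (pairs that appear in the wrong order), which has the same parity as the number of crossings in the diagram; the function names are illustrative.

```python
from itertools import permutations


def hamming_distance(s, t):
    """Number of positions in which the permutations s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def is_even(perm):
    """A permutation is even iff it has an even number of inversions (crossings)."""
    n = len(perm)
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def min_distance(perms):
    """Smallest Hamming distance between two distinct permutations of the list."""
    return min(
        hamming_distance(s, t)
        for i, s in enumerate(perms)
        for t in perms[i + 1:]
    )


def even_permutations(n):
    """All even permutations of 0..n-1, written as tuples of images."""
    return [perm for perm in permutations(range(n)) if is_even(perm)]
```

For $n = 5$ there are 60 even permutations and their minimum pairwise distance is 3: two distinct even permutations cannot differ by a single transposition, but they can differ by a 3-cycle.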
Permutahedron
How many automorphisms does the Fano plane have? Take any line (e.g. the line {6,1,5}) and consider which permutations of the vertices preserve the line {6,1,5}. Every permutation of this kind is described by the positions of the vertices 0, 2, 3, 4. This gives us 4! = 24 possibilities. Since there are 7 lines, the number of automorphisms is 24 times 7 = 168.
The automorphism group of the Fano plane is a group of order 168. It is generated by permutation matrices associated with coordinate permutations.
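The count 168 can be confirmed by checking all $7! = 5040$ vertex permutations. The sketch below assumes one particular labelling of the Fano plane, with lines $\{i, i+1, i+3\}$ modulo 7 (which happens to contain the line {6,1,5}); the labelling on the original slide may differ, but the counts do not depend on it.

```python
from itertools import permutations

# Assumed labelling: the 7 lines are {i, i+1, i+3} modulo 7 for i = 0..6.
LINES = [frozenset((i % 7, (i + 1) % 7, (i + 3) % 7)) for i in range(7)]
LINE_SET = set(LINES)


def image(perm, points):
    """Image of a set of points under the vertex permutation perm."""
    return frozenset(perm[v] for v in points)


def is_automorphism(perm):
    """A vertex permutation is an automorphism iff it maps every line to a line."""
    return all(image(perm, line) in LINE_SET for line in LINES)


AUTOMORPHISMS = [perm for perm in permutations(range(7)) if is_automorphism(perm)]

# Automorphisms that map the line {6, 1, 5} onto itself.
FIXED_LINE = frozenset((6, 1, 5))
STABILIZER = [perm for perm in AUTOMORPHISMS if image(perm, FIXED_LINE) == FIXED_LINE]
```

Here `len(AUTOMORPHISMS)` is 168 and `len(STABILIZER)` is 24, matching the count 24 times 7.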
A finite projective plane of order $n$ is a set of $n^2 + n + 1$ points and a set of $n^2 + n + 1$ subsets of points called lines such that
- any two distinct points determine a unique line,
- any two distinct lines determine a unique point,
- every point is contained in exactly $n + 1$ lines,
- every line contains exactly $n + 1$ points.
The finite projective plane of order $n = 2$, the Fano plane, is shown above. It is easy to show that for every prime power $q = p^k$ there exists a projective plane of order $q$.
From László Babai's Home Page
There are n! n-permutations. We now wish to construct a subset of them such that the Hamming distance between any two of them is the maximal possible, i.e. n. How large can the subset be?
The answer is n. Indeed, such a subset exists and it is maximal.
There are n! n-permutations. We now wish to construct a subset of them such that the Hamming distance between any two of them is at least n-1. How large can the subset be?
The answer is n(n-1). For rather many n such a subset exists and it is maximal.
How to construct such a set of permutations for arbitrary n ? Let us re-write this set as Have we improved anything?
This is the set of all linear functions modulo 5:
x, x+1, x+2, x+3, x+4
2x, 2x+1, 2x+2, 2x+3, 2x+4
3x, 3x+1, 3x+2, 3x+3, 3x+4
4x, 4x+1, 4x+2, 4x+3, 4x+4
Since 5 is a prime number, they are distinct permutations.
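A short Python sketch can confirm that these 20 functions are distinct permutations of $\{0, \dots, 4\}$ and that any two of them differ in at least $n - 1 = 4$ positions. The function names are illustrative; the construction is only meant for prime $n$.

```python
def linear_permutations(n):
    """Value tables of the functions x -> a*x + b (mod n) with a = 1..n-1, b = 0..n-1."""
    return [
        tuple((a * x + b) % n for x in range(n))
        for a in range(1, n)
        for b in range(n)
    ]


def hamming_distance(s, t):
    """Number of arguments x on which the two functions differ."""
    return sum(u != v for u, v in zip(s, t))


def min_distance(perms):
    """Smallest Hamming distance between two distinct members of the list."""
    return min(
        hamming_distance(s, t)
        for i, s in enumerate(perms)
        for t in perms[i + 1:]
    )


def all_are_permutations(tables, n):
    """True if every value table is a permutation of 0..n-1."""
    return all(sorted(t) == list(range(n)) for t in tables)
```

Two functions with the same multiplier never agree, and two functions with different multipliers agree at exactly one argument when $n$ is prime, which gives the distance $n - 1$ for the whole set of $n(n-1)$ permutations.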
Lemma. If for a pair $(n, d)$ the equality $G(n, d) = n(n-1)(n-2) \dots (d+1)d$ holds, then $G(n-1, d) = (n-1)(n-2) \dots (d+1)d$.
It would be very nice to have a proof that $G(n, d) = n(n-1)(n-2) \dots (d+1)d$. However, to prove this theorem we need much less. It suffices to prove that $G(n, \mathrm{const} \cdot n)$ is superexponential in $n$.
On the other hand, we are interested in the “Hamming” set of permutations being an algebraic group with a small number of generating elements. This would allow us to have a small alphabet for the languages recognized by quantum and deterministic automata.
Lagrange's theorem. For any finite group G, the order (number of elements) of every subgroup H of G divides the order of G.
There are n! n-permutations. We now wish to construct a subset of them such that the Hamming distance between any two of them is at least n-2. How large can the subset be?
We expect n(n-1)(n-2). If n=5, then n(n-1)(n-2)=60. Indeed, there are 60 even 5-permutations. If n=6, then n(n-1)(n-2)=120. How to construct such a set?
This is a group with two generating elements: $g_1(x) = 6x + 5$ and $g_2(x) = x^4 + 3x + 1$.
Sharply 2-transitive groups were constructed using linear functions. If n is a prime number, this gives us the needed group. What can we do to construct a sharply 3-transitive group?
Cayley transform
In mathematics, the Cayley transform, named after Arthur Cayley, has a cluster of related meanings. As originally described by Cayley (1846), the Cayley transform is a mapping between skew-symmetric matrices and special orthogonal matrices.
where $K = w^2 + x^2 + y^2 + z^2$
Thank you
Reserve