Shuffling by semi-random transpositions
Elchanan Mossel, U.C. Berkeley
Joint work with Yuval Peres and Alistair Sinclair
Shuffling by random transpositions
At each step, choose two cards independently and uniformly at random and exchange them.
Shuffling by random transpositions
Thm [Diaconis-Shahshahani-81]: The mixing time of the random transpositions shuffle is $(\tfrac12 + o(1))\, n \log n$.
One can prove an $O(n \log n)$ upper bound using “marking” (more later).
Proof of an $\Omega(n \log n)$ lower bound: each step “touches” 2 random cards. Until time $(n \log n)/4$ there are $\Omega(n^{1/2})$ untouched cards, all still in their original positions $\Rightarrow$ the permutation is not random.
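The untouched-card count in the lower bound can be checked numerically. A minimal sketch, assuming the two cards at each step are drawn independently (so they may coincide); the function names `expected_untouched` and `simulate_untouched` are illustrative, not from the talk.

```python
import math
import random


def expected_untouched(n: int, t: int) -> float:
    """Expected number of cards never touched after t random transpositions.

    Each step touches two independent uniform positions, so a given card
    escapes all 2t touches with probability (1 - 1/n) ** (2 * t).
    """
    return n * (1.0 - 1.0 / n) ** (2 * t)


def simulate_untouched(n: int, t: int, seed: int = 0) -> int:
    """Run t random transpositions and count the cards that were never touched."""
    rng = random.Random(seed)
    deck = list(range(n))       # deck[p] = label of the card at position p
    touched = [False] * n       # indexed by card label
    for _ in range(t):
        i, j = rng.randrange(n), rng.randrange(n)
        touched[deck[i]] = touched[deck[j]] = True
        deck[i], deck[j] = deck[j], deck[i]
    return touched.count(False)


n = 10_000
t = int(n * math.log(n) / 4)
print("expected untouched:", expected_untouched(n, t))
print("sqrt(n):           ", math.sqrt(n))
print("one simulation:    ", simulate_untouched(n, t))
```

At time $(n \log n)/4$ the expectation is $n \cdot e^{-(\log n)/2} \approx n^{1/2}$, far more fixed points than a uniform permutation has (about one on average).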
The cyclic to random shuffle
At step i, exchange the card at location (i mod n) with a uniformly chosen card.
History of the cyclic to random shuffle
The shuffle was introduced by Thorp (1965). Aldous and Diaconis (1986) asked what its mixing time is. Mironov posed the question again and proved an O(n log n) upper bound using marking.
Why do we care?
General question: Is systematic scan faster than random update? (Other examples: Diaconis-Ram; Benjamini-Berger-Hoffman-M for asymmetric exclusion; Gaussian fields, etc.)
It would be nice to find a “natural problem” where the mixing time is strictly between order $n$ and order $n \log n$.
Mironov: Cyclic to random may tell us a lot about RC4, a widely used crypto algorithm.
The RC4 algorithm
Mironov: Let’s study the algorithm assuming j is random. Slow mixing corresponds to weak crypto.
More than $10^6$ hits on Google.
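The RC4 code itself does not appear in the text above. For orientation, here is a minimal sketch of the standard published description of RC4 (key scheduling followed by output generation). It is for illustration only, not for production use, and the function name `rc4_keystream` is hypothetical. Note the structure: the index `i` advances cyclically while `j` is updated pseudo-randomly, and each step swaps `S[i]` and `S[j]`. If `j` were truly uniform, this would be the cyclic to random shuffle on 256 cards.

```python
def rc4_keystream(key: bytes, length: int) -> bytes:
    """Return the first `length` keystream bytes of RC4 for the given key."""
    # Key scheduling: start from the identity permutation of 0..255.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # Output generation: i is the cyclic pointer, j the pseudo-random pointer.
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)


print(rc4_keystream(b"Key", 9).hex())
```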
Upper Bounds - Broder’s Marking
Broder’s marking argument: Call the two pointers $L_t$ and $R_t$. Start by marking the first card pointed to by $L_1$. At time $t$, mark the card pointed to by $L_t$ if either the card at $R_t$ is marked or $R_t = L_t$.
Broder’s Marking
[Figure: example marking steps with pointers L and R, including the case R = L.]
Broder’s marking
By induction: given the time and the set of marked cards and their positions, the permutation on the marked cards is uniform.
$\Rightarrow$ The time when all cards are marked is a strong uniform time (the permutation is random given the time).
To prove an upper bound, we need to bound the “marking time”.
For random transpositions this is easy: by a coupon collector estimate, this time is $O(n \log n)$.
Mironov: delicate analysis for cyclic to random.
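The marking rule can be simulated directly to get a feel for the marking time. A rough sketch, assuming marks travel with the cards when they are exchanged and that the card at $L_1$ is marked unconditionally at the first step; `marking_time` and its parameters are illustrative names.

```python
import math
import random
from typing import Callable, Optional


def marking_time(n: int, left: Callable[[int], int],
                 rng: random.Random, max_steps: int) -> Optional[int]:
    """First time at which all n cards are marked, or None if max_steps is hit.

    left(t) gives the pointer L_t; R_t is drawn uniformly from {0, ..., n-1}.
    """
    marked = [False] * n        # marked[p]: is the card now at position p marked?
    num_marked = 0
    for t in range(1, max_steps + 1):
        L = left(t)
        R = rng.randrange(n)
        # Mark the card at L_t if the card at R_t is marked or R_t = L_t
        # (and unconditionally at the very first step).
        if not marked[L] and (t == 1 or marked[R] or R == L):
            marked[L] = True
            num_marked += 1
        # The cards at L_t and R_t are exchanged; marks move with the cards.
        marked[L], marked[R] = marked[R], marked[L]
        if num_marked == n:
            return t
    return None


n = 52
cap = 200 * n * n
rng = random.Random(0)
print("cyclic to random:     ", marking_time(n, lambda t: t % n, rng, cap))
print("random transpositions:", marking_time(n, lambda t: rng.randrange(n), rng, cap))
print("n log n =", round(n * math.log(n)))
```

Since at most one card is marked per step, the marking time is always at least $n$; the simulation only illustrates the order of magnitude, not the constants in the proofs.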
A general n log n upper bound
Thm [M-Peres-Sinclair]: An $O(n \log n)$ upper bound on the mixing time holds for any shuffle where:
- At step $t$ we exchange the cards at positions $L_t$ and $R_t$, where the $R_t$ are i.i.d. uniform in $\{0, \ldots, n-1\}$.
- The sequence $L_t$ is independent of $R_t$.
- $L_t$ can be random, deterministic, etc.
Examples: cyclic to random is given by $L_t = t \bmod n$; top to random is given by $L_t = 0$ for all $t$; random transpositions is given by $L_t$ i.i.d. uniform.
Pf: Careful analysis of the marking process.
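The family of shuffles covered by the theorem can be written as one routine parameterized by the sequence $L_t$. A minimal sketch; `semi_random_shuffle` and the `left` callback are illustrative names, and time is indexed from 0 here.

```python
import random
from typing import Callable, List


def semi_random_shuffle(n: int, left: Callable[[int], int],
                        steps: int, rng: random.Random) -> List[int]:
    """Apply `steps` exchanges of positions L_t = left(t) and a uniform R_t."""
    deck = list(range(n))               # deck[p] = card at position p
    for t in range(steps):
        L = left(t)
        R = rng.randrange(n)            # R_t: i.i.d. uniform in {0, ..., n-1}
        deck[L], deck[R] = deck[R], deck[L]
    return deck


n, steps = 8, 24
rng = random.Random(0)
print("cyclic to random:     ", semi_random_shuffle(n, lambda t: t % n, steps, rng))
print("top to random:        ", semi_random_shuffle(n, lambda t: 0, steps, rng))
print("random transpositions:", semi_random_shuffle(n, lambda t: rng.randrange(n), steps, rng))
```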
A general n log n upper bound
Proof in more detail: We may assume that $L_t$ is deterministic. Partition time into intervals of length $2n$. In such an interval, look at pairs of times $s < t$ such that $L_s = L_t$ (there are at least $n$ such pairs). We can mark card $x$ if:
- at time $s$, $x$ is chosen by $R_s$;
- $R_r \neq L_t$ for $s < r < t$;
- $R_t$ is one of the marked cards.
Letting $m_i$ ($u_i$) denote the marked (unmarked) cards at interval $i$ gives $E[u_{i+1} \mid \mathcal{F}_i] \le u_i (1 - c\, m_i)$ for some $c > 0$. We will skip the rest of the proof.
[Figure: card $x$ with the pointers $L_s$, $R_s$ at time $s$ and $L_t$, $R_t$ at time $t$.]
Cyclic to random shuffle – lower bound?
Mironov proved a $c\,n$ lower bound for some $c > 1$ using parity as a test function: each step changes the parity with probability $1 - 1/n$, so after $t$ steps the resulting parity equals the original parity with probability $\tfrac12\bigl(1 + (2/n - 1)^t\bigr)$.
Q: Is next to random (i.e. cyclic to random) faster than random transpositions?
Note: All cards are touched by time $n$.
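The parity probability can be double-checked numerically. A small sketch, assuming each step independently flips the parity with probability $1 - 1/n$ (the step is a genuine transposition unless the uniform card coincides with the cyclic pointer); it compares the step-by-step recursion with the closed form $\tfrac12(1 + (2/n - 1)^t)$. Function names are illustrative.

```python
def parity_same_recursive(n: int, t: int) -> float:
    """P(parity after t steps equals the original parity), by recursion."""
    p = 1.0                                  # at time 0 the parity is unchanged
    flip = 1.0 - 1.0 / n                     # probability that one step flips parity
    for _ in range(t):
        p = p * (1.0 - flip) + (1.0 - p) * flip
    return p


def parity_same_closed(n: int, t: int) -> float:
    """Closed form: (1 + (2/n - 1)^t) / 2."""
    return 0.5 * (1.0 + (2.0 / n - 1.0) ** t)


n = 52
for t in (1, n, 5 * n):
    print(t, parity_same_recursive(n, t), parity_same_closed(n, t))
```

The bias decays like $(1 - 2/n)^t$, which is why parity alone can only certify a lower bound of order $n$.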
n log n lower bound for cyclic to random shuffle
Thm [M-Peres-Sinclair]: The cyclic to random shuffle has mixing time $\Omega(n \log n)$.
More precisely: [formula missing]
And here is how the proof goes:
Step 1: Homogenizing the chain
Problem: The chain is not time-homogeneous. This is easily fixed. Consider a chain where at time $t$:
- $\sigma(0)$ is swapped with $\sigma(U)$, where $U$ is uniform.
- All cards are rotated to the left: $\sigma'(k) = \sigma(k+1 \bmod n)$.
Clearly this chain is equivalent to the original, and it is homogeneous. From now on we study the homogenized chain.
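The equivalence can be made concrete: if the homogenized chain uses choices $U_t$ and the cyclic to random shuffle uses $R_t = (U_t + t) \bmod n$ (still i.i.d. uniform), then after $T$ steps the two decks agree up to a rotation by $T$. A minimal sketch under that coupling, with time indexed from 0 and illustrative function names:

```python
import random
from typing import List


def cyclic_to_random(n: int, choices: List[int]) -> List[int]:
    """Time-inhomogeneous chain: at step t swap positions (t mod n) and choices[t]."""
    deck = list(range(n))
    for t, r in enumerate(choices):
        i = t % n
        deck[i], deck[r] = deck[r], deck[i]
    return deck


def homogenized(n: int, choices: List[int]) -> List[int]:
    """Homogenized chain: swap positions 0 and U, then rotate all cards left."""
    deck = list(range(n))
    for u in choices:
        deck[0], deck[u] = deck[u], deck[0]
        deck = deck[1:] + deck[:1]          # sigma'(k) = sigma(k+1 mod n)
    return deck


n, T = 7, 25
rng = random.Random(0)
U = [rng.randrange(n) for _ in range(T)]
R = [(u + t) % n for t, u in enumerate(U)]
a = cyclic_to_random(n, R)
b = homogenized(n, U)
print(b == [a[(k + T) % n] for k in range(n)])
```

A fixed rotation does not change the distance to the uniform distribution, so mixing can be studied on either chain.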
One card chain
Markov chain $M$ for a single card: the eigenvalues satisfy $\lambda = (1 - 1/n)\,\mu$, where $(n-1)\mu^n - n\mu^{n-1} + 1 = 0$.
We want to show slow mixing $\Rightarrow$ we want $\lambda$ close to 1.
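The single-card chain can be written out explicitly. In the sketch below the transition rule is derived from the homogenized chain as described on the previous slide (it is an assumption that this is the same matrix as the one shown in the talk): a card at position 0 moves to a uniform position, and a card at position $k \ge 1$ moves to $k-1$ with probability $1 - 1/n$ and to $n-1$ with probability $1/n$. Writing $a = 1/\mu$, the eigenvalue condition becomes the polynomial $a^n - na + (n-1) = 0$ that appears on the next slide. The Newton seed and all function names are illustrative choices.

```python
from typing import List, Tuple


def one_card_matrix(n: int) -> List[List[float]]:
    """Transition matrix M[k][j] of one card's position in the homogenized chain."""
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        M[0][j] = 1.0 / n                   # card at 0 goes to a uniform position
    for k in range(1, n):
        M[k][k - 1] += 1.0 - 1.0 / n        # not hit by U: just rotated left
        M[k][n - 1] += 1.0 / n              # hit by U: moved to 0, then rotated
    return M


def newton_root(n: int, a0: complex, iters: int = 60) -> complex:
    """Newton iteration for a root of a^n - n*a + (n - 1) near the seed a0."""
    a = a0
    for _ in range(iters):
        a = a - (a ** n - n * a + (n - 1)) / (n * a ** (n - 1) - n)
    return a


def eigen_residual(n: int, a: complex) -> Tuple[complex, float]:
    """Return lambda = (1 - 1/n)/a and max |(M f - lambda f)(k)| for f(k) = (a^k - 1)/(a - 1)."""
    lam = (1.0 - 1.0 / n) / a
    f = [(a ** k - 1) / (a - 1) for k in range(n)]
    M = one_card_matrix(n)
    Mf = [sum(M[k][j] * f[j] for j in range(n)) for k in range(n)]
    return lam, max(abs(Mf[k] - lam * f[k]) for k in range(n))


n = 200
a = newton_root(n, 1 + (2 + 7.5j) / n)      # seed near the root quoted in the talk
lam, residual = eigen_residual(n, a)
print("lambda =", lam, " |lambda| =", abs(lam), " residual =", residual)
```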
Asymptotics of eigenvalues and eigenfunctions
$\lambda = (1 - 1/n)\,\mu$, where $(n-1)\mu^n - n\mu^{n-1} + 1 = 0$. Let $\mu^{-1} = 1 + z/n$ and get $(1 + z/n)^n - n(1 + z/n) + (n-1) = 0 \;\to\; e^z - z - 1 = 0$.
Lemma 1: $e^z - z - 1$ has non-zero complex roots.
Lemma 2: If $z$ is a root, then $M$ has an eigenvalue $\lambda$ such that $1 - |\lambda| = (1 + \Re z)/n + O(1/n^2)$.
Lemma 3: The eigenvector $f$ corresponding to $\lambda$ is “smooth”: $|f|_\infty \le C\, |f|_2$. We will write $|f|$ for either.
Pfs: Complex analysis …
Remark: Numerically, the smallest non-zero root is $z = 2.088\ldots \pm \ldots\, i$.
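The root in the remark and the asymptotics in Lemma 2 can be explored numerically. A rough sketch using Newton's method; the seed `2 + 7.5j` is an assumed starting guess, and the function names are illustrative. It first solves $e^z - z - 1 = 0$, then solves the finite-$n$ equation $(1 + z/n)^n - n(1 + z/n) + (n-1) = 0$ in the variable $a = 1 + z/n$, and compares $n(1 - |\lambda|)$ with $1 + \Re z$.

```python
import cmath


def root_of_limit_equation(z0: complex, iters: int = 50) -> complex:
    """Newton iteration for a root of e^z - z - 1 near the seed z0."""
    z = z0
    for _ in range(iters):
        z = z - (cmath.exp(z) - z - 1) / (cmath.exp(z) - 1)
    return z


def eigenvalue_near(n: int, z: complex, iters: int = 50) -> complex:
    """Eigenvalue lambda = (1 - 1/n)/a, with a the root of a^n - n*a + (n-1) near 1 + z/n."""
    a = 1 + z / n
    for _ in range(iters):
        a = a - (a ** n - n * a + (n - 1)) / (n * a ** (n - 1) - n)
    return (1 - 1 / n) / a


z = root_of_limit_equation(2 + 7.5j)
print("root z =", z, " residual =", abs(cmath.exp(z) - z - 1))
for n in (200, 2000, 20000):
    lam = eigenvalue_near(n, z)
    print(n, " n(1 - |lambda|) =", n * (1 - abs(lam)), "  1 + Re z =", 1 + z.real)
```

The two columns should approach each other as $n$ grows, in line with the $O(1/n^2)$ error term in Lemma 2.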
The test function
Take $f$ to be an eigenfunction of $M$ corresponding to the eigenvalue $\lambda$ closest to 1. Define the test function $F$: [definition missing]
Easy: $E[f] = 0 \Rightarrow E[F] = 0$.
Easy: $E[F(\mathrm{id}_t)] = \lambda^t |f|^2$.
A longer calculation gives: $E(F^2) = |f|^4/n$.
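The main Lemma
It remains to bound $E[|F(\mathrm{id}_t)|^2]$.
Main Lemma: [statement missing]
$\Rightarrow$ As long as $|\lambda|^{2t} \gtrsim (4t + n)/n^2$, $\mathrm{id}_t$ and $\sigma$ (where $\sigma$ is uniform) have large total variation distance (2nd moment method).
Since $1 - |\lambda| = O(1/n)$: $\Rightarrow$ the mixing time is $\Omega(n \log n)$.
Recall: $E(F) = 0$, $E(F(\mathrm{id}_t)) = \lambda^t |f|^2$, $E(F^2) = |f|^4/n$.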
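To see why the condition gives a time of order $n \log n$, one can plug in $|\lambda| = 1 - c/n$ and look for the last $t$ at which $|\lambda|^{2t} \ge (4t + n)/n^2$. A minimal sketch, assuming the implied constant in the condition is 1 and using a hypothetical value $c = 3$; it checks that $t = (n \log n)/(4c)$ still satisfies the condition and finds the threshold by binary search.

```python
import math


def condition_holds(n: int, t: int, c: float) -> bool:
    """Check (1 - c/n)^(2t) >= (4t + n) / n^2, i.e. the condition with constant 1."""
    return (1.0 - c / n) ** (2 * t) >= (4 * t + n) / n ** 2


def last_good_time(n: int, c: float) -> int:
    """Largest t satisfying the condition (left side decreases, right side increases in t)."""
    lo, hi = 0, 1
    while condition_holds(n, hi, c):
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if condition_holds(n, mid, c):
            lo = mid
        else:
            hi = mid
    return lo


c = 3.0                                      # hypothetical constant in 1 - |lambda| = c/n
for n in (10**2, 10**4, 10**6):
    t_star = last_good_time(n, c)
    print(n, t_star, t_star / (n * math.log(n)))
```

At $t = (n \log n)/(4c)$ the left side is about $n^{-1/2}$ while the right side is of order $(\log n)/n$, so the condition holds with room to spare.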
Proof of main Lemma
The main lemma can be proved using Wilson’s method and the properties of $\lambda$ and $f$. Or it can be done more directly using coupling:
Lemma: [statement missing]
Proof of main Lemma
Pf idea: “Couple” the following two processes:
Process 1: cards $i$ and $j$ move independently.
Process 2: the locations of cards $i$ and $j$ in the real process.
In process 1: [formula missing]
It remains to bound the difference between the processes using coupling. We will skip the details …
Conclusion and Open problems
We’ve seen that the pseudo-random next to random shuffle has a mixing time of the same order, $n \log n$, as the random transposition shuffle. The proof is not that hard.
Problem: How general is the phenomenon? In particular:
Open problem: Are there any sequences $I_t$ (deterministic or random) such that the $I_t$ to random shuffle mixes in less than $n \log n$ time?