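How to realize many-input AND gates. [Figure: circuit on wires a, b, c, d with one ancilla wire initialized to 0, showing the products ab, abc, cd and acd.] We need just one row to create all three-input ANDs.
3-input AND gate realizable with one ancilla bit that can be reused by other AND gates. [Figure: circuit on wires a, b, c with one ancilla wire initialized to 0, showing the products ab and abc.]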
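An example. [Figure: circuit on wires a, b, c, d, e, f, g.] Wire g first carries g ⊕ abc, so wire f becomes (g ⊕ abc)de ⊕ f. Wire g is then restored, since (g ⊕ abc) ⊕ abc = g, and finally (g ⊕ abc)de ⊕ f ⊕ deg = abcde ⊕ f. Conclusions: 1. Every function of type abcde ⊕ f is realizable with n+1 bits. 2. Every ESOP with n bits and one output is realizable in n+1 bits.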
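The cancellation in this example can be checked exhaustively. The sketch below assumes the four-gate order read off the slide (g ⊕= abc, f ⊕= gde, g ⊕= abc, f ⊕= gde); the function name is illustrative and not part of the original slides.

```python
from itertools import product

def five_input_product_with_borrowed_bit(a, b, c, d, e, f, g):
    """Apply four smaller gates; g is a borrowed bit with an arbitrary value."""
    g ^= a & b & c      # g becomes g XOR abc
    f ^= g & d & e      # f becomes f XOR (g XOR abc)de
    g ^= a & b & c      # g is restored to its original value
    f ^= g & d & e      # the extra term deg cancels, leaving f XOR abcde
    return f, g

# Check every one of the 2**7 input combinations.
for bits in product((0, 1), repeat=7):
    a, b, c, d, e, f, g = bits
    out_f, out_g = five_input_product_with_borrowed_bit(*bits)
    assert out_f == f ^ (a & b & c & d & e)
    assert out_g == g
print("all 128 input combinations agree")
```

Because g returns to its starting value whatever that value was, the same extra bit can be shared by every product term of an ESOP, which is the point of conclusion 2.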
Warning for Maslov. Each of these gates costs 3 Toffoli gates. A Toffoli gate with a 5-input product requires 4 of these gates (see the last slide), so it requires 3*4 = 12 gates. Conclusion: when Maslov reports cost 1, the cost may actually be 12.
Main Problems in cascade design. 1. Portland Quantum Logic Group, since Concept of cascades. Search from inputs, from outputs, and bidirectional. A* search, GA. Many types of restricted gates. Exhaustive search. Multi-valued reversible logic. 2. Maslov, Dueck, Miller, University of New Brunswick, since A simple heuristic that finds good results for the unlimited gate model. Because they use the unlimited gate model, they do not worry about Markov's theorem. 3. Shende, Markov, Hayes, University of Michigan, since An important theorem about odd and even functions. This theorem explains the lack of success of our early approach for some functions. 4. Niraj Jha and Abhinav Agarwal, Princeton University, since Best results so far (for the unlimited model?)
Previous research in cascades. 1. (Perkowski, Mishchenko, Khlopotine, Kerntopf) If every selected gate improves the Hamming Distance (HD), the HD eventually becomes 0, the function becomes an identity, and we have a solution. 2. Goal: select the best gate (greedy algorithm). Drawback: in some situations there is no gate to select, because all gates worsen the HD. We are then stuck at a local point in the search space from which every gate selection worsens the HD.
Previous research in cascades. 1. Example: abc ⊕ d. 2. I need to flip only two bits, but I cannot do this, since I could do this only with products that have three literals.
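The stuck situation for abc ⊕ d can be reproduced with a small experiment. This is a minimal sketch, assuming HD is counted as the total number of output bits that differ from the identity over all input rows, and assuming a restricted library of NOT, Feynman and Toffoli gates with at most two controls; all names are illustrative.

```python
from itertools import combinations

N = 4  # wires a, b, c, d stored as bits 3, 2, 1, 0

def apply_gate(value, controls, target):
    """Flip the target bit when every control bit is 1."""
    if all((value >> c) & 1 for c in controls):
        value ^= 1 << target
    return value

def hamming_distance(table):
    """Total number of output bits that differ from the identity mapping."""
    return sum(bin(x ^ y).count("1") for x, y in enumerate(table))

def gate_library(max_controls):
    """All gates on N wires with at most max_controls control bits."""
    for target in range(N):
        others = [w for w in range(N) if w != target]
        for k in range(max_controls + 1):
            for controls in combinations(others, k):
                yield controls, target

def best_after_one_gate(table, max_controls):
    """Smallest HD reachable by appending one gate from the library."""
    return min(
        hamming_distance([apply_gate(y, c, t) for y in table])
        for c, t in gate_library(max_controls)
    )

# Specification: d becomes d XOR abc (only two rows differ from identity).
spec = [apply_gate(x, (3, 2, 1), 0) for x in range(1 << N)]
start_hd = hamming_distance(spec)
restricted_best = best_after_one_gate(spec, max_controls=2)
unrestricted_best = best_after_one_gate(spec, max_controls=3)
print(start_hd, restricted_best, unrestricted_best)
```

With the restricted library no single gate lowers the HD, so a greedy search has no improving move; allowing the three-literal product solves the function in one gate.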
Maslov, Dueck and Miller. 1. Every function is realizable with the DMM algorithm. Drawback: they assume gates like abcd ⊕ e, which require internal constants. It can be shown that the minimum number of these constants is one, and a simpler circuit uses more constants. Therefore their method of counting gate costs is not fair. However, this assumption makes the algorithm always convergent. Drawback: it often converges with very non-minimal costs, and they then need another algorithm to improve the result by local optimizations.
Main Markov Theorem. If the function is even, then it is realizable on n bits. If the function is odd (as abc ⊕ d), then it requires one more ancilla bit. So n+1 bits are sufficient for EVERY function of n variables. For n=3 every function is realizable with Toffoli, NOT and Feynman. Markov considers only Toffoli gates with 3 inputs, Feynman and NOT. He does not use Toffoli gates with n inputs, as Maslov does.
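The even/odd distinction can be checked directly by treating each gate as a permutation of the 2^n input rows and counting transpositions. This is a small sketch with illustrative names, not code from the cited work.

```python
def gate_permutation(n, controls, target):
    """Truth table of a controlled-NOT style gate as a permutation of range(2**n)."""
    table = []
    for x in range(1 << n):
        fire = all((x >> c) & 1 for c in controls)
        table.append(x ^ (1 << target) if fire else x)
    return table

def is_even(perm):
    """A permutation is even when it decomposes into an even number of transpositions."""
    seen = [False] * len(perm)
    transpositions = 0
    for start in range(len(perm)):
        length = 0
        x = start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        if length > 1:
            transpositions += length - 1  # a cycle of length L needs L-1 swaps
    return transpositions % 2 == 0

# On 4 wires, NOT, Feynman and 3-input Toffoli are all even permutations ...
small_gates = [
    gate_permutation(4, (), 0),        # NOT
    gate_permutation(4, (1,), 0),      # Feynman
    gate_permutation(4, (2, 1), 0),    # Toffoli with two controls
]
print([is_even(g) for g in small_gates])
# ... while d XOR abc swaps a single pair of rows and is odd.
print(is_even(gate_permutation(4, (3, 2, 1), 0)))
```

Compositions of even permutations stay even, which is why an odd function such as abc ⊕ d cannot be built on 4 wires from these gates alone and needs the extra wire. On 3 wires the two-control Toffoli is itself odd, consistent with the n=3 statement.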
Conclusion. For n>4 we assume all Toffoli gates with 3 inputs on all wires, NOT on all wires, and all Feynman gates with 2 inputs on any two wires. This is the same as Markov (but we also assume other gates) and different from Maslov. Using Markov's result, if the function is odd we need to add one more wire. (Maslov and Jha do not do this because they use the unlimited gate model.) Our results take fewer literals and often also fewer gates.
Conclusion. We are using all kinds of other gates, while their approaches still use only Toffoli, Feynman and NOT (just recently they added more). Our method is potentially better if we solve the non-convergence problem for some functions. Another method is to keep adding more ancilla bits. Combine various search models.
Kazuo Iwama †, Yahiko Kambayashi † and Shigeru Yamashita ‡ Quantum Computation and Information, ERATO, JST † Kyoto University, ‡ NTT Transformation Rules for Designing CNOT-based Quantum Circuits
Quantum Computing (QC). Computing paradigm based on quantum physics. Shor's algorithm for prime factorization. Grover's algorithm for database search. QC is still in the experimental phase, but the time complexities of the above algorithms are much better than those of their “classical” counterparts. To perform quantum computing efficiently, we need to design an efficient “Quantum Circuit” for the Boolean function of the given problem.
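Quantum Circuit. Each wire is a qubit, time runs along the wires, and quantum gates are operations on qubits. Control NOT (CNOT): if all the control bits are 1, then the target bit is negated. [Figure: wires x1 to x4 with a CNOT whose control bits are x2, x3 and whose target bit is x4.]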
How a CNOT Gate works. Just add an exor term to the target bit. [Figure: wires x1 to x4 with gates labeled NOT, CNOT 3 and CNOT 4; the terms x1x2 and x2x3 are added to the target, giving x4 ⊕ x1x2 and then x4 ⊕ x1x2 ⊕ x2x3. Our notation marks a CNOT with XOR in wire 4 by X.] However, we cannot have a wire in QC. Conventional logic design cannot be utilized!!
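The "add an exor term" view can be made concrete by tracking each wire symbolically as an XOR of product terms. The sketch below is an illustration only: it assumes two CNOTs targeting x4 with control sets {x1, x2} and {x2, x3}, and the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Multiply two XOR-of-products polynomials (x*x = x, equal terms cancel)."""
    out = set()
    for m1 in p:
        for m2 in q:
            out ^= {m1 | m2}  # toggling implements XOR of monomials
    return frozenset(out)

def apply_cnot(wires, controls, target):
    """Add the product of the control wires to the target wire."""
    term = frozenset({frozenset()})  # the constant 1
    for c in controls:
        term = poly_mul(term, wires[c])
    updated = dict(wires)
    updated[target] = wires[target] ^ term  # XOR of the two polynomials
    return updated

def show(poly):
    """Render a polynomial such as x1x2 ^ x4 (here ^ stands for exor)."""
    monomials = sorted(sorted(m) for m in poly)
    return " ^ ".join("".join(f"x{i}" for i in m) or "1" for m in monomials)

# Each wire starts out carrying its own variable.
wires = {i: frozenset({frozenset({i})}) for i in (1, 2, 3, 4)}
wires = apply_cnot(wires, (1, 2), 4)
wires = apply_cnot(wires, (2, 3), 4)
print(show(wires[4]))
```

Representing a wire as a set of monomials means that applying the same gate twice removes the term again, which matches the cancellation used later in Transformation Rule 1.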
Quantum Boolean Circuit (QBC). [Figure: inputs x1, x2, ..., xn pass through unchanged; wire x_{n+1} outputs x_{n+1} ⊕ f(x1 ... xn); the auxiliary bits x_{n+2}, x_{n+3} start at 0 and end at 0.] Can be used in a Quantum Algorithm.
Why Local Transformation? These are based on local transformation rules for Boolean formulas (AND/OR/NOT). Resolution Proof: prove a given CNF is 0 by transformations. Automated Logic Design: optimize a given circuit by transformations. [Figure: resolution example on clauses over x1, x2, x3 ending in the nil clause.]
Motivation: Local Transformations for QBC? Can we enjoy a similar concept (local transformation rules) in the world of CNOT gates? In the quantum world, using AND/OR is not so good. CNOT gates (with many control bits) are better logical primitives. We start with “complete” local transformation rules as a design methodology for CNOT-based circuits.
What we have done: Quantum Boolean Circuits with CNOT Gates. Canonical Form. Local Transformation Rules. Transformation Procedure to Canonical Form. Complete Local Transformation Rules for QBCs: if C1 and C2 are equivalent, we can transform C1 into C2 systematically. [Figure: C1 is reduced to S1, C2 is reduced to S2, and S1 = S2.]
Reduction to canonical form. If C1 and C2 are equivalent, we can transform C1 into C2 systematically. [Figure: the noncanonical circuits C1 and C2 are reduced to the canonical circuits S1 and S2.]
Transformations used for optimization and formal verification (Transformation, Equivalence Check). [Figure: two circuits on wires x1 to x7.]
Canonical Form. f(x) = x'1 x'2 x'3 = 1 ⊕ x1 ⊕ x2 ⊕ x3 ⊕ x1x2 ⊕ x2x3 ⊕ x1x3 ⊕ x1x2x3. FACT: all Boolean functions can be expressed uniquely in PPRM (Positive Polarity Reed-Muller) form. If we place CNOT gates lexicographically, we get the canonical form of the circuit. [Figure: wires x1, x2, x3 and w, with output w ⊕ f(x).]
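The PPRM coefficients, and from them one CNOT per term, can be computed from a truth table with the standard Reed-Muller butterfly. This is a sketch under stated assumptions: bit i of the table index is taken as x_{i+1}, and "lexicographic" is read as Python's sorted order on control tuples, which may differ from the ordering intended on the slide.

```python
def pprm_coefficients(truth_table):
    """Reed-Muller (Moebius) transform: truth table -> PPRM coefficients over GF(2)."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for idx in range(len(coeffs)):
            if idx & step:
                coeffs[idx] ^= coeffs[idx ^ step]
    return coeffs

def canonical_gates(truth_table):
    """One CNOT on the output wire per PPRM term, listed in sorted order.

    Each gate is given as the tuple of its control variables (1-based);
    the empty tuple is the constant term, i.e. a plain NOT.
    """
    gates = []
    for mask, coeff in enumerate(pprm_coefficients(truth_table)):
        if coeff:
            gates.append(tuple(i + 1 for i in range(mask.bit_length()) if (mask >> i) & 1))
    return sorted(gates)

# f(x) = x1' x2' x3' is 1 only on the all-zero input.
truth_table = [1 if x == 0 else 0 for x in range(8)]
print(canonical_gates(truth_table))
```

Since the PPRM form is unique, two circuits compute the same function exactly when this gate list is the same for both, which is what makes the canonical form usable for equivalence checking.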
Canonical Form of QBC. [Figure: inputs x1, x2, ..., xn pass through unchanged; wire x_{n+1} outputs x_{n+1} ⊕ f(x1 ... xn) using only CNOT n+1 gates in PPRM form; the auxiliary bits x_{n+2}, x_{n+3} start at 0 and end at 0.]
Transformation to the Canonical Form. By swapping two adjacent gates, move all CNOT 4 to the left. [Figure: circuit on wires x1 to x4 with output x4 ⊕ f(x), before and after moving gates.]
Transformation Procedure. We gather CNOTs with the same target bit (CNOT n+1, CNOT n, and so on). The gathered gates other than CNOT n+1 will disappear, so only CNOT n+1 is left, which gives the canonical form. [Figure: circuit on wires x1 ... xn, x_{n+1} and auxiliary bits x_{n+2}, x_{n+3} initialized to 0, with output x_{n+1} ⊕ f(x).]
Transformation Rule 1: cancel repeated terms. x4 ⊕ x1x2 ⊕ x1x2 = x4. [Figure: two identical adjacent gates on wires x1 to x4 equal the empty circuit.]
Transformation Rule 2: swap independent terms (move CNOT 4 to the left). Condition: t1 ∉ C2, t2 ∉ C1. Notation: t_i = output (target) of gate i, C_i = inputs (controls) of gate i. [Figure: two gates on wires x1 to x4 exchanged.]
Transformation Rule 3 (move CNOT 4 to the left). Condition: t1 ∈ C2, t2 ∉ C1. Swapping the two gates adds one gate. [Figure: circuits on wires x1 to x4 with the added gate marked.]
Transformation Rule 4 (move CNOT 4 to the left). Condition: t1 ∉ C2, t2 ∈ C1. Swapping the two gates adds one gate. [Figure: circuits on wires x1 to x4 with the added gate marked and the terms x1x2 and x1x3 annotated.]
Transformation Rule 4 cont. Move CNOT 4 to the left. Case 1: t1 ∉ C2, t2 ∉ C1: Rule 2. Case 2: t1 ∈ C2, t2 ∉ C1: Rule 3. Case 3: t1 ∉ C2, t2 ∈ C1: Rule 4. Case 4: t1 ∈ C2, t2 ∈ C1: impossible. [Figure: two adjacent gates with targets t1, t2 and control sets C1, C2.]
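The four cases can be checked by brute force on a small number of wires. The sketch below assumes the reading of the swap rules as "exchange the two gates and, in the two mixed cases, add one gate whose controls are C1 ∪ C2 without the shared wire"; the assignment of the mixed cases to Rule 3 and Rule 4 is likewise an assumption, and all names are illustrative.

```python
from itertools import combinations, product

def run(circuit, state):
    """Apply gates (target, controls) in order to a state stored as a bit mask."""
    for target, controls in circuit:
        if all((state >> c) & 1 for c in controls):
            state ^= 1 << target
    return state

def swap_adjacent(g1, g2):
    """Return a circuit equivalent to [g1, g2] with g2 moved before g1, or None."""
    (t1, c1), (t2, c2) = g1, g2
    if t1 in c2 and t2 in c1:
        return None                                   # case 4: no single local rule
    if t1 in c2:                                      # assumed Rule 3
        return [g2, g1, (t2, (c1 | c2) - {t1})]
    if t2 in c1:                                      # assumed Rule 4
        return [(t1, (c1 | c2) - {t2}), g2, g1]
    return [g2, g1]                                   # Rule 2: independent gates

def all_gates(n):
    """Every gate on n wires: one target and any subset of the other wires as controls."""
    for target in range(n):
        others = [w for w in range(n) if w != target]
        for k in range(len(others) + 1):
            for controls in combinations(others, k):
                yield target, frozenset(controls)

N = 4
checked = blocked = 0
for g1, g2 in product(list(all_gates(N)), repeat=2):
    swapped = swap_adjacent(g1, g2)
    if swapped is None:
        blocked += 1
        continue
    assert all(run([g1, g2], s) == run(swapped, s) for s in range(1 << N))
    checked += 1
print(checked, "pairs swapped,", blocked, "pairs fall into case 4")
```

Every pair outside case 4 passes the truth-table comparison, so the added gate is exactly what compensates for the changed control value after the exchange.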
Nontrivial case (case 4): t1 ∈ C2, t2 ∈ C1. We cannot swap by using just one local rule. We use auxiliary bits to swap gates in this case. [Figure: two such gates on wires x1 to x4.]
Transformation Rule 5. [Figure: equivalence involving a wire y and an auxiliary bit (garbage, ancilla bit) initialized to 0.]
Example: How to Treat the Nontrivial Case. If we encounter case 4: Rule 1 & Rule 2. [Figure: circuit on wires x1 to x4 with an auxiliary bit initialized to 0.]
Example: How to Treat the Nontrivial Case. Rule 5. [Figure: circuit on wires x1 to x4 with an auxiliary bit initialized to 0, before and after the rule.]
Example: How to Treat the Nontrivial Case. Rules 2, 3, 4: move gates to the left and to the right. We can delete these added gates eventually. [Figure: circuit on wires x1 to x4 with an auxiliary bit initialized to 0, before and after the moves.]
What have we discussed so far in this lecture? Quantum Boolean Circuits with CNOT gates. Canonical form. Transformation Rules for QBCs. Future work (it was a question asked in 2002): the notion of a “minimum circuit” and how to get the minimum circuit.
Examples of applications of transformations
Example Cost: 602 Cost: 188
Cost of CNOT gates. Cost(1) = 1, Cost(2) = 14, Cost(3) = 56, Cost(4) = 140, Cost(m) = 112(m-3) for m > 4. [BBC+95] shows constructions: a CNOT(2) gate by 14 basic gates; a CNOT(3) gate by 4 CNOT(2) gates; a CNOT(4) gate by 10 CNOT(2) gates; a CNOT(m) gate by 8(m-3) CNOT(2) gates (m > 4). *The cost should be much higher for many inputs.
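The cost table translates directly into a small function. This sketch assumes m counts the control bits of a gate and that a circuit is described only by the list of control counts of its gates; both function names are illustrative, and no cost is defined here for gates without controls because the slide gives none.

```python
def cnot_cost(m):
    """Cost of a CNOT gate with m control bits, following the table above."""
    fixed = {1: 1, 2: 14, 3: 56, 4: 140}
    if m in fixed:
        return fixed[m]
    if m > 4:
        return 112 * (m - 3)  # 8(m-3) CNOT(2) gates at 14 basic gates each
    raise ValueError("the table gives no cost for m < 1")

def circuit_cost(control_counts):
    """Total cost of a circuit given the number of controls of each gate."""
    return sum(cnot_cost(m) for m in control_counts)

# A circuit with one CNOT(1), one CNOT(2) and one CNOT(5) gate.
print(circuit_cost([1, 2, 5]))
```

The fixed entries are consistent with the quoted constructions: 4 × 14 = 56 for CNOT(3) and 10 × 14 = 140 for CNOT(4).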
Motivation: Design Methodology
Computer Design Methodology, for a given specification. Classical: manually made libraries (Switch, Adder, MUX, …) and Control Logic (Boolean functions). Quantum: manually made libraries (Hadamard Transformation, Fourier Transformation, ..) and Control Logic (Boolean functions). The control logic is the target for automatic design.
Design Methodology for Boolean functions, for a given Boolean function. Classical: designing with AND/OR/NOT (technology independent), then mapping to the library of available gates (technology dependent). Quantum: designing with CNOT gates (technology independent), then mapping to the library of available gates (technology dependent).
Design Methodology for Boolean functions. Classical: why AND/OR/NOT? The standard form (sum-of-products forms) and transformation rules are a fundamental concept and a good starting point for automatic design. Quantum: why CNOT gates? Our motivation is to establish a similar concept as a good starting point for design.
Why Only Boolean Function Parts? The efficient construction of QBCs is very important for the implementation of QAs, because only the Boolean function parts vary depending on the problem: Boolean oracles in Grover type algorithms, Shor type algorithms, simulating classical calculations, etc. *The Boolean function part can be simulated by classical Boolean functions, but we cannot utilize (classical) design methodology.
Shor type QA (find r s.t. g^r = x mod p). f(a1 ... an, b1 ... bn) = g^A x^(-B) mod p; this part varies depending on the problem. [Figure: registers A = a1 ... an and B = b1 ... bn initialized to 0, a W-H transform, the function block, and a QFT.]
More rules
Transformation Rule 6. [Figure: rule involving an auxiliary bit initialized to 0.]
Transformation Example Shift(S, 2) is called. Shift(S, 1) is called.
Step 1. Change the n-th control bit of c_i to an unused auxiliary bit by adding two gates a_i and b_i. [Figure: gates c1, c2, c3 on wires x1, x2, and the same gates surrounded by a1, b1, a2, b2, a3, b3 using auxiliary bits initialized to 0.] We want to prove this simple fact.
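Step 1 can be illustrated by simulation. The sketch below assumes that a_i and b_i are both the same Feynman gate that copies the chosen control onto the auxiliary bit (once to set it, once to clear it), and that the auxiliary bit starts at 0; the function names and wire numbering are hypothetical.

```python
def run(circuit, state):
    """Apply gates (target, controls) in order to a state stored as a bit mask."""
    for target, controls in circuit:
        if all((state >> c) & 1 for c in controls):
            state ^= 1 << target
    return state

def reroute_control(gate, control, ancilla):
    """Replace one control of a gate by an ancilla, adding a copy gate on each side."""
    target, controls = gate
    copy = (ancilla, frozenset({control}))              # plays the role of a_i and b_i
    moved = (target, (controls - {control}) | {ancilla})
    return [copy, moved, copy]

DATA_WIRES = 3   # wires 0, 1, 2 carry data
ANCILLA = 3      # wire 3 is the auxiliary bit, assumed to start at 0

original = (2, frozenset({0, 1}))                       # wire 2 ^= wire 0 AND wire 1
rerouted = reroute_control(original, control=1, ancilla=ANCILLA)

for data in range(1 << DATA_WIRES):                     # ancilla bit is 0 in every state
    assert run([original], data) == run(rerouted, data)
print("rerouted circuit matches the original and leaves the ancilla at 0")
```

The equality of the full states also shows that the auxiliary bit is returned to 0, so it can be reused for the next gate.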
About Step 1. Apply Rule 1, Rule 2 and Rule 5 to get the previous slide. [Figure: circuits on wires x1, x2 and auxiliary bits.]
Step 2. Move a_i to the left (Rules 2 & 4). [Figure: circuit on wires x1, x2 and auxiliary bits with gates a1, a2, a3, c1, c2, c3, b1, b2, b3 before and after the move; some gates are added.]
Step 3. Move b_i to the right (Rules 2 & 3). [Figure: circuit on wires x1, x2 and auxiliary bits with gates a1, a2, a3, c1, c2, c3, b1, b2, b3 before and after the move; some gates are added.]
Step 4. for (i = 2 to k) { Move a_i to the right after a_1 (Rule 2); Change the control of a_i to the ancilla bit 1 (Rule 5) } [Figure: gates a1, a2, a3 become a1, a'2, a'3.]
Step 5. for (each g_i = CNOT n) { Move g_i to the right after a_1 (Rules 2, 3, 4) } [Figure: gates g1, g2, g3, g4 moved past a1; some gates are added.] *Here, we omit some redundant pairs of gates.
Step 6 (1/2). Step 6 (a): Reorder CNOT n lexicographically (Rule 2). Step 6 (b): Delete redundant gates by Rule 6. Step 6 (c): Delete redundant pairs of gates by Rule 1. [Figure: circuit with gate a1 and the CNOT n gates after Step 6 (b) and Step 6 (c).]
Step 6 (2/2). Step 6 (a): Reorder CNOT n lexicographically (Rule 2). Step 6 (b): Delete redundant gates by Rule 6. Step 6 (c): Delete redundant pairs of gates by Rule 1. Why do the gates disappear? [Figure: circuit with gate a1 before and after.]
About Step 6 (c). Move the first gate to the right; the gates then disappear by Rule 6. → We can move all CNOT 2 to the left successfully, ∵ if some gates remain in g1(x), the final state of x2 is x2 ⊕ x2 g1(x) ⊕ g2(x), and this cannot be x2 ⊕ f(x). [Figure: circuit on wires x1, x2 and the ancilla with gate a1.]
Step 7. Move a_1 to the right after all CNOT n (Rule 2). [Figure: circuit on wires x1, x2 and the ancilla before and after the move.]
Then Shift(S, 1) is called. Step 1. [Figure: gates a1, b1, a2, b2 added on wires x1, x2 and an auxiliary bit initialized to 0.]
Shift(S, 1): Step 2. [Figure: gates a1, a2 moved to the left of b1, b2.]
Shift(S, 1): Step 3. [Figure: gates b1, b2 moved to the right; some gates are added.]
Shift(S, 1): Steps 4 & 5. [Figure: gates a1, a'2 after Step 4, and a1 after Step 5; some gates are added.]
Shift(S, 1): Steps 6 & 7. [Figure: circuit with gate a1 before and after.]
After All Shift(S, i): Step A 1/2. while (there is a gate g_i that uses ancilla bits; take the leftmost one) { Move g_i to the left (Rules 2, 4); Delete g_i (Rule 6) } [Figure: circuit on wires x1, x2 and auxiliary bits.]
After All Shift(S, i): Step A 2/2. while (there is a gate g_i that uses ancilla bits; take the leftmost one) { Move g_i to the left (Rules 2, 4); Delete g_i (Rule 6) } *Here, we omit some redundant pairs of gates. [Figure: circuit on wires x1, x2 and auxiliary bits.]
After All Shift(S, i): Step B. Reorder gates (except for CNOT n) lexicographically (Rule 2). Delete redundant pairs of gates by Rule 1. [Figure: circuit on wires x1, x2 before and after.]
This was a very complex proof of a fact that could be easily proven from swap gate properties. Our goal was however to show the application of all the introduced transformation techniques.
Complex Rules & Minimizing Example
Complex Transformation. [Figure: equivalence between two circuits built from the control sets C and C1 and the target set A on wires x_i, x_j.] We want to prove this by transformations.
Complex Transformation (1). By TR (1). [Figure: a pair of identical gates with controls C1 and targets A is inserted.]
Complex Transformation (2). Move them by TRs (2) & (3). [Figure: the moved gates, the added gates, and the pairs that cancel.]
Complex Transformation (3). [Figure: the resulting equivalence between the two circuits.] Thus we proved what we wanted to prove.
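Our strategy for using such transformations in more complex circuits: find the portion where the following transformation can be applied. [Figure: k gates with controls C ∪ C_i (i = 1 ... k) and targets A are rewritten using gates with controls C_1 ... C_k and a single gate with controls C and targets A.] Condition: k|A| > Σ|C_k|.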
[Figure: circuit on wires x1 to x7 with gates labeled a, b, c, d, e, f, g.] Best combination: (a, d), (b, e), (c, f).
Applying the transformation to (a, d), (b, e), (c, f) we get: [Figure: circuit on wires x1 to x7 with gates labeled g, h, i, j, k, l, a, b, c.] Now we can remove two pairs of columns.
[Figure: circuit on wires x1 to x7 with gates labeled g, a, b, c, l.] To get a solution with 6 gates:
By TR (1) we insert a pair of columns. [Figure: circuit on wires x1 to x7 with gates labeled g, a, b, c, m, n, l.]
[Figure: circuit on wires x1 to x7 with gates labeled g, a, b, c, m, n, l.] Best combination: (a, m), (b, g), (c, n).
Applying the transformation to (a, m), (b, g), (c, n): [Figure: circuit on wires x1 to x7 with gates labeled a, b, c, l.] Again we remove two pairs of columns.
[Figure: circuit on wires x1 to x7 with gates labeled a, b, c, o, p, q, l.] Best combination: (o, p), (b, c), (q, l).
Final result, after applying the transformation to (o, p), (b, c), (q, l). [Figure: circuit on wires x1 to x7 with gates labeled a, b, p, q.]
Other Stuff
Simulating CNF Formulas. F = (x 1 x 3 ) ( x 1 x 2 x 3 ). [Figure: circuit on wires x1 to x4 with auxiliary bits x5, x6 initialized to 0; gates G1, G2 compute the clause values g1, g2, gate p produces x4 ⊕ F, and gates G'2, G'1 restore the auxiliary bits.]
[Figure: circuit on wires x1 to x4 with output x4 ⊕ x1x3.]
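Feynman’s Notation. [Figure: the Toffoli gate and the controlled gate ∧2(U), drawn as control dots connected to a box U.]
Basic Gates in Quantum Circuits [BBC+95]. Basic gates: Controlled NOT and 1-bit unitary U. All unitary matrices can be decomposed into a combination of the above gates.
Controlled U Gate. U = [[u00, u01], [u10, u11]]. ∧m(U) is the identity matrix except for the last 2x2 block, which is [[u00, u01], [u10, u11]]: if all of x1 ... xm are 1, apply U to |y>. [Figure: the matrix of ∧m(U), annotated "2^m dimension".]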
Classical Reversible Gate (1): FT Gate. C=1 → swap I1 and I2; C=0 → no change. Universal. [Figure: inputs C, I1, I2 and outputs C, O1, O2.]
Classical Reversible Gate (2): Toffoli Gate. C1 = C2 = 1 → negate I. Universal. [Figure: inputs C1, C2, I and outputs C1, C2, O.]
Transformations from [BBC+95]. Barenco et al., a very important paper.
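Recall this example. [Figure: circuit on wires a, b, c, d, e, f, g.] Wire g first carries g ⊕ abc, so wire f becomes (g ⊕ abc)de ⊕ f. Wire g is then restored, since (g ⊕ abc) ⊕ abc = g, and finally (g ⊕ abc)de ⊕ f ⊕ deg = abcde ⊕ f.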
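Lemma 6.1. For any 2x2 unitary matrix U, the gate ∧2(U) can be decomposed using a matrix V with U = V^2. [Figure: ∧2(U) equals a network of controlled V, controlled V† and controlled V gates with Controlled NOT gates between them.]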
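The decomposition can be checked for one concrete case by following each control setting through the network. This sketch assumes the gate order from [BBC+95] (V controlled by x2, Controlled NOT from x1 to x2, V† controlled by x2, Controlled NOT from x1 to x2, V controlled by x1) and uses U = NOT with V a square root of NOT; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
V = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]   # a square root of X

def net_target_operator(v, x1, x2):
    """Operator applied to the target for classical control values x1, x2."""
    ops = [
        v if x2 else I2,                   # V controlled by x2
        dagger(v) if (x1 ^ x2) else I2,    # V-dagger controlled by x2 after CNOT x1 -> x2
        v if x1 else I2,                   # V controlled by x1 (x2 has been restored)
    ]
    total = I2
    for op in ops:
        total = matmul(op, total)          # later gates multiply on the left
    return total

for x1 in (0, 1):
    for x2 in (0, 1):
        expected = X if (x1 and x2) else I2
        assert close(net_target_operator(V, x1, x2), expected)
print("the network applies U only when both controls are 1")
```

When exactly one control is 1 the V and V† cancel, and when both are 1 the two V gates multiply to V^2 = U, which is the whole content of the lemma.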
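Lemma 7.5. For any 2x2 unitary matrix U with U = V^2, the multiply controlled U gate can be decomposed using V, V† and V. [Figure: a gate U controlled by x1 ... x8 on wires x1 to x9 equals a network of controlled V, V† and V gates.]
Corollary. Proof: by Lemma 7.5, the controlled U gate is decomposed into gates using V, V† and V. These gates cost Θ(1) (Cor. 5.3), Θ(n) (Cor. 7.4) and C_{n-2}, so C_{n-1} = C_{n-2} + Θ(n).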
[Figure: three circuits (a), (b), (c) on wires x1 ... xk and x_a with auxiliary bits initialized to 0. (a) uses gates G_i, g_i, G_j, g_j with auxiliary bits i, j; (b) and (c) use gates G_i, g'_i, G_j, g'_j with auxiliary bits i1 ... ik, j1 ... jk and gate groups l1, l2, l3 and r1, r2, r3.]