Applications of Quantum Computing in Bioinformatics
Tutorial Program – PM2, Monday, August 13, 2007
Hongwei Huo, Xidian University, Xi’an 710071, China, hwhuo@mail.xidian.edu.cn
Vojislav Stojkovic, Morgan State University, Baltimore, MD 21251, USA, stojkovi@jewel.morgan.edu
Preface 1.0 On Presenters 1.1 On Tutorial 1.2 Computing v. Computation
On Presenters Dr. Hongwei Huo has been a Professor at Xidian University, Xi’an, China, since 2001. She received her Ph.D. (2000) and MS (1991) degrees from Xidian University, Xi’an, China. Dr. Huo’s primary research interests are Algorithms and Data Structures, Bioinformatics (Cells and Atoms Computation), and Parallel and Distributed Computing. Dr. Huo has published more than 40 scientific research papers in proceedings and journals.
Dr. Vojislav Stojkovic has been an Associate Professor at Morgan State University, Baltimore, Maryland, USA, since 1989. He received his Ph.D. (1981) and MS (1977) degrees from the University of Belgrade, Belgrade, Serbia. Dr. Stojkovic’s primary research interests are Artificial Intelligence (Agents), Bioinformatics (DNA-Computation), and Informatics Security. Dr. Stojkovic has published more than 50 scientific research papers in proceedings and journals.
On Tutorial The tutorial “Applications of Quantum Computing in Bioinformatics” covers: - Computing with Cells and Atoms - Elements of Computer Science Related to Quantum Computing - Introduction to Quantum Computing - Some Applications of Quantum Computing in Bioinformatics - Further Readings The tutorial time is limited, so some material will only be mentioned and some will be skipped and left for homework.
Computing Computing is any goal-oriented activity requiring, benefiting from, or creating computers. Computing includes: - designing and building hardware and software systems for a wide range of purposes - processing, structuring, and managing various kinds of information - doing scientific studies using computers - making computer systems behave intelligently - finding and gathering information relevant to any particular purpose, and so on.
Computation Computation is a process following a well defined model that is understood and can be expressed by an algorithm, protocol, network topology, etc.
Computing with Cells and Atoms - DNA Computing - Membrane Computing - Quantum Computing
DNA Computing DNA computing is a form of computing which uses DNA and biochemistry and molecular biology technologies instead of the traditional silicon-based computer technologies. The main reasons for the interest in DNA-computations are: (i) size and variety of available DNA molecules (ii) massive parallelism achieved by biochemical operations on DNA molecules (iii) feasible and efficient models (iv) physical realizations of the models (v) performing computations in vivo.
Basic DNA Operations The most important basic DNA operations are: (i) Append (Concatenate, Rejoin) -- appends two DNA strands with ‘sticky ends’ (ii) Melt (Denature) -- separates two DNA strands with complementary sequences; the inverse operation, Anneal (Renature), joins them again (iii) Cut -- cuts a DNA strand with restriction enzymes.
Test Tube Operations
The most important test tube operations are:
(i) Union (Merge, Create) -- pours the contents of several tubes into one tube.
(ii) Copy (Duplicate, Amplify) -- makes copies of a tube.
(iii) Separate -- separates an assignment into a finite sequence of assignments sorted by the length of unit assignments.
(iv) Detect -- confirms the presence or absence of a unit assignment in a tube.
(v) Select -- selects a unit assignment from an assignment uniformly at random.
(vi) Append (Concatenate, Rejoin) -- appends a unit assignment to each unit assignment of an assignment.
(vii) Melt (Anneal, Renaturation) -- melts each unit assignment of an assignment with a unit assignment.
(viii) Extract -- extracts the contents of one tube into two tubes using a pattern unit assignment.
(ix) Remove -- removes unit assignments that contain occurrence(s) of other unit assignments.
(x) Cut -- cuts each unit assignment of an assignment to the given length.
(xi) Discard -- empties the tube.
DNA Representations
A DNA representation of a string c1 ... cm is a sequence c[1] ... c[m], where c[i], for i = 1, ..., m, is the character at position i. Characters c[i] are uniquely encoded by DNA strands.
A DNA representation of an unsigned integer number d1 ... dm is a sequence d[1] ... d[m], where:
- d[i], for i = 1, ..., m, is the digit at position i, and
- 0 <= d[i] <= base-1.
The base may be any integer greater than 1. Digits d[i] are uniquely encoded by DNA strands.
Example - DNA-Based Algorithm for Creating a Set of Unsigned Integer Binary Numbers
procedure CreateInputSet(m, T) {
  base = 2;
  T = {ε}; // ε is the empty string
  for (i = 1; i <= m; i++) {
    Copy(T, { T[base-1] });
    Parallel for (j = 0; j <= base-1; j++) {
      k = rand(base-1);
      Append(T[j], k, T[j]);
    }
    Union({ T[base-1] }, T);
    Discard({ T[base-1] });
  }
}
Example - DNA-Based Algorithm for Determining the Smallest Element of a Set of Unsigned Integer Binary Numbers
function Smallest(m, T) {
  for i = m to 1 do {
    Copy(T, { T[1] });
    Discard(T);
    Parallel {
      Remove(T[1], {i 0}); // T[1] keeps the strands with digit 1 at position i
      Remove(T[0], {i 1}); // T[0] keeps the strands with digit 0 at position i
    }
    if (Detect(T[0])) then Union(T[0], T);
    else Union(T[1], T);
  }
  return Select(T);
}
Program
The DNA-based program for determining the smallest element of a set of unsigned integer binary numbers of length m may be specified in the following way:
program S {
  m = 4;
  CreateInputSet(m, T);
  smallest = Smallest(m, T);
}
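The two procedures can be imitated in ordinary software to check the logic. The following is a minimal Python sketch, not the authors' implementation: a tube is modeled as a list of strings, `rng.randrange(base)` stands in for the `rand` call (an assumption about its range), and new digits are placed on the left, as the traced example suggests. The function names are hypothetical.

```python
import random

def create_input_set(m, base=2, rng=None):
    """Simulate CreateInputSet: return a tube (list of strings) of m-digit numbers."""
    rng = rng or random.Random()
    tube = [""]                                   # T = {ε}
    for _ in range(m):
        # Copy: one working tube per digit value
        copies = [list(tube) for _ in range(base)]
        for j in range(base):
            k = rng.randrange(base)               # assumption: a digit in 0..base-1
            # Append: the new digit goes on the left of every strand
            copies[j] = [str(k) + strand for strand in copies[j]]
        # Union: pour all working tubes back into T
        tube = [strand for copy in copies for strand in copy]
    return tube

def smallest(m, tube):
    """Simulate Smallest: scan digits from the most to the least significant."""
    for i in range(m, 0, -1):
        index = m - i                             # position i counted from the right
        t0 = [s for s in tube if s[index] == "0"] # strands with digit 1 removed
        t1 = [s for s in tube if s[index] == "1"] # strands with digit 0 removed
        tube = t0 if t0 else t1                   # Detect(T[0]), then Union
    return tube[0]                                # Select: remaining strands are equal

tube = create_input_set(4, rng=random.Random(0))
print(sorted(set(tube)), "->", smallest(4, tube))
```

Because every surviving strand agrees on each scanned digit, the final Select is deterministic even though it is specified as a random choice.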
Test Example
m = 4;
CreateInputSet(4, T);
base = 2;
T = {ε}; // ε is the empty string
  T: ε
i = 1;
Copy(T, { T[1] });
  T[0]: ε
  T[1]: ε
Parallel {
  k = rand(1); // for example k = 0
  Append(T[0], k, T[0]);
  k = rand(1); // for example k = 1
  Append(T[1], k, T[1]);
}
  T[0]: 0ε
  T[1]: 1ε
Union({ T[1] }, T);
  T: 0ε 1ε
Discard({ T[1] });
i = 2;
Copy(T, { T[1] });
  T[0]: 0ε 1ε
  T[1]: 0ε 1ε
Parallel {
  k = rand(1); // for example k = 0
  Append(T[0], k, T[0]);
  k = rand(1); // for example k = 0
  Append(T[1], k, T[1]);
}
  T[0]: 00ε 01ε
  T[1]: 00ε 01ε
Union({ T[1] }, T);
  T: 00ε 01ε 00ε 01ε
Discard({ T[1] });
And so on, for example with k = 1 and k = 0 in the next step.
And finally
  T: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε 1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε
smallest = Smallest(4, T);
i = 4;
Copy(T, { T[1] });
Discard(T);
Parallel { Remove(T[1], {4 0}); Remove(T[0], {4 1}); }
  T[0]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
  T[1]: 1100ε 1101ε 1100ε 1101ε 1000ε 1001ε 1000ε 1001ε
if (Detect(T[0])) then Union(T[0], T); else Union(T[1], T);
  T: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
i = 3;
Copy(T, { T[1] });
  T[0]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
  T[1]: 0100ε 0101ε 0100ε 0101ε 0000ε 0001ε 0000ε 0001ε
Parallel { Remove(T[1], {3 0}); Remove(T[0], {3 1}); }
  T[0]: 0000ε 0001ε 0000ε 0001ε
  T[1]: 0100ε 0101ε 0100ε 0101ε
if (Detect(T[0])) then Union(T[0], T); else Union(T[1], T);
  T: 0000ε 0001ε 0000ε 0001ε
i = 2;
Copy(T, { T[1] });
  T[0]: 0000ε 0001ε 0000ε 0001ε
  T[1]: 0000ε 0001ε 0000ε 0001ε
Parallel { Remove(T[1], {2 0}); Remove(T[0], {2 1}); }
  T[0]: 0000ε 0001ε 0000ε 0001ε
  T[1]: (empty, because no strand has digit 1 at position 2)
if (Detect(T[0])) then Union(T[0], T); else Union(T[1], T);
  T: 0000ε 0001ε 0000ε 0001ε
i = 1;
Copy(T, { T[1] });
  T[0]: 0000ε 0001ε 0000ε 0001ε
  T[1]: 0000ε 0001ε 0000ε 0001ε
Parallel { Remove(T[1], {1 0}); Remove(T[0], {1 1}); }
  T[0]: 0000ε 0000ε
  T[1]: 0001ε 0001ε
if (Detect(T[0])) then Union(T[0], T); else Union(T[1], T);
  T: 0000ε 0000ε
return Select(T);
smallest = 0000ε;
Example - The Traveling Salesman Problem
The traveling salesman has a map of several cities with one-way roads between some of the cities. The goal of the Traveling Salesman problem, in the form considered here (the directed Hamiltonian path problem), is to find a route (path), if one exists, from the start-city to the end-city that passes through all the cities exactly once. For our DNA computation we chose the following graph (map of cities) of 7 nodes (cities) and 13 edges (roads).
[Figure: Map of cities and roads]
The Traveling Salesman problem is a difficult problem for conventional von Neumann computers when it involves large numbers of cities. It is considered intractable on conventional von Neumann computers, but small instances can be solved in a relatively simple way on massively parallel (non-von Neumann) computers such as DNA computers. Adleman and his successors, including us, chose the Traveling Salesman problem as the test example because it is a well-known problem that is appropriate for solving on massively parallel computers.
The following algorithm solves the Traveling Salesman problem: (1) Generate random paths through the graph representing the given road map between the cities (2) Keep only those paths with n cities, where n is the number of the cities (3) Keep only those paths that begin with the start-city (4) Keep only those paths that conclude with the end-city (5) Keep only those paths that enter all cities at least once (6) Any remaining paths are solutions.
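The generate-and-filter steps (1) to (6) can be sketched in software. This is an illustrative Python version under stated assumptions: the 4-city graph below is hypothetical (the 7-city map of the slides is not reproduced in the text), and random sampling stands in for the massive parallelism of DNA, so a finite number of samples may miss a solution.

```python
import random

def random_path(graph, length, rng):
    """Step (1): follow random one-way roads to build a path of up to `length` cities."""
    path = [rng.choice(sorted(graph))]
    while len(path) < length and graph[path[-1]]:
        path.append(rng.choice(graph[path[-1]]))
    return path

def solve(graph, start, end, samples=5000, seed=0):
    rng = random.Random(seed)
    n = len(graph)
    paths = {tuple(random_path(graph, n, rng)) for _ in range(samples)}  # (1)
    paths = {p for p in paths if len(p) == n}            # (2) exactly n cities
    paths = {p for p in paths if p[0] == start}          # (3) begins with start-city
    paths = {p for p in paths if p[-1] == end}           # (4) ends with end-city
    paths = {p for p in paths if set(p) == set(graph)}   # (5) enters every city
    return paths                                         # (6) remaining paths

# Hypothetical map: each city maps to the cities reachable by a one-way road.
GRAPH = {"A": ["B", "C"], "B": ["C", "D"], "C": ["B", "D"], "D": []}
print(solve(GRAPH, "A", "D"))
```

Steps (2) and (5) together force each city to appear exactly once, which is why the weaker "at least once" test in step (5) is sufficient.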
Adleman used DNA to perform the steps of the above algorithm as follows. He used strands consisting of 20 nucleotides, representing each city (vertex of the path) by a separate DNA strand and each edge between two consecutive vertices by a DNA strand. An edge strand consisted of the last ten nucleotides of the first vertex strand followed by the first ten nucleotides of the second vertex strand. (1) Toss a sufficient number of DNA strands into a DNA computer (test tube). As a result of the biochemical process, the DNA molecules combine to make multiple copies of every possible path. (2)-(5) Use chemical techniques of molecular biology to extract the paths.
We have used agents to perform all steps of the above algorithm. In the Traveling Salesman Problem, we represented graph nodes by DNA-s agents. Current MAC-G4 hardware limitations allow us to work with only about 10000 agents, implemented in the Easel programming language. Although this is quite distant from billions of DNA sequences, it is sufficient to demonstrate the basic idea and to provide a solution to the problem.
Membrane Computing Membrane computing is a natural computing inspired by the structure and functions of living cells. It connects: - distributed parallel computing models and - membrane systems. The field of membrane computing was initiated in 1998 by Gheorghe Paun.
Quantum Computing Quantum computing is a multidisciplinary research area bringing together ideas from: - information theory, - computer science, - mathematics, and - quantum physics.
Elements of Computer Science Related to Quantum Computation - Turing Machines - The Church-Turing Thesis - Complexity Classes
Turing Machines A Turing machine (TM) is a basic abstract symbol-manipulating machine which can simulate any computer that could possibly be constructed. A Turing machine that is able to simulate any other Turing machine is called a Universal Turing machine (UTM). Studying abstract properties of TMs and UTMs yields many insights into computer science and complexity theory. Turing and others proposed mathematical computing models that allow algorithms to be studied independently of any particular computer hardware. This abstraction is invaluable.
Informal Description of TM A Turing machine consists of: - A tape - A head - A table - A state register
A Tape A tape is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. The alphabet contains a special blank symbol (here written as 'B') and one or more other symbols. The tape is assumed to be arbitrarily extendable to the left and to the right. Cells that have not been written to before are assumed to be filled with the blank symbol. In some models: - the tape has a left end marked with a special symbol; - the tape extends or is indefinitely extensible to the right.
A Head A head can: - read and write symbols on the tape and - move the tape left and right one (and only one) cell at a time. In some models the head moves and the tape is stationary.
A Table A table (transition function) of instructions (usually 5-tuples but sometimes 4-tuples) for: - the state the machine is currently in and - the symbol it is reading on the tape tells the machine what to do. In some models, if there is no entry in the table for the current combination of symbol and state then the machine will halt. Other models require all entries to be filled.
The 5-tuple Models In the case of the 5-tuple models, each instruction tells the machine to: (i) either erase or write a symbol, and then (ii) move the head ('L' for one step left or 'R' for one step right), and then (iii) assume the same or a new state as prescribed.
The 4-tuple Models In the case of the 4-tuple models, each instruction tells the machine to: (ia) erase or write a symbol or (ib) move the head left or right, and then (ii) assume the same or a new state as prescribed, but not both actions (ia) and (ib) in the same instruction.
A State Register A state register stores the state of the Turing machine. The number of different states is always finite and there is one special start state with which the state register is initialized.
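Formal Description of TM
A (one-tape) Turing machine is a 7-tuple $M = \langle Q, \Gamma, b, \Sigma, \delta, q_0, F \rangle$ where:
- $Q$ is a finite set of states
- $\Gamma$ is a finite set of tape symbols (the tape alphabet)
- $b \in \Gamma$ is the blank symbol
- $\Sigma$, a subset of $\Gamma$ not including $b$, is the set of input symbols
- $\delta: Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$ is the transition function, where L is left shift and R is right shift
- $q_0 \in Q$ is the initial state
- $F \subseteq Q$ is the set of final or accepting states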
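To make the 7-tuple concrete, here is a small Python sketch of a 5-tuple-style simulator. It is an illustration only: the function name `run_tm`, the state names `q0` and `qf`, and the bit-flipping machine are hypothetical and do not come from the tutorial. The blank is written 'B', and the machine halts when no table entry exists.

```python
def run_tm(delta, tape, state, accept, blank="B", max_steps=10_000):
    """Run a one-tape TM; delta maps (state, symbol) to (new state, symbol, move)."""
    cells = dict(enumerate(tape))       # sparse tape: unwritten cells are blank
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if state in accept or (state, symbol) not in delta:
            break                       # accepting state or no table entry: halt
        state, cells[head], move = delta[(state, symbol)]
        head += 1 if move == "R" else -1
    if not cells:
        return state, ""
    lo, hi = min(cells), max(cells)
    content = "".join(cells.get(i, blank) for i in range(lo, hi + 1))
    return state, content.strip(blank)

# Hypothetical machine: invert every bit, then accept at the first blank.
FLIP = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "B"): ("qf", "B", "R"),
}
print(run_tm(FLIP, "0110", "q0", {"qf"}))
```

Here Q = {q0, qf}, Γ = {0, 1, B}, Σ = {0, 1}, the dictionary plays the role of δ, and F = {qf}.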
Example – Copy String
The Church-Turing Thesis The Church-Turing thesis states that Turing machines: - capture the informal notion of effective method in logic and mathematics - provide a precise definition of an algorithm or 'mechanical procedure'.
Turing proposed the following hypothesis: Every ‘function which would naturally be regarded as computable’ can be computed by the universal Turing machine. It should be noted that it is not clear what precisely “a function which would naturally be regarded as computable” means. Due to this ambiguity, this statement is not subject to rigorous proof. There is strong evidence for this hypothesis: - Many diverse models of computation compute the same set of functions as a Turing machine - No counterexamples to the hypothesis are known.
The “Power” of Computing Machines The Church-Turing thesis gives insight into the “power” of computing machines. If a computing machine can solve all problems that a Turing machine can solve, then the computing machine is as powerful as a Turing machine.
Computable or Not Computable? It is interesting and important to determine whether a problem is computable or not computable. [Diagram: Problem -> Computable? / Not Computable?]
Solvable or Not Solvable? This distinction is not fine enough to determine whether a problem is solvable by a physical computer in a reasonable amount of time. If a computation takes hundreds of years to produce a result, a pragmatist may regard the problem as not computable. [Diagram: Problem -> Solvable? / Not Solvable?]
The Complexity of Algorithms An algorithm can be characterized by: - the number of operations and - the amount of memory it requires to compute a result for a given input of size N. These characterizations determine what is called the complexity of the algorithm. Specifically, the complexity of an algorithm is determined by analyzing how the number of operations and the memory usage grow with the size of the input.
Complexity Classes Problems are grouped into the following complexity classes: - P: Polynomial time - NP: Nondeterministic polynomial time - NP-complete
P, NP, NP-complete
P: Polynomial time. The running time of the given algorithm is, in the worst case, some polynomial function of the size of the input. For example, an algorithm that needs n^4 + 7n^2 + 5 operations for an input of size n (that is, 10^4 + 7*10^2 + 5 operations for an input of 10 bits) is a polynomial time algorithm.
NP: Nondeterministic polynomial time. A candidate answer can be verified as correct or not in polynomial time.
NP-complete: The set of the hardest problems in NP. If any member is in P, then the set P is equal to the set NP.
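The difference between verifying and finding an answer can be illustrated with subset sum, a standard NP-complete problem chosen here only as an example (it is not discussed in the tutorial). In this Python sketch, `verify` checks a candidate in polynomial time, while `search` tries all subsets and therefore takes exponential time in the worst case. Both function names are hypothetical.

```python
from itertools import combinations

def verify(numbers, target, candidate):
    """Polynomial-time check that `candidate` is a sub-multiset summing to `target`."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)                  # each number may be used only once
    return sum(candidate) == target

def search(numbers, target):
    """Brute force over all 2^n subsets: exponential in len(numbers)."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None

numbers = [3, 34, 4, 12, 5, 2]
answer = search(numbers, 9)
print(answer, verify(numbers, 9, answer))
```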
Tractable v. Intractable Problems which can be solved in polynomial time or less are “tractable” problems. Problems which cannot be solved in polynomial time (or in other words require more than polynomial time) are “intractable” problems. Example An algorithm that needs 2^n operations for an input of size n runs in exponential time; a problem for which only such algorithms exist is intractable.
Open Questions Whether or not P and NP describe the same set of problems is one of the most important open questions in computer science. If a polynomial time algorithm for any NP-complete problem were discovered, then we would know that P and NP are the same set. Many NP-complete problems have known algorithms which can compute their solutions in exponential time. It is unknown whether any polynomial time algorithms exist for them.
When discussing the complexity of a problem it is important to distinguish between there being: - no known algorithm to compute it in polynomial time, and - no algorithm to compute it in polynomial time. There are many problems whose best known algorithm requires more than a polynomial number of steps to compute a solution. These problems are generally considered to be intractable. Determining the prime factors of a large number is one such problem, and its assumed intractability is the basis for much of public-key cryptography.
Introduction to Quantum Computing - What is Quantum Computing? - Do We Need Quantum Computing? - Major Events in the History of Quantum Computing - The Power of Quantum Computing - Quantum Algorithms - A Quantum Computer - Problems in Building Quantum Computers - Quantum Computer Candidates - At Present … - Million Dollars Questions - Turing is Forever
- A Quantum Information - A Quantum Bit (Qubit) - The Bloch Sphere - n-Qubit System - Superposition - Measurement - Quantum Gates - Quantum Circuits - Quantum Parallelism - BQP & BPP
What is Quantum Computing? Quantum computing is a multidisciplinary area bringing together ideas from: - information theory - computer science - mathematics and - quantum physics.
Do We Need Quantum Computing? - For some problems, quantum computing is more powerful than classical computing. - More can be computed in less time. - The complexity classes are believed to be different.
Major Events in the History of Quantum Computing - 1982 - Feynman proposed the idea of building computing machines based on the laws of quantum mechanics instead of the laws of classical physics. - 1985 - David Deutsch described the universal quantum Turing machine. - 1994 - Peter Shor developed a quantum algorithm to factor very large numbers in polynomial time. - 1997 - Lov Grover developed a quantum search algorithm with O(√N) complexity.
The Power of Quantum Computing - Each of exponentially many possibilities can be used to perform a part of a computation at the same time . - In quantum systems possibilities count, even if they never happen!
Quantum Algorithms Quantum algorithms are: - logically completely different from current conventional algorithms, which is the big problem in designing them - for some problems, much faster than current conventional algorithms
A Quantum Computer A quantum computer is any computing device that uses distinctively quantum mechanical phenomena, such as superposition and entanglement, to perform operations on data. The behavior of quantum computers is governed by the laws of quantum mechanics. The hardware of quantum computers is fundamentally different from the hardware of classical (or conventional) computers. In a classical (or conventional) computer, the amount of data is measured in bits. In a quantum computer, the amount of data is measured in qubits (quantum bits).
Problems in Building Quantum Computers There are a number of practical difficulties in building a quantum computer. David DiVincenzo of IBM listed the following requirements for a practical quantum computer: - physically scalable to increase the number of qubits - qubits can be initialized to arbitrary values - quantum gates faster than decoherence time - universal gate set - qubits can be read easily
Quantum Computer Candidates There are a number of quantum computer candidates: - Superconductor-based quantum computers - Trapped ion quantum computer - Electrons on helium quantum computers - "Nuclear magnetic resonance on molecules in solution“-based - "Quantum dot on surface"-based - "Cavity quantum electrodynamics" (CQED)-based - "Molecular magnet"-based - Fullerene-based ESR quantum computer - Solid state NMR Kane quantum computers - Optic-based quantum computers (Quantum optics) - Topological quantum computer - Spin-based quantum computer - Adiabatic Quantum Computing - Diamond-based quantum computers
At Present … Quantum computers can solve only trivial problems. Quantum computers large enough to execute quantum algorithms efficiently are not yet available.
Million Dollar Questions - Can a quantum computer solve problems which are incomputable on a classical computer? Quantum computers may be faster than classical (von Neumann) computers, but quantum computers cannot solve any problem that classical (von Neumann) computers, given enough time and memory, cannot solve. - Can a quantum computer make intractable problems tractable? In some cases the answer is “yes”.
Turing is Forever A Turing machine can simulate a quantum computer, so a quantum computer cannot solve an undecidable problem such as the halting problem. Quantum computers do not disprove the Church-Turing thesis: “Everything that is computable is computable by a Turing machine”
Quantum Information Quantum information is physical information that is held in a "state" of a quantum system. Quantum information differs from classical information in several respects: - It cannot be read without the state becoming the measured value - An arbitrary state cannot be cloned - The state may be in a superposition of basis values.
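A Quantum Bit (Qubit)
A quantum bit (qubit) is a unit of quantum information. A qubit $|z\rangle$ is a unit vector in the two-dimensional vector space $\mathbb{C}^2$ over the complex numbers $\mathbb{C}$.
For each qubit $|z\rangle$ there are two complex numbers $a, b \in \mathbb{C}$ such that
$|z\rangle = a|0\rangle + b|1\rangle = \begin{pmatrix} a \\ b \end{pmatrix}$, where $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, and $|a|^2 + |b|^2 = 1$.
[Figure: the vector |z> in the plane, with component a along |0> (x axis) and component b along |1> (y axis)]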
The Bloch Sphere A bit of data is represented by a single atom that is in one of two basis states denoted by |0> and |1>. A single bit of this form is a qubit. A “qubit” is the quantum equivalent of a bit. A physical implementation of a qubit could use the two energy levels of an atom: - an excited state representing |1> and - a ground state representing |0>. The Bloch sphere is a representation of a qubit, the fundamental building block of quantum computers.
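N-Qubit System
An n-qubit system is a superposition of the form
$|\Psi\rangle = \sum_{x=00\ldots0}^{11\ldots1} c_x |x\rangle$,
where the $c_x$ are complex numbers such that $\sum_x |c_x|^2 = 1$.
2-Qubit System
A 2-qubit system can be represented by 2 copies of $\mathbb{C}^2$ tensored together, that is $\mathbb{C}^2 \otimes \mathbb{C}^2$. The state space is $2^2 = 4$-dimensional. If $|0\rangle$ and $|1\rangle$ are the vectors of a basis in $\mathbb{C}^2$, then the set $\{|0\rangle \otimes |0\rangle, |0\rangle \otimes |1\rangle, |1\rangle \otimes |0\rangle, |1\rangle \otimes |1\rangle\} = \{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ is a basis in $\mathbb{C}^2 \otimes \mathbb{C}^2$.
2-Qubit System
$|00\rangle = |0\rangle \otimes |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 \\ 1 \cdot 0 \\ 0 \cdot 1 \\ 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$
$|01\rangle = |0\rangle \otimes |1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \cdot 0 \\ 1 \cdot 1 \\ 0 \cdot 0 \\ 0 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$
$|10\rangle = |1\rangle \otimes |0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \cdot 1 \\ 0 \cdot 0 \\ 1 \cdot 1 \\ 1 \cdot 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}$
$|11\rangle = |1\rangle \otimes |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \cdot 0 \\ 0 \cdot 1 \\ 1 \cdot 0 \\ 1 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$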
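The four basis vectors of the 2-qubit system can be checked numerically. The sketch below uses plain Python lists and a hypothetical helper `kron` for the tensor (Kronecker) product of two state vectors; it is meant only to mirror the component-wise products written out on the slide.

```python
def kron(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [a * b for a in u for b in v]

ZERO, ONE = [1, 0], [0, 1]              # |0> and |1>

for name, left, right in [("|00>", ZERO, ZERO), ("|01>", ZERO, ONE),
                          ("|10>", ONE, ZERO), ("|11>", ONE, ONE)]:
    print(name, kron(left, right))
```

Applying `kron` repeatedly builds n-qubit basis states; the single 1 ends up at the index whose binary expansion is the bit pattern.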
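n-Qubit System An n-qubit system, where n ≥ 2, can be represented by n copies of $\mathbb{C}^2$ tensored together, that is $\mathbb{C}^2 \otimes \ldots \otimes \mathbb{C}^2$. The state space is $2^n$-dimensional. The set $\{|00 \ldots 0\rangle, |00 \ldots 1\rangle, \ldots, |11 \ldots 1\rangle\} = \{|0\rangle \otimes |0\rangle \otimes \ldots \otimes |0\rangle, |0\rangle \otimes |0\rangle \otimes \ldots \otimes |1\rangle, \ldots, |1\rangle \otimes |1\rangle \otimes \ldots \otimes |1\rangle\}$ is a basis in $\mathbb{C}^2 \otimes \ldots \otimes \mathbb{C}^2$.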
Superposition
A single qubit can be forced into a superposition of the two states, denoted by the addition of the state vectors:
$|\psi\rangle = a|0\rangle + b|1\rangle$, where $a$ and $b$ are complex numbers and $|a|^2 + |b|^2 = 1$.
If such a superposition is measured when it is in the state $a|0\rangle + b|1\rangle$, then we get
- $|0\rangle$ with probability $|a|^2$
- $|1\rangle$ with probability $|b|^2$
More than One Qubit
If we concatenate two qubits
$(a_0|0\rangle + a_1|1\rangle) \otimes (b_0|0\rangle + b_1|1\rangle)$
we have a 2-qubit system with 4 basis states
$|0\rangle|0\rangle = |00\rangle$, $|0\rangle|1\rangle = |01\rangle$, $|1\rangle|0\rangle = |10\rangle$, $|1\rangle|1\rangle = |11\rangle$
and we can also describe the state as
$a_0 b_0|00\rangle + a_0 b_1|01\rangle + a_1 b_0|10\rangle + a_1 b_1|11\rangle$
More than One Qubit (2)
In general we can have arbitrary superpositions
$a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle$
satisfying $|a_{00}|^2 + |a_{01}|^2 + |a_{10}|^2 + |a_{11}|^2 = 1$.
States for which there is no factorization into the tensor product of two independent qubits are called entangled.
Measurement
If we measure both bits of $a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle$, we get $|xy\rangle$ with probability $|a_{xy}|^2$.
Summing $|a|^2$ over the amplitudes of all states matching an output bit-pattern gives the probability that the pattern will be read.
Example: $0.316|00\rangle + 0.447|01\rangle + 0.548|10\rangle + 0.632|11\rangle$
The probability to read the rightmost bit as 0 is $|0.316|^2 + |0.548|^2 \approx 0.4$.
Measurement during a computation changes the state of the system but can be used in some cases to increase efficiency (measure and halt or continue).
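The measurement example can be reproduced with a few lines of Python. This sketch assumes the state is stored as a list indexed by the basis state read as a binary number, with bit 0 as the rightmost bit; the function name `probability_of_bit` is hypothetical.

```python
def probability_of_bit(amplitudes, bit_index, value):
    """Probability that one bit of a measured register reads `value`.

    amplitudes[x] is the amplitude of basis state |x>;
    bit_index 0 is the rightmost bit.
    """
    return sum(abs(a) ** 2
               for x, a in enumerate(amplitudes)
               if ((x >> bit_index) & 1) == value)

# Amplitudes of |00>, |01>, |10>, |11> from the example.
state = [0.316, 0.447, 0.548, 0.632]
print(round(probability_of_bit(state, 0, 0), 3))
```

The amplitudes in the example are rounded square roots of 0.1, 0.2, 0.3 and 0.4, so the probabilities sum to 1 only approximately.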
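Quantum Gates
One-input gate: NOT
Input state: $a_0|0\rangle + a_1|1\rangle$
Output state: $a_1|0\rangle + a_0|1\rangle$
Pure states are mapped thus: $|0\rangle \to |1\rangle$ and $|1\rangle \to |0\rangle$
Gate operator (matrix) is $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, with $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
As expected, applying NOT twice gives the identity.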
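A gate acts on a state by matrix-vector multiplication. The following Python sketch applies the NOT matrix to amplitude pairs; `apply` is a hypothetical helper written with plain lists so that no external library is needed.

```python
def apply(gate, state):
    """Multiply a gate matrix (list of rows) by a state vector (list)."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

X = [[0, 1],
     [1, 0]]                            # NOT gate

print(apply(X, [1, 0]))                 # |0> is mapped to |1>
print(apply(X, [0.6, 0.8]))             # amplitudes a0 and a1 are swapped
print(apply(X, apply(X, [0.6, 0.8])))   # NOT applied twice restores the input
```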
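Quantum Gates (2)
One-input gate: Hadamard (H)
Maps $|0\rangle \to \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$ and $|1\rangle \to \frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle$.
Ignoring the normalization factor $\frac{1}{\sqrt{2}}$, we can write $|x\rangle \to (-1)^x |x\rangle + |1 - x\rangle$.
One-input gate: Phase shift by $\delta$, which maps $|0\rangle \to |0\rangle$ and $|1\rangle \to e^{i\delta}|1\rangle$.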
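The Hadamard mapping can be checked numerically in the same matrix-vector style. This Python sketch uses a hypothetical helper `apply` and the standard Hadamard matrix with entries of magnitude 1/√2.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]                           # Hadamard gate

def apply(gate, state):
    """Multiply a gate matrix (list of rows) by a state vector (list)."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

plus = apply(H, [1, 0])                 # image of |0>: equal superposition
minus = apply(H, [0, 1])                # image of |1>: relative sign flipped
print(plus, minus, apply(H, plus))      # applying H again returns to |0>
```

Measuring either image gives 0 or 1 with probability 1/2 each; the two states differ only in the relative sign, which a second Hadamard turns back into a definite bit.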
Quantum Gates (3)
Universal One-input Gate Sets
Requirement: $|0\rangle \to$ any state $|\psi\rangle$.
Hadamard and phase-shift gates form a universal gate set of 1-qubit gates: every 1-qubit gate can be built from them.
Example: The following circuit generates $|\psi\rangle = \cos\theta\,|0\rangle + e^{i\phi}\sin\theta\,|1\rangle$ up to a global factor.
[Circuit: $|0\rangle$, H, phase shift $2\theta$, H, phase shift $\pi/2 + \phi$]
Quantum Gates (4)
Common gates: Rotation (U), Hadamard (H), CNOT, and CPHASE.
There are many small “universal” sets of gates.
Gates must be unitary: $U^\dagger U = U U^\dagger = I$, where $U^\dagger$ is the Hermitian adjoint of $U$.
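Quantum Circuits for Boolean Functions
It is known that for any Boolean function $f: \{0, 1\} \to \{0, 1\}$ it is possible to construct a quantum circuit $U_f$ that computes it. Specifically, to each binary function $f$ corresponds a quantum circuit
$U_f: |x, y\rangle \to |x, y \oplus f(x)\rangle$, where $\oplus$ is binary addition (addition modulo 2).
Quantum Circuits for Boolean Functions (2)
What can this circuit $U_f$ do?
Example: with inputs $x = |0\rangle$ and $y = |1\rangle$,
$|\psi\rangle = U_f(|0\rangle \otimes |1\rangle) = U_f|01\rangle = |0, 1 \oplus f(0)\rangle$
[Circuit: $U_f$ with inputs $x$, $y$ and outputs $x$, $y \oplus f(x)$]
Quantum Circuits for Boolean Functions (3)
But what if the input is a superposition? With inputs $x = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $y = |0\rangle$, the output is
$|\psi\rangle = \frac{1}{\sqrt{2}}(|0, f(0)\rangle + |1, f(1)\rangle)$
Both values of the function show up in the final state. We have computed f(0) and f(1) at the same time.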
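The action of U_f on a superposed input can be simulated classically for two qubits. In this Python sketch the state is a list of four amplitudes, with the amplitude of |x, y> stored at index 2x + y (an indexing convention assumed here); `u_f` is a hypothetical name.

```python
import math

def u_f(f, state):
    """Apply U_f: |x, y> -> |x, y XOR f(x)> to a 2-qubit state vector.

    state[2 * x + y] is the amplitude of |x, y>.
    """
    out = [0.0] * 4
    for x in (0, 1):
        for y in (0, 1):
            out[2 * x + (y ^ f(x))] += state[2 * x + y]
    return out

s = 1 / math.sqrt(2)
superposed = [s, 0.0, s, 0.0]           # (|0> + |1>)/sqrt(2) tensor |0>

# With f(x) = x the result is (|0,0> + |1,1>)/sqrt(2): both f(0) and f(1) appear.
print(u_f(lambda x: x, superposed))
```

A classical simulation like this stores all 2^n amplitudes explicitly, so it illustrates the algebra but not the speed of a quantum device.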
Quantum Parallelism Quantum parallelism is that feature of quantum computers which makes it possible to evaluate a function f(x) on many different values of x simultaneously. We will look at an example of quantum parallelism now – how to compute f(0) and f(1) for some function f all in one go!
Quantum Parallelism Summary So, a superposition of inputs gives a superposition of outputs: many computations are performed simultaneously, although a measurement reveals only one outcome. This is what powers famous quantum algorithms, such as: - Shor’s algorithm for factoring, and - Grover’s algorithm for searching. A simple quantum algorithm: Deutsch’s algorithm.
BQP & BPP The BQP ("Bounded-error, Quantum, Polynomial-time") class of problems is the set of problems solvable by a quantum computer in polynomial time with a probability of error of at most 1/3 for all instances. The BPP (“Bounded-error, Probabilistic, Polynomial-time”) class of problems is the set of problems solvable by a probabilistic Turing machine in polynomial time with a probability of error of at most 1/3 for all instances. BQP is the quantum analogue of BPP, and BPP is contained in BQP.
Some Applications of Quantum Computing in Bioinformatics - Sequence Alignment - Matches, Mismatches and Indels - Basic Algorithmic Problem or Two Sequences Alignment - Multiple Sequences Alignments - Sum-of-Pairs Score (SPS) - Bad News - An Evolutionary Algorithm (EA) - Quantum Evolutionary Algorithms (QEAs) for Multiple Sequences Alignments - Multiple Sequences Alignments (2) - Quantum Representation of Sequences Alignment - Quantum Representation of Sequences Alignment - How to Get Sequences Alignment? - How to Get an Alignment ?
- Objective Function - Evolutionary - Choices of Rotation Angles - Evaluations and Quantum Gate - Two-phase Optimization - Open Questions - Demo
Sequence Alignment
Definition An alignment of two strings S and S’ is obtained by inserting spaces into the two strings, or at their beginnings or ends, so that the resulting strings (including the spaces) have the same length.
A–TC–A
–CTCAA
An alignment is intended to show the similarity of two sequences (strings).
Matches, Mismatches and Indels Two aligned, identical characters in an alignment are a match. Two aligned, unequal characters are a mismatch. A character aligned with a space, represents an indel (insertion/deletion). In biology, in the case of a mutation model: - a mismatch represents a mutation - an indel represents a historical insertion or deletion of a single character
Example
AACTACT–CCTAACACT––
––CTCCTACCT––TACTTT
10 matches, 2 mismatches, 7 indels
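The counts in this example can be checked column by column. The Python sketch below writes the gap symbol as an ASCII hyphen and uses a hypothetical function name `classify`.

```python
def classify(row1, row2, gap="-"):
    """Count (matches, mismatches, indels) over the columns of a pairwise alignment."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = indels = 0
    for a, b in zip(row1, row2):
        if a == gap or b == gap:
            indels += 1                 # a character aligned with a space
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels

print(classify("AACTACT-CCTAACACT--", "--CTCCTACCT--TACTTT"))
```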
Basic Algorithmic Problem or Two Sequences Alignment Find an alignment of the two strings S and S’ that maximizes the value (the number of matches) − (the number of mismatches) − (the number of spaces). That maximum value defines the similarity of the two strings. Two Sequences Alignment is also called Two Sequences Optimal Global Alignment.
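For two sequences this optimum can be computed exactly by dynamic programming (the Needleman-Wunsch recurrence, a standard technique that the tutorial does not name). The Python sketch below assumes unit weights, +1 per match and -1 per mismatch or space, so the returned value equals matches minus mismatches minus spaces; `similarity` is a hypothetical name.

```python
def similarity(s, t, match=1, mismatch=-1, space=-1):
    """Value of the best global alignment of s and t, keeping one table row at a time."""
    prev = [j * space for j in range(len(t) + 1)]   # empty prefix of s against t
    for i in range(1, len(s) + 1):
        cur = [i * space]                           # prefix of s against empty t
        for j in range(1, len(t) + 1):
            diagonal = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur.append(max(diagonal,                # align s[i-1] with t[j-1]
                           prev[j] + space,         # s[i-1] against a space
                           cur[j - 1] + space))     # t[j-1] against a space
        prev = cur
    return prev[-1]

print(similarity("ACGT", "AGT"))
```

The running time is proportional to the product of the two lengths, which is why the two-sequence problem is tractable while the multiple-sequence version is not.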
Multiple Sequences Alignments
In general we also need: - to compare multiple sequences and - to find the similarities. Multiple sequences alignment generalizes the alignment idea to handle many sequences. Multiple Sequences Alignment is also called Multiple Sequences Optimal Global Alignment.
AT–C–TCGAT
–TGCAT––AT
ATCCA–CGCT
Sum-of-Pairs Score (SPS)
For a given multiple sequences alignment, the sum-of-pairs score (SPS) is the sum of the induced pairwise alignment scores for each pair of sequences in the alignment. The sum-of-pairs score increases with the number of correctly aligned sequences.
1: AT–C–TCGAT
2: –TGCAT––AT
3: ATCCA–CGCT
The induced pairwise alignments are those of rows (1, 2), rows (1, 3), and rows (2, 3).
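The sum-of-pairs score of the three-row example can be computed directly. The scoring weights in this Python sketch (+1 match, -1 mismatch, -1 for a character against a gap, gap-gap columns ignored) are assumptions chosen for illustration; the tutorial does not fix them here. Gaps are written as ASCII hyphens and the function names are hypothetical.

```python
from itertools import combinations

def pair_score(row1, row2, match=1, mismatch=-1, gap_penalty=-1, gap="-"):
    """Score of the pairwise alignment induced by two rows of a multiple alignment."""
    score = 0
    for a, b in zip(row1, row2):
        if a == gap and b == gap:
            continue                    # column disappears in the induced alignment
        if a == gap or b == gap:
            score += gap_penalty
        elif a == b:
            score += match
        else:
            score += mismatch
    return score

def sum_of_pairs(alignment, **weights):
    """Sum the induced pairwise scores over every pair of rows."""
    return sum(pair_score(r1, r2, **weights)
               for r1, r2 in combinations(alignment, 2))

ALIGNMENT = ["AT-C-TCGAT", "-TGCAT--AT", "ATCCA-CGCT"]
print(sum_of_pairs(ALIGNMENT))
```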
Bad News Multiple alignment is NP-hard. One method is to approximate the optimal value.
An Evolutionary Algorithm (EA) An evolutionary algorithm (EA) is a type of a generic population-based meta-heuristic optimization algorithm. An EA uses some mechanisms inspired by biological evolution: - reproduction, - mutation, - recombination, - natural selection, and - survival of the fittest. EAs perform well in fields such as physics, chemistry, biology, genetics, operations research, engineering, robotics, economics, social sciences, and art.
Quantum Evolutionary Algorithms (QEAs) for Multiple Sequences Alignments Quantum parallelism enables all possible values of the function f to be evaluated simultaneously. Therefore, the evolutionary algorithm and quantum computing may be combined. The aim is to get benefits from quantum computing potentials to enhance both efficiency and speed of classical evolutionary algorithms.
Multiple Sequences Alignments (2)
Let S be a collection of strings $s_1, s_2, s_3, \ldots, s_k$ over an alphabet $\Sigma$. An alignment of S is a matrix A with k rows (one row per string $s_1, s_2, s_3, \ldots, s_k$) such that:
- each entry is either a letter or a gap;
- no column is all gaps;
- reading across row i, 1 <= i <= k, and removing gaps gives string $s_i$.
Quantum Representation of Sequences Alignment
Each sequence is represented as a quantum m-tuple in QE, where m is the length of the sequence. Each pair $(a_{ij}, b_{ij})$ represents one qubit and corresponds to one character of the alphabet $\Sigma$. The amplitudes $a_{ij}$ and $b_{ij}$ are real values satisfying $|a_{ij}|^2 + |b_{ij}|^2 = 1$. For each qubit, the binary value is calculated according to the probabilities $|a_{ij}|^2$ and $|b_{ij}|^2$, which are interpreted as the probabilities of having an element of $\Sigma$ or a gap, respectively.
Quantum Representation of Sequences Alignment All potential alignments can be represented by a quantum matrix. The quantum matrix can be seen like a probabilistic representation of all possible alignments.
Satisfying |ai|2 + |bi|2 = 1. Example If the following three sequences abcd, ac, and abd have to be aligned, the corresponding quantum matrix of these three sequences may be: Satisfying |ai|2 + |bi|2 = 1.
How to Get Sequences Alignment? Transform the quantum matrix into a binary matrix. For every pair (a_ij, b_ij), generate a random number random(0, 1) in the range [0, 1]. If random(0, 1) < |b_ij|^2, the corresponding bit of the binary matrix is set to 1, otherwise to 0. The binary matrix formed in this way represents one solution observed from the quantum matrix.
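The observation step can be sketched as follows. The data layout (a list of rows, each a list of `(a, b)` amplitude pairs) and the function name `observe` are assumptions made for this illustration.

```python
import random


def observe(quantum_matrix, rng=random):
    """Collapse a matrix of (a, b) amplitude pairs into a binary matrix.

    A bit becomes 1 with probability |b|^2 and 0 otherwise.
    """
    return [
        [1 if rng.random() < abs(b) ** 2 else 0 for (_a, b) in row]
        for row in quantum_matrix
    ]


# One row with two equal-superposition qubits: each bit is 0 or 1 with probability 1/2.
h = 2 ** -0.5
print(observe([[(h, h), (h, h)]]))
```

Each call draws a fresh sample, so repeated observations of the same quantum matrix generally give different binary matrices.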
How to Get an Alignment? Transform the obtained binary matrix into the alphabetic matrix. The alphabetic matrix describes a possible sequence alignment for the given three sequences abcd, ac, and abd. For every bit of the binary matrix: - 0 represents the character at the corresponding position of the sequence, and - 1 represents a gap.
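A minimal sketch of this decoding step, assuming the convention that 0 consumes the next character of the sequence and 1 emits a gap (consistent with |b_ij|^2 being the gap probability). The names `decode_row` and `decode` are hypothetical, and the sketch simply rejects rows whose number of 0 bits does not match the sequence length; how such rows are repaired is not covered here.

```python
GAP = "-"  # ASCII gap symbol assumed for this sketch


def decode_row(bits, sequence, gap=GAP):
    """Turn one binary row into an aligned row: 0 -> next character, 1 -> gap."""
    if bits.count(0) != len(sequence):
        raise ValueError("number of 0 bits must equal the sequence length")
    chars = iter(sequence)
    return "".join(next(chars) if bit == 0 else gap for bit in bits)


def decode(binary_matrix, sequences):
    """Decode every row of the binary matrix against its own sequence."""
    return [decode_row(bits, seq) for bits, seq in zip(binary_matrix, sequences)]


binary = [[0, 0, 0, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
print(decode(binary, ["abcd", "ac", "abd"]))
```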
Objective Function A sequence alignment is evaluated with an objective function (OF), which is simply a measure of the quality of a multiple sequence alignment. The most widely used OFs are: - SPS (sum of pairs with affine gap penalties), - COFFEE, and - HMMs.
We used SPS to evaluate sequence alignment in the experiment. It means that the score increases with the number of sequences correctly aligned. The objective function of an alignment A with k rows, based on SPS, can be expressed as follows: $SPS(A) = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} COST(A_i, A_j)$, where A_i is row i of A and COST is the alignment score between two aligned sequences.
Evolutionary Assume that: - Q(0) is the initial quantum matrix and - P(0) is the corresponding binary matrix. In the evolutionary process, the state of a qubit can be changed by applying a quantum gate. Thus the rotation gate U(θ_ij) is used to update the quantum matrix Q(t-1) and get the new quantum matrix Q(t). The binary matrix P(t) can be obtained from the quantum matrix Q(t), which can be further optimized as a mutation operator. The update is $\begin{pmatrix} a'_{ij} \\ b'_{ij} \end{pmatrix} = U(\theta_{ij}) \begin{pmatrix} a_{ij} \\ b_{ij} \end{pmatrix}$, where $U(\theta_{ij}) = \begin{pmatrix} \cos\theta_{ij} & -\sin\theta_{ij} \\ \sin\theta_{ij} & \cos\theta_{ij} \end{pmatrix}$. It is easy to see that $|a'_{ij}|^2 + |b'_{ij}|^2 = 1$. θ_ij is the rotation angle of each qubit toward either the 0 or the 1 state.
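The update of a single qubit can be sketched as below, assuming U is the standard 2x2 rotation matrix with entries cos, -sin, sin, cos. The function names `rotate` and `update` and the list-of-pairs layout are assumptions made for this example.

```python
import math


def rotate(a, b, theta):
    """Apply the 2x2 rotation by `theta` to the amplitude pair (a, b)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * a - s * b, s * a + c * b)


def update(quantum_matrix, angles):
    """Rotate every qubit of Q(t-1) by its own angle to obtain Q(t)."""
    return [
        [rotate(a, b, theta) for (a, b), theta in zip(q_row, angle_row)]
        for q_row, angle_row in zip(quantum_matrix, angles)
    ]


h = 2 ** -0.5
a2, b2 = rotate(h, h, 0.05)
print(a2 ** 2 + b2 ** 2)  # stays 1 up to floating-point rounding
```

Because a rotation preserves length, the normalization of every pair is kept without any explicit renormalization step. With positive amplitudes, a positive angle increases b (moves toward the 1 state) and a negative angle decreases it.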
Evaluations and Quantum Gate P(t) and B(t-1) are each evaluated by computing their objective function values, and the better of the two is stored in B(t). These steps are repeated iteratively, generation by generation. In each generation: - a good binary matrix survives and - a bad binary matrix is discarded.
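The selection step can be sketched as follows, assuming P(t) and B(t-1) are lists of binary solutions of the same length, that a larger objective value is better, and that ties keep the stored solution. The name `select_best` and the toy objective are hypothetical.

```python
def select_best(p_t, b_prev, objective):
    """Build B(t): for each slot keep the better of the new and the stored solution."""
    return [
        x if objective(x) > objective(b) else b
        for x, b in zip(p_t, b_prev)
    ]


# Toy objective for illustration only: the number of 1 bits in a solution.
p_t = [[1, 1], [0, 0]]
b_prev = [[1, 0], [0, 1]]
print(select_best(p_t, b_prev, objective=sum))
```

In the alignment setting the objective would be the SPS of the alignment decoded from each binary matrix rather than a bit count.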
It should be noted that: - the NOT gate, - the controlled NOT gate, or - the Hadamard gate can also be used as the quantum gate. The NOT gate swaps the probabilities of the 0 and 1 states, so it can be used to escape a local optimum. In the controlled NOT gate, one of the two bits is a control bit: if the control bit is 1, the NOT operation is applied to the other bit. It can be used for problems in which two bits depend strongly on each other. The Hadamard gate is suitable for algorithms that use the phase information of a qubit as well as its amplitude information.
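For a single qubit stored as an amplitude pair (a, b), the NOT and Hadamard gates are easy to write down; the sketch below shows only these two, with the function names chosen for this example. The controlled NOT gate acts on two qubits jointly and is left out of this single-qubit illustration.

```python
import math


def not_gate(a, b):
    """NOT (Pauli-X): swap the amplitudes of the 0 and 1 states."""
    return (b, a)


def hadamard(a, b):
    """Hadamard: (a, b) -> ((a + b) / sqrt(2), (a - b) / sqrt(2))."""
    r = 1.0 / math.sqrt(2.0)
    return (r * (a + b), r * (a - b))


print(not_gate(0.6, 0.8))   # probability of a 1 bit drops from 0.64 to 0.36
print(hadamard(1.0, 0.0))   # equal superposition of the 0 and 1 states
```

Both gates preserve the normalization of the pair, and the Hadamard gate can produce a negative amplitude, which is where the sign (phase) information enters.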
Choices of Rotation Angles In the multiple sequence alignment problem, θ_ij is obtained as a function of x_ij, b_ij, and the comparison between f(x) and f(b), as shown in Table 1, where x_ij is the j-th bit of the i-th sequence of the binary solution x^t_k in P(t), b_ij is the j-th bit of the i-th sequence of the binary solution b^t_k in B(t), and θ_ij is the rotation angle of the j-th qubit of the i-th row of the qubit individual q^t_k in Q(t). f() is an objective function. [Table 1: Lookup table of θ_ij. Columns: x_ij, b_ij, comparison of f(x) with f(b), θ_ij. Entries as extracted: false -0.4; true -0.6; 1; 0.0; 0.1; 0.5; -0.5; 0.2]
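A lookup of this kind can be sketched as a small dictionary keyed by the two bits and the outcome of the comparison. The angle values below are hypothetical placeholders, not the values of Table 1; they are chosen inside the 0.001 to 0.05 range recommended in this tutorial. The sketch also assumes that a positive angle moves a qubit toward the 1 state and that the comparison flag is True when f(x) is at least as good as f(b).

```python
# Hypothetical angles, NOT the values of Table 1: rotate toward the stored best bit
# only when the observed solution x is worse than the stored best solution b.
HYPOTHETICAL_ANGLES = {
    (0, 1, False): +0.02,  # best has 1 here, x has 0: move toward the 1 state
    (1, 0, False): -0.02,  # best has 0 here, x has 1: move toward the 0 state
}


def rotation_angle(x_bit, b_bit, x_is_better, table=HYPOTHETICAL_ANGLES):
    """Look up the rotation angle; ambiguous or unlisted cases get 0."""
    return table.get((x_bit, b_bit, x_is_better), 0.0)


print(rotation_angle(0, 1, False))
print(rotation_angle(1, 1, True))
```

Defaulting unlisted cases to 0 follows the recommendation to use 0 whenever the sign of an angle is ambiguous.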
Choices of Rotation Angles (2) If it is ambiguous whether a positive or a negative number should be selected for an angle parameter, it is recommended to set the value to 0. The magnitude of θ_ij affects the speed of convergence, but if it is too big, the solutions may diverge or converge prematurely to a local optimum. Values from 0.001 to 0.05 are recommended for the magnitude of θ_ij, although they depend on the problem. The sign of θ_ij determines the direction of convergence. [Figure: rotation of a qubit from (a_ij, b_ij) to (a'_ij, b'_ij) by the angle θ_ij, in the plane with axes |0> and |1>]
Two-phase Optimization It has been shown that changing the initial values of the qubits can improve the performance of a QEA. Since the initial search space is directly determined by the initial values of the qubits, the qubit individuals can converge to the best solution effectively if the initial values are chosen so that the initial search space lies at a small distance from the best solution. Based on this strategy, a two-phase optimization process is proposed as follows: 1 First-phase QEAlign 2 Second-phase QEAlign
Two-Phase Optimization (2) In the first phase, the initial qubit individuals (quantum matrix) are divided into several groups. In each group, the initial values of the qubits are determined by a formula [formula not recovered] in which Ng is the number of groups and a second parameter is greater than 0 and much smaller than 1. In the second phase, the initial solution is the result obtained from the first phase.
Open Questions In 2001, a 7-qubit machine was built and programmed to run Shor’s algorithm, successfully factoring 15. What algorithms will be discovered next? Can quantum computers solve NP-complete problems in polynomial time?
Demo BAliBASE (Benchmark Alignment dataBASE) (Thompson et al., 1999, 2001) [Table 1: BAliBASE reference sets, showing the number of alignments in each set] - Running QEA.exe - Running bali_score.exe for comparison
Further Readings - General References - Introduction to Quantum Computation - Thermal ensembles - Using Quantum Computers to Simulate Quantum Systems - Quantum cryptography - Universal Quantum Computer and the Church-Turing Thesis - Shor's Factoring Algorithm - Quantum Database Search - Quantum Sorting - Quantum Computer Simulators - Quantum Programming Languages
Thank you!