PAC-Learning and Reconstruction of Quantum States


PAC-Learning and Reconstruction of Quantum States  Scott Aaronson (University of Texas at Austin) COLT’2017, Amsterdam

What Is A Quantum (Pure) State? A unit vector of complex numbers: for a single qubit, |ψ⟩ = α|0⟩ + β|1⟩. Infinitely many bits in the amplitudes α, β, but when you measure, the state “collapses” to just 1 bit! (0 with probability |α|², 1 with probability |β|²) Holevo’s Theorem (1973): By sending n qubits, Alice can communicate at most n classical bits to Bob (or 2n if they pre-share entanglement)

The Problem for CS A state of n entangled qubits requires ~2^n complex numbers to specify, even approximately. To a computer scientist, this is probably the central fact about quantum mechanics. Why should we worry about it?

Answer 1: Quantum State Tomography Task: Given lots of copies of an unknown quantum state, produce an approximate classical description of it. Central problem: To do tomography on a general state of n particles, you clearly need ~c^n measurements. Innsbruck group (2005): 8 particles / ~656,000 experiments!

Answer 2: Quantum Computing Skepticism (Levin, Goldreich, ’t Hooft, Davies, Wolfram) Some physicists and computer scientists believe quantum computers will be impossible for a fundamental reason. For many of them, the problem is that a quantum computer would “manipulate an exponential amount of information” using only polynomial resources. But is it really an exponential amount?

Can we tame the exponential beast? Idea: “Shrink quantum states down to reasonable size” by viewing them operationally. Analogy: A probability distribution over n-bit strings also takes ~2^n bits to specify. But seeing a sample only provides n bits. In this talk, I’ll survey 14 years of results showing how the standard tools from COLT can be used to upper-bound the “effective size” of quantum states [A. 2004], [A. 2006], [A.-Drucker 2010], [A. 2017], [A.-Hazan-Nayak 2017]. Lesson: “The linearity of QM helps tame the exponentiality of QM”

The Absent-Minded Advisor Problem | Can you hand all your grad students the same nO(1)-qubit quantum state |, so that by measuring their copy of | in a suitable basis, each student can learn the {0,1} answer to their n-bit thesis question? NO [Nayak 1999, Ambainis et al. 1999] Indeed, quantum communication is no better than classical for this task as n

On the Bright Side… Suppose Alice wants to describe an n-qubit quantum state |ψ⟩ to Bob, well enough that, for any 2-outcome measurement E in some finite set S, Bob can estimate Pr[E(|ψ⟩) accepts] to constant additive error. Theorem (A. 2004): In that case, it suffices for Alice to send Bob only ~n log n · log|S| classical bits (trivial bounds: exp(n) and |S|).

[Slide figure: the state |ψ⟩, alongside all yes/no measurements performable using ≤ n² gates]

Interlude: Mixed States A D×D Hermitian positive semidefinite matrix ρ with trace 1. Encodes (everything measurable about) a probability distribution where each superposition |ψᵢ⟩ occurs with probability pᵢ: ρ = Σᵢ pᵢ |ψᵢ⟩⟨ψᵢ|. A yes-or-no measurement can be represented by a D×D Hermitian psd matrix E with all eigenvalues in [0,1]. The measurement “accepts” ρ with probability Tr(Eρ), and “rejects” ρ with probability 1−Tr(Eρ).
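As a concrete illustration of these definitions (my own sketch, not part of the talk), here is how one might represent a mixed state and a 2-outcome measurement numerically with numpy:

```python
import numpy as np

def random_density_matrix(d, rng):
    """Sample a random d x d density matrix (Hermitian, psd, trace 1)."""
    a = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = a @ a.conj().T              # Hermitian and positive semidefinite
    return rho / np.trace(rho).real   # normalize to trace 1

rng = np.random.default_rng(0)
rho = random_density_matrix(4, rng)

# A 2-outcome measurement: E Hermitian with eigenvalues in [0, 1].
# Here, a projector onto the first two basis states.
E = np.diag([1.0, 1.0, 0.0, 0.0]).astype(complex)

p_accept = np.trace(E @ rho).real    # Pr[measurement "accepts" rho]
assert 0.0 <= p_accept <= 1.0
```

Since E has eigenvalues in [0,1] and ρ has trace 1, Tr(Eρ) is guaranteed to be a valid probability.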

How does the theorem work? Alice is trying to describe the n-qubit state ρ = |ψ⟩⟨ψ| to Bob. In the beginning, Bob knows nothing about ρ, so he guesses it’s the maximally mixed state ρ₀ = I/2^n = I/D. Then Alice helps Bob improve, by repeatedly telling him a measurement E_t ∈ S on which his current guess ρ_{t−1} badly fails. Bob lets ρ_t be the state obtained by starting from ρ_{t−1}, then performing E_t and postselecting on getting the right outcome.
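A minimal numpy sketch of Bob's update rule (my illustration; the specific state and projector are hypothetical choices): postselecting on measurement E accepting maps σ to √E σ √E / Tr(Eσ).

```python
import numpy as np

def postselect(sigma, E):
    """Bob's update: measure E on sigma, postselect on 'accept'.

    The post-measurement state is sqrt(E) sigma sqrt(E) / Tr(E sigma).
    """
    w, v = np.linalg.eigh(E)
    sqrtE = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T
    new = sqrtE @ sigma @ sqrtE
    return new / np.trace(new).real

D = 4
sigma = np.eye(D, dtype=complex) / D   # Bob's initial guess: maximally mixed
E = np.diag([1.0, 0.0, 0.0, 0.0]).astype(complex)  # a measurement sigma "fails"

sigma = postselect(sigma, E)
# Since E is a projector, the updated sigma now lies entirely in its support,
# so E accepts the new guess with probability 1.
```

Repeating this for each measurement the current guess badly fails is exactly the iterative improvement described above.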

Question: Why can Bob keep reusing the same state? To ensure that, we actually need to “amplify” |ψ⟩ to |ψ⟩^⊗log(n), slightly increasing the number of qubits from n to ~n log n. Gentle Measurement / Almost As Good As New Lemma: Suppose a measurement of a mixed state ρ yields a certain outcome with probability 1−ε. Then after the measurement, we still have a state ρ′ that’s √ε-close to ρ in trace distance.
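We can sanity-check the lemma numerically for a single qubit (my sketch; ε and the projector are arbitrary choices). For this particular state, the post-measurement damage comes out to exactly √ε, saturating the bound:

```python
import numpy as np

def trace_distance(rho, sigma):
    """(1/2) * sum of absolute eigenvalues of rho - sigma."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

# A state that a projective measurement accepts with probability 1 - eps.
eps = 0.01
psi = np.array([np.sqrt(1 - eps), np.sqrt(eps)], dtype=complex)
rho = np.outer(psi, psi.conj())

# Measure the projector onto |0> and postselect on "accept".
P = np.diag([1.0, 0.0]).astype(complex)
post = P @ rho @ P
post /= np.trace(post).real

# Almost As Good As New: the damage is at most sqrt(eps) in trace distance.
d = trace_distance(rho, post)
assert d <= np.sqrt(eps) + 1e-12
```

For two pure states the trace distance is √(1−|⟨φ|ψ⟩|²), which here equals √ε = 0.1.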

Crucial Claim: Bob’s iterative learning procedure will “converge” on a state ρ_T that behaves like |ψ⟩^⊗log(n) on all measurements in the set S, after at most T = O(n log n) iterations. Proof: Let p_t = Pr[first t postselections all succeed]. If p_t weren’t less than, say, (2/3)p_{t−1}, learning would’ve ended! Solving, we find that T = O(n log n). So it’s enough for Alice to tell Bob about T = O(n log n) measurements E_1,…,E_T, using log|S| bits per measurement. Complexity theory consequence: BQP/qpoly ⊆ PostBQP/poly. (Open whether BQP/qpoly = BQP/poly)
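The equation on this slide did not survive extraction; the following is a hedged reconstruction of the counting argument, assuming the amplified state lives on n log n qubits (so D = 2^{n log n}). Each failed measurement cuts the success probability by a constant factor, but the maximally mixed starting state contains the true state with weight 1/D, so p_T can never drop below roughly 1/D:

```latex
\[
  \left(\tfrac{2}{3}\right)^{T} \;\ge\; p_T \;\gtrsim\; \frac{1}{D}
  \;=\; 2^{-n\log n}
  \qquad\Longrightarrow\qquad
  T \;=\; O(n\log n).
\]
```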

Quantum Occam’s Razor Theorem: “Quantum states are PAC-learnable.” Let |ψ⟩ be an unknown entangled state of n particles. Suppose you just want to be able to estimate the acceptance probabilities of most measurements E drawn from some probability distribution 𝒟. Then it suffices to do the following, for some m = O(n): Choose m measurements independently from 𝒟. Go into your lab and estimate acceptance probabilities of all of them on |ψ⟩. Find any “hypothesis state” approximately consistent with all measurement outcomes.

Fat-Shattering Dimension To prove the theorem, we use Kearns and Schapire’s fat-shattering dimension. Let C be a class of functions from S to [0,1]. We say a set {x_1,…,x_k} ⊆ S is γ-shattered by C if there exist reals a_1,…,a_k such that, for all 2^k possible statements of the form f(x_1) ≤ a_1−γ ∧ f(x_2) ≥ a_2+γ ∧ … ∧ f(x_k) ≤ a_k−γ, there’s some f ∈ C that satisfies the statement. Then fat_C(γ), the γ-fat-shattering dimension of C, is the size of the largest set γ-shattered by C.

Small Fat-Shattering Dimension Implies Small Sample Complexity Let C be a class of functions from S to [0,1], and let f ∈ C. Suppose we draw m elements x_1,…,x_m independently from some distribution 𝒟, and then output a hypothesis h ∈ C such that |h(x_i)−f(x_i)| ≤ γ for all i. Then provided γ ≤ ε/7 and m is large enough (a bound in terms of fat_C(γ), ε, and δ), we’ll have E_{x∼𝒟}[|h(x)−f(x)|] ≤ ε with probability at least 1−δ over x_1,…,x_m. [Bartlett-Long 1996, building on Alon et al. 1993, building on Blumer et al. 1989]

Upper-Bounding the Fat-Shattering Dimension of Quantum States Nayak 1999: If we want to encode k classical bits into n qubits, in such a way that any bit can be recovered with probability 1−p, then we need n ≥ (1−H(p))k. Corollary (“turning Nayak’s result on its head”): Let C_n be the set of functions that map an n-qubit measurement E to Tr(Eρ), for some n-qubit mixed state ρ. Then fat_{C_n}(γ) grows only linearly in n. No need to thank me! The Quantum Occam’s Razor Theorem follows easily…
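The corollary's bound was an image on the slide; here is a hedged reconstruction from Nayak's inequality. Taking p = 1/2 − γ (each shattered measurement recovers its bit with advantage γ) and using 1 − H(1/2 − γ) = Θ(γ²):

```latex
\[
  \mathrm{fat}_{C_n}(\gamma)
  \;\le\; \frac{n}{1 - H\!\left(\tfrac12 - \gamma\right)}
  \;=\; O\!\left(\frac{n}{\gamma^{2}}\right).
\]
```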

How do we find the hypothesis state? Here’s one way: let b_1,…,b_m be the estimated outcomes of measurements E_1,…,E_m. Then choose a hypothesis state σ to minimize Σ_i (Tr(E_i σ) − b_i)². This is a convex programming problem, which can be solved in time polynomial in D = 2^n (good enough in practice for n ≤ 15 or so). Optimized, linear-time iterative method for this problem: [Hazan 2008]
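For tiny D, one illustrative way to solve this least-squares problem is projected gradient descent (my own sketch, not Hazan's optimized method): take gradient steps on the objective, then project back onto the density matrices by Euclidean-projecting the eigenvalues onto the probability simplex.

```python
import numpy as np

def project_to_density(M):
    """Project a Hermitian matrix onto the density matrices (psd, trace 1)."""
    w, v = np.linalg.eigh((M + M.conj().T) / 2)
    u = np.sort(w)[::-1]                      # simplex projection of eigenvalues
    css = np.cumsum(u) - 1.0
    k = np.nonzero(u - css / np.arange(1, len(u) + 1) > 0)[0][-1]
    w = np.clip(w - css[k] / (k + 1), 0.0, None)
    return (v * w) @ v.conj().T

def fit_hypothesis(Es, bs, D, steps=3000, lr=0.03):
    """Minimize sum_i (Tr(E_i sigma) - b_i)^2 over density matrices sigma."""
    sigma = np.eye(D, dtype=complex) / D
    for _ in range(steps):
        grad = sum(2.0 * (np.trace(E @ sigma).real - b) * E
                   for E, b in zip(Es, bs))
        sigma = project_to_density(sigma - lr * grad)
    return sigma

# Toy instance: recover the measurement statistics of a known 1-qubit state.
rng = np.random.default_rng(1)
psi = np.array([np.sqrt(0.8), np.sqrt(0.2)], dtype=complex)
rho = np.outer(psi, psi.conj())

Es = []
for _ in range(8):                            # 8 random 2-outcome measurements
    a = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    h_w, h_v = np.linalg.eigh((a + a.conj().T) / 2)
    h_w = (h_w - h_w.min()) / (h_w.max() - h_w.min())  # eigenvalues into [0, 1]
    Es.append((h_v * h_w) @ h_v.conj().T)
bs = [np.trace(E @ rho).real for E in Es]     # exact "lab" statistics

sigma = fit_hypothesis(Es, bs, D=2)
max_err = max(abs(np.trace(E @ sigma).real - b) for E, b in zip(Es, bs))
```

On this toy instance the recovered σ reproduces all eight acceptance probabilities to within a few percent; the point is only that the feasible set (psd, trace 1) and the objective are both convex.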

Numerical Simulation [A.-Dechter 2008] We implemented Hazan’s algorithm in MATLAB, and tested it on simulated quantum state data We studied how the number of sample measurements m needed for accurate predictions scales with the number of qubits n, for n≤10 Result of experiment: My theorem appears to be true…

Now tested on real lab data as well! [Rocchetto et al. 2017, in preparation]

Combining My Postselection and Quantum Learning Results [A.-Drucker 2010]: Given an n-qubit state ρ and complexity bound T, there exists an efficient measurement V on poly(n,T) qubits, such that any state that “passes” V can be used to efficiently estimate ρ’s response to any yes-or-no measurement E that’s implementable by a circuit of size T. Application: Trusted quantum advice is equivalent to trusted classical advice + untrusted quantum advice. In complexity terms: BQP/qpoly ⊆ QMA/poly. Proof uses boosting-like techniques, together with results on ε-covers and fat-shattering dimension.

New Result [A. 2017, in preparation]: “Shadow Tomography” Theorem: Let ρ be an unknown D-dimensional state, and let E_1,…,E_M be known 2-outcome measurements. Suppose we want to know Tr(E_i ρ) to within additive error ε, for all i ∈ [M]. We can achieve this, with high probability, given only k copies of ρ, where k is polylogarithmic in both M and D. Open Problem: Is the dependence on D removable? Challenge: How to measure E_1,…,E_M without destroying our few copies of ρ in the process!
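The sample-complexity bound itself was an image on the slide. As a hedged reconstruction from the subsequently published version of this result (treat the exact exponents as an assumption here, not as the slide's statement), the bound is roughly:

```latex
\[
  k \;=\; \tilde{O}\!\left(\frac{\log^{4} M \,\cdot\, \log D}{\varepsilon^{4}}\right).
\]
```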

Proof Idea Theorem [A. 2006, fixed by Harrow-Montanaro-Lin 2016]: Let ρ be an unknown D-dimensional state, and let E_1,…,E_M be known 2-outcome measurements. Suppose we’re promised that either there exists an i such that Tr(E_i ρ) ≥ c, or else Tr(E_i ρ) ≤ c−ε for all i ∈ [M]. We can decide which, with high probability, given only k copies of ρ, where k is polylogarithmic in M. Indeed, we can find an i with Tr(E_i ρ) ≥ c−ε, if given O(log⁴ M / ε²) copies of ρ. Now run my postselected learning algorithm [A. 2004], but using the above to find the E_i’s to postselect on!

Online Learning of Quantum States Answers a question of Elad Hazan (2015). Corollary of my postselected learning algorithm: Let ρ be an unknown D-dimensional state. Suppose we’re given 2-outcome measurements E_1, E_2, … in sequence, each followed by the value of Tr(E_t ρ) to within ε/2. Each time, we’re challenged to guess a hypothesis state σ such that |Tr(E_t σ) − Tr(E_t ρ)| ≤ ε. We can do this so that we fail at most k times. A.-Hazan-Nayak 2017: Can also use general convex optimization tools, or an upper bound on the “sequential fat-shattering dimension” of quantum states, to get a regret bound for the agnostic case, now involving a √T term.
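The bound on k was elided on the slide; here is a hedged reconstruction from the published version of this result, writing n = log₂ D for the number of qubits:

```latex
\[
  k \;=\; O\!\left(\frac{n}{\varepsilon^{2}}\right)
    \;=\; O\!\left(\frac{\log D}{\varepsilon^{2}}\right),
\]
```

with, in the agnostic setting, regret growing like √(Tn) over T rounds (the √T term mentioned above).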

Summary The tools of COLT let us show that in many scenarios, the “exponentiality” of quantum states is more bark than bite I.e. there’s a short classical string that specifies how the quantum state behaves, on any 2-outcome measurement you could actually perform Applications from complexity theory to experimental quantum state tomography… Alas, these results don’t generalize to many-outcome measurements, or to learning quantum processes Biggest future challenge: Find subclasses of states and measurements for which these learning procedures are computationally efficient (Stabilizer states: Rocchetto 2017)