Running a Quantum Circuit at the Speed of Data Nemanja Isailovic, Mark Whitney, Yatish Patel, John Kubiatowicz U.C. Berkeley QEC 2007.



The Impact of QEC
[Figure labels: H, X, QEC Step, Zero Ancilla Prep, Data Involvement in QEC Step, Serial Latency, Parallel Latency; axis: time]

The Speed of Data
Non-Transversal Logical Gate
–Zhou et al., Phys. Rev. A, 62(5):52316
Ideally, execution time determined solely by data
[Figure labels: Non-Transversal Ancilla Prepare, Data Involvement; axes: time, hardware]

Limited BW Graph
32-bit Quantum Carry-Lookahead Adder in Ion Traps
–Varying rate at which encoded zero ancillae are provided for QEC
–Conclusion: design architecture with “ancilla factories”
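The tradeoff between ancilla bandwidth and execution time can be illustrated with a toy model. This is only a sketch under stated assumptions, not the simulation behind the graph: the function name, the per-step demand numbers, the one-time-unit step length, and the unbounded ancilla storage are all hypothetical.

```python
def execution_time(demand_per_step, ancilla_rate):
    """Time units needed to run every data step in order.

    demand_per_step: encoded zero ancillae each data step consumes
                     (hypothetical numbers, not taken from the slides).
    ancilla_rate:    encoded ancillae delivered per time unit (hypothetical).
    Assumptions: a data step occupies one time unit, and unused
    ancillae can be stockpiled without limit.
    """
    if ancilla_rate <= 0:
        raise ValueError("ancilla_rate must be positive")
    time = 0
    stock = 0
    for need in demand_per_step:
        while True:
            time += 1
            stock += ancilla_rate      # factories deliver during this time unit
            if stock >= need:          # enough ancillae: the data step runs
                stock -= need
                break                  # otherwise the data stalls and waits
    return time


steps = [2, 2, 2, 2]                   # four data steps, two ancillae each
speed_of_data = len(steps)             # lower bound: data never waits
```

In this model, raising `ancilla_rate` shortens execution only until the data never stalls; past that point the run time stays at `speed_of_data`.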

Idealized Qalypso Architecture
Dense data region
–Data qubits only
–Local communication
Shared Ancilla Factories
–Distributed to data as needed
–Fully multiplexed to all data
–Output ports: close to data
–Input ports: may be far from data, since recycled qubits have irrelevant state
Goals
–Design ancilla factories
–Answer question: How much hardware is needed for ancilla generation to run at the speed of data?

Our Quantum CAD Toolset
Automated toolset to assist in architecture design
–Ion trap technology
–Local gates: two qubits in the same trap
–Basic block abstraction: to avoid unknown electrode details
Our basic layout blocks: straight, 3-way, 4-way, turn
[Figure labels: gate locations; image of a 3-way intersection, credit: Dr. Hensinger, University of Sussex]

Level 1 [[7,1,3]] QEC Circuits
Identical Circuits
–Steane, Multiple-Particle Interference and QEC
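For readers unfamiliar with the [[7,1,3]] code, the sketch below shows its classical backbone. It assumes the standard construction of the Steane code from the [7,4,3] Hamming code; the function names are illustrative and nothing here is taken from the circuits on the slide.

```python
# Parity-check matrix of the classical [7,4,3] Hamming code.
# Column i (counting from 1) is the 3-bit binary representation of i.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(error):
    """Parity of the error pattern against each row of H."""
    return tuple(sum(h * e for h, e in zip(row, error)) % 2 for row in H)


def rows_are_self_orthogonal():
    """CSS condition: every pair of rows (including a row with itself)
    overlaps on an even number of positions, so the same matrix can
    define both the X-type and the Z-type checks."""
    return all(
        sum(a * b for a, b in zip(r1, r2)) % 2 == 0
        for r1 in H
        for r2 in H
    )


def locate_single_error(error):
    """Read the syndrome as a binary number: 0 means no error detected,
    otherwise it is the 1-indexed position of a single flipped qubit."""
    s = syndrome(error)
    return s[0] * 4 + s[1] * 2 + s[2]


flip_on_qubit_5 = [0, 0, 0, 0, 1, 0, 0]
position = locate_single_error(flip_on_qubit_5)
```

Because the syndrome of a single flip equals the column of `H` at that position, the three check bits directly name the qubit to correct.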

Zero Ancilla Factory Design I
“In-place” ancilla preparation
[Figure labels: Encoded Ancilla, Verification Qubits]
Ancilla factory consists of many of these
–Encoded ancilla prepared in many places
–But we want input and output ports
[Figure: four “In-place Prep” blocks]

Zero Ancilla Factory Design II
Pipelined ancilla preparation: break into stages
–Match physical qubit bandwidth between stages for high utilization
–Steady stream of encoded ancillae at output port
[Figure labels: Physical 0 Prep, CNOTs, Cat Prep, Crossbar, Verif, X/Z Correct; input: Junk Physical Qubits; output: Good Encoded Ancillae; feedback paths: Recycle cat state qubits and failures, Recycle used correction qubits]
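Matching bandwidth between pipeline stages can be sketched as a sizing calculation that works backward from the output port. This is a minimal illustration, not the authors' method: the stage names, per-unit rates, and yield fractions below are hypothetical placeholders, and all rates are expressed in encoded-ancilla equivalents per time unit.

```python
import math


def units_per_stage(target_rate, stages):
    """Count the hardware units each stage needs to sustain target_rate.

    stages: list of (name, unit_rate, yield_fraction), ordered from the
            input port to the output port. unit_rate is what one unit
            processes per time unit; yield_fraction is the share of
            processed items passed on (for example, verification passes).
            All values are hypothetical.
    """
    counts = {}
    required_out = target_rate
    # Walk from the output port back to the input port.
    for name, unit_rate, yield_fraction in reversed(stages):
        required_in = required_out / yield_fraction   # cover discarded items
        counts[name] = math.ceil(required_in / unit_rate)
        required_out = required_in                    # upstream must supply this
    return counts


example_stages = [
    ("zero_prep", 0.5, 1.0),
    ("cnots", 0.25, 1.0),
    ("verify", 0.5, 0.5),   # assume half of the candidates fail verification
]
plan = units_per_stage(1.0, example_stages)
```

A slower stage gets more units rather than throttling the others, which is one way to read the slide's goal of high utilization; recycling of failed qubits is not modeled here.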

Area Needs for Ancilla Preparation

Practical Qalypso Architecture
Multiple Data Regions
–Each serviced by local ancilla factories
–Communication network moves data between regions (not shown)
–Data regions as large as possible to get benefits of minimizing inter-region movement and multiplexing ancilla factory output

Summary
Investigated removing ancilla generation from critical path
–Operations on data qubits dictate performance
–Tradeoff in ancilla bandwidth vs execution speed
Architectural approach: ancilla factories
–Match production bandwidth to needs of data
–Pipelining places output ports close to data
Qalypso architecture
–Dense data regions with local communication
–Ancilla factories segregated from data
  Multiplexing between factories and data
  Input and output ports
–Layout investigation => ancilla generation dominates area