Generalized Entropies


Generalized Entropies
Renato Renner
Institute for Theoretical Physics, ETH Zurich, Switzerland

Collaborators: Roger Colbeck (ETH Zurich), Nilanjana Datta (U Cambridge), Oscar Dahlsten (ETH Zurich), Patrick Hayden (McGill U, Montreal), Robert König (Caltech), Christian Schaffner (CWI, Amsterdam), Valerio Scarani (NUS, Singapore), Marco Tomamichel (ETH Zurich), Ligong Wang (ETH Zurich), Andreas Winter (U Bristol), Stefan Wolf (ETH Zurich), Jürg Wullschleger (U Bristol)

Why is Shannon / von Neumann entropy so widely used in information theory?
operational interpretation: quantitative characterization of information-processing tasks
easy to handle: simple mathematical definition / intuitive entropy calculus

Operational interpretations of Shannon entropy (classical scenarios)
data compression rate for a source PX: rate = S(X)
transmission rate of a channel PY|X: rate = maxPX [S(X) - S(X|Y)]
secret-key rate for a correlated source PXYZ: rate ≥ S(X|Z) - S(X|Y)
many more …
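
As a small worked illustration of the first two items, the sketch below computes S(X), S(X|Y) and the channel quantity S(X) - S(X|Y) from a joint distribution; the particular table PXY is an arbitrary example of mine, not one from the talk.

```python
import numpy as np

def shannon(p):
    """Shannon entropy (in bits) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Example joint distribution PXY (rows: values of X, columns: values of Y);
# an illustrative choice only.
P_XY = np.array([[0.30, 0.10],
                 [0.05, 0.25],
                 [0.10, 0.20]])

P_X = P_XY.sum(axis=1)              # marginal of X
P_Y = P_XY.sum(axis=0)              # marginal of Y

S_X = shannon(P_X)
S_X_given_Y = shannon(P_XY.flatten()) - shannon(P_Y)   # S(X|Y) = S(XY) - S(Y)

print("S(X)          =", S_X)
print("S(X|Y)        =", S_X_given_Y)
print("S(X) - S(X|Y) =", S_X - S_X_given_Y)   # mutual information, the channel quantity
```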

Operational interpretations of von Neumann entropy (quantum scenarios)
data compression rate for a source ρA: rate = S(A)
state merging rate for a bipartite state ρAB: rate = - S(A|B)
randomness extraction rate for a cq-state ρXE: rate = S(X|E)
secret-key rate for a cqq-state ρXBE: rate ≥ S(X|E) - S(X|B)
…

Why is von Neumann entropy so widely used in information theory?
operational meaning ✓: quantitative characterization of information-processing tasks
easy to handle: simple mathematical definition / intuitive entropy calculus

More useful facts about von Neumann entropy
simple definition: S(ρ) := - tr(ρ log ρ), S(A) := S(ρA), S(A | B) := S(A B) - S(B)
entropy calculus:
Chain rule: S(A B | C) = S(A | C) + S(B | A C)
Strong subadditivity: S(A | B) ≥ S(A | B C)
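
As a numerical companion to these definitions, here is a minimal sketch (my own test case, a two-qubit Bell state) that evaluates S(ρ) from the spectrum and shows that the conditional entropy S(A|B) = S(AB) - S(B) can be negative for entangled states, which is exactly the regime relevant for state merging on the previous slide.

```python
import numpy as np

def von_neumann(rho):
    """S(rho) = -tr(rho log2 rho), evaluated via the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log2(evals))

def trace_out_A(rho_AB, dA, dB):
    """Partial trace over the first subsystem."""
    return np.trace(rho_AB.reshape(dA, dB, dA, dB), axis1=0, axis2=2)

# Bell state |phi+> = (|00> + |11>)/sqrt(2)
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_AB = np.outer(psi, psi)

S_AB = von_neumann(rho_AB)                       # 0, since the state is pure
S_B = von_neumann(trace_out_A(rho_AB, 2, 2))     # 1, maximally mixed qubit
print("S(A|B) = S(AB) - S(B) =", S_AB - S_B)     # -1
```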

Why is von Neumann entropy so widely used in information theory?
operational meaning ✓: quantitative characterization of information-processing tasks
easy to handle ✓: simple mathematical definition / intuitive entropy calculus

Limitations of the von Neumann entropy
Claim: Operational interpretations are only valid under certain assumptions.
Typical assumptions (e.g., for source coding)
i.i.d.: the source emits n independent and identically distributed pieces of data
asymptotics: n is large (n → ∞)
Formally: PX1…Xn = (PX)×n for n → ∞

Can these assumptions be justified in realistic settings?
i.i.d. assumption
approximation justified by de Finetti's theorem (permutation symmetry implies i.i.d. structure on almost all subsystems)
problematic in certain cryptographic scenarios (e.g., in the bounded storage model)
asymptotics
realistic settings are always finite (small systems might be of particular interest for practice)
but might be OK if convergence is fast enough (convergence often unknown, problematic in cryptography)

Is the i.i.d. assumption really needed?
Example distribution PX (figure: k = 4)
Randomness extraction: Hextr(X) = 1 bit (depends on the maximum probability)
Data compression: Hcompr(X) ≥ k bits (depends on the alphabet size)
Shannon entropy: S(X) = 1 + k/2 (provides the right answer if PX1…Xn = (PX)×n for n → ∞)
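
The figure defining PX is not reproduced in this transcript; the sketch below assumes the standard form behind this example, namely probability 1/2 on a single symbol and the remaining 1/2 spread uniformly over 2^k further symbols (this reconstruction of the figure is my assumption), which reproduces the three numbers quoted above.

```python
import numpy as np

k = 4  # value used in the (missing) figure

# Assumed PX: one symbol with probability 1/2, plus 2**k symbols
# carrying probability 2**-(k+1) each.
P = np.array([0.5] + [2.0 ** -(k + 1)] * 2**k)

H_extr = -np.log2(P.max())                    # min-entropy: governs randomness extraction
log_alphabet = np.log2(np.count_nonzero(P))   # lossless compression needs about this many bits
S = -np.sum(P * np.log2(P))                   # Shannon entropy

print("Hextr(X)        =", H_extr)            # 1 bit
print("log2 |alphabet| =", log_alphabet)      # slightly more than k = 4 bits
print("S(X)            =", S)                 # 1 + k/2 = 3 bits
```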

Features of von Neumann entropy
operational interpretations: hold asymptotically under the i.i.d. assumption, but are generally invalid otherwise ✗
easy to handle: simple definition ✓, entropy calculus ✓
(Obvious) question: Is there an entropy measure with all of these features?
Answer: Yes, Hmin

Generalized entropies
Definition: Generalized relative entropy for positive operators ρ and σ
Dmin(ρ || σ) := min {λ ∈ R : ρ ≤ 2^λ·σ}
Notation: ρ ≤ 2^λ·σ means that the operator 2^λ·σ - ρ is positive
Remarks (for two density operators ρ and σ)
Dmin(ρ || σ) ≥ 0, with equality iff ρ = σ
Dmin is also defined for non-normalized operators
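
For σ of full rank, the condition ρ ≤ 2^λ·σ can be tested spectrally, so Dmin(ρ || σ) equals log2 of the largest eigenvalue of σ^(-1/2) ρ σ^(-1/2). The sketch below (randomly drawn test states, my own illustration) evaluates this and checks the two remarks.

```python
import numpy as np

def inv_sqrt(sigma):
    """sigma^(-1/2) for a full-rank positive matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(sigma)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.conj().T

def d_min(rho, sigma):
    """Dmin(rho || sigma) = min{lambda : rho <= 2^lambda * sigma}
    = log2 of the largest eigenvalue of sigma^(-1/2) rho sigma^(-1/2)."""
    s = inv_sqrt(sigma)
    M = s @ rho @ s
    return np.log2(np.max(np.linalg.eigvalsh((M + M.conj().T) / 2)))

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho)

rng = np.random.default_rng(0)
rho, sigma = random_state(3, rng), random_state(3, rng)
print("Dmin(rho || sigma) =", d_min(rho, sigma))            # >= 0 for density operators
print("Dmin(rho || rho)   =", round(d_min(rho, rho), 8))    # = 0 (equality case)
```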

Classical case
Reminder: Dmin(ρ || σ) = min {λ : ρ ≤ 2^λ·σ}
Let ρ and σ be classical states, i.e., ρ = Σx P(x) |x⟩⟨x|, σ = Σx Q(x) |x⟩⟨x|
Then
Dmin(P || Q) = min {λ : P(x) ≤ 2^λ·Q(x) ∀ x}
= min {λ : P(x) / Q(x) ≤ 2^λ ∀ x}
= log maxx P(x) / Q(x)

Classical case
Let ρ and σ be classical states, i.e., ρ = Σx P(x) |x⟩⟨x|, σ = Σx Q(x) |x⟩⟨x|
Generalized relative entropy: Dmin(P || Q) = maxx log P(x) / Q(x)
Comparison: the standard relative entropy equals S(P || Q) = Σx P(x) log [P(x) / Q(x)]

Min-entropy
Definition: (Conditional) min-entropy of ρAB
Hmin(A | B) := - min Dmin(ρAB || idA ⊗ σB)
where the minimum ranges over all states σB
Reminder: Dmin(ρ || σ) := min {λ : ρ ≤ 2^λ·σ}
Remarks
von Neumann entropy: S(A | B) := - minσB S(ρAB || idA ⊗ σB)
mutual information: I(A : B) := minσA,σB S(ρAB || σA ⊗ σB)
hence, the min-mutual information may be defined by Imin(A : B) := minσA,σB Dmin(ρAB || σA ⊗ σB)
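
Absorbing the factor 2^λ into an unnormalized σB, the definition above is equivalent to a small semidefinite program: Hmin(A|B) = -log2 min{ tr σB : σB ≥ 0, idA ⊗ σB ≥ ρAB }. The sketch below is one possible way to solve it numerically, assuming the cvxpy package with an SDP solver such as SCS is installed (my choice of tooling, not part of the talk); for the maximally entangled test state it returns Hmin(A|B) ≈ -1.

```python
import numpy as np
import cvxpy as cp

dA, dB = 2, 2

# Test state: maximally entangled two-qubit state (real, so symmetric matrices suffice)
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_AB = np.outer(psi, psi)

# SDP: minimize tr(sigma_B) subject to id_A (x) sigma_B - rho_AB >= 0
sigma_B = cp.Variable((dB, dB), symmetric=True)
slack = cp.Variable((dA * dB, dA * dB), PSD=True)   # encodes the operator inequality
constraints = [cp.kron(np.eye(dA), sigma_B) - rho_AB == slack]
problem = cp.Problem(cp.Minimize(cp.trace(sigma_B)), constraints)
problem.solve(solver=cp.SCS)

print("Hmin(A|B) =", -np.log2(problem.value))   # approximately -1
```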

Classical case
For a classical probability distribution PXY:
Hmin(X|Y) = - log minQ maxx,y PXY(x,y) / QY(y)
Optimization over Q = QY gives
Hmin(X|Y) = - log Σy PY(y) maxx PX|Y(x|y)
Remarks
the r.h.s. corresponds to -log of the average probability of correctly guessing X given Y
this interpretation can be extended to the fully quantum domain [König, Schaffner, RR, 2008]
Hmin(X|Y) is equivalent to the entropy measure used in [Dodis, Smith]
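
This formula translates directly into code; the joint distribution below is an arbitrary illustrative choice of mine.

```python
import numpy as np

def h_min_conditional(P_XY):
    """Hmin(X|Y) = -log2 sum_y P_Y(y) max_x P_{X|Y}(x|y)
                 = -log2 sum_y max_x P_XY(x, y)   (average guessing probability)."""
    p_guess = np.sum(P_XY.max(axis=0))   # best guess of X for each observed y
    return -np.log2(p_guess)

# Example joint distribution (rows: x, columns: y)
P_XY = np.array([[0.30, 0.10],
                 [0.05, 0.25],
                 [0.10, 0.20]])

print("Hmin(X|Y) =", h_min_conditional(P_XY))
print("Hmin(X)   =", -np.log2(P_XY.sum(axis=1).max()))   # unconditioned version, next slide
```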

Min-entropy without conditioning Hmin(X) = - log maxx PX(x)

Smoothing
Definition: Smooth entropy of PX
Hε(X) := maxX' Hmin(X')
where the maximum is taken over all PX' with || PX - PX' || ≤ ε

Smoothing
Definition: Smooth relative entropy of ρ and σ
Dε(ρ || σ) := minρ' Dmin(ρ' || σ)
where the minimum is taken over all ρ' such that || ρ - ρ' || ≤ ε
Definition: Smooth min-entropy of ρAB
Hε(A | B) := - minσB Dε(ρAB || idA ⊗ σB)

Von Neumann entropy as a special case
Consider an i.i.d. state ρA1...An B1...Bn := (ρAB)⊗n.
Lemma: S(A | B) = limε→0 limn→∞ Hε(A1...An | B1...Bn) / n
Remark: The lemma can be extended to spectral entropy rates (see [Han, Verdu] for classical distributions and [Hayashi, Nagaoka, Ogawa, Bowen, Datta] for quantum states).
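
For a classical, unconditional distribution the lemma can be watched numerically. The sketch below uses a simple sub-normalized ℓ1-smoothing (cap the largest probabilities so that a total mass ε is removed; the exact metric and normalization conventions vary in the literature, and this choice is my simplification) and shows Hε(X1...Xn)/n climbing from Hmin(X) = 1 toward S(X) = 2 for the distribution family of the earlier example, here with k = 2.

```python
import numpy as np

def smooth_h_min(P, eps):
    """H^eps_min of a classical distribution under a simple sub-normalized
    l1-smoothing: cap the largest probabilities so that a total mass eps is removed."""
    p = np.sort(P)[::-1]
    cum = np.cumsum(p)
    for m in range(1, len(p) + 1):
        t = (cum[m - 1] - eps) / m          # cap if exactly the top m entries are reduced
        if m == len(p) or t >= p[m]:        # cap must not drop below the next entry
            return -np.log2(t)

# Example distribution with k = 2: Hmin(X) = 1 bit, S(X) = 1 + k/2 = 2 bits
P = np.array([0.5, 0.125, 0.125, 0.125, 0.125])
eps = 0.01

P_n = np.array([1.0])
for n in range(1, 9):
    P_n = np.kron(P_n, P)                   # i.i.d. product distribution (PX)^(x n)
    print(f"n = {n}:  H^eps_min / n = {smooth_h_min(P_n, eps) / n:.3f}")
```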

Definition of a dual quantity
For ρABC pure, the von Neumann entropy satisfies
S(A | B) = S(A B) - S(B) = S(C) - S(A C) = - S(A | C)
Definition: Smooth max-entropy of ρAB
Hmax(A | B) := - Hmin(A | C) for ρABC pure
Observation: Hmax(A) = H1/2(A) (the Rényi entropy of order 1/2)
Hence, Hmax(A) is a measure for the rank of ρA.
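
The unconditional quantity is easy to evaluate: H1/2(A) = 2·log2 tr √ρA depends only on the spectrum and increases with the rank of ρA. A short check on a qubit (test states of my choosing):

```python
import numpy as np

def h_max_unconditional(rho):
    """Hmax(A) = H_{1/2}(A) = 2 * log2( tr sqrt(rho) ), from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    return 2 * np.log2(np.sum(np.sqrt(evals[evals > 1e-12])))

print("Hmax(pure qubit)  =", h_max_unconditional(np.diag([1.0, 0.0])))   # 0 (rank 1)
print("Hmax(mixed qubit) =", h_max_unconditional(np.eye(2) / 2))         # 1 (rank 2)
```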

Operational interpretations of Hmin
extractable randomness: Hmin(X|E) uniform (relative to E) random bits can be extracted from a random variable X [König, RR]; for classical E, this result is known as left-over hashing [ILL]
state merging: - Hmin(A|B) bits of classical communication (from A to B) are needed to merge A with B [Winter, RR]
data compression: Hmax(X|B) bits are needed to store X such that it can be recovered with the help of B [König]
key agreement: (at least) Hmin(X|E) - Hmax(X|B) secret-key bits can be obtained from a source of correlated randomness
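
To make the first item (left-over hashing) concrete, here is a minimal classical sketch with trivial side information E: a random binary matrix acts as a two-universal hash, and hashing a source down to somewhat fewer bits than Hmin(X) yields nearly uniform output. The toy source, the matrix size and the sample count are all illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bits, out_bits, n_samples = 16, 4, 200_000

# Toy source: 16-bit strings that are uniform over a set of size 2**10,
# so Hmin(X) = 10 bits (the top byte takes only 4 of its 256 possible values).
top = rng.integers(0, 4, size=n_samples)
bottom = rng.integers(0, 2**8, size=n_samples)
X = (top << 8) | bottom

# One member of a two-universal family: multiplication by a random binary matrix mod 2.
M = rng.integers(0, 2, size=(out_bits, n_bits))
X_bits = (X[:, None] >> np.arange(n_bits)) & 1           # (n_samples, n_bits), LSB first
K = (X_bits @ M.T) % 2                                    # extracted bits
K_int = K @ (1 << np.arange(out_bits))

# Empirical distance of the extracted out_bits bits from uniform
counts = np.bincount(K_int, minlength=2**out_bits) / n_samples
print("statistical distance from uniform ~", 0.5 * np.sum(np.abs(counts - 2.0**-out_bits)))
```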

Features of Hmin
operational meaning ✓
easy to handle ?

Min-entropy calculus
Chain rules:
Hmin(A B | C) ≳ Hmin(A | B C) + Hmin(B | C)
Hmin(A B | C) ≲ Hmin(A | B C) + Hmax(B | C)
(cf. talk by Robert König)
Strong subadditivity:
Hmin(A | B C) ≤ Hmin(A | B)
... "usual" entropy calculus ...

Example: Proof of strong subadditivity
Lemma: Hmin(A | B C) ≤ Hmin(A | B)
Proof: By definition, we need to show that
Dmin(ρABC || idA ⊗ σBC) ≥ Dmin(ρAB || idA ⊗ σB)  (with σB := trC σBC)
But this inequality holds because
2^λ·idA ⊗ σBC - ρABC ≥ 0 implies 2^λ·idA ⊗ σB - ρAB ≥ 0 (take the partial trace over C).
Note: the lemma implies strong subadditivity for S.
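
A quick sanity check of the lemma in the classical special case, using the guessing-probability form of Hmin(X|Y) from the earlier slide, on randomly drawn joint distributions; the quantum statement is the one proved above, this numerical check is only an illustration of my own.

```python
import numpy as np

def h_min_x_given_rest(P):
    """Hmin(X | remaining systems): axis 0 of P is X, all other axes are conditioned on."""
    return -np.log2(np.sum(np.max(P, axis=0)))   # -log2 of the total guessing probability

rng = np.random.default_rng(2)
for _ in range(5):
    P_XYZ = rng.random((3, 4, 5))
    P_XYZ /= P_XYZ.sum()                              # random joint distribution of (X, Y, Z)
    h_xyz = h_min_x_given_rest(P_XYZ)                 # Hmin(X | Y Z)
    h_xy = h_min_x_given_rest(P_XYZ.sum(axis=2))      # Hmin(X | Y), Z marginalized away
    print(f"Hmin(X|YZ) = {h_xyz:.3f}  <=  Hmin(X|Y) = {h_xy:.3f}")
```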

Features of Hmin
operational meaning ✓
easy to handle ✓ (often easier than von Neumann entropy)

Summary
Hmin generalizes the von Neumann entropy
Main features:
general operational interpretation (i.i.d. assumption not needed, no asymptotics)
easy to handle (simple definition, simpler proofs, e.g., of strong subadditivity)

Applications
quantum key distribution: min-entropy plays a crucial role in general security proofs
cryptography in the bounded storage model: see the talks by Christian Schaffner and by Robert König
...
Open questions
Additivity conjecture: Hmax corresponds to H1/2, for which the additivity conjecture still might hold
New entropy measures based on Hmin?

Thanks for your attention