
1 Information and interactive computation. January 16, 2012. Mark Braverman, Computer Science, Princeton University.

2 Prelude: one-way communication. Basic goal: send a message from Alice to Bob over a channel. (Figure: Alice → communication channel → Bob.)

3 One-way communication. 1) Encode; 2) Send; 3) Decode. (Figure: Alice → communication channel → Bob.)

4 Coding for one-way communication. There are two main problems a good encoding needs to address: – Efficiency: use the least amount of the channel/storage necessary. – Error correction: recover from (reasonable) errors.

5 Interactive computation. Today's theme: extending information and coding theory to interactive computation. I will talk about interactive information theory, and Anup Rao will talk about interactive error correction.

6 Efficient encoding. We can measure the cost of storing a random variable X very precisely. Entropy: H(X) = ∑_x Pr[X=x] log(1/Pr[X=x]). H(X) measures the average amount of information a sample from X reveals. A uniformly random string of 1,000 bits has 1,000 bits of entropy.
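As a quick numeric check of the entropy formula, here is a minimal Python sketch (the example distributions are made up for illustration):

```python
from math import log2

def entropy(dist):
    """H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x]) for a distribution {outcome: probability}."""
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

print(entropy({"heads": 0.5, "tails": 0.5}))      # 1.0 bit
print(entropy({"a": 0.25, "b": 0.25, "c": 0.5}))  # 1.5 bits
# A uniformly random n-bit string has 2^n outcomes of probability 2^-n each,
# so H = 2^n * 2^-n * n = n bits, matching the 1,000-bit example above.
```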

7 Efficient encoding. H(X) = ∑_x Pr[X=x] log(1/Pr[X=x]). The ZIP algorithm works because the entropy of a typical 1MB file is much less than 8 Mbits: P["Hello, my name is Bob"] >> P["h)2cjCv9]dsnC1=Ns{da3"]. For one-way encoding, Shannon's source coding theorem states that Communication ≈ Information.

8 Efficient encoding. Sending many independent samples of X can be implemented with H(X) communication per sample on average. Sending a single sample of X can be implemented with < H(X)+1 communication in expectation.
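One standard way to realize the single-sample bound is a Huffman code, whose expected codeword length lies between H(X) and H(X)+1. A minimal sketch (the distribution below is illustrative, not from the talk):

```python
import heapq
from math import log2

def huffman_lengths(dist):
    """Codeword lengths of a Huffman code for a distribution {symbol: probability}."""
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    counter = len(heap)                       # tie-breaker so dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**c1, **c2}.items()}
        counter += 1
        heapq.heappush(heap, (p1 + p2, counter, merged))
    return heap[0][2]

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(dist)
H = sum(p * log2(1 / p) for p in dist.values())
avg = sum(dist[s] * lengths[s] for s in dist)
print(H, avg, avg < H + 1)                    # 1.75 1.75 True
```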

9 Communication complexity [Yao]. Focus on the two-party setting: Alice holds X, Bob holds Y, and together they implement a functionality F(X,Y), e.g. F(X,Y) = "X=Y?". (Figure: Alice with input X, Bob with input Y, jointly computing F(X,Y).)

10 Communication complexity. Goal: implement a functionality F(X,Y). A protocol π(X,Y) computing F(X,Y) exchanges messages m_1(X), m_2(Y, m_1), m_3(X, m_1, m_2), …, after which both parties output F(X,Y). Communication cost = # of bits exchanged.

11 Distributional communication complexity. The input pair (X,Y) is drawn according to some distribution μ. Goal: make a mistake on at most an ε fraction of inputs. The communication cost: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).

12 Example. μ is a distribution over pairs of files; F is "X=Y?". Protocol: Alice sends MD5(X) (128 bits); Bob replies with "X=Y?" (1 bit). Communication cost = 129 bits, ε ≈ 2^{-128}.
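A minimal sketch of this hash-and-compare protocol (the file contents below are placeholders; MD5 is used only because the slide does):

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    """Alice sends the 128-bit MD5 digest of her file."""
    return hashlib.md5(x).digest()            # 16 bytes = 128 bits

def bob_answer(y: bytes, alice_digest: bytes) -> bool:
    """Bob compares Alice's digest with his own file's digest and sends the 1-bit answer."""
    return hashlib.md5(y).digest() == alice_digest

x = b"Hello, my name is Bob"
y = b"Hello, my name is Bob"
print(bob_answer(y, alice_message(x)))        # True; wrong only when MD5(X)=MD5(Y) but X!=Y
```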

13 Randomized communication complexity. Goal: make a mistake of at most ε on every input. The communication cost: R(F,ε). Clearly C(F,μ,ε) ≤ R(F,ε) for all μ. What about the converse? A minimax(!) argument [Yao]: R(F,ε) = max_μ C(F,μ,ε).

14 A note about the model. We assume a shared public source of randomness R. (Figure: Alice with X and Bob with Y both see the public random string R.)

15 The communication complexity of EQ(X,Y). The communication complexity of equality: R(EQ,ε) ≈ log(1/ε). Send log(1/ε) random hash functions applied to the inputs; accept if all of them agree. What if ε=0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
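A minimal sketch of this public-coin protocol, assuming each "hash" is a random inner product mod 2 of the input bits with a shared random string (one standard choice of hash family, not necessarily the one intended on the slide):

```python
import math
import random

def eq_protocol(x, y, eps, seed=0):
    """Randomized equality test for x, y in {0,1}^n with error probability ~eps.

    Alice and Bob compare k ~ log(1/eps) one-bit hashes; each hash is an inner
    product (mod 2) with a shared random bit string. If x != y, each hash catches
    the difference with probability 1/2, so all k agree w.p. about 2^-k <= eps.
    """
    n = len(x)
    k = math.ceil(math.log2(1 / eps))
    rng = random.Random(seed)                 # stands in for the public random string
    for _ in range(k):
        r = [rng.randint(0, 1) for _ in range(n)]
        hx = sum(a & b for a, b in zip(x, r)) % 2   # Alice's 1-bit hash
        hy = sum(a & b for a, b in zip(y, r)) % 2   # Bob's 1-bit hash
        if hx != hy:
            return False                      # definitely x != y
    return True                               # accept; wrong w.p. <= eps when x != y

x = [0, 1, 1, 0, 1, 0, 0, 1]
print(eq_protocol(x, x, eps=0.01))                          # True
print(eq_protocol(x, [0, 1, 1, 0, 1, 0, 1, 1], eps=0.01))   # almost surely False
```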

16 Information in a two-way channel. H(X) is the "inherent information cost" of sending a message distributed according to X over the channel. What is the two-way analogue of H(X)? (Figure: Alice sends X over the communication channel to Bob.)

17 Entropy of interactive computation. The "inherent information cost" of interactive two-party tasks. (Figure: Alice with X, Bob with Y, shared randomness R.)

18 One more definition: Mutual Information. The mutual information of two random variables is the amount of information knowing one reveals about the other: I(A;B) = H(A) + H(B) - H(AB). If A,B are independent, I(A;B) = 0. I(A;A) = H(A). (Figure: Venn diagram of H(A) and H(B), overlapping in I(A;B).)
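The identity I(A;B) = H(A)+H(B)-H(AB) is easy to check numerically; a minimal sketch (the joint distributions below are made up):

```python
from math import log2

def H(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(A;B) = H(A) + H(B) - H(A,B) for a joint distribution {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0) + p
        pb[b] = pb.get(b, 0) + p
    return H(pa) + H(pb) - H(joint)

copy = {(0, 0): 0.5, (1, 1): 0.5}                          # B is an exact copy of A
print(mutual_information(copy))                            # 1.0 = H(A)
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}     # independent fair bits
print(mutual_information(indep))                           # 0.0
```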

19 Information cost of a protocol [Chakrabarti-Shi-Wirth-Yao-01, Bar-Yossef-Jayram-Kumar-Sivakumar-04, Barak-B-Chen-Rao-10]. Caution: different papers use "information cost" to denote different things! Today, we have a better understanding of the relationship between those different things.

20 Information cost of a protocol. Prior distribution: (X,Y) ~ μ. Alice and Bob run the protocol π, producing the transcript (also denoted π). I(π, μ) = I(π;Y|X) + I(π;X|Y) = what Alice learns about Y + what Bob learns about X.

21 External information cost. (X,Y) ~ μ. Alice and Bob run the protocol π, producing the transcript π. I^ext(π, μ) = I(π;XY) = what an external observer Charlie learns about (X,Y).

22 Another view on I and I^ext. It is always the case that C(π, μ) ≥ I^ext(π, μ) ≥ I(π, μ). I^ext measures the ability of Alice and Bob to compute F(X,Y) in an information-theoretically secure way if they are afraid of an eavesdropper. I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.

23 Example. F is "X=Y?". μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. Protocol: Alice sends MD5(X) [128 bits]; Bob replies "X=Y?" [1 bit]. I^ext(π, μ) = I(π;XY) = 129 bits, i.e. what Charlie learns about (X,Y).

24 Example. F is "X=Y?". μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. Same protocol: MD5(X) [128 bits], then "X=Y?" [1 bit]. I(π, μ) = I(π;Y|X) + I(π;X|Y) = what Alice learns about Y + what Bob learns about X ≈ 1 + 64.5 = 65.5 bits.
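A rough accounting for these numbers (a heuristic sketch under the prior above, not the exact calculation from the talk): Alice only ever learns the one answer bit, while Bob learns the hash, which is informative about X only when X ≠ Y.

```latex
I(\pi;Y\mid X) \approx H(\mathbf{1}[X=Y]) = 1 \text{ bit}, \qquad
I(\pi;X\mid Y) \approx \underbrace{\tfrac12 \cdot 128}_{X \neq Y:\ \text{hash is fresh}}
               + \underbrace{\tfrac12 \cdot 0}_{X = Y:\ \text{hash already predictable}}
               + \underbrace{\tfrac12}_{\text{which case occurred}}
               \approx 64.5 \text{ bits},
```

so I(π, μ) ≈ 1 + 64.5 = 65.5 bits.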

25 The (distributional) information cost of a problem F. Recall: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ). By analogy: I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ); I^ext(F,μ,ε) := inf_{π computes F with error ≤ ε} I^ext(π, μ).

26 I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation. Source Coding Theorem: the problem of sending a sample of X can be implemented in expected cost < H(X)+1 communication, i.e. roughly the information content of X. Is the same compression true for interactive protocols? Can F be solved in I(F,μ,ε) communication? Or in I^ext(F,μ,ε) communication?

27 The big question. Can interactive communication be compressed? Can π be simulated by π' such that C(π', μ) ≈ I(π, μ)? Does I(F,μ,ε) ≈ C(F,μ,ε)?

28 Compression results we know. Let ε, ρ be constants; let π be a protocol that computes F with error ε, with costs C, I^ext, I. Then π can be simulated using:
– (I·C)^{1/2}·polylog(C) communication; [Barak-B-Chen-Rao'10]
– I^ext·polylog(C) communication; [Barak-B-Chen-Rao'10]
– 2^{O(I)} communication; [B'11]
while introducing an extra error of ρ.

29 The amortized cost of interactive computation. Source Coding Theorem: the amortized cost of sending many independent samples of X is H(X) per sample. What is the amortized cost of computing many independent copies of F(X,Y)?

30 Information = amortized communication. Theorem [B-Rao'11]: for ε > 0, I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. I(F,μ,ε) is the interactive analogue of H(X).

31 Information = amortized communication. Theorem [B-Rao'11]: for ε > 0, I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. I(F,μ,ε) is the interactive analogue of H(X). Can we get rid of μ, i.e. make I(F,ε) a property of the task F, just as R(F,ε) is the prior-free counterpart of C(F,μ,ε)?

32 Prior-free information cost. Define: I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). We want a protocol that reveals little information against all priors μ! Definitions are cheap! What is the connection between the "syntactic" I(F,ε) and the "meaningful" I(F,μ,ε)? I(F,μ,ε) ≤ I(F,ε)…

33 Prior-free information cost. I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). I(F,μ,ε) ≤ I(F,ε) for all μ. Recall: R(F,ε) = max_μ C(F,μ,ε). Theorem [B'11]: I(F,ε) ≤ 2 · max_μ I(F,μ,ε/2). I(F,0) = max_μ I(F,μ,0).

34 Prior-free information cost. Recall: I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. Theorem: for ε > 0, I(F,ε) = lim_{n→∞} R(F^n, ε)/n.

35 Example. R(EQ,0) ≈ n. What is I(EQ,0)?

36 The information cost of Equality. What is I(EQ,0)? Consider the following protocol. X, Y ∈ {0,1}^n, and the players share a random non-singular matrix A with rows A_1, A_2, …. In round i, Alice sends A_i·X and Bob sends A_i·Y. Continue for n steps, or until a disagreement is discovered.
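A minimal simulation of this protocol, assuming the products A_i·X are inner products over GF(2) and that A is drawn from the shared randomness (rejection sampling is just one convenient way to get a non-singular matrix):

```python
import random

def random_invertible_matrix(n, rng):
    """Sample a random non-singular n x n matrix over GF(2) by rejection."""
    while True:
        A = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        M = [row[:] for row in A]             # Gaussian elimination over GF(2)
        rank = 0
        for col in range(n):
            pivot = next((r for r in range(rank, n) if M[r][col]), None)
            if pivot is None:
                break
            M[rank], M[pivot] = M[pivot], M[rank]
            for r in range(n):
                if r != rank and M[r][col]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[rank])]
            rank += 1
        if rank == n:
            return A

def eq_zero_error(x, y, seed=0):
    """Zero-error equality: compare A_i·X and A_i·Y (mod 2) round by round."""
    n = len(x)
    rng = random.Random(seed)                 # stands in for shared public randomness
    A = random_invertible_matrix(n, rng)
    for i in range(n):
        ax = sum(a * b for a, b in zip(A[i], x)) % 2
        ay = sum(a * b for a, b in zip(A[i], y)) % 2
        if ax != ay:
            return False, i + 1               # disagreement found: X != Y for sure
    return True, n                            # A non-singular and A·X = A·Y, so X = Y

print(eq_zero_error([0, 1, 1, 0], [0, 1, 1, 0]))   # (True, 4)
print(eq_zero_error([0, 1, 1, 0], [1, 1, 1, 0]))   # (False, r), typically after O(1) rounds
```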

37 Analysis (sketch). If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information. If X=Y… the players only learn the fact that X=Y (≤1 bit of information). Thus the protocol has O(1) information complexity.
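Why O(1) rounds when X ≠ Y: heuristically, each fresh row agrees on X and Y with probability about ½, so the expected number of rounds is a geometric-type sum (a back-of-the-envelope bound that ignores the mild conditioning introduced by non-singularity):

```latex
\mathbb{E}[\#\text{rounds} \mid X \neq Y]
  \;\approx\; \sum_{i \ge 1} \Pr[A_1 X = A_1 Y, \dots, A_{i-1} X = A_{i-1} Y]
  \;\approx\; \sum_{i \ge 1} 2^{-(i-1)} \;=\; 2 .
```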

38 Direct sum theorems. I(F,ε) = lim_{n→∞} R(F^n, ε)/n. Questions: – Does R(F^n,ε) = Ω(n·R(F,ε))? – Does R(F^n,ε) = ω(R(F,ε))?

39 Direct sum strategy. The strategy for proving direct sum results: take a protocol for F^n that costs C_n = R(F^n,ε), and make a protocol for F that costs ≈ C_n/n. This would mean that C_n ≳ n·R(F,ε), up to the loss in the conversion. (Figure: a protocol for n copies of F, cost C_n, converted into a protocol for 1 copy of F of cost ≈ C_n/n?)

40 Direct sum strategy. If life were so simple… (Figure: simply split the cost-C_n protocol evenly across Copy 1, Copy 2, …, Copy n, getting C_n/n per copy. "Easy!")

41 Direct sum strategy. Theorem: I(F,ε) = I(F^n,ε)/n ≤ R(F^n,ε)/n = C_n/n. Compression → direct sum!

42 The information cost angle. Restricting the n-copy protocol to a single copy of F gives a protocol of communication cost C_n but information cost ≤ C_n/n. (Figure: the cost-C_n protocol for Copy 1, …, Copy n, restricted to 1 copy of F, carries only C_n/n information. Compression?)

43 Direct sum theorems. Best known general simulation [BBCR'10]: a protocol with C communication and I information cost can be simulated using (I·C)^{1/2}·polylog(C) communication. Implies: R(F^n,ε) = Ω̃(n^{1/2}·R(F,ε)).

44 Compression vs. direct sum. We saw that compression → direct sum. A form of the converse is also true. Recall: I(F,ε) = lim_{n→∞} R(F^n,ε)/n. If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n,ε) = o(n·R(F,ε)).

45 A complete problem. We can define a problem called Correlated Pointer Jumping, CPJ(C,I). The problem has communication cost C and information cost I. CPJ(C,I) is the "least compressible problem": if R(CPJ(C,I),1/3) = O(I), then R(F,1/3) = O(I(F,1/3)) for all F.

46 The big picture. (Diagram: R(F,ε), R(F^n,ε)/n, I(F,ε), and I(F^n,ε)/n, linked by: direct sum for information; information = amortized communication; direct sum for communication?; interactive compression?)

47 Partial progress. We can compress bounded-round interactive protocols. The main primitive is a one-shot version of the Slepian-Wolf theorem: Alice gets a distribution P_X, Bob gets a prior distribution P_Y, and both must end up with a sample from P_X.

48 Correlated sampling. Alice holds P_X, Bob holds P_Y; the protocol must produce M ~ P_X known to both parties. The best we can hope for is D(P_X || P_Y) communication.

49 Proof Idea. Sample using D(P_X || P_Y) + O(log(1/ε) + D(P_X || P_Y)^{1/2}) communication with statistical error ε. (Figure: the public randomness provides ~|U| sample points u_1, u_2, u_3, u_4, … with uniform heights q_1, q_2, …; points falling under the graph of P_X are Alice's candidates, and points under P_Y are Bob's.)

50 Proof Idea (continued). (Figure: Alice's first candidate under P_X is u_4, while Bob's candidate under P_Y is u_2; Alice sends hashes h_1(u_4), h_2(u_4) so Bob can test whether his candidate matches.)

51 Proof Idea (continued). (Figure: when no candidate under P_Y matches, Bob doubles to candidates under 2·P_Y (then 4·P_Y, …) while Alice sends further hashes h_3(u_4), …, h_{log 1/ε}(u_4), until Bob locates u_4.)

52 Analysis. If P_X(u_4) ≈ 2^k·P_Y(u_4), then the protocol will reach round k of doubling. There will be ≈ 2^k candidates, and about k + log(1/ε) hashes are sent. The contribution of u_4 to the cost: P_X(u_4)·(log(P_X(u_4)/P_Y(u_4)) + log(1/ε)). Done!
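A toy end-to-end simulation of this sampling primitive (not the exact construction from the talk: the darts, the doubling of P_Y, and the random 1-bit hashes all live in the shared randomness, and this sketch simply sends enough hash bits to isolate a likely-unique match rather than the k + log(1/ε) bits of the real analysis):

```python
import math
import random

def correlated_sample(p_x, p_y, universe, eps=0.01, seed=0):
    """Bob reconstructs a sample drawn from Alice's distribution p_x.

    Shared randomness: darts (u_i, q_i) with u_i uniform over the universe and
    q_i uniform in [0, 1]. Alice outputs the element of the first dart under the
    graph of p_x (an exact sample from p_x). Bob searches for that dart among his
    candidates under 2^k * p_y for k = 0, 1, 2, ..., using shared random 1-bit
    hashes of dart indices to identify it; he errs with probability ~eps.
    """
    rng = random.Random(seed)                 # stands in for public randomness
    n_darts = 200 * len(universe)
    darts = [(rng.choice(universe), rng.random()) for _ in range(n_darts)]
    hashes = [[rng.randint(0, 1) for _ in range(n_darts)] for _ in range(64)]

    alice = next(i for i, (u, q) in enumerate(darts) if q < p_x[u])   # Alice's dart

    k, sent = 0, 0
    while True:
        cands = [i for i, (u, q) in enumerate(darts) if q < min(1.0, 2 ** k * p_y[u])]
        need = math.ceil(math.log2(len(cands) + 1)) + math.ceil(math.log2(1 / eps))
        sent = max(sent, need)                # hash bits Alice has revealed so far
        match = [i for i in cands
                 if all(hashes[j][i] == hashes[j][alice] for j in range(sent))]
        if match:
            return darts[match[0]][0], sent   # Bob's output and #hash bits used
        k += 1                                # no candidate matched: double p_y

universe = ["a", "b", "c", "d"]
p_x = {"a": 0.7, "b": 0.1, "c": 0.1, "d": 0.1}
p_y = {"a": 0.1, "b": 0.3, "c": 0.3, "d": 0.3}
print(correlated_sample(p_x, p_y, universe))  # usually ('a', ...), since p_x favors 'a'
```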

53 Directions. Can interactive communication be fully compressed? Is R(F,ε) = I(F,ε)? What is the relationship between I(F,ε), I^ext(F,ε) and R(F,ε)? Many other questions in interactive coding theory!

54 Thank You

