
1 Information and interactive computation. January 16, 2012. Mark Braverman, Computer Science, Princeton University.

2 Prelude: one-way communication. Basic goal: send a message from Alice to Bob over a channel. (Figure: Alice → communication channel → Bob.)

3 One-way communication. 1) Encode; 2) Send; 3) Decode. (Figure: Alice → communication channel → Bob.)

4 Coding for one-way communication. There are two main problems a good encoding needs to address: – Efficiency: use the least amount of the channel/storage necessary. – Error correction: recover from (reasonable) errors.

5 Interactive computation. Today's theme: extending information and coding theory to interactive computation. I will talk about interactive information theory, and Anup Rao will talk about interactive error correction.

6 Efficient encoding. We can measure the cost of storing a random variable X very precisely. Entropy: H(X) = ∑_x Pr[X=x] log(1/Pr[X=x]). H(X) measures the average amount of information a sample from X reveals. A uniformly random string of 1,000 bits has 1,000 bits of entropy.
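As a quick numeric check of the entropy formula, here is a minimal Python sketch (the example distributions are made up for illustration):

```python
from math import log2

def entropy(dist):
    """H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x]) for a distribution {outcome: probability}."""
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

print(entropy({"heads": 0.5, "tails": 0.5}))      # 1.0 bit
print(entropy({"a": 0.25, "b": 0.25, "c": 0.5}))  # 1.5 bits
# A uniformly random n-bit string has 2^n outcomes of probability 2^-n each,
# so H = 2^n * 2^-n * n = n bits, matching the 1,000-bit example above.
```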

7 Efficient encoding. H(X) = ∑_x Pr[X=x] log(1/Pr[X=x]). The ZIP algorithm works because the entropy of a typical 1MB file is much less than 8 Mbits: P["Hello, my name is Bob"] >> P["h)2cjCv9]dsnC1=Ns{da3"]. For one-way encoding, Shannon's source coding theorem states that Communication ≈ Information.

8 Efficient encoding. Sending many independent samples of X can be implemented with H(X) communication per sample on average. Sending a single sample of X can be implemented with < H(X)+1 communication in expectation.
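One standard way to realize the single-sample bound is a Huffman code, whose expected codeword length lies between H(X) and H(X)+1. A minimal sketch (the distribution below is illustrative, not from the talk):

```python
import heapq
from math import log2

def huffman_lengths(dist):
    """Codeword lengths of a Huffman code for a distribution {symbol: probability}."""
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    counter = len(heap)                       # tie-breaker so dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**c1, **c2}.items()}
        counter += 1
        heapq.heappush(heap, (p1 + p2, counter, merged))
    return heap[0][2]

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(dist)
H = sum(p * log2(1 / p) for p in dist.values())
avg = sum(dist[s] * lengths[s] for s in dist)
print(H, avg, avg < H + 1)                    # 1.75 1.75 True
```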

9 Communication complexity [Yao]. Focus on the two-party setting: Alice holds X, Bob holds Y, and together they implement a functionality F(X,Y), e.g. F(X,Y) = "X=Y?". (Figure: Alice with input X, Bob with input Y, jointly computing F(X,Y).)

10 Communication complexity. Goal: implement a functionality F(X,Y). A protocol π(X,Y) computing F(X,Y) exchanges messages m_1(X), m_2(Y, m_1), m_3(X, m_1, m_2), …, after which both parties output F(X,Y). Communication cost = # of bits exchanged.

11 Distributional communication complexity. The input pair (X,Y) is drawn according to some distribution μ. Goal: make a mistake on at most an ε fraction of inputs. The communication cost: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).

12 Example. μ is a distribution over pairs of files; F is "X=Y?". Protocol: Alice sends MD5(X) (128 bits); Bob replies with "X=Y?" (1 bit). Communication cost = 129 bits, ε ≈ 2^{-128}.
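A minimal sketch of this hash-and-compare protocol (the file contents below are placeholders; MD5 is used only because the slide does):

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    """Alice sends the 128-bit MD5 digest of her file."""
    return hashlib.md5(x).digest()            # 16 bytes = 128 bits

def bob_answer(y: bytes, alice_digest: bytes) -> bool:
    """Bob compares Alice's digest with his own file's digest and sends the 1-bit answer."""
    return hashlib.md5(y).digest() == alice_digest

x = b"Hello, my name is Bob"
y = b"Hello, my name is Bob"
print(bob_answer(y, alice_message(x)))        # True; wrong only when MD5(X)=MD5(Y) but X!=Y
```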

13 Randomized communication complexity. Goal: make a mistake of at most ε on every input. The communication cost: R(F,ε). Clearly C(F,μ,ε) ≤ R(F,ε) for all μ. What about the converse? A minimax(!) argument [Yao]: R(F,ε) = max_μ C(F,μ,ε).

14 A note about the model. We assume a shared public source of randomness R. (Figure: Alice with X and Bob with Y both see the public random string R.)

15 The communication complexity of EQ(X,Y). The communication complexity of equality: R(EQ,ε) ≈ log(1/ε). Send log(1/ε) random hash functions applied to the inputs; accept if all of them agree. What if ε=0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
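A minimal sketch of this public-coin protocol, assuming each "hash" is a random inner product mod 2 of the input bits with a shared random string (one standard choice of hash family, not necessarily the one intended on the slide):

```python
import math
import random

def eq_protocol(x, y, eps, seed=0):
    """Randomized equality test for x, y in {0,1}^n with error probability ~eps.

    Alice and Bob compare k ~ log(1/eps) one-bit hashes; each hash is an inner
    product (mod 2) with a shared random bit string. If x != y, each hash catches
    the difference with probability 1/2, so all k agree w.p. about 2^-k <= eps.
    """
    n = len(x)
    k = math.ceil(math.log2(1 / eps))
    rng = random.Random(seed)                 # stands in for the public random string
    for _ in range(k):
        r = [rng.randint(0, 1) for _ in range(n)]
        hx = sum(a & b for a, b in zip(x, r)) % 2   # Alice's 1-bit hash
        hy = sum(a & b for a, b in zip(y, r)) % 2   # Bob's 1-bit hash
        if hx != hy:
            return False                      # definitely x != y
    return True                               # accept; wrong w.p. <= eps when x != y

x = [0, 1, 1, 0, 1, 0, 0, 1]
print(eq_protocol(x, x, eps=0.01))                          # True
print(eq_protocol(x, [0, 1, 1, 0, 1, 0, 1, 1], eps=0.01))   # almost surely False
```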

16 Information in a two-way channel. H(X) is the "inherent information cost" of sending a message distributed according to X over the channel. What is the two-way analogue of H(X)? (Figure: Alice sends X over the communication channel to Bob.)

17 Entropy of interactive computation. The "inherent information cost" of interactive two-party tasks. (Figure: Alice with X, Bob with Y, shared randomness R.)

18 One more definition: Mutual Information. The mutual information of two random variables is the amount of information knowing one reveals about the other: I(A;B) = H(A) + H(B) - H(AB). If A,B are independent, I(A;B) = 0. I(A;A) = H(A). (Figure: Venn diagram of H(A) and H(B), overlapping in I(A;B).)
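The identity I(A;B) = H(A)+H(B)-H(AB) is easy to check numerically; a minimal sketch (the joint distributions below are made up):

```python
from math import log2

def H(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(A;B) = H(A) + H(B) - H(A,B) for a joint distribution {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0) + p
        pb[b] = pb.get(b, 0) + p
    return H(pa) + H(pb) - H(joint)

copy = {(0, 0): 0.5, (1, 1): 0.5}                          # B is an exact copy of A
print(mutual_information(copy))                            # 1.0 = H(A)
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}     # independent fair bits
print(mutual_information(indep))                           # 0.0
```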

19 Information cost of a protocol [Chakrabarti-Shi-Wirth-Yao-01, Bar-Yossef-Jayram-Kumar-Sivakumar-04, Barak-B-Chen-Rao-10]. Caution: different papers use "information cost" to denote different things! Today, we have a better understanding of the relationship between those different things.

20 Information cost of a protocol. Prior distribution: (X,Y) ~ μ. Alice and Bob run the protocol π, producing the transcript (also denoted π). I(π, μ) = I(π;Y|X) + I(π;X|Y) = what Alice learns about Y + what Bob learns about X.

21 External information cost. (X,Y) ~ μ. Alice and Bob run the protocol π, producing the transcript π. I^ext(π, μ) = I(π;XY) = what an external observer Charlie learns about (X,Y).

22 Another view on I and I^ext. It is always the case that C(π, μ) ≥ I^ext(π, μ) ≥ I(π, μ). I^ext measures the ability of Alice and Bob to compute F(X,Y) in an information-theoretically secure way if they are afraid of an eavesdropper. I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.

23 Example. F is "X=Y?". μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. Protocol: Alice sends MD5(X) [128 bits]; Bob replies "X=Y?" [1 bit]. I^ext(π, μ) = I(π;XY) = 129 bits, i.e. what Charlie learns about (X,Y).

24 Example. F is "X=Y?". μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random. Same protocol: MD5(X) [128 bits], then "X=Y?" [1 bit]. I(π, μ) = I(π;Y|X) + I(π;X|Y) = what Alice learns about Y + what Bob learns about X ≈ 1 + 64.5 = 65.5 bits.
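A rough accounting for these numbers (a heuristic sketch under the prior above, not the exact calculation from the talk): Alice only ever learns the one answer bit, while Bob learns the hash, which is informative about X only when X ≠ Y.

```latex
I(\pi;Y\mid X) \approx H(\mathbf{1}[X=Y]) = 1 \text{ bit}, \qquad
I(\pi;X\mid Y) \approx \underbrace{\tfrac12 \cdot 128}_{X \neq Y:\ \text{hash is fresh}}
               + \underbrace{\tfrac12 \cdot 0}_{X = Y:\ \text{hash already predictable}}
               + \underbrace{\tfrac12}_{\text{which case occurred}}
               \approx 64.5 \text{ bits},
```

so I(π, μ) ≈ 1 + 64.5 = 65.5 bits.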

25 The (distributional) information cost of a problem F. Recall: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ). By analogy: I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ); I^ext(F,μ,ε) := inf_{π computes F with error ≤ ε} I^ext(π, μ).

26 I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation. Source Coding Theorem: the problem of sending a sample of X can be implemented in expected cost < H(X)+1 communication, i.e. roughly the information content of X. Is the same compression true for interactive protocols? Can F be solved in I(F,μ,ε) communication? Or in I^ext(F,μ,ε) communication?

27 The big question. Can interactive communication be compressed? Can π be simulated by π' such that C(π', μ) ≈ I(π, μ)? Does I(F,μ,ε) ≈ C(F,μ,ε)?

28 Compression results we know. Let ε, ρ be constants; let π be a protocol that computes F with error ε, with costs C, I^ext, I. Then π can be simulated using:
– (I·C)^{1/2}·polylog(C) communication; [Barak-B-Chen-Rao'10]
– I^ext·polylog(C) communication; [Barak-B-Chen-Rao'10]
– 2^{O(I)} communication; [B'11]
while introducing an extra error of ρ.

29 The amortized cost of interactive computation. Source Coding Theorem: the amortized cost of sending many independent samples of X is H(X) per sample. What is the amortized cost of computing many independent copies of F(X,Y)?

30 Information = amortized communication. Theorem [B-Rao'11]: for ε > 0, I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. I(F,μ,ε) is the interactive analogue of H(X).

31 Information = amortized communication. Theorem [B-Rao'11]: for ε > 0, I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. I(F,μ,ε) is the interactive analogue of H(X). Can we get rid of μ, i.e. make I(F,ε) a property of the task F, just as R(F,ε) is the prior-free counterpart of C(F,μ,ε)?

32 Prior-free information cost. Define: I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). We want a protocol that reveals little information against all priors μ! Definitions are cheap! What is the connection between the "syntactic" I(F,ε) and the "meaningful" I(F,μ,ε)? I(F,μ,ε) ≤ I(F,ε)…

33 Prior-free information cost. I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). I(F,μ,ε) ≤ I(F,ε) for all μ. Recall: R(F,ε) = max_μ C(F,μ,ε). Theorem [B'11]: I(F,ε) ≤ 2 · max_μ I(F,μ,ε/2). I(F,0) = max_μ I(F,μ,0).

34 Prior-free information cost. Recall: I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n. Theorem: for ε > 0, I(F,ε) = lim_{n→∞} R(F^n, ε)/n.

35 Example. R(EQ,0) ≈ n. What is I(EQ,0)?

36 The information cost of Equality. What is I(EQ,0)? Consider the following protocol. X, Y ∈ {0,1}^n, and the players share a random non-singular matrix A with rows A_1, A_2, …. In round i, Alice sends A_i·X and Bob sends A_i·Y. Continue for n steps, or until a disagreement is discovered.
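A minimal simulation of this protocol, assuming the products A_i·X are inner products over GF(2) and that A is drawn from the shared randomness (rejection sampling is just one convenient way to get a non-singular matrix):

```python
import random

def random_invertible_matrix(n, rng):
    """Sample a random non-singular n x n matrix over GF(2) by rejection."""
    while True:
        A = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        M = [row[:] for row in A]             # Gaussian elimination over GF(2)
        rank = 0
        for col in range(n):
            pivot = next((r for r in range(rank, n) if M[r][col]), None)
            if pivot is None:
                break
            M[rank], M[pivot] = M[pivot], M[rank]
            for r in range(n):
                if r != rank and M[r][col]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[rank])]
            rank += 1
        if rank == n:
            return A

def eq_zero_error(x, y, seed=0):
    """Zero-error equality: compare A_i·X and A_i·Y (mod 2) round by round."""
    n = len(x)
    rng = random.Random(seed)                 # stands in for shared public randomness
    A = random_invertible_matrix(n, rng)
    for i in range(n):
        ax = sum(a * b for a, b in zip(A[i], x)) % 2
        ay = sum(a * b for a, b in zip(A[i], y)) % 2
        if ax != ay:
            return False, i + 1               # disagreement found: X != Y for sure
    return True, n                            # A non-singular and A·X = A·Y, so X = Y

print(eq_zero_error([0, 1, 1, 0], [0, 1, 1, 0]))   # (True, 4)
print(eq_zero_error([0, 1, 1, 0], [1, 1, 1, 0]))   # (False, r), typically after O(1) rounds
```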

37 Analysis (sketch). If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information. If X=Y… the players only learn the fact that X=Y (≤1 bit of information). Thus the protocol has O(1) information complexity.
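Why O(1) rounds when X ≠ Y: heuristically, each fresh row agrees on X and Y with probability about ½, so the expected number of rounds is a geometric-type sum (a back-of-the-envelope bound that ignores the mild conditioning introduced by non-singularity):

```latex
\mathbb{E}[\#\text{rounds} \mid X \neq Y]
  \;\approx\; \sum_{i \ge 1} \Pr[A_1 X = A_1 Y, \dots, A_{i-1} X = A_{i-1} Y]
  \;\approx\; \sum_{i \ge 1} 2^{-(i-1)} \;=\; 2 .
```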

38 Direct sum theorems. I(F,ε) = lim_{n→∞} R(F^n, ε)/n. Questions: – Does R(F^n,ε) = Ω(n·R(F,ε))? – Does R(F^n,ε) = ω(R(F,ε))?

39 Direct sum strategy. The strategy for proving direct sum results: take a protocol for F^n that costs C_n = R(F^n,ε), and make a protocol for F that costs ≈ C_n/n. This would mean that C_n ≳ n·R(F,ε), up to the loss in the conversion. (Figure: a protocol for n copies of F, cost C_n, converted into a protocol for 1 copy of F of cost ≈ C_n/n?)

40 Direct sum strategy. If life were so simple… (Figure: simply split the cost-C_n protocol evenly across Copy 1, Copy 2, …, Copy n, getting C_n/n per copy. "Easy!")

41 Direct sum strategy. Theorem: I(F,ε) = I(F^n,ε)/n ≤ R(F^n,ε)/n = C_n/n. Compression → direct sum!

42 The information cost angle. Restricting the n-copy protocol to a single copy of F gives a protocol of communication cost C_n but information cost ≤ C_n/n. (Figure: the cost-C_n protocol for Copy 1, …, Copy n, restricted to 1 copy of F, carries only C_n/n information. Compression?)

43 Direct sum theorems. Best known general simulation [BBCR'10]: a protocol with C communication and I information cost can be simulated using (I·C)^{1/2}·polylog(C) communication. Implies: R(F^n,ε) = Ω̃(n^{1/2}·R(F,ε)).

44 Compression vs. direct sum. We saw that compression → direct sum. A form of the converse is also true. Recall: I(F,ε) = lim_{n→∞} R(F^n,ε)/n. If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n,ε) = o(n·R(F,ε)).

45 A complete problem. We can define a problem called Correlated Pointer Jumping, CPJ(C,I). The problem has communication cost C and information cost I. CPJ(C,I) is the "least compressible problem": if R(CPJ(C,I),1/3) = O(I), then R(F,1/3) = O(I(F,1/3)) for all F.

46 The big picture. (Diagram: R(F,ε), R(F^n,ε)/n, I(F,ε), and I(F^n,ε)/n, linked by: direct sum for information; information = amortized communication; direct sum for communication?; interactive compression?)

47 Partial progress. We can compress bounded-round interactive protocols. The main primitive is a one-shot version of the Slepian-Wolf theorem: Alice gets a distribution P_X, Bob gets a prior distribution P_Y, and both must end up with a sample from P_X.

48 Correlated sampling. Alice holds P_X, Bob holds P_Y; the protocol must produce M ~ P_X known to both parties. The best we can hope for is D(P_X || P_Y) communication.

49 Proof Idea. Sample using D(P_X || P_Y) + O(log(1/ε) + D(P_X || P_Y)^{1/2}) communication with statistical error ε. (Figure: the public randomness provides ~|U| sample points u_1, u_2, u_3, u_4, … with uniform heights q_1, q_2, …; points falling under the graph of P_X are Alice's candidates, and points under P_Y are Bob's.)

50 Proof Idea (continued). (Figure: Alice's first candidate under P_X is u_4, while Bob's candidate under P_Y is u_2; Alice sends hashes h_1(u_4), h_2(u_4) so Bob can test whether his candidate matches.)

51 Proof Idea (continued). (Figure: when no candidate under P_Y matches, Bob doubles to candidates under 2·P_Y (then 4·P_Y, …) while Alice sends further hashes h_3(u_4), …, h_{log 1/ε}(u_4), until Bob locates u_4.)

52 Analysis. If P_X(u_4) ≈ 2^k·P_Y(u_4), then the protocol will reach round k of doubling. There will be ≈ 2^k candidates, and about k + log(1/ε) hashes are sent. The contribution of u_4 to the cost: P_X(u_4)·(log(P_X(u_4)/P_Y(u_4)) + log(1/ε)). Done!
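A toy end-to-end simulation of this sampling primitive (not the exact construction from the talk: the darts, the doubling of P_Y, and the random 1-bit hashes all live in the shared randomness, and this sketch simply sends enough hash bits to isolate a likely-unique match rather than the k + log(1/ε) bits of the real analysis):

```python
import math
import random

def correlated_sample(p_x, p_y, universe, eps=0.01, seed=0):
    """Bob reconstructs a sample drawn from Alice's distribution p_x.

    Shared randomness: darts (u_i, q_i) with u_i uniform over the universe and
    q_i uniform in [0, 1]. Alice outputs the element of the first dart under the
    graph of p_x (an exact sample from p_x). Bob searches for that dart among his
    candidates under 2^k * p_y for k = 0, 1, 2, ..., using shared random 1-bit
    hashes of dart indices to identify it; he errs with probability ~eps.
    """
    rng = random.Random(seed)                 # stands in for public randomness
    n_darts = 200 * len(universe)
    darts = [(rng.choice(universe), rng.random()) for _ in range(n_darts)]
    hashes = [[rng.randint(0, 1) for _ in range(n_darts)] for _ in range(64)]

    alice = next(i for i, (u, q) in enumerate(darts) if q < p_x[u])   # Alice's dart

    k, sent = 0, 0
    while True:
        cands = [i for i, (u, q) in enumerate(darts) if q < min(1.0, 2 ** k * p_y[u])]
        need = math.ceil(math.log2(len(cands) + 1)) + math.ceil(math.log2(1 / eps))
        sent = max(sent, need)                # hash bits Alice has revealed so far
        match = [i for i in cands
                 if all(hashes[j][i] == hashes[j][alice] for j in range(sent))]
        if match:
            return darts[match[0]][0], sent   # Bob's output and #hash bits used
        k += 1                                # no candidate matched: double p_y

universe = ["a", "b", "c", "d"]
p_x = {"a": 0.7, "b": 0.1, "c": 0.1, "d": 0.1}
p_y = {"a": 0.1, "b": 0.3, "c": 0.3, "d": 0.3}
print(correlated_sample(p_x, p_y, universe))  # usually ('a', ...), since p_x favors 'a'
```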

53 Directions. Can interactive communication be fully compressed? Is R(F,ε) = I(F,ε)? What is the relationship between I(F,ε), I^ext(F,ε) and R(F,ε)? Many other questions in interactive coding theory!

54 Thank You

