The Price of Uncertainty in Communication
Brendan Juba (Washington U., St. Louis) with Mark Braverman (Princeton)
"Since we all agree on a probability distribution over what I might say, I can compress it to: 'the 9,232,142,124,214,214,123,845th most likely message. Thank you!'"
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Coding schemes
[Figure: a coding scheme relating "messages" to "encodings", illustrated with words such as bird, chicken, cat, dinner, pet, lamb, duck, cow, dog.]
Communication model
[Figure: the communication model; recall that a transmitted encoding e of the message "cat" forms a pair (e, cat) ∈ E.]
Ambiguity
[Figure: an ambiguous coding scheme over the same words (bird, chicken, cat, dinner, pet, lamb, duck, cow, dog).]
Prior distributions
[Figure: the same coding scheme with a prior distribution over messages.] Decode to a maximum-likelihood message.
Source coding (compression)
Assume encodings are binary strings. Given a prior distribution P and a message m, choose the minimum-length encoding that decodes to m. For example: Huffman codes and Shannon-Fano (arithmetic) codes. Note that these schemes depend on the prior.
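To illustrate how such a scheme depends on the prior, here is a minimal Python sketch of Huffman coding; the message names and probabilities are invented for the example.

```python
import heapq

def huffman_code(prior):
    """Build a binary prefix code from a prior {message: probability}.
    Codeword lengths track -log2 P(m), so the code depends on the prior."""
    # Heap entries: (probability, unique tie-breaker, {message: codeword}).
    heap = [(p, i, {m: ""}) for i, (m, p) in enumerate(prior.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {m: "0" + w for m, w in code0.items()}
        merged.update({m: "1" + w for m, w in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical prior over messages:
code = huffman_code({"cat": 0.5, "dog": 0.25, "bird": 0.125, "cow": 0.125})
# e.g. {"cat": "0", "dog": "10", "bird": "110", "cow": "111"} (up to relabeling)
```

Changing the prior changes every codeword, which is exactly why a shared scheme with unshared priors becomes a problem on the next slide.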
Suppose Alice and Bob share the same encoding scheme, but don't share the same prior: one holds P, the other Q. Can they communicate? How efficiently?
"The cat. The orange cat. The orange cat without a hat."
Closeness and communication
Priors P and Q are α-close (α ≥ 1) if for every message m, αP(m) ≥ Q(m) and αQ(m) ≥ P(m).
Disambiguation and closeness together suffice for communication. Say that e is "α²-disambiguated" for m under P if P[m|e] > α²·P[m'|e] for every m' ≠ m. Then
Q[m|e] ≥ (1/α)·P[m|e] > α·P[m'|e] ≥ Q[m'|e],
so if Alice sends such an e, maximum-likelihood decoding gives Bob m and not m'.
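A minimal sketch of the closeness condition in Python; the two priors and the value of α are made up for illustration.

```python
def alpha_close(P, Q, alpha):
    """Check that priors P and Q (dicts message -> probability) are
    alpha-close: alpha*P(m) >= Q(m) and alpha*Q(m) >= P(m) for all m."""
    messages = set(P) | set(Q)
    return all(alpha * P.get(m, 0.0) >= Q.get(m, 0.0) and
               alpha * Q.get(m, 0.0) >= P.get(m, 0.0) for m in messages)

# Hypothetical priors that are alpha-close for alpha = 2:
P = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
Q = {"cat": 0.4, "dog": 0.4, "bird": 0.2}
assert alpha_close(P, Q, 2.0)
```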
Construction of a coding scheme (J-Kalai-Khanna-Sudan '11, inspired by B-Rao '11)
Pick an infinite random string R_m for each m, and put (m, e) ∈ E ⇔ e is a prefix of R_m. Alice encodes m by sending a prefix of R_m such that m is α²-disambiguated under P. This gives an expected encoding length of at most H(P) + 2 log α + 2.
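A sketch of this construction in Python. The infinite shared random strings R_m are simulated by stretching a hash of each message; the priors (reused from the previous sketch), the value of α, and the 256-bit search cap are assumptions made for the demo.

```python
import hashlib

def bits_of(m, length):
    """First `length` bits of the shared random string R_m, simulated
    by hashing the message together with a block counter."""
    bits = []
    block = 0
    while len(bits) < length:
        digest = hashlib.sha256(f"{m}|{block}".encode()).digest()
        bits.extend((byte >> i) & 1 for byte in digest for i in range(8))
        block += 1
    return tuple(bits[:length])

def encode(m, P, alpha):
    """Shortest prefix e of R_m making m alpha^2-disambiguated under P:
    P(m) > alpha^2 * P(m') for every other m' whose R_{m'} starts with e."""
    for length in range(1, 257):  # cap the search for the demo
        e = bits_of(m, length)
        rivals = [m2 for m2 in P if m2 != m and bits_of(m2, length) == e]
        if all(P[m] > alpha ** 2 * P[m2] for m2 in rivals):
            return e
    raise RuntimeError("no disambiguating prefix within the cap")

def decode(e, Q):
    """Maximum-likelihood decoding under the receiver's prior Q."""
    candidates = [m for m in Q if bits_of(m, len(e)) == e]
    return max(candidates, key=lambda m: Q[m])

# Reusing the alpha-close priors P and Q from the previous sketch:
e = encode("cat", P, alpha=2.0)
assert decode(e, Q) == "cat"  # guaranteed by the closeness lemma above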
Remark. Mimicking the disambiguation property of natural language provided an efficient strategy for communication.
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Our results
1. The JKKS '11/BR '11 encoding is near-optimal: H(P) + 2 log α – 3 log log α – O(1) bits are necessary (cf. the achieved H(P) + 2 log α + 2 bits).
2. Analysis of the positive-error setting [Haramaty-Sudan '14]: if incorrect decoding with probability ε is allowed,
– H(P) + log α + log 1/ε bits can be achieved;
– H(P) + log α + log 1/ε – (9/2) log log α – O(1) bits are necessary for ε > 1/α.
An ε-error coding scheme (inspired by J-Kalai-Khanna-Sudan '11, B-Rao '11)
Pick an infinite random string R_m for each m, and put (m, e) ∈ E ⇔ e is a prefix of R_m. Alice encodes m by sending the prefix of R_m of length log 1/P(m) + log α + log 1/ε.
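A sketch of the ε-error variant, reusing bits_of from the earlier sketch: the fixed prefix length replaces the disambiguation search, at the price of decoding errors with probability at most ε.

```python
import math

def encode_eps(m, P, alpha, eps):
    """Send the prefix of R_m of length
    log2(1/P(m)) + log2(alpha) + log2(1/eps), rounded up."""
    length = math.ceil(math.log2(1.0 / P[m])
                       + math.log2(alpha)
                       + math.log2(1.0 / eps))
    return bits_of(m, length)

# e.g. encode_eps("cat", P, alpha=2.0, eps=0.01); decode with the same
# maximum-likelihood decoder as before, now correct w.p. at least 1 - eps.
```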
Analysis
Claim. m is decoded correctly with probability 1 – ε.
Proof. There are at most 1/Q(m) messages with Q-probability greater than Q(m), and Q(m) ≥ P(m)/α by closeness. The probability that R_{m'} for any one of these m' agrees with the first log 1/P(m) + log α + log 1/ε ≥ log 1/Q(m) + log 1/ε bits of R_m is at most εQ(m). By a union bound, the probability that any of them agrees with R_m (and hence could be wrongly chosen) is at most ε.
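The same union bound, written out as one chain (using α-closeness in the form αQ(m) ≥ P(m)):

```latex
% Prefix length sent for m:
\ell = \log\tfrac{1}{P(m)} + \log\alpha + \log\tfrac{1}{\varepsilon}
     \;\ge\; \log\tfrac{1}{Q(m)} + \log\tfrac{1}{\varepsilon}

% Collision probability for one rival m', then a union bound over the
% at most 1/Q(m) messages m' with Q(m') > Q(m):
\Pr\bigl[R_{m'}\text{ agrees with the first } \ell \text{ bits of } R_m\bigr]
   = 2^{-\ell} \le \varepsilon\, Q(m),
\qquad
\tfrac{1}{Q(m)} \cdot \varepsilon\, Q(m) = \varepsilon .
```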
Length lower bound 1: reduction to deterministic encodings
By the min-max theorem, it suffices to exhibit a distribution over priors for which deterministic encodings must be long.
Length lower bound 2: hard priors
[Figure: the hard priors on a log-probability scale with levels ≈ 0, –log α, and –2 log α, showing a distinguished message m* and a set S. Lemma 1: H(P) = O(1). Lemma 2: the priors are α-close.]
Length lower bound 3: short encodings have collisions
Encodings of expected length < 2 log α – 3 log log α encode some m_1 ≠ m_2 identically with nonzero probability. With nonzero probability over the choice of P and Q, m_1, m_2 ∈ S and m* ∈ {m_1, m_2}, so decoding errs with nonzero probability.
☞ Errorless encodings have expected length ≥ 2 log α – 3 log log α = H(P) + 2 log α – 3 log log α – O(1), since H(P) = O(1).
Length lower bound 4: very short encodings often collide
If the encoding has expected length < log α + log 1/ε – (9/2) log log α, then m* collides with ∼(ε log α)·α other messages. The probability that our α draws for S miss all of these messages is < 1 – 2ε, so decoding errs with probability > ε.
☞ Error-ε encodings have expected length ≥ H(P) + log α + log 1/ε – (9/2) log log α – O(1).
Recap. We saw a variant of source coding for which (near-)optimal solutions resemble natural languages in interesting ways.
The problem. Design a coding scheme E so that, for any sender and receiver with α-close prior distributions, the communication length is minimized (in expectation with respect to the sender's distribution).
Questions?