Mathematical Theories of Communication: Old and New
Madhu Sudan, Harvard University
September 26, 2017, HLF: Mathematical Theory Communication
Communication = What? Today: digital communication, i.e., communicating bits/bytes/data, as opposed to “radio waves for sound”. The challenge? Communication can be:
- expensive (e.g., a satellite with bounded energy to communicate),
- noisy (bits flipped, a DVD scratched),
- interactive (which complicates the above further),
- contextual (it assumes you are running a specific OS, IPvX, …).
Theory = Why? Why build theory and not just use whatever works? Ad-hoc solutions work today, but will they work tomorrow? Formulating the problem allows solutions to be compared, and some creative solutions may be surprisingly better than the naïve ones. Theory also gives a better understanding of limits: what cannot be achieved.
Old? New? Why new theories? Were the old ones not good enough? Quite the opposite: the old ones were too good. They provided the right framework and took us from ground level to “orbit”! Now we can explore all of “space”, but new possibilities lead to new challenges.
Reliable Communication?
A problem from the 1940s, with the advent of the digital age: communication media are always noisy, but digital information is less tolerant of noise! Example: Alice sends “We are now ready”; a bit flips in transit and Bob receives “We are not ready”. (Slides from January 5, 2016, IISER: Reliable Meaningful Communication.)
Reliability by Repetition
We can repeat (every letter of) the message to improve reliability:
Sent: WWW EEE AAA RRR EEE NNN OOO WWW …
Received: WXW EEA ARA SSR EEE NMN OOP WWW …
Elementary reasoning: more repetitions ⇒ lower probability of decoding error, but still positive; longer transmissions ⇒ more expected errors. Combining the two: the rate of repetition coding → 0 as the length of the transmission increases. Belief (pre-1940): the rate of any scheme → 0 as the length → ∞.
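The repetition scheme above can be sketched directly. A minimal illustration with majority-vote decoding; the alphabet, error model, and parameters are assumptions for the demo, not from the talk:

```python
import random

def encode(msg, r=3):
    # Repetition code: repeat every symbol r times.
    return "".join(ch * r for ch in msg)

def channel(s, p, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ "):
    # Each symbol is replaced by a random other symbol with probability p.
    corrupt = lambda ch: random.choice([c for c in alphabet if c != ch])
    return "".join(corrupt(ch) if random.random() < p else ch for ch in s)

def decode(received, r=3):
    # Majority vote within each block of r repetitions.
    blocks = (received[i:i + r] for i in range(0, len(received), r))
    return "".join(max(set(b), key=b.count) for b in blocks)

random.seed(0)
sent = encode("WE ARE NOW READY", r=5)
received = channel(sent, p=0.1)
recovered = decode(received, r=5)
# Usually recovers the message, but each block still fails with a
# small positive probability, exactly as the slide argues.
```

Note how the rate is 1/r and still the per-block error probability is only reduced, never zero: the slide's point that repetition drives the rate to 0.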
Shannon’s Theory [1948]
The sender “encodes” before transmitting; the receiver “decodes” after receiving (Alice → Encoder → noisy channel → Decoder → Bob). Encoder and decoder are arbitrary functions E: {0,1}^k → {0,1}^n and D: {0,1}^n → {0,1}^k. Rate = k/n; requirement: m = D(E(m) + error) with high probability. What are the best E, D (with the highest rate)?
Shannon’s Theorem
If every bit is flipped independently with probability p, then Rate → 1 − H(p) can be achieved, where H(p) ≜ −p log p − (1−p) log(1−p) is the binary entropy function. This is best possible. Examples: p = 0 ⇒ Rate = 1; p = 1/2 ⇒ Rate = 0; the rate is monotone decreasing for p ∈ (0, 1/2), and positive for every fixed p < 1/2, even as k → ∞.
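The quantities in the theorem are easy to compute. A small sketch of the binary entropy function and the resulting capacity 1 − H(p) of the binary symmetric channel:

```python
import math

def binary_entropy(p):
    # H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity(p):
    # Shannon's best achievable rate over a channel that flips
    # each bit independently with probability p.
    return 1 - binary_entropy(p)

# capacity(0) = 1, capacity(0.5) = 0, and capacity decreases
# monotonically on (0, 1/2), matching the examples on the slide.
```

Note the symmetry capacity(p) = capacity(1 − p): a channel that flips almost every bit is as useful as one that flips almost none.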
Shannon’s contributions
A far-reaching architecture: Alice → Encoder → channel → Decoder → Bob. Profound analysis: the first (?) use of the probabilistic method. Deep mathematical discoveries: entropy, information, the bit.
Challenges post-Shannon
The encoding/decoding functions are not “constructive”: Shannon picked E at random and let D decode by brute force. Consequence: D takes time ~2^k to compute (on a computer), and E takes time ~2^k just to find! The algorithmic challenge: find E, D more explicitly, so that both take time ~k, k², k³, … to compute.
Chomsky: Theory of “Language”
Chomsky formalized the syntax of languages mathematically (with caveats). The initial goal was to understand inter-human communication: the structure behind languages and its implications for language acquisition. A major side-effect: the use of context-free grammars to specify programming languages, enabling automated parsing, compilation, and interpretation!
Modern theories: communication is interactive, e.g., “buying an airline ticket online”. Such interactions involve large inputs on either side: the travel agent has a vast portfolio of airlines and flights; the traveler has a complex collection of constraints and preferences. The best protocol ≠ the travel agent sending a brochure, and ≠ the traveler sending the entire list of constraints. How should this be modeled? What happens if errors occur? How well must context be shared?
Communication Complexity: Yao
The model (with shared randomness): Alice holds x, Bob holds y, and both see a shared random string R. The goal is to compute f: (x,y) ↦ Σ. CC(f) = # bits exchanged by the best protocol that outputs f(x,y) with probability 2/3. Usually studied for lower bounds; this talk uses CC as a positive model. (Slide from January 3, 2017, IITB Math & Inf: Functional Uncertainty.)
Some short protocols!
Problem 1: Alice ← (x_1,…,x_n); Bob ← (y_1,…,y_n); they want to know (x_1 − y_1) + ⋯ + (x_n − y_n).
Solution: Alice → Bob: S ≝ x_1 + ⋯ + x_n; Bob → Alice: S − (y_1 + ⋯ + y_n).
Problem 2: the same inputs; they want to know (x_1 − y_1)² + ⋯ + (x_n − y_n)².
Solution? Deterministically this needs n bits of communication. Randomized: say Alice and Bob share random r_1, r_2,…, r_n ∈ {−1,+1}.
Alice → Bob: S_1 = x_1² + ⋯ + x_n²; S_2 = r_1 x_1 + ⋯ + r_n x_n.
Bob → Alice: T_1 = y_1² + ⋯ + y_n²; T_2 = r_1 y_1 + ⋯ + r_n y_n.
Thm: E[S_1 + T_1 − 2 S_2 T_2] = (x_1 − y_1)² + ⋯ + (x_n − y_n)².
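The randomized protocol for Problem 2 can be checked numerically. A sketch that averages the estimator S_1 + T_1 − 2·S_2·T_2 over many shared random sign vectors; the inputs and trial count are arbitrary choices for the demo:

```python
import random

def squared_distance_estimate(x, y, trials=2000, rng=random):
    # Each trial uses fresh shared signs r_i in {-1, +1}.
    # Alice sends S1 = sum x_i^2 and S2 = sum r_i x_i;
    # Bob forms   T1 = sum y_i^2 and T2 = sum r_i y_i.
    # E[S1 + T1 - 2*S2*T2] = sum (x_i - y_i)^2.
    s1 = sum(v * v for v in x)
    t1 = sum(v * v for v in y)
    total = 0.0
    for _ in range(trials):
        r = [rng.choice((-1, 1)) for _ in x]
        s2 = sum(ri * xi for ri, xi in zip(r, x))
        t2 = sum(ri * yi for ri, yi in zip(r, y))
        total += s1 + t1 - 2 * s2 * t2
    return total / trials

random.seed(1)
x = [1.0, 2.0, 3.0]
y = [1.0, 0.0, 5.0]
exact = sum((a - b) ** 2 for a, b in zip(x, y))  # (0)^2 + (2)^2 + (-2)^2 = 8.0
est = squared_distance_estimate(x, y)
```

The key identity is E[S_2·T_2] = ⟨x, y⟩, since the cross terms r_i r_j (i ≠ j) vanish in expectation; only a few numbers cross the channel per trial, independent of n.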
Application to Buying Air Tickets
If we can express every flight and every user’s preferences as n numbers (as is commonly done in machine learning), then the number of bits communicated ≈ 2 × the description of the final itinerary, with only two rounds of communication! Challenge: express user preferences as numbers. We are not there yet … but soon your cellphone will do it!
Interaction + Errors: Schulman
Consider distributed updates of a shared document, with Alice and Bob each talking to a server. A typical interaction: Server → User: current state of the document; User → Server: update. What if there are errors in the interaction? An error must be detected immediately, or else all future communication is wasted; but detecting too eagerly leads to false alarms!
Interactive Coding Schemes
If bits are flipped with probability p, what is the achievable rate of interactive communication? The limits are still not precisely determined! But the rate is bounded away from zero for small enough p, and is now known to scale more like 1 − Θ(√H(p)) than 1 − H(p). Non-trivial mathematics! Some of the schemes are still not fully constructive … surprising even given Shannon theory! [Schulman; Braverman-Rao; Kol-Raz; Haeupler]
Uncertain Communication
Sales Pitch + Intro: Most of communication theory [à la Shannon, Hamming] is built around a sender and receiver that are perfectly synchronized, so the large shared context (codes, protocols, priors) is ignored. Most human communication (and also device-to-device communication) does not assume perfect synchronization, so context is relevant: qualitatively (the receiver may take the wrong action) and quantitatively (the inputs are long!). Is there a theory here? What are the problems? Is the starting point Shannon? Yao? (Slides from August 17, 2017, Oberwolfach: Uncertain Communication.)
Aside: Contextual Proofs & Uncertainty
Mathematical proofs assume a large context. “By some estimates a proof that 2+2=4 in ZFC would require about […] steps … so we will use a huge set of axioms to shorten our proofs – namely, everything from high-school mathematics” [Lehman, Leighton, Meyer]. Context shortens proofs, but context is uncertain! What is “high school mathematics”? Is it a fixed set of axioms, or a set from which others can be derived? Is the latter amenable to efficient reasoning? What is efficiency with a large context?
Aside: Easy CC Problems [Ghazi,Kamath,S’15]
Do there exist problems with large inputs and small communication? Yes:
- Equality testing: EQ(x,y) = 1 ⇔ x = y; CC(EQ) = O(1). Protocol: fix an error-correcting code E: {0,1}^n → {0,1}^N; use shared randomness to pick i ← [N]; exchange E(x)_i and E(y)_i; accept iff E(x)_i = E(y)_i.
- Hamming distance: H_k(x,y) = 1 ⇔ Δ(x,y) ≤ k; CC(H_k) = O(k log k) [Huang et al.].
- Small set intersection: ∩_k(x,y) = 1 ⇔ wt(x), wt(y) ≤ k and ∃i s.t. x_i = y_i = 1; CC(∩_k) = O(k) [Håstad, Wigderson]. Protocol: use common randomness to hash [n] → [k²]; poly(k) communication.
- Gap (real) inner product: x, y ∈ ℝⁿ with ‖x‖₂ = ‖y‖₂ = 1; GIP_{c,s}(x,y) = 1 if ⟨x,y⟩ ≥ c, = 0 if ⟨x,y⟩ ≤ s, where ⟨x,y⟩ ≜ Σ_i x_i y_i; CC(GIP_{c,s}) = O(1/(c−s)²) [Alon, Matias, Szegedy]. Main insight: if G ← N(0,1)ⁿ, then E[⟨G,x⟩ · ⟨G,y⟩] = ⟨x,y⟩.
Unstated philosophical contribution of CC à la Yao: communication with a focus (“only need to determine f(x,y)”) can be more effective (shorter than |x|, H(x), H(y), I(x;y), …).
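The equality-testing protocol can be sketched with a random subset-parity hash in place of coordinates of an explicit error-correcting code; this is a standard substitution with the same one-sided error behavior, and the parameters below are illustrative:

```python
import random

def equality_protocol(x, y, shared_seed, reps=32):
    # Shared randomness = a common seed known to both parties.
    # Per repetition, each party applies the same random linear hash
    # (a random subset parity) to its own n-bit string; one bit each
    # is exchanged and compared.
    assert len(x) == len(y)
    rng = random.Random(shared_seed)
    for _ in range(reps):
        mask = [rng.randint(0, 1) for _ in range(len(x))]
        bit_a = sum(m & int(c) for m, c in zip(mask, x)) % 2  # Alice's bit
        bit_b = sum(m & int(c) for m, c in zip(mask, y)) % 2  # Bob's bit
        if bit_a != bit_b:
            return False  # definitely unequal
    # If x != y, a random subset parity distinguishes them with prob 1/2,
    # so unequal strings are accepted with probability at most 2**-reps.
    return True
```

The communication is 2·reps bits regardless of n, which is the point of the slide: equality of huge strings costs O(1) bits for any fixed error level.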
Modelling Shared Context + Imperfection
Many possibilities; modelling is an ongoing effort.
1. Alice and Bob may only have estimates of x and y; more generally, x and y may be close (in some sense).
2. Knowledge of f, the function Bob wants to compute, may not be exactly known to Alice.
3. Shared randomness: Alice and Bob may not have identical copies.
1. Compression
Classical compression: Alice ← (P, m ∼ P); Bob ← P. Alice → Bob: y = E_P(m); Bob outputs m̂ = D_P(y) ≟ m. [Shannon]: m̂ = m with E_{m∼P}[|E_P(m)|] ≤ H(P) + 1, where H(P) ≝ E_{m∼P}[−log P(m)].
Uncertain compression [Juba, Kalai, Khanna, S.]: Alice ← (P, m ∼ P); Bob ← Q. Alice → Bob: y = E_P(m); Bob outputs m̂ = D_Q(y) ≟ m. Here P, Q are Δ-close: ∀m, |log P(m) − log Q(m)| ≤ Δ. Can we get E_{m∼P}[|E_P(m)|] ≤ O(H(P) + Δ)?
[JKKS]: yes, with shared randomness. [CGMS]: yes, with shared correlation. [Haramaty+S.]: deterministically, O(H(P) + log log |Ω|), where Ω is the universe of messages.
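Shannon's H(P)+1 bound can be illustrated with the classical Shannon code, which assigns message m a codeword of length ⌈−log₂ P(m)⌉; the particular distribution below is an arbitrary example:

```python
import math

def shannon_code_lengths(P):
    # Codeword length ceil(-log2 P(m)) for each message m. These lengths
    # satisfy Kraft's inequality (sum of 2^-l <= 1), so a prefix-free
    # code with these lengths exists, and its expected length is < H(P) + 1.
    return {m: math.ceil(-math.log2(p)) for m, p in P.items()}

def entropy(P):
    # H(P) = E_{m~P}[-log2 P(m)].
    return sum(-p * math.log2(p) for p in P.values())

P = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = shannon_code_lengths(P)
kraft = sum(2.0 ** -l for l in lengths.values())
expected_len = sum(P[m] * lengths[m] for m in P)
# For this dyadic P the code is optimal: expected_len equals H(P) exactly.
```

The uncertain-compression question above asks what happens when the decoder runs the same recipe with a different distribution Q: the Δ-closeness condition bounds how much the assumed codeword lengths can disagree.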
Deterministic Compression: Challenge
Say Alice and Bob each have a ranking of N players: rankings are bijections π, σ: [N] → [N], with π(i) = the rank of the i-th player in Alice’s ranking. Further suppose they know the rankings are close: ∀i ∈ [N], |π(i) − σ(i)| ≤ 2. Bob wants to know: is π⁻¹(1) = σ⁻¹(1)? How many bits does Alice need to send (non-interactively)? With shared randomness: O(1). Deterministically: O(1)? O(log N)? O(log log log N)? With Elad Haramaty: O(log* N).
Compression as a proxy for language
An information-theoretic study of language? The goal of language is an effective means of expressing information and action; an implicit objective is to make frequent messages short: compression! Is frequency known globally, or learned locally? If the latter, everyone cannot possibly agree on it, and yet we need to (mostly) agree on the language! This is similar to the problem of uncertain compression, studied formally in [Ghazi, Haramaty, Kamath, S. ITCS ’17].
2. Imperfectly Shared Randomness
Recall: communication becomes more effective with randomness (identity, Hamming distance, small set intersection, inner product). How does performance degrade if the players only share correlated variables? E.g., Alice ← r, Bob ← s, with (r,s) = ((r_i, s_i))_i i.i.d., r_i, s_i ∈ {−1,1}, E[r_i] = E[s_i] = 0, and E[r_i s_i] = ρ ∈ (0,1). [CGMS ’16]: communication k with perfect randomness ⇒ communication O_ρ(2^k) with imperfectly shared randomness.
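ρ-correlated pairs as defined above are easy to generate and check. A minimal sketch; the value ρ = 0.6 and the sample size are arbitrary:

```python
import random

def correlated_pair(rho, rng):
    # One pair (r, s) of +/-1 values with E[r] = E[s] = 0 and E[r*s] = rho:
    # draw a fair sign for r, then set s = r with probability (1+rho)/2
    # and s = -r otherwise. Then E[r*s] = (1+rho)/2 - (1-rho)/2 = rho.
    r = rng.choice((-1, 1))
    s = r if rng.random() < (1 + rho) / 2 else -r
    return r, s

rng = random.Random(0)
rho = 0.6
pairs = [correlated_pair(rho, rng) for _ in range(200000)]
emp_rho = sum(r * s for r, s in pairs) / len(pairs)  # empirical E[r*s] ~ rho
```

Alice keeps the r's and Bob keeps the s's; the CGMS result quantifies how much this weaker resource costs relative to identical shared randomness.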
Imperfectly Shared Randomness
Easy (complete) problem, Gap Inner Product: x, y ∈ ℝⁿ; GIP_{c,s}(x,y) = 1 if ⟨x,y⟩ ≥ ε · ‖x‖₂ · ‖y‖₂, and = 0 if ⟨x,y⟩ ≤ 0. Decidable with O_ρ(1/ε²) (one-way) communication.
Hard problem, Sparse Gap Inner Product: GIP on sparse x, with x ∈ {0,1}ⁿ, y ∈ {−1,1}ⁿ, and ‖x‖₁ = 2εn. Classical communication = O(log 1/ε) [uses sparsity]; there is no way to exploit the sparsity with imperfectly shared randomness.
3. Functional Uncertainty
[Ghazi, Komargodski, Kothari, S. ’16]. Recall the positive message of Yao’s model: communication can be brief if Alice knows which function f(x,y) Bob wants to compute. What if Alice only knows f approximately? Can communication still be short?
The Model
Recall distributional complexity: (x,y) ∼ μ; error_μ(Π) ≝ Pr_{(x,y)∼μ}[f(x,y) ≠ Π(x,y)]; complexity cc_{μ,ε}(f) ≝ min over Π with error_μ(Π) ≤ ε of max_{x,y} |Π(x,y)|.
Functional Uncertainty Model I: an adversary picks (f,g); nature picks (x,y) ∼ μ; Alice ← (f,x); Bob ← (g,y); the goal is to compute g(x,y). Promise: δ_μ(f,g) ≝ Pr_μ[f(x,y) ≠ g(x,y)] ≤ δ₀. Goal: compute (any) Π(x,y) with δ_μ(g,Π) ≤ ε₁ (we just want ε₁ → 0 as δ₀ → 0). If (f,g) is part of the input, what problem’s complexity is this?
Modelling Uncertainty
Uncertainty is modelled by a graph 𝒢 of possible input pairs (f,g); protocols know 𝒢 but not (f,g).
cc_{μ,ε}(𝒢) ≝ max_{(f,g)∈𝒢} cc_{μ,ε}(g);  δ_μ(𝒢) ≝ max_{(f,g)∈𝒢} δ_μ(f,g).
Uncertain error: error_{μ,𝒢}(Π) ≝ max_{(f,g)∈𝒢} Pr_μ[g(x,y) ≠ Π(f,g,x,y)].
Uncertain complexity: Ucc_{μ,ε}(𝒢) ≝ min over Π with error(Π) ≤ ε of max_{f,g,x,y} |Π(f,g,x,y)|.
Compare cc_{μ,ε₀}(𝒢) vs. Ucc_{μ,ε₁}(𝒢); we want ε₁ → 0 as ε₀, δ_μ(𝒢) → 0.
Main Results
Thm 1 (−ve): ∃ 𝒢, μ s.t. δ_μ(𝒢) = o(1) and cc_{μ,o(1)}(𝒢) = 1, but Ucc_{μ,.1}(𝒢) = Ω(√n), where n = |x| = |y|.
Thm 2 (+ve): ∀ 𝒢 and product μ: Ucc^{one-way}_{μ,ε₁}(𝒢) = O(cc^{one-way}_{μ,ε₀}(𝒢)), where ε₁ → 0 as ε₀, δ_μ(𝒢) → 0.
Thm 2′ (+ve): ∀ 𝒢, μ: Ucc^{one-way}_{μ,ε₁}(𝒢) = O(cc^{one-way}_{μ,ε₀}(𝒢) · (1 + I(x;y))), where I(x;y) is the mutual information between x and y.
Moral: protocols are not continuous with respect to the function being computed.
Details of Negative Result
μ: x ∼ U({0,1}ⁿ); y = Noisy(x) with Pr[x_i ≠ y_i] = 1/√n.
𝒢 = {(⊕_S(x⊕y), ⊕_T(x⊕y)) : |S⊕T| = o(√n)}, where ⊕_S(z) ≝ ⊕_{i∈S} z_i.
Certain communication: Alice → Bob: ⊕_T(x), one bit.
δ_μ(𝒢) = max_{S,T} Pr_{x,y}[⊕_S(x⊕y) ≠ ⊕_T(x⊕y)] = max_{U: |U|=o(√n)} Pr_{z∼Bernoulli(1/√n)ⁿ}[⊕_U(z) = 1] = o(1).
Uncertain lower bound: the standard cc_{μ,ε}(F), where F((S,x); (T,y)) = ⊕_T(x⊕y); the lower bound is obtained by picking S, T randomly: S uniform, T a noisy copy of S.
Positive result (Thm. 2): consider the communication matrix of g (rows indexed by x, columns by y).
A protocol for g partitions the matrix into 2^k blocks, and Bob wants to know which block contains g(x,y). Common randomness: samples y_1,…,y_m. Alice → Bob: f(x,y_1) … f(x,y_m). Bob (w.h.p.) recognizes the block of x’s row and uses it; m = O(k) suffices. “CC is preserved under uncertainty for one-way communication if x ⊥ y.”
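The one-way protocol above can be simulated on a toy example. The functions f and g below are hypothetical stand-ins for a close pair (f,g) ∈ 𝒢 (two thresholds disagreeing on one input), and the grid size and m are arbitrary demo choices:

```python
import random

def uncertain_one_way(f, g, x, y, X, Y, m, rng):
    # Shared randomness: m reference columns y_1..y_m (x and y independent
    # and uniform here, matching the product-distribution assumption).
    ys = [rng.choice(Y) for _ in range(m)]
    # Alice's one-way message: her function f evaluated at the references.
    msg = [f(x, yj) for yj in ys]
    # Bob: find the row x' whose g-values best match Alice's message,
    # then answer from that row.
    best = min(X, key=lambda xp: sum(g(xp, yj) != b for yj, b in zip(ys, msg)))
    return g(best, y)

# Hypothetical close pair (f, g) on a 16x16 grid.
f = lambda x, y: int(x + y >= 16)
g = lambda x, y: int(x + y >= 16 or (x, y) == (0, 15))

rng = random.Random(3)
X = Y = list(range(16))
agree = sum(
    uncertain_one_way(f, g, x, y, X, Y, m=24, rng=rng) == g(x, y)
    for x in X for y in Y
)
# agree counts matches with g(x, y) over all 256 inputs; it is close to
# 256 but not exactly 256, since rows can be confused with small probability.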
Analysis details: with probability 1 − √ε there exists j s.t. δ(g(x_j,·), g(x,·)) ≤ √ε. Main idea: (1) if Π_g(x) = Π_g(x_j), then w.h.p. δ(g(x_j,·), g(x,·)) ≤ √ε; (2) if j is such that δ(g(x_j,·), g(x,·)) ≥ 2√ε, then Pr[j is selected] = exp(−m). But step 2 works only if y_i ∼ μ|x.
Thm 2′, main idea: now y_1,…,y_m cannot be sampled independently of x. Instead, use [HJMR ’07] to sample y_i ∼ μ|x; each sample costs ≈ I(x;y) bits of communication, and the analysis goes through …
4. Contextual Proofs and Uncertainty?
Scenario: Alice and Bob start with axioms A, a subset of clauses on X_1,…,X_n. Alice wishes to prove A ⇒ C for some clause C, but the proof Π: A ⇒ C may be long (~2ⁿ). Context to the rescue: maybe Alice and Bob share a context D derived from A (A ⇒ D), and the contextual proof Π′: D ⇒ C is short (poly(n)). Uncertainty: Alice’s context D_A ≠ D_B (Bob’s context). Alice writes a proof Π′: D_A ⇒ C; when can Bob verify Π′ given D_B?
4. Contextual Proofs and Uncertainty? -2
(Continued.) Alice writes a proof Π′: D_A ⇒ C; when can Bob verify Π′ given D_B? Surely if D_A ⊆ D_B. What if D_A ∖ D_B = {C′} and Π″: D_B ⇒ C′ is one step long? Can Bob still verify Π′: D_A ⇒ C in poly(n) time? This needs a feasible data structure that supports such verification; none is known to exist, and the problem might be harder than Partial Match Retrieval …
Summarizing: perturbing the “common information” assumptions in Shannon/Yao theory leads to many changes: some debilitating, some not, but the latter require careful protocol choice. In general, communication protocols are not continuous functions of the common information. The uncertain model (𝒢) needs more exploration! Some open questions from our work: tighten the gap cc(f)·I vs. cc(f) + I; the multi-round setting? two rounds?; what if the randomness and the function are both imperfectly shared? [Preliminary results in [Ghazi+S ’17].]
Contextual Communication & Uncertainty
Boole’s “modest” ambition
“The design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic and construct its method; to make that method itself the basis of a general method for the application of the mathematical doctrine of Probabilities; and, finally, to collect from the various elements of truth brought to view in the course of these inquiries some probable intimations concerning the nature and constitution of the human mind.” [G. Boole, “On the laws of thought …”, p. 1]
Thank You!