General Strong Polarization


General Strong Polarization
Madhu Sudan (Harvard University)
Joint work with Jaroslaw Blasiok (Harvard), Venkatesan Guruswami (CMU), Preetum Nakkiran (Harvard) and Atri Rudra (Buffalo)
Oct. 8, 2018, Berkeley

Part I: Motivation. Achieving Shannon Capacity

Shannon and Channel Capacity
Shannon ['48]: For every (noisy) channel of communication, ∃ capacity C s.t. communication at rate R < C is possible and at rate R > C is not (formalization omitted).
Example: BSC(p) acts independently on bits, mapping X ∈ F_2 to X w.p. 1−p and to 1−X w.p. p.
Capacity = 1 − H(p), where H(p) = p·log(1/p) + (1−p)·log(1/(1−p)) is the binary entropy.
Achieving rate R: ∃ E: F_2^{k_n} → F_2^n and D: F_2^n → F_2^{k_n} with Pr[D(BSC(E(m))) ≠ m] = o(1) and lim_{n→∞} k_n/n ≥ R (the rate).
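To make the numbers concrete, here is a minimal Python sketch (my own illustration, not from the talk) of the binary entropy function and the resulting BSC capacity:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def bsc_capacity(p: float) -> float:
    """Shannon capacity of BSC(p): 1 - H(p)."""
    return 1 - binary_entropy(p)

if __name__ == "__main__":
    for p in (0.01, 0.11, 0.5):
        print(f"p={p}: H(p)={binary_entropy(p):.4f}, capacity={bsc_capacity(p):.4f}")
```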

“Achieving” Shannon Capacity
A commonly used phrase, with many mathematical interpretations. Some possibilities (write ε ≝ C − R):
- How small can n be? Shannon '48: n = O(1/ε²).
- Get R > C − ε with poly-time algorithms? [Luby et al. '95]. Forney '66: time = poly(n, 2^{1/ε²}).
- Running time poly(n/ε)? (Or n = poly(1/ε)?) Open till 2008.
Arikan '08: invented “Polar Codes”. Final resolution: Guruswami+Xia '13, Hassani+Alishahi+Urbanke '13 (strong analysis).

Polar Codes and Polarization
Arikan: defined Polar Codes, one for every t, with an associated martingale X_0, …, X_t, …, where X_t ∈ [0,1]. The t-th code has length n = 2^t.
If Pr[X_t ∈ (τ_t, 1 − δ_t)] ≤ ε_t, then the t-th code is (ε_t + δ_t)-close to capacity, and Pr[Decode(BSC(Encode(m))) ≠ m] ≤ n·τ_t.
Need τ_t = o(1/n) (strong polarization ⇒ τ = neg(n)).
Need ε_t = 1/n^{Ω(1)} (⇐ strong polarization).
Arikan et al.: τ = neg(n), ε = o(1); [GX13, HAU13]: ε = 1/n^{Ω(1)}.

Part II: Martingales, Polarization, Strong & Local

Martingales and Polarization
[0,1]-Martingale: a sequence of random variables X_0, X_1, X_2, … with X_t ∈ [0,1] and E[X_{t+1} | X_0, …, X_t] = X_t.
Well-studied in algorithms [MotwaniRaghavan §4.4], [MitzenmacherUpfal §12]. E.g., X_t = expected approximation factor of an algorithm given its first t random choices.
Usually one studies concentration, e.g. w.h.p. X_final ∈ [X_0 ± σ]; think lim_{t→∞} X_t ≈ N(X_0, σ²).
Today, polarization: w.h.p. X_final ∉ (ε, 1−ε); think lim_{t→∞} X_t = Bern(X_0).

Some Examples
1. X_{t+1} = X_t ± 4^{−t}, each w.p. 1/2: converges!
2. X_{t+1} = X_t ± 2^{−t}, each w.p. 1/2: uniform on [0,1].
3. For X_t ≤ 1/2: X_{t+1} = (3/2)·X_t w.p. 1/2 and (1/2)·X_t w.p. 1/2 (symmetrically for X_t > 1/2): polarizes (weakly)!
4. For X_t ≤ 1/2: X_{t+1} = X_t² w.p. 1/2 and 2X_t − X_t² w.p. 1/2 (symmetrically for X_t > 1/2): polarizes (strongly)!
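A small simulation of the four examples (my own illustration; the thresholds and horizon are arbitrary) makes the behaviors visible: the first chain stays in the middle, the second spreads over [0,1], and the last two pile up near {0,1}. For the two polarizing chains, the rule for X_t > 1/2 mirrors the stated rule, as assumed above:

```python
import random

def step_converge(x, t):
    # Example 1: X +/- 4^{-t}; total movement is bounded, so X_t converges inside (0,1).
    return x + random.choice([4.0**-t, -(4.0**-t)])

def step_uniform(x, t):
    # Example 2: X +/- 2^{-t}; X_T is a random binary expansion, uniform on [0,1].
    return x + random.choice([2.0**-t, -(2.0**-t)])

def step_weak(x, t):
    # Example 3: multiply the distance to the nearer endpoint by 3/2 or 1/2.
    y = min(x, 1.0 - x) * random.choice([1.5, 0.5])
    return y if x <= 0.5 else 1.0 - y

def step_strong(x, t):
    # Example 4: square the distance to the nearer endpoint, or its mirror step.
    y = min(x, 1.0 - x)
    y = random.choice([y * y, 2.0 * y - y * y])
    return y if x <= 0.5 else 1.0 - y

def fraction_in_middle(step, T=30, trials=5000, lo=0.01, hi=0.99):
    count = 0
    for _ in range(trials):
        x = 0.5
        for t in range(1, T + 1):
            x = step(x, t)
        count += lo < x < hi
    return count / trials

if __name__ == "__main__":
    for name, step in [("converges", step_converge), ("uniform", step_uniform),
                       ("weak polarization", step_weak), ("strong polarization", step_strong)]:
        print(f"{name:20s} Pr[X_T in (0.01, 0.99)] ~ {fraction_in_middle(step):.3f}")
```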

Main Result: Definition and Theorem
Strong Polarization (informal): Pr[X_t ∈ (τ, 1−τ)] ≤ ε with τ = 2^{−ω(t)} and ε = 2^{−O(t)}.
Formally: ∀γ > 0 ∃β < 1, c s.t. ∀t: Pr[X_t ∈ (γ^t, 1 − γ^t)] ≤ c·β^t.
Local Polarization:
- Variance in the middle (X_t ∈ (τ, 1−τ)): ∀τ > 0 ∃σ > 0 s.t. ∀t: X_t ∈ (τ, 1−τ) ⇒ Var[X_{t+1} | X_t] ≥ σ.
- Suction at the ends (X_t ∉ (τ, 1−τ)): ∃θ > 0 s.t. ∀c < ∞ ∃τ > 0 s.t. X_t < τ ⇒ Pr[X_{t+1} < X_t/c] ≥ θ. (This is the low-end condition; the high-end condition is similar.)
Both definitions are qualitative!
Theorem: Local Polarization ⇒ Strong Polarization.
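To see the formal definition in action, a quick empirical check (mine, with an arbitrarily chosen γ = 0.1) that the squaring martingale from the previous slide has Pr[X_t ∈ (γ^t, 1 − γ^t)] decaying rapidly in t:

```python
import random

def squaring_step(x):
    # One step of the strongly polarizing example above.
    y = min(x, 1.0 - x)
    y = random.choice([y * y, 2.0 * y - y * y])
    return y if x <= 0.5 else 1.0 - y

def middle_prob(t, gamma=0.1, trials=20000):
    """Estimate Pr[X_t in (gamma^t, 1 - gamma^t)] starting from X_0 = 1/2."""
    hits = 0
    for _ in range(trials):
        x = 0.5
        for _ in range(t):
            x = squaring_step(x)
        hits += gamma**t < x < 1.0 - gamma**t
    return hits / trials

if __name__ == "__main__":
    for t in (5, 10, 15, 20):
        print(f"t={t:2d}: Pr[X_t in (0.1^t, 1 - 0.1^t)] ~ {middle_prob(t):.4f}")
```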

Proof (Idea)
Step 1: The potential Φ_t ≝ min(X_t, 1 − X_t) decreases by a constant factor in expectation at each step, so E[Φ_T] = exp(−T), and hence Pr[X_T ≥ exp(−T/2)] ≤ exp(−T/2).
Step 2: Over the next T time steps, X_t plummets w.h.p.:
2.1: Say, if X_t ≤ τ then Pr[X_{t+1} ≤ X_t/100] ≥ 1/2.
2.2: Pr[∃t ∈ [T, 2T] s.t. X_t > τ] ≤ X_T/τ [Doob].
2.3: If the above doesn't happen, X_{2T} < 5^{−T} w.h.p. QED

Part III: Polar Codes. Encoding, Decoding, Martingale, Polarization

Lesson 0: Compression ⇒ Coding
Defn (Linear Compression Scheme): (M, D) form a compression scheme for Bern(p)^n if
- M: F_2^n → F_2^m is a linear map, and
- Pr_{Z ∼ Bern(p)^n}[D(M·Z) ≠ Z] = o(1).
Hopefully m/n ≤ H(p) + ε.
Compression ⇒ Coding: Let M⊥ be such that M·M⊥ = 0.
- Encoder: X ↦ M⊥·X.
- Error-corrector: Y = M⊥·X + Z ↦ Y − D(M·Y) = Y − D(M·M⊥·X + M·Z), which equals M⊥·X w.p. 1 − o(1).
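A toy instantiation of the reduction (my choice of code, not the talk's): take the [3,1] repetition code, with M its parity-check matrix, M⊥ its generator, and D a brute-force minimum-weight syndrome table:

```python
import itertools
import random

# Parity-check matrix M and a kernel basis M_PERP over F_2, so M @ M_PERP = 0 mod 2.
M = [[1, 1, 0],
     [1, 0, 1]]
M_PERP = [[1],
          [1],
          [1]]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

# D: map each syndrome M*Z to the minimum-weight Z producing it (brute force).
SYNDROME = {}
for z in itertools.product([0, 1], repeat=3):
    s = tuple(matvec(M, list(z)))
    if s not in SYNDROME or sum(z) < sum(SYNDROME[s]):
        SYNDROME[s] = list(z)

def encode(x):
    # Encoder: X -> M_perp * X.
    return matvec(M_PERP, x)

def correct(y):
    # Error-corrector: Y -> Y - D(M*Y); over F_2, subtraction is XOR.
    z_hat = SYNDROME[tuple(matvec(M, y))]
    return [(a + b) % 2 for a, b in zip(y, z_hat)]

if __name__ == "__main__":
    y = encode([1])
    y[random.randrange(3)] ^= 1        # noise Z: a single bit flip
    print("corrected:", correct(y))    # recovers the codeword [1, 1, 1]
```

Here D corrects any single bit flip, so the scheme decodes correctly whenever the Bern(p)³ noise flips at most one bit.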

Question: How to Compress?
Arikan's key idea: start with the 2×2 “polarization transform” (U, V) ↦ (U + V, V).
- Invertible, so does nothing?
- But if U, V are independent, then U + V is “more random” than either, while V | U + V is “less random” than either.
Iterate (ignoring conditioning): end with bits that are almost random, or almost determined (by the others). Output the “totally random part” to get compression!
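A numeric sanity check of this single step (my own illustration): for independent U, V ∼ Bern(p), U + V ∼ Bern(2p − 2p²), and since the transform is invertible the chain rule gives H(V | U+V) = 2H(p) − H(U+V), so the two branches straddle H(p):

```python
import math

def H(p):
    # Binary entropy in bits.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

if __name__ == "__main__":
    p = 0.11
    plus = 2 * p - 2 * p * p       # U+V ~ Bern(2p - 2p^2)
    h_plus = H(plus)               # the "more random" branch
    h_minus = 2 * H(p) - h_plus    # chain rule: H(U+V) + H(V|U+V) = H(U,V) = 2H(p)
    print(f"H(V|U+V) = {h_minus:.4f}  <  H(p) = {H(p):.4f}  <  H(U+V) = {h_plus:.4f}")
```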

Iteration Elaborated
Martingale: X_t ≝ conditional entropy of a random output bit after t iterations.
[Figure: two iterations of the butterfly on inputs A, …, H, producing sums such as A+B, C+D, B+D, A+B+C+D and E+F, G+H, F+H, E+F+G+H.]

The Polarization Butterfly
P_n(U, V) = (P_{n/2}(U + V), P_{n/2}(V)), where U = (Z_1, …, Z_{n/2}) and V = (Z_{n/2+1}, …, Z_n).
[Figure: the butterfly's first level computes Z_i + Z_{n/2+i} on the top half and passes Z_{n/2+i} through on the bottom half; recursing yields outputs W_1, …, W_n.]
Polarization ⇒ ∃S ⊆ [n] with H(W_{[n]∖S} | W_S) → 0 and |S| ≤ (H(p) + ε)·n.
Compression: E(Z) = P_n(Z)|_S. Encoding time O(n log n) (given S).
To be shown: 1. Decoding in O(n log n). 2. Polarization!
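The recursion transcribes directly into runnable Python (a sketch of mine; the set S itself comes out of the polarization analysis, so it is taken as given here):

```python
def polar_transform(z):
    """P_n(Z) via the recursion P_n(U, V) = (P_{n/2}(U+V), P_{n/2}(V));
    len(z) must be a power of 2. Runs in O(n log n) bit operations."""
    n = len(z)
    if n == 1:
        return list(z)
    half = n // 2
    u, v = z[:half], z[half:]
    u_plus_v = [(a + b) % 2 for a, b in zip(u, v)]
    return polar_transform(u_plus_v) + polar_transform(v)

def compress(z, S):
    """E(Z) = P_n(Z) restricted to the index set S (a sorted list)."""
    w = polar_transform(z)
    return [w[i] for i in S]
```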

The Polarization Butterfly: Decoding
Decoding idea: given S and W_S = P_n(U, V)|_S:
- Compute U + V ∼ Bern(2p − 2p²)^{n/2} and determine it recursively.
- Then determine V: conditioned on U + V, we have V ∼ ∏_i Bern(r_i), a non-identical product distribution.
Key idea: stronger induction, over non-identical product distributions!
Decoding algorithm: given S, W_S = P_n(U, V)|_S, and p_1, …, p_n:
1. Compute q_1, …, q_{n/2} from p_1, …, p_n.
2. Determine U + V = D(W_{S⁺}, (q_1, …, q_{n/2})).
3. Compute r_1, …, r_{n/2} from p_1, …, p_n and U + V.
4. Determine V = D(W_{S⁻}, (r_1, …, r_{n/2})).
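Filling in the recursion as runnable code (my sketch, matching the encoder above; `known` plays the role of W_S, and all priors are assumed to lie strictly inside (0,1) so no denominator vanishes):

```python
def decode(known, priors):
    """Successive-cancellation decoder (sketch): reconstruct the input Z from
    the output bits at positions in `known` (a dict index -> bit) and the
    priors p_i = Pr[Z_i = 1]."""
    n = len(priors)
    if n == 1:
        if 0 in known:                         # this output bit was kept in W_S
            return [known[0]]
        return [1 if priors[0] > 0.5 else 0]   # else: maximum-likelihood guess
    half = n // 2
    top, bot = priors[:half], priors[half:]
    # q_i = Pr[(U+V)_i = 1] = Pr[Z_i + Z_{half+i} = 1].
    q = [a * (1 - b) + (1 - a) * b for a, b in zip(top, bot)]
    u_plus_v = decode({i: b for i, b in known.items() if i < half}, q)
    # r_i = Pr[V_i = 1 | (U+V)_i = s]: the non-identical product distribution.
    r = []
    for a, b, s in zip(top, bot, u_plus_v):
        if s == 0:
            r.append(a * b / (a * b + (1 - a) * (1 - b)))
        else:
            r.append((1 - a) * b / (a * (1 - b) + (1 - a) * b))
    v = decode({i - half: b for i, b in known.items() if i >= half}, r)
    u = [(s + b) % 2 for s, b in zip(u_plus_v, v)]
    return u + v
```

Each level does O(n) arithmetic across its subproblems and there are log n levels, matching the claimed O(n log n) decoding time.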

Polarization Martingale
Idea: follow X_t = H(Z_{i,t} | Z_{<i,t}) for a random output index i, where Z_{i,0}, Z_{i,1}, …, Z_{i,t}, …, Z_{i,log n} tracks the i-th bit through the levels of the butterfly.
X_t is a martingale! Polarizes? Strongly? ⇐ Locally?

Local Polarization of the Arikan Martingale
Variance in the middle: roughly, (H(p), H(p)) → (H(2p − 2p²), 2H(p) − H(2p − 2p²)).
p ∈ (τ, 1−τ) ⇒ 2p − 2p² is far from p; by continuity of H(·), H(2p − 2p²) is far from H(p).
Suction at the ends:
- High end: H(1/2 − γ) → H(1/2 − Θ(γ²)); since H(1/2 − γ) = 1 − Θ(γ²), this sends 1 − γ² to 1 − Θ(γ⁴).
- Low end: H(p) ≈ p·log(1/p), so 2H(p) ≈ 2p·log(1/p), while H(2p − 2p²) ≈ H(2p) ≈ 2p·log(1/(2p)) ≈ 2H(p) − 2p ≈ (2 − 1/log(1/p))·H(p).
Dealing with conditioning takes more work (lots of Markov).
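These estimates are easy to check numerically (my own sanity check of the slide's asymptotics, with arbitrarily chosen sample points):

```python
import math

def H(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def branches(p):
    # One 2x2 step: entropies (H(2p - 2p^2), 2H(p) - H(2p - 2p^2)).
    h_plus = H(2 * p - 2 * p * p)
    return h_plus, 2 * H(p) - h_plus

if __name__ == "__main__":
    # Middle: both branches move a constant away from H(p).
    print("middle, p=0.25:", H(0.25), "->", branches(0.25))
    # High end: the gap to 1 shrinks from Theta(gamma^2) to Theta(gamma^4).
    g = 0.01
    print("high end:", 1 - H(0.5 - g), "->", 1 - branches(0.5 - g)[0])
    # Low end: the '-' branch falls well below H(p), by roughly a log(1/p) factor.
    p = 1e-4
    print("low end:", H(p), "->", branches(p)[1])
```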

“New Contributions”
So far: reproduced old work (with simpler proofs?). The main new technical contributions:
- Strong polarization of general transforms, e.g. (U, V, W) → (U + V, V, W)?
- Exponentially strong polarization [BGS'18]: suction at the low end is very strong; a random k×k matrix yields X_{t+1} ≈ X_t^{k^{.99}} w.h.p.
- Strong polarization of Markovian sources [GNS'19?]: separates compression of known sources from unknown ones.

Conclusions
Polarization: an important phenomenon associated with bounded random variables. Not always bad! Needs better understanding.
Recently also used in CS: [Lovett, Meka] (implicit); [Chattopadhyay, Hatami, Hosseini, Lovett] (explicit).

Thank You!

Rest of the Talk
- Background/motivation: the Shannon capacity challenge; resolution by [Arikan'08], [Guruswami-Xia'13], [Hassani-Alishahi-Urbanke'13]: Polar Codes!
- Key ideas: polar codes & the Arikan martingale; implications of weak/strong polarization.
- Our contributions: simplification & generalization. Local polarization ⇒ strong polarization; the Arikan martingale polarizes locally (in general).

Context for Today's Talk
[Arikan'08]: a martingale associated to a general class of codes. If the martingale polarizes (weakly), the code achieves capacity (weakly); and the martingale polarizes weakly for the entire class.
[GX'13, HAU'13]: strong polarization, but for one specific code. Why so specific? Bottleneck: strong polarization.
Rest of talk: polar codes, the martingale, and strong polarization.