General Strong Polarization


General Strong Polarization
Madhu Sudan (Harvard University)
Joint work with Jaroslaw Blasiok (Harvard), Venkatesan Guruswami (CMU), Preetum Nakkiran (Harvard) and Atri Rudra (Buffalo)
Oct. 8, 2018, Berkeley

Part I: Motivation. Achieving Shannon Capacity

Shannon and Channel Capacity
Shannon ['48]: For every (noisy) channel of communication, ∃ capacity C s.t. communication at rate R < C is possible and at rate R > C is not (formalization omitted).
Example: BSC(p) acts independently on bits, mapping X ∈ F_2 to X w.p. 1−p and to 1−X w.p. p.
Capacity = 1 − H(p), where H(p) = p·log(1/p) + (1−p)·log(1/(1−p)) is the binary entropy.
Achieving rate R: ∃ E: F_2^{k_n} → F_2^n and D: F_2^n → F_2^{k_n} with Pr[D(BSC(E(m))) ≠ m] = o(1) and lim_{n→∞} k_n/n ≥ R (the rate).
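To make the numbers concrete, here is a minimal Python sketch (my own illustration, not from the talk) of the binary entropy function and the resulting BSC capacity:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def bsc_capacity(p: float) -> float:
    """Shannon capacity of BSC(p): 1 - H(p)."""
    return 1 - binary_entropy(p)

if __name__ == "__main__":
    for p in (0.01, 0.11, 0.5):
        print(f"p={p}: H(p)={binary_entropy(p):.4f}, capacity={bsc_capacity(p):.4f}")
```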

“Achieving” Shannon Capacity
A commonly used phrase, with many mathematical interpretations. Some possibilities (write ε ≝ C − R):
- How small can n be? Shannon '48: n = O(1/ε²).
- Get R > C − ε with poly-time algorithms? [Luby et al. '95]. Forney '66: time = poly(n, 2^{1/ε²}).
- Running time poly(n/ε)? (Or n = poly(1/ε)?) Open till 2008.
Arikan '08: invented “Polar Codes”. Final resolution: Guruswami+Xia '13, Hassani+Alishahi+Urbanke '13 (strong analysis).

Polar Codes and Polarization
Arikan: defined Polar Codes, one for every t, with an associated martingale X_0, …, X_t, …, where X_t ∈ [0,1]. The t-th code has length n = 2^t.
If Pr[X_t ∈ (τ_t, 1 − δ_t)] ≤ ε_t, then the t-th code is (ε_t + δ_t)-close to capacity, and Pr[Decode(BSC(Encode(m))) ≠ m] ≤ n·τ_t.
Need τ_t = o(1/n) (strong polarization ⇒ τ = neg(n)).
Need ε_t = 1/n^{Ω(1)} (⇐ strong polarization).
Arikan et al.: τ = neg(n), ε = o(1); [GX13, HAU13]: ε = 1/n^{Ω(1)}.

Part II: Martingales, Polarization, Strong & Local

Martingales and Polarization
[0,1]-Martingale: a sequence of random variables X_0, X_1, X_2, … with X_t ∈ [0,1] and E[X_{t+1} | X_0, …, X_t] = X_t.
Well-studied in algorithms [MotwaniRaghavan §4.4], [MitzenmacherUpfal §12]. E.g., X_t = expected approximation factor of an algorithm given its first t random choices.
Usually one studies concentration, e.g. w.h.p. X_final ∈ [X_0 ± σ]; think lim_{t→∞} X_t ≈ N(X_0, σ²).
Today, polarization: w.h.p. X_final ∉ (ε, 1−ε); think lim_{t→∞} X_t = Bern(X_0).

Some Examples
1. X_{t+1} = X_t ± 4^{−t}, each w.p. 1/2: converges!
2. X_{t+1} = X_t ± 2^{−t}, each w.p. 1/2: uniform on [0,1].
3. For X_t ≤ 1/2: X_{t+1} = (3/2)·X_t w.p. 1/2 and (1/2)·X_t w.p. 1/2 (symmetrically for X_t > 1/2): polarizes (weakly)!
4. For X_t ≤ 1/2: X_{t+1} = X_t² w.p. 1/2 and 2X_t − X_t² w.p. 1/2 (symmetrically for X_t > 1/2): polarizes (strongly)!
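A small simulation of the four examples (my own illustration; the thresholds and horizon are arbitrary) makes the behaviors visible: the first chain stays in the middle, the second spreads over [0,1], and the last two pile up near {0,1}. For the two polarizing chains, the rule for X_t > 1/2 mirrors the stated rule, as assumed above:

```python
import random

def step_converge(x, t):
    # Example 1: X +/- 4^{-t}; total movement is bounded, so X_t converges inside (0,1).
    return x + random.choice([4.0**-t, -(4.0**-t)])

def step_uniform(x, t):
    # Example 2: X +/- 2^{-t}; X_T is a random binary expansion, uniform on [0,1].
    return x + random.choice([2.0**-t, -(2.0**-t)])

def step_weak(x, t):
    # Example 3: multiply the distance to the nearer endpoint by 3/2 or 1/2.
    y = min(x, 1.0 - x) * random.choice([1.5, 0.5])
    return y if x <= 0.5 else 1.0 - y

def step_strong(x, t):
    # Example 4: square the distance to the nearer endpoint, or its mirror step.
    y = min(x, 1.0 - x)
    y = random.choice([y * y, 2.0 * y - y * y])
    return y if x <= 0.5 else 1.0 - y

def fraction_in_middle(step, T=30, trials=5000, lo=0.01, hi=0.99):
    count = 0
    for _ in range(trials):
        x = 0.5
        for t in range(1, T + 1):
            x = step(x, t)
        count += lo < x < hi
    return count / trials

if __name__ == "__main__":
    for name, step in [("converges", step_converge), ("uniform", step_uniform),
                       ("weak polarization", step_weak), ("strong polarization", step_strong)]:
        print(f"{name:20s} Pr[X_T in (0.01, 0.99)] ~ {fraction_in_middle(step):.3f}")
```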

Main Result: Definition and Theorem
Strong Polarization (informal): Pr[X_t ∈ (τ, 1−τ)] ≤ ε with τ = 2^{−ω(t)} and ε = 2^{−O(t)}.
Formally: ∀γ > 0 ∃β < 1, c s.t. ∀t: Pr[X_t ∈ (γ^t, 1 − γ^t)] ≤ c·β^t.
Local Polarization:
- Variance in the middle (X_t ∈ (τ, 1−τ)): ∀τ > 0 ∃σ > 0 s.t. ∀t: X_t ∈ (τ, 1−τ) ⇒ Var[X_{t+1} | X_t] ≥ σ.
- Suction at the ends (X_t ∉ (τ, 1−τ)): ∃θ > 0 s.t. ∀c < ∞ ∃τ > 0 s.t. X_t < τ ⇒ Pr[X_{t+1} < X_t/c] ≥ θ. (This is the low-end condition; the high-end condition is similar.)
Both definitions are qualitative!
Theorem: Local Polarization ⇒ Strong Polarization.
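To see the formal definition in action, a quick empirical check (mine, with an arbitrarily chosen γ = 0.1) that the squaring martingale from the previous slide has Pr[X_t ∈ (γ^t, 1 − γ^t)] decaying rapidly in t:

```python
import random

def squaring_step(x):
    # One step of the strongly polarizing example above.
    y = min(x, 1.0 - x)
    y = random.choice([y * y, 2.0 * y - y * y])
    return y if x <= 0.5 else 1.0 - y

def middle_prob(t, gamma=0.1, trials=20000):
    """Estimate Pr[X_t in (gamma^t, 1 - gamma^t)] starting from X_0 = 1/2."""
    hits = 0
    for _ in range(trials):
        x = 0.5
        for _ in range(t):
            x = squaring_step(x)
        hits += gamma**t < x < 1.0 - gamma**t
    return hits / trials

if __name__ == "__main__":
    for t in (5, 10, 15, 20):
        print(f"t={t:2d}: Pr[X_t in (0.1^t, 1 - 0.1^t)] ~ {middle_prob(t):.4f}")
```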

Proof (Idea)
Step 1: The potential Φ_t ≝ min(X_t, 1 − X_t) decreases by a constant factor in expectation at each step, so E[Φ_T] = exp(−T), and hence Pr[X_T ≥ exp(−T/2)] ≤ exp(−T/2).
Step 2: Over the next T time steps, X_t plummets w.h.p.:
2.1: Say, if X_t ≤ τ then Pr[X_{t+1} ≤ X_t/100] ≥ 1/2.
2.2: Pr[∃t ∈ [T, 2T] s.t. X_t > τ] ≤ X_T/τ [Doob].
2.3: If the above doesn't happen, X_{2T} < 5^{−T} w.h.p. QED

Part III: Polar Codes. Encoding, Decoding, Martingale, Polarization

Lesson 0: Compression ⇒ Coding
Defn (Linear Compression Scheme): (M, D) form a compression scheme for Bern(p)^n if
- M: F_2^n → F_2^m is a linear map, and
- Pr_{Z ∼ Bern(p)^n}[D(M·Z) ≠ Z] = o(1).
Hopefully m/n ≤ H(p) + ε.
Compression ⇒ Coding: Let M⊥ be such that M·M⊥ = 0.
- Encoder: X ↦ M⊥·X.
- Error-corrector: Y = M⊥·X + Z ↦ Y − D(M·Y) = Y − D(M·M⊥·X + M·Z), which equals M⊥·X w.p. 1 − o(1).
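A toy instantiation of the reduction (my choice of code, not the talk's): take the [3,1] repetition code, with M its parity-check matrix, M⊥ its generator, and D a brute-force minimum-weight syndrome table:

```python
import itertools
import random

# Parity-check matrix M and a kernel basis M_PERP over F_2, so M @ M_PERP = 0 mod 2.
M = [[1, 1, 0],
     [1, 0, 1]]
M_PERP = [[1],
          [1],
          [1]]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

# D: map each syndrome M*Z to the minimum-weight Z producing it (brute force).
SYNDROME = {}
for z in itertools.product([0, 1], repeat=3):
    s = tuple(matvec(M, list(z)))
    if s not in SYNDROME or sum(z) < sum(SYNDROME[s]):
        SYNDROME[s] = list(z)

def encode(x):
    # Encoder: X -> M_perp * X.
    return matvec(M_PERP, x)

def correct(y):
    # Error-corrector: Y -> Y - D(M*Y); over F_2, subtraction is XOR.
    z_hat = SYNDROME[tuple(matvec(M, y))]
    return [(a + b) % 2 for a, b in zip(y, z_hat)]

if __name__ == "__main__":
    y = encode([1])
    y[random.randrange(3)] ^= 1        # noise Z: a single bit flip
    print("corrected:", correct(y))    # recovers the codeword [1, 1, 1]
```

Here D corrects any single bit flip, so the scheme decodes correctly whenever the Bern(p)³ noise flips at most one bit.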

Question: How to Compress?
Arikan's key idea: start with the 2×2 “polarization transform” (U, V) ↦ (U + V, V).
- Invertible, so does nothing?
- But if U, V are independent, then U + V is “more random” than either, while V | U + V is “less random” than either.
Iterate (ignoring conditioning): end with bits that are almost random, or almost determined (by the others). Output the “totally random part” to get compression!
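A numeric sanity check of this single step (my own illustration): for independent U, V ∼ Bern(p), U + V ∼ Bern(2p − 2p²), and since the transform is invertible the chain rule gives H(V | U+V) = 2H(p) − H(U+V), so the two branches straddle H(p):

```python
import math

def H(p):
    # Binary entropy in bits.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

if __name__ == "__main__":
    p = 0.11
    plus = 2 * p - 2 * p * p       # U+V ~ Bern(2p - 2p^2)
    h_plus = H(plus)               # the "more random" branch
    h_minus = 2 * H(p) - h_plus    # chain rule: H(U+V) + H(V|U+V) = H(U,V) = 2H(p)
    print(f"H(V|U+V) = {h_minus:.4f}  <  H(p) = {H(p):.4f}  <  H(U+V) = {h_plus:.4f}")
```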

Iteration Elaborated
Martingale: X_t ≝ conditional entropy of a random output bit after t iterations.
[Figure: two iterations of the butterfly on inputs A, …, H, producing sums such as A+B, C+D, B+D, A+B+C+D and E+F, G+H, F+H, E+F+G+H.]

The Polarization Butterfly
P_n(U, V) = (P_{n/2}(U + V), P_{n/2}(V)), where U = (Z_1, …, Z_{n/2}) and V = (Z_{n/2+1}, …, Z_n).
[Figure: the butterfly's first level computes Z_i + Z_{n/2+i} on the top half and passes Z_{n/2+i} through on the bottom half; recursing yields outputs W_1, …, W_n.]
Polarization ⇒ ∃S ⊆ [n] with H(W_{[n]∖S} | W_S) → 0 and |S| ≤ (H(p) + ε)·n.
Compression: E(Z) = P_n(Z)|_S. Encoding time O(n log n) (given S).
To be shown: 1. Decoding in O(n log n). 2. Polarization!
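The recursion transcribes directly into runnable Python (a sketch of mine; the set S itself comes out of the polarization analysis, so it is taken as given here):

```python
def polar_transform(z):
    """P_n(Z) via the recursion P_n(U, V) = (P_{n/2}(U+V), P_{n/2}(V));
    len(z) must be a power of 2. Runs in O(n log n) bit operations."""
    n = len(z)
    if n == 1:
        return list(z)
    half = n // 2
    u, v = z[:half], z[half:]
    u_plus_v = [(a + b) % 2 for a, b in zip(u, v)]
    return polar_transform(u_plus_v) + polar_transform(v)

def compress(z, S):
    """E(Z) = P_n(Z) restricted to the index set S (a sorted list)."""
    w = polar_transform(z)
    return [w[i] for i in S]
```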

The Polarization Butterfly: Decoding
Decoding idea: given S and W_S = P_n(U, V)|_S:
- Compute U + V ∼ Bern(2p − 2p²)^{n/2} and determine it recursively.
- Then determine V: conditioned on U + V, we have V ∼ ∏_i Bern(r_i), a non-identical product distribution.
Key idea: stronger induction, over non-identical product distributions!
Decoding algorithm: given S, W_S = P_n(U, V)|_S, and p_1, …, p_n:
1. Compute q_1, …, q_{n/2} from p_1, …, p_n.
2. Determine U + V = D(W_{S⁺}, (q_1, …, q_{n/2})).
3. Compute r_1, …, r_{n/2} from p_1, …, p_n and U + V.
4. Determine V = D(W_{S⁻}, (r_1, …, r_{n/2})).
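Filling in the recursion as runnable code (my sketch, matching the encoder above; `known` plays the role of W_S, and all priors are assumed to lie strictly inside (0,1) so no denominator vanishes):

```python
def decode(known, priors):
    """Successive-cancellation decoder (sketch): reconstruct the input Z from
    the output bits at positions in `known` (a dict index -> bit) and the
    priors p_i = Pr[Z_i = 1]."""
    n = len(priors)
    if n == 1:
        if 0 in known:                         # this output bit was kept in W_S
            return [known[0]]
        return [1 if priors[0] > 0.5 else 0]   # else: maximum-likelihood guess
    half = n // 2
    top, bot = priors[:half], priors[half:]
    # q_i = Pr[(U+V)_i = 1] = Pr[Z_i + Z_{half+i} = 1].
    q = [a * (1 - b) + (1 - a) * b for a, b in zip(top, bot)]
    u_plus_v = decode({i: b for i, b in known.items() if i < half}, q)
    # r_i = Pr[V_i = 1 | (U+V)_i = s]: the non-identical product distribution.
    r = []
    for a, b, s in zip(top, bot, u_plus_v):
        if s == 0:
            r.append(a * b / (a * b + (1 - a) * (1 - b)))
        else:
            r.append((1 - a) * b / (a * (1 - b) + (1 - a) * b))
    v = decode({i - half: b for i, b in known.items() if i >= half}, r)
    u = [(s + b) % 2 for s, b in zip(u_plus_v, v)]
    return u + v
```

Each level does O(n) arithmetic across its subproblems and there are log n levels, matching the claimed O(n log n) decoding time.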

Polarization Martingale
Idea: follow X_t = H(Z_{i,t} | Z_{<i,t}) for a random output index i, where Z_{i,0}, Z_{i,1}, …, Z_{i,t}, …, Z_{i,log n} tracks the i-th bit through the levels of the butterfly.
X_t is a martingale! Polarizes? Strongly? ⇐ Locally?

Local Polarization of the Arikan Martingale
Variance in the middle: roughly, (H(p), H(p)) → (H(2p − 2p²), 2H(p) − H(2p − 2p²)).
p ∈ (τ, 1−τ) ⇒ 2p − 2p² is far from p; by continuity of H(·), H(2p − 2p²) is far from H(p).
Suction at the ends:
- High end: H(1/2 − γ) → H(1/2 − Θ(γ²)); since H(1/2 − γ) = 1 − Θ(γ²), this sends 1 − γ² to 1 − Θ(γ⁴).
- Low end: H(p) ≈ p·log(1/p), so 2H(p) ≈ 2p·log(1/p), while H(2p − 2p²) ≈ H(2p) ≈ 2p·log(1/(2p)) ≈ 2H(p) − 2p ≈ (2 − 1/log(1/p))·H(p).
Dealing with conditioning takes more work (lots of Markov).
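These estimates are easy to check numerically (my own sanity check of the slide's asymptotics, with arbitrarily chosen sample points):

```python
import math

def H(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def branches(p):
    # One 2x2 step: entropies (H(2p - 2p^2), 2H(p) - H(2p - 2p^2)).
    h_plus = H(2 * p - 2 * p * p)
    return h_plus, 2 * H(p) - h_plus

if __name__ == "__main__":
    # Middle: both branches move a constant away from H(p).
    print("middle, p=0.25:", H(0.25), "->", branches(0.25))
    # High end: the gap to 1 shrinks from Theta(gamma^2) to Theta(gamma^4).
    g = 0.01
    print("high end:", 1 - H(0.5 - g), "->", 1 - branches(0.5 - g)[0])
    # Low end: the '-' branch falls well below H(p), by roughly a log(1/p) factor.
    p = 1e-4
    print("low end:", H(p), "->", branches(p)[1])
```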

“New Contributions”
So far: reproduced old work (with simpler proofs?). The main new technical contributions:
- Strong polarization of general transforms, e.g. (U, V, W) → (U + V, V, W)?
- Exponentially strong polarization [BGS'18]: suction at the low end is very strong; a random k×k matrix yields X_{t+1} ≈ X_t^{k^{.99}} w.h.p.
- Strong polarization of Markovian sources [GNS'19?]: separates compression of known sources from unknown ones.

Conclusions
Polarization: an important phenomenon associated with bounded random variables. Not always bad! Needs better understanding.
Recently also used in CS: [Lovett, Meka] (implicit); [Chattopadhyay, Hatami, Hosseini, Lovett] (explicit).

Thank You!

Rest of the Talk
- Background/motivation: the Shannon capacity challenge; resolution by [Arikan'08], [Guruswami-Xia'13], [Hassani-Alishahi-Urbanke'13]: Polar Codes!
- Key ideas: polar codes & the Arikan martingale; implications of weak/strong polarization.
- Our contributions: simplification & generalization. Local polarization ⇒ strong polarization; the Arikan martingale polarizes locally (in general).

Context for Today's Talk
[Arikan'08]: a martingale associated to a general class of codes. If the martingale polarizes (weakly), the code achieves capacity (weakly); and the martingale polarizes weakly for the entire class.
[GX'13, HAU'13]: strong polarization, but for one specific code. Why so specific? Bottleneck: strong polarization.
Rest of talk: polar codes, the martingale, and strong polarization.