The Goldreich-Levin Theorem: List-decoding the Hadamard code


The Goldreich-Levin Theorem: List-decoding the Hadamard code Amnon Aaronsohn ECC Course, TAU

Outline: Motivation; Probability review; Theorem and proof.

Decoding Fix an (n, k, d) code C over an alphabet Σ, and suppose there is an unknown message s ∈ Σ^k. We are given a vector y ∈ Σ^n which is equal to the codeword C(s) with at most m of the places corrupted. We want to find the possible values s' ∈ Σ^k for the original message, i.e. all s' with d_H(C(s'), y) ≤ m. If m < d/2 then there is a unique solution; if d/2 < m < d there could be multiple solutions.

Hadamard Codes A [2^n, n, 2^(n-1)]_2 linear code. The encoding of a message x ∈ F_2^n is given by all 2^n scalar products <x,y> for y ∈ F_2^n. (Note: all string-related math here is mod 2.) Why is the relative distance 1/2? We will see a probabilistic algorithm that provides list decoding for Hadamard codes when up to a 1/2 − ε fraction of the bits are corrupted.
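Below is a minimal sketch (Python, not from the slides; function names are mine) of this encoding: each codeword bit is the inner product <x,y> mod 2 for one of the 2^n vectors y.

```python
# Hypothetical illustration of the Hadamard encoding, not part of the slides.
from itertools import product

def inner_product_mod2(x, y):
    """<x,y> mod 2 for two equal-length bit tuples."""
    return sum(xi & yi for xi, yi in zip(x, y)) % 2

def hadamard_encode(x):
    """Return the 2^n codeword bits <x,y> for all y in F_2^n."""
    n = len(x)
    return [inner_product_mod2(x, y) for y in product((0, 1), repeat=n)]

# Example: a 3-bit message is encoded into 2^3 = 8 bits.
print(hadamard_encode((1, 0, 1)))  # [0, 1, 0, 1, 1, 0, 1, 0]
```

Any two distinct messages yield codewords that differ in exactly 2^(n-1) of the 2^n positions, which is the relative distance 1/2 asked about above.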

Basic probability theory review Random variables (discrete). Expected value (μ): E(X) = Σ_x x·p(x). Variance (σ²): Var(X) = E[(X − E(X))²] = E[X²] − E[X]².

Binary random variables Pr(X=1) = p, Pr(X=0) = 1 − p. Often used as indicator variables. E(X) = p, Var(X) = p(1 − p) ≤ 1/4.

Majority votes Consider a probabilistic algorithm that returns a binary value (0 or 1) and is correct with probability > 1/2. We can amplify the probability of getting the correct answer by calling the algorithm multiple times and deciding by the majority vote (see the sketch below). For this to work well there must be some independence between the algorithm's results across invocations.
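A small simulation (a hypothetical sketch, not from the slides) of this amplification: a base procedure that answers correctly with probability 0.6 is called k times and the majority answer is returned.

```python
# Hypothetical illustration of majority-vote amplification.
import random

def noisy_answer(truth, p_correct=0.6):
    """Return the true bit with probability p_correct, its flip otherwise."""
    return truth if random.random() < p_correct else 1 - truth

def majority_vote(truth, k, p_correct=0.6):
    """Call the noisy procedure k times and return the majority answer."""
    votes = sum(noisy_answer(truth, p_correct) for _ in range(k))
    return 1 if 2 * votes > k else 0

# Empirically, the error probability drops quickly as k grows.
trials = 2000
for k in (1, 11, 101):
    errors = sum(majority_vote(1, k) != 1 for _ in range(trials))
    print(f"k={k:4d}  empirical error ~ {errors / trials:.3f}")
```

The quantitative rate of this decay is exactly what the Chernoff and Chebyshev bounds on the following slides provide.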

Independence Events A1,...,An are (mutually) independent if for every subset S of indices, Pr[∩_{i∈S} Ai] = Π_{i∈S} Pr[Ai]. Likewise, random variables X1,...,Xn are independent if for each possible assignment x1,...,xn: Pr[X1=x1,...,Xn=xn] = Pr[X1=x1]···Pr[Xn=xn].

Pairwise independence A set of r.v.'s (or events) is pairwise independent if every pair in the set is independent. Does one type of independence imply the other? Mutual independence implies pairwise independence, but the converse fails, as the next example shows.

Example: XORs of random bits Let X1,…,Xk be independent binary r.v.'s with p = 1/2. For each non-empty subset of indices J define X_J = ⊕_{i∈J} X_i (= Σ_{i∈J} X_i mod 2). The X_J's are (1) uniformly distributed, (2) not mutually independent, (3) pairwise independent. This extends trivially to random vectors.
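A quick empirical check of the pairwise independence claim (a hypothetical sketch, not from the slides): generate k = 3 fair bits, form X_J for every non-empty J, and look at the joint frequencies of a few pairs; every cell should be close to 1/4, as for two independent fair bits, even though e.g. X_{1}, X_{2}, X_{1,2} are clearly not mutually independent.

```python
# Hypothetical check that subset XORs of fair bits are pairwise independent.
import random
from itertools import combinations

k, trials = 3, 100_000
subsets = [frozenset(c) for r in range(1, k + 1) for c in combinations(range(k), r)]
counts = {(J, K): [[0, 0], [0, 0]] for J, K in combinations(subsets, 2)}

for _ in range(trials):
    bits = [random.getrandbits(1) for _ in range(k)]
    x = {J: sum(bits[i] for i in J) % 2 for J in subsets}
    for (J, K), table in counts.items():
        table[x[J]][x[K]] += 1

# Print joint frequencies for the first few pairs; all should be near 0.25.
for (J, K), table in list(counts.items())[:3]:
    freqs = [[round(c / trials, 3) for c in row] for row in table]
    print(sorted(J), sorted(K), freqs)
```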

Chernoff bound Reminder: we want to improve the accuracy of an algorithm by calling it multiple times and deciding by majority vote. If the n trials are independent and each is correct with probability p ≥ 1/2 + ε, the probability that the majority vote is wrong is bounded by Pr(error) ≤ exp(−2nε²).

Chebyshev inequality For any r.v. X with expected value μ and variance σ²: Pr(|X − μ| ≥ a) ≤ σ²/a². It can be used to bound the probability of not getting a majority out of n pairwise independent trials, each correct with probability p ≥ 1/2 + ε: Pr(error) ≤ 1/(4nε²).
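A small comparison (hypothetical numbers, not from the slides) of how many trials each bound requires to push the error below δ: Chernoff needs n ≥ ln(1/δ)/(2ε²), while the Chebyshev bound 1/(4nε²) needs n ≥ 1/(4δε²), which grows much faster as δ shrinks.

```python
# Hypothetical comparison of trial counts under the two tail bounds.
import math

def n_chernoff(eps, delta):
    """Smallest n with exp(-2 n eps^2) <= delta."""
    return math.ceil(math.log(1 / delta) / (2 * eps ** 2))

def n_chebyshev(eps, delta):
    """Smallest n with 1 / (4 n eps^2) <= delta."""
    return math.ceil(1 / (4 * delta * eps ** 2))

for delta in (0.1, 0.01, 0.001):
    print(delta, n_chernoff(0.1, delta), n_chebyshev(0.1, delta))
# delta=0.001 with eps=0.1: 346 trials vs. 25000 trials.
```

Pairwise independence is a much weaker requirement than full independence, and the price is this weaker, polynomial rather than exponential, decay of the error bound.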

Back to the decoding problem Message space {0,1}^n. Think of codewords as binary functions: c = Had(s) ⟺ for every x, c(x) = <s,x>. Input: a function f: {0,1}^n → {0,1}, representing a codeword with noise. Output: a list L of possible messages such that for each s ∈ L, f agrees with Had(s) on a p fraction of the function inputs: Pr_x[f(x) = <s,x>] = p. Time complexity is measured in terms of calls to f.

No error case: p = 1 Unique decoding In this case we can recover the ith bit of the message by computing f(ei) where ei is the string with 1 at the ith position and 0 everywhere else.
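Under this p = 1 assumption, a minimal sketch (hypothetical, Python) of that step: query f at the standard basis vectors e_1,…,e_n, since <s, e_i> = s_i.

```python
# Hypothetical noiseless decoder: s_i = <s, e_i> = f(e_i).
def decode_noiseless(f, n):
    """Recover the n-bit message from an uncorrupted Hadamard oracle f."""
    bits = []
    for i in range(n):
        e_i = tuple(1 if j == i else 0 for j in range(n))
        bits.append(f(e_i))
    return tuple(bits)

# Usage with a perfect oracle for the message s = (1, 0, 1):
s = (1, 0, 1)
f = lambda y: sum(si & yi for si, yi in zip(s, y)) % 2
print(decode_noiseless(f, 3))  # (1, 0, 1)
```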

Low error case: p = 3/4 + ε Unique decoding Why not simply use f(ei) as before? The corrupted positions could be exactly the ei's. Probabilistic algorithm: Estimate-Had(x): For j = 1…k (k to be fixed): choose r_j ∈ {0,1}^n randomly; a_j ← f(r_j + x) − f(r_j). Return majority(a_1,…,a_k). Now set the ith bit of the solution to Estimate-Had(ei).
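A runnable sketch of Estimate-Had (Python, hypothetical names; f is assumed to be an oracle from n-bit tuples to {0,1}):

```python
# Hypothetical sketch of Estimate-Had: self-correct f at x via random shifts.
import random

def xor_vec(u, v):
    """Coordinate-wise XOR of two bit tuples (addition over F_2^n)."""
    return tuple(a ^ b for a, b in zip(u, v))

def estimate_had(f, x, k):
    """Estimate <s, x> by a majority vote over k random self-corrections."""
    n = len(x)
    votes = 0
    for _ in range(k):
        r = tuple(random.getrandbits(1) for _ in range(n))
        # If both queries hit uncorrupted positions, f(r+x) - f(r) = <s,x> mod 2.
        votes += f(xor_vec(r, x)) ^ f(r)
    return 1 if 2 * votes > k else 0

def decode_unique(f, n, k=200):
    """Recover each bit s_i as Estimate-Had(e_i)."""
    basis = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    return tuple(estimate_had(f, e_i, k) for e_i in basis)
```

Each r_j is uniformly random, so both f(r_j) and f(r_j + x) are queried at uniformly distributed points; this is what the analysis on the next slides uses.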

Analysis Consider this part: choose r_j ∈ {0,1}^n randomly; a_j ← f(r_j + x) − f(r_j). If both f(r_j + x) and f(r_j) are correct then a_j = f(r_j + x) − f(r_j) = <s, r_j + x> − <s, r_j> = <s, x>. Using a union bound we get Pr[a_j ≠ <s,x>] ≤ 2(1 − p) = 1/2 − 2ε.

Analysis (contd.) Since we take a majority vote of a_1,…,a_k and they are independent, a Chernoff bound gives at most e^(−Ω(kε²)) for the probability of error. The probability of getting some bit wrong is Pr[Estimate-Had(ei) is wrong for some i] ≤ n·e^(−Ω(kε²)). Taking k = O(log n / ε²) gives an O(n log n / ε²) algorithm with arbitrarily small error. Note that the per-sample error probability is doubled (two queries per sample), so this approach does not work for p < 3/4.

General case: p = 1/2 + ε List decoding The Goldreich-Levin theorem gives a probabilistic algorithm for this problem. Specifically: Input: a function f() as before. Output: a list L of strings such that each possible solution s appears in it with high probability: Pr_x[f(x) = <s,x>] ≥ 1/2 + ε ⟹ Pr[s ∈ L] ≥ 1/2. Run time: poly(n/ε).

The algorithm (almost) Suppose that we somehow know the values of Had(s) at m places. Specifically, we are given the strings r_1,…,r_m and the values b_1,…,b_m where b_j = <s, r_j>, for an unknown s. We can then try to compute the value of Had(s) at any x: Estimate-With-Guess(x, r_1,…,r_m, b_1,…,b_m): For each non-empty J ⊆ {1,...,m}: a_J ← f(x + Σ_{j∈J} r_j) − Σ_{j∈J} b_j. Return the majority of all a_J. Now get the bits of s by calling Estimate-With-Guess with ei as before.
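A hedged Python sketch of Estimate-With-Guess (names are mine): given the guessed values b_j = <s, r_j>, it derives 2^m − 1 shifted query points and takes a majority vote over them.

```python
# Hypothetical sketch of Estimate-With-Guess (EWG).
from itertools import combinations

def xor_vec(u, v):
    """Coordinate-wise XOR of two bit tuples (addition over F_2^n)."""
    return tuple(a ^ b for a, b in zip(u, v))

def estimate_with_guess(f, x, rs, bs):
    """Estimate <s, x>, assuming bs[j] = <s, rs[j]> for every j."""
    m = len(rs)
    votes = total = 0
    for size in range(1, m + 1):
        for J in combinations(range(m), size):
            point, b_J = x, 0
            for j in J:
                point = xor_vec(point, rs[j])  # point = x + sum of r_j over F_2
                b_J ^= bs[j]                   # b_J = <s, sum of r_j> if guesses hold
            votes += f(point) ^ b_J            # one estimate a_J of <s, x>
            total += 1
    return 1 if 2 * votes > total else 0
```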

Analysis The idea here is that, due to linearity, we can get correct values at more places than we are given. For any non-empty J ⊆ {1,...,m} define r_J = Σ_{j∈J} r_j. Then <s, r_J> = <s, Σ_{j∈J} r_j> = Σ_{j∈J} <s, r_j> = Σ_{j∈J} b_j. If the r_j's are uniformly random, so are the r_J's. The probability of getting a_J wrong is therefore the probability of getting f(x + r_J) wrong, which is bounded by 1/2 − ε.

But! The r_J's are not independent, so the Chernoff bound can't be used. However, they are pairwise independent, so we can use Chebyshev: Pr[EWG(x, r_1,…,r_m, b_1,…,b_m) ≠ <s,x>] ≤ 1/(2^m·ε²) when the r_i's are chosen independently and uniformly and b_i = <s, r_i> for each i. We can recover all the bits with an error probability of at most n/(2^m·ε²). Taking 2^m = O(n/ε²) gives an O(n²/ε²) algorithm with arbitrarily small error.

Completing the algorithm We don't actually have the correct values for the b_i's. But if m is small we can try all 2^m combinations; for each solution, one of them must be correct! The final algorithm: 1. Choose r_1,…,r_m randomly. 2. For each (b_1,…,b_m) ∈ {0,1}^m: 2.1 For i = 1,…,n: a_i ← EWG(ei, r_1,…,r_m, b_1,…,b_m). 2.2 Output (a_1,…,a_n). Complexity: O(n³/ε⁴).
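Putting everything together, a hedged end-to-end sketch (Python, my own naming; the EWG helper from the previous sketch is repeated so this block runs on its own, and the parameters in the toy usage are illustrative, not tuned):

```python
# Hypothetical end-to-end sketch of the Goldreich-Levin list decoder.
import random
from itertools import combinations, product

def xor_vec(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def ewg(f, x, rs, bs):
    """Estimate-With-Guess: majority over all non-empty J of f(x + r_J) - b_J."""
    m = len(rs)
    votes = total = 0
    for size in range(1, m + 1):
        for J in combinations(range(m), size):
            point, b_J = x, 0
            for j in J:
                point = xor_vec(point, rs[j])
                b_J ^= bs[j]
            votes += f(point) ^ b_J
            total += 1
    return 1 if 2 * votes > total else 0

def goldreich_levin(f, n, m):
    """Return a list of candidate messages; each good s should appear w.h.p."""
    rs = [tuple(random.getrandbits(1) for _ in range(n)) for _ in range(m)]
    basis = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    candidates = []
    for bs in product((0, 1), repeat=m):            # try all 2^m guesses
        candidates.append(tuple(ewg(f, e_i, rs, bs) for e_i in basis))
    return candidates

# Toy usage: a Hadamard oracle for s where each query is flipped w.p. 0.25
# (per-query random noise, a simpler model than a fixed corrupted codeword).
n, s = 4, (1, 0, 1, 1)
noisy = lambda y: (sum(a & b for a, b in zip(s, y)) % 2) ^ (random.random() < 0.25)
print(s in goldreich_levin(noisy, n, m=7))  # usually True
```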

Finally Now that we can generate a list in which every possible solution appears with probability ≥ 1/2, we can re-run the algorithm a constant number of times to get an arbitrarily small probability of missing a given solution.

Summary We saw a list-decoding algorithm for the Hadamard code, enumerating with high probability all messages whose codewords are within relative distance arbitrarily close to 1/2 of a given word. Sample f() at uniformly distributed points, so that the adversary cannot bias the result. Generate the sample points inside a linear subspace spanned by a small number of random points, for which we can try all combinations of guessed values. This results in pairwise independent trials, so we can apply the Chebyshev inequality.