Locally Decodable Codes
Uri Nadav
Contents
What is a Locally Decodable Code (LDC)?
Constructions
Lower Bounds
Reduction from Private Information Retrieval (PIR) to LDC
Minimum Distance
For every x ≠ y, the codewords satisfy d(C(x), C(y)) ≥ δ (relative distance δ).
The error-correction problem is solvable for less than a δ/2 fraction of errors in the codeword.
The error-detection problem is solvable for less than a δ fraction of errors in the codeword.
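A small worked instance of these two radii (the δ = ½ numbers anticipate the Hadamard code discussed later; this is a sketch, not part of the original slides):

```latex
% Unique decoding vs. detection radius for a code of length m with relative distance delta.
\[
  d(C(x),C(y)) \ge \delta m \quad \text{for all } x \neq y
\]
\[
  \text{correction: } \#\text{errors} < \tfrac{\delta m}{2}
  \;\Rightarrow\; C(x) \text{ remains the unique closest codeword}
\]
\[
  \text{detection: } \#\text{errors} < \delta m
  \;\Rightarrow\; \text{the corrupted word cannot coincide with another codeword}
\]
\[
  \text{e.g. } \delta = \tfrac{1}{2},\; m = 2^n:\quad
  \text{correct} < \tfrac{m}{4} \text{ errors},\qquad
  \text{detect} < \tfrac{m}{2} \text{ errors}.
\]
```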
Error-correction
(Diagram: input x → Encoding → codeword C(x) → Errors (worst-case error assumption) → corrupted codeword y → Decoding of bit i → decoded bit x[i].)
Query Complexity
The number of indices the decoder is allowed to read from the (corrupted) codeword.
Naive decoding reads the whole codeword, i.e., query complexity Ω(|C(x)|).
We are interested in constant query complexity.
Adversarial Model
We can view the error model as an adversary that chooses which positions to destroy and has access to the decoding/encoding scheme (but not to the random coins).
The adversary is allowed to insert at most δm errors.
Why not decode in blocks?
The adversary is worst case, so it can destroy more than a δ fraction of some blocks and less of others.
(Diagram: "nice" errors spread evenly over the blocks vs. the worst case, with many errors concentrated in the same block.)
Ideal Code
C : {0,1}^n → {0,1}^m with:
Constant information rate: n/m > c
Resilience against a constant fraction of errors (linear minimum distance)
Efficient decoding (constant query complexity)
No such code!
Definition of LDC
C : {0,1}^n → Σ^m is a (q, δ, ε) locally decodable code if there exists a probabilistic algorithm A such that:
For every x ∈ {0,1}^n, every y ∈ Σ^m with distance d(y, C(x)) < δm, and every i ∈ {1,…,n}: Pr[A(y, i) = x_i] > ½ + ε.
A reads at most q indices of y (of its choice).
The probability is over the coin tosses of A.
Queries are not allowed to be adaptive.
A must be probabilistic if q < δm (a deterministic decoder's fixed query set could be entirely corrupted).
A has oracle access to y.
Example: Hadamard Code
Hadamard is a (2, δ, ½ − 2δ) LDC.
Construction: the codeword has one bit for every a ∈ {0,1}^n, namely C(x)_a = ⟨x, a⟩ = Σ_i x_i a_i (mod 2); this maps the source word to a codeword of length 2^n.
Relative minimum distance: ½.
Example: Hadamard Code – Reconstruction
To decode the source bit x_i with 2 queries:
Pick a ∈_R {0,1}^n uniformly at random, and query the codeword at a and at a + e_i, where e_i = (0,…,0,1,0,…,0) has 1 in the i'th entry.
Reconstruction formula: output ⟨x, a⟩ + ⟨x, a + e_i⟩ = x_i (the XOR of the two answers).
If less than a δ fraction of errors occurred, then the reconstruction probability is at least 1 − 2δ.
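A minimal Python sketch of this encoding and its 2-query reconstruction (a toy, exponential-length code for tiny n; the single flipped bit stands in for the adversary's corruption):

```python
import random

def hadamard_encode(x):
    """Hadamard encoding: one bit <x,a> mod 2 for every a in {0,1}^n."""
    n = len(x)
    return [sum(x[i] & (a >> i) & 1 for i in range(n)) % 2 for a in range(2 ** n)]

def hadamard_decode_bit(y, i, n):
    """2-query local decoding of x_i from a (possibly corrupted) codeword y."""
    a = random.randrange(2 ** n)      # uniformly random query point
    b = a ^ (1 << i)                  # a + e_i: flip the i-th coordinate of a
    return y[a] ^ y[b]                # <x,a> + <x,a+e_i> = x_i when both queries are clean

if __name__ == "__main__":
    n = 4
    x = [1, 0, 1, 1]
    y = hadamard_encode(x)
    y[3] ^= 1                         # one corrupted position; a delta fraction of
                                      # errors would still decode w.p. >= 1 - 2*delta
    print([hadamard_decode_bit(y, i, n) for i in range(n)])
```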
Another Construction…
View the message as a √n × √n matrix of bits x_{i,j}. The codeword stores, for every pair of subsets (A, B) of the row/column indices, the XOR of x_{a,b} over a ∈ A, b ∈ B.
Reconstruction of bit x_{i,j}: pick a random pair (A, B) and query 1) (A, B) 2) (A △ {i}, B) 3) (A, B △ {j}) 4) (A △ {i}, B △ {j}), then XOR the four answers.
With less than a δ fraction of errors, queries 1–4 are all uncorrupted, and decoding is correct, with probability at least 1 − 4δ.
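A toy Python sketch of this subset-sum construction as read from the slide (names and the 2×2 example are mine; the code has exponential length, so it is only feasible for tiny messages):

```python
from itertools import product
import random

def encode(x_matrix):
    """For every pair of subsets (A, B) of the row/column indices, store the
    XOR of x[a][b] over a in A, b in B. Subsets are characteristic vectors."""
    t = len(x_matrix)                                   # matrix is t x t, n = t*t bits
    code = {}
    for A in product([0, 1], repeat=t):
        for B in product([0, 1], repeat=t):
            bit = 0
            for a in range(t):
                for b in range(t):
                    bit ^= x_matrix[a][b] & A[a] & B[b]
            code[(A, B)] = bit
    return code

def flip(S, i):
    """Symmetric difference S xor {i} on a characteristic vector."""
    S = list(S); S[i] ^= 1
    return tuple(S)

def decode_bit(code, i, j, t):
    """4-query reconstruction of x[i][j]: terms with a != i, b != j cancel out."""
    A = tuple(random.randint(0, 1) for _ in range(t))
    B = tuple(random.randint(0, 1) for _ in range(t))
    return (code[(A, B)] ^ code[(flip(A, i), B)]
            ^ code[(A, flip(B, j))] ^ code[(flip(A, i), flip(B, j))])

# Toy usage on a 2x2 message (n = 4): with no errors every bit decodes correctly.
x = [[1, 0], [1, 1]]
c = encode(x)
print([[decode_bit(c, i, j, 2) for j in range(2)] for i in range(2)])
```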
Generalization…
With 2^k queries, the same idea gives m = 2^{k·n^{1/k}}.
Smoothly Decodable Code
C : {0,1}^n → Σ^m is a (q, c, ε) smoothly decodable code if there exists a probabilistic algorithm A such that:
For every x ∈ {0,1}^n and i ∈ {1,…,n}: Pr[A(C(x), i) = x_i] > ½ + ε.
A reads at most q indices of C(x) (of its choice).
For every i ∈ {1,…,n} and j ∈ {1,…,m}: Pr[A(·, i) reads j] ≤ c/m (the event is: A reads index j of C(x) to reconstruct index i).
The probability is over the coin tosses of A.
Queries are not allowed to be adaptive.
A has access to a non-corrupted codeword.
LDC is also a Smooth Code
Claim: Every (q, δ, ε) LDC is a (q, q/δ, ε) smooth code.
Intuition: if the code is resilient against a linear number of errors, then no position of the codeword can be queried too often (otherwise the adversary would corrupt it).
Proof: LDC is Smooth
Let A be a reconstruction algorithm for the (q, δ, ε) LDC.
Let S_i = { j : Pr[A(·, i) queries j] > q/(δm) } be the set of indices read 'too' often.
A makes at most q queries, so the query probabilities sum to at most q over all j, and thus |S_i| < δm.
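The counting step, written out (a sketch of the standard argument):

```latex
% A makes at most q queries, so the query probabilities sum to at most q:
\[
  \sum_{j=1}^{m} \Pr[A(\cdot,i)\ \text{queries}\ j] \;\le\; q .
\]
% Every j in S_i contributes more than q/(\delta m) to this sum, hence
\[
  |S_i| \cdot \frac{q}{\delta m} \;<\; q
  \quad\Longrightarrow\quad
  |S_i| \;<\; \delta m .
\]
```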
Proof: LDC is Smooth (cont.)
A' uses A as a black box and returns whatever A returns as x_i.
A' gives A oracle access to a modified codeword C(x)': [C(x)']_j = C(x)_j for j ∉ S_i, and 0 for j ∈ S_i (only indices outside S_i are answered from the true codeword).
A reconstructs x_i with probability at least ½ + ε, because C(x)' contains at most |S_i| < δm errors.
Hence A' is a (q, q/δ, ε) smooth decoding algorithm.
Proof: LDC is Smooth
(Figure: C(x) is what A wants; C(x)' is what A gets, with the indices that A reads too often fixed arbitrarily to 0 by A'.)
Smooth Code is LDC
A bit can be reconstructed using q uniformly distributed queries with ε advantage when there are no errors.
With at most a δ fraction of errors, all q queries land on non-corrupted indices with probability at least 1 − qδ.
Remember: the adversary does not know the decoding procedure's random coins.
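The success probability this gives, as a union bound (a sketch; queries assumed uniform, as on the slide):

```latex
% Each of the q uniform queries hits one of the at most \delta m corrupted
% positions with probability at most \delta, so by a union bound
\[
  \Pr[\text{some query is corrupted}] \;\le\; q\delta ,
\]
\[
  \Pr[A\ \text{outputs}\ x_i] \;\ge\; \tfrac{1}{2} + \varepsilon - q\delta .
\]
```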
Lower Bounds
Non-existence for q = 1 [KT]
Super-linear codeword length (non-linear rate) for q ≥ 2 [KT]
Exponential codeword length for linear codes, q = 2 [Goldreich et al.]
Exponential codeword length for every code, q = 2 [Kerenidis, de Wolf] (using quantum arguments)
Information Theory Basics
Entropy: H(x) = −∑_i Pr[x = i] · log Pr[x = i]
Mutual information: I(x; y) = H(x) − H(x|y)
Information Theory cont…
The entropy of multiple variables is at most the sum of the entropies (with equality when all the variables are mutually independent): H(x_1 x_2 … x_n) ≤ ∑ H(x_i).
The highest entropy is achieved by a uniformly distributed random variable.
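Two small worked instances of these facts (not from the original slides):

```latex
% Uniform variables maximize entropy:
\[
  x \sim \mathrm{Uniform}(\Sigma)\ \Rightarrow\ H(x) = \log|\Sigma| ,
  \qquad
  x \sim \mathrm{Uniform}(\{0,1\}^n)\ \Rightarrow\ H(x) = n .
\]
% Subadditivity, with equality for independent fair bits:
\[
  H(x_1,\dots,x_n) \;\le\; \sum_{i=1}^{n} H(x_i) \;=\; n .
\]
```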
IT result from [KT]
Proof Combined …
Single query (q=1)
Claim: If C : {0,1}^n → Σ^m is (1, δ, ε) locally decodable, then n is bounded by a constant depending only on δ, ε and |Σ|: no such family of codes!
Good Index
Index j is said to be 'good' for i if Pr[A(C(x), i) = x_i | A reads j] > ½ + ε.
Single query (q=1)
There exists at least one index j_1 that is good for i: by the definition of LDC, Pr[A(C(x), i) = x_i] > ½ + ε, and this is a conditional probability summed over the disjoint events "A reads j".
Perturbation Vector
Def: the perturbation vector Δ_{j_1, j_2, …} takes values uniformly distributed over Σ in positions j_1, j_2, … and 0 otherwise.
It destroys the specified indices in the most unpredictable way.
Adding Perturbation
A is resilient against at least one error, so on input C(x) + Δ_{j_1} there still exists at least one index j_2 that is good for i.
Moreover j_2 ≠ j_1, because after the perturbation j_1 cannot be good!
Single query (q=1)
Repeating this argument (A is resilient against δm errors), for every i there are at least δm indices of the codeword that are good for i.
By the pigeonhole principle, there exists an index j' ∈ {1,…,m} that is good for at least δn of the input indices.
Single query (q=1)
Think of C(x) projected on coordinate j' as a function of the δn input indices for which j' is good.
Its range is Σ, and each of these input bits can be reconstructed from it with probability ½ + ε.
Thus, by the IT result, δn is bounded in terms of log|Σ|, which is impossible for large enough n.
Case q ≥ 2
m = Ω(n^{q/(q−1)})
Constant query complexity reconstruction procedures are impossible for codes having constant rate!
Case q ≥ 2 – Proof Sketch
An LDC C is also a smooth code.
A q-query smooth codeword has a small enough subset of indices that still encodes a linear amount of information.
So, by the IT result, m^{(q−1)/q} = Ω(n), i.e., m = Ω(n^{q/(q−1)}).
Applications?
Better locally decodable codes have applications to PIR.
Applications to the practice of fault-tolerant data storage/transmission?
What about Locally Encodable Codes?
A 'respectable code' is resilient against a constant fraction of errors.
We therefore expect each bit of the message to influence many bits of the encoding (see the counting sketch below).
Otherwise, there exists a message bit that influences less than a 1/n fraction of the encoding, and the adversary can erase all information about it.
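A sketch of the counting behind that last sentence (assuming, for illustration, that each codeword bit depends on at most a constant number r of message bits):

```latex
% Count (message bit, codeword bit) dependence pairs:
\[
  \sum_{i=1}^{n} \#\{\text{codeword bits influenced by } x_i\} \;\le\; r\,m ,
\]
% so some message bit influences at most an r/n fraction of the codeword:
\[
  \exists\, i:\quad \#\{\text{codeword bits influenced by } x_i\} \;\le\; \frac{r\,m}{n} .
\]
% For constant r this is an O(1/n) fraction; corrupting those positions erases
% all information about x_i, so the code cannot tolerate a constant fraction of errors.
```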
Open Issues
Adaptive vs. non-adaptive queries (an adaptive decoder can be simulated non-adaptively by guessing the first q−1 answers, with success probability |Σ|^{−(q−1)}).
Closing the gap between the upper and lower bounds.
Logarithmic Number of Queries
View the message as a polynomial p : F^k → F of degree d (F is a field, |F| ≫ d).
Encode the message by evaluating p at all |F|^k points.
To encode an n-bit message, one can have |F| polynomial in n, and d, k around polylog(n).
To reconstruct p(x):
Pick a random line in F^k passing through x; evaluate p on d+1 points of the line; by interpolation, find the degree-d univariate polynomial that agrees with p on the line.
Use the interpolated polynomial to estimate p(x).
The algorithm reads p at d+1 points, each uniformly distributed.
(Figure: the queried points on the line through x: x+y, x+2y, …, x+(d+1)y.)
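A small Python sketch of this line-decoding step over a prime field (toy parameters; the particular polynomial below is an illustrative example, not a real encoding of a message, and the oracle is uncorrupted):

```python
import random

P = 101                                # a small prime field F_P, with P >> d

def poly_eval(coeffs, point):
    """Evaluate a k-variate polynomial given as {exponent tuple: coefficient} mod P."""
    total = 0
    for exps, c in coeffs.items():
        term = c
        for xi, e in zip(point, exps):
            term = term * pow(xi, e, P) % P
        total = (total + term) % P
    return total

def lagrange_at_zero(ts, vals):
    """Value at t = 0 of the unique degree-(len(ts)-1) polynomial through (ts, vals)."""
    acc = 0
    for i, ti in enumerate(ts):
        num, den = 1, 1
        for j, tj in enumerate(ts):
            if j != i:
                num = num * (-tj) % P
                den = den * (ti - tj) % P
        acc = (acc + vals[i] * num * pow(den, P - 2, P)) % P   # den^{-1} via Fermat
    return acc

def local_decode(codeword_oracle, x, d, k):
    """Recover p(x) with d+1 queries on a random line through x."""
    y = [random.randrange(P) for _ in range(k)]                # random direction
    ts = list(range(1, d + 2))                                 # d+1 points on the line
    vals = [codeword_oracle(tuple((xi + t * yi) % P for xi, yi in zip(x, y)))
            for t in ts]
    return lagrange_at_zero(ts, vals)                          # the line's value at t = 0 is p(x)

# Toy usage: p(x1, x2) = 3*x1^2 + 5*x1*x2 + 7, degree d = 2, k = 2 variables.
p = {(2, 0): 3, (1, 1): 5, (0, 0): 7}
oracle = lambda pt: poly_eval(p, pt)
x = (4, 9)
print(local_decode(oracle, x, d=2, k=2), poly_eval(p, x))      # the two values agree
```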
Private Information Retrieval (PIR)
Query a public database without revealing which record is queried.
Example: a broker needs to query the NASDAQ database about a stock, but does not want anyone to know which stock he is interested in.
PIR
A one-round, k-server PIR scheme for a database of length n consists of: query functions (one per server), answer functions (one per server), and a reconstruction function.
PIR – Definition
These functions should satisfy correctness (the user reconstructs x_i from the servers' answers) and privacy (the distribution of queries seen by each server is independent of i).
Simple Construction of PIR
2 servers, one round. Each server holds the bits x_1,…, x_n.
To request bit i, choose a uniformly random subset A of [n].
Send the first server A; send the second server A △ {i} (add i to A if it is not there, remove it if it is).
Each server returns the XOR of the bits at the indices of its request S ⊆ [n].
XOR the two answers to obtain x_i.
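A minimal Python sketch of this 2-server scheme (function names are mine; both servers are simulated locally):

```python
import random

def server_answer(database, request):
    """Each server XORs the database bits at the requested indices."""
    ans = 0
    for j in request:
        ans ^= database[j]
    return ans

def pir_query(database, i):
    """Retrieve bit i; each server sees a uniformly random subset, independent of i."""
    n = len(database)
    A = {j for j in range(n) if random.random() < 0.5}   # uniformly random subset of [n]
    B = A ^ {i}                                          # A with i added or removed
    # The two answers differ exactly on index i, so their XOR is x_i.
    return server_answer(database, A) ^ server_answer(database, B)

# Toy usage: both (simulated) servers hold the same n-bit database.
db = [1, 0, 0, 1, 1, 0, 1, 0]
print([pir_query(db, i) for i in range(len(db))], db)    # the two lists match
```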
Lower Bounds on Communication Complexity
To achieve privacy in the case of a single server, n bits of communication are needed (not too far from the one-round 2-server scheme we suggested, which also communicates Θ(n) bits).
Reduction from PIR to LDC
A codeword is the concatenation of all possible answers from both servers.
A decoding procedure consists of 2 queries to this codeword, one simulating the question to each server.