Lecture 15 Private Information Retrieval Stefan Dziembowski www.dziembowski.net MIM UW 25.01.13ver 1.0.

Slides:



Advertisements
Similar presentations
Private Inference Control David Woodruff MIT Joint work with Jessica Staddon (PARC)
Advertisements

Efficient Private Approximation Protocols Piotr Indyk David Woodruff Work in progress.
Multi-Query Computationally-Private Information Retrieval with Constant Communication Rate Jens Groth, University College London Aggelos Kiayias, University.
Computational Privacy. Overview Goal: Allow n-private computation of arbitrary funcs. –Impossible in information-theoretic setting Computational setting:
Spreading Alerts Quietly and the Subgroup Escape Problem Aleksandr Yampolskiy (Yale) Joint work with James Aspnes, Zoë Diamadi, Kristian Gjøsteen, and.
Foundations of Cryptography Lecture 10 Lecturer: Moni Naor.
Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments Presenter: Qin Liu a,b Joint work with Chiu C. Tan b, Jie Wu b,
Semi-Honest to Malicious Oblivious-Transfer The Black-box Way Iftach Haitner Weizmann Institute of Science.
CS555Topic 241 Cryptography CS 555 Topic 24: Secure Function Evaluation.
Lecture 8 Public-Key Encryption I Stefan Dziembowski MIM UW ver 1.0.
Eran Omri, Bar-Ilan University Joint work with Amos Beimel and Ilan Orlov, BGU Ilan Orlov…!??!!
Machine Learning Week 2 Lecture 2.
1 Lecture 14 Language class LFSA –Study limits of what can be done with FSA’s –Closure Properties –Comparing to other language classes.
Private Information Retrieval Benny Chor, Oded Goldreich, Eyal Kushilevitz and Madhu Sudan Journal of ACM Vol.45 No Reporter : Chen, Chun-Hua Date.
The RSA Cryptosystem and Factoring Integers (II) Rong-Jaye Chen.
Co-operative Private Equality Test(CPET) Ronghua Li and Chuan-Kun Wu (received June 21, 2005; revised and accepted July 4, 2005) International Journal.
CMSC 414 Computer and Network Security Lecture 9 Jonathan Katz.
Tirgul 8 Universal Hashing Remarks on Programming Exercise 1 Solution to question 2 in theoretical homework 2.
1 Lecture 7 Halting Problem –Fundamental program behavior problem –A specific unsolvable problem –Diagonalization technique revisited Proof more complex.
Private Information Retrieval. What is Private Information retrieval (PIR) ? Reduction from Private Information Retrieval (PIR) to Smooth Codes Constructions.
CMSC 414 Computer and Network Security Lecture 19 Jonathan Katz.
Privacy-Preserving Computation and Verification of Aggregate Queries on Outsourced Databases Brian Thompson 1, Stuart Haber 2, William G. Horne 2, Tomas.
Locally Decodable Codes Uri Nadav. Contents What is Locally Decodable Code (LDC) ? Constructions Lower Bounds Reduction from Private Information Retrieval.
Private Information Retrieval Amos Beimel – Ben-Gurion University Tel-Hai, June 4, 2003 This talk is based on talks by:
On Everlasting Security in the Hybrid Bounded Storage Model Danny Harnik Moni Naor.
Public Key Encryption that Allows PIR Queries Dan Boneh 、 Eyal Kushilevitz 、 Rafail Ostrovsky and William E. Skeith Crypto 2007.
Information Theory and Security
Efficient Consistency Proofs for Generalized Queries on a Committed Database R. Ostrovsky C. Rackoff A. Smith UCLA Toronto.
Cramer-Shoup is Plaintext Aware in the Standard Model Alexander W. Dent Information Security Group Royal Holloway, University of London.
Foundations of Cryptography Lecture 2 Lecturer: Moni Naor.
Lecture 12 Commitment Schemes and Zero-Knowledge Protocols Stefan Dziembowski University of Rome La Sapienza critto09.googlepages.com.
The RSA Algorithm Rocky K. C. Chang, March
Cryptography Lecture 6 Stefan Dziembowski
How to play ANY mental game
Cryptography Lecture 8 Stefan Dziembowski
Private Information Retrieval Amir Houmansadr CS660: Advanced Information Assurance Spring 2015 Content may be borrowed from other resources. See the last.
Cryptography on Non-Trusted Machines Stefan Dziembowski.
Great Theoretical Ideas in Computer Science.
Shiyuan Wang, Divyakant Agrawal, Amr El Abbadi Department of Computer Science UC Santa Barbara DBSec 2010.
Lecture 11 Chosen-Ciphertext Security Stefan Dziembowski MIM UW ver 1.0.
1 Practical Techniques for Searches on Encrypted Data Dawn Song, David Wagner, Adrian Perrig.
On the Practical Feasibility of Secure Distributed Computing A Case Study Gregory Neven, Frank Piessens, Bart De Decker Dept. of Computer Science, K.U.Leuven.
Slide 1 Vitaly Shmatikov CS 380S Introduction to Secure Multi-Party Computation.
Secure two-party computation: a visual way by Paolo D’Arco and Roberto De Prisco.
Cryptography Lecture 9 Stefan Dziembowski
Cryptography Lecture 2 Stefan Dziembowski
Public Key Encryption with keyword Search Author: Dan Boneh Rafail Ostroversity Giovanni Di Crescenzo Giuseppe Persiano Presenter: 陳昱圻.
Cryptography Lecture 12 Stefan Dziembowski
Zero-knowledge proof protocols 1 CHAPTER 12: Zero-knowledge proof protocols One of the most important, and at the same time very counterintuitive, primitives.
Secure Computation (Lecture 2) Arpita Patra. Vishwaroop of MPC.
A Hybrid Technique for Private Location-Based Queries with Database Protection Gabriel Ghinita 1 Panos Kalnis 2 Murat Kantarcioglu 3 Elisa Bertino 1 1.
Andrew Lindell Aladdin Knowledge Systems and Bar-Ilan University 04/08/08 CRYP-106 Efficient Fully-Simulatable Oblivious Transfer.
多媒體網路安全實驗室 Anonymous Authentication Systems Based on Private Information Retrieval Date: Reporter: Chien-Wen Huang 出處: Networked Digital Technologies,
Private Information Retrieval Based on the talk by Yuval Ishai, Eyal Kushilevitz, Tal Malkin.
1 Information Security – Theory vs. Reality , Winter Lecture 9: Leakage resilience (continued) Lecturer: Eran Tromer.
Keyword search on encrypted data. Keyword search problem  Linux utility: grep  Information retrieval Basic operation Advanced operations – relevance.
Forward-Security in the Limited Communication Model Stefan Dziembowski Warsaw University and CNR Pisa.
Cryptographic methods. Outline  Preliminary Assumptions Public-key encryption  Oblivious Transfer (OT)  Random share based methods  Homomorphic Encryption.
Linear, Nonlinear, and Weakly-Private Secret Sharing Schemes
Multi-Party Computation r n parties: P 1,…,P n  P i has input s i  Parties want to compute f(s 1,…,s n ) together  P i doesn’t want any information.
Topic 36: Zero-Knowledge Proofs
Searchable Encryption in Cloud
Reporter:Chien-Wen Huang
Jens Groth, University College London
Course Business I am traveling April 25-May 3rd
B504/I538: Introduction to Cryptography
Cryptography Lecture 24.
Secret Sharing: Linear vs. Nonlinear Schemes (A Survey)
Limits of Practical Sublinear Secure Computation
Cryptography Lecture 16.
Presentation transcript:

Lecture 15 Private Information Retrieval Stefan Dziembowski MIM UW ver 1.0

Plan 1.Motivation and definition 2.Information-theoretic impossibility 3.A construction of Kushilevitz and Ostrovsky 4.Overview of some other related results

AOL search data scandal (2006) # : clothes for age single men best retirement city jarrett arnold jack t. arnold jaylene and jarrett arnold gwinnett county yellow pages rescue of older dogs movies for dogs sinus infection Thelma Arnold 62-year-old widow Lilburn, Georgia

Observation The owners of databases know a lot about the users! This poses a risk to users’ privacy. E.g. consider database with stock prices… Can we do something about it? We can: trust them that they will protect our secrecy, or use cryptography! problematic problematic!

How can crypto help? Note: this problem has nothing to do with secure communication! user U database D

Our settings user U database D A new primitive: Private Information Retrieval (PIR)

Plan 1.Definition of PIR 2.An ideal PIR doesn’t exist 3.Construction of a computational PIR 4.Open problems Literature: B. Chor, E. Kushilevitz, O. Goldreich and M. Sudan, Private Information Retrieval, Journal of ACM, 1998 E. Kushilevitz and R. Ostrovsky Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997

Question How to protect privacy of queries? user U database D wants to retrieve some data from D shouldn’t learn what U retrieved

Let’s make things simple! B1B1 B2B2 …BwBw index i = 1,…,w the user should learn B i BiBi ? each B i є {0,1} database B: (he may also learn other B i ’s)

Trivial solution B1B1 B2B2 …BwBw The database simply sends everything to the user!

Non-triviality The previous solution has a drawback: the communication complexity is huge! Therefore we introduce the following requirement: “Non-triviality”: the number of bits communicated between U and D has to be smaller than w.

input: Private Information Retrieval (PIR) B1B1 B2B2 …BwBw input: index i = 1,…,w at the end the user learns B i the database does not learn i the total communication is < w Note: secrecy of the database is not required correctness secrecy (of the user) non-triviality This property needs to be defined more formally polynomial time randomized interactive algorithms

How to define secrecy of the user [1/2]? iB Def. T(i,B) – transcript of the conversation. query Q(i)reply A(Q(i),B) For fixed i and B T(i,B) is a random variable (since the parties are randomized) For fixed i and B T(i,B) is a random variable (since the parties are randomized)

multi-round case: it is impossible to distinguish between T(i,B) and T(j,B) even if the adversary is malicious multi-round case: it is impossible to distinguish between T(i,B) and T(j,B) even if the adversary is malicious How to define secrecy of the user [2/2]? Secrecy of the user: for every i,j є {0,1} single-round case: it is impossible to distinguish between Q(i) and Q(j) single-round case: it is impossible to distinguish between Q(i) and Q(j) ? For simplicity say that for any i and j the distributions of T(i,B) and T(j,B) have to be identical For simplicity say that for any i and j the distributions of T(i,B) and T(j,B) have to be identical

Plan 1.Motivation and definition 2.Information-theoretic impossibility 3.A construction of Kushilevitz and Ostrovsky 4.Overview of some other related results

PIR doesn’t exists [1/4] We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously. Def: A transcript T is possible for (i,B) if P(T(i,B) = T) > 0 Take some T’, and look where it is possible: T’ indices i databases B

PIR doesn’t exists [2/4] Observation: secrecy → if T’ is possible for some B and i then it is possible for B and all the other i’s T’ indices i databases B T’

PIR doesn’t exists [3/4] non-triviality → length(transcript) < length(database) ↓ # transcripts < #databases ↓ there has to exist T’ that is possible for two databases B 0 and B 1 T’ databases B ← B 0 ← B1 ← B1 indices i

PIR doesn’t exists [4/4] B 0 and B 1 differ on at least one index i’ So, if i’ is the input of the user then correctness → contradiction T’ databases B ← B 0 ← B1 ← B1 i’ ↓ indices i

So PIR doesn’t exist! How to bypass the impossibility result? Two ideas: – limit the computing power of a cheating database – use a larger number of “independent” databases we show this

Plan 1.Motivation and definition 2.Information-theoretic impossibility 3.A construction of Kushilevitz and Ostrovsky 4.Overview of some other related results

Computationally-secure PIR secrecy: For every i,j є {0,1} it is impossible to distinguish efficiently between T(i,B) and T(j,B) ? computational-secrecy: Formally: for every polynomial-time probabilistic algorithm A the value: | P(A(T(i,B)) = 0) – P(A(T(j,B))=0) | should be negligible.

Hardness assumptions? Kushilevitz and R. Ostrovsky Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997 construct PIR based on the Quadratic Residuosity Assumption Kushilevitz and R. Ostrovsky Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997 construct PIR based on the Quadratic Residuosity Assumption

Quadratic Residuosity Assumption (QRA) QR(p) QR(q) ZN+:ZN+: Quadratic Residuosity Assumption (QRA): For a random a є Z N + it is computationally hard to determine if a є QR(N). Formally: for every polynomial-time probabilistic algorithm G the value: |P(G(a) = Q(a)) – 0.5| (where a is random) is negligible. QNR(q) QNR(p) ? a є Z N + ↓ QR(N) N=pq

Homomorphism of QR(pq) Q(N,a) := Homomorphism: for all a,b є Z N + Q(N,ab) = Q(N,a) xor Q(N,b) 1 if a є QR(N) 0 otherwise

We are ready to construct PIR! Our PIR will work in the group Z N +, where N=pq. What’s so good about this group?: testing membership in Z N + is easy, testing membership in QR(N) is hard for random elements on Z N +, unless one knows p and q. homomorphism of Q!

First (wrong) idea i QR X 1 QR X 2... QR X i-1 NQR X i QR X i+1... QR X w-1 QR X w B1B1 B2B2...B i-1 BiBi B i+1...B w-1 BwBw for every j = 1,...,w the database sets Y j = X j 2 if B j = 0 X j otherwise { QR Y 1 QR Y 2... QR Y i-1 YiYi QR Y i+1... QR Y w-1 QR Y w i↓i↓ Y i is a QR iff B i =0 Set M = Y 1 · Y 2 ·... · Y w M M is a QR iff B i =0 the user checks if M is a QR

Problems! PIR from the previous slide: correctness √ security? To learn i the database would need to distinguish NQR from QR. √ QR X 1 QR X 2... QR X i-1 NQR X i QR X i+1... QR X w-1 QR X w non-triviality? doesn’t hold! communication: user → database: |B| · |Z* n | database → user: |Z* n | Call it: (|B|, 1) - PIR Call it: (|B|, 1) - PIR

How to fix it? Idea: Given: construct Suppose that |B| = v 2 and present B as a v×v-matrix: B13B14B15B16 B9B10B11B12B5B6B7B8 B1B2B3B4 consider each row as a separate database consider each row as a separate database

An improved idea B13B14B15B16 B9B10B11B12 B5B6B7B8 B1B2B3B4 v v Let j be the column where B i is. In every “row” the user asks for the jth element So, instead of sending v queries the user can send one! Observe: in this way the user learns all the elements in the jth column! BiBi j↓j↓ execute v (v,1) - PIRs in parallel execute v (v,1) - PIRs in parallel Looks even worse: communication: user → database: v 2 · |Z* n | database → user: v · |Z* n | The method

Putting things together B1B1...B j-1 BjBj B j+1...BvBv BiBi B vv i QR X 1... QR X j-1 NQR X j QR X j+1... QR X v X1X1...X j-1 XjXj X j+1...XvXv X1X1 X j-1 XjXj X j+1...XvXv Y1Y1 Y j-1 YjYj Y j+1...YvYv Y vv M1M1... MvMv multiply elements in each row kth row M1M1 MkMk MvMv B j =0 iff M k is QR for every j = 1,...,v set Y j = for every j = 1,...,v set Y j = X j 2 if B j = 0 X j otherwise { jth column here the same row is copied v times: only this counts

So we are done! PIR from the previous slide: correctness √ non-triviality: communication complexity = 2√|B| · |Z n | √ security? The to learn i the database would need to distinguish NQR from QR. Formally: from any adversary that breaks our scheme we can construct an algorithm that breaks QRA

Improvements user U database D (X 1,…,X v ) (M 1,…,M v ) the user is interested just in one M i. the user is interested just in one M i. Idea: apply PIR recursively!

Plan 1.Motivation and definition 2.Information-theoretic impossibility 3.A construction of Kushilevitz and Ostrovsky 4.Overview of some other related results

Complexity of PIRs – overview of the results Communication: “recursive” PIR of [KO97]: for every c: O(|B| c ) [Cachin, Micali, Stadler, 1999]: poly-logarithmic in |B| [Lipmaa, 2005]: O(log 2 |B|) For practical analysis see: [Sion, Carbunar] On the Computational Practicality of Private Information Retrieval. their conclusion: It is the time-complexity that matters. In real-life: it is still more practical to transmit the entire database.

Extensions Symmetric PIR (also protect privacy of the database). [Gertner, Ishai, Kushilevitz, Malkin. 1998] Searching by key-words [Chor, Gilboa, Naor, 1997] Public-key encryption with key-word search [Boneh, Di Crescenzo, Ostrovsky, Persiano]

© 2013 by Stefan Dziembowski. Permission to make digital or hard copies of part or all of this material is currently granted without fee provided that copies are made only for personal or classroom use, are not distributed for profit or commercial advantage, and that new copies bear this notice and the full citation.