1 Using low-degree Homomorphism for Private Conjunction Queries
September 18, 2018. Dan Boneh, Craig Gentry, Shai Halevi, Frank Wang, David Wu

2 Private Conjunction Queries
The client has an SQL query of the type SELECT * FROM db WHERE a1=v1 AND … AND at=vt.
We want to hide the values $v_i$ from the server, and maybe also the attributes $a_i$ themselves.
Our protocols return the indexes of the matching records.
The client can use PIR or ORAM to fetch the records themselves.

3 The Basic Approach
Encode the database as a polynomial.
A set $S$ is encoded as a polynomial $P(X)$ such that $P(s)=0$ for all $s \in S$.
Use the Kissner-Song trick: if $P_1(X)$, $P_2(X)$ represent $S_1$, $S_2$, then a random linear combination $A(X) = R_1(X) P_1(X) + R_2(X) P_2(X)$ represents the intersection of $S_1$, $S_2$ with high probability.
If $\deg R_1 = \deg P_2$ and $\deg R_2 = \deg P_1$, then $A(X)$ does not leak any information beyond the intersection.
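The random-combination step can be sketched in plaintext as follows. This is a minimal illustration and not the authors' implementation: the prime modulus `P`, the helper names, and the use of Python's non-cryptographic `random` module are all assumptions made for the example.

```python
import random

P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def poly_mul(a, b):
    """Multiply two polynomials (coefficient lists, low degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_add(a, b):
    """Add two polynomials mod P, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % P for x, y in zip(a, b)]

def poly_eval(a, x):
    """Evaluate a polynomial at x mod P using Horner's rule."""
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

def set_poly(s):
    """Monic polynomial whose roots are exactly the elements of s."""
    out = [1]
    for e in s:
        out = poly_mul(out, [(-e) % P, 1])
    return out

def rand_poly(deg, rng):
    """Polynomial with deg + 1 uniformly random coefficients."""
    return [rng.randrange(P) for _ in range(deg + 1)]

def intersect_poly(s1, s2, rng):
    """A = R1*P1 + R2*P2 with deg R1 = deg P2 and deg R2 = deg P1."""
    p1, p2 = set_poly(s1), set_poly(s2)
    r1 = rand_poly(len(p2) - 1, rng)
    r2 = rand_poly(len(p1) - 1, rng)
    return poly_add(poly_mul(r1, p1), poly_mul(r2, p2))

rng = random.Random(0)  # a real protocol would need cryptographic randomness
S1, S2 = {1, 2, 3, 4}, {3, 4, 5}
A = intersect_poly(S1, S2, rng)
roots = sorted(x for x in S1 | S2 if poly_eval(A, x) == 0)
print(roots)
```

Elements of both sets are always roots of `A`; an element of only one set is a root only if a random polynomial happens to vanish there, which occurs with probability about 1/P.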

4 Two-Party Settings
The server has the database.
The client has the secret key for a SWHE (somewhat homomorphic encryption) scheme.
The server encodes the database as a bivariate polynomial $D(x,y)$, with $D(r,a)=v$ if record $r$ has attribute $a$ with value $v$.
The size of $D$ is roughly the size of the database.

5 Conjunction Queries
The query is "$attr_1 = val_1$ AND … AND $attr_t = val_t$".
The client interpolates $Q(y)$ such that $Q(attr_i) = val_i$ and sends the encrypted $Q$ to the server. For simplicity, it also sends $attr_1, \ldots, attr_t$ in the clear.
The server computes $A(x,y) = D(x,y) - Q(y)$; additive homomorphism suffices.
$A(r, attr_i) = 0$ iff $D(r, attr_i) = val_i$.
The server defines $A_i(X) = A(X, attr_i)$. The roots of $A_i(X)$ are the records that have $attr_i = val_i$.
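As a plaintext illustration of the client's interpolation step and the server's subtraction, the sketch below assumes attributes and values are encoded as small integers in a prime field. The table `db` stands in for evaluations of $D(r, a)$, and no encryption is performed; all names are hypothetical.

```python
P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def lagrange_interpolate(points):
    """Coefficients (low degree first) of the polynomial of degree
    < len(points) passing through the given (x, y) points mod P."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis = [1]  # running product of (y - xj) for j != i
        denom = 1    # running product of (xi - xj) for j != i
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            nxt = [0] * (len(basis) + 1)
            for k, c in enumerate(basis):
                nxt[k] = (nxt[k] - c * xj) % P
                nxt[k + 1] = (nxt[k + 1] + c) % P
            basis = nxt
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse, Python 3.8+
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % P
    return coeffs

def poly_eval(a, x):
    """Evaluate a polynomial at x mod P using Horner's rule."""
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

# Toy database: db[r][a] plays the role of D(r, a).
db = {
    0: {1: 10, 2: 20, 3: 30},
    1: {1: 10, 2: 21, 3: 30},
    2: {1: 11, 2: 20, 3: 30},
    3: {1: 10, 2: 20, 3: 31},
}

query = [(1, 10), (2, 20)]  # attr 1 = 10 AND attr 2 = 20
Q = lagrange_interpolate(query)

# A(r, attr_i) = D(r, attr_i) - Q(attr_i) is zero exactly when record r
# has attr_i = val_i; a record matches if this holds for every attr_i.
matches = [r for r, row in db.items()
           if all((row[a] - poly_eval(Q, a)) % P == 0 for a, _ in query)]
print(matches)
```

In the protocol the subtraction is done on ciphertexts and the server never sees `Q`; the example only checks the algebra.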

6 Conjunction Queries (cont.)
The server uses the Kissner-Song trick, setting $B(X) = \sum_{i=1}^{t} R_i(X) A_i(X)$ for random $R_i$'s.
With high probability, the roots of $B$ are the records in the intersection of the root sets of the $A_i$'s.
Additive homomorphism is still enough; more is needed if the $attr_i$'s are not sent in the clear.
The server sends the encrypted $B$ to the client.
The client decrypts, finds the roots $r_i$, and uses PIR/ORAM to get the actual records.
To also hide the attributes we need higher-degree homomorphism.

7 Three parties: Client-Proxy-Server
The proxy has an encrypted inverted index: for every attr=val pair in the DB, it keeps a pair $(t, Enc(P))$.
The tag is $t = Hash(\text{“attr=val”})$.
$P$ is a polynomial such that $P(r)=0$ if record #r contains this attr=val pair.
The client sends the tags $t_i$ for the $attr_i = value_i$ pairs in the query.
The proxy chooses randomizers $R_i$ and sets $Q = \sum_i P_i R_i$; $Q$ has its roots in the intersection.
The server obliviously decrypts for the client.
The client factors $Q$, finds the roots $r_i$, and uses PIR/ORAM to get the actual records.
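The data flow of the inverted index can be sketched in plaintext as below. Several things here are assumptions rather than details from the slides: SHA-256 as the hash, polynomials stored unencrypted, every product $R_i P_i$ padded to degree $2\max_i \deg(P_i)$, and the client finding roots by testing each record id instead of factoring.

```python
import hashlib
import random

P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % P for x, y in zip(a, b)]

def poly_eval(a, x):
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

def set_poly(s):
    """Monic polynomial whose roots are exactly the elements of s."""
    out = [1]
    for e in s:
        out = poly_mul(out, [(-e) % P, 1])
    return out

def tag(attr, val):
    """Tag for an attr=val pair; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(f"{attr}={val}".encode()).hexdigest()

def build_index(db):
    """Map each tag to a polynomial vanishing on the matching record ids."""
    postings = {}
    for r, row in db.items():
        for attr, val in row.items():
            postings.setdefault(tag(attr, val), set()).add(r)
    return {t: set_poly(ids) for t, ids in postings.items()}

def proxy_combine(index, tags, rng):
    """Q = sum_i R_i * P_i, each product padded to degree 2 * max deg(P_i)."""
    polys = [index[t] for t in tags]
    target = 2 * max(len(p) - 1 for p in polys)
    q = [0]
    for p in polys:
        r_deg = target - (len(p) - 1)
        r = [rng.randrange(P) for _ in range(r_deg + 1)]
        q = poly_add(q, poly_mul(r, p))
    return q

db = {
    1: {"city": "Paris", "lang": "fr"},
    2: {"city": "Paris", "lang": "en"},
    3: {"city": "Lyon", "lang": "fr"},
    4: {"city": "Paris", "lang": "fr"},
}
index = build_index(db)
rng = random.Random(1)  # a real protocol would need cryptographic randomness
Q = proxy_combine(index, [tag("city", "Paris"), tag("lang", "fr")], rng)
hits = [r for r in db if poly_eval(Q, r) == 0]
print(hits)
```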

8 Conserving Bandwidth
$Q = \sum_i P_i R_i$ is a wasteful representation: its degree is about $2 \max_i \deg(P_i)$.
The high degree is needed for $Q$ to not leak information on the $P_i$'s.
Reducing to $\max_i \deg(P_i) + \min_i \deg(P_i)$ is easy: say $P_1$ has the smallest degree, then set $Q' = \sum_{i=2}^{n} P_i s_i$, where the $s_i$'s are random scalars, and $Q = P_1 R + Q' R'$ with $\deg(R) = \deg(Q')$ and $\deg(R') = \deg(P_1)$.
Can we reduce it further? We show how to get $\min_i \deg(P_i)$.
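The easy reduction to max + min degree can be checked numerically with the plaintext sketch below. The modulus, helper names, and example sets are assumptions chosen for illustration.

```python
import random

P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % P for x, y in zip(a, b)]

def poly_eval(a, x):
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

def set_poly(s):
    out = [1]
    for e in s:
        out = poly_mul(out, [(-e) % P, 1])
    return out

def degree(a):
    """Degree of a polynomial, ignoring zero leading coefficients."""
    d = len(a) - 1
    while d > 0 and a[d] == 0:
        d -= 1
    return d

def rand_poly(deg, rng):
    return [rng.randrange(P) for _ in range(deg + 1)]

def compact_combination(polys, rng):
    """Q = P1*R + Q'*R', where P1 has the smallest degree and
    Q' is a random scalar combination of the remaining polynomials."""
    polys = sorted(polys, key=len)
    p1, rest = polys[0], polys[1:]
    q_prime = [0]
    for p in rest:
        s = rng.randrange(1, P)  # random nonzero scalar s_i
        q_prime = poly_add(q_prime, [s * c % P for c in p])
    r = rand_poly(degree(q_prime), rng)    # deg R  = deg Q'
    r_prime = rand_poly(degree(p1), rng)   # deg R' = deg P1
    return poly_add(poly_mul(p1, r), poly_mul(q_prime, r_prime))

sets = [{1, 2, 3}, {2, 3, 4, 5, 6}, {0, 2, 3, 7, 8, 9, 10}]
rng = random.Random(2)  # a real protocol would need cryptographic randomness
Q = compact_combination([set_poly(s) for s in sets], rng)
roots = [x for x in range(11) if poly_eval(Q, x) == 0]
print(degree(Q), roots)
```

Here the degrees are 3, 5, and 7, so `Q` has degree 3 + 7 = 10, compared with about 14 for the padded sum.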

9 Polynomial GCD
$P_1$, $P_2$ are (monic) polynomials for the sets $S_1$, $S_2$: $P_b(X) = \prod_{i \in S_b} (X - i)$.
The smallest polynomial defining $S_1 \cap S_2$ is $G(X) = \gcd(P_1, P_2) = \prod_{i \in S_1 \cap S_2} (X - i)$.
$G$ does not leak information on $P_1$, $P_2$ beyond the intersection.
Computing $Enc(G)$ from $\{Enc(P_b)\}_b$ takes high homomorphic capacity.
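For reference, the plaintext GCD is easy to compute with Euclid's algorithm over a prime field, as in the sketch below (modulus and helper names are assumptions). The difficulty noted on the slide is doing this on encrypted coefficients, which the example does not attempt.

```python
P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def trim(a):
    """Drop zero leading coefficients; the zero polynomial becomes []."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def set_poly(s):
    """Monic polynomial prod_{i in s} (X - i)."""
    out = [1]
    for e in s:
        out = poly_mul(out, [(-e) % P, 1])
    return out

def poly_mod(a, b):
    """Remainder of a divided by b (b trimmed and nonzero) over F_P."""
    a = list(a)
    inv = pow(b[-1], -1, P)  # inverse of the leading coefficient of b
    for k in range(len(a) - len(b), -1, -1):
        c = a[k + len(b) - 1] * inv % P
        for j, bj in enumerate(b):
            a[k + j] = (a[k + j] - c * bj) % P
    return trim(a[:len(b) - 1])

def poly_gcd(a, b):
    """Monic greatest common divisor via Euclid's algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    inv = pow(a[-1], -1, P)
    return [c * inv % P for c in a]

S1, S2 = {1, 2, 3, 4}, {3, 4, 5}
G = poly_gcd(set_poly(S1), set_poly(S2))
print(G == set_poly(S1 & S2))
```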

10 Reducing The Degree
Instead of $Q = \sum_i R_i P_i$, use $Q' = Q \bmod P_1$. It has degree $\le \deg(P_1) - 1$.
If $Q$ is a random multiple of $G$, so is $Q'$.
Computing $Enc(Q \bmod P_1)$ is easier.
Basic solution: store also $Enc(X^i \bmod P_1)$ for all $i$.
Given the encrypted coefficients of $Q$, $Enc(q_0), \ldots, Enc(q_d), \ldots, Enc(q_n)$ (where $d = \deg(P_1)$), compute $Enc\left(\sum_{i=0}^{n} q_i \, (X^i \bmod P_1)\right)$.
This only takes quadratic homomorphism.
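The basic solution amounts to a linear combination of precomputed rows, which the plaintext sketch below illustrates. In the encrypted setting each product `qi * row[k]` would multiply two ciphertexts, which is where the quadratic homomorphism comes from. The modulus, the helper names, and the use of a random `Q` are assumptions.

```python
import random

P = 2**61 - 1  # prime field modulus (an assumption for this sketch)

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_eval(a, x):
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

def set_poly(s):
    out = [1]
    for e in s:
        out = poly_mul(out, [(-e) % P, 1])
    return out

def x_powers_mod(p1, n):
    """Table of X^i mod p1 for i = 0..n, each row holding d = deg(p1)
    coefficients. Assumes p1 is monic with degree >= 1."""
    d = len(p1) - 1
    cur = [1] + [0] * (d - 1)  # X^0
    table = [cur]
    for _ in range(n):
        top = cur[-1]
        shifted = [0] + cur[:-1]  # multiply by X, dropping the X^d term
        # Fold the X^d term back in using X^d = -(low part of p1) mod p1.
        cur = [(s - top * c) % P for s, c in zip(shifted, p1[:d])]
        table.append(cur)
    return table

def reduce_with_table(q, table):
    """Q mod P1 computed as sum_i q_i * (X^i mod P1)."""
    d = len(table[0])
    out = [0] * d
    for qi, row in zip(q, table):
        for k in range(d):
            out[k] = (out[k] + qi * row[k]) % P
    return out

S1 = {2, 5, 11}
p1 = set_poly(S1)
rng = random.Random(3)
Q = [rng.randrange(P) for _ in range(10)]  # stand-in for Q, degree 9
table = x_powers_mod(p1, len(Q) - 1)
Q_red = reduce_with_table(Q, table)
# Q and Q mod P1 must agree at every root of P1.
agree = all(poly_eval(Q, s) == poly_eval(Q_red, s) for s in S1)
print(len(Q_red), agree)
```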

11 Reducing The Degree (cont.)
Storage/homomorphism tradeoff: we can store fewer encryptions of $X^i \bmod P_1$ by using higher homomorphic capacity.
E.g., store $Enc(X^{2^t} \bmod P_1)$ for $t = 0, 1, 2, \ldots$
When $\deg(Q) = d + m$, it takes $\log m$ steps to reduce $Q \bmod P_1$, using $Q(X) \equiv \sum_i Q_i(X) \, (X^{2^t} \bmod P_1) \pmod{P_1}$, where each $Q_i(X)$ has degree $< 2^t$ and $X^{2^t} \bmod P_1$ has degree $< d$.

12 Speedup Using Batching
Recall: an HE ciphertext encrypts an array of $L$ values. $L$ is at least a few hundred, maybe more.
We can use it to get a significant speedup:
Break the database into $L$ small databases. Each record is placed at random in one of the small databases.
Run the same query against all the small databases at once, with the i'th database in the i'th entry of all the ciphertexts.
So we get $L$ lists of indexes instead of one. The i'th list has the indexes of the records in the i'th database that match the query.
The lists are much shorter, so the polynomials have much smaller degree.
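The effect of the random split on list length (and hence polynomial degree) can be seen in a small simulation. The database size, slot count, and number of matches below are illustrative assumptions, not the parameters of the reported experiments.

```python
import random

def split_into_slots(record_ids, num_slots, rng):
    """Place each record in one of num_slots small databases at random."""
    slots = [[] for _ in range(num_slots)]
    for r in record_ids:
        slots[rng.randrange(num_slots)].append(r)
    return slots

def matches_per_slot(slots, matching_ids):
    """For each small database, list the records that match the query."""
    return [[r for r in slot if r in matching_ids] for slot in slots]

rng = random.Random(0)
num_records, num_slots = 100_000, 256  # illustrative sizes (assumptions)
matching = set(rng.sample(range(num_records), 2000))

slots = split_into_slots(range(num_records), num_slots, rng)
per_slot = matches_per_slot(slots, matching)
largest = max(len(m) for m in per_slot)
print(len(matching), largest)
```

A single polynomial for all 2000 matches has degree 2000, while each per-slot polynomial only needs degree equal to its own list length, which averages about 2000/256 ≈ 8 here.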

13 Implementing 3-party protocol
Two implementations:
Only the basic scheme, using an additive cryptosystem (Paillier).
The full scheme, using the [Bra'12] HE.
Only the second implementation scales to large databases; batching is key.
With and without the bandwidth-reduction GCD trick: without it we need lower homomorphism and smaller parameters.
All tests were run against a 1-million-record database, executing a 5-attribute conjunction ($a_1 = v_1, \ldots, a_5 = v_5$).
Balanced tests: each $a_i = v_i$ matches roughly the same number of records.
Unbalanced tests: $a_1 = v_1$ matches only ~5% as many records as $a_5 = v_5$.

14 Balanced Queries
[Figures: time (minutes) and bandwidth (MB) for balanced queries. Annotated data point: ~2000 matches per tag, 8 minutes, 1MB.]

15 Unbalanced Queries – Time (min)
[Figure: time (min) for three unbalanced test cases, labeled (2.5K,2.5K,5K,10K,50K), (10K,20K,25K,50K,200K), and (2.5K,2.5K,5K,5K,350K).]

16 Unbalanced Queries – Bandwidth (MB)
[Figure: bandwidth (MB) for three unbalanced test cases, labeled (2.5K,2.5K,5K,10K,50K), (10K,20K,25K,50K,200K), and (2.5K,2.5K,5K,5K,350K).]

