Fast Searchable Encryption with Tunable Locality. Ioannis Demertzis (University of Maryland, yannis@umd.edu) and Charalampos Papamanthou (University of Maryland, cpap@umd.edu)
Cloud Computing. Pros: near-infinite scalability for big data analytics; easy and ubiquitous access to data; cost reduction through shared infrastructure, which makes it affordable for small and medium businesses. Cons: serious security and privacy concerns about outsourcing and querying private company or personal data. Solution: a privacy-preserving DBMS.
Obstacles to Overcome (2009 -> 2015 -> 2017). Gartner says the worldwide cloud services market is forecast to reach $383 billion in 2020.
Ideal Solution: Privacy-Preserving DBMS. The client sends Encrypt(DB) to the untrusted cloud, which stores the encrypted database. Later, the client sends Encrypted(query) and the untrusted cloud returns Encrypted(results).
Solutions for Encrypted Search (figure: efficiency vs. security, each ranging from low to high). Efficient: OPE and DET (systems shown: CryptDB, CipherBase, MONOMI, Google BigQuery, Microsoft SQL 2016 Always Encrypted, …). Secure & efficient: SSE. Secure: Oblivious RAM, Functional Encryption, FHE. Reference: Demertzis, Papadopoulos, Papapetrou, Deligiannakis, Garofalakis, “Practical Private Range Search Revisited”, SIGMOD 2016. Not all points are explained in depth (feel free to ask me during the poster session!)
Our Contribution. In this work: a new scalable Searchable Encryption (SE) scheme with good locality; 12x more efficient than the state-of-the-art in-memory SE; up to 2-3 orders of magnitude fewer false positives than the external-memory SE; space, read efficiency, locality, parallelism and bandwidth can be tuned to achieve optimal performance; formal proof based on widely adopted CRYPTO security definitions.
What is Searchable Encryption? The client sends a search query (a keyword) to the untrusted cloud. Leakage is the amount of information that the untrusted cloud learns.
Searchable Encryption (SE) schemes. The client holds an index over the files F1, F2, F3, F4, F5, F6 that are outsourced to the untrusted cloud: k1 -> F1, F4, F2; k2 -> F3, F6, F4, F2; k3 -> F5, F1.
Searchable Encryption (SE) schemes. L1 leakage: the total leakage prior to query execution, e.g. the size of each encrypted file and the size of the encrypted index.
Searchable Encryption (SE) schemes. L2 leakage (leakage during query execution). Search pattern: whether a search query is repeated. Access pattern: the encrypted document ids and files that satisfy the search query. The index is keyed by PRF_sk(keyword); to search for k1, the client sends the token PRF_sk(k1) to the untrusted cloud.
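The token mechanism and the search-pattern leakage can be illustrated with a minimal sketch. It assumes HMAC-SHA256 as a stand-in for the PRF and leaves the file ids unencrypted for brevity; the names prf, build_index and search are hypothetical and not taken from the slides.

```python
import hashlib
import hmac


def prf(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 used as a stand-in pseudorandom function (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def build_index(key: bytes, plaintext_index: dict) -> dict:
    """Replace every keyword by PRF(key, keyword); file ids stay in the clear here."""
    return {prf(key, kw.encode()): list(ids) for kw, ids in plaintext_index.items()}


def search(encrypted_index: dict, token: bytes) -> list:
    """Server side: look up the token without learning the keyword."""
    return encrypted_index.get(token, [])


secret_key = b"demo-key-not-for-production"
encrypted_index = build_index(
    secret_key,
    {"k1": ["F1", "F4", "F2"], "k2": ["F3", "F6", "F4", "F2"], "k3": ["F5", "F1"]},
)

# Two searches for the same keyword produce the same token, so the server
# can tell that a query was repeated (search pattern).
first_token = prf(secret_key, b"k1")
second_token = prf(secret_key, b"k1")
results = search(encrypted_index, first_token)
```

The server also sees which entries are returned for each token, which corresponds to the access pattern.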
Searchable Encryption (SE) schemes. L2 leakage (leakage during query execution). Search pattern: whether a search query is repeated. Result size: the number of items returned. Example with database tuples: T1 (John Smith, CMU, 27, $3,000), T2 (Alice Lu, UCLA, 28, $4,000), …, TN (Bruce William, UMD, 30, $2,000).
Searchable Encryption – Locality and Read Efficiency. Locality: number of non-contiguous reads per query. Read efficiency: number of memory locations read per result item. PiBas, for k1 -> F1, F4, F2: locality = 3 and read efficiency = 1. Reading one contiguous array that also contains false positives (marked X): locality = 1 and read efficiency = O(N).
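The two metrics can be computed from the set of memory addresses a query touches. The sketch below is only an illustration: the function names and the example addresses are assumptions, and locality is counted as the number of maximal runs of consecutive addresses.

```python
def locality(addresses):
    """Number of maximal runs of consecutive addresses (non-contiguous reads)."""
    ordered = sorted(set(addresses))
    if not ordered:
        return 0
    runs = 1
    for previous, current in zip(ordered, ordered[1:]):
        if current != previous + 1:
            runs += 1
    return runs


def read_efficiency(addresses, result_size):
    """Memory locations read per returned result item."""
    return len(set(addresses)) / result_size


# PiBas-style layout: the 3 results sit at 3 scattered addresses.
scattered = [3, 9, 14]
# Single-array layout: one contiguous scan over 12 cells to find 3 results.
full_scan = list(range(12))
```

With these hypothetical addresses the scattered layout reads only what it needs but jumps three times, while the full scan makes one sequential read and touches four cells per result.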
Searchable Encryption – Lower Bound (Cash and Tessaro, EUROCRYPT 2014). O(1) locality and O(1) read efficiency require ω(N) space. (Figure: arrays for result sizes <=3 and <=4 holding F1 F4 F2 F5 F3 F6, with locality = 1 and read efficiency = 1.) With k distinct result sizes, O(1) locality and O(1) read efficiency require O(kN) space.
Security Game. Real scheme: the adversary receives the encrypted output Enc(·) + Enc(·) and, for each query w1, …, wN, the corresponding token1, …, tokenN. Simulator: produces the encrypted output given only the leakage L1, and each token given only L2(w1), …, L2(wN).
Searchable Encryption - Related Work (scheme: locality, read efficiency, space)
1st generation of SE schemes, PiBas: Θ(|result|), O(1), O(N)
Asharov et al. STOC 2016, Scheme NlogN: O(1), O(1), O(NlogN)
Asharov et al. STOC 2016, OneChoiceAlloc: O(1), Θ(logN loglogN), O(N)
Our scheme with optimal locality: O(1), O(N^(1/(s+1))), O(sN)
Our scheme with O(L) locality: O(L), O(N^(1/s)/L), O(sN)
Our scheme with O(R) read efficiency: O(N^(1/s)/R), O(R), O(sN)
Cash and Tessaro EUROCRYPT 2014, lower bound: O(1), O(1), ω(N)
Asharov et al. STOC 2016 – OneChoiceAlloc Scheme. The lists of k1, k2, k3, … are placed into M = N / (logN loglogN) bins, each of size 3 logN loglogN; a query for k1 reads the bins that hold its list. O(N) space, O(1) locality and Θ(logN loglogN) read efficiency.
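A rough plaintext sketch of a one-choice allocation is given below. It assumes that each keyword hashes to a starting bin and that its list is spread over consecutive bins, one item per bin; padding bins to a fixed capacity and encryption are omitted, and the names start_bin, allocate and search are hypothetical. It is meant to show why one contiguous read returns extra items, not to reproduce the published scheme.

```python
import hashlib


def start_bin(keyword: str, bins: int) -> int:
    """Pick the first bin for a keyword from a hash of the keyword."""
    digest = hashlib.sha256(keyword.encode()).digest()
    return int.from_bytes(digest[:8], "big") % bins


def allocate(index: dict, bins: int) -> list:
    """Place each list in consecutive bins, one item per bin (wrapping around)."""
    table = [[] for _ in range(bins)]
    for keyword, file_ids in index.items():
        first = start_bin(keyword, bins)
        for offset, file_id in enumerate(file_ids):
            table[(first + offset) % bins].append((keyword, file_id))
    return table


def search(table: list, keyword: str, result_size: int):
    """Read result_size consecutive whole bins, then filter out other keywords."""
    bins = len(table)
    first = start_bin(keyword, bins)
    scanned = []
    for offset in range(result_size):
        scanned.extend(table[(first + offset) % bins])
    matches = [file_id for kw, file_id in scanned if kw == keyword]
    return matches, len(scanned)


index = {"k1": ["F1", "F4", "F2"], "k2": ["F3", "F6", "F4", "F2"], "k3": ["F5", "F1"]}
table = allocate(index, bins=4)
k1_matches, k1_scanned = search(table, "k1", result_size=3)
k2_matches, k2_scanned = search(table, "k2", result_size=4)
```

The number of scanned items divided by the number of matches plays the role of the read efficiency; the bins are adjacent, so the read is sequential.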
Optimal Locality Scheme and Read Efficiency. With k distinct result sizes, O(1) locality and O(1) read efficiency require O(kN) space. Dataset: N = 16, with logN + 1 encrypted arrays, one per result size |k| = 16, 8, 4, 2, 1. Level i has N/2^i buckets of size 2^i, so every level occupies N cells. O(NlogN) space, O(1) locality and O(1) read efficiency.
Optimal Locality Scheme and Read Efficiency. Input dataset with N = 16 and keywords k1, k2, k3, k4, k5: each keyword list is placed in one of the logN + 1 encrypted arrays (|k| = 16, 8, 4, 2, 1). O(NlogN) space, O(1) locality and O(1) read efficiency.
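The level structure can be sketched as follows, assuming N is a power of two and that a list is placed in the smallest level whose bucket size 2^i can hold it. The helper names level_layout and level_for are hypothetical.

```python
def level_layout(n: int) -> dict:
    """Map level i -> (number of buckets, bucket size) for i = 0 .. log2(n)."""
    top_level = n.bit_length() - 1  # log2(n) when n is a power of two
    return {i: (n // 2**i, 2**i) for i in range(top_level + 1)}


def level_for(result_size: int) -> int:
    """Smallest level i with 2^i >= result_size, i.e. ceil(log2(result_size))."""
    return (result_size - 1).bit_length()


layout = level_layout(16)
total_cells = sum(count * size for count, size in layout.values())
```

For N = 16 every level holds 16 cells, so the five levels use 80 cells in total, matching the O(NlogN) space on the slide.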
Our Approach – Optimal Locality Scheme. Of the levels |k| = 16, 8, 4, 2, 1, keep only s = 3 encrypted arrays. (Figure legend: not stored; stored but empty.) O(sN) space, O(1) locality and O(N^(1/s)) read efficiency.
Our Approach – Optimal Locality Scheme. Keep only s = 1 encrypted array: the read efficiency for result sizes |k| = 16, 8, 4, 2 is 1, 2, 4, 8. (Figure legend: not stored; stored but empty.) Each stored level requires 2N + 2^i space to avoid potential overflows. O(sN) space, O(1) locality and O(N^(1/s)) read efficiency.
Our Approach – Optimal Locality Scheme. Keep only s = 3 encrypted arrays, with the s levels evenly distributed. The maximum gap between stored levels is O(logN/s), so the worst-case read efficiency is O(2^(logN/s)) = O(N^(1/s)). O(sN) space, O(1) locality and O(N^(1/s)) read efficiency.
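The effect of keeping only s levels can be sketched as below. The rule used to pick the stored levels (evenly spaced and always including the top level) is an assumption made for illustration, as are the function names; a list is assumed to be served from the nearest stored level at or above its own.

```python
def stored_levels(n: int, s: int) -> list:
    """Pick s evenly spaced levels out of 0 .. log2(n), always keeping the top one."""
    top_level = n.bit_length() - 1
    # -(-a // b) is integer ceiling division.
    return sorted({-(-top_level * j // s) for j in range(1, s + 1)})


def read_efficiency(result_size: int, levels: list) -> float:
    """Cells read per result when the list sits in the nearest stored level above it."""
    natural_level = (result_size - 1).bit_length()
    target_level = min(level for level in levels if level >= natural_level)
    return 2**target_level / result_size


one_level = stored_levels(16, 1)
three_levels = stored_levels(16, 3)
efficiency_one_level = [read_efficiency(r, one_level) for r in (16, 8, 4, 2)]
worst_three_levels = max(read_efficiency(r, three_levels) for r in (16, 8, 4, 2, 1))
```

With N = 16 and s = 1 this reproduces the 1, 2, 4, 8 progression, and with s = 3 the worst case drops to 4, in line with the O(N^(1/s)) bound up to constants.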
Our Approach – Optimal Read Efficiency. Keep only s = 3 encrypted arrays. O(N) space, O(N^(1/s)) locality and O(1) read efficiency.
Our Approach – Constant Locality O(L). Keep only s = 1 encrypted array and tune L = 4. Choose L = number of parallel processing units (servers). O(N) space, O(L) locality and O(N^(1/s)/L) read efficiency.
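A back-of-the-envelope calculator for the trade-off is sketched below. It simply evaluates the asymptotic expressions from the slides with all hidden constants set to 1, which is an assumption; the function name tradeoff is hypothetical.

```python
def tradeoff(n: int, s: int, locality: int) -> dict:
    """Evaluate space sN, locality L and read efficiency N^(1/s)/L, constants ignored."""
    return {
        "space": s * n,
        "locality": locality,
        "read_efficiency": n ** (1.0 / s) / locality,
    }


sequential = tradeoff(16, 1, 1)
four_units = tradeoff(16, 1, 4)
```

For N = 16 and s = 1, moving from L = 1 to L = 4 cuts the estimated read efficiency from 16 to 4 at the same space, which is the reason for matching L to the number of parallel units.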
Our Approach – The full protocol. An encrypted dictionary maps a keyword to its position in the encrypted arrays, e.g. k3 -> level = 2, offset = 0. Option 1, the server (untrusted cloud) filters out the false positives: minimizes the bandwidth, #PRFs = |result| * N^(1/s). Option 2, the client filters out the false positives: only 2 PRFs, more bandwidth.
Experiments. Real dataset with 6,123,276 records, used for in-memory evaluation; query attribute: location description (173 distinct keywords). Synthetic dataset, used for external-memory evaluation: N = 2^47 - 1 records (~1 petabyte), |k| = 1, 2, 4, …, 2^46. Java implementation of: our scheme; PiBas, state of the art for in-memory settings; OneChoiceAlloc, state of the art for external memory. 64-bit machine with Intel Xeon E5-2676v3 and 64GB RAM.
Experiments – Index Costs (In-memory)
Experiments – Search Costs (In-memory) End-to-End Search Time
Experiments – False Positives False Positives for Different Sizes
Experiments – Search Time (External Memory)
Experiments – Search Time (Real Dataset)
Conclusion – Future Work. In this work: formal proof based on widely adopted CRYPTO security definitions; 12x more efficient than the state-of-the-art in-memory SE; up to 2-3 orders of magnitude fewer false positives than the external-memory SE. Our scheme provides various trade-offs between space, read efficiency (false positives), locality, parallelism, bandwidth and number of crypto operations.
Thank you!!! Questions??? 12x in-memory, 580x external memory, tunable for arbitrary architectures. (Figure: efficiency vs. security, each ranging from low to high. Efficient: OPE, DET (PPE). Secure & efficient: SSE. Secure: FHE, ORAM, Func/Pred Enc.)