Fast Searchable Encryption with Tunable Locality

Fast Searchable Encryption with Tunable Locality. Ioannis Demertzis (University of Maryland, yannis@umd.edu) and Charalampos Papamanthou (University of Maryland, cpap@umd.edu).

Cloud Computing. Pros: near-infinite scalability for big data analytics; easy and ubiquitous access on solid data; cost reduction through the use of shared infrastructure; affordable for small and medium businesses. Cons: serious security and privacy concerns regarding outsourcing and querying private company or personal data. Solution: a privacy-preserving DBMS.

Obstacles to Overcome (2009 -> 2015 -> 2017). Gartner forecasts that the worldwide cloud services market will reach $383 billion in 2020.

Ideal Solution: Privacy-Preserving DBMS. The client sends Encrypt(DB) to the untrusted cloud, which stores the encrypted database. Later, the client sends Encrypted(query) to the untrusted cloud and receives Encrypted(results).

Solutions for Encrypted Search (Demertzis, Papadopoulos, Papapetrou, Deligiannakis, Garofalakis, "Practical Private Range Search Revisited", SIGMOD 2016). Diagram axes: efficiency and security, each from low to high. Efficient: OPE, DET. Secure and efficient: SSE. Secure: Oblivious RAM, Functional Enc, FHE. Systems shown: CryptDB, CipherBase, MONOMI, Google BigQuery, Microsoft SQL 2016 Always Encrypted, … Not all points are explained in depth (feel free to ask me during the poster session!).

Our Contribution. In this work: a new scalable Searchable Encryption (SE) scheme with good locality; 12x more efficient than the state-of-the-art in-memory SE; up to 2-3 orders of magnitude fewer false positives than the external-memory SE; space, read efficiency, locality, parallelism and bandwidth can be tuned to achieve optimal performance; formal proof based on widely adopted CRYPTO security definitions.

What is Searchable Encryption? The client sends a search query (a keyword) to the untrusted cloud. Leakage is the amount of information that the untrusted cloud learns.

Searchable Encryption (SE) schemes. Client and untrusted cloud. Index: k1 -> F1, F4, F2; k2 -> F3, F6, F4, F2; k3 -> F5, F1. Files: F1, F2, F3, F4, F5, F6.

Searchable Encryption (SE) schemes. L1 leakage: total leakage prior to query execution, e.g. the size of each encrypted file and the size of the encrypted index. Index: k1 -> F1, F4, F2; k2 -> F3, F6, F4, F2; k3 -> F5, F1. Files: F1, F2, F3, F4, F5, F6.

Searchable Encryption (SE) schemes. L2 leakage (leakage during query execution). Search pattern: whether a search query is repeated. Access pattern: the encrypted document ids and files that satisfy the search query. The client computes a token for k1 with PRF_sk() and sends it to the untrusted cloud. Index: k1 -> F1, F4, F2; k2 -> F3, F6, F4, F2; k3 -> F5, F1. Files: F1, F2, F3, F4, F5, F6.

Searchable Encryption (SE) schemes. L2 leakage (leakage during query execution). Search pattern: whether a search query is repeated. Also shown: result size. The client sends the token for k1 to the untrusted cloud. Index: k1 -> F1, F4, F2; k2 -> F3, F6, F4, F2; k3 -> F5, F1. Example records: T1 = (John Smith, CMU, 27, $3,000); T2 = (Alice Lu, UCLA, 28, $4,000); TN = (Bruce William, UMD, 30, $2,000).
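
The token flow on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for PRF_sk(), the demo key and the helper names (prf, build_server_index, search_token) are hypothetical, and the file ids are left unencrypted for brevity. It shows why a repeated query is visible to the server (search pattern) and why the matching entries and their count are visible (access pattern, result size).

```python
import hashlib
import hmac

def prf(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 as a stand-in PRF (an assumption, the slides only say PRF_sk())."""
    return hmac.new(key, message, hashlib.sha256).digest()

# Plaintext index from the slides: keyword -> matching files.
PLAIN_INDEX = {
    "k1": ["F1", "F4", "F2"],
    "k2": ["F3", "F6", "F4", "F2"],
    "k3": ["F5", "F1"],
}

def build_server_index(secret_key: bytes, plain_index: dict) -> dict:
    """The server stores each list under a PRF label instead of the keyword.
    A real scheme would also encrypt the file ids; this sketch does not."""
    return {prf(secret_key, kw.encode()): list(files)
            for kw, files in plain_index.items()}

def search_token(secret_key: bytes, keyword: str) -> bytes:
    """The token is deterministic, so equal keywords give equal tokens."""
    return prf(secret_key, keyword.encode())

sk = b"\x00" * 32  # demo key only, never use a constant key in practice
server_index = build_server_index(sk, PLAIN_INDEX)

t1 = search_token(sk, "k1")
t2 = search_token(sk, "k1")  # the same query again
t3 = search_token(sk, "k2")

print(t1 == t2)               # the server can tell the query was repeated
print(len(server_index[t1]))  # the server sees how many entries matched
```

The deterministic token is what makes lookups cheap, and it is also exactly what produces the search-pattern leakage named on the slide.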

Searchable Encryption – Locality and Read Efficiency. Locality: the number of non-contiguous reads for each query. Read efficiency: the number of memory locations read per result item. Example for k1 -> F1, F4, F2: PiBas has locality = 3 and read efficiency = 1. Reading one contiguous region that also contains false positives (marked X, here F4 F5 F3 F1 F2 F6 plus padding) gives locality = 1 and read efficiency = O(N).
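
The two metrics can be made concrete with a small Python sketch. The function names and the way a query is modelled (a list of memory addresses in read order) are assumptions for illustration, not part of the slides.

```python
def locality(addresses):
    """Number of non-contiguous reads: count maximal runs of consecutive addresses."""
    runs = 0
    previous = None
    for addr in addresses:
        if previous is None or addr != previous + 1:
            runs += 1
        previous = addr
    return runs

def read_efficiency(addresses, result_size):
    """Memory locations read per result item."""
    return len(addresses) / result_size

# PiBas-style lookup: three results stored at three scattered addresses.
scattered = [0, 5, 9]
print(locality(scattered), read_efficiency(scattered, 3))    # 3 1.0

# One contiguous scan of nine cells that contains the same three results.
contiguous = list(range(9))
print(locality(contiguous), read_efficiency(contiguous, 3))  # 1 3.0
```

The second case trades extra cells read (false positives) for a single sequential read, which is the trade-off the rest of the talk tunes.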

Searchable Encryption – Lower Bound (Cash and Tessaro, Eurocrypt 2014). O(1) locality and O(1) read efficiency require ω(N) space. Diagram: arrays for result sizes <= 3 and <= 4 over F1, F4, F2, F5, F3, F6, with locality = 1 and read efficiency = 1. With k distinct result sizes, O(1) locality and O(1) read efficiency require O(kN) space.

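Security Game. Real game: the adversary receives the output of the real scheme, i.e. the encrypted index and encrypted files, and for each query w1, …, wN the real token token1, …, tokenN. Simulated game: a simulator is given only L1 to produce the encrypted structures, and only L2(wi) for each query wi to produce token i. The scheme is secure with respect to the leakage (L1, L2) if the adversary cannot distinguish the two games.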

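Searchable Encryption - Related Work (per scheme: locality, read efficiency, space)
1st generation of SE schemes (PiBas): locality Θ(|result|), read efficiency O(1), space O(N)
Asharov et al., STOC 2016, Scheme NlogN: locality O(1), read efficiency O(1), space O(N log N)
Asharov et al., STOC 2016, OneChoiceAlloc: locality O(1), read efficiency Θ(log N log log N), space O(N)
Our scheme with optimal locality: locality O(1), read efficiency O(N^{1/(s+1)}), space O(sN)
Our scheme with O(L) locality: locality O(L), read efficiency O(N^{1/s}/L)
Our scheme with O(R) read efficiency: locality O(N^{1/s}/R), read efficiency O(R)
Cash et al., EUROCRYPT 2014, lower bound: locality O(1), read efficiency O(1), space ω(N)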

Asharov et al. STOC 2016 – OneChoiceAlloc Scheme. The lists of k1, k2, k3, … are placed into M = N / (log N log log N) bins, each of size 3 log N log log N. O(N) space, O(1) locality and Θ(log N log log N) read efficiency.
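
As a quick arithmetic check of these parameters, the sketch below computes the bin count and bin size for one dataset size. It assumes base-2 logarithms and reads the slide as "M bins of size 3 log N log log N"; the function name one_choice_parameters and the choice N = 2^16 are hypothetical.

```python
def one_choice_parameters(n):
    """Bin count M and bin size as read from the slide (base-2 logs assumed).
    Exact only when n and log2(n) are both powers of two."""
    log_n = n.bit_length() - 1           # log2(n)
    log_log_n = log_n.bit_length() - 1   # log2(log2(n))
    bin_size = 3 * log_n * log_log_n
    bins = n // (log_n * log_log_n)
    return bins, bin_size

n = 2 ** 16
bins, bin_size = one_choice_parameters(n)
total_cells = bins * bin_size

print(bins, bin_size)      # 1024 192
print(total_cells == 3 * n)  # True: total space is linear in N
```

The total number of cells comes out at 3N, matching the O(N) space claim, while the bin size matches the Θ(log N log log N) read efficiency.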

Optimal Locality Scheme and Read Efficiency. With k distinct result sizes, O(1) locality and O(1) read efficiency require O(kN) space. Dataset: N = 16, stored in log N + 1 encrypted arrays for |k| = 16, 8, 4, 2, 1. Level i has N/2^i buckets of size 2^i. O(N log N) space, O(1) locality and O(1) read efficiency.
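
The level structure for N = 16 can be enumerated directly. In this sketch the function names are hypothetical, and the rule that a result goes to the smallest level whose bucket can hold it is an assumption about how lists are assigned to levels.

```python
def level_layout(n):
    """Return (level, bucket_count, bucket_size) for levels 0..log2(n).
    n is assumed to be a power of two."""
    top = n.bit_length() - 1
    return [(i, n >> i, 1 << i) for i in range(top + 1)]

def natural_level(result_size):
    """Smallest level i whose bucket size 2^i can hold the whole result (assumed rule)."""
    return (result_size - 1).bit_length()

layout = level_layout(16)
total_cells = sum(count * size for _, count, size in layout)

for level, count, size in layout:
    print(f"level {level}: {count} buckets of size {size}")
print(total_cells)  # 80 = N * (log N + 1)
print(natural_level(3), natural_level(16))  # 2 4
```

Every level holds N cells, so log N + 1 levels cost N (log N + 1) cells in total, which is where the O(N log N) space comes from.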

Optimal Locality Scheme and Read Efficiency. Input dataset with N = 16 and keywords k1, k2, k3, k4, k5, stored in log N + 1 encrypted arrays for |k| = 16, 8, 4, 2, 1. O(N log N) space, O(1) locality and O(1) read efficiency.

Our Approach – Optimal Locality Scheme. Levels |k| = 16, 8, 4, 2, 1; keep only s = 3 encrypted arrays. Legend: not stored; stored but empty. O(sN) space, O(1) locality and O(N^{1/s}) read efficiency.

Our Approach – Optimal Locality Scheme. Levels |k| = 16, 8, 4, 2, 1; keep only s = 1 encrypted array. Read efficiency shown per level: 1, 2, 4, 8. Legend: not stored; stored but empty. Each stored level requires 2N + 2^i space to avoid potential overflows. O(sN) space, O(1) locality and O(N^{1/s}) read efficiency.

Our Approach – Optimal Locality Scheme. Levels |k| = 16, 8, 4, 2, 1; keep only s = 3 encrypted arrays. With s evenly distributed levels, the maximum gap between stored levels is O(log N / s), so the worst-case read efficiency is O(2^{(log N)/s}) = O(N^{1/s}). O(sN) space, O(1) locality and O(N^{1/s}) read efficiency.
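
The effect of keeping only s levels can be explored with a short sketch. The particular way of spacing the stored levels (always including the top level) and the function names are assumptions for illustration; the slides only say that the s levels are evenly distributed.

```python
def stored_levels(n, s):
    """Pick s roughly evenly spaced levels out of 0..log2(n), always keeping the top one.
    The spacing rule is an assumption; n is assumed to be a power of two."""
    total = n.bit_length()  # number of levels, log2(n) + 1
    return sorted({(j * total + s - 1) // s - 1 for j in range(1, s + 1)})

def bucket_read_efficiency(levels, result_size):
    """Cells read per result when the result is padded up to the next stored level."""
    natural = (result_size - 1).bit_length()
    chosen = min(level for level in levels if level >= natural)
    return (1 << chosen) / result_size

n = 16
one_level = stored_levels(n, 1)     # [4]
three_levels = stored_levels(n, 3)  # [1, 3, 4]

# With s = 1, result sizes 16, 8, 4, 2 give read efficiency 1, 2, 4, 8.
print([bucket_read_efficiency(one_level, r) for r in (16, 8, 4, 2)])

worst_s1 = max(bucket_read_efficiency(one_level, r) for r in range(1, n + 1))
worst_s3 = max(bucket_read_efficiency(three_levels, r) for r in range(1, n + 1))
print(worst_s1, worst_s3)
```

With a single stored level the worst case is N cells for one result, while three stored levels bring the worst case down to under 3 cells per result for N = 16, in line with the O(N^{1/s}) trend.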

Our Approach – Optimal Read Efficiency. Levels |k| = 16, 8, 4, 2, 1; keep only s = 3 encrypted arrays. O(N) space, O(N^{1/s}) locality and O(1) read efficiency.

Our Approach – Constant Locality O(L). Levels |k| = 16, 8, 4, 2, 1; keep only s = 1 encrypted array and tune L = 4. Choose L = number of parallel processing units (servers). O(N) space, O(L) locality and O(N^{1/s}/L) read efficiency.
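
One possible reading of this slide, for s = 1, is that the single stored level uses buckets of size N/L and that a larger result is split across at most L such buckets. The sketch below follows that reading only; the function name and the splitting rule are assumptions, and the actual construction may differ.

```python
def chunked_read(n, max_locality, result_size):
    """Assumed model: one stored level with buckets of size n / max_locality.
    Returns (number of buckets read, cells read per result item)."""
    bucket = n // max_locality
    chunks = -(-result_size // bucket)  # ceiling division
    cells = chunks * bucket
    return chunks, cells / result_size

n, L = 16, 4
for r in (16, 8, 4, 2, 1):
    chunks, efficiency = chunked_read(n, L, r)
    print(f"|result| = {r}: locality {chunks}, read efficiency {efficiency}")
```

Under this model the locality never exceeds L = 4 and the worst read efficiency is N/L = 4, consistent with the O(L) locality and O(N^{1/s}/L) read efficiency stated for s = 1. The separate buckets could also be fetched by L servers in parallel.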

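Our Approach – The full protocol. The client sends a keyword token (e.g. for k3) to the untrusted cloud; the encrypted dictionary maps it to a position such as level = 2, offset = 0 in the encrypted arrays. False positives can be filtered out either by the client or by the server. Trade-off shown: minimizing the bandwidth costs #PRFs = |result| · N^{1/s}, while using only 2 PRFs costs more bandwidth.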
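
The dictionary-then-array lookup can be sketched without any cryptography. Everything here is a simplified stand-in: the dictionary and arrays are plaintext, dummy cells are modelled as None, and the offset is assumed to count buckets within a level. The example entry mirrors the slide (k3 at level = 2, offset = 0) with k3 -> F5, F1 from the earlier index.

```python
DUMMY = None  # stand-in for a padding cell (a false positive once decrypted)

def read_bucket(arrays, level, offset):
    """One contiguous read: bucket number `offset` of size 2^level (assumed layout)."""
    size = 1 << level
    start = offset * size
    return arrays[level][start:start + size]

def search(dictionary, arrays, keyword):
    """Look up the bucket position, read it in one go, then drop the dummies."""
    level, offset = dictionary[keyword]
    bucket = read_bucket(arrays, level, offset)
    results = [cell for cell in bucket if cell is not DUMMY]
    return results, len(bucket)

# Level 2 of an N = 16 dataset: 4 buckets of size 4; k3 occupies bucket 0.
arrays = {2: ["F5", "F1", DUMMY, DUMMY] + [DUMMY] * 12}
dictionary = {"k3": (2, 0)}

results, cells_read = search(dictionary, arrays, "k3")
print(results, cells_read, cells_read / len(results))
```

Whoever performs the final filtering step, client or server, determines whether the padding cells travel over the network, which is the bandwidth trade-off the slide describes.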

Experiments. One real dataset with 6,123,276 records, used for in-memory evaluation; query attribute: location description (173 distinct keywords). One synthetic dataset, used for external-memory evaluation: N = 2^47 - 1 records (~1 petabyte), |k| = 1, 2, 4, …, 2^46. Java implementations of: our scheme; PiBas, the state of the art for in-memory settings; OneChoiceAlloc, the state of the art for external memory. 64-bit machine with an Intel Xeon E5-2676v3 and 64 GB RAM.

Experiments – Index Costs (In-memory)

Experiments – Search Costs (In-memory) End-to-End Search Time

Experiments – False Positives False Positives for Different Sizes

Experiments – Search Time (External Memory)

Experiments – Search Time (Real Dataset)

Conclusion – Future Work: ____________? In this work: formal proof based on widely adopted CRYPTO security definitions; 12x more efficient than the state-of-the-art in-memory SE; up to 2-3 orders of magnitude fewer false positives than the external-memory SE. Our scheme provides various trade-offs between space, read efficiency (false positives), locality, parallelism, bandwidth and number of crypto operations.

Thank you! Questions? Summary: 12x in-memory; 580x external memory; tunable for arbitrary architectures. Diagram axes: efficiency and security, each from low to high. Efficient: OPE, DET (PPE). Secure and efficient: SSE. Secure: ORAM, Func/Pred Enc, FHE.