University of Maryland

Slides:

Advertisements

Similar presentations

Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments Presenter: Qin Liu a,b Joint work with Chiu C. Tan b, Jie Wu b,

Advertisements

Stabbing the Sky: Efficient Skyline Computation over Sliding Windows COMP9314 Lecture Notes.

CS7380: Privacy Aware Computing Oblivious RAM 1. Motivation  Starting from software protection Prevent from software piracy A valid method is using hardware.

I/O-Algorithms Lars Arge Aarhus University February 6, 2007.

E.G.M. Petrakissearching1 Searching  Find an element in a collection in the main memory or on the disk  collection: (K 1,I 1 ),(K 2,I 2 )…(K N,I N )

Data Structures Hashing Uri Zwick January 2014.

Indexing Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata.

Layers of a DBMS Query optimization Execution engine Files and access methods Buffer management Disk space management Query Processor Query execution plan.

ObliviStore High Performance Oblivious Cloud Storage Emil StefanovElaine Shi

1 Physical Data Organization and Indexing Lecture 14.

Oblivious Data Structures Xiao Shaun Wang, Kartik Nayak, Chang Liu, T-H. Hubert Chan, Elaine Shi, Emil Stefanov, Yan Huang 1.

Bin Yao Spring 2014 (Slides were made available by Feifei Li) Advanced Topics in Data Management.

Towards Robust Indexing for Ranked Queries Dong Xin, Chen Chen, Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign VLDB.

DBI313. MetricOLTPDWLog Read/Write mixMostly reads, smaller # of rows at a time Scan intensive, large portions of data at a time, bulk loading Mostly.

CPSC 404, Laks V.S. Lakshmanan1 External Sorting Chapter 13: Ramakrishnan & Gherke and Chapter 2.3: Garcia-Molina et al.

DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.

Equivalence Between Priority Queues and Sorting in External Memory

Lecture 1: Basic Operators in Large Data CS 6931 Database Seminar.

Searching Over Encrypted Data Charalampos Papamanthou ECE and UMIACS University of Maryland, College Park Research Supported By.

DMBS Internals I February 24 th, What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the.

DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.

Keyword search on encrypted data. Keyword search problem  Linux utility: grep  Information retrieval Basic operation Advanced operations – relevance.

All Your Queries Are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption Yupeng Zhang, Jonathan Katz, Charalampos Papamanthou University.

All Your Queries are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption Yupeng Zhang, Jonathan Katz, Charalampos Papamanthou University.

1 Query Processing Part 1: Managing Disks. 2 Main Topics on Query Processing Running-time analysis Indexes (e.g., search trees, hashing) Efficient algorithms.

Algorithm Analysis 1.

Practical Private Range Search Revisited

arxiv.org/abs/ y 3-sided x1 x2 x1 x2 top-k

Query Processing Part 1: Managing Disks 1.

Searchable Encryption in Cloud

Tian Xia and Donghui Zhang Northeastern University

Rekeying for Encrypted Deduplication Storage

Module 11: File Structure

Efficient Multi-User Indexing for Secure Keyword Search

Oblivious Parallel RAM: Improved Efficiency and Generic Constructions

Tree-Structured Indexes: Introduction

Indexing & querying text

Lecture 16: Data Storage Wednesday, November 6, 2006.

Tree-Structured Indexes

Database Management Systems (CS 564)

Hybrid Cloud Architecture for Software-as-a-Service Provider to Achieve Higher Privacy and Decrease Securiity Concerns about Cloud Computing P. Reinhold.

Computing Connected Components on Parallel Computers

External Sorting Chapter 13 (Sec ): Ramakrishnan & Gehrke and

RE-Tree: An Efficient Index Structure for Regular Expressions

Fast Searchable Encryption with Tunable Locality

CSCI 104 Log Structured Merge Trees

Chapter 12: Query Processing

Oblivious RAM: A Dissection and Experimental Evaluation

Sorting atomic items Chapter 5 Distribution based sorting paradigms.

Join Processing for Flash SSDs: Remembering Past Lessons

Selectivity Estimation of Big Spatial Data

Lecture 11: DMBS Internals

Database Applications (15-415) DBMS Internals- Part III Lecture 15, March 11, 2018 Mohammad Hammoud.

HashKV: Enabling Efficient Updates in KV Storage via Hashing

Verifiable Oblivious Storage

File Systems Implementation

Advanced Topics in Data Management

STACS arxiv.org/abs/ y 3-sided x1 x2 x1 x2 top-k

Lecture 7: Index Construction

Lecture 2- Query Processing (continued)

File Storage and Indexing

File Storage and Indexing

8th Workshop on Massive Data Algorithms, August 23, 2016

Indexing 4/11/2019.

CENG 351 Data Management and File Structures

Path Oram An Extremely Simple Oblivious RAM Protocol

Efficient Aggregation over Objects with Extent

CSE 190D Database System Implementation

Virtual Memory 1 1.

Efficient Migration of Large-memory VMs Using Private Virtual Memory

Presentation transcript:

University of Maryland Searchable Encryption with Optimal Locality: Achieving Sublogarithmic Read Efficiency Ioannis Demertzis Dimitris Papadopoulos Charalampos Papamanthou University of Maryland Hong Kong UST University of Maryland yannis@umd.edu dipapado@cse.ust.hk cpap@umd.edu

? What is Searchable Encryption (SE)? + search query: Untrusted Cloud ? Client Search pattern: whether a search query is repeated + search query: keyword Setup leakage: total leakage prior to query execution e.g. size of each encrypted file, size of encrypted index Access pattern: encrypted document ids and files that satisfy the search query Security (informal): The adversary does not learn anything beyond the above leakages! 2

X X X X X X X Searchable Encryption – Locality and Read Efficiency Locality is an important efficiency dimension ([CJS+14],[DP17],… Scalable SE requires low locality and read efficiency Locality: #non-continuous reads for each query Read Efficiency: #memory locations per result item locality = 3 & read efficiency = 1 search query: keyword id1 id4 id2 X X X X X X id4 id5 id3 id1 id2 id6 X : false positives locality = 1 & read efficiency = O(N) 3

[ANS+16] – TwoChoiceAlloc * [ANS+16] – OneChoiceAlloc *under some assumptions for the SE scheme Previous Works & Our Result “Cash and Tessaro Eurocrypt 2014” Locality (L): O(1) and Read Efficiency (R): O(1) requires Space (S): ω(Ν) * Schemes with limitation on the maximum keyword-list size General Schemes [ANS+16] – NlogN scheme L: O(1), R: O(1), S: O(NlogN) [ANS+16] – TwoChoiceAlloc * L: O(1), R: O(loglogN), S: O(N) ~ [DP17] – ReadOpt L: O(N1/s), R: O(1), S: O(sN) * keyword lists in the dataset have size less than N1-1/loglogN . [ASS18]** L: O(1), R: O(ω(1)*ε-1(n) + logloglogN) for n = N1-ε(N), S: O(N) [ANS+16] – OneChoiceAlloc L: O(1), R: O(logN), S: O(N) ~ ** keyword lists in the dataset have size less than N/log3N Our Approach L: O(1), R: O(logγN), S: O(N), for γ>2/3 4

locality = 1 & read efficiency = 1 & optimal space Searchable Encryption – Naïve Approach 1 k1= k2= k3= <=4 <=3 locality = 1 & read efficiency = 1 & optimal space 5

Searchable Encryption – Naïve Approach 2 k1= k2= k3= ? 6

… [ANS+16]– OneChoiceAllocation O(N) space, O(1) locality and O(logN) read efficiency ~ k1= k2= k3= k1 … 3 logN loglogN M = N / logN loglogN 7

… [ANS+16]– TwoChoiceAllocation O(N) space, O(1) locality and O(loglogN) read efficiency ~ k1= k2= k3= ** Assuming all the keyword lists in the dataset have size less than N1-1/loglogN ** … c loglogN log2loglogN M = N / loglogN log2loglogN 7

… [ANS+16]– TwoChoiceAllocation O(N) space, O(1) locality and O(loglogN) read efficiency ~ k2= k1= k3= ** Assuming all the keyword lists in the dataset have size less than N1-1/loglogN ** k3 … c loglogN log2loglogN M = N / loglogN log2loglogN 7

[ANS+16]-OneChoiceAlloc Our Approach O(N) space, O(1) locality and O(logγN), for γ>2/3 Read Efficiency ~ [ANS+16]-OneChoiceAlloc Ο(logΝ) [ANS+16] TwoChoiceAlloc ~ O(loglogN) N1-1/loglogN N Keyword-list size 8

[ANS+16]-OneChoiceAlloc Our Approach O(N) space, O(1) locality and O(logγN), for γ>2/3 Read Efficiency ~ [ANS+16]-OneChoiceAlloc Ο(logΝ) [ANS+16] TwoChoiceAlloc ~ O(loglogN) N1-1/loglogN N Keyword-list size 8

[ANS+16]-OneChoiceAlloc Our Approach O(N) space, O(1) locality and O(logγN), for γ>2/3 Read Efficiency ~ [ANS+16]-OneChoiceAlloc Ο(logΝ) Ο(logγΝ) [ANS+16] TwoChoiceAlloc ~ O(loglogN) N1-1/loglogN N1-1/log N 1-γ N Keyword-list size 8

[ANS+16]-OneChoiceAlloc Our Approach O(N) space, O(1) locality and O(logγN), for γ>2/3 Read Efficiency ~ [ANS+16]-OneChoiceAlloc Ο(logΝ) Small Huge Ο(logγΝ) [ANS+16] TwoChoiceAlloc Sequential Scan ~ O(loglogN) N1-1/loglogN N1-1/log N 1-γ N/log N γ N Keyword-list size 8

Our Approach O(N) space, O(1) locality and O(logγN), for γ>2/3 Read Efficiency Focus of this talk! ~ [ANS+16]-OneChoiceAlloc Ο(logΝ) Small Medium Large Huge Ο(logγΝ) [ANS+16] TwoChoiceAlloc Multi-level keyword-size compression Sequential Scan ~ O(loglogN) N1-1/loglogN N1-1/log N 1-γ N/log N 2 N/log N γ N Keyword-list size 8

… Starting Point: Offline Two Choice Allocation (OTA) – [SEK03] OfflineTwoChoiceAlloc for m balls and n bins: MaxFlow( ) … n bins 9

… Starting Point: Offline Two Choice Allocation (OTA) – [SEK03] Γ L OfflineTwoChoiceAlloc for m balls and n bins: Key IDEA: One OTA per size and then Merge!! with probability at least 1 – O(1/n) … Max load <= m/n + 1 Γ L n bins 10

… … … … … Σs( ) = Our Approach: OTA per size + Merge Γ L ? A4s A2s As ks: #keyword lists with size s bs=M/s (#superbuckets) … … A4s Overflow Probability = O(1/bs) … A2s ks/bs + 1 Γ L See Lemma 4 in our paper Σs( ) = … O(N/M + logγΝ) As M = Ν/logγΝ = Ο(logγΝ) M … ? 11

… … … … … … … … … Our Approach: New analysis for OTA Our Approach: Accessing keyword lists **Novel analysis for OTA** The probability that more than O(log2N) lists of size s overflow is negligible! – see Lemma 5 in our paper … … … … A4s B4s k3 … … A2s B2s … … As Bs M Stashes … Ο(logγΝ) 12

Our Approach: New locality-aware ORAM Ο(n1/3log2n) Bandwidth and O(1) Locality We need an ORAM with the following properties: O(1) locality, existing ORAMs with polylogn bandwidth have logn locality Zero failure probability, since it will be applied on only log2n elements o(√n) bandwidth, in order to achieve sublogarithmic read efficiency  o(√ log2n) = o(logn) Α πα: [nα]  [nα] n + n2/3 Square Root ORAM Β πb: [nb]  [nb] n2/3 + n1/3 Hierarchical ORAM C * n1/3 De-amortization techniques from Goodrich et al. [GMO+11] 13

… … … … … … … … … … … Our Approach: OTA Stashes Amax Bmax A4s B4s A2s Important: max ≤ N/log2N for maintaining O(N) index size … … Amax Bmax … … … … A4s B4s … … A2s B2s … … As Bs M Stashes … 14

? Conclusion – Future Work Ο(logγΝ) Read Efficiency Keyword-list size Locality-aware Dynamic SE Read Efficiency Closer to the lower bound Open Question: Closer to the lower bound? New ORAM: Ο(n1/3log2n ) bandwidth, O(1) locality [ANS+16]-OneChoiceAlloc ~ Ο(logΝ) Small Medium Large Huge Ο(logγΝ) OTA-based approach Multi-level keyword-size compression [ANS+16] TwoChoiceAlloc Sequential Scan ~ O(loglogN) New probability bounds for OTA N1-1/loglogN N1-1/log N 1-γ N/log N 2 N/log N γ N Keyword-list size 15

Thank You! Ο(logγΝ) https://eprint.iacr.org/2017/749 Read Efficiency Closer to the lower bound New ORAM: Ο(n1/3log2n ) bandwidth, O(1) locality [ANS+16]-OneChoiceAlloc ~ Ο(logΝ) Small Medium Large Huge Ο(logγΝ) OTA-based approach Multi-level keyword-size compression [ANS+16] TwoChoiceAlloc Sequential Scan ~ O(loglogN) New probability bounds for OTA N1-1/loglogN N1-1/log N 1-γ N/log N 2 N/log N γ N Keyword-list size

[ANS+16]-OneChoiceAlloc [ASS18] in CRYPTO O(N) space, O(1) locality and ω(1)⋅ϵ(n)−1+O(logloglogN) read efficiency where n = N1-ϵ(n) Read Efficiency [ANS+16]-OneChoiceAlloc Ο(logΝ) Ο(logΝ/loglogN) Small Medium Large Huge Ο(logγΝ) [ANS+16] TwoChoiceAlloc ~ O(loglogN) O(logloglogN) N1-1/loglogN N1-1/log N 1-γ N/log N 3 N/log N 2 N/log N γ N Keyword-list size

Studying locality for HDD Access Cost = (seek time) + (rotational delay) + (transfer time) Random I/O Cost Sequential I/O Cost ~4-12 ms ~10 μs for 1 byte

Studying locality for SDD Samsung 960 Pro M.2 NVMe SSD Read Write Sequential Transfer Page size = 2MB 2222.93 MB/sec 1786.72 MB/sec Locality High Random Transfer Page size = 2MB 1339.76 MB/sec 1237.57 MB/sec Random Transfer Page size = 2KB 34.30 MB/sec 150.83 MB/sec Low More detailed analysis  http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review

Studying locality for RAM Untrusted Cloud Client Tw Tw1 Tw2 Tw3 search query: keyword keyword id1 id4 id2 id4 id5 id3 id1 id2 id6 Tw search query: keyword Tw id1 id5 id4 id2 id3 id6