Kai-Min Chung (Academia Sinica) joint work with Zhenming Liu (Princeton) and Rafael Pass (Cornell NY Tech)

Presentation transcript:

Oblivious RAM [G87,GO96]
Main memory: a size-n array of memory cells (words); the CPU issues Read/Write requests q_i and receives M[q_i].
CPU cache: a small private memory (the secure zone).
Compile a RAM program to protect data privacy
– Store encrypted data
– Main challenge: hide the access pattern, i.e., the sequence of addresses accessed by the CPU
E.g., the access pattern of binary search leaks the rank of the searched item.
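To make the binary-search example concrete, here is a small illustrative Python snippet (not from the slides): the sequence of probed addresses depends on where the target lies, so even if every cell's content is encrypted, the trace reveals the target's rank.

```python
# Minimal illustration (assumption: not from the slides): the probe sequence
# of binary search depends on the target's position, so an observer who sees
# only the accessed indices learns its rank.
def binary_search_trace(sorted_array, target):
    """Return (found_index, list of probed indices)."""
    lo, hi, trace = 0, len(sorted_array) - 1, []
    while lo <= hi:
        mid = (lo + hi) // 2
        trace.append(mid)                # the address visible to the server
        if sorted_array[mid] == target:
            return mid, trace
        if sorted_array[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, trace

data = list(range(0, 32, 2))             # 16 sorted items
for t in (0, 14, 30):
    _, trace = binary_search_trace(data, t)
    print(t, trace)                       # different ranks -> different traces
```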

Cloud Storage Scenario
Alice accesses her data stored on the cloud; Bob (the cloud) is curious.
– Encrypt the data
– Hide the access pattern

Oblivious RAM—hide the access pattern
Design an oblivious data structure that implements
– a big array of n memory cells with
– Read(addr) & Write(addr, val) functionality
Goal: hide the access pattern

Illustration: Alice's ORAM algorithm translates each ORead/OWrite into multiple Read/Write accesses to the ORAM structure stored at Bob.

Oblivious RAM—hide the access pattern
Design an oblivious data structure that implements
– a big array of n memory cells with
– Read(addr) & Write(addr, val) functionality
Goal: hide the access pattern
– For any two sequences of Read/Write operations Q_1, Q_2 of equal length, the resulting access patterns are statistically / computationally indistinguishable
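For reference in the later sketches, the functionality above can be written as a tiny interface. This is only an assumed shape for the discussion; the class and method names are mine, not the paper's.

```python
# Hypothetical interface for the oblivious array described above.
class Oram:
    def __init__(self, n):
        self.n = n                      # logical number of memory cells

    def read(self, addr):
        """ORead: return the value stored at logical address addr.
        Security goal: the physical accesses this triggers must be
        (statistically or computationally) independent of addr."""
        raise NotImplementedError

    def write(self, addr, val):
        """OWrite: store val at logical address addr, same security goal."""
        raise NotImplementedError
```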

A Trivial Solution
For every ORead/OWrite, the ORAM algorithm issues Read/Write accesses over the entire ORAM structure at Bob.
Perfectly secure, but has O(n) overhead.
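A minimal sketch of this trivial solution, assuming the obvious linear-scan strategy (touch every cell on each logical access); the class and method names are illustrative only.

```python
# Trivial ORAM sketch: every logical access scans all n cells, so the server
# always sees the same pattern 0,1,...,n-1 (perfect hiding) at O(n) cost.
class TrivialOram:
    def __init__(self, n):
        self.cells = [None] * n          # stands in for encrypted server storage

    def access(self, addr, new_val=None):
        result = None
        for i in range(len(self.cells)):          # always touch every cell
            val = self.cells[i]                    # "read" cell i
            if i == addr:
                result = val
                if new_val is not None:
                    val = new_val
            self.cells[i] = val                    # always "write" cell i back
        return result
```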

ORAM complexity
Time overhead – time complexity of ORead/OWrite
Space overhead – (size of ORAM structure) / n
Cache size
Security – statistical vs. computational

A successful line of research
Many works in the literature; a partial list was shown on the slide.

Why statistical security?
We need to encrypt the data anyway, and encryption is only computationally secure. So why ask for statistical security?
Ans 1: Why not? It provides a stronger guarantee, which is better.
Ans 2: It enables new applications!
– Large-scale MPC with several new features [BCP14]: emulate the ORAM with secret sharing as information-theoretically secure "encryption"; this requires a statistically secure ORAM to achieve I.T.-secure MPC.

Our Result
Theorem: There exists an ORAM with
– statistical security
– O(log² n · log log n) time overhead
– O(1) space overhead
– polylog(n) cache size
Independently, Stefanov et al. [SvDSCFRYD'13] achieve statistical security with O(log² n) overhead, with a different algorithm and a very different analysis.

Tree-based ORAM framework of [SCSL'10]

A simple initial step

Tree-based ORAM Framework [SCSL'10]
Data is stored in a complete binary tree with n leaves (bucket size = L)
– each node has a bucket that stores up to L memory blocks
A position map Pos in the CPU cache indicates the position of each block
– Invariant 1: block i is stored somewhere along the path from the root to leaf Pos[i]

Tree-based ORAM Framework [SCSL'10]
ORead(block i): fetch block i along the path from the root to leaf Pos[i]
Invariant 1: block i can be found along the path from the root to Pos[i]

Tree-based ORAM Framework [SCSL'10]
Invariant 1: block i can be found along the path from the root to Pos[i]
Invariant 2: Pos[·] is i.i.d. uniform given the access pattern so far
⇒ Access pattern = a random path per access
Issue: the root bucket overflows
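A rough Python sketch of one access in this framework, assuming (as the root-overflow remark suggests) that the fetched block is re-labelled with a fresh uniform leaf and re-inserted into the root bucket. Bucket capacities, encryption, and n being a power of two are simplifying assumptions; the class and method names are mine.

```python
import random

# Rough sketch of one ORead in the tree-based framework.
# Assumptions: n is a power of two; fetched blocks get a fresh uniform leaf
# and are re-inserted at the root (which is why the root can overflow).
class TreeOramSketch:
    def __init__(self, n):
        self.n = n
        # heap-indexed complete binary tree: node 1 is the root, leaves n..2n-1
        self.bucket = {v: [] for v in range(1, 2 * n)}
        self.pos = {}                    # position map: block id -> leaf index

    def path(self, leaf):
        """Nodes on the path from the root to the given leaf, root first."""
        v, nodes = self.n + leaf, []
        while v >= 1:
            nodes.append(v)
            v //= 2
        return list(reversed(nodes))

    def oread(self, i):
        leaf = self.pos.setdefault(i, random.randrange(self.n))
        block = None
        for v in self.path(leaf):                        # fetch along the path
            for b in list(self.bucket[v]):
                if b[0] == i:
                    self.bucket[v].remove(b)
                    block = b
        self.pos[i] = random.randrange(self.n)           # fresh uniform leaf
        self.bucket[1].append(block or (i, None))        # put back at the root
        return None if block is None else block[1]
```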

Tree-based ORAM Framework [SCSL'10]
Invariant 1: block i can be found along the path from the root to Pos[i]
Invariant 2: Pos[·] is i.i.d. uniform given the access pattern so far

Flush mechanism of [CP'13]
Invariant 1: block i can be found along the path from the root to Pos[i]
Invariant 2: Pos[·] is i.i.d. uniform given the access pattern so far

Complexity
Final idea: outsource the Pos map itself by storing it recursively in this ORAM
– O(log n) levels of recursion bring the Pos map size in the cache down to O(1)
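A back-of-the-envelope sketch of the recursion, under the assumption that each level packs a constant number c of position-map entries per block, so the map shrinks by a factor of c per level; the constant and packing details are illustrative, not the paper's.

```python
import math

# Assumption for illustration: each recursion level packs c = 2 position-map
# entries per block, so the map stored at the next level is half as large.
def recursion_levels(n, c=2, stop=1):
    levels, size = 0, n
    while size > stop:
        size = math.ceil(size / c)       # the next level stores this smaller map
        levels += 1
    return levels

print(recursion_levels(2**20))           # ~20 levels, i.e. O(log n)
```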

Complexity
(Figure: the ORAM tree structure with the position map Pos in the CPU cache.)

Our Construction—High Level
CPU cache (in addition to the Pos map): a queue
New Invariant 1: block i can be found (i) in the queue, or (ii) along the path from the root to Pos[i]
This saves a factor of log n.

Our Construction—High Level
Read(block i): Fetch; then repeat (Put back, Flush) ×Geom(2).

Our Construction—Details
Fetch: fetch & remove block i from either the queue or the path to Pos[i]; insert it back into the queue.
Put back: pop a block out of the queue; add it to the root bucket.
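A minimal sketch of these two subroutines, reusing the TreeOramSketch layout from the earlier sketch and assuming the queue is a collections.deque of (block id, value) pairs; the function names are mine, not the paper's.

```python
from collections import deque

# The CPU-cache queue is modeled as a deque of (block id, value) pairs.
def fetch(oram, queue, i):
    """Remove block i from the queue or from the path to Pos[i]; re-queue it."""
    for b in list(queue):
        if b[0] == i:
            queue.remove(b)
            queue.append(b)
            return b
    for v in oram.path(oram.pos[i]):
        for b in list(oram.bucket[v]):
            if b[0] == i:
                oram.bucket[v].remove(b)
                queue.append(b)
                return b
    return None

def put_back(oram, queue):
    """Pop one block from the queue and add it to the root bucket."""
    if queue:
        oram.bucket[1].append(queue.popleft())

# Example setup (assumes TreeOramSketch from the earlier sketch):
# oram, queue = TreeOramSketch(1024), deque()
```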

Our Construction—Details
Flush (each node's bucket can alternatively be viewed as 2 buckets of size L/2).
Select greedily: pick the block that can move farthest down.
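A hedged sketch of a flush along one uniformly random path under the reading above: each block encountered on the path is greedily pushed to the deepest node that still lies on the path to its own assigned leaf. Bucket capacities, the two-half-buckets refinement, and overflow handling (overflowing blocks returning to the queue, as the queue-size slide suggests) are all elided; it reuses the TreeOramSketch layout.

```python
import random

# Sketch only: greedy flush along one random root-to-leaf path.
def flush(oram):
    leaf = random.randrange(oram.n)
    nodes = oram.path(leaf)                           # root ... leaf
    for depth, v in enumerate(nodes):
        for b in list(oram.bucket[v]):
            target_path = oram.path(oram.pos[b[0]])   # the block's own path
            # deepest node shared by the flush path and the block's own path
            d = depth
            while (d + 1 < len(nodes)
                   and nodes[d + 1] == target_path[d + 1]):
                d += 1
            if d > depth:                             # the block can move down
                oram.bucket[v].remove(b)
                oram.bucket[nodes[d]].append(b)
    return leaf
```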

Our Construction—Review
ORead(block i): Fetch; then repeat (Put back, Flush) ×Geom(2).
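Tying the pieces together, a sketch of the full ORead as this slide lists it, built from the sketches above. Two assumptions: ×Geom(2) is read as a geometrically distributed number of (Put back, Flush) repetitions with mean 2, and the fresh uniform re-labelling of Pos[i] is placed right after the fetch, as Invariant 2 on the next slide requires; the paper's exact parameters are not reproduced.

```python
import random

def geom_half():
    """Geometric variable with success probability 1/2 (mean 2)."""
    k = 1
    while random.random() < 0.5:
        k += 1
    return k

def oread(oram, queue, i):
    oram.pos.setdefault(i, random.randrange(oram.n))   # first access: random leaf
    block = fetch(oram, queue, i)            # step 1: fetch the block into the queue
    if block is None:                         # first access: create the block
        block = (i, None)
        queue.append(block)
    oram.pos[i] = random.randrange(oram.n)    # re-label with a fresh uniform leaf
    for _ in range(geom_half()):              # then repeat (Put back, Flush) ×Geom(2)
        put_back(oram, queue)
        flush(oram)
    return block[1]
```

With the sketches above, `oread(TreeOramSketch(1024), deque(), 7)` runs end to end, though it omits all the bookkeeping the real construction needs for the stated bounds.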

Security
Invariant 1: block i can be found (i) in the queue, or (ii) along the path from the root to Pos[i]
Invariant 2: Pos[·] is i.i.d. uniform given the access pattern so far
⇒ Access pattern = 1 + 2·Geom(2) random paths

Main Challenge—Bound the Queue Size
Change of queue size during ORead(block i):
– Fetch: increases the queue size by 1 (+1)
– Put back: decreases it by 1 (−1)
– Flush: may increase it by many (+??)
Queue size may blow up?!
Main Lemma: Pr[ queue size ever exceeds log² n · log log n ] < negl(n)

Complexity
(Figure: the ORAM tree structure with the Pos map and the queue in the CPU cache.)

Reduce analyzing the ORAM to a supermarket problem:
– D cashiers in a supermarket, all with empty queues at t = 0
– At each time step: with prob. p, an arrival event occurs (one new customer selects a random cashier, whose queue size increases by 1); with prob. 1 − p, a serving event occurs (one random cashier finishes serving a customer, and its queue size decreases by 1)
– A customer is upset if they enter a queue of size > k
– How many upset customers are there in the time interval [t, t+T]?
Main Lemma: Pr[ queue size ever exceeds log² n · log log n ] < negl(n)
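A small simulation of this supermarket process with hypothetical parameters, just to make the dynamics concrete; it counts customers who join a queue that is already longer than k.

```python
import random

# Simulation of the supermarket process described above.
# Parameters D, p, k, steps are hypothetical, chosen only for illustration.
def supermarket(D=100, p=0.5, k=8, steps=200_000, seed=0):
    rng = random.Random(seed)
    queues = [0] * D                     # D cashiers start with empty queues
    upset = 0
    for _ in range(steps):
        c = rng.randrange(D)             # a uniformly random cashier
        if rng.random() < p:             # arrival: one customer joins c's queue
            if queues[c] > k:
                upset += 1               # joined a queue that was already > k
            queues[c] += 1
        elif queues[c] > 0:              # serving: cashier c finishes a customer
            queues[c] -= 1
    return upset, max(queues)

print(supermarket())                     # (number of upset customers, max queue)
```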

Main Lemma: Pr[ queue size ever exceeds log² n · log log n ] < negl(n)
Proved using a Chernoff bound for Markov chains with "resets" (generalizing [CLLM'12]).

Application of statistically secure ORAM: large-scale MPC [BCP'14]
Requires one broadcast in the preprocessing phase.

Conclusion

Thank you! Questions?

Queue-size changes during ORead: Fetch increases the queue size by 1 (+1); each Put back decreases it by 1 (−1); each Flush may increase it by many (+??); the (Put back, Flush) pair repeats ×Geom(2).

A complete binary tree with n leaves stores the data
– Each node has a bucket that stores up to L blocks
A position map Pos in the CPU cache indicates the position of each block

Tree-based ORAM Framework [SCSL'10]
A complete binary tree with n leaves stores the data (bucket size = L)
– Each node has a bucket that stores up to L blocks
A position map Pos in the CPU cache indicates the position of each block
Invariant: block i can be found along the path from the root to Pos[i]

Conclusion
A new statistically secure ORAM construction with O(log² n · log log n) overhead.
Introduced the supermarket model for the analysis.
Introduced a Chernoff bound for Markov chains with resets.
Open: can we get rid of the log log n factor? With this analysis, seemingly not, because we ignore correlations across different levels and thus need a union bound.