Secure Computation Pragmatics Yan Huang Indiana University May 10, 2016.

Menu
Develop efficient cryptographic protocols
– Library
– Compiler
– RAM computation
Applications
– Efficient circuits for ORAM
– Private edit distance over whole genomes
Oblivious computation in the RAM model

A Challenge in Programmability
Require many crypto primitives and optimizations
– Garbling
– Oblivious Transfer (OT)
– OT Extension
– Free-XOR
– Point-and-permute
– Garbled Row Reduction (GRR)
Error-prone

Extensible Library
Gates: AND, XOR, OR, NOT
Library: add, sub, mult, div, mux, cmp
App: AES, Edit-Distance, PSI

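Example: add 1-bit
    # and_ stands for the library's AND gate ("and" is a reserved word in Python)
    def add1bit(x, y, cin):
        t1 = xor(x, cin)
        t2 = xor(y, cin)
        s = xor(x, t2)              # sum bit: x xor y xor cin
        t1 = and_(t1, t2)           # the only AND gate in the adder
        cout = xor(cin, t1)         # carry-out
        return s, cout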

Example: add n-bit numbers
    def add(x, y, carry):
        s = [None] * len(x)         # sum bits, least significant first
        t = carry
        for i in range(len(x)):
            s[i], t = add1bit(x[i], y[i], t)
        return s
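The two adder slides above can be tried out on plain bits. The sketch below is only an illustration: `xor` and `and_` are hypothetical stand-ins that operate on cleartext 0/1 values, where a secure-computation library would garble the gates instead, and `to_bits` / `from_bits` are helper names introduced here.

```python
def xor(a, b):
    # Cleartext stand-in for a (free) XOR gate.
    return a ^ b

def and_(a, b):
    # Cleartext stand-in for a garbled AND gate.
    return a & b

def add1bit(x, y, cin):
    # Full adder that uses a single AND gate.
    t1 = xor(x, cin)
    t2 = xor(y, cin)
    s = xor(x, t2)
    cout = xor(cin, and_(t1, t2))
    return s, cout

def add(x, y, carry=0):
    # Ripple-carry adder over bit lists, least significant bit first.
    s = [0] * len(x)
    t = carry
    for i in range(len(x)):
        s[i], t = add1bit(x[i], y[i], t)
    return s

def to_bits(v, n):
    # n-bit little-endian encoding of a non-negative integer.
    return [(v >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))

print(from_bits(add(to_bits(25, 8), to_bits(17, 8))))
```

As in the slide, `add` drops the final carry, so the result is the sum modulo 2^n.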

Write your own circuits/applications
Separation of Concerns
– No expert knowledge of crypto required
Modular construction
– Extend and reuse the circuit library
Compiler Support
– ObliVM [S&P’15]

Compute things other than circuits? Joint work with: [LHSKH, Oakland’14][LWNHS, Oakland’15]

Binary Search
    def binSearch(a, x):
        lo, hi = 0, len(a)          # search the half-open range [lo, hi)
        res = -1
        while lo < hi:
            mid = (lo+hi)//2
            midval = a[mid]
            if midval < x: lo = mid+1
            elif midval > x: hi = mid
            else:
                res = mid
                break
        return res
Accessing a secret array using a secret index

Cryptographer’s favorite model: circuits (AND, XOR, OR, …)
Programmer’s favorite model: RAM (M[i] = 20)
Naïvely implemented using circuits: a linear scan
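The linear scan mentioned above can be sketched as follows. This is a cleartext illustration only: `mux` and `eq` are hypothetical stand-ins for the multiplexer and equality circuits, and the point is that every cell of M is touched regardless of the secret index i.

```python
def mux(c, t, f):
    # Returns t when c == 1 and f when c == 0, without branching on c.
    return c * t + (1 - c) * f

def eq(a, b):
    # Stand-in for an equality circuit; returns a 0/1 bit.
    return int(a == b)

def linear_scan_read(M, i):
    # Touch every cell; keep only the one whose position equals i.
    out = 0
    for j in range(len(M)):
        out = mux(eq(i, j), M[j], out)
    return out

def linear_scan_write(M, i, v):
    # Rewrite every cell; only position i receives the new value.
    return [mux(eq(i, j), v, M[j]) for j in range(len(M))]

M = [7, 8, 9, 10]
print(linear_scan_read(M, 2))
print(linear_scan_write(M, 1, 20))
```

Each access costs work linear in the size of M, which is the overhead that motivates ORAM.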

Oblivious-RAM to the Rescue
O-RAM
– Securely outsourcing storage
– Read (index), Write (index, value)
– Hide access patterns
– Poly-logarithmic cost per access
[GO96, J. ACM], [WS08, CCS], [PR10, CRYPTO], [SSS12, NDSS], [SvSFRYD13, CCS]

ORAM -- Hiding Access Patterns
[Figure: the ORAM scheme sits between the trusted side, which issues Read M[i], and the untrusted memory M.]
Hiding the index and the value
Read() and Write() of the ORAM scheme are obliviously computed as circuits using secure computation

Sensitive Control Flow
    def binSearch(a, x):
        lo, hi = 0, len(a)
        i, res = log(hi), -1        # fixed iteration count, independent of the data
        while i >= 0:
            i -= 1
            mid = (lo+hi)//2
            midval = a[mid]
            if midval < x: lo = mid+1
            elif midval > x: hi = mid
            else: res = mid
        return res
Secret-dependent predicate: the if/elif conditions compare secret values

Handle Sensitive Conditionals
    if midval < x: lo = mid+1
    elif midval > x: hi = mid
    else: res = mid
Naive approach: keep the instructions in the ORAM scheme and fetch each one obliviously
    PC = nextInstruction()

Can we avoid retrieving instructions from ORAM?
    if (a): b = 10
    else: b = 12
becomes
    b = mux(a, 10, 12)
and
    if (a): b = 10
becomes
    b = mux(a, 10, b)

Avoid Putting Instructions in ORAM
    if (a): a = x
    else: b = y
becomes
    t = a                       # save the condition before a is overwritten
    a = mux(t, x, a)
    b = mux(t, b, y)
and
    if (a): a = x
    elif (b): b = y
    else: c = 0
becomes
    t1 = a
    t2 = (not a) and b
    t3 = not (a or b)
    a = mux(t1, x, a)
    b = mux(t2, y, b)
    c = mux(t3, 0, c)
Trace-oblivious program transformation
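A quick way to convince oneself that the three-way transformation above preserves behavior is to compare both versions on every input. The sketch below assumes all variables are single bits and uses a hypothetical arithmetic `mux`; the function names `branching` and `oblivious` are introduced here for illustration.

```python
from itertools import product

def mux(c, t, f):
    # Returns t when c == 1 and f when c == 0, without branching on c.
    return c * t + (1 - c) * f

def branching(a, b, c, x, y):
    # Original program: which assignment runs depends on secret bits.
    if a:
        a = x
    elif b:
        b = y
    else:
        c = 0
    return a, b, c

def oblivious(a, b, c, x, y):
    # Transformed program: every assignment always runs, guarded by a mux.
    t1 = a
    t2 = (1 - a) & b
    t3 = 1 - (a | b)
    a = mux(t1, x, a)
    b = mux(t2, y, b)
    c = mux(t3, 0, c)
    return a, b, c

same = all(branching(*v) == oblivious(*v) for v in product((0, 1), repeat=5))
print(same)
```

The guards t1, t2, t3 are computed before any variable is overwritten, which matters here because a and b serve both as conditions and as assignment targets.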

Binary Search
    def binSearch(a, x):
        lo, hi = 0, len(a)          # search the half-open range [lo, hi)
        res = -1
        while lo < hi:
            mid = (lo+hi)//2
            midval = a[mid]
            if midval < x: lo = mid+1
            elif midval > x: hi = mid
            else:
                res = mid
                break
        return res

Removing Sensitive Control Flows
    def binSearch(a, x):
        lo, hi = 0, len(a)
        i, res = log(hi), -1
        while i >= 0:
            i -= 1
            mid = (lo+hi)//2
            midval = a[mid]
            c1 = midval < x
            c2 = midval > x
            c3 = midval == x
            lo = mux(c1, mid+1, lo)
            hi = mux(c2, mid, hi)
            res = mux(c3, mid, res)
        return res
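A runnable cleartext version of the branch-free search might look like the sketch below. Several details are assumptions not stated on the slides: the loop runs `len(a).bit_length()` times (that is, floor(log2 n) + 1), a `live` bit freezes the state once the range is empty, and the index is clamped so the final probes stay in bounds. In a real protocol `a[...]` would be an ORAM read and the comparisons would be circuits.

```python
def mux(c, t, f):
    # Returns t when c == 1 and f when c == 0, without branching on c.
    return c * t + (1 - c) * f

def bin_search_oblivious(a, x):
    # a must be sorted and non-empty; returns an index of x, or -1.
    n = len(a)
    lo, hi, res = 0, n, -1
    for _ in range(n.bit_length()):      # public, data-independent count
        mid = (lo + hi) // 2
        live = int(lo < hi)              # 0 once the range [lo, hi) is empty
        midval = a[min(mid, n - 1)]      # clamped probe (ORAM read in practice)
        c1 = live & int(midval < x)
        c2 = live & int(midval > x)
        c3 = live & int(midval == x)
        lo = mux(c1, mid + 1, lo)
        hi = mux(c2, mid, hi)
        res = mux(c3, mid, res)
    return res

data = [1, 3, 5, 7, 9]
print([bin_search_oblivious(data, v) for v in (1, 7, 9, 4)])
```

The iteration count is enough because each live step at least halves the size of [lo, hi).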

Automation using a Compiler
Automatic Partition: variables are labeled with tokens
– secure, public, alice, bob
Automated Translation to memory- and trace-oblivious programs
Oblivious program abstractions: Map-Reduce, Richer library.

[Figure: Savings (Binary search)]

More Results
Knuth-Morris-Pratt String Matching
– 20x speedup (search a 50-char pattern over a 2x10^6-char string)
Dijkstra Shortest Distances
– 100x speedup (graph size: 3500 nodes)

[Figure: Savings (KMP-matching)]

[Figure: Savings (Dijkstra)]

Garbled Computation of ORAM? Joint work with: [LHHSS, CCS’14]

Oblivious ORAM
Exhaustive Scan
Square Root ORAM
– [Goldreich-Ostrovsky, JACM’96]
Hierarchical ORAM
– [Goldreich-Ostrovsky, JACM’96]
Tree ORAM
– PathORAM, etc.
More on this later in Dave’s lecture.

Tree based ORAM paradigm [Shi et al. 11]
[Figure: client and server; the server holds a tree of buckets, bucket size B.]

Main Invariant and Data Access
[Figure: the client's position map maps block x to leaf l; the path is identified by leaf node l; the client also keeps a stash.]

Block x Must Now Relocate!

Data Access: Write back (x, r, data)
– r: new designated leaf node
– Update position map

Eviction: percolate blocks up towards leaves subject to invariant
Conflicting Goals:
– Less aggressive ==> Low cost
– More aggressive ==> Smaller bucket size

Path ORAM Eviction
– stash = stash + path
– “aggressively” write back stash subject to invariant
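The tree-based access and Path ORAM eviction described above can be sketched in a few dozen lines. This is a toy, cleartext model and not the construction from the cited papers: the class name `PathORAM`, the heap-style node numbering, and the default bucket size of 4 are assumptions made for illustration, and no encryption or dummy padding is modeled.

```python
import random

class PathORAM:
    """Toy Path ORAM: server-side tree of buckets, client-side map and stash."""

    def __init__(self, num_blocks, bucket_size=4, seed=0):
        self.rng = random.Random(seed)
        self.Z = bucket_size
        self.L = max(1, (num_blocks - 1).bit_length())    # tree height
        self.num_leaves = 1 << self.L
        # Heap numbering: node 1 is the root, children of n are 2n and 2n+1.
        self.tree = {n: [] for n in range(1, 2 * self.num_leaves)}
        self.pos = {a: self.rng.randrange(self.num_leaves)
                    for a in range(num_blocks)}           # position map
        self.stash = {}

    def _path(self, leaf):
        # Nodes from the given leaf up to the root (leaf first).
        node, path = self.num_leaves + leaf, []
        while node >= 1:
            path.append(node)
            node //= 2
        return path

    def access(self, op, addr, value=None):
        leaf = self.pos[addr]
        self.pos[addr] = self.rng.randrange(self.num_leaves)  # relocate
        path = self._path(leaf)
        for node in path:                     # stash = stash + path
            for a, v in self.tree[node]:
                self.stash[a] = v
            self.tree[node] = []
        result = self.stash.get(addr)
        if op == "write":
            self.stash[addr] = value
        for node in path:                     # write back, deepest bucket first
            depth = node.bit_length() - 1
            fits = [a for a in self.stash
                    if (self.num_leaves + self.pos[a]) >> (self.L - depth) == node]
            for a in fits[:self.Z]:
                self.tree[node].append((a, self.stash.pop(a)))
        return result

oram = PathORAM(8)
oram.access("write", 3, "hello")
print(oram.access("read", 3))
```

Writing back from the leaf upward places each stash block as deep as its newly assigned path allows, which is the aggressive eviction the slide refers to; blocks that do not fit stay in the stash.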

Traditional ORAM Metrics
– Client storage
– Server storage
– Bandwidth
– Round-trips
– Computation

Metrics for SC-ORAM
– Client storage
– Server storage
– Bandwidth
– Round-trips
– Computation

Goal
Design a small eviction circuit and yet preserve its effectiveness.

Path ORAM eviction
D: word size, N: number of words
Most aggressive
– An O(L^2)-size circuit per eviction
– An O(L log L)-size circuit per eviction

Can we evict by scanning the path a constant number of times?

SCORAM: heuristic approach
Greedily push blocks upwards
– Not aggressive enough!
– Stash grows too quickly!
[Figure labels: Stash, Leaf]

SCORAM: Reduce stash size by linear scan
Reverse dropping scan: put the deepest block in the stash to the deepest bucket in the path.

SCORAM: heuristic approach
2-step eviction:
1) Reverse dropping scan: put the deepest element in the stash into the path as deep as possible
2) Greedy push scan: greedily push blocks down
[Figure labels: Stash, Leaf]

[Figure: SCORAM [CCS’14], 1GB data, each 32 bits; chart annotations: 10X, 20X]

Private Edit Distance over Whole Genomes Joint work with: [WHZTWB, CCS’15] Yong’an Zhao

Edit distance
Preferred metric for similarity of genomic data
– more relevant than Hamming distance
Generic Solution using Dynamic programming
– O(N^2)
– Securely compare two genomes of lengths 2000 and nucleotides: 1.29 billion AND gates, 3.7 hours
– Human genomes have 3 billion nucleotides.
[HEKM, USENIX’11]
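For reference, the generic dynamic-programming solution mentioned above is the textbook Levenshtein recurrence. The sketch below runs in the clear; in a secure computation every `min` and comparison in the inner loop would be a circuit, which is where the quadratic gate count comes from. The function name is an assumption for illustration.

```python
def edit_distance(s, t):
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete cs
                           cur[j - 1] + 1,             # insert ct
                           prev[j - 1] + (cs != ct)))  # match or substitute
        prev = cur
    return prev[-1]

print(edit_distance("kitten", "sitting"))
```

The table has (len(s) + 1) x (len(t) + 1) cells, so the cost grows with the product of the two lengths.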

Summary of Results
– Compute edit distance on whole human genomes in 39 seconds (error: ≈2%).
– Similar breast-cancer patient query of 1000-patient database in < 8 minutes

Setup: single-core garbler, single-core evaluator, 100 Mbps, 100 ms

Methodology
Insights: genomic characteristics (edit distance → set difference size)
Our protocols: sketch algorithm

Edit distance → Set difference size
Ref = ATTGACT
P_A = GTTGA T
P_B = GTT ACT
Edit(P_A, Ref) = {(1,SUB,G), (6,DEL,1)}
Edit(P_B, Ref) = {(1,SUB,G), (4,DEL,1)}
Edit distance = 2, Set difference size = 2

Edit distance → Set difference size
Ref = ATTGACT
P_A = GTTGAT
P_B = GTTGCT
Edit(P_A, Ref) = {(1,SUB,G), (6,DEL,1)}
Edit(P_B, Ref) = {(1,SUB,G), (5,DEL,1)}
Edit distance = 1, Set difference size = 2
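The set difference sizes in the two examples above can be reproduced directly. The sketch below treats each edit as a (position, operation, argument) tuple, as on the slides, and assumes "set difference size" means the size of the symmetric difference; the function name is introduced here for illustration.

```python
def set_difference_size(edits_a, edits_b):
    # Number of edits that appear in exactly one of the two sets.
    return len(edits_a ^ edits_b)

# First example: P_A = GTTGAT, P_B = GTTACT against Ref = ATTGACT.
a1 = {(1, "SUB", "G"), (6, "DEL", 1)}
b1 = {(1, "SUB", "G"), (4, "DEL", 1)}

# Second example: P_A = GTTGAT, P_B = GTTGCT against Ref = ATTGACT.
a2 = {(1, "SUB", "G"), (6, "DEL", 1)}
b2 = {(1, "SUB", "G"), (5, "DEL", 1)}

print(set_difference_size(a1, b1), set_difference_size(a2, b2))
```

Both pairs give 2, while the slides report true edit distances of 2 and 1, so the set difference size is an approximation of the edit distance rather than an exact substitute.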

Effects
Originally: edit distance, O(N^2), N = 3 billion Nucleotides
Now: Set Intersection Size, O(N log N), N = 4 million “Edits”
Most edits are substitutions; insertions are sparse.

Through Sketching Intersection Size? |A-B|=|A|+|B|-2|A∩B|

By Sketching Intersection Size? |A∩B|≈10000|A-B| ε |A-B| ≈ ε |A∩B|

Better Sketching Difference Size Directly |A-B|

Basic Protocol: SketchDiffSize
1. Agree on randomness r, a hash function h() with range {1, -1}
2. Compute d_A
3. Compute d_B
4. Estimate set difference size using (d_A - d_B)^2
[Step labels in the figure: Local, Secure]
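The protocol above can be simulated in the clear. The formulas for the two "Compute" steps did not survive in the transcript, so this sketch assumes each party sums h(r, e) over its own set; under that assumption shared elements cancel in d_A - d_B and the expected square equals the size of the symmetric difference. The SHA-256-based `h`, the function names, and the number of rounds are all illustrative choices, and the final squaring, which the slide marks as secure, is done in the clear here.

```python
import hashlib

def h(r, element):
    # Hypothetical keyed hash into {1, -1}, derived from SHA-256.
    digest = hashlib.sha256(f"{r}|{element!r}".encode()).digest()
    return 1 if digest[0] & 1 else -1

def sketch(r, edits):
    # Local step (assumed form): sum of hash signs over one party's set.
    return sum(h(r, e) for e in edits)

def sketch_diff_size(a, b, rounds=4000):
    # Average (d_A - d_B)^2 over independent choices of r.
    total = 0
    for r in range(rounds):
        total += (sketch(r, a) - sketch(r, b)) ** 2
    return total / rounds

A = {(1, "INS", "A"), (10, "DEL", 1), (11, "SUB", "A"), (20, "INS", "T")}
B = {(2, "INS", "A"), (7, "SUB", "C"), (11, "SUB", "A")}
print(sketch_diff_size(A, B), len(A ^ B))
```

Only the single numbers d_A and d_B enter the secure part, so its cost does not depend on the sizes of the sets.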

Example
A: (1,INS,A) (10,DEL,1) (11,SUB,A) (20,INS,T)
B: (2,INS,A) (7,SUB,C) (11,SUB,A)
(d_A - d_B)^2 under each hash (sketches computed locally, squared difference computed securely):
h(r_1, ⋅): (2-(-1))^2 = 9
h(r_2, ⋅): (0-(-1))^2 = 1
h(r_3, ⋅): (1-(-1))^2 = 4
h(r_4, ⋅): (-1-(-1))^2 = 0
h(r_5, ⋅): (4-1)^2 = 9
h(r_6, ⋅): (4-3)^2 = 1
The mean of the squared values is the estimate.

What is ? Expectedly, 0

Optimizations
Avoid secure squaring
– Half normal distribution
Improve local computation
– Parallelization
– “sparse sketch”

Whole genomes Private Edit Distance
– 1.42% relative error (> 90 percentile)
– 39 seconds overall per comparison

Thresholding Edit Distance

Classic Ideal Model
[Figure: the parties input A and B; each receives z.]
z = DiffSize(A,B)

Our Modified Ideal Model
[Figure: the parties input A and B; the output is z, r.]
z = SketchDiffSize(A,B,r)

Q & A