ObliVM: A Programming Framework for Secure Computation

Presentation transcript:

ObliVM: A Programming Framework for Secure Computation. Chang Liu. Joint work with Xiao Shaun Wang, Kartik Nayak, Yan Huang, and Elaine Shi.

Dating: genetically a good match? Both parties want to learn the answer without leaking their sensitive genomic data to anyone else!

Problem Abstraction: Alice holds x, Bob holds y, and both know a public function f. They want to compute z = f(x, y) and reveal z. Security requirement: reveal z, but nothing more!

Two approaches to secure computation. Generic protocols: low design cost, flexible. Customized protocols: efficient, but require expertise. Example (Nina Taft, Distinguished Scientist): 5 researchers took 4 months to develop an efficient oblivious matrix factorization algorithm over secure computation.

Can generic secure computation be practical? Challenge 1: efficiency (time and space). Challenge 2: programmability (for non-expert programmers).

ObliVM: Achieve the Best of Both Worlds. Programs by non-specialists achieve the performance of customized designs, addressing both challenges: efficiency (time and space) and programmability (for non-expert programmers).

Programmer's favorite model vs. cryptographer's favorite model (a circuit of AND, XOR, OR gates, ...). A programmer naturally writes binary search like this:

def binSearch(a, x):
    lo, hi = 0, len(a) - 1
    res = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        midval = a[mid]
        if midval < x:
            lo = mid + 1
        elif midval > x:
            hi = mid - 1
        else:
            res = mid
            break
    return res

Accessing a secret index may leak information!

How do secret indexes leak information? Think of memory holding sensitive records (breast cancer, liver problem, kidney, ...) that f(x, y), evaluated as a circuit of AND/XOR/OR gates, must look up by a secret index. A naive solution in generic approaches is to linearly scan the entire memory for every single memory access. Extremely slow!
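To make the naive approach concrete, here is a minimal Python sketch (my own illustration, not ObliVM code) of a data-independent linear scan: every cell is touched on every access, so the observed access pattern is independent of the secret index.

def oblivious_read(mem, secret_idx):
    # Touch every cell; select the wanted one with arithmetic instead of branching.
    # In a real protocol the comparison and the select would be secure gates.
    result = 0
    for j in range(len(mem)):
        eq = int(j == secret_idx)                  # 1 only for the wanted cell
        result = eq * mem[j] + (1 - eq) * result   # oblivious select (mux)
    return result

This costs O(N) work per access, which is exactly what the ORAM on the next slide reduces to polylogarithmic.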

Crypto Tool: Oblivious RAM (ORAM). It hides access patterns through redundancy and data shuffling, at O(polylog N) cost per access. In our setting, the garbled circuit asks to read M[i] by sending the secret-shared index [i] to the ORAM scheme and receives the secret-shared value [M[i]].
[Shi et al., 2011] Oblivious RAM with O((log N)^3) Worst-Case Cost. In ASIACRYPT 2011.
[Stefanov et al., 2013] Path ORAM: An Extremely Simple Oblivious RAM Protocol. In CCS 2013.
[Wang et al., 2015] Circuit ORAM: On Tightness of the Goldreich-Ostrovsky Lower Bound. In CCS 2015.
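As a rough illustration of the redundancy-and-shuffling idea, below is a toy, non-cryptographic Python sketch of the access logic of a tree-based ORAM in the spirit of Path ORAM (illustrative only, not ObliVM's implementation; the class and parameter names are made up). Each block is assigned a random tree path; an access reads one whole path into a stash, remaps the block to a fresh random path, and writes the path back, so an observer only ever sees uniformly random paths.

import random

class ToyPathORAM:
    # Buckets are plain Python lists and nothing is encrypted: this models access patterns only.
    def __init__(self, num_blocks, bucket_size=4):
        self.Z = bucket_size
        self.L = max(1, (num_blocks - 1).bit_length())            # tree height
        self.tree = [[] for _ in range(2 ** (self.L + 1) - 1)]    # complete binary tree of buckets
        self.pos = {a: random.randrange(2 ** self.L) for a in range(num_blocks)}
        self.stash = {}

    def _path(self, leaf):
        # Bucket indices from the given leaf up to the root.
        node = leaf + 2 ** self.L - 1
        path = [node]
        while node > 0:
            node = (node - 1) // 2
            path.append(node)
        return path                                               # ordered leaf -> root

    def access(self, op, addr, data=None):
        leaf = self.pos[addr]
        self.pos[addr] = random.randrange(2 ** self.L)            # remap before anything is written back
        for b in self._path(leaf):                                # read the whole path into the stash
            for a, d in self.tree[b]:
                self.stash[a] = d
            self.tree[b] = []
        result = self.stash.get(addr)
        if op == "write":
            self.stash[addr] = data
        for b in self._path(leaf):                                # evict: fill deepest buckets first
            fits = [a for a in self.stash if b in self._path(self.pos[a])][:self.Z]
            self.tree[b] = [(a, self.stash.pop(a)) for a in fits]
        return result

# Usage: oram = ToyPathORAM(8); oram.access("write", 3, "record"); oram.access("read", 3) -> "record"

Each access touches O(log N) buckets regardless of which address is requested, which is where the polylogarithmic cost per access comes from.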

Compilation pipeline: Source Program to Oblivious Program to Circuit. Going from an oblivious program to a circuit is easy; the challenge is turning a source program into an oblivious program.

ObliVM: A Programming Framework for Oblivious Computation. Two main ideas: (1) program-specific optimizations through static analysis, and (2) programming abstractions for oblivious computation. The first idea is for the compiler to perform static analysis and apply program-specific optimizations at compile time. [LHS-CSF'13] [LHSKH-Oakland'14] [LHMHTS-ASPLOS'15] [LWNHS-Oakland'15]

Example: FindMax.

int max(public int n, secret int h[]) {
    public int i = 0;
    secret int m = 0;
    while (i < n) {
        if (h[i] > m) then m = h[i];
        i++;
    }
    return m;
}

Let's look at a concrete example, FindMax. This program sequentially scans through an array to find the maximal element. Even though the array h may be secret, we need not place h in an ORAM, because the access pattern is fixed; we essentially only need to encrypt the array. h[] need not be in ORAM: encryption suffices.
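To see why FindMax's access pattern stays fixed even though the comparison involves secrets, here is a minimal Python sketch (my own illustration, not ObliVM output) of how a secret-dependent conditional is typically compiled into a data-independent multiplexer: both outcomes are computed, and a select gate picks the result.

def mux(bit, a, b):
    # Oblivious select: a if bit == 1 else b, with no data-dependent branch.
    return bit * a + (1 - bit) * b

def find_max(h):
    m = 0
    for x in h:               # loop bound and array accesses are public/sequential
        gt = int(x > m)       # in the real protocol this is a secret comparison gate
        m = mux(gt, x, m)     # the assignment happens either way; only the mux decides
    return m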

Dynamic Memory Accesses: the main loop in Dijkstra.

for (int i = 1; i < n; ++i) {
    int bestj = -1;
    int bestdis = -1;
    for (int j = 0; j < n; ++j)
        if (!vis[j] && (bestdis < 0 || dis[j] < bestdis)) {
            bestj = j;
            bestdis = dis[j];
        }
    vis[bestj] = 1;
    for (int j = 0; j < n; ++j)
        if (!vis[j] && (bestdis + e[bestj][j] < dis[j]))
            dis[j] = bestdis + e[bestj][j];
}

dis[]: not in ORAM. vis[], e[][]: inside ORAM.

It is not important to understand the details; look at the structure of the loops. This is the main loop of Dijkstra's shortest path algorithm, and it uses three arrays: the visited array vis, the distance array dis, and the edge array e. Without understanding the program's semantics, just by examining the syntax you can see that the distance array is always accessed sequentially, so it need not be placed in ORAM. By contrast, the other two arrays, vis and e, should be placed inside an ORAM. Our compiler automates this analysis and places the minimal number of variables inside ORAMs; it also performs other optimizations, such as deciding when it is safe to split variables into multiple, smaller ORAMs. This can often give orders-of-magnitude performance improvements for practical applications.

Do we need to place all variables/data inside one ORAM? Here is our key observation: in a program, not all accesses leak information. For some variables the access patterns are safe to reveal, and such variables need not be placed inside an ORAM. Key observation: accesses that do not depend on secret inputs need not be hidden.

A memory-trace obliviousness type system ensures the security of the target program. [LHS-CSF'13, LHSKH-Oakland'14, LHMHTS-ASPLOS'15]
[LHS-CSF '13] Memory Trace Oblivious Program Execution. In CSF 2013.
[LHSKH-Oakland '14] Automating Efficient RAM-Model Secure Computation. In Oakland 2014.
[LHMHTS-ASPLOS '15] GhostRider: A Hardware-Software System for Memory Trace Oblivious Computation. In ASPLOS 2015.
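As a loose illustration of the kind of check such a type system performs (a made-up toy in Python, not the actual type system from these papers), label each index and each enclosing branch condition public or secret, and allow a non-ORAM array access only when nothing secret influences which address is touched:

def access_allowed(index_label, branch_labels, array_in_oram):
    # A plaintext (non-ORAM) access is safe only if neither the index nor any
    # enclosing branch condition depends on secret data; ORAM hides the address anyway.
    secret_dependent = index_label == "secret" or "secret" in branch_labels
    return array_in_oram or not secret_dependent

# FindMax: h[i] uses a public loop counter, no enclosing secret branch -> no ORAM needed.
assert access_allowed("public", [], array_in_oram=False)
# Binary search: a[mid] where mid depends on secret comparisons -> must live in ORAM.
assert not access_allowed("secret", [], array_in_oram=False)
assert access_allowed("secret", [], array_in_oram=True)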

ObliVM: A Programming Framework for Oblivious Computation. Having covered idea 1 (program-specific optimizations through static analysis), we now turn to idea 2: programming abstractions for oblivious computation. [LHS-CSF'13] [LHSKH-Oakland'14] [LHMHTS-ASPLOS'15] [LWNHS-Oakland'15]

Analogy to Parallel Computation. The best way to understand this idea is to make an analogy to parallel computation: how can we automate parallelism? Approach 1: take a program written in C, which is sequential in nature, and let the compiler try to figure out how to parallelize it; this offers limited opportunities for compile-time optimizations. Approach 2, familiar to most of us: developers code in a parallel programming paradigm such as MapReduce or Spark; such a program gives the system insight into how to parallelize it. MapReduce is a parallel programming abstraction.

Programming Abstractions for Oblivious Computation. Approach 1: compile a program written in C into an oblivious representation using ORAM; this offers limited opportunities for compile-time optimizations. Approach 2: compile a program written in ObliVM abstractions into an oblivious representation using ORAM (generic) and oblivious algorithms (problem-specific, but efficient). [NWIWTS-Oakland15] [WLNHS-Oakland15] Approach 1 is what I just described. Imprecisely speaking, our high-level idea is the same as in the parallelism analogy, but with "parallel" replaced by "oblivious" and MapReduce replaced by ObliVM. So far our approach had been to write a program in a traditional language like C and perform static optimizations, which gives significant speedups for a wide class of programs, but we can do better. ObliVM therefore provides a suite of programming patterns, or abstractions, for oblivious computation. If the developer's program follows these abstractions, our compiler gains more insight and emits efficient target code that leverages not only generic ORAM but also a variety of more efficient oblivious algorithms. In the interest of time I will not describe all of these abstractions (several required co-designing the abstraction and the underlying algorithm); I will cherry-pick a couple of examples and give a one-sentence summary of each.

Interactions between PL and algorithms. The expected direction: study oblivious algorithms, find common patterns, and generalize them into programming abstractions.

Interactions between PL and algorithms. The unexpected direction: new insights from programming abstractions lead to new oblivious algorithms.

Interactions between PL and algorithms allowed us to solve open problems in oblivious algorithm design: depth-first search, shortest path, and minimum spanning tree.

Loop Coalescing: a quick example of a PL technique that led to the discovery of new algorithms. It gives oblivious Dijkstra and MST for sparse graphs. (Diagram: the program is split into basic blocks, with Block 1 executed n times, Block 2 m times, and Block 3 n times.)

Loop Coalescing gives oblivious Dijkstra and MST for sparse graphs.
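To give a flavor of the idea, here is my own simplified Python sketch (not the transformation as defined in the ObliVM paper): loop coalescing flattens a nest of data-dependent inner loops into a single loop with a public iteration bound, driven by a small state machine, so the total number of iterations reveals only public quantities such as n and m rather than how the work is split across outer iterations.

# Hypothetical example: visit every entry of every adjacency list, where individual list
# lengths are secret but the totals n (nodes) and m (edges) are public.
def coalesced_scan(adj):
    n = len(adj)                      # public
    m = sum(len(a) for a in adj)      # public total; individual lengths stay hidden
    node, offset, total = 0, 0, 0
    for _ in range(n + m):            # public, data-independent iteration count
        at_end = offset == len(adj[node])
        # In the oblivious version both branches run under multiplexers and adj lives in
        # ORAM; plain if/else keeps the sketch readable.
        if at_end:                    # "Block 1": advance to the next node
            node, offset = node + 1, 0
        else:                         # "Block 2": consume one edge of the current node
            total += adj[node][offset]
            offset += 1
    return total

# coalesced_scan([[1, 2], [3], [], [4, 5, 6]]) == 21, after exactly n + m = 10 iterations.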

Hand-crafting vs. Automated Compilation (same tasks, Nina Taft, Distinguished Scientist). With ObliVM today, these tasks take about 1 graduate-student-day and achieve 10x-20x better performance [LWNHS-IEEE S&P '15] (this work). Hand-crafted in 2013: Matrix Factorization took 5 researchers 4 months [NIWJTB-CCS'13]; Ridge Regression took 5 researchers 3 weeks [NWIJBT-IEEE S&P '13].

Speedup for More Applications. (Chart: speedups relative to [HKFV12] on a log scale from 1 to 10^6, decomposed into backend (Circuit ORAM) and PL contributions, for Dijkstra, MST, K-Means, Heap, Map/Set, Binary Search, AMS Sketch, and CountMin Sketch, with data sizes of 768KB, 768KB, 2MB, 8GB, 8GB, 1GB, 10GB, and 0.31GB respectively; total speedups reach up to roughly 10^6x.) Earlier non-tree-based ORAMs perform worse than linear scans of memory.

ObliVM: Binary Search on a 1GB Database. Reference point: ~24 hours per query in 2012 [HFKV-CCS'12]. ObliVM today: 7.3 secs/query (2 EC2 virtual cores, 60GB memory, 10MBps bandwidth); with hardware AES, about 20 times better. Consider a common binary search query over a 1-gigabyte database. On a single pair of processors, ObliVM takes a few seconds per query; only about 2.5 seconds of that is online cost, and the rest can be done in an offline phase that is embarrassingly parallelizable. There are also low-hanging fruits that would immediately boost performance further: our current implementation is in Java and does not use hardware AES. Moving the implementation to C and using the hardware AES features widely present in off-the-shelf processors today, we expect the performance to improve to roughly a hundredth of a second per query; achieving this would require about 3GBps bandwidth. This estimate should be fairly accurate based on numbers reported in recent work by Bellare et al.
[HFKV-CCS'12] Holzer et al. Secure Two-Party Computations in ANSI C. In CCS '12.

ObliVM: Binary Search on a 1GB Database. Reference point: ~24 hours per query in 2012 [HFKV-CCS'12]. With cryptographic extensions (projected): 0.3 secs/query (2 EC2 virtual cores, 60GB memory, 300MBps bandwidth).
[HFKV-CCS'12] Holzer et al. Secure Two-Party Computations in ANSI C. In CCS '12.

Overhead w.r.t. an Insecure Baseline. (Chart: slowdown in instructions executed compared with cleartext computation: Distributed GWAS 130x, Hamming Distance 1.7x10^4x, K-Means 9.3x10^6x.) Secure computation performs encrypted computation bit by bit, and floating-point operations add further overhead.

Overhead w.r.t. an Insecure Baseline: opportunities for further optimization include hardware acceleration, parallelism, faster cryptography, and more.

ObliVM Adoption (www.oblivm.com): privacy-preserving data mining and recommendation systems; computational biology and privacy-preserving microbiome analysis; privacy-preserving Software-Defined Networking; a cryptographic MIPS processor; and the iDASH secure genome analysis competition (won an "HLI Award for Secure Multiparty Computing").

Future Work: from ObliVM, which compiles programs into circuits for secure multiparty computation, toward a unified programming framework for modern cryptography: secure multiparty computation, program obfuscation (DARPA SafeWare), fully homomorphic encryption, functional encryption, and verifiable computation.